WorldWideScience

Sample records for machine learning applications

  1. Machine Learning applications in CMS

    CERN Multimedia

    CERN. Geneva

    2017-01-01

    Machine Learning is used in many aspects of CMS data taking, monitoring, processing and analysis. We review a few of these use cases and the most recent developments, with an outlook to future applications in the LHC Run III and for the High-Luminosity phase.

  2. New Applications of Learning Machines

    DEFF Research Database (Denmark)

    Larsen, Jan

    * Machine learning framework for sound search * Genre classification * Music separation * MIMO channel estimation and symbol detection

  3. Application of Machine Learning Techniques in Aquaculture

    OpenAIRE

    Rahman, Akhlaqur; Tasnim, Sumaira

    2014-01-01

    In this paper we present applications of different machine learning algorithms in aquaculture. Machine learning algorithms learn models from historical data. In aquaculture historical data are obtained from farm practices, yields, and environmental data sources. Associations between these different variables can be obtained by applying machine learning algorithms to historical data. In this paper we present applications of different machine learning algorithms in aquaculture applications.

  4. Machine learning applications in genetics and genomics.

    Science.gov (United States)

    Libbrecht, Maxwell W; Noble, William Stafford

    2015-06-01

    The field of machine learning, which aims to develop computer algorithms that improve with experience, holds promise to enable computers to assist humans in the analysis of large, complex data sets. Here, we provide an overview of machine learning applications for the analysis of genome sequencing data sets, including the annotation of sequence elements and epigenetic, proteomic or metabolomic data. We present considerations and recurrent challenges in the application of supervised, semi-supervised and unsupervised machine learning methods, as well as of generative and discriminative modelling approaches. We provide general guidelines to assist in the selection of these machine learning methods and their practical application for the analysis of genetic and genomic data sets.

  5. Advanced Machine learning Algorithm Application for Rotating Machine Health Monitoring

    Energy Technology Data Exchange (ETDEWEB)

    Kanemoto, Shigeru; Watanabe, Masaya [The University of Aizu, Aizuwakamatsu (Japan)]; Yusa, Noritaka [Tohoku University, Sendai (Japan)]

    2014-08-15

    The present paper evaluates the applicability of conventional sound analysis techniques and modern machine learning algorithms to rotating machine health monitoring. These techniques include support vector machines, deep learning neural networks, etc. Sound data for inner ring defect and misalignment anomalies, measured at a rotating machine mockup test facility, are used to verify these algorithms. Although we could not find a remarkable difference in anomaly discrimination performance, some methods yield very interesting eigen patterns corresponding to normal and abnormal states. These results will be useful for developing more sensitive and robust anomaly monitoring technology in the future.

  6. Advanced Machine learning Algorithm Application for Rotating Machine Health Monitoring

    International Nuclear Information System (INIS)

    Kanemoto, Shigeru; Watanabe, Masaya; Yusa, Noritaka

    2014-01-01

    The present paper evaluates the applicability of conventional sound analysis techniques and modern machine learning algorithms to rotating machine health monitoring. These techniques include support vector machines, deep learning neural networks, etc. Sound data for inner ring defect and misalignment anomalies, measured at a rotating machine mockup test facility, are used to verify these algorithms. Although we could not find a remarkable difference in anomaly discrimination performance, some methods yield very interesting eigen patterns corresponding to normal and abnormal states. These results will be useful for developing more sensitive and robust anomaly monitoring technology in the future.

  7. Application of machine learning methods in bioinformatics

    Science.gov (United States)

    Yang, Haoyu; An, Zheng; Zhou, Haotian; Hou, Yawen

    2018-05-01

    With the development of bioinformatics, high-throughput genomic technology has enabled biology to enter the era of big data [1]. Bioinformatics is an interdisciplinary field that includes the acquisition, management, analysis, interpretation and application of biological information. It derives from the Human Genome Project. The field of machine learning, which aims to develop computer algorithms that improve with experience, holds promise to enable computers to assist humans in the analysis of large, complex data sets [2]. This paper analyzes and compares various machine learning algorithms and their applications in bioinformatics.

  8. Machine learning paradigms applications in recommender systems

    CERN Document Server

    Lampropoulos, Aristomenis S

    2015-01-01

    This timely book presents applications in recommender systems that make recommendations using machine learning algorithms trained on examples of content the user likes or dislikes. Recommender systems built on the assumption that both positive and negative examples are available do not perform well when negative examples are rare. It is exactly this problem that the authors address in the monograph at hand. Specifically, the book's approach is based on one-class classification methodologies that have been appearing in recent machine learning research. The blending of recommender systems and one-class classification provides a new and very fertile field for research, innovation and development with potential applications in “big data” as well as “sparse data” problems. The book will be useful to researchers, practitioners and graduate students dealing with problems of extensive and complex data. It is intended for both the expert/researcher in the fields of Pattern Recognition, Machine Learning and ...

  9. Ensemble Machine Learning Methods and Applications

    CERN Document Server

    Ma, Yunqian

    2012-01-01

    It is common wisdom that gathering a variety of views and inputs improves the process of decision making, and, indeed, underpins a democratic society. Dubbed “ensemble learning” by researchers in computational intelligence and machine learning, it is known to improve a decision system’s robustness and accuracy. Now, fresh developments are allowing researchers to unleash the power of ensemble learning in an increasing range of real-world applications. Ensemble learning algorithms such as “boosting” and “random forest” facilitate solutions to key computational issues such as face detection and are now being applied in areas as diverse as object tracking and bioinformatics. Responding to a shortage of literature dedicated to the topic, this volume offers comprehensive coverage of state-of-the-art ensemble learning techniques, including various contributions from researchers in leading industrial research labs. At once a solid theoretical study and a practical guide, the volume is a windfall for r...

  10. Extreme learning machines 2013 algorithms and applications

    CERN Document Server

    Toh, Kar-Ann; Romay, Manuel; Mao, Kezhi

    2014-01-01

    In recent years, the extreme learning machine (ELM) has emerged as a revolutionary technique of computational intelligence and has attracted considerable attention. An ELM is a learning system resembling a single-layer feed-forward neural network, whose connections from the input layer to the hidden layer are randomly generated, while the connections from the hidden layer to the output layer are learned through linear learning methods. The outstanding merits of the ELM are its fast learning speed, minimal need for human intervention and high scalability. This book contains selected papers from the International Conference on Extreme Learning Machine 2013, which was held in Beijing, China, October 15-17, 2013. This conference aims to bring together the researchers and practitioners of extreme learning machine from a variety of fields including artificial intelligence, biomedical engineering and bioinformatics, system modelling and control, and signal and image processing, to promote research and discu...
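    The structure described in this abstract (random, untrained input-to-hidden connections and a linearly solved output layer) can be sketched as follows. This is only an illustrative sketch, not code from the book: the class name `TinyELM`, the helper `solve_linear`, the tanh activation, the ridge term and the XOR toy data are all assumptions chosen to keep the example small and dependency-free.

```python
import math
import random


def solve_linear(a, b):
    """Solve a @ x = b for square a using Gaussian elimination with pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


class TinyELM:
    """Hypothetical minimal ELM regressor with a single output."""

    def __init__(self, n_inputs, n_hidden, ridge=1e-3, seed=0):
        rng = random.Random(seed)
        # Input-to-hidden weights and biases are drawn at random and never trained.
        self.w = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                  for _ in range(n_hidden)]
        self.b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        self.ridge = ridge  # assumed regularizer, keeps the linear solve well posed
        self.beta = [0.0] * n_hidden

    def hidden(self, x):
        """Hidden-layer activations for one input vector."""
        return [math.tanh(sum(wi * xi for wi, xi in zip(w, x)) + b)
                for w, b in zip(self.w, self.b)]

    def fit(self, xs, ys):
        """Learn only the output weights by ridge-regularized least squares."""
        h = [self.hidden(x) for x in xs]
        k = len(self.beta)
        # Normal equations: (H^T H + ridge * I) beta = H^T y
        gram = [[sum(row[i] * row[j] for row in h) + (self.ridge if i == j else 0.0)
                 for j in range(k)] for i in range(k)]
        rhs = [sum(row[i] * y for row, y in zip(h, ys)) for i in range(k)]
        self.beta = solve_linear(gram, rhs)
        return self

    def predict(self, x):
        return sum(bt * hv for bt, hv in zip(self.beta, self.hidden(x)))


if __name__ == "__main__":
    xs = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]  # toy XOR inputs
    ys = [0.0, 1.0, 1.0, 0.0]
    model = TinyELM(n_inputs=2, n_hidden=8, seed=1).fit(xs, ys)
    for x, y in zip(xs, ys):
        print(x, y, round(model.predict(x), 3))
```

    Because only the output weights are fitted, training reduces to one linear solve, which is the source of the fast learning speed the abstract mentions.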

  11. Virtual Things for Machine Learning Applications

    OpenAIRE

    Bovet, Gérôme; Ridi, Antonio; Hennebert, Jean

    2014-01-01

    Internet-of-Things (IoT) devices, especially sensors, are producing large quantities of data that can be used for gathering knowledge. In this field, machine learning technologies are increasingly used to build versatile data-driven models. In this paper, we present a novel architecture able to execute machine learning algorithms within the sensor network, presenting advantages in terms of privacy and data transfer efficiency. We first argue that some classes of ...

  12. Intelligent Machine Learning Approaches for Aerospace Applications

    Science.gov (United States)

    Sathyan, Anoop

    Machine Learning is a type of artificial intelligence that gives machines or networks the ability to learn from data without the need to explicitly program them. There are different kinds of machine learning techniques. This thesis discusses the applications of two of these approaches: Genetic Fuzzy Logic and Convolutional Neural Networks (CNN). A Fuzzy Logic System (FLS) is a powerful tool that can be used for a wide variety of applications. An FLS is a universal approximator that reduces the need for complex mathematics and replaces it with expert knowledge of the system to produce an input-output mapping using If-Then rules. The expert knowledge of a system can help in obtaining the parameters for small-scale FLSs, but for larger networks we will need to use sophisticated approaches that can automatically train the network to meet the design requirements. This is where Genetic Algorithms (GA) and EVE come into the picture. Both GA and EVE can tune the FLS parameters to minimize a cost function that is designed to meet the requirements of the specific problem. EVE is an artificial intelligence developed by Psibernetix that is trained to tune large-scale FLSs. The parameters of an FLS can include the membership functions and rulebase of the inherent Fuzzy Inference Systems (FISs). The main issue with using a genetic fuzzy system (GFS) is that the number of parameters in a FIS increases exponentially with the number of inputs, making them increasingly hard to tune. To reduce this issue, the FLSs discussed in this thesis consist of 2-input-1-output FISs in cascade (Chapter 4) or as a layer of parallel FISs (Chapter 7). We have obtained extremely good results using GFS for different applications at a reduced computational cost compared to other algorithms that are commonly used to solve the corresponding problems. In this thesis, GFSs have been designed for controlling an inverted double pendulum, a task allocation problem of clustering targets amongst a set of UAVs, a fire

  13. Machine Learning in Radiology: Applications Beyond Image Interpretation.

    Science.gov (United States)

    Lakhani, Paras; Prater, Adam B; Hutson, R Kent; Andriole, Kathy P; Dreyer, Keith J; Morey, Jose; Prevedello, Luciano M; Clark, Toshi J; Geis, J Raymond; Itri, Jason N; Hawkins, C Matthew

    2018-02-01

    Much attention has been given to machine learning and its perceived impact in radiology, particularly in light of recent success with image classification in international competitions. However, machine learning is likely to impact radiology outside of image interpretation long before a fully functional "machine radiologist" is implemented in practice. Here, we describe an overview of machine learning, its application to radiology and other domains, and many cases of use that do not involve image interpretation. We hope that better understanding of these potential applications will help radiology practices prepare for the future and realize performance improvement and efficiency gains. Copyright © 2017 American College of Radiology. Published by Elsevier Inc. All rights reserved.

  14. Machine Learning

    CERN Multimedia

    CERN. Geneva

    2017-01-01

    Machine learning, which builds on ideas in computer science, statistics, and optimization, focuses on developing algorithms to identify patterns and regularities in data, and using these learned patterns to make predictions on new observations. Boosted by its industrial and commercial applications, the field of machine learning is quickly evolving and expanding. Recent advances have seen great success in the realms of computer vision, natural language processing, and broadly in data science. Many of these techniques have already been applied in particle physics, for instance for particle identification, detector monitoring, and the optimization of computer resources. Modern machine learning approaches, such as deep learning, are only just beginning to be applied to the analysis of High Energy Physics data to approach more and more complex problems. These classes will review the framework behind machine learning and discuss recent developments in the field.

  15. Machine learning in radiation oncology theory and applications

    CERN Document Server

    El Naqa, Issam; Murphy, Martin J

    2015-01-01

    ​This book provides a complete overview of the role of machine learning in radiation oncology and medical physics, covering basic theory, methods, and a variety of applications in medical physics and radiotherapy. An introductory section explains machine learning, reviews supervised and unsupervised learning methods, discusses performance evaluation, and summarizes potential applications in radiation oncology. Detailed individual sections are then devoted to the use of machine learning in quality assurance; computer-aided detection, including treatment planning and contouring; image-guided rad

  16. Conformal prediction for reliable machine learning theory, adaptations and applications

    CERN Document Server

    Balasubramanian, Vineeth; Vovk, Vladimir

    2014-01-01

    The conformal predictions framework is a recent development in machine learning that can associate a reliable measure of confidence with a prediction in any real-world pattern recognition application, including risk-sensitive applications such as medical diagnosis, face recognition, and financial risk prediction. Conformal Predictions for Reliable Machine Learning: Theory, Adaptations and Applications captures the basic theory of the framework, demonstrates how to apply it to real-world problems, and presents several adaptations, including active learning, change detection, and anomaly detecti
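    One common way to attach a confidence measure to a prediction, in the spirit of this abstract, is split (inductive) conformal prediction for regression. The sketch below assumes that variant and is not taken from the book; the function names `conformal_radius` and `predict_interval` and the calibration numbers are hypothetical.

```python
import math


def conformal_radius(cal_true, cal_pred, alpha=0.1):
    """Half-width of a (1 - alpha) prediction interval from held-out calibration data.

    Nonconformity score: absolute residual |y - y_hat| on each calibration point.
    """
    scores = sorted(abs(t - p) for t, p in zip(cal_true, cal_pred))
    n = len(scores)
    # Finite-sample rank: the ceil((n + 1) * (1 - alpha))-th smallest score.
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this confidence level.
        return math.inf
    return scores[rank - 1]


def predict_interval(point_pred, radius):
    """Symmetric interval around a new point prediction."""
    return (point_pred - radius, point_pred + radius)


if __name__ == "__main__":
    # Hypothetical calibration targets and model predictions.
    cal_true = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
    cal_pred = [1.1, 1.8, 3.3, 3.6, 5.5, 5.4, 7.7, 7.2, 9.9]
    radius = conformal_radius(cal_true, cal_pred, alpha=0.2)
    print(predict_interval(5.0, radius))
```

    The interval width comes entirely from the calibration residuals, so the same wrapper can be placed around any underlying point predictor, provided the calibration and test data are exchangeable.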

  17. Applications of machine learning in cancer prediction and prognosis.

    Science.gov (United States)

    Cruz, Joseph A; Wishart, David S

    2007-02-11

    Machine learning is a branch of artificial intelligence that employs a variety of statistical, probabilistic and optimization techniques that allow computers to "learn" from past examples and to detect hard-to-discern patterns from large, noisy or complex data sets. This capability is particularly well-suited to medical applications, especially those that depend on complex proteomic and genomic measurements. As a result, machine learning is frequently used in cancer diagnosis and detection. More recently machine learning has been applied to cancer prognosis and prediction. This latter approach is particularly interesting as it is part of a growing trend towards personalized, predictive medicine. In assembling this review we conducted a broad survey of the different types of machine learning methods being used, the types of data being integrated and the performance of these methods in cancer prediction and prognosis. A number of trends are noted, including a growing dependence on protein biomarkers and microarray data, a strong bias towards applications in prostate and breast cancer, and a heavy reliance on "older" technologies such as artificial neural networks (ANNs) instead of more recently developed or more easily interpretable machine learning methods. A number of published studies also appear to lack an appropriate level of validation or testing. Among the better designed and validated studies it is clear that machine learning methods can be used to substantially (15-25%) improve the accuracy of predicting cancer susceptibility, recurrence and mortality. At a more fundamental level, it is also evident that machine learning is helping to improve our basic understanding of cancer development and progression.

  18. Machine learning for epigenetics and future medical applications

    OpenAIRE

    Holder, Lawrence B.; Haque, M. Muksitul; Skinner, Michael K.

    2017-01-01

    Understanding epigenetic processes holds immense promise for medical applications. Advances in Machine Learning (ML) are critical to realize this promise. Previous studies used epigenetic data sets associated with the germline transmission of epigenetic transgenerational inheritance of disease and novel ML approaches to predict genome-wide locations of critical epimutations. A combination of Active Learning (ACL) and Imbalanced Class Learning (ICL) was used to address past problems w...

  19. Applications of Machine Learning in Cancer Prediction and Prognosis

    Directory of Open Access Journals (Sweden)

    Joseph A. Cruz

    2006-01-01

    Machine learning is a branch of artificial intelligence that employs a variety of statistical, probabilistic and optimization techniques that allow computers to “learn” from past examples and to detect hard-to-discern patterns from large, noisy or complex data sets. This capability is particularly well-suited to medical applications, especially those that depend on complex proteomic and genomic measurements. As a result, machine learning is frequently used in cancer diagnosis and detection. More recently machine learning has been applied to cancer prognosis and prediction. This latter approach is particularly interesting as it is part of a growing trend towards personalized, predictive medicine. In assembling this review we conducted a broad survey of the different types of machine learning methods being used, the types of data being integrated and the performance of these methods in cancer prediction and prognosis. A number of trends are noted, including a growing dependence on protein biomarkers and microarray data, a strong bias towards applications in prostate and breast cancer, and a heavy reliance on “older” technologies such as artificial neural networks (ANNs) instead of more recently developed or more easily interpretable machine learning methods. A number of published studies also appear to lack an appropriate level of validation or testing. Among the better designed and validated studies it is clear that machine learning methods can be used to substantially (15-25%) improve the accuracy of predicting cancer susceptibility, recurrence and mortality. At a more fundamental level, it is also evident that machine learning is helping to improve our basic understanding of cancer development and progression.

  20. Machine learning in manufacturing: advantages, challenges, and applications

    Directory of Open Access Journals (Sweden)

    Thorsten Wuest

    2016-01-01

    Manufacturing systems exhibit ever more complex, dynamic and at times even chaotic behaviors. In order to be able to satisfy the demand for high-quality products in an efficient manner, it is essential to utilize all means available. One area that has seen fast-paced developments, in terms of not only promising results but also usability, is machine learning. Promising an answer to many of the old and new challenges of manufacturing, machine learning is widely discussed by researchers and practitioners alike. However, the field is very broad and even confusing, which presents a challenge and a barrier hindering wide application. This paper contributes an overview of available machine learning techniques and a structuring of this rather complicated area. A special focus is laid on the potential benefits and on examples of successful applications in a manufacturing environment.

  1. Applications of Support Vector Machine (SVM) Learning in Cancer Genomics.

    Science.gov (United States)

    Huang, Shujun; Cai, Nianguang; Pacheco, Pedro Penzuti; Narrandes, Shavira; Wang, Yang; Xu, Wayne

    2018-01-01

    Machine learning with maximization (support) of separating margin (vector), called support vector machine (SVM) learning, is a powerful classification tool that has been used for cancer genomic classification or subtyping. Today, as advancements in high-throughput technologies lead to production of large amounts of genomic and epigenomic data, the classification feature of SVMs is expanding its use in cancer genomics, leading to the discovery of new biomarkers, new drug targets, and a better understanding of cancer driver genes. Herein we reviewed the recent progress of SVMs in cancer genomic studies. We intend to comprehend the strength of the SVM learning and its future perspective in cancer genomic applications. Copyright© 2018, International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved.

  2. Machine Learning.

    Science.gov (United States)

    Kirrane, Diane E.

    1990-01-01

    As scientists seek to develop machines that can "learn," that is, solve problems by imitating the human brain, a gold mine of information on the processes of human learning is being discovered, expert systems are being improved, and human-machine interactions are being enhanced. (SK)

  3. Machine learning application in online lending risk prediction

    OpenAIRE

    Yu, Xiaojiao

    2017-01-01

    Online lending has disrupted the traditional consumer banking sector with more effective loan processing. Risk prediction and monitoring are critical for the success of the business model. Traditional credit score models fall short in applying big data technology to build risk models. In this manuscript, data of various formats and sizes were collected from public websites and third parties and assembled with clients' loan application information. Ensemble machine learning models, random fo...

  4. Modern machine learning techniques and their applications in cartoon animation research

    CERN Document Server

    Yu, Jun

    2013-01-01

    The integration of machine learning techniques and cartoon animation research is fast becoming a hot topic. This book helps readers learn the latest machine learning techniques, including patch alignment framework; spectral clustering, graph cuts, and convex relaxation; ensemble manifold learning; multiple kernel learning; multiview subspace learning; and multiview distance metric learning. It then presents the applications of these modern machine learning techniques in cartoon animation research. With these techniques, users can efficiently utilize the cartoon materials to generate animations

  5. Application of Machine Learning to Rotorcraft Health Monitoring

    Science.gov (United States)

    Cody, Tyler; Dempsey, Paula J.

    2017-01-01

    Machine learning is a powerful tool for data exploration and model building with large data sets. This project aimed to use machine learning techniques to explore the inherent structure of data from rotorcraft gear tests and the relationships between features and damage states, and to build a system for predicting gear health for future rotorcraft transmission applications. Classical machine learning techniques are difficult, if not irresponsible, to apply to time series data because many make the assumption of independence between samples. To overcome this, Hidden Markov Models were used to create a binary classifier for identifying scuffing transitions and Recurrent Neural Networks were used to leverage long distance relationships in predicting discrete damage states. When combined in a workflow, where the binary classifier acted as a filter for the fatigue monitor, the system was able to demonstrate accuracy in damage state prediction and scuffing identification. The time dependent nature of the data restricted data exploration to collecting and analyzing data from the model selection process. The limited amount of available data was unable to give useful information, and the division of training and testing sets tended to heavily influence the scores of the models across combinations of features and hyper-parameters. This work built a framework for tracking scuffing and fatigue on streaming data and demonstrates that machine learning has much to offer rotorcraft health monitoring by using Bayesian learning and deep learning methods to capture the time dependent nature of the data. Suggested future work is to implement the framework developed in this project using a larger variety of data sets to test the generalization capabilities of the models and allow for data exploration.

  6. Machine learning for epigenetics and future medical applications.

    Science.gov (United States)

    Holder, Lawrence B; Haque, M Muksitul; Skinner, Michael K

    2017-07-03

    Understanding epigenetic processes holds immense promise for medical applications. Advances in Machine Learning (ML) are critical to realize this promise. Previous studies used epigenetic data sets associated with the germline transmission of epigenetic transgenerational inheritance of disease and novel ML approaches to predict genome-wide locations of critical epimutations. A combination of Active Learning (ACL) and Imbalanced Class Learning (ICL) was used to address past problems with ML to develop a more efficient feature selection process and address the imbalance problem in all genomic data sets. The power of this novel ML approach and our ability to predict epigenetic phenomena and associated disease is suggested. The current approach requires extensive computation of features over the genome. A promising new approach is to introduce Deep Learning (DL) for the generation and simultaneous computation of novel genomic features tuned to the classification task. This approach can be used with any genomic or biological data set applied to medicine. The application of molecular epigenetic data in advanced machine learning analysis to medicine is the focus of this review.

  7. Machine Learning-Empowered Biometric Methods for Biomedicine Applications

    Directory of Open Access Journals (Sweden)

    Qingxue Zhang

    2017-07-01

    Nowadays, pervasive computing technologies are paving a promising way for advanced smart health applications. However, a key impediment to wide deployment of these assistive smart devices is the growing privacy and security issue, such as how to protect access to sensitive patient data in the health record. Focusing on this challenge, biometrics are attracting intense attention in terms of effective user identification to enable confidential health applications. In this paper, we take special interest in two bio-potential-based biometric modalities, electrocardiogram (ECG) and electroencephalogram (EEG), considering that they are both unique to individuals, and more reliable than token-based (identity card) and knowledge-based (username/password) methods. After extracting effective features in multiple domains from ECG/EEG signals, several advanced machine learning algorithms are introduced to perform the user identification task, including Neural Network, K-nearest Neighbor, Bagging, Random Forest and AdaBoost. Experimental results on two public ECG and EEG datasets show that ECG is a more robust biometric modality compared to EEG, leveraging a higher signal to noise ratio and also more distinguishable morphological patterns. Among different machine learning classifiers, the random forest greatly outperforms the others and achieves an identification rate as high as 98%. This study is expected to demonstrate that a properly selected biometric, empowered by an effective machine learner, has great potential to enable confidential biomedicine applications in the era of smart digital health.

  8. Extreme learning machine for ranking: generalization analysis and applications.

    Science.gov (United States)

    Chen, Hong; Peng, Jiangtao; Zhou, Yicong; Li, Luoqing; Pan, Zhibin

    2014-05-01

    The extreme learning machine (ELM) has attracted increasing attention recently with its successful applications in classification and regression. In this paper, we investigate the generalization performance of ELM-based ranking. A new regularized ranking algorithm is proposed based on the combinations of activation functions in ELM. The generalization analysis is established for the ELM-based ranking (ELMRank) in terms of the covering numbers of hypothesis space. Empirical results on the benchmark datasets show the competitive performance of the ELMRank over the state-of-the-art ranking methods. Copyright © 2014 Elsevier Ltd. All rights reserved.

  9. Machine learning approaches: from theory to application in schizophrenia.

    Science.gov (United States)

    Veronese, Elisa; Castellani, Umberto; Peruzzo, Denis; Bellani, Marcella; Brambilla, Paolo

    2013-01-01

    In recent years, machine learning approaches have been successfully applied for analysis of neuroimaging data, to help in the context of disease diagnosis. We provide, in this paper, an overview of recent support vector machine-based methods developed and applied in psychiatric neuroimaging for the investigation of schizophrenia. In particular, we focus on the algorithms implemented by our group, which have been applied to classify subjects affected by schizophrenia and healthy controls, comparing them in terms of accuracy results with other recently published studies. First we give a description of the basic terminology used in pattern recognition and machine learning. Then we separately summarize and explain each study, highlighting the main features that characterize each method. Finally, as an outcome of the comparison of the results obtained applying the described different techniques, conclusions are drawn in order to understand how much automatic classification approaches can be considered a useful tool in understanding the biological underpinnings of schizophrenia. We then conclude by discussing the main implications achievable by the application of these methods into clinical practice.

  10. Machine Learning Approaches: From Theory to Application in Schizophrenia

    Directory of Open Access Journals (Sweden)

    Elisa Veronese

    2013-01-01

    In recent years, machine learning approaches have been successfully applied for analysis of neuroimaging data, to help in the context of disease diagnosis. We provide, in this paper, an overview of recent support vector machine-based methods developed and applied in psychiatric neuroimaging for the investigation of schizophrenia. In particular, we focus on the algorithms implemented by our group, which have been applied to classify subjects affected by schizophrenia and healthy controls, comparing them in terms of accuracy results with other recently published studies. First we give a description of the basic terminology used in pattern recognition and machine learning. Then we separately summarize and explain each study, highlighting the main features that characterize each method. Finally, as an outcome of the comparison of the results obtained applying the described different techniques, conclusions are drawn in order to understand how much automatic classification approaches can be considered a useful tool in understanding the biological underpinnings of schizophrenia. We then conclude by discussing the main implications achievable by the application of these methods into clinical practice.

  11. Machine Learning

    Energy Technology Data Exchange (ETDEWEB)

    Chikkagoudar, Satish; Chatterjee, Samrat; Thomas, Dennis G.; Carroll, Thomas E.; Muller, George

    2017-04-21

    The absence of a robust and unified theory of cyber dynamics presents challenges and opportunities for using machine learning-based, data-driven approaches to further the understanding of the behavior of such complex systems. Analysts can also use machine learning approaches to gain operational insights. In order to be operationally beneficial, cybersecurity machine learning-based models need to have the ability to: (1) represent a real-world system, (2) infer system properties, and (3) learn and adapt based on expert knowledge and observations. Probabilistic models and probabilistic graphical models provide these necessary properties and are further explored in this chapter. Bayesian Networks and Hidden Markov Models are introduced as examples of widely used data-driven classification/modeling strategies.

  12. Application of Machine Learning for Dragline Failure Prediction

    Directory of Open Access Journals (Sweden)

    Taghizadeh Amir

    2017-01-01

    Full Text Available Overburden stripping in open cast coal mines is extensively carried out by walking draglines. Draglines’ unavailability and unexpected failures result in delayed production and increased maintenance and operating costs. Therefore, achieving high availability of draglines plays a crucial role in increasing the economic feasibility of mining projects. Methodologies that can forecast the failure type of a dragline from the available failure data not only help to reduce maintenance and operating costs but also increase availability and the production rate. In this study, machine learning approaches have been applied to data gathered from an operating coal mine in Turkey. The study methodology consists of three algorithms: (i) K-Nearest Neighbors, (ii) Multi-Layer Perceptron, and (iii) Radial Basis Function. The algorithms have been used to predict the draglines’ failure types. The input data, mean time-to-failure, and the output data, failure types, have been fed to the algorithms. The regression analyses of the methodologies have been compared and show that K-Nearest Neighbors has a higher rate of regression, around 70 percent. Thus, the K-Nearest Neighbors algorithm can be applied to plan preventive component replacement, which minimizes preventive and corrective cost parameters. Accurate prediction of the failure type also allows the number of inspections to be optimized. The novelty of this study is the application of machine learning approaches to dragline reliability for the first time.
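
    As an illustration of the K-Nearest Neighbors step described in this abstract, the following is a minimal sketch and not the authors' implementation. It assumes a single numeric input (standing in for mean time-to-failure) and categorical failure-type labels; the function name `knn_predict`, the sample records, and the label names are all hypothetical.

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """Predict a label for `query` by majority vote among the k nearest points.

    `train` is a list of (feature_value, label) pairs. The single feature is an
    assumed stand-in for mean time-to-failure; labels stand in for failure types.
    """
    # Sort training points by absolute distance to the query (1-D feature).
    nearest = sorted(train, key=lambda pair: abs(pair[0] - query))[:k]
    # Majority vote over the labels of the k nearest neighbours.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical records: (mean time-to-failure in hours, failure type).
records = [
    (12.0, "electrical"), (15.0, "electrical"), (14.0, "electrical"),
    (40.0, "mechanical"), (43.0, "mechanical"), (38.0, "mechanical"),
]

print(knn_predict(records, 13.5))
print(knn_predict(records, 41.0))
```

    With real failure data, the value of k and the distance measure would need to be chosen by validation; the values here are arbitrary.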

  13. Application of Extreme Learning Machines to inverse neutron kinetics

    International Nuclear Information System (INIS)

    Picca, Paolo; Furfaro, Roberto

    2017-01-01

    Highlights: • The paper applies Extreme Learning Machines (ELMs) to inverse reactor problems. • Multi-group transport model is used for the inversion as opposed to point kinetics. • ELMs are compared against Artificial Neural Networks (ANNs). • Various options are tested to improve the reliability of the estimation. • Results highlight the potential of the ELM approach. - Abstract: The paper presents the application of Extreme Learning Machines (ELMs) to inverse reactor kinetic problems. ELMs were proposed by Huang and co-workers (2004, 2006a,b, 2015), who showed their enhanced capabilities in terms of training speed and generalization with respect to classical Artificial Neural Networks (ANNs). ELMs are here implemented for reactivity determination as an alternative to ANNs (e.g. Picca et al. (2008)) and Gaussian Processes (Picca and Furfaro, 2012). After a review of the main features of ELMs, their application to inverse kinetic problems is proposed. The ELM performance is tested on a typical accelerator-driven system configuration (Yalina reactor) and the inversion is carried out on an accurate kinetic model (multi-group transport).
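
    The training-speed advantage mentioned in this abstract comes from how an ELM is fitted: the hidden-layer weights are drawn at random and left untrained, and only the output weights are solved for by linear least squares. The sketch below shows that idea on a toy regression problem, assuming NumPy is available. It is not the paper's reactor-kinetics setup; the function names, the tanh activation, the hidden-layer size, and the sine target are illustrative assumptions.

```python
import numpy as np

def elm_fit(X, y, n_hidden=50, seed=0):
    """Fit a single-hidden-layer ELM: random hidden layer, least-squares output."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights, never trained
    b = rng.normal(size=n_hidden)                # random hidden biases, never trained
    H = np.tanh(X @ W + b)                       # hidden-layer activations
    beta = np.linalg.pinv(H) @ y                 # output weights via pseudoinverse
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Evaluate the fitted ELM on new inputs."""
    return np.tanh(X @ W + b) @ beta

# Toy target (an assumption for illustration): fit sin(x) on [-3, 3].
X = np.linspace(-3.0, 3.0, 200).reshape(-1, 1)
y = np.sin(X).ravel()

W, b, beta = elm_fit(X, y)
err = np.max(np.abs(elm_predict(X, W, b, beta) - y))
print(f"max abs training error: {err:.4f}")
```

    Because no iterative gradient descent is involved, fitting reduces to one pseudoinverse computation, which is the source of the speed-up over a conventionally trained network.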

  14. Applications of Support Vector Machine (SVM) Learning in Cancer Genomics

    OpenAIRE

    HUANG, SHUJUN; CAI, NIANGUANG; PACHECO, PEDRO PENZUTI; NARANDES, SHAVIRA; WANG, YANG; XU, WAYNE

    2017-01-01

    Machine learning with maximization (support) of separating margin (vector), called support vector machine (SVM) learning, is a powerful classification tool that has been used for cancer genomic classification or subtyping. Today, as advancements in high-throughput technologies lead to production of large amounts of genomic and epigenomic data, the classification feature of SVMs is expanding its use in cancer genomics, leading to the discovery of new biomarkers, new drug targets, and a better ...

  15. Machine learning application in the life time of materials

    OpenAIRE

    Yu, Xiaojiao

    2017-01-01

    Materials design and development typically takes several decades from the initial discovery to commercialization with the traditional trial and error development approach. With the accumulation of data from both experimental and computational results, data based machine learning becomes an emerging field in materials discovery, design and property prediction. This manuscript reviews the history of materials science as a disciplinary the most common machine learning method used in materials sc...

  16. Application of machine learning on brain cancer multiclass classification

    Science.gov (United States)

    Panca, V.; Rustam, Z.

    2017-07-01

    Classification of brain cancer is a multiclass classification problem. One approach to solving this problem is to first transform it into several binary problems. The microarray gene expression dataset has the two main characteristics of medical data: a very large number of features (genes) and only a small number of samples. The application of machine learning to a microarray gene expression dataset mainly consists of two steps: feature selection and classification. In this paper, the features are selected using a method based on the support vector machine recursive feature elimination (SVM-RFE) principle, which is improved to solve multiclass classification and is called multiple multiclass SVM-RFE. Instead of using only the selected features on a single classifier, this method combines the results of multiple classifiers. The features are divided into subsets and SVM-RFE is used on each subset. Then, the selected features of each subset are put on separate classifiers. This method enhances the feature selection ability of each single SVM-RFE. Twin support vector machine (TWSVM) is used as the classifier to reduce computational complexity. While an ordinary SVM finds a single optimum hyperplane, the main objective of twin SVM is to find two non-parallel optimum hyperplanes. The experiment on the brain cancer microarray gene expression dataset shows that this method could classify 71.4% of the overall test data correctly, using 100 and 1000 genes selected by the multiple multiclass SVM-RFE feature selection method. Furthermore, the per-class results show that this method could classify data of the normal and MD classes with 100% accuracy.
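
    The recursive feature elimination loop at the core of SVM-RFE can be sketched as follows, assuming NumPy is available. This is a simplified, single binary problem and not the paper's multiple multiclass SVM-RFE: a ridge-regularised least-squares linear model is swapped in for the linear SVM, and the synthetic data, the function name `rfe_rank`, and all parameter values are assumptions made for illustration. The ranking rule (repeatedly drop the feature with the smallest squared weight) is the part that carries over.

```python
import numpy as np

def rfe_rank(X, y, n_keep, ridge=1e-3):
    """Recursive feature elimination with a linear scorer.

    Note: a ridge least-squares fit stands in for the linear SVM used in SVM-RFE.
    Returns the column indices of the features that survive elimination.
    """
    active = list(range(X.shape[1]))
    while len(active) > n_keep:
        Xa = X[:, active]
        # Fit linear weights on the currently active features.
        w = np.linalg.solve(Xa.T @ Xa + ridge * np.eye(len(active)), Xa.T @ y)
        # Remove the feature with the smallest squared weight, then refit.
        active.pop(int(np.argmin(w ** 2)))
    return active

# Synthetic data (an assumption): only features 2 and 5 determine the label.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = np.sign(X[:, 2] + 0.8 * X[:, 5])

selected = rfe_rank(X, y, n_keep=2)
print(selected)
```

    Refitting after every removal is what makes the procedure recursive: a feature's weight can change once a correlated feature has been dropped, so a single one-shot ranking would not be equivalent.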

  17. Introduction to machine learning.

    Science.gov (United States)

    Baştanlar, Yalin; Ozuysal, Mustafa

    2014-01-01

    The machine learning field, which can be briefly defined as enabling computers to make successful predictions using past experiences, has developed impressively in recent years with the help of the rapid increase in the storage capacity and processing power of computers. Together with many other disciplines, machine learning methods have been widely employed in bioinformatics. The difficulties and cost of biological analyses have led to the development of sophisticated machine learning approaches for this application area. In this chapter, we first review the fundamental concepts of machine learning such as feature assessment, unsupervised versus supervised learning and types of classification. Then, we point out the main issues of designing machine learning experiments and their performance evaluation. Finally, we introduce some supervised learning methods.

  18. Energy landscapes for a machine learning application to series data

    Energy Technology Data Exchange (ETDEWEB)

    Ballard, Andrew J.; Stevenson, Jacob D.; Das, Ritankar; Wales, David J., E-mail: dw34@cam.ac.uk [University Chemical Laboratories, Lensfield Road, Cambridge CB2 1EW (United Kingdom)

    2016-03-28

    Methods developed to explore and characterise potential energy landscapes are applied to the corresponding landscapes obtained from optimisation of a cost function in machine learning. We consider neural network predictions for the outcome of local geometry optimisation in a triatomic cluster, where four distinct local minima exist. The accuracy of the predictions is compared for fits using data from single and multiple points in the series of atomic configurations resulting from local geometry optimisation and for alternative neural networks. The machine learning solution landscapes are visualised using disconnectivity graphs, and signatures in the effective heat capacity are analysed in terms of distributions of local minima and their properties.

  19. Energy landscapes for a machine learning application to series data

    International Nuclear Information System (INIS)

    Ballard, Andrew J.; Stevenson, Jacob D.; Das, Ritankar; Wales, David J.

    2016-01-01

    Methods developed to explore and characterise potential energy landscapes are applied to the corresponding landscapes obtained from optimisation of a cost function in machine learning. We consider neural network predictions for the outcome of local geometry optimisation in a triatomic cluster, where four distinct local minima exist. The accuracy of the predictions is compared for fits using data from single and multiple points in the series of atomic configurations resulting from local geometry optimisation and for alternative neural networks. The machine learning solution landscapes are visualised using disconnectivity graphs, and signatures in the effective heat capacity are analysed in terms of distributions of local minima and their properties.

  20. Machine learning applications in cancer prognosis and prediction.

    Science.gov (United States)

    Kourou, Konstantina; Exarchos, Themis P; Exarchos, Konstantinos P; Karamouzis, Michalis V; Fotiadis, Dimitrios I

    2015-01-01

    Cancer has been characterized as a heterogeneous disease consisting of many different subtypes. The early diagnosis and prognosis of a cancer type have become a necessity in cancer research, as they can facilitate the subsequent clinical management of patients. The importance of classifying cancer patients into high or low risk groups has led many research teams, from the biomedical and the bioinformatics fields, to study the application of machine learning (ML) methods. These techniques have therefore been used to model the progression and treatment of cancerous conditions. In addition, the ability of ML tools to detect key features from complex datasets reveals their importance. A variety of these techniques, including Artificial Neural Networks (ANNs), Bayesian Networks (BNs), Support Vector Machines (SVMs) and Decision Trees (DTs), have been widely applied in cancer research for the development of predictive models, resulting in effective and accurate decision making. Even though it is evident that the use of ML methods can improve our understanding of cancer progression, an appropriate level of validation is needed in order for these methods to be considered in everyday clinical practice. In this work, we present a review of recent ML approaches employed in the modeling of cancer progression. The predictive models discussed here are based on various supervised ML techniques as well as on different input features and data samples. Given the growing trend in the application of ML methods in cancer research, we present here the most recent publications that employ these techniques to model cancer risk or patient outcomes.

  1. Application of Machine Learning Algorithms for the Query Performance Prediction

    Directory of Open Access Journals (Sweden)

    MILICEVIC, M.

    2015-08-01

    Full Text Available This paper analyzes the relationship between the system load/throughput and the query response time in a real online transaction processing (OLTP) system environment. Although OLTP systems are characterized by short transactions, which normally entail high availability and consistent short response times, the need for operational reporting may jeopardize these objectives. We suggest a new approach to performance prediction for concurrent database workloads, based on the system state vector, which consists of 36 attributes. There is no bias toward the importance of certain attributes; instead, machine learning methods are used to determine which attributes better describe the behavior of the particular database server and how to model that system. During the learning phase, the system's profile is created using multiple reference queries, which are selected to represent frequent business processes. The possibility of accurate response time prediction may be a foundation for automated decision-making for database (DB) query scheduling. Possible applications of the proposed method include adaptive resource allocation, quality of service (QoS) management or real-time dynamic query scheduling (e.g. estimation of the optimal moment for a complex query execution).

  2. Machine learning in materials informatics: recent applications and prospects

    Science.gov (United States)

    Ramprasad, Rampi; Batra, Rohit; Pilania, Ghanshyam; Mannodi-Kanakkithodi, Arun; Kim, Chiho

    2017-12-01

    Propelled partly by the Materials Genome Initiative, and partly by the algorithmic developments and the resounding successes of data-driven efforts in other domains, informatics strategies are beginning to take shape within materials science. These approaches lead to surrogate machine learning models that enable rapid predictions based purely on past data rather than by direct experimentation or by computations/simulations in which fundamental equations are explicitly solved. Data-centric informatics methods are becoming useful to determine material properties that are hard to measure or compute using traditional methods—due to the cost, time or effort involved—but for which reliable data either already exists or can be generated for at least a subset of the critical cases. Predictions are typically interpolative, involving fingerprinting a material numerically first, and then following a mapping (established via a learning algorithm) between the fingerprint and the property of interest. Fingerprints, also referred to as "descriptors", may be of many types and scales, as dictated by the application domain and needs. Predictions may also be extrapolative—extending into new materials spaces—provided prediction uncertainties are properly taken into account. This article attempts to provide an overview of some of the recent successful data-driven "materials informatics" strategies undertaken in the last decade, with particular emphasis on the fingerprint or descriptor choices. The review also identifies some challenges the community is facing and those that should be overcome in the near future.

  3. Machine Learning and Radiology

    Science.gov (United States)

    Wang, Shijun; Summers, Ronald M.

    2012-01-01

    In this paper, we give a short introduction to machine learning and survey its applications in radiology. We focused on six categories of applications in radiology: medical image segmentation, registration, computer aided detection and diagnosis, brain function or activity analysis and neurological disease diagnosis from fMR images, content-based image retrieval systems for CT or MRI images, and text analysis of radiology reports using natural language processing (NLP) and natural language understanding (NLU). This survey shows that machine learning plays a key role in many radiology applications. Machine learning identifies complex patterns automatically and helps radiologists make intelligent decisions on radiology data such as conventional radiographs, CT, MRI, and PET images and radiology reports. In many applications, the performance of machine learning-based automatic detection and diagnosis systems has been shown to be comparable to that of a well-trained and experienced radiologist. Technology development in machine learning and radiology will benefit from each other in the long run. Key contributions and common characteristics of machine learning techniques in radiology are discussed. We also discuss the problem of translating machine learning applications to the radiology clinical setting, including advantages and potential barriers. PMID:22465077

  4. Machine learning and radiology.

    Science.gov (United States)

    Wang, Shijun; Summers, Ronald M

    2012-07-01

    In this paper, we give a short introduction to machine learning and survey its applications in radiology. We focused on six categories of applications in radiology: medical image segmentation, registration, computer aided detection and diagnosis, brain function or activity analysis and neurological disease diagnosis from fMR images, content-based image retrieval systems for CT or MRI images, and text analysis of radiology reports using natural language processing (NLP) and natural language understanding (NLU). This survey shows that machine learning plays a key role in many radiology applications. Machine learning identifies complex patterns automatically and helps radiologists make intelligent decisions on radiology data such as conventional radiographs, CT, MRI, and PET images and radiology reports. In many applications, the performance of machine learning-based automatic detection and diagnosis systems has been shown to be comparable to that of a well-trained and experienced radiologist. Technology development in machine learning and radiology will benefit from each other in the long run. Key contributions and common characteristics of machine learning techniques in radiology are discussed. We also discuss the problem of translating machine learning applications to the radiology clinical setting, including advantages and potential barriers. Copyright © 2012. Published by Elsevier B.V.

  5. MACHINE LEARNING FOR THE SELF-ORGANIZATION OF DISTRIBUTED SYSTEMS IN ECONOMIC APPLICATIONS

    OpenAIRE

    Jerzy Balicki; Waldemar Korłub

    2017-01-01

    In this paper, an application of machine learning to the problem of self-organization of distributed systems has been discussed with regard to economic applications, with particular emphasis on supervised neural network learning to predict stock investments and some ratings of companies. In addition, genetic programming can play an important role in the preparation and testing of several financial information systems. For this reason, machine learning applications have been discussed because ...

  6. Kernel Methods for Machine Learning with Life Science Applications

    DEFF Research Database (Denmark)

    Abrahamsen, Trine Julie

    Kernel methods refer to a family of widely used nonlinear algorithms for machine learning tasks like classification, regression, and feature extraction. By exploiting the so-called kernel trick, straightforward extensions of classical linear algorithms are enabled as long as the data only appear a...

  7. Support vector machines applications

    CERN Document Server

    Guo, Guodong

    2014-01-01

    Support vector machines (SVM) have both a solid mathematical background and good performance in practical applications. This book focuses on the recent advances and applications of the SVM in different areas, such as image processing, medical practice, computer vision, pattern recognition, machine learning, applied statistics, business intelligence, and artificial intelligence. The aim of this book is to create a comprehensive source on support vector machine applications, especially some recent advances.

  8. Machine Learning Identification of Protein Properties Useful for Specific Applications

    KAUST Repository

    Khamis, Abdullah

    2016-03-31

    Proteins play critical roles in cellular processes of living organisms. It is therefore important to identify and characterize their key properties associated with their functions. Correlating protein’s structural, sequence and physicochemical properties of its amino acids (aa) with protein functions could identify some of the critical factors governing the specific functionality. We point out that not all functions of even well studied proteins are known. This, complemented by the huge increase in the number of newly discovered and predicted proteins, makes challenging the experimental characterization of the whole spectrum of possible protein functions for all proteins of interest. Consequently, the use of computational methods has become more attractive. Here we address two questions. The first one is how to use protein aa sequence and physicochemical properties to characterize a family of proteins. The second one focuses on how to use transcription factor (TF) protein’s domains to enhance accuracy of predicting TF DNA binding sites (TFBSs). To address the first question, we developed a novel method using computational representation of proteins based on characteristics of different protein regions (N-terminal, M-region and C-terminal) and combined these with the properties of protein aa sequences. We show that this description provides important biological insight about characterization of the protein functional groups. Using feature selection techniques, we identified key properties of proteins that allow for very accurate characterization of different protein families. We demonstrated efficiency of our method in application to a number of antimicrobial peptide families. To address the second question we developed another novel method that uses a combination of aa properties of DNA binding domains of TFs and their TFBS properties to develop machine learning models for predicting TFBSs. Feature selection is used to identify the most relevant characteristics

  9. Application of Machine Learning Approaches for Protein-protein Interactions Prediction.

    Science.gov (United States)

    Zhang, Mengying; Su, Qiang; Lu, Yi; Zhao, Manman; Niu, Bing

    2017-01-01

    Proteomics endeavors to study the structures, functions and interactions of proteins. Information on protein-protein interactions (PPIs) helps to improve our knowledge of the functions and the 3D structures of proteins. Thus, determining PPIs is essential for the study of proteomics. In this review, in order to study the application of machine learning in predicting PPIs, some machine learning approaches such as support vector machine (SVM), artificial neural networks (ANNs) and random forest (RF) were selected, and examples of their applications to PPIs were listed. SVM and RF are two commonly used methods. Nowadays, more researchers predict PPIs by combining more than two methods. This review presents the application of machine learning approaches in predicting PPIs. Many examples of success in identification and prediction in the area of PPI prediction have been discussed, and PPI research is still in progress. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  10. MACHINE LEARNING FOR THE SELF-ORGANIZATION OF DISTRIBUTED SYSTEMS IN ECONOMIC APPLICATIONS

    Directory of Open Access Journals (Sweden)

    Jerzy Balicki

    2017-03-01

    Full Text Available In this paper, an application of machine learning to the problem of self-organization of distributed systems has been discussed with regard to economic applications, with particular emphasis on supervised neural network learning to predict stock investments and some ratings of companies. In addition, genetic programming can play an important role in the preparation and testing of several financial information systems. For this reason, machine learning applications have been discussed because some software applications can be automatically constructed by genetic programming. To obtain a competitive advantage, machine learning can be used for the management of self-organizing cloud computing systems performing calculations for business. Also the use of selected economic self-organizing distributed systems has been described, including some testing methods of predicting borrower reliability. Finally, some conclusions and directions for further research have been proposed.

  11. An experimental result of estimating an application volume by machine learning techniques.

    Science.gov (United States)

    Hasegawa, Tatsuhito; Koshino, Makoto; Kimura, Haruhiko

    2015-01-01

    In this study, we improved the usability of smartphones by automating a user's operations. We developed an intelligent system using machine learning techniques that periodically detects a user's context on a smartphone. We selected the Android operating system because it has the largest market share and highest flexibility of its development environment. In this paper, we describe an application that automatically adjusts application volume. Adjusting the volume can be easily forgotten because users need to push the volume buttons to alter the volume depending on the given situation. Therefore, we developed an application that automatically adjusts the volume based on learned user settings. Application volume can be set differently from ringtone volume on Android devices, and these volume settings are associated with each specific application including games. Our application records a user's location, the volume setting, the foreground application name and other such attributes as learning data, thereby estimating whether the volume should be adjusted using machine learning techniques via Weka.

  12. Machine Learning and Applied Linguistics

    OpenAIRE

    Vajjala, Sowmya

    2018-01-01

    This entry introduces the topic of machine learning and provides an overview of its relevance for applied linguistics and language learning. The discussion will focus on giving an introduction to the methods and applications of machine learning in applied linguistics, and will provide references for further study.

  13. The application of machine learning techniques in the clinical drug therapy.

    Science.gov (United States)

    Meng, Huan-Yu; Jin, Wan-Lin; Yan, Cheng-Kai; Yang, Huan

    2018-05-25

    The development of a novel drug is an extremely complicated process that includes the target identification, design and manufacture, and proper therapy of the novel drug, as well as drug dose selection, drug efficacy evaluation, and adverse drug reaction control. Due to the limited resources, high costs, long duration, and low hit-to-lead ratio in the development of pharmacogenetics and computer technology, machine learning techniques have assisted novel drug development and have gradually received more attention from researchers. According to current research, machine learning techniques are widely applied in the process of the discovery of new drugs and novel drug targets, the decision surrounding proper therapy and drug dose, and the prediction of drug efficacy and adverse drug reactions. In this article, we discuss the history, workflow, and advantages and disadvantages of machine learning techniques in the processes mentioned above. Although the advantages of machine learning techniques are fairly obvious, the application of machine learning techniques is currently limited. With further research, the application of machine learning techniques in drug development could be much more widespread and could potentially be one of the major methods used in drug development. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  14. A Comprehensive Review and meta-analysis on Applications of Machine Learning Techniques in Intrusion Detection

    Directory of Open Access Journals (Sweden)

    Manojit Chattopadhyay

    2018-05-01

    Full Text Available Securing a machine from various cyber-attacks has been of serious concern for researchers, statutory bodies such as governments, business organizations and users in both wired and wireless media. However, during the last decade, the amount of data handling by any device, particularly servers, has increased exponentially and hence the security of these devices has become a matter of utmost concern. This paper attempts to examine the challenges in the application of machine learning techniques to intrusion detection. We review different inherent issues in defining and applying the machine learning techniques to intrusion detection. We also attempt to identify the best technological solution for changing usage pattern by comparing different machine learning techniques on different datasets and summarizing their performance using various performance metrics. This paper highlights the research challenges and future trends of intrusion detection in dynamic scenarios of intrusion detection problems in diverse network technologies.

  15. Machine learning applications in proteomics research: how the past can boost the future.

    Science.gov (United States)

    Kelchtermans, Pieter; Bittremieux, Wout; De Grave, Kurt; Degroeve, Sven; Ramon, Jan; Laukens, Kris; Valkenborg, Dirk; Barsnes, Harald; Martens, Lennart

    2014-03-01

    Machine learning is a subdiscipline within artificial intelligence that focuses on algorithms that allow computers to learn solving a (complex) problem from existing data. This ability can be used to generate a solution to a particularly intractable problem, given that enough data are available to train and subsequently evaluate an algorithm on. Since MS-based proteomics has no shortage of complex problems, and since publicly available data are becoming available in ever growing amounts, machine learning is fast becoming a very popular tool in the field. We here therefore present an overview of the different applications of machine learning in proteomics that together cover nearly the entire wet- and dry-lab workflow, and that address key bottlenecks in experiment planning and design, as well as in data processing and analysis. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  16. Soft computing in machine learning

    CERN Document Server

    Park, Jooyoung; Inoue, Atsushi

    2014-01-01

    As users or consumers are now demanding smarter devices, intelligent systems are being revolutionized by machine learning. Machine learning as part of intelligent systems is already one of the most critical components in everyday tools ranging from search engines and credit card fraud detection to stock market analysis. You can train machines to perform some things, so that they can automatically detect, diagnose, and solve a variety of problems. Intelligent systems have made rapid progress in developing the state of the art in machine learning based on smart and deep perception. Using machine learning, intelligent systems are widely applied in automated speech recognition, natural language processing, medical diagnosis, bioinformatics, and robot locomotion. This book aims to introduce how to treat a substantial amount of data, to teach machines and to improve decision making models. The book also specializes in the development of advanced intelligent systems through machine learning. It...

  17. e-Learning Application for Machine Maintenance Process using Iterative Method in XYZ Company

    Science.gov (United States)

    Nurunisa, Suaidah; Kurniawati, Amelia; Pramuditya Soesanto, Rayinda; Yunan Kurnia Septo Hediyanto, Umar

    2016-02-01

    XYZ Company manufactures airplane parts. One of the machines categorized as a key facility in the company is the Millac 5H6P. As a key facility, the machine must be kept working well and in peak condition; therefore, periodic maintenance is needed. Data gathering showed that maintenance staff lack the competency to maintain types of machine other than those assigned to them by the supervisor, which indicates that knowledge is unevenly distributed among the maintenance staff. The purpose of this research is to create a knowledge-based e-learning application as a realization of the externalization process in knowledge transfer for machine maintenance. The application features are adapted for maintenance purposes using an e-learning framework for the maintenance process, and the application content supports multimedia for learning purposes. QFD is used in this research to understand user needs. The application is built using Moodle with an iterative software development method and UML diagrams. The result of this research is an e-learning application that serves as a knowledge-sharing medium for maintenance staff in the company. Testing showed that the application makes it easy for maintenance staff to understand the competencies.

  18. Machine Learning for Security

    CERN Multimedia

    CERN. Geneva

    2015-01-01

    Applied statistics, aka ‘Machine Learning’, offers a wealth of techniques for answering security questions. It’s a much hyped topic in the big data world, with many companies now providing machine learning as a service. This talk will demystify these techniques, explain the math, and demonstrate their application to security problems. The presentation will include how-to’s on classifying malware, looking into encrypted tunnels, and finding botnets in DNS data. About the speaker: Josiah is a security researcher with HP TippingPoint DVLabs Research Group. He has over 15 years of professional software development experience. Josiah used to do AI, with work focused on graph theory, search, and deductive inference on large knowledge bases. As rules only get you so far, he moved from AI to using machine learning techniques to identify failure modes in email traffic. There followed digressions into clustered data storage and later integrated control systems. Current ...

  19. Human Machine Learning Symbiosis

    Science.gov (United States)

    Walsh, Kenneth R.; Hoque, Md Tamjidul; Williams, Kim H.

    2017-01-01

    Human Machine Learning Symbiosis is a cooperative system where both the human learner and the machine learner learn from each other to create an effective and efficient learning environment adapted to the needs of the human learner. Such a system can be used in online learning modules so that the modules adapt to each learner's learning state both…

  20. Application of machine learning techniques to lepton energy reconstruction in water Cherenkov detectors

    Science.gov (United States)

    Drakopoulou, E.; Cowan, G. A.; Needham, M. D.; Playfer, S.; Taani, M.

    2018-04-01

    The application of machine learning techniques to the reconstruction of lepton energies in water Cherenkov detectors is discussed and illustrated for TITUS, a proposed intermediate detector for the Hyper-Kamiokande experiment. It is found that applying these techniques leads to an improvement of more than 50% in the energy resolution for all lepton energies compared to an approach based upon lookup tables. Machine learning techniques can be easily applied to different detector configurations and the results are comparable to likelihood-function based techniques that are currently used.

  1. Application of Machine Learning to Proteomics Data: Classification and Biomarker Identification in Postgenomics Biology

    Science.gov (United States)

    Swan, Anna Louise; Mobasheri, Ali; Allaway, David; Liddell, Susan

    2013-01-01

    Mass spectrometry is an analytical technique for the characterization of biological samples and is increasingly used in omics studies because of its targeted, nontargeted, and high throughput abilities. However, due to the large datasets generated, it requires informatics approaches such as machine learning techniques to analyze and interpret relevant data. Machine learning can be applied to MS-derived proteomics data in two ways. First, directly to mass spectral peaks and second, to proteins identified by sequence database searching, although relative protein quantification is required for the latter. Machine learning has been applied to mass spectrometry data from different biological disciplines, particularly for various cancers. The aims of such investigations have been to identify biomarkers and to aid in diagnosis, prognosis, and treatment of specific diseases. This review describes how machine learning has been applied to proteomics tandem mass spectrometry data. This includes how it can be used to identify proteins suitable for use as biomarkers of disease and for classification of samples into disease or treatment groups, which may be applicable for diagnostics. It also includes the challenges faced by such investigations, such as prediction of proteins present, protein quantification, planning for the use of machine learning, and small sample sizes. PMID:24116388

  2. Quantum machine learning.

    Science.gov (United States)

    Biamonte, Jacob; Wittek, Peter; Pancotti, Nicola; Rebentrost, Patrick; Wiebe, Nathan; Lloyd, Seth

    2017-09-13

    Fuelled by increasing computer power and algorithmic advances, machine learning techniques have become powerful tools for finding patterns in data. Quantum systems produce atypical patterns that classical systems are thought not to produce efficiently, so it is reasonable to postulate that quantum computers may outperform classical computers on machine learning tasks. The field of quantum machine learning explores how to devise and implement quantum software that could enable machine learning that is faster than that of classical computers. Recent work has produced quantum algorithms that could act as the building blocks of machine learning programs, but the hardware and software challenges are still considerable.

  3. Machine learning and medical imaging

    CERN Document Server

    Shen, Dinggang; Sabuncu, Mert

    2016-01-01

    Machine Learning and Medical Imaging presents state-of-the-art machine learning methods in medical image analysis. It first summarizes cutting-edge machine learning algorithms in medical imaging, including not only classical probabilistic modeling and learning methods, but also recent breakthroughs in deep learning, sparse representation/coding, and big data hashing. In the second part leading research groups around the world present a wide spectrum of machine learning methods with application to different medical imaging modalities, clinical domains, and organs. The biomedical imaging modalities include ultrasound, magnetic resonance imaging (MRI), computed tomography (CT), histology, and microscopy images. The targeted organs span the lung, liver, brain, and prostate, while there is also a treatment of examining genetic associations. Machine Learning and Medical Imaging is an ideal reference for medical imaging researchers, industry scientists and engineers, advanced undergraduate and graduate students, a...

  4. Theory of Belief Functions for Data Analysis and Machine Learning Applications: Review and Prospects

    Science.gov (United States)

    Denoeux, Thierry

    The Dempster-Shafer theory of belief functions provides a unified framework for handling both aleatory uncertainty, arising from statistical variability in populations, and epistemic uncertainty, arising from incompleteness of knowledge. An overview of both the fundamentals and some recent developments in this theory will first be presented. Several applications in data analysis and machine learning will then be reviewed, including learning under partial supervision, multi-label classification, ensemble clustering and the treatment of pairwise comparisons in sensory or preference analysis.

  5. Application of artificial neural network with extreme learning machine for economic growth estimation

    Science.gov (United States)

    Milačić, Ljubiša; Jović, Srđan; Vujović, Tanja; Miljković, Jovica

    2017-01-01

    The purpose of this research is to develop and apply the artificial neural network (ANN) with extreme learning machine (ELM) to forecast gross domestic product (GDP) growth rate. The economic growth forecasting was analyzed based on agriculture, manufacturing, industry and services value added in GDP. The results were compared with ANN with back propagation (BP) learning approach, since BP could be considered the conventional learning methodology. The reliability of the computational models was assessed based on simulation results and using several statistical indicators. Based on the results, it was shown that ANN with ELM learning methodology can be applied effectively in applications of GDP forecasting.

  6. Machine learning with R

    CERN Document Server

    Lantz, Brett

    2013-01-01

    Written as a tutorial to explore and understand the power of R for machine learning, this practical guide covers all of the need-to-know topics in a very systematic way. For each machine learning approach, each step in the process is detailed, from preparing the data for analysis to evaluating the results. These steps will build the knowledge you need to apply them to your own data science tasks. Intended for those who want to learn how to use R's machine learning capabilities and gain insight from their data. Perhaps you already know a bit about machine learning, but have never used R; or

  7. Application of machine learning methods for traffic signs recognition

    Science.gov (United States)

    Filatov, D. V.; Ignatev, K. V.; Deviatkin, A. V.; Serykh, E. V.

    2018-02-01

    This paper focuses on solving a relevant and pressing safety issue on intercity roads. Two approaches were considered for solving the problem of traffic sign recognition; both approaches use neural networks to analyze images obtained from a camera in real time. The first approach is based on sequential image processing. At the initial stage, with the help of color filters and morphological operations (dilation and erosion), the area containing the traffic sign is located in the image; then the selected and scaled fragment of the image is analyzed using a feedforward neural network to determine the meaning of the found traffic sign. Learning of the neural network in this approach is carried out using a backpropagation method. The second approach involves convolutional neural networks at both stages, i.e. when searching and selecting the area of the image containing the traffic sign, and when determining its meaning. Learning of the neural network in the second approach is carried out using the intersection over union function and a loss function. For neural networks to learn and the proposed algorithms to be tested, a series of videos from a dash cam were used that were shot under various weather and illumination conditions. As a result, the proposed approaches for traffic sign recognition were analyzed and compared by key indicators such as recognition rate percentage and the complexity of the neural networks’ learning process.
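
    The abstract names intersection over union as part of training in the second approach. A minimal Python sketch of that overlap measure for two axis-aligned boxes is given here; the function name `iou` and the (x_min, y_min, x_max, y_max) box layout are illustrative assumptions, not details taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max); this layout is an assumption
    made for the sketch, not something specified in the abstract.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


# Hypothetical predicted and ground-truth sign locations, in pixels.
predicted = (0.0, 0.0, 2.0, 2.0)
ground_truth = (1.0, 1.0, 3.0, 3.0)
overlap = iou(predicted, ground_truth)
```

    A value of 1 means the predicted box coincides with the ground truth and 0 means no overlap, which is why the measure is commonly used to score localization.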

  8. Machine learning in virtual screening.

    Science.gov (United States)

    Melville, James L; Burke, Edmund K; Hirst, Jonathan D

    2009-05-01

    In this review, we highlight recent applications of machine learning to virtual screening, focusing on the use of supervised techniques to train statistical learning algorithms to prioritize databases of molecules as active against a particular protein target. Both ligand-based similarity searching and structure-based docking have benefited from machine learning algorithms, including naïve Bayesian classifiers, support vector machines, neural networks, and decision trees, as well as more traditional regression techniques. Effective application of these methodologies requires an appreciation of data preparation, validation, optimization, and search methodologies, and we also survey developments in these areas.

  9. Microsoft Azure machine learning

    CERN Document Server

    Mund, Sumit

    2015-01-01

    The book is intended for those who want to learn how to use Azure Machine Learning. Perhaps you already know a bit about Machine Learning, but have never used ML Studio in Azure; or perhaps you are an absolute newbie. In either case, this book will get you up-and-running quickly.

  10. Pattern recognition & machine learning

    CERN Document Server

    Anzai, Y

    1992-01-01

    This is the first text to provide a unified and self-contained introduction to visual pattern recognition and machine learning. It is useful as a general introduction to artificial intelligence and knowledge engineering, and no previous knowledge of pattern recognition or machine learning is necessary. It covers the basics of various pattern recognition and machine learning methods. Translated from Japanese, the book also features chapter exercises, keywords, and summaries.

  11. Introduction to machine learning

    OpenAIRE

    Baştanlar, Yalın; Özuysal, Mustafa

    2014-01-01

    The machine learning field, which can be briefly defined as enabling computers to make successful predictions using past experiences, has exhibited an impressive development recently with the help of the rapid increase in the storage capacity and processing power of computers. Together with many other disciplines, machine learning methods have been widely employed in bioinformatics. The difficulties and cost of biological analyses have led to the development of sophisticated machine learning app...

  12. Applied machine learning in greenhouse simulation; new application and analysis

    Directory of Open Access Journals (Sweden)

    Morteza Taki

    2018-06-01

    Predicting the inside environment variables in greenhouses is very important because they play a vital role in greenhouse cultivation and energy loss, especially in cold and hot regions. The greenhouse environment is an uncertain nonlinear system that classical modeling methods have difficulty handling. The main goal of this study is therefore to select the best method between Artificial Neural Network (ANN) and Support Vector Machine (SVM) models to estimate three variables, namely the inside air, soil and plant temperatures (Ta, Ts, Tp), as well as the energy exchange in a polyethylene greenhouse in Shahreza city, Isfahan province, Iran. The environmental factors influencing the inside temperatures, such as outside air temperature, wind speed and outside solar radiation, were collected as data samples. In this research, 13 different training algorithms were used for the ANN models (MLP and RBF). The best model was selected based on K-fold cross validation and Randomized Complete Block (RCB) methodology. The results showed that the type of training algorithm and the kernel function are very important factors in the performance of the ANN (RBF and MLP) and SVM models, respectively. Comparing the RBF, MLP and SVM models showed that RBF predicts the Ta, Tp and Ts variables better, according to small values of the RMSE and MAPE indices and a large value of the R2 index. For the RBF model predicting Ta, Tp and Ts, RMSE ranged between 0.07 and 0.12 °C and MAPE between 0.28 and 0.50%. Analysis of the generalizability and stability of the RBF model with 5-fold cross validation showed that this method can be used with small data groups. The performance of the best model (RBF) in estimating the energy loss and exchange in the greenhouse with heat transfer models showed that this method can estimate the real data in the greenhouse and then predict the energy loss and exchange with high accuracy. Keywords: Black box method, Energy lost, Environmental situation, Energy
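
    The models in this record are ranked by RMSE, MAPE and R2. A minimal Python sketch of the standard definitions of these three indices is given here for orientation; the function names and the sample temperatures are hypothetical and are not values from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the same unit as the data (e.g. degrees C)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mape(observed, predicted):
    """Mean absolute percentage error, in percent; observed values must be nonzero."""
    n = len(observed)
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / n


def r_squared(observed, predicted):
    """Coefficient of determination: 1 minus residual over total sum of squares."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical inside-air temperatures (degrees C), for illustration only.
measured = [21.0, 22.5, 24.0, 23.0]
estimated = [21.1, 22.4, 24.1, 22.9]
scores = (rmse(measured, estimated), mape(measured, estimated), r_squared(measured, estimated))
```

    Lower RMSE and MAPE and an R2 closer to 1 indicate a better fit, which matches how the abstract ranks the RBF model.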

  13. Spatial extreme learning machines: An application on prediction of disease counts.

    Science.gov (United States)

    Prates, Marcos O

    2018-01-01

    Extreme learning machines have gained a lot of attention from the machine learning community because of their interesting properties and computational advantages. With the increase in collection of information nowadays, many sources of data have missing information, making statistical analysis harder or unfeasible. In this paper, we present a new model, coined spatial extreme learning machine, that combines spatial modeling with extreme learning machines, keeping the nice properties of both methodologies and making it very flexible and robust. As explained throughout the text, the spatial extreme learning machines have many advantages in comparison with the traditional extreme learning machines. By a simulation study and a real data analysis we present how the spatial extreme learning machine can be used to improve imputation of missing data and uncertainty prediction estimation.

  14. Application of machine-learning methods to solid-state chemistry: ferromagnetism in transition metal alloys

    International Nuclear Information System (INIS)

    Landrum, Gregory A.; Genin, Hugh

    2003-01-01

    Machine-learning methods are a collection of techniques for building predictive models from experimental data. The algorithms are problem-independent: the chemistry and physics of the problem being studied are contained in the descriptors used to represent the known data. The application of a variety of machine-learning methods to the prediction of ferromagnetism in ordered and disordered transition metal alloys is presented. Applying a decision tree algorithm to build a predictive model for ordered phases results in a model that is 100% accurate. The same algorithm achieves 99% accuracy when trained on a data set containing both ordered and disordered phases. Details of the descriptor sets for both applications are also presented

  15. Machine-Learning Research

    OpenAIRE

    Dietterich, Thomas G.

    1997-01-01

    Machine-learning research has been making great progress in many directions. This article summarizes four of these directions and discusses some current open problems. The four directions are (1) the improvement of classification accuracy by learning ensembles of classifiers, (2) methods for scaling up supervised learning algorithms, (3) reinforcement learning, and (4) the learning of complex stochastic models.

  16. Model-based machine learning.

    Science.gov (United States)

    Bishop, Christopher M

    2013-02-13

    Several decades of research in the field of machine learning have resulted in a multitude of different algorithms for solving a broad range of problems. To tackle a new application, a researcher typically tries to map their problem onto one of these existing methods, often influenced by their familiarity with specific algorithms and by the availability of corresponding software implementations. In this study, we describe an alternative methodology for applying machine learning, in which a bespoke solution is formulated for each new application. The solution is expressed through a compact modelling language, and the corresponding custom machine learning code is then generated automatically. This model-based approach offers several major advantages, including the opportunity to create highly tailored models for specific scenarios, as well as rapid prototyping and comparison of a range of alternative models. Furthermore, newcomers to the field of machine learning do not have to learn about the huge range of traditional methods, but instead can focus their attention on understanding a single modelling environment. In this study, we show how probabilistic graphical models, coupled with efficient inference algorithms, provide a very flexible foundation for model-based machine learning, and we outline a large-scale commercial application of this framework involving tens of millions of users. We also describe the concept of probabilistic programming as a powerful software environment for model-based machine learning, and we discuss a specific probabilistic programming language called Infer.NET, which has been widely used in practical applications.

  17. Benchmarking Data Analysis and Machine Learning Applications on the Intel KNL Many-Core Processor

    OpenAIRE

    Byun, Chansup; Kepner, Jeremy; Arcand, William; Bestor, David; Bergeron, Bill; Gadepally, Vijay; Houle, Michael; Hubbell, Matthew; Jones, Michael; Klein, Anna; Michaleas, Peter; Milechin, Lauren; Mullen, Julie; Prout, Andrew; Rosa, Antonio

    2017-01-01

    Knights Landing (KNL) is the code name for the second-generation Intel Xeon Phi product family. KNL has generated significant interest in the data analysis and machine learning communities because its new many-core architecture targets both of these workloads. The KNL many-core vector processor design enables it to exploit much higher levels of parallelism. At the Lincoln Laboratory Supercomputing Center (LLSC), the majority of users are running data analysis applications such as MATLAB and O...

  18. Machine Learning from Garden Path Sentences: The Application of Computational Linguistics

    Directory of Open Access Journals (Sweden)

    Jiali Du

    2014-12-01

    This paper discusses the application of computational linguistics in the machine learning (ML) system for the processing of garden path sentences. ML is closely related to artificial intelligence and linguistic cognition. The rapid and efficient processing of complex structures is an effective method to test the system. By parsing garden path sentences, we conclude that the integration of theoretical and statistical methods is helpful for the development of ML systems.

  19. Clustering and Candidate Motif Detection in Exosomal miRNAs by Application of Machine Learning Algorithms.

    Science.gov (United States)

    Gaur, Pallavi; Chaturvedi, Anoop

    2017-07-22

    Clustering patterns and motifs carry a great deal of information about biological data. This paper describes an application of machine learning algorithms for clustering and candidate motif detection in miRNAs derived from exosomes. Recent progress in exosome research, particularly regarding exosomal miRNAs, has given rise to much bioinformatics-based research. Information on clustering patterns and candidate motifs in miRNAs of exosomal origin would help in analyzing existing as well as newly discovered miRNAs within exosomes. Along with obtaining clustering patterns and candidate motifs in exosomal miRNAs, this work also elaborates on the usefulness of machine learning algorithms that can be efficiently executed on various programming languages/platforms. Data were clustered and candidate sequence motifs were detected successfully. The results were compared and validated with available web tools such as 'BLASTN' and 'MEME suite'. The machine learning algorithms for the aforementioned objectives were applied successfully. This work demonstrates the utility of machine learning algorithms and language platforms for clustering and candidate motif detection in exosomal miRNAs. With this information, deeper insight can be gained for analyses of newly discovered miRNAs in exosomes, which are considered to be circulating biomarkers. In addition, the execution of machine learning algorithms on various language platforms gives users more flexibility to try multiple iterations according to their requirements. This approach can be applied to other biological data-mining tasks as well.

  20. Machine learning with R

    CERN Document Server

    Lantz, Brett

    2015-01-01

    Perhaps you already know a bit about machine learning but have never used R, or perhaps you know a little R but are new to machine learning. In either case, this book will get you up and running quickly. It would be helpful to have a bit of familiarity with basic programming concepts, but no prior experience is required.

  1. A Novel Application of Machine Learning Methods to Model Microcontroller Upset Due to Intentional Electromagnetic Interference

    Science.gov (United States)

    Bilalic, Rusmir

    A novel application of support vector machines (SVMs), artificial neural networks (ANNs), and Gaussian processes (GPs) for machine learning (GPML) to model microcontroller unit (MCU) upset due to intentional electromagnetic interference (IEMI) is presented. In this approach, an MCU performs a counting operation (0-7) while electromagnetic interference in the form of a radio frequency (RF) pulse is direct-injected into the MCU clock line. Injection times with respect to the clock signal are the clock low, clock rising edge, clock high, and the clock falling edge periods in the clock window during which the MCU is performing initialization and executing the counting procedure. The intent is to cause disruption in the counting operation and model the probability of effect (PoE) using machine learning tools. Five experiments were executed as part of this research, each of which contained a set of 38,300 training points and 38,300 test points, for a total of 383,000 points, with the following experiment variables: injection times with respect to the clock signal, injected RF power, injected RF pulse width, and injected RF frequency. For the 191,500 training points, the average training error was 12.47%, while for the 191,500 test points the average test error was 14.85%, meaning that on average, the machine was able to predict MCU upset with 85.15% accuracy. Leaving out the results for the worst-performing model (SVM with a linear kernel), the test prediction accuracy for the remaining machines is almost 89%. All three machine learning methods (ANNs, SVMs, and GPML) showed excellent and consistent results in their ability to model and predict the PoE on an MCU due to IEMI. The GP approach performed best during training with a 7.43% average training error, while the ANN technique was most accurate during the test with a 10.80% error.
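
    The point counts and the accuracy figure quoted in this abstract follow from simple arithmetic. The short Python check below reproduces them from the numbers stated in the text; the variable names are illustrative only.

```python
# Figures taken from the abstract; variable names are illustrative.
n_experiments = 5
train_points_per_experiment = 38_300
test_points_per_experiment = 38_300

total_train_points = n_experiments * train_points_per_experiment
total_test_points = n_experiments * test_points_per_experiment
total_points = total_train_points + total_test_points

# Accuracy is the complement of the average test error, in percent.
average_test_error_pct = 14.85
average_test_accuracy_pct = round(100.0 - average_test_error_pct, 2)

print(total_train_points, total_test_points, total_points, average_test_accuracy_pct)
```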

  2. Machine Learning for Hackers

    CERN Document Server

    Conway, Drew

    2012-01-01

    If you're an experienced programmer interested in crunching data, this book will get you started with machine learning, a toolkit of algorithms that enables computers to train themselves to automate useful tasks. Authors Drew Conway and John Myles White help you understand machine learning and statistics tools through a series of hands-on case studies, instead of a traditional math-heavy presentation. Each chapter focuses on a specific problem in machine learning, such as classification, prediction, optimization, and recommendation. Using the R programming language, you'll learn how to analyz

  3. Fast learning method for convolutional neural networks using extreme learning machine and its application to lane detection.

    Science.gov (United States)

    Kim, Jihun; Kim, Jonghong; Jang, Gil-Jin; Lee, Minho

    2017-03-01

    Deep learning has received significant attention recently as a promising solution to many problems in the area of artificial intelligence. Among several deep learning architectures, convolutional neural networks (CNNs) demonstrate superior performance when compared to other machine learning methods in the applications of object detection and recognition. We use a CNN for image enhancement and the detection of driving lanes on motorways. In general, the process of lane detection consists of edge extraction and line detection. A CNN can be used to enhance the input images before lane detection by excluding noise and obstacles that are irrelevant to the edge detection result. However, training conventional CNNs requires considerable computation and a big dataset. Therefore, we suggest a new learning algorithm for CNNs using an extreme learning machine (ELM). The ELM is a fast learning method used to calculate network weights between output and hidden layers in a single iteration and thus, can dramatically reduce learning time while producing accurate results with minimal training data. A conventional ELM can be applied to networks with a single hidden layer; as such, we propose a stacked ELM architecture in the CNN framework. Further, we modify the backpropagation algorithm to find the targets of hidden layers and effectively learn network weights while maintaining performance. Experimental results confirm that the proposed method is effective in reducing learning time and improving performance. Copyright © 2016 Elsevier Ltd. All rights reserved.
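
    The abstract describes the ELM as computing the weights between the hidden and output layers in a single step. The Python sketch below illustrates that idea for a basic single-hidden-layer ELM only; it is not the authors' stacked ELM/CNN method. The function names, the sigmoid activation, the weight range and the small ridge term (added for numerical stability) are all assumptions made for this example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def hidden_layer(X, W, b):
    """Hidden activations: one row per sample, one column per hidden unit."""
    return [[sigmoid(sum(xi * wi for xi, wi in zip(x, w_row)) + b_j)
             for w_row, b_j in zip(W, b)] for x in X]


def solve(A, y):
    """Solve A v = y by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [y_i] for row, y_i in zip(A, y)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    v = [0.0] * n
    for r in range(n - 1, -1, -1):
        v[r] = (M[r][n] - sum(M[r][k] * v[k] for k in range(r + 1, n))) / M[r][r]
    return v


def train_elm(X, t, n_hidden=10, ridge=1e-8, seed=0):
    """Random, fixed input weights; output weights from one linear solve."""
    rng = random.Random(seed)
    d = len(X[0])
    W = [[rng.uniform(-2.0, 2.0) for _ in range(d)] for _ in range(n_hidden)]
    b = [rng.uniform(-2.0, 2.0) for _ in range(n_hidden)]
    H = hidden_layer(X, W, b)
    # Regularized normal equations: (H^T H + ridge * I) beta = H^T t
    A = [[sum(h[i] * h[j] for h in H) + (ridge if i == j else 0.0)
          for j in range(n_hidden)] for i in range(n_hidden)]
    rhs = [sum(h[i] * t_n for h, t_n in zip(H, t)) for i in range(n_hidden)]
    beta = solve(A, rhs)
    return W, b, beta


def predict(X, model):
    W, b, beta = model
    return [sum(h_j * beta_j for h_j, beta_j in zip(h, beta))
            for h in hidden_layer(X, W, b)]


# Toy XOR data, used only to exercise the sketch.
X_train = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
t_train = [0.0, 1.0, 1.0, 0.0]
model = train_elm(X_train, t_train)
outputs = predict(X_train, model)
```

    No iterative weight updates occur: the hidden weights stay at their random initial values and only the linear output layer is fitted, which is the source of the short training time the abstract refers to.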

  4. Machine Learning Applications to Resting-State Functional MR Imaging Analysis.

    Science.gov (United States)

    Billings, John M; Eder, Maxwell; Flood, William C; Dhami, Devendra Singh; Natarajan, Sriraam; Whitlow, Christopher T

    2017-11-01

    Machine learning is one of the most exciting and rapidly expanding fields within computer science. Academic and commercial research entities are investing in machine learning methods, especially in personalized medicine via patient-level classification. There is great promise that machine learning methods combined with resting state functional MR imaging will aid in diagnosis of disease and guide potential treatment for conditions thought to be impossible to identify based on imaging alone, such as psychiatric disorders. We discuss machine learning methods and explore recent advances. Copyright © 2017 Elsevier Inc. All rights reserved.

  5. Emerging Paradigms in Machine Learning

    CERN Document Server

    Jain, Lakhmi; Howlett, Robert

    2013-01-01

    This book presents fundamental topics and algorithms that form the core of machine learning (ML) research, as well as emerging paradigms in intelligent system design. The multidisciplinary nature of machine learning makes it a very fascinating and popular area for research. The book is aimed at students, practitioners and researchers and captures the diversity and richness of the field of machine learning and intelligent systems. Several chapters are devoted to computational learning models such as granular computing, rough sets and fuzzy sets. An account of applications of well-known learning methods in biometrics, computational stylistics, multi-agent systems, and spam classification, including an extremely well-written survey on Bayesian networks, sheds light on the strengths and weaknesses of the methods. Practical studies yielding insight into challenging problems such as learning from incomplete and imbalanced data, pattern recognition of stochastic episodic events and on-line mining of non-stationary ...

  6. Learning scikit-learn machine learning in Python

    CERN Document Server

    Garreta, Raúl

    2013-01-01

    The book adopts a tutorial-based approach to introduce the user to Scikit-learn. If you are a programmer who wants to explore machine learning and data-based methods to build intelligent applications and enhance your programming skills, this is the book for you. No previous experience with machine-learning algorithms is required.

  7. Creativity in Machine Learning

    OpenAIRE

    Thoma, Martin

    2016-01-01

    Recent machine learning techniques can be modified to produce creative results. Those results did not exist before; they are not a trivial combination of the data that was fed into the machine learning system. The obtained results come in multiple forms: as images, as text and as audio. This paper gives a high-level overview of how they are created and gives some examples. It is meant to be a summary of the current work and to give people who are new to machine learning some starting points.

  8. Learning with Support Vector Machines

    CERN Document Server

    Campbell, Colin

    2010-01-01

    Support Vector Machines have become a well-established tool within machine learning. They work well in practice and have now been used across a wide range of applications, from recognizing hand-written digits to face identification, text categorisation, bioinformatics, and database marketing. In this book we give an introductory overview of this subject. We start with a simple Support Vector Machine for performing binary classification before considering multi-class classification and learning in the presence of noise. We show that this framework can be extended to many other scenarios such a

  9. Quantum Machine Learning

    OpenAIRE

    Romero García, Cristian

    2017-01-01

    [EN] In a world in which accessible information grows exponentially, the selection of the appropriate information turns out to be an extremely relevant problem. In this context, the idea of Machine Learning (ML), a subfield of Artificial Intelligence, emerged to face problems in data mining, pattern recognition, automatic prediction, among others. Quantum Machine Learning is an interdisciplinary research area combining quantum mechanics with methods of ML, in which quantum properties allow fo...

  10. Improved Extreme Learning Machine and Its Application in Image Quality Assessment

    OpenAIRE

    Mao, Li; Zhang, Lidong; Liu, Xingyang; Li, Chaofeng; Yang, Hong

    2014-01-01

    Extreme learning machine (ELM) is a new class of single-hidden layer feedforward neural network (SLFN), which is simple in theory and fast in implementation. Zong et al. propose a weighted extreme learning machine for learning data with imbalanced class distribution, which maintains the advantages from original ELM. However, the current reported ELM and its improved version are only based on the empirical risk minimization principle, which may suffer from overfitting. To solve the overfitting...

  11. A Digital Liquid State Machine With Biologically Inspired Learning and Its Application to Speech Recognition.

    Science.gov (United States)

    Zhang, Yong; Li, Peng; Jin, Yingyezhe; Choe, Yoonsuck

    2015-11-01

    This paper presents a bioinspired digital liquid-state machine (LSM) for low-power very-large-scale-integration (VLSI)-based machine learning applications. To the best of the authors' knowledge, this is the first work that employs a bioinspired spike-based learning algorithm for the LSM. With the proposed online learning, the LSM extracts information from input patterns on the fly without needing intermediate data storage as required in offline learning methods such as ridge regression. The proposed learning rule is local such that each synaptic weight update is based only upon the firing activities of the corresponding presynaptic and postsynaptic neurons without incurring global communications across the neural network. Compared with the backpropagation-based learning, the locality of computation in the proposed approach lends itself to efficient parallel VLSI implementation. We use subsets of the TI46 speech corpus to benchmark the bioinspired digital LSM. To reduce the complexity of the spiking neural network model without performance degradation for speech recognition, we study the impacts of synaptic models on the fading memory of the reservoir and hence the network performance. Moreover, we examine the tradeoffs between synaptic weight resolution, reservoir size, and recognition performance and present techniques to further reduce the overhead of hardware implementation. Our simulation results show that in terms of isolated word recognition evaluated using the TI46 speech corpus, the proposed digital LSM rivals the state-of-the-art hidden Markov-model-based recognizer Sphinx-4 and outperforms all other reported recognizers including the ones that are based upon the LSM or neural networks.

  12. Machine learning topological states

    Science.gov (United States)

    Deng, Dong-Ling; Li, Xiaopeng; Das Sarma, S.

    2017-11-01

    Artificial neural networks and machine learning have now reached a new era after several decades of improvement where applications are to explode in many fields of science, industry, and technology. Here, we use artificial neural networks to study an intriguing phenomenon in quantum physics—the topological phases of matter. We find that certain topological states, either symmetry-protected or with intrinsic topological order, can be represented with classical artificial neural networks. This is demonstrated by using three concrete spin systems, the one-dimensional (1D) symmetry-protected topological cluster state and the 2D and 3D toric code states with intrinsic topological orders. For all three cases, we show rigorously that the topological ground states can be represented by short-range neural networks in an exact and efficient fashion—the required number of hidden neurons is as small as the number of physical spins and the number of parameters scales only linearly with the system size. For the 2D toric-code model, we find that the proposed short-range neural networks can describe the excited states with Abelian anyons and their nontrivial mutual statistics as well. In addition, by using reinforcement learning we show that neural networks are capable of finding the topological ground states of nonintegrable Hamiltonians with strong interactions and studying their topological phase transitions. Our results demonstrate explicitly the exceptional power of neural networks in describing topological quantum states, and at the same time provide valuable guidance to machine learning of topological phases in generic lattice models.

  13. SMARTbot: A Behavioral Analysis Framework Augmented with Machine Learning to Identify Mobile Botnet Applications.

    Directory of Open Access Journals (Sweden)

    Ahmad Karim

    The botnet phenomenon in smartphones is evolving with the proliferation of mobile phone technologies, after leaving a significant impact on personal computers. A botnet is a network of computers, laptops, mobile devices or tablets that is remotely controlled by cybercriminals to initiate various distributed coordinated attacks, including spam emails, ad-click fraud, Bitcoin mining, Distributed Denial of Service (DDoS), dissemination of other malware and much more. Like traditional PC-based botnets, mobile botnets have the same operational impact, except that the target audience is specific to smartphone users. Therefore, it is important to uncover this security issue before it becomes widespread. We propose SMARTbot, a novel dynamic analysis framework augmented with machine learning techniques to automatically detect botnet binaries from a malicious corpus. SMARTbot is a component-based off-device behavioral analysis framework which can generate a mobile botnet learning model by inducing Artificial Neural Networks' back-propagation method. Moreover, this framework can detect mobile botnet binaries with remarkable accuracy even in the case of obfuscated program code. The results show that a classifier model based on simple logistic regression outperforms other machine learning classifiers for botnet app detection, i.e., 99.49% accuracy is achieved. Further, from manual inspection of the botnet dataset we have extracted interesting trends in those applications. As an outcome of this research, a mobile botnet dataset is devised which will become the benchmark for future studies.

  14. Application of machine learning methodology for PET-based definition of lung cancer

    Science.gov (United States)

    Kerhet, A.; Small, C.; Quon, H.; Riauka, T.; Schrader, L.; Greiner, R.; Yee, D.; McEwan, A.; Roa, W.

    2010-01-01

    We applied a learning methodology framework to assist in the threshold-based segmentation of non-small-cell lung cancer (NSCLC) tumours in positron-emission tomography–computed tomography (PET–CT) imaging for use in radiotherapy planning. Gated and standard free-breathing studies of two patients were independently analysed (four studies in total). Each study had a PET–CT and a treatment-planning CT image. The reference gross tumour volume (GTV) was identified by two experienced radiation oncologists who also determined reference standardized uptake value (SUV) thresholds that most closely approximated the GTV contour on each slice. A set of uptake distribution-related attributes was calculated for each PET slice. A machine learning algorithm was trained on a subset of the PET slices to cope with slice-to-slice variation in the optimal SUV threshold: that is, to predict the most appropriate SUV threshold from the calculated attributes for each slice. The algorithm’s performance was evaluated using the remainder of the PET slices. A high degree of geometric similarity was achieved between the areas outlined by the predicted and the reference SUV thresholds (Jaccard index exceeding 0.82). No significant difference was found between the gated and the free-breathing results in the same patient. In this preliminary work, we demonstrated the potential applicability of a machine learning methodology as an auxiliary tool for radiation treatment planning in NSCLC. PMID:20179802
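
    The Jaccard index quoted in this abstract is the ratio of the overlap of two regions to their union. The following is a minimal sketch of that measure only, assuming each outlined area is represented as a set of pixel coordinates; the function names and the toy uptake values are hypothetical and are not taken from the study.

```python
def jaccard_index(region_a, region_b):
    """Return |A ∩ B| / |A ∪ B| for two sets of pixel coordinates."""
    a, b = set(region_a), set(region_b)
    union = a | b
    if not union:
        return 1.0  # assumption: two empty regions count as identical
    return len(a & b) / len(union)


def threshold_region(slice_values, threshold):
    """Pixels (row, col) whose value is at or above the threshold."""
    return {
        (r, c)
        for r, row in enumerate(slice_values)
        for c, value in enumerate(row)
        if value >= threshold
    }


# Toy 3x3 "slice" with made-up uptake values (illustration only).
toy_slice = [
    [0.5, 2.1, 2.4],
    [0.7, 3.0, 2.6],
    [0.4, 0.9, 1.0],
]
reference = threshold_region(toy_slice, 2.0)  # stands in for the reference threshold
predicted = threshold_region(toy_slice, 2.5)  # stands in for a predicted threshold
overlap = jaccard_index(predicted, reference)
```

    In this toy case the predicted region covers two of the four reference pixels, so the index is 0.5; a value above 0.82, as reported, indicates much closer agreement.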

  16. SMARTbot: A Behavioral Analysis Framework Augmented with Machine Learning to Identify Mobile Botnet Applications

    Science.gov (United States)

    Karim, Ahmad; Salleh, Rosli; Khan, Muhammad Khurram

    2016-01-01

    The botnet phenomenon in smartphones is evolving with the proliferation of mobile phone technologies, after leaving a significant impact on personal computers. A botnet is a network of computers, laptops, mobile devices or tablets that is remotely controlled by cybercriminals to initiate various distributed coordinated attacks, including spam emails, ad-click fraud, Bitcoin mining, Distributed Denial of Service (DDoS), dissemination of other malware and much more. Like traditional PC-based botnets, mobile botnets have the same operational impact, except that the target audience is specific to smartphone users. Therefore, it is important to uncover this security issue before it becomes widespread. We propose SMARTbot, a novel dynamic analysis framework augmented with machine learning techniques to automatically detect botnet binaries from a malicious corpus. SMARTbot is a component-based off-device behavioral analysis framework which can generate a mobile botnet learning model by inducing Artificial Neural Networks’ back-propagation method. Moreover, this framework can detect mobile botnet binaries with remarkable accuracy even in the case of obfuscated program code. The results show that a classifier model based on simple logistic regression outperforms other machine learning classifiers for botnet app detection, i.e., 99.49% accuracy is achieved. Further, from manual inspection of the botnet dataset we have extracted interesting trends in those applications. As an outcome of this research, a mobile botnet dataset is devised which will become the benchmark for future studies. PMID:26978523

  17. Predictive Power of Machine Learning for Optimizing Solar Water Heater Performance: The Potential Application of High-Throughput Screening

    Directory of Open Access Journals (Sweden)

    Hao Li

    2017-01-01

    Predicting the performance of a solar water heater (SWH) is challenging due to the complexity of the system. Fortunately, knowledge-based machine learning can provide a fast and precise prediction method for SWH performance. With the predictive power of machine learning models, we can further solve a more challenging question: how to cost-effectively design a high-performance SWH? Here, we summarize our recent studies and propose a general framework of SWH design using a machine learning-based high-throughput screening (HTS) method. Design of a water-in-glass evacuated tube solar water heater (WGET-SWH) is selected as a case study to show the potential application of machine learning-based HTS to the design and optimization of solar energy systems.

  18. Machine learning systems

    Energy Technology Data Exchange (ETDEWEB)

    Forsyth, R

    1984-05-01

    With the dramatic rise of expert systems has come a renewed interest in the fuel that drives them: knowledge. For it is specialist knowledge which gives expert systems their power. But extracting knowledge from human experts in symbolic form has proved arduous and labour-intensive. So the idea of machine learning is enjoying a renaissance. Machine learning is any automatic improvement in the performance of a computer system over time, as a result of experience. Thus a learning algorithm seeks to do one or more of the following: cover a wider range of problems, deliver more accurate solutions, obtain answers more cheaply, and simplify codified knowledge. 6 references.

  19. Application of heuristic and machine-learning approach to engine model calibration

    Science.gov (United States)

    Cheng, Jie; Ryu, Kwang R.; Newman, C. E.; Davis, George C.

    1993-03-01

    Automation of engine model calibration procedures is a very challenging task because (1) the calibration process searches for a goal state in a huge, continuous state space, (2) calibration is often a lengthy and frustrating task because of complicated mutual interference among the target parameters, and (3) the calibration problem is heuristic by nature, and often heuristic knowledge for constraining a search cannot be easily acquired from domain experts. A combined heuristic and machine learning approach has, therefore, been adopted to improve the efficiency of model calibration. We developed an intelligent calibration program called ICALIB. It has been used on a daily basis for engine model applications, and has reduced the time required for model calibrations from many hours to a few minutes on average. In this paper, we describe the heuristic control strategies employed in ICALIB such as a hill-climbing search based on a state distance estimation function, incremental problem solution refinement by using a dynamic tolerance window, and calibration target parameter ordering for guiding the search. In addition, we present the application of a machine learning program called GID3* for automatic acquisition of heuristic rules for ordering target parameters.
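
    The combination of hill-climbing on a distance function with a tolerance window that is tightened step by step can be illustrated roughly as follows. This is a loose sketch under stated assumptions, not ICALIB itself: the coordinate-wise moves, the step-halving rule, the window values and all function names are hypothetical, and the target-parameter ordering heuristic mentioned in the abstract is not modelled.

```python
import math


def hill_climb(distance, start, tolerance, step=1.0, min_step=1e-6):
    """Greedy search: accept any single-parameter move that lowers the distance."""
    state = list(start)
    best = distance(state)
    while best > tolerance and step >= min_step:
        improved = False
        for i in range(len(state)):
            for delta in (step, -step):
                candidate = state.copy()
                candidate[i] += delta
                d = distance(candidate)
                if d < best:
                    state, best, improved = candidate, d, True
        if not improved:
            step /= 2  # no neighbour is better: search on a finer grid
    return state, best


def calibrate(distance, start, windows=(1.0, 0.1, 0.01)):
    """Refine the solution incrementally by tightening the tolerance window."""
    state = list(start)
    history = []
    for tolerance in windows:
        state, best = hill_climb(distance, state, tolerance)
        history.append(best)
    return state, history


# Toy problem: the "goal state" is a made-up point in a 2-D parameter space.
target = (3.2, -1.1)
state, history = calibrate(lambda p: math.dist(p, target), [0.0, 0.0])
```

    Each pass starts from the previous pass's solution, so the coarse early windows do the bulk of the movement and the later ones only polish the result.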

  20. Machine learning in genetics and genomics

    Science.gov (United States)

    Libbrecht, Maxwell W.; Noble, William Stafford

    2016-01-01

    The field of machine learning promises to enable computers to assist humans in making sense of large, complex data sets. In this review, we outline some of the main applications of machine learning to genetic and genomic data. In the process, we identify some recurrent challenges associated with this type of analysis and provide general guidelines to assist in the practical application of machine learning to real genetic and genomic data. PMID:25948244

  1. Improved Extreme Learning Machine and Its Application in Image Quality Assessment

    Directory of Open Access Journals (Sweden)

    Li Mao

    2014-01-01

    Extreme learning machine (ELM) is a new class of single-hidden layer feedforward neural network (SLFN), which is simple in theory and fast in implementation. Zong et al. propose a weighted extreme learning machine for learning data with imbalanced class distribution, which maintains the advantages of the original ELM. However, the currently reported ELM and its improved version are based only on the empirical risk minimization principle, which may suffer from overfitting. To solve the overfitting problem, in this paper we incorporate the structural risk minimization principle into the (weighted) ELM, and propose a modified (weighted) extreme learning machine (M-ELM and M-WELM). Experimental results show that our proposed M-WELM outperforms the currently reported extreme learning machine algorithm in image quality assessment.
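
    The abstract does not give the exact M-ELM objective. As a hedged illustration of what adding a structural-risk term to ELM commonly looks like, the widely used regularized form is shown below; the symbols H (hidden-layer output matrix), T (targets), C (trade-off constant) and W (diagonal per-sample weights) are notation assumed here, not taken from the paper.

```latex
% Original ELM: output weights from empirical risk alone (least squares).
\hat{\beta}_{\mathrm{ERM}} = H^{\dagger} T

% Adding a norm penalty (structural risk) with trade-off constant C > 0:
\min_{\beta}\; \tfrac{1}{2}\lVert \beta \rVert^{2} + \tfrac{C}{2}\lVert H\beta - T \rVert^{2}
\quad\Longrightarrow\quad
\hat{\beta} = \Bigl(\tfrac{1}{C} I + H^{\top} H\Bigr)^{-1} H^{\top} T

% Weighted variant for imbalanced classes, with diagonal weight matrix W:
\hat{\beta}_{W} = \Bigl(\tfrac{1}{C} I + H^{\top} W H\Bigr)^{-1} H^{\top} W T
```

    Setting the gradient of the penalized objective to zero gives the closed form directly; a smaller C shrinks the output weights more strongly, which is the mechanism that counters overfitting.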

  2. Gaussian processes for machine learning.

    Science.gov (United States)

    Seeger, Matthias

    2004-04-01

    Gaussian processes (GPs) are natural generalisations of multivariate Gaussian random variables to infinite (countable or continuous) index sets. GPs have been applied in a large number of fields to a diverse range of ends, and very many deep theoretical analyses of various properties are available. This paper gives an introduction to Gaussian processes on a fairly elementary level with special emphasis on characteristics relevant in machine learning. It draws explicit connections to branches such as spline smoothing models and support vector machines in which similar ideas have been investigated. Gaussian process models are routinely used to solve hard machine learning problems. They are attractive because of their flexible non-parametric nature and computational simplicity. Treated within a Bayesian framework, very powerful statistical methods can be implemented which offer valid estimates of uncertainties in our predictions and generic model selection procedures cast as nonlinear optimization problems. Their main drawback of heavy computational scaling has recently been alleviated by the introduction of generic sparse approximations [13, 78, 31]. The mathematical literature on GPs is large and often uses deep concepts which are not required to fully understand most machine learning applications. In this tutorial paper, we aim to present characteristics of GPs relevant to machine learning and to show precise connections to other "kernel machines" popular in the community. Our focus is on a simple presentation, but references to more detailed sources are provided.
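
    For orientation, the standard GP regression predictive equations are sketched below. The zero prior mean, the Gaussian noise model and the notation (kernel k, noise variance sigma squared, test input x with a star) are assumptions made for this illustration rather than details quoted from the paper.

```latex
\begin{align}
  f &\sim \mathcal{GP}\bigl(0,\, k(x, x')\bigr), &
  y_i &= f(x_i) + \varepsilon_i, \quad \varepsilon_i \sim \mathcal{N}(0, \sigma^2) \\
  K_{ij} &= k(x_i, x_j), &
  (k_*)_i &= k(x_i, x_*) \\
  \mu(x_*) &= k_*^{\top} \bigl(K + \sigma^2 I\bigr)^{-1} y \\
  \operatorname{var}(x_*) &= k(x_*, x_*) - k_*^{\top} \bigl(K + \sigma^2 I\bigr)^{-1} k_*
\end{align}
```

    The predictive variance is what supplies the uncertainty estimates the abstract mentions, and solving the n-by-n linear system costs on the order of n cubed operations, which is the heavy scaling that sparse approximations aim to reduce.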

  3. Mutual learning in a tree parity machine and its application to cryptography

    International Nuclear Information System (INIS)

    Rosen-Zvi, Michal; Klein, Einat; Kanter, Ido; Kinzel, Wolfgang

    2002-01-01

    Mutual learning of a pair of tree parity machines with continuous and discrete weight vectors is studied analytically. The analysis is based on a mapping procedure that maps the mutual learning in tree parity machines onto mutual learning in noisy perceptrons. The stationary solution of the mutual learning in the case of continuous tree parity machines depends on the learning rate, where a phase transition from partial to full synchronization is observed. In the discrete case the learning process is based on a finite increment and a fully synchronized state is achieved in a finite number of steps. The synchronization of discrete parity machines is introduced in order to construct an ephemeral key-exchange protocol. The dynamic learning of a third tree parity machine (an attacker) that tries to imitate one of the two machines while the two still update their weight vectors is also analyzed. In particular, the synchronization times of the naive attacker and the flipping attacker recently introduced in Ref. 9 are analyzed. All analytical results are found to be in good agreement with simulation results.
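
    Synchronization of two discrete tree parity machines can be simulated in a few lines. The sketch below assumes a Hebbian update rule, the convention sign(0) = -1, and small illustrative sizes (K hidden units, N inputs per unit, weights bounded by L); none of these specifics are stated in the abstract, and the function names are hypothetical.

```python
import random


def tpm_output(weights, inputs):
    """Return (sigma, tau): the hidden-unit signs and their product."""
    sigma = []
    for w_k, x_k in zip(weights, inputs):
        field = sum(w * x for w, x in zip(w_k, x_k))
        sigma.append(1 if field > 0 else -1)  # assumed convention: sign(0) = -1
    tau = 1
    for s in sigma:
        tau *= s
    return sigma, tau


def hebbian_update(weights, inputs, sigma, tau, bound):
    """Update only hidden units that agree with tau; clip to [-bound, bound]."""
    for k, (w_k, x_k) in enumerate(zip(weights, inputs)):
        if sigma[k] == tau:
            weights[k] = [max(-bound, min(bound, w + tau * x))
                          for w, x in zip(w_k, x_k)]


def synchronize(K=3, N=10, L=3, max_steps=100_000, seed=0):
    """Run mutual learning; return (steps taken, weights of A, weights of B)."""
    rng = random.Random(seed)

    def random_weights():
        return [[rng.randint(-L, L) for _ in range(N)] for _ in range(K)]

    w_a, w_b = random_weights(), random_weights()
    steps = 0
    while w_a != w_b and steps < max_steps:
        x = [[rng.choice((-1, 1)) for _ in range(N)] for _ in range(K)]
        sigma_a, tau_a = tpm_output(w_a, x)
        sigma_b, tau_b = tpm_output(w_b, x)
        if tau_a == tau_b:  # the two parties exchange only tau
            hebbian_update(w_a, x, sigma_a, tau_a, L)
            hebbian_update(w_b, x, sigma_b, tau_b, L)
        steps += 1
    return steps, w_a, w_b


steps, w_a, w_b = synchronize()
```

    Once the two weight sets are identical they receive identical updates and stay identical, which is why the shared weights can serve as key material while only the outputs are exchanged.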

  4. Application of Machine Learning Techniques for Amplitude and Phase Noise Characterization

    DEFF Research Database (Denmark)

    Zibar, Darko; de Carvalho, Luis Henrique Hecker; Piels, Molly

    2015-01-01

    In this paper, tools from the machine learning community, such as Bayesian filtering and expectation maximization parameter estimation, are presented and employed for laser amplitude and phase noise characterization. We show that phase noise estimation based on Bayesian filtering outperforms...

  5. Machine Learning in Medicine.

    Science.gov (United States)

    Deo, Rahul C

    2015-11-17

    Spurred by advances in processing power, memory, storage, and an unprecedented wealth of data, computers are being asked to tackle increasingly complex learning tasks, often with astonishing success. Computers have now mastered a popular variant of poker, learned the laws of physics from experimental data, and become experts in video games - tasks that would have been deemed impossible not too long ago. In parallel, the number of companies centered on applying complex data analysis to varying industries has exploded, and it is thus unsurprising that some analytic companies are turning attention to problems in health care. The purpose of this review is to explore what problems in medicine might benefit from such learning approaches and use examples from the literature to introduce basic concepts in machine learning. It is important to note that seemingly large enough medical data sets and adequate learning algorithms have been available for many decades, and yet, although there are thousands of papers applying machine learning algorithms to medical data, very few have contributed meaningfully to clinical care. This lack of impact stands in stark contrast to the enormous relevance of machine learning to many other industries. Thus, part of my effort will be to identify what obstacles there may be to changing the practice of medicine through statistical learning approaches, and discuss how these might be overcome. © 2015 American Heart Association, Inc.

  6. Machine Learning in Medicine

    Science.gov (United States)

    Deo, Rahul C.

    2015-01-01

    Spurred by advances in processing power, memory, storage, and an unprecedented wealth of data, computers are being asked to tackle increasingly complex learning tasks, often with astonishing success. Computers have now mastered a popular variant of poker, learned the laws of physics from experimental data, and become experts in video games – tasks which would have been deemed impossible not too long ago. In parallel, the number of companies centered on applying complex data analysis to varying industries has exploded, and it is thus unsurprising that some analytic companies are turning attention to problems in healthcare. The purpose of this review is to explore what problems in medicine might benefit from such learning approaches and use examples from the literature to introduce basic concepts in machine learning. It is important to note that seemingly large enough medical data sets and adequate learning algorithms have been available for many decades – and yet, although there are thousands of papers applying machine learning algorithms to medical data, very few have contributed meaningfully to clinical care. This lack of impact stands in stark contrast to the enormous relevance of machine learning to many other industries. Thus part of my effort will be to identify what obstacles there may be to changing the practice of medicine through statistical learning approaches, and discuss how these might be overcome. PMID:26572668

  7. Estimation of the applicability domain of kernel-based machine learning models for virtual screening

    Directory of Open Access Journals (Sweden)

    Fechner Nikolas

    2010-03-01

    Background: The virtual screening of large compound databases is an important application of structural-activity relationship models. Due to the high structural diversity of these data sets, it is impossible for machine learning based QSAR models, which rely on a specific training set, to give reliable results for all compounds. Thus, it is important to consider the subset of the chemical space in which the model is applicable. The approaches to this problem that have been published so far mostly use vectorial descriptor representations to define this domain of applicability of the model. Unfortunately, these cannot be extended easily to structured kernel-based machine learning models. For this reason, we propose three approaches to estimate the domain of applicability of a kernel-based QSAR model. Results: We evaluated three kernel-based applicability domain estimations using three different structured kernels on three virtual screening tasks. Each experiment consisted of the training of a kernel-based QSAR model using support vector regression and the ranking of a disjoint screening data set according to the predicted activity. For each prediction, the applicability of the model for the respective compound is quantitatively described using a score obtained by an applicability domain formulation. The suitability of the applicability domain estimation is evaluated by comparing the model performance on the subsets of the screening data sets obtained by different thresholds for the applicability scores. This comparison indicates that it is possible to separate the part of the chemspace, in which the model gives reliable predictions, from the part consisting of structures too dissimilar to the training set to apply the model successfully. A closer inspection reveals that the virtual screening performance of the model is considerably improved if half of the molecules, those with the lowest applicability scores, are omitted from the screening

  8. Estimation of the applicability domain of kernel-based machine learning models for virtual screening.

    Science.gov (United States)

    Fechner, Nikolas; Jahn, Andreas; Hinselmann, Georg; Zell, Andreas

    2010-03-11

    The virtual screening of large compound databases is an important application of structural-activity relationship models. Due to the high structural diversity of these data sets, it is impossible for machine learning based QSAR models, which rely on a specific training set, to give reliable results for all compounds. Thus, it is important to consider the subset of the chemical space in which the model is applicable. The approaches to this problem that have been published so far mostly use vectorial descriptor representations to define this domain of applicability of the model. Unfortunately, these cannot be extended easily to structured kernel-based machine learning models. For this reason, we propose three approaches to estimate the domain of applicability of a kernel-based QSAR model. We evaluated three kernel-based applicability domain estimations using three different structured kernels on three virtual screening tasks. Each experiment consisted of the training of a kernel-based QSAR model using support vector regression and the ranking of a disjoint screening data set according to the predicted activity. For each prediction, the applicability of the model for the respective compound is quantitatively described using a score obtained by an applicability domain formulation. The suitability of the applicability domain estimation is evaluated by comparing the model performance on the subsets of the screening data sets obtained by different thresholds for the applicability scores. This comparison indicates that it is possible to separate the part of the chemspace, in which the model gives reliable predictions, from the part consisting of structures too dissimilar to the training set to apply the model successfully. A closer inspection reveals that the virtual screening performance of the model is considerably improved if half of the molecules, those with the lowest applicability scores, are omitted from the screening. The proposed applicability domain formulations
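
    The filtering step described in this abstract, dropping the compounds with the lowest applicability scores before ranking by predicted activity, can be sketched as below. The tuple layout, the function name, the default fraction and the toy values are all assumptions for illustration; how the applicability score itself is computed is not shown.

```python
import math


def screen(compounds, keep_fraction=0.5):
    """Rank compounds by predicted activity after dropping low-applicability ones.

    compounds: list of (name, predicted_activity, applicability_score) tuples.
    """
    by_score = sorted(compounds, key=lambda c: c[2], reverse=True)
    n_keep = max(1, math.ceil(len(by_score) * keep_fraction))
    kept = by_score[:n_keep]  # compounds the model is most applicable to
    return sorted(kept, key=lambda c: c[1], reverse=True)


# Made-up screening set: (name, predicted activity, applicability score).
toy_compounds = [
    ("c1", 7.2, 0.9),
    ("c2", 8.5, 0.2),
    ("c3", 6.1, 0.8),
    ("c4", 9.0, 0.1),
]
ranking = [name for name, _, _ in screen(toy_compounds)]
```

    Here the two compounds with the highest predicted activity are discarded because their applicability scores are the lowest, so their predictions are the least trustworthy.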

  9. Enhanced Quality Control in Pharmaceutical Applications by Combining Raman Spectroscopy and Machine Learning Techniques

    Science.gov (United States)

    Martinez, J. C.; Guzmán-Sepúlveda, J. R.; Bolañoz Evia, G. R.; Córdova, T.; Guzmán-Cabrera, R.

    2018-06-01

    In this work, we applied machine learning techniques to Raman spectra for the characterization and classification of manufactured pharmaceutical products. Our measurements were taken with commercial equipment, for accurate assessment of variations with respect to one calibrated control sample. Unlike the typical use of Raman spectroscopy in pharmaceutical applications, in our approach the principal components of the Raman spectrum are used concurrently as attributes in machine learning algorithms. This permits an efficient comparison and classification of the spectra measured from the samples under study. This also allows for accurate quality control as all relevant spectral components are considered simultaneously. We demonstrate our approach with respect to the specific case of acetaminophen, which is one of the most widely used analgesics in the market. In the experiments, commercial samples from thirteen different laboratories were analyzed and compared against a control sample. The raw data were analyzed based on an arithmetic difference between the nominal active substance and the measured values in each commercial sample. The principal component analysis was applied to the data for quantitative verification (i.e., without considering the actual concentration of the active substance) of the difference in the calibrated sample. Our results show that by following this approach adulterations in pharmaceutical compositions can be clearly identified and accurately quantified.

  10. Clojure for machine learning

    CERN Document Server

    Wali, Akhil

    2014-01-01

    A book that brings out the strengths of Clojure programming that facilitate machine learning. Each topic is described in substantial detail, and examples and libraries in Clojure are also demonstrated. This book is intended for Clojure developers who want to explore the area of machine learning. A basic understanding of the Clojure programming language is required, but thorough acquaintance with the standard Clojure library or any other libraries is not required. Familiarity with the theoretical concepts and notation of mathematics and statistics would be an added advantage.

  11. Mastering machine learning with scikit-learn

    CERN Document Server

    Hackeling, Gavin

    2014-01-01

    If you are a software developer who wants to learn how machine learning models work and how to apply them effectively, this book is for you. Familiarity with machine learning fundamentals and Python will be helpful, but is not essential.

  12. Efficient Machine Learning Approach for Optimizing Scientific Computing Applications on Emerging HPC Architectures

    Energy Technology Data Exchange (ETDEWEB)

    Arumugam, Kamesh [Old Dominion Univ., Norfolk, VA (United States)

    2017-05-01

    Efficient parallel implementation of scientific applications on multi-core CPUs with accelerators such as GPUs and Xeon Phis is challenging. It requires exploiting the data-parallel architecture of the accelerator along with the vector pipelines of modern x86 CPU architectures, load balancing, and efficient memory transfer between different devices. It is relatively easy to meet these requirements for highly structured scientific applications. In contrast, a number of scientific and engineering applications are unstructured. Getting performance on accelerators for these applications is extremely challenging because many of them employ irregular algorithms which exhibit data-dependent control-flow and irregular memory accesses. Furthermore, these applications are often iterative with dependencies between steps, making it hard to parallelize across steps. As a result, parallelism in these applications is often limited to a single step. Numerical simulation of charged-particle beam dynamics is one such application, where the distribution of work and the memory access pattern at each time step are irregular. Applications with these properties tend to present significant branch and memory divergence, load imbalance between different processor cores, and poor compute and memory utilization. Prior research on parallelizing such irregular applications has focused on optimizing the irregular, data-dependent memory accesses and control-flow during a single step of the application independent of the other steps, with the assumption that these patterns are completely unpredictable. We observed that the structure of computation leading to control-flow divergence and irregular memory accesses in one step is similar to that in the next step. It is possible to predict this structure in the current step by observing the computation structure of previous steps. In this dissertation, we present novel machine learning based optimization techniques to address

  13. Providing QoS through machine-learning-driven adaptive multimedia applications.

    Science.gov (United States)

    Ruiz, Pedro M; Botía, Juan A; Gómez-Skarmeta, Antonio

    2004-06-01

    We investigate the optimization of the quality of service (QoS) offered by real-time multimedia adaptive applications through machine learning algorithms. These applications are able to adapt their internal settings in real time (e.g., video sizes, audio and video codecs) to the unpredictably changing capacity of the network. Traditional adaptive applications just select a set of settings that consumes less than the available bandwidth. We propose a novel approach in which the selected set of settings is the one that offers the best user-perceived QoS among all those combinations which satisfy the bandwidth restrictions. We use a genetic algorithm to decide when to trigger the adaptation process depending on the network conditions (e.g., loss rate and jitter). Additionally, the selection of the new set of settings is done according to a set of rules which model the user-perceived QoS. These rules are learned using the SLIPPER rule induction algorithm over a set of examples extracted from scores provided by real users. We will demonstrate that the proposed approach guarantees a good user-perceived QoS even when the network conditions are constantly changing.
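
    The selection step described in this abstract can be read as a constrained maximization: among the combinations of settings that fit the available bandwidth, pick the one with the highest perceived-QoS score. The Python sketch below illustrates only that idea. The `Settings` fields, the codec names, the bandwidth figures and the score table are all hypothetical, and the score function merely stands in for the rules the authors learn with SLIPPER; it is not their implementation.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Settings:
    """One hypothetical combination of application settings."""
    video_size: str
    codec: str
    bandwidth_kbps: float  # bandwidth this combination consumes


def select_settings(
    candidates: Iterable[Settings],
    available_kbps: float,
    qos_score: Callable[[Settings], float],
) -> Optional[Settings]:
    """Return the feasible combination with the highest perceived-QoS score."""
    feasible = [s for s in candidates if s.bandwidth_kbps <= available_kbps]
    if not feasible:
        return None  # nothing fits in the available bandwidth
    return max(feasible, key=qos_score)


# Made-up scores standing in for a learned model of user-perceived QoS.
EXAMPLE_SCORES = {
    ("CIF", "H263"): 4.0,
    ("QCIF", "H263"): 3.0,
    ("QCIF", "MJPEG"): 2.0,
}


def example_score(s: Settings) -> float:
    return EXAMPLE_SCORES[(s.video_size, s.codec)]


CANDIDATES = [
    Settings("CIF", "H263", 300.0),
    Settings("QCIF", "H263", 120.0),
    Settings("QCIF", "MJPEG", 90.0),
]

best = select_settings(CANDIDATES, available_kbps=150.0, qos_score=example_score)
```

    A traditional adaptive application, as the abstract describes it, would stop at the feasibility filter; the difference here is the final `max` over a perceived-quality score.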

  14. Using Machine Learning to Predict Student Performance

    OpenAIRE

    Pojon, Murat

    2017-01-01

    This thesis examines the application of machine learning algorithms to predict whether a student will be successful or not. The specific focus of the thesis is the comparison of machine learning methods and feature engineering techniques in terms of how much they improve the prediction performance. Three different machine learning methods were used in this thesis. They are linear regression, decision trees, and naïve Bayes classification. Feature engineering, the process of modification ...

  15. Massively collaborative machine learning

    NARCIS (Netherlands)

    Rijn, van J.N.

    2016-01-01

    Many scientists are focussed on building models. We process nearly all information we perceive into a model. There are many techniques that enable computers to build models as well. The field of research that develops such techniques is called Machine Learning. Much research is devoted to develop

  16. Trends in Machine Learning for Signal Processing

    DEFF Research Database (Denmark)

    Adali, Tulay; Miller, David J.; Diamantaras, Konstantinos I.

    2011-01-01

    By putting the accent on learning from the data and the environment, the Machine Learning for SP (MLSP) Technical Committee (TC) provides the essential bridge between the machine learning and SP communities. While the emphasis in MLSP is on learning and data-driven approaches, SP defines the main...... applications of interest, and thus the constraints and requirements on solutions, which include computational efficiency, online adaptation, and learning with limited supervision/reference data....

  17. An application of machine learning to the organization of institutional software repositories

    Science.gov (United States)

    Bailin, Sidney; Henderson, Scott; Truszkowski, Walt

    1993-01-01

    Software reuse has become a major goal in the development of space systems, as a recent NASA-wide workshop on the subject made clear. The Data Systems Technology Division of Goddard Space Flight Center has been working on tools and techniques for promoting reuse, in particular in the development of satellite ground support software. One of these tools is the Experiment in Libraries via Incremental Schemata and Cobweb (ElvisC). ElvisC applies machine learning to the problem of organizing a reusable software component library for efficient and reliable retrieval. In this paper we describe the background factors that have motivated this work, present the design of the system, and evaluate the results of its application.

  18. Can machine learning explain human learning?

    NARCIS (Netherlands)

    Vahdat, M.; Oneto, L.; Anguita, D.; Funk, M.; Rauterberg, G.W.M.

    2016-01-01

    Learning Analytics (LA) has a major interest in exploring and understanding the learning process of humans and, for this purpose, benefits from both Cognitive Science, which studies how humans learn, and Machine Learning, which studies how algorithms learn from data. Usually, Machine Learning is

  19. International Conference on Extreme Learning Machine 2015

    CERN Document Server

    Mao, Kezhi; Wu, Jonathan; Lendasse, Amaury; ELM 2015; Theory, Algorithms and Applications (I); Theory, Algorithms and Applications (II)

    2016-01-01

    This book contains some selected papers from the International Conference on Extreme Learning Machine 2015, which was held in Hangzhou, China, December 15-17, 2015. This conference brought together researchers and engineers to share and exchange R&D experience on both theoretical studies and practical applications of the Extreme Learning Machine (ELM) technique and brain learning. This book covers theories, algorithms and applications of ELM. It gives readers a glimpse of the most recent advances of ELM.

  20. Machine Learning-based Virtual Screening and Its Applications to Alzheimer's Drug Discovery: A Review.

    Science.gov (United States)

    Carpenter, Kristy A; Huang, Xudong

    2018-06-07

    Virtual Screening (VS) has emerged as an important tool in the drug development process, as it conducts efficient in silico searches over millions of compounds, ultimately increasing yields of potential drug leads. As a subset of Artificial Intelligence (AI), Machine Learning (ML) is a powerful way of conducting VS for drug leads. ML for VS generally involves assembling a filtered training set of compounds, comprised of known actives and inactives. After training the model, it is validated and, if sufficiently accurate, used on previously unseen databases to screen for novel compounds with desired drug target binding activity. The study aims to review ML-based methods used for VS and applications to Alzheimer's disease (AD) drug discovery. To update the current knowledge on ML for VS, we review thorough backgrounds, explanations, and VS applications of the following ML techniques: Naïve Bayes (NB), k-Nearest Neighbors (kNN), Support Vector Machines (SVM), Random Forests (RF), and Artificial Neural Networks (ANN). All techniques have found success in VS, but the future of VS is likely to lean more heavily toward the use of neural networks - and more specifically, Convolutional Neural Networks (CNN), which are a subset of ANN that utilize convolution. We additionally conceptualize a workflow for conducting ML-based VS for potential therapeutics for AD, a complex neurodegenerative disease with no known cure or prevention. This serves both as an example of how to apply the concepts introduced earlier in the review and as a potential workflow for future implementation. Different ML techniques are powerful tools for VS, although each has advantages and disadvantages. ML-based VS can be applied to AD drug development. Copyright © Bentham Science Publishers.

  1. Machine learning: a probabilistic perspective

    CERN Document Server

    Murphy, Kevin P

    2012-01-01

    Today's Web-enabled deluge of electronic data calls for automated methods of data analysis. Machine learning provides these, developing methods that can automatically detect patterns in data and then use the uncovered patterns to predict future data. This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach. The coverage combines breadth and depth, offering necessary background material on such topics as probability, optimization, and linear algebra as well as discussion of recent developments in the field, including conditional random fields, L1 regularization, and deep learning. The book is written in an informal, accessible style, complete with pseudo-code for the most important algorithms. All topics are copiously illustrated with color images and worked examples drawn from such application domains as biology, text processing, computer vision, and robotics. Rather than providing a cookbook of different heuristic method...

  2. Machine Learning Technologies and Their Applications for Science and Engineering Domains Workshop -- Summary Report

    Science.gov (United States)

    Ambur, Manjula; Schwartz, Katherine G.; Mavris, Dimitri N.

    2016-01-01

    The fields of machine learning and big data analytics have made significant advances in recent years, which has created an environment where cross-fertilization of methods and collaborations can achieve previously unattainable outcomes. The Comprehensive Digital Transformation (CDT) Machine Learning and Big Data Analytics team planned a workshop at NASA Langley in August 2016 to unite leading experts in the field of machine learning and NASA scientists and engineers. The primary goal for this workshop was to assess the state-of-the-art in this field, introduce these leading experts to the aerospace and science subject matter experts, and develop opportunities for collaboration. The workshop was held over a three-day period with lectures from 15 leading experts followed by significant interactive discussions. This report provides an overview of the 15 invited lectures and a summary of the key discussion topics that arose during both formal and informal discussion sections. Four key workshop themes were identified after the closure of the workshop and are also highlighted in the report. Furthermore, several workshop attendees provided their feedback on how they are already utilizing machine learning algorithms to advance their research, new methods they learned about during the workshop, and collaboration opportunities they identified during the workshop.

  3. Student Modeling and Machine Learning

    OpenAIRE

    Sison, Raymund; Shimura, Masamichi

    1998-01-01

    After identifying essential student modeling issues and machine learning approaches, this paper examines how machine learning techniques have been used to automate the construction of student models as well as the background knowledge necessary for student modeling. In the process, the paper sheds light on the difficulty, suitability and potential of using machine learning for student modeling processes, and, to a lesser extent, the potential of using student modeling techniques in machine le...

  4. Quantum Machine Learning

    Science.gov (United States)

    Biswas, Rupak

    2018-01-01

    Quantum computing promises an unprecedented ability to solve intractable problems by harnessing quantum mechanical effects such as tunneling, superposition, and entanglement. The Quantum Artificial Intelligence Laboratory (QuAIL) at NASA Ames Research Center is the space agency's primary facility for conducting research and development in quantum information sciences. QuAIL conducts fundamental research in quantum physics but also explores how best to exploit and apply this disruptive technology to enable NASA missions in aeronautics, Earth and space sciences, and space exploration. At the same time, machine learning has become a major focus in computer science and captured the imagination of the public as a panacea to myriad big data problems. In this talk, we will discuss how classical machine learning can take advantage of quantum computing to significantly improve its effectiveness. Although we illustrate this concept on a quantum annealer, other quantum platforms could be used as well. If explored fully and implemented efficiently, quantum machine learning could greatly accelerate a wide range of tasks leading to new technologies and discoveries that will significantly change the way we solve real-world problems.

  5. What is the machine learning?

    Science.gov (United States)

    Chang, Spencer; Cohen, Timothy; Ostdiek, Bryan

    2018-03-01

    Applications of machine learning tools to problems of physical interest are often criticized for producing sensitivity at the expense of transparency. To address this concern, we explore a data planing procedure for identifying combinations of variables—aided by physical intuition—that can discriminate signal from background. Weights are introduced to smooth away the features in a given variable(s). New networks are then trained on this modified data. Observed decreases in sensitivity diagnose the variable's discriminating power. Planing also allows the investigation of the linear versus nonlinear nature of the boundaries between signal and background. We demonstrate the efficacy of this approach using a toy example, followed by an application to an idealized heavy resonance scenario at the Large Hadron Collider. By unpacking the information being utilized by these algorithms, this method puts in context what it means for a machine to learn.
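
    The weighting step mentioned in this abstract can be illustrated with a small sketch. One simple way to "smooth away" a variable is to weight each event by the inverse occupancy of its histogram bin, so that the weighted distribution of that variable becomes flat and carries no discriminating shape. The Python below assumes a fixed-width binning on a known range and would be applied separately to each class; the function name and binning choices are hypothetical, not taken from the paper.

```python
from collections import Counter
from typing import List, Sequence


def planing_weights(
    values: Sequence[float], lo: float, hi: float, n_bins: int
) -> List[float]:
    """Weights that make the weighted histogram of `values` flat on [lo, hi]."""
    width = (hi - lo) / n_bins

    def bin_index(x: float) -> int:
        # Clamp so that x == hi (or out-of-range values) land in an edge bin.
        return min(max(int((x - lo) / width), 0), n_bins - 1)

    indices = [bin_index(x) for x in values]
    counts = Counter(indices)
    # Each event is weighted by the inverse occupancy of its bin, so every
    # occupied bin ends up with a total weight of 1.
    return [1.0 / counts[i] for i in indices]


# Three events fall in the first bin and one in the second.
weights = planing_weights([0.1, 0.2, 0.3, 0.9], lo=0.0, hi=1.0, n_bins=2)
```

    A network retrained with such weights can no longer exploit the planed variable on its own, which is why a drop in sensitivity after planing indicates how much that variable contributed.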

  6. A Review of Machine Learning and Data Mining Approaches for Business Applications in Social Networks

    OpenAIRE

    Evis Trandafili; Marenglen Biba

    2013-01-01

    Social networks have an outstanding marketing value and developing data mining methods for viral marketing is a hot topic in the research community. However, most social networks remain impossible to fully analyze and understand due to their prohibitive sizes and the inability of traditional machine learning and data mining approaches to deal with the new dimension in the learning process related to the large-scale environment where the data are produced. On one hand, the birth and evolution...

  7. The application of machine learning to the modelling of percutaneous absorption: an overview and guide.

    Science.gov (United States)

    Ashrafi, P; Moss, G P; Wilkinson, S C; Davey, N; Sun, Y

    2015-01-01

    Machine learning (ML) methods have been applied to the analysis of a range of biological systems. This paper reviews the application of these methods to the problem domain of skin permeability and addresses critically some of the key issues. Specifically, ML methods offer great potential in both predictive ability and their ability to provide mechanistic insight into, in this case, the phenomena of skin permeation. However, they are beset by perceptions of a lack of transparency and, often, once an ML or related method has been published there is little impetus from other researchers to adopt such methods. This is usually due to the lack of transparency in some methods and the lack of availability of specific coding for running advanced ML methods. This paper reviews critically the application of ML methods to percutaneous absorption and addresses the key issue of transparency by describing in detail - and providing the detailed coding for - the process of running an ML method (in this case, a Gaussian process regression method). Although this method is applied here to the field of percutaneous absorption, it may be applied more broadly to any biological system.
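
    For readers unfamiliar with the method named in this abstract, the core of Gaussian process regression fits in a few lines. The Python sketch below computes only the posterior mean, k*^T (K + noise*I)^-1 y, for one-dimensional inputs with a squared-exponential kernel. It is a generic textbook illustration, not the code supplied with the paper; the kernel choice, the hyperparameter defaults and the toy data are assumptions.

```python
import math
from typing import List, Sequence


def rbf(x1: float, x2: float, length_scale: float) -> float:
    """Squared-exponential (RBF) kernel for scalar inputs."""
    return math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def gp_posterior_mean(
    x_train: Sequence[float],
    y_train: Sequence[float],
    x_new: float,
    length_scale: float = 1.0,
    noise: float = 1e-6,
) -> float:
    """Posterior mean of a zero-mean GP at x_new."""
    # Kernel matrix of the training inputs, with noise added on the diagonal.
    k = [
        [rbf(a, b, length_scale) + (noise if i == j else 0.0)
         for j, b in enumerate(x_train)]
        for i, a in enumerate(x_train)
    ]
    alpha = solve(k, list(y_train))
    return sum(rbf(x_new, xi, length_scale) * ai for xi, ai in zip(x_train, alpha))


# Toy data (hypothetical): the mean passes close to the training targets.
X_TRAIN = [0.0, 1.0, 2.0]
Y_TRAIN = [0.0, 1.0, 4.0]
mean_at_1 = gp_posterior_mean(X_TRAIN, Y_TRAIN, 1.0)
```

    With near-zero noise the posterior mean interpolates the training data, and far from any training point it falls back to the zero prior mean. A practical model would also report the posterior variance and fit the hyperparameters to the data.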

  8. Localised and Learnt Applications of Machine Learning for Robotic Incremental Sheet Forming

    DEFF Research Database (Denmark)

    Nicholas, Paul; Zwierzycki, Mateusz; Ramsgaard Thomsen, Mette

    2017-01-01

    Sheet Forming (ISF) and exemplified in the fabrication of a bridge structure. The methods we describe compensate for springback and improve forming tolerance by using localised in-process distance sensing to adapt tool-paths, and by using pre-process supervised machine learning to predict springback...

  9. Machine learning with R cookbook

    CERN Document Server

    Chiu, Yu-Wei

    2015-01-01

    If you want to learn how to use R for machine learning and gain insights from your data, then this book is ideal for you. Regardless of your level of experience, this book covers the basics of applying R to machine learning through to advanced techniques. While it is helpful if you are familiar with basic programming or machine learning concepts, you do not require prior experience to benefit from this book.

  10. A Study of Applications of Machine Learning Based Classification Methods for Virtual Screening of Lead Molecules.

    Science.gov (United States)

    Vyas, Renu; Bapat, Sanket; Jain, Esha; Tambe, Sanjeev S; Karthikeyan, Muthukumarasamy; Kulkarni, Bhaskar D

    2015-01-01

    The ligand-based virtual screening of combinatorial libraries employs a number of statistical modeling and machine learning methods. A comprehensive analysis of the application of these methods for the diversity oriented virtual screening of biological targets/drug classes is presented here. A number of classification models have been built using three types of inputs, namely structure based descriptors, molecular fingerprints and therapeutic category, for performing virtual screening. The activity and affinity descriptors of a set of inhibitors of four target classes DHFR, COX, LOX and NMDA have been utilized to train a total of six classifiers viz. Artificial Neural Network (ANN), k nearest neighbor (k-NN), Support Vector Machine (SVM), Naïve Bayes (NB), Decision Tree (DT) and Random Forest (RF). Among these classifiers, the ANN was found to be the best classifier with an AUC of 0.9 irrespective of the target. New molecular fingerprints based on pharmacophore, toxicophore and chemophore (PTC) were used to build the ANN models for each dataset. A good accuracy of 87.27% was obtained using 296 chemophoric binary fingerprints for the COX-LOX inhibitors compared to pharmacophoric (67.82%) and toxicophoric (70.64%). The methodology was validated on the classical Ames mutagenicity dataset of 4337 molecules. To evaluate it further, selectivity and promiscuity of molecules from five drug classes viz. anti-anginal, anti-convulsant, anti-depressant, anti-arrhythmic and anti-diabetic were studied. The PTC fingerprints computed for each category were able to capture the drug-class specific features using the k-NN classifier. These models can be useful for selecting optimal molecules for drug design.

  11. Application of Machine Learning Algorithms to the Study of Noise Artifacts in Gravitational-Wave Data

    Science.gov (United States)

    Biswas, Rahul; Blackburn, Lindy L.; Cao, Junwei; Essick, Reed; Hodge, Kari Alison; Katsavounidis, Erotokritos; Kim, Kyungmin; Kim, Young-Min; Le Bigot, Eric-Olivier; Lee, Chang-Hwan; et al.

    2014-01-01

    The sensitivity of searches for astrophysical transients in data from the Laser Interferometer Gravitational-wave Observatory (LIGO) is generally limited by the presence of transient, non-Gaussian noise artifacts, which occur at a high-enough rate such that accidental coincidence across multiple detectors is non-negligible. Furthermore, non-Gaussian noise artifacts typically dominate over the background contributed from stationary noise. These "glitches" can easily be confused for transient gravitational-wave signals, and their robust identification and removal will help any search for astrophysical gravitational-waves. We apply Machine Learning Algorithms (MLAs) to the problem, using data from auxiliary channels within the LIGO detectors that monitor degrees of freedom unaffected by astrophysical signals. Terrestrial noise sources may manifest characteristic disturbances in these auxiliary channels, inducing non-trivial correlations with glitches in the gravitational-wave data. The number of auxiliary-channel parameters describing these disturbances may also be extremely large; high dimensionality is an area where MLAs are particularly well-suited. We demonstrate the feasibility and applicability of three very different MLAs: Artificial Neural Networks, Support Vector Machines, and Random Forests. These classifiers identify and remove a substantial fraction of the glitches present in two very different data sets: four weeks of LIGO's fourth science run and one week of LIGO's sixth science run. We observe that all three algorithms agree on which events are glitches to within 10% for the sixth science run data, and support this by showing that the different optimization criteria used by each classifier generate the same decision surface, based on a likelihood-ratio statistic. Furthermore, we find that all classifiers obtain similar limiting performance, suggesting that most of the useful information currently contained in the auxiliary channel parameters we extract

  12. International Conference on Extreme Learning Machines 2014

    CERN Document Server

    Mao, Kezhi; Cambria, Erik; Man, Zhihong; Toh, Kar-Ann

    2015-01-01

    This book contains some selected papers from the International Conference on Extreme Learning Machine 2014, which was held in Singapore, December 8-10, 2014. This conference brought together the researchers and practitioners of Extreme Learning Machine (ELM) from a variety of fields to promote research and development of “learning without iterative tuning”. The book covers theories, algorithms and applications of ELM. It gives readers a glimpse of the most recent advances of ELM.

  13. An introduction to quantum machine learning

    OpenAIRE

    Schuld, M.; Sinayskiy, I.; Petruccione, F.

    2014-01-01

    Machine learning algorithms learn a desired input-output relation from examples in order to interpret new inputs. This is important for tasks such as image and speech recognition or strategy optimisation, with growing applications in the IT industry. In the last couple of years, researchers investigated if quantum computing can help to improve classical machine learning algorithms. Ideas range from running computationally costly algorithms or their subroutines efficiently on a quantum compute...

  14. Machine learning in healthcare informatics

    CERN Document Server

    Acharya, U; Dua, Prerna

    2014-01-01

    The book is a unique effort to represent a variety of techniques designed to represent, enhance, and empower multi-disciplinary and multi-institutional machine learning research in healthcare informatics. The book provides a unique compendium of current and emerging machine learning paradigms for healthcare informatics and reflects the diversity, complexity, and the depth and breadth of this multi-disciplinary area. The integrated, panoramic view of data and machine learning techniques can provide an opportunity for novel clinical insights and discoveries.

  15. Extensions and applications of ensemble-of-trees methods in machine learning

    Science.gov (United States)

    Bleich, Justin

    Ensemble-of-trees algorithms have moved to the forefront of machine learning due to their ability to generate high forecasting accuracy for a wide array of regression and classification problems. Classic ensemble methodologies such as random forests (RF) and stochastic gradient boosting (SGB) rely on algorithmic procedures to generate fits to data. In contrast, more recent ensemble techniques such as Bayesian Additive Regression Trees (BART) and Dynamic Trees (DT) focus on an underlying Bayesian probability model to generate the fits. These new probability model-based approaches show much promise versus their algorithmic counterparts, but also offer substantial room for improvement. The first part of this thesis focuses on methodological advances for ensemble-of-trees techniques with an emphasis on the more recent Bayesian approaches. In particular, we focus on extensions of BART in four distinct ways. First, we develop a more robust implementation of BART for both research and application. We then develop a principled approach to variable selection for BART as well as the ability to naturally incorporate prior information on important covariates into the algorithm. Next, we propose a method for handling missing data that relies on the recursive structure of decision trees and does not require imputation. Last, we relax the assumption of homoskedasticity in the BART model to allow for parametric modeling of heteroskedasticity. The second part of this thesis returns to the classic algorithmic approaches in the context of classification problems with asymmetric costs of forecasting errors. First we consider the performance of RF and SGB more broadly and demonstrate their superiority to logistic regression for applications in criminology with asymmetric costs. Next, we use RF to forecast unplanned hospital readmissions upon patient discharge with asymmetric costs taken into account. Finally, we explore the construction of stable decision trees for forecasts of

  16. Application of Machine-Learning Models to Predict Tacrolimus Stable Dose in Renal Transplant Recipients

    Science.gov (United States)

    Tang, Jie; Liu, Rong; Zhang, Yue-Li; Liu, Mou-Ze; Hu, Yong-Fang; Shao, Ming-Jie; Zhu, Li-Jun; Xin, Hua-Wen; Feng, Gui-Wen; Shang, Wen-Jun; Meng, Xiang-Guang; Zhang, Li-Rong; Ming, Ying-Zi; Zhang, Wei

    2017-02-01

    Tacrolimus has a narrow therapeutic window and considerable variability in clinical use. Our goal was to compare the performance of multiple linear regression (MLR) and eight machine learning techniques in pharmacogenetic algorithm-based prediction of tacrolimus stable dose (TSD) in a large Chinese cohort. A total of 1,045 renal transplant patients were recruited, 80% of whom were randomly selected as the “derivation cohort” to develop the dose-prediction algorithm, while the remaining 20% constituted the “validation cohort” to test the final selected algorithm. MLR, artificial neural network (ANN), regression tree (RT), multivariate adaptive regression splines (MARS), boosted regression tree (BRT), support vector regression (SVR), random forest regression (RFR), lasso regression (LAR) and Bayesian additive regression trees (BART) were applied and their performances were compared in this work. Among all the machine learning models, RT performed best in both derivation [0.71 (0.67-0.76)] and validation cohorts [0.73 (0.63-0.82)]. In addition, the ideal rate of RT was 4% higher than that of MLR. To our knowledge, this is the first study to use machine learning models to predict TSD, which will further facilitate personalized medicine in tacrolimus administration in the future.

  17. Inverse problems in machine learning: An application to brain activity interpretation

    International Nuclear Information System (INIS)

    Prato, M; Zanni, L

    2008-01-01

    In a typical machine learning problem one has to build a model from a finite training set which is able to generalize the properties characterizing the examples of the training set to new examples. The model has to reflect as much as possible the set of training examples but, especially in real-world problems in which the data are often corrupted by different sources of noise, it has to avoid too strict a dependence on the training examples themselves. Recent studies on the relationship between this kind of learning problem and the regularization theory for ill-posed inverse problems have given rise to new regularized learning algorithms. In this paper we recall some of these learning methods and we propose an accelerated version of the classical Landweber iterative scheme which proves particularly efficient from the computational viewpoint. Finally, we compare the performances of these methods with the classical Support Vector Machines learning algorithm on a real-world experiment concerning brain activity interpretation through the analysis of functional magnetic resonance imaging data.
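
    As background for this abstract, the classical Landweber scheme it refers to is the fixed-step iteration x_{k+1} = x_k + step * A^T (y - A x_k) for a linear problem A x = y. The Python sketch below shows only that classical, unaccelerated iteration on a toy system; the accelerated variant the authors propose is not reproduced here, and the helper names and example values are hypothetical.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def matvec(a: Matrix, x: Vector) -> Vector:
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def landweber(a: Matrix, y: Vector, step: float, n_iter: int) -> Vector:
    """Classical Landweber iteration for A x = y, starting from x = 0.

    The iteration converges when 0 < step < 2 / sigma_max(A)^2, where
    sigma_max is the largest singular value of A. Stopping after few
    iterations acts as regularization on noisy data.
    """
    at = transpose(a)
    x = [0.0] * len(a[0])
    for _ in range(n_iter):
        residual = [yi - axi for yi, axi in zip(y, matvec(a, x))]
        gradient = matvec(at, residual)  # A^T (y - A x)
        x = [xi + step * gi for xi, gi in zip(x, gradient)]
    return x


# Toy system: sigma_max^2 = 4, so any step below 0.5 is admissible.
A_EXAMPLE = [[2.0, 0.0], [0.0, 1.0]]
Y_EXAMPLE = [2.0, 3.0]
solution = landweber(A_EXAMPLE, Y_EXAMPLE, step=0.2, n_iter=200)
```

    On this noise-free example the iterates approach the exact solution (1, 3). With noisy data the number of iterations plays the role of the regularization parameter, which is the link to learning theory that the abstract describes.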

  18. Machine learning and statistical techniques : an application to the prediction of insolvency in Spanish non-life insurance companies

    OpenAIRE

    Díaz, Zuleyka; Segovia, María Jesús; Fernández, José

    2005-01-01

    Prediction of insurance company insolvency has arisen as an important problem in the field of financial research. Most methods applied in the past to tackle this issue are traditional statistical techniques which use financial ratios as explanatory variables. However, these variables often do not satisfy statistical assumptions, which complicates the application of the mentioned methods. In this paper, a comparative study of the performance of two non-parametric machine learning techniques ...

  19. Application of machine learning classification for structural brain MRI in mood disorders: Critical review from a clinical perspective.

    Science.gov (United States)

    Kim, Yong-Ku; Na, Kyoung-Sae

    2018-01-03

    Mood disorders are a highly prevalent group of mental disorders causing substantial socioeconomic burden. There are various methodological approaches for identifying the underlying mechanisms of the etiology, symptomatology, and therapeutics of mood disorders; however, neuroimaging studies have provided the most direct evidence for mood disorder neural substrates by visualizing the brains of living individuals. The prefrontal cortex, hippocampus, amygdala, thalamus, ventral striatum, and corpus callosum are associated with depression and bipolar disorder. Identifying the distinct and common contributions of these anatomical regions to depression and bipolar disorder has broadened and deepened our understanding of mood disorders. However, the extent to which neuroimaging research findings contribute to clinical practice in the real-world setting is unclear. As traditional or non-machine learning MRI studies have analyzed group-level differences, it is not possible to directly translate findings from research to clinical practice; the knowledge gained pertains to the disorder, but not to individuals. On the other hand, a machine learning approach makes it possible to provide individual-level classifications. For the past two decades, many studies have reported on the classification accuracy of machine learning-based neuroimaging studies from the perspective of diagnosis and treatment response. However, for the application of a machine learning-based brain MRI approach in real world clinical settings, several major issues should be considered. Secondary changes due to illness duration and medication, clinical subtypes and heterogeneity, comorbidities, and cost-effectiveness restrict the generalization of the current machine learning findings. Sophisticated classification of clinical and diagnostic subtypes is needed. Additionally, as the approach is inevitably limited by sample size, multi-site participation and data-sharing are needed in the future.

  20. Application of machine learning algorithms to the study of noise artifacts in gravitational-wave data

    Science.gov (United States)

    Biswas, Rahul; Blackburn, Lindy; Cao, Junwei; Essick, Reed; Hodge, Kari Alison; Katsavounidis, Erotokritos; Kim, Kyungmin; Kim, Young-Min; Le Bigot, Eric-Olivier; Lee, Chang-Hwan; Oh, John J.; Oh, Sang Hoon; Son, Edwin J.; Tao, Ye; Vaulin, Ruslan; Wang, Xiaoge

    2013-09-01

    The sensitivity of searches for astrophysical transients in data from the Laser Interferometer Gravitational-wave Observatory (LIGO) is generally limited by the presence of transient, non-Gaussian noise artifacts, which occur at a high enough rate such that accidental coincidence across multiple detectors is non-negligible. These “glitches” can easily be mistaken for transient gravitational-wave signals, and their robust identification and removal will help any search for astrophysical gravitational waves. We apply machine-learning algorithms (MLAs) to the problem, using data from auxiliary channels within the LIGO detectors that monitor degrees of freedom unaffected by astrophysical signals. Noise sources may produce artifacts in these auxiliary channels as well as the gravitational-wave channel. The number of auxiliary-channel parameters describing these disturbances may also be extremely large; high dimensionality is an area where MLAs are particularly well suited. We demonstrate the feasibility and applicability of three different MLAs: artificial neural networks, support vector machines, and random forests. These classifiers identify and remove a substantial fraction of the glitches present in two different data sets: four weeks of LIGO’s fourth science run and one week of LIGO’s sixth science run. We observe that all three algorithms agree on which events are glitches to within 10% for the sixth-science-run data, and support this by showing that the different optimization criteria used by each classifier generate the same decision surface, based on a likelihood-ratio statistic. Furthermore, we find that all classifiers obtain similar performance to the benchmark algorithm, the ordered veto list, which is optimized to detect pairwise correlations between transients in LIGO auxiliary channels and glitches in the gravitational-wave data. This suggests that most of the useful information currently extracted from the auxiliary channels is already described

  1. Adaptive Machine Aids to Learning.

    Science.gov (United States)

    Starkweather, John A.

    With emphasis on man-machine relationships and on machine evolution, computer-assisted instruction (CAI) is examined in this paper. The discussion includes the background of machine assistance to learning, the current status of CAI, directions of development, the development of criteria for successful instruction, meeting the needs of users,…

  2. A Mobile Health Application to Predict Postpartum Depression Based on Machine Learning.

    Science.gov (United States)

    Jiménez-Serrano, Santiago; Tortajada, Salvador; García-Gómez, Juan Miguel

    2015-07-01

    Postpartum depression (PPD) is a disorder that often goes undiagnosed. The development of a screening program requires considerable and careful effort, where evidence-based decisions have to be taken in order to obtain an effective test with a high level of sensitivity and an acceptable specificity that is quick to perform, easy to interpret, culturally sensitive, and cost-effective. The purpose of this article is twofold: first, to develop classification models for detecting the risk of PPD during the first week after childbirth, thus enabling early intervention; and second, to develop a mobile health (m-health) application (app) for the Android® (Google, Mountain View, CA) platform based on the model with best performance for both mothers who have just given birth and clinicians who want to monitor their patient's test. A set of predictive models for estimating the risk of PPD was trained using machine learning techniques and data about postpartum women collected from seven Spanish hospitals. An internal evaluation was carried out using a hold-out strategy. An easy flowchart and architecture for designing the graphical user interface of the m-health app was followed. Naive Bayes showed the best balance between sensitivity and specificity as a predictive model for PPD during the first week after delivery. It was integrated into the clinical decision support system for Android mobile apps. This approach can enable the early prediction and detection of PPD because it fulfills the conditions of an effective screening test with a high level of sensitivity and specificity that is quick to perform, easy to interpret, culturally sensitive, and cost-effective.

  3. Application of Machine Learning Approaches for Classifying Sitting Posture Based on Force and Acceleration Sensors

    Directory of Open Access Journals (Sweden)

    Roland Zemp

    2016-01-01

    Occupational musculoskeletal disorders, particularly chronic low back pain (LBP), are ubiquitous due to prolonged static sitting or nonergonomic sitting positions. Therefore, the aim of this study was to develop an instrumented chair with force and acceleration sensors to determine the accuracy of automatically identifying the user’s sitting position by applying five different machine learning methods (Support Vector Machines, Multinomial Regression, Boosting, Neural Networks, and Random Forest). Forty-one subjects were requested to sit four times in seven different prescribed sitting positions (total 1148 samples). Sixteen force sensor values and the backrest angle were used as the explanatory variables (features) for the classification. The different classification methods were compared by means of a Leave-One-Out cross-validation approach. The best performance was achieved using the Random Forest classification algorithm, producing a mean classification accuracy of 90.9% for subjects with which the algorithm was not familiar. The classification accuracy varied between 81% and 98% for the seven different sitting positions. The present study showed the possibility of accurately classifying different sitting positions by means of the introduced instrumented office chair combined with machine learning analyses. The use of such novel approaches for the accurate assessment of chair usage could offer insights into the relationships between sitting position, sitting behaviour, and the occurrence of musculoskeletal disorders.

  4. Application of Machine Learning Approaches for Classifying Sitting Posture Based on Force and Acceleration Sensors.

    Science.gov (United States)

    Zemp, Roland; Tanadini, Matteo; Plüss, Stefan; Schnüriger, Karin; Singh, Navrag B; Taylor, William R; Lorenzetti, Silvio

    2016-01-01

    Occupational musculoskeletal disorders, particularly chronic low back pain (LBP), are ubiquitous due to prolonged static sitting or nonergonomic sitting positions. Therefore, the aim of this study was to develop an instrumented chair with force and acceleration sensors to determine the accuracy of automatically identifying the user's sitting position by applying five different machine learning methods (Support Vector Machines, Multinomial Regression, Boosting, Neural Networks, and Random Forest). Forty-one subjects were requested to sit four times in seven different prescribed sitting positions (total 1148 samples). Sixteen force sensor values and the backrest angle were used as the explanatory variables (features) for the classification. The different classification methods were compared by means of a Leave-One-Out cross-validation approach. The best performance was achieved using the Random Forest classification algorithm, producing a mean classification accuracy of 90.9% for subjects with which the algorithm was not familiar. The classification accuracy varied between 81% and 98% for the seven different sitting positions. The present study showed the possibility of accurately classifying different sitting positions by means of the introduced instrumented office chair combined with machine learning analyses. The use of such novel approaches for the accurate assessment of chair usage could offer insights into the relationships between sitting position, sitting behaviour, and the occurrence of musculoskeletal disorders.
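
    The evaluation scheme described above can be sketched in a few lines. This is only an illustration, assuming that the "Leave-One-Out" folds are formed per subject (which the phrase "subjects with which the algorithm was not familiar" suggests but the abstract does not state outright). A nearest-centroid classifier stands in for the Random Forest, and the two-feature toy data, labels, and function names are hypothetical.

```python
from collections import defaultdict


def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject."""
    by_subject = defaultdict(list)
    for idx, sid in enumerate(subject_ids):
        by_subject[sid].append(idx)
    for sid, test_idx in by_subject.items():
        train_idx = [i for i, s in enumerate(subject_ids) if s != sid]
        yield sid, train_idx, test_idx


def fit_centroids(X, y):
    """Stand-in classifier (NOT a Random Forest): one mean vector per class."""
    sums, counts = {}, defaultdict(int)
    for row, label in zip(X, y):
        if label not in sums:
            sums[label] = [0.0] * len(row)
        sums[label] = [s + v for s, v in zip(sums[label], row)]
        counts[label] += 1
    return {lab: [s / counts[lab] for s in total] for lab, total in sums.items()}


def predict(centroids, row):
    """Return the label whose centroid is closest to the sample."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(row, c))
    return min(centroids, key=lambda lab: sqdist(centroids[lab]))


def loso_accuracy(X, y, subject_ids):
    """Accuracy over samples from subjects never seen during training."""
    correct = total = 0
    for _, train_idx, test_idx in leave_one_subject_out(subject_ids):
        model = fit_centroids([X[i] for i in train_idx], [y[i] for i in train_idx])
        for i in test_idx:
            correct += predict(model, X[i]) == y[i]
            total += 1
    return correct / total


# Hypothetical toy data: two sensor features, two postures, three subjects.
X = [[0.0, 0.1], [1.0, 0.9], [0.1, 0.0], [0.9, 1.0], [0.05, 0.05], [0.95, 0.95]]
y = ["upright", "slouched", "upright", "slouched", "upright", "slouched"]
subjects = ["s1", "s1", "s2", "s2", "s3", "s3"]
print(loso_accuracy(X, y, subjects))
```

    Holding out all samples of one subject at a time keeps the test fold free of that subject's other recordings, which is what makes the reported accuracy an estimate for unseen users.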

  5. Application of Machine Learning in Postural Control Kinematics for the Diagnosis of Alzheimer’s Disease

    Directory of Open Access Journals (Sweden)

    Luís Costa

    2016-01-01

    The use of wearable devices to study gait and postural control is a growing field in neurodegenerative disorders such as Alzheimer’s disease (AD). In this paper, we investigate if machine-learning classifiers offer the discriminative power for the diagnosis of AD based on postural control kinematics. We compared Support Vector Machines (SVMs), Multiple Layer Perceptrons (MLPs), Radial Basis Function Neural Networks (RBNs), and Deep Belief Networks (DBNs) on 72 participants (36 AD patients and 36 healthy subjects) exposed to seven increasingly difficult postural tasks. The decisional space was composed of 18 kinematic variables (adjusted for age, education, height, and weight), with or without neuropsychological evaluation (Montreal cognitive assessment (MoCA) score), top ranked in an error incremental analysis. Classification results were based on threefold cross validation of 50 independent and randomized runs sets: training (50%), test (40%), and validation (10%). Having a decisional space relying solely on postural kinematics, accuracy of AD diagnosis ranged from 71.7 to 86.1%. Adding the MoCA variable, the accuracy ranged between 91 and 96.6%. MLP classifier achieved top performance in both decisional spaces. Having comprehended the interdynamic interaction between postural stability and cognitive performance, our results endorse machine-learning models as a useful tool for computer-aided diagnosis of AD based on postural control kinematics.

  6. Issues of Application of Machine Learning Models for Virtual and Real-Life Buildings

    Directory of Open Access Journals (Sweden)

    Young Min Kim

    2016-06-01

    The current Building Energy Performance Simulation (BEPS) tools are based on first principles. For the correct use of BEPS tools, simulationists should have an in-depth understanding of building physics, numerical methods, control logics of building systems, etc. However, it takes significant time and effort to develop a first principles-based simulation model for existing buildings, mainly due to the laborious process of data gathering, uncertain inputs, model calibration, etc. Rather than resorting to an expert’s effort, a data-driven approach (so-called “inverse” approach) has received growing attention for the simulation of existing buildings. This paper reports a cross-comparison of three popular machine learning models (Artificial Neural Network (ANN), Support Vector Machine (SVM), and Gaussian Process (GP)) for predicting a chiller’s energy consumption in a virtual and a real-life building. The predictions based on the three models are sufficiently accurate compared to the virtual and real measurements. This paper addresses the following issues for the successful development of machine learning models: reproducibility, selection of inputs, training period, outlying data obtained from the building energy management system (BEMS), and validation of the models. From the result of this comparative study, it was found that SVM has a disadvantage in computation time compared to ANN and GP. GP is the most sensitive to a training period among the three models.

  7. Food category consumption and obesity prevalence across countries: an application of Machine Learning method to big data analysis

    Science.gov (United States)

    Dunstan, Jocelyn; Fallah-Fini, Saeideh; Nau, Claudia; Glass, Thomas; Global Obesity Prevention Center Team

    The application of sophisticated mathematical and numerical tools in public health has been demonstrated to be useful in predicting the outcome of public intervention as well as in studying, for example, the main causes of obesity without doing experiments with the population. In this project we aim to understand which kind of food consumed in different countries over time best defines the rate of obesity in those countries. The use of Machine Learning is particularly useful because we do not need to create a hypothesis and test it with the data, but instead we learn from the data to find the groups of food that best describe the prevalence of obesity.

  8. Machine Learning for Robotic Vision

    OpenAIRE

    Drummond, Tom

    2018-01-01

    Machine learning is a crucial enabling technology for robotics, in particular for unlocking the capabilities afforded by visual sensing. This talk will present research within Prof Drummond’s lab that explores how machine learning can be developed and used within the context of Robotic Vision.

  9. Quantitative forecasting of PTSD from early trauma responses: A Machine Learning application

    DEFF Research Database (Denmark)

    Galatzer-Levy, I. R.; Karstoft, K. I.; Statnikov, A.

    2014-01-01

    -traumatic stress disorder (PTSD) is plausible given the disorder's salient onset and the abundance of putative biological and clinical risk indicators. This work evaluates the ability of Machine Learning (ML) forecasting approaches to identify and integrate a panel of unique predictive characteristics...... algorithm identified a set of predictors that rendered all others redundant. Support Vector Machines (SVMs) as well as other ML classification algorithms were used to evaluate the forecasting accuracy of i) ML selected features, ii) all available features without selection, and iii) Acute Stress Disorder......). The feature selection algorithm identified 16 predictors, present in >= 95% cross-validation trials. The accuracy of predicting non-remitting PTSD from that set (AUC = .77) did not differ from predicting from all available information (AUC = .78). Predicting from ASD symptoms was not better than chance (AUC...

  10. Variable complexity online sequential extreme learning machine, with applications to streamflow prediction

    Science.gov (United States)

    Lima, Aranildo R.; Hsieh, William W.; Cannon, Alex J.

    2017-12-01

    In situations where new data arrive continually, online learning algorithms are computationally much less costly than batch learning ones in maintaining the model up-to-date. The extreme learning machine (ELM), a single hidden layer artificial neural network with random weights in the hidden layer, is solved by linear least squares, and has an online learning version, the online sequential ELM (OSELM). As more data become available during online learning, information on the longer time scale becomes available, so ideally the model complexity should be allowed to change, but the number of hidden nodes (HN) remains fixed in OSELM. A variable complexity VC-OSELM algorithm is proposed to dynamically add or remove HN in the OSELM, allowing the model complexity to vary automatically as online learning proceeds. The performance of VC-OSELM was compared with OSELM in daily streamflow predictions at two hydrological stations in British Columbia, Canada, with VC-OSELM significantly outperforming OSELM in mean absolute error, root mean squared error and Nash-Sutcliffe efficiency at both stations.
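
    The three scores named above have standard definitions, sketched here for reference. The observed and simulated series are made-up numbers, not data from the study, and the function names are hypothetical.

```python
import math


def mean_absolute_error(obs, sim):
    """Average absolute difference between observations and predictions."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)


def root_mean_squared_error(obs, sim):
    """Square root of the mean squared difference."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def nash_sutcliffe_efficiency(obs, sim):
    """1 - (squared error of the model) / (squared error of the observed mean).

    1.0 is a perfect fit; 0.0 means the model is no better than always
    predicting the mean of the observations; negative values are worse.
    """
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / spread


# Made-up daily streamflow values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [2.0, 2.0, 2.0, 4.0]
print(mean_absolute_error(observed, simulated))
print(root_mean_squared_error(observed, simulated))
print(nash_sutcliffe_efficiency(observed, simulated))
```

    Lower is better for the two error measures, while higher is better for the Nash-Sutcliffe efficiency, so "outperforming" in all three means smaller errors and a larger efficiency.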

  11. Machine Learning for Medical Imaging.

    Science.gov (United States)

    Erickson, Bradley J; Korfiatis, Panagiotis; Akkus, Zeynettin; Kline, Timothy L

    2017-01-01

    Machine learning is a technique for recognizing patterns that can be applied to medical images. Although it is a powerful tool that can help in rendering medical diagnoses, it can be misapplied. Machine learning typically begins with the machine learning algorithm system computing the image features that are believed to be of importance in making the prediction or diagnosis of interest. The machine learning algorithm system then identifies the best combination of these image features for classifying the image or computing some metric for the given image region. There are several methods that can be used, each with different strengths and weaknesses. There are open-source versions of most of these machine learning methods that make them easy to try and apply to images. Several metrics for measuring the performance of an algorithm exist; however, one must be aware of the possible associated pitfalls that can result in misleading metrics. More recently, deep learning has started to be used; this method has the benefit that it does not require image feature identification and calculation as a first step; rather, features are identified as part of the learning process. Machine learning has been used in medical imaging and will have a greater influence in the future. Those working in medical imaging must be aware of how machine learning works. © RSNA, 2017.

  12. Model-Agnostic Interpretability of Machine Learning

    OpenAIRE

    Ribeiro, Marco Tulio; Singh, Sameer; Guestrin, Carlos

    2016-01-01

    Understanding why machine learning models behave the way they do empowers both system designers and end-users in many ways: in model selection, feature engineering, in order to trust and act upon the predictions, and in more intuitive user interfaces. Thus, interpretability has become a vital concern in machine learning, and work in the area of interpretable models has found renewed interest. In some applications, such models are as accurate as non-interpretable ones, and thus are preferred f...

  13. Genetic Algorithms for Optimization of Machine-learning Models and their Applications in Bioinformatics

    KAUST Repository

    Magana-Mora, Arturo

    2017-04-29

    Machine-learning (ML) techniques have been widely applied to solve different problems in biology. However, biological data are large and complex, which often results in extremely intricate ML models. Frequently, these models may have a poor performance or may be computationally unfeasible. This study presents a set of novel computational methods and focuses on the application of genetic algorithms (GAs) for the simplification and optimization of ML models and their applications to biological problems. The dissertation addresses the following three challenges. The first is to develop a generalizable classification methodology able to systematically derive competitive models despite the complexity and nature of the data. Although several algorithms for the induction of classification models have been proposed, the algorithms are data dependent. Consequently, we developed OmniGA, a novel and generalizable framework that uses different classification models in a tree-like decision structure, along with a parallel GA for the optimization of the OmniGA structure. Results show that OmniGA consistently outperformed existing commonly used classification models. The second challenge is the prediction of translation initiation sites (TIS) in plant genomic DNA. We performed a statistical analysis of the genomic DNA and proposed a new set of discriminant features for this problem. We developed a wrapper method based on GAs for selecting an optimal feature subset, which, in conjunction with a classification model, produced the most accurate framework for the recognition of TIS in plants. Moreover, results demonstrate that despite the evolutionary distance between different plants, our approach successfully identified conserved genomic elements that may serve as the starting point for the development of a generic model for prediction of TIS in eukaryotic organisms. Finally, the third challenge is the accurate prediction of polyadenylation signals in human genomic DNA.
To achieve

  14. Application of machine learning and expert systems to Statistical Process Control (SPC) chart interpretation

    Science.gov (United States)

    Shewhart, Mark

    1991-01-01

    Statistical Process Control (SPC) charts are one of several tools used in quality control. Other tools include flow charts, histograms, cause and effect diagrams, check sheets, Pareto diagrams, graphs, and scatter diagrams. A control chart is simply a graph which indicates process variation over time. The purpose of drawing a control chart is to detect any changes in the process signalled by abnormal points or patterns on the graph. The Artificial Intelligence Support Center (AISC) of the Acquisition Logistics Division has developed a hybrid machine learning expert system prototype which automates the process of constructing and interpreting control charts.
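
    As a rough illustration of how "abnormal points" on a control chart can be detected, the sketch below applies the common three-sigma rule. This is an assumption for illustration: the abstract does not say which rules the AISC prototype uses, and practical Shewhart charts often estimate sigma from subgroup or moving ranges rather than from the sample standard deviation used here. All values and function names are hypothetical.

```python
import statistics


def control_limits(baseline, n_sigma=3.0):
    """Return (lower limit, center line, upper limit) from in-control baseline data."""
    center = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)  # simplification: sample standard deviation
    return center - n_sigma * sigma, center, center + n_sigma * sigma


def out_of_control(points, lower, upper):
    """Indices of points that fall outside the control limits."""
    return [i for i, x in enumerate(points) if x < lower or x > upper]


# Hypothetical measurements from a process assumed to be stable.
baseline = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9]
lower, center, upper = control_limits(baseline)

# New measurements to be judged against the limits.
new_points = [10.0, 10.3, 11.5, 9.7, 8.2]
print(out_of_control(new_points, lower, upper))
```

    Detecting abnormal patterns (runs, trends, cycles) needs additional rules beyond this single-point check, which is the part an expert system or learned classifier would have to cover.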

  15. MLnet report: training in Europe on machine learning

    OpenAIRE

    Ellebrecht, Mario; Morik, Katharina

    1999-01-01

    Machine learning techniques offer opportunities for a variety of applications and the theory of machine learning investigates problems that are of interest for other fields of computer science (e.g., complexity theory, logic programming, pattern recognition). However, the impacts of machine learning can only be recognized by those who know the techniques and are able to apply them. Hence, teaching machine learning is necessary before this field can diversify computer science. In order ...

  16. Online transfer learning with extreme learning machine

    Science.gov (United States)

    Yin, Haibo; Yang, Yun-an

    2017-05-01

    In this paper, we propose a new transfer learning algorithm for online training. The proposed algorithm, which is called Online Transfer Extreme Learning Machine (OTELM), is based on Online Sequential Extreme Learning Machine (OSELM) while it introduces Semi-Supervised Extreme Learning Machine (SSELM) to transfer knowledge from the source to the target domain. With the manifold regularization, SSELM picks out instances from the source domain that are less relevant to those in the target domain to initialize the online training, so as to improve the classification performance. Experimental results demonstrate that the proposed OTELM can effectively use instances in the source domain to enhance the learning performance.

  17. Probability Machines: Consistent Probability Estimation Using Nonparametric Learning Machines

    Science.gov (United States)

    Malley, J. D.; Kruppa, J.; Dasgupta, A.; Malley, K. G.; Ziegler, A.

    2011-01-01

    Background: Most machine learning approaches only provide a classification for binary responses. However, probabilities are required for risk estimation using individual patient characteristics. It has been shown recently that every statistical learning machine known to be consistent for a nonparametric regression problem is a probability machine that is provably consistent for this estimation problem. Objectives: The aim of this paper is to show how random forests and nearest neighbors can be used for consistent estimation of individual probabilities. Methods: Two random forest algorithms and two nearest neighbor algorithms are described in detail for estimation of individual probabilities. We discuss the consistency of random forests, nearest neighbors and other learning machines in detail. We conduct a simulation study to illustrate the validity of the methods. We exemplify the algorithms by analyzing two well-known data sets on the diagnosis of appendicitis and the diagnosis of diabetes in Pima Indians. Results: Simulations demonstrate the validity of the method. With the real data application, we show the accuracy and practicality of this approach. We provide sample code from R packages in which the probability estimation is already available. This means that all calculations can be performed using existing software. Conclusions: Random forest algorithms as well as nearest neighbor approaches are valid machine learning methods for estimating individual probabilities for binary responses. Freely available implementations are available in R and may be used for applications. PMID:21915433
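
    The core idea, treating the 0/1 response as a regression target so that the fitted value is a probability estimate, can be sketched with a plain k-nearest-neighbor average. This is a minimal illustration, not the paper's R code; the one-dimensional toy data, the choice k = 3, and the function name are assumptions.

```python
def knn_probability(train_X, train_y, query, k=3):
    """Estimate P(y = 1 | query) as the mean 0/1 label of the k nearest training points."""
    def sqdist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    # Rank training points by squared Euclidean distance to the query.
    ranked = sorted(range(len(train_X)), key=lambda i: sqdist(train_X[i], query))
    nearest = ranked[:k]
    # Averaging 0/1 labels is regression on a binary response.
    return sum(train_y[i] for i in nearest) / k


# Hypothetical data: one feature, label 1 = disease present.
train_X = [[0.0], [0.2], [0.4], [0.6], [0.8], [1.0]]
train_y = [0, 0, 0, 1, 1, 1]

print(knn_probability(train_X, train_y, [0.1]))
print(knn_probability(train_X, train_y, [0.35]))
print(knn_probability(train_X, train_y, [0.9]))
```

    A classifier would report only the majority label among the neighbors; returning the average instead gives the individual probability that risk estimation needs.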

  18. Application of machine learning for the evaluation of turfgrass plots using aerial images

    Science.gov (United States)

    Ding, Ke; Raheja, Amar; Bhandari, Subodh; Green, Robert L.

    2016-05-01

    Historically, investigation of turfgrass characteristics has been limited to visual ratings. Although relevant information may result from such evaluations, final inferences may be questionable because of the subjective nature in which the data is collected. Recent advances in computer vision techniques allow researchers to objectively measure turfgrass characteristics such as percent ground cover, turf color, and turf quality from digital images. This paper focuses on developing a methodology for automated assessment of turfgrass quality from aerial images. Images of several turfgrass plots of varying quality were gathered using a camera mounted on an unmanned aerial vehicle. The quality of these plots was also evaluated based on visual ratings. The goal was to use the aerial images to generate quality evaluations on a regular basis for the optimization of water treatment. Aerial images are used to train a neural network so that appropriate features such as intensity, color, and texture of the turfgrass are extracted from these images. A neural network is a nonlinear classifier commonly used in machine learning. The output of the trained neural network model is a rating of the grass, which is compared to the visual ratings. Currently, the quality and the color of turfgrass, measured as the greenness of the grass, are evaluated. The textures are calculated using the Gabor filter and co-occurrence matrix. Other classifiers such as support vector machines and simpler linear regression models such as Ridge regression and LARS regression are also used. The performance of each model is compared. The results show encouraging potential for using machine learning techniques for the evaluation of turfgrass quality and color.

  19. Machine learning classification with confidence: application of transductive conformal predictors to MRI-based diagnostic and prognostic markers in depression.

    Science.gov (United States)

    Nouretdinov, Ilia; Costafreda, Sergi G; Gammerman, Alexander; Chervonenkis, Alexey; Vovk, Vladimir; Vapnik, Vladimir; Fu, Cynthia H Y

    2011-05-15

    There is rapidly accumulating evidence that the application of machine learning classification to neuroimaging measurements may be valuable for the development of diagnostic and prognostic prediction tools in psychiatry. However, current methods do not produce a measure of the reliability of the predictions. Knowing the risk of the error associated with a given prediction is essential for the development of neuroimaging-based clinical tools. We propose a general probabilistic classification method to produce measures of confidence for magnetic resonance imaging (MRI) data. We describe the application of transductive conformal predictor (TCP) to MRI images. TCP generates the most likely prediction and a valid measure of confidence, as well as the set of all possible predictions for a given confidence level. We present the theoretical motivation for TCP, and we have applied TCP to structural and functional MRI data in patients and healthy controls to investigate diagnostic and prognostic prediction in depression. We verify that TCP predictions are as accurate as those obtained with more standard machine learning methods, such as support vector machine, while providing the additional benefit of a valid measure of confidence for each prediction. Copyright © 2010 Elsevier Inc. All rights reserved.
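
    The mechanics of a transductive conformal predictor can be sketched as follows: each candidate label is tentatively assigned to the new case, a nonconformity score is computed for every example, and the p-value of the label is the fraction of examples at least as nonconforming as the new one. The nearest-neighbor nonconformity score used here is a common textbook choice and an assumption; the abstract does not say which score the authors used. The one-dimensional data and all names are hypothetical, and the sketch assumes at least two classes are present in the training data.

```python
import math


def dist(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def nonconformity(i, X, y):
    """Distance to nearest same-label example divided by distance to nearest other-label example."""
    same = min((dist(X[i], X[j]) for j in range(len(X)) if j != i and y[j] == y[i]),
               default=math.inf)
    other = min((dist(X[i], X[j]) for j in range(len(X)) if y[j] != y[i]),
                default=math.inf)
    return math.inf if other == 0.0 else same / other


def conformal_p_values(train_X, train_y, x_new, labels):
    """p-value of each candidate label for x_new."""
    p_values = {}
    for label in labels:
        X = train_X + [x_new]          # tentatively add the new case ...
        y = train_y + [label]          # ... with the candidate label
        scores = [nonconformity(i, X, y) for i in range(len(X))]
        new_score = scores[-1]
        p_values[label] = sum(s >= new_score for s in scores) / len(scores)
    return p_values


def predict_with_confidence(p_values):
    """Return (prediction, confidence, credibility)."""
    ranked = sorted(p_values, key=p_values.get, reverse=True)
    best, runner_up = ranked[0], ranked[1]
    return best, 1.0 - p_values[runner_up], p_values[best]


def prediction_set(p_values, significance):
    """All labels that cannot be rejected at the given significance level."""
    return {label for label, p in p_values.items() if p > significance}


# Hypothetical one-feature data.
train_X = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]
train_y = ["control"] * 3 + ["patient"] * 3
p = conformal_p_values(train_X, train_y, [0.15], ["control", "patient"])
print(p)
print(predict_with_confidence(p))
print(prediction_set(p, 0.2))
```

    Confidence is one minus the second-largest p-value, so it is high only when every alternative label looks implausible; the prediction set widens as the required confidence level rises.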

  20. Machine Learning Techniques in Clinical Vision Sciences.

    Science.gov (United States)

    Caixinha, Miguel; Nunes, Sandrina

    2017-01-01

    This review presents and discusses the contribution of machine learning techniques for diagnosis and disease monitoring in the context of clinical vision science. Many ocular diseases leading to blindness can be halted or delayed when detected and treated at their earliest stages. With the recent developments in diagnostic devices, imaging and genomics, new sources of data for early disease detection and patients' management are now available. Machine learning techniques emerged in the biomedical sciences as clinical decision-support techniques to improve sensitivity and specificity of disease detection and monitoring, making the clinical decision-making process more objective. This manuscript presents a review in multimodal ocular disease diagnosis and monitoring based on machine learning approaches. In the first section, the technical issues related to the different machine learning approaches will be presented. Machine learning techniques are used to automatically recognize complex patterns in a given dataset. These techniques allow creating homogeneous groups (unsupervised learning), or creating a classifier predicting group membership of new cases (supervised learning), when a group label is available for each case. To ensure a good performance of the machine learning techniques in a given dataset, all possible sources of bias should be removed or minimized. For that, the representativeness of the input dataset for the true population should be confirmed, the noise should be removed, the missing data should be treated and the data dimensionality (i.e., the number of parameters/features and the number of cases in the dataset) should be adjusted. The application of machine learning techniques in ocular disease diagnosis and monitoring will be presented and discussed in the second section of this manuscript.
To show the clinical benefits of machine learning in clinical vision sciences, several examples will be presented in glaucoma, age-related macular degeneration

  1. Machine Learning: An Algorithmic Perspective

    CERN Document Server

    Marsland, Stephen

    2009-01-01

    Traditional books on machine learning can be divided into two groups - those aimed at advanced undergraduates or early postgraduates with reasonable mathematical knowledge and those that are primers on how to code algorithms. The field is ready for a text that not only demonstrates how to use the algorithms that make up machine learning methods, but also provides the background needed to understand how and why these algorithms work. Machine Learning: An Algorithmic Perspective is that text. Theory Backed up by Practical Examples: The book covers neural networks, graphical models, reinforcement le

  2. Machine learning for healthcare technologies

    CERN Document Server

    Clifton, David A

    2016-01-01

    This book brings together chapters on the state-of-the-art in machine learning (ML) as it applies to the development of patient-centred technologies, with a special emphasis on 'big data' and mobile data.

  3. Machine Learning via Mathematical Programming

    National Research Council Canada - National Science Library

    Mangasarian, Olvi

    1999-01-01

    Mathematical programming approaches were applied to a variety of problems in machine learning in order to gain deeper understanding of the problems and to come up with new and more efficient computational algorithms...

  4. Machine Learning examples on Invenio

    CERN Document Server

    CERN. Geneva

    2017-01-01

    This talk will present the different Machine Learning tools that INSPIRE is developing and integrating in order to automate as much as possible content selection and curation in a subject-based repository.

  5. Scikit-learn: Machine Learning in Python

    OpenAIRE

    Pedregosa, Fabian; Varoquaux, Gaël; Gramfort, Alexandre; Michel, Vincent; Thirion, Bertrand; Grisel, Olivier; Blondel, Mathieu; Prettenhofer, Peter; Weiss, Ron; Dubourg, Vincent; Vanderplas, Jake; Passos, Alexandre; Cournapeau, David; Brucher, Matthieu; Perrot, Matthieu

    2011-01-01

    Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distributed under the simplified BSD license, encouraging its use in both academic ...

  6. Scikit-learn: Machine Learning in Python

    OpenAIRE

    Pedregosa, Fabian; Varoquaux, Gaël; Gramfort, Alexandre; Michel, Vincent; Thirion, Bertrand; Grisel, Olivier; Blondel, Mathieu; Louppe, Gilles; Prettenhofer, Peter; Weiss, Ron; Dubourg, Vincent; Vanderplas, Jake; Passos, Alexandre; Cournapeau, David; Brucher, Matthieu

    2012-01-01

    Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distributed under the simplified BSD license, encouraging its use in both academic and commercial settings....

  7. Quick Estimation Model for the Concentration of Indoor Airborne Culturable Bacteria: An Application of Machine Learning

    Directory of Open Access Journals (Sweden)

    Zhijian Liu

    2017-07-01

    Indoor airborne culturable bacteria are sometimes harmful to human health. Therefore, a quick estimation of their concentration is particularly necessary. However, measuring the indoor microorganism concentration (e.g., bacteria) usually requires a large amount of time, economic cost, and manpower. In this paper, we aim to provide a quick solution: using knowledge-based machine learning to provide quick estimation of the concentration of indoor airborne culturable bacteria only with the inputs of several measurable indoor environmental indicators, including indoor particulate matter (PM2.5 and PM10), temperature, relative humidity, and CO2 concentration. Our results show that a general regression neural network (GRNN) model can sufficiently provide a quick and decent estimation based on the model training and testing using an experimental database with 249 data groups.
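
    A GRNN prediction is a Gaussian-weighted average of the training targets, which is equivalent to Nadaraya-Watson kernel regression. The sketch below illustrates that idea only. The two input features, the target values, and the smoothing width `sigma` are hypothetical and are not taken from the paper's 249-group database.

```python
import math

def grnn_predict(train_x, train_y, query, sigma=1.0):
    """Return the Gaussian-weighted average of train_y around `query`."""
    weights = []
    for x in train_x:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, query))
        weights.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        # Query is too far from every sample for this sigma: use the plain mean.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total

# Hypothetical toy data: (temperature, relative humidity) -> made-up target.
train_x = [(20.0, 40.0), (22.0, 50.0), (25.0, 60.0), (28.0, 70.0)]
train_y = [100.0, 150.0, 220.0, 300.0]

estimate = grnn_predict(train_x, train_y, (24.0, 57.0), sigma=5.0)
```

    In practice the inputs would be rescaled to comparable ranges first, since a single `sigma` is shared by all features in this sketch.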

  8. FACT. Streamed data analysis and online application of machine learning models

    Energy Technology Data Exchange (ETDEWEB)

    Bruegge, Kai Arno; Buss, Jens [Technische Universitaet Dortmund (Germany). Astroteilchenphysik; Collaboration: FACT-Collaboration

    2016-07-01

    Imaging Atmospheric Cherenkov Telescopes (IACTs) like FACT produce a continuous flow of data during measurements. Analyzing the data in near real time is essential for monitoring sources. One major task of a monitoring system is to detect changes in the gamma-ray flux of a source, and to alert other experiments if some predefined limit is reached. In order to calculate the flux of an observed source, it is necessary to run an entire data analysis process including calibration, image cleaning, parameterization, signal-background separation and flux estimation. Software built on top of a data streaming framework has been implemented for FACT and generalized to work with the data acquisition framework of the Cherenkov Telescope Array (CTA). We present how the streams-framework is used to apply supervised machine learning models to an online data stream from the telescope.

  9. Machine Learning of Musical Gestures

    OpenAIRE

    Caramiaux, Baptiste; Tanaka, Atau

    2013-01-01

    We present an overview of machine learning (ML) techniques and their application in interactive music and new digital instruments design. We first give to the non-specialist reader an introduction to two ML tasks, classification and regression, that are particularly relevant for gestural interaction. We then present a review of the literature in current NIME research that uses ML in musical gesture analysis and gestural sound control. We describe the ways in which machine learning is useful for cre...

  10. Identification of immune correlates of protection in Shigella infection by application of machine learning.

    Science.gov (United States)

    Arevalillo, Jorge M; Sztein, Marcelo B; Kotloff, Karen L; Levine, Myron M; Simon, Jakub K

    2017-10-01

    Immunologic correlates of protection are important in vaccine development because they give insight into mechanisms of protection, assist in the identification of promising vaccine candidates, and serve as endpoints in bridging clinical vaccine studies. Our goal is the development of a methodology to identify immunologic correlates of protection using the Shigella challenge as a model. The proposed methodology utilizes the Random Forests (RF) machine learning algorithm as well as Classification and Regression Trees (CART) to detect immune markers that predict protection, identify interactions between variables, and define optimal cutoffs. Logistic regression modeling is applied to estimate the probability of protection and the confidence interval (CI) for such a probability is computed by bootstrapping the logistic regression models. The results demonstrate that the combination of Classification and Regression Trees and Random Forests complements the standard logistic regression and uncovers subtle immune interactions. Specific levels of immunoglobulin IgG antibody in blood on the day of challenge predicted protection in 75% (95% CI 67-86). Of those subjects that did not have blood IgG at or above a defined threshold, 100% were protected if they had IgA antibody secreting cells above a defined threshold. Comparison with the results obtained by applying only logistic regression modeling with standard Akaike Information Criterion for model selection shows the usefulness of the proposed method. Given the complexity of the immune system, the use of machine learning methods may enhance traditional statistical approaches. When applied together, they offer a novel way to quantify important immune correlates of protection that may help the development of vaccines. Copyright © 2017 Elsevier Inc. All rights reserved.

  11. Application of Machine Learning to Predict Dietary Lapses During Weight Loss.

    Science.gov (United States)

    Goldstein, Stephanie P; Zhang, Fengqing; Thomas, John G; Butryn, Meghan L; Herbert, James D; Forman, Evan M

    2018-05-01

    Individuals who adhere to dietary guidelines provided during weight loss interventions tend to be more successful with weight control. Any deviation from dietary guidelines can be referred to as a "lapse." There is a growing body of research showing that lapses are predictable using a variety of physiological, environmental, and psychological indicators. With recent technological advancements, it may be possible to assess these triggers and predict dietary lapses in real time. The current study sought to use machine learning techniques to predict lapses and evaluate the utility of combining both group- and individual-level data to enhance lapse prediction. The current study trained and tested a machine learning algorithm capable of predicting dietary lapses from a behavioral weight loss program among adults with overweight/obesity (n = 12). Participants were asked to follow a weight control diet for 6 weeks and complete ecological momentary assessment (EMA; repeated brief surveys delivered via smartphone) regarding dietary lapses and relevant triggers. WEKA decision trees were used to predict lapses with an accuracy of 0.72 for the group of participants. However, generalization of the group algorithm to each individual was poor, and as such, group- and individual-level data were combined to improve prediction. The findings suggest that 4 weeks of individual data collection is recommended to attain optimal model performance. The predictive algorithm could be utilized to provide in-the-moment interventions to prevent dietary lapses and therefore enhance weight losses. Furthermore, methods in the current study could be translated to other types of health behavior lapses.

  12. Feature combination networks for the interpretation of statistical machine learning models: application to Ames mutagenicity.

    Science.gov (United States)

    Webb, Samuel J; Hanser, Thierry; Howlin, Brendan; Krause, Paul; Vessey, Jonathan D

    2014-03-25

    A new algorithm has been developed to enable the interpretation of black box models. The developed algorithm is agnostic to learning algorithm and open to all structural based descriptors such as fragments, keys and hashed fingerprints. The algorithm has provided meaningful interpretation of Ames mutagenicity predictions from both random forest and support vector machine models built on a variety of structural fingerprints. A fragmentation algorithm is utilised to investigate the model's behaviour on specific substructures present in the query. An output is formulated summarising causes of activation and deactivation. The algorithm is able to identify multiple causes of activation or deactivation in addition to identifying localised deactivations where the prediction for the query is active overall. No loss in performance is seen as there is no change in the prediction; the interpretation is produced directly on the model's behaviour for the specific query. Models have been built using multiple learning algorithms including support vector machine and random forest. The models were built on public Ames mutagenicity data and a variety of fingerprint descriptors were used. These models produced a good performance in both internal and external validation with accuracies around 82%. The models were used to evaluate the interpretation algorithm. Interpretation was revealed that links closely with understood mechanisms for Ames mutagenicity. This methodology allows for a greater utilisation of the predictions made by black box models and can expedite further study based on the output for a (quantitative) structure activity model. Additionally the algorithm could be utilised for chemical dataset investigation and knowledge extraction/human SAR development.

  13. An introduction to quantum machine learning

    Science.gov (United States)

    Schuld, Maria; Sinayskiy, Ilya; Petruccione, Francesco

    2015-04-01

    Machine learning algorithms learn a desired input-output relation from examples in order to interpret new inputs. This is important for tasks such as image and speech recognition or strategy optimisation, with growing applications in the IT industry. In the last couple of years, researchers investigated if quantum computing can help to improve classical machine learning algorithms. Ideas range from running computationally costly algorithms or their subroutines efficiently on a quantum computer to the translation of stochastic methods into the language of quantum theory. This contribution gives a systematic overview of the emerging field of quantum machine learning. It presents the approaches as well as technical details in an accessible way, and discusses the potential of a future theory of quantum learning.

  14. Machine learning methods for planning

    CERN Document Server

    Minton, Steven

    1993-01-01

    Machine Learning Methods for Planning provides information pertinent to learning methods for planning and scheduling. This book covers a wide variety of learning methods and learning architectures, including analogical, case-based, decision-tree, explanation-based, and reinforcement learning. Organized into 15 chapters, this book begins with an overview of planning and scheduling and describes some representative learning systems that have been developed for these tasks. This text then describes a learning apprentice for calendar management. Other chapters consider the problem of temporal credi

  15. Using Machine Learning to Advance Personality Assessment and Theory.

    Science.gov (United States)

    Bleidorn, Wiebke; Hopwood, Christopher James

    2018-05-01

    Machine learning has led to important advances in society. One of the most exciting applications of machine learning in psychological science has been the development of assessment tools that can powerfully predict human behavior and personality traits. Thus far, machine learning approaches to personality assessment have focused on the associations between social media and other digital records with established personality measures. The goal of this article is to expand the potential of machine learning approaches to personality assessment by embedding it in a more comprehensive construct validation framework. We review recent applications of machine learning to personality assessment, place machine learning research in the broader context of fundamental principles of construct validation, and provide recommendations for how to use machine learning to advance our understanding of personality.

  16. Machine learning paradigms in design optimization: Applications in turbine aerodynamic design

    Science.gov (United States)

    Goel, Sanjay

    Mechanisms of incorporating machine learning paradigms in design optimization have been investigated in the current research. The primary focus of the work is on machine learning algorithms which use computational models that are analogous to the hypothesized principles of natural or biological learning. Examples from structural and aerodynamic optimization have been used to demonstrate the potential of the proposed schemes. The first strategy examined in the current work seeks to improve the convergence of optimization problems by pruning the search space of weak variables. Such variables are identified by learning from a database of existing designs using neural networks. By using clustering techniques, different sets of weak variables are identified in different regions of the design space. Parameter sensitivity information obtained in the process of identifying weak variables provides accurate heuristics for formulating design rules. The impact of this methodology on obtaining converged designs has been investigated for a turbine design problem. Optimization results from a three-stage power turbine and an aircraft engine turbine are presented in this thesis. The second scheme is an evolutionary design optimization technique which gets progressively 'smarter' during the optimization process by learning from computed domain knowledge. This technique employs adaptive learning mechanisms (classifiers) which recognize the influence of the design variables on the problem solution and then generalize them to dynamically create or change design rules during optimization. This technique, when applied to a constrained optimization problem, shows progressive improvement in convergence of search, as successive generations of rules evolve by learning from the environment. To investigate this methodology, a truss optimization problem is solved with an objective of minimizing the truss weight subject to stress constraints in the truss members. A distinct convergent trend is

  17. Quantum machine learning for quantum anomaly detection

    Science.gov (United States)

    Liu, Nana; Rebentrost, Patrick

    2018-04-01

    Anomaly detection is used for identifying data that deviate from "normal" data patterns. Its usage on classical data finds diverse applications in many important areas such as finance, fraud detection, medical diagnoses, data cleaning, and surveillance. With the advent of quantum technologies, anomaly detection of quantum data, in the form of quantum states, may become an important component of quantum applications. Machine-learning algorithms are playing pivotal roles in anomaly detection using classical data. Two widely used algorithms are the kernel principal component analysis and the one-class support vector machine. We find corresponding quantum algorithms to detect anomalies in quantum states. We show that these two quantum algorithms can be performed using resources that are logarithmic in the dimensionality of quantum states. For pure quantum states, these resources can also be logarithmic in the number of quantum states used for training the machine-learning algorithm. This makes these algorithms potentially applicable to big quantum data applications.

  18. Comparison of four machine learning algorithms for their applicability in satellite-based optical rainfall retrievals

    Science.gov (United States)

    Meyer, Hanna; Kühnlein, Meike; Appelhans, Tim; Nauss, Thomas

    2016-03-01

    Machine learning (ML) algorithms have successfully been demonstrated to be valuable tools in satellite-based rainfall retrievals which show the practicability of using ML algorithms when faced with high dimensional and complex data. Moreover, recent developments in parallel computing with ML present new possibilities for training and prediction speed and therefore make their usage in real-time systems feasible. This study compares four ML algorithms - random forests (RF), neural networks (NNET), averaged neural networks (AVNNET) and support vector machines (SVM) - for rainfall area detection and rainfall rate assignment using MSG SEVIRI data over Germany. Satellite-based proxies for cloud top height, cloud top temperature, cloud phase and cloud water path serve as predictor variables. The results indicate an overestimation of rainfall area delineation regardless of the ML algorithm (averaged bias = 1.8) but a high probability of detection ranging from 81% (SVM) to 85% (NNET). On a 24-hour basis, the performance of the rainfall rate assignment yielded R2 values between 0.39 (SVM) and 0.44 (AVNNET). Though the differences in the algorithms' performance were rather small, NNET and AVNNET were identified as the most suitable algorithms. On average, they demonstrated the best performance in rainfall area delineation as well as in rainfall rate assignment. NNET's computational speed is an additional advantage in work with large datasets such as in remote sensing based rainfall retrievals. However, since no single algorithm performed considerably better than the others we conclude that further research in providing suitable predictors for rainfall is of greater necessity than an optimization through the choice of the ML algorithm.
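
    The two rainfall-area scores quoted above can be computed from a contingency table of observed versus predicted rain pixels. The sketch below assumes that "bias" means the frequency bias commonly used in forecast verification (predicted rain events divided by observed rain events), which the abstract does not define. The pixel labels are made up for illustration.

```python
def detection_scores(observed, predicted):
    """Return (probability of detection, frequency bias) for binary rain masks."""
    pairs = list(zip(observed, predicted))
    hits = sum(1 for o, p in pairs if o and p)
    misses = sum(1 for o, p in pairs if o and not p)
    false_alarms = sum(1 for o, p in pairs if not o and p)
    if hits + misses == 0:
        raise ValueError("no observed rain events; scores are undefined")
    pod = hits / (hits + misses)                    # share of observed rain detected
    bias = (hits + false_alarms) / (hits + misses)  # > 1 means rain area is overestimated
    return pod, bias

# Hypothetical pixel labels: 1 = raining, 0 = dry.
observed = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 1, 1, 0, 0, 0]

pod, bias = detection_scores(observed, predicted)
```

    A bias above 1 together with a high probability of detection is the pattern the study reports: most rain is found, but too many dry pixels are also flagged.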

  19. What is the machine learning.

    CERN Multimedia

    CERN. Geneva

    2018-01-01

    Applications of machine learning tools to problems of physical interest are often criticized for producing sensitivity at the expense of transparency. In this talk, I explore a procedure for identifying combinations of variables -- aided by physical intuition -- that can discriminate signal from background. Weights are introduced to smooth away the features in a given variable(s). New networks are then trained on this modified data. Observed decreases in sensitivity diagnose the variable's discriminating power. Planing also allows the investigation of the linear versus non-linear nature of the boundaries between signal and background. I will demonstrate these features in both an easy to understand toy model and an idealized LHC resonance scenario.
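
    The weighting step ("planing") can be pictured as giving each event a weight inversely proportional to how populated its histogram bin is, so that the weighted distribution of the chosen variable becomes flat. The sketch below assumes that simple inverse-bin-count form, which the abstract does not spell out. The variable values and binning are hypothetical, and values are assumed to lie within `[low, high]`.

```python
from collections import Counter

def planing_weights(values, low, high, n_bins):
    """Return one weight per value so that every occupied bin sums to 1."""
    width = (high - low) / n_bins

    def bin_index(v):
        # Clamp so that v == high falls in the last bin.
        return min(int((v - low) / width), n_bins - 1)

    counts = Counter(bin_index(v) for v in values)
    return [1.0 / counts[bin_index(v)] for v in values]

# Hypothetical values of one discriminating variable for a single class.
masses = [0.1, 0.2, 0.3, 1.5, 2.5, 2.6]
weights = planing_weights(masses, low=0.0, high=3.0, n_bins=3)
```

    A network retrained with these weights can no longer exploit the shape of that variable, so any drop in sensitivity indicates how much the variable was contributing.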

  20. Machine Learning Phases of Strongly Correlated Fermions

    Directory of Open Access Journals (Sweden)

    Kelvin Ch’ng

    2017-08-01

    Machine learning offers an unprecedented perspective for the problem of classifying phases in condensed matter physics. We employ neural-network machine learning techniques to distinguish finite-temperature phases of the strongly correlated fermions on cubic lattices. We show that a three-dimensional convolutional network trained on auxiliary field configurations produced by quantum Monte Carlo simulations of the Hubbard model can correctly predict the magnetic phase diagram of the model at the average density of one (half filling). We then use the network, trained at half filling, to explore the trend in the transition temperature as the system is doped away from half filling. This transfer learning approach predicts that the instability to the magnetic phase extends to at least 5% doping in this region. Our results pave the way for other machine learning applications in correlated quantum many-body systems.

  1. Machine learning from hard x-ray surveys: applications to magnetic cataclysmic variable studies

    Science.gov (United States)

    Scaringi, Simone

    2009-11-01

    Within this thesis are discussed two main topics of contemporary astrophysics. The first is that of machine learning algorithms for astronomy whilst the second is that of magnetic cataclysmic variables (mCVs). To begin, an overview is given of ISINA: INTEGRAL Source Identification Network Algorithm. This machine learning algorithm, using random forests, is applied to the IBIS/ISGRI data set in order to ease the production of unbiased future soft gamma-ray source catalogues. The feature extraction process on an initial candidate list is described together with feature merging. Three training and testing sets are created in order to deal with the diverse time-scales encountered when dealing with the gamma-ray sky: one dealing with faint persistent source recognition, one dealing with strong persistent sources and a final one dealing with transients. For the latter, a new transient detection technique is introduced and described: the transient matrix. Finally the performance of the network is assessed and discussed using the testing set and some illustrative source examples. ISINA is also compared to the more conventional approach of visual inspection. Next mCVs are discussed, and in particular the properties arising from a hard X-ray selected sample which has proven remarkably efficient in detecting intermediate polars and asynchronous polars, two of the rarest types of cataclysmic variables (CVs). This thesis focuses particularly on the link between hard X-ray properties and spin/orbital periods. To this end, a new sample of these objects is constructed by cross-correlating candidate sources detected in INTEGRAL/IBIS observations against catalogues of known CVs. Also included in the analysis are hard X-ray observations from Swift/BAT and SUZAKU/HXD in order to make the study more complete. It is found that most hard X-ray detected mCVs have Pspin/Porb<0.1 above the period gap. In this respect, attention is given to the very low number of detected systems in any ban

  2. Machine learning for evolution strategies

    CERN Document Server

    Kramer, Oliver

    2016-01-01

    This book introduces numerous algorithmic hybridizations between both worlds that show how machine learning can improve and support evolution strategies. The set of methods comprises covariance matrix estimation, meta-modeling of fitness and constraint functions, dimensionality reduction for search and visualization of high-dimensional optimization processes, and clustering-based niching. After giving an introduction to evolution strategies and machine learning, the book builds the bridge between both worlds with an algorithmic and experimental perspective. Experiments mostly employ a (1+1)-ES and are implemented in Python using the machine learning library scikit-learn. The examples are conducted on typical benchmark problems illustrating algorithmic concepts and their experimental behavior. The book closes with a discussion of related lines of research.
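
    For readers unfamiliar with the (1+1)-ES mentioned above, the sketch below shows one common form: a single parent produces one Gaussian-mutated child per generation, and the better of the two survives. The step-size update is a multiplicative variant of Rechenberg's 1/5th success rule, chosen here for brevity. The book's own step-size control, benchmark functions, and parameter values may differ, and everything named below is illustrative.

```python
import random

def sphere(x):
    """Simple benchmark: sum of squares, minimised at the origin."""
    return sum(v * v for v in x)

def one_plus_one_es(fitness, start, sigma=1.0, generations=200, seed=0):
    """Minimise `fitness` with a (1+1)-ES; returns (best point, best fitness)."""
    rng = random.Random(seed)
    parent = list(start)
    parent_fit = fitness(parent)
    for _ in range(generations):
        child = [v + sigma * rng.gauss(0.0, 1.0) for v in parent]
        child_fit = fitness(child)
        if child_fit <= parent_fit:
            parent, parent_fit = child, child_fit
            sigma *= 1.5            # success: take larger steps
        else:
            sigma *= 1.5 ** -0.25   # failure: shrink; balanced at a 1/5 success rate
    return parent, parent_fit

start = [5.0, -3.0, 2.0]
best, best_fit = one_plus_one_es(sphere, start)
```

    Because the parent is only ever replaced by a child that is at least as good, the returned fitness can never be worse than that of the starting point.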

  3. Machine Learning: developing an image recognition program : with Python, Scikit Learn and OpenCV

    OpenAIRE

    Nguyen, Minh

    2016-01-01

    Machine Learning is one of the most debated topics in the computing world these days, especially after a computer Go program beat the human Go world champion. Among the endless applications of Machine Learning is image recognition, where the problem is processing an enormous amount of data from dynamic input. This thesis will present the basic concept of Machine Learning, Machine Learning algorithms, Python programming language and Scikit Learn – a simple and efficient tool for data analysis in P...

  4. Game-powered machine learning.

    Science.gov (United States)

    Barrington, Luke; Turnbull, Douglas; Lanckriet, Gert

    2012-04-24

    Searching for relevant content in a massive amount of multimedia information is facilitated by accurately annotating each image, video, or song with a large number of relevant semantic keywords, or tags. We introduce game-powered machine learning, an integrated approach to annotating multimedia content that combines the effectiveness of human computation, through online games, with the scalability of machine learning. We investigate this framework for labeling music. First, a socially-oriented music annotation game called Herd It collects reliable music annotations based on the "wisdom of the crowds." Second, these annotated examples are used to train a supervised machine learning system. Third, the machine learning system actively directs the annotation games to collect new data that will most benefit future model iterations. Once trained, the system can automatically annotate a corpus of music much larger than what could be labeled using human computation alone. Automatically annotated songs can be retrieved based on their semantic relevance to text-based queries (e.g., "funky jazz with saxophone," "spooky electronica," etc.). Based on the results presented in this paper, we find that actively coupling annotation games with machine learning provides a reliable and scalable approach to making searchable massive amounts of multimedia data.

  5. Application of Machine Learning tools to recognition of molecular patterns in STM images

    Science.gov (United States)

    Maksov, Artem; Ziatdinov, Maxim; Fujii, Shintaro; Kiguchi, Manabu; Higashibayashi, Shuhei; Sakurai, Hidehiro; Kalinin, Sergei; Sumpter, Bobby

    The ability to utilize individual molecules and molecular assemblies as data storage elements has motivated scientists for years, concurrent with the continuous effort to shrink the size of data storage devices in the microelectronics industry. One of the critical issues in this effort lies in being able to identify individual molecular assembly units (patterns) on a large scale, in an automated fashion, with complete information extraction. Here we present a novel method of applying machine learning techniques for extraction of positional and rotational information from scanning tunneling microscopy (STM) images of π-bowl sumanene molecules on gold. We use a Markov Random Field (MRF) model to decode the polar rotational states for each molecule in a large-scale STM image of a molecular film. We further develop an algorithm that uses a convolutional neural network combined with MRF and input from density functional theory to classify molecules into different azimuthal rotational classes. Our results demonstrate that a molecular film is partitioned into distinctive azimuthal rotational domains consisting typically of 20-30 molecules. In each domain, the "bowl-down" molecules are generally surrounded by six nearest neighbor molecules in "bowl-up" configuration, and the resultant overall structure forms a periodic lattice of rotational and polar states within each domain. Research was supported by the US Department of Energy.

  6. A machine learning approach to automated structural network analysis: application to neonatal encephalopathy.

    Directory of Open Access Journals (Sweden)

    Etay Ziv

    Neonatal encephalopathy represents a heterogeneous group of conditions associated with life-long developmental disabilities and neurological deficits. Clinical measures and current anatomic brain imaging remain inadequate predictors of outcome in children with neonatal encephalopathy. Some studies have suggested that brain development and, therefore, brain connectivity may be altered in the subgroup of patients who subsequently go on to develop clinically significant neurological abnormalities. Large-scale structural brain connectivity networks constructed using diffusion tractography have been posited to reflect organizational differences in white matter architecture at the mesoscale, and thus offer a unique tool for characterizing brain development in patients with neonatal encephalopathy. In this manuscript we use diffusion tractography to construct structural networks for a cohort of patients with neonatal encephalopathy. We systematically map these networks to a high-dimensional space and then apply standard machine learning algorithms to predict neurological outcome in the cohort. Using nested cross-validation we demonstrate high prediction accuracy that is both statistically significant and robust over a broad range of thresholds. Our algorithm offers a novel tool to evaluate neonates at risk for developing neurological deficit. The described approach can be applied to any brain pathology that affects structural connectivity.

  7. Applications of machine-learning algorithms for infrared colour selection of Galactic Wolf-Rayet stars

    Science.gov (United States)

    Morello, Giuseppe; Morris, P. W.; Van Dyk, S. D.; Marston, A. P.; Mauerhan, J. C.

    2018-01-01

    We have investigated and applied machine-learning algorithms for infrared colour selection of Galactic Wolf-Rayet (WR) candidates. Objects taken from the Spitzer Galactic Legacy Infrared Midplane Survey Extraordinaire (GLIMPSE) catalogue of the infrared objects in the Galactic plane can be classified into different stellar populations based on the colours inferred from their broad-band photometric magnitudes [J, H and Ks from 2 Micron All Sky Survey (2MASS), and the four Spitzer/IRAC bands]. The algorithms tested in this pilot study are variants of the k-nearest neighbours approach, which is ideal for exploratory studies of classification problems where interrelations between variables and classes are complicated. The aims of this study are (1) to provide an automated tool to select reliable WR candidates and potentially other classes of objects, (2) to measure the efficiency of infrared colour selection at performing these tasks and (3) to lay the groundwork for statistically inferring the total number of WR stars in our Galaxy. We report the performance results obtained over a set of known objects and selected candidates for which we have carried out follow-up spectroscopic observations, and confirm the discovery of four new WR stars.
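
    The k-nearest neighbours approach assigns each object the majority class among its k closest labelled neighbours in colour space. The sketch below shows the plain, unweighted form with Euclidean distance. The study tests variants of this idea that are not detailed in the abstract, and the two colour indices and class labels used here are hypothetical.

```python
import math
from collections import Counter

def knn_classify(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points nearest to `query`."""
    order = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical (colour index 1, colour index 2) pairs with made-up labels.
train_points = [(0.2, 0.1), (0.3, 0.2), (0.25, 0.15),
                (1.0, 0.9), (1.1, 1.0), (0.9, 1.1)]
train_labels = ["other", "other", "other", "WR", "WR", "WR"]

label_a = knn_classify(train_points, train_labels, (1.0, 1.0), k=3)
label_b = knn_classify(train_points, train_labels, (0.2, 0.2), k=3)
```

    An odd `k` avoids tied votes when there are only two classes.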

  8. Integrating Symbolic and Statistical Methods for Testing Intelligent Systems Applications to Machine Learning and Computer Vision

    Energy Technology Data Exchange (ETDEWEB)

    Jha, Sumit Kumar [University of Central Florida, Orlando; Pullum, Laura L [ORNL; Ramanathan, Arvind [ORNL

    2016-01-01

    Embedded intelligent systems ranging from tiny implantable biomedical devices to large swarms of autonomous unmanned aerial systems are becoming pervasive in our daily lives. While we depend on the flawless functioning of such intelligent systems, and often take their behavioral correctness and safety for granted, it is notoriously difficult to generate test cases that expose subtle errors in the implementations of machine learning algorithms. Hence, the validation of intelligent systems is usually achieved by studying their behavior on representative data sets, using methods such as cross-validation and bootstrapping. In this paper, we present a new testing methodology for studying the correctness of intelligent systems. Our approach uses symbolic decision procedures coupled with statistical hypothesis testing to. We also use our algorithm to analyze the robustness of a human detection algorithm built using the OpenCV open-source computer vision library. We show that the human detection implementation can fail to detect humans in perturbed video frames even when the perturbations are so small that the corresponding frames look identical to the naked eye.

  9. Big Data Toolsets to Pharmacometrics: Application of Machine Learning for Time-to-Event Analysis.

    Science.gov (United States)

    Gong, Xiajing; Hu, Meng; Zhao, Liang

    2018-05-01

    Additional value can be potentially created by applying big data tools to address pharmacometric problems. The performances of machine learning (ML) methods and the Cox regression model were evaluated based on simulated time-to-event data synthesized under various preset scenarios, i.e., with linear vs. nonlinear and dependent vs. independent predictors in the proportional hazard function, or with high-dimensional data featured by a large number of predictor variables. Our results showed that ML-based methods outperformed the Cox model in prediction performance as assessed by concordance index and in identifying the preset influential variables for high-dimensional data. The prediction performances of ML-based methods are also less sensitive to data size and censoring rates than the Cox regression model. In conclusion, ML-based methods provide a powerful tool for time-to-event analysis, with a built-in capacity for high-dimensional data and better performance when the predictor variables assume nonlinear relationships in the hazard function. © 2018 The Authors. Clinical and Translational Science published by Wiley Periodicals, Inc. on behalf of American Society for Clinical Pharmacology and Therapeutics.
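
    The concordance index used above to compare the models can be sketched as follows. This is a simplified version of Harrell's C under stated assumptions: pairs with tied event times are ignored, tied risk scores count as half, and the function name and toy data are illustrative rather than taken from the paper.

```python
def concordance_index(times, events, risks):
    """Fraction of comparable pairs that the risk scores order correctly.

    A pair (i, j) is comparable when subject i had an observed event
    (events[i] is truthy) strictly before time j. The pair is concordant
    when the earlier-failing subject has the higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy data: the third subject is censored (event flag 0).
times = [2.0, 4.0, 6.0, 8.0]
events = [1, 1, 0, 1]
risks = [0.9, 0.7, 0.4, 0.1]
c_index = concordance_index(times, events, risks)
```

    A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering; censored subjects only ever appear as the later member of a pair.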

  10. Applicability of internet search index for asthma admission forecast using machine learning.

    Science.gov (United States)

    Luo, Li; Liao, Chengcheng; Zhang, Fengyi; Zhang, Wei; Li, Chunyang; Qiu, Zhixin; Huang, Debin

    2018-04-15

    This study aimed to determine whether a search index could provide insight into trends in asthma admission in China. An Internet search index is a powerful tool to monitor and predict epidemic outbreaks. However, whether using an internet search index can significantly improve asthma admissions forecasts remains unknown. The long-term goal is to develop a surveillance system to help early detection and interventions for asthma and to avoid asthma health care resource shortages in advance. In this study, we used a search index combined with air pollution data, weather data, and historical admissions data to forecast asthma admissions using machine learning. Results demonstrated that the best area under the curve in the test set that can be achieved is 0.832, using all predictors mentioned earlier. A search index is a powerful predictor in asthma admissions forecast, and a recent search index can reflect current asthma admissions with a lag-effect to a certain extent. The addition of a real-time, easily accessible search index improves forecasting capabilities and demonstrates the predictive potential of search index. Copyright © 2018 John Wiley & Sons, Ltd.
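
    For readers unfamiliar with the area-under-the-curve figure quoted above, a minimal sketch of one standard way to compute it is given here. It uses the rank (Mann-Whitney) interpretation: the AUC is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The labels and scores are made up, and `roc_auc` is an illustrative name, not a function from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Made-up labels (1 = high-admission period) and model scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
auc = roc_auc(labels, scores)
```

    Here eight of the nine positive-negative pairs are ordered correctly, so the sketch yields 8/9.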

  11. Quantitative forecasting of PTSD from early trauma responses: a Machine Learning application.

    Science.gov (United States)

    Galatzer-Levy, Isaac R; Karstoft, Karen-Inge; Statnikov, Alexander; Shalev, Arieh Y

    2014-12-01

    There is broad interest in predicting the clinical course of mental disorders from early, multimodal clinical and biological information. Current computational models, however, constitute a significant barrier to realizing this goal. The early identification of trauma survivors at risk of post-traumatic stress disorder (PTSD) is plausible given the disorder's salient onset and the abundance of putative biological and clinical risk indicators. This work evaluates the ability of Machine Learning (ML) forecasting approaches to identify and integrate a panel of unique predictive characteristics and determine their accuracy in forecasting non-remitting PTSD from information collected within 10 days of a traumatic event. Data on event characteristics, emergency department observations, and early symptoms were collected in 957 trauma survivors, followed for fifteen months. An ML feature selection algorithm identified a set of predictors that rendered all others redundant. Support Vector Machines (SVMs) as well as other ML classification algorithms were used to evaluate the forecasting accuracy of i) ML selected features, ii) all available features without selection, and iii) Acute Stress Disorder (ASD) symptoms alone. SVM also compared the prediction of a) PTSD diagnostic status at 15 months to b) posterior probability of membership in an empirically derived non-remitting PTSD symptom trajectory. Results are expressed as mean Area Under Receiver Operating Characteristics Curve (AUC). The feature selection algorithm identified 16 predictors, present in ≥ 95% cross-validation trials. The accuracy of predicting non-remitting PTSD from that set (AUC = .77) did not differ from predicting from all available information (AUC = .78). Predicting from ASD symptoms was not better than chance (AUC = .60). The prediction of PTSD status was less accurate than that of membership in a non-remitting trajectory (AUC = .71). ML methods may fill a critical gap in forecasting PTSD. The

  12. Application of machine learning methods in big data analytics at management of contracts in the construction industry

    Directory of Open Access Journals (Sweden)

    Valpeters Marina

    2018-01-01

    The number of experts who recognize the importance of big data continues to increase across various fields of the economy, and experts increasingly use big data to address their specific objectives. One likely big data task in the construction industry is determining, at the stage when a contract is established, the probability that it will be executed. The contract holder cannot guarantee execution of the contract, which creates considerable risk for the customer. This article is devoted to the applicability of machine learning methods to the task of determining the probability of successful contract execution. The authors try to reveal the factors influencing the possibility of contract default and then to define corrective actions for the customer. In the problem analysis, the authors used linear and non-linear algorithms, feature extraction, feature transformation and feature selection. The results of the investigation include prognostic models with predictive power based on machine learning algorithms such as logistic regression, decision tree and random forest. The authors validated the models on available historical data. The developed models have the potential for practical use in construction organizations when making new contracts.

  13. Predicting ground contact events for a continuum of gait types: An application of targeted machine learning using principal component analysis.

    Science.gov (United States)

    Osis, Sean T; Hettinga, Blayne A; Ferber, Reed

    2016-05-01

    An ongoing challenge in the application of gait analysis to clinical settings is the standardized detection of temporal events, with unobtrusive and cost-effective equipment, for a wide range of gait types. The purpose of the current study was to investigate a targeted machine learning approach for the prediction of timing for foot strike (or initial contact) and toe-off, using only kinematics for walking, forefoot running, and heel-toe running. Data were categorized by gait type and split into a training set (∼30%) and a validation set (∼70%). A principal component analysis was performed, and separate linear models were trained and validated for foot strike and toe-off, using ground reaction force data as a gold-standard for event timing. Results indicate the model predicted both foot strike and toe-off timing to within 20 ms of the gold-standard for more than 95% of cases in walking and running gaits. The machine learning approach continues to provide robust timing predictions for clinical use, and may offer a flexible methodology to handle new events and gait types. Copyright © 2016 Elsevier B.V. All rights reserved.
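
    The headline result above (predictions within 20 ms of the gold standard in more than 95% of cases) is a tolerance-based agreement measure. A minimal sketch of how such a figure could be computed follows; the function name and the event times are hypothetical and do not come from the study.

```python
def fraction_within(predicted_ms, reference_ms, tolerance_ms=20.0):
    """Share of predicted event times within `tolerance_ms` of the reference."""
    if len(predicted_ms) != len(reference_ms) or not reference_ms:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        abs(p - r) <= tolerance_ms
        for p, r in zip(predicted_ms, reference_ms)
    )
    return hits / len(reference_ms)

# Made-up foot-strike times in milliseconds: model output vs. force-plate reference.
predicted = [100.0, 515.0, 930.0, 1400.0]
reference = [110.0, 500.0, 955.0, 1405.0]
agreement = fraction_within(predicted, reference)
```

    In this toy case three of the four predictions fall inside the tolerance window.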

  14. Application of machine learning techniques for solving real world business problems : the case study - target marketing of insurance policies

    OpenAIRE

    Juozenaite, Ineta

    2018-01-01

    The concept of machine learning has been around for decades, but it is now becoming more and more popular, not only in business but everywhere else as well. This is due to the increased amount of data, cheaper data storage, and more powerful and affordable computational processing. The complexity of the business environment leads companies to use data-driven decision making to work more efficiently. The most common machine learning methods, like Logistic Regression, Decision Tree, Artificial Neural...

  15. Deep learning: Using machine learning to study biological vision

    OpenAIRE

    Majaj, Najib; Pelli, Denis

    2017-01-01

    Today most vision-science presentations mention machine learning. Many neuroscientists use machine learning to decode neural responses. Many perception scientists try to understand recognition by living organisms. To them, machine learning offers a reference of attainable performance based on learned stimuli. This brief overview of the use of machine learning in biological vision touches on its strengths, weaknesses, milestones, controversies, and current directions.

  16. Higgs Machine Learning Challenge 2014

    CERN Document Server

    Olivier, A-P; Bourdarios, C ; LAL / Orsay; Goldfarb, S ; University of Michigan

    2014-01-01

    High Energy Physics (HEP) has been using Machine Learning (ML) techniques such as boosted decision trees (paper) and neural nets since the 90s. These techniques are now routinely used for difficult tasks such as the Higgs boson search. Nevertheless, formal connections between the two research fields are rather scarce, with some exceptions such as the AppStat group at LAL, founded in 2006. In collaboration with INRIA, AppStat promotes interdisciplinary research on machine learning, computational statistics, and high-energy particle and astroparticle physics. We are now exploring new ways to improve the cross-fertilization of the two fields by setting up a data challenge, following the footsteps of, among others, the astrophysics community (dark matter and galaxy zoo challenges) and neurobiology (connectomics and decoding the human brain). The organization committee consists of ATLAS physicists and machine learning researchers. The Challenge will run from Monday 12th to September 2014.

  17. Neuromorphic Deep Learning Machines

    OpenAIRE

    Neftci, E; Augustine, C; Paul, S; Detorakis, G

    2017-01-01

    An ongoing challenge in neuromorphic computing is to devise general and computationally efficient models of inference and learning which are compatible with the spatial and temporal constraints of the brain. One increasingly popular and successful approach is to take inspiration from inference and learning algorithms used in deep neural networks. However, the workhorse of deep learning, the gradient descent Back Propagation (BP) rule, often relies on the immediate availability of network-wide...

  18. An Evolutionary Machine Learning Framework for Big Data Sequence Mining

    Science.gov (United States)

    Kamath, Uday Krishna

    2014-01-01

    Sequence classification is an important problem in many real-world applications. Unlike other machine learning data, there are no "explicit" features or signals in sequence data that can help traditional machine learning algorithms learn and predict from the data. Sequence data exhibits inter-relationships in the elements that are…

  19. Machine Learning for Neuroimaging with Scikit-Learn

    Directory of Open Access Journals (Sweden)

    Alexandre Abraham

    2014-02-01

    Statistical machine learning methods are increasingly used for neuroimaging data analysis. Their main virtue is their ability to model high-dimensional datasets, e.g. multivariate analysis of activation images or resting-state time series. Supervised learning is typically used in decoding or encoding settings to relate brain images to behavioral or clinical observations, while unsupervised learning can uncover hidden structures in sets of images (e.g. resting state functional MRI) or find sub-populations in large cohorts. By considering different functional neuroimaging applications, we illustrate how scikit-learn, a Python machine learning library, can be used to perform some key analysis steps. Scikit-learn contains a very large set of statistical learning algorithms, both supervised and unsupervised, and its application to neuroimaging data provides a versatile tool to study the brain.

  20. Machine learning for neuroimaging with scikit-learn.

    Science.gov (United States)

    Abraham, Alexandre; Pedregosa, Fabian; Eickenberg, Michael; Gervais, Philippe; Mueller, Andreas; Kossaifi, Jean; Gramfort, Alexandre; Thirion, Bertrand; Varoquaux, Gaël

    2014-01-01

    Statistical machine learning methods are increasingly used for neuroimaging data analysis. Their main virtue is their ability to model high-dimensional datasets, e.g., multivariate analysis of activation images or resting-state time series. Supervised learning is typically used in decoding or encoding settings to relate brain images to behavioral or clinical observations, while unsupervised learning can uncover hidden structures in sets of images (e.g., resting state functional MRI) or find sub-populations in large cohorts. By considering different functional neuroimaging applications, we illustrate how scikit-learn, a Python machine learning library, can be used to perform some key analysis steps. Scikit-learn contains a very large set of statistical learning algorithms, both supervised and unsupervised, and its application to neuroimaging data provides a versatile tool to study the brain.

  1. Application of Classification Algorithm of Machine Learning and Buffer Analysis in Tourism Regional Planning

    Science.gov (United States)

    Zhang, T. H.; Ji, H. W.; Hu, Y.; Ye, Q.; Lin, Y.

    2018-04-01

    Remote Sensing (RS) and Geographic Information System (GIS) technologies are widely used in ecological analysis and regional planning. With the advantages of large-scale monitoring, combination of point and area, multiple time-phases and repeated observation, they are suitable for monitoring and analysis of environmental information over a large range. In this study, a support vector machine (SVM) classification algorithm is used to monitor land use and land cover change (LUCC), and then to perform a quantitative ecological evaluation of the Chaohu Lake tourism area. The automatic classification and the quantitative spatial-temporal analysis for the Chaohu Lake basin are realized by the analysis of multi-temporal and multispectral satellite images, DEM data and slope information data. Furthermore, ecological buffer zone analysis is used to set the buffer width for each catchment area surrounding Chaohu Lake. The results of LUCC monitoring from 1992 to 2015 show obvious effects of human activities. Since the Chaohu Lake basin is at a crucial stage of rapid urbanization, the application of RS and GIS techniques can effectively provide a scientific basis for land use planning, ecological management, environmental protection and tourism resources development in the Chaohu Lake basin.

  2. Newton Methods for Large Scale Problems in Machine Learning

    Science.gov (United States)

    Hansen, Samantha Leigh

    2014-01-01

    The focus of this thesis is on practical ways of designing optimization algorithms for minimizing large-scale nonlinear functions with applications in machine learning. Chapter 1 introduces the overarching ideas in the thesis. Chapters 2 and 3 are geared towards supervised machine learning applications that involve minimizing a sum of loss…

  3. A Novel Extreme Learning Machine Classification Model for e-Nose Application Based on the Multiple Kernel Approach.

    Science.gov (United States)

    Jian, Yulin; Huang, Daoyu; Yan, Jia; Lu, Kun; Huang, Ying; Wen, Tailai; Zeng, Tanyue; Zhong, Shijie; Xie, Qilong

    2017-06-19

    A novel classification model, named the quantum-behaved particle swarm optimization (QPSO)-based weighted multiple kernel extreme learning machine (QWMK-ELM), is proposed in this paper. Experimental validation is carried out with two different electronic nose (e-nose) datasets. Being different from the existing multiple kernel extreme learning machine (MK-ELM) algorithms, the combination coefficients of base kernels are regarded as external parameters of single-hidden layer feedforward neural networks (SLFNs). The combination coefficients of base kernels, the model parameters of each base kernel, and the regularization parameter are optimized by QPSO simultaneously before implementing the kernel extreme learning machine (KELM) with the composite kernel function. Four types of common single kernel functions (Gaussian kernel, polynomial kernel, sigmoid kernel, and wavelet kernel) are utilized to constitute different composite kernel functions. Moreover, the method is also compared with other existing classification methods: extreme learning machine (ELM), kernel extreme learning machine (KELM), k-nearest neighbors (KNN), support vector machine (SVM), multi-layer perceptron (MLP), radial basis function neural network (RBFNN), and probabilistic neural network (PNN). The results have demonstrated that the proposed QWMK-ELM outperforms the aforementioned methods, not only in precision, but also in efficiency for gas classification.
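
    The notion of a composite kernel built from weighted base kernels can be sketched in a few lines. The example below combines only two of the four base kernels named above, with fixed combination coefficients; in the paper these coefficients and the kernel parameters are optimized by QPSO, which is not reproduced here. All function names, parameter values and weights are assumptions made for illustration.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel: exp(-||x - y||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def polynomial_kernel(x, y, degree=2, c=1.0):
    """Polynomial kernel: (x . y + c)^degree."""
    return (sum(a * b for a, b in zip(x, y)) + c) ** degree

def composite_kernel(x, y, weights=(0.6, 0.4)):
    """Weighted sum of base kernels; the weights here are fixed by hand."""
    w_gauss, w_poly = weights
    return w_gauss * gaussian_kernel(x, y) + w_poly * polynomial_kernel(x, y)

# Kernel value of a made-up sensor-response vector with itself.
k_same = composite_kernel((1.0, 0.0), (1.0, 0.0))
```

    A non-negative weighted sum of valid kernels is itself a valid kernel, which is why the coefficients can be treated as tunable parameters.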

  4. Machine learning: novel bioinformatics approaches for combating antimicrobial resistance.

    Science.gov (United States)

    Macesic, Nenad; Polubriaginof, Fernanda; Tatonetti, Nicholas P

    2017-12-01

    Antimicrobial resistance (AMR) is a threat to global health and new approaches to combating AMR are needed. Use of machine learning in addressing AMR is in its infancy but has made promising steps. We reviewed the current literature on the use of machine learning for studying bacterial AMR. The advent of large-scale data sets provided by next-generation sequencing and electronic health records makes applying machine learning to the study and treatment of AMR possible. To date, it has been used for antimicrobial susceptibility genotype/phenotype prediction, development of AMR clinical decision rules, novel antimicrobial agent discovery and antimicrobial therapy optimization. Application of machine learning to studying AMR is feasible but remains limited. Implementation of machine learning in clinical settings faces barriers to uptake with concerns regarding model interpretability and data quality. Future applications of machine learning to AMR are likely to be laboratory-based, such as antimicrobial susceptibility phenotype prediction.

  5. Statistical and machine learning approaches for network analysis

    CERN Document Server

    Dehmer, Matthias

    2012-01-01

    Explore the multidisciplinary nature of complex networks through machine learning techniques Statistical and Machine Learning Approaches for Network Analysis provides an accessible framework for structurally analyzing graphs by bringing together known and novel approaches on graph classes and graph measures for classification. By providing different approaches based on experimental data, the book uniquely sets itself apart from the current literature by exploring the application of machine learning techniques to various types of complex networks. Comprised of chapters written by internation

  6. Attention: A Machine Learning Perspective

    DEFF Research Database (Denmark)

    Hansen, Lars Kai

    2012-01-01

    We review a statistical machine learning model of top-down task driven attention based on the notion of ‘gist’. In this framework we consider the task to be represented as a classification problem with two sets of features — a gist of coarse grained global features and a larger set of low...

  7. Visible Machine Learning for Biomedicine.

    Science.gov (United States)

    Yu, Michael K; Ma, Jianzhu; Fisher, Jasmin; Kreisberg, Jason F; Raphael, Benjamin J; Ideker, Trey

    2018-06-14

    A major ambition of artificial intelligence lies in translating patient data to successful therapies. Machine learning models face particular challenges in biomedicine, however, including handling of extreme data heterogeneity and lack of mechanistic insight into predictions. Here, we argue for "visible" approaches that guide model structure with experimental biology. Copyright © 2018. Published by Elsevier Inc.

  8. Multipolar electrostatics based on the Kriging machine learning method: an application to serine.

    Science.gov (United States)

    Yuan, Yongna; Mills, Matthew J L; Popelier, Paul L A

    2014-04-01

    A multipolar, polarizable electrostatic method for future use in a novel force field is described. Quantum Chemical Topology (QCT) is used to partition the electron density of a chemical system into atoms, then the machine learning method Kriging is used to build models that relate the multipole moments of the atoms to the positions of their surrounding nuclei. The pilot system serine is used to study both the influence of the level of theory and the set of data generator methods used. The latter consists of: (i) sampling of protein structures deposited in the Protein Data Bank (PDB), or (ii) normal mode distortion along either (a) Cartesian coordinates, or (b) redundant internal coordinates. Wavefunctions for the sampled geometries were obtained at the HF/6-31G(d,p), B3LYP/apc-1, and MP2/cc-pVDZ levels of theory, prior to calculation of the atomic multipole moments by volume integration. The average absolute error (over an independent test set of conformations) in the total atom-atom electrostatic interaction energy of serine, using Kriging models built with the three data generator methods is 11.3 kJ mol⁻¹ (PDB), 8.2 kJ mol⁻¹ (Cartesian distortion), and 10.1 kJ mol⁻¹ (redundant internal distortion) at the HF/6-31G(d,p) level. At the B3LYP/apc-1 level, the respective errors are 7.7 kJ mol⁻¹, 6.7 kJ mol⁻¹, and 4.9 kJ mol⁻¹, while at the MP2/cc-pVDZ level they are 6.5 kJ mol⁻¹, 5.3 kJ mol⁻¹, and 4.0 kJ mol⁻¹. The ranges of geometries generated by the redundant internal coordinate distortion and by extraction from the PDB are much wider than the range generated by Cartesian distortion. The atomic multipole moment and electrostatic interaction energy predictions for the B3LYP/apc-1 and MP2/cc-pVDZ levels are similar, and both are better than the corresponding predictions at the HF/6-31G(d,p) level.

  9. Machine learning enhanced optical distance sensor

    Science.gov (United States)

    Amin, M. Junaid; Riza, N. A.

    2018-01-01

    Presented for the first time is a machine learning enhanced optical distance sensor. The distance sensor is based on our previously demonstrated distance measurement technique that uses an Electronically Controlled Variable Focus Lens (ECVFL) with a laser source to illuminate a target plane with a controlled optical beam spot. This spot with varying spot sizes is viewed by an off-axis camera and the spot size data is processed to compute the distance. In particular, proposed and demonstrated in this paper is the use of a regularized polynomial regression based supervised machine learning algorithm to enhance the accuracy of the operational sensor. The algorithm uses the acquired features and corresponding labels that are the actual target distance values to train a machine learning model. The optimized training model is trained over a 1000 mm (or 1 m) experimental target distance range. Using the machine learning algorithm produces a training set and testing set distance measurement errors of learning. Applications for the proposed sensor include industrial scenario distance sensing where target material specific training models can be generated to realize low <1% measurement error distance measurements.
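
    The regularized polynomial regression mentioned above can be sketched with the standard library alone. This is a ridge-penalised least-squares fit solved through the normal equations, not the authors' implementation: the feature (a single spot-size value), the degree, the penalty strength and the synthetic data are all assumptions.

```python
def solve_linear(a, b):
    """Solve a . x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_ridge_poly(xs, ys, degree=2, lam=1e-6):
    """Fit polynomial weights minimising squared error + lam * ||w[1:]||^2."""
    feats = [[x ** p for p in range(degree + 1)] for x in xs]
    k = degree + 1
    gram = [[sum(f[i] * f[j] for f in feats) for j in range(k)] for i in range(k)]
    for i in range(1, k):  # the intercept term is left unpenalised
        gram[i][i] += lam
    rhs = [sum(f[i] * y for f, y in zip(feats, ys)) for i in range(k)]
    return solve_linear(gram, rhs)

def predict(weights, x):
    """Evaluate the fitted polynomial at x."""
    return sum(w * x ** p for p, w in enumerate(weights))

# Synthetic data: spot size (arbitrary units) -> distance in mm, 100*x^2 + 50.
spot_sizes = [1.0, 1.5, 2.0, 2.5, 3.0]
distances = [150.0, 275.0, 450.0, 675.0, 950.0]
weights = fit_ridge_poly(spot_sizes, distances, degree=2, lam=1e-6)
```

    Increasing `lam` shrinks the higher-order coefficients, trading a little training accuracy for stability on unseen targets.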

  10. Machine learning: an artificial intelligence approach

    CERN Document Server

    Banerjee, R; Bradshaw, Gary; Carbonell, Jaime Guillermo; Mitchell, Tom Michael; Michalski, Ryszard Spencer

    1983-01-01

    Machine Learning: An Artificial Intelligence Approach contains tutorial overviews and research papers representative of trends in the area of machine learning as viewed from an artificial intelligence perspective. The book is organized into six parts. Part I provides an overview of machine learning and explains why machines should learn. Part II covers important issues affecting the design of learning programs-particularly programs that learn from examples. It also describes inductive learning systems. Part III deals with learning by analogy, by experimentation, and from experience. Parts IV a

  11. Book review: A first course in Machine Learning

    DEFF Research Database (Denmark)

    Ortiz-Arroyo, Daniel

    2016-01-01

    "The new edition of A First Course in Machine Learning by Rogers and Girolami is an excellent introduction to the use of statistical methods in machine learning. The book introduces concepts such as mathematical modeling, inference, and prediction, providing ‘just in time’ the essential background...... to change models and parameter values to make [it] easier to understand and apply these models in real applications. The authors [also] introduce more advanced, state-of-the-art machine learning methods, such as Gaussian process models and advanced mixture models, which are used across machine learning....... This makes the book interesting not only to students with little or no background in machine learning but also to more advanced graduate students interested in statistical approaches to machine learning." —Daniel Ortiz-Arroyo, Associate Professor, Aalborg University Esbjerg, Denmark...

  12. Learning Extended Finite State Machines

    Science.gov (United States)

    Cassel, Sofia; Howar, Falk; Jonsson, Bengt; Steffen, Bernhard

    2014-01-01

    We present an active learning algorithm for inferring extended finite state machines (EFSM)s, combining data flow and control behavior. Key to our learning technique is a novel learning model based on so-called tree queries. The learning algorithm uses the tree queries to infer symbolic data constraints on parameters, e.g., sequence numbers, time stamps, identifiers, or even simple arithmetic. We describe sufficient conditions for the properties that the symbolic constraints provided by a tree query in general must have to be usable in our learning model. We have evaluated our algorithm in a black-box scenario, where tree queries are realized through (black-box) testing. Our case studies include connection establishment in TCP and a priority queue from the Java Class Library.

  13. LINGUISTIC ANALYSIS FOR THE BELARUSIAN CORPUS WITH THE APPLICATION OF NATURAL LANGUAGE PROCESSING AND MACHINE LEARNING TECHNIQUES

    Directory of Open Access Journals (Sweden)

    Yu. S. Hetsevich

    2017-01-01

    The article focuses on problems in text-to-speech synthesis. Different morphological, lexical and syntactical elements were localized with the help of the Belarusian module of the NooJ program. The types of errors that occur in Belarusian texts were analyzed and corrected. A language model and a part-of-speech tagging model were built. Natural language processing of the Belarusian corpus was carried out with the developed machine learning algorithm. The precision of the developed machine learning models was 80–90%. The dictionary was enriched with new words for further use in Belarusian speech synthesis systems.

  14. Learning Machine Learning: A Case Study

    Science.gov (United States)

    Lavesson, N.

    2010-01-01

    This correspondence reports on a case study conducted in the Master's-level Machine Learning (ML) course at Blekinge Institute of Technology, Sweden. The students participated in a self-assessment test and a diagnostic test of prerequisite subjects, and their results on these tests are correlated with their achievement of the course's learning…

  15. Machine learning in geosciences and remote sensing

    Directory of Open Access Journals (Sweden)

    David J. Lary

    2016-01-01

    Learning incorporates a broad range of complex procedures. Machine learning (ML) is a subdivision of artificial intelligence based on the biological learning process. The ML approach deals with the design of algorithms to learn from machine readable data. ML covers main domains such as data mining, difficult-to-program applications, and software applications. It is a collection of a variety of algorithms (e.g. neural networks, support vector machines, self-organizing map, decision trees, random forests, case-based reasoning, genetic programming, etc.) that can provide multivariate, nonlinear, nonparametric regression or classification. The modeling capabilities of the ML-based methods have resulted in their extensive applications in science and engineering. Herein, the role of ML as an effective approach for solving problems in geosciences and remote sensing will be highlighted. The unique features of some of the ML techniques will be outlined with specific attention to the genetic programming paradigm. Furthermore, nonparametric regression and classification illustrative examples are presented to demonstrate the efficiency of ML for tackling geosciences and remote sensing problems.

  16. From machine learning to deep learning: progress in machine intelligence for rational drug discovery.

    Science.gov (United States)

    Zhang, Lu; Tan, Jianjun; Han, Dan; Zhu, Hao

    2017-11-01

    Machine intelligence, which is normally presented as artificial intelligence, refers to the intelligence exhibited by computers. In the history of rational drug discovery, various machine intelligence approaches have been applied to guide traditional experiments, which are expensive and time-consuming. Over the past several decades, machine-learning tools, such as quantitative structure-activity relationship (QSAR) modeling, were developed that can identify potential biological active molecules from millions of candidate compounds quickly and cheaply. However, when drug discovery moved into the era of 'big' data, machine learning approaches evolved into deep learning approaches, which are a more powerful and efficient way to deal with the massive amounts of data generated from modern drug discovery approaches. Here, we summarize the history of machine learning and provide insight into recently developed deep learning approaches and their applications in rational drug discovery. We suggest that this evolution of machine intelligence now provides a guide for early-stage drug design and discovery in the current big data era. Copyright © 2017 Elsevier Ltd. All rights reserved.

  17. Machine learning analysis of binaural rowing sounds

    DEFF Research Database (Denmark)

    Johard, Leonard; Ruffaldi, Emanuele; Hoffmann, Pablo F.

    2011-01-01

    Techniques for machine hearing are increasing their potentiality due to new application domains. In this work we are addressing the analysis of rowing sounds in natural context for the purpose of supporting a training system based on virtual environments. This paper presents the acquisition methodology and the evaluation of different machine learning techniques for classifying rowing-sound data. We see that a combination of principal component analysis and shallow networks perform equally well as deep architectures, while being much faster to train.

  18. Classifying smoking urges via machine learning.

    Science.gov (United States)

    Dumortier, Antoine; Beckjord, Ellen; Shiffman, Saul; Sejdić, Ervin

    2016-12-01

    Smoking is the largest preventable cause of death and diseases in the developed world, and advances in modern electronics and machine learning can help us deliver real-time intervention to smokers in novel ways. In this paper, we examine different machine learning approaches to use situational features associated with having or not having urges to smoke during a quit attempt in order to accurately classify high-urge states. To test our machine learning approaches, specifically, Bayes, discriminant analysis and decision tree learning methods, we used a dataset collected from over 300 participants who had initiated a quit attempt. The three classification approaches are evaluated observing sensitivity, specificity, accuracy and precision. The outcome of the analysis showed that algorithms based on feature selection make it possible to obtain high classification rates with only a few features selected from the entire dataset. The classification tree method outperformed the naive Bayes and discriminant analysis methods, with an accuracy of the classifications up to 86%. These numbers suggest that machine learning may be a suitable approach to deal with smoking cessation matters, and to predict smoking urges, outlining a potential use for mobile health applications. In conclusion, machine learning classifiers can help identify smoking situations, and the search for the best features and classifier parameters significantly improves the algorithms' performance. In addition, this study also supports the usefulness of new technologies in improving the effect of smoking cessation interventions, the management of time and patients by therapists, and thus the optimization of available health care resources. Future studies should focus on providing more adaptive and personalized support to people who really need it, in a minimum amount of time by developing novel expert systems capable of delivering real-time interventions. Copyright © 2016 Elsevier Ireland Ltd. All rights
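
    The four evaluation measures named in this abstract (sensitivity, specificity, accuracy and precision) all follow from the counts in a binary confusion matrix. The sketch below is only an illustration of those standard definitions: the function name and the example counts are hypothetical and are not taken from the study, and each denominator is assumed to be non-zero.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return the four measures from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives. Assumes every denominator below is non-zero.
    """
    return {
        "sensitivity": tp / (tp + fn),             # share of high-urge states detected
        "specificity": tn / (tn + fp),             # share of low-urge states detected
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),               # share of predicted positives that are correct
    }


# Hypothetical counts for illustration only (not results from the paper).
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
```

    Reporting all four together matters when classes are unbalanced, because accuracy alone can look high even if one class is rarely detected.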

  19. Galaxy Classification using Machine Learning

    Science.gov (United States)

    Fowler, Lucas; Schawinski, Kevin; Brandt, Ben-Elias; Widmer, Nicole

    2017-01-01

    We present our current research into the use of machine learning to classify galaxy imaging data with various convolutional neural network configurations in TensorFlow. We are investigating how five-band Sloan Digital Sky Survey imaging data can be used to train on physical properties such as redshift, star formation rate, mass and morphology. We also investigate the performance of artificially redshifted images in recovering physical properties as image quality degrades.

  20. Applicability of Machine-Learning Enabled LIBS in Post Irradiation Nuclear Forensic Analysis of High Level Waste

    International Nuclear Information System (INIS)

    Onkongi, J.; Maina, D.; Angeyo, H.K.

    2017-01-01

    Nuclear forensics (NF) seeks information to determine chemical composition, routes of transit, origin (provenance) and intended use. In post-irradiation or post-detonation NF, the question is whether clues or signatures can be obtained from glass debris and minute sample sizes. A nuclear forensic technique should be state of the art: rapid, non-invasive, non-destructive and capable of remote operation. Other analytic techniques require tedious sample preparation such as dissolution, digestion and matrix removal, which generates additional nuclear waste that requires proper procedures for handling, storage and ultimate disposal; laser-induced breakdown spectroscopy (LIBS) overcomes these limitations. The machine learning techniques employed include artificial neural networks, ANN (regression/modelling), principal component analysis, PCA (classification) and support vector machines, SVM (comparative study/classification). Machine learning coupled with LIBS gives a state-of-the-art analytic method with utility in safeguards, security and non-proliferation.

  1. Applicability of Machine-Learning Enabled LIBS in Post Irradiation Nuclear Forensic Analysis of High Level Nuclear Waste

    International Nuclear Information System (INIS)

    Onkongi, J.; Maina, D.; Angeyo, H. K.

    2017-01-01

    Nuclear forensics (NF) seeks information to determine chemical composition, routes of transit, origin (provenance) and intended use. In post-irradiation or post-detonation NF, the question is whether clues or signatures can be obtained from glass debris and minute sample sizes. A nuclear forensic technique should be state of the art: rapid, non-invasive, non-destructive and capable of remote operation. Other analytic techniques require tedious sample preparation such as dissolution, digestion and matrix removal, which generates additional nuclear waste that requires proper procedures for handling, storage and ultimate disposal; laser-induced breakdown spectroscopy (LIBS) overcomes these limitations. The machine learning techniques employed include artificial neural networks, ANN (regression/modelling), principal component analysis, PCA (classification) and support vector machines, SVM (comparative study/classification). Machine learning coupled with LIBS gives a state-of-the-art analytic method with utility in safeguards, security and non-proliferation.

  2. Learning About Climate and Atmospheric Models Through Machine Learning

    Science.gov (United States)

    Lucas, D. D.

    2017-12-01

    From the analysis of ensemble variability to improving simulation performance, machine learning algorithms can play a powerful role in understanding the behavior of atmospheric and climate models. To learn about model behavior, we create training and testing data sets through ensemble techniques that sample different model configurations and values of input parameters, and then use supervised machine learning to map the relationships between the inputs and outputs. Following this procedure, we have used support vector machines, random forests, gradient boosting and other methods to investigate a variety of atmospheric and climate model phenomena. We have used machine learning to predict simulation crashes, estimate the probability density function of climate sensitivity, optimize simulations of the Madden Julian oscillation, assess the impacts of weather and emissions uncertainty on atmospheric dispersion, and quantify the effects of model resolution changes on precipitation. This presentation highlights recent examples of our applications of machine learning to improve the understanding of climate and atmospheric models. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.

  3. Continual Learning through Evolvable Neural Turing Machines

    DEFF Research Database (Denmark)

    Lüders, Benno; Schläger, Mikkel; Risi, Sebastian

    2016-01-01

    Continual learning, i.e. the ability to sequentially learn tasks without catastrophic forgetting of previously learned ones, is an important open challenge in machine learning. In this paper we take a step in this direction by showing that the recently proposed Evolving Neural Turing Machine (ENTM...

  4. Machine learning in heart failure: ready for prime time.

    Science.gov (United States)

    Awan, Saqib Ejaz; Sohel, Ferdous; Sanfilippo, Frank Mario; Bennamoun, Mohammed; Dwivedi, Girish

    2018-03-01

    The aim of this review is to present an up-to-date overview of the application of machine learning methods in heart failure including diagnosis, classification, readmissions and medication adherence. Recent studies have shown that the application of machine learning techniques may have the potential to improve heart failure outcomes and management, including cost savings by improving existing diagnostic and treatment support systems. Recently developed deep learning methods are expected to yield even better performance than traditional machine learning techniques in performing complex tasks by learning the intricate patterns hidden in big medical data. The review summarizes the recent developments in the application of machine and deep learning methods in heart failure management.

  5. Machine Learning of Fault Friction

    Science.gov (United States)

    Johnson, P. A.; Rouet-Leduc, B.; Hulbert, C.; Marone, C.; Guyer, R. A.

    2017-12-01

    We are applying machine learning (ML) techniques to continuous acoustic emission (AE) data from laboratory earthquake experiments. Our goal is to apply explicit ML methods to this acoustic data (the AE) in order to infer frictional properties of a laboratory fault. The experiment is a double direct shear apparatus comprised of fault blocks surrounding fault gouge comprised of glass beads or quartz powder. Fault characteristics are recorded, including shear stress, applied load (bulk friction = shear stress/normal load) and shear velocity. The raw acoustic signal is continuously recorded. We rely on explicit decision tree approaches (Random Forest and Gradient Boosted Trees) that allow us to identify important features linked to the fault friction. A training procedure that employs both the AE and the recorded shear stress from the experiment is first conducted. Then, testing takes place on data the algorithm has never seen before, using only the continuous AE signal. We find that these methods provide rich information regarding frictional processes during slip (Rouet-Leduc et al., 2017a; Hulbert et al., 2017). In addition, similar machine learning approaches predict failure times, as well as slip magnitudes in some cases. We find that these methods work for both stick slip and slow slip experiments, for periodic slip and for aperiodic slip. We also derive a fundamental relationship between the AE and the friction describing the frictional behavior of any earthquake slip cycle in a given experiment (Rouet-Leduc et al., 2017b). Our goal is to ultimately scale these approaches to Earth geophysical data to probe fault friction. References: Rouet-Leduc, B., C. Hulbert, N. Lubbers, K. Barros, C. Humphreys and P. A. Johnson, Machine learning predicts laboratory earthquakes, in review (2017). https://arxiv.org/abs/1702.05774 Rouet-Leduc, B. et al., Friction Laws Derived From the Acoustic Emissions of a Laboratory Fault by Machine Learning (2017), AGU Fall Meeting Session S025

  6. Two-Dimensional Extreme Learning Machine

    Directory of Open Access Journals (Sweden)

    Bo Jia

    2015-01-01

    ...(BP) networks. However, like many other methods, ELM was originally proposed to handle vector patterns, while nonvector patterns in real applications, such as image data, still need to be explored. We propose the two-dimensional extreme learning machine (2DELM), based on the very natural idea of dealing with matrix data directly. Unlike the original ELM, which handles vectors, 2DELM takes matrices as input features without vectorization. Empirical studies on several real image datasets show the efficiency and effectiveness of the algorithm.

  7. Network anomaly detection a machine learning perspective

    CERN Document Server

    Bhattacharyya, Dhruba Kumar

    2013-01-01

    With the rapid rise in the ubiquity and sophistication of Internet technology and the accompanying growth in the number of network attacks, network intrusion detection has become increasingly important. Anomaly-based network intrusion detection refers to finding exceptional or nonconforming patterns in network traffic data compared to normal behavior. Finding these anomalies has extensive applications in areas such as cyber security, credit card and insurance fraud detection, and military surveillance for enemy activities. Network Anomaly Detection: A Machine Learning Perspective presents mach

  8. Revisit of Machine Learning Supported Biological and Biomedical Studies.

    Science.gov (United States)

    Yu, Xiang-Tian; Wang, Lu; Zeng, Tao

    2018-01-01

    Generally, machine learning includes many in silico methods that transform the principles underlying natural phenomena into human-understandable information, with the aim of saving human labor, assisting human judgment, and creating human knowledge. It should have wide application potential in biological and biomedical studies, especially in the era of big biological data. To survey the application of machine learning alongside biological development, this review provides a wide range of cases that introduce the selection of machine learning methods in different practice scenarios across the whole biological and biomedical study cycle, and further discusses machine learning strategies for analyzing omics data in some cutting-edge biological studies. Finally, notes on new challenges for machine learning arising from small-sample, high-dimension data are summarized around the key points of sample imbalance, white box, and causality.

  9. Machine learning in jet physics

    CERN Multimedia

    CERN. Geneva

    2018-01-01

    High energy collider experiments produce several petabytes of data every year. Given the magnitude and complexity of the raw data, machine learning algorithms provide the best available platform to transform and analyse these data to obtain valuable insights to understand Standard Model and Beyond Standard Model theories. These collider experiments produce both quark and gluon initiated hadronic jets as the core components. Deep learning techniques enable us to classify quark/gluon jets through image recognition and help us to differentiate signals and backgrounds in Beyond Standard Model searches at LHC. We are currently working on quark/gluon jet classification and progressing in our studies to find the bias between event generators using domain adversarial neural networks (DANN). We also plan to investigate top tagging, weak supervision on mixed samples in high energy physics, utilizing transfer learning from simulated data to real experimental data.

  10. Broiler chickens can benefit from machine learning: support vector machine analysis of observational epidemiological data.

    Science.gov (United States)

    Hepworth, Philip J; Nefedov, Alexey V; Muchnik, Ilya B; Morgan, Kenton L

    2012-08-07

    Machine-learning algorithms pervade our daily lives. In epidemiology, supervised machine learning has the potential for classification, diagnosis and risk factor identification. Here, we report the use of support vector machine learning to identify the features associated with hock burn on commercial broiler farms, using routinely collected farm management data. These data lend themselves to analysis using machine-learning techniques. Hock burn, dermatitis of the skin over the hock, is an important indicator of broiler health and welfare. Remarkably, this classifier can predict the occurrence of high hock burn prevalence with accuracy of 0.78 on unseen data, as measured by the area under the receiver operating characteristic curve. We also compare the results with those obtained by standard multi-variable logistic regression and suggest that this technique provides new insights into the data. This novel application of a machine-learning algorithm, embedded in poultry management systems could offer significant improvements in broiler health and welfare worldwide.
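
    The accuracy figure in this abstract is reported as the area under the receiver operating characteristic curve (AUC). As a reminder of what that number measures, the sketch below computes AUC directly from its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The function name, labels and scores are hypothetical and unrelated to the farm data; this is not the authors' evaluation code.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels are 0/1, scores are classifier outputs. Tied pairs count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores for six cases (1 = high prevalence, 0 = low).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)  # 8 of the 9 positive/negative pairs are ordered correctly
```

    The pairwise loop is quadratic in the number of cases, which is fine for illustration; library implementations typically sort the scores instead.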

  11. Reverse hypothesis machine learning a practitioner's perspective

    CERN Document Server

    Kulkarni, Parag

    2017-01-01

    This book introduces a paradigm of reverse hypothesis machines (RHM), focusing on knowledge innovation and machine learning. Knowledge-acquisition-based learning is constrained by large volumes of data and is time consuming. Hence knowledge-innovation-based learning is the need of the hour. Since under-learning results in cognitive inabilities and over-learning compromises freedom, there is a need for optimal machine learning. All existing learning techniques rely on mapping input and output and establishing mathematical relationships between them. Though methods change, the paradigm remains the same: the forward hypothesis machine paradigm, which tries to minimize uncertainty. The RHM, on the other hand, makes use of uncertainty for creative learning. The approach uses limited data to help identify new and surprising solutions. It focuses on improving learnability, unlike traditional approaches, which focus on accuracy. The book is useful as a reference book for machine learning researchers and professionals as ...

  12. EEG classification for motor imagery and resting state in BCI applications using multi-class Adaboost extreme learning machine

    Science.gov (United States)

    Gao, Lin; Cheng, Wei; Zhang, Jinhua; Wang, Jue

    2016-08-01

    Brain-computer interface (BCI) systems provide an alternative communication and control approach for people with limited motor function. Therefore, the feature extraction and classification approach should differentiate the relatively unusual state of motion intention from a common resting state. In this paper, we sought a novel approach for multi-class classification in BCI applications. We collected electroencephalographic (EEG) signals registered by electrodes placed over the scalp during left hand motor imagery, right hand motor imagery, and resting state for ten healthy human subjects. We proposed using the Kolmogorov complexity (Kc) for feature extraction and a multi-class Adaboost classifier with extreme learning machine as base classifier for classification, in order to classify the three-class EEG samples. An average classification accuracy of 79.5% was obtained for ten subjects, which greatly outperformed commonly used approaches. Thus, it is concluded that the proposed method could improve classification performance on motor imagery tasks with multi-class samples. It could be applied in further studies to generate the control commands that initiate the movement of a robotic exoskeleton or orthosis, ultimately facilitating the rehabilitation of disabled people.

  13. Machine Learning and Inverse Problem in Geodynamics

    Science.gov (United States)

    Shahnas, M. H.; Yuen, D. A.; Pysklywec, R.

    2017-12-01

    During the past few decades numerical modeling and traditional HPC have been widely deployed in many diverse fields for problem solutions. However, in recent years the rapid emergence of machine learning (ML), a subfield of artificial intelligence (AI), in many fields of science, engineering, and finance seems to mark a turning point in the replacement of traditional modeling procedures with artificial intelligence-based techniques. The study of the circulation in the interior of Earth relies on the study of high pressure mineral physics, geochemistry, and petrology, where the number of mantle parameters is large and the thermoelastic parameters are highly pressure- and temperature-dependent. More complexity arises from the fact that many of the parameters incorporated in the numerical models as input parameters are not yet well established. In such complex systems the application of machine learning algorithms can play a valuable role. Our focus in this study is the application of supervised machine learning (SML) algorithms in predicting mantle properties, with the emphasis on SML techniques in solving the inverse problem. As a sample problem we focus on the spin transition in ferropericlase and perovskite that may cause slab and plume stagnation at mid-mantle depths. The degree of the stagnation depends on the degree of negative density anomaly at the spin transition zone. The training and testing samples for the machine learning models are produced by the numerical convection models with known magnitudes of density anomaly (as the class labels of the samples). The volume fractions of the stagnated slabs and plumes, which can be considered as measures for the degree of stagnation, are assigned as sample features. The machine learning models can determine the magnitude of the spin transition-induced density anomalies that can cause flow stagnation at mid-mantle depths. Employing support vector machine (SVM) algorithms we show that SML techniques

  14. The Application of Machine Learning Algorithms for Text Mining based on Sentiment Analysis Approach

    Directory of Open Access Journals (Sweden)

    Reza Samizade

    2018-06-01

    Classification of cyber texts and comments into the two categories of positive and negative sentiment among social media users is of high importance in the research area related to text mining. In this research, we applied supervised classification methods to classify Persian texts based on sentiment in cyber space. The result of this research is a system that can decide whether a comment published in cyber space, such as on social networks, is positive or negative. The comments published on Persian movie and movie review websites from 1392 to 1395 are the data set for this research. A part of these data is used for training and the rest for testing. Prior to implementing the algorithms, pre-processing activities such as tokenizing, removing stop words, and the n-grams process were applied to the texts. Naïve Bayes, Neural Networks and support vector machine were used for text classification in this study. Out-of-sample tests showed that there is no evidence indicating that the accuracy of the SVM approach is statistically higher than Naïve Bayes, or that the accuracy of Naïve Bayes is not statistically higher than the NN approach. However, the researchers can conclude that the accuracy of classification using the SVM approach is statistically higher than the accuracy of the NN approach at the 5% significance level.
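
    The pre-processing steps this abstract lists (tokenizing, stop-word removal and n-gram extraction) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, the example sentence is English rather than Persian, and the stop-word list is an invented placeholder.

```python
import re

# Hypothetical stop-word list for illustration; a real system would use a
# curated list for the target language.
STOP_WORDS = {"the", "a", "is", "was", "and", "this"}


def tokenize(text):
    """Lowercase the text and split it into runs of word characters."""
    return re.findall(r"\w+", text.lower())


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop tokens that carry little sentiment information."""
    return [t for t in tokens if t not in stop_words]


def ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


comment = "This movie was surprisingly good and the ending is great"
tokens = remove_stop_words(tokenize(comment))
bigrams = ngrams(tokens, 2)
```

    The resulting unigrams and bigrams would then typically be turned into count or frequency features before being passed to a classifier such as Naïve Bayes or an SVM.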

  15. Electrical machines & their applications

    CERN Document Server

    Hindmarsh, J

    1984-01-01

    A self-contained, comprehensive and unified treatment of electrical machines, including consideration of their control characteristics in both conventional and semiconductor switched circuits. This new edition has been expanded and updated to include material which reflects current thinking and practice. All references have been updated to conform to the latest national (BS) and international (IEC) recommendations and a new appendix has been added which deals more fully with the theory of permanent-magnets, recognising the growing importance of permanent-magnet machines. The text is so arra

  16. Proceedings of the IEEE Machine Learning for Signal Processing XVII

    DEFF Research Database (Denmark)

    The seventeenth of a series of workshops sponsored by the IEEE Signal Processing Society and organized by the Machine Learning for Signal Processing Technical Committee (MLSP-TC). The field of machine learning has matured considerably in both methodology and real-world application domains and has...... become particularly important for solution of problems in signal processing. As reflected in this collection, machine learning for signal processing combines many ideas from adaptive signal/image processing, learning theory and models, and statistics in order to solve complex real-world signal processing......, and two papers from the winners of the Data Analysis Competition. The program included papers in the following areas: genomic signal processing, pattern recognition and classification, image and video processing, blind signal processing, models, learning algorithms, and applications of machine learning...

  17. BELM: Bayesian extreme learning machine.

    Science.gov (United States)

    Soria-Olivas, Emilio; Gómez-Sanchis, Juan; Martín, José D; Vila-Francés, Joan; Martínez, Marcelino; Magdalena, José R; Serrano, Antonio J

    2011-03-01

    The theory of extreme learning machine (ELM) has become very popular in the last few years. ELM is a new approach for learning the parameters of the hidden layers of a multilayer neural network (such as the multilayer perceptron or the radial basis function neural network). Its main advantage is the lower computational cost, which is especially relevant when dealing with many patterns defined in a high-dimensional space. This brief proposes a Bayesian approach to ELM, which presents some advantages over other approaches: it allows the introduction of a priori knowledge; obtains the confidence intervals (CIs) without the need of applying methods that are computationally intensive, e.g., bootstrap; and presents high generalization capabilities. Bayesian ELM is benchmarked against classical ELM in several artificial and real datasets that are widely used for the evaluation of machine learning algorithms. Achieved results show that the proposed approach produces a competitive accuracy with some additional advantages, namely, automatic production of CIs, reduction of probability of model overfitting, and use of a priori knowledge.
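
    For orientation, the low computational cost mentioned above comes from the way classical ELM is usually formulated: hidden-layer weights are drawn at random and kept fixed, so only the output weights are fitted, by linear least squares. The block below states that commonly cited basic formulation, not the Bayesian variant proposed in this brief; the symbols (N training patterns, L hidden units, activation g, targets T) are notation chosen here for illustration.

```latex
% Classical ELM (illustrative notation, not taken from the brief)
% Hidden-layer output matrix for inputs x_1, ..., x_N and L hidden units,
% with w_j and b_j drawn at random and then kept fixed:
H_{ij} = g\left(w_j^{\top} x_i + b_j\right), \qquad i = 1,\dots,N, \quad j = 1,\dots,L
% Output weights are the least-squares solution of H \beta = T:
\hat{\beta} = \arg\min_{\beta} \lVert H\beta - T \rVert^{2} = H^{\dagger} T
% where H^{\dagger} is the Moore-Penrose pseudoinverse of H.
```

    Because the fit reduces to one linear solve, no iterative gradient-based training of the hidden layer is needed.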

  18. A review of machine learning in obesity.

    Science.gov (United States)

    DeGregory, K W; Kuiper, P; DeSilvio, T; Pleuss, J D; Miller, R; Roginski, J W; Fisher, C B; Harness, D; Viswanath, S; Heymsfield, S B; Dungan, I; Thomas, D M

    2018-05-01

    Rich sources of obesity-related data arising from sensors, smartphone apps, electronic medical health records and insurance data can bring new insights for understanding, preventing and treating obesity. For such large datasets, machine learning provides sophisticated and elegant tools to describe, classify and predict obesity-related risks and outcomes. Here, we review machine learning methods that predict and/or classify such as linear and logistic regression, artificial neural networks, deep learning and decision tree analysis. We also review methods that describe and characterize data such as cluster analysis, principal component analysis, network science and topological data analysis. We introduce each method with a high-level overview followed by examples of successful applications. The algorithms were then applied to National Health and Nutrition Examination Survey to demonstrate methodology, utility and outcomes. The strengths and limitations of each method were also evaluated. This summary of machine learning algorithms provides a unique overview of the state of data analysis applied specifically to obesity. © 2018 World Obesity Federation.

  19. Designing anticancer peptides by constructive machine learning.

    Science.gov (United States)

    Grisoni, Francesca; Neuhaus, Claudia; Gabernet, Gisela; Müller, Alex; Hiss, Jan; Schneider, Gisbert

    2018-04-21

    Constructive machine learning enables the automated generation of novel chemical structures without the need for explicit molecular design rules. This study presents the experimental application of such a generative model to design membranolytic anticancer peptides (ACPs) de novo. A recurrent neural network with long short-term memory cells was trained on alpha-helical cationic amphipathic peptide sequences and then fine-tuned with 26 known ACPs. This optimized model was used to generate unique and novel amino acid sequences. Twelve of the peptides were synthesized and tested for their activity on MCF7 human breast adenocarcinoma cells and selectivity against human erythrocytes. Ten of these peptides were active against cancer cells. Six of the active peptides killed MCF7 cancer cells without affecting human erythrocytes with at least threefold selectivity. These results advocate constructive machine learning for the automated design of peptides with desired biological activities. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  20. Machine learning for micro-tomography

    Science.gov (United States)

    Parkinson, Dilworth Y.; Pelt, Daniël M.; Perciano, Talita; Ushizima, Daniela; Krishnan, Harinarayan; Barnard, Harold S.; MacDowell, Alastair A.; Sethian, James

    2017-09-01

    Machine learning has revolutionized a number of fields, but many micro-tomography users have never used it for their work. The micro-tomography beamline at the Advanced Light Source (ALS), in collaboration with the Center for Applied Mathematics for Energy Research Applications (CAMERA) at Lawrence Berkeley National Laboratory, has now deployed a series of tools to automate data processing for ALS users using machine learning. This includes new reconstruction algorithms, feature extraction tools, and image classification and recommendation systems for scientific images. Some of these tools are either in automated pipelines that operate on data as it is collected or as stand-alone software. Others are deployed on computing resources at Berkeley Lab, from workstations to supercomputers, and made accessible to users through either scripting or easy-to-use graphical interfaces. This paper presents a progress report on this work.

  1. Enhancing interpretability of automatically extracted machine learning features: application to a RBM-Random Forest system on brain lesion segmentation.

    Science.gov (United States)

    Pereira, Sérgio; Meier, Raphael; McKinley, Richard; Wiest, Roland; Alves, Victor; Silva, Carlos A; Reyes, Mauricio

    2018-02-01

    Machine learning systems are achieving better performance at the cost of becoming increasingly complex. However, because of that, they become less interpretable, which may cause some distrust by the end-user of the system. This is especially important as these systems are pervasively being introduced to critical domains, such as the medical field. Representation Learning techniques are general methods for automatic feature computation. Nevertheless, these techniques are regarded as uninterpretable "black boxes". In this paper, we propose a methodology to enhance the interpretability of automatically extracted machine learning features. The proposed system is composed of a Restricted Boltzmann Machine for unsupervised feature learning, and a Random Forest classifier, which are combined to jointly consider existing correlations between imaging data, features, and target variables. We define two levels of interpretation: global and local. The former is devoted to understanding if the system learned the relevant relations in the data correctly, while the latter is focused on predictions performed on a voxel- and patient-level. In addition, we propose a novel feature importance strategy that considers both imaging data and target variables, and we demonstrate the ability of the approach to leverage the interpretability of the obtained representation for the task at hand. We evaluated the proposed methodology in brain tumor segmentation and penumbra estimation in ischemic stroke lesions. We show the ability of the proposed methodology to unveil information regarding relationships between imaging modalities and extracted features and their usefulness for the task at hand. In both clinical scenarios, we demonstrate that the proposed methodology enhances the interpretability of automatically learned features, highlighting specific learning patterns that resemble how an expert extracts relevant data from medical images. Copyright © 2017 Elsevier B.V. All rights reserved.

  2. Archetypal Analysis for Machine Learning

    DEFF Research Database (Denmark)

    Mørup, Morten; Hansen, Lars Kai

    2010-01-01

    Archetypal analysis (AA) proposed by Cutler and Breiman in [1] estimates the principal convex hull of a data set. As such AA favors features that constitute representative ’corners’ of the data, i.e. distinct aspects or archetypes. We will show that AA enjoys the interpretability of clustering - ...... for K-means [2]. We demonstrate that the AA model is relevant for feature extraction and dimensional reduction for a large variety of machine learning problems taken from computer vision, neuroimaging, text mining and collaborative filtering....

  3. Reinforcement and Systemic Machine Learning for Decision Making

    CERN Document Server

    Kulkarni, Parag

    2012-01-01

    There are always difficulties in making machines that learn from experience. Complete information is not always available, or it becomes available in bits and pieces over a period of time. With respect to systemic learning, there is a need to understand the impact of decisions and actions on a system over that period of time. This book takes a holistic approach to addressing that need and presents a new paradigm, creating new learning applications and, ultimately, more intelligent machines. The first book of its kind in this new an

  4. Machine learning approaches in medical image analysis

    DEFF Research Database (Denmark)

    de Bruijne, Marleen

    2016-01-01

    Machine learning approaches are increasingly successful in image-based diagnosis, disease prognosis, and risk assessment. This paper highlights new research directions and discusses three main challenges related to machine learning in medical imaging: coping with variation in imaging protocols......, learning from weak labels, and interpretation and evaluation of results....

  5. Improving Hyperspectral Image Classification Method for Fine Land Use Assessment Application Using Semisupervised Machine Learning

    Directory of Open Access Journals (Sweden)

    Chunyang Wang

    2015-01-01

    Studies of land use/cover can reflect changing patterns in population, economy, agricultural structure adjustment, policy, and traffic, and can better serve regional economic development and urban evolution. Fine land use/cover assessment using hyperspectral image classification is a growing focus in many fields. Semisupervised learning, which combines a large number of unlabeled samples with a small number of labeled samples to improve classification and prediction accuracy, has become a new research direction. In this paper, we propose improving fine land use/cover assessment with a semisupervised hyperspectral classification method. Analysis of the test study area showed that the semisupervised classification method could improve overall classification precision and the objective assessment of land use/cover results.

  6. Introducing Machine Learning Concepts with WEKA.

    Science.gov (United States)

    Smith, Tony C; Frank, Eibe

    2016-01-01

    This chapter presents an introduction to data mining with machine learning. It gives an overview of various types of machine learning, along with some examples. It explains how to download, install, and run the WEKA data mining toolkit on a simple data set, then proceeds to explain how one might approach a bioinformatics problem. Finally, it includes a brief summary of machine learning algorithms for other types of data mining problems, and provides suggestions about where to find additional information.

  7. Machine-learning techniques for family demography: an application of random forests to the analysis of divorce determinants in Germany

    OpenAIRE

    Arpino, Bruno; Le Moglie, Marco; Mencarini, Letizia

    2018-01-01

    Demographers often analyze the determinants of life-course events with parametric regression-type approaches. Here, we present a class of nonparametric approaches, broadly defined as machine learning (ML) techniques, and discuss advantages and disadvantages of a popular type known as random forest. We argue that random forests can be useful either as a substitute, or a complement, to more standard parametric regression modeling. Our discussion of random forests is intuitive and...

  8. Deep Machine Learning Application to the Detection of Preclinical Neurodegenerative Diseases of Aging

    Directory of Open Access Journals (Sweden)

    Mathew J. Summers

    2017-12-01

    Artificial intelligence (AI) deep learning protocols offer solutions to complex data processing and analysis. Increasingly these solutions are being applied in the healthcare field, most commonly in processing complex medical imaging data used for diagnosis. Current models apply AI to screening populations of patients for markers of disease and report detection accuracy rates exceeding those of human data screening. In this paper, we explore an alternate model for AI deployment, that of monitoring and analysing an individual’s level of function over time. In adopting this approach, we propose that AI may provide highly accurate and reliable detection of preclinical disease states associated with aging-related neurodegenerative diseases. One of the key challenges facing clinical detection of preclinical phases of diseases such as dementia is the high degree of inter-individual variability in aging-related changes to cognitive function. AI-based monitoring of an individual over time offers the potential for the early detection of change in function for the individual, rather than relying on comparing the individual’s performance to population norms. We explore an approach to developing AI platforms for individual monitoring and preclinical disease detection and examine the potential benefits to the stakeholders in this technological development.

  9. Machine learning in medicine cookbook

    CERN Document Server

    Cleophas, Ton J

    2014-01-01

    The amount of data in medical databases doubles every 20 months, and physicians are at a loss to analyze them. Also, traditional methods of data analysis have difficulty identifying outliers and patterns in big data and in data with multiple exposure/outcome variables, and analysis rules for surveys and questionnaires, currently common methods of data collection, are essentially missing. Obviously, it is time that medical and health professionals mastered their reluctance to use machine learning, and the current 100-page cookbook should be helpful to that aim. It covers in a condensed form the subjects reviewed in the 750-page, three-volume textbook by the same authors, entitled “Machine Learning in Medicine I-III” (ed. by Springer, Heidelberg, Germany, 2013) and was written as a hand-hold presentation and must-read publication. It was written not only for investigators and students in the fields, but also for jaded clinicians new to the methods and lacking time to read the entire textbooks. General purposes ...

  10. Image Classification, Deep Learning and Convolutional Neural Networks : A Comparative Study of Machine Learning Frameworks

    OpenAIRE

    Airola, Rasmus; Hager, Kristoffer

    2017-01-01

    The use of machine learning, and specifically neural networks, is a growing trend in software development, and has grown immensely in the last couple of years in light of an increasing need to handle big data and large information flows. Machine learning has a broad area of application, such as human-computer interaction, predicting stock prices, real-time translation, and self-driving vehicles. Large companies such as Microsoft and Google have already implemented machine learning in some o...

  11. Machine Learning in Medical Imaging.

    Science.gov (United States)

    Giger, Maryellen L

    2018-03-01

    Advances in both imaging and computers have synergistically led to a rapid rise in the potential use of artificial intelligence in various radiological imaging tasks, such as risk assessment, detection, diagnosis, prognosis, and therapy response, as well as in multi-omics disease discovery. A brief overview of the field is given here, allowing the reader to recognize the terminology, the various subfields, and components of machine learning, as well as the clinical potential. Radiomics, an expansion of computer-aided diagnosis, has been defined as the conversion of images to minable data. The ultimate benefit of quantitative radiomics is to (1) yield predictive image-based phenotypes of disease for precision medicine or (2) yield quantitative image-based phenotypes for data mining with other -omics for discovery (ie, imaging genomics). For deep learning in radiology to succeed, note that well-annotated large data sets are needed since deep networks are complex, computer software and hardware are evolving constantly, and subtle differences in disease states are more difficult to perceive than differences in everyday objects. In the future, machine learning in radiology is expected to have a substantial clinical impact with imaging examinations being routinely obtained in clinical practice, providing an opportunity to improve decision support in medical image interpretation. The term of note is decision support, indicating that computers will augment human decision making, making it more effective and efficient. The clinical impact of having computers in the routine clinical practice may allow radiologists to further integrate their knowledge with their clinical colleagues in other medical specialties and allow for precision medicine. Copyright © 2018. Published by Elsevier Inc.

  12. Data Mining and Machine Learning in Astronomy

    Science.gov (United States)

    Ball, Nicholas M.; Brunner, Robert J.

    We review the current state of data mining and machine learning in astronomy. Data Mining can have a somewhat mixed connotation from the point of view of a researcher in this field. If used correctly, it can be a powerful approach, holding the potential to fully exploit the exponentially increasing amount of available data, promising great scientific advance. However, if misused, it can be little more than the black box application of complex computing algorithms that may give little physical insight, and provide questionable results. Here, we give an overview of the entire data mining process, from data collection through to the interpretation of results. We cover common machine learning algorithms, such as artificial neural networks and support vector machines, applications from a broad range of astronomy, emphasizing those in which data mining techniques directly contributed to improving science, and important current and future directions, including probability density functions, parallel algorithms, Peta-Scale computing, and the time domain. We conclude that, so long as one carefully selects an appropriate algorithm and is guided by the astronomical problem at hand, data mining can be very much the powerful tool, and not the questionable black box.

  13. Early identification of posttraumatic stress following military deployment: Application of machine learning methods to a prospective study of Danish soldiers.

    Science.gov (United States)

    Karstoft, Karen-Inge; Statnikov, Alexander; Andersen, Søren B; Madsen, Trine; Galatzer-Levy, Isaac R

    2015-09-15

    Pre-deployment identification of soldiers at risk for long-term posttraumatic stress psychopathology after homecoming is important to guide decisions about deployment. Early post-deployment identification can direct early interventions to those in need and thereby prevent the development of chronic psychopathology. Both hold significant public health benefits given large numbers of deployed soldiers, but have so far not been achieved. Here, we aim to assess the potential for pre- and early post-deployment prediction of resilience or posttraumatic stress development in soldiers by application of machine learning (ML) methods. ML feature selection and prediction algorithms were applied to a prospective cohort of 561 Danish soldiers deployed to Afghanistan in 2009 to identify unique risk indicators and forecast long-term posttraumatic stress responses. Robust pre- and early post-deployment risk indicators were identified, and included individual PTSD symptoms as well as total level of PTSD symptoms, previous trauma and treatment, negative emotions, and thought suppression. The predictive performance of these risk indicators combined was assessed by cross-validation. Together, these indicators forecasted long-term posttraumatic stress responses with high accuracy (pre-deployment: AUC = 0.84 (95% CI = 0.81-0.87), post-deployment: AUC = 0.88 (95% CI = 0.85-0.91)). This study utilized a previously collected data set and was therefore not designed to exhaust the potential of ML methods. Further, the study relied solely on self-reported measures. Pre-deployment and early post-deployment identification of risk for long-term posttraumatic psychopathology are feasible and could greatly reduce the public health costs of war. Copyright © 2015 Elsevier B.V. All rights reserved.
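
    The AUC values reported above can be read as the probability that a randomly chosen positive case receives a higher risk score than a randomly chosen negative case. The sketch below computes AUC that way in plain Python; the function name, labels, and scores are hypothetical illustrations and are not taken from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise ranking (ties count as 0.5).

    labels: iterable of 0/1 outcomes (1 = positive case)
    scores: iterable of predicted risk scores, same length as labels
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 of the 4 positive/negative pairs are ordered correctly.
example_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(example_auc)  # 0.75
```

    In a cross-validated setting such as the one described, this quantity would be computed on held-out folds only, never on the data used to fit the model.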

  14. Relevance as a metric for evaluating machine learning algorithms

    NARCIS (Netherlands)

    Kota Gopalakrishna, A.; Ozcelebi, T.; Liotta, A.; Lukkien, J.J.

    2013-01-01

    In machine learning, the choice of a learning algorithm that is suitable for the application domain is critical. The performance metric used to compare different algorithms must also reflect the concerns of users in the application domain under consideration. In this work, we propose a novel

  15. Introduction to Machine Learning: Class Notes 67577

    OpenAIRE

    Shashua, Amnon

    2009-01-01

    Introduction to Machine learning covering Statistical Inference (Bayes, EM, ML/MaxEnt duality), algebraic and spectral methods (PCA, LDA, CCA, Clustering), and PAC learning (the Formal model, VC dimension, Double Sampling theorem).

  16. Building machine learning systems with Python

    CERN Document Server

    Coelho, Luis Pedro

    2015-01-01

    This book primarily targets Python developers who want to learn and use Python's machine learning capabilities and gain valuable insights from data to develop effective solutions for business problems.

  17. Learning as a Machine: Crossovers between Humans and Machines

    Science.gov (United States)

    Hildebrandt, Mireille

    2017-01-01

    This article is a revised version of the keynote presented at LAK '16 in Edinburgh. The article investigates some of the assumptions of learning analytics, notably those related to behaviourism. Building on the work of Ivan Pavlov, Herbert Simon, and James Gibson as ways of "learning as a machine," the article then develops two levels of…

  18. Tunnel boring machine applications

    International Nuclear Information System (INIS)

    Bhattacharyya, K.K.; McDonald, R.; Saunders, R.S.

    1992-01-01

    This paper reports that characterization of Yucca Mountain for a potential repository requires construction of an underground Exploratory Studies Facility (ESF). Mechanical excavating methods have been proposed for construction of the ESF as they offer a number of advantages over drilling and blasting at the Yucca Mountain site, including: less ground disturbance and therefore a potential for less adverse effects on the integrity of the site, creation of a more stable excavation cross section requiring less ground support, and an inherently safer and cleaner working environment. The tunnel boring machine (TBM) provides a proven technology for excavating the welded and unwelded Yucca Mountain tuffs. The access ramps and main underground tunnels form the largest part of the ESF underground construction work, and have been designed for excavation by TBM

  19. Teaching machine learning to design students

    NARCIS (Netherlands)

    Vlist, van der B.J.J.; van de Westelaken, H.F.M.; Bartneck, C.; Hu, J.; Ahn, R.M.C.; Barakova, E.I.; Delbressine, F.L.M.; Feijs, L.M.G.; Pan, Z.; Zhang, X.; El Rhalibi, A.

    2008-01-01

    Machine learning is a key technology to design and create intelligent systems, products, and related services. Like many other design departments, we are faced with the challenge to teach machine learning to design students, who often do not have an inherent affinity towards technology. We

  20. Fall Detection in Individuals With Lower Limb Amputations Using Mobile Phones: Machine Learning Enhances Robustness for Real-World Applications.

    Science.gov (United States)

    Shawen, Nicholas; Lonini, Luca; Mummidisetty, Chaithanya Krishna; Shparii, Ilona; Albert, Mark V; Kording, Konrad; Jayaraman, Arun

    2017-10-11

    .022). Detection of falls in control individuals yielded similar results (sensitivity: mean 0.979, SEM 0.022; specificity: mean 0.991, SEM 0.012). A mean 2.2 (SD 1.7) false alarms per day were obtained when evaluating the model (vs mean 122.1, SD 166.1 based on thresholds) on data recorded as participants carried the phone during their daily routine for two or more days. Machine-learning classifiers outperformed the threshold-based one (Pmobile phone-based fall detection model can use data from non-amputee individuals to detect falls in individuals walking with a prosthesis. We successfully detected falls when the mobile phone was carried across multiple locations and without a predetermined orientation. Furthermore, the number of false alarms yielded by the model over a longer period of time was reasonably low. This moves the application of mobile phone-based fall detection systems closer to a real-world use case scenario. ©Nicholas Shawen, Luca Lonini, Chaithanya Krishna Mummidisetty, Ilona Shparii, Mark V Albert, Konrad Kording, Arun Jayaraman. Originally published in JMIR Mhealth and Uhealth (http://mhealth.jmir.org), 11.10.2017.
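
    Sensitivity and specificity, the two detection metrics quoted above, follow directly from the confusion counts. A minimal sketch, assuming binary labels where 1 marks a fall; the function name and the toy labels are hypothetical and unrelated to the study data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = fall, 0 = no fall)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one example of each class")
    sensitivity = tp / (tp + fn)  # fraction of real falls that were detected
    specificity = tn / (tn + fp)  # fraction of non-falls correctly left alone
    return sensitivity, specificity


# Hypothetical toy labels: 3 falls (2 detected), 5 non-falls (1 false alarm).
sens, spec = sensitivity_specificity(
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 1],
)
```

    High specificity alone does not guarantee few false alarms in daily use, because non-fall windows vastly outnumber falls; that is why the abstract also reports false alarms per day.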

  1. Considerations upon the Machine Learning Technologies

    OpenAIRE

    Alin Munteanu; Cristina Ofelia Sofran

    2006-01-01

    Artificial intelligence offers superior techniques and methods by which problems from diverse domains may find an optimal solution. The Machine Learning technologies refer to the domain of artificial intelligence aiming to develop the techniques allowing the computers to “learn”. Some systems based on Machine Learning technologies tend to eliminate the necessity of the human intelligence while the others adopt a man-machine collaborative approach.

  2. Considerations upon the Machine Learning Technologies

    Directory of Open Access Journals (Sweden)

    Alin Munteanu

    2006-01-01

    Artificial intelligence offers superior techniques and methods by which problems from diverse domains may find an optimal solution. The Machine Learning technologies refer to the domain of artificial intelligence aiming to develop the techniques allowing the computers to “learn”. Some systems based on Machine Learning technologies tend to eliminate the necessity of the human intelligence while the others adopt a man-machine collaborative approach.

  3. Finding New Perovskite Halides via Machine learning

    Directory of Open Access Journals (Sweden)

    Ghanshyam Pilania

    2016-04-01

    Advanced materials with improved properties have the potential to fuel future technological advancements. However, identification and discovery of these optimal materials for a specific application is a non-trivial task, because of the vastness of the chemical search space with enormous compositional and configurational degrees of freedom. Materials informatics provides an efficient approach towards rational design of new materials, via learning from known data to make decisions on new and previously unexplored compounds in an accelerated manner. Here, we demonstrate the power and utility of such statistical learning (or machine learning) via building a support vector machine (SVM) based classifier that uses elemental features (or descriptors) to predict the formability of a given ABX3 halide composition (where A and B represent monovalent and divalent cations, respectively, and X is F, Cl, Br or I anion) in the perovskite crystal structure. The classification model is built by learning from a dataset of 181 experimentally known ABX3 compounds. After exploring a wide range of features, we identify ionic radii, tolerance factor and octahedral factor to be the most important factors for the classification, suggesting that steric and geometric packing effects govern the stability of these halides. The trained and validated models then predict, with a high degree of confidence, several novel ABX3 compositions with perovskite crystal structure.

  4. Finding New Perovskite Halides via Machine learning

    Science.gov (United States)

    Pilania, Ghanshyam; Balachandran, Prasanna V.; Kim, Chiho; Lookman, Turab

    2016-04-01

    Advanced materials with improved properties have the potential to fuel future technological advancements. However, identification and discovery of these optimal materials for a specific application is a non-trivial task, because of the vastness of the chemical search space with enormous compositional and configurational degrees of freedom. Materials informatics provides an efficient approach towards rational design of new materials, via learning from known data to make decisions on new and previously unexplored compounds in an accelerated manner. Here, we demonstrate the power and utility of such statistical learning (or machine learning) via building a support vector machine (SVM) based classifier that uses elemental features (or descriptors) to predict the formability of a given ABX3 halide composition (where A and B represent monovalent and divalent cations, respectively, and X is F, Cl, Br or I anion) in the perovskite crystal structure. The classification model is built by learning from a dataset of 181 experimentally known ABX3 compounds. After exploring a wide range of features, we identify ionic radii, tolerance factor and octahedral factor to be the most important factors for the classification, suggesting that steric and geometric packing effects govern the stability of these halides. The trained and validated models then predict, with a high degree of confidence, several novel ABX3 compositions with perovskite crystal structure.
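
    Two of the descriptors named above have simple closed forms. The sketch below uses the standard Goldschmidt tolerance factor, t = (r_A + r_X) / (sqrt(2) (r_B + r_X)), and the octahedral factor, mu = r_B / r_X. The function names are hypothetical, and the ionic radii in the usage lines are approximate illustrative values in angstroms, not figures from the paper.

```python
import math


def tolerance_factor(r_a, r_b, r_x):
    """Goldschmidt tolerance factor for an ABX3 composition (radii in the same unit)."""
    return (r_a + r_x) / (math.sqrt(2) * (r_b + r_x))


def octahedral_factor(r_b, r_x):
    """Ratio of B-cation radius to X-anion radius, describing the BX6 octahedron."""
    return r_b / r_x


# Illustrative, approximate ionic radii (angstrom) for a Cs-Pb-I composition.
# These numbers are assumptions for the example only.
r_cs, r_pb, r_i = 1.88, 1.19, 2.20
t = tolerance_factor(r_cs, r_pb, r_i)
mu = octahedral_factor(r_pb, r_i)
print(f"t = {t:.3f}, mu = {mu:.3f}")
```

    In a classifier of the kind described, values like t and mu would be two entries in the feature vector for each candidate composition, alongside the ionic radii themselves.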

  5. Machine Learning wins the Higgs Challenge

    CERN Multimedia

    Abha Eli Phoboo

    2014-01-01

    The winner of the four-month-long Higgs Machine Learning Challenge, launched on 12 May, is Gábor Melis from Hungary, followed closely by Tim Salimans from the Netherlands and Pierre Courtiol from France. The challenge explored the potential of advanced machine learning methods to improve the significance of the Higgs discovery. Winners of the Higgs Machine Learning Challenge: Gábor Melis and Tim Salimans (top row), Tianqi Chen and Tong He (bottom row). Participants in the Higgs Machine Learning Challenge were tasked with developing an algorithm to improve the detection of Higgs boson signal events decaying into two tau particles in a sample of simulated ATLAS data that contains few signal and a majority of non-Higgs boson “background” events. No knowledge of particle physics was required for the challenge but skills in machine learning, the training of computers to recognise patterns in data, were essential. The Challenge, hosted by Ka...

  6. Machine vision systems using machine learning for industrial product inspection

    Science.gov (United States)

    Lu, Yi; Chen, Tie Q.; Chen, Jie; Zhang, Jian; Tisler, Anthony

    2002-02-01

    Machine vision inspection requires efficient processing time and accurate results. In this paper, we present a machine vision inspection architecture, SMV (Smart Machine Vision). SMV decomposes a machine vision inspection problem into two stages, Learning Inspection Features (LIF) and On-Line Inspection (OLI). The LIF is designed to learn visual inspection features from design data and/or from inspection products. During the OLI stage, the inspection system uses the knowledge learnt by the LIF component to inspect the visual features of products. In this paper we present two machine vision inspection systems developed under the SMV architecture for two different types of products, Printed Circuit Boards (PCB) and Vacuum Fluorescent Display (VFD) boards. In the VFD board inspection system, the LIF component learns inspection features from a VFD board and its display patterns. In the PCB inspection system, the LIF learns the inspection features from the CAD file of a PCB. In both systems, the LIF component also incorporates interactive learning to make the inspection system more powerful and efficient. The VFD system has been deployed successfully in three different manufacturing companies and the PCB inspection system is in the process of being deployed in a manufacturing plant.

  7. Novel jet observables from machine learning

    Science.gov (United States)

    Datta, Kaustuv; Larkoski, Andrew J.

    2018-03-01

    Previous studies have demonstrated the utility and applicability of machine learning techniques to jet physics. In this paper, we construct new observables for the discrimination of jets from different originating particles exclusively from information identified by the machine. The approach we propose is to first organize information in the jet by resolved phase space and determine the effective $N$-body phase space at which discrimination power saturates. This then allows for the construction of a discrimination observable from the $N$-body phase space coordinates. A general form of this observable can be expressed with numerous parameters that are chosen so that the observable maximizes the signal vs. background likelihood. Here, we illustrate this technique applied to discrimination of $H\to b\overline{b}$ decays from massive $g\to b\overline{b}$ splittings. We show that for a simple parametrization, we can construct an observable that has discrimination power comparable to, or better than, widely-used observables motivated from theory considerations. For the case of jets on which modified mass-drop tagger grooming is applied, the observable that the machine learns is essentially the angle of the dominant gluon emission off of the $b\overline{b}$ pair.

  8. Machine learning search for variable stars

    Science.gov (United States)

    Pashchenko, Ilya N.; Sokolovsky, Kirill V.; Gavras, Panagiotis

    2018-04-01

    Photometric variability detection is often considered as a hypothesis testing problem: an object is variable if the null hypothesis that its brightness is constant can be ruled out given the measurements and their uncertainties. The practical applicability of this approach is limited by uncorrected systematic errors. We propose a new variability detection technique sensitive to a wide range of variability types while being robust to outliers and underestimated measurement uncertainties. We consider variability detection as a classification problem that can be approached with machine learning. Logistic Regression (LR), Support Vector Machines (SVM), k Nearest Neighbours (kNN), Neural Nets (NN), Random Forests (RF), and Stochastic Gradient Boosting classifier (SGB) are applied to 18 features (variability indices) quantifying scatter and/or correlation between points in a light curve. We use a subset of Optical Gravitational Lensing Experiment phase two (OGLE-II) Large Magellanic Cloud (LMC) photometry (30 265 light curves) that was searched for variability using traditional methods (168 known variable objects) as the training set and then apply the NN to a new test set of 31 798 OGLE-II LMC light curves. Among 205 candidates selected in the test set, 178 are real variables, while 13 low-amplitude variables are new discoveries. The machine learning classifiers considered are found to be more efficient (select more variables and fewer false candidates) compared to traditional techniques using individual variability indices or their linear combination. The NN, SGB, SVM, and RF show a higher efficiency compared to LR and kNN.
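
    The hypothesis-testing view described at the start of this abstract can be sketched with one classical variability index: the reduced chi-squared of a light curve about its weighted mean brightness. This is a generic illustration, not necessarily one of the 18 features used by the authors; the function name and the toy light curves are hypothetical.

```python
def reduced_chi2(mags, errs):
    """Reduced chi-squared of magnitudes about their inverse-variance weighted mean.

    Values near 1 are consistent with constant brightness given the quoted errors;
    values much larger than 1 suggest variability or underestimated errors.
    """
    if len(mags) != len(errs) or len(mags) < 2:
        raise ValueError("need at least two measurements with matching errors")
    weights = [1.0 / e ** 2 for e in errs]
    mean = sum(w * m for w, m in zip(weights, mags)) / sum(weights)
    chi2 = sum(((m - mean) / e) ** 2 for m, e in zip(mags, errs))
    return chi2 / (len(mags) - 1)


# Hypothetical toy light curves with 0.01 mag uncertainties.
constant_star = reduced_chi2([15.00, 15.01, 14.99, 15.00], [0.01] * 4)
variable_star = reduced_chi2([15.00, 15.20, 14.80, 15.00], [0.01] * 4)
```

    The weakness noted in the abstract is visible here: if the quoted errors are too small, even a constant star gets a large index, which motivates feeding several such indices to a classifier instead of thresholding one.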

  9. Adaptive Learning Systems: Beyond Teaching Machines

    Science.gov (United States)

    Kara, Nuri; Sevim, Nese

    2013-01-01

    Since the 1950s, teaching machines have changed a lot. Today, we have different ideas about how people learn and what instructors should do to help students during their learning process. We have adaptive learning technologies that can create much more student-oriented learning environments. The purpose of this article is to present these changes and its…

  10. Learning Machines Implemented on Non-Deterministic Hardware

    OpenAIRE

    Gupta, Suyog; Sindhwani, Vikas; Gopalakrishnan, Kailash

    2014-01-01

    This paper highlights new opportunities for designing large-scale machine learning systems as a consequence of blurring traditional boundaries that have allowed algorithm designers and application-level practitioners to stay -- for the most part -- oblivious to the details of the underlying hardware-level implementations. The hardware/software co-design methodology advocated here hinges on the deployment of compute-intensive machine learning kernels onto compute platforms that trade-off deter...

  11. Building machine learning systems with Python

    CERN Document Server

    Richert, Willi

    2013-01-01

    This is a tutorial-driven and practical, but well-grounded book showcasing good Machine Learning practices. There will be an emphasis on using existing technologies instead of showing how to write your own implementations of algorithms. This book is a scenario-based, example-driven tutorial. By the end of the book you will have learnt critical aspects of Machine Learning Python projects and experienced the power of ML-based systems by actually working on them. This book primarily targets Python developers who want to learn about and build Machine Learning into their projects, or who want to pro

  12. Probabilistic machine learning and artificial intelligence.

    Science.gov (United States)

    Ghahramani, Zoubin

    2015-05-28

    How can a machine learn from experience? Probabilistic modelling provides a framework for understanding what learning is, and has therefore emerged as one of the principal theoretical and practical approaches for designing machines that learn from data acquired through experience. The probabilistic framework, which describes how to represent and manipulate uncertainty about models and predictions, has a central role in scientific data analysis, machine learning, robotics, cognitive science and artificial intelligence. This Review provides an introduction to this framework, and discusses some of the state-of-the-art advances in the field, namely, probabilistic programming, Bayesian optimization, data compression and automatic model discovery.

  13. Probabilistic machine learning and artificial intelligence

    Science.gov (United States)

    Ghahramani, Zoubin

    2015-05-01

    How can a machine learn from experience? Probabilistic modelling provides a framework for understanding what learning is, and has therefore emerged as one of the principal theoretical and practical approaches for designing machines that learn from data acquired through experience. The probabilistic framework, which describes how to represent and manipulate uncertainty about models and predictions, has a central role in scientific data analysis, machine learning, robotics, cognitive science and artificial intelligence. This Review provides an introduction to this framework, and discusses some of the state-of-the-art advances in the field, namely, probabilistic programming, Bayesian optimization, data compression and automatic model discovery.

  14. Machine learning: a theoretical approach

    CERN Document Server

    Natarajan, Balas K

    2014-01-01

    This is the first comprehensive introduction to computational learning theory. The author's uniform presentation of fundamental results and their applications offers AI researchers a theoretical perspective on the problems they study. The book presents tools for the analysis of probabilistic models of learning, tools that crisply classify what is and is not efficiently learnable. After a general introduction to Valiant's PAC paradigm and the important notion of the Vapnik-Chervonenkis dimension, the author explores specific topics such as finite automata and neural networks. The presentation

  15. Status Checking System of Home Appliances using machine learning

    Directory of Open Access Journals (Sweden)

    Yoon Chi-Yurl

    2017-01-01

    This paper describes a status checking system for home appliances based on machine learning, which can be applied to existing household appliances that have no networking function. The system consists of sensor modules, a wireless communication module, a cloud server, an Android application and a machine learning algorithm. Applied to a washing machine, the system analyses and classifies four appliance states: staying, washing, rinsing and spin-drying. Sensor measurement and data transmission run on an Arduino board, and the data are sent to the cloud server in real time. The collected data are parsed by the Android application and fed to the machine learning algorithm, which learns the appliance states and compares the stored training data with real-time data collected from the appliance. The results are expected to serve as a base technology for designing real-time automatic control systems for household appliances based on machine learning.

  16. Proceedings of IEEE Machine Learning for Signal Processing Workshop XVI

    DEFF Research Database (Denmark)

    Larsen, Jan

    These proceedings contain refereed papers presented at the sixteenth IEEE Workshop on Machine Learning for Signal Processing (MLSP'2006), held in Maynooth, Co. Kildare, Ireland, September 6-8, 2006. This is a continuation of the IEEE Workshops on Neural Networks for Signal Processing (NNSP). The name of the Technical Committee, hence of the Workshop, was changed to Machine Learning for Signal Processing in September 2003 to better reflect the areas represented by the Technical Committee. The conference is organized by the Machine Learning for Signal Processing Technical Committee... the same standard as the printed version and facilitates the reading and searching of the papers. The field of machine learning has matured considerably in both methodology and real-world application domains and has become particularly important for the solution of problems in signal processing. As reflected...

  17. Python for probability, statistics, and machine learning

    CERN Document Server

    Unpingco, José

    2016-01-01

    This book covers the key ideas that link probability, statistics, and machine learning illustrated using Python modules in these areas. The entire text, including all the figures and numerical results, is reproducible using the Python codes and their associated Jupyter/IPython notebooks, which are provided as supplementary downloads. The author develops key intuitions in machine learning by working meaningful examples using multiple analytical methods and Python codes, thereby connecting theoretical concepts to concrete implementations. Modern Python modules like Pandas, Sympy, and Scikit-learn are applied to simulate and visualize important machine learning concepts like the bias/variance trade-off, cross-validation, and regularization. Many abstract mathematical ideas, such as convergence in probability theory, are developed and illustrated with numerical examples. This book is suitable for anyone with an undergraduate-level exposure to probability, statistics, or machine learning and with rudimentary knowl...

  18. An introduction to machine learning with Scikit-Learn

    CERN Multimedia

    CERN. Geneva

    2015-01-01

    This tutorial gives an introduction to the scientific ecosystem for data analysis and machine learning in Python. After a short introduction of machine learning concepts, we will demonstrate on High Energy Physics data how a basic supervised learning analysis can be carried out using the Scikit-Learn library. Topics covered include data loading facilities and data representation, supervised learning algorithms, pipelines, model selection and evaluation, and model introspection.

  19. Distributed Extreme Learning Machine for Nonlinear Learning over Network

    Directory of Open Access Journals (Sweden)

    Songyan Huang

    2015-02-01

    Distributed data collection and analysis over a network are ubiquitous, especially over a wireless sensor network (WSN). To our knowledge, the data model used in most distributed algorithms is linear. However, in real applications, the linearity of systems is not always guaranteed. In nonlinear cases, the single hidden layer feedforward neural network (SLFN) with radial basis function (RBF) hidden neurons has the ability to approximate any continuous function and, thus, may be used as the nonlinear learning system. However, because of the communication cost, using the distributed version of the conventional algorithms to train the neural network directly is usually prohibitive. Fortunately, based on the theorems provided in the extreme learning machine (ELM) literature, we only need to compute the output weights of the SLFN. Computing the output weights is itself a linear learning problem, although the input-output mapping of the overall SLFN is still nonlinear. Using a distributed algorithm to cooperatively compute the output weights of the SLFN, we obtain a distributed extreme learning machine (dELM) for nonlinear learning in this paper. This dELM is applied to regression and classification problems to demonstrate its effectiveness and advantages.
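
    The core ELM idea in the abstract above (fix a random hidden layer, then solve only for the output weights) can be sketched on a single node. This is an illustrative sketch, not the authors' dELM: the function names, the toy sine data, the number of hidden neurons and the width range are all assumptions, and the distributed step is left out.

```python
import numpy as np

def rbf_features(X, centers, widths):
    # Squared distance between every sample and every RBF center.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * widths ** 2))

def fit_elm(X, y, n_hidden=50, seed=0):
    """Random RBF hidden layer, output weights by linear least squares."""
    rng = np.random.default_rng(seed)
    # Hidden-layer parameters are drawn once and never trained (assumed ranges).
    centers = X[rng.choice(len(X), size=n_hidden, replace=True)]
    widths = rng.uniform(0.5, 2.0, size=n_hidden)
    H = rbf_features(X, centers, widths)
    # The only learned parameters: a linear least-squares problem.
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)
    return centers, widths, beta

def predict_elm(model, X):
    centers, widths, beta = model
    return rbf_features(X, centers, widths) @ beta

# Toy regression problem (hypothetical data).
X = np.linspace(-3.0, 3.0, 200).reshape(-1, 1)
y = np.sin(X[:, 0])
model = fit_elm(X, y)
rmse = np.sqrt(np.mean((predict_elm(model, X) - y) ** 2))
print(f"training RMSE: {rmse:.2e}")
```

    Because only `beta` is learned, the training step is linear even though the fitted mapping is nonlinear, which is the property the abstract relies on for the distributed version.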

  20. Automatic classification of written descriptions by healthy adults: An overview of the application of natural language processing and machine learning techniques to clinical discourse analysis.

    Science.gov (United States)

    Toledo, Cíntia Matsuda; Cunha, Andre; Scarton, Carolina; Aluísio, Sandra

    2014-01-01

    Discourse production is an important aspect in the evaluation of brain-injured individuals. We believe that studies comparing the performance of brain-injured subjects with that of healthy controls must use groups with compatible education. A pioneering application of machine learning methods using Brazilian Portuguese for clinical purposes is described, highlighting education as an important variable in the Brazilian scenario. The aims were to describe how to: (i) develop machine learning classifiers using features generated by natural language processing tools to distinguish descriptions produced by healthy individuals into classes based on their years of education; and (ii) automatically identify the features that best distinguish the groups. The approach proposed here extracts linguistic features automatically from the written descriptions with the aid of two Natural Language Processing tools: Coh-Metrix-Port and AIC. It also includes nine task-specific features (three new ones, two extracted manually, besides description time; type of scene described - simple or complex; presentation order - which type of picture was described first; and age). In this study, the descriptions by 144 of the subjects studied in Toledo 18 were used, which included 200 healthy Brazilians of both genders. A Support Vector Machine (SVM) with a radial basis function (RBF) kernel is the most recommended approach for the binary classification of our data, classifying three of the four initial classes. CfsSubsetEval (CFS) is a strong candidate to replace manual feature selection methods.

  1. Automatic classification of written descriptions by healthy adults: An overview of the application of natural language processing and machine learning techniques to clinical discourse analysis

    Directory of Open Access Journals (Sweden)

    Cíntia Matsuda Toledo

    Discourse production is an important aspect in the evaluation of brain-injured individuals. We believe that studies comparing the performance of brain-injured subjects with that of healthy controls must use groups with compatible education. A pioneering application of machine learning methods using Brazilian Portuguese for clinical purposes is described, highlighting education as an important variable in the Brazilian scenario. OBJECTIVE: The aims were to describe how to: (i) develop machine learning classifiers using features generated by natural language processing tools to distinguish descriptions produced by healthy individuals into classes based on their years of education; and (ii) automatically identify the features that best distinguish the groups. METHODS: The approach proposed here extracts linguistic features automatically from the written descriptions with the aid of two Natural Language Processing tools: Coh-Metrix-Port and AIC. It also includes nine task-specific features (three new ones, two extracted manually, besides description time; type of scene described - simple or complex; presentation order - which type of picture was described first; and age). In this study, the descriptions by 144 of the subjects studied in Toledo 18 were used, which included 200 healthy Brazilians of both genders. RESULTS AND CONCLUSION: A Support Vector Machine (SVM) with a radial basis function (RBF) kernel is the most recommended approach for the binary classification of our data, classifying three of the four initial classes. CfsSubsetEval (CFS) is a strong candidate to replace manual feature selection methods.

  2. Machine learning techniques in optical communication

    DEFF Research Database (Denmark)

    Zibar, Darko; Piels, Molly; Jones, Rasmus Thomas

    2016-01-01

    Machine learning techniques relevant for nonlinearity mitigation, carrier recovery, and nanoscale device characterization are reviewed and employed. Markov Chain Monte Carlo in combination with Bayesian filtering is employed within the nonlinear state-space framework and demonstrated for parameter...

  3. Machine learning techniques in optical communication

    DEFF Research Database (Denmark)

    Zibar, Darko; Piels, Molly; Jones, Rasmus Thomas

    2015-01-01

    Techniques from the machine learning community are reviewed and employed for laser characterization, signal detection in the presence of nonlinear phase noise, and nonlinearity mitigation. Bayesian filtering and expectation maximization are employed within nonlinear state-space framework...

  4. Computer vision and machine learning for archaeology

    NARCIS (Netherlands)

    van der Maaten, L.J.P.; Boon, P.; Lange, G.; Paijmans, J.J.; Postma, E.

    2006-01-01

    Until now, computer vision and machine learning techniques barely contributed to the archaeological domain. The use of these techniques can support archaeologists in their assessment and classification of archaeological finds. The paper illustrates the use of computer vision techniques for

  5. Using Machine Learning for Land Suitability Classification

    African Journals Online (AJOL)

    User

    West African Journal of Applied Ecology, vol. ... evidence for the utility of machine learning methods in land suitability classification especially MCS methods. ... Artificial intelligence tools. ..... Numerical values of index for the various classes.

  6. IRB Process Improvements: A Machine Learning Analysis.

    Science.gov (United States)

    Shoenbill, Kimberly; Song, Yiqiang; Cobb, Nichelle L; Drezner, Marc K; Mendonca, Eneida A

    2017-06-01

    Clinical research involving humans is critically important, but it is a lengthy and expensive process. Most studies require institutional review board (IRB) approval. Our objective is to identify predictors of delays or accelerations in the IRB review process and apply this knowledge to inform process change in an effort to improve IRB efficiency, transparency, consistency and communication. We analyzed timelines of protocol submissions to determine protocol or IRB characteristics associated with different processing times. Our evaluation included single variable analysis to identify significant predictors of IRB processing time and machine learning methods to predict processing times through the IRB review system. Based on initial identified predictors, changes to IRB workflow and staffing procedures were instituted and we repeated our analysis. Our analysis identified several predictors of delays in the IRB review process including type of IRB review to be conducted, whether a protocol falls under Veteran's Administration purview and specific staff in charge of a protocol's review. We have identified several predictors of delays in IRB protocol review processing times using statistical and machine learning methods. Application of this knowledge to process improvement efforts in two IRBs has led to increased efficiency in protocol review. The workflow and system enhancements that are being made support our four-part goal of improving IRB efficiency, consistency, transparency, and communication.

  7. Implementing Machine Learning in the PCWG Tool

    Energy Technology Data Exchange (ETDEWEB)

    Clifton, Andrew; Ding, Yu; Stuart, Peter

    2016-12-13

    The Power Curve Working Group (www.pcwg.org) is an ad-hoc industry-led group to investigate the performance of wind turbines in real-world conditions. As part of ongoing experience-sharing exercises, machine learning has been proposed as a possible way to predict turbine performance. This presentation provides some background information about machine learning and how it might be implemented in the PCWG exercises.

  8. On the Conditioning of Machine-Learning-Assisted Turbulence Modeling

    Science.gov (United States)

    Wu, Jinlong; Sun, Rui; Wang, Qiqi; Xiao, Heng

    2017-11-01

    Recently, several researchers have demonstrated that machine learning techniques can be used to improve the RANS modeled Reynolds stress by training on an available database of high-fidelity simulations. However, obtaining an improved mean velocity field remains an unsolved challenge, restricting the predictive capability of current machine-learning-assisted turbulence modeling approaches. In this work we define a condition number to evaluate the model conditioning of data-driven turbulence modeling approaches, and propose a stability-oriented machine learning framework to model Reynolds stress. Two canonical flows, the flow in a square duct and the flow over periodic hills, are investigated to demonstrate the predictive capability of the proposed framework. The satisfactory prediction of the mean velocity field for both flows demonstrates the predictive capability of the proposed framework for machine-learning-assisted turbulence modeling. By showing that it can improve the prediction of the mean flow field, the proposed stability-oriented machine learning framework bridges the gap between the existing machine-learning-assisted turbulence modeling approaches and the demand for predictive capability of turbulence models in real applications.

  9. Addressing uncertainty in atomistic machine learning

    DEFF Research Database (Denmark)

    Peterson, Andrew A.; Christensen, Rune; Khorshidi, Alireza

    2017-01-01

    Machine-learning regression has been demonstrated to precisely emulate the potential energy and forces that are output from more expensive electronic-structure calculations. However, to predict new regions of the potential energy surface, an assessment must be made of the credibility of the predictions. In this perspective, we address the types of errors that might arise in atomistic machine learning, the unique aspects of atomistic simulations that make machine-learning challenging, and highlight how uncertainty analysis can be used to assess the validity of machine-learning predictions. We suggest this will allow researchers to more fully use machine learning for the routine acceleration of large, high-accuracy, or extended-time simulations. In our demonstrations, we use a bootstrap ensemble of neural network-based calculators, and show that the width of the ensemble can provide an estimate...
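
    The bootstrap-ensemble idea mentioned in this abstract can be illustrated with a small sketch. It is not the authors' setup: polynomial fits stand in for the neural network-based calculators, and the toy data, ensemble size and polynomial degree are assumptions chosen for illustration. The point is only that the ensemble spread grows when a query lies outside the training region.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy training data confined to [-1, 1] (hypothetical).
x_train = rng.uniform(-1.0, 1.0, size=40)
y_train = np.sin(3.0 * x_train) + 0.05 * rng.normal(size=40)

def fit_bootstrap_ensemble(x, y, n_models=30, degree=5):
    """Fit one polynomial per bootstrap resample of the training set."""
    models = []
    for _ in range(n_models):
        idx = rng.integers(0, len(x), size=len(x))  # sample with replacement
        models.append(np.polyfit(x[idx], y[idx], degree))
    return models

def predict_with_spread(models, x):
    """Return the ensemble mean and the ensemble standard deviation."""
    preds = np.array([np.polyval(coeffs, x) for coeffs in models])
    return preds.mean(axis=0), preds.std(axis=0)

models = fit_bootstrap_ensemble(x_train, y_train)
_, spread_inside = predict_with_spread(models, np.array([0.0]))   # interpolation
_, spread_outside = predict_with_spread(models, np.array([2.5]))  # extrapolation
print(spread_inside[0], spread_outside[0])
```

    A query with a wide ensemble spread is one where the ensemble members disagree, which is the signal the abstract proposes to use when judging whether a prediction can be trusted.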

  10. Contemporary machine learning: techniques for practitioners in the physical sciences

    Science.gov (United States)

    Spears, Brian

    2017-10-01

    Machine learning is the science of using computers to find relationships in data without explicitly knowing or programming those relationships in advance. Often without realizing it, we employ machine learning every day as we use our phones or drive our cars. Over the last few years, machine learning has found increasingly broad application in the physical sciences. This most often involves building a model relationship between a dependent, measurable output and an associated set of controllable, but complicated, independent inputs. The methods are applicable both to experimental observations and to databases of simulated output from large, detailed numerical simulations. In this tutorial, we will present an overview of current tools and techniques in machine learning - a jumping-off point for researchers interested in using machine learning to advance their work. We will discuss supervised learning techniques for modeling complicated functions, beginning with familiar regression schemes, then advancing to more sophisticated decision trees, modern neural networks, and deep learning methods. Next, we will cover unsupervised learning and techniques for reducing the dimensionality of input spaces and for clustering data. We'll show example applications from both magnetic and inertial confinement fusion. Along the way, we will describe methods for practitioners to help ensure that their models generalize from their training data to as-yet-unseen test data. We will finally point out some limitations to modern machine learning and speculate on some ways that practitioners from the physical sciences may be particularly suited to help. This work was performed by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.

  11. Machine Learning Techniques for Optical Performance Monitoring from Directly Detected PDM-QAM Signals

    DEFF Research Database (Denmark)

    Thrane, Jakob; Wass, Jesper; Piels, Molly

    2017-01-01

    Linear signal processing algorithms are effective in dealing with linear transmission channel and linear signal detection, while the nonlinear signal processing algorithms, from the machine learning community, are effective in dealing with nonlinear transmission channel and nonlinear signal detection. In this paper, a brief overview of the various machine learning methods and their application in optical communication is presented and discussed. Moreover, supervised machine learning methods, such as neural networks and support vector machine, are experimentally demonstrated for in-band optical...

  12. Machining dynamics fundamentals, applications and practices

    CERN Document Server

    Cheng, Kai

    2008-01-01

    Machining dynamics are vital to the performance of machine tools and machining processes in manufacturing. This book discusses the state-of-the-art applications, practices and research in machining dynamics. It presents basic theory, analysis and control methodology. It is useful for manufacturing engineers, supervisors, engineers and designers.

  13. 2015 International Conference on Machine Learning and Signal Processing

    CERN Document Server

    Woo, Wai; Sulaiman, Hamzah; Othman, Mohd; Saat, Mohd

    2016-01-01

    This book presents important research findings and recent innovations in the field of machine learning and signal processing. A wide range of topics relating to machine learning and signal processing techniques and their applications are addressed in order to provide both researchers and practitioners with a valuable resource documenting the latest advances and trends. The book comprises a careful selection of the papers submitted to the 2015 International Conference on Machine Learning and Signal Processing (MALSIP 2015), which was held on 15–17 December 2015 in Ho Chi Minh City, Vietnam with the aim of offering researchers, academicians, and practitioners an ideal opportunity to disseminate their findings and achievements. All of the included contributions were chosen by expert peer reviewers from across the world on the basis of their interest to the community. In addition to presenting the latest in design, development, and research, the book provides access to numerous new algorithms for machine learni...

  14. Machine learning techniques for optical communication system optimization

    DEFF Research Database (Denmark)

    Zibar, Darko; Wass, Jesper; Thrane, Jakob

    In this paper, machine learning techniques relevant to optical communication are presented and discussed. The focus is on applying machine learning tools to optical performance monitoring and performance prediction.

  15. Machine learning methods without tears: a primer for ecologists.

    Science.gov (United States)

    Olden, Julian D; Lawler, Joshua J; Poff, N LeRoy

    2008-06-01

    Machine learning methods, a family of statistical techniques with origins in the field of artificial intelligence, are recognized as holding great promise for the advancement of understanding and prediction about ecological phenomena. These modeling techniques are flexible enough to handle complex problems with multiple interacting elements and typically outcompete traditional approaches (e.g., generalized linear models), making them ideal for modeling ecological systems. Despite their inherent advantages, a review of the literature reveals only a modest use of these approaches in ecology as compared to other disciplines. One potential explanation for this lack of interest is that machine learning techniques do not fall neatly into the class of statistical modeling approaches with which most ecologists are familiar. In this paper, we provide an introduction to three machine learning approaches that can be broadly used by ecologists: classification and regression trees, artificial neural networks, and evolutionary computation. For each approach, we provide a brief background to the methodology, give examples of its application in ecology, describe model development and implementation, discuss strengths and weaknesses, explore the availability of statistical software, and provide an illustrative example. Although the ecological application of machine learning approaches has increased, there remains considerable skepticism with respect to the role of these techniques in ecology. Our review encourages a greater understanding of machine learning approaches and promotes their future application and utilization, while also providing a basis from which ecologists can make informed decisions about whether to select or avoid these approaches in their future modeling endeavors.

  16. MACHINE LEARNING TECHNIQUES USED IN BIG DATA

    Directory of Open Access Journals (Sweden)

    STEFANIA LOREDANA NITA

    2016-07-01

    The classical tools used in data analysis are not sufficient to benefit from all the advantages of big data. The amount of information is too large for a complete investigation, and possible connections and relations between data could be missed, because it is difficult or even impossible to verify all assumptions about the information. Machine learning is a good way to find concealed correlations or relationships between data, because it runs at machine scale and works very well with large data sets. The more data we have, the more useful the machine learning algorithm is, because it “learns” from the existing data and applies the discovered rules to new entries. In this paper, we present some machine learning algorithms and techniques used in big data.

  17. Machine learning techniques for persuasion detection in conversation

    OpenAIRE

    Ortiz, Pedro.

    2010-01-01

    Approved for public release; distribution is unlimited. We determined that it is possible to automatically detect persuasion in conversations using three traditional machine learning techniques: naive Bayes, maximum entropy, and support vector machine. These results are the first of their kind and serve as a baseline for all future work in this field. The three techniques consistently outperformed the baseline F-score, but not at a level that would be useful for real world applications. The...

  18. Inverse analysis of turbidites by machine learning

    Science.gov (United States)

    Naruse, H.; Nakao, K.

    2017-12-01

    This study aims to propose a method to estimate paleo-hydraulic conditions of turbidity currents from ancient turbidites by using a machine-learning technique. In this method, numerical simulation is repeated under various initial conditions, producing a data set of characteristic features of turbidites. This data set is then used for supervised training of a deep-learning neural network (NN). Quantities of characteristic features of turbidites in the training data set are given to the input nodes of the NN, and the output nodes are expected to provide estimates of the initial conditions of the turbidity current. The weight coefficients of the NN are then optimized to reduce the root-mean-square of the difference between the true conditions and the output values of the NN. The method thus explores the empirical relationship between the numerical results and the initial conditions, and the discovered relationship is used for inversion of turbidity currents. This machine learning can potentially produce an NN that estimates paleo-hydraulic conditions from data of ancient turbidites. We produced a preliminary implementation of this methodology. A forward model based on 1D shallow-water equations with a correction for the density-stratification effect was employed. This model calculates the behavior of a surge-like turbidity current transporting mixed-size sediment, and outputs the spatial distribution of volume per unit area of each grain-size class on a uniform slope. The grain-size distribution was discretized into 3 classes. Numerical simulation was repeated 1000 times, and thus 1000 beds of turbidites were used as the training data for an NN that has 21000 input nodes and 5 output nodes with two hidden layers. After the machine learning finished, independent simulations were conducted 200 times in order to evaluate the performance of the NN. As a result of this test, the initial conditions of the validation data were successfully reconstructed by the NN. The estimated values show very small

  19. In silico machine learning methods in drug development.

    Science.gov (United States)

    Dobchev, Dimitar A; Pillai, Girinath G; Karelson, Mati

    2014-01-01

    Machine learning (ML) computational methods for predicting compounds with pharmacological activity, specific pharmacodynamic and ADMET (absorption, distribution, metabolism, excretion and toxicity) properties are being increasingly applied in drug discovery and evaluation. Recently, machine learning techniques such as artificial neural networks, support vector machines and genetic programming have been explored for predicting inhibitors, antagonists, blockers, agonists, activators and substrates of proteins related to specific therapeutic targets. These methods are particularly useful for screening compound libraries of diverse chemical structures, "noisy" and high-dimensional data to complement QSAR methods, and in cases of unavailable receptor 3D structure to complement structure-based methods. A variety of studies have demonstrated the potential of machine-learning methods for predicting compounds as potential drug candidates. The present review is intended to give an overview of the strategies and current progress in using machine learning methods for drug design and the potential of the respective model development tools. We also consider a number of applications of the machine learning algorithms based on common classes of diseases.

  20. A robust hybrid model integrating enhanced inputs based extreme learning machine with PLSR (PLSR-EIELM) and its application to intelligent measurement.

    Science.gov (United States)

    He, Yan-Lin; Geng, Zhi-Qiang; Xu, Yuan; Zhu, Qun-Xiong

    2015-09-01

    In this paper, a robust hybrid model integrating an enhanced-inputs-based extreme learning machine with partial least squares regression (PLSR-EIELM) is proposed. The proposed PLSR-EIELM model can overcome two main flaws of the extreme learning machine (ELM), i.e. the intractable problem of determining the optimal number of hidden layer neurons and the over-fitting phenomenon. First, a traditional ELM is selected. Second, the weights between the input layer and the hidden layer are assigned randomly, and the nonlinear transformation of the independent variables is obtained from the output of the hidden layer neurons. In particular, the original input variables are regarded as enhanced inputs; the enhanced inputs and the nonlinearly transformed variables are then combined into the whole set of independent variables. In this way, PLSR can be carried out to identify the PLS components not only from the nonlinearly transformed variables but also from the original input variables, which can remove the correlation among the whole set of independent variables and the expected outputs. Finally, the optimal relationship model between the whole set of independent variables and the expected outputs is obtained by using PLSR. Thus, the PLSR-EIELM model is developed. The PLSR-EIELM model then served as an intelligent measurement tool for the key variables of the Purified Terephthalic Acid (PTA) process and the High Density Polyethylene (HDPE) process. The experimental results show that the predictive accuracy of PLSR-EIELM is stable, which indicates that PLSR-EIELM is robust. Moreover, compared with ELM, PLSR, hierarchical ELM (HELM), and PLSR-ELM, PLSR-EIELM can achieve much smaller predicted relative errors in these two applications. Copyright © 2015 ISA. Published by Elsevier Ltd. All rights reserved.

  1. Using Machine Learning to Predict MCNP Bias

    Energy Technology Data Exchange (ETDEWEB)

    Grechanuk, Pavel Aleksandrovi [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)]

    2018-01-09

    For many real-world applications in radiation transport where simulations are compared to experimental measurements, like in nuclear criticality safety, the bias (simulated - experimental keff) in the calculation is an extremely important quantity used for code validation. The objective of this project is to accurately predict the bias of MCNP6 [1] criticality calculations using machine learning (ML) algorithms, with the intention of creating a tool that can complement the current nuclear criticality safety methods. In the latest release of MCNP6, the Whisper tool is available for criticality safety analysts and includes a large catalogue of experimental benchmarks, sensitivity profiles, and nuclear data covariance matrices. This data, coming from 1100+ benchmark cases, is used in this study of ML algorithms for criticality safety bias predictions.

  2. A distributed algorithm for machine learning

    Science.gov (United States)

    Chen, Shihong

    2018-04-01

    This paper considers a distributed learning problem in which a group of machines in a connected network, each learning its own local dataset, aim to reach a consensus at an optimal model, by exchanging information only with their neighbors but without transmitting data. A distributed algorithm is proposed to solve this problem under appropriate assumptions.
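
    The abstract does not specify the proposed algorithm, so the following is only a generic illustration of the setting it describes: nodes on a network agree on a common model by exchanging summaries with neighbors, never raw data. The sketch uses consensus averaging of least-squares statistics on a four-node ring; the network, the mixing weights, the toy data and the iteration count are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
n_nodes = 4

# Each node holds a private local dataset (A_i, b_i); toy data.
local_A = [rng.normal(size=(20, 2)) for _ in range(n_nodes)]
local_b = [A @ true_w + 0.01 * rng.normal(size=20) for A in local_A]

# Ring network: each node mixes its value with its two neighbors.
# W is symmetric and doubly stochastic, so repeated mixing converges to the average.
W = np.zeros((n_nodes, n_nodes))
for i in range(n_nodes):
    W[i, i] = 0.5
    W[i, (i - 1) % n_nodes] = 0.25
    W[i, (i + 1) % n_nodes] = 0.25

# Local summary statistics; raw samples never leave a node.
G = np.stack([A.T @ A for A in local_A])                    # (n_nodes, 2, 2)
h = np.stack([A.T @ b for A, b in zip(local_A, local_b)])   # (n_nodes, 2)

# Neighbor-only exchange: every round replaces each node's statistics
# with a weighted average of its own and its neighbors' statistics.
for _ in range(60):
    G = np.einsum("ij,jkl->ikl", W, G)
    h = W @ h

# Each node solves its own (now network-averaged) normal equations.
node_models = np.array([np.linalg.solve(G[i], h[i]) for i in range(n_nodes)])

# Reference: the solution a central server would get from the pooled data.
central, *_ = np.linalg.lstsq(np.vstack(local_A), np.concatenate(local_b), rcond=None)
print(node_models)
print(central)
```

    After mixing, every node holds the same averaged statistics, so all nodes recover the pooled least-squares model. This exactness is specific to linear least squares; general models would need an iterative scheme such as the one the paper proposes.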

  3. Interactive Algorithms for Unsupervised Machine Learning

    Science.gov (United States)

    2015-06-01

    in Neural Information Processing Systems, 2013. [3] Louigi Addario-Berry, Nicolas Broutin, Luc Devroye, and Gábor Lugosi. On combinatorial...Myung Jin Choi, Vincent Y F Tan, Animashree Anandkumar, and Alan S Willsky. Learning Latent Tree Graphical Models. Journal of Machine Learning

  4. Efficient tuning in supervised machine learning

    NARCIS (Netherlands)

    Koch, Patrick

    2013-01-01

    The tuning of learning algorithm parameters has become increasingly important in recent years. With the fast growth of computational power and available memory, databases have grown dramatically. This is very challenging for the tuning of parameters arising in machine learning, since the

  5. Machine learning a Bayesian and optimization perspective

    CERN Document Server

    Theodoridis, Sergios

    2015-01-01

    This tutorial text gives a unifying perspective on machine learning by covering both probabilistic and deterministic approaches, which rely on optimization techniques, as well as Bayesian inference, which is based on a hierarchy of probabilistic models. The book presents the major machine learning methods as they have been developed in different disciplines, such as statistics, statistical and adaptive signal processing and computer science. Focusing on the physical reasoning behind the mathematics, all the various methods and techniques are explained in depth, supported by examples and problems, giving an invaluable resource to the student and researcher for understanding and applying machine learning concepts. The book builds carefully from the basic classical methods to the most recent trends, with chapters written to be as self-contained as possible, making the text suitable for different courses: pattern recognition, statistical/adaptive signal processing, statistical/Bayesian learning, as well as shor...

  6. Machine learning: Trends, perspectives, and prospects.

    Science.gov (United States)

    Jordan, M I; Mitchell, T M

    2015-07-17

    Machine learning addresses the question of how to build computers that improve automatically through experience. It is one of today's most rapidly growing technical fields, lying at the intersection of computer science and statistics, and at the core of artificial intelligence and data science. Recent progress in machine learning has been driven both by the development of new learning algorithms and theory and by the ongoing explosion in the availability of online data and low-cost computation. The adoption of data-intensive machine-learning methods can be found throughout science, technology and commerce, leading to more evidence-based decision-making across many walks of life, including health care, manufacturing, education, financial modeling, policing, and marketing. Copyright © 2015, American Association for the Advancement of Science.

  7. Machine learning techniques for the optimization of joint replacements: Application to a short-stem hip implant.

    Science.gov (United States)

    Cilla, Myriam; Borgiani, Edoardo; Martínez, Javier; Duda, Georg N; Checa, Sara

    2017-01-01

    Today, different implant designs exist in the market; however, there is no clear understanding of which implant design parameters are best for achieving mechanically optimal conditions. Therefore, the aim of this project was to investigate whether the geometry of a commercial short-stem hip prosthesis can be further optimized to reduce stress shielding effects and achieve better short-stemmed implant performance. To this end, machine learning techniques were combined with parametric Finite Element analysis. The selected implant geometrical parameters were: total stem length (L), thickness in the lateral (R1) and medial (R2) directions, and the distance between the implant neck and the central stem surface (D). The results show that the total stem length was not the only parameter playing a role in stress shielding. An optimized implant should aim for a decreased stem length and a reduced length of the surface in contact with the bone. The two radii that characterize the stem width at the distal cross-section in contact with the bone were less influential in the reduction of stress shielding than the other two parameters, but they also play a role, with thinner stems presenting better results.

  8. Is demography destiny? Application of machine learning techniques to accurately predict population health outcomes from a minimal demographic dataset.

    Directory of Open Access Journals (Sweden)

    Wei Luo

    For years, we have relied on population surveys to keep track of regional public health statistics, including the prevalence of non-communicable diseases. Because of the cost and limitations of such surveys, we often do not have the up-to-date data on health outcomes of a region. In this paper, we examined the feasibility of inferring regional health outcomes from socio-demographic data that are widely available and timely updated through national censuses and community surveys. Using data for 50 American states (excluding Washington DC) from 2007 to 2012, we constructed a machine-learning model to predict the prevalence of six non-communicable disease (NCD) outcomes (four NCDs and two major clinical risk factors), based on population socio-demographic characteristics from the American Community Survey. We found that regional prevalence estimates for non-communicable diseases can be reasonably predicted. The predictions were highly correlated with the observed data, in both the states included in the derivation model (median correlation 0.88) and those excluded from the development for use as a completely separated validation sample (median correlation 0.85), demonstrating that the model had sufficient external validity to make good predictions, based on demographics alone, for areas not included in the model development. This highlights both the utility of this sophisticated approach to model development, and the vital importance of simple socio-demographic characteristics as both indicators and determinants of chronic disease.

  9. Stochastic weather inputs for improved urban water demand forecasting: application of nonlinear input variable selection and machine learning methods

    Science.gov (United States)

    Quilty, J.; Adamowski, J. F.

    2015-12-01

    Urban water supply systems are often stressed during seasonal outdoor water use, as water demands related to the climate are variable in nature, making it difficult to optimize the operation of the water supply system. Urban water demand (UWD) forecasts that fail to include meteorological conditions as inputs to the forecast model may be poor, as they cannot account for the increase or decrease in demand related to meteorological conditions. Meteorological records stochastically simulated into the future can be used as inputs to data-driven UWD forecasts, generally resulting in improved forecast accuracy. This study aims to produce data-driven UWD forecasts for two different Canadian water utilities (Montreal and Victoria) using machine learning methods by first selecting historical UWD and meteorological records derived from a stochastic weather generator using nonlinear input variable selection. The nonlinear input variable selection methods considered in this work are derived from the concept of conditional mutual information, a nonlinear dependency measure that is based on (multivariate) probability density functions and accounts for relevancy, conditional relevancy, and redundancy within a potential set of input variables. The results of our study indicate that stochastic weather inputs can improve UWD forecast accuracy for the two sites considered in this work. Nonlinear input variable selection is suggested as a means to identify which meteorological conditions should be utilized in the forecast.

  10. Is demography destiny? Application of machine learning techniques to accurately predict population health outcomes from a minimal demographic dataset.

    Science.gov (United States)

    Luo, Wei; Nguyen, Thin; Nichols, Melanie; Tran, Truyen; Rana, Santu; Gupta, Sunil; Phung, Dinh; Venkatesh, Svetha; Allender, Steve

    2015-01-01

    For years, we have relied on population surveys to keep track of regional public health statistics, including the prevalence of non-communicable diseases. Because of the cost and limitations of such surveys, we often do not have the up-to-date data on health outcomes of a region. In this paper, we examined the feasibility of inferring regional health outcomes from socio-demographic data that are widely available and timely updated through national censuses and community surveys. Using data for 50 American states (excluding Washington DC) from 2007 to 2012, we constructed a machine-learning model to predict the prevalence of six non-communicable disease (NCD) outcomes (four NCDs and two major clinical risk factors), based on population socio-demographic characteristics from the American Community Survey. We found that regional prevalence estimates for non-communicable diseases can be reasonably predicted. The predictions were highly correlated with the observed data, in both the states included in the derivation model (median correlation 0.88) and those excluded from the development for use as a completely separated validation sample (median correlation 0.85), demonstrating that the model had sufficient external validity to make good predictions, based on demographics alone, for areas not included in the model development. This highlights both the utility of this sophisticated approach to model development, and the vital importance of simple socio-demographic characteristics as both indicators and determinants of chronic disease.

  11. Advances in Machine Learning and Data Mining for Astronomy

    Science.gov (United States)

    Way, Michael J.; Scargle, Jeffrey D.; Ali, Kamal M.; Srivastava, Ashok N.

    2012-03-01

    Advances in Machine Learning and Data Mining for Astronomy documents numerous successful collaborations among computer scientists, statisticians, and astronomers who illustrate the application of state-of-the-art machine learning and data mining techniques in astronomy. Due to the massive amount and complexity of data in most scientific disciplines, the material discussed in this text transcends traditional boundaries between various areas in the sciences and computer science. The book's introductory part provides context to issues in the astronomical sciences that are also important to health, social, and physical sciences, particularly probabilistic and statistical aspects of classification and cluster analysis. The next part describes a number of astrophysics case studies that leverage a range of machine learning and data mining technologies. In the last part, developers of algorithms and practitioners of machine learning and data mining show how these tools and techniques are used in astronomical applications. With contributions from leading astronomers and computer scientists, this book is a practical guide to many of the most important developments in machine learning, data mining, and statistics. It explores how these advances can solve current and future problems in astronomy and looks at how they could lead to the creation of entirely new algorithms within the data mining community.

  12. Trends in extreme learning machines: a review.

    Science.gov (United States)

    Huang, Gao; Huang, Guang-Bin; Song, Shiji; You, Keyou

    2015-01-01

    Extreme learning machine (ELM) has gained increasing interest from various research fields recently. In this review, we aim to report the current state of the theoretical research and practical advances on this subject. We first give an overview of ELM from the theoretical perspective, including the interpolation theory, universal approximation capability, and generalization ability. Then we focus on the various improvements made to ELM which further improve its stability, sparsity and accuracy under general or specific conditions. Apart from classification and regression, ELM has recently been extended for clustering, feature selection, representational learning and many other learning tasks. These newly emerging algorithms greatly expand the applications of ELM. From the implementation aspect, hardware implementation and parallel computation techniques have substantially sped up the training of ELM, making it feasible for big data processing and real-time reasoning. Due to its remarkable efficiency, simplicity, and impressive generalization performance, ELM has been applied in a variety of domains, such as biomedical engineering, computer vision, system identification, and control and robotics. In this review, we try to provide a comprehensive view of these advances in ELM together with its future perspectives.

  13. A Machine Learning Application Based in Random Forest for Integrating Mass Spectrometry-Based Metabolomic Data: A Simple Screening Method for Patients With Zika Virus

    Directory of Open Access Journals (Sweden)

    Carlos Fernando Odir Rodrigues Melo

    2018-04-01

    Recent Zika outbreaks in South America, accompanied by unexpectedly severe clinical complications, have brought much interest in fast and reliable screening methods for ZIKV (Zika virus) identification. Reverse-transcriptase polymerase chain reaction (RT-PCR) is currently the method of choice to detect ZIKV in biological samples. This approach, nonetheless, demands a considerable amount of time and resources such as kits and reagents that, in endemic areas, may result in a substantial financial burden over affected individuals and health services veering away from RT-PCR analysis. This study presents a powerful combination of high-resolution mass spectrometry and a machine-learning prediction model for data analysis to assess the existence of ZIKV infection across a series of patients that bear similar symptomatic conditions but are not necessarily infected with the disease. By using mass spectrometric data as input to the developed decision-making algorithm, we were able to provide a set of features that work as a “fingerprint” for this specific pathophysiological condition, even after the acute phase of infection. Since both mass spectrometry and machine learning approaches are well-established and have largely utilized tools within their respective fields, this combination of methods emerges as a distinct alternative for clinical applications, providing a diagnostic screening, faster and more accurate, with improved cost-effectiveness when compared to existing technologies.

  14. A Machine Learning Application Based in Random Forest for Integrating Mass Spectrometry-Based Metabolomic Data: A Simple Screening Method for Patients With Zika Virus.

    Science.gov (United States)

    Melo, Carlos Fernando Odir Rodrigues; Navarro, Luiz Claudio; de Oliveira, Diogo Noin; Guerreiro, Tatiane Melina; Lima, Estela de Oliveira; Delafiori, Jeany; Dabaja, Mohamed Ziad; Ribeiro, Marta da Silva; de Menezes, Maico; Rodrigues, Rafael Gustavo Martins; Morishita, Karen Noda; Esteves, Cibele Zanardi; de Amorim, Aline Lopes Lucas; Aoyagui, Caroline Tiemi; Parise, Pierina Lorencini; Milanez, Guilherme Paier; do Nascimento, Gabriela Mansano; Ribas Freitas, André Ricardo; Angerami, Rodrigo; Costa, Fábio Trindade Maranhão; Arns, Clarice Weis; Resende, Mariangela Ribeiro; Amaral, Eliana; Junior, Renato Passini; Ribeiro-do-Valle, Carolina C; Milanez, Helaine; Moretti, Maria Luiza; Proenca-Modena, Jose Luiz; Avila, Sandra; Rocha, Anderson; Catharino, Rodrigo Ramos

    2018-01-01

    Recent Zika outbreaks in South America, accompanied by unexpectedly severe clinical complications, have brought much interest in fast and reliable screening methods for ZIKV (Zika virus) identification. Reverse-transcriptase polymerase chain reaction (RT-PCR) is currently the method of choice to detect ZIKV in biological samples. This approach, nonetheless, demands a considerable amount of time and resources such as kits and reagents that, in endemic areas, may result in a substantial financial burden over affected individuals and health services veering away from RT-PCR analysis. This study presents a powerful combination of high-resolution mass spectrometry and a machine-learning prediction model for data analysis to assess the existence of ZIKV infection across a series of patients that bear similar symptomatic conditions but are not necessarily infected with the disease. By using mass spectrometric data as input to the developed decision-making algorithm, we were able to provide a set of features that work as a "fingerprint" for this specific pathophysiological condition, even after the acute phase of infection. Since both mass spectrometry and machine learning approaches are well-established and have largely utilized tools within their respective fields, this combination of methods emerges as a distinct alternative for clinical applications, providing a diagnostic screening, faster and more accurate, with improved cost-effectiveness when compared to existing technologies.

  15. Machine learning molecular dynamics for the simulation of infrared spectra.

    Science.gov (United States)

    Gastegger, Michael; Behler, Jörg; Marquetand, Philipp

    2017-10-01

    Machine learning has emerged as an invaluable tool in many research areas. In the present work, we harness this power to predict highly accurate molecular infrared spectra with unprecedented computational efficiency. To account for vibrational anharmonic and dynamical effects - typically neglected by conventional quantum chemistry approaches - we base our machine learning strategy on ab initio molecular dynamics simulations. While these simulations are usually extremely time consuming even for small molecules, we overcome these limitations by leveraging the power of a variety of machine learning techniques, not only accelerating simulations by several orders of magnitude, but also greatly extending the size of systems that can be treated. To this end, we develop a molecular dipole moment model based on environment dependent neural network charges and combine it with the neural network potential approach of Behler and Parrinello. Contrary to the prevalent big data philosophy, we are able to obtain very accurate machine learning models for the prediction of infrared spectra based on only a few hundred electronic structure reference points. This is made possible through the use of molecular forces during neural network potential training and the introduction of a fully automated sampling scheme. We demonstrate the power of our machine learning approach by applying it to model the infrared spectra of a methanol molecule, n-alkanes containing up to 200 atoms and the protonated alanine tripeptide, which at the same time represents the first application of machine learning techniques to simulate the dynamics of a peptide. In all of these case studies we find an excellent agreement between the infrared spectra predicted via machine learning models and the respective theoretical and experimental spectra.

  16. Teraflop-scale Incremental Machine Learning

    OpenAIRE

    Özkural, Eray

    2011-01-01

    We propose a long-term memory design for artificial general intelligence based on Solomonoff's incremental machine learning methods. We use R5RS Scheme and its standard library with a few omissions as the reference machine. We introduce a Levin Search variant based on Stochastic Context Free Grammar together with four synergistic update algorithms that use the same grammar as a guiding probability distribution of programs. The update algorithms include adjusting production probabilities, re-u...

  17. Learning to discover: machine learning in high-energy physics

    CERN Multimedia

    CERN. Geneva

    2014-01-01

    In this talk we will survey some of the latest developments in machine learning research through the lens of potential applications in high-energy physics. We will then describe three ongoing projects in detail. The main subject of the talk is the data challenge we are organizing with ATLAS on optimizing the discovery significance for the Higgs to tau-tau channel. Second, we describe our collaboration with the LHCb experiment on designing and optimizing fast multi-variate techniques that can be implemented as online classifiers in triggers. Finally, we will sketch a relatively young project with the ILC (Calice) group in which we are attempting to apply deep learning techniques for inference on imaging calorimeter data.

  18. Machine Learning Approaches in Cardiovascular Imaging.

    Science.gov (United States)

    Henglin, Mir; Stein, Gillian; Hushcha, Pavel V; Snoek, Jasper; Wiltschko, Alexander B; Cheng, Susan

    2017-10-01

    Cardiovascular imaging technologies continue to increase in their capacity to capture and store large quantities of data. Modern computational methods, developed in the field of machine learning, offer new approaches to leveraging the growing volume of imaging data available for analyses. Machine learning methods can now address data-related problems ranging from simple analytic queries of existing measurement data to the more complex challenges involved in analyzing raw images. To date, machine learning has been used in 2 broad and highly interconnected areas: automation of tasks that might otherwise be performed by a human and generation of clinically important new knowledge. Most cardiovascular imaging studies have focused on task-oriented problems, but more studies involving algorithms aimed at generating new clinical insights are emerging. Continued expansion in the size and dimensionality of cardiovascular imaging databases is driving strong interest in applying powerful deep learning methods, in particular, to analyze these data. Overall, the most effective approaches will require an investment in the resources needed to appropriately prepare such large data sets for analyses. Notwithstanding current technical and logistical challenges, machine learning and especially deep learning methods have much to offer and will substantially impact the future practice and science of cardiovascular imaging. © 2017 American Heart Association, Inc.

  19. Machine learning for identifying botnet network traffic

    DEFF Research Database (Denmark)

    Stevanovic, Matija; Pedersen, Jens Myrup

    2013-01-01

    Due to the promise of non-invasive and resilient detection, botnet detection based on network traffic analysis has drawn special attention from the research community. Furthermore, many authors have turned their attention to the use of machine learning algorithms as a means of inferring botnet-related knowledge from the monitored traffic. This paper presents a review of contemporary botnet detection methods that use machine learning as a tool for identifying botnet-related traffic. The main goal of the paper is to provide a comprehensive overview of the field by summarizing current scientific efforts. The contribution of the paper is three-fold. First, the paper provides a detailed insight into the existing detection methods by investigating which bot-related heuristics were assumed by the detection systems and how different machine learning techniques were adapted in order to capture botnet-related knowledge...

  20. Parsimonious Wavelet Kernel Extreme Learning Machine

    Directory of Open Access Journals (Sweden)

    Wang Qin

    2015-11-01

    In this study, a parsimonious scheme for wavelet kernel extreme learning machine (named PWKELM) was introduced by combining wavelet theory and a parsimonious algorithm into kernel extreme learning machine (KELM). In the wavelet analysis, bases that were localized in time and frequency were used to represent various signals effectively. Wavelet kernel extreme learning machine (WELM) maximized its capability to capture the essential features in “frequency-rich” signals. The proposed parsimonious algorithm also incorporated significant wavelet kernel functions iteratively by means of the Householder matrix, thus producing a sparse solution that eased the computational burden and improved numerical stability. The experimental results obtained from the synthetic dataset and a gas furnace instance demonstrated that the proposed PWKELM is efficient and feasible in terms of improving generalization accuracy and real-time performance.

  1. Introduction to machine learning for brain imaging.

    Science.gov (United States)

    Lemm, Steven; Blankertz, Benjamin; Dickhaus, Thorsten; Müller, Klaus-Robert

    2011-05-15

    Machine learning and pattern recognition algorithms have in the past years developed into a workhorse in brain imaging and the computational neurosciences, as they are instrumental for mining vast amounts of neural data of ever increasing measurement precision and detecting minuscule signals from an overwhelming noise floor. They provide the means to decode and characterize task-relevant brain states and to distinguish them from non-informative brain signals. While this machinery has undoubtedly helped to gain novel biological insights, it also holds the danger of potential unintentional abuse. Ideally, machine learning techniques should be usable by any non-expert; unfortunately, they typically are not. Overfitting and other pitfalls may occur and lead to spurious and nonsensical interpretation. The goal of this review is therefore to provide an accessible and clear introduction to the strengths and also the inherent dangers of machine learning usage in the neurosciences. Copyright © 2010 Elsevier Inc. All rights reserved.

  2. Machine learning in the string landscape

    Science.gov (United States)

    Carifio, Jonathan; Halverson, James; Krioukov, Dmitri; Nelson, Brent D.

    2017-09-01

    We utilize machine learning to study the string landscape. Deep data dives and conjecture generation are proposed as useful frameworks for utilizing machine learning in the landscape, and examples of each are presented. A decision tree accurately predicts the number of weak Fano toric threefolds arising from reflexive polytopes, each of which determines a smooth F-theory compactification, and linear regression generates a previously proven conjecture for the gauge group rank in an ensemble of $\frac{4}{3} \times 2.96 \times 10^{755}$ F-theory compactifications. Logistic regression generates a new conjecture for when $E_6$ arises in the large ensemble of F-theory compactifications, which is then rigorously proven. This result may be relevant for the appearance of visible sectors in the ensemble. Through conjecture generation, machine learning is useful not only for numerics, but also for rigorous results.

  3. Assessing the Ability of Vegetation Indices to Identify Shallow Subsurface Water Flow Pathways from Hyperspectral Imagery Using Machine Learning: Application

    Science.gov (United States)

    Doctor, K.; Byers, J. M.

    2017-12-01

    Shallow underground water flow pathways expressed as slight depressions are common in the land surface. Under conditions of saturated overland flow, such as during heavy rain or snow melt, these areas of preferential flow might appear on the surface as very shallow flowing streams. When there is no water flowing in these ephemeral channels it can be difficult to identify them. It is especially difficult to discern the slight depressions above the subsurface water flow pathways (SWFP) when the area is covered by vegetation. Since the soil moisture content in these SWFP is often greater than the surrounding area, the vegetation growing on top of these channels shows different vigor and moisture content than the vegetation growing above the non-SWFP area. Vegetation indices (VI) are used in visible and near infrared (VNIR) hyperspectral imagery to enhance biophysical properties of vegetation, and so the brightness values between vegetation atop SWFP and the surrounding vegetation were highlighted. We performed supervised machine learning using ground-truth class labels to determine the conditional probability of a SWFP at a given pixel given either the spectral distribution or VI at that pixel. The training data estimates the probability distributions to a determined finite sampling accuracy for a binary Naïve Bayes classifier between SWFP and non-SWFP. The ground-truth data provides a test bed for understanding the ability to build SWFP classifiers using hyperspectral imagery. SWFP were distinguishable in the imagery within corn and grass fields and in areas with low-lying vegetation. However, the training data is limited to particular types of terrain and vegetation cover in the Shenandoah Valley, Virginia and this would limit the resulting classifier. Further training data could extend its use to other environments.
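
    The classification step described in this abstract, estimating the probability of a SWFP at a pixel given a vegetation index, might look roughly like the sketch below. It is an illustration under stated assumptions rather than the authors' method: NDVI is used as the example index, each class-conditional distribution is assumed Gaussian, the training values are synthetic, and the function names are hypothetical. With a single feature the "naive" independence assumption plays no role; with several features the per-feature likelihoods would be multiplied.

```python
import math


def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def fit_gaussian(values):
    """Estimate mean and variance of one class-conditional distribution."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, max(var, 1e-9)  # guard against zero variance


def gaussian_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def posterior_swfp(x, swfp_values, other_values):
    """P(SWFP | index value x) by Bayes' rule with empirical class priors."""
    prior = len(swfp_values) / (len(swfp_values) + len(other_values))
    joint_swfp = gaussian_pdf(x, *fit_gaussian(swfp_values)) * prior
    joint_other = gaussian_pdf(x, *fit_gaussian(other_values)) * (1 - prior)
    return joint_swfp / (joint_swfp + joint_other)


# Synthetic, made-up training values: wetter SWFP pixels get a higher index.
swfp_train = [0.78, 0.81, 0.80, 0.83, 0.79]
other_train = [0.60, 0.63, 0.58, 0.65, 0.61]
p_wet = posterior_swfp(0.80, swfp_train, other_train)
p_dry = posterior_swfp(0.60, swfp_train, other_train)
```

    A pixel would be labelled SWFP when the posterior exceeds a chosen threshold such as 0.5; as the abstract notes, a classifier trained this way only reflects the terrain and vegetation types present in its training data.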

  4. Predicting Solar Activity Using Machine-Learning Methods

    Science.gov (United States)

    Bobra, M.

    2017-12-01

    Of all the activity observed on the Sun, two of the most energetic events are flares and coronal mass ejections. However, we do not, as of yet, fully understand the physical mechanism that triggers solar eruptions. A machine-learning algorithm, which is favorable in cases where the amount of data is large, is one way to [1] empirically determine the signatures of this mechanism in solar image data and [2] use them to predict solar activity. In this talk, we discuss the application of various machine learning algorithms - specifically, a Support Vector Machine, a sparse linear regression (Lasso), and a Convolutional Neural Network - to image data from the photosphere, chromosphere, transition region, and corona taken by instruments aboard the Solar Dynamics Observatory in order to predict solar activity on a variety of time scales. Such an approach may be useful since, at the present time, there are no physical models of flares available for real-time prediction. We discuss our results (Bobra and Couvidat, 2015; Bobra and Ilonidis, 2016; Jonas et al., 2017) as well as other attempts to predict flares using machine learning (e.g. Ahmed et al., 2013; Nishizuka et al., 2017) and compare these results with the more traditional techniques used by the NOAA Space Weather Prediction Center (Crown, 2012). We also discuss some of the challenges in using machine-learning algorithms for space science applications.

  5. Advances in machine learning and data mining for astronomy

    CERN Document Server

    Way, Michael J

    2012-01-01

    Advances in Machine Learning and Data Mining for Astronomy documents numerous successful collaborations among computer scientists, statisticians, and astronomers who illustrate the application of state-of-the-art machine learning and data mining techniques in astronomy. Due to the massive amount and complexity of data in most scientific disciplines, the material discussed in this text transcends traditional boundaries between various areas in the sciences and computer science. The book's introductory part provides context to issues in the astronomical sciences that are also important to health

  6. Machine learning techniques to examine large patient databases.

    Science.gov (United States)

    Meyfroidt, Geert; Güiza, Fabian; Ramon, Jan; Bruynooghe, Maurice

    2009-03-01

    Computerization in healthcare in general, and in the operating room (OR) and intensive care unit (ICU) in particular, is on the rise. This leads to large patient databases, with specific properties. Machine learning techniques are able to examine and to extract knowledge from large databases in an automatic way. Although the number of potential applications for these techniques in medicine is large, few medical doctors are familiar with their methodology, advantages and pitfalls. A general overview of machine learning techniques, with a more detailed discussion of some of these algorithms, is presented in this review.

  7. Advances in independent component analysis and learning machines

    CERN Document Server

    Bingham, Ella; Laaksonen, Jorma; Lampinen, Jouko

    2015-01-01

    In honour of Professor Erkki Oja, one of the pioneers of Independent Component Analysis (ICA), this book reviews key advances in the theory and application of ICA, as well as its influence on signal processing, pattern recognition, machine learning, and data mining. Examples of topics that have developed from the advances of ICA and are covered in the book are: a unifying probabilistic model for PCA and ICA; optimization methods for matrix decompositions; insights into the FastICA algorithm; unsupervised deep learning; machine vision and image retrieval; a review of developments in the t

  8. Thutmose - Investigation of Machine Learning-Based Intrusion Detection Systems

    Science.gov (United States)

    2016-06-01

    monitoring. This analyzed payload is within the application layer of the OSI model . The analysis tries to establish whether or not the payload is...24 3.2.5 Model Drift Experiments...ADVERSARIAL ENVIRONMENTS (SPIE DSS 2014) .................................................. 58 APPENDIX C - EVALUATING MODEL DRIFT IN MACHINE LEARNING

  9. Simulation-driven machine learning: Bearing fault classification

    Science.gov (United States)

    Sobie, Cameron; Freitas, Carina; Nicolai, Mike

    2018-01-01

    Increasing the accuracy of mechanical fault detection has the potential to improve system safety and economic performance by minimizing scheduled maintenance and the probability of unexpected system failure. Advances in computational performance have enabled the application of machine learning algorithms across numerous applications including condition monitoring and failure detection. Past applications of machine learning to physical failure have relied explicitly on historical data, which limits the feasibility of this approach to in-service components with extended service histories. Furthermore, recorded failure data is often only valid for the specific circumstances and components for which it was collected. This work directly addresses these challenges for roller bearings with race faults by generating training data using information gained from high resolution simulations of roller bearing dynamics, which is used to train machine learning algorithms that are then validated against four experimental datasets. Several different machine learning methodologies are compared starting from well-established statistical feature-based methods to convolutional neural networks, and a novel application of dynamic time warping (DTW) to bearing fault classification is proposed as a robust, parameter free method for race fault detection.
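
    The dynamic time warping idea mentioned in the abstract above can be illustrated with a minimal sketch. This is the textbook DTW recurrence combined with a nearest-template rule, not the authors' pipeline; the function names, the toy templates and the absolute-difference local cost are assumptions made for illustration only.

```python
def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # local cost (assumed: absolute difference)
            cost[i][j] = d + min(cost[i - 1][j],      # step in a only
                                 cost[i][j - 1],      # step in b only
                                 cost[i - 1][j - 1])  # step in both
    return cost[n][m]


def classify_1nn(signal, templates):
    """Return the label of the template with the smallest DTW distance."""
    return min(templates, key=lambda label: dtw_distance(signal, templates[label]))


# Hypothetical toy templates: a flat signal and one with periodic impulses.
templates = {
    "healthy": [0, 0, 0, 0, 0, 0],
    "race_fault": [0, 3, 0, 0, 3, 0],
}
# Same impulse pattern as the fault template, but stretched in time.
measured = [0, 0, 3, 0, 0, 0, 3, 0]
label = classify_1nn(measured, templates)
```

    Because DTW allows one sample to align with several samples of the other sequence, the stretched signal matches the fault template at zero cost, which is the property that makes the method tolerant to speed variations without tuning parameters.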

  10. Machine learning and data science in soft materials engineering.

    Science.gov (United States)

    Ferguson, Andrew L

    2018-01-31

    In many branches of materials science it is now routine to generate data sets of such large size and dimensionality that conventional methods of analysis fail. Paradigms and tools from data science and machine learning can provide scalable approaches to identify and extract trends and patterns within voluminous data sets, perform guided traversals of high-dimensional phase spaces, and furnish data-driven strategies for inverse materials design. This topical review provides an accessible introduction to machine learning tools in the context of soft and biological materials by 'de-jargonizing' data science terminology, presenting a taxonomy of machine learning techniques, and surveying the mathematical underpinnings and software implementations of popular tools, including principal component analysis, independent component analysis, diffusion maps, support vector machines, and relative entropy. We present illustrative examples of machine learning applications in soft matter, including inverse design of self-assembling materials, nonlinear learning of protein folding landscapes, high-throughput antimicrobial peptide design, and data-driven materials design engines. We close with an outlook on the challenges and opportunities for the field.

  11. Inverse Problems in Geodynamics Using Machine Learning Algorithms

    Science.gov (United States)

    Shahnas, M. H.; Yuen, D. A.; Pysklywec, R. N.

    2018-01-01

    During the past few decades numerical studies have been widely employed to explore the style of circulation and mixing in the mantle of Earth and other planets. However, in geodynamical studies there are many properties from mineral physics, geochemistry, and petrology in these numerical models. Machine learning, as a computational statistics-related technique and a subfield of artificial intelligence, has rapidly emerged recently in many fields of science and engineering. We focus here on the application of supervised machine learning (SML) algorithms in predictions of mantle flow processes. Specifically, we concentrate on estimating mantle properties by employing machine learning techniques in solving an inverse problem. Using snapshots of numerical convection models as training samples, we enable machine learning models to determine the magnitude of the spin transition-induced density anomalies that can cause flow stagnation at midmantle depths. Employing support vector machine algorithms, we show that SML techniques can successfully predict the magnitude of mantle density anomalies and can also be used in characterizing mantle flow patterns. The technique can be extended to more complex geodynamic problems in mantle dynamics by employing deep learning algorithms for putting constraints on properties such as viscosity, elastic parameters, and the nature of thermal and chemical anomalies.

  12. Machine learning and data science in soft materials engineering

    Science.gov (United States)

    Ferguson, Andrew L.

    2018-01-01

    In many branches of materials science it is now routine to generate data sets of such large size and dimensionality that conventional methods of analysis fail. Paradigms and tools from data science and machine learning can provide scalable approaches to identify and extract trends and patterns within voluminous data sets, perform guided traversals of high-dimensional phase spaces, and furnish data-driven strategies for inverse materials design. This topical review provides an accessible introduction to machine learning tools in the context of soft and biological materials by ‘de-jargonizing’ data science terminology, presenting a taxonomy of machine learning techniques, and surveying the mathematical underpinnings and software implementations of popular tools, including principal component analysis, independent component analysis, diffusion maps, support vector machines, and relative entropy. We present illustrative examples of machine learning applications in soft matter, including inverse design of self-assembling materials, nonlinear learning of protein folding landscapes, high-throughput antimicrobial peptide design, and data-driven materials design engines. We close with an outlook on the challenges and opportunities for the field.

  13. Machine learning for adaptive many-core machines a practical approach

    CERN Document Server

    Lopes, Noel

    2015-01-01

    The overwhelming data produced every day and the increasing performance and cost requirements of applications are transversal to a wide range of activities in society, from science to industry. In particular, the magnitude and complexity of the tasks that Machine Learning (ML) algorithms have to solve are driving the need to devise adaptive many-core machines that scale well with the volume of data, or in other words, can handle Big Data. This book gives a concise view on how to extend the applicability of well-known ML algorithms on Graphics Processing Units (GPUs) with data scalability in mind.

  14. Machine Learning Optimization of Evolvable Artificial Cells

    DEFF Research Database (Denmark)

    Caschera, F.; Rasmussen, S.; Hanczyc, M.

    2011-01-01

    can be explored. A machine learning approach (Evo-DoE) could be applied to explore this experimental space and define optimal interactions according to a specific fitness function. Herein an implementation of an evolutionary design of experiments to optimize chemical and biochemical systems based...... on a machine learning process is presented. The optimization proceeds over generations of experiments in iterative loop until optimal compositions are discovered. The fitness function is experimentally measured every time the loop is closed. Two examples of complex systems, namely a liposomal drug formulation...

  15. Recent Advances in Predictive (Machine) Learning

    Energy Technology Data Exchange (ETDEWEB)

    Friedman, J

    2004-01-24

    Prediction involves estimating the unknown value of an attribute of a system under study given the values of other measured attributes. In predictive (machine) learning the prediction rule is derived from data consisting of previously solved cases. Most methods for predictive learning originated many years ago at the dawn of the computer age. Recently two new techniques have emerged that have revitalized the field. These are support vector machines and boosted decision trees. This paper provides an introduction to these two new methods, tracing their respective ancestral roots to standard kernel methods and ordinary decision trees.

  16. Machine learning for Big Data analytics in plants.

    Science.gov (United States)

    Ma, Chuang; Zhang, Hao Helen; Wang, Xiangfeng

    2014-12-01

    Rapid advances in high-throughput genomic technology have enabled biology to enter the era of 'Big Data' (large datasets). The plant science community not only needs to build its own Big-Data-compatible parallel computing and data management infrastructures, but also to seek novel analytical paradigms to extract information from the overwhelming amounts of data. Machine learning offers promising computational and analytical solutions for the integrative analysis of large, heterogeneous and unstructured datasets on the Big-Data scale, and is gradually gaining popularity in biology. This review introduces the basic concepts and procedures of machine-learning applications and envisages how machine learning could interface with Big Data technology to facilitate basic research and biotechnology in the plant sciences. Copyright © 2014 Elsevier Ltd. All rights reserved.

  17. A strategy for quantum algorithm design assisted by machine learning

    International Nuclear Information System (INIS)

    Bang, Jeongho; Lee, Jinhyoung; Ryu, Junghee; Yoo, Seokwon; Pawłowski, Marcin

    2014-01-01

    We propose a method for quantum algorithm design assisted by machine learning. The method uses a quantum–classical hybrid simulator, where a ‘quantum student’ is being taught by a ‘classical teacher’. In other words, in our method, the learning system is supposed to evolve into a quantum algorithm for a given problem, assisted by a classical main-feedback system. Our method is applicable for designing quantum oracle-based algorithms. We chose, as a case study, an oracle decision problem, called a Deutsch–Jozsa problem. We showed by using Monte Carlo simulations that our simulator can faithfully learn a quantum algorithm for solving the problem for a given oracle. Remarkably, the learning time is proportional to the square root of the total number of parameters, rather than showing the exponential dependence found in the classical machine learning-based method. (paper)

  18. A strategy for quantum algorithm design assisted by machine learning

    Science.gov (United States)

    Bang, Jeongho; Ryu, Junghee; Yoo, Seokwon; Pawłowski, Marcin; Lee, Jinhyoung

    2014-07-01

    We propose a method for quantum algorithm design assisted by machine learning. The method uses a quantum-classical hybrid simulator, where a ‘quantum student’ is being taught by a ‘classical teacher’. In other words, in our method, the learning system is supposed to evolve into a quantum algorithm for a given problem, assisted by a classical main-feedback system. Our method is applicable for designing quantum oracle-based algorithms. We chose, as a case study, an oracle decision problem, called a Deutsch-Jozsa problem. We showed by using Monte Carlo simulations that our simulator can faithfully learn a quantum algorithm for solving the problem for a given oracle. Remarkably, the learning time is proportional to the square root of the total number of parameters, rather than showing the exponential dependence found in the classical machine learning-based method.

  19. Machine learning of the reactor core loading pattern critical parameters

    International Nuclear Information System (INIS)

    Trontl, K.; Pevec, D.; Smuc, T.

    2007-01-01

    The usual approach to loading pattern optimization involves high degree of engineering judgment, a set of heuristic rules, an optimization algorithm and a computer code used for evaluating proposed loading patterns. The speed of the optimization process is highly dependent on the computer code used for the evaluation. In this paper we investigate the applicability of a machine learning model which could be used for fast loading pattern evaluation. We employed a recently introduced machine learning technique, Support Vector Regression (SVR), which has a strong theoretical background in statistical learning theory. Superior empirical performance of the method has been reported on difficult regression problems in different fields of science and technology. SVR is a data driven, kernel based, nonlinear modelling paradigm, in which model parameters are automatically determined by solving a quadratic optimization problem. The main objective of the work reported in this paper was to evaluate the possibility of applying SVR method for reactor core loading pattern modelling. The starting set of experimental data for training and testing of the machine learning algorithm was obtained using a two-dimensional diffusion theory reactor physics computer code. We illustrate the performance of the solution and discuss its applicability, i.e., complexity, speed and accuracy, with a projection to a more realistic scenario involving machine learning from the results of more accurate and time consuming three-dimensional core modelling code. (author)

  20. Tracking by Machine Learning Methods

    CERN Document Server

    Jofrehei, Arash

    2015-01-01

    Current track reconstruction methods start with two points and then, for each layer, loop through all possible hits to find suitable hits to add to that track. Another idea would be to use the large number of already reconstructed events and/or simulated data to train a machine to find tracks given hit pixels. Training time could be long, but real-time tracking is really fast. Simulation might not be as realistic as real data, but tracking has been done for it with 100 percent efficiency, while by using real data we would probably be limited to the current efficiency.

  1. Machine learning with quantum relative entropy

    Energy Technology Data Exchange (ETDEWEB)

    Tsuda, Koji [Max Planck Institute for Biological Cybernetics, Spemannstr. 38, Tuebingen, 72076 (Germany)], E-mail: koji.tsuda@tuebingen.mpg.de

    2009-12-01

    Density matrices are a central tool in quantum physics, but they are also used in machine learning. A positive definite matrix called a kernel matrix is used to represent the similarities between examples. Positive definiteness assures that the examples are embedded in a Euclidean space. When a positive definite matrix is learned from data, one has to design an update rule that maintains the positive definiteness. Our update rule, called the matrix exponentiated gradient update, is motivated by the quantum relative entropy. Notably, the relative entropy is an instance of Bregman divergences, which are asymmetric distance measures specifying theoretical properties of machine learning algorithms. Using the calculus commonly used in quantum physics, we prove an upper bound on the generalization error of online learning.
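
    For readers unfamiliar with the update named in the abstract above, the following block writes out the form in which the matrix exponentiated gradient update is usually stated. The abstract itself gives no formulas, so the symbols here (W_t for the current matrix, L_t for the loss at step t, eta for the learning rate, Z_t for the normalizer) are assumed notation, and the trace-one normalization is one common convention rather than something the record confirms.

```latex
% Quantum relative entropy between trace-one positive definite matrices
\Delta(\widetilde{W}, W)
  = \operatorname{tr}\!\bigl(\widetilde{W}\log\widetilde{W} - \widetilde{W}\log W\bigr)

% Matrix exponentiated gradient update (eta > 0 is the learning rate)
W_{t+1}
  = \frac{1}{Z_t}\,
    \exp\!\Bigl(\log W_t - \eta\,\operatorname{sym}\bigl(\nabla_W L_t(W_t)\bigr)\Bigr),
\qquad
Z_t = \operatorname{tr}\exp\!\Bigl(\log W_t - \eta\,\operatorname{sym}\bigl(\nabla_W L_t(W_t)\bigr)\Bigr),
\qquad
\operatorname{sym}(X) = \tfrac{1}{2}\bigl(X + X^{\top}\bigr)
```

    The matrix exponential of a symmetric matrix is always symmetric positive definite, which is why an update of this shape preserves positive definiteness without an explicit projection step.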

  2. Machine learning with quantum relative entropy

    International Nuclear Information System (INIS)

    Tsuda, Koji

    2009-01-01

    Density matrices are a central tool in quantum physics, but they are also used in machine learning. A positive definite matrix called a kernel matrix is used to represent the similarities between examples. Positive definiteness assures that the examples are embedded in a Euclidean space. When a positive definite matrix is learned from data, one has to design an update rule that maintains the positive definiteness. Our update rule, called the matrix exponentiated gradient update, is motivated by the quantum relative entropy. Notably, the relative entropy is an instance of Bregman divergences, which are asymmetric distance measures specifying theoretical properties of machine learning algorithms. Using the calculus commonly used in quantum physics, we prove an upper bound on the generalization error of online learning.

  3. Machine Learning and Data Mining Methods in Diabetes Research.

    Science.gov (United States)

    Kavakiotis, Ioannis; Tsave, Olga; Salifoglou, Athanasios; Maglaveras, Nicos; Vlahavas, Ioannis; Chouvarda, Ioanna

    2017-01-01

    The remarkable advances in biotechnology and health sciences have led to a significant production of data, such as high throughput genetic data and clinical information, generated from large Electronic Health Records (EHRs). To this end, application of machine learning and data mining methods in biosciences is presently, more than ever before, vital and indispensable in efforts to transform intelligently all available information into valuable knowledge. Diabetes mellitus (DM) is defined as a group of metabolic disorders exerting significant pressure on human health worldwide. Extensive research in all aspects of diabetes (diagnosis, etiopathophysiology, therapy, etc.) has led to the generation of huge amounts of data. The aim of the present study is to conduct a systematic review of the applications of machine learning, data mining techniques and tools in the field of diabetes research with respect to a) Prediction and Diagnosis, b) Diabetic Complications, c) Genetic Background and Environment, and d) Health Care and Management, with the first category appearing to be the most popular. A wide range of machine learning algorithms were employed. In general, 85% of those used were characterized by supervised learning approaches and 15% by unsupervised ones, and more specifically, association rules. Support vector machines (SVM) arise as the most successful and widely used algorithm. Concerning the type of data, clinical datasets were mainly used. The title applications in the selected articles project the usefulness of extracting valuable knowledge leading to new hypotheses targeting deeper understanding and further investigation in DM.

  4. Machine learning applied to crime prediction

    OpenAIRE

    Vaquero Barnadas, Miquel

    2016-01-01

    Machine Learning is a cornerstone of artificial intelligence and big data analysis. It provides powerful algorithms that are capable of recognizing patterns, classifying data, and, basically, learning by themselves to perform a specific task. This field has grown enormously in popularity in recent years; however, it remains unknown to the majority of people, and even to most professionals. This project intends to provide an understandable explanation of what it is, what types ar...

  5. A Review of Related Work on Machine Learning in Semiconductor Manufacturing and Assembly Lines

    OpenAIRE

    Stanisavljevic, Darko; Spitzer, Michael

    2017-01-01

    This paper deals with applications of machine learning algorithms in manufacturing. Machine learning can be defined as a field of computer science that gives computers the ability to learn without explicitly developing the needed algorithms. Manufacturing is the production of merchandise by manual labour, machines and tools. The focus of this paper is on automatic production lines. The areas of interest of this paper are semiconductor manufacturing and production on assembly lines. The purpos...

  6. Machine learning on geospatial big data

    CSIR Research Space (South Africa)

    Van Zyl, T

    2014-02-01

    Full Text Available When trying to understand the difference between machine learning and statistics, it is important to note that it is not so much the set of techniques and theory that are used but more importantly the intended use of the results. In fact, many...

  7. ML Confidential : machine learning on encrypted data

    NARCIS (Netherlands)

    Graepel, T.; Lauter, K.; Naehrig, M.; Kwon, T.; Lee, M.-K.; Kwon, D.

    2013-01-01

    We demonstrate that, by using a recently proposed leveled homomorphic encryption scheme, it is possible to delegate the execution of a machine learning algorithm to a computing service while retaining confidentiality of the training and test data. Since the computational complexity of the homomorphic

  8. Machine Learning for Flapping Wing Flight Control

    NARCIS (Netherlands)

    Goedhart, Menno; van Kampen, E.; Armanini, S.F.; de Visser, C.C.; Chu, Q.

    2018-01-01

    Flight control of Flapping Wing Micro Air Vehicles is challenging, because of their complex dynamics and variability due to manufacturing inconsistencies. Machine Learning algorithms can be used to tackle these challenges. A Policy Gradient algorithm is used to tune the gains of a

  9. ML Confidential : machine learning on encrypted data

    NARCIS (Netherlands)

    Graepel, T.; Lauter, K.; Naehrig, M.

    2012-01-01

    We demonstrate that by using a recently proposed somewhat homomorphic encryption (SHE) scheme it is possible to delegate the execution of a machine learning (ML) algorithm to a compute service while retaining confidentiality of the training and test data. Since the computational complexity of the

  10. Document Classification Using Distributed Machine Learning

    OpenAIRE

    Aydin, Galip; Hallac, Ibrahim Riza

    2018-01-01

    In this paper, we investigate the performance and success rates of Naïve Bayes Classification Algorithm for automatic classification of Turkish news into predetermined categories like economy, life, health etc. We use Apache Big Data technologies such as Hadoop, HDFS, Spark and Mahout, and apply these distributed technologies to Machine Learning.
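
    As a rough illustration of the classifier named in the abstract above, here is a single-machine sketch of multinomial Naive Bayes with add-one smoothing. It does not use Hadoop, Spark or Mahout and is not the authors' implementation; the function names, the tiny English token lists and the category labels are hypothetical stand-ins for a real news corpus.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs):
    """docs: list of (tokens, label). Returns the counts needed for prediction."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab


def predict(model, tokens):
    """Return the label with the highest log posterior under add-one smoothing."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # words never seen in training carry no evidence
            score += math.log((word_counts[label][tok] + 1)
                              / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy corpus standing in for categorized news articles.
training = [
    (["bank", "rate", "inflation"], "economy"),
    (["market", "bank", "stock"], "economy"),
    (["doctor", "vaccine", "hospital"], "health"),
    (["hospital", "diet", "doctor"], "health"),
]
model = train_naive_bayes(training)
```

    Training only accumulates per-class counts, which is what makes the algorithm easy to distribute: counts computed on separate data partitions can simply be summed.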

  11. The ATLAS Higgs Machine Learning Challenge

    CERN Document Server

    Cowan, Glen; The ATLAS collaboration; Bourdarios, Claire

    2015-01-01

    High Energy Physics has been using Machine Learning techniques (commonly known as Multivariate Analysis) since the 1990s with Artificial Neural Net and more recently with Boosted Decision Trees, Random Forest etc. Meanwhile, Machine Learning has become a full blown field of computer science. With the emergence of Big Data, data scientists are developing new Machine Learning algorithms to extract meaning from large heterogeneous data. HEP has exciting and difficult problems like the extraction of the Higgs boson signal, and at the same time data scientists have advanced algorithms: the goal of the HiggsML project was to bring the two together by a “challenge”: participants from all over the world and any scientific background could compete online to obtain the best Higgs to tau tau signal significance on a set of ATLAS fully simulated Monte Carlo signal and background. Instead of HEP physicists browsing through machine learning papers and trying to infer which new algorithms might be useful for HEP, then c...

  12. Parallelization of TMVA Machine Learning Algorithms

    CERN Document Server

    Hajili, Mammad

    2017-01-01

    This report reflects my work on the parallelization of TMVA machine learning algorithms integrated into the ROOT data analysis framework during a summer internship at CERN. The report consists of four important parts: the data set used in training and validation, the algorithms to which multiprocessing was applied, the parallelization techniques, and the changes in execution time as a function of the number of workers.

  13. Prototype-based models in machine learning

    NARCIS (Netherlands)

    Biehl, Michael; Hammer, Barbara; Villmann, Thomas

    2016-01-01

    An overview is given of prototype-based models in machine learning. In this framework, observations, i.e., data, are stored in terms of typical representatives. Together with a suitable measure of similarity, the systems can be employed in the context of unsupervised and supervised analysis of

  14. Supporting visual quality assessment with machine learning

    NARCIS (Netherlands)

    Gastaldo, P.; Zunino, R.; Redi, J.

    2013-01-01

    Objective metrics for visual quality assessment often base their reliability on the explicit modeling of the highly non-linear behavior of human perception; as a result, they may be complex and computationally expensive. Conversely, machine learning (ML) paradigms allow to tackle the quality

  15. Machine Learning and Quantum Mechanics

    Science.gov (United States)

    Chapline, George

    The author has previously pointed out some similarities between self-organizing neural networks and quantum mechanics. These types of neural networks were originally conceived of as a way of emulating the cognitive capabilities of the human brain. Recently extensions of these networks, collectively referred to as deep learning networks, have strengthened the connection between self-organizing neural networks and human cognitive capabilities. In this note we consider whether hardware quantum devices might be useful for emulating neural networks with human-like cognitive capabilities, or alternatively whether implementations of deep learning neural networks using conventional computers might lead to better algorithms for solving the many-body Schrödinger equation.

  16. Extracting meaning from audio signals - a machine learning approach

    DEFF Research Database (Denmark)

    Larsen, Jan

    2007-01-01

    * Machine learning framework for sound search * Genre classification * Music and audio separation * Wind noise suppression

  17. Evaluation on knowledge extraction and machine learning in ...

    African Journals Online (AJOL)

    Evaluation on knowledge extraction and machine learning in resolving Malay word ambiguity. ... No 5S (2017). ... Keywords: ambiguity; lexical knowledge; machine learning; Malay word ...

  18. Twin support vector machines models, extensions and applications

    CERN Document Server

    Jayadeva; Chandra, Suresh

    2017-01-01

    This book provides a systematic and focused study of the various aspects of twin support vector machines (TWSVM) and related developments for classification and regression. In addition to presenting most of the basic models of TWSVM and twin support vector regression (TWSVR) available in the literature, it also discusses the important and challenging applications of this new machine learning methodology. A chapter on “Additional Topics” has been included to discuss kernel optimization and support tensor machine topics, which are comparatively new but have great potential in applications. It is primarily written for graduate students and researchers in the area of machine learning and related topics in computer science, mathematics, electrical engineering, management science and finance.

  19. Study of Environmental Data Complexity using Extreme Learning Machine

    Science.gov (United States)

    Leuenberger, Michael; Kanevski, Mikhail

    2017-04-01

    The main goals of environmental data science using machine learning algorithms concern, in a broad sense, the calibration, prediction and visualization of hidden relationships between input and output variables. In order to optimize the models and to understand the phenomenon under study, the characterization of the complexity (at different levels) should be taken into account. Therefore, the identification of linear or non-linear behavior between input and output variables adds valuable information about the complexity of the phenomenon. The present research highlights and investigates the different issues that can occur when identifying the complexity (linear/non-linear) of environmental data using machine learning algorithms. In particular, the main attention is paid to the description of a self-consistent methodology for the use of Extreme Learning Machines (ELM, Huang et al., 2006), which have recently gained great popularity. By applying two ELM models (with linear and non-linear activation functions) and comparing their efficiency, the degree of linearity can be quantified. The considered approach is accompanied by simulated and real high-dimensional and multivariate data case studies. In conclusion, the current challenges and future development in complexity quantification using environmental data mining are discussed. References - Huang, G.-B., Zhu, Q.-Y., Siew, C.-K., 2006. Extreme learning machine: theory and applications. Neurocomputing 70 (1-3), 489-501. - Kanevski, M., Pozdnoukhov, A., Timonin, V., 2009. Machine Learning for Spatial Environmental Data. EPFL Press; Lausanne, Switzerland, p.392. - Leuenberger, M., Kanevski, M., 2015. Extreme Learning Machines for spatial environmental data. Computers and Geosciences 85, 64-73.

  20. Machine Learning for Treatment Assignment: Improving Individualized Risk Attribution.

    Science.gov (United States)

    Weiss, Jeremy; Kuusisto, Finn; Boyd, Kendrick; Liu, Jie; Page, David

    2015-01-01

    Clinical studies model the average treatment effect (ATE), but apply this population-level effect to future individuals. Due to recent developments of machine learning algorithms with useful statistical guarantees, we argue instead for modeling the individualized treatment effect (ITE), which has better applicability to new patients. We compare ATE-estimation using randomized and observational analysis methods against ITE-estimation using machine learning, and describe how the ITE theoretically generalizes to new population distributions, whereas the ATE may not. On a synthetic data set of statin use and myocardial infarction (MI), we show that a learned ITE model improves true ITE estimation and outperforms the ATE. We additionally argue that ITE models should be learned with a consistent, nonparametric algorithm from unweighted examples and show experiments in favor of our argument using our synthetic data model and a real data set of D-penicillamine use for primary biliary cirrhosis.
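
    The distinction between the population-level ATE and a covariate-dependent effect, drawn in the abstract above, can be made concrete with a small sketch. It uses a difference in mean outcomes, computed once over everyone and once per stratum; this is a deliberately crude stand-in for a learned ITE model, it assumes treatment was assigned at random, and the records, field names and risk groups are invented for illustration.

```python
def mean(xs):
    return sum(xs) / len(xs)


def average_treatment_effect(records):
    """Difference in mean outcome between treated and untreated records."""
    treated = [r["outcome"] for r in records if r["treated"]]
    control = [r["outcome"] for r in records if not r["treated"]]
    return mean(treated) - mean(control)


def stratified_effects(records, key):
    """Treatment effect estimated separately within each value of a covariate."""
    groups = {}
    for r in records:
        groups.setdefault(r[key], []).append(r)
    return {g: average_treatment_effect(rs) for g, rs in groups.items()}


def make(group, treated, outcomes):
    """Build hypothetical patient records; outcome 1 means an event occurred."""
    return [{"risk": group, "treated": treated, "outcome": y} for y in outcomes]


# Invented data: the treatment helps high-risk patients and does nothing for low-risk ones.
records = (
    make("high", True, [0, 0, 1, 0]) + make("high", False, [1, 1, 1, 0])
    + make("low", True, [0, 0, 0, 0]) + make("low", False, [0, 0, 0, 0])
)

ate = average_treatment_effect(records)
per_group = stratified_effects(records, "risk")
```

    Here the single ATE figure averages over two groups with very different responses, so applying it to a new population with a different mix of risk groups would misstate the expected benefit, whereas the per-group estimates carry over as long as the within-group effect is stable.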

  1. Machine Learning Control For Highly Reconfigurable High-Order Systems

    Science.gov (United States)

    2015-01-02

    calibration and applications,” Mechatronics and Embedded Systems and Applications (MESA), 2010 IEEE/ASME International Conference on, IEEE, 2010, pp. 38–43...AFRL-OSR-VA-TR-2015-0012 MACHINE LEARNING CONTROL FOR HIGHLY RECONFIGURABLE HIGH-ORDER SYSTEMS John Valasek TEXAS ENGINEERING EXPERIMENT STATION...DIMENSIONAL RECONFIGURABLE SYSTEMS FA9550-11-1-0302 Period of Performance 1 July 2011 – 29 September 2014 John Valasek Aerospace Engineering

  2. Financial signal processing and machine learning

    CERN Document Server

    Kulkarni, Sanjeev R; Malioutov, Dmitry M

    2016-01-01

    The modern financial industry has been required to deal with large and diverse portfolios in a variety of asset classes often with limited market data available. Financial Signal Processing and Machine Learning unifies a number of recent advances made in signal processing and machine learning for the design and management of investment portfolios and financial engineering. This book bridges the gap between these disciplines, offering the latest information on key topics including characterizing statistical dependence and correlation in high dimensions, constructing effective and robust risk measures, and their use in portfolio optimization and rebalancing. The book focuses on signal processing approaches to model return, momentum, and mean reversion, addressing theoretical and implementation aspects. It highlights the connections between portfolio theory, sparse learning and compressed sensing, sparse eigen-portfolios, robust optimization, non-Gaussian data-driven risk measures, graphical models, causal analy...

  3. A Machine Learning Concept for DTN Routing

    Science.gov (United States)

    Dudukovich, Rachel; Hylton, Alan; Papachristou, Christos

    2017-01-01

    This paper discusses the concept and architecture of a machine-learning-based router for delay-tolerant space networks. The techniques of reinforcement learning and Bayesian learning are used to supplement the routing decisions of the popular Contact Graph Routing algorithm. An introduction to the concepts of Contact Graph Routing, Q-routing and Naive Bayes classification is given. The development of an architecture for a cross-layer feedback framework for DTN (Delay-Tolerant Networking) protocols is discussed. Finally, the initial simulation setup and results are given.
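    For readers unfamiliar with Q-routing, the core update can be sketched as follows. This is the textbook form of the rule (each node keeps an estimated delivery time per destination and neighbor, and nudges it towards the observed local delay plus the neighbor's own best estimate); it is not the router described in the abstract, and the table layout, node names, and learning rate are assumptions.

```python
def q_routing_update(q_table, dest, neighbor, queue_delay, tx_delay,
                     neighbor_estimate, lr=0.5):
    """Update the estimated delivery time to `dest` via `neighbor`.

    neighbor_estimate is the neighbor's own best (minimum) estimate of the
    remaining delivery time to `dest`, reported back after the hop.
    """
    old = q_table[(dest, neighbor)]
    target = queue_delay + tx_delay + neighbor_estimate
    q_table[(dest, neighbor)] = old + lr * (target - old)
    return q_table[(dest, neighbor)]

def best_neighbor(q_table, dest, neighbors):
    """Greedy routing choice: the neighbor with the lowest estimated delay."""
    return min(neighbors, key=lambda n: q_table[(dest, n)])

# Hypothetical node with two neighbors, A and B, routing towards D.
q = {("D", "A"): 10.0, ("D", "B"): 12.0}
updated = q_routing_update(q, "D", "A", queue_delay=2.0, tx_delay=1.0,
                           neighbor_estimate=15.0)
choice = best_neighbor(q, "D", ["A", "B"])
```

    After one slow delivery via A, the estimate for A rises above the estimate for B, so the greedy choice switches to B.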

  4. Transductive and matched-pair machine learning for difficult target detection problems

    Science.gov (United States)

    Theiler, James

    2014-06-01

    This paper will describe the application of two non-traditional kinds of machine learning (transductive machine learning and the more recently proposed matched-pair machine learning) to the target detection problem. The approach combines explicit domain knowledge to model the target signal with a more agnostic machine-learning approach to characterize the background. The concept is illustrated with simulated data from an elliptically-contoured background distribution, on which a subpixel target of known spectral signature but unknown spatial extent has been implanted.

  5. Splendidly blended: a machine learning set up for CDU control

    Science.gov (United States)

    Utzny, Clemens

    2017-06-01

    While machine learning and artificial intelligence continue to grow in importance for internet-related applications, they are still in their infancy when it comes to process control within the semiconductor industry. Mask manufacturing in particular presents a challenge to machine learning, since the business process intrinsically induces pronounced product variability against a background of small plate numbers. In this paper we present the architectural setup of a machine learning algorithm that successfully deals with the demands and pitfalls of mask manufacturing. A detailed motivation of this basic setup is given, followed by an analysis of its statistical properties. The machine learning setup for mask manufacturing involves two learning steps. An initial step identifies and classifies the basic global CD patterns of a process; these results form the basis for the extraction of an optimized training set via balanced sampling. A second learning step uses this training set to obtain the local as well as global CD relationships induced by the manufacturing process. Using two production-motivated examples, we show that this approach is flexible and powerful enough to deal with the exacting demands of mask manufacturing. In one example we show how dedicated covariates can be used in conjunction with increased spatial resolution of the CD map model to deal with pathological CD effects at the mask boundary. The other example shows how the model setup enables strategies for dealing with tool-specific CD signature differences. In this case the balanced sampling enables a process control scheme that allows usage of the full tool park within the specified tight tolerance budget. Overall, this paper shows that the current rapid developments of machine learning algorithms can be successfully used within the context of semiconductor manufacturing.

  6. Survey of Machine Learning Methods for Database Security

    Science.gov (United States)

    Kamra, Ashish; Ber, Elisa

    Application of machine learning techniques to database security is an emerging area of research. In this chapter, we present a survey of various approaches that use machine learning/data mining techniques to enhance the traditional security mechanisms of databases. There are two key database security areas in which these techniques have found applications, namely, detection of SQL Injection attacks and anomaly detection for defending against insider threats. Apart from the research prototypes and tools, various third-party commercial products are also available that provide database activity monitoring solutions by profiling database users and applications. We present a survey of such products. We end the chapter with a primer on mechanisms for responding to database anomalies.

  7. Reduced multiple empirical kernel learning machine.

    Science.gov (United States)

    Wang, Zhe; Lu, MingZhe; Gao, Daqi

    2015-02-01

    Multiple kernel learning (MKL) has been shown to be flexible and effective in depicting heterogeneous data sources, since MKL can introduce multiple kernels rather than a single fixed kernel into applications. However, MKL incurs a high time and space complexity compared with single kernel learning, which is undesirable in real-world applications. Meanwhile, the kernel mapping of MKL generally takes one of two forms, implicit kernel mapping and empirical kernel mapping (EKM), of which the latter has attracted less attention. In this paper, we focus on MKL with the EKM and propose a reduced multiple empirical kernel learning machine, RMEKLM for short. To the best of our knowledge, it is the first to reduce both the time and space complexity of MKL with EKM. Unlike existing MKL, the proposed RMEKLM adopts the Gauss elimination technique to extract a set of feature vectors, and it is validated that doing so does not lose much information of the original feature space. RMEKLM then uses the extracted feature vectors to span a reduced orthonormal subspace of the feature space, which is visualized in terms of its geometric structure. It can be demonstrated that the spanned subspace is isomorphic to the original feature space, which means that the dot product of two vectors in the original feature space is equal to that of the two corresponding vectors in the generated orthonormal subspace. More importantly, the proposed RMEKLM brings simpler computation and needs less storage space, especially during testing. Finally, the experimental results show that RMEKLM achieves efficient and effective performance in terms of both complexity and classification. The contributions of this paper are as follows: (1) by mapping the input space into an orthonormal subspace, the geometry of the generated subspace is visualized; (2) this paper is the first to reduce both the time and space complexity of the EKM-based MKL; (3
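    The dot-product claim in the abstract can be checked on a small example. The sketch below is not RMEKLM: it swaps in Gram-Schmidt orthonormalization (the abstract mentions Gauss elimination for the extraction step) and uses made-up vectors, purely to show that coordinates in an orthonormal basis of the spanned subspace preserve dot products.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors, tol=1e-10):
    """Return an orthonormal basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)                      # component along b
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:                         # skip linearly dependent vectors
            basis.append([wi / norm for wi in w])
    return basis

def coordinates(v, basis):
    """Coordinates of v (assumed to lie in the span) in the orthonormal basis."""
    return [dot(v, b) for b in basis]

# Three vectors in R^4; the third is the sum of the first two,
# so the spanned subspace is only two-dimensional.
vectors = [[1.0, 2.0, 0.0, 1.0],
           [0.0, 1.0, 1.0, 3.0],
           [1.0, 3.0, 1.0, 4.0]]
basis = gram_schmidt(vectors)
u, v = vectors[0], vectors[2]
original = dot(u, v)                                        # 4-dimensional dot product
reduced = dot(coordinates(u, basis), coordinates(v, basis)) # 2-dimensional dot product
```

    The two dot products agree up to rounding, while the reduced representation stores two numbers per vector instead of four.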

  8. Stochastic subset selection for learning with kernel machines.

    Science.gov (United States)

    Rhinelander, Jason; Liu, Xiaoping P

    2012-06-01

    Kernel machines have gained much popularity in applications of machine learning. Support vector machines (SVMs) are a subset of kernel machines and generalize well for classification, regression, and anomaly detection tasks. The training procedure for traditional SVMs involves solving a quadratic programming (QP) problem. The QP problem scales superlinearly in computational effort with the number of training samples and is often used for the offline batch processing of data. Kernel machines operate by retaining a subset of observed data during training. The data vectors contained within this subset are referred to as support vectors (SVs). The work presented in this paper introduces a subset selection method for the use of kernel machines in online, changing environments. Our algorithm works by using a stochastic indexing technique when selecting a subset of SVs when computing the kernel expansion. The work described here is novel because it separates the selection of kernel basis functions from the training algorithm used. The subset selection algorithm presented here can be used in conjunction with any online training technique. It is important for online kernel machines to be computationally efficient due to the real-time requirements of online environments. Our algorithm is an important contribution because it scales linearly with the number of training samples and is compatible with current training techniques. Our algorithm outperforms standard techniques in terms of computational efficiency and provides increased recognition accuracy in our experiments. We provide results from experiments using both simulated and real-world data sets to verify our algorithm.

  9. A Teaching System To Learn Programming: the Programmer's Learning Machine

    OpenAIRE

    Quinson, Martin; Oster, Gérald

    2015-01-01

    The Programmer's Learning Machine (PLM) is an interactive exerciser for learning programming and algorithms. Using an integrated and graphical environment that provides a short feedback loop, it allows students to learn in a (semi)-autonomous way. This generic platform also enables teachers to create specific programming microworlds that match their teaching goals. This paper discusses our design goals and motivations, introduces the existing material and the proposed ...

  10. Making Individual Prognoses in Psychiatry Using Neuroimaging and Machine Learning.

    Science.gov (United States)

    Janssen, Ronald J; Mourão-Miranda, Janaina; Schnack, Hugo G

    2018-04-22

    Psychiatric prognosis is a difficult problem. Making a prognosis requires looking far into the future, as opposed to making a diagnosis, which is concerned with the current state. During the follow-up period, many factors will influence the course of the disease. Combined with the usually scarcer longitudinal data and the variability in the definition of outcomes/transition, this makes prognostic predictions a challenging endeavor. Employing neuroimaging data in this endeavor introduces the additional hurdle of high dimensionality. Machine-learning techniques are especially suited to tackle this challenging problem. This review starts with a brief introduction to machine learning in the context of its application to clinical neuroimaging data. We highlight a few issues that are especially relevant for prediction of outcome and transition using neuroimaging. We then review the literature that discusses the application of machine learning for this purpose. Critical examination of the studies and their results with respect to the relevant issues revealed the following: 1) there is growing evidence for the prognostic capability of machine-learning-based models using neuroimaging; and 2) reported accuracies may be too optimistic owing to small sample sizes and the lack of independent test samples. Finally, we discuss options to improve the reliability of (prognostic) prediction models. These include new methodologies and multimodal modeling. Paramount, however, is our conclusion that future work will need to provide properly (cross-)validated accuracy estimates of models trained on sufficiently large datasets. Nevertheless, with the technological advances enabling acquisition of large databases of patients and healthy subjects, machine learning represents a powerful tool in the search for psychiatric biomarkers. Copyright © 2018 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.

  11. Manifold learning in machine vision and robotics

    Science.gov (United States)

    Bernstein, Alexander

    2017-02-01

    Smart algorithms are used in machine vision and robotics to organize or extract high-level information from the available data. Nowadays, machine learning is an essential and ubiquitous tool for automating the extraction of patterns or regularities from data (images in machine vision; camera, laser, and sonar sensor data in robotics) in order to solve various subject-oriented tasks such as understanding and classifying image content, navigating mobile autonomous robots in uncertain environments, robot manipulation in medical robotics and computer-assisted surgery, and others. Such data usually have high dimensionality; however, due to various dependencies between their components and constraints imposed by physical reasons, all "feasible and usable data" occupy only a very small part of the high-dimensional "observation space", a part with smaller intrinsic dimensionality. The generally accepted model of such data is the manifold model, according to which the data lie on or near an unknown manifold (surface) of lower dimensionality embedded in an ambient high-dimensional observation space; real-world high-dimensional data obtained from "natural" sources meet this model as a rule. The use of manifold learning in machine vision and robotics, which discovers the low-dimensional structure of high-dimensional data and results in effective algorithms for solving a large number of subject-oriented tasks, is the subject of the conference plenary speech, some topics of which are covered in the paper.

  12. Parallelization of the ROOT Machine Learning Methods

    CERN Document Server

    Vakilipourtakalou, Pourya

    2016-01-01

    Today computation is an inseparable part of scientific research, especially in particle physics, where classification problems arise such as discriminating signals from backgrounds originating from particle collisions. Monte Carlo simulations can be used to generate a known data set of signals and backgrounds based on theoretical physics. The aim of machine learning is to train algorithms on a known data set and then apply the trained algorithms to unknown data sets. The most common framework for data analysis in particle physics is ROOT. To support machine learning methods, a Toolkit for Multivariate Data Analysis (TMVA) has been added to ROOT. The major consideration in this report is the parallelization of some TMVA methods, especially Cross-Validation and BDT.
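    Cross-validation parallelizes naturally because each fold can be trained and scored independently. The sketch below shows only that structure in plain Python; it does not use ROOT or TMVA, the majority-class "model" is a placeholder, and all names are assumptions. A thread pool is used to keep the example simple; CPU-bound training in Python would normally need processes or native threads to run truly in parallel.

```python
from concurrent.futures import ThreadPoolExecutor

def k_fold_indices(n_samples, k):
    """Split sample indices 0..n_samples-1 into k interleaved test folds."""
    return [list(range(i, n_samples, k)) for i in range(k)]

def evaluate_fold(labels, test_idx):
    """Train a placeholder majority-class model on the other folds, score on this one."""
    held_out = set(test_idx)
    train_labels = [y for i, y in enumerate(labels) if i not in held_out]
    majority = max(set(train_labels), key=train_labels.count)
    correct = sum(1 for i in test_idx if labels[i] == majority)
    return correct / len(test_idx)

labels = [0] * 7 + [1] * 3        # toy data set: 7 background, 3 signal
k = 5
folds = k_fold_indices(len(labels), k)

# Each fold is an independent task, so the folds can be submitted concurrently.
with ThreadPoolExecutor(max_workers=k) as pool:
    scores = list(pool.map(lambda idx: evaluate_fold(labels, idx), folds))

mean_accuracy = sum(scores) / len(scores)
```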

  13. Distinguishing Asthma Phenotypes Using Machine Learning Approaches.

    Science.gov (United States)

    Howard, Rebecca; Rattray, Magnus; Prosperi, Mattia; Custovic, Adnan

    2015-07-01

    Asthma is not a single disease, but an umbrella term for a number of distinct diseases, each of which is caused by a distinct underlying pathophysiological mechanism. These discrete disease entities are often labelled as 'asthma endotypes'. The discovery of different asthma subtypes has moved from subjective approaches, in which putative phenotypes are assigned by experts, to data-driven ones that incorporate machine learning. This review focuses on the methodological developments of one such machine learning technique, latent class analysis, and how it has contributed to distinguishing asthma and wheezing subtypes in childhood. It also gives a clinical perspective, presenting the findings of studies from the past 5 years that used this approach. The identification of true asthma endotypes may be a crucial step towards understanding their distinct pathophysiological mechanisms, which could ultimately lead to more precise prevention strategies, identification of novel therapeutic targets and the development of effective personalized therapies.

  14. Machine Learning for ATLAS DDM Network Metrics

    CERN Document Server

    Lassnig, Mario; The ATLAS collaboration; Vamosi, Ralf

    2016-01-01

    The increasing volume of physics data is posing a critical challenge to the ATLAS experiment. In anticipation of high luminosity physics, automation of everyday data management tasks has become necessary. Previously many of these tasks required human decision-making and operation. Recent advances in hardware and software have made it possible to entrust more complicated duties to automated systems using models trained by machine learning algorithms. In this contribution we show results from our ongoing automation efforts. First, we describe our framework for distributed data management and network metrics, automatically extract and aggregate data, train models with various machine learning algorithms, and eventually score the resulting models and parameters. Second, we use these models to forecast metrics relevant for network-aware job scheduling and data brokering. We show the characteristics of the data and evaluate the forecasting accuracy of our models.

  15. Machine Learning Methods for Analysis of Metabolic Data and Metabolic Pathway Modeling.

    Science.gov (United States)

    Cuperlovic-Culf, Miroslava

    2018-01-11

    Machine learning uses experimental data to optimize clustering or classification of samples or features, or to develop, augment or verify models that can be used to predict behavior or properties of systems. It is expected that machine learning will help provide actionable knowledge from a variety of big data including metabolomics data, as well as results of metabolism models. A variety of machine learning methods has been applied in bioinformatics and metabolism analyses including self-organizing maps, support vector machines, the kernel machine, Bayesian networks or fuzzy logic. To a lesser extent, machine learning has also been utilized to take advantage of the increasing availability of genomics and metabolomics data for the optimization of metabolic network models and their analysis. In this context, machine learning has aided the development of metabolic networks, the calculation of parameters for stoichiometric and kinetic models, as well as the analysis of major features in the model for the optimal application of bioreactors. Examples of this very interesting, albeit highly complex, application of machine learning for metabolism modeling will be the primary focus of this review presenting several different types of applications for model optimization, parameter determination or system analysis using models, as well as the utilization of several different types of machine learning technologies.

  16. Machine Learning Methods for Analysis of Metabolic Data and Metabolic Pathway Modeling

    Science.gov (United States)

    Cuperlovic-Culf, Miroslava

    2018-01-01

    Machine learning uses experimental data to optimize clustering or classification of samples or features, or to develop, augment or verify models that can be used to predict behavior or properties of systems. It is expected that machine learning will help provide actionable knowledge from a variety of big data including metabolomics data, as well as results of metabolism models. A variety of machine learning methods has been applied in bioinformatics and metabolism analyses including self-organizing maps, support vector machines, the kernel machine, Bayesian networks or fuzzy logic. To a lesser extent, machine learning has also been utilized to take advantage of the increasing availability of genomics and metabolomics data for the optimization of metabolic network models and their analysis. In this context, machine learning has aided the development of metabolic networks, the calculation of parameters for stoichiometric and kinetic models, as well as the analysis of major features in the model for the optimal application of bioreactors. Examples of this very interesting, albeit highly complex, application of machine learning for metabolism modeling will be the primary focus of this review presenting several different types of applications for model optimization, parameter determination or system analysis using models, as well as the utilization of several different types of machine learning technologies. PMID:29324649

  17. Randomized Algorithms for Scalable Machine Learning

    OpenAIRE

    Kleiner, Ariel Jacob

    2012-01-01

    Many existing procedures in machine learning and statistics are computationally intractable in the setting of large-scale data. As a result, the advent of rapidly increasing dataset sizes, which should be a boon yielding improved statistical performance, instead severely blunts the usefulness of a variety of existing inferential methods. In this work, we use randomness to ameliorate this lack of scalability by reducing complex, computationally difficult inferential problems to larger sets o...

  18. Machine Learning Methods for Production Cases Analysis

    Science.gov (United States)

    Mokrova, Nataliya V.; Mokrov, Alexander M.; Safonova, Alexandra V.; Vishnyakov, Igor V.

    2018-03-01

    An approach to the analysis of events occurring during the production process is proposed. The described machine learning system is able to solve classification tasks related to production control and hazard identification at an early stage. Descriptors of the internal production network data were used for training and testing the applied models. k-Nearest Neighbors and Random Forest methods were used to illustrate and analyze the proposed solution. The quality of the developed classifiers was estimated using standard statistical metrics such as precision, recall and accuracy.
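    The three metrics named above follow directly from the confusion-matrix counts. A minimal sketch for the binary case, with made-up labels (1 standing for a hypothetical "hazard" class); the function name and data are illustrative assumptions, not taken from the paper.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Precision, recall and accuracy for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return {
        # Of everything flagged positive, how much really was positive?
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # Of everything really positive, how much was flagged?
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        # Fraction of all predictions that were correct.
        "accuracy": (tp + tn) / len(pairs),
    }

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

    For hazard identification, recall is often the metric to watch, since a missed hazard (false negative) is usually costlier than a false alarm.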

  19. A method to combine target volume data from 3D and 4D planned thoracic radiotherapy patient cohorts for machine learning applications

    NARCIS (Netherlands)

    Johnson, Corinne; Price, Gareth; Khalifa, Jonathan; Faivre-Finn, Corinne; Dekker, Andre; Moore, Christopher; van Herk, Marcel

    2017-01-01

    The gross tumour volume (GTV) is predictive of clinical outcome and consequently features in many machine-learned models. 4D-planning, however, has prompted substitution of the GTV with the internal gross target volume (iGTV). We present and validate a method to synthesise GTV data from the iGTV,

  20. The ATLAS Higgs machine learning challenge

    CERN Document Server

    Davey, W; The ATLAS collaboration; Rousseau, D; Cowan, G; Kegl, B; Germain-Renaud, C; Guyon, I

    2014-01-01

    High Energy Physics (HEP) has been using Machine Learning techniques (commonly known as Multivariate Analysis) since the 1990s, for example with Artificial Neural Networks and more recently with Boosted Decision Trees, Random Forests, etc. Meanwhile, Machine Learning has become a full-blown field of computer science. With the emergence of Big Data, data scientists are developing new Machine Learning algorithms to extract sense from large heterogeneous data. HEP has exciting and difficult problems, like the extraction of the Higgs boson signal, and data scientists have advanced algorithms: the goal of the HiggsML project is to bring the two together through a “challenge”. Participants from all over the world and from any scientific background can compete online ( https://www.kaggle.com/c/higgs-boson ) to obtain the best Higgs to tau tau signal significance on a set of ATLAS fully simulated Monte Carlo signal and background. Winners with the best scores will receive money prizes; authors of the best method (most usable) will be invited t...
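    The abstract does not say how "signal significance" is scored. As a hedged illustration, the sketch below implements the approximate median significance (AMS) as it is commonly quoted for this challenge, with a regularization term of 10 added to the background; treat both the formula's role as the official metric and the constant as assumptions to be checked against the challenge documentation.

```python
import math

def ams(s, b, b_reg=10.0):
    """Approximate median significance for s signal and b background events.

    s and b are the (weighted) numbers of true signal and background events
    that the classifier selects; b_reg is an assumed regularization constant.
    """
    if s < 0 or b < 0:
        raise ValueError("event counts must be non-negative")
    total_b = b + b_reg
    radicand = 2.0 * ((s + total_b) * math.log(1.0 + s / total_b) - s)
    return math.sqrt(max(0.0, radicand))  # guard against rounding below zero

# When the signal is small compared with the background, the AMS is close to
# the familiar s / sqrt(b) estimate (here with the regularized background).
small_signal = ams(10.0, 1000.0)
rule_of_thumb = 10.0 / math.sqrt(1000.0 + 10.0)
```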

  1. Research on machine learning framework based on random forest algorithm

    Science.gov (United States)

    Ren, Qiong; Cheng, Hui; Han, Hai

    2017-03-01

    With the continuous development of machine learning, industry and academia have released many machine learning frameworks based on distributed computing platforms, and these have been widely used. However, existing machine learning frameworks are constrained by the limitations of the machine learning algorithms themselves, such as the choice of parameters, the interference of noise, and a high barrier to use. This paper introduces the research background of machine learning frameworks and, building on the random forest algorithm commonly used for classification, sets out the research objectives and content, proposes an improved adaptive random forest algorithm (referred to as ARF), and on the basis of ARF designs and implements a machine learning framework.

  2. A review for detecting gene-gene interactions using machine learning methods in genetic epidemiology.

    Science.gov (United States)

    Koo, Ching Lee; Liew, Mei Jing; Mohamad, Mohd Saberi; Salleh, Abdul Hakim Mohamed

    2013-01-01

    Currently, the greatest statistical and computational challenge in genetic epidemiology is to identify and characterize the genes that interact with other genes and with environmental factors to affect complex multifactorial diseases. These gene-gene interactions are also referred to as epistasis, a phenomenon that cannot be resolved by traditional statistical methods because of the high dimensionality of the data and the occurrence of multiple polymorphisms. Hence, several machine learning methods, namely neural networks (NNs), support vector machines (SVMs), and random forests (RFs), are used to address such problems by identifying susceptibility genes in common, multifactorial diseases. This paper gives an overview of machine learning methods, describing the methodology of each method and its application in detecting gene-gene and gene-environment interactions. Lastly, this paper discusses each machine learning method and presents its strengths and weaknesses in detecting gene-gene interactions in complex human disease.

  3. Quantum machine learning: a classical perspective.

    Science.gov (United States)

    Ciliberto, Carlo; Herbster, Mark; Ialongo, Alessandro Davide; Pontil, Massimiliano; Rocchetto, Andrea; Severini, Simone; Wossnig, Leonard

    2018-01-01

    Recently, increased computational power and data availability, as well as algorithmic advances, have led machine learning (ML) techniques to impressive results in regression, classification, data generation and reinforcement learning tasks. Despite these successes, the proximity to the physical limits of chip fabrication alongside the increasing size of datasets is motivating a growing number of researchers to explore the possibility of harnessing the power of quantum computation to speed up classical ML algorithms. Here we review the literature in quantum ML and discuss perspectives for a mixed readership of classical ML and quantum computation experts. Particular emphasis will be placed on clarifying the limitations of quantum algorithms, how they compare with their best classical counterparts and why quantum resources are expected to provide advantages for learning problems. Learning in the presence of noise and certain computationally hard problems in ML are identified as promising directions for the field. Practical questions, such as how to upload classical data into quantum form, will also be addressed.

  4. Prototype-based models in machine learning.

    Science.gov (United States)

    Biehl, Michael; Hammer, Barbara; Villmann, Thomas

    2016-01-01

    An overview is given of prototype-based models in machine learning. In this framework, observations, i.e., data, are stored in terms of typical representatives. Together with a suitable measure of similarity, the systems can be employed in the context of unsupervised and supervised analysis of potentially high-dimensional, complex datasets. We discuss basic schemes of competitive vector quantization as well as the so-called neural gas approach and Kohonen's topology-preserving self-organizing map. Supervised learning in prototype systems is exemplified in terms of learning vector quantization. Most frequently, the familiar Euclidean distance serves as a dissimilarity measure. We present extensions of the framework to nonstandard measures and give an introduction to the use of adaptive distances in relevance learning. © 2016 Wiley Periodicals, Inc.
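    Learning vector quantization, mentioned above as the supervised example, has a particularly compact basic update. The sketch below shows one step of the classic LVQ1 rule with squared Euclidean distance: the nearest prototype is pulled towards a sample of its own class and pushed away from a sample of another class. The data, labels, and learning rate are illustrative assumptions.

```python
def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def lvq1_step(prototypes, proto_labels, x, y, lr=0.1):
    """Apply one LVQ1 update in place and return the index of the winner."""
    # Winner: the prototype closest to the sample x.
    k = min(range(len(prototypes)),
            key=lambda i: squared_distance(prototypes[i], x))
    # Attract if the winner's class matches the sample's class, repel otherwise.
    sign = 1.0 if proto_labels[k] == y else -1.0
    prototypes[k] = [w + sign * lr * (xi - w)
                     for w, xi in zip(prototypes[k], x)]
    return k

prototypes = [[0.0, 0.0], [4.0, 4.0]]
proto_labels = ["a", "b"]

first = lvq1_step(prototypes, proto_labels, x=[1.0, 1.0], y="a")   # attracts prototype 0
second = lvq1_step(prototypes, proto_labels, x=[3.0, 3.0], y="a")  # repels prototype 1
```

    Replacing `squared_distance` with a weighted or otherwise adaptive measure is the entry point to the relevance-learning extensions the overview refers to.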

  5. Quantum machine learning: a classical perspective

    Science.gov (United States)

    Ciliberto, Carlo; Herbster, Mark; Ialongo, Alessandro Davide; Pontil, Massimiliano; Rocchetto, Andrea; Severini, Simone; Wossnig, Leonard

    2018-01-01

    Recently, increased computational power and data availability, as well as algorithmic advances, have led machine learning (ML) techniques to impressive results in regression, classification, data generation and reinforcement learning tasks. Despite these successes, the proximity to the physical limits of chip fabrication alongside the increasing size of datasets is motivating a growing number of researchers to explore the possibility of harnessing the power of quantum computation to speed up classical ML algorithms. Here we review the literature in quantum ML and discuss perspectives for a mixed readership of classical ML and quantum computation experts. Particular emphasis will be placed on clarifying the limitations of quantum algorithms, how they compare with their best classical counterparts and why quantum resources are expected to provide advantages for learning problems. Learning in the presence of noise and certain computationally hard problems in ML are identified as promising directions for the field. Practical questions, such as how to upload classical data into quantum form, will also be addressed.

  6. Machine Learning of the Reactor Core Loading Pattern Critical Parameters

    Directory of Open Access Journals (Sweden)

    Krešimir Trontl

    2008-01-01

    The usual approach to loading pattern optimization involves a high degree of engineering judgment, a set of heuristic rules, an optimization algorithm, and a computer code used for evaluating proposed loading patterns. The speed of the optimization process is highly dependent on the computer code used for the evaluation. In this paper, we investigate the applicability of a machine learning model which could be used for fast loading pattern evaluation. We employ a recently introduced machine learning technique, support vector regression (SVR), which is a data-driven, kernel-based, nonlinear modeling paradigm, in which model parameters are automatically determined by solving a quadratic optimization problem. The main objective of the work reported in this paper was to evaluate the possibility of applying the SVR method to reactor core loading pattern modeling. We illustrate the performance of the solution and discuss its applicability, that is, complexity, speed, and accuracy.

  7. Machine Learning of the Reactor Core Loading Pattern Critical Parameters

    International Nuclear Information System (INIS)

    Trontl, K.; Pevec, D.; Smuc, T.

    2008-01-01

    The usual approach to loading pattern optimization involves a high degree of engineering judgment, a set of heuristic rules, an optimization algorithm, and a computer code used for evaluating proposed loading patterns. The speed of the optimization process is highly dependent on the computer code used for the evaluation. In this paper, we investigate the applicability of a machine learning model which could be used for fast loading pattern evaluation. We employ a recently introduced machine learning technique, support vector regression (SVR), which is a data-driven, kernel-based, nonlinear modeling paradigm, in which model parameters are automatically determined by solving a quadratic optimization problem. The main objective of the work reported in this paper was to evaluate the possibility of applying the SVR method to reactor core loading pattern modeling. We illustrate the performance of the solution and discuss its applicability, that is, complexity, speed, and accuracy.
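    The quadratic optimization problem behind standard SVR is built on the epsilon-insensitive loss, which ignores prediction errors smaller than a tolerance. The sketch below shows only that loss, not a full SVR solver and not the authors' model; the tolerance value and the sample numbers are assumptions.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Zero inside the epsilon tube around the target, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

def total_loss(targets, predictions, epsilon=0.1):
    """Sum of epsilon-insensitive losses over a set of predictions."""
    return sum(epsilon_insensitive_loss(t, p, epsilon)
               for t, p in zip(targets, predictions))

# Hypothetical normalized target values and model predictions.
targets = [1.00, 1.00, 1.00]
predictions = [1.05, 0.95, 1.50]   # first two fall inside the tube
loss = total_loss(targets, predictions)
```

    Only the third prediction contributes to the loss, which is why the fitted model depends on a subset of the training points (the support vectors).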

  8. 1st International Conference on Machine Learning for Cyber Physical Systems and Industry 4.0

    CERN Document Server

    Beyerer, Jürgen

    2016-01-01

    The work presents new approaches to Machine Learning for Cyber Physical Systems, experiences and visions. It contains some selected papers from the international Conference ML4CPS – Machine Learning for Cyber Physical Systems, which was held in Lemgo, October 1-2, 2015. Cyber Physical Systems are characterized by their ability to adapt and to learn: They analyze their environment and, based on observations, they learn patterns, correlations and predictive models. Typical applications are condition monitoring, predictive maintenance, image processing and diagnosis. Machine Learning is the key technology for these developments.

  9. Machine-to-machine communications architectures, technology, standards, and applications

    CERN Document Server

    Misic, Vojislav B

    2014-01-01

    With the number of machine-to-machine (M2M)-enabled devices projected to reach 20 to 50 billion by 2020, there is a critical need to understand the demands imposed by such systems. Machine-to-Machine Communications: Architectures, Technology, Standards, and Applications offers rigorous treatment of the many facets of M2M communication, including its integration with current technology. Presenting the work of a different group of international experts in each chapter, the book begins by supplying an overview of M2M technology. It considers proposed standards, cutting-edge applications, architectures, and traffic modeling and includes case studies that highlight the differences between traditional and M2M communications technology. Details a practical scheme for the forward error correction code design. Investigates the effectiveness of the IEEE 802.15.4 low data rate wireless personal area network standard for use in M2M communications. Identifies algorithms that will ensure functionality, performance, reliability, ...

  10. Housing Value Forecasting Based on Machine Learning Methods

    OpenAIRE

    Mu, Jingyi; Wu, Fang; Zhang, Aihua

    2014-01-01

    In the era of big data, many urgent problems in all walks of life can be tackled with big data techniques. Compared with the Internet, economy, industry, and aerospace fields, big data has seen relatively few applications in architecture. In this paper, on the basis of actual data, the values of Boston suburb houses are forecast by several machine learning methods. According to the predictions, the government and developers can make decisions about whether developing...

  11. Differential Privacy and Machine Learning: a Survey and Review

    OpenAIRE

    Ji, Zhanglong; Lipton, Zachary C.; Elkan, Charles

    2014-01-01

    The objective of machine learning is to extract useful information from data, while privacy is preserved by concealing information. Thus it seems hard to reconcile these competing interests. However, they frequently must be balanced when mining sensitive data. For example, medical research represents an important application where it is necessary both to extract useful information and protect patient privacy. One way to resolve the conflict is to extract general characteristics of whole popul...

  12. Multilevel Cognitive Machine-Learning-Based Concept for Artificial Awareness: Application to Humanoid Robot Awareness Using Visual Saliency

    Directory of Open Access Journals (Sweden)

    Kurosh Madani

    2012-01-01

    As part of “intelligence,” “awareness” is the state or ability to perceive, feel, or be mindful of events, objects, or sensory patterns: in other words, to be conscious of the surrounding environment and its interactions. Inspired by the development of human skills at early ages, and especially by the maturation of awareness at early ages, the present paper approaches robot intelligence from a different slant, directing attention to combining both “cognitive” and “perceptual” abilities. Within such a slant, the machine (robot) shrewdness is constructed on the basis of a multilevel cognitive concept attempting to handle complex artificial behaviors. The intended complex behavior is the autonomous discovery of objects by a robot exploring an unknown environment: in other words, giving the robot autonomy and awareness in and about an unknown backdrop.

  13. Machine learning of molecular properties: Locality and active learning

    Science.gov (United States)

    Gubaev, Konstantin; Podryabinkin, Evgeny V.; Shapeev, Alexander V.

    2018-06-01

    In recent years, machine learning techniques have shown great potential in various problems from a multitude of disciplines, including materials design and drug discovery. The high computational speed on the one hand and the accuracy comparable to that of density functional theory on the other hand make machine learning algorithms efficient for high-throughput screening through chemical and configurational space. However, the machine learning algorithms available in the literature require large training datasets to reach chemical accuracy and also show large errors for the so-called outliers, i.e. out-of-sample molecules that are not well represented in the training set. In the present paper, we propose a new machine learning algorithm for predicting molecular properties that addresses these two issues: it is based on a local model of interatomic interactions providing high accuracy when trained on relatively small training sets and an active learning algorithm of optimally choosing the training set that significantly reduces the errors for the outliers. We compare our model to the other state-of-the-art algorithms from the literature on the widely used benchmark tests.

  14. Toward accelerating landslide mapping with interactive machine learning techniques

    Science.gov (United States)

    Stumpf, André; Lachiche, Nicolas; Malet, Jean-Philippe; Kerle, Norman; Puissant, Anne

    2013-04-01

    Despite important advances in the development of more automated methods for landslide mapping from optical remote sensing images, the elaboration of inventory maps after major triggering events still remains a tedious task. Image classification with expert-defined rules typically still requires significant manual labour for the elaboration and adaptation of rule sets for each particular case. Machine learning algorithms, on the contrary, have the ability to learn and identify complex image patterns from labelled examples but may require relatively large amounts of training data. In order to reduce the amount of required training data, active learning has evolved as a key concept to guide the sampling for applications such as document classification, genetics and remote sensing. The general underlying idea of most active learning approaches is to initialize a machine learning model with a small training set, and to subsequently exploit the model state and/or the data structure to iteratively select the most valuable samples that should be labelled by the user and added to the training set. With relatively few queries and labelled samples, an active learning strategy should ideally yield at least the same accuracy as an equivalent classifier trained with many randomly selected samples. Our study was dedicated to the development of an active learning approach for landslide mapping from VHR remote sensing images with special consideration of the spatial distribution of the samples. The developed approach is a region-based query heuristic that guides the user's attention towards a few compact spatial batches rather than distributed points, resulting in time savings of 50% and more compared to standard active learning techniques. The approach was tested with multi-temporal and multi-sensor satellite images capturing recent large-scale triggering events in Brazil and China and demonstrated balanced user's and producer's accuracies between 74% and 80%. The assessment also ...
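
    The generic loop described in this abstract (start from a small labelled set, then repeatedly query the sample the current model is least sure about) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' region-based heuristic: the nearest-centroid classifier, the margin-based query rule, the synthetic two-blob data, and all function names (`make_blobs`, `fit`, `margin`, `active_learning`) are hypothetical choices made for the example.

```python
import random


def make_blobs(n_per_class=100, seed=0):
    """Synthetic 2-D pool: class 0 around (0, 0), class 1 around (6, 6)."""
    rng = random.Random(seed)
    pool, truth = [], {}
    for label, centre in ((0, 0.0), (1, 6.0)):
        for _ in range(n_per_class):
            point = (rng.gauss(centre, 1.0), rng.gauss(centre, 1.0))
            pool.append(point)
            truth[point] = label
    return pool, truth


def dist2(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def fit(labelled):
    """Nearest-centroid model: one mean point per class."""
    by_class = {}
    for x, y in labelled:
        by_class.setdefault(y, []).append(x)
    return {
        y: (sum(p[0] for p in xs) / len(xs), sum(p[1] for p in xs) / len(xs))
        for y, xs in by_class.items()
    }


def predict(model, x):
    return min(model, key=lambda y: dist2(model[y], x))


def margin(model, x):
    """Small margin means the two closest centroids are almost equally far."""
    d = sorted(dist2(c, x) for c in model.values())
    return d[1] - d[0]


def active_learning(pool, oracle, seed_set, n_queries):
    """Query the most uncertain unlabelled sample, label it, refit, repeat."""
    labelled = list(seed_set)
    known = {x for x, _ in labelled}
    unlabelled = [x for x in pool if x not in known]
    for _ in range(n_queries):
        model = fit(labelled)
        query = min(unlabelled, key=lambda x: margin(model, x))
        unlabelled.remove(query)
        labelled.append((query, oracle(query)))  # the "user" supplies the label
    return fit(labelled), labelled
```

    Here `oracle` stands in for the human annotator; in the example data it can simply be `truth.get`. A spatially aware variant, as the abstract describes, would presumably score compact regions rather than single points, which this sketch does not attempt.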

  15. Estimating extinction using unsupervised machine learning

    Science.gov (United States)

    Meingast, Stefan; Lombardi, Marco; Alves, João

    2017-05-01

    Dust extinction is the most robust tracer of the gas distribution in the interstellar medium, but measuring extinction is limited by the systematic uncertainties involved in estimating the intrinsic colors to background stars. In this paper we present a new technique, Pnicer, that estimates intrinsic colors and extinction for individual stars using unsupervised machine learning algorithms. This new method aims to be free from any priors with respect to the column density and intrinsic color distribution. It is applicable to any combination of parameters and works in arbitrary numbers of dimensions. Furthermore, it is not restricted to color space. Extinction toward single sources is determined by fitting Gaussian mixture models along the extinction vector to (extinction-free) control field observations. In this way it becomes possible to describe the extinction for observed sources with probability densities, rather than a single value. Pnicer effectively eliminates known biases found in similar methods and outperforms them in cases of deep observational data where the number of background galaxies is significant, or when a large number of parameters is used to break degeneracies in the intrinsic color distributions. This new method remains computationally competitive, making it possible to correctly de-redden millions of sources within a matter of seconds. With the ever-increasing number of large-scale high-sensitivity imaging surveys, Pnicer offers a fast and reliable way to efficiently calculate extinction for arbitrary parameter combinations without prior information on source characteristics. The Pnicer software package also offers access to the well-established Nicer technique in a simple unified interface and is capable of building extinction maps including the Nicest correction for cloud substructure. Pnicer is offered to the community as an open-source software solution and is entirely written in Python.

  16. BENCHMARKING MACHINE LEARNING TECHNIQUES FOR SOFTWARE DEFECT DETECTION

    OpenAIRE

    Saiqa Aleem; Luiz Fernando Capretz; Faheem Ahmed

    2015-01-01

    Machine learning approaches are good at solving problems for which little information is available. In most cases, software domain problems can be characterized as a learning process that depends on various circumstances and changes accordingly. A predictive model is constructed using machine learning approaches and used to classify modules as defective or non-defective. Machine learning techniques help developers to retrieve useful information after the classification and enable them to analyse data...

  17. BEBP: An Poisoning Method Against Machine Learning Based IDSs

    OpenAIRE

    Li, Pan; Liu, Qiang; Zhao, Wentao; Wang, Dongxu; Wang, Siqi

    2018-01-01

    In the big data era, machine learning is one of the fundamental techniques in intrusion detection systems (IDSs). However, practical IDSs generally update their decision module by feeding in new data and then retraining learning models periodically. Hence, some attacks that compromise the data for training or testing classifiers significantly challenge the detecting capability of machine learning-based IDSs. Poisoning attack, which is one of the most recognized security threats towards machine learning...

  18. A Review of Current Machine Learning Techniques Used in Manufacturing Diagnosis

    OpenAIRE

    Ademujimi, Toyosi; Brundage, Michael; Prabhu, Vittaldas

    2017-01-01

    Part 6: Intelligent Diagnostics and Maintenance Solutions; International audience; Artificial intelligence applications are increasing due to advances in data collection systems, algorithms, and affordability of computing power. Within the manufacturing industry, machine learning algorithms are often used for improving manufacturing system fault diagnosis. This study focuses on a review of recent fault diagnosis applications in manufacturing that are based on several prominent machine learnin...

  19. Voice based gender classification using machine learning

    Science.gov (United States)

    Raahul, A.; Sapthagiri, R.; Pankaj, K.; Vijayarajan, V.

    2017-11-01

    Gender identification is one of the major problems in speech analysis today. It involves tracing gender from acoustic data, i.e., pitch, median, frequency, etc. Machine learning gives promising results for classification problems in all research domains. There are several performance metrics for evaluating the algorithms of an area. We present a comparative model for evaluating five different machine learning algorithms based on eight different metrics in gender classification from acoustic data. The aim is to identify gender with five different algorithms: Linear Discriminant Analysis (LDA), K-Nearest Neighbour (KNN), Classification and Regression Trees (CART), Random Forest (RF), and Support Vector Machine (SVM), on the basis of eight different metrics. The main parameter in evaluating any algorithm is its performance. The misclassification rate must be low in classification problems, which means that the accuracy rate must be high. Location and gender of the person have become crucial in economic markets in the form of AdSense. With this comparative model, we try to assess the different ML algorithms and find the best fit for gender classification of acoustic data.

  20. Machine learning spatial geometry from entanglement features

    Science.gov (United States)

    You, Yi-Zhuang; Yang, Zhao; Qi, Xiao-Liang

    2018-02-01

    Motivated by the close relations of the renormalization group with both the holographic duality and deep learning, we propose that the holographic geometry can emerge from deep learning the entanglement feature of a quantum many-body state. We develop a concrete algorithm, called entanglement feature learning (EFL), based on the random tensor network (RTN) model for the tensor network holography. We show that each RTN can be mapped to a Boltzmann machine, trained by the entanglement entropies over all subregions of a given quantum many-body state. The goal is to construct the optimal RTN that best reproduces the entanglement feature. The RTN geometry can then be interpreted as the emergent holographic geometry. We demonstrate the EFL algorithm on a 1D free fermion system and observe the emergence of the hyperbolic geometry (AdS3 spatial geometry) as we tune the fermion system towards the gapless critical point (CFT2 point).

  1. Nonlinear programming for classification problems in machine learning

    Science.gov (United States)

    Astorino, Annabella; Fuduli, Antonio; Gaudioso, Manlio

    2016-10-01

    We survey some nonlinear models for classification problems arising in machine learning. In recent years this field has become more and more relevant due to many practical applications, such as text and web classification, object recognition in machine vision, gene expression profile analysis, DNA and protein analysis, medical diagnosis, customer profiling, etc. Classification deals with the separation of sets by means of appropriate separation surfaces, which are generally obtained by solving a numerical optimization model. While linear separability is the basis of the most popular approach to classification, the Support Vector Machine (SVM), in recent years the use of nonlinear separating surfaces has received some attention. The objective of this work is to recall some of these proposals, mainly in terms of the numerical optimization models. In particular we tackle the polyhedral, ellipsoidal, spherical and conical separation approaches and, for some of them, we also consider the semisupervised versions.

  2. Trustless Machine Learning Contracts; Evaluating and Exchanging Machine Learning Models on the Ethereum Blockchain

    OpenAIRE

    Kurtulmus, A. Besir; Daniel, Kenny

    2018-01-01

    Using blockchain technology, it is possible to create contracts that offer a reward in exchange for a trained machine learning model for a particular data set. This would allow users to train machine learning models for a reward in a trustless manner. The smart contract will use the blockchain to automatically validate the solution, so there would be no debate about whether the solution was correct or not. Users who submit the solutions won't have counterparty risk that they won't get paid fo...

  3. Daily sea level prediction at Chiayi coast, Taiwan using extreme learning machine and relevance vector machine

    Science.gov (United States)

    Imani, Moslem; Kao, Huan-Chin; Lan, Wen-Hau; Kuo, Chung-Yen

    2018-02-01

    The analysis and the prediction of sea level fluctuations are core requirements of marine meteorology and operational oceanography. Estimates of sea level with hours-to-days warning times are especially important for low-lying regions and coastal zone management. The primary purpose of this study is to examine the applicability and capability of extreme learning machine (ELM) and relevance vector machine (RVM) models for predicting sea level variations and compare their performances with powerful machine learning methods, namely, support vector machine (SVM) and radial basis function (RBF) models. The input dataset from the period of January 2004 to May 2011 used in the study was obtained from the Dongshi tide gauge station in Chiayi, Taiwan. Results showed that the ELM and RVM models outperformed the other methods. The performance of the RVM approach was superior in predicting the daily sea level time series, given the minimum root mean square error of 34.73 mm and the maximum determination coefficient (R²) of 0.93 during the testing periods. Furthermore, the obtained results were in close agreement with the original tide-gauge data, which indicates that the RVM approach is a promising alternative method for time series prediction and could be successfully used for daily sea level forecasts.
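
    The two scores quoted in this abstract, root mean square error and the determination coefficient, can be computed as in the sketch below. It assumes the common definition R² = 1 - SS_res / SS_tot; the abstract does not say which definition the authors used (a squared correlation coefficient is another frequent choice), and the function names and sample values are made up for illustration.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the same unit as the data (e.g. mm)."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination, assuming R^2 = 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical daily sea levels in mm, not data from the study.
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
print(rmse(observed, predicted), r_squared(observed, predicted))
```

    A lower RMSE and an R² closer to 1 both indicate a closer match to the tide-gauge record, which is how the abstract ranks the RVM model above the others.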

  4. Evaluation of Machine Learning Methods for LHC Optics Measurements and Corrections Software

    CERN Document Server

    AUTHOR|(CDS)2206853; Henning, Peter

    The field of artificial intelligence is driven by the goal to provide machines with human-like intelligence. However, modern science is currently facing problems of high complexity that cannot be solved by humans in the same timescale as by machines. Therefore there is a demand for the automation of complex tasks. Identifying the category of tasks which can be performed by machines in the domain of optics measurements and correction on the Large Hadron Collider (LHC) is one of the central research subjects of this thesis. The application of machine learning methods and concepts of artificial intelligence can be found in various industry and scientific branches. In High Energy Physics these concepts are mostly used in offline analysis of experimental data and to perform regression tasks. In Accelerator Physics the machine learning approach has not found a wide application yet. Therefore potential tasks for machine learning solutions can be specified in this domain. The appropriate methods and their suitability for...

  5. Machine learning techniques to select Be star candidates. An application in the OGLE-IV Gaia south ecliptic pole field

    Science.gov (United States)

    Pérez-Ortiz, M. F.; García-Varela, A.; Quiroz, A. J.; Sabogal, B. E.; Hernández, J.

    2017-09-01

    Context. Optical and infrared variability surveys produce a large number of high quality light curves. Statistical pattern recognition methods have provided competitive solutions for variable star classification at a relatively low computational cost. In order to perform supervised classification, a set of features is proposed and used to train an automatic classification system. Quantities related to the magnitude density of the light curves and their Fourier coefficients have been chosen as features in previous studies. However, some of these features are not robust to the presence of outliers and the calculation of Fourier coefficients is computationally expensive for large data sets. Aims: We propose and evaluate the performance of a new robust set of features using supervised classifiers in order to look for new Be star candidates in the OGLE-IV Gaia south ecliptic pole field. Methods: We calculated the proposed set of features on six types of variable stars and also on a set of Be star candidates reported in the literature. We evaluated the performance of these features using classification trees and random forests along with the K-nearest neighbours, support vector machines, and gradient boosted trees methods. We tuned the classifiers with a 10-fold cross-validation and grid search. We then validated the performance of the best classifier on a set of OGLE-IV light curves and applied this to find new Be star candidates. Results: The random forest classifier outperformed the others. By using the random forest classifier and colours criteria we found 50 Be star candidates in the direction of the Gaia south ecliptic pole field, four of which have infrared colours that are consistent with Herbig Ae/Be stars. Conclusions: Supervised methods are very useful in order to obtain preliminary samples of variable stars extracted from large databases. As usual, the stars classified as Be stars candidates must be checked for the colours and spectroscopic characteristics
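
    The tuning step named in the Methods (k-fold cross-validation combined with a grid search) can be illustrated with the sketch below. It is a toy under stated assumptions: a one-feature k-nearest-neighbours classifier on synthetic data, with the number of neighbours as the only grid parameter. The helper names (`make_data`, `knn_predict`, `k_fold_accuracy`, `grid_search`) and the data are hypothetical and unrelated to the OGLE-IV light curves or the authors' feature set.

```python
import random


def make_data(n_per_class=100, seed=0):
    """Synthetic 1-D feature: class 0 ~ N(0, 1), class 1 ~ N(3, 1)."""
    rng = random.Random(seed)
    data = [(rng.gauss(0.0, 1.0), 0) for _ in range(n_per_class)]
    data += [(rng.gauss(3.0, 1.0), 1) for _ in range(n_per_class)]
    rng.shuffle(data)  # so every fold contains both classes
    return data


def knn_predict(train, x, k):
    """Majority vote among the k nearest training samples (k odd)."""
    nearest = sorted(train, key=lambda item: abs(item[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0


def k_fold_accuracy(data, k_neighbours, n_folds=10):
    """Average accuracy over n_folds held-out folds."""
    folds = [data[i::n_folds] for i in range(n_folds)]
    correct = 0
    for i, held_out in enumerate(folds):
        train = [item for j, fold in enumerate(folds) if j != i for item in fold]
        correct += sum(knn_predict(train, x, k_neighbours) == y for x, y in held_out)
    return correct / len(data)


def grid_search(data, grid):
    """Score every candidate value and return the best one with all scores."""
    scores = {k: k_fold_accuracy(data, k) for k in grid}
    best = max(scores, key=scores.get)
    return best, scores


best_k, scores = grid_search(make_data(), grid=[1, 3, 5, 7])
print(best_k, scores)
```

    The same pattern extends to several hyperparameters by iterating over their Cartesian product; the chosen setting should then be checked on data that took no part in the search, as the abstract does with a separate set of light curves.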

  6. Machine Learning Methods to Predict Diabetes Complications.

    Science.gov (United States)

    Dagliati, Arianna; Marini, Simone; Sacchi, Lucia; Cogni, Giulia; Teliti, Marsida; Tibollo, Valentina; De Cata, Pasquale; Chiovato, Luca; Bellazzi, Riccardo

    2018-03-01

    One of the areas where Artificial Intelligence is having more impact is machine learning, which develops algorithms able to learn patterns and decision rules from data. Machine learning algorithms have been embedded into data mining pipelines, which can combine them with classical statistical strategies, to extract knowledge from data. Within the EU-funded MOSAIC project, a data mining pipeline has been used to derive a set of predictive models of type 2 diabetes mellitus (T2DM) complications based on electronic health record data of nearly one thousand patients. Such pipeline comprises clinical center profiling, predictive model targeting, predictive model construction and model validation. After having dealt with missing data by means of random forest (RF) and having applied suitable strategies to handle class imbalance, we have used Logistic Regression with stepwise feature selection to predict the onset of retinopathy, neuropathy, or nephropathy, at different time scenarios, at 3, 5, and 7 years from the first visit at the Hospital Center for Diabetes (not from the diagnosis). Considered variables are gender, age, time from diagnosis, body mass index (BMI), glycated hemoglobin (HbA1c), hypertension, and smoking habit. Final models, tailored in accordance with the complications, provided an accuracy up to 0.838. Different variables were selected for each complication and time scenario, leading to specialized models easy to translate to the clinical practice.

  7. Using Machine Learning in Adversarial Environments.

    Energy Technology Data Exchange (ETDEWEB)

    Davis, Warren Leon [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2016-02-01

    Intrusion/anomaly detection systems are among the first lines of cyber defense. Commonly, they either use signatures or machine learning (ML) to identify threats, but fail to account for sophisticated attackers trying to circumvent them. We propose to embed machine learning within a game theoretic framework that performs adversarial modeling, develops methods for optimizing operational response based on ML, and integrates the resulting optimization codebase into the existing ML infrastructure developed by the Hybrid LDRD. Our approach addresses three key shortcomings of ML in adversarial settings: 1) resulting classifiers are typically deterministic and, therefore, easy to reverse engineer; 2) ML approaches only address the prediction problem, but do not prescribe how one should operationalize predictions, nor account for operational costs and constraints; and 3) ML approaches do not model attackers’ response and can be circumvented by sophisticated adversaries. The principal novelty of our approach is to construct an optimization framework that blends ML, operational considerations, and a model predicting attackers’ reactions, with the goal of computing an optimal moving target defense. One important challenge is to construct a model of an adversary that is tractable, yet realistic. We aim to advance the science of attacker modeling by considering game-theoretic methods, and by engaging experimental subjects with red teaming experience in trying to actively circumvent an intrusion detection system, and learning a predictive model of such circumvention activities. In addition, we will generate metrics to test that a particular model of an adversary is consistent with available data.

  8. Multivariate Mapping of Environmental Data Using Extreme Learning Machines

    Science.gov (United States)

    Leuenberger, Michael; Kanevski, Mikhail

    2014-05-01

    In most real cases environmental data are multivariate, highly variable at several spatio-temporal scales, and are generated by nonlinear and complex phenomena. Mapping, i.e. the spatial prediction of such data, is a challenging problem. Machine learning algorithms, being universal nonlinear tools, have demonstrated their efficiency in modelling environmental spatial and space-time data (Kanevski et al. 2009). Recently, a new approach in machine learning, the Extreme Learning Machine (ELM), has gained great popularity. ELM is a fast and powerful approach within the machine learning algorithm category. Developed by G.-B. Huang et al. (2006), it follows the structure of a multilayer perceptron (MLP) with a single hidden layer, i.e. a single-hidden-layer feedforward neural network (SLFN). The learning step of classical artificial neural networks, like the MLP, deals with the optimization of weights and biases by using a gradient-based learning algorithm (e.g. the back-propagation algorithm). In contrast to this optimization phase, which can fall into local minima, ELM randomly generates the weights between the input layer and the hidden layer and also the biases in the hidden layer. After this initialization, it optimizes just the weight vector between the hidden layer and the output layer in a single step. The main advantage of this algorithm is the speed of the learning step. In a theoretical context, and by growing the number of hidden nodes, the algorithm can learn any set of training data with zero error. To avoid overfitting, cross-validation or "true validation" (randomly splitting the data into training, validation and testing subsets) is recommended in order to find an optimal number of neurons. With its universal property and solid theoretical basis, ELM is a good machine learning algorithm which can push the field forward. The present research deals with an extension of ELM to multivariate output modelling and application of ELM to the real data case study - pollution of the sediments in ...
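
    The training scheme described in this abstract (random, fixed input weights and hidden biases, then a single solve for the output weights) can be sketched in plain Python as below. Several details are assumptions made for the example and are not taken from the abstract: the sigmoid activation, the uniform sampling range, the small ridge term added for numerical stability, the scalar input, and the function names `train_elm` and `predict_elm`.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def hidden(x, weights, biases):
    """Hidden-layer outputs for a scalar input x."""
    return [sigmoid(w * x + b) for w, b in zip(weights, biases)]


def train_elm(xs, ys, n_hidden=20, ridge=1e-6, seed=0):
    rng = random.Random(seed)
    # Step 1: input weights and hidden biases are drawn at random and never trained.
    weights = [rng.uniform(-3.0, 3.0) for _ in range(n_hidden)]
    biases = [rng.uniform(-3.0, 3.0) for _ in range(n_hidden)]
    h = [hidden(x, weights, biases) for x in xs]
    # Step 2: output weights from one linear solve, (H^T H + ridge I) beta = H^T y.
    gram = [[sum(row[i] * row[j] for row in h) + (ridge if i == j else 0.0)
             for j in range(n_hidden)] for i in range(n_hidden)]
    rhs = [sum(row[i] * y for row, y in zip(h, ys)) for i in range(n_hidden)]
    beta = solve(gram, rhs)
    return weights, biases, beta


def predict_elm(model, x):
    weights, biases, beta = model
    return sum(b * hj for b, hj in zip(beta, hidden(x, weights, biases)))


# Toy regression target; any smooth 1-D function would do.
xs = [-3.0 + 6.0 * i / 59 for i in range(60)]
ys = [math.sin(x) for x in xs]
model = train_elm(xs, ys)
```

    Because no iterative gradient descent is involved, training cost is dominated by one linear solve, which is the speed advantage the abstract points to. Choosing `n_hidden` by cross-validation, as the abstract recommends, would wrap `train_elm` in an outer loop over candidate sizes.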

  9. Exploiting the Dynamics of Soft Materials for Machine Learning.

    Science.gov (United States)

    Nakajima, Kohei; Hauser, Helmut; Li, Tao; Pfeifer, Rolf

    2018-06-01

    Soft materials are increasingly utilized for various purposes in many engineering applications. These materials have been shown to perform a number of functions that were previously difficult to implement using rigid materials. Here, we argue that the diverse dynamics generated by actuating soft materials can be effectively used for machine learning purposes. This is demonstrated using a soft silicone arm through a technique of multiplexing, which enables the rich transient dynamics of the soft materials to be fully exploited as a computational resource. The computational performance of the soft silicone arm is examined through two standard benchmark tasks. Results show that the soft arm compares well to or even outperforms conventional machine learning techniques under multiple conditions. We then demonstrate that this system can be used for the sensory time series prediction problem for the soft arm itself, which suggests its immediate applicability to a real-world machine learning problem. Our approach, on the one hand, represents a radical departure from traditional computational methods, whereas on the other hand, it fits nicely into a more general perspective of computation by way of exploiting the properties of physical materials in the real world.

  10. Predicting breast screening attendance using machine learning techniques.

    Science.gov (United States)

    Baskaran, Vikraman; Guergachi, Aziz; Bali, Rajeev K; Naguib, Raouf N G

    2011-03-01

    Machine learning-based prediction has been effectively applied in many healthcare applications. Predicting breast screening attendance using machine learning (prior to the actual mammogram) is a new field. This paper presents new predictor attributes for such an algorithm. It describes a new hybrid algorithm that relies on back-propagation and radial basis function-based neural networks for prediction. The algorithm has been developed in an open source-based environment. The algorithm was tested on a 13-year dataset (1995-2008). This paper compares the algorithm and validates its accuracy and efficiency with different platforms. Nearly 80% accuracy and 88% positive predictive value and sensitivity were recorded for the algorithm. The results were encouraging; the 40-50% negative predictive value and specificity warrant further work. Preliminary results were promising and provided ample reason for testing the algorithm on a larger scale.
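
    The five figures reported in this abstract all derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the function name and the counts are hypothetical, chosen only so that the output resembles the pattern described (high sensitivity and positive predictive value, low specificity and negative predictive value), and they are not the study's data.

```python
def screening_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # attenders correctly predicted
        "specificity": tn / (tn + fp),  # non-attenders correctly predicted
        "ppv": tp / (tp + fp),          # predicted attenders who attend
        "npv": tn / (tn + fn),          # predicted non-attenders who do not
    }


# Made-up counts for illustration only.
print(screening_metrics(tp=88, fp=12, tn=8, fn=12))
```

    With counts like these, accuracy is carried by the majority class, which is why a low specificity can coexist with an accuracy near 80% and why the abstract flags it for further work.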

  11. Machine Learning Interface for Medical Image Analysis.

    Science.gov (United States)

    Zhang, Yi C; Kagen, Alexander C

    2017-10-01

    TensorFlow is a second-generation open-source machine learning software library with a built-in framework for implementing neural networks in wide variety of perceptual tasks. Although TensorFlow usage is well established with computer vision datasets, the TensorFlow interface with DICOM formats for medical imaging remains to be established. Our goal is to extend the TensorFlow API to accept raw DICOM images as input; 1513 DaTscan DICOM images were obtained from the Parkinson's Progression Markers Initiative (PPMI) database. DICOM pixel intensities were extracted and shaped into tensors, or n-dimensional arrays, to populate the training, validation, and test input datasets for machine learning. A simple neural network was constructed in TensorFlow to classify images into normal or Parkinson's disease groups. Training was executed over 1000 iterations for each cross-validation set. The gradient descent optimization and Adagrad optimization algorithms were used to minimize cross-entropy between the predicted and ground-truth labels. Cross-validation was performed ten times to produce a mean accuracy of 0.938 ± 0.047 (95 % CI 0.908-0.967). The mean sensitivity was 0.974 ± 0.043 (95 % CI 0.947-1.00) and mean specificity was 0.822 ± 0.207 (95 % CI 0.694-0.950). We extended the TensorFlow API to enable DICOM compatibility in the context of DaTscan image analysis. We implemented a neural network classifier that produces diagnostic accuracies on par with excellent results from previous machine learning models. These results indicate the potential role of TensorFlow as a useful adjunct diagnostic tool in the clinical setting.
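
    The confidence intervals quoted in this abstract are consistent with a normal-approximation interval over the ten cross-validation runs, mean ± 1.96 · SD / √n with n = 10. That reading is an assumption, since the abstract does not state how the intervals were obtained. The sketch below redoes the arithmetic; the function name `normal_ci` and the clipping to the range [0, 1] are choices made for the example.

```python
import math


def normal_ci(mean, sd, n, z=1.96):
    """Normal-approximation confidence interval for a mean, clipped to [0, 1]."""
    half_width = z * sd / math.sqrt(n)
    return max(0.0, mean - half_width), min(1.0, mean + half_width)


# Means and standard deviations as reported in the abstract; n = 10 is assumed.
for name, mean, sd in [("accuracy", 0.938, 0.047),
                       ("sensitivity", 0.974, 0.043),
                       ("specificity", 0.822, 0.207)]:
    low, high = normal_ci(mean, sd, n=10)
    print(f"{name}: {low:.3f}-{high:.3f}")
```

    Under that assumption the computed bounds agree with the reported intervals to within about 0.001, and the wide specificity interval reflects the large fold-to-fold standard deviation of 0.207.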

  12. From Curve Fitting to Machine Learning

    CERN Document Server

    Zielesny, Achim

    2011-01-01

    The analysis of experimental data has been at the heart of science since its beginnings. But it was the advent of digital computers that allowed the execution of highly non-linear and increasingly complex data analysis procedures - methods that were completely unfeasible before. Non-linear curve fitting, clustering and machine learning belong to these modern techniques which are a further step towards computational intelligence. The goal of this book is to provide an interactive and illustrative guide to these topics. It concentrates on the road from two dimensional curve fitting to multidimensional clus

  13. Unintended consequences of machine learning in medicine?

    Science.gov (United States)

    McDonald, Laura; Ramagopalan, Sreeram V; Cox, Andrew P; Oguz, Mustafa

    2017-01-01

    Machine learning (ML) has the potential to significantly aid medical practice. However, a recent article highlighted some negative consequences that may arise from using ML decision support in medicine. We argue here that whilst the concerns raised by the authors may be appropriate, they are not specific to ML, and thus the article may lead to an adverse perception about this technique in particular. Whilst ML is not without its limitations like any methodology, a balanced view is needed in order to not hamper its use in potentially enabling better patient care.

  14. Applying machine learning to predict patient-specific current CD4 ...

    African Journals Online (AJOL)

    This work shows the application of machine learning to predict current CD4 cell count of an HIV- .... Pre-processing ... remaining data elements of the PR and RT datasets. ... technique based on the structure of the human brain's neuron.

  15. Quantum cloning machines and the applications

    Energy Technology Data Exchange (ETDEWEB)

    Fan, Heng, E-mail: hfan@iphy.ac.cn [Beijing National Laboratory for Condensed Matter Physics, Institute of Physics, Chinese Academy of Sciences, Beijing 100190 (China); Collaborative Innovation Center of Quantum Matter, Beijing 100190 (China); Wang, Yi-Nan; Jing, Li [School of Physics, Peking University, Beijing 100871 (China); Yue, Jie-Dong [Beijing National Laboratory for Condensed Matter Physics, Institute of Physics, Chinese Academy of Sciences, Beijing 100190 (China); Shi, Han-Duo; Zhang, Yong-Liang; Mu, Liang-Zhu [School of Physics, Peking University, Beijing 100871 (China)

    2014-11-20

    The no-cloning theorem, which states that an unknown quantum state cannot be cloned perfectly, is fundamental to quantum mechanics and quantum information science. However, we can try to clone a quantum state approximately with the optimal fidelity, or instead, we can try to clone it perfectly with the largest probability. Thus various quantum cloning machines have been designed for different quantum information protocols. Specifically, quantum cloning machines can be designed to analyze the security of quantum key distribution protocols such as the BB84 protocol, the six-state protocol, the B92 protocol and their generalizations. Some well-known quantum cloning machines include the universal quantum cloning machine, the phase-covariant cloning machine, the asymmetric quantum cloning machine and the probabilistic quantum cloning machine. In the past years, much progress has been made in studying quantum cloning machines and their applications and implementations, both theoretically and experimentally. In this review, we give a complete description of those important developments in quantum cloning and some related topics. This review is also self-contained; in particular, we try to present some detailed formulations so that further study can build on those results.

  16. Quantum cloning machines and the applications

    International Nuclear Information System (INIS)

    Fan, Heng; Wang, Yi-Nan; Jing, Li; Yue, Jie-Dong; Shi, Han-Duo; Zhang, Yong-Liang; Mu, Liang-Zhu

    2014-01-01

    The no-cloning theorem, which states that an unknown quantum state cannot be cloned perfectly, is fundamental to quantum mechanics and quantum information science. However, we can try to clone a quantum state approximately with the optimal fidelity, or instead, we can try to clone it perfectly with the largest probability. Thus various quantum cloning machines have been designed for different quantum information protocols. Specifically, quantum cloning machines can be designed to analyze the security of quantum key distribution protocols such as the BB84 protocol, the six-state protocol, the B92 protocol and their generalizations. Some well-known quantum cloning machines include the universal quantum cloning machine, the phase-covariant cloning machine, the asymmetric quantum cloning machine and the probabilistic quantum cloning machine. In the past years, much progress has been made in studying quantum cloning machines and their applications and implementations, both theoretically and experimentally. In this review, we give a complete description of those important developments in quantum cloning and some related topics. This review is also self-contained; in particular, we try to present some detailed formulations so that further study can build on those results.

  17. Robust Matching Pursuit Extreme Learning Machines

    Directory of Open Access Journals (Sweden)

    Zejian Yuan

    2018-01-01

    Extreme learning machine (ELM) is a popular learning algorithm for single hidden layer feedforward networks (SLFNs). It was originally proposed with inspiration from biological learning and has attracted considerable attention due to its adaptability to various tasks, fast learning ability and efficient computation cost. As an effective sparse representation method, the orthogonal matching pursuit (OMP) method can be embedded into ELM to overcome the singularity problem and improve stability. Usually OMP recovers a sparse vector by minimizing a least squares (LS) loss, which is efficient for Gaussian distributed data but may suffer performance deterioration in the presence of non-Gaussian data. To address this problem, a robust matching pursuit method based on a novel kernel risk-sensitive loss (KRSLMP for short) is first proposed in this paper. The KRSLMP is then applied to ELM to solve for the sparse output weight vector, and the new method, named KRSLMP-ELM, is developed for SLFN learning. Experimental results on synthetic and real-world data sets confirm the effectiveness and superiority of the proposed method.

  18. Towards Machine Learning of Motor Skills

    Science.gov (United States)

    Peters, Jan; Schaal, Stefan; Schölkopf, Bernhard

    Autonomous robots that can adapt to novel situations have been a long-standing vision of robotics, artificial intelligence, and cognitive sciences. Early approaches to this goal during the heyday of artificial intelligence research in the late 1980s, however, made it clear that an approach purely based on reasoning or human insights would not be able to model all the perceptuomotor tasks that a robot should fulfill. Instead, new hope was put in the growing wake of machine learning that promised fully adaptive control algorithms which learn both by observation and trial-and-error. However, to date, learning techniques have yet to fulfill this promise as only a few methods manage to scale into the high-dimensional domains of manipulator robotics, or even the new upcoming trend of humanoid robotics, and usually scaling was only achieved in precisely pre-structured domains. In this paper, we investigate the ingredients for a general approach to motor skill learning in order to get one step closer towards human-like performance. To do so, we study two major components for such an approach: first, a theoretically well-founded general approach to representing the required control structures for task representation and execution and, second, appropriate learning algorithms which can be applied in this setting.

  19. Medical Dataset Classification: A Machine Learning Paradigm Integrating Particle Swarm Optimization with Extreme Learning Machine Classifier

    Directory of Open Access Journals (Sweden)

    C. V. Subbulakshmi

    2015-01-01

    Medical data classification is a prime data mining problem that has been discussed for a decade and has attracted researchers around the world. Most classifiers are designed to learn from the data itself using a training process, because complete expert knowledge to determine classifier parameters is impracticable. This paper proposes a hybrid methodology based on the machine learning paradigm. This paradigm integrates the successful exploration mechanism, called the self-regulated learning capability, of the particle swarm optimization (PSO) algorithm with the extreme learning machine (ELM) classifier. As a recent off-line learning method, ELM is a single-hidden layer feedforward neural network (FFNN) that has proved to be an excellent classifier with a large number of hidden layer neurons. In this research, PSO is used to determine the optimum set of parameters for the ELM, thus reducing the number of hidden layer neurons, and it further improves the network generalization performance. The proposed method is tested on five benchmark datasets from the UCI Machine Learning Repository for medical dataset classification. Simulation results show that the proposed approach is able to achieve good generalization performance, compared to the results of other classifiers.
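
    The parameter search described in this abstract can be illustrated with a generic global-best particle swarm optimizer. This is a minimal sketch, not the authors' self-regulated variant: the inertia and acceleration constants are common textbook choices, and `stand_in_objective` is a hypothetical smooth function standing in for the validation error of an ELM as a function of two of its parameters.

```python
import random

def pso_minimize(objective, dim, n_particles=20, n_iters=200,
                 bounds=(-5.0, 5.0), w=0.729, c1=1.49445, c2=1.49445, seed=0):
    """Plain global-best PSO; returns (best_position, best_value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # each particle's best position
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # swarm-wide best
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull towards personal best + pull towards global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def stand_in_objective(params):
    """Hypothetical error surface with its minimum at (3.0, 0.5)."""
    a, b = params
    return (a - 3.0) ** 2 + (b - 0.5) ** 2

best_params, best_val = pso_minimize(stand_in_objective, dim=2)
print(best_params, best_val)
```

    In a real setting the objective would train an ELM with the candidate parameters and return its cross-validated error, which makes each evaluation far more expensive than this toy function.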

  20. Intellectual Property and Machine Learning: An exploratory study

    OpenAIRE

    Øverlier, Lasse

    2017-01-01

    Our research makes a contribution by exemplifying what controls the freedom-to-operate for a company operating in the area of machine learning. Through interviews we demonstrate the industry’s alternating viewpoints to whether copyrighted data used as input to machine learning systems should be viewed differently than copying the data for storage or reproduction. In addition we show that unauthorized use of copyrighted data in machine learning systems is hard to detect with the burden of proo...

  1. Machine Learning Topological Invariants with Neural Networks

    Science.gov (United States)

    Zhang, Pengfei; Shen, Huitao; Zhai, Hui

    2018-02-01

    In this Letter we supervisedly train neural networks to distinguish different topological phases in the context of topological band insulators. After training with Hamiltonians of one-dimensional insulators with chiral symmetry, the neural network can predict their topological winding numbers with nearly 100% accuracy, even for Hamiltonians with larger winding numbers that are not included in the training data. These results show a remarkable success that the neural network can capture the global and nonlinear topological features of quantum phases from local inputs. By opening up the neural network, we confirm that the network does learn the discrete version of the winding number formula. We also make a couple of remarks regarding the role of the symmetry and the opposite effect of regularization techniques when applying machine learning to physical systems.
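
    For reference, a discrete winding number of the kind the abstract mentions can be computed by summing phase increments around the Brillouin zone. The sketch below assumes a two-band chiral Hamiltonian summarized by a complex function h(k) = h_x(k) + i h_y(k); the example functions (an SSH-like form t1 + t2 e^{ik}) are illustrative assumptions, not taken from the Letter.

```python
import cmath
import math

def winding_number(h, n_points=200):
    """Discrete winding number of the closed curve k -> h(k), k in [0, 2*pi).

    Sums the phase change between neighbouring k points and divides by 2*pi.
    Assumes h(k) is never zero and n_points is large enough that each phase
    step stays below pi in magnitude.
    """
    total = 0.0
    for j in range(n_points):
        k0 = 2.0 * math.pi * j / n_points
        k1 = 2.0 * math.pi * (j + 1) / n_points
        total += cmath.phase(h(k1) / h(k0))  # increment in (-pi, pi]
    return round(total / (2.0 * math.pi))

# Illustrative (assumed) Hamiltonians:
w_topological = winding_number(lambda k: 0.5 + 1.0 * cmath.exp(1j * k))  # |t2| > |t1|
w_trivial = winding_number(lambda k: 1.0 + 0.5 * cmath.exp(1j * k))      # |t2| < |t1|
w_higher = winding_number(lambda k: 0.3 + cmath.exp(2j * k))             # winds twice
print(w_topological, w_trivial, w_higher)
```

    Each term of the sum depends only on h at two neighbouring k points, while the total is a global integer.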

  2. Machine-learning the string landscape

    Directory of Open Access Journals (Sweden)

    Yang-Hui He

    2017-11-01

    We propose a paradigm to apply machine learning to various databases which have emerged in the study of the string landscape. In particular, we establish neural networks as both classifiers and predictors and train them with a host of available data ranging from Calabi–Yau manifolds and vector bundles, to quiver representations for gauge theories, using a novel framework of recasting geometrical and physical data as pixelated images. We find that even a relatively simple neural network can learn many significant quantities to astounding accuracy in a matter of minutes and can also predict hitherto unencountered results, thereby rendering the paradigm a valuable tool in physics as well as pure mathematics.

  3. A Machine Learning Framework to Forecast Wave Conditions

    Science.gov (United States)

    Zhang, Y.; James, S. C.; O'Donncha, F.

    2017-12-01

    Recently, significant effort has been undertaken to quantify and extract wave energy because it is renewable, environmentally friendly, abundant, and often close to population centers. However, a major challenge is the ability to accurately and quickly predict energy production, especially across a 48-hour cycle. Accurate forecasting of wave conditions is a challenging undertaking that typically involves solving the spectral action-balance equation on a discretized grid with high spatial resolution. The nature of the computations typically demands high-performance computing infrastructure. Using a case-study site at Monterey Bay, California, a machine learning framework was trained to replicate numerically simulated wave conditions at a fraction of the typical computational cost. Specifically, the physics-based Simulating WAves Nearshore (SWAN) model, driven by measured wave conditions, nowcast ocean currents, and wind data, was used to generate training data for machine learning algorithms. The model was run between April 1st, 2013 and May 31st, 2017, generating forecasts at three-hour intervals and yielding 11,078 distinct model outputs. SWAN-generated fields of 3,104 wave heights and a characteristic period could be replicated through simple matrix multiplications using the mapping matrices from machine learning algorithms. In fact, wave-height RMSEs from the machine learning algorithms (9 cm) were less than those for the SWAN model-verification exercise where those simulations were compared to buoy wave data within the model domain (>40 cm). The validated machine learning approach, which acts as an accurate surrogate for the SWAN model, can now be used to perform real-time forecasts of wave conditions for the next 48 hours using available forecasted boundary wave conditions, ocean currents, and winds. This solution has obvious applications to wave-energy generation as accurate wave conditions can be forecasted with over a three-order-of-magnitude reduction in

  4. What can machine learning do for antimicrobial peptides, and what can antimicrobial peptides do for machine learning?

    Science.gov (United States)

    Lee, Ernest Y; Lee, Michelle W; Fulan, Benjamin M; Ferguson, Andrew L; Wong, Gerard C L

    2017-12-06

    Antimicrobial peptides (AMPs) are a diverse class of well-studied membrane-permeating peptides with important functions in innate host defense. In this short review, we provide a historical overview of AMPs, summarize previous applications of machine learning to AMPs, and discuss the results of our studies in the context of the latest AMP literature. Much work has been recently done in leveraging computational tools to design new AMP candidates with high therapeutic efficacies for drug-resistant infections. We show that machine learning on AMPs can be used to identify essential physico-chemical determinants of AMP functionality, and identify and design peptide sequences to generate membrane curvature. In a broader scope, we discuss the implications of our findings for the discovery of membrane-active peptides in general, and uncovering membrane activity in new and existing peptide taxonomies.

  5. A framework for detection of malicious software in Android handheld systems using machine learning techniques

    OpenAIRE

    Torregrosa García, Blas

    2015-01-01

    The present study aims at designing and developing new approaches to detect malicious applications in Android-based devices. More precisely, MaLDroide (Machine Learning-based Detector for Android malware), a framework for detection of Android malware based on machine learning techniques, is introduced here. It is devised to identify malicious applications. This work aims to design and develop new ways of detecting malicious applications on devices...

  6. Advances in Patient Classification for Traditional Chinese Medicine: A Machine Learning Perspective

    Directory of Open Access Journals (Sweden)

    Changbo Zhao

    2015-01-01

    data analyzed by different computational methods, we present the overview for four subfields of TCM diagnosis, respectively. For each subfield, we design a rectangular reference list with applications in the horizontal direction and machine learning algorithms in the longitudinal direction. According to the current development of objective TCM diagnosis for patient classification, a discussion of the research issues around machine learning techniques with applications to TCM diagnosis is given to facilitate the further research for TCM patient classification.

  7. Data Mining: Practical Machine Learning Tools and Techniques

    CERN Document Server

    Witten, Ian H; Hall, Mark A

    2011-01-01

    Data Mining: Practical Machine Learning Tools and Techniques offers a thorough grounding in machine learning concepts as well as practical advice on applying machine learning tools and techniques in real-world data mining situations. This highly anticipated third edition of the most acclaimed work on data mining and machine learning will teach you everything you need to know about preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining. Thorough updates reflect the technical changes and modernizations that have taken place

  8. Mining the Kepler Data using Machine Learning

    Science.gov (United States)

    Walkowicz, Lucianne; Howe, A. R.; Nayar, R.; Turner, E. L.; Scargle, J.; Meadows, V.; Zee, A.

    2014-01-01

    Kepler's high cadence and incredible precision have provided an unprecedented view into stars and their planetary companions, revealing both expected and novel phenomena and systems. Due to the large number of Kepler lightcurves, the discovery of novel phenomena in particular has often been serendipitous in the course of searching for known forms of variability (for example, the discovery of the doubly pulsating elliptical binary KOI-54, originally identified by the transiting planet search pipeline). In this talk, we discuss progress on mining the Kepler data through both supervised and unsupervised machine learning, intended to both systematically search the Kepler lightcurves for rare or anomalous variability, and to create a variability catalog for community use. Mining the dataset in this way also allows for a quantitative identification of anomalous variability, and so may also be used as a signal-agnostic form of optical SETI. As the Kepler data are exceptionally rich, they provide an interesting counterpoint to machine learning efforts typically performed on sparser and/or noisier survey data, and will inform similar characterization carried out on future survey datasets.

  9. PHOTOMETRIC SUPERNOVA CLASSIFICATION WITH MACHINE LEARNING

    International Nuclear Information System (INIS)

    Lochner, Michelle; Peiris, Hiranya V.; Lahav, Ofer; Winter, Max K.; McEwen, Jason D.

    2016-01-01

    Automated photometric supernova classification has become an active area of research in recent years in light of current and upcoming imaging surveys such as the Dark Energy Survey (DES) and the Large Synoptic Survey Telescope, given that spectroscopic confirmation of type for all supernovae discovered will be impossible. Here, we develop a multi-faceted classification pipeline, combining existing and new approaches. Our pipeline consists of two stages: extracting descriptive features from the light curves and classification using a machine learning algorithm. Our feature extraction methods vary from model-dependent techniques, namely SALT2 fits, to more independent techniques that fit parametric models to curves, to a completely model-independent wavelet approach. We cover a range of representative machine learning algorithms, including naive Bayes, k-nearest neighbors, support vector machines, artificial neural networks, and boosted decision trees (BDTs). We test the pipeline on simulated multi-band DES light curves from the Supernova Photometric Classification Challenge. Using the commonly used area under the curve (AUC) of the Receiver Operating Characteristic as a metric, we find that the SALT2 fits and the wavelet approach, with the BDT algorithm, each achieve an AUC of 0.98, where 1 represents perfect classification. We find that a representative training set is essential for good classification, whatever the feature set or algorithm, with implications for spectroscopic follow-up. Importantly, we find that by using either the SALT2 or the wavelet feature sets with a BDT algorithm, accurate classification is possible purely from light curve data, without the need for any redshift information.
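
    The AUC metric quoted in this abstract can be computed directly from classifier scores as the probability that a randomly chosen positive example is ranked above a randomly chosen negative one. The following is a small illustrative implementation; the labels and scores are made-up toy values, not supernova data, and the function name `roc_auc` is hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 class labels (1 = positive class).
    scores: classifier scores, higher meaning "more likely positive".
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy example: three of the four positive/negative pairs are ordered correctly.
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
perfect_auc = roc_auc([0, 0, 1, 1], [0.1, 0.2, 0.7, 0.9])
print(toy_auc, perfect_auc)
```

    The pairwise form is quadratic in the number of examples; it is used here for clarity, and a rank-based computation would be preferable for large test sets.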

  10. PHOTOMETRIC SUPERNOVA CLASSIFICATION WITH MACHINE LEARNING

    Energy Technology Data Exchange (ETDEWEB)

    Lochner, Michelle; Peiris, Hiranya V.; Lahav, Ofer; Winter, Max K. [Department of Physics and Astronomy, University College London, Gower Street, London WC1E 6BT (United Kingdom); McEwen, Jason D., E-mail: dr.michelle.lochner@gmail.com [Mullard Space Science Laboratory, University College London, Surrey RH5 6NT (United Kingdom)

    2016-08-01

    Automated photometric supernova classification has become an active area of research in recent years in light of current and upcoming imaging surveys such as the Dark Energy Survey (DES) and the Large Synoptic Survey Telescope, given that spectroscopic confirmation of type for all supernovae discovered will be impossible. Here, we develop a multi-faceted classification pipeline, combining existing and new approaches. Our pipeline consists of two stages: extracting descriptive features from the light curves and classification using a machine learning algorithm. Our feature extraction methods vary from model-dependent techniques, namely SALT2 fits, to more independent techniques that fit parametric models to curves, to a completely model-independent wavelet approach. We cover a range of representative machine learning algorithms, including naive Bayes, k-nearest neighbors, support vector machines, artificial neural networks, and boosted decision trees (BDTs). We test the pipeline on simulated multi-band DES light curves from the Supernova Photometric Classification Challenge. Using the commonly used area under the curve (AUC) of the Receiver Operating Characteristic as a metric, we find that the SALT2 fits and the wavelet approach, with the BDT algorithm, each achieve an AUC of 0.98, where 1 represents perfect classification. We find that a representative training set is essential for good classification, whatever the feature set or algorithm, with implications for spectroscopic follow-up. Importantly, we find that by using either the SALT2 or the wavelet feature sets with a BDT algorithm, accurate classification is possible purely from light curve data, without the need for any redshift information.

  11. Quantum machine learning: a classical perspective

    Science.gov (United States)

    Ciliberto, Carlo; Herbster, Mark; Ialongo, Alessandro Davide; Pontil, Massimiliano; Severini, Simone; Wossnig, Leonard

    2018-01-01

    Recently, increased computational power and data availability, as well as algorithmic advances, have led machine learning (ML) techniques to impressive results in regression, classification, data generation and reinforcement learning tasks. Despite these successes, the proximity to the physical limits of chip fabrication alongside the increasing size of datasets is motivating a growing number of researchers to explore the possibility of harnessing the power of quantum computation to speed up classical ML algorithms. Here we review the literature in quantum ML and discuss perspectives for a mixed readership of classical ML and quantum computation experts. Particular emphasis will be placed on clarifying the limitations of quantum algorithms, how they compare with their best classical counterparts and why quantum resources are expected to provide advantages for learning problems. Learning in the presence of noise and certain computationally hard problems in ML are identified as promising directions for the field. Practical questions, such as how to upload classical data into quantum form, will also be addressed. PMID:29434508

  12. Global Bathymetry: Machine Learning for Data Editing

    Science.gov (United States)

    Sandwell, D. T.; Tea, B.; Freund, Y.

    2017-12-01

    The accuracy of global bathymetry depends primarily on the coverage and accuracy of the sounding data and secondarily on the depth predicted from gravity. A main focus of our research is to add newly-available data to the global compilation. Most data sources have 1-12% of erroneous soundings caused by a wide array of blunders and measurement errors. Over the years we have hand-edited these data using undergraduate employees at UCSD (440 million soundings at 500 m resolution). We are developing a machine learning approach to refine the flagging of the older soundings and provide automated editing of newly-acquired soundings. The approach has three main steps: 1) Combine the sounding data with additional information that may inform the machine learning algorithm. The additional parameters include: depth predicted from gravity; distance to the nearest sounding from other cruises; seafloor age; spreading rate; sediment thickness; and vertical gravity gradient. 2) Use available edit decisions as training data sets for a boosted tree algorithm with a binary logistic objective function and L2 regularization. Initial results with poor quality single beam soundings show that the automated algorithm matches the hand-edited data 89% of the time. The results show that most of the information for detecting outliers comes from predicted depth, with secondary contributions from distance to the nearest sounding and longitude. A similar analysis using very high quality multibeam data shows that the automated algorithm matches the hand-edited data 93% of the time. Again, most of the information for detecting outliers comes from predicted depth, with secondary contributions from distance to the nearest sounding and longitude. 3) Use the machine learning parameters, derived from the training data, to edit 12 million newly acquired single beam soundings provided by the National Geospatial-Intelligence Agency. The output of the learning algorithm will be
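
    To make "binary logistic objective function and L2 regularization" concrete, the sketch below computes the optimal value of a single tree leaf under a second-order (Newton) boosting formulation, as used in several gradient-boosting libraries. The abstract does not name its implementation, so this formulation is an assumption; the labels (1 = sounding flagged as erroneous) and the function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def leaf_weight(labels, margins, l2=1.0):
    """Optimal leaf value for logistic loss with an L2 penalty on leaf weights.

    labels:  0/1 targets of the examples that fall in this leaf.
    margins: current raw (log-odds) predictions for those examples.
    Returns  -G / (H + l2), where G and H are the summed first and second
    derivatives of the logistic loss with respect to the margin.
    """
    grad_sum = 0.0
    hess_sum = 0.0
    for y, m in zip(labels, margins):
        p = sigmoid(m)
        grad_sum += p - y          # first derivative of the logistic loss
        hess_sum += p * (1.0 - p)  # second derivative
    return -grad_sum / (hess_sum + l2)

# Three soundings in a leaf, two hand-flagged as bad, starting from margin 0.
w_light = leaf_weight([1, 1, 0], [0.0, 0.0, 0.0], l2=1.0)
w_heavy = leaf_weight([1, 1, 0], [0.0, 0.0, 0.0], l2=10.0)
print(w_light, w_heavy)
```

    A larger L2 penalty shrinks the leaf value towards zero, which damps the influence of leaves supported by only a few hand-edited soundings.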

  13. WORMHOLE: Novel Least Diverged Ortholog Prediction through Machine Learning.

    Directory of Open Access Journals (Sweden)

    George L Sutphin

    2016-11-01

    The rapid advancement of technology in genomics and targeted genetic manipulation has made comparative biology an increasingly prominent strategy to model human disease processes. Predicting orthology relationships between species is a vital component of comparative biology. Dozens of strategies for predicting orthologs have been developed using combinations of gene and protein sequence, phylogenetic history, and functional interaction with progressively increasing accuracy. A relatively new class of orthology prediction strategies combines aspects of multiple methods into meta-tools, resulting in improved prediction performance. Here we present WORMHOLE, a novel ortholog prediction meta-tool that applies machine learning to integrate 17 distinct ortholog prediction algorithms to identify novel least diverged orthologs (LDOs) between 6 eukaryotic species: humans, mice, zebrafish, fruit flies, nematodes, and budding yeast. Machine learning allows WORMHOLE to intelligently incorporate predictions from a wide spectrum of strategies in order to form aggregate predictions of LDOs with high confidence. In this study we demonstrate the performance of WORMHOLE across each combination of query and target species. We show that WORMHOLE is particularly adept at improving LDO prediction performance between distantly related species, expanding the pool of LDOs while maintaining low evolutionary distance and a high level of functional relatedness between genes in LDO pairs. We present extensive validation, including cross-validated prediction of PANTHER LDOs and evaluation of evolutionary divergence and functional similarity, and discuss future applications of machine learning in ortholog prediction. A WORMHOLE web tool has been developed and is available at http://wormhole.jax.org/.

  14. WORMHOLE: Novel Least Diverged Ortholog Prediction through Machine Learning

    Science.gov (United States)

    Sutphin, George L.; Mahoney, J. Matthew; Sheppard, Keith; Walton, David O.; Korstanje, Ron

    2016-01-01

    The rapid advancement of technology in genomics and targeted genetic manipulation has made comparative biology an increasingly prominent strategy to model human disease processes. Predicting orthology relationships between species is a vital component of comparative biology. Dozens of strategies for predicting orthologs have been developed using combinations of gene and protein sequence, phylogenetic history, and functional interaction with progressively increasing accuracy. A relatively new class of orthology prediction strategies combines aspects of multiple methods into meta-tools, resulting in improved prediction performance. Here we present WORMHOLE, a novel ortholog prediction meta-tool that applies machine learning to integrate 17 distinct ortholog prediction algorithms to identify novel least diverged orthologs (LDOs) between 6 eukaryotic species—humans, mice, zebrafish, fruit flies, nematodes, and budding yeast. Machine learning allows WORMHOLE to intelligently incorporate predictions from a wide-spectrum of strategies in order to form aggregate predictions of LDOs with high confidence. In this study we demonstrate the performance of WORMHOLE across each combination of query and target species. We show that WORMHOLE is particularly adept at improving LDO prediction performance between distantly related species, expanding the pool of LDOs while maintaining low evolutionary distance and a high level of functional relatedness between genes in LDO pairs. We present extensive validation, including cross-validated prediction of PANTHER LDOs and evaluation of evolutionary divergence and functional similarity, and discuss future applications of machine learning in ortholog prediction. A WORMHOLE web tool has been developed and is available at http://wormhole.jax.org/. PMID:27812085

  15. Nonlinear machine learning in soft materials engineering and design

    Science.gov (United States)

    Ferguson, Andrew

    The inherently many-body nature of molecular folding and colloidal self-assembly makes it challenging to identify the underlying collective mechanisms and pathways governing system behavior, and has hindered rational design of soft materials with desired structure and function. Fundamentally, there exists a predictive gulf between the architecture and chemistry of individual molecules or colloids and the collective many-body thermodynamics and kinetics. Integrating machine learning techniques with statistical thermodynamics provides a means to bridge this divide and identify emergent folding pathways and self-assembly mechanisms from computer simulations or experimental particle tracking data. We will survey a few of our applications of this framework that illustrate the value of nonlinear machine learning in understanding and engineering soft materials: the non-equilibrium self-assembly of Janus colloids into pinwheels, clusters, and archipelagos; engineering reconfigurable "digital colloids" as a novel high-density information storage substrate; probing hierarchically self-assembling conjugated asphaltenes in crude oil; and determining macromolecular folding funnels from measurements of single experimental observables. We close with an outlook on the future of machine learning in soft materials engineering, and share some personal perspectives on working at this disciplinary intersection. We acknowledge support for this work from a National Science Foundation CAREER Award (Grant No. DMR-1350008) and the Donors of the American Chemical Society Petroleum Research Fund (ACS PRF #54240-DNI6).

  16. Prediction of body mass index status from voice signals based on machine learning for automated medical applications.

    Science.gov (United States)

    Lee, Bum Ju; Kim, Keun Ho; Ku, Boncho; Jang, Jun-Su; Kim, Jong Yeol

    2013-05-01

    tools for real-time monitoring; such tools are considered helpful in improving automated BMI status diagnosis in remote healthcare or telemedicine and are expected to have applications in forensic and medical science. Copyright © 2013 Elsevier B.V. All rights reserved.

  17. Analysis of underlying causes of inter-expert disagreement in retinopathy of prematurity diagnosis. Application of machine learning principles.

    Science.gov (United States)

    Ataer-Cansizoglu, E; Kalpathy-Cramer, J; You, S; Keck, K; Erdogmus, D; Chiang, M F

    2015-01-01

    Inter-expert variability in image-based clinical diagnosis has been demonstrated in many diseases including retinopathy of prematurity (ROP), which is a disease affecting low birth weight infants and is a major cause of childhood blindness. In order to better understand the underlying causes of variability among experts, we propose a method to quantify the variability of expert decisions and analyze the relationship between expert diagnoses and features computed from the images. Identification of these features is relevant for development of computer-based decision support systems and educational systems in ROP, and these methods may be applicable to other diseases where inter-expert variability is observed. The experiments were carried out on a dataset of 34 retinal images, each with diagnoses provided independently by 22 experts. Analysis was performed using concepts of Mutual Information (MI) and Kernel Density Estimation. A large set of structural features (a total of 66) was extracted from retinal images. Feature selection was utilized to identify the most important features that correlated with actual clinical decisions by the 22 study experts. The best three features for each observer were selected by an exhaustive search on all possible feature subsets, with joint MI as the relevance criterion. We also compared our results with the results of Cohen's Kappa [36] as an inter-rater reliability measure. The results demonstrate that a group of observers (17 among 22) decide consistently with each other. The mean and second central moment of arteriolar tortuosity are among the reasons for disagreement between this group and the rest of the observers, meaning that the group of experts considers the amount of tortuosity as well as the variation of tortuosity in the image. Given a set of image-based features, the proposed analysis method can identify critical image-based features that lead to expert agreement and disagreement in diagnosis of ROP. Although tree
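
    Cohen's kappa, the inter-rater reliability measure the abstract compares against, corrects the raw agreement between two raters for the agreement expected by chance. A minimal sketch follows; the two rating lists are invented toy values (1 = disease present, 0 = absent), not data from the study.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and
    p_e the agreement expected if both raters labelled independently with
    their own marginal label frequencies.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters gave one identical label to every item
    return (observed - expected) / (1.0 - expected)

# Toy example: the raters agree on 3 of 4 images.
kappa = cohens_kappa([1, 1, 0, 0], [1, 0, 0, 0])
print(kappa)
```

    Here the raw agreement is 0.75 but chance agreement is 0.5, so kappa is 0.5. Unlike the mutual-information analysis in the abstract, kappa quantifies how much two raters agree but says nothing about which image features drive the disagreement.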

  18. Application of machine learning methods to histone methylation ChIP-Seq data reveals H4R3me2 globally represses gene expression

    Science.gov (United States)

    2010-01-01

    Background: In the last decade, biochemical studies have revealed that epigenetic modifications including histone modifications, histone variants and DNA methylation form a complex network that regulates the state of chromatin and processes that depend on it, including transcription and DNA replication. Currently, a large number of these epigenetic modifications are being mapped in a variety of cell lines at different stages of development using high throughput sequencing by members of the ENCODE consortium, the NIH Roadmap Epigenomics Program and the Human Epigenome Project. An extremely promising and underexplored area of research is the application of machine learning methods, which are designed to construct predictive network models, to these large-scale epigenomic data sets. Results: Using a ChIP-Seq data set of 20 histone lysine and arginine methylations and histone variant H2A.Z in human CD4+ T-cells, we built predictive models of gene expression as a function of histone modification/variant levels using Multilinear (ML) Regression and Multivariate Adaptive Regression Splines (MARS). Along with extensive crosstalk among the 20 histone methylations, we found H4R3me2 was the most and second most globally repressive histone methylation among the 20 studied in the ML and MARS models, respectively. In support of our finding, a number of experimental studies show that PRMT5-catalyzed symmetric dimethylation of H4R3 is associated with repression of gene expression. This includes a recent study, which demonstrated that H4R3me2 is required for DNMT3A-mediated DNA methylation, a known global repressor of gene expression. Conclusion: In stark contrast to univariate analysis of the relationship between H4R3me2 and gene expression levels, our study showed that the regulatory role of some modifications like H4R3me2 is masked by confounding variables, but can be elucidated by multivariate/systems-level approaches. PMID:20653935

  19. Learning Activity Packets for Milling Machines. Unit I--Introduction to Milling Machines.

    Science.gov (United States)

    Oklahoma State Board of Vocational and Technical Education, Stillwater. Curriculum and Instructional Materials Center.

    This learning activity packet (LAP) outlines the study activities and performance tasks covered in a related curriculum guide on milling machines. The course of study in this LAP is intended to help students learn to identify parts and attachments of vertical and horizontal milling machines, identify work-holding devices, state safety rules, and…

  20. THE INFLUENCE OF INTERACTIVE MULTIMEDIA AUDIO TELLING MACHINE (iMATE) USE AND STUDENTS’ SELF REGULATED LEARNING LEVEL ON ENGLISH LANGUAGE GREETINGS APPLICATION SKILLS (PENGARUH PENGGUNAAN INTERACTIVE MULTIMEDIA AUDIO TELLING MACHINE (iMATE) DAN TINGKAT SELF REGULATED LEARNING SISWA TERHADAP KEMAMPUAN MENERAPKAN GREETINGS BAHASA INGGRIS)

    Directory of Open Access Journals (Sweden)

    Muhammad Ridwan Sutisna

    2018-02-01

    Abstract. New trends in technology and the growing need for English proficiency have encouraged improvements in the quality of English instruction. The aim of this research is to determine the effect of the Interactive Multimedia Audio Telling Machine (iMATE) and self-regulated learning level on the English language greetings application skills of vocational school students. iMATE is the interactive instructional medium used in this research, while students’ self-regulated learning is divided into high and low levels. This research used an experimental design and was held at SMK Pasundan 3 Bandung. The findings were: (1) students generally achieved better results when using iMATE; (2) there was an interaction between the use of instructional media and students’ self-regulated learning level; (3) students with high self-regulated learning achieved much better results when using iMATE; (4) students with low self-regulated learning had better results when not using iMATE. These findings lead to the conclusion that students’ self-regulated learning level may affect the success of instructional media use, especially in teaching English language skills.

  1. Behavioral Modeling for Mental Health using Machine Learning Algorithms.

    Science.gov (United States)

    Srividya, M; Mohanavalli, S; Bhalaji, N

    2018-04-03

    Mental health is an indicator of the emotional, psychological and social well-being of an individual. It determines how an individual thinks, feels and handles situations. Positive mental health helps one to work productively and realize one's full potential. Mental health is important at every stage of life, from childhood and adolescence through adulthood. Many factors contribute to mental health problems, which lead to mental illnesses such as stress, social anxiety, depression, obsessive compulsive disorder, drug addiction, and personality disorders. It is becoming increasingly important to determine the onset of mental illness in order to maintain proper life balance. Machine learning algorithms and Artificial Intelligence (AI) can be harnessed for predicting the onset of mental illness. Such applications, when implemented in real time, will benefit society by serving as a monitoring tool for individuals with deviant behavior. This research work proposes to apply various machine learning algorithms such as support vector machines, decision trees, naïve Bayes classifier, k-nearest neighbor classifier and logistic regression to identify the state of mental health in a target group. The responses obtained from the target group for the designed questionnaire were first subjected to unsupervised learning techniques. The labels obtained as a result of clustering were validated by computing the Mean Opinion Score. These cluster labels were then used to build classifiers to predict the mental health of an individual. Populations from various groups such as high school students, college students and working professionals were considered as target groups. The research presents an analysis of applying the aforementioned machine learning algorithms to the target groups and also suggests directions for future work.

  2. Intelligent Vehicle Power Management Using Machine Learning and Fuzzy Logic

    National Research Council Canada - National Science Library

    Chen, ZhiHang; Masrur, M. A; Murphey, Yi L

    2008-01-01

    .... A machine learning algorithm, LOPPS, has been developed to learn about optimal power source combinations with respect to minimum power loss for all possible load requests and various system power states...

  3. Active learning machine learns to create new quantum experiments.

    Science.gov (United States)

    Melnikov, Alexey A; Poulsen Nautrup, Hendrik; Krenn, Mario; Dunjko, Vedran; Tiersch, Markus; Zeilinger, Anton; Briegel, Hans J

    2018-02-06

    How useful can machine learning be in a quantum laboratory? Here we raise the question of the potential of intelligent machines in the context of scientific research. A major motivation for the present work is the unknown reachability of various entanglement classes in quantum experiments. We investigate this question by using the projective simulation model, a physics-oriented approach to artificial intelligence. In our approach, the projective simulation system is challenged to design complex photonic quantum experiments that produce high-dimensional entangled multiphoton states, which are of high interest in modern quantum experiments. The artificial intelligence system learns to create a variety of entangled states and improves the efficiency of their realization. In the process, the system autonomously (re)discovers experimental techniques which are only now becoming standard in modern quantum optical experiments, a trait which was not explicitly demanded from the system but emerged through the process of learning. Such features highlight the possibility that machines could have a significantly more creative role in future research.

  4. Machine Learning in Computer-Aided Synthesis Planning.

    Science.gov (United States)

    Coley, Connor W; Green, William H; Jensen, Klavs F

    2018-05-15

    . While we introduce this task in the context of reaction validation, its utility extends to the prediction of side products and impurities, among other applications. We describe neural network-based approaches that we and others have developed for this forward prediction task that can be trained on previously published experimental data. Machine learning and artificial intelligence have revolutionized a number of disciplines, not limited to image recognition, dictation, translation, content recommendation, advertising, and autonomous driving. While there is a rich history of using machine learning for structure-activity models in chemistry, it is only now that it is being successfully applied more broadly to organic synthesis and synthesis design. As reported in this Account, machine learning is rapidly transforming CASP, but there are several remaining challenges and opportunities, many pertaining to the availability and standardization of both data and evaluation metrics, which must be addressed by the community at large.

  5. IoT Security Techniques Based on Machine Learning

    OpenAIRE

    Xiao, Liang; Wan, Xiaoyue; Lu, Xiaozhen; Zhang, Yanyong; Wu, Di

    2018-01-01

    Internet of things (IoT) systems, which integrate a variety of devices into networks to provide advanced and intelligent services, have to protect user privacy and address attacks such as spoofing attacks, denial of service attacks, jamming and eavesdropping. In this article, we investigate the attack model for IoT systems, and review the IoT security solutions based on machine learning techniques including supervised learning, unsupervised learning and reinforcement learning. We focus on the machine le...

  6. Fifty years of computer analysis in chest imaging: rule-based, machine learning, deep learning.

    Science.gov (United States)

    van Ginneken, Bram

    2017-03-01

    Half a century ago, the term "computer-aided diagnosis" (CAD) was introduced in the scientific literature. Pulmonary imaging, with chest radiography and computed tomography, has always been one of the focus areas in this field. In this study, I describe how machine learning became the dominant technology for tackling CAD in the lungs, generally producing better results than do classical rule-based approaches, and how the field is now rapidly changing: in the last few years, we have seen how even better results can be obtained with deep learning. The key differences among rule-based processing, machine learning, and deep learning are summarized and illustrated for various applications of CAD in the chest.

  7. Intelligent sensor networks the integration of sensor networks, signal processing and machine learning

    CERN Document Server

    Hu, Fei

    2012-01-01

    Although governments worldwide have invested significantly in intelligent sensor network research and applications, few books cover intelligent sensor networks from a machine learning and signal processing perspective. Filling this void, Intelligent Sensor Networks: The Integration of Sensor Networks, Signal Processing and Machine Learning focuses on the close integration of sensing, networking, and smart signal processing via machine learning. Based on the world-class research of award-winning authors, the book provides a firm grounding in the fundamentals of intelligent sensor networks, incl

  8. Novel Automatic Filter-Class Feature Selection for Machine Learning Regression

    DEFF Research Database (Denmark)

    Wollsen, Morten Gill; Hallam, John; Jørgensen, Bo Nørregaard

    2017-01-01

    With the increased focus on application of Big Data in all sectors of society, the performance of machine learning becomes essential. Efficient machine learning depends on efficient feature selection algorithms. Filter feature selection algorithms are model-free and therefore very fast, but require...... model in the feature selection process. PCA is often used in machine learning literature and can be considered the default feature selection method. RDESF outperformed PCA in both experiments in both prediction error and computational speed. RDESF is a new step into filter-based automatic feature...

  9. Quantum machine learning what quantum computing means to data mining

    CERN Document Server

    Wittek, Peter

    2014-01-01

    Quantum Machine Learning bridges the gap between abstract developments in quantum computing and the applied research on machine learning. Paring down the complexity of the disciplines involved, it focuses on providing a synthesis that explains the most important machine learning algorithms in a quantum framework. Theoretical advances in quantum computing are hard to follow for computer scientists, and sometimes even for researchers involved in the field. The lack of a step-by-step guide hampers the broader understanding of this emergent interdisciplinary body of research. Quantum Machine L

  10. CRDM motion analysis using machine learning technique

    International Nuclear Information System (INIS)

    Nishimura, Takuya; Nakayama, Hiroyuki; Saitoh, Mayumi; Yaguchi, Seiji

    2017-01-01

    Magnetic jack type Control Rod Drive Mechanisms (CRDMs) for pressurized water reactor (PWR) plants operate control rods in response to electrical signals from a reactor control system. CRDM operability is evaluated by quantifying the armature's closed/opened response time, that is, the interval between the coil energizing/de-energizing points and the armature closed/opened points. MHI has already developed an automatic CRDM motion analysis and applied it to actual plants. However, CRDM operational data vary widely depending on characteristics such as plant condition, plant, etc. In the existing motion analysis, applying a single analysis technique to all plant conditions, plants, etc. limits analysis accuracy. In this study, MHI investigated motion analysis using machine learning (Random Forests), which accommodates the wide variation in CRDM operational data flexibly and improves analysis accuracy. (author)

  11. Pileup Mitigation with Machine Learning (PUMML)

    Science.gov (United States)

    Komiske, Patrick T.; Metodiev, Eric M.; Nachman, Benjamin; Schwartz, Matthew D.

    2017-12-01

    Pileup involves the contamination of the energy distribution arising from the primary collision of interest (leading vertex) by radiation from soft collisions (pileup). We develop a new technique for removing this contamination using machine learning and convolutional neural networks. The network takes as input the energy distribution of charged leading vertex particles, charged pileup particles, and all neutral particles and outputs the energy distribution of particles coming from leading vertex alone. The PUMML algorithm performs remarkably well at eliminating pileup distortion on a wide range of simple and complex jet observables. We test the robustness of the algorithm in a number of ways and discuss how the network can be trained directly on data.

  12. IEEE International Workshop on Machine Learning for Signal Processing: Preface

    DEFF Research Database (Denmark)

    Tao, Jianhua

    The 21st IEEE International Workshop on Machine Learning for Signal Processing will be held in Beijing, China, on September 18–21, 2011. The workshop series is the major annual technical event of the IEEE Signal Processing Society's Technical Committee on Machine Learning for Signal Processing...

  13. Combining Formal Logic and Machine Learning for Sentiment Analysis

    DEFF Research Database (Denmark)

    Petersen, Niklas Christoffer; Villadsen, Jørgen

    2014-01-01

    This paper presents a formal logical method for deep structural analysis of the syntactical properties of texts using machine learning techniques for efficient syntactical tagging. To evaluate the method it is used for entity level sentiment analysis as an alternative to pure machine learning...

  14. Applying Machine Learning to Facilitate Autism Diagnostics: Pitfalls and Promises

    Science.gov (United States)

    Bone, Daniel; Goodwin, Matthew S.; Black, Matthew P.; Lee, Chi-Chun; Audhkhasi, Kartik; Narayanan, Shrikanth

    2015-01-01

    Machine learning has immense potential to enhance diagnostic and intervention research in the behavioral sciences, and may be especially useful in investigations involving the highly prevalent and heterogeneous syndrome of autism spectrum disorder. However, use of machine learning in the absence of clinical domain expertise can be tenuous and lead…

  15. An active role for machine learning in drug development

    Science.gov (United States)

    Murphy, Robert F.

    2014-01-01

    Due to the complexity of biological systems, cutting-edge machine-learning methods will be critical for future drug development. In particular, machine-vision methods to extract detailed information from imaging assays and active-learning methods to guide experimentation will be required to overcome the dimensionality problem in drug development. PMID:21587249

  16. Large-Scale Machine Learning for Classification and Search

    Science.gov (United States)

    Liu, Wei

    2012-01-01

    With the rapid development of the Internet, nowadays tremendous amounts of data including images and videos, up to millions or billions, can be collected for training machine learning models. Inspired by this trend, this thesis is dedicated to developing large-scale machine learning techniques for the purpose of making classification and nearest…

  17. Empirical Studies On Machine Learning Based Text Classification Algorithms

    OpenAIRE

    Shweta C. Dharmadhikari; Maya Ingle; Parag Kulkarni

    2011-01-01

    Automatic classification of text documents has become an important research issue nowadays. Proper classification of text documents requires information retrieval, machine learning and natural language processing (NLP) techniques. Our aim is to focus on important approaches to automatic text classification based on machine learning techniques, viz. supervised, unsupervised and semi-supervised. In this paper we present a review of various text classification approaches under machine learning paradig...

  18. Kernel methods for interpretable machine learning of order parameters

    Science.gov (United States)

    Ponte, Pedro; Melko, Roger G.

    2017-11-01

    Machine learning is capable of discriminating phases of matter, and finding associated phase transitions, directly from large data sets of raw state configurations. In the context of condensed matter physics, most progress in the field of supervised learning has come from employing neural networks as classifiers. Although very powerful, such algorithms suffer from a lack of interpretability, which is usually desired in scientific applications in order to associate learned features with physical phenomena. In this paper, we explore support vector machines (SVMs), which are a class of supervised kernel methods that provide interpretable decision functions. We find that SVMs can learn the mathematical form of physical discriminators, such as order parameters and Hamiltonian constraints, for a set of two-dimensional spin models: the ferromagnetic Ising model, a conserved-order-parameter Ising model, and the Ising gauge theory. The ability of SVMs to provide interpretable classification highlights their potential for automating feature detection in both synthetic and experimental data sets for condensed matter and other many-body systems.

  19. Machine learning, computer vision, and probabilistic models in jet physics

    CERN Multimedia

    CERN. Geneva; NACHMAN, Ben

    2015-01-01

    In this talk we present recent developments in the application of machine learning, computer vision, and probabilistic models to the analysis and interpretation of LHC events. First, we will introduce the concept of jet-images and computer vision techniques for jet tagging. Jet images enabled the connection between jet substructure and tagging with the fields of computer vision and image processing for the first time, improving the performance to identify highly boosted W bosons with respect to state-of-the-art methods, and providing a new way to visualize the discriminant features of different classes of jets, adding a new capability to understand the physics within jets and to design more powerful jet tagging methods. Second, we will present Fuzzy jets: a new paradigm for jet clustering using machine learning methods. Fuzzy jets view jet clustering as an unsupervised learning task and incorporate a probabilistic assignment of particles to jets to learn new features of the jet structure. In particular, we wi...

  20. Sample-Based Extreme Learning Machine with Missing Data

    Directory of Open Access Journals (Sweden)

    Hang Gao

    2015-01-01

    Extreme learning machine (ELM) has been extensively studied in the machine learning community during the last few decades due to its high efficiency and the unification of classification, regression, and so forth. Despite these merits, existing ELM algorithms cannot efficiently handle the issue of missing data, which is relatively common in practical applications. The problem of missing data is commonly handled by imputation (i.e., replacing missing values with substituted values according to available information). However, imputation methods are not always effective. In this paper, we propose a sample-based learning framework to address this issue. Based on this framework, we develop two sample-based ELM algorithms for classification and regression, respectively. Comprehensive experiments have been conducted on synthetic data sets, UCI benchmark data sets, and a real world fingerprint image data set. As indicated, without introducing extra computational complexity, the proposed algorithms learn more accurately and stably than other state-of-the-art ones, especially in the case of a higher missing ratio.
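
    For readers unfamiliar with imputation, the baseline that this abstract contrasts against can be sketched as follows. This is a minimal column-mean example, assuming missing entries are marked with None; it is not the sample-based ELM framework the authors propose, and the function name is hypothetical.

```python
from statistics import mean


def impute_column_means(rows):
    """Replace None entries with the mean of the observed values in the same column.

    Assumes every column has at least one observed value; otherwise
    statistics.mean raises StatisticsError.
    """
    columns = list(zip(*rows))
    fill = [mean(v for v in col if v is not None) for col in columns]
    return [
        [fill[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]
```

    Every missing entry in a column receives the same substitute regardless of the rest of the sample, which is one way such methods can lose information when the missing ratio is high.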

  1. An Android malware detection system based on machine learning

    Science.gov (United States)

    Wen, Long; Yu, Haiyang

    2017-08-01

    The Android smartphone, with its open source character and excellent performance, has attracted many users. However, the convenience of the Android platform has also motivated the development of malware. The traditional method, which detects malware based on signatures, is unable to detect unknown applications. The article proposes a machine learning-based lightweight system that is capable of identifying malware on Android devices. In this system we extract features based on static analysis and dynamic analysis; then a new feature selection approach based on principal component analysis (PCA) and Relief is presented to decrease the dimensions of the features. After that, a model is constructed with a support vector machine (SVM) for classification. Experimental results show that our system provides an effective method for Android malware detection.

  2. Dropout Prediction in E-Learning Courses through the Combination of Machine Learning Techniques

    Science.gov (United States)

    Lykourentzou, Ioanna; Giannoukos, Ioannis; Nikolopoulos, Vassilis; Mpardis, George; Loumos, Vassili

    2009-01-01

    In this paper, a dropout prediction method for e-learning courses, based on three popular machine learning techniques and detailed student data, is proposed. The machine learning techniques used are feed-forward neural networks, support vector machines and probabilistic ensemble simplified fuzzy ARTMAP. Since a single technique may fail to…

  3. Research into Financial Position of Listed Companies following Classification via Extreme Learning Machine Based upon DE Optimization

    OpenAIRE

    Fu Yu; Mu Jiong; Duan Xu Liang

    2016-01-01

    Using an extreme learning machine model based on DE optimization, this article focuses on the optimization approach behind the model and on its effectiveness in classifying the financial position of listed companies. A comparison shows that the improved extreme learning machine algorithm based on DE optimization outperforms the traditional extreme learning machine algorithm. Meanwhile, this article also intends to introduce certain research...

  4. Source localization in an ocean waveguide using supervised machine learning.

    Science.gov (United States)

    Niu, Haiqiang; Reeves, Emma; Gerstoft, Peter

    2017-09-01

    Source localization in ocean acoustics is posed as a machine learning problem in which data-driven methods learn source ranges directly from observed acoustic data. The pressure received by a vertical linear array is preprocessed by constructing a normalized sample covariance matrix and used as the input for three machine learning methods: feed-forward neural networks (FNN), support vector machines (SVM), and random forests (RF). The range estimation problem is solved both as a classification problem and as a regression problem by these three machine learning algorithms. The results of range estimation for the Noise09 experiment are compared for FNN, SVM, RF, and conventional matched-field processing and demonstrate the potential of machine learning for underwater source localization.
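
    The preprocessing step mentioned in this abstract can be sketched as below. The abstract does not spell out the normalization, so this sketch assumes each complex pressure snapshot is scaled to unit norm before the outer products are averaged; the function name and input layout are hypothetical.

```python
from math import sqrt


def normalized_sample_covariance(snapshots):
    """Average the outer products p p^H of unit-norm complex pressure snapshots.

    `snapshots` is a list of equal-length lists of complex numbers, one value
    per array sensor. Snapshots must be non-zero. Returns an L x L nested list
    that is Hermitian with unit trace.
    """
    n_sensors = len(snapshots[0])
    cov = [[0j] * n_sensors for _ in range(n_sensors)]
    for p in snapshots:
        # Scale the snapshot to unit Euclidean norm (assumed normalization).
        norm = sqrt(sum(abs(v) ** 2 for v in p))
        unit = [v / norm for v in p]
        for i in range(n_sensors):
            for j in range(n_sensors):
                cov[i][j] += unit[i] * unit[j].conjugate() / len(snapshots)
    return cov
```

    Under this assumption the matrix no longer depends on the overall source amplitude, which is a plausible reason for normalizing before passing it to the FNN, SVM, or RF models.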

  5. A machine learning approach for the classification of metallic glasses

    Science.gov (United States)

    Gossett, Eric; Perim, Eric; Toher, Cormac; Lee, Dongwoo; Zhang, Haitao; Liu, Jingbei; Zhao, Shaofan; Schroers, Jan; Vlassak, Joost; Curtarolo, Stefano

    Metallic glasses possess an extensive set of mechanical properties along with plastic-like processability. As a result, they are a promising material in many industrial applications. However, the successful synthesis of novel metallic glasses requires trial and error, costing both time and resources. Therefore, we propose a high-throughput approach that combines an extensive set of experimental measurements with advanced machine learning techniques. This allows us to classify metallic glasses and predict the full phase diagrams for a given alloy system. Thus this method provides a means to identify potential glass-formers and opens up the possibility for accelerating and reducing the cost of the design of new metallic glasses.

  6. Using Machine Learning Techniques in the Analysis of Oceanographic Data

    Science.gov (United States)

    Falcinelli, K. E.; Abuomar, S.

    2017-12-01

    Acoustic Doppler Current Profilers (ADCPs) are oceanographic tools capable of collecting large amounts of current profile data. Using unsupervised machine learning techniques such as principal component analysis, fuzzy c-means clustering, and self-organizing maps, patterns and trends in an ADCP dataset are found. Cluster validity algorithms such as visual assessment of cluster tendency and clustering index are used to determine the optimal number of clusters in the ADCP dataset. These techniques prove to be useful in analysis of ADCP data and demonstrate potential for future use in other oceanographic applications.

  7. Application of target costing in machining

    Science.gov (United States)

    Gopalakrishnan, Bhaskaran; Kokatnur, Ameet; Gupta, Deepak P.

    2004-11-01

    In today's intensely competitive and highly volatile business environment, consistent development of low cost and high quality products meeting the functionality requirements is a key to a company's survival. Companies continuously strive to reduce costs while still producing quality products to stay ahead of the competition. Many companies have turned to target costing to achieve this objective. Target costing is a structured approach to determine the cost at which a proposed product, meeting the quality and functionality requirements, must be produced in order to generate the desired profits. It subtracts the desired profit margin from the company's selling price to establish the manufacturing cost of the product. An extensive literature review revealed that companies in the automotive, electronic and process industries have reaped the benefits of target costing. However, the target costing approach has not been applied in the machining industry; instead, other techniques based on Geometric Programming, Goal Programming, and Lagrange Multipliers have been proposed for application in this industry. These models follow a forward approach, first selecting a set of machining parameters and then determining the machining cost. Hence in this study we have developed an algorithm to apply the concepts of target costing, which is a backward approach that selects the machining parameters based on the required machining costs, and is therefore more suitable for practical applications in process improvement and cost reduction. A target costing model was developed for a turning operation and was successfully validated using practical data.
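
    The arithmetic behind target costing, and the backward selection it implies, can be shown in a short sketch. The function names, the parameter-set labels, and the cost figures are hypothetical and are not taken from the study; the per-part cost estimates are assumed to come from whatever machining cost model is in use.

```python
def target_cost(selling_price, desired_profit):
    """Target costing: allowable manufacturing cost = selling price - desired profit."""
    return selling_price - desired_profit


def feasible_settings(candidates, selling_price, desired_profit):
    """Backward approach: keep parameter sets whose estimated cost meets the target.

    `candidates` maps a (hypothetical) parameter-set label to its estimated
    machining cost per part.
    """
    limit = target_cost(selling_price, desired_profit)
    return sorted(name for name, cost in candidates.items() if cost <= limit)
```

    For example, a selling price of 50.0 and a desired profit of 12.5 give a target cost of 37.5, so only parameter sets estimated at 37.5 or less per part remain candidates. A forward approach would instead fix the parameters first and accept whatever cost results.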

  8. Amp: A modular approach to machine learning in atomistic simulations

    Science.gov (United States)

    Khorshidi, Alireza; Peterson, Andrew A.

    2016-10-01

    Electronic structure calculations, such as those employing Kohn-Sham density functional theory or ab initio wavefunction theories, have allowed for atomistic-level understandings of a wide variety of phenomena and properties of matter at small scales. However, the computational cost of electronic structure methods drastically increases with length and time scales, which makes these methods difficult for long time-scale molecular dynamics simulations or large-sized systems. Machine-learning techniques can provide accurate potentials that can match the quality of electronic structure calculations, provided sufficient training data. These potentials can then be used to rapidly simulate large and long time-scale phenomena at similar quality to the parent electronic structure approach. Machine-learning potentials usually take a bias-free mathematical form and can be readily developed for a wide variety of systems. Electronic structure calculations have favorable properties (namely, that they are noiseless and that targeted training data can be produced on demand) that make them particularly well suited for machine learning. This paper discusses our modular approach to atomistic machine learning through the development of the open-source Atomistic Machine-learning Package (Amp), which allows for representations of both the total and atom-centered potential energy surface, in both periodic and non-periodic systems. Potentials developed through the atom-centered approach are simultaneously applicable for systems with various sizes. Interpolation can be enhanced by introducing custom descriptors of the local environment. We demonstrate this in the current work for Gaussian-type, bispectrum, and Zernike-type descriptors. Amp has an intuitive and modular structure with an interface through the Python scripting language yet has parallelizable Fortran components for demanding tasks; it is designed to integrate closely with the widely used Atomic Simulation Environment (ASE), which

  9. Strategies and Principles of Distributed Machine Learning on Big Data

    Directory of Open Access Journals (Sweden)

    Eric P. Xing

    2016-06-01

    The rise of big data has led to new demands for machine learning (ML) systems to learn complex models, with millions to billions of parameters, that promise adequate capacity to digest massive datasets and offer powerful predictive analytics (such as high-dimensional latent features, intermediate representations, and decision functions) thereupon. In order to run ML algorithms at such scales, on a distributed cluster with tens to thousands of machines, it is often the case that significant engineering efforts are required, and one might fairly ask whether such engineering truly falls within the domain of ML research. Taking the view that “big” ML systems can benefit greatly from ML-rooted statistical and algorithmic insights, and that ML researchers should therefore not shy away from such systems design, we discuss a series of principles and strategies distilled from our recent efforts on industrial-scale ML solutions. These principles and strategies span a continuum from application, to engineering, and to theoretical research and development of big ML systems and architectures, with the goal of understanding how to make them efficient, generally applicable, and supported with convergence and scaling guarantees. They concern four key questions that traditionally receive little attention in ML research: How can an ML program be distributed over a cluster? How can ML computation be bridged with inter-machine communication? How can such communication be performed? What should be communicated between machines? By exposing underlying statistical and algorithmic characteristics unique to ML programs but not typically seen in traditional computer programs, and by dissecting successful cases to reveal how we have harnessed these principles to design and develop both high-performance distributed ML software as well as general-purpose ML frameworks, we present opportunities for ML researchers and practitioners to further shape and enlarge the area

  10. Combining generative and discriminative representation learning for lung CT analysis with convolutional restricted Boltzmann machines

    DEFF Research Database (Denmark)

    van Tulder, Gijs; de Bruijne, Marleen

    2016-01-01

    The choice of features greatly influences the performance of a tissue classification system. Despite this, many systems are built with standard, predefined filter banks that are not optimized for that particular application. Representation learning methods such as restricted Boltzmann machines may...... outperform these standard filter banks because they learn a feature description directly from the training data. Like many other representation learning methods, restricted Boltzmann machines are unsupervised and are trained with a generative learning objective; this allows them to learn representations from...... unlabeled data, but does not necessarily produce features that are optimal for classification. In this paper we propose the convolutional classification restricted Boltzmann machine, which combines a generative and a discriminative learning objective. This allows it to learn filters that are good both...

  11. Machine Learning Methods for Identifying Composition of Uranium Deposits in Kazakhstan

    Directory of Open Access Journals (Sweden)

    Kuchin Yan

    2017-12-01

    The paper explores geophysical methods of well surveying, as well as their role in the development of Kazakhstan’s uranium deposit mining efforts. It analyzes existing methods that use machine learning to interpret geophysical data in petroleum geophysics. The requirements and possible applications of machine learning methods in regard to uranium deposits of Kazakhstan are formulated in the paper.

  12. Medical subdomain classification of clinical notes using a machine learning-based natural language processing approach

    OpenAIRE

    Weng, Wei-Hung; Wagholikar, Kavishwar B.; McCray, Alexa T.; Szolovits, Peter; Chueh, Henry C.

    2017-01-01

    Background The medical subdomain of a clinical note, such as cardiology or neurology, is useful content-derived metadata for developing machine learning downstream applications. To classify the medical subdomain of a note accurately, we have constructed a machine learning-based natural language processing (NLP) pipeline and developed medical subdomain classifiers based on the content of the note. Methods We constructed the pipeline using the clinical ...

  13. Using Machine Learning to Search for MSSM Higgs Bosons

    CERN Document Server

    Diesing, Rebecca

    2016-01-01

    This paper examines the performance of machine learning in the identification of Minimally Supersymmetric Standard Model (MSSM) Higgs Bosons, and compares this performance to that of traditional cut strategies. Two boosted decision tree algorithms were tested, scikit-learn and XGBoost. These tests indicated that machine learning can perform significantly better than traditional cuts. However, since machine learning in this form cannot be directly implemented in a real MSSM Higgs analysis, this performance information was instead used to better understand the relationships between training variables. Further studies might use this information to construct an improved cut strategy.

  14. Precision Parameter Estimation and Machine Learning

    Science.gov (United States)

    Wandelt, Benjamin D.

    2008-12-01

    I discuss the strategy of "Acceleration by Parallel Precomputation and Learning" (APPLe) that can vastly accelerate parameter estimation in high-dimensional parameter spaces and costly likelihood functions, using trivially parallel computing to speed up sequential exploration of parameter space. This strategy combines the power of distributed computing with machine learning and Markov-Chain Monte Carlo techniques efficiently to explore a likelihood function, posterior distribution or χ²-surface. This strategy is particularly successful in cases where computing the likelihood is costly and the number of parameters is moderate or large. We apply this technique to two central problems in cosmology: the solution of the cosmological parameter estimation problem with sufficient accuracy for the Planck data using PICo; and the detailed calculation of cosmological helium and hydrogen recombination with RICO. Since the APPLe approach is designed to be able to use massively parallel resources to speed up problems that are inherently serial, we can bring the power of distributed computing to bear on parameter estimation problems. We have demonstrated this with the Cosmology@Home project.

  15. Quantum learning and universal quantum matching machine

    International Nuclear Information System (INIS)

    Sasaki, Masahide; Carlini, Alberto

    2002-01-01

    Suppose that three kinds of quantum systems are given in some unknown states $|f\rangle^{\otimes N}$, $|g_1\rangle^{\otimes K}$, and $|g_2\rangle^{\otimes K}$, and we want to decide which template state, $|g_1\rangle$ or $|g_2\rangle$, each representing the feature of the pattern class $C_1$ or $C_2$, respectively, is closest to the input feature state $|f\rangle$. This is an extension of the pattern matching problem into the quantum domain. Assuming that these states are known a priori to belong to a certain parametric family of pure qubit systems, we derive two kinds of matching strategies. The first one is a semiclassical strategy that is obtained by the natural extension of conventional matching strategies and consists of a two-stage procedure: identification (estimation) of the unknown template states to design the classifier (learning process to train the classifier) and classification of the input system into the appropriate pattern class based on the estimated results. The other is a fully quantum strategy without any intermediate measurement, which we might call the universal quantum matching machine. We present the Bayes optimal solutions for both strategies in the case of $K=1$, showing that there certainly exists a fully quantum matching procedure that is strictly superior to the straightforward semiclassical extension of the conventional matching strategy based on the learning process.

  16. Machine learning in cardiovascular medicine: are we there yet?

    Science.gov (United States)

    Shameer, Khader; Johnson, Kipp W; Glicksberg, Benjamin S; Dudley, Joel T; Sengupta, Partho P

    2018-01-19

    Artificial intelligence (AI) broadly refers to analytical algorithms that iteratively learn from data, allowing computers to find hidden insights without being explicitly programmed where to look. These include a family of operations encompassing several terms like machine learning, cognitive learning, deep learning and reinforcement learning-based methods that can be used to integrate and interpret complex biomedical and healthcare data in scenarios where traditional statistical methods may not be able to perform. In this review article, we discuss the basics of machine learning algorithms and what potential data sources exist; evaluate the need for machine learning; and examine the potential limitations and challenges of implementing machine learning in the context of cardiovascular medicine. The most promising avenues for AI in medicine are the development of automated risk prediction algorithms which can be used to guide clinical care; use of unsupervised learning techniques to more precisely phenotype complex disease; and the implementation of reinforcement learning algorithms to intelligently augment healthcare providers. The utility of a machine learning-based predictive model will depend on factors including data heterogeneity, data depth, data breadth, nature of modelling task, choice of machine learning and feature selection algorithms, and orthogonal evidence. A critical understanding of the strength and limitations of various methods and tasks amenable to machine learning is vital. By leveraging the growing corpus of big data in medicine, we detail pathways by which machine learning may facilitate optimal development of patient-specific models for improving diagnoses, intervention and outcome in cardiovascular medicine.

  17. RSO Characterization from Photometric Data Using Machine Learning

    Science.gov (United States)

    Howard, M.; Klem, B.; Gorman, J.

    Object characterization is the description of a resident space object (RSO), its capabilities and its behavior. While astrometric data has been used extensively for object detection, location, and characterization, photometric data has been less widely applied and remains a promising area for improving RSO characterization. RSO characteristics which may influence changes in light intensity with respect to changes in viewing angle or orientation signature include geometry, orientation, component material properties, stability and other characteristics. However, most RSO characterization is presently performed manually and on an individual basis by space analysts and there is a need for efficient and automated methods to perform characterization. This paper discusses the application of machine learning techniques to characterization of RSOs in the geosynchronous altitude regime using photometric data. We develop simulated signatures in the visible spectral band of three basic RSO types, with variations in object orientation, material characteristics, size and attitude and attempt to recover these properties through object characterization techniques. We generate observations by sampling noisy measurements from the simulated signature. Next, we extract a set of features from the observations and train machine learning algorithms to classify the signatures. We consider the effectiveness of a set of binary classifiers trained to individually recognize separate cases. The results of each classifier are combined together to produce a final output characterization of an input observation. Experiments with varying levels of noise are presented, and we evaluate models with respect to classification accuracy and other criteria. The end result of this process is a unique methodology for exploiting the usefulness and applicability of machine learning to an important space sensing and identification process. This material is based upon work supported by the United States
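
    The abstract above says that the outputs of several binary classifiers are combined into one final characterization, but it does not say how. One common way to do this is a one-vs-rest rule that picks the class whose binary classifier reports the highest score. The sketch below assumes that rule; the function name and the class labels (`type_a`, `type_b`, `type_c`) are hypothetical placeholders, not names from the paper.

```python
def combine_binary_scores(scores):
    """Pick the class whose binary classifier gave the highest score.

    `scores` maps a class label to the confidence reported by the binary
    classifier trained to recognize that class (assumed one-vs-rest setup).
    """
    if not scores:
        raise ValueError("at least one classifier score is required")
    best_label = max(scores, key=scores.get)
    return best_label, scores[best_label]


# Hypothetical scores from three per-class binary classifiers.
label, score = combine_binary_scores(
    {"type_a": 0.2, "type_b": 0.7, "type_c": 0.4}
)
```

    Other combination rules, such as voting or a second-stage classifier, would fit the same description equally well.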

  18. Automated quantification of cerebral edema following hemispheric infarction: Application of a machine-learning algorithm to evaluate CSF shifts on serial head CTs

    Directory of Open Access Journals (Sweden)

    Yasheng Chen

    2016-01-01

    Although cerebral edema is a major cause of death and deterioration following hemispheric stroke, there remains no validated biomarker that captures the full spectrum of this critical complication. We recently demonstrated that reduction in intracranial cerebrospinal fluid (CSF) volume (∆CSF) on serial computed tomography (CT) scans provides an accurate measure of cerebral edema severity, which may aid in early triaging of stroke patients for craniectomy. However, application of such a volumetric approach would be too cumbersome to perform manually on serial scans in a real-world setting. We developed and validated an automated technique for CSF segmentation via integration of random forest (RF) based machine learning with geodesic active contour (GAC) segmentation. The proposed RF + GAC approach was compared to conventional Hounsfield Unit (HU) thresholding and RF segmentation methods using Dice similarity coefficient (DSC) and the correlation of volumetric measurements, with manual delineation serving as the ground truth. CSF spaces were outlined on scans performed at baseline (<6 h after stroke onset) and early follow-up (FU) (closest to 24 h) in 38 acute ischemic stroke patients. RF performed significantly better than optimized HU thresholding (p < 10⁻⁴ in baseline and p < 10⁻⁵ in FU) and RF + GAC performed significantly better than RF (p < 10⁻³ in baseline and p < 10⁻⁵ in FU). Pearson correlation coefficients between the automatically detected ∆CSF and the ground truth were r = 0.178 (p = 0.285), r = 0.876 (p < 10⁻⁶) and r = 0.879 (p < 10⁻⁶) for thresholding, RF and RF + GAC, respectively, with a slope closer to the line of identity in RF + GAC. When we applied the algorithm trained from images of one stroke center to segment CTs from another center, similar findings held. In conclusion, we have developed and validated an accurate automated approach to segment CSF and calculate its shifts on serial CT scans
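
    The Dice similarity coefficient used for the comparison above has the standard definition DSC = 2|A ∩ B| / (|A| + |B|) for a predicted mask A and a ground-truth mask B. The sketch below computes it for flattened binary masks; the function name and the toy masks are illustrative assumptions, not the authors' code or data.

```python
def dice_coefficient(predicted, truth):
    """Dice similarity coefficient between two flattened binary masks.

    Both arguments are equal-length sequences of 0/1 (or False/True) voxel
    labels. Two empty masks are treated as a perfect match (an assumed
    convention; some implementations return 0 or NaN instead).
    """
    if len(predicted) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, t in zip(predicted, truth) if p and t)
    total = sum(1 for p in predicted if p) + sum(1 for t in truth if t)
    if total == 0:
        return 1.0
    return 2.0 * overlap / total


# Toy example: two voxels labelled CSF in each mask, one of them shared.
dsc = dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0])  # 2 * 1 / (2 + 2) = 0.5
```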

  19. Using human brain activity to guide machine learning.

    Science.gov (United States)

    Fong, Ruth C; Scheirer, Walter J; Cox, David D

    2018-03-29

    Machine learning is a field of computer science that builds algorithms that learn. In many cases, machine learning algorithms are used to recreate a human ability like adding a caption to a photo, driving a car, or playing a game. While the human brain has long served as a source of inspiration for machine learning, little effort has been made to directly use data collected from working brains as a guide for machine learning algorithms. Here we demonstrate a new paradigm of "neurally-weighted" machine learning, which takes fMRI measurements of human brain activity from subjects viewing images, and infuses these data into the training process of an object recognition learning algorithm to make it more consistent with the human brain. After training, these neurally-weighted classifiers are able to classify images without requiring any additional neural data. We show that our neural-weighting approach can lead to large performance gains when used with traditional machine vision features, as well as to significant improvements with already high-performing convolutional neural network features. The effectiveness of this approach points to a path forward for a new class of hybrid machine learning algorithms which take both inspiration and direct constraints from neuronal data.

  20. TF.Learn: TensorFlow's High-level Module for Distributed Machine Learning

    OpenAIRE

    Tang, Yuan

    2016-01-01

    TF.Learn is a high-level Python module for distributed machine learning inside TensorFlow. It provides an easy-to-use Scikit-learn style interface to simplify the process of creating, configuring, training, evaluating, and experimenting with a machine learning model. TF.Learn integrates a wide range of state-of-the-art machine learning algorithms built on top of TensorFlow's low level APIs for small to large-scale supervised and unsupervised problems. This module focuses on bringing machine learning t...

  1. MEDLINE MeSH Indexing: Lessons Learned from Machine Learning and Future Directions

    DEFF Research Database (Denmark)

    Jimeno-Yepes, Antonio; Mork, James G.; Wilkowski, Bartlomiej

    2012-01-01

    and analyzed the issues when using standard machine learning algorithms. We show that in some cases machine learning can improve the annotations already recommended by MTI, that machine learning based on low variance methods achieves better performance and that each MeSH heading presents a different behavior......Map and a k-NN approach called PubMed Related Citations (PRC). Our motivation is to improve the quality of MTI based on machine learning. Typical machine learning approaches fit this indexing task into text categorization. In this work, we have studied some Medical Subject Headings (MeSH) recommended by MTI...

  2. Systematic Poisoning Attacks on and Defenses for Machine Learning in Healthcare.

    Science.gov (United States)

    Mozaffari-Kermani, Mehran; Sur-Kolay, Susmita; Raghunathan, Anand; Jha, Niraj K

    2015-11-01

    Machine learning is being used in a wide range of application domains to discover patterns in large datasets. Increasingly, the results of machine learning drive critical decisions in applications related to healthcare and biomedicine. Such health-related applications are often sensitive, and thus, any security breach would be catastrophic. Naturally, the integrity of the results computed by machine learning is of great importance. Recent research has shown that some machine-learning algorithms can be compromised by augmenting their training datasets with malicious data, leading to a new class of attacks called poisoning attacks. Hindrance of a diagnosis may have life-threatening consequences and could cause distrust. On the other hand, not only may a false diagnosis prompt users to distrust the machine-learning algorithm and even abandon the entire system but also such a false positive classification may cause patient distress. In this paper, we present a systematic, algorithm-independent approach for mounting poisoning attacks across a wide range of machine-learning algorithms and healthcare datasets. The proposed attack procedure generates input data, which, when added to the training set, can either cause the results of machine learning to have targeted errors (e.g., increase the likelihood of classification into a specific class), or simply introduce arbitrary errors (incorrect classification). These attacks may be applied to both fixed and evolving datasets. They can be applied even when only statistics of the training dataset are available or, in some cases, even without access to the training dataset, although at a lower efficacy. We establish the effectiveness of the proposed attacks using a suite of six machine-learning algorithms and five healthcare datasets. Finally, we present countermeasures against the proposed generic attacks that are based on tracking and detecting deviations in various accuracy metrics, and benchmark their effectiveness.

  3. Research into Financial Position of Listed Companies following Classification via Extreme Learning Machine Based upon DE Optimization

    Directory of Open Access Journals (Sweden)

    Fu Yu

    2016-01-01

    This article examines an extreme learning machine model based on DE optimization, focusing on the optimization approach behind the model and on how well it performs in classifying the financial position of listed companies. A comparison shows that the improved, DE-optimized extreme learning machine algorithm outperforms the traditional extreme learning machine algorithm. The article also aims to bring extreme learning machine research into economic classification, so that the large volume of financial statements from listed companies in different classes can be evaluated quickly and effectively by computer.

  4. Time-series prediction and applications a machine intelligence approach

    CERN Document Server

    Konar, Amit

    2017-01-01

    This book presents machine learning and type-2 fuzzy sets for the prediction of time-series with a particular focus on business forecasting applications. It also proposes new uncertainty management techniques in an economic time-series using type-2 fuzzy sets for prediction of the time-series at a given time point from its preceding value in fluctuating business environments. It employs machine learning to determine repetitively occurring similar structural patterns in the time-series and uses stochastic automaton to predict the most probabilistic structure at a given partition of the time-series. Such predictions help in determining probabilistic moves in a stock index time-series. Primarily written for graduate students and researchers in computer science, the book is equally useful for researchers/professionals in business intelligence and stock index prediction. A background of undergraduate level mathematics is presumed, although not mandatory, for most of the sections. Exercises with tips are provided at...

  5. Machine learning and next-generation asteroid surveys

    Science.gov (United States)

    Nugent, Carrie R.; Dailey, John; Cutri, Roc M.; Masci, Frank J.; Mainzer, Amy K.

    2017-10-01

    Next-generation surveys such as NEOCam (Mainzer et al., 2016) will sift through tens of millions of point source detections daily to detect and discover asteroids. This requires new, more efficient techniques to distinguish between solar system objects, background stars and galaxies, and artifacts such as cosmic rays, scattered light and diffraction spikes. Supervised machine learning is a set of algorithms that allows computers to classify data on a training set, and then apply that classification to make predictions on new datasets. It has been employed by a broad range of fields, including computer vision, medical diagnoses, economics, and natural language processing. It has also been applied to astronomical datasets, including transient identification in the Palomar Transient Factory pipeline (Masci et al., 2016), and in the Pan-STARRS1 difference imaging (D. E. Wright et al., 2015). As part of the NEOCam extended phase A work we apply machine learning techniques to the problem of asteroid detection. Asteroid detection is an ideal application of supervised learning, as there is a wealth of metrics associated with each extracted source, and suitable training sets are easily created. Using the vetted NEOWISE dataset (E. L. Wright et al., 2010, Mainzer et al., 2011) as a proof-of-concept of this technique, we applied the python package sklearn. We report on reliability, feature set selection, and the suitability of various algorithms.

  6. Teaching an Old Log New Tricks with Machine Learning.

    Science.gov (United States)

    Schnell, Krista; Puri, Colin; Mahler, Paul; Dukatz, Carl

    2014-03-01

    To most people, the log file would not be considered an exciting area in technology today. However, these relatively benign, slowly growing data sources can drive large business transformations when combined with modern-day analytics. Accenture Technology Labs has built a new framework that helps to expand existing vendor solutions to create new methods of gaining insights from these benevolent information springs. This framework provides a systematic and effective machine-learning mechanism to understand, analyze, and visualize heterogeneous log files. These techniques enable an automated approach to analyzing log content in real time, learning relevant behaviors, and creating actionable insights applicable in traditionally reactive situations. Using this approach, companies can now tap into a wealth of knowledge residing in log file data that is currently being collected but underutilized because of its overwhelming variety and volume. By using log files as an important data input into the larger enterprise data supply chain, businesses have the opportunity to enhance their current operational log management solution and generate entirely new business insights that are no longer limited to the realm of reactive IT management but extend from proactive product improvement to defense from attacks. As we will discuss, this solution has immediate relevance in the telecommunications and security industries. However, the most forward-looking companies can take it even further. How? By thinking beyond the log file and applying the same machine-learning framework to other log file use cases (including logistics, social media, and consumer behavior) and any other transactional data source.

  7. MLBCD: a machine learning tool for big clinical data.

    Science.gov (United States)

    Luo, Gang

    2015-01-01

    Predictive modeling is fundamental for extracting value from large clinical data sets, or "big clinical data," advancing clinical research, and improving healthcare. Machine learning is a powerful approach to predictive modeling. Two factors make machine learning challenging for healthcare researchers. First, before training a machine learning model, the values of one or more model parameters called hyper-parameters must typically be specified. Due to their inexperience with machine learning, it is hard for healthcare researchers to choose an appropriate algorithm and hyper-parameter values. Second, many clinical data are stored in a special format. These data must be iteratively transformed into the relational table format before conducting predictive modeling. This transformation is time-consuming and requires computing expertise. This paper presents our vision for and design of MLBCD (Machine Learning for Big Clinical Data), a new software system aiming to address these challenges and facilitate building machine learning predictive models using big clinical data. The paper describes MLBCD's design in detail. By making machine learning accessible to healthcare researchers, MLBCD will open the use of big clinical data and increase the ability to foster biomedical discovery and improve care.

  8. Machine Shop I. Learning Activity Packets (LAPs). Section D--Power Saws and Drilling Machines.

    Science.gov (United States)

    Oklahoma State Board of Vocational and Technical Education, Stillwater. Curriculum and Instructional Materials Center.

    This document contains two learning activity packets (LAPs) for the "power saws and drilling machines" instructional area of a Machine Shop I course. The two LAPs cover the following topics: power saws and drill press. Each LAP contains a cover sheet that describes its purpose, an introduction, and the tasks included in the LAP; learning…

  9. Machine learning control taming nonlinear dynamics and turbulence

    CERN Document Server

    Duriez, Thomas; Noack, Bernd R

    2017-01-01

    This is the first book on a generally applicable control strategy for turbulence and other complex nonlinear systems. The approach of the book employs powerful methods of machine learning for optimal nonlinear control laws. This machine learning control (MLC) is motivated and detailed in Chapters 1 and 2. In Chapter 3, methods of linear control theory are reviewed. In Chapter 4, MLC is shown to reproduce known optimal control laws for linear dynamics (LQR, LQG). In Chapter 5, MLC detects and exploits a strongly nonlinear actuation mechanism of a low-dimensional dynamical system when linear control methods are shown to fail. Experimental control demonstrations from a laminar shear-layer to turbulent boundary-layers are reviewed in Chapter 6, followed by general good practices for experiments in Chapter 7. The book concludes with an outlook on the vast future applications of MLC in Chapter 8. Matlab codes are provided for easy reproducibility of the presented results. The book includes interviews with leading r...

  10. Bidirectional extreme learning machine for regression problem and its learning effectiveness.

    Science.gov (United States)

    Yang, Yimin; Wang, Yaonan; Yuan, Xiaofang

    2012-09-01

    It is clear that the learning effectiveness and learning speed of neural networks are in general far slower than required, which has been a major bottleneck for many applications. Recently, a simple and efficient learning method, referred to as extreme learning machine (ELM), was proposed by Huang, which has shown that, compared to some conventional methods, the training time of neural networks can be reduced by a thousand times. However, one of the open problems in ELM research is whether the number of hidden nodes can be further reduced without affecting learning effectiveness. This brief proposes a new learning algorithm, called bidirectional extreme learning machine (B-ELM), in which some hidden nodes are not randomly selected. In theory, this algorithm tends to reduce network output error to 0 at an extremely early learning stage. Furthermore, we find a relationship between the network output error and the network output weights in the proposed B-ELM. Simulation results demonstrate that the proposed method can be tens to hundreds of times faster than other incremental ELM algorithms.
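
    For orientation, the basic ELM that the abstract builds on can be sketched as follows: hidden-node weights are drawn at random and left fixed, and only the output weights are fitted, here by ridge-regularized least squares. This is a minimal sketch of the basic ELM under stated assumptions, not the B-ELM proposed in the paper; the tanh activation, scalar inputs, ridge term, node count, and all function names are illustrative choices.

```python
import math
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def hidden_layer(x, weights, biases):
    """Outputs of the random hidden nodes for one scalar input."""
    return [math.tanh(w * x + b) for w, b in zip(weights, biases)]


def train_elm(xs, ys, n_hidden=15, ridge=1e-3, seed=0):
    """Fit output weights only; hidden weights stay at their random values."""
    rng = random.Random(seed)
    weights = [rng.uniform(-2.0, 2.0) for _ in range(n_hidden)]
    biases = [rng.uniform(-2.0, 2.0) for _ in range(n_hidden)]
    h = [hidden_layer(x, weights, biases) for x in xs]
    # Normal equations: (H^T H + ridge * I) beta = H^T y
    gram = [[sum(row[i] * row[j] for row in h) + (ridge if i == j else 0.0)
             for j in range(n_hidden)] for i in range(n_hidden)]
    rhs = [sum(row[i] * y for row, y in zip(h, ys)) for i in range(n_hidden)]
    beta = solve_linear(gram, rhs)
    return weights, biases, beta


def predict_elm(model, x):
    weights, biases, beta = model
    return sum(bk * hk for bk, hk in zip(beta, hidden_layer(x, weights, biases)))


# Toy regression task: fit sin(x) on 60 evenly spaced points in [-3, 3].
xs = [-3.0 + 6.0 * i / 59 for i in range(60)]
ys = [math.sin(x) for x in xs]
model = train_elm(xs, ys)
mse = sum((predict_elm(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
baseline = sum(y * y for y in ys) / len(ys)  # error of always predicting 0
```

    Because training reduces to one linear solve, there is no iterative weight update, which is where the speed advantage described in the abstract comes from.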

  11. Machine Learning Approaches for Clinical Psychology and Psychiatry.

    Science.gov (United States)

    Dwyer, Dominic B; Falkai, Peter; Koutsouleris, Nikolaos

    2018-05-07

    Machine learning approaches for clinical psychology and psychiatry explicitly focus on learning statistical functions from multidimensional data sets to make generalizable predictions about individuals. The goal of this review is to provide an accessible understanding of why this approach is important for future practice given its potential to augment decisions associated with the diagnosis, prognosis, and treatment of people suffering from mental illness using clinical and biological data. To this end, the limitations of current statistical paradigms in mental health research are critiqued, and an introduction is provided to critical machine learning methods used in clinical studies. A selective literature review is then presented aiming to reinforce the usefulness of machine learning methods and provide evidence of their potential. In the context of promising initial results, the current limitations of machine learning approaches are addressed, and considerations for future clinical translation are outlined.

  12. Modelling tick abundance using machine learning techniques and satellite imagery

    DEFF Research Database (Denmark)

    Kjær, Lene Jung; Korslund, L.; Kjelland, V.

    satellite images to run Boosted Regression Tree machine learning algorithms to predict overall distribution (presence/absence of ticks) and relative tick abundance of nymphs and larvae in southern Scandinavia. For nymphs, the predicted abundance had a positive correlation with observed abundance...... the predicted distribution of larvae was mostly even throughout Denmark, it was primarily around the coastlines in Norway and Sweden. Abundance was fairly low overall except in some fragmented patches corresponding to forested habitats in the region. Machine learning techniques allow us to predict for larger...... the collected ticks for pathogens and using the same machine learning techniques to develop prevalence maps of the ScandTick region....

  13. Implementing Machine Learning in Radiology Practice and Research.

    Science.gov (United States)

    Kohli, Marc; Prevedello, Luciano M; Filice, Ross W; Geis, J Raymond

    2017-04-01

    The purposes of this article are to describe concepts that radiologists should understand to evaluate machine learning projects, including common algorithms, supervised as opposed to unsupervised techniques, statistical pitfalls, and data considerations for training and evaluation, and to briefly describe ethical dilemmas and legal risk. Machine learning includes a broad class of computer programs that improve with experience. The complexity of creating, training, and monitoring machine learning indicates that the success of the algorithms will require radiologist involvement for years to come, leading to engagement rather than replacement.

  14. Studying depression using imaging and machine learning methods

    Directory of Open Access Journals (Sweden)

    Meenal J. Patel

    2016-01-01

    Depression is a complex clinical entity that can pose challenges for clinicians regarding both accurate diagnosis and effective timely treatment. These challenges have prompted the development of multiple machine learning methods to help improve the management of this disease. These methods utilize anatomical and physiological data acquired from neuroimaging to create models that can identify depressed patients vs. non-depressed patients and predict treatment outcomes. This article (1) presents a background on depression, imaging, and machine learning methodologies; (2) reviews methodologies of past studies that have used imaging and machine learning to study depression; and (3) suggests directions for future depression-related studies.

  15. Studying depression using imaging and machine learning methods.

    Science.gov (United States)

    Patel, Meenal J; Khalaf, Alexander; Aizenstein, Howard J

    2016-01-01

    Depression is a complex clinical entity that can pose challenges for clinicians regarding both accurate diagnosis and effective timely treatment. These challenges have prompted the development of multiple machine learning methods to help improve the management of this disease. These methods utilize anatomical and physiological data acquired from neuroimaging to create models that can identify depressed patients vs. non-depressed patients and predict treatment outcomes. This article (1) presents a background on depression, imaging, and machine learning methodologies; (2) reviews methodologies of past studies that have used imaging and machine learning to study depression; and (3) suggests directions for future depression-related studies.

  16. How the machine ‘thinks’: Understanding opacity in machine learning algorithms

    Directory of Open Access Journals (Sweden)

    Jenna Burrell

    2016-01-01

    This article considers the issue of opacity as a problem for socially consequential mechanisms of classification and ranking, such as spam filters, credit card fraud detection, search engines, news trends, market segmentation and advertising, insurance or loan qualification, and credit scoring. These mechanisms of classification all frequently rely on computational algorithms, and in many cases on machine learning algorithms to do this work. In this article, I draw a distinction between three forms of opacity: (1) opacity as intentional corporate or state secrecy, (2) opacity as technical illiteracy, and (3) an opacity that arises from the characteristics of machine learning algorithms and the scale required to apply them usefully. The analysis in this article gets inside the algorithms themselves. I cite existing literatures in computer science, known industry practices (as they are publicly presented), and do some testing and manipulation of code as a form of lightweight code audit. I argue that recognizing the distinct forms of opacity that may be coming into play in a given application is a key to determining which of a variety of technical and non-technical solutions could help to prevent harm.

  17. Support Vector Machine and Application in Seizure Prediction

    KAUST Repository

    Qiu, Simeng

    2018-04-01

    Machine learning (ML) is now used in a wide range of areas, from engineering to business. In this paper, we first present several kernel machine learning methods for solving classification, regression and clustering problems. These methods perform well but also have some limitations. We present examples for each method and analyze its advantages and disadvantages in different scenarios. We then focus on one of the most popular classification methods, the Support Vector Machine (SVM). We introduce the basic theory and advantages of SVM and the scenarios in which it is used for classification problems. We also explain a convenient approach to tackling SVM problems, called Sequential Minimal Optimization (SMO). Moreover, one-class SVM can be understood in a different way, called Support Vector Data Description (SVDD). This is a well-known non-linear model problem. Compared with SVM problems, SVDD can be solved by using a Gaussian RBF kernel function combined with SMO. Finally, we compare the differences and performance of the SVM-SMO and SVM-SVDD implementations. In the application part, we used the SVM method for seizure forecasting in canine epilepsy, after comparing the results of different methods, such as random forest, extremely randomized trees, and SVM, in classifying preictal (pre-seizure) and interictal (interval-seizure) binary data. We conclude that SVM has the best performance.
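
    The abstract above mentions a Gaussian RBF kernel function without stating it. As a minimal sketch of the standard definition, k(x, y) = exp(-gamma * ||x - y||^2), the following may help; the function names and the default `gamma` value are illustrative assumptions and are not taken from the thesis.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian RBF kernel: exp(-gamma * squared Euclidean distance).

    gamma is a hypothetical default chosen for illustration only.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def gram_matrix(points, gamma=0.5):
    """Pairwise kernel values for a list of points (symmetric matrix)."""
    return [[rbf_kernel(p, q, gamma) for q in points] for p in points]


if __name__ == "__main__":
    # Toy points, not data from the cited work.
    pts = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
    for row in gram_matrix(pts, gamma=1.0):
        print(["%.4f" % v for v in row])
```

    A solver such as SMO would consume kernel values like these; the solver itself is outside the scope of this sketch.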

  18. Machine learning for fab automated diagnostics

    Science.gov (United States)

    Giollo, Manuel; Lam, Auguste; Gkorou, Dimitra; Liu, Xing Lan; van Haren, Richard

    2017-06-01

    Process optimization depends largely on field engineers' knowledge and expertise. However, this practice turns out to be less sustainable due to the fab complexity which is continuously increasing in order to support the extreme miniaturization of Integrated Circuits. On the one hand, process optimization and root cause analysis of tools is necessary for a smooth fab operation. On the other hand, the growth in number of wafer processing steps is adding a considerable new source of noise which may have a significant impact at the nanometer scale. This paper explores the ability of historical process data and Machine Learning to support field engineers in production analysis and monitoring. We implement an automated workflow in order to analyze a large volume of information, and build a predictive model of overlay variation. The proposed workflow addresses significant problems that are typical in fab production, like missing measurements, small number of samples, confounding effects due to heterogeneity of data, and subpopulation effects. We evaluate the proposed workflow on a real use case and we show that it is able to predict overlay excursions observed in Integrated Circuits manufacturing. The chosen design focuses on linear and interpretable models of the wafer history, which highlight the process steps that are causing defective products. This is a fundamental feature for diagnostics, as it supports process engineers in the continuous improvement of the production line.

  19. GAME: GAlaxy Machine learning for Emission lines

    Science.gov (United States)

    Ucci, G.; Ferrara, A.; Pallottini, A.; Gallerani, S.

    2018-06-01

    We present an updated, optimized version of GAME (GAlaxy Machine learning for Emission lines), a code designed to infer key interstellar medium physical properties from emission line intensities of ultraviolet/optical/far-infrared galaxy spectra. The improvements concern (a) an enlarged spectral library including Pop III stars, (b) the inclusion of spectral noise in the training procedure, and (c) an accurate evaluation of uncertainties. We extensively validate the optimized code and compare its performance against empirical methods and other available emission line codes (PYQZ and HII-CHI-MISTRY) on a sample of 62 SDSS stacked galaxy spectra and 75 observed HII regions. Very good agreement is found for metallicity. However, ionization parameters derived by GAME tend to be higher. We show that this is due to the use of too limited libraries in the other codes. The main advantages of GAME are the simultaneous use of all the measured spectral lines and the extremely short computational times. We finally discuss the code potential and limitations.

  20. Predicting Increased Blood Pressure Using Machine Learning

    Science.gov (United States)

    Golino, Hudson Fernandes; Amaral, Liliany Souza de Brito; Duarte, Stenio Fernando Pimentel; Soares, Telma de Jesus; dos Reis, Luciana Araujo

    2014-01-01

    The present study investigates the prediction of increased blood pressure by body mass index (BMI), waist (WC) and hip circumference (HC), and waist hip ratio (WHR) using a machine learning technique named classification tree. Data were collected from 400 college students (56.3% women) from 16 to 63 years old. Fifteen trees were calculated in the training group for each sex, using different numbers and combinations of predictors. The result shows that for women BMI, WC, and WHR are the combination that produces the best prediction, since it has the lowest deviance (87.42), the lowest misclassification (.19), and the highest pseudo R² (.43). This model presented a sensitivity of 80.86% and specificity of 81.22% in the training set and, respectively, 45.65% and 65.15% in the test sample. For men BMI, WC, HC, and WHC showed the best prediction with the lowest deviance (57.25), the lowest misclassification (.16), and the highest pseudo R² (.46). This model had a sensitivity of 72% and specificity of 86.25% in the training set and, respectively, 58.38% and 69.70% in the test set. Finally, the result from the classification tree analysis was compared with traditional logistic regression, indicating that the former outperformed the latter in terms of predictive power. PMID:24669313
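
    The sensitivity and specificity figures quoted above are derived from a binary confusion matrix. The sketch below shows that arithmetic on hypothetical toy labels (1 = increased blood pressure, 0 = not); the function name and the data are assumptions for illustration and are not the study's code or results.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1.

    sensitivity = TP / (TP + FN): share of true positives detected.
    specificity = TN / (TN + FP): share of true negatives detected.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Hypothetical toy labels, not data from the cited study.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
    sens, spec = sensitivity_specificity(y_true, y_pred)
    print(f"sensitivity={sens:.2%} specificity={spec:.2%}")
```

    Computing both values separately on the training and test sets, as the abstract reports, is one way to expose the gap between fitted and out-of-sample performance.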