WorldWideScience

Sample records for machine learning tasks

  1. Towards a Future Reallocation of Work between Humans and Machines – Taxonomy of Tasks and Interaction Types in the Context of Machine Learning

    OpenAIRE

    Traumer, Fabian; Oeste-Reiß, Sarah; Leimeister, Jan Marco

    2017-01-01

    In today’s race for competitive advantages, more and more companies implement innovations in artificial intelligence and machine learning (ML). Although these machines take over tasks that were previously executed by humans, they will not make the human workforce obsolete. To leverage the potential of ML, collaboration between humans and machines is necessary. Before collaboration processes can be developed, a classification of tasks in the field of ML is needed. Therefore, we present a taxonomy for t...

  2. Quantum machine learning.

    Science.gov (United States)

    Biamonte, Jacob; Wittek, Peter; Pancotti, Nicola; Rebentrost, Patrick; Wiebe, Nathan; Lloyd, Seth

    2017-09-13

    Fuelled by increasing computer power and algorithmic advances, machine learning techniques have become powerful tools for finding patterns in data. Quantum systems produce atypical patterns that classical systems are thought not to produce efficiently, so it is reasonable to postulate that quantum computers may outperform classical computers on machine learning tasks. The field of quantum machine learning explores how to devise and implement quantum software that could enable machine learning that is faster than that of classical computers. Recent work has produced quantum algorithms that could act as the building blocks of machine learning programs, but the hardware and software challenges are still considerable.

  3. Machine Learning.

    Science.gov (United States)

    Kirrane, Diane E.

    1990-01-01

    As scientists seek to develop machines that can "learn," that is, solve problems by imitating the human brain, a gold mine of information on the processes of human learning is being discovered, expert systems are being improved, and human-machine interactions are being enhanced. (SK)

  4. Machine Learning

    CERN Multimedia

    CERN. Geneva

    2017-01-01

    Machine learning, which builds on ideas in computer science, statistics, and optimization, focuses on developing algorithms to identify patterns and regularities in data, and using these learned patterns to make predictions on new observations. Boosted by its industrial and commercial applications, the field of machine learning is quickly evolving and expanding. Recent advances have seen great success in the realms of computer vision, natural language processing, and broadly in data science. Many of these techniques have already been applied in particle physics, for instance for particle identification, detector monitoring, and the optimization of computer resources. Modern machine learning approaches, such as deep learning, are only just beginning to be applied to the analysis of High Energy Physics data to tackle increasingly complex problems. These classes will review the framework behind machine learning and discuss recent developments in the field.

  5. Machine learning with R

    CERN Document Server

    Lantz, Brett

    2013-01-01

    Written as a tutorial to explore and understand the power of R for machine learning, this practical guide covers all of the need-to-know topics in a systematic way. For each machine learning approach, each step in the process is detailed, from preparing the data for analysis to evaluating the results. These steps will build the knowledge you need to apply them to your own data science tasks. Intended for those who want to learn how to use R's machine learning capabilities and gain insight from their data. Perhaps you already know a bit about machine learning, but have never used R; or

  6. Machine Learning

    Energy Technology Data Exchange (ETDEWEB)

    Chikkagoudar, Satish; Chatterjee, Samrat; Thomas, Dennis G.; Carroll, Thomas E.; Muller, George

    2017-04-21

    The absence of a robust and unified theory of cyber dynamics presents challenges and opportunities for using machine learning-based, data-driven approaches to further the understanding of the behavior of such complex systems. Analysts can also use machine learning approaches to gain operational insights. In order to be operationally beneficial, cybersecurity machine learning-based models need to have the ability to: (1) represent a real-world system, (2) infer system properties, and (3) learn and adapt based on expert knowledge and observations. Probabilistic models and probabilistic graphical models provide these necessary properties and are further explored in this chapter. Bayesian networks and hidden Markov models are introduced as examples of widely used data-driven classification/modeling strategies.
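
    As a concrete illustration of the data-driven classification strategy named above, the sketch below scores an event sequence under two discrete hidden Markov models, one per class, and picks the more likely class. This is a minimal stand-in, not the chapter's models: the two-state HMMs, the three-symbol event alphabet, and every probability value are invented for illustration (Python with NumPy).

        import numpy as np

        def forward_log_likelihood(obs, pi, A, B):
            """Scaled forward algorithm: log-likelihood of a discrete
            observation sequence under an HMM (pi: initial state probs,
            A: state transitions, B: per-state emission probs)."""
            alpha = pi * B[:, obs[0]]
            log_p = np.log(alpha.sum())
            alpha /= alpha.sum()
            for o in obs[1:]:
                alpha = (alpha @ A) * B[:, o]
                log_p += np.log(alpha.sum())
                alpha /= alpha.sum()
            return log_p

        # Two hypothetical 2-state HMMs over a 3-symbol alphabet of
        # quantized network events; all numbers are illustrative only.
        pi = np.array([0.6, 0.4])
        A_benign = np.array([[0.9, 0.1], [0.2, 0.8]])
        B_benign = np.array([[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]])
        A_attack = np.array([[0.5, 0.5], [0.5, 0.5]])
        B_attack = np.array([[0.1, 0.2, 0.7], [0.3, 0.3, 0.4]])

        seq = [0, 0, 1, 2, 2, 2]  # an observed event sequence
        ll_benign = forward_log_likelihood(seq, pi, A_benign, B_benign)
        ll_attack = forward_log_likelihood(seq, pi, A_attack, B_attack)
        print("attack" if ll_attack > ll_benign else "benign")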

  7. Machine Learning for Hackers

    CERN Document Server

    Conway, Drew

    2012-01-01

    If you're an experienced programmer interested in crunching data, this book will get you started with machine learning: a toolkit of algorithms that enables computers to train themselves to automate useful tasks. Authors Drew Conway and John Myles White help you understand machine learning and statistics tools through a series of hands-on case studies, instead of a traditional math-heavy presentation. Each chapter focuses on a specific problem in machine learning, such as classification, prediction, optimization, and recommendation. Using the R programming language, you'll learn how to analyz

  8. Machine-learning model observer for detection and localization tasks in clinical SPECT-MPI

    Science.gov (United States)

    Parages, Felipe M.; O'Connor, J. Michael; Pretorius, P. Hendrik; Brankov, Jovan G.

    2016-03-01

    In this work we propose a machine-learning MO based on naive-Bayes classification (NB-MO) for the diagnostic tasks of detection, localization, and assessment of perfusion defects in clinical SPECT Myocardial Perfusion Imaging (MPI), with the goal of evaluating several image reconstruction methods used in clinical practice. NB-MO uses image features extracted from polar maps in order to predict the lesion detection, localization, and severity scores given by human readers in a series of 3D SPECT-MPI. The population used to tune (i.e. train) the NB-MO consisted of simulated SPECT-MPI cases - divided into normal cases and cases with lesions of variable sizes and locations - reconstructed using the filtered backprojection (FBP) method. An ensemble of five human specialists (physicians) read a subset of simulated reconstructed images and assigned a perfusion score for each region of the left ventricle (LV). Polar maps generated from the simulated volumes, along with their corresponding human scores, were used to train five NB-MOs (one per human reader), which were subsequently applied (i.e. tested) on three sets of clinical SPECT-MPI polar maps in order to predict human detection and localization scores. The clinical "testing" population comprises healthy individuals and patients suffering from coronary artery disease (CAD) in three possible regions, namely: LAD, LCx and RCA. Each clinical case was reconstructed using three reconstruction strategies, namely: FBP with no SC (i.e. scatter compensation), OSEM with the Triple Energy Window (TEW) SC method, and OSEM with Effective Source Scatter Estimation (ESSE) SC. Alternative Free-Response Receiver Operating Characteristic (AFROC) analysis of perfusion scores shows that NB-MO predicts a higher human performance for scatter-compensated reconstructions, in agreement with what has been reported in the published literature. These results suggest that NB-MO has good potential to generalize well to reconstruction methods not used during training, even for reasonably dissimilar datasets (i
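
    A minimal sketch of the naive-Bayes step described above, assuming scikit-learn and entirely synthetic data: feature vectors standing in for polar-map features are mapped to the perfusion scores one human reader assigned, and the fitted model then predicts human-like scores for unseen cases. The 17-segment feature layout and the 0-4 score range are assumptions, not the paper's specification.

        import numpy as np
        from sklearn.naive_bayes import GaussianNB

        rng = np.random.default_rng(0)
        # Synthetic stand-ins: one feature vector per polar map (assumed
        # 17 LV-segment features) and the score a human reader gave it.
        X_train = rng.normal(size=(200, 17))
        y_train = rng.integers(0, 5, size=200)      # assumed reader scores 0..4

        nb_mo = GaussianNB().fit(X_train, y_train)  # one such model per reader

        X_clinical = rng.normal(size=(10, 17))      # unseen clinical cases
        print(nb_mo.predict(X_clinical))            # predicted human-like scores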

  9. Neural Control of a Tracking Task via Attention-Gated Reinforcement Learning for Brain-Machine Interfaces.

    Science.gov (United States)

    Wang, Yiwen; Wang, Fang; Xu, Kai; Zhang, Qiaosheng; Zhang, Shaomin; Zheng, Xiaoxiang

    2015-05-01

    Reinforcement learning (RL)-based brain-machine interfaces (BMIs) enable the user to learn from the environment through interactions to complete the task without desired signals, which is promising for clinical applications. Previous studies exploited Q-learning techniques to discriminate neural states into simple directional actions, given the initial timing of each trial. However, the movements in BMI applications can be quite complicated, and the action timing explicitly reveals the intention of when to move. The rich actions and the corresponding neural states form a large state-action space, imposing generalization difficulty on Q-learning. In this paper, we propose to adopt attention-gated reinforcement learning (AGREL) as a new learning scheme for BMIs to adaptively decode high-dimensional neural activities into seven distinct movements (directional moves, holding and resting), owing to its efficient weight updating. We apply AGREL to neural data recorded from M1 of a monkey to directly predict a seven-action set in a time sequence to reconstruct the trajectory of a center-out task. Compared to Q-learning techniques, AGREL improved the target acquisition rate to 90.16% on average, with faster convergence and more stability in following neural activity over multiple days, indicating the potential to achieve better online decoding performance for more complicated BMI tasks.
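
    The paper's AGREL update rule is not reproduced here, but the flavor of reward-gated decoding can be conveyed by a generic reward-modulated softmax decoder: neural features map to probabilities over seven actions, one action is sampled, and only the chosen action's weights are nudged by the reward prediction error. The dimensions, learning rate, and toy ground truth below are all assumptions (Python with NumPy).

        import numpy as np

        rng = np.random.default_rng(1)
        n_units, n_actions, lr = 64, 7, 0.05   # assumed sizes and learning rate
        W = np.zeros((n_actions, n_units))     # decoder weights

        def softmax(z):
            e = np.exp(z - z.max())
            return e / e.sum()

        def trial(x, correct_action):
            p = softmax(W @ x)
            a = rng.choice(n_actions, p=p)           # sample an action
            r = 1.0 if a == correct_action else 0.0  # binary task reward
            delta = r - p[a]                         # reward prediction error
            W[a] += lr * delta * x                   # update only the chosen action
            return r

        # Toy loop: the "correct" action is hidden in the first 7 features.
        hits = 0.0
        for _ in range(2000):
            x = rng.normal(size=n_units)
            hits += trial(x, int(np.argmax(x[:n_actions])))
        print("hit rate:", hits / 2000)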

  10. Machine Learning in Medicine.

    Science.gov (United States)

    Deo, Rahul C

    2015-11-17

    Spurred by advances in processing power, memory, storage, and an unprecedented wealth of data, computers are being asked to tackle increasingly complex learning tasks, often with astonishing success. Computers have now mastered a popular variant of poker, learned the laws of physics from experimental data, and become experts in video games - tasks that would have been deemed impossible not too long ago. In parallel, the number of companies centered on applying complex data analysis to varying industries has exploded, and it is thus unsurprising that some analytic companies are turning attention to problems in health care. The purpose of this review is to explore what problems in medicine might benefit from such learning approaches and use examples from the literature to introduce basic concepts in machine learning. It is important to note that seemingly large enough medical data sets and adequate learning algorithms have been available for many decades, and yet, although there are thousands of papers applying machine learning algorithms to medical data, very few have contributed meaningfully to clinical care. This lack of impact stands in stark contrast to the enormous relevance of machine learning to many other industries. Thus, part of my effort will be to identify what obstacles there may be to changing the practice of medicine through statistical learning approaches, and discuss how these might be overcome. © 2015 American Heart Association, Inc.

  11. Machine Learning in Medicine

    Science.gov (United States)

    Deo, Rahul C.

    2015-01-01

    Spurred by advances in processing power, memory, storage, and an unprecedented wealth of data, computers are being asked to tackle increasingly complex learning tasks, often with astonishing success. Computers have now mastered a popular variant of poker, learned the laws of physics from experimental data, and become experts in video games – tasks which would have been deemed impossible not too long ago. In parallel, the number of companies centered on applying complex data analysis to varying industries has exploded, and it is thus unsurprising that some analytic companies are turning attention to problems in healthcare. The purpose of this review is to explore what problems in medicine might benefit from such learning approaches and use examples from the literature to introduce basic concepts in machine learning. It is important to note that seemingly large enough medical data sets and adequate learning algorithms have been available for many decades – and yet, although there are thousands of papers applying machine learning algorithms to medical data, very few have contributed meaningfully to clinical care. This lack of impact stands in stark contrast to the enormous relevance of machine learning to many other industries. Thus part of my effort will be to identify what obstacles there may be to changing the practice of medicine through statistical learning approaches, and discuss how these might be overcome. PMID:26572668

  12. Use of a machine learning algorithm to classify expertise: analysis of hand motion patterns during a simulated surgical task.

    Science.gov (United States)

    Watson, Robert A

    2014-08-01

    To test the hypothesis that machine learning algorithms increase the predictive power to classify surgical expertise using surgeons' hand motion patterns. In 2012 at the University of North Carolina at Chapel Hill, 14 surgical attendings and 10 first- and second-year surgical residents each performed two bench model venous anastomoses. During the simulated tasks, the participants wore an inertial measurement unit on the dorsum of their dominant (right) hand to capture their hand motion patterns. The pattern from each bench model task performed was preprocessed into a symbolic time series and labeled as expert (attending) or novice (resident). The labeled hand motion patterns were processed and used to train a Support Vector Machine (SVM) classification algorithm. The trained algorithm was then tested for discriminative/predictive power against unlabeled (blinded) hand motion patterns from tasks not used in the training. The Lempel-Ziv (LZ) complexity metric was also measured from each hand motion pattern, with an optimal threshold calculated to separately classify the patterns. The LZ metric classified unlabeled (blinded) hand motion patterns into expert and novice groups with an accuracy of 70% (sensitivity 64%, specificity 80%). The SVM algorithm had an accuracy of 83% (sensitivity 86%, specificity 80%). The results confirmed the hypothesis. The SVM algorithm increased the predictive power to classify blinded surgical hand motion patterns into expert versus novice groups. With further development, the system used in this study could become a viable tool for low-cost, objective assessment of procedural proficiency in a competency-based curriculum.
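
    Both classifiers compared in the study are easy to sketch under stated assumptions: an LZ76-style phrase-counting complexity, and a scikit-learn SVM trained on crude features of the symbolic series. The three-symbol alphabet, the histogram features, and the synthetic "expert"/"novice" patterns below are invented; the paper's actual preprocessing is not reproduced.

        import numpy as np
        from sklearn.svm import SVC

        def lz_complexity(s):
            """LZ76-style complexity: count the phrases found while scanning
            the symbol string left to right (each new phrase is the shortest
            substring not already seen in the preceding text)."""
            i, c, n = 0, 0, len(s)
            while i < n:
                l = 1
                while i + l <= n and s[i:i + l] in s[:i + l - 1]:
                    l += 1
                c += 1
                i += l
            return c

        ALPHABET = "abc"  # assumed symbol alphabet for the quantized motions

        def features(sym):
            # symbol frequencies plus LZ complexity as a crude feature vector
            return [sym.count(ch) / len(sym) for ch in ALPHABET] + [lz_complexity(sym)]

        rng = np.random.default_rng(0)
        def toy_pattern(p):  # synthetic stand-in for one recorded motion pattern
            return "".join(rng.choice(list(ALPHABET), p=p) for _ in range(60))

        X = [features(toy_pattern([0.7, 0.2, 0.1])) for _ in range(20)] + \
            [features(toy_pattern([0.3, 0.4, 0.3])) for _ in range(20)]
        y = [1] * 20 + [0] * 20  # 1 = "expert", 0 = "novice" (synthetic labels)

        clf = SVC(kernel="rbf").fit(X, y)
        print(clf.predict([features(toy_pattern([0.7, 0.2, 0.1]))]))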

  13. Quantum machine learning with glow for episodic tasks and decision games

    Science.gov (United States)

    Clausen, Jens; Briegel, Hans J.

    2018-02-01

    We consider a general class of models, where a reinforcement learning (RL) agent learns from cyclic interactions with an external environment via classical signals. Perceptual inputs are encoded as quantum states, which are subsequently transformed by a quantum channel representing the agent's memory, while the outcomes of measurements performed at the channel's output determine the agent's actions. The learning takes place via stepwise modifications of the channel properties. They are described by an update rule that is inspired by the projective simulation (PS) model and equipped with a glow mechanism that allows for a backpropagation of policy changes, analogous to the eligibility traces in RL and edge glow in PS. In this way, the model combines features of PS with the ability for generalization, offered by its physical embodiment as a quantum system. We apply the agent to various setups of an invasion game and a grid world, which serve as elementary model tasks allowing a direct comparison with a basic classical PS agent.

  14. Human Machine Learning Symbiosis

    Science.gov (United States)

    Walsh, Kenneth R.; Hoque, Md Tamjidul; Williams, Kim H.

    2017-01-01

    Human Machine Learning Symbiosis is a cooperative system where both the human learner and the machine learner learn from each other to create an effective and efficient learning environment adapted to the needs of the human learner. Such a system can be used in online learning modules so that the modules adapt to each learner's learning state both…

  15. Attention: A Machine Learning Perspective

    DEFF Research Database (Denmark)

    Hansen, Lars Kai

    2012-01-01

    We review a statistical machine learning model of top-down, task-driven attention based on the notion of ‘gist’. In this framework we consider the task to be represented as a classification problem with two sets of features — a gist of coarse-grained global features and a larger set of low...

  16. Machine Learning of Musical Gestures

    OpenAIRE

    Caramiaux, Baptiste; Tanaka, Atau

    2013-01-01

    We present an overview of machine learning (ML) techniques and their application in interactive music and new digital instrument design. We first give the non-specialist reader an introduction to two ML tasks, classification and regression, that are particularly relevant for gestural interaction. We then present a review of the literature in current NIME research that uses ML in musical gesture analysis and gestural sound control. We describe the ways in which machine learning is useful for cre...

  17. Quantum Machine Learning

    Science.gov (United States)

    Biswas, Rupak

    2018-01-01

    Quantum computing promises an unprecedented ability to solve intractable problems by harnessing quantum mechanical effects such as tunneling, superposition, and entanglement. The Quantum Artificial Intelligence Laboratory (QuAIL) at NASA Ames Research Center is the space agency's primary facility for conducting research and development in quantum information sciences. QuAIL conducts fundamental research in quantum physics but also explores how best to exploit and apply this disruptive technology to enable NASA missions in aeronautics, Earth and space sciences, and space exploration. At the same time, machine learning has become a major focus in computer science and captured the imagination of the public as a panacea to myriad big data problems. In this talk, we will discuss how classical machine learning can take advantage of quantum computing to significantly improve its effectiveness. Although we illustrate this concept on a quantum annealer, other quantum platforms could be used as well. If explored fully and implemented efficiently, quantum machine learning could greatly accelerate a wide range of tasks leading to new technologies and discoveries that will significantly change the way we solve real-world problems.

  18. Continual Learning through Evolvable Neural Turing Machines

    DEFF Research Database (Denmark)

    Lüders, Benno; Schläger, Mikkel; Risi, Sebastian

    2016-01-01

    Continual learning, i.e. the ability to sequentially learn tasks without catastrophic forgetting of previously learned ones, is an important open challenge in machine learning. In this paper we take a step in this direction by showing that the recently proposed Evolving Neural Turing Machine (ENTM...

  19. Machine learning methods for planning

    CERN Document Server

    Minton, Steven

    1993-01-01

    Machine Learning Methods for Planning provides information pertinent to learning methods for planning and scheduling. This book covers a wide variety of learning methods and learning architectures, including analogical, case-based, decision-tree, explanation-based, and reinforcement learning.Organized into 15 chapters, this book begins with an overview of planning and scheduling and describes some representative learning systems that have been developed for these tasks. This text then describes a learning apprentice for calendar management. Other chapters consider the problem of temporal credi

  20. Microsoft Azure machine learning

    CERN Document Server

    Mund, Sumit

    2015-01-01

    The book is intended for those who want to learn how to use Azure Machine Learning. Perhaps you already know a bit about Machine Learning, but have never used ML Studio in Azure; or perhaps you are an absolute newbie. In either case, this book will get you up-and-running quickly.

  1. Pattern recognition & machine learning

    CERN Document Server

    Anzai, Y

    1992-01-01

    This is the first text to provide a unified and self-contained introduction to visual pattern recognition and machine learning. It is useful as a general introduction to artificial intelligence and knowledge engineering, and no previous knowledge of pattern recognition or machine learning is necessary. It covers the basics of various pattern recognition and machine learning methods. Translated from Japanese, the book also features chapter exercises, keywords, and summaries.

  2. Introduction to machine learning

    OpenAIRE

    Baştanlar, Yalın; Özuysal, Mustafa

    2014-01-01

    The machine learning field, which can be briefly defined as enabling computers to make successful predictions using past experiences, has exhibited impressive development recently with the help of the rapid increase in the storage capacity and processing power of computers. Together with many other disciplines, machine learning methods have been widely employed in bioinformatics. The difficulties and cost of biological analyses have led to the development of sophisticated machine learning app...

  3. Supervised-machine Learning for Intelligent Collision Avoidance Decision-making and Sensor Tasking

    Data.gov (United States)

    National Aeronautics and Space Administration — Building an autonomous architecture that uses directed self-learning neuro-fuzzy networks with the aim of developing an intelligent autonomous collision avoidance...

  4. Introduction to machine learning.

    Science.gov (United States)

    Baştanlar, Yalin; Ozuysal, Mustafa

    2014-01-01

    The machine learning field, which can be briefly defined as enabling computers to make successful predictions using past experiences, has exhibited impressive development recently with the help of the rapid increase in the storage capacity and processing power of computers. Together with many other disciplines, machine learning methods have been widely employed in bioinformatics. The difficulties and cost of biological analyses have led to the development of sophisticated machine learning approaches for this application area. In this chapter, we first review the fundamental concepts of machine learning, such as feature assessment, unsupervised versus supervised learning, and types of classification. Then, we point out the main issues of designing machine learning experiments and their performance evaluation. Finally, we introduce some supervised learning methods.
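
    A minimal sketch of the experimental-design point made above, assuming scikit-learn rather than any tool named in the chapter: fit on one part of the data and evaluate on a held-out part, since accuracy on the training set alone is not a valid performance estimate.

        from sklearn.datasets import load_iris
        from sklearn.linear_model import LogisticRegression
        from sklearn.metrics import accuracy_score
        from sklearn.model_selection import train_test_split

        X, y = load_iris(return_X_y=True)
        # Hold out a test set: performance must be measured on unseen data.
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                                  random_state=0)

        clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
        print("held-out accuracy:", accuracy_score(y_te, clf.predict(X_te)))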

  5. Machine-Learning Research

    OpenAIRE

    Dietterich, Thomas G.

    1997-01-01

    Machine-learning research has been making great progress in many directions. This article summarizes four of these directions and discusses some current open problems. The four directions are (1) the improvement of classification accuracy by learning ensembles of classifiers, (2) methods for scaling up supervised learning algorithms, (3) reinforcement learning, and (4) the learning of complex stochastic models.

  6. Machine learning with R

    CERN Document Server

    Lantz, Brett

    2015-01-01

    Perhaps you already know a bit about machine learning but have never used R, or perhaps you know a little R but are new to machine learning. In either case, this book will get you up and running quickly. It would be helpful to have a bit of familiarity with basic programming concepts, but no prior experience is required.

  7. Machine Learning and Radiology

    Science.gov (United States)

    Wang, Shijun; Summers, Ronald M.

    2012-01-01

    In this paper, we give a short introduction to machine learning and survey its applications in radiology. We focus on six categories of applications in radiology: medical image segmentation; registration; computer-aided detection and diagnosis; brain function or activity analysis and neurological disease diagnosis from fMR images; content-based image retrieval systems for CT or MRI images; and text analysis of radiology reports using natural language processing (NLP) and natural language understanding (NLU). This survey shows that machine learning plays a key role in many radiology applications. Machine learning identifies complex patterns automatically and helps radiologists make intelligent decisions on radiology data such as conventional radiographs, CT, MRI, and PET images, and radiology reports. In many applications, the performance of machine learning-based automatic detection and diagnosis systems has been shown to be comparable to that of a well-trained and experienced radiologist. Technology development in machine learning and radiology will benefit from each other in the long run. Key contributions and common characteristics of machine learning techniques in radiology are discussed. We also discuss the problem of translating machine learning applications to the radiology clinical setting, including advantages and potential barriers. PMID:22465077

  8. Machine learning and radiology.

    Science.gov (United States)

    Wang, Shijun; Summers, Ronald M

    2012-07-01

    In this paper, we give a short introduction to machine learning and survey its applications in radiology. We focus on six categories of applications in radiology: medical image segmentation; registration; computer-aided detection and diagnosis; brain function or activity analysis and neurological disease diagnosis from fMR images; content-based image retrieval systems for CT or MRI images; and text analysis of radiology reports using natural language processing (NLP) and natural language understanding (NLU). This survey shows that machine learning plays a key role in many radiology applications. Machine learning identifies complex patterns automatically and helps radiologists make intelligent decisions on radiology data such as conventional radiographs, CT, MRI, and PET images, and radiology reports. In many applications, the performance of machine learning-based automatic detection and diagnosis systems has been shown to be comparable to that of a well-trained and experienced radiologist. Technology development in machine learning and radiology will benefit from each other in the long run. Key contributions and common characteristics of machine learning techniques in radiology are discussed. We also discuss the problem of translating machine learning applications to the radiology clinical setting, including advantages and potential barriers. Copyright © 2012. Published by Elsevier B.V.

  9. Creativity in Machine Learning

    OpenAIRE

    Thoma, Martin

    2016-01-01

    Recent machine learning techniques can be modified to produce creative results. Those results did not exist before; they are not trivial combinations of the data that was fed into the machine learning system. The obtained results come in multiple forms: as images, as text, and as audio. This paper gives a high-level overview of how they are created and gives some examples. It is meant to be a summary of the current work and to give people who are new to machine learning some starting points.

  10. Higgs Machine Learning Challenge 2014

    CERN Document Server

    Olivier, A-P; Bourdarios, C (LAL / Orsay); Goldfarb, S (University of Michigan)

    2014-01-01

    High Energy Physics (HEP) has been using Machine Learning (ML) techniques such as boosted decision trees and neural nets since the 90s. These techniques are now routinely used for difficult tasks such as the Higgs boson search. Nevertheless, formal connections between the two research fields are rather scarce, with some exceptions such as the AppStat group at LAL, founded in 2006. In collaboration with INRIA, AppStat promotes interdisciplinary research on machine learning, computational statistics, and high-energy particle and astroparticle physics. We are now exploring new ways to improve the cross-fertilization of the two fields by setting up a data challenge, following in the footsteps of, among others, the astrophysics community (dark matter and galaxy zoo challenges) and neurobiology (connectomics and decoding the human brain). The organization committee consists of ATLAS physicists and machine learning researchers. The Challenge will run from Monday 12 May to September 2014.

  11. Quantum Machine Learning

    OpenAIRE

    Romero García, Cristian

    2017-01-01

    In a world in which accessible information grows exponentially, the selection of the appropriate information turns out to be an extremely relevant problem. In this context, the idea of Machine Learning (ML), a subfield of Artificial Intelligence, emerged to face problems in data mining, pattern recognition, automatic prediction, among others. Quantum Machine Learning is an interdisciplinary research area combining quantum mechanics with methods of ML, in which quantum properties allow fo...

  12. The RADAR Test Methodology: Evaluating a Multi-Task Machine Learning System with Humans in the Loop

    Science.gov (United States)

    2006-10-01

    details, static websites, and an ecommerce vendor portal. The “corpus” consists of the email and world state content. The latter consists of facts... learned fact variation, and the opportunity to induce a substantial crisis workload. The conference itself was a 4-day, multi-track technical conference... General: 1. I am confident I completed the task well. 2. The task was difficult to complete. 3. I could have done as good of a job without the

  13. Machine learning systems

    Energy Technology Data Exchange (ETDEWEB)

    Forsyth, R

    1984-05-01

    With the dramatic rise of expert systems has come a renewed interest in the fuel that drives them: knowledge. For it is specialist knowledge which gives expert systems their power. But extracting knowledge from human experts in symbolic form has proved arduous and labour-intensive. So the idea of machine learning is enjoying a renaissance. Machine learning is any automatic improvement in the performance of a computer system over time, as a result of experience. Thus a learning algorithm seeks to do one or more of the following: cover a wider range of problems, deliver more accurate solutions, obtain answers more cheaply, and simplify codified knowledge. 6 references.

  14. Clojure for machine learning

    CERN Document Server

    Wali, Akhil

    2014-01-01

    A book that brings out the strengths of Clojure programming as they apply to machine learning. Each topic is described in substantial detail, and examples and libraries in Clojure are also demonstrated. This book is intended for Clojure developers who want to explore the area of machine learning. A basic understanding of the Clojure programming language is required, but thorough acquaintance with the standard Clojure library or any other libraries is not required. Familiarity with the theoretical concepts and notation of mathematics and statistics would be an added advantage.

  15. Mastering machine learning with scikit-learn

    CERN Document Server

    Hackeling, Gavin

    2014-01-01

    If you are a software developer who wants to learn how machine learning models work and how to apply them effectively, this book is for you. Familiarity with machine learning fundamentals and Python will be helpful, but is not essential.

  16. An introduction to quantum machine learning

    OpenAIRE

    Schuld, M.; Sinayskiy, I.; Petruccione, F.

    2014-01-01

    Machine learning algorithms learn a desired input-output relation from examples in order to interpret new inputs. This is important for tasks such as image and speech recognition or strategy optimisation, with growing applications in the IT industry. In the last couple of years, researchers investigated if quantum computing can help to improve classical machine learning algorithms. Ideas range from running computationally costly algorithms or their subroutines efficiently on a quantum compute...

  17. Machine Learning for Security

    CERN Multimedia

    CERN. Geneva

    2015-01-01

    Applied statistics, aka ‘Machine Learning’, offers a wealth of techniques for answering security questions. It’s a much hyped topic in the big data world, with many companies now providing machine learning as a service. This talk will demystify these techniques, explain the math, and demonstrate their application to security problems. The presentation will include how-to’s on classifying malware, looking into encrypted tunnels, and finding botnets in DNS data. About the speaker Josiah is a security researcher with HP TippingPoint DVLabs Research Group. He has over 15 years of professional software development experience. Josiah used to do AI, with work focused on graph theory, search, and deductive inference on large knowledge bases. As rules only get you so far, he moved from AI to using machine learning techniques identifying failure modes in email traffic. There followed digressions into clustered data storage and later integrated control systems. Current ...

  18. Massively collaborative machine learning

    NARCIS (Netherlands)

    Rijn, van J.N.

    2016-01-01

    Many scientists are focussed on building models. We process nearly all information we perceive into a model. There are many techniques that enable computers to build models as well. The field of research that develops such techniques is called Machine Learning. Much research is devoted to developing

  19. New Applications of Learning Machines

    DEFF Research Database (Denmark)

    Larsen, Jan

    * Machine learning framework for sound search * Genre classification * Music separation * MIMO channel estimation and symbol detection

  20. Machine Learning wins the Higgs Challenge

    CERN Multimedia

    Abha Eli Phoboo

    2014-01-01

    The winner of the four-month-long Higgs Machine Learning Challenge, launched on 12 May, is Gábor Melis from Hungary, followed closely by Tim Salimans from the Netherlands and Pierre Courtiol from France. The challenge explored the potential of advanced machine learning methods to improve the significance of the Higgs discovery.   Winners of the Higgs Machine Learning Challenge: Gábor Melis and Tim Salimans (top row), Tianqi Chen and Tong He (bottom row). Participants in the Higgs Machine Learning Challenge were tasked with developing an algorithm to improve the detection of Higgs boson signal events decaying into two tau particles in a sample of simulated ATLAS data* that contains few signal events and a majority of non-Higgs-boson “background” events. No knowledge of particle physics was required for the challenge, but skills in machine learning – the training of computers to recognise patterns in data – were essential. The Challenge, hosted by Ka...

  1. Can machine learning explain human learning?

    NARCIS (Netherlands)

    Vahdat, M.; Oneto, L.; Anguita, D.; Funk, M.; Rauterberg, G.W.M.

    2016-01-01

    Learning Analytics (LA) has a major interest in exploring and understanding the learning process of humans and, for this purpose, benefits from both Cognitive Science, which studies how humans learn, and Machine Learning, which studies how algorithms learn from data. Usually, Machine Learning is

  2. Student Modeling and Machine Learning

    OpenAIRE

    Sison, Raymund; Shimura, Masamichi

    1998-01-01

    After identifying essential student modeling issues and machine learning approaches, this paper examines how machine learning techniques have been used to automate the construction of student models as well as the background knowledge necessary for student modeling. In the process, the paper sheds light on the difficulty, suitability and potential of using machine learning for student modeling processes, and, to a lesser extent, the potential of using student modeling techniques in machine le...

  3. Machine learning with R cookbook

    CERN Document Server

    Chiu, Yu-Wei

    2015-01-01

    If you want to learn how to use R for machine learning and gain insights from your data, then this book is ideal for you. Regardless of your level of experience, this book covers the basics of applying R to machine learning through to advanced techniques. While it is helpful if you are familiar with basic programming or machine learning concepts, you do not require prior experience to benefit from this book.

  4. Soft computing in machine learning

    CERN Document Server

    Park, Jooyoung; Inoue, Atsushi

    2014-01-01

    As users and consumers demand smarter devices, intelligent systems are being revolutionized by machine learning. Machine learning, as part of intelligent systems, is already one of the most critical components in everyday tools, ranging from search engines and credit card fraud detection to stock market analysis. Machines can be trained to perform certain tasks so that they can automatically detect, diagnose, and solve a variety of problems. Intelligent systems have made rapid progress in advancing the state of the art in machine learning based on smart and deep perception. Using machine learning, intelligent systems find wide application in automated speech recognition, natural language processing, medical diagnosis, bioinformatics, and robot locomotion. This book introduces how to handle substantial amounts of data, teach machines, and improve decision-making models. And this book specializes in the development of advanced intelligent systems through machine learning. It...

  5. Machine Learning and Applied Linguistics

    OpenAIRE

    Vajjala, Sowmya

    2018-01-01

    This entry introduces the topic of machine learning and provides an overview of its relevance for applied linguistics and language learning. The discussion will focus on giving an introduction to the methods and applications of machine learning in applied linguistics, and will provide references for further study.

  6. Machine learning in healthcare informatics

    CERN Document Server

    Acharya, U; Dua, Prerna

    2014-01-01

    The book is a unique effort to represent a variety of techniques designed to represent, enhance, and empower multi-disciplinary and multi-institutional machine learning research in healthcare informatics. The book provides a unique compendium of current and emerging machine learning paradigms for healthcare informatics and reflects the diversity, complexity, and the depth and breadth of this multi-disciplinary area. The integrated, panoramic view of data and machine learning techniques can provide an opportunity for novel clinical insights and discoveries.

  7. Machine learning and medical imaging

    CERN Document Server

    Shen, Dinggang; Sabuncu, Mert

    2016-01-01

    Machine Learning and Medical Imaging presents state-of-the-art machine learning methods in medical image analysis. It first summarizes cutting-edge machine learning algorithms in medical imaging, including not only classical probabilistic modeling and learning methods, but also recent breakthroughs in deep learning, sparse representation/coding, and big data hashing. In the second part leading research groups around the world present a wide spectrum of machine learning methods with application to different medical imaging modalities, clinical domains, and organs. The biomedical imaging modalities include ultrasound, magnetic resonance imaging (MRI), computed tomography (CT), histology, and microscopy images. The targeted organs span the lung, liver, brain, and prostate, while there is also a treatment of examining genetic associations. Machine Learning and Medical Imaging is an ideal reference for medical imaging researchers, industry scientists and engineers, advanced undergraduate and graduate students, a...

  8. Machine learning topological states

    Science.gov (United States)

    Deng, Dong-Ling; Li, Xiaopeng; Das Sarma, S.

    2017-11-01

    Artificial neural networks and machine learning have now reached a new era after several decades of improvement in which applications are poised to explode in many fields of science, industry, and technology. Here, we use artificial neural networks to study an intriguing phenomenon in quantum physics—the topological phases of matter. We find that certain topological states, either symmetry-protected or with intrinsic topological order, can be represented with classical artificial neural networks. This is demonstrated by using three concrete spin systems, the one-dimensional (1D) symmetry-protected topological cluster state and the 2D and 3D toric code states with intrinsic topological orders. For all three cases, we show rigorously that the topological ground states can be represented by short-range neural networks in an exact and efficient fashion—the required number of hidden neurons is as small as the number of physical spins and the number of parameters scales only linearly with the system size. For the 2D toric-code model, we find that the proposed short-range neural networks can describe the excited states with Abelian anyons and their nontrivial mutual statistics as well. In addition, by using reinforcement learning we show that neural networks are capable of finding the topological ground states of nonintegrable Hamiltonians with strong interactions and studying their topological phase transitions. Our results demonstrate explicitly the exceptional power of neural networks in describing topological quantum states, and at the same time provide valuable guidance to machine learning of topological phases in generic lattice models.

  9. Adaptive Machine Aids to Learning.

    Science.gov (United States)

    Starkweather, John A.

    With emphasis on man-machine relationships and on machine evolution, computer-assisted instruction (CAI) is examined in this paper. The discussion includes the background of machine assistance to learning, the current status of CAI, directions of development, the development of criteria for successful instruction, meeting the needs of users,…

  10. An introduction to quantum machine learning

    Science.gov (United States)

    Schuld, Maria; Sinayskiy, Ilya; Petruccione, Francesco

    2015-04-01

    Machine learning algorithms learn a desired input-output relation from examples in order to interpret new inputs. This is important for tasks such as image and speech recognition or strategy optimisation, with growing applications in the IT industry. In the last couple of years, researchers investigated if quantum computing can help to improve classical machine learning algorithms. Ideas range from running computationally costly algorithms or their subroutines efficiently on a quantum computer to the translation of stochastic methods into the language of quantum theory. This contribution gives a systematic overview of the emerging field of quantum machine learning. It presents the approaches as well as technical details in an accessible way, and discusses the potential of a future theory of quantum learning.

  11. Machine Learning for Robotic Vision

    OpenAIRE

    Drummond, Tom

    2018-01-01

    Machine learning is a crucial enabling technology for robotics, in particular for unlocking the capabilities afforded by visual sensing. This talk will present research within Prof Drummond’s lab that explores how machine learning can be developed and used within the context of Robotic Vision.

  12. Machine Learning for Medical Imaging.

    Science.gov (United States)

    Erickson, Bradley J; Korfiatis, Panagiotis; Akkus, Zeynettin; Kline, Timothy L

    2017-01-01

    Machine learning is a technique for recognizing patterns that can be applied to medical images. Although it is a powerful tool that can help in rendering medical diagnoses, it can be misapplied. Machine learning typically begins with the machine learning algorithm system computing the image features that are believed to be of importance in making the prediction or diagnosis of interest. The machine learning algorithm system then identifies the best combination of these image features for classifying the image or computing some metric for the given image region. There are several methods that can be used, each with different strengths and weaknesses. There are open-source versions of most of these machine learning methods that make them easy to try and apply to images. Several metrics for measuring the performance of an algorithm exist; however, one must be aware of the possible associated pitfalls that can result in misleading metrics. More recently, deep learning has started to be used; this method has the benefit that it does not require image feature identification and calculation as a first step; rather, features are identified as part of the learning process. Machine learning has been used in medical imaging and will have a greater influence in the future. Those working in medical imaging must be aware of how machine learning works. © RSNA, 2017.

  13. Online transfer learning with extreme learning machine

    Science.gov (United States)

    Yin, Haibo; Yang, Yun-an

    2017-05-01

    In this paper, we propose a new transfer learning algorithm for online training. The proposed algorithm, which is called Online Transfer Extreme Learning Machine (OTELM), is based on Online Sequential Extreme Learning Machine (OSELM) while it introduces Semi-Supervised Extreme Learning Machine (SSELM) to transfer knowledge from the source to the target domain. With the manifold regularization, SSELM picks out instances from the source domain that are less relevant to those in the target domain to initialize the online training, so as to improve the classification performance. Experimental results demonstrate that the proposed OTELM can effectively use instances in the source domain to enhance the learning performance.
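
    OTELM itself is not reproduced here, but the extreme learning machine it builds on is simple to sketch: hidden-layer weights are random and fixed, and only the output weights are fit, by least squares. OSELM then updates those output weights recursively as data arrive, and SSELM adds a manifold-regularization term. The minimal batch version below (NumPy, with assumed sizes) shows just the core idea.

        import numpy as np

        class ELM:
            """Minimal batch extreme learning machine: a random, fixed hidden
            layer; output weights solved by least squares."""
            def __init__(self, n_hidden=30, seed=0):
                self.n_hidden = n_hidden
                self.rng = np.random.default_rng(seed)

            def _hidden(self, X):
                return np.tanh(X @ self.W_in + self.b)

            def fit(self, X, y):
                self.W_in = self.rng.normal(size=(X.shape[1], self.n_hidden))
                self.b = self.rng.normal(size=self.n_hidden)
                H = self._hidden(X)
                self.beta, *_ = np.linalg.lstsq(H, y, rcond=None)
                return self

            def predict(self, X):
                return self._hidden(X) @ self.beta

        # Toy usage: fit a 1-D regression and report mean absolute error.
        X = np.linspace(-3, 3, 200).reshape(-1, 1)
        y = np.sin(X).ravel()
        model = ELM().fit(X, y)
        print(np.abs(model.predict(X) - y).mean())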

  14. Machine learning applied to crime prediction

    OpenAIRE

    Vaquero Barnadas, Miquel

    2016-01-01

    Machine Learning is a cornerstone when it comes to artificial intelligence and big data analysis. It provides powerful algorithms that are capable of recognizing patterns, classifying data, and, basically, learning by themselves to perform a specific task. This field has grown incredibly in popularity recently; however, it still remains unknown to the majority of people, and even to most professionals. This project intends to provide an understandable explanation of what it is, what types ar...

  15. Machine Learning Approaches in Cardiovascular Imaging.

    Science.gov (United States)

    Henglin, Mir; Stein, Gillian; Hushcha, Pavel V; Snoek, Jasper; Wiltschko, Alexander B; Cheng, Susan

    2017-10-01

    Cardiovascular imaging technologies continue to increase in their capacity to capture and store large quantities of data. Modern computational methods, developed in the field of machine learning, offer new approaches to leveraging the growing volume of imaging data available for analyses. Machine learning methods can now address data-related problems ranging from simple analytic queries of existing measurement data to the more complex challenges involved in analyzing raw images. To date, machine learning has been used in 2 broad and highly interconnected areas: automation of tasks that might otherwise be performed by a human and generation of clinically important new knowledge. Most cardiovascular imaging studies have focused on task-oriented problems, but more studies involving algorithms aimed at generating new clinical insights are emerging. Continued expansion in the size and dimensionality of cardiovascular imaging databases is driving strong interest in applying powerful deep learning methods, in particular, to analyze these data. Overall, the most effective approaches will require an investment in the resources needed to appropriately prepare such large data sets for analyses. Notwithstanding current technical and logistical challenges, machine learning and especially deep learning methods have much to offer and will substantially impact the future practice and science of cardiovascular imaging. © 2017 American Heart Association, Inc.

  16. Machine Learning in Medical Imaging.

    Science.gov (United States)

    Giger, Maryellen L

    2018-03-01

    Advances in both imaging and computers have synergistically led to a rapid rise in the potential use of artificial intelligence in various radiological imaging tasks, such as risk assessment, detection, diagnosis, prognosis, and therapy response, as well as in multi-omics disease discovery. A brief overview of the field is given here, allowing the reader to recognize the terminology, the various subfields, and components of machine learning, as well as the clinical potential. Radiomics, an expansion of computer-aided diagnosis, has been defined as the conversion of images to minable data. The ultimate benefit of quantitative radiomics is to (1) yield predictive image-based phenotypes of disease for precision medicine or (2) yield quantitative image-based phenotypes for data mining with other -omics for discovery (ie, imaging genomics). For deep learning in radiology to succeed, note that well-annotated large data sets are needed since deep networks are complex, computer software and hardware are evolving constantly, and subtle differences in disease states are more difficult to perceive than differences in everyday objects. In the future, machine learning in radiology is expected to have a substantial clinical impact with imaging examinations being routinely obtained in clinical practice, providing an opportunity to improve decision support in medical image interpretation. The term of note is decision support, indicating that computers will augment human decision making, making it more effective and efficient. The clinical impact of having computers in the routine clinical practice may allow radiologists to further integrate their knowledge with their clinical colleagues in other medical specialties and allow for precision medicine. Copyright © 2018. Published by Elsevier Inc.

  17. Machine learning in virtual screening.

    Science.gov (United States)

    Melville, James L; Burke, Edmund K; Hirst, Jonathan D

    2009-05-01

    In this review, we highlight recent applications of machine learning to virtual screening, focusing on the use of supervised techniques to train statistical learning algorithms to prioritize databases of molecules as active against a particular protein target. Both ligand-based similarity searching and structure-based docking have benefited from machine learning algorithms, including naïve Bayesian classifiers, support vector machines, neural networks, and decision trees, as well as more traditional regression techniques. Effective application of these methodologies requires an appreciation of data preparation, validation, optimization, and search methodologies, and we also survey developments in these areas.
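
    The ligand-based setting described above reduces to a ranking problem: train a classifier on fingerprints of known actives and inactives, then sort an unlabeled library by predicted probability of activity. A sketch with scikit-learn's Bernoulli naive Bayes (one of the classifier families named above); the 128-bit fingerprints and activity labels are synthetic placeholders, not real chemistry.

        import numpy as np
        from sklearn.naive_bayes import BernoulliNB

        rng = np.random.default_rng(0)
        # Synthetic stand-ins for binary molecular fingerprints and labels.
        X_train = rng.integers(0, 2, size=(300, 128))
        y_train = rng.integers(0, 2, size=300)    # 1 = active against the target

        clf = BernoulliNB().fit(X_train, y_train)

        # "Screen" an unlabeled library: rank molecules by predicted activity.
        library = rng.integers(0, 2, size=(1000, 128))
        scores = clf.predict_proba(library)[:, 1]
        top_hits = np.argsort(scores)[::-1][:10]  # indices of the 10 best-ranked
        print(top_hits)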

  18. Machine Learning an algorithmic perspective

    CERN Document Server

    Marsland, Stephen

    2009-01-01

    Traditional books on machine learning can be divided into two groups: those aimed at advanced undergraduates or early postgraduates with reasonable mathematical knowledge, and those that are primers on how to code algorithms. The field is ready for a text that not only demonstrates how to use the algorithms that make up machine learning methods, but also provides the background needed to understand how and why these algorithms work. Machine Learning: An Algorithmic Perspective is that text. Theory Backed up by Practical Examples: The book covers neural networks, graphical models, reinforcement le

  19. Learning Activity Packets for Milling Machines. Unit I--Introduction to Milling Machines.

    Science.gov (United States)

    Oklahoma State Board of Vocational and Technical Education, Stillwater. Curriculum and Instructional Materials Center.

    This learning activity packet (LAP) outlines the study activities and performance tasks covered in a related curriculum guide on milling machines. The course of study in this LAP is intended to help students learn to identify parts and attachments of vertical and horizontal milling machines, identify work-holding devices, state safety rules, and…

  20. Model-based machine learning.

    Science.gov (United States)

    Bishop, Christopher M

    2013-02-13

    Several decades of research in the field of machine learning have resulted in a multitude of different algorithms for solving a broad range of problems. To tackle a new application, a researcher typically tries to map their problem onto one of these existing methods, often influenced by their familiarity with specific algorithms and by the availability of corresponding software implementations. In this study, we describe an alternative methodology for applying machine learning, in which a bespoke solution is formulated for each new application. The solution is expressed through a compact modelling language, and the corresponding custom machine learning code is then generated automatically. This model-based approach offers several major advantages, including the opportunity to create highly tailored models for specific scenarios, as well as rapid prototyping and comparison of a range of alternative models. Furthermore, newcomers to the field of machine learning do not have to learn about the huge range of traditional methods, but instead can focus their attention on understanding a single modelling environment. In this study, we show how probabilistic graphical models, coupled with efficient inference algorithms, provide a very flexible foundation for model-based machine learning, and we outline a large-scale commercial application of this framework involving tens of millions of users. We also describe the concept of probabilistic programming as a powerful software environment for model-based machine learning, and we discuss a specific probabilistic programming language called Infer.NET, which has been widely used in practical applications.
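
    Infer.NET's actual API is not shown here, but the model-based style can be conveyed with the smallest possible bespoke model: a beta-binomial written down directly, for which the inference a framework would otherwise generate is available in closed form. SciPy is assumed, and the counts are purely illustrative.

        from scipy import stats

        # Bespoke model: an unknown success rate theta ~ Beta(1, 1), with
        # k successes observed in n trials. Conjugacy gives the posterior
        # directly: theta | data ~ Beta(1 + k, 1 + n - k).
        k, n = 7, 10                              # illustrative observations
        posterior = stats.beta(1 + k, 1 + n - k)
        print("posterior mean:", posterior.mean())
        print("95% credible interval:", posterior.interval(0.95))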

  1. Introduction to machine learning for brain imaging.

    Science.gov (United States)

    Lemm, Steven; Blankertz, Benjamin; Dickhaus, Thorsten; Müller, Klaus-Robert

    2011-05-15

    Machine learning and pattern recognition algorithms have in the past years developed to become a workhorse in brain imaging and the computational neurosciences, as they are instrumental for mining vast amounts of neural data of ever-increasing measurement precision and detecting minuscule signals from an overwhelming noise floor. They provide the means to decode and characterize task-relevant brain states and to distinguish them from non-informative brain signals. While undoubtedly this machinery has helped to gain novel biological insights, it also holds the danger of potential unintentional abuse. Ideally, machine learning techniques should be usable by any non-expert; unfortunately, they typically are not. Overfitting and other pitfalls may occur and lead to spurious and nonsensical interpretation. The goal of this review is therefore to provide an accessible and clear introduction to the strengths, and also the inherent dangers, of machine learning usage in the neurosciences. Copyright © 2010 Elsevier Inc. All rights reserved.
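
    The overfitting pitfall the review warns about can be shown in a few lines, assuming scikit-learn and synthetic data: with far more features than samples, as is typical of neuroimaging, a classifier fits pure noise perfectly, and only cross-validation reveals the chance-level truth.

        import numpy as np
        from sklearn.model_selection import cross_val_score
        from sklearn.svm import SVC

        rng = np.random.default_rng(0)
        X = rng.normal(size=(40, 5000))  # 40 "scans" x 5000 voxel features: noise
        y = rng.integers(0, 2, size=40)  # random labels: nothing to decode

        clf = SVC(kernel="linear")
        print("training accuracy:", clf.fit(X, y).score(X, y))  # close to 1.0
        print("cross-validated:", cross_val_score(clf, X, y, cv=5).mean())  # ~0.5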

  2. Emerging Paradigms in Machine Learning

    CERN Document Server

    Jain, Lakhmi; Howlett, Robert

    2013-01-01

    This book presents fundamental topics and algorithms that form the core of machine learning (ML) research, as well as emerging paradigms in intelligent system design. The multidisciplinary nature of machine learning makes it a very fascinating and popular area for research. The book is aimed at students, practitioners, and researchers, and captures the diversity and richness of the field of machine learning and intelligent systems. Several chapters are devoted to computational learning models such as granular computing, rough sets, and fuzzy sets. An account of applications of well-known learning methods in biometrics, computational stylistics, multi-agent systems, and spam classification, including an extremely well-written survey on Bayesian networks, sheds light on the strengths and weaknesses of the methods. Practical studies yield insight into challenging problems such as learning from incomplete and imbalanced data, pattern recognition of stochastic episodic events and on-line mining of non-stationary ...

  3. Machine learning for healthcare technologies

    CERN Document Server

    Clifton, David A

    2016-01-01

    This book brings together chapters on the state-of-the-art in machine learning (ML) as it applies to the development of patient-centred technologies, with a special emphasis on 'big data' and mobile data.

  4. Machine Learning via Mathematical Programming

    National Research Council Canada - National Science Library

    Mangasarian, Olvi

    1999-01-01

    Mathematical programming approaches were applied to a variety of problems in machine learning in order to gain deeper understanding of the problems and to come up with new and more efficient computational algorithms...

  5. Machine Learning examples on Invenio

    CERN Document Server

    CERN. Geneva

    2017-01-01

    This talk will present the different machine learning tools that INSPIRE is developing and integrating in order to automate as much as possible content selection and curation in a subject-based repository.

  6. Scikit-learn: Machine Learning in Python

    OpenAIRE

    Pedregosa, Fabian; Varoquaux, Gaël; Gramfort, Alexandre; Michel, Vincent; Thirion, Bertrand; Grisel, Olivier; Blondel, Mathieu; Prettenhofer, Peter; Weiss, Ron; Dubourg, Vincent; Vanderplas, Jake; Passos, Alexandre; Cournapeau, David; Brucher, Matthieu; Perrot, Matthieu

    2011-01-01

    International audience; Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distributed under the simplified BSD license, encouraging its use in both academic ...

  7. Scikit-learn: Machine Learning in Python

    OpenAIRE

    Pedregosa, Fabian; Varoquaux, Gaël; Gramfort, Alexandre; Michel, Vincent; Thirion, Bertrand; Grisel, Olivier; Blondel, Mathieu; Louppe, Gilles; Prettenhofer, Peter; Weiss, Ron; Dubourg, Vincent; Vanderplas, Jake; Passos, Alexandre; Cournapeau, David; Brucher, Matthieu

    2012-01-01

    Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distributed under the simplified BSD license, encouraging its use in both academic and commercial settings....
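
    As a hedged illustration of the uniform estimator API the two records above describe, the sketch below fits one scikit-learn classifier and scores it on held-out data; the choice of dataset and estimator is arbitrary, and any other estimator could be swapped in unchanged.

```python
# A minimal example of the scikit-learn interface: estimator objects
# share fit/predict/score, so algorithms are interchangeable. Dataset
# and estimator choices here are illustrative.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_tr, y_tr)                   # same interface for any estimator
print(f"test accuracy: {clf.score(X_te, y_te):.2f}")
```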

  8. Multi-task feature learning by using trace norm regularization

    Directory of Open Access Journals (Sweden)

    Jiangmei Zhang

    2017-11-01

    Multi-task learning can extract the correlation of multiple related machine learning problems to improve performance. This paper considers applying the multi-task learning method to learn a single task. We propose a new learning approach, which employs the mixture-of-experts model to divide a learning task into several related sub-tasks, and then uses trace norm regularization to extract a common feature representation of these sub-tasks. A nonlinear extension of this approach using kernels is also provided. Experiments conducted on both simulated and real data sets demonstrate the advantage of the proposed approach.
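
    The following numpy sketch illustrates the core mechanism named above, trace-norm (nuclear-norm) regularization, via proximal gradient descent on a synthetic multi-task least-squares problem. The soft-thresholding of singular values is the standard proximal map of the nuclear norm; the data, step size and penalty are illustrative assumptions rather than the paper's mixture-of-experts setup.

```python
# A hedged numpy sketch of trace-norm regularization via proximal
# gradient descent on a multi-task least-squares objective; the
# synthetic tasks and hyperparameters are illustrative only.
import numpy as np

def prox_trace_norm(W, tau):
    """Soft-threshold singular values: the proximal map of tau*||W||_*."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

rng = np.random.default_rng(0)
d, n, T = 20, 100, 5                       # features, samples, sub-tasks
W_true = rng.normal(size=(d, 2)) @ rng.normal(size=(2, T))  # low rank
X = rng.normal(size=(T, n, d))
Y = np.stack([X[t] @ W_true[:, t] for t in range(T)], axis=1)

W, lam, lr = np.zeros((d, T)), 1.0, 1e-3
for _ in range(500):
    grad = np.stack([X[t].T @ (X[t] @ W[:, t] - Y[:, t])
                     for t in range(T)], axis=1)
    W = prox_trace_norm(W - lr * grad, lr * lam)  # gradient step + prox

print("recovered rank:", np.linalg.matrix_rank(W, tol=1e-3))
```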

  9. Learning with Support Vector Machines

    CERN Document Server

    Campbell, Colin

    2010-01-01

    Support Vector Machines have become a well-established tool within machine learning. They work well in practice and have now been used across a wide range of applications, from recognizing hand-written digits to face identification, text categorisation, bioinformatics and database marketing. In this book we give an introductory overview of this subject. We start with a simple Support Vector Machine for performing binary classification before considering multi-class classification and learning in the presence of noise. We show that this framework can be extended to many other scenarios such a
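
    A hedged sketch of the book's starting point, binary classification with a Support Vector Machine, is given below using scikit-learn; the toy data and kernel settings are assumptions for illustration.

```python
# A minimal binary SVM classifier on synthetic data; the kernel and
# regularization settings are illustrative, not the book's examples.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = SVC(kernel="rbf", C=1.0)     # kernel handles non-linear boundaries
clf.fit(X_tr, y_tr)
print(f"test accuracy: {clf.score(X_te, y_te):.2f}")
```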

  10. Machine learning for evolution strategies

    CERN Document Server

    Kramer, Oliver

    2016-01-01

    This book introduces numerous algorithmic hybridizations between both worlds that show how machine learning can improve and support evolution strategies. The set of methods comprises covariance matrix estimation, meta-modeling of fitness and constraint functions, dimensionality reduction for search and visualization of high-dimensional optimization processes, and clustering-based niching. After giving an introduction to evolution strategies and machine learning, the book builds the bridge between both worlds with an algorithmic and experimental perspective. Experiments mostly employ a (1+1)-ES and are implemented in Python using the machine learning library scikit-learn. The examples are conducted on typical benchmark problems illustrating algorithmic concepts and their experimental behavior. The book closes with a discussion of related lines of research.

  11. Gaussian processes for machine learning.

    Science.gov (United States)

    Seeger, Matthias

    2004-04-01

    Gaussian processes (GPs) are natural generalisations of multivariate Gaussian random variables to infinite (countable or continuous) index sets. GPs have been applied in a large number of fields to a diverse range of ends, and very many deep theoretical analyses of various properties are available. This paper gives an introduction to Gaussian processes on a fairly elementary level with special emphasis on characteristics relevant in machine learning. It draws explicit connections to branches such as spline smoothing models and support vector machines in which similar ideas have been investigated. Gaussian process models are routinely used to solve hard machine learning problems. They are attractive because of their flexible non-parametric nature and computational simplicity. Treated within a Bayesian framework, very powerful statistical methods can be implemented which offer valid estimates of uncertainties in our predictions and generic model selection procedures cast as nonlinear optimization problems. Their main drawback of heavy computational scaling has recently been alleviated by the introduction of generic sparse approximations [13, 78, 31]. The mathematical literature on GPs is large and often uses deep concepts which are not required to fully understand most machine learning applications. In this tutorial paper, we aim to present characteristics of GPs relevant to machine learning and to draw precise connections to other "kernel machines" popular in the community. Our focus is on a simple presentation, but references to more detailed sources are provided.
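
    The closed-form GP posterior referred to above can be written in a few lines of numpy, as in the hedged sketch below; the RBF kernel, noise level and toy data are illustrative choices.

```python
# A numpy sketch of Gaussian-process regression: posterior mean and
# variance from the standard closed-form expressions. Kernel and noise
# settings here are illustrative.
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(20, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=20)   # noisy observations
X_star = np.linspace(-3, 3, 5)[:, None]           # test inputs

sigma2 = 0.01                                     # noise variance
K = rbf_kernel(X, X) + sigma2 * np.eye(len(X))
K_s = rbf_kernel(X_star, X)

alpha = np.linalg.solve(K, y)
mean = K_s @ alpha                                # posterior mean
cov = rbf_kernel(X_star, X_star) - K_s @ np.linalg.solve(K, K_s.T)
print(mean.round(2), np.sqrt(np.clip(np.diag(cov), 0, None)).round(3))
```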

  12. Machine Learning for ATLAS DDM Network Metrics

    CERN Document Server

    Lassnig, Mario; The ATLAS collaboration; Vamosi, Ralf

    2016-01-01

    The increasing volume of physics data is posing a critical challenge to the ATLAS experiment. In anticipation of high luminosity physics, automation of everyday data management tasks has become necessary. Previously, many of these tasks required human decision-making and operation. Recent advances in hardware and software have made it possible to entrust more complicated duties to automated systems using models trained by machine learning algorithms. In this contribution we show results from our ongoing automation efforts. First, we describe our framework for distributed data management and network metrics, which automatically extracts and aggregates data, trains models with various machine learning algorithms, and eventually scores the resulting models and parameters. Second, we use these models to forecast metrics relevant for network-aware job scheduling and data brokering. We show the characteristics of the data and evaluate the forecasting accuracy of our models.

  13. Game-powered machine learning.

    Science.gov (United States)

    Barrington, Luke; Turnbull, Douglas; Lanckriet, Gert

    2012-04-24

    Searching for relevant content in a massive amount of multimedia information is facilitated by accurately annotating each image, video, or song with a large number of relevant semantic keywords, or tags. We introduce game-powered machine learning, an integrated approach to annotating multimedia content that combines the effectiveness of human computation, through online games, with the scalability of machine learning. We investigate this framework for labeling music. First, a socially-oriented music annotation game called Herd It collects reliable music annotations based on the "wisdom of the crowds." Second, these annotated examples are used to train a supervised machine learning system. Third, the machine learning system actively directs the annotation games to collect new data that will most benefit future model iterations. Once trained, the system can automatically annotate a corpus of music much larger than what could be labeled using human computation alone. Automatically annotated songs can be retrieved based on their semantic relevance to text-based queries (e.g., "funky jazz with saxophone," "spooky electronica," etc.). Based on the results presented in this paper, we find that actively coupling annotation games with machine learning provides a reliable and scalable approach to making searchable massive amounts of multimedia data.
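
    The active loop described above, in which the trained model directs which items the games should annotate next, can be caricatured by uncertainty sampling; in the hedged sketch below an already-known label array stands in for the human annotation games, and all sizes and settings are illustrative.

```python
# A simplified analogue of the paper's active loop: uncertainty sampling,
# where the model queries labels for the items it is least sure about.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
labeled = (list(np.where(y == 0)[0][:5]) +   # small seed set with both
           list(np.where(y == 1)[0][:5]))    # classes represented
pool = [i for i in range(len(X)) if i not in labeled]

clf = LogisticRegression(max_iter=1000)
for _ in range(10):                          # ten rounds of "annotation"
    clf.fit(X[labeled], y[labeled])
    proba = clf.predict_proba(X[pool])[:, 1]
    query = pool[int(np.argmin(np.abs(proba - 0.5)))]  # most uncertain
    labeled.append(query)                    # send it out for a label
    pool.remove(query)

print(f"accuracy after active labeling: {clf.score(X, y):.2f}")
```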

  14. Machine Learning Methods for Production Cases Analysis

    Science.gov (United States)

    Mokrova, Nataliya V.; Mokrov, Alexander M.; Safonova, Alexandra V.; Vishnyakov, Igor V.

    2018-03-01

    An approach to the analysis of events occurring during the production process is proposed. The described machine learning system is able to solve classification tasks related to production control and hazard identification at an early stage. Descriptors of the internal production network data were used for training and testing of the applied models. k-Nearest Neighbors and Random Forest methods were used to illustrate and analyze the proposed solution. The quality of the developed classifiers was estimated using standard statistical metrics, such as precision, recall and accuracy.
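
    A hedged sketch of this evaluation recipe follows: k-Nearest Neighbors and Random Forest classifiers are scored with precision, recall and accuracy on synthetic, imbalanced data standing in for the internal production network descriptors.

```python
# Scoring two classifiers with precision, recall and accuracy; the
# synthetic "production events" are illustrative stand-ins.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score, accuracy_score

X, y = make_classification(n_samples=1000, weights=[0.9], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for clf in (KNeighborsClassifier(5), RandomForestClassifier(random_state=0)):
    pred = clf.fit(X_tr, y_tr).predict(X_te)
    print(type(clf).__name__,
          f"precision={precision_score(y_te, pred):.2f}",
          f"recall={recall_score(y_te, pred):.2f}",
          f"accuracy={accuracy_score(y_te, pred):.2f}")
```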

  15. MEDLINE MeSH Indexing: Lessons Learned from Machine Learning and Future Directions

    DEFF Research Database (Denmark)

    Jimeno-Yepes, Antonio; Mork, James G.; Wilkowski, Bartlomiej

    2012-01-01

    …MetaMap and a k-NN approach called PubMed Related Citations (PRC). Our motivation is to improve the quality of MTI based on machine learning. Typical machine learning approaches fit this indexing task into text categorization. In this work, we have studied some Medical Subject Headings (MeSH) recommended by MTI and analyzed the issues when using standard machine learning algorithms. We show that in some cases machine learning can improve the annotations already recommended by MTI, that machine learning based on low-variance methods achieves better performance, and that each MeSH heading presents a different behavior…

  16. Deep learning: Using machine learning to study biological vision

    OpenAIRE

    Majaj, Najib; Pelli, Denis

    2017-01-01

    Today most vision-science presentations mention machine learning. Many neuroscientists use machine learning to decode neural responses. Many perception scientists try to understand recognition by living organisms. To them, machine learning offers a reference of attainable performance based on learned stimuli. This brief overview of the use of machine learning in biological vision touches on its strengths, weaknesses, milestones, controversies, and current directions.

  17. Neuromorphic Deep Learning Machines

    OpenAIRE

    Neftci, E; Augustine, C; Paul, S; Detorakis, G

    2017-01-01

    An ongoing challenge in neuromorphic computing is to devise general and computationally efficient models of inference and learning which are compatible with the spatial and temporal constraints of the brain. One increasingly popular and successful approach is to take inspiration from inference and learning algorithms used in deep neural networks. However, the workhorse of deep learning, the gradient descent Back Propagation (BP) rule, often relies on the immediate availability of network-wide...

  18. Machine Shop I. Learning Activity Packets (LAPs). Section D--Power Saws and Drilling Machines.

    Science.gov (United States)

    Oklahoma State Board of Vocational and Technical Education, Stillwater. Curriculum and Instructional Materials Center.

    This document contains two learning activity packets (LAPs) for the "power saws and drilling machines" instructional area of a Machine Shop I course. The two LAPs cover the following topics: power saws and drill press. Each LAP contains a cover sheet that describes its purpose, an introduction, and the tasks included in the LAP; learning…

  19. Application of Machine Learning Techniques in Aquaculture

    OpenAIRE

    Rahman, Akhlaqur; Tasnim, Sumaira

    2014-01-01

    In this paper we present applications of different machine learning algorithms in aquaculture. Machine learning algorithms learn models from historical data. In aquaculture, historical data are obtained from farm practices, yields and environmental data sources. Associations between these different variables can be obtained by applying machine learning algorithms to the historical data.

  20. Manifold learning in machine vision and robotics

    Science.gov (United States)

    Bernstein, Alexander

    2017-02-01

    Smart algorithms are used in machine vision and robotics to organize or extract high-level information from the available data. Machine learning is nowadays an essential and ubiquitous tool for automating the extraction of patterns or regularities from data (images in machine vision; camera, laser and sonar sensor data in robotics) in order to solve various subject-oriented tasks such as understanding and classification of image content, navigation of mobile autonomous robots in uncertain environments, robot manipulation in medical robotics and computer-assisted surgery, and others. Usually such data have high dimensionality; however, due to various dependencies between their components and constraints caused by physical reasons, all "feasible and usable data" occupy only a very small part of the high-dimensional "observation space", with a smaller intrinsic dimensionality. The generally accepted model of such data is the manifold model, in accordance with which the data lie on or near an unknown manifold (surface) of lower dimensionality embedded in an ambient high-dimensional observation space; real-world high-dimensional data obtained from "natural" sources meet, as a rule, this model. The use of manifold learning techniques in machine vision and robotics, which discover a low-dimensional structure in high-dimensional data and result in effective algorithms for solving a large number of various subject-oriented tasks, is the subject of this conference plenary speech, some topics of which are covered in the paper.

  1. Machine Learning applications in CMS

    CERN Multimedia

    CERN. Geneva

    2017-01-01

    Machine Learning is used in many aspects of CMS data taking, monitoring, processing and analysis. We review a few of these use cases and the most recent developments, with an outlook to future applications in the LHC Run III and for the High-Luminosity phase.

  2. Visible Machine Learning for Biomedicine.

    Science.gov (United States)

    Yu, Michael K; Ma, Jianzhu; Fisher, Jasmin; Kreisberg, Jason F; Raphael, Benjamin J; Ideker, Trey

    2018-06-14

    A major ambition of artificial intelligence lies in translating patient data to successful therapies. Machine learning models face particular challenges in biomedicine, however, including handling of extreme data heterogeneity and lack of mechanistic insight into predictions. Here, we argue for "visible" approaches that guide model structure with experimental biology. Copyright © 2018. Published by Elsevier Inc.

  3. Learning scikit-learn machine learning in Python

    CERN Document Server

    Garreta, Raúl

    2013-01-01

    The book adopts a tutorial-based approach to introduce the user to Scikit-learn. If you are a programmer who wants to explore machine learning and data-based methods to build intelligent applications and enhance your programming skills, this is the book for you. No previous experience with machine-learning algorithms is required.

  4. Integrating human and machine intelligence in galaxy morphology classification tasks

    Science.gov (United States)

    Beck, Melanie R.; Scarlata, Claudia; Fortson, Lucy F.; Lintott, Chris J.; Simmons, B. D.; Galloway, Melanie A.; Willett, Kyle W.; Dickinson, Hugh; Masters, Karen L.; Marshall, Philip J.; Wright, Darryl

    2018-06-01

    Quantifying galaxy morphology is a challenging yet scientifically rewarding task. As the scale of data continues to increase with upcoming surveys, traditional classification methods will struggle to handle the load. We present a solution through an integration of visual and automated classifications, preserving the best features of both human and machine. We demonstrate the effectiveness of such a system through a re-analysis of visual galaxy morphology classifications collected during the Galaxy Zoo 2 (GZ2) project. We reprocess the top-level question of the GZ2 decision tree with a Bayesian classification aggregation algorithm dubbed SWAP, originally developed for the Space Warps gravitational lens project. Through a simple binary classification scheme, we increase the classification rate nearly 5-fold, classifying 226 124 galaxies in 92 d of GZ2 project time while reproducing labels derived from GZ2 classification data with 95.7 per cent accuracy. We next combine this with a Random Forest machine learning algorithm that learns on a suite of non-parametric morphology indicators widely used for automated morphologies. We develop a decision engine that delegates tasks between human and machine and demonstrate that the combined system provides at least a factor of 8 increase in the classification rate, classifying 210 803 galaxies in just 32 d of GZ2 project time with 93.1 per cent accuracy. As the Random Forest algorithm requires a minimal amount of computational cost, this result has important implications for galaxy morphology identification tasks in the era of Euclid and other large-scale surveys.

  5. Machine learning a probabilistic perspective

    CERN Document Server

    Murphy, Kevin P

    2012-01-01

    Today's Web-enabled deluge of electronic data calls for automated methods of data analysis. Machine learning provides these, developing methods that can automatically detect patterns in data and then use the uncovered patterns to predict future data. This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach. The coverage combines breadth and depth, offering necessary background material on such topics as probability, optimization, and linear algebra as well as discussion of recent developments in the field, including conditional random fields, L1 regularization, and deep learning. The book is written in an informal, accessible style, complete with pseudo-code for the most important algorithms. All topics are copiously illustrated with color images and worked examples drawn from such application domains as biology, text processing, computer vision, and robotics. Rather than providing a cookbook of different heuristic method...

  6. Machine learning an artificial intelligence approach

    CERN Document Server

    Banerjee, R; Bradshaw, Gary; Carbonell, Jaime Guillermo; Mitchell, Tom Michael; Michalski, Ryszard Spencer

    1983-01-01

    Machine Learning: An Artificial Intelligence Approach contains tutorial overviews and research papers representative of trends in the area of machine learning as viewed from an artificial intelligence perspective. The book is organized into six parts. Part I provides an overview of machine learning and explains why machines should learn. Part II covers important issues affecting the design of learning programs-particularly programs that learn from examples. It also describes inductive learning systems. Part III deals with learning by analogy, by experimentation, and from experience. Parts IV a

  7. Quantum machine learning: a classical perspective.

    Science.gov (United States)

    Ciliberto, Carlo; Herbster, Mark; Ialongo, Alessandro Davide; Pontil, Massimiliano; Rocchetto, Andrea; Severini, Simone; Wossnig, Leonard

    2018-01-01

    Recently, increased computational power and data availability, as well as algorithmic advances, have led machine learning (ML) techniques to impressive results in regression, classification, data generation and reinforcement learning tasks. Despite these successes, the proximity to the physical limits of chip fabrication alongside the increasing size of datasets is motivating a growing number of researchers to explore the possibility of harnessing the power of quantum computation to speed up classical ML algorithms. Here we review the literature in quantum ML and discuss perspectives for a mixed readership of classical ML and quantum computation experts. Particular emphasis will be placed on clarifying the limitations of quantum algorithms, how they compare with their best classical counterparts and why quantum resources are expected to provide advantages for learning problems. Learning in the presence of noise and certain computationally hard problems in ML are identified as promising directions for the field. Practical questions, such as how to upload classical data into quantum form, will also be addressed.

  9. Learning Extended Finite State Machines

    Science.gov (United States)

    Cassel, Sofia; Howar, Falk; Jonsson, Bengt; Steffen, Bernhard

    2014-01-01

    We present an active learning algorithm for inferring extended finite state machines (EFSMs), combining data flow and control behavior. Key to our learning technique is a novel learning model based on so-called tree queries. The learning algorithm uses tree queries to infer symbolic data constraints on parameters, e.g., sequence numbers, time stamps, identifiers, or even simple arithmetic. We describe sufficient conditions for the properties that the symbolic constraints provided by a tree query must, in general, have to be usable in our learning model. We have evaluated our algorithm in a black-box scenario, where tree queries are realized through (black-box) testing. Our case studies include connection establishment in TCP and a priority queue from the Java Class Library.

  10. Learning Machine Learning: A Case Study

    Science.gov (United States)

    Lavesson, N.

    2010-01-01

    This correspondence reports on a case study conducted in the Master's-level Machine Learning (ML) course at Blekinge Institute of Technology, Sweden. The students participated in a self-assessment test and a diagnostic test of prerequisite subjects, and their results on these tests are correlated with their achievement of the course's learning…

  11. What is the machine learning?

    Science.gov (United States)

    Chang, Spencer; Cohen, Timothy; Ostdiek, Bryan

    2018-03-01

    Applications of machine learning tools to problems of physical interest are often criticized for producing sensitivity at the expense of transparency. To address this concern, we explore a data planing procedure for identifying combinations of variables—aided by physical intuition—that can discriminate signal from background. Weights are introduced to smooth away the features in a given variable(s). New networks are then trained on this modified data. Observed decreases in sensitivity diagnose the variable's discriminating power. Planing also allows the investigation of the linear versus nonlinear nature of the boundaries between signal and background. We demonstrate the efficacy of this approach using a toy example, followed by an application to an idealized heavy resonance scenario at the Large Hadron Collider. By unpacking the information being utilized by these algorithms, this method puts in context what it means for a machine to learn.
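
    The reweighting step of planing can be sketched in a few lines of numpy, as below: each event receives a weight inversely proportional to the occupancy of its histogram bin, flattening the chosen variable so that any residual sensitivity must come from other features; the toy variable and binning are illustrative assumptions.

```python
# A hedged sketch of planing weights: flatten one variable's distribution
# by inverse-histogram weighting. Variable and binning are illustrative.
import numpy as np

rng = np.random.default_rng(0)
mass = rng.normal(91.0, 5.0, size=10000)        # toy discriminating variable

counts, edges = np.histogram(mass, bins=40)
bin_idx = np.clip(np.digitize(mass, edges) - 1, 0, len(counts) - 1)
weights = 1.0 / np.maximum(counts[bin_idx], 1)  # flatten the distribution
weights /= weights.mean()                       # keep total weight fixed

# After reweighting, the histogram of `mass` is (approximately) uniform:
flat, _ = np.histogram(mass, bins=edges, weights=weights)
print(flat.round(1)[:8])
```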

  12. Machine learning in cardiovascular medicine: are we there yet?

    Science.gov (United States)

    Shameer, Khader; Johnson, Kipp W; Glicksberg, Benjamin S; Dudley, Joel T; Sengupta, Partho P

    2018-01-19

    Artificial intelligence (AI) broadly refers to analytical algorithms that iteratively learn from data, allowing computers to find hidden insights without being explicitly programmed where to look. These include a family of operations encompassing several terms like machine learning, cognitive learning, deep learning and reinforcement learning-based methods that can be used to integrate and interpret complex biomedical and healthcare data in scenarios where traditional statistical methods may not be able to perform. In this review article, we discuss the basics of machine learning algorithms and what potential data sources exist; evaluate the need for machine learning; and examine the potential limitations and challenges of implementing machine learning in the context of cardiovascular medicine. The most promising avenues for AI in medicine are the development of automated risk prediction algorithms which can be used to guide clinical care; use of unsupervised learning techniques to more precisely phenotype complex disease; and the implementation of reinforcement learning algorithms to intelligently augment healthcare providers. The utility of a machine learning-based predictive model will depend on factors including data heterogeneity, data depth, data breadth, nature of the modelling task, choice of machine learning and feature selection algorithms, and orthogonal evidence. A critical understanding of the strength and limitations of various methods and tasks amenable to machine learning is vital. By leveraging the growing corpus of big data in medicine, we detail pathways by which machine learning may facilitate optimal development of patient-specific models for improving diagnoses, intervention and outcome in cardiovascular medicine. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  13. Galaxy Classification using Machine Learning

    Science.gov (United States)

    Fowler, Lucas; Schawinski, Kevin; Brandt, Ben-Elias; widmer, Nicole

    2017-01-01

    We present our current research into the use of machine learning to classify galaxy imaging data with various convolutional neural network configurations in TensorFlow. We are investigating how five-band Sloan Digital Sky Survey imaging data can be used to train on physical properties such as redshift, star formation rate, mass and morphology. We also investigate the performance of artificially redshifted images in recovering physical properties as image quality degrades.

  14. Machine learning in heart failure: ready for prime time.

    Science.gov (United States)

    Awan, Saqib Ejaz; Sohel, Ferdous; Sanfilippo, Frank Mario; Bennamoun, Mohammed; Dwivedi, Girish

    2018-03-01

    The aim of this review is to present an up-to-date overview of the application of machine learning methods in heart failure including diagnosis, classification, readmissions and medication adherence. Recent studies have shown that the application of machine learning techniques may have the potential to improve heart failure outcomes and management, including cost savings by improving existing diagnostic and treatment support systems. Recently developed deep learning methods are expected to yield even better performance than traditional machine learning techniques in performing complex tasks by learning the intricate patterns hidden in big medical data. The review summarizes the recent developments in the application of machine and deep learning methods in heart failure management.

  15. Identifying student stuck states in programming assignments using machine learning

    OpenAIRE

    Lindell, Johan

    2014-01-01

    Intelligent tutors are becoming more popular with the increased use of computers and handheld devices in the education sphere. One area of research investigates how machine learning can be used to improve the precision and feedback of the tutor. This thesis compares machine learning clustering algorithms with various distance functions in an attempt to cluster together code snapshots of students solving a programming task. It investigates whether a general non-problem-specific implementation of...

  16. Machine Learning of Fault Friction

    Science.gov (United States)

    Johnson, P. A.; Rouet-Leduc, B.; Hulbert, C.; Marone, C.; Guyer, R. A.

    2017-12-01

    We are applying machine learning (ML) techniques to continuous acoustic emission (AE) data from laboratory earthquake experiments. Our goal is to apply explicit ML methods to these acoustic data (the AE) in order to infer frictional properties of a laboratory fault. The experiment is a double direct shear apparatus comprised of fault blocks surrounding fault gouge composed of glass beads or quartz powder. Fault characteristics are recorded, including shear stress, applied load (bulk friction = shear stress/normal load) and shear velocity. The raw acoustic signal is continuously recorded. We rely on explicit decision tree approaches (Random Forest and Gradient Boosted Trees) that allow us to identify important features linked to the fault friction. A training procedure that employs both the AE and the recorded shear stress from the experiment is first conducted. Then, testing takes place on data the algorithm has never seen before, using only the continuous AE signal. We find that these methods provide rich information regarding frictional processes during slip (Rouet-Leduc et al., 2017a; Hulbert et al., 2017). In addition, similar machine learning approaches predict failure times, as well as slip magnitudes in some cases. We find that these methods work for both stick slip and slow slip experiments, for periodic slip and for aperiodic slip. We also derive a fundamental relationship between the AE and the friction describing the frictional behavior of any earthquake slip cycle in a given experiment (Rouet-Leduc et al., 2017b). Our goal is to ultimately scale these approaches to Earth geophysical data to probe fault friction. References: Rouet-Leduc, B., C. Hulbert, N. Lubbers, K. Barros, C. Humphreys and P. A. Johnson, Machine learning predicts laboratory earthquakes, in review (2017), https://arxiv.org/abs/1702.05774; Rouet-Leduc, B. et al., Friction Laws Derived From the Acoustic Emissions of a Laboratory Fault by Machine Learning (2017), AGU Fall Meeting Session S025.
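
    A hedged sketch of this regression setup is given below: a Gradient Boosted Trees model maps simple statistical features of a continuous signal to a continuous target standing in for shear stress; the synthetic signal and its relation to the target are invented for illustration.

```python
# Gradient boosted trees regressing a continuous target from features of
# a continuous "acoustic" signal; the signal/target relation is a toy.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_windows = 2000
signal = rng.normal(size=(n_windows, 256))       # windows of "AE" data
amp = np.linspace(0.5, 2.0, n_windows)
signal *= amp[:, None]                           # variance tracks target
target = amp + 0.05 * rng.normal(size=n_windows) # toy "shear stress"

# Features per window: simple statistics of the continuous signal
X = np.column_stack([signal.var(axis=1),
                     np.abs(signal).mean(axis=1),
                     signal.max(axis=1)])
X_tr, X_te, y_tr, y_te = train_test_split(X, target, random_state=0)

model = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
print(f"R^2 on held-out windows: {model.score(X_te, y_te):.2f}")
```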

  17. Machine learning in jet physics

    CERN Multimedia

    CERN. Geneva

    2018-01-01

    High energy collider experiments produce several petabytes of data every year. Given the magnitude and complexity of the raw data, machine learning algorithms provide the best available platform to transform and analyse these data to obtain valuable insights to understand Standard Model and Beyond Standard Model theories. These collider experiments produce both quark and gluon initiated hadronic jets as the core components. Deep learning techniques enable us to classify quark/gluon jets through image recognition and help us to differentiate signals and backgrounds in Beyond Standard Model searches at LHC. We are currently working on quark/gluon jet classification and progressing in our studies to find the bias between event generators using domain adversarial neural networks (DANN). We also plan to investigate top tagging, weak supervision on mixed samples in high energy physics, utilizing transfer learning from simulated data to real experimental data.

  18. Towards Machine Learning of Motor Skills

    Science.gov (United States)

    Peters, Jan; Schaal, Stefan; Schölkopf, Bernhard

    Autonomous robots that can adapt to novel situations have been a long-standing vision of robotics, artificial intelligence, and the cognitive sciences. Early approaches to this goal during the heyday of artificial intelligence research in the late 1980s, however, made it clear that an approach purely based on reasoning or human insights would not be able to model all the perceptuomotor tasks that a robot should fulfill. Instead, new hope was put in the growing wake of machine learning, which promised fully adaptive control algorithms that learn both by observation and by trial-and-error. However, to date, learning techniques have yet to fulfill this promise, as only few methods manage to scale into the high-dimensional domains of manipulator robotics, or even the new upcoming trend of humanoid robotics, and usually scaling was only achieved in precisely pre-structured domains. In this paper, we investigate the ingredients for a general approach to motor skill learning in order to get one step closer towards human-like performance. To do so, we study two major components of such an approach: first, a theoretically well-founded general approach to representing the required control structures for task representation and execution and, second, appropriate learning algorithms which can be applied in this setting.

  19. Reverse hypothesis machine learning a practitioner's perspective

    CERN Document Server

    Kulkarni, Parag

    2017-01-01

    This book introduces a paradigm of reverse hypothesis machines (RHM), focusing on knowledge innovation and machine learning. Knowledge-acquisition-based learning is constrained by large volumes of data and is time consuming, hence the need for knowledge-innovation-based learning. Since under-learning results in cognitive inabilities and over-learning compromises freedom, there is a need for optimal machine learning. All existing learning techniques rely on mapping input and output and establishing mathematical relationships between them. Though methods change, the paradigm remains the same: the forward hypothesis machine paradigm, which tries to minimize uncertainty. The RHM, on the other hand, makes use of uncertainty for creative learning. The approach uses limited data to help identify new and surprising solutions. It focuses on improving learnability, unlike traditional approaches, which focus on accuracy. The book is useful as a reference book for machine learning researchers and professionals as ...

  20. Finding New Perovskite Halides via Machine learning

    Directory of Open Access Journals (Sweden)

    Ghanshyam ePilania

    2016-04-01

    Advanced materials with improved properties have the potential to fuel future technological advancements. However, identification and discovery of these optimal materials for a specific application is a non-trivial task, because of the vastness of the chemical search space with enormous compositional and configurational degrees of freedom. Materials informatics provides an efficient approach towards rational design of new materials, via learning from known data to make decisions on new and previously unexplored compounds in an accelerated manner. Here, we demonstrate the power and utility of such statistical learning (or machine learning) via building a support vector machine (SVM) based classifier that uses elemental features (or descriptors) to predict the formability of a given ABX3 halide composition (where A and B represent monovalent and divalent cations, respectively, and X is F, Cl, Br or I anion) in the perovskite crystal structure. The classification model is built by learning from a dataset of 181 experimentally known ABX3 compounds. After exploring a wide range of features, we identify ionic radii, tolerance factor and octahedral factor to be the most important factors for the classification, suggesting that steric and geometric packing effects govern the stability of these halides. The trained and validated models then predict, with a high degree of confidence, several novel ABX3 compositions with perovskite crystal structure.

  1. Finding New Perovskite Halides via Machine learning

    Science.gov (United States)

    Pilania, Ghanshyam; Balachandran, Prasanna V.; Kim, Chiho; Lookman, Turab

    2016-04-01

    Advanced materials with improved properties have the potential to fuel future technological advancements. However, identification and discovery of these optimal materials for a specific application is a non-trivial task, because of the vastness of the chemical search space with enormous compositional and configurational degrees of freedom. Materials informatics provides an efficient approach towards rational design of new materials, via learning from known data to make decisions on new and previously unexplored compounds in an accelerated manner. Here, we demonstrate the power and utility of such statistical learning (or machine learning) via building a support vector machine (SVM) based classifier that uses elemental features (or descriptors) to predict the formability of a given ABX3 halide composition (where A and B represent monovalent and divalent cations, respectively, and X is F, Cl, Br or I anion) in the perovskite crystal structure. The classification model is built by learning from a dataset of 181 experimentally known ABX3 compounds. After exploring a wide range of features, we identify ionic radii, tolerance factor and octahedral factor to be the most important factors for the classification, suggesting that steric and geometric packing effects govern the stability of these halides. The trained and validated models then predict, with a high degree of confidence, several novel ABX3 compositions with perovskite crystal structure.
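
    A hedged sketch of such a formability classifier follows: an SVM trained on descriptor features, with the tolerance-factor/octahedral-factor rule and the synthetic 181-compound dataset invented purely for illustration (the real model was trained on the experimentally known compounds).

```python
# An SVM mapping simple descriptors to perovskite formability; the
# descriptors, the labeling rule and the data are synthetic stand-ins.
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 181                                      # same size as the real dataset
t = rng.uniform(0.7, 1.1, n)                 # "tolerance factor"
mu = rng.uniform(0.3, 0.7, n)                # "octahedral factor"
formable = ((t > 0.8) & (t < 1.0) & (mu > 0.41)).astype(int)  # toy rule

X = np.column_stack([t, mu])
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0))
print(f"CV accuracy: {cross_val_score(clf, X, formable, cv=5).mean():.2f}")
```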

  2. BELM: Bayesian extreme learning machine.

    Science.gov (United States)

    Soria-Olivas, Emilio; Gómez-Sanchis, Juan; Martín, José D; Vila-Francés, Joan; Martínez, Marcelino; Magdalena, José R; Serrano, Antonio J

    2011-03-01

    The theory of the extreme learning machine (ELM) has become very popular over the last few years. ELM is a new approach for learning the parameters of the hidden layers of a multilayer neural network (such as the multilayer perceptron or the radial basis function neural network). Its main advantage is the lower computational cost, which is especially relevant when dealing with many patterns defined in a high-dimensional space. This brief proposes a Bayesian approach to ELM, which presents some advantages over other approaches: it allows the introduction of a priori knowledge; it obtains confidence intervals (CIs) without the need to apply computationally intensive methods, e.g., bootstrap; and it presents high generalization capabilities. Bayesian ELM is benchmarked against classical ELM in several artificial and real datasets that are widely used for the evaluation of machine learning algorithms. The achieved results show that the proposed approach produces competitive accuracy with some additional advantages, namely, automatic production of CIs, reduced probability of model overfitting, and use of a priori knowledge.

  3. Archetypal Analysis for Machine Learning

    DEFF Research Database (Denmark)

    Mørup, Morten; Hansen, Lars Kai

    2010-01-01

    Archetypal analysis (AA) proposed by Cutler and Breiman in [1] estimates the principal convex hull of a data set. As such AA favors features that constitute representative ’corners’ of the data, i.e. distinct aspects or archetypes. We will show that AA enjoys the interpretability of clustering … for K-means [2]. We demonstrate that the AA model is relevant for feature extraction and dimensional reduction for a large variety of machine learning problems taken from computer vision, neuroimaging, text mining and collaborative filtering.

  4. Machine learning approaches in medical image analysis

    DEFF Research Database (Denmark)

    de Bruijne, Marleen

    2016-01-01

    Machine learning approaches are increasingly successful in image-based diagnosis, disease prognosis, and risk assessment. This paper highlights new research directions and discusses three main challenges related to machine learning in medical imaging: coping with variation in imaging protocols, learning from weak labels, and interpretation and evaluation of results.

  5. A machine learning model with human cognitive biases capable of learning from small and biased datasets.

    Science.gov (United States)

    Taniguchi, Hidetaka; Sato, Hiroshi; Shirakawa, Tomohiro

    2018-05-09

    Human learners can generalize a new concept from a small number of samples. In contrast, conventional machine learning methods require large amounts of data to address the same types of problems. Humans have cognitive biases that promote fast learning. Here, we developed a method to reduce the gap between human beings and machines in this type of inference by utilizing cognitive biases. We implemented a human cognitive model into machine learning algorithms and compared their performance with the currently most popular methods, naïve Bayes, support vector machine, neural networks, logistic regression and random forests. We focused on the task of spam classification, which has been studied for a long time in the field of machine learning and often requires a large amount of data to obtain high accuracy. Our models achieved superior performance with small and biased samples in comparison with other representative machine learning methods.
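
    For context, the sketch below shows the kind of conventional baseline the paper compares against, a naive Bayes spam classifier on bag-of-words counts; the tiny corpus is invented, and the paper's cognitive-bias models are not reproduced here.

```python
# A naive Bayes spam-classification baseline on bag-of-words counts;
# the corpus is a toy and serves only to show the workflow.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

texts = ["win cash now", "cheap pills offer", "meeting at noon",
         "lunch tomorrow?", "win a free offer", "project status update"]
labels = [1, 1, 0, 0, 1, 0]                  # 1 = spam, 0 = ham

vec = CountVectorizer()
X = vec.fit_transform(texts)
clf = MultinomialNB().fit(X, labels)
print(clf.predict(vec.transform(["free cash offer", "status meeting"])))
```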

  6. Machine learning in genetics and genomics

    Science.gov (United States)

    Libbrecht, Maxwell W.; Noble, William Stafford

    2016-01-01

    The field of machine learning promises to enable computers to assist humans in making sense of large, complex data sets. In this review, we outline some of the main applications of machine learning to genetic and genomic data. In the process, we identify some recurrent challenges associated with this type of analysis and provide general guidelines to assist in the practical application of machine learning to real genetic and genomic data. PMID:25948244

  7. Using Machine Learning to Predict Student Performance

    OpenAIRE

    Pojon, Murat

    2017-01-01

    This thesis examines the application of machine learning algorithms to predict whether a student will be successful or not. The specific focus of the thesis is the comparison of machine learning methods and feature engineering techniques in terms of how much they improve the prediction performance. Three different machine learning methods were used in this thesis. They are linear regression, decision trees, and naïve Bayes classification. Feature engineering, the process of modification ...

  8. Introducing Machine Learning Concepts with WEKA.

    Science.gov (United States)

    Smith, Tony C; Frank, Eibe

    2016-01-01

    This chapter presents an introduction to data mining with machine learning. It gives an overview of various types of machine learning, along with some examples. It explains how to download, install, and run the WEKA data mining toolkit on a simple data set, then proceeds to explain how one might approach a bioinformatics problem. Finally, it includes a brief summary of machine learning algorithms for other types of data mining problems, and provides suggestions about where to find additional information.

  9. Trends in Machine Learning for Signal Processing

    DEFF Research Database (Denmark)

    Adali, Tulay; Miller, David J.; Diamantaras, Konstantinos I.

    2011-01-01

    By putting the accent on learning from the data and the environment, the Machine Learning for SP (MLSP) Technical Committee (TC) provides the essential bridge between the machine learning and SP communities. While the emphasis in MLSP is on learning and data-driven approaches, SP defines the main applications of interest, and thus the constraints and requirements on solutions, which include computational efficiency, online adaptation, and learning with limited supervision/reference data.

  10. Machine learning in medicine cookbook

    CERN Document Server

    Cleophas, Ton J

    2014-01-01

    The amount of data in medical databases doubles every 20 months, and physicians are at a loss to analyze them. Also, traditional methods of data analysis have difficulty identifying outliers and patterns in big data and in data with multiple exposure/outcome variables, and analysis rules for surveys and questionnaires, currently common methods of data collection, are essentially missing. Obviously, it is time that medical and health professionals mastered their reluctance to use machine learning, and the current 100-page cookbook should be helpful to that aim. It covers in a condensed form the subjects reviewed in the 750-page, three-volume textbook by the same authors, entitled "Machine Learning in Medicine I-III" (Springer, Heidelberg, Germany, 2013), and was written as a hand-held presentation and must-read publication. It was written not only for investigators and students in the field, but also for jaded clinicians new to the methods and lacking time to read the entire textbooks. General purposes ...

  11. Advanced Machine learning Algorithm Application for Rotating Machine Health Monitoring

    Energy Technology Data Exchange (ETDEWEB)

    Kanemoto, Shigeru; Watanabe, Masaya [The University of Aizu, Aizuwakamatsu (Japan); Yusa, Noritaka [Tohoku University, Sendai (Japan)

    2014-08-15

    The present paper evaluates the applicability of conventional sound analysis techniques and modern machine learning algorithms to rotating machine health monitoring. These techniques include the support vector machine, deep learning neural networks, etc. Inner-ring defect and misalignment anomaly sound data measured on a rotating machine mockup test facility are used to verify the above algorithms. Although we did not find a remarkable difference in anomaly discrimination performance, some methods yield very interesting eigenpatterns corresponding to normal and abnormal states. These results will be useful for future, more sensitive and robust anomaly monitoring technology.

  12. Advanced Machine learning Algorithm Application for Rotating Machine Health Monitoring

    International Nuclear Information System (INIS)

    Kanemoto, Shigeru; Watanabe, Masaya; Yusa, Noritaka

    2014-01-01

    The present paper evaluates the applicability of conventional sound analysis techniques and modern machine learning algorithms to rotating machine health monitoring. These techniques include the support vector machine, deep learning neural networks, etc. Inner-ring defect and misalignment anomaly sound data measured on a rotating machine mockup test facility are used to verify the above algorithms. Although we did not find a remarkable difference in anomaly discrimination performance, some methods yield very interesting eigenpatterns corresponding to normal and abnormal states. These results will be useful for future, more sensitive and robust anomaly monitoring technology

  13. Extreme learning machines 2013 algorithms and applications

    CERN Document Server

    Toh, Kar-Ann; Romay, Manuel; Mao, Kezhi

    2014-01-01

    In recent years, ELM has emerged as a revolutionary technique of computational intelligence and has attracted considerable attention. An extreme learning machine (ELM) is a learning system akin to a single-hidden-layer feedforward neural network, whose connections from the input layer to the hidden layer are randomly generated, while the connections from the hidden layer to the output layer are learned through linear learning methods. The outstanding merits of the extreme learning machine (ELM) are its fast learning speed, minimal human intervention and high scalability. This book contains selected papers from the International Conference on Extreme Learning Machine 2013, which was held in Beijing, China, October 15-17, 2013. The conference aimed to bring together researchers and practitioners of the extreme learning machine from a variety of fields, including artificial intelligence, biomedical engineering and bioinformatics, system modelling and control, and signal and image processing, to promote research and discu...
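
    The ELM recipe summarized above fits in a short numpy sketch: random, untrained input-to-hidden weights followed by a single regularized linear solve for the output weights; network size, regularization and the toy task are illustrative assumptions.

```python
# A minimal ELM for regression: random hidden layer, then one ridge
# solve for the output weights. All sizes here are illustrative.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(500, 4))
y = np.sin(X).sum(axis=1)                     # toy regression target

n_hidden, lam = 200, 1e-3
W = rng.normal(size=(X.shape[1], n_hidden))   # random, never trained
b = rng.normal(size=n_hidden)
H = np.tanh(X @ W + b)                        # hidden activations

# Ridge-regularized least squares for the output weights only
beta = np.linalg.solve(H.T @ H + lam * np.eye(n_hidden), H.T @ y)
pred = H @ beta
print(f"train RMSE: {np.sqrt(np.mean((pred - y) ** 2)):.3f}")
```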

  14. Introduction to Machine Learning: Class Notes 67577

    OpenAIRE

    Shashua, Amnon

    2009-01-01

    Introduction to Machine learning covering Statistical Inference (Bayes, EM, ML/MaxEnt duality), algebraic and spectral methods (PCA, LDA, CCA, Clustering), and PAC learning (the Formal model, VC dimension, Double Sampling theorem).

  15. Building machine learning systems with Python

    CERN Document Server

    Coelho, Luis Pedro

    2015-01-01

    This book primarily targets Python developers who want to learn and use Python's machine learning capabilities and gain valuable insights from data to develop effective solutions for business problems.

  16. Learning Activity Packets for Grinding Machines. Unit I--Grinding Machines.

    Science.gov (United States)

    Oklahoma State Board of Vocational and Technical Education, Stillwater. Curriculum and Instructional Materials Center.

    This learning activity packet (LAP) is one of three that accompany the curriculum guide on grinding machines. It outlines the study activities and performance tasks for the first unit of this curriculum guide. Its purpose is to aid the student in attaining a working knowledge of this area of training and in achieving a skilled or moderately…

  17. Amplifying human ability through autonomics and machine learning in IMPACT

    Science.gov (United States)

    Dzieciuch, Iryna; Reeder, John; Gutzwiller, Robert; Gustafson, Eric; Coronado, Braulio; Martinez, Luis; Croft, Bryan; Lange, Douglas S.

    2017-05-01

    Amplifying human ability for controlling complex environments featuring autonomous units can be aided by learned models of human and system performance. In developing a command and control system that allows a small number of people to control a large number of autonomous teams, we employ an autonomics framework to manage the networks that represent mission plans and the networks that are composed of human controllers and their autonomous assistants. Machine learning allows us to build models of human and system performance useful for monitoring plans and managing human attention and task loads. Machine learning also aids in the development of tactics that human supervisors can successfully monitor through the command and control system.

  18. Robust Matching Pursuit Extreme Learning Machines

    Directory of Open Access Journals (Sweden)

    Zejian Yuan

    2018-01-01

    Extreme learning machine (ELM) is a popular learning algorithm for single hidden layer feedforward networks (SLFNs). It was originally proposed with inspiration from biological learning and has attracted massive attention due to its adaptability to various tasks, fast learning ability and efficient computational cost. As an effective sparse representation method, the orthogonal matching pursuit (OMP) method can be embedded into ELM to overcome the singularity problem and improve stability. Usually OMP recovers a sparse vector by minimizing a least squares (LS) loss, which is efficient for Gaussian-distributed data but may suffer performance deterioration in the presence of non-Gaussian data. To address this problem, a robust matching pursuit method based on a novel kernel risk-sensitive loss (in short, KRSLMP) is first proposed in this paper. The KRSLMP is then applied to ELM to solve for the sparse output weight vector, and the new method, named KRSLMP-ELM, is developed for SLFN learning. Experimental results on synthetic and real-world data sets confirm the effectiveness and superiority of the proposed method.

  19. Machine Learning Interface for Medical Image Analysis.

    Science.gov (United States)

    Zhang, Yi C; Kagen, Alexander C

    2017-10-01

    TensorFlow is a second-generation open-source machine learning software library with a built-in framework for implementing neural networks in wide variety of perceptual tasks. Although TensorFlow usage is well established with computer vision datasets, the TensorFlow interface with DICOM formats for medical imaging remains to be established. Our goal is to extend the TensorFlow API to accept raw DICOM images as input; 1513 DaTscan DICOM images were obtained from the Parkinson's Progression Markers Initiative (PPMI) database. DICOM pixel intensities were extracted and shaped into tensors, or n-dimensional arrays, to populate the training, validation, and test input datasets for machine learning. A simple neural network was constructed in TensorFlow to classify images into normal or Parkinson's disease groups. Training was executed over 1000 iterations for each cross-validation set. The gradient descent optimization and Adagrad optimization algorithms were used to minimize cross-entropy between the predicted and ground-truth labels. Cross-validation was performed ten times to produce a mean accuracy of 0.938 ± 0.047 (95 % CI 0.908-0.967). The mean sensitivity was 0.974 ± 0.043 (95 % CI 0.947-1.00) and mean specificity was 0.822 ± 0.207 (95 % CI 0.694-0.950). We extended the TensorFlow API to enable DICOM compatibility in the context of DaTscan image analysis. We implemented a neural network classifier that produces diagnostic accuracies on par with excellent results from previous machine learning models. These results indicate the potential role of TensorFlow as a useful adjunct diagnostic tool in the clinical setting.
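
    A hedged sketch of such a pipeline is shown below, reading DICOM pixel data (via pydicom, an assumed dependency; the original work extended the TensorFlow API directly), flattening it into arrays and training a small tf.keras classifier; the file names, labels and network shape are hypothetical placeholders.

```python
# A hedged sketch of the pipeline described above. pydicom is an assumed
# dependency for reading DICOM files; the file names, labels and network
# shape are hypothetical placeholders, not the paper's actual setup.
import numpy as np
import pydicom
import tensorflow as tf

paths = ["scan_001.dcm", "scan_002.dcm"]     # hypothetical DaTscan files
labels = np.array([0, 1])                    # 0 = normal, 1 = Parkinson's

images = np.stack([pydicom.dcmread(p).pixel_array.astype("float32")
                   for p in paths])
X = (images / images.max()).reshape(len(images), -1)  # scale and flatten

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(X.shape[1],)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),   # normal vs PD
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(X, labels, epochs=10, verbose=0)
```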

  20. Machine learning of network metrics in ATLAS Distributed Data Management

    CERN Document Server

    AUTHOR|(INSPIRE)INSPIRE-00218873; The ATLAS collaboration; Toler, Wesley; Vamosi, Ralf; Bogado Garcia, Joaquin Ignacio

    2017-01-01

    The increasing volume of physics data poses a critical challenge to the ATLAS experiment. In anticipation of high luminosity physics, automation of everyday data management tasks has become necessary. Previously many of these tasks required human decision-making and operation. Recent advances in hardware and software have made it possible to entrust more complicated duties to automated systems using models trained by machine learning algorithms. In this contribution we show results from one of our ongoing automation efforts that focuses on network metrics. First, we describe our machine learning framework built atop the ATLAS Analytics Platform. This framework can automatically extract and aggregate data, train models with various machine learning algorithms, and eventually score the resulting models and parameters. Second, we use these models to forecast metrics relevant for network-aware job scheduling and data brokering. We show the characteristics of the data and evaluate the forecasting accuracy of our m...

  1. Machine learning of network metrics in ATLAS Distributed Data Management

    Science.gov (United States)

    Lassnig, Mario; Toler, Wesley; Vamosi, Ralf; Bogado, Joaquin; ATLAS Collaboration

    2017-10-01

    The increasing volume of physics data poses a critical challenge to the ATLAS experiment. In anticipation of high luminosity physics, automation of everyday data management tasks has become necessary. Previously many of these tasks required human decision-making and operation. Recent advances in hardware and software have made it possible to entrust more complicated duties to automated systems using models trained by machine learning algorithms. In this contribution we show results from one of our ongoing automation efforts that focuses on network metrics. First, we describe our machine learning framework built atop the ATLAS Analytics Platform. This framework can automatically extract and aggregate data, train models with various machine learning algorithms, and eventually score the resulting models and parameters. Second, we use these models to forecast metrics relevant for network-aware job scheduling and data brokering. We show the characteristics of the data and evaluate the forecasting accuracy of our models.

  2. Learning as a Machine: Crossovers between Humans and Machines

    Science.gov (United States)

    Hildebrandt, Mireille

    2017-01-01

    This article is a revised version of the keynote presented at LAK '16 in Edinburgh. The article investigates some of the assumptions of learning analytics, notably those related to behaviourism. Building on the work of Ivan Pavlov, Herbert Simon, and James Gibson as ways of "learning as a machine," the article then develops two levels of…

  3. What is the machine learning.

    CERN Multimedia

    CERN. Geneva

    2018-01-01

    Applications of machine learning tools to problems of physical interest are often criticized for producing sensitivity at the expense of transparency. In this talk, I explore a procedure, known as planing, for identifying combinations of variables -- aided by physical intuition -- that can discriminate signal from background. Weights are introduced to smooth away the features in a given variable or set of variables. New networks are then trained on this modified data. Observed decreases in sensitivity diagnose the variable's discriminating power. Planing also allows the investigation of the linear versus non-linear nature of the boundaries between signal and background. I will demonstrate these features in both an easy-to-understand toy model and an idealized LHC resonance scenario.
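
    As a toy reconstruction of the reweighting step (under our own assumptions, not the speaker's code), the sketch below derives per-event weights that flatten one variable's distribution; a classifier retrained on the weighted events can no longer exploit that variable, so any drop in sensitivity measures its discriminating power:

        import numpy as np

        rng = np.random.default_rng(0)
        mass = rng.normal(91.0, 5.0, size=10_000)     # the variable to plane away

        # inverse-density weights from a histogram of the variable
        hist, edges = np.histogram(mass, bins=50, density=True)
        idx = np.clip(np.digitize(mass, edges) - 1, 0, len(hist) - 1)
        weights = 1.0 / np.maximum(hist[idx], 1e-12)
        weights /= weights.mean()                     # normalize mean weight to 1

        # sanity check: the weighted distribution is approximately flat
        flat, _ = np.histogram(mass, bins=50, weights=weights, density=True)
        print(flat.round(3))                          # roughly constant values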

  4. Teaching machine learning to design students

    NARCIS (Netherlands)

    Vlist, van der B.J.J.; van de Westelaken, H.F.M.; Bartneck, C.; Hu, J.; Ahn, R.M.C.; Barakova, E.I.; Delbressine, F.L.M.; Feijs, L.M.G.; Pan, Z.; Zhang, X.; El Rhalibi, A.

    2008-01-01

    Machine learning is a key technology to design and create intelligent systems, products, and related services. Like many other design departments, we are faced with the challenge to teach machine learning to design students, who often do not have an inherent affinity towards technology. We

  5. Predicting the dissolution kinetics of silicate glasses using machine learning

    Science.gov (United States)

    Anoop Krishnan, N. M.; Mangalathu, Sujith; Smedskjaer, Morten M.; Tandia, Adama; Burton, Henry; Bauchy, Mathieu

    2018-05-01

    Predicting the dissolution rates of silicate glasses in aqueous conditions is a complex task as the underlying mechanism(s) remain poorly understood and the dissolution kinetics can depend on a large number of intrinsic and extrinsic factors. Here, we assess the potential of data-driven models based on machine learning to predict the dissolution rates of various aluminosilicate glasses exposed to a wide range of solution pH values, from acidic to caustic conditions. Four classes of machine learning methods are investigated, namely, linear regression, support vector machine regression, random forest, and artificial neural network. We observe that, although linear methods all fail to describe the dissolution kinetics, the artificial neural network approach offers excellent predictions, thanks to its inherent ability to handle non-linear data. Overall, we suggest that a more extensive use of machine learning approaches could significantly accelerate the design of novel glasses with tailored properties.
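
    The headline comparison (linear models failing where a neural network succeeds on non-linear data) is easy to reproduce in miniature. In the sketch below, synthetic data stands in for the composition and pH features of the study, so the numbers are purely illustrative:

        import numpy as np
        from sklearn.linear_model import LinearRegression
        from sklearn.model_selection import train_test_split
        from sklearn.neural_network import MLPRegressor

        rng = np.random.default_rng(0)
        X = rng.uniform(0, 14, size=(500, 1))             # e.g. solution pH
        y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)  # non-linear response

        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
        lin = LinearRegression().fit(X_tr, y_tr)
        mlp = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000,
                           random_state=0).fit(X_tr, y_tr)
        print("linear R^2:", lin.score(X_te, y_te))       # near zero here
        print("neural R^2:", mlp.score(X_te, y_te))       # much higher here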

  6. Considerations upon the Machine Learning Technologies

    OpenAIRE

    Alin Munteanu; Cristina Ofelia Sofran

    2006-01-01

    Artificial intelligence offers superior techniques and methods by which problems from diverse domains may find an optimal solution. Machine Learning technologies refer to the domain of artificial intelligence that aims to develop techniques allowing computers to “learn”. Some systems based on Machine Learning technologies tend to eliminate the necessity of human intelligence, while others adopt a man-machine collaborative approach.

  7. Considerations upon the Machine Learning Technologies

    Directory of Open Access Journals (Sweden)

    Alin Munteanu

    2006-01-01

    Artificial intelligence offers superior techniques and methods by which problems from diverse domains may find an optimal solution. Machine Learning technologies refer to the domain of artificial intelligence that aims to develop techniques allowing computers to “learn”. Some systems based on Machine Learning technologies tend to eliminate the necessity of human intelligence, while others adopt a man-machine collaborative approach.

  8. Machine vision systems using machine learning for industrial product inspection

    Science.gov (United States)

    Lu, Yi; Chen, Tie Q.; Chen, Jie; Zhang, Jian; Tisler, Anthony

    2002-02-01

    Machine vision inspection requires efficient processing time and accurate results. In this paper, we present a machine vision inspection architecture, SMV (Smart Machine Vision). SMV decomposes a machine vision inspection problem into two stages, Learning Inspection Features (LIF) and On-Line Inspection (OLI). The LIF is designed to learn visual inspection features from design data and/or from inspection products. During the OLI stage, the inspection system uses the knowledge learnt by the LIF component to inspect the visual features of products. In this paper we present two machine vision inspection systems developed under the SMV architecture for two different types of products, Printed Circuit Board (PCB) and Vacuum Fluorescent Display (VFD) boards. In the VFD board inspection system, the LIF component learns inspection features from a VFD board and its display patterns. In the PCB board inspection system, the LIF learns the inspection features from the CAD file of a PCB board. In both systems, the LIF component also incorporates interactive learning to make the inspection system more powerful and efficient. The VFD system has been deployed successfully in three different manufacturing companies and the PCB inspection system is in the process of being deployed in a manufacturing plant.

  9. Automated mapping of building facades by machine learning

    DEFF Research Database (Denmark)

    Höhle, Joachim

    2014-01-01

    Facades of buildings contain various types of objects which have to be recorded for information systems. The article describes a solution for this task focussing on automated classification by means of machine learning techniques. Stereo pairs of oblique images are used to derive 3D point clouds...

  10. Kernel Methods for Machine Learning with Life Science Applications

    DEFF Research Database (Denmark)

    Abrahamsen, Trine Julie

    Kernel methods refer to a family of widely used nonlinear algorithms for machine learning tasks like classification, regression, and feature extraction. By exploiting the so-called kernel trick, straightforward extensions of classical linear algorithms are enabled as long as the data only appear as inner products.
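
    A compact, generic illustration of the kernel trick (not code from the thesis): ridge regression becomes a nonlinear method once the data enter only through an RBF kernel matrix:

        import numpy as np

        def rbf_kernel(A, B, gamma=1.0):
            """K[i, j] = exp(-gamma * ||A_i - B_j||^2)."""
            sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
            return np.exp(-gamma * sq)

        rng = np.random.default_rng(0)
        X = rng.uniform(-3, 3, size=(100, 1))
        y = np.sin(X[:, 0])

        # kernel ridge regression: data appear only via inner products (kernels)
        K = rbf_kernel(X, X)
        alpha = np.linalg.solve(K + 1e-3 * np.eye(len(X)), y)

        X_new = np.array([[0.5]])
        print(rbf_kernel(X_new, X) @ alpha)   # prediction close to sin(0.5)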

  11. Adaptive Learning Systems: Beyond Teaching Machines

    Science.gov (United States)

    Kara, Nuri; Sevim, Nese

    2013-01-01

    Since the 1950s, teaching machines have changed a lot. Today, we have different ideas about how people learn and what instructors should do to help students during their learning process. We have adaptive learning technologies that can create much more student-oriented learning environments. The purpose of this article is to present these changes and its…

  12. Building machine learning systems with Python

    CERN Document Server

    Richert, Willi

    2013-01-01

    This is a tutorial-driven and practical, but well-grounded book showcasing good Machine Learning practices. There will be an emphasis on using existing technologies instead of showing how to write your own implementations of algorithms. This book is a scenario-based, example-driven tutorial. By the end of the book you will have learnt critical aspects of Machine Learning Python projects and experienced the power of ML-based systems by actually working on them.This book primarily targets Python developers who want to learn about and build Machine Learning into their projects, or who want to pro

  13. Probabilistic machine learning and artificial intelligence.

    Science.gov (United States)

    Ghahramani, Zoubin

    2015-05-28

    How can a machine learn from experience? Probabilistic modelling provides a framework for understanding what learning is, and has therefore emerged as one of the principal theoretical and practical approaches for designing machines that learn from data acquired through experience. The probabilistic framework, which describes how to represent and manipulate uncertainty about models and predictions, has a central role in scientific data analysis, machine learning, robotics, cognitive science and artificial intelligence. This Review provides an introduction to this framework, and discusses some of the state-of-the-art advances in the field, namely, probabilistic programming, Bayesian optimization, data compression and automatic model discovery.

  14. Probabilistic machine learning and artificial intelligence

    Science.gov (United States)

    Ghahramani, Zoubin

    2015-05-01

    How can a machine learn from experience? Probabilistic modelling provides a framework for understanding what learning is, and has therefore emerged as one of the principal theoretical and practical approaches for designing machines that learn from data acquired through experience. The probabilistic framework, which describes how to represent and manipulate uncertainty about models and predictions, has a central role in scientific data analysis, machine learning, robotics, cognitive science and artificial intelligence. This Review provides an introduction to this framework, and discusses some of the state-of-the-art advances in the field, namely, probabilistic programming, Bayesian optimization, data compression and automatic model discovery.

  15. Quantum machine learning: a classical perspective

    Science.gov (United States)

    Ciliberto, Carlo; Herbster, Mark; Ialongo, Alessandro Davide; Pontil, Massimiliano; Severini, Simone; Wossnig, Leonard

    2018-01-01

    Recently, increased computational power and data availability, as well as algorithmic advances, have led machine learning (ML) techniques to impressive results in regression, classification, data generation and reinforcement learning tasks. Despite these successes, the proximity to the physical limits of chip fabrication alongside the increasing size of datasets is motivating a growing number of researchers to explore the possibility of harnessing the power of quantum computation to speed up classical ML algorithms. Here we review the literature in quantum ML and discuss perspectives for a mixed readership of classical ML and quantum computation experts. Particular emphasis will be placed on clarifying the limitations of quantum algorithms, how they compare with their best classical counterparts and why quantum resources are expected to provide advantages for learning problems. Learning in the presence of noise and certain computationally hard problems in ML are identified as promising directions for the field. Practical questions, such as how to upload classical data into quantum form, will also be addressed. PMID:29434508

  16. Integrating Human and Machine Intelligence in Galaxy Morphology Classification Tasks

    Science.gov (United States)

    Beck, Melanie Renee

    thus solved both the visual classification problem of time efficiency and improved accuracy by producing a distribution of independent classifications for each galaxy. While crowd-sourced galaxy classifications have proven their worth, challenges remain before establishing this method as a critical and standard component of the data processing pipelines for the next generation of surveys. In particular, though innovative, crowd-sourcing techniques do not have the capacity to handle the data volume and rates expected in the next generation of surveys. These algorithms will be delegated to handle the majority of the classification tasks, freeing citizen scientists to contribute their efforts on subtler and more complex assignments. This thesis presents a solution through an integration of visual and automated classifications, preserving the best features of both human and machine. We demonstrate the effectiveness of such a system through a re-analysis of visual galaxy morphology classifications collected during the Galaxy Zoo 2 (GZ2) project. We reprocess the top-level question of the GZ2 decision tree with a Bayesian classification aggregation algorithm dubbed SWAP, originally developed for the Space Warps gravitational lens project. Through a simple binary classification scheme we increase the classification rate nearly 5-fold classifying 226,124 galaxies in 92 days of GZ2 project time while reproducing labels derived from GZ2 classification data with 95.7% accuracy. We next combine this with a Random Forest machine learning algorithm that learns on a suite of non-parametric morphology indicators widely used for automated morphologies. We develop a decision engine that delegates tasks between human and machine and demonstrate that the combined system provides a factor of 11.4 increase in the classification rate, classifying 210,803 galaxies in just 32 days of GZ2 project time with 93.1% accuracy. As the Random Forest algorithm requires a minimal amount of computational

  17. Probability Machines: Consistent Probability Estimation Using Nonparametric Learning Machines

    Science.gov (United States)

    Malley, J. D.; Kruppa, J.; Dasgupta, A.; Malley, K. G.; Ziegler, A.

    2011-01-01

    Summary Background Most machine learning approaches only provide a classification for binary responses. However, probabilities are required for risk estimation using individual patient characteristics. It has been shown recently that every statistical learning machine known to be consistent for a nonparametric regression problem is a probability machine that is provably consistent for this estimation problem. Objectives The aim of this paper is to show how random forests and nearest neighbors can be used for consistent estimation of individual probabilities. Methods Two random forest algorithms and two nearest neighbor algorithms are described in detail for estimation of individual probabilities. We discuss the consistency of random forests, nearest neighbors and other learning machines in detail. We conduct a simulation study to illustrate the validity of the methods. We exemplify the algorithms by analyzing two well-known data sets on the diagnosis of appendicitis and the diagnosis of diabetes in Pima Indians. Results Simulations demonstrate the validity of the method. With the real data application, we show the accuracy and practicality of this approach. We provide sample code from R packages in which the probability estimation is already available. This means that all calculations can be performed using existing software. Conclusions Random forest algorithms as well as nearest neighbor approaches are valid machine learning methods for estimating individual probabilities for binary responses. Freely available implementations are available in R and may be used for applications. PMID:21915433
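
    The paper's central idea (a consistent nonparametric regression machine fit directly to 0/1 responses yields consistent probability estimates) is easy to demonstrate. The authors provide R code; the sketch below is a Python analogue on synthetic data:

        import numpy as np
        from sklearn.ensemble import RandomForestRegressor

        rng = np.random.default_rng(0)
        X = rng.normal(size=(2000, 3))
        p_true = 1 / (1 + np.exp(-X[:, 0]))   # true conditional probability
        y = rng.binomial(1, p_true)           # observed binary response

        # regression forest on the 0/1 labels acts as a probability machine
        rf = RandomForestRegressor(n_estimators=200, min_samples_leaf=20,
                                   random_state=0).fit(X, y)
        print(rf.predict(X[:5]).round(2))     # estimated P(y = 1 | x)
        print(p_true[:5].round(2))            # ground truth for comparison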

  18. International Conference on Extreme Learning Machines 2014

    CERN Document Server

    Mao, Kezhi; Cambria, Erik; Man, Zhihong; Toh, Kar-Ann

    2015-01-01

    This book contains selected papers from the International Conference on Extreme Learning Machine 2014, which was held in Singapore, December 8-10, 2014. This conference brought together researchers and practitioners of the Extreme Learning Machine (ELM) from a variety of fields to promote research and development of “learning without iterative tuning”. The book covers theories, algorithms and applications of ELM and gives readers a glance at the most recent advances in ELM.

  19. International Conference on Extreme Learning Machine 2015

    CERN Document Server

    Mao, Kezhi; Wu, Jonathan; Lendasse, Amaury; ELM 2015; Theory, Algorithms and Applications (I); Theory, Algorithms and Applications (II)

    2016-01-01

    This book contains selected papers from the International Conference on Extreme Learning Machine 2015, which was held in Hangzhou, China, December 15-17, 2015. This conference brought together researchers and engineers to share and exchange R&D experience on both theoretical studies and practical applications of the Extreme Learning Machine (ELM) technique and brain learning. The book covers theories, algorithms and applications of ELM and gives readers a glance at the most recent advances in ELM.

  20. Python for probability, statistics, and machine learning

    CERN Document Server

    Unpingco, José

    2016-01-01

    This book covers the key ideas that link probability, statistics, and machine learning illustrated using Python modules in these areas. The entire text, including all the figures and numerical results, is reproducible using the Python codes and their associated Jupyter/IPython notebooks, which are provided as supplementary downloads. The author develops key intuitions in machine learning by working meaningful examples using multiple analytical methods and Python codes, thereby connecting theoretical concepts to concrete implementations. Modern Python modules like Pandas, Sympy, and Scikit-learn are applied to simulate and visualize important machine learning concepts like the bias/variance trade-off, cross-validation, and regularization. Many abstract mathematical ideas, such as convergence in probability theory, are developed and illustrated with numerical examples. This book is suitable for anyone with an undergraduate-level exposure to probability, statistics, or machine learning and with rudimentary knowl...

  1. Evaluation of Machine Learning Methods for LHC Optics Measurements and Corrections Software

    CERN Document Server

    AUTHOR|(CDS)2206853; Henning, Peter

    The field of artificial intelligence is driven by the goal to provide machines with human-like intelligence. However, modern science is currently facing problems with high complexity that cannot be solved by humans on the same timescale as by machines. There is therefore a demand for the automation of complex tasks. Identifying the category of tasks that can be performed by machines in the domain of optics measurements and correction on the Large Hadron Collider (LHC) is one of the central research subjects of this thesis. The application of machine learning methods and concepts of artificial intelligence can be found in various industrial and scientific branches. In High Energy Physics these concepts are mostly used in offline analysis of experiment data and to perform regression tasks. In Accelerator Physics the machine learning approach has not found wide application yet. Therefore potential tasks for machine learning solutions can be specified in this domain. The appropriate methods and their suitability for...

  2. Trends in extreme learning machines: a review.

    Science.gov (United States)

    Huang, Gao; Huang, Guang-Bin; Song, Shiji; You, Keyou

    2015-01-01

    Extreme learning machine (ELM) has gained increasing interest from various research fields recently. In this review, we aim to report the current state of the theoretical research and practical advances on this subject. We first give an overview of ELM from the theoretical perspective, including the interpolation theory, universal approximation capability, and generalization ability. Then we focus on the various improvements made to ELM which further improve its stability, sparsity and accuracy under general or specific conditions. Apart from classification and regression, ELM has recently been extended for clustering, feature selection, representational learning and many other learning tasks. These newly emerging algorithms greatly expand the applications of ELM. From the implementation aspect, hardware implementation and parallel computation techniques have substantially sped up the training of ELM, making it feasible for big data processing and real-time reasoning. Due to its remarkable efficiency, simplicity, and impressive generalization performance, ELM has been applied in a variety of domains, such as biomedical engineering, computer vision, system identification, and control and robotics. In this review, we try to provide a comprehensive view of these advances in ELM together with its future perspectives.
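
    The core ELM training step ("learning without iterative tuning") fits in a few lines: hidden-layer weights are drawn at random and never updated, and only the output weights are solved in closed form. A minimal sketch on synthetic data:

        import numpy as np

        rng = np.random.default_rng(0)
        X = rng.uniform(-1, 1, size=(300, 2))
        y = np.sin(3 * X[:, 0]) * X[:, 1]

        n_hidden = 100
        W = rng.normal(size=(X.shape[1], n_hidden))   # random input weights, fixed
        b = rng.normal(size=n_hidden)                 # random biases, fixed
        H = np.tanh(X @ W + b)                        # hidden-layer activations
        beta, *_ = np.linalg.lstsq(H, y, rcond=None)  # closed-form output weights

        pred = np.tanh(X @ W + b) @ beta
        print("train MSE:", np.mean((pred - y) ** 2))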

  3. An introduction to machine learning with Scikit-Learn

    CERN Multimedia

    CERN. Geneva

    2015-01-01

    This tutorial gives an introduction to the scientific ecosystem for data analysis and machine learning in Python. After a short introduction of machine learning concepts, we will demonstrate on High Energy Physics data how a basic supervised learning analysis can be carried out using the Scikit-Learn library. Topics covered include data loading facilities and data representation, supervised learning algorithms, pipelines, model selection and evaluation, and model introspection.
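
    The workflow the tutorial covers (data loading, pipelines, model selection, evaluation) condenses to a few lines; here a bundled toy dataset stands in for the High Energy Physics data used in the talk:

        from sklearn.datasets import load_breast_cancer
        from sklearn.model_selection import GridSearchCV, train_test_split
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler
        from sklearn.svm import SVC

        X, y = load_breast_cancer(return_X_y=True)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

        # pipeline = preprocessing + classifier; grid search does model selection
        pipe = make_pipeline(StandardScaler(), SVC())
        search = GridSearchCV(pipe, {"svc__C": [0.1, 1, 10]}, cv=5).fit(X_tr, y_tr)
        print(search.best_params_, search.score(X_te, y_te))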

  4. Leadership for Learning: Tasks of Learning Culture

    Science.gov (United States)

    Corrigan, Joe

    2012-01-01

    This is a comparative analysis of leadership related to organizational culture and change that occurred at a large Canadian university during a twenty year period 1983-2003. From an institutional development perspective, leadership is characterized as a culture creation and development responsibility. By centering on the tasks of learning culture,…

  5. Machine Learning Techniques in Clinical Vision Sciences.

    Science.gov (United States)

    Caixinha, Miguel; Nunes, Sandrina

    2017-01-01

    This review presents and discusses the contribution of machine learning techniques for diagnosis and disease monitoring in the context of clinical vision science. Many ocular diseases leading to blindness can be halted or delayed when detected and treated at their earliest stages. With the recent developments in diagnostic devices, imaging and genomics, new sources of data for early disease detection and patients' management are now available. Machine learning techniques emerged in the biomedical sciences as clinical decision-support techniques to improve sensitivity and specificity of disease detection and monitoring, making the clinical decision-making process more objective. This manuscript presents a review of multimodal ocular disease diagnosis and monitoring based on machine learning approaches. In the first section, the technical issues related to the different machine learning approaches will be presented. Machine learning techniques are used to automatically recognize complex patterns in a given dataset. These techniques allow creating homogeneous groups (unsupervised learning), or creating a classifier predicting group membership of new cases (supervised learning), when a group label is available for each case. To ensure a good performance of the machine learning techniques in a given dataset, all possible sources of bias should be removed or minimized. For that, the representativeness of the input dataset for the true population should be confirmed, the noise should be removed, the missing data should be treated, and the data dimensionality (i.e., the number of parameters/features and the number of cases in the dataset) should be adjusted. The application of machine learning techniques in ocular disease diagnosis and monitoring will be presented and discussed in the second section of this manuscript. To show the clinical benefits of machine learning in clinical vision sciences, several examples will be presented in glaucoma, age-related macular degeneration

  6. A review of supervised machine learning applied to ageing research.

    Science.gov (United States)

    Fabris, Fabio; Magalhães, João Pedro de; Freitas, Alex A

    2017-04-01

    Broadly speaking, supervised machine learning is the computational task of learning correlations between variables in annotated data (the training set), and using this information to create a predictive model capable of inferring annotations for new data, whose annotations are not known. Ageing is a complex process that affects nearly all animal species. This process can be studied at several levels of abstraction, in different organisms and with different objectives in mind. Not surprisingly, the diversity of the supervised machine learning algorithms applied to answer biological questions reflects the complexities of the underlying ageing processes being studied. Many works using supervised machine learning to study the ageing process have been recently published, so it is timely to review these works, to discuss their main findings and weaknesses. In summary, the main findings of the reviewed papers are: the link between specific types of DNA repair and ageing; ageing-related proteins tend to be highly connected and seem to play a central role in molecular pathways; ageing/longevity is linked with autophagy and apoptosis, nutrient receptor genes, and copper and iron ion transport. Additionally, several biomarkers of ageing were found by machine learning. Despite some interesting machine learning results, we also identified a weakness of current works on this topic: only one of the reviewed papers has corroborated the computational results of machine learning algorithms through wet-lab experiments. In conclusion, supervised machine learning has contributed to advance our knowledge and has provided novel insights on ageing, yet future work should have a greater emphasis in validating the predictions.

  7. Clinical quality needs complex adaptive systems and machine learning.

    Science.gov (United States)

    Marsland, Stephen; Buchan, Iain

    2004-01-01

    The vast increase in clinical data has the potential to bring about large improvements in clinical quality and other aspects of healthcare delivery. However, such benefits do not come without cost. The analysis of such large datasets, particularly where the data may have to be merged from several sources and may be noisy and incomplete, is a challenging task. Furthermore, the introduction of clinical changes is a cyclical task, meaning that the processes under examination operate in an environment that is not static. We suggest that traditional methods of analysis are unsuitable for the task, and identify complexity theory and machine learning as areas that have the potential to facilitate the examination of clinical quality. By its nature the field of complex adaptive systems deals with environments that change because of the interactions that have occurred in the past. We draw parallels between health informatics and bioinformatics, which has already started to successfully use machine learning methods.

  8. Machine Learning Based Localization and Classification with Atomic Magnetometers

    Science.gov (United States)

    Deans, Cameron; Griffin, Lewis D.; Marmugi, Luca; Renzoni, Ferruccio

    2018-01-01

    We demonstrate identification of position, material, orientation, and shape of objects imaged by an ⁸⁵Rb atomic magnetometer performing electromagnetic induction imaging supported by machine learning. Machine learning maximizes the information extracted from the images created by the magnetometer, demonstrating the use of hidden data. Localization 2.6 times better than the spatial resolution of the imaging system and successful classification up to 97% are obtained. This circumvents the need of solving the inverse problem and demonstrates the extension of machine learning to diffusive systems, such as low-frequency electrodynamics in media. Automated collection of task-relevant information from quantum-based electromagnetic imaging will have a relevant impact from biomedicine to security.

  9. Machine learning techniques in optical communication

    DEFF Research Database (Denmark)

    Zibar, Darko; Piels, Molly; Jones, Rasmus Thomas

    2016-01-01

    Machine learning techniques relevant for nonlinearity mitigation, carrier recovery, and nanoscale device characterization are reviewed and employed. Markov Chain Monte Carlo in combination with Bayesian filtering is employed within the nonlinear state-space framework and demonstrated for parameter...

  10. Machine learning techniques in optical communication

    DEFF Research Database (Denmark)

    Zibar, Darko; Piels, Molly; Jones, Rasmus Thomas

    2015-01-01

    Techniques from the machine learning community are reviewed and employed for laser characterization, signal detection in the presence of nonlinear phase noise, and nonlinearity mitigation. Bayesian filtering and expectation maximization are employed within nonlinear state-space framework...

  11. Machine learning applications in genetics and genomics.

    Science.gov (United States)

    Libbrecht, Maxwell W; Noble, William Stafford

    2015-06-01

    The field of machine learning, which aims to develop computer algorithms that improve with experience, holds promise to enable computers to assist humans in the analysis of large, complex data sets. Here, we provide an overview of machine learning applications for the analysis of genome sequencing data sets, including the annotation of sequence elements and epigenetic, proteomic or metabolomic data. We present considerations and recurrent challenges in the application of supervised, semi-supervised and unsupervised machine learning methods, as well as of generative and discriminative modelling approaches. We provide general guidelines to assist in the selection of these machine learning methods and their practical application for the analysis of genetic and genomic data sets.

  12. Computer vision and machine learning for archaeology

    NARCIS (Netherlands)

    van der Maaten, L.J.P.; Boon, P.; Lange, G.; Paijmans, J.J.; Postma, E.

    2006-01-01

    Until now, computer vision and machine learning techniques barely contributed to the archaeological domain. The use of these techniques can support archaeologists in their assessment and classification of archaeological finds. The paper illustrates the use of computer vision techniques for

  13. Using Machine Learning for Land Suitability Classification

    African Journals Online (AJOL)

    West African Journal of Applied Ecology, vol. ... evidence for the utility of machine learning methods in land suitability classification, especially MCS methods. ... Artificial intelligence tools. ... Numerical values of the index for the various classes.

  14. Model-Agnostic Interpretability of Machine Learning

    OpenAIRE

    Ribeiro, Marco Tulio; Singh, Sameer; Guestrin, Carlos

    2016-01-01

    Understanding why machine learning models behave the way they do empowers both system designers and end-users in many ways: in model selection, feature engineering, in order to trust and act upon the predictions, and in more intuitive user interfaces. Thus, interpretability has become a vital concern in machine learning, and work in the area of interpretable models has found renewed interest. In some applications, such models are as accurate as non-interpretable ones, and thus are preferred f...

  15. Implementing Machine Learning in the PCWG Tool

    Energy Technology Data Exchange (ETDEWEB)

    Clifton, Andrew; Ding, Yu; Stuart, Peter

    2016-12-13

    The Power Curve Working Group (www.pcwg.org) is an ad-hoc industry-led group to investigate the performance of wind turbines in real-world conditions. As part of ongoing experience-sharing exercises, machine learning has been proposed as a possible way to predict turbine performance. This presentation provides some background information about machine learning and how it might be implemented in the PCWG exercises.

  16. Addressing uncertainty in atomistic machine learning

    DEFF Research Database (Denmark)

    Peterson, Andrew A.; Christensen, Rune; Khorshidi, Alireza

    2017-01-01

    Machine-learning regression has been demonstrated to precisely emulate the potential energy and forces that are output from more expensive electronic-structure calculations. However, to predict new regions of the potential energy surface, an assessment must be made of the credibility of the predictions. In this perspective, we address the types of errors that might arise in atomistic machine learning, the unique aspects of atomistic simulations that make machine learning challenging, and highlight how uncertainty analysis can be used to assess the validity of machine-learning predictions. We suggest this will allow researchers to more fully use machine learning for the routine acceleration of large, high-accuracy, or extended-time simulations. In our demonstrations, we use a bootstrap ensemble of neural network-based calculators, and show that the width of the ensemble can provide an estimate...
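
    The bootstrap-ensemble idea translates directly into code: train several models on resampled data and use the spread of their predictions as an error bar. In this hedged sketch, small scikit-learn regressors stand in for the neural-network calculators of the paper:

        import numpy as np
        from sklearn.neural_network import MLPRegressor

        rng = np.random.default_rng(0)
        X = rng.uniform(-2, 2, size=(200, 1))
        y = np.sin(2 * X[:, 0]) + 0.05 * rng.normal(size=200)

        ensemble = []
        for seed in range(10):
            idx = rng.integers(0, len(X), len(X))   # bootstrap resample
            m = MLPRegressor(hidden_layer_sizes=(32,), max_iter=3000,
                             random_state=seed).fit(X[idx], y[idx])
            ensemble.append(m)

        X_query = np.array([[0.0], [5.0]])          # 5.0 lies far outside the data
        preds = np.stack([m.predict(X_query) for m in ensemble])
        # a wider ensemble spread signals a less credible prediction
        print("mean:", preds.mean(0), "spread:", preds.std(0))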

  17. OpenML : An R package to connect to the machine learning platform OpenML

    NARCIS (Netherlands)

    Casalicchio, G.; Bossek, J.; Lang, M.; Kirchhoff, D.; Kerschke, P.; Hofner, B.; Seibold, H.; Vanschoren, J.; Bischl, B.

    2017-01-01

    OpenML is an online machine learning platform where researchers can easily share data, machine learning tasks and experiments as well as organize them online to work and collaborate more efficiently. In this paper, we present an R package to interface with the OpenML platform and illustrate its

  18. Automatic Task Classification via Support Vector Machine and Crowdsourcing

    Directory of Open Access Journals (Sweden)

    Hyungsik Shin

    2018-01-01

    Automatic task classification is a core part of personal assistant systems that are widely used in mobile devices such as smartphones and tablets. Even though many industry leaders are providing their own personal assistant services, their proprietary internals and implementations are not well known to the public. In this work, we show through real implementation and evaluation that automatic task classification can be implemented for mobile devices by using the support vector machine algorithm and crowdsourcing. To train our task classifier, we collected our training data set via crowdsourcing using the Amazon Mechanical Turk platform. Our classifier can classify a short English sentence into one of the thirty-two predefined tasks that are frequently requested while using personal mobile devices. Evaluation results show high prediction accuracy of our classifier, ranging from 82% to 99%. By using a large amount of crowdsourced data, we also illustrate the relationship between training data size and the prediction accuracy of our task classifier.
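
    A scaled-down version of such a classifier, with a six-sentence inline dataset standing in for the crowdsourced Mechanical Turk corpus (features and task labels here are illustrative assumptions):

        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.pipeline import make_pipeline
        from sklearn.svm import LinearSVC

        sentences = ["set an alarm for 7 am", "wake me up at six",
                     "what is the weather today", "will it rain tomorrow",
                     "call my mother", "dial john's number"]
        tasks = ["alarm", "alarm", "weather", "weather", "call", "call"]

        # TF-IDF features feeding a linear support vector machine
        clf = make_pipeline(TfidfVectorizer(), LinearSVC()).fit(sentences, tasks)
        print(clf.predict(["set the alarm for six", "what is the weather tomorrow"]))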

  19. Machine learning techniques for optical communication system optimization

    DEFF Research Database (Denmark)

    Zibar, Darko; Wass, Jesper; Thrane, Jakob

    In this paper, machine learning techniques relevant to optical communication are presented and discussed. The focus is on applying machine learning tools to optical performance monitoring and performance prediction.

  20. Machine learning for adaptive many-core machines a practical approach

    CERN Document Server

    Lopes, Noel

    2015-01-01

    The overwhelming data produced every day and the increasing performance and cost requirements of applications are transversal to a wide range of activities in society, from science to industry. In particular, the magnitude and complexity of the tasks that Machine Learning (ML) algorithms have to solve are driving the need to devise adaptive many-core machines that scale well with the volume of data, or in other words, can handle Big Data. This book gives a concise view on how to extend the applicability of well-known ML algorithms in Graphics Processing Unit (GPU) with data scalability in mind.

  1. MACHINE LEARNING TECHNIQUES USED IN BIG DATA

    Directory of Open Access Journals (Sweden)

    STEFANIA LOREDANA NITA

    2016-07-01

    The classical tools used in data analysis are not enough in order to benefit from all the advantages of big data. The amount of information is too large for a complete investigation, and possible connections and relations between data could be missed, because it is difficult or even impossible to verify all assumptions about the information. Machine learning is a great solution for finding concealed correlations or relationships between data, because it runs at scale and works very well with large data sets. The more data we have, the more useful the machine learning algorithm becomes, because it “learns” from the existing data and applies the found rules to new entries. In this paper, we present some machine learning algorithms and techniques used in big data.

  2. MoleculeNet: a benchmark for molecular machine learning.

    Science.gov (United States)

    Wu, Zhenqin; Ramsundar, Bharath; Feinberg, Evan N; Gomes, Joseph; Geniesse, Caleb; Pappu, Aneesh S; Leswing, Karl; Pande, Vijay

    2018-01-14

    Molecular machine learning has been maturing rapidly over the last few years. Improved methods and the presence of larger datasets have enabled machine learning algorithms to make increasingly accurate predictions about molecular properties. However, algorithmic progress has been limited due to the lack of a standard benchmark to compare the efficacy of proposed methods; most new algorithms are benchmarked on different datasets making it challenging to gauge the quality of proposed methods. This work introduces MoleculeNet, a large scale benchmark for molecular machine learning. MoleculeNet curates multiple public datasets, establishes metrics for evaluation, and offers high quality open-source implementations of multiple previously proposed molecular featurization and learning algorithms (released as part of the DeepChem open source library). MoleculeNet benchmarks demonstrate that learnable representations are powerful tools for molecular machine learning and broadly offer the best performance. However, this result comes with caveats. Learnable representations still struggle to deal with complex tasks under data scarcity and highly imbalanced classification. For quantum mechanical and biophysical datasets, the use of physics-aware featurizations can be more important than choice of particular learning algorithm.

  3. A distributed algorithm for machine learning

    Science.gov (United States)

    Chen, Shihong

    2018-04-01

    This paper considers a distributed learning problem in which a group of machines in a connected network, each learning from its own local dataset, aims to reach consensus on an optimal model by exchanging information only with their neighbors, without transmitting data. A distributed algorithm is proposed to solve this problem under appropriate assumptions.
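
    A toy rendition of such a scheme, under assumptions of our own (a ring topology, local least-squares objectives, simple neighbor averaging), shows the mechanics: models are exchanged between neighbors, raw data never is:

        import numpy as np

        rng = np.random.default_rng(0)
        w_true = np.array([2.0, -1.0])
        n_machines, lr = 4, 0.1

        # each machine holds a private least-squares dataset
        data = []
        for _ in range(n_machines):
            X = rng.normal(size=(50, 2))
            data.append((X, X @ w_true + 0.1 * rng.normal(size=50)))
        w = [np.zeros(2) for _ in range(n_machines)]

        for _ in range(200):
            # local gradient step on each machine's own data
            w = [w[i] - lr * (X.T @ (X @ w[i] - y) / len(y))
                 for i, (X, y) in enumerate(data)]
            # consensus step: average with ring neighbors (models, not data)
            w = [(w[i - 1] + w[i] + w[(i + 1) % n_machines]) / 3
                 for i in range(n_machines)]

        print(w[0])   # every machine's model approaches w_true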

  4. Interactive Algorithms for Unsupervised Machine Learning

    Science.gov (United States)

    2015-06-01

  5. Efficient tuning in supervised machine learning

    NARCIS (Netherlands)

    Koch, Patrick

    2013-01-01

    The tuning of learning algorithm parameters has become more and more important during the last years. With the fast growth of computational power and available memory databases have grown dramatically. This is very challenging for the tuning of parameters arising in machine learning, since the

  6. Building machines that learn and think like people.

    Science.gov (United States)

    Lake, Brenden M; Ullman, Tomer D; Tenenbaum, Joshua B; Gershman, Samuel J

    2017-01-01

    Recent progress in artificial intelligence has renewed interest in building systems that learn and think like people. Many advances have come from using deep neural networks trained end-to-end in tasks such as object recognition, video games, and board games, achieving performance that equals or even beats that of humans in some respects. Despite their biological inspiration and performance achievements, these systems differ from human intelligence in crucial ways. We review progress in cognitive science suggesting that truly human-like learning and thinking machines will have to reach beyond current engineering trends in both what they learn and how they learn it. Specifically, we argue that these machines should (1) build causal models of the world that support explanation and understanding, rather than merely solving pattern recognition problems; (2) ground learning in intuitive theories of physics and psychology to support and enrich the knowledge that is learned; and (3) harness compositionality and learning-to-learn to rapidly acquire and generalize knowledge to new tasks and situations. We suggest concrete challenges and promising routes toward these goals that can combine the strengths of recent neural network advances with more structured cognitive models.

  7. Machine Learning Phases of Strongly Correlated Fermions

    Directory of Open Access Journals (Sweden)

    Kelvin Ch’ng

    2017-08-01

    Machine learning offers an unprecedented perspective for the problem of classifying phases in condensed matter physics. We employ neural-network machine learning techniques to distinguish finite-temperature phases of the strongly correlated fermions on cubic lattices. We show that a three-dimensional convolutional network trained on auxiliary field configurations produced by quantum Monte Carlo simulations of the Hubbard model can correctly predict the magnetic phase diagram of the model at the average density of one (half filling). We then use the network, trained at half filling, to explore the trend in the transition temperature as the system is doped away from half filling. This transfer learning approach predicts that the instability to the magnetic phase extends to at least 5% doping in this region. Our results pave the way for other machine learning applications in correlated quantum many-body systems.

  8. Machine learning a Bayesian and optimization perspective

    CERN Document Server

    Theodoridis, Sergios

    2015-01-01

    This tutorial text gives a unifying perspective on machine learning by covering both probabilistic and deterministic approaches, which rely on optimization techniques, as well as Bayesian inference, which is based on a hierarchy of probabilistic models. The book presents the major machine learning methods as they have been developed in different disciplines, such as statistics, statistical and adaptive signal processing and computer science. Focusing on the physical reasoning behind the mathematics, all the various methods and techniques are explained in depth, supported by examples and problems, giving an invaluable resource to the student and researcher for understanding and applying machine learning concepts. The book builds carefully from the basic classical methods to the most recent trends, with chapters written to be as self-contained as possible, making the text suitable for different courses: pattern recognition, statistical/adaptive signal processing, statistical/Bayesian learning, as well as shor...

  9. Machine learning: Trends, perspectives, and prospects.

    Science.gov (United States)

    Jordan, M I; Mitchell, T M

    2015-07-17

    Machine learning addresses the question of how to build computers that improve automatically through experience. It is one of today's most rapidly growing technical fields, lying at the intersection of computer science and statistics, and at the core of artificial intelligence and data science. Recent progress in machine learning has been driven both by the development of new learning algorithms and theory and by the ongoing explosion in the availability of online data and low-cost computation. The adoption of data-intensive machine-learning methods can be found throughout science, technology and commerce, leading to more evidence-based decision-making across many walks of life, including health care, manufacturing, education, financial modeling, policing, and marketing. Copyright © 2015, American Association for the Advancement of Science.

  10. Intelligent Machine Learning Approaches for Aerospace Applications

    Science.gov (United States)

    Sathyan, Anoop

    Machine Learning is a type of artificial intelligence that provides machines or networks the ability to learn from data without the need to explicitly program them. There are different kinds of machine learning techniques. This thesis discusses the applications of two of these approaches: Genetic Fuzzy Logic and Convolutional Neural Networks (CNN). Fuzzy Logic System (FLS) is a powerful tool that can be used for a wide variety of applications. FLS is a universal approximator that reduces the need for complex mathematics and replaces it with expert knowledge of the system to produce an input-output mapping using If-Then rules. The expert knowledge of a system can help in obtaining the parameters for small-scale FLSs, but for larger networks we will need to use sophisticated approaches that can automatically train the network to meet the design requirements. This is where Genetic Algorithms (GA) and EVE come into the picture. Both GA and EVE can tune the FLS parameters to minimize a cost function that is designed to meet the requirements of the specific problem. EVE is an artificial intelligence developed by Psibernetix that is trained to tune large scale FLSs. The parameters of an FLS can include the membership functions and rulebase of the inherent Fuzzy Inference Systems (FISs). The main issue with using a genetic fuzzy system (GFS) is that the number of parameters in a FIS increases exponentially with the number of inputs, thus making it increasingly harder to tune them. To reduce this issue, the FLSs discussed in this thesis consist of 2-input-1-output FISs in cascade (Chapter 4) or as a layer of parallel FISs (Chapter 7). We have obtained extremely good results using GFS for different applications at a reduced computational cost compared to other algorithms that are commonly used to solve the corresponding problems. In this thesis, GFSs have been designed for controlling an inverted double pendulum, a task allocation problem of clustering targets amongst a set of UAVs, a fire

  11. Teraflop-scale Incremental Machine Learning

    OpenAIRE

    Özkural, Eray

    2011-01-01

    We propose a long-term memory design for artificial general intelligence based on Solomonoff's incremental machine learning methods. We use R5RS Scheme and its standard library with a few omissions as the reference machine. We introduce a Levin Search variant based on Stochastic Context Free Grammar together with four synergistic update algorithms that use the same grammar as a guiding probability distribution of programs. The update algorithms include adjusting production probabilities, re-u...

  12. Machine Learning-Based Content Analysis: Automating the analysis of frames and agendas in political communication research

    NARCIS (Netherlands)

    Burscher, B.

    2016-01-01

    We used machine learning to study policy issues and frames in political messages. With regard to frames, we investigated the automation of two content-analytical tasks: frame coding and frame identification. We found that both tasks can be successfully automated by means of machine learning

  13. Machine learning paradigms applications in recommender systems

    CERN Document Server

    Lampropoulos, Aristomenis S

    2015-01-01

    This timely book presents applications in recommender systems, which make recommendations using machine learning algorithms trained via examples of content the user likes or dislikes. Recommender systems built on the assumption of availability of both positive and negative examples do not perform well when negative examples are rare. It is exactly this problem that the authors address in the monograph at hand. Specifically, the book's approach is based on one-class classification methodologies that have been appearing in recent machine learning research. The blending of recommender systems and one-class classification provides a new very fertile field for research, innovation and development with potential applications in “big data” as well as “sparse data” problems. The book will be useful to researchers, practitioners and graduate students dealing with problems of extensive and complex data. It is intended for both the expert/researcher in the fields of Pattern Recognition, Machine Learning and ...

  14. Machine learning for identifying botnet network traffic

    DEFF Research Database (Denmark)

    Stevanovic, Matija; Pedersen, Jens Myrup

    2013-01-01

    Due to the promise of non-invasive and resilient detection, botnet detection based on network traffic analysis has drawn special attention from the research community. Furthermore, many authors have turned their attention to the use of machine learning algorithms as a means of inferring botnet-related knowledge from the monitored traffic. This paper presents a review of contemporary botnet detection methods that use machine learning as a tool for identifying botnet-related traffic. The main goal of the paper is to provide a comprehensive overview of the field by summarizing current scientific efforts. The contribution of the paper is three-fold. First, the paper provides detailed insight into the existing detection methods by investigating which bot-related heuristics were assumed by the detection systems and how different machine learning techniques were adapted in order to capture botnet-related knowledge

  15. Parsimonious Wavelet Kernel Extreme Learning Machine

    Directory of Open Access Journals (Sweden)

    Wang Qin

    2015-11-01

    In this study, a parsimonious scheme for the wavelet kernel extreme learning machine (named PWKELM) was introduced by combining wavelet theory and a parsimonious algorithm into the kernel extreme learning machine (KELM). In the wavelet analysis, bases that were localized in time and frequency to represent various signals effectively were used. The wavelet kernel extreme learning machine (WELM) maximized its capability to capture the essential features in “frequency-rich” signals. The proposed parsimonious algorithm also incorporated significant wavelet kernel functions via iteration by virtue of the Householder matrix, thus producing a sparse solution that eased the computational burden and improved numerical stability. The experimental results achieved from the synthetic dataset and a gas furnace instance demonstrated that the proposed PWKELM is efficient and feasible in terms of improving generalization accuracy and real-time performance.

  16. Machine learning in the string landscape

    Science.gov (United States)

    Carifio, Jonathan; Halverson, James; Krioukov, Dmitri; Nelson, Brent D.

    2017-09-01

    We utilize machine learning to study the string landscape. Deep data dives and conjecture generation are proposed as useful frameworks for utilizing machine learning in the landscape, and examples of each are presented. A decision tree accurately predicts the number of weak Fano toric threefolds arising from reflexive polytopes, each of which determines a smooth F-theory compactification, and linear regression generates a previously proven conjecture for the gauge group rank in an ensemble of 4/3 × 2.96 × 10^755 F-theory compactifications. Logistic regression generates a new conjecture for when E6 arises in the large ensemble of F-theory compactifications, which is then rigorously proven. This result may be relevant for the appearance of visible sectors in the ensemble. Through conjecture generation, machine learning is useful not only for numerics, but also for rigorous results.

  17. Stochastic subset selection for learning with kernel machines.

    Science.gov (United States)

    Rhinelander, Jason; Liu, Xiaoping P

    2012-06-01

    Kernel machines have gained much popularity in applications of machine learning. Support vector machines (SVMs) are a subset of kernel machines and generalize well for classification, regression, and anomaly detection tasks. The training procedure for traditional SVMs involves solving a quadratic programming (QP) problem. The QP problem scales super-linearly in computational effort with the number of training samples and is often used for the offline batch processing of data. Kernel machines operate by retaining a subset of observed data during training. The data vectors contained within this subset are referred to as support vectors (SVs). The work presented in this paper introduces a subset selection method for the use of kernel machines in online, changing environments. Our algorithm works by using a stochastic indexing technique when selecting the subset of SVs used in computing the kernel expansion. The work described here is novel because it separates the selection of kernel basis functions from the training algorithm used. The subset selection algorithm presented here can be used in conjunction with any online training technique. It is important for online kernel machines to be computationally efficient due to the real-time requirements of online environments. Our algorithm is an important contribution because it scales linearly with the number of training samples and is compatible with current training techniques. Our algorithm outperforms standard techniques in terms of computational efficiency and provides increased recognition accuracy in our experiments. We provide results from experiments using both simulated and real-world data sets to verify our algorithm.

  18. Virtual Things for Machine Learning Applications

    OpenAIRE

    Bovet , Gérôme; Ridi , Antonio; Hennebert , Jean

    2014-01-01

    Internet-of-Things (IoT) devices, especially sensors, are producing large quantities of data that can be used for gathering knowledge. In this field, machine learning technologies are increasingly used to build versatile data-driven models. In this paper, we present a novel architecture able to execute machine learning algorithms within the sensor network, presenting advantages in terms of privacy and data transfer efficiency. We first argue that some classes of ...

  19. Machine Learning Optimization of Evolvable Artificial Cells

    DEFF Research Database (Denmark)

    Caschera, F.; Rasmussen, S.; Hanczyc, M.

    2011-01-01

    ...can be explored. A machine learning approach (Evo-DoE) could be applied to explore this experimental space and define optimal interactions according to a specific fitness function. Herein, an implementation of an evolutionary design of experiments to optimize chemical and biochemical systems based on a machine learning process is presented. The optimization proceeds over generations of experiments in an iterative loop until optimal compositions are discovered. The fitness function is experimentally measured every time the loop is closed. Two examples of complex systems, namely a liposomal drug formulation

  20. Application of machine learning methods in bioinformatics

    Science.gov (United States)

    Yang, Haoyu; An, Zheng; Zhou, Haotian; Hou, Yawen

    2018-05-01

    Faced with the development of bioinformatics, high-throughput genomic technologies have enabled biology to enter the era of big data [1]. Bioinformatics is an interdisciplinary field that includes the acquisition, management, analysis, interpretation and application of biological information; it derives from the Human Genome Project. The field of machine learning, which aims to develop computer algorithms that improve with experience, holds promise to enable computers to assist humans in the analysis of large, complex data sets [2]. This paper analyzes and compares various machine learning algorithms and their applications in bioinformatics.

  1. Recent Advances in Predictive (Machine) Learning

    Energy Technology Data Exchange (ETDEWEB)

    Friedman, J

    2004-01-24

    Prediction involves estimating the unknown value of an attribute of a system under study given the values of other measured attributes. In prediction (machine) learning the prediction rule is derived from data consisting of previously solved cases. Most methods for predictive learning originated many years ago at the dawn of the computer age. Recently two new techniques have emerged that have revitalized the field. These are support vector machines and boosted decision trees. This paper provides an introduction to these two new methods, tracing their respective ancestral roots to standard kernel methods and ordinary decision trees.
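
    Both of the highlighted techniques are available off the shelf today; the following generic comparison on synthetic data (not from the paper) shows them side by side:

        from sklearn.datasets import make_classification
        from sklearn.ensemble import GradientBoostingClassifier
        from sklearn.model_selection import cross_val_score
        from sklearn.svm import SVC

        X, y = make_classification(n_samples=500, n_features=10, random_state=0)
        # a support vector machine and boosted decision trees, cross-validated
        for model in (SVC(), GradientBoostingClassifier()):
            scores = cross_val_score(model, X, y, cv=5)
            print(type(model).__name__, round(scores.mean(), 3))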

  2. Machine learning enhanced optical distance sensor

    Science.gov (United States)

    Amin, M. Junaid; Riza, N. A.

    2018-01-01

    Presented for the first time is a machine learning enhanced optical distance sensor. The distance sensor is based on our previously demonstrated distance measurement technique that uses an Electronically Controlled Variable Focus Lens (ECVFL) with a laser source to illuminate a target plane with a controlled optical beam spot. This spot, with varying spot sizes, is viewed by an off-axis camera and the spot size data is processed to compute the distance. In particular, proposed and demonstrated in this paper is the use of a regularized polynomial regression based supervised machine learning algorithm to enhance the accuracy of the operational sensor. The algorithm uses the acquired features and corresponding labels, which are the actual target distance values, to train a machine learning model. The training model is optimized over a 1000 mm (or 1 m) experimental target distance range. Using the machine learning algorithm yields low training-set and testing-set distance measurement errors. Applications for the proposed sensor include industrial-scenario distance sensing, where target-material-specific training models can be generated to realize distance measurements with <1% error.
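
    A minimal sketch of the calibration idea described above, using ridge-regularized polynomial regression to map a synthetic spot-size feature to target distance over a 1 m range; the feature model and data are placeholders, not the paper's measurements.

    ```python
    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import PolynomialFeatures
    from sklearn.linear_model import Ridge

    rng = np.random.default_rng(0)
    distance = rng.uniform(0, 1000, 200)  # target labels over a 1000 mm range
    # synthetic spot-size feature: roughly linear in distance, plus noise
    spot_size = 0.5 + 0.004 * distance + rng.normal(0, 0.02, 200)

    # regularized polynomial regression: polynomial features + ridge penalty
    model = make_pipeline(PolynomialFeatures(degree=3), Ridge(alpha=1.0))
    model.fit(spot_size.reshape(-1, 1), distance)
    print(model.predict(spot_size[:3].reshape(-1, 1)))  # predicted distances (mm)
    ```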

  3. Conditional High-Order Boltzmann Machines for Supervised Relation Learning.

    Science.gov (United States)

    Huang, Yan; Wang, Wei; Wang, Liang; Tan, Tieniu

    2017-09-01

    Relation learning is a fundamental problem in many vision tasks. Recently, the high-order Boltzmann machine and its variants have shown great potential in learning various types of data relations in a range of tasks. However, most of these models are learned in an unsupervised way, i.e., without using relation class labels, and are therefore not very discriminative for some challenging tasks, e.g., face verification. In this paper, with the goal of performing supervised relation learning, we introduce relation class labels into conventional high-order multiplicative interactions with pairwise input samples, and propose a conditional high-order Boltzmann machine (CHBM), which can learn to classify data relations in a binary classification way. To be able to deal with more complex data relations, we develop two improved variants of CHBM: 1) latent CHBM, which jointly performs relation feature learning and classification, by using a set of latent variables to block the pathway from pairwise input samples to output relation labels, and 2) gated CHBM, which untangles factors of variation in data relations, by exploiting a set of latent variables to multiplicatively gate the classification of CHBM. To reduce the large number of model parameters generated by the multiplicative interactions, we approximately factorize high-order parameter tensors into multiple matrices. Then, we develop efficient supervised learning algorithms, by first pretraining the models using the joint likelihood to provide good parameter initialization, and then fine-tuning them using the conditional likelihood to enhance the discriminant ability. We apply the proposed models to a series of tasks including invariant recognition, face verification, and action similarity labeling. Experimental results demonstrate that by exploiting supervised relation labels, our models can greatly improve the performance.

  4. Tracking by Machine Learning Methods

    CERN Document Server

    Jofrehei, Arash

    2015-01-01

    Current track reconstruction methods start with two points and then, for each layer, loop through all possible hits to find proper hits to add to that track. Another idea would be to use the large number of already reconstructed events and/or simulated data and train a machine on this data to find tracks given hit pixels. Training time could be long, but real-time tracking is very fast. Simulation might not be as realistic as real data, but tracking has been done for it with 100 percent efficiency, while by using real data we would probably be limited to current efficiency.

  5. Machine learning with quantum relative entropy

    Energy Technology Data Exchange (ETDEWEB)

    Tsuda, Koji [Max Planck Institute for Biological Cybernetics, Spemannstr. 38, Tuebingen, 72076 (Germany)], E-mail: koji.tsuda@tuebingen.mpg.de

    2009-12-01

    Density matrices are a central tool in quantum physics, but they are also used in machine learning. A positive definite matrix called the kernel matrix is used to represent the similarities between examples. Positive definiteness assures that the examples are embedded in a Euclidean space. When a positive definite matrix is learned from data, one has to design an update rule that maintains the positive definiteness. Our update rule, called the matrix exponentiated gradient update, is motivated by the quantum relative entropy. Notably, the relative entropy is an instance of Bregman divergences, which are asymmetric distance measures specifying theoretical properties of machine learning algorithms. Using the calculus commonly used in quantum physics, we prove an upper bound on the generalization error of online learning.
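
    The matrix exponentiated gradient update has a compact form, W <- exp(log W - eta * grad)/Z, which keeps the learned matrix positive definite with unit trace, mirroring a density matrix. A minimal sketch under an illustrative quadratic loss (not the paper's setting):

    ```python
    import numpy as np
    from scipy.linalg import expm, logm

    def meg_step(W, grad, eta=0.1):
        # update in the matrix-log domain, then renormalize to unit trace
        M = expm(logm(W) - eta * grad)
        return M / np.trace(M)

    d = 4
    W = np.eye(d) / d                        # start from the maximally mixed matrix
    target = np.diag([0.7, 0.1, 0.1, 0.1])   # illustrative target density matrix
    for _ in range(100):
        grad = W - target                    # gradient of 0.5 * ||W - target||_F^2
        W = meg_step(W, grad)
    print(np.round(np.diag(W), 3))           # converges toward the target spectrum
    ```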

  6. Machine learning with quantum relative entropy

    International Nuclear Information System (INIS)

    Tsuda, Koji

    2009-01-01

    Density matrices are a central tool in quantum physics, but they are also used in machine learning. A positive definite matrix called the kernel matrix is used to represent the similarities between examples. Positive definiteness assures that the examples are embedded in a Euclidean space. When a positive definite matrix is learned from data, one has to design an update rule that maintains the positive definiteness. Our update rule, called the matrix exponentiated gradient update, is motivated by the quantum relative entropy. Notably, the relative entropy is an instance of Bregman divergences, which are asymmetric distance measures specifying theoretical properties of machine learning algorithms. Using the calculus commonly used in quantum physics, we prove an upper bound on the generalization error of online learning.

  7. Classifying smoking urges via machine learning.

    Science.gov (United States)

    Dumortier, Antoine; Beckjord, Ellen; Shiffman, Saul; Sejdić, Ervin

    2016-12-01

    Smoking is the largest preventable cause of death and disease in the developed world, and advances in modern electronics and machine learning can help us deliver real-time interventions to smokers in novel ways. In this paper, we examine different machine learning approaches that use situational features associated with having or not having urges to smoke during a quit attempt in order to accurately classify high-urge states. To test our machine learning approaches, specifically naive Bayes, discriminant analysis, and decision tree learning methods, we used a dataset collected from over 300 participants who had initiated a quit attempt. The three classification approaches are evaluated by observing sensitivity, specificity, accuracy, and precision. The outcome of the analysis showed that algorithms based on feature selection make it possible to obtain high classification rates with only a few features selected from the entire dataset. The classification tree method outperformed the naive Bayes and discriminant analysis methods, with a classification accuracy of up to 86%. These numbers suggest that machine learning may be a suitable approach to deal with smoking cessation matters and to predict smoking urges, outlining a potential use for mobile health applications. In conclusion, machine learning classifiers can help identify smoking situations, and the search for the best features and classifier parameters significantly improves the algorithms' performance. In addition, this study also supports the usefulness of new technologies in improving the effect of smoking cessation interventions, the management of time and patients by therapists, and thus the optimization of available health care resources. Future studies should focus on providing more adaptive and personalized support to people who really need it, in a minimum amount of time, by developing novel expert systems capable of delivering real-time interventions. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
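
    A minimal sketch of the comparison described above (naive Bayes, discriminant analysis, and a decision tree, scored by sensitivity and specificity) on synthetic stand-in data rather than the study's dataset:

    ```python
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import confusion_matrix

    # synthetic binary "urge / no urge" data as a placeholder
    X, y = make_classification(n_samples=300, n_features=10, random_state=0)
    Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

    for clf in (GaussianNB(), LinearDiscriminantAnalysis(), DecisionTreeClassifier()):
        tn, fp, fn, tp = confusion_matrix(yte, clf.fit(Xtr, ytr).predict(Xte)).ravel()
        print(type(clf).__name__,
              "sensitivity=%.2f" % (tp / (tp + fn)),
              "specificity=%.2f" % (tn / (tn + fp)))
    ```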

  8. Quantum machine learning for quantum anomaly detection

    Science.gov (United States)

    Liu, Nana; Rebentrost, Patrick

    2018-04-01

    Anomaly detection is used for identifying data that deviate from "normal" data patterns. Its usage on classical data finds diverse applications in many important areas such as finance, fraud detection, medical diagnoses, data cleaning, and surveillance. With the advent of quantum technologies, anomaly detection of quantum data, in the form of quantum states, may become an important component of quantum applications. Machine-learning algorithms are playing pivotal roles in anomaly detection using classical data. Two widely used algorithms are the kernel principal component analysis and the one-class support vector machine. We find corresponding quantum algorithms to detect anomalies in quantum states. We show that these two quantum algorithms can be performed using resources that are logarithmic in the dimensionality of quantum states. For pure quantum states, these resources can also be logarithmic in the number of quantum states used for training the machine-learning algorithm. This makes these algorithms potentially applicable to big quantum data applications.

  9. Machine learning on geospatial big data

    CSIR Research Space (South Africa)

    Van Zyl, T

    2014-02-01

    Full Text Available When trying to understand the difference between machine learning and statistics, it is important to note that it is not so much the set of techniques and theory that are used but more importantly the intended use of the results. In fact, many...

  10. ML Confidential : machine learning on encrypted data

    NARCIS (Netherlands)

    Graepel, T.; Lauter, K.; Naehrig, M.; Kwon, T.; Lee, M.-K.; Kwon, D.

    2013-01-01

    We demonstrate that, by using a recently proposed leveled homomorphic encryption scheme, it is possible to delegate the execution of a machine learning algorithm to a computing service while retaining confidentiality of the training and test data. Since the computational complexity of the homomorphic

  11. Machine Learning for Flapping Wing Flight Control

    NARCIS (Netherlands)

    Goedhart, Menno; van Kampen, E.; Armanini, S.F.; de Visser, C.C.; Chu, Q.

    2018-01-01

    Flight control of Flapping Wing Micro Air Vehicles is challenging, because of their complex dynamics and variability due to manufacturing inconsistencies. Machine Learning algorithms can be used to tackle these challenges. A Policy Gradient algorithm is used to tune the gains of a

  12. ML Confidential : machine learning on encrypted data

    NARCIS (Netherlands)

    Graepel, T.; Lauter, K.; Naehrig, M.

    2012-01-01

    We demonstrate that by using a recently proposed somewhat homomorphic encryption (SHE) scheme it is possible to delegate the execution of a machine learning (ML) algorithm to a compute service while retaining confidentiality of the training and test data. Since the computational complexity of the

  13. Document Classification Using Distributed Machine Learning

    OpenAIRE

    Aydin, Galip; Hallac, Ibrahim Riza

    2018-01-01

    In this paper, we investigate the performance and success rates of the Naïve Bayes Classification Algorithm for automatic classification of Turkish news into predetermined categories like economy, life, health, etc. We use Apache Big Data technologies such as Hadoop, HDFS, Spark and Mahout, and apply these distributed technologies to Machine Learning.
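
    The paper runs Naïve Bayes on distributed platforms (Spark, Mahout); as a single-machine analogue, a minimal sketch of bag-of-words text classification with illustrative Turkish snippets and placeholder category labels:

    ```python
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    # placeholder documents and labels, standing in for a real news corpus
    docs = ["borsa yükseldi", "sağlıklı beslenme önerileri", "faiz kararı açıklandı"]
    labels = ["economy", "health", "economy"]

    # TF-IDF features feeding a multinomial naive Bayes classifier
    clf = make_pipeline(TfidfVectorizer(), MultinomialNB())
    clf.fit(docs, labels)
    print(clf.predict(["merkez bankası faiz"]))  # expected: ['economy']
    ```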

  14. The ATLAS Higgs Machine Learning Challenge

    CERN Document Server

    Cowan, Glen; The ATLAS collaboration; Bourdarios, Claire

    2015-01-01

    High Energy Physics has been using Machine Learning techniques (commonly known as Multivariate Analysis) since the 1990s with Artificial Neural Networks and more recently with Boosted Decision Trees, Random Forests, etc. Meanwhile, Machine Learning has become a full-blown field of computer science. With the emergence of Big Data, data scientists are developing new Machine Learning algorithms to extract meaning from large heterogeneous data. HEP has exciting and difficult problems like the extraction of the Higgs boson signal, and at the same time data scientists have advanced algorithms: the goal of the HiggsML project was to bring the two together by a “challenge”: participants from all over the world and any scientific background could compete online to obtain the best Higgs to tau tau signal significance on a set of ATLAS fully simulated Monte Carlo signal and background. Instead of HEP physicists browsing through machine learning papers and trying to infer which new algorithms might be useful for HEP, then c...

  15. Parallelization of TMVA Machine Learning Algorithms

    CERN Document Server

    Hajili, Mammad

    2017-01-01

    This report reflects my work on the parallelization of TMVA Machine Learning Algorithms integrated into the ROOT Data Analysis Framework during a summer internship at CERN. The report consists of four important parts: the data set used in training and validation, the algorithms to which multiprocessing was applied, the parallelization techniques, and the resulting changes in execution time as the number of workers varies.

  16. Prototype-based models in machine learning

    NARCIS (Netherlands)

    Biehl, Michael; Hammer, Barbara; Villmann, Thomas

    2016-01-01

    An overview is given of prototype-based models in machine learning. In this framework, observations, i.e., data, are stored in terms of typical representatives. Together with a suitable measure of similarity, the systems can be employed in the context of unsupervised and supervised analysis of

  17. Supporting visual quality assessment with machine learning

    NARCIS (Netherlands)

    Gastaldo, P.; Zunino, R.; Redi, J.

    2013-01-01

    Objective metrics for visual quality assessment often base their reliability on the explicit modeling of the highly non-linear behavior of human perception; as a result, they may be complex and computationally expensive. Conversely, machine learning (ML) paradigms allow to tackle the quality

  18. Optimizing Distributed Machine Learning for Large Scale EEG Data Set

    Directory of Open Access Journals (Sweden)

    M Bilal Shaikh

    2017-06-01

    Full Text Available Distributed Machine Learning (DML) has gained its importance more than ever in this era of Big Data. There are a lot of challenges to scale machine learning techniques on distributed platforms. When it comes to scalability, improving the processor technology for high-level computation of data is at its limit; however, increasing machine nodes and distributing data along with computation looks like a viable solution. Different frameworks and platforms are available to solve DML problems. These platforms provide automated random data distribution of datasets, which misses the power of user-defined intelligent data partitioning based on domain knowledge. We have conducted an empirical study which uses an EEG data set collected through the P300 Speller component of an ERP (Event-Related Potential), which is widely used in BCI problems; it helps in translating the intention of a subject while performing any cognitive task. EEG data contains noise due to waves generated by other activities in the brain, which contaminates the true P300 Speller signal. Use of machine learning techniques could help in detecting errors made by the P300 Speller. We are solving this classification problem by partitioning data into different chunks and preparing distributed models using an Elastic CV Classifier. To present a case of optimizing distributed machine learning, we propose an intelligent user-defined data partitioning approach that can improve the average accuracy of distributed machine learners. Our results show better average AUC as compared to the average AUC obtained after applying random data partitioning, which gives no control to the user over data partitioning. It improves the average accuracy of the distributed learner due to the domain-specific intelligent partitioning by the user. Our customized approach achieves 0.66 AUC on individual sessions and 0.75 AUC on mixed sessions, whereas random/uncontrolled data distribution records 0.63 AUC.
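
    A minimal sketch of the partition-then-train idea: split data by a hypothetical session key (standing in for the user-defined, domain-driven partitioning), fit an elastic-net classifier per chunk, and average AUC across chunks. Data and keys are synthetic placeholders.

    ```python
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import roc_auc_score

    X, y = make_classification(n_samples=600, n_features=20, random_state=0)
    sessions = np.repeat([0, 1, 2], 200)   # hypothetical domain-based partition key

    aucs = []
    for s in np.unique(sessions):
        Xs, ys = X[sessions == s], y[sessions == s]
        Xtr, Xte, ytr, yte = train_test_split(Xs, ys, random_state=0)
        # elastic-net penalized logistic regression per partition
        clf = LogisticRegression(penalty="elasticnet", solver="saga",
                                 l1_ratio=0.5, max_iter=5000)
        clf.fit(Xtr, ytr)
        aucs.append(roc_auc_score(yte, clf.predict_proba(Xte)[:, 1]))
    print("average AUC across partitions: %.2f" % np.mean(aucs))
    ```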

  19. Machine learning a theoretical approach

    CERN Document Server

    Natarajan, Balas K

    2014-01-01

    This is the first comprehensive introduction to computational learning theory. The author's uniform presentation of fundamental results and their applications offers AI researchers a theoretical perspective on the problems they study. The book presents tools for the analysis of probabilistic models of learning, tools that crisply classify what is and is not efficiently learnable. After a general introduction to Valiant's PAC paradigm and the important notion of the Vapnik-Chervonenkis dimension, the author explores specific topics such as finite automata and neural networks. The presentation

  20. Machine Learning and Quantum Mechanics

    Science.gov (United States)

    Chapline, George

    The author has previously pointed out some similarities between self-organizing neural networks and quantum mechanics. These types of neural networks were originally conceived of as a way of emulating the cognitive capabilities of the human brain. Recently, extensions of these networks, collectively referred to as deep learning networks, have strengthened the connection between self-organizing neural networks and human cognitive capabilities. In this note we consider whether hardware quantum devices might be useful for emulating neural networks with human-like cognitive capabilities, or alternatively whether implementations of deep learning neural networks using conventional computers might lead to better algorithms for solving the many-body Schrödinger equation.

  1. Extracting meaning from audio signals - a machine learning approach

    DEFF Research Database (Denmark)

    Larsen, Jan

    2007-01-01

    * Machine learning framework for sound search * Genre classification * Music and audio separation * Wind noise suppression

  2. Evaluation on knowledge extraction and machine learning in ...

    African Journals Online (AJOL)

    Evaluation on knowledge extraction and machine learning in resolving Malay word ambiguity. ... Keywords: ambiguity; lexical knowledge; machine learning; Malay word ...

  3. Channelized relevance vector machine as a numerical observer for cardiac perfusion defect detection task

    Science.gov (United States)

    Kalayeh, Mahdi M.; Marin, Thibault; Pretorius, P. Hendrik; Wernick, Miles N.; Yang, Yongyi; Brankov, Jovan G.

    2011-03-01

    In this paper, we present a numerical observer for image quality assessment, aiming to predict human observer accuracy in a cardiac perfusion defect detection task for single-photon emission computed tomography (SPECT). In medical imaging, image quality should be assessed by evaluating the human observer accuracy for a specific diagnostic task. This approach is known as task-based assessment. Such evaluations are important for optimizing and testing imaging devices and algorithms. Unfortunately, human observer studies with expert readers are costly and time-demanding. To address this problem, numerical observers have been developed as a surrogate for human readers to predict human diagnostic performance. The channelized Hotelling observer (CHO) with internal noise model has been found to predict human performance well in some situations, but does not always generalize well to unseen data. We have argued in the past that finding a model to predict human observers could be viewed as a machine learning problem. Following this approach, in this paper we propose a channelized relevance vector machine (CRVM) to predict human diagnostic scores in a detection task. We have previously used channelized support vector machines (CSVM) to predict human scores and have shown that this approach offers better and more robust predictions than the classical CHO method. The comparison of the proposed CRVM with our previously introduced CSVM method suggests that CRVM can achieve similar generalization accuracy, while dramatically reducing model complexity and computation time.

  4. Machine Learning for Neuroimaging with Scikit-Learn

    Directory of Open Access Journals (Sweden)

    Alexandre eAbraham

    2014-02-01

    Full Text Available Statistical machine learning methods are increasingly used for neuroimaging data analysis. Their main virtue is their ability to model high-dimensional datasets, e.g. multivariate analysis of activation images or resting-state time series. Supervised learning is typically used in decoding or encoding settings to relate brain images to behavioral or clinical observations, while unsupervised learning can uncover hidden structures in sets of images (e.g. resting-state functional MRI) or find sub-populations in large cohorts. By considering different functional neuroimaging applications, we illustrate how scikit-learn, a Python machine learning library, can be used to perform some key analysis steps. Scikit-learn contains a very large set of statistical learning algorithms, both supervised and unsupervised, and its application to neuroimaging data provides a versatile tool to study the brain.

  5. Machine learning for neuroimaging with scikit-learn.

    Science.gov (United States)

    Abraham, Alexandre; Pedregosa, Fabian; Eickenberg, Michael; Gervais, Philippe; Mueller, Andreas; Kossaifi, Jean; Gramfort, Alexandre; Thirion, Bertrand; Varoquaux, Gaël

    2014-01-01

    Statistical machine learning methods are increasingly used for neuroimaging data analysis. Their main virtue is their ability to model high-dimensional datasets, e.g., multivariate analysis of activation images or resting-state time series. Supervised learning is typically used in decoding or encoding settings to relate brain images to behavioral or clinical observations, while unsupervised learning can uncover hidden structures in sets of images (e.g., resting state functional MRI) or find sub-populations in large cohorts. By considering different functional neuroimaging applications, we illustrate how scikit-learn, a Python machine learning library, can be used to perform some key analysis steps. Scikit-learn contains a very large set of statistical learning algorithms, both supervised and unsupervised, and its application to neuroimaging data provides a versatile tool to study the brain.
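
    A minimal sketch of the decoding and clustering steps described above with scikit-learn, using synthetic high-dimensional features as a stand-in for activation images:

    ```python
    from sklearn.datasets import make_classification
    from sklearn.svm import LinearSVC
    from sklearn.model_selection import cross_val_score
    from sklearn.cluster import KMeans

    # placeholder "brain images": many features, few informative dimensions
    X, y = make_classification(n_samples=100, n_features=500,
                               n_informative=20, random_state=0)

    # supervised decoding: predict behavioral labels from the images
    print("decoding accuracy: %.2f" % cross_val_score(LinearSVC(), X, y, cv=5).mean())

    # unsupervised structure discovery, e.g. finding sub-populations
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
    ```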

  6. Financial signal processing and machine learning

    CERN Document Server

    Kulkarni, Sanjeev R.; Malioutov, Dmitry M.

    2016-01-01

    The modern financial industry has been required to deal with large and diverse portfolios in a variety of asset classes often with limited market data available. Financial Signal Processing and Machine Learning unifies a number of recent advances made in signal processing and machine learning for the design and management of investment portfolios and financial engineering. This book bridges the gap between these disciplines, offering the latest information on key topics including characterizing statistical dependence and correlation in high dimensions, constructing effective and robust risk measures, and their use in portfolio optimization and rebalancing. The book focuses on signal processing approaches to model return, momentum, and mean reversion, addressing theoretical and implementation aspects. It highlights the connections between portfolio theory, sparse learning and compressed sensing, sparse eigen-portfolios, robust optimization, non-Gaussian data-driven risk measures, graphical models, causal analy...

  7. MLnet report: training in Europe on machine learning

    OpenAIRE

    Ellebrecht, Mario; Morik, Katharina

    1999-01-01

    Machine learning techniques offer opportunities for a variety of applications and the theory of machine learning investigates problems that are of interest for other fields of computer science (e.g., complexity theory, logic programming, pattern recognition). However, the impacts of machine learning can only be recognized by those who know the techniques and are able to apply them. Hence, teaching machine learning is necessary before this field can diversify computer science. In order ...

  8. A Machine Learning Concept for DTN Routing

    Science.gov (United States)

    Dudukovich, Rachel; Hylton, Alan; Papachristou, Christos

    2017-01-01

    This paper discusses the concept and architecture of a machine learning based router for delay-tolerant space networks. The techniques of reinforcement learning and Bayesian learning are used to supplement the routing decisions of the popular Contact Graph Routing algorithm. An introduction to the concepts of Contact Graph Routing, Q-routing and Naive Bayes classification is given. The development of an architecture for a cross-layer feedback framework for DTN (Delay-Tolerant Networking) protocols is discussed. Finally, the initial simulation setup and results are given.
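
    A minimal sketch of the Q-routing concept mentioned above: each node keeps Q[d][y], an estimated delivery time to destination d via neighbor y, updated from the neighbor's best estimate. The topology, delays, and names are illustrative placeholders, not the paper's DTN simulation.

    ```python
    import random

    neighbors = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
    # Q[x][d][y]: node x's estimated delivery time to destination d via neighbor y
    Q = {x: {d: {y: 10.0 for y in neighbors[x]} for d in neighbors} for x in neighbors}

    def q_route_update(x, d, y, transit_delay, eta=0.5):
        # t: neighbor y's current best estimate for reaching d (0 if y is d)
        t = 0.0 if y == d else min(Q[y][d].values())
        Q[x][d][y] += eta * (transit_delay + t - Q[x][d][y])

    for _ in range(1000):  # simulated forwarding events with random delays
        x = random.choice(list(neighbors))
        d = random.choice([n for n in neighbors if n != x])
        y = random.choice(neighbors[x])
        q_route_update(x, d, y, transit_delay=random.uniform(0.5, 1.5))
    ```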

  9. Toward accelerating landslide mapping with interactive machine learning techniques

    Science.gov (United States)

    Stumpf, André; Lachiche, Nicolas; Malet, Jean-Philippe; Kerle, Norman; Puissant, Anne

    2013-04-01

    Despite important advances in the development of more automated methods for landslide mapping from optical remote sensing images, the elaboration of inventory maps after major triggering events still remains a tedious task. Image classification with expert-defined rules typically still requires significant manual labour for the elaboration and adaptation of rule sets for each particular case. Machine learning algorithms, by contrast, have the ability to learn and identify complex image patterns from labelled examples, but may require relatively large amounts of training data. In order to reduce the amount of required training data, active learning has evolved as a key concept to guide the sampling for applications such as document classification, genetics and remote sensing. The general underlying idea of most active learning approaches is to initialize a machine learning model with a small training set, and to subsequently exploit the model state and/or the data structure to iteratively select the most valuable samples that should be labelled by the user and added to the training set. With relatively few queries and labelled samples, an active learning strategy should ideally yield at least the same accuracy as an equivalent classifier trained with many randomly selected samples. Our study was dedicated to the development of an active learning approach for landslide mapping from VHR remote sensing images with special consideration of the spatial distribution of the samples. The developed approach is a region-based query heuristic that enables guiding the user's attention towards a few compact spatial batches rather than distributed points, resulting in time savings of 50% and more compared to standard active learning techniques. The approach was tested with multi-temporal and multi-sensor satellite images capturing recent large-scale triggering events in Brazil and China and demonstrated balanced user's and producer's accuracies between 74% and 80%. The assessment also
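
    A minimal sketch of the active-learning loop described above, using simple uncertainty sampling on synthetic data; the study's region-based batch heuristic is not reproduced here.

    ```python
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    X, y = make_classification(n_samples=500, random_state=0)
    labeled = list(range(10))                      # small initial training set
    pool = [i for i in range(len(X)) if i not in labeled]

    clf = LogisticRegression(max_iter=1000)
    for _ in range(20):                            # 20 queries to the "user"
        clf.fit(X[labeled], y[labeled])
        proba = clf.predict_proba(X[pool])
        margin = np.abs(proba[:, 1] - 0.5)         # low margin = most uncertain
        q = pool.pop(int(np.argmin(margin)))       # query the most valuable sample
        labeled.append(q)                          # oracle reveals y[q]
    ```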

  10. A Parameter Communication Optimization Strategy for Distributed Machine Learning in Sensors.

    Science.gov (United States)

    Zhang, Jilin; Tu, Hangdi; Ren, Yongjian; Wan, Jian; Zhou, Li; Li, Mingwei; Wang, Jue; Yu, Lifeng; Zhao, Chang; Zhang, Lei

    2017-09-21

    In order to utilize the distributed characteristic of sensors, distributed machine learning has become the mainstream approach, but the different computing capability of sensors and network delays greatly influence the accuracy and the convergence rate of the machine learning model. Our paper describes a reasonable parameter communication optimization strategy to balance the training overhead and the communication overhead. We extend the fault tolerance of iterative-convergent machine learning algorithms and propose the Dynamic Finite Fault Tolerance (DFFT). Based on the DFFT, we implement a parameter communication optimization strategy for distributed machine learning, named Dynamic Synchronous Parallel Strategy (DSP), which uses the performance monitoring model to dynamically adjust the parameter synchronization strategy between worker nodes and the Parameter Server (PS). This strategy makes full use of the computing power of each sensor, ensures the accuracy of the machine learning model, and avoids the situation that the model training is disturbed by any tasks unrelated to the sensors.

  11. Machine Learning and Inverse Problem in Geodynamics

    Science.gov (United States)

    Shahnas, M. H.; Yuen, D. A.; Pysklywec, R.

    2017-12-01

    During the past few decades numerical modeling and traditional HPC have been widely deployed in many diverse fields for problem solutions. However, in recent years the rapid emergence of machine learning (ML), a subfield of the artificial intelligence (AI), in many fields of sciences, engineering, and finance seems to mark a turning point in the replacement of traditional modeling procedures with artificial intelligence-based techniques. The study of the circulation in the interior of Earth relies on the study of high pressure mineral physics, geochemistry, and petrology where the number of the mantle parameters is large and the thermoelastic parameters are highly pressure- and temperature-dependent. More complexity arises from the fact that many of these parameters that are incorporated in the numerical models as input parameters are not yet well established. In such complex systems the application of machine learning algorithms can play a valuable role. Our focus in this study is the application of supervised machine learning (SML) algorithms in predicting mantle properties with the emphasis on SML techniques in solving the inverse problem. As a sample problem we focus on the spin transition in ferropericlase and perovskite that may cause slab and plume stagnation at mid-mantle depths. The degree of the stagnation depends on the degree of negative density anomaly at the spin transition zone. The training and testing samples for the machine learning models are produced by the numerical convection models with known magnitudes of density anomaly (as the class labels of the samples). The volume fractions of the stagnated slabs and plumes which can be considered as measures for the degree of stagnation are assigned as sample features. The machine learning models can determine the magnitude of the spin transition-induced density anomalies that can cause flow stagnation at mid-mantle depths. Employing support vector machine (SVM) algorithms we show that SML techniques

  12. A review of machine learning in obesity.

    Science.gov (United States)

    DeGregory, K W; Kuiper, P; DeSilvio, T; Pleuss, J D; Miller, R; Roginski, J W; Fisher, C B; Harness, D; Viswanath, S; Heymsfield, S B; Dungan, I; Thomas, D M

    2018-05-01

    Rich sources of obesity-related data arising from sensors, smartphone apps, electronic medical health records and insurance data can bring new insights for understanding, preventing and treating obesity. For such large datasets, machine learning provides sophisticated and elegant tools to describe, classify and predict obesity-related risks and outcomes. Here, we review machine learning methods that predict and/or classify, such as linear and logistic regression, artificial neural networks, deep learning and decision tree analysis. We also review methods that describe and characterize data, such as cluster analysis, principal component analysis, network science and topological data analysis. We introduce each method with a high-level overview followed by examples of successful applications. The algorithms were then applied to the National Health and Nutrition Examination Survey to demonstrate methodology, utility and outcomes. The strengths and limitations of each method were also evaluated. This summary of machine learning algorithms provides a unique overview of the state of data analysis applied specifically to obesity. © 2018 World Obesity Federation.

  13. Machine learning in geosciences and remote sensing

    Directory of Open Access Journals (Sweden)

    David J. Lary

    2016-01-01

    Full Text Available Learning incorporates a broad range of complex procedures. Machine learning (ML) is a subdivision of artificial intelligence based on the biological learning process. The ML approach deals with the design of algorithms to learn from machine-readable data. ML covers main domains such as data mining, difficult-to-program applications, and software applications. It is a collection of a variety of algorithms (e.g. neural networks, support vector machines, self-organizing maps, decision trees, random forests, case-based reasoning, genetic programming, etc.) that can provide multivariate, nonlinear, nonparametric regression or classification. The modeling capabilities of the ML-based methods have resulted in their extensive applications in science and engineering. Herein, the role of ML as an effective approach for solving problems in geosciences and remote sensing will be highlighted. The unique features of some of the ML techniques will be outlined with specific attention to the genetic programming paradigm. Furthermore, nonparametric regression and classification illustrative examples are presented to demonstrate the efficiency of ML for tackling geosciences and remote sensing problems.

  14. From machine learning to deep learning: progress in machine intelligence for rational drug discovery.

    Science.gov (United States)

    Zhang, Lu; Tan, Jianjun; Han, Dan; Zhu, Hao

    2017-11-01

    Machine intelligence, which is normally presented as artificial intelligence, refers to the intelligence exhibited by computers. In the history of rational drug discovery, various machine intelligence approaches have been applied to guide traditional experiments, which are expensive and time-consuming. Over the past several decades, machine-learning tools, such as quantitative structure-activity relationship (QSAR) modeling, were developed that can identify potential biological active molecules from millions of candidate compounds quickly and cheaply. However, when drug discovery moved into the era of 'big' data, machine learning approaches evolved into deep learning approaches, which are a more powerful and efficient way to deal with the massive amounts of data generated from modern drug discovery approaches. Here, we summarize the history of machine learning and provide insight into recently developed deep learning approaches and their applications in rational drug discovery. We suggest that this evolution of machine intelligence now provides a guide for early-stage drug design and discovery in the current big data era. Copyright © 2017 Elsevier Ltd. All rights reserved.

  15. Multi-task Vector Field Learning.

    Science.gov (United States)

    Lin, Binbin; Yang, Sen; Zhang, Chiyuan; Ye, Jieping; He, Xiaofei

    2012-01-01

    Multi-task learning (MTL) aims to improve generalization performance by learning multiple related tasks simultaneously and identifying the shared information among tasks. Most existing MTL methods focus on learning linear models under the supervised setting. We propose a novel semi-supervised and nonlinear approach for MTL using vector fields. A vector field is a smooth mapping from the manifold to the tangent spaces, which can be viewed as a directional derivative of functions on the manifold. We argue that vector fields provide a natural way to exploit the geometric structure of data as well as the shared differential structure of tasks, both of which are crucial for semi-supervised multi-task learning. In this paper, we develop multi-task vector field learning (MTVFL), which learns the predictor functions and the vector fields simultaneously. MTVFL has the following key properties. (1) The vector fields MTVFL learns are close to the gradient fields of the predictor functions. (2) Within each task, the vector field is required to be as parallel as possible, which is expected to span a low-dimensional subspace. (3) The vector fields from all tasks share a low-dimensional subspace. We formalize our idea in a regularization framework and also provide a convex relaxation method to solve the original non-convex problem. The experimental results on synthetic and real data demonstrate the effectiveness of our proposed approach.

  16. Machine learning analysis of binaural rowing sounds

    DEFF Research Database (Denmark)

    Johard, Leonard; Ruffaldi, Emanuele; Hoffmann, Pablo F.

    2011-01-01

    Techniques for machine hearing are increasing their potentiality due to new application domains. In this work we are addressing the analysis of rowing sounds in natural context for the purpose of supporting a training system based on virtual environments. This paper presents the acquisition metho...... methodology and the evaluation of different machine learning techniques for classifying rowing-sound data. We see that a combination of principal component analysis and shallow networks perform equally well as deep architectures, while being much faster to train.......Techniques for machine hearing are increasing their potentiality due to new application domains. In this work we are addressing the analysis of rowing sounds in natural context for the purpose of supporting a training system based on virtual environments. This paper presents the acquisition...

  17. Learning About Climate and Atmospheric Models Through Machine Learning

    Science.gov (United States)

    Lucas, D. D.

    2017-12-01

    From the analysis of ensemble variability to improving simulation performance, machine learning algorithms can play a powerful role in understanding the behavior of atmospheric and climate models. To learn about model behavior, we create training and testing data sets through ensemble techniques that sample different model configurations and values of input parameters, and then use supervised machine learning to map the relationships between the inputs and outputs. Following this procedure, we have used support vector machines, random forests, gradient boosting and other methods to investigate a variety of atmospheric and climate model phenomena. We have used machine learning to predict simulation crashes, estimate the probability density function of climate sensitivity, optimize simulations of the Madden Julian oscillation, assess the impacts of weather and emissions uncertainty on atmospheric dispersion, and quantify the effects of model resolution changes on precipitation. This presentation highlights recent examples of our applications of machine learning to improve the understanding of climate and atmospheric models. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.
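
    A minimal sketch of the ensemble-emulation workflow described above: sample model input parameters, run a (here, toy) simulator, and fit a supervised learner mapping inputs to outputs. All data are synthetic placeholders for real climate-model runs.

    ```python
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    params = rng.uniform(0, 1, size=(200, 3))      # sampled input configurations
    # toy "simulator" output with noise, standing in for a model diagnostic
    output = params[:, 0] * np.sin(5 * params[:, 1]) + 0.1 * rng.normal(size=200)

    Xtr, Xte, ytr, yte = train_test_split(params, output, random_state=0)
    emulator = RandomForestRegressor(n_estimators=200, random_state=0).fit(Xtr, ytr)
    print("R^2 on held-out runs: %.2f" % emulator.score(Xte, yte))
    ```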

  18. A Teaching System To Learn Programming: the Programmer's Learning Machine

    OpenAIRE

    Quinson , Martin; Oster , Gérald

    2015-01-01

    International audience; The Programmer's Learning Machine (PLM) is an interactive exerciser for learning programming and algorithms. Using an integrated and graphical environment that provides a short feedback loop, it allows students to learn in a (semi)-autonomous way. This generic platform also enables teachers to create specific programming microworlds that match their teaching goals. This paper discusses our design goals and motivations, introduces the existing material and the proposed ...

  19. Stochastic Synapses Enable Efficient Brain-Inspired Learning Machines

    Science.gov (United States)

    Neftci, Emre O.; Pedroni, Bruno U.; Joshi, Siddharth; Al-Shedivat, Maruan; Cauwenberghs, Gert

    2016-01-01

    Recent studies have shown that synaptic unreliability is a robust and sufficient mechanism for inducing the stochasticity observed in cortex. Here, we introduce Synaptic Sampling Machines (S2Ms), a class of neural network models that uses synaptic stochasticity as a means to Monte Carlo sampling and unsupervised learning. Similar to the original formulation of Boltzmann machines, these models can be viewed as a stochastic counterpart of Hopfield networks, but where stochasticity is induced by a random mask over the connections. Synaptic stochasticity plays the dual role of an efficient mechanism for sampling, and a regularizer during learning akin to DropConnect. A local synaptic plasticity rule implementing an event-driven form of contrastive divergence enables the learning of generative models in an on-line fashion. S2Ms perform equally well using discrete-timed artificial units (as in Hopfield networks) or continuous-timed leaky integrate and fire neurons. The learned representations are remarkably sparse and robust to reductions in bit precision and synapse pruning: removal of more than 75% of the weakest connections followed by cursory re-learning causes a negligible performance loss on benchmark classification tasks. The spiking neuron-based S2Ms outperform existing spike-based unsupervised learners, while potentially offering substantial advantages in terms of power and complexity, and are thus promising models for on-line learning in brain-inspired hardware. PMID:27445650

  20. Unique sensor fusion system for coordinate-measuring machine tasks

    Science.gov (United States)

    Nashman, Marilyn; Yoshimi, Billibon; Hong, Tsai Hong; Rippey, William G.; Herman, Martin

    1997-09-01

    This paper describes a real-time hierarchical system that fuses data from vision and touch sensors to improve the performance of a coordinate measuring machine (CMM) used for dimensional inspection tasks. The system consists of sensory processing, world modeling, and task decomposition modules. It uses the strengths of each sensor -- the precision of the CMM scales and the analog touch probe and the global information provided by the low resolution camera -- to improve the speed and flexibility of the inspection task. In the experiment described, the vision module performs all computations in image coordinate space. The part's boundaries are extracted during an initialization process and then the probe's position is continuously updated as it scans and measures the part surface. The system fuses the estimated probe velocity and distance to the part boundary in image coordinates with the estimated velocity and probe position provided by the CMM controller. The fused information provides feedback to the monitor controller as it guides the touch probe to scan the part. We also discuss integrating information from the vision system and the probe to autonomously collect data for 2-D to 3-D calibration, and work to register computer aided design (CAD) models with images of parts in the workplace.

  1. Parallelization of the ROOT Machine Learning Methods

    CERN Document Server

    Vakilipourtakalou, Pourya

    2016-01-01

    Today, computation is an inseparable part of scientific research, especially in Particle Physics, where there are classification problems such as discriminating Signals from Backgrounds originating from the collisions of particles. On the other hand, Monte Carlo simulations can be used in order to generate a known data set of Signals and Backgrounds based on theoretical physics. The aim of Machine Learning is to train some algorithms on a known data set and then apply these trained algorithms to unknown data sets. However, the most common framework for data analysis in Particle Physics is ROOT. In order to use Machine Learning methods, a Toolkit for Multivariate Data Analysis (TMVA) has been added to ROOT. The major consideration in this report is the parallelization of some TMVA methods, especially Cross-Validation and BDT.

  2. Distinguishing Asthma Phenotypes Using Machine Learning Approaches.

    Science.gov (United States)

    Howard, Rebecca; Rattray, Magnus; Prosperi, Mattia; Custovic, Adnan

    2015-07-01

    Asthma is not a single disease, but an umbrella term for a number of distinct diseases, each of which are caused by a distinct underlying pathophysiological mechanism. These discrete disease entities are often labelled as 'asthma endotypes'. The discovery of different asthma subtypes has moved from subjective approaches in which putative phenotypes are assigned by experts to data-driven ones which incorporate machine learning. This review focuses on the methodological developments of one such machine learning technique-latent class analysis-and how it has contributed to distinguishing asthma and wheezing subtypes in childhood. It also gives a clinical perspective, presenting the findings of studies from the past 5 years that used this approach. The identification of true asthma endotypes may be a crucial step towards understanding their distinct pathophysiological mechanisms, which could ultimately lead to more precise prevention strategies, identification of novel therapeutic targets and the development of effective personalized therapies.

  3. Designing anticancer peptides by constructive machine learning.

    Science.gov (United States)

    Grisoni, Francesca; Neuhaus, Claudia; Gabernet, Gisela; Müller, Alex; Hiss, Jan; Schneider, Gisbert

    2018-04-21

    Constructive machine learning enables the automated generation of novel chemical structures without the need for explicit molecular design rules. This study presents the experimental application of such a generative model to design membranolytic anticancer peptides (ACPs) de novo. A recurrent neural network with long short-term memory cells was trained on alpha-helical cationic amphipathic peptide sequences and then fine-tuned with 26 known ACPs. This optimized model was used to generate unique and novel amino acid sequences. Twelve of the peptides were synthesized and tested for their activity on MCF7 human breast adenocarcinoma cells and selectivity against human erythrocytes. Ten of these peptides were active against cancer cells. Six of the active peptides killed MCF7 cancer cells without affecting human erythrocytes with at least threefold selectivity. These results advocate constructive machine learning for the automated design of peptides with desired biological activities. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  4. Machine learning for micro-tomography

    Science.gov (United States)

    Parkinson, Dilworth Y.; Pelt, Daniël. M.; Perciano, Talita; Ushizima, Daniela; Krishnan, Harinarayan; Barnard, Harold S.; MacDowell, Alastair A.; Sethian, James

    2017-09-01

    Machine learning has revolutionized a number of fields, but many micro-tomography users have never used it for their work. The micro-tomography beamline at the Advanced Light Source (ALS), in collaboration with the Center for Applied Mathematics for Energy Research Applications (CAMERA) at Lawrence Berkeley National Laboratory, has now deployed a series of tools to automate data processing for ALS users using machine learning. This includes new reconstruction algorithms, feature extraction tools, and image classification and recommendation systems for scientific images. Some of these tools are either in automated pipelines that operate on data as it is collected or provided as stand-alone software. Others are deployed on computing resources at Berkeley Lab, from workstations to supercomputers, and made accessible to users through either scripting or easy-to-use graphical interfaces. This paper presents a progress report on this work.

  5. Randomized Algorithms for Scalable Machine Learning

    OpenAIRE

    Kleiner, Ariel Jacob

    2012-01-01

    Many existing procedures in machine learning and statistics are computationally intractable in the setting of large-scale data. As a result, the advent of rapidly increasing dataset sizes, which should be a boon yielding improved statistical performance, instead severely blunts the usefulness of a variety of existing inferential methods. In this work, we use randomness to ameliorate this lack of scalability by reducing complex, computationally difficult inferential problems to larger sets o...

  6. Two-Dimensional Extreme Learning Machine

    Directory of Open Access Journals (Sweden)

    Bo Jia

    2015-01-01

    (BP) networks. However, like many other methods, ELM was originally proposed to handle vector patterns, while nonvector patterns in real applications, such as image data, need to be explored. We propose the two-dimensional extreme learning machine (2DELM) based on the very natural idea of dealing with matrix data directly. Unlike the original ELM, which handles vectors, 2DELM takes matrices as input features without vectorization. Empirical studies on several real image datasets show the efficiency and effectiveness of the algorithm.
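
    A minimal sketch of the ELM recipe underlying 2DELM: a random, fixed hidden layer followed by a least-squares output fit. The 2D variant replaces the input projection with left/right matrix projections; this sketch shows the standard vector case on toy data.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10))                 # toy inputs
    y = (X[:, 0] + X[:, 1] > 0).astype(float)      # toy binary labels

    n_hidden = 50
    W = rng.normal(size=(10, n_hidden))            # random input weights (never trained)
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)                         # hidden-layer activations

    # only the output weights are fit, by ordinary least squares
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)

    pred = (np.tanh(X @ W + b) @ beta > 0.5).astype(float)
    print("train accuracy:", (pred == y).mean())
    ```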

  7. Network anomaly detection a machine learning perspective

    CERN Document Server

    Bhattacharyya, Dhruba Kumar

    2013-01-01

    With the rapid rise in the ubiquity and sophistication of Internet technology and the accompanying growth in the number of network attacks, network intrusion detection has become increasingly important. Anomaly-based network intrusion detection refers to finding exceptional or nonconforming patterns in network traffic data compared to normal behavior. Finding these anomalies has extensive applications in areas such as cyber security, credit card and insurance fraud detection, and military surveillance for enemy activities. Network Anomaly Detection: A Machine Learning Perspective presents mach

  8. Introduction to special issue on machine learning approaches to shallow parsing

    NARCIS (Netherlands)

    Hammerton, J; Osborne, M; Armstrong, S; Daelemans, W

    2002-01-01

    This article introduces the problem of partial or shallow parsing (assigning partial syntactic structure to sentences) and explains why it is an important natural language processing (NLP) task. The complexity of the task makes Machine Learning an attractive option in comparison to the handcrafting

  9. Machine Learning in Computer-Aided Synthesis Planning.

    Science.gov (United States)

    Coley, Connor W; Green, William H; Jensen, Klavs F

    2018-05-15

    . While we introduce this task in the context of reaction validation, its utility extends to the prediction of side products and impurities, among other applications. We describe neural network-based approaches that we and others have developed for this forward prediction task that can be trained on previously published experimental data. Machine learning and artificial intelligence have revolutionized a number of disciplines, not limited to image recognition, dictation, translation, content recommendation, advertising, and autonomous driving. While there is a rich history of using machine learning for structure-activity models in chemistry, it is only now that it is being successfully applied more broadly to organic synthesis and synthesis design. As reported in this Account, machine learning is rapidly transforming CASP, but there are several remaining challenges and opportunities, many pertaining to the availability and standardization of both data and evaluation metrics, which must be addressed by the community at large.

  10. The ATLAS Higgs machine learning challenge

    CERN Document Server

    Davey, W; The ATLAS collaboration; Rousseau, D; Cowan, G; Kegl, B; Germain-Renaud, C; Guyon, I

    2014-01-01

    High Energy Physics has been using Machine Learning techniques (commonly known as Multivariate Analysis) since the 1990s, with Artificial Neural Networks for example, and more recently with Boosted Decision Trees, Random Forests, etc. Meanwhile, Machine Learning has become a full-blown field of computer science. With the emergence of Big Data, Data Scientists are developing new Machine Learning algorithms to extract sense from large heterogeneous data. HEP has exciting and difficult problems like the extraction of the Higgs boson signal, and data scientists have advanced algorithms: the goal of the HiggsML project is to bring the two together by a “challenge”: participants from all over the world and any scientific background can compete online ( https://www.kaggle.com/c/higgs-boson ) to obtain the best Higgs to tau tau signal significance on a set of ATLAS fully simulated Monte Carlo signal and background. Winners with the best scores will receive money prizes; authors of the best method (most usable) will be invited t...

  11. Research on machine learning framework based on random forest algorithm

    Science.gov (United States)

    Ren, Qiong; Cheng, Hui; Han, Hai

    2017-03-01

    With the continuous development of machine learning, industry and academia have released many machine learning frameworks based on distributed computing platforms, and these have been widely used. However, existing machine learning frameworks are limited by the machine learning algorithms themselves, for instance by parameter selection, sensitivity to noise, and a high barrier to use. This paper introduces the research background of machine learning frameworks and, in combination with the random forest algorithm commonly used in machine learning classification, puts forward the research objectives and content, proposes an improved adaptive random forest algorithm (referred to as ARF), and, on the basis of ARF, designs and implements a machine learning framework.
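
    The ARF variant is specific to the paper; as a stand-in, a minimal sketch of adapting random-forest hyperparameters automatically via cross-validated grid search on synthetic data:

    ```python
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import GridSearchCV

    X, y = make_classification(n_samples=400, n_features=20, random_state=0)

    # candidate forest parameters; the search adapts them to the data
    grid = {"n_estimators": [50, 200], "max_depth": [None, 5, 10]}
    search = GridSearchCV(RandomForestClassifier(random_state=0), grid, cv=5)
    search.fit(X, y)
    print(search.best_params_, "CV accuracy: %.2f" % search.best_score_)
    ```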

  12. Using Machine Learning to Advance Personality Assessment and Theory.

    Science.gov (United States)

    Bleidorn, Wiebke; Hopwood, Christopher James

    2018-05-01

    Machine learning has led to important advances in society. One of the most exciting applications of machine learning in psychological science has been the development of assessment tools that can powerfully predict human behavior and personality traits. Thus far, machine learning approaches to personality assessment have focused on the associations between social media and other digital records with established personality measures. The goal of this article is to expand the potential of machine learning approaches to personality assessment by embedding it in a more comprehensive construct validation framework. We review recent applications of machine learning to personality assessment, place machine learning research in the broader context of fundamental principles of construct validation, and provide recommendations for how to use machine learning to advance our understanding of personality.

  13. Ensemble Machine Learning Methods and Applications

    CERN Document Server

    Ma, Yunqian

    2012-01-01

    It is common wisdom that gathering a variety of views and inputs improves the process of decision making, and, indeed, underpins a democratic society. Dubbed “ensemble learning” by researchers in computational intelligence and machine learning, it is known to improve a decision system’s robustness and accuracy. Now, fresh developments are allowing researchers to unleash the power of ensemble learning in an increasing range of real-world applications. Ensemble learning algorithms such as “boosting” and “random forest” facilitate solutions to key computational issues such as face detection and are now being applied in areas as diverse as object tracking and bioinformatics.   Responding to a shortage of literature dedicated to the topic, this volume offers comprehensive coverage of state-of-the-art ensemble learning techniques, including various contributions from researchers in leading industrial research labs. At once a solid theoretical study and a practical guide, the volume is a windfall for r...

  14. Prototype-based models in machine learning.

    Science.gov (United States)

    Biehl, Michael; Hammer, Barbara; Villmann, Thomas

    2016-01-01

    An overview is given of prototype-based models in machine learning. In this framework, observations, i.e., data, are stored in terms of typical representatives. Together with a suitable measure of similarity, the systems can be employed in the context of unsupervised and supervised analysis of potentially high-dimensional, complex datasets. We discuss basic schemes of competitive vector quantization as well as the so-called neural gas approach and Kohonen's topology-preserving self-organizing map. Supervised learning in prototype systems is exemplified in terms of learning vector quantization. Most frequently, the familiar Euclidean distance serves as a dissimilarity measure. We present extensions of the framework to nonstandard measures and give an introduction to the use of adaptive distances in relevance learning. © 2016 Wiley Periodicals, Inc.
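
    A minimal sketch of supervised prototype learning in the LVQ1 style mentioned above: the winning (nearest) prototype moves toward a correctly labeled sample and away from a mislabeled one. Data and settings are illustrative.

    ```python
    import numpy as np

    def lvq1_fit(X, y, n_per_class=1, eta=0.05, epochs=20, seed=0):
        rng = np.random.default_rng(seed)
        classes = np.unique(y)
        # initialize prototypes on random samples of each class
        protos = np.vstack([X[rng.choice(np.where(y == c)[0], n_per_class)]
                            for c in classes])
        labels = np.repeat(classes, n_per_class)
        for _ in range(epochs):
            for i in rng.permutation(len(X)):
                j = np.argmin(np.linalg.norm(protos - X[i], axis=1))  # winner
                sign = 1.0 if labels[j] == y[i] else -1.0             # attract/repel
                protos[j] += sign * eta * (X[i] - protos[j])
        return protos, labels

    # two toy Gaussian classes
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
    y = np.array([0] * 50 + [1] * 50)
    protos, plabels = lvq1_fit(X, y)
    print(np.round(protos, 2))  # one learned prototype per class
    ```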

  15. Exploiting the Dynamics of Soft Materials for Machine Learning.

    Science.gov (United States)

    Nakajima, Kohei; Hauser, Helmut; Li, Tao; Pfeifer, Rolf

    2018-06-01

    Soft materials are increasingly utilized for various purposes in many engineering applications. These materials have been shown to perform a number of functions that were previously difficult to implement using rigid materials. Here, we argue that the diverse dynamics generated by actuating soft materials can be effectively used for machine learning purposes. This is demonstrated using a soft silicone arm through a technique of multiplexing, which enables the rich transient dynamics of the soft materials to be fully exploited as a computational resource. The computational performance of the soft silicone arm is examined through two standard benchmark tasks. Results show that the soft arm compares well to or even outperforms conventional machine learning techniques under multiple conditions. We then demonstrate that this system can be used for the sensory time series prediction problem for the soft arm itself, which suggests its immediate applicability to a real-world machine learning problem. Our approach, on the one hand, represents a radical departure from traditional computational methods, whereas on the other hand, it fits nicely into a more general perspective of computation by way of exploiting the properties of physical materials in the real world.
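
    The soft arm itself cannot be reproduced in code, so the sketch below substitutes a simulated random recurrent reservoir for its transient dynamics and, as in the paper's scheme, trains only a linear readout (echo-state-network style); the target task is an assumed toy benchmark, not one of the paper's.

```python
# A random recurrent "reservoir" stands in for the soft arm's dynamics;
# only a linear (ridge) readout is trained on the recorded states.
import numpy as np

rng = np.random.default_rng(0)
T, n_res = 2000, 200
u = rng.uniform(-1, 1, T)                          # input time series
W_in = rng.uniform(-0.5, 0.5, n_res)
W = rng.normal(0, 1, (n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))    # scale spectral radius

x = np.zeros(n_res)
states = np.zeros((T, n_res))
for t in range(T):
    x = np.tanh(W @ x + W_in * u[t])
    states[t] = x

# target: a simple nonlinear memory task, y_t = u_{t-2}^2 (assumed benchmark)
y = np.roll(u, 2) ** 2
washout = 100                                      # discard initial transient
S, Y = states[washout:], y[washout:]
W_out = np.linalg.solve(S.T @ S + 1e-6 * np.eye(n_res), S.T @ Y)
pred = S @ W_out
print("NRMSE:", np.sqrt(np.mean((pred - Y) ** 2)) / np.std(Y))
```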

  16. Machine learning for epigenetics and future medical applications.

    Science.gov (United States)

    Holder, Lawrence B; Haque, M Muksitul; Skinner, Michael K

    2017-07-03

    Understanding epigenetic processes holds immense promise for medical applications. Advances in Machine Learning (ML) are critical to realize this promise. Previous studies used epigenetic data sets associated with the germline transmission of epigenetic transgenerational inheritance of disease and novel ML approaches to predict genome-wide locations of critical epimutations. A combination of Active Learning (ACL) and Imbalanced Class Learning (ICL) was used to address past problems with ML to develop a more efficient feature selection process and address the imbalance problem in all genomic data sets. The power of this novel ML approach and our ability to predict epigenetic phenomena and associated disease are suggested. The current approach requires extensive computation of features over the genome. A promising new approach is to introduce Deep Learning (DL) for the generation and simultaneous computation of novel genomic features tuned to the classification task. This approach can be used with any genomic or biological data set applied to medicine. The application of molecular epigenetic data in advanced machine learning analysis to medicine is the focus of this review.

  17. Machine learning of molecular properties: Locality and active learning

    Science.gov (United States)

    Gubaev, Konstantin; Podryabinkin, Evgeny V.; Shapeev, Alexander V.

    2018-06-01

    In recent years, machine learning techniques have shown great potential in various problems from a multitude of disciplines, including materials design and drug discovery. The high computational speed on the one hand and the accuracy comparable to that of density functional theory on the other hand make machine learning algorithms efficient for high-throughput screening through chemical and configurational space. However, the machine learning algorithms available in the literature require large training datasets to reach chemical accuracy and also show large errors for the so-called outliers, the out-of-sample molecules not well represented in the training set. In the present paper, we propose a new machine learning algorithm for predicting molecular properties that addresses these two issues: it is based on a local model of interatomic interactions providing high accuracy when trained on relatively small training sets, and an active learning algorithm for optimally choosing the training set that significantly reduces the errors for the outliers. We compare our model to the other state-of-the-art algorithms from the literature on the widely used benchmark tests.
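
    The paper's exact query criterion is not given in this record. The sketch below shows a generic active-learning loop that queries the pool sample with the largest ensemble prediction variance, a common uncertainty proxy and not necessarily the authors' rule (scikit-learn, synthetic data).

```python
# Active-learning sketch: iteratively add the pool sample whose ensemble
# prediction variance is largest (a common uncertainty proxy).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X_pool = rng.uniform(-3, 3, (500, 5))
y_pool = np.sin(X_pool).sum(axis=1) + 0.05 * rng.normal(size=500)

labeled = list(range(10))                 # small initial training set
unlabeled = [i for i in range(500) if i not in labeled]

for step in range(30):
    model = RandomForestRegressor(n_estimators=50, random_state=0)
    model.fit(X_pool[labeled], y_pool[labeled])
    # per-tree predictions -> variance as an uncertainty estimate
    per_tree = np.stack([t.predict(X_pool[unlabeled]) for t in model.estimators_])
    pick = unlabeled[int(per_tree.var(axis=0).argmax())]
    labeled.append(pick)
    unlabeled.remove(pick)

print("final training-set size:", len(labeled))
```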

  18. Selecting Learning Tasks: Effects of Adaptation and Shared Control on Learning Efficiency and Task Involvement

    Science.gov (United States)

    Corbalan, Gemma; Kester, Liesbeth; van Merrienboer, Jeroen J. G.

    2008-01-01

    Complex skill acquisition by performing authentic learning tasks is constrained by limited working memory capacity [Baddeley, A. D. (1992). Working memory. Science, 255, 556-559]. To prevent cognitive overload, task difficulty and support of each newly selected learning task can be adapted to the learner's competence level and perceived task…

  19. BENCHMARKING MACHINE LEARNING TECHNIQUES FOR SOFTWARE DEFECT DETECTION

    OpenAIRE

    Saiqa Aleem; Luiz Fernando Capretz; Faheem Ahmed

    2015-01-01

    Machine learning approaches are well suited to problems about which little information is available. In most cases, software domain problems can be characterized as a learning process that depends on various circumstances and changes accordingly. A predictive model is constructed using machine learning approaches to classify modules into defective and non-defective. Machine learning techniques help developers to retrieve useful information after the classification and enable them to analyse data...

  20. BEBP: A Poisoning Method Against Machine Learning Based IDSs

    OpenAIRE

    Li, Pan; Liu, Qiang; Zhao, Wentao; Wang, Dongxu; Wang, Siqi

    2018-01-01

    In the big data era, machine learning is one of the fundamental techniques in intrusion detection systems (IDSs). However, practical IDSs generally update their decision module by feeding in new data and then retraining learning models periodically. Hence, attacks that compromise the data used for training or testing classifiers significantly challenge the detecting capability of machine learning-based IDSs. Poisoning attack, which is one of the most recognized security threats towards machine learning...

  1. Machine learning methods for clinical forms analysis in mental health.

    Science.gov (United States)

    Strauss, John; Peguero, Arturo Martinez; Hirst, Graeme

    2013-01-01

    In preparation for a clinical information system implementation, the Centre for Addiction and Mental Health (CAMH) Clinical Information Transformation project completed multiple preparation steps. An automated process was desired to supplement the onerous task of manual analysis of clinical forms. We used natural language processing (NLP) and machine learning (ML) methods for a series of 266 separate clinical forms. For the investigation, documents were represented by feature vectors. We used four ML algorithms for our examination of the forms: cluster analysis, k-nearest neighbours (kNN), decision trees and support vector machines (SVM). Parameters for each algorithm were optimized. SVM had the best performance with a precision of 64.6%. Though we did not find any method sufficiently accurate for practical use, to our knowledge this approach to forms has not been used previously in mental health.
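
    A minimal sketch of the best-performing setup reported above, documents as feature vectors classified by an SVM, here using a TF-IDF representation (scikit-learn; the form texts and labels are invented for illustration).

```python
# Forms-as-feature-vectors sketch: TF-IDF document vectors with a linear
# SVM, matching the abstract's best-performing combination (toy texts).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

forms = ["patient mood assessment scale",
         "medication dosage record sheet",
         "anxiety screening questionnaire",
         "discharge medication summary"]
labels = ["assessment", "medication", "assessment", "medication"]

clf = make_pipeline(TfidfVectorizer(), LinearSVC())
clf.fit(forms, labels)
print(clf.predict(["weekly mood questionnaire"]))
```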

  2. Data Mining and Machine Learning in Astronomy

    Science.gov (United States)

    Ball, Nicholas M.; Brunner, Robert J.

    We review the current state of data mining and machine learning in astronomy. Data Mining can have a somewhat mixed connotation from the point of view of a researcher in this field. If used correctly, it can be a powerful approach, holding the potential to fully exploit the exponentially increasing amount of available data, promising great scientific advance. However, if misused, it can be little more than the black box application of complex computing algorithms that may give little physical insight, and provide questionable results. Here, we give an overview of the entire data mining process, from data collection through to the interpretation of results. We cover common machine learning algorithms, such as artificial neural networks and support vector machines, applications from a broad range of astronomy, emphasizing those in which data mining techniques directly contributed to improving science, and important current and future directions, including probability density functions, parallel algorithms, Peta-Scale computing, and the time domain. We conclude that, so long as one carefully selects an appropriate algorithm and is guided by the astronomical problem at hand, data mining can be very much a powerful tool, and not a questionable black box.

  3. Voice based gender classification using machine learning

    Science.gov (United States)

    Raahul, A.; Sapthagiri, R.; Pankaj, K.; Vijayarajan, V.

    2017-11-01

    Gender identification is one of the major problems in speech analysis today: tracing the gender of a speaker from acoustic data such as pitch, median frequency, etc. Machine learning gives promising results for classification problems across research domains, and several performance metrics exist for evaluating algorithms in a given area. We present a comparative model for evaluating five different machine learning algorithms on eight different metrics for gender classification from acoustic data: Linear Discriminant Analysis (LDA), K-Nearest Neighbour (KNN), Classification and Regression Trees (CART), Random Forest (RF), and Support Vector Machine (SVM). The main criterion in evaluating any algorithm is its performance; in classification problems the misclassification rate must be low, which is to say the accuracy rate must be high. Location and gender of a person have become very crucial in economic markets in the form of AdSense. With this comparative model we assess the different ML algorithms and find the best fit for gender classification of acoustic data.
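
    A sketch of the comparative setup: the five named classifiers evaluated with cross-validation (scikit-learn; synthetic features stand in for the acoustic data, and only four of the eight metrics are shown).

```python
# The paper's five classifiers compared on a few metrics via
# cross-validation (synthetic stand-in for the acoustic feature set).
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_validate
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
models = {"LDA": LinearDiscriminantAnalysis(), "KNN": KNeighborsClassifier(),
          "CART": DecisionTreeClassifier(), "RF": RandomForestClassifier(),
          "SVM": SVC()}
for name, m in models.items():
    cv = cross_validate(m, X, y, cv=5,
                        scoring=("accuracy", "precision", "recall", "f1"))
    print(name, {k: round(v.mean(), 3) for k, v in cv.items() if k.startswith("test_")})
```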

  4. Novel jet observables from machine learning

    Science.gov (United States)

    Datta, Kaustuv; Larkoski, Andrew J.

    2018-03-01

    Previous studies have demonstrated the utility and applicability of machine learning techniques to jet physics. In this paper, we construct new observables for the discrimination of jets from different originating particles exclusively from information identified by the machine. The approach we propose is to first organize information in the jet by resolved phase space and determine the effective N-body phase space at which discrimination power saturates. This then allows for the construction of a discrimination observable from the N-body phase space coordinates. A general form of this observable can be expressed with numerous parameters that are chosen so that the observable maximizes the signal vs. background likelihood. Here, we illustrate this technique applied to discrimination of $H \to b\bar{b}$ decays from massive $g \to b\bar{b}$ splittings. We show that for a simple parametrization, we can construct an observable that has discrimination power comparable to, or better than, widely-used observables motivated from theory considerations. For the case of jets on which modified mass-drop tagger grooming is applied, the observable that the machine learns is essentially the angle of the dominant gluon emission off of the $b\bar{b}$ pair.

  5. Machine learning search for variable stars

    Science.gov (United States)

    Pashchenko, Ilya N.; Sokolovsky, Kirill V.; Gavras, Panagiotis

    2018-04-01

    Photometric variability detection is often considered as a hypothesis testing problem: an object is variable if the null hypothesis that its brightness is constant can be ruled out given the measurements and their uncertainties. The practical applicability of this approach is limited by uncorrected systematic errors. We propose a new variability detection technique sensitive to a wide range of variability types while being robust to outliers and underestimated measurement uncertainties. We consider variability detection as a classification problem that can be approached with machine learning. Logistic Regression (LR), Support Vector Machines (SVM), k Nearest Neighbours (kNN), Neural Nets (NN), Random Forests (RF), and Stochastic Gradient Boosting classifier (SGB) are applied to 18 features (variability indices) quantifying scatter and/or correlation between points in a light curve. We use a subset of Optical Gravitational Lensing Experiment phase two (OGLE-II) Large Magellanic Cloud (LMC) photometry (30 265 light curves) that was searched for variability using traditional methods (168 known variable objects) as the training set and then apply the NN to a new test set of 31 798 OGLE-II LMC light curves. Among 205 candidates selected in the test set, 178 are real variables, while 13 low-amplitude variables are new discoveries. The machine learning classifiers considered are found to be more efficient (select more variables and fewer false candidates) compared to traditional techniques using individual variability indices or their linear combination. The NN, SGB, SVM, and RF show a higher efficiency compared to LR and kNN.

  6. Machine learning spatial geometry from entanglement features

    Science.gov (United States)

    You, Yi-Zhuang; Yang, Zhao; Qi, Xiao-Liang

    2018-02-01

    Motivated by the close relations of the renormalization group with both the holography duality and deep learning, we propose that holographic geometry can emerge from deep learning the entanglement feature of a quantum many-body state. We develop a concrete algorithm, called entanglement feature learning (EFL), based on the random tensor network (RTN) model for tensor network holography. We show that each RTN can be mapped to a Boltzmann machine, trained by the entanglement entropies over all subregions of a given quantum many-body state. The goal is to construct the optimal RTN that best reproduces the entanglement feature. The RTN geometry can then be interpreted as the emergent holographic geometry. We demonstrate the EFL algorithm on a 1D free fermion system and observe the emergence of hyperbolic geometry (AdS3 spatial geometry) as we tune the fermion system towards the gapless critical point (CFT2 point).

  7. Pedagogical entrepreneurship in learning tasks

    Directory of Open Access Journals (Sweden)

    Marit Engum Hansen

    2016-11-01

    Background: The action plan "Entrepreneurship in Education – from primary to higher education" (2009-2014) proposed to establish a site for digital learning materials within entrepreneurship in basic education. PedEnt (Pedagogical Entrepreneurship) was launched in autumn of 2014, and both authors have contributed to the professional development of the site. Two of the learning assignments published on PedEnt constitute the research objects of this study. Methods: Based on pedagogical entrepreneurship, we present a case study of learning work carried out by students at lower and upper secondary level. Using an analysis of assignment texts as well as video recordings, we have identified the characteristics of entrepreneurial learning methods as they were expressed through each case. Results: The analysis showed that the learning assignments can be characterized as entrepreneurial because they promoted the actor role and creativity of the students. We found that the relationship between the relevance of the assignments and the context in which they are given poses an important prerequisite for the students to experience the learning work as meaningful. Conclusions: Entrepreneurial learning methods challenge the traditional view that theory tends to take primacy over practice. Orienting learning assignments within relevant contexts gives students opportunities to experience for themselves the need for increased knowledge.

  8. Trustless Machine Learning Contracts; Evaluating and Exchanging Machine Learning Models on the Ethereum Blockchain

    OpenAIRE

    Kurtulmus, A. Besir; Daniel, Kenny

    2018-01-01

    Using blockchain technology, it is possible to create contracts that offer a reward in exchange for a trained machine learning model for a particular data set. This would allow users to train machine learning models for a reward in a trustless manner. The smart contract will use the blockchain to automatically validate the solution, so there would be no debate about whether the solution was correct or not. Users who submit the solutions won't have counterparty risk that they won't get paid fo...

  9. Machine learning, computer vision, and probabilistic models in jet physics

    CERN Multimedia

    CERN. Geneva; NACHMAN, Ben

    2015-01-01

    In this talk we present recent developments in the application of machine learning, computer vision, and probabilistic models to the analysis and interpretation of LHC events. First, we will introduce the concept of jet-images and computer vision techniques for jet tagging. Jet images enabled the connection between jet substructure and tagging with the fields of computer vision and image processing for the first time, improving the performance to identify highly boosted W bosons with respect to state-of-the-art methods, and providing a new way to visualize the discriminant features of different classes of jets, adding a new capability to understand the physics within jets and to design more powerful jet tagging methods. Second, we will present Fuzzy jets: a new paradigm for jet clustering using machine learning methods. Fuzzy jets view jet clustering as an unsupervised learning task and incorporate a probabilistic assignment of particles to jets to learn new features of the jet structure. In particular, we wi...

  10. Machine Learning Methods to Predict Diabetes Complications.

    Science.gov (United States)

    Dagliati, Arianna; Marini, Simone; Sacchi, Lucia; Cogni, Giulia; Teliti, Marsida; Tibollo, Valentina; De Cata, Pasquale; Chiovato, Luca; Bellazzi, Riccardo

    2018-03-01

    One of the areas where Artificial Intelligence is having the greatest impact is machine learning, which develops algorithms able to learn patterns and decision rules from data. Machine learning algorithms have been embedded into data mining pipelines, which can combine them with classical statistical strategies to extract knowledge from data. Within the EU-funded MOSAIC project, a data mining pipeline has been used to derive a set of predictive models of type 2 diabetes mellitus (T2DM) complications based on electronic health record data from nearly one thousand patients. The pipeline comprises clinical center profiling, predictive model targeting, predictive model construction and model validation. After dealing with missing data by means of random forest (RF) and applying suitable strategies to handle class imbalance, we used Logistic Regression with stepwise feature selection to predict the onset of retinopathy, neuropathy, or nephropathy at different time scenarios: 3, 5, and 7 years from the first visit to the Hospital Center for Diabetes (not from the diagnosis). The variables considered are gender, age, time from diagnosis, body mass index (BMI), glycated hemoglobin (HbA1c), hypertension, and smoking habit. The final models, tailored to each complication, provided an accuracy of up to 0.838. Different variables were selected for each complication and time scenario, leading to specialized models that are easy to translate to clinical practice.
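
    A sketch of the pipeline's final modeling stage, logistic regression with stepwise feature selection, here approximated by scikit-learn's forward SequentialFeatureSelector; the records are synthetic and the column names simply mirror the variables listed above.

```python
# Logistic regression with stepwise feature selection, approximated by
# forward sequential selection (synthetic records; columns mirror the abstract).
import numpy as np
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
cols = ["gender", "age", "time_from_diagnosis", "bmi", "hba1c",
        "hypertension", "smoking"]
X = rng.normal(size=(1000, len(cols)))                  # synthetic records
y = (X[:, 4] + 0.5 * X[:, 1] + rng.normal(size=1000) > 0).astype(int)

lr = LogisticRegression(max_iter=1000)
sfs = SequentialFeatureSelector(lr, n_features_to_select=3, direction="forward")
sfs.fit(X, y)
print("selected:", [c for c, keep in zip(cols, sfs.get_support()) if keep])
```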

  11. Using Machine Learning in Adversarial Environments.

    Energy Technology Data Exchange (ETDEWEB)

    Davis, Warren Leon [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2016-02-01

    Intrusion/anomaly detection systems are among the first lines of cyber defense. Commonly, they either use signatures or machine learning (ML) to identify threats, but fail to account for sophisticated attackers trying to circumvent them. We propose to embed machine learning within a game theoretic framework that performs adversarial modeling, develops methods for optimizing operational response based on ML, and integrates the resulting optimization codebase into the existing ML infrastructure developed by the Hybrid LDRD. Our approach addresses three key shortcomings of ML in adversarial settings: 1) resulting classifiers are typically deterministic and, therefore, easy to reverse engineer; 2) ML approaches only address the prediction problem, but do not prescribe how one should operationalize predictions, nor account for operational costs and constraints; and 3) ML approaches do not model attackers’ response and can be circumvented by sophisticated adversaries. The principal novelty of our approach is to construct an optimization framework that blends ML, operational considerations, and a model predicting attackers reaction, with the goal of computing optimal moving target defense. One important challenge is to construct a realistic model of an adversary that is tractable, yet realistic. We aim to advance the science of attacker modeling by considering game-theoretic methods, and by engaging experimental subjects with red teaming experience in trying to actively circumvent an intrusion detection system, and learning a predictive model of such circumvention activities. In addition, we will generate metrics to test that a particular model of an adversary is consistent with available data.

  12. Machine learning vortices at the Kosterlitz-Thouless transition

    Science.gov (United States)

    Beach, Matthew J. S.; Golubeva, Anna; Melko, Roger G.

    2018-01-01

    Efficient and automated classification of phases from minimally processed data is one goal of machine learning in condensed-matter and statistical physics. Supervised algorithms trained on raw samples of microstates can successfully detect conventional phase transitions via learning a bulk feature such as an order parameter. In this paper, we investigate whether neural networks can learn to classify phases based on topological defects. We address this question on the two-dimensional classical XY model which exhibits a Kosterlitz-Thouless transition. We find significant feature engineering of the raw spin states is required to convincingly claim that features of the vortex configurations are responsible for learning the transition temperature. We further show a single-layer network does not correctly classify the phases of the XY model, while a convolutional network easily performs classification by learning the global magnetization. Finally, we design a deep network capable of learning vortices without feature engineering. We demonstrate the detection of vortices does not necessarily result in the best classification accuracy, especially for lattices of less than approximately 1000 spins. For larger systems, it remains a difficult task to learn vortices.
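
    The hand-engineered vortex feature the paper contrasts with end-to-end learning can be computed directly: a vortex (antivortex) is a plaquette around which the wrapped nearest-neighbour angle differences sum to +2π (-2π). A numpy sketch on a random spin configuration:

```python
# Counting vortices in an XY spin configuration: sum the wrapped angle
# differences around each unit plaquette and look for circulation +-2*pi.
import numpy as np

def wrap(a):
    """Map angle differences into (-pi, pi]."""
    return (a + np.pi) % (2 * np.pi) - np.pi

rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, (32, 32))   # random (high-T) XY spins

right = np.roll(theta, -1, axis=1)            # theta(i, j+1)
up = np.roll(theta, -1, axis=0)               # theta(i+1, j)
diag = np.roll(right, -1, axis=0)             # theta(i+1, j+1)

circulation = (wrap(right - theta) + wrap(diag - right)
               + wrap(up - diag) + wrap(theta - up))
print("vortices:", np.isclose(circulation, 2 * np.pi).sum())
print("antivortices:", np.isclose(circulation, -2 * np.pi).sum())
```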

  13. Machine Learning-Empowered Biometric Methods for Biomedicine Applications

    Directory of Open Access Journals (Sweden)

    Qingxue Zhang

    2017-07-01

    Nowadays, pervasive computing technologies are paving a promising way for advanced smart health applications. However, a key impediment faced by wide deployment of these assistive smart devices is the increasing privacy and security issue, such as how to protect access to sensitive patient data in the health record. Focusing on this challenge, biometrics are attracting intense attention as a means of effective user identification for confidential health applications. In this paper, we take special interest in two bio-potential-based biometric modalities, electrocardiogram (ECG) and electroencephalogram (EEG), considering that they are both unique to individuals and more reliable than token-based (identity card) and knowledge-based (username/password) methods. After extracting effective features in multiple domains from ECG/EEG signals, several advanced machine learning algorithms are introduced to perform the user identification task, including Neural Network, K-nearest Neighbor, Bagging, Random Forest and AdaBoost. Experimental results on two public ECG and EEG datasets show that ECG is a more robust biometric modality compared to EEG, leveraging a higher signal-to-noise ratio and more distinguishable morphological patterns. Among the different machine learning classifiers, the random forest greatly outperforms the others, with an identification rate as high as 98%. This study is expected to demonstrate that a properly selected biometric, empowered by an effective machine learner, has great potential to enable confidential biomedicine applications in the era of smart digital health.
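
    A minimal sketch of the identification task with the best-performing classifier named above, a random forest; synthetic per-heartbeat feature vectors stand in for the paper's ECG features.

```python
# User-identification sketch: a random forest over per-heartbeat feature
# vectors (synthetic stand-ins for the ECG features used in the paper).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_users, beats_per_user, n_feat = 20, 50, 12
# each user gets a characteristic feature offset plus noise
X = np.vstack([rng.normal(loc=rng.normal(0, 2, n_feat), scale=0.5,
                          size=(beats_per_user, n_feat)) for _ in range(n_users)])
y = np.repeat(np.arange(n_users), beats_per_user)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0,
                                          stratify=y)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("identification rate:", rf.score(X_te, y_te))
```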

  14. From Curve Fitting to Machine Learning

    CERN Document Server

    Zielesny, Achim

    2011-01-01

    The analysis of experimental data is at the heart of science from its beginnings. But it was the advent of digital computers that allowed the execution of highly non-linear and increasingly complex data analysis procedures - methods that were completely unfeasible before. Non-linear curve fitting, clustering and machine learning belong to these modern techniques, which are a further step towards computational intelligence. The goal of this book is to provide an interactive and illustrative guide to these topics. It concentrates on the road from two dimensional curve fitting to multidimensional clustering

  15. Unintended consequences of machine learning in medicine?

    Science.gov (United States)

    McDonald, Laura; Ramagopalan, Sreeram V; Cox, Andrew P; Oguz, Mustafa

    2017-01-01

    Machine learning (ML) has the potential to significantly aid medical practice. However, a recent article highlighted some negative consequences that may arise from using ML decision support in medicine. We argue here that whilst the concerns raised by the authors may be appropriate, they are not specific to ML, and thus the article may lead to an adverse perception about this technique in particular. Whilst ML is not without its limitations like any methodology, a balanced view is needed in order to not hamper its use in potentially enabling better patient care.

  16. An Evolutionary Machine Learning Framework for Big Data Sequence Mining

    Science.gov (United States)

    Kamath, Uday Krishna

    2014-01-01

    Sequence classification is an important problem in many real-world applications. Unlike other machine learning data, there are no "explicit" features or signals in sequence data that can help traditional machine learning algorithms learn and predict from the data. Sequence data exhibits inter-relationships in the elements that are…

  17. Book review: A first course in Machine Learning

    DEFF Research Database (Denmark)

    Ortiz-Arroyo, Daniel

    2016-01-01

    "The new edition of A First Course in Machine Learning by Rogers and Girolami is an excellent introduction to the use of statistical methods in machine learning. The book introduces concepts such as mathematical modeling, inference, and prediction, providing ‘just in time’ the essential background...... to change models and parameter values to make [it] easier to understand and apply these models in real applications. The authors [also] introduce more advanced, state-of-the-art machine learning methods, such as Gaussian process models and advanced mixture models, which are used across machine learning....... This makes the book interesting not only to students with little or no background in machine learning but also to more advanced graduate students interested in statistical approaches to machine learning." —Daniel Ortiz-Arroyo, Associate Professor, Aalborg University Esbjerg, Denmark...

  18. Machine learning: novel bioinformatics approaches for combating antimicrobial resistance.

    Science.gov (United States)

    Macesic, Nenad; Polubriaginof, Fernanda; Tatonetti, Nicholas P

    2017-12-01

    Antimicrobial resistance (AMR) is a threat to global health and new approaches to combating AMR are needed. Use of machine learning in addressing AMR is in its infancy but has made promising steps. We reviewed the current literature on the use of machine learning for studying bacterial AMR. The advent of large-scale data sets provided by next-generation sequencing and electronic health records make applying machine learning to the study and treatment of AMR possible. To date, it has been used for antimicrobial susceptibility genotype/phenotype prediction, development of AMR clinical decision rules, novel antimicrobial agent discovery and antimicrobial therapy optimization. Application of machine learning to studying AMR is feasible but remains limited. Implementation of machine learning in clinical settings faces barriers to uptake with concerns regarding model interpretability and data quality. Future applications of machine learning to AMR are likely to be laboratory-based, such as antimicrobial susceptibility phenotype prediction.

  19. Medical Dataset Classification: A Machine Learning Paradigm Integrating Particle Swarm Optimization with Extreme Learning Machine Classifier

    Directory of Open Access Journals (Sweden)

    C. V. Subbulakshmi

    2015-01-01

    Medical data classification is a prime data mining problem that has been discussed for a decade and has attracted several researchers around the world. Most classifiers are designed to learn from the data itself through a training process, because complete expert knowledge for determining classifier parameters is impracticable. This paper proposes a hybrid methodology based on a machine learning paradigm that integrates the successful exploration mechanism, the self-regulated learning capability, of the particle swarm optimization (PSO) algorithm with the extreme learning machine (ELM) classifier. As a recent off-line learning method, ELM is a single-hidden-layer feedforward neural network (FFNN) that has proved to be an excellent classifier with a large number of hidden-layer neurons. In this research, PSO is used to determine the optimum set of parameters for the ELM, thus reducing the number of hidden-layer neurons and further improving the network's generalization performance. The proposed method is tested on five benchmark datasets from the UCI Machine Learning Repository for medical dataset classification. Simulation results show that the proposed approach achieves good generalization performance compared to other classifiers.
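
    A sketch of the ELM core: a random hidden layer whose output weights are solved in closed form. A plain random search stands in here for PSO when choosing the hidden-layer size (scikit-learn's bundled breast cancer set as a stand-in medical dataset).

```python
# Minimal ELM: random hidden layer, output weights solved by least squares.
# A random search stands in for PSO when picking the hidden size.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X = (X - X.mean(0)) / X.std(0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

def elm_fit(X, y, n_hidden, rng):
    W = rng.normal(size=(X.shape[1], n_hidden))   # random input weights
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)
    beta = np.linalg.pinv(H) @ y                  # closed-form output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    return (np.tanh(X @ W + b) @ beta > 0.5).astype(int)

rng = np.random.default_rng(0)
best = (0.0, 0)
for n_hidden in rng.integers(10, 300, size=15):   # random search, not PSO
    params = elm_fit(X_tr, y_tr, int(n_hidden), rng)
    acc = (elm_predict(X_te, *params) == y_te).mean()
    best = max(best, (acc, int(n_hidden)))
print("best accuracy %.3f with %d hidden neurons" % best)
```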

  20. AGE GROUP CLASSIFICATION USING MACHINE LEARNING TECHNIQUES

    OpenAIRE

    Arshdeep Singh Syal & Abhinav Gupta

    2017-01-01

    A human face provides a lot of information that allows another person to identify characteristics such as age, sex, etc. Therefore, the challenge is to develop an age group prediction system using the automatic learning method. The task of estimating the age group of the human from their frontal facial images is very captivating, but also challenging because of the pattern of personalized and non-linear aging that differs from one person to another. This paper examines the problem of predicti...

  1. Intellectual Property and Machine Learning: An exploratory study

    OpenAIRE

    Øverlier, Lasse

    2017-01-01

    Our research makes a contribution by exemplifying what controls the freedom-to-operate for a company operating in the area of machine learning. Through interviews we demonstrate the industry’s alternating viewpoints on whether copyrighted data used as input to machine learning systems should be viewed differently than copying the data for storage or reproduction. In addition we show that unauthorized use of copyrighted data in machine learning systems is hard to detect, with the burden of proo...

  2. Statistical and machine learning approaches for network analysis

    CERN Document Server

    Dehmer, Matthias

    2012-01-01

    Explore the multidisciplinary nature of complex networks through machine learning techniques Statistical and Machine Learning Approaches for Network Analysis provides an accessible framework for structurally analyzing graphs by bringing together known and novel approaches on graph classes and graph measures for classification. By providing different approaches based on experimental data, the book uniquely sets itself apart from the current literature by exploring the application of machine learning techniques to various types of complex networks. Comprised of chapters written by internation

  3. Machine Learning Topological Invariants with Neural Networks

    Science.gov (United States)

    Zhang, Pengfei; Shen, Huitao; Zhai, Hui

    2018-02-01

    In this Letter we train neural networks in a supervised fashion to distinguish different topological phases in the context of topological band insulators. After training with Hamiltonians of one-dimensional insulators with chiral symmetry, the neural network can predict their topological winding numbers with nearly 100% accuracy, even for Hamiltonians with larger winding numbers that are not included in the training data. These results show a remarkable success that the neural network can capture the global and nonlinear topological features of quantum phases from local inputs. By opening up the neural network, we confirm that the network does learn the discrete version of the winding number formula. We also make a couple of remarks regarding the role of the symmetry and the opposite effect of regularization techniques when applying machine learning to physical systems.
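
    The discrete winding-number formula the network reportedly learns can be evaluated directly for a two-band chiral Hamiltonian. The sketch below assumes an SSH-type model, h(k) = (t1 + t2 cos k, t2 sin k), which is an illustrative choice rather than necessarily the paper's exact parametrization.

```python
# Discretized winding number of a 1D two-band chiral Hamiltonian
# h(k) = (t1 + t2*cos k, t2*sin k): total phase accumulated by h(k)
# around the Brillouin zone, in units of 2*pi.
import numpy as np

def winding_number(t1, t2, n_k=1000):
    k = np.linspace(0, 2 * np.pi, n_k + 1)   # include the k = 2*pi endpoint
    hx, hy = t1 + t2 * np.cos(k), t2 * np.sin(k)
    phase = np.unwrap(np.angle(hx + 1j * hy))
    return round((phase[-1] - phase[0]) / (2 * np.pi))

print(winding_number(t1=0.5, t2=1.0))  # topological phase: 1
print(winding_number(t1=1.5, t2=1.0))  # trivial phase: 0
```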

  4. Machine-learning the string landscape

    Directory of Open Access Journals (Sweden)

    Yang-Hui He

    2017-11-01

    We propose a paradigm to apply machine learning to the various databases which have emerged in the study of the string landscape. In particular, we establish neural networks as both classifiers and predictors and train them with a host of available data ranging from Calabi–Yau manifolds and vector bundles to quiver representations for gauge theories, using a novel framework of recasting geometrical and physical data as pixelated images. We find that even a relatively simple neural network can learn many significant quantities to astounding accuracy in a matter of minutes and can also predict hitherto unencountered results, thereby rendering the paradigm a valuable tool in physics as well as pure mathematics.

  5. Amp: A modular approach to machine learning in atomistic simulations

    Science.gov (United States)

    Khorshidi, Alireza; Peterson, Andrew A.

    2016-10-01

    Electronic structure calculations, such as those employing Kohn-Sham density functional theory or ab initio wavefunction theories, have allowed for atomistic-level understandings of a wide variety of phenomena and properties of matter at small scales. However, the computational cost of electronic structure methods drastically increases with length and time scales, which makes these methods difficult for long time-scale molecular dynamics simulations or large-sized systems. Machine-learning techniques can provide accurate potentials that can match the quality of electronic structure calculations, provided sufficient training data. These potentials can then be used to rapidly simulate large and long time-scale phenomena at similar quality to the parent electronic structure approach. Machine-learning potentials usually take a bias-free mathematical form and can be readily developed for a wide variety of systems. Electronic structure calculations have favorable properties, namely that they are noiseless and targeted training data can be produced on-demand, that make them particularly well-suited for machine learning. This paper discusses our modular approach to atomistic machine learning through the development of the open-source Atomistic Machine-learning Package (Amp), which allows for representations of both the total and atom-centered potential energy surface, in both periodic and non-periodic systems. Potentials developed through the atom-centered approach are simultaneously applicable for systems with various sizes. Interpolation can be enhanced by introducing custom descriptors of the local environment. We demonstrate this in the current work for Gaussian-type, bispectrum, and Zernike-type descriptors. Amp has an intuitive and modular structure with an interface through the python scripting language yet has parallelizable fortran components for demanding tasks; it is designed to integrate closely with the widely used Atomic Simulation Environment (ASE), which

  6. Data Mining: Practical Machine Learning Tools and Techniques

    CERN Document Server

    Witten, Ian H; Hall, Mark A

    2011-01-01

    Data Mining: Practical Machine Learning Tools and Techniques offers a thorough grounding in machine learning concepts as well as practical advice on applying machine learning tools and techniques in real-world data mining situations. This highly anticipated third edition of the most acclaimed work on data mining and machine learning will teach you everything you need to know about preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining. Thorough updates reflect the technical changes and modernizations that have taken place

  7. IRB Process Improvements: A Machine Learning Analysis.

    Science.gov (United States)

    Shoenbill, Kimberly; Song, Yiqiang; Cobb, Nichelle L; Drezner, Marc K; Mendonca, Eneida A

    2017-06-01

    Clinical research involving humans is critically important, but it is a lengthy and expensive process. Most studies require institutional review board (IRB) approval. Our objective is to identify predictors of delays or accelerations in the IRB review process and apply this knowledge to inform process change in an effort to improve IRB efficiency, transparency, consistency and communication. We analyzed timelines of protocol submissions to determine protocol or IRB characteristics associated with different processing times. Our evaluation included single variable analysis to identify significant predictors of IRB processing time and machine learning methods to predict processing times through the IRB review system. Based on initial identified predictors, changes to IRB workflow and staffing procedures were instituted and we repeated our analysis. Our analysis identified several predictors of delays in the IRB review process including type of IRB review to be conducted, whether a protocol falls under Veteran's Administration purview and specific staff in charge of a protocol's review. We have identified several predictors of delays in IRB protocol review processing times using statistical and machine learning methods. Application of this knowledge to process improvement efforts in two IRBs has led to increased efficiency in protocol review. The workflow and system enhancements that are being made support our four-part goal of improving IRB efficiency, consistency, transparency, and communication.

  8. Mining the Kepler Data using Machine Learning

    Science.gov (United States)

    Walkowicz, Lucianne; Howe, A. R.; Nayar, R.; Turner, E. L.; Scargle, J.; Meadows, V.; Zee, A.

    2014-01-01

    Kepler's high cadence and incredible precision have provided an unprecedented view into stars and their planetary companions, revealing both expected and novel phenomena and systems. Due to the large number of Kepler lightcurves, the discovery of novel phenomena in particular has often been serendipitous in the course of searching for known forms of variability (for example, the discovery of the doubly pulsating elliptical binary KOI-54, originally identified by the transiting planet search pipeline). In this talk, we discuss progress on mining the Kepler data through both supervised and unsupervised machine learning, intended to both systematically search the Kepler lightcurves for rare or anomalous variability, and to create a variability catalog for community use. Mining the dataset in this way also allows for a quantitative identification of anomalous variability, and so may also be used as a signal-agnostic form of optical SETI. As the Kepler data are exceptionally rich, they provide an interesting counterpoint to machine learning efforts typically performed on sparser and/or noisier survey data, and will inform similar characterization carried out on future survey datasets.

  9. Machine Learning: developing an image recognition program: with Python, Scikit Learn and OpenCV

    OpenAIRE

    Nguyen, Minh

    2016-01-01

    Machine Learning is one of the most debated topics in the computer world these days, especially after the first computer Go program beat the human Go world champion. Among the endless applications of Machine Learning is image recognition, a problem that involves processing enormous amounts of data from dynamic input. This thesis presents the basic concepts of Machine Learning, Machine Learning algorithms, the Python programming language and Scikit Learn – a simple and efficient tool for data analysis in P...
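
    A minimal image-recognition sketch in the thesis's toolchain: scikit-learn's bundled 8x8 digit images classified with an SVM. With OpenCV, real images would first be read and resized (e.g. cv2.imread, cv2.resize) into the same kind of flat vectors.

```python
# Image recognition with scikit-learn: bundled digit images, SVM classifier.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

digits = load_digits()
X = digits.images.reshape(len(digits.images), -1)   # 8x8 images -> vectors
X_tr, X_te, y_tr, y_te = train_test_split(X, digits.target, random_state=0)

clf = SVC(gamma=0.001).fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))
```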

  10. Task Demands in OSCEs Influence Learning Strategies.

    Science.gov (United States)

    Lafleur, Alexandre; Laflamme, Jonathan; Leppink, Jimmie; Côté, Luc

    2017-01-01

    Models on pre-assessment learning effects confirmed that task demands stand out among the factors assessors can modify in an assessment to influence learning. However, little is known about which tasks in objective structured clinical examinations (OSCEs) improve students' cognitive and metacognitive processes. Research is needed to support OSCE designs that benefit students' metacognitive strategies when they are studying, reinforcing a hypothesis-driven approach. With that intent, hypothesis-driven physical examination (HDPE) assessments ask students to elicit and interpret findings of the physical exam to reach a diagnosis ("Examine this patient with a painful shoulder to reach a diagnosis"). When studying for HDPE, students will dedicate more time to hypothesis-driven discussions and practice than when studying for a part-task OSCE ("Perform the shoulder exam"). It is expected that the whole-task nature of HDPE will lead to a hypothesis-oriented use of the learning resources, a frequent use of adjustment strategies, and persistence with learning. In a mixed-methods study, 40 medical students were randomly paired and filmed while studying together for two hypothetical OSCE stations. Each 25-min study period began with video cues asking to study for either a part-task OSCE or an HDPE. In a crossover design, sequences were randomized for OSCEs and contents (shoulder or spine). Time-on-task for discussions or practice were categorized as "hypothesis-driven" or "sequence of signs and maneuvers." Content analysis of focus group interviews summarized students' perception of learning resources, adjustment strategies, and persistence with learning. When studying for HDPE, students allocate significantly more time for hypothesis-driven discussions and practice. Students use resources contrasting diagnoses and report persistence with learning. When studying for part-task OSCEs, time-on-task is reversed, spent on rehearsing a sequence of signs and maneuvers. OSCEs with

  11. PHOTOMETRIC SUPERNOVA CLASSIFICATION WITH MACHINE LEARNING

    International Nuclear Information System (INIS)

    Lochner, Michelle; Peiris, Hiranya V.; Lahav, Ofer; Winter, Max K.; McEwen, Jason D.

    2016-01-01

    Automated photometric supernova classification has become an active area of research in recent years in light of current and upcoming imaging surveys such as the Dark Energy Survey (DES) and the Large Synoptic Survey Telescope, given that spectroscopic confirmation of type for all supernovae discovered will be impossible. Here, we develop a multi-faceted classification pipeline, combining existing and new approaches. Our pipeline consists of two stages: extracting descriptive features from the light curves and classification using a machine learning algorithm. Our feature extraction methods vary from model-dependent techniques, namely SALT2 fits, to more independent techniques that fit parametric models to curves, to a completely model-independent wavelet approach. We cover a range of representative machine learning algorithms, including naive Bayes, k-nearest neighbors, support vector machines, artificial neural networks, and boosted decision trees (BDTs). We test the pipeline on simulated multi-band DES light curves from the Supernova Photometric Classification Challenge. Using the commonly used area under the curve (AUC) of the Receiver Operating Characteristic as a metric, we find that the SALT2 fits and the wavelet approach, with the BDTs algorithm, each achieve an AUC of 0.98, where 1 represents perfect classification. We find that a representative training set is essential for good classification, whatever the feature set or algorithm, with implications for spectroscopic follow-up. Importantly, we find that by using either the SALT2 or the wavelet feature sets with a BDT algorithm, accurate classification is possible purely from light curve data, without the need for any redshift information.
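
    The winning combination reported above, in miniature: a boosted-decision-tree classifier scored by the area under the ROC curve (scikit-learn; synthetic features stand in for the SALT2 or wavelet coefficients).

```python
# Boosted decision trees scored by ROC AUC, the abstract's metric
# (synthetic features in place of SALT2 fits or wavelet coefficients).
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=3000, n_features=20, weights=[0.8, 0.2],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
bdt = GradientBoostingClassifier(n_estimators=300).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, bdt.predict_proba(X_te)[:, 1])
print(f"AUC = {auc:.3f}  (1.0 = perfect classification)")
```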

  13. The immune system, adaptation, and machine learning

    Science.gov (United States)

    Farmer, J. Doyne; Packard, Norman H.; Perelson, Alan S.

    1986-10-01

    The immune system is capable of learning, memory, and pattern recognition. By employing genetic operators on a time scale fast enough to observe experimentally, the immune system is able to recognize novel shapes without preprogramming. Here we describe a dynamical model for the immune system that is based on the network hypothesis of Jerne, and is simple enough to simulate on a computer. This model has a strong similarity to an approach to learning and artificial intelligence introduced by Holland, called the classifier system. We demonstrate that simple versions of the classifier system can be cast as a nonlinear dynamical system, and explore the analogy between the immune and classifier systems in detail. Through this comparison we hope to gain insight into the way they perform specific tasks, and to suggest new approaches that might be of value in learning systems.

  14. Global Bathymetry: Machine Learning for Data Editing

    Science.gov (United States)

    Sandwell, D. T.; Tea, B.; Freund, Y.

    2017-12-01

    The accuracy of global bathymetry depends primarily on the coverage and accuracy of the sounding data and secondarily on the depth predicted from gravity. A main focus of our research is to add newly-available data to the global compilation. Most data sources have 1-12% of erroneous soundings caused by a wide array of blunders and measurement errors. Over the years we have hand-edited this data using undergraduate employees at UCSD (440 million soundings at 500 m resolution). We are developing a machine learning approach to refine the flagging of the older soundings and provide automated editing of newly-acquired soundings. The approach has three main steps: 1) Combine the sounding data with additional information that may inform the machine learning algorithm. The additional parameters include: depth predicted from gravity; distance to the nearest sounding from other cruises; seafloor age; spreading rate; sediment thickness; and vertical gravity gradient. 2) Use available edit decisions as training data sets for a boosted tree algorithm with a binary logistic objective function and L2 regularization. Initial results with poor quality single beam soundings show that the automated algorithm matches the hand-edited data 89% of the time. The results show that most of the information for detecting outliers comes from predicted depth, with secondary contributions from distance to the nearest sounding and longitude. A similar analysis using very high quality multibeam data shows that the automated algorithm matches the hand-edited data 93% of the time. Again, most of the information for detecting outliers comes from predicted depth, with secondary contributions from distance to the nearest sounding and longitude. 3) The third step in the process is to use the machine learning parameters, derived from the training data, to edit 12 million newly acquired single beam sounding data provided by the National Geospatial-Intelligence Agency. The output of the learning algorithm will be
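
    A sketch of step 2's learner: boosted trees with a binary logistic objective and L2 regularization, as implemented for instance in XGBoost. The feature columns mirror the list above, but the data and the flagging rule here are synthetic.

```python
# Boosted-tree outlier flagging with a binary logistic objective and
# L2 regularization (XGBoost; synthetic data, columns mirror the abstract).
import numpy as np
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
cols = ["predicted_depth", "dist_nearest_sounding", "seafloor_age",
        "spreading_rate", "sediment_thickness", "grav_gradient"]
X = rng.normal(size=(10000, len(cols)))
# synthetic flags: soundings far from the gravity prediction are "bad"
y = (np.abs(X[:, 0]) + 0.3 * rng.normal(size=10000) > 1.5).astype(int)

clf = XGBClassifier(objective="binary:logistic", reg_lambda=1.0,
                    n_estimators=200, max_depth=4)
clf.fit(X, y)
print("flagged fraction:", clf.predict(X).mean())
```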

  15. Inverse analysis of turbidites by machine learning

    Science.gov (United States)

    Naruse, H.; Nakao, K.

    2017-12-01

    This study proposes a method to estimate the paleo-hydraulic conditions of turbidity currents from ancient turbidites using a machine-learning technique. In this method, numerical simulation is repeated under various initial conditions, producing a data set of characteristic features of turbidites. This data set is then used for supervised training of a deep-learning neural network (NN). Quantities of characteristic features of turbidites in the training data set are given to the input nodes of the NN, and the output nodes are expected to provide estimates of the initial conditions of the turbidity current. The weight coefficients of the NN are then optimized to reduce the root-mean-square difference between the true conditions and the output values of the NN. The empirical relationship between numerical results and initial conditions is explored in this method, and the discovered relationship is used for inversion of turbidity currents. This machine learning can potentially produce an NN that estimates paleo-hydraulic conditions from data on ancient turbidites. We produced a preliminary implementation of this methodology. A forward model based on 1D shallow-water equations with a correction for the density-stratification effect was employed. This model calculates the behavior of a surge-like turbidity current transporting mixed-size sediment, and outputs the spatial distribution of volume per unit area of each grain-size class on a uniform slope. The grain-size distribution was discretized into 3 classes. The numerical simulation was repeated 1000 times, and thus 1000 beds of turbidites were used as training data for an NN that has 21000 input nodes and 5 output nodes with two hidden layers. After the machine learning finished, independent simulations were conducted 200 times in order to evaluate the performance of the NN. As a result of this test, the initial conditions of the validation data were successfully reconstructed by the NN. The estimated values show very small
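
    The inversion scheme in miniature: a toy forward model stands in for the shallow-water simulation, and a neural network (scikit-learn's MLPRegressor in place of the paper's deep NN) is trained on the inverse mapping from bed features to initial conditions.

```python
# Inversion by machine learning: simulate beds from known initial
# conditions, then train a network to recover the conditions from the beds.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

def forward_model(cond):
    """Toy stand-in for the shallow-water turbidity-current simulation."""
    velocity, concentration = cond
    x = np.linspace(0, 1, 20)
    return concentration * np.exp(-x / (velocity + 0.1))   # deposit profile

conds = rng.uniform(0.1, 1.0, (1000, 2))            # initial conditions
beds = np.array([forward_model(c) for c in conds])  # simulated turbidites

nn = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
nn.fit(beds, conds)                                  # learn the inverse map

test_bed = forward_model([0.4, 0.7])
print("estimated initial conditions:", nn.predict([test_bed])[0])
```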

  16. Recent machine learning advancements in sensor-based mobility analysis: Deep learning for Parkinson's disease assessment.

    Science.gov (United States)

    Eskofier, Bjoern M; Lee, Sunghoon I; Daneault, Jean-Francois; Golabchi, Fatemeh N; Ferreira-Carvalho, Gabriela; Vergara-Diaz, Gloria; Sapienza, Stefano; Costante, Gianluca; Klucken, Jochen; Kautz, Thomas; Bonato, Paolo

    2016-08-01

    The development of wearable sensors has opened the door for long-term assessment of movement disorders. However, there is still a need for developing methods suitable to monitor motor symptoms in and outside the clinic. The purpose of this paper was to investigate deep learning as a method for this monitoring. Deep learning recently broke records in speech and image classification, but it has not been fully investigated as a potential approach to analyze wearable sensor data. We collected data from ten patients with idiopathic Parkinson's disease using inertial measurement units. Several motor tasks were expert-labeled and used for classification. We specifically focused on the detection of bradykinesia. For this, we compared standard machine learning pipelines with deep learning based on convolutional neural networks. Our results showed that deep learning outperformed other state-of-the-art machine learning algorithms by at least 4.6% in terms of classification rate. We contribute a discussion of the advantages and disadvantages of deep learning for sensor-based movement assessment and conclude that deep learning is a promising method for this field.
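
    A generic version of the approach, assuming nothing about the study's exact architecture: a small 1D convolutional network over fixed-length IMU windows (TensorFlow/Keras; shapes and labels here are invented).

```python
# 1D CNN over fixed-length IMU windows for binary symptom detection
# (assumed window length and channel count, random stand-in data).
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
X = rng.normal(size=(512, 200, 6)).astype("float32")  # windows x time x IMU axes
y = rng.integers(0, 2, 512)                           # bradykinesia yes/no

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(200, 6)),
    tf.keras.layers.Conv1D(16, kernel_size=9, activation="relu"),
    tf.keras.layers.MaxPooling1D(4),
    tf.keras.layers.Conv1D(32, kernel_size=9, activation="relu"),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=3, batch_size=32, verbose=0)
print(model.evaluate(X, y, verbose=0))
```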

  17. Broiler chickens can benefit from machine learning: support vector machine analysis of observational epidemiological data.

    Science.gov (United States)

    Hepworth, Philip J; Nefedov, Alexey V; Muchnik, Ilya B; Morgan, Kenton L

    2012-08-07

    Machine-learning algorithms pervade our daily lives. In epidemiology, supervised machine learning has the potential for classification, diagnosis and risk factor identification. Here, we report the use of support vector machine learning to identify the features associated with hock burn on commercial broiler farms, using routinely collected farm management data. These data lend themselves to analysis using machine-learning techniques. Hock burn, dermatitis of the skin over the hock, is an important indicator of broiler health and welfare. Remarkably, this classifier can predict the occurrence of high hock burn prevalence with accuracy of 0.78 on unseen data, as measured by the area under the receiver operating characteristic curve. We also compare the results with those obtained by standard multi-variable logistic regression and suggest that this technique provides new insights into the data. This novel application of a machine-learning algorithm, embedded in poultry management systems could offer significant improvements in broiler health and welfare worldwide.

  18. Intelligent Vehicle Power Management Using Machine Learning and Fuzzy Logic

    National Research Council Canada - National Science Library

    Chen, ZhiHang; Masrur, M. A; Murphey, Yi L

    2008-01-01

    .... A machine learning algorithm, LOPPS, has been developed to learn about optimal power source combinations with respect to minimum power loss for all possible load requests and various system power states...

  19. Active learning machine learns to create new quantum experiments.

    Science.gov (United States)

    Melnikov, Alexey A; Poulsen Nautrup, Hendrik; Krenn, Mario; Dunjko, Vedran; Tiersch, Markus; Zeilinger, Anton; Briegel, Hans J

    2018-02-06

    How useful can machine learning be in a quantum laboratory? Here we raise the question of the potential of intelligent machines in the context of scientific research. A major motivation for the present work is the unknown reachability of various entanglement classes in quantum experiments. We investigate this question by using the projective simulation model, a physics-oriented approach to artificial intelligence. In our approach, the projective simulation system is challenged to design complex photonic quantum experiments that produce high-dimensional entangled multiphoton states, which are of high interest in modern quantum experiments. The artificial intelligence system learns to create a variety of entangled states and improves the efficiency of their realization. In the process, the system autonomously (re)discovers experimental techniques which are only now becoming standard in modern quantum optical experiments-a trait which was not explicitly demanded from the system but emerged through the process of learning. Such features highlight the possibility that machines could have a significantly more creative role in future research.

  20. Advanced methods in NDE using machine learning approaches

    Science.gov (United States)

    Wunderlich, Christian; Tschöpe, Constanze; Duckhorn, Frank

    2018-04-01

    Machine learning (ML) methods and algorithms have recently been applied with great success in quality control and predictive maintenance. Their goal, to build new algorithms or leverage existing ones that learn from training data and give accurate predictions or find patterns, particularly in new and unseen but similar data, fits non-destructive evaluation (NDE) perfectly. The advantages of ML in NDE are obvious in tasks such as pattern recognition in acoustic signals or automated processing of images from X-ray, ultrasonic or optical methods. Fraunhofer IKTS uses machine learning algorithms in acoustic signal analysis, and the approach has been applied to a variety of tasks in quality assessment. The principal approach is based on acoustic signal processing with a primary and a secondary analysis step, followed by a cognitive system to create model data. Already in the secondary analysis step, unsupervised learning algorithms such as principal component analysis are used to simplify data structures. In the cognitive part of the software, further unsupervised and supervised learning algorithms are trained. The sensor signals from unknown samples can then be recognized and classified automatically by the previously trained algorithms. Recently the IKTS team was able to transfer the software for signal processing and pattern recognition to a small printed circuit board (PCB): algorithms are still trained on an ordinary PC, but the trained algorithms run on the digital signal processor and the FPGA chip. The identical approach will be used for pattern recognition in image analysis of OCT pictures. Some key requirements have to be fulfilled, however: a sufficiently large set of training data, a high signal-to-noise ratio, and an optimized and exact fixation of components. The automated testing can then be done by the machine. By integrating the test data of many components along the value chain, further optimization, including lifetime and durability prediction, becomes possible.
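
    A minimal sketch of the two-stage idea above, assuming acoustic feature vectors have already been extracted from the raw signals: an unsupervised simplification step (principal component analysis) feeding a supervised classifier. The data are synthetic placeholders:

        import numpy as np
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler
        from sklearn.decomposition import PCA
        from sklearn.svm import SVC

        rng = np.random.default_rng(1)
        X = rng.normal(size=(300, 64))        # e.g. spectral features per test signal
        y = rng.integers(0, 2, size=300)      # pass/fail quality label

        model = make_pipeline(StandardScaler(),
                              PCA(n_components=10),  # unsupervised simplification
                              SVC())                 # supervised recognition
        model.fit(X, y)
        print(model.predict(X[:5]))           # classify new sensor signals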

  1. Distributed Extreme Learning Machine for Nonlinear Learning over Network

    Directory of Open Access Journals (Sweden)

    Songyan Huang

    2015-02-01

    Full Text Available Distributed data collection and analysis over a network are ubiquitous, especially over a wireless sensor network (WSN). To our knowledge, the data model used in most of the distributed algorithms is linear. However, in real applications, the linearity of systems is not always guaranteed. In nonlinear cases, the single hidden layer feedforward neural network (SLFN) with radial basis function (RBF) hidden neurons has the ability to approximate any continuous function and, thus, may be used as the nonlinear learning system. However, confined by the communication cost, using the distributed version of the conventional algorithms to train the neural network directly is usually prohibited. Fortunately, based on the theorems provided in the extreme learning machine (ELM) literature, we only need to compute the output weights of the SLFN. Computing the output weights itself is a linear learning problem, although the input-output mapping of the overall SLFN is still nonlinear. Using the distributed algorithm to cooperatively compute the output weights of the SLFN, we obtain a distributed extreme learning machine (dELM) for nonlinear learning in this paper. This dELM is applied to regression and classification problems to demonstrate its effectiveness and advantages.
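
    The central ELM observation the record relies on, that only the output weights need training, fits in a few lines; the tanh activation below is an illustrative stand-in for the RBF neurons mentioned above:

        import numpy as np

        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 3))                 # inputs
        y = np.sin(X).sum(axis=1)                     # a nonlinear target

        n_hidden = 50
        W = rng.normal(size=(3, n_hidden))            # random input weights (never trained)
        b = rng.normal(size=n_hidden)                 # random biases
        H = np.tanh(X @ W + b)                        # hidden-layer output matrix

        beta, *_ = np.linalg.lstsq(H, y, rcond=None)  # output weights: a linear problem
        y_hat = H @ beta                              # predictions of the trained SLFN
        print(np.mean((y - y_hat) ** 2))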

  2. Airline Passenger Profiling Based on Fuzzy Deep Machine Learning.

    Science.gov (United States)

    Zheng, Yu-Jun; Sheng, Wei-Guo; Sun, Xing-Ming; Chen, Sheng-Yong

    2017-12-01

    Passenger profiling plays a vital part in commercial aviation security, but classical methods become very inefficient in handling the rapidly increasing amounts of electronic records. This paper proposes a deep learning approach to passenger profiling. The center of our approach is a Pythagorean fuzzy deep Boltzmann machine (PFDBM), whose parameters are expressed by Pythagorean fuzzy numbers such that each neuron can learn how a feature affects the production of the correct output from both the positive and negative sides. We propose a hybrid algorithm combining a gradient-based method and an evolutionary algorithm for training the PFDBM. Based on the novel learning model, we develop a deep neural network (DNN) for classifying normal passengers and potential attackers, and further develop an integrated DNN for identifying group attackers whose individual features are insufficient to reveal the abnormality. Experiments on data sets from Air China show that our approach provides much higher learning ability and classification accuracy than existing profilers. It is expected that the fuzzy deep learning approach can be adapted for a variety of complex pattern analysis tasks.

  3. Semi-supervised and unsupervised extreme learning machines.

    Science.gov (United States)

    Huang, Gao; Song, Shiji; Gupta, Jatinder N D; Wu, Cheng

    2014-12-01

    Extreme learning machines (ELMs) have proven to be efficient and effective learning mechanisms for pattern classification and regression. However, ELMs are primarily applied to supervised learning problems. Only a few existing research papers have used ELMs to explore unlabeled data. In this paper, we extend ELMs for both semi-supervised and unsupervised tasks based on manifold regularization, thus greatly expanding the applicability of ELMs. The key advantages of the proposed algorithms are as follows: 1) both the semi-supervised ELM (SS-ELM) and the unsupervised ELM (US-ELM) exhibit the learning capability and computational efficiency of ELMs; 2) both algorithms naturally handle multiclass classification or multicluster clustering; and 3) both algorithms are inductive and can handle unseen data at test time directly. Moreover, it is shown in this paper that all the supervised, semi-supervised, and unsupervised ELMs can actually be put into a unified framework. This provides new perspectives for understanding the mechanism of random feature mapping, which is the key concept in ELM theory. An empirical study on a wide range of data sets demonstrates that the proposed algorithms are competitive with state-of-the-art semi-supervised or unsupervised learning algorithms in terms of accuracy and efficiency.

  4. IoT Security Techniques Based on Machine Learning

    OpenAIRE

    Xiao, Liang; Wan, Xiaoyue; Lu, Xiaozhen; Zhang, Yanyong; Wu, Di

    2018-01-01

    Internet of things (IoT) that integrate a variety of devices into networks to provide advanced and intelligent services have to protect user privacy and address attacks such as spoofing attacks, denial of service attacks, jamming and eavesdropping. In this article, we investigate the attack model for IoT systems, and review the IoT security solutions based on machine learning techniques including supervised learning, unsupervised learning and reinforcement learning. We focus on the machine le...

  5. Extremely Randomized Machine Learning Methods for Compound Activity Prediction

    Directory of Open Access Journals (Sweden)

    Wojciech M. Czarnecki

    2015-11-01

    Full Text Available Speed, a relatively low requirement for computational resources and high effectiveness in evaluating the bioactivity of compounds have caused a rapid growth of interest in applying machine learning methods to virtual screening tasks. However, due to the growing amount of data in cheminformatics and related fields, research has shifted not only towards developing algorithms of high predictive power but also towards simplifying existing methods to obtain results more quickly. In this study, we tested two approaches belonging to the group of so-called ‘extremely randomized methods’, Extreme Entropy Machine and Extremely Randomized Trees, for their ability to properly identify compounds that have activity towards particular protein targets. These methods were compared with their ‘non-extreme’ competitors, i.e., Support Vector Machine and Random Forest. The extreme approaches not only improved the efficiency of classifying bioactive compounds, but also proved to be less computationally complex, requiring fewer steps to perform an optimization procedure.
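
    Of the two extreme methods named above, Extremely Randomized Trees has a standard scikit-learn implementation, so the comparison against a random forest can be sketched as follows (synthetic binary fingerprints stand in for the compound data; Extreme Entropy Machine has no off-the-shelf implementation and is omitted):

        import numpy as np
        from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(0)
        X = rng.integers(0, 2, size=(400, 100)).astype(float)  # binary fingerprints
        y = (X[:, :5].sum(axis=1) > 2).astype(int)             # toy activity rule

        for Model in (ExtraTreesClassifier, RandomForestClassifier):
            clf = Model(n_estimators=100, random_state=0)
            print(Model.__name__, cross_val_score(clf, X, y, cv=5).mean())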

  6. Quantum machine learning what quantum computing means to data mining

    CERN Document Server

    Wittek, Peter

    2014-01-01

    Quantum Machine Learning bridges the gap between abstract developments in quantum computing and the applied research on machine learning. Paring down the complexity of the disciplines involved, it focuses on providing a synthesis that explains the most important machine learning algorithms in a quantum framework. Theoretical advances in quantum computing are hard to follow for computer scientists, and sometimes even for researchers involved in the field. The lack of a step-by-step guide hampers the broader understanding of this emergent interdisciplinary body of research. Quantum Machine L

  7. Machine Learning in Radiology: Applications Beyond Image Interpretation.

    Science.gov (United States)

    Lakhani, Paras; Prater, Adam B; Hutson, R Kent; Andriole, Kathy P; Dreyer, Keith J; Morey, Jose; Prevedello, Luciano M; Clark, Toshi J; Geis, J Raymond; Itri, Jason N; Hawkins, C Matthew

    2018-02-01

    Much attention has been given to machine learning and its perceived impact in radiology, particularly in light of recent success with image classification in international competitions. However, machine learning is likely to impact radiology outside of image interpretation long before a fully functional "machine radiologist" is implemented in practice. Here, we describe an overview of machine learning, its application to radiology and other domains, and many use cases that do not involve image interpretation. We hope that better understanding of these potential applications will help radiology practices prepare for the future and realize performance improvement and efficiency gains. Copyright © 2017 American College of Radiology. Published by Elsevier Inc. All rights reserved.

  8. CRDM motion analysis using machine learning technique

    International Nuclear Information System (INIS)

    Nishimura, Takuya; Nakayama, Hiroyuki; Saitoh, Mayumi; Yaguchi, Seiji

    2017-01-01

    Magnetic jack type Control Rod Drive Mechanism (CRDM) for pressurized water reactor (PWR) plants operates control rods in response to electrical signals from a reactor control system. CRDM operability is evaluated by quantifying the armature's closed/opened response time, i.e., the interval between the coil energizing/de-energizing points and the armature closed/opened points. MHI has already developed an automatic CRDM motion analysis and applied it to actual plants. However, CRDM operational data vary widely with characteristics such as plant condition and plant type, and the existing motion analysis suffers from accuracy issues when a single analysis technique is applied across all plant conditions and plants. In this study, MHI investigated motion analysis using machine learning (Random Forests), which flexibly accommodates the wide variation in CRDM operational data and improves analysis accuracy. (author)

  9. Pileup Mitigation with Machine Learning (PUMML)

    Science.gov (United States)

    Komiske, Patrick T.; Metodiev, Eric M.; Nachman, Benjamin; Schwartz, Matthew D.

    2017-12-01

    Pileup involves the contamination of the energy distribution arising from the primary collision of interest (leading vertex) by radiation from soft collisions (pileup). We develop a new technique for removing this contamination using machine learning and convolutional neural networks. The network takes as input the energy distribution of charged leading vertex particles, charged pileup particles, and all neutral particles and outputs the energy distribution of particles coming from leading vertex alone. The PUMML algorithm performs remarkably well at eliminating pileup distortion on a wide range of simple and complex jet observables. We test the robustness of the algorithm in a number of ways and discuss how the network can be trained directly on data.

  10. Using Machine Learning to Predict MCNP Bias

    Energy Technology Data Exchange (ETDEWEB)

    Grechanuk, Pavel Aleksandrovi [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2018-01-09

    For many real-world applications in radiation transport where simulations are compared to experimental measurements, like in nuclear criticality safety, the bias (simulated - experimental keff) in the calculation is an extremely important quantity used for code validation. The objective of this project is to accurately predict the bias of MCNP6 [1] criticality calculations using machine learning (ML) algorithms, with the intention of creating a tool that can complement the current nuclear criticality safety methods. In the latest release of MCNP6, the Whisper tool is available for criticality safety analysts and includes a large catalogue of experimental benchmarks, sensitivity profiles, and nuclear data covariance matrices. This data, coming from 1100+ benchmark cases, is used in this study of ML algorithms for criticality safety bias predictions.

  11. IEEE International Workshop on Machine Learning for Signal Processing: Preface

    DEFF Research Database (Denmark)

    Tao, Jianhua

    The 21st IEEE International Workshop on Machine Learning for Signal Processing will be held in Beijing, China, on September 18–21, 2011. The workshop series is the major annual technical event of the IEEE Signal Processing Society's Technical Committee on Machine Learning for Signal Processing...

  12. Combining Formal Logic and Machine Learning for Sentiment Analysis

    DEFF Research Database (Denmark)

    Petersen, Niklas Christoffer; Villadsen, Jørgen

    2014-01-01

    This paper presents a formal logical method for deep structural analysis of the syntactical properties of texts using machine learning techniques for efficient syntactical tagging. To evaluate the method it is used for entity level sentiment analysis as an alternative to pure machine learning...

  13. Applying Machine Learning to Facilitate Autism Diagnostics: Pitfalls and Promises

    Science.gov (United States)

    Bone, Daniel; Goodwin, Matthew S.; Black, Matthew P.; Lee, Chi-Chun; Audhkhasi, Kartik; Narayanan, Shrikanth

    2015-01-01

    Machine learning has immense potential to enhance diagnostic and intervention research in the behavioral sciences, and may be especially useful in investigations involving the highly prevalent and heterogeneous syndrome of autism spectrum disorder. However, use of machine learning in the absence of clinical domain expertise can be tenuous and lead…

  14. An active role for machine learning in drug development

    Science.gov (United States)

    Murphy, Robert F.

    2014-01-01

    Due to the complexity of biological systems, cutting-edge machine-learning methods will be critical for future drug development. In particular, machine-vision methods to extract detailed information from imaging assays and active-learning methods to guide experimentation will be required to overcome the dimensionality problem in drug development. PMID:21587249

  15. Newton Methods for Large Scale Problems in Machine Learning

    Science.gov (United States)

    Hansen, Samantha Leigh

    2014-01-01

    The focus of this thesis is on practical ways of designing optimization algorithms for minimizing large-scale nonlinear functions with applications in machine learning. Chapter 1 introduces the overarching ideas in the thesis. Chapters 2 and 3 are geared towards supervised machine learning applications that involve minimizing a sum of loss…

  16. Large-Scale Machine Learning for Classification and Search

    Science.gov (United States)

    Liu, Wei

    2012-01-01

    With the rapid development of the Internet, nowadays tremendous amounts of data including images and videos, up to millions or billions, can be collected for training machine learning models. Inspired by this trend, this thesis is dedicated to developing large-scale machine learning techniques for the purpose of making classification and nearest…

  17. Empirical Studies On Machine Learning Based Text Classification Algorithms

    OpenAIRE

    Shweta C. Dharmadhikari; Maya Ingle; Parag Kulkarni

    2011-01-01

    Automatic classification of text documents has become an important research issue nowadays. Proper classification of text documents requires information retrieval, machine learning and natural language processing (NLP) techniques. Our aim is to focus on important approaches to automatic text classification based on machine learning techniques, viz. supervised, unsupervised and semi-supervised. In this paper we present a review of various text classification approaches under the machine learning paradigm...

  18. Dropout Prediction in E-Learning Courses through the Combination of Machine Learning Techniques

    Science.gov (United States)

    Lykourentzou, Ioanna; Giannoukos, Ioannis; Nikolopoulos, Vassilis; Mpardis, George; Loumos, Vassili

    2009-01-01

    In this paper, a dropout prediction method for e-learning courses, based on three popular machine learning techniques and detailed student data, is proposed. The machine learning techniques used are feed-forward neural networks, support vector machines and probabilistic ensemble simplified fuzzy ARTMAP. Since a single technique may fail to…

  19. Effective and efficient optics inspection approach using machine learning algorithms

    International Nuclear Information System (INIS)

    Abdulla, G.; Kegelmeyer, L.; Liao, Z.; Carr, W.

    2010-01-01

    The Final Optics Damage Inspection (FODI) system automatically acquires images of the final optics at the National Ignition Facility (NIF), which the Optics Inspection (OI) system then analyzes. During each inspection cycle up to 1000 images acquired by FODI are examined by OI to identify and track damage sites on the optics. The process of tracking growing damage sites on the surface of an optic can be made more effective by identifying and removing signals associated with debris or reflections. The manual process to filter these false sites is daunting and time-consuming. In this paper we discuss the use of machine learning tools and data mining techniques to help with this task. We describe the process to prepare a data set that can be used for training and identifying hardware reflections in the image data. In order to collect training data, the images are first automatically acquired and analyzed with existing software, and then relevant features such as spatial, physical and luminosity measures are extracted for each site. A subset of these sites is 'truthed', or manually assigned a class, to create training data. A supervised classification algorithm is used to test if the features can predict the class membership of new sites. A suite of self-configuring machine learning tools called 'Avatar Tools' is applied to classify all sites. To verify, we used 10-fold cross-validation and found the accuracy was above 99%. This substantially reduces the number of false alarms that would otherwise be sent for more extensive investigation.
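
    The verification step can be sketched as follows, with a random forest standing in for the proprietary 'Avatar Tools' suite and synthetic per-site features standing in for the extracted spatial, physical and luminosity measures:

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(0)
        X = rng.normal(size=(1000, 8))              # per-site features
        y = (X[:, 0] - X[:, 3] > 0).astype(int)     # damage site vs. reflection

        clf = RandomForestClassifier(n_estimators=200, random_state=0)
        scores = cross_val_score(clf, X, y, cv=10)  # 10-fold cross-validation
        print("mean accuracy:", scores.mean())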

  20. Machine learning of molecular electronic properties in chemical compound space

    Science.gov (United States)

    Montavon, Grégoire; Rupp, Matthias; Gobre, Vivekanand; Vazquez-Mayagoitia, Alvaro; Hansen, Katja; Tkatchenko, Alexandre; Müller, Klaus-Robert; Anatole von Lilienfeld, O.

    2013-09-01

    The combination of modern scientific computing with electronic structure theory can lead to an unprecedented amount of data amenable to intelligent data analysis for the identification of meaningful, novel and predictive structure-property relationships. Such relationships enable high-throughput screening for relevant properties in an exponentially growing pool of virtual compounds that are synthetically accessible. Here, we present a machine learning model, trained on a database of ab initio calculation results for thousands of organic molecules, that simultaneously predicts multiple electronic ground- and excited-state properties. The properties include atomization energy, polarizability, frontier orbital eigenvalues, ionization potential, electron affinity and excitation energies. The machine learning model is based on a deep multi-task artificial neural network, exploiting the underlying correlations between various molecular properties. The input is identical to ab initio methods, i.e. nuclear charges and Cartesian coordinates of all atoms. For small organic molecules, the accuracy of such a ‘quantum machine’ is similar, and sometimes superior, to modern quantum-chemical methods—at negligible computational cost.
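
    A rough sketch of the multi-task idea: one network with shared hidden layers regressing several properties at once. Real inputs would be descriptors derived from nuclear charges and Cartesian coordinates; the toy, correlated targets below only stand in for the actual electronic properties:

        import numpy as np
        from sklearn.neural_network import MLPRegressor

        rng = np.random.default_rng(0)
        X = rng.normal(size=(500, 30))             # placeholder molecular descriptors
        Y = np.column_stack([
            X[:, 0] + X[:, 1],                     # e.g. atomization energy
            X[:, 0] - 0.5 * X[:, 2],               # e.g. polarizability
            0.3 * X[:, 1] + X[:, 3],               # e.g. a frontier orbital eigenvalue
        ])

        net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
        net.fit(X, Y)                              # shared hidden layers across all tasks
        print(net.predict(X[:2]).shape)            # -> (2, 3): all properties at once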

  1. Machine learning of molecular electronic properties in chemical compound space

    International Nuclear Information System (INIS)

    Montavon, Grégoire; Müller, Klaus-Robert; Rupp, Matthias; Gobre, Vivekanand; Hansen, Katja; Tkatchenko, Alexandre; Vazquez-Mayagoitia, Alvaro; Anatole von Lilienfeld, O

    2013-01-01

    The combination of modern scientific computing with electronic structure theory can lead to an unprecedented amount of data amenable to intelligent data analysis for the identification of meaningful, novel and predictive structure–property relationships. Such relationships enable high-throughput screening for relevant properties in an exponentially growing pool of virtual compounds that are synthetically accessible. Here, we present a machine learning model, trained on a database of ab initio calculation results for thousands of organic molecules, that simultaneously predicts multiple electronic ground- and excited-state properties. The properties include atomization energy, polarizability, frontier orbital eigenvalues, ionization potential, electron affinity and excitation energies. The machine learning model is based on a deep multi-task artificial neural network, exploiting the underlying correlations between various molecular properties. The input is identical to ab initio methods, i.e. nuclear charges and Cartesian coordinates of all atoms. For small organic molecules, the accuracy of such a ‘quantum machine’ is similar, and sometimes superior, to modern quantum-chemical methods—at negligible computational cost. (paper)

  2. Revisit of Machine Learning Supported Biological and Biomedical Studies.

    Science.gov (United States)

    Yu, Xiang-Tian; Wang, Lu; Zeng, Tao

    2018-01-01

    Generally, machine learning includes many in silico methods that transform the principles underlying natural phenomena into human-understandable information, aiming to save human labor, to assist human judgment, and to create human knowledge. It should have wide application potential in biological and biomedical studies, especially in the era of big biological data. To look through the application of machine learning along with biological development, this review provides wide-ranging cases to introduce the selection of machine learning methods in different practice scenarios involved in the whole biological and biomedical study cycle, and further discusses machine learning strategies for analyzing omics data in some cutting-edge biological studies. Finally, notes on new challenges for machine learning posed by small-sample, high-dimensional data are summarized, focusing on sample imbalance, white-box modeling, and causality.

  3. Source localization in an ocean waveguide using supervised machine learning.

    Science.gov (United States)

    Niu, Haiqiang; Reeves, Emma; Gerstoft, Peter

    2017-09-01

    Source localization in ocean acoustics is posed as a machine learning problem in which data-driven methods learn source ranges directly from observed acoustic data. The pressure received by a vertical linear array is preprocessed by constructing a normalized sample covariance matrix and used as the input for three machine learning methods: feed-forward neural networks (FNN), support vector machines (SVM), and random forests (RF). The range estimation problem is solved both as a classification problem and as a regression problem by these three machine learning algorithms. The results of range estimation for the Noise09 experiment are compared for FNN, SVM, RF, and conventional matched-field processing and demonstrate the potential of machine learning for underwater source localization.
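
    A sketch of posing the same estimation problem both ways, as described above, with random features standing in for the normalized sample covariance matrix inputs and a random forest used for both views:

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

        rng = np.random.default_rng(0)
        X = rng.normal(size=(600, 40))                       # covariance-matrix features
        r = rng.uniform(0.5, 3.0, size=600)                  # true source range (km)
        r_bins = np.digitize(r, np.arange(0.5, 3.0, 0.25))   # discretized range classes

        clf = RandomForestClassifier(random_state=0).fit(X, r_bins)  # classification view
        reg = RandomForestRegressor(random_state=0).fit(X, r)        # regression view
        print(clf.predict(X[:3]), reg.predict(X[:3]))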

  4. Indicators of ADHD symptoms in virtual learning contexts using machine learning techniques

    Directory of Open Access Journals (Sweden)

    Laura Patricia Mancera Valetts

    2015-12-01

    Full Text Available This paper presents a user model for students performing virtual learning processes. This model is used to infer the presence of Attention Deficit Hyperactivity Disorder (ADHD) indicators in a student. The user model is built considering three user characteristics, which can also be used as variables in different contexts: behavioral conduct (BC), executive functions performance (EFP), and emotional state (ES). For inferring the ADHD symptomatic profile of a student and his/her emotional alterations, these features are used as input to a set of classification rules. Based on testing of the proposed model, training examples are obtained; these examples are used to train a machine learning classification algorithm to perform, and improve on, the task of profiling a student. The proposed user model can provide the first step towards adapting learning resources in e-learning platforms to people with attention problems, specifically young adult students with ADHD.

  5. Impedance learning for robotic contact tasks using natural actor-critic algorithm.

    Science.gov (United States)

    Kim, Byungchan; Park, Jooyoung; Park, Shinsuk; Kang, Sungchul

    2010-04-01

    Compared with their robotic counterparts, humans excel at various tasks by using their ability to adaptively modulate arm impedance parameters. This ability allows us to successfully perform contact tasks even in uncertain environments. This paper considers a learning strategy for motor skills in robotic contact tasks based on a human motor control theory and machine learning schemes. Our robot learning method employs impedance control based on the equilibrium point control theory and reinforcement learning to determine the impedance parameters for contact tasks. A recursive least-squares filter-based episodic natural actor-critic algorithm is used to find the optimal impedance parameters. The effectiveness of the proposed method was tested through dynamic simulations of various contact tasks. The simulation results demonstrated that the proposed method optimizes the performance of the contact tasks in uncertain environmental conditions.

  6. Machine Shop I. Learning Activity Packets (LAPs). Section C--Hand and Bench Work.

    Science.gov (United States)

    Oklahoma State Board of Vocational and Technical Education, Stillwater. Curriculum and Instructional Materials Center.

    This document contains two learning activity packets (LAPs) for the "hand and bench work" instructional area of a Machine Shop I course. The two LAPs cover the following topics: hand and bench work and pedestal grinder. Each LAP contains a cover sheet that describes its purpose, an introduction, and the tasks included in the LAP;…

  7. Machine Shop I. Learning Activity Packets (LAPs). Section A--Orientation.

    Science.gov (United States)

    Oklahoma State Board of Vocational and Technical Education, Stillwater. Curriculum and Instructional Materials Center.

    This document contains two learning activity packets (LAPs) for the "orientation and safety" instructional area of a Machine Shop I course. The two LAPs cover the following topics: orientation and general shop safety. Each LAP contains a cover sheet that describes its purpose, an introduction, and the tasks included in the LAP; learning…

  8. A comparative study of machine learning classifiers for modeling travel mode choice

    NARCIS (Netherlands)

    Hagenauer, J; Helbich, M

    2017-01-01

    The analysis of travel mode choice is an important task in transportation planning and policy making in order to understand and predict travel demands. While advances in machine learning have led to numerous powerful classifiers, their usefulness for modeling travel mode choice remains largely

  9. Learning Activity Packets for Grinding Machines. Unit II--Surface Grinding.

    Science.gov (United States)

    Oklahoma State Board of Vocational and Technical Education, Stillwater. Curriculum and Instructional Materials Center.

    This learning activity packet (LAP) is one of three that accompany the curriculum guide on grinding machines. It outlines the study activities and performance tasks for the second unit of this curriculum guide. Its purpose is to aid the student in attaining a working knowledge of this area of training and in achieving a skilled or moderately…

  10. Learning Activity Packets for Grinding Machines. Unit III--Cylindrical Grinding.

    Science.gov (United States)

    Oklahoma State Board of Vocational and Technical Education, Stillwater. Curriculum and Instructional Materials Center.

    This learning activity packet (LAP) is one of three that accompany the curriculum guide on grinding machines. It outlines the study activities and performance tasks for the third unit of this curriculum guide. Its purpose is to aid the student in attaining a working knowledge of this area of training and in achieving a skilled or moderately…

  11. Progressive sampling-based Bayesian optimization for efficient and automatic machine learning model selection.

    Science.gov (United States)

    Zeng, Xueqiang; Luo, Gang

    2017-12-01

    Machine learning is broadly used for clinical data analysis. Before training a model, a machine learning algorithm must be selected. Also, the values of one or more model parameters termed hyper-parameters must be set. Selecting algorithms and hyper-parameter values requires advanced machine learning knowledge and many labor-intensive manual iterations. To lower the bar to machine learning, miscellaneous automatic selection methods for algorithms and/or hyper-parameter values have been proposed. Existing automatic selection methods are inefficient on large data sets. This poses a challenge for using machine learning in the clinical big data era. To address the challenge, this paper presents progressive sampling-based Bayesian optimization, an efficient and automatic selection method for both algorithms and hyper-parameter values. We report an implementation of the method. We show that compared to a state-of-the-art automatic selection method, our method can significantly reduce search time, classification error rate, and standard deviation of error rate due to randomization. This is major progress towards enabling fast turnaround in identifying high-quality solutions required by many machine learning-based clinical data analysis tasks.
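
    A loose sketch of the progressive-sampling idea, with random candidate generation standing in for the paper's Bayesian optimizer: candidate hyper-parameter settings are evaluated on growing samples of the data, and the weaker half is discarded at each stage so that full-data evaluations are spent only on promising configurations:

        import numpy as np
        from sklearn.svm import SVC
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(0)
        X = rng.normal(size=(4000, 10))
        y = (X[:, 0] * X[:, 1] > 0).astype(int)

        candidates = [{"C": 10.0 ** rng.uniform(-2, 2),
                       "gamma": 10.0 ** rng.uniform(-3, 1)} for _ in range(16)]

        for n in (250, 1000, 4000):                    # progressively larger samples
            scores = [cross_val_score(SVC(**p), X[:n], y[:n], cv=3).mean()
                      for p in candidates]
            keep = np.argsort(scores)[::-1][: max(1, len(candidates) // 2)]
            candidates = [candidates[i] for i in keep]  # discard weak candidates early

        print("selected hyper-parameters:", candidates[0])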

  12. Using Machine Learning to Search for MSSM Higgs Bosons

    CERN Document Server

    Diesing, Rebecca

    2016-01-01

    This paper examines the performance of machine learning in the identification of Minimally Supersymmetric Standard Model (MSSM) Higgs bosons, and compares this performance to that of traditional cut strategies. Two boosted decision tree implementations were tested, scikit-learn's and XGBoost's. These tests indicated that machine learning can perform significantly better than traditional cuts. However, since machine learning in this form cannot be directly implemented in a real MSSM Higgs analysis, this performance information was instead used to better understand the relationships between training variables. Further studies might use this information to construct an improved cut strategy.
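
    A toy sketch of the comparison: a boosted decision tree against a single-variable cut, with Gaussian 'signal' and 'background' samples standing in for MSSM Higgs kinematic variables (scikit-learn's gradient-boosted trees shown; the study also used XGBoost):

        import numpy as np
        from sklearn.ensemble import GradientBoostingClassifier
        from sklearn.metrics import roc_auc_score
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(0)
        sig = rng.normal(loc=0.5, size=(2000, 5))     # signal-like events
        bkg = rng.normal(loc=0.0, size=(2000, 5))     # background-like events
        X = np.vstack([sig, bkg])
        y = np.array([1] * 2000 + [0] * 2000)

        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
        bdt = GradientBoostingClassifier().fit(X_tr, y_tr)
        print("BDT AUC:", roc_auc_score(y_te, bdt.predict_proba(X_te)[:, 1]))
        print("cut AUC:", roc_auc_score(y_te, X_te[:, 0]))  # thresholding one variable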

  13. Machine learning in radiation oncology theory and applications

    CERN Document Server

    El Naqa, Issam; Murphy, Martin J

    2015-01-01

    ​This book provides a complete overview of the role of machine learning in radiation oncology and medical physics, covering basic theory, methods, and a variety of applications in medical physics and radiotherapy. An introductory section explains machine learning, reviews supervised and unsupervised learning methods, discusses performance evaluation, and summarizes potential applications in radiation oncology. Detailed individual sections are then devoted to the use of machine learning in quality assurance; computer-aided detection, including treatment planning and contouring; image-guided rad

  14. Reinforcement and Systemic Machine Learning for Decision Making

    CERN Document Server

    Kulkarni, Parag

    2012-01-01

    Reinforcement and Systemic Machine Learning for Decision Making There are always difficulties in making machines that learn from experience. Complete information is not always available-or it becomes available in bits and pieces over a period of time. With respect to systemic learning, there is a need to understand the impact of decisions and actions on a system over that period of time. This book takes a holistic approach to addressing that need and presents a new paradigm-creating new learning applications and, ultimately, more intelligent machines. The first book of its kind in this new an

  15. Precision Parameter Estimation and Machine Learning

    Science.gov (United States)

    Wandelt, Benjamin D.

    2008-12-01

    I discuss the strategy of 'Acceleration by Parallel Precomputation and Learning' (APPLe) that can vastly accelerate parameter estimation in high-dimensional parameter spaces with costly likelihood functions, using trivially parallel computing to speed up sequential exploration of parameter space. This strategy combines the power of distributed computing with machine learning and Markov-Chain Monte Carlo techniques efficiently to explore a likelihood function, posterior distribution or χ2-surface. This strategy is particularly successful in cases where computing the likelihood is costly and the number of parameters is moderate or large. We apply this technique to two central problems in cosmology: the solution of the cosmological parameter estimation problem with sufficient accuracy for the Planck data, using PICo; and the detailed calculation of cosmological helium and hydrogen recombination, with RICO. Since the APPLe approach is designed to be able to use massively parallel resources to speed up problems that are inherently serial, we can bring the power of distributed computing to bear on parameter estimation problems. We have demonstrated this with the Cosmology@Home project.

  16. Quantum learning and universal quantum matching machine

    International Nuclear Information System (INIS)

    Sasaki, Masahide; Carlini, Alberto

    2002-01-01

    Suppose that three kinds of quantum systems are given in some unknown states |f⟩⊗N, |g1⟩⊗K, and |g2⟩⊗K, and we want to decide which template state, |g1⟩ or |g2⟩, each representing the feature of the pattern class C1 or C2, respectively, is closest to the input feature state |f⟩. This is an extension of the pattern matching problem into the quantum domain. Assuming that these states are known a priori to belong to a certain parametric family of pure qubit systems, we derive two kinds of matching strategies. The first one is a semiclassical strategy that is obtained by the natural extension of conventional matching strategies and consists of a two-stage procedure: identification (estimation) of the unknown template states to design the classifier (a learning process to train the classifier), and classification of the input system into the appropriate pattern class based on the estimated results. The other is a fully quantum strategy without any intermediate measurement, which we might call the universal quantum matching machine. We present the Bayes optimal solutions for both strategies in the case of K=1, showing that there certainly exists a fully quantum matching procedure that is strictly superior to the straightforward semiclassical extension of the conventional matching strategy based on the learning process

  17. Neural architecture design based on extreme learning machine.

    Science.gov (United States)

    Bueno-Crespo, Andrés; García-Laencina, Pedro J; Sancho-Gómez, José-Luis

    2013-12-01

    Selection of the optimal neural architecture to solve a pattern classification problem entails choosing the relevant input units, the number of hidden neurons and their corresponding interconnection weights. This problem has been widely studied in many research works, but the proposed solutions usually involve excessive computational cost and do not provide a unique solution. This paper proposes a new technique to efficiently design the MultiLayer Perceptron (MLP) architecture for classification using the Extreme Learning Machine (ELM) algorithm. The proposed method provides a high generalization capability and a unique solution for the architecture design. Moreover, the selected final network only retains those input connections that are relevant for the classification task. Experimental results show these advantages. Copyright © 2013 Elsevier Ltd. All rights reserved.

  18. Machine-Learning-Based No Show Prediction in Outpatient Visits

    Directory of Open Access Journals (Sweden)

    Carlos Elvira

    2018-03-01

    Full Text Available A recurring problem in healthcare is the high percentage of patients who miss their appointment, be it a consultation or a hospital test. The present study seeks patients' behavioural patterns that allow predicting the probability of no-shows. We explore the convenience of using Big Data machine learning models to accomplish this task. To begin with, a predictive model based only on variables associated with the target appointment is built. Then the model is improved by considering the patient's history of appointments. In both cases, the Gradient Boosting algorithm was the predictor of choice. Our numerical results are considered promising given the small amount of information available. However, there seems to be plenty of room to improve the model if we manage to collect additional data for both patients and appointments.
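
    A sketch of the two model variants described: gradient boosting on appointment variables alone, then the same features plus a simple patient-history variable (a hypothetical prior no-show rate). All fields are illustrative:

        import numpy as np
        from sklearn.ensemble import GradientBoostingClassifier
        from sklearn.metrics import roc_auc_score
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(0)
        n = 3000
        appt = np.column_stack([rng.integers(0, 7, n),       # weekday
                                rng.uniform(8, 18, n),       # hour of day
                                rng.integers(0, 60, n)])     # days of lead time
        history = rng.beta(1, 4, size=n)                     # prior no-show rate
        y = (0.02 * appt[:, 2] + 2 * history + rng.normal(size=n) > 1.5).astype(int)

        for X in (appt, np.column_stack([appt, history])):   # without / with history
            X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
            m = GradientBoostingClassifier().fit(X_tr, y_tr)
            print("AUC:", roc_auc_score(y_te, m.predict_proba(X_te)[:, 1]))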

  19. Machine learning based switching model for electricity load forecasting

    Energy Technology Data Exchange (ETDEWEB)

    Fan, Shu; Lee, Wei-Jen [Energy Systems Research Center, The University of Texas at Arlington, 416 S. College Street, Arlington, TX 76019 (United States); Chen, Luonan [Department of Electronics, Information and Communication Engineering, Osaka Sangyo University, 3-1-1 Nakagaito, Daito, Osaka 574-0013 (Japan)

    2008-06-15

    In deregulated power markets, forecasting electricity loads is one of the most essential tasks for system planning, operation and decision making. Based on an integration of two machine learning techniques: Bayesian clustering by dynamics (BCD) and support vector regression (SVR), this paper proposes a novel forecasting model for day ahead electricity load forecasting. The proposed model adopts an integrated architecture to handle the non-stationarity of time series. Firstly, a BCD classifier is applied to cluster the input data set into several subsets by the dynamics of the time series in an unsupervised manner. Then, groups of SVRs are used to fit the training data of each subset in a supervised way. The effectiveness of the proposed model is demonstrated with actual data taken from the New York ISO and the Western Farmers Electric Cooperative in Oklahoma. (author)
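
    A sketch of the switching architecture, with k-means standing in for the paper's Bayesian clustering by dynamics: the load data are first partitioned in an unsupervised manner, then one SVR is fitted per cluster and selected at prediction time:

        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.svm import SVR

        rng = np.random.default_rng(0)
        X = rng.normal(size=(500, 24))                 # e.g. 24 hourly lagged loads
        y = 0.8 * X[:, -1] + np.sin(X[:, 0]) + rng.normal(scale=0.1, size=500)

        km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)   # unsupervised step
        experts = {c: SVR().fit(X[km.labels_ == c], y[km.labels_ == c])
                   for c in range(3)}                  # one supervised SVR per regime

        x_new = X[:5]
        pred = [experts[c].predict(x.reshape(1, -1))[0]
                for c, x in zip(km.predict(x_new), x_new)]
        print(pred)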

  20. Machine learning based switching model for electricity load forecasting

    Energy Technology Data Exchange (ETDEWEB)

    Fan Shu [Energy Systems Research Center, University of Texas at Arlington, 416 S. College Street, Arlington, TX 76019 (United States); Chen Luonan [Department of Electronics, Information and Communication Engineering, Osaka Sangyo University, 3-1-1 Nakagaito, Daito, Osaka 574-0013 (Japan); Lee, Weijen [Energy Systems Research Center, University of Texas at Arlington, 416 S. College Street, Arlington, TX 76019 (United States)], E-mail: wlee@uta.edu

    2008-06-15

    In deregulated power markets, forecasting electricity loads is one of the most essential tasks for system planning, operation and decision making. Based on an integration of two machine learning techniques: Bayesian clustering by dynamics (BCD) and support vector regression (SVR), this paper proposes a novel forecasting model for day ahead electricity load forecasting. The proposed model adopts an integrated architecture to handle the non-stationarity of time series. Firstly, a BCD classifier is applied to cluster the input data set into several subsets by the dynamics of the time series in an unsupervised manner. Then, groups of SVRs are used to fit the training data of each subset in a supervised way. The effectiveness of the proposed model is demonstrated with actual data taken from the New York ISO and the Western Farmers Electric Cooperative in Oklahoma.

  1. Machine learning action parameters in lattice quantum chromodynamics

    Science.gov (United States)

    Shanahan, Phiala E.; Trewartha, Daniel; Detmold, William

    2018-05-01

    Numerical lattice quantum chromodynamics studies of the strong interaction are important in many aspects of particle and nuclear physics. Such studies require significant computing resources to undertake. A number of proposed methods promise improved efficiency of lattice calculations, and access to regions of parameter space that are currently computationally intractable, via multi-scale action-matching approaches that necessitate parametric regression of generated lattice datasets. The applicability of machine learning to this regression task is investigated, with deep neural networks found to provide an efficient solution even in cases where approaches such as principal component analysis fail. The high information content and complex symmetries inherent in lattice QCD datasets require custom neural network layers to be introduced and present opportunities for further development.

  2. Supervised Machine Learning for Population Genetics: A New Paradigm

    Science.gov (United States)

    Schrider, Daniel R.; Kern, Andrew D.

    2018-01-01

    As population genomic datasets grow in size, researchers are faced with the daunting task of making sense of a flood of information. To keep pace with this explosion of data, computational methodologies for population genetic inference are rapidly being developed to best utilize genomic sequence data. In this review we discuss a new paradigm that has emerged in computational population genomics: that of supervised machine learning (ML). We review the fundamentals of ML, discuss recent applications of supervised ML to population genetics that outperform competing methods, and describe promising future directions in this area. Ultimately, we argue that supervised ML is an important and underutilized tool that has considerable potential for the world of evolutionary genomics. PMID:29331490

  3. Machine learning based switching model for electricity load forecasting

    International Nuclear Information System (INIS)

    Fan Shu; Chen Luonan; Lee, Weijen

    2008-01-01

    In deregulated power markets, forecasting electricity loads is one of the most essential tasks for system planning, operation and decision making. Based on an integration of two machine learning techniques: Bayesian clustering by dynamics (BCD) and support vector regression (SVR), this paper proposes a novel forecasting model for day ahead electricity load forecasting. The proposed model adopts an integrated architecture to handle the non-stationarity of time series. Firstly, a BCD classifier is applied to cluster the input data set into several subsets by the dynamics of the time series in an unsupervised manner. Then, groups of SVRs are used to fit the training data of each subset in a supervised way. The effectiveness of the proposed model is demonstrated with actual data taken from the New York ISO and the Western Farmers Electric Cooperative in Oklahoma

  4. Using human brain activity to guide machine learning.

    Science.gov (United States)

    Fong, Ruth C; Scheirer, Walter J; Cox, David D

    2018-03-29

    Machine learning is a field of computer science that builds algorithms that learn. In many cases, machine learning algorithms are used to recreate a human ability like adding a caption to a photo, driving a car, or playing a game. While the human brain has long served as a source of inspiration for machine learning, little effort has been made to directly use data collected from working brains as a guide for machine learning algorithms. Here we demonstrate a new paradigm of "neurally-weighted" machine learning, which takes fMRI measurements of human brain activity from subjects viewing images, and infuses these data into the training process of an object recognition learning algorithm to make it more consistent with the human brain. After training, these neurally-weighted classifiers are able to classify images without requiring any additional neural data. We show that our neural-weighting approach can lead to large performance gains when used with traditional machine vision features, as well as to significant improvements with already high-performing convolutional neural network features. The effectiveness of this approach points to a path forward for a new class of hybrid machine learning algorithms which take both inspiration and direct constraints from neuronal data.
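
    A loose sketch, not the authors' actual method, of the general idea of infusing per-example auxiliary signals into training: examples that a hypothetical fMRI-derived score marks as salient to humans receive larger sample weights in an otherwise ordinary classifier:

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        rng = np.random.default_rng(0)
        X = rng.normal(size=(400, 20))                # image features
        y = (X[:, 0] > 0).astype(int)                 # object labels
        neural_score = rng.uniform(0, 1, size=400)    # hypothetical fMRI-derived salience

        w = 0.5 + neural_score                        # map scores to sample weights
        clf = LogisticRegression(max_iter=1000).fit(X, y, sample_weight=w)
        print(clf.score(X, y))                        # no neural data needed at test time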

  5. Proceedings of the IEEE Machine Learning for Signal Processing XVII

    DEFF Research Database (Denmark)

    The seventeenth of a series of workshops sponsored by the IEEE Signal Processing Society and organized by the Machine Learning for Signal Processing Technical Committee (MLSP-TC). The field of machine learning has matured considerably in both methodology and real-world application domains and has become particularly important for the solution of problems in signal processing. As reflected in this collection, machine learning for signal processing combines many ideas from adaptive signal/image processing, learning theory and models, and statistics in order to solve complex real-world signal processing problems. ... and two papers from the winners of the Data Analysis Competition. The program included papers in the following areas: genomic signal processing, pattern recognition and classification, image and video processing, blind signal processing, models, learning algorithms, and applications of machine learning...

  6. TF.Learn: TensorFlow's High-level Module for Distributed Machine Learning

    OpenAIRE

    Tang, Yuan

    2016-01-01

    TF.Learn is a high-level Python module for distributed machine learning inside TensorFlow. It provides an easy-to-use Scikit-learn style interface to simplify the process of creating, configuring, training, evaluating, and experimenting with a machine learning model. TF.Learn integrates a wide range of state-of-the-art machine learning algorithms built on top of TensorFlow's low-level APIs for small- to large-scale supervised and unsupervised problems. This module focuses on bringing machine learning t...

  7. Machine Shop Suggested Job and Task Sheets. Part II. 21 Advanced Jobs.

    Science.gov (United States)

    Texas A and M Univ., College Station. Vocational Instructional Services.

    This volume consists of advanced job and task sheets adaptable for use in the regular vocational industrial education programs for the training of machinists and machine shop operators. Twenty-one advanced machine shop job sheets are included. Some or all of this material is provided for each job: an introductory sheet with aim, checking…

  8. Machine Shop Suggested Job and Task Sheets. Part I. 25 Elementary Jobs.

    Science.gov (United States)

    Texas A and M Univ., College Station. Vocational Instructional Services.

    This volume consists of elementary job and task sheets adaptable for use in the regular vocational industrial education programs for the training of machinists and machine shop operators. Twenty-five simple machine shop job sheets are included. Some or all of this material is provided for each job sheet: an introductory sheet with aim, checking…

  9. MLBCD: a machine learning tool for big clinical data.

    Science.gov (United States)

    Luo, Gang

    2015-01-01

    Predictive modeling is fundamental for extracting value from large clinical data sets, or "big clinical data," advancing clinical research, and improving healthcare. Machine learning is a powerful approach to predictive modeling. Two factors make machine learning challenging for healthcare researchers. First, before training a machine learning model, the values of one or more model parameters called hyper-parameters must typically be specified. Due to their inexperience with machine learning, it is hard for healthcare researchers to choose an appropriate algorithm and hyper-parameter values. Second, many clinical data are stored in a special format. These data must be iteratively transformed into the relational table format before conducting predictive modeling. This transformation is time-consuming and requires computing expertise. This paper presents our vision for and design of MLBCD (Machine Learning for Big Clinical Data), a new software system aiming to address these challenges and facilitate building machine learning predictive models using big clinical data. The paper describes MLBCD's design in detail. By making machine learning accessible to healthcare researchers, MLBCD will open the use of big clinical data and increase the ability to foster biomedical discovery and improve care.

  10. Machine learning in autistic spectrum disorder behavioral research: A review and ways forward.

    Science.gov (United States)

    Thabtah, Fadi

    2018-02-13

    Autistic Spectrum Disorder (ASD) is a mental disorder that retards acquisition of linguistic, communication, cognitive, and social skills and abilities. Despite being diagnosed with ASD, some individuals exhibit outstanding scholastic, non-academic, and artistic capabilities, which poses a challenge for scientists seeking to explain them. In the last few years, ASD has been investigated by social and computational intelligence scientists utilizing advanced technologies such as machine learning to improve diagnostic timing, precision, and quality. Machine learning is a multidisciplinary research topic that employs intelligent techniques to discover useful concealed patterns, which are utilized in prediction to improve decision making. Machine learning techniques such as support vector machines, decision trees, logistic regressions, and others have been applied to datasets related to autism in order to construct predictive models. These models claim to enhance the ability of clinicians to provide robust diagnoses and prognoses of ASD. However, studies concerning the use of machine learning in ASD diagnosis and treatment suffer from conceptual, implementation, and data issues, such as the way diagnostic codes are used, the type of feature selection employed, the evaluation measures chosen, and class imbalances in data, among others. A more serious claim in recent studies is the development of new methods for ASD diagnosis based on machine learning. This article critically analyses these recent investigative studies on autism, not only articulating the aforementioned issues but also recommending paths forward that enhance machine learning use in ASD with respect to conceptualization, implementation, and data. Future studies concerning machine learning in autism research would benefit greatly from such proposals.

  11. Machine Learning Approaches for Clinical Psychology and Psychiatry.

    Science.gov (United States)

    Dwyer, Dominic B; Falkai, Peter; Koutsouleris, Nikolaos

    2018-05-07

    Machine learning approaches for clinical psychology and psychiatry explicitly focus on learning statistical functions from multidimensional data sets to make generalizable predictions about individuals. The goal of this review is to provide an accessible understanding of why this approach is important for future practice given its potential to augment decisions associated with the diagnosis, prognosis, and treatment of people suffering from mental illness using clinical and biological data. To this end, the limitations of current statistical paradigms in mental health research are critiqued, and an introduction is provided to critical machine learning methods used in clinical studies. A selective literature review is then presented aiming to reinforce the usefulness of machine learning methods and provide evidence of their potential. In the context of promising initial results, the current limitations of machine learning approaches are addressed, and considerations for future clinical translation are outlined.

  12. Learning to discover: machine learning in high-energy physics

    CERN Multimedia

    CERN. Geneva

    2014-01-01

    In this talk we will survey some of the latest developments in machine learning research through the lens of potential applications in high-energy physics. We will then describe three ongoing projects in detail. The main subject of the talk is the data challenge we are organizing with ATLAS on optimizing the discovery significance for the Higgs to tau-tau channel. Second, we describe our collaboration with the LHCb experiment on designing and optimizing fast multi-variate techniques that can be implemented as online classifiers in triggers. Finally, we will sketch a relatively young project with the ILC (Calice) group in which we are attempting to apply deep learning techniques for inference on imaging calorimeter data.

  13. Modelling tick abundance using machine learning techniques and satellite imagery

    DEFF Research Database (Denmark)

    Kjær, Lene Jung; Korslund, L.; Kjelland, V.

    ... satellite images to run Boosted Regression Tree machine learning algorithms to predict overall distribution (presence/absence of ticks) and relative tick abundance of nymphs and larvae in southern Scandinavia. For nymphs, the predicted abundance had a positive correlation with observed abundance. ... While the predicted distribution of larvae was mostly even throughout Denmark, it was primarily around the coastlines in Norway and Sweden. Abundance was fairly low overall except in some fragmented patches corresponding to forested habitats in the region. Machine learning techniques allow us to predict for larger ... the collected ticks for pathogens and using the same machine learning techniques to develop prevalence maps of the ScandTick region...

  14. Implementing Machine Learning in Radiology Practice and Research.

    Science.gov (United States)

    Kohli, Marc; Prevedello, Luciano M; Filice, Ross W; Geis, J Raymond

    2017-04-01

    The purposes of this article are to describe concepts that radiologists should understand to evaluate machine learning projects, including common algorithms, supervised as opposed to unsupervised techniques, statistical pitfalls, and data considerations for training and evaluation, and to briefly describe ethical dilemmas and legal risk. Machine learning includes a broad class of computer programs that improve with experience. The complexity of creating, training, and monitoring machine learning indicates that the success of the algorithms will require radiologist involvement for years to come, leading to engagement rather than replacement.

  15. Studying depression using imaging and machine learning methods

    Directory of Open Access Journals (Sweden)

    Meenal J. Patel

    2016-01-01

    Full Text Available Depression is a complex clinical entity that can pose challenges for clinicians regarding both accurate diagnosis and effective timely treatment. These challenges have prompted the development of multiple machine learning methods to help improve the management of this disease. These methods utilize anatomical and physiological data acquired from neuroimaging to create models that can identify depressed patients vs. non-depressed patients and predict treatment outcomes. This article (1) presents a background on depression, imaging, and machine learning methodologies; (2) reviews methodologies of past studies that have used imaging and machine learning to study depression; and (3) suggests directions for future depression-related studies.

  16. Studying depression using imaging and machine learning methods.

    Science.gov (United States)

    Patel, Meenal J; Khalaf, Alexander; Aizenstein, Howard J

    2016-01-01

    Depression is a complex clinical entity that can pose challenges for clinicians regarding both accurate diagnosis and effective timely treatment. These challenges have prompted the development of multiple machine learning methods to help improve the management of this disease. These methods utilize anatomical and physiological data acquired from neuroimaging to create models that can identify depressed patients vs. non-depressed patients and predict treatment outcomes. This article (1) presents a background on depression, imaging, and machine learning methodologies; (2) reviews methodologies of past studies that have used imaging and machine learning to study depression; and (3) suggests directions for future depression-related studies.

  17. Why Robots Should Be Social: Enhancing Machine Learning through Social Human-Robot Interaction

    Science.gov (United States)

    de Greeff, Joachim; Belpaeme, Tony

    2015-01-01

    Social learning is a powerful method for cultural propagation of knowledge and skills relying on a complex interplay of learning strategies, social ecology and the human propensity for both learning and tutoring. Social learning has the potential to be an equally potent learning strategy for artificial systems and robots in particular. However, given the complexity and unstructured nature of social learning, implementing social machine learning proves to be a challenging problem. We study one particular aspect of social machine learning: that of offering social cues during the learning interaction. Specifically, we study whether people are sensitive to social cues offered by a learning robot, in a similar way to children’s social bids for tutoring. We use a child-like social robot and a task in which the robot has to learn the meaning of words. For this, a simple turn-based interaction is used, based on language games. Two conditions are tested: one in which the robot uses social means to invite a human teacher to provide information based on what the robot requires to fill gaps in its knowledge (i.e. expression of a learning preference); the other in which the robot does not provide social cues to communicate a learning preference. We observe that conveying a learning preference through the use of social cues results in better and faster learning by the robot. People also seem to form a “mental model” of the robot, tailoring the tutoring to the robot’s performance as opposed to simply teaching at random. In addition, the social learning shows a clear gender effect with female participants being responsive to the robot’s bids, while male teachers appear to be less receptive. This work shows how additional social cues in social machine learning can result in people offering better quality learning input to artificial systems, resulting in improved learning performance. PMID:26422143

  18. Why Robots Should Be Social: Enhancing Machine Learning through Social Human-Robot Interaction.

    Science.gov (United States)

    de Greeff, Joachim; Belpaeme, Tony

    2015-01-01

    Social learning is a powerful method for cultural propagation of knowledge and skills relying on a complex interplay of learning strategies, social ecology and the human propensity for both learning and tutoring. Social learning has the potential to be an equally potent learning strategy for artificial systems and robots in particular. However, given the complexity and unstructured nature of social learning, implementing social machine learning proves to be a challenging problem. We study one particular aspect of social machine learning: that of offering social cues during the learning interaction. Specifically, we study whether people are sensitive to social cues offered by a learning robot, in a similar way to children's social bids for tutoring. We use a child-like social robot and a task in which the robot has to learn the meaning of words. For this, a simple turn-based interaction is used, based on language games. Two conditions are tested: one in which the robot uses social means to invite a human teacher to provide information based on what the robot requires to fill gaps in its knowledge (i.e. expression of a learning preference); the other in which the robot does not provide social cues to communicate a learning preference. We observe that conveying a learning preference through the use of social cues results in better and faster learning by the robot. People also seem to form a "mental model" of the robot, tailoring the tutoring to the robot's performance as opposed to simply teaching at random. In addition, the social learning shows a clear gender effect with female participants being responsive to the robot's bids, while male teachers appear to be less receptive. This work shows how additional social cues in social machine learning can result in people offering better quality learning input to artificial systems, resulting in improved learning performance.

  19. Why Robots Should Be Social: Enhancing Machine Learning through Social Human-Robot Interaction.

    Directory of Open Access Journals (Sweden)

    Joachim de Greeff

    Full Text Available Social learning is a powerful method for cultural propagation of knowledge and skills relying on a complex interplay of learning strategies, social ecology and the human propensity for both learning and tutoring. Social learning has the potential to be an equally potent learning strategy for artificial systems and robots in particular. However, given the complexity and unstructured nature of social learning, implementing social machine learning proves to be a challenging problem. We study one particular aspect of social machine learning: that of offering social cues during the learning interaction. Specifically, we study whether people are sensitive to social cues offered by a learning robot, in a similar way to children's social bids for tutoring. We use a child-like social robot and a task in which the robot has to learn the meaning of words. For this, a simple turn-based interaction is used, based on language games. Two conditions are tested: one in which the robot uses social means to invite a human teacher to provide information based on what the robot requires to fill gaps in its knowledge (i.e. expression of a learning preference); the other in which the robot does not provide social cues to communicate a learning preference. We observe that conveying a learning preference through the use of social cues results in better and faster learning by the robot. People also seem to form a "mental model" of the robot, tailoring the tutoring to the robot's performance as opposed to simply teaching at random. In addition, the social learning shows a clear gender effect with female participants being responsive to the robot's bids, while male teachers appear to be less receptive. This work shows how additional social cues in social machine learning can result in people offering better quality learning input to artificial systems, resulting in improved learning performance.

  20. Machine learning for fab automated diagnostics

    Science.gov (United States)

    Giollo, Manuel; Lam, Auguste; Gkorou, Dimitra; Liu, Xing Lan; van Haren, Richard

    2017-06-01

    Process optimization depends largely on a field engineer's knowledge and expertise. However, this practice is becoming less sustainable due to fab complexity, which is continuously increasing in order to support the extreme miniaturization of Integrated Circuits. On the one hand, process optimization and root cause analysis of tools is necessary for a smooth fab operation. On the other hand, the growth in the number of wafer processing steps is adding a considerable new source of noise which may have a significant impact at the nanometer scale. This paper explores the ability of historical process data and Machine Learning to support field engineers in production analysis and monitoring. We implement an automated workflow in order to analyze a large volume of information, and build a predictive model of overlay variation. The proposed workflow addresses significant problems that are typical in fab production, like missing measurements, a small number of samples, confounding effects due to heterogeneity of data, and subpopulation effects. We evaluate the proposed workflow on a real use case and show that it is able to predict overlay excursions observed in Integrated Circuits manufacturing. The chosen design focuses on linear and interpretable models of the wafer history, which highlight the process steps that are causing defective products. This is a fundamental feature for diagnostics, as it supports process engineers in the continuous improvement of the production line.
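
    The interpretable linear modelling described above can be sketched as follows, with every variable name and number hypothetical: a sparse linear model (here scikit-learn's LassoCV, one plausible choice) regresses overlay deviation on binary wafer-history indicators, and the largest coefficients point at suspect process steps.

      # Sparse linear diagnostics on a synthetic wafer-history matrix.
      import numpy as np
      from sklearn.linear_model import LassoCV

      rng = np.random.default_rng(1)
      n_wafers, n_steps = 300, 40
      X = rng.integers(0, 2, size=(n_wafers, n_steps)).astype(float)  # step/tool indicators
      true_coef = np.zeros(n_steps)
      true_coef[[3, 17]] = [2.0, -1.5]                        # two hypothetical faulty steps
      y = X @ true_coef + rng.normal(scale=0.3, size=n_wafers)  # overlay deviation

      model = LassoCV(cv=5).fit(X, y)
      suspects = np.argsort(-np.abs(model.coef_))[:5]
      print("process steps most associated with overlay excursions:", suspects)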

  1. GAME: GAlaxy Machine learning for Emission lines

    Science.gov (United States)

    Ucci, G.; Ferrara, A.; Pallottini, A.; Gallerani, S.

    2018-06-01

    We present an updated, optimized version of GAME (GAlaxy Machine learning for Emission lines), a code designed to infer key interstellar medium physical properties from emission line intensities of ultraviolet/optical/far-infrared galaxy spectra. The improvements concern (a) an enlarged spectral library including Pop III stars, (b) the inclusion of spectral noise in the training procedure, and (c) an accurate evaluation of uncertainties. We extensively validate the optimized code and compare its performance against empirical methods and other available emission line codes (PYQZ and HII-CHI-MISTRY) on a sample of 62 SDSS stacked galaxy spectra and 75 observed HII regions. Very good agreement is found for metallicity. However, ionization parameters derived by GAME tend to be higher. We show that this is due to the use of too limited libraries in the other codes. The main advantages of GAME are the simultaneous use of all the measured spectral lines and the extremely short computational times. We finally discuss the code potential and limitations.

  2. Predicting Increased Blood Pressure Using Machine Learning

    Science.gov (United States)

    Golino, Hudson Fernandes; Amaral, Liliany Souza de Brito; Duarte, Stenio Fernando Pimentel; Soares, Telma de Jesus; dos Reis, Luciana Araujo

    2014-01-01

    The present study investigates the prediction of increased blood pressure by body mass index (BMI), waist (WC) and hip circumference (HC), and waist hip ratio (WHR) using a machine learning technique named classification tree. Data were collected from 400 college students (56.3% women) from 16 to 63 years old. Fifteen trees were calculated in the training group for each sex, using different numbers and combinations of predictors. The result shows that for women BMI, WC, and WHR are the combination that produces the best prediction, since it has the lowest deviance (87.42), misclassification (.19), and the highest pseudo R² (.43). This model presented a sensitivity of 80.86% and specificity of 81.22% in the training set and, respectively, 45.65% and 65.15% in the test sample. For men BMI, WC, HC, and WHR showed the best prediction with the lowest deviance (57.25), misclassification (.16), and the highest pseudo R² (.46). This model had a sensitivity of 72% and specificity of 86.25% in the training set and, respectively, 58.38% and 69.70% in the test set. Finally, the result from the classification tree analysis was compared with traditional logistic regression, indicating that the former outperformed the latter in terms of predictive power. PMID:24669313
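
    A hedged sketch of the approach on synthetic data (not the authors' sample): a decision tree predicting increased blood pressure from BMI, WC and WHR, with sensitivity and specificity computed as in the study.

      # Classification tree on simulated anthropometric predictors.
      import numpy as np
      from sklearn.tree import DecisionTreeClassifier
      from sklearn.model_selection import train_test_split
      from sklearn.metrics import recall_score

      rng = np.random.default_rng(2)
      n = 400
      bmi = rng.normal(25, 4, n)
      wc = rng.normal(85, 10, n)
      whr = rng.normal(0.85, 0.07, n)
      X = np.column_stack([bmi, wc, whr])
      y = (0.1 * bmi + 0.05 * wc + 5 * whr + rng.normal(0, 1, n)) > 11.5  # increased BP

      X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
      tree = DecisionTreeClassifier(max_depth=4).fit(X_tr, y_tr)
      pred = tree.predict(X_te).astype(bool)
      print("sensitivity:", recall_score(y_te, pred))
      print("specificity:", recall_score(~y_te, ~pred))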

  3. Predicting increased blood pressure using machine learning.

    Science.gov (United States)

    Golino, Hudson Fernandes; Amaral, Liliany Souza de Brito; Duarte, Stenio Fernando Pimentel; Gomes, Cristiano Mauro Assis; Soares, Telma de Jesus; Dos Reis, Luciana Araujo; Santos, Joselito

    2014-01-01

    The present study investigates the prediction of increased blood pressure by body mass index (BMI), waist (WC) and hip circumference (HC), and waist hip ratio (WHR) using a machine learning technique named classification tree. Data were collected from 400 college students (56.3% women) from 16 to 63 years old. Fifteen trees were calculated in the training group for each sex, using different numbers and combinations of predictors. The result shows that for women BMI, WC, and WHR are the combination that produces the best prediction, since it has the lowest deviance (87.42), misclassification (.19), and the highest pseudo R² (.43). This model presented a sensitivity of 80.86% and specificity of 81.22% in the training set and, respectively, 45.65% and 65.15% in the test sample. For men BMI, WC, HC, and WHR showed the best prediction with the lowest deviance (57.25), misclassification (.16), and the highest pseudo R² (.46). This model had a sensitivity of 72% and specificity of 86.25% in the training set and, respectively, 58.38% and 69.70% in the test set. Finally, the result from the classification tree analysis was compared with traditional logistic regression, indicating that the former outperformed the latter in terms of predictive power.

  4. GAME: GAlaxy Machine learning for Emission lines

    Science.gov (United States)

    Ucci, G.; Ferrara, A.; Pallottini, A.; Gallerani, S.

    2018-03-01

    We present an updated, optimized version of GAME (GAlaxy Machine learning for Emission lines), a code designed to infer key interstellar medium physical properties from emission line intensities of UV/optical/far infrared galaxy spectra. The improvements concern: (a) an enlarged spectral library including Pop III stars; (b) the inclusion of spectral noise in the training procedure, and (c) an accurate evaluation of uncertainties. We extensively validate the optimized code and compare its performance against empirical methods and other available emission line codes (pyqz and HII-CHI-mistry) on a sample of 62 SDSS stacked galaxy spectra and 75 observed HII regions. Very good agreement is found for metallicity. However, ionization parameters derived by GAME tend to be higher. We show that this is due to the use of too limited libraries in the other codes. The main advantages of GAME are the simultaneous use of all the measured spectral lines, and the extremely short computational times. We finally discuss the code potential and limitations.

  5. Predicting Increased Blood Pressure Using Machine Learning

    Directory of Open Access Journals (Sweden)

    Hudson Fernandes Golino

    2014-01-01

    Full Text Available The present study investigates the prediction of increased blood pressure by body mass index (BMI), waist (WC) and hip circumference (HC), and waist hip ratio (WHR) using a machine learning technique named classification tree. Data were collected from 400 college students (56.3% women) from 16 to 63 years old. Fifteen trees were calculated in the training group for each sex, using different numbers and combinations of predictors. The result shows that for women BMI, WC, and WHR are the combination that produces the best prediction, since it has the lowest deviance (87.42), misclassification (.19), and the highest pseudo R² (.43). This model presented a sensitivity of 80.86% and specificity of 81.22% in the training set and, respectively, 45.65% and 65.15% in the test sample. For men BMI, WC, HC, and WHR showed the best prediction with the lowest deviance (57.25), misclassification (.16), and the highest pseudo R² (.46). This model had a sensitivity of 72% and specificity of 86.25% in the training set and, respectively, 58.38% and 69.70% in the test set. Finally, the result from the classification tree analysis was compared with traditional logistic regression, indicating that the former outperformed the latter in terms of predictive power.

  6. Optimal interference code based on machine learning

    Science.gov (United States)

    Qian, Ye; Chen, Qian; Hu, Xiaobo; Cao, Ercong; Qian, Weixian; Gu, Guohua

    2016-10-01

    In this paper, we analyze the characteristics of pseudo-random codes, taking the m-sequence as a case study. Building on coding theory, we introduce the jamming methods and use MATLAB to simulate the interference effect with a probability model. Based on the length of time the adversary spends decoding, we find the optimal formula and optimal coefficients through machine learning, and thereby obtain a new optimal interference code. First, in the recognition phase, this study judges the effect of interference by simulating the decoding time of the laser seeker. Next, we simulate the interference process in the tracking phase using laser active deception jamming, the method chosen for this study. To improve the performance of the interference, this paper simulates the model in MATLAB. We determine the least number of pulse intervals that must be received, from which we can conclude the precise interval number of the laser pointer for m-sequence encoding. To find the shortest spacing, we use the greatest-common-divisor method. Then, combining this with the coding regularity found earlier, we restore the pulse intervals of the received pseudo-random code. Finally, we can control the time period of the laser interference, obtain the optimal interference code, and increase the probability of successful interference.
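
    The greatest-common-divisor step lends itself to a short illustration. The following sketch (not the paper's MATLAB code) recovers the base chip interval of a pulse code from observed inter-pulse intervals; the numbers are invented.

      # Recover the base interval of a pseudo-random pulse code via the GCD.
      from math import gcd
      from functools import reduce

      # Hypothetical inter-pulse intervals in microseconds, multiples of an unknown base.
      intervals = [1200, 1800, 3000, 2400, 600]
      base = reduce(gcd, intervals)
      print("estimated base interval:", base, "us")                  # -> 600
      print("code pattern (interval / base):", [i // base for i in intervals])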

  7. (Machine) learning to do more with less

    Science.gov (United States)

    Cohen, Timothy; Freytsis, Marat; Ostdiek, Bryan

    2018-02-01

    Determining the best method for training a machine learning algorithm is critical to maximizing its ability to classify data. In this paper, we compare the standard "fully supervised" approach (which relies on knowledge of event-by-event truth-level labels) with a recent proposal that instead utilizes class ratios as the only discriminating information provided during training. This so-called "weakly supervised" technique has access to less information than the fully supervised method and yet is still able to yield impressive discriminating power. In addition, weak supervision seems particularly well suited to particle physics since quantum mechanics is incompatible with the notion of mapping an individual event onto any single Feynman diagram. We examine the technique in detail — both analytically and numerically — with a focus on the robustness to issues of mischaracterizing the training samples. Weakly supervised networks turn out to be remarkably insensitive to a class of systematic mismodeling. Furthermore, we demonstrate that the event level outputs for weakly versus fully supervised networks are probing different kinematics, even though the numerical quality metrics are essentially identical. This implies that it should be possible to improve the overall classification ability by combining the output from the two types of networks. For concreteness, we apply this technology to a signature of beyond the Standard Model physics to demonstrate that all these impressive features continue to hold in a scenario of relevance to the LHC. Example code is provided on GitHub.
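
    A toy sketch of the weak-supervision idea: a logistic model trained only on per-batch class fractions rather than event-by-event labels. Everything below (data, learning rate, batch fractions) is illustrative, not the paper's setup or the GitHub code it references.

      # Weakly supervised training: only batch-level signal fractions are known.
      import numpy as np

      rng = np.random.default_rng(3)
      w_true = np.array([1.5, -2.0])

      def make_batch(frac, n=1000):
          """Batch whose only supervision is its signal fraction `frac`."""
          y = rng.random(n) < frac
          X = rng.normal(size=(n, 2)) + np.where(y[:, None], w_true, -w_true) * 0.5
          return X, frac

      batches = [make_batch(f) for f in (0.2, 0.5, 0.8)]
      sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
      w = np.zeros(2)
      for _ in range(2000):                 # gradient descent on (mean score - fraction)^2
          for X, frac in batches:
              p = sigmoid(X @ w)
              err = p.mean() - frac         # only the batch-level fraction is used
              w -= 0.5 * err * (p * (1 - p)) @ X / len(X)

      print("learned direction:", w / np.linalg.norm(w))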

  8. Automated Essay Grading using Machine Learning Algorithm

    Science.gov (United States)

    Ramalingam, V. V.; Pandian, A.; Chetry, Prateek; Nigam, Himanshu

    2018-04-01

    Essays are paramount for assessing academic excellence, along with the ability to link different ideas and recall them, but they are notably time-consuming when assessed manually. Manual grading takes a significant amount of an evaluator's time and hence is an expensive process. Automated grading, if proven effective, will not only reduce assessment time but, when compared against human scores, will also make the scoring realistic. The project aims to develop an automated essay assessment system using machine learning techniques, by classifying a corpus of textual entities into a small number of discrete categories corresponding to possible grades. A linear regression technique will be utilized to train the model, along with various other classification and clustering techniques. We intend to train classifiers on the training set, run them over the downloaded dataset, and then measure performance on our dataset by comparing the predicted values with the dataset values. We have implemented our model using Java.
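
    A minimal sketch of such a pipeline, under the assumption of a corpus of (essay, human score) pairs: TF-IDF features feeding the linear regression the abstract mentions. The three example essays and scores are invented, and scikit-learn is used rather than the authors' Java implementation.

      # TF-IDF + linear regression as a bare-bones essay grader.
      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.linear_model import LinearRegression
      from sklearn.pipeline import make_pipeline

      essays = ["The industrial revolution transformed labour and production.",
                "I like dogs.",
                "Climate policy requires coordinated global action and trade-offs."]
      scores = [5.0, 1.0, 4.5]                 # hypothetical human-assigned grades

      grader = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearRegression())
      grader.fit(essays, scores)
      print(grader.predict(["Global coordination shaped industrial labour policy."]))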

  9. Component Pin Recognition Using Algorithms Based on Machine Learning

    Science.gov (United States)

    Xiao, Yang; Hu, Hong; Liu, Ze; Xu, Jiangchang

    2018-04-01

    The purpose of machine vision for a plug-in machine is to improve the machine's stability and accuracy, and recognition of the component pin is an important part of the vision system. This paper focuses on component pin recognition using three different techniques. The first technique involves traditional image processing using the core algorithm for binary large object (BLOB) analysis. The second technique uses the histogram of oriented gradients (HOG) to experimentally compare the effect of the support vector machine (SVM) and the adaptive boosting (AdaBoost) learning meta-algorithm as classifiers. The third technique is a deep learning method, the convolutional neural network (CNN), which identifies the pin by comparing a sample against its training data. The main purpose of the research presented in this paper is to increase the knowledge of learning methods used in the plug-in machine industry in order to achieve better results.
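
    The HOG-plus-classifier variant can be sketched briefly. The patches below are random stand-ins for real pin images, scikit-image computes the HOG descriptors, and a linear SVM replaces whichever classifier configuration the authors used.

      # HOG descriptors of small patches classified with a linear SVM.
      import numpy as np
      from skimage.feature import hog
      from sklearn.svm import LinearSVC

      rng = np.random.default_rng(4)
      def patches(n, bright):                  # 32x32 grayscale stand-in patches
          return rng.normal(bright, 0.1, size=(n, 32, 32)).clip(0, 1)

      X_img = np.concatenate([patches(50, 0.8), patches(50, 0.2)])  # "pin" vs "background"
      y = np.array([1] * 50 + [0] * 50)
      X = np.array([hog(im, orientations=9, pixels_per_cell=(8, 8),
                        cells_per_block=(2, 2)) for im in X_img])

      clf = LinearSVC().fit(X, y)
      print("training accuracy:", clf.score(X, y))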

  10. Reduced multiple empirical kernel learning machine.

    Science.gov (United States)

    Wang, Zhe; Lu, MingZhe; Gao, Daqi

    2015-02-01

    Multiple kernel learning (MKL) is demonstrated to be flexible and effective in depicting heterogeneous data sources, since MKL can introduce multiple kernels rather than a single fixed kernel into applications. However, MKL incurs high time and space complexity in contrast to single kernel learning, which is not desirable in real-world applications. Meanwhile, it is known that the kernel mappings of MKL generally take two forms, implicit kernel mapping and empirical kernel mapping (EKM), where the latter has attracted less attention. In this paper, we focus on MKL with the EKM, and propose a reduced multiple empirical kernel learning machine, named RMEKLM for short. To the best of our knowledge, this is the first work to reduce both the time and space complexity of MKL with EKM. Different from existing MKL, the proposed RMEKLM adopts the Gauss Elimination technique to extract a set of feature vectors, and it is validated that doing so does not lose much information of the original feature space. RMEKLM then uses the extracted feature vectors to span a reduced orthonormal subspace of the feature space, which is visualized in terms of its geometric structure. It can be demonstrated that the spanned subspace is isomorphic to the original feature space, which means that the dot product of two vectors in the original feature space is equal to that of the two corresponding vectors in the generated orthonormal subspace. More importantly, the proposed RMEKLM brings simpler computation and needs less storage space, especially during testing. Finally, the experimental results show that RMEKLM achieves efficient and effective performance in terms of both complexity and classification. The contributions of this paper can be given as follows: (1) by mapping the input space into an orthonormal subspace, the geometry of the generated subspace is visualized; (2) this paper first reduces both the time and space complexity of the EKM-based MKL; (3
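
    Under the usual definition of empirical kernel mapping (mapping a point to its kernel evaluations against the training set, whitened by the kernel matrix), a short sketch looks as follows. This illustrates EKM itself, not the paper's Gauss-Elimination reduction; the RBF kernel and the data are arbitrary choices.

      # Empirical kernel mapping: dot products in the map reproduce the kernel.
      import numpy as np

      def ekm_fit(X, kernel):
          K = kernel(X, X)
          vals, vecs = np.linalg.eigh(K)
          keep = vals > 1e-10                       # drop numerically null directions
          P = vecs[:, keep] / np.sqrt(vals[keep])   # n x r whitening projection
          return lambda Z: kernel(Z, X) @ P         # empirical feature map

      rbf = lambda A, B: np.exp(-0.5 * ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1))
      X = np.random.default_rng(5).normal(size=(20, 2))
      phi = ekm_fit(X, rbf)
      F = phi(X)
      print(np.allclose(F @ F.T, rbf(X, X), atol=1e-8))  # True: isometric embedding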

  11. Acceleration of saddle-point searches with machine learning

    Energy Technology Data Exchange (ETDEWEB)

    Peterson, Andrew A., E-mail: andrew-peterson@brown.edu [School of Engineering, Brown University, Providence, Rhode Island 02912 (United States)

    2016-08-21

    In atomistic simulations, the location of the saddle point on the potential-energy surface (PES) gives important information on transitions between local minima, for example, via transition-state theory. However, the search for saddle points often involves hundreds or thousands of ab initio force calls, which are typically all done at full accuracy. This results in the vast majority of the computational effort being spent calculating the electronic structure of states not important to the researcher, and very little time performing the calculation of the saddle point state itself. In this work, we describe how machine learning (ML) can reduce the number of intermediate ab initio calculations needed to locate saddle points. Since machine-learning models can learn from, and thus mimic, atomistic simulations, the saddle-point search can be conducted rapidly in the machine-learning representation. The saddle-point prediction can then be verified by an ab initio calculation; if it is incorrect, this strategically has identified regions of the PES where the machine-learning representation has insufficient training data. When these training data are used to improve the machine-learning model, the estimates greatly improve. This approach can be systematized, and in two simple example problems we demonstrate a dramatic reduction in the number of ab initio force calls. We expect that this approach and future refinements will greatly accelerate searches for saddle points, as well as other searches on the potential energy surface, as machine-learning methods see greater adoption by the atomistics community.
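
    The loop described here (search in the surrogate, verify with a true calculation, retrain where the model fails) can be sketched generically. Below is a minimal illustration on a toy one-dimensional potential with a kernel-ridge surrogate; for brevity it locates a minimum rather than a saddle point, and every function and number is illustrative rather than taken from the paper.

      # Surrogate-accelerated search with verify-and-retrain.
      import numpy as np
      from scipy.optimize import minimize_scalar
      from sklearn.kernel_ridge import KernelRidge

      true_pes = lambda x: (x ** 2 - 1) ** 2 + 0.3 * x    # "expensive" energy call
      X = list(np.linspace(-2, 2, 5))
      E = [true_pes(x) for x in X]

      for it in range(10):
          surrogate = KernelRidge(kernel="rbf", gamma=2.0, alpha=1e-6)
          surrogate.fit(np.array(X)[:, None], E)
          guess = minimize_scalar(lambda x: surrogate.predict([[x]])[0],
                                  bounds=(-2, 2), method="bounded").x
          e_true = true_pes(guess)                         # single verification call
          if abs(e_true - surrogate.predict([[guess]])[0]) < 1e-4:
              break
          X.append(guess)                                  # retrain where the model failed
          E.append(e_true)

      print(f"converged at x = {guess:.4f} after {len(X)} true energy calls")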

  12. Status Checking System of Home Appliances using machine learning

    Directory of Open Access Journals (Sweden)

    Yoon Chi-Yurl

    2017-01-01

    Full Text Available This paper describes a status checking system for home appliances based on machine learning, which can be applied to existing household appliances without networking functions. The designed status checking system consists of sensor modules, a wireless communication module, a cloud server, an Android application and a machine learning algorithm. The developed system, applied to a washing machine, analyses and judges four kinds of appliance status: staying, washing, rinsing and spin-drying. Sensor measurement and transmission of the sensing data are handled by an Arduino board, and the data are transmitted to the cloud server in real time. The collected data are parsed by the Android application and injected into the machine learning algorithm for learning the status of the appliances. The machine learning algorithm compares the stored learning data with real-time data collected from the appliances. Our results are expected to contribute as a base technology for designing automatic control systems based on machine learning for household appliances in real time.

  13. Acceleration of saddle-point searches with machine learning

    International Nuclear Information System (INIS)

    Peterson, Andrew A.

    2016-01-01

    In atomistic simulations, the location of the saddle point on the potential-energy surface (PES) gives important information on transitions between local minima, for example, via transition-state theory. However, the search for saddle points often involves hundreds or thousands of ab initio force calls, which are typically all done at full accuracy. This results in the vast majority of the computational effort being spent calculating the electronic structure of states not important to the researcher, and very little time performing the calculation of the saddle point state itself. In this work, we describe how machine learning (ML) can reduce the number of intermediate ab initio calculations needed to locate saddle points. Since machine-learning models can learn from, and thus mimic, atomistic simulations, the saddle-point search can be conducted rapidly in the machine-learning representation. The saddle-point prediction can then be verified by an ab initio calculation; if it is incorrect, this strategically has identified regions of the PES where the machine-learning representation has insufficient training data. When these training data are used to improve the machine-learning model, the estimates greatly improve. This approach can be systematized, and in two simple example problems we demonstrate a dramatic reduction in the number of ab initio force calls. We expect that this approach and future refinements will greatly accelerate searches for saddle points, as well as other searches on the potential energy surface, as machine-learning methods see greater adoption by the atomistics community.

  14. Acceleration of saddle-point searches with machine learning.

    Science.gov (United States)

    Peterson, Andrew A

    2016-08-21

    In atomistic simulations, the location of the saddle point on the potential-energy surface (PES) gives important information on transitions between local minima, for example, via transition-state theory. However, the search for saddle points often involves hundreds or thousands of ab initio force calls, which are typically all done at full accuracy. This results in the vast majority of the computational effort being spent calculating the electronic structure of states not important to the researcher, and very little time performing the calculation of the saddle point state itself. In this work, we describe how machine learning (ML) can reduce the number of intermediate ab initio calculations needed to locate saddle points. Since machine-learning models can learn from, and thus mimic, atomistic simulations, the saddle-point search can be conducted rapidly in the machine-learning representation. The saddle-point prediction can then be verified by an ab initio calculation; if it is incorrect, this strategically has identified regions of the PES where the machine-learning representation has insufficient training data. When these training data are used to improve the machine-learning model, the estimates greatly improve. This approach can be systematized, and in two simple example problems we demonstrate a dramatic reduction in the number of ab initio force calls. We expect that this approach and future refinements will greatly accelerate searches for saddle points, as well as other searches on the potential energy surface, as machine-learning methods see greater adoption by the atomistics community.

  15. Semi-supervised Learning for Phenotyping Tasks.

    Science.gov (United States)

    Dligach, Dmitriy; Miller, Timothy; Savova, Guergana K

    2015-01-01

    Supervised learning is the dominant approach to automatic electronic health records-based phenotyping, but it is expensive due to the cost of manual chart review. Semi-supervised learning takes advantage of both scarce labeled and plentiful unlabeled data. In this work, we study a family of semi-supervised learning algorithms based on Expectation Maximization (EM) in the context of several phenotyping tasks. We first experiment with the basic EM algorithm. When the modeling assumptions are violated, basic EM leads to inaccurate parameter estimation. Augmented EM attenuates this shortcoming by introducing a weighting factor that downweights the unlabeled data. Cross-validation does not always lead to the best setting of the weighting factor and other heuristic methods may be preferred. We show that accurate phenotyping models can be trained with only a few hundred labeled (and a large number of unlabeled) examples, potentially providing substantial savings in the amount of the required manual chart review.
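
    A compact sketch of the augmented-EM idea on a toy one-dimensional Gaussian model: unlabeled points enter the M-step with a downweighting factor lam, playing exactly the weighting-factor role described above. The model and data are illustrative, not the paper's phenotyping features.

      # Augmented EM: labeled points weighted 1, unlabeled points weighted lam.
      import numpy as np

      rng = np.random.default_rng(6)
      x_lab = np.r_[rng.normal(-1, 1, 20), rng.normal(1, 1, 20)]
      y_lab = np.r_[np.zeros(20), np.ones(20)]
      x_unl = np.r_[rng.normal(-1, 1, 500), rng.normal(1, 1, 500)]
      lam = 0.1                                  # downweighting factor for unlabeled data

      mu = np.array([-0.5, 0.5])
      for _ in range(50):
          # E-step: soft class responsibilities for the unlabeled points.
          d = np.exp(-0.5 * (x_unl[:, None] - mu[None, :]) ** 2)
          r = d / d.sum(1, keepdims=True)
          # M-step: weighted class means over labeled and (downweighted) unlabeled data.
          for k in (0, 1):
              wk = np.r_[(y_lab == k).astype(float), lam * r[:, k]]
              mu[k] = np.average(np.r_[x_lab, x_unl], weights=wk)
      print("class means:", mu)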

  16. A Novel Machine Learning Strategy Based on Two-Dimensional Numerical Models in Financial Engineering

    Directory of Open Access Journals (Sweden)

    Qingzhen Xu

    2013-01-01

    Full Text Available Machine learning is the most commonly used technique to address larger and more complex tasks by analyzing the most relevant information already present in databases. In order to better predict the future trend of the index, this paper proposes a two-dimensional numerical model for machine learning to simulate major U.S. stock market index and uses a nonlinear implicit finite-difference method to find numerical solutions of the two-dimensional simulation model. The proposed machine learning method uses partial differential equations to predict the stock market and can be extensively used to accelerate large-scale data processing on the history database. The experimental results show that the proposed algorithm reduces the prediction error and improves forecasting precision.

  17. Probabilistic models and machine learning in structural bioinformatics

    DEFF Research Database (Denmark)

    Hamelryck, Thomas

    2009-01-01

    Recently, probabilistic models and machine learning methods based on Bayesian principles are providing efficient and rigorous solutions to challenging problems that were long regarded as intractable. In this review, I will highlight some important recent developments in the prediction, analysis

  18. Proceedings of IEEE Machine Learning for Signal Processing Workshop XVI

    DEFF Research Database (Denmark)

    Larsen, Jan

    These proceedings contain refereed papers presented at the sixteenth IEEE Workshop on Machine Learning for Signal Processing (MLSP'2006), held in Maynooth, Co. Kildare, Ireland, September 6-8, 2006. This is a continuation of the IEEE Workshops on Neural Networks for Signal Processing (NNSP......). The name of the Technical Committee, hence of the Workshop, was changed to Machine Learning for Signal Processing in September 2003 to better reflect the areas represented by the Technical Committee. The conference is organized by the Machine Learning for Signal Processing Technical Committee...... the same standard as the printed version and facilitates the reading and searching of the papers. The field of machine learning has matured considerably in both methodology and real-world application domains and has become particularly important for solution of problems in signal processing. As reflected

  19. Using machine learning, neural networks and statistics to predict bankruptcy

    NARCIS (Netherlands)

    Pompe, P.P.M.; Feelders, A.J.; Feelders, A.J.

    1997-01-01

    Recent literature strongly suggests that machine learning approaches to classification outperform "classical" statistical methods. We make a comparison between the performance of linear discriminant analysis, classification trees, and neural networks in predicting corporate bankruptcy. Linear

  20. Sparse Machine Learning Methods for Understanding Large Text Corpora

    Data.gov (United States)

    National Aeronautics and Space Administration — Sparse machine learning has recently emerged as powerful tool to obtain models of high-dimensional data with high degree of interpretability, at low computational...

  1. Machine learning and medicine: book review and commentary.

    Science.gov (United States)

    Koprowski, Robert; Foster, Kenneth R

    2018-02-01

    This article is a review of the book "Master machine learning algorithms, discover how they work and implement them from scratch" (ISBN: not available, 37 USD, 163 pages), edited by Jason Brownlee, published by the author, edition v1.10, http://MachineLearningMastery.com . An accompanying commentary discusses some of the issues that are involved with use of machine learning and data mining techniques to develop predictive models for diagnosis or prognosis of disease, and calls attention to additional requirements for developing diagnostic and prognostic algorithms that are generally useful in medicine. An appendix provides examples that illustrate potential problems with machine learning that are not addressed in the reviewed book.

  2. Exploration of Machine Learning Approaches to Predict Pavement Performance

    Science.gov (United States)

    2018-03-23

    Machine learning (ML) techniques were used to model and predict pavement condition index (PCI) for various pavement types using a variety of input variables. The primary objective of this research was to develop and assess PCI predictive models for t...

  3. Corporate Disruption in the Science of Machine Learning

    OpenAIRE

    Work, Sam

    2016-01-01

    This MSc dissertation considers the effects of the current corporate interest on researchers in the field of machine learning. Situated within the field's cyclical history of academic, public and corporate interest, this dissertation investigates how current researchers view recent developments and negotiate their own research practices within an environment of increased commercial interest and funding. The original research consists of in-depth interviews with 12 machine learning researchers...

  4. Machine learning concepts in coherent optical communication systems

    DEFF Research Database (Denmark)

    Zibar, Darko; Schäffer, Christian G.

    2014-01-01

    Powerful statistical signal processing methods, used by the machine learning community, are addressed and linked to current problems in coherent optical communication. Bayesian filtering methods are presented and applied for nonlinear dynamic state tracking. © 2014 OSA.

  5. Machine learning techniques applied to system characterization and equalization

    DEFF Research Database (Denmark)

    Zibar, Darko; Thrane, Jakob; Wass, Jesper

    2016-01-01

    Linear signal processing algorithms are effective in combating linear fibre channel impairments. We demonstrate the ability of machine learning algorithms to combat nonlinear fibre channel impairments and perform parameter extraction from directly detected signals.

  6. Applications of Support Vector Machine (SVM) Learning in Cancer Genomics

    OpenAIRE

    Huang, Shujun; Cai, Nianguang; Pacheco, Pedro Penzuti; Narandes, Shavira; Wang, Yang; Xu, Wayne

    2017-01-01

    Machine learning with maximization (support) of separating margin (vector), called support vector machine (SVM) learning, is a powerful classification tool that has been used for cancer genomic classification or subtyping. Today, as advancements in high-throughput technologies lead to production of large amounts of genomic and epigenomic data, the classification feature of SVMs is expanding its use in cancer genomics, leading to the discovery of new biomarkers, new drug targets, and a better ...
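
    A hedged sketch of a typical SVM workflow in this setting, on a synthetic gene-expression matrix with a handful of informative "biomarker" genes; the sample and gene counts are invented, and the linear kernel is one plausible choice among those used in the field.

      # Linear SVM for a two-class subtyping problem on synthetic expression data.
      import numpy as np
      from sklearn.svm import SVC
      from sklearn.pipeline import make_pipeline
      from sklearn.preprocessing import StandardScaler
      from sklearn.model_selection import cross_val_score

      rng = np.random.default_rng(7)
      X = rng.normal(size=(120, 2000))             # 120 samples x 2000 genes
      y = rng.integers(0, 2, 120)                  # tumour subtype labels
      X[y == 1, :25] += 0.8                        # 25 informative "biomarker" genes

      clf = make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0))
      print("5-fold CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())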

  7. Machine Learning in Production Systems Design Using Genetic Algorithms

    OpenAIRE

    Abu Qudeiri Jaber; Yamamoto Hidehiko; Rizauddin Ramli

    2008-01-01

    To create a solution for a specific problem in machine learning, the solution is constructed from the data or by using a search method. Genetic algorithms are a model of machine learning that can be used to find a near-optimal solution. While the great advantage of genetic algorithms is the fact that they find a solution through evolution, this is also their biggest disadvantage. Evolution is inductive; in nature, life does not evolve towards a good solution but it evolves aw...
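
    For readers unfamiliar with the mechanics, a minimal genetic algorithm (selection, one-point crossover, mutation) on a toy fitness function looks like this. It is illustrative only, not the paper's production-system encoding, and every parameter is an arbitrary choice.

      # Bare-bones genetic algorithm evolving bit-strings.
      import random
      random.seed(0)

      def fitness(bits):                           # toy objective: maximize number of ones
          return sum(bits)

      pop = [[random.randint(0, 1) for _ in range(20)] for _ in range(30)]
      for gen in range(50):
          pop.sort(key=fitness, reverse=True)
          parents = pop[:10]                       # selection: keep the fittest
          children = []
          while len(children) < 20:
              a, b = random.sample(parents, 2)
              cut = random.randrange(1, 20)
              child = a[:cut] + b[cut:]            # one-point crossover
              if random.random() < 0.05:           # mutation: flip one random bit
                  child[random.randrange(20)] ^= 1
              children.append(child)
          pop = parents + children

      print("best fitness found:", max(map(fitness, pop)))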

  8. Machine learning applied to the prediction of citrus production

    OpenAIRE

    Díaz Rodríguez, Susana Irene; Mazza, Silvia M.; Fernández-Combarro Álvarez, Elías; Giménez, Laura I.; Gaiad, José E.

    2017-01-01

    An in-depth knowledge of the variables affecting production is required in order to predict global production and take decisions in agriculture. Machine learning is a technique used in agricultural planning and precision agriculture. This work (i) studies the effectiveness of machine learning techniques for predicting orchard production; and (ii) identifies the variables affecting this production. Data from 964 orchards of lemon, mandarin, and orange in Corrientes, Argentina are analyse...

  9. A Comparative Analysis of Machine Learning Techniques for Credit Scoring

    OpenAIRE

    Nwulu, Nnamdi; Oroja, Shola; İlkan, Mustafa

    2012-01-01

    Credit scoring has become an oft-researched topic in light of the increasing volatility of the global economy and the recent world financial crisis. Amidst the many methods used for credit scoring, machine learning techniques are becoming increasingly popular due to their efficient and accurate nature and relative simplicity. Furthermore, machine learning techniques minimize the risk of human bias and error and maximize speed, as they are able to perform computation...

  10. Statistical and Machine Learning Models to Predict Programming Performance

    OpenAIRE

    Bergin, Susan

    2006-01-01

    This thesis details a longitudinal study on factors that influence introductory programming success and on the development of machine learning models to predict incoming student performance. Although numerous studies have developed models to predict programming success, the models struggled to achieve high accuracy in predicting the likely performance of incoming students. Our approach overcomes this by providing a machine learning technique, using a set of three significant...

  11. Learning Machines Implemented on Non-Deterministic Hardware

    OpenAIRE

    Gupta, Suyog; Sindhwani, Vikas; Gopalakrishnan, Kailash

    2014-01-01

    This paper highlights new opportunities for designing large-scale machine learning systems as a consequence of blurring traditional boundaries that have allowed algorithm designers and application-level practitioners to stay -- for the most part -- oblivious to the details of the underlying hardware-level implementations. The hardware/software co-design methodology advocated here hinges on the deployment of compute-intensive machine learning kernels onto compute platforms that trade-off deter...

  12. Machine learning application in the life time of materials

    OpenAIRE

    Yu, Xiaojiao

    2017-01-01

    Materials design and development typically takes several decades from the initial discovery to commercialization with the traditional trial-and-error development approach. With the accumulation of data from both experimental and computational results, data-based machine learning is becoming an emerging field in materials discovery, design and property prediction. This manuscript reviews the history of materials science as a discipline and the most common machine learning methods used in materials sc...

  13. Machine learning in laboratory medicine: waiting for the flood?

    Science.gov (United States)

    Cabitza, Federico; Banfi, Giuseppe

    2018-03-28

    This review focuses on machine learning and on how methods and models combining data analytics and artificial intelligence have been applied to laboratory medicine so far. Although still in its infancy, the potential for applying machine learning to laboratory data for both diagnostic and prognostic purposes deserves more attention by the readership of this journal, as well as by physician-scientists who will want to take advantage of this new computer-based support in pathology and laboratory medicine.

  14. Investigating Antecedents of Task Commitment and Task Attraction in Service Learning Team Projects

    Science.gov (United States)

    Schaffer, Bryan S.; Manegold, Jennifer G.

    2018-01-01

    The authors investigated the antecedents of team task cohesiveness in service learning classroom environments. Focusing on task commitment and task attraction as key dependent variables representing cohesiveness, and task interdependence as the primary independent variable, the authors position three important task action phase processes as…

  15. Bypassing the Kohn-Sham equations with machine learning.

    Science.gov (United States)

    Brockherde, Felix; Vogt, Leslie; Li, Li; Tuckerman, Mark E; Burke, Kieron; Müller, Klaus-Robert

    2017-10-11

    Last year, at least 30,000 scientific papers used the Kohn-Sham scheme of density functional theory to solve electronic structure problems in a wide variety of scientific fields. Machine learning holds the promise of learning the energy functional via examples, bypassing the need to solve the Kohn-Sham equations. This should yield substantial savings in computer time, allowing larger systems and/or longer time-scales to be tackled, but attempts to machine-learn this functional have been limited by the need to find its derivative. The present work overcomes this difficulty by directly learning the density-potential and energy-density maps for test systems and various molecules. We perform the first molecular dynamics simulation with a machine-learned density functional on malonaldehyde and are able to capture the intramolecular proton transfer process. Learning density models now allows the construction of accurate density functionals for realistic molecular systems. Machine learning allows electronic structure calculations to access larger system sizes and, in dynamical simulations, longer time scales. Here, the authors perform such a simulation using a machine-learned density functional that avoids direct solution of the Kohn-Sham equations.
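
    The map-learning idea can be caricatured in a few lines: kernel ridge regression from a (toy) potential representation directly to an energy, skipping any self-consistent solve. The one-dimensional potentials and the "energy" formula below are synthetic stand-ins, not the paper's systems or its actual density-potential map.

      # Kernel ridge regression from a sampled potential v(x) to an energy.
      import numpy as np
      from sklearn.kernel_ridge import KernelRidge

      rng = np.random.default_rng(8)
      grid = np.linspace(0, 1, 50)

      def sample():
          a, b = rng.uniform(1, 3), rng.uniform(0.2, 0.8)
          v = -a * np.exp(-((grid - b) ** 2) / 0.01)   # toy external potential on a grid
          e = -0.5 * a + 0.1 * b                       # toy "ground-state energy"
          return v, e

      V, E = map(np.array, zip(*(sample() for _ in range(200))))
      model = KernelRidge(kernel="rbf", gamma=1e-2, alpha=1e-8).fit(V, E)
      v_new, e_new = sample()
      print("predicted vs true energy:", model.predict([v_new])[0], e_new)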

  16. Machine learning methods for metabolic pathway prediction

    Directory of Open Access Journals (Sweden)

    Karp Peter D

    2010-01-01

    Full Text Available Background A key challenge in systems biology is the reconstruction of an organism's metabolic network from its genome sequence. One strategy for addressing this problem is to predict which metabolic pathways, from a reference database of known pathways, are present in the organism, based on the annotated genome of the organism. Results To quantitatively validate methods for pathway prediction, we developed a large "gold standard" dataset of 5,610 pathway instances known to be present or absent in curated metabolic pathway databases for six organisms. We defined a collection of 123 pathway features, whose information content we evaluated with respect to the gold standard. Feature data were used as input to an extensive collection of machine learning (ML) methods, including naïve Bayes, decision trees, and logistic regression, together with feature selection and ensemble methods. We compared the ML methods to the previous PathoLogic algorithm for pathway prediction using the gold standard dataset. We found that ML-based prediction methods can match the performance of the PathoLogic algorithm. PathoLogic achieved an accuracy of 91% and an F-measure of 0.786. The ML-based prediction methods achieved accuracy as high as 91.2% and F-measure as high as 0.787. The ML-based methods output a probability for each predicted pathway, whereas PathoLogic does not, which provides more information to the user and facilitates filtering of predicted pathways. Conclusions ML methods for pathway prediction perform as well as existing methods, and have qualitative advantages in terms of extensibility, tunability, and explainability. More advanced prediction methods and/or more sophisticated input features may improve the performance of ML methods. However, pathway prediction performance appears to be limited largely by the ability to correctly match enzymes to the reactions they catalyze based on genome annotations.

  17. Machine learning methods for metabolic pathway prediction

    Science.gov (United States)

    2010-01-01

    Background A key challenge in systems biology is the reconstruction of an organism's metabolic network from its genome sequence. One strategy for addressing this problem is to predict which metabolic pathways, from a reference database of known pathways, are present in the organism, based on the annotated genome of the organism. Results To quantitatively validate methods for pathway prediction, we developed a large "gold standard" dataset of 5,610 pathway instances known to be present or absent in curated metabolic pathway databases for six organisms. We defined a collection of 123 pathway features, whose information content we evaluated with respect to the gold standard. Feature data were used as input to an extensive collection of machine learning (ML) methods, including naïve Bayes, decision trees, and logistic regression, together with feature selection and ensemble methods. We compared the ML methods to the previous PathoLogic algorithm for pathway prediction using the gold standard dataset. We found that ML-based prediction methods can match the performance of the PathoLogic algorithm. PathoLogic achieved an accuracy of 91% and an F-measure of 0.786. The ML-based prediction methods achieved accuracy as high as 91.2% and F-measure as high as 0.787. The ML-based methods output a probability for each predicted pathway, whereas PathoLogic does not, which provides more information to the user and facilitates filtering of predicted pathways. Conclusions ML methods for pathway prediction perform as well as existing methods, and have qualitative advantages in terms of extensibility, tunability, and explainability. More advanced prediction methods and/or more sophisticated input features may improve the performance of ML methods. However, pathway prediction performance appears to be limited largely by the ability to correctly match enzymes to the reactions they catalyze based on genome annotations. PMID:20064214
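
    A sketch of the prediction setup, assuming hypothetical numeric pathway features (the real study used 123 features per pathway instance); a logistic model stands in for the paper's collection of ML methods and, like them, outputs a presence probability per candidate pathway that supports filtering.

      # Probabilistic pathway presence prediction from pathway features.
      import numpy as np
      from sklearn.linear_model import LogisticRegression

      rng = np.random.default_rng(9)
      n_pathways, n_features = 5610, 123
      X = rng.random((n_pathways, n_features))     # stand-in feature values
      y = (X[:, 0] * 0.7 + X[:, 1] * 0.3 + rng.normal(0, 0.1, n_pathways)) > 0.5

      clf = LogisticRegression(max_iter=1000).fit(X, y)
      proba = clf.predict_proba(X[:3])[:, 1]       # per-pathway presence probabilities
      print("P(pathway present):", proba.round(3))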

  18. Bone-suppressed radiography using machine learning

    Energy Technology Data Exchange (ETDEWEB)

    Park, Jun Beom; Kim, Dae Cheon; Kim, Ho Kyung [KAERI, Daejeon (Korea, Republic of)

    2016-05-15

    Single-shot dual-energy imaging suffers from reduced contrast-to-noise ratio performance due to poor spectral separation. Tomosynthesis requires more complex motion equipment and may require higher patient dose. An alternative tissue-specific imaging technique was introduced. This alternative technique usually employs a filter to generate bone-only images for given digital radiographs. Therefore, it provides soft-tissue-enhanced images from the subtraction of the given radiographs and the filtered bone-only images. A limitation of the method is that it offers only bone-suppressed imaging. The filter can be obtained from a machine-learning algorithm, e.g. an artificial neural network (ANN), with the dual-energy bone-only images (called 'teaching' images). We suspect the robustness of the filter may be dependent upon the number of teaching images and the number of patients from whose radiographs we obtain the teaching images. In this study, we design an ANN to obtain a bone-extracting filter from a radiograph, and investigate the filter properties with respect to various ANN parameters. Preliminary results are summarized in Fig. 3. We extracted 5,000 subregions in a 21x21 pixel format from the lung region in the bone-enhanced dual-energy image and used them as teaching images when training the ANN. The resultant bone-enhanced image from the ANN nonlinear filter is shown in Fig. 3 (a). From the weighted logarithmic subtraction between Fig. 2 (a) and Fig. 3 (a), we could obtain the bone-suppressed image as shown in Fig. 3 (b). The quality of the bone-suppressed image is comparable to the ground truth Fig. 2 (c).

  19. Estimating extinction using unsupervised machine learning

    Science.gov (United States)

    Meingast, Stefan; Lombardi, Marco; Alves, João

    2017-05-01

    Dust extinction is the most robust tracer of the gas distribution in the interstellar medium, but measuring extinction is limited by the systematic uncertainties involved in estimating the intrinsic colors to background stars. In this paper we present a new technique, Pnicer, that estimates intrinsic colors and extinction for individual stars using unsupervised machine learning algorithms. This new method aims to be free from any priors with respect to the column density and intrinsic color distribution. It is applicable to any combination of parameters and works in arbitrary numbers of dimensions. Furthermore, it is not restricted to color space. Extinction toward single sources is determined by fitting Gaussian mixture models along the extinction vector to (extinction-free) control field observations. In this way it becomes possible to describe the extinction for observed sources with probability densities, rather than a single value. Pnicer effectively eliminates known biases found in similar methods and outperforms them in cases of deep observational data where the number of background galaxies is significant, or when a large number of parameters is used to break degeneracies in the intrinsic color distributions. This new method remains computationally competitive, making it possible to correctly de-redden millions of sources within a matter of seconds. With the ever-increasing number of large-scale high-sensitivity imaging surveys, Pnicer offers a fast and reliable way to efficiently calculate extinction for arbitrary parameter combinations without prior information on source characteristics. The Pnicer software package also offers access to the well-established Nicer technique in a simple unified interface and is capable of building extinction maps including the Nicest correction for cloud substructure. Pnicer is offered to the community as an open-source software solution and is entirely written in Python.
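
    The core step can be sketched with scikit-learn's Gaussian mixtures: fit the mixture to extinction-free control-field colours, then evaluate the density along the extinction vector for an observed star. The colours, the extinction vector and the grid below are invented numbers, not Pnicer's actual parameters.

      # Gaussian mixture along the extinction vector (conceptual sketch).
      import numpy as np
      from sklearn.mixture import GaussianMixture

      rng = np.random.default_rng(10)
      control = rng.normal([0.5, 0.2], [0.1, 0.05], size=(2000, 2))  # intrinsic colours
      gmm = GaussianMixture(n_components=3).fit(control)

      observed = np.array([1.1, 0.5])       # reddened star in the same colour space
      ext_vec = np.array([0.6, 0.3])        # hypothetical extinction vector
      av_grid = np.linspace(0, 2, 200)
      logp = gmm.score_samples(observed - np.outer(av_grid, ext_vec))
      print("most probable extinction:", av_grid[np.argmax(logp)])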

  20. Bone-suppressed radiography using machine learning

    International Nuclear Information System (INIS)

    Park, Jun Beom; Kim, Dae Cheon; Kim, Ho Kyung

    2016-01-01

    Single-shot dual-energy imaging suffers from reduced contrast-to-noise ratio performance due to poor spectral separation. Tomosynthesis requires more complex motion equipment and may require higher patient dose. An alternative tissue-specific imaging technique was introduced. This alternative technique usually employs a filter to generate bone-only images for given digital radiographs. Therefore, it provides soft-tissue-enhanced images from the subtraction of the given radiographs and the filtered bone-only images. A limitation of the method is that it offers only bone-suppressed imaging. The filter can be obtained from a machine-learning algorithm, e.g. an artificial neural network (ANN), with the dual-energy bone-only images (called 'teaching' images). We suspect the robustness of the filter may be dependent upon the number of teaching images and the number of patients from whose radiographs we obtain the teaching images. In this study, we design an ANN to obtain a bone-extracting filter from a radiograph, and investigate the filter properties with respect to various ANN parameters. Preliminary results are summarized in Fig. 3. We extracted 5,000 subregions in a 21x21 pixel format from the lung region in the bone-enhanced dual-energy image and used them as teaching images when training the ANN. The resultant bone-enhanced image from the ANN nonlinear filter is shown in Fig. 3 (a). From the weighted logarithmic subtraction between Fig. 2 (a) and Fig. 3 (a), we could obtain the bone-suppressed image as shown in Fig. 3 (b). The quality of the bone-suppressed image is comparable to the ground truth Fig. 2 (c).

  1. Machine Learning Techniques in Optimal Design

    Science.gov (United States)

    Cerbone, Giuseppe

    1992-01-01

    to the problem, is then obtained by solving in parallel each of the sub-problems in the set and computing the one with the minimum cost. In addition to speeding up the optimization process, our use of learning methods also relieves the expert from the burden of identifying rules that exactly pinpoint optimal candidate sub-problems. In real engineering tasks it is usually too costly for the engineers to derive such rules. Therefore, this paper also contributes a further step towards the solution of the knowledge acquisition bottleneck [Feigenbaum, 1977], which has somewhat impaired the construction of rule-based expert systems.

  2. Machine-learning the string landscape

    Science.gov (United States)

    He, Yang-Hui

    2017-11-01

    We propose a paradigm for applying machine learning to the various databases which have emerged in the study of the string landscape. In particular, we establish neural networks as both classifiers and predictors and train them with a host of available data ranging from Calabi-Yau manifolds and vector bundles to quiver representations for gauge theories, using a novel framework of recasting geometrical and physical data as pixelated images. We find that even a relatively simple neural network can learn many significant quantities to astounding accuracy in a matter of minutes and can also predict hitherto unencountered results, thereby rendering the paradigm a valuable tool in physics as well as pure mathematics. Of course, this paradigm is useful not only to physicists but also to mathematicians; for instance, could our NN be trained well enough to approximate bundle cohomology calculations? This, and a host of other examples, we will now examine. Methodology: Neural networks are known for their complexity, usually involving a complicated directed graph, each node of which is a "perceptron" (an activation function imitating a neuron) and amongst the multitude of which there are many arrows encoding input/output. Throughout this letter, we will use a rather simple multi-layer perceptron (MLP) consisting of 5 layers, three of which are hidden, with activation functions typically of the form of a logistic sigmoid or a hyperbolic tangent. The input layer is a linear layer of 100 to 1000 nodes, recognizing a tensor (as we will soon see, algebro-geometric objects such as Calabi-Yau manifolds or polytopes are generically configurations of integer tensors), and the output layer is a summation layer giving a number corresponding to a Hodge number, or to the rank of a cohomology group, etc. Such an MLP can be implemented, for instance, on the latest versions of Wolfram Mathematica. With 500-1000 training rounds, the running time is merely about 5-20 minutes on an ordinary laptop. It
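
    A hedged re-creation of the described setup using scikit-learn instead of Mathematica: a small MLP regressing a stand-in integer-valued target from flattened integer configuration tensors. The target function here is a toy, not a real Hodge number or cohomology rank.

      # Small MLP on flattened integer "configuration tensors".
      import numpy as np
      from sklearn.neural_network import MLPRegressor

      rng = np.random.default_rng(11)
      X = rng.integers(-3, 4, size=(2000, 100)).astype(float)   # flattened integer tensors
      y = np.abs(X).sum(axis=1) % 20                            # toy integer-valued target

      mlp = MLPRegressor(hidden_layer_sizes=(64, 64, 64), activation="tanh",
                         max_iter=500, random_state=0).fit(X[:1500], y[:1500])
      print("held-out R^2:", mlp.score(X[1500:], y[1500:]))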

  3. On the Conditioning of Machine-Learning-Assisted Turbulence Modeling

    Science.gov (United States)

    Wu, Jinlong; Sun, Rui; Wang, Qiqi; Xiao, Heng

    2017-11-01

    Recently, several researchers have demonstrated that machine learning techniques can be used to improve the RANS-modeled Reynolds stress by training on available databases of high-fidelity simulations. However, obtaining an improved mean velocity field remains an unsolved challenge, restricting the predictive capability of current machine-learning-assisted turbulence modeling approaches. In this work we define a condition number to evaluate the model conditioning of data-driven turbulence modeling approaches, and propose a stability-oriented machine learning framework to model Reynolds stress. Two canonical flows, the flow in a square duct and the flow over periodic hills, are investigated to demonstrate the predictive capability of the proposed framework. The satisfactory prediction of the mean velocity field for both flows demonstrates this capability. By showing that the prediction of the mean flow field can be improved, the proposed stability-oriented machine learning framework bridges the gap between existing machine-learning-assisted turbulence modeling approaches and the demand for predictive turbulence models in real applications.

  4. Solar wind classification from a machine learning perspective

    Science.gov (United States)

    Heidrich-Meisner, V.; Wimmer-Schweingruber, R. F.

    2017-12-01

    It is well known that the ubiquitous solar wind comes in at least two varieties, the slow solar wind and the coronal hole wind. This simplified two-type view has been frequently challenged. Existing solar wind categorization schemes rely mainly on different combinations of the solar wind proton speed, the O and C charge-state ratios, the Alfvén speed, the expected proton temperature and the specific proton entropy. In available solar wind classification schemes, solar wind from stream interaction regions is often considered either as coronal hole wind or slow solar wind, although its plasma properties differ from those of "pure" coronal hole or slow solar wind. As shown in Neugebauer et al. (2016), even if only two solar wind types are assumed, available solar wind categorization schemes differ considerably for intermediate solar wind speeds. Thus, the decision boundary between the coronal hole and the slow solar wind is so far not well defined. In this situation, a machine learning approach to solar wind classification can provide an additional perspective. We apply a well-known machine learning method, k-means, to the task of solar wind classification in order to answer the following questions: (1) How many solar wind types can reliably be identified in our data set, comprised of ten years of solar wind observations from the Advanced Composition Explorer (ACE)? (2) Which combinations of solar wind parameters are particularly useful for solar wind classification? Potential subtypes of slow solar wind are of particular interest because they can provide hints of different source regions or release mechanisms of slow solar wind.
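
    A minimal sketch of the k-means step described above; the feature columns are stand-ins for the plasma parameters listed (proton speed, charge-state ratios, entropy), and the data are synthetic rather than calibrated ACE measurements.

        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.preprocessing import StandardScaler

        rng = np.random.default_rng(2)
        # placeholder rows standing in for ten years of hourly ACE observations;
        # columns ~ proton speed, O charge-state ratio, C charge-state ratio, entropy
        X = rng.random((5000, 4))

        X_std = StandardScaler().fit_transform(X)   # k-means is scale-sensitive

        # sweep the cluster count and inspect inertia to judge how many
        # solar wind types the data support
        for k in (2, 3, 4, 5):
            km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_std)
            print(k, km.inertia_)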

  5. PMLB: a large benchmark suite for machine learning evaluation and comparison.

    Science.gov (United States)

    Olson, Randal S; La Cava, William; Orzechowski, Patryk; Urbanowicz, Ryan J; Moore, Jason H

    2017-01-01

    The selection, development, or comparison of machine learning methods in data mining can be a difficult task based on the target problem and goals of a particular study. Numerous publicly available real-world and simulated benchmark datasets have emerged from different sources, but their organization and adoption as standards have been inconsistent. As such, selecting and curating specific benchmarks remains an unnecessary burden on machine learning practitioners and data scientists. The present study introduces an accessible, curated, and developing public benchmark resource to facilitate identification of the strengths and weaknesses of different machine learning methodologies. We compare meta-features among the current set of benchmark datasets in this resource to characterize the diversity of available data. Finally, we apply a number of established machine learning methods to the entire benchmark suite and analyze how datasets and algorithms cluster in terms of performance. From this study, we find that existing benchmarks lack the diversity to properly benchmark machine learning algorithms, and there are several gaps in benchmarking problems that still need to be considered. This work represents another important step towards understanding the limitations of popular benchmarking suites and developing a resource that connects existing benchmarking standards to more diverse and efficient standards in the future.
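
    A brief usage sketch, assuming the PMLB Python package (pip install pmlb) and its documented fetch_data helper; the choice of dataset and classifier is arbitrary.

        # usage sketch, assuming the PMLB Python package: pip install pmlb
        from pmlb import fetch_data, classification_dataset_names
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import cross_val_score

        print(len(classification_dataset_names), "classification benchmarks")

        # download one benchmark and evaluate an off-the-shelf classifier on it
        X, y = fetch_data('mushroom', return_X_y=True)
        clf = RandomForestClassifier(n_estimators=100, random_state=0)
        print("5-fold CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())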

  6. Development of E-Learning Materials for Machining Safety Education

    Science.gov (United States)

    Nakazawa, Tsuyoshi; Mita, Sumiyoshi; Matsubara, Masaaki; Takashima, Takeo; Tanaka, Koichi; Izawa, Satoru; Kawamura, Takashi

    We developed two e-learning materials for Manufacturing Practice safety education: movie learning materials and hazard-detection learning materials. Using video and sound media, students can learn how to operate machines safely with the movie learning materials, which raise the effectiveness of preparation and review for Manufacturing Practice and help students internalize safe operation. With the hazard-detection learning materials, students can apply knowledge learned in lectures to the detection of hazards and practice methods for detecting hazards during machine operation. In particular, the hazard-detection learning materials raise students' safety consciousness and increase their comprehension of lecture knowledge and of operations during Manufacturing Practice.

  7. Multiple-Machine Scheduling with Learning Effects and Cooperative Games

    Directory of Open Access Journals (Sweden)

    Yiyuan Zhou

    2015-01-01

    Full Text Available Multiple-machine scheduling problems with position-based learning effects are studied in this paper. There is an initial schedule in this scheduling problem. The optimal schedule minimizes the sum of the weighted completion times; the difference between the initial total weighted completion time and the minimal total weighted completion time is the cost savings. A multiple-machine sequencing game is introduced to allocate the cost savings. The game is balanced if the normal processing times of jobs that are on the same machine are equal and an equal number of jobs are scheduled on each machine initially.
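
    To make the learning effect concrete, the sketch below evaluates the total weighted completion time of a job sequence on one machine under the common position-based model in which a job with normal processing time p scheduled in position r takes p * r^a with a < 0; the exponent and job data are invented, and the paper's game-theoretic allocation is not reproduced.

        # total weighted completion time of a single-machine sequence under a
        # position-based learning effect: a job with normal processing time p
        # scheduled in position r takes p * r**a (a < 0, e.g. an 80% curve)
        def total_weighted_completion_time(sequence, p, w, a=-0.322):
            t, total = 0.0, 0.0
            for r, j in enumerate(sequence, start=1):
                t += p[j] * r ** a      # actual processing time in position r
                total += w[j] * t       # completion time weighted by job weight
            return total

        p = [4.0, 2.0, 5.0]             # normal processing times
        w = [1.0, 3.0, 2.0]             # job weights
        print(total_weighted_completion_time([1, 2, 0], p, w))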

  8. Ship localization in Santa Barbara Channel using machine learning classifiers.

    Science.gov (United States)

    Niu, Haiqiang; Ozanich, Emma; Gerstoft, Peter

    2017-11-01

    Machine learning classifiers are shown to outperform conventional matched field processing for a deep water (600 m depth) ocean acoustic-based ship range estimation problem in the Santa Barbara Channel Experiment when limited environmental information is known. Recordings of three different ships of opportunity on a vertical array were used as training and test data for the feed-forward neural network and support vector machine classifiers, demonstrating the feasibility of machine learning methods to locate unseen sources. The classifiers perform well up to 10 km range whereas the conventional matched field processing fails at about 4 km range without accurate environmental information.

  9. A Telescopic Binary Learning Machine for Training Neural Networks.

    Science.gov (United States)

    Brunato, Mauro; Battiti, Roberto

    2017-03-01

    This paper proposes a new algorithm based on multiscale stochastic local search with binary representation for training neural networks [binary learning machine (BLM)]. We study the effects of neighborhood evaluation strategies, the effect of the number of bits per weight and that of the maximum weight range used for mapping binary strings to real values. Following this preliminary investigation, we propose a telescopic multiscale version of local search, where the number of bits is increased in an adaptive manner, leading to a faster search and to local minima of better quality. An analysis related to adapting the number of bits in a dynamic way is presented. The control on the number of bits, which happens in a natural manner in the proposed method, is effective to increase the generalization performance. The learning dynamics are discussed and validated on a highly nonlinear artificial problem and on real-world tasks in many application domains; BLM is finally applied to a problem requiring either feedforward or recurrent architectures for feedback control.

  10. Deep learning versus traditional machine learning methods for aggregated energy demand prediction

    NARCIS (Netherlands)

    Paterakis, N.G.; Mocanu, E.; Gibescu, M.; Stappers, B.; van Alst, W.

    2018-01-01

    In this paper, deep learning methods, which are more advanced than traditional machine learning approaches, are explored with the purpose of accurately predicting aggregated energy consumption. Despite the fact that a wide range of machine learning methods have been applied to

  11. Applications of machine learning in cancer prediction and prognosis.

    Science.gov (United States)

    Cruz, Joseph A; Wishart, David S

    2007-02-11

    Machine learning is a branch of artificial intelligence that employs a variety of statistical, probabilistic and optimization techniques that allow computers to "learn" from past examples and to detect hard-to-discern patterns from large, noisy or complex data sets. This capability is particularly well-suited to medical applications, especially those that depend on complex proteomic and genomic measurements. As a result, machine learning is frequently used in cancer diagnosis and detection. More recently machine learning has been applied to cancer prognosis and prediction. This latter approach is particularly interesting as it is part of a growing trend towards personalized, predictive medicine. In assembling this review we conducted a broad survey of the different types of machine learning methods being used, the types of data being integrated and the performance of these methods in cancer prediction and prognosis. A number of trends are noted, including a growing dependence on protein biomarkers and microarray data, a strong bias towards applications in prostate and breast cancer, and a heavy reliance on "older" technologies such as artificial neural networks (ANNs) instead of more recently developed or more easily interpretable machine learning methods. A number of published studies also appear to lack an appropriate level of validation or testing. Among the better designed and validated studies it is clear that machine learning methods can be used to substantially (15-25%) improve the accuracy of predicting cancer susceptibility, recurrence and mortality. At a more fundamental level, it is also evident that machine learning is helping to improve our basic understanding of cancer development and progression.

  12. 2015 International Conference on Machine Learning and Signal Processing

    CERN Document Server

    Woo, Wai; Sulaiman, Hamzah; Othman, Mohd; Saat, Mohd

    2016-01-01

    This book presents important research findings and recent innovations in the field of machine learning and signal processing. A wide range of topics relating to machine learning and signal processing techniques and their applications are addressed in order to provide both researchers and practitioners with a valuable resource documenting the latest advances and trends. The book comprises a careful selection of the papers submitted to the 2015 International Conference on Machine Learning and Signal Processing (MALSIP 2015), which was held on 15–17 December 2015 in Ho Chi Minh City, Vietnam with the aim of offering researchers, academicians, and practitioners an ideal opportunity to disseminate their findings and achievements. All of the included contributions were chosen by expert peer reviewers from across the world on the basis of their interest to the community. In addition to presenting the latest in design, development, and research, the book provides access to numerous new algorithms for machine learni...

  13. Prostate Cancer Probability Prediction By Machine Learning Technique.

    Science.gov (United States)

    Jović, Srđan; Miljković, Milica; Ivanović, Miljan; Šaranović, Milena; Arsić, Milena

    2017-11-26

    The main goal of the study was to explore the possibility of prostate cancer prediction by machine learning techniques. In order to improve the survival probability of prostate cancer patients, it is essential to build suitable prediction models. A relevant prediction of prostate cancer makes it easier to devise suitable treatment based on the prediction results. Machine learning techniques are the most common techniques for creating predictive models. Therefore, in this study several machine learning techniques were applied and compared. The obtained results were analyzed and discussed. It was concluded that machine learning techniques could be used for relevant prediction of prostate cancer.

  14. Image Classification, Deep Learning and Convolutional Neural Networks : A Comparative Study of Machine Learning Frameworks

    OpenAIRE

    Airola, Rasmus; Hager, Kristoffer

    2017-01-01

    The use of machine learning and specifically neural networks is a growing trend in software development, and has grown immensely in the last couple of years in the light of an increasing need to handle big data and large information flows. Machine learning has a broad area of application, such as human-computer interaction, predicting stock prices, real-time translation, and self driving vehicles. Large companies such as Microsoft and Google have already implemented machine learning in some o...

  15. In silico machine learning methods in drug development.

    Science.gov (United States)

    Dobchev, Dimitar A; Pillai, Girinath G; Karelson, Mati

    2014-01-01

    Machine learning (ML) computational methods for predicting compounds with pharmacological activity, specific pharmacodynamic and ADMET (absorption, distribution, metabolism, excretion and toxicity) properties are being increasingly applied in drug discovery and evaluation. Recently, machine learning techniques such as artificial neural networks, support vector machines and genetic programming have been explored for predicting inhibitors, antagonists, blockers, agonists, activators and substrates of proteins related to specific therapeutic targets. These methods are particularly useful for screening compound libraries of diverse chemical structures, "noisy" and high-dimensional data to complement QSAR methods, and in cases of unavailable receptor 3D structure to complement structure-based methods. A variety of studies have demonstrated the potential of machine-learning methods for predicting compounds as potential drug candidates. The present review is intended to give an overview of the strategies and current progress in using machine learning methods for drug design and the potential of the respective model development tools. We also review a number of applications of machine learning algorithms to common classes of diseases.

  16. Machine learning based analysis of cardiovascular images

    NARCIS (Netherlands)

    Wolterink, JM

    2017-01-01

    Cardiovascular diseases (CVDs), including coronary artery disease (CAD) and congenital heart disease (CHD) are the global leading cause of death. Computed tomography (CT) and magnetic resonance imaging (MRI) allow non-invasive imaging of cardiovascular structures. This thesis presents machine

  17. Toward a Progress Indicator for Machine Learning Model Building and Data Mining Algorithm Execution: A Position Paper

    Science.gov (United States)

    Luo, Gang

    2017-01-01

    For user-friendliness, many software systems offer progress indicators for long-duration tasks. A typical progress indicator continuously estimates the remaining task execution time as well as the portion of the task that has been finished. Building a machine learning model often takes a long time, but no existing machine learning software supplies a non-trivial progress indicator. Similarly, running a data mining algorithm often takes a long time, but no existing data mining software provides a non-trivial progress indicator. In this article, we consider the problem of offering progress indicators for machine learning model building and data mining algorithm execution. We discuss the goals and challenges intrinsic to this problem. Then we describe an initial framework for implementing such progress indicators and two advanced, potential uses of them, with the goal of inspiring future research on this topic. PMID:29177022

  18. Indirect Tire Monitoring System - Machine Learning Approach

    Science.gov (United States)

    Svensson, O.; Thelin, S.; Byttner, S.; Fan, Y.

    2017-10-01

    The heavy vehicle industry today has no legal requirement to provide a tire pressure monitoring system. This has created issues surrounding unknown tire pressure and tread depth during active service. There is also no standardization for these kinds of systems, which means that different manufacturers and third-party solutions work according to their own principles, and it can be hard to know what works for a given vehicle type. The objective is to create an indirect tire monitoring system that can generalize a method that detects both incorrect tire pressure and tread depth for different types of vehicles within a fleet, without the need for additional physical sensors or vehicle-specific parameters. The existing sensors communicate through CAN and are interpreted by the Drivec Bridge hardware that exists in the fleet. Using supervised machine learning, a classifier was created for each axle, with the main focus on the front axle, which had the most issues. The classifier will classify the condition of the vehicle's tires and will be implemented in Drivec's cloud service, from which it will receive its data. The resulting classifier is a random forest implemented in Python. For the front axle, with a data set consisting of 9767 samples of buses with correct tire condition and 1909 samples of buses with incorrect tire condition, it has an accuracy of 90.54% (0.96%). The data sets are created from 34 unique measurements from buses between January and May 2017. This classifier has been exported and is used inside a Node.js module created for Drivec's cloud service, which is the result of the whole implementation. The developed solution is called Indirect Tire Monitoring System (ITMS) and is seen as a process. This process will predict bad classes in the cloud, which will lead to warnings. The warnings are defined as incidents. They contain only the information needed, and the bandwidth of the incidents is also controlled so incidents are created within an
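
    A minimal sketch of a per-axle random forest classifier in the spirit of the one described above, with invented CAN-derived features and roughly the study's class balance; the feature semantics and data are hypothetical, not Drivec's.

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(3)
        # invented CAN-derived features (e.g. wheel-speed ratios, vibration stats);
        # label 1 = incorrect tire condition, matching the study's sample counts
        X = rng.random((11676, 8))
        y = (rng.random(11676) < 1909 / 11676).astype(int)

        X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
        clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
        print("test accuracy:", clf.score(X_te, y_te))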

  19. Machine Learning Based Diagnosis of Lithium Batteries

    Science.gov (United States)

    Ibe-Ekeocha, Chinemerem Christopher

    The depletion of the world's current petroleum reserve, coupled with the negative effects of carbon monoxide and other harmful petrochemical by-products on the environment, is the driving force behind the movement towards renewable and sustainable energy sources. Furthermore, the growing transportation sector consumes a significant portion of the total energy used in the United States. A complete electrification of this sector would require a significant development in electric vehicles (EVs) and hybrid electric vehicles (HEVs), translating to a reduction in the carbon footprint. As the market for EVs and HEVs grows, their battery management systems (BMS) need to be improved accordingly. The BMS is not only responsible for optimally charging and discharging the battery, but also for monitoring the battery's state of charge (SOC) and state of health (SOH). SOC, similar to an energy gauge, represents a battery's remaining charge level as a percentage of its total possible charge at full capacity. Similarly, SOH is a measure of the deterioration of a battery and thus represents the battery's age. Neither SOC nor SOH is directly measurable, so it is important that these quantities are estimated accurately. An inaccurate estimation could be not only inconvenient for EV consumers but also potentially detrimental to the battery's performance and life. Such estimations could be implemented either online, while the battery is in use, or offline, when the battery is at rest. This thesis presents intelligent online SOC and SOH estimation methods using machine learning tools such as artificial neural networks (ANNs). ANNs are a powerful generalization tool if programmed and trained effectively. Unlike other estimation strategies, the techniques used require no battery modeling or knowledge of battery internal parameters, but rather use the battery's voltage, charge/discharge current, and ambient temperature measurements to accurately estimate the battery's SOC and SOH. The developed
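
    As a sketch of the model-free estimation idea, the snippet below maps measured voltage, current, and temperature to SOC with a small feed-forward network. The training data and the SOC labeling rule are synthetic placeholders; a real BMS would train on logged charge/discharge cycles with reference SOC labels.

        import numpy as np
        from sklearn.neural_network import MLPRegressor

        rng = np.random.default_rng(4)
        # synthetic measurements: [terminal voltage (V), current (A), temperature (C)]
        X = np.column_stack([3.0 + rng.random(2000),
                             rng.uniform(-2, 2, 2000),
                             rng.uniform(10, 40, 2000)])
        # made-up SOC labels; a real data set would carry reference SOC values
        soc = np.clip((X[:, 0] - 3.0) * 100 - 2 * X[:, 1], 0, 100)

        net = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=1000)
        net.fit(X[:1500], soc[:1500])
        print("held-out R^2:", net.score(X[1500:], soc[1500:]))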

  20. Contemporary machine learning: techniques for practitioners in the physical sciences

    Science.gov (United States)

    Spears, Brian

    2017-10-01

    Machine learning is the science of using computers to find relationships in data without explicitly knowing or programming those relationships in advance. Often without realizing it, we employ machine learning every day as we use our phones or drive our cars. Over the last few years, machine learning has found increasingly broad application in the physical sciences. This most often involves building a model relationship between a dependent, measurable output and an associated set of controllable, but complicated, independent inputs. The methods are applicable both to experimental observations and to databases of simulated output from large, detailed numerical simulations. In this tutorial, we will present an overview of current tools and techniques in machine learning - a jumping-off point for researchers interested in using machine learning to advance their work. We will discuss supervised learning techniques for modeling complicated functions, beginning with familiar regression schemes, then advancing to more sophisticated decision trees, modern neural networks, and deep learning methods. Next, we will cover unsupervised learning and techniques for reducing the dimensionality of input spaces and for clustering data. We'll show example applications from both magnetic and inertial confinement fusion. Along the way, we will describe methods for practitioners to help ensure that their models generalize from their training data to as-yet-unseen test data. We will finally point out some limitations to modern machine learning and speculate on some ways that practitioners from the physical sciences may be particularly suited to help. This work was performed by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.

  1. Applications of Machine Learning in Cancer Prediction and Prognosis

    Directory of Open Access Journals (Sweden)

    Joseph A. Cruz

    2006-01-01

    Full Text Available Machine learning is a branch of artificial intelligence that employs a variety of statistical, probabilistic and optimization techniques that allow computers to “learn” from past examples and to detect hard-to-discern patterns from large, noisy or complex data sets. This capability is particularly well-suited to medical applications, especially those that depend on complex proteomic and genomic measurements. As a result, machine learning is frequently used in cancer diagnosis and detection. More recently machine learning has been applied to cancer prognosis and prediction. This latter approach is particularly interesting as it is part of a growing trend towards personalized, predictive medicine. In assembling this review we conducted a broad survey of the different types of machine learning methods being used, the types of data being integrated and the performance of these methods in cancer prediction and prognosis. A number of trends are noted, including a growing dependence on protein biomarkers and microarray data, a strong bias towards applications in prostate and breast cancer, and a heavy reliance on “older” technologies such as artificial neural networks (ANNs) instead of more recently developed or more easily interpretable machine learning methods. A number of published studies also appear to lack an appropriate level of validation or testing. Among the better designed and validated studies it is clear that machine learning methods can be used to substantially (15-25%) improve the accuracy of predicting cancer susceptibility, recurrence and mortality. At a more fundamental level, it is also evident that machine learning is helping to improve our basic understanding of cancer development and progression.

  2. Machine learning molecular dynamics for the simulation of infrared spectra.

    Science.gov (United States)

    Gastegger, Michael; Behler, Jörg; Marquetand, Philipp

    2017-10-01

    Machine learning has emerged as an invaluable tool in many research areas. In the present work, we harness this power to predict highly accurate molecular infrared spectra with unprecedented computational efficiency. To account for vibrational anharmonic and dynamical effects - typically neglected by conventional quantum chemistry approaches - we base our machine learning strategy on ab initio molecular dynamics simulations. While these simulations are usually extremely time consuming even for small molecules, we overcome these limitations by leveraging the power of a variety of machine learning techniques, not only accelerating simulations by several orders of magnitude, but also greatly extending the size of systems that can be treated. To this end, we develop a molecular dipole moment model based on environment-dependent neural network charges and combine it with the neural network potential approach of Behler and Parrinello. Contrary to the prevalent big data philosophy, we are able to obtain very accurate machine learning models for the prediction of infrared spectra based on only a few hundred electronic structure reference points. This is made possible through the use of molecular forces during neural network potential training and the introduction of a fully automated sampling scheme. We demonstrate the power of our machine learning approach by applying it to model the infrared spectra of a methanol molecule, n-alkanes containing up to 200 atoms and the protonated alanine tripeptide, which at the same time represents the first application of machine learning techniques to simulate the dynamics of a peptide. In all of these case studies we find an excellent agreement between the infrared spectra predicted via machine learning models and the respective theoretical and experimental spectra.

  3. Distribution Learning in Evolutionary Strategies and Restricted Boltzmann Machines

    DEFF Research Database (Denmark)

    Krause, Oswin

    The thesis is concerned with learning distributions in the two settings of Evolutionary Strategies (ESs) and Restricted Boltzmann Machines (RBMs). In both cases, the distributions are learned from samples, albeit with different goals. Evolutionary Strategies are concerned with finding an optimum ...

  4. Relevance as a metric for evaluating machine learning algorithms

    NARCIS (Netherlands)

    Kota Gopalakrishna, A.; Ozcelebi, T.; Liotta, A.; Lukkien, J.J.

    2013-01-01

    In machine learning, the choice of a learning algorithm that is suitable for the application domain is critical. The performance metric used to compare different algorithms must also reflect the concerns of users in the application domain under consideration. In this work, we propose a novel

  5. Machine Learning for Quantification of Small Vessel Disease Imaging Biomarkers

    NARCIS (Netherlands)

    Ghafoorian, M.

    2018-01-01

    This thesis is devoted to developing fully automated methods for quantification of small vessel disease imaging biomarkers, namely WMHs and lacunes, using various machine learning/deep learning and computer vision techniques. The rest of the thesis is organized as follows: Chapter 2 describes

  6. Virtual screening by a new Clustering-based Weighted Similarity Extreme Learning Machine approach.

    Science.gov (United States)

    Pasupa, Kitsuchart; Kudisthalert, Wasu

    2018-01-01

    Machine learning techniques are becoming popular in virtual screening tasks. One of the powerful machine learning algorithms is the Extreme Learning Machine (ELM), which has been applied in many domains and has recently been applied to virtual screening. We propose the Weighted Similarity ELM (WS-ELM), which is based on a single-layer feed-forward neural network in conjunction with 16 different similarity coefficients as activation functions in the hidden layer. It is known that the performance of a conventional ELM is not robust due to random weight selection in the hidden layer. Thus, we propose a Clustering-based WS-ELM (CWS-ELM) that deterministically assigns weights by utilising clustering algorithms, i.e. k-means clustering and support vector clustering. The experiments were conducted on one of the most challenging datasets, the Maximum Unbiased Validation dataset, which contains 17 activity classes carefully selected from PubChem. The proposed algorithms were then compared with other machine learning techniques such as support vector machines, random forests, and similarity searching. The results show that CWS-ELM in conjunction with support vector clustering yields the best performance when utilised together with the Sokal/Sneath(1) coefficient. Furthermore, the ECFP_6 fingerprint presents the best results in our framework compared to the other types of fingerprints, namely ECFP_4, FCFP_4, and FCFP_6.
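
    For readers unfamiliar with ELMs, the sketch below shows the core mechanism the paper builds on: input-to-hidden weights drawn at random and frozen, a nonlinear hidden layer, and output weights solved in closed form by least squares. The clustering-based weight assignment and similarity-coefficient activations of CWS-ELM are not reproduced; the data are toy placeholders.

        import numpy as np

        rng = np.random.default_rng(5)

        def elm_fit(X, y, n_hidden=200):
            # random hidden layer: weights W and biases b are drawn once and frozen
            W = rng.normal(size=(X.shape[1], n_hidden))
            b = rng.normal(size=n_hidden)
            H = np.tanh(X @ W + b)                        # hidden activations
            beta, *_ = np.linalg.lstsq(H, y, rcond=None)  # closed-form output weights
            return W, b, beta

        def elm_predict(X, W, b, beta):
            return np.tanh(X @ W + b) @ beta

        # toy binary screening task on random 'fingerprint' vectors
        X = rng.random((500, 64))
        y = (X[:, :8].sum(axis=1) > 4).astype(float)
        W, b, beta = elm_fit(X[:400], y[:400])
        acc = ((elm_predict(X[400:], W, b, beta) > 0.5) == y[400:]).mean()
        print("test accuracy:", acc)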

  7. Performance Evaluation of Machine Learning Algorithms for Urban Pattern Recognition from Multi-spectral Satellite Images

    Directory of Open Access Journals (Sweden)

    Marc Wieland

    2014-03-01

    Full Text Available In this study, a classification and performance evaluation framework for the recognition of urban patterns in medium (Landsat ETM, TM and MSS and very high resolution (WorldView-2, Quickbird, Ikonos multi-spectral satellite images is presented. The study aims at exploring the potential of machine learning algorithms in the context of an object-based image analysis and to thoroughly test the algorithm’s performance under varying conditions to optimize their usage for urban pattern recognition tasks. Four classification algorithms, Normal Bayes, K Nearest Neighbors, Random Trees and Support Vector Machines, which represent different concepts in machine learning (probabilistic, nearest neighbor, tree-based, function-based, have been selected and implemented on a free and open-source basis. Particular focus is given to assess the generalization ability of machine learning algorithms and the transferability of trained learning machines between different image types and image scenes. Moreover, the influence of the number and choice of training data, the influence of the size and composition of the feature vector and the effect of image segmentation on the classification accuracy is evaluated.

  8. Large-scale Machine Learning in High-dimensional Datasets

    DEFF Research Database (Denmark)

    Hansen, Toke Jansen

    Over the last few decades computers have gotten to play an essential role in our daily life, and data is now being collected in various domains at a faster pace than ever before. This dissertation presents research advances in four machine learning fields that all relate to the challenges imposed...... are better at modeling local heterogeneities. In the field of machine learning for neuroimaging, we introduce learning protocols for real-time functional Magnetic Resonance Imaging (fMRI) that allow for dynamic intervention in the human decision process. Specifically, the model exploits the structure of f...

  9. Less is more: regularization perspectives on large scale machine learning

    CERN Multimedia

    CERN. Geneva

    2017-01-01

    Deep learning based techniques provide a possible solution at the expense of theoretical guidance and, especially, of computational requirements. It is then a key challenge for large scale machine learning to devise approaches guaranteed to be accurate and yet computationally efficient. In this talk, we will consider a regularization perspective on machine learning, appealing to classical ideas in linear algebra and inverse problems to dramatically scale up nonparametric methods such as kernel methods, often dismissed because of prohibitive costs. Our analysis derives optimal theoretical guarantees while providing experimental results at par with or out-performing state of the art approaches.
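
    One classical instance of this scale-up idea is the Nystroem approximation, which replaces the full n x n kernel matrix with a low-rank feature map so that a cheap regularized linear model approximates kernel ridge regression. Below is a minimal scikit-learn sketch on synthetic data, not the speaker's method; the dataset and rank are arbitrary.

        import numpy as np
        from sklearn.kernel_approximation import Nystroem
        from sklearn.linear_model import Ridge
        from sklearn.pipeline import make_pipeline

        rng = np.random.default_rng(6)
        X = rng.random((5000, 10))
        y = np.sin(X.sum(axis=1)) + 0.1 * rng.normal(size=5000)

        # rank-300 Nystroem features + ridge ~ regularized kernel ridge regression
        model = make_pipeline(Nystroem(kernel='rbf', gamma=0.5, n_components=300,
                                       random_state=0),
                              Ridge(alpha=1e-3))
        model.fit(X[:4000], y[:4000])
        print("held-out R^2:", model.score(X[4000:], y[4000:]))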

  10. Conformal prediction for reliable machine learning theory, adaptations and applications

    CERN Document Server

    Balasubramanian, Vineeth; Vovk, Vladimir

    2014-01-01

    The conformal predictions framework is a recent development in machine learning that can associate a reliable measure of confidence with a prediction in any real-world pattern recognition application, including risk-sensitive applications such as medical diagnosis, face recognition, and financial risk prediction. Conformal Predictions for Reliable Machine Learning: Theory, Adaptations and Applications captures the basic theory of the framework, demonstrates how to apply it to real-world problems, and presents several adaptations, including active learning, change detection, and anomaly detecti
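
    The core construction can be sketched in a few lines: split conformal prediction wraps any regressor and uses a held-out calibration set to produce intervals with a finite-sample coverage guarantee. The model, data, and miscoverage level below are arbitrary choices for illustration.

        import numpy as np
        from sklearn.ensemble import RandomForestRegressor

        rng = np.random.default_rng(7)
        X = rng.random((3000, 5))
        y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.3 * rng.normal(size=3000)

        # split: proper training set, calibration set, and new test points
        X_tr, y_tr = X[:2000], y[:2000]
        X_cal, y_cal = X[2000:2800], y[2000:2800]
        X_new = X[2800:]

        model = RandomForestRegressor(random_state=0).fit(X_tr, y_tr)

        # nonconformity scores on the calibration set: absolute residuals
        scores = np.abs(y_cal - model.predict(X_cal))
        alpha = 0.1                                   # target 90% coverage
        level = np.ceil((1 - alpha) * (len(scores) + 1)) / len(scores)
        q = np.quantile(scores, level)

        pred = model.predict(X_new)
        lower, upper = pred - q, pred + q             # conformal intervals
        print("mean interval width:", (upper - lower).mean())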

  11. Machine learning in Python essential techniques for predictive analysis

    CERN Document Server

    Bowles, Michael

    2015-01-01

    Learn a simpler and more effective way to analyze data and predict outcomes with Python Machine Learning in Python shows you how to successfully analyze data using only two core machine learning algorithms, and how to apply them using Python. By focusing on two algorithm families that effectively predict outcomes, this book is able to provide full descriptions of the mechanisms at work, and the examples that illustrate the machinery with specific, hackable code. The algorithms are explained in simple terms with no complex math and applied using Python, with guidance on algorithm selection, d

  12. Modeling Geomagnetic Variations using a Machine Learning Framework

    Science.gov (United States)

    Cheung, C. M. M.; Handmer, C.; Kosar, B.; Gerules, G.; Poduval, B.; Mackintosh, G.; Munoz-Jaramillo, A.; Bobra, M.; Hernandez, T.; McGranaghan, R. M.

    2017-12-01

    We present a framework for data-driven modeling of Heliophysics time series data. The Solar Terrestrial Interaction Neural net Generator (STING) is an open source python module built on top of state-of-the-art statistical learning frameworks (traditional machine learning methods as well as deep learning). To showcase the capability of STING, we deploy it for the problem of predicting the temporal variation of geomagnetic fields. The data used includes solar wind measurements from the OMNI database and geomagnetic field data taken by magnetometers at US Geological Survey observatories. We examine the predictive capability of different machine learning techniques (recurrent neural networks, support vector machines) for a range of forecasting times (minutes to 12 hours). STING is designed to be extensible to other types of data. We show how STING can be used on large sets of data from different sensors/observatories and adapted to tackle other problems in Heliophysics.

  13. Functional discrimination of membrane proteins using machine learning techniques

    Directory of Open Access Journals (Sweden)

    Yabuki Yukimitsu

    2008-03-01

    Full Text Available Abstract Background Discriminating membrane proteins based on their functions is an important task in genome annotation. In this work, we have analyzed the characteristic features of amino acid residues in membrane proteins that perform major functions, such as channels/pores, electrochemical potential-driven transporters and primary active transporters. Results We observed that the residues Asp, Asn and Tyr are dominant in channels/pores whereas the composition of hydrophobic residues, Phe, Gly, Ile, Leu and Val, is high in electrochemical potential-driven transporters. The composition of all the amino acids in primary active transporters lies in between the other two classes of proteins. We have utilized different machine learning algorithms, such as Bayes rule, logistic functions, neural networks, support vector machines and decision trees, for discriminating these classes of proteins. We observed that most of the algorithms discriminated them with similar accuracy. The neural network method discriminated the channels/pores, electrochemical potential-driven transporters and active transporters with a 5-fold cross-validation accuracy of 64% in a data set of 1718 membrane proteins. The application of amino acid occurrence improved the overall accuracy to 68%. In addition, we have discriminated transporters from other α-helical and β-barrel membrane proteins with an accuracy of 85% using the k-nearest neighbor method. The classification of transporters and all other proteins (globular and membrane) showed an accuracy of 82%. Conclusion The performance of discrimination with amino acid occurrence is better than that with amino acid composition. We suggest that this method could be effectively used to discriminate transporters from all other globular and membrane proteins, and to classify them into channels/pores, electrochemical and active transporters.
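
    A minimal sketch of the k-nearest-neighbor discrimination step described above: each protein is reduced to its 20-dimensional amino acid composition vector and classified by majority vote among its neighbors. The sequences and class labels here are randomly generated stand-ins.

        import numpy as np
        from sklearn.neighbors import KNeighborsClassifier
        from sklearn.model_selection import cross_val_score

        AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

        def composition(seq):
            # fraction of each of the 20 amino acids in the sequence
            return np.array([seq.count(a) for a in AMINO_ACIDS]) / len(seq)

        rng = np.random.default_rng(8)
        seqs = ["".join(rng.choice(list(AMINO_ACIDS), size=300)) for _ in range(400)]
        labels = rng.integers(0, 2, size=400)   # 0 = transporter, 1 = other (fake)

        X = np.array([composition(s) for s in seqs])
        knn = KNeighborsClassifier(n_neighbors=5)
        print("5-fold CV accuracy:", cross_val_score(knn, X, labels, cv=5).mean())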

  14. Building Artificial Vision Systems with Machine Learning

    Energy Technology Data Exchange (ETDEWEB)

    LeCun, Yann [New York University]

    2011-02-23

    Three questions pose the next challenge for Artificial Intelligence (AI), robotics, and neuroscience. How do we learn perception (e.g. vision)? How do we learn representations of the perceptual world? How do we learn visual categories from just a few examples?

  15. Machine learning and data science in soft materials engineering.

    Science.gov (United States)

    Ferguson, Andrew L

    2018-01-31

    In many branches of materials science it is now routine to generate data sets of such large size and dimensionality that conventional methods of analysis fail. Paradigms and tools from data science and machine learning can provide scalable approaches to identify and extract trends and patterns within voluminous data sets, perform guided traversals of high-dimensional phase spaces, and furnish data-driven strategies for inverse materials design. This topical review provides an accessible introduction to machine learning tools in the context of soft and biological materials by 'de-jargonizing' data science terminology, presenting a taxonomy of machine learning techniques, and surveying the mathematical underpinnings and software implementations of popular tools, including principal component analysis, independent component analysis, diffusion maps, support vector machines, and relative entropy. We present illustrative examples of machine learning applications in soft matter, including inverse design of self-assembling materials, nonlinear learning of protein folding landscapes, high-throughput antimicrobial peptide design, and data-driven materials design engines. We close with an outlook on the challenges and opportunities for the field.

  16. Inverse Problems in Geodynamics Using Machine Learning Algorithms

    Science.gov (United States)

    Shahnas, M. H.; Yuen, D. A.; Pysklywec, R. N.

    2018-01-01

    During the past few decades numerical studies have been widely employed to explore the style of circulation and mixing in the mantle of Earth and other planets. However, geodynamical studies must incorporate many properties from mineral physics, geochemistry, and petrology into these numerical models. Machine learning, as a computational statistics-related technique and a subfield of artificial intelligence, has rapidly emerged in many fields of science and engineering. We focus here on the application of supervised machine learning (SML) algorithms to predictions of mantle flow processes. Specifically, we emphasize estimating mantle properties by employing machine learning techniques in solving an inverse problem. Using snapshots of numerical convection models as training samples, we enable machine learning models to determine the magnitude of the spin transition-induced density anomalies that can cause flow stagnation at midmantle depths. Employing support vector machine algorithms, we show that SML techniques can successfully predict the magnitude of mantle density anomalies and can also be used in characterizing mantle flow patterns. The technique can be extended to more complex geodynamic problems in mantle dynamics by employing deep learning algorithms for putting constraints on properties such as viscosity, elastic parameters, and the nature of thermal and chemical anomalies.

  17. Machine learning and data science in soft materials engineering

    Science.gov (United States)

    Ferguson, Andrew L.

    2018-01-01

    In many branches of materials science it is now routine to generate data sets of such large size and dimensionality that conventional methods of analysis fail. Paradigms and tools from data science and machine learning can provide scalable approaches to identify and extract trends and patterns within voluminous data sets, perform guided traversals of high-dimensional phase spaces, and furnish data-driven strategies for inverse materials design. This topical review provides an accessible introduction to machine learning tools in the context of soft and biological materials by ‘de-jargonizing’ data science terminology, presenting a taxonomy of machine learning techniques, and surveying the mathematical underpinnings and software implementations of popular tools, including principal component analysis, independent component analysis, diffusion maps, support vector machines, and relative entropy. We present illustrative examples of machine learning applications in soft matter, including inverse design of self-assembling materials, nonlinear learning of protein folding landscapes, high-throughput antimicrobial peptide design, and data-driven materials design engines. We close with an outlook on the challenges and opportunities for the field.

  18. Machine learning methods without tears: a primer for ecologists.

    Science.gov (United States)

    Olden, Julian D; Lawler, Joshua J; Poff, N LeRoy

    2008-06-01

    Machine learning methods, a family of statistical techniques with origins in the field of artificial intelligence, are recognized as holding great promise for the advancement of understanding and prediction about ecological phenomena. These modeling techniques are flexible enough to handle complex problems with multiple interacting elements and typically outcompete traditional approaches (e.g., generalized linear models), making them ideal for modeling ecological systems. Despite their inherent advantages, a review of the literature reveals only a modest use of these approaches in ecology as compared to other disciplines. One potential explanation for this lack of interest is that machine learning techniques do not fall neatly into the class of statistical modeling approaches with which most ecologists are familiar. In this paper, we provide an introduction to three machine learning approaches that can be broadly used by ecologists: classification and regression trees, artificial neural networks, and evolutionary computation. For each approach, we provide a brief background to the methodology, give examples of its application in ecology, describe model development and implementation, discuss strengths and weaknesses, explore the availability of statistical software, and provide an illustrative example. Although the ecological application of machine learning approaches has increased, there remains considerable skepticism with respect to the role of these techniques in ecology. Our review encourages a greater understanding of machine learning approaches and promotes their future application and utilization, while also providing a basis from which ecologists can make informed decisions about whether to select or avoid these approaches in their future modeling endeavors.

  19. Advances in Machine Learning and Data Mining for Astronomy

    Science.gov (United States)

    Way, Michael J.; Scargle, Jeffrey D.; Ali, Kamal M.; Srivastava, Ashok N.

    2012-03-01

    Advances in Machine Learning and Data Mining for Astronomy documents numerous successful collaborations among computer scientists, statisticians, and astronomers who illustrate the application of state-of-the-art machine learning and data mining techniques in astronomy. Due to the massive amount and complexity of data in most scientific disciplines, the material discussed in this text transcends traditional boundaries between various areas in the sciences and computer science. The book's introductory part provides context to issues in the astronomical sciences that are also important to health, social, and physical sciences, particularly probabilistic and statistical aspects of classification and cluster analysis. The next part describes a number of astrophysics case studies that leverage a range of machine learning and data mining technologies. In the last part, developers of algorithms and practitioners of machine learning and data mining show how these tools and techniques are used in astronomical applications. With contributions from leading astronomers and computer scientists, this book is a practical guide to many of the most important developments in machine learning, data mining, and statistics. It explores how these advances can solve current and future problems in astronomy and looks at how they could lead to the creation of entirely new algorithms within the data mining community.

  20. Machine Learning and Conflict Prediction: A Use Case

    Directory of Open Access Journals (Sweden)

    Chris Perry

    2013-10-01

    Full Text Available For at least the last two decades, the international community in general, and the United Nations specifically, have attempted to develop robust, accurate and effective conflict early warning systems for conflict prevention. One potential and promising component of integrated early warning systems lies in the field of machine learning. This paper aims to give conflict analysts a basic understanding of machine learning methodology, and to test the feasibility and added value of such an approach. The paper finds that the selection of appropriate machine learning methodologies can offer substantial improvements in accuracy and performance. It also finds that, even at this early stage in testing machine learning on conflict prediction, full models offer more predictive power than simply using a prior outbreak of violence as the leading indicator of current violence. This suggests that a refined data selection methodology combined with strategic use of machine learning algorithms could indeed offer a significant addition to the early warning toolkit. Finally, the paper suggests a number of steps moving forward to improve upon this initial test methodology.

  1. Machine learning for epigenetics and future medical applications

    OpenAIRE

    Holder, Lawrence B.; Haque, M. Muksitul; Skinner, Michael K.

    2017-01-01

    Understanding epigenetic processes holds immense promise for medical applications. Advances in Machine Learning (ML) are critical to realize this promise. Previous studies used epigenetic data sets associated with the germline transmission of epigenetic transgenerational inheritance of disease and novel ML approaches to predict genome-wide locations of critical epimutations. A combination of Active Learning (ACL) and Imbalanced Class Learning (ICL) was used to address past problems w...

  2. CHISSL: A Human-Machine Collaboration Space for Unsupervised Learning

    Energy Technology Data Exchange (ETDEWEB)

    Arendt, Dustin L.; Komurlu, Caner; Blaha, Leslie M.

    2017-07-14

    We developed CHISSL, a human-machine interface that utilizes supervised machine learning in an unsupervised context to help the user group unlabeled instances by her own mental model. The user primarily interacts via correction (moving a misplaced instance into its correct group) or confirmation (accepting that an instance is placed in its correct group). Concurrent with the user's interactions, CHISSL trains a classification model guided by the user's grouping of the data. It then predicts the group of unlabeled instances and arranges some of these alongside the instances manually organized by the user. We hypothesize that this mode of human and machine collaboration is more effective than Active Learning, wherein the machine decides for itself which instances should be labeled by the user. We found supporting evidence for this hypothesis in a pilot study where we applied CHISSL to organize a collection of handwritten digits.

  3. Machine learning techniques for persuasion detection in conversation

    OpenAIRE

    Ortiz, Pedro.

    2010-01-01

    Approved for public release; distribution is unlimited. We determined that it is possible to automatically detect persuasion in conversations using three traditional machine learning techniques: naive Bayes, maximum entropy, and support vector machines. These results are the first of their kind and serve as a baseline for all future work in this field. The three techniques consistently outperformed the baseline F-score, but not at a level that would be useful for real-world applications. The...

  4. Clustering and Candidate Motif Detection in Exosomal miRNAs by Application of Machine Learning Algorithms.

    Science.gov (United States)

    Gaur, Pallavi; Chaturvedi, Anoop

    2017-07-22

    The clustering pattern and motifs give immense information about any biological data. An application of machine learning algorithms for clustering and candidate motif detection in miRNAs derived from exosomes is depicted in this paper. Recent progress in the field of exosome research, and more particularly regarding exosomal miRNAs, has led to much bioinformatics-based research. Information on the clustering pattern and candidate motifs in miRNAs of exosomal origin would help in analyzing existing as well as newly discovered miRNAs within exosomes. Along with obtaining the clustering pattern and candidate motifs in exosomal miRNAs, this work also elaborates the usefulness of machine learning algorithms that can be efficiently used and executed on various programming languages/platforms. Data were clustered and sequence candidate motifs were detected successfully. The results were compared and validated with some available web tools such as 'BLASTN' and 'MEME suite'. The machine learning algorithms for the aforementioned objectives were applied successfully. This work elaborated the utility of machine learning algorithms and language platforms to achieve the tasks of clustering and candidate motif detection in exosomal miRNAs. With information on the mentioned objectives, deeper insight would be gained for analyses of newly discovered miRNAs in exosomes, which are considered to be circulating biomarkers. In addition, the execution of machine learning algorithms on various language platforms gives users more flexibility to try multiple iterations according to their requirements. This approach can be applied to other biological data-mining tasks as well.

  5. Modern machine learning techniques and their applications in cartoon animation research

    CERN Document Server

    Yu, Jun

    2013-01-01

    The integration of machine learning techniques and cartoon animation research is fast becoming a hot topic. This book helps readers learn the latest machine learning techniques, including patch alignment framework; spectral clustering, graph cuts, and convex relaxation; ensemble manifold learning; multiple kernel learning; multiview subspace learning; and multiview distance metric learning. It then presents the applications of these modern machine learning techniques in cartoon animation research. With these techniques, users can efficiently utilize the cartoon materials to generate animations

  6. SPAM CLASSIFICATION BASED ON SUPERVISED LEARNING USING MACHINE LEARNING TECHNIQUES

    Directory of Open Access Journals (Sweden)

    T. Hamsapriya

    2011-12-01

    Full Text Available E-mail is one of the most popular and frequently used ways of communication due to its worldwide accessibility, relatively fast message transfer, and low sending cost. The flaws in the e-mail protocols and the increasing amount of electronic business and financial transactions directly contribute to the increase in e-mail-based threats. Email spam is one of the major problems of today's Internet, bringing financial damage to companies and annoying individual users. Spam emails invade users without their consent and fill their mailboxes. They consume network capacity as well as time spent checking and deleting spam mail. The vast majority of Internet users are outspoken in their disdain for spam, although enough of them respond to commercial offers that spam remains a viable source of income to spammers. While most users want to do the right thing to avoid and get rid of spam, they need clear and simple guidelines on how to behave. In spite of all the measures taken to eliminate spam, it is not yet eradicated. Also, when the countermeasures are over-sensitive, even legitimate emails will be eliminated. Among the approaches developed to stop spam, filtering is one of the most important techniques. Much research in spam filtering has centered on more sophisticated classifier-related issues. Machine learning for spam classification has recently become an important research issue. This work explores and identifies the use of different learning algorithms for classifying spam messages from e-mail. A comparative analysis among the algorithms is also presented.
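
    As an illustration of the classifier-based filtering discussed above, here is a minimal bag-of-words naive Bayes spam filter; the four messages are placeholders for a real labeled corpus.

        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.naive_bayes import MultinomialNB
        from sklearn.pipeline import make_pipeline

        # tiny placeholder corpus; 1 = spam, 0 = legitimate ("ham")
        emails = ["win a free prize now", "meeting agenda for tomorrow",
                  "cheap loans click here", "please review the attached report"]
        labels = [1, 0, 1, 0]

        model = make_pipeline(CountVectorizer(), MultinomialNB())
        model.fit(emails, labels)
        print(model.predict(["free loans, click now", "agenda for the review meeting"]))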

  7. Predicting Market Impact Costs Using Nonparametric Machine Learning Models.

    Directory of Open Access Journals (Sweden)

    Saerom Park

    Full Text Available Market impact cost is the most significant portion of implicit transaction costs that can reduce the overall transaction cost, although it cannot be measured directly. In this paper, we employed state-of-the-art nonparametric machine learning models: neural networks, Bayesian neural networks, Gaussian processes, and support vector regression, to predict market impact cost accurately and to provide a predictive model that is versatile in the number of variables. We collected a large amount of real single-transaction data for the US stock market from a Bloomberg Terminal and generated three independent input variables. As a result, most nonparametric machine learning models outperformed a state-of-the-art benchmark parametric model, the I-star model, in four error measures. Although these models encounter certain difficulties in separating the permanent and temporary cost directly, nonparametric machine learning models can be good alternatives for reducing transaction costs by considerably improving prediction performance.

  8. Predicting Market Impact Costs Using Nonparametric Machine Learning Models.

    Science.gov (United States)

    Park, Saerom; Lee, Jaewook; Son, Youngdoo

    2016-01-01

    Market impact cost is the most significant portion of implicit transaction costs that can reduce the overall transaction cost, although it cannot be measured directly. In this paper, we employed state-of-the-art nonparametric machine learning models: neural networks, Bayesian neural networks, Gaussian processes, and support vector regression, to predict market impact cost accurately and to provide a predictive model that is versatile in the number of variables. We collected a large amount of real single-transaction data for the US stock market from a Bloomberg Terminal and generated three independent input variables. As a result, most nonparametric machine learning models outperformed a state-of-the-art benchmark parametric model, the I-star model, in four error measures. Although these models encounter certain difficulties in separating the permanent and temporary cost directly, nonparametric machine learning models can be good alternatives for reducing transaction costs by considerably improving prediction performance.

  9. Proceedings of IEEE Machine Learning for Signal Processing Workshop XV

    DEFF Research Database (Denmark)

    Larsen, Jan

    These proceedings contain refereed papers presented at the Fifteenth IEEE Workshop on Machine Learning for Signal Processing (MLSP'2005), held in Mystic, Connecticut, USA, September 28-30, 2005. The workshop is a continuation of the IEEE Workshops on Neural Networks for Signal Processing (NNSP) organized by the NNSP Technical Committee of the IEEE Signal Processing Society. The name of the Technical Committee, and hence of the workshop, was changed to Machine Learning for Signal Processing in September 2003 to better reflect the areas represented by the committee. The conference is organized by the Machine Learning for Signal Processing Technical Committee with the sponsorship of the IEEE Signal Processing Society. Following the practice started two years ago, the bound volume of the proceedings will be published by IEEE following the workshop, and we are pleased to offer it to conference attendees…

  10. Machine Learning Principles Can Improve Hip Fracture Prediction

    DEFF Research Database (Denmark)

    Kruse, Christian; Eiken, Pia; Vestergaard, Peter

    2017-01-01

    Apply machine learning principles to predict hip fractures and estimate predictor importance in dual-energy X-ray absorptiometry (DXA)-scanned men and women. Dual-energy X-ray absorptiometry data from two Danish regions between 1996 and 2006 were combined with national Danish patient data. The best-performing model achieved a test AUC of 0.89 [0.82; 0.95], but with poor calibration at higher probabilities. A ten-predictor subset (BMD, biochemical cholesterol and liver function tests, penicillin use, and osteoarthritis diagnoses) achieved a test AUC of 0.86 [0.78; 0.94] using an "xgbTree" model. Machine learning can improve hip fracture prediction beyond logistic regression using ensemble models. Compiling data from international cohorts with longer follow-up and performing similar machine learning procedures has the potential to further improve discrimination and calibration.

  11. Machine learning in manufacturing: advantages, challenges, and applications

    Directory of Open Access Journals (Sweden)

    Thorsten Wuest

    2016-01-01

    Full Text Available Manufacturing systems exhibit ever more complex, dynamic, and at times even chaotic behavior. In order to satisfy the demand for high-quality products in an efficient manner, it is essential to utilize all available means. One area that has seen fast-paced development in terms of both promising results and usability is machine learning. Promising an answer to many of the old and new challenges of manufacturing, machine learning is widely discussed by researchers and practitioners alike. However, the field is very broad and even confusing, which presents a challenge and a barrier to wide application. This paper contributes an overview of available machine learning techniques and a structuring of this rather complicated area. A special focus is placed on the potential benefits, and on examples of successful applications in a manufacturing environment.

  12. Machine learning for Big Data analytics in plants.

    Science.gov (United States)

    Ma, Chuang; Zhang, Hao Helen; Wang, Xiangfeng

    2014-12-01

    Rapid advances in high-throughput genomic technology have enabled biology to enter the era of 'Big Data' (large datasets). The plant science community not only needs to build its own Big-Data-compatible parallel computing and data management infrastructures, but also to seek novel analytical paradigms to extract information from the overwhelming amounts of data. Machine learning offers promising computational and analytical solutions for the integrative analysis of large, heterogeneous and unstructured datasets on the Big-Data scale, and is gradually gaining popularity in biology. This review introduces the basic concepts and procedures of machine-learning applications and envisages how machine learning could interface with Big Data technology to facilitate basic research and biotechnology in the plant sciences.

  13. Accurate Identification of Cancerlectins through Hybrid Machine Learning Technology.

    Science.gov (United States)

    Zhang, Jieru; Ju, Ying; Lu, Huijuan; Xuan, Ping; Zou, Quan

    2016-01-01

    Cancerlectins are cancer-related proteins that function as lectins. They have been identified through computational techniques, but these techniques have sometimes failed because of the sequence diversity among cancerlectins. Machine learning identification methods, such as support vector machines with basic sequence features (n-grams), have also been used to identify cancerlectins. In this study, various protein fingerprint features and advanced classifiers, including ensemble learning techniques, were utilized to identify this group of proteins. We improved the prediction accuracy of the original feature extraction methods and classification algorithms by more than 10% on average. Our work provides a basis for the computational identification of cancerlectins and reveals the power of hybrid machine learning techniques in computational proteomics.

  14. The cerebellum: a neuronal learning machine?

    Science.gov (United States)

    Raymond, J. L.; Lisberger, S. G.; Mauk, M. D.

    1996-01-01

    Comparison of two seemingly quite different behaviors yields a surprisingly consistent picture of the role of the cerebellum in motor learning. Behavioral and physiological data about classical conditioning of the eyelid response and motor learning in the vestibulo-ocular reflex suggest that (i) plasticity is distributed between the cerebellar cortex and the deep cerebellar nuclei; (ii) the cerebellar cortex plays a special role in learning the timing of movement; and (iii) the cerebellar cortex guides learning in the deep nuclei, which may allow learning to be transferred from the cortex to the deep nuclei. Because many of the similarities in the data from the two systems typify general features of cerebellar organization, the cerebellar mechanisms of learning in these two systems may represent principles that apply to many motor systems.

  15. Robust visual tracking via multi-task sparse learning

    KAUST Repository

    Zhang, Tianzhu; Ghanem, Bernard; Liu, Si; Ahuja, Narendra

    2012-01-01

    In this paper, we formulate object tracking in a particle filter framework as a multi-task sparse learning problem, which we denote as Multi-Task Tracking (MTT). Since we model particles as linear combinations of dictionary templates…

  16. A strategy for quantum algorithm design assisted by machine learning

    International Nuclear Information System (INIS)

    Bang, Jeongho; Lee, Jinhyoung; Ryu, Junghee; Yoo, Seokwon; Pawłowski, Marcin

    2014-01-01

    We propose a method for quantum algorithm design assisted by machine learning. The method uses a quantum–classical hybrid simulator, where a ‘quantum student’ is being taught by a ‘classical teacher’. In other words, in our method, the learning system is supposed to evolve into a quantum algorithm for a given problem, assisted by a classical main-feedback system. Our method is applicable for designing quantum oracle-based algorithms. We chose, as a case study, an oracle decision problem, called a Deutsch–Jozsa problem. We showed by using Monte Carlo simulations that our simulator can faithfully learn a quantum algorithm for solving the problem for a given oracle. Remarkably, the learning time is proportional to the square root of the total number of parameters, rather than showing the exponential dependence found in the classical machine learning-based method. (paper)

  17. A strategy for quantum algorithm design assisted by machine learning

    Science.gov (United States)

    Bang, Jeongho; Ryu, Junghee; Yoo, Seokwon; Pawłowski, Marcin; Lee, Jinhyoung

    2014-07-01

    We propose a method for quantum algorithm design assisted by machine learning. The method uses a quantum-classical hybrid simulator, where a ‘quantum student’ is being taught by a ‘classical teacher’. In other words, in our method, the learning system is supposed to evolve into a quantum algorithm for a given problem, assisted by a classical main-feedback system. Our method is applicable for designing quantum oracle-based algorithms. We chose, as a case study, an oracle decision problem, called a Deutsch-Jozsa problem. We showed by using Monte Carlo simulations that our simulator can faithfully learn a quantum algorithm for solving the problem for a given oracle. Remarkably, the learning time is proportional to the square root of the total number of parameters, rather than showing the exponential dependence found in the classical machine learning-based method.

  18. Predicting Solar Activity Using Machine-Learning Methods

    Science.gov (United States)

    Bobra, M.

    2017-12-01

    Of all the activity observed on the Sun, two of the most energetic events are flares and coronal mass ejections. However, we do not yet fully understand the physical mechanism that triggers solar eruptions. A machine-learning algorithm, which is favorable in cases where the amount of data is large, is one way to (1) empirically determine the signatures of this mechanism in solar image data and (2) use them to predict solar activity. In this talk, we discuss the application of various machine learning algorithms - specifically, a support vector machine, a sparse linear regression (Lasso), and a convolutional neural network - to image data from the photosphere, chromosphere, transition region, and corona taken by instruments aboard the Solar Dynamics Observatory, in order to predict solar activity on a variety of time scales. Such an approach may be useful since, at the present time, there are no physical models of flares available for real-time prediction. We discuss our results (Bobra and Couvidat, 2015; Bobra and Ilonidis, 2016; Jonas et al., 2017) as well as other attempts to predict flares using machine learning (e.g., Ahmed et al., 2013; Nishizuka et al., 2017), and compare these results with the more traditional techniques used by the NOAA Space Weather Prediction Center (Crown, 2012). We also discuss some of the challenges in using machine-learning algorithms for space science applications.

  19. Study of Environmental Data Complexity using Extreme Learning Machine

    Science.gov (United States)

    Leuenberger, Michael; Kanevski, Mikhail

    2017-04-01

    The main goals of environmental data science using machine learning algorithms deal, in a broad sense, with the calibration, prediction, and visualization of hidden relationships between input and output variables. In order to optimize the models and to understand the phenomenon under study, the characterization of complexity (at different levels) should be taken into account. Therefore, identifying linear or non-linear behavior between input and output variables adds valuable information about the complexity of the phenomenon. The present research highlights and investigates the issues that can occur when identifying the complexity (linear/non-linear) of environmental data using machine learning algorithms. In particular, the main attention is paid to the description of a self-consistent methodology for the use of Extreme Learning Machines (ELM; Huang et al., 2006), which have recently gained great popularity. By applying two ELM models (with linear and non-linear activation functions) and comparing their efficiency, the degree of linearity can be quantified. The approach is illustrated with case studies on simulated and real high-dimensional, multivariate data. In conclusion, the current challenges and future developments in complexity quantification using environmental data mining are discussed. References - Huang, G.-B., Zhu, Q.-Y., Siew, C.-K., 2006. Extreme learning machine: theory and applications. Neurocomputing 70 (1-3), 489-501. - Kanevski, M., Pozdnoukhov, A., Timonin, V., 2009. Machine Learning for Spatial Environmental Data. EPFL Press, Lausanne, Switzerland, p. 392. - Leuenberger, M., Kanevski, M., 2015. Extreme Learning Machines for spatial environmental data. Computers and Geosciences 85, 64-73.
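
    The linear-versus-nonlinear ELM comparison outlined in the abstract can be sketched in a few lines of NumPy. This is a generic ELM (random hidden layer plus least-squares output weights), not the authors' implementation; the data are synthetic, and a large gap between the two test errors would indicate a strongly nonlinear input-output relationship.

```python
# Sketch: quantify (non)linearity by comparing a linear-activation ELM
# against a tanh-activation ELM on the same data.
import numpy as np

def elm_fit_predict(X_tr, y_tr, X_te, n_hidden=100, activation=np.tanh, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X_tr.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H_tr = activation(X_tr @ W + b)       # random (untrained) hidden layer
    beta = np.linalg.pinv(H_tr) @ y_tr    # least-squares output weights
    return activation(X_te @ W + b) @ beta

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(400, 2))
y = np.sin(3 * X[:, 0]) + X[:, 1]         # a mildly nonlinear synthetic target
X_tr, X_te, y_tr, y_te = X[:300], X[300:], y[:300], y[300:]

for name, act in [("linear", lambda z: z), ("nonlinear (tanh)", np.tanh)]:
    err = np.mean((elm_fit_predict(X_tr, y_tr, X_te, activation=act) - y_te) ** 2)
    print(f"{name} ELM test MSE: {err:.4f}")
```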

  20. Self-Efficacy, Task Complexity and Task Performance: Exploring Interactions in Two Versions of Vocabulary Learning Tasks

    Science.gov (United States)

    Wu, Xiaoli; Lowyck, Joost; Sercu, Lies; Elen, Jan

    2012-01-01

    The present study aimed at a better understanding of the interactions between task complexity, students' self-efficacy beliefs, and students' use of learning strategies, and of their interacting effects on task performance. This investigation was carried out in the context of Chinese students learning English as a foreign language in a…

  1. Developing a PLC-friendly state machine model: lessons learned

    Science.gov (United States)

    Pessemier, Wim; Deconinck, Geert; Raskin, Gert; Saey, Philippe; Van Winckel, Hans

    2014-07-01

    Modern Programmable Logic Controllers (PLCs) have become an attractive platform for controlling real-time aspects of astronomical telescopes and instruments due to their increased versatility, performance and standardization. Likewise, vendor-neutral middleware technologies such as OPC Unified Architecture (OPC UA) have recently demonstrated that they can greatly facilitate the integration of these industrial platforms into the overall control system. Many practical questions arise, however, when building multi-tiered control systems that consist of PLCs for low level control, and conventional software and platforms for higher level control. How should the PLC software be structured, so that it can rely on well-known programming paradigms on the one hand, and be mapped to a well-organized OPC UA interface on the other hand? Which programming languages of the IEC 61131-3 standard closely match the problem domains of the abstraction levels within this structure? How can the recent additions to the standard (such as the support for namespaces and object-oriented extensions) facilitate a model-based development approach? To what degree can our applications already take advantage of the more advanced parts of the OPC UA standard, such as the high expressiveness of the semantic modeling language that it defines, or the support for events, aggregation of data, automatic discovery, ... ? What are the timing and concurrency problems to be expected for the higher level tiers of the control system due to the cyclic execution of control and communication tasks by the PLCs? We try to answer these questions by demonstrating a semantic state machine model that can readily be implemented using IEC 61131 and OPC UA: one that does not aim to capture all possible states of a system, but rather attempts to organize the coarse-grained structure and behaviour of a system. In this paper we focus on the intricacies of this seemingly simple task, and on the lessons that we learned.

  2. A machine learning approach to the accurate prediction of monitor units for a compact proton machine.

    Science.gov (United States)

    Sun, Baozhou; Lam, Dao; Yang, Deshan; Grantham, Kevin; Zhang, Tiezhi; Mutic, Sasa; Zhao, Tianyu

    2018-05-01

    Clinical treatment planning systems for proton therapy currently do not calculate monitor units (MUs) in passive-scatter proton therapy due to the complexity of the beam delivery systems. Physical phantom measurements are commonly employed to determine the field-specific output factors (OFs) but are often subject to limited machine time, measurement uncertainties, and intensive labor. In this study, a machine learning-based approach was developed to predict output (cGy/MU) and derive MUs, incorporating the dependencies on gantry angle and field size for a single-room proton therapy system. The goal of this study was to develop a secondary check tool for OF measurements and eventually eliminate patient-specific OF measurements. The OFs of 1754 fields previously measured in a water phantom with calibrated ionization chambers and electrometers for patient-specific fields, with various range and modulation width combinations for 23 options, were included in this study. The training data sets for machine learning models in three different methods (Random Forest, XGBoost and Cubist) included 1431 (~81%) OFs. Ten-fold cross-validation was used to prevent "overfitting" and to validate each model. The remaining 323 (~19%) OFs were used to test the trained models. The differences between the measured and predicted values from the machine learning models were analyzed. Model prediction accuracy was also compared with that of the semi-empirical model developed by Kooy (Phys. Med. Biol. 50, 2005). Additionally, the gantry angle dependence of OFs was measured for three groups of options categorized by the selection of the second scatterers. The field size dependence of OFs was investigated for measurements with and without patient-specific apertures. All three machine learning methods showed higher accuracy than the semi-empirical model, which showed considerably large discrepancies of up to 7.7% for treatment fields with full range and full modulation width. The Cubist-based solution…
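
    The core modelling step here, regressing output (cGy/MU) on field parameters with ten-fold cross-validation, might look like the following sketch. The features and target below are synthetic stand-ins for the measured range, modulation width, gantry angle and field-size dependencies, not the study's data.

```python
# Sketch: random-forest regression of a proton output factor on field
# parameters, validated with 10-fold cross-validation as in the study.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.uniform(size=(1754, 4))  # stand-ins: range, mod. width, gantry, field size
y = (1.0 + 0.3 * X[:, 0] - 0.2 * X[:, 1]
     + 0.05 * np.sin(6 * X[:, 2]) + rng.normal(0, 0.01, len(X)))

model = RandomForestRegressor(n_estimators=200, random_state=0)
scores = cross_val_score(model, X, y, cv=10, scoring="neg_mean_absolute_error")
print(f"10-fold CV mean absolute error: {-scores.mean():.4f}")
```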

  3. Advances in machine learning and data mining for astronomy

    CERN Document Server

    Way, Michael J

    2012-01-01

    Advances in Machine Learning and Data Mining for Astronomy documents numerous successful collaborations among computer scientists, statisticians, and astronomers who illustrate the application of state-of-the-art machine learning and data mining techniques in astronomy. Due to the massive amount and complexity of data in most scientific disciplines, the material discussed in this text transcends traditional boundaries between various areas in the sciences and computer science. The book's introductory part provides context to issues in the astronomical sciences that are also important to health

  4. Machine learning and computer vision approaches for phenotypic profiling.

    Science.gov (United States)

    Grys, Ben T; Lo, Dara S; Sahin, Nil; Kraus, Oren Z; Morris, Quaid; Boone, Charles; Andrews, Brenda J

    2017-01-02

    With recent advances in high-throughput, automated microscopy, there has been an increased demand for effective computational strategies to analyze large-scale, image-based data. To this end, computer vision approaches have been applied to cell segmentation and feature extraction, whereas machine-learning approaches have been developed to aid in phenotypic classification and clustering of data acquired from biological images. Here, we provide an overview of the commonly used computer vision and machine-learning methods for generating and categorizing phenotypic profiles, highlighting the general biological utility of each approach.

  5. Machine learning techniques to examine large patient databases.

    Science.gov (United States)

    Meyfroidt, Geert; Güiza, Fabian; Ramon, Jan; Bruynooghe, Maurice

    2009-03-01

    Computerization in healthcare in general, and in the operating room (OR) and intensive care unit (ICU) in particular, is on the rise. This leads to large patient databases, with specific properties. Machine learning techniques are able to examine and to extract knowledge from large databases in an automatic way. Although the number of potential applications for these techniques in medicine is large, few medical doctors are familiar with their methodology, advantages and pitfalls. A general overview of machine learning techniques, with a more detailed discussion of some of these algorithms, is presented in this review.

  6. Advances in independent component analysis and learning machines

    CERN Document Server

    Bingham, Ella; Laaksonen, Jorma; Lampinen, Jouko

    2015-01-01

    In honour of Professor Erkki Oja, one of the pioneers of Independent Component Analysis (ICA), this book reviews key advances in the theory and application of ICA, as well as its influence on signal processing, pattern recognition, machine learning, and data mining. Examples of topics covered in the book, which have developed from the advances of ICA, are: a unifying probabilistic model for PCA and ICA; optimization methods for matrix decompositions; insights into the FastICA algorithm; unsupervised deep learning; machine vision and image retrieval; and a review of developments in the t…

  7. Energy landscapes for a machine learning application to series data

    Energy Technology Data Exchange (ETDEWEB)

    Ballard, Andrew J.; Stevenson, Jacob D.; Das, Ritankar; Wales, David J., E-mail: dw34@cam.ac.uk [University Chemical Laboratories, Lensfield Road, Cambridge CB2 1EW (United Kingdom)

    2016-03-28

    Methods developed to explore and characterise potential energy landscapes are applied to the corresponding landscapes obtained from optimisation of a cost function in machine learning. We consider neural network predictions for the outcome of local geometry optimisation in a triatomic cluster, where four distinct local minima exist. The accuracy of the predictions is compared for fits using data from single and multiple points in the series of atomic configurations resulting from local geometry optimisation and for alternative neural networks. The machine learning solution landscapes are visualised using disconnectivity graphs, and signatures in the effective heat capacity are analysed in terms of distributions of local minima and their properties.

  8. Energy landscapes for a machine learning application to series data

    International Nuclear Information System (INIS)

    Ballard, Andrew J.; Stevenson, Jacob D.; Das, Ritankar; Wales, David J.

    2016-01-01

    Methods developed to explore and characterise potential energy landscapes are applied to the corresponding landscapes obtained from optimisation of a cost function in machine learning. We consider neural network predictions for the outcome of local geometry optimisation in a triatomic cluster, where four distinct local minima exist. The accuracy of the predictions is compared for fits using data from single and multiple points in the series of atomic configurations resulting from local geometry optimisation and for alternative neural networks. The machine learning solution landscapes are visualised using disconnectivity graphs, and signatures in the effective heat capacity are analysed in terms of distributions of local minima and their properties.

  9. Machine Learning and Neurosurgical Outcome Prediction: A Systematic Review.

    Science.gov (United States)

    Senders, Joeky T; Staples, Patrick C; Karhade, Aditya V; Zaki, Mark M; Gormley, William B; Broekman, Marike L D; Smith, Timothy R; Arnaout, Omar

    2018-01-01

    Accurate measurement of surgical outcomes is highly desirable to optimize surgical decision-making. An important element of surgical decision-making is identification of the patient cohort that will benefit from surgery before the intervention. Machine learning (ML) enables computers to learn from previous data to make accurate predictions on new data. In this systematic review, we evaluate the potential of ML for neurosurgical outcome prediction. A systematic search in the PubMed and Embase databases was performed to identify all potentially relevant studies up to January 1, 2017. Thirty studies were identified that evaluated ML algorithms used as prediction models for survival, recurrence, symptom improvement, and adverse events in patients undergoing surgery for epilepsy, brain tumor, spinal lesions, neurovascular disease, movement disorders, traumatic brain injury, and hydrocephalus. Depending on the specific prediction task evaluated and the type of input features included, ML models predicted outcomes after neurosurgery with a median accuracy of 94.5% and a median area under the receiver operating characteristic curve of 0.83. Compared with logistic regression, ML models performed significantly better, showing median absolute improvements in accuracy and area under the curve of 15% and 0.06, respectively. Some studies also demonstrated better performance of ML models compared with established prognostic indices and clinical experts. In the research setting, ML has been studied extensively, demonstrating excellent performance in outcome prediction for a wide range of neurosurgical conditions. However, future studies should investigate how ML can be implemented as a practical tool supporting neurosurgical care.

  10. Individual differences in implicit motor learning: task specificity in sensorimotor adaptation and sequence learning.

    Science.gov (United States)

    Stark-Inbar, Alit; Raza, Meher; Taylor, Jordan A; Ivry, Richard B

    2017-01-01

    In standard taxonomies, motor skills are typically treated as representative of implicit or procedural memory. We examined two emblematic tasks of implicit motor learning, sensorimotor adaptation and sequence learning, asking whether individual differences in learning are correlated between these tasks, as well as how individual differences within each task are related to different performance variables. As a prerequisite, it was essential to establish the reliability of learning measures for each task. Participants were tested twice on a visuomotor adaptation task and on a sequence learning task, either the serial reaction time task or the alternating reaction time task. Learning was evident in all tasks at the group level and reliable at the individual level in visuomotor adaptation and the alternating reaction time task but not in the serial reaction time task. Performance variability was predictive of learning in both domains, yet the relationship was in the opposite direction for adaptation and sequence learning. For the former, faster learning was associated with lower variability, consistent with models of sensorimotor adaptation in which learning rates are sensitive to noise. For the latter, greater learning was associated with higher variability and slower reaction times, factors that may facilitate the spread of activation required to form predictive, sequential associations. Interestingly, learning measures of the different tasks were not correlated. Together, these results oppose a shared process for implicit learning in sensorimotor adaptation and sequence learning and provide insight into the factors that account for individual differences in learning within each task domain. We investigated individual differences in the ability to implicitly learn motor skills. As a prerequisite, we assessed whether individual differences were reliable across test sessions. We found that two commonly used tasks of implicit learning, visuomotor adaptation and the

  11. ROBOT LEARNING OF OBJECT MANIPULATION TASK ACTIONS FROM HUMAN DEMONSTRATIONS

    Directory of Open Access Journals (Sweden)

    Maria Kyrarini

    2017-08-01

    Full Text Available Robot learning from demonstration is a method that enables robots to learn in a similar way to humans. In this paper, a framework that enables robots to learn from multiple human demonstrations via kinesthetic teaching is presented. The subject of learning is a high-level sequence of actions, as well as the low-level trajectories the robot must follow to perform the object manipulation task. The multiple human demonstrations are recorded, and only the most similar demonstrations are selected for robot learning. The high-level learning module identifies the sequence of actions of the demonstrated task. Using Dynamic Time Warping (DTW) and a Gaussian Mixture Model (GMM), the model of the demonstrated trajectories is learned, and the learned trajectory is generated by Gaussian mixture regression (GMR) from the learned Gaussian mixture model. In the online working phase, the sequence of actions is identified, and experimental results show that the robot performs the learned task successfully.
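
    The GMM + GMR step of such a pipeline can be sketched compactly. Below, a Gaussian mixture is fitted to joint (time, position) samples from a few noisy synthetic "demonstrations", and Gaussian mixture regression recovers the conditional mean trajectory. This is a generic one-dimensional illustration, not the authors' framework.

```python
# Sketch: learn a trajectory model with a GMM, then regress with GMR.
import numpy as np
from scipy.stats import norm
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
t = np.tile(np.linspace(0, 1, 100), 5)                 # 5 synthetic demonstrations
x = np.sin(2 * np.pi * t) + rng.normal(0, 0.05, t.size)
gmm = GaussianMixture(n_components=6, random_state=0).fit(np.column_stack([t, x]))

def gmr(t_query):
    """Gaussian mixture regression: E[x | t] from the joint GMM."""
    mu, cov, w = gmm.means_, gmm.covariances_, gmm.weights_
    h = w * norm.pdf(t_query, mu[:, 0], np.sqrt(cov[:, 0, 0]))
    h /= h.sum()                                        # component responsibilities
    cond = mu[:, 1] + cov[:, 1, 0] / cov[:, 0, 0] * (t_query - mu[:, 0])
    return float(h @ cond)                              # weighted conditional mean

print([round(gmr(tq), 2) for tq in (0.0, 0.25, 0.5, 0.75)])
```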

  12. Splendidly blended: a machine learning set up for CDU control

    Science.gov (United States)

    Utzny, Clemens

    2017-06-01

    As the concepts of machine learning and artificial intelligence continue to grow in importance in the context of internet-related applications, machine learning is still in its infancy when it comes to process control within the semiconductor industry. The branch of mask manufacturing in particular presents a challenge for machine learning, since the business process intrinsically induces pronounced product variability against a background of small plate numbers. In this paper we present the architectural setup of a machine learning algorithm that successfully deals with the demands and pitfalls of mask manufacturing. A detailed motivation of this basic setup is given, followed by an analysis of its statistical properties. The machine learning setup for mask manufacturing involves two learning steps: an initial step identifies and classifies the basic global CD patterns of a process, and these results form the basis for the extraction of an optimized training set via balanced sampling. A second learning step uses this training set to obtain the local as well as global CD relationships induced by the manufacturing process. Using two production-motivated examples, we show how this approach is flexible and powerful enough to deal with the exacting demands of mask manufacturing. In one example we show how dedicated covariates can be used in conjunction with increased spatial resolution of the CD map model to deal with pathological CD effects at the mask boundary. The other example shows how the model setup enables strategies for dealing with tool-specific CD signature differences. In this case, balanced sampling enables a process control scheme that allows usage of the full tool park within the specified tight tolerance budget. Overall, this paper shows that the current rapid developments of machine learning algorithms can be successfully used within the context of semiconductor manufacturing.

  13. Machine learning of the reactor core loading pattern critical parameters

    International Nuclear Information System (INIS)

    Trontl, K.; Pevec, D.; Smuc, T.

    2007-01-01

    The usual approach to loading pattern optimization involves high degree of engineering judgment, a set of heuristic rules, an optimization algorithm and a computer code used for evaluating proposed loading patterns. The speed of the optimization process is highly dependent on the computer code used for the evaluation. In this paper we investigate the applicability of a machine learning model which could be used for fast loading pattern evaluation. We employed a recently introduced machine learning technique, Support Vector Regression (SVR), which has a strong theoretical background in statistical learning theory. Superior empirical performance of the method has been reported on difficult regression problems in different fields of science and technology. SVR is a data driven, kernel based, nonlinear modelling paradigm, in which model parameters are automatically determined by solving a quadratic optimization problem. The main objective of the work reported in this paper was to evaluate the possibility of applying SVR method for reactor core loading pattern modelling. The starting set of experimental data for training and testing of the machine learning algorithm was obtained using a two-dimensional diffusion theory reactor physics computer code. We illustrate the performance of the solution and discuss its applicability, i.e., complexity, speed and accuracy, with a projection to a more realistic scenario involving machine learning from the results of more accurate and time consuming three-dimensional core modelling code. (author)

  14. Machine-Learning Algorithms to Code Public Health Spending Accounts.

    Science.gov (United States)

    Brady, Eoghan S; Leider, Jonathon P; Resnick, Beth A; Alfonso, Y Natalia; Bishai, David

    Government public health expenditure data sets require time- and labor-intensive manipulation to summarize results that public health policy makers can use. Our objective was to compare the performances of machine-learning algorithms with manual classification of public health expenditures to determine if machines could provide a faster, cheaper alternative to manual classification. We used machine-learning algorithms to replicate the process of manually classifying state public health expenditures, using the standardized public health spending categories from the Foundational Public Health Services model and a large data set from the US Census Bureau. We obtained a data set of 1.9 million individual expenditure items from 2000 to 2013. We collapsed these data into 147 280 summary expenditure records, and we followed a standardized method of manually classifying each expenditure record as public health, maybe public health, or not public health. We then trained 9 machine-learning algorithms to replicate the manual process. We calculated recall, precision, and coverage rates to measure the performance of individual and ensembled algorithms. Compared with manual classification, the machine-learning random forests algorithm produced 84% recall and 91% precision. With algorithm ensembling, we achieved our target criterion of 90% recall by using a consensus ensemble of ≥6 algorithms while still retaining 93% coverage, leaving only 7% of the summary expenditure records unclassified. Machine learning can be a time- and cost-saving tool for estimating public health spending in the United States. It can be used with standardized public health spending categories based on the Foundational Public Health Services model to help parse public health expenditure information from other types of health-related spending, provide data that are more comparable across public health organizations, and evaluate the impact of evidence-based public health resource allocation.
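
    A minimal sketch of this kind of expenditure coding, assuming toy records in place of the Census data: vectorize free-text expenditure descriptions, train a random forest, and report the recall and precision metrics the study uses.

```python
# Sketch: classify expenditure descriptions as public health or not,
# then report recall and precision. Records here are illustrative only.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

records = ["restaurant inspection program", "sewer line construction",
           "immunization clinic staffing", "parks grounds maintenance",
           "epidemiology surveillance unit", "road resurfacing project",
           "tobacco cessation outreach", "courthouse roof repair"] * 10
labels = [1, 0, 1, 0, 1, 0, 1, 0] * 10   # 1 = public health spending

X_tr, X_te, y_tr, y_te = train_test_split(records, labels, random_state=0)
pipe = make_pipeline(TfidfVectorizer(), RandomForestClassifier(random_state=0))
pipe.fit(X_tr, y_tr)
pred = pipe.predict(X_te)
print("recall:", recall_score(y_te, pred), "precision:", precision_score(y_te, pred))
```

    The study's consensus-ensembling idea corresponds to running several such classifiers and accepting a label only when enough of them agree, trading coverage for recall.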

  15. Scaling up machine learning: parallel and distributed approaches

    National Research Council Canada - National Science Library

    Bekkerman, Ron; Bilenko, Mikhail; Langford, John

    2012-01-01

    .... Demand for parallelizing learning algorithms is highly task-specific: in some settings it is driven by the enormous dataset sizes, in others by model complexity or by real-time performance requirements...

  16. Improved Extreme Learning Machine based on the Sensitivity Analysis

    Science.gov (United States)

    Cui, Licheng; Zhai, Huawei; Wang, Benchao; Qu, Zengtang

    2018-03-01

    Extreme learning machines and their improved variants are weak on some points, such as computational complexity and learning error. After deep analysis, and drawing on the importance of hidden nodes in SVMs, a novel sensitivity analysis method is proposed that matches human cognitive habits. Based on this, an improved ELM is proposed that can remove hidden nodes before the learning error target is met and can efficiently manage the number of hidden nodes, so as to improve performance. Comparative tests show that it performs better in learning time, accuracy, and other respects.

  17. Outsmarting neural networks: an alternative paradigm for machine learning

    Energy Technology Data Exchange (ETDEWEB)

    Protopopescu, V.; Rao, N.S.V.

    1996-10-01

    We address three problems in machine learning, namely: (i) function learning, (ii) regression estimation, and (iii) sensor fusion, in the Probably Approximately Correct (PAC) framework. We show that, under certain conditions, the three problems above can be reduced to regression estimation. The latter is usually tackled with artificial neural networks (ANNs) that satisfy the PAC criteria but have high computational complexity. We propose several computationally efficient PAC alternatives to ANNs for solving regression estimation, thereby also providing efficient PAC solutions to the function learning and sensor fusion problems. The approach is based on cross-fertilizing concepts and methods from statistical estimation, nonlinear algorithms, and the theory of computational complexity, and is designed as part of a new, coherent paradigm for machine learning.

  18. RG-inspired machine learning for lattice field theory

    Directory of Open Access Journals (Sweden)

    Foreman Sam

    2018-01-01

    Full Text Available Machine learning has been a fast-growing field of research in several areas dealing with large datasets. We report recent attempts to use renormalization group (RG) ideas in the context of machine learning. We examine coarse-graining procedures for perceptron models designed to identify the digits of the MNIST data. We discuss the correspondence between principal component analysis (PCA) and RG flows across the transition for worm configurations of the 2D Ising model. Preliminary results regarding the logarithmic divergence of the leading PCA eigenvalue were presented at the conference. More generally, we discuss the relationship between PCA and observables in Monte Carlo simulations and the possibility of reducing the number of learning parameters in supervised learning based on RG-inspired hierarchical ansatzes.

  19. RG-inspired machine learning for lattice field theory

    Science.gov (United States)

    Foreman, Sam; Giedt, Joel; Meurice, Yannick; Unmuth-Yockey, Judah

    2018-03-01

    Machine learning has been a fast growing field of research in several areas dealing with large datasets. We report recent attempts to use renormalization group (RG) ideas in the context of machine learning. We examine coarse graining procedures for perceptron models designed to identify the digits of the MNIST data. We discuss the correspondence between principal components analysis (PCA) and RG flows across the transition for worm configurations of the 2D Ising model. Preliminary results regarding the logarithmic divergence of the leading PCA eigenvalue were presented at the conference. More generally, we discuss the relationship between PCA and observables in Monte Carlo simulations and the possibility of reducing the number of learning parameters in supervised learning based on RG inspired hierarchical ansatzes.
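
    The PCA observable discussed in both records is easy to sketch: given an array of Monte Carlo spin configurations (here random, magnetized stand-ins rather than actual 2D Ising worm configurations), the leading PCA eigenvalue can be computed directly.

```python
# Sketch: leading PCA eigenvalue of spin configurations. Synthetic
# "low-temperature" samples replace real Monte Carlo data.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
L = 16
# mostly aligned +/-1 spins with noise, flattened to feature vectors
configs = np.sign(0.8 + rng.normal(0, 1, size=(500, L * L)))

pca = PCA(n_components=5).fit(configs)
print("leading eigenvalues:", np.round(pca.explained_variance_[:3], 3))
```

    Tracking how the leading eigenvalue grows as the sampling temperature crosses the transition is the kind of PCA-RG correspondence the abstracts describe.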

  20. Machine Learning for Treatment Assignment: Improving Individualized Risk Attribution.

    Science.gov (United States)

    Weiss, Jeremy; Kuusisto, Finn; Boyd, Kendrick; Liu, Jie; Page, David

    2015-01-01

    Clinical studies model the average treatment effect (ATE), but apply this population-level effect to future individuals. Due to recent developments of machine learning algorithms with useful statistical guarantees, we argue instead for modeling the individualized treatment effect (ITE), which has better applicability to new patients. We compare ATE-estimation using randomized and observational analysis methods against ITE-estimation using machine learning, and describe how the ITE theoretically generalizes to new population distributions, whereas the ATE may not. On a synthetic data set of statin use and myocardial infarction (MI), we show that a learned ITE model improves true ITE estimation and outperforms the ATE. We additionally argue that ITE models should be learned with a consistent, nonparametric algorithm from unweighted examples and show experiments in favor of our argument using our synthetic data model and a real data set of D-penicillamine use for primary biliary cirrhosis.
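
    One simple way to estimate an ITE with machine learning, in the spirit of this abstract, is a "T-learner": fit a separate outcome model per treatment arm and subtract the predictions. The sketch below uses synthetic data with a covariate-dependent treatment effect; it illustrates the idea and is not the paper's method.

```python
# Sketch: individualized treatment effect via a T-learner.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.uniform(size=(2000, 3))            # patient covariates
treat = rng.integers(0, 2, size=2000)      # randomized treatment indicator
# outcome with a covariate-dependent treatment effect
y = X[:, 0] + treat * (0.5 - X[:, 1]) + rng.normal(0, 0.1, 2000)

m1 = RandomForestRegressor(random_state=0).fit(X[treat == 1], y[treat == 1])
m0 = RandomForestRegressor(random_state=0).fit(X[treat == 0], y[treat == 0])
ite = m1.predict(X) - m0.predict(X)        # per-patient effect estimate

print("ATE (mean of ITEs):", round(ite.mean(), 3),
      "ITE range:", round(ite.min(), 3), "to", round(ite.max(), 3))
```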

  1. Concurrent Learning of Control in Multi agent Sequential Decision Tasks

    Science.gov (United States)

    2018-04-17

    Concurrent Learning of Control in Multi-agent Sequential Decision Tasks The overall objective of this project was to develop multi-agent reinforcement... learning (MARL) approaches for intelligent agents to autonomously learn distributed control policies in decentral- ized partially observable... learning of policies in Dec-POMDPs, established performance bounds, evaluated these algorithms both theoretically and empirically, The views

  2. Machine learning-based dual-energy CT parametric mapping.

    Science.gov (United States)

    Su, Kuan-Hao; Kuo, Jung-Wen; Jordan, David W; Van Hedent, Steven; Klahr, Paul; Wei, Zhouping; Al Helo, Rose; Liang, Fan; Qian, Pengjiang; Pereira, Gisele C; Rassouli, Negin; Gilkeson, Robert C; Traughber, Bryan J; Cheng, Chee-Wai; Muzic, Raymond F

    2018-05-22

    The aim is to develop and evaluate machine learning methods for generating quantitative parametric maps of effective atomic number (Zeff), relative electron density (ρe), mean excitation energy (Ix), and relative stopping power (RSP) from clinical dual-energy CT data. The maps could be used for material identification and radiation dose calculation. Machine learning methods of historical centroid (HC), random forest (RF), and artificial neural networks (ANN) were used to learn the relationship between dual-energy CT input data and ideal output parametric maps calculated for phantoms from the known compositions of 13 tissue substitutes. After training and model selection steps, the machine learning predictors were used to generate parametric maps from independent phantom and patient input data. Precision and accuracy were evaluated using the ideal maps. This process was repeated for a range of exposure doses, and performance was compared to that of the clinically-used dual-energy, physics-based method which served as the reference. The machine learning methods generated more accurate and precise parametric maps than those obtained using the reference method. Their performance advantage was particularly evident when using data from the lowest exposure, one-fifth of a typical clinical abdomen CT acquisition. The RF method achieved the greatest accuracy. In comparison, the ANN method was only 1% less accurate but had much better computational efficiency than RF, being able to produce parametric maps in 15 seconds. Machine learning methods outperformed the reference method in terms of accuracy and noise tolerance when generating parametric maps, encouraging further exploration of the techniques. Among the methods we evaluated, ANN is the most suitable for clinical use due to its combination of accuracy, excellent low-noise performance, and computational efficiency.
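
    The learning problem here is per-voxel regression from two CT channels to a parametric value. The sketch below illustrates that setup with a random forest and synthetic calibration data standing in for the 13 tissue substitutes; the "true" relationship is idealized for brevity.

```python
# Sketch: regress a parametric value (e.g. relative electron density)
# from paired low/high-kVp CT numbers, using synthetic calibration data.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
hu_low = rng.uniform(-1000, 1500, 5000)            # low-kVp CT number
hu_high = hu_low * 0.9 + rng.normal(0, 20, 5000)   # correlated high-kVp channel
rho_e = 1.0 + hu_high / 2000.0                     # idealized ground truth
X = np.column_stack([hu_low, hu_high])

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, rho_e)
voxel = np.array([[120.0, 95.0]])                  # one (low, high) kVp pair
print("predicted relative electron density:", model.predict(voxel)[0])
```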

  3. Data preparation for municipal virtual assistant using machine learning

    OpenAIRE

    Jovan, Leon Noe

    2016-01-01

    The main goal of this master’s thesis was to develop a procedure that will automate the construction of the knowledge base for a virtual assistant that answers questions about municipalities in Slovenia. The aim of the procedure is to replace or facilitate manual preparation of the virtual assistant's knowledge base. Theoretical backgrounds of different machine learning fields, such as multilabel classification, text mining and learning from weakly labeled data were examined to gain a better ...

  4. Machine learning & artificial intelligence in the quantum domain

    OpenAIRE

    Dunjko, Vedran; Briegel, Hans J.

    2017-01-01

    Quantum information technologies, and intelligent learning systems, are both emergent technologies that will likely have a transforming impact on our society. The respective underlying fields of research -- quantum information (QI) versus machine learning (ML) and artificial intelligence (AI) -- have their own specific challenges, which have hitherto been investigated largely independently. However, in a growing body of recent work, researchers have been probing the question to what extent th...

  5. Finding protein sites using machine learning methods

    Directory of Open Access Journals (Sweden)

    Jaime Leonardo Bobadilla Molina

    2003-07-01

    Full Text Available The increasing number of protein three-dimensional (3D) structures determined by X-ray and NMR technologies, as well as structures predicted by computational methods, results in the need for automated methods to provide initial annotations. We have developed a new method for recognizing sites in three-dimensional protein structures. Our method is based on a previously reported algorithm for creating descriptions of protein microenvironments using physical and chemical properties at multiple levels of detail. The recognition method takes three inputs: 1. A set of control sites that share some structural or functional role. 2. A set of control nonsites that lack this role. 3. A single query site. A support vector machine classifier is built using feature vectors in which each component represents a property in a given volume. Validation against an independent test set shows that this recognition approach has high sensitivity and specificity. We also describe the results of scanning four calcium-binding proteins (with the calcium removed) using a three-dimensional grid of probe points at 1.25 angstrom spacing. The system finds the sites in the proteins, giving high scores to points at or near the binding sites. Our results show that property-based descriptions along with support vector machines can be used to recognize protein sites in unannotated structures.

  6. Facial Expression Recognition Through Machine Learning

    Directory of Open Access Journals (Sweden)

    Nazia Perveen

    2015-08-01

    Full Text Available Facial expressions communicate non-verbal cues which play an important role in interpersonal relations. Automatic recognition of facial expressions can be an important component of natural human-machine interfaces; it may also be used in behavioral science and in clinical practice. Although humans recognize facial expressions virtually immediately, reliable expression recognition by machine is still a challenge. From the point of view of automatic recognition, a facial expression can be considered to consist of deformations of facial components and their spatial relations, or of changes in the pigmentation of the face. Research into automatic recognition of facial expressions addresses the representation and classification of the static or dynamic characteristics of these deformations or of the face pigmentation. We obtained our results using CVIPtools. We used a data set of six facial expressions from three persons, with 90 border-mask samples for training and 30 border-mask samples for testing; RST-invariant features and texture features were extracted for feature analysis, and the samples were then classified using the k-nearest neighbor algorithm. The maximum accuracy achieved is 90%.

  7. Comparison of Machine Learning Methods for the Arterial Hypertension Diagnostics

    Directory of Open Access Journals (Sweden)

    Vladimir S. Kublanov

    2017-01-01

    Full Text Available The paper presents an accuracy analysis of machine learning approaches applied to cardiac activity data. The study evaluates the possibility of diagnosing arterial hypertension by means of short-term heart rate variability signals. Two groups were studied: 30 relatively healthy volunteers and 40 patients suffering from degree II-III arterial hypertension. The following machine learning approaches were studied: linear and quadratic discriminant analysis, k-nearest neighbors, support vector machines with a radial basis kernel, decision trees, and the naive Bayes classifier. Moreover, different methods of feature extraction were analyzed: statistical, spectral, wavelet, and multifractal. All in all, 53 features were investigated. The results show that discriminant analysis achieves the highest classification accuracy, and that the suggested approach of searching for an uncorrelated feature set achieved better results than a feature set based on principal components.

  8. A comparative study of machine learning models for ethnicity classification

    Science.gov (United States)

    Trivedi, Advait; Bessie Amali, D. Geraldine

    2017-11-01

    This paper adopts a machine learning approach to the problem of ethnicity recognition. Ethnicity identification is an important vision problem, with use cases extending to various domains. Despite the complexity involved, ethnicity identification comes naturally to humans, and this meta-information can be leveraged for several decisions, be it in target marketing or security. With the recent development of intelligent systems, a sub-module that efficiently captures ethnicity would be useful in several use cases. Several attempts to identify an ideal learning model to represent a multi-ethnic dataset have been recorded. A comparative study of classifiers such as support vector machines and logistic regression is documented. Experimental results indicate that the logistic classifier provides more accurate classification than the support vector machine.

  9. A Comparison of Machine Learning Approaches for Corn Yield Estimation

    Science.gov (United States)

    Kim, N.; Lee, Y. W.

    2017-12-01

    Machine learning is an efficient empirical method for classification and prediction, and it offers another approach to crop yield estimation. The objective of this study is to estimate corn yield in the Midwestern United States by employing machine learning approaches such as the support vector machine (SVM), random forest (RF), and deep neural networks (DNN), and to perform a comprehensive comparison of their results. We constructed the database using satellite images from MODIS, climate data from the PRISM climate group, and GLDAS soil moisture data. In addition, to examine the seasonal sensitivities of corn yields, two period groups were set up: May to September (MJJAS) and July and August (JA). Overall, the DNN showed the highest accuracy in terms of the correlation coefficient for the two period groups. The differences between our predictions and USDA yield statistics were about 10-11%.

  10. Analysis and design of machine learning techniques evolutionary solutions for regression, prediction, and control problems

    CERN Document Server

    Stalph, Patrick

    2014-01-01

    Manipulating or grasping objects seems like a trivial task for humans, as these are motor skills of everyday life. Nevertheless, motor skills are not easy for humans to learn, and this is also an active research topic in robotics. However, most solutions are optimized for industrial applications, and thus few are plausible explanations for human learning. The fundamental challenge that motivates Patrick Stalph originates from cognitive science: how do humans learn their motor skills? The author makes a connection between robotics and the cognitive sciences by analyzing motor skill learning using implementations that could be found in the human brain – at least to some extent. Three suitable machine learning algorithms are selected – algorithms that are plausible from a cognitive viewpoint and feasible for the roboticist. The power and scalability of these algorithms is evaluated in theoretical simulations and in more realistic scenarios with the iCub humanoid robot. Convincing results confirm the…

  11. Runtime Optimizations for Tree-Based Machine Learning Models

    NARCIS (Netherlands)

    N. Asadi; J.J.P. Lin (Jimmy); A.P. de Vries (Arjen)

    2014-01-01

    Tree-based models have proven to be an effective solution for web ranking as well as other machine learning problems in diverse domains. This paper focuses on optimizing the runtime performance of applying such models to make predictions, specifically using gradient-boosted regression trees.

  12. Sensor Data Air Pollution Prediction by Machine Learning Methods

    Czech Academy of Sciences Publication Activity Database

    Vidnerová, Petra; Neruda, Roman

    submitted 25. 1. (2018) ISSN 1530-437X R&D Projects: GA ČR GA15-18108S Grant - others:GA MŠk(CZ) LM2015042 Institutional support: RVO:67985807 Keywords : machine learning * sensors * air pollution * deep neural networks * regularization networks Subject RIV: IN - Informatics, Computer Science Impact factor: 2.512, year: 2016

  13. Classification of carcinogenic and mutagenic properties using machine learning method

    DEFF Research Database (Denmark)

    Moorthy, N. S.Hari Narayana; Kumar, Surendra; Poongavanam, Vasanthanathan

    2017-01-01

    Accurately assessing the carcinogenicity of chemicals has become a serious challenge for health assessment authorities around the globe, owing not only to the increased cost of experiments but also to the various ethical issues involved in using animal models. In this study, we provide machine learning…

  14. Machine learning for network-based malware detection

    DEFF Research Database (Denmark)

    Stevanovic, Matija

    and based on different, mutually complementary, principles of traffic analysis. The proposed approaches rely on machine learning algorithms (MLAs) for automated and resource-efficient identification of the patterns of malicious network traffic. We assessed the proposed methods through extensive evaluations…

  15. A machine learning approach to understand business processes

    NARCIS (Netherlands)

    Maruster, L.

    2003-01-01

    Business processes (industries, administration, hospitals, etc.) become nowadays more and more complex and it is difficult to have a complete understanding of them. The goal of the thesis is to show that machine learning techniques can be used successfully for understanding a process on the basis of

  16. Machine Learning and Data Mining Methods in Diabetes Research.

    Science.gov (United States)

    Kavakiotis, Ioannis; Tsave, Olga; Salifoglou, Athanasios; Maglaveras, Nicos; Vlahavas, Ioannis; Chouvarda, Ioanna

    2017-01-01

    The remarkable advances in biotechnology and health sciences have led to a significant production of data, such as high-throughput genetic data and clinical information, generated from large Electronic Health Records (EHRs). To this end, the application of machine learning and data mining methods in biosciences is presently, more than ever before, vital and indispensable in efforts to transform intelligently all available information into valuable knowledge. Diabetes mellitus (DM) is defined as a group of metabolic disorders exerting significant pressure on human health worldwide. Extensive research in all aspects of diabetes (diagnosis, etiopathophysiology, therapy, etc.) has led to the generation of huge amounts of data. The aim of the present study is to conduct a systematic review of the applications of machine learning, data mining techniques and tools in the field of diabetes research with respect to a) Prediction and Diagnosis, b) Diabetic Complications, c) Genetic Background and Environment, and d) Health Care and Management, with the first category appearing to be the most popular. A wide range of machine learning algorithms were employed. In general, 85% of those used were characterized by supervised learning approaches and 15% by unsupervised ones, more specifically association rules. Support vector machines (SVM) arise as the most successful and widely used algorithm. Concerning the type of data, clinical datasets were mainly used. The applications in the selected articles demonstrate the usefulness of extracting valuable knowledge, leading to new hypotheses that target a deeper understanding and further investigation of DM.

  17. Assessing Implicit Knowledge in BIM Models with Machine Learning

    DEFF Research Database (Denmark)

    Krijnen, Thomas; Tamke, Martin

    2015-01-01

    architects and engineers are able to deduce non-explicitly stated information, which is often the core of the transported architectural information. This paper investigates how machine learning approaches allow a computational system to deduce implicit knowledge from a set of BIM models…

  18. Informatics and machine learning to define the phenotype.

    Science.gov (United States)

    Basile, Anna Okula; Ritchie, Marylyn DeRiggi

    2018-03-01

    For the past decade, the focus of complex disease research has been the genotype. From technological advancements to the development of analysis methods, great progress has been made. However, advances in our definition of the phenotype have remained stagnant. Phenotype characterization has recently emerged as an exciting area of informatics and machine learning. The copious amounts of diverse biomedical data that have been collected may be leveraged with data-driven approaches to elucidate trait-related features and patterns. Areas covered: In this review, the authors discuss the phenotype in traditional genetic associations and the challenges this has imposed. Approaches for phenotype refinement that can aid in more accurate characterization of traits are also discussed. Further, the authors highlight promising machine learning approaches for establishing a phenotype and the challenges of electronic health record (EHR)-derived data. Expert commentary: The authors hypothesize that through unsupervised machine learning, data-driven approaches can be used to define phenotypes rather than relying on expert clinician knowledge. Through the use of machine learning and an unbiased set of features extracted from clinical repositories, researchers will have the potential to further understand complex traits and identify patient subgroups. This knowledge may lead to more preventative and precise clinical care.

  19. RSO Characterization with Photometric Data Using Machine Learning

    Science.gov (United States)

    2015-10-18

    ...and its behavior. This paper explores object characterization methods using photometric data. An important property of RSO photometric signatures is... photometric signature include geometry, orientation, material characteristics and stability. For this reason, it should be possible to recover these

  20. Machine learning versus knowledge based classification of legal texts

    NARCIS (Netherlands)

    de Maat, E.; Krabben, K.; Winkels, R.; Winkels, R.G.F.

    2010-01-01

    This paper presents results of an experiment in which we used machine learning (ML) techniques to classify sentences in Dutch legislation. These results are compared to the results of a pattern-based classifier. Overall, the ML classifier performs as accurately (>90%) as the pattern-based one, but

  1. Archetypal analysis for machine learning and data mining

    DEFF Research Database (Denmark)

    Mørup, Morten; Hansen, Lars Kai

    2012-01-01

    of the observed data. We further demonstrate that the AA model is relevant for feature extraction and dimensionality reduction for a large variety of machine learning problems taken from computer vision, neuroimaging, chemistry, text mining and collaborative filtering, leading to highly interpretable...

  2. Machine learning modelling for predicting soil liquefaction susceptibility

    Directory of Open Access Journals (Sweden)

    P. Samui

    2011-01-01

    This study describes two machine learning techniques applied to predict the liquefaction susceptibility of soil based on standard penetration test (SPT) data from the 1999 Chi-Chi, Taiwan earthquake. The first technique is an Artificial Neural Network (ANN) based on multi-layer perceptrons (MLP) trained with the Levenberg-Marquardt backpropagation algorithm. The second is the Support Vector Machine (SVM), a classification technique firmly grounded in statistical learning theory. ANN and SVM models have been developed to predict liquefaction susceptibility using the corrected SPT blow count [(N1)60] and the cyclic stress ratio (CSR). Further, an attempt has been made to simplify the models to require only two parameters, (N1)60 and the peak ground acceleration (amax/g), for the prediction of liquefaction susceptibility. The developed ANN and SVM models have also been applied to different case histories available globally. The paper also highlights the capability of the SVM over the ANN models.
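
    By way of illustration only, a two-feature classifier in the spirit of the simplified models above. scikit-learn offers no Levenberg-Marquardt solver, so "lbfgs" stands in, and the values are invented placeholders, not the Chi-Chi records.

      import numpy as np
      from sklearn.neural_network import MLPClassifier
      from sklearn.pipeline import make_pipeline
      from sklearn.preprocessing import StandardScaler

      # Columns: corrected SPT blow count (N1)60 and peak ground
      # acceleration ratio amax/g; labels: 1 = liquefied, 0 = not.
      X = np.array([[12, 0.18], [25, 0.42], [8, 0.35], [30, 0.15]], dtype=float)
      y = np.array([0, 1, 1, 0])

      clf = make_pipeline(
          StandardScaler(),
          MLPClassifier(hidden_layer_sizes=(5,), solver="lbfgs",
                        max_iter=2000, random_state=0))
      clf.fit(X, y)
      print(clf.predict([[15.0, 0.25]]))   # susceptibility for a new site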

  3. Thutmose - Investigation of Machine Learning-Based Intrusion Detection Systems

    Science.gov (United States)

    2016-06-01

    monitoring. This analyzed payload is within the application layer of the OSI model. The analysis tries to establish whether or not the payload is... [remaining text is table-of-contents residue: 3.2.5 Model Drift Experiments; Adversarial Environments (SPIE DSS 2014); Appendix C - Evaluating Model Drift in Machine Learning]

  4. Simulation-driven machine learning: Bearing fault classification

    Science.gov (United States)

    Sobie, Cameron; Freitas, Carina; Nicolai, Mike

    2018-01-01

    Increasing the accuracy of mechanical fault detection has the potential to improve system safety and economic performance by minimizing scheduled maintenance and the probability of unexpected system failure. Advances in computational performance have enabled the application of machine learning algorithms across numerous applications including condition monitoring and failure detection. Past applications of machine learning to physical failure have relied explicitly on historical data, which limits the feasibility of this approach to in-service components with extended service histories. Furthermore, recorded failure data is often only valid for the specific circumstances and components for which it was collected. This work directly addresses these challenges for roller bearings with race faults by generating training data using information gained from high resolution simulations of roller bearing dynamics, which is used to train machine learning algorithms that are then validated against four experimental datasets. Several different machine learning methodologies are compared starting from well-established statistical feature-based methods to convolutional neural networks, and a novel application of dynamic time warping (DTW) to bearing fault classification is proposed as a robust, parameter free method for race fault detection.
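
    The dynamic time warping idea mentioned above can be pictured with the classic dynamic-programming recurrence. The sketch below is a generic 1-nearest-neighbour-under-DTW classifier, not the authors' implementation; it is parameter-free in the sense the abstract describes, since no features are engineered and no model is trained.

      import numpy as np

      def dtw_distance(a, b):
          """Classic O(n*m) dynamic time warping distance between 1-D signals."""
          n, m = len(a), len(b)
          D = np.full((n + 1, m + 1), np.inf)
          D[0, 0] = 0.0
          for i in range(1, n + 1):
              for j in range(1, m + 1):
                  cost = abs(a[i - 1] - b[j - 1])
                  D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
          return D[n, m]

      def classify(signal, templates):
          """1-NN under DTW; templates maps fault label -> reference waveform."""
          return min(templates, key=lambda lbl: dtw_distance(signal, templates[lbl]))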

  5. Novel Breast Imaging and Machine Learning: Predicting Breast Lesion Malignancy at Cone-Beam CT Using Machine Learning Techniques.

    Science.gov (United States)

    Uhlig, Johannes; Uhlig, Annemarie; Kunze, Meike; Beissbarth, Tim; Fischer, Uwe; Lotz, Joachim; Wienbeck, Susanne

    2018-05-24

    The purpose of this study is to evaluate the diagnostic performance of machine learning techniques for malignancy prediction at breast cone-beam CT (CBCT) and to compare them to human readers. Five machine learning techniques, including random forests, back propagation neural networks (BPN), extreme learning machines, support vector machines, and K-nearest neighbors, were used to train diagnostic models on a clinical breast CBCT dataset with internal validation by repeated 10-fold cross-validation. Two independent blinded human readers with profound experience in breast imaging and breast CBCT analyzed the same CBCT dataset. Diagnostic performance was compared using AUC, sensitivity, and specificity. The clinical dataset comprised 35 patients (American College of Radiology density type C and D breasts) with 81 suspicious breast lesions examined with contrast-enhanced breast CBCT. Forty-five lesions were histopathologically proven to be malignant. Among the machine learning techniques, BPNs provided the best diagnostic performance, with AUC of 0.91, sensitivity of 0.85, and specificity of 0.82. The diagnostic performance of the human readers was AUC of 0.84, sensitivity of 0.89, and specificity of 0.72 for reader 1 and AUC of 0.72, sensitivity of 0.71, and specificity of 0.67 for reader 2. AUC was significantly higher for BPN when compared with both reader 1 (p = 0.01) and reader 2 (p ...). Machine learning techniques provide a high and robust diagnostic performance in the prediction of malignancy in breast lesions identified at CBCT. BPNs showed the best diagnostic performance, surpassing human readers in terms of AUC and specificity.
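
    The internal-validation scheme reported above (repeated 10-fold cross-validation scored by AUC) might look like the following in scikit-learn; the feature matrix is a synthetic stand-in for the 81-lesion dataset.

      from sklearn.datasets import make_classification
      from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
      from sklearn.neural_network import MLPClassifier

      # Synthetic stand-in for the CBCT lesion features (81 lesions).
      X, y = make_classification(n_samples=81, n_features=10, random_state=0)

      cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=5, random_state=0)
      auc = cross_val_score(MLPClassifier(max_iter=2000, random_state=0),
                            X, y, cv=cv, scoring="roc_auc")
      print(f"AUC: {auc.mean():.2f} +/- {auc.std():.2f}")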

  6. Man-Machine Communication in Remote Manipulation: Task-Oriented Supervisory Command Language (TOSC).

    Science.gov (United States)

    1980-03-01

    ...task description and as a machine advisor for an apprentice. Thus, the PN are developed hierarchically, by the machine from the top down to the...

  7. Applications of Task-Based Learning in TESOL

    Science.gov (United States)

    Shehadeh, Ali, Ed.; Coombe, Christine, Ed.

    2010-01-01

    Why are many teachers around the world moving toward task-based learning (TBL)? This shift is based on the strong belief that TBL facilitates second language acquisition and makes second language learning and teaching more principled and effective. Based on insights gained from using tasks as research tools, this volume shows how teachers can use…

  8. Asymmetrical learning between a tactile and visual serial RT task

    NARCIS (Netherlands)

    Abrahamse, E.L.; van der Lubbe, Robert Henricus Johannes; Verwey, Willem B.

    2007-01-01

    According to many researchers, implicit learning in the serial reaction-time task is predominantly motor based and therefore should be independent of stimulus modality. Previous research on the task, however, has focused almost completely on the visual domain. Here we investigated sequence learning

  9. Making Individual Prognoses in Psychiatry Using Neuroimaging and Machine Learning.

    Science.gov (United States)

    Janssen, Ronald J; Mourão-Miranda, Janaina; Schnack, Hugo G

    2018-04-22

    Psychiatric prognosis is a difficult problem. Making a prognosis requires looking far into the future, as opposed to making a diagnosis, which is concerned with the current state. During the follow-up period, many factors will influence the course of the disease. Combined with the usually scarcer longitudinal data and the variability in the definition of outcomes/transition, this makes prognostic predictions a challenging endeavor. Employing neuroimaging data in this endeavor introduces the additional hurdle of high dimensionality. Machine-learning techniques are especially suited to tackle this challenging problem. This review starts with a brief introduction to machine learning in the context of its application to clinical neuroimaging data. We highlight a few issues that are especially relevant for prediction of outcome and transition using neuroimaging. We then review the literature that discusses the application of machine learning for this purpose. Critical examination of the studies and their results with respect to the relevant issues revealed the following: 1) there is growing evidence for the prognostic capability of machine-learning-based models using neuroimaging; and 2) reported accuracies may be too optimistic owing to small sample sizes and the lack of independent test samples. Finally, we discuss options to improve the reliability of (prognostic) prediction models. These include new methodologies and multimodal modeling. Paramount, however, is our conclusion that future work will need to provide properly (cross-)validated accuracy estimates of models trained on sufficiently large datasets. Nevertheless, with the technological advances enabling acquisition of large databases of patients and healthy subjects, machine learning represents a powerful tool in the search for psychiatric biomarkers. Copyright © 2018 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.
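
    The "properly (cross-)validated accuracy estimates" the authors call for are usually obtained with nested cross-validation, sketched here on synthetic stand-in data (subjects x imaging features); the estimator and grid are illustrative choices, not the reviewed studies' models.

      from sklearn.datasets import make_classification
      from sklearn.model_selection import GridSearchCV, cross_val_score
      from sklearn.svm import SVC

      # Stand-in for a neuroimaging feature matrix (subjects x voxels/ROIs).
      X, y = make_classification(n_samples=120, n_features=200, random_state=0)

      # Nested cross-validation: the inner loop tunes C, the outer loop
      # yields an accuracy estimate untouched by that tuning.
      inner = GridSearchCV(SVC(kernel="linear"),
                           param_grid={"C": [0.01, 0.1, 1]}, cv=5)
      scores = cross_val_score(inner, X, y, cv=5)
      print(f"nested-CV accuracy: {scores.mean():.2f} +/- {scores.std():.2f}")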

  10. Obtaining Global Picture From Single Point Observations by Combining Data Assimilation and Machine Learning Tools

    Science.gov (United States)

    Shprits, Y.; Zhelavskaya, I. S.; Kellerman, A. C.; Spasojevic, M.; Kondrashov, D. A.; Ghil, M.; Aseev, N.; Castillo Tibocha, A. M.; Cervantes Villa, J. S.; Kletzing, C.; Kurth, W. S.

    2017-12-01

    The increasing volume of satellite measurements requires the deployment of new tools that can utilize such vast amounts of data. Satellite measurements are usually limited to a single location in space, which complicates data analysis geared towards reproducing the global state of the space environment. In this study we show how measurements can be combined by means of data assimilation and how machine learning can help analyze large amounts of data and develop global models that are trained on single-point measurements. Data Assimilation: Manual analysis of the satellite measurements is a challenging task, while automated analysis is complicated by the fact that measurements are given at various locations in space, have different instrumental errors, and often vary by orders of magnitude. We show results of the long-term reanalysis of radiation belt measurements along with fully operational real-time predictions using the data-assimilative VERB code. Machine Learning: We present an application of machine learning tools to the analysis of NASA Van Allen Probes upper-hybrid frequency measurements. Using the obtained data set we train a new global predictive neural network. The results for the Van Allen Probes-based neural network are compared with historical IMAGE satellite observations. We also show examples of predictions of geomagnetic indices using neural networks. Combination of machine learning and data assimilation: We discuss how data assimilation tools and machine learning tools can be combined so that physics-based insight into the dynamics of a particular system can be joined with empirical knowledge of its non-linear behavior.
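
    A minimal sketch of the kind of neural-network surrogate described here, under the assumption (hypothetical, for illustration) that a geomagnetic index is regressed on single-point upstream solar-wind inputs; all data below are synthetic.

      import numpy as np
      from sklearn.neural_network import MLPRegressor

      # Synthetic single-point inputs: [solar wind speed, density, Bz].
      rng = np.random.default_rng(1)
      X = rng.normal(size=(1000, 3))
      y = 2.0 * X[:, 0] - X[:, 2] + rng.normal(scale=0.1, size=1000)

      net = MLPRegressor(hidden_layer_sizes=(16, 16), max_iter=1000,
                         random_state=0).fit(X[:800], y[:800])
      print(net.score(X[800:], y[800:]))   # R^2 on held-out data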

  11. Task type and incidental L2 vocabulary learning: Repetition versus ...

    African Journals Online (AJOL)

    This study investigated the effect of task type on incidental L2 vocabulary learning. The different tasks investigated in this study differed in terms of repetition of encounters and task involvement load. In a within-subjects design, 72 Iranian learners of English practised 18 target words in three exercise conditions: three ...

  12. Machine learning models in breast cancer survival prediction.

    Science.gov (United States)

    Montazeri, Mitra; Montazeri, Mohadeseh; Montazeri, Mahdieh; Beigzadeh, Amin

    2016-01-01

    Breast cancer is one of the most common cancers, with a high mortality rate among women. With early diagnosis, breast cancer survival increases from 56% to more than 86%. Therefore, an accurate and reliable system is necessary for the early diagnosis of this cancer. The proposed model combines rules with different machine learning techniques. Machine learning models can help physicians reduce the number of false decisions: they exploit patterns and relationships among a large number of cases and predict the outcome of a disease using historical cases stored in datasets. The objective of this study is to propose a rule-based classification method with machine learning techniques for the prediction of different types of breast cancer survival. We use a dataset with eight attributes covering the records of 900 patients, of whom 876 (97.3%) were female and 24 (2.7%) male. Naive Bayes (NB), Trees Random Forest (TRF), 1-Nearest Neighbor (1NN), AdaBoost (AD), Support Vector Machine (SVM), RBF Network (RBFN), and Multilayer Perceptron (MLP) machine learning techniques with 10-fold cross-validation were used with the proposed model for the prediction of breast cancer survival. The performance of the machine learning techniques was evaluated with accuracy, precision, sensitivity, specificity, and area under the ROC curve. Of the 900 patients, 803 were alive and 97 dead. In this study, the Trees Random Forest (TRF) technique showed better results than the other techniques (NB, 1NN, AD, SVM, RBFN, MLP). The accuracy, sensitivity, and area under the ROC curve of TRF are 96%, 96%, and 93%, respectively. The 1NN technique, however, performed poorly (accuracy 91%, sensitivity 91%, area under the ROC curve 78%). This study demonstrates that the Trees Random Forest (TRF) model, a rule-based classification model, was the best model with the highest level of
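
    The evaluation protocol described above (10-fold cross-validation scored by accuracy, sensitivity, specificity and AUC) can be sketched as follows; the class imbalance is mimicked with synthetic data, not the 900-patient records.

      from sklearn.datasets import make_classification
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.metrics import confusion_matrix, roc_auc_score
      from sklearn.model_selection import cross_val_predict

      # Synthetic stand-in with a strong class imbalance, as in the study.
      X, y = make_classification(n_samples=900, weights=[0.89], random_state=0)

      rf = RandomForestClassifier(random_state=0)
      proba = cross_val_predict(rf, X, y, cv=10, method="predict_proba")[:, 1]
      y_pred = (proba >= 0.5).astype(int)

      tn, fp, fn, tp = confusion_matrix(y, y_pred).ravel()
      print("sensitivity:", tp / (tp + fn))
      print("specificity:", tn / (tn + fp))
      print("AUC:", roc_auc_score(y, proba))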

  13. Deep imitation learning for 3D navigation tasks.

    Science.gov (United States)

    Hussein, Ahmed; Elyan, Eyad; Gaber, Mohamed Medhat; Jayne, Chrisina

    2018-01-01

    Deep learning techniques have shown success in learning from raw high-dimensional data in various applications. While deep reinforcement learning has recently gained popularity as a method to train intelligent agents, utilizing deep learning in imitation learning has been scarcely explored. Imitation learning can be an efficient way to teach intelligent agents by providing a set of demonstrations to learn from. However, generalizing to situations that are not represented in the demonstrations can be challenging, especially in 3D environments. In this paper, we propose a deep imitation learning method to learn navigation tasks from demonstrations in a 3D environment. The supervised policy is refined using active learning in order to generalize to unseen situations. This approach is compared to two popular deep reinforcement learning techniques: deep Q-networks (DQN) and Asynchronous Advantage Actor-Critic (A3C). The proposed method as well as the reinforcement learning methods employ deep convolutional neural networks and learn directly from raw visual input. Methods for combining learning from demonstrations and experience are also investigated. This combination aims to join the generalization ability of learning by experience with the efficiency of learning by imitation. The proposed methods are evaluated on four navigation tasks in a 3D simulated environment. Navigation tasks are a typical problem relevant to many real applications. They pose the challenge of requiring demonstrations of long trajectories to reach the target while providing only delayed (usually terminal) rewards to the agent. The experiments show that the proposed method can successfully learn navigation tasks from raw visual input, while the learning-from-experience methods fail to learn an effective policy. Moreover, it is shown that active learning can significantly improve the performance of the initially learned policy using a small number of active samples.
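
    The supervised (imitation) stage can be pictured as plain behavioural cloning: a CNN maps raw frames to demonstrated actions and is trained with cross-entropy. The sketch below is a generic PyTorch rendition under assumed 84x84 RGB inputs and four discrete actions, not the authors' network.

      import torch
      import torch.nn as nn

      class Policy(nn.Module):
          """Small CNN policy over raw frames -> logits over discrete actions."""
          def __init__(self, n_actions):
              super().__init__()
              self.net = nn.Sequential(
                  nn.Conv2d(3, 16, kernel_size=8, stride=4), nn.ReLU(),
                  nn.Conv2d(16, 32, kernel_size=4, stride=2), nn.ReLU(),
                  nn.Flatten(),
                  nn.LazyLinear(256), nn.ReLU(),
                  nn.Linear(256, n_actions),
              )

          def forward(self, frames):        # frames: (B, 3, 84, 84)
              return self.net(frames)

      policy = Policy(n_actions=4)
      opt = torch.optim.Adam(policy.parameters(), lr=1e-4)
      loss_fn = nn.CrossEntropyLoss()

      frames = torch.rand(8, 3, 84, 84)     # demonstration observations
      actions = torch.randint(0, 4, (8,))   # demonstrated actions
      loss = loss_fn(policy(frames), actions)
      opt.zero_grad(); loss.backward(); opt.step()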

  14. Machine Learning Techniques for Stellar Light Curve Classification

    Science.gov (United States)

    Hinners, Trisha A.; Tat, Kevin; Thorp, Rachel

    2018-07-01

    We apply machine learning techniques in an attempt to predict and classify stellar properties from noisy and sparse time-series data. We preprocessed over 94 GB of Kepler light curves from the Mikulski Archive for Space Telescopes (MAST) to classify according to 10 distinct physical properties using both representation learning and feature engineering approaches. Studies using machine learning in the field have been primarily done on simulated data, making our study one of the first to use real light-curve data for machine learning approaches. We tuned our data using previous work with simulated data as a template and achieved mixed results between the two approaches. Representation learning using a long short-term memory recurrent neural network produced no successful predictions, but our work with feature engineering was successful for both classification and regression. In particular, we were able to achieve values for stellar density, stellar radius, and effective temperature with low error (∼2%–4%) and good accuracy (∼75%) for classifying the number of transits for a given star. The results show promise for improvement for both approaches upon using larger data sets with a larger minority class. This work has the potential to provide a foundation for future tools and techniques to aid in the analysis of astrophysical data.
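
    As an illustration of the feature-engineering route that proved successful here, a handful of summary statistics per light curve; the paper's actual feature set is richer, and this function is only a sketch.

      import numpy as np

      def light_curve_features(flux):
          """Simple summary statistics of a (detrended) light curve."""
          flux = np.asarray(flux, dtype=float)
          mu, sigma = flux.mean(), flux.std()
          return {
              "mean": mu,
              "std": sigma,
              "skew": ((flux - mu) ** 3).mean() / sigma ** 3,
              "amplitude": flux.max() - flux.min(),
              "median_abs_dev": np.median(np.abs(flux - np.median(flux))),
          }
      # The resulting feature vectors feed an ordinary classifier or
      # regressor (e.g., a random forest) for the 10 target properties.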

  15. Applications of Support Vector Machine (SVM) Learning in Cancer Genomics.

    Science.gov (United States)

    Huang, Shujun; Cai, Nianguang; Pacheco, Pedro Penzuti; Narrandes, Shavira; Wang, Yang; Xu, Wayne

    2018-01-01

    Machine learning with maximization (support) of separating margin (vector), called support vector machine (SVM) learning, is a powerful classification tool that has been used for cancer genomic classification or subtyping. Today, as advancements in high-throughput technologies lead to production of large amounts of genomic and epigenomic data, the classification feature of SVMs is expanding its use in cancer genomics, leading to the discovery of new biomarkers, new drug targets, and a better understanding of cancer driver genes. Herein we reviewed the recent progress of SVMs in cancer genomic studies. We intend to comprehend the strength of the SVM learning and its future perspective in cancer genomic applications. Copyright© 2018, International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved.
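
    The margin maximization this record alludes to is, in its standard soft-margin form (shown in LaTeX; a textbook formulation, not quoted from the review):

      \min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_{i}
      \quad \text{subject to} \quad y_{i}\,(w^{\top} x_{i} + b) \ge 1 - \xi_{i},
      \qquad \xi_{i} \ge 0,

    where C trades margin width against training errors, and a kernel replaces the inner product to handle non-linear genomic feature spaces.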

  16. On the Use of Machine Learning for Identifying Botnet Network Traffic

    DEFF Research Database (Denmark)

    Stevanovic, Matija; Pedersen, Jens Myrup

    2016-01-01

    contemporary approaches use machine learning techniques for identifying malicious traffic. This paper presents a survey of contemporary botnet detection methods that rely on machine learning for identifying botnet network traffic. The paper provides a comprehensive overview of the existing scientific work thus...... contributing to the better understanding of capabilities, limitations and opportunities of using machine learning for identifying botnet traffic. Furthermore, the paper outlines possibilities for the future development of machine learning-based botnet detection systems....

  17. Which Management Control System principles and aspects are relevant when deploying a learning machine?

    OpenAIRE

    Martin, Johansson; Mikael, Göthager

    2017-01-01

    How shall a business adapt its management control systems when learning machines enter the arena? Will the control system continue to focus on human aspects and continue to consider a learning machine to be an automation tool like any other historically programmed computer? Learning machines introduce productivity capabilities that achieve very high levels of efficiency and quality. A learning machine can sort through large amounts of data and draw conclusions that would be difficult for a human mind. Howev...

  18. Attacking Machine Learning models as part of a cyber kill chain

    OpenAIRE

    Nguyen, Tam N.

    2017-01-01

    Machine learning is gaining popularity in the network security domain as many more network-enabled devices get connected, as malicious activities become stealthier, and as new technologies like Software Defined Networking emerge. Compromising a machine learning model is a desirable goal. In fact, spammers have been quite successful at getting through machine learning enabled spam filters for years. While previous work has been done on adversarial machine learning, none has been considered within...

  19. Unsupervised process monitoring and fault diagnosis with machine learning methods

    CERN Document Server

    Aldrich, Chris

    2013-01-01

    This unique text/reference describes in detail the latest advances in unsupervised process monitoring and fault diagnosis with machine learning methods. Abundant case studies throughout the text demonstrate the efficacy of each method in real-world settings. The broad coverage examines such cutting-edge topics as the use of information theory to enhance unsupervised learning in tree-based methods, the extension of kernel methods to multiple kernel learning for feature extraction from data, and the incremental training of multilayer perceptrons to construct deep architectures for enhanced data

  20. Machine Learning for the Knowledge Plane

    Science.gov (United States)

    2006-06-01

    ...classifiers for which action to select or regression functions over actions or states. However, it can also be cast as larger-scale structures... Research in the reinforcement learning framework falls into two main paradigms. One casts control policies in terms of functions that map state