WorldWideScience

Sample records for decision tree method

  1. Decision tree methods: applications for classification and prediction.

    Science.gov (United States)

    Song, Yan-Yan; Lu, Ying

    2015-04-25

    Decision tree methodology is a commonly used data mining method for establishing classification systems based on multiple covariates or for developing prediction algorithms for a target variable. This method classifies a population into branch-like segments that construct an inverted tree with a root node, internal nodes, and leaf nodes. The algorithm is non-parametric and can efficiently deal with large, complicated datasets without imposing a complicated parametric structure. When the sample size is large enough, study data can be divided into training and validation datasets: the training dataset is used to build a decision tree model, and the validation dataset is used to decide on the appropriate tree size needed to achieve the optimal final model. This paper introduces frequently used algorithms for developing decision trees (including CART, C4.5, CHAID, and QUEST) and describes the SPSS and SAS programs that can be used to visualize tree structure.
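
CART, one of the algorithms named in this abstract, scores candidate splits for classification by Gini impurity. The sketch below illustrates only that single step, on one numeric covariate. The function names and the toy data are hypothetical, and this is not the SPSS or SAS procedure the paper describes.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best split of the form x <= threshold."""
    pairs = list(zip(values, labels))
    distinct = sorted(set(values))
    best = (None, gini(labels))  # no split: impurity of the parent node
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in pairs if x <= t]
        right = [y for x, y in pairs if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical toy data: one covariate and a binary target.
ages = [22, 25, 31, 38, 45, 52, 58, 63]
outcome = ["no", "no", "no", "no", "yes", "yes", "yes", "yes"]
threshold, impurity = best_split(ages, outcome)
print(threshold, impurity)
```

A full tree would apply `best_split` recursively to each child node and then use the validation dataset to choose how far to grow or prune.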

  2. EEG feature selection method based on decision tree.

    Science.gov (United States)

    Duan, Lijuan; Ge, Hui; Ma, Wei; Miao, Jun

    2015-01-01

    This paper aims to solve the automated feature selection problem in brain-computer interfaces (BCI). To automate the feature selection process, we propose a novel EEG feature selection method based on a decision tree (DT). During electroencephalogram (EEG) signal processing, a feature extraction method based on principal component analysis (PCA) was used, and the selection process based on the decision tree was performed by searching the feature space and automatically selecting optimal features. Considering that EEG signals are a series of non-linear signals, a generalized linear classifier named support vector machine (SVM) was chosen. To test the validity of the proposed method, we applied the decision tree-based EEG feature selection method to dataset Ia of BCI Competition II, and the experiment showed encouraging results.

  3. Predicting metabolic syndrome using decision tree and support vector machine methods

    Directory of Open Access Journals (Sweden)

    Farzaneh Karimi-Alavijeh

    2016-06-01

    BACKGROUND: Metabolic syndrome, which underlies the increased prevalence of cardiovascular disease and Type 2 diabetes, is considered a group of metabolic abnormalities including central obesity, hypertriglyceridemia, glucose intolerance, hypertension, and dyslipidemia. Recently, artificial intelligence-based health-care systems have become highly regarded because of their success in diagnosis, prediction, and choice of treatment. This study employs machine learning techniques to predict metabolic syndrome. METHODS: This study aims to employ decision tree and support vector machine (SVM) methods to predict the 7-year incidence of metabolic syndrome. This research is a practical one in which data from 2107 participants of the Isfahan Cohort Study were utilized. The subjects without metabolic syndrome according to the ATPIII criteria were selected. The features used in this data set include: gender, age, weight, body mass index, waist circumference, waist-to-hip ratio, hip circumference, physical activity, smoking, hypertension, antihypertensive medication use, systolic blood pressure (BP), diastolic BP, fasting blood sugar, 2-hour blood glucose, triglycerides (TGs), total cholesterol, low-density lipoprotein, high-density lipoprotein cholesterol, mean corpuscular volume, and mean corpuscular hemoglobin. Metabolic syndrome was diagnosed based on ATPIII criteria, and two methods, decision tree and SVM, were selected to predict metabolic syndrome. The criteria of sensitivity, specificity and accuracy were used for validation. RESULTS: SVM and decision tree methods were examined according to the criteria of sensitivity, specificity and accuracy. Sensitivity, specificity and accuracy were 0.774 (0.758), 0.74 (0.72) and 0.757 (0.739) for the SVM (decision tree) method. CONCLUSION: The results show that the SVM method performs better than the decision tree in sensitivity, specificity and accuracy. The results of the decision tree method show that the TG is the most

  4. Predicting metabolic syndrome using decision tree and support vector machine methods.

    Science.gov (United States)

    Karimi-Alavijeh, Farzaneh; Jalili, Saeed; Sadeghi, Masoumeh

    2016-05-01

    Metabolic syndrome, which underlies the increased prevalence of cardiovascular disease and Type 2 diabetes, is considered a group of metabolic abnormalities including central obesity, hypertriglyceridemia, glucose intolerance, hypertension, and dyslipidemia. Recently, artificial intelligence-based health-care systems have become highly regarded because of their success in diagnosis, prediction, and choice of treatment. This study employs machine learning techniques to predict metabolic syndrome. This study aims to employ decision tree and support vector machine (SVM) methods to predict the 7-year incidence of metabolic syndrome. This research is a practical one in which data from 2107 participants of the Isfahan Cohort Study were utilized. The subjects without metabolic syndrome according to the ATPIII criteria were selected. The features used in this data set include: gender, age, weight, body mass index, waist circumference, waist-to-hip ratio, hip circumference, physical activity, smoking, hypertension, antihypertensive medication use, systolic blood pressure (BP), diastolic BP, fasting blood sugar, 2-hour blood glucose, triglycerides (TGs), total cholesterol, low-density lipoprotein, high-density lipoprotein cholesterol, mean corpuscular volume, and mean corpuscular hemoglobin. Metabolic syndrome was diagnosed based on ATPIII criteria, and two methods, decision tree and SVM, were selected to predict metabolic syndrome. The criteria of sensitivity, specificity and accuracy were used for validation. SVM and decision tree methods were examined according to the criteria of sensitivity, specificity and accuracy. Sensitivity, specificity and accuracy were 0.774 (0.758), 0.74 (0.72) and 0.757 (0.739) for the SVM (decision tree) method. The results show that the SVM method performs better than the decision tree in sensitivity, specificity and accuracy. The results of the decision tree method show that the TG is the most important feature in predicting metabolic syndrome. According
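
The three validation criteria named in this abstract follow directly from the counts in a binary confusion matrix. The sketch below shows the standard definitions. The counts are hypothetical and are not the Isfahan Cohort Study results.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: subjects who developed the condition, predicted positive/negative.
    tn/fp: subjects who did not, predicted negative/positive.
    """
    sensitivity = tp / (tp + fn)          # share of true cases that were detected
    specificity = tn / (tn + fp)          # share of non-cases correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts, for illustration only.
sens, spec, acc = classification_metrics(tp=80, fn=20, tn=150, fp=50)
print(sens, spec, round(acc, 3))
```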

  5. Method of decision tree applied in adopting the decision for promoting a company

    Directory of Open Access Journals (Sweden)

    Cezarina Adina TOFAN

    2015-09-01

    The decision can be defined as the course chosen from several possible alternatives to achieve an objective. Decision-making methods play an important role in the functioning of the decisional-informational system. Decision trees are proving to be very useful tools for making financial or numerical decisions where a large amount of complex information must be considered. They provide an effective structure in which alternative decisions and the implications of choosing them can be assessed, and they help to form a correct and balanced view of the risks and rewards that may result from a certain choice. For these reasons, this communication reviews a series of decision-making criteria. It also analyses the benefits of using the decision tree method in the decision-making process by providing a numerical example. On this basis, it can be concluded that the procedure may prove useful in making decisions for companies operating in markets where competition intensity is differentiated.
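
In decision analysis, a tree of this kind is usually evaluated by "rolling back" expected values: chance nodes take the probability-weighted mean of their branches, and decision nodes take the best branch. The sketch below assumes an expected-value criterion. The campaign names, probabilities, and payoffs (in thousands) are hypothetical and are not the paper's numerical example.

```python
def expected_value(node):
    """Roll back a decision tree and return its expected payoff."""
    kind = node["kind"]
    if kind == "payoff":
        return node["value"]
    if kind == "chance":
        # Branches are (probability, child) pairs.
        return sum(p * expected_value(child) for p, child in node["branches"])
    if kind == "decision":
        # Branches are (label, child) pairs; the decision maker picks the best one.
        return max(expected_value(child) for _, child in node["branches"])
    raise ValueError(f"unknown node kind: {kind}")


def best_choice(decision_node):
    """Label of the branch with the highest expected payoff."""
    return max(decision_node["branches"], key=lambda b: expected_value(b[1]))[0]


# Hypothetical promotion decision; payoffs in thousands of currency units.
tree = {"kind": "decision", "branches": [
    ("tv campaign", {"kind": "chance", "branches": [
        (0.6, {"kind": "payoff", "value": 120}),
        (0.4, {"kind": "payoff", "value": -40}),
    ]}),
    ("online campaign", {"kind": "chance", "branches": [
        (0.7, {"kind": "payoff", "value": 70}),
        (0.3, {"kind": "payoff", "value": -10}),
    ]}),
    ("no campaign", {"kind": "payoff", "value": 0}),
]}

print(best_choice(tree), expected_value(tree))
```

Other decision criteria, such as maximin, can be swapped in by changing how the chance node is aggregated.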

  6. Improving Land Use/Land Cover Classification by Integrating Pixel Unmixing and Decision Tree Methods

    Directory of Open Access Journals (Sweden)

    Chao Yang

    2017-11-01

    Decision tree classification is one of the most efficient methods for obtaining land use/land cover (LULC) information from remotely sensed imagery. However, traditional decision tree classification methods cannot effectively eliminate the influence of mixed pixels. This study aimed to integrate pixel unmixing and decision tree to improve LULC classification by removing mixed pixel influence. The abundance and minimum noise fraction (MNF) results that were obtained from mixed pixel decomposition were added to decision tree multi-features using a three-dimensional (3D) terrain model, which was created using an image fusion digital elevation model (DEM), to select training samples (ROIs) and improve ROI separability. A Landsat-8 OLI image of the Yunlong Reservoir Basin in Kunming was used to test this proposed method. Study results showed that the Kappa coefficient and the overall accuracy of the integrated pixel unmixing and decision tree method increased by 0.093% and 10%, respectively, as compared with the original decision tree method. This proposed method could effectively eliminate the influence of mixed pixels and improve the accuracy in complex LULC classifications.
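
Overall accuracy and Cohen's Kappa coefficient, the two measures reported here, are both computed from a confusion matrix of reference classes against predicted classes. The sketch below uses the standard definitions. The 3-class matrix is hypothetical and is not taken from the Yunlong Reservoir Basin study.

```python
def overall_accuracy_and_kappa(matrix):
    """matrix[i][j] = number of samples of reference class i predicted as class j."""
    n = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / n
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance, given the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical confusion matrix for three land-cover classes.
confusion = [
    [50, 3, 2],
    [4, 40, 6],
    [1, 7, 37],
]
accuracy, kappa = overall_accuracy_and_kappa(confusion)
print(round(accuracy, 4), round(kappa, 4))
```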

  7. Decision trees in epidemiological research.

    Science.gov (United States)

    Venkatasubramaniam, Ashwini; Wolfson, Julian; Mitchell, Nathan; Barnes, Timothy; JaKa, Meghan; French, Simone

    2017-01-01

    In many studies, it is of interest to identify population subgroups that are relatively homogeneous with respect to an outcome. The nature of these subgroups can provide insight into effect mechanisms and suggest targets for tailored interventions. However, identifying relevant subgroups can be challenging with standard statistical methods. We review the literature on decision trees, a family of techniques for partitioning the population, on the basis of covariates, into distinct subgroups who share similar values of an outcome variable. We compare two decision tree methods, the popular Classification and Regression tree (CART) technique and the newer Conditional Inference tree (CTree) technique, assessing their performance in a simulation study and using data from the Box Lunch Study, a randomized controlled trial of a portion size intervention. Both CART and CTree identify homogeneous population subgroups and offer improved prediction accuracy relative to regression-based approaches when subgroups are truly present in the data. An important distinction between CART and CTree is that the latter uses a formal statistical hypothesis testing framework in building decision trees, which simplifies the process of identifying and interpreting the final tree model. We also introduce a novel way to visualize the subgroups defined by decision trees. Our novel graphical visualization provides a more scientifically meaningful characterization of the subgroups identified by decision trees. Decision trees are a useful tool for identifying homogeneous subgroups defined by combinations of individual characteristics. While all decision tree techniques generate subgroups, we advocate the use of the newer CTree technique due to its simplicity and ease of interpretation.

  8. Human decision error (HUMDEE) trees

    International Nuclear Information System (INIS)

    Ostrom, L.T.

    1993-01-01

    Graphical presentations of human actions in incident and accident sequences have been used for many years. However, for the most part, human decision making has been underrepresented in these trees. This paper presents a method of incorporating the human decision process into graphical presentations of incident/accident sequences. This presentation is in the form of logic trees. These trees are called Human Decision Error Trees or HUMDEE for short. The primary benefit of HUMDEE trees is that they graphically illustrate what else the individuals involved in the event could have done to prevent either the initiation or continuation of the event. HUMDEE trees also present the alternate paths available at the operator decision points in the incident/accident sequence. This is different from the Technique for Human Error Rate Prediction (THERP) event trees. There are many uses of these trees. They can be used for incident/accident investigations to show what other courses of action were available and for training operators. The trees also have a consequence component, so that not only the decision but also the consequence of that decision can be explored

  9. Decision trees in epidemiological research

    Directory of Open Access Journals (Sweden)

    Ashwini Venkatasubramaniam

    2017-09-01

    Background: In many studies, it is of interest to identify population subgroups that are relatively homogeneous with respect to an outcome. The nature of these subgroups can provide insight into effect mechanisms and suggest targets for tailored interventions. However, identifying relevant subgroups can be challenging with standard statistical methods. Main text: We review the literature on decision trees, a family of techniques for partitioning the population, on the basis of covariates, into distinct subgroups who share similar values of an outcome variable. We compare two decision tree methods, the popular Classification and Regression tree (CART) technique and the newer Conditional Inference tree (CTree) technique, assessing their performance in a simulation study and using data from the Box Lunch Study, a randomized controlled trial of a portion size intervention. Both CART and CTree identify homogeneous population subgroups and offer improved prediction accuracy relative to regression-based approaches when subgroups are truly present in the data. An important distinction between CART and CTree is that the latter uses a formal statistical hypothesis testing framework in building decision trees, which simplifies the process of identifying and interpreting the final tree model. We also introduce a novel way to visualize the subgroups defined by decision trees. Our novel graphical visualization provides a more scientifically meaningful characterization of the subgroups identified by decision trees. Conclusions: Decision trees are a useful tool for identifying homogeneous subgroups defined by combinations of individual characteristics. While all decision tree techniques generate subgroups, we advocate the use of the newer CTree technique due to its simplicity and ease of interpretation.

  10. Two Trees: Migrating Fault Trees to Decision Trees for Real Time Fault Detection on International Space Station

    Science.gov (United States)

    Lee, Charles; Alena, Richard L.; Robinson, Peter

    2004-01-01

    Starting from an example of ISS fault trees, we present a method to convert fault trees to decision trees. The method shows that visualizing the root cause of a fault becomes easier and that tree manipulation becomes more programmatic via available decision tree programs. The decision tree visualization for diagnostics is straightforward and easy to understand. For ISS real-time fault diagnostics, the status of the systems can be shown by mining the signals through the trees and seeing where they stop. Another advantage of using decision trees is that the trees can learn fault patterns and predict future faults from historical data. The learning is not limited to static data sets but can also be online: by accumulating real-time data sets, the decision trees can acquire and store fault patterns and recognize them when they recur.

  11. Ethnographic Decision Tree Modeling: A Research Method for Counseling Psychology.

    Science.gov (United States)

    Beck, Kirk A.

    2005-01-01

    This article describes ethnographic decision tree modeling (EDTM; C. H. Gladwin, 1989) as a mixed method design appropriate for counseling psychology research. EDTM is introduced and located within a postpositivist research paradigm. Decision theory that informs EDTM is reviewed, and the 2 phases of EDTM are highlighted. The 1st phase, model…

  12. Meta-learning in decision tree induction

    CERN Document Server

    Grąbczewski, Krzysztof

    2014-01-01

    The book focuses on different variants of decision tree induction but also describes the meta-learning approach in general, which is applicable to other types of machine learning algorithms. The book discusses different variants of decision tree induction and represents a useful source of information to readers wishing to review some of the techniques used in decision tree learning, as well as different ensemble methods that involve decision trees. It is shown that the knowledge of different components used within decision tree learning needs to be systematized to enable the system to generate and evaluate different variants of machine learning algorithms with the aim of identifying the top-most performers or potentially the best one. A unified view of decision tree learning makes it possible to emulate different decision tree algorithms simply by setting certain parameters. As meta-learning requires running many different processes with the aim of obtaining performance results, a detailed description of the experimen...

  13. An Efficient Method of Vibration Diagnostics For Rotating Machinery Using a Decision Tree

    Directory of Open Access Journals (Sweden)

    Bo Suk Yang

    2000-01-01

    This paper describes an efficient method to automate vibration diagnosis for rotating machinery using a decision tree, which is applicable to vibration diagnosis expert systems. The decision tree is a widely known formalism for expressing classification knowledge and has been used successfully in many diverse areas such as character recognition, medical diagnosis, and expert systems. In order to build a decision tree for vibration diagnosis, we have to define classes and attributes. A set of cases based on past experiences is also needed. This training set is induced using a result-cause matrix newly developed in the present work instead of using a conventionally implemented cause-result matrix. This method was applied to diagnostics for various cases taken from published work. It is found that the present method predicts causes of the abnormal vibration for test cases with high reliability.

  14. Predicting volume of distribution with decision tree-based regression methods using predicted tissue:plasma partition coefficients.

    Science.gov (United States)

    Freitas, Alex A; Limbu, Kriti; Ghafourian, Taravat

    2015-01-01

    Volume of distribution is an important pharmacokinetic property that indicates the extent of a drug's distribution in the body tissues. This paper addresses the problem of how to estimate the apparent volume of distribution at steady state (Vss) of chemical compounds in the human body using decision tree-based regression methods from the area of data mining (or machine learning). Hence, the pros and cons of several different types of decision tree-based regression methods have been discussed. The regression methods predict Vss using, as predictive features, both the compounds' molecular descriptors and the compounds' tissue:plasma partition coefficients (Kt:p) - often used in physiologically-based pharmacokinetics. Therefore, this work has assessed whether the data mining-based prediction of Vss can be made more accurate by using as input not only the compounds' molecular descriptors but also (a subset of) their predicted Kt:p values. Comparison of the models that used only molecular descriptors, in particular, the Bagging decision tree (mean fold error of 2.33), with those employing predicted Kt:p values in addition to the molecular descriptors, such as the Bagging decision tree using adipose Kt:p (mean fold error of 2.29), indicated that the use of predicted Kt:p values as descriptors may be beneficial for accurate prediction of Vss using decision trees if prior feature selection is applied. Decision tree based models presented in this work have an accuracy that is reasonable and similar to the accuracy of reported Vss inter-species extrapolations in the literature. The estimation of Vss for new compounds in drug discovery will benefit from methods that are able to integrate large and varied sources of data and flexible non-linear data mining methods such as decision trees, which can produce interpretable models. Graphical abstract: Decision trees for the prediction of tissue partition coefficient and volume of distribution of drugs.
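
The abstract reports model quality as a mean fold error but does not define it. The sketch below assumes the geometric form common in pharmacokinetics, 10 raised to the mean absolute log10 ratio of predicted to observed values. That definition, the function name, and the sample values are all assumptions and are not taken from the paper.

```python
import math


def mean_fold_error(predicted, observed):
    """Geometric mean fold error: 10 ** mean(|log10(predicted / observed)|).

    A value of 1.0 means perfect agreement; 2.0 means predictions are off by
    a factor of two on average, in either direction.
    """
    log_ratios = [abs(math.log10(p / o)) for p, o in zip(predicted, observed)]
    return 10 ** (sum(log_ratios) / len(log_ratios))


# Hypothetical Vss values (L/kg): two predictions off by a factor of 2, one exact.
predicted_vss = [2.0, 0.5, 10.0]
observed_vss = [1.0, 1.0, 10.0]
mfe = mean_fold_error(predicted_vss, observed_vss)
print(round(mfe, 3))
```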

  15. Comprehensive decision tree models in bioinformatics.

    Directory of Open Access Journals (Sweden)

    Gregor Stiglic

    PURPOSE: Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. METHODS: This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by the so-called one-button data mining approach where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model that is constrained exclusively by the dimensions of the produced decision tree. RESULTS: The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expect significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. CONCLUSIONS: The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm. In addition, our study demonstrates the suitability of visually tuned decision trees for datasets

  16. Interpreting CNNs via Decision Trees

    OpenAIRE

    Zhang, Quanshi; Yang, Yu; Wu, Ying Nian; Zhu, Song-Chun

    2018-01-01

    This paper presents a method to learn a decision tree to quantitatively explain the logic of each prediction of a pre-trained convolutional neural network (CNN). Our method boosts the following two aspects of network interpretability. 1) In the CNN, each filter in a high conv-layer must represent a specific object part, instead of describing mixed patterns without clear meanings. 2) People can explain each specific prediction made by the CNN at the semantic level using a decision tree, i.e....

  17. Automated Decision Tree Classification of Corneal Shape

    Science.gov (United States)

    Twa, Michael D.; Parthasarathy, Srinivasan; Roberts, Cynthia; Mahmoud, Ashraf M.; Raasch, Thomas W.; Bullimore, Mark A.

    2011-01-01

    Purpose: The volume and complexity of data produced during videokeratography examinations present a challenge of interpretation. As a consequence, results are often analyzed qualitatively by subjective pattern recognition or reduced to comparisons of summary indices. We describe the application of decision tree induction, an automated machine learning classification method, to discriminate between normal and keratoconic corneal shapes in an objective and quantitative way. We then compared this method with other known classification methods. Methods: The corneal surface was modeled with a seventh-order Zernike polynomial for 132 normal eyes of 92 subjects and 112 eyes of 71 subjects diagnosed with keratoconus. A decision tree classifier was induced using the C4.5 algorithm, and its classification performance was compared with the modified Rabinowitz–McDonnell index, Schwiegerling’s Z3 index (Z3), Keratoconus Prediction Index (KPI), KISA%, and Cone Location and Magnitude Index using recommended classification thresholds for each method. We also evaluated the area under the receiver operator characteristic (ROC) curve for each classification method. Results: Our decision tree classifier performed equal to or better than the other classifiers tested: accuracy was 92% and the area under the ROC curve was 0.97. Our decision tree classifier reduced the information needed to distinguish between normal and keratoconus eyes using four of 36 Zernike polynomial coefficients. The four surface features selected as classification attributes by the decision tree method were inferior elevation, greater sagittal depth, oblique toricity, and trefoil. Conclusions: Automated decision tree classification of corneal shape through Zernike polynomials is an accurate quantitative method of classification that is interpretable and can be generated from any instrument platform capable of raw elevation data output. This method of pattern classification is extendable to other classification
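
The area under the ROC curve reported here equals the probability that a randomly chosen positive case receives a higher classifier score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it that way by comparing every positive-negative pair. The scores and labels are hypothetical and are not the study's data.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    scores: classifier outputs, higher means "more likely positive".
    labels: 1 for positive cases (e.g. keratoconus), 0 for negative cases.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    # Each pair contributes 1 if the positive outranks the negative, 0.5 on a tie.
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical scores for six eyes.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]
auc = roc_auc(scores, labels)
print(round(auc, 3))
```

The pairwise form is quadratic in the number of cases. It is fine for small samples, but a rank-based computation is preferable for large ones.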

  18. Decision and Inhibitory Trees for Decision Tables with Many-Valued Decisions

    KAUST Repository

    Azad, Mohammad

    2018-06-06

    Decision trees are one of the most commonly used tools in decision analysis, knowledge representation, machine learning, etc., for their simplicity and interpretability. We consider an extension of dynamic programming approach to process the whole set of decision trees for the given decision table which was previously only attainable by brute-force algorithms. We study decision tables with many-valued decisions (each row may contain multiple decisions) because they are more reasonable models of data in many cases. To address this problem in a broad sense, we consider not only decision trees but also inhibitory trees where terminal nodes are labeled with “≠ decision”. Inhibitory trees can sometimes describe more knowledge from datasets than decision trees. As for cost functions, we consider depth or average depth to minimize time complexity of trees, and the number of nodes or the number of terminal or nonterminal nodes to minimize the space complexity of trees. We investigate the multi-stage optimization of trees relative to some cost functions, and also the possibility to describe the whole set of strictly optimal trees. Furthermore, we study the bi-criteria optimization cost vs. cost and cost vs. uncertainty for decision trees, and cost vs. cost and cost vs. completeness for inhibitory trees. The most interesting application of the developed technique is the creation of multi-pruning and restricted multi-pruning approaches which are useful for knowledge representation and prediction. The experimental results show that decision trees constructed by these approaches can often outperform the decision trees constructed by the CART algorithm. Another application includes the comparison of 12 greedy heuristics for single- and bi-criteria optimization (cost vs. cost) of trees. We also study the three approaches (decision tables with many-valued decisions, decision tables with most common decisions, and decision tables with generalized decisions) to handle

  19. Statistical Sensitive Data Protection and Inference Prevention with Decision Tree Methods

    National Research Council Canada - National Science Library

    Chang, LiWu

    2003-01-01

    .... We consider inference as correct classification and approach it with decision tree methods. As in our previous work, sensitive data are viewed as classes of those test data and non-sensitive data are the rest attribute values...

  20. Creating ensembles of decision trees through sampling

    Science.gov (United States)

    Kamath, Chandrika; Cantu-Paz, Erick

    2005-08-30

    A system for decision tree ensembles that includes a module to read the data, a module to sort the data, a module to evaluate a potential split of the data according to some criterion using a random sample of the data, a module to split the data, and a module to combine multiple decision trees in ensembles. The decision tree method is based on statistical sampling techniques and includes the steps of reading the data; sorting the data; evaluating a potential split according to some criterion using a random sample of the data, splitting the data, and combining multiple decision trees in ensembles.
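
The steps listed above (evaluate a split on a random sample of the data, split, combine multiple trees) can be illustrated with a heavily simplified sketch. It assumes depth-one trees on a single numeric feature, a misclassification-count criterion, and majority voting. All function names, parameters, and data are hypothetical, and this is not the system described in the record.

```python
import random


def fit_stump(xs, ys, rng, sample_size):
    """Choose a threshold by evaluating candidate splits on a random sample only."""
    idx = rng.sample(range(len(xs)), sample_size)
    sample = [(xs[i], ys[i]) for i in idx]
    values = sorted({x for x, _ in sample})
    best_t, best_err = None, None
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        # The stump predicts class 1 when x > t; count sample errors for this split.
        err = sum((x > t) != bool(y) for x, y in sample)
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t


def fit_ensemble(xs, ys, n_trees=25, sample_size=10, seed=0):
    """Build several stumps, each from its own random sample of the rows."""
    rng = random.Random(seed)
    return [fit_stump(xs, ys, rng, sample_size) for _ in range(n_trees)]


def predict(ensemble, x):
    """Combine the stumps by majority vote."""
    votes = sum(x > t for t in ensemble)
    return int(votes * 2 > len(ensemble))


# Hypothetical data: class 1 whenever the feature is at least 10.
xs = list(range(20))
ys = [int(x >= 10) for x in xs]
ensemble = fit_ensemble(xs, ys)
print(predict(ensemble, 2), predict(ensemble, 17))
```

Evaluating splits on a sample trades a little accuracy per tree for speed, and the vote across trees is what recovers stability.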

  1. Decision Tree Technique for Particle Identification

    International Nuclear Information System (INIS)

    Quiller, Ryan

    2003-01-01

    Particle identification based on measurements such as the Cerenkov angle, momentum, and the rate of energy loss per unit distance (-dE/dx) is fundamental to the BaBar detector for particle physics experiments. It is particularly important to separate the charged forms of kaons and pions. Currently, the Neural Net, an algorithm based on mapping input variables to an output variable using hidden variables as intermediaries, is one of the primary tools used for identification. In this study, a decision tree classification technique implemented in the computer program CART was investigated and compared to the Neural Net over the range of momenta 0.25 GeV/c to 5.0 GeV/c. For a given subinterval of momentum, three decision trees were made using different sets of input variables. The sensitivity and specificity were calculated for varying kaon acceptance thresholds. These data were used to plot Receiver Operating Characteristic curves (ROC curves) to compare the performance of the classification methods. Also, input variables used in constructing the decision trees were analyzed. It was found that the Neural Net was a significant contributor to decision trees using dE/dx and the Cerenkov angle as inputs. Furthermore, the Neural Net had poorer performance than the decision tree technique, but tended to improve decision tree performance when used as an input variable. These results suggest that the decision tree technique using Neural Net input may increase accuracy of particle identification in BaBar
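
Computing sensitivity and specificity at varying acceptance thresholds, as described above, yields the points of a ROC curve. The sketch below sweeps the threshold over the observed classifier scores. The scores and the kaon/pion labels are hypothetical and are not BaBar data.

```python
def roc_points(scores, is_kaon):
    """Return (threshold, sensitivity, specificity) for each distinct score.

    A track is accepted as a kaon when its score is >= threshold.
    """
    n_pos = sum(1 for k in is_kaon if k)
    n_neg = len(is_kaon) - n_pos
    points = []
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, k in zip(scores, is_kaon) if s >= t and k)
        fp = sum(1 for s, k in zip(scores, is_kaon) if s >= t and not k)
        sensitivity = tp / n_pos            # accepted kaons / all kaons
        specificity = (n_neg - fp) / n_neg  # rejected pions / all pions
        points.append((t, sensitivity, specificity))
    return points


# Hypothetical classifier outputs for six tracks (True = kaon, False = pion).
scores = [0.95, 0.85, 0.70, 0.55, 0.30, 0.10]
is_kaon = [True, True, False, True, False, False]
points = roc_points(scores, is_kaon)
for threshold, sens, spec in points:
    print(threshold, round(sens, 2), round(spec, 2))
```

Plotting sensitivity against 1 - specificity for these points gives the ROC curve used to compare classifiers.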

  2. Comprehensive decision tree models in bioinformatics.

    Science.gov (United States)

    Stiglic, Gregor; Kocbek, Simon; Pernek, Igor; Kokol, Peter

    2012-01-01

    Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by the so-called one-button data mining approach where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model that is constrained exclusively by the dimensions of the produced decision tree. The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expect significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm. In addition, our study demonstrates the suitability of visually tuned decision trees for datasets with binary class attributes and a high number of possibly

  3. On algorithm for building of optimal α-decision trees

    KAUST Repository

    Alkhalid, Abdulaziz

    2010-01-01

    The paper describes an algorithm that constructs approximate decision trees (α-decision trees), which are optimal relative to one of the following complexity measures: depth, total path length or number of nodes. The algorithm uses dynamic programming and extends methods described in [4] to the construction of approximate decision trees. An adjustable approximation rate allows the algorithm complexity to be controlled. The algorithm is applied to build optimal α-decision trees for two data sets from the UCI Machine Learning Repository [1]. © 2010 Springer-Verlag Berlin Heidelberg.

  4. Quantifying human and organizational factors in accident management using decision trees: the HORAAM method

    International Nuclear Information System (INIS)

    Baumont, G.; Menage, F.; Schneiter, J.R.; Spurgin, A.; Vogel, A.

    2000-01-01

    In the framework of the level 2 Probabilistic Safety Study (PSA 2) project, the Institute for Nuclear Safety and Protection (IPSN) has developed a method for taking into account Human and Organizational Reliability Aspects during accident management. Actions are taken during very degraded installation operations by teams of experts in the French framework of Crisis Organization (ONC). After describing the background of the framework of the Level 2 PSA, the French specific Crisis Organization and the characteristics of human actions in the Accident Progression Event Tree, this paper describes the method developed to introduce in PSA the Human and Organizational Reliability Analysis in Accident Management (HORAAM). This method is based on the Decision Tree method and has gone through a number of steps in its development. The first one was the observation of crisis center exercises, in order to identify the main influence factors (IFs) which affect human and organizational reliability. These IFs were used as headings in the Decision Tree method. Expert judgment was used in order to verify the IFs, to rank them, and to estimate the value of the aggregated factors to simplify the quantification of the tree. A tool based on Mathematica was developed to increase the flexibility and the efficiency of the study

  5. An Isometric Mapping Based Co-Location Decision Tree Algorithm

    Science.gov (United States)

    Zhou, G.; Wei, J.; Zhou, X.; Zhang, R.; Huang, W.; Sha, H.; Chen, J.

    2018-05-01

    Decision tree (DT) induction has been widely used in different pattern classification. However, most traditional DTs have the disadvantage that they consider only non-spatial attributes (i.e., spectral information) as a result of classifying pixels, which can result in objects being misclassified. Therefore, some researchers have proposed a co-location decision tree (Cl-DT) method, which combines co-location and decision tree to solve the above-mentioned traditional decision tree problems. Cl-DT overcomes the shortcomings of the existing DT algorithms, which create a node for each value of a given attribute, and has a higher accuracy than the existing decision tree approach. However, for non-linearly distributed data instances, the Euclidean distance between instances does not reflect the true positional relationship between them. In order to overcome these shortcomings, this paper proposes an isometric mapping method based on Cl-DT (called Isomap-based Cl-DT), which is a method that combines heterogeneous and Cl-DT together. Because isometric mapping methods use geodesic distances instead of Euclidean distances between non-linearly distributed instances, the true distance between instances can be reflected. The experimental results and several comparative analyses show that: (1) The extraction method of exposed carbonate rocks is of high accuracy. (2) The proposed method has many advantages, because the total number of nodes, the number of leaf nodes and the number of nodes are greatly reduced compared to Cl-DT. Therefore, the Isomap-based Cl-DT algorithm can construct a more accurate and faster decision tree.

  6. AN ISOMETRIC MAPPING BASED CO-LOCATION DECISION TREE ALGORITHM

    Directory of Open Access Journals (Sweden)

    G. Zhou

    2018-05-01

    Decision tree (DT) induction has been widely used in different pattern classification. However, most traditional DTs have the disadvantage that they consider only non-spatial attributes (i.e., spectral information) as a result of classifying pixels, which can result in objects being misclassified. Therefore, some researchers have proposed a co-location decision tree (Cl-DT) method, which combines co-location and decision tree to solve the above-mentioned traditional decision tree problems. Cl-DT overcomes the shortcomings of the existing DT algorithms, which create a node for each value of a given attribute, and has a higher accuracy than the existing decision tree approach. However, for non-linearly distributed data instances, the Euclidean distance between instances does not reflect the true positional relationship between them. In order to overcome these shortcomings, this paper proposes an isometric mapping method based on Cl-DT (called Isomap-based Cl-DT), which is a method that combines heterogeneous and Cl-DT together. Because isometric mapping methods use geodesic distances instead of Euclidean distances between non-linearly distributed instances, the true distance between instances can be reflected. The experimental results and several comparative analyses show that: (1) The extraction method of exposed carbonate rocks is of high accuracy. (2) The proposed method has many advantages, because the total number of nodes, the number of leaf nodes and the number of nodes are greatly reduced compared to Cl-DT. Therefore, the Isomap-based Cl-DT algorithm can construct a more accurate and faster decision tree.

  7. Parallel object-oriented decision tree system

    Science.gov (United States)

    Kamath, Chandrika [Dublin, CA]; Cantu-Paz, Erick [Oakland, CA]

    2006-02-28

    A data mining decision tree system that uncovers patterns, associations, anomalies, and other statistically significant structures in data by reading and displaying data files, extracting relevant features for each of the objects, and using a method of recognizing patterns among the objects based upon object features through a decision tree that reads the data, sorts the data if necessary, determines the best manner to split the data into subsets according to some criterion, and splits the data.

  8. Fast Image Texture Classification Using Decision Trees

    Science.gov (United States)

    Thompson, David R.

    2011-01-01

    Texture analysis would permit improved autonomous, onboard science data interpretation for adaptive navigation, sampling, and downlink decisions. These analyses would assist with terrain analysis and instrument placement in both macroscopic and microscopic image data products. Unfortunately, most state-of-the-art texture analysis demands computationally expensive convolutions of filters involving many floating-point operations. This makes them infeasible for radiation-hardened computers and spaceflight hardware. A new method approximates traditional texture classification of each image pixel with a fast decision-tree classifier. The classifier uses image features derived from simple filtering operations involving integer arithmetic. The texture analysis method is therefore amenable to implementation on FPGA (field-programmable gate array) hardware. Image features based on the "integral image" transform produce descriptive and efficient texture descriptors. Training the decision tree on a set of training data yields a classification scheme that produces reasonable approximations of optimal "texton" analysis at a fraction of the computational cost. A decision-tree learning algorithm employing the traditional k-means criterion of inter-cluster variance is used to learn tree structure from training data. The result is an efficient and accurate summary of surface morphology in images. This work is an evolutionary advance that unites several previous algorithms (k-means clustering, integral images, decision trees) and applies them to a new problem domain (morphology analysis for autonomous science during remote exploration). Advantages include order-of-magnitude improvements in runtime, feasibility for FPGA hardware, and significant improvements in texture classification accuracy.
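
    The "integral image" transform mentioned above can be illustrated with a short Python sketch. This is not the authors' implementation: the function names `integral_image` and `box_sum` and the sample 3x3 image are assumptions made for illustration. It shows why such features suit integer-only hardware: once the summed-area table is built, the sum over any rectangle costs four lookups and three additions or subtractions.

```python
def integral_image(img):
    """Build a summed-area table with one extra zero row and column.

    sat[y][x] holds the sum of all pixels img[r][c] with r < y and c < x.
    Only integer additions are used.
    """
    height, width = len(img), len(img[0])
    sat = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0  # running sum of the current row
        for x in range(width):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def box_sum(sat, top, left, bottom, right):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1."""
    return sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]


# Hypothetical 3x3 image used only to exercise the functions.
image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
sat = integral_image(image)
whole = box_sum(sat, 0, 0, 3, 3)         # sum of every pixel
lower_right = box_sum(sat, 1, 1, 3, 3)   # sum of 5, 6, 8, 9
```

    Box sums of this kind could serve as the simple filter responses that a decision-tree classifier tests at each node, although the abstract does not specify the exact features used.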

  9. Building of fuzzy decision trees using ID3 algorithm

    Science.gov (United States)

    Begenova, S. B.; Avdeenko, T. V.

    2018-05-01

    Decision trees are widely used in the field of machine learning and artificial intelligence. Such popularity is due to the fact that decision trees can be used to build graphical models and text rules that are easily understood by the end user. Because of inaccurate observations and uncertainties, data collected in the environment often take an unclear form. Therefore, fuzzy decision trees are becoming popular in the field of machine learning. This article presents a method that includes the features of the two above-mentioned approaches: a graphical representation of the rule system in the form of a tree and a fuzzy representation of the data. The approach uses such advantages as the high comprehensibility of decision trees and the ability to cope with inaccurate and uncertain information in a fuzzy representation. The resulting learning method is suitable for classification problems with both numerical and symbolic features. In the article, solution illustrations and numerical results are given.

  10. Objective consensus from decision trees.

    Science.gov (United States)

    Putora, Paul Martin; Panje, Cedric M; Papachristofilou, Alexandros; Dal Pra, Alan; Hundsberger, Thomas; Plasswilm, Ludwig

    2014-12-05

    Consensus-based approaches provide an alternative to evidence-based decision making, especially in situations where high-level evidence is limited. Our aim was to demonstrate a novel source of information, objective consensus based on recommendations in decision tree format from multiple sources. Based on nine sample recommendations in decision tree format a representative analysis was performed. The most common (mode) recommendations for each eventuality (each permutation of parameters) were determined. The same procedure was applied to real clinical recommendations for primary radiotherapy for prostate cancer. Data was collected from 16 radiation oncology centres, converted into decision tree format and analyzed in order to determine the objective consensus. Based on information from multiple sources in decision tree format, treatment recommendations can be assessed for every parameter combination. An objective consensus can be determined by means of mode recommendations without compromise or confrontation among the parties. In the clinical example involving prostate cancer therapy, three parameters were used with two cut-off values each (Gleason score, PSA, T-stage) resulting in a total of 27 possible combinations per decision tree. Despite significant variations among the recommendations, a mode recommendation could be found for specific combinations of parameters. Recommendations represented as decision trees can serve as a basis for objective consensus among multiple parties.
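
    The mode-based consensus described above can be sketched in a few lines of Python. Everything named here is hypothetical: the level labels, the three `centre_*` functions and the `option_1`/`option_2` outputs are placeholders and do not reproduce any real clinical recommendation. The sketch assumes each centre's decision tree can be treated as a function of three parameters with three levels each (two cut-offs), which gives the 27 combinations mentioned in the abstract.

```python
from collections import Counter
from itertools import product

# Assumption: two cut-off values split each parameter into three levels.
LEVELS = ("low", "mid", "high")


def mode_recommendations(trees):
    """Return the most common recommendation for every parameter combination.

    A combination maps to None when the top recommendations are tied,
    that is, when no single mode exists.
    """
    consensus = {}
    for combo in product(LEVELS, repeat=3):
        votes = Counter(tree(*combo) for tree in trees).most_common()
        top, top_count = votes[0]
        tied = len(votes) > 1 and votes[1][1] == top_count
        consensus[combo] = None if tied else top
    return consensus


# Hypothetical decision trees from three centres (placeholder outputs).
def centre_a(gleason, psa, t_stage):
    return "option_2" if "high" in (gleason, psa, t_stage) else "option_1"


def centre_b(gleason, psa, t_stage):
    return "option_2" if gleason == "high" else "option_1"


def centre_c(gleason, psa, t_stage):
    return "option_2" if gleason != "low" else "option_1"


consensus = mode_recommendations([centre_a, centre_b, centre_c])
```

    Because every combination is evaluated independently, a mode can exist for some combinations even when the centres disagree elsewhere, which matches the observation in the abstract.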

  11. Objective consensus from decision trees

    International Nuclear Information System (INIS)

    Putora, Paul Martin; Panje, Cedric M; Papachristofilou, Alexandros; Pra, Alan Dal; Hundsberger, Thomas; Plasswilm, Ludwig

    2014-01-01

    Consensus-based approaches provide an alternative to evidence-based decision making, especially in situations where high-level evidence is limited. Our aim was to demonstrate a novel source of information, objective consensus based on recommendations in decision tree format from multiple sources. Based on nine sample recommendations in decision tree format a representative analysis was performed. The most common (mode) recommendations for each eventuality (each permutation of parameters) were determined. The same procedure was applied to real clinical recommendations for primary radiotherapy for prostate cancer. Data was collected from 16 radiation oncology centres, converted into decision tree format and analyzed in order to determine the objective consensus. Based on information from multiple sources in decision tree format, treatment recommendations can be assessed for every parameter combination. An objective consensus can be determined by means of mode recommendations without compromise or confrontation among the parties. In the clinical example involving prostate cancer therapy, three parameters were used with two cut-off values each (Gleason score, PSA, T-stage) resulting in a total of 27 possible combinations per decision tree. Despite significant variations among the recommendations, a mode recommendation could be found for specific combinations of parameters. Recommendations represented as decision trees can serve as a basis for objective consensus among multiple parties

  12. An enhanced component connection method for conversion of fault trees to binary decision diagrams

    International Nuclear Information System (INIS)

    Remenyte-Prescott, R.; Andrews, J.D.

    2008-01-01

    Fault tree analysis (FTA) is widely applied to assess the failure probability of industrial systems. Many computer packages are available, which are based on conventional kinetic tree theory methods. When dealing with large (possibly non-coherent) fault trees, the limitations of the technique in terms of accuracy of the solutions and the efficiency of the processing time become apparent. Over recent years, the binary decision diagram (BDD) method has been developed that solves fault trees and overcomes the disadvantages of the conventional FTA approach. First of all, a fault tree for a particular system failure mode is constructed and then converted to a BDD for analysis. This paper analyses alternative methods for the fault tree to BDD conversion process. For most fault tree to BDD conversion approaches, the basic events of the fault tree are placed in an ordering. This can dramatically affect the size of the final BDD and the success of qualitative and quantitative analyses of the system. A set of rules is then applied to each gate in the fault tree to generate the BDD. An alternative approach can also be used, where BDD constructs for each of the gate types are first built and then merged to represent a parent gate. A powerful and efficient property, sub-node sharing, is also incorporated in the enhanced method proposed in this paper. Finally, a combined approach is developed taking the best features of the alternative methods. The efficiency of the techniques is analysed and discussed

  13. TreePOD: Sensitivity-Aware Selection of Pareto-Optimal Decision Trees.

    Science.gov (United States)

    Muhlbacher, Thomas; Linhardt, Lorenz; Moller, Torsten; Piringer, Harald

    2018-01-01

    Balancing accuracy gains with other objectives such as interpretability is a key challenge when building decision trees. However, this process is difficult to automate because it involves know-how about the domain as well as the purpose of the model. This paper presents TreePOD, a new approach for sensitivity-aware model selection along trade-offs. TreePOD is based on exploring a large set of candidate trees generated by sampling the parameters of tree construction algorithms. Based on this set, visualizations of quantitative and qualitative tree aspects provide a comprehensive overview of possible tree characteristics. Along trade-offs between two objectives, TreePOD provides efficient selection guidance by focusing on Pareto-optimal tree candidates. TreePOD also conveys the sensitivities of tree characteristics on variations of selected parameters by extending the tree generation process with a full-factorial sampling. We demonstrate how TreePOD supports a variety of tasks involved in decision tree selection and describe its integration in a holistic workflow for building and selecting decision trees. For evaluation, we illustrate a case study for predicting critical power grid states, and we report qualitative feedback from domain experts in the energy sector. This feedback suggests that TreePOD enables users with and without statistical background a confident and efficient identification of suitable decision trees.
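
    The idea of focusing on Pareto-optimal candidates can be made concrete with a minimal Python sketch. It is not TreePOD itself: the two objectives (higher accuracy, fewer nodes), the dictionary keys and the candidate values below are assumptions chosen for illustration.

```python
def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates.

    A candidate is dominated when another one is at least as accurate and
    at least as small, and strictly better on one of the two objectives.
    """
    front = []
    for cand in candidates:
        dominated = any(
            other["accuracy"] >= cand["accuracy"]
            and other["nodes"] <= cand["nodes"]
            and (other["accuracy"] > cand["accuracy"] or other["nodes"] < cand["nodes"])
            for other in candidates
        )
        if not dominated:
            front.append(cand)
    return front


# Hypothetical candidate trees, e.g. obtained by sampling construction parameters.
candidates = [
    {"name": "t1", "accuracy": 0.80, "nodes": 5},
    {"name": "t2", "accuracy": 0.85, "nodes": 9},
    {"name": "t3", "accuracy": 0.84, "nodes": 15},
    {"name": "t4", "accuracy": 0.90, "nodes": 31},
    {"name": "t5", "accuracy": 0.80, "nodes": 7},
]
front_names = [c["name"] for c in pareto_front(candidates)]
```

    Here `t3` is dominated by `t2` and `t5` by `t1`, so only three trees remain for the user to compare along the accuracy versus size trade-off.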

  14. Multi-stage optimization of decision and inhibitory trees for decision tables with many-valued decisions

    KAUST Repository

    Azad, Mohammad

    2017-06-16

    We study problems of optimization of decision and inhibitory trees for decision tables with many-valued decisions. As cost functions, we consider depth, average depth, number of nodes, and number of terminal/nonterminal nodes in trees. Decision tables with many-valued decisions (multi-label decision tables) are often more accurate models for real-life data sets than usual decision tables with single-valued decisions. Inhibitory trees can sometimes capture more information from decision tables than decision trees. In this paper, we create dynamic programming algorithms for multi-stage optimization of trees relative to a sequence of cost functions. We apply these algorithms to prove the existence of totally optimal (simultaneously optimal relative to a number of cost functions) decision and inhibitory trees for some modified decision tables from the UCI Machine Learning Repository.

  15. Multi-stage optimization of decision and inhibitory trees for decision tables with many-valued decisions

    KAUST Repository

    Azad, Mohammad; Moshkov, Mikhail

    2017-01-01

    We study problems of optimization of decision and inhibitory trees for decision tables with many-valued decisions. As cost functions, we consider depth, average depth, number of nodes, and number of terminal/nonterminal nodes in trees. Decision tables with many-valued decisions (multi-label decision tables) are often more accurate models for real-life data sets than usual decision tables with single-valued decisions. Inhibitory trees can sometimes capture more information from decision tables than decision trees. In this paper, we create dynamic programming algorithms for multi-stage optimization of trees relative to a sequence of cost functions. We apply these algorithms to prove the existence of totally optimal (simultaneously optimal relative to a number of cost functions) decision and inhibitory trees for some modified decision tables from the UCI Machine Learning Repository.

  16. A survey of decision tree classifier methodology

    Science.gov (United States)

    Safavian, S. R.; Landgrebe, David

    1991-01-01

    Decision tree classifiers (DTCs) are used successfully in many diverse areas such as radar signal classification, character recognition, remote sensing, medical diagnosis, expert systems, and speech recognition. Perhaps the most important feature of DTCs is their capability to break down a complex decision-making process into a collection of simpler decisions, thus providing a solution which is often easier to interpret. A survey of current methods is presented for DTC designs and the various existing issues. After considering potential advantages of DTCs over single-state classifiers, subjects of tree structure design, feature selection at each internal node, and decision and search strategies are discussed.

  17. Multivariate analysis of flow cytometric data using decision trees.

    Science.gov (United States)

    Simon, Svenja; Guthke, Reinhard; Kamradt, Thomas; Frey, Oliver

    2012-01-01

    Characterization of the response of the host immune system is important in understanding the bidirectional interactions between the host and microbial pathogens. For research on the host site, flow cytometry has become one of the major tools in immunology. Advances in technology and reagents now allow the simultaneous assessment of multiple markers on a single cell level, generating multidimensional data sets that require multivariate statistical analysis. We explored the explanatory power of the supervised machine learning method called "induction of decision trees" in flow cytometric data. In order to examine whether the production of a certain cytokine depends on other cytokines, datasets from intracellular staining for six cytokines with complex patterns of co-expression were analyzed by induction of decision trees. After weighting the data according to their class probabilities, we created a total of 13,392 different decision trees for each given cytokine with different parameter settings. For a more realistic estimation of the decision trees' quality, we used stratified fivefold cross validation and chose the "best" tree according to a combination of different quality criteria. While some of the decision trees reflected previously known co-expression patterns, we found that the expression of some cytokines was not only dependent on the co-expression of others per se, but was also dependent on the intensity of expression. Thus, for the first time we successfully used induction of decision trees for the analysis of high dimensional flow cytometric data and demonstrated the feasibility of this method to reveal structural patterns in such data sets.

  18. Predicting gene function using hierarchical multi-label decision tree ensembles

    Directory of Open Access Journals (Sweden)

    Kocev Dragi

    2010-01-01

    Background: S. cerevisiae, A. thaliana and M. musculus are well-studied organisms in biology and the sequencing of their genomes was completed many years ago. It is still a challenge, however, to develop methods that assign biological functions to the ORFs in these genomes automatically. Different machine learning methods have been proposed to this end, but it remains unclear which method is to be preferred in terms of predictive performance, efficiency and usability. Results: We study the use of decision tree based models for predicting the multiple functions of ORFs. First, we describe an algorithm for learning hierarchical multi-label decision trees. These can simultaneously predict all the functions of an ORF, while respecting a given hierarchy of gene functions (such as FunCat or GO). We present new results obtained with this algorithm, showing that the trees found by it exhibit clearly better predictive performance than the trees found by previously described methods. Nevertheless, the predictive performance of individual trees is lower than that of some recently proposed statistical learning methods. We show that ensembles of such trees are more accurate than single trees and are competitive with state-of-the-art statistical learning and functional linkage methods. Moreover, the ensemble method is computationally efficient and easy to use. Conclusions: Our results suggest that decision tree based methods are a state-of-the-art, efficient and easy-to-use approach to ORF function prediction.

  19. VC-dimension of univariate decision trees.

    Science.gov (United States)

    Yildiz, Olcay Taner

    2015-02-01

    In this paper, we give and prove the lower bounds of the Vapnik-Chervonenkis (VC)-dimension of the univariate decision tree hypothesis class. The VC-dimension of the univariate decision tree depends on the VC-dimension values of its subtrees and the number of inputs. Via a search algorithm that calculates the VC-dimension of univariate decision trees exhaustively, we show that our VC-dimension bounds are tight for simple trees. To verify that the VC-dimension bounds are useful, we also use them to get VC-generalization bounds for complexity control using structural risk minimization in decision trees, i.e., pruning. Our simulation results show that structural risk minimization pruning using the VC-dimension bounds finds trees that are more accurate as those pruned using cross validation.

  20. Minimization of decision tree depth for multi-label decision tables

    KAUST Repository

    Azad, Mohammad

    2014-10-01

    In this paper, we consider multi-label decision tables that have a set of decisions attached to each row. Our goal is to find one decision from the set of decisions for each row by using a decision tree as our tool. With the aim of minimizing the depth of the decision tree, we devised various kinds of greedy algorithms as well as a dynamic programming algorithm. Comparing against the optimal result obtained from the dynamic programming algorithm, we found that some greedy algorithms produce results close to the optimal result for the minimization of the depth of decision trees.

  1. Minimization of decision tree depth for multi-label decision tables

    KAUST Repository

    Azad, Mohammad; Moshkov, Mikhail

    2014-01-01

    In this paper, we consider multi-label decision tables that have a set of decisions attached to each row. Our goal is to find one decision from the set of decisions for each row by using a decision tree as our tool. With the aim of minimizing the depth of the decision tree, we devised various kinds of greedy algorithms as well as a dynamic programming algorithm. Comparing against the optimal result obtained from the dynamic programming algorithm, we found that some greedy algorithms produce results close to the optimal result for the minimization of the depth of decision trees.

  2. Finding small equivalent decision trees is hard

    NARCIS (Netherlands)

    Zantema, H.; Bodlaender, H.L.

    2000-01-01

    Two decision trees are called decision equivalent if they represent the same function, i.e., they yield the same result for every possible input. We prove that, given a decision tree and a number, deciding whether there is a decision equivalent decision tree of size at most that number is NP-complete. As
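
    The definition of decision equivalence can be illustrated with a small Python sketch for trees over Boolean inputs. The tuple encoding `(variable_index, subtree_if_0, subtree_if_1)`, the function names and the node-count notion of size are all assumptions made for this example and are not taken from the paper. The check simply enumerates every input, which takes exponential time in the number of variables and says nothing about how hard it is to find a smaller equivalent tree.

```python
from itertools import product


def evaluate(tree, x):
    """Follow the tests from the root until a leaf (a non-tuple value) is reached."""
    while isinstance(tree, tuple):
        var, if_zero, if_one = tree
        tree = if_one if x[var] else if_zero
    return tree


def decision_equivalent(tree_a, tree_b, n_vars):
    """True when both trees give the same result on every 0/1 input."""
    return all(
        evaluate(tree_a, x) == evaluate(tree_b, x)
        for x in product((0, 1), repeat=n_vars)
    )


def size(tree):
    """Assumed size measure: number of nodes, leaves included."""
    if not isinstance(tree, tuple):
        return 1
    return 1 + size(tree[1]) + size(tree[2])


# Two trees for x0 AND x1: the first tests x1 even when x0 = 0.
big_and = (0, (1, False, False), (1, False, True))
small_and = (0, False, (1, False, True))
# A tree for x0 OR x1, which is a different function.
or_tree = (0, (1, False, True), True)

same = decision_equivalent(big_and, small_and, 2)
different = decision_equivalent(big_and, or_tree, 2)
```

    In this toy case the redundant test in `big_and` is easy to spot by eye; the result quoted above concerns the general decision problem.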

  3. Decision tree modeling using R.

    Science.gov (United States)

    Zhang, Zhongheng

    2016-08-01

    In the machine learning field, the decision tree learner is powerful and easy to interpret. It employs a recursive binary partitioning algorithm that splits the sample on the partitioning variable with the strongest association with the response variable. The process continues until some stopping criteria are met. In the example I focus on the conditional inference tree, which incorporates tree-structured regression models into conditional inference procedures. Because a single tree is sensitive to small changes in the training data, the random forests procedure is introduced to address this problem. The sources of diversity for random forests come from the random sampling and the restricted set of input variables to be selected. Finally, I introduce R functions to perform model-based recursive partitioning. This method incorporates recursive partitioning into conventional parametric model building.

  4. Decision table development and application to the construction of fault trees

    International Nuclear Information System (INIS)

    Salem, S.L.; Wu, J.S.; Apostolakis, G.

    1979-01-01

    A systematic methodology for the construction of fault trees based on the use of decision tables has been developed. These tables are used to describe each possible output state of a component as a set of combinations of states of inputs and internal operational or T states. Two methods for modeling component behavior via decision tables have been developed, one inductive and one deductive. These methods are useful for creating decision tables that realistically model the operational and failure modes of electrical, mechanical, and hydraulic components as well as human interactions, inhibit conditions, and common-cause events. A computer code CAT (Computer Automated Tree) has been developed to automatically produce fault trees from decision tables. A simple electrical system was chosen to illustrate the basic features of the decision table approach and to provide an example of an actual fault tree produced by this code. This example demonstrates the potential utility of such an automated approach to fault tree construction once a basic set of general decision tables has been developed.

  5. Representing Boolean Functions by Decision Trees

    KAUST Repository

    Chikalov, Igor

    2011-01-01

    A Boolean or discrete function can be represented by a decision tree. A compact form of decision tree named binary decision diagram or branching program is widely known in logic design [2, 40]. This representation is equivalent to other forms, and in some cases it is more compact than values table or even the formula [44]. Representing a function in the form of decision tree allows applying graph algorithms for various transformations [10]. Decision trees and branching programs are used for effective hardware [15] and software [5] implementation of functions. For the implementation to be effective, the function representation should have minimal time and space complexity. The average depth of decision tree characterizes the expected computing time, and the number of nodes in branching program characterizes the number of functional elements required for implementation. Often these two criteria are incompatible, i.e. there is no solution that is optimal on both time and space complexity. © Springer-Verlag Berlin Heidelberg 2011.

  6. Totally optimal decision trees for Boolean functions

    KAUST Repository

    Chikalov, Igor

    2016-07-28

    We study decision trees which are totally optimal relative to different sets of complexity parameters for Boolean functions. A totally optimal tree is an optimal tree relative to each parameter from the set simultaneously. We consider the parameters characterizing both time (in the worst- and average-case) and space complexity of decision trees, i.e., depth, total path length (average depth), and number of nodes. We have created tools based on extensions of dynamic programming to study totally optimal trees. These tools are applicable to both exact and approximate decision trees, and allow us to make multi-stage optimization of decision trees relative to different parameters and to count the number of optimal trees. Based on the experimental results we have formulated the following hypotheses (and subsequently proved): for almost all Boolean functions there exist totally optimal decision trees (i) relative to the depth and number of nodes, and (ii) relative to the depth and average depth.
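
    To give a feel for the dynamic programming idea, here is a minimal Python sketch that computes the minimum depth of an exact decision tree for a small Boolean function. It is not the authors' tool: it covers only the depth parameter, the function name `min_depth` and the partial-assignment encoding are assumptions, and it enumerates the truth table of every subfunction, so it is practical only for a handful of variables.

```python
from functools import lru_cache
from itertools import product


def min_depth(f, n):
    """Minimum depth of a decision tree computing f on tuples of n bits."""

    @lru_cache(maxsize=None)
    def solve(partial):
        # partial[i] is 0 or 1 if variable i was queried, None otherwise.
        free = [i for i, v in enumerate(partial) if v is None]
        values = set()
        for bits in product((0, 1), repeat=len(free)):
            x = list(partial)
            for i, b in zip(free, bits):
                x[i] = b
            values.add(f(tuple(x)))
        if len(values) == 1:
            return 0  # the subfunction is constant: a single leaf suffices
        # Query the variable whose worse branch is shallowest.
        return 1 + min(
            max(solve(partial[:i] + (b,) + partial[i + 1:]) for b in (0, 1))
            for i in free
        )

    return solve((None,) * n)


# Small example functions (chosen for illustration).
depth_and = min_depth(lambda x: x[0] & x[1], 2)
depth_xor3 = min_depth(lambda x: x[0] ^ x[1] ^ x[2], 3)
depth_proj = min_depth(lambda x: x[0], 3)
depth_mux = min_depth(lambda x: x[2] if x[0] else x[1], 3)
```

    Memoizing on the partial assignment means each subfunction is solved once, which is the core of the dynamic programming approach; extending the sketch to average depth or number of nodes would need a different recurrence per parameter.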

  7. Safety validation of decision trees for hepatocellular carcinoma.

    Science.gov (United States)

    Wang, Xian-Qiang; Liu, Zhe; Lv, Wen-Ping; Luo, Ying; Yang, Guang-Yun; Li, Chong-Hui; Meng, Xiang-Fei; Liu, Yang; Xu, Ke-Sen; Dong, Jia-Hong

    2015-08-21

    To evaluate a different decision tree for safe liver resection and verify its efficiency. A total of 2457 patients underwent hepatic resection between January 2004 and December 2010 at the Chinese PLA General Hospital, and 634 hepatocellular carcinoma (HCC) patients were eligible for the final analyses. Post-hepatectomy liver failure (PHLF) was identified by the association of prothrombin time < 50% and serum bilirubin > 50 μmol/L (the "50-50" criteria), which were assessed at day 5 postoperatively or later. The Swiss-Clavien decision tree, Tokyo University-Makuuchi decision tree, and Chinese consensus decision tree were adopted to divide patients into two groups based on those decision trees in sequence, and the PHLF rates were recorded. The overall mortality and PHLF rate were 0.16% and 3.0%. A total of 19 patients experienced PHLF. The numbers of patients to whom the Swiss-Clavien, Tokyo University-Makuuchi, and Chinese consensus decision trees were applied were 581, 573, and 622, and the PHLF rates were 2.75%, 2.62%, and 2.73%, respectively. Significantly more cases satisfied the Chinese consensus decision tree than the Swiss-Clavien decision tree and Tokyo University-Makuuchi decision tree (P decision trees. The Chinese consensus decision tree expands the indications for hepatic resection for HCC patients and does not increase the PHLF rate compared to the Swiss-Clavien and Tokyo University-Makuuchi decision trees. It would be a safe and effective algorithm for hepatectomy in patients with hepatocellular carcinoma.

  8. Construction of α-decision trees for tables with many-valued decisions

    KAUST Repository

    Moshkov, Mikhail; Zielosko, Beata

    2011-01-01

    The paper is devoted to the study of greedy algorithm for construction of approximate decision trees (α-decision trees). This algorithm is applicable to decision tables with many-valued decisions where each row is labeled with a set of decisions. For a given row, we should find a decision from the set attached to this row. We consider bound on the number of algorithm steps, and bound on the algorithm accuracy relative to the depth of decision trees. © 2011 Springer-Verlag.

  9. Using decision trees and their ensembles for analysis of NIR spectroscopic data

    DEFF Research Database (Denmark)

    Kucheryavskiy, Sergey V.

    Advanced machine learning methods, like convolutional neural networks and decision trees, became extremely popular in the last decade. This, first of all, is directly related to the current boom in Big data analysis, where traditional statistical methods are not efficient. According to kaggle.com, the most popular online resource for Big data problems and solutions, methods based on decision trees and their ensembles are most widely used for solving the problems. It can be noted that the decision trees and convolutional neural networks are not very popular in Chemometrics. One of the reasons... and interpretation of the models. In this presentation, we are going to discuss the applicability of decision-tree-based methods (including gradient boosting) for solving classification and regression tasks with NIR spectra as predictors. We will cover such aspects as evaluation, optimization and validation...

  10. Boosted decision trees and applications

    International Nuclear Information System (INIS)

    Coadou, Y.

    2013-01-01

    Decision trees are a machine learning technique that is increasingly used in high energy physics and has long been widely used in the social sciences. After introducing the concepts of decision trees, this article focuses on their application in particle physics. (authors)

  11. Stock Picking via Nonsymmetrically Pruned Binary Decision Trees

    OpenAIRE

    Anton Andriyashin

    2008-01-01

    Stock picking is the field of financial analysis that is of particular interest for many professional investors and researchers. In this study stock picking is implemented via binary classification trees. Optimal tree size is believed to be the crucial factor in forecasting performance of the trees. While there exists a standard method of tree pruning, which is based on the cost-complexity tradeoff and used in the majority of studies employing binary decision trees, this paper introduces a no...

  12. Improved Frame Mode Selection for AMR-WB+ Based on Decision Tree

    Science.gov (United States)

    Kim, Jong Kyu; Kim, Nam Soo

    In this letter, we propose a coding mode selection method for the AMR-WB+ audio coder based on a decision tree. In order to reduce computation while maintaining good performance, a decision tree classifier is adopted with the closed-loop mode selection results as the target classification labels. The size of the decision tree is controlled by pruning, so the proposed method does not increase the memory requirement significantly. Through an evaluation test on a database covering both speech and music materials, the proposed method is found to achieve much better mode selection accuracy than the open-loop mode selection module in the AMR-WB+.

  13. Multivariate analysis of flow cytometric data using decision trees

    Directory of Open Access Journals (Sweden)

    Svenja Simon

    2012-04-01

    Characterization of the response of the host immune system is important in understanding the bidirectional interactions between the host and microbial pathogens. For research on the host site, flow cytometry has become one of the major tools in immunology. Advances in technology and reagents now allow the simultaneous assessment of multiple markers at the single-cell level, generating multidimensional data sets that require multivariate statistical analysis. We explored the explanatory power of the supervised machine learning method called 'induction of decision trees' in flow cytometric data. In order to examine whether the production of a certain cytokine depends on other cytokines, datasets from intracellular staining for six cytokines with complex patterns of co-expression were analyzed by induction of decision trees. After weighting the data according to their class probabilities, we created a total of 13,392 different decision trees for each given cytokine with different parameter settings. For a more realistic estimation of the decision trees' quality, we used stratified 5-fold cross-validation and chose the 'best' tree according to a combination of different quality criteria. While some of the decision trees reflected previously known co-expression patterns, we found that the expression of some cytokines was not only dependent on the co-expression of others per se, but was also dependent on the intensity of expression. Thus, for the first time we successfully used induction of decision trees for the analysis of high-dimensional flow cytometric data and demonstrated the feasibility of this method to reveal structural patterns in such data sets.

  14. Representing Boolean Functions by Decision Trees

    KAUST Repository

    Chikalov, Igor

    2011-01-01

    A Boolean or discrete function can be represented by a decision tree. A compact form of decision tree named binary decision diagram or branching program is widely known in logic design [2, 40]. This representation is equivalent to other forms

  15. A framework for sensitivity analysis of decision trees.

    Science.gov (United States)

    Kamiński, Bogumił; Jakubczyk, Michał; Szufel, Przemysław

    2018-01-01

    In the paper, we consider sequential decision problems with uncertainty, represented as decision trees. Sensitivity analysis is always a crucial element of decision making and in decision trees it often focuses on probabilities. In the stochastic model considered, the user often has only limited information about the true values of probabilities. We develop a framework for performing sensitivity analysis of optimal strategies accounting for this distributional uncertainty. We design this robust optimization approach in an intuitive and not overly technical way, to make it simple to apply in daily managerial practice. The proposed framework allows for (1) analysis of the stability of the expected-value-maximizing strategy and (2) identification of strategies which are robust with respect to pessimistic/optimistic/mode-favoring perturbations of probabilities. We verify the properties of our approach in two cases: (a) probabilities in a tree are the primitives of the model and can be modified independently; (b) probabilities in a tree reflect some underlying, structural probabilities, and are interrelated. We provide a free software tool implementing the methods described.
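
    The kind of analysis described above can be illustrated with a minimal sketch. It is not the authors' framework or software tool: the tree encoding, the function names, and the payoffs are all hypothetical. It rolls back expected values through a small decision tree and then checks whether the expected-value-maximizing action survives a pessimistic perturbation of one probability.

```python
# Hypothetical tree encoding (an assumption, not the paper's format):
#   ("payoff", value)
#   ("chance", [(probability, subtree), ...])
#   ("decision", {action_label: subtree, ...})

def expected_value(node):
    """Roll back the tree: average over chance nodes, maximize at decision nodes."""
    kind, body = node
    if kind == "payoff":
        return body
    if kind == "chance":
        return sum(p * expected_value(child) for p, child in body)
    return max(expected_value(child) for child in body.values())

def best_action(node):
    """Return the label of the expected-value-maximizing action at a decision node."""
    kind, body = node
    assert kind == "decision"
    return max(body, key=lambda action: expected_value(body[action]))

def make_tree(p_success):
    """Toy problem with made-up payoffs: invest (risky) or hold (safe)."""
    return ("decision", {
        "invest": ("chance", [(p_success, ("payoff", 100.0)),
                              (1.0 - p_success, ("payoff", -50.0))]),
        "hold": ("payoff", 10.0),
    })

baseline = make_tree(0.5)      # nominal probability of success
pessimistic = make_tree(0.3)   # perturbed probability of success
print(best_action(baseline), best_action(pessimistic))
```

    In this toy case the two actions break even at a success probability of 0.4, so the nominal strategy is not stable under the pessimistic perturbation.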

  16. Decision tree ensembles for online operation of large smart grids

    International Nuclear Information System (INIS)

    Steer, Kent C.B.; Wirth, Andrew; Halgamuge, Saman K.

    2012-01-01

    Highlights: ► We present a new technique for the online control of large smart grids. ► We use a Decision Tree Ensemble in a Receding Horizon Controller. ► Decision Trees can approximate online optimisation approaches. ► Decision Trees can make adjustments to their output in real time. ► The new technique outperforms heuristic online optimisation approaches. - Abstract: Smart grids utilise omnidirectional data transfer to operate a network of energy resources. Associated technologies present operators with greater control over system elements and more detailed information on the system state. While these features may improve the theoretical optimal operating performance, determining the optimal operating strategy becomes more difficult. In this paper, we show how a decision tree ensemble or ‘forest’ can produce a near-optimal control strategy in real time. The approach substitutes the decision forest for the simulation–optimisation sub-routine commonly employed in receding horizon controllers. The method is demonstrated on a small and a large network, and compared to controllers employing particle swarm optimisation and evolutionary strategies. For the smaller network the proposed method performs comparably in terms of total energy usage, but delivers a greater demand deficit. On the larger network the proposed method is superior with respect to all measures. We conclude that the method is useful when the time required to evaluate possible strategies via simulation is high.

  17. CUDT: A CUDA Based Decision Tree Algorithm

    Directory of Open Access Journals (Sweden)

    Win-Tsung Lo

    2014-01-01

    Decision trees are one of the best-known classification methods in data mining. Much research has focused on improving decision tree performance. However, those algorithms were developed for and run on traditional distributed systems. Without new technology, latency cannot be improved when processing the huge volumes of data generated by ubiquitous sensing nodes. To improve data processing latency in large-scale data mining, in this paper we design and implement a new parallelized decision tree algorithm on CUDA (compute unified device architecture), a GPGPU solution provided by NVIDIA. In the proposed system, the CPU is responsible for flow control while the GPU is responsible for computation. We conducted many experiments to evaluate the system performance of CUDT and compared it with a traditional CPU version. The results show that CUDT is 5∼55 times faster than Weka-j48 and achieves an 18-fold speedup over SPRINT for large data sets.

  18. Relationships among various parameters for decision tree optimization

    KAUST Repository

    Hussain, Shahid

    2014-01-14

    In this chapter, we study in detail the relationships between various pairs of cost functions, and between an uncertainty measure and cost functions, for decision tree optimization. We provide new tools (algorithms) to compute relationship functions, as well as experimental results on decision tables acquired from the UCI ML Repository. The algorithms presented in this chapter have already been implemented and are now part of Dagger, a software system for construction/optimization of decision trees and decision rules. The main results presented in this chapter deal with two types of algorithms for computing relationships. First, we discuss the case where we construct approximate decision trees and are interested in relationships between a cost function, such as the depth or number of nodes of a decision tree, and an uncertainty measure, such as the misclassification error (accuracy) of a decision tree. Second, we discuss relationships between two different cost functions, for example, the number of misclassifications of a decision tree versus the number of nodes in a decision tree. The results of experiments, presented in the chapter, provide further insight. © 2014 Springer International Publishing Switzerland.

  19. Relationships among various parameters for decision tree optimization

    KAUST Repository

    Hussain, Shahid

    2014-01-01

    In this chapter, we study in detail the relationships between various pairs of cost functions, and between an uncertainty measure and cost functions, for decision tree optimization. We provide new tools (algorithms) to compute relationship functions, as well as experimental results on decision tables acquired from the UCI ML Repository. The algorithms presented in this chapter have already been implemented and are now part of Dagger, a software system for construction/optimization of decision trees and decision rules. The main results presented in this chapter deal with two types of algorithms for computing relationships. First, we discuss the case where we construct approximate decision trees and are interested in relationships between a cost function, such as the depth or number of nodes of a decision tree, and an uncertainty measure, such as the misclassification error (accuracy) of a decision tree. Second, we discuss relationships between two different cost functions, for example, the number of misclassifications of a decision tree versus the number of nodes in a decision tree. The results of experiments, presented in the chapter, provide further insight. © 2014 Springer International Publishing Switzerland.

  20. Automatic design of decision-tree induction algorithms

    CERN Document Server

    Barros, Rodrigo C; Freitas, Alex A

    2015-01-01

    Presents a detailed study of the major design components that constitute a top-down decision-tree induction algorithm, including aspects such as split criteria, stopping criteria, pruning, and the approaches for dealing with missing values. Whereas the strategy still employed nowadays is to use a 'generic' decision-tree induction algorithm regardless of the data, the authors argue on the benefits that a bias-fitting strategy could bring to decision-tree induction, in which the ultimate goal is the automatic generation of a decision-tree induction algorithm tailored to the application domain o

  1. Minimizing size of decision trees for multi-label decision tables

    KAUST Repository

    Azad, Mohammad

    2014-09-29

    We used decision trees as a model to discover knowledge from multi-label decision tables, where each row has a set of decisions attached to it and the goal is to find one arbitrary decision from that set. The size of the decision tree can be small as well as very large. We study different greedy and dynamic programming algorithms for minimizing the size of decision trees. Comparing against the optimal results from the dynamic programming algorithm, we found that some greedy algorithms produce results close to optimal for the minimization of the number of nodes (at most 18.92% difference), the number of nonterminal nodes (at most 20.76% difference), and the number of terminal nodes (at most 18.71% difference).

  2. Minimizing size of decision trees for multi-label decision tables

    KAUST Repository

    Azad, Mohammad; Moshkov, Mikhail

    2014-01-01

    We used decision trees as a model to discover knowledge from multi-label decision tables, where each row has a set of decisions attached to it and the goal is to find one arbitrary decision from that set. The size of the decision tree can be small as well as very large. We study different greedy and dynamic programming algorithms for minimizing the size of decision trees. Comparing against the optimal results from the dynamic programming algorithm, we found that some greedy algorithms produce results close to optimal for the minimization of the number of nodes (at most 18.92% difference), the number of nonterminal nodes (at most 20.76% difference), and the number of terminal nodes (at most 18.71% difference).

  3. Relationships for Cost and Uncertainty of Decision Trees

    KAUST Repository

    Chikalov, Igor; Hussain, Shahid; Moshkov, Mikhail

    2013-01-01

    This chapter is devoted to the design of new tools for the study of decision trees. These tools are based on a dynamic programming approach and require the consideration of subtables of the initial decision table, so the approach is applicable only to relatively small decision tables. The considered tools allow us to compute: 1. The minimum cost of an approximate decision tree for a given uncertainty value and a cost function. 2. The minimum number of nodes in an exact decision tree whose depth is at most a given value. For the first tool we considered various cost functions, such as the depth and average depth of a decision tree and the number of nodes (and the number of terminal and nonterminal nodes) of a decision tree. The uncertainty of a decision table is equal to the number of unordered pairs of rows with different decisions. The uncertainty of an approximate decision tree is equal to the maximum uncertainty of a subtable corresponding to a terminal node of the tree. In addition to the algorithms for such tools, we also present experimental results on various datasets acquired from the UCI ML Repository [4]. © Springer-Verlag Berlin Heidelberg 2013.
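
    The two uncertainty definitions in this abstract can be written down directly. The sketch below is only an illustration of those definitions, not the chapter's dynamic programming tools; the function names and the representation of a subtable as a plain list of decisions are assumptions made here.

```python
from collections import Counter

def uncertainty(decisions):
    """Number of unordered pairs of rows with different decisions.

    `decisions` holds one decision label per row of the (sub)table.
    Counted as all unordered pairs minus the pairs that share a decision.
    """
    n = len(decisions)
    all_pairs = n * (n - 1) // 2
    same_pairs = sum(c * (c - 1) // 2 for c in Counter(decisions).values())
    return all_pairs - same_pairs

def tree_uncertainty(leaf_subtables):
    """Uncertainty of an approximate tree: the maximum over its terminal-node subtables."""
    return max(uncertainty(decisions) for decisions in leaf_subtables)

# Made-up example: a table with four rows, and a tree with two terminal nodes.
print(uncertainty(["a", "a", "b", "c"]))
print(tree_uncertainty([["a", "a", "b"], ["c", "c"]]))
```

    An exact decision tree is the special case in which every terminal-node subtable has uncertainty 0.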

  4. Decision tree analysis in subarachnoid hemorrhage: prediction of outcome parameters during the course of aneurysmal subarachnoid hemorrhage using decision tree analysis.

    Science.gov (United States)

    Hostettler, Isabel Charlotte; Muroi, Carl; Richter, Johannes Konstantin; Schmid, Josef; Neidert, Marian Christoph; Seule, Martin; Boss, Oliver; Pangalu, Athina; Germans, Menno Robbert; Keller, Emanuela

    2018-01-19

    OBJECTIVE The aim of this study was to create prediction models for outcome parameters by decision tree analysis based on clinical and laboratory data in patients with aneurysmal subarachnoid hemorrhage (aSAH). METHODS The database consisted of clinical and laboratory parameters of 548 patients with aSAH who were admitted to the Neurocritical Care Unit, University Hospital Zurich. To examine the model performance, the cohort was randomly divided into a derivation cohort (60% [n = 329]; training data set) and a validation cohort (40% [n = 219]; test data set). The classification and regression tree prediction algorithm was applied to predict death, functional outcome, and ventriculoperitoneal (VP) shunt dependency. Chi-square automatic interaction detection was applied to predict delayed cerebral infarction on days 1, 3, and 7. RESULTS The overall mortality was 18.4%. The accuracy of the decision tree models was good for survival on day 1 and favorable functional outcome at all time points, with a difference between the training and test data sets of decision trees enables exploration of dependent variables in the context of multiple changing influences over the course of an illness. The decision tree currently generated increases awareness of the early systemic stress response, which is seemingly pertinent for prognostication.

  5. Automated Sleep Stage Scoring by Decision Tree Learning

    National Research Council Canada - National Science Library

    Hanaoka, Masaaki

    2001-01-01

    In this paper we describe a waveform recognition method that extracts characteristic parameters from waveforms and a method of automated sleep stage scoring using decision tree learning that is in...

  6. Decision-Tree Formulation With Order-1 Lateral Execution

    Science.gov (United States)

    James, Mark

    2007-01-01

    A compact symbolic formulation enables mapping of an arbitrarily complex decision tree of a certain type into a highly computationally efficient multidimensional software object. The type of decision trees to which this formulation applies is that known in the art as the Boolean class of balanced decision trees. Parallel lateral slices of an object created by means of this formulation can be executed in constant time, considerably less time than would otherwise be required. Decision trees of various forms are incorporated into almost all large software systems. A decision tree is a way of hierarchically solving a problem, proceeding through a set of true/false responses to a conclusion. By definition, a decision tree has a tree-like structure, wherein each internal node denotes a test on an attribute, each branch from an internal node represents an outcome of a test, and leaf nodes represent classes or class distributions that, in turn, represent possible conclusions. The drawback of decision trees is that execution of them can be computationally expensive (and, hence, time-consuming) because each non-leaf node must be examined to determine whether to progress deeper into a tree structure or to examine an alternative. The present formulation was conceived as an efficient means of representing a decision tree and executing it in as little time as possible. The formulation involves the use of a set of symbolic algorithms to transform a decision tree into a multi-dimensional object, the rank of which equals the number of lateral non-leaf nodes. The tree can then be executed in constant time by means of an order-one table lookup. The sequence of operations performed by the algorithms is summarized as follows: 1. Determination of whether the tree under consideration can be encoded by means of this formulation. 2. Extraction of decision variables. 3. Symbolic optimization of the decision tree to minimize its form. 4. Expansion and transformation of all nested conjunctive
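
    The general idea of trading a tree walk for a single table lookup can be sketched as follows. This is not the symbolic formulation described in the record, whose details are truncated above. It is a brute-force illustration that assumes a small tree over Boolean variables, with a hypothetical tuple encoding and made-up leaf labels.

```python
from itertools import product

# Hypothetical encoding (an assumption for this sketch):
#   a leaf is a string label;
#   an internal node is (variable_index, subtree_if_false, subtree_if_true).

def evaluate(tree, inputs):
    """Walk the tree node by node; cost grows with the depth of the tree."""
    while not isinstance(tree, str):
        var, if_false, if_true = tree
        tree = if_true if inputs[var] else if_false
    return tree

def flatten(tree, num_vars):
    """Precompute the conclusion for every Boolean input combination.

    The table has 2**num_vars entries, so this only suits small variable counts.
    """
    return {bits: evaluate(tree, bits)
            for bits in product((False, True), repeat=num_vars)}

tree = (0, (1, "reject", "review"), (1, "review", "accept"))
table = flatten(tree, 2)

# After flattening, executing the tree is one lookup keyed by the input tuple.
print(table[(True, False)])
```

    The lookup cost no longer depends on tree depth, at the price of a table whose size is exponential in the number of decision variables.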

  7. The decision tree approach to classification

    Science.gov (United States)

    Wu, C.; Landgrebe, D. A.; Swain, P. H.

    1975-01-01

    A class of multistage decision tree classifiers is proposed and studied relative to the classification of multispectral remotely sensed data. The decision tree classifiers are shown to have the potential for improving both the classification accuracy and the computation efficiency. Dimensionality in pattern recognition is discussed and two theorems on the lower bound of logic computation for multiclass classification are derived. The automatic or optimization approach is emphasized. Experimental results on real data are reported, which clearly demonstrate the usefulness of decision tree classifiers.

  8. Prevalence and Determinants of Preterm Birth in Tehran, Iran: A Comparison between Logistic Regression and Decision Tree Methods.

    Science.gov (United States)

    Amini, Payam; Maroufizadeh, Saman; Samani, Reza Omani; Hamidi, Omid; Sepidarkish, Mahdi

    2017-06-01

    Preterm birth (PTB) is a leading cause of neonatal death and the second biggest cause of death in children under five years of age. The objective of this study was to determine the prevalence of PTB and its associated factors using logistic regression and decision tree classification methods. This cross-sectional study was conducted on 4,415 pregnant women in Tehran, Iran, from July 6-21, 2015. Data were collected by a researcher-developed questionnaire through interviews with mothers and review of their medical records. To evaluate the accuracy of the logistic regression and decision tree methods, several indices such as sensitivity, specificity, and the area under the curve were used. The PTB rate was 5.5% in this study. The logistic regression outperformed the decision tree for the classification of PTB based on risk factors. Logistic regression showed that multiple pregnancies, mothers with preeclampsia, and those who conceived with assisted reproductive technology had an increased risk for PTB ( p logistic regression model for the classification of risk groups for PTB.

  9. Minimization of Decision Tree Average Depth for Decision Tables with Many-valued Decisions

    KAUST Repository

    Azad, Mohammad

    2014-09-13

    The paper is devoted to the analysis of greedy algorithms for minimizing the average depth of decision trees for decision tables in which each row is labeled with a set of decisions. The goal is to find one decision from the set of decisions. Comparing with the optimal results obtained from a dynamic programming algorithm, we found that some greedy algorithms produce results close to optimal for the minimization of the average depth of decision trees.

  10. Minimization of Decision Tree Average Depth for Decision Tables with Many-valued Decisions

    KAUST Repository

    Azad, Mohammad; Moshkov, Mikhail

    2014-01-01

    The paper is devoted to the analysis of greedy algorithms for minimizing the average depth of decision trees for decision tables in which each row is labeled with a set of decisions. The goal is to find one decision from the set of decisions. Comparing with the optimal results obtained from a dynamic programming algorithm, we found that some greedy algorithms produce results close to optimal for the minimization of the average depth of decision trees.

  11. Implementation of Data Mining to Analyze Drug Cases Using C4.5 Decision Tree

    Science.gov (United States)

    Wahyuni, Sri

    2018-03-01

    Data mining is the process of finding useful information in large databases. One of the existing techniques in data mining is classification. This study used the decision tree method with the C4.5 algorithm. The decision tree method transforms a large body of facts into a decision tree that represents rules. It is useful for exploring data and for finding hidden relationships between a number of potential input variables and a target variable. A C4.5 decision tree is constructed in several stages: selecting an attribute as the root, creating a branch for each value, and dividing the cases among the branches. These stages are repeated for each branch until all the cases on a branch have the same class. The resulting decision tree yields a set of rules for the cases. In this study the researcher classified data on prisoners at Labuhan Deli prison to identify the factors associated with detainees committing drug offenses. Applying the C4.5 algorithm yielded knowledge that can serve as information for reducing drug offenses. The research found that the most influential factor in detainees committing drug offenses was the address variable.
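
    The construction stages listed in this abstract (select a root attribute, branch on each value, divide the cases, repeat until a branch is pure) can be sketched in a few lines. The sketch selects attributes by gain ratio, the criterion C4.5 is known for, but it omits continuous attributes, missing values, and pruning, so it is a simplified illustration rather than C4.5 itself. The attribute names and the four rows are invented and are not the study's data.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())

def gain_ratio(rows, labels, attr):
    """Information gain of splitting on `attr`, divided by the split information."""
    n = len(rows)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy([row[attr] for row in rows])
    gain = entropy(labels) - remainder
    return gain / split_info if split_info > 0 else 0.0

def build(rows, labels, attrs):
    """Recursive construction: stop when a branch is pure or attributes run out."""
    if len(set(labels)) == 1 or not attrs:
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: gain_ratio(rows, labels, a))
    rest = [a for a in attrs if a != best]
    branches = {}
    for value in sorted(set(row[best] for row in rows)):
        pairs = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        branches[value] = build([r for r, _ in pairs], [l for _, l in pairs], rest)
    return (best, branches)

# Invented toy data: two categorical attributes and a binary class.
rows = [
    {"employed": "no", "region": "north"},
    {"employed": "no", "region": "south"},
    {"employed": "yes", "region": "north"},
    {"employed": "yes", "region": "south"},
]
labels = ["case", "case", "none", "none"]
tree = build(rows, labels, ["employed", "region"])
print(tree)
```

    Each root-to-leaf path of the resulting structure reads as one rule, which is the sense in which the tree "represents rules".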

  12. Algorithms for Decision Tree Construction

    KAUST Repository

    Chikalov, Igor

    2011-01-01

    The study of algorithms for decision tree construction was initiated in the 1960s. The first algorithms are based on the separation heuristic [13, 31], which at each step tries to divide the set of objects as evenly as possible. Later, Garey and Graham [28] showed that such an algorithm may construct decision trees whose average depth is arbitrarily far from the minimum. Hyafil and Rivest [35] proved NP-hardness of the DT problem, that is, constructing a tree with the minimum average depth for a diagnostic problem over a 2-valued information system and uniform probability distribution. Cox et al. [22] showed that for a two-class problem over an information system, even finding the root node attribute for an optimal tree is an NP-hard problem. © Springer-Verlag Berlin Heidelberg 2011.

  13. Shopping intention prediction using decision trees

    Directory of Open Access Journals (Sweden)

    Dario Šebalj

    2017-09-01

    Introduction: Price is considered a neglected element of the marketing mix because of the complexity of price management and customers' sensitivity to price changes. Price changes provoke the fastest customer reactions. Accordingly, the process of making shopping decisions can be very challenging for the customer. Objective: The aim of this paper is to create a model that is able to predict shopping intention and classify respondents into one of two categories, depending on whether they intend to shop or not. Methods: The data sample consists of 305 respondents, who are persons older than 18 years involved in buying groceries for their household. The research was conducted in February 2017. In order to create a model, the decision trees method was used with several of its classification algorithms. Results: All models, except the one that used the RandomTree algorithm, achieved a relatively high classification rate (over 80%). The J48 and RandomForest algorithms gave the highest classification accuracy, 84.75%. Since there is no statistically significant difference between those two algorithms, the authors decided to choose the J48 algorithm and build a decision tree. Conclusions: The value for money and the price level in the store were the most significant variables for classification of shopping intention. A future study plans to compare this model with other data mining techniques, such as neural networks or support vector machines, since these techniques achieved very good accuracy in some previous research in this field.

  14. Improving medical diagnosis reliability using Boosted C5.0 decision tree empowered by Particle Swarm Optimization.

    Science.gov (United States)

    Pashaei, Elnaz; Ozen, Mustafa; Aydin, Nizamettin

    2015-08-01

    Improving the accuracy of supervised classification algorithms in biomedical applications is an active area of research. In this study, we improve the performance of the Particle Swarm Optimization (PSO) combined with C4.5 decision tree (PSO+C4.5) classifier by applying a Boosted C5.0 decision tree as the fitness function. To evaluate the effectiveness of our proposed method, it is implemented on 1 microarray dataset and 5 different medical data sets obtained from UCI machine learning databases. Moreover, the results of the PSO + Boosted C5.0 implementation are compared to eight well-known benchmark classification methods (PSO+C4.5, support vector machine with a Radial Basis Function kernel, Classification And Regression Tree (CART), C4.5 decision tree, C5.0 decision tree, Boosted C5.0 decision tree, Naive Bayes and Weighted K-Nearest Neighbor). A repeated five-fold cross-validation method was used to assess the performance of the classifiers. Experimental results show that our proposed method not only improves the performance of PSO+C4.5 but also obtains higher classification accuracy compared to the other classification methods.

  15. Decision tree and PCA-based fault diagnosis of rotating machinery

    Science.gov (United States)

    Sun, Weixiang; Chen, Jin; Li, Jiaqing

    2007-04-01

    After analysing the flaws of conventional fault diagnosis methods, data mining technology is introduced to the fault diagnosis field, and a new method based on the C4.5 decision tree and principal component analysis (PCA) is proposed. In this method, PCA is used to reduce features after data collection, preprocessing and feature extraction. Then, C4.5 is trained on the samples to generate a decision tree model with diagnosis knowledge. Finally, the tree model is used to make the diagnosis analysis. To validate the proposed method, six kinds of running states (normal or without any defect, unbalance, rotor radial rub, oil whirl, shaft crack, and a simultaneous state of unbalance and radial rub) are simulated on a Bently Rotor Kit RK4 to test the C4.5 and PCA-based method and a back-propagation neural network (BPNN). The result shows that the C4.5 and PCA-based diagnosis method has higher accuracy and needs less training time than BPNN.

  16. Multi-pruning of decision trees for knowledge representation and classification

    KAUST Repository

    Azad, Mohammad

    2016-06-09

    We consider two important questions related to decision trees: first, how to construct a decision tree with a reasonable number of nodes and a reasonable number of misclassifications, and second, how to improve the prediction accuracy of decision trees when they are used as classifiers. We have created a dynamic-programming-based approach for bi-criteria optimization of decision trees relative to the number of nodes and the number of misclassifications. This approach allows us to construct the set of all Pareto optimal points and to derive, for each such point, decision trees with parameters corresponding to that point. Experiments on datasets from the UCI ML Repository show that, very often, we can find a suitable Pareto optimal point and derive a decision tree with a small number of nodes at the expense of a small increase in the number of misclassifications. Based on this approach we have proposed a multi-pruning procedure which constructs decision trees that, as classifiers, often outperform decision trees constructed by CART. © 2015 IEEE.

  17. Multi-pruning of decision trees for knowledge representation and classification

    KAUST Repository

    Azad, Mohammad; Chikalov, Igor; Hussain, Shahid; Moshkov, Mikhail

    2016-01-01

    We consider two important questions related to decision trees: first, how to construct a decision tree with a reasonable number of nodes and a reasonable number of misclassifications, and second, how to improve the prediction accuracy of decision trees when they are used as classifiers. We have created a dynamic programming based approach for bi-criteria optimization of decision trees relative to the number of nodes and the number of misclassifications. This approach allows us to construct the set of all Pareto optimal points and to derive, for each such point, decision trees with parameters corresponding to that point. Experiments on datasets from the UCI ML Repository show that, very often, we can find a suitable Pareto optimal point and derive a decision tree with a small number of nodes at the expense of a small increase in the number of misclassifications. Based on this approach we have proposed a multi-pruning procedure which constructs decision trees that, as classifiers, often outperform decision trees constructed by CART. © 2015 IEEE.

  18. A new approach to enhance the performance of decision tree for classifying gene expression data.

    Science.gov (United States)

    Hassan, Md; Kotagiri, Ramamohanarao

    2013-12-20

    Gene expression data classification is a challenging task due to the large dimensionality and very small number of samples. The decision tree is one of the popular machine learning approaches to address such classification problems. However, the existing decision tree algorithms use a single gene feature at each node to split the data into its child nodes and hence might suffer from poor performance, especially when classifying gene expression datasets. By using a new decision tree algorithm where each node of the tree consists of more than one gene, we enhance the classification performance of traditional decision tree classifiers. Our method selects suitable genes that are combined using a linear function to form a derived composite feature. To determine the structure of the tree we use the area under the Receiver Operating Characteristic curve (AUC). Experimental analysis demonstrates higher classification accuracy using the new decision tree compared to the other existing decision trees in the literature. We experimentally compare the effect of our scheme against other well-known decision tree techniques. Experiments show that our algorithm can substantially boost the classification performance of the decision tree.

  19. Induction of Ordinal Decision Trees

    NARCIS (Netherlands)

    J.C. Bioch (Cor); V. Popova (Viara)

    2003-01-01

    This paper focuses on the problem of monotone decision trees from the point of view of the multicriteria decision aid methodology (MCDA). By taking into account the preferences of the decision maker, an attempt is made to bring closer similar research within machine learning and MCDA.

  20. Classification of premalignant pancreatic cancer mass-spectrometry data using decision tree ensembles

    Directory of Open Access Journals (Sweden)

    Wong G William

    2008-06-01

    Background: Pancreatic cancer is the fourth leading cause of cancer death in the United States. Consequently, identification of clinically relevant biomarkers for the early detection of this cancer type is urgently needed. In recent years, proteomics profiling techniques combined with various data analysis methods have been successfully used to gain critical insights into processes and mechanisms underlying pathologic conditions, particularly as they relate to cancer. However, the high dimensionality of proteomics data combined with their relatively small sample sizes poses a significant challenge to current data mining methodology, where many of the standard methods cannot be applied directly. Here, we propose a novel methodological framework using machine learning, in which decision-tree-based classifier ensembles coupled with feature selection methods are applied to proteomics data generated from premalignant pancreatic cancer. Results: This study explores the utility of three different feature selection schemas (Student t test, Wilcoxon rank sum test and genetic algorithm) to reduce the high dimensionality of a pancreatic cancer proteomic dataset. Using the top features selected from each method, we compared the prediction performances of a single decision tree algorithm, C4.5, with six different decision-tree-based classifier ensembles (Random forest, Stacked generalization, Bagging, Adaboost, Logitboost and Multiboost). We show that ensemble classifiers always outperform a single decision tree classifier in having greater accuracies and smaller prediction errors when applied to a pancreatic cancer proteomics dataset. Conclusion: In our cross validation framework, classifier ensembles generally have better classification accuracies compared to that of a single decision tree when applied to a pancreatic cancer proteomic dataset, thus suggesting their utility in future proteomics data analysis. Additionally, the use of feature selection

  1. Greedy heuristics for minimization of number of terminal nodes in decision trees

    KAUST Repository

    Hussain, Shahid

    2014-10-01

    This paper describes, in detail, several greedy heuristics for construction of decision trees. We study the number of terminal nodes of decision trees, which is closely related with the cardinality of the set of rules corresponding to the tree. We compare these heuristics empirically for two different types of datasets (datasets acquired from UCI ML Repository and randomly generated data) as well as compare with the optimal results obtained using dynamic programming method.

  2. Greedy heuristics for minimization of number of terminal nodes in decision trees

    KAUST Repository

    Hussain, Shahid

    2014-01-01

    This paper describes, in detail, several greedy heuristics for construction of decision trees. We study the number of terminal nodes of decision trees, which is closely related with the cardinality of the set of rules corresponding to the tree. We compare these heuristics empirically for two different types of datasets (datasets acquired from UCI ML Repository and randomly generated data) as well as compare with the optimal results obtained using dynamic programming method.

  3. A method and application study on holistic decision tree for human reliability analysis in nuclear power plant

    International Nuclear Information System (INIS)

    Sun Feng; Zhong Shan; Wu Zhiyu

    2008-01-01

    The paper introduces the Holistic Decision Tree (HDT) method, a human reliability analysis method mainly used in nuclear power plant safety assessment, and describes how to apply it. The focus is primarily on providing the basic framework and some background of the HDT method and the steps to perform it. Influence factors and quality descriptors were derived from interviews with operators at Qinshan Nuclear Power Plant, and HDT analyses were performed for SGTR and SLOCA based on this information. The HDT model can use a graphic tree structure to indicate that the error rate is a function of influence factors. The HDT method is capable of dealing with the uncertainty in HRA, and it is reliable and practical. (authors)

  4. Use of fault and decision tree analyses to protect against industrial sabotage

    International Nuclear Information System (INIS)

    Fullwood, R.R.; Erdmann, R.C.

    1975-01-01

    Fault tree and decision tree analyses provide systematic bases for evaluation of safety systems and procedures. Heuristically, this paper shows applications of these methods for industrial sabotage analysis at a reprocessing plant. Fault trees are constructed by "leak path" analysis for completeness through path inventory. The escape fault tree is readily developed by this method and, using the reciprocal character of the trees, the attack fault tree is constructed. After construction, the events on the fault tree are corrected for their nonreciprocal character. The fault trees are algebraically solved and the protection that is afforded is ranked by the number of barriers that must be penetrated. No attempt is made to assess the barrier penetration probabilities or penetration time duration. Event trees are useful for dynamic plant protection analysis through their time-sequencing character. To illustrate their usefulness, a simple attack scenario is devised and analyzed with an event tree. Two saboteur success paths and 21 failure paths are found. This example clearly shows the usefulness of event trees for concisely presenting the time sequencing of key decision points. However, event trees have the disadvantage of being scenario dependent, therefore requiring a separate event tree for each scenario.

  5. Using histograms to introduce randomization in the generation of ensembles of decision trees

    Science.gov (United States)

    Kamath, Chandrika; Cantu-Paz, Erick; Littau, David

    2005-02-22

    A system for decision tree ensembles that includes a module to read the data, a module to create a histogram, a module to evaluate a potential split according to some criterion using the histogram, a module to select a split point randomly in an interval around the best split, a module to split the data, and a module to combine multiple decision trees in ensembles. The decision tree method includes the steps of reading the data; creating a histogram; evaluating a potential split according to some criterion using the histogram, selecting a split point randomly in an interval around the best split, splitting the data, and combining multiple decision trees in ensembles.
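
    The steps listed in this abstract (build a histogram, score candidate splits from the histogram, then draw the actual split point at random near the best one) can be sketched as follows. This is only an illustration under stated assumptions, not the patented system: the Gini criterion, equal-width bins, the one-bin-wide sampling interval, and the name `randomized_histogram_split` are all choices made here for the example.

```python
import random
from typing import Hashable, List, Optional, Sequence, Tuple


def gini(counts: Sequence[int]) -> float:
    """Gini impurity of a vector of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)


def randomized_histogram_split(
    values: Sequence[float],
    labels: Sequence[Hashable],
    bins: int = 8,
    rng: Optional[random.Random] = None,
) -> Tuple[float, float]:
    """Return (best bin edge, randomly perturbed split point) for one feature."""
    rng = rng or random.Random()
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("feature is constant; no split is possible")
    width = (hi - lo) / bins
    classes = sorted(set(labels))
    index = {c: i for i, c in enumerate(classes)}

    # Histogram: class counts per equal-width bin.
    hist: List[List[int]] = [[0] * len(classes) for _ in range(bins)]
    for v, y in zip(values, labels):
        b = min(int((v - lo) / width), bins - 1)  # clamp the maximum value
        hist[b][index[y]] += 1

    totals = [sum(h[k] for h in hist) for k in range(len(classes))]
    n = len(values)
    left = [0] * len(classes)
    best_edge, best_score = lo + width, float("inf")
    # Evaluate a candidate split at the right edge of every bin but the last,
    # using only the histogram counts (assumed criterion: weighted Gini).
    for b in range(bins - 1):
        left = [l + h for l, h in zip(left, hist[b])]
        right = [t - l for t, l in zip(totals, left)]
        score = (sum(left) * gini(left) + sum(right) * gini(right)) / n
        if score < best_score:
            best_edge, best_score = lo + (b + 1) * width, score

    # Randomization step: sample the split uniformly in an interval around
    # the best edge (assumed here to be one bin width, centred on the edge).
    split = rng.uniform(best_edge - width / 2, best_edge + width / 2)
    return best_edge, split


values = [0, 1, 2, 3, 4, 5, 6, 7]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
best_edge, split = randomized_histogram_split(values, labels, rng=random.Random(0))
print(best_edge, split)
```

    Because each tree in an ensemble draws a different split point near the same best edge, the trees differ from one another without the cost of sorting the feature values at every node.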

  6. Decision and Inhibitory Trees for Decision Tables with Many-Valued Decisions

    KAUST Repository

    Azad, Mohammad

    2018-01-01

    Decision trees are one of the most commonly used tools in decision analysis, knowledge representation, machine learning, etc., for its simplicity and interpretability. We consider an extension of dynamic programming approach to process the whole set

  7. Automatic design of decision-tree induction algorithms tailored to flexible-receptor docking data.

    Science.gov (United States)

    Barros, Rodrigo C; Winck, Ana T; Machado, Karina S; Basgalupp, Márcio P; de Carvalho, André C P L F; Ruiz, Duncan D; de Souza, Osmar Norberto

    2012-11-21

    This paper addresses the prediction of the free energy of binding of a drug candidate with enzyme InhA associated with Mycobacterium tuberculosis. This problem is found within rational drug design, where interactions between drug candidates and target proteins are verified through molecular docking simulations. In this application, it is important not only to correctly predict the free energy of binding, but also to provide a comprehensible model that could be validated by a domain specialist. Decision-tree induction algorithms have been successfully used in drug-design related applications, especially considering that decision trees are simple to understand, interpret, and validate. There are several decision-tree induction algorithms available for general use, but each one has a bias that makes it more suitable for a particular data distribution. In this article, we propose and investigate the automatic design of decision-tree induction algorithms tailored to particular drug-enzyme binding data sets. We investigate the performance of our new method for evaluating binding conformations of different drug candidates to InhA, and we analyze our findings with respect to decision tree accuracy, comprehensibility, and biological relevance. The empirical analysis indicates that our method is capable of automatically generating decision-tree induction algorithms that significantly outperform the traditional C4.5 algorithm with respect to both accuracy and comprehensibility. In addition, we provide the biological interpretation of the rules generated by our approach, reinforcing the importance of comprehensible predictive models in this particular bioinformatics application. We conclude that automatically designing a decision-tree algorithm tailored to molecular docking data is a promising alternative for the prediction of the free energy from the binding of a drug candidate with a flexible receptor.

  8. Univariate decision tree induction using maximum margin classification

    OpenAIRE

    Yıldız, Olcay Taner

    2012-01-01

    In many pattern recognition applications, decision trees are used first due to their simplicity and easily interpretable nature. In this paper, we propose a new decision tree learning algorithm called univariate margin tree where, for each continuous attribute, the best split is found using convex optimization. Our simulation results on 47 data sets show that the novel margin tree classifier performs at least as well as C4.5 and linear discriminant tree (LDT) with a similar time complexity. F...

  9. A fuzzy decision tree method for fault classification in the steam generator of a pressurized water reactor

    International Nuclear Information System (INIS)

    Zio, Enrico; Baraldi, Piero; Popescu, Irina Crenguta

    2009-01-01

    This paper extends a method previously introduced by the authors for building a transparent fault classification algorithm by combining fuzzy clustering, fuzzy logic and decision tree techniques. The baseline method transforms an opaque, fuzzy clustering-based classification model into a fuzzy logic inference model based on linguistic rules which can be represented by a decision tree formalism. The classification model thereby obtained is transparent in that it allows direct interpretation and inspection of the model. An extension in the procedure for the development of the fuzzy logic inference model is introduced to allow the treatment of more complicated cases, e.g. split and overlapping clusters. The corresponding computational tool relies on a number of parameters which can be tuned by the user to find the best compromise between the transparency of the classification process and its efficiency. A numerical application is presented with regard to fault classification in the Steam Generator of a Pressurized Water Reactor.

  10. Development of a New Decision Tree to Rapidly Screen Chemical Estrogenic Activities of Xenopus laevis.

    Science.gov (United States)

    Wang, Ting; Li, Weiying; Zheng, Xiaofeng; Lin, Zhifen; Kong, Deyang

    2014-02-01

    Over the past decades, an increasing number of studies have examined the estrogenic activities of environmental pollutants on amphibians, and many determination methods have been proposed. However, these determination methods are time-consuming and expensive, and a rapid and simple method to screen and test chemicals for estrogenic activity in amphibians is therefore imperative. Herein is proposed a new decision tree, formulated not only with physicochemical parameters but also with a biological parameter, that was successfully used to screen estrogenic activities of chemicals on amphibians. The biological parameter, the CDOCKER interaction energy (Ebinding) between chemicals and the target proteins, was calculated by molecular docking, and it was used to revise the decision tree formulated by Hong with only physicochemical parameters for screening estrogenic activity of chemicals in rat. According to the correlation between Ebinding of rat and Xenopus laevis, a new decision tree for estrogenic activities in Xenopus laevis is finally proposed. It was then validated using 8 randomly selected chemicals to which Xenopus laevis can frequently be exposed, and the agreement between the results from the new decision tree and those from experiments is generally satisfactory. Consequently, the new decision tree can be used to screen the estrogenic activities of chemicals, and the combined use of Ebinding and classical physicochemical parameters can greatly improve Hong's decision tree. Copyright © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  11. [Comparison of Discriminant Analysis and Decision Trees for the Detection of Subclinical Keratoconus].

    Science.gov (United States)

    Kleinhans, Sonja; Herrmann, Eva; Kohnen, Thomas; Bühren, Jens

    2017-08-15

    Background: Iatrogenic keratectasia is one of the most dreaded complications of refractive surgery. In most cases, keratectasia develops after refractive surgery of eyes suffering from subclinical stages of keratoconus with few or no signs. Unfortunately, there has been no reliable procedure for the early detection of keratoconus. In this study, we used binary decision trees (recursive partitioning) to assess their suitability for discrimination between normal eyes and eyes with subclinical keratoconus. Patients and Methods: The method of decision tree analysis was compared with discriminant analysis, which has shown good results in previous studies. Input data were 32 eyes of 32 patients with newly diagnosed keratoconus in the contralateral eye and preoperative data of 10 eyes of 5 patients with keratectasia after laser in-situ keratomileusis (LASIK). The control group was made up of 245 normal eyes after LASIK and 12-month follow-up without any signs of iatrogenic keratectasia. Results: Decision trees gave better accuracy and specificity than did discriminant analysis. The sensitivity of decision trees was lower than the sensitivity of discriminant analysis. Conclusion: On the basis of the patient population of this study, decision trees did not prove to be superior to linear discriminant analysis for the detection of subclinical keratoconus. Georg Thieme Verlag KG Stuttgart · New York.

  12. A tool for study of optimal decision trees

    KAUST Repository

    Alkhalid, Abdulaziz

    2010-01-01

    The paper describes a tool which, for relatively small decision tables, allows us to perform consecutive optimization of decision trees relative to various complexity measures such as number of nodes, average depth, and depth, and to find parameters and the number of optimal decision trees. © 2010 Springer-Verlag Berlin Heidelberg.

  13. Extensions of Dynamic Programming: Decision Trees, Combinatorial Optimization, and Data Mining

    KAUST Repository

    Hussain, Shahid

    2016-01-01

    This thesis is devoted to the development of extensions of dynamic programming to the study of decision trees. The considered extensions allow us to make multi-stage optimization of decision trees relative to a sequence of cost functions, to count the number of optimal trees, and to study relationships: cost vs cost and cost vs uncertainty for decision trees by construction of the set of Pareto-optimal points for the corresponding bi-criteria optimization problem. The applications include study of totally optimal (simultaneously optimal relative to a number of cost functions) decision trees for Boolean functions, improvement of bounds on complexity of decision trees for diagnosis of circuits, study of time and memory trade-off for corner point detection, study of decision rules derived from decision trees, creation of new procedure (multi-pruning) for construction of classifiers, and comparison of heuristics for decision tree construction. Part of these extensions (multi-stage optimization) was generalized to well-known combinatorial optimization problems: matrix chain multiplication, binary search trees, global sequence alignment, and optimal paths in directed graphs.

  14. Extensions of Dynamic Programming: Decision Trees, Combinatorial Optimization, and Data Mining

    KAUST Repository

    Hussain, Shahid

    2016-07-10

    This thesis is devoted to the development of extensions of dynamic programming to the study of decision trees. The considered extensions allow us to make multi-stage optimization of decision trees relative to a sequence of cost functions, to count the number of optimal trees, and to study relationships: cost vs cost and cost vs uncertainty for decision trees by construction of the set of Pareto-optimal points for the corresponding bi-criteria optimization problem. The applications include study of totally optimal (simultaneously optimal relative to a number of cost functions) decision trees for Boolean functions, improvement of bounds on complexity of decision trees for diagnosis of circuits, study of time and memory trade-off for corner point detection, study of decision rules derived from decision trees, creation of new procedure (multi-pruning) for construction of classifiers, and comparison of heuristics for decision tree construction. Part of these extensions (multi-stage optimization) was generalized to well-known combinatorial optimization problems: matrix chain multiplication, binary search trees, global sequence alignment, and optimal paths in directed graphs.

  15. Decision-Tree Program

    Science.gov (United States)

    Buntine, Wray

    1994-01-01

    IND computer program introduces Bayesian and Markov/maximum-likelihood (MML) methods and more-sophisticated methods of searching in growing trees. Produces more-accurate class-probability estimates important in applications like diagnosis. Provides range of features and styles with convenience for casual user, fine-tuning for advanced user or for those interested in research. Consists of four basic kinds of routines: data-manipulation, tree-generation, tree-testing, and tree-display. Written in C language.

  16. Comparison of Greedy Algorithms for Decision Tree Optimization

    KAUST Repository

    Alkhalid, Abdulaziz

    2013-01-01

    This chapter is devoted to the study of 16 types of greedy algorithms for decision tree construction. The dynamic programming approach is used for construction of optimal decision trees. Optimization is performed relative to minimal values of average depth, depth, number of nodes, number of terminal nodes, and number of nonterminal nodes of decision trees. We compare average depth, depth, number of nodes, number of terminal nodes and number of nonterminal nodes of constructed trees with minimum values of the considered parameters obtained based on a dynamic programming approach. We report experiments performed on data sets from UCI ML Repository and randomly generated binary decision tables. As a result, for depth, average depth, and number of nodes we propose a number of good heuristics. © Springer-Verlag Berlin Heidelberg 2013.

  17. Multivariate decision tree designing for the classification of multi-jet topologies in e+e- collisions

    CERN Document Server

    Mjahed, M

    2002-01-01

    The binary decision tree method is used to discriminate between several multi-jet topologies in e+e- collisions. Instead of the univariate process usually used, a new design procedure for constructing multivariate decision trees is proposed. The segmentation is obtained by considering some feature functions, where linear and non-linear discriminant functions and a minimal distance method are used. The classification focuses on ALEPH simulated events with multi-jet topologies. Compared to a standard univariate tree, the multivariate decision trees offer significantly better performance.

  18. Comparison of Greedy Algorithms for Decision Tree Optimization

    KAUST Repository

    Alkhalid, Abdulaziz; Chikalov, Igor; Moshkov, Mikhail

    2013-01-01

    This chapter is devoted to the study of 16 types of greedy algorithms for decision tree construction. The dynamic programming approach is used for construction of optimal decision trees. Optimization is performed relative to minimal values

  19. Bounds on Average Time Complexity of Decision Trees

    KAUST Repository

    Chikalov, Igor

    2011-01-01

    In this chapter, bounds on the average depth and the average weighted depth of decision trees are considered. Similar problems are studied in search theory [1], coding theory [77], design and analysis of algorithms (e.g., sorting) [38]. For any diagnostic problem, the minimum average depth of a decision tree is bounded from below by the entropy of the probability distribution (with a multiplier 1/log_2 k for a problem over a k-valued information system). Among diagnostic problems, the problems with a complete set of attributes have the lowest minimum average depth of decision trees (e.g., the problem of building an optimal prefix code [1] and a blood test study under the assumption that exactly one patient is ill [23]). For such problems, the minimum average depth of a decision tree exceeds the lower bound by at most one. The minimum average depth reaches the maximum on the problems in which each attribute is "indispensable" [44] (e.g., a diagnostic problem with n attributes and k^n pairwise different rows in the decision table and the problem of implementing the modulo 2 summation function). These problems have the minimum average depth of a decision tree equal to the number of attributes in the problem description. © Springer-Verlag Berlin Heidelberg 2011.
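
    The entropy bound mentioned in this abstract can be checked numerically for the optimal prefix code example in the binary case (k = 2, so the multiplier 1/log_2 k equals 1). The sketch below is an illustration chosen here, not taken from the chapter: it uses Huffman's algorithm to obtain the minimum average codeword length, which plays the role of the minimum average depth, and the probability vector is hypothetical.

```python
import heapq
import math
from typing import Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy in bits of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def huffman_average_length(probs: Sequence[float]) -> float:
    """Average codeword length of an optimal binary prefix code.

    Every merge of two subtrees pushes all leaves below them one level
    deeper, so the average length is the sum of the merged weights.
    """
    heap = list(probs)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        total += a + b
        heapq.heappush(heap, a + b)
    return total


# Hypothetical distribution over four outcomes.
probs = [0.4, 0.3, 0.2, 0.1]
h = entropy(probs)
avg_depth = huffman_average_length(probs)
print(f"entropy = {h:.4f}, minimum average depth = {avg_depth:.4f}")
```

    For this distribution the average depth is 1.9, which lies between the entropy (about 1.85) and the entropy plus one, as the stated bounds require.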

  20. Totally optimal decision trees for Boolean functions

    KAUST Repository

    Chikalov, Igor; Hussain, Shahid; Moshkov, Mikhail

    2016-01-01

    We study decision trees which are totally optimal relative to different sets of complexity parameters for Boolean functions. A totally optimal tree is an optimal tree relative to each parameter from the set simultaneously. We consider the parameters

  1. Comparison of greedy algorithms for α-decision tree construction

    KAUST Repository

    Alkhalid, Abdulaziz; Chikalov, Igor; Moshkov, Mikhail

    2011-01-01

    A comparison among different heuristics that are used by greedy algorithms which construct approximate decision trees (α-decision trees) is presented. The comparison is conducted using decision tables based on 24 data sets from UCI Machine Learning Repository [2]. Complexity of decision trees is estimated relative to several cost functions: depth, average depth, number of nodes, number of nonterminal nodes, and number of terminal nodes. Costs of trees built by greedy algorithms are compared with minimum costs calculated by an algorithm based on dynamic programming. The results of experiments assign to each cost function a set of potentially good heuristics that minimize it. © 2011 Springer-Verlag.

  2. Learning from examples - Generation and evaluation of decision trees for software resource analysis

    Science.gov (United States)

    Selby, Richard W.; Porter, Adam A.

    1988-01-01

    A general solution method for the automatic generation of decision (or classification) trees is investigated. The approach is to provide insights through in-depth empirical characterization and evaluation of decision trees for software resource data analysis. The trees identify classes of objects (software modules) that had high development effort. Sixteen software systems ranging from 3,000 to 112,000 source lines were selected for analysis from a NASA production environment. The collection and analysis of 74 attributes (or metrics), for over 4,700 objects, captured information about the development effort, faults, changes, design style, and implementation style. A total of 9,600 decision trees were automatically generated and evaluated. The trees correctly identified 79.3 percent of the software modules that had high development effort or faults, and the trees generated from the best parameter combinations correctly identified 88.4 percent of the modules on the average.

  3. Identification of Potential Sources of Mercury (Hg) in Farmland Soil Using a Decision Tree Method in China.

    Science.gov (United States)

    Zhong, Taiyang; Chen, Dongmei; Zhang, Xiuying

    2016-11-09

    Identification of the sources of soil mercury (Hg) on the provincial scale is helpful for enacting effective policies to prevent further contamination and for taking reclamation measures. The natural and anthropogenic sources of Hg in Chinese farmland soil and their contributions were identified based on a decision tree method. The results showed that the concentrations of Hg in parent materials were most strongly associated with the general spatial distribution pattern of Hg concentration on a provincial scale. The decision tree analysis achieved a total accuracy of 89.70% in simulating the influence of human activities on the additions of Hg in farmland soil. Human activities (for example, the production of coke, application of fertilizers, discharge of wastewater, discharge of solid waste, and the production of non-ferrous metals) were the main external sources of a large amount of Hg in the farmland soil.

  4. The Decision Tree: A Tool for Achieving Behavioral Change.

    Science.gov (United States)

    Saren, Dru

    1999-01-01

    Presents a "Decision Tree" process for structuring team decision making and problem solving about specific student behavioral goals. The Decision Tree involves a sequence of questions/decisions that can be answered in "yes/no" terms. Questions address reasonableness of the goal, time factors, importance of the goal, responsibilities, safety,…

  5. Applied Swarm-based medicine: collecting decision trees for patterns of algorithms analysis.

    Science.gov (United States)

    Panje, Cédric M; Glatzer, Markus; von Rappard, Joscha; Rothermundt, Christian; Hundsberger, Thomas; Zumstein, Valentin; Plasswilm, Ludwig; Putora, Paul Martin

    2017-08-16

    The objective consensus methodology has recently been applied in consensus finding in several studies on medical decision-making among clinical experts or guidelines. The main advantages of this method are an automated analysis and comparison of treatment algorithms of the participating centers which can be performed anonymously. Based on the experience from completed consensus analyses, the main steps for the successful implementation of the objective consensus methodology were identified and discussed among the main investigators. The following steps for the successful collection and conversion of decision trees were identified and defined in detail: problem definition, population selection, draft input collection, tree conversion, criteria adaptation, problem re-evaluation, results distribution and refinement, tree finalisation, and analysis. This manuscript provides information on the main steps for successful collection of decision trees and summarizes important aspects at each point of the analysis.

  6. 'Misclassification error' greedy heuristic to construct decision trees for inconsistent decision tables

    KAUST Repository

    Azad, Mohammad; Moshkov, Mikhail

    2014-01-01

    A greedy algorithm is presented in this paper to construct decision trees for three different approaches (many-valued decision, most common decision, and generalized decision) in order to handle the inconsistency of multiple decisions in a decision table. In this algorithm, a greedy heuristic ‘misclassification error’ is used, which runs faster and, for some cost functions, gives better results than the ‘number of boundary subtables’ heuristic from the literature. Therefore, it can be used for larger data sets and does not require a huge amount of memory. Experimental results of depth, average depth and number of nodes of decision trees constructed by this algorithm are compared in the framework of each of the three approaches.
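
    A greedy step driven by a misclassification error score can be sketched as below. The abstract does not define the heuristic, so the definition used here is an assumption: the error of a subtable is the number of rows whose decision differs from the subtable's most common decision, and the attribute chosen is the one minimizing the summed error over the subtables it induces. The function names and the toy table are hypothetical, and only the single-decision case is shown.

```python
from collections import Counter, defaultdict
from typing import Hashable, List, Sequence, Tuple


def misclassification_error(decisions: Sequence[Hashable]) -> int:
    """Rows that disagree with the most common decision (assumed definition)."""
    if not decisions:
        return 0
    most_common_count = Counter(decisions).most_common(1)[0][1]
    return len(decisions) - most_common_count


def best_attribute(
    rows: Sequence[Sequence[Hashable]], decisions: Sequence[Hashable]
) -> Tuple[int, List[int]]:
    """Pick the attribute whose split leaves the smallest total error."""
    n_attrs = len(rows[0])
    scores: List[int] = []
    for a in range(n_attrs):
        # Partition the decisions by the value of attribute a.
        groups = defaultdict(list)
        for row, d in zip(rows, decisions):
            groups[row[a]].append(d)
        scores.append(sum(misclassification_error(g) for g in groups.values()))
    best = min(range(n_attrs), key=scores.__getitem__)
    return best, scores


# Toy decision table: attribute 0 determines the decision, attribute 1 does not.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
decisions = ["a", "a", "b", "b"]
best, scores = best_attribute(rows, decisions)
print(best, scores)
```

    Computing the score needs only one pass over the rows per attribute and a few counters, which is consistent with the abstract's remark that the heuristic is fast and light on memory.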

  7. Modifiable risk factors predicting major depressive disorder at four year follow-up: a decision tree approach

    Directory of Open Access Journals (Sweden)

    Christensen Helen

    2009-11-01

    Background: Relative to physical health conditions such as cardiovascular disease, little is known about risk factors that predict the prevalence of depression. The present study investigates the expected effects of a reduction of these risks over time, using the decision tree method favoured in assessing cardiovascular disease risk. Methods: The PATH through Life cohort was used for the study, comprising 2,105 20-24 year olds, 2,323 40-44 year olds and 2,177 60-64 year olds sampled from the community in the Canberra region, Australia. A decision tree methodology was used to predict the presence of major depressive disorder after four years of follow-up. The decision tree was compared with a logistic regression analysis using ROC curves. Results: The decision tree was found to distinguish and delineate a wide range of risk profiles. Previous depressive symptoms were most highly predictive of depression after four years; however, modifiable risk factors such as substance use and employment status played significant roles in assessing the risk of depression. The decision tree was found to have better sensitivity and specificity than a logistic regression using identical predictors. Conclusion: The decision tree method was useful in assessing the risk of major depressive disorder over four years. Application of the model to the development of a predictive tool for tailored interventions is discussed.

  8. Extensions of dynamic programming as a new tool for decision tree optimization

    KAUST Repository

    Alkhalid, Abdulaziz

    2013-01-01

    The chapter is devoted to the consideration of two types of decision trees for a given decision table: α-decision trees (the parameter α controls the accuracy of tree) and decision trees (which allow arbitrary level of accuracy). We study possibilities of sequential optimization of α-decision trees relative to different cost functions such as depth, average depth, and number of nodes. For decision trees, we analyze relationships between depth and number of misclassifications. We also discuss results of computer experiments with some datasets from UCI ML Repository. ©Springer-Verlag Berlin Heidelberg 2013.

  9. Decision trees with minimum average depth for sorting eight elements

    KAUST Repository

    AbouEisha, Hassan M.

    2015-11-19

    We prove that the minimum average depth of a decision tree for sorting 8 pairwise different elements is equal to 620160/8!. We show also that each decision tree for sorting 8 elements, which has minimum average depth (the number of such trees is approximately equal to 8.548×10^326365), has also minimum depth. Both problems were considered by Knuth (1998). To obtain these results, we use tools based on extensions of dynamic programming which allow us to make sequential optimization of decision trees relative to depth and average depth, and to count the number of decision trees with minimum average depth.
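
    The stated value 620160/8! can be checked against the standard information-theoretic bounds with a short calculation. This is only an arithmetic sketch using the figures from the abstract; the variable names are illustrative.

```python
from fractions import Fraction
from math import ceil, factorial, log2

leaves = factorial(8)                 # 8! = 40320 possible orderings, one leaf each
avg_depth = Fraction(620160, leaves)  # minimum average depth stated in the abstract
lower_bound = log2(leaves)            # a binary tree with N leaves has average depth >= log2(N)
min_depth_bound = ceil(lower_bound)   # lower bound on the worst-case number of comparisons

print(avg_depth, float(avg_depth))    # exact fraction and its decimal value
print(lower_bound, min_depth_bound)
```

    The exact fraction reduces to 323/21 (about 15.38), slightly above the bound log2(8!) of about 15.30.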

  10. Extensions of dynamic programming as a new tool for decision tree optimization

    KAUST Repository

    Alkhalid, Abdulaziz; Chikalov, Igor; Hussain, Shahid; Moshkov, Mikhail

    2013-01-01

    The chapter is devoted to the consideration of two types of decision trees for a given decision table: α-decision trees (the parameter α controls the accuracy of tree) and decision trees (which allow arbitrary level of accuracy). We study

  11. IND - THE IND DECISION TREE PACKAGE

    Science.gov (United States)

    Buntine, W.

    1994-01-01

    A common approach to supervised classification and prediction in artificial intelligence and statistical pattern recognition is the use of decision trees. A tree is "grown" from data using a recursive partitioning algorithm to create a tree which has good prediction of classes on new data. Standard algorithms are CART (by Breiman, Friedman, Olshen and Stone) and ID3 and its successor C4 (by Quinlan). As well as reimplementing parts of these algorithms and offering experimental control suites, IND also introduces Bayesian and MML methods and more sophisticated search in growing trees. These produce more accurate class probability estimates that are important in applications like diagnosis. IND is applicable to most data sets consisting of independent instances, each described by a fixed length vector of attribute values. An attribute value may be a number, one of a set of attribute specific symbols, or it may be omitted. One of the attributes is delegated the "target" and IND grows trees to predict the target. Prediction can then be done on new data or the decision tree printed out for inspection. IND provides a range of features and styles with convenience for the casual user as well as fine-tuning for the advanced user or those interested in research. IND can be operated in a CART-like mode (but without regression trees, surrogate splits or multivariate splits), and in a mode like the early version of C4. Advanced features allow more extensive search, interactive control and display of tree growing, and Bayesian and MML algorithms for tree pruning and smoothing. These often produce more accurate class probability estimates at the leaves. IND also comes with a comprehensive experimental control suite. IND consists of four basic kinds of routines: data manipulation routines, tree generation routines, tree testing routines, and tree display routines. The data manipulation routines are used to partition a single large data set into smaller training and test sets.
The

  12. USING PRECEDENTS FOR REDUCTION OF DECISION TREE BY GRAPH SEARCH

    Directory of Open Access Journals (Sweden)

    I. A. Bessmertny

    2015-01-01

    The paper considers the problem of organizing mutual payments between business entities by means of clearing, which is solved by searching for paths in a graph. To reduce the complexity of the decision tree, a method of precedents is proposed that saves intermediate solutions while moving along the decision tree. An algorithm and an example are presented, demonstrating solution complexity close to linear. Tests carried out in the civil aviation settlement system demonstrate an approximately 30 percent reduction in real money transfers. The proposed algorithm is also planned to be implemented in other clearing organizations of the Russian Federation.

  13. Using Decision Trees to Detect and Isolate Simulated Leaks in the J-2X Rocket Engine

    Science.gov (United States)

    Schwabacher, Mark A.; Aguilar, Robert; Figueroa, Fernando F.

    2009-01-01

    The goal of this work was to use data-driven methods to automatically detect and isolate faults in the J-2X rocket engine. It was decided to use decision trees, since they tend to be easier to interpret than other data-driven methods. The decision tree algorithm automatically "learns" a decision tree by performing a search through the space of possible decision trees to find one that fits the training data. The particular decision tree algorithm used is known as C4.5. Simulated J-2X data from a high-fidelity simulator developed at Pratt & Whitney Rocketdyne and known as the Detailed Real-Time Model (DRTM) was used to "train" and test the decision tree. Fifty-six DRTM simulations were performed for this purpose, with different leak sizes, different leak locations, and different times of leak onset. To make the simulations as realistic as possible, they included simulated sensor noise, and included a gradual degradation in both fuel and oxidizer turbine efficiency. A decision tree was trained using 11 of these simulations, and tested using the remaining 45 simulations. In the training phase, the C4.5 algorithm was provided with labeled examples of data from nominal operation and data including leaks in each leak location. From the data, it "learned" a decision tree that can classify unseen data as having no leak or having a leak in one of the five leak locations. In the test phase, the decision tree produced very low false alarm rates and low missed detection rates on the unseen data. It had very good fault isolation rates for three of the five simulated leak locations, but it tended to confuse the remaining two locations, perhaps because a large leak at one of these two locations can look very similar to a small leak at the other location.
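
    C4.5, the algorithm named in this abstract, selects splits by gain ratio. The sketch below shows that criterion for a categorical attribute only; the discretized sensor readings and labels are hypothetical and are not taken from the J-2X study, and C4.5's threshold search for continuous attributes is omitted.

```python
from collections import Counter
from math import log2

def entropy(items):
    """Shannon entropy (in bits) of the empirical distribution of `items`."""
    n = len(items)
    return -sum((c / n) * log2(c / n) for c in Counter(items).values())

def gain_ratio(values, labels):
    """Information gain of a categorical attribute divided by its split information."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Expected entropy of the labels after splitting on the attribute.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    split_info = entropy(values)
    return gain / split_info if split_info > 0 else 0.0

# Hypothetical discretized sensor reading and fault labels.
pressure = ["high", "high", "low", "low"]
state = ["leak", "leak", "nominal", "nominal"]
print(gain_ratio(pressure, state))
```

    An attribute that separates the classes perfectly with a balanced two-way split, as in this toy data, gets a gain ratio of 1.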

  14. Metric Sex Determination of the Human Coxal Bone on a Virtual Sample using Decision Trees.

    Science.gov (United States)

    Savall, Frédéric; Faruch-Bilfeld, Marie; Dedouit, Fabrice; Sans, Nicolas; Rousseau, Hervé; Rougé, Daniel; Telmon, Norbert

    2015-11-01

    Decision trees provide an alternative to multivariate discriminant analysis, which is still the most commonly used in anthropometric studies. Our study analyzed the metric characterization of a recent virtual sample of 113 coxal bones using decision trees for sex determination. From 17 osteometric type I landmarks, a dataset was built with five classic distances traditionally reported in the literature and six new distances selected using the two-step ratio method. A ten-fold cross-validation was performed, and a decision tree was established on two subsamples (training and test sets). The decision tree established on the training set included three nodes and its application to the test set correctly classified 92% of individuals. This percentage was similar to the data of the literature. The usefulness of decision trees has been demonstrated in numerous fields. They have been already used in sex determination, body mass prediction, and ancestry estimation. This study shows another use of decision trees enabling simple and accurate sex determination. © 2015 American Academy of Forensic Sciences.

  15. Ship Engine Room Casualty Analysis by Using Decision Tree Method

    Directory of Open Access Journals (Sweden)

    Ömür Yaşar SAATÇİOĞLU

    2017-03-01

    Ships may encounter undesirable conditions during operations. A casualty may result in fire, explosion, flooding, grounding, injury, or even death. These outcomes can be avoided with precautions and preventive operating processes. In maritime transportation, casualties depend on various factors: misuse of engine equipment and tools, defective machinery or equipment, inadequate operational procedures and safety measures, and force majeure effects. Casualty reports published in Australia, New Zealand, the United Kingdom, Canada and the United States until 2015 were examined, and the probable causes and consequences of casualties were determined with their occurrence percentages. In this study, 89 marine investigation reports regarding engine room casualties were analyzed. Casualty factors were analyzed with their frequency percentages, and their main causes were identified. This study aims to investigate engine room casualties, the frequency of each casualty type, and the main causes by using the decision tree method.

  16. An automated approach to the design of decision tree classifiers

    Science.gov (United States)

    Argentiero, P.; Chin, R.; Beaudet, P.

    1982-01-01

    An automated technique is presented for designing effective decision tree classifiers predicated only on a priori class statistics. The procedure relies on linear feature extractions and Bayes table look-up decision rules. Associated error matrices are computed and utilized to provide an optimal design of the decision tree at each so-called 'node'. A by-product of this procedure is a simple algorithm for computing the global probability of correct classification assuming the statistical independence of the decision rules. Attention is given to a more precise definition of decision tree classification, the mathematical details on the technique for automated decision tree design, and an example of a simple application of the procedure using class statistics acquired from an actual Landsat scene.

  17. Using Decision Trees to Characterize Verbal Communication During Change and Stuck Episodes in the Therapeutic Process

    Directory of Open Access Journals (Sweden)

    Víctor Hugo Masías

    2015-04-01

    Methods are needed for creating models to characterize verbal communication between therapists and their patients that are suitable for teaching purposes without losing analytical potential. A technique meeting these twin requirements is proposed that uses decision trees to identify both change and stuck episodes in therapist-patient communication. Three decision tree algorithms (C4.5, NBTree, and REPTree) are applied to the problem of characterizing verbal responses into change and stuck episodes in the therapeutic process. The data for the problem is derived from a corpus of 8 successful individual therapy sessions with 1,760 speaking turns in a psychodynamic context. The decision tree model that performed best was generated by the C4.5 algorithm. It delivered 15 rules characterizing the verbal communication in the two types of episodes. Decision trees are a promising technique for analyzing verbal communication during significant therapy events and have much potential for use in teaching practice on changes in therapeutic communication. The development of pedagogical methods using decision trees can support the transmission of academic knowledge to therapeutic practice.

  18. Using decision trees to characterize verbal communication during change and stuck episodes in the therapeutic process.

    Science.gov (United States)

    Masías, Víctor H; Krause, Mariane; Valdés, Nelson; Pérez, J C; Laengle, Sigifredo

    2015-01-01

    Methods are needed for creating models to characterize verbal communication between therapists and their patients that are suitable for teaching purposes without losing analytical potential. A technique meeting these twin requirements is proposed that uses decision trees to identify both change and stuck episodes in therapist-patient communication. Three decision tree algorithms (C4.5, NBTree, and REPTree) are applied to the problem of characterizing verbal responses into change and stuck episodes in the therapeutic process. The data for the problem is derived from a corpus of 8 successful individual therapy sessions with 1760 speaking turns in a psychodynamic context. The decision tree model that performed best was generated by the C4.5 algorithm. It delivered 15 rules characterizing the verbal communication in the two types of episodes. Decision trees are a promising technique for analyzing verbal communication during significant therapy events and have much potential for use in teaching practice on changes in therapeutic communication. The development of pedagogical methods using decision trees can support the transmission of academic knowledge to therapeutic practice.

  19. Bi-Criteria Optimization of Decision Trees with Applications to Data Analysis

    KAUST Repository

    Chikalov, Igor

    2017-10-19

    This paper is devoted to the study of bi-criteria optimization problems for decision trees. We consider different cost functions such as depth, average depth, and number of nodes. We design algorithms that allow us to construct the set of Pareto optimal points (POPs) for a given decision table and the corresponding bi-criteria optimization problem. These algorithms are suitable for investigation of medium-sized decision tables. We discuss three examples of applications of the created tools: the study of relationships among depth, average depth and number of nodes for decision trees for corner point detection (such trees are used in computer vision for object tracking), study of systems of decision rules derived from decision trees, and comparison of different greedy algorithms for decision tree construction as single- and bi-criteria optimization algorithms.
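
    The notion of Pareto optimal points for two cost functions can be made concrete with a small sketch. It assumes both costs are minimized and that each candidate tree is summarized as a (depth, number of nodes) pair; the sample pairs are hypothetical, and the paper's own algorithms build the set from a decision table rather than from an explicit list of trees.

```python
def pareto_optimal_points(points):
    """Return the points not dominated by any other point (both costs minimized)."""
    front = []
    best_second = None
    # After sorting by the first cost, a point is Pareto optimal exactly when
    # its second cost is strictly smaller than every second cost seen so far.
    for p in sorted(set(points)):
        if best_second is None or p[1] < best_second:
            front.append(p)
            best_second = p[1]
    return front

# Hypothetical (depth, number of nodes) pairs for candidate trees.
candidates = [(3, 15), (4, 9), (5, 9), (4, 11), (6, 7), (3, 20)]
print(pareto_optimal_points(candidates))
```

    The sort-and-scan form runs in O(n log n) time, compared with the O(n^2) of checking every pair for dominance.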

  20. Bi-Criteria Optimization of Decision Trees with Applications to Data Analysis

    KAUST Repository

    Chikalov, Igor; Hussain, Shahid; Moshkov, Mikhail

    2017-01-01

    : the study of relationships among depth, average depth and number of nodes for decision trees for corner point detection (such trees are used in computer vision for object tracking), study of systems of decision rules derived from decision trees

  1. The Studies of Decision Tree in Estimation of Breast Cancer Risk by Using Polymorphism Nucleotide

    Directory of Open Access Journals (Sweden)

    Frida Seyedmir

    2017-07-01

    Introduction: The decision tree is a data mining tool for collecting, accurately predicting and sifting information from massive amounts of data, and is widely used in computational biology and bioinformatics. In bioinformatics, it can be used to predict diseases, including breast cancer. Genomic data, including single nucleotide polymorphisms (SNPs), are a very important factor in predicting the risk of diseases. Seven important SNPs among hundreds of thousands of genetic markers were identified as factors associated with breast cancer. The objective of this study is to evaluate the effect of the training data on the prediction error of a decision tree for breast cancer risk using SNP genotypes. Methods: The risk of breast cancer associated with the SNPs was calculated using the formula: xj = fo * In humans, the decision tree can be used to predict the probability of disease using single nucleotide polymorphisms. Seven SNPs with different odds ratios associated with breast cancer were considered, and a C4.5 decision tree model was coded and designed in the Csharp2013 programming language. In the decision tree created with the coding, the four important associated SNPs were considered. The decision tree error was assessed in two cases, custom coding and WEKA, and the percentage accuracy of the decision tree in predicting breast cancer was calculated. The training samples were obtained by systematic sampling. Two scenarios were evaluated with the custom code and three scenarios with the WEKA software, using different data sets and different numbers of training and test samples. Results: In both coding scenarios, increasing the training percentage from 66.66 to 86.42 reduced the error from 55.56 to 9.09. Also, when running WEKA on three scenarios with different data sets and different numbers of training and test samples, increasing the number of records from 81 to 2187 decreased the error rate from 48.15 to 13

  2. New Splitting Criteria for Decision Trees in Stationary Data Streams.

    Science.gov (United States)

    Jaworski, Maciej; Duda, Piotr; Rutkowski, Leszek

    2018-06-01

    The most popular tools for stream data mining are based on decision trees. In the previous 15 years, all designed methods, headed by the very fast decision tree algorithm, relied on Hoeffding's inequality, and hundreds of researchers followed this scheme. Recently, we have demonstrated that although the Hoeffding decision trees are an effective tool for dealing with stream data, they are a purely heuristic procedure; for example, classical decision trees such as ID3 or CART cannot be adapted to data stream mining using Hoeffding's inequality. Therefore, there is an urgent need to develop new algorithms, which are both mathematically justified and characterized by good performance. In this paper, we address this problem by developing a family of new splitting criteria for classification in stationary data streams and investigating their probabilistic properties. The new criteria, derived using appropriate statistical tools, are based on the misclassification error and the Gini index impurity measures. A general division of splitting criteria into two types is proposed. Attributes chosen based on type-I splitting criteria guarantee, with high probability, the highest expected value of the split measure. Type-II criteria ensure that the chosen attribute is the same, with high probability, as the one that would be chosen based on the whole infinite data stream. Moreover, in this paper, two hybrid splitting criteria are proposed, which are combinations of single criteria based on the misclassification error and the Gini index.
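
    For orientation, the sketch below shows the classic Hoeffding bound used by Hoeffding trees, together with the two impurity measures the abstract mentions. It illustrates the earlier scheme that the abstract describes as heuristic, not the new splitting criteria of the paper, whose formulas the abstract does not give. Function and parameter names are illustrative.

```python
from math import log, sqrt

def hoeffding_epsilon(value_range, delta, n):
    """Hoeffding bound: with probability 1 - delta, the mean of n observations
    with the given range lies within epsilon of its expected value."""
    return sqrt(value_range ** 2 * log(1.0 / delta) / (2.0 * n))

def gini(counts):
    """Gini index of a node given its per-class example counts."""
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)

def misclassification(counts):
    """Misclassification error of a node given its per-class example counts."""
    n = sum(counts)
    return 1.0 - max(counts) / n

# The bound shrinks as more stream examples arrive at a leaf.
for n in (100, 1000, 10000):
    print(n, hoeffding_epsilon(1.0, 0.05, n))
```

    In a Hoeffding tree, a leaf is split once the observed difference between the two best attributes exceeds this epsilon.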

  3. Learning decision trees with flexible constraints and objectives using integer optimization

    NARCIS (Netherlands)

    Verwer, S.; Zhang, Y.

    2017-01-01

    We encode the problem of learning the optimal decision tree of a given depth as an integer optimization problem. We show experimentally that our method (DTIP) can be used to learn good trees up to depth 5 from data sets of size up to 1000. In addition to being efficient, our new formulation allows

  4. Post-event human decision errors: operator action tree/time reliability correlation

    International Nuclear Information System (INIS)

    Hall, R.E.; Fragola, J.; Wreathall, J.

    1982-11-01

    This report documents an interim framework for the quantification of the probability of errors of decision on the part of nuclear power plant operators after the initiation of an accident. The framework can easily be incorporated into an event tree/fault tree analysis. The method presented consists of a structure called the operator action tree and a time reliability correlation which assumes the time available for making a decision to be the dominating factor in situations requiring cognitive human response. This limited approach decreases the magnitude and complexity of the decision modeling task. Specifically, in the past, some human performance models have attempted prediction by trying to emulate sequences of human actions, or by identifying and modeling the information processing approach applicable to the task. The model developed here is directed at describing the statistical performance of a representative group of hypothetical individuals responding to generalized situations

  5. Post-event human decision errors: operator action tree/time reliability correlation

    Energy Technology Data Exchange (ETDEWEB)

    Hall, R E; Fragola, J; Wreathall, J

    1982-11-01

    This report documents an interim framework for the quantification of the probability of errors of decision on the part of nuclear power plant operators after the initiation of an accident. The framework can easily be incorporated into an event tree/fault tree analysis. The method presented consists of a structure called the operator action tree and a time reliability correlation which assumes the time available for making a decision to be the dominating factor in situations requiring cognitive human response. This limited approach decreases the magnitude and complexity of the decision modeling task. Specifically, in the past, some human performance models have attempted prediction by trying to emulate sequences of human actions, or by identifying and modeling the information processing approach applicable to the task. The model developed here is directed at describing the statistical performance of a representative group of hypothetical individuals responding to generalized situations.

  6. Application of preprocessing filtering on Decision Tree C4.5 and rough set theory

    Science.gov (United States)

    Chan, Joseph C. C.; Lin, Tsau Y.

    2001-03-01

    This paper compares two artificial intelligence methods: the Decision Tree C4.5 and Rough Set Theory on the stock market data. The Decision Tree C4.5 is reviewed with the Rough Set Theory. An enhanced window application is developed to facilitate the pre-processing filtering by introducing the feature (attribute) transformations, which allows users to input formulas and create new attributes. Also, the application produces three varieties of data set with delaying, averaging, and summation. The results prove the improvement of pre-processing by applying feature (attribute) transformations on Decision Tree C4.5. Moreover, the comparison between Decision Tree C4.5 and Rough Set Theory is based on the clarity, automation, accuracy, dimensionality, raw data, and speed, which is supported by the rules sets generated by both algorithms on three different sets of data.

  7. Predicting membrane protein types using various decision tree classifiers based on various modes of general PseAAC for imbalanced datasets.

    Science.gov (United States)

    Sankari, E Siva; Manimegalai, D

    2017-12-21

    Predicting membrane protein types is an important and challenging research area in bioinformatics and proteomics. Traditional biophysical methods are used to classify membrane protein types. Because databases contain a large and growing number of uncharacterized protein sequences, traditional methods are very time consuming, expensive and susceptible to errors. Hence, it is highly desirable to develop a robust, reliable, and efficient method to predict membrane protein types. Imbalanced datasets and large datasets are often handled well by decision tree classifiers. Since imbalanced datasets are used, the performance of various decision tree classifiers such as Decision Tree (DT), Classification And Regression Tree (CART), C4.5, Random tree, and REP (Reduced Error Pruning) tree, and of ensemble methods such as Adaboost, RUS (Random Under Sampling) boost, Rotation forest and Random forest, is analysed. Among the various decision tree classifiers, Random forest performs well in less time with a good accuracy of 96.35%. Another inference is that the RUS boost decision tree classifier is able to classify one or two samples in the class with very few samples, while the other classifiers such as DT, Adaboost, Rotation forest and Random forest are not sensitive to the classes with fewer samples. The performance of decision tree classifiers is also compared with SVM (Support Vector Machine) and Naive Bayes classifiers. Copyright © 2017 Elsevier Ltd. All rights reserved.

  8. On algorithm for building of optimal α-decision trees

    KAUST Repository

    Alkhalid, Abdulaziz; Chikalov, Igor; Moshkov, Mikhail

    2010-01-01

    The paper describes an algorithm that constructs approximate decision trees (α-decision trees), which are optimal relatively to one of the following complexity measures: depth, total path length or number of nodes. The algorithm uses dynamic

  9. Decision trees with minimum average depth for sorting eight elements

    KAUST Repository

    AbouEisha, Hassan M.; Chikalov, Igor; Moshkov, Mikhail

    2015-01-01

    We prove that the minimum average depth of a decision tree for sorting 8 pairwise different elements is equal to 620160/8!. We show also that each decision tree for sorting 8 elements, which has minimum average depth (the number of such trees

  10. [Prediction of regional soil quality based on mutual information theory integrated with decision tree algorithm].

    Science.gov (United States)

    Lin, Fen-Fang; Wang, Ke; Yang, Ning; Yan, Shi-Guang; Zheng, Xin-Yu

    2012-02-01

    In this paper, some main factors such as soil type, land use pattern, lithology type, topography, road, and industry type that affect soil quality were used to precisely obtain the spatial distribution characteristics of regional soil quality, mutual information theory was adopted to select the main environmental factors, and decision tree algorithm See 5.0 was applied to predict the grade of regional soil quality. The main factors affecting regional soil quality were soil type, land use, lithology type, distance to town, distance to water area, altitude, distance to road, and distance to industrial land. The prediction accuracy of the decision tree model with the variables selected by mutual information was obviously higher than that of the model with all variables, and, for the former model, whether of decision tree or of decision rule, its prediction accuracy was all higher than 80%. Based on the continuous and categorical data, the method of mutual information theory integrated with decision tree could not only reduce the number of input parameters for decision tree algorithm, but also predict and assess regional soil quality effectively.
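
    The mutual-information screening step described here can be sketched for categorical variables as follows. The factor names and values are hypothetical, and the abstract does not state how continuous factors such as altitude were handled, so the sketch assumes they have already been discretized.

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Mutual information (in bits) between two categorical sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum over observed pairs of p(x, y) * log2(p(x, y) / (p(x) * p(y))).
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )

# Hypothetical environmental factors and soil quality grades.
grade = ["high", "high", "low", "low"]
factors = {
    "soil_type": ["clay", "clay", "sand", "sand"],
    "road": ["near", "far", "near", "far"],
}
ranking = sorted(factors, key=lambda f: mutual_information(factors[f], grade), reverse=True)
print(ranking)
```

    Factors at the top of the ranking would be kept as inputs to the decision tree, and those with mutual information near zero would be dropped.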

  11. Exploring predictors of scientific performance with decision tree analysis: The case of research excellence in early career mathematics

    Energy Technology Data Exchange (ETDEWEB)

    Lindahl, J.

    2016-07-01

    The purpose of this study was (1) to introduce the exploratory method of decision tree analysis as a complementary alternative to current confirmatory methods used in scientometric prediction studies of research performance; and (2) as an illustrative case, to explore predictors of future research excellence at the individual level among 493 early career mathematicians in the sub-field of number theory between 1999 and 2010. A conceptual introduction to decision tree analysis is provided, including an overview of the main steps of the tree-building algorithm and the statistical method of cross-validation used to evaluate the performance of decision tree models. A decision tree analysis of 493 mathematicians was conducted to find useful predictors and important relationships between variables in the context of predicting research excellence. The results suggest that the number of prestige journal publications and a topically diverse output are important predictors of future research excellence. Researchers with no prestige journal publications are very unlikely to produce excellent research. Limitations of decision tree analysis are discussed. (Author)

  12. Modifiable risk factors predicting major depressive disorder at four year follow-up: a decision tree approach.

    Science.gov (United States)

    Batterham, Philip J; Christensen, Helen; Mackinnon, Andrew J

    2009-11-22

    Relative to physical health conditions such as cardiovascular disease, little is known about risk factors that predict the prevalence of depression. The present study investigates the expected effects of a reduction of these risks over time, using the decision tree method favoured in assessing cardiovascular disease risk. The PATH through Life cohort was used for the study, comprising 2,105 20-24 year olds, 2,323 40-44 year olds and 2,177 60-64 year olds sampled from the community in the Canberra region, Australia. A decision tree methodology was used to predict the presence of major depressive disorder after four years of follow-up. The decision tree was compared with a logistic regression analysis using ROC curves. The decision tree was found to distinguish and delineate a wide range of risk profiles. Previous depressive symptoms were most highly predictive of depression after four years; however, modifiable risk factors such as substance use and employment status played significant roles in assessing the risk of depression. The decision tree was found to have better sensitivity and specificity than a logistic regression using identical predictors. The decision tree method was useful in assessing the risk of major depressive disorder over four years. Application of the model to the development of a predictive tool for tailored interventions is discussed.
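
    Sensitivity and specificity, the two measures used here to compare the decision tree with logistic regression, can be computed as in the sketch below. The labels and predictions are made up for illustration and are unrelated to the PATH cohort data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded as 1 (case) and 0 (non-case)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # Sensitivity: share of true cases detected. Specificity: share of non-cases cleared.
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical outcomes (1 = depressed at follow-up) and model predictions.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)
```

    An ROC curve, as used in the study, plots sensitivity against 1 - specificity as the decision threshold of a model is varied.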

  13. Measuring performance in health care: case-mix adjustment by boosted decision trees.

    Science.gov (United States)

    Neumann, Anke; Holstein, Josiane; Le Gall, Jean-Roger; Lepage, Eric

    2004-10-01

    The purpose of this paper is to investigate the suitability of boosted decision trees for the case-mix adjustment involved in comparing the performance of various health care entities. First, we present logistic regression, decision trees, and boosted decision trees in a unified framework. Second, we study in detail their application for two common performance indicators, the mortality rate in intensive care and the rate of potentially avoidable hospital readmissions. For both examples the technique of boosting decision trees outperformed standard prognostic models, in particular linear logistic regression models, with regard to predictive power. On the other hand, boosting decision trees was computationally demanding and the resulting models were rather complex and needed additional tools for interpretation. Boosting decision trees represents a powerful tool for case-mix adjustment in health care performance measurement. Depending on the specific priorities set in each context, the gain in predictive power might compensate for the inconvenience in the use of boosted decision trees.

  14. Classification and Optimization of Decision Trees for Inconsistent Decision Tables Represented as MVD Tables

    KAUST Repository

    Azad, Mohammad

    2015-10-11

    Decision tree is a widely used technique to discover patterns from a consistent data set. But if the data set is inconsistent, where there are groups of examples (objects) with equal values of conditional attributes but different decisions (values of the decision attribute), then to discover the essential patterns or knowledge from the data set is challenging. We consider three approaches (generalized, most common and many-valued decision) to handle such inconsistency. We created different greedy algorithms using various types of impurity and uncertainty measures to construct decision trees. We compared the three approaches based on the decision tree properties of the depth, average depth and number of nodes. Based on the result of the comparison, we choose to work with the many-valued decision approach. Now to determine which greedy algorithms are efficient, we compared them based on the optimization and classification results. It was found that some greedy algorithms, Mult_ws_entSort and Mult_ws_entML, are good for both optimization and classification.

  15. Classification and Optimization of Decision Trees for Inconsistent Decision Tables Represented as MVD Tables

    KAUST Repository

    Azad, Mohammad; Moshkov, Mikhail

    2015-01-01

    Decision trees are a widely used technique for discovering patterns in consistent data sets. But if the data set is inconsistent, that is, if there are groups of examples (objects) with equal values of conditional attributes but different decisions (values of the decision attribute), then discovering the essential patterns or knowledge in the data set is challenging. We consider three approaches (generalized, most common and many-valued decision) to handle such inconsistency. We created different greedy algorithms using various types of impurity and uncertainty measures to construct decision trees. We compared the three approaches based on the decision tree properties of depth, average depth and number of nodes. Based on the result of the comparison, we chose to work with the many-valued decision approach. To determine which greedy algorithms are efficient, we compared them based on the optimization and classification results. It was found that some greedy algorithms, Mult_ws_entSort and Mult_ws_entML, are good for both optimization and classification.

  16. Optimization and analysis of decision trees and rules: Dynamic programming approach

    KAUST Repository

    Alkhalid, Abdulaziz

    2013-08-01

    This paper is devoted to the consideration of the software system Dagger created at KAUST. This system is based on extensions of dynamic programming. It allows sequential optimization of decision trees and rules relative to different cost functions, derivation of relationships between two cost functions (in particular, between number of misclassifications and depth of decision trees), and between cost and uncertainty of decision trees. We describe features of Dagger and consider examples of this system's work on decision tables from the UCI Machine Learning Repository. We also use Dagger to compare 16 different greedy algorithms for decision tree construction. © 2013 Taylor and Francis Group, LLC.

  17. Optimization and analysis of decision trees and rules: Dynamic programming approach

    KAUST Repository

    Alkhalid, Abdulaziz; Amin, Talha M.; Chikalov, Igor; Hussain, Shahid; Moshkov, Mikhail; Zielosko, Beata

    2013-01-01

    This paper is devoted to the consideration of the software system Dagger created at KAUST. This system is based on extensions of dynamic programming. It allows sequential optimization of decision trees and rules relative to different cost functions, derivation of relationships between two cost functions (in particular, between number of misclassifications and depth of decision trees), and between cost and uncertainty of decision trees. We describe features of Dagger and consider examples of this system's work on decision tables from the UCI Machine Learning Repository. We also use Dagger to compare 16 different greedy algorithms for decision tree construction. © 2013 Taylor and Francis Group, LLC.

  18. Greedy algorithm with weights for decision tree construction

    KAUST Repository

    Moshkov, Mikhail

    2010-01-01

    An approximate algorithm for minimization of the weighted depth of decision trees is considered. A bound on the accuracy of this algorithm is obtained which is unimprovable in the general case. Under some natural assumptions on the class NP, the considered algorithm is close (from the point of view of accuracy) to the best polynomial approximate algorithms for minimization of the weighted depth of decision trees.

  19. Greedy algorithm with weights for decision tree construction

    KAUST Repository

    Moshkov, Mikhail

    2010-12-01

    An approximate algorithm for minimization of the weighted depth of decision trees is considered. A bound on the accuracy of this algorithm is obtained which is unimprovable in the general case. Under some natural assumptions on the class NP, the considered algorithm is close (from the point of view of accuracy) to the best polynomial approximate algorithms for minimization of the weighted depth of decision trees.

  20. Simple Prediction of Type 2 Diabetes Mellitus via Decision Tree Modeling

    Directory of Open Access Journals (Sweden)

    Mehrab Sayadi

    2017-06-01

    Background: Type 2 Diabetes Mellitus (T2DM) is one of the most important risk factors in cardiovascular disorders considered as a common clinical and public health problem. Early diagnosis can reduce the burden of the disease. Decision tree, as an advanced data mining method, can be used as a reliable tool to predict T2DM. Objectives: This study aimed to present a simple model for predicting T2DM using decision tree modeling. Materials and Methods: This analytical model-based study used a part of the cohort data obtained from a database in Healthy Heart House of Shiraz, Iran. The data included routine information, such as age, gender, Body Mass Index (BMI), family history of diabetes, and systolic and diastolic blood pressure, which were obtained from the individuals referred for gathering baseline data in Shiraz cohort study from 2014 to 2015. Diabetes diagnosis was used as binary datum. Decision tree technique and J48 algorithm were applied using the WEKA software (version 3.7.5, New Zealand). Additionally, Receiver Operator Characteristic (ROC) curve and Area Under Curve (AUC) were used for checking the goodness of fit. Results: The age of the 11302 cases obtained after data preparation ranged from 18 to 89 years with the mean age of 48.1 ± 11.4 years. Additionally, 51.1% of the cases were male. In the tree structure, blood pressure and age were placed where most information was gained. In our model, however, gender was not important and was placed on the final branch of the tree. Total precision and AUC were 87% and 89%, respectively. This indicated that the model had good accuracy for distinguishing patients from normal individuals. Conclusions: The results showed that T2DM could be predicted via decision tree model without laboratory tests. Thus, this model can be used in pre-clinical and public health screening programs.

  1. Reconciliation of Decision-Making Heuristics Based on Decision Trees Topologies and Incomplete Fuzzy Probabilities Sets.

    Science.gov (United States)

    Doubravsky, Karel; Dohnal, Mirko

    2015-01-01

    Complex decision making tasks of different natures, e.g. economics, safety engineering, ecology and biology, are based on vague, sparse, partially inconsistent and subjective knowledge. Moreover, decision making economists / engineers are usually not willing to invest too much time into study of complex formal theories. They require such decisions which can be (re)checked by human like common sense reasoning. One important problem related to realistic decision making tasks is incomplete data sets required by the chosen decision making algorithm. This paper presents a relatively simple algorithm showing how some missing III (input information items) can be generated using mainly decision tree topologies and integrated into incomplete data sets. The algorithm is based on an easy-to-understand heuristic, e.g. a longer decision tree sub-path is less probable. This heuristic can solve decision problems under total ignorance, i.e. the decision tree topology is the only information available. In practice, however, isolated information items, e.g. some vaguely known probabilities (e.g. fuzzy probabilities), are usually available. It means that a realistic problem is analysed under partial ignorance. The proposed algorithm reconciles topology related heuristics and additional fuzzy sets using fuzzy linear programming. The case study, represented by a tree with six lotteries and one fuzzy probability, is presented in detail.

  2. Reconciliation of Decision-Making Heuristics Based on Decision Trees Topologies and Incomplete Fuzzy Probabilities Sets.

    Directory of Open Access Journals (Sweden)

    Karel Doubravsky

    Complex decision making tasks of different natures, e.g. economics, safety engineering, ecology and biology, are based on vague, sparse, partially inconsistent and subjective knowledge. Moreover, decision making economists / engineers are usually not willing to invest too much time into study of complex formal theories. They require such decisions which can be (re)checked by human like common sense reasoning. One important problem related to realistic decision making tasks is incomplete data sets required by the chosen decision making algorithm. This paper presents a relatively simple algorithm showing how some missing III (input information items) can be generated using mainly decision tree topologies and integrated into incomplete data sets. The algorithm is based on an easy-to-understand heuristic, e.g. a longer decision tree sub-path is less probable. This heuristic can solve decision problems under total ignorance, i.e. the decision tree topology is the only information available. In practice, however, isolated information items, e.g. some vaguely known probabilities (e.g. fuzzy probabilities), are usually available. It means that a realistic problem is analysed under partial ignorance. The proposed algorithm reconciles topology related heuristics and additional fuzzy sets using fuzzy linear programming. The case study, represented by a tree with six lotteries and one fuzzy probability, is presented in detail.

  3. Bayesian additive decision trees of biomarker by treatment interactions for predictive biomarker detection and subgroup identification.

    Science.gov (United States)

    Zhao, Yang; Zheng, Wei; Zhuo, Daisy Y; Lu, Yuefeng; Ma, Xiwen; Liu, Hengchang; Zeng, Zhen; Laird, Glen

    2017-10-11

    Personalized medicine, or tailored therapy, has been an active and important topic in recent medical research. Many methods have been proposed in the literature for predictive biomarker detection and subgroup identification. In this article, we propose a novel decision tree-based approach applicable in randomized clinical trials. We model the prognostic effects of the biomarkers using additive regression trees and the biomarker-by-treatment effect using a single regression tree. A Bayesian approach is used to periodically revise the split variables and the split rules of the decision trees, which provides a better overall fit. A Gibbs sampler is implemented in the MCMC procedure, which updates the prognostic trees and the interaction tree separately. We use the posterior distribution of the interaction tree to construct the predictive scores of the biomarkers and to identify the subgroup where the treatment is superior to the control. Numerical simulations show that our proposed method performs well under various settings compared to existing methods. We also demonstrate an application of our method in a real clinical trial.

  4. Decision trees and decision committee applied to star/galaxy separation problem

    Science.gov (United States)

    Vasconcellos, Eduardo Charles

    Vasconcellos et al. [1] study the efficiency of 13 different decision tree algorithms applied to photometric data in the Sloan Digital Sky Survey Data Release Seven (SDSS-DR7) to perform star/galaxy separation. Each algorithm is defined by a set of parameters which, when varied, produce different final classification trees. In that work we extensively explore the parameter space of each algorithm, using the set of 884,126 SDSS objects with spectroscopic data as the training set. We find that the Functional Tree algorithm (FT) yields the best results by the mean completeness function (galaxy true positive rate) in two magnitude intervals:14=19 (82.1%). We compare FT classification to the SDSS parametric, 2DPHOT and Ball et al. (2006) classifications. At the faintest magnitudes (r > 19), our classifier is the only one that maintains high completeness (>80%) while simultaneously achieving low contamination ( 2.5%). We also examine the SDSS parametric classifier (psfMag - modelMag) to see if the dividing line between stars and galaxies can be adjusted to improve the classifier. We find that currently stars in close pairs are often misclassified as galaxies, and suggest a new cut to improve the classifier. Finally, we apply our FT classifier to separate stars from galaxies in the full set of 69,545,326 SDSS photometric objects in the magnitude range 14 train six FT classifiers with randomly selected objects from the same 884,126 SDSS-DR7 objects with spectroscopic data that we used before. Both the decision committee and our previous single FT classifier will be applied to the new objects from SDSS data releases eight, nine and ten. Finally, we will compare the performance of both methods on this new data set. [1] Vasconcellos, E. C.; de Carvalho, R. R.; Gal, R. R.; LaBarbera, F. L.; Capelato, H. V.; Fraga Campos Velho, H.; Trevisan, M.; Ruiz, R. S. R. Decision Tree Classifiers for Star/Galaxy Separation. The Astronomical Journal, Volume 141, Issue 6, 2011.

  5. Multi-test decision tree and its application to microarray data classification.

    Science.gov (United States)

    Czajkowski, Marcin; Grześ, Marek; Kretowski, Marek

    2014-05-01

    The desirable property of tools used to investigate biological data is easy to understand models and predictive decisions. Decision trees are particularly promising in this regard due to their comprehensible nature that resembles the hierarchical process of human decision making. However, existing algorithms for learning decision trees have tendency to underfit gene expression data. The main aim of this work is to improve the performance and stability of decision trees with only a small increase in their complexity. We propose a multi-test decision tree (MTDT); our main contribution is the application of several univariate tests in each non-terminal node of the decision tree. We also search for alternative, lower-ranked features in order to obtain more stable and reliable predictions. Experimental validation was performed on several real-life gene expression datasets. Comparison results with eight classifiers show that MTDT has a statistically significantly higher accuracy than popular decision tree classifiers, and it was highly competitive with ensemble learning algorithms. The proposed solution managed to outperform its baseline algorithm on 14 datasets by an average 6%. A study performed on one of the datasets showed that the discovered genes used in the MTDT classification model are supported by biological evidence in the literature. This paper introduces a new type of decision tree which is more suitable for solving biological problems. MTDTs are relatively easy to analyze and much more powerful in modeling high dimensional microarray data than their popular counterparts. Copyright © 2014 Elsevier B.V. All rights reserved.

  6. Online decision trees to support the control of gastrointestinal worms in ruminants

    OpenAIRE

    Koopmann, Regine; Dämmrich, Michaela; Ploeger, Harm

    2014-01-01

    Control of gastrointestinal worms is crucial to any pasture system for ruminants. To support the farmer's foresighted planning of pasturage and to avoid excessive deworming in Germany we created four decision trees and put them online. They are freely accessible at www.weide-parasiten.de. There is one decision tree for young first season cattle in intensive dairy husbandry, one decision tree for young cattle in suckling-cow management and one decision tree for sheep and goats, respectively.

  7. Prediction of the compression ratio for municipal solid waste using decision tree.

    Science.gov (United States)

    Heshmati R, Ali Akbar; Mokhtari, Maryam; Shakiba Rad, Saeed

    2014-01-01

    The compression ratio of municipal solid waste (MSW) is an essential parameter for evaluation of waste settlement and landfill design. However, no appropriate model has been proposed to estimate the waste compression ratio so far. In this study, a decision tree method was utilized to predict the waste compression ratio (C'c). The tree was constructed using Quinlan's M5 algorithm. A reliable database retrieved from the literature was used to develop a practical model that relates C'c to waste composition and properties, including dry density, dry weight water content, and percentage of biodegradable organic waste using the decision tree method. The performance of the developed model was examined in terms of different statistical criteria, including correlation coefficient, root mean squared error, mean absolute error and mean bias error, recommended by researchers. The obtained results demonstrate that the suggested model is able to evaluate the compression ratio of MSW effectively.
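
    The four statistical criteria named in this abstract can be computed as in the following minimal sketch. The function name and the sample values are hypothetical, and the sign convention for mean bias error (predicted minus observed) is an assumption, since the abstract does not state one.

```python
import math


def regression_metrics(observed, predicted):
    """Return Pearson r, RMSE, MAE and mean bias error for paired values."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("need two non-empty sequences of equal length")
    # Assumed convention: error = predicted - observed.
    errors = [p - o for o, p in zip(observed, predicted)]
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return {
        # Undefined (division by zero) if either sequence is constant.
        "r": cov / math.sqrt(var_o * var_p),
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "mae": sum(abs(e) for e in errors) / n,
        "mbe": sum(errors) / n,
    }


# Hypothetical values, not data from the study.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
```

    In this example every prediction is too high by the same amount, so the correlation is perfect while the three error measures all report the constant offset, which shows why the criteria are usually read together.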

  8. Algorithms for optimal dyadic decision trees

    Energy Technology Data Exchange (ETDEWEB)

    Hush, Don [Los Alamos National Laboratory; Porter, Reid [Los Alamos National Laboratory

    2009-01-01

    A new algorithm for constructing optimal dyadic decision trees was recently introduced, analyzed, and shown to be very effective for low dimensional data sets. This paper enhances and extends this algorithm by: introducing an adaptive grid search for the regularization parameter that guarantees optimal solutions for all relevant tree sizes, revising the core tree-building algorithm so that its run time is substantially smaller for most regularization parameter values on the grid, and incorporating new data structures and data pre-processing steps that provide significant run time enhancement in practice.

  9. Classification of Parkinsonian syndromes from FDG-PET brain data using decision trees with SSM/PCA features.

    Science.gov (United States)

    Mudali, D; Teune, L K; Renken, R J; Leenders, K L; Roerdink, J B T M

    2015-01-01

    Medical imaging techniques like fluorodeoxyglucose positron emission tomography (FDG-PET) have been used to aid in the differential diagnosis of neurodegenerative brain diseases. In this study, the objective is to classify FDG-PET brain scans of subjects with Parkinsonian syndromes (Parkinson's disease, multiple system atrophy, and progressive supranuclear palsy) compared to healthy controls. The scaled subprofile model/principal component analysis (SSM/PCA) method was applied to FDG-PET brain image data to obtain covariance patterns and corresponding subject scores. The latter were used as features for supervised classification by the C4.5 decision tree method. Leave-one-out cross validation was applied to determine classifier performance. We carried out a comparison with other types of classifiers. The big advantage of decision tree classification is that the results are easy to understand by humans. A visual representation of decision trees strongly supports the interpretation process, which is very important in the context of medical diagnosis. Further improvements are suggested based on enlarging the number of the training data, enhancing the decision tree method by bagging, and adding additional features based on (f)MRI data.

  10. Classification and Progression Based on CFS-GA and C5.0 Boost Decision Tree of TCM Zheng in Chronic Hepatitis B.

    Science.gov (United States)

    Chen, Xiao Yu; Ma, Li Zhuang; Chu, Na; Zhou, Min; Hu, Yiyang

    2013-01-01

    Chronic hepatitis B (CHB) is a serious public health problem, and Traditional Chinese Medicine (TCM) plays an important role in the control and treatment for CHB. In the treatment of TCM, zheng discrimination is the most important step. In this paper, an approach based on CFS-GA (Correlation based Feature Selection and Genetic Algorithm) and C5.0 boost decision tree is used for zheng classification and progression in the TCM treatment of CHB. The CFS-GA performs better than the typical method of CFS. By CFS-GA, the acquired attribute subset is classified by C5.0 boost decision tree for TCM zheng classification of CHB, and C5.0 decision tree outperforms two typical decision trees of NBTree and REPTree on CFS-GA, CFS, and nonselection in comparison. Based on the critical indicators from C5.0 decision tree, important lab indicators in zheng progression are obtained by the method of stepwise discriminant analysis for expressing TCM zhengs in CHB, and alterations of the important indicators are also analyzed in zheng progression. In conclusion, all the three decision trees perform better on CFS-GA than on CFS and nonselection, and C5.0 decision tree outperforms the two typical decision trees both on attribute selection and nonselection.

  11. An ordering heuristic for building Binary Decision Diagrams for fault-trees

    International Nuclear Information System (INIS)

    Bouissou, M.

    1997-01-01

    Binary Decision Diagrams (BDD) have recently made a noticeable entry in the RAMS field. This kind of representation for boolean functions makes possible the assessment of complex fault-trees, both qualitatively (minimal cut-sets search) and quantitatively (exact calculation of top event probability). The object of the paper is to present a pre-processing of the fault-tree which ensures that the results given by different heuristics on the 'optimized' fault-tree are not too sensitive to the way the tree is written. This property is based on a theoretical proof. In contrast with some well known heuristics, the method proposed is not based only on intuition and practical experiments. (author)

  12. LOCAL BINARIZATION FOR DOCUMENT IMAGES CAPTURED BY CAMERAS WITH DECISION TREE

    Directory of Open Access Journals (Sweden)

    Naser Jawas

    2012-07-01

    Character recognition in a document image captured by a digital camera requires a good binary image as the input for the separation of the text from the background. The global binarization method does not provide such good separation because of uneven levels of lighting in images captured by cameras. The local binarization method overcomes the problem but requires a method to partition the large image into local windows properly. In this paper, we propose a local binarization method with dynamic image partitioning using an integral image and a decision tree for the binarization decision. The integral image is used to estimate the number of lines in the document image. The number of lines in the document image is used to divide the document into local windows. The decision tree makes a decision for the threshold in every local window. The result shows that the proposed method can separate the text from the background better than global thresholding, with the best OCR result of the binarized image being 99.4%.
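
    The integral image mentioned in this abstract is a summed-area table that lets the sum of any rectangular window be read off in constant time. The sketch below shows only that general data structure; it is not the authors' line-estimation or partitioning procedure, and the function names and the tiny sample image are hypothetical.

```python
def integral_image(img):
    """Build a summed-area table with one extra row and column of zeros.

    ii[y][x] holds the sum of all pixels in img[0:y][0:x].
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def window_sum(ii, top, left, bottom, right):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


# Hypothetical 3x3 grayscale image.
img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
total = window_sum(ii, 0, 0, 3, 3)
lower_right = window_sum(ii, 1, 1, 3, 3)
```

    Dividing a window sum by the window area gives the local mean intensity, which is one statistic a local thresholding rule could use.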

  13. Alternative measures of risk of extreme events in decision trees

    International Nuclear Information System (INIS)

    Frohwein, H.I.; Lambert, J.H.; Haimes, Y.Y.

    1999-01-01

    A need for a methodology to control the extreme events, defined as low-probability, high-consequence incidents, in sequential decisions is identified. A variety of alternative and complementary measures of the risk of extreme events are examined for their usability as objective functions in sequential decisions, represented as single- or multiple-objective decision trees. Earlier work had addressed difficulties, related to non-separability, with the minimization of some measures of the risk of extreme events in sequential decisions. In an extension of these results, it is shown how some non-separable measures of the risk of extreme events can be interpreted in terms of separable constituents of risk, thereby enabling a wider class of measures of the risk of extreme events to be handled in a straightforward manner in a decision tree. Also for extreme events, results are given to enable minimax- and Hurwicz-criterion analyses in decision trees. An example demonstrates the incorporation of different measures of the risk of extreme events in a multi-objective decision tree. Conceptual formulations for optimizing non-separable measures of the risk of extreme events are identified as an important area for future investigation
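
    The minimax and Hurwicz criteria named in this abstract can be sketched for a single decision node as follows. This is the textbook form of the two rules applied to losses, not the paper's multi-objective formulation; the function names, the loss table, and the use of `alpha` as the weight on the best outcome are assumptions.

```python
def minimax_choice(losses):
    """Pick the decision whose worst-case (largest) loss is smallest."""
    return min(losses, key=lambda d: max(losses[d]))


def hurwicz_choice(losses, alpha):
    """Pick the decision minimizing alpha * best loss + (1 - alpha) * worst loss.

    alpha = 0 reproduces the minimax rule; alpha = 1 looks only at the best case.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")

    def score(d):
        return alpha * min(losses[d]) + (1.0 - alpha) * max(losses[d])

    return min(losses, key=score)


# Hypothetical losses of two decisions under two possible outcomes.
losses = {"a": [1, 10], "b": [4, 6]}
cautious = minimax_choice(losses)
optimistic = hurwicz_choice(losses, alpha=1.0)
balanced = hurwicz_choice(losses, alpha=0.5)
```

    In a tree, such a rule would be applied at each decision node while folding back from the leaves; the abstract's point is that some risk measures do not decompose this simply.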

  14. A fault tree analysis strategy using binary decision diagrams

    International Nuclear Information System (INIS)

    Reay, Karen A.; Andrews, John D.

    2002-01-01

    The use of binary decision diagrams (BDDs) in fault tree analysis provides both an accurate and efficient means of analysing a system. There is a problem, however, with the conversion process of the fault tree to the BDD. The variable ordering scheme chosen for the construction of the BDD has a crucial effect on its resulting size and previous research has failed to identify any scheme that is capable of producing BDDs for all fault trees. This paper proposes an analysis strategy aimed at increasing the likelihood of obtaining a BDD for any given fault tree, by ensuring the associated calculations are as efficient as possible. The method implements simplification techniques, which are applied to the fault tree to obtain a set of 'minimal' subtrees, equivalent to the original fault tree structure. BDDs are constructed for each, using ordering schemes most suited to their particular characteristics. Quantitative analysis is performed simultaneously on the set of BDDs to obtain the top event probability, the system unconditional failure intensity and the criticality of the basic events

  15. Algorithms for Decision Tree Construction

    KAUST Repository

    Chikalov, Igor

    2011-01-01

    The study of algorithms for decision tree construction was initiated in the 1960s. The first algorithms are based on the separation heuristic [13, 31] that at each step tries dividing the set of objects as evenly as possible. Later Garey and Graham [28
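
    The separation heuristic mentioned in this abstract can be read as: at each node, ask the attribute that splits the remaining objects into the most balanced groups. The sketch below implements one such reading, choosing the attribute whose largest group is smallest. The balance measure, the function names, and the object encoding are assumptions and may differ from the cited algorithms.

```python
from collections import Counter


def most_even_attribute(objects, attributes):
    """Return the attribute index whose largest value-group is smallest."""
    def largest_group(a):
        return max(Counter(obj[a] for obj in objects).values())
    return min(attributes, key=largest_group)


def build_tree(objects, attributes):
    """Recursively split until objects are isolated or cannot be separated.

    Leaves are lists of objects; internal nodes are (attribute, {value: subtree}).
    """
    if len(objects) <= 1 or not attributes:
        return objects
    a = most_even_attribute(objects, attributes)
    groups = {}
    for obj in objects:
        groups.setdefault(obj[a], []).append(obj)
    if len(groups) == 1:
        # Even the best attribute does not separate anything, so stop here.
        return objects
    rest = [b for b in attributes if b != a]
    return (a, {v: build_tree(g, rest) for v, g in groups.items()})


# Hypothetical objects described by two attributes (indices 0 and 1).
objects = [(0, 0), (0, 1), (0, 2), (1, 3)]
best = most_even_attribute(objects, [0, 1])
tree = build_tree(objects, [0, 1])
```

    Attribute 0 would leave three objects together, whereas attribute 1 isolates every object in one step, so the heuristic prefers attribute 1 here.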

  16. Decision Tree Approach to Discovering Fraud in Leasing Agreements

    Directory of Open Access Journals (Sweden)

    Horvat Ivan

    2014-09-01

    Background: Fraud attempts create large losses for financing subjects in modern economies. At the same time, leasing agreements have become more and more popular as a means of financing objects such as machinery and vehicles, but are more vulnerable to fraud attempts. Objectives: The goal of the paper is to estimate the usability of the data mining approach in discovering fraud in leasing agreements. Methods/Approach: Real-world data from one Croatian leasing firm was used for creating two models for fraud detection in leasing. The decision tree method was used for creating a classification model, and the CHAID algorithm was deployed. Results: The decision tree model has indicated that the object of the leasing agreement had the strongest impact on the probability of fraud. Conclusions: In order to enhance the probability of the developed model, it would be necessary to develop software that would enable automated, quick and transparent retrieval of data from the system, processing according to the rules and displaying the results in multiple categories.

  17. Identifying Bank Frauds Using CRISP-DM and Decision Trees

    OpenAIRE

    Bruno Carneiro da Rocha; Rafael Timóteo de Sousa Júnior

    2010-01-01

    This article aims to evaluate the use of techniques of decision trees, in conjunction with the management model CRISP-DM, to help in the prevention of bank fraud. This article offers a study on decision trees, an important concept in the field of artificial intelligence. The study is focused on discussing how these trees are able to assist in the decision making process of identifying frauds by the analysis of information regarding bank transactions. This information is captured with the use of t...

  18. A Branch-and-Price approach to find optimal decision trees

    NARCIS (Netherlands)

    Firat, M.; Crognier, Guillaume; Gabor, Adriana; Zhang, Y.

    2018-01-01

    In the field of Artificial Intelligence (AI), decision trees have gained certain importance due to their effectiveness in solving classification and regression problems. Recently, in the literature, the problem of finding optimal decision trees has been formulated as Mixed Integer Linear Programming (MILP) models. This

  19. 15 CFR Supplement 1 to Part 732 - Decision Tree

    Science.gov (United States)

    2010-01-01

    ... 15 Commerce and Foreign Trade 2 2010-01-01 2010-01-01 false Decision Tree 1 Supplement 1 to Part 732 Commerce and Foreign Trade Regulations Relating to Commerce and Foreign Trade (Continued) BUREAU... THE EAR Pt. 732, Supp. 1 Supplement 1 to Part 732—Decision Tree ER06FE04.000 [69 FR 5687, Feb. 6, 2004] ...

  20. Computerized Adaptive Test vs. decision trees: Development of a support decision system to identify suicidal behavior.

    Science.gov (United States)

    Delgado-Gomez, D; Baca-Garcia, E; Aguado, D; Courtet, P; Lopez-Castroman, J

    2016-12-01

    Several Computerized Adaptive Tests (CATs) have been proposed to facilitate assessments in mental health. These tests are built in a standard way, disregarding useful and usually available information not included in the assessment scales that could increase the precision and utility of CATs, such as the history of suicide attempts. Using the items of a previously developed scale for suicidal risk, we compared the performance of a standard CAT and a decision tree in a support decision system to identify suicidal behavior. We included the history of past suicide attempts as a class for the separation of patients in the decision tree. The decision tree needed an average of four items to achieve an accuracy similar to that of a standard CAT with nine items. The accuracy of the decision tree, obtained after 25 cross-validations, was 81.4%. A shortened test adapted for the separation of suicidal and non-suicidal patients was developed. CATs can be very useful tools for the assessment of suicidal risk. However, standard CATs do not use all the information that is available. A decision tree can improve the precision of the assessment since it is constructed using a priori information. Copyright © 2016 Elsevier B.V. All rights reserved.

  1. Sequence Algebra, Sequence Decision Diagrams and Dynamic Fault Trees

    International Nuclear Information System (INIS)

    Rauzy, Antoine B.

    2011-01-01

    A large attention has been focused on the Dynamic Fault Trees in the past few years. By adding new gates to static (regular) Fault Trees, Dynamic Fault Trees aim to take into account dependencies among events. Merle et al. proposed recently an algebraic framework to give a formal interpretation to these gates. In this article, we extend Merle et al.'s work by adopting a slightly different perspective. We introduce Sequence Algebras that can be seen as Algebras of Basic Events, representing failures of non-repairable components. We show how to interpret Dynamic Fault Trees within this framework. Finally, we propose a new data structure to encode sets of sequences of Basic Events: Sequence Decision Diagrams. Sequence Decision Diagrams are very much inspired from Minato's Zero-Suppressed Binary Decision Diagrams. We show that all operations of Sequence Algebras can be performed on this data structure.

  2. A method of building of decision trees based on data from wearable device during a rehabilitation of patients with tibia fractures

    International Nuclear Information System (INIS)

    Kupriyanov, M. S.; Shukeilo, E. Y.; Shichkina, J. A.

    2015-01-01

    Technologies used in traumatology today combine mechanical, electronic, computing and programming tools. The development of mobile applications for prompt processing of data received from medical devices (in particular, wearable devices) and for formulating management decisions is becoming increasingly relevant. This article considers the use of a mathematical method for building decision trees to assess a patient's health condition using data from a wearable device.

  3. A method of building of decision trees based on data from wearable device during a rehabilitation of patients with tibia fractures

    Science.gov (United States)

    Kupriyanov, M. S.; Shukeilo, E. Y.; Shichkina, J. A.

    2015-11-01

    Technologies used in traumatology today combine mechanical, electronic, computing and programming tools. The development of mobile applications for prompt processing of data received from medical devices (in particular, wearable devices) and for formulating management decisions is becoming increasingly relevant. This article considers the use of a mathematical method for building decision trees to assess a patient's health condition using data from a wearable device.

  4. A method of building of decision trees based on data from wearable device during a rehabilitation of patients with tibia fractures

    Energy Technology Data Exchange (ETDEWEB)

    Kupriyanov, M. S., E-mail: mikhail.kupriyanov@gmail.com; Shukeilo, E. Y., E-mail: eyshukeylo@gmail.com; Shichkina, J. A., E-mail: strange.y@mail.ru [Saint Petersburg Electrotechnical University “LETI” (Russian Federation)

    2015-11-17

    Technologies used in traumatology today combine mechanical, electronic, computing and programming tools. The development of mobile applications for prompt processing of data received from medical devices (in particular, wearable devices) and for formulating management decisions is becoming increasingly relevant. This article considers the use of a mathematical method for building decision trees to assess a patient's health condition using data from a wearable device.

  5. Predicting the probability of mortality of gastric cancer patients using decision tree.

    Science.gov (United States)

    Mohammadzadeh, F; Noorkojuri, H; Pourhoseingholi, M A; Saadat, S; Baghestani, A R

    2015-06-01

    Gastric cancer is the fourth most common cancer worldwide, which motivated us to investigate gastric cancer risk factors using statistical methods. The aim of this study was to identify the most important factors influencing the mortality of patients with gastric cancer and to introduce a decision tree classification approach for predicting the probability of mortality from this disease. Data on 216 patients with gastric cancer who were registered in Taleghani hospital in Tehran, Iran, were analyzed. First, patients were divided into two groups: dead and alive. Then, to fit the decision tree model to our data, we randomly selected 20% of the dataset as the test sample and used the remainder as the training sample. Finally, the validity of the model was examined using sensitivity, specificity, diagnostic accuracy and the area under the receiver operating characteristic curve. CART version 6.0 and SPSS version 19.0 were used for the data analysis. Diabetes, ethnicity, tobacco, tumor size, surgery, pathologic stage, age at diagnosis, exposure to chemical weapons and alcohol consumption were identified as factors affecting mortality from gastric cancer. The sensitivity, specificity and accuracy of the decision tree were 0.72, 0.75 and 0.74, respectively. These indices indicate that the decision tree model has acceptable accuracy for predicting the probability of mortality in gastric cancer patients. A simple decision tree built from the factors affecting gastric cancer mortality may therefore help clinicians as a reliable and practical tool to predict the probability of mortality in these patients.
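
    The validity indices named in this abstract can be illustrated with a minimal sketch. It assumes binary labels in which 1 marks the positive class; the helper names (ConfusionCounts, confusion_from_labels) are hypothetical and are not taken from the CART or SPSS software used in the study.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """Cell counts of a 2x2 confusion matrix (hypothetical helper type)."""
    tp: int  # positive cases predicted positive
    fn: int  # positive cases predicted negative
    tn: int  # negative cases predicted negative
    fp: int  # negative cases predicted positive


def confusion_from_labels(y_true: Sequence[int], y_pred: Sequence[int],
                          positive: int = 1) -> ConfusionCounts:
    """Tally the four cells from paired true and predicted labels."""
    tp = fn = tn = fp = 0
    for truth, guess in zip(y_true, y_pred):
        if truth == positive:
            if guess == positive:
                tp += 1
            else:
                fn += 1
        elif guess == positive:
            fp += 1
        else:
            tn += 1
    return ConfusionCounts(tp, fn, tn, fp)


def sensitivity(c: ConfusionCounts) -> float:
    """Share of truly positive cases that the model flags as positive."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Share of truly negative cases that the model flags as negative."""
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """Share of all cases that the model classifies correctly."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)
```

    Applied to the held-out test sample, these three functions would give the kind of figures reported above; the sketch assumes that both classes occur in the sample, since otherwise a denominator is zero.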

  6. An ordering heuristic for building Binary Decision Diagrams for fault-trees

    Energy Technology Data Exchange (ETDEWEB)

    Bouissou, M. [Electricite de France (EDF), 75 - Paris (France)

    1997-12-31

    Binary Decision Diagrams (BDD) have recently made a noticeable entry in the RAMS field. This kind of representation for boolean functions makes possible the assessment of complex fault-trees, both qualitatively (minimal cut-sets search) and quantitatively (exact calculation of top event probability). The object of the paper is to present a pre-processing of the fault-tree which ensures that the results given by different heuristics on the 'optimized' fault-tree are not too sensitive to the way the tree is written. This property is based on a theoretical proof. In contrast with some well known heuristics, the method proposed is not based only on intuition and practical experiments. (author) 12 refs.
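
    The exact calculation of a top event probability mentioned here rests on Shannon expansion, the decomposition that a BDD stores compactly. The sketch below applies that expansion directly to a tiny fault tree without sharing or reducing nodes, so it is exponential in the number of basic events and is not a BDD package; it also does not implement the ordering heuristic of the paper. The tuple encoding of gates and the event names in the usage note are assumptions made for illustration, and basic events are assumed independent.

```python
from typing import Dict, Optional, Sequence, Tuple, Union

# A fault tree is a basic event name, or a tuple ("and" | "or", child, child, ...).
FaultTree = Union[str, Tuple]


def evaluate(tree: FaultTree, assignment: Dict[str, bool]) -> bool:
    """Return True when the (sub)tree fails under a full truth assignment."""
    if isinstance(tree, str):
        return assignment[tree]
    op, *children = tree
    results = [evaluate(child, assignment) for child in children]
    if op == "and":
        return all(results)
    if op == "or":
        return any(results)
    raise ValueError(f"unknown gate type: {op!r}")


def top_event_probability(tree: FaultTree,
                          probs: Dict[str, float],
                          order: Sequence[str],
                          assignment: Optional[Dict[str, bool]] = None) -> float:
    """Shannon expansion: P(f) = p_x * P(f | x fails) + (1 - p_x) * P(f | x works).

    Variables are expanded in the given order; the result does not depend on
    the order, although the size of a real BDD would.
    """
    assignment = dict(assignment or {})
    if len(assignment) == len(order):
        return 1.0 if evaluate(tree, assignment) else 0.0
    var = order[len(assignment)]
    p = probs[var]
    high = top_event_probability(tree, probs, order, {**assignment, var: True})
    low = top_event_probability(tree, probs, order, {**assignment, var: False})
    return p * high + (1.0 - p) * low
```

    For a hypothetical tree ("or", ("and", "pump_a", "pump_b"), "valve") with failure probabilities 0.1, 0.2 and 0.05, the function returns 0.1 * (0.2 + 0.8 * 0.05) + 0.9 * 0.05 = 0.069 for any variable order.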

  7. MRI-based decision tree model for diagnosis of biliary atresia.

    Science.gov (United States)

    Kim, Yong Hee; Kim, Myung-Joon; Shin, Hyun Joo; Yoon, Haesung; Han, Seok Joo; Koh, Hong; Roh, Yun Ho; Lee, Mi-Jung

    2018-02-23

    To evaluate MRI findings and to generate a decision tree model for diagnosis of biliary atresia (BA) in infants with jaundice. We retrospectively reviewed features of MRI and ultrasonography (US) performed in infants with jaundice between January 2009 and June 2016 under approval of the institutional review board, including the maximum diameter of periportal signal change on MRI (MR triangular cord thickness, MR-TCT) or US (US-TCT), visibility of common bile duct (CBD) and abnormality of gallbladder (GB). Hepatic subcapsular flow was reviewed on Doppler US. We performed conditional inference tree analysis using MRI findings to generate a decision tree model. A total of 208 infants were included, 112 in the BA group and 96 in the non-BA group. Mean age at the time of MRI was 58.7 ± 36.6 days. Visibility of CBD, abnormality of GB and MR-TCT were good discriminators for the diagnosis of BA and the MRI-based decision tree using these findings with MR-TCT cut-off 5.1 mm showed 97.3 % sensitivity, 94.8 % specificity and 96.2 % accuracy. MRI-based decision tree model reliably differentiates BA in infants with jaundice. MRI can be an objective imaging modality for the diagnosis of BA. • MRI-based decision tree model reliably differentiates biliary atresia in neonatal cholestasis. • Common bile duct, gallbladder and periportal signal changes are the discriminators. • MRI has comparable performance to ultrasonography for diagnosis of biliary atresia.

  8. RE-Powering’s Electronic Decision Tree

    Science.gov (United States)

    Developed by US EPA's RE-Powering America's Land Initiative, the RE-Powering Decision Trees tool guides interested parties through a process to screen sites for their suitability for solar photovoltaics or wind installations

  9. A Decision Tree for Nonmetric Sex Assessment from the Skull.

    Science.gov (United States)

    Langley, Natalie R; Dudzik, Beatrix; Cloutier, Alesia

    2018-01-01

    This study uses five well-documented cranial nonmetric traits (glabella, mastoid process, mental eminence, supraorbital margin, and nuchal crest) and one additional trait (zygomatic extension) to develop a validated decision tree for sex assessment. The decision tree was built and cross-validated on a sample of 293 U.S. White individuals from the William M. Bass Donated Skeletal Collection. Ordinal scores from the six traits were analyzed using the partition modeling option in JMP Pro 12. A holdout sample of 50 skulls was used to test the model. The most accurate decision tree includes three variables: glabella, zygomatic extension, and mastoid process. This decision tree yielded 93.5% accuracy on the training sample, 94% on the cross-validated sample, and 96% on a holdout validation sample. Linear weighted kappa statistics indicate acceptable agreement among observers for these variables. Mental eminence should be avoided, and definitions and figures should be referenced carefully to score nonmetric traits. © 2017 American Academy of Forensic Sciences.

  10. Classifying dysmorphic syndromes by using artificial neural network based hierarchical decision tree.

    Science.gov (United States)

    Özdemir, Merve Erkınay; Telatar, Ziya; Eroğul, Osman; Tunca, Yusuf

    2018-05-01

    Dysmorphic syndromes have different facial malformations. These malformations are significant for the early diagnosis of dysmorphic syndromes and contain distinctive information for face recognition. In this study we define the characteristic features of each syndrome by considering facial malformations and automatically classify Fragile X, Hurler, Prader Willi, Down and Wolf Hirschhorn syndromes and healthy groups. Reference points are marked on the face images, and ratios between the points' distances are used as features. We suggest a neural network based hierarchical decision tree structure in order to classify the syndrome types. We also implement k-nearest neighbor (k-NN) and artificial neural network (ANN) classifiers to compare classification accuracy with our hierarchical decision tree. The classification accuracy is 50, 73 and 86.7% with k-NN, ANN and hierarchical decision tree methods, respectively. The same images were then shown to a clinical expert, who achieved a recognition rate of 46.7%. We develop an efficient system that recognizes different syndrome types automatically and with high accuracy from simple, non-invasive imaging data, independent of the patient's age, sex and race. The promising results indicate that our method can be used for pre-diagnosis of the dysmorphic syndromes by clinical experts.

  11. Development of a diagnostic decision tree for obstructive pulmonary diseases based on real-life data

    Directory of Open Access Journals (Sweden)

    Esther I. Metting

    2016-01-01

    The aim of this study was to develop and explore the diagnostic accuracy of a decision tree derived from a large real-life primary care population. Data from 9297 primary care patients (45% male, mean age 53±17 years) with suspicion of an obstructive pulmonary disease was derived from an asthma/chronic obstructive pulmonary disease (COPD) service where patients were assessed using spirometry, the Asthma Control Questionnaire, the Clinical COPD Questionnaire, history data and medication use. All patients were diagnosed through the Internet by a pulmonologist. The Chi-squared Automatic Interaction Detection method was used to build the decision tree. The tree was externally validated in another real-life primary care population (n=3215). Our tree correctly diagnosed 79% of the asthma patients, 85% of the COPD patients and 32% of the asthma–COPD overlap syndrome (ACOS) patients. External validation showed a comparable pattern (correct: asthma 78%, COPD 83%, ACOS 24%). Our decision tree is considered to be promising because it was based on real-life primary care patients with a specialist's diagnosis. In most patients the diagnosis could be correctly predicted. Predicting ACOS, however, remained a challenge. The total decision tree can be implemented in computer-assisted diagnostic systems for individual patients. A simplified version of this tree can be used in daily clinical practice as a desk tool.

  12. Development of a diagnostic decision tree for obstructive pulmonary diseases based on real-life data

    Science.gov (United States)

    in ’t Veen, Johannes C.C.M.; Dekhuijzen, P.N. Richard; van Heijst, Ellen; Kocks, Janwillem W.H.; Muilwijk-Kroes, Jacqueline B.; Chavannes, Niels H.; van der Molen, Thys

    2016-01-01

    The aim of this study was to develop and explore the diagnostic accuracy of a decision tree derived from a large real-life primary care population. Data from 9297 primary care patients (45% male, mean age 53±17 years) with suspicion of an obstructive pulmonary disease was derived from an asthma/chronic obstructive pulmonary disease (COPD) service where patients were assessed using spirometry, the Asthma Control Questionnaire, the Clinical COPD Questionnaire, history data and medication use. All patients were diagnosed through the Internet by a pulmonologist. The Chi-squared Automatic Interaction Detection method was used to build the decision tree. The tree was externally validated in another real-life primary care population (n=3215). Our tree correctly diagnosed 79% of the asthma patients, 85% of the COPD patients and 32% of the asthma–COPD overlap syndrome (ACOS) patients. External validation showed a comparable pattern (correct: asthma 78%, COPD 83%, ACOS 24%). Our decision tree is considered to be promising because it was based on real-life primary care patients with a specialist's diagnosis. In most patients the diagnosis could be correctly predicted. Predicting ACOS, however, remained a challenge. The total decision tree can be implemented in computer-assisted diagnostic systems for individual patients. A simplified version of this tree can be used in daily clinical practice as a desk tool. PMID:27730177
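
    The Chi-squared Automatic Interaction Detection (CHAID) method named above chooses splits by testing the association between each predictor and the target. The sketch below shows only the core Pearson chi-squared statistic on a contingency table and a naive choice of the most strongly associated predictor. It is a simplification under stated assumptions: full CHAID also merges predictor categories and compares Bonferroni-adjusted p-values rather than raw statistics, and the predictor names in the usage note are hypothetical.

```python
from typing import Dict, List

Table = List[List[int]]  # rows: predictor categories, columns: target classes


def chi_squared(table: Table) -> float:
    """Pearson chi-squared statistic; assumes no empty row or column."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if predictor and target were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic


def strongest_association(tables: Dict[str, Table]) -> str:
    """Name of the predictor with the largest statistic.

    Only meaningful when all tables have the same shape (same degrees of
    freedom); otherwise p-values would have to be compared instead.
    """
    return max(tables, key=lambda name: chi_squared(tables[name]))
```

    With two hypothetical 2x2 tables, {"smoker": [[10, 20], [20, 10]], "sex": [[10, 10], [20, 20]]}, the statistic is 20/3 for the first and 0 for the second, so the first predictor would be chosen for the split.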

  13. A simple component-connection method for building binary decision diagrams encoding a fault tree

    International Nuclear Information System (INIS)

    Way, Y.-S.; Hsia, D.-Y.

    2000-01-01

    A simple new method for building binary decision diagrams (BDDs) encoding a fault tree (FT) is provided in this study. We first decompose the FT into FT-components, each of which is a single descendant (SD) gate-sequence. Following the node-connection rule, the BDD-component encoding each SD FT-component is found to be an SD node-sequence. By successively connecting the BDD-components one by one, the BDD for the entire FT is obtained. During node-connection and component-connection, reduction rules might need to be applied. An example FT is used throughout the article to explain the procedure step by step. Our proposed method is a hybrid one for FT analysis. Some algorithms or techniques used in conventional FT analysis or the newer BDD approach may be applied to our case, and the ideas presented in the article might in turn be used by those two methods.

  14. Circum-Arctic petroleum systems identified using decision-tree chemometrics

    Science.gov (United States)

    Peters, K.E.; Ramos, L.S.; Zumberge, J.E.; Valin, Z.C.; Scotese, C.R.; Gautier, D.L.

    2007-01-01

    Source- and age-related biomarker and isotopic data were measured for more than 1000 crude oil samples from wells and seeps collected above approximately 55°N latitude. A unique, multitiered chemometric (multivariate statistical) decision tree was created that allowed automated classification of 31 genetically distinct circum-Arctic oil families based on a training set of 622 oil samples. The method, which we call decision-tree chemometrics, uses principal components analysis and multiple tiers of K-nearest neighbor and SIMCA (soft independent modeling of class analogy) models to classify and assign confidence limits for newly acquired oil samples and source rock extracts. Geochemical data for each oil sample were also used to infer the age, lithology, organic matter input, depositional environment, and identity of its source rock. These results demonstrate the value of large petroleum databases where all samples were analyzed using the same procedures and instrumentation. Copyright © 2007. The American Association of Petroleum Geologists. All rights reserved.

  15. PRIA 3 Fee Determination Decision Tree

    Science.gov (United States)

    The PRIA 3 decision tree will help applicants requesting a pesticide registration or certain tolerance action to accurately identify the category of their application and the amount of the required fee before they submit the application.

  16. The decision tree classifier - Design and potential. [for Landsat-1 data]

    Science.gov (United States)

    Hauska, H.; Swain, P. H.

    1975-01-01

    A new classifier has been developed for the computerized analysis of remote sensor data. The decision tree classifier is essentially a maximum likelihood classifier using multistage decision logic. It is characterized by the fact that an unknown sample can be classified into a class using one or several decision functions in a successive manner. The classifier is applied to the analysis of data sensed by Landsat-1 over Kenosha Pass, Colorado. The classifier is illustrated by a tree diagram which for processing purposes is encoded as a string of symbols such that there is a unique one-to-one relationship between string and decision tree.
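
    The one-to-one relationship between a decision tree and a string of symbols can be illustrated with a round-trip encoder. The encoding below (a node label followed by its children in parentheses) is an assumption chosen for clarity, not the symbol string used by the authors, and the Node type and the labels in the usage note are hypothetical. Labels must not contain parentheses or commas.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """Decision function name (internal node) or class name (leaf)."""
    label: str
    children: List["Node"] = field(default_factory=list)


def encode(node: Node) -> str:
    """Preorder encoding: label, then children in parentheses."""
    if not node.children:
        return node.label
    return node.label + "(" + ",".join(encode(c) for c in node.children) + ")"


def decode(text: str) -> Node:
    """Inverse of encode; rejects trailing or unbalanced input."""
    node, pos = _parse(text, 0)
    if pos != len(text):
        raise ValueError("unexpected trailing characters")
    return node


def _parse(text: str, pos: int) -> Tuple[Node, int]:
    start = pos
    while pos < len(text) and text[pos] not in "(),":
        pos += 1
    node = Node(text[start:pos])
    if pos < len(text) and text[pos] == "(":
        pos += 1  # skip the opening parenthesis
        while True:
            child, pos = _parse(text, pos)
            node.children.append(child)
            if pos >= len(text):
                raise ValueError("missing closing parenthesis")
            if text[pos] == ",":
                pos += 1  # another sibling follows
            elif text[pos] == ")":
                pos += 1  # this node is complete
                break
            else:
                raise ValueError("expected ',' or ')'")
    return node, pos
```

    A two-stage tree such as Node("f1", [Node("water"), Node("f2", [Node("forest"), Node("grass")])]) encodes to "f1(water,f2(forest,grass))", and decoding that string rebuilds an equal tree, which is the uniqueness property the abstract describes.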

  17. Influence diagrams and decision trees for severe accident management

    Energy Technology Data Exchange (ETDEWEB)

    Goetz, W.W.J.

    1996-09-01

    A review of relevant methodologies based on Influence Diagrams (IDs), Decision Trees (DTs), and Containment Event Trees (CETs) was conducted to assess the practicality of these methods for the selection of effective strategies for Severe Accident Management (SAM). The review included an evaluation of some software packages for these methods. The emphasis was on possible pitfalls of using IDs and on practical aspects, the latter by performance of a case study that was based on an existing Level 2 Probabilistic Safety Assessment (PSA). The study showed that the use of a combined ID/DT model has advantages over CET models, in particular when conservatisms in the Level 2 PSA have been identified and replaced by fair assessments of the uncertainties involved. It is recommended to use ID/DT models complementary to CET models. (orig.).

  18. Influence diagrams and decision trees for severe accident management

    International Nuclear Information System (INIS)

    Goetz, W.W.J.; Seebregts, A.J.; Bedford, T.J.

    1996-08-01

    A review of relevant methodologies based on Influence Diagrams (IDs), Decision Trees (DTs), and Containment Event Trees (CETs) was conducted to assess the practicality of these methods for the selection of effective strategies for Severe Accident Management (SAM). The review included an evaluation of some software packages for these methods. The emphasis was on possible pitfalls of using IDs and on practical aspects, the latter by performance of a case study that was based on an existing Level 2 Probabilistic Safety Assessment (PSA). The study showed that the use of a combined ID/DT model has advantages over CET models, in particular when conservatisms in the Level 2 PSA have been identified and replaced by fair assessments of the uncertainties involved. It is recommended to use ID/DT models as complementary to CET models. (orig.)

  19. Influence diagrams and decision trees for severe accident management

    International Nuclear Information System (INIS)

    Goetz, W.W.J.

    1996-09-01

    A review of relevant methodologies based on Influence Diagrams (IDs), Decision Trees (DTs), and Containment Event Trees (CETs) was conducted to assess the practicality of these methods for the selection of effective strategies for Severe Accident Management (SAM). The review included an evaluation of some software packages for these methods. The emphasis was on possible pitfalls of using IDs and on practical aspects, the latter by performance of a case study that was based on an existing Level 2 Probabilistic Safety Assessment (PSA). The study showed that the use of a combined ID/DT model has advantages over CET models, in particular when conservatisms in the Level 2 PSA have been identified and replaced by fair assessments of the uncertainties involved. It is recommended to use ID/DT models complementary to CET models. (orig.)

  20. Total Path Length and Number of Terminal Nodes for Decision Trees

    KAUST Repository

    Hussain, Shahid

    2014-01-01

    This paper presents a new tool for study of relationships between total path length (average depth) and number of terminal nodes for decision trees. These relationships are important from the point of view of optimization of decision trees
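
    The two quantities related in this abstract can be computed for a small tree with the sketch below. It assumes a simplified definition in which each terminal node is reached by exactly one input, so the total path length is the sum of the depths of the terminal nodes and the average depth is that sum divided by their number. The nested-tuple representation is hypothetical and is not the data structure of the tool described in the paper.

```python
from typing import Dict, List, Tuple, Union

# A leaf is a decision (str); an internal node is a tuple of subtrees.
Tree = Union[str, Tuple]


def leaf_depths(tree: Tree, depth: int = 0) -> List[int]:
    """Depth of every terminal node, listed left to right."""
    if isinstance(tree, str):
        return [depth]
    depths: List[int] = []
    for child in tree:
        depths.extend(leaf_depths(child, depth + 1))
    return depths


def summarize(tree: Tree) -> Dict[str, float]:
    """Number of terminal nodes, total path length, depth and average depth."""
    depths = leaf_depths(tree)
    return {
        "terminal_nodes": len(depths),
        "total_path_length": sum(depths),
        "depth": max(depths),
        "average_depth": sum(depths) / len(depths),
    }
```

    For example, the chain-shaped tree ("a", ("b", ("c", "d"))) and the balanced tree (("a", "b"), ("c", "d")) both have four terminal nodes, but their total path lengths are 9 and 8 and their depths are 3 and 2, which is the kind of trade-off such a tool is meant to explore.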

  1. A tool for study of optimal decision trees

    KAUST Repository

    Alkhalid, Abdulaziz; Chikalov, Igor; Moshkov, Mikhail

    2010-01-01

    The paper describes a tool that allows us, for relatively small decision tables, to perform consecutive optimization of decision trees relative to various complexity measures such as number of nodes, average depth, and depth, and to find parameters

  2. Creating ensembles of oblique decision trees with evolutionary algorithms and sampling

    Science.gov (United States)

    Cantu-Paz, Erick [Oakland, CA; Kamath, Chandrika [Tracy, CA

    2006-06-13

    A decision tree system that is part of a parallel object-oriented pattern recognition system, which in turn is part of an object oriented data mining system. A decision tree process includes the step of reading the data. If necessary, the data is sorted. A potential split of the data is evaluated according to some criterion. An initial split of the data is determined. The final split of the data is determined using evolutionary algorithms and statistical sampling techniques. The data is split. Multiple decision trees are combined in ensembles.
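
    An oblique split tests a linear combination of features rather than a single feature, and the abstract states that the final split is found with evolutionary algorithms. The sketch below is a deliberately small stand-in: it scores a split by weighted Gini impurity and improves it with a (1+1) evolution strategy using Gaussian mutation. The choice of Gini impurity, the mutation scheme and all parameter values are assumptions for illustration; the sampling step and the ensemble construction of the described system are not reproduced.

```python
import random
from typing import List, Sequence, Tuple


def gini(labels: Sequence[int]) -> float:
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def split_impurity(weights: Sequence[float],
                   X: Sequence[Sequence[float]],
                   y: Sequence[int]) -> float:
    """Weighted impurity of the split w.x + bias <= 0; the bias is weights[-1]."""
    left, right = [], []
    for x, label in zip(X, y):
        # zip stops at the last feature, so the bias is added separately.
        side = sum(w * v for w, v in zip(weights, x)) + weights[-1]
        (left if side <= 0 else right).append(label)
    n = len(y)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


def evolve_split(X: Sequence[Sequence[float]], y: Sequence[int],
                 generations: int = 200, sigma: float = 0.5,
                 seed: int = 0) -> Tuple[List[float], float]:
    """(1+1) evolution strategy: keep a mutated hyperplane if it is no worse."""
    rng = random.Random(seed)
    best = [rng.uniform(-1.0, 1.0) for _ in range(len(X[0]) + 1)]
    best_score = split_impurity(best, X, y)
    for _ in range(generations):
        child = [w + rng.gauss(0.0, sigma) for w in best]
        score = split_impurity(child, X, y)
        if score <= best_score:
            best, best_score = child, score
    return best, best_score
```

    On a toy set where the classes are separated by the line x1 + x2 = 3, the hand-written hyperplane (1, 1, -3) has impurity 0; the evolved split is not guaranteed to reach that value, only to be no worse than its random starting point.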

  3. Learning in data-limited multimodal scenarios: Scandent decision forests and tree-based features.

    Science.gov (United States)

    Hor, Soheil; Moradi, Mehdi

    2016-12-01

    Incomplete and inconsistent datasets often pose difficulties in multimodal studies. We introduce the concept of scandent decision trees to tackle these difficulties. Scandent trees are decision trees that optimally mimic the partitioning of the data determined by another decision tree, and crucially, use only a subset of the feature set. We show how scandent trees can be used to enhance the performance of decision forests trained on a small number of multimodal samples when we have access to larger datasets with vastly incomplete feature sets. Additionally, we introduce the concept of tree-based feature transforms in the decision forest paradigm. When combined with scandent trees, the tree-based feature transforms enable us to train a classifier on a rich multimodal dataset, and use it to classify samples with only a subset of features of the training data. Using this methodology, we build a model trained on MRI and PET images of the ADNI dataset, and then test it on cases with only MRI data. We show that this is significantly more effective in staging of cognitive impairments compared to a similar decision forest model trained and tested on MRI only, or one that uses other kinds of feature transform applied to the MRI data. Copyright © 2016. Published by Elsevier B.V.

  4. Evaluation of Decision Trees for Cloud Detection from AVHRR Data

    Science.gov (United States)

    Shiffman, Smadar; Nemani, Ramakrishna

    2005-01-01

    Automated cloud detection and tracking is an important step in assessing changes in radiation budgets associated with global climate change via remote sensing. Data products based on satellite imagery are available to the scientific community for studying trends in the Earth's atmosphere. The data products include pixel-based cloud masks that assign cloud-cover classifications to pixels. Many cloud-mask algorithms have the form of decision trees. The decision trees employ sequential tests that scientists designed based on empirical astrophysics studies and simulations. Limitations of existing cloud masks restrict our ability to accurately track changes in cloud patterns over time. In a previous study we compared automatically learned decision trees to cloud masks included in Advanced Very High Resolution Radiometer (AVHRR) data products from the year 2000. In this paper we report the replication of the study for five-year data, and for a gold standard based on surface observations performed by scientists at weather stations in the British Islands. For our sample data, the accuracy of automatically learned decision trees was greater than the accuracy of the cloud masks (p < 0.001).

  5. Decision Trees for Helpdesk Advisor Graphs

    OpenAIRE

    Gkezerlis, Spyros; Kalles, Dimitris

    2017-01-01

    We use decision trees to build a helpdesk agent reference network to facilitate the on-the-job advising of junior or less experienced staff on how to better address telecommunication customer fault reports. Such reports generate field measurements and remote measurements which, when coupled with location data and client attributes, and fused with organization-level statistics, can produce models of how support should be provided. Beyond decision support, these models can help identify staff w...

  6. The application of a decision tree to establish the parameters associated with hypertension.

    Science.gov (United States)

    Tayefi, Maryam; Esmaeili, Habibollah; Saberi Karimian, Maryam; Amirabadi Zadeh, Alireza; Ebrahimi, Mahmoud; Safarian, Mohammad; Nematy, Mohsen; Parizadeh, Seyed Mohammad Reza; Ferns, Gordon A; Ghayour-Mobarhan, Majid

    2017-02-01

    Hypertension is an important risk factor for cardiovascular disease (CVD). The goal of this study was to establish the factors associated with hypertension by using a decision-tree algorithm as a supervised classification method of data mining. Data from a cross-sectional study were used in this study. A total of 9078 subjects who met the inclusion criteria were recruited. 70% of these subjects (6358 cases) were randomly allocated to the training dataset for constructing the decision tree. The remaining 30% (2720 cases) were used as the testing dataset to evaluate the performance of the decision tree. Two models were evaluated in this study. In model I, age, gender, body mass index, marital status, level of education, occupation status, depression and anxiety status, physical activity level, smoking status, LDL, TG, TC, FBG, uric acid and hs-CRP were considered as input variables, and in model II, age, gender, WBC, RBC, HGB, HCT, MCV, MCH, PLT, RDW and PDW were considered as input variables. The validation of the model was assessed by constructing a receiver operating characteristic (ROC) curve. The prevalence rate of hypertension was 32% in our population. For the decision-tree model I, the accuracy, sensitivity, specificity and area under the ROC curve (AUC) value for identifying the related risk factors of hypertension were 73%, 63%, 77% and 0.72, respectively. The corresponding values for model II were 70%, 61%, 74% and 0.68, respectively. We have developed a decision tree model to identify the risk factors associated with hypertension that may be used to develop programs for hypertension management. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
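
    The area under the ROC curve (AUC) reported for both models has a simple rank interpretation: it is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it that way by comparing all pairs. It assumes binary labels (1 for positive, 0 for negative) and a per-case score such as the class proportion in the leaf a case falls into; the function name is hypothetical.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of positive/negative pairs ranked correctly.

    Quadratic in the number of cases, which is fine for a sketch; a
    rank-based formula would be used for large samples.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # pair ranked correctly
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))
```

    With labels [1, 1, 0, 0] and scores [0.9, 0.4, 0.5, 0.1], three of the four positive/negative pairs are ordered correctly, so the function returns 0.75; a value of 0.5 corresponds to random ranking.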

  7. The value of decision tree analysis in planning anaesthetic care in obstetrics.

    Science.gov (United States)

    Bamber, J H; Evans, S A

    2016-08-01

    The use of decision tree analysis is discussed in the context of the anaesthetic and obstetric management of a young pregnant woman with joint hypermobility syndrome with a history of insensitivity to local anaesthesia and a previous difficult intubation due to a tongue tumour. The multidisciplinary clinical decision process resulted in the woman being delivered without complication by elective caesarean section under general anaesthesia after an awake fibreoptic intubation. The decision process used is reviewed and compared retrospectively to a decision tree analytical approach. The benefits and limitations of using decision tree analysis are reviewed and its application in obstetric anaesthesia is discussed. Copyright © 2016 Elsevier Ltd. All rights reserved.

  8. Data Fusion Research of Triaxial Human Body Motion Gesture based on Decision Tree

    Directory of Open Access Journals (Sweden)

    Feihong Zhou

    2014-05-01

    The development status of human body motion gesture data fusion, both domestic and overseas, is analyzed. A triaxial accelerometer is adopted to develop a wearable human body motion gesture monitoring system aimed at healthcare for older people. Following a brief introduction to the decision tree algorithm, the WEKA workbench is used to generate a human body motion gesture decision tree. Finally, the classification quality of the decision tree is validated through experiments. The experimental results show that the decision tree algorithm can reach an average prediction accuracy of 97.5 % with lower time cost.

  9. The Decision Tree for Teaching Management of Uncertainty

    Science.gov (United States)

    Knaggs, Sara J.; And Others

    1974-01-01

    A 'decision tree' consists of an outline of the patient's symptoms and a logic for decision and action. It is felt that this approach to the decisionmaking process better facilitates each learner's application of his own level of knowledge and skills. (Author)

  10. Three-dimensional object recognition using similar triangles and decision trees

    Science.gov (United States)

    Spirkovska, Lilly

    1993-01-01

    A system, TRIDEC, that is capable of distinguishing between a set of objects despite changes in the objects' positions in the input field, their size, or their rotational orientation in 3D space is described. TRIDEC combines very simple yet effective features with the classification capabilities of inductive decision tree methods. The feature vector is a list of all similar triangles defined by connecting all combinations of three pixels in a coarse coded 127 x 127 pixel input field. The classification is accomplished by building a decision tree using the information provided from a limited number of translated, scaled, and rotated samples. Simulation results are presented which show that TRIDEC achieves 94 percent recognition accuracy in the 2D invariant object recognition domain and 98 percent recognition accuracy in the 3D invariant object recognition domain after training on only a small sample of transformed views of the objects.
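
    The invariance claimed for similar-triangle features follows from the fact that the interior angles of a triangle do not change under translation, uniform scaling or in-plane rotation. The sketch below computes the sorted angles of the triangle formed by three points and compares two triangles for similarity. It is an illustration of the feature idea only, under the assumption of non-degenerate triangles with distinct vertices; it does not reproduce TRIDEC's coarse coding or its feature vector, and the function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def triangle_angles(p: Point, q: Point, r: Point) -> Tuple[float, float, float]:
    """Interior angles in degrees, sorted ascending (law of cosines)."""
    a = math.dist(q, r)  # side opposite p
    b = math.dist(p, r)  # side opposite q
    c = math.dist(p, q)  # side opposite r

    def angle(opposite: float, s1: float, s2: float) -> float:
        cos_value = (s1 * s1 + s2 * s2 - opposite * opposite) / (2.0 * s1 * s2)
        # Clamp to guard against rounding slightly outside [-1, 1].
        return math.degrees(math.acos(max(-1.0, min(1.0, cos_value))))

    return tuple(sorted((angle(a, b, c), angle(b, a, c), angle(c, a, b))))


def are_similar(t1: Sequence[Point], t2: Sequence[Point],
                tolerance_deg: float = 1e-6) -> bool:
    """Two triangles are similar when their sorted angles agree."""
    return all(abs(x - y) <= tolerance_deg
               for x, y in zip(triangle_angles(*t1), triangle_angles(*t2)))
```

    The right triangle (0, 0), (4, 0), (0, 3) and its image after doubling, rotating by 90 degrees and shifting, (10, -5), (10, 3), (4, -5), yield the same angle triple, so a classifier that sees only such triples cannot tell the two views apart.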

  11. [Analysis of the characteristics of the older adults with depression using data mining decision tree analysis].

    Science.gov (United States)

    Park, Myonghwa; Choi, Sora; Shin, A Mi; Koo, Chul Hoi

    2013-02-01

    The purpose of this study was to develop a prediction model for the characteristics of older adults with depression using the decision tree method. A large dataset from the 2008 Korean Elderly Survey was used and data of 14,970 elderly people were analyzed. Target variable was depression and 53 input variables were general characteristics, family & social relationship, economic status, health status, health behavior, functional status, leisure & social activity, quality of life, and living environment. Data were analyzed by decision tree analysis, a data mining technique using SPSS Window 19.0 and Clementine 12.0 programs. The decision trees were classified into five different rules to define the characteristics of older adults with depression. Classification & Regression Tree (C&RT) showed the best prediction with an accuracy of 80.81% among data mining models. Factors in the rules were life satisfaction, nutritional status, daily activity difficulty due to pain, functional limitation for basic or instrumental daily activities, number of chronic diseases and daily activity difficulty due to disease. The different rules classified by the decision tree model in this study should contribute as baseline data for discovering informative knowledge and developing interventions tailored to these individual characteristics.

  12. Totally Optimal Decision Trees for Monotone Boolean Functions with at Most Five Variables

    KAUST Repository

    Chikalov, Igor

    2013-01-01

    In this paper, we present the empirical results for relationships between time (depth) and space (number of nodes) complexity of decision trees computing monotone Boolean functions, with at most five variables. We use Dagger (a tool for optimization of decision trees and decision rules) to conduct experiments. We show that, for each monotone Boolean function with at most five variables, there exists a totally optimal decision tree which is optimal with respect to both depth and number of nodes.

  13. Evaluation of the improvement of the double-contrast radiographic image quality in the prone position brought about by the use of a decision tree in the screening examination

    International Nuclear Information System (INIS)

    Yamada, Yasuji; Nakamura, Syunichi; Ohno, Ryuichi; Azuma, Hiroshi; Fujinaga, Akira; Nagai, Makoto

    2009-01-01

    We designed a decision tree in order to improve the double-contrast radiographic image quality in the prone position and compensate for the disparity of technique among radiological technologists. We evaluated 391 consecutive individuals who underwent medical checkups at our hospital. Three decision trees, i.e., Tree 1, Tree 2 and Tree 3, were constructed based on the axis and contortion of the stomach with the use of a prone filling image, and then the insertion site of the compression pillow was altered. The image quality at the gastric angulus, the gastric body and the antrum was evaluated based on our original numeric scale, and was compared between the previous method and the present method which employs a decision tree. The image quality improved significantly with the present method employing a decision tree compared with the previous method, for each decision tree: from 90 points to 100 points for Tree 1, from 70 points to 95 points for Tree 2, and from 39.5 points to 85.7 points for Tree 3. These results indicate that our original procedure employing a decision tree improves the radiographic image quality in the prone position and compensates for the disparity of technique among radiological technologists. Therefore, the present method may be expected to serve as the standard procedure of double-contrast radiography in the prone position. (author)

  14. Total Path Length and Number of Terminal Nodes for Decision Trees

    KAUST Repository

    Hussain, Shahid

    2014-09-13

    This paper presents a new tool for the study of relationships between total path length (average depth) and number of terminal nodes for decision trees. These relationships are important from the point of view of optimization of decision trees. In this particular case of total path length and number of terminal nodes, the relationships between these two cost functions are closely related to the space-time trade-off. In addition to an algorithm to compute the relationships, the paper also presents results of experiments with datasets from the UCI ML Repository. These experiments show how the two cost functions behave for a given decision table, and the resulting plots show the Pareto frontier or Pareto set of optimal points. Furthermore, in some cases this Pareto frontier is a singleton, showing the total optimality of decision trees for the given decision table.
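
    The Pareto frontier mentioned in this abstract can be illustrated with a minimal sketch. It assumes each candidate tree is summarized as a hypothetical pair (total path length, number of terminal nodes) and that both costs are minimized; the function name and the sample values are illustrative and are not taken from the paper or its tool.

```python
from typing import List, Tuple

Cost = Tuple[float, float]  # (total path length, number of terminal nodes)


def pareto_front(points: List[Cost]) -> List[Cost]:
    """Return the points that no other point dominates (both costs minimized)."""
    front: List[Cost] = []
    best_second = float("inf")
    # After sorting by the first cost, a point is on the frontier only if it
    # strictly improves the best second cost seen so far.
    for point in sorted(set(points)):
        if point[1] < best_second:
            front.append(point)
            best_second = point[1]
    return front


# Hypothetical cost pairs for five candidate trees (not data from the paper).
candidates = [(10, 6), (12, 5), (11, 7), (14, 5), (15, 4)]
print(pareto_front(candidates))

# A singleton frontier corresponds to a "totally optimal" tree.
print(pareto_front([(10, 4), (12, 5), (11, 4)]))
```

    When the frontier contains a single point, one tree is optimal for both cost functions at once, which matches the singleton case described in the abstract.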

  15. Boundary expansion algorithm of a decision tree induction for an imbalanced dataset

    Directory of Open Access Journals (Sweden)

    Kesinee Boonchuay

    2017-10-01

    A decision tree is one of the best-known classifiers based on a recursive partitioning algorithm. This paper introduces the Boundary Expansion Algorithm (BEA) to improve decision tree induction for an imbalanced dataset. BEA utilizes all attributes to define non-splittable ranges. The computed means of all attributes for minority instances are used to find the nearest minority instance, which will be expanded along all attributes to cover a minority region. As a result, BEA can successfully cope with an imbalanced dataset compared with C4.5, Gini, asymmetric entropy, top-down tree, and Hellinger distance decision tree on 25 imbalanced datasets from the UCI Repository.

  16. Evaluation with Decision Trees of Efficacy and Safety of Semirigid Ureteroscopy in the Treatment of Proximal Ureteral Calculi.

    Science.gov (United States)

    Sancak, Eyup Burak; Kılınç, Muhammet Fatih; Yücebaş, Sait Can

    2017-01-01

    The decision on the choice of proximal ureteral stone therapy depends on many factors, and sometimes urologists have difficulty in choosing the treatment option. This study is aimed at evaluating the factors affecting the success of semirigid ureterorenoscopy (URS) using the "decision tree" method. From January 2005 to November 2015, the data of consecutive patients treated for proximal ureteral stone were retrospectively analyzed. A total of 920 patients with proximal ureteral stone treated with semirigid URS were included in the study. All statistically significant attributes were tested using the decision tree method. The model created using decision tree had a sensitivity of 0.993 and an accuracy of 0.857. While URS treatment was successful in 752 patients (81.7%), it was unsuccessful in 168 patients (18.3%). According to the decision tree method, the most important factor affecting the success of URS is whether the stone is impacted to the ureteral wall. The second most important factor affecting treatment was intramural stricture requiring dilatation if the stone is impacted, and the size of the stone if not impacted. Our study suggests that the impacted stone, intramural stricture requiring dilatation and stone size may have a significant effect on the success rate of semirigid URS for proximal ureteral stone. Further studies with population-based and longitudinal design should be conducted to confirm this finding. © 2017 S. Karger AG, Basel.

  17. Application of decision tree model for the ground subsidence hazard mapping near abandoned underground coal mines.

    Science.gov (United States)

    Lee, Saro; Park, Inhye

    2013-09-30

    Subsidence of ground caused by underground mines poses hazards to human life and property. This study analyzed ground subsidence hazard using factors that can affect ground subsidence and a decision tree approach in a geographic information system (GIS). The study area was Taebaek, Gangwon-do, Korea, where many abandoned underground coal mines exist. Spatial data, topography, geology, and various ground-engineering data for the subsidence area were collected and compiled in a database for mapping ground-subsidence hazard (GSH). The subsidence area was randomly split 50/50 for training and validation of the models. A data-mining classification technique was applied to the GSH mapping, and decision trees were constructed using the chi-squared automatic interaction detector (CHAID) and the quick, unbiased, and efficient statistical tree (QUEST) algorithms. The frequency ratio model was also applied to the GSH mapping for comparison with a probabilistic model. The resulting GSH maps were validated using area-under-the-curve (AUC) analysis with the subsidence area data that had not been used for training the model. The highest accuracy was achieved by the decision tree model using the CHAID algorithm (94.01%), compared with the QUEST algorithm (90.37%) and the frequency ratio model (86.70%). These accuracies are higher than previously reported results for decision trees. Decision tree methods can therefore be used efficiently for GSH analysis and might be widely used for prediction of various spatial events. Copyright © 2013. Published by Elsevier Ltd.

  18. Monte Carlo Tree Search for Continuous and Stochastic Sequential Decision Making Problems

    International Nuclear Information System (INIS)

    Couetoux, Adrien

    2013-01-01

    In this thesis, I studied sequential decision making problems, with a focus on the unit commitment problem. Traditionally solved by dynamic programming methods, this problem is still a challenge, due to its high dimension and to the sacrifices made on the accuracy of the model to apply state of the art methods. I investigated on the applicability of Monte Carlo Tree Search methods for this problem, and other problems that are single player, stochastic and continuous sequential decision making problems. In doing so, I obtained a consistent and anytime algorithm, that can easily be combined with existing strong heuristic solvers. (author)

  19. A greedy algorithm for construction of decision trees for tables with many-valued decisions - A comparative study

    KAUST Repository

    Azad, Mohammad

    2013-11-25

    In the paper, we study a greedy algorithm for construction of decision trees. This algorithm is applicable to decision tables with many-valued decisions where each row is labeled with a set of decisions. For a given row, we should find a decision from the set attached to this row. Experimental results for data sets from UCI Machine Learning Repository and randomly generated tables are presented. We make a comparative study of the depth and average depth of the constructed decision trees for proposed approach and approach based on generalized decision. The obtained results show that the proposed approach can be useful from the point of view of knowledge representation and algorithm construction.

  20. A greedy algorithm for construction of decision trees for tables with many-valued decisions - A comparative study

    KAUST Repository

    Azad, Mohammad; Chikalov, Igor; Moshkov, Mikhail; Zielosko, Beata

    2013-01-01

    In the paper, we study a greedy algorithm for construction of decision trees. This algorithm is applicable to decision tables with many-valued decisions where each row is labeled with a set of decisions. For a given row, we should find a decision from the set attached to this row. Experimental results for data sets from UCI Machine Learning Repository and randomly generated tables are presented. We make a comparative study of the depth and average depth of the constructed decision trees for proposed approach and approach based on generalized decision. The obtained results show that the proposed approach can be useful from the point of view of knowledge representation and algorithm construction.

  1. Using decision tree to predict serum ferritin level in women with anemia

    Directory of Open Access Journals (Sweden)

    Parisa Safaee

    2016-04-01

    Background: Data mining is known as a process of discovering and analysing large amounts of data in order to find meaningful rules and trends. In healthcare, data mining offers numerous opportunities to study the unknown patterns in a data set. These patterns can be used by physicians for the diagnosis, prognosis and treatment of patients. The main objective of this study was to predict the level of serum ferritin in women with anemia and to specify the basic predictive factors of iron deficiency anemia using data mining techniques. Methods: In this research, 690 patients and 22 variables were studied in a population of women with anemia. These data include 11 laboratory and 11 clinical variables of patients who were referred to the laboratory of Imam Hossein and Shohada-E- Haft Tir hospitals from April 2013 to April 2014. The decision tree technique was used to build the model. Results: The accuracy of the decision tree with all the variables is 75%. Different combinations of variables were examined in order to determine the best predictive model. In the optimum decision tree model obtained, RBC, MCH, MCHC, gastrointestinal cancer and gastrointestinal ulcer were identified as the most important predictive factors. The results indicate that if the values of the MCV, MCHC and MCH variables are normal and the value of the RBC variable is below the normal limit, the patient is diagnosed with iron deficiency anemia with a likelihood of 90%. Conclusion: Given the simplicity and low cost of the complete blood count examination, the decision tree model was considered for diagnosing iron deficiency anemia in patients. The impact of new factors such as gastrointestinal hemorrhoids, gastrointestinal surgeries, different gastrointestinal diseases and gastrointestinal ulcers is also considered in this paper, whereas previous studies have been limited to assessing laboratory variables. The rules of the

  2. Language Adaptive LVCSR Through Polyphone Decision Tree Specialization

    Science.gov (United States)

    2000-08-01

    ... crossing rate. After cepstral mean subtraction a linear discriminant analysis ... data. For this purpose we introduce a polyphone decision tree specialization method. Several ... 2. Multiple Languages: For our experiments we developed monolingual LVCSR sys... [Footnote fragment: Mandarin is given in character based error rate, Japanese in hiragana ... Table fragments, column headers not recoverable: German GE 9173 71 132K 16.7; Japanese JA 9096 108 212K...12.1; German 11.8 61K 200 44.5 43 9.0; Japanese 10.0 22K 230 33.8 33 7.9]

  3. Combining evolutionary algorithms with oblique decision trees to detect bent-double galaxies

    Science.gov (United States)

    Cantu-Paz, Erick; Kamath, Chandrika

    2000-10-01

    Decision trees have long been popular in classification as they use simple and easy-to-understand tests at each node. Most variants of decision trees test a single attribute at a node, leading to axis-parallel trees, where the test results in a hyperplane which is parallel to one of the dimensions in the attribute space. These trees can be rather large and inaccurate in cases where the concept to be learned is best approximated by oblique hyperplanes. In such cases, it may be more appropriate to use an oblique decision tree, where the decision at each node is a linear combination of the attributes. Oblique decision trees have not gained wide popularity, in part due to the complexity of constructing good oblique splits and the tendency of existing splitting algorithms to get stuck in local minima. Several alternatives have been proposed to handle these problems, including randomization in conjunction with deterministic hill-climbing and the use of simulated annealing. In this paper, we use evolutionary algorithms (EAs) to determine the split. EAs are well suited for this problem because of their global search properties, their tolerance to noisy fitness evaluations, and their scalability to large dimensional search spaces. We demonstrate our technique on a synthetic data set, and then we apply it to a practical problem from astronomy, namely, the classification of galaxies with a bent-double morphology. In addition, we describe our experiences with several split evaluation criteria. Our results suggest that, in some cases, the evolutionary approach is faster and more accurate than existing oblique decision tree algorithms. However, for our astronomical data, the accuracy is not significantly different from that of the axis-parallel trees.
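
    The difference between the two kinds of node test described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the function names, the weights, and the sample points are hypothetical, and the sketch does not show the evolutionary search the authors use to find the split.

```python
from typing import Sequence


def axis_parallel_test(x: Sequence[float], feature: int, threshold: float) -> bool:
    """Axis-parallel node: compare one attribute with a threshold."""
    return x[feature] <= threshold


def oblique_test(x: Sequence[float], weights: Sequence[float], threshold: float) -> bool:
    """Oblique node: compare a linear combination of attributes with a threshold."""
    return sum(w * v for w, v in zip(weights, x)) <= threshold


# Hypothetical concept: a point is positive when x0 + x1 <= 1.
points = [(0.2, 0.3), (0.9, 0.4), (0.1, 0.8), (0.6, 0.7)]
for p in points:
    # One oblique test with weights (1, 1) captures the concept directly,
    # while a single-attribute test on x0 alone can only approximate it.
    print(p, oblique_test(p, (1.0, 1.0), 1.0), axis_parallel_test(p, 0, 0.5))
```

    An axis-parallel tree would need several stacked single-attribute tests to approximate the diagonal boundary, which is the size and accuracy concern the abstract raises.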

  4. A Decision Tree for Psychology Majors: Supplying Questions as Well as Answers.

    Science.gov (United States)

    Poe, Retta E.

    1988-01-01

    Outlines the development of a psychology careers decision tree to help faculty advise students as they plan their program. States that students using the decision tree may benefit by learning more about their career options and by acquiring better question-asking skills. (GEA)

  5. External validation of a decision tree early warning score using only laboratory data

    DEFF Research Database (Denmark)

    Holm Atkins, Tara E; Öhman, Malin C; Brabrand, Mikkel

    2018-01-01

    INTRODUCTION: Early warning scores (EWS) have been developed to identify the degree of illness severity among acutely ill patients. One system, The Laboratory Decision Tree Early Warning Score (LDT-EWS) is wholly laboratory data based. Laboratory data was used in the development of a rare...... computerized method, developing a decision tree analysis. This article externally validates LDT-EWS, which is obligatory for an EWS before clinical use. METHOD: We conducted a retrospective review of prospectively collected data based on a time limited sample of all patients admitted through the medical......) and calibration (precision) as Hosmer-Lemeshow Goodness of fit test. RESULTS: A total of 5858 patients were admitted and 4902 included (83.7%). In-hospital mortality in our final dataset (n=4902) was 3.5%. Discriminatory power (95% CI), identifying in-hospital death was 0.809 (0.777-0.842). Calibration was good...

  6. Solar and Wind Site Screening Decision Trees

    Science.gov (United States)

    EPA and NREL created a decision tree to guide state and local governments and other stakeholders through a process for screening sites for their suitability for future redevelopment with solar photovoltaic (PV) energy and wind energy.

  7. Speech Recognition Using Randomized Relational Decision Trees

    National Research Council Canada - National Science Library

    Amit, Yali

    1999-01-01

    .... This implies that we recognize words as units, without recognizing their subcomponents. Multiple randomized decision trees are used to access the large pool of acoustic events in a systematic manner and are aggregated to produce the classifier.

  8. Peripheral Exophytic Oral Lesions: A Clinical Decision Tree

    Directory of Open Access Journals (Sweden)

    Hamed Mortazavi

    2017-01-01

    Diagnosis of peripheral oral exophytic lesions might be quite challenging. This review article aimed to introduce a decision tree for oral exophytic lesions according to their clinical features. General search engines and specialized databases including PubMed, PubMed Central, Medline Plus, EBSCO, Science Direct, Scopus, Embase, and authenticated textbooks were used to find relevant topics by means of keywords such as “oral soft tissue lesion,” “oral tumor like lesion,” “oral mucosal enlargement,” and “oral exophytic lesion.” Related English-language articles published from 1988 to 2016 in both medical and dental journals were appraised. Upon compilation of data, peripheral oral exophytic lesions were categorized into two major groups according to their surface texture: smooth (mesenchymal or nonsquamous epithelium-originated) and rough (squamous epithelium-originated). Lesions with smooth surface were also categorized into three subgroups according to their general frequency: reactive hyperplastic lesions/inflammatory hyperplasia, salivary gland lesions (nonneoplastic and neoplastic), and mesenchymal lesions (benign and malignant neoplasms). In addition, lesions with rough surface were summarized in six more common lesions. In total, 29 entities were organized in the form of a decision tree in order to help clinicians establish a logical diagnosis by a stepwise progression method.

  9. Identification of Biomarkers for Esophageal Squamous Cell Carcinoma Using Feature Selection and Decision Tree Methods

    Directory of Open Access Journals (Sweden)

    Chun-Wei Tung

    2013-01-01

    Esophageal squamous cell cancer (ESCC) is one of the most common fatal human cancers. The identification of biomarkers for early detection could be a promising strategy to decrease mortality. Previous studies utilized microarray techniques to identify more than one hundred genes; however, it is desirable to identify a small set of biomarkers for clinical use. This study proposes a sequential forward feature selection algorithm to design decision tree models for discriminating ESCC from normal tissues. Two potential biomarkers, RUVBL1 and CNIH, were identified and validated based on two publicly available microarray datasets. To test the discrimination ability of the two biomarkers, 17 pairs of expression profiles of ESCC and normal tissues from Taiwanese male patients were measured by using microarray techniques. The classification accuracies of the two biomarkers in all three datasets were higher than 90%. Interpretable decision tree models were constructed to analyze expression patterns of the two biomarkers. RUVBL1 was consistently overexpressed in all three datasets, although we found inconsistent CNIH expression possibly affected by the diverse major risk factors for ESCC across different areas.

  10. ATLAAS: an automatic decision tree-based learning algorithm for advanced image segmentation in positron emission tomography.

    Science.gov (United States)

    Berthon, Beatrice; Marshall, Christopher; Evans, Mererid; Spezi, Emiliano

    2016-07-07

    Accurate and reliable tumour delineation on positron emission tomography (PET) is crucial for radiotherapy treatment planning. PET automatic segmentation (PET-AS) eliminates intra- and interobserver variability, but there is currently no consensus on the optimal method to use, as different algorithms appear to perform better for different types of tumours. This work aimed to develop a predictive segmentation model, trained to automatically select and apply the best PET-AS method, according to the tumour characteristics. ATLAAS, the automatic decision tree-based learning algorithm for advanced segmentation, is based on supervised machine learning using decision trees. The model includes nine PET-AS methods and was trained on 100 PET scans with known true contour. A decision tree was built for each PET-AS algorithm to predict its accuracy, quantified using the Dice similarity coefficient (DSC), according to the tumour volume, tumour peak to background SUV ratio and a regional texture metric. The performance of ATLAAS was evaluated for 85 PET scans obtained from fillable and printed subresolution sandwich phantoms. ATLAAS showed excellent accuracy across a wide range of phantom data and predicted the best or near-best segmentation algorithm in 93% of cases. ATLAAS outperformed all single PET-AS methods on fillable phantom data with a DSC of 0.881, while the DSC for H&N phantom data was 0.819. DSCs higher than 0.650 were achieved in all cases. ATLAAS is an advanced automatic image segmentation algorithm based on decision tree predictive modelling, which can be trained on images with known true contour, to predict the best PET-AS method when the true contour is unknown. ATLAAS provides robust and accurate image segmentation with potential applications to radiation oncology.

  11. Modeling and Testing Landslide Hazard Using Decision Tree

    Directory of Open Access Journals (Sweden)

    Mutasem Sh. Alkhasawneh

    2014-01-01

    This paper proposes a decision tree model for specifying the importance of 21 factors causing the landslides in a wide area of Penang Island, Malaysia. These factors are vegetation cover, distance from the fault line, slope angle, cross curvature, slope aspect, distance from road, geology, diagonal length, longitude curvature, rugosity, plan curvature, elevation, rain perception, soil texture, surface area, distance from drainage, roughness, land cover, general curvature, tangent curvature, and profile curvature. Decision tree models are used for prediction, classification, and factor importance and are usually represented by an easy-to-interpret tree-like structure. Four models were created using Chi-square Automatic Interaction Detector (CHAID), Exhaustive CHAID, Classification and Regression Tree (CRT), and Quick-Unbiased-Efficient Statistical Tree (QUEST). Twenty-one factors were extracted using digital elevation models (DEMs) and then used as input variables for the models. A data set of 137570 samples was selected for each variable in the analysis, where 68786 samples represent landslides and 68786 samples represent no landslides. 10-fold cross-validation was employed for testing the models. The highest accuracy was achieved using Exhaustive CHAID (82.0%) compared to the CHAID (81.9%), CRT (75.6%), and QUEST (74.0%) models. Across the four models, five factors were identified as the most important: slope angle, distance from drainage, surface area, slope aspect, and cross curvature.

  12. The Utility of Decision Trees in Oncofertility Care in Japan.

    Science.gov (United States)

    Ito, Yuki; Shiraishi, Eriko; Kato, Atsuko; Haino, Takayuki; Sugimoto, Kouhei; Okamoto, Aikou; Suzuki, Nao

    2017-03-01

    To identify the utility and issues associated with the use of decision trees in oncofertility patient care in Japan. A total of 35 women who had been diagnosed with cancer, but had not begun anticancer treatment, were enrolled. We applied the oncofertility decision tree for women published by Gardino et al. to counsel a consecutive series of women on fertility preservation (FP) options following cancer diagnosis. Percentage of women who decided to undergo oocyte retrieval for embryo cryopreservation and the expected live-birth rate for these patients were calculated using the following equation: expected live-birth rate = pregnancy rate at each age per embryo transfer × (1 - miscarriage rate) × No. of cryopreserved embryos. Oocyte retrieval was performed for 17 patients (48.6%; mean ± standard deviation [SD] age, 36.35 ± 3.82 years). The mean ± SD number of cryopreserved embryos was 5.29 ± 4.63. The expected live-birth rate was 0.66. The expected live-birth rate with FP indicated that one in three oncofertility patients would not expect to have a live birth following oocyte retrieval and embryo cryopreservation. While the decision trees were useful as decision-making tools for women contemplating FP, in the context of the current restrictions on oocyte donation and the extremely small number of adoptions in Japan, the remaining options for fertility after cancer are limited. In order for cancer survivors to feel secure in their decisions, the decision tree may need to be adapted simultaneously with improvements to the social environment, such as greater support for adoption.
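
    The equation quoted in this abstract can be written out as a short calculation. The sketch below is a direct transcription of that product; the function name and the input values are hypothetical and are not the study's age-specific rates or results.

```python
def expected_live_birth_rate(pregnancy_rate_per_transfer: float,
                             miscarriage_rate: float,
                             n_cryopreserved_embryos: float) -> float:
    """Pregnancy rate per embryo transfer x (1 - miscarriage rate) x no. of embryos."""
    return (pregnancy_rate_per_transfer
            * (1.0 - miscarriage_rate)
            * n_cryopreserved_embryos)


# Hypothetical inputs for illustration only (not values from the study).
rate = expected_live_birth_rate(
    pregnancy_rate_per_transfer=0.25,
    miscarriage_rate=0.20,
    n_cryopreserved_embryos=3,
)
print(round(rate, 2))
```

    In the study the pregnancy rate is taken at each patient's age, so in practice the first argument would be looked up per patient before the product is formed.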

  13. Constructing multi-labelled decision trees for junction design using the predicted probabilities

    NARCIS (Netherlands)

    Bezembinder, Erwin M.; Wismans, Luc J. J.; Van Berkum, Eric C.

    2017-01-01

    In this paper, we evaluate the use of traditional decision tree algorithms CRT, CHAID and QUEST to determine a decision tree which can be used to predict a set of (Pareto optimal) junction design alternatives (e.g. signal or roundabout) for a given traffic demand pattern and available space. This is

  14. Decision-Tree Models of Categorization Response Times, Choice Proportions, and Typicality Judgments

    Science.gov (United States)

    Lafond, Daniel; Lacouture, Yves; Cohen, Andrew L.

    2009-01-01

    The authors present 3 decision-tree models of categorization adapted from T. Trabasso, H. Rollins, and E. Shaughnessy (1971) and use them to provide a quantitative account of categorization response times, choice proportions, and typicality judgments at the individual-participant level. In Experiment 1, the decision-tree models were fit to…

  15. The effect of the fragmentation problem in decision tree learning applied to the search for single top quark production

    International Nuclear Information System (INIS)

    Vilalta, R; Ocegueda-Hernandez, F; Valerio, R; Watts, G

    2010-01-01

    Decision tree learning constitutes a suitable approach to classification due to its ability to partition the variable space into regions of class-uniform events, while providing a structure amenable to interpretation, in contrast to other methods such as neural networks. But an inherent limitation of decision tree learning is the progressive lessening of the statistical support of the final classifier as clusters of single-class events are split on every partition, a problem known as the fragmentation problem. We describe a software system called DTFE, for Decision Tree Fragmentation Evaluator, that measures the degree of fragmentation caused by a decision tree learner on every event cluster. Clusters are found through a decomposition of the data using a technique known as Spectral Clustering. Each cluster is analyzed in terms of the number and type of partitions induced by the decision tree. Our domain of application lies in the search for single top quark production, a challenging problem due to large and similar backgrounds, low energetic signals, and low number of jets. The output of the machine-learning software tool consists of a series of statistics describing the degree of data fragmentation.

  16. Which Types of Leadership Styles Do Followers Prefer? A Decision Tree Approach

    Science.gov (United States)

    Salehzadeh, Reza

    2017-01-01

    Purpose: The purpose of this paper is to propose a new method to find the appropriate leadership styles based on the followers' preferences using the decision tree technique. Design/methodology/approach: Statistical population includes the students of the University of Isfahan. In total, 750 questionnaires were distributed; out of which, 680…

  17. Predictability of the future development of aggressive behavior of cranial dural arteriovenous fistulas based on decision tree analysis.

    Science.gov (United States)

    Satomi, Junichiro; Ghaibeh, A Ammar; Moriguchi, Hiroki; Nagahiro, Shinji

    2015-07-01

    The severity of clinical signs and symptoms of cranial dural arteriovenous fistulas (DAVFs) are well correlated with their pattern of venous drainage. Although the presence of cortical venous drainage can be considered a potential predictor of aggressive DAVF behaviors, such as intracranial hemorrhage or progressive neurological deficits due to venous congestion, accurate statistical analyses are currently not available. Using a decision tree data mining method, the authors aimed at clarifying the predictability of the future development of aggressive behaviors of DAVF and at identifying the main causative factors. Of 266 DAVF patients, 89 were eligible for analysis. Under observational management, 51 patients presented with intracranial hemorrhage/infarction during the follow-up period. The authors created a decision tree able to assess the risk for the development of aggressive DAVF behavior. Evaluated by 10-fold cross-validation, the decision tree's accuracy, sensitivity, and specificity were 85.28%, 88.33%, and 80.83%, respectively. The tree shows that the main factor in symptomatic patients was the presence of cortical venous drainage. In its absence, the lesion location determined the risk of a DAVF developing aggressive behavior. Decision tree analysis accurately predicts the future development of aggressive DAVF behavior.
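
    Accuracy, sensitivity, and specificity, as reported in this abstract, are all derived from the four cells of a confusion matrix. The sketch below shows the standard definitions; the function name and the counts are hypothetical and are not the study's data.

```python
from typing import Dict


def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Standard metrics from confusion-matrix counts."""
    return {
        # Share of all cases classified correctly.
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        # Share of true positives among cases that really are positive.
        "sensitivity": tp / (tp + fn),
        # Share of true negatives among cases that really are negative.
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts for illustration only.
print(classification_metrics(tp=40, fn=10, tn=30, fp=20))
```

    Under 10-fold cross-validation, counts like these would typically be accumulated over the ten held-out folds before the ratios are computed, although the abstract does not say how the folds were aggregated.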

  18. The new decision tree for the evaluation of pesticide leaching from soils

    NARCIS (Netherlands)

    Linden AMA van der; Boesten JJTI; Cornelese AA; Kruijne R; Leistra M; Linders JBHJ; Pol JW; Tiktak A; Verschoor AJ; Alterra; CTB; LDL; SEC; LER; Alterra

    2004-01-01

    The Dutch decision tree on leaching from soil has been re-designed to be more in line with EU guidelines on the assessment of the leaching potential of substances. The new decision tree explicitly defines reasonable worst-case conditions as the 90th percentile of the area to which a substance is

  19. Aneurysmal subarachnoid hemorrhage prognostic decision-making algorithm using classification and regression tree analysis.

    Science.gov (United States)

    Lo, Benjamin W Y; Fukuda, Hitoshi; Angle, Mark; Teitelbaum, Jeanne; Macdonald, R Loch; Farrokhyar, Forough; Thabane, Lehana; Levine, Mitchell A H

    2016-01-01

    Classification and regression tree analysis involves the creation of a decision tree by recursive partitioning of a dataset into more homogeneous subgroups. Thus far, there is scarce literature on using this technique to create clinical prediction tools for aneurysmal subarachnoid hemorrhage (SAH). The classification and regression tree analysis technique was applied to the multicenter Tirilazad database (3551 patients) in order to create the decision-making algorithm. In order to elucidate prognostic subgroups in aneurysmal SAH, neurologic, systemic, and demographic factors were taken into account. The dependent variable used for analysis was the dichotomized Glasgow Outcome Score at 3 months. Classification and regression tree analysis revealed seven prognostic subgroups. Neurological grade, occurrence of post-admission stroke, occurrence of post-admission fever, and age represented the explanatory nodes of this decision tree. Split sample validation revealed classification accuracy of 79% for the training dataset and 77% for the testing dataset. In addition, the occurrence of fever at 1-week post-aneurysmal SAH is associated with increased odds of post-admission stroke (odds ratio: 1.83, 95% confidence interval: 1.56-2.45, P tree was generated, which serves as a prediction tool to guide bedside prognostication and clinical treatment decision making. This prognostic decision-making algorithm also shed light on the complex interactions between a number of risk factors in determining outcome after aneurysmal SAH.
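
    Recursive partitioning into more homogeneous subgroups, as described at the start of this abstract, rests on repeatedly choosing the split that most reduces impurity. The sketch below shows one such step for a single numeric variable using Gini impurity, a common choice for classification trees; the abstract does not state which criterion was used, and the function names and toy data are hypothetical.

```python
from collections import Counter
from typing import List, Optional, Sequence, Tuple


def gini(labels: Sequence[int]) -> float:
    """Gini impurity: 0.0 for a pure group, larger for mixed groups."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(values: List[float], labels: List[int]) -> Tuple[float, Optional[float]]:
    """Return (weighted impurity, threshold) of the best single binary split."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best: Tuple[float, Optional[float]] = (float("inf"), None)
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[0]:
            # Place the threshold midway between the two neighbouring values.
            best = (score, (pairs[i - 1][0] + pairs[i][0]) / 2)
    return best


# Toy data: the outcome changes between the values 2 and 3.
print(best_threshold([1, 2, 3, 4], [0, 0, 1, 1]))  # (0.0, 2.5)
```

    A full tree applies the same search to each resulting subgroup and over every candidate variable until a stopping rule is met.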

  20. An Improved Decision Tree for Predicting a Major Product in Competing Reactions

    Science.gov (United States)

    Graham, Kate J.

    2014-01-01

    When organic chemistry students encounter competing reactions, they are often overwhelmed by the task of evaluating multiple factors that affect the outcome of a reaction. The use of a decision tree is a useful tool to teach students to evaluate a complex situation and propose a likely outcome. Specifically, a decision tree can help students…

  1. Comparative study of biodegradability prediction of chemicals using decision trees, functional trees, and logistic regression.

    Science.gov (United States)

    Chen, Guangchao; Li, Xuehua; Chen, Jingwen; Zhang, Ya-Nan; Peijnenburg, Willie J G M

    2014-12-01

    Biodegradation is the principal environmental dissipation process of chemicals. As such, it is a dominant factor determining the persistence and fate of organic chemicals in the environment, and is therefore of critical importance to chemical management and regulation. In the present study, the authors developed in silico methods assessing biodegradability based on a large heterogeneous set of 825 organic compounds, using the techniques of the C4.5 decision tree, the functional inner regression tree, and logistic regression. External validation was subsequently carried out by 2 independent test sets of 777 and 27 chemicals. As a result, the functional inner regression tree exhibited the best predictability with predictive accuracies of 81.5% and 81.0%, respectively, on the training set (825 chemicals) and test set I (777 chemicals). Performance of the developed models on the 2 test sets was subsequently compared with that of the Estimation Program Interface (EPI) Suite Biowin 5 and Biowin 6 models, which also showed a better predictability of the functional inner regression tree model. The model built in the present study exhibits a reasonable predictability compared with existing models while possessing a transparent algorithm. Interpretation of the mechanisms of biodegradation was also carried out based on the models developed. © 2014 SETAC.

  2. Application of decision tree algorithm for identification of rock forming minerals using energy dispersive spectrometry

    Science.gov (United States)

    Akkaş, Efe; Çubukçu, H. Evren; Artuner, Harun

    2014-05-01

    Rapid and automated mineral identification is essential in certain applications concerning natural rocks. Among all microscopic and spectrometric methods, energy dispersive X-ray spectrometers (EDS) integrated with scanning electron microscopes produce rapid information with reliable chemical data. Although obtaining elemental data with EDS analyses is fast and easy with the help of improving technology, it is rather challenging to perform accurate and rapid identification considering the large quantity of minerals in a rock sample, with dimensions ranging from nanometers to centimeters. Furthermore, the physical properties of the specimen (roughness, thickness, electrical conductivity, position in the instrument etc.) and the incident electron beam (accelerating voltage, beam current, spot size etc.) control the characteristic X-rays produced, which in turn affect the elemental analyses. In order to minimize the effects of these physical constraints and develop an automated mineral identification system, a rule induction paradigm has been applied to energy dispersive spectral data. Decision tree classifiers divide training data sets into subclasses using generated rules or decisions and thereby produce the classification or recognition associated with these data sets. A number of thin sections prepared from rock samples with suitable mineralogy have been investigated and a preliminary set of 12 distinct mineral groups (olivine, orthopyroxene, clinopyroxene, apatite, amphibole, plagioclase, K-feldspar, zircon, magnetite, titanomagnetite, biotite, quartz), comprised mostly of silicates and oxides, has been selected. Energy dispersive spectral data for each group, consisting of 240 reference and 200 test analyses, have been acquired under various, non-standard, physical and electrical conditions. The reference X-ray data have been used to assign the spectral distribution of elements to the specified mineral groups. Consequently, the test data have been analyzed using

  3. A P2P Botnet detection scheme based on decision tree and adaptive multilayer neural networks.

    Science.gov (United States)

    Alauthaman, Mohammad; Aslam, Nauman; Zhang, Li; Alasem, Rafe; Hossain, M A

    2018-01-01

    In recent years, Botnets have been adopted as a popular method to carry and spread many malicious codes on the Internet. These malicious codes pave the way to execute many fraudulent activities including spam mail, distributed denial-of-service attacks and click fraud. While many Botnets are set up using centralized communication architecture, the peer-to-peer (P2P) Botnets can adopt a decentralized architecture using an overlay network for exchanging command and control data making their detection even more difficult. This work presents a method of P2P Bot detection based on an adaptive multilayer feed-forward neural network in cooperation with decision trees. A classification and regression tree is applied as a feature selection technique to select relevant features. With these features, a multilayer feed-forward neural network training model is created using a resilient back-propagation learning algorithm. A comparison of feature set selection based on the decision tree, principal component analysis and the ReliefF algorithm indicated that the neural network model with features selection based on decision tree has a better identification accuracy along with lower rates of false positives. The usefulness of the proposed approach is demonstrated by conducting experiments on real network traffic datasets. In these experiments, an average detection rate of 99.08 % with false positive rate of 0.75 % was observed.

  4. Data acquisition in modeling using neural networks and decision trees

    Directory of Open Access Journals (Sweden)

    R. Sika

    2011-04-01

    The paper presents a comparison of selected models from the area of artificial neural networks and decision trees in relation to the actual conditions of foundry processes. The work contains short descriptions of the algorithms used, their purpose, and the method of data preparation, which is a domain of Data Mining systems. The first part concerns data acquisition carried out in a selected iron foundry, indicating problems to solve in casting process modeling. The second part is a comparison of selected algorithms: a decision tree and an artificial neural network, that is, the CART (Classification And Regression Trees) and BP (Backpropagation) in MLP (Multilayer Perceptron) networks algorithms. The aim of the paper is to show an aspect of selecting data for modeling, cleaning it, and reducing it, for example because of too strong a correlation between some of the recorded process parameters. It also shows what results can be obtained using two different approaches: first, modeling with available commercial software, for example Statistica; second, modeling step by step in an Excel spreadsheet based on the same algorithm, such as BP-MLP. The discrepancy between the results obtained from these two approaches originates from assumptions made a priori. The Statistica universal software package mentioned earlier, when used without awareness of the relations between technological parameters, i.e. without the user having foundry experience and without ranking particular parameters based on the acquisition, cannot give a credible basis for predicting the quality of the castings. The decisive influence of the data acquisition method is also clearly indicated: the acquisition should be conducted according to repeatable measurement and control procedures. This paper is based on about 250 records of actual data, for one assortment over a 6-month period, where only 12 data sets were complete (including two that were used for validation of the neural network) and useful for creating a model. It is definitely too

  5. Three approaches to deal with inconsistent decision tables - Comparison of decision tree complexity

    KAUST Repository

    Azad, Mohammad; Chikalov, Igor; Moshkov, Mikhail

    2013-01-01

    In inconsistent decision tables, there are groups of rows with equal values of conditional attributes and different decisions (values of the decision attribute). We study three approaches to deal with such tables. Instead of a group of equal rows, we consider one row given by values of conditional attributes and we attach to this row: (i) the set of all decisions for rows from the group (many-valued decision approach); (ii) the most common decision for rows from the group (most common decision approach); and (iii) the unique code of the set of all decisions for rows from the group (generalized decision approach). We present experimental results and compare the depth, average depth and number of nodes of decision trees constructed by a greedy algorithm in the framework of each of the three approaches. © 2013 Springer-Verlag.
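
    The three ways of collapsing a group of equal rows can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the function name, the tie-breaking rule for the most common decision (smallest decision wins), the integer codes used for the generalized decision, and the toy table are all hypothetical.

```python
from collections import Counter, defaultdict

def resolve_inconsistency(rows):
    """Collapse rows with equal conditional attributes in three ways.

    rows: list of (attributes_tuple, decision).
    Returns three dicts keyed by attributes_tuple:
    many-valued decisions, most common decisions, generalized decisions.
    """
    groups = defaultdict(list)
    for attrs, decision in rows:
        groups[attrs].append(decision)

    # (i) many-valued decision: the set of all decisions in the group
    many_valued = {attrs: frozenset(ds) for attrs, ds in groups.items()}

    # (ii) most common decision; ties go to the smallest decision (an assumption)
    most_common = {}
    for attrs, ds in groups.items():
        counts = Counter(ds)
        top = max(counts.values())
        most_common[attrs] = min(d for d, c in counts.items() if c == top)

    # (iii) generalized decision: one unique code per distinct set of decisions
    codes = {}
    generalized = {}
    for attrs in sorted(groups):
        generalized[attrs] = codes.setdefault(many_valued[attrs], len(codes))

    return many_valued, most_common, generalized

# Hypothetical inconsistent table: two binary attributes, decisions 1..3.
table = [
    ((0, 0), 1), ((0, 0), 2), ((0, 0), 1),
    ((0, 1), 3),
    ((1, 0), 2), ((1, 0), 1),
    ((1, 1), 3),
]
many_valued, most_common, generalized = resolve_inconsistency(table)
print(many_valued[(0, 0)], most_common[(0, 0)], generalized[(0, 0)])
```

    Rows (0, 0) and (1, 0) carry the same set of decisions {1, 2}, so they receive the same generalized code, while their most common decision depends on the counts within each group.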

  6. Transferability of decision trees for land cover classification in a ...

    African Journals Online (AJOL)

    This paper attempts to derive classification rules from training data of four Landsat-8 scenes by using the classification and regression tree (CART) implementation of the decision tree algorithm. The transferability of the ruleset was evaluated by classifying two adjacent scenes. The classification of the four mosaicked scenes ...

  7. Using decision tree induction systems for modeling space-time behavior

    NARCIS (Netherlands)

    Arentze, T.A.; Hofman, F.; Mourik, van H.; Timmermans, H.J.P.; Wets, G.

    2000-01-01

    Discrete choice models are commonly used to predict individuals' activity and travel choices either separately or simultaneously in activity scheduling models. This paper investigates the possibilities of decision tree induction systems as an alternative approach. The ability of decision trees to

  8. Beef Quality Identification Using Thresholding Method and Decision Tree Classification Based on Android Smartphone

    Directory of Open Access Journals (Sweden)

    Kusworo Adi

    2017-01-01

    Beef is one of the animal food products that have high nutritional value because it contains carbohydrates, proteins, fats, vitamins, and minerals. Therefore, the quality of beef should be maintained so that consumers get good-quality beef. Determination of beef quality is commonly conducted visually by comparing the actual beef with reference pictures of each beef class. This process has weaknesses, as it is subjective in nature and takes a considerable amount of time. Therefore, an automated system based on image processing that is capable of determining beef quality is required. This research aims to develop an image segmentation method by processing digital images. The system designed consists of image acquisition processes with varied distance, resolution, and angle. Image segmentation is done to separate the images of fat and meat using the Otsu thresholding method. Classification was carried out using the decision tree algorithm, and the best accuracies obtained were 90% for training and 84% for testing. Once developed, the system is implemented on the Android platform. Results show that the image processing technique is capable of proper marbling score identification.
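
    Otsu thresholding, the segmentation step named in this abstract, picks the gray level that maximizes the between-class variance of the two resulting pixel groups. The following is a minimal pure-Python sketch of that idea, not the authors' Android implementation. The function name, the eight-pixel sample, and the assumption that fat appears brighter than meat are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    Class 0 holds pixels <= t, class 1 holds pixels > t.
    When several t give the same partition, the smallest t is returned.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in class 0
    sum0 = 0  # sum of gray levels in class 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # class 0 still empty
        w1 = total - w0
        if w1 == 0:
            break     # class 1 empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mu0 - mu1) ** 2  # proportional to the variance
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t

# Hypothetical 8-bit gray levels: four dark (meat) and four bright (fat) pixels.
sample = [20, 22, 25, 30, 200, 210, 220, 215]
t = otsu_threshold(sample)
fat_mask = [p > t for p in sample]
print(t, fat_mask)
```

    In practice the same computation would run over the full image histogram, and the resulting mask would feed the features used by the decision tree classifier.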

  9. Decision tree-based learning to predict patient controlled analgesia consumption and readjustment

    Directory of Open Access Journals (Sweden)

    Hu Yuh-Jyh

    2012-11-01

    Background: Appropriate postoperative pain management contributes to earlier mobilization, shorter hospitalization, and reduced cost. The undertreatment of pain may impede short-term recovery and have a detrimental long-term effect on health. This study focuses on Patient Controlled Analgesia (PCA), which is a delivery system for pain medication. This study proposes and demonstrates how to use machine learning and data mining techniques to predict analgesic requirements and PCA readjustment. Methods: The sample in this study included 1099 patients. Every patient was described by 280 attributes, including the class attribute. In addition to commonly studied demographic and physiological factors, this study emphasizes attributes related to PCA. We used decision tree-based learning algorithms to predict analgesic consumption and PCA control readjustment based on the first few hours of PCA medications. We also developed a nearest neighbor-based data cleaning method to alleviate the class-imbalance problem in PCA setting readjustment prediction. Results: The prediction accuracies of total analgesic consumption (continuous dose and PCA dose) and PCA analgesic requirement (PCA dose only) by an ensemble of decision trees were 80.9% and 73.1%, respectively. Decision tree-based learning outperformed Artificial Neural Network, Support Vector Machine, Random Forest, Rotation Forest, and Naïve Bayesian classifiers in analgesic consumption prediction. The proposed data cleaning method improved the performance of every learning method in this study of PCA setting readjustment prediction. Comparative analysis identified the informative attributes from the data mining models and compared them with the correlates of analgesic requirement reported in previous works. Conclusion: This study presents a real-world application of data mining to anesthesiology. Unlike previous research, this study considers a wider variety of predictive factors, including PCA

  10. Evaluation of the potential allergenicity of the enzyme microbial transglutaminase using the 2001 FAO/WHO Decision Tree

    DEFF Research Database (Denmark)

    Pedersen, Mona H.; Hansen, Tine K.; Sten, Eva

    2004-01-01

    All novel proteins must be assessed for their potential allergenicity before they are introduced into the food market. One method to achieve this is the 2001 FAO/WHO Decision Tree recommended for evaluation of proteins from genetically modified organisms (GMOs). It was the aim of this study...... to investigate the allergenicity of microbial transglutaminase (m-TG) from Streptoverticillium mobaraense. Amino acid sequence similarity to known allergens, pepsin resistance, and detection of protein binding to specific serum immunoglobulin E (IgE) (RAST) have been evaluated as recommended by the decision tree...... meets the requirements of the decision tree. However, there is a match at the five contiguous amino acid level to the major codfish allergen Gad c1. The potential cross reactivity between m-TG and Gad c1 was investigated in RAST using sera from 25 documented cod-allergic patients and an extract of raw...

  11. Exploratory Use of Decision Tree Analysis in Classification of Outcome in Hypoxic–Ischemic Brain Injury

    Directory of Open Access Journals (Sweden)

    Thanh G. Phan

    2018-03-01

    Background: Prognostication following hypoxic ischemic encephalopathy (brain injury) is important for clinical management. The aim of this exploratory study is to use a decision tree model to find clinical and MRI associates of severe disability and death in this condition. We evaluate a clinical model and then the added value of MRI data. Method: The inclusion criteria were as follows: age ≥17 years, cardio-respiratory arrest, and coma on admission (2003–2011). Decision tree analysis was used to find clinical [Glasgow Coma Score (GCS), features about cardiac arrest, therapeutic hypothermia, age, and sex] and MRI (infarct volume) associates of severe disability and death. We used the area under the ROC curve (auROC) to determine the accuracy of the model. There were 41 (63.7% males) patients having MRI imaging, with an average age of 51.5 ± 18.9 years. The decision trees showed that infarct volume and age were important factors for discrimination between mild to moderate disability and severe disability and death at day 0 and day 2. The auROC for this model was 0.94 (95% CI 0.82–1.00). At day 7, GCS value was the only predictor; the auROC was 0.96 (95% CI 0.86–1.00). Conclusion: Our findings provide proof of concept for further exploration of the role of MR imaging and decision tree analysis in the early prognostication of hypoxic ischemic brain injury.

  12. Bayesian averaging over Decision Tree models for trauma severity scoring.

    Science.gov (United States)

    Schetinin, V; Jakaite, L; Krzanowski, W

    2018-01-01

    Health care practitioners analyse possible risks of misleading decisions and need to estimate and quantify uncertainty in predictions. We have examined the "gold" standard of screening a patient's conditions for predicting survival probability, based on logistic regression modelling, which is used in trauma care for clinical purposes and quality audit. This methodology is based on theoretical assumptions about data and uncertainties. Models induced within such an approach have exposed a number of problems, providing unexplained fluctuation of predicted survival and low accuracy of estimating uncertainty intervals within which predictions are made. Bayesian method, which in theory is capable of providing accurate predictions and uncertainty estimates, has been adopted in our study using Decision Tree models. Our approach has been tested on a large set of patients registered in the US National Trauma Data Bank and has outperformed the standard method in terms of prediction accuracy, thereby providing practitioners with accurate estimates of the predictive posterior densities of interest that are required for making risk-aware decisions. Copyright © 2017 Elsevier B.V. All rights reserved.

  13. Ultrasonographic diagnosis of biliary atresia based on a decision-making tree model

    Energy Technology Data Exchange (ETDEWEB)

    Lee, So Mi; Cheon, Jung Eun; Choi, Young Hun; Kim, Woo Sun; Cho, Hyun Hye; Kim, In One; You, Sun Kyoung [Dept. of Radiology, Seoul National University College of Medicine, Seoul (Korea, Republic of)

    2015-12-15

    To assess the diagnostic value of various ultrasound (US) findings and to make a decision-tree model for US diagnosis of biliary atresia (BA). From March 2008 to January 2014, the following US findings were retrospectively evaluated in 100 infants with cholestatic jaundice (BA, n = 46; non-BA, n = 54): length and morphology of the gallbladder, triangular cord thickness, hepatic artery and portal vein diameters, and visualization of the common bile duct. Logistic regression analyses were performed to determine the features that would be useful in predicting BA. Conditional inference tree analysis was used to generate a decision-making tree for classifying patients into the BA or non-BA groups. Multivariate logistic regression analysis showed that abnormal gallbladder morphology and greater triangular cord thickness were significant predictors of BA (p = 0.003 and 0.001; adjusted odds ratio: 345.6 and 65.6, respectively). In the decision-making tree using conditional inference tree analysis, gallbladder morphology and triangular cord thickness (optimal cutoff value of triangular cord thickness, 3.4 mm) were also selected as significant discriminators for differential diagnosis of BA, and gallbladder morphology was the first discriminator. The diagnostic performance of the decision-making tree was excellent, with sensitivity of 100% (46/46), specificity of 94.4% (51/54), and overall accuracy of 97% (97/100). Abnormal gallbladder morphology and greater triangular cord thickness (> 3.4 mm) were the most useful predictors of BA on US. We suggest that the gallbladder morphology should be evaluated first and that triangular cord thickness should be evaluated subsequently in cases with normal gallbladder morphology.
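
    The diagnostic performance figures quoted in this abstract follow directly from the stated counts, as the short check below shows. The function name is hypothetical. The confusion-matrix counts are inferred from the abstract's fractions: 46 of 46 BA cases and 51 of 54 non-BA cases classified correctly, which implies 0 false negatives and 3 false positives.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # true positive rate
    specificity = tn / (tn + fp)                 # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # overall fraction correct
    return sensitivity, specificity, accuracy

# Counts inferred from the abstract: BA is the positive class.
sens, spec, acc = diagnostic_metrics(tp=46, fn=0, tn=51, fp=3)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
```

    The computed values reproduce the reported 100%, 94.4%, and 97%.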

  14. Ultrasonographic diagnosis of biliary atresia based on a decision-making tree model

    International Nuclear Information System (INIS)

    Lee, So Mi; Cheon, Jung Eun; Choi, Young Hun; Kim, Woo Sun; Cho, Hyun Hye; Kim, In One; You, Sun Kyoung

    2015-01-01

    To assess the diagnostic value of various ultrasound (US) findings and to make a decision-tree model for US diagnosis of biliary atresia (BA). From March 2008 to January 2014, the following US findings were retrospectively evaluated in 100 infants with cholestatic jaundice (BA, n = 46; non-BA, n = 54): length and morphology of the gallbladder, triangular cord thickness, hepatic artery and portal vein diameters, and visualization of the common bile duct. Logistic regression analyses were performed to determine the features that would be useful in predicting BA. Conditional inference tree analysis was used to generate a decision-making tree for classifying patients into the BA or non-BA groups. Multivariate logistic regression analysis showed that abnormal gallbladder morphology and greater triangular cord thickness were significant predictors of BA (p = 0.003 and 0.001; adjusted odds ratio: 345.6 and 65.6, respectively). In the decision-making tree using conditional inference tree analysis, gallbladder morphology and triangular cord thickness (optimal cutoff value of triangular cord thickness, 3.4 mm) were also selected as significant discriminators for differential diagnosis of BA, and gallbladder morphology was the first discriminator. The diagnostic performance of the decision-making tree was excellent, with sensitivity of 100% (46/46), specificity of 94.4% (51/54), and overall accuracy of 97% (97/100). Abnormal gallbladder morphology and greater triangular cord thickness (> 3.4 mm) were the most useful predictors of BA on US. We suggest that the gallbladder morphology should be evaluated first and that triangular cord thickness should be evaluated subsequently in cases with normal gallbladder morphology

  15. Ultrasonographic Diagnosis of Biliary Atresia Based on a Decision-Making Tree Model.

    Science.gov (United States)

    Lee, So Mi; Cheon, Jung-Eun; Choi, Young Hun; Kim, Woo Sun; Cho, Hyun-Hae; Cho, Hyun-Hye; Kim, In-One; You, Sun Kyoung

    2015-01-01

    To assess the diagnostic value of various ultrasound (US) findings and to make a decision-tree model for US diagnosis of biliary atresia (BA). From March 2008 to January 2014, the following US findings were retrospectively evaluated in 100 infants with cholestatic jaundice (BA, n = 46; non-BA, n = 54): length and morphology of the gallbladder, triangular cord thickness, hepatic artery and portal vein diameters, and visualization of the common bile duct. Logistic regression analyses were performed to determine the features that would be useful in predicting BA. Conditional inference tree analysis was used to generate a decision-making tree for classifying patients into the BA or non-BA groups. Multivariate logistic regression analysis showed that abnormal gallbladder morphology and greater triangular cord thickness were significant predictors of BA (p = 0.003 and 0.001; adjusted odds ratio: 345.6 and 65.6, respectively). In the decision-making tree using conditional inference tree analysis, gallbladder morphology and triangular cord thickness (optimal cutoff value of triangular cord thickness, 3.4 mm) were also selected as significant discriminators for differential diagnosis of BA, and gallbladder morphology was the first discriminator. The diagnostic performance of the decision-making tree was excellent, with sensitivity of 100% (46/46), specificity of 94.4% (51/54), and overall accuracy of 97% (97/100). Abnormal gallbladder morphology and greater triangular cord thickness (> 3.4 mm) were the most useful predictors of BA on US. We suggest that the gallbladder morphology should be evaluated first and that triangular cord thickness should be evaluated subsequently in cases with normal gallbladder morphology.

  16. Comparison of Taxi Time Prediction Performance Using Different Taxi Speed Decision Trees

    Science.gov (United States)

    Lee, Hanbong

    2017-01-01

    In the STBO modeler and tactical surface scheduler for ATD-2 project, taxi speed decision trees are used to calculate the unimpeded taxi times of flights taxiing on the airport surface. The initial taxi speed values in these decision trees did not show good prediction accuracy of taxi times. Using the more recent, reliable surveillance data, new taxi speed values in ramp area and movement area were computed. Before integrating these values into the STBO system, we performed test runs using live data from Charlotte airport, with different taxi speed settings: 1) initial taxi speed values and 2) new ones. Taxi time prediction performance was evaluated by comparing various metrics. The results show that the new taxi speed decision trees can calculate the unimpeded taxi-out times more accurately.

  17. Bounds on Average Time Complexity of Decision Trees

    KAUST Repository

    Chikalov, Igor

    2011-01-01

    In this chapter, bounds on the average depth and the average weighted depth of decision trees are considered. Similar problems are studied in search theory [1], coding theory [77], design and analysis of algorithms (e.g., sorting) [38]. For any

  18. Identification of pests and diseases of Dalbergia hainanensis based on EVI time series and classification of decision tree

    Science.gov (United States)

    Luo, Qiu; Xin, Wu; Qiming, Xiong

    2017-06-01

    Existing approaches to extracting vegetation information from remote sensing data often neglect phenological features and suffer from the low performance of the analysis algorithms. To address this problem, a method for extracting vegetation information based on EVI time series and decision-tree classification with multi-source branch similarity is proposed. First, to improve the stability of recognition accuracy over the time series, the seasonal features of vegetation are extracted based on the fitting span of the time series. Second, decision-tree similarity is determined by adaptive selection of the path or of the probability parameter of component prediction. This similarity serves as an index to evaluate the degree of task association, to decide whether to migrate the multi-source decision tree, and to ensure the speed of migration. Finally, the accuracy of classification and recognition of pests and diseases in Dalbergia hainanensis commercial forest reaches 87%--98%, which is significantly better than the MODIS coverage accuracy of 80%--96% in this area. These results support the validity of the proposed method.

  19. A FTA-based method for risk decision-making in emergency response

    DEFF Research Database (Denmark)

    Liu, Yang; Li, Hongyan

    2014-01-01

    Decision-making problems in emergency response are usually risky and uncertain due to the limited decision data and possible evolvement of emergency scenarios. This paper focuses on a risk decision-making problem in emergency response with several distinct characteristics including dynamic evolvement process of emergency, multiple scenarios, and impact of response actions on the emergency scenarios. A method based on Fault Tree Analysis (FTA) is proposed to solve the problem. By analyzing the evolvement process of emergency, the Fault Tree (FT) is constructed to describe the logical relations...

  20. Intracranial hypertension prediction using extremely randomized decision trees.

    Science.gov (United States)

    Scalzo, Fabien; Hamilton, Robert; Asgari, Shadnaz; Kim, Sunghan; Hu, Xiao

    2012-10-01

    Intracranial pressure (ICP) elevation (intracranial hypertension, IH) in neurocritical care is typically treated in a reactive fashion; it is only delivered after bedside clinicians notice prolonged ICP elevation. A proactive solution is desirable to improve the treatment of intracranial hypertension. Several studies have shown that the waveform morphology of the intracranial pressure pulse holds predictors about future intracranial hypertension and could therefore be used to alert the bedside clinician of a likely occurrence of the elevation in the immediate future. In this paper, a computational framework is proposed to predict prolonged intracranial hypertension based on morphological waveform features computed from the ICP. A key contribution of this work is to exploit an ensemble classifier method based on extremely randomized decision trees (Extra-Trees). Experiments on a representative set of 30 patients admitted for various intracranial pressure related conditions demonstrate the effectiveness of the predicting framework on ICP pulses acquired under clinical conditions and the superior results of the proposed approach in comparison to linear and AdaBoost classifiers. Copyright © 2011 IPEM. Published by Elsevier Ltd. All rights reserved.

  1. An overview of decision tree applied to power systems

    DEFF Research Database (Denmark)

    Liu, Leo; Rather, Zakir Hussain; Chen, Zhe

    2013-01-01

    The corrosive volume of available data in electric power systems motivates the adoption of data mining techniques in the emerging field of power system data analytics. The mainstream data mining algorithm applied to power systems, the Decision Tree (DT), also known as Classification And Regression Tree (CART), has gained increasing interest because of its high performance in terms of computational efficiency, uncertainty manageability, and interpretability. This paper presents an overview of a variety of DT applications to power systems for better interfacing of power systems with data analytics. The fundamentals of the CART algorithm are also introduced, followed by examples of both a classification tree and a regression tree, with the help of a case study on security assessment of the Danish power system.

  2. Cloud Detection from Satellite Imagery: A Comparison of Expert-Generated and Automatically-Generated Decision Trees

    Science.gov (United States)

    Shiffman, Smadar

    2004-01-01

    Automated cloud detection and tracking is an important step in assessing global climate change via remote sensing. Cloud masks, which indicate whether individual pixels depict clouds, are included in many of the data products that are based on data acquired on board earth satellites. Many cloud-mask algorithms have the form of decision trees, which employ sequential tests that scientists designed based on empirical astrophysics studies and astrophysics simulations. Limitations of existing cloud masks restrict our ability to accurately track changes in cloud patterns over time. In this study we explored the potential benefits of automatically-learned decision trees for detecting clouds from images acquired using the Advanced Very High Resolution Radiometer (AVHRR) instrument on board the NOAA-14 weather satellite of the National Oceanic and Atmospheric Administration. We constructed three decision trees for a sample of 8km-daily AVHRR data from 2000 using a decision-tree learning procedure provided within MATLAB, and compared the accuracy of the decision trees to the accuracy of the cloud mask. We used ground observations collected by the National Aeronautics and Space Administration Clouds and the Earth's Radiant Energy Systems S'COOL project as the gold standard. For the sample data, the accuracy of automatically learned decision trees was greater than the accuracy of the cloud masks included in the AVHRR data product.

  3. Decision Tree Repository and Rule Set Based Mingjiang River Estuarine Wetlands Classifaction

    Science.gov (United States)

    Zhang, W.; Li, X.; Xiao, W.

    2018-05-01

    Increasing urbanization and industrialization have led to wetland losses in the estuarine area of the Mingjiang River over the past three decades. Increasing attention has been given to producing wetland inventories using remote sensing and GIS technology. Because of inconsistent training sites and training samples, traditional pixel-based image classification methods cannot achieve comparable results across different organizations. Meanwhile, the object-oriented image classification technique shows great potential to solve this problem, and Landsat moderate-resolution remote sensing images are widely used to fulfill this requirement. Firstly, standardized atmospheric correction and spectrally high-fidelity texture feature enhancement were conducted before implementing the object-oriented wetland classification method in eCognition. Secondly, we performed the multi-scale segmentation procedure, taking the scale, hue, shape, compactness and smoothness of the image into account to get the appropriate parameters; using the top and down region merge algorithm from the single pixel level, the optimal texture segmentation scale for different types of features is confirmed. Then, the segmented object is used as the classification unit to calculate spectral information such as the mean value, maximum value, minimum value, brightness value and normalized value. Spatial features of the image objects, such as area, length, tightness and shape rule, and texture features, such as mean, variance and entropy, are used as classification features of the training samples. Based on the reference images and the sampling points of the on-the-spot investigation, typical training samples are selected uniformly and randomly for each type of ground object. The ranges of values of the spectral, texture and spatial characteristics of each type of feature in each feature layer are used to create the decision tree repository. Finally, with the help of high resolution reference images, the

  4. Binary Decision Trees for Preoperative Periapical Cyst Screening Using Cone-beam Computed Tomography.

    Science.gov (United States)

    Pitcher, Brandon; Alaqla, Ali; Noujeim, Marcel; Wealleans, James A; Kotsakis, Georgios; Chrepa, Vanessa

    2017-03-01

    Cone-beam computed tomographic (CBCT) analysis allows for 3-dimensional assessment of periradicular lesions and may facilitate preoperative periapical cyst screening. The purpose of this study was to develop and assess the predictive validity of a cyst screening method based on CBCT volumetric analysis alone or combined with designated radiologic criteria. Three independent examiners evaluated 118 presurgical CBCT scans from cases that underwent apicoectomies and had an accompanying gold standard histopathological diagnosis of either a cyst or granuloma. Lesion volume, density, and specific radiologic characteristics were assessed using specialized software. Logistic regression models with histopathological diagnosis as the dependent variable were constructed for cyst prediction, and receiver operating characteristic curves were used to assess the predictive validity of the models. A conditional inference binary decision tree based on a recursive partitioning algorithm was constructed to facilitate preoperative screening. Interobserver agreement was excellent for volume and density, but it varied from poor to good for the radiologic criteria. Volume and root displacement were strong predictors for cyst screening in all analyses. The binary decision tree classifier determined that if the volume of the lesion was >247 mm³, there was an 80% probability of a cyst. If volume was cyst probability was 60% (78% accuracy). The good accuracy and high specificity of the decision tree classifier render it a useful preoperative cyst screening tool that can aid in clinical decision making but not a substitute for definitive histopathological diagnosis after biopsy. Confirmatory studies are required to validate the present findings. Published by Elsevier Inc.

  5. Optimization of matrix tablets controlled drug release using Elman dynamic neural networks and decision trees.

    Science.gov (United States)

    Petrović, Jelena; Ibrić, Svetlana; Betz, Gabriele; Đurić, Zorica

    2012-05-30

    The main objective of the study was to develop artificial intelligence methods for optimization of drug release from matrix tablets regardless of the matrix type. Static and dynamic artificial neural networks of the same topology were developed to model dissolution profiles of different matrix tablet types (hydrophilic/lipid) using formulation composition, compression force used for tableting, and tablet porosity and tensile strength as input data. Potential application of decision trees in discovering knowledge from experimental data was also investigated. Polyethylene oxide polymer and glyceryl palmitostearate were used as matrix-forming materials for hydrophilic and lipid matrix tablets, respectively, whereas the selected model drugs were diclofenac sodium and caffeine. Matrix tablets were prepared by the direct compression method and tested for in vitro dissolution profiles. Optimization of the static and dynamic neural networks used for modeling of drug release was performed using Monte Carlo simulations or a genetic algorithm optimizer. Decision trees were constructed following discretization of data. Calculated difference (f(1)) and similarity (f(2)) factors for predicted and experimentally obtained dissolution profiles of test matrix tablet formulations indicate that Elman dynamic neural networks as well as decision trees are capable of accurate predictions of both hydrophilic and lipid matrix tablet dissolution profiles. Elman neural networks were compared to the most frequently used static network, the multi-layered perceptron, and the superiority of Elman networks has been demonstrated. The developed methods allow a simple yet very precise way of predicting drug release for both hydrophilic and lipid matrix tablets having controlled drug release. Copyright © 2012 Elsevier B.V. All rights reserved.
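
    The difference (f1) and similarity (f2) factors mentioned in this abstract can be illustrated with a minimal sketch. It uses the commonly cited definitions of the two factors; the function names and the sample profiles are hypothetical and are not taken from the paper.

```python
import math


def difference_factor(reference, test):
    """f1: summed absolute difference as a percentage of the reference total."""
    if len(reference) != len(test) or not reference:
        raise ValueError("profiles must be non-empty and of equal length")
    total_diff = sum(abs(r - t) for r, t in zip(reference, test))
    return 100.0 * total_diff / sum(reference)


def similarity_factor(reference, test):
    """f2: log transform of the mean squared difference (100 means identical)."""
    if len(reference) != len(test) or not reference:
        raise ValueError("profiles must be non-empty and of equal length")
    mean_sq = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    return 50.0 * math.log10(100.0 / math.sqrt(1.0 + mean_sq))


# Hypothetical percent-dissolved values at matching time points (not study data).
observed = [20.0, 40.0, 60.0, 80.0]
predicted = [22.0, 38.0, 63.0, 79.0]

print(f"f1 = {difference_factor(observed, predicted):.2f}")
print(f"f2 = {similarity_factor(observed, predicted):.2f}")
```

    A lower f1 and a higher f2 indicate closer agreement between the predicted and the experimentally obtained profile.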

  6. Minimizing the cost of translocation failure with decision-tree models that predict species' behavioral response in translocation sites.

    Science.gov (United States)

    Ebrahimi, Mehregan; Ebrahimie, Esmaeil; Bull, C Michael

    2015-08-01

    The high number of failures is one reason why translocation is often not recommended. Considering how behavior changes during translocations may improve translocation success. To derive decision-tree models for species' translocation, we used data on the short-term responses of an endangered Australian skink in 5 simulated translocations with different release conditions. We used 4 different decision-tree algorithms (decision tree, decision-tree parallel, decision stump, and random forest) with 4 different criteria (gain ratio, information gain, gini index, and accuracy) to investigate how environmental and behavioral parameters may affect the success of a translocation. We assumed behavioral changes that increased dispersal away from a release site would reduce translocation success. The trees became more complex when we included all behavioral parameters as attributes, but these trees yielded more detailed information about why and how dispersal occurred. According to these complex trees, there were positive associations between some behavioral parameters, such as fight and dispersal, that showed there was a higher chance, for example, of dispersal among lizards that fought than among those that did not fight. Decision trees based on parameters related to release conditions were easier to understand and could be used by managers to make translocation decisions under different circumstances. © 2015 Society for Conservation Biology.
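
    Three of the splitting criteria named in this abstract (information gain, gain ratio, and gini index) can be sketched for a single categorical attribute as follows. This is an illustrative sketch only: the function names and the toy "fought"/"dispersed" records are hypothetical, not the study's data or software.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    return 1.0 - sum((c / total) ** 2 for c in Counter(labels).values())


def partition(values, labels):
    """Group the labels by the attribute value of each record."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    return list(groups.values())


def information_gain(values, labels):
    """Entropy of the labels minus the weighted entropy after the split."""
    total = len(labels)
    remainder = sum(len(g) / total * entropy(g) for g in partition(values, labels))
    return entropy(labels) - remainder


def gain_ratio(values, labels):
    """Information gain normalized by the entropy of the attribute itself."""
    split_info = entropy(values)
    return information_gain(values, labels) / split_info if split_info > 0 else 0.0


def gini_decrease(values, labels):
    """Gini impurity of the labels minus the weighted impurity after the split."""
    total = len(labels)
    remainder = sum(len(g) / total * gini(g) for g in partition(values, labels))
    return gini(labels) - remainder


# Hypothetical toy records: did the lizard fight, and did it disperse?
fought = ["yes", "yes", "no", "no", "no", "yes"]
dispersed = ["yes", "yes", "no", "no", "yes", "yes"]

print("information gain:", round(information_gain(fought, dispersed), 3))
print("gain ratio:", round(gain_ratio(fought, dispersed), 3))
print("gini decrease:", round(gini_decrease(fought, dispersed), 3))
```

    A tree learner would evaluate such a score for every candidate attribute and split on the one with the highest value.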

  7. Decision tree approach for classification of remotely sensed satellite ...

    Indian Academy of Sciences (India)

    sensed satellite data using open source support. Richa Sharma .... Decision tree classification techniques have been .... the USGS Earth Resource Observation Systems. (EROS) ... for shallow water, 11% were for sparse and dense built-up ...

  8. Decision Trees Predicting Tumor Shrinkage for Head and Neck Cancer: Implications for Adaptive Radiotherapy.

    Science.gov (United States)

    Surucu, Murat; Shah, Karan K; Mescioglu, Ibrahim; Roeske, John C; Small, William; Choi, Mehee; Emami, Bahman

    2016-02-01

    To develop decision trees predicting for tumor volume reduction in patients with head and neck (H&N) cancer using pretreatment clinical and pathological parameters. Forty-eight patients treated with definitive concurrent chemoradiotherapy for squamous cell carcinoma of the nasopharynx, oropharynx, oral cavity, or hypopharynx were retrospectively analyzed. These patients were rescanned at a median dose of 37.8 Gy and replanned to account for anatomical changes. The percentages of gross tumor volume (GTV) change from initial to rescan computed tomography (CT; %GTVΔ) were calculated. Two decision trees were generated to correlate %GTVΔ in primary and nodal volumes with 14 characteristics including age, gender, Karnofsky performance status (KPS), site, human papilloma virus (HPV) status, tumor grade, primary tumor growth pattern (endophytic/exophytic), tumor/nodal/group stages, chemotherapy regimen, and primary, nodal, and total GTV volumes in the initial CT scan. The C4.5 Decision Tree induction algorithm was implemented. The median %GTVΔ for primary, nodal, and total GTVs was 26.8%, 43.0%, and 31.2%, respectively. Type of chemotherapy, age, primary tumor growth pattern, site, KPS, and HPV status were the most predictive parameters for primary %GTVΔ decision tree, whereas for nodal %GTVΔ, KPS, site, age, primary tumor growth pattern, initial primary GTV, and total GTV volumes were predictive. Both decision trees had an accuracy of 88%. There can be significant changes in primary and nodal tumor volumes during the course of H&N chemoradiotherapy. Considering the proposed decision trees, radiation oncologists can select patients predicted to have high %GTVΔ, who would theoretically gain the most benefit from adaptive radiotherapy, in order to better use limited clinical resources. © The Author(s) 2015.

  9. An Assessment for A Filtered Containment Venting Strategy Using Decision Tree Models

    International Nuclear Information System (INIS)

    Shin, Hoyoung; Jae, Moosung

    2016-01-01

    In this study, a probabilistic assessment of the severe accident management strategy through a filtered containment venting system was performed by using decision tree models. In Korea, the filtered containment venting system has been installed for the first time in Wolsong unit 1 as a part of Fukushima follow-up steps, and it is planned to be applied gradually for all the remaining reactors. Filtered containment venting system, one of severe accident countermeasures, prevents a gradual pressurization of the containment building exhausting noncondensable gas and vapor to the outside of the containment building. In this study, a probabilistic assessment of the filtered containment venting strategy, one of the severe accident management strategies, was performed by using decision tree models. Containment failure frequencies of each decision were evaluated by the developed decision tree model. The optimum accident management strategies were evaluated by comparing the results. Various strategies in severe accident management guidelines (SAMG) could be improved by utilizing the methodology in this study and the offsite risk analysis methodology

  10. An Assessment for A Filtered Containment Venting Strategy Using Decision Tree Models

    Energy Technology Data Exchange (ETDEWEB)

    Shin, Hoyoung; Jae, Moosung [Hanyang University, Seoul (Korea, Republic of)

    2016-10-15

    In this study, a probabilistic assessment of the severe accident management strategy through a filtered containment venting system was performed by using decision tree models. In Korea, the filtered containment venting system has been installed for the first time in Wolsong unit 1 as a part of Fukushima follow-up steps, and it is planned to be applied gradually for all the remaining reactors. Filtered containment venting system, one of severe accident countermeasures, prevents a gradual pressurization of the containment building exhausting noncondensable gas and vapor to the outside of the containment building. In this study, a probabilistic assessment of the filtered containment venting strategy, one of the severe accident management strategies, was performed by using decision tree models. Containment failure frequencies of each decision were evaluated by the developed decision tree model. The optimum accident management strategies were evaluated by comparing the results. Various strategies in severe accident management guidelines (SAMG) could be improved by utilizing the methodology in this study and the offsite risk analysis methodology.

  11. Assessing School Readiness for a Practice Arrangement Using Decision Tree Methodology.

    Science.gov (United States)

    Barger, Sara E.

    1998-01-01

    Questions in a decision-tree address mission, faculty interest, administrative support, and practice plan as a way of assessing arrangements for nursing faculty's clinical practice. Decisions should be based on congruence between the human resource allocation and the reward systems. (SK)

  12. Comparison of the use of binary decision trees and neural networks in top-quark detection

    International Nuclear Information System (INIS)

    Bowser-Chao, D.; Dzialo, D.L.

    1993-01-01

    The use of neural networks for signal versus background discrimination in high-energy physics experiments has been investigated and has compared favorably with the efficiency of traditional kinematic cuts. Recent work in top-quark identification produced a neural network that, for a given top-quark mass, yielded a higher signal-to-background ratio in Monte Carlo simulation than a corresponding set of conventional cuts. In this article we discuss another pattern-recognition algorithm, the binary decision tree. We apply a binary decision tree to top-quark identification at the Fermilab Tevatron and find it to be comparable in performance to the neural network. Furthermore, reservations about the "black box" nature of neural network discriminators do not apply to binary decision trees; a binary decision tree may be reduced to a set of kinematic cuts subject to conventional error analysis.
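
    The remark that a binary decision tree reduces to a set of cuts can be illustrated with a short sketch that walks a tree and lists the conjunction of threshold conditions leading to each signal leaf. The tree, the variable names (`lepton_pt`, `missing_et`), and the thresholds are hypothetical and are not taken from the article.

```python
# Hypothetical two-level tree; leaves are the strings "signal" or "background".
tree = {
    "feature": "lepton_pt",
    "threshold": 20.0,
    "below": "background",
    "above": {
        "feature": "missing_et",
        "threshold": 25.0,
        "below": "background",
        "above": "signal",
    },
}


def signal_cuts(node, path=()):
    """Return one list of cuts for every root-to-leaf path ending in 'signal'."""
    if isinstance(node, str):  # reached a leaf
        return [list(path)] if node == "signal" else []
    feature, threshold = node["feature"], node["threshold"]
    below = signal_cuts(node["below"], path + (f"{feature} <= {threshold}",))
    above = signal_cuts(node["above"], path + (f"{feature} > {threshold}",))
    return below + above


for cuts in signal_cuts(tree):
    print(" AND ".join(cuts))
```

    Each printed line is an ordinary set of cuts, which is why such a classifier can be examined with the same tools as hand-chosen cuts.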

  13. Decision tree approach for classification of remotely sensed satellite

    Indian Academy of Sciences (India)

    DTC) algorithm for classification of remotely sensed satellite data (Landsat TM) using open source support. The decision tree is constructed by recursively partitioning the spectral distribution of the training dataset using WEKA, open source ...

  14. A decision tree for differentiating multiple system atrophy from Parkinson's disease using 3-T MR imaging.

    Science.gov (United States)

    Nair, Shalini Rajandran; Tan, Li Kuo; Mohd Ramli, Norlisah; Lim, Shen Yang; Rahmat, Kartini; Mohd Nor, Hazman

    2013-06-01

    To develop a decision tree based on standard magnetic resonance imaging (MRI) and diffusion tensor imaging to differentiate multiple system atrophy (MSA) from Parkinson's disease (PD). 3-T brain MRI and DTI (diffusion tensor imaging) were performed on 26 PD and 13 MSA patients. Regions of interest (ROIs) were the putamen, substantia nigra, pons, middle cerebellar peduncles (MCP) and cerebellum. Linear, volumetry and DTI (fractional anisotropy and mean diffusivity) were measured. A three-node decision tree was formulated, with design goals being 100 % specificity at node 1, 100 % sensitivity at node 2 and highest combined sensitivity and specificity at node 3. Nine parameters (mean width, fractional anisotropy (FA) and mean diffusivity (MD) of MCP; anteroposterior diameter of pons; cerebellar FA and volume; pons and mean putamen volume; mean FA substantia nigra compacta-rostral) showed statistically significant (P decision tree. Threshold values were 14.6 mm, 21.8 mm and 0.55, respectively. Overall performance of the decision tree was 92 % sensitivity, 96 % specificity, 92 % PPV and 96 % NPV. Twelve out of 13 MSA patients were accurately classified. Formation of the decision tree using these parameters was both descriptive and predictive in differentiating between MSA and PD. • Parkinson's disease and multiple system atrophy can be distinguished on MR imaging. • Combined conventional MRI and diffusion tensor imaging improves the accuracy of diagnosis. • A decision tree is descriptive and predictive in differentiating between clinical entities. • A decision tree can reliably differentiate Parkinson's disease from multiple system atrophy.
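
    The overall performance figures in this abstract can be checked with a small worked example. The abstract states 13 MSA and 26 PD patients and that 12 of 13 MSA patients were correctly classified; the count of 25 correctly classified PD patients is an assumption inferred here from the reported 96 % specificity, not a number given in the abstract.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard diagnostic accuracy measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# tp and fn follow from "12 of 13 MSA patients"; tn and fp are inferred (assumption).
metrics = diagnostic_metrics(tp=12, fn=1, tn=25, fp=1)

for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

    With these counts the four measures round to 92 %, 96 %, 92 % and 96 %, matching the values reported above.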

  15. Visualization of Decision Tree State for the Classification of Parkinson's Disease

    NARCIS (Netherlands)

    Valentijn, E

    2016-01-01

    Decision trees have been shown to be effective at classifying subjects with Parkinson’s disease when provided with features (subject scores) derived from FDG-PET data. Such subject scores have strong discriminative power but are not intuitive to understand. We therefore augment each decision node

  16. PCA based feature reduction to improve the accuracy of decision tree c4.5 classification

    Science.gov (United States)

    Nasution, M. Z. F.; Sitompul, O. S.; Ramli, M.

    2018-03-01

    Attribute splitting is a major process in Decision Tree C4.5 classification. However, this process does not have a significant impact on the construction of the decision tree in terms of removing irrelevant features. A major problem in the decision tree classification process is over-fitting, which results from noisy data and irrelevant features. In turn, over-fitting creates misclassification and data imbalance. Many algorithms have been proposed to overcome misclassification and over-fitting in Decision Tree C4.5 classification. Feature reduction is one of the important issues in classification models and is intended to remove irrelevant data in order to improve accuracy. The feature reduction framework is used to simplify high-dimensional data to low-dimensional data with non-correlated attributes. In this research, we propose a framework for selecting relevant and non-correlated feature subsets. We consider principal component analysis (PCA) for feature reduction to perform non-correlated feature selection and the Decision Tree C4.5 algorithm for the classification. In experiments conducted using the Cervical cancer data set from the UCI repository, with 858 instances and 36 attributes, we evaluated the performance of our framework based on accuracy, specificity and precision. Experimental results show that our proposed framework robustly enhances classification accuracy, reaching an accuracy rate of 90.70%.

  17. Coronal Mass Ejection Data Clustering and Visualization of Decision Trees

    Science.gov (United States)

    Ma, Ruizhe; Angryk, Rafal A.; Riley, Pete; Filali Boubrahimi, Soukaina

    2018-05-01

    Coronal mass ejections (CMEs) can be categorized as either “magnetic clouds” (MCs) or non-MCs. Features such as a large magnetic field, low plasma-beta, and low proton temperature suggest that a CME event is also an MC event; however, so far there is neither a definitive method nor an automatic process to distinguish the two. Human labeling is time-consuming, and results can fluctuate owing to the imprecise definition of such events. In this study, we approach the problem of MC and non-MC distinction from a time series data analysis perspective and show how clustering can shed some light on this problem. Although many algorithms exist for traditional data clustering in the Euclidean space, they are not well suited for time series data. Problems such as inadequate distance measure, inaccurate cluster center description, and lack of intuitive cluster representations need to be addressed for effective time series clustering. Our data analysis in this work is twofold: clustering and visualization. For clustering we compared the results from the popular hierarchical agglomerative clustering technique to a distance density clustering heuristic we developed previously for time series data clustering. In both cases, dynamic time warping will be used for similarity measure. For classification as well as visualization, we use decision trees to aggregate single-dimensional clustering results to form a multidimensional time series decision tree, with averaged time series to present each decision. In this study, we achieved modest accuracy and, more importantly, an intuitive interpretation of how different parameters contribute to an MC event.
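
    Dynamic time warping, the similarity measure named in this abstract, can be sketched with the textbook dynamic-programming recurrence. This is a generic illustration with an absolute-difference local cost and no windowing constraint, not the authors' implementation; the sample series are hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also matches an earlier point of b
                cost[i][j - 1],      # b[j-1] also matches an earlier point of a
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


# Hypothetical series: the second is a time-stretched copy of the first.
series_a = [1.0, 2.0, 3.0]
series_b = [1.0, 2.0, 2.0, 3.0]
print(dtw_distance(series_a, series_b))
```

    Because the warping path may repeat points, the stretched copy aligns perfectly with the original, which a point-by-point Euclidean comparison could not even compute for series of unequal length.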

  18. Detecting Structural Metadata with Decision Trees and Transformation-Based Learning

    National Research Council Canada - National Science Library

    Kim, Joungbum; Schwarm, Sarah E; Ostendorf, Mari

    2004-01-01

    .... Specifically, combinations of decision trees and language models are used to predict sentence ends and interruption points and given these events transformation based learning is used to detect edit...

  19. Decision tree analysis to evaluate dry cow strategies under UK conditions

    NARCIS (Netherlands)

    Berry, E.A.; Hogeveen, H.; Hillerton, J.E.

    2004-01-01

    Economic decisions on animal health strategies address the cost-benefit aspect along with animal welfare and public health concerns. Decision tree analysis at an individual cow level highlighted that there is little economic difference between the use of either dry cow antibiotic or an internal teat

  20. Non-compliance with a postmastectomy radiotherapy guideline: Decision tree and cause analysis

    OpenAIRE

    Razavi, Amir R; Gill, Hans; Åhlfeldt, Hans; Shahsavar, Nosrat

    2008-01-01

    Background: The guideline for postmastectomy radiotherapy (PMRT), which is prescribed to reduce recurrence of breast cancer in the chest wall and improve overall survival, is not always followed. Identifying and extracting important patterns of non-compliance are crucial in maintaining the quality of care in Oncology. Methods: Analysis of 759 patients with malignant breast cancer using decision tree induction (DTI) found patterns of non-compliance with the guideline. The PMRT guideline was us...

  1. Modeling flash floods in ungauged mountain catchments of China: A decision tree learning approach for parameter regionalization

    Science.gov (United States)

    Ragettli, S.; Zhou, J.; Wang, H.; Liu, C.

    2017-12-01

    Flash floods in small mountain catchments are one of the most frequent causes of loss of life and property from natural hazards in China. Hydrological models can be a useful tool for the anticipation of these events and the issuing of timely warnings. Since sub-daily streamflow information is unavailable for most small basins in China, one of the main challenges is finding appropriate parameter values for simulating flash floods in ungauged catchments. In this study, we use decision tree learning to explore parameter set transferability between different catchments. For this purpose, the physically-based, semi-distributed rainfall-runoff model PRMS-OMS is set up for 35 catchments in ten Chinese provinces. Hourly data from more than 800 storm runoff events are used to calibrate the model and evaluate the performance of parameter set transfers between catchments. For each catchment, 58 catchment attributes are extracted from several data sets available for the whole of China. We then use a data mining technique (decision tree learning) to identify catchment similarities that can be related to good transfer performance. Finally, we use the splitting rules of decision trees to find suitable donor catchments for ungauged target catchments. We show that decision tree learning makes optimal use of the information content of available catchment descriptors and outperforms regionalization based on a conventional measure of physiographic-climatic similarity by 15%-20%. Similar performance can be achieved with a regionalization method based on spatial proximity, but decision trees offer flexible rules for selecting suitable donor catchments that do not rely on the vicinity of gauged catchments. This flexibility makes the method particularly suitable for implementation in sparsely gauged environments. We evaluate the probability of detecting flood events exceeding a given return period, considering measured discharge and PRMS-OMS simulated flows with regionalized parameters.

  2. VLSI implementation of flexible architecture for decision tree classification in data mining

    Science.gov (United States)

    Sharma, K. Venkatesh; Shewandagn, Behailu; Bhukya, Shankar Nayak

    2017-07-01

    Data mining algorithms have become vital to researchers in the science, engineering, medicine, business, search and security domains. In recent years, there has been a tremendous rise in the size of the data being collected and analyzed. Classification is the main difficulty faced in data mining. Among the solutions developed for this problem, the most accepted is Decision Tree Classification (DTC), which gives high precision while handling very large amounts of data. This paper presents a VLSI implementation of a flexible architecture for decision tree classification in data mining using the C4.5 algorithm.

  3. Data mining usage in health care management: literature survey and decision tree application

    Directory of Open Access Journals (Sweden)

    Dijana Ćosić

    2008-02-01

    Full Text Available Aim To show the benefits of data mining in health care management. In this example, we show a way to raise the awareness of women in terms of the contraceptive methods they use (or do not use). Methods The goal of the data mining analysis was to determine whether there are common characteristics of the women according to their choice of contraception (a typical classification problem). Therefore, we decided to use decision trees. We generated a CHAID model in “Statistica”, based on the database that was formed as a result of Indonesian research conducted in 1987. The sample contains married women who were either not pregnant or did not know if they were pregnant at the time of the interview. The database consists of 1473 cases. Also, an extensive internet search was conducted in order to determine the number of articles cited in scientific databases that were published on the subject of data mining in health care management. Results The analysis showed that the most important variable in women’s choice of contraceptive methods is the husband’s profession. We also retrieved 221 articles published on the application of data mining in health care. Conclusion The goal of the paper is achieved in two ways: first, by retrieving 221 articles published on the subject, we have shown the benefits of data mining in health care management. Second, the decision tree method is successfully applied in explaining women’s choice of contraceptive methods.

  4. Klasifikasi Nilai Kelayakan Calon Debitur Baru Menggunakan Decision Tree C4.5

    Directory of Open Access Journals (Sweden)

    Bambang Hermanto

    2017-01-01

    Full Text Available In an effort to improve the quality of customer service, especially the feasibility assessment of borrowers given the increasing number of new prospective borrowers seeking loans to finance the purchase of a motor vehicle, the company needs a decision-making tool that easily and quickly estimates whether a debtor is able to pay off the loan. This study discusses the process of generating a decision tree with the C4.5 algorithm, using a learning dataset of debtors with motorcycle financing. The decision tree is then interpreted in the form of decision rules that can be understood and used as a reference when processing borrower data to determine the feasibility of prospective new borrowers. The feasibility value refers to the value of the target parameter, credit status. If the credit status is "paid off", the prospective borrower is estimated to be able to repay the loan in question, but if the estimated credit status is "pull", the prospective debtor is considered unable to repay the loan. System testing was done by comparing the results on the testing data with the learning data in three scenarios, with the decisions found to be valid for over 70% of the data in all scenarios. Moreover, generating the tree and the rules is fairly quick, taking no more than 15 minutes for each test scenario.

  5. The application of outsourcing decision-making methods in a logistics context in South Africa

    Directory of Open Access Journals (Sweden)

    Naomi Bloem

    2015-07-01

    Full Text Available Background: Companies have often relinquished the control of important business functions to outside suppliers for the sake of short-term savings and because of the lack of use of proper decision-making methods within the business. Objectives: This article identified three methods of decision-making and applied them to a logistics outsourcing problem. The logistics outsourcing problem consisted of a make-or-buy decision as well as a supplier selection process. The purpose of the study was to determine the most suitable method in the case of logistics outsourcing. Method: The decision-making methods were applied to a South African case study within the fast moving consumer goods (FMCG) industry. The logistics functions considered in the case study included secondary distribution and warehousing of finished goods. Each method considered the same evaluation criteria and the results were analysed and compared. Results: Each method produced different results for the logistics outsourcing problem. The method developed by Platts, Probert and Canez (2000) suggested that the logistics functions be insourced. The decision tree method suggested outsourcing both functions, with a unit rate cost model. The results from the linear programming (LP) method indicated that the secondary distribution function should be insourced and the warehousing function outsourced, with a fixed and variable cost model pending further analysis of the demand trends. Conclusion: The study provides empirical evidence that proven outsourcing decision-making methods, such as the method developed by Platts et al. (2000), the LP method and the decision tree method traditionally applied to a manufacturing outsourcing decision problem, can be adapted and applied to a logistics outsourcing decision problem of a South African FMCG company.

  6. Hyper-parameter tuning of a decision tree induction algorithm

    NARCIS (Netherlands)

    Mantovani, R.G.; Horváth, T.; Cerri, R.; Vanschoren, J.; de Carvalho, A.C.P.L.F.

    2017-01-01

    Supervised classification is the most studied task in Machine Learning. Among the many algorithms used in such task, Decision Tree algorithms are a popular choice, since they are robust and efficient to construct. Moreover, they have the advantage of producing comprehensible models and satisfactory

  7. Decision tree analysis to stratify risk of de novo non-melanoma skin cancer following liver transplantation.

    Science.gov (United States)

    Tanaka, Tomohiro; Voigt, Michael D

    2018-03-01

    Non-melanoma skin cancer (NMSC) is the most common de novo malignancy in liver transplant (LT) recipients; it behaves more aggressively and it increases mortality. We used decision tree analysis to develop a tool to stratify and quantify risk of NMSC in LT recipients. We performed Cox regression analysis to identify which predictive variables to enter into the decision tree analysis. Data were from the Organ Procurement Transplant Network (OPTN) STAR files of September 2016 (n = 102984). NMSC developed in 4556 of the 105984 recipients, a mean of 5.6 years after transplant. The 5/10/20-year rates of NMSC were 2.9/6.3/13.5%, respectively. Cox regression identified male gender, Caucasian race, age, body mass index (BMI) at LT, and sirolimus use as key predictive or protective factors for NMSC. These factors were entered into a decision tree analysis. The final tree stratified non-Caucasians as low risk (0.8%), and Caucasian males > 47 years, BMI decision tree model accurately stratifies the risk of developing NMSC in the long-term after LT.

  8. Reconciliation as a tool for decision making within decision tree related to insolvency problems

    Directory of Open Access Journals (Sweden)

    Tomáš Poláček

    2016-05-01

    Full Text Available Purpose of the article: The paper draws on the results of previous studies of the recoverability of creditors’ claims, which examined the debtor’s point of view and his/her debts on the Czech Republic financial market. A company that enters a bankruptcy hearing has several legislatively supported options for dealing with this situation and repaying creditors’ money. Each of the options has been specified as a variant of a decision-making tree. This paper focuses on the third option of evaluation: reconciliation. The heuristic generates all missing information items. The result then focuses on the comparison and evaluation of the best ways to repay the debt, including a solution for the future continuation of the company currently in liquidation and quantification of the percentage refund of creditors’ claims. A realistic case study is presented in full detail, along with a further introduction to decision making with uncertainties in insolvency proceedings. Methodology/methods: Solving within a decision tree with partial ignorance of probabilities using reconciliation. Scientific aim: Comparison and evaluation of the best ways to repay the debt, including a solution for the future continuation of the company currently in liquidation and quantification of the percentage refund of creditors’ claims. Findings: Predictions of future actions in dealing with the insolvency act and bankruptcy hearing, and quicker and more effective agreement on compromises among all creditors and the debtor. Conclusions: Finding the best way of repayment and avoiding termination for both interested parties (creditor and debtor).

  9. Development and acceptability testing of decision trees for self-management of prosthetic socket fit in adults with lower limb amputation.

    Science.gov (United States)

    Lee, Daniel Joseph; Veneri, Diana A

    2018-05-01

    The most common complaint lower limb prosthesis users report is inadequacy of a proper socket fit. Adjustments to the residual limb-socket interface can be made by the prosthesis user without consultation of a clinician in many scenarios through skilled self-management. Decision trees guide prosthesis wearers through the self-management process, empowering them to rectify fit issues, or referring them to a clinician when necessary. This study examines the development and acceptability testing of patient-centered decision trees for lower limb prosthesis users. Decision trees underwent a four-stage process: literature review and expert consultation, designing, two rounds of expert panel review and revisions, and target audience testing. Fifteen lower limb prosthesis users (average age 61 years) reviewed the decision trees and completed an acceptability questionnaire. Participants reported agreement of 80% or above in five of the eight questions related to acceptability of the decision trees. Disagreement was related to the level of experience of the respondent. Decision trees were found to be easy to use, illustrate correct solutions to common issues, and have terminology consistent with that of a new prosthesis user. Some users with greater than 1.5 years of experience would not use the decision trees based on their own self-management skills. Implications for Rehabilitation: Discomfort of the residual limb-prosthetic socket interface is the most common reason for clinician visits. Prosthesis users can use decision trees to guide them through the process of obtaining a proper socket fit independently. Newer users may benefit from using the decision trees more than experienced users.

  10. Classification of soil respiration in areas of sugarcane renewal using decision tree

    Directory of Open Access Journals (Sweden)

    Camila Viana Vieira Farhate

    Full Text Available ABSTRACT: The use of data mining is a promising alternative to predict soil respiration from correlated variables. Our objective was to build a model using variable selection and decision tree induction to predict different levels of soil respiration, taking into account physical, chemical and microbiological variables of soil as well as precipitation in renewal of sugarcane areas. The original dataset was composed of 19 variables (18 independent variables and one dependent, or response, variable). The target variable is soil respiration, which serves as the classification target. Because of the large number of variables, a variable selection procedure was conducted to remove those with low correlation with the target variable. For that purpose, four approaches to variable selection were evaluated: no variable selection, correlation-based feature selection (CFS), the chi-square method (χ2) and Wrapper. To classify soil respiration, we used the decision tree induction technique available in the Weka software package. Our results showed that data mining techniques allow the development of a model for soil respiration classification with accuracy of 81 %, resulting in a knowledge base composed of 27 rules for prediction of soil respiration. In particular, the wrapper method for variable selection identified a subset of only five variables out of 18 available in the original dataset, and they had the following order of influence in determining soil respiration: soil temperature > precipitation > macroporosity > soil moisture > potential acidity.

  11. Coalescent methods for estimating phylogenetic trees.

    Science.gov (United States)

    Liu, Liang; Yu, Lili; Kubatko, Laura; Pearl, Dennis K; Edwards, Scott V

    2009-10-01

    We review recent models to estimate phylogenetic trees under the multispecies coalescent. Although the distinction between gene trees and species trees has come to the fore of phylogenetics, only recently have methods been developed that explicitly estimate species trees. Of the several factors that can cause gene tree heterogeneity and discordance with the species tree, deep coalescence due to random genetic drift in branches of the species tree has been modeled most thoroughly. Bayesian approaches to estimating species trees utilize two likelihood functions, one of which has been widely used in traditional phylogenetics and involves the model of nucleotide substitution, and the second of which is less familiar to phylogeneticists and involves the probability distribution of gene trees given a species tree. Other recent parametric and nonparametric methods for estimating species trees involve parsimony criteria, summary statistics, supertree and consensus methods. Species tree approaches are an appropriate goal for systematics, appear to work well in some cases where concatenation can be misleading, and suggest that sampling many independent loci will be paramount. Such methods can also be challenging to implement because of the complexity of the models and computational time. In addition, further elaboration of the simplest of coalescent models will be required to incorporate commonly known issues such as deviation from the molecular clock, gene flow and other genetic forces.

  12. A ROUGH SET DECISION TREE BASED MLP-CNN FOR VERY HIGH RESOLUTION REMOTELY SENSED IMAGE CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    C. Zhang

    2017-09-01

    Full Text Available Recent advances in remote sensing have produced a great number of very high resolution (VHR) images acquired at sub-metre spatial resolution. These VHR remotely sensed data have posed enormous challenges for effective processing, analysis and classification owing to their high spatial complexity and heterogeneity. Although many computer-aided classification methods based on machine learning approaches have been developed over the past decades, most of them are oriented toward pixel-level spectral differentiation, e.g. the Multi-Layer Perceptron (MLP), and are unable to exploit the abundant spatial details within VHR images. This paper introduces a rough set model as a general framework to objectively characterize the uncertainty in CNN classification results, and further partitions them into correct and incorrect regions on the map. The correct classification regions of CNN were trusted and maintained, whereas the misclassified areas were reclassified using a decision tree with both CNN and MLP. The effectiveness of the proposed rough set decision tree based MLP-CNN was tested using an urban area at Bournemouth, United Kingdom. The MLP-CNN, which captures well the complementarity between CNN and MLP through the rough set based decision tree, achieved the best classification performance both visually and numerically. Therefore, this research paves the way to achieve fully automatic and effective VHR image classification.

  13. A symptom based decision tree approach to boiling water reactor emergency operating procedures

    International Nuclear Information System (INIS)

    Knobel, R.C.

    1984-01-01

    This paper describes a Decision Tree approach to development of BWR Emergency Operating Procedures for use by operators during emergencies. This approach utilizes the symptom based Emergency Procedure Guidelines approved for implementation by the USNRC. Included in the paper is a discussion of the relative merits of the event based Emergency Operating Procedures currently in use at US BWR plants. The body of the paper is devoted to a discussion of the Decision Tree Approach to Emergency Operating Procedures soon to be implemented at two United States Boiling Water Reactor plants, why this approach solves many of the problems with procedures identified in the post-accident reviews of Three Mile Island procedures, and why only now is this approach both desirable and feasible. The paper discusses how nuclear plant simulators were involved in the development of the Emergency Operating Procedure decision trees, and in the verification and validation of these procedures. (orig./HP)

  14. Practical secure decision tree learning in a teletreatment application

    NARCIS (Netherlands)

    de Hoogh, Sebastiaan; Schoenmakers, Berry; Chen, Ping; op den Akker, Harm

    In this paper we develop a range of practical cryptographic protocols for secure decision tree learning, a primary problem in privacy preserving data mining. We focus on particular variants of the well-known ID3 algorithm allowing a high level of security and performance at the same time. Our

  15. Practical secure decision tree learning in a teletreatment application

    NARCIS (Netherlands)

    Hoogh, de S.J.A.; Schoenmakers, B.; Chen, Ping; Op den Akker, H.; Christin, N.; Safavi-Naini, R.

    2014-01-01

    In this paper we develop a range of practical cryptographic protocols for secure decision tree learning, a primary problem in privacy preserving data mining. We focus on particular variants of the well-known ID3 algorithm allowing a high level of security and performance at the same time. Our

  16. Analisis Peramalan Penjualan dan Penggunaan Metode Linear Programming dan Decision Tree Guna Mengoptimalkan Keuntungan pada PT Primajaya Pantes Garment

    Directory of Open Access Journals (Sweden)

    Inti Sariani Jianta Djie

    2013-09-01

    Full Text Available Primajaya Pantes Garment is a company operating in the garment sector. However, because the number of orders varies each month, the company has difficulty determining the monthly production quantity that maximizes profit. The purpose of this study is to determine the appropriate forecasting method that can be used as a reference to determine the amount of production in the next period and to find a combination of products that maximizes profit. The study used forecasting methods including the naive method, moving averages, weighted moving averages, exponential smoothing, exponential smoothing with trend, and linear regression. In addition, the study used Linear Programming with the Simplex method to determine the best combination of products for the company, and a decision tree to choose which alternative the company should pursue. The results show that linear regression is the most appropriate method for forecasting demand in the next period. In the Linear Programming model, the constraints were raw materials, labor hours, and limited demand for the product. The decision tree recommends increasing production capacity.
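    As a rough illustration of three of the forecasting methods named in this abstract, the following Python sketch computes a next-period forecast with a moving average, a weighted moving average, and simple exponential smoothing. The demand figures, window, weights and smoothing constant are invented for illustration and are not taken from the study; the function names are hypothetical.

```python
def moving_average(series, window):
    """Forecast the next period as the mean of the last `window` observations."""
    if not 0 < window <= len(series):
        raise ValueError("window must be between 1 and len(series)")
    return sum(series[-window:]) / window


def weighted_moving_average(series, weights):
    """Forecast the next period from the last len(weights) observations.

    `weights` are ordered oldest to newest and need not sum to 1.
    """
    if not 0 < len(weights) <= len(series):
        raise ValueError("need between 1 and len(series) weights")
    recent = series[-len(weights):]
    return sum(w * x for w, x in zip(weights, recent)) / sum(weights)


def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: F[t+1] = alpha * A[t] + (1 - alpha) * F[t].

    The first forecast is initialised to the first observation (an assumption).
    """
    forecast = series[0]
    for actual in series:
        forecast = alpha * actual + (1 - alpha) * forecast
    return forecast


# Hypothetical monthly demand (units), oldest first.
demand = [100, 120, 110, 130]
print(moving_average(demand, window=2))
print(weighted_moving_average(demand, weights=[1, 3]))
print(exponential_smoothing(demand, alpha=0.5))
```

    In practice the candidate methods would be compared on a held-out error measure before one is chosen; the abstract reports that linear regression performed best for this company but does not state which measure was used.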

  17. Role of decision-tree analysis in the performance of a complex feasibility study

    International Nuclear Information System (INIS)

    Dworkin, D.; Sarkar, A.; Motwani, J.

    1991-01-01

    This report presents the results of the Feasibility Study (FS) of a National Priorities List (NPL) site in New Jersey and the decision tree that made this FS possible. The development of the decision tree and the remedial action alternatives that address the hazards at the site are presented. The FS efforts were performed in accordance with U.S. EPA guidance under the authority of the Comprehensive Environmental Response, Compensation, and Liability Act of 1980 (CERCLA) as amended by the Superfund Amendments and Reauthorization Act of 1986 (SARA). A Record of Decision (ROD) is expected in mid-1990. The subject site, the Myers Property site in Franklin Township, New Jersey, has been owned by several individuals and companies since 1811. Uses have included the production of DDT. A Remedial Investigation (RI) was performed by WESTON, the lead technical firm for this feasibility study, which identified soils/sediments, groundwater and buildings on the site as areas of concern. The major chemicals of concern are DDT and its metabolites, polynuclear aromatic compounds, various chlorinated benzenes, dioxin/furan homologues and heavy metals such as arsenic, cadmium, copper, lead, chromium and nickel. While this FS was developed in accordance with current CERCLA FS guidance and procedures, it was expanded to accommodate several outstanding technical and policy issues. Outstanding technical issues focused on uncertainties with respect to hydrogeologic conditions, and policy issues centered upon the development of the site-specific remedial action goals. A decision tree was established to facilitate the development of remedial strategies. The decision tree formed the basis for the FS and allowed remedial alternatives to be identified and evaluated based on key policy decisions. Site media were addressed as contaminated soils/sediments, groundwater and the on-site buildings

  18. Construction and application of hierarchical decision tree for classification of ultrasonographic prostate images

    NARCIS (Netherlands)

    Giesen, R. J.; Huynen, A. L.; Aarnink, R. G.; de la Rosette, J. J.; Debruyne, F. M.; Wijkstra, H.

    1996-01-01

    A non-parametric algorithm is described for the construction of a binary decision tree classifier. This tree is used to correlate textural features, computed from ultrasonographic prostate images, with the histopathology of the imaged tissue. The algorithm consists of two parts; growing and pruning.

  19. Decision Rules, Trees and Tests for Tables with Many-valued Decisions–comparative Study

    KAUST Repository

    Azad, Mohammad; Zielosko, Beata; Moshkov, Mikhail; Chikalov, Igor

    2013-01-01

    In this paper, we present three approaches for construction of decision rules for decision tables with many-valued decisions. We construct decision rules directly for rows of the decision table, based on paths in a decision tree, and based on attributes contained in a test (super-reduct). Experimental results for data sets taken from the UCI Machine Learning Repository contain a comparison of the maximum and the average length of rules for the mentioned approaches.

  20. Decision Rules, Trees and Tests for Tables with Many-valued Decisions–comparative Study

    KAUST Repository

    Azad, Mohammad

    2013-10-04

    In this paper, we present three approaches for construction of decision rules for decision tables with many-valued decisions. We construct decision rules directly for rows of the decision table, based on paths in a decision tree, and based on attributes contained in a test (super-reduct). Experimental results for data sets taken from the UCI Machine Learning Repository contain a comparison of the maximum and the average length of rules for the mentioned approaches.

  1. Data Clustering and Evolving Fuzzy Decision Tree for Data Base Classification Problems

    Science.gov (United States)

    Chang, Pei-Chann; Fan, Chin-Yuan; Wang, Yen-Wen

    Database classification suffers from two well-known difficulties: high dimensionality and non-stationary variations within large historical data. This paper presents a hybrid classification model that integrates a case-based reasoning technique, a Fuzzy Decision Tree (FDT), and Genetic Algorithms (GA) to construct a decision-making system for data classification in various database applications. The model is mainly based on the idea that the historical database can be transformed into a smaller case base together with a group of fuzzy decision rules. As a result, the model can respond more accurately to the data currently being classified, using the inductions from these smaller case-based fuzzy decision trees. Hit rate is applied as a performance measure, and the effectiveness of the proposed model is demonstrated by experimental comparison with other approaches on different database classification applications. The average hit rate of the proposed model is the highest among the approaches compared.

  2. Decision-Tree, Rule-Based, and Random Forest Classification of High-Resolution Multispectral Imagery for Wetland Mapping and Inventory

    Science.gov (United States)

    Efforts are increasingly being made to classify the world’s wetland resources, an important ecosystem and habitat that is diminishing in abundance. There are multiple remote sensing classification methods, including a suite of nonparametric classifiers such as decision-tree...

  3. Intrusion Detection System Based on Decision Tree over Big Data in Fog Environment

    Directory of Open Access Journals (Sweden)

    Kai Peng

    2018-01-01

    Full Text Available Fog computing, as a supplement to cloud computing, can provide low-latency services between mobile users and the cloud. However, fog devices may encounter security challenges because fog nodes are close to the end users and have limited computing ability. Traditional network attacks may destroy the system of fog nodes. An intrusion detection system (IDS) is a proactive security protection technology and can be used in the fog environment. Although IDS in traditional networks has been well investigated, directly using such systems in the fog environment may be inappropriate. Fog nodes produce massive amounts of data at all times, and thus enabling an IDS over big data in the fog environment is of paramount importance. In this study, we propose an IDS based on a decision tree. First, we propose a preprocessing algorithm to digitize the strings in the given dataset and then normalize the whole data, to ensure the quality of the input data and so improve the efficiency of detection. Second, we use the decision tree method for our IDS, and then compare this method with the Naïve Bayesian method as well as the KNN method. Both the 10% dataset and the full dataset are tested. Our proposed method not only completely detects four kinds of attacks but also enables the detection of twenty-two kinds of attacks. The experimental results show that our IDS is effective and precise. Above all, our IDS can be used in a fog computing environment over big data.
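    The preprocessing step described in this abstract (digitize string fields, then normalize) can be sketched in Python as follows. The abstract does not say which encoding or scaling the authors used, so this sketch assumes integer codes assigned in sorted order and min-max scaling to [0, 1]; the function names and sample records are hypothetical.

```python
def digitize_strings(rows):
    """Replace every string value with an integer code, column by column.

    Codes are assigned in sorted order of the distinct strings in a column
    (an assumption; any consistent mapping would do).
    """
    encoded = [list(row) for row in rows]
    for col in range(len(rows[0])):
        distinct = sorted({row[col] for row in rows if isinstance(row[col], str)})
        codes = {value: index for index, value in enumerate(distinct)}
        for row in encoded:
            if isinstance(row[col], str):
                row[col] = codes[row[col]]
    return encoded


def min_max_normalize(rows):
    """Scale each numeric column to [0, 1]; constant columns become 0.0."""
    n_cols = len(rows[0])
    lows = [min(row[c] for row in rows) for c in range(n_cols)]
    highs = [max(row[c] for row in rows) for c in range(n_cols)]
    return [
        [
            (row[c] - lows[c]) / (highs[c] - lows[c]) if highs[c] > lows[c] else 0.0
            for c in range(n_cols)
        ]
        for row in rows
    ]


# Hypothetical connection records: protocol, duration, bytes sent.
records = [["tcp", 0, 200], ["udp", 5, 100], ["icmp", 10, 300]]
features = min_max_normalize(digitize_strings(records))
print(features)
```

    Integer codes impose an arbitrary order on categories. A tree can still split on them, but distance-based methods such as KNN are more sensitive to that choice.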

  4. Decision support for mitigating the risk of tree induced transmission line failure in utility rights-of-way.

    Science.gov (United States)

    Poulos, H M; Camp, A E

    2010-02-01

    Vegetation management is a critical component of rights-of-way (ROW) maintenance for preventing electrical outages and safety hazards resulting from tree contact with conductors during storms. Northeast Utility's (NU) transmission lines are a critical element of the nation's power grid; NU is therefore under scrutiny from federal agencies charged with protecting the electrical transmission infrastructure of the United States. We developed a decision support system to focus right-of-way maintenance and minimize the potential for a tree fall episode that disables transmission capacity across the state of Connecticut. We used field data on tree characteristics to develop a system for identifying hazard trees (HTs) in the field using limited equipment to manage Connecticut power line ROW. Results from this study indicated that the tree height-to-diameter ratio, total tree height, and live crown ratio were the key characteristics that differentiated potential risk trees (danger trees) from trees with a high probability of tree fall (HTs). Products from this research can be transferred to adaptive right-of-way management, and the methods we used have great potential for future application to other regions of the United States and elsewhere where tree failure can disrupt electrical power.

  5. Binary Decision Tree Development for Probabilistic Safety Assessment Applications

    International Nuclear Information System (INIS)

    Simic, Z.; Banov, R.; Mikulicic, V.

    2008-01-01

    The aim of this article is to describe the state of development of a relatively new approach in probabilistic safety analysis (PSA). The approach applies the binary decision diagram (BDD) representation of logical functions to the quantitative and qualitative analysis of complex systems that are represented by fault trees and event trees in the PSA used for nuclear power plant risk determination. Even though the BDD approach offers a full solution, compared with the partial one from the conventional quantification approach, there are still problems to be solved before the new approach can be fully implemented. The major problem with full application of BDD is the difficulty of obtaining any solution for PSA models of a certain complexity. This paper compares the two approaches to PSA quantification. The major focus of the paper is a description of an in-house developed BDD application that implements original algorithms. The number of nodes required to represent the BDD is extremely sensitive to the chosen order of variables (i.e., basic events in PSA). The problem of finding an optimal order of variables for the BDD falls into the NP-complete complexity class. This paper presents an original approach to the problem of finding the initial order of variables used for BDD construction by various dynamic reordering schemes. The main advantage of this approach over known methods of finding the initial order is better results with respect to the working memory and time needed to finish the BDD construction. The developed method is compared against results from well-known methods such as depth-first and breadth-first search procedures. The described method may be applied to finding an initial order for fault trees/event trees created from basic events by means of logical operations (e.g. negation, and, or, exclusive or). With some test models a significant reduction of used memory has been achieved, sometimes
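    The sensitivity of BDD size to variable order mentioned in this abstract can be seen on a small textbook example. The Python sketch below is not the authors' algorithm: it builds a reduced ordered BDD by brute-force Shannon expansion (feasible only for a handful of variables) and counts the non-terminal nodes under two orders. The example function and all names are assumptions chosen for illustration.

```python
def robdd_size(func, order):
    """Count non-terminal nodes of the reduced ordered BDD of `func`.

    `func` takes a dict mapping variable name to bool; `order` lists the
    variables from the root down. Exponential in len(order).
    """
    unique = {}  # (variable, low child id, high child id) -> node id

    def build(level, assignment):
        if level == len(order):
            return 1 if func(assignment) else 0  # terminal node ids 0 and 1
        var = order[level]
        low = build(level + 1, {**assignment, var: False})
        high = build(level + 1, {**assignment, var: True})
        if low == high:
            return low  # both branches agree, so the test on `var` is redundant
        key = (var, low, high)
        if key not in unique:
            unique[key] = len(unique) + 2  # share identical subgraphs
        return unique[key]

    build(0, {})
    return len(unique)


def top_event(v):
    """Hypothetical fault-tree top event: OR of three two-input AND gates."""
    return (v["a1"] and v["b1"]) or (v["a2"] and v["b2"]) or (v["a3"] and v["b3"])


paired = ["a1", "b1", "a2", "b2", "a3", "b3"]       # gate inputs kept adjacent
interleaved = ["a1", "a2", "a3", "b1", "b2", "b3"]  # gate inputs separated
print(robdd_size(top_event, paired), robdd_size(top_event, interleaved))
```

    For this function the gap between the two orders widens as more AND gates are added, which is why a good initial order matters for memory and run time.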

  6. Imitation learning of car driving skills with decision trees and random forests

    Directory of Open Access Journals (Sweden)

    Cichosz Paweł

    2014-09-01

    Full Text Available Machine learning is an appealing and useful approach to creating vehicle control algorithms, both for simulated and real vehicles. One common learning scenario that is often possible to apply is learning by imitation, in which the behavior of an exemplary driver provides training instances for a supervised learning algorithm. This article follows this approach in the domain of simulated car racing, using the TORCS simulator. In contrast to most prior work on imitation learning, a symbolic decision tree knowledge representation is adopted, which combines potentially high accuracy with human readability, an advantage that can be important in many applications. Decision trees are demonstrated to be capable of representing high quality control models, reaching the performance level of sophisticated pre-designed algorithms. This is achieved by enhancing the basic imitation learning scenario to include active retraining, automatically triggered on control failures. It is also demonstrated how better stability and generalization can be achieved by sacrificing human readability and using decision tree model ensembles. The methodology for learning control models contributed by this article can hopefully be applied to solve real-world control tasks, as well as to develop video game bots.

  7. Identification of radon anomalies in soil gas using decision trees and neural networks

    International Nuclear Information System (INIS)

    Zmazek, B.; Dzeroski, S.; Torkar, D.; Vaupotic, J.; Kobal, I.

    2010-01-01

    The time series of radon (222Rn) concentration in soil gas at a fault, together with the environmental parameters, have been analysed applying two machine learning techniques: (I) decision trees and (II) neural networks, with the aim of identifying radon anomalies caused by seismic events and not simply ascribed to the effect of the environmental parameters. By applying neural networks, 10 radon anomalies were observed for 12 earthquakes, while with decision trees, the anomaly was found for every earthquake, but, undesirably, some anomalies appeared also during periods without earthquakes. (authors)

  8. Classification of Different Degrees of Disability Following Intracerebral Hemorrhage: A Decision Tree Analysis from VISTA-ICH Collaboration.

    Science.gov (United States)

    Phan, Thanh G; Chen, Jian; Beare, Richard; Ma, Henry; Clissold, Benjamin; Van Ly, John; Srikanth, Velandai

    2017-01-01

    Prognostication following intracerebral hemorrhage (ICH) has focused on poor outcome at the expense of lumping together mild and moderate disability. We aimed to develop a novel approach at classifying a range of disability following ICH. The Virtual International Stroke Trial Archive collaboration database was searched for patients with ICH and known volume of ICH on baseline CT scans. Disability was partitioned into mild [modified Rankin Scale (mRS) at 90 days of 0-2], moderate (mRS = 3-4), and severe disabilities (mRS = 5-6). We used binary and trichotomy decision tree methodology. The data were randomly divided into training (2/3 of data) and validation (1/3 data) datasets. The area under the receiver operating characteristic curve (AUC) was used to calculate the accuracy of the decision tree model. We identified 957 patients, age 65.9 ± 12.3 years, 63.7% males, and ICH volume 22.6 ± 22.1 ml. The binary tree showed that lower ICH volume (27.9 ml), older age (>69.5 years), and low Glasgow Coma Scale (tree showed that ICH volume, age, and serum glucose can separate mild, moderate, and severe disability groups with AUC 0.79 (95% CI 0.71-0.87). Both the binary and trichotomy methods provide equivalent discrimination of disability outcome after ICH. The trichotomy method can classify three categories at once, whereas this action was not possible with the binary method. The trichotomy method may be of use to clinicians and trialists for classifying a range of disability in ICH.
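    The evaluation protocol in this abstract (a random 2/3 training and 1/3 validation split, with AUC as the accuracy measure) can be sketched in Python as below. This is a generic illustration, not the authors' code: the AUC is computed for a binary outcome as the fraction of positive/negative pairs ranked correctly, with ties counted as half, and the seed, labels and scores are invented.

```python
import random


def split_train_validation(items, train_fraction=2 / 3, seed=0):
    """Shuffle a copy of `items` and split it into training and validation parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def auc(labels, scores):
    """Area under the ROC curve for binary labels (1 = positive, 0 = negative).

    Equals the probability that a random positive case scores higher than a
    random negative case; tied scores count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical patient ids and model outputs.
train_ids, validation_ids = split_train_validation(range(9))
print(len(train_ids), len(validation_ids))
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

    The three-class (trichotomy) case reported in the abstract needs a multi-class extension of AUC, which this binary sketch does not cover.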

  9. Tools of the Future: How Decision Tree Analysis Will Impact Mission Planning

    Science.gov (United States)

    Otterstatter, Matthew R.

    2005-01-01

    The universe is infinitely complex; however, the human mind has a finite capacity. The multitude of possible variables, metrics, and procedures in mission planning are far too many to address exhaustively. This is unfortunate because, in general, considering more possibilities leads to more accurate and more powerful results. To compensate, we can get more insightful results by employing our greatest tool, the computer. The power of the computer will be utilized through a technology that considers every possibility, decision tree analysis. Although decision trees have been used in many other fields, this is innovative for space mission planning. Because this is a new strategy, no existing software is able to completely accommodate all of the requirements. This was determined through extensive research and testing of current technologies. It was necessary to create original software, for which a short-term model was finished this summer. The model was built into Microsoft Excel to take advantage of the familiar graphical interface for user input, computation, and viewing output. Macros were written to automate the process of tree construction, optimization, and presentation. The results are useful and promising. If this tool is successfully implemented in mission planning, our reliance on old-fashioned heuristics, an error-prone shortcut for handling complexity, will be reduced. The computer algorithms involved in decision trees will revolutionize mission planning. The planning will be faster and smarter, leading to optimized missions with the potential for more valuable data.

  10. MODIS Snow Cover Mapping Decision Tree Technique: Snow and Cloud Discrimination

    Science.gov (United States)

    Riggs, George A.; Hall, Dorothy K.

    2010-01-01

    Accurate mapping of snow cover continues to challenge cryospheric scientists and modelers. The Moderate-Resolution Imaging Spectroradiometer (MODIS) snow data products have been used since 2000 by many investigators to map and monitor snow cover extent for various applications. Users have reported on the utility of the products and also on problems encountered. Three problems or hindrances in the use of the MODIS snow data products that have been reported in the literature are: cloud obscuration, snow/cloud confusion, and snow omission errors in thin or sparse snow cover conditions. Implementation of the MODIS snow algorithm in a decision tree technique using surface reflectance input to mitigate those problems is being investigated. The objective of this work is to use a decision tree structure for the snow algorithm. This should alleviate snow/cloud confusion and omission errors and provide a snow map with classes that convey information on how snow was detected, e.g. snow under clear sky or snow under cloud, to give users flexibility in interpreting and deriving a snow map. Results of a snow cover decision tree algorithm are compared to the standard MODIS snow map and found to exhibit improved ability to alleviate snow/cloud confusion in some situations, allowing up to about 5% increase in mapped snow cover extent, thus accuracy, in some scenes.

  11. Success tree method of resources evaluation

    International Nuclear Information System (INIS)

    Chen Qinglan; Sun Wenpeng

    1994-01-01

    By applying reliability theory from systems engineering, the success tree method is used to translate experts' understanding of metallogenetic regularities into the form of a success tree. The aim of resources evaluation is achieved by calculating the metallogenetic probability or favorability of the top event of the success tree. This article introduces in detail the origin and principle of the success tree method and three kinds of calculation methods, and explains concretely how to establish the success tree of comprehensive uranium metallogenesis as well as the procedure by which the resources evaluation is performed. Because this method places no restrictions on the number of known deposits or the calculated area, it is applicable to resources evaluation for different mineral species, types and scales, and has good prospects for development.
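    One common way to calculate the probability of a tree's top event is to combine basic-event probabilities through AND and OR gates under an independence assumption. The abstract does not say whether this is one of its three calculation methods, so the Python sketch below is only an illustration of the general idea; the tree structure, the probabilities and the function names are hypothetical.

```python
from math import prod


def and_gate(*probabilities):
    """All inputs must succeed: product of independent probabilities."""
    return prod(probabilities)


def or_gate(*probabilities):
    """At least one input succeeds: 1 minus the product of failure probabilities."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Hypothetical success tree: the top event occurs if both favourable
# conditions A and B hold, or if an independent indicator C is present.
p_a, p_b, p_c = 0.5, 0.4, 0.1
p_top = or_gate(and_gate(p_a, p_b), p_c)
print(round(p_top, 4))
```

    If basic events repeat in several branches, the independence assumption fails and an exact method (for example, one based on minimal path sets) is needed instead.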

  12. Office of Legacy Management Decision Tree for Solar Photovoltaic Projects - 13317

    Energy Technology Data Exchange (ETDEWEB)

    Elmer, John; Butherus, Michael [S.M. Stoller Corporation (United States); Barr, Deborah L. [U.S. Department of Energy Office of Legacy Management (United States)

    2013-07-01

    To support consideration of renewable energy power development as a land reuse option, the DOE Office of Legacy Management (LM) and the National Renewable Energy Laboratory (NREL) established a partnership to conduct an assessment of wind and solar renewable energy resources on LM lands. From a solar capacity perspective, the larger sites in the western United States present opportunities for constructing solar photovoltaic (PV) projects. A detailed analysis and preliminary plan was developed for three large sites in New Mexico, assessing the costs, the conceptual layout of a PV system, and the electric utility interconnection process. As a result of the study, a 1,214-hectare (3,000-acre) site near Grants, New Mexico, was chosen for further study. The state incentives, utility connection process, and transmission line capacity were key factors in assessing the feasibility of the project. LM's Durango, Colorado, Disposal Site was also chosen for consideration because the uranium mill tailings disposal cell is on a hillside facing south, transmission lines cross the property, and the community was very supportive of the project. LM worked with the regulators to demonstrate that the disposal cell's long-term performance would not be impacted by the installation of a PV solar system. A number of LM-unique issues were resolved in making the site available for a private party to lease a portion of the site for a solar PV project. A lease was awarded in September 2012. Using a solar decision tree that was developed and launched by the EPA and NREL, LM has modified and expanded the decision tree structure to address the unique aspects and challenges faced by LM on its multiple sites. The LM solar decision tree covers factors such as land ownership, usable acreage, financial viability of the project, stakeholder involvement, and transmission line capacity. As additional sites are transferred to LM in the future, the decision tree will assist in determining

  13. Office of Legacy Management Decision Tree for Solar Photovoltaic Projects - 13317

    International Nuclear Information System (INIS)

    Elmer, John; Butherus, Michael; Barr, Deborah L.

    2013-01-01

    To support consideration of renewable energy power development as a land reuse option, the DOE Office of Legacy Management (LM) and the National Renewable Energy Laboratory (NREL) established a partnership to conduct an assessment of wind and solar renewable energy resources on LM lands. From a solar capacity perspective, the larger sites in the western United States present opportunities for constructing solar photovoltaic (PV) projects. A detailed analysis and preliminary plan was developed for three large sites in New Mexico, assessing the costs, the conceptual layout of a PV system, and the electric utility interconnection process. As a result of the study, a 1,214-hectare (3,000-acre) site near Grants, New Mexico, was chosen for further study. The state incentives, utility connection process, and transmission line capacity were key factors in assessing the feasibility of the project. LM's Durango, Colorado, Disposal Site was also chosen for consideration because the uranium mill tailings disposal cell is on a hillside facing south, transmission lines cross the property, and the community was very supportive of the project. LM worked with the regulators to demonstrate that the disposal cell's long-term performance would not be impacted by the installation of a PV solar system. A number of LM-unique issues were resolved in making the site available for a private party to lease a portion of the site for a solar PV project. A lease was awarded in September 2012. Using a solar decision tree that was developed and launched by the EPA and NREL, LM has modified and expanded the decision tree structure to address the unique aspects and challenges faced by LM on its multiple sites. The LM solar decision tree covers factors such as land ownership, usable acreage, financial viability of the project, stakeholder involvement, and transmission line capacity. As additional sites are transferred to LM in the future, the decision tree will assist in determining whether a solar

  14. Discovering Decision Knowledge from Web Log Portfolio for Managing Classroom Processes by Applying Decision Tree and Data Cube Technology.

    Science.gov (United States)

    Chen, Gwo-Dong; Liu, Chen-Chung; Ou, Kuo-Liang; Liu, Baw-Jhiune

    2000-01-01

    Discusses the use of Web logs to record student behavior that can assist teachers in assessing performance and making curriculum decisions for distance learning students who are using Web-based learning systems. Adopts decision tree and data cube information processing methodologies for developing more effective pedagogical strategies. (LRW)

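  15. Sistem Pakar Untuk Diagnosa Penyakit Kehamilan Menggunakan Metode Dempster-Shafer Dan Decision Tree [Expert System for Diagnosing Diseases of Pregnancy Using the Dempster-Shafer Method and Decision Tree]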

    Directory of Open Access Journals (Sweden)

    Joko Popo Minardi

    2016-01-01

    Dempster-Shafer theory is a mathematical theory of evidence based on belief functions and plausible reasoning, which is used to combine separate pieces of information. It is an alternative to traditional probability theory for the mathematical representation of uncertainty. In the diagnosis of diseases of pregnancy, the information obtained from the patient is sometimes incomplete. With the Dempster-Shafer method and expert system rules, incomplete sets of symptoms can still be combined to reach an appropriate diagnosis, while the decision tree is used as a decision support tool for tracking disease symptoms. This research aims to develop an expert system that can diagnose diseases of pregnancy using the Dempster-Shafer method and produce a belief value for each diagnosis. In diagnostic testing, the Dempster-Shafer method and expert system achieved an accuracy of 76%. Keywords: Expert system; Diseases of pregnancy; Dempster-Shafer

  16. Comparing wavefront-optimized, wavefront-guided and topography-guided laser vision correction: clinical outcomes using an objective decision tree.

    Science.gov (United States)

    Stonecipher, Karl; Parrish, Joseph; Stonecipher, Megan

    2018-05-18

    This review is intended to update and educate the reader on the currently available options for laser vision correction, more specifically, laser-assisted in-situ keratomileusis (LASIK). In addition, some related clinical outcomes data from over 1000 cases performed over a 1-year period are presented to highlight some differences between the various treatment profiles currently available, including the rapidity of visual recovery. The cases in question were performed on the basis of a decision tree to segregate patients on the basis of anatomical, topographic and aberrometry findings; the decision tree was formulated based on the data available in some of the reviewed articles. Numerous recent studies reported in the literature provide data related to the risks and benefits of LASIK; alternatives to a laser refractive procedure are also discussed. The results from these studies have been used to prepare a decision tree to assist the surgeon in choosing the best option for the patient based on the data from several standard preoperative diagnostic tests. The data presented here should aid surgeons in understanding the effects of currently available LASIK treatment profiles. Surgeons should also be able to appreciate how the findings were used to create a decision tree to help choose the most appropriate treatment profile for patients. Finally, the retrospective evaluation of clinical outcomes based on the decision tree should provide surgeons with a realistic expectation for their own outcomes should they adopt such a decision tree in their own practice.

  17. Constructing an optimal decision tree for FAST corner point detection

    KAUST Repository

    Alkhalid, Abdulaziz; Chikalov, Igor; Moshkov, Mikhail

    2011-01-01

    In this paper, we consider a problem that originated in computer vision: determining an optimal testing strategy for the corner point detection problem that is part of the FAST algorithm [11,12]. The problem can be formulated as building a decision tree with the minimum average depth for a decision table with all discrete attributes. We experimentally compare the performance of an exact algorithm based on dynamic programming and several greedy algorithms that differ in the attribute selection criterion. © 2011 Springer-Verlag.
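
    The contrast between an exact dynamic-programming search and a greedy heuristic can be sketched on a toy decision table. This is an illustration only, not the authors' algorithm: the table `ROWS`, the function names, and the greedy criterion (choose the attribute whose largest child subtable is smallest) are all assumptions made for the example.

```python
from functools import lru_cache

# Hypothetical toy decision table: each row is (attribute values, decision).
ROWS = (
    ((0, 0), "a"),
    ((1, 0), "a"),
    ((0, 1), "b"),
    ((1, 1), "c"),
)
N_ATTRS = 2


def split(rows, attr):
    """Partition a set of row indices by the value of one attribute."""
    parts = {}
    for i in sorted(rows):
        parts.setdefault(ROWS[i][0][attr], []).append(i)
    return [frozenset(p) for p in parts.values()]


def is_leaf(rows):
    """A subtable needs no further tests when all its decisions agree."""
    return len({ROWS[i][1] for i in rows}) <= 1


@lru_cache(maxsize=None)
def min_total_path(rows):
    """Exact minimum total path length, memoized over subtables."""
    if is_leaf(rows):
        return 0
    best = None
    for attr in range(N_ATTRS):
        parts = split(rows, attr)
        if len(parts) < 2:  # attribute is constant on this subtable
            continue
        # Every row in this subtable passes one more test at this node.
        cost = len(rows) + sum(min_total_path(p) for p in parts)
        best = cost if best is None else min(best, cost)
    if best is None:
        raise ValueError("rows share all attribute values but differ in decision")
    return best


def greedy_total_path(rows):
    """Greedy heuristic: pick the attribute whose largest child is smallest."""
    if is_leaf(rows):
        return 0
    candidates = []
    for attr in range(N_ATTRS):
        parts = split(rows, attr)
        if len(parts) >= 2:
            candidates.append((max(len(p) for p in parts), attr, parts))
    if not candidates:
        raise ValueError("rows share all attribute values but differ in decision")
    _, _, parts = min(candidates, key=lambda c: (c[0], c[1]))
    return len(rows) + sum(greedy_total_path(p) for p in parts)


ALL_ROWS = frozenset(range(len(ROWS)))
exact_avg = min_total_path(ALL_ROWS) / len(ROWS)
greedy_avg = greedy_total_path(ALL_ROWS) / len(ROWS)
```

    On this table the exact search tests the second attribute first and reaches an average depth of 1.5, while the greedy rule ties and falls back to the first attribute, giving 2.0. Average depth here weights every row equally, which is also an assumption.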

  18. Decision Tree Algorithm-Generated Single-Nucleotide Polymorphism Barcodes of rbcL Genes for 38 Brassicaceae Species Tagging.

    Science.gov (United States)

    Yang, Cheng-Hong; Wu, Kuo-Chuan; Chuang, Li-Yeh; Chang, Hsueh-Wei

    2018-01-01

    DNA barcode sequences are accumulating in large data sets. A barcode is generally a sequence longer than 1000 base pairs and generates a computational burden. Although the DNA barcode was originally envisioned as a straightforward species tag, the use of barcode sequences for identification is rarely emphasized currently. Single-nucleotide polymorphism (SNP) association studies suggest that SNPs may be the ideal target of feature selection to discriminate between different species. We hypothesize that SNP-based barcodes may be more effective than the full length of DNA barcode sequences for species discrimination. To address this issue, we tested a ribulose diphosphate carboxylase (rbcL) SNP barcoding (RSB) strategy using a decision tree algorithm. After alignment and trimming, 31 SNPs were discovered in the rbcL sequences from 38 Brassicaceae plant species. In the decision tree construction, these SNPs were computed to set up the decision rule to assign the sequences into 2 groups level by level. After algorithm processing, 37 nodes and 31 loci were required for discriminating 38 species. Finally, the sequence tags consisting of 31 rbcL SNP barcodes were identified for discriminating 38 Brassicaceae species based on the decision tree-selected SNP pattern using the RSB method. Taken together, this study provides the rationale that the SNP aspect of the DNA barcode for the rbcL gene is a useful and effective sequence for tagging 38 Brassicaceae species.
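
    The level-by-level assignment into 2 groups can be sketched as a recursive binary split on SNP loci. The species names, genotypes, and the rule for choosing a locus (first locus that still varies) are hypothetical and are not taken from the study; the sketch only shows why separating n species with two-way splits takes n - 1 decision nodes, which matches the reported 37 nodes for 38 species if "nodes" there means internal nodes.

```python
# Hypothetical SNP genotypes: one character per SNP locus.
SPECIES = {
    "sp_A": "ACG",
    "sp_B": "ATG",
    "sp_C": "GCG",
    "sp_D": "GCT",
}


def build(names, n_loci):
    """Return (tree, number of decision nodes) separating the given species.

    Each decision node tests one locus against one allele and sends the
    species into exactly two groups.
    """
    if len(names) == 1:
        return names[0], 0
    for locus in range(n_loci):
        alleles = sorted({SPECIES[n][locus] for n in names})
        if len(alleles) > 1:
            first = alleles[0]
            left = [n for n in names if SPECIES[n][locus] == first]
            right = [n for n in names if SPECIES[n][locus] != first]
            left_tree, left_nodes = build(left, n_loci)
            right_tree, right_nodes = build(right, n_loci)
            tree = (locus, first, left_tree, right_tree)
            return tree, 1 + left_nodes + right_nodes
    raise ValueError("species cannot be distinguished by the given SNPs")


tree, n_nodes = build(sorted(SPECIES), n_loci=3)
```

    Here four species are separated with three decision nodes. A real barcode selection would also need a criterion for which locus to test first, which this sketch does not attempt.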

  19. Decision making under uncertainty: An investigation into the application of formal decision-making methods to safety issue decisions

    International Nuclear Information System (INIS)

    Bohn, M.P.

    1992-12-01

    As part of the NRC-sponsored program to study the implications of Generic Issue 57, "Effects of Fire Protection System Actuation on Safety-Related Equipment," a subtask was performed to evaluate the applicability of formal decision analysis methods to generic issues cost/benefit-type decisions and to apply these methods to the GI-57 results. In this report, the numerical results obtained from the analysis of three plants (two PWRs and one BWR) as developed in the technical resolution program for GI-57 were studied. For each plant, these results included a calculation of the person-REM averted due to various accident scenarios and various proposed modifications to mitigate the accident scenarios identified. These results were recomputed to break out the benefit in terms of contributions due to random event scenarios, fire event scenarios, and seismic event scenarios. Furthermore, the benefits associated with risk (in terms of person-REM) averted from earthquakes at three different seismic ground motion levels were separately considered. Formal decision methodologies involving decision trees, value functions, and utility functions were then applied to these data. It is shown that the formal decision methodology can be applied at several different levels. Examples are given in which the decision between several retrofits is changed from that resulting from a simple cost/benefit-ratio criterion by virtue of the decision-maker's expressed (and assumed) preferences.

  20. Relationships between depth and number of misclassifications for decision trees

    KAUST Repository

    Chikalov, Igor; Hussain, Shahid; Moshkov, Mikhail

    2011-01-01

    This paper describes a new tool for the study of relationships between depth and number of misclassifications for decision trees. In addition to the algorithm, the paper also presents the results of experiments with three datasets from the UCI Machine Learning Repository [3]. © 2011 Springer-Verlag.

  1. Prognostic Factors and Decision Tree for Long-term Survival in Metastatic Uveal Melanoma.

    Science.gov (United States)

    Lorenzo, Daniel; Ochoa, María; Piulats, Josep Maria; Gutiérrez, Cristina; Arias, Luis; Català, Jaum; Grau, María; Peñafiel, Judith; Cobos, Estefanía; Garcia-Bru, Pere; Rubio, Marcos Javier; Padrón-Pérez, Noel; Dias, Bruno; Pera, Joan; Caminal, Josep Maria

    2017-12-04

    The purpose of this study was to demonstrate the existence of a bimodal survival pattern in metastatic uveal melanoma. Secondary aims were to identify the characteristics and prognostic factors associated with long-term survival and to develop a clinical decision tree. The medical records of 99 metastatic uveal melanoma patients were retrospectively reviewed. Patients were classified as either short (≤ 12 months) or long-term survivors (> 12 months) based on a graphical interpretation of the survival curve after diagnosis of the first metastatic lesion. Ophthalmic and oncological characteristics were assessed in both groups. Of the 99 patients, 62 (62.6%) were classified as short-term survivors, and 37 (37.4%) as long-term survivors. The multivariate analysis identified the following predictors of long-term survival: age ≤ 65 years (p=0.012) and unaltered serum lactate dehydrogenase levels (p=0.018); additionally, the size (smaller vs. larger) of the largest liver metastasis showed a trend towards significance (p=0.063). Based on the variables significantly associated with long-term survival, we developed a decision tree to facilitate clinical decision-making. The findings of this study demonstrate the existence of a bimodal survival pattern in patients with metastatic uveal melanoma. The presence of certain clinical characteristics at diagnosis of distant disease is associated with long-term survival. A decision tree was developed to facilitate clinical decision-making and to counsel patients about the expected course of disease.

  2. Chi-squared Automatic Interaction Detection Decision Tree Analysis of Risk Factors for Infant Anemia in Beijing, China.

    Science.gov (United States)

    Ye, Fang; Chen, Zhi-Hua; Chen, Jie; Liu, Fang; Zhang, Yong; Fan, Qin-Ying; Wang, Lin

    2016-05-20

    In the past decades, studies on infant anemia have mainly focused on rural areas of China. With the increasing heterogeneity of the population in recent years, available information on infant anemia is inconclusive in large cities of China, especially with regard to comparisons between native residents and the floating population. This population-based cross-sectional study was implemented to determine the anemic status of infants as well as the risk factors in a representative downtown area of Beijing. As useful methods to build a predictive model, Chi-squared automatic interaction detection (CHAID) decision tree analysis and logistic regression analysis were introduced to explore risk factors of infant anemia. A total of 1091 infants aged 6-12 months together with their parents/caregivers living at Heping Avenue Subdistrict of Beijing were surveyed from January 1, 2013 to December 31, 2014. The prevalence of anemia was 12.60%, with a range of 3.47%-40.00% in different subgroup characteristics. The CHAID decision tree model demonstrated multilevel interaction among risk factors through stepwise pathways to detect anemia. Besides the three predictors identified by the logistic regression model (maternal anemia during pregnancy, exclusive breastfeeding in the first 6 months, and floating population), CHAID decision tree analysis also identified a fourth risk factor, the maternal educational level, with higher overall classification accuracy and a larger area under the receiver operating characteristic curve. The infant anemic status in a metropolis is complex and should be carefully considered by basic health care practitioners. CHAID decision tree analysis demonstrated better performance in hierarchical analysis of a population with great heterogeneity. Risk factors identified by this study might be meaningful in the early detection and prompt treatment of infant anemia in large cities.
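
    The core of a CHAID-style split choice can be sketched as picking the predictor whose cross-tabulation with the outcome gives the largest Pearson chi-squared statistic. This is a simplified sketch: full CHAID also merges predictor categories and compares adjusted p-values, which is omitted here. The counts below are invented for illustration and are not data from the study.

```python
def chi_squared(table):
    """Pearson chi-squared statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts: rows = predictor level (yes, no),
# columns = (anemic, not anemic).
candidates = {
    "maternal_anemia": [[30, 70], [20, 180]],
    "floating_population": [[20, 100], [30, 150]],
}

scores = {name: chi_squared(table) for name, table in candidates.items()}
best_split = max(scores, key=scores.get)
```

    Comparing raw statistics is only meaningful here because both tables are 2x2 and so have the same degrees of freedom; with predictors of different sizes, p-values would have to be compared instead.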

  3. Extensions and applications of ensemble-of-trees methods in machine learning

    Science.gov (United States)

    Bleich, Justin

    Ensemble-of-trees algorithms have come to the forefront of machine learning due to their ability to generate high forecasting accuracy for a wide array of regression and classification problems. Classic ensemble methodologies such as random forests (RF) and stochastic gradient boosting (SGB) rely on algorithmic procedures to generate fits to data. In contrast, more recent ensemble techniques such as Bayesian Additive Regression Trees (BART) and Dynamic Trees (DT) focus on an underlying Bayesian probability model to generate the fits. These new probability model-based approaches show much promise versus their algorithmic counterparts, but also offer substantial room for improvement. The first part of this thesis focuses on methodological advances for ensemble-of-trees techniques with an emphasis on the more recent Bayesian approaches. In particular, we focus on extensions of BART in four distinct ways. First, we develop a more robust implementation of BART for both research and application. We then develop a principled approach to variable selection for BART as well as the ability to naturally incorporate prior information on important covariates into the algorithm. Next, we propose a method for handling missing data that relies on the recursive structure of decision trees and does not require imputation. Last, we relax the assumption of homoskedasticity in the BART model to allow for parametric modeling of heteroskedasticity. The second part of this thesis returns to the classic algorithmic approaches in the context of classification problems with asymmetric costs of forecasting errors. First we consider the performance of RF and SGB more broadly and demonstrate their superiority to logistic regression for applications in criminology with asymmetric costs. Next, we use RF to forecast unplanned hospital readmissions upon patient discharge with asymmetric costs taken into account. Finally, we explore the construction of stable decision trees for forecasts of

  4. SITUATIONAL CONTROL OF HOT BLAST STOVES GROUP BASED ON DECISION TREE

    Directory of Open Access Journals (Sweden)

    E. I. Kobysh

    2016-09-01

    This paper presents a control system for a group of hot blast stoves. It operates on the basis of the packing heating control subsystem and the mode-duration forecasting subsystem of the hot blast stove APCS for iron smelting in a blast furnace. Using multi-criteria optimization methods, the behavior of the control system is adjusted to take into account the current production situation arising during heating of the packing of each hot blast stove in the group. A situation recognition algorithm and a procedure for choosing control scenarios based on a decision tree were developed.

  5. Decision tree based knowledge acquisition and failure diagnosis using a PWR loop vibration model

    International Nuclear Information System (INIS)

    Bauernfeind, V.; Ding, Y.

    1993-01-01

    An analytical vibration model of the primary system of a 1300 MW PWR was used for simulating mechanical faults. Deviations in the calculated power density spectra and coherence functions are determined and classified. The decision tree technique is then used for a personal computer supported knowledge presentation and for optimizing the logical relationships between the simulated faults and the observed symptoms. The optimized decision tree forms the knowledge base and can be used to diagnose known cases as well as to include new data into the knowledge base if new faults occur. (author)

  6. Interpretable decision-tree induction in a big data parallel framework

    Directory of Open Access Journals (Sweden)

    Weinberg Abraham Itzhak

    2017-12-01

    When running data-mining algorithms on big data platforms, a parallel, distributed framework, such as MAPREDUCE, may be used. However, in a parallel framework, each individual model fits the data allocated to its own computing node without necessarily fitting the entire dataset. In order to induce a single consistent model, ensemble algorithms, such as majority voting, aggregate the local models rather than analyzing the entire dataset directly. Our goal is to develop an efficient algorithm for choosing one representative model from multiple, locally induced decision-tree models. The proposed SySM (syntactic similarity method) algorithm computes the similarity between the models produced by parallel nodes and chooses the model which is most similar to the others as the best representative of the entire dataset. In 18.75% of 48 experiments on four big datasets, SySM accuracy is significantly higher than that of the ensemble; in about 43.75% of the experiments, SySM accuracy is significantly lower; in one case, the results are identical; and in the remaining 35.41% of cases the difference is not statistically significant. Compared with ensemble methods, the representative tree models selected by the proposed methodology are more compact and interpretable, their induction consumes less memory, and, as confirmed by the empirical results, they allow faster classification of new records.
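
    The idea of choosing the model most similar to the others can be sketched as a medoid selection. The abstract does not define the syntactic similarity measure, so this sketch substitutes an assumed stand-in: each tree is reduced to the set of tests it contains, and similarity is the Jaccard index of those sets. The example models and test strings are hypothetical.

```python
def jaccard(a, b):
    """Jaccard index of two sets of tests (assumed stand-in similarity)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def pick_representative(models):
    """Return the index of the model with the highest mean similarity to the rest."""
    if len(models) < 2:
        raise ValueError("need at least two local models to compare")
    best_index, best_score = None, -1.0
    for i, model in enumerate(models):
        sims = [jaccard(model, other) for j, other in enumerate(models) if j != i]
        score = sum(sims) / len(sims)
        if score > best_score:
            best_index, best_score = i, score
    return best_index


# Hypothetical local models, one per computing node, as sets of node tests.
local_models = [
    frozenset({"age<=30", "income<=50k", "owns_home"}),
    frozenset({"age<=30", "income<=50k"}),
    frozenset({"age<=30", "clicks<=3"}),
]
representative = pick_representative(local_models)
```

    Unlike majority voting, this returns a single tree, so only one model has to be stored and evaluated when classifying new records.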

  7. Proactive data mining with decision trees

    CERN Document Server

    Dahan, Haim; Rokach, Lior; Maimon, Oded

    2014-01-01

    This book explores a proactive and domain-driven method to classification tasks. This novel proactive approach to data mining not only induces a model for predicting or explaining a phenomenon, but also utilizes specific problem/domain knowledge to suggest specific actions to achieve optimal changes in the value of the target attribute. In particular, the authors suggest a specific implementation of the domain-driven proactive approach for classification trees. The book centers on the core idea of moving observations from one branch of the tree to another. It introduces a novel splitting crite

  8. An application of the value tree analysis methodology within the integrated risk informed decision making for the nuclear facilities

    International Nuclear Information System (INIS)

    Borysiewicz, Mieczysław; Kowal, Karol; Potempski, Sławomir

    2015-01-01

    A new framework of integrated risk informed decision making (IRIDM) has been recently developed in order to improve the risk management of nuclear facilities. IRIDM is a process in which qualitatively different inputs, corresponding to different types of risk, are jointly taken into account. However, the relative importance of the IRIDM inputs and their influence on the decision to be made are difficult to determine quantitatively. This situation can be improved by applying Value Tree Analysis (VTA) methods. The aim of this article is to present the VTA methodology in the context of its potential usage in decision making on nuclear facilities. The benefits of the VTA application within the IRIDM process were identified while making the decision on fuel conversion of the research reactor MARIA. - Highlights: • New approach to risk informed decision making on nuclear facilities was postulated. • Value tree diagram was developed for decision processes on nuclear installations. • An experiment was performed to compare the new approach with the standard one. • Benefits of the new approach were reached in fuel conversion of a research reactor. • The new approach makes the decision making process more transparent and auditable.

  9. ArborZ: PHOTOMETRIC REDSHIFTS USING BOOSTED DECISION TREES

    International Nuclear Information System (INIS)

    Gerdes, David W.; Sypniewski, Adam J.; McKay, Timothy A.; Hao, Jiangang; Weis, Matthew R.; Wechsler, Risa H.; Busha, Michael T.

    2010-01-01

    Precision photometric redshifts will be essential for extracting cosmological parameters from the next generation of wide-area imaging surveys. In this paper, we introduce a photometric redshift algorithm, ArborZ, based on the machine-learning technique of boosted decision trees. We study the algorithm using galaxies from the Sloan Digital Sky Survey (SDSS) and from mock catalogs intended to simulate both the SDSS and the upcoming Dark Energy Survey. We show that it improves upon the performance of existing algorithms. Moreover, the method naturally leads to the reconstruction of a full probability density function (PDF) for the photometric redshift of each galaxy, not merely a single 'best estimate' and error, and also provides a photo-z quality figure of merit for each galaxy that can be used to reject outliers. We show that the stacked PDFs yield a more accurate reconstruction of the redshift distribution N(z). We discuss limitations of the current algorithm and ideas for future work.

  10. Indirect methods of tree biomass estimation and their uncertainties ...

    African Journals Online (AJOL)

    Depending on data availability (dbh only or both dbh and total tree height) either of the models may be applied to generate satisfactory estimates of tree volume needed for planning and decision-making in management of mangrove forests. The study found an overall mean FF value of 0.65 ± 0.03 (SE), 0.56 ± 0.03 (SE) and ...

  11. Fault trees for decision making in systems analysis

    International Nuclear Information System (INIS)

    Lambert, H.E.

    1975-01-01

    The application of fault tree analysis (FTA) to system safety and reliability is presented within the framework of system safety analysis. The concepts and techniques involved in manual and automated fault tree construction are described and their differences noted. The theory of mathematical reliability pertinent to FTA is presented with emphasis on engineering applications. An outline of the quantitative reliability techniques of the Reactor Safety Study is given. Concepts of probabilistic importance are presented within the fault tree framework and applied to the areas of system design, diagnosis and simulation. The computer code IMPORTANCE ranks basic events and cut sets according to a sensitivity analysis. A useful feature of the IMPORTANCE code is that it can accept relative failure data as input. The output of the IMPORTANCE code can assist an analyst in finding weaknesses in system design and operation, suggest the optimal course of system upgrade, and determine the optimal location of sensors within a system. A general simulation model of system failure in terms of fault tree logic is described. The model is intended for efficient diagnosis of the causes of system failure in the event of a system breakdown. It can also be used to assist an operator in making decisions under a time constraint regarding the future course of operations. The model is well suited for computer implementation. New results incorporated in the simulation model include an algorithm to generate repair checklists on the basis of fault tree logic and a one-step-ahead optimization procedure that minimizes the expected time to diagnose system failure. (80 figures, 20 tables)
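
    Ranking basic events by a sensitivity measure can be sketched with the Birnbaum importance, defined as the change in top-event probability when one basic event goes from certainly working to certainly failed. This is a generic illustration under the assumption of independent basic events; it is not a reconstruction of the IMPORTANCE code, and the cut sets and failure probabilities are hypothetical.

```python
from itertools import product

# Hypothetical minimal cut sets: the system fails if every event in any one set fails.
CUT_SETS = [{"pump"}, {"valve_a", "valve_b"}]
FAIL_PROB = {"pump": 0.01, "valve_a": 0.1, "valve_b": 0.2}


def top_event_probability(prob):
    """Exact top-event probability by enumerating all basic-event states."""
    events = sorted(prob)
    total = 0.0
    for states in product([False, True], repeat=len(events)):
        failed = {e for e, is_failed in zip(events, states) if is_failed}
        if any(cut_set <= failed for cut_set in CUT_SETS):
            p = 1.0
            for e, is_failed in zip(events, states):
                p *= prob[e] if is_failed else 1.0 - prob[e]
            total += p
    return total


def birnbaum(event):
    """P(top | event failed) - P(top | event working)."""
    failed = dict(FAIL_PROB, **{event: 1.0})
    working = dict(FAIL_PROB, **{event: 0.0})
    return top_event_probability(failed) - top_event_probability(working)


importance = {event: birnbaum(event) for event in FAIL_PROB}
ranking = sorted(importance, key=importance.get, reverse=True)
```

    Full enumeration is exponential in the number of basic events, so it only suits small trees such as this one.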

  12. Relationships Between Average Depth and Number of Nodes for Decision Trees

    KAUST Repository

    Chikalov, Igor; Hussain, Shahid; Moshkov, Mikhail

    2013-01-01

    This paper presents a new tool for the study of relationships between total path length or average depth and number of nodes of decision trees. In addition to the algorithm, the paper also presents the results of experiments with datasets from UCI ML

  13. Oblique decision trees using embedded support vector machines in classifier ensembles

    NARCIS (Netherlands)

    Menkovski, V.; Christou, I.; Efremidis, S.

    2008-01-01

    Classifier ensembles have emerged in recent years as a promising research area for boosting pattern recognition systems' performance. We present a new base classifier that utilizes oblique decision tree technology based on support vector machines for the construction of oblique (non-axis parallel)

  14. Diagnosis of Constant Faults in Read-Once Contact Networks over Finite Bases using Decision Trees

    KAUST Repository

    Busbait, Monther I.

    2014-05-01

    We study the depth of decision trees for diagnosis of constant faults in read-once contact networks over finite bases. This includes diagnosis of 0-1 faults, 0 faults and 1 faults. For any finite basis, we prove a linear upper bound on the minimum depth of decision tree for diagnosis of constant faults depending on the number of edges in a contact network over that basis. Also, we obtain asymptotic bounds on the depth of decision trees for diagnosis of each type of constant faults depending on the number of edges in contact networks in the worst case per basis. We study the set of indecomposable contact networks with up to 10 edges and obtain sharp coefficients for the linear upper bound for diagnosis of constant faults in contact networks over bases of these indecomposable contact networks. We use a set of algorithms, including one that we create, to obtain the sharp coefficients.

  15. Dynamic Security Assessment of Western Danish Power System Based on Ensemble Decision Trees

    DEFF Research Database (Denmark)

    Liu, Leo; Bak, Claus Leth; Chen, Zhe

    2014-01-01

    With the increasing penetration of renewable energy resources and other forms of dispersed generation, more and more uncertainties will be brought to the dynamic security assessment (DSA) of power systems. This paper proposes an approach that uses ensemble decision trees (EDT) for online DSA. Fed...... with online wide-area measurement data, it is capable of not only predicting the security states of current operating conditions (OC) with high accuracy, but also indicating the confidence of the security states 1 minute ahead of the real time by an outlier identification method. The results of EDT together...

  16. Using decision-tree classifier systems to extract knowledge from databases

    Science.gov (United States)

    St. Clair, D. C.; Sabharwal, C. L.; Hacke, Keith; Bond, W. E.

    1990-01-01

    One difficulty in applying artificial intelligence techniques to the solution of real world problems is that the development and maintenance of many AI systems, such as those used in diagnostics, require large amounts of human resources. At the same time, databases frequently exist which contain information about the process(es) of interest. Recently, efforts to reduce development and maintenance costs of AI systems have focused on using machine learning techniques to extract knowledge from existing databases. Research is described in the area of knowledge extraction using a class of machine learning techniques called decision-tree classifier systems. Results of this research suggest ways of performing knowledge extraction which may be applied in numerous situations. In addition, a measurement called the concept strength metric (CSM) is described which can be used to determine how well the resulting decision tree can differentiate between the concepts it has learned. The CSM can be used to determine whether or not additional knowledge needs to be extracted from the database. An experiment involving real world data is presented to illustrate the concepts described.

  17. Modifiable risk factors predicting major depressive disorder at four year follow-up: a decision tree approach

    OpenAIRE

    Batterham, Philip J; Christensen, Helen; Mackinnon, Andrew J

    2009-01-01

    Background: Relative to physical health conditions such as cardiovascular disease, little is known about risk factors that predict the prevalence of depression. The present study investigates the expected effects of a reduction of these risks over time, using the decision tree method favoured in assessing cardiovascular disease risk. Methods: The PATH through Life cohort was used for the study, comprising 2,105 20-24 year olds, 2,323 40-44 year olds and 2,177 60-64 year olds sampled fr...

  18. DECISION TREE CLASSIFIERS FOR STAR/GALAXY SEPARATION

    International Nuclear Information System (INIS)

    Vasconcellos, E. C.; Ruiz, R. S. R.; De Carvalho, R. R.; Capelato, H. V.; Gal, R. R.; LaBarbera, F. L.; Frago Campos Velho, H.; Trevisan, M.

    2011-01-01

    We study the star/galaxy classification efficiency of 13 different decision tree algorithms applied to photometric objects in the Sloan Digital Sky Survey Data Release Seven (SDSS-DR7). Each algorithm is defined by a set of parameters which, when varied, produce different final classification trees. We extensively explore the parameter space of each algorithm, using the set of 884,126 SDSS objects with spectroscopic data as the training set. The efficiency of star-galaxy separation is measured using the completeness function. We find that the Functional Tree algorithm (FT) yields the best results as measured by the mean completeness in two magnitude intervals: 14 ≤ r ≤ 21 (85.2%) and r ≥ 19 (82.1%). We compare the performance of the tree generated with the optimal FT configuration to the classifications provided by the SDSS parametric classifier, 2DPHOT, and Ball et al. We find that our FT classifier is comparable to or better in completeness over the full magnitude range 15 ≤ r ≤ 21, with much lower contamination than all but the Ball et al. classifier. At the faintest magnitudes (r > 19), our classifier is the only one that maintains high completeness (>80%) while simultaneously achieving low contamination (∼2.5%). We also examine the SDSS parametric classifier (psfMag - modelMag) to see if the dividing line between stars and galaxies can be adjusted to improve the classifier. We find that currently stars in close pairs are often misclassified as galaxies, and suggest a new cut to improve the classifier. Finally, we apply our FT classifier to separate stars from galaxies in the full set of 69,545,326 SDSS photometric objects in the magnitude range 14 ≤ r ≤ 21.
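
    Completeness and contamination for a magnitude-difference cut can be sketched as follows. The abstract does not spell out its definitions, so the ones used here are the common ones and are an assumption: completeness is the fraction of true galaxies classified as galaxies, and contamination is the fraction of objects classified as galaxies that are actually stars. The cut value and the toy sample are hypothetical.

```python
def classify(psf_mag, model_mag, cut=0.145):
    """Call an object a galaxy when psfMag - modelMag exceeds the cut.

    The default cut is an illustrative value, not one taken from the paper.
    """
    return "galaxy" if psf_mag - model_mag > cut else "star"


def completeness_contamination(truth, predicted, target="galaxy"):
    """Return (completeness, contamination) for the target class."""
    pairs = list(zip(truth, predicted))
    true_pos = sum(t == target and p == target for t, p in pairs)
    n_true = sum(t == target for t, _ in pairs)
    n_pred = sum(p == target for _, p in pairs)
    completeness = true_pos / n_true
    contamination = (n_pred - true_pos) / n_pred
    return completeness, contamination


# Hypothetical sample: (psfMag, modelMag, spectroscopic class).
sample = [
    (18.0, 17.5, "galaxy"),
    (19.0, 18.9, "galaxy"),  # compact galaxy, missed by the cut
    (17.0, 17.0, "star"),
    (20.0, 19.7, "star"),    # star misclassified as a galaxy
    (18.5, 18.0, "galaxy"),
]
truth = [label for _, _, label in sample]
predicted = [classify(psf, model) for psf, model, _ in sample]
completeness, contamination = completeness_contamination(truth, predicted)
```

    Moving `cut` trades one quantity against the other, which is the kind of adjustment of the dividing line the abstract describes examining.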

  19. Decision tree analysis to evaluate dry cow strategies under UK conditions

    OpenAIRE

    Berry, E.A.; Hogeveen, H.; Hillerton, J.E.

    2005-01-01

    Economic decisions on animal health strategies address the cost-benefit aspect along with animal welfare and public health concerns. Decision tree analysis at an individual cow level highlighted that there is little economic difference between the use of either dry cow antibiotic or an internal teat sealant in preventing a new intramammary infection in a cow free of infection in all quarters of the mammary gland at drying off. However, a potential net loss of over £20 per cow might occur ...

  20. Exploratory Use of Decision Tree Analysis in Classification of Outcome in Hypoxic-Ischemic Brain Injury.

    Science.gov (United States)

    Phan, Thanh G; Chen, Jian; Singhal, Shaloo; Ma, Henry; Clissold, Benjamin B; Ly, John; Beare, Richard

    2018-01-01

    Prognostication following hypoxic ischemic encephalopathy (brain injury) is important for clinical management. The aim of this exploratory study is to use a decision tree model to find clinical and MRI associates of severe disability and death in this condition. We evaluated a clinical model and then the added value of MRI data. The inclusion criteria were as follows: age ≥17 years, cardio-respiratory arrest, and coma on admission (2003-2011). Decision tree analysis was used to find clinical [Glasgow Coma Score (GCS), features about cardiac arrest, therapeutic hypothermia, age, and sex] and MRI (infarct volume) associates of severe disability and death. We used the area under the ROC (auROC) to determine the accuracy of the model. There were 41 patients (63.7% male) with MRI, with an average age of 51.5 ± 18.9 years. The decision trees showed that infarct volume and age were important factors for discrimination between mild to moderate disability and severe disability and death at day 0 and day 2. The auROC for this model was 0.94 (95% CI 0.82-1.00). At day 7, GCS value was the only predictor; the auROC was 0.96 (95% CI 0.86-1.00). Our findings provide proof of concept for further exploration of the role of MR imaging and decision tree analysis in the early prognostication of hypoxic ischemic brain injury.

  1. Applying of Decision Tree Analysis to Risk Factors Associated with Pressure Ulcers in Long-Term Care Facilities.

    Science.gov (United States)

    Moon, Mikyung; Lee, Soo-Kyoung

    2017-01-01

    The purpose of this study was to use decision tree analysis to explore the factors associated with pressure ulcers (PUs) among elderly people admitted to Korean long-term care facilities. The data were extracted from the 2014 National Inpatient Sample (NIS) data of the Health Insurance Review and Assessment Service (HIRA). A MapReduce-based program was implemented to join and filter 5 tables of the NIS. The outcome predicted by the decision tree model was the prevalence of PUs as defined by the Korean Standard Classification of Disease-7 (KCD-7; code L89*). Using R 3.3.1, a decision tree was generated with the finalized 15,856 cases and 830 variables. The decision tree displayed 15 subgroups with 8 variables showing 0.804 accuracy, 0.820 sensitivity, and 0.787 specificity. The most significant primary predictor of PUs was length of stay less than 0.5 day. Other predictors were the presence of an infectious wound dressing, followed by having diagnoses numbering less than 3.5 and the presence of a simple dressing. Among diagnoses, "injuries to the hip and thigh" was the top predictor ranking 5th overall. Total hospital cost exceeding 2,200,000 Korean won (US $2,000) rounded out the top 7. These results support previous studies that showed length of stay, comorbidity, and total hospital cost were associated with PUs. Moreover, wound dressings were commonly used to treat PUs. They also show that machine learning, such as a decision tree, could effectively predict PUs using big data.
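
The accuracy, sensitivity, and specificity reported in the abstract above are standard ratios taken from a binary confusion matrix. The following is a minimal Python sketch of how such figures are computed; the class name `ConfusionCounts` and the example counts are hypothetical illustrations, not data from the study, which does not report its confusion matrix in the abstract.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts from a binary classifier (hypothetical container, not from the study)."""
    tp: int  # cases with the condition, predicted positive
    fp: int  # cases without the condition, predicted positive
    tn: int  # cases without the condition, predicted negative
    fn: int  # cases with the condition, predicted negative


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all cases classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of true positive cases the model detects."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of true negative cases the model correctly rules out."""
    return c.tn / (c.tn + c.fp)


# Illustrative counts only (assumed, not taken from the abstract).
example = ConfusionCounts(tp=82, fp=21, tn=79, fn=18)
print(accuracy(example), sensitivity(example), specificity(example))
```

With these assumed counts, 82 of 100 positive cases are detected and 79 of 100 negative cases are ruled out, which is the kind of trade-off the three reported figures summarize.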

  2. Decision-table development for use with the CAT code for the automated fault-tree construction

    International Nuclear Information System (INIS)

    Wu, J.S.; Salem, S.L.; Apostolakis, G.E.

    1977-01-01

    A library of decision tables to be used in connection with the CAT computer code for the automated construction of fault trees is presented. A decision table is constructed for each component type describing the output of the component in terms of its inputs and its internal states. In addition, a modification of the CAT code that couples it with a fault tree analysis code is presented. This report represents one aspect of a study entitled, 'A General Evaluation Approach to Risk-Benefit for Large Technological Systems, and Its Application to Nuclear Power.'

  3. Application of alternating decision trees in selecting sparse linear solvers

    KAUST Repository

    Bhowmick, Sanjukta; Eijkhout, Victor; Freund, Yoav; Fuentes, Erika; Keyes, David E.

    2010-01-01

    The solution of sparse linear systems, a fundamental and resource-intensive task in scientific computing, can be approached through multiple algorithms. Using an algorithm well adapted to characteristics of the task can significantly enhance the performance, such as reducing the time required for the operation, without compromising the quality of the result. However, the best solution method can vary even across linear systems generated in course of the same PDE-based simulation, thereby making solver selection a very challenging problem. In this paper, we use a machine learning technique, Alternating Decision Trees (ADT), to select efficient solvers based on the properties of sparse linear systems and runtime-dependent features, such as the stages of simulation. We demonstrate the effectiveness of this method through empirical results over linear systems drawn from computational fluid dynamics and magnetohydrodynamics applications. The results also demonstrate that using ADT can resolve the problem of over-fitting, which occurs when limited amount of data is available. © 2010 Springer Science+Business Media LLC.

  4. Decision-tree approach to evaluating inactive uranium-processing sites for liner requirements

    International Nuclear Information System (INIS)

    Relyea, J.F.

    1983-03-01

    Recently, concern has been expressed about potential toxic effects of both radon emission and release of toxic elements in leachate from inactive uranium mill tailings piles. Remedial action may be required to meet disposal standards set by the states and the US Environmental Protection Agency (EPA). In some cases, a possible disposal option is the exhumation and reburial (either on site or at a new location) of tailings and reliance on engineered barriers to satisfy the objectives established for remedial actions. Liners under disposal pits are the major engineered barrier for preventing contaminant release to ground and surface water. The purpose of this report is to provide a logical sequence of action, in the form of a decision tree, which could be followed to show whether a selected tailings disposal design meets the objectives for subsurface contaminant release without a liner. This information can be used to determine the need and type of liner for sites exhibiting a potential groundwater problem. The decision tree is based on the capability of hydrologic and mass transport models to predict the movement of water and contaminants with time. The types of modeling capabilities and data needed for those models are described, and the steps required to predict water and contaminant movement are discussed. A demonstration of the decision tree procedure is given to aid the reader in evaluating the need for the adequacy of a liner

  5. Decision Optimization of Machine Sets Taking Into Consideration Logical Tree Minimization of Design Guidelines

    Science.gov (United States)

    Deptuła, A.; Partyka, M. A.

    2014-08-01

    The method of minimization of complex partial multi-valued logical functions determines the degree of importance of construction and exploitation parameters playing the role of logical decision variables. Logical functions are taken into consideration in the issues of modelling machine sets. In multi-valued logical functions with weighting products, it is possible to use a modified Quine-McCluskey algorithm of multi-valued functions minimization. Taking into account weighting coefficients in the logical tree minimization reflects a physical model of the object being analysed much better.

  6. Using Boosted Decision Trees to look for displaced Jets in the ATLAS Calorimeter

    CERN Multimedia

    CERN. Geneva

    2017-01-01

    A boosted decision tree is used to identify unique jets in a recently released conference note describing a search for long lived particles decaying to hadrons in the ATLAS Calorimeter. Neutral Long lived particles decaying to hadrons are “typical” signatures in a lot of models including Hidden Valley models, Higgs Portal Models, Baryogenesis, Stealth SUSY, etc. Long lived neutral particles that decay in the calorimeter leave behind an object that looks like a regular Standard Model jet, with subtle differences. For example, the later in the calorimeter it decays, the less energy will be deposited in the early layers of the calorimeter. Because the jet does not originate at the interaction point, it will likely be more narrow as reconstructed by the standard Anti-kT jet reconstruction algorithm used by ATLAS. To separate the jets due to neutral long lived decays from the standard model jets we used a boosted decision tree with thirteen variables as inputs. We used the information from the boosted decision...

  7. Decision Tree and Texture Analysis for Mapping Debris-Covered Glaciers in the Kangchenjunga Area, Eastern Himalaya

    Directory of Open Access Journals (Sweden)

    Adina Racoviteanu

    2012-10-01

    In this study we use visible, short-wave infrared and thermal Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) data validated with high-resolution Quickbird (QB) and Worldview2 (WV2) for mapping debris cover in the eastern Himalaya using two independent approaches: (a) a decision tree algorithm, and (b) texture analysis. The decision tree algorithm was based on multi-spectral and topographic variables, such as band ratios, surface reflectance, kinetic temperature from ASTER bands 10 and 12, slope angle, and elevation. The decision tree algorithm resulted in 64 km² classified as debris-covered ice, which represents 11% of the glacierized area. Overall, for ten glacier tongues in the Kangchenjunga area, there was an area difference of 16.2 km² (25%) between the ASTER and the QB areas, with mapping errors mainly due to clouds and shadows. Texture analysis techniques included co-occurrence measures, geostatistics and filtering in spatial/frequency domain. Debris cover had the highest variance of all terrain classes, highest entropy and lowest homogeneity compared to the other classes, for example a mean variance of 15.27 compared to 0 for clouds and 0.06 for clean ice. Results of the texture image for debris-covered areas were comparable with those from the decision tree algorithm, with 8% area difference between the two techniques.

  8. Entropy lower bounds of quantum decision tree complexity

    OpenAIRE

    Shi, Yaoyun

    2000-01-01

    We prove a general lower bound of quantum decision tree complexity in terms of some entropy notion. We regard the computation as a communication process in which the oracle and the computer exchange several rounds of messages, each round consisting of O(log(n)) bits. Let E(f) be the Shannon entropy of the random variable f(X), where X is uniformly random in f's domain. Our main result is that it takes Ω(E(f)) queries to compute any total function f. It is interesting to contrast t...

  9. Decision tree for accurate infection timing in individuals newly diagnosed with HIV-1 infection.

    Science.gov (United States)

    Verhofstede, Chris; Fransen, Katrien; Van Den Heuvel, Annelies; Van Laethem, Kristel; Ruelle, Jean; Vancutsem, Ellen; Stoffels, Karolien; Van den Wijngaert, Sigi; Delforge, Marie-Luce; Vaira, Dolores; Hebberecht, Laura; Schauvliege, Marlies; Mortier, Virginie; Dauwe, Kenny; Callens, Steven

    2017-11-29

    There is today no gold standard method to accurately define the time passed since infection at HIV diagnosis. Infection timing and incidence measurement are, however, essential to better monitor the dynamics of local epidemics and the effect of prevention initiatives. Three methods for infection timing were evaluated using 237 serial samples from documented seroconversions and 566 cross sectional samples from newly diagnosed patients: identification of antibodies against the HIV p31 protein in INNO-LIA, Sedia™ BED CEIA and Sedia™ LAg-Avidity EIA. A multi-assay decision tree for infection timing was developed. Clear differences in recency window between BED CEIA, LAg-Avidity EIA and p31 antibody presence were observed with a switch from recent to long term infection a median of 169.5, 108.0 and 64.5 days after collection of the pre-seroconversion sample respectively. BED showed high reliability for identification of long term infections while LAg-Avidity is highly accurate for identification of recent infections. Using BED as initial assay to identify the long term infections and LAg-Avidity as a confirmatory assay for those classified as recent infection by BED exploits the strengths of both while reducing the workload. The short recency window of p31 antibodies allows very early infections to be discriminated from early infections based on this marker. BED recent infection results not confirmed by LAg-Avidity are considered to reflect a period more distant from the infection time. False recency predictions in this group can be minimized by elimination of patients with a CD4 count of less than 100 cells/mm³ or without p31 antibodies. For 566 cross sectional samples the outcome of the decision tree confirmed the infection timing based on the results of all 3 markers but reduced the overall cost from 13.2 USD to 5.2 USD per sample. A step-wise multi assay decision tree allows accurate timing of the HIV infection at diagnosis at affordable effort and cost and can be an important

  10. Relationships Between Average Depth and Number of Nodes for Decision Trees

    KAUST Repository

    Chikalov, Igor

    2013-07-24

    This paper presents a new tool for the study of relationships between total path length or average depth and number of nodes of decision trees. In addition to the algorithm, the paper also presents the results of experiments with datasets from UCI ML Repository [1]. © Springer-Verlag Berlin Heidelberg 2014.

  11. Studies of stability and robustness for artificial neural networks and boosted decision trees

    International Nuclear Information System (INIS)

    Yang, H.-J.; Roe, Byron P.; Zhu Ji

    2007-01-01

    In this paper, we compare the performance, stability and robustness of Artificial Neural Networks (ANN) and Boosted Decision Trees (BDT) using MiniBooNE Monte Carlo samples. These methods attempt to classify events given a number of identification variables. The BDT algorithm has been discussed by us in previous publications. Testing is done in this paper by smearing and shifting the input variables of testing samples. Based on these studies, BDT has better particle identification performance than ANN. The degradation of the classifications obtained by shifting or smearing variables of testing results is smaller for BDT than for ANN.

  12. Obesity and the decision tree: predictors of sustained weight loss after bariatric surgery.

    Science.gov (United States)

    Lee, Yi-Chih; Lee, Wei-Jei; Lin, Yang-Chu; Liew, Phui-Ly; Lee, Chia Ko; Lin, Steven C H; Lee, Tian-Shyung

    2009-01-01

    Bariatric surgery is the only long-lasting effective treatment to reduce body weight in morbid obesity. Previous literature on using data mining techniques to predict weight loss in obese patients who have undergone bariatric surgery is limited. This study used initial evaluations before bariatric surgery and data mining techniques to predict weight outcomes in morbidly obese patients seeking surgical treatment. 251 morbidly obese patients undergoing laparoscopic mini-gastric bypass (LMGB) or adjustable gastric banding (LAGB) with complete clinical data at baseline and at two years were enrolled for analysis. Decision Tree, Logistic Regression and Discriminant analysis technologies were used to predict weight loss. Overall classification capability of the designed diagnostic models was evaluated by the misclassification costs. Two hundred fifty-one patients, consisting of 68 men and 183 women, were studied, with a mean age of 33 years. Mean +/- SD weight loss at 2 years was 74.5 +/- 16.4 kg. During two years of follow up, two hundred and five (81.7%) patients had successful weight reduction while 46 (18.3%) failed to reduce body weight. Operation methods, alanine transaminase (ALT), aspartate transaminase (AST), white blood cell counts (WBC), insulin and hemoglobin A1c (HbA1c) levels were the predictive factors for successful weight reduction. The decision tree model was a better classification model than traditional logistic regression and discriminant analysis in terms of predictive accuracy.

  13. A hybrid model using decision tree and neural network for credit scoring problem

    Directory of Open Access Journals (Sweden)

    Amir Arzy Soltan

    2012-08-01

    Nowadays credit scoring is an important issue for financial and monetary organizations that has substantial impact on reduction of customer attraction risks. Identification of high-risk customers can reduce finished cost. Accurate classification of customers and low type 1 and type 2 errors have been investigated in many studies. The primary objective of this paper is to develop a new method, which chooses the best neural network architecture based on one column hidden layer MLP, multiple columns hidden layers MLP, RBFN and decision trees and ensembles them with voting methods. The proposed method is run on an Australian credit dataset and on data from a private bank in Iran called Export Development Bank of Iran, and the results are used to devise solutions for lowering customer attraction risks.

  14. Using Decision Trees to Detect and Isolate Leaks in the J-2X

    Data.gov (United States)

    National Aeronautics and Space Administration — Full title: Using Decision Trees to Detect and Isolate Simulated Leaks in the J-2X Rocket Engine Mark Schwabacher, NASA Ames Research Center Robert Aguilar, Pratt...

  15. Toward the Decision Tree for Inferring Requirements Maturation Types

    Science.gov (United States)

    Nakatani, Takako; Kondo, Narihito; Shirogane, Junko; Kaiya, Haruhiko; Hori, Shozo; Katamine, Keiichi

    Requirements are elicited step by step during the requirements engineering (RE) process. However, some types of requirements are elicited completely only after the scheduled requirements elicitation process is finished. Such a situation is regarded as a problematic situation. In our study, the difficulties of eliciting various kinds of requirements are observed by components. We refer to the components as observation targets (OTs) and introduce the term “requirements maturation.” It means when and how requirements are elicited completely in the project. The requirements maturation is discussed on physical and logical OTs. OTs viewed from a logical viewpoint are called logical OTs, e.g. quality requirements. The requirements of physical OTs, e.g., modules, components, subsystems, etc., include functional and non-functional requirements. They are influenced by their requesters' environmental changes, as well as developers' technical changes. In order to infer the requirements maturation period of each OT, we need to know how much these factors influence the OTs' requirements maturation. According to the observation of actual past projects, we defined the PRINCE (Pre Requirements Intelligence Net Consideration and Evaluation) model. It aims to guide developers in their observation of the requirements maturation of OTs. We quantitatively analyzed the actual cases with their requirements elicitation process and extracted essential factors that influence the requirements maturation. The results of interviews of project managers are analyzed by WEKA, a data mining system, from which the decision tree was derived. This paper introduces the PRINCE model and the category of logical OTs to be observed. The decision tree that helps developers infer the maturation type of an OT is also described. We evaluate the tree through real projects and discuss its ability to infer the requirements maturation types.

  16. Single nucleotide polymorphism barcoding of cytochrome c oxidase I sequences for discriminating 17 species of Columbidae by decision tree algorithm.

    Science.gov (United States)

    Yang, Cheng-Hong; Wu, Kuo-Chuan; Dahms, Hans-Uwe; Chuang, Li-Yeh; Chang, Hsueh-Wei

    2017-07-01

    DNA barcodes are widely used in taxonomy, systematics, species identification, food safety, and forensic science. Most of the conventional DNA barcode sequences contain the whole information of a given barcoding gene. Most of the sequence information does not vary and is uninformative for a given group of taxa within a monophylum. We suggest here a method that reduces the amount of noninformative nucleotides in a given barcoding sequence of a major taxon, like the prokaryotes, or eukaryotic animals, plants, or fungi. The actual differences in genetic sequences, called single nucleotide polymorphism (SNP) genotyping, provide a tool for developing a rapid, reliable, and high-throughput assay for the discrimination between known species. Here, we investigated SNPs as robust markers of genetic variation for identifying different pigeon species based on available cytochrome c oxidase I (COI) data. We propose here a decision tree-based SNP barcoding (DTSB) algorithm where SNP patterns are selected from the DNA barcoding sequence of several evolutionarily related species in order to identify a single species with pigeons as an example. This approach can make use of any established barcoding system. We here firstly used as an example the mitochondrial gene COI information of 17 pigeon species (Columbidae, Aves) using DTSB after sequence trimming and alignment. SNPs were chosen which followed the rule of decision tree and species-specific SNP barcodes. The shortest barcode of about 11 bp was then generated for discriminating 17 pigeon species using the DTSB method. This method provides a sequence alignment and tree decision approach to parsimoniously assign a unique and shortest SNP barcode for any known species of a chosen monophyletic taxon where a barcoding sequence is available.

  17. Decision-Tree Analysis for Predicting First-Time Pass/Fail Rates for the NCLEX-RN® in Associate Degree Nursing Students.

    Science.gov (United States)

    Chen, Hsiu-Chin; Bennett, Sean

    2016-08-01

    Little evidence shows the use of decision-tree algorithms in identifying predictors and analyzing their associations with pass rates for the NCLEX-RN® in associate degree nursing students. This longitudinal and retrospective cohort study investigated whether a decision-tree algorithm could be used to develop an accurate prediction model for the students' passing or failing the NCLEX-RN. This study used archived data from 453 associate degree nursing students in a selected program. The chi-squared automatic interaction detection analysis of the decision trees module was used to examine the effect of the collected predictors on passing/failing the NCLEX-RN. The actual percentage scores of Assessment Technologies Institute®'s RN Comprehensive Predictor® accurately identified students at risk of failing. The classification model correctly classified 92.7% of the students for passing. This study applied the decision-tree model to analyze a sequence database for developing a prediction model for early remediation in preparation for the NCLEX-RN. [J Nurs Educ. 2016;55(8):454-457.]. Copyright 2016, SLACK Incorporated.

  18. Relationships between average depth and number of misclassifications for decision trees

    KAUST Repository

    Chikalov, Igor

    2014-02-14

    This paper presents a new tool for the study of relationships between the total path length or the average depth and the number of misclassifications for decision trees. In addition to the algorithm, the paper also presents the results of experiments with datasets from UCI ML Repository [9] and datasets representing Boolean functions with 10 variables.

  19. Relationships between average depth and number of misclassifications for decision trees

    KAUST Repository

    Chikalov, Igor; Hussain, Shahid; Moshkov, Mikhail

    2014-01-01

    This paper presents a new tool for the study of relationships between the total path length or the average depth and the number of misclassifications for decision trees. In addition to the algorithm, the paper also presents the results of experiments with datasets from UCI ML Repository [9] and datasets representing Boolean functions with 10 variables.

  20. Calculation of the number of branches of multi-valued decision trees in computer aided importance rank of parameters

    Directory of Open Access Journals (Sweden)

    Tiszbierek Agnieszka

    2017-01-01

    An elaborated digital computer programme supporting the time-consuming process of selecting the importance rank of construction and operation parameters by means of stating optimum sets is based on the Quine-McCluskey algorithm of minimizing individual partial multi-valued logic functions. The example with real time data, calculated by means of the programme, showed that among the obtained optimum sets there were such which had a different number of real branches after being presented on the multi-valued logic decision tree. That is why an idea of elaborating another functionality of the programme – a module calculating the number of branches of real, multi-valued logic decision trees presenting optimum sets chosen by the programme was pursued. This paper presents the idea and the method for developing a module calculating the number of branches, real for each of optimum sets indicated by the programme, as well as to the calculation process.

  1. How to differentiate acute pelvic inflammatory disease from acute appendicitis? A decision tree based on CT findings

    Energy Technology Data Exchange (ETDEWEB)

    El Hentour, Kim; Millet, Ingrid; Pages-Bouic, Emmanuelle; Curros-Doyon, Fernanda; Taourel, Patrice [Lapeyronie Hospital, Department of Medical Imaging, Montpellier (France); Molinari, Nicolas [UMR 5149 IMAG, CHU, Department of Medical Information and Statistics, Montpellier (France)

    2018-02-15

    To construct a decision tree based on CT findings to differentiate acute pelvic inflammatory disease (PID) from acute appendicitis (AA) in women with lower abdominal pain and inflammatory syndrome. This retrospective study was approved by our institutional review board and informed consent was waived. Contrast-enhanced CT studies of 109 women with acute PID and 218 age-matched women with AA were retrospectively and independently reviewed by two radiologists to identify CT findings predictive of PID or AA. Surgical and laboratory data were used for the PID and AA reference standard. Appropriate tests were performed to compare PID and AA and a CT decision tree using the classification and regression tree (CART) algorithm was generated. The median patient age was 28 years (interquartile range, 22-39 years). According to the decision tree, an appendiceal diameter ≥ 7 mm was the most discriminating criterion for differentiating acute PID and AA, followed by a left tubal diameter ≥ 10 mm, with a global accuracy of 98.2 % (95 % CI: 96-99.4). Appendiceal diameter and left tubal thickening are the most discriminating CT criteria for differentiating acute PID from AA. (orig.)

  2. How to differentiate acute pelvic inflammatory disease from acute appendicitis? A decision tree based on CT findings

    International Nuclear Information System (INIS)

    El Hentour, Kim; Millet, Ingrid; Pages-Bouic, Emmanuelle; Curros-Doyon, Fernanda; Taourel, Patrice; Molinari, Nicolas

    2018-01-01

    To construct a decision tree based on CT findings to differentiate acute pelvic inflammatory disease (PID) from acute appendicitis (AA) in women with lower abdominal pain and inflammatory syndrome. This retrospective study was approved by our institutional review board and informed consent was waived. Contrast-enhanced CT studies of 109 women with acute PID and 218 age-matched women with AA were retrospectively and independently reviewed by two radiologists to identify CT findings predictive of PID or AA. Surgical and laboratory data were used for the PID and AA reference standard. Appropriate tests were performed to compare PID and AA and a CT decision tree using the classification and regression tree (CART) algorithm was generated. The median patient age was 28 years (interquartile range, 22-39 years). According to the decision tree, an appendiceal diameter ≥ 7 mm was the most discriminating criterion for differentiating acute PID and AA, followed by a left tubal diameter ≥ 10 mm, with a global accuracy of 98.2 % (95 % CI: 96-99.4). Appendiceal diameter and left tubal thickening are the most discriminating CT criteria for differentiating acute PID from AA. (orig.)

  3. CorRECTreatment: a web-based decision support tool for rectal cancer treatment that uses the analytic hierarchy process and decision tree.

    Science.gov (United States)

    Suner, A; Karakülah, G; Dicle, O; Sökmen, S; Çelikoğlu, C C

    2015-01-01

    The selection of appropriate rectal cancer treatment is a complex multi-criteria decision making process, in which clinical decision support systems might be used to assist and enrich physicians' decision making. The objective of the study was to develop a web-based clinical decision support tool for physicians in the selection of potentially beneficial treatment options for patients with rectal cancer. The updated decision model contained 8 and 10 criteria in the first and second steps respectively. The decision support model, developed in our previous study by combining the Analytic Hierarchy Process (AHP) method which determines the priority of criteria and decision tree that formed using these priorities, was updated and applied to 388 patients data collected retrospectively. Later, a web-based decision support tool named corRECTreatment was developed. The compatibility of the treatment recommendations by the expert opinion and the decision support tool was examined for its consistency. Two surgeons were requested to recommend a treatment and an overall survival value for the treatment among 20 different cases that we selected and turned into a scenario among the most common and rare treatment options in the patient data set. In the AHP analyses of the criteria, it was found that the matrices, generated for both decision steps, were consistent (consistency ratiodecisions of experts, the consistency value for the most frequent cases was found to be 80% for the first decision step and 100% for the second decision step. Similarly, for rare cases consistency was 50% for the first decision step and 80% for the second decision step. The decision model and corRECTreatment, developed by applying these on real patient data, are expected to provide potential users with decision support in rectal cancer treatment processes and facilitate them in making projections about treatment options.

  4. A regret theory approach to decision curve analysis: A novel method for eliciting decision makers' preferences and decision-making

    Directory of Open Access Journals (Sweden)

    Vickers Andrew

    2010-09-01

    Full Text Available Abstract Background: Decision curve analysis (DCA) has been proposed as an alternative method for evaluation of diagnostic tests, prediction models, and molecular markers. However, DCA is based on expected utility theory, which has been routinely violated by decision makers. Decision-making is governed by intuition (system 1), and analytical, deliberative process (system 2), thus, rational decision-making should reflect both formal principles of rationality and intuition about good decisions. We use the cognitive emotion of regret to serve as a link between systems 1 and 2 and to reformulate DCA. Methods: First, we analysed a classic decision tree describing three decision alternatives: treat, do not treat, and treat or no treat based on a predictive model. We then computed the expected regret for each of these alternatives as the difference between the utility of the action taken and the utility of the action that, in retrospect, should have been taken. For any pair of strategies, we measure the difference in net expected regret. Finally, we employ the concept of acceptable regret to identify the circumstances under which a potentially wrong strategy is tolerable to a decision-maker. Results: We developed a novel dual visual analog scale to describe the relationship between regret associated with "omissions" (e.g. failure to treat) vs. "commissions" (e.g. treating unnecessary) and decision maker's preferences as expressed in terms of threshold probability. We then proved that the Net Expected Regret Difference, first presented in this paper, is equivalent to net benefits as described in the original DCA. Based on the concept of acceptable regret we identified the circumstances under which a decision maker tolerates a potentially wrong decision and expressed it in terms of probability of disease. Conclusions: We present a novel method for eliciting decision maker's preferences and an alternative derivation of DCA based on regret theory.
Our approach may

  5. Totally Optimal Decision Trees for Monotone Boolean Functions with at Most Five Variables

    KAUST Repository

    Chikalov, Igor; Hussain, Shahid; Moshkov, Mikhail

    2013-01-01

    In this paper, we present the empirical results for relationships between time (depth) and space (number of nodes) complexity of decision trees computing monotone Boolean functions, with at most five variables. We use Dagger (a tool for optimization

  6. Design of a new hybrid artificial neural network method based on decision trees for calculating the Froude number in rigid rectangular channels

    Directory of Open Access Journals (Sweden)

    Ebtehaj Isa

    2016-09-01

    Full Text Available A vital topic regarding the optimum and economical design of rigid boundary open channels such as sewers and drainage systems is determining the movement of sediment particles. In this study, the incipient motion of sediment is estimated using three datasets from literature, including a wide range of hydraulic parameters. Because existing equations do not consider the effect of sediment bed thickness on incipient motion estimation, this parameter is applied in this study along with the multilayer perceptron (MLP), a hybrid method based on decision trees (DT) (MLP-DT), to estimate incipient motion. According to a comparison with the observed experimental outcome, the proposed method performs well (MARE = 0.048, RMSE = 0.134, SI = 0.06, BIAS = -0.036). The performance of MLP and MLP-DT is compared with that of existing regression-based equations, and significantly higher performance over existing models is observed. Finally, an explicit expression for practical engineering is also provided.
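
    The four error statistics quoted above can be computed as in the following sketch. The abstract does not define them, so the formulas are assumptions based on common usage: MARE as the mean absolute relative error, RMSE as the root mean square error, SI (scatter index) as RMSE divided by the mean observed value, and BIAS as the mean of predicted minus observed. The numbers in the usage line are invented.

```python
import math


def error_metrics(observed, predicted):
    """Return MARE, RMSE, SI and BIAS under the assumed definitions above."""
    n = len(observed)
    pairs = list(zip(observed, predicted))
    # Mean absolute relative error: average of |obs - pred| / |obs|.
    mare = sum(abs(o - p) / abs(o) for o, p in pairs) / n
    # Root mean square error.
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in pairs) / n)
    # Scatter index: RMSE normalised by the mean observed value.
    si = rmse / (sum(observed) / n)
    # Bias: positive when the model over-predicts on average.
    bias = sum(p - o for o, p in pairs) / n
    return {"MARE": mare, "RMSE": rmse, "SI": si, "BIAS": bias}


# Hypothetical observed and predicted values, for illustration only.
print(error_metrics([2.0, 4.0], [1.0, 5.0]))
```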

  7. A regret theory approach to decision curve analysis: a novel method for eliciting decision makers' preferences and decision-making.

    Science.gov (United States)

    Tsalatsanis, Athanasios; Hozo, Iztok; Vickers, Andrew; Djulbegovic, Benjamin

    2010-09-16

    Decision curve analysis (DCA) has been proposed as an alternative method for evaluation of diagnostic tests, prediction models, and molecular markers. However, DCA is based on expected utility theory, which has been routinely violated by decision makers. Decision-making is governed by intuition (system 1), and analytical, deliberative process (system 2), thus, rational decision-making should reflect both formal principles of rationality and intuition about good decisions. We use the cognitive emotion of regret to serve as a link between systems 1 and 2 and to reformulate DCA. First, we analysed a classic decision tree describing three decision alternatives: treat, do not treat, and treat or no treat based on a predictive model. We then computed the expected regret for each of these alternatives as the difference between the utility of the action taken and the utility of the action that, in retrospect, should have been taken. For any pair of strategies, we measure the difference in net expected regret. Finally, we employ the concept of acceptable regret to identify the circumstances under which a potentially wrong strategy is tolerable to a decision-maker. We developed a novel dual visual analog scale to describe the relationship between regret associated with "omissions" (e.g. failure to treat) vs. "commissions" (e.g. treating unnecessary) and decision maker's preferences as expressed in terms of threshold probability. We then proved that the Net Expected Regret Difference, first presented in this paper, is equivalent to net benefits as described in the original DCA. Based on the concept of acceptable regret we identified the circumstances under which a decision maker tolerates a potentially wrong decision and expressed it in terms of probability of disease. We present a novel method for eliciting decision maker's preferences and an alternative derivation of DCA based on regret theory. 
Our approach may be intuitively more appealing to a decision-maker, particularly
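
    For orientation, the net benefit of the original DCA, to which the abstract says the Net Expected Regret Difference is equivalent, can be sketched as follows. This is the standard net-benefit formula (true positives minus false positives weighted by the odds of the threshold probability, per patient), not the authors' regret-based derivation; the patient counts are invented.

```python
def net_benefit(true_positives, false_positives, n, threshold):
    """Net benefit of a strategy at a given threshold probability (0 < threshold < 1)."""
    odds = threshold / (1.0 - threshold)
    return true_positives / n - (false_positives / n) * odds


# Hypothetical cohort: 100 patients, 20 of whom have the disease.
n_patients, n_diseased, threshold = 100, 20, 0.2

# Strategy "treat according to the model": the model flags 30 patients, 15 correctly.
nb_model = net_benefit(15, 15, n_patients, threshold)
# Strategy "treat all": every diseased patient is a true positive, everyone else a false positive.
nb_treat_all = net_benefit(n_diseased, n_patients - n_diseased, n_patients, threshold)
# Strategy "treat none": no true or false positives, so net benefit is zero.
nb_treat_none = net_benefit(0, 0, n_patients, threshold)

print(nb_model, nb_treat_all, nb_treat_none)
```

    Comparing the three values across a range of thresholds gives the decision curve; in this invented example the model-based strategy is preferred at a threshold of 0.2.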

  8. Comparison of T-Square, Point Centered Quarter, and N-Tree Sampling Methods in Pittosporum undulatum Invaded Woodlands

    Directory of Open Access Journals (Sweden)

    Lurdes Borges Silva

    2017-01-01

    Full Text Available Tree density is an important parameter affecting ecosystem functions and management decisions, while tree distribution patterns affect sampling design. Pittosporum undulatum stands in the Azores are being targeted with a biomass valorization program, for which efficient tree density estimators are required. We compared T-Square sampling, Point Centered Quarter Method (PCQM), and N-tree sampling with benchmark quadrat (QD) sampling in six 900 m2 plots established at P. undulatum stands in São Miguel Island. A total of 15 estimators were tested using a data resampling approach. The estimated density range (344–5056 trees/ha) was found to agree with previous studies using PCQM only. Although it tended to underestimate tree density (in comparison with QD), T-Square sampling appeared overall to be the most accurate and precise method, followed by PCQM. Tree distribution pattern was found to be slightly aggregated in 4 of the 6 stands. Considering (1) the low level of bias and high precision, (2) the consistency among three estimators, (3) the possibility of use with aggregated patterns, and (4) the possibility of obtaining a larger number of independent tree parameter estimates, we recommend the use of T-Square sampling in P. undulatum stands within the framework of a biomass valorization program.

  9. Validating a decision tree for serious infection: diagnostic accuracy in acutely ill children in ambulatory care.

    Science.gov (United States)

    Verbakel, Jan Y; Lemiengre, Marieke B; De Burghgraeve, Tine; De Sutter, An; Aertgeerts, Bert; Bullens, Dominique M A; Shinkins, Bethany; Van den Bruel, Ann; Buntinx, Frank

    2015-08-07

    Acute infection is the most common presentation of children in primary care with only few having a serious infection (eg, sepsis, meningitis, pneumonia). To avoid complications or death, early recognition and adequate referral are essential. Clinical prediction rules have the potential to improve diagnostic decision-making for rare but serious conditions. In this study, we aimed to validate a recently developed decision tree in a new but similar population. Diagnostic accuracy study validating a clinical prediction rule. Acutely ill children presenting to ambulatory care in Flanders, Belgium, consisting of general practice and paediatric assessment in outpatient clinics or the emergency department. Physicians were asked to score the decision tree in every child. The outcome of interest was hospital admission for at least 24 h with a serious infection within 5 days after initial presentation. We report the diagnostic accuracy of the decision tree in sensitivity, specificity, likelihood ratios and predictive values. In total, 8962 acute illness episodes were included, of which 283 led to admission to hospital with a serious infection. Sensitivity of the decision tree was 100% (95% CI 71.5% to 100%) at a specificity of 83.6% (95% CI 82.3% to 84.9%) in the general practitioner setting with 17% of children testing positive. In the paediatric outpatient and emergency department setting, sensitivities were below 92%, with specificities below 44.8%. In an independent validation cohort, this clinical prediction rule has been shown to be extremely sensitive in identifying children at risk of hospital admission for a serious infection in general practice, making it suitable for ruling out. NCT02024282. Published by the BMJ Publishing Group Limited.
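
    The accuracy measures reported in this abstract (sensitivity, specificity, likelihood ratios and predictive values) all follow from a 2x2 table of rule result against outcome. A minimal sketch, using invented counts rather than the study's data:

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Standard accuracy measures from the four cells of a 2x2 table."""
    sensitivity = tp / (tp + fn)  # diseased children flagged by the rule
    specificity = tn / (tn + fp)  # non-diseased children cleared by the rule
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Positive likelihood ratio: how much a positive result raises the odds.
        "LR+": sensitivity / (1 - specificity) if specificity < 1 else float("inf"),
        # Negative likelihood ratio: how much a negative result lowers the odds.
        "LR-": (1 - sensitivity) / specificity if specificity > 0 else float("inf"),
        "PPV": tp / (tp + fp),  # probability of disease given a positive result
        "NPV": tn / (tn + fn),  # probability of no disease given a negative result
    }


# Hypothetical counts for illustration only.
result = diagnostic_accuracy(tp=9, fp=18, fn=1, tn=72)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

    A rule intended for ruling out, as described above, is judged mainly on sensitivity, the negative likelihood ratio and the negative predictive value.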

  10. Determination of Component Failure Modes for a Fire PSA by Using Decision Trees

    International Nuclear Information System (INIS)

    Kang, Dae Il; Han, Sang Hoon; Lim, Jae Won

    2007-01-01

    KAERI developed a method, called the mapping technique, for the quantification of external events PSA models with one top model for an internal events PSA. The mapping technique can be implemented by the construction of mapping tables. The mapping tables include initiating events and transfer events of fire, and internal PSA basic events affected by a fire. This year, KAERI is making mapping tables for the one top model for the Ulchin Unit 3 and 4 fire PSA using previously conducted fire PSA results for Ulchin Unit 3 and 4. A fire PSA requires a PSA analyst to determine component failure modes affected by a fire. The component failure modes caused by a fire depend on several factors: whether components are located in fire initiation and propagation areas, fire effects on control and power cables for components, designed failure modes of components, success criteria in a PSA model, etc. Thus, it is not easy to manually determine component failure modes caused by a fire. In this paper, we propose the use of decision trees for the determination of component failure modes affected by a fire and the selection of internal PSA basic events. Section 2 presents the procedure for the previously performed Ulchin Unit 3 and 4 fire PSA and the mapping technique. Section 3 presents the process for identification of basic events and decision trees. Section 4 presents the concluding remarks.

  11. Application of decision tree technique to sensitivity analysis for results of radionuclide migration calculations. Research documents

    International Nuclear Information System (INIS)

    Nakajima, Kunihiko; Makino, Hitoshi

    2005-03-01

    Uncertainties are always present in the parameters used for the nuclide migration analysis in the geological disposal system. These uncertainties affect the results of such analyses, e.g., the identification of dominant nuclides. It is very important to identify the parameters that have a significant impact on the results, and to investigate the influence of the identified parameters, in order to recognize R&D items with respect to the development of the geological disposal system and understanding of the system performance. In our study, the decision tree analysis technique was examined in the sensitivity analysis as a method for investigating the influence of the parameters and for complementing existing sensitivity analysis. As a result, results obtained from Monte Carlo simulation with parameter uncertainties could be distinguished not only by the important parameters but also by their quantitative conditions (e.g., ranges of parameter values). Furthermore, information obtained from the decision tree analysis could be used 1) to categorize the results obtained from the nuclide migration analysis for a given parameter set, 2) to show the prospective effect of reducing parameter uncertainties on the results. (author)
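
    The idea of using a tree to find both the influential parameter and its quantitative condition can be sketched with a single-split regression tree (a decision stump) over Monte Carlo samples. This is an assumed illustration of the general technique, not the procedure used in the report; the parameter names `solubility` and `flow_rate` and all values are hypothetical.

```python
def sum_squared_error(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(samples, outputs):
    """Find the (parameter, threshold) pair that best separates the outputs.

    samples: list of dicts mapping parameter name to sampled value.
    outputs: list of model results, one per sample.
    Returns (parameter name, threshold, remaining squared error).
    """
    best = None
    for name in samples[0]:
        levels = sorted({s[name] for s in samples})
        # Candidate thresholds are midpoints between adjacent sampled values.
        for low, high in zip(levels, levels[1:]):
            threshold = (low + high) / 2.0
            left = [y for s, y in zip(samples, outputs) if s[name] <= threshold]
            right = [y for s, y in zip(samples, outputs) if s[name] > threshold]
            score = sum_squared_error(left) + sum_squared_error(right)
            if best is None or score < best[2]:
                best = (name, threshold, score)
    return best


# Hypothetical Monte Carlo runs: the output jumps when solubility is high.
runs = [
    {"solubility": 0.1, "flow_rate": 3.0},
    {"solubility": 0.2, "flow_rate": 1.0},
    {"solubility": 0.8, "flow_rate": 2.0},
    {"solubility": 0.9, "flow_rate": 4.0},
]
doses = [1.0, 1.0, 10.0, 10.0]
print(best_split(runs, doses))
```

    Applying the same search recursively to each side of the split would grow a full tree, with each path giving a range of parameter values associated with a class of results.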

  12. Predicting Lung Radiotherapy-Induced Pneumonitis Using a Model Combining Parametric Lyman Probit With Nonparametric Decision Trees

    International Nuclear Information System (INIS)

    Das, Shiva K.; Zhou Sumin; Zhang, Junan; Yin, F.-F.; Dewhirst, Mark W.; Marks, Lawrence B.

    2007-01-01

    Purpose: To develop and test a model to predict for lung radiation-induced Grade 2+ pneumonitis. Methods and Materials: The model was built from a database of 234 lung cancer patients treated with radiotherapy (RT), of whom 43 were diagnosed with pneumonitis. The model augmented the predictive capability of the parametric dose-based Lyman normal tissue complication probability (LNTCP) metric by combining it with weighted nonparametric decision trees that use dose and nondose inputs. The decision trees were sequentially added to the model using a 'boosting' process that enhances the accuracy of prediction. The model's predictive capability was estimated by 10-fold cross-validation. To facilitate dissemination, the cross-validation result was used to extract a simplified approximation to the complicated model architecture created by boosting. Application of the simplified model is demonstrated in two example cases. Results: The area under the model receiver operating characteristics curve for cross-validation was 0.72, a significant improvement over the LNTCP area of 0.63 (p = 0.005). The simplified model used the following variables to output a measure of injury: LNTCP, gender, histologic type, chemotherapy schedule, and treatment schedule. For a given patient RT plan, injury prediction was highest for the combination of pre-RT chemotherapy, once-daily treatment, female gender and lowest for the combination of no pre-RT chemotherapy and nonsquamous cell histologic type. Application of the simplified model to the example cases revealed that injury prediction for a given treatment plan can range from very low to very high, depending on the settings of the nondose variables. Conclusions: Radiation pneumonitis prediction was significantly enhanced by decision trees that added the influence of nondose factors to the LNTCP formulation

  13. Model-Independent Evaluation of Tumor Markers and a Logistic-Tree Approach to Diagnostic Decision Support

    Directory of Open Access Journals (Sweden)

    Weizeng Ni

    2014-01-01

    Full Text Available Sensitivity and specificity of using individual tumor markers hardly meet the clinical requirement. This challenge gave rise to many efforts, e.g., combining multiple tumor markers and employing machine learning algorithms. However, results from different studies are often inconsistent, which is partially attributed to the use of different evaluation criteria. Also, the wide use of model-dependent validation leads to a high possibility of data overfitting when complex models are used for diagnosis. We propose two model-independent criteria, namely, area under the curve (AUC) and Relief, to evaluate the diagnostic values of individual and multiple tumor markers, respectively. For diagnostic decision support, we propose the use of logistic-tree, which combines decision tree and logistic regression. Application on a colorectal cancer dataset shows that the proposed evaluation criteria produce results that are consistent with current knowledge. Furthermore, the simple and highly interpretable logistic-tree has diagnostic performance that is competitive with other complex models.

  14. Identifying Risk Factors for Drug Use in an Iranian Treatment Sample: A Prediction Approach Using Decision Trees.

    Science.gov (United States)

    Amirabadizadeh, Alireza; Nezami, Hossein; Vaughn, Michael G; Nakhaee, Samaneh; Mehrpour, Omid

    2018-05-12

    Substance abuse exacts considerable social and health care burdens throughout the world. The aim of this study was to create a prediction model to better identify risk factors for drug use. A prospective cross-sectional study was conducted in South Khorasan Province, Iran. Of the total of 678 eligible subjects, 70% (n: 474) were randomly selected to provide a training set for constructing decision tree and multiple logistic regression (MLR) models. The remaining 30% (n: 204) were employed in a holdout sample to test the performance of the decision tree and MLR models. Predictive performance of different models was analyzed by the receiver operating characteristic (ROC) curve using the testing set. Independent variables were selected from demographic characteristics and history of drug use. For the decision tree model, the sensitivity and specificity for identifying people at risk for drug abuse were 66% and 75%, respectively, while the MLR model was somewhat less effective at 60% and 73%. Key independent variables in the analyses included first substance experience, age at first drug use, age, place of residence, history of cigarette use, and occupational and marital status. While study findings are exploratory and lack generalizability, they do suggest that the decision tree model holds promise as an effective classification approach for identifying risk factors for drug use. Consistent with prior research in Western contexts, age of drug use initiation was a critical factor predicting a substance use disorder.

  15. A decision tree approach using silvics to guide planning for forest restoration

    Science.gov (United States)

    Sharon M. Hermann; John S. Kush; John C. Gilbert

    2013-01-01

    We created a decision tree based on silvics of longleaf pine (Pinus palustris) and historical descriptions to develop approaches for restoration management at Horseshoe Bend National Military Park located in central Alabama. A National Park Service goal is to promote structure and composition of a forest that likely surrounded the 1814 battlefield....

  16. The Americans with Disabilities Act: A Decision Tree for Social Services Administrators

    Science.gov (United States)

    O'Brien, Gerald V.; Ellegood, Christina

    2005-01-01

    The 1990 Americans with Disabilities Act has had a profound influence on social workers and social services administrators in virtually all work settings. Because of the multiple elements of the act, however, assessing the validity of claims can be a somewhat arduous and complicated task. This article provides a "decision tree" for…

  17. Prediction of adverse drug reactions using decision tree modeling.

    Science.gov (United States)

    Hammann, F; Gutmann, H; Vogt, N; Helma, C; Drewe, J

    2010-07-01

    Drug safety is of great importance to public health. The detrimental effects of drugs not only limit their application but also cause suffering in individual patients and evoke distrust of pharmacotherapy. For the purpose of identifying drugs that could be suspected of causing adverse reactions, we present a structure-activity relationship analysis of adverse drug reactions (ADRs) in the central nervous system (CNS), liver, and kidney, and also of allergic reactions, for a broad variety of drugs (n = 507) from the Swiss drug registry. Using decision tree induction, a machine learning method, we determined the chemical, physical, and structural properties of compounds that predispose them to causing ADRs. The models had high predictive accuracies (78.9-90.2%) for allergic, renal, CNS, and hepatic ADRs. We show the feasibility of predicting complex end-organ effects using simple models that involve no expensive computations and that can be used (i) in the selection of the compound during the drug discovery stage, (ii) to understand how drugs interact with the target organ systems, and (iii) for generating alerts in postmarketing drug surveillance and pharmacovigilance.

  18. Cardiovascular Dysautonomias Diagnosis Using Crisp and Fuzzy Decision Tree: A Comparative Study.

    Science.gov (United States)

    Kadi, Ilham; Idri, Ali

    2016-01-01

    Decision trees (DTs) are one of the most popular techniques for learning classification systems, especially when it comes to learning from discrete examples. In the real world, much data occurs in a fuzzy form. Hence a DT must be able to deal with such fuzzy data. In fact, integrating fuzzy logic when dealing with imprecise and uncertain data allows reducing uncertainty and providing the ability to model fine knowledge details. In this paper, a fuzzy decision tree (FDT) algorithm was applied to a dataset extracted from the ANS (Autonomic Nervous System) unit of the Moroccan university hospital Avicenne. This unit specializes in performing several dynamic tests to diagnose patients with autonomic disorders and to suggest the appropriate treatment for them. A set of fuzzy classifiers was generated using FID 3.4. The error rates of the generated FDTs were calculated to measure their performance. Moreover, a comparison between the error rates obtained using crisp DTs and FDTs was carried out and showed that the results of FDTs were better than those obtained using crisp DTs.

  19. Multi-output decision trees for lesion segmentation in multiple sclerosis

    Science.gov (United States)

    Jog, Amod; Carass, Aaron; Pham, Dzung L.; Prince, Jerry L.

    2015-03-01

    Multiple Sclerosis (MS) is a disease of the central nervous system in which the protective myelin sheath of the neurons is damaged. MS leads to the formation of lesions, predominantly in the white matter of the brain and the spinal cord. The number and volume of lesions visible in magnetic resonance (MR) imaging (MRI) are important criteria for diagnosing and tracking the progression of MS. Locating and delineating lesions manually requires the tedious and expensive efforts of highly trained raters. In this paper, we propose an automated algorithm to segment lesions in MR images using multi-output decision trees. We evaluated our algorithm on the publicly available MICCAI 2008 MS Lesion Segmentation Challenge training dataset of 20 subjects, and showed improved results in comparison to state-of-the-art methods. We also evaluated our algorithm on an in-house dataset of 49 subjects with a true positive rate of 0.41 and a positive predictive value of 0.36.

  20. Behaviour change in overweight and obese pregnancy: a decision tree to support the development of antenatal lifestyle interventions.

    Science.gov (United States)

    Ainscough, Kate M; Lindsay, Karen L; O'Sullivan, Elizabeth J; Gibney, Eileen R; McAuliffe, Fionnuala M

    2017-10-01

    Antenatal healthy lifestyle interventions are frequently implemented in overweight and obese pregnancy, yet there is inconsistent reporting of the behaviour-change methods and behavioural outcomes. This limits our understanding of how and why such interventions were successful or not. The current paper discusses the application of behaviour-change theories and techniques within complex lifestyle interventions in overweight and obese pregnancy. The authors propose a decision tree to help guide researchers through intervention design, implementation and evaluation. The implications for adopting behaviour-change theories and techniques, and using appropriate guidance when constructing and evaluating interventions in research and clinical practice are also discussed. To enhance the evidence base for successful behaviour-change interventions during pregnancy, adoption of behaviour-change theories and techniques, and use of published guidelines when designing lifestyle interventions are necessary. The proposed decision tree may be a useful guide for researchers working to develop effective behaviour-change interventions in clinical settings. This guide directs researchers towards key literature sources that will be important in each stage of study development.

  1. What Satisfies Students?: Mining Student-Opinion Data with Regression and Decision Tree Analysis

    Science.gov (United States)

    Thomas, Emily H.; Galambos, Nora

    2004-01-01

    To investigate how students' characteristics and experiences affect satisfaction, this study uses regression and decision tree analysis with the CHAID algorithm to analyze student-opinion data. A data mining approach identifies the specific aspects of students' university experience that most influence three measures of general satisfaction. The…

  2. Object-based methods for individual tree identification and tree species classification from high-spatial resolution imagery

    Science.gov (United States)

    Wang, Le

    2003-10-01

    Modern forest management poses an increasing need for detailed knowledge of forest information at different spatial scales. At the forest level, the information for tree species assemblage is desired whereas at or below the stand level, individual tree related information is preferred. Remote Sensing provides an effective tool to extract the above information at multiple spatial scales in the continuous time domain. To date, the increasing volume and ready availability of high-spatial-resolution data have led to a much wider application of remotely sensed products. Nevertheless, to make effective use of the improving spatial resolution, conventional pixel-based classification methods are far from satisfactory. Correspondingly, developing object-based methods becomes a central challenge for researchers in the field of Remote Sensing. This thesis focuses on the development of methods for accurate individual tree identification and tree species classification. We develop a method in which individual tree crown boundaries and treetop locations are derived under a unified framework. We apply a two-stage approach with edge detection followed by marker-controlled watershed segmentation. Treetops are modeled from radiometry and geometry aspects. Specifically, treetops are assumed to be represented by local radiation maxima and to be located near the center of the tree-crown. As a result, a marker image was created from the derived treetop to guide a watershed segmentation to further differentiate overlapping trees and to produce a segmented image comprised of individual tree crowns. The image segmentation method developed achieves a promising result for a 256 x 256 CASI image. Then further effort is made to extend our methods to the multiscales which are constructed from a wavelet decomposition. A scale consistency and geometric consistency are designed to examine the gradients along the scale-space for the purpose of separating true crown boundary from unwanted

  3. Computational Prediction of Blood-Brain Barrier Permeability Using Decision Tree Induction

    Directory of Open Access Journals (Sweden)

    Jörg Huwyler

    2012-08-01

    Full Text Available Predicting blood-brain barrier (BBB) permeability is essential to drug development, as a molecule cannot exhibit pharmacological activity within the brain parenchyma without first transiting this barrier. Understanding the process of permeation, however, is complicated by a combination of both limited passive diffusion and active transport. Our aim here was to establish predictive models for BBB drug permeation that include both active and passive transport. A database of 153 compounds was compiled using in vivo surface permeability product (logPS) values in rats as a quantitative parameter for BBB permeability. The open source Chemical Development Kit (CDK) was used to calculate physico-chemical properties and descriptors. Predictive computational models were implemented by machine learning paradigms (decision tree induction) on both descriptor sets. Models with a corrected classification rate (CCR) of 90% were established. Mechanistic insight into BBB transport was provided by an Ant Colony Optimization (ACO)-based binary classifier analysis to identify the most predictive chemical substructures. Decision trees revealed descriptors of lipophilicity (aLogP) and charge (polar surface area), which were also previously described in models of passive diffusion. However, measures of molecular geometry and connectivity were found to be related to an active drug transport component.

  4. Method for Walking Gait Identification in a Lower Extremity Exoskeleton Based on C4.5 Decision Tree Algorithm

    Directory of Open Access Journals (Sweden)

    Qing Guo

    2015-04-01

    Full Text Available A gait identification method for a lower extremity exoskeleton is presented in order to identify the gait sub-phases in human-machine coordinated motion. First, a sensor layout for the exoskeleton is introduced. Taking the difference between human lower limb motion and human-machine coordinated motion into account, the walking gait is divided into five sub-phases, which are ‘double standing’, ‘right leg swing and left leg stance’, ‘double stance with right leg front and left leg back’, ‘right leg stance and left leg swing’, and ‘double stance with left leg front and right leg back’. The sensors include shoe pressure sensors, knee encoders, and thigh and calf gyroscopes, and are used to measure the contact force of the foot, and the knee joint angle and its angular velocity. Then, five sub-phases of walking gait are identified by a C4.5 decision tree algorithm according to the data fusion of the sensors' information. Based on the simulation results for the gait division, identification accuracy can be guaranteed by the proposed algorithm. Through the exoskeleton control experiment, a division of five sub-phases for the human-machine coordinated walk is proposed. The experimental results verify this gait division and identification method. They can make hydraulic cylinders retract ahead of time and improve the maximal walking velocity when the exoskeleton follows the person's motion.
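
    C4.5, the algorithm named in this abstract, chooses each split by gain ratio: the information gain of a feature divided by the entropy of the partition it induces. The sketch below shows that criterion on discretised readings; the feature values and gait labels are invented, and the handling of continuous sensor values (which C4.5 splits by threshold) is omitted.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of the value distribution in a list."""
    total = len(items)
    return -sum((c / total) * math.log2(c / total) for c in Counter(items).values())


def gain_ratio(feature_values, labels):
    """C4.5 split criterion: information gain divided by split information."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Expected label entropy after partitioning on the feature.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalises features that fragment the data into many groups.
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical discretised readings and gait sub-phase labels.
phases = ["stance", "stance", "swing", "swing"]
foot_pressure = ["on", "on", "off", "off"]      # separates the phases perfectly
unrelated_flag = ["a", "b", "a", "b"]           # carries no information

print(gain_ratio(foot_pressure, phases), gain_ratio(unrelated_flag, phases))
```

    A tree builder would evaluate this ratio for every candidate feature at each node and split on the highest-scoring one.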

  5. New weighting methods for phylogenetic tree reconstruction using multiple loci.

    Science.gov (United States)

    Misawa, Kazuharu; Tajima, Fumio

    2012-08-01

    Efficient determination of evolutionary distances is important for the correct reconstruction of phylogenetic trees. The performance of the pooled distance required for reconstructing a phylogenetic tree can be improved by applying large weights to appropriate distances for reconstructing phylogenetic trees and small weights to inappropriate distances. We developed two weighting methods, the modified Tajima-Takezaki method and the modified least-squares method, for reconstructing phylogenetic trees from multiple loci. By computer simulations, we found that both of the new methods were more efficient in reconstructing correct topologies than the no-weight method. Hence, we reconstructed hominoid phylogenetic trees from mitochondrial DNA using our new methods, and found that the levels of bootstrap support were significantly increased by the modified Tajima-Takezaki and by the modified least-squares method.

  6. Review of cause-based decision tree approach for the development of domestic standard human reliability analysis procedure in low power/shutdown operation probabilistic safety assessment

    International Nuclear Information System (INIS)

    Kang, D. I.; Jung, W. D.

    2003-01-01

    We review the Cause-Based Decision Tree (CBDT) approach to decide whether to incorporate it into the development of a domestic standard Human Reliability Analysis (HRA) procedure for low power/shutdown operation Probabilistic Safety Assessment (PSA). In this paper, we introduce the cause-based decision tree approach, quantify human errors using it, and identify its merits and demerits in comparison with the previously used THERP. The review results show that it is difficult to incorporate the CBDT method into the development of the domestic standard HRA procedure in low power/shutdown PSA because the CBDT method, like THERP, requires the subjective judgment of the HRA analyst. However, it is expected that incorporating the CBDT method into the development of the domestic standard HRA procedure only for the comparison of quantitative HRA results will relieve the burden of developing a detailed HRA procedure and will help maintain consistent quantitative HRA results.

  7. Extraction of airway trees using multiple hypothesis tracking and template matching

    DEFF Research Database (Denmark)

    Raghavendra, Selvan; Petersen, Jens; Pedersen, Jesper Johannes Holst

    2016-01-01

    Knowledge of airway tree morphology has important clinical applications in diagnosis of chronic obstructive pulmonary disease. We present an automatic tree extraction method based on multiple hypothesis tracking and template matching for this purpose and evaluate its performance on chest CT images... used in constructing a multiple hypothesis tree, which is then traversed to reach decisions. The proposed modifications remove the need for local thresholding of hypotheses as decisions are made entirely based on statistical comparisons involving the hypothesis tree. The results show improvements...

  8. The risk of disabling, surgery and reoperation in Crohn's disease - A decision tree-based approach to prognosis.

    Science.gov (United States)

    Dias, Cláudia Camila; Pereira Rodrigues, Pedro; Fernandes, Samuel; Portela, Francisco; Ministro, Paula; Martins, Diana; Sousa, Paula; Lago, Paula; Rosa, Isadora; Correia, Luis; Moura Santos, Paula; Magro, Fernando

    2017-01-01

    Crohn's disease (CD) is a chronic inflammatory bowel disease known to carry a high risk of disabling and often requiring surgical interventions. This article describes a decision-tree based approach that defines the CD patients' risk of undergoing disabling events, surgical interventions and reoperations, based on clinical and demographic variables. This multicentric study involved 1547 CD patients retrospectively enrolled and divided into two cohorts: a derivation one (80%) and a validation one (20%). Decision trees were built upon applying the CHAIRT algorithm for the selection of variables. Three-level decision trees were built for the risk of disabling and reoperation, whereas the risk of surgery was described in a two-level one. A receiver operating characteristic (ROC) analysis was performed, and the area under the curves (AUC) was higher than 70% for all outcomes. The defined risk cut-off values show usefulness for the assessed outcomes: risk levels above 75% for disabling had an odds test positivity of 4.06 [3.50-4.71], whereas risk levels below 34% and 19% excluded surgery and reoperation with an odds test negativity of 0.15 [0.09-0.25] and 0.50 [0.24-1.01], respectively. Overall, patients with B2 or B3 phenotype had a higher proportion of disabling disease and surgery, while patients with later introduction of pharmacological therapeutic (1 month after initial surgery) had a higher proportion of reoperation. The decision-tree based approach used in this study, with demographic and clinical variables, has been shown to be a valid and useful approach to depict such risks of disabling, surgery and reoperation.

  9. A multivariate decision tree analysis of biophysical factors in tropical forest fire occurrence

    Science.gov (United States)

    Rey S. Ofren; Edward Harvey

    2000-01-01

    A multivariate decision tree model was used to quantify the relative importance of complex hierarchical relationships between biophysical variables and the occurrence of tropical forest fires. The study site is the Huai Kha Kbaeng wildlife sanctuary, a World Heritage Site in northwestern Thailand where annual fires are common and particularly destructive. Thematic...

  10. The risk factors of laryngeal pathology in Korean adults using a decision tree model.

    Science.gov (United States)

    Byeon, Haewon

    2015-01-01

    The purpose of this study was to identify risk factors affecting laryngeal pathology in the Korean population and to evaluate the derived prediction model. Cross-sectional study. Data were drawn from the 2008 Korea National Health and Nutritional Examination Survey. The subjects were 3135 persons (1508 male and 2114 female) aged 19 years and older living in the community. The independent variables were age, sex, occupation, smoking, alcohol drinking, and self-reported voice problems. A decision tree analysis was done to identify risk factors for predicting a model of laryngeal pathology. The significant risk factors of laryngeal pathology were age, gender, occupation, smoking, and self-reported voice problem in decision tree model. Four significant paths were identified in the decision tree model for the prediction of laryngeal pathology. Those identified as high risk groups for laryngeal pathology included those who self-reported a voice problem, those who were males in their 50s who did not recognize a voice problem, those who were not economically active males in their 40s, and male workers aged 19 and over and under 50 or 60 and over who currently smoked. The results of this study suggest that individual risk factors, such as age, sex, occupation, health behavior, and self-reported voice problem, affect the onset of laryngeal pathology in a complex manner. Based on the results of this study, early management of the high-risk groups is needed for the prevention of laryngeal pathology. Copyright © 2015 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  11. Identification of Water Bodies in a Landsat 8 OLI Image Using a J48 Decision Tree.

    Science.gov (United States)

    Acharya, Tri Dev; Lee, Dong Ha; Yang, In Tae; Lee, Jae Kang

    2016-07-12

    Water bodies are essential to humans and other forms of life. Identification of water bodies can be useful in various ways, including estimation of water availability, demarcation of flooded regions, change detection, and so on. In past decades, Landsat satellite sensors have been used for land use classification and water body identification. Due to the introduction of a New Operational Land Imager (OLI) sensor on Landsat 8 with a high spectral resolution and improved signal-to-noise ratio, the quality of imagery sensed by Landsat 8 has improved, enabling better characterization of land cover and increased data size. Therefore, it is necessary to explore the most appropriate and practical water identification methods that take advantage of the improved image quality and use the fewest inputs based on the original OLI bands. The objective of the study is to explore the potential of a J48 decision tree (JDT) in identifying water bodies using reflectance bands from Landsat 8 OLI imagery. J48 is an open-source decision tree. The test site for the study is in the Northern Han River Basin, which is located in Gangwon province, Korea. Training data with individual bands were used to develop the JDT model and later applied to the whole study area. The performance of the model was statistically analysed using the kappa statistic and area under the curve (AUC). The results were compared with five other known water identification methods using a confusion matrix and related statistics. Almost all the methods showed high accuracy, and the JDT was successfully applied to the OLI image using only four bands, where the new additional deep blue band of OLI was found to have the third highest information gain. Thus, the JDT can be a good method for water body identification based on images with improved resolution and increased size.
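
    The band ranking by information gain mentioned in this abstract can be illustrated with a minimal sketch. The reflectance values, the class labels and the threshold below are hypothetical and are not taken from the study. Note also that C4.5-style learners such as J48 normalise this quantity into a gain ratio when choosing splits; the sketch shows plain information gain only.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction obtained by splitting one band at `threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    # Weighted entropy of the two partitions (empty partitions are skipped).
    remainder = sum(len(part) / n * entropy(part)
                    for part in (left, right) if part)
    return entropy(labels) - remainder


# Hypothetical near-infrared reflectance samples (illustration only).
nir = [0.02, 0.03, 0.04, 0.25, 0.30, 0.35]
labels = ["water", "water", "water", "land", "land", "land"]

good_split = information_gain(nir, labels, threshold=0.10)
poor_split = information_gain(nir, labels, threshold=0.025)
```

    In this toy data the split at 0.10 separates the two classes completely, so it yields a higher gain than the split at 0.025.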

  12. USING DECISION TREES FOR ESTIMATING MODE CHOICE OF TRIPS IN BUCA-IZMIR

    Directory of Open Access Journals (Sweden)

    L. O. Oral

    2013-05-01

    Decision makers develop transportation plans and models for providing sustainable transport systems in urban areas. Mode Choice is one of the stages in transportation modelling. Data mining techniques can discover factors affecting the mode choice. These techniques can be applied with knowledge process approach. In this study a data mining process model is applied to determine the factors affecting the mode choice with decision trees techniques by considering individual trip behaviours from household survey data collected within Izmir Transportation Master Plan. From this perspective transport mode choice problem is solved on a case in district of Buca-Izmir, Turkey with CRISP-DM knowledge process model.

  13. Using Decision Trees for Estimating Mode Choice of Trips in Buca-Izmir

    Science.gov (United States)

    Oral, L. O.; Tecim, V.

    2013-05-01

    Decision makers develop transportation plans and models for providing sustainable transport systems in urban areas. Mode Choice is one of the stages in transportation modelling. Data mining techniques can discover factors affecting the mode choice. These techniques can be applied with knowledge process approach. In this study a data mining process model is applied to determine the factors affecting the mode choice with decision trees techniques by considering individual trip behaviours from household survey data collected within Izmir Transportation Master Plan. From this perspective transport mode choice problem is solved on a case in district of Buca-Izmir, Turkey with CRISP-DM knowledge process model.

  14. Prediction of axillary lymph node metastasis in primary breast cancer patients using a decision tree-based model

    Directory of Open Access Journals (Sweden)

    Takada Masahiro

    2012-06-01

    Background: The aim of this study was to develop a new data-mining model to predict axillary lymph node (AxLN) metastasis in primary breast cancer. To achieve this, we used a decision tree-based prediction method, the alternating decision tree (ADTree). Methods: Clinical datasets for primary breast cancer patients who underwent sentinel lymph node biopsy or AxLN dissection without prior treatment were collected from three institutes (institute A, n = 148; institute B, n = 143; institute C, n = 174) and were used for variable selection, model training and external validation, respectively. The models were evaluated using area under the receiver operating characteristics (ROC) curve analysis to discriminate node-positive patients from node-negative patients. Results: The ADTree model selected 15 of 24 clinicopathological variables in the variable selection dataset. The resulting area under the ROC curve values were 0.770 [95% confidence interval (CI), 0.689–0.850] for the model training dataset and 0.772 (95% CI: 0.689–0.856) for the validation dataset, demonstrating high accuracy and generalization ability of the model. The bootstrap value of the validation dataset was 0.768 (95% CI: 0.763–0.774). Conclusions: Our prediction model showed high accuracy for predicting nodal metastasis in patients with breast cancer using commonly recorded clinical variables. Therefore, our model might help oncologists in the decision-making process for primary breast cancer patients before starting treatment.

  15. A practical method for accurate quantification of large fault trees

    International Nuclear Information System (INIS)

    Choi, Jong Soo; Cho, Nam Zin

    2007-01-01

    This paper describes a practical method to accurately quantify top event probability and importance measures from incomplete minimal cut sets (MCS) of a large fault tree. The MCS-based fault tree method is extensively used in probabilistic safety assessments. Several sources of uncertainties exist in MCS-based fault tree analysis. The paper is focused on quantification of the following two sources of uncertainties: (1) the truncation neglecting low-probability cut sets and (2) the approximation in quantifying MCSs. The method proposed in this paper is based on a Monte Carlo simulation technique to estimate probability of the discarded MCSs and the sum of disjoint products (SDP) approach complemented by the correction factor approach (CFA). The method provides capability to accurately quantify the two uncertainties and estimate the top event probability and importance measures of large coherent fault trees. The proposed fault tree quantification method has been implemented in the CUTREE code package and is tested on the two example fault trees
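
    To make the approximation issue in MCS quantification concrete, the sketch below compares two standard textbook estimates of top event probability (the rare-event approximation and the min cut upper bound) with a plain Monte Carlo estimate. It is not the SDP/CFA method of the paper or the CUTREE code; the basic events, their probabilities and the cut sets are hypothetical, and independent basic events are assumed.

```python
import math
import random


def cut_set_probability(cut_set, p):
    """Probability that every basic event in one cut set occurs (independence assumed)."""
    return math.prod(p[event] for event in cut_set)


def rare_event_approximation(cut_sets, p):
    """Sum of cut set probabilities; overestimates when cut sets overlap or are likely."""
    return sum(cut_set_probability(cs, p) for cs in cut_sets)


def min_cut_upper_bound(cut_sets, p):
    """1 - prod(1 - P(MCS)); exact when the cut sets share no basic events."""
    return 1.0 - math.prod(1.0 - cut_set_probability(cs, p) for cs in cut_sets)


def monte_carlo_top_event(cut_sets, p, trials=100_000, seed=0):
    """Sample basic event states and count trials in which some cut set fully fails."""
    rng = random.Random(seed)
    events = sorted(p)
    hits = 0
    for _ in range(trials):
        failed = {e for e in events if rng.random() < p[e]}
        if any(cs <= failed for cs in cut_sets):
            hits += 1
    return hits / trials


# Hypothetical fault tree: TOP = (A AND B) OR C.
p = {"A": 0.1, "B": 0.2, "C": 0.3}
cut_sets = [frozenset({"A", "B"}), frozenset({"C"})]

rea = rare_event_approximation(cut_sets, p)
mcub = min_cut_upper_bound(cut_sets, p)
mc = monte_carlo_top_event(cut_sets, p)
```

    With these deliberately large probabilities the rare-event sum exceeds the min cut upper bound, which is the kind of quantification error the abstract refers to.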

  16. The risk evaluation of difficult substances in USES 2.0 and EUSES. A decision tree for data gap filling of Kow, Koc and BCF

    NARCIS (Netherlands)

    Beelen P van; ECO

    2000-01-01

    This report presents a decision tree for the risk evaluation of the so-called "difficult" substances with the Uniform System for the Evaluation of Substances (USES). The decision tree gives practical guidelines for the regulatory authorities to evaluate notified substances like organometallic

  17. Penerapan Algoritma Decision Tree C4.5 Untuk Penilaian Rumah Tinggal

    OpenAIRE

    Setiadi, Budi

    2015-01-01

    Errors can still occur when homes are appraised as a reference value for credit, which opens opportunities for NPL. An assessment method (predictive value) that is proportional, credible and accurate is therefore needed. Inaccurate predictions lead to improper credit management planning. Predicting the value of collateral houses has attracted the interest of many researchers because of its importance both theoretically and empirically. Namely C4.5 decision tree algorithm, CART and CHAID that c...

  18. Grading of Parameters for Urban Tree Inventories by City Officials, Arborists, and Academics Using the Delphi Method

    Science.gov (United States)

    Östberg, Johan; Delshammar, Tim; Wiström, Björn; Nielsen, Anders Busse

    2013-03-01

    Tree inventories are expensive to conduct and update, so every inventory carried out must be maximized. However, increasing the number of constituent parameters increases the cost of performing and updating the inventory, illustrating the need for careful parameter selection. This article reports the results of a systematic expert rating of tree inventories aiming to quantify the relative importance of each parameter. Using the Delphi method, panels comprising city officials, arborists, and academics rated a total of 148 parameters. Based on the total mean score, the top-ranking parameters, which can serve as a guide for decision-making at practical level and for standardization of tree inventories, were: Scientific name of the tree species and genera, Vitality, Coordinates, Hazard class, and Identification number. The study also examined whether the different responsibilities and usage of urban tree databases among organizations and people engaged in urban tree inventories affected their prioritization. The results revealed noticeable dissimilarities in the ranking of parameters between the panels, underlining the need for collaboration between the research community and those commissioning, administrating, and conducting inventories. Only by applying such a transdisciplinary approach to parameter selection can urban tree inventories be strengthened and made more relevant.

  19. The use of decision trees and naïve Bayes algorithms and trace element patterns for controlling the authenticity of free-range-pastured hens' eggs.

    Science.gov (United States)

    Barbosa, Rommel Melgaço; Nacano, Letícia Ramos; Freitas, Rodolfo; Batista, Bruno Lemos; Barbosa, Fernando

    2014-09-01

    This article aims to evaluate 2 machine learning algorithms, decision trees and naïve Bayes (NB), for egg classification (free-range eggs compared with battery eggs). The database used for the study consisted of 15 chemical elements (As, Ba, Cd, Co, Cs, Cu, Fe, Mg, Mn, Mo, Pb, Se, Sr, V, and Zn) determined in 52 egg samples (20 free-range and 32 battery eggs) by inductively coupled plasma mass spectrometry. Our results demonstrated that decision trees and NB associated with the mineral contents of eggs provide a high level of accuracy (above 80% and 90%, respectively) for classification between free-range and battery eggs and can be used as an alternative method for adulteration evaluation. © 2014 Institute of Food Technologists®

  20. Using decision trees to manage hospital readmission risk for acute myocardial infarction, heart failure, and pneumonia.

    Science.gov (United States)

    Hilbert, John P; Zasadil, Scott; Keyser, Donna J; Peele, Pamela B

    2014-12-01

    To improve healthcare quality and reduce costs, the Affordable Care Act places hospitals at financial risk for excessive readmissions associated with acute myocardial infarction (AMI), heart failure (HF), and pneumonia (PN). Although predictive analytics is increasingly looked to as a means for measuring, comparing, and managing this risk, many modeling tools require data inputs that are not readily available and/or additional resources to yield actionable information. This article demonstrates how hospitals and clinicians can use their own structured discharge data to create decision trees that produce highly transparent, clinically relevant decision rules for better managing readmission risk associated with AMI, HF, and PN. For illustrative purposes, basic decision trees are trained and tested using publicly available data from the California State Inpatient Databases and an open-source statistical package. As expected, these simple models perform less well than other more sophisticated tools, with areas under the receiver operating characteristic (ROC) curve (or AUC) of 0.612, 0.583, and 0.650, respectively, but achieve a lift of 1.5 or greater for higher-risk patients with any of the three conditions. More importantly, they are shown to offer substantial advantages in terms of transparency and interpretability, comprehensiveness, and adaptability. By enabling hospitals and clinicians to identify important factors associated with readmissions, target subgroups of patients at both high and low risk, and design and implement interventions that are appropriate to the risk levels observed, decision trees serve as an ideal application for addressing the challenge of reducing hospital readmissions.

  1. Least Squares Methods for Equidistant Tree Reconstruction

    OpenAIRE

    Fahey, Conor; Hosten, Serkan; Krieger, Nathan; Timpe, Leslie

    2008-01-01

    UPGMA is a heuristic method identifying the least squares equidistant phylogenetic tree given empirical distance data among $n$ taxa. We study this classic algorithm using the geometry of the space of all equidistant trees with $n$ leaves, also known as the Bergman complex of the graphical matroid for the complete graph $K_n$. We show that UPGMA performs an orthogonal projection of the data onto a maximal cell of the Bergman complex. We also show that the equidistant tree with the least (Eucl...

  2. Spatial distribution of block falls using volumetric GIS-decision-tree models

    Science.gov (United States)

    Abdallah, C.

    2010-10-01

    Block falls are considered a significant aspect of surficial instability contributing to losses in land and socio-economic aspects through their damaging effects to natural and human environments. This paper predicts and maps the geographic distribution and volumes of block falls in central Lebanon using remote sensing, geographic information systems (GIS) and decision-tree modeling (un-pruned and pruned trees). Eleven terrain parameters (lithology, proximity to fault line, karst type, soil type, distance to drainage line, elevation, slope gradient, slope aspect, slope curvature, land cover/use, and proximity to roads) were generated to statistically explain the occurrence of block falls. The latter were discriminated using SPOT4 satellite imagery, and their dimensions were determined during field surveys. The un-pruned tree model based on all considered parameters explained 86% of the variability in field block fall measurements. Once pruned, it classifies 50% of the block falls' volumes by selecting just four parameters (lithology, slope gradient, soil type, and land cover/use). Both tree models (un-pruned and pruned) were converted to quantitative 1:50,000 block falls' maps with different classes, ranging from Nil (no block falls) to more than 4000 m³. These maps match fairly well, with a coincidence value of 45%; however, both can be used to prioritize the choice of specific zones for further measurement and modeling, as well as for land-use management. The proposed tree models are relatively simple, and may also be applied to other areas (i.e. the choice of un-pruned or pruned model is related to the availability of terrain parameters in a given area).

  3. Dynamic Security Assessment of Danish Power System Based on Decision Trees: Today and Tomorrow

    DEFF Research Database (Denmark)

    Rather, Zakir Hussain; Liu, Leo; Chen, Zhe

    2013-01-01

    The research work presented in this paper analyzes the impact of wind energy, phasing out of central power plants and cross border power exchange on dynamic security of Danish Power System. Contingency based decision tree (DT) approach is used to assess the dynamic security of present and future...

  4. Computer-oriented approach to fault-tree construction

    International Nuclear Information System (INIS)

    Salem, S.L.; Apostolakis, G.E.; Okrent, D.

    1976-11-01

    A methodology for systematically constructing fault trees for general complex systems is developed and applied, via the Computer Automated Tree (CAT) program, to several systems. A means of representing component behavior by decision tables is presented. The method developed allows the modeling of components with various combinations of electrical, fluid and mechanical inputs and outputs. Each component can have multiple internal failure mechanisms which combine with the states of the inputs to produce the appropriate output states. The generality of this approach allows not only the modeling of hardware, but human actions and interactions as well. A procedure for constructing and editing fault trees, either manually or by computer, is described. The techniques employed result in a complete fault tree, in standard form, suitable for analysis by current computer codes. Methods of describing the system, defining boundary conditions and specifying complex TOP events are developed in order to set up the initial configuration for which the fault tree is to be constructed. The approach used allows rapid modifications of the decision tables and systems to facilitate the analysis and comparison of various refinements and changes in the system configuration and component modeling

  5. Car allocation between household heads in car deficient households : A decision model

    NARCIS (Netherlands)

    Anggraini, Renni; Arentze, Theo A.; Timmermans, Harry J P

    2008-01-01

    This paper considers car allocation choice behaviour in car-deficient households explicitly in the context of an activity-scheduling process, focusing on work activities. A decision tree induction method is applied to derive a decision tree for the car allocation decision in automobile deficient

  6. A Maze Game on Android Using Growing Tree Method

    Science.gov (United States)

    Hendrawan, Y. F.

    2018-01-01

    A maze is a type of puzzle game where a player moves in complex and branched passages to find a particular target or location. One method to create a maze is the Growing Tree method. The method creates a tree that has branches which are the paths of a maze. This research explored three types of Growing Tree method implementations for maze generation on Android mobile devices. The layouts produced could be played in first and third-person perspectives. The experiment results showed that it took 17.3 seconds on average to generate 20 cells x 20 cells dynamic maze layouts.
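
    The Growing Tree method described in this abstract can be sketched as follows. This is a generic version of the algorithm, not the Android implementation from the paper. The abstract does not say which three variants were studied, so the two cell-selection rules offered here (newest cell, random cell) are assumptions, as are all function and parameter names.

```python
import random


def growing_tree_maze(width, height, pick_newest=True, seed=None):
    """Return the set of carved passages, each a frozenset of two adjacent cells."""
    rng = random.Random(seed)
    start = (rng.randrange(width), rng.randrange(height))
    visited = {start}
    active = [start]      # cells that may still have unvisited neighbours
    passages = set()
    while active:
        # Selection rule: newest cell (long corridors) or a random cell (more branching).
        index = len(active) - 1 if pick_newest else rng.randrange(len(active))
        x, y = active[index]
        neighbours = [
            (nx, ny)
            for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1))
            if 0 <= nx < width and 0 <= ny < height and (nx, ny) not in visited
        ]
        if neighbours:
            nxt = rng.choice(neighbours)
            passages.add(frozenset(((x, y), nxt)))  # carve a passage to the new cell
            visited.add(nxt)
            active.append(nxt)
        else:
            active.pop(index)  # dead end: this cell cannot grow the tree further
    return passages


maze_newest = growing_tree_maze(20, 20, pick_newest=True, seed=1)
maze_random = growing_tree_maze(20, 20, pick_newest=False, seed=1)
```

    Because the passages form a spanning tree of the grid, a 20 x 20 layout always contains 399 passages and exactly one route between any two cells, whichever selection rule is used.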

  7. A comparative assessment of decision trees algorithms for flash flood susceptibility modeling at Haraz watershed, northern Iran.

    Science.gov (United States)

    Khosravi, Khabat; Pham, Binh Thai; Chapi, Kamran; Shirzadi, Ataollah; Shahabi, Himan; Revhaug, Inge; Prakash, Indra; Tien Bui, Dieu

    2018-06-15

    Floods are one of the most damaging natural hazards causing huge loss of property, infrastructure and lives. Prediction of occurrence of flash flood locations is very difficult due to sudden change in climatic condition and manmade factors. However, prior identification of flood susceptible areas can be done with the help of machine learning techniques for proper timely management of flood hazards. In this study, we tested four decision tree-based machine learning models, namely Logistic Model Trees (LMT), Reduced Error Pruning Trees (REPT), Naïve Bayes Trees (NBT), and Alternating Decision Trees (ADT), for flash flood susceptibility mapping at the Haraz Watershed in the northern part of Iran. For this, a spatial database was constructed with 201 present and past flood locations and eleven flood-influencing factors, namely ground slope, altitude, curvature, Stream Power Index (SPI), Topographic Wetness Index (TWI), land use, rainfall, river density, distance from river, lithology, and Normalized Difference Vegetation Index (NDVI). Statistical evaluation measures, the Receiver Operating Characteristic (ROC) curve, and Friedman and Wilcoxon signed-rank tests were used to validate and compare the prediction capability of the models. Results show that the ADT model has the highest prediction capability for flash flood susceptibility assessment, followed by the NBT, the LMT, and the REPT, respectively. These techniques have proven successful in quickly determining flood susceptible areas. Copyright © 2018 Elsevier B.V. All rights reserved.

  8. The Efficacy of Consensus Tree Methods for Summarizing Phylogenetic Relationships from a Posterior Sample of Trees Estimated from Morphological Data.

    Science.gov (United States)

    O'Reilly, Joseph E; Donoghue, Philip C J

    2018-03-01

    Consensus trees are required to summarize trees obtained through MCMC sampling of a posterior distribution, providing an overview of the distribution of estimated parameters such as topology, branch lengths, and divergence times. Numerous consensus tree construction methods are available, each presenting a different interpretation of the tree sample. The rise of morphological clock and sampled-ancestor methods of divergence time estimation, in which times and topology are coestimated, has increased the popularity of the maximum clade credibility (MCC) consensus tree method. The MCC method assumes that the sampled, fully resolved topology with the highest clade credibility is an adequate summary of the most probable clades, with parameter estimates from compatible sampled trees used to obtain the marginal distributions of parameters such as clade ages and branch lengths. Using both simulated and empirical data, we demonstrate that MCC trees, and trees constructed using the similar maximum a posteriori (MAP) method, often include poorly supported and incorrect clades when summarizing diffuse posterior samples of trees. We demonstrate that the paucity of information in morphological data sets contributes to the inability of MCC and MAP trees to accurately summarise the posterior distribution. Conversely, majority-rule consensus (MRC) trees represent a lower proportion of incorrect nodes when summarizing the same posterior samples of trees. Thus, we advocate the use of MRC trees, in place of MCC or MAP trees, in attempts to summarize the results of Bayesian phylogenetic analyses of morphological data.
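
    The majority-rule idea can be shown with a minimal sketch: a clade is kept only if it appears in more than half of the sampled trees. Each tree is represented here simply as a set of clades (each clade a frozenset of taxon names), which is an assumed simplification; the toy sample of three trees is hypothetical and real analyses would parse trees from MCMC output.

```python
from collections import Counter


def majority_rule_clades(trees, threshold=0.5):
    """Map each clade found in more than `threshold` of the trees to its frequency."""
    counts = Counter(clade for tree in trees for clade in tree)
    n = len(trees)
    return {clade: count / n
            for clade, count in counts.items()
            if count / n > threshold}


# Hypothetical posterior sample of three trees over the taxa A, B, C and D.
sample = [
    {frozenset("AB"), frozenset("ABC")},
    {frozenset("AB"), frozenset("ABD")},
    {frozenset("AC"), frozenset("ABC")},
]

consensus = majority_rule_clades(sample)
```

    In this sample the clades {A, B} and {A, B, C} each occur in two of three trees and are retained, while {A, B, D} and {A, C} fall below the 50% threshold and are collapsed.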

  9. A decision-making framework for protecting process plants from flooding based on fault tree analysis

    International Nuclear Information System (INIS)

    Hauptmanns, Ulrich

    2010-01-01

    The protection of process plants from external events is mandatory in the Seveso Directive. Among these events figures the possibility of inundation of a plant, which may cause a hazard by disabling technical components and obviating operator interventions. A methodological framework for dealing with hazards from potential flooding events is presented. It combines an extension of the fault tree method with generic properties of flooding events in rivers and of dikes, which should be adapted to site-specific characteristics in a concrete case. Thus, a rational basis for deciding whether upgrading is required or not and which of the components should be upgraded is provided. Both the deterministic and the probabilistic approaches are compared. Preference is given to the probabilistic one. The conclusions drawn naturally depend on the scope and detail of the model calculations and the decision criterion adopted. The latter has to be supplied from outside the analysis, e.g. by the analyst himself, the plant operator or the competent authority. It turns out that decision-making is only viable if the boundary conditions for both the procedure of analysis and the decision criterion are clear.

  10. A web-based decision support system to enhance IPM programs in Washington tree fruit.

    Science.gov (United States)

    Jones, Vincent P; Brunner, Jay F; Grove, Gary G; Petit, Brad; Tangren, Gerald V; Jones, Wendy E

    2010-06-01

    Integrated pest management (IPM) decision-making has become more information intensive in Washington State tree crops in response to changes in pesticide availability, the development of new control tactics (such as mating disruption) and the development of new information on pest and natural enemy biology. The time-sensitive nature of the information means that growers must have constant access to a single source of verified information to guide management decisions. The authors developed a decision support system for Washington tree fruit growers that integrates environmental data [140 Washington State University (WSU) stations plus weather forecasts from NOAA], model predictions (ten insects, four diseases and a horticultural model), management recommendations triggered by model status and a pesticide database that provides information on non-target impacts on other pests and natural enemies. A user survey in 2008 found that the user base was providing recommendations for most of the orchards and acreage in the state, and that users estimated the value at $16 million per year. The design of the system facilitates education on a range of time-sensitive topics and will make it easy to incorporate other models, new management recommendations or information from new sensors as they are developed.

  11. Determinants of farmers' tree-planting investment decisions as a degraded landscape management strategy in the central highlands of Ethiopia

    Science.gov (United States)

    Gessesse, Berhan; Bewket, Woldeamlak; Bräuning, Achim

    2016-04-01

    Land degradation due to lack of sustainable land management practices is one of the critical challenges in many developing countries including Ethiopia. This study explored the major determinants of farm-level tree-planting decisions as a land management strategy in a typical farming and degraded landscape of the Modjo watershed, Ethiopia. The main data were generated from household surveys and analysed using descriptive statistics and a binary logistic regression model. The model significantly predicted farmers' tree-planting decisions (χ² = 37.29, df = 15, P < ...) ... labour force availability, the disparity of schooling age, level of perception of the process of deforestation and the current land tenure system had a critical influence on tree-growing investment decisions in the study watershed. Eventually, the processes of land-use conversion and land degradation were serious, which in turn have had adverse effects on agricultural productivity, local food security and poverty trap nexus. Hence, the study recommended that devising and implementing sustainable land management policy options would enhance ecological restoration and livelihood sustainability in the study watershed.

  12. Determinants of farmers' tree planting investment decision as a degraded landscape management strategy in the central highlands of Ethiopia

    Science.gov (United States)

    Gessesse, B.; Bewket, W.; Bräuning, A.

    2015-11-01

    Land degradation due to lack of sustainable land management practices is one of the critical challenges in many developing countries including Ethiopia. This study explores the major determinants of farm-level tree planting decisions as a land management strategy in a typical farming and degraded landscape of the Modjo watershed, Ethiopia. The main data were generated from household surveys and analysed using descriptive statistics and a binary logistic regression model. The model significantly predicted farmers' tree planting decisions (Chi-square = 37.29, df = 15, P labour force availability, the disparity of schooling age, level of perception of the process of deforestation and the current land tenure system have a positive and significant influence on tree growing investment decisions in the study watershed. Eventually, the processes of land use conversion and land degradation are serious, which in turn have had adverse effects on agricultural productivity, local food security and poverty trap nexus. Hence, devising sustainable and integrated land management policy options and implementing them would enhance ecological restoration and livelihood sustainability in the study watershed.

  13. Learning machines and sleeping brains: Automatic sleep stage classification using decision-tree multi-class support vector machines.

    Science.gov (United States)

    Lajnef, Tarek; Chaibi, Sahbi; Ruby, Perrine; Aguera, Pierre-Emmanuel; Eichenlaub, Jean-Baptiste; Samet, Mounir; Kachouri, Abdennaceur; Jerbi, Karim

    2015-07-30

    Sleep staging is a critical step in a range of electrophysiological signal processing pipelines used in clinical routine as well as in sleep research. Although the results currently achievable with automatic sleep staging methods are promising, there is need for improvement, especially given the time-consuming and tedious nature of visual sleep scoring. Here we propose a sleep staging framework that consists of a multi-class support vector machine (SVM) classification based on a decision tree approach. The performance of the method was evaluated using polysomnographic data from 15 subjects (electroencephalogram (EEG), electrooculogram (EOG) and electromyogram (EMG) recordings). The decision tree, or dendrogram, was obtained using a hierarchical clustering technique and a wide range of time and frequency-domain features were extracted. Feature selection was carried out using forward sequential selection and classification was evaluated using k-fold cross-validation. The dendrogram-based SVM (DSVM) achieved mean specificity, sensitivity and overall accuracy of 0.92, 0.74 and 0.88 respectively, compared to expert visual scoring. Restricting DSVM classification to data where both experts' scoring was consistent (76.73% of the data) led to a mean specificity, sensitivity and overall accuracy of 0.94, 0.82 and 0.92 respectively. The DSVM framework outperforms classification with more standard multi-class "one-against-all" SVM and linear-discriminant analysis. The promising results of the proposed methodology suggest that it may be a valuable alternative to existing automatic methods and that it could accelerate visual scoring by providing a robust starting hypnogram that can be further fine-tuned by expert inspection. Copyright © 2015 Elsevier B.V. All rights reserved.

  14. CAT: a computer code for the automated construction of fault trees

    International Nuclear Information System (INIS)

    Apostolakis, G.E.; Salem, S.L.; Wu, J.S.

    1978-03-01

    A computer code, CAT (Computer Automated Tree), is presented which applies decision table methods to model the behavior of components for systematic construction of fault trees. The decision tables for some commonly encountered mechanical and electrical components are developed; two nuclear subsystems, a Containment Spray Recirculation System and a Consequence Limiting Control System, are analyzed to demonstrate the applications of the CAT code.

  15. Combined prediction model for supply risk in nuclear power equipment manufacturing industry based on support vector machine and decision tree

    International Nuclear Information System (INIS)

    Shi Chunsheng; Meng Dapeng

    2011-01-01

    A prediction index for supply risk is developed based on factor identification in the nuclear equipment manufacturing industry. The supply risk prediction model is established using support vector machine and decision tree methods, based on an investigation of 3 important nuclear power equipment manufacturing enterprises and 60 suppliers. A final case study demonstrates that the combined model is better than a single prediction model, and demonstrates the feasibility and reliability of this model, which provides a method to evaluate suppliers and measure supply risk. (authors)

  16. Assisting Sustainable Forest Management and Forest Policy Planning with the Sim4Tree Decision Support System

    Directory of Open Access Journals (Sweden)

    Floris Dalemans

    2015-03-01

    As European forest policy increasingly focuses on multiple ecosystem services and participatory decision making, forest managers and policy planners have a need for integrated, user-friendly, broad spectrum decision support systems (DSS) that address risks and uncertainties, such as climate change, in a robust way and that provide credible advice in a transparent manner, enabling effective stakeholder involvement. The Sim4Tree DSS has been accordingly developed as a user-oriented, modular and multipurpose toolbox. Sim4Tree supports strategic and tactical forestry planning by providing simulations of forest development, ecosystem services potential and economic performance through time, from a regional to a stand scale, under various management and climate regimes. Sim4Tree allows comparing the performance of different scenarios with regard to diverse criteria so as to optimize management choices. This paper explains the concept, characteristics, functionalities, components and use of the current Sim4Tree DSS v2.5, which was parameterized for the region of Flanders, Belgium, but can be flexibly adapted to allow a broader use. When considering the current challenges for forestry DSS, an effort has been made towards the participatory component and towards integration, while the lack of robustness remains Sim4Tree’s weakest point. However, its structural flexibility allows many possibilities for future improvement and extension.

  17. Diagnosis of Constant Faults in Read-Once Contact Networks over Finite Bases using Decision Trees

    KAUST Repository

    Busbait, Monther I.

    2014-01-01

    We study the depth of decision trees for diagnosis of constant faults in read-once contact networks over finite bases. This includes diagnosis of 0-1 faults, 0 faults and 1 faults. For any finite basis, we prove a linear upper bound on the minimum

  18. A new fast method for inferring multiple consensus trees using k-medoids.

    Science.gov (United States)

    Tahiri, Nadia; Willems, Matthieu; Makarenkov, Vladimir

    2018-04-05

    Gene trees carry important information about specific evolutionary patterns which characterize the evolution of the corresponding gene families. However, a reliable species consensus tree cannot be inferred from a multiple sequence alignment of a single gene family or from the concatenation of alignments corresponding to gene families having different evolutionary histories. These evolutionary histories can be quite different due to horizontal transfer events or to ancient gene duplications which cause the emergence of paralogs within a genome. Many methods have been proposed to infer a single consensus tree from a collection of gene trees. Still, the application of these tree merging methods can lead to the loss of specific evolutionary patterns which characterize some gene families or some groups of gene families. Thus, the problem of inferring multiple consensus trees from a given set of gene trees becomes relevant. We describe a new fast method for inferring multiple consensus trees from a given set of phylogenetic trees (i.e. additive trees or X-trees) defined on the same set of species (i.e. objects or taxa). The traditional consensus approach yields a single consensus tree. We use the popular k-medoids partitioning algorithm to divide a given set of trees into several clusters of trees. We propose novel versions of the well-known Silhouette and Caliński-Harabasz cluster validity indices that are adapted for tree clustering with k-medoids. The efficiency of the new method was assessed using both synthetic and real data, such as a well-known phylogenetic dataset consisting of 47 gene trees inferred for 14 archaeal organisms. The method described here allows inference of multiple consensus trees from a given set of gene trees. It can be used to identify groups of gene trees having similar intragroup and different intergroup evolutionary histories. The main advantage of our method is that it is much faster than the existing tree clustering approaches, while

  19. Detection of clinical mastitis with sensor data from automatic milking systems is improved by using decision-tree induction.

    Science.gov (United States)

    Kamphuis, C; Mollenhorst, H; Heesterbeek, J A P; Hogeveen, H

    2010-08-01

    The objective was to develop and validate a clinical mastitis (CM) detection model by means of decision-tree induction. For farmers milking with an automatic milking system (AMS), it is desirable that the detection model has a high level of sensitivity (Se), especially for more severe cases of CM, at a very high specificity (Sp). In addition, an alert for CM should be generated preferably at the quarter milking (QM) at which the CM infection is visible for the first time. Data were collected from 9 Dutch dairy herds milking automatically during a 2.5-yr period. Data included sensor data (electrical conductivity, color, and yield) at the QM level and visual observations of quarters with CM recorded by the farmers. Visual observations of quarters with CM were combined with sensor data of the most recent automatic milking recorded for that same quarter, within a 24-h time window before the visual assessment time. Sensor data of 3.5 million QM were collected, of which 348 QM were combined with a CM observation. Data were divided into a training set, including two-thirds of all data, and a test set. Cows in the training set were not included in the test set and vice versa. A decision-tree model was trained using only clear examples of healthy (n=24,717) or diseased (n=243) QM. The model was tested on 105 QM with CM and a random sample of 50,000 QM without CM. While keeping the Se at a level comparable to that of models currently used by AMS, the decision-tree model was able to decrease the number of false-positive alerts by more than 50%. At an Sp of 99%, 40% of the CM cases were detected. Sixty-four percent of the severe CM cases were detected and only 12.5% of the CM that were scored as watery milk. The Se increased considerably from 40% to 66.7% when the time window increased from less than 24h before the CM observation, to a time window from 24h before to 24h after the CM observation. Even at very wide time windows, however, it was impossible to reach an Se of 100

  20. Quantitative analysis of dynamic fault trees using improved Sequential Binary Decision Diagrams

    International Nuclear Information System (INIS)

    Ge, Daochuan; Lin, Meng; Yang, Yanhua; Zhang, Ruoxing; Chou, Qiang

    2015-01-01

    Dynamic fault trees (DFTs) are powerful in modeling systems with sequence- and function dependent failure behaviors. The key point lies in how to quantify complex DFTs analytically and efficiently. Unfortunately, the existing methods for analyzing DFTs all have their own disadvantages. They either suffer from the problem of combinatorial explosion or need a long computation time to obtain an accurate solution. Sequential Binary Decision Diagrams (SBDDs) are regarded as novel and efficient approaches to deal with DFTs, but their two apparent shortcomings remain to be handled: That is, SBDDs probably generate invalid nodes when given an unpleasant variable index and the scale of the resultant cut sequences greatly relies on the chosen variable index. An improved SBDD method is proposed in this paper to deal with the two mentioned problems. It uses an improved ite (If-Then-Else) algorithm to avoid generating invalid nodes when building SBDDs, and a heuristic variable index to keep the scale of resultant cut sequences as small as possible. To confirm the applicability and merits of the proposed method, several benchmark examples are demonstrated, and the results indicate this approach is efficient as well as reasonable. - Highlights: • New ITE method. • Linear complexity-based finding algorithm. • Heuristic variable index

  1. A Clinical Decision Tree to Predict Whether a Bacteremic Patient Is Infected With an Extended-Spectrum β-Lactamase-Producing Organism.

    Science.gov (United States)

    Goodman, Katherine E; Lessler, Justin; Cosgrove, Sara E; Harris, Anthony D; Lautenbach, Ebbing; Han, Jennifer H; Milstone, Aaron M; Massey, Colin J; Tamma, Pranita D

    2016-10-01

    Timely identification of extended-spectrum β-lactamase (ESBL) bacteremia can improve clinical outcomes while minimizing unnecessary use of broad-spectrum antibiotics, including carbapenems. However, most clinical microbiology laboratories currently require at least 24 additional hours from the time of microbial genus and species identification to confirm ESBL production. Our objective was to develop a user-friendly decision tree to predict which organisms are ESBL producing, to guide appropriate antibiotic therapy. We included patients ≥18 years of age with bacteremia due to Escherichia coli or Klebsiella species from October 2008 to March 2015 at Johns Hopkins Hospital. Isolates with ceftriaxone minimum inhibitory concentrations ≥2 µg/mL underwent ESBL confirmatory testing. Recursive partitioning was used to generate a decision tree to determine the likelihood that a bacteremic patient was infected with an ESBL producer. Discrimination of the original and cross-validated models was evaluated using receiver operating characteristic curves and by calculation of C-statistics. A total of 1288 patients with bacteremia met eligibility criteria. For 194 patients (15%), bacteremia was due to a confirmed ESBL producer. The final classification tree for predicting ESBL-positive bacteremia included 5 predictors: history of ESBL colonization/infection, chronic indwelling vascular hardware, age ≥43 years, recent hospitalization in an ESBL high-burden region, and ≥6 days of antibiotic exposure in the prior 6 months. The decision tree's positive and negative predictive values were 90.8% and 91.9%, respectively. Our findings suggest that a clinical decision tree can be used to estimate a bacteremic patient's likelihood of infection with ESBL-producing bacteria. Recursive partitioning offers a practical, user-friendly approach for addressing important diagnostic questions. © The Author 2016. Published by Oxford University Press for the Infectious Diseases Society of

  2. Traditional Chinese medicine pharmacovigilance in signal detection: decision tree-based data classification.

    Science.gov (United States)

    Wei, Jian-Xiang; Wang, Jing; Zhu, Yun-Xia; Sun, Jun; Xu, Hou-Ming; Li, Ming

    2018-03-09

    Traditional Chinese Medicine (TCM) is a style of traditional medicine informed by modern medicine but built on a foundation of more than 2500 years of Chinese medical practice. According to statistics, TCM accounts for approximately 14% of total adverse drug reaction (ADR) spontaneous reporting data in China. Because the complexity of the components in TCM formulas makes them essentially different from Western medicine, it is critical to determine whether ADR reports of TCM should be analyzed independently. Reports in the Chinese spontaneous reporting database between 2010 and 2011 were selected. The dataset was processed and divided into the total sample (all data) and the subsample (including TCM data only). Four different ADR signal detection methods currently widely used in China (PRR, ROR, MHRA and IC) were applied for signal detection on the two samples. By comparison of experimental results, three of them (PRR, MHRA and IC) were chosen to do the experiment. We designed several indicators for performance evaluation such as R (recall ratio), P (precision ratio), and D (discrepancy ratio) based on the reference database and then constructed a decision tree for data classification based on such indicators. For PRR: R1 - R2 = 0.72%, P1 - P2 = 0.16% and D = 0.92%; for MHRA: R1 - R2 = 0.97%, P1 - P2 = 0.20% and D = 1.18%; for IC: R1 - R2 = 1.44%, P2 - P1 = 4.06% and D = 4.72%. The thresholds of R, P and D are set as 2%, 2% and 3%, respectively. Based on the decision tree, the results are "separation" for PRR, MHRA and IC. In order to improve the efficiency and accuracy of signal detection, we suggest that TCM data should be separated from the total sample when conducting analyses.

  3. hs-CRP is strongly associated with coronary heart disease (CHD): A data mining approach using decision tree algorithm.

    Science.gov (United States)

    Tayefi, Maryam; Tajfard, Mohammad; Saffar, Sara; Hanachi, Parichehr; Amirabadizadeh, Ali Reza; Esmaeily, Habibollah; Taghipour, Ali; Ferns, Gordon A; Moohebati, Mohsen; Ghayour-Mobarhan, Majid

    2017-04-01

    Coronary heart disease (CHD) is an important public health problem globally. Algorithms incorporating the assessment of clinical biomarkers together with several established traditional risk factors can help clinicians to predict CHD and support clinical decision making with respect to interventions. Decision tree (DT) is a data mining model for extracting hidden knowledge from large databases. We aimed to establish a predictive model for coronary heart disease using a decision tree algorithm. Here we used a dataset of 2346 individuals including 1159 healthy participants and 1187 participants who had undergone coronary angiography (405 participants with negative angiography and 782 participants with positive angiography). We entered 10 of a total of 12 variables into the DT algorithm (including age, sex, FBG, TG, hs-CRP, TC, HDL, LDL, SBP and DBP). Our model could identify the associated risk factors of CHD with sensitivity, specificity and accuracy of 96%, 87% and 94%, respectively. Serum hs-CRP level was at the top of the tree in our model, followed by FBG, gender and age. Our model appears to be an accurate, specific and sensitive model for identifying the presence of CHD, but will require validation in prospective studies. Copyright © 2017 Elsevier B.V. All rights reserved.

  4. A Comparison between Decision Tree and Random Forest in Determining the Risk Factors Associated with Type 2 Diabetes.

    Science.gov (United States)

    Esmaily, Habibollah; Tayefi, Maryam; Doosti, Hassan; Ghayour-Mobarhan, Majid; Nezami, Hossein; Amirabadizadeh, Alireza

    2018-04-24

    We aimed to identify the risk factors associated with type 2 diabetes mellitus (T2DM) using a data mining approach, namely decision tree and random forest techniques, with data from the Mashhad Stroke and Heart Atherosclerotic Disorders (MASHAD) Study program. This was a cross-sectional study. The MASHAD study started in 2010 and will continue until 2020. Two data mining tools, namely decision trees and random forests, are used for predicting T2DM when some other characteristics are observed on 9528 subjects recruited from the MASHAD database. This paper compares these two models in terms of accuracy, sensitivity, specificity and the area under the ROC curve. The prevalence rate of T2DM was 14% among these subjects. The decision tree model has 64.9% accuracy, 64.5% sensitivity, 66.8% specificity, and an area under the ROC curve of 68.6%, while the random forest model has 71.1% accuracy, 71.3% sensitivity, 69.9% specificity, and an area under the ROC curve of 77.3%. The random forest model, when used with demographic, clinical, anthropometric and biochemical measurements, can provide a simple tool to identify risk factors associated with type 2 diabetes. Such identification can be of substantial use in managing health policy to reduce the number of subjects with T2DM.

  5. OmniGA: Optimized Omnivariate Decision Trees for Generalizable Classification Models

    KAUST Repository

    Magana-Mora, Arturo

    2017-06-14

    Classification problems from different domains vary in complexity, size, and imbalance of the number of samples from different classes. Although several classification models have been proposed, selecting the right model and parameters for a given classification task to achieve good performance is not trivial. Therefore, there is a constant interest in developing novel robust and efficient models suitable for a great variety of data. Here, we propose OmniGA, a framework for the optimization of omnivariate decision trees based on a parallel genetic algorithm, coupled with deep learning structure and ensemble learning methods. The performance of the OmniGA framework is evaluated on 12 different datasets taken mainly from biomedical problems and compared with the results obtained by several robust and commonly used machine-learning models with optimized parameters. The results show that OmniGA systematically outperformed these models for all the considered datasets, reducing the F score error in the range from 100% to 2.25%, compared to the best performing model. This demonstrates that OmniGA produces robust models with improved performance. OmniGA code and datasets are available at www.cbrc.kaust.edu.sa/omniga/.

  6. OmniGA: Optimized Omnivariate Decision Trees for Generalizable Classification Models

    KAUST Repository

    Magana-Mora, Arturo; Bajic, Vladimir B.

    2017-01-01

    Classification problems from different domains vary in complexity, size, and imbalance of the number of samples from different classes. Although several classification models have been proposed, selecting the right model and parameters for a given classification task to achieve good performance is not trivial. Therefore, there is a constant interest in developing novel robust and efficient models suitable for a great variety of data. Here, we propose OmniGA, a framework for the optimization of omnivariate decision trees based on a parallel genetic algorithm, coupled with deep learning structure and ensemble learning methods. The performance of the OmniGA framework is evaluated on 12 different datasets taken mainly from biomedical problems and compared with the results obtained by several robust and commonly used machine-learning models with optimized parameters. The results show that OmniGA systematically outperformed these models for all the considered datasets, reducing the F score error in the range from 100% to 2.25%, compared to the best performing model. This demonstrates that OmniGA produces robust models with improved performance. OmniGA code and datasets are available at www.cbrc.kaust.edu.sa/omniga/.

  7. Decision Tree-Based Contextual Location Prediction from Mobile Device Logs

    Directory of Open Access Journals (Sweden)

    Linyuan Xia

    2018-01-01

    Contextual location prediction is an important topic in the field of personalized location recommendation in LBS (location-based services). With the advancement of mobile positioning techniques and various sensors embedded in smartphones, it is convenient to obtain massive human mobile trajectories and to derive a large amount of valuable information from geospatial big data. Extracting and recognizing personally interesting places and predicting the next semantic location have become a research hot spot in LBS. In this paper, we propose an approach to predict the next personally semantic place with historical visiting patterns derived from mobile device logs. To address the problems of location imprecision and lack of semantic information, a modified trip-identify method is employed to extract key visit points from GPS trajectories more accurately, while semantic information is added through stay point detection and semantic place recognition. Finally, a decision tree model is adopted to explore the spatial, temporal, and sequential features in contextual location prediction. To validate the effectiveness of our approach, experiments were conducted based on a trajectory collection in the Guangzhou downtown area. The results verified the feasibility of our approach to contextual location prediction from continuous mobile device logs.

  8. Applying decision tree for identification of a low risk population for type 2 diabetes. Tehran Lipid and Glucose Study.

    Science.gov (United States)

    Ramezankhani, Azra; Pournik, Omid; Shahrabi, Jamal; Khalili, Davood; Azizi, Fereidoun; Hadaegh, Farzad

    2014-09-01

    The aim of this study was to create a prediction model using a data mining approach to identify individuals at low risk for incident type 2 diabetes, using the Tehran Lipid and Glucose Study (TLGS) database. For a population of 6647 people without diabetes, aged ≥20 years and followed for 12 years, a prediction model was developed using classification by the decision tree technique. Seven hundred and twenty-nine (11%) diabetes cases occurred during the follow-up. Predictor variables were selected from demographic characteristics, smoking status, medical and drug history and laboratory measures. We developed the predictive models by decision tree using 60 input variables and one output variable. The overall classification accuracy was 90.5%, with 31.1% sensitivity and 97.9% specificity; for the subjects without diabetes, precision and f-measure were 92% and 0.95, respectively. The identified variables included fasting plasma glucose, body mass index, triglycerides, mean arterial blood pressure, family history of diabetes, educational level and job status. In conclusion, decision tree analysis, using routine demographic, clinical, anthropometric and laboratory measurements, created a simple tool to predict individuals at low risk for type 2 diabetes. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
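
    High accuracy alongside low sensitivity, as reported in this record, is typical when the outcome is rare. As a rough consistency check, the sketch below assumes (the abstract does not say so) that the reported sensitivity and specificity apply to the whole cohort of 6647 people with 729 incident cases, and recomputes the overall accuracy they imply. The function name is illustrative only.

```python
def accuracy_from_rates(n_total, n_cases, sensitivity, specificity):
    """Overall accuracy implied by sensitivity, specificity and case count."""
    n_controls = n_total - n_cases
    true_pos = sensitivity * n_cases       # cases correctly flagged
    true_neg = specificity * n_controls    # non-cases correctly cleared
    return (true_pos + true_neg) / n_total

# Figures from the abstract; applying them to the full cohort is an assumption.
acc = accuracy_from_rates(n_total=6647, n_cases=729,
                          sensitivity=0.311, specificity=0.979)
print(f"implied accuracy: {acc:.3f}")
```

    Under that assumption the implied accuracy is about 90.6%, close to the reported 90.5%. Almost all of it comes from correctly classified non-cases, who make up about 89% of the cohort.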

  9. Applications of urban tree canopy assessment and prioritization tools: supporting collaborative decision making to achieve urban sustainability goals

    Science.gov (United States)

    Dexter H. Locke; J. Morgan Grove; Michael Galvin; Jarlath P.M. ONeil-Dunne; Charles. Murphy

    2013-01-01

    Urban Tree Canopy (UTC) Prioritizations can be both a set of geographic analysis tools and a planning process for collaborative decision-making. In this paper, we describe how UTC Prioritizations can be used as a planning process to provide decision support to multiple government agencies, civic groups and private businesses to aid in reaching a canopy target. Linkages...

  10. Establishing Decision Trees for Predicting Successful Postpyloric Nasoenteric Tube Placement in Critically Ill Patients.

    Science.gov (United States)

    Chen, Weisheng; Sun, Cheng; Wei, Ru; Zhang, Yanlin; Ye, Heng; Chi, Ruibin; Zhang, Yichen; Hu, Bei; Lv, Bo; Chen, Lifang; Zhang, Xiunong; Lan, Huilan; Chen, Chunbo

    2018-01-01

    Despite the use of prokinetic agents, the overall success rate for postpyloric placement via a self-propelled spiral nasoenteric tube is quite low. This retrospective study was conducted in the intensive care units of 11 university hospitals from 2006 to 2016 among adult patients who underwent self-propelled spiral nasoenteric tube insertion. Success was defined as postpyloric nasoenteric tube placement confirmed by abdominal x-ray scan 24 hours after tube insertion. Chi-square automatic interaction detection (CHAID), simple classification and regression trees (SimpleCart), and J48 methodologies were used to develop decision tree models, and multiple logistic regression (LR) methodology was used to develop an LR model for predicting successful postpyloric nasoenteric tube placement. The area under the receiver operating characteristic curve (AUC) was used to evaluate the performance of these models. Successful postpyloric nasoenteric tube placement was confirmed in 427 of 939 patients enrolled. For predicting successful postpyloric nasoenteric tube placement, the performance of the 3 decision trees was similar in terms of the AUCs: 0.715 for the CHAID model, 0.682 for the SimpleCart model, and 0.671 for the J48 model. The AUC of the LR model was 0.729, which outperformed the J48 model. Both the CHAID and LR models achieved an acceptable discrimination for predicting successful postpyloric nasoenteric tube placement and were useful for intensivists in the setting of self-propelled spiral nasoenteric tube insertion. © 2016 American Society for Parenteral and Enteral Nutrition.
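
    The AUC used to compare the models in this record can be read as the probability that a randomly chosen successful placement receives a higher predicted score than a randomly chosen failure. The sketch below computes it that way by direct pairwise comparison. The scores are hypothetical, not study data, and the function name is illustrative.

```python
def auc(scores_pos, scores_neg):
    """AUC as the share of positive/negative pairs ranked correctly.

    Ties count as half a correct pair (the Mann-Whitney U formulation).
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical predicted probabilities of successful placement.
success = [0.9, 0.7, 0.6, 0.4]
failure = [0.8, 0.5, 0.4, 0.2]
print(auc(success, failure))
```

    An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported values of 0.671 to 0.729 should be read.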

  11. [The Application of the Fault Tree Analysis Method in Medical Equipment Maintenance].

    Science.gov (United States)

    Liu, Hongbin

    2015-11-01

    In this paper, the traditional fault tree analysis method is presented, and its application characteristics in medical instrument maintenance are described in detail. Significant changes are made when the traditional fault tree analysis method is introduced into medical instrument maintenance: the logic symbols, logic analysis and calculation are abandoned, as are its complicated procedures, and only its intuitive and practical fault tree diagram is kept. The fault tree diagram itself also differs: the fault tree is no longer a logical tree but a thinking tree for troubleshooting, the definition of the fault tree's nodes is different, and the composition of the fault tree's branches is also different.

  12. Change Analysis and Decision Tree Based Detection Model for Residential Objects across Multiple Scales

    Directory of Open Access Journals (Sweden)

    CHEN Liyan

    2018-03-01

    Change analysis and detection play an important role in the updating of multi-scale databases. When overlaying an updated larger-scale dataset and a to-be-updated smaller-scale dataset, people usually focus on temporal changes caused by the evolution of spatial entities. Little attention is paid to the representation changes influenced by map generalization. Using polygonal building data as an example, this study examines the changes from different perspectives, such as the reasons for their occurrence and the forms they take. Based on this knowledge, we employ a decision tree, a machine learning technique, to establish a change detection model. The aim of the proposed model is to distinguish temporal changes that need to be applied as updates to the smaller-scale dataset from representation changes. The proposed method is validated through tests using real-world building data from Guangzhou city. The experimental results show that the overall precision of change detection is more than 90%, which indicates that our method is effective in identifying changed objects.

  13. Development of a diagnostic decision tree for obstructive pulmonary diseases based on real-life data

    NARCIS (Netherlands)

    Metting, Esther I; In 't Veen, Johannes C C M; Dekhuijzen, P N Richard; van Heijst, Ellen; Kocks, Janwillem W H; Muilwijk-Kroes, Jacqueline B; Chavannes, Niels H; van der Molen, Thys

    2016-01-01

    The aim of this study was to develop and explore the diagnostic accuracy of a decision tree derived from a large real-life primary care population. Data from 9297 primary care patients (45% male, mean age 53±17 years) with suspicion of an obstructive pulmonary disease was derived from an

  14. Development of decision tree software and protein profiling using surface enhanced laser desorption/ionization-time of flight-mass spectrometry (SELDI-TOF-MS) in papillary thyroid cancer

    International Nuclear Information System (INIS)

    Yoon, Joon Kee; An, Young Sil; Park, Bok Nam; Yoon, Seok Nam; Lee, Jun

    2007-01-01

    The aim of this study was to develop a bioinformatics software and to test it in serum samples of papillary thyroid cancer using mass spectrometry (SELDI-TOF-MS). Development of 'Protein analysis' software performing decision tree analysis was done by customizing C4.5. Sixty-one serum samples from 27 papillary thyroid cancer, 17 autoimmune thyroiditis, 17 controls were applied to 2 types of protein chips, CM10 (weak cation exchange) and IMAC3 (metal binding - Cu). Mass spectrometry was performed to reveal the protein expression profiles. Decision trees were generated using 'Protein analysis' software, and automatically detected biomarker candidates. Validation analysis was performed for CM10 chip by random sampling. Decision tree software, which can perform training and validation from profiling data, was developed. For CM10 and IMAC3 chips, 23 of 113 and 8 of 41 protein peaks were significantly different among 3 groups (ρ < 0.05), respectively. Decision tree correctly classified 3 groups with an error rate of 3.3% for CM10 and 2.0% for IMAC3, and 4 and 7 biomarker candidates were detected respectively. In 2 group comparisons, all cancer samples were correctly discriminated from non-cancer samples (error rate = 0%) for CM10 by single node and for IMAC3 by multiple nodes. Validation results from 5 test sets revealed SELDI-TOF-MS and decision tree correctly differentiated cancers from non-cancers (54/55, 98%), while predictability was moderate in 3 group classification (36/55, 65%). Our in-house software was able to successfully build decision trees and detect biomarker candidates, therefore it could be useful for biomarker discovery and clinical follow up of papillary thyroid cancer

  15. A decision tree model for predicting mediastinal lymph node metastasis in non-small cell lung cancer with F-18 FDG PET/CT.

    Science.gov (United States)

    Pak, Kyoungjune; Kim, Keunyoung; Kim, Mi-Hyun; Eom, Jung Seop; Lee, Min Ki; Cho, Jeong Su; Kim, Yun Seong; Kim, Bum Soo; Kim, Seong Jang; Kim, In Joo

    2018-01-01

    We aimed to develop a decision tree model to improve the diagnostic performance of positron emission tomography/computed tomography (PET/CT) in detecting metastatic lymph nodes (LN) in non-small cell lung cancer (NSCLC). 115 patients with NSCLC were included in this study. The training dataset included 66 patients. A decision tree model was developed with 9 variables and validated with 49 patients; the variables included short and long diameters of LNs, ratio of short and long diameters, maximum standardized uptake value (SUVmax) of LN, mean Hounsfield unit, ratio of LN SUVmax and ascending aorta SUVmax (LN/AA), and ratio of LN SUVmax and superior vena cava SUVmax. A total of 301 LNs of 115 patients were evaluated in this study. Nodular calcification was applied as the initial imaging parameter, and LN SUVmax (≥3.95) was assessed as the second. For LNs with high LN SUVmax, LN/AA (≥2.92) was additionally required. Sensitivity was 50% for the training dataset and 40% for the validation dataset. However, specificity was 99.28% for the training dataset and 96.23% for the validation dataset. In conclusion, we have developed a new decision tree model for interpreting mediastinal LNs. All LNs with nodular calcification were benign, and LNs with high LN SUVmax and high LN/AA were metastatic. Further studies are needed to incorporate subjective parameters and pathologic evaluations into a decision tree model to improve the test performance of PET/CT.
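
    The splits named in the abstract above can be written out as a small rule set. The following is only a minimal sketch, not the authors' model: the two cut-offs (3.95 and 2.92) come from the abstract, while the class and function names are hypothetical, and treating every remaining node as benign is an assumption made here for illustration.

```python
from dataclasses import dataclass

# Cut-offs quoted in the abstract.
SUVMAX_CUTOFF = 3.95   # LN SUVmax threshold
LN_AA_CUTOFF = 2.92    # LN SUVmax / ascending aorta SUVmax threshold


@dataclass
class LymphNode:
    """Hypothetical container for the three parameters used by the splits."""
    nodular_calcification: bool
    suvmax: float
    ln_aa_ratio: float


def classify_lymph_node(node: LymphNode) -> str:
    """Apply the splits in the order given in the abstract."""
    # First split: all nodes with nodular calcification were benign.
    if node.nodular_calcification:
        return "benign"
    # Second and third splits: high SUVmax together with high LN/AA.
    if node.suvmax >= SUVMAX_CUTOFF and node.ln_aa_ratio >= LN_AA_CUTOFF:
        return "metastatic"
    # Assumption: every other leaf is treated as benign in this sketch.
    return "benign"
```

    A rule set that only calls a node metastatic when both cut-offs are met is deliberately conservative, which is in line with the high specificity and low sensitivity reported in the abstract.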

  16. Application of Decision Tree on Collision Avoidance System Design and Verification for Quadcopter

    Science.gov (United States)

    Chen, C.-W.; Hsieh, P.-H.; Lai, W.-H.

    2017-08-01

    The purpose of the research is to build a collision avoidance system for quadcopters using a decision tree algorithm. When the ultrasonic range finder judges that the distance is within the collision avoidance interval, control passes from the operator to the system, which controls the altitude of the UAV. Based on prior experience operating quadcopters, an appropriate pitch angle can be obtained. The UAS implements the following three motions to avoid collisions: Case 1, initial slow avoidance stage; Case 2, slow avoidance stage; and Case 3, rapid avoidance stage. The training data from the collision avoidance test are then transmitted to the ground station via a wireless transmission module for further analysis. The entire decision tree algorithm of the collision avoidance system, the transmitted data, and the ground station have been verified in some flight tests. In the flight tests, the quadcopter can perform the avoidance motion in real time and move away from obstacles steadily. In the avoidance area, the collision avoidance system has higher authority than the operator and carries out the avoidance process. The quadcopter can successfully fly away from obstacles at 1.92 meters per second, and the minimum distance between the quadcopter and the obstacle is 1.05 meters.

  17. APPLICATION OF DECISION TREE ON COLLISION AVOIDANCE SYSTEM DESIGN AND VERIFICATION FOR QUADCOPTER

    Directory of Open Access Journals (Sweden)

    C.-W. Chen

    2017-08-01

    The purpose of the research is to build a collision avoidance system for quadcopters using a decision tree algorithm. When the ultrasonic range finder judges that the distance is within the collision avoidance interval, control passes from the operator to the system, which controls the altitude of the UAV. Based on prior experience operating quadcopters, an appropriate pitch angle can be obtained. The UAS implements the following three motions to avoid collisions: Case 1, initial slow avoidance stage; Case 2, slow avoidance stage; and Case 3, rapid avoidance stage. The training data from the collision avoidance test are then transmitted to the ground station via a wireless transmission module for further analysis. The entire decision tree algorithm of the collision avoidance system, the transmitted data, and the ground station have been verified in some flight tests. In the flight tests, the quadcopter can perform the avoidance motion in real time and move away from obstacles steadily. In the avoidance area, the collision avoidance system has higher authority than the operator and carries out the avoidance process. The quadcopter can successfully fly away from obstacles at 1.92 meters per second, and the minimum distance between the quadcopter and the obstacle is 1.05 meters.

  18. A Method to Quantify Plant Availability and Initiating Event Frequency Using a Large Event Tree, Small Fault Tree Model

    International Nuclear Information System (INIS)

    Kee, Ernest J.; Sun, Alice; Rodgers, Shawn; Popova, Elmira V; Nelson, Paul; Moiseytseva, Vera; Wang, Eric

    2006-01-01

    South Texas Project uses a large fault tree to produce scenarios (minimal cut sets) used in quantification of plant availability and event frequency predictions. On the other hand, the South Texas Project probabilistic risk assessment model uses a large event tree, small fault tree for quantifying core damage and radioactive release frequency predictions. The South Texas Project is converting its availability and event frequency model to use a large event tree, small fault tree in an effort to streamline application support and to provide additional detail in results. The availability and event frequency model as well as the applications it supports (maintenance and operational risk management, system engineering health assessment, preventive maintenance optimization, and RIAM) are briefly described. A methodology to perform availability modeling in a large event tree, small fault tree framework is described in detail. How the methodology can be used to support South Texas Project maintenance and operations risk management is described in detail. Differences with other fault tree methods and other recently proposed methods are discussed in detail. While the methods described are novel to the South Texas Project Risk Management program and to large event tree, small fault tree models, concepts in the area of application support and availability modeling have wider applicability to the industry. (authors)

  19. Statistically Consistent k-mer Methods for Phylogenetic Tree Reconstruction.

    Science.gov (United States)

    Allman, Elizabeth S; Rhodes, John A; Sullivant, Seth

    2017-02-01

    Frequencies of k-mers in sequences are sometimes used as a basis for inferring phylogenetic trees without first obtaining a multiple sequence alignment. We show that a standard approach of using the squared Euclidean distance between k-mer vectors to approximate a tree metric can be statistically inconsistent. To remedy this, we derive model-based distance corrections for orthologous sequences without gaps, which lead to consistent tree inference. The identifiability of model parameters from k-mer frequencies is also studied. Finally, we report simulations showing that the corrected distance outperforms many other k-mer methods, even when sequences are generated with an insertion and deletion process. These results have implications for multiple sequence alignment as well since k-mer methods are usually the first step in constructing a guide tree for such algorithms.
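
    As an illustration of the "standard approach" the abstract refers to, the sketch below computes k-mer frequency vectors and the squared Euclidean distance between them. This is the uncorrected distance that the abstract says can be statistically inconsistent; the model-based corrections derived in the paper are not reproduced here. The function names and the DNA alphabet default are assumptions made for this example.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k, alphabet="ACGT"):
    """Return the relative frequency of every possible k-mer, in a fixed order."""
    windows = len(seq) - k + 1
    if windows <= 0:
        raise ValueError("sequence is shorter than k")
    # Count each overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(windows))
    # Enumerate all |alphabet|**k k-mers so that vectors are comparable.
    all_kmers = ("".join(p) for p in product(alphabet, repeat=k))
    return [counts[kmer] / windows for kmer in all_kmers]


def squared_euclidean(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmer_distance(seq_a, seq_b, k):
    """Uncorrected alignment-free distance between two sequences."""
    return squared_euclidean(kmer_frequencies(seq_a, k),
                             kmer_frequencies(seq_b, k))
```

    A matrix of such pairwise distances would typically be passed to a distance-based tree-building method; no alignment step is involved at any point.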

  20. A new methodology for the computer-aided construction of fault trees

    International Nuclear Information System (INIS)

    Salem, S.L.; Apostolakis, G.E.; Okrent, D.

    1977-01-01

    A methodology for systematically constructing fault trees for general complex systems is developed. A means of modeling component behaviour via decision tables is presented, and a procedure for constructing and editing fault trees, either manually or by computer, is developed. The techniques employed result in a complete fault tree in standard form. In order to demonstrate the methodology, the computer program CAT was developed and is used to construct trees for a nuclear system. By analyzing and comparing these fault trees, several conclusions are reached. First, such an approach can be used to produce fault trees that accurately describe system behaviour. Second, multiple trees can be rapidly produced by defining various TOP events, including system success. Finally, the accuracy and utility of such trees is shown to depend upon the careful development of the decision table models by the analyst, and of the overall system definition itself. Thus the method is seen to be a tool for assisting in the work of fault tree construction rather than a replacement for the careful work of the fault tree analyst. (author)

  1. Decision tree analysis of treatment strategies for mild and moderate cases of clinical mastitis occurring in early lactation.

    Science.gov (United States)

    Pinzón-Sánchez, C; Cabrera, V E; Ruegg, P L

    2011-04-01

    The objective of this study was to develop a decision tree to evaluate the economic impact of different durations of intramammary treatment for the first case of mild or moderate clinical mastitis (CM) occurring in early lactation with various scenarios of pathogen distributions and use of on-farm culture. The tree included 2 decision and 3 probability events. The first decision evaluated use of on-farm culture (OFC; 2 programs using OFC and 1 not using OFC) and the second decision evaluated treatment strategies (no intramammary antimicrobials or antimicrobials administered for 2, 5, or 8 d). The tree included probabilities for the distribution of etiologies (gram-positive, gram-negative, or no growth), bacteriological cure, and recurrence. The economic consequences of mastitis included costs of diagnosis and initial treatment, additional treatments, labor, discarded milk, milk production losses due to clinical and subclinical mastitis, culling, and transmission of infection to other cows (only for CM caused by Staphylococcus aureus). Pathogen-specific estimates for bacteriological cure and milk losses were used. The economically optimal path for several scenarios was determined by comparison of expected monetary values. For most scenarios, the optimal economic strategy was to treat CM caused by gram-positive pathogens for 2 d and to avoid antimicrobials for CM cases caused by gram-negative pathogens or when no pathogen was recovered. Use of extended intramammary antimicrobial therapy (5 or 8 d) resulted in the lowest expected monetary values. Copyright © 2011 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
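
    Comparing paths by expected monetary value (EMV), as described above, amounts to rolling a tree back from its leaves: chance nodes take the probability-weighted average of their branches, and decision nodes take the best option. The sketch below shows that calculation only. The node layout, option names, probabilities, and payoffs are all hypothetical and are not taken from the study; payoffs are written as negative values because they represent costs.

```python
def expected_monetary_value(node):
    """Roll back a decision tree and return its expected monetary value."""
    kind = node["type"]
    if kind == "payoff":
        return node["value"]
    if kind == "chance":
        # Probability-weighted average over the possible outcomes.
        return sum(p * expected_monetary_value(child)
                   for p, child in node["branches"])
    if kind == "decision":
        # A rational decision maker picks the option with the highest EMV.
        return max(expected_monetary_value(child)
                   for child in node["options"].values())
    raise ValueError(f"unknown node type: {kind}")


def optimal_option(decision_node):
    """Return the name of the option with the highest EMV."""
    options = decision_node["options"]
    return max(options, key=lambda name: expected_monetary_value(options[name]))


def payoff(value):
    """Leaf node holding a monetary outcome."""
    return {"type": "payoff", "value": value}


# Hypothetical example: two treatment options, each followed by cure or no cure.
example_tree = {
    "type": "decision",
    "options": {
        "treat_2_days": {
            "type": "chance",
            "branches": [(0.8, payoff(-50.0)), (0.2, payoff(-200.0))],
        },
        "no_antimicrobials": {
            "type": "chance",
            "branches": [(0.5, payoff(-20.0)), (0.5, payoff(-200.0))],
        },
    },
}
```

    With these made-up numbers, `optimal_option(example_tree)` selects `"treat_2_days"`, because its EMV of about -80 is higher than the roughly -110 of the alternative.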

  2. Career Path Suggestion using String Matching and Decision Trees

    Science.gov (United States)

    Nagpal, Akshay; P. Panda, Supriya

    2015-05-01

    High school and college graduates often struggle to decide which courses they should major in to achieve their target career. In this paper, we worked on suggesting a career path that leads a graduate to his/her dream career given his/her current educational status. First, we collected the career data of professionals and academicians from various career fields and compiled the data set from the necessary information in that data. This data set was then used as the basis for suggesting the most appropriate career path for a person given his/her current educational status. Decision trees and string matching algorithms were employed to make the suggestion. Finally, the results were analyzed, pointing to further improvements in the model.

  3. Test Reviews: Euler, B. L. (2007). "Emotional Disturbance Decision Tree". Lutz, FL: Psychological Assessment Resources

    Science.gov (United States)

    Tansy, Michael

    2009-01-01

    The Emotional Disturbance Decision Tree (EDDT) is a teacher-completed norm-referenced rating scale published by Psychological Assessment Resources, Inc., in Lutz, Florida. The 156-item EDDT was developed for use as part of a broader assessment process to screen and assist in the identification of 5- to 18-year-old children for the special…

  4. Decision tree analysis to evaluate dry cow strategies under UK conditions.

    Science.gov (United States)

    Berry, Elizabeth A; Hogeveen, Henk; Hillerton, J Eric

    2004-11-01

    Economic decisions on animal health strategies address the cost-benefit aspect along with animal welfare and public health concerns. Decision tree analysis at an individual cow level highlighted that there is little economic difference between the use of either dry cow antibiotic or an internal teat sealant in preventing a new intramammary infection in a cow free of infection in all quarters of the mammary gland at drying off. However, a potential net loss of over £20 per cow might occur if the uninfected cow was left untreated. The only economically viable option, for a cow with one or more quarters infected at drying off, is antibiotic treatment, although a loss might still be incurred depending on the pathogen concerned and the cure rates achievable. There was a net loss for cows with quarters infected with Corynebacterium spp. at drying off, for both the teat sealant and untreated groups (£22 and £48, respectively) with only antibiotic-treated cows showing a gain.

  5. Visualizing Decision Trees in Games to Support Children's Analytic Reasoning: Any Negative Effects on Gameplay?

    Directory of Open Access Journals (Sweden)

    Robert Haworth

    2010-01-01

    The popularity and usage of digital games have increased in recent years, bringing further attention to their design. Some digital games require a significant use of higher order thought processes, such as problem solving and reflective and analytical thinking. Through the use of appropriate and interactive representations, these thought processes could be supported. A visualization of the game's internal structure is an example of this. However, it is unknown whether including these extra representations will have a negative effect on gameplay. To investigate this issue, a digital maze-like game was designed with its underlying structure represented as a decision tree. A qualitative, exploratory study with children was performed to examine whether the tree supported their thought processes and what effects, if any, the tree had on gameplay. This paper reports the findings of this research and discusses the implications for the design of games in general.

  6. A Novel Treatment Decision Tree and Literature Review of Retrograde Peri-Implantitis.

    Science.gov (United States)

    Sarmast, Nima D; Wang, Howard H; Soldatos, Nikolaos K; Angelov, Nikola; Dorn, Samuel; Yukna, Raymond; Iacono, Vincent J

    2016-12-01

    Although retrograde peri-implantitis (RPI) is not a common sequela of dental implant surgery, its prevalence has been reported in the literature to be 0.26%. Incidence of RPI is reported to increase to 7.8% when teeth adjacent to the implant site have a previous history of root canal therapy, and it is correlated with distance between implant and adjacent tooth and/or with time from endodontic treatment of adjacent tooth to implant placement. Minimum 2 mm space between implant and adjacent tooth is needed to decrease incidence of apical RPI, with minimum 4 weeks between completion of endodontic treatment and actual implant placement. The purpose of this study is to compile all available treatment modalities and to provide a decision tree as a general guide for clinicians to aid in diagnosis and treatment of RPI. Literature search was performed for articles published in English on the topic of RPI. Articles selected were case reports with study populations ranging from 1 to 32 patients. Any case report or clinical trial that attempted to treat or rescue an implant diagnosed with RPI was included. Predominant diagnostic presentation of a lesion was presence of sinus tract at buccal or facial abscess of apical portion of implant, and subsequent periapical radiographs taken demonstrated a radiolucent lesion. On the basis of case reports analyzed, RPI was diagnosed between 1 week and 4 years after implant placement. Twelve of 20 studies reported that RPI lesions were diagnosed within 6 months after implant placement. A step-by-step decision tree is provided to allow clinicians to triage and properly manage cases of RPI on the basis of recommendations and successful treatments provided in analyzed case reports. It is divided between symptomatic and asymptomatic implants and adjacent teeth with vital and necrotic pulps. Most common etiology of apical RPI is endodontic infection from neighboring teeth, which was diagnosed within 6 months after implant placement. Most

  7. Bonsai trees in your head: how the pavlovian system sculpts goal-directed choices by pruning decision trees.

    Directory of Open Access Journals (Sweden)

    Quentin J M Huys

    When planning a series of actions, it is usually infeasible to consider all potential future sequences; instead, one must prune the decision tree. Provably optimal pruning is, however, still computationally ruinous and the specific approximations humans employ remain unknown. We designed a new sequential reinforcement-based task and showed that human subjects adopted a simple pruning strategy: during mental evaluation of a sequence of choices, they curtailed any further evaluation of a sequence as soon as they encountered a large loss. This pruning strategy was Pavlovian: it was reflexively evoked by large losses and persisted even when overwhelmingly counterproductive. It was also evident above and beyond loss aversion. We found that the tendency towards Pavlovian pruning was selectively predicted by the degree to which subjects exhibited sub-clinical mood disturbance, in accordance with theories that ascribe Pavlovian behavioural inhibition, via serotonin, a role in mood disorders. We conclude that Pavlovian behavioural inhibition shapes highly flexible, goal-directed choices in a manner that may be important for theories of decision-making in mood disorders.
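
    The pruning strategy described above (stop evaluating a sequence as soon as a large loss is encountered) can be contrasted with exhaustive evaluation in a few lines. This is a toy sketch under stated assumptions, not the authors' computational model: the tree encoding, the function name, the threshold parameter, and the reward values are all hypothetical.

```python
def best_return(branches, prune_at=None):
    """Best cumulative reward over all evaluated action sequences.

    `branches` is a list of (reward, subtree) pairs, where `subtree` is
    another such list and an empty list marks the end of a sequence.
    If `prune_at` is given, any step whose reward is at or below that value
    is treated as a large loss and its sequence is not evaluated further.
    Returns None when every sequence has been pruned.
    """
    if not branches:
        return 0.0  # end of a sequence, nothing more to add
    best = None
    for reward, subtree in branches:
        if prune_at is not None and reward <= prune_at:
            continue  # large loss: curtail evaluation of this sequence
        future = best_return(subtree, prune_at)
        if future is None:
            continue  # every continuation of this step was pruned
        total = reward + future
        if best is None or total > best:
            best = total
    return best


# Hypothetical two-step task: the path through the large loss (-70)
# leads to the biggest gain (+140).
example_tree = [
    (-70.0, [(140.0, [])]),
    (-20.0, [(30.0, [])]),
]
```

    With these made-up rewards, exhaustive evaluation, `best_return(example_tree)`, finds a total of 70.0, whereas pruning at the large loss, `best_return(example_tree, prune_at=-70.0)`, settles for 10.0. This mirrors the abstract's point that such pruning can persist even when it is counterproductive.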

  8. Development of decision tree software and protein profiling using surface enhanced laser desorption/ionization-time of flight-mass spectrometry (SELDI-TOF-MS) in papillary thyroid cancer

    Energy Technology Data Exchange (ETDEWEB)

    Yoon, Joon Kee; An, Young Sil; Park, Bok Nam; Yoon, Seok Nam [Ajou University School of Medicine, Suwon (Korea, Republic of); Lee, Jun [Konkuk University, Seoul (Korea, Republic of)

    2007-08-15

    The aim of this study was to develop a bioinformatics software and to test it in serum samples of papillary thyroid cancer using mass spectrometry (SELDI-TOF-MS). Development of 'Protein analysis' software performing decision tree analysis was done by customizing C4.5. Sixty-one serum samples from 27 papillary thyroid cancer, 17 autoimmune thyroiditis, 17 controls were applied to 2 types of protein chips, CM10 (weak cation exchange) and IMAC3 (metal binding - Cu). Mass spectrometry was performed to reveal the protein expression profiles. Decision trees were generated using 'Protein analysis' software, and automatically detected biomarker candidates. Validation analysis was performed for CM10 chip by random sampling. Decision tree software, which can perform training and validation from profiling data, was developed. For CM10 and IMAC3 chips, 23 of 113 and 8 of 41 protein peaks were significantly different among 3 groups (ρ < 0.05), respectively. Decision tree correctly classified 3 groups with an error rate of 3.3% for CM10 and 2.0% for IMAC3, and 4 and 7 biomarker candidates were detected respectively. In 2 group comparisons, all cancer samples were correctly discriminated from non-cancer samples (error rate = 0%) for CM10 by single node and for IMAC3 by multiple nodes. Validation results from 5 test sets revealed SELDI-TOF-MS and decision tree correctly differentiated cancers from non-cancers (54/55, 98%), while predictability was moderate in 3 group classification (36/55, 65%). Our in-house software was able to successfully build decision trees and detect biomarker candidates, therefore it could be useful for biomarker discovery and clinical follow up of papillary thyroid cancer.

  9. Blood oxygen level dependent magnetic resonance imaging for detecting pathological patterns in lupus nephritis patients: a preliminary study using a decision tree model.

    Science.gov (United States)

    Shi, Huilan; Jia, Junya; Li, Dong; Wei, Li; Shang, Wenya; Zheng, Zhenfeng

    2018-02-09

    Precise renal histopathological diagnosis guides the therapy strategy in patients with lupus nephritis. Blood oxygen level dependent (BOLD) magnetic resonance imaging (MRI) has become an applicable noninvasive technique in renal disease. This study was performed to explore whether BOLD MRI could contribute to diagnosing the renal pathological pattern. Adult patients with a renal pathological diagnosis of lupus nephritis were recruited for this study. Renal biopsy tissues were assessed based on the lupus nephritis ISN/RPS 2003 classification. BOLD MRI was used to obtain a functional magnetic resonance parameter, R2* values. Several functions of R2* values were calculated and used to construct algorithmic models for renal pathological patterns. In addition, the algorithmic models were compared as to their diagnostic capability. Both histopathology and BOLD MRI were used to examine a total of twelve patients. Renal pathological patterns included five cases of class III (including 3 as class III + V) and seven cases of class IV (including 4 as class IV + V). Three algorithmic models, including decision tree, line discriminant, and logistic regression, were constructed to distinguish the renal pathological pattern of class III and class IV. The sensitivity of the decision tree model was better than that of the line discriminant model (71.87% vs 59.48%, P [...] decision tree model was equivalent to that of the line discriminant model (63.87% vs 63.73%, P = 0.939) and higher than that of the logistic regression model (63.87% vs 38.0%, P [...] decision tree model was greater than that of the line discriminant model (0.765 vs 0.629, P [...] Decision tree models constructed using functions of R2* values may facilitate the prediction of renal pathological patterns.

  10. Efficient, reliable and fast high-level triggering using a bonsai boosted decision tree

    International Nuclear Information System (INIS)

    Gligorov, V V; Williams, M

    2013-01-01

    High-level triggering is a vital component of many modern particle physics experiments. This paper describes a modification to the standard boosted decision tree (BDT) classifier, the so-called bonsai BDT, that has the following important properties: it is more efficient than traditional cut-based approaches; it is robust against detector instabilities, and it is very fast. Thus, it is fit-for-purpose for the online running conditions faced by any large-scale data acquisition system.

  11. A Path Walkability Assessment Index Model for Evaluating and Facilitating Retail Walking Using Decision-Tree-Making (DTM) Method

    Directory of Open Access Journals (Sweden)

    Ali Keyvanfar

    2018-03-01

    Transportation is the major contributor to ever-increasing CO2 and greenhouse gas emissions in cities. The ever-increasing hazardous emissions and energy consumption of transportation have persuaded transportation and urban planners to encourage people to use non-motorized modes of travel, especially walking. There are currently several urban walkability assessment models; however, because they cover only a limited range of walkability assessment variables, these models are not fully able to promote inclusive walkable urban neighborhoods. In this regard, this study develops the path walkability assessment (PWA) index model, which evaluates and analyzes path walkability in association with the pedestrian's decision-tree-making (DTM). The model converts the pedestrian's qualitative DTM data to quantifiable values. This model involves ninety-two (92) physical and environmental walkability assessment variables clustered into three layers of DTM (Layer 1: Features; Layer 2: Criteria; and Layer 3: Sub-Criteria), and is scoped to shopping and retail walking. The PWA model, as a global decision support tool, can be applied in any neighborhood in the world, and this study implements it in the Taman Universiti neighborhood in Skudai, Malaysia. The PWA model establishes a walkability score index which determines the grading rate of walkability accomplishment for each walkability variable of the surveyed neighborhood. Using the PWA grading index enables urban designers to properly manage the allocation of financial resources for encouraging walkability in the targeted neighborhood.

  12. Preliminary hazard analysis using sequence tree method

    International Nuclear Information System (INIS)

    Huang Huiwen; Shih Chunkuan; Hung Hungchih; Chen Minghuei; Yih Swu; Lin Jiinming

    2007-01-01

    A system-level PHA using the sequence tree method was developed to perform SSA of safety-related digital I&C systems. The conventional PHA is a brainstorming session among experts on various portions of the system to identify hazards through discussion. However, the conventional PHA is not a systematic technique: the analysis results depend strongly on the experts' subjective opinions, and the analysis quality cannot be appropriately controlled. This research therefore developed a system-level, sequence-tree-based PHA that can clarify the relationships among the major digital I&C systems. The technique comprises two major phases. The first phase uses a table to analyze each event in SAR Chapter 15 for a specific safety-related I&C system, such as the RPS. The second phase uses a sequence tree to identify which I&C systems are involved in the event, how the safety-related systems work, and how the backup systems can be activated to mitigate the consequences if the primary safety systems fail. In the sequence tree, the defense-in-depth echelons (the control echelon, reactor trip echelon, ESFAS echelon, and indication and display echelon) are arranged to form the tree structure. All the related I&C systems, including the digital systems and the analog backup systems, are allocated to their specific echelons. With this system-centric, sequence-tree-based analysis, not only can preliminary hazards be identified systematically, but the vulnerabilities of the nuclear power plant can also be recognized. An effective simplified D3 evaluation can therefore be performed as well. (author)

  13. Bayesian Decision Trees for predicting survival of patients: a study on the US National Trauma Data Bank.

    Science.gov (United States)

    Schetinin, Vitaly; Jakaite, Livia; Jakaitis, Janis; Krzanowski, Wojtek

    2013-09-01

    Trauma and Injury Severity Score (TRISS) models have been developed for predicting the survival probability of injured patients, the majority of whom sustain up to three injuries in six body regions. Practitioners have noted that the accuracy of TRISS predictions is unacceptable for patients with a larger number of injuries. Moreover, the TRISS method is incapable of providing accurate estimates of the predictive density of survival, which are required for calculating confidence intervals. In this paper we propose Bayesian inference for estimating the desired predictive density. The inference is based on decision tree models, which split data along explanatory variables and are therefore interpretable. The proposed method has outperformed the TRISS method in terms of accuracy of prediction on the cases recorded in the US National Trauma Data Bank. The developed method has been made available for evaluation purposes as a stand-alone application. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  14. On the relationship between the prices of oil and the precious metals: Revisiting with a multivariate regime-switching decision tree

    International Nuclear Information System (INIS)

    Charlot, Philippe; Marimoutou, Vêlayoudom

    2014-01-01

    This study examines the volatility and correlation, and the relationships between them, among the euro/US dollar exchange rates, the S&P 500 equity indices, and the prices of WTI crude oil and the precious metals (gold, silver, and platinum) over the period 2005 to 2012. Our model links the univariate volatilities with the correlations via a hidden stochastic decision tree. The ensuing Hidden Markov Decision Tree (HMDT) model is in fact an extension of the Hidden Markov Model (HMM) introduced by Jordan et al. (1997). The architecture of this model is the opposite of that of the classical deterministic approach based on a binary decision tree, and it allows a probabilistic view of the relationship between univariate volatility and correlation. Our results are categorized into three groups, namely (1) exchange rates and oil, (2) S&P 500 indices, and (3) precious metals. Switching dynamics are seen to characterize the volatilities, while, in the case of the correlations, the series switch from one regime to another, this movement reaching a peak during the period of the subprime crisis in the US, and again during the days following the Tohoku earthquake in Japan. Our findings show that the relationships between volatility and correlation depend upon the nature of the series considered, sometimes corresponding to those found in econometric studies, according to which correlation increases in bear markets, and at other times differing from them. - Highlights: • This study examines the volatility and correlation and their relationships of precious metals and crude oil. • Our model links the univariate volatilities with the correlations via a hidden stochastic decision tree. • This model allows a probabilistic point of view of the relationship between univariate volatility and correlation. • Results show the relationships between volatility and correlation are dependent upon the nature of the series considered

  15. Comparison of event tree, fault tree and Markov methods for probabilistic safety assessment and application to accident mitigation

    International Nuclear Information System (INIS)

    James, H.; Harris, M.J.; Hall, S.F.

    1992-01-01

    Probabilistic safety assessment (PSA) is used extensively in the nuclear industry. The main stages of PSA and the traditional event tree method are described. Focussing on hydrogen explosions, an event tree model is compared to a novel Markov model and a fault tree, and an unexpected implication for accident mitigation is revealed. (author)

  16. A decision-tree-based model for evaluating the thermal comfort of horses

    Directory of Open Access Journals (Sweden)

    Ana Paula de Assis Maia

    2013-12-01

    Thermal comfort is of great importance in preserving body temperature homeostasis during thermal stress conditions. Although the thermal comfort of horses has been widely studied, there is no report of its relationship with surface temperature (T_S). This study aimed to assess the potential of data mining techniques as a tool to associate surface temperature with thermal comfort of horses. T_S was obtained using infrared thermography image processing. Physiological and environmental variables were used to define the predicted class, which classified thermal comfort as "comfort" and "discomfort". The variables of armpit, croup, breast and groin T_S of horses and the predicted classes were then subjected to a machine learning process. All variables in the dataset were considered relevant for the classification problem and the decision-tree model yielded an accuracy rate of 74 %. The feature selection methods used to reduce computational cost and simplify predictive learning decreased model accuracy to 70 %; however, the model became simpler with easily interpretable rules. For both these selection methods and for the classification using all attributes, armpit and breast T_S had a higher power rating for predicting thermal comfort. Data mining techniques show promise in the discovery of new variables associated with the thermal comfort of horses.

  17. Integrating individual trip planning in energy efficiency – Building decision tree models for Danish fisheries

    DEFF Research Database (Denmark)

    Bastardie, Francois; Nielsen, J. Rasmus; Andersen, Bo Sølgaard

    2013-01-01

    efficiency for the value of catch per unit of fuel consumed is analysed by merging the questionnaire, logbook and VMS (vessel monitoring system) information. Logic decision trees and conditional behaviour probabilities are established from the responses of fishermen regarding a range of sequential......-intensive but efficient vessels conducting pelagic or industrial fishing are more inclined to base their decision on fish price only, while numerous smaller and less efficient vessels conducting demersal mixed or crustacean fishery usually consider other flexible factors, e.g., the potential for a large catch, weather...... the adaptations of individual fishermen to resource availability dynamics, increasing fuel prices, changes in regulations, and the consequences of socioeconomic external pressures on harvested stocks. A new methodology is described here to obtain quantitative information on the fishermen’s micro-scale decisions...

  18. Prediction of cannabis and cocaine use in adolescence using decision trees and logistic regression

    Directory of Open Access Journals (Sweden)

    Alfonso L. Palmer

    2010-01-01

    Spain is one of the European countries with the highest prevalence of cannabis and cocaine use among young people. The aim of this study was to investigate the factors related to the consumption of cocaine and cannabis among adolescents. A questionnaire was administered to 9,284 students between 14 and 18 years of age in Palma de Mallorca (47.1% boys and 52.9% girls) whose mean age was 15.59 years. Logistic regression and decision trees were carried out in order to model the consumption of cannabis and cocaine. The results show that the use of legal substances and committing fraud or theft are the main variables that raise the odds of consuming cannabis. In boys, cannabis consumption and a family history of drug use increase the odds of consuming cocaine, whereas in girls the use of alcohol, fraud or theft behaviours and difficulty in some personal skills influence their odds of consuming cocaine. Finally, ease of access to the substance greatly raises the odds of consuming cocaine and cannabis in both genders. Decision trees highlight the role of consuming other substances and committing fraud or theft. The results of this study gain importance when it comes to putting into practice effective prevention programmes.

  19. Fast and frugal trees: translating population-based pharmacogenomics to medication prioritization

    NARCIS (Netherlands)

    Rooij, T. van; Roederer, M.; Wareham, H.T.; Rooij, I.J.E.I. van; McLeod, H.L.; Marsh, S.

    2015-01-01

    Aim: Fast and frugal decision trees (FFTs) can simplify clinical decision making by providing a heuristic approach to contextual guidance. We wanted to use FFTs for pharmacogenomic knowledge translation at point-of-care. Materials & Methods: The Pharmacogenomics for Every Nation Initiative (PGENI),

  20. ForEx++: A New Framework for Knowledge Discovery from Decision Forests

    Directory of Open Access Journals (Sweden)

    Md Nasim Adnan

    2017-11-01

    Decision trees are popularly used in a wide range of real world problems for both prediction and classification (logic rule discovery). A decision forest is an ensemble of decision trees and it is often built for achieving better predictive performance compared to a single decision tree. Besides improving predictive performance, a decision forest can be seen as a pool of logic rules (rules) with great potential for knowledge discovery. However, a standard-sized decision forest usually generates a large number of rules that a user may not be able to manage for effective knowledge analysis. In this paper, we propose a new, data set independent framework for extracting those rules that are comparatively more accurate, generalized and concise than others. We apply the proposed framework on rules generated by two different decision forest algorithms from some publicly available medical related data sets on dementia and heart disease. We then compare the quality of rules extracted by the proposed framework with rules generated from a single J48 decision tree and rules extracted by another recent method. The results reported in this paper demonstrate the effectiveness of the proposed framework.

  1. Electronic Nose Odor Classification with Advanced Decision Tree Structures

    Directory of Open Access Journals (Sweden)

    S. Guney

    2013-09-01

    An electronic nose (e-nose) is an electronic device which can measure chemical compounds in air and consequently classify different odors. In this paper, an e-nose device consisting of 8 different gas sensors was designed and constructed. Using this device, 104 different experiments involving 11 different odor classes (moth, angelica root, rose, mint, polis, lemon, rotten egg, egg, garlic, grass, and acetone) were performed. The main contribution of this paper is the finding that, using chemical domain knowledge, it is possible to train an accurate odor classification system. The domain knowledge about chemical compounds is represented by a decision tree whose nodes are composed of classifiers such as Support Vector Machines and k-Nearest Neighbor. The overall accuracy achieved with the proposed algorithm and the constructed e-nose device was 97.18 %. Training and testing data sets used in this paper are published online.

  2. A hybrid approach of stepwise regression, logistic regression, support vector machine, and decision tree for forecasting fraudulent financial statements.

    Science.gov (United States)

    Chen, Suduan; Goo, Yeong-Jia James; Shen, Zone-De

    2014-01-01

    As fraudulent financial statements by enterprises become an increasingly serious problem, establishing a valid model for forecasting fraudulent financial statements has become an important question for academic research and financial practice. After screening the important variables using stepwise regression, the study applies logistic regression, support vector machine, and decision tree methods to construct classification models and compares them. The study adopts financial and nonfinancial variables to assist in establishing the forecasting model. The research subjects are companies with fraudulent and nonfraudulent financial statements between 1998 and 2012. The findings are that financial and nonfinancial information can be used effectively to identify fraudulent financial statements, and that the C5.0 decision tree has the best classification performance, at 85.71%.

  3. Boosted decision trees as an alternative to artificial neural networks for particle identification

    International Nuclear Information System (INIS)

    Roe, Byron P.; Yang Haijun; Zhu Ji; Liu Yong; Stancu, Ion; McGregor, Gordon

    2005-01-01

    The efficacy of particle identification is compared using artificial neural networks and boosted decision trees. The comparison is performed in the context of MiniBooNE, an experiment at Fermilab searching for neutrino oscillations. Based on studies of Monte Carlo samples of simulated data, particle identification with boosting algorithms has better performance than that with artificial neural networks for the MiniBooNE experiment. Although the tests in this paper were for one experiment, it is expected that boosting algorithms will find wide application in physics.

  4. Modeling flash floods in ungauged mountain catchments of China: A decision tree learning approach for parameter regionalization

    Science.gov (United States)

    Ragettli, S.; Zhou, J.; Wang, H.; Liu, C.; Guo, L.

    2017-12-01

    Flash floods in small mountain catchments are one of the most frequent causes of loss of life and property from natural hazards in China. Hydrological models can be a useful tool for the anticipation of these events and the issuing of timely warnings. One of the main challenges of setting up such a system is finding appropriate model parameter values for ungauged catchments. Previous studies have shown that the transfer of parameter sets from hydrologically similar gauged catchments is one of the best performing regionalization methods. However, a remaining key issue is the identification of suitable descriptors of similarity. In this study, we use decision tree learning to explore parameter set transferability in the full space of catchment descriptors. For this purpose, a semi-distributed rainfall-runoff model is set up for 35 catchments in ten Chinese provinces. Hourly runoff data from a total of 858 storm events are used to calibrate the model and to evaluate the performance of parameter set transfers between catchments. We then present a novel technique that uses the splitting rules of classification and regression trees (CART) for finding suitable donor catchments for ungauged target catchments. The ability of the model to detect flood events in assumed ungauged catchments is evaluated in a series of leave-one-out tests. We show that CART analysis increases the probability of detection of 10-year flood events in comparison to a conventional measure of physiographic-climatic similarity by up to 20%. Decision tree learning can outperform other regionalization approaches because it generates rules that optimally consider spatial proximity and physical similarity. Spatial proximity can be used as a selection criterion but is skipped in the case where no similar gauged catchments are in the vicinity. We conclude that the CART regionalization concept is particularly suitable for implementation in sparsely gauged and topographically complex environments where a proximity

  5. Computer aided construction of fault tree

    International Nuclear Information System (INIS)

    Kovacs, Z.

    1982-01-01

    Computer code CAT for the automatic construction of the fault tree is briefly described. Code CAT makes possible simple modelling of components using decision tables, it accelerates the fault tree construction process, constructs fault trees of different complexity, and is capable of harmonized co-operation with programs PREP and KITT 1,2 for fault tree analysis. The efficiency of program CAT and thus the accuracy and completeness of fault trees constructed significantly depends on the compilation and sophistication of decision tables. Currently, program CAT is used in co-operation with programs PREP and KITT 1,2 in reliability analyses of nuclear power plant systems. (B.S.)

  6. A novel decision tree approach based on transcranial Doppler sonography to screen for blunt cervical vascular injuries.

    Science.gov (United States)

    Purvis, Dianna; Aldaghlas, Tayseer; Trickey, Amber W; Rizzo, Anne; Sikdar, Siddhartha

    2013-06-01

    Early detection and treatment of blunt cervical vascular injuries prevent adverse neurologic sequelae. Current screening criteria can miss up to 22% of these injuries. The study objective was to investigate bedside transcranial Doppler sonography for detecting blunt cervical vascular injuries in trauma patients using a novel decision tree approach. This prospective pilot study was conducted at a level I trauma center. Patients undergoing computed tomographic angiography for suspected blunt cervical vascular injuries were studied with transcranial Doppler sonography. Extracranial and intracranial vasculatures were examined with a portable power M-mode transcranial Doppler unit. The middle cerebral artery mean flow velocity, pulsatility index, and their asymmetries were used to quantify flow patterns and develop an injury decision tree screening protocol. Student t tests validated associations between injuries and transcranial Doppler predictive measures. We evaluated 27 trauma patients with 13 injuries. Single vertebral artery injuries were most common (38.5%), followed by single internal carotid artery injuries (30%). Compared to patients without injuries, mean flow velocity asymmetry was higher for single internal carotid artery (P = .003) and single vertebral artery (P = .004) injuries. Similarly, pulsatility index asymmetry was higher in single internal carotid artery (P = .015) and single vertebral artery (P = .042) injuries, whereas the lowest pulsatility index was elevated for bilateral vertebral artery injuries (P = .006). The decision tree yielded 92% specificity, 93% sensitivity, and 93% correct classifications. In this pilot feasibility study, transcranial Doppler measures were significantly associated with the blunt cervical vascular injury status, suggesting that transcranial Doppler sonography might be a viable bedside screening tool for trauma. Patient-specific hemodynamic information from transcranial Doppler assessment has the potential to alter

  7. An improved spatial contour tree constructed method

    Science.gov (United States)

    Zheng, Yi; Zhang, Ling; Guilbert, Eric; Long, Yi

    2018-05-01

    Contours are important data to delineate the landform on a map. A contour tree provides an object-oriented description of landforms and can be used to enrich the topological information. The traditional contour tree is used to store topological relationships between contours in a hierarchical structure and allows for the identification of eminences and depressions as sets of nested contours. This research proposes an improved contour tree, the so-called spatial contour tree, that contains not only the topological but also the geometric information. It can be regarded as a terrain skeleton in three dimensions, and it is established based on the spatial nodes of contours, which have latitude, longitude and elevation information. The spatial contour tree is built by connecting spatial nodes from low to high elevation for a positive landform, and from high to low elevation for a negative landform, to form a hierarchical structure. The connection between two spatial nodes can provide the real distance and direction as a Euclidean vector in three dimensions. In this paper, the construction method is tested experimentally, and the results are discussed. The proposed hierarchical structure is three-dimensional and can show the skeleton inside a terrain. The structure, where all nodes have geo-information, can be used to distinguish different landforms and applied for contour generalization with consideration of geographic characteristics.

  8. Decision analytic methods in RODOS

    International Nuclear Information System (INIS)

    Borzenko, V.; French, S.

    1996-01-01

    In the event of a nuclear accident, RODOS seeks to provide decision support at all levels ranging from the largely descriptive to providing a detailed evaluation of the benefits and disadvantages of various countermeasure strategies and ranking them according to the societal preferences as perceived by the decision makers. To achieve this, it must draw upon several decision analytic methods and bring them together in a coherent manner so that the guidance offered to decision makers is consistent from one stage of an accident to the next. The methods used draw upon multi-attribute value and utility theories.

  9. What Satisfies Students? Mining Student-Opinion Data with Regression and Decision-Tree Analysis. AIR 2002 Forum Paper.

    Science.gov (United States)

    Thomas, Emily H.; Galambos, Nora

    To investigate how students' characteristics and experiences affect satisfaction, this study used regression and decision-tree analysis with the CHAID algorithm to analyze student opinion data from a sample of 1,783 college students. A data-mining approach identifies the specific aspects of students' university experience that most influence three…

  10. Application Of Decision Tree Approach To Student Selection Model- A Case Study

    Science.gov (United States)

    Harwati; Sudiya, Amby

    2016-01-01

    The main purpose of the institution is to provide quality education to its students and to improve the quality of managerial decisions. One way to improve the quality of students is to make the selection of new students more selective. This research takes as its case the selection of new students at the Islamic University of Indonesia, Yogyakarta, Indonesia. One of the university's selection routes is an administrative screening based on the high school records of prospective students, without a written test. Currently, this kind of selection does not yet have a standard model and criteria. Selection is done only by comparing candidates' application files, so the assessment is prone to subjectivity because there are no standard criteria that can differentiate the quality of one student from another. By applying data mining classification techniques, a selection model for new students can be built that includes standard criteria such as area of origin, school status, average grade, and so on. These criteria are determined using rules derived from classifying the academic achievement (GPA) of students from previous years who entered the university through the same route. The decision tree method with the C4.5 algorithm is used here. The results show that the students given priority for admission are those who meet the following criteria: they come from the island of Java, attended a public school, majored in science, have an average grade above 75, and have at least one achievement during their high school studies.

  11. New approaches to evaluating fault trees

    International Nuclear Information System (INIS)

    Sinnamon, R.M.; Andrews, J.D.

    1997-01-01

    Fault Tree Analysis is now a widely accepted technique to assess the probability and frequency of system failure in many industries. For complex systems an analysis may produce hundreds of thousands of combinations of events which can cause system failure (minimal cut sets). The determination of these cut sets can be a very time consuming process even on modern high speed digital computers. Computerised methods, such as bottom-up or top-down approaches, to conduct this analysis are now so well developed that further refinement is unlikely to result in vast reductions in computer time. It is felt that substantial improvement in computer utilisation will only result from a completely new approach. This paper describes the use of a Binary Decision Diagram for Fault Tree Analysis and some ways in which it can be efficiently implemented on a computer. In particular, attention is given to the production of a minimum form of the Binary Decision Diagram by considering the ordering that has to be given to the basic events of the fault tree
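
    As a rough illustration of the idea in this abstract (not the authors' implementation), the sketch below builds a reduced ordered Binary Decision Diagram for a small fault tree by repeated Shannon decomposition and evaluates the top-event probability from it. The example tree, the event names `a1`..`b3`, the probability 0.1, and the helper names `build_bdd` and `top_event_probability` are all assumptions made for illustration. The construction enumerates every assignment, so it is only suitable for toy examples.

```python
def build_bdd(func, order):
    """Build a reduced ordered BDD for a Boolean function.

    func  : callable taking {event_name: bool} and returning bool
    order : list of basic-event names (the variable ordering)
    Returns (root, nodes); nodes maps node id -> (event, low, high).
    Terminal nodes are the integers 0 and 1; other ids start at 2.
    """
    unique = {}  # (event, low, high) -> id, so identical subgraphs are shared

    def make_node(event, low, high):
        if low == high:              # the test on this event is redundant
            return low
        key = (event, low, high)
        if key not in unique:
            unique[key] = len(unique) + 2
        return unique[key]

    def expand(index, assignment):
        # Shannon decomposition on order[index]: one branch per truth value.
        if index == len(order):
            return 1 if func(assignment) else 0
        event = order[index]
        low = expand(index + 1, {**assignment, event: False})
        high = expand(index + 1, {**assignment, event: True})
        return make_node(event, low, high)

    root = expand(0, {})
    nodes = {node_id: key for key, node_id in unique.items()}
    return root, nodes


def top_event_probability(root, nodes, prob):
    """Top-event probability, assuming independent basic events."""
    cache = {}

    def p(node):
        if node in (0, 1):
            return float(node)
        if node not in cache:
            event, low, high = nodes[node]
            cache[node] = prob[event] * p(high) + (1.0 - prob[event]) * p(low)
        return cache[node]

    return p(root)


# Hypothetical fault tree: TOP = (a1 AND b1) OR (a2 AND b2) OR (a3 AND b3)
def top(v):
    return (v["a1"] and v["b1"]) or (v["a2"] and v["b2"]) or (v["a3"] and v["b3"])

prob = {name: 0.1 for name in ["a1", "b1", "a2", "b2", "a3", "b3"]}

interleaved = ["a1", "b1", "a2", "b2", "a3", "b3"]
grouped = ["a1", "a2", "a3", "b1", "b2", "b3"]

root_i, nodes_i = build_bdd(top, interleaved)
root_g, nodes_g = build_bdd(top, grouped)

print("nodes (interleaved ordering):", len(nodes_i))
print("nodes (grouped ordering):   ", len(nodes_g))
print("P(top event):", top_event_probability(root_i, nodes_i, prob))
```

    For this toy tree the interleaved ordering needs 6 decision nodes and the grouped ordering needs 14, while both give the same top-event probability. This is the kind of ordering sensitivity the abstract refers to.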

  12. Understanding Boswellia papyrifera tree secondary metabolites through bark spectral analysis

    NARCIS (Netherlands)

    Girma, A.; Skidmore, A.K.; Bie, de C.A.J.M.; Bongers, F.

    2015-01-01

    Decision makers are concerned about whether to tap or rest Boswellia papyrifera trees. Tapping for the production of frankincense is known to deplete carbon reserves from the tree, leading to production of less viable seeds, tree carbon starvation and ultimately tree mortality. Decision makers use

  13. Spatial Decision Support Systems

    Directory of Open Access Journals (Sweden)

    Silviu Ioan Bejinariu

    2015-10-01

    Satellite image processing is an important tool for decision making in domains like agriculture, forestry, and hydrology, both for normal activity tracking and in special situations caused by natural disasters. This paper proposes a method for evaluating forestry surfaces in terms of occupied area and number of trees. The segmentation method is based on the watershed transform, which performs well when the objects to detect have connected borders. The method is applied to automatic multi-temporal analysis of forestry areas and is a useful instrument for decision makers.

  14. A Hybrid Approach of Stepwise Regression, Logistic Regression, Support Vector Machine, and Decision Tree for Forecasting Fraudulent Financial Statements

    Directory of Open Access Journals (Sweden)

    Suduan Chen

    2014-01-01

    As fraudulent financial statements by enterprises become an increasingly serious problem, establishing a valid model for forecasting fraudulent financial statements has become an important question for academic research and financial practice. After screening the important variables using stepwise regression, the study applies logistic regression, support vector machine, and decision tree methods to construct classification models and compares them. The study adopts financial and nonfinancial variables to assist in establishing the forecasting model. The research subjects are companies with fraudulent and nonfraudulent financial statements between 1998 and 2012. The findings are that financial and nonfinancial information can be used effectively to identify fraudulent financial statements, and that the C5.0 decision tree has the best classification performance, at 85.71%.

  15. Comparison between Decision Tree and Genetic Programming to distinguish healthy from stroke postural sway patterns.

    Science.gov (United States)

    Marrega, Luiz H G; Silva, Simone M; Manffra, Elisangela F; Nievola, Julio C

    2015-01-01

    Maintaining balance is a motor task of crucial importance for humans to perform their daily activities safely and independently. Studies in the field of Artificial Intelligence have considered different classification methods in order to distinguish healthy subjects from patients with certain motor disorders based on their postural strategies during balance control. The main purpose of this paper is to compare the performance of Decision Tree (DT) and Genetic Programming (GP), both classification methods that are easy for health professionals to interpret, in distinguishing postural sway patterns produced by healthy and stroke individuals based on 16 widely used posturographic variables. For this purpose, we used a posturographic dataset of time-series of center-of-pressure displacements derived from 19 stroke patients and 19 healthy matched subjects in three quiet standing tasks of balance control. Then, DT and GP models were trained and tested in two different experiments where accuracy, sensitivity and specificity were adopted as performance metrics. The DT method performed statistically significantly better (P < 0.05) in both cases, showing for example an accuracy of 72.8% against 69.2% from GP in the second experiment of this paper.
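
    The three performance metrics named in this abstract can be computed from a confusion matrix as in the minimal sketch below. The counts are invented purely for illustration and are not results from the paper. Treating "stroke" as the positive class and the function name `classification_metrics` are also assumptions.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts.

    tp: stroke subjects classified as stroke     fn: stroke classified as healthy
    tn: healthy subjects classified as healthy   fp: healthy classified as stroke
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,      # share of all subjects classified correctly
        "sensitivity": tp / (tp + fn),      # share of stroke subjects detected
        "specificity": tn / (tn + fp),      # share of healthy subjects recognised
    }


# Hypothetical counts for 19 stroke and 19 healthy subjects (not from the paper).
metrics = classification_metrics(tp=14, fn=5, tn=13, fp=6)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```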

  16. Recovery of crown mass for energy with whole-tree skidding methods; Puupolttoaineen tuottaminen kokopuujuontomenetelmillae

    Energy Technology Data Exchange (ETDEWEB)

    Nousiainen, I [Finntech Ltd Oy, Jyvaeskylae (Finland); Vesisenaho, T [VTT Energy, Jyvaeskylae (Finland)

    1997-12-31

    The main aim of the project 'Recovery of crown mass for energy with whole-tree skidding methods' was to develop an integrated harvesting method for wood raw material and wood fuel based on whole-tree skidding. The developed method also makes it possible to deliver raw material to sawmills in the form of log sections. In the harvesting chain under development, whole trees are felled and bunched with a normal one-grip harvester. The whole trees are skidded to the roadside by a forwarder equipped with a clam bunk. At the roadside the trees are delimbed and cut with the one-grip harvester used for felling and bunching. According to the results of the field tests, the harvesting costs of logging residues are, in certain final cutting conditions, even under 10 FIM/m³ when the average stem size is over 0.500 m³. In the developed method, felling and bunching of whole trees with the one-grip harvester and skidding of whole trees with the clam skidder succeeded well. The problems of the method are concentrated in the delimbing and bucking of whole trees at the landing site.

  17. Recovery of crown mass for energy with whole-tree skidding methods; Puupolttoaineen tuottaminen kokopuujuontomenetelmillae

    Energy Technology Data Exchange (ETDEWEB)

    Nousiainen, I. [Finntech Ltd Oy, Jyvaeskylae (Finland); Vesisenaho, T. [VTT Energy, Jyvaeskylae (Finland)

    1996-12-31

    The main aim of the project 'Recovery of crown mass for energy with whole-tree skidding methods' was to develop an integrated harvesting method for wood raw material and wood fuel based on whole-tree skidding. The developed method also makes it possible to deliver raw material to sawmills in the form of log sections. In the harvesting chain under development, whole trees are felled and bunched with a normal one-grip harvester. The whole trees are skidded to the roadside by a forwarder equipped with a clam bunk. At the roadside the trees are delimbed and cut with the one-grip harvester used for felling and bunching. According to the results of the field tests, the harvesting costs of logging residues are, in certain final cutting conditions, even under 10 FIM/m³ when the average stem size is over 0.500 m³. In the developed method, felling and bunching of whole trees with the one-grip harvester and skidding of whole trees with the clam skidder succeeded well. The problems of the method are concentrated in the delimbing and bucking of whole trees at the landing site.

  18. A Decision-Tree-Oriented Guidance Mechanism for Conducting Nature Science Observation Activities in a Context-Aware Ubiquitous Learning

    Science.gov (United States)

    Hwang, Gwo-Jen; Chu, Hui-Chun; Shih, Ju-Ling; Huang, Shu-Hsien; Tsai, Chin-Chung

    2010-01-01

    A context-aware ubiquitous learning environment is an authentic learning environment with personalized digital supports. While showing the potential of applying such a learning environment, researchers have also indicated the challenges of providing adaptive and dynamic support to individual students. In this paper, a decision-tree-oriented…

  19. Study on the scope of fault tree method applicability

    International Nuclear Information System (INIS)

    Ito, Taiju

    1980-03-01

    In fault tree analyses of the reliability of nuclear safety systems, including reliability analyses of nuclear protection systems, some documents appear to apply the fault tree method unreasonably. The fault tree method usually relies on the addition rule and the multiplication rule, both of which must hold exactly or at least practically. The addition rule poses no problem, but the multiplication rule occasionally does. Whether the multiplication rule holds has been studied comprehensively for the unreliability, mean unavailability, and instantaneous unavailability of the elements. The multiplication rule holds between the unreliabilities of elements without maintenance, and also between the instantaneous unavailabilities of elements, with or without maintenance. It does not hold, however, between the unreliabilities of subsystems with maintenance, because the product is larger than the unreliability of a parallel system consisting of the two subsystems with maintenance. Nor does it hold between the mean unavailabilities of elements without maintenance, because the product is smaller than the mean unavailability of a parallel system consisting of the two elements without maintenance. In these cases, therefore, the fault tree method should not be applied by rote for reliability analysis of the system. (author)
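
    The mean-unavailability case can be checked numerically with a small sketch. It assumes two identical, independent, non-repaired elements with exponentially distributed failure times. The failure rate, the mission time, and all function names are illustrative assumptions and are not taken from the report. Under these assumptions the instantaneous unavailability of the parallel system is the product of the element values, yet the product of the element mean unavailabilities underestimates the mean unavailability of the parallel system.

```python
import math

def mean_over_mission(func, mission_time, steps=10_000):
    """Average of func(t) over [0, mission_time], using the midpoint rule."""
    dt = mission_time / steps
    return sum(func((k + 0.5) * dt) for k in range(steps)) * dt / mission_time

failure_rate = 1.0e-3   # per hour (assumed value)
mission_time = 1000.0   # hours (assumed value)

def element_unavailability(t):
    # Without maintenance, unavailability at time t equals unreliability at t.
    return 1.0 - math.exp(-failure_rate * t)

def parallel_unavailability(t):
    # The multiplication rule holds for instantaneous values of independent elements.
    return element_unavailability(t) ** 2

element_mean = mean_over_mission(element_unavailability, mission_time)
product_of_means = element_mean ** 2
parallel_mean = mean_over_mission(parallel_unavailability, mission_time)

print("product of element mean unavailabilities:", product_of_means)
print("mean unavailability of parallel system:  ", parallel_mean)
```

    The gap arises because averaging over time and multiplying do not commute: both element unavailabilities grow together over the mission, so the time average of their product exceeds the product of their time averages.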

  20. Section-Based Tree Species Identification Using Airborne LIDAR Point Cloud

    Science.gov (United States)

    Yao, C.; Zhang, X.; Liu, H.

    2017-09-01

    The application of LiDAR data in forestry initially focused on mapping forest communities, intended primarily for large-scale forest management and planning. With the availability of LiDAR data with smaller footprints and higher sampling density, detecting individual overstory trees, estimating crown parameters and identifying tree species have been demonstrated to be practicable. This paper proposes a section-based protocol for tree species identification, taking palm trees as an example. The section-based method detects objects through profiles taken along different directions, basically along the X-axis or Y-axis. This method improves the utilization of spatial information to generate accurate results. First, tree points are separated from manmade-object points by decision-tree-based rules, and a Crown Height Model (CHM) is created by subtracting the Digital Terrain Model (DTM) from the Digital Surface Model (DSM). Then key points are calculated and extracted to locate individual trees, and specific tree parameters related to species information, such as crown height, crown radius, and cross point, are estimated. Finally, these parameters are used to identify certain tree species. Compared with species information measured on the ground, the proportion of correctly identified trees on all plots reached up to 90.65 %. The identification results in this research demonstrate the ability to distinguish palm trees using LiDAR point clouds. Furthermore, with more prior knowledge, the section-based method enables the process to classify trees into different classes.
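
    The CHM step described above (DSM minus DTM) can be sketched on a toy raster as follows. The grid values are invented. The local-maximum search used to pick candidate tree tops is a common simplification assumed here for illustration; it is not the section-based key-point extraction of the paper, whose details the abstract does not give.

```python
def crown_height_model(dsm, dtm):
    """Cell-by-cell difference DSM - DTM for two equally sized grids."""
    return [[s - t for s, t in zip(dsm_row, dtm_row)]
            for dsm_row, dtm_row in zip(dsm, dtm)]


def local_maxima(chm, min_height):
    """Cells at least min_height tall and strictly higher than all 8-neighbours."""
    rows, cols = len(chm), len(chm[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            height = chm[r][c]
            if height < min_height:
                continue
            neighbours = [chm[rr][cc]
                          for rr in range(max(0, r - 1), min(rows, r + 2))
                          for cc in range(max(0, c - 1), min(cols, c + 2))
                          if (rr, cc) != (r, c)]
            if all(height > n for n in neighbours):
                peaks.append((r, c, height))
    return peaks


# Toy elevation grids in metres (hypothetical values).
dtm = [[100, 100, 101, 101],
       [100, 100, 101, 101],
       [100, 101, 101, 102],
       [100, 101, 102, 102]]
dsm = [[100, 103, 101, 101],
       [103, 108, 104, 101],
       [100, 104, 101, 105],
       [100, 101, 105, 111]]

chm = crown_height_model(dsm, dtm)
tree_tops = local_maxima(chm, min_height=5)
print(chm)
print(tree_tops)
```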

  1. Evaluation of four methods for estimating leaf area of isolated trees

    Science.gov (United States)

    P.J. Peper; E.G. McPherson

    2003-01-01

    The accurate modeling of the physiological and functional processes of urban forests requires information on the leaf area of urban tree species. Several non-destructive, indirect leaf area sampling methods have shown good performance for homogenous canopies. These methods have not been evaluated for use in urban settings where trees are typically isolated and...

  2. A novel decision diagrams extension method

    International Nuclear Information System (INIS)

    Li, Shumin; Si, Shubin; Dui, Hongyan; Cai, Zhiqiang; Sun, Shudong

    2014-01-01

    A binary decision diagram (BDD) is a graph-based representation of Boolean functions. It is a directed acyclic graph (DAG) based on Shannon's decomposition. The multi-state multi-valued decision diagram (MMDD) is a natural extension of the BDD for the symbolic representation and manipulation of multi-valued logic functions. This paper proposes a decision diagram extension method that builds on the original BDD/MMDD when the scale of a reliability system is extended. Following a discussion of the decomposition and physical meaning of BDD and MMDD, the modeling method of BDD/MMDD based on the original BDD/MMDD is introduced. Three case studies are implemented to demonstrate the presented methods. Compared with traditional BDD and MMDD generation methods, the decision diagram extension method is more computationally efficient, as shown by the running time.
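
    Shannon's decomposition, on which the BDD is based, splits a Boolean function on one variable at a time: f = x·f(x=1) + x̄·f(x=0). The sketch below illustrates that idea only; it is not the extension method proposed in the paper. It builds a small decision diagram as nested tuples without node sharing, and the 2-out-of-3 structure function and component probabilities are hypothetical.

```python
from itertools import product


def shannon_expand(f, n_vars, assignment=()):
    """Expand Boolean function f on variables 0..n_vars-1 into (index, low, high) nodes."""
    if len(assignment) == n_vars:
        return bool(f(*assignment))
    index = len(assignment)
    low = shannon_expand(f, n_vars, assignment + (False,))
    high = shannon_expand(f, n_vars, assignment + (True,))
    # Reduction rule: a node whose two branches are identical is redundant.
    return low if low == high else (index, low, high)


def evaluate(node, values):
    """Follow the diagram from the root to a terminal for one variable assignment."""
    while not isinstance(node, bool):
        index, low, high = node
        node = high if values[index] else low
    return node


def probability(node, p):
    """Probability that the function is true, given independent P(x_i = True) = p[i]."""
    if isinstance(node, bool):
        return 1.0 if node else 0.0
    index, low, high = node
    return p[index] * probability(high, p) + (1.0 - p[index]) * probability(low, p)


def two_of_three(a, b, c):
    """Hypothetical structure function: the system works if at least two components work."""
    return (a and b) or (a and c) or (b and c)


bdd = shannon_expand(two_of_three, 3)
print(probability(bdd, [0.9, 0.9, 0.9]))
```

    The recursive probability evaluation mirrors the decomposition itself, which is why decision diagrams are convenient for reliability calculations.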

  3. Classification tree for the assessment of sedentary lifestyle among hypertensive

    Directory of Open Access Journals (Sweden)

    Larissa Castelo Guedes Martins

    Objective. To develop a classification tree of clinical indicators for the correct prediction of the nursing diagnosis "Sedentary lifestyle" (SL) in people with high blood pressure (HTN). Methods. A cross-sectional study was conducted in an outpatient care center specializing in high blood pressure and diabetes mellitus located in northeastern Brazil. The sample consisted of 285 people between 19 and 59 years old diagnosed with high blood pressure. An interview and physical examination were applied to obtain socio-demographic information, related factors, and the signs and symptoms that made up the defining characteristics for the diagnosis under study. The tree was generated using the CHAID algorithm (Chi-square Automatic Interaction Detection). Results. The construction of the decision tree allowed the interactions between clinical indicators to be established, facilitating a probabilistic analysis of multiple situations and allowing quantification of the probability of an individual presenting a sedentary lifestyle. The tree included the clinical indicator Choose daily routine without exercise as the first node. People with this indicator showed a probability of 0.88 of presenting the SL. The second node was composed of the indicator Does not perform physical activity during leisure, with a 0.99 probability of presenting the SL when both indicators were present. The predictive capacity of the tree was established at 69.5%. Conclusion. Decision trees help nurses who care for people with HTN in decision-making by assessing the characteristics that increase the probability of the SL nursing diagnosis, optimizing the time for diagnostic inference.
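
    CHAID chooses each split by testing the association between candidate predictors and the outcome with a chi-square statistic. The sketch below shows only that core step, in simplified form: it omits the category merging and Bonferroni adjustment that CHAID also applies, and the indicator names and counts are hypothetical, not data from the study.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts: rows = indicator present / absent, columns = SL yes / no.
candidates = {
    "indicator_A": [[40, 10], [15, 35]],
    "indicator_B": [[30, 20], [25, 25]],
}

# All candidates here are 2x2 tables (one degree of freedom), so the largest
# statistic corresponds to the smallest p-value.
best = max(candidates, key=lambda name: chi_square(candidates[name]))
print(best, round(chi_square(candidates[best]), 2))
```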

  4. Discovering Patterns in Brain Signals Using Decision Trees

    Directory of Open Access Journals (Sweden)

    Narusci S. Bastos

    2016-01-01

    Even with emerging technologies, such as Brain-Computer Interface (BCI) systems, understanding how our brains work is a very difficult challenge. We therefore propose to use a data mining technique to help with this task. As a case study, we analyzed the brain behaviour of blind people and sighted people in a spatial activity. There is a common belief that blind people compensate for their lack of vision using the other senses. If an object is given to sighted people and they are asked to identify it, the sense of vision will probably be the most determinant one. If the same experiment is repeated with blind people, they will have to use other senses to identify the object. In this work, we propose a methodology that uses decision trees (DT) to investigate how the brains of blind people and sighted people react differently to a spatial problem. We chose the DT algorithm because it can discover patterns in the brain signal, and its presentation is human interpretable. Our results show that using DT to analyze brain signals can help us to understand the brain’s behaviour.

  5. Investigation of Growth and Survival of Transplanted Plane and Pine Trees According to IBA Application, Tree Age, Transplanting Time and Method

    Directory of Open Access Journals (Sweden)

    N. Etemadi

    2015-03-01

    The major problems in transplanting landscape trees are the high mortality and low establishment rate of transplanted trees, especially in the first year. In order to identify the best conditions for successful transplanting of pine and plane trees in the Isfahan landscape, the present study was carried out based on a completely randomized block design with four replicates and three treatments: transplanting method (balled and burlapped, and bare root), tree age (immature and mature) and IBA application (0 and 150 mg/L). Trees were transplanted during 2009 and 2010 at three times (dormant season, early and late growing season). Survival rate and Relative Growth Rate indices based on tree height (RGRH) and trunk diameter (RGRD) were measured during the first and second years. Trees transplanted early in the growing season showed the highest survival percentage during the two years, as compared to other transplanting dates. Survival of balled and burlapped and immature transplanted trees was significantly greater than that of bare root or mature trees. The significant effect of the age treatment continued in the second year. IBA treatment had no effect on the survival rate of the studied species. Balled and burlapped and immature transplanted pine trees also had higher RGRH and RGRD compared to bare root or mature trees. According to the results of this study, the early growing season is the best time for transplanting pine and plane trees. Also, transplanting of immature trees using the balled and burlapped method is recommended to increase the survival and establishment rate.

  6. A New Decision Tree to Solve the Puzzle of Alzheimer's Disease Pathogenesis Through Standard Diagnosis Scoring System.

    Science.gov (United States)

    Kumar, Ashwani; Singh, Tiratha Raj

    2017-03-01

    Alzheimer's disease (AD) is a progressive, incurable and terminal neurodegenerative disorder of the brain and is associated with mutations in amyloid precursor protein, presenilin 1, presenilin 2 or apolipoprotein E, but its underlying mechanisms are still not fully understood. The healthcare sector is generating a large amount of information corresponding to diagnosis, disease identification and treatment of individuals. Mining knowledge from clinical datasets and providing scientific decision-making for the diagnosis and treatment of disease are therefore becoming increasingly necessary. The current study deals with the construction of classifiers that are human readable as well as robust in performance for a gene dataset of AD using a decision tree. Classification models for different AD genes were generated according to Mini-Mental State Examination scores and all other vital parameters to identify the expression level of different proteins of the disorder that may determine the involvement of genes in various AD pathogenesis pathways. The effectiveness of the decision tree in AD diagnosis is determined by information gain with confidence value (0.96), specificity (92%), sensitivity (98%) and accuracy (77%). Besides this functional gene classification using different parameters and enrichment analysis, our findings indicate that the measures of all the genes assessed in single cohorts are sufficient to diagnose AD and will help in the prediction of important parameters for other relevant assessments.

  7. Comparison of tree types of models for the prediction of final academic achievement

    Directory of Open Access Journals (Sweden)

    Silvana Gasar

    2002-12-01

    For efficient prevention of inappropriate secondary school choices, and thereby of academic failure, school counselors need a tool for predicting an individual pupil's final academic achievement. Using data mining techniques on a pupils' database and expert modeling, we developed several models for the prediction of final academic achievement in an individual high school educational program. For data mining, we used statistical analyses, clustering and two machine learning methods: developing classification decision trees and hierarchical decision models. Using the expert system shell DEX, an expert system based on a hierarchical multi-attribute decision model was developed manually. All the models were validated and evaluated from the viewpoint of their applicability. The predictive accuracy of DEX models and decision trees was equal and very satisfying, as it reached the predictive accuracy of an experienced counselor. With respect to the efficiency and difficulties of developing models, and the relatively rapid changes in our education system, we propose that decision trees be used in the further development of predictive models.

  8. A prediction rule for the development of delirium among patients in medical wards: Chi-Square Automatic Interaction Detector (CHAID) decision tree analysis model.

    Science.gov (United States)

    Kobayashi, Daiki; Takahashi, Osamu; Arioka, Hiroko; Koga, Shinichiro; Fukui, Tsuguya

    2013-10-01

    To predict development of delirium among patients in medical wards by a Chi-Square Automatic Interaction Detector (CHAID) decision tree model. This was a retrospective cohort study of all adult patients admitted to medical wards at a large community hospital. The subject patients were randomly assigned to either a derivation or validation group (2:1) by computed random number generation. Baseline data and clinically relevant factors were collected from the electronic chart. Primary outcome was the development of delirium during hospitalization. All potential predictors were included in a forward stepwise logistic regression model. CHAID decision tree analysis was also performed to make another prediction model with the same group of patients. Receiver operating characteristic curves were drawn, and the area under the curves (AUCs) were calculated for both models. In the validation group, these receiver operating characteristic curves and AUCs were calculated based on the rules from derivation. A total of 3,570 patients were admitted: 2,400 patients assigned to the derivation group and 1,170 to the validation group. A total of 91 and 51 patients, respectively, developed delirium. Statistically significant predictors were delirium history, age, underlying malignancy, and activities of daily living impairment in CHAID decision tree model, resulting in six distinctive groups by the level of risk. AUC was 0.82 in derivation and 0.82 in validation with CHAID model and 0.78 in derivation and 0.79 in validation with logistic model. We propose a validated CHAID decision tree prediction model to predict the development of delirium among medical patients. Copyright © 2013 American Association for Geriatric Psychiatry. Published by Elsevier Inc. All rights reserved.
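
    The AUC values reported above can be read as the probability that a randomly chosen patient who developed delirium receives a higher predicted risk than a randomly chosen patient who did not. The sketch below computes the AUC directly from that pairwise definition; the risk scores are hypothetical and are not taken from the study.

```python
def auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical predicted risks for patients with and without the outcome.
with_outcome = [0.9, 0.8, 0.4]
without_outcome = [0.7, 0.3, 0.2, 0.4]
print(auc(with_outcome, without_outcome))
```

    Computing this on a held-out validation group with the rules fixed from the derivation group, as the study did, guards against an optimistic estimate.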

  9. Interacting with mobile devices by fusion eye and hand gestures recognition systems based on decision tree approach

    Science.gov (United States)

    Elleuch, Hanene; Wali, Ali; Samet, Anis; Alimi, Adel M.

    2017-03-01

    Two systems, for eye and hand gesture recognition, are used to control mobile devices. Based on real-time video streaming captured from the device's camera, the first system recognizes the motion of the user's eyes and the second one detects static hand gestures. To avoid any confusion between natural and intentional movements, we developed a system to fuse the decisions coming from the eye and hand gesture recognition systems. The fusion phase was based on a decision tree approach. We conducted a study on 5 volunteers, and the results show that our system is robust and competitive.

  10. Decision tree analysis as a supplementary tool to enhance histomorphological differentiation when distinguishing human from non-human cranial bone in both burnt and unburnt states: A feasibility study.

    Science.gov (United States)

    Simmons, T; Goodburn, B; Singhrao, S K

    2016-01-01

    This feasibility study was undertaken to describe and record the histological characteristics of burnt and unburnt cranial bone fragments from human and non-human bones. Reference series of fully mineralized, transverse sections of cranial bone, from all variables and specimen states, were prepared by manual cutting and semi-automated grinding and polishing methods. A photomicrograph catalogue reflecting differences in burnt and unburnt bone from human and non-humans was recorded and qualitative analysis was performed using an established classification system based on primary bone characteristics. The histomorphology associated with human and non-human samples was, for the main part, preserved following burning at high temperature. Clearly, fibro-lamellar complex tissue subtypes, such as plexiform or laminar primary bone, were only present in non-human bones. A decision tree analysis based on histological features provided a definitive identification key for distinguishing human from non-human bone, with an accuracy of 100%. The decision tree for samples where burning was unknown was 96% accurate, and multi-step classification to taxon was possible with 100% accuracy. The results of this feasibility study strongly suggest that histology remains a viable alternative technique if fragments of cranial bone require forensic examination in both burnt and unburnt states. The decision tree analysis may provide an additional but vital tool to enhance data interpretation. Further studies are needed to assess variation in histomorphology taking into account other cranial bones, ontogeny, species and burning conditions. © The Author(s) 2015.

  11. LWR design decision methodology. Phase III. Final report

    International Nuclear Information System (INIS)

    Bertucio, R.; Held, J.; Lainoff, S.; Leahy, T.; Prather, W.; Rees, D.; Young, J.

    1982-01-01

    Traditionally, management decisions regarding design options have been made using quantitative cost information and qualitative safety information. A Design Decision Methodology, which utilizes probabilistic risk assessment techniques, including event trees and fault trees, along with systems engineering and standard cost estimation methods, has been developed so that a quantitative safety measure may be obtained as well. The report documents the development of this Design Decision Methodology, a demonstration of the methodology on a current licensing issue with the cooperation of the Washington Public Power Supply System (WPPSS), and a discussion of how the results of the demonstration may be used in addressing the various issues associated with a licensing position on the issue.

  12. Decision tree sensitivity analysis for cost-effectiveness of chest FDG-PET in patients with a pulmonary tumor (non-small cell carcinoma)

    International Nuclear Information System (INIS)

    Kosuda, Shigeru; Watanabe, Masumi; Kobayashi, Hideo; Kusano, Shoichi; Ichihara, Kiyoshi

    1998-01-01

    Decision tree analysis was used to assess the cost-effectiveness of chest FDG-PET in patients with a pulmonary tumor (non-small cell carcinoma, ≤Stage IIIB), based on the data of the current decision tree. Decision tree models were constructed with two competing strategies (CT alone and CT plus chest FDG-PET) in a population of 1,000 patients with 71.4% prevalence. Baseline values of FDG-PET sensitivity and specificity for the detection of lung cancer and lymph node metastasis, and of mortality and life expectancy, were taken from references. The chest CT plus chest FDG-PET strategy increased the total cost by 10.5% when a chest FDG-PET study costs 0.1 million yen, since it increased the number of mediastinoscopies and curative thoracotomies despite halving the number of bronchofiberscopies. However, the strategy resulted in a remarkable increase of 115 patients with curable thoracotomy and a decrease of 51 patients with non-curable thoracotomy. In addition, average life expectancy increased by 0.607 year/patient, which means that the increase in medical cost is approximately 218,080 yen/year/patient when a chest FDG-PET study costs 0.1 million yen. In conclusion, the chest CT plus chest FDG-PET strategy might not be cost-effective in Japan, but we are convinced that the strategy is useful in cost-benefit analysis. (author)

  13. Rapid decision support tool based on novel ecosystem service variables for retrofitting of permeable pavement systems in the presence of trees.

    Science.gov (United States)

    Scholz, Miklas; Uzomah, Vincent C

    2013-08-01

    The retrofitting of sustainable drainage systems (SuDS) such as permeable pavements is currently undertaken ad hoc using expert experience supported by minimal guidance based predominantly on hard engineering variables. There is a lack of practical decision support tools useful for a rapid assessment of the potential of ecosystem services when retrofitting permeable pavements in urban areas that either feature existing trees or should be planted with trees in the near future. Thus the aim of this paper is to develop an innovative rapid decision support tool based on novel ecosystem service variables for retrofitting of permeable pavement systems close to trees. This unique tool proposes the retrofitting of permeable pavements that obtained the highest ecosystem service score for a specific urban site enhanced by the presence of trees. This approach is based on a novel ecosystem service philosophy adapted to permeable pavements rather than on traditional engineering judgement associated with variables based on quick community and environment assessments. For an example case study area such as Greater Manchester, which was dominated by Sycamore and Common Lime, a comparison with the traditional approach of determining community and environment variables indicates that permeable pavements are generally a preferred SuDS option. Permeable pavements combined with urban trees received relatively high scores, because of their great potential impact in terms of water and air quality improvement, and flood control, respectively. The outcomes of this paper are likely to lead to more combined permeable pavement and tree systems in the urban landscape, which are beneficial for humans and the environment. Copyright © 2013 Elsevier B.V. All rights reserved.

  14. WASTK: A Weighted Abstract Syntax Tree Kernel Method for Source Code Plagiarism Detection

    Directory of Open Access Journals (Sweden)

    Deqiang Fu

    2017-01-01

    In this paper, we introduce a source code plagiarism detection method, named WASTK (Weighted Abstract Syntax Tree Kernel), for computer science education. Unlike other plagiarism detection methods, WASTK takes aspects other than the similarity between programs into account. WASTK first converts the source code of a program to an abstract syntax tree and then obtains the similarity by calculating the tree kernel of two abstract syntax trees. To avoid misjudgment caused by trivial code snippets or frameworks given by instructors, an idea similar to TF-IDF (Term Frequency-Inverse Document Frequency) from the field of information retrieval is applied. Each node in an abstract syntax tree is assigned a weight by TF-IDF. WASTK is evaluated on different datasets and, as a result, performs much better than other popular methods like Sim and JPlag.
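
    The idea of down-weighting syntax that every submission shares can be illustrated with a much simpler stand-in for the method above. The sketch below is not WASTK and does not compute a tree kernel: it counts AST node types using Python's standard `ast` module, weights them with a smoothed TF-IDF, and compares programs by cosine similarity. The three sample programs and all function names are hypothetical.

```python
import ast
import math
from collections import Counter


def node_counts(source):
    """Count AST node types in a Python source string."""
    return Counter(type(node).__name__ for node in ast.walk(ast.parse(source)))


def tfidf_vectors(sources):
    """Weight each program's node-type counts by TF-IDF over the whole corpus."""
    counts = [node_counts(src) for src in sources]
    n_docs = len(counts)
    doc_freq = Counter(name for c in counts for name in c)
    vectors = []
    for c in counts:
        total = sum(c.values())
        # Node types present in every program get weight log(1) = 0.
        vectors.append({
            name: (k / total) * math.log((1 + n_docs) / (1 + doc_freq[name]))
            for name, k in c.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either has zero length."""
    dot = sum(w * v.get(name, 0.0) for name, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


original = "def f(x):\n    return x + 1\n"
renamed_copy = "def g(y):\n    return y + 1\n"
unrelated = "for i in range(3):\n    print(i)\n"
vecs = tfidf_vectors([original, renamed_copy, unrelated])
print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```

    Renaming identifiers leaves the node-type counts unchanged, so the renamed copy scores as near-identical, whereas a bag of node types ignores tree structure, which is the part a tree kernel is meant to capture.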

  15. Study on reliability analysis based on multilevel flow models and fault tree method

    International Nuclear Information System (INIS)

    Chen Qiang; Yang Ming

    2014-01-01

    Multilevel flow models (MFM) and fault tree method describe the system knowledge in different forms, so the two methods express an equivalent logic of the system reliability under the same boundary conditions and assumptions. Based on this and combined with the characteristics of MFM, a method mapping MFM to fault tree was put forward, thus providing a way to establish fault tree rapidly and realizing qualitative reliability analysis based on MFM. Taking the safety injection system of pressurized water reactor nuclear power plant as an example, its MFM was established and its reliability was analyzed qualitatively. The analysis result shows that the logic of mapping MFM to fault tree is correct. The MFM is easily understood, created and modified. Compared with the traditional fault tree analysis, the workload is greatly reduced and the modeling time is saved. (authors)

  16. Method of reliability allocation based on fault tree analysis and fuzzy math in nuclear power plants

    International Nuclear Information System (INIS)

    Chen Zhaobing; Deng Jian; Cao Xuewu

    2005-01-01

    Reliability allocation is a difficult multi-objective optimization problem. It can be applied not only to determine the reliability characteristics of reactor systems, subsystems and main components but also to improve the design, operation and maintenance of nuclear plants. In this paper, fuzzy math, known as one of the powerful tools for fuzzy optimization, and fault tree analysis, deemed to be one of the effective methods of reliability analysis, are applied to the reliability allocation model to address, respectively, the fuzzy character of some factors and the choice of subsystems. Thus we develop a failure rate allocation model on the basis of fault tree analysis and fuzzy math. For the reliability constraint factors, we choose the six most important ones according to practical need for conducting the reliability allocation. The subsystems are selected by top-level fault tree analysis to avoid allocating reliability to all the equipment and components, including unnecessary parts. During the reliability allocation process, some factors can be calculated or measured quantitatively while others can only be assessed qualitatively by the expert rating method. So we adopt fuzzy decision and dualistic contrast to realize the reliability allocation with the help of fault tree analysis. Finally, the example of the emergency diesel generator's reliability allocation is used to illustrate the reliability allocation model and to show that the model is simple and applicable. (authors)

  17. Forest Tree Species Distribution Mapping Using Landsat Satellite Imagery and Topographic Variables with the Maximum Entropy Method in Mongolia

    Science.gov (United States)

    Hao Chiang, Shou; Valdez, Miguel; Chen, Chi-Farn

    2016-06-01

    Forest is a very important ecosystem and natural resource for living things. Based on forest inventories, government is able to make decisions to conserve, improve and manage forests in a sustainable way. Field work for forestry investigation is difficult and time consuming, because it needs intensive physical labor and the costs are high, especially when surveying in remote mountainous regions. A reliable forest inventory can give us more accurate and timely information to develop new and efficient approaches to forest management. Remote sensing technology has recently been used for forest investigation at a large scale. To produce an informative forest inventory, forest attributes, including tree species, unavoidably need to be considered. The aim of this study is to classify forest tree species in Erdenebulgan County, Huwsgul province in Mongolia, using the Maximum Entropy method. The study area is covered by a dense forest that accounts for almost 70% of the total territorial extension of Erdenebulgan County and is located in a high mountain region in northern Mongolia. For this study, Landsat satellite imagery and a Digital Elevation Model (DEM) were acquired to perform tree species mapping. The forest tree species inventory map was collected from the Forest Division of the Mongolian Ministry of Nature and Environment as training data and also used as ground truth to perform the accuracy assessment of the tree species classification. Landsat images and the DEM were processed for maximum entropy modeling, and this study applied the model in two experiments. The first uses Landsat surface reflectance for tree species classification; the second incorporates terrain variables in addition to the Landsat surface reflectance to perform the tree species classification. All experimental results were compared with the tree species inventory to assess the classification accuracy. Results show that the second one which uses Landsat surface reflectance coupled

  18. FOREST TREE SPECIES DISTRIBUTION MAPPING USING LANDSAT SATELLITE IMAGERY AND TOPOGRAPHIC VARIABLES WITH THE MAXIMUM ENTROPY METHOD IN MONGOLIA

    Directory of Open Access Journals (Sweden)

    S. H. Chiang

    2016-06-01

    Forest is a very important ecosystem and natural resource for living things. Based on forest inventories, government is able to make decisions to conserve, improve and manage forests in a sustainable way. Field work for forestry investigation is difficult and time consuming, because it needs intensive physical labor and the costs are high, especially when surveying in remote mountainous regions. A reliable forest inventory can give us more accurate and timely information to develop new and efficient approaches to forest management. Remote sensing technology has recently been used for forest investigation at a large scale. To produce an informative forest inventory, forest attributes, including tree species, unavoidably need to be considered. The aim of this study is to classify forest tree species in Erdenebulgan County, Huwsgul province in Mongolia, using the Maximum Entropy method. The study area is covered by a dense forest that accounts for almost 70% of the total territorial extension of Erdenebulgan County and is located in a high mountain region in northern Mongolia. For this study, Landsat satellite imagery and a Digital Elevation Model (DEM) were acquired to perform tree species mapping. The forest tree species inventory map was collected from the Forest Division of the Mongolian Ministry of Nature and Environment as training data and also used as ground truth to perform the accuracy assessment of the tree species classification. Landsat images and the DEM were processed for maximum entropy modeling, and this study applied the model in two experiments. The first uses Landsat surface reflectance for tree species classification; the second incorporates terrain variables in addition to the Landsat surface reflectance to perform the tree species classification. All experimental results were compared with the tree species inventory to assess the classification accuracy. Results show that the second one which uses Landsat surface

  19. Decision Tree and Survey Development for Support in Agricultural Sampling Strategies during Nuclear and Radiological Emergencies

    International Nuclear Information System (INIS)

    Yi, Amelia Lee Zhi; Dercon, Gerd

    2017-01-01

    In the event of a severe nuclear or radiological accident, the release of radionuclides results in contamination of land surfaces affecting agricultural and food resources. Speedy accumulation of information and guidance on decision making is essential in enhancing the ability of stakeholders to strategize for immediate countermeasure strategies. Support tools such as decision trees and sampling protocols allow for swift response by governmental bodies and assist in proper management of the situation. While such tools exist, they focus mainly on protecting public well-being and not food safety management strategies. Consideration of the latter is necessary as it has long-term implications especially to agriculturally dependent Member States. However, it is a research gap that remains to be filled.

  20. Comparing different methods to assess weaver ant abundance in plantation trees

    DEFF Research Database (Denmark)

    Wargui, Rosine; Offenberg, Joachim; Sinzogan, Antonio

    2015-01-01

    Weaver ants (Oecophylla spp.) are widely used as effective biological control agents. In order to optimize their use, ant abundance needs to be tracked. As several methods have been used to estimate ant abundance on plantation trees, abundances are not comparable between studies and no guideline...... is available on which method to apply in a particular study. This study compared four existing methods: three methods based on the number of ant trails on the main branches of a tree (called the Peng 1, Peng 2 and Offenberg index) and one method based on the number of ant nests per tree. Branch indices did...... not produce equal scores and cannot be compared directly. The Peng 1 index was the fastest to assess, but showed only limited seasonal fluctuations when ant abundance was high, because it approached its upper limit. The Peng 2 and Offenberg indices were lower and not close to the upper limit and therefore...

  1. Two tree-formation methods for fast pattern search using nearest-neighbour and nearest-centroid matching

    NARCIS (Netherlands)

    Schomaker, Lambertus; Mangalagiu, D.; Vuurpijl, Louis; Weinfeld, M.; Schomaker, Lambert; Vuurpijl, Louis

    2000-01-01

    This paper describes tree-based classification of character images, comparing two methods of tree formation and two methods of matching: nearest neighbor and nearest centroid. The first method, Preprocess Using Relative Distances (PURD), is a tree-based reorganization of a flat list of patterns,

  2. Boosted Decision Tree Optimization for the ATLAS search of ttH production in the 2l same-sign channel

    CERN Document Server

    Rojas Huamani, Jairo Martin

    2017-01-01

    The main goal is to obtain a direct measurement of the Yukawa coupling of the Higgs boson to the top quark, which is only possible in the production process pp → ttH + X. In this analysis, final states with 2 same-sign leptons (neutrinos not counted) have been used in order to estimate the expected significance of the ttH process. A study using Boosted Decision Trees was done using Monte Carlo simulation equivalent to a luminosity of 36.5 fb⁻¹ at √s = 13 TeV, characteristic of the years 2015 and 2016 of Run-2 at the LHC. The focus of my summer student program was to investigate the performance of the BDT, mainly: to avoid building a rigid and possibly overtrained BDT (Boosted Decision Tree) in charge of identifying the pp → ttH + X process by systematically reducing the number of variables used in the analysis; and to examine the dependence of the expected sensitivity on the different parameters of the BDT.

  3. Gene tree rooting methods give distributions that mimic the coalescent process.

    Science.gov (United States)

    Tian, Yuan; Kubatko, Laura S

    2014-01-01

    Multi-locus phylogenetic inference is commonly carried out via models that incorporate the coalescent process to model the possibility that incomplete lineage sorting leads to incongruence between gene trees and the species tree. An interesting question that arises in this context is whether data "fit" the coalescent model. Previous work (Rosenfeld et al., 2012) has suggested that rooting of gene trees may account for variation in empirical data that has been previously attributed to the coalescent process. We examine this possibility using simulated data. We show that, in the case of four taxa, the distribution of gene trees observed from rooting estimated gene trees with either the molecular clock or with outgroup rooting can be closely matched by the distribution predicted by the coalescent model with specific choices of species tree branch lengths. We apply commonly-used coalescent-based methods of species tree inference to assess their performance in these situations. Copyright © 2013 Elsevier Inc. All rights reserved.

  4. Computer-aided event tree analysis by the impact vector method

    International Nuclear Information System (INIS)

    Lima, J.E.P.

    1984-01-01

    In the development of the Probabilistic Risk Analysis of Angra 1, the 'large event tree/small fault tree' approach was adopted for the analysis of the plant behavior in an emergency situation. In this work, the event tree methodology is presented along with the adaptations which had to be made in order to attain a correct description of the safety system performances according to the selected analysis method. The problems appearing in the application of the methodology and their respective solutions are presented and discussed, with special emphasis on the impact vector technique. A description of the ETAP code ('Event Tree Analysis Program') developed for constructing and quantifying event trees is also given in this work. A preliminary version of the small-break LOCA analysis for Angra 1 is presented as an example of application of the methodology and of the code. It is shown that the use of the ETAP code significantly contributes to decreasing the time spent in event tree analyses, making the practical application of the analysis approach referred to above viable. (author) [pt

  5. Decision trees to characterise the roles of permeability and solubility on the prediction of oral absorption.

    Science.gov (United States)

    Newby, Danielle; Freitas, Alex A; Ghafourian, Taravat

    2015-01-27

    Oral absorption of compounds depends on many physiological, physicochemical and formulation factors. Two important properties that govern oral absorption are in vitro permeability and solubility, which are commonly used as indicators of human intestinal absorption. Despite this, the nature and exact characteristics of the relationship between these parameters are not well understood. In this study a large dataset of human intestinal absorption was collated along with in vitro permeability, aqueous solubility, melting point, and maximum dose for the same compounds. The dataset allowed a permeability threshold to be established objectively to predict high or low intestinal absorption. Using this permeability threshold, classification decision trees incorporating a solubility-related parameter such as experimental or predicted solubility, or the melting point based absorption potential (MPbAP), along with structural molecular descriptors were developed and validated to predict oral absorption class. The decision trees were able to determine the individual roles of permeability and solubility in the oral absorption process. Poorly permeable compounds with high solubility show low intestinal absorption, whereas poorly water-soluble compounds with high or low permeability may have high intestinal absorption provided that they have certain molecular characteristics such as a small polar surface or specific topology. Copyright © 2015 Elsevier Masson SAS. All rights reserved.

  6. Inferences from growing trees backwards

    Science.gov (United States)

    David W. Green; Kent A. McDonald

    1997-01-01

    The objective of this paper is to illustrate how longitudinal stress wave techniques can be useful in tracking the future quality of a growing tree. Monitoring the quality of selected trees in a plantation forest could provide early input to decisions on the effectiveness of management practices, or future utilization options, for trees in a plantation. There will...

  7. Predicting the disease of Alzheimer with SNP biomarkers and clinical data using data mining classification approach: decision tree.

    Science.gov (United States)

    Erdoğan, Onur; Aydin Son, Yeşim

    2014-01-01

    Single Nucleotide Polymorphisms (SNPs) are the most common genomic variations, in which only a single nucleotide differs between individuals. Individual SNPs and SNP profiles associated with diseases can be utilized as biological markers. However, there is a need to determine the SNP subsets and patients' clinical data that are informative for diagnosis. Data mining approaches have the highest potential for extracting knowledge from genomic datasets and selecting the representative SNPs as well as the most effective and informative clinical features for the clinical diagnosis of diseases. In this study, we have applied one of the widely used data mining classification methodologies, the "decision tree", to associate the SNP biomarkers and significant clinical data with Alzheimer's disease (AD), which is the most common form of dementia. Different tree construction parameters have been compared for the optimization, and the most accurate tree for predicting AD is presented.

  8. Identifying Different Transportation Modes from Trajectory Data Using Tree-Based Ensemble Classifiers

    Directory of Open Access Journals (Sweden)

    Zhibin Xiao

    2017-02-01

    Recognition of transportation modes can be used in different applications including human behavior research, transport management and traffic control. Previous work on transportation mode recognition has often relied on using multiple sensors or matching Geographic Information System (GIS) information, which is not possible in many cases. In this paper, an approach based on ensemble learning is proposed to infer hybrid transportation modes using only Global Positioning System (GPS) data. First, in order to distinguish between different transportation modes, we used a statistical method to generate global features and extract several local features from sub-trajectories after trajectory segmentation, before these features were combined in the classification stage. Second, to obtain a better performance, we used tree-based ensemble models (Random Forest, Gradient Boosting Decision Tree, and XGBoost) instead of traditional methods (K-Nearest Neighbor, Decision Tree, and Support Vector Machines) to classify the different transportation modes. The experimental results on the latter have shown the efficacy of our proposed approach. Among them, the XGBoost model produced the best performance with a classification accuracy of 90.77% obtained on the GEOLIFE dataset, and we used a tree-based ensemble method to ensure accurate feature selection to reduce the model complexity.

  9. Using real options analysis to support strategic management decisions

    Science.gov (United States)

    Kabaivanov, Stanimir; Markovska, Veneta; Milev, Mariyan

    2013-12-01

    Decision making is a complex process that requires taking into consideration multiple heterogeneous sources of uncertainty. Standard valuation and financial analysis techniques often fail to properly account for all these sources of risk as well as for all sources of additional flexibility. In this paper we explore applications of a modified binomial tree method for real options analysis (ROA) in an effort to improve the decision-making process. Typical use cases of real options are analyzed, with a detailed study of the applications and advantages that company management can derive from them. Numerical results based on extending the simple binomial tree approach to multiple sources of uncertainty are provided to demonstrate the improvement in management decisions.
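
    The abstract above does not describe its modified method, so the following is only a minimal sketch of the plain Cox-Ross-Rubinstein binomial tree that such approaches typically start from, applied to an option to defer an investment. It assumes a single source of uncertainty; the function name and all parameters (V0, I, r, sigma, T, steps, delta) are hypothetical and not taken from the paper.

```python
import math

def defer_option_value(V0, I, r, sigma, T, steps, delta=0.0):
    """Value an option to invest I in a project currently worth V0.

    Standard CRR lattice (an assumption, not the paper's modified tree):
    r = risk-free rate, sigma = volatility of project value,
    T = years until the opportunity expires, delta = cash-flow yield.
    """
    dt = T / steps
    u = math.exp(sigma * math.sqrt(dt))             # up factor
    d = 1.0 / u                                     # down factor
    p = (math.exp((r - delta) * dt) - d) / (u - d)  # risk-neutral up probability
    if not 0.0 < p < 1.0:
        raise ValueError("parameters give an invalid risk-neutral probability")
    disc = math.exp(-r * dt)

    # Payoffs at expiry: invest only if the project is worth more than I.
    # Index j counts the number of up moves.
    values = [max(V0 * u**j * d**(steps - j) - I, 0.0) for j in range(steps + 1)]

    # Roll back through the tree, allowing immediate investment at every node.
    for n in range(steps - 1, -1, -1):
        values = [
            max(V0 * u**j * d**(n - j) - I,
                disc * (p * values[j + 1] + (1.0 - p) * values[j]))
            for j in range(n + 1)
        ]
    return values[0]

low_vol = defer_option_value(100.0, 100.0, 0.05, 0.2, 2.0, 50)
high_vol = defer_option_value(100.0, 100.0, 0.05, 0.4, 2.0, 50)
print(low_vol, high_vol)
```

    In this sketch the flexibility to wait is worth more when uncertainty is higher, which is the kind of value that a static net-present-value calculation does not capture.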

  10. AFTC Code for Automatic Fault Tree Construction: Users Manual

    International Nuclear Information System (INIS)

    Gopika Vinod; Saraf, R.K.; Babar, A.K.

    1999-04-01

    Fault trees play a predominant role in the reliability and safety analysis of systems. Manual construction of a fault tree is a very time-consuming task and, moreover, does not give a formalized result, since it relies heavily on the analyst's experience and heuristics. This necessitates computerised fault tree construction, which is still attracting the interest of reliability analysts. AFTC is a user-friendly software model for constructing fault trees based on decision tables. The software is equipped with libraries of decision tables for components commonly used in various Nuclear Power Plant (NPP) systems. The user is expected to make a nodal diagram of the system for which the fault tree is to be constructed from the available flow sheets. The text nodal diagram is the sole input defining the system flow chart. AFTC is a rule-based expert system which draws the fault tree from the system flow chart and component decision tables. It gives the fault tree in both text and graphic format. Help is provided on how to enter the system flow chart and component decision tables. The software is developed in the 'C' language. It has been verified with a simplified version of the fire water system of an Indian PHWR. Code conversion will be undertaken to create a window-based version. (author)

  11. Diagnostic assessment of intraoperative cytology for papillary thyroid carcinoma: using a decision tree analysis.

    Science.gov (United States)

    Pyo, J-S; Sohn, J H; Kang, G

    2017-03-01

    The aim of this study was to elucidate the cytological characteristics and the diagnostic usefulness of intraoperative cytology (IOC) for papillary thyroid carcinoma (PTC). In addition, using decision tree analysis, effective features for accurate cytological diagnosis were sought. We investigated cellularity, cytological features and diagnosis based on the Bethesda System for Reporting Thyroid Cytopathology in IOC of 240 conventional PTCs. The cytological features were evaluated in terms of nuclear score with nuclear features, and additional figures such as presence of swirling sheets, psammoma bodies, and multinucleated giant cells. The nuclear score (range 0-7) was made via seven nuclear features, including (1) enlarged, (2) oval or irregularly shaped nuclei, (3) longitudinal nuclear grooves, (4) intranuclear cytoplasmic pseudoinclusion, (5) pale nuclei with powdery chromatin, (6) nuclear membrane thickening, and (7) marginally placed micronucleoli. Nuclear scores in PTC, suspicious for malignancy, and atypia of undetermined significance cases were 6.18 ± 0.80, 4.48 ± 0.82, and 3.15 ± 0.67, respectively. Additional figures more frequent in PTC than in other diagnostic categories were identified. Cellularity of IOC significantly correlated with tumor size, nuclear score, and presence of additional figures. Also, IOCs with higher nuclear scores (4-7) significantly correlated with larger tumor size and presence of additional figures. In decision tree analysis, IOCs with nuclear score >5 and swirling sheets could be considered diagnostic for PTCs. Our study suggests that IOCs using nuclear features and additional figures could be useful in decreasing the likelihood of inconclusive results.

  12. A New Architecture for Making Moral Agents Based on C4.5 Decision Tree Algorithm

    OpenAIRE

    Meisam Azad-Manjiri

    2014-01-01

    Given the influence of robots in various fields of life, the issue of trusting them is important, especially when a robot deals with people directly. One possible way to gain this trust is to add a moral dimension to robots. Therefore, we present a new architecture for building moral agents that learn from demonstrations. This agent is based on Beauchamp and Childress’s principles of biomedical ethics (a type of deontological theory) and uses decision tree a...

  13. Use of decision trees for evaluating severe accident management strategies in nuclear power plants

    Energy Technology Data Exchange (ETDEWEB)

    Jae, Moosung [Hanyang Univ., Seoul (Korea, Republic of). Dept. of Nuclear Engineering; Lee, Yongjin; Jerng, Dong Wook [Chung-Ang Univ., Seoul (Korea, Republic of). School of Energy Systems Engineering

    2016-07-15

    Accident management strategies are defined as innovative actions taken by plant operators to prevent core damage or to maintain sound containment integrity. Such actions minimize the chance of offsite radioactive substance leaks that lead to and intensify core damage under power plant accident conditions. Accident management extends the concept of Defense in Depth against core meltdown accidents. In pressurized water reactors, emergency operating procedures are performed to extend the core cooling time. The effectiveness of Severe Accident Management Guidance (SAMG) has become an important issue. Severe accident management strategies are evaluated with a methodology utilizing the decision tree technique.

  14. GENERATION OF 2D LAND COVER MAPS FOR URBAN AREAS USING DECISION TREE CLASSIFICATION

    DEFF Research Database (Denmark)

    Höhle, Joachim

    2014-01-01

    A 2D land cover map can automatically and efficiently be generated from high-resolution multispectral aerial images. First, a digital surface model is produced and each cell of the elevation model is then supplemented with attributes. A decision tree classification is applied to extract map objects...... of stereo-observations of false-colour stereopairs. The stratified statistical assessment of the produced land cover map with six classes and based on 91 points per class reveals a high thematic accuracy for classes ‘building’ (99%, 95% CI: 95%-100%) and ‘road and parking lot’ (90%, 95% CI: 83%-95%). Some...

  15. A restricted Steiner tree problem is solved by Geometric Method II

    Science.gov (United States)

    Lin, Dazhi; Zhang, Youlin; Lu, Xiaoxu

    2013-03-01

    The minimum Steiner tree problem has a wide range of applications, such as transportation systems, communication networks, pipeline design and VLSI. Unfortunately, the problem is NP-hard, so it is common to consider special cases instead. In this paper, we first put forward a restricted Steiner tree problem, in which the fixed vertices lie on the same side of a line L and we seek a vertex on L such that the length of the tree is minimal. By the definition and the complexity of the Steiner tree problem, we know that this problem is also NP-complete. In Part I, we considered the restricted Steiner tree problem with two fixed vertices. Naturally, we now consider the restricted Steiner tree problem with three fixed vertices, and we again use the geometric method to solve it.
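
    The two-vertex case mentioned in the abstract above can be explored numerically. The sketch below is not the authors' geometric construction: it assumes that L is the x-axis, uses the closed-form Fermat-Torricelli length for three points, and relies on a ternary search, which is justified because the tree length is a convex function of the position of the vertex on L. The function names and the search interval are hypothetical.

```python
import math

def steiner_length_3(a, b, c):
    """Length of the shortest tree joining three distinct points in the plane."""
    ab, bc, ca = math.dist(a, b), math.dist(b, c), math.dist(c, a)
    # If some angle is at least 120 degrees, no extra Steiner point helps:
    # the tree is simply the two sides that meet at that vertex.
    for opp, s1, s2 in ((bc, ab, ca), (ca, ab, bc), (ab, bc, ca)):
        cos_angle = (s1**2 + s2**2 - opp**2) / (2 * s1 * s2)
        if cos_angle <= -0.5:
            return s1 + s2
    # Otherwise use the Fermat point: L^2 = (a^2 + b^2 + c^2)/2 + 2*sqrt(3)*area.
    area = abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2
    return math.sqrt((ab**2 + bc**2 + ca**2) / 2 + 2 * math.sqrt(3) * area)

def best_point_on_line(a, b, lo=-1e3, hi=1e3, iters=200):
    """Ternary search for x such that P = (x, 0) minimises the tree on a, b, P."""
    def length(x):
        return steiner_length_3(a, b, (x, 0.0))
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if length(m1) < length(m2):
            hi = m2
        else:
            lo = m1
    x = (lo + hi) / 2
    return x, length(x)

# Two fixed vertices placed symmetrically above the line y = 0.
x_opt, tree_length = best_point_on_line((-1.0, 1.0), (1.0, 1.0))
print(x_opt, tree_length)
```

    For this symmetric configuration the optimum lies at the foot of the axis of symmetry and the tree has length 1 + √3, which gives a simple check on the numerical result.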

  16. Structural Equation Model Trees

    Science.gov (United States)

    Brandmaier, Andreas M.; von Oertzen, Timo; McArdle, John J.; Lindenberger, Ulman

    2013-01-01

    In the behavioral and social sciences, structural equation models (SEMs) have become widely accepted as a modeling tool for the relation between latent and observed variables. SEMs can be seen as a unification of several multivariate analysis techniques. SEM Trees combine the strengths of SEMs and the decision tree paradigm by building tree…

  17. Estimating Surface Downward Shortwave Radiation over China Based on the Gradient Boosting Decision Tree Method

    Directory of Open Access Journals (Sweden)

    Lu Yang

    2018-01-01

    Downward shortwave radiation (DSR) is an essential parameter in the terrestrial radiation budget and a necessary input for models of land-surface processes. Although several radiation products using satellite observations have been released, coarse spatial resolution and low accuracy limited their application. It is important to develop robust and accurate retrieval methods with higher spatial resolution. Machine learning methods may be powerful candidates for estimating the DSR from remotely sensed data because of their ability to perform adaptive, nonlinear data fitting. In this study, the gradient boosting regression tree (GBRT) was employed to retrieve DSR measurements with the ground observation data in China collected from the China Meteorological Administration (CMA) Meteorological Information Center and the satellite observations from the Advanced Very High Resolution Radiometer (AVHRR) at a spatial resolution of 5 km. The validation results of the DSR estimates based on the GBRT method in China at a daily time scale for clear sky conditions show an R² value of 0.82 and a root mean square error (RMSE) value of 27.71 W·m⁻² (38.38%). These values are 0.64 and 42.97 W·m⁻² (34.57%), respectively, for cloudy sky conditions. The monthly DSR estimates were also evaluated using ground measurements. The monthly DSR estimates have an overall R² value of 0.92 and an RMSE of 15.40 W·m⁻² (12.93%). Comparison of the DSR estimates with the reanalyzed and retrieved DSR measurements from satellite observations showed that the estimated DSR is reasonably accurate but has a higher spatial resolution. Moreover, the proposed GBRT method has good scalability and is easy to apply to other parameter inversion problems by changing the parameters and training data.
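
    To illustrate what "gradient boosting regression tree" means in the abstract above, here is a minimal from-scratch sketch that boosts one-split regression trees (stumps) on the residuals of a squared-error loss and reports RMSE and R². It is an assumption-laden toy, not the authors' model: the data are synthetic, the trees have depth one, and every function name and hyperparameter is hypothetical.

```python
def fit_stump(X, residuals):
    """Best single-split regression stump: (feature, threshold, left mean, right mean)."""
    best = None
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            lmean, rmean = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - lmean) ** 2 for r in left)
                   + sum((r - rmean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr, lmean, rmean)
    if best is None:  # every feature is constant: fall back to a constant stump
        m = sum(residuals) / len(residuals)
        return 0, float("inf"), m, m
    return best[1:]

def fit_gbrt(X, y, n_trees=200, learning_rate=0.1):
    """Gradient boosting for squared loss: each stump fits the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, pred)]  # negative gradient
        j, thr, lmean, rmean = fit_stump(X, residuals)
        stumps.append((j, thr, lmean, rmean))
        pred = [pi + learning_rate * (lmean if row[j] <= thr else rmean)
                for row, pi in zip(X, pred)]
    return base, learning_rate, stumps

def predict(model, row):
    base, lr, stumps = model
    return base + sum(lr * (lm if row[j] <= thr else rm) for j, thr, lm, rm in stumps)

def rmse(y, yhat):
    return (sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y)) ** 0.5

def r2(y, yhat):
    mean = sum(y) / len(y)
    sst = sum((a - mean) ** 2 for a in y)
    return 1.0 - sum((a - b) ** 2 for a, b in zip(y, yhat)) / sst

# Synthetic nonlinear target, standing in for a radiation-vs-predictor relation.
X = [[i / 19.0] for i in range(20)]
y = [row[0] ** 2 for row in X]
model = fit_gbrt(X, y)
y_hat = [predict(model, row) for row in X]
print(rmse(y, y_hat), r2(y, y_hat))
```

    The scores printed here are training scores on toy data; the figures quoted in the abstract come from validation against ground measurements, and the abstract does not say how its percentage RMSE is normalised.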

  18. Determining Accuracy of Thermal Dissipation Methods-based Sap Flux in Japanese Cedar Trees

    Science.gov (United States)

    Su, Man-Ping; Shinohara, Yoshinori; Laplace, Sophie; Lin, Song-Jin; Kume, Tomonori

    2017-04-01

    The thermal dissipation method, a sap flux measurement method that can estimate individual tree transpiration, has been widely used because of its low cost and simple operation. Despite its widespread use, its accuracy has recently been questioned, because some tree species examined in previous studies were not suited to Granier's empirical formula owing to differences in wood characteristics. In Taiwan, Cryptomeria japonica (Japanese cedar) is one of the dominant species in mountainous areas, and quantifying the transpiration of Japanese cedar trees is indispensable for understanding water cycling there. However, no one has tested the accuracy of thermal dissipation method-based sap flux for Japanese cedar trees in Taiwan. Thus, in this study we conducted a calibration experiment using twelve Japanese cedar stem segments from six trees to investigate the accuracy of thermal dissipation method-based sap flux in Japanese cedar trees in Taiwan. By pumping water from the bottom to the top of each segment while inserting probes into the segments to collect data simultaneously, we compared sap flux densities calculated from real water uptake (Fd_actual) and from the empirical formula (Fd_Granier). The exact sapwood area and sapwood depth of each sample were obtained by dyeing the segments with safranin stain solution. Our results showed that Fd_Granier underestimated Fd_actual by 39% across sap flux densities ranging from 10 to 150 cm³ m⁻² s⁻¹, whereas with the sapwood depth-corrected formula from Clearwater, Fd_Granier became accurate, underestimating Fd_actual by only 0.01%. However, for sap flux densities ranging from 10 to 50 cm³ m⁻² s⁻¹, which is similar to the field data of Japanese cedar trees in a mountainous area of Taiwan, Fd_Granier underestimated Fd_actual by 51%, and by 26% when the Clearwater sapwood depth-corrected formula was applied. These results suggested sapwood depth significantly impacted on the accuracy of thermal dissipation
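
    The abstract above names the Granier empirical formula and the Clearwater sapwood depth correction without writing them out. The sketch below uses the forms in which they are commonly cited, Fd = 119 · K^1.231 with K = (ΔTmax − ΔT)/ΔT, and ΔT_sw = (ΔT − (1 − a)·ΔTmax)/a for a probe with fraction a in sapwood. These expressions are an assumption about what the study applied, and the function names and example temperatures are hypothetical.

```python
def granier_flux(dT, dT_max):
    """Sap flux density (g m^-2 s^-1, about cm^3 m^-2 s^-1 for water).

    dT is the probe temperature difference and dT_max its zero-flow value.
    Uses the commonly cited Granier calibration (assumed, not from the abstract).
    """
    k = max((dT_max - dT) / dT, 0.0)  # dimensionless flow index K, clamped at 0
    return 119.0 * k ** 1.231

def clearwater_dT(dT, dT_max, sapwood_fraction):
    """Temperature difference attributed to the sapwood part of the probe.

    sapwood_fraction is the share of the probe length in conducting sapwood;
    the remainder is assumed to sit in inactive xylem at dT_max.
    """
    a = sapwood_fraction
    dT_sw = (dT - (1.0 - a) * dT_max) / a
    if dT_sw <= 0.0:
        raise ValueError("correction gives a non-positive temperature difference")
    return dT_sw

# Hypothetical reading: half of the probe lies outside the sapwood.
uncorrected = granier_flux(8.0, 10.0)
corrected = granier_flux(clearwater_dT(8.0, 10.0, 0.5), 10.0)
print(uncorrected, corrected)
```

    With part of the probe in non-conducting wood, the uncorrected estimate comes out lower than the corrected one, which is the direction of the underestimation reported in the abstract.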

  19. Analysis of large fault trees based on functional decomposition

    International Nuclear Information System (INIS)

    Contini, Sergio; Matuzas, Vaidas

    2011-01-01

    With the advent of the Binary Decision Diagrams (BDD) approach in fault tree analysis, a significant enhancement has been achieved with respect to previous approaches, both in terms of efficiency and accuracy of the overall outcome of the analysis. However, the exponential increase of the number of nodes with the complexity of the fault tree may prevent the construction of the BDD. In these cases, the only way to complete the analysis is to reduce the complexity of the BDD by applying the truncation technique, which nevertheless implies the problem of estimating the truncation error or upper and lower bounds of the top-event unavailability. This paper describes a new method to analyze large coherent fault trees which can be advantageously applied when the working memory is not sufficient to construct the BDD. It is based on the decomposition of the fault tree into simpler disjoint fault trees containing a lower number of variables. The analysis of each simple fault tree is performed by using all the computational resources. The results from the analysis of all simpler fault trees are re-combined to obtain the results for the original fault tree. Two decomposition methods are herewith described: the first aims at determining the minimal cut sets (MCS) and the upper and lower bounds of the top-event unavailability; the second can be applied to determine the exact value of the top-event unavailability. Potentialities, limitations and possible variations of these methods will be discussed with reference to the results of their application to some complex fault trees.
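
    The quantities named in the abstract above (minimal cut sets, upper and lower bounds, and the exact top-event unavailability) can be illustrated on a very small coherent fault tree. The sketch below is not the decomposition method of the paper: it assumes independent components, uses textbook bounds (rare-event sum, min-cut upper bound, two-term Bonferroni lower bound), and obtains the exact value by brute-force enumeration, which is only feasible for a handful of components. All names and numbers are hypothetical.

```python
from itertools import combinations, product

def cut_set_prob(cut, q):
    """Probability that every component in a cut set is unavailable."""
    p = 1.0
    for comp in cut:
        p *= q[comp]
    return p

def rare_event_bound(mcs, q):
    """Sum of the cut set probabilities (an upper bound)."""
    return sum(cut_set_prob(c, q) for c in mcs)

def min_cut_upper_bound(mcs, q):
    """1 - prod(1 - P(cut)): an upper bound for coherent trees."""
    prod = 1.0
    for c in mcs:
        prod *= 1.0 - cut_set_prob(c, q)
    return 1.0 - prod

def bonferroni_lower_bound(mcs, q):
    """First two inclusion-exclusion terms (a lower bound)."""
    pairs = sum(cut_set_prob(set(c1) | set(c2), q)
                for c1, c2 in combinations(mcs, 2))
    return rare_event_bound(mcs, q) - pairs

def exact_unavailability(mcs, q):
    """Exact value by enumerating all 2^n component states (small n only)."""
    comps = sorted(q)
    total = 0.0
    for state in product([False, True], repeat=len(comps)):
        failed = {c for c, s in zip(comps, state) if s}
        if any(set(cut) <= failed for cut in mcs):
            p = 1.0
            for c, s in zip(comps, state):
                p *= q[c] if s else 1.0 - q[c]
            total += p
    return total

# Toy tree: the top event occurs if A and B fail, or if A and C fail.
mcs = [{"A", "B"}, {"A", "C"}]
q = {"A": 0.1, "B": 0.1, "C": 0.1}
exact = exact_unavailability(mcs, q)
print(bonferroni_lower_bound(mcs, q), exact,
      min_cut_upper_bound(mcs, q), rare_event_bound(mcs, q))
```

    The enumeration grows exponentially with the number of components, which is the practical limit that motivates BDD-based and decomposition-based methods for large trees.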

  20. Analysis of large fault trees based on functional decomposition

    Energy Technology Data Exchange (ETDEWEB)

    Contini, Sergio, E-mail: sergio.contini@jrc.i [European Commission, Joint Research Centre, Institute for the Protection and Security of the Citizen, 21020 Ispra (Italy); Matuzas, Vaidas [European Commission, Joint Research Centre, Institute for the Protection and Security of the Citizen, 21020 Ispra (Italy)

    2011-03-15

    With the advent of the Binary Decision Diagrams (BDD) approach in fault tree analysis, a significant enhancement has been achieved with respect to previous approaches, both in terms of efficiency and accuracy of the overall outcome of the analysis. However, the exponential increase of the number of nodes with the complexity of the fault tree may prevent the construction of the BDD. In these cases, the only way to complete the analysis is to reduce the complexity of the BDD by applying the truncation technique, which nevertheless implies the problem of estimating the truncation error or upper and lower bounds of the top-event unavailability. This paper describes a new method to analyze large coherent fault trees which can be advantageously applied when the working memory is not sufficient to construct the BDD. It is based on the decomposition of the fault tree into simpler disjoint fault trees containing a lower number of variables. The analysis of each simple fault tree is performed by using all the computational resources. The results from the analysis of all simpler fault trees are re-combined to obtain the results for the original fault tree. Two decomposition methods are herewith described: the first aims at determining the minimal cut sets (MCS) and the upper and lower bounds of the top-event unavailability; the second can be applied to determine the exact value of the top-event unavailability. Potentialities, limitations and possible variations of these methods will be discussed with reference to the results of their application to some complex fault trees.