WorldWideScience

Sample records for pruned decision tree

  1. How to Prune Trees

    Science.gov (United States)

    Peter Bedker; Joseph O'Brien; Manfred Mielke

    2012-01-01

    The objective of pruning is to produce strong, healthy, attractive plants. By understanding how, when and why to prune, and by following a few simple principles, this objective can be achieved. How to Prune Trees (Revised 2012). Agency Publisher: Agriculture Dept., Forest Service, Northeastern Area State and Private Forestry. USA List Price: $4.00 Sale...

  2. Multi-pruning of decision trees for knowledge representation and classification

    KAUST Repository

    Azad, Mohammad

    2016-06-09

    We consider two important questions related to decision trees: first, how to construct a decision tree with a reasonable number of nodes and a reasonable number of misclassifications, and second, how to improve the prediction accuracy of decision trees when they are used as classifiers. We have created a dynamic programming based approach for bi-criteria optimization of decision trees relative to the number of nodes and the number of misclassifications. This approach allows us to construct the set of all Pareto optimal points and to derive, for each such point, decision trees with parameters corresponding to that point. Experiments on datasets from the UCI ML Repository show that, very often, we can find a suitable Pareto optimal point and derive a decision tree with a small number of nodes at the expense of a small increment in the number of misclassifications. Based on the created approach we have proposed a multi-pruning procedure which constructs decision trees that, as classifiers, often outperform decision trees constructed by CART. © 2015 IEEE.
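
    The notion of a Pareto optimal point in this abstract can be illustrated with a small sketch. It is not the authors' dynamic programming algorithm; it only filters a list of candidate trees, each summarized as a (nodes, misclassifications) pair, down to the non-dominated ones. The function name `pareto_front` and the candidate values are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[int, int]  # (number of nodes, number of misclassifications)


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the points not dominated by any other point.

    Both coordinates are minimized. A point is kept only if no other
    point is at least as good in both coordinates and better in one.
    """
    front: List[Point] = []
    best_errors = None
    # After sorting by node count (then by errors), a point is on the
    # front exactly when it strictly improves the best error count so far.
    for nodes, errors in sorted(set(points)):
        if best_errors is None or errors < best_errors:
            front.append((nodes, errors))
            best_errors = errors
    return front


# Hypothetical candidate trees, not taken from the paper.
candidates = [(1, 40), (3, 22), (5, 22), (7, 9), (9, 10), (15, 2), (31, 0)]
print(pareto_front(candidates))
# prints [(1, 40), (3, 22), (7, 9), (15, 2), (31, 0)]
```

    In this toy list, moving from the 31-node tree to the 15-node tree costs only two extra misclassifications, which is the kind of trade-off the abstract describes.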

  3. Bonsai trees in your head: how the pavlovian system sculpts goal-directed choices by pruning decision trees.

    Directory of Open Access Journals (Sweden)

    Quentin J M Huys

    Full Text Available When planning a series of actions, it is usually infeasible to consider all potential future sequences; instead, one must prune the decision tree. Provably optimal pruning is, however, still computationally ruinous and the specific approximations humans employ remain unknown. We designed a new sequential reinforcement-based task and showed that human subjects adopted a simple pruning strategy: during mental evaluation of a sequence of choices, they curtailed any further evaluation of a sequence as soon as they encountered a large loss. This pruning strategy was Pavlovian: it was reflexively evoked by large losses and persisted even when overwhelmingly counterproductive. It was also evident above and beyond loss aversion. We found that the tendency towards Pavlovian pruning was selectively predicted by the degree to which subjects exhibited sub-clinical mood disturbance, in accordance with theories that ascribe Pavlovian behavioural inhibition, via serotonin, a role in mood disorders. We conclude that Pavlovian behavioural inhibition shapes highly flexible, goal-directed choices in a manner that may be important for theories of decision-making in mood disorders.

  4. Bonsai trees in your head: how the pavlovian system sculpts goal-directed choices by pruning decision trees.

    Science.gov (United States)

    Huys, Quentin J M; Eshel, Neir; O'Nions, Elizabeth; Sheridan, Luke; Dayan, Peter; Roiser, Jonathan P

    2012-01-01

    When planning a series of actions, it is usually infeasible to consider all potential future sequences; instead, one must prune the decision tree. Provably optimal pruning is, however, still computationally ruinous and the specific approximations humans employ remain unknown. We designed a new sequential reinforcement-based task and showed that human subjects adopted a simple pruning strategy: during mental evaluation of a sequence of choices, they curtailed any further evaluation of a sequence as soon as they encountered a large loss. This pruning strategy was Pavlovian: it was reflexively evoked by large losses and persisted even when overwhelmingly counterproductive. It was also evident above and beyond loss aversion. We found that the tendency towards Pavlovian pruning was selectively predicted by the degree to which subjects exhibited sub-clinical mood disturbance, in accordance with theories that ascribe Pavlovian behavioural inhibition, via serotonin, a role in mood disorders. We conclude that Pavlovian behavioural inhibition shapes highly flexible, goal-directed choices in a manner that may be important for theories of decision-making in mood disorders.
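
    The pruning strategy described in this abstract (stop evaluating a sequence as soon as a large loss is met) can be sketched as a depth-limited tree search. This is only an illustration under stated assumptions: the transition table, the rewards, the threshold `prune_below`, and the choice to value a pruned continuation at 0 are all hypothetical and are not the task or model used in the study.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical task: transitions[state] lists (next_state, reward) moves.
Transitions = Dict[str, List[Tuple[str, int]]]


def best_value(state: str, depth: int, transitions: Transitions,
               prune_below: Optional[int] = None) -> int:
    """Best cumulative reward over `depth` further moves from `state`.

    If `prune_below` is set, a move whose reward is at or below that
    threshold ends the evaluation of its sequence: the loss is counted,
    but whatever would follow it is never looked at (valued at 0).
    """
    if depth == 0:
        return 0
    values = []
    for next_state, reward in transitions[state]:
        if prune_below is not None and reward <= prune_below:
            values.append(reward)  # pruned: no further lookahead
        else:
            values.append(
                reward + best_value(next_state, depth - 1, transitions, prune_below)
            )
    return max(values)


transitions: Transitions = {
    "A": [("B", -70), ("C", 20)],
    "B": [("A", 140), ("C", -20)],
    "C": [("A", -20), ("B", 20)],
}

print(best_value("A", 2, transitions))                   # prints 70
print(best_value("A", 2, transitions, prune_below=-70))  # prints 40
```

    In this toy table the best two-move sequence passes through the large loss (-70 followed by +140), so the pruned search settles for a worse plan, which mirrors the abstract's remark that the strategy can persist even when it is counterproductive.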

  5. Training and Pruning Apple Trees

    OpenAIRE

    Marini, Richard P. (Richard Paul), 1952-

    2009-01-01

    Discusses the pruning and training of apple trees, placing emphasis on proper training of young trees to save time and the expense of future pruning, and to produce earlier profitable crops. Advises about the best techniques for pruning in relation to age of the apple tree.

  6. Pruning Chinese trees : an experimental and modelling approach

    NARCIS (Netherlands)

    Zeng, Bo

    2001-01-01

    Pruning of trees, in which some branches are removed from the lower crown of a tree, has been extensively used in China in silvicultural management for many purposes. With an experimental and modelling approach, the effects of pruning on tree growth and on the harvest of plant material were studied.

  7. VC-dimension of univariate decision trees.

    Science.gov (United States)

    Yildiz, Olcay Taner

    2015-02-01

    In this paper, we give and prove the lower bounds of the Vapnik-Chervonenkis (VC)-dimension of the univariate decision tree hypothesis class. The VC-dimension of the univariate decision tree depends on the VC-dimension values of its subtrees and the number of inputs. Via a search algorithm that calculates the VC-dimension of univariate decision trees exhaustively, we show that our VC-dimension bounds are tight for simple trees. To verify that the VC-dimension bounds are useful, we also use them to get VC-generalization bounds for complexity control using structural risk minimization in decision trees, i.e., pruning. Our simulation results show that structural risk minimization pruning using the VC-dimension bounds finds trees that are more accurate than those pruned using cross validation.
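
    Structural risk minimization pruning, as mentioned in this abstract, picks the subtree whose bound on the true error is smallest, without a validation set. The sketch below is a rough illustration only. It assumes the classic Vapnik form of the VC generalization bound, which may differ from the form used in the paper, and the VC-dimension values `h`, the training errors, and the sample size are hypothetical placeholders, not the paper's bounds or results.

```python
import math
from typing import List, Tuple


def vc_bound(train_error: float, h: int, n: int, eta: float = 0.05) -> float:
    """Classic Vapnik bound: with probability at least 1 - eta,
    true error <= train_error + sqrt((h*(ln(2n/h) + 1) - ln(eta/4)) / n).

    h is the VC-dimension of the hypothesis class, n the sample size.
    """
    penalty = (h * (math.log(2 * n / h) + 1) - math.log(eta / 4)) / n
    return train_error + math.sqrt(penalty)


def srm_select(candidates: List[Tuple[str, float, int]], n: int) -> str:
    """Return the name of the candidate subtree with the smallest bound.

    Each candidate is (name, training error rate, assumed VC-dimension).
    """
    return min(candidates, key=lambda c: vc_bound(c[1], c[2], n))[0]


# Hypothetical pruning candidates for one tree trained on n = 1000 samples.
candidates = [
    ("full tree", 0.02, 60),
    ("medium subtree", 0.05, 15),
    ("stump", 0.25, 3),
]
print(srm_select(candidates, n=1000))  # prints medium subtree
```

    The full tree has the lowest training error but the largest complexity penalty, so the bound favors the intermediate subtree.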

  8. REDUCING COMPETITION IN AGROFORESTRY BY PRUNING NATIVE TREES

    OpenAIRE

    Nicodemo, Maria Luiza Franceschi; Castiglioni, Paula Priscila; Pezzopane, José Ricardo Macedo; Tholon, Patrícia; Carpanezzi, Antônio Aparecido

    2016-01-01

    ABSTRACT The degree to which pruning helps reestablish balance in agroforestry was assessed in a system established in São Carlos, São Paulo, Brazil, in 2008. Seven native tree species were planted at a density of 600 trees/ha in five strips of three rows each, and annual crops were cultivated in the 17-m crop strips between the tree strips. Competition was established after 35 months, decreasing the aboveground biomass production of corn planted close to the trees. An assessment of black oat...

  9. Effect of Root Pruning and Irrigation Regimes on Yield and Physiology of Pear Trees

    DEFF Research Database (Denmark)

    Wang, Yufei

    ‘Clara Frijs’ is the dominant pear (Pyrus communis L.) cultivar in Denmark. It is vigorous with long annual shoots, and therefore can be difficult to prune. Root pruning has been widely used to control the canopy size of fruit trees including pears. However, root pruned trees are more likely to su...

  10. Bayesian Evidence Framework for Decision Tree Learning

    Science.gov (United States)

    Chatpatanasiri, Ratthachat; Kijsirikul, Boonserm

    2005-11-01

    This work is primarily interested in the problem of selecting a single decision (or classification) tree given the observed data. Although a single decision tree has a high risk of overfitting, the induced tree is easily interpreted. Researchers have invented various methods, such as tree pruning or tree averaging, for preventing the induced tree from overfitting (and from underfitting) the data. In this paper, instead of using those conventional approaches, we apply the Bayesian evidence framework of Gull, Skilling and Mackay to the process of selecting a decision tree. We derive a formal function to measure `the fitness' of each decision tree given a set of observed data. Our method, in fact, is analogous to a well-known Bayesian model selection method for interpolating noisy continuous-value data. As in regression problems, given reasonable assumptions, this derived score function automatically quantifies the principle of Ockham's razor, and hence reasonably deals with the underfitting-overfitting tradeoff.

  11. PHENOLOGICAL CHARACTERISTICS OF GENOTYPES FROM CATTLEY GUAVA AND GUAVA TREES SUBMITTED TO FRUCTIFICATION PRUNING

    Directory of Open Access Journals (Sweden)

    CINTIA APARECIDA BREMENKAMP

    Full Text Available ABSTRACT Psidium cattleianum Sabine is a species from the Myrtaceae family that serves as an option for the cultivation of native fruits, besides being considered a source of resistance to the Meloidogyne enterolobii nematode. Although cattley guava trees from this species produce flower buds in young branches, there are no reports of response to fructification pruning or phenological synchronism with the guava tree. The objective of this paper was the comparative evaluation of the genotype response of strawberry guava trees and guava cultivars to fructification pruning, thus describing the phenology of both species under the same cultivation conditions. The experiment was conducted in a completely randomized design, in a 7x2 factorial scheme, with seven genotypes evaluated (three from strawberry guava and four from guava trees) and with pruning performed in two seasons (May 2012 and March 2013), with three repetitions. Fructification pruning was executed by lopping all mature branches, from the last growth flow in the woody branch region. Budding characteristics and fruit harvesting were evaluated, as well as the number of days from pruning to the observation of the phenological event. Cattley guava tree pruning stimulated fructification of all three genotypes after the May pruning and of two genotypes after the March pruning. The flowering of the guava cultivars was synchronized with that of both strawberry guava tree genotypes when those were pruned in May.

  12. Tree growth and management in Ugandan agroforestry systems: effects of root pruning on tree growth and crop yield.

    Science.gov (United States)

    Wajja-Musukwe, Tellie-Nelson; Wilson, Julia; Sprent, Janet I; Ong, Chin K; Deans, J Douglas; Okorio, John

    2008-02-01

    Tree root pruning is a potential tool for managing belowground competition when trees and crops are grown together in agroforestry systems. We investigated the effects of tree root pruning on shoot growth and root distribution of Alnus acuminata (H.B. & K.), Casuarina equisetifolia L., Grevillea robusta A. Cunn. ex R. Br., Maesopsis eminii Engl. and Markhamia lutea (Benth.) K. Schum. and on yield of adjacent crops in sub-humid Uganda. The trees were 3 years old at the commencement of the study, and most species were competing strongly with crops. Tree roots were pruned 41 months after planting by cutting and back-filling a trench to a depth of 0.3 m, at a distance of 0.3 m from the trees, on one side of the tree row. The trench was reopened and roots recut at 50 and 62 months after planting. We assessed the effects on tree growth and root distribution over a 3 year period, and crop yield after the third root pruning at 62 months. Overall, root pruning had only a slight effect on aboveground tree growth: height growth was unaffected and diameter growth was reduced by only 4%. A substantial amount of root regrowth was observed by 11 months after pruning. Tree species varied in the number and distribution of roots, and C. equisetifolia and M. lutea had considerably more roots per unit of trunk volume than the other species, especially in the surface soil layers. Casuarina equisetifolia and M. eminii were the tree species most competitive with crops and G. robusta and M. lutea the least competitive. Crop yield data provided strong evidence of the redistribution of root activity following root pruning, with competition increasing on the unpruned side of tree rows. Thus, one-sided root pruning will be useful in only a few circumstances.

  13. Quantifying pruning impacts on olive tree architecture and annual canopy growth by using UAV-based 3D modelling.

    Science.gov (United States)

    Jiménez-Brenes, F M; López-Granados, F; de Castro, A I; Torres-Sánchez, J; Serrano, N; Peña, J M

    2017-01-01

    Tree pruning is a costly practice with important implications for crop harvest and nutrition, pest and disease control, soil protection and irrigation strategies. Investigations on tree pruning usually involve tedious on-ground measurements of the primary tree crown dimensions, which also might generate inconsistent results due to the irregular geometry of the trees. As an alternative to intensive field-work, this study shows an innovative procedure based on combining unmanned aerial vehicle (UAV) technology and advanced object-based image analysis (OBIA) methodology for multi-temporal three-dimensional (3D) monitoring of hundreds of olive trees that were pruned with three different strategies (traditional, adapted and mechanical pruning). The UAV images were collected before pruning, after pruning and a year after pruning, and the impacts of each pruning treatment on the projected canopy area, tree height and crown volume of every tree were quantified and analyzed over time. The full procedure described here automatically identified every olive tree in the orchard and computed their primary 3D dimensions on the three study dates with high accuracy in most cases. Adapted pruning was generally the most aggressive treatment in terms of the area and volume (the trees decreased by 38.95 and 42.05% on average, respectively), followed by trees under traditional pruning (33.02 and 35.72% on average, respectively). Regarding the tree heights, mechanical pruning produced a greater decrease (12.15%), and these values were minimal for the other two treatments. The tree growth over one year was affected by the pruning severity and by the type of pruning treatment, i.e., the adapted-pruning trees experienced higher growth than the trees from the other two treatments when pruning intensity was low (<10%), similar to the traditionally pruned trees at moderate intensity (10-30%), and lower than the other trees when the pruning intensity was higher than 30% of the crown volume.

  14. Effect of root pruning and irrigation regimes on leaf water relations and xylem ABA and ionic concentrations in pear trees

    DEFF Research Database (Denmark)

    Wang, Yufei; Bertelsen, Marianne G.; Petersen, Karen Koefoed

    2014-01-01

    Root pruning is an effective approach for controlling vegetative growth of pear trees (Pyrus communis L.), yet the underlying mechanisms for such effect remain largely elusive. A two-year field experiment was conducted to investigate the effect of root pruning and irrigation regimes on leaf water relation characteristics, stomatal conductance and xylem sap abscisic acid (ABA) and ionic concentrations. Results showed that leaf water potential, leaf turgor and stomatal conductance of root pruning (RP) treatment was significantly lower than those of non-root pruning (NP) treatment indicating that root pruning caused water deficit stress in pear trees. Further RP trees had significantly lower concentrations of total cations and anions and the sum of cations and anions than the NP trees implying root pruning decreased acquisition of nutrients from the soil. In the root pruned trees, the leaf water...

  15. Trees and Decisions

    OpenAIRE

    Alós-Ferrer, Carlos; Ritzberger, Klaus

    2003-01-01

    Abstract: The traditional model of sequential decision making, for instance, in extensive form games, is a tree. Most texts define a tree as a connected directed graph without loops and a distinguished node, called the root. But an abstract graph is not a domain for decision theory. Decision theory perceives of acts as functions from states to consequences. Sequential decisions, accordingly, get conceptualized by mappings from sets of states to sets of consequences. Thus, the question arises w...

  16. Automatic design of decision-tree induction algorithms

    CERN Document Server

    Barros, Rodrigo C; Freitas, Alex A

    2015-01-01

    Presents a detailed study of the major design components that constitute a top-down decision-tree induction algorithm, including aspects such as split criteria, stopping criteria, pruning, and the approaches for dealing with missing values. Whereas the strategy still employed nowadays is to use a 'generic' decision-tree induction algorithm regardless of the data, the authors argue on the benefits that a bias-fitting strategy could bring to decision-tree induction, in which the ultimate goal is the automatic generation of a decision-tree induction algorithm tailored to the application domain o

  17. Growth following pruning of young loblolly pine trees: some early results

    Science.gov (United States)

    Ralph L. Amateis; Harold E. Burkhart

    2006-01-01

    In the spring of 2000, a designed experiment was established to study the effects of pruning on juvenile loblolly pine (Pinus taeda L.) tree growth and the subsequent formation of mature wood. Trees were planted at a 3 m x 3 m square spacing in plots of 6 rows with 6 trees per row, with the inner 16 trees constituting the measurement plot. Among the...

  18. Supporting medical decisions with vector decision trees.

    Science.gov (United States)

    Sprogar, M; Kokol, P; Zorman, M; Podgorelec, V; Yamamoto, R; Masuda, G; Sakamoto, N

    2001-01-01

    The article presents the extension of a common decision tree concept to a multidimensional - vector - decision tree constructed with the help of evolutionary techniques. In contrast to the common decision tree, the vector decision tree can make more than just one suggestion per input sample. It has the functionality of many separate decision trees acting on the same set of training data and answering different questions. The vector decision tree is therefore simple in its form, is easy to use and analyse and can express some relationships between decisions not visible before. To explore and test the possibilities of this concept we developed a software tool--DecRain--for building vector decision trees using the ideas of evolutionary computing. Generated vector decision trees showed good results in comparison to classical decision trees. The concept of vector decision trees can be safely and effectively used in any decision making process.

  19. Energy potential of fruit tree pruned biomass in Croatia

    Energy Technology Data Exchange (ETDEWEB)

    Bilandzija, N.; Voca, N.; Kricka, T.; Martin, A.; Jurisic, V.

    2012-11-01

    The world's most developed countries and the European Union (EU) deem that the renewable energy sources should partly substitute fossil fuels and become a bridge to the utilization of other energy sources of the future. This paper will present the possibility of using pruned biomass from fruit cultivars. It will also present the calculation of potential energy from the mentioned raw materials in order to determine the extent of replacement of non-renewable sources with these types of renewable energy. One of the results of the intensive fruit-growing process, in the post-pruning stage, is a large amount of pruned biomass waste. Based on the calculated biomass (kg ha⁻¹) from intensively grown woody fruit crops that are most grown in Croatia (apple, pear, apricot, peach and nectarine, sweet cherry, sour cherry, prune, walnut, hazelnut, almond, fig, grapevine, and olive) and the analysis of combustible (carbon 45.55-49.28%, hydrogen 5.91-6.83%, and sulphur 0.18-0.21%) and non-combustible matters (oxygen 43.34-46.6%, nitrogen 0.54-1.05%, moisture 3.65-8.83%, ashes 1.52-5.39%) with impact of lowering the biomass heating value (15.602-17.727 MJ kg⁻¹), the energy potential of the pruned fruit biomass is calculated at 4.21 PJ. (Author) 31 refs.

  20. Competition in apple, as influenced by Alar sprays, fruiting, pruning and tree spacing

    NARCIS (Netherlands)

    Verheij, E.W.M.

    1972-01-01

    In the spring of 1965 a trial was planted with Golden Delicious IX and James Grieve 'aimed' VII, in which tree spacing, deblossoming, Alar sprays and pruning were variable factors. Results are presented over the period 1966-1969.

    At the end of 1969, the 5th year from planting, 400

  1. Mineralization and N-use efficiency of tree legume prunings from ...

    African Journals Online (AJOL)

    The treatment combinations were laid out as a randomized complete blocks design. Mixtures of tree prunings with 2.5 t ha-1 maize stover increased maize N uptake and grain yield whereas 5 t ha-1 maize stover reduced maize N uptake and grain yield during the wetter season. Mixtures of Pea-R, Stover-1 or Stover-2 with ...

  2. Monotone Decision Trees

    NARCIS (Netherlands)

    J.C. Bioch (Cor); T. Petter; R. Potharst (Rob)

    1997-01-01

    EUR-FEW-CS-97-07. In many classification problems the domains of the attributes and the classes are linearly ordered. Often, classification must preserve this ordering: this is called monotone

  3. Effects of leader topping and branch pruning on efficiency of Douglas-fir cone harvesting with a tree shaker.

    Science.gov (United States)

    D.L. Copes

    1985-01-01

    In 1983, a study was conducted to evaluate the effects of leader topping and branch pruning on the efficiency of tree shaking to remove Douglas-fir (Pseudotsuga menziesii (Mirb.) Franco) cones. Removal efficiency for three topping and pruning treatments averaged 69 percent, whereas for the uncut control treatment it was 62 percent. The treatment...

  4. Evaluation of fungicides to protect pruning wounds from Botryosphaeriaceae species infections on almond trees

    Directory of Open Access Journals (Sweden)

    Diego OLMO

    2017-05-01

    Full Text Available In vitro efficacy of ten fungicides was evaluated against four Botryosphaeriaceae spp. (Diplodia seriata, Neofusicoccum luteum, N. mediterraneum and N. parvum) associated with branch cankers on almond trees. Cyproconazole, pyraclostrobin, tebuconazole, and thiophanate-methyl were effective for the inhibition of mycelial growth of most of these fungi. An experiment on 3-year-old almond trees evaluated boscalid, mancozeb, thiophanate-methyl, pyraclostrobin and tebuconazole for preventative ability against infections caused by the four pathogens. Five months after pruning and fungicide application, lesion length measurements and isolation percentages showed no significant differences among the four pathogens after they were inoculated onto the trees, and also between the two inoculation times tested (1 or 7 d after fungicide application). Thiophanate-methyl was the most effective fungicide, resulting in the shortest lesion lengths and the lowest isolation percentages from artificially inoculated pruning wounds. This chemical is therefore a candidate for inclusion in integrated disease management, to protect pruning wounds from infections caused by species of Botryosphaeriaceae. This study represents the first approach to development of chemical control strategies for the management of canker diseases caused by Botryosphaeriaceae fungi on almond trees.

  5. Parallel peak pruning for scalable SMP contour tree computation

    Energy Technology Data Exchange (ETDEWEB)

    Carr, Hamish A. [Univ. of Leeds (United Kingdom); Weber, Gunther H. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Univ. of California, Davis, CA (United States); Sewell, Christopher M. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Ahrens, James P. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2017-03-09

    As data sets grow to exascale, automated data analysis and visualisation are increasingly important, to intermediate human understanding and to reduce demands on disk storage via in situ analysis. Trends in architecture of high performance computing systems necessitate analysis algorithms to make effective use of combinations of massively multicore and distributed systems. One of the principal analytic tools is the contour tree, which analyses relationships between contours to identify features of more than local importance. Unfortunately, the predominant algorithms for computing the contour tree are explicitly serial, and founded on serial metaphors, which has limited the scalability of this form of analysis. While there is some work on distributed contour tree computation, and separately on hybrid GPU-CPU computation, there is no efficient algorithm with strong formal guarantees on performance allied with fast practical performance. In this paper, we report the first shared SMP algorithm for fully parallel contour tree computation, with formal guarantees of O(lg n lg t) parallel steps and O(n lg n) work, and implementations with up to 10x parallel speed up in OpenMP and up to 50x speed up in NVIDIA Thrust.

  6. Construction and application of hierarchical decision tree for classification of ultrasonographic prostate images

    NARCIS (Netherlands)

    Giesen, R. J.; Huynen, A. L.; Aarnink, R. G.; de la Rosette, J. J.; Debruyne, F. M.; Wijkstra, H.

    1996-01-01

    A non-parametric algorithm is described for the construction of a binary decision tree classifier. This tree is used to correlate textural features, computed from ultrasonographic prostate images, with the histopathology of the imaged tissue. The algorithm consists of two parts: growing and pruning.

  7. Decision analysis using decision trees for a simple clinical decision.

    Science.gov (United States)

    Blakley, Brian

    2012-10-01

    To illustrate the use of decision trees with a utility index in clinical decision making. A decision tree was created related to whether or not to perform a tonsillectomy. Data from the literature were applied to a common hypothetical clinical scenario. A decision tree graphically represents the typical decision-making process that many clinicians use. The addition of utility functions permitted consideration of the adverse or beneficial effects of outcomes, altering the treatment decision. Quantitative tools such as decision trees may quantify outcome preferences and aid in clinical decision making, but the proper tool and background data are essential.
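
    The way a decision tree with utilities leads to a choice can be sketched as an expected-utility calculation: each option leads to a chance node, and the option with the highest probability-weighted utility is selected. Every probability and utility below is a made-up placeholder chosen for illustration, not a value from the article or from the clinical literature, and the option and function names are hypothetical.

```python
from typing import Dict, List, Tuple

Outcome = Tuple[float, float]  # (probability, utility on a 0-1 scale)


def expected_utility(outcomes: List[Outcome]) -> float:
    """Probability-weighted utility of one chance node."""
    total_p = sum(p for p, _ in outcomes)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(p * u for p, u in outcomes)


def best_option(options: Dict[str, List[Outcome]]) -> str:
    """Name of the option with the highest expected utility."""
    return max(options, key=lambda name: expected_utility(options[name]))


# Placeholder numbers only; they carry no clinical meaning.
options = {
    "tonsillectomy": [(0.90, 0.95), (0.08, 0.80), (0.02, 0.30)],
    "watchful waiting": [(0.50, 0.90), (0.50, 0.70)],
}

for name, outcomes in options.items():
    print(name, round(expected_utility(outcomes), 3))
print("preferred:", best_option(options))
```

    Changing the utility attached to an adverse outcome can change which option comes out ahead, which is the effect of utility functions that the abstract points to.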

  8. Combustion of a Pb(II)-loaded olive tree pruning used as biosorbent

    Energy Technology Data Exchange (ETDEWEB)

    Ronda, A., E-mail: alirg@ugr.es [Department of Chemical Engineering, University of Granada, 18071 Granada (Spain); Della Zassa, M. [Department of Industrial Engineering, University of Padua, 35131 Padova (Italy); Martín-Lara, M.A.; Calero, M. [Department of Chemical Engineering, University of Granada, 18071 Granada (Spain); Canu, P. [Department of Industrial Engineering, University of Padua, 35131 Padova (Italy)

    2016-05-05

    Highlights: • The fate of Pb during combustion at two scales of investigation was studied. • Results from combustion in a flow reactor and in the thermobalance were consistent. • The Pb contained in the solid remained in the ashes. • The Pb does not interfere in the use of OTP as fuel. • The combustion of Pb(II)-loaded OTP does not cause environmental hazards. - Abstract: Olive tree pruning is a specific agroindustrial waste that can be successfully used as an adsorbent to remove Pb(II) from contaminated wastewater. Its final incineration has been studied in a thermobalance and in a laboratory flow reactor. The study aims at evaluating the fate of Pb during combustion, at two different scales of investigation. The flow reactor can treat samples approximately 10² times larger than the conventional TGA. A detailed characterization of the raw and Pb(II)-loaded waste, before and after combustion, is presented, including analysis of gas and solid products. The Pb(II)-loaded olive tree pruning has been prepared by a previous biosorption step in a lead solution, reaching a lead concentration of 2.3 wt%. Several characterizations of the ashes and the mass balances proved that after the combustion, all the lead present in the waste remained in the ashes. Combustion in a flow reactor produced results consistent with those obtained in the thermobalance. It is thus confirmed that the combustion of Pb(II)-loaded olive tree pruning is a viable option to use it after the biosorption process. The Pb contained in the solid remained in the ashes, preventing possible environmental hazards.

  9. IND - THE IND DECISION TREE PACKAGE

    Science.gov (United States)

    Buntine, W.

    1994-01-01

    A common approach to supervised classification and prediction in artificial intelligence and statistical pattern recognition is the use of decision trees. A tree is "grown" from data using a recursive partitioning algorithm to create a tree which has good prediction of classes on new data. Standard algorithms are CART (by Breiman, Friedman, Olshen and Stone) and ID3 and its successor C4 (by Quinlan). As well as reimplementing parts of these algorithms and offering experimental control suites, IND also introduces Bayesian and MML methods and more sophisticated search in growing trees. These produce more accurate class probability estimates that are important in applications like diagnosis. IND is applicable to most data sets consisting of independent instances, each described by a fixed length vector of attribute values. An attribute value may be a number, one of a set of attribute specific symbols, or it may be omitted. One of the attributes is designated the "target" and IND grows trees to predict the target. Prediction can then be done on new data or the decision tree printed out for inspection. IND provides a range of features and styles with convenience for the casual user as well as fine-tuning for the advanced user or those interested in research. IND can be operated in a CART-like mode (but without regression trees, surrogate splits or multivariate splits), and in a mode like the early version of C4. Advanced features allow more extensive search, interactive control and display of tree growing, and Bayesian and MML algorithms for tree pruning and smoothing. These often produce more accurate class probability estimates at the leaves. IND also comes with a comprehensive experimental control suite. IND consists of four basic kinds of routines: data manipulation routines, tree generation routines, tree testing routines, and tree display routines. The data manipulation routines are used to partition a single large data set into smaller training and test sets. 
The

  10. Decision-Tree Program

    Science.gov (United States)

    Buntine, Wray

    1994-01-01

    The IND computer program introduces Bayesian and minimum message length (MML) methods and more-sophisticated methods of searching in growing trees. It produces more-accurate class-probability estimates, which are important in applications like diagnosis. It provides a range of features and styles, with convenience for the casual user and fine-tuning for the advanced user or for those interested in research. It consists of four basic kinds of routines: data-manipulation, tree-generation, tree-testing, and tree-display. Written in C language.

  11. Endophytic Fungi as Pretreatment to Enhance Enzymatic Hydrolysis of Olive Tree Pruning

    Directory of Open Access Journals (Sweden)

    Raquel Martín-Sampedro

    2017-01-01

    Olive tree pruning, as one of the most abundant lignocellulosic residues in Mediterranean countries, has been evaluated as a source of sugars for fuel and chemicals production. A mild acid pretreatment has been combined with a fungal pretreatment using either two endophytes (Ulocladium sp. and Hormonema sp.) or a saprophyte (Trametes sp. I-62). The use of endophytes is based on the important role that some of them play during the initial stages of wood decomposition. Without acid treatment, fungal pretreatment with Ulocladium sp. provided a nonsignificant enhancement of 4.6% in glucose digestibility, compared to control. When a mild acid hydrolysis was carried out after fungal pretreatments, significant increases in glucose digestibility from 4.9% to 12.0% (compared to control without fungi) were observed for all fungal pretreatments, with maximum values yielded by Hormonema sp. However, despite the observed digestibility boost, the total sugar yields (taking into account solid yield) were not significantly increased by the pretreatments. Nevertheless, based on these preliminary improvements in digestibility, this work proves the potential of endophytic fungi to boost the production of sugar from olive tree pruning, which would add an extra value to the bioeconomy of olive crops.

  12. Effects of tree species and wood particle size on the properties of cement-bonded particleboard manufacturing from tree prunings.

    Science.gov (United States)

    Nasser, Ramadan A; Al-Mefarrej, H A; Abdel-Aal, M A; Alshahrani, T S

    2014-09-01

    This study investigated the possibility of using the prunings of six locally grown tree species in Saudi Arabia for cement-bonded particleboard (CBP) production. Panels were made using four different wood particle sizes and a constant wood/cement ratio (1/3 by weight) and target density (1200 kg/m³). The mechanical properties and dimensional stability of the produced panels were determined. The interfacial area and distribution of the wood particles in the cement matrix were also investigated by scanning electron microscopy. The results revealed that the panels produced from these pruning materials at a target density of 1200 kg/m³ meet the strength and dimensional stability requirements of commercial CBP panels. The mean moduli of rupture and elasticity (MOR and MOE) ranged from 9.68 to 11.78 N/mm² and from 3952 to 5667 N/mm², respectively. The mean percent water absorption after 24 hours (WA24) ranged from 12.93% to 23.39%. Thickness swelling values ranged from 0.62% to 1.53%. For CBP panels with high mechanical properties and good dimensional stability, mixed-size or coarse particles should be used. Using tree prunings for CBP production may help to solve the problem of disposing of these residues and reduce their negative effects on the environment, which arise either from direct combustion, with the resulting black smoke and its impact on human health, or from random accumulation and its indirect environmental effects.

  13. Decision trees in epidemiological research.

    Science.gov (United States)

    Venkatasubramaniam, Ashwini; Wolfson, Julian; Mitchell, Nathan; Barnes, Timothy; JaKa, Meghan; French, Simone

    2017-01-01

    In many studies, it is of interest to identify population subgroups that are relatively homogeneous with respect to an outcome. The nature of these subgroups can provide insight into effect mechanisms and suggest targets for tailored interventions. However, identifying relevant subgroups can be challenging with standard statistical methods. We review the literature on decision trees, a family of techniques for partitioning the population, on the basis of covariates, into distinct subgroups who share similar values of an outcome variable. We compare two decision tree methods, the popular Classification and Regression tree (CART) technique and the newer Conditional Inference tree (CTree) technique, assessing their performance in a simulation study and using data from the Box Lunch Study, a randomized controlled trial of a portion size intervention. Both CART and CTree identify homogeneous population subgroups and offer improved prediction accuracy relative to regression-based approaches when subgroups are truly present in the data. An important distinction between CART and CTree is that the latter uses a formal statistical hypothesis testing framework in building decision trees, which simplifies the process of identifying and interpreting the final tree model. We also introduce a novel way to visualize the subgroups defined by decision trees. Our novel graphical visualization provides a more scientifically meaningful characterization of the subgroups identified by decision trees. Decision trees are a useful tool for identifying homogeneous subgroups defined by combinations of individual characteristics. While all decision tree techniques generate subgroups, we advocate the use of the newer CTree technique due to its simplicity and ease of interpretation.

  14. Decision trees in epidemiological research

    Directory of Open Access Journals (Sweden)

    Ashwini Venkatasubramaniam

    2017-09-01

    Background: In many studies, it is of interest to identify population subgroups that are relatively homogeneous with respect to an outcome. The nature of these subgroups can provide insight into effect mechanisms and suggest targets for tailored interventions. However, identifying relevant subgroups can be challenging with standard statistical methods. Main text: We review the literature on decision trees, a family of techniques for partitioning the population, on the basis of covariates, into distinct subgroups who share similar values of an outcome variable. We compare two decision tree methods, the popular Classification and Regression tree (CART) technique and the newer Conditional Inference tree (CTree) technique, assessing their performance in a simulation study and using data from the Box Lunch Study, a randomized controlled trial of a portion size intervention. Both CART and CTree identify homogeneous population subgroups and offer improved prediction accuracy relative to regression-based approaches when subgroups are truly present in the data. An important distinction between CART and CTree is that the latter uses a formal statistical hypothesis testing framework in building decision trees, which simplifies the process of identifying and interpreting the final tree model. We also introduce a novel way to visualize the subgroups defined by decision trees. Our novel graphical visualization provides a more scientifically meaningful characterization of the subgroups identified by decision trees. Conclusions: Decision trees are a useful tool for identifying homogeneous subgroups defined by combinations of individual characteristics. While all decision tree techniques generate subgroups, we advocate the use of the newer CTree technique due to its simplicity and ease of interpretation.

  15. Objective consensus from decision trees.

    Science.gov (United States)

    Putora, Paul Martin; Panje, Cedric M; Papachristofilou, Alexandros; Dal Pra, Alan; Hundsberger, Thomas; Plasswilm, Ludwig

    2014-12-05

    Consensus-based approaches provide an alternative to evidence-based decision making, especially in situations where high-level evidence is limited. Our aim was to demonstrate a novel source of information, objective consensus based on recommendations in decision tree format from multiple sources. Based on nine sample recommendations in decision tree format a representative analysis was performed. The most common (mode) recommendations for each eventuality (each permutation of parameters) were determined. The same procedure was applied to real clinical recommendations for primary radiotherapy for prostate cancer. Data was collected from 16 radiation oncology centres, converted into decision tree format and analyzed in order to determine the objective consensus. Based on information from multiple sources in decision tree format, treatment recommendations can be assessed for every parameter combination. An objective consensus can be determined by means of mode recommendations without compromise or confrontation among the parties. In the clinical example involving prostate cancer therapy, three parameters were used with two cut-off values each (Gleason score, PSA, T-stage) resulting in a total of 27 possible combinations per decision tree. Despite significant variations among the recommendations, a mode recommendation could be found for specific combinations of parameters. Recommendations represented as decision trees can serve as a basis for objective consensus among multiple parties.
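
    The mode-based consensus described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the level names, the three "centres" and the recommendation labels "A" and "B" are invented placeholders, not data from the study. Each centre's decision tree is modelled as a function of the three parameters, and two cut-off values per parameter give three levels each, hence 3 x 3 x 3 = 27 combinations.

```python
from collections import Counter
from itertools import product

# Assumption: two cut-off values per parameter yield three levels each.
LEVELS = ("low", "mid", "high")
PARAMETERS = ("gleason", "psa", "t_stage")


def mode_consensus(trees):
    """For every parameter combination, return the most common (mode)
    recommendation and the share of sources that agree with it."""
    consensus = {}
    for combo in product(LEVELS, repeat=len(PARAMETERS)):
        votes = Counter(tree(*combo) for tree in trees)
        recommendation, count = votes.most_common(1)[0]
        consensus[combo] = (recommendation, count / len(trees))
    return consensus


# Hypothetical centres; "A" and "B" are placeholder recommendations.
def centre_a(gleason, psa, t_stage):
    return "B" if "high" in (gleason, psa, t_stage) else "A"


def centre_b(gleason, psa, t_stage):
    return "B" if gleason == "high" else "A"


def centre_c(gleason, psa, t_stage):
    return "A" if (gleason, psa, t_stage) == ("low", "low", "low") else "B"


consensus = mode_consensus([centre_a, centre_b, centre_c])
```

    With an odd number of sources and two possible recommendations there are no ties; a real analysis would need an explicit rule for combinations where no single mode exists.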

  16. Decision trees in epidemiological research

    National Research Council Canada - National Science Library

    Ashwini Venkatasubramaniam; Julian Wolfson; Nathan Mitchell; Timothy Barnes; Meghan JaKa; Simone French

    2017-01-01

    .... Main text We review the literature on decision trees, a family of techniques for partitioning the population, on the basis of covariates, into distinct subgroups who share similar values of an outcome variable...

  17. EVALUATION OF THE WORK CONDITIONS OF ACTIVITIES OF URBAN TREE PRUNING

    Directory of Open Access Journals (Sweden)

    Nilton César Fiedler

    2007-03-01

    This work analyzed the work environment of tree-pruning activities in urban arborization, compared the measurements with the values set by legislation, and considered the practical application of the results to provide greater comfort, safety, health and welfare for workers, as well as better efficiency and quality of work. The thermal conditions, noise levels, lighting and vibration were analyzed using suitable ergonomic methods. The thermal conditions were within the limit set by legislation (NR15) for the wet bulb globe temperature index (IBUTG) of 25°C for pruning activities, except at twelve o'clock (26.2°C), when the work regime should be 30 minutes of work followed by 30 minutes of rest. The noise levels were 105.7 dB(A) for cutting and 103.9 dB(A) for bucking, above the level permitted by legislation (NR15). The minimum lighting values were acceptable under the legislation (NBR 5413/92), but the global indices were too high and could cause problems for worker health. The vibration conditions were acceptable.

  18. Severity of scab and its effects on fruit weight in mechanically hedge-pruned and topped pecan trees

    Science.gov (United States)

    Scab is the most damaging disease of pecan in the southeastern USA. Pecan trees can attain 44 m in height, so managing disease in the upper canopy is a problem. Fungicide is ordinarily applied using ground-based air-blast sprayers. Although mechanical hedge-pruning and topping of pecan is done for s...

  19. TCF bleaching sequence in kraft pulping of olive tree pruning residues.

    Science.gov (United States)

    Requejo, A; Rodríguez, A; Colodette, J L; Gomide, J L; Jiménez, L

    2012-08-01

    The aim of the present work was to find a suitable Kraft cooking process for olive tree pruning (OTP), in order to produce pulp with a kappa number of about 17. The Kraft pulp produced under optimized conditions showed a viscosity of 31.5 mPa·s and good physical, mechanical, and optical properties, which are suitable for paper production. The physical-mechanical and optical properties were measured before and after bleaching. Although the OTP pulp was bleached to 90.9% ISO brightness (kappapulp showed a brightness reversion equal to 1.3%. Furthermore, this bleached pulp did not need a high intensity of beating, owing to the high drainability of the unbeaten pulp. OTP is therefore suggested as an interesting raw material for cellulosic pulp production because its properties are comparable to those of other agricultural residues currently used in the paper industry. Copyright © 2012 Elsevier Ltd. All rights reserved.

  20. Monomeric carbohydrates production from olive tree pruning biomass: modeling of dilute acid hydrolysis.

    Science.gov (United States)

    Puentes, Juan G; Mateo, Soledad; Fonseca, Bruno G; Roberto, Inês C; Sánchez, Sebastián; Moya, Alberto J

    2013-12-01

    Statistical modeling and optimization of dilute sulfuric acid hydrolysis of olive tree pruning biomass has been performed using response surface methodology. A central composite rotatable design was applied to assess the effect of acid concentration, reaction time and temperature on the efficiency and selectivity of hemicellulosic monomeric carbohydrates to d-xylose. A second-order polynomial model was fitted to the experimental data by multiple regression analysis to find the optimum reaction conditions. A monomeric d-xylose recovery of 85% (as predicted by the model) was achieved under optimized hydrolysis conditions (1.27% acid concentration, 96.5°C and 138 min), confirming the high validity of the developed model. The content of d-glucose (8.3%) and monosaccharide degradation products (0.1% furfural and 0.04% 5-hydroxymethylfurfural) provided a high-quality substrate, ready for subsequent biochemical conversion to value-added products. Copyright © 2013 Elsevier Ltd. All rights reserved.
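
    For orientation, a second-order polynomial model in three factors usually takes the general form below. The abstract does not give the fitted equation, so the symbols are assumptions introduced here: y is the response (for example d-xylose recovery), x_1, x_2 and x_3 stand for the coded acid concentration, reaction time and temperature, and the beta coefficients are the values estimated by multiple regression.

```latex
y = \beta_0
  + \sum_{i=1}^{3} \beta_i x_i
  + \sum_{i=1}^{3} \beta_{ii} x_i^{2}
  + \sum_{1 \le i < j \le 3} \beta_{ij} x_i x_j
  + \varepsilon
```

    The model has ten coefficients (one intercept, three linear, three quadratic and three interaction terms), and the optimum conditions are found at a stationary point of the fitted surface.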

  1. Algorithms for Decision Tree Construction

    KAUST Repository

    Chikalov, Igor

    2011-01-01

    The study of algorithms for decision tree construction was initiated in the 1960s. The first algorithms were based on the separation heuristic [13, 31], which at each step tries to divide the set of objects as evenly as possible. Later, Garey and Graham [28] showed that such an algorithm may construct decision trees whose average depth is arbitrarily far from the minimum. Hyafil and Rivest [35] proved NP-hardness of the DT problem, that is, constructing a tree with the minimum average depth for a diagnostic problem over a 2-valued information system and a uniform probability distribution. Cox et al. [22] showed that for a two-class problem over an information system, even finding the root node attribute for an optimal tree is an NP-hard problem. © Springer-Verlag Berlin Heidelberg 2011.

  2. Improved Frame Mode Selection for AMR-WB+ Based on Decision Tree

    Science.gov (United States)

    Kim, Jong Kyu; Kim, Nam Soo

    In this letter, we propose a coding mode selection method for the AMR-WB+ audio coder based on a decision tree. In order to reduce computation while maintaining good performance, a decision tree classifier is adopted with the closed-loop mode selection results as the target classification labels. The size of the decision tree is controlled by pruning, so the proposed method does not increase the memory requirement significantly. Through an evaluation test on a database covering both speech and music materials, the proposed method is found to achieve much better mode selection accuracy than the open-loop mode selection module in the AMR-WB+.

  3. Targeted pruning of a neuron’s dendritic tree via femtosecond laser dendrotomy

    Science.gov (United States)

    Go, Mary Ann; Choy, Julian Min Chiang; Colibaba, Alexandru Serban; Redman, Stephen; Bachor, Hans-A.; Stricker, Christian; Daria, Vincent Ricardo

    2016-01-01

    Neurons are classified according to action potential firing in response to current injection. While such firing patterns are shaped by the composition and distribution of ion channels, modelling studies suggest that the geometry of dendritic branches also influences temporal firing patterns. Verifying this link is crucial to understanding how neurons transform their inputs to output but has so far been technically challenging. Here, we investigate branching-dependent firing by pruning the dendritic tree of pyramidal neurons. We use a focused ultrafast laser to achieve highly localized and minimally invasive cutting of dendrites, thus keeping the rest of the dendritic tree intact and the neuron functional. We verify successful dendrotomy via two-photon uncaging of neurotransmitters before and after dendrotomy at sites around the cut region and via biocytin staining. Our results show that significantly altering the dendritic arborisation, such as by severing the apical trunk, enhances excitability in layer V cortical pyramidal neurons as predicted by simulations. This method may be applied to the analysis of specific relationships between dendritic structure and neuronal function. The capacity to dynamically manipulate dendritic topology or isolate inputs from various dendritic domains can provide a fresh perspective on the roles they play in shaping neuronal output.

  4. Decision tree modeling using R.

    Science.gov (United States)

    Zhang, Zhongheng

    2016-08-01

    In the machine learning field, the decision tree learner is powerful and easy to interpret. It employs a recursive binary partitioning algorithm that splits the sample on the partitioning variable with the strongest association with the response variable. The process continues until some stopping criteria are met. In the example, I focus on the conditional inference tree, which incorporates tree-structured regression models into conditional inference procedures. Because a single tree is sensitive to small changes in the training data, the random forests procedure is introduced to address this problem. The sources of diversity for random forests come from random sampling and the restricted set of input variables available for selection. Finally, I introduce R functions to perform model-based recursive partitioning. This method incorporates recursive partitioning into conventional parametric model building.
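
    The recursive binary partitioning loop mentioned in this abstract can be illustrated with a short, dependency-free Python sketch. It is not the conditional inference procedure the article describes: as a simplifying assumption, the split is chosen by the largest reduction in the sum of squared errors rather than by an association test, and the stopping rule is a minimum child size plus a minimum gain. All function names and the toy data are hypothetical.

```python
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, target, min_size):
    """Return (gain, feature, threshold) for the best binary split, or None."""
    base = sse([r[target] for r in rows])
    best = None
    for feature in rows[0]:
        if feature == target:
            continue
        for threshold in sorted({r[feature] for r in rows})[:-1]:
            left = [r[target] for r in rows if r[feature] <= threshold]
            right = [r[target] for r in rows if r[feature] > threshold]
            if len(left) < min_size or len(right) < min_size:
                continue
            gain = base - sse(left) - sse(right)
            if best is None or gain > best[0]:
                best = (gain, feature, threshold)
    return best


def grow(rows, target, min_size=2, min_gain=1e-9):
    """Split recursively until no admissible split improves the fit."""
    split = best_split(rows, target, min_size)
    if split is None or split[0] <= min_gain:
        return {"leaf": mean([r[target] for r in rows])}
    _, feature, threshold = split
    left = [r for r in rows if r[feature] <= threshold]
    right = [r for r in rows if r[feature] > threshold]
    return {"feature": feature, "threshold": threshold,
            "left": grow(left, target, min_size, min_gain),
            "right": grow(right, target, min_size, min_gain)}


def predict(tree, row):
    """Follow the splits down to a leaf and return its mean response."""
    while "leaf" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["leaf"]


# Toy data: the response depends on x only; z is an unrelated variable.
rows = [{"x": x, "z": z, "y": 1.0 if x <= 4 else 5.0}
        for x, z in zip(range(1, 9), [3, 1, 4, 1, 5, 9, 2, 6])]
tree = grow(rows, target="y")
```

    On the toy data the root split falls on x, the only variable related to the response, and both children become leaves because no further split reduces the error.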

  5. A composition theorem for decision tree complexity

    OpenAIRE

    Montanaro, Ashley

    2013-01-01

    We completely characterise the complexity in the decision tree model of computing composite relations of the form h = g(f^1,...,f^n), where each relation f^i is boolean-valued. Immediate corollaries include a direct sum theorem for decision tree complexity and a tight characterisation of the decision tree complexity of iterated boolean functions.

  6. Totally optimal decision trees for Boolean functions

    KAUST Repository

    Chikalov, Igor

    2016-07-28

    We study decision trees which are totally optimal relative to different sets of complexity parameters for Boolean functions. A totally optimal tree is an optimal tree relative to each parameter from the set simultaneously. We consider the parameters characterizing both time (in the worst- and average-case) and space complexity of decision trees, i.e., depth, total path length (average depth), and number of nodes. We have created tools based on extensions of dynamic programming to study totally optimal trees. These tools are applicable to both exact and approximate decision trees, and allow us to make multi-stage optimization of decision trees relative to different parameters and to count the number of optimal trees. Based on the experimental results we have formulated the following hypotheses (and subsequently proved): for almost all Boolean functions there exist totally optimal decision trees (i) relative to the depth and number of nodes, and (ii) relative to the depth and average depth.
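
    For very small Boolean functions, total optimality relative to depth and number of nodes can be checked by brute-force dynamic programming over partial assignments. The sketch below is an illustration only and is not the authors' tool; the function names are hypothetical, and it assumes that the number of nodes counts leaves as well as internal nodes. It computes the minimum number of nodes under a depth limit, then tests whether a tree of minimum depth can also reach the unrestricted minimum number of nodes.

```python
from functools import lru_cache
from itertools import product
from math import inf


def analyse(f, n):
    """Return (min depth, min nodes, totally_optimal) for f: {0,1}^n -> {0,1}.

    Assumption: the node count includes leaves."""

    def is_constant(partial):
        # Enumerate all completions of the partial assignment.
        free = [i for i, v in enumerate(partial) if v is None]
        seen = set()
        for bits in product((0, 1), repeat=len(free)):
            x = list(partial)
            for i, b in zip(free, bits):
                x[i] = b
            seen.add(f(tuple(x)))
        return len(seen) == 1

    @lru_cache(maxsize=None)
    def min_nodes(partial, depth_limit):
        """Fewest nodes of a tree of depth <= depth_limit for this subfunction."""
        if is_constant(partial):
            return 1  # a single leaf
        if depth_limit == 0:
            return inf  # a non-constant subfunction needs at least one query
        best = inf
        for i, v in enumerate(partial):
            if v is None:
                zero = min_nodes(partial[:i] + (0,) + partial[i + 1:], depth_limit - 1)
                one = min_nodes(partial[:i] + (1,) + partial[i + 1:], depth_limit - 1)
                best = min(best, 1 + zero + one)
        return best

    root = (None,) * n
    depth = next(d for d in range(n + 1) if min_nodes(root, d) < inf)
    nodes = min_nodes(root, n)  # depth n never restricts a tree on n variables
    return depth, nodes, min_nodes(root, depth) == nodes


xor_result = analyse(lambda x: x[0] ^ x[1], 2)
and_result = analyse(lambda x: x[0] & x[1], 2)
```

    The running time grows exponentially with the number of variables, so this is usable only as a check on toy functions.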

  7. Meta-learning in decision tree induction

    CERN Document Server

    Grąbczewski, Krzysztof

    2014-01-01

    The book focuses on different variants of decision tree induction but also describes the meta-learning approach in general, which is applicable to other types of machine learning algorithms. The book discusses different variants of decision tree induction and represents a useful source of information to readers wishing to review some of the techniques used in decision tree learning, as well as different ensemble methods that involve decision trees. It is shown that the knowledge of different components used within decision tree learning needs to be systematized to enable the system to generate and evaluate different variants of machine learning algorithms with the aim of identifying the top-most performers or potentially the best one. A unified view of decision tree learning makes it possible to emulate different decision tree algorithms simply by setting certain parameters. As meta-learning requires running many different processes with the aim of obtaining performance results, a detailed description of the experimen...

  8. Extensions of Dynamic Programming: Decision Trees, Combinatorial Optimization, and Data Mining

    KAUST Repository

    Hussain, Shahid

    2016-07-10

    This thesis is devoted to the development of extensions of dynamic programming to the study of decision trees. The considered extensions allow us to make multi-stage optimization of decision trees relative to a sequence of cost functions, to count the number of optimal trees, and to study relationships: cost vs cost and cost vs uncertainty for decision trees by construction of the set of Pareto-optimal points for the corresponding bi-criteria optimization problem. The applications include study of totally optimal (simultaneously optimal relative to a number of cost functions) decision trees for Boolean functions, improvement of bounds on complexity of decision trees for diagnosis of circuits, study of time and memory trade-off for corner point detection, study of decision rules derived from decision trees, creation of new procedure (multi-pruning) for construction of classifiers, and comparison of heuristics for decision tree construction. Part of these extensions (multi-stage optimization) was generalized to well-known combinatorial optimization problems: matrix chain multiplication, binary search trees, global sequence alignment, and optimal paths in directed graphs.

  9. Decision trees and forests: a probabilistic perspective

    OpenAIRE

    Lakshminarayanan, B.

    2016-01-01

    Decision trees and ensembles of decision trees are very popular in machine learning and often achieve state-of-the-art performance on black-box prediction tasks. However, popular variants such as C4.5, CART, boosted trees and random forests lack a probabilistic interpretation since they usually just specify an algorithm for training a model. We take a probabilistic approach where we cast the decision tree structures and the parameters associated with the nodes of a decision tree as a probabil...

  10. Qualidade de frutos da tangerina 'Ponkan' após poda de recuperação / Quality of 'Ponkan' tangerine tree after recovering pruning

    Directory of Open Access Journals (Sweden)

    Vander Mendonça

    2006-04-01

    This study evaluated the fruit quality of 12-year-old 'Ponkan' tangerine trees over the three harvests following two recovery-pruning treatments: top pruning to lower the canopy and pruning of the lower part of the plant (skirt pruning). The trees were 4 m tall, spaced 6 x 4 m, and grafted on 'Cravo' lemon rootstock. The experiment was carried out at the Vito Crincoli Farm in Perdões, MG, Brazil, in a randomized block design with a 4 x 2 factorial scheme: top pruning (no pruning, or pruning at 3.0, 2.5 and 2.0 m from soil level) and skirt pruning (with and without pruning), with four replications and three useful plants per plot. The different types of pruning did not harm fruit quality in the three harvests following pruning. After the third year, the plants that had undergone more severe pruning produced heavier fruits, demonstrating the viability of pruning for recovering fruit quality.

  11. Coast redwood responses to pruning

    Science.gov (United States)

    Kevin L. O'Hara

    2012-01-01

    A large-scale pruning study was established in the winter of 1999 to 2000 at seven different sites on Green Diamond Resource Company forestlands in Humboldt County. The objective of this study was to determine the effects of pruning on increment, epicormic sprouting, stem taper, heartwood formation, and bear damage on these young trees. Pruning treatments varied...

  12. Aerial pruning mechanism, initial real environment test.

    Science.gov (United States)

    Molina, Javier; Hirai, Shinichi

    2017-01-01

    In this research, a pruning mechanism for aerial pruning tasks is tested in a real environment. Since the final goal of the aerial pruning robot will be to prune tree branches close to power lines, some experiments related to wireless communication and pruning performance were conducted. The experiments consisted of testing the communication between two XBee RF modules for monitoring purposes as well as testing the speed control of the circular saw used for pruning tree branches. Results show that both the monitoring and the pruning tasks were successfully done in a real environment.

  13. Representing Boolean Functions by Decision Trees

    KAUST Repository

    Chikalov, Igor

    2011-01-01

    A Boolean or discrete function can be represented by a decision tree. A compact form of decision tree named binary decision diagram or branching program is widely known in logic design [2, 40]. This representation is equivalent to other forms, and in some cases it is more compact than values table or even the formula [44]. Representing a function in the form of decision tree allows applying graph algorithms for various transformations [10]. Decision trees and branching programs are used for effective hardware [15] and software [5] implementation of functions. For the implementation to be effective, the function representation should have minimal time and space complexity. The average depth of decision tree characterizes the expected computing time, and the number of nodes in branching program characterizes the number of functional elements required for implementation. Often these two criteria are incompatible, i.e. there is no solution that is optimal on both time and space complexity. © Springer-Verlag Berlin Heidelberg 2011.

  14. Produção da tangerineira 'ponkan' após poda de recuperação / Production of 'ponkan' tangerine tree after pruning recovery

    Directory of Open Access Journals (Sweden)

    Vander Mendonça

    2008-02-01

    This study tested the effect of top pruning to lower the canopy and of skirt pruning on the recovery of 12-year-old 'Ponkan' tangerine trees, 4 m tall, spaced 6 x 4 m and grafted on 'Cravo' lemon rootstock. The experiment was carried out at the Vito Crincoli Farm in Perdões, MG, in a randomized block design with a 4 x 2 factorial scheme: top pruning (no pruning, or pruning at 3.0, 2.5 and 2.0 m) and skirt pruning (with and without pruning), with four replications. Each useful plot consisted of three plants. Heavy top pruning reduced the first yield, but from the second year after pruning the plants recovered well. This was confirmed at the third harvest, when the different top-pruning treatments did not differ in yield, while the skirt-pruning treatment was superior to no skirt pruning.

  15. From Family Trees to Decision Trees.

    Science.gov (United States)

    Trobian, Helen R.

    This paper is a preliminary inquiry by a non-mathematician into graphic methods of sequential planning and ways in which hierarchical analysis and tree structures can be helpful in developing interest in the use of mathematical modeling in the search for creative solutions to real-life problems. Highlights include a discussion of hierarchical…

  16. Enzymatic hydrolyses of pretreated eucalyptus residues, wheat straw or olive tree pruning, and their mixtures towards flexible sugar-based biorefineries

    DEFF Research Database (Denmark)

    Silva-Fernandes, Talita; Marques, Susana; Rodrigues, Rita C. L. B.

    2016-01-01

    Eucalyptus residues, wheat straw, and olive tree pruning are lignocellulosic materials largely available in Southern Europe and have high potential to be used solely or in mixtures in sugar-based biorefineries for the production of biofuels and other bio-based products. Enzymatic hydrolysis of ce...

  17. The decision tree approach to classification

    Science.gov (United States)

    Wu, C.; Landgrebe, D. A.; Swain, P. H.

    1975-01-01

    A class of multistage decision tree classifiers is proposed and studied relative to the classification of multispectral remotely sensed data. The decision tree classifiers are shown to have the potential for improving both the classification accuracy and the computation efficiency. Dimensionality in pattern recognition is discussed and two theorems on the lower bound of logic computation for multiclass classification are derived. The automatic or optimization approach is emphasized. Experimental results on real data are reported, which clearly demonstrate the usefulness of decision tree classifiers.

  18. Comprehensive decision tree models in bioinformatics.

    Science.gov (United States)

    Stiglic, Gregor; Kocbek, Simon; Pernek, Igor; Kokol, Peter

    2012-01-01

    Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by the so-called one-button data mining approach, where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model, which is constrained exclusively by the dimensions of the produced decision tree. The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expect significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm. 
In addition, our study demonstrates the suitability of visually tuned decision trees for datasets with binary class attributes and a high number of possibly

  19. Rootstocks influence yield performance of navel orange trees after drastic pruning

    Directory of Open Access Journals (Sweden)

    Henrique Belmonte Petry

    2015-12-01

    Drastic pruning is an alternative control measure recommended in orchards affected by citrus canker (Xanthomonas citri subsp. citri). This study aimed at evaluating the influence of six rootstocks on growth, yield and quality of 'Monte Parnaso' (Citrus sinensis (L.) Osb.) navel oranges, after performing a drastic pruning to eradicate the citrus canker. A randomized complete block design, with six treatments and four replicates, was used. The following rootstocks were tested: 'Caipira' sweet orange (C. sinensis (L.) Osb.), 'Volkamer' lemon (C. volkameriana Pasq.), 'Cravo' Rangpur lime (C. limonia Osb.), 'Swingle' citrumelo (C. paradisi Macf. x Poncirus trifoliata (L.) Raf.), 'Sunki' mandarin (C. sunki Hort. ex Tan.) and 'Troyer' citrange (C. sinensis x P. trifoliata). Traits related to plant height, yield and fruit quality were evaluated. The largest cumulative yield was obtained from 'Cravo', 'Volkamer' and 'Sunki'. 'Cravo' and 'Volkamer' induced higher production efficiency, fruits with the highest average weight and the lowest pre-harvest fruit drop. All the evaluated rootstocks produced high quality fruits and similar canopy sizes.

  20. PRIA 3 Fee Determination Decision Tree

    Science.gov (United States)

    The PRIA 3 decision tree will help applicants requesting a pesticide registration or certain tolerance action to accurately identify the category of their application and the amount of the required fee before they submit the application.

  1. RE-Powering’s Electronic Decision Tree

    Science.gov (United States)

    Developed by US EPA's RE-Powering America's Land Initiative, the RE-Powering Decision Trees tool guides interested parties through a process to screen sites for their suitability for solar photovoltaics or wind installations.

  2. Speech Recognition Using Randomized Relational Decision Trees

    National Research Council Canada - National Science Library

    Amit, Yali

    1999-01-01

    .... This implies that we recognize words as units, without recognizing their subcomponents. Multiple randomized decision trees are used to access the large pool of acoustic events in a systematic manner and are aggregated to produce the classifier.

  3. Solar and Wind Site Screening Decision Trees

    Science.gov (United States)

    EPA and NREL created a decision tree to guide state and local governments and other stakeholders through a process for screening sites for their suitability for future redevelopment with solar photovoltaic (PV) energy and wind energy.

  4. A survey of decision tree classifier methodology

    Science.gov (United States)

    Safavian, S. R.; Landgrebe, David

    1991-01-01

    Decision tree classifiers (DTCs) are used successfully in many diverse areas such as radar signal classification, character recognition, remote sensing, medical diagnosis, expert systems, and speech recognition. Perhaps the most important feature of DTCs is their capability to break down a complex decision-making process into a collection of simpler decisions, thus providing a solution which is often easier to interpret. A survey of current methods is presented for DTC designs and the various existing issues. After considering potential advantages of DTCs over single-stage classifiers, subjects of tree structure design, feature selection at each internal node, and decision and search strategies are discussed.

  5. A survey of decision tree classifier methodology

    Science.gov (United States)

    Safavian, S. Rasoul; Landgrebe, David

    1990-01-01

    Decision Tree Classifiers (DTC's) are used successfully in many diverse areas such as radar signal classification, character recognition, remote sensing, medical diagnosis, expert systems, and speech recognition. Perhaps the most important feature of DTC's is their capability to break down a complex decision-making process into a collection of simpler decisions, thus providing a solution which is often easier to interpret. A survey of current methods is presented for DTC designs and the various existing issues. After considering potential advantages of DTC's over single-stage classifiers, subjects of tree structure design, feature selection at each internal node, and decision and search strategies are discussed.

  6. Parallel object-oriented decision tree system

    Science.gov (United States)

    Kamath, Chandrika [Dublin, CA]; Cantu-Paz, Erick [Oakland, CA]

    2006-02-28

    A data mining decision tree system that uncovers patterns, associations, anomalies, and other statistically significant structures in data by reading and displaying data files, extracting relevant features for each of the objects, and using a method of recognizing patterns among the objects based upon object features through a decision tree that reads the data, sorts the data if necessary, determines the best manner to split the data into subsets according to some criterion, and splits the data.

  7. Combining soft decision algorithms and scale-sequential hypotheses pruning for object recognition

    Energy Technology Data Exchange (ETDEWEB)

    Kumar, V.P.; Manolakos, E.S. [Northeastern Univ., Boston, MA (United States)]

    1996-12-31

    This paper describes a system that exploits the synergy of Hierarchical Mixture Density (HMD) estimation with multiresolution decomposition based hypothesis pruning to efficiently perform joint segmentation and labeling of partially occluded objects in images. First we present the overall structure of the HMD estimation algorithm in the form of a recurrent neural network which generates the posterior probabilities of the various hypotheses associated with the image. Then, in order to reduce the large memory and computation requirements, we propose a hypothesis pruning scheme making use of the orthonormal discrete wavelet transform for dimensionality reduction. We provide an intuitive justification for the validity of this scheme and present experimental results and performance analysis on real and synthetic images to verify our claims.

  8. Fast Image Texture Classification Using Decision Trees

    Science.gov (United States)

    Thompson, David R.

    2011-01-01

    Texture analysis would permit improved autonomous, onboard science data interpretation for adaptive navigation, sampling, and downlink decisions. These analyses would assist with terrain analysis and instrument placement in both macroscopic and microscopic image data products. Unfortunately, most state-of-the-art texture analysis demands computationally expensive convolutions of filters involving many floating-point operations. This makes them infeasible for radiation-hardened computers and spaceflight hardware. A new method approximates traditional texture classification of each image pixel with a fast decision-tree classifier. The classifier uses image features derived from simple filtering operations involving integer arithmetic. The texture analysis method is therefore amenable to implementation on FPGA (field-programmable gate array) hardware. Image features based on the "integral image" transform produce descriptive and efficient texture descriptors. Training the decision tree on a set of training data yields a classification scheme that produces reasonable approximations of optimal "texton" analysis at a fraction of the computational cost. A decision-tree learning algorithm employing the traditional k-means criterion of inter-cluster variance is used to learn tree structure from training data. The result is an efficient and accurate summary of surface morphology in images. This work is an evolutionary advance that unites several previous algorithms (k-means clustering, integral images, decision trees) and applies them to a new problem domain (morphology analysis for autonomous science during remote exploration). Advantages include order-of-magnitude improvements in runtime, feasibility for FPGA hardware, and significant improvements in texture classification accuracy.
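The "integral image" transform mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `integral_image` and `box_sum` and the 3×3 sample image are assumptions made for illustration. The sketch shows why box-filter features need only integer additions and four table lookups per rectangle.

```python
def integral_image(img):
    """Build a summed-area table for a 2D list of ints.

    The table has one extra zero row and zero column so that
    box_sum needs no boundary checks.
    """
    h, w = len(img), len(img[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0  # running sum of the current image row
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def box_sum(sat, top, left, bottom, right):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1."""
    return sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]


# Hypothetical 3x3 image used only to exercise the functions.
img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
sat = integral_image(img)
print(box_sum(sat, 1, 1, 3, 3))  # sum of the lower-right 2x2 block
```

Once the table is built, the cost of a box sum does not depend on the size of the box, which is the property that makes such features attractive for a per-pixel decision-tree classifier.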

  9. CUDT: a CUDA based decision tree algorithm.

    Science.gov (United States)

    Lo, Win-Tsung; Chang, Yue-Shan; Sheu, Ruey-Kai; Chiu, Chun-Chieh; Yuan, Shyan-Ming

    2014-01-01

    Decision trees are among the best-known classification methods in data mining. Much research has focused on improving decision tree performance. However, those algorithms are developed and run on traditional distributed systems. Without new technology, latency cannot be improved when processing the huge volumes of data generated by ubiquitous sensing nodes. To improve data processing latency in large-scale data mining, in this paper we design and implement a new parallelized decision tree algorithm on CUDA (compute unified device architecture), a GPGPU solution provided by NVIDIA. In the proposed system, the CPU is responsible for flow control while the GPU is responsible for computation. We have conducted many experiments to evaluate the system performance of CUDT and compared it with a traditional CPU version. The results show that CUDT is 5 to 55 times faster than Weka-j48 and achieves an 18-fold speedup over SPRINT for large data sets.

  10. Minimization of Decision Tree Average Depth for Decision Tables with Many-valued Decisions

    KAUST Repository

    Azad, Mohammad

    2014-09-13

    The paper is devoted to the analysis of greedy algorithms for the minimization of average depth of decision trees for decision tables such that each row is labeled with a set of decisions. The goal is to find one decision from the set of decisions. Comparing with the optimal results obtained from a dynamic programming algorithm, we find that some greedy algorithms produce results close to the optimal for the minimization of average depth of decision trees.

  11. Algorithms for optimal dyadic decision trees

    Energy Technology Data Exchange (ETDEWEB)

    Hush, Don [Los Alamos National Laboratory; Porter, Reid [Los Alamos National Laboratory

    2009-01-01

    A new algorithm for constructing optimal dyadic decision trees was recently introduced, analyzed, and shown to be very effective for low dimensional data sets. This paper enhances and extends this algorithm by: introducing an adaptive grid search for the regularization parameter that guarantees optimal solutions for all relevant tree sizes, revising the core tree-building algorithm so that its run time is substantially smaller for most regularization parameter values on the grid, and incorporating new data structures and data pre-processing steps that provide significant run time enhancement in practice.

  12. Steam explosion treatment for ethanol production from branches pruned from pear trees by simultaneous saccharification and fermentation.

    Science.gov (United States)

    Sasaki, Chizuru; Okumura, Ryosuke; Asada, Chikako; Nakamura, Yoshitoshi

    2014-01-01

    This study investigated the production of ethanol from unutilized branches pruned from pear trees by steam explosion pretreatment. Steam pressures of 25, 35, and 45 atm were applied for 5 min, followed by enzymatic saccharification of the extracted residues with cellulase (Cellic CTec2). High glucose recoveries, of 93.3, 99.7, and 87.1%, of the total sugar derived from the cellulose were obtained from water- and methanol-extracted residues after steam explosion at 25, 35, and 45 atm, respectively. These values corresponded to 34.9, 34.3, and 27.1 g of glucose per 100 g of dry steam-exploded branches. Simultaneous saccharification and fermentation experiments were done on water-extracted residues and water- and methanol-extracted residues by Kluyveromyces marxianus NBRC 1777. An overall highest theoretical ethanol yield of 76% of the total sugar derived from cellulose was achieved when 100 g/L of water- and methanol-washed residues from 35 atm-exploded pear branches was used as substrate.

  13. The potential of legume tree prunings as organic matters for improving phosphorus availability in an acid soil

    Directory of Open Access Journals (Sweden)

    I Wahyudi

    2015-01-01

    A glasshouse study was conducted to elucidate the roles of Gliricidia sepium and Tithonia diversifolia prunings and their extracted humic and fulvic acids in improving phosphorus availability and decreasing aluminum concentration in an Ultisol. Thirteen treatments consisting of two prunings, six rates of pruning application (5, 7.5, 10, 20, 40 and 80 t/ha) and one control (no added prunings) were arranged in a randomized block design with four replicates. Each mixture of prunings and soil was placed in a pot containing 8 kg of soil, and maize of the Srikandi cultivar was grown in it for 45 days. At harvest, soil pH, P content and aluminium concentration were measured. Results of the glasshouse experiment showed that application of Gliricidia and Tithonia prunings significantly increased soil pH, reduced Alo concentration, increased Alp content, increased P availability, and increased P taken up by maize grown for 45 days. The optimum rate of both Gliricidia and Tithonia pruning should be 40 t/ha. However, at the same rate, optimum production gained by Tithonia would be higher than that of Gliricidia.

  14. Financial analysis of pruning coast Douglas-fir.

    Science.gov (United States)

    Roger D. Fight; James M. Cahill; Thomas D. Fahey; Thomas A. Snellgrove

    1987-01-01

    Pruning of coast Douglas-fir was evaluated; recent product recovery information for pruned and unpruned logs for both sawn and peeled products was used. Dimensions of pruned and unpruned trees were simulated with the Douglas-fir stand simulator (DFSIM). Results are presented for a range of sites, ages at time of pruning, ages at time of harvest, product prices, and...

  15. Two Trees: Migrating Fault Trees to Decision Trees for Real Time Fault Detection on International Space Station

    Science.gov (United States)

    Lee, Charles; Alena, Richard L.; Robinson, Peter

    2004-01-01

    Starting from example ISS fault trees, we present a method for converting fault trees to decision trees. The method makes the root cause of a fault easier to visualize and makes tree manipulation more programmatic through available decision tree programs. The decision tree visualization of the diagnostics is straightforward and easy to understand. For ISS real-time fault diagnostics, the status of the systems can be shown by passing the signals through the trees and seeing where they stop. Another advantage of decision trees is that they can learn fault patterns and predict future faults from historical data. Learning is not limited to static data sets but can also be online: by accumulating real-time data sets, the decision trees can acquire and store fault patterns and recognize them when they recur.

  16. Safety validation of decision trees for hepatocellular carcinoma.

    Science.gov (United States)

    Wang, Xian-Qiang; Liu, Zhe; Lv, Wen-Ping; Luo, Ying; Yang, Guang-Yun; Li, Chong-Hui; Meng, Xiang-Fei; Liu, Yang; Xu, Ke-Sen; Dong, Jia-Hong

    2015-08-21

    To evaluate a different decision tree for safe liver resection and verify its efficiency. A total of 2457 patients underwent hepatic resection between January 2004 and December 2010 at the Chinese PLA General Hospital, and 634 hepatocellular carcinoma (HCC) patients were eligible for the final analyses. Post-hepatectomy liver failure (PHLF) was identified by the association of prothrombin time <50% and serum bilirubin >50 μmol/L (the "50-50" criteria), which were assessed at day 5 postoperatively or later. The Swiss-Clavien decision tree, Tokyo University-Makuuchi decision tree, and Chinese consensus decision tree were adopted to divide patients into two groups based on those decision trees in sequence, and the PHLF rates were recorded. The overall mortality and PHLF rate were 0.16% and 3.0%. A total of 19 patients experienced PHLF. The numbers of patients to whom the Swiss-Clavien, Tokyo University-Makuuchi, and Chinese consensus decision trees were applied were 581, 573, and 622, and the PHLF rates were 2.75%, 2.62%, and 2.73%, respectively. Significantly more cases satisfied the Chinese consensus decision tree than the Swiss-Clavien decision tree and Tokyo University-Makuuchi decision tree (P decision trees. The Chinese consensus decision tree expands the indications for hepatic resection for HCC patients and does not increase the PHLF rate compared to the Swiss-Clavien and Tokyo University-Makuuchi decision trees. It would be a safe and effective algorithm for hepatectomy in patients with hepatocellular carcinoma.

  17. Integrating Decision Tree Learning into Inductive Databases

    OpenAIRE

    Fromont, Elisa; Blockeel, Hendrik; Struyf, Jan

    2006-01-01

    In inductive databases, there is no conceptual difference between data and the models describing the data: both can be stored and queried using some query language. The approach that adheres most strictly to this philosophy is probably the one proposed by Calders et al. (2006): in this approach, models are stored in relational tables and queried using standard SQL. The approach has been described in detail for association rule discovery. In this work, we study how decision tree induction can ...

  18. Época de poda da figueira cultivada no estado de São Paulo Pruning time for fig trees in the state of São Paulo

    Directory of Open Access Journals (Sweden)

    Orlando Rigitano

    1963-01-01

    In the State of São Paulo, fig trees (Ficus carica L.) are subjected every year to a type of winter pruning that removes almost all of the canopy formed in the previous season. To study the behavior of fig trees pruned at different times during the winter, an experiment with five pruning dates between May 1 and September 1 was started in Campinas in 1960. Yield data per treatment obtained in 1962 and 1963 are presented, covering the number and weight of figs as well as the mean weight per fruit. The 1963 data revealed significant differences and allowed several conclusions. Pruning on August 1 gave the best results, although it did not differ significantly from pruning on July 1. As expected, pruning at the extreme dates, that is, in early May and early September, resulted in the lowest yields. The more productive treatments tended to give earlier harvests and heavier figs. With a view to comparing the effects on fruit bearing, pruning of fig trees was carried out in Campinas, State of São Paulo, during the dormant season of the plant, at 5 different dates, namely on the 1st day of each of the months of May, June, July, August and September. Pruning was started as soon as the plants became more or less dormant in the fall and was continued until vegetation again appeared at the end of winter. The pruning operation took place in two consecutive years, and at the dates mentioned all the new branches were cut back to short stubs. The experimental plot consisted of 30 trees of the variety "Roxo de Valinhos" (San Piero) spaced 7 by 13 feet apart and was laid out in randomized blocks with 3 replications. The results of this trial can be summarized as follows: (a) Trees pruned on August 1st gave the highest yield, followed by those pruned on July 1st. While the

  19. CUDT: A CUDA Based Decision Tree Algorithm

    Directory of Open Access Journals (Sweden)

    Win-Tsung Lo

    2014-01-01

    Decision trees are among the best-known classification methods in data mining. Much research has focused on improving decision tree performance. However, those algorithms are developed and run on traditional distributed systems. Without new technology, latency cannot be improved when processing the huge volumes of data generated by ubiquitous sensing nodes. To improve data processing latency in large-scale data mining, in this paper we design and implement a new parallelized decision tree algorithm on CUDA (compute unified device architecture), a GPGPU solution provided by NVIDIA. In the proposed system, the CPU is responsible for flow control while the GPU is responsible for computation. We have conducted many experiments to evaluate the system performance of CUDT and compared it with a traditional CPU version. The results show that CUDT is 5 to 55 times faster than Weka-j48 and achieves an 18-fold speedup over SPRINT for large data sets.

  20. The Decision Tree: A Tool for Achieving Behavioral Change.

    Science.gov (United States)

    Saren, Dru

    1999-01-01

    Presents a "Decision Tree" process for structuring team decision making and problem solving about specific student behavioral goals. The Decision Tree involves a sequence of questions/decisions that can be answered in "yes/no" terms. Questions address reasonableness of the goal, time factors, importance of the goal, responsibilities, safety,…

  1. Bias-variance tradeoff of soft decision trees

    OpenAIRE

    Olaru, Cristina; Wehenkel, Louis

    2004-01-01

    This paper focuses on the study of the error composition of a fuzzy decision tree induction method recently proposed by the authors, called soft decision trees. This error may be expressed as a sum of three types of error: residual error, bias and variance. The paper studies empirically the tradeoff between bias and variance in a soft decision tree method and compares it with the tradeoff of classical crisp regression and classification trees. The m...

  2. Horizontal bone augmentation: the decision tree.

    Science.gov (United States)

    Fu, Jia-Hui; Wang, Hom-Lay

    2011-01-01

    The emergence of implant dentistry has led to the need for bone augmentation procedures. With the removal of a tooth, there is an inevitable three-dimensional (3D) loss of alveolar bone. More often than not, horizontal bone loss occurs at a faster rate and to a greater extent compared to vertical bone loss. This led to the development of several horizontal bone augmentation techniques, such as guided bone regeneration, ridge expansion, distraction osteogenesis, and block grafts. These proposed augmentation techniques aim to place the implant in an ideal 3D position for successful restorative therapy. The literature has shown that horizontal bone augmentation is fairly predictable if certain criteria are fulfilled. However, with numerous techniques and materials currently available, it is difficult to choose the most suitable treatment modality. A search of the literature available was conducted to validate the decision-making process when planning for a horizontal ridge augmentation procedure. The decision tree proposed in this paper stems from the 3D buccolingual bone width available at the site of implant placement (⋝ 3.5 mm, factors such as the tissue thickness, the arch position, and the availability of autogenous bone. The decision tree provides insight on how clinicians can choose the most appropriate and predictable horizontal ridge augmentation procedure to minimize unnecessary complications.

  3. Shopping intention prediction using decision trees

    Directory of Open Access Journals (Sweden)

    Dario Šebalj

    2017-09-01

    Introduction: Price is considered a neglected element of the marketing mix because of the complexity of price management and the sensitivity of customers to price changes. Price changes draw the fastest customer reactions. Accordingly, the process of making shopping decisions can be very challenging for customers. Objective: The aim of this paper is to create a model that is able to predict shopping intention and classify respondents into one of two categories, depending on whether they intend to shop or not. Methods: The data sample consists of 305 respondents, who are persons older than 18 years involved in buying groceries for their household. The research was conducted in February 2017. To create a model, the decision tree method was used with several of its classification algorithms. Results: All models, except the one that used the RandomTree algorithm, achieved a relatively high classification rate (over 80%). The J48 and RandomForest algorithms gave the highest classification accuracy, 84.75%. Since there is no statistically significant difference between those two algorithms, the authors chose the J48 algorithm to build a decision tree. Conclusions: The value for money and the price level in the store were the most significant variables for classification of shopping intention. Future work will compare this model with other data mining techniques, such as neural networks or support vector machines, since these techniques achieved very good accuracy in some previous research in this field.

  4. Decision trees with minimum average depth for sorting eight elements

    KAUST Repository

    AbouEisha, Hassan M.

    2015-11-19

    We prove that the minimum average depth of a decision tree for sorting 8 pairwise different elements is equal to 620160/8!. We show also that each decision tree for sorting 8 elements, which has minimum average depth (the number of such trees is approximately equal to 8.548×10^326365), has also minimum depth. Both problems were considered by Knuth (1998). To obtain these results, we use tools based on extensions of dynamic programming which allow us to make sequential optimization of decision trees relative to depth and average depth, and to count the number of decision trees with minimum average depth.
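The value 620160/8! can be put in context with a short calculation. The Python sketch below is an added illustration, not part of the paper: it reduces the fraction and compares the total depth 620160 with the standard lower bound on the external path length of a binary tree with N leaves, N·q + 2(N − 2^q) where q = ⌊log2 N⌋. The variable names are assumptions.

```python
from fractions import Fraction
from math import factorial, log2

n_leaves = factorial(8)   # one leaf per permutation of 8 elements
total_depth = 620160      # sum of leaf depths, taken from the abstract

# Minimum average depth stated in the abstract, as an exact fraction.
avg_depth = Fraction(total_depth, n_leaves)

# Information-theoretic bound: no tree can average fewer comparisons.
entropy_bound = log2(n_leaves)

# Smallest possible sum of leaf depths for any binary tree with
# n_leaves leaves (all leaves on the two deepest levels).
q = n_leaves.bit_length() - 1  # floor(log2(n_leaves))
min_path_length = n_leaves * q + 2 * (n_leaves - 2 ** q)

print(avg_depth, float(avg_depth))
print(entropy_bound)
print(min_path_length, total_depth - min_path_length)
```

Under these definitions the total depth exceeds the unconstrained bound by a small margin, which is consistent with the abstract's point that the exact minimum had to be established by computation rather than by the counting bound alone.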

  5. Ethanol production from glucose and xylose obtained from steam exploded water-extracted olive tree pruning using phosphoric acid as catalyst.

    Science.gov (United States)

    Negro, M J; Alvarez, C; Ballesteros, I; Romero, I; Ballesteros, M; Castro, E; Manzanares, P; Moya, M; Oliva, J M

    2014-02-01

    In this work, the effect of phosphoric acid (1% w/w) in steam explosion pretreatment of water extracted olive tree pruning at 175°C and 195°C was evaluated. The objective is to produce ethanol from all sugars (mainly glucose and xylose) contained in the pretreated material. The water insoluble fraction obtained after pretreatment was used as substrate in a simultaneous saccharification and fermentation (SSF) process by a commercial strain of Saccharomyces cerevisiae. The liquid fraction, containing mainly xylose, was detoxified by alkali and ion-exchange resin and then fermented by the xylose fermenting yeast Scheffersomyces stipitis. Ethanol yields reached in a SSF process were close to 80% when using 15% (w/w) substrate consistency and about 70% of theoretical when using prehydrolysates detoxified by ion-exchange resins. Considering sugar recovery and ethanol yields, about 160 g of ethanol per kg of water extracted olive tree pruning could be obtained. Copyright © 2013 Elsevier Ltd. All rights reserved.

  6. Evaluating Lignin-Rich Residues from Biochemical Ethanol Production of Wheat Straw and Olive Tree Pruning by FTIR and 2D-NMR

    Directory of Open Access Journals (Sweden)

    José I. Santos

    2015-01-01

    Lignin-rich residues from the cellulose-based industry are traditionally incinerated for internal energy use. The future biorefineries that convert cellulosic biomass into biofuels will generate more lignin than necessary for internal energy use, and therefore value-added products from lignin could be produced. In this context, a good understanding of lignin is necessary prior to its valorization. The present study focused on the characterization of lignin-rich residues from biochemical ethanol production, including steam explosion, saccharification, and fermentation, of wheat straw and olive tree pruning. In addition to the composition and purity, the lignin structures (S/G ratio, interunit linkages) were investigated by spectroscopy techniques such as FTIR and 2D-NMR. Together with the high lignin content, both residues contained significant amounts of carbohydrates, mainly glucose and protein. Wheat straw lignin showed a very low S/G ratio associated with p-hydroxycinnamates (p-coumarate and ferulate), whereas a strong predominance of S over G units was observed for olive tree pruning lignin. The main interunit linkages present in both lignins were β-O-4′ ethers followed by resinols and phenylcoumarans. These structural characteristics determine the use of these lignins in respect to their valorization.

  7. A tool for study of optimal decision trees

    KAUST Repository

    Alkhalid, Abdulaziz

    2010-01-01

    The paper describes a tool that, for relatively small decision tables, allows consecutive optimization of decision trees relative to various complexity measures such as number of nodes, average depth, and depth, and finds the parameters and the number of optimal decision trees. © 2010 Springer-Verlag Berlin Heidelberg.
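Optimizing a decision tree for a small decision table with respect to depth can be sketched as a dynamic program over subtables. The Python code below is a generic textbook-style illustration and is not the tool described in the abstract: the function name `min_depth`, the table encoding, and the assumption that rows are pairwise different are all hypothetical choices made here.

```python
from functools import lru_cache


def min_depth(rows, decisions):
    """Minimum depth of a decision tree for a small decision table.

    rows: list of equal-length attribute tuples (assumed pairwise different).
    decisions: list with one decision label per row.
    """
    n_attrs = len(rows[0])

    @lru_cache(maxsize=None)
    def solve(subset):
        # subset is a frozenset of row indices, i.e. a subtable.
        if len({decisions[i] for i in subset}) <= 1:
            return 0  # a single leaf is enough
        best = None
        for a in range(n_attrs):
            # Partition the subtable by the value of attribute a.
            groups = {}
            for i in subset:
                groups.setdefault(rows[i][a], []).append(i)
            if len(groups) < 2:
                continue  # this attribute does not split the subtable
            depth = 1 + max(solve(frozenset(g)) for g in groups.values())
            if best is None or depth < best:
                best = depth
        if best is None:
            raise ValueError("equal rows carry different decisions")
        return best

    return solve(frozenset(range(len(rows))))


# Hypothetical table: the decision is the XOR of two binary attributes.
xor_rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
print(min_depth(xor_rows, [0, 1, 1, 0]))
```

Memoizing on the set of row indices means each distinct subtable is solved once. The number of subtables can grow exponentially, which is one reason such exact methods are practical only for relatively small tables.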

  8. Multi-stage optimization of decision and inhibitory trees for decision tables with many-valued decisions

    KAUST Repository

    Azad, Mohammad

    2017-06-16

    We study problems of optimization of decision and inhibitory trees for decision tables with many-valued decisions. As cost functions, we consider depth, average depth, number of nodes, and number of terminal/nonterminal nodes in trees. Decision tables with many-valued decisions (multi-label decision tables) are often more accurate models for real-life data sets than usual decision tables with single-valued decisions. Inhibitory trees can sometimes capture more information from decision tables than decision trees. In this paper, we create dynamic programming algorithms for multi-stage optimization of trees relative to a sequence of cost functions. We apply these algorithms to prove the existence of totally optimal (simultaneously optimal relative to a number of cost functions) decision and inhibitory trees for some modified decision tables from the UCI Machine Learning Repository.

  9. TreePOD: Sensitivity-Aware Selection of Pareto-Optimal Decision Trees.

    Science.gov (United States)

    Muhlbacher, Thomas; Linhardt, Lorenz; Moller, Torsten; Piringer, Harald

    2018-01-01

    Balancing accuracy gains with other objectives such as interpretability is a key challenge when building decision trees. However, this process is difficult to automate because it involves know-how about the domain as well as the purpose of the model. This paper presents TreePOD, a new approach for sensitivity-aware model selection along trade-offs. TreePOD is based on exploring a large set of candidate trees generated by sampling the parameters of tree construction algorithms. Based on this set, visualizations of quantitative and qualitative tree aspects provide a comprehensive overview of possible tree characteristics. Along trade-offs between two objectives, TreePOD provides efficient selection guidance by focusing on Pareto-optimal tree candidates. TreePOD also conveys the sensitivities of tree characteristics on variations of selected parameters by extending the tree generation process with a full-factorial sampling. We demonstrate how TreePOD supports a variety of tasks involved in decision tree selection and describe its integration in a holistic workflow for building and selecting decision trees. For evaluation, we illustrate a case study for predicting critical power grid states, and we report qualitative feedback from domain experts in the energy sector. This feedback suggests that TreePOD enables users with and without statistical background a confident and efficient identification of suitable decision trees.
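Selecting Pareto-optimal candidates along a two-objective trade-off can be sketched in a few lines of Python. This is not TreePOD's implementation: the function name `pareto_front`, the choice of objectives (number of nodes and error rate, both minimized), and the candidate values are assumptions made for illustration.

```python
def pareto_front(candidates):
    """Return the (n_nodes, error) pairs not dominated by any other pair.

    Both objectives are minimized. A pair is dominated when another pair
    is no worse in both objectives and differs from it.
    """
    front = []
    best_error = float("inf")
    # Sorting by size first means a candidate survives only if it has a
    # strictly lower error than every smaller (or equal-sized) tree seen so far.
    for nodes, error in sorted(set(candidates)):
        if error < best_error:
            front.append((nodes, error))
            best_error = error
    return front


# Hypothetical candidate trees: (number of nodes, validation error).
candidates = [(3, 0.30), (5, 0.22), (5, 0.25), (9, 0.22),
              (15, 0.18), (21, 0.18), (31, 0.19)]
print(pareto_front(candidates))
```

For two objectives a sort followed by a single pass is enough. With more objectives a general dominance check would be needed instead.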

  10. Primer on medical decision analysis: Part 2--Building a tree.

    Science.gov (United States)

    Detsky, A S; Naglie, G; Krahn, M D; Redelmeier, D A; Naimark, D

    1997-01-01

    This part of a five-part series covering practical issues in the performance of decision analysis outlines the basic strategies for building decision trees. The authors offer six recommendations for building and programming decision trees. Following these six recommendations will facilitate performance of the sensitivity analyses required to achieve two goals. The first is to find modeling or programming errors, a process known as "debugging" the tree. The second is to determine the robustness of the qualitative conclusions drawn from the analysis.

  11. On algorithm for building of optimal α-decision trees

    KAUST Repository

    Alkhalid, Abdulaziz

    2010-01-01

    The paper describes an algorithm that constructs approximate decision trees (α-decision trees), which are optimal relative to one of the following complexity measures: depth, total path length or number of nodes. The algorithm uses dynamic programming and extends methods described in [4] to constructing approximate decision trees. An adjustable approximation rate allows the algorithm complexity to be controlled. The algorithm is applied to build optimal α-decision trees for two data sets from UCI Machine Learning Repository [1]. © 2010 Springer-Verlag Berlin Heidelberg.

  12. Comparison of greedy algorithms for α-decision tree construction

    KAUST Repository

    Alkhalid, Abdulaziz

    2011-01-01

    A comparison among different heuristics used by greedy algorithms that construct approximate decision trees (α-decision trees) is presented. The comparison is conducted using decision tables based on 24 data sets from UCI Machine Learning Repository [2]. Complexity of decision trees is estimated relative to several cost functions: depth, average depth, number of nodes, number of nonterminal nodes, and number of terminal nodes. Costs of trees built by greedy algorithms are compared with minimum costs calculated by an algorithm based on dynamic programming. The results of experiments assign to each cost function a set of potentially good heuristics that minimize it. © 2011 Springer-Verlag.

  13. Predicting membrane protein types using various decision tree classifiers based on various modes of general PseAAC for imbalanced datasets.

    Science.gov (United States)

    Sankari, E Siva; Manimegalai, D

    2017-12-21

    Predicting membrane protein types is an important and challenging research area in bioinformatics and proteomics. Traditional biophysical methods are used to classify membrane protein types. Because of the large-scale exploration of uncharacterized protein sequences in databases, traditional methods are very time consuming, expensive and susceptible to errors. Hence, it is highly desirable to develop a robust, reliable, and efficient method to predict membrane protein types. Imbalanced datasets and large datasets are often handled well by decision tree classifiers. Because the datasets used are imbalanced, the performance of various decision tree classifiers, such as Decision Tree (DT), Classification And Regression Tree (CART), C4.5, Random tree and REP (Reduced Error Pruning) tree, and of ensemble methods, such as Adaboost, RUS (Random Under Sampling) boost, Rotation forest and Random forest, is analysed. Among the various decision tree classifiers, Random forest performs well in less time, with a good accuracy of 96.35%. Another inference is that the RUS boost decision tree classifier is able to classify one or two samples in the class with very few samples, while the other classifiers such as DT, Adaboost, Rotation forest and Random forest are not sensitive to the classes with fewer samples. The performance of decision tree classifiers is also compared with SVM (Support Vector Machine) and Naive Bayes classifiers. Copyright © 2017 Elsevier Ltd. All rights reserved.

  14. Statistical Decision-Tree Models for Parsing

    CERN Document Server

    Magerman, D M

    1995-01-01

    Syntactic natural language parsers have shown themselves to be inadequate for processing highly-ambiguous large-vocabulary text, as is evidenced by their poor performance on domains like the Wall Street Journal, and by the movement away from parsing-based approaches to text-processing in general. In this paper, I describe SPATTER, a statistical parser based on decision-tree learning techniques which constructs a complete parse for every sentence and achieves accuracy rates far better than any published result. This work is based on the following premises: (1) grammars are too complex and detailed to develop manually for most interesting domains; (2) parsing models must rely heavily on lexical and contextual information to analyze sentences accurately; and (3) existing n-gram modeling techniques are inadequate for parsing models. In experiments comparing SPATTER with IBM's computer manuals parser, SPATTER significantly outperforms the grammar-based parser. Evaluating SPATTER against the Penn Treebank Wall ...

  15. Frequency and intensity of pruning young 'Valencia' orange trees in orchards under organic culture system

    Directory of Open Access Journals (Sweden)

    Emiliano Santarosa

    2010-10-01

    Full Text Available This study evaluated the influence of the frequency and intensity of pruning on the yield and fruit quality of 'Valencia' oranges in a young orchard under an organic management system. The trees were budded on Poncirus trifoliata rootstock and planted in August 2001, at a spacing of 5.0 x 2.5 m, in Montenegro, Rio Grande do Sul (RS). The pruning treatments tested were: A - control, without pruning; B - annual pruning of 15%; C - biennial pruning of 15%; D - biennial pruning of 30%; and E - triennial pruning of 30% of the canopy volume. The experiment had a randomized complete-block design, with four replications and four trees per plot. In the 2006, 2007 and 2008 harvests, the number of fruits, total fruit mass and average fruit mass were evaluated, together with total soluble solids (TSS), total titratable acidity (TTA) and the TSS/TTA ratio of the juice. In young orchards less than seven years old, over three consecutive harvests, fruiting pruning did not alter cumulative production or the physico-chemical quality of the fruits, but it reduced production in the year following pruning.

  16. Ventriculogram segmentation using boosted decision trees

    Science.gov (United States)

    McDonald, John A.; Sheehan, Florence H.

    2004-05-01

    Left ventricular status, reflected in ejection fraction or end systolic volume, is a powerful prognostic indicator in heart disease. Quantitative analysis of these and other parameters from ventriculograms (cine xrays of the left ventricle) is infrequently performed due to the labor required for manual segmentation. None of the many methods developed for automated segmentation has achieved clinical acceptance. We present a method for semi-automatic segmentation of ventriculograms based on a very accurate two-stage boosted decision-tree pixel classifier. The classifier determines which pixels are inside the ventricle at key ED (end-diastole) and ES (end-systole) frames. The test misclassification rate is about 1%. The classifier is semi-automatic, requiring a user to select 3 points in each frame: the endpoints of the aortic valve and the apex. The first classifier stage is 2 boosted decision-trees, trained using features such as gray-level statistics (e.g. median brightness) and image geometry (e.g. coordinates relative to user supplied 3 points). Second stage classifiers are trained using the same features as the first, plus the output of the first stage. Border pixels are determined from the segmented images using dilation and erosion. A curve is then fit to the border pixels, minimizing a penalty function that trades off fidelity to the border pixels with smoothness. ED and ES volumes, and ejection fraction are estimated from border curves using standard area-length formulas. On independent test data, the differences between automatic and manual volumes (and ejection fractions) are similar in size to the differences between two human observers.

  17. Relationships for Cost and Uncertainty of Decision Trees

    KAUST Repository

    Chikalov, Igor

    2013-01-01

    This chapter is devoted to the design of new tools for the study of decision trees. These tools are based on a dynamic programming approach and need the consideration of subtables of the initial decision table. So this approach is applicable only to relatively small decision tables. The considered tools allow us to compute: 1. The minimum cost of an approximate decision tree for a given uncertainty value and a cost function. 2. The minimum number of nodes in an exact decision tree whose depth is at most a given value. For the first tool we considered various cost functions such as: depth and average depth of a decision tree and number of nodes (and number of terminal and nonterminal nodes) of a decision tree. The uncertainty of a decision table is equal to the number of unordered pairs of rows with different decisions. The uncertainty of an approximate decision tree is equal to the maximum uncertainty of a subtable corresponding to a terminal node of the tree. In addition to the algorithms for such tools we also present experimental results applied to various datasets acquired from UCI ML Repository [4]. © Springer-Verlag Berlin Heidelberg 2013.
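
    The two uncertainty definitions given in this abstract can be written out directly. The sketch below assumes a table is represented only by its column of decisions, and a tree only by the decision columns of the subtables at its terminal nodes. The function names and the sample data are illustrative and do not come from the chapter.

```python
from collections import Counter


def table_uncertainty(decisions):
    """Number of unordered pairs of rows with different decisions."""
    n = len(decisions)
    total_pairs = n * (n - 1) // 2
    # Pairs of rows that share a decision are counted per decision value.
    same_pairs = sum(k * (k - 1) // 2 for k in Counter(decisions).values())
    return total_pairs - same_pairs


def tree_uncertainty(leaf_subtables):
    """Maximum uncertainty over the subtables at the terminal nodes."""
    return max(table_uncertainty(d) for d in leaf_subtables)


# Hypothetical decision column of a four-row table.
u_table = table_uncertainty(["a", "a", "b", "c"])

# Hypothetical subtables reached at three terminal nodes of a tree.
u_tree = tree_uncertainty([["a", "a"], ["b", "c"], ["a", "b", "b"]])
print(u_table, u_tree)
```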

  18. Greedy algorithm with weights for decision tree construction

    KAUST Repository

    Moshkov, Mikhail

    2010-12-01

    An approximate algorithm for minimization of weighted depth of decision trees is considered. A bound on the accuracy of this algorithm is obtained which is unimprovable in the general case. Under some natural assumptions on the class NP, the considered algorithm is close (from the point of view of accuracy) to the best polynomial approximate algorithms for minimization of weighted depth of decision trees.

  19. Ensemble of randomized soft decision trees for robust classification

    Indian Academy of Sciences (India)

    For classification, decision trees have become very popular because of their simplicity, interpretability and good performance. To induce a decision tree classifier for data having continuous valued attributes, the most common approach is to split the continuous attribute range into a hard (crisp) partition having two or more ...

  20. 15 CFR Supplement 1 to Part 732 - Decision Tree

    Science.gov (United States)

    2010-01-01

    ... 15 Commerce and Foreign Trade 2 2010-01-01 2010-01-01 false Decision Tree 1 Supplement 1 to Part 732 Commerce and Foreign Trade Regulations Relating to Commerce and Foreign Trade (Continued) BUREAU... THE EAR Pt. 732, Supp. 1 Supplement 1 to Part 732—Decision Tree ER06FE04.000 ...

  1. Decision-Tree Formulation With Order-1 Lateral Execution

    Science.gov (United States)

    James, Mark

    2007-01-01

    A compact symbolic formulation enables mapping of an arbitrarily complex decision tree of a certain type into a highly computationally efficient multidimensional software object. The type of decision trees to which this formulation applies is that known in the art as the Boolean class of balanced decision trees. Parallel lateral slices of an object created by means of this formulation can be executed in constant time, considerably less time than would otherwise be required. Decision trees of various forms are incorporated into almost all large software systems. A decision tree is a way of hierarchically solving a problem, proceeding through a set of true/false responses to a conclusion. By definition, a decision tree has a tree-like structure, wherein each internal node denotes a test on an attribute, each branch from an internal node represents an outcome of a test, and leaf nodes represent classes or class distributions that, in turn, represent possible conclusions. The drawback of decision trees is that execution of them can be computationally expensive (and, hence, time-consuming) because each non-leaf node must be examined to determine whether to progress deeper into a tree structure or to examine an alternative. The present formulation was conceived as an efficient means of representing a decision tree and executing it in as little time as possible. The formulation involves the use of a set of symbolic algorithms to transform a decision tree into a multi-dimensional object, the rank of which equals the number of lateral non-leaf nodes. The tree can then be executed in constant time by means of an order-one table lookup. The sequence of operations performed by the algorithms is summarized as follows: 1. Determination of whether the tree under consideration can be encoded by means of this formulation. 2. Extraction of decision variables. 3. Symbolic optimization of the decision tree to minimize its form. 4. Expansion and transformation of all nested conjunctive
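
    The general idea of trading a node-by-node walk for a single table lookup can be sketched as follows. This is not the symbolic formulation described in the record: it is a simplified illustration that assumes a small Boolean tree encoded as nested tuples and uses a dictionary keyed by the tuple of variable values in place of a multidimensional object. All names and the sample tree are hypothetical.

```python
from itertools import product


# A node is either a leaf label (str) or a tuple (var_index, if_false, if_true).
def evaluate(tree, values):
    """Walk the tree node by node; the cost grows with the depth."""
    while not isinstance(tree, str):
        var, if_false, if_true = tree
        tree = if_true if values[var] else if_false
    return tree


def build_table(tree, n_vars):
    """Precompute the outcome for every assignment of the Boolean variables."""
    return {
        bits: evaluate(tree, bits)
        for bits in product((False, True), repeat=n_vars)
    }


# Hypothetical balanced tree over two Boolean decision variables.
tree = (0, (1, "reject", "review"), (1, "review", "accept"))
table = build_table(tree, n_vars=2)

# After precomputation, executing the tree is one lookup.
print(table[(True, True)])
```

    The table has 2^n entries for n variables, so this naive version only pays off when the number of decision variables is small.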

  2. Veneer grade yield from pruned Douglas-fir.

    Science.gov (United States)

    Edward J. II Dimock; Henry H. Haskell

    1962-01-01

    This paper reports actual veneer yields obtained from 10 trees pruned at age 38 and harvested 20 years later. Information of this kind is needed to help determine if and when to prune and ultimately will be essential to a thorough economic analysis of expected returns from pruning.

  3. Relationships among various parameters for decision tree optimization

    KAUST Repository

    Hussain, Shahid

    2014-01-14

    In this chapter, we study, in detail, the relationships between various pairs of cost functions and between uncertainty measure and cost functions, for decision tree optimization. We provide new tools (algorithms) to compute relationship functions, as well as provide experimental results on decision tables acquired from UCI ML Repository. The algorithms presented in this paper have already been implemented and are now a part of Dagger, which is a software system for construction/optimization of decision trees and decision rules. The main results presented in this chapter deal with two types of algorithms for computing relationships. First, we discuss the case where we construct approximate decision trees and are interested in relationships between a certain cost function, such as depth or number of nodes of a decision tree, and an uncertainty measure, such as misclassification error (accuracy) of a decision tree. Second, relationships between two different cost functions are discussed, for example, the number of misclassifications of a decision tree versus the number of nodes in a decision tree. The results of experiments, presented in the chapter, provide further insight. © 2014 Springer International Publishing Switzerland.

  4. Construction of α-decision trees for tables with many-valued decisions

    KAUST Repository

    Moshkov, Mikhail

    2011-01-01

    The paper is devoted to the study of a greedy algorithm for construction of approximate decision trees (α-decision trees). This algorithm is applicable to decision tables with many-valued decisions where each row is labeled with a set of decisions. For a given row, we should find a decision from the set attached to this row. We consider a bound on the number of algorithm steps, and a bound on the algorithm accuracy relative to the depth of decision trees. © 2011 Springer-Verlag.

  5. Minimization of decision tree depth for multi-label decision tables

    KAUST Repository

    Azad, Mohammad

    2014-10-01

    In this paper, we consider multi-label decision tables that have a set of decisions attached to each row. Our goal is to find one decision from the set of decisions for each row by using a decision tree as our tool. With the aim of minimizing the depth of the decision tree, we devised various kinds of greedy algorithms as well as a dynamic programming algorithm. When we compare with the optimal result obtained from the dynamic programming algorithm, we find that some greedy algorithms produce results which are close to the optimal result for the minimization of depth of decision trees.
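
    A dynamic programming computation of minimum depth for a multi-label table can be sketched as a memoized recursion over subtables. This is an illustration under stated assumptions, not the authors' algorithm: rows are tuples of attribute values, each row carries a set of admissible decisions, and a subtable needs no further test once a single decision belongs to every row's set. The function name and the sample table are hypothetical.

```python
from functools import lru_cache


def min_depth(rows, decision_sets):
    """Minimum depth of a decision tree for a multi-label decision table."""
    n_attrs = len(rows[0])

    @lru_cache(maxsize=None)
    def solve(subtable):
        # Depth 0 if one decision belongs to the decision set of every row.
        if set.intersection(*(set(decision_sets[i]) for i in subtable)):
            return 0
        best = None
        for a in range(n_attrs):
            values = {rows[i][a] for i in subtable}
            if len(values) < 2:
                continue  # this attribute does not split the subtable
            # The depth after testing attribute a is set by its worst branch.
            worst = max(
                solve(frozenset(i for i in subtable if rows[i][a] == v))
                for v in values
            )
            best = 1 + worst if best is None else min(best, 1 + worst)
        if best is None:
            raise ValueError("identical rows without a common decision")
        return best

    return solve(frozenset(range(len(rows))))


# Hypothetical table: two binary attributes, one decision set per row.
rows = ((0, 0), (0, 1), (1, 0), (1, 1))
decision_sets = ({1}, {1, 2}, {2}, {2, 3})
depth = min_depth(rows, decision_sets)
print(depth)
```

    In this example a test on the first attribute alone suffices, because rows with value 0 share decision 1 and rows with value 1 share decision 2. The number of distinct subtables can grow exponentially, so the recursion is practical only for small tables.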

  6. Prune efficiency in the control of Xylella fastidiosa in coffee trees

    Directory of Open Access Journals (Sweden)

    Rachel Benetti Queiroz-Voltan

    2006-01-01

    management procedures have attenuated the disease incidence, such as the use of bacteria-free seedlings and insect vector control. Pruning is an important practice for optimization of coffee orchard production. Coffee growers refer to pruning as training; coffee tree training depends on the coffee plant type and environment, using traditional or drastic trimming. This research aimed at evaluating the efficiency of different pruning procedures in the control of X. fastidiosa incidence in the commercial coffee cultivars Acaiá IAC 474-19 and Catuaí Vermelho IAC 81. Eight plants of each cultivar were submitted to three pruning types (traditional, "skeleton cut" and trunking), and eight plants were not pruned (controls). Prior to pruning, five plant branches were collected for anatomical studies. Thereafter, five other branches from all treatments were collected in October 2004 (rainy period) and June 2005 (dry period) for the anatomical studies. No significant differences were observed for 'Acaiá IAC 474-19', which presented a lower proportion of xylem vessel obstruction independent of the pruning treatment. Pruning treatments in 'Catuaí Vermelho IAC 81' were also not significantly different; however, plants submitted to drastic trimmings such as the "skeleton cut" and trunking showed a trend toward a lower proportion of xylem vessel obstruction by the bacteria, in both rainy and dry periods. It was suggested that the drastic pruning procedures ("skeleton cut" and trunking) might be advantageous for Xylella control in situations of high disease incidence.

  7. Applied Research of Decision Tree Method on Football Training

    Directory of Open Access Journals (Sweden)

    Liu Jinhui

    2015-01-01

    Full Text Available This paper first analyzes decision trees and then offers a further analysis of CLS on that basis. As CLS contains the most substantial and most primitive decision-making idea, it can provide the basis for establishing a decision tree. Because CLS is limited in certain details, the ID3 decision tree algorithm is introduced to supply them. It applies information gain as the attribute selection metric to provide a reference for seeking the optimal segmentation point. Finally, the ID3 algorithm is applied to football training. The algorithm is verified and shown to be effective and reasonable.
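
    Information gain, the attribute selection metric that ID3 uses, is the entropy of the class labels minus the weighted entropy that remains after splitting on an attribute. A minimal sketch follows; the toy training records and label names are invented for illustration and are not data from the paper.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting on attribute index attr."""
    n = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical records: (fitness, attended_last_session) -> squad decision.
rows = [("high", "yes"), ("high", "no"), ("low", "yes"), ("low", "no")]
labels = ["start", "start", "bench", "bench"]

gains = [information_gain(rows, labels, a) for a in range(2)]
best_attr = max(range(2), key=lambda a: gains[a])
print(gains, best_attr)
```

    ID3 would place the attribute with the largest gain at the root and repeat the computation on each resulting subset.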

  8. An automated approach to the design of decision tree classifiers

    Science.gov (United States)

    Argentiero, P.; Chin, R.; Beaudet, P.

    1982-01-01

    An automated technique is presented for designing effective decision tree classifiers predicated only on a priori class statistics. The procedure relies on linear feature extractions and Bayes table look-up decision rules. Associated error matrices are computed and utilized to provide an optimal design of the decision tree at each so-called 'node'. A by-product of this procedure is a simple algorithm for computing the global probability of correct classification assuming the statistical independence of the decision rules. Attention is given to a more precise definition of decision tree classification, the mathematical details on the technique for automated decision tree design, and an example of a simple application of the procedure using class statistics acquired from an actual Landsat scene.

  9. A simple model to predict the probability of a peach (Prunus persicae) tree bud to develop as a long or short shoot as a consequence of winter pruning intensity and previous year growth.

    Directory of Open Access Journals (Sweden)

    Daniele Bevacqua

    Full Text Available In many woody plants, shoots emerging from buds can develop as short or long shoots. The probability of a bud developing as a long or short shoot depends on genetic, environmental and management factors, and controlling it is an important issue in commercial orchards. We use peach (Prunus persicae) trees, subjected to different winter pruning levels and monitored for two years, to develop and calibrate a model linking the probability of a bud developing as a long shoot to winter pruning intensity and previous year vegetative growth. Finally, we show how our model can be used to adjust pruning intensity to obtain a desired proportion of long and short shoots.

  10. Decision tree process in data mining with the ID3 algorithm

    Directory of Open Access Journals (Sweden)

    LITA SARI MUCHLIS

    2010-06-01

    Full Text Available This article studies the decision process in data mining with the ID3 algorithm. Data mining is an automatic process of extraction from large data, used to find patterns in the data. Data mining functions produce patterns that differ from one another. The classification function of data mining helps to build a decision tree, here with the ID3 algorithm. Key words: data mining, classification, decision tree

  11. Identifying Bank Frauds Using CRISP-DM and Decision Trees

    OpenAIRE

    Bruno Carneiro da Rocha; Rafael Timóteo de Sousa Júnior

    2010-01-01

    This article aims to evaluate the use of techniques of decision trees, in conjunction with the management model CRISP-DM, to help in the prevention of bank fraud. This article offers a study on decision trees, an important concept in the field of artificial intelligence. The study is focused on discussing how these trees are able to assist in the decision making process of identifying frauds by the analysis of information regarding bank transactions. This information is captured with the use of t...

  12. Decision tree methods: applications for classification and prediction.

    Science.gov (United States)

    Song, Yan-Yan; Lu, Ying

    2015-04-25

    Decision tree methodology is a commonly used data mining method for establishing classification systems based on multiple covariates or for developing prediction algorithms for a target variable. This method classifies a population into branch-like segments that construct an inverted tree with a root node, internal nodes, and leaf nodes. The algorithm is non-parametric and can efficiently deal with large, complicated datasets without imposing a complicated parametric structure. When the sample size is large enough, study data can be divided into training and validation datasets. The training dataset is used to build a decision tree model, and the validation dataset is used to decide on the appropriate tree size needed to achieve the optimal final model. This paper introduces frequently used algorithms for developing decision trees (including CART, C4.5, CHAID, and QUEST) and describes the SPSS and SAS programs that can be used to visualize tree structure.

  13. A comparative assessment of decision trees algorithms for flash flood susceptibility modeling at Haraz watershed, northern Iran.

    Science.gov (United States)

    Khosravi, Khabat; Pham, Binh Thai; Chapi, Kamran; Shirzadi, Ataollah; Shahabi, Himan; Revhaug, Inge; Prakash, Indra; Tien Bui, Dieu

    2018-02-01

    Floods are one of the most damaging natural hazards, causing huge loss of property, infrastructure and lives. Prediction of the occurrence of flash flood locations is very difficult due to sudden changes in climatic conditions and manmade factors. However, prior identification of flood susceptible areas can be done with the help of machine learning techniques for proper timely management of flood hazards. In this study, we tested four decision-tree-based machine learning models, namely Logistic Model Trees (LMT), Reduced Error Pruning Trees (REPT), Naïve Bayes Trees (NBT), and Alternating Decision Trees (ADT), for flash flood susceptibility mapping at the Haraz Watershed in the northern part of Iran. For this, a spatial database was constructed with 201 present and past flood locations and eleven flood-influencing factors, namely ground slope, altitude, curvature, Stream Power Index (SPI), Topographic Wetness Index (TWI), land use, rainfall, river density, distance from river, lithology, and Normalized Difference Vegetation Index (NDVI). Statistical evaluation measures, the Receiver Operating Characteristic (ROC) curve, and Friedman and Wilcoxon signed-rank tests were used to validate and compare the prediction capability of the models. Results show that the ADT model has the highest prediction capability for flash flood susceptibility assessment, followed by the NBT, the LMT, and the REPT, respectively. These techniques have proven successful in quickly determining flood susceptible areas. Copyright © 2018 Elsevier B.V. All rights reserved.

  14. Automated Sleep Stage Scoring by Decision Tree Learning

    National Research Council Canada - National Science Library

    Hanaoka, Masaaki

    2001-01-01

    In this paper we describe a waveform recognition method that extracts characteristic parameters from waveforms and a method of automated sleep stage scoring using decision tree learning that is in...

  15. Decision tree approach for classification of remotely sensed satellite ...

    Indian Academy of Sciences (India)

    DTC) algorithm for classification of remotely sensed satellite data (Landsat TM) using open source support. The decision tree is constructed by recursively partitioning the spectral distribution of the training dataset using WEKA, open source ...

  16. Automatic design of decision-tree algorithms with evolutionary algorithms.

    Science.gov (United States)

    Barros, Rodrigo C; Basgalupp, Márcio P; de Carvalho, André C P L F; Freitas, Alex A

    2013-01-01

    This study reports the empirical analysis of a hyper-heuristic evolutionary algorithm that is capable of automatically designing top-down decision-tree induction algorithms. Top-down decision-tree algorithms are of great importance, considering their ability to provide an intuitive and accurate knowledge representation for classification problems. The automatic design of these algorithms seems timely, given the large literature accumulated over more than 40 years of research in the manual design of decision-tree induction algorithms. The proposed hyper-heuristic evolutionary algorithm, HEAD-DT, is extensively tested using 20 public UCI datasets and 10 microarray gene expression datasets. The algorithms automatically designed by HEAD-DT are compared with traditional decision-tree induction algorithms, such as C4.5 and CART. Experimental results show that HEAD-DT is capable of generating algorithms which are significantly more accurate than C4.5 and CART.

  17. Comparison of Greedy Algorithms for Decision Tree Optimization

    KAUST Repository

    Alkhalid, Abdulaziz

    2013-01-01

    This chapter is devoted to the study of 16 types of greedy algorithms for decision tree construction. The dynamic programming approach is used for construction of optimal decision trees. Optimization is performed relative to minimal values of average depth, depth, number of nodes, number of terminal nodes, and number of nonterminal nodes of decision trees. We compare average depth, depth, number of nodes, number of terminal nodes and number of nonterminal nodes of constructed trees with minimum values of the considered parameters obtained based on a dynamic programming approach. We report experiments performed on data sets from UCI ML Repository and randomly generated binary decision tables. As a result, for depth, average depth, and number of nodes we propose a number of good heuristics. © Springer-Verlag Berlin Heidelberg 2013.

  18. PP prune users guide.

    Science.gov (United States)

    N.A. Bolon; R.D. Fight; J.M. Cahill

    1992-01-01

    The PP PRUNE program allows users to conduct a financial analysis of pruning ponderosa pine (Pinus ponderosa Dougl. ex Laws.). The increase in product value and rate of return from pruning the butt 16.5-foot log can be estimated. Lumber recovery information is based on actual mill experience with pruned and unpruned logs. Users supply lumber prices...

  19. Transferability of decision trees for land cover classification in a ...

    African Journals Online (AJOL)

    This paper attempts to derive classification rules from training data of four Landsat-8 scenes by using the classification and regression tree (CART) implementation of the decision tree algorithm. The transferability of the ruleset was evaluated by classifying two adjacent scenes. The classification of the four mosaicked scenes ...

  20. Decision making for health care professionals: use of decision trees within the community mental health setting.

    Science.gov (United States)

    Bonner, G

    2001-08-01

    To examine the application of the decision tree approach to collaborative clinical decision-making in mental health care in the United Kingdom (UK). While this approach to decision-making has been examined in the acute care setting, there is little published evidence of its use in clinical decision-making within the mental health setting. The complexities of dual diagnosis (schizophrenia and substance misuse in this case example) and the varied viewpoints of different professionals often hamper the decision-making process. This paper highlights how the approach was used successfully as a multiprofessional collaborative approach to decision-making in the context of British community mental health care. A selective review of the relevant literature and a case study application of the decision tree framework. The process of applying the decision tree framework to clinical decision-making in mental health practice can be time consuming and client inclusion within the process is not always appropriate. The approach offers a method of assigning numerical values to support complex multiprofessional decision-making as well as considering underpinning literature to inform the final decision. Use of the decision tree offers a common framework that can assist professionals to examine the options available to them in depth, while considering the complex variables that influence decision-making in collaborative mental health practice. Use of the decision tree warrants further consideration in mental health care in terms of practice and education.

  1. Comparison of decision tree methods for finding active objects

    Science.gov (United States)

    Zhao, Yongheng; Zhang, Yanxia

    The automated classification of objects from large catalogs or survey projects is an important task in many astronomical surveys. Faced with various classification algorithms, astronomers should select the method according to their requirements. Here we describe several kinds of decision trees for finding active objects by multi-wavelength data, such as REPTree, Random Tree, Decision Stump, Random Forest, J48, NBTree and ADTree. All decision tree approaches investigated are in the WEKA package. The classification performance of the methods is presented. In the process of classification by decision tree methods, the classification rules are easily obtained; moreover, these rules are clear and easy for astronomers to understand. As a result, astronomers are inclined to prefer and apply them, and thereby learn which attributes are important for discriminating celestial objects. The experimental results show that when various decision trees are applied in discriminating active objects (quasars, BL Lac objects and active galaxies) from non-active objects (stars and galaxies), ADTree is the best only in terms of accuracy, Decision Stump is the best only considering speed, and J48 is the optimal choice considering both accuracy and speed.

  2. Pruning in poplar plantations by mechanized device Stihl HT-75

    Directory of Open Access Journals (Sweden)

    Danilović Milorad

    2009-01-01

    Full Text Available The effects of branch pruning device Stihl HT-75 were researched on sample plots in FA Kupinovo and FA Klenak, in poplar plantations of Populus×euramericana 'I-214', Populus×euramericana 'M1' and Populus deltoides of different planting spaces and different ages. The analysed factors were: pruning method, site conditions, number of pruned branches, pruning height, branch diameter, etc. Time measurement was performed by the flow method, and the required number of measurements was calculated by variation statistics. The results of the analysis of variance show the statistical significance of the differences between pruning times of different clone species, different planting spaces and different plantation ages. The results of the analysis of variance and statistical tests show that there are no statistically significant differences between the average time of poplar pruning in plantations of the same age and different planting spaces. The correlation of branch pruning time and the number of pruned branches is represented by the power function model, which according to the results of the regression analyses, is the best representation of the nature of this dependence. Exponential function represents the correlation of the average diameter of pruned branches and the time of pruning. Also, there is a correlation of the average diameter of pruned branches and fuel consumption. Pruning time of poplar trees increases with the increase of the average diameter of pruned branches.

  3. RNA search with decision trees and partial covariance models.

    Science.gov (United States)

    Smith, Jennifer A

    2009-01-01

    The use of partial covariance models to search for RNA family members in genomic sequence databases is explored. The partial models are formed from contiguous subranges of the overall RNA family multiple alignment columns. A binary decision-tree framework is presented for choosing the order to apply the partial models and the score thresholds on which to make the decisions. The decision trees are chosen to minimize computation time subject to the constraint that all of the training sequences are passed to the full covariance model for final evaluation. Computational intelligence methods are suggested to select the decision tree since the tree can be quite complex and there is no obvious method to build the tree in these cases. Experimental results from seven RNA families show execution times of 0.066-0.268 relative to using the full covariance model alone. Tests on the full sets of known sequences for each family show that at least 95 percent of these sequences are found for two families and 100 percent for five others. Since the full covariance model is run on all sequences accepted by the partial model decision tree, the false alarm rate is at least as low as that of the full model alone.

  4. The Decision Tree for Teaching Management of Uncertainty

    Science.gov (United States)

    Knaggs, Sara J.; And Others

    1974-01-01

    A 'decision tree' consists of an outline of the patient's symptoms and a logic for decision and action. It is felt that this approach to the decisionmaking process better facilitates each learner's application of his own level of knowledge and skills. (Author)

  5. Ethnographic Decision Tree Modeling: A Research Method for Counseling Psychology.

    Science.gov (United States)

    Beck, Kirk A.

    2005-01-01

    This article describes ethnographic decision tree modeling (EDTM; C. H. Gladwin, 1989) as a mixed method design appropriate for counseling psychology research. EDTM is introduced and located within a postpositivist research paradigm. Decision theory that informs EDTM is reviewed, and the 2 phases of EDTM are highlighted. The 1st phase, model…

  6. Bounds on Average Time Complexity of Decision Trees

    KAUST Repository

    Chikalov, Igor

    2011-01-01

    In this chapter, bounds on the average depth and the average weighted depth of decision trees are considered. Similar problems are studied in search theory [1], coding theory [77], design and analysis of algorithms (e.g., sorting) [38]. For any diagnostic problem, the minimum average depth of decision tree is bounded from below by the entropy of probability distribution (with a multiplier $1/\log_2 k$ for a problem over a $k$-valued information system). Among diagnostic problems, the problems with a complete set of attributes have the lowest minimum average depth of decision trees (e.g., the problem of building optimal prefix code [1] and a blood test study in assumption that exactly one patient is ill [23]). For such problems, the minimum average depth of decision tree exceeds the lower bound by at most one. The minimum average depth reaches the maximum on the problems in which each attribute is "indispensable" [44] (e.g., a diagnostic problem with $n$ attributes and $k^n$ pairwise different rows in the decision table and the problem of implementing the modulo 2 summation function). These problems have the minimum average depth of decision tree equal to the number of attributes in the problem description. © Springer-Verlag Berlin Heidelberg 2011.
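
    The entropy bound described in this abstract can be checked numerically for the binary prefix-code example it mentions. The following is a minimal sketch, not taken from the chapter: it assumes the binary case (k = 2, so the multiplier is 1), uses a Huffman tree as the optimal prefix code, and the probability vector `probs` is a hypothetical example.

```python
import heapq
import math


def entropy(probs, base=2):
    """Shannon entropy of a probability distribution in the given log base."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)


def huffman_average_depth(probs):
    """Average leaf depth of an optimal binary prefix code (Huffman tree).

    Every merge of two subtrees pushes all leaves below them one level
    deeper, so the sum of the merged weights equals the average depth.
    """
    heap = list(probs)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        total += a + b
        heapq.heappush(heap, a + b)
    return total


# Hypothetical distribution over four outcomes (an assumption for illustration).
probs = [0.4, 0.3, 0.2, 0.1]
lower = entropy(probs)              # entropy lower bound, k = 2
avg = huffman_average_depth(probs)  # minimum average depth for this problem

print(f"entropy = {lower:.4f}, average depth = {avg:.4f}")
```

    For this distribution the average depth lies between the entropy and the entropy plus one, which is the relationship the abstract states for problems with a complete set of attributes.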

  7. Evaluation of lignins from side-streams generated in an olive tree pruning-based biorefinery: Bioethanol production and alkaline pulping.

    Science.gov (United States)

    Santos, José I; Fillat, Úrsula; Martín-Sampedro, Raquel; Eugenio, María E; Negro, María J; Ballesteros, Ignacio; Rodríguez, Alejandro; Ibarra, David

    2017-12-01

    In modern lignocellulosic-based biorefineries, carbohydrates can be transformed into biofuels and pulp and paper, whereas lignin is burned to obtain energy. However, a part of lignin could be converted into value-added products including bio-based aromatic chemicals, as well as building blocks for materials. Then, a good knowledge of lignin is necessary to define its valorisation procedure. This study characterized different lignins from side-streams produced from olive tree pruning bioethanol production (lignins collected from steam explosion pretreatment with water or phosphoric acid as catalysts, followed by simultaneous saccharification and fermentation process) and alkaline pulping (lignins recovered from kraft and soda-AQ black liquors). Together with the chemical composition, the structure of lignins was investigated by FTIR, (13)C NMR, and 2D NMR. Bioethanol lignins had clearly distinct characteristics compared to pulping lignins; a certain number of side-chain linkages (mostly alkyl-aryl ether and resinol) accompanied with lower phenolic hydroxyls content. Bioethanol lignins also showed a significant amount of carbohydrates, mainly glucose and protein impurities. By contrast, pulping lignins revealed xylose together with a dramatical reduction of side-chains (some resinol linkages survive) and thereby higher phenol content, indicating rather severe lignin degradation during alkaline pulping processes. All lignins showed a predominance of syringyl units. Copyright © 2017 Elsevier B.V. All rights reserved.

  8. Proactive data mining with decision trees

    CERN Document Server

    Dahan, Haim; Rokach, Lior; Maimon, Oded

    2014-01-01

    This book explores a proactive and domain-driven method to classification tasks. This novel proactive approach to data mining not only induces a model for predicting or explaining a phenomenon, but also utilizes specific problem/domain knowledge to suggest specific actions to achieve optimal changes in the value of the target attribute. In particular, the authors suggest a specific implementation of the domain-driven proactive approach for classification trees. The book centers on the core idea of moving observations from one branch of the tree to another. It introduces a novel splitting crite

  9. Decision tree induction in the diagnosis of otoneurological diseases.

    Science.gov (United States)

    Viikki, K; Kentala, E; Juhola, M; Pyykkö, I

    1999-01-01

    Expert systems have been applied in medicine as diagnostic aids and education tools. The construction of a knowledge base for an expert system may be a difficult task; to automate this task several machine learning methods have been developed. These methods can be also used in the refinement of knowledge bases for removing inconsistencies and redundancies, and for simplifying decision rules. In this study, decision tree induction was employed to acquire diagnostic knowledge for otoneurological diseases and to extract relevant parameters from the database of an otoneurological expert system ONE. The records of patients with benign positional vertigo, Meniere's disease, sudden deafness, traumatic vertigo, vestibular neuritis and vestibular schwannoma were retrieved from the database of ONE, and for each disease, decision trees were constructed. The study shows that decision tree induction is a useful technique for acquiring diagnostic knowledge for otoneurological diseases and for extracting relevant parameters from a large set of parameters.

  10. Minimizing size of decision trees for multi-label decision tables

    KAUST Repository

    Azad, Mohammad

    2014-09-29

    We used decision trees as a model to discover knowledge from multi-label decision tables, where each row has a set of decisions attached to it and the goal is to find one arbitrary decision from that set. The size of the decision tree can be small as well as very large. We study here different greedy as well as dynamic programming algorithms to minimize the size of the decision trees. Comparing against the optimal result from the dynamic programming algorithm, we found that some greedy algorithms produce results close to the optimal result for the minimization of number of nodes (at most 18.92% difference), number of nonterminal nodes (at most 20.76% difference), and number of terminal nodes (at most 18.71% difference).

  11. Multivariate analysis of flow cytometric data using decision trees.

    Science.gov (United States)

    Simon, Svenja; Guthke, Reinhard; Kamradt, Thomas; Frey, Oliver

    2012-01-01

    Characterization of the response of the host immune system is important in understanding the bidirectional interactions between the host and microbial pathogens. For research on the host site, flow cytometry has become one of the major tools in immunology. Advances in technology and reagents now allow the simultaneous assessment of multiple markers on a single cell level, generating multidimensional data sets that require multivariate statistical analysis. We explored the explanatory power of the supervised machine learning method called "induction of decision trees" in flow cytometric data. In order to examine whether the production of a certain cytokine depends on other cytokines, datasets from intracellular staining for six cytokines with complex patterns of co-expression were analyzed by induction of decision trees. After weighting the data according to their class probabilities, we created a total of 13,392 different decision trees for each given cytokine with different parameter settings. For a more realistic estimation of the decision trees' quality, we used stratified fivefold cross validation and chose the "best" tree according to a combination of different quality criteria. While some of the decision trees reflected previously known co-expression patterns, we found that the expression of some cytokines was not only dependent on the co-expression of others per se, but was also dependent on the intensity of expression. Thus, for the first time we successfully used induction of decision trees for the analysis of high dimensional flow cytometric data and demonstrated the feasibility of this method to reveal structural patterns in such data sets.

  12. EEG feature selection method based on decision tree.

    Science.gov (United States)

    Duan, Lijuan; Ge, Hui; Ma, Wei; Miao, Jun

    2015-01-01

    This paper aims to solve the automated feature selection problem in brain computer interface (BCI). In order to automate the feature selection process, we propose a novel EEG feature selection method based on decision tree (DT). During the electroencephalogram (EEG) signal processing, a feature extraction method based on principal component analysis (PCA) was used, and the selection process based on decision tree was performed by searching the feature space and automatically selecting optimal features. Considering that EEG signals are a series of non-linear signals, a generalized linear classifier named support vector machine (SVM) was chosen. In order to test the validity of the proposed method, we applied the EEG feature selection method based on decision tree to BCI Competition II datasets Ia, and the experiment showed encouraging results.

  13. USING PRECEDENTS FOR REDUCTION OF DECISION TREE BY GRAPH SEARCH

    Directory of Open Access Journals (Sweden)

    I. A. Bessmertny

    2015-01-01

    Full Text Available The paper considers the problem of organizing mutual payments between business entities by means of clearing, which is solved by searching for graph paths. To reduce the decision tree complexity, a method of precedents is proposed that consists in saving the intermediate solution while moving along the decision tree. An algorithm and an example are presented, demonstrating solution complexity close to linear. Tests carried out in a civil aviation settlement system demonstrate an approximately 30 percent reduction in real money transfers. The proposed algorithm is also planned to be implemented in other clearing organizations of the Russian Federation.

  14. Reconciliation of Decision-Making Heuristics Based on Decision Trees Topologies and Incomplete Fuzzy Probabilities Sets.

    Science.gov (United States)

    Doubravsky, Karel; Dohnal, Mirko

    2015-01-01

    Complex decision making tasks of different natures, e.g. economics, safety engineering, ecology and biology, are based on vague, sparse, partially inconsistent and subjective knowledge. Moreover, decision making economists / engineers are usually not willing to invest too much time into study of complex formal theories. They require decisions which can be (re)checked by human-like common sense reasoning. One important problem related to realistic decision making tasks is the incompleteness of the data sets required by the chosen decision making algorithm. This paper presents a relatively simple algorithm by which some missing III (input information items) can be generated using mainly decision tree topologies and integrated into incomplete data sets. The algorithm is based on an easy-to-understand heuristic, e.g. a longer decision tree sub-path is less probable. This heuristic can solve decision problems under total ignorance, i.e. the decision tree topology is the only information available. In practice, however, isolated information items, e.g. some vaguely known probabilities (e.g. fuzzy probabilities), are usually available. It means that a realistic problem is analysed under partial ignorance. The proposed algorithm reconciles topology related heuristics and additional fuzzy sets using fuzzy linear programming. The case study, represented by a tree with six lotteries and one fuzzy probability, is presented in detail.

  15. Reconciliation of Decision-Making Heuristics Based on Decision Trees Topologies and Incomplete Fuzzy Probabilities Sets.

    Directory of Open Access Journals (Sweden)

    Karel Doubravsky

    Full Text Available Complex decision making tasks of different natures, e.g. economics, safety engineering, ecology and biology, are based on vague, sparse, partially inconsistent and subjective knowledge. Moreover, decision making economists / engineers are usually not willing to invest too much time into study of complex formal theories. They require such decisions which can be (re)checked by human like common sense reasoning. One important problem related to realistic decision making tasks are incomplete data sets required by the chosen decision making algorithm. This paper presents a relatively simple algorithm how some missing III (input information items) can be generated using mainly decision tree topologies and integrated into incomplete data sets. The algorithm is based on an easy to understand heuristics, e.g. a longer decision tree sub-path is less probable. This heuristic can solve decision problems under total ignorance, i.e. the decision tree topology is the only information available. But in a practice, isolated information items e.g. some vaguely known probabilities (e.g. fuzzy probabilities) are usually available. It means that a realistic problem is analysed under partial ignorance. The proposed algorithm reconciles topology related heuristics and additional fuzzy sets using fuzzy linear programming. The case study, represented by a tree with six lotteries and one fuzzy probability, is presented in details.

  16. 'Misclassification error' greedy heuristic to construct decision trees for inconsistent decision tables

    KAUST Repository

    Azad, Mohammad

    2014-01-01

    A greedy algorithm is presented in this paper to construct decision trees for three different approaches (many-valued decision, most common decision, and generalized decision) in order to handle the inconsistency of multiple decisions in a decision table. In this algorithm, a greedy heuristic, ‘misclassification error’, is used; it runs faster and, for some cost functions, gives better results than the ‘number of boundary subtables’ heuristic from the literature. Therefore, it can be used in the case of larger data sets and does not require a huge amount of memory. Experimental results of depth, average depth and number of nodes of decision trees constructed by this algorithm are compared in the framework of each of the three approaches.

  17. Relationships between depth and number of misclassifications for decision trees

    KAUST Repository

    Chikalov, Igor

    2011-01-01

    This paper describes a new tool for the study of relationships between depth and number of misclassifications for decision trees. In addition to the algorithm the paper also presents the results of experiments with three datasets from UCI Machine Learning Repository [3]. © 2011 Springer-Verlag.

  18. A framework for sensitivity analysis of decision trees.

    Science.gov (United States)

    Kamiński, Bogumił; Jakubczyk, Michał; Szufel, Przemysław

    2018-01-01

    In the paper, we consider sequential decision problems with uncertainty, represented as decision trees. Sensitivity analysis is always a crucial element of decision making and in decision trees it often focuses on probabilities. In the stochastic model considered, the user often has only limited information about the true values of probabilities. We develop a framework for performing sensitivity analysis of optimal strategies accounting for this distributional uncertainty. We design this robust optimization approach in an intuitive and not overly technical way, to make it simple to apply in daily managerial practice. The proposed framework allows for (1) analysis of the stability of the expected-value-maximizing strategy and (2) identification of strategies which are robust with respect to pessimistic/optimistic/mode-favoring perturbations of probabilities. We verify the properties of our approach in two cases: (a) probabilities in a tree are the primitives of the model and can be modified independently; (b) probabilities in a tree reflect some underlying, structural probabilities, and are interrelated. We provide a free software tool implementing the methods described.
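
    The kind of question this abstract raises (does the expected-value-maximizing strategy stay the same when a probability is perturbed?) can be illustrated with a small rollback computation. This is a minimal sketch under stated assumptions, not the authors' framework or software tool: the node dictionaries, the `rollback` and `make_tree` helpers, the action names and all payoffs are hypothetical.

```python
def rollback(node):
    """Return (expected value, chosen action or None) for a tree node."""
    kind = node["type"]
    if kind == "leaf":
        return node["payoff"], None
    if kind == "chance":
        # Probability-weighted average of the children's expected values.
        value = sum(p * rollback(child)[0] for p, child in node["branches"])
        return value, None
    # Decision node: choose the action with the highest expected value.
    values = {name: rollback(child)[0] for name, child in node["actions"].items()}
    best = max(values, key=values.get)
    return values[best], best


def make_tree(p_success):
    """Hypothetical tree: invest (risky) versus hold (certain payoff)."""
    return {
        "type": "decision",
        "actions": {
            "invest": {
                "type": "chance",
                "branches": [
                    (p_success, {"type": "leaf", "payoff": 100.0}),
                    (1.0 - p_success, {"type": "leaf", "payoff": -50.0}),
                ],
            },
            "hold": {"type": "leaf", "payoff": 10.0},
        },
    }


# Perturb the (assumed) nominal probability 0.5 and watch the optimal action.
for p in (0.30, 0.35, 0.45, 0.50):
    value, action = rollback(make_tree(p))
    print(f"p = {p:.2f}: choose {action}, expected value {value:.1f}")
```

    In this toy tree the optimal action switches from "invest" to "hold" once the success probability drops far enough, which is the sort of instability a sensitivity analysis is meant to expose.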

  19. New Splitting Criteria for Decision Trees in Stationary Data Streams.

    Science.gov (United States)

    Jaworski, Maciej; Duda, Piotr; Rutkowski, Leszek

    2017-05-10

    The most popular tools for stream data mining are based on decision trees. In the previous 15 years, all designed methods, headed by the very fast decision tree algorithm, relied on Hoeffding's inequality, and hundreds of researchers followed this scheme. Recently, we have demonstrated that although the Hoeffding decision trees are an effective tool for dealing with stream data, they are a purely heuristic procedure; for example, classical decision trees such as ID3 or CART cannot be adapted to data stream mining using Hoeffding's inequality. Therefore, there is an urgent need to develop new algorithms, which are both mathematically justified and characterized by good performance. In this paper, we address this problem by developing a family of new splitting criteria for classification in stationary data streams and investigating their probabilistic properties. The new criteria, derived using appropriate statistical tools, are based on the misclassification error and the Gini index impurity measures. The general division of splitting criteria into two types is proposed. Attributes chosen based on type-$I$ splitting criteria guarantee, with high probability, the highest expected value of split measure. Type-$II$ criteria ensure that the chosen attribute is the same, with high probability, as it would be chosen based on the whole infinite data stream. Moreover, in this paper, two hybrid splitting criteria are proposed, which are the combinations of single criteria based on the misclassification error and Gini index.
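
    For orientation, the quantities named in this abstract can be written down in a few lines. The sketch below shows the Hoeffding radius in the form commonly used by Hoeffding-tree learners, together with the two impurity measures the abstract mentions. It does not reproduce the paper's new type-I or type-II criteria; the function names and sample numbers are assumptions for illustration.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """One-sided Hoeffding radius.

    With probability at least 1 - delta, the true mean of a variable
    bounded within `value_range` exceeds the mean of n independent
    samples by no more than the returned epsilon.
    """
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def gini(counts):
    """Gini index impurity of a node given its per-class sample counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def misclassification_error(counts):
    """Fraction of samples outside the majority class."""
    total = sum(counts)
    return 1.0 - max(counts) / total


# Hypothetical node with 70 samples of one class and 30 of another.
counts = [70, 30]
print("gini:", round(gini(counts), 4))
print("misclassification error:", round(misclassification_error(counts), 4))
print("epsilon after 1000 samples:", round(hoeffding_bound(1.0, 0.05, 1000), 4))
```

    The radius shrinks with the square root of the number of samples, so quadrupling the sample count halves epsilon.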

  20. Evaluation of Decision Trees for Cloud Detection from AVHRR Data

    Science.gov (United States)

    Shiffman, Smadar; Nemani, Ramakrishna

    2005-01-01

    Automated cloud detection and tracking is an important step in assessing changes in radiation budgets associated with global climate change via remote sensing. Data products based on satellite imagery are available to the scientific community for studying trends in the Earth's atmosphere. The data products include pixel-based cloud masks that assign cloud-cover classifications to pixels. Many cloud-mask algorithms have the form of decision trees. The decision trees employ sequential tests that scientists designed based on empirical astrophysics studies and simulations. Limitations of existing cloud masks restrict our ability to accurately track changes in cloud patterns over time. In a previous study we compared automatically learned decision trees to cloud masks included in Advanced Very High Resolution Radiometer (AVHRR) data products from the year 2000. In this paper we report the replication of the study for five-year data, and for a gold standard based on surface observations performed by scientists at weather stations in the British Islands. For our sample data, the accuracy of automatically learned decision trees was greater than the accuracy of the cloud masks (p < 0.001).

  1. Practical secure decision tree learning in a teletreatment application

    NARCIS (Netherlands)

    de Hoogh, Sebastiaan; Schoenmakers, Berry; Chen, Ping; op den Akker, Harm

    In this paper we develop a range of practical cryptographic protocols for secure decision tree learning, a primary problem in privacy preserving data mining. We focus on particular variants of the well-known ID3 algorithm allowing a high level of security and performance at the same time. Our

  2. A Decision Tree for Nonmetric Sex Assessment from the Skull.

    Science.gov (United States)

    Langley, Natalie R; Dudzik, Beatrix; Cloutier, Alesia

    2018-01-01

    This study uses five well-documented cranial nonmetric traits (glabella, mastoid process, mental eminence, supraorbital margin, and nuchal crest) and one additional trait (zygomatic extension) to develop a validated decision tree for sex assessment. The decision tree was built and cross-validated on a sample of 293 U.S. White individuals from the William M. Bass Donated Skeletal Collection. Ordinal scores from the six traits were analyzed using the partition modeling option in JMP Pro 12. A holdout sample of 50 skulls was used to test the model. The most accurate decision tree includes three variables: glabella, zygomatic extension, and mastoid process. This decision tree yielded 93.5% accuracy on the training sample, 94% on the cross-validated sample, and 96% on a holdout validation sample. Linear weighted kappa statistics indicate acceptable agreement among observers for these variables. Mental eminence should be avoided, and definitions and figures should be referenced carefully to score nonmetric traits. © 2017 American Academy of Forensic Sciences.

  3. Three approaches to deal with inconsistent decision tables - Comparison of decision tree complexity

    KAUST Repository

    Azad, Mohammad

    2013-01-01

    In inconsistent decision tables, there are groups of rows with equal values of conditional attributes and different decisions (values of the decision attribute). We study three approaches to deal with such tables. Instead of a group of equal rows, we consider one row given by values of conditional attributes and we attach to this row: (i) the set of all decisions for rows from the group (many-valued decision approach); (ii) the most common decision for rows from the group (most common decision approach); and (iii) the unique code of the set of all decisions for rows from the group (generalized decision approach). We present experimental results and compare the depth, average depth and number of nodes of decision trees constructed by a greedy algorithm in the framework of each of the three approaches. © 2013 Springer-Verlag.
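
    The three ways of collapsing a group of equal rows that this abstract lists can be sketched directly. This is an illustrative reading of the abstract, not the authors' code: the row layout (conditional attribute values followed by one decision), the helper names, the tie-breaking rule and the integer codes used for the generalized decision are all assumptions.

```python
from collections import Counter, defaultdict


def group_rows(rows):
    """Map each tuple of conditional attribute values to its decisions."""
    groups = defaultdict(list)
    for *attrs, decision in rows:
        groups[tuple(attrs)].append(decision)
    return groups


def many_valued(rows):
    """(i) Attach the set of all decisions seen in the group."""
    return {attrs: set(ds) for attrs, ds in group_rows(rows).items()}


def most_common(rows):
    """(ii) Attach the most frequent decision (ties: first seen, an assumption)."""
    return {attrs: Counter(ds).most_common(1)[0][0]
            for attrs, ds in group_rows(rows).items()}


def generalized(rows):
    """(iii) Attach a unique code for each distinct set of decisions."""
    codes = {}
    result = {}
    for attrs, ds in many_valued(rows).items():
        key = frozenset(ds)
        codes.setdefault(key, len(codes))  # hypothetical integer encoding
        result[attrs] = codes[key]
    return result


# Hypothetical inconsistent table: two conditional attributes, one decision.
rows = [(0, 0, 1), (0, 0, 2), (0, 0, 2), (0, 1, 1), (1, 0, 1), (1, 0, 2)]
print(many_valued(rows))
print(most_common(rows))
print(generalized(rows))
```

    Each function returns a consistent table with one row per distinct combination of conditional attribute values, which a tree-building algorithm can then take as input.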

  4. Extensions of dynamic programming as a new tool for decision tree optimization

    KAUST Repository

    Alkhalid, Abdulaziz

    2013-01-01

    The chapter is devoted to the consideration of two types of decision trees for a given decision table: α-decision trees (the parameter α controls the accuracy of tree) and decision trees (which allow arbitrary level of accuracy). We study possibilities of sequential optimization of α-decision trees relative to different cost functions such as depth, average depth, and number of nodes. For decision trees, we analyze relationships between depth and number of misclassifications. We also discuss results of computer experiments with some datasets from UCI ML Repository. ©Springer-Verlag Berlin Heidelberg 2013.

  5. An overview of decision tree applied to power systems

    DEFF Research Database (Denmark)

    Liu, Leo; Rather, Zakir Hussain; Chen, Zhe

    2013-01-01

    The corrosive volume of available data in electric power systems motivates the adoption of data mining techniques in the emerging field of power system data analytics. The mainstream data mining algorithm applied to power systems, Decision Tree (DT), also known as Classification And Regression Tree (CART), has gained increasing interest because of its high performance in terms of computational efficiency, uncertainty manageability, and interpretability. This paper presents an overview of a variety of DT applications to power systems for better interfacing of power systems with data analytics. The fundamental knowledge of the CART algorithm is also introduced, followed by examples of both classification and regression trees, with the help of a case study on security assessment of the Danish power system.

  6. An anonymization technique using intersected decision trees

    Directory of Open Access Journals (Sweden)

    Sam Fletcher

    2015-07-01

    Full Text Available Data mining plays an important role in analyzing the massive amount of data collected in today’s world. However, due to the public’s rising awareness of privacy and lack of trust in organizations, suitable Privacy Preserving Data Mining (PPDM) techniques have become vital. A PPDM technique provides individual privacy while allowing useful data mining. We present a novel noise addition technique called Forest Framework, two novel data quality evaluation techniques called EDUDS and EDUSC, and a security evaluation technique called SERS. Forest Framework builds a decision forest from a dataset and preserves all the patterns (logic rules) of the forest while adding noise to the dataset. We compare Forest Framework to its predecessor, Framework, and another established technique, GADP. Our comparison is done using our three evaluation criteria, as well as Prediction Accuracy. Our experimental results demonstrate the success of our proposed extensions to Framework and the usefulness of our evaluation criteria.

  7. MR-Tree - A Scalable MapReduce Algorithm for Building Decision Trees

    Directory of Open Access Journals (Sweden)

    Vasile PURDILĂ

    2014-03-01

    Full Text Available Learning decision trees against very large amounts of data is not practical on single node computers due to the huge amount of calculations required by this process. Apache Hadoop is a large scale distributed computing platform that runs on commodity hardware clusters and can be used successfully for data mining tasks against very large datasets. This work presents a parallel decision tree learning algorithm, expressed in the MapReduce programming model, that runs on the Apache Hadoop platform and scales very well with dataset size.

  8. Computerized Adaptive Test vs. decision trees: Development of a support decision system to identify suicidal behavior.

    Science.gov (United States)

    Delgado-Gomez, D; Baca-Garcia, E; Aguado, D; Courtet, P; Lopez-Castroman, J

    2016-12-01

    Several Computerized Adaptive Tests (CATs) have been proposed to facilitate assessments in mental health. These tests are built in a standard way, disregarding useful and usually available information not included in the assessment scales that could increase the precision and utility of CATs, such as the history of suicide attempts. Using the items of a previously developed scale for suicidal risk, we compared the performance of a standard CAT and a decision tree in a support decision system to identify suicidal behavior. We included the history of past suicide attempts as a class for the separation of patients in the decision tree. The decision tree needed an average of four items to achieve an accuracy similar to that of a standard CAT with nine items. The accuracy of the decision tree, obtained after 25 cross-validations, was 81.4%. A shortened test adapted for the separation of suicidal and non-suicidal patients was developed. CATs can be very useful tools for the assessment of suicidal risk. However, standard CATs do not use all the information that is available. A decision tree can improve the precision of the assessment since it is constructed using a priori information. Copyright © 2016 Elsevier B.V. All rights reserved.

  9. The Utility of Decision Trees in Oncofertility Care in Japan.

    Science.gov (United States)

    Ito, Yuki; Shiraishi, Eriko; Kato, Atsuko; Haino, Takayuki; Sugimoto, Kouhei; Okamoto, Aikou; Suzuki, Nao

    2017-03-01

    To identify the utility and issues associated with the use of decision trees in oncofertility patient care in Japan. A total of 35 women who had been diagnosed with cancer, but had not begun anticancer treatment, were enrolled. We applied the oncofertility decision tree for women published by Gardino et al. to counsel a consecutive series of women on fertility preservation (FP) options following cancer diagnosis. Percentage of women who decided to undergo oocyte retrieval for embryo cryopreservation and the expected live-birth rate for these patients were calculated using the following equation: expected live-birth rate = pregnancy rate at each age per embryo transfer × (1 - miscarriage rate) × No. of cryopreserved embryos. Oocyte retrieval was performed for 17 patients (48.6%; mean ± standard deviation [SD] age, 36.35 ± 3.82 years). The mean ± SD number of cryopreserved embryos was 5.29 ± 4.63. The expected live-birth rate was 0.66. The expected live-birth rate with FP indicated that one in three oncofertility patients would not expect to have a live birth following oocyte retrieval and embryo cryopreservation. While the decision trees were useful as decision-making tools for women contemplating FP, in the context of the current restrictions on oocyte donation and the extremely small number of adoptions in Japan, the remaining options for fertility after cancer are limited. In order for cancer survivors to feel secure in their decisions, the decision tree may need to be adapted simultaneously with improvements to the social environment, such as greater support for adoption.
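
    The equation quoted in this abstract is a plain product of three factors and can be evaluated directly. The sketch below is only a worked illustration: the function name and the input values are hypothetical and are not figures from the study.

```python
def expected_live_birth_rate(pregnancy_rate, miscarriage_rate, n_embryos):
    """Expected live-birth rate as defined in the abstract:

    pregnancy rate per embryo transfer at the patient's age
    x (1 - miscarriage rate) x number of cryopreserved embryos.
    """
    return pregnancy_rate * (1.0 - miscarriage_rate) * n_embryos


# Hypothetical inputs for illustration only (not data from the study).
rate = expected_live_birth_rate(pregnancy_rate=0.25,
                                miscarriage_rate=0.30,
                                n_embryos=4)
print(f"expected live-birth rate: {rate:.2f}")
```

    As written, the formula grows linearly with the number of embryos and is not capped at 1, so for large embryo counts the result behaves more like an expected number of live births than a probability.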

  10. Modeling and Testing Landslide Hazard Using Decision Tree

    Directory of Open Access Journals (Sweden)

    Mutasem Sh. Alkhasawneh

    2014-01-01

    Full Text Available This paper proposes a decision tree model for specifying the importance of 21 factors causing landslides in a wide area of Penang Island, Malaysia. These factors are vegetation cover, distance from the fault line, slope angle, cross curvature, slope aspect, distance from road, geology, diagonal length, longitude curvature, rugosity, plan curvature, elevation, rain perception, soil texture, surface area, distance from drainage, roughness, land cover, general curvature, tangent curvature, and profile curvature. Decision tree models are used for prediction, classification, and factor importance and are usually represented by an easy-to-interpret tree-like structure. Four models were created using Chi-square Automatic Interaction Detector (CHAID), Exhaustive CHAID, Classification and Regression Tree (CRT), and Quick-Unbiased-Efficient Statistical Tree (QUEST). Twenty-one factors were extracted using digital elevation models (DEMs) and then used as input variables for the models. A data set of 137570 samples was selected for each variable in the analysis, where 68786 samples represent landslides and 68786 samples represent no landslides. 10-fold cross-validation was employed for testing the models. The highest accuracy was achieved using Exhaustive CHAID (82.0%), compared to the CHAID (81.9%), CRT (75.6%), and QUEST (74.0%) models. Across the four models, five factors were identified as the most important: slope angle, distance from drainage, surface area, slope aspect, and cross curvature.
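
    The 10-fold cross-validation mentioned in this abstract can be illustrated with a plain index splitter. This is a minimal sketch under stated assumptions: `k_fold_indices` is a hypothetical helper, the folds are contiguous and unshuffled, and nothing here reproduces the authors' software or data.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for k contiguous folds.

    Assumption: samples are already in random order; real studies
    usually shuffle (and often stratify by class) before splitting.
    """
    if k < 2 or k > n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)  # spread the remainder
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Each sample lands in exactly one test fold.
for train, test in k_fold_indices(25, k=10):
    print(len(train), len(test))
```

    Each model would be fitted on `train` and scored on `test`, and the ten scores averaged to give the reported accuracy.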

  11. Pruning quality affects infection of Acacia mangium and A ...

    African Journals Online (AJOL)

    Pruning (singling) is a common silvicultural practice in commercial Acacia plantations because these trees tend to have multiple stems. The wounds resulting from pruning are susceptible to infection by pathogens. Ceratocystis acaciivora and Lasiodiplodia theobromae have been shown recently to be important pathogens ...

  12. Effect of mycorrhiza and pruning regimes on seasonality of ...

    African Journals Online (AJOL)

    Effect of mycorrhiza and pruning regimes on seasonality of hedgerow tree mulch contribution to alley-cropped cassava in Ibadan, Nigeria. ... promoted dry season pruning production which was masked in Leucaena at 3 months by biomass diversion into flowering and in Gliricidia with both flowering and mite infestation.

  13. Distributed Decision-Tree Induction in Peer-to-Peer Systems

    Data.gov (United States)

    National Aeronautics and Space Administration — This paper offers a scalable and robust distributed algorithm for decision-tree induction in large peer-to-peer (P2P) environments. Computing a decision tree in such...

  14. Black Walnut on Kansas Strip Mine Spoils: Some Observations 25 Years after Pruning

    Science.gov (United States)

    Alex L. Shigo; Nelson F. Rogers; E. Allen, Jr. McGinnes; David T. Funk

    1978-01-01

    Dissections of 14 slow-growing black walnut trees on a strip-mine site revealed that bands of discolored heartwood were associated with pruned and nonpruned branch stubs. Ring shakes were associated with a few pruned and nonpruned stubs, especially with groups of stubs at the same position on the stem. The advantage of early pruning was that even the defects that...

  15. Classification and characterisation of SRF produced from different flows of processed MSW in the Navarra region and its co-combustion performance with olive tree pruning residues.

    Science.gov (United States)

    Ramos Casado, Raquel; Arenales Rivera, Jorge; Borjabad García, Elena; Escalada Cuadrado, Ricardo; Fernández Llorente, Miguel; Bados Sevillano, Raquel; Pascual Delgado, Alfonso

    2016-01-01

    The scope of this work is to study the co-combustion of a solid recovered fuel (SRF) produced from household wastes and packaging wastes recovered from selective collection (SC) in the autonomous community of Navarra, located in the northeast of Spain. The municipal solid waste (MSW) is subjected to a mechanical biological treatment (MBT) in order to stabilize the organic matter and recover the recyclable materials as it is done for packaging wastes. Afterwards, rejects from this treatment plant were preconditioned and compressed by a pelletizing process to produce a secondary fuel according to quality and classification criteria of EN 15359, producing the so-called SRF. A fuel characterisation was carried out according to CEN standards and the SRF was classified as follows: NCV 2; Cl 3; Hg 1. SRF pellets were cofired with residual biomass pellets from olive tree pruning (OTP) in a bubbling fluidised bed combustor, as an option of energy recovery. The mixture of fuels, with a mixing ratio close to 50% by weight, showed a significant calorific value of 18.25 MJ/kg at 8% of moisture content. In addition, elemental composition of the mixture based on nitrogen (N), sulphur (S) and chlorine (Cl) (1% N, 0.2% S and 0.4% Cl) was not far from some herbaceous biomasses. The co-combustion showed good results as an energy recovery technology because of the synergies of both fuels, notably improving the combustion conditions and significantly reducing CO concentration relative to the combustion of OTP, though other contaminants such as NOx and HCl increased. During eight hours of stable operation, the concentration of dioxins and furans was measured, obtaining a value of 7.68 ng/Nm³ (toxic equivalence: i-TEQ of 0.33 ng/Nm³). Proportions of SRF lower than 50% in the mixtures should be tested in order to cut down the emissions of these pollutants, or an abatement system for organochloride compounds may be required. Copyright © 2015 Elsevier Ltd. All rights reserved.

  16. Using boosted decision trees for star-galaxy separation

    Science.gov (United States)

    Etayo-Sotos, P.; Sevilla-Noarbe, I.

    2013-05-01

    We present an application of a particular machine-learning method (Boosted Decision Trees, BDT) to separate stars and galaxies from their catalog characteristics. This application is based on the BDT implementation in the Toolkit for Multivariate Analysis (TMVA) for ROOT, a physics analysis package widely used in high energy physics. The main goal is to improve on simple thresholding cuts on standard separation variables, which may be affected by local effects such as blending or badly calculated background levels, or which do not include information in other bands. We explain the basics of decision trees and the training sets used for the cases that we analyze. The improvements are shown using the Sloan Digital Sky Survey Data Release 7. With this method we have reached an efficiency of 99% with a contamination level of less than 0.45%.
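
    The efficiency and contamination figures quoted in this abstract can be computed from true and predicted labels. The sketch below assumes the common definitions (efficiency = fraction of true targets recovered, contamination = fraction of the selected sample that is not a true target); the abstract does not spell out its definitions, and the function name and toy labels are hypothetical.

```python
def efficiency_and_contamination(true_labels, predicted_labels, target="galaxy"):
    """Return (efficiency, contamination) for the `target` class.

    Assumed definitions (not confirmed by the abstract):
      efficiency    = TP / (TP + FN)
      contamination = FP / (TP + FP)
    """
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    if tp + fn == 0 or tp + fp == 0:
        raise ValueError("no true or no selected objects of the target class")
    return tp / (tp + fn), fp / (tp + fp)


# Toy labels, for illustration only.
truth = ["galaxy"] * 4 + ["star"] * 4
guess = ["galaxy", "galaxy", "galaxy", "star", "galaxy", "star", "star", "star"]
print(efficiency_and_contamination(truth, guess))
```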

  17. Constructing an optimal decision tree for FAST corner point detection

    KAUST Repository

    Alkhalid, Abdulaziz

    2011-01-01

    In this paper, we consider a problem that originates in computer vision: determining an optimal testing strategy for the corner point detection problem that is a part of the FAST algorithm [11,12]. The problem can be formulated as building a decision tree with the minimum average depth for a decision table with all discrete attributes. We experimentally compare the performance of an exact algorithm based on dynamic programming and several greedy algorithms that differ in the attribute selection criterion. © 2011 Springer-Verlag.
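
    Building a decision tree of minimum average depth for a discrete decision table can be sketched as a memoized recursion over subtables. This is a generic illustration of dynamic programming over subsets of rows, not the authors' algorithm; `min_average_depth` and the toy tables are hypothetical, and every row is assumed to be equally likely.

```python
from functools import lru_cache


def min_average_depth(rows, decisions):
    """Minimum average depth of a decision tree for a consistent table.

    rows: list of tuples of discrete attribute values (non-empty).
    decisions: list of decision labels, one per row.
    """
    n_attrs = len(rows[0])

    @lru_cache(maxsize=None)
    def total_depth(subset):
        # subset: frozenset of row indices forming a subtable.
        if len({decisions[i] for i in subset}) <= 1:
            return 0  # a leaf: all rows share one decision
        best = None
        for a in range(n_attrs):
            groups = {}
            for i in subset:
                groups.setdefault(rows[i][a], []).append(i)
            if len(groups) < 2:
                continue  # attribute does not split this subtable
            # Testing attribute a adds one level for every row in the subtable.
            cost = len(subset) + sum(total_depth(frozenset(g)) for g in groups.values())
            if best is None or cost < best:
                best = cost
        if best is None:
            raise ValueError("inconsistent table: equal rows, different decisions")
        return best

    return total_depth(frozenset(range(len(rows)))) / len(rows)


table = [(0, 0), (0, 1), (1, 0), (1, 1)]
print(min_average_depth(table, [0, 1, 1, 0]))  # XOR-like decisions
print(min_average_depth(table, [0, 0, 1, 1]))  # first attribute suffices
```

    The number of distinct subtables can grow exponentially, which is why exact results of this kind are typically compared against greedy attribute selection.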

  18. Optimized block-based connected components labeling with decision trees.

    Science.gov (United States)

    Grana, Costantino; Borghesani, Daniele; Cucchiara, Rita

    2010-06-01

    In this paper, we define a new paradigm for eight-connection labeling, which employs a general approach to improve neighborhood exploration and minimizes the number of memory accesses. First, we exploit and extend the decision table formalism by introducing OR-decision tables, in which multiple alternative actions are managed. An automatic procedure to synthesize the optimal decision tree from the decision table is used, providing the most effective condition evaluation order. Second, we propose a new scanning technique that moves on a 2 x 2 pixel grid over the image, which is optimized by the automatically generated decision tree. An extensive comparison with the state-of-the-art approaches is proposed, both on synthetic and real datasets. The synthetic dataset is composed of random images of different sizes and densities, while the real datasets are an artistic image analysis dataset, a document analysis dataset for text detection and recognition, and finally a standard resolution dataset for picture segmentation tasks. The algorithm provides an impressive speedup over the state-of-the-art algorithms.

  19. Predicting the distribution of out-of-reach biotopes with decision trees in a Swedish marine protected area.

    Science.gov (United States)

    Gonzalez-Mirelis, Genoveva; Lindegarth, Mats

    2012-12-01

    Through spatially explicit predictive models, knowledge of spatial patterns of biota can be generated for out-of-reach environments, where there is a paucity of survey data. This knowledge is invaluable for conservation decisions. We used distribution modeling to predict the occurrence of benthic biotopes, or megafaunal communities of the seabed, to support the spatial planning of a marine national park. Nine biotope classes were obtained prior to modeling from multivariate species data derived from point source, underwater imagery. Five map layers relating to depth and terrain were used as predictor variables. Biotope type was predicted on a pixel-by-pixel basis, where pixel size was 15 x 15 m and total modeled area was 455 km2. To choose a suitable modeling technique we compared the performance of five common models based on recursive partitioning: two types of classification and regression trees ([1] pruned by 10-fold cross-validation and [2] pruned by minimizing complexity), random forests, conditional inference (CI) trees, and CI forests. The selected model was a CI forest (an ensemble of CI trees), a machine-learning technique whose discriminatory power (class-by-class area under the curve [AUC] ranged from 0.75 to 0.86) and classification accuracy (72%) surpassed those of the other methods tested. Conditional inference trees are virtually new to the field of ecology. The final model's overall prediction error was 28%. Model predictions were also checked against a custom-built measure of dubiousness, calculated at the polygon level. Key factors other than the choice of modeling technique include: the use of a multinomial response, accounting for the heterogeneity of observations, and spatial autocorrelation. To illustrate how the model results can be implemented in spatial planning, representation of biodiversity in the national park was described and quantified. 
    Given a goal of maximizing classification accuracy, we conclude that conditional inference trees...

  20. Classification and Optimization of Decision Trees for Inconsistent Decision Tables Represented as MVD Tables

    KAUST Repository

    Azad, Mohammad

    2015-10-11

    The decision tree is a widely used technique for discovering patterns in a consistent data set. But if the data set is inconsistent, i.e., there are groups of examples (objects) with equal values of conditional attributes but different decisions (values of the decision attribute), then discovering the essential patterns or knowledge from the data set is challenging. We consider three approaches (generalized, most common and many-valued decision) to handle such inconsistency. We created different greedy algorithms using various types of impurity and uncertainty measures to construct decision trees. We compared the three approaches based on the decision tree properties of depth, average depth and number of nodes. Based on the result of the comparison, we chose to work with the many-valued decision approach. To determine which greedy algorithms are efficient, we compared them based on the optimization and classification results. It was found that some greedy algorithms, Mult_ws_entSort and Mult_ws_entML, are good for both optimization and classification.
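
    The three ways of handling inconsistency named in this abstract can be illustrated on a toy table. This is a hedged sketch of one reading of the terms: many-valued attaches the set of decisions to each group of equal rows, generalized treats that set as a single new label, and most common keeps the most frequent decision. The function name, the tie-breaking rule, and the data are assumptions.

```python
from collections import Counter, defaultdict


def resolve_inconsistency(rows, decisions):
    """Group equal rows and derive one label per group under three approaches."""
    groups = defaultdict(list)
    for row, decision in zip(rows, decisions):
        groups[tuple(row)].append(decision)

    resolved = {}
    for row, group_decisions in groups.items():
        counts = Counter(group_decisions)
        top = max(counts.values())
        resolved[row] = {
            # Many-valued decision: the row is labeled with a set of decisions.
            "many_valued": frozenset(counts),
            # Generalized decision: the whole set is encoded as one label.
            "generalized": tuple(sorted(counts)),
            # Most common decision; ties go to the smallest label (assumption).
            "most_common": min(d for d, c in counts.items() if c == top),
        }
    return resolved


rows = [(0, 1), (0, 1), (0, 1), (1, 0)]
decisions = [1, 2, 2, 3]
for row, labels in resolve_inconsistency(rows, decisions).items():
    print(row, labels)
```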

  1. DECISION TREES – A PERSPECTIVE OF ELECTRONIC DECISIONAL SUPPORT

    OpenAIRE

    Nicolae Marginean; Janetta Sirbu; Dan Racovitan

    2010-01-01

    Without substituting for the decision-maker, decision support systems, through their components, can facilitate the work of the decision-maker by providing useful clues to solving problems and identifying opportunities. Choosing an optimal solution in the case of complex decision-making processes, with a degree of uncertainty, involving a series of interdependent decisions, performed in several periods of time, can be achieved using decision trees. Suggestive and simple properties propel decision trees among the too...

  2. Baler for the harvesting without shredding of pruning-wood from vineyards and fruit trees in general. Demonstration project. Pressa per la raccolta senza trinciatura dei sarmenti della potatura di vigneto e piu in generale di frutteto. Progetto dimostrativo

    Energy Technology Data Exchange (ETDEWEB)

    Grilli, W.; Muratori, L.

    1986-01-01

    The machine under study was designed to harvest the pruning wood from vineyards and from fruit trees in general without shredding it, forming the cuttings into round bales which are easy to transport and to store, and which are practical for use as fuel for central heating on a farm or family level, or for the production of hot water or steam for use in food-processing industries, preferably near the zone of production. The made-up bales conserve the product well, and reduce to a minimum the loss of cuttings from bale-ends. The compact baler can be used for rows of varying widths, on different kinds of terrain, and for many different types of fruit trees as well.

  3. Pruning Allegheny hardwoods

    Science.gov (United States)

    W. D. Zeedyk; A. F. Hough

    1958-01-01

    The continuing heavy demand for high-quality Allegheny hardwoods, particularly black cherry and sugar maple, impresses on us the need for more information responses of hardwoods to pruning. Pruning may have beneficial effects: it may increase quality without sacrificing growth. Or it may have detrimental effects: it may cause dieback of cambium, decay, staining and...

  4. PREDIKSI CALON MAHASISWA BARU MENGUNAKAN METODE KLASIFIKASI DECISION TREE

    Directory of Open Access Journals (Sweden)

    Mambang

    2015-02-01

    Full Text Available Before a health education institution begins the new academic year, the first step is the selection of new admissions from graduates of general and vocational secondary education. This study predicts new student admissions using multiple data attributes. The model is a decision tree, a classification and prediction method that builds a tree consisting of a root node, internal nodes and terminal nodes. The root node and internal nodes are variables/features, while the terminal nodes are class labels. Based on the experimental results and evaluations, it can be concluded that the C4.5 algorithm with the Uncertainty criterion obtained an accuracy of 80.39%, a precision of 94.44% and a recall of 75.00%, while the C4.5 algorithm with the Information Gain Ratio criterion obtained an accuracy of 88.24%, a precision of 98.28% and a recall of 83.82%.
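
    C4.5 conventionally ranks candidate splits by gain ratio, that is, information gain divided by split information. The sketch below shows that calculation for one categorical attribute; it is a textbook illustration with hypothetical names and toy data, not the software used in the study.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (base 2) of a list of hashable items."""
    n = len(items)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of splitting `labels` by the categorical attribute `values`."""
    n = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy of the labels after the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    split_info = entropy(values)  # entropy of the partition itself
    return gain / split_info if split_info > 0 else 0.0


# Toy data, for illustration only.
print(gain_ratio(["a", "a", "b", "b"], ["yes", "yes", "no", "no"]))
print(gain_ratio(["a", "b", "a", "b"], ["yes", "yes", "no", "no"]))
```

    Dividing by split information penalizes attributes with many distinct values, which plain information gain tends to favor.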

  5. Totally Optimal Decision Trees for Monotone Boolean Functions with at Most Five Variables

    KAUST Repository

    Chikalov, Igor

    2013-01-01

    In this paper, we present the empirical results for relationships between time (depth) and space (number of nodes) complexity of decision trees computing monotone Boolean functions, with at most five variables. We use Dagger (a tool for optimization of decision trees and decision rules) to conduct experiments. We show that, for each monotone Boolean function with at most five variables, there exists a totally optimal decision tree which is optimal with respect to both depth and number of nodes.

  6. Using decision trees to understand structure in missing data.

    Science.gov (United States)

    Tierney, Nicholas J; Harden, Fiona A; Harden, Maurice J; Mengersen, Kerrie L

    2015-06-29

    Demonstrate the application of decision trees--classification and regression trees (CARTs), and their cousins, boosted regression trees (BRTs)--to understand structure in missing data. Data taken from employees at 3 different industrial sites in Australia. 7915 observations were included. The approach was evaluated using an occupational health data set comprising results of questionnaires, medical tests and environmental monitoring. Statistical methods included standard statistical tests and the 'rpart' and 'gbm' packages for CART and BRT analyses, respectively, from the statistical software 'R'. A simulation study was conducted to explore the capability of decision tree models in describing data with missingness artificially introduced. CART and BRT models were effective in highlighting a missingness structure in the data, related to the type of data (medical or environmental), the site in which it was collected, the number of visits, and the presence of extreme values. The simulation study revealed that CART models were able to identify variables and values responsible for inducing missingness. There was greater variation in variable importance for unstructured as compared to structured missingness. Both CART and BRT models were effective in describing structural missingness in data. CART models may be preferred over BRT models for exploratory analysis of missing data, and selecting variables important for predicting missingness. BRT models can show how values of other variables influence missingness, which may prove useful for researchers. Researchers are encouraged to use CART and BRT models to explore and understand missing data. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

  7. Optimization and analysis of decision trees and rules: Dynamic programming approach

    KAUST Repository

    Alkhalid, Abdulaziz

    2013-08-01

    This paper is devoted to the consideration of software system Dagger created in KAUST. This system is based on extensions of dynamic programming. It allows sequential optimization of decision trees and rules relative to different cost functions, derivation of relationships between two cost functions (in particular, between number of misclassifications and depth of decision trees), and between cost and uncertainty of decision trees. We describe features of Dagger and consider examples of this system's work on decision tables from UCI Machine Learning Repository. We also use Dagger to compare 16 different greedy algorithms for decision tree construction. © 2013 Taylor and Francis Group, LLC.

  8. Optimization and analysis of decision trees and rules: dynamic programming approach

    Science.gov (United States)

    Alkhalid, Abdulaziz; Amin, Talha; Chikalov, Igor; Hussain, Shahid; Moshkov, Mikhail; Zielosko, Beata

    2013-08-01

    This paper is devoted to the consideration of software system Dagger created in KAUST. This system is based on extensions of dynamic programming. It allows sequential optimization of decision trees and rules relative to different cost functions, derivation of relationships between two cost functions (in particular, between number of misclassifications and depth of decision trees), and between cost and uncertainty of decision trees. We describe features of Dagger and consider examples of this system's work on decision tables from UCI Machine Learning Repository. We also use Dagger to compare 16 different greedy algorithms for decision tree construction.

  9. Decision Tree Approach to Discovering Fraud in Leasing Agreements

    Directory of Open Access Journals (Sweden)

    Horvat Ivan

    2014-09-01

    Full Text Available Background: Fraud attempts create large losses for financing subjects in modern economies. At the same time, leasing agreements have become more and more popular as a means of financing objects such as machinery and vehicles, but are more vulnerable to fraud attempts. Objectives: The goal of the paper is to estimate the usability of the data mining approach in discovering fraud in leasing agreements. Methods/Approach: Real-world data from one Croatian leasing firm was used for creating two models for fraud detection in leasing. The decision tree method was used for creating a classification model, and the CHAID algorithm was deployed. Results: The decision tree model has indicated that the object of the leasing agreement had the strongest impact on the probability of fraud. Conclusions: In order to enhance the probability of the developed model, it would be necessary to develop software that would enable automated, quick and transparent retrieval of data from the system, processing according to the rules and displaying the results in multiple categories.

  10. Peripheral Exophytic Oral Lesions: A Clinical Decision Tree

    Directory of Open Access Journals (Sweden)

    Hamed Mortazavi

    2017-01-01

    Full Text Available Diagnosis of peripheral oral exophytic lesions might be quite challenging. This review article aimed to introduce a decision tree for oral exophytic lesions according to their clinical features. General search engines and specialized databases including PubMed, PubMed Central, Medline Plus, EBSCO, Science Direct, Scopus, Embase, and authenticated textbooks were used to find relevant topics by means of keywords such as “oral soft tissue lesion,” “oral tumor like lesion,” “oral mucosal enlargement,” and “oral exophytic lesion.” Related English-language articles published from 1988 to 2016 in both medical and dental journals were appraised. Upon compilation of data, peripheral oral exophytic lesions were categorized into two major groups according to their surface texture: smooth (mesenchymal or nonsquamous epithelium-originated) and rough (squamous epithelium-originated). Lesions with smooth surface were also categorized into three subgroups according to their general frequency: reactive hyperplastic lesions/inflammatory hyperplasia, salivary gland lesions (nonneoplastic and neoplastic), and mesenchymal lesions (benign and malignant neoplasms). In addition, lesions with rough surface were summarized in six more common lesions. In total, 29 entities were organized in the form of a decision tree in order to help clinicians establish a logical diagnosis by a stepwise progression method.

  11. Toward the Decision Tree for Inferring Requirements Maturation Types

    Science.gov (United States)

    Nakatani, Takako; Kondo, Narihito; Shirogane, Junko; Kaiya, Haruhiko; Hori, Shozo; Katamine, Keiichi

    Requirements are elicited step by step during the requirements engineering (RE) process. However, some types of requirements are elicited completely after the scheduled requirements elicitation process is finished. Such a situation is regarded as a problematic situation. In our study, the difficulties of eliciting various kinds of requirements are observed by components. We refer to the components as observation targets (OTs) and introduce the term “requirements maturation.” It means when and how requirements are elicited completely in the project. Requirements maturation is discussed for physical and logical OTs. OTs viewed from a logical viewpoint are called logical OTs, e.g., quality requirements. The requirements of physical OTs, e.g., modules, components, subsystems, etc., include functional and non-functional requirements. They are influenced by their requesters' environmental changes, as well as developers' technical changes. In order to infer the requirements maturation period of each OT, we need to know how much these factors influence the OTs' requirements maturation. According to the observation of actual past projects, we defined the PRINCE (Pre Requirements Intelligence Net Consideration and Evaluation) model. It aims to guide developers in their observation of the requirements maturation of OTs. We quantitatively analyzed the actual cases with their requirements elicitation process and extracted essential factors that influence the requirements maturation. The results of interviews with project managers were analyzed by WEKA, a data mining system, from which the decision tree was derived. This paper introduces the PRINCE model and the category of logical OTs to be observed. The decision tree that helps developers infer the maturation type of an OT is also described. We evaluate the tree through real projects and discuss its ability to infer the requirements maturation types.

  12. Decision trees for predicting the academic success of students

    Directory of Open Access Journals (Sweden)

    Josip Mesarić

    2016-12-01

    Full Text Available The aim of this paper is to create a model that successfully classifies students into one of two categories, depending on their success at the end of their first academic year, and finding meaningful variables affecting their success. This model is based on information regarding student success in high school and their courses after completing their first year of study, as well as the rank of preferences assigned to the observed faculty, and attempts to classify students into one of the two categories in line with their academic success. Creating a model required collecting data on all undergraduate students enrolled into their second year at the Faculty of Economics, University of Osijek, as well as data on completion of the state exam. These two datasets were combined and used for the model. Several classification algorithms for constructing decision trees were compared and the statistical significance (t-test of the results was analyzed. Finally, the algorithm that produced the highest accuracy was chosen as the most successful algorithm for modeling the academic success of students. The highest classification rate of 79% was produced using the REPTree decision tree algorithm, but the tree was not as successful in classifying both classes. Therefore, the average rate of classification was calculated for two models that gave the highest total rate of classification, where a higher percentage is achieved using the model relying on the algorithm J48. The most significant variables were total points in the state exam, points from high school and points in the Croatian language exam.

  13. Comparative study of biodegradability prediction of chemicals using decision trees, functional trees, and logistic regression.

    Science.gov (United States)

    Chen, Guangchao; Li, Xuehua; Chen, Jingwen; Zhang, Ya-Nan; Peijnenburg, Willie J G M

    2014-12-01

    Biodegradation is the principal environmental dissipation process of chemicals. As such, it is a dominant factor determining the persistence and fate of organic chemicals in the environment, and is therefore of critical importance to chemical management and regulation. In the present study, the authors developed in silico methods assessing biodegradability based on a large heterogeneous set of 825 organic compounds, using the techniques of the C4.5 decision tree, the functional inner regression tree, and logistic regression. External validation was subsequently carried out by 2 independent test sets of 777 and 27 chemicals. As a result, the functional inner regression tree exhibited the best predictability with predictive accuracies of 81.5% and 81.0%, respectively, on the training set (825 chemicals) and test set I (777 chemicals). Performance of the developed models on the 2 test sets was subsequently compared with that of the Estimation Program Interface (EPI) Suite Biowin 5 and Biowin 6 models, which also showed a better predictability of the functional inner regression tree model. The model built in the present study exhibits a reasonable predictability compared with existing models while possessing a transparent algorithm. Interpretation of the mechanisms of biodegradation was also carried out based on the models developed. © 2014 SETAC.

  14. Predictive mapping of soil organic carbon in wet cultivated lands using classification-tree based models

    DEFF Research Database (Denmark)

    Kheir, Rania Bou; Greve, Mogens Humlekrog; Bøcher, Peder Klith

    2010-01-01

    the geographic distribution of SOC across Denmark using remote sensing (RS), geographic information systems (GISs) and decision-tree modeling (un-pruned and pruned classification trees). Seventeen parameters, i.e. parent material, soil type, landscape type, elevation, slope gradient, slope aspect, mean curvature...... field measurements in the area of interest (Denmark). A large number of tree-based classification models (588) were developed using (i) all of the parameters, (ii) all Digital Elevation Model (DEM) parameters only, (iii) the primary DEM parameters only, (iv), the remote sensing (RS) indices only, (v......) selected pairs of parameters, (vi) soil type, parent material and landscape type only, and (vii) the parameters having a high impact on SOC distribution in built pruned trees. The best constructed classification tree models (in the number of three) with the lowest misclassification error (ME...

  15. Post-pruning shoot growth increases fruit abscission and reduces stem carbohydrates and yield in macadamia.

    Science.gov (United States)

    McFadyen, Lisa M; Robertson, David; Sedgley, Margaret; Kristiansen, Paul; Olesen, Trevor

    2011-05-01

    There is good evidence for deciduous trees that competition for carbohydrates from shoot growth accentuates early fruit abscission and reduces yield but the effect for evergreen trees is not well defined. Here, whole-tree tip-pruning at anthesis is used to examine the effect of post-pruning shoot development on fruit abscission in the evergreen subtropical tree macadamia (Macadamia integrifolia, M. integrifolia × tetraphylla). Partial-tree tip-pruning is also used to test the localization of the effect. In the first experiment (2005/2006), all branches on trees were tip-pruned at anthesis, some trees were allowed to re-shoot (R treatment) and shoots were removed from others (NR treatment). Fruit set and stem total non-structural carbohydrates (TNSC) over time, and yield were measured. In the second experiment (2006/2007), upper branches of trees were tip-pruned at anthesis, some trees were allowed to re-shoot (R) and shoots were removed from others (NR). Fruit set and yield were measured separately for upper (pruned) and lower (unpruned) branches. In the first experiment, R trees set far fewer fruit and had lower yield than NR trees. TNSC fell and rose in all treatments but the decline in R trees occurred earlier than in NR trees and coincided with early shoot growth and the increase in fruit abscission relative to the other treatments. In the second experiment, fruit abscission on upper branches of R trees increased relative to the other treatments but there was little difference in fruit abscission between treatments on lower branches. This study is the first to demonstrate an increase in fruit abscission in an evergreen tree in response to pruning. The effect appeared to be related to competition for carbohydrates between post-pruning shoot growth and fruit development and was local, with shoot growth on pruned branches having no effect on fruit abscission on unpruned branches.

  16. Prune belly syndrome

    Science.gov (United States)

    The causes of prune belly syndrome are unknown. The condition affects mostly boys. While in the womb, the developing baby's abdomen swells with fluid. Often, the cause is a problem in the urinary tract. The fluid disappears after birth, leading ...

  17. A Novel Approach on Designing Augmented Fuzzy Cognitive Maps Using Fuzzified Decision Trees

    Science.gov (United States)

    Papageorgiou, Elpiniki I.

    This paper proposes a new methodology for designing Fuzzy Cognitive Maps using crisp decision trees that have been fuzzified. A fuzzy cognitive map is a knowledge-based technique that works as an artificial cognitive network inheriting the main aspects of cognitive maps and artificial neural networks. Decision trees, on the other hand, are well-known intelligent techniques that extract rules from both symbolic and numeric data. Fuzzy theoretical techniques are used to fuzzify crisp decision trees in order to soften decision boundaries at decision nodes inherent in this type of tree. Comparisons between crisp decision trees and the fuzzified decision trees suggest that the latter are significantly more robust and produce more balanced decision making. The approach proposed in this paper could incorporate any type of fuzzy decision tree. Through this methodology, new linguistic weights were determined in the FCM model, thus producing an augmented FCM tool. The framework consists of a new fuzzy algorithm that generates, from induced fuzzy decision trees, linguistic weights describing the cause-effect relationships among the concepts of the FCM model.

  18. A new approach to enhance the performance of decision tree for classifying gene expression data.

    Science.gov (United States)

    Hassan, Md; Kotagiri, Ramamohanarao

    2013-12-20

    Gene expression data classification is a challenging task due to the large dimensionality and very small number of samples. The decision tree is one of the popular machine learning approaches to address such classification problems. However, the existing decision tree algorithms use a single gene feature at each node to split the data into its child nodes and hence might suffer from poor performance, especially when classifying gene expression datasets. By using a new decision tree algorithm in which each node of the tree consists of more than one gene, we enhance the classification performance of traditional decision tree classifiers. Our method selects suitable genes that are combined using a linear function to form a derived composite feature. To determine the structure of the tree we use the area under the Receiver Operating Characteristics curve (AUC). Experimental analysis demonstrates higher classification accuracy using the new decision tree compared to the other existing decision trees in the literature. We experimentally compare the effect of our scheme against other well-known decision tree techniques. Experiments show that our algorithm can substantially boost the classification performance of the decision tree.

  19. Bi-Criteria Optimization of Decision Trees with Applications to Data Analysis

    KAUST Repository

    Chikalov, Igor

    2017-10-19

    This paper is devoted to the study of bi-criteria optimization problems for decision trees. We consider different cost functions such as depth, average depth, and number of nodes. We design algorithms that allow us to construct the set of Pareto optimal points (POPs) for a given decision table and the corresponding bi-criteria optimization problem. These algorithms are suitable for investigation of medium-sized decision tables. We discuss three examples of applications of the created tools: the study of relationships among depth, average depth and number of nodes for decision trees for corner point detection (such trees are used in computer vision for object tracking), study of systems of decision rules derived from decision trees, and comparison of different greedy algorithms for decision tree construction as single- and bi-criteria optimization algorithms.

  20. Using histograms to introduce randomization in the generation of ensembles of decision trees

    Science.gov (United States)

    Kamath, Chandrika; Cantu-Paz, Erick; Littau, David

    2005-02-22

    A system for decision tree ensembles that includes a module to read the data, a module to create a histogram, a module to evaluate a potential split according to some criterion using the histogram, a module to select a split point randomly in an interval around the best split, a module to split the data, and a module to combine multiple decision trees in ensembles. The decision tree method includes the steps of reading the data; creating a histogram; evaluating a potential split according to some criterion using the histogram, selecting a split point randomly in an interval around the best split, splitting the data, and combining multiple decision trees in ensembles.
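
    The split-selection steps listed above can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: Gini impurity stands in for "some criterion", the histogram uses equal-width bins, and the "interval around the best split" is taken to be half a bin width on either side. All function names are hypothetical, and combining the trees into an ensemble is left out.

```python
import random

def gini(counts):
    """Gini impurity of a list of per-class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)

def build_histogram(values, labels, n_classes, n_bins):
    """Equal-width histogram holding per-class counts in every bin."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for constant data
    hist = [[0] * n_classes for _ in range(n_bins)]
    for v, y in zip(values, labels):
        b = min(int((v - lo) / width), n_bins - 1)
        hist[b][y] += 1
    return hist, lo, width

def best_boundary(hist, lo, width, n_classes):
    """Score every interior bin boundary using only the histogram counts."""
    total = [sum(col) for col in zip(*hist)]
    n = sum(total)
    left = [0] * n_classes
    best_score, best_split = float("inf"), None
    for b in range(len(hist) - 1):
        left = [l + c for l, c in zip(left, hist[b])]
        right = [t - l for t, l in zip(total, left)]
        score = (sum(left) * gini(left) + sum(right) * gini(right)) / n
        if score < best_score:
            best_score, best_split = score, lo + (b + 1) * width
    return best_split

def randomized_split(values, labels, n_classes=2, n_bins=10, rng=random):
    """Pick a split point at random in an interval around the best boundary."""
    hist, lo, width = build_histogram(values, labels, n_classes, n_bins)
    best = best_boundary(hist, lo, width, n_classes)
    # Assumption: the interval is +/- half a bin width around the best boundary.
    return rng.uniform(best - width / 2, best + width / 2)
```

    Calling `randomized_split` repeatedly on the same data yields slightly different thresholds, which is the source of diversity among the trees in this sketch; it needs at least two bins so that an interior boundary exists.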

  1. Data Fusion Research of Triaxial Human Body Motion Gesture based on Decision Tree

    Directory of Open Access Journals (Sweden)

    Feihong Zhou

    2014-05-01

    Full Text Available The development status of human body motion gesture data fusion, both domestically and abroad, is analyzed. A triaxial accelerometer is adopted to develop a wearable human body motion gesture monitoring system aimed at healthcare for older people. Following a brief introduction to the decision tree algorithm, the WEKA workbench is used to generate a human body motion gesture decision tree. Finally, the classification quality of the decision tree is validated through experiments. The experimental results show that the decision tree algorithm can reach an average prediction accuracy of 97.5 % with lower time cost.

  2. Fuzzy decision trees as a decision-making framework in the public sector

    Directory of Open Access Journals (Sweden)

    Benčina Jože

    2011-01-01

    Full Text Available Systematic approaches to making decisions in the public sector are becoming very common. Most often, these approaches concern expert decision models. The expansion of the idea of e-participation and e-democracy was influenced by the development of technology. All stakeholders are supposed to participate in decision making, so this brings a new feature to the decision-making process, in which amateurs and non-specialists participate in decision making instead of experts. To be able to understand the needs and wishes of stakeholders, it is not enough to vote for alternatives - it is important to participate in solution-finding and to express opinions about the important elements of these matters. The solution presented in this paper concerns a fuzzy decision-making framework. This framework combines the advantages of presenting the decision-making problem in a tree structure with the possibilities offered by the flexibility of the fuzzy approach. The possibilities of implementing the framework in practice are introduced by case studies of investment project appraisal in a community and assessment of the efficiency and effectiveness of public institutions.

  3. Ship Engine Room Casualty Analysis by Using Decision Tree Method

    Directory of Open Access Journals (Sweden)

    Ömür Yaşar SAATÇİOĞLU

    2017-03-01

    Full Text Available Ships may encounter undesirable conditions during operations. As a consequence of a casualty, fire, explosion, flooding, grounding, injury, or even death may occur. However, these outcomes can be avoided with precautions and preventive operating processes. In maritime transportation, casualties depend on various factors. These were listed as misuse of the engine equipment and tools, defective machinery or equipment, inadequacy of operational procedures and safety measures, and force majeure effects. Casualty reports published in Australia, New Zealand, United Kingdom, Canada and United States until 2015 were examined, and the probable causes and consequences of casualties were determined with their occurrence percentages. In this study, 89 marine investigation reports regarding engine room casualties were analyzed. Casualty factors were analyzed with their frequency percentages, and their main causes were established. This study aims to investigate engine room based casualties, the frequency of each casualty type and the main causes by using the decision tree method.

  4. Electronic Nose Odor Classification with Advanced Decision Tree Structures

    Directory of Open Access Journals (Sweden)

    S. Guney

    2013-09-01

    Full Text Available An electronic nose (e-nose) is an electronic device which can measure chemical compounds in air and consequently classify different odors. In this paper, an e-nose device consisting of 8 different gas sensors was designed and constructed. Using this device, 104 different experiments involving 11 different odor classes (moth, angelica root, rose, mint, polis, lemon, rotten egg, egg, garlic, grass, and acetone) were performed. The main contribution of this paper is the finding that, using chemical domain knowledge, it is possible to train an accurate odor classification system. The domain knowledge about chemical compounds is represented by a decision tree whose nodes are composed of classifiers such as Support Vector Machines and k-Nearest Neighbor. The overall accuracy achieved with the proposed algorithm and the constructed e-nose device was 97.18 %. Training and testing data sets used in this paper are published online.

  5. EVFDT: An Enhanced Very Fast Decision Tree Algorithm for Detecting Distributed Denial of Service Attack in Cloud-Assisted Wireless Body Area Network

    Directory of Open Access Journals (Sweden)

    Rabia Latif

    2015-01-01

    Full Text Available Due to the scattered nature of DDoS attacks and the advancement of new technologies such as cloud-assisted WBAN, it becomes challenging to detect malicious activities by relying on conventional security mechanisms. The detection of such attacks demands an adaptive and incremental learning classifier capable of accurate decision making with less computation. However, DDoS attack detection using existing machine learning techniques requires the full data set to be stored in memory and is not appropriate for real-time network traffic. To overcome these shortcomings, the Very Fast Decision Tree (VFDT) algorithm has been proposed in the past; it can handle high-speed streaming data efficiently. When considering the data generated by WBAN sensors, noise is an obvious factor that severely affects accuracy and increases false alarms. In this paper, an enhanced VFDT (EVFDT) is proposed to efficiently detect the occurrence of DDoS attacks in cloud-assisted WBAN. EVFDT uses an adaptive tie-breaking threshold for node splitting. To resolve the tree size expansion under extreme noise, a lightweight iterative pruning technique is proposed. To analyze the performance of EVFDT, four metrics are evaluated: classification accuracy, tree size, time, and memory. Simulation results show that EVFDT attains significantly high detection accuracy with fewer false alarms.
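
    The role of a tie-breaking threshold in node splitting can be illustrated with the split test used by the standard VFDT (Hoeffding tree) algorithm. This is a minimal sketch, not the EVFDT rule: the abstract does not say how EVFDT adapts its threshold, so `tie_threshold` is a fixed, hypothetical parameter here, and the default values for `delta` and `value_range` are assumptions.

```python
import math

def hoeffding_bound(value_range, delta, n):
    """Epsilon such that, with probability 1 - delta, the observed mean of n
    samples lies within epsilon of the true mean (Hoeffding's inequality)."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))

def should_split(best_gain, second_gain, n,
                 delta=1e-7, value_range=1.0, tie_threshold=0.05):
    """Standard VFDT split test on the two best candidate attributes.

    value_range=1.0 assumes information gain on a two-class problem.
    tie_threshold is fixed here; an adaptive rule would replace it.
    """
    eps = hoeffding_bound(value_range, delta, n)
    clear_winner = (best_gain - second_gain) > eps
    # When eps is already tiny, the two attributes are effectively tied and
    # waiting for more examples is unlikely to separate them.
    tie = eps < tie_threshold
    return clear_winner or tie
```

    With these defaults the bound shrinks as more examples reach a leaf, so two nearly identical attributes that block a split at n = 1000 no longer block it once the bound falls below the tie threshold.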

  6. The value of decision tree analysis in planning anaesthetic care in obstetrics.

    Science.gov (United States)

    Bamber, J H; Evans, S A

    2016-08-01

    The use of decision tree analysis is discussed in the context of the anaesthetic and obstetric management of a young pregnant woman with joint hypermobility syndrome with a history of insensitivity to local anaesthesia and a previous difficult intubation due to a tongue tumour. The multidisciplinary clinical decision process resulted in the woman being delivered without complication by elective caesarean section under general anaesthesia after an awake fibreoptic intubation. The decision process used is reviewed and compared retrospectively to a decision tree analytical approach. The benefits and limitations of using decision tree analysis are reviewed and its application in obstetric anaesthesia is discussed. Copyright © 2016 Elsevier Ltd. All rights reserved.

  7. A Pruning Neural Network Model in Credit Classification Analysis

    Directory of Open Access Journals (Sweden)

    Yajiao Tang

    2018-01-01

    Full Text Available Nowadays, credit classification models are widely applied because they can help financial decision-makers to handle credit classification issues. Among them, artificial neural networks (ANNs) have been widely accepted as convincing methods in the credit industry. In this paper, we propose a pruning neural network (PNN) and apply it to solve the credit classification problem by adopting the well-known Australian and Japanese credit datasets. The model is inspired by the synaptic nonlinearity of a dendritic tree in a biological neural model, and it is trained by an error back-propagation algorithm. The model is capable of realizing a neuronal pruning function by removing the superfluous synapses and useless dendrites, and it forms a tidy dendritic morphology at the end of learning. Furthermore, we utilize logic circuits (LCs) to simulate the dendritic structures successfully, which allows PNN to be implemented effectively in hardware. The statistical results of our experiments have verified that PNN obtains superior performance in comparison with other classical algorithms in terms of accuracy and computational efficiency.

  8. 7 CFR 993.7 - French prunes.

    Science.gov (United States)

    2010-01-01

    ... 7 Agriculture 8 2010-01-01 2010-01-01 false French prunes. 993.7 Section 993.7 Agriculture... Order Regulating Handling Definitions § 993.7 French prunes. French prunes means: (a) Prunes produced from plums of the following varieties of plums: French (Prune d'Agen, Petite Prune d'Agen), Coates (Cox...

  9. An Improved Decision Tree for Predicting a Major Product in Competing Reactions

    Science.gov (United States)

    Graham, Kate J.

    2014-01-01

    When organic chemistry students encounter competing reactions, they are often overwhelmed by the task of evaluating multiple factors that affect the outcome of a reaction. The use of a decision tree is a useful tool to teach students to evaluate a complex situation and propose a likely outcome. Specifically, a decision tree can help students…

  10. Decision-Tree Models of Categorization Response Times, Choice Proportions, and Typicality Judgments

    Science.gov (United States)

    Lafond, Daniel; Lacouture, Yves; Cohen, Andrew L.

    2009-01-01

    The authors present 3 decision-tree models of categorization adapted from T. Trabasso, H. Rollins, and E. Shaughnessy (1971) and use them to provide a quantitative account of categorization response times, choice proportions, and typicality judgments at the individual-participant level. In Experiment 1, the decision-tree models were fit to…

  11. A Decision Tree for Psychology Majors: Supplying Questions as Well as Answers.

    Science.gov (United States)

    Poe, Retta E.

    1988-01-01

    Outlines the development of a psychology careers decision tree to help faculty advise students in planning their program. States that students using the decision tree may benefit by learning more about their career options and by acquiring better question-asking skills. (GEA)

  12. Effect of mycorrhiza and pruning regimes on seasonality of ...

    African Journals Online (AJOL)

    GRACE

    2006-07-16

    Jul 16, 2006 ... Effect of mycorrhiza and pruning regimes on seasonality of hedgerow tree mulch contribution to .... Mycorrhizae are symbiotic association between plant roots and certain soil fungi (Sieverding, 1991). ..... inoculum was put under the seeds in the polythene bags for inoculated hedgerow tree seedlings and ...

  13. Decision Rules, Trees and Tests for Tables with Many-valued Decisions–comparative Study

    KAUST Repository

    Azad, Mohammad

    2013-10-04

    In this paper, we present three approaches for construction of decision rules for decision tables with many-valued decisions. We construct decision rules directly for rows of the decision table, based on paths in a decision tree, and based on attributes contained in a test (super-reduct). Experimental results for data sets taken from the UCI Machine Learning Repository compare the maximum and the average length of rules for the mentioned approaches.

  14. Greedy heuristics for minimization of number of terminal nodes in decision trees

    KAUST Repository

    Hussain, Shahid

    2014-10-01

    This paper describes, in detail, several greedy heuristics for construction of decision trees. We study the number of terminal nodes of decision trees, which is closely related with the cardinality of the set of rules corresponding to the tree. We compare these heuristics empirically for two different types of datasets (datasets acquired from UCI ML Repository and randomly generated data) as well as compare with the optimal results obtained using dynamic programming method.

  15. Multi-test decision tree and its application to microarray data classification.

    Science.gov (United States)

    Czajkowski, Marcin; Grześ, Marek; Kretowski, Marek

    2014-05-01

    A desirable property of tools used to investigate biological data is that they produce easy-to-understand models and predictive decisions. Decision trees are particularly promising in this regard due to their comprehensible nature that resembles the hierarchical process of human decision making. However, existing algorithms for learning decision trees have a tendency to underfit gene expression data. The main aim of this work is to improve the performance and stability of decision trees with only a small increase in their complexity. We propose a multi-test decision tree (MTDT); our main contribution is the application of several univariate tests in each non-terminal node of the decision tree. We also search for alternative, lower-ranked features in order to obtain more stable and reliable predictions. Experimental validation was performed on several real-life gene expression datasets. Comparison results with eight classifiers show that MTDT has a statistically significantly higher accuracy than popular decision tree classifiers, and it was highly competitive with ensemble learning algorithms. The proposed solution managed to outperform its baseline algorithm on 14 datasets by an average of 6%. A study performed on one of the datasets showed that the discovered genes used in the MTDT classification model are supported by biological evidence in the literature. This paper introduces a new type of decision tree which is more suitable for solving biological problems. MTDTs are relatively easy to analyze and much more powerful in modeling high dimensional microarray data than their popular counterparts. Copyright © 2014 Elsevier B.V. All rights reserved.

  16. The decision tree classifier - Design and potential. [for Landsat-1 data]

    Science.gov (United States)

    Hauska, H.; Swain, P. H.

    1975-01-01

    A new classifier has been developed for the computerized analysis of remote sensor data. The decision tree classifier is essentially a maximum likelihood classifier using multistage decision logic. It is characterized by the fact that an unknown sample can be classified into a class using one or several decision functions in a successive manner. The classifier is applied to the analysis of data sensed by Landsat-1 over Kenosha Pass, Colorado. The classifier is illustrated by a tree diagram which for processing purposes is encoded as a string of symbols such that there is a unique one-to-one relationship between string and decision tree.
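
    The idea of encoding a tree diagram as a string with a one-to-one relationship between string and tree can be illustrated with a small round-trip sketch. The abstract does not describe the actual symbol scheme, so the parenthesized preorder notation below is an assumption, and the decision-function names (`f1`, `f2`) and class labels are purely hypothetical.

```python
def encode(tree):
    """Preorder encoding. A leaf is its class label; an internal node is
    '(' + decision-function name + encoded children + ')'."""
    if isinstance(tree, str):
        return tree
    name, children = tree
    return "(" + name + " " + " ".join(encode(c) for c in children) + ")"

def decode(text):
    """Inverse of encode: rebuild the nested (name, children) structure."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def parse():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok != "(":
            return tok  # leaf: a class label
        name = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            children.append(parse())
        pos += 1  # consume the closing ")"
        return (name, children)

    return parse()

# Hypothetical two-stage tree: f1 separates "water" from the rest,
# then f2 separates "forest" from "grass".
tree = ("f1", ["water", ("f2", ["forest", "grass"])])
text = encode(tree)
print(text)
print(decode(text) == tree)
```

    Because `decode(encode(tree))` returns the original structure, each string in this notation corresponds to exactly one tree, which is the property the abstract relies on for processing.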

  17. Decision trees and decision committee applied to star/galaxy separation problem

    Science.gov (United States)

    Vasconcellos, Eduardo Charles

    Vasconcellos et al [1] study the efficiency of 13 different decision tree algorithms applied to photometric data in the Sloan Digital Sky Survey Data Release Seven (SDSS-DR7) to perform star/galaxy separation. Each algorithm is defined by a set of parameters which, when varied, produce different final classification trees. In that work we extensively explore the parameter space of each algorithm, using the set of 884,126 SDSS objects with spectroscopic data as the training set. We find that the Functional Tree algorithm (FT) yields the best results by the mean completeness function (galaxy true positive rate) in two magnitude intervals:14=19 (82.1%). We compare FT classification to the SDSS parametric, 2DPHOT and Ball et al (2006) classifications. At the faintest magnitudes (r > 19), our classifier is the only one that maintains high completeness (>80%) while simultaneously achieving low contamination ( 2.5%). We also examine the SDSS parametric classifier (psfMag - modelMag) to see if the dividing line between stars and galaxies can be adjusted to improve the classifier. We find that currently stars in close pairs are often misclassified as galaxies, and suggest a new cut to improve the classifier. Finally, we apply our FT classifier to separate stars from galaxies in the full set of 69,545,326 SDSS photometric objects in the magnitude range 14 objects from the same 884,126 SDSS-DR7 objects with spectroscopic data that we used before. Both the decision committee and our previous single FT classifier will be applied to the new objects from SDSS data releases eight, nine and ten. Finally, we will compare the performance of both methods on this new data set. [1] Vasconcellos, E. C.; de Carvalho, R. R.; Gal, R. R.; LaBarbera, F. L.; Capelato, H. V.; Fraga Campos Velho, H.; Trevisan, M.; Ruiz, R. S. R.. Decision Tree Classifiers for Star/Galaxy Separation. The Astronomical Journal, Volume 141, Issue 6, 2011.

  18. Reconciliation as a tool for decision making within decision tree related to insolvency problems

    Directory of Open Access Journals (Sweden)

    Tomáš Poláček

    2016-05-01

    Full Text Available Purpose of the article: The paper draws on the results of previous studies of the recoverability of creditors' claims, which examined the debtor's point of view and his/her debts on the Czech Republic financial market. A company that has entered a bankruptcy hearing has several legislatively supported options for dealing with this situation and repaying creditors' money. Each of the options has been specified as a variant of a decision-making tree. This paper focuses on the third option of evaluation: reconciliation. The heuristic generates all missing information items. The result then focuses on the comparison and evaluation of the best ways to repay the debt, including a solution for the future continuation of the company currently in liquidation and quantification of the percentage refund of creditors' claims. A realistic case study is presented in full detail. Further introduction of decision making with uncertainties in insolvency proceedings. Methodology/methods: Solving a decision tree with partial ignorance of probabilities using reconciliation. Scientific aim: Comparison and evaluation of the best ways to repay the debt, including a solution for the future continuation of the company currently in liquidation and quantification of the percentage refund of creditors' claims. Findings: Predictions of future actions in dealing with the insolvency act and bankruptcy hearing, and quicker and more effective agreement on compromises among all creditors and the debtor. Conclusions: Finding the best way of repayment and avoiding termination for both interested parties (creditor and debtor).

  19. A greedy algorithm for construction of decision trees for tables with many-valued decisions - A comparative study

    KAUST Repository

    Azad, Mohammad

    2013-11-25

    In the paper, we study a greedy algorithm for construction of decision trees. This algorithm is applicable to decision tables with many-valued decisions where each row is labeled with a set of decisions. For a given row, we should find a decision from the set attached to this row. Experimental results for data sets from UCI Machine Learning Repository and randomly generated tables are presented. We make a comparative study of the depth and average depth of the constructed decision trees for proposed approach and approach based on generalized decision. The obtained results show that the proposed approach can be useful from the point of view of knowledge representation and algorithm construction.

  20. Prune Belly Syndrome

    African Journals Online (AJOL)

    User

    a rare case and review of literature. European Journal of Plastic Surgery 35, 241-243. Moerman P., Fryns J.P., Goddeeris P. and Lauweryns J.M. (1984) Pathogenesis of the Prune-Belly Syndrome: A Functional Urethral Obstruction Caused by Prostatic Hypoplasia. PEDIATRICS 73, 470-475. Okeniyi J.A., Ogunlesi T.A, ...

  1. Application of alternating decision trees in selecting sparse linear solvers

    KAUST Repository

    Bhowmick, Sanjukta

    2010-01-01

    The solution of sparse linear systems, a fundamental and resource-intensive task in scientific computing, can be approached through multiple algorithms. Using an algorithm well adapted to the characteristics of the task can significantly enhance performance, for example by reducing the time required for the operation, without compromising the quality of the result. However, the best solution method can vary even across linear systems generated in the course of the same PDE-based simulation, thereby making solver selection a very challenging problem. In this paper, we use a machine learning technique, Alternating Decision Trees (ADT), to select efficient solvers based on the properties of sparse linear systems and runtime-dependent features, such as the stages of simulation. We demonstrate the effectiveness of this method through empirical results over linear systems drawn from computational fluid dynamics and magnetohydrodynamics applications. The results also demonstrate that using ADT can resolve the problem of over-fitting, which occurs when only a limited amount of data is available. © 2010 Springer Science+Business Media LLC.

  2. Learning from examples - Generation and evaluation of decision trees for software resource analysis

    Science.gov (United States)

    Selby, Richard W.; Porter, Adam A.

    1988-01-01

    A general solution method for the automatic generation of decision (or classification) trees is investigated. The approach is to provide insights through in-depth empirical characterization and evaluation of decision trees for software resource data analysis. The trees identify classes of objects (software modules) that had high development effort. Sixteen software systems ranging from 3,000 to 112,000 source lines were selected for analysis from a NASA production environment. The collection and analysis of 74 attributes (or metrics), for over 4,700 objects, captured information about the development effort, faults, changes, design style, and implementation style. A total of 9,600 decision trees were automatically generated and evaluated. The trees correctly identified 79.3 percent of the software modules that had high development effort or faults, and the trees generated from the best parameter combinations correctly identified 88.4 percent of the modules on the average.

  3. PERFORMANCE EVALUATION OF C-FUZZY DECISION TREE BASED IDS WITH DIFFERENT DISTANCE MEASURES

    Directory of Open Access Journals (Sweden)

    Vinayak Mantoor

    2012-01-01

    Full Text Available With the ever-increasing growth of computer networks and the emergence of electronic commerce in recent years, computer security has become a priority. An intrusion detection system (IDS) is often used as another wall of protection in addition to intrusion prevention techniques. This paper introduces a concept and design of decision trees based on fuzzy clustering. Fuzzy clustering is the core functional part of the overall decision tree development, and the developed trees are referred to as C-fuzzy decision trees. The distance measure plays an important role in clustering data points. Choosing the right distance measure for a given dataset is a non-trivial problem. In this paper, we study the performance of a C-fuzzy decision tree based IDS with different distance measures. We analyzed the results of our study using KDD Cup 1999 data and compared the accuracy of the classifier with different distance measures.

  4. A decision tree for differentiating multiple system atrophy from Parkinson's disease using 3-T MR imaging.

    Science.gov (United States)

    Nair, Shalini Rajandran; Tan, Li Kuo; Mohd Ramli, Norlisah; Lim, Shen Yang; Rahmat, Kartini; Mohd Nor, Hazman

    2013-06-01

    To develop a decision tree based on standard magnetic resonance imaging (MRI) and diffusion tensor imaging to differentiate multiple system atrophy (MSA) from Parkinson's disease (PD). 3-T brain MRI and DTI (diffusion tensor imaging) were performed on 26 PD and 13 MSA patients. Regions of interest (ROIs) were the putamen, substantia nigra, pons, middle cerebellar peduncles (MCP) and cerebellum. Linear, volumetry and DTI (fractional anisotropy and mean diffusivity) were measured. A three-node decision tree was formulated, with design goals being 100 % specificity at node 1, 100 % sensitivity at node 2 and highest combined sensitivity and specificity at node 3. Nine parameters (mean width, fractional anisotropy (FA) and mean diffusivity (MD) of MCP; anteroposterior diameter of pons; cerebellar FA and volume; pons and mean putamen volume; mean FA substantia nigra compacta-rostral) showed statistically significant (P decision tree. Threshold values were 14.6 mm, 21.8 mm and 0.55, respectively. Overall performance of the decision tree was 92 % sensitivity, 96 % specificity, 92 % PPV and 96 % NPV. Twelve out of 13 MSA patients were accurately classified. Formation of the decision tree using these parameters was both descriptive and predictive in differentiating between MSA and PD. • Parkinson's disease and multiple system atrophy can be distinguished on MR imaging. • Combined conventional MRI and diffusion tensor imaging improves the accuracy of diagnosis. • A decision tree is descriptive and predictive in differentiating between clinical entities. • A decision tree can reliably differentiate Parkinson's disease from multiple system atrophy.
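
    The reported performance figures can be checked with a short worked calculation. The abstract gives 13 MSA and 26 PD patients and states that 12 of the 13 MSA patients were correctly classified. The split of the PD group into 25 correct and 1 incorrect is not stated; it is inferred here from the reported 96 % specificity, and MSA is assumed to be the positive class.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard diagnostic accuracy measures from a 2x2 confusion matrix."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }

# tp and fn come from the abstract (12 of 13 MSA patients correct).
# tn=25 and fp=1 are an inference from 96 % specificity over 26 PD patients.
m = diagnostic_metrics(tp=12, fn=1, tn=25, fp=1)
for name, value in m.items():
    print(f"{name}: {value:.1%}")
```

    Under these counts the four values round to 92 %, 96 %, 92 % and 96 %, which agrees with the figures quoted in the abstract.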

  5. Iron Supplementation and Altitude: Decision Making Using a Regression Tree

    Directory of Open Access Journals (Sweden)

    Laura A. Garvican-Lewis, Andrew D. Govus, Peter Peeling, Chris R. Abbiss, Christopher J. Gore

    2016-03-01

    Full Text Available Altitude exposure increases the body’s need for iron (Gassmann and Muckenthaler, 2015), primarily to support accelerated erythropoiesis, yet clear supplementation guidelines do not exist. Athletes are typically recommended to ingest a daily oral iron supplement to facilitate altitude adaptations, and to help maintain iron balance. However, there is some debate as to whether athletes with otherwise healthy iron stores should be supplemented, due in part to concerns of iron overload. Excess iron in vital organs is associated with an increased risk of a number of conditions including cancer, liver disease and heart failure. Therefore clear guidelines are warranted and athletes should be discouraged from ‘self-prescribing’ supplementation without medical advice. In the absence of prospective-controlled studies, decision tree analysis can be used to describe a data set, with the resultant regression tree serving as a guide for clinical decision making. Here, we present a regression tree in the context of iron supplementation during altitude exposure, to examine the association between pre-altitude ferritin (Ferritin-Pre) and the haemoglobin mass (Hbmass) response, based on daily iron supplement dose. De-identified ferritin and Hbmass data from 178 athletes engaged in altitude training were extracted from the Australian Institute of Sport (AIS) database. Altitude exposure was predominantly achieved via normobaric Live high: Train low (n = 147) at a simulated altitude of 3000 m for 2 to 4 weeks. The remaining athletes engaged in natural altitude training at venues ranging from 1350 to 2800 m for 3-4 weeks. Thus, the “hypoxic dose” ranged from ~890 km.h to ~1400 km.h. Ethical approval was granted by the AIS Human Ethics Committee, and athletes provided written informed consent. An in-depth description and traditional analysis of the complete data set is presented elsewhere (Govus et al., 2015). Iron supplementation was prescribed by a sports physician

  6. Discovering Decision Knowledge from Web Log Portfolio for Managing Classroom Processes by Applying Decision Tree and Data Cube Technology.

    Science.gov (United States)

    Chen, Gwo-Dong; Liu, Chen-Chung; Ou, Kuo-Liang; Liu, Baw-Jhiune

    2000-01-01

    Discusses the use of Web logs to record student behavior that can assist teachers in assessing performance and making curriculum decisions for distance learning students who are using Web-based learning systems. Adopts decision tree and data cube information processing methodologies for developing more effective pedagogical strategies. (LRW)

  7. Using Decision Trees to Detect and Isolate Simulated Leaks in the J-2X Rocket Engine

    Science.gov (United States)

    Schwabacher, Mark A.; Aguilar, Robert; Figueroa, Fernando F.

    2009-01-01

    The goal of this work was to use data-driven methods to automatically detect and isolate faults in the J-2X rocket engine. It was decided to use decision trees, since they tend to be easier to interpret than other data-driven methods. The decision tree algorithm automatically "learns" a decision tree by performing a search through the space of possible decision trees to find one that fits the training data. The particular decision tree algorithm used is known as C4.5. Simulated J-2X data from a high-fidelity simulator developed at Pratt & Whitney Rocketdyne and known as the Detailed Real-Time Model (DRTM) was used to "train" and test the decision tree. Fifty-six DRTM simulations were performed for this purpose, with different leak sizes, different leak locations, and different times of leak onset. To make the simulations as realistic as possible, they included simulated sensor noise, and included a gradual degradation in both fuel and oxidizer turbine efficiency. A decision tree was trained using 11 of these simulations, and tested using the remaining 45 simulations. In the training phase, the C4.5 algorithm was provided with labeled examples of data from nominal operation and data including leaks in each leak location. From the data, it "learned" a decision tree that can classify unseen data as having no leak or having a leak in one of the five leak locations. In the test phase, the decision tree produced very low false alarm rates and low missed detection rates on the unseen data. It had very good fault isolation rates for three of the five simulated leak locations, but it tended to confuse the remaining two locations, perhaps because a large leak at one of these two locations can look very similar to a small leak at the other location.
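
    The abstract above reports false alarm rates and missed detection rates without defining them. As a minimal sketch, assuming the common definitions (a false alarm is a nominal sample classified as any leak; a missed detection is a leak sample classified as nominal), the two rates could be computed as follows. The function name, the label strings and the toy data are hypothetical and do not come from the paper.

```python
def detection_rates(true_labels, predicted_labels, nominal="nominal"):
    """Return (false_alarm_rate, missed_detection_rate).

    Assumed definitions (not taken from the paper):
      false alarm      = nominal sample predicted as some leak
      missed detection = leak sample predicted as nominal
    A leak assigned to the wrong location counts as detected here;
    it would be an isolation error, which this sketch does not score.
    """
    nominal_total = false_alarms = 0
    fault_total = missed = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == nominal:
            nominal_total += 1
            if pred != nominal:
                false_alarms += 1
        else:
            fault_total += 1
            if pred == nominal:
                missed += 1
    far = false_alarms / nominal_total if nominal_total else 0.0
    mdr = missed / fault_total if fault_total else 0.0
    return far, mdr


# Hypothetical labels, for illustration only.
truth = ["nominal", "nominal", "leak_A", "leak_B", "nominal", "leak_A"]
preds = ["nominal", "leak_A", "leak_A", "nominal", "nominal", "leak_B"]
far, mdr = detection_rates(truth, preds)
print(f"false alarm rate={far:.3f}, missed detection rate={mdr:.3f}")
```

    Keeping isolation errors separate from missed detections mirrors the abstract, which reports fault isolation as its own result.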

  8. An Analysis on Performance of Decision Tree Algorithms using Student’s Qualitative Data

    OpenAIRE

    T.Miranda Lakshmi; A.Martin; R.Mumtaj Begum; V.Prasanna Venkatesan

    2013-01-01

    Decision Tree is the most widely applied supervised classification technique. The learning and classification steps of decision tree induction are simple and fast and it can be applied to any domain. In this research student qualitative data has been taken from educational data mining and the performance of the decision tree algorithms ID3, C4.5 and CART is compared. The comparison result shows that the Gini Index of CART influences the information Gain Ratio of ID3 and C4.5. The classif...

  9. An Applied Research of Decision Tree Algorithm in Track and Field Equipment Training

    Directory of Open Access Journals (Sweden)

    Liu Shaoqing

    2015-01-01

    Full Text Available This paper studies the application of the ID3 decision tree algorithm to track and field equipment training. For the selection of the decision tree attributes, track and field equipment is divided, according to its properties, into track training equipment, field events training equipment and auxiliary training equipment. By selecting the data and by applying and optimizing the ID3 algorithm model, a decision tree with track training equipment as the root node was obtained at a reduced computation cost.

  10. Induction and pruning of classification rules for prediction of microseismic hazards in coal mines

    Energy Technology Data Exchange (ETDEWEB)

    Sikora, M. [Silesian Technical University, Gliwice (Poland)]

    2011-06-15

    The paper presents the results of applying a rule induction and pruning algorithm to the classification of a microseismic hazard state in coal mines. Due to the imbalanced distribution of examples describing the states 'hazardous' and 'safe', a special algorithm was used for rule induction and pruning. The algorithm selects optimal values of the parameters influencing rule induction and pruning based on training and tuning sets. The basic parameter of the algorithm is a rule quality measure, which determines the form and classification abilities of the induced rules. The specificity and sensitivity of a classifier were used to evaluate its quality. The tests conducted show that the adopted method of rule induction and classifier quality evaluation gives better classification of microseismic hazards than the methods currently used in mining practice. Results obtained by the rule-based classifier were also compared with results obtained by a decision tree induction algorithm and by a neuro-fuzzy system.

  11. Supervised hashing using graph cuts and boosted decision trees.

    Science.gov (United States)

    Lin, Guosheng; Shen, Chunhua; Hengel, Anton van den

    2015-11-01

    To build large-scale query-by-example image retrieval systems, embedding image features into a binary Hamming space provides great benefits. Supervised hashing aims to map the original features to compact binary codes that are able to preserve label based similarity in the binary Hamming space. Most existing approaches apply a single form of hash function, and an optimization process which is typically deeply coupled to this specific form. This tight coupling restricts the flexibility of those methods, and can result in complex optimization problems that are difficult to solve. In this work we proffer a flexible yet simple framework that is able to accommodate different types of loss functions and hash functions. The proposed framework allows a number of existing approaches to hashing to be placed in context, and simplifies the development of new problem-specific hashing methods. Our framework decomposes the hashing learning problem into two steps: binary code (hash bit) learning and hash function learning. The first step can typically be formulated as binary quadratic problems, and the second step can be accomplished by training a standard binary classifier. For solving large-scale binary code inference, we show how it is possible to ensure that the binary quadratic problems are submodular such that efficient graph cut methods may be used. To achieve efficiency as well as efficacy on large-scale high-dimensional data, we propose to use boosted decision trees as the hash functions, which are nonlinear, highly descriptive, and are very fast to train and evaluate. Experiments demonstrate that the proposed method significantly outperforms most state-of-the-art methods, especially on high-dimensional data.

  12. Decision tree analysis in subarachnoid hemorrhage: prediction of outcome parameters during the course of aneurysmal subarachnoid hemorrhage using decision tree analysis.

    Science.gov (United States)

    Hostettler, Isabel Charlotte; Muroi, Carl; Richter, Johannes Konstantin; Schmid, Josef; Neidert, Marian Christoph; Seule, Martin; Boss, Oliver; Pangalu, Athina; Germans, Menno Robbert; Keller, Emanuela

    2018-01-19

    OBJECTIVE The aim of this study was to create prediction models for outcome parameters by decision tree analysis based on clinical and laboratory data in patients with aneurysmal subarachnoid hemorrhage (aSAH). METHODS The database consisted of clinical and laboratory parameters of 548 patients with aSAH who were admitted to the Neurocritical Care Unit, University Hospital Zurich. To examine the model performance, the cohort was randomly divided into a derivation cohort (60% [n = 329]; training data set) and a validation cohort (40% [n = 219]; test data set). The classification and regression tree prediction algorithm was applied to predict death, functional outcome, and ventriculoperitoneal (VP) shunt dependency. Chi-square automatic interaction detection was applied to predict delayed cerebral infarction on days 1, 3, and 7. RESULTS The overall mortality was 18.4%. The accuracy of the decision tree models was good for survival on day 1 and favorable functional outcome at all time points, with a difference between the training and test data sets of decision trees enables exploration of dependent variables in the context of multiple changing influences over the course of an illness. The decision tree currently generated increases awareness of the early systemic stress response, which is seemingly pertinent for prognostication.

  13. Detecting road maps for capacity utilization decisions by Clustering Analysis and CHAID Decision Trees.

    Science.gov (United States)

    Koyuncugil, Ali Serhan; Ozgulbas, Nermin

    2010-08-01

    The aims of this study are to provide a standard CUR value, to determine the financial and organizational factors that affect capacity utilization, and to develop road maps for increasing capacity utilization. To reach these aims by an objective method, we used data mining, which discovers hidden and useful patterns in large amounts of data. Two different data mining methods were used in two stages. In the first stage, the standard value of CUR was determined by K-means Clustering Analysis. In the second stage, the CHAID Decision Tree Algorithm was used to determine the impact factors that provided the steps for the road maps. The study covered 592 Turkish Ministry of Health public hospitals and used financial and operational data for the year 2004. Finally, two different road maps were developed and suggestions were made according to the results of the study.

  14. Ensemble Pruning for Glaucoma Detection in an Unbalanced Data Set.

    Science.gov (United States)

    Adler, Werner; Gefeller, Olaf; Gul, Asma; Horn, Folkert K; Khan, Zardad; Lausen, Berthold

    2016-12-07

    Random forests are successful classifier ensemble methods consisting of typically 100 to 1000 classification trees. Ensemble pruning techniques reduce the computational cost, especially the memory demand, of random forests by reducing the number of trees without relevant loss of performance or even with increased performance of the sub-ensemble. The application to the problem of an early detection of glaucoma, a severe eye disease with low prevalence, based on topographical measurements of the eye background faces specific challenges. We examine the performance of ensemble pruning strategies for glaucoma detection in an unbalanced data situation. The data set consists of 102 topographical features of the eye background of 254 healthy controls and 55 glaucoma patients. We compare the area under the receiver operating characteristic curve (AUC), and the Brier score on the total data set, in the majority class, and in the minority class of pruned random forest ensembles obtained with strategies based on the prediction accuracy of greedily grown sub-ensembles, the uncertainty weighted accuracy, and the similarity between single trees. To validate the findings and to examine the influence of the prevalence of glaucoma in the data set, we additionally perform a simulation study with lower prevalences of glaucoma. In glaucoma classification all three pruning strategies lead to improved AUC and smaller Brier scores on the total data set with sub-ensembles as small as 30 to 80 trees compared to the classification results obtained with the full ensemble consisting of 1000 trees. In the simulation study, we were able to show that the prevalence of glaucoma is a critical factor and lower prevalence decreases the performance of our pruning strategies. 
    The memory demand for glaucoma classification in an unbalanced data situation based on random forests could effectively be reduced by the application of pruning strategies without loss of performance in a population with increased
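
    The abstract above evaluates pruned ensembles with the Brier score on the total data set and separately in the majority and minority classes. A minimal sketch of that metric, assuming the standard binary definition (mean squared difference between the predicted probability of the positive class and the 0/1 outcome), might look like this. The function names and the toy probabilities are hypothetical, not the authors' code or data.

```python
def brier_score(probabilities, outcomes):
    """Mean squared error between predicted P(y=1) and the 0/1 outcome."""
    if len(probabilities) != len(outcomes) or not outcomes:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for p, y in zip(probabilities, outcomes)) / len(outcomes)


def brier_by_class(probabilities, outcomes):
    """Brier score computed separately within class 0 and class 1."""
    scores = {}
    for label in (0, 1):
        pairs = [(p, y) for p, y in zip(probabilities, outcomes) if y == label]
        if pairs:  # skip a class that does not occur in the sample
            scores[label] = sum((p - y) ** 2 for p, y in pairs) / len(pairs)
    return scores


# Hypothetical ensemble outputs: three controls (0) and one patient (1).
probs = [0.1, 0.2, 0.3, 0.8]
labels = [0, 0, 0, 1]
print(brier_score(probs, labels))
print(brier_by_class(probs, labels))
```

    Reporting the score per class matters in an unbalanced sample, because the overall value is dominated by the majority class.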

  15. Using decision trees to enhance interdisciplinary team work: the case of oncofertility.

    Science.gov (United States)

    Gardino, Shauna L; Jeruss, Jacqueline S; Woodruff, Teresa K

    2010-05-01

    Oncofertility, an emerging discipline at the intersection of cancer and fertility, strives to give cancer patients options when they are confronting potential infertility as a consequence of cancer treatment. Fertility preservation decisions must be made before treatment begins, adding stress to the decision-making process. Healthcare providers need to be aware of the intricacies involved in oncofertility decision making, and the often tight time line that patients face when making these decisions. Cancer patients' perspectives may also change, as the dual burden of a cancer diagnosis and potential infertility can cause great flux in emotions. A provider-facing decision tree was created to enhance patient decision-making capacities and outline the multiple potential intervention points. Decision trees, which highlight the important decision points during which providers can approach patients, can be a useful tool to help providers in counseling patients on fertility preservation.

  16. Using Decision Trees to Detect and Isolate Leaks in the J-2X

    Data.gov (United States)

    National Aeronautics and Space Administration — Full title: Using Decision Trees to Detect and Isolate Simulated Leaks in the J-2X Rocket Engine Mark Schwabacher, NASA Ames Research Center Robert Aguilar, Pratt...

  17. Detecting Structural Metadata with Decision Trees and Transformation-Based Learning

    National Research Council Canada - National Science Library

    Kim, Joungbum; Schwarm, Sarah E; Ostendorf, Mari

    2004-01-01

    .... Specifically, combinations of decision trees and language models are used to predict sentence ends and interruption points and given these events transformation based learning is used to detect edit...

  18. Statistical Sensitive Data Protection and Inference Prevention with Decision Tree Methods

    National Research Council Canada - National Science Library

    Chang, LiWu

    2003-01-01

    .... We consider inference as correct classification and approach it with decision tree methods. As in our previous work, sensitive data are viewed as classes of those test data and non-sensitive data are the rest attribute values...

  19. A Decision Tree Approach to the Interpretation of Multivariate Statistical Techniques.

    Science.gov (United States)

    Fok, Lillian Y.; And Others

    1995-01-01

    Discusses the nature, power, and limitations of four multivariate techniques: factor analysis, multiple analysis of variance, multiple regression, and multiple discriminant analysis. Shows how decision trees assist in interpreting results. (SK)

  20. Speech recognition based on statistical models including multiple phonetic decision trees

    National Research Council Canada - National Science Library

    Shiota, Sayaka; Hashimoto, Kei; Zen, Heiga; Nankaku, Yoshihiko; Lee, Akinobu; Tokuda, Keiichi

    2011-01-01

    We propose a speech recognition technique using multiple model structures. In the use of context-dependent models, decision-tree-based context clustering is applied to find an appropriate parameter tying structure...

  1. Tips for teachers of evidence-based medicine: making sense of decision analysis using a decision tree.

    Science.gov (United States)

    Lee, Anna; Joynt, Gavin M; Ho, Anthony M H; Keitz, Sheri; McGinn, Thomas; Wyer, Peter C

    2009-05-01

    Decision analysis is a tool that clinicians can use to choose an option that maximizes the overall net benefit to a patient. It is an explicit, quantitative, and systematic approach to decision making under conditions of uncertainty. In this article, we present two teaching tips aimed at helping clinical learners understand the use and relevance of decision analysis. The first tip demonstrates the structure of a decision tree. With this tree, a clinician may identify the optimal choice among complicated options by calculating probabilities of events and incorporating patient valuations of possible outcomes. The second tip demonstrates how to address uncertainty regarding the estimates used in a decision tree. We field tested the tips twice with interns and senior residents. Teacher preparatory time was approximately 90 minutes. The field test utilized a board and a calculator. Two handouts were prepared. Learners identified the importance of incorporating values into the decision-making process as well as the role of uncertainty. The educational objectives appeared to be reached. These teaching tips introduce clinical learners to decision analysis in a fashion aimed to illustrate principles of clinical reasoning and how patient values can be actively incorporated into complex decision making.
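
    The two teaching tips above (calculating the optimal choice from probabilities and patient valuations, then addressing uncertainty in the estimates) can be sketched in a few lines. This is a hedged illustration only: the tree layout, the option names, the probabilities and the utilities are all invented for the example and do not come from the article.

```python
def expected_value(node):
    """Roll back a decision tree: average over chance nodes, maximise at decisions."""
    kind = node["type"]
    if kind == "outcome":
        return node["utility"]
    if kind == "chance":
        total_p = sum(p for p, _ in node["branches"])
        if abs(total_p - 1.0) > 1e-9:
            raise ValueError("chance-node probabilities must sum to 1")
        return sum(p * expected_value(child) for p, child in node["branches"])
    if kind == "decision":
        return max(expected_value(child) for _, child in node["options"])
    raise ValueError(f"unknown node type: {kind}")


def best_option(decision_node):
    """Name of the option with the highest expected value."""
    return max(decision_node["options"], key=lambda opt: expected_value(opt[1]))[0]


def build_tree(p_cure):
    """Hypothetical two-option tree; p_cure is the uncertain estimate."""
    outcome = lambda u: {"type": "outcome", "utility": u}
    treat = {"type": "chance",
             "branches": [(p_cure, outcome(0.95)), (1 - p_cure, outcome(0.5))]}
    observe = {"type": "chance",
               "branches": [(0.7, outcome(1.0)), (0.3, outcome(0.4))]}
    return {"type": "decision", "options": [("treat", treat), ("observe", observe)]}


# Tip 1: pick the option with the highest expected utility.
print(best_option(build_tree(0.9)))

# Tip 2: one-way sensitivity analysis on the uncertain probability.
for p in (0.6, 0.7, 0.8, 0.9):
    print(p, best_option(build_tree(p)))
```

    Varying a single estimate while holding the others fixed shows where the preferred option changes, which is the kind of uncertainty the second tip addresses.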

  2. Real-Time Speech/Music Classification With a Hierarchical Oblique Decision Tree

    Science.gov (United States)

    2008-04-01

    REAL-TIME SPEECH/MUSIC CLASSIFICATION WITH A HIERARCHICAL OBLIQUE DECISION TREE Jun Wang, Qiong Wu, Haojiang Deng, Qin Yan Institute of Acoustics...time speech/music classification with a hierarchical oblique decision tree. A set of discrimination features in frequency domain are selected...handle signals without discrimination and cannot work properly in the existence of multimedia signals. This paper proposes a real-time speech/music

  3. Analyzing brain signals using decision trees: an approach based on neuroscience

    OpenAIRE

    Diana Francisca Adamatti; Josimara Silveira; Fernanda de Carvalho

    2016-01-01

    This paper presents a case study of the treatment of brain signals using decision trees to classify these signals, which are then analyzed based on neuroscience. We collected brain signals from 3 subjects during an imagination task and classified these signals using decision trees, a supervised machine learning method. To analyze the processed data on the basis of neuroscience, we defined a matching between the electrode positions and the corresponding brain functions. The results ...

  4. Development of a New Decision Tree to Rapidly Screen Chemical Estrogenic Activities of Xenopus laevis.

    Science.gov (United States)

    Wang, Ting; Li, Weiying; Zheng, Xiaofeng; Lin, Zhifen; Kong, Deyang

    2014-02-01

    During the past decades, there has been an increasing number of studies on the estrogenic activities of environmental pollutants on amphibians, and many determination methods have been proposed. However, these determination methods are time-consuming and expensive, and a rapid and simple method to screen and test chemicals for estrogenic activities in amphibians is therefore imperative. Herein is proposed a new decision tree formulated not only with physicochemical parameters but also a biological parameter that was successfully used to screen estrogenic activities of the chemicals on amphibians. The biological parameter, CDOCKER interaction energy (Ebinding) between chemicals and the target proteins, was calculated based on the method of molecular docking, and it was used to revise the decision tree formulated by Hong only with physicochemical parameters for screening estrogenic activity of chemicals in rat. According to the correlation between Ebinding of rat and Xenopus laevis, a new decision tree for estrogenic activities in Xenopus laevis is finally proposed. It was then validated using 8 randomly selected chemicals to which Xenopus laevis can frequently be exposed, and the agreement between the results from the new decision tree and those from experiments is generally satisfactory. Consequently, the new decision tree can be used to screen the estrogenic activities of the chemicals, and the combined use of Ebinding and classical physicochemical parameters can greatly improve Hong's decision tree. Copyright © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  5. Metric Sex Determination of the Human Coxal Bone on a Virtual Sample using Decision Trees.

    Science.gov (United States)

    Savall, Frédéric; Faruch-Bilfeld, Marie; Dedouit, Fabrice; Sans, Nicolas; Rousseau, Hervé; Rougé, Daniel; Telmon, Norbert

    2015-11-01

    Decision trees provide an alternative to multivariate discriminant analysis, which is still the most commonly used in anthropometric studies. Our study analyzed the metric characterization of a recent virtual sample of 113 coxal bones using decision trees for sex determination. From 17 osteometric type I landmarks, a dataset was built with five classic distances traditionally reported in the literature and six new distances selected using the two-step ratio method. A ten-fold cross-validation was performed, and a decision tree was established on two subsamples (training and test sets). The decision tree established on the training set included three nodes and its application to the test set correctly classified 92% of individuals. This percentage was similar to the data of the literature. The usefulness of decision trees has been demonstrated in numerous fields. They have been already used in sex determination, body mass prediction, and ancestry estimation. This study shows another use of decision trees enabling simple and accurate sex determination. © 2015 American Academy of Forensic Sciences.

  6. Total Path Length and Number of Terminal Nodes for Decision Trees

    KAUST Repository

    Hussain, Shahid

    2014-09-13

    This paper presents a new tool for the study of relationships between total path length (average depth) and number of terminal nodes for decision trees. These relationships are important from the point of view of optimization of decision trees. In this particular case of total path length and number of terminal nodes, the relationships between these two cost functions are closely related to the space-time trade-off. In addition to an algorithm to compute the relationships, the paper also presents results of experiments with datasets from the UCI ML Repository. These experiments show how the two cost functions behave for a given decision table, and the resulting plots show the Pareto frontier or Pareto set of optimal points. Furthermore, in some cases this Pareto frontier is a singleton, showing the total optimality of decision trees for the given decision table.
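
    To make the two cost functions above concrete, the sketch below computes them for one small tree. It assumes that total path length is the sum, over all rows of the decision table, of the depth of the terminal node each row reaches, so that average depth is that sum divided by the number of rows. The abstract does not spell out this definition, so treat it as an assumption. The nested-dict tree and the row counts are hypothetical.

```python
def tree_costs(node, depth=0):
    """Return (total_path_length, terminal_nodes, rows) for a nested-dict tree.

    A terminal node is {"rows": n}, where n is the number of decision-table
    rows that reach it; an internal node is {"children": [...]}.
    """
    if "rows" in node:  # terminal node
        return depth * node["rows"], 1, node["rows"]
    total_path_length = terminal_nodes = rows = 0
    for child in node["children"]:
        c_len, c_leaves, c_rows = tree_costs(child, depth + 1)
        total_path_length += c_len
        terminal_nodes += c_leaves
        rows += c_rows
    return total_path_length, terminal_nodes, rows


# Hypothetical tree over 8 rows: one shallow leaf and two deeper leaves.
tree = {"children": [
    {"rows": 4},
    {"children": [{"rows": 2}, {"rows": 2}]},
]}
length, leaves, rows = tree_costs(tree)
print(length, leaves, length / rows)  # total path length, terminal nodes, average depth
```

    Computing this pair for every candidate tree and keeping only the non-dominated pairs would give the Pareto set the abstract refers to; the paper's own algorithm for doing so is not reproduced here.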

  7. Factors affecting branch wound occlusion and associated decay following pruning – a case study with wild cherry (Prunus avium L.)

    Directory of Open Access Journals (Sweden)

    Jonathan Sheppard

    2016-11-01

    Full Text Available Pruning wild cherry (Prunus avium L.) is a common silvicultural practice carried out to produce valuable timber at a veneer wood quality. Sub-optimal pruning treatments can permit un-occluded pruning wounds to develop devaluing decay. The aim of this study is to determine relevant branch, tree and pruning characteristics affecting the occlusion process of pruning wounds. Important factors influencing occlusion time for an optimised pruning treatment for valuable timber production utilising wild cherry are derived. 85 artificially pruned branches originating from ten wild cherry trees were retrospectively analysed. Branch stub length, branch diameter and radial stem increment during occlusion were found to be significant predictors for occlusion time. From the results it could be concluded that for the long term success of artificial pruning of wild cherry it is crucial (i) to keep branch stubs short (while avoiding damage to the branch collar), (ii) to enable the tree to maintain significant radial growth after pruning, and (iii) to avoid large pruning wounds (>2.5 cm) by removing steeply angled and fast growing branches at an early stage.

  8. Assessing the impact of attendance in students’ final success using the Decision-Making Tree

    Directory of Open Access Journals (Sweden)

    ALIJA Sadri

    2018-01-01

    Full Text Available In this paper, we use the decision-making tree to explain the impact attendance has on students’ final success. The paper analyses the results of 56 students in 3 subjects during the academic year 2016/2017 (first, second and third-year students of Business Mathematics, Statistics and Managerial Economics at the SEE University in Tetovo). The results show that attendance is the most important of the 5 attributes in this study, placing it at the root of the tree. In constructing the Decision-making Tree, we have used the ID3 Algorithm within the Weka software package.
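
    ID3 places at the root the attribute with the highest information gain, which is why the most informative attribute in the abstract above ends up at the root of the tree. The sketch below shows that selection step in plain Python rather than in Weka. The four student records and the attribute names are invented for illustration and are not the study's data.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the target minus its weighted entropy after splitting on attribute."""
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


def choose_root(rows, attributes, target):
    """ID3 root selection: the attribute with the largest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Hypothetical records: attendance separates the classes, year does not.
students = [
    {"attendance": "high", "year": 1, "result": "pass"},
    {"attendance": "high", "year": 2, "result": "pass"},
    {"attendance": "low", "year": 1, "result": "fail"},
    {"attendance": "low", "year": 2, "result": "fail"},
]
print(choose_root(students, ["attendance", "year"], "result"))
```

    A full ID3 implementation would apply the same selection recursively to each subset until the labels are pure or no attributes remain.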

  9. GENERATION OF 2D LAND COVER MAPS FOR URBAN AREAS USING DECISION TREE CLASSIFICATION

    DEFF Research Database (Denmark)

    Höhle, Joachim

    2014-01-01

    A 2D land cover map can automatically and efficiently be generated from high-resolution multispectral aerial images. First, a digital surface model is produced and each cell of the elevation model is then supplemented with attributes. A decision tree classification is applied to extract map objects...... like buildings, roads, grassland, trees, hedges, and walls from such an ‘intelligent’ point cloud. The decision tree is derived from training areas whose borders are digitized on top of a false-colour orthoimage. The produced 2D land cover map with six classes is then subsequently refined by using...

  10. Money laundering regulatory risk evaluation using Bitmap Index-based Decision Tree

    OpenAIRE

    Jayasree, Vikas; R.V. Siva Balan

    2017-01-01

    This paper proposes to evaluate the adaptability risk in money laundering using Bitmap Index-based Decision Tree (BIDT) technique. Initially, the Bitmap Index-based Decision Tree learning is used to induce the knowledge tree which helps to determine a company’s money laundering risk and improve scalability. A bitmap index in BIDT is used to effectively access large banking databases. In a BIDT bitmap index, account in a table is numbered in sequence with each key value, account number and a b...

  11. Predicting gene function using hierarchical multi-label decision tree ensembles

    Directory of Open Access Journals (Sweden)

    Kocev Dragi

    2010-01-01

    Full Text Available Abstract Background S. cerevisiae, A. thaliana and M. musculus are well-studied organisms in biology and the sequencing of their genomes was completed many years ago. It is still a challenge, however, to develop methods that assign biological functions to the ORFs in these genomes automatically. Different machine learning methods have been proposed to this end, but it remains unclear which method is to be preferred in terms of predictive performance, efficiency and usability. Results We study the use of decision tree based models for predicting the multiple functions of ORFs. First, we describe an algorithm for learning hierarchical multi-label decision trees. These can simultaneously predict all the functions of an ORF, while respecting a given hierarchy of gene functions (such as FunCat or GO). We present new results obtained with this algorithm, showing that the trees found by it exhibit clearly better predictive performance than the trees found by previously described methods. Nevertheless, the predictive performance of individual trees is lower than that of some recently proposed statistical learning methods. We show that ensembles of such trees are more accurate than single trees and are competitive with state-of-the-art statistical learning and functional linkage methods. Moreover, the ensemble method is computationally efficient and easy to use. Conclusions Our results suggest that decision tree based methods are a state-of-the-art, efficient and easy-to-use approach to ORF function prediction.

  12. Application of decision tree model for the ground subsidence hazard mapping near abandoned underground coal mines.

    Science.gov (United States)

    Lee, Saro; Park, Inhye

    2013-09-30

    Subsidence of ground caused by underground mines poses hazards to human life and property. This study analyzed the ground subsidence hazard using factors that can affect ground subsidence and a decision tree approach in a geographic information system (GIS). The study area was Taebaek, Gangwon-do, Korea, where many abandoned underground coal mines exist. Spatial data, topography, geology, and various ground-engineering data for the subsidence area were collected and compiled in a database for mapping ground-subsidence hazard (GSH). The subsidence area was randomly split 50/50 for training and validation of the models. A data-mining classification technique was applied to the GSH mapping, and decision trees were constructed using the chi-squared automatic interaction detector (CHAID) and the quick, unbiased, and efficient statistical tree (QUEST) algorithms. The frequency ratio model was also applied to the GSH mapping for comparison with a probabilistic model. The resulting GSH maps were validated using area-under-the-curve (AUC) analysis with the subsidence area data that had not been used for training the model. The highest accuracy was achieved by the decision tree model using the CHAID algorithm (94.01%), compared with the QUEST algorithm (90.37%) and the frequency ratio model (86.70%). These accuracies are higher than previously reported results for decision trees. Decision tree methods can therefore be used efficiently for GSH analysis and might be widely used for prediction of various spatial events. Copyright © 2013. Published by Elsevier Ltd.

  13. MRI-based decision tree model for diagnosis of biliary atresia.

    Science.gov (United States)

    Kim, Yong Hee; Kim, Myung-Joon; Shin, Hyun Joo; Yoon, Haesung; Han, Seok Joo; Koh, Hong; Roh, Yun Ho; Lee, Mi-Jung

    2018-02-23

    To evaluate MRI findings and to generate a decision tree model for diagnosis of biliary atresia (BA) in infants with jaundice. We retrospectively reviewed features of MRI and ultrasonography (US) performed in infants with jaundice between January 2009 and June 2016 under approval of the institutional review board, including the maximum diameter of periportal signal change on MRI (MR triangular cord thickness, MR-TCT) or US (US-TCT), visibility of common bile duct (CBD) and abnormality of gallbladder (GB). Hepatic subcapsular flow was reviewed on Doppler US. We performed conditional inference tree analysis using MRI findings to generate a decision tree model. A total of 208 infants were included, 112 in the BA group and 96 in the non-BA group. Mean age at the time of MRI was 58.7 ± 36.6 days. Visibility of CBD, abnormality of GB and MR-TCT were good discriminators for the diagnosis of BA and the MRI-based decision tree using these findings with MR-TCT cut-off 5.1 mm showed 97.3 % sensitivity, 94.8 % specificity and 96.2 % accuracy. MRI-based decision tree model reliably differentiates BA in infants with jaundice. MRI can be an objective imaging modality for the diagnosis of BA. • MRI-based decision tree model reliably differentiates biliary atresia in neonatal cholestasis. • Common bile duct, gallbladder and periportal signal changes are the discriminators. • MRI has comparable performance to ultrasonography for diagnosis of biliary atresia.
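
    The sensitivity, specificity and accuracy quoted above follow from a 2×2 confusion table. The abstract gives only the group sizes (112 BA, 96 non-BA) and the percentages, so the cell counts used below (109 true positives, 91 true negatives) are inferred here as the integers that reproduce the reported figures. They are an assumption and are not stated in the source.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy (as fractions) from 2x2 counts."""
    sensitivity = tp / (tp + fn)  # diseased cases correctly identified
    specificity = tn / (tn + fp)  # non-diseased cases correctly identified
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Counts inferred from the reported percentages (assumption, see lead-in):
# 112 BA infants -> 109 detected, 3 missed; 96 non-BA -> 91 correct, 5 false positives.
sens, spec, acc = diagnostic_metrics(tp=109, fn=3, tn=91, fp=5)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
```

    With these inferred counts the three values round to 97.3 %, 94.8 % and 96.2 %, matching the abstract.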

  14. Aneurysmal subarachnoid hemorrhage prognostic decision-making algorithm using classification and regression tree analysis.

    Science.gov (United States)

    Lo, Benjamin W Y; Fukuda, Hitoshi; Angle, Mark; Teitelbaum, Jeanne; Macdonald, R Loch; Farrokhyar, Forough; Thabane, Lehana; Levine, Mitchell A H

    2016-01-01

    Classification and regression tree analysis involves the creation of a decision tree by recursive partitioning of a dataset into more homogeneous subgroups. Thus far, there is scarce literature on using this technique to create clinical prediction tools for aneurysmal subarachnoid hemorrhage (SAH). The classification and regression tree analysis technique was applied to the multicenter Tirilazad database (3551 patients) in order to create the decision-making algorithm. In order to elucidate prognostic subgroups in aneurysmal SAH, neurologic, systemic, and demographic factors were taken into account. The dependent variable used for analysis was the dichotomized Glasgow Outcome Score at 3 months. Classification and regression tree analysis revealed seven prognostic subgroups. Neurological grade, occurrence of post-admission stroke, occurrence of post-admission fever, and age represented the explanatory nodes of this decision tree. Split sample validation revealed classification accuracy of 79% for the training dataset and 77% for the testing dataset. In addition, the occurrence of fever at 1-week post-aneurysmal SAH is associated with increased odds of post-admission stroke (odds ratio: 1.83, 95% confidence interval: 1.56-2.45, P tree was generated, which serves as a prediction tool to guide bedside prognostication and clinical treatment decision making. This prognostic decision-making algorithm also shed light on the complex interactions between a number of risk factors in determining outcome after aneurysmal SAH.

  15. Applied Swarm-based medicine: collecting decision trees for patterns of algorithms analysis.

    Science.gov (United States)

    Panje, Cédric M; Glatzer, Markus; von Rappard, Joscha; Rothermundt, Christian; Hundsberger, Thomas; Zumstein, Valentin; Plasswilm, Ludwig; Putora, Paul Martin

    2017-08-16

    The objective consensus methodology has recently been applied in consensus finding in several studies on medical decision-making among clinical experts or guidelines. The main advantages of this method are an automated analysis and comparison of treatment algorithms of the participating centers which can be performed anonymously. Based on the experience from completed consensus analyses, the main steps for the successful implementation of the objective consensus methodology were identified and discussed among the main investigators. The following steps for the successful collection and conversion of decision trees were identified and defined in detail: problem definition, population selection, draft input collection, tree conversion, criteria adaptation, problem re-evaluation, results distribution and refinement, tree finalisation, and analysis. This manuscript provides information on the main steps for successful collection of decision trees and summarizes important aspects at each point of the analysis.

  16. Boundary expansion algorithm of a decision tree induction for an imbalanced dataset

    Directory of Open Access Journals (Sweden)

    Kesinee Boonchuay

    2017-10-01

    Full Text Available A decision tree is one of the best-known classifiers based on a recursive partitioning algorithm. This paper introduces the Boundary Expansion Algorithm (BEA) to improve decision tree induction on imbalanced datasets. BEA utilizes all attributes to define non-splittable ranges. The computed means of all attributes for minority instances are used to find the nearest minority instance, which is then expanded along all attributes to cover a minority region. As a result, BEA copes successfully with imbalanced datasets compared with C4.5, Gini, asymmetric entropy, top-down tree, and Hellinger distance decision tree on 25 imbalanced datasets from the UCI Repository.

  17. Assessing School Readiness for a Practice Arrangement Using Decision Tree Methodology.

    Science.gov (United States)

    Barger, Sara E.

    1998-01-01

    Questions in a decision-tree address mission, faculty interest, administrative support, and practice plan as a way of assessing arrangements for nursing faculty's clinical practice. Decisions should be based on congruence between the human resource allocation and the reward systems. (SK)

  18. Visualization of Decision Tree State for the Classification of Parkinson's Disease

    NARCIS (Netherlands)

    Valentijn, E

    2016-01-01

    Decision trees have been shown to be effective at classifying subjects with Parkinson’s disease when provided with features (subject scores) derived from FDG-PET data. Such subject scores have strong discriminative power but are not intuitive to understand. We therefore augment each decision node

  19. Decision trees for farm management on acid sulfate soils, Mekong Delta, Vietnam

    NARCIS (Netherlands)

    Quang Tri, Le; Mensvoort, van M.E.F.

    2004-01-01

    Our study shows how farmers in the Mekong Delta, Viet Nam, have developed new cropping systems and management practices to overcome the constraints of their land. Decision trees for land evaluation at farm level, combining farmer and expert knowledge, were developed to support management decisions

  20. Decision tree analysis to evaluate dry cow strategies under UK conditions

    NARCIS (Netherlands)

    Berry, E.A.; Hogeveen, H.; Hillerton, J.E.

    2004-01-01

    Economic decisions on animal health strategies address the cost-benefit aspect along with animal welfare and public health concerns. Decision tree analysis at an individual cow level highlighted that there is little economic difference between the use of either dry cow antibiotic or an internal teat

  1. [Prediction of regional soil quality based on mutual information theory integrated with decision tree algorithm].

    Science.gov (United States)

    Lin, Fen-Fang; Wang, Ke; Yang, Ning; Yan, Shi-Guang; Zheng, Xin-Yu

    2012-02-01

    In this paper, main factors affecting soil quality, such as soil type, land use pattern, lithology type, topography, road, and industry type, were used to obtain the spatial distribution characteristics of regional soil quality precisely. Mutual information theory was adopted to select the main environmental factors, and the decision tree algorithm See 5.0 was applied to predict the grade of regional soil quality. The main factors affecting regional soil quality were soil type, land use, lithology type, distance to town, distance to water area, altitude, distance to road, and distance to industrial land. The prediction accuracy of the decision tree model with the variables selected by mutual information was clearly higher than that of the model with all variables, and for the former model, whether expressed as a decision tree or as decision rules, the prediction accuracy was higher than 80%. For continuous and categorical data, combining mutual information theory with a decision tree not only reduced the number of input parameters for the decision tree algorithm but also predicted and assessed regional soil quality effectively.
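
    Selecting input variables by mutual information can be sketched as follows. This is a minimal illustration for categorical variables only, not the procedure used in the paper; the variable names and the four-row toy data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information I(X;Y) in bits between two categorical sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


# Hypothetical toy data: soil quality grade and two candidate factors.
grade = ["high", "high", "low", "low"]
soil_type = ["loam", "loam", "clay", "clay"]      # fully determines the grade
road_distance = ["near", "far", "near", "far"]    # independent of the grade

scores = {
    "soil_type": mutual_information(soil_type, grade),
    "road_distance": mutual_information(road_distance, grade),
}
# Rank factors from most to least informative about the grade.
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)  # prints "['soil_type', 'road_distance']"
```

    Factors with a score near zero carry almost no information about the grade and are candidates for removal before the tree is built.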

  2. Automatic design of decision-tree induction algorithms tailored to flexible-receptor docking data.

    Science.gov (United States)

    Barros, Rodrigo C; Winck, Ana T; Machado, Karina S; Basgalupp, Márcio P; de Carvalho, André C P L F; Ruiz, Duncan D; de Souza, Osmar Norberto

    2012-11-21

    This paper addresses the prediction of the free energy of binding of a drug candidate with enzyme InhA associated with Mycobacterium tuberculosis. This problem is found within rational drug design, where interactions between drug candidates and target proteins are verified through molecular docking simulations. In this application, it is important not only to correctly predict the free energy of binding, but also to provide a comprehensible model that could be validated by a domain specialist. Decision-tree induction algorithms have been successfully used in drug-design related applications, especially considering that decision trees are simple to understand, interpret, and validate. There are several decision-tree induction algorithms available for general use, but each one has a bias that makes it more suitable for a particular data distribution. In this article, we propose and investigate the automatic design of decision-tree induction algorithms tailored to particular drug-enzyme binding data sets. We investigate the performance of our new method for evaluating binding conformations of different drug candidates to InhA, and we analyze our findings with respect to decision tree accuracy, comprehensibility, and biological relevance. The empirical analysis indicates that our method is capable of automatically generating decision-tree induction algorithms that significantly outperform the traditional C4.5 algorithm with respect to both accuracy and comprehensibility. In addition, we provide the biological interpretation of the rules generated by our approach, reinforcing the importance of comprehensible predictive models in this particular bioinformatics application. We conclude that automatically designing a decision-tree algorithm tailored to molecular docking data is a promising alternative for the prediction of the free energy from the binding of a drug candidate with a flexible receptor.

  3. Effects of pruning in Monterey pine plantations affected by Fusarium circinatum

    Energy Technology Data Exchange (ETDEWEB)

    Bezos, D.; Lomba, J. M.; Martinez-Alvarez, P.; Fernandez, M.; Diez, J. J.

    2012-07-01

    Fusarium circinatum Nirenberg and O'Donnell (1998) is the causal agent of Pitch Canker Disease (PCD) in Pinus species, producing damage to the main trunk and lateral branches as well as causing branch dieback. The disease has been detected recently in northern Spain in Pinus spp. seedlings at nurseries and in Pinus radiata D. Don adult trees in plantations. Fusarium circinatum seems to require a wound to enter the tree, not only that as caused by insects but also that resulting from damage by humans, i.e. mechanical wounds. However, the effects of pruning on the infection process have yet to be studied. The aim of the present study was to know how the presence of mechanical damage caused by pruning affects PCD occurrence and severity in P. radiata plantations. Fifty P. radiata plots (pruned and unpruned) distributed throughout 16 sites affected by F. circinatum in the Cantabria region (northern Spain) were studied. Symptoms of PCD presence, such as dieback, oozing cankers and trunk deformation were evaluated in 25 trees per plot and related to pruning effect. A significant relationship between pruning and the number of cankers per tree was observed, concluding that wounds caused by pruning increase the chance of pathogen infection. Other trunk symptoms, such as the presence of resin outside the cankers, were also higher in pruned plots. These results should be taken into account for future management of Monterey Pine plantations. (Author) 36 refs.

  4. Decision tree approach for classification of remotely sensed satellite ...

    Indian Academy of Sciences (India)

    using classical ISODATA clustering and Maximum Likelihood Classifier (MLC) algorithms. Classification ... clustering method. Based on the overall accuracy and kappa statistics, DTC was found to be more preferred classification approach than others. 1. Introduction .... trees branch or split the dataset; terminal nodes.

  5. The Studies of Decision Tree in Estimation of Breast Cancer Risk by Using Polymorphism Nucleotide

    Directory of Open Access Journals (Sweden)

    Frida Seyedmir

    2017-07-01

    Full Text Available Abstract Introduction: Decision trees are data mining tools for collecting, sifting, and accurately predicting information from massive amounts of data, and they are widely used in computational biology and bioinformatics. In bioinformatics they can be used to predict diseases, including breast cancer. Genomic data, including single nucleotide polymorphisms (SNPs), are a very important factor in predicting disease risk. Seven important SNPs among hundreds of thousands of genetic markers were identified as factors associated with breast cancer. The objective of this study is to evaluate the effect of the training data on the prediction error of a decision tree for breast cancer risk using SNP genotypes. Methods: The risk of breast cancer was calculated using the SNP formula: xj = fo * In human, the decision tree can be used to predict the probability of disease using single nucleotide polymorphisms. Seven SNPs with different odds ratios associated with breast cancer were considered, and a C4.5 decision tree model was designed and coded in the Csharp2013 programming language. In the decision tree created by coding, the four most important associated SNPs were considered. The decision tree error was assessed in two cases, coding and WEKA, and the percentage accuracy of the decision tree in predicting breast cancer was calculated. The number of training samples was obtained by systematic sampling. Two scenarios were evaluated with coding and three with the WEKA software, using different data sets and different numbers of training and test records. Results: In both coding scenarios, increasing the training percentage from 66.66 to 86.42 reduced the error from 55.56 to 9.09. Likewise, when WEKA was run on the three scenarios with different data sets and different numbers of training and test records, increasing the number of records from 81 to 2187 decreased the error rate from 48.15 to 13

  6. Multivariate analysis of flow cytometric data using decision trees

    Directory of Open Access Journals (Sweden)

    Svenja eSimon

    2012-04-01

    Full Text Available Characterization of the response of the host immune system is important in understanding the bidirectional interactions between the host and microbial pathogens. For research on the host site, flow cytometry has become one of the major tools in immunology. Advances in technology and reagents now allow the simultaneous assessment of multiple markers at the single-cell level, generating multidimensional data sets that require multivariate statistical analysis. We explored the explanatory power of the supervised machine learning method called 'induction of decision trees' in flow cytometric data. In order to examine whether the production of a certain cytokine depends on other cytokines, datasets from intracellular staining for six cytokines with complex patterns of co-expression were analyzed by induction of decision trees. After weighting the data according to their class probabilities, we created a total of 13,392 different decision trees for each given cytokine with different parameter settings. For a more realistic estimation of the decision trees' quality, we used stratified 5-fold cross-validation and chose the 'best' tree according to a combination of different quality criteria. While some of the decision trees reflected previously known co-expression patterns, we found that the expression of some cytokines was not only dependent on the co-expression of others per se, but was also dependent on the intensity of expression. Thus, for the first time we successfully used induction of decision trees for the analysis of high-dimensional flow cytometric data and demonstrated the feasibility of this method to reveal structural patterns in such data sets.
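
    Stratified 5-fold cross-validation keeps the class proportions roughly equal in every fold. The sketch below shows one simple way to build such folds; it is an assumption-laden illustration, not the authors' procedure. It deals indices out in their original order without shuffling, and the labels are hypothetical.

```python
from collections import defaultdict


def stratified_folds(labels, k=5):
    """Assign each sample index to one of k folds, keeping class proportions similar."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for j, i in enumerate(indices):  # deal each class out round-robin
            folds[j % k].append(i)
    return folds


# Hypothetical labels: 10 cytokine-positive and 20 cytokine-negative cells.
labels = ["pos"] * 10 + ["neg"] * 20
folds = stratified_folds(labels, k=5)
for fold in folds:
    # Each fold serves once as the test set; the other four form the training set.
    print(len(fold), sum(labels[i] == "pos" for i in fold))  # prints "6 2" five times
```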

  7. Using decision trees to characterize verbal communication during change and stuck episodes in the therapeutic process.

    Science.gov (United States)

    Masías, Víctor H; Krause, Mariane; Valdés, Nelson; Pérez, J C; Laengle, Sigifredo

    2015-01-01

    Methods are needed for creating models to characterize verbal communication between therapists and their patients that are suitable for teaching purposes without losing analytical potential. A technique meeting these twin requirements is proposed that uses decision trees to identify both change and stuck episodes in therapist-patient communication. Three decision tree algorithms (C4.5, NBTree, and REPTree) are applied to the problem of characterizing verbal responses into change and stuck episodes in the therapeutic process. The data for the problem is derived from a corpus of 8 successful individual therapy sessions with 1760 speaking turns in a psychodynamic context. The decision tree model that performed best was generated by the C4.5 algorithm. It delivered 15 rules characterizing the verbal communication in the two types of episodes. Decision trees are a promising technique for analyzing verbal communication during significant therapy events and have much potential for use in teaching practice on changes in therapeutic communication. The development of pedagogical methods using decision trees can support the transmission of academic knowledge to therapeutic practice.

  8. [Comparison of Discriminant Analysis and Decision Trees for the Detection of Subclinical Keratoconus].

    Science.gov (United States)

    Kleinhans, Sonja; Herrmann, Eva; Kohnen, Thomas; Bühren, Jens

    2017-08-15

    Background Iatrogenic keratectasia is one of the most dreaded complications of refractive surgery. In most cases, keratectasia develops after refractive surgery of eyes suffering from subclinical stages of keratoconus with few or no signs. Unfortunately, there has been no reliable procedure for the early detection of keratoconus. In this study, we used binary decision trees (recursive partitioning) to assess their suitability for discrimination between normal eyes and eyes with subclinical keratoconus. Patients and Methods The method of decision tree analysis was compared with discriminant analysis which has shown good results in previous studies. Input data were 32 eyes of 32 patients with newly diagnosed keratoconus in the contralateral eye and preoperative data of 10 eyes of 5 patients with keratectasia after laser in-situ keratomileusis (LASIK). The control group was made up of 245 normal eyes after LASIK and 12-month follow-up without any signs of iatrogenic keratectasia. Results Decision trees gave better accuracy and specificity than did discriminant analysis. The sensitivity of decision trees was lower than the sensitivity of discriminant analysis. Conclusion On the basis of the patient population of this study, decision trees did not prove to be superior to linear discriminant analysis for the detection of subclinical keratoconus. Georg Thieme Verlag KG Stuttgart · New York.

  9. Outsourcing the Portal: Another Branch in the Decision Tree.

    Science.gov (United States)

    McMahon, Tim

    2000-01-01

    Discussion of the management of information resources in organizations focuses on the use of portal technologies to update intranet capabilities. Considers application outsourcing decisions, reviews benefits (including reducing costs) as well as concerns, and describes application service providers (ASPs). (LRW)

  10. Rough Set Based Splitting Criterion for Binary Decision Tree Classifiers

    National Research Council Canada - National Science Library

    Mikulski, Dariusz G

    2006-01-01

    ...%ation based on rough set theory - the rough product. The rough product helps us to understand the manner in which an attribute value partition affects the upper approximation for each decision class...

  11. Wood quality for longleaf pines: a spacing, thinning and pruning study on the Kisatchie National Forest

    Science.gov (United States)

    Chi-Leung So; Thomas L. Eberhardt; Daniel J. Leduc; Leslie H. Groom; Jeffery C. G. Goelz

    2010-01-01

    Twenty 70-year-old longleaf pine (Pinus palustris Mill.) trees were harvested from a spacing, thinning, and pruning study on the Kisatchie National Forest, LA. Tree property mapping was used to show the property variation within and between three of the trees. The construction of such maps is both time consuming and cost prohibitive using traditional...

  12. Post-event human decision errors: operator action tree/time reliability correlation

    Energy Technology Data Exchange (ETDEWEB)

    Hall, R E; Fragola, J; Wreathall, J

    1982-11-01

    This report documents an interim framework for the quantification of the probability of errors of decision on the part of nuclear power plant operators after the initiation of an accident. The framework can easily be incorporated into an event tree/fault tree analysis. The method presented consists of a structure called the operator action tree and a time reliability correlation which assumes the time available for making a decision to be the dominating factor in situations requiring cognitive human response. This limited approach decreases the magnitude and complexity of the decision modeling task. Specifically, in the past, some human performance models have attempted prediction by trying to emulate sequences of human actions, or by identifying and modeling the information processing approach applicable to the task. The model developed here is directed at describing the statistical performance of a representative group of hypothetical individuals responding to generalized situations.

  13. Diagnosis of Constant Faults in Read-Once Contact Networks over Finite Bases using Decision Trees

    KAUST Repository

    Busbait, Monther I.

    2014-05-01

    We study the depth of decision trees for diagnosis of constant faults in read-once contact networks over finite bases. This includes diagnosis of 0-1 faults, 0 faults and 1 faults. For any finite basis, we prove a linear upper bound on the minimum depth of decision tree for diagnosis of constant faults depending on the number of edges in a contact network over that basis. Also, we obtain asymptotic bounds on the depth of decision trees for diagnosis of each type of constant faults depending on the number of edges in contact networks in the worst case per basis. We study the set of indecomposable contact networks with up to 10 edges and obtain sharp coefficients for the linear upper bound for diagnosis of constant faults in contact networks over bases of these indecomposable contact networks. We use a set of algorithms, including one that we create, to obtain the sharp coefficients.

  14. A Modified Decision Tree Algorithm Based on Genetic Algorithm for Mobile User Classification Problem

    Science.gov (United States)

    Liu, Dong-sheng; Fan, Shu-jiang

    2014-01-01

    In order to offer mobile customers better service, we should first classify the mobile users. To address the limitations of previous classification methods, this paper puts forward a modified decision tree algorithm for mobile user classification, which introduces a genetic algorithm to optimize the results of the decision tree algorithm. We also take context information as a classification attribute for the mobile user, and we divide the context into public context and private context classes. Then we analyze the processes and operators of the algorithm. Finally, we run an experiment on mobile users with the algorithm: we can classify the mobile users into Basic service user, E-service user, Plus service user, and Total service user classes, and we can also obtain some rules about the mobile users. Compared to the C4.5 decision tree algorithm and the SVM algorithm, the algorithm proposed in this paper has higher accuracy and greater simplicity. PMID:24688389

  15. Comparison of Taxi Time Prediction Performance Using Different Taxi Speed Decision Trees

    Science.gov (United States)

    Lee, Hanbong

    2017-01-01

    In the STBO modeler and tactical surface scheduler for ATD-2 project, taxi speed decision trees are used to calculate the unimpeded taxi times of flights taxiing on the airport surface. The initial taxi speed values in these decision trees did not show good prediction accuracy of taxi times. Using the more recent, reliable surveillance data, new taxi speed values in ramp area and movement area were computed. Before integrating these values into the STBO system, we performed test runs using live data from Charlotte airport, with different taxi speed settings: 1) initial taxi speed values and 2) new ones. Taxi time prediction performance was evaluated by comparing various metrics. The results show that the new taxi speed decision trees can calculate the unimpeded taxi-out times more accurately.

  16. An Efficient Method of Vibration Diagnostics For Rotating Machinery Using a Decision Tree

    Directory of Open Access Journals (Sweden)

    Bo Suk Yang

    2000-01-01

    Full Text Available This paper describes an efficient method to automate vibration diagnosis for rotating machinery using a decision tree, which is applicable to vibration diagnosis expert systems. The decision tree is a widely known formalism for expressing classification knowledge and has been used successfully in many diverse areas such as character recognition, medical diagnosis, and expert systems. In order to build a decision tree for vibration diagnosis, we have to define classes and attributes. A set of cases based on past experiences is also needed. This training set is induced using a result-cause matrix newly developed in the present work instead of using a conventionally implemented cause-result matrix. This method was applied to diagnostics for various cases taken from published work. It is found that the present method predicts causes of the abnormal vibration for test cases with high reliability.

  17. Inverse S-Transform Based Decision Tree for Power System Faults Identification

    Directory of Open Access Journals (Sweden)

    P. Srikanth

    2011-04-01

    Full Text Available In this paper a decision tree based identification of power system faults has been proposed. The key input values to the decision tree are the performance indices calculated from the maximum values of unfiltered inverse Stockwell transform (MUNIST) technique. A wide range of techniques including Stockwell transform (ST) have been used for the identification of power system faults. However, the signatures produced by these techniques are not unique and sometimes lead to misinterpretation of faults. Consequently, a decision tree based on the inverse Stockwell transform method is proposed in the present paper to automatically identify both the symmetrical and unsymmetrical power system faults. The method is able to determine both sudden and gradual changes in the signal caused by different power system faults. The technique is very accurate and produces unique signatures compared to the existing techniques. The results obtained show the efficacy of the proposed technique.

  18. Decision Trees Predicting Tumor Shrinkage for Head and Neck Cancer: Implications for Adaptive Radiotherapy.

    Science.gov (United States)

    Surucu, Murat; Shah, Karan K; Mescioglu, Ibrahim; Roeske, John C; Small, William; Choi, Mehee; Emami, Bahman

    2016-02-01

    To develop decision trees predicting for tumor volume reduction in patients with head and neck (H&N) cancer using pretreatment clinical and pathological parameters. Forty-eight patients treated with definitive concurrent chemoradiotherapy for squamous cell carcinoma of the nasopharynx, oropharynx, oral cavity, or hypopharynx were retrospectively analyzed. These patients were rescanned at a median dose of 37.8 Gy and replanned to account for anatomical changes. The percentages of gross tumor volume (GTV) change from initial to rescan computed tomography (CT; %GTVΔ) were calculated. Two decision trees were generated to correlate %GTVΔ in primary and nodal volumes with 14 characteristics including age, gender, Karnofsky performance status (KPS), site, human papilloma virus (HPV) status, tumor grade, primary tumor growth pattern (endophytic/exophytic), tumor/nodal/group stages, chemotherapy regimen, and primary, nodal, and total GTV volumes in the initial CT scan. The C4.5 Decision Tree induction algorithm was implemented. The median %GTVΔ for primary, nodal, and total GTVs was 26.8%, 43.0%, and 31.2%, respectively. Type of chemotherapy, age, primary tumor growth pattern, site, KPS, and HPV status were the most predictive parameters for primary %GTVΔ decision tree, whereas for nodal %GTVΔ, KPS, site, age, primary tumor growth pattern, initial primary GTV, and total GTV volumes were predictive. Both decision trees had an accuracy of 88%. There can be significant changes in primary and nodal tumor volumes during the course of H&N chemoradiotherapy. Considering the proposed decision trees, radiation oncologists can select patients predicted to have high %GTVΔ, who would theoretically gain the most benefit from adaptive radiotherapy, in order to better use limited clinical resources. © The Author(s) 2015.

  19. Money laundering regulatory risk evaluation using Bitmap Index-based Decision Tree

    Directory of Open Access Journals (Sweden)

    Vikas Jayasree

    2017-06-01

    Full Text Available This paper proposes to evaluate the adaptability risk in money laundering using the Bitmap Index-based Decision Tree (BIDT) technique. Initially, Bitmap Index-based Decision Tree learning is used to induce the knowledge tree, which helps to determine a company’s money laundering risk and improve scalability. A bitmap index in BIDT is used to effectively access large banking databases. In a BIDT bitmap index, accounts in a table are numbered in sequence, with each key value, account number and a bitmap (array of bytes) used instead of a list of row ids. Subsequently, the BIDT algorithm uses the “select” query performance to apply count and bit-wise logical AND operations. The query results are used to build the decision tree and to evaluate more precisely the adaptability risk in the money laundering operation. For the root node, the main account of the decision tree, the population frequencies are obtained by simply counting the total number of “1” bits in the bitmaps constructed on the attribute to predict money laundering and evaluate the risk factor rate. The experiment is conducted on factors such as regulatory risk rate, false positive rate, and risk identification time.
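
    Counting population frequencies from bitmaps can be sketched with plain integers used as bit arrays. This is a generic illustration of a bitmap index, not the BIDT implementation; the column names and the five-row toy table are hypothetical.

```python
def build_bitmaps(column):
    """Map each distinct value to an int whose bit i is set when row i holds that value."""
    bitmaps = {}
    for row, value in enumerate(column):
        bitmaps[value] = bitmaps.get(value, 0) | (1 << row)
    return bitmaps


def count_rows(*bitmaps):
    """Count rows matching every bitmap: bitwise AND, then count the 1 bits."""
    combined = bitmaps[0]
    for b in bitmaps[1:]:
        combined &= b
    return bin(combined).count("1")


# Hypothetical toy table: one row per account.
country = ["A", "B", "A", "A", "B"]
flagged = ["yes", "no", "yes", "no", "no"]
country_idx = build_bitmaps(country)
flagged_idx = build_bitmaps(flagged)

print(count_rows(country_idx["A"]))                      # prints "3"
print(count_rows(country_idx["A"], flagged_idx["yes"]))  # prints "2"
```

    Counts of this kind are what a tree learner needs to score a candidate split, so they can be obtained without scanning the table rows again.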

  20. [Analysis of the characteristics of the older adults with depression using data mining decision tree analysis].

    Science.gov (United States)

    Park, Myonghwa; Choi, Sora; Shin, A Mi; Koo, Chul Hoi

    2013-02-01

    The purpose of this study was to develop a prediction model for the characteristics of older adults with depression using the decision tree method. A large dataset from the 2008 Korean Elderly Survey was used and data of 14,970 elderly people were analyzed. Target variable was depression and 53 input variables were general characteristics, family & social relationship, economic status, health status, health behavior, functional status, leisure & social activity, quality of life, and living environment. Data were analyzed by decision tree analysis, a data mining technique using SPSS Window 19.0 and Clementine 12.0 programs. The decision trees were classified into five different rules to define the characteristics of older adults with depression. Classification & Regression Tree (C&RT) showed the best prediction with an accuracy of 80.81% among data mining models. Factors in the rules were life satisfaction, nutritional status, daily activity difficulty due to pain, functional limitation for basic or instrumental daily activities, number of chronic diseases and daily activity difficulty due to disease. The different rules classified by the decision tree model in this study should contribute as baseline data for discovering informative knowledge and developing interventions tailored to these individual characteristics.

  1. Evaluation of soil carbon pools after the addition of prunings in subtropical orchards placed in terraces

    Science.gov (United States)

    Márquez San Emeterio, Layla; Martín Reyes, Marino Pedro; Ortiz Bernad, Irene; Fernández Ondoño, Emilia; Sierra Aragón, Manuel

    2017-04-01

    The amount of carbon that can be stored in a soil depends on many factors, such as the type of soil, the chemical composition of plant residues and the climate, and is also highly affected by land use and soil management. Agricultural ecosystems have been shown to absorb a large amount of CO2 from the atmosphere through several sustainable management practices. In addition, the application of organic materials such as leaves, grass, prunings, etc., is a significant agricultural practice arising from waste recycling. The aim of this research was to evaluate the effects of the addition of different organic prunings on the potential for carbon sequestration in agricultural soils placed in terraces. Three subtropical orchards were sampled in Almuñécar (Granada, S Spain): mango (Mangifera indica L.), avocado (Persea americana Mill.) and cherimoya (Annona cherimola Mill.). The predominant climate is Subtropical Mediterranean and the soil is an Eutric Anthrosol. The experimental design consisted of the application of prunings from avocado, cherimoya and mango trees, placed on the surface soil underneath their corresponding trees, as well as garden prunings from the green areas surrounding the town center on the surface soils under the three orchard trees. Control experiments without the addition of prunings were also evaluated. These experiments were followed for three years. Soil samples were taken at 4 cm depth. They were dried for 3-4 days and then sieved (avocado prunings and their control soil, and between soils under garden prunings with cherimoya and their control soil. Regarding the water-soluble soil organic carbon, only small differences were shown. Differences in mineral-associated and non-oxidable organic carbon fractions were also statistically significant between soils under avocado prunings and their control soil, and between soils under garden prunings with cherimoya and their control soil. No significant differences in any organic carbon pool were found for the soils

  2. Bayesian additive decision trees of biomarker by treatment interactions for predictive biomarker detection and subgroup identification.

    Science.gov (United States)

    Zhao, Yang; Zheng, Wei; Zhuo, Daisy Y; Lu, Yuefeng; Ma, Xiwen; Liu, Hengchang; Zeng, Zhen; Laird, Glen

    2017-10-11

    Personalized medicine, or tailored therapy, has been an active and important topic in recent medical research. Many methods have been proposed in the literature for predictive biomarker detection and subgroup identification. In this article, we propose a novel decision tree-based approach applicable in randomized clinical trials. We model the prognostic effects of the biomarkers using additive regression trees and the biomarker-by-treatment effect using a single regression tree. A Bayesian approach is used to periodically revise the split variables and the split rules of the decision trees, which provides a better overall fit. A Gibbs sampler is implemented in the MCMC procedure, which updates the prognostic trees and the interaction tree separately. We use the posterior distribution of the interaction tree to construct the predictive scores of the biomarkers and to identify the subgroup where the treatment is superior to the control. Numerical simulations show that our proposed method performs well under various settings compared to existing methods. We also demonstrate an application of our method in a real clinical trial.

  3. VLSI implementation of flexible architecture for decision tree classification in data mining

    Science.gov (United States)

    Sharma, K. Venkatesh; Shewandagn, Behailu; Bhukya, Shankar Nayak

    2017-07-01

    Data mining algorithms have become vital to researchers in science, engineering, medicine, business, search and security domains. In recent years, there has been a tremendous rise in the size of the data being collected and analyzed. Classification is one of the main problems faced in data mining. Among the solutions developed for this problem, one of the most widely accepted is Decision Tree Classification (DTC), which gives high precision while handling very large amounts of data. This paper presents a VLSI implementation of a flexible architecture for decision tree classification in data mining using the C4.5 algorithm.

  4. The nutritional levels in leaves and fruits of fig trees as a function of pruning time and irrigation / Teores nutricionais em folhas e frutos de figueira, submetida a épocas de poda e irrigação

    Directory of Open Access Journals (Sweden)

    Marco Antonio Tecchio

    2009-07-01

    Full Text Available The present study aimed to evaluate the nutritional content in leaves and fruits of the fig tree ‘Roxo de Valinhos’, pruned at different periods corresponding to the months of July, August, September and October in the years 2004 and 2005, with and without irrigation, in the county of Botucatu, São Paulo State, Brazil. To achieve this objective, the adopted experimental design was in blocks with subdivided plots and 5 replications, in which plots corresponded to treatments with and without irrigation and subplots included prunings done in the above-mentioned four months. The levels of N, P, K, Ca, Mg, S, B, Cu, Mn and Zn in leaves and fruits were evaluated in the two crop cycles. The results indicated no significant differences among macro- and micronutrient levels in the leaves subjected to treatments with and without irrigation in the 2004/05 cycle, except for copper, which showed a higher level with irrigation (6 mg kg-1). In the fruits, there was no difference, except for Zn, which also showed the highest levels (28 mg kg-1) with irrigation. In the 2005/06 crop cycle, there were differences for N (40 g kg-1) and K (20 g kg-1) in the leaves, where the highest levels were observed with irrigation. In the fruits, N showed a significant difference and its highest level was observed without irrigation (21 g kg-1). In relation to the pruning periods, significant differences were observed for Ca, Fe and Zn content in the leaves and Ca, K, Mg, S and Zn content in the fruits in the 2004/05 crop cycle. In the 2005/06 cycle, there were no differences among the levels of the evaluated nutrients in the leaves, and in the fruits there was a difference for N, Ca and Cu.

  5. Regularization with a pruning prior

    DEFF Research Database (Denmark)

    Goutte, Cyril; Hansen, Lars Kai

    1997-01-01

    We investigate the use of a regularization prior that we show has pruning properties. Analyses are conducted both using a Bayesian framework and with the generalization method, on a simple toy problem. Results are thoroughly compared with those obtained with a traditional weight decay.
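
    The abstract does not state the form of the prior. As a minimal sketch, assuming the pruning prior acts like a Laplace (L1) penalty and weight decay like a Gaussian (L2) penalty, the one-weight case below shows the difference: the L1 solution sets small weights exactly to zero, while weight decay only shrinks them. The function names, the quadratic loss and the values of `weights` and `lam` are illustrative assumptions.

```python
def l1_map(w_ml: float, lam: float) -> float:
    """Minimiser of 0.5*(w - w_ml)**2 + lam*abs(w): soft-thresholding.

    Assumption: the pruning prior behaves like a Laplace (L1) penalty.
    """
    magnitude = max(abs(w_ml) - lam, 0.0)
    if magnitude == 0.0:
        return 0.0  # the weight is pruned
    return magnitude if w_ml > 0 else -magnitude


def weight_decay_map(w_ml: float, lam: float) -> float:
    """Minimiser of 0.5*(w - w_ml)**2 + 0.5*lam*w**2: uniform shrinkage."""
    return w_ml / (1.0 + lam)


weights = [2.0, 0.3, -0.1, -1.5]  # hypothetical unregularised estimates
lam = 0.5                         # hypothetical penalty strength

pruned = [l1_map(w, lam) for w in weights]
decayed = [weight_decay_map(w, lam) for w in weights]

print("L1 (pruning):", pruned)
print("weight decay:", decayed)
```

    In this toy setting the two small weights are removed by the L1 rule, whereas weight decay keeps all four weights non-zero.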

  6. Relationships between average depth and number of misclassifications for decision trees

    KAUST Repository

    Chikalov, Igor

    2014-02-14

    This paper presents a new tool for the study of relationships between the total path length or the average depth and the number of misclassifications for decision trees. In addition to the algorithm, the paper also presents the results of experiments with datasets from UCI ML Repository [9] and datasets representing Boolean functions with 10 variables.
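
    The quantities being related can be illustrated on a toy example. The sketch below is not the paper's tool. It assumes that the total path length is the sum, over all rows of a decision table, of the length of the path each row follows, and that the average depth is that sum divided by the number of rows. The classes `Leaf` and `Node` and the two-variable Boolean function are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple, Union


@dataclass
class Leaf:
    label: int


@dataclass
class Node:
    feature: int   # index of the Boolean attribute tested at this node
    zero: "Tree"   # subtree followed when the attribute is 0
    one: "Tree"    # subtree followed when the attribute is 1


Tree = Union[Leaf, Node]


def route(tree: Tree, row: Sequence[int]) -> Tuple[int, int]:
    """Return (predicted label, path length) for one row."""
    depth = 0
    while isinstance(tree, Node):
        tree = tree.one if row[tree.feature] else tree.zero
        depth += 1
    return tree.label, depth


def tree_costs(tree: Tree, rows, labels) -> Tuple[int, float, int]:
    """Return (total path length, average depth, misclassifications)."""
    total_path, errors = 0, 0
    for row, y in zip(rows, labels):
        predicted, depth = route(tree, row)
        total_path += depth
        errors += int(predicted != y)
    return total_path, total_path / len(rows), errors


# Decision table for the Boolean function x0 AND x1.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]

exact = Node(0, Leaf(0), Node(1, Leaf(0), Leaf(1)))  # no errors, deeper
stump = Node(0, Leaf(0), Leaf(1))                    # shallower, one error

print(tree_costs(exact, rows, labels))
print(tree_costs(stump, rows, labels))
```

    The shallower tree lowers the average depth from 1.5 to 1.0 at the price of one misclassification, which is the kind of trade-off the abstract refers to.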

  7. Relationships Between Average Depth and Number of Nodes for Decision Trees

    KAUST Repository

    Chikalov, Igor

    2013-07-24

    This paper presents a new tool for the study of relationships between total path length or average depth and number of nodes of decision trees. In addition to the algorithm, the paper also presents the results of experiments with datasets from UCI ML Repository [1]. © Springer-Verlag Berlin Heidelberg 2014.

  8. A snow forecasting decision tree for significant snowfall over the interior of South Africa

    Directory of Open Access Journals (Sweden)

    Jan Hendrik Stander

    2016-09-01

    Full Text Available Snowfall occurs every winter over the mountains of South Africa but is rare over the highly populated metropolises over the interior of South Africa. When snowfall does occur over highly populated areas, it causes widespread disruption to infrastructure and even loss of life. Because of the rarity of snow over the interior of South Africa, inexperienced weather forecasters often miss these events. We propose a five-step snow forecasting decision tree in which all five criteria must be met to forecast snowfall. The decision tree comprises physical attributes that are necessary for snowfall to occur. The first step recognises the synoptic circulation patterns associated with snow and the second step detects whether precipitation is likely in an area. The remaining steps all deal with identifying the presence of a snowflake in a cloud and determining that the snowflake will not melt on the way to the ground. The decision tree is especially useful to forecast the very rare snow events that develop from relatively dry and warmer surface conditions. We propose operational implementation of the decision tree in the weather forecasting offices of South Africa, as it is foreseen that this approach could significantly contribute to accurately forecasting snow over the interior of South Africa.
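
    A decision tree in which every criterion must be met reduces to a chain of checks that stops at the first failure. The sketch below assumes this reading. The abstract names only the first two steps explicitly, so the wording and keys of steps 3 to 5, and all key names in `state`, are hypothetical placeholders rather than the authors' criteria.

```python
from typing import Callable, Dict, List, Optional, Tuple

Step = Tuple[str, Callable[[Dict[str, bool]], bool]]

# Steps 1 and 2 follow the abstract; steps 3-5 are hypothetical stand-ins
# for "a snowflake is present in the cloud and will not melt on the way down".
STEPS: List[Step] = [
    ("synoptic circulation pattern associated with snow", lambda s: s["snow_pattern"]),
    ("precipitation likely in the area", lambda s: s["precip_likely"]),
    ("snowflake present in the cloud", lambda s: s["ice_in_cloud"]),
    ("no melting layer aloft", lambda s: s["cold_aloft"]),
    ("snowflake survives to the ground", lambda s: s["cold_near_surface"]),
]


def forecast_snow(state: Dict[str, bool]) -> Tuple[bool, Optional[str]]:
    """Forecast snow only if all criteria hold; report the first failed step."""
    for name, check in STEPS:
        if not check(state):
            return False, name
    return True, None


all_met = {
    "snow_pattern": True,
    "precip_likely": True,
    "ice_in_cloud": True,
    "cold_aloft": True,
    "cold_near_surface": True,
}
dry_case = dict(all_met, precip_likely=False)

print(forecast_snow(all_met))
print(forecast_snow(dry_case))
```

    Returning the name of the first failed step keeps the forecast explainable, which matters when the tree is meant to guide inexperienced forecasters.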

  9. Dynamic Programming Strategies on the Decision Tree Hidden behind the Optimizing Problems

    OpenAIRE

    Zoltan KATAI

    2007-01-01

    The aim of the paper is to present the characteristics of certain dynamic programming strategies on the decision tree hidden behind optimization problems, and thus to offer a clear tool for their study and classification that can help in understanding the essence of this programming technique.

  10. What Satisfies Students?: Mining Student-Opinion Data with Regression and Decision Tree Analysis

    Science.gov (United States)

    Thomas, Emily H.; Galambos, Nora

    2004-01-01

    To investigate how students' characteristics and experiences affect satisfaction, this study uses regression and decision tree analysis with the CHAID algorithm to analyze student-opinion data. A data mining approach identifies the specific aspects of students' university experience that most influence three measures of general satisfaction. The…

  11. A decision tree approach using silvics to guide planning for forest restoration

    Science.gov (United States)

    Sharon M. Hermann; John S. Kush; John C. Gilbert

    2013-01-01

    We created a decision tree based on silvics of longleaf pine (Pinus palustris) and historical descriptions to develop approaches for restoration management at Horseshoe Bend National Military Park located in central Alabama. A National Park Service goal is to promote structure and composition of a forest that likely surrounded the 1814 battlefield....

  12. Decision-tree induction to detect clinical mastitis with automatic milking

    NARCIS (Netherlands)

    Kamphuis, C.; Mollenhorst, H.; Feelders, A.; Pietersma, D.; Hogeveen, H.

    2010-01-01

    This study explored the potential of using decision-tree induction to develop models for the detection of clinical mastitis with automatic milking. Sensor data (including electrical conductivity and colour) of over 711,000 quarter milkings were collected from December 2006 till

  13. Test Reviews: Euler, B. L. (2007). "Emotional Disturbance Decision Tree". Lutz, FL: Psychological Assessment Resources

    Science.gov (United States)

    Tansy, Michael

    2009-01-01

    The Emotional Disturbance Decision Tree (EDDT) is a teacher-completed norm-referenced rating scale published by Psychological Assessment Resources, Inc., in Lutz, Florida. The 156-item EDDT was developed for use as part of a broader assessment process to screen and assist in the identification of 5- to 18-year-old children for the special…

  14. Which Types of Leadership Styles Do Followers Prefer? A Decision Tree Approach

    Science.gov (United States)

    Salehzadeh, Reza

    2017-01-01

    Purpose: The purpose of this paper is to propose a new method to find the appropriate leadership styles based on the followers' preferences using the decision tree technique. Design/methodology/approach: Statistical population includes the students of the University of Isfahan. In total, 750 questionnaires were distributed; out of which, 680…

  15. Development of a diagnostic decision tree for obstructive pulmonary diseases based on real-life data

    NARCIS (Netherlands)

    Metting, Esther I; In 't Veen, Johannes C C M; Dekhuijzen, P N Richard; van Heijst, Ellen; Kocks, Janwillem W H; Muilwijk-Kroes, Jacqueline B; Chavannes, Niels H; van der Molen, Thys

    2016-01-01

    The aim of this study was to develop and explore the diagnostic accuracy of a decision tree derived from a large real-life primary care population. Data from 9297 primary care patients (45% male, mean age 53±17 years) with suspicion of an obstructive pulmonary disease was derived from an

  16. Dynamic Security Assessment of Danish Power System Based on Decision Trees: Today and Tomorrow

    DEFF Research Database (Denmark)

    Rather, Zakir Hussain; Liu, Leo; Chen, Zhe

    2013-01-01

    The research work presented in this paper analyzes the impact of wind energy, phasing out of central power plants and cross border power exchange on dynamic security of Danish Power System. Contingency based decision tree (DT) approach is used to assess the dynamic security of present and future...

  17. A multivariate decision tree analysis of biophysical factors in tropical forest fire occurrence

    Science.gov (United States)

    Rey S. Ofren; Edward Harvey

    2000-01-01

    A multivariate decision tree model was used to quantify the relative importance of complex hierarchical relationships between biophysical variables and the occurrence of tropical forest fires. The study site is the Huai Kha Kbaeng wildlife sanctuary, a World Heritage Site in northwestern Thailand where annual fires are common and particularly destructive. Thematic...

  18. Dynamic Security Assessment of Western Danish Power System Based on Ensemble Decision Trees

    DEFF Research Database (Denmark)

    Liu, Leo; Bak, Claus Leth; Chen, Zhe

    2014-01-01

    With the increasing penetration of renewable energy resources and other forms of dispersed generation, more and more uncertainties will be brought to the dynamic security assessment (DSA) of power systems. This paper proposes an approach that uses ensemble decision trees (EDT) for online DSA. Fed...

  19. Development of a diagnostic decision tree for obstructive pulmonary diseases based on real-life data

    NARCIS (Netherlands)

    Metting, E.I.; Veen, J.C. In 't; Dekhuijzen, P.N.R.; Heijst, E. van; Kocks, J.W.; Muilwijk-Kroes, J.B.; Chavannes, N.H.; Molen, T. van der

    2016-01-01

    The aim of this study was to develop and explore the diagnostic accuracy of a decision tree derived from a large real-life primary care population. Data from 9297 primary care patients (45% male, mean age 53±17 years) with suspicion of an obstructive pulmonary disease was derived from an

  20. Ultrasonographic Diagnosis of Biliary Atresia Based on a Decision-Making Tree Model.

    Science.gov (United States)

    Lee, So Mi; Cheon, Jung-Eun; Choi, Young Hun; Kim, Woo Sun; Cho, Hyun-Hae; Cho, Hyun-Hye; Kim, In-One; You, Sun Kyoung

    2015-01-01

    To assess the diagnostic value of various ultrasound (US) findings and to make a decision-tree model for US diagnosis of biliary atresia (BA). From March 2008 to January 2014, the following US findings were retrospectively evaluated in 100 infants with cholestatic jaundice (BA, n = 46; non-BA, n = 54): length and morphology of the gallbladder, triangular cord thickness, hepatic artery and portal vein diameters, and visualization of the common bile duct. Logistic regression analyses were performed to determine the features that would be useful in predicting BA. Conditional inference tree analysis was used to generate a decision-making tree for classifying patients into the BA or non-BA groups. Multivariate logistic regression analysis showed that abnormal gallbladder morphology and greater triangular cord thickness were significant predictors of BA (p = 0.003 and 0.001; adjusted odds ratio: 345.6 and 65.6, respectively). In the decision-making tree using conditional inference tree analysis, gallbladder morphology and triangular cord thickness (optimal cutoff value of triangular cord thickness, 3.4 mm) were also selected as significant discriminators for differential diagnosis of BA, and gallbladder morphology was the first discriminator. The diagnostic performance of the decision-making tree was excellent, with sensitivity of 100% (46/46), specificity of 94.4% (51/54), and overall accuracy of 97% (97/100). Abnormal gallbladder morphology and greater triangular cord thickness (> 3.4 mm) were the most useful predictors of BA on US. We suggest that the gallbladder morphology should be evaluated first and that triangular cord thickness should be evaluated subsequently in cases with normal gallbladder morphology.
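
    The reported performance figures follow directly from the counts given in the abstract (46 of 46 BA cases and 51 of 54 non-BA cases classified correctly). The short check below reproduces that arithmetic. The function name is an illustrative assumption.

```python
def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Counts from the abstract: 46/46 BA detected, 51/54 non-BA correctly excluded.
sens, spec, acc = diagnostic_metrics(tp=46, fn=0, tn=51, fp=3)

print(f"sensitivity = {sens:.1%}")
print(f"specificity = {spec:.1%}")
print(f"accuracy    = {acc:.1%}")
```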

  1. Ultrasonographic diagnosis of biliary atresia based on a decision-making tree model

    Energy Technology Data Exchange (ETDEWEB)

    Lee, So Mi; Cheon, Jung Eun; Choi, Young Hun; Kim, Woo Sun; Cho, Hyun Hye; Kim, In One; You, Sun Kyoung [Dept. of Radiology, Seoul National University College of Medicine, Seoul (Korea, Republic of)

    2015-12-15

    To assess the diagnostic value of various ultrasound (US) findings and to make a decision-tree model for US diagnosis of biliary atresia (BA). From March 2008 to January 2014, the following US findings were retrospectively evaluated in 100 infants with cholestatic jaundice (BA, n = 46; non-BA, n = 54): length and morphology of the gallbladder, triangular cord thickness, hepatic artery and portal vein diameters, and visualization of the common bile duct. Logistic regression analyses were performed to determine the features that would be useful in predicting BA. Conditional inference tree analysis was used to generate a decision-making tree for classifying patients into the BA or non-BA groups. Multivariate logistic regression analysis showed that abnormal gallbladder morphology and greater triangular cord thickness were significant predictors of BA (p = 0.003 and 0.001; adjusted odds ratio: 345.6 and 65.6, respectively). In the decision-making tree using conditional inference tree analysis, gallbladder morphology and triangular cord thickness (optimal cutoff value of triangular cord thickness, 3.4 mm) were also selected as significant discriminators for differential diagnosis of BA, and gallbladder morphology was the first discriminator. The diagnostic performance of the decision-making tree was excellent, with sensitivity of 100% (46/46), specificity of 94.4% (51/54), and overall accuracy of 97% (97/100). Abnormal gallbladder morphology and greater triangular cord thickness (> 3.4 mm) were the most useful predictors of BA on US. We suggest that the gallbladder morphology should be evaluated first and that triangular cord thickness should be evaluated subsequently in cases with normal gallbladder morphology.

  2. Predicting metabolic syndrome using decision tree and support vector machine methods.

    Science.gov (United States)

    Karimi-Alavijeh, Farzaneh; Jalili, Saeed; Sadeghi, Masoumeh

    2016-05-01

    Metabolic syndrome, which underlies the increased prevalence of cardiovascular disease and Type 2 diabetes, is considered a group of metabolic abnormalities including central obesity, hypertriglyceridemia, glucose intolerance, hypertension, and dyslipidemia. Recently, artificial intelligence based health-care systems have become highly regarded because of their success in diagnosis, prediction, and choice of treatment. This study employs machine learning techniques to predict metabolic syndrome. This study aims to employ decision tree and support vector machine (SVM) methods to predict the 7-year incidence of metabolic syndrome. This research is a practical one in which data from 2107 participants of the Isfahan Cohort Study have been utilized. The subjects without metabolic syndrome according to the ATPIII criteria were selected. The features that have been used in this data set include: gender, age, weight, body mass index, waist circumference, waist-to-hip ratio, hip circumference, physical activity, smoking, hypertension, antihypertensive medication use, systolic blood pressure (BP), diastolic BP, fasting blood sugar, 2-hour blood glucose, triglycerides (TGs), total cholesterol, low-density lipoprotein, high density lipoprotein-cholesterol, mean corpuscular volume, and mean corpuscular hemoglobin. Metabolic syndrome was diagnosed based on ATPIII criteria and two methods, decision tree and SVM, were selected to predict metabolic syndrome. The criteria of sensitivity, specificity and accuracy were used for validation. SVM and decision tree methods were examined according to the criteria of sensitivity, specificity and accuracy. Sensitivity, specificity and accuracy were 0.774 (0.758), 0.74 (0.72) and 0.757 (0.739) for the SVM (decision tree) method. The results show that the SVM method outperforms the decision tree in sensitivity, specificity and accuracy. The results of the decision tree method show that TG is the most important feature in predicting metabolic syndrome. According...

  3. Predicting metabolic syndrome using decision tree and support vector machine methods

    Science.gov (United States)

    Karimi-Alavijeh, Farzaneh; Jalili, Saeed; Sadeghi, Masoumeh

    2016-01-01

    BACKGROUND Metabolic syndrome, which underlies the increased prevalence of cardiovascular disease and Type 2 diabetes, is considered a group of metabolic abnormalities including central obesity, hypertriglyceridemia, glucose intolerance, hypertension, and dyslipidemia. Recently, artificial intelligence based health-care systems have become highly regarded because of their success in diagnosis, prediction, and choice of treatment. This study employs machine learning techniques to predict metabolic syndrome. METHODS This study aims to employ decision tree and support vector machine (SVM) methods to predict the 7-year incidence of metabolic syndrome. This research is a practical one in which data from 2107 participants of the Isfahan Cohort Study have been utilized. The subjects without metabolic syndrome according to the ATPIII criteria were selected. The features that have been used in this data set include: gender, age, weight, body mass index, waist circumference, waist-to-hip ratio, hip circumference, physical activity, smoking, hypertension, antihypertensive medication use, systolic blood pressure (BP), diastolic BP, fasting blood sugar, 2-hour blood glucose, triglycerides (TGs), total cholesterol, low-density lipoprotein, high density lipoprotein-cholesterol, mean corpuscular volume, and mean corpuscular hemoglobin. Metabolic syndrome was diagnosed based on ATPIII criteria and two methods, decision tree and SVM, were selected to predict metabolic syndrome. The criteria of sensitivity, specificity and accuracy were used for validation. RESULTS SVM and decision tree methods were examined according to the criteria of sensitivity, specificity and accuracy. Sensitivity, specificity and accuracy were 0.774 (0.758), 0.74 (0.72) and 0.757 (0.739) for the SVM (decision tree) method. CONCLUSION The results show that the SVM method outperforms the decision tree in sensitivity, specificity and accuracy. The results of the decision tree method show that TG is the most important feature in

  4. The application of a decision tree to establish the parameters associated with hypertension.

    Science.gov (United States)

    Tayefi, Maryam; Esmaeili, Habibollah; Saberi Karimian, Maryam; Amirabadi Zadeh, Alireza; Ebrahimi, Mahmoud; Safarian, Mohammad; Nematy, Mohsen; Parizadeh, Seyed Mohammad Reza; Ferns, Gordon A; Ghayour-Mobarhan, Majid

    2017-02-01

    Hypertension is an important risk factor for cardiovascular disease (CVD). The goal of this study was to establish the factors associated with hypertension by using a decision-tree algorithm as a supervised classification method of data mining. Data from a cross-sectional study were used in this study. A total of 9078 subjects who met the inclusion criteria were recruited. 70% of these subjects (6358 cases) were randomly allocated to the training dataset to construct the decision tree. The remaining 30% (2720 cases) were used as the testing dataset to evaluate the performance of the decision tree. Two models were evaluated in this study. In model I, age, gender, body mass index, marital status, level of education, occupation status, depression and anxiety status, physical activity level, smoking status, LDL, TG, TC, FBG, uric acid and hs-CRP were considered as input variables, and in model II, age, gender, WBC, RBC, HGB, HCT, MCV, MCH, PLT, RDW and PDW were considered as input variables. The model was validated by constructing a receiver operating characteristic (ROC) curve. The prevalence of hypertension was 32% in our population. For the decision-tree model I, the accuracy, sensitivity, specificity and area under the ROC curve (AUC) value for identifying the related risk factors of hypertension were 73%, 63%, 77% and 0.72, respectively. The corresponding values for model II were 70%, 61%, 74% and 0.68, respectively. We have developed a decision tree model to identify the risk factors associated with hypertension that may be used to develop programs for hypertension management. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
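
    The AUC values quoted above summarize how well a model's scores rank hypertensive subjects above non-hypertensive ones. As a minimal sketch, assuming the AUC is computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case (ties counted as one half), it can be obtained without plotting the curve. The scores and labels below are made-up illustrative values, not study data.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted probabilities from a decision tree on a test set.
scores = [0.9, 0.8, 0.4, 0.35, 0.1]
labels = [1, 0, 1, 0, 0]

auc = roc_auc(scores, labels)
print(f"AUC = {auc:.3f}")
```

    The pairwise form is quadratic in the number of cases, which is acceptable for a sketch. A rank-based computation would be preferable on a test set of 2720 subjects.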

  5. Predicting the probability of mortality of gastric cancer patients using decision tree.

    Science.gov (United States)

    Mohammadzadeh, F; Noorkojuri, H; Pourhoseingholi, M A; Saadat, S; Baghestani, A R

    2015-06-01

    Gastric cancer is the fourth most common cancer worldwide. This motivated us to investigate gastric cancer risk factors using statistical methods. The aim of this study was to identify the most important factors influencing the mortality of patients with gastric cancer and to introduce a classification approach based on a decision tree model for predicting the probability of mortality from this disease. Data on 216 patients with gastric cancer who were registered in Taleghani hospital in Tehran, Iran, were analyzed. First, patients were divided into two groups: dead and alive. Then, to fit the decision tree model to our data, we randomly selected 20% of the dataset as the test sample and used the remaining data as the training sample. Finally, the validity of the model was examined using sensitivity, specificity, diagnostic accuracy and the area under the receiver operating characteristic curve. CART version 6.0 and SPSS version 19.0 were used for the data analysis. Diabetes, ethnicity, tobacco, tumor size, surgery, pathologic stage, age at diagnosis, exposure to chemical weapons and alcohol consumption were identified as factors affecting mortality from gastric cancer. The sensitivity, specificity and accuracy of the decision tree were 0.72, 0.75 and 0.74, respectively. These indices indicate that the decision tree model has acceptable accuracy for predicting the probability of mortality in gastric cancer patients. A simple decision tree consisting of the factors affecting mortality from gastric cancer may therefore help clinicians as a reliable and practical tool to predict the probability of mortality in these patients.

  6. Klasifikasi Nilai Kelayakan Calon Debitur Baru Menggunakan Decision Tree C4.5 [Classification of the Feasibility Value of Prospective New Debtors Using the C4.5 Decision Tree]

    Directory of Open Access Journals (Sweden)

    Bambang Hermanto

    2017-01-01

    Full Text Available To improve the quality of customer service, especially the feasibility assessment of borrowers as the number of new prospective borrowers seeking motor vehicle financing grows, a company needs a decision-making tool that quickly and easily estimates whether a debtor is able to pay off a loan. This study discusses the process of generating a decision tree with the C4.5 algorithm from a learning dataset of motorcycle financing debtors. The decision tree is then interpreted as a set of decision rules that can be understood and used as a reference when processing borrower data to determine the feasibility of prospective new borrowers. The feasibility value refers to the value of the target parameter, credit status. If the credit status is "paid off", the prospective borrower is estimated to be able to repay the loan in question; if the credit status is "pull", the prospective debtor is estimated to be unable to repay the loan. The system was tested by comparing the results on the testing data with the learning data in three scenarios, and the data were found to be valid at over 70% for all scenarios. Moreover, generating the tree and the rules is fairly quick, taking no more than 15 minutes for each test scenario.
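
    C4.5 chooses split attributes by gain ratio, that is, information gain divided by the split information of the attribute. The sketch below computes it for categorical attributes. The attribute names and the four toy debtor records are hypothetical and only reuse the "paid off" and "pull" credit status labels from the abstract.

```python
from collections import Counter
from math import log2
from typing import Hashable, Sequence


def entropy(items: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of the value distribution in items."""
    n = len(items)
    return -sum((c / n) * log2(c / n) for c in Counter(items).values())


def gain_ratio(values: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """Gain ratio of splitting labels by a categorical attribute."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(values)  # entropy of the attribute itself
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical learning data: two attributes and the credit status target.
income = ["low", "low", "high", "high"]
region = ["north", "south", "north", "south"]
status = ["pull", "pull", "paid off", "paid off"]

print("income:", gain_ratio(income, status))
print("region:", gain_ratio(region, status))
```

    Here `income` separates the two credit statuses perfectly and would be selected for the root, while `region` carries no information about the target.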

  7. Prognostic Factors and Decision Tree for Long-term Survival in Metastatic Uveal Melanoma.

    Science.gov (United States)

    Lorenzo, Daniel; Ochoa, María; Piulats, Josep Maria; Gutiérrez, Cristina; Arias, Luis; Català, Jaum; Grau, María; Peñafiel, Judith; Cobos, Estefanía; Garcia-Bru, Pere; Rubio, Marcos Javier; Padrón-Pérez, Noel; Dias, Bruno; Pera, Joan; Caminal, Josep Maria

    2017-12-04

    The purpose of this study was to demonstrate the existence of a bimodal survival pattern in metastatic uveal melanoma. Secondary aims were to identify the characteristics and prognostic factors associated with long-term survival and to develop a clinical decision tree. The medical records of 99 metastatic uveal melanoma patients were retrospectively reviewed. Patients were classified as either short (≤ 12 months) or long-term survivors (> 12 months) based on a graphical interpretation of the survival curve after diagnosis of the first metastatic lesion. Ophthalmic and oncological characteristics were assessed in both groups. Of the 99 patients, 62 (62.6%) were classified as short-term survivors, and 37 (37.4%) as long-term survivors. The multivariate analysis identified the following predictors of long-term survival: age ≤ 65 years (p=0.012) and unaltered serum lactate dehydrogenase levels (p=0.018); additionally, the size (smaller vs. larger) of the largest liver metastasis showed a trend towards significance (p=0.063). Based on the variables significantly associated with long-term survival, we developed a decision tree to facilitate clinical decision-making. The findings of this study demonstrate the existence of a bimodal survival pattern in patients with metastatic uveal melanoma. The presence of certain clinical characteristics at diagnosis of distant disease is associated with long-term survival. A decision tree was developed to facilitate clinical decision-making and to counsel patients about the expected course of disease.

  8. Minimizing the cost of translocation failure with decision-tree models that predict species' behavioral response in translocation sites.

    Science.gov (United States)

    Ebrahimi, Mehregan; Ebrahimie, Esmaeil; Bull, C Michael

    2015-08-01

    The high number of failures is one reason why translocation is often not recommended. Considering how behavior changes during translocations may improve translocation success. To derive decision-tree models for species' translocation, we used data on the short-term responses of an endangered Australian skink in 5 simulated translocations with different release conditions. We used 4 different decision-tree algorithms (decision tree, decision-tree parallel, decision stump, and random forest) with 4 different criteria (gain ratio, information gain, gini index, and accuracy) to investigate how environmental and behavioral parameters may affect the success of a translocation. We assumed behavioral changes that increased dispersal away from a release site would reduce translocation success. The trees became more complex when we included all behavioral parameters as attributes, but these trees yielded more detailed information about why and how dispersal occurred. According to these complex trees, there were positive associations between some behavioral parameters, such as fight and dispersal, that showed there was a higher chance, for example, of dispersal among lizards that fought than among those that did not fight. Decision trees based on parameters related to release conditions were easier to understand and could be used by managers to make translocation decisions under different circumstances. © 2015 Society for Conservation Biology.
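
    Of the algorithms listed, the decision stump is the simplest: a tree with a single split. The sketch below picks that split by weighted Gini index, one of the four criteria named in the abstract. The attribute names and the four lizard records are hypothetical illustrations, not the study's data, and the function is not the software the authors used.

```python
from collections import Counter
from typing import Dict, Hashable, List, Sequence, Tuple


def gini(labels: Sequence[Hashable]) -> float:
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_stump(rows: List[Dict[str, Hashable]],
               labels: Sequence[Hashable]) -> Tuple[str, float]:
    """Return (attribute, weighted Gini) of the best single split."""
    n = len(labels)
    best = None
    for attr in rows[0]:
        groups = {}
        for row, y in zip(rows, labels):
            groups.setdefault(row[attr], []).append(y)
        score = sum(len(g) / n * gini(g) for g in groups.values())
        if best is None or score < best[1]:
            best = (attr, score)
    return best


# Hypothetical observations: did the lizard fight, was the release site shaded?
rows = [
    {"fought": 1, "shaded": 0},
    {"fought": 1, "shaded": 1},
    {"fought": 0, "shaded": 0},
    {"fought": 0, "shaded": 1},
]
labels = ["dispersed", "dispersed", "stayed", "stayed"]

print(best_stump(rows, labels))
```

    In this toy data the stump splits on `fought`, mirroring the positive association between fighting and dispersal described in the abstract.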

  9. Using attribute behavior diversity to build accurate decision tree committees for microarray data.

    Science.gov (United States)

    Han, Qian; Dong, Guozhu

    2012-08-01

    DNA microarrays (gene chips), frequently used in biological and medical studies, measure the expressions of thousands of genes per sample. Using microarray data to build accurate classifiers for diseases is an important task. This paper introduces an algorithm, called Committee of Decision Trees by Attribute Behavior Diversity (CABD), to build highly accurate ensembles of decision trees for such data. Since a committee's accuracy is greatly influenced by the diversity among its member classifiers, CABD uses two new ideas to "optimize" that diversity, namely (1) the concept of attribute behavior-based similarity between attributes, and (2) the concept of attribute usage diversity among trees. The ideas are effective for microarray data, since such data have many features and behavior similarity between genes can be high. Experiments on microarray data for six cancers show that CABD outperforms previous ensemble methods significantly and outperforms SVM, and show that the diversified features used by CABD's decision tree committee can be used to improve performance of other classifiers such as SVM. CABD has potential for other high-dimensional data, and its ideas may apply to ensembles of other classifier types.

  10. Application of Decision-Tree Model to Groundwater Productivity-Potential Mapping

    Directory of Open Access Journals (Sweden)

    Saro Lee

    2015-09-01

    Full Text Available For the sustainable use of groundwater, this study analyzed groundwater productivity-potential using a decision-tree approach in a geographic information system (GIS) in Boryeong and Pohang cities, Korea. The model was based on the relationship between groundwater-productivity data, including specific capacity (SPC), and its related hydrogeological factors. SPC data, which are measured and calculated for groundwater productivity, and data about related factors, including topography, lineament, geology, forest and soil data, were collected and input into a spatial database. A decision-tree model was applied and decision trees were constructed using the chi-squared automatic interaction detector (CHAID) and the quick, unbiased, and efficient statistical tree (QUEST) algorithms. The resulting groundwater-productivity-potential (GPP) maps were validated using area-under-the-curve (AUC) analysis with the well data that had not been used for training the model. In Boryeong city, the CHAID and QUEST algorithms had accuracies of 83.31% and 79.47%, and in Pohang city, the CHAID and QUEST algorithms had accuracies of 86.18% and 80.00%. As another validation, the GPP maps were compared with the actual SPC data. As a result, in Boryeong city, the CHAID and QUEST algorithms had accuracies of 96.55% and 94.92%, and in Pohang city, the CHAID and QUEST algorithms had accuracies of 87.88% and 87.50%. These results indicate that decision-tree models can be useful for development of groundwater resources.

  11. Amazigh Part-of-Speech Tagging Using Markov Models and Decision Trees

    OpenAIRE

    Samir AMRI; Lahbib ZENKOUAR; Outahajala, Mohamed

    2016-01-01

    The main goal of this work is the implementation of a new tool for the Amazigh part of speech tagging using Markov Models and decision trees. After studying different approaches and problems of part of speech tagging, we have implemented a tagging system based on TreeTagger - a generic stochastic tagging tool, very popular for its efficiency. We have gathered a working corpus, large enough to ensure a general linguistic coverage. This corpus has been used to run the tokenization process, as w...

  12. Binary Decision Trees for Preoperative Periapical Cyst Screening Using Cone-beam Computed Tomography.

    Science.gov (United States)

    Pitcher, Brandon; Alaqla, Ali; Noujeim, Marcel; Wealleans, James A; Kotsakis, Georgios; Chrepa, Vanessa

    2017-03-01

    Cone-beam computed tomographic (CBCT) analysis allows for 3-dimensional assessment of periradicular lesions and may facilitate preoperative periapical cyst screening. The purpose of this study was to develop and assess the predictive validity of a cyst screening method based on CBCT volumetric analysis alone or combined with designated radiologic criteria. Three independent examiners evaluated 118 presurgical CBCT scans from cases that underwent apicoectomies and had an accompanying gold standard histopathological diagnosis of either a cyst or granuloma. Lesion volume, density, and specific radiologic characteristics were assessed using specialized software. Logistic regression models with histopathological diagnosis as the dependent variable were constructed for cyst prediction, and receiver operating characteristic curves were used to assess the predictive validity of the models. A conditional inference binary decision tree based on a recursive partitioning algorithm was constructed to facilitate preoperative screening. Interobserver agreement was excellent for volume and density, but it varied from poor to good for the radiologic criteria. Volume and root displacement were strong predictors for cyst screening in all analyses. The binary decision tree classifier determined that if the volume of the lesion was >247 mm³, there was an 80% probability of a cyst. If volume was decision tree classifier renders it a useful preoperative cyst screening tool that can aid in clinical decision making but not a substitute for definitive histopathological diagnosis after biopsy. Confirmatory studies are required to validate the present findings. Published by Elsevier Inc.

  13. Validating a decision tree for serious infection: diagnostic accuracy in acutely ill children in ambulatory care.

    Science.gov (United States)

    Verbakel, Jan Y; Lemiengre, Marieke B; De Burghgraeve, Tine; De Sutter, An; Aertgeerts, Bert; Bullens, Dominique M A; Shinkins, Bethany; Van den Bruel, Ann; Buntinx, Frank

    2015-08-07

    Acute infection is the most common presentation of children in primary care, with only a few having a serious infection (eg, sepsis, meningitis, pneumonia). To avoid complications or death, early recognition and adequate referral are essential. Clinical prediction rules have the potential to improve diagnostic decision-making for rare but serious conditions. In this study, we aimed to validate a recently developed decision tree in a new but similar population. Diagnostic accuracy study validating a clinical prediction rule. Acutely ill children presenting to ambulatory care in Flanders, Belgium, consisting of general practice and paediatric assessment in outpatient clinics or the emergency department. Physicians were asked to score the decision tree in every child. The outcome of interest was hospital admission for at least 24 h with a serious infection within 5 days after initial presentation. We report the diagnostic accuracy of the decision tree in sensitivity, specificity, likelihood ratios and predictive values. In total, 8962 acute illness episodes were included, of which 283 led to admission to hospital with a serious infection. Sensitivity of the decision tree was 100% (95% CI 71.5% to 100%) at a specificity of 83.6% (95% CI 82.3% to 84.9%) in the general practitioner setting with 17% of children testing positive. In the paediatric outpatient and emergency department setting, sensitivities were below 92%, with specificities below 44.8%. In an independent validation cohort, this clinical prediction rule has been shown to be extremely sensitive in identifying children at risk of hospital admission for a serious infection in general practice, making it suitable for ruling out. NCT02024282. Published by the BMJ Publishing Group Limited.
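
    The accuracy measures named in the abstract above (sensitivity, specificity, likelihood ratios and predictive values) all follow from a 2x2 table of test result against outcome. The sketch below is illustrative only: the function name and the counts in the usage note are hypothetical and are not taken from the study.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Standard diagnostic accuracy measures from a 2x2 table.

    tp/fn: diseased cases testing positive/negative
    fp/tn: non-diseased cases testing positive/negative
    Assumes every row and column total is non-zero.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Likelihood ratios become infinite at perfect specificity/zero specificity.
        "lr_positive": sensitivity / (1 - specificity) if specificity < 1 else float("inf"),
        "lr_negative": (1 - sensitivity) / specificity if specificity > 0 else float("inf"),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }
```

    For example, with hypothetical counts `diagnostic_accuracy(tp=8, fp=10, fn=2, tn=90)` gives a sensitivity of 0.8 and a specificity of 0.9. A rule with very high sensitivity yields a small negative likelihood ratio, which is what makes it useful for ruling out.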

  14. Financial analysis of pruning ponderosa pine.

    Science.gov (United States)

    Roger D. Fight; Natalie A. Bolon; James M. Cahill

    1992-01-01

    A recent lumber recovery study of pruned and unpruned ponderosa pine (Pinus ponderosa Dougl. ex Laws.) was used to project the financial return from pruning ponderosa pine in the Medford District of the Bureau of Land Management and in the Ochoco and Deschutes National Forests. The cost of pruning at which the investment would yield an expected 4-...

  15. 7 CFR 993.5 - Prunes.

    Science.gov (United States)

    2010-01-01

    ... 7 Agriculture 8 2010-01-01 2010-01-01 false Prunes. 993.5 Section 993.5 Agriculture Regulations of the Department of Agriculture (Continued) AGRICULTURAL MARKETING SERVICE (Marketing Agreements and... Regulating Handling Definitions § 993.5 Prunes. Prunes means and includes all sun-dried or artificially...

  16. [The application of decision tree in the research of anemia among rural children under 3-year-old].

    Science.gov (United States)

    Ma, Yu-gang; Bi, Yu-xue; Yan, Hong; Deng, Li-na; Liang, Wei-feng; Wang, Bei; Zhang, Xue-li

    2009-05-01

    To study the application of decision trees in research on anemia among rural children. Using the Enterprise Miner module of SAS 8.2, 3000 observations were sampled from the database and a decision tree model was built. The model used a CART decision tree based on the Gini impurity index. The misclassification rate of the decision tree model was 21.2% on the training set and 21.9% on the validation set. The Root ASE of the decision tree model was 0.399 on the training set and 0.404 on the validation set. The area under the ROC curve was larger than the reference line. The diagnostic chart showed that the corresponding percentage was higher than the other. The decision tree model selected 9 important factors and ranked them by importance, among which maternal anemia (1.00) was the most important factor. The others were children's age (0.75), time of ablactation (0.53), mother's age (0.32), the time of egg supplementation (0.26), category of the project county (0.26), the time of milk supplementation (0.16), number of people in the family (0.13), and the education status of the mother (0.12). The decision tree produced simple rules that might be used for classification and prediction in similar research. The decision tree could screen out the important factors of anemia and identify the cut-points for those factors. With wider application, decision trees should prove valuable in research on rural children's health care.
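
    The Gini impurity index that CART uses to choose splits can be sketched in a few lines. This is a generic illustration of the criterion, not the SAS Enterprise Miner implementation; the function names are hypothetical.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a collection of class labels: 1 - sum_k p_k^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def split_gini(left, right):
    """Size-weighted Gini impurity of a binary split.

    CART prefers the candidate split with the lowest value.
    """
    n = len(left) + len(right)
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)
```

    A node with an even mix of two classes has impurity 0.5, and a split that separates the classes perfectly has weighted impurity 0.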

  17. Development of a Grapevine Pruning Algorithm for Using in Pruning

    Directory of Open Access Journals (Sweden)

    S. M Hosseini

    2017-10-01

    Full Text Available Introduction Large areas of the world's orchards are dedicated to cultivation of the grapevine. Normally grape vineyards are pruned twice a year. Among the operations of grape production, winter pruning of the bushes is the only operation that still has not been fully mechanized, although it is known as one of the most laborious jobs on the farm. Some of the grape-producing countries use various mechanical machines to prune the grapevines, but in most cases these machines do not perform well. Therefore, an intelligent pruning machine seems necessary, and such machines can reduce the labor required to prune the vineyards. In this study, an attempt was made to develop an algorithm that uses image processing techniques to identify which parts of the grapevine should be cut. The stereo vision technique was used to obtain three-dimensional images of the bare bushes whose leaves had fallen in autumn. Stereo vision systems are used to determine depth from two images taken at the same time but from slightly different viewpoints using two cameras. Each pair of images of a common scene is related by epipolar geometry, and corresponding points in the image pairs are constrained to lie on pairs of conjugate epipolar lines. Materials and Methods Photos were taken in gardens of the Research Center for Agriculture and Natural Resources of Fars province, Iran. At first, the distance between the plants and the cameras should be determined. This distance can be obtained using stereo vision techniques. Therefore, this method was used in this paper, with two pictures taken of each plant by the left and right cameras. The algorithm was written in MATLAB. To facilitate the segmentation of the branches from the rows at the back, a blue plate measuring 2×2 m was used as the background. 
After invoking the images, branches were segmented from the background to produce the binary
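
    The depth estimate that stereo vision provides can be illustrated with the standard relation for a rectified camera pair with parallel optical axes, Z = f * B / d, where f is the focal length in pixels, B the baseline between the cameras and d the horizontal disparity of a matched point. The abstract does not state which formulation was used, so the sketch below is an assumption; it is written in Python rather than the MATLAB used in the study, and all names and values are hypothetical.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Distance (in metres) to a point seen by a rectified stereo pair.

    focal_px:     focal length expressed in pixels
    baseline_m:   distance between the two camera centres in metres
    disparity_px: x_left - x_right for the matched point, in pixels
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity_px
```

    With a hypothetical focal length of 800 px and a 0.1 m baseline, a disparity of 40 px corresponds to a distance of 2 m; nearer branches produce larger disparities.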

  18. Decision tree approach for classification of remotely sensed satellite data using open source support

    Science.gov (United States)

    Sharma, Richa; Ghosh, Aniruddha; Joshi, P. K.

    2013-10-01

    In this study, an attempt has been made to develop a decision tree classification (DTC) algorithm for classification of remotely sensed satellite data (Landsat TM) using open source support. The decision tree is constructed by recursively partitioning the spectral distribution of the training dataset using WEKA, an open source data mining software package. The classified image is compared with the image classified using classical ISODATA clustering and Maximum Likelihood Classifier (MLC) algorithms. The classification result from the DTC method provided better visual depiction than the results produced by ISODATA clustering or by the MLC algorithm. The overall accuracy was found to be 90% (kappa = 0.88) using the DTC, 76.67% (kappa = 0.72) using the Maximum Likelihood Classifier and 57.5% (kappa = 0.49) using the ISODATA clustering method. Based on the overall accuracy and kappa statistics, the DTC was found to be preferable to the other classification approaches.
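
    Overall accuracy and the kappa statistic reported above are both derived from a confusion matrix of reference classes against predicted classes. The following sketch shows the usual computation of Cohen's kappa; the function name and the example matrix are hypothetical and unrelated to the study's data.

```python
def accuracy_and_kappa(cm):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    cm[i][j] is the number of samples of reference class i assigned to class j.
    """
    k = len(cm)
    n = sum(sum(row) for row in cm)
    # Observed agreement: fraction of samples on the diagonal.
    p_observed = sum(cm[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    p_expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    kappa = (p_observed - p_expected) / (1 - p_expected)
    return p_observed, kappa
```

    For the hypothetical matrix `[[45, 5], [10, 40]]` the overall accuracy is 0.85 and kappa is 0.70, which shows how kappa discounts the agreement that would be expected by chance.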

  19. Establishing a serologic decision tree model of extrapulmonary tuberculosis by MALDI-TOF MS analysis.

    Science.gov (United States)

    Deng, Chuiwen; Lin, Minggui; Hu, Chaojun; Li, Yanfeng; Gao, Yang; Cheng, Xiaoxing; Zhang, Fengchun; Dong, Mei; Li, Yongzhe

    2011-10-01

    Matrix-assisted laser desorption-ionization time of flight mass spectrometry (MALDI-TOF MS) combined with weak cationic exchange (WCX) magnetic beads was used to establish a decision tree model that distinguished extrapulmonary tuberculosis (EPTB) from non-EPTB individuals. Eighty-one patients with EPTB and 112 non-EPTB individuals (72 disease controls and 40 healthy controls) were involved in this study. The model was set up by 5 of 19 differentially expressed peaks (P EPTB from non-EPTB with a sensitivity of 97.7% and a specificity of 84.1%. The test set verified that this model had good sensitivity and specificity: 94.4% and 83.6%, respectively. In conclusion, MALDI-TOF MS combined with WCX magnetic beads is a powerful technology for constructing a decision tree model and the model we built could serve as a potential diagnostic tool for EPTB. Copyright © 2011 Elsevier Inc. All rights reserved.

  20. Three-dimensional object recognition using similar triangles and decision trees

    Science.gov (United States)

    Spirkovska, Lilly

    1993-01-01

    A system, TRIDEC, that is capable of distinguishing between a set of objects despite changes in the objects' positions in the input field, their size, or their rotational orientation in 3D space is described. TRIDEC combines very simple yet effective features with the classification capabilities of inductive decision tree methods. The feature vector is a list of all similar triangles defined by connecting all combinations of three pixels in a coarse coded 127 x 127 pixel input field. The classification is accomplished by building a decision tree using the information provided from a limited number of translated, scaled, and rotated samples. Simulation results are presented which show that TRIDEC achieves 94 percent recognition accuracy in the 2D invariant object recognition domain and 98 percent recognition accuracy in the 3D invariant object recognition domain after training on only a small sample of transformed views of the objects.

  1. Circum-Arctic petroleum systems identified using decision-tree chemometrics

    Science.gov (United States)

    Peters, K.E.; Ramos, L.S.; Zumberge, J.E.; Valin, Z.C.; Scotese, C.R.; Gautier, D.L.

    2007-01-01

    Source- and age-related biomarker and isotopic data were measured for more than 1000 crude oil samples from wells and seeps collected above approximately 55°N latitude. A unique, multitiered chemometric (multivariate statistical) decision tree was created that allowed automated classification of 31 genetically distinct circum-Arctic oil families based on a training set of 622 oil samples. The method, which we call decision-tree chemometrics, uses principal components analysis and multiple tiers of K-nearest neighbor and SIMCA (soft independent modeling of class analogy) models to classify and assign confidence limits for newly acquired oil samples and source rock extracts. Geochemical data for each oil sample were also used to infer the age, lithology, organic matter input, depositional environment, and identity of its source rock. These results demonstrate the value of large petroleum databases where all samples were analyzed using the same procedures and instrumentation. Copyright © 2007. The American Association of Petroleum Geologists. All rights reserved.

  2. USING DECISION TREES FOR ESTIMATING MODE CHOICE OF TRIPS IN BUCA-IZMIR

    Directory of Open Access Journals (Sweden)

    L. O. Oral

    2013-05-01

    Full Text Available Decision makers develop transportation plans and models for providing sustainable transport systems in urban areas. Mode Choice is one of the stages in transportation modelling. Data mining techniques can discover factors affecting the mode choice. These techniques can be applied with knowledge process approach. In this study a data mining process model is applied to determine the factors affecting the mode choice with decision trees techniques by considering individual trip behaviours from household survey data collected within Izmir Transportation Master Plan. From this perspective transport mode choice problem is solved on a case in district of Buca-Izmir, Turkey with CRISP-DM knowledge process model.

  3. Assisting Sustainable Forest Management and Forest Policy Planning with the Sim4Tree Decision Support System

    Directory of Open Access Journals (Sweden)

    Floris Dalemans

    2015-03-01

    Full Text Available As European forest policy increasingly focuses on multiple ecosystem services and participatory decision making, forest managers and policy planners have a need for integrated, user-friendly, broad spectrum decision support systems (DSS) that address risks and uncertainties, such as climate change, in a robust way and that provide credible advice in a transparent manner, enabling effective stakeholder involvement. The Sim4Tree DSS has been accordingly developed as a user-oriented, modular and multipurpose toolbox. Sim4Tree supports strategic and tactical forestry planning by providing simulations of forest development, ecosystem services potential and economic performance through time, from a regional to a stand scale, under various management and climate regimes. Sim4Tree allows comparing the performance of different scenarios with regard to diverse criteria so as to optimize management choices. This paper explains the concept, characteristics, functionalities, components and use of the current Sim4Tree DSS v2.5, which was parameterized for the region of Flanders, Belgium, but can be flexibly adapted to allow a broader use. When considering the current challenges for forestry DSS, an effort has been made towards the participatory component and towards integration, while the lack of robustness remains Sim4Tree’s weakest point. However, its structural flexibility allows many possibilities for future improvement and extension.

  4. Quantifying human and organizational factors in accident management using decision trees: the HORAAM method

    Energy Technology Data Exchange (ETDEWEB)

    Baumont, G.; Menage, F.; Schneiter, J.R.; Spurgin, A.; Vogel, A

    2000-11-01

    In the framework of the level 2 Probabilistic Safety Study (PSA 2) project, the Institute for Nuclear Safety and Protection (IPSN) has developed a method for taking into account Human and Organizational Reliability Aspects during accident management. Actions are taken during very degraded installation operations by teams of experts in the French framework of Crisis Organization (ONC). After describing the background of the framework of the Level 2 PSA, the French specific Crisis Organization and the characteristics of human actions in the Accident Progression Event Tree, this paper describes the method developed to introduce in PSA the Human and Organizational Reliability Analysis in Accident Management (HORAAM). This method is based on the Decision Tree method and has gone through a number of steps in its development. The first one was the observation of crisis center exercises, in order to identify the main influence factors (IFs) which affect human and organizational reliability. These IFs were used as headings in the Decision Tree method. Expert judgment was used in order to verify the IFs, to rank them, and to estimate the value of the aggregated factors to simplify the quantification of the tree. A tool based on Mathematica was developed to increase the flexibility and the efficiency of the study.

  5. Efficient, reliable and fast high-level triggering using a bonsai boosted decision tree

    Science.gov (United States)

    Gligorov, V. V.; Williams, M.

    2013-02-01

    High-level triggering is a vital component of many modern particle physics experiments. This paper describes a modification to the standard boosted decision tree (BDT) classifier, the so-called bonsai BDT, that has the following important properties: it is more efficient than traditional cut-based approaches; it is robust against detector instabilities, and it is very fast. Thus, it is fit-for-purpose for the online running conditions faced by any large-scale data acquisition system.

  6. Efficient, reliable and fast high-level triggering using a bonsai boosted decision tree

    CERN Document Server

    Gligorov, V.V.

    2013-01-01

    High-level triggering is a vital component in many modern particle physics experiments. This paper describes a modification to the standard boosted decision tree (BDT) classifier, the so-called "bonsai" BDT, that has the following important properties: it is more efficient than traditional cut-based approaches; it is robust against detector instabilities, and it is very fast. Thus, it is fit-for-purpose for the online running conditions faced by any large-scale data acquisition system.

  7. Advocating the broad use of the decision tree method in education

    OpenAIRE

    Almeida, Leandro S.; Gomes, Cristiano Mauro Assis

    2017-01-01

    Predictive studies have been widely undertaken in the field of education to provide strategic information about the extensive set of processes related to teaching and learning, as well as about what variables predict certain educational outcomes, such as academic achievement or dropout. As in any other area, there is a set of standard techniques that is usually used in predictive studies in the field education. Even though the Decision Tree Method is a well-known and standard approach in Data...

  8. The Studies of Decision Tree in Estimation of Breast Cancer Risk by Using Polymorphism Nucleotide

    OpenAIRE

    Frida Seyedmir; Kamal Mirzaie; Morteza Bitaraf Sani

    2017-01-01

    Abstract Introduction: Decision trees are data mining tools for collecting, accurately predicting, and sifting information from massive amounts of data, and they are widely used in computational biology and bioinformatics. In bioinformatics, they can be used to predict diseases, including breast cancer. The use of genomic data, including single nucleotide polymorphisms (SNPs), is a very important factor in predicting the risk of diseases. The number of seven important SNP among hundreds of thousan...

  9. Making Robust Classification Decisions: Constructing and Evaluating Fast and Frugal Trees (FFTs)

    OpenAIRE

    Neth, Hansjorg; Czienskowski, Uwe; Schooler, Lael; Gluck, Kevin

    2013-01-01

    Fast and Frugal Trees (FFTs) are a quintessential family of simple heuristics that allow effective and efficient binary classification decisions and often perform remarkably well when compared to more complex methods. This half-day tutorial will familiarize participants with examples of FFTs and elucidate the theoretical link between FFTs and signal detection theory (SDT). A range of presentations, practical exercises and interactive tools will enable participants to construct and evalu...
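
    An FFT checks one cue per level and allows an immediate exit at each level, so a case is classified after inspecting only as many cues as needed. The sketch below is a minimal illustration of that structure, not the tutorial's tooling; the cue names, thresholds and labels are hypothetical.

```python
def fft_classify(case, levels, default):
    """Classify a case with a fast and frugal tree.

    levels:  ordered list of (cue, threshold, exit_when_above, label).
             At each level the cue value is compared with the threshold;
             if the comparison matches exit_when_above, the tree exits
             with that label, otherwise the next cue is checked.
    default: label returned when the last level does not exit, which acts
             as the second exit of the final cue.
    """
    for cue, threshold, exit_when_above, label in levels:
        is_above = case[cue] > threshold
        if is_above == exit_when_above:
            return label
    return default


# Hypothetical two-cue tree: exit on high temperature, then on slow breathing.
EXAMPLE_LEVELS = [
    ("temp_c", 39.5, True, "refer"),
    ("breathing_rate", 30, False, "home"),
]
```

    With these made-up cues, `fft_classify({"temp_c": 40.0, "breathing_rate": 20}, EXAMPLE_LEVELS, "refer")` exits at the first level, while a case with normal temperature is decided by the second cue.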

  10. Yield and crop cycle time of peaches cultivated in subtropical climates and subjected to different pruning times

    Directory of Open Access Journals (Sweden)

    Rafael Augusto Ferraz

    2015-12-01

    Full Text Available The cultivation of peaches in regions of subtropical and tropical climate is currently achieved through a set of practices such as using less demanding cultivars in cold conditions, applying plant growth regulators to break dormancy, and performing specific pruning, like production and renewal pruning. Research on the climate adaptation of cultivars is of great importance in establishing a crop in a given region. Therefore, the objective of this study was to evaluate the agronomic performance of three cultivars subjected to different production pruning times in Botucatu/SP, where 2-year old peach trees were evaluated, grown at a spacing of 6.0 x 4.0 meters. The experimental design was a split plot design with four blocks, using the cultivars Douradão, BRS Kampai and BRS Rubimel, and the subplots corresponded to pruning times in May, June, July and August. Ten plants were used per plot, with the four central plants considered useful and the remaining considered as margins. Pruning in June and July showed the best results in terms of percentage of fruit set and production. The cultivar BRS Rubimel showed the best percentage of fruit set when pruned in June (44.96%), and the best fruit production when pruned in July (18.7 kg plant⁻¹). Pruning in May anticipated the harvest of cultivar BRS Rubimel by 13 days whereas pruning carried out in July and August provided late harvests for cultivars Douradão and BRS Kampai.

  11. Datamining techniques - decision tree: new view on nurses' intention to leave

    Directory of Open Access Journals (Sweden)

    Jiří Vévoda

    2016-12-01

    Full Text Available Aim: The aim of the survey is to identify factors of the work environment which are important for general nurses when they are considering whether or not to leave their current employer. Design: The research consists of an observational and a cross-sectional study. Methods: Based on a modified interpretation of Herzberg's theory, we created a structured interview to investigate environmental factors. Interviewers carried out 1,992 interviews with hospital nurses working in the Czech Republic, between 2011 and 2012. The data gathered were analyzed with data mining tools - a decision tree and non-parametric tests. Results: If a good opportunity arose, 34.7% of nurses would leave their current employer. The analysis of the decision tree identified the factor "Patient care", i.e. a factor concerning the nature of the work itself, as the most important. Data mining offers a new view of the data and can reveal valuable information existing within the primary data. Conclusion: Data mining has great potential in nursing. In this research, the decision tree shows that the essence of the nursing profession is the nursing work itself and it is also the most significant stabilizing factor. The management of healthcare providers should create and maintain a work environment which will ensure nursing work can be performed without impediment, thus minimizing staff turnover.

  12. MODIS Snow Cover Mapping Decision Tree Technique: Snow and Cloud Discrimination

    Science.gov (United States)

    Riggs, George A.; Hall, Dorothy K.

    2010-01-01

    Accurate mapping of snow cover continues to challenge cryospheric scientists and modelers. The Moderate-Resolution Imaging Spectroradiometer (MODIS) snow data products have been used since 2000 by many investigators to map and monitor snow cover extent for various applications. Users have reported on the utility of the products and also on problems encountered. Three problems or hindrances in the use of the MODIS snow data products that have been reported in the literature are: cloud obscuration, snow/cloud confusion, and snow omission errors in thin or sparse snow cover conditions. Implementation of the MODIS snow algorithm in a decision tree technique using surface reflectance input to mitigate those problems is being investigated. The objective of this work is to use a decision tree structure for the snow algorithm. This should alleviate snow/cloud confusion and omission errors and provide a snow map with classes that convey information on how snow was detected, e.g. snow under clear sky, snow under cloud, to give users flexibility in interpreting and deriving a snow map. Results of a snow cover decision tree algorithm are compared to the standard MODIS snow map and found to exhibit improved ability to alleviate snow/cloud confusion in some situations, allowing up to about a 5% increase in mapped snow cover extent, and thus accuracy, in some scenes.

  13. Decision Trees for Continuous Data and Conditional Mutual Information as a Criterion for Splitting Instances.

    Science.gov (United States)

    Drakakis, Georgios; Moledina, Saadiq; Chomenidis, Charalampos; Doganis, Philip; Sarimveis, Haralambos

    2016-01-01

    Decision trees are renowned in the computational chemistry and machine learning communities for their interpretability. Their capacity and usage are somewhat limited by the fact that they normally work on categorical data. Improvements to known decision tree algorithms are usually carried out by increasing and tweaking parameters, as well as the post-processing of the class assignment. In this work we attempted to tackle both these issues. Firstly, conditional mutual information was used as the criterion for selecting the attribute on which to split instances. The algorithm performance was compared with the results of C4.5 (WEKA's J48) using default parameters and no restrictions. Two datasets were used for this purpose, DrugBank compounds for HRH1 binding prediction and Traditional Chinese Medicine formulation predicted bioactivities for therapeutic class annotation. Secondly, an automated binning method for continuous data was evaluated, namely Scott's normal reference rule, in order to allow any decision tree to easily handle continuous data. This was applied to all approved drugs in DrugBank for predicting the RDKit SLogP property, using the remaining RDKit physicochemical attributes as input.
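
    Scott's normal reference rule sets the histogram bin width to h = 3.49 * s * n^(-1/3), where s is the sample standard deviation and n the number of values. The sketch below applies the rule to turn a continuous attribute into integer bin indices that a categorical decision tree could consume. It is an assumed reading of the binning step, not the authors' code, and the function names are hypothetical.

```python
import statistics


def scott_bin_width(values):
    """Bin width from Scott's normal reference rule: 3.49 * s * n^(-1/3)."""
    n = len(values)
    s = statistics.stdev(values)  # sample standard deviation
    return 3.49 * s * n ** (-1.0 / 3.0)


def bin_values(values):
    """Map each continuous value to the index of its equal-width bin."""
    h = scott_bin_width(values)
    if h == 0:
        # All values identical: a single bin is enough.
        return [0] * len(values)
    lowest = min(values)
    return [int((v - lowest) // h) for v in values]
```

    For the eight values 0 to 7 the rule gives a width of about 4.27, so the values fall into two bins.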

  14. Flood-type classification in mountainous catchments using crisp and fuzzy decision trees

    Science.gov (United States)

    Sikorska, Anna E.; Viviroli, Daniel; Seibert, Jan

    2015-10-01

    Floods are governed by largely varying processes and thus exhibit various behaviors. Classification of flood events into flood types and the determination of their respective frequency is therefore important for a better understanding and prediction of floods. This study presents a flood classification for identifying flood patterns at a catchment scale by means of a fuzzy decision tree. Hence, events are represented as a spectrum of six main possible flood types that are attributed with their degree of acceptance. Considered types are flash, short rainfall, long rainfall, snow-melt, rainfall on snow and, in high alpine catchments, glacier-melt floods. The fuzzy decision tree also makes it possible to acknowledge the uncertainty present in the identification of flood processes and thus allows for more reliable flood class estimates than using a crisp decision tree, which identifies one flood type per event. Based on the data set in nine Swiss mountainous catchments, it was demonstrated that this approach is less sensitive to uncertainties in the classification attributes than the classical crisp approach. These results show that the fuzzy approach bears additional potential for analyses of flood patterns at a catchment scale and thereby it provides more realistic representation of flood processes.

  15. Teratozoospermia Classification Based on the Shape of Sperm Head Using OTSU Threshold and Decision Tree

    Directory of Open Access Journals (Sweden)

    Masdiyasa I Gede Susrama

    2016-01-01

    Full Text Available Teratozoospermia is one of the results of expert analysis of male infertility, by conducting lab tests microscopically to determine the morphology of spermatozoa, one of which is the normal and abnormal form of the head of spermatozoa. The laboratory test results are in the form of a complete image of spermatozoa. In this study, the shape of the head of spermatozoa was taken from a WHO standards book. The pictures taken had a fairly clear imaging and still had noise, thus to differentiate between the head of normal and abnormal spermatozoa, several processes need to be performed, which include: a pre-process or image adjusting, a threshold segmentation process using Otsu threshold method, and a classification process using a decision tree. Training and test data are presented in stages, from 5 to 20 data. Test results of using Otsu segmentation and a decision tree produced different errors in each level of training data, which were 70%, 75%, and 80% for training data of size 5×2, 10×2, and 20×2, respectively, with an average error of 75%. Thus, this study of using Otsu threshold segmentation and a Decision Tree can classify the form of the head of spermatozoa as abnormal or Normal
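
    Otsu's method, used above for the segmentation step, picks the grey level that maximizes the between-class variance of the two pixel groups it separates. The following is a plain-Python sketch of that standard computation on a flat list of 8-bit grey values; it is not the authors' implementation, and the function name is hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t maximizing between-class variance.

    pixels: iterable of integer grey values in range(levels).
    Pixels with value <= t form one class, the rest the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    weight_low = 0      # number of pixels with value <= t
    sum_low = 0         # sum of grey values of those pixels
    best_t, best_var = 0, -1.0
    for t in range(levels):
        weight_low += hist[t]
        sum_low += t * hist[t]
        if weight_low == 0:
            continue    # no pixels in the low class yet
        weight_high = total - weight_low
        if weight_high == 0:
            break       # all pixels are in the low class
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        # Between-class variance up to the constant factor 1/total^2.
        between = weight_low * weight_high * (mean_low - mean_high) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t
```

    A binary mask then follows from `[p > t for p in pixels]`; which side counts as the sperm head depends on the image polarity.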

  16. Imitation learning of car driving skills with decision trees and random forests

    Directory of Open Access Journals (Sweden)

    Cichosz Paweł

    2014-09-01

    Full Text Available Machine learning is an appealing and useful approach to creating vehicle control algorithms, both for simulated and real vehicles. One common learning scenario that is often applicable is learning by imitation, in which the behavior of an exemplary driver provides training instances for a supervised learning algorithm. This article follows this approach in the domain of simulated car racing, using the TORCS simulator. In contrast to most prior work on imitation learning, a symbolic decision tree knowledge representation is adopted, which combines potentially high accuracy with human readability, an advantage that can be important in many applications. Decision trees are demonstrated to be capable of representing high quality control models, reaching the performance level of sophisticated pre-designed algorithms. This is achieved by enhancing the basic imitation learning scenario to include active retraining, automatically triggered on control failures. It is also demonstrated how better stability and generalization can be achieved by sacrificing human readability and using decision tree model ensembles. The methodology for learning control models contributed by this article can hopefully be applied to solve real-world control tasks, as well as to develop video game bots.

  17. LOCAL BINARIZATION FOR DOCUMENT IMAGES CAPTURED BY CAMERAS WITH DECISION TREE

    Directory of Open Access Journals (Sweden)

    Naser Jawas

    2012-07-01

    Full Text Available Character recognition in a document image captured by a digital camera requires a good binary image as input in order to separate the text from the background. Global binarization methods do not provide such good separation because of uneven lighting levels in images captured by cameras. Local binarization methods overcome this problem but require a way to partition the large image into local windows properly. In this paper, we propose a local binarization method with dynamic image partitioning that uses an integral image and a decision tree for the binarization decision. The integral image is used to estimate the number of lines in the document image. The number of lines is then used to divide the document into local windows. The decision tree decides the threshold in every local window. The results show that the proposed method separates the text from the background better than global thresholding, with a best OCR result on the binarized image of 99.4%.
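
The integral image mentioned in this abstract can be sketched as a summed-area table, which makes the sum over any rectangular window a constant-time lookup. This is a generic illustration, not the paper's code; the names `integral_image` and `window_sum` and the list-of-lists image layout are assumptions.

```python
def integral_image(img):
    """Build a summed-area table with an extra zero row and column.

    ii[r][c] holds the sum of all img values in rows < r and columns < c.
    """
    rows, cols = len(img), len(img[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0  # running sum of the current row
        for c in range(cols):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def window_sum(ii, top, left, bottom, right):
    """Sum of img over rows top..bottom-1 and columns left..right-1."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]
```

Row-wise window sums of this kind are one plausible way to locate text lines in a page image, since rows that cross text contain more dark pixels than the gaps between lines.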

  18. Decision-making model for early diagnosis of congestive heart failure using rough set and decision tree approaches.

    Science.gov (United States)

    Son, Chang-Sik; Kim, Yoon-Nyun; Kim, Hyung-Seop; Park, Hyoung-Seob; Kim, Min-Soo

    2012-10-01

    The accurate diagnosis of heart failure in emergency room patients is quite important, but can also be quite difficult due to our insufficient understanding of the characteristics of heart failure. The purpose of this study is to design a decision-making model that provides critical factors and knowledge associated with congestive heart failure (CHF) using an approach that makes use of rough sets (RSs) and decision trees. Among 72 laboratory findings, it was determined that two subsets (RBC, EOS, Protein, O2SAT, Pro BNP) in an RS-based model, and one subset (Gender, MCHC, Direct bilirubin, and Pro BNP) in a logistic regression (LR)-based model were indispensable factors for differentiating CHF patients from those with dyspnea, and the risk factor Pro BNP was particularly so. To demonstrate the usefulness of the proposed model, we compared the discriminatory power of decision-making models that utilize RS- and LR-based decision models by conducting 10-fold cross-validation. The experimental results showed that the RS-based decision-making model (accuracy: 97.5%, sensitivity: 97.2%, specificity: 97.7%, positive predictive value: 97.2%, negative predictive value: 97.7%, and area under ROC curve: 97.5%) consistently outperformed the LR-based decision-making model (accuracy: 88.7%, sensitivity: 90.1%, specificity: 87.5%, positive predictive value: 85.3%, negative predictive value: 91.7%, and area under ROC curve: 88.8%). In addition, a pairwise comparison of the ROC curves of the two models showed a statistically significant difference (p<0.01; 95% CI: 2.63-14.6). Copyright © 2012 Elsevier Inc. All rights reserved.
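
The evaluation measures reported in this abstract all derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the function name `diagnostic_metrics` and the example counts are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard binary diagnostic measures from confusion-matrix counts.

    tp/fn: diseased cases classified as positive/negative.
    tn/fp: non-diseased cases classified as negative/positive.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts, for illustration only.
example = diagnostic_metrics(tp=90, fp=10, tn=80, fn=20)
```

In a 10-fold cross-validation, such counts would typically be accumulated over the held-out folds before the measures are computed.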

  19. Using decision tree to predict serum ferritin level in women with anemia

    Directory of Open Access Journals (Sweden)

    Parisa Safaee

    2016-04-01

    Full Text Available Background: Data mining is known as a process of discovering and analysing large amounts of data in order to find meaningful rules and trends. In healthcare, data mining offers numerous opportunities to study the unknown patterns in a data set. Physicians can use these patterns for the diagnosis, prognosis and treatment of patients. The main objective of this study was to predict the level of serum ferritin in women with anemia and to specify the basic predictive factors of iron deficiency anemia using data mining techniques. Methods: In this research, 690 patients and 22 variables were studied in a population of women with anemia. The data include 11 laboratory and 11 clinical variables of patients who were referred to the laboratories of Imam Hossein and Shohada-E-Haft Tir hospitals from April 2013 to April 2014. The decision tree technique was used to build the model. Results: The accuracy of the decision tree with all the variables is 75%. Different combinations of variables were examined in order to determine the best predictive model. In the optimum decision tree model obtained, RBC, MCH, MCHC, gastrointestinal cancer and gastrointestinal ulcer were identified as the most important predictive factors. The results indicate that if the values of the MCV, MCHC and MCH variables are normal and the value of the RBC variable is below the normal limit, the patient is diagnosed with iron deficiency anemia with a likelihood of 90%. Conclusion: Given the simplicity and low cost of the complete blood count examination, the decision tree model was considered for diagnosing iron deficiency anemia in patients. In addition, the impact of new factors such as gastrointestinal hemorrhoids, gastrointestinal surgeries, different gastrointestinal diseases and gastrointestinal ulcers is considered in this paper, whereas previous studies were limited to assessing laboratory variables. The rules of the

  20. Improvement of adequate use of warfarin for the elderly using decision tree-based approaches.

    Science.gov (United States)

    Liu, K E; Lo, C-L; Hu, Y-H

    2014-01-01

    Due to the narrow therapeutic range and high drug-to-drug interactions (DDIs), improving the adequate use of warfarin for the elderly is crucial in clinical practice. This study examines whether the effectiveness of using warfarin among elderly inpatients can be improved when machine learning techniques and data from the laboratory information system are incorporated. Having employed 288 validated clinical cases in the DDI group and 89 cases in the non-DDI group, we evaluate the prediction performance of seven classification techniques, with and without an Adaptive Boosting (AdaBoost) algorithm. Measures including accuracy, sensitivity, specificity and area under the curve are used to evaluate model performance. Decision tree-based classifiers outperform other investigated classifiers in all evaluation measures. The classifiers supplemented with AdaBoost can generally improve the performance. In addition, weight, congestive heart failure, and gender are among the top three critical variables affecting prediction accuracy for the non-DDI group, while age, ALT, and warfarin doses are the most influential factors for the DDI group. Medical decision support systems incorporating decision tree-based approaches improve predicting performance and thus may serve as a supplementary tool in clinical practice. Information from laboratory tests and inpatients' history should not be ignored because related variables are shown to be decisive in our prediction models, especially when the DDIs exist.

  1. Socioeconomic determinants of menarche in rural Polish girls using the decision trees method.

    Science.gov (United States)

    Matusik, Stanisław; Laska-Mierzejewska, Teresa; Chrzanowska, Maria

    2011-05-01

    The aim of this study was to assess the usefulness of the decision trees method as a research method of multidimensional associations between menarche and socioeconomic variables. The article is based on data collected from the rural area of Choszczno in the West Pomerania district of Poland between 1987 and 2001. Girls were asked about the appearance of first menstruation (a yes/no method). The average menarchal age was estimated by the probit analysis method, using second-degree polynomials. The socioeconomic status of the girls' families was determined using five qualitative variables: fathers' and mothers' educational level, source of income, household appliances and the number of children in a family. For classification based on the five socioeconomic variables, CART (Classification and Regression Trees), one of the most effective algorithms, was used. In 2001 the menarchal age in 66% of examined girls was properly classified, while a higher efficiency of 70% was obtained for girls examined in 1987. The decision trees method enabled the definition of the hierarchy of socioeconomic variables influencing girls' biological development level. The strongest discriminatory power was attributed to the number of children in a family, and the mother's and then father's educational level. Using this method it is possible to detect differences in strength of socioeconomic variables associated with girls' pubescence before 1987 and after 2001 during the transformation of the economic and political systems in Poland. However, the decision trees method is infrequently applied in social sciences and constitutes a novelty; this article proves its usefulness in examining relations between biological processes and a population's living conditions.
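
CART classification trees commonly rank candidate splits by Gini impurity, which is how a hierarchy of variables such as the one described here emerges: the variable whose split lowers impurity most is placed nearest the root. The sketch below assumes the Gini criterion (the abstract does not state which criterion was used), and the names `gini` and `split_impurity` are illustrative.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_impurity(left, right):
    """Size-weighted Gini impurity of the two children of a binary split."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)
```

A split is preferred when `split_impurity` of its children is far below `gini` of the parent node.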

  2. Predicting volume of distribution with decision tree-based regression methods using predicted tissue:plasma partition coefficients.

    Science.gov (United States)

    Freitas, Alex A; Limbu, Kriti; Ghafourian, Taravat

    2015-01-01

    Volume of distribution is an important pharmacokinetic property that indicates the extent of a drug's distribution in the body tissues. This paper addresses the problem of how to estimate the apparent volume of distribution at steady state (Vss) of chemical compounds in the human body using decision tree-based regression methods from the area of data mining (or machine learning). Hence, the pros and cons of several different types of decision tree-based regression methods have been discussed. The regression methods predict Vss using, as predictive features, both the compounds' molecular descriptors and the compounds' tissue:plasma partition coefficients (Kt:p), which are often used in physiologically-based pharmacokinetics. Therefore, this work has assessed whether the data mining-based prediction of Vss can be made more accurate by using as input not only the compounds' molecular descriptors but also (a subset of) their predicted Kt:p values. Comparison of the models that used only molecular descriptors, in particular, the Bagging decision tree (mean fold error of 2.33), with those employing predicted Kt:p values in addition to the molecular descriptors, such as the Bagging decision tree using adipose Kt:p (mean fold error of 2.29), indicated that the use of predicted Kt:p values as descriptors may be beneficial for accurate prediction of Vss using decision trees if prior feature selection is applied. Decision tree-based models presented in this work have an accuracy that is reasonable and similar to the accuracy of reported Vss inter-species extrapolations in the literature. The estimation of Vss for new compounds in drug discovery will benefit from methods that are able to integrate large and varied sources of data and flexible non-linear data mining methods such as decision trees, which can produce interpretable models. Graphical abstract: Decision trees for the prediction of tissue partition coefficient and volume of distribution of drugs.
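
The abstract reports mean fold errors without defining the measure. A definition often used in pharmacokinetics is 10 raised to the mean absolute log10 ratio of predicted to observed values; the sketch below assumes that definition, which may differ from the one the authors used, and the function name is hypothetical.

```python
import math


def mean_fold_error(predicted, observed):
    """Geometric mean fold error, assuming positive predicted/observed values.

    A value of 1.0 means perfect prediction; 2.0 means predictions are off
    by a factor of two on average, in either direction.
    """
    logs = [abs(math.log10(p / o)) for p, o in zip(predicted, observed)]
    return 10 ** (sum(logs) / len(logs))
```

Under this definition, over-prediction and under-prediction by the same factor are penalized equally.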

  3. Modifiable risk factors predicting major depressive disorder at four year follow-up: a decision tree approach

    Directory of Open Access Journals (Sweden)

    Christensen Helen

    2009-11-01

    Full Text Available Background: Relative to physical health conditions such as cardiovascular disease, little is known about risk factors that predict the prevalence of depression. The present study investigates the expected effects of a reduction of these risks over time, using the decision tree method favoured in assessing cardiovascular disease risk. Methods: The PATH through Life cohort was used for the study, comprising 2,105 20-24 year olds, 2,323 40-44 year olds and 2,177 60-64 year olds sampled from the community in the Canberra region, Australia. A decision tree methodology was used to predict the presence of major depressive disorder after four years of follow-up. The decision tree was compared with a logistic regression analysis using ROC curves. Results: The decision tree was found to distinguish and delineate a wide range of risk profiles. Previous depressive symptoms were most highly predictive of depression after four years; however, modifiable risk factors such as substance use and employment status played significant roles in assessing the risk of depression. The decision tree was found to have better sensitivity and specificity than a logistic regression using identical predictors. Conclusion: The decision tree method was useful in assessing the risk of major depressive disorder over four years. Application of the model to the development of a predictive tool for tailored interventions is discussed.

  4. Cloud Detection from Satellite Imagery: A Comparison of Expert-Generated and Automatically-Generated Decision Trees

    Science.gov (United States)

    Shiffman, Smadar

    2004-01-01

    Automated cloud detection and tracking is an important step in assessing global climate change via remote sensing. Cloud masks, which indicate whether individual pixels depict clouds, are included in many of the data products that are based on data acquired on board Earth satellites. Many cloud-mask algorithms have the form of decision trees, which employ sequential tests that scientists designed based on empirical astrophysics studies and astrophysics simulations. Limitations of existing cloud masks restrict our ability to accurately track changes in cloud patterns over time. In this study we explored the potential benefits of automatically-learned decision trees for detecting clouds from images acquired using the Advanced Very High Resolution Radiometer (AVHRR) instrument on board the NOAA-14 weather satellite of the National Oceanic and Atmospheric Administration. We constructed three decision trees for a sample of 8 km daily AVHRR data from 2000 using a decision-tree learning procedure provided within MATLAB, and compared the accuracy of the decision trees to the accuracy of the cloud mask. We used ground observations collected by the National Aeronautics and Space Administration Clouds and the Earth's Radiant Energy Systems S'COOL project as the gold standard. For the sample data, the accuracy of automatically learned decision trees was greater than the accuracy of the cloud masks included in the AVHRR data product.

  5. Type 2 Diabetes Mellitus Screening and Risk Factors Using Decision Tree: Results of Data Mining.

    Science.gov (United States)

    Habibi, Shafi; Ahmadi, Maryam; Alizadeh, Somayeh

    2015-03-18

    The aim of this study was to examine a predictive model using features related to the diabetes type 2 risk factors. The data were obtained from a database in a diabetes control system in Tabriz, Iran. The data included all people referred for diabetes screening between 2009 and 2011. The features considered as "Inputs" were: age, sex, systolic and diastolic blood pressure, family history of diabetes, and body mass index (BMI). Moreover, we used diagnosis as "Class". We applied the "Decision Tree" technique and "J48" algorithm in the WEKA (3.6.10 version) software to develop the model. After data preprocessing and preparation, we used 22,398 records for data mining. The model precision to identify patients was 0.717. The age factor was placed in the root node of the tree as a result of higher information gain. The ROC curve indicates the model function in identification of patients and those individuals who are healthy. The curve indicates high capability of the model, especially in identification of the healthy persons. We developed a model using the decision tree for screening T2DM which did not require laboratory tests for T2DM diagnosis.
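
The statement that age was placed at the root because of its higher information gain can be made concrete with a small sketch of entropy and information gain for a categorical feature. This is a textbook formulation rather than WEKA's J48 code, which by default ranks splits by gain ratio and also handles numeric attributes; the function names and toy values are assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in class entropy obtained by splitting on a categorical feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy of the class after the split, weighted by branch size.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder
```

The attribute with the largest gain over the whole training set becomes the root test, and the procedure repeats within each branch.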

  6. Establishing Decision Trees for Predicting Successful Postpyloric Nasoenteric Tube Placement in Critically Ill Patients.

    Science.gov (United States)

    Chen, Weisheng; Sun, Cheng; Wei, Ru; Zhang, Yanlin; Ye, Heng; Chi, Ruibin; Zhang, Yichen; Hu, Bei; Lv, Bo; Chen, Lifang; Zhang, Xiunong; Lan, Huilan; Chen, Chunbo

    2016-08-31

    Despite the use of prokinetic agents, the overall success rate for postpyloric placement via a self-propelled spiral nasoenteric tube is quite low. This retrospective study was conducted in the intensive care units of 11 university hospitals from 2006 to 2016 among adult patients who underwent self-propelled spiral nasoenteric tube insertion. Success was defined as postpyloric nasoenteric tube placement confirmed by abdominal x-ray scan 24 hours after tube insertion. Chi-square automatic interaction detection (CHAID), simple classification and regression trees (SimpleCart), and J48 methodologies were used to develop decision tree models, and multiple logistic regression (LR) methodology was used to develop an LR model for predicting successful postpyloric nasoenteric tube placement. The area under the receiver operating characteristic curve (AUC) was used to evaluate the performance of these models. Successful postpyloric nasoenteric tube placement was confirmed in 427 of 939 patients enrolled. For predicting successful postpyloric nasoenteric tube placement, the performance of the 3 decision trees was similar in terms of the AUCs: 0.715 for the CHAID model, 0.682 for the SimpleCart model, and 0.671 for the J48 model. The AUC of the LR model was 0.729, which outperformed the J48 model. Both the CHAID and LR models achieved an acceptable discrimination for predicting successful postpyloric nasoenteric tube placement and were useful for intensivists in the setting of self-propelled spiral nasoenteric tube insertion. © 2016 American Society for Parenteral and Enteral Nutrition.
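
The AUC used to compare these models equals the probability that a randomly chosen successful case receives a higher predicted score than a randomly chosen unsuccessful one, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the function name `auc` and the 0/1 label coding are assumptions, and the quadratic loop is chosen for clarity rather than speed.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    `labels` uses 1 for the positive class and 0 for the negative class.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which places the values reported above (0.671 to 0.729) in context.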

  7. COST-BENEFIT ANALYSIS FOR MAKING DECISIONS ON INCENTIVES FOR INVESTMENTS IN PLUM TREES PLANTING

    Directory of Open Access Journals (Sweden)

    Marijan Karić

    2004-12-01

    Full Text Available In this paper we consider the application of the Cost/Benefit Analysis procedure in the decision process on the socio-economic profitability of implementing subsidies for investments in agricultural production, based on newly planted plum trees. Cost/Benefit Analysis has many advantages over other common methods. It proved to be especially useful in agricultural production, because it makes it possible to estimate the profitability of investments under the special conditions of agricultural production, taking into account many factors of its economic efficiency, as well as the main effects that individual producers and the whole social community can expect. The application of Cost/Benefit Analysis, based on data gathered for Bosnia and Herzegovina, gave insight into the profitability of the existing subsidy programs for investments in plum tree planting, which take place under conditions of economy-wide transition and a high degree of rural unemployment.

  8. Using Boosted Decision Trees to look for displaced Jets in the ATLAS Calorimeter

    CERN Multimedia

    CERN. Geneva

    2017-01-01

    A boosted decision tree is used to identify unique jets in a recently released conference note describing a search for long-lived particles decaying to hadrons in the ATLAS calorimeter. Neutral long-lived particles decaying to hadrons are “typical” signatures in many models, including Hidden Valley models, Higgs Portal models, Baryogenesis, Stealth SUSY, etc. Long-lived neutral particles that decay in the calorimeter leave behind an object that looks like a regular Standard Model jet, with subtle differences. For example, the later in the calorimeter the particle decays, the less energy is deposited in the early layers of the calorimeter. Because the jet does not originate at the interaction point, it will likely be narrower as reconstructed by the standard anti-kT jet reconstruction algorithm used by ATLAS. To separate the jets due to neutral long-lived decays from the Standard Model jets we used a boosted decision tree with thirteen variables as inputs. We used the information from the boosted decision...

  9. Decision-tree model of treatment-seeking behaviors after detecting symptoms by Korean stroke patients.

    Science.gov (United States)

    Oh, Hyo-Sook; Park, Hyeoun-Ae

    2006-06-01

    This study was performed to develop and test a decision-tree model of treatment-seeking behaviors about when Korean patients visit a doctor after experiencing stroke symptoms. The study used methodological triangulation. The model was developed based on qualitative data collected from in-depth interviews with 18 stroke patients. The model was tested using quantitative data collected from interviews and a structured questionnaire involving 150 stroke patients. The predictability of the decision-tree model was quantified as the proportion of participants who followed the pathway predicted by the model. Decision outcomes of the model were categorized into immediate and delayed treatment-seeking behavior. The model was influenced by lowered consciousness, social-group influences, perceived seriousness of symptoms, past history of hypertension or stroke, and barriers to hospital visits. The predictability of the model was found to be 90.7%. The results from this study can help healthcare personnel understand the education needs of stroke patients regarding treatment-seeking behaviors, and hence aid in the development of educational strategies for stroke patients.

  10. A web-based decision support system to enhance IPM programs in Washington tree fruit.

    Science.gov (United States)

    Jones, Vincent P; Brunner, Jay F; Grove, Gary G; Petit, Brad; Tangren, Gerald V; Jones, Wendy E

    2010-06-01

    Integrated pest management (IPM) decision-making has become more information intensive in Washington State tree crops in response to changes in pesticide availability, the development of new control tactics (such as mating disruption) and the development of new information on pest and natural enemy biology. The time-sensitive nature of the information means that growers must have constant access to a single source of verified information to guide management decisions. The authors developed a decision support system for Washington tree fruit growers that integrates environmental data [140 Washington State University (WSU) stations plus weather forecasts from NOAA], model predictions (ten insects, four diseases and a horticultural model), management recommendations triggered by model status and a pesticide database that provides information on non-target impacts on other pests and natural enemies. A user survey in 2008 found that the user base was providing recommendations for most of the orchards and acreage in the state, and that users estimated the value at $16 million per year. The design of the system facilitates education on a range of time-sensitive topics and will make it easy to incorporate other models, new management recommendations or information from new sensors as they are developed.

  11. Pešek lecture: SETI and society—decision trees

    Science.gov (United States)

    Billingham, John

    This paper presents a simplified decision tree diagram for SETI (the Search for Extraterrestrial Intelligence) and society. It deals with the series of steps and circumstances that follow from the quest for evidence of the existence of extraterrestrial civilizations, with the major goal of including those branch points which involve decisions that are societal rather than scientific or technological. Since SETI is based on science and technology these factors are also included in the decision diagram, but more in a summary fashion. A condensed list of the relevant societal disciplines is given. The most difficult decisions are those related to issues of transmitting communications from Earth to ETI. The diagram may be useful as a new way of looking at the subject of Communication with Extraterrestrial Intelligence (CETI), and how it inevitably blends into SETI. Arguments are made for vigorous pursuit of studies in all the societal, cultural and behavioral domains involved, and it is shown that many of these studies can profitably be undertaken now, before an ETI signal has been detected. Not least, it is argued that the newly emerging field of CETI and Society would benefit materially from the application of formal decision theory and analysis, and from game theory and utility theory.

  12. The risk factors of laryngeal pathology in Korean adults using a decision tree model.

    Science.gov (United States)

    Byeon, Haewon

    2015-01-01

    The purpose of this study was to identify risk factors affecting laryngeal pathology in the Korean population and to evaluate the derived prediction model. Cross-sectional study. Data were drawn from the 2008 Korea National Health and Nutritional Examination Survey. The subjects were 3135 persons (1508 male and 2114 female) aged 19 years and older living in the community. The independent variables were age, sex, occupation, smoking, alcohol drinking, and self-reported voice problems. A decision tree analysis was done to identify risk factors for predicting a model of laryngeal pathology. The significant risk factors of laryngeal pathology were age, gender, occupation, smoking, and self-reported voice problem in decision tree model. Four significant paths were identified in the decision tree model for the prediction of laryngeal pathology. Those identified as high risk groups for laryngeal pathology included those who self-reported a voice problem, those who were males in their 50s who did not recognize a voice problem, those who were not economically active males in their 40s, and male workers aged 19 and over and under 50 or 60 and over who currently smoked. The results of this study suggest that individual risk factors, such as age, sex, occupation, health behavior, and self-reported voice problem, affect the onset of laryngeal pathology in a complex manner. Based on the results of this study, early management of the high-risk groups is needed for the prevention of laryngeal pathology. Copyright © 2015 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  13. Neuron Networks and Trees of Decision-making for Prediction of Efficiency in Studies

    Directory of Open Access Journals (Sweden)

    Marijana Zekić-Sušac

    2009-12-01

    Full Text Available The paper deals with models for predicting students' efficiency with the help of neuron networks and decision-making classification trees, and then with the analysis of factors that influence the efficiency of students. A created model, based on demographic data of students as well as their behaviour and attitudes toward learning, tries to classify a student into one of the two efficiency categories. The efficiency is measured by the average of marks during studies. Various architectures of neuron networks have been trained and tested, and the best model is obtained with the help of a stratified perceptron network. The trees of decision-making offered a significantly better accuracy than neuron networks, and we suggest using them as a more precise method for the set of observed data. A sensitivity analysis of output variables on the input ones, carried out with neuron networks, indicates that preliminary exams, attendance of exercises, importance of marks to students, and scholarships are among the most significant factors for the efficiency of students. The trees of decision-making separated the most significant variables: the time spent in learning, attendance of exercises and the sorts of materials from which students learn. Future research, with an increased number of input variables, an enlarged sample and methodological expansion to other artificial intelligence techniques and statistical methods, would make it possible to create a more successful model to be the basis for building a decision-making support system in university level education.

  14. Decision tree-based learning to predict patient controlled analgesia consumption and readjustment

    Science.gov (United States)

    2012-01-01

    Background: Appropriate postoperative pain management contributes to earlier mobilization, shorter hospitalization, and reduced cost. The undertreatment of pain may impede short-term recovery and have a detrimental long-term effect on health. This study focuses on Patient Controlled Analgesia (PCA), which is a delivery system for pain medication. This study proposes and demonstrates how to use machine learning and data mining techniques to predict analgesic requirements and PCA readjustment. Methods: The sample in this study included 1099 patients. Every patient was described by 280 attributes, including the class attribute. In addition to commonly studied demographic and physiological factors, this study emphasizes attributes related to PCA. We used decision tree-based learning algorithms to predict analgesic consumption and PCA control readjustment based on the first few hours of PCA medications. We also developed a nearest neighbor-based data cleaning method to alleviate the class-imbalance problem in PCA setting readjustment prediction. Results: The prediction accuracies of total analgesic consumption (continuous dose and PCA dose) and PCA analgesic requirement (PCA dose only) by an ensemble of decision trees were 80.9% and 73.1%, respectively. Decision tree-based learning outperformed Artificial Neural Network, Support Vector Machine, Random Forest, Rotation Forest, and Naïve Bayesian classifiers in analgesic consumption prediction. The proposed data cleaning method improved the performance of every learning method in this study of PCA setting readjustment prediction. Comparative analysis identified the informative attributes from the data mining models and compared them with the correlates of analgesic requirement reported in previous works. Conclusion: This study presents a real-world application of data mining to anesthesiology. Unlike previous research, this study considers a wider variety of predictive factors, including PCA demands over time. We analyzed

  15. A decision tree-based approach for determining low bone mineral density in inflammatory bowel disease using WEKA software.

    Science.gov (United States)

    Firouzi, Farzad; Rashidi, Marjan; Hashemi, Sattar; Kangavari, Mohammadreza; Bahari, Ali; Daryani, Naser Ebrahimi; Emam, Mohammad Mehdi; Naderi, Nosratollah; Shalmani, Hamid Mohaghegh; Farnood, Alma; Zali, Mohammadreza

    2007-12-01

    Decision tree classification is a standard machine learning technique that has been used for a wide range of applications. Patients with inflammatory bowel disease (IBD) are at increased risk of developing low bone mineral density (BMD). This study aimed to develop a new approach for selecting truly affected IBD patients who are indicated for densitometry, thereby subjecting fewer patients to bone densitometry and reducing expenses. Simple decision trees were developed with the WEKA (Waikato Environment for Knowledge Analysis) package of machine learning algorithms to predict factors influencing bone density among IBD patients. The BMD status was the outcome variable, whereas age, sex, duration of disease, smoking status, corticosteroid use, oral contraceptive use, calcium or vitamin D supplementation, menstruation, milk abstinence, BMI, and levels of calcium, phosphorus, alkaline phosphatase, and 25-OH vitamin D were all attributes. Testing showed the decision trees to have sensitivities of 65.7-82.8%, specificities of 95.2-96.3%, accuracies of 86.2-89.8%, and Matthews correlation coefficients of 0.68-0.79. Smoking status was the most significant node (root) for the ulcerative colitis and IBD-associated trees, whereas calcium status was the root of the Crohn's disease patients' decision tree. IBD specialists could use such decision trees to substantially reduce the number of patients referred for bone densitometry and potentially save resources.
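
    The four evaluation measures quoted in this abstract (sensitivity, specificity, accuracy, and the Matthews correlation coefficient) can all be derived from a binary confusion matrix. The following is a minimal sketch of those standard formulas; the function name and the counts are hypothetical and are not taken from the study.

```python
import math


def confusion_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy, MCC) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, accuracy, mcc


# Hypothetical counts for illustration only (not the study's data).
sens, spec, acc, mcc = confusion_metrics(tp=24, fp=3, tn=60, fn=8)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f} MCC={mcc:.3f}")
```

    Unlike accuracy, the Matthews coefficient uses all four cells of the matrix, which is why it is often reported alongside accuracy when the classes are imbalanced.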

  16. Improving medical diagnosis reliability using Boosted C5.0 decision tree empowered by Particle Swarm Optimization.

    Science.gov (United States)

    Pashaei, Elnaz; Ozen, Mustafa; Aydin, Nizamettin

    2015-08-01

    Improving the accuracy of supervised classification algorithms in biomedical applications is an active area of research. In this study, we improve the performance of Particle Swarm Optimization (PSO) combined with the C4.5 decision tree (PSO+C4.5) classifier by applying a Boosted C5.0 decision tree as the fitness function. To evaluate the effectiveness of the proposed method, it is applied to 1 microarray dataset and 5 different medical datasets obtained from the UCI machine learning databases. Moreover, the results of the PSO + Boosted C5.0 implementation are compared to eight well-known benchmark classification methods (PSO+C4.5, support vector machine with a Radial Basis Function kernel, Classification And Regression Tree (CART), C4.5 decision tree, C5.0 decision tree, Boosted C5.0 decision tree, Naive Bayes and Weighted K-Nearest Neighbor). Repeated five-fold cross-validation was used to assess the performance of the classifiers. Experimental results show that the proposed method not only improves the performance of PSO+C4.5 but also obtains higher classification accuracy than the other classification methods.

  17. Comparison of neurofuzzy logic and decision trees in discovering knowledge from experimental data of an immediate release tablet formulation.

    Science.gov (United States)

    Shao, Q; Rowe, R C; York, P

    2007-06-01

    Understanding of the cause-effect relationships between formulation ingredients, process conditions and product properties is essential for developing a quality product. However, the formulation knowledge is often hidden in experimental data and not easily interpretable. This study compares neurofuzzy logic and decision tree approaches in discovering hidden knowledge from an immediate release tablet formulation database relating formulation ingredients (silica aerogel, magnesium stearate, microcrystalline cellulose and sodium carboxymethylcellulose) and process variables (dwell time and compression force) to tablet properties (tensile strength, disintegration time, friability, capping and drug dissolution at various time intervals). Both approaches successfully generated useful knowledge in the form of either "if then" rules or decision trees. Although different strategies are employed by the two approaches in generating rules/trees, similar knowledge was discovered in most cases. However, as decision trees are not able to deal with continuous dependent variables, data discretisation procedures are generally required.
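
    The discretisation step mentioned at the end of this abstract can be illustrated with equal-width binning, one common way to turn a continuous dependent variable into class labels before training a classification tree. This is only a sketch under that assumption; the abstract does not say which discretisation procedure was used, and the function name and sample values are hypothetical.

```python
def equal_width_bins(values, n_bins):
    """Map each continuous value to a bin index in 0..n_bins-1 using equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        # All values identical: everything falls into a single bin.
        return [0] * len(values)
    labels = []
    for v in values:
        idx = int((v - lo) / width)
        labels.append(min(idx, n_bins - 1))  # the maximum value belongs to the last bin
    return labels


# Hypothetical tensile-strength measurements (illustrative values only).
tensile_strength = [0.8, 1.1, 1.9, 2.4, 3.0]
classes = equal_width_bins(tensile_strength, 3)
print(classes)
```

    The resulting integer labels can then serve as the class attribute of a decision tree, at the cost of losing the ordering and spacing information inside each bin.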

  18. Applications of urban tree canopy assessment and prioritization tools: supporting collaborative decision making to achieve urban sustainability goals

    Science.gov (United States)

    Dexter H. Locke; J. Morgan Grove; Michael Galvin; Jarlath P.M. ONeil-Dunne; Charles. Murphy

    2013-01-01

    Urban Tree Canopy (UTC) Prioritizations can be both a set of geographic analysis tools and a planning process for collaborative decision-making. In this paper, we describe how UTC Prioritizations can be used as a planning process to provide decision support to multiple government agencies, civic groups and private businesses to aid in reaching a canopy target. Linkages...

  19. Decision Tree Classifier for Classification of Plant and Animal Micro RNA's

    Science.gov (United States)

    Pant, Bhasker; Pant, Kumud; Pardasani, K. R.

    Gene expression is regulated by miRNAs, or micro RNAs, which can be 21-23 nucleotides in length. They are non-coding RNAs which control gene expression either by translation repression or by mRNA degradation. Plants and animals both contain miRNAs, which have been classified by wet lab techniques. These techniques are highly expensive, labour intensive and time consuming. Hence, faster and more economical computational approaches are needed. In view of the above, a machine learning model has been developed for classification of plant and animal miRNAs using a decision tree classifier. The model has been tested on available data and gives results with 91% accuracy.

  20. An Examination of Mathematically Gifted Students' Learning Styles by Decision Trees

    Directory of Open Access Journals (Sweden)

    Esra Aksoy

    2015-12-01

    Full Text Available The aim of this study was to examine mathematically gifted students' learning styles through a data mining method. The ‘Learning Style Inventory’ and the ‘Multiple Intelligences Scale’ were used to collect data. The sample included 234 mathematically gifted middle school students. A decision tree was constructed and examined to predict mathematically gifted students’ learning styles from their multiple intelligences, gender and grade level. Results showed that all the variables used in the study had a significant effect on mathematically gifted students’ learning styles, but the most influential attribute was intelligence type.

  1. Development of a decision tree to determine appropriateness of NVivo in analyzing qualitative data sets.

    Science.gov (United States)

    Auld, Garry W; Diker, Ann; Bock, M Ann; Boushey, Carol J; Bruhn, Christine M; Cluskey, Mary; Edlefsen, Miriam; Goldberg, Dena L; Misner, Scottie L; Olson, Beth H; Reicks, Marla; Wang, Changzheng; Zaghloul, Sahar

    2007-01-01

    A decision tree was developed to determine when NVivo is an appropriate tool for qualitative analysis. NVivo, a qualitative analysis software package, was used to analyze interviews of 204 Asian, Hispanic, and white parents in 12 states. The experience provided insight into issues that should be considered when deciding to use the software. NVivo can enhance the qualitative research process, quickly process queries, and expand analytical avenues. Before using, however, the following must be considered: training time, establishing inter-coder reliability, number and length of documents, coding time, coding structure, use of automated coding, and possible need for separate databases or additional supporting software.

  2. SITUATIONAL CONTROL OF HOT BLAST STOVES GROUP BASED ON DECISION TREE

    Directory of Open Access Journals (Sweden)

    E. I. Kobysh

    2016-09-01

    Full Text Available This paper develops a control system for a group of hot blast stoves, which operates on the basis of a packing heating control subsystem and a subsystem for forecasting mode durations within the hot blast stove APCS for iron smelting in a blast furnace. Using multi-criteria optimization methods, the behavior of the control system is adjusted to take into account the current production situation arising during packing heating in each hot blast stove of the group. A situation recognition algorithm and a method for choosing control scenarios based on a decision tree were developed.

  3. Achieving high recognition reliability using decision trees and AdaBoost

    Science.gov (United States)

    Xiang, Jianying; Tu, Xiao; Lu, Yue; Wang, Patrick S. P.

    2008-01-01

    Recognition rate is traditionally used as the main criterion for evaluating the performance of a recognition system. High recognition reliability with low misclassification rate is also a must for many applications. To handle the variability of the writing style of different individuals, this paper employs decision trees and WRB AdaBoost to design a classifier with high recognition reliability for recognizing Bangla handwritten numerals. Experiments on the numeral images obtained from real Bangladesh envelopes show that the proposed recognition method is capable of achieving high recognition reliability with acceptable recognition rate.

  4. A comparison of student academic achievement using decision trees techniques: Reflection from University Malaysia Perlis

    Science.gov (United States)

    Aziz, Fatihah; Jusoh, Abd Wahab; Abu, Mohd Syafarudy

    2015-05-01

    A decision tree is one of the data mining techniques used for prediction. With this method, hidden information can be extracted from abundant data and interpreted as useful knowledge. In this paper, student academic performance from 2002 to 2012 is examined for two faculties, the Faculty of Manufacturing Engineering and the Faculty of Microelectronic Engineering, at University Malaysia Perlis (UniMAP). The objectives of this study are to determine and compare the factors that affect students' academic achievement in the two faculties. The prediction results show that five attributes were identified as factors influencing students' academic performance.

  5. Diagnostic Features of Common Oral Ulcerative Lesions: An Updated Decision Tree

    OpenAIRE

    Hamed Mortazavi; Yaser Safi; Maryam Baharvand; Somayeh Rahmani

    2016-01-01

    Diagnosis of oral ulcerative lesions might be quite challenging. This narrative review article aims to introduce an updated decision tree for diagnosing oral ulcerative lesions on the basis of their diagnostic features. Various general search engines and specialized databases including PubMed, PubMed Central, Medline Plus, EBSCO, Science Direct, Scopus, Embase, and authenticated textbooks were used to find relevant topics by means of MeSH keywords such as “oral ulcer,” “stomatitis,” and “mout...

  6. Decision support for mitigating the risk of tree induced transmission line failure in utility rights-of-way.

    Science.gov (United States)

    Poulos, H M; Camp, A E

    2010-02-01

    Vegetation management is a critical component of rights-of-way (ROW) maintenance for preventing electrical outages and safety hazards resulting from tree contact with conductors during storms. Northeast Utility's (NU) transmission lines are a critical element of the nation's power grid; NU is therefore under scrutiny from federal agencies charged with protecting the electrical transmission infrastructure of the United States. We developed a decision support system to focus right-of-way maintenance and minimize the potential for a tree fall episode that disables transmission capacity across the state of Connecticut. We used field data on tree characteristics to develop a system for identifying hazard trees (HTs) in the field using limited equipment to manage Connecticut power line ROW. Results from this study indicated that the tree height-to-diameter ratio, total tree height, and live crown ratio were the key characteristics that differentiated potential risk trees (danger trees) from trees with a high probability of tree fall (HTs). Products from this research can be transferred to adaptive right-of-way management, and the methods we used have great potential for future application to other regions of the United States and elsewhere where tree failure can disrupt electrical power.

  7. Integrating Decision Tree and Hidden Markov Model (HMM) for Subtype Prediction of Human Influenza A Virus

    Science.gov (United States)

    Attaluri, Pavan K.; Chen, Zhengxin; Weerakoon, Aruna M.; Lu, Guoqing

    Multiple criteria decision making (MCDM) has a significant impact in bioinformatics. In the research reported here, we explore the integration of decision tree (DT) and Hidden Markov Model (HMM) for subtype prediction of human influenza A virus. Infection with influenza viruses continues to be an important public health problem. Viral strains of subtypes H3N2 and H1N1 circulate in humans at least twice annually. Subtype detection depends mainly on the antigenic assay, which is time-consuming and not fully accurate. We have developed a Web system for accurate subtype detection of human influenza virus sequences. The preliminary experiment showed that this system is easy to use and powerful in identifying human influenza subtypes. Our next step is to examine the informative positions at the protein level and extend its current functionality to detect more subtypes. The web functions can be accessed at http://glee.ist.unomaha.edu/.

  8. Expert System for Diagnosing Diseases of Pregnancy Using the Dempster-Shafer Method and Decision Tree

    Directory of Open Access Journals (Sweden)

    joko popo minardi

    2016-01-01

    Full Text Available Dempster-Shafer theory is a mathematical theory of evidence based on belief functions and plausible reasoning, which is used to combine separate pieces of information. It is an alternative to traditional probability theory for the mathematical representation of uncertainty. In diagnosing diseases of pregnancy, the information obtained from the patient is sometimes incomplete; with the Dempster-Shafer method and expert system rules, incomplete combinations of symptoms can still be combined to reach an appropriate diagnosis, while the decision tree is used as a decision support tool for tracing disease symptoms. This research aims to develop an expert system that can diagnose diseases of pregnancy using the Dempster-Shafer method and produce a trust value for each diagnosis. Based on the results of diagnostic testing of the Dempster-Shafer method and the expert system, the resulting accuracy was 76%.   Keywords: Expert system; Diseases of pregnancy; Dempster Shafer
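
    The way Dempster-Shafer theory combines separate pieces of evidence can be sketched with Dempster's rule of combination: masses of intersecting hypotheses are multiplied and summed, and the result is renormalised by the non-conflicting mass. The code below is a generic illustration of that rule, not the expert system described in the abstract; the disease labels `d1`, `d2`, `d3` and the mass values are hypothetical.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule of combination.

    Each mass function maps a frozenset of hypotheses to a mass; masses sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b  # mass assigned to contradictory evidence
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict and cannot be combined")
    return {h: mass / (1.0 - conflict) for h, mass in combined.items()}


# Hypothetical frame of discernment and symptom evidence (illustrative only).
frame = frozenset({"d1", "d2", "d3"})
symptom_1 = {frozenset({"d1"}): 0.6, frame: 0.4}   # 0.4 left uncommitted
symptom_2 = {frozenset({"d2"}): 0.5, frame: 0.5}
belief = combine(symptom_1, symptom_2)
for hypothesis, mass in sorted(belief.items(), key=lambda kv: -kv[1]):
    print(sorted(hypothesis), round(mass, 4))
```

    Assigning the remaining mass to the whole frame is how the theory represents incomplete information: the evidence is left uncommitted rather than split evenly among the diseases.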

  9. Predicting Short-Term Subway Ridership and Prioritizing Its Influential Factors Using Gradient Boosting Decision Trees

    Directory of Open Access Journals (Sweden)

    Chuan Ding

    2016-10-01

    Full Text Available Understanding the relationship between short-term subway ridership and its influential factors is crucial to improving the accuracy of short-term subway ridership prediction. Although there has been a growing body of studies on short-term ridership prediction approaches, limited effort has been made to investigate short-term subway ridership prediction considering bus transfer activities and temporal features. To fill this gap, a relatively recent data mining approach called gradient boosting decision trees (GBDT) is applied to short-term subway ridership prediction and used to capture the associations with the independent variables. Taking three subway stations in Beijing as the cases, the short-term subway ridership and alighting passengers from its adjacent bus stops are obtained based on transit smart card data. To optimize the model performance with different combinations of regularization parameters, a series of GBDT models are built with various learning rates and tree complexities by fitting a maximum of trees. The optimal model performance confirms that the gradient boosting approach can incorporate different types of predictors, fit complex nonlinear relationships, and automatically handle the multicollinearity effect with high accuracy. In contrast to other machine learning methods, or “black-box” procedures, the GBDT model can identify and rank the relative influences of bus transfer activities and temporal features on short-term subway ridership. These findings suggest that the GBDT model has considerable advantages in improving short-term subway ridership prediction in a multimodal public transportation system.
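
    The mechanics behind gradient boosting decision trees, including the learning rate and the ranking of relative influences mentioned in this abstract, can be sketched in a few lines. The example below is a simplified illustration under stated assumptions, not the authors' model: it uses squared-error loss, depth-1 trees (stumps), and measures influence as each feature's share of the total squared-error reduction. The features (hour of day, bus alightings) and all data values are hypothetical.

```python
def fit_stump(X, residuals):
    """Find the single-feature threshold split that minimises squared error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    return best  # None when no valid split exists


def fit_gbdt(X, y, n_trees=50, learning_rate=0.1):
    """Boost stumps on the residuals of the current prediction."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps, influence = [], [0.0] * len(X[0])
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        sse, j, t, lm, rm = stump
        # Credit the split feature with the squared-error reduction it achieved.
        influence[j] += sum(r * r for r in residuals) - sse
        stumps.append((j, t, lm, rm))
        pred = [p + learning_rate * (lm if row[j] <= t else rm)
                for p, row in zip(pred, X)]
    total = sum(influence) or 1.0
    return {"base": base, "rate": learning_rate, "stumps": stumps,
            "influence": [v / total for v in influence]}


def predict(model, row):
    out = model["base"]
    for j, t, lm, rm in model["stumps"]:
        out += model["rate"] * (lm if row[j] <= t else rm)
    return out


# Hypothetical data: [hour of day, bus alightings] -> subway ridership.
X = [[7, 120], [8, 300], [9, 260], [12, 90], [17, 280], [18, 310], [22, 40]]
y = [400, 950, 820, 300, 900, 980, 150]
model = fit_gbdt(X, y)
print("relative influence:", [round(v, 3) for v in model["influence"]])
```

    A smaller learning rate shrinks each tree's contribution, so more trees are needed to reach the same fit; this is the trade-off explored when a series of models is built with various learning rates and tree complexities.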

  10. Categorization of 77 dystrophin exons into 5 groups by a decision tree using indexes of splicing regulatory factors as decision markers.

    Science.gov (United States)

    Malueka, Rusdy Ghazali; Takaoka, Yutaka; Yagi, Mariko; Awano, Hiroyuki; Lee, Tomoko; Dwianingsih, Ery Kus; Nishida, Atsushi; Takeshima, Yasuhiro; Matsuo, Masafumi

    2012-03-31

    Duchenne muscular dystrophy, a fatal muscle-wasting disease, is characterized by dystrophin deficiency caused by mutations in the dystrophin gene. Skipping of a target dystrophin exon during splicing with antisense oligonucleotides is attracting much attention as the most plausible way to express dystrophin in DMD. Antisense oligonucleotides have been designed against splicing regulatory sequences such as splicing enhancer sequences of target exons. Recently, we reported that a chemical kinase inhibitor specifically enhances the skipping of mutated dystrophin exon 31, indicating the existence of exon-specific splicing regulatory systems. However, the basis for such individual regulatory systems is largely unknown. Here, we categorized the dystrophin exons in terms of their splicing regulatory factors. Using a computer-based machine learning system, we first constructed a decision tree separating 77 authentic from 14 known cryptic exons using 25 indexes of splicing regulatory factors as decision markers. We evaluated the classification accuracy of a novel cryptic exon (exon 11a) identified in this study. However, the tree mislabeled exon 11a as a true exon. Therefore, we re-constructed the decision tree to separate all 15 cryptic exons. The revised decision tree categorized the 77 authentic exons into five groups. Furthermore, all nine disease-associated novel exons were successfully categorized as exons, validating the decision tree. One group, consisting of 30 exons, was characterized by a high density of exonic splicing enhancer sequences. This suggests that AOs targeting splicing enhancer sequences would efficiently induce skipping of exons belonging to this group. The decision tree categorized the 77 authentic exons into five groups. Our classification may help to establish the strategy for exon skipping therapy for Duchenne muscular dystrophy.

  11. Termination of pregnancy for fetal abnormalities: main arguments and a decision-tree model.

    Science.gov (United States)

    Kose, Semir; Altunyurt, Sabahattin; Yıldırım, Nuri; Keskinoğlu, Pembe; Çankaya, Tufan; Bora, Elçin; Erçal, Derya; Özer, Erdener

    2015-11-01

    By looking through our ethical committee cases, we demonstrate the main arguments we use for making a judgment in the face of fetal abnormalities. Our decision-making model is a simplified algorithm of the arguments and concepts we use in scientific-ethical discussion. A retrospective analysis was conducted at a single tertiary referral center of patients evaluated for fetal abnormalities from 2004 to 2014. We hypothesized that all our judgments would fit into a decision-tree model. A total of 553 fetal abnormality cases were discussed; 348 (63%) were given a termination of pregnancy (TOP) proposal. Cases with detected genetic disorders (n = 100) and cases with mental retardation risk (n = 93) ended up with a TOP proposal. For cases incompatible with life (n = 111) and multimorbidity cases (n = 44), the committee suggested TOP regardless of gestational age. The highest family approval ratio was in the chromosomal abnormalities/genetic disorders group (93%), and the lowest was in the mental retardation risk group (80%). Continuously changing literature on prenatal and postnatal therapy options and on the long-term outcome of various fetal abnormalities influences committee decisions. Theoretically high success rates and inconsistent data on the long-term prognosis of some anomaly groups resulted in heterogeneous decisions and varying approval ratios. © 2015 John Wiley & Sons, Ltd.

  12. Non-compliance with a postmastectomy radiotherapy guideline: decision tree and cause analysis.

    Science.gov (United States)

    Razavi, Amir R; Gill, Hans; Ahlfeldt, Hans; Shahsavar, Nosrat

    2008-09-21

    The guideline for postmastectomy radiotherapy (PMRT), which is prescribed to reduce recurrence of breast cancer in the chest wall and improve overall survival, is not always followed. Identifying and extracting important patterns of non-compliance are crucial in maintaining the quality of care in oncology. Analysis of 759 patients with malignant breast cancer using decision tree induction (DTI) found patterns of non-compliance with the guideline. The PMRT guideline was used to separate cases according to the recommendation to receive or not receive PMRT. The two groups of patients were analyzed separately. Resulting patterns were transformed into rules that were then compared with the reasons that were extracted by manual inspection of records for the non-compliant cases. Analyzing patients in the group who should receive PMRT according to the guideline did not result in a robust decision tree. However, classification of the other group, patients who should not receive PMRT treatment according to the guideline, resulted in a tree with nine leaves, three of which represented non-compliance with the guideline. In a comparison between rules resulting from these three non-compliant patterns and manual inspection of patient records, the following was found: In the decision tree, presence of perigland growth is the most important variable, followed by number of malignantly invaded lymph nodes and level of progesterone receptor. DNA index, age, size of the tumor and level of estrogen receptor are also involved but with less importance. From manual inspection of the cases, the most frequent pattern for non-compliance is age above the threshold, followed by near cut-off values for risk factors and unknown reasons. Comparison of patterns of non-compliance acquired from data mining and manual inspection of patient records demonstrates that not all of the non-compliances are repetitive or important. There are some overlaps between important variables acquired from manual

  13. The Legacy of Past Tree Planting Decisions for a City Confronting Emerald Ash Borer (Agrilus planipennis) Invasion

    Directory of Open Access Journals (Sweden)

    Christopher Sean Greene

    2016-03-01

    Full Text Available Management decisions grounded in ecological understanding are essential to the maintenance of a healthy urban forest. Decisions about where and what tree species to plant have both short and long-term consequences for the future function and resilience of city trees. Through the construction of a theoretical damage index, this study examines the legacy effects of a street tree planting program in a densely populated North American city confronting an invasion of emerald ash borer (Agrilus planipennis). An investigation of spatial autocorrelation for locations of high damage potential across the City of Toronto, Canada was then conducted using Getis-Ord Gi*. Significant spatial clustering of high damage index values affirmed that past urban tree planting practices placing little emphasis on species diversity have created time-lagged consequences of enhanced vulnerability of trees to insect pests. Such consequences are observed at the geographically local scale, but can easily cascade to become multi-scalar in their spatial extent. The theoretical damage potential index developed in this study provides a framework for contextualizing historical urban tree planting decisions where analysis of damage index values for Toronto reinforces the importance of urban forest management that prioritizes proactive tree planting strategies that consider species diversity in the context of planting location.
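
    The Getis-Ord Gi* statistic named in this abstract yields a z-score per location: large positive values mark clusters of high values (hot spots) and large negative values mark clusters of low values. The sketch below uses the standard Gi* formula with binary distance-band weights, which is an assumption; the abstract does not state the weighting scheme, and the coordinates, radius and damage index values are hypothetical.

```python
import math


def getis_ord_gi_star(values, coords, radius):
    """Gi* z-score for every location, using binary weights within `radius`.

    Each location counts as its own neighbour, which is what the star denotes.
    """
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(max(0.0, sum(v * v for v in values) / n - mean ** 2))
    scores = []
    for xi, yi in coords:
        w = [1.0 if math.hypot(xi - xj, yi - yj) <= radius else 0.0
             for xj, yj in coords]
        sum_w = sum(w)
        sum_w2 = sum(wi * wi for wi in w)
        numerator = sum(wi * v for wi, v in zip(w, values)) - mean * sum_w
        denominator = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        # Undefined when all values are equal or every point is a neighbour.
        scores.append(numerator / denominator if denominator else 0.0)
    return scores


# Hypothetical damage index values along a street (illustrative only).
coords = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0)]
damage = [9, 8, 9, 1, 2, 1]
z = getis_ord_gi_star(damage, coords, radius=1.0)
print([round(v, 3) for v in z])
```

    In this toy layout the high-damage end of the street receives positive scores and the low-damage end negative ones, which is the kind of clustering signal the study reports for high damage index values.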

  14. A Data Mining Model for Predicting Hypertension in Pregnancy Using the Decision Tree Technique

    Directory of Open Access Journals (Sweden)

    Ari Muzakir

    2016-06-01

    Full Text Available The prevalence of hypertension among pregnant women was 1,062 cases (12.7%). Of the 1,062 cases of pregnant women with hypertension, 125 cases (11.8%) had been diagnosed with hypertension by health workers. RSIA YK Madira Palembang, as a health center, needs to develop a method that can predict high-risk pregnant women with hypertension from antenatal examination data. Using a data source consisting of antenatal care data, a data mining technique with the C4.5 decision tree algorithm was applied, based on Knowledge Discovery in Databases (KDD), in order to discover knowledge, information, and hidden patterns in the antenatal care data that serve to predict hypertension in pregnancy. The method used was the C4.5 algorithm. After obtaining a decision tree and rules that can predict hypertension in pregnancy, an evaluation with a supplied test set in WEKA yielded an error of 7.3427% and an accuracy of 92.6573%. With 286 instances of training data, this indicates that 265 instances were predicted accurately and 21 instances were errors, that is, predicted incorrectly.
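
    The accuracy and error percentages reported in this abstract follow directly from its instance counts (265 correct and 21 incorrect out of 286). A short check of that arithmetic, with variable names chosen here for illustration:

```python
correct, wrong = 265, 21            # instance counts reported in the abstract
total = correct + wrong             # 286 instances in all

accuracy_pct = round(100 * correct / total, 4)
error_pct = round(100 * wrong / total, 4)
print(total, accuracy_pct, error_pct)
```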

  15. Identifying controlling factors of ground-level ozone levels over southwestern Taiwan using a decision tree

    Science.gov (United States)

    Chu, Hone-Jay; Lin, Chuan-Yao; Liau, Churn-Jung; Kuo, Yi-Ming

    2012-12-01

    Kaohsiung City and the suburban region of southwestern Taiwan have suffered from severe air pollution since becoming the largest center of heavy industry in Taiwan. The complex process of ozone (O3) formation from its precursor compounds (volatile organic compound (VOC) and nitrogen oxide (NOx) emissions), together with meteorological conditions, makes controlling ozone difficult. A decision tree is especially appropriate for analyzing time series data that contain ozone levels and meteorological and explanatory variables for ozone formation. Results show that dominant variables such as temperature, wind speed, VOCs, and NOx can play vital roles in describing ozone variations among observations. The high correlation of temperature and wind speed with ozone levels indicates that these meteorological conditions largely affect ozone variability. The results also demonstrate that the spatial heterogeneity of ozone patterns between coastal and inland areas is caused by the sea-land breeze and pollutant sources during high ozone episodes over southwestern Taiwan. This study used a decision tree to obtain quantitative insight into spatial distributions of precursor compound emissions and effects of meteorological conditions on ozone levels that are useful for refining monitoring plans and developing management strategies.

  16. Computational Prediction of Blood-Brain Barrier Permeability Using Decision Tree Induction

    Directory of Open Access Journals (Sweden)

    Jörg Huwyler

    2012-08-01

    Full Text Available Predicting blood-brain barrier (BBB) permeability is essential to drug development, as a molecule cannot exhibit pharmacological activity within the brain parenchyma without first transiting this barrier. Understanding the process of permeation, however, is complicated by a combination of both limited passive diffusion and active transport. Our aim here was to establish predictive models for BBB drug permeation that include both active and passive transport. A database of 153 compounds was compiled using in vivo surface permeability product (logPS) values in rats as a quantitative parameter for BBB permeability. The open source Chemical Development Kit (CDK) was used to calculate physico-chemical properties and descriptors. Predictive computational models were implemented by machine learning paradigms (decision tree induction) on both descriptor sets. Models with a corrected classification rate (CCR) of 90% were established. Mechanistic insight into BBB transport was provided by an Ant Colony Optimization (ACO)-based binary classifier analysis to identify the most predictive chemical substructures. Decision trees revealed descriptors of lipophilicity (aLogP) and charge (polar surface area), which were also previously described in models of passive diffusion. However, measures of molecular geometry and connectivity were found to be related to an active drug transport component.

  17. APPLICATION OF DECISION TREE ON COLLISION AVOIDANCE SYSTEM DESIGN AND VERIFICATION FOR QUADCOPTER

    Directory of Open Access Journals (Sweden)

    C.-W. Chen

    2017-08-01

    Full Text Available The purpose of this research is to build a collision avoidance system for quadcopters using a decision tree algorithm. When the ultrasonic range finder judges that the distance is within the collision avoidance interval, control of the UAV's altitude passes from the operator to the system. The appropriate pitch angle is obtained from previous experience of operating quadcopters. The UAS implements the following three motions to avoid collisions: Case 1, initial slow avoidance stage; Case 2, slow avoidance stage; and Case 3, rapid avoidance stage. The training data from the collision avoidance tests are then transmitted to the ground station via a wireless transmission module for further analysis. The entire decision tree algorithm of the collision avoidance system, the data transmission, and the ground station have been verified in flight tests. In the flight tests, the quadcopter could implement avoidance motions in real time and move away from obstacles steadily. Within the avoidance area, the collision avoidance system has higher authority than the operator and carries out the avoidance process. The quadcopter can successfully fly away from obstacles at 1.92 meters per second, and the minimum distance between the quadcopter and the obstacle is 1.05 meters.

  18. A decision tree for the management of exposed cervical dentin (ECD) and dentin hypersensitivity (DHS).

    Science.gov (United States)

    Martens, Luc C

    2013-03-01

    Dentin hypersensitivity (DHS) is a problematic clinical entity that may become an increasing clinical problem for dentists to treat as a consequence of patients retaining their teeth throughout life and improved oral hygiene practices. The aim of this review was to develop a decision tree for the management of exposed cervical dentin (ECD) and DHS. A brief PUBMED literature search was performed on dentin hypersensitivity using "MeSH" terms, "review", and "management". In addition, some websites and local guidelines were screened. From this review, it became clear that all dentate patients should routinely be screened for ECD and DHS. In this way, underdiagnosis of the condition will be avoided and preventive management can be initiated early. A decision tree process and a flowchart for daily practice were designed, which should be started as soon as a patient presents with ECD or suffers from DHS. This approach takes into account the possible improved quality of life of the patient and is further based on a hierarchy of treatment options. In this respect, active management of DHS will usually involve a combination of at-home and in-office therapies. Starting with the use of desensitizing toothpastes is strongly recommended.

  19. A Modular Approach Utilizing Decision Tree in Teaching Integration Techniques in Calculus

    Directory of Open Access Journals (Sweden)

    Edrian E. Gonzales

    2015-08-01

    This study was conducted to test the effectiveness of a modular approach using a decision tree in teaching integration techniques in Calculus. It sought an answer to the question: Is there a significant difference between the mean scores of two groups of students in their quizzes on (1) integration by parts and (2) integration by trigonometric transformation? Twenty-eight second-year B.S. Computer Science students at City College of Calamba who were enrolled in Mathematical Analysis II for the second semester of school year 2013-2014 were purposively chosen as respondents. The study made use of the non-equivalent control group posttest-only design of quasi-experimental research. The experimental group was taught using the modular approach while the comparison group was exposed to traditional instruction. The research instruments used were two twenty-item multiple-choice quizzes. Statistical treatment used the mean, standard deviation, Shapiro-Wilk test for normality, two-tailed t-test for independent samples, and Mann-Whitney U-test. The findings led to the conclusion that modular and traditional instruction were equally effective in facilitating the learning of integration by parts. The other result revealed that the modular approach utilizing a decision tree was more effective than the traditional method in teaching integration by trigonometric transformation.

  20. Application of Decision Tree on Collision Avoidance System Design and Verification for Quadcopter

    Science.gov (United States)

    Chen, C.-W.; Hsieh, P.-H.; Lai, W.-H.

    2017-08-01

    The purpose of the research is to build a collision avoidance system for quadcopters using a decision tree algorithm. When the ultrasonic range finder judges that the distance is within the collision avoidance interval, control of the UAV's altitude passes from the operator to the system. Based on prior experience operating quadcopters, an appropriate pitch angle can be obtained. The UAS implements the following three motions to avoid collisions: Case 1, initial slow avoidance stage; Case 2, slow avoidance stage; and Case 3, rapid avoidance stage. The training data from the collision avoidance tests are then transmitted to the ground station via a wireless transmission module for further analysis. The entire decision tree algorithm of the collision avoidance system, the data transmission, and the ground station have been verified in flight tests. In the flight tests, the quadcopter can perform avoidance motions in real time and move away from obstacles steadily. Within the avoidance area, the collision avoidance system has higher authority than the operator and carries out the avoidance process. The quadcopter can successfully fly away from obstacles at 1.92 meters per second, and the minimum distance between the quadcopter and the obstacle is 1.05 meters.

  1. Application of a hybrid association rules/decision tree model for drought monitoring

    Science.gov (United States)

    Nourani, Vahid; Molajou, Amir

    2017-12-01

    Previous research has shown that incorporating oceanic-atmospheric climate phenomena such as Sea Surface Temperature (SST) into hydro-climatic models can provide important predictive information about hydro-climatic variability. In this paper, a hybrid application of two data mining techniques (decision tree and association rules) is proposed to discover the relationship between drought at the Tabriz and Kermanshah synoptic stations (located in Iran) and the de-trended SSTs of the Black, Mediterranean and Red Seas. The two major steps of the proposed model are classifying the de-trended SST data and selecting the most effective groups, and extracting the hidden information in the data. Decision trees, which can identify the useful traits in a data set for classification, were used for classification and for selecting the most effective groups, and association rules were employed to extract the hidden predictive information from the large observed data set. To examine the accuracy of the rules, confidence and Heidke Skill Score (HSS) measures were calculated and compared for the different lag times considered. The computed measures confirm the reliable performance of the proposed hybrid data mining method in forecasting drought, and the results show a relative correlation between the Mediterranean, Black and Red Sea de-trended SSTs and drought at the Tabriz and Kermanshah synoptic stations, such that the confidence between the monthly Standardized Precipitation Index (SPI) values and the de-trended SSTs of the seas is higher than 70% and 80% for the Tabriz and Kermanshah synoptic stations, respectively.
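
    The two rule-quality measures named in this abstract can be sketched as follows. This is a minimal illustration using the standard textbook definitions (rule confidence as a conditional frequency, and the Heidke Skill Score for a 2x2 contingency table); the function names, argument names, and example counts are hypothetical, and the abstract does not say how the authors tabulated their forecasts.

```python
def rule_confidence(n_antecedent, n_both):
    """Confidence of an association rule A -> B: the fraction of cases
    matching A that also match B."""
    if n_antecedent == 0:
        raise ValueError("rule antecedent never occurs")
    return n_both / n_antecedent


def heidke_skill_score(hits, false_alarms, misses, correct_negatives):
    """Heidke Skill Score for a 2x2 forecast/observation table.

    1.0 is a perfect forecast; 0.0 is no better than chance.
    """
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    denominator = (a + c) * (c + d) + (a + b) * (b + d)
    if denominator == 0:
        raise ValueError("degenerate contingency table")
    return 2.0 * (a * d - b * c) / denominator


# Hypothetical counts, for illustration only (not taken from the study).
confidence = rule_confidence(n_antecedent=40, n_both=30)
hss = heidke_skill_score(hits=30, false_alarms=10, misses=5, correct_negatives=55)
```

    Confidence alone can look high when the predicted class is common, which is one reason a chance-corrected score such as HSS is often reported alongside it.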

  2. Using decision-tree classifier systems to extract knowledge from databases

    Science.gov (United States)

    St.clair, D. C.; Sabharwal, C. L.; Hacke, Keith; Bond, W. E.

    1990-01-01

    One difficulty in applying artificial intelligence techniques to the solution of real world problems is that the development and maintenance of many AI systems, such as those used in diagnostics, require large amounts of human resources. At the same time, databases frequently exist which contain information about the process(es) of interest. Recently, efforts to reduce development and maintenance costs of AI systems have focused on using machine learning techniques to extract knowledge from existing databases. Research is described in the area of knowledge extraction using a class of machine learning techniques called decision-tree classifier systems. Results of this research suggest ways of performing knowledge extraction which may be applied in numerous situations. In addition, a measurement called the concept strength metric (CSM) is described which can be used to determine how well the resulting decision tree can differentiate between the concepts it has learned. The CSM can be used to determine whether or not additional knowledge needs to be extracted from the database. An experiment involving real world data is presented to illustrate the concepts described.

  3. Identification of Biomarkers for Esophageal Squamous Cell Carcinoma Using Feature Selection and Decision Tree Methods

    Directory of Open Access Journals (Sweden)

    Chun-Wei Tung

    2013-01-01

    Esophageal squamous cell cancer (ESCC) is one of the most common fatal human cancers. The identification of biomarkers for early detection could be a promising strategy to decrease mortality. Previous studies utilized microarray techniques to identify more than one hundred genes; however, it is desirable to identify a small set of biomarkers for clinical use. This study proposes a sequential forward feature selection algorithm to design decision tree models for discriminating ESCC from normal tissues. Two potential biomarkers, RUVBL1 and CNIH, were identified and validated based on two publicly available microarray datasets. To test the discrimination ability of the two biomarkers, 17 pairs of expression profiles of ESCC and normal tissues from Taiwanese male patients were measured by using microarray techniques. The classification accuracies of the two biomarkers in all three datasets were higher than 90%. Interpretable decision tree models were constructed to analyze expression patterns of the two biomarkers. RUVBL1 was consistently overexpressed in all three datasets, although we found inconsistent CNIH expression possibly affected by the diverse major risk factors for ESCC across different areas.
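
    Sequential forward feature selection, the general technique this abstract names, can be sketched as a greedy loop. The sketch below is a generic version under stated assumptions, not the authors' implementation: `score` is a hypothetical callable standing in for something like the cross-validated accuracy of a decision tree trained on the given feature subset, and the feature names and toy scoring function are invented for illustration.

```python
def sequential_forward_selection(features, score, max_features):
    """Greedily add the feature that most improves score(subset).

    Stops when max_features is reached or no candidate improves the score.
    Returns (selected_features, best_score).
    """
    selected = []
    remaining = list(features)
    best_score = float("-inf")
    while remaining and len(selected) < max_features:
        # Evaluate every one-feature extension of the current subset.
        candidates = [(score(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(candidates, key=lambda pair: pair[0])
        if top_score <= best_score:
            break  # no extension helps; keep the smaller subset
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
    return selected, best_score


def toy_score(subset):
    """Hypothetical stand-in for classifier accuracy: each feature adds a
    fixed gain, and every extra feature costs a small penalty."""
    gains = {"g1": 0.30, "g2": 0.15, "g3": 0.02}
    return 0.5 + sum(gains[f] for f in subset) - 0.05 * len(subset)


chosen, chosen_score = sequential_forward_selection(["g1", "g2", "g3"], toy_score, 3)
```

    With the toy scores above, the loop keeps two features and rejects the third, which mirrors the stated goal of a small biomarker set.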

  4. Prediction of antimicrobial activity of synthetic peptides by a decision tree model.

    Science.gov (United States)

    Lira, Felipe; Perez, Pedro S; Baranauskas, José A; Nozawa, Sérgio R

    2013-05-01

    Antimicrobial resistance is a persistent problem in the public health sphere. However, recent attempts to find effective substitutes to combat infections have been directed at identifying natural antimicrobial peptides in order to circumvent resistance to commercial antibiotics. This study describes the development of synthetic peptides with antimicrobial activity, created in silico by site-directed mutation modeling using wild-type peptides as scaffolds for these mutations. Fragments of antimicrobial peptides were used for modeling with molecular modeling computational tools. To analyze these peptides, a decision tree model was created, which indicated the range of microorganism types on which the peptides can exercise biological activity. The decision tree model was built using physicochemical properties of known antimicrobial peptides available in the Antimicrobial Peptide Database (APD). The two most promising peptides were synthesized, and antimicrobial assays showed inhibitory activity against Gram-positive and Gram-negative bacteria. Colossomin C and colossomin D were the most inhibitory peptides at 5 μg/ml against Staphylococcus aureus and Escherichia coli. The methods described in this work and the results obtained are useful for the identification and development of new compounds with antimicrobial activity through the use of computational tools.

  5. Recognition of Protozoa and Metazoa using image analysis tools, discriminant analysis, neural networks and decision trees.

    Science.gov (United States)

    Ginoris, Y P; Amaral, A L; Nicolau, A; Coelho, M A Z; Ferreira, E C

    2007-07-09

    Protozoa and metazoa are considered good indicators of the treatment quality in activated sludge systems due to the fact that these organisms are fairly sensitive to physical, chemical and operational processes. Therefore, it is possible to establish close relationships between the predominance of certain species or groups of species and several operational parameters of the plant, such as the biotic indices, namely the Sludge Biotic Index (SBI). This procedure requires the identification, classification and enumeration of the different species, which is usually achieved manually implying both time and expertise availability. Digital image analysis combined with multivariate statistical techniques has proved to be a useful tool to classify and quantify organisms in an automatic and not subjective way. This work presents a semi-automatic image analysis procedure for protozoa and metazoa recognition developed in Matlab language. The obtained morphological descriptors were analyzed using discriminant analysis, neural network and decision trees multivariable statistical techniques to identify and classify each protozoan or metazoan. The obtained procedure was quite adequate for distinguishing between the non-sessile protozoa classes and also for the metazoa classes, with high values for the overall species recognition with the exception of sessile protozoa. In terms of the wastewater conditions assessment the obtained results were found to be suitable for the prediction of these conditions. Finally, the discriminant analysis and neural networks results were found to be quite similar whereas the decision trees technique was less appropriate.

  6. Cardiovascular Dysautonomias Diagnosis Using Crisp and Fuzzy Decision Tree: A Comparative Study.

    Science.gov (United States)

    Kadi, Ilham; Idri, Ali

    2016-01-01

    Decision trees (DTs) are one of the most popular techniques for learning classification systems, especially when it comes to learning from discrete examples. In the real world, much data occurs in fuzzy form. Hence a DT must be able to deal with such fuzzy data. In fact, integrating fuzzy logic when dealing with imprecise and uncertain data reduces uncertainty and makes it possible to model fine knowledge details. In this paper, a fuzzy decision tree (FDT) algorithm was applied to a dataset extracted from the ANS (Autonomic Nervous System) unit of the Moroccan university hospital Avicenne. This unit specializes in performing several dynamic tests to diagnose patients with autonomic disorders and to suggest appropriate treatment. A set of fuzzy classifiers was generated using FID 3.4. The error rates of the generated FDTs were calculated to measure their performance. Moreover, a comparison between the error rates obtained using crisp DTs and FDTs was carried out and showed that the results of FDTs were better than those obtained using crisp DTs.

  7. Decision trees to characterise the roles of permeability and solubility on the prediction of oral absorption.

    Science.gov (United States)

    Newby, Danielle; Freitas, Alex A; Ghafourian, Taravat

    2015-01-27

    Oral absorption of compounds depends on many physiological, physiochemical and formulation factors. Two important properties that govern oral absorption are in vitro permeability and solubility, which are commonly used as indicators of human intestinal absorption. Despite this, the nature and exact characteristics of the relationship between these parameters are not well understood. In this study a large dataset of human intestinal absorption was collated along with in vitro permeability, aqueous solubility, melting point, and maximum dose for the same compounds. The dataset allowed a permeability threshold to be established objectively to predict high or low intestinal absorption. Using this permeability threshold, classification decision trees incorporating a solubility-related parameter such as experimental or predicted solubility, or the melting point based absorption potential (MPbAP), along with structural molecular descriptors were developed and validated to predict oral absorption class. The decision trees were able to determine the individual roles of permeability and solubility in oral absorption process. Poorly permeable compounds with high solubility show low intestinal absorption, whereas poorly water soluble compounds with high or low permeability may have high intestinal absorption provided that they have certain molecular characteristics such as a small polar surface or specific topology. Copyright © 2015 Elsevier Masson SAS. All rights reserved.

  8. Diagnostic Features of Common Oral Ulcerative Lesions: An Updated Decision Tree

    Directory of Open Access Journals (Sweden)

    Hamed Mortazavi

    2016-01-01

    Diagnosis of oral ulcerative lesions might be quite challenging. This narrative review article aims to introduce an updated decision tree for diagnosing oral ulcerative lesions on the basis of their diagnostic features. Various general search engines and specialized databases including PubMed, PubMed Central, Medline Plus, EBSCO, Science Direct, Scopus, Embase, and authenticated textbooks were used to find relevant topics by means of MeSH keywords such as “oral ulcer,” “stomatitis,” and “mouth diseases.” Thereafter, English-language articles published from 1983 to 2015 in both medical and dental journals including reviews, meta-analyses, original papers, and case reports were appraised. Upon compilation of the relevant data, oral ulcerative lesions were categorized into three major groups: acute, chronic, and recurrent ulcers and into five subgroups: solitary acute, multiple acute, solitary chronic, multiple chronic, and solitary/multiple recurrent, based on the number and duration of lesions. In total, 29 entities were organized in the form of a decision tree in order to help clinicians establish a logical diagnosis by stepwise progression.

  9. Diagnostic Features of Common Oral Ulcerative Lesions: An Updated Decision Tree.

    Science.gov (United States)

    Mortazavi, Hamed; Safi, Yaser; Baharvand, Maryam; Rahmani, Somayeh

    2016-01-01

    Diagnosis of oral ulcerative lesions might be quite challenging. This narrative review article aims to introduce an updated decision tree for diagnosing oral ulcerative lesions on the basis of their diagnostic features. Various general search engines and specialized databases including PubMed, PubMed Central, Medline Plus, EBSCO, Science Direct, Scopus, Embase, and authenticated textbooks were used to find relevant topics by means of MeSH keywords such as "oral ulcer," "stomatitis," and "mouth diseases." Thereafter, English-language articles published from 1983 to 2015 in both medical and dental journals including reviews, meta-analyses, original papers, and case reports were appraised. Upon compilation of the relevant data, oral ulcerative lesions were categorized into three major groups: acute, chronic, and recurrent ulcers and into five subgroups: solitary acute, multiple acute, solitary chronic, multiple chronic, and solitary/multiple recurrent, based on the number and duration of lesions. In total, 29 entities were organized in the form of a decision tree in order to help clinicians establish a logical diagnosis by stepwise progression.

  10. Decision tree algorithm for detection of spatial processes in landscape transformation.

    Science.gov (United States)

    Bogaert, Jan; Ceulemans, Reinhart; Salvador-Van Eysenrode, David

    2004-01-01

    The conversion of landscapes by human activities results in widespread changes in landscape spatial structure. Regardless of the type of land conversion, there appears to be a limited number of common spatial configurations that result from such land transformation processes. Some of these configurations are considered optimal or more desirable than others. Based on pattern geometry, we define ten processes responsible for pattern change: aggregation, attrition, creation, deformation, dissection, enlargement, fragmentation, perforation, shift, and shrinkage. A novelty in this contribution is the inclusion of transformation processes causing expansion of the land cover of interest. Consequently, we propose a decision tree algorithm that enables detection of these processes, based on three parameters that have to be determined before and after the transformation of the landscape: area, perimeter length, and number of patches of the focal landscape class. As an example, the decision tree algorithm is applied to determine the transformation processes of three divergent land cover change scenarios: deciduous woodland degradation in Cadiz Township (Wisconsin, USA) 1831-1950, canopy gap formation in a terra firme rain forest at the Tiputini Biodiversity Station (Amazonian Ecuador) 1997-1998, and forest regrowth in Petersham Township (Massachusetts, USA) 1830-1985. The examples signal the importance of the temporal resolution of the data, since long-term pattern conversions can be subdivided in stadia in which particular pattern components are altered by specific transformation processes.

  11. Prediction of cannabis and cocaine use in adolescence using decision trees and logistic regression

    Directory of Open Access Journals (Sweden)

    Alfonso L. Palmer

    2010-01-01

    Spain is one of the European countries with the highest prevalence of cannabis and cocaine use among young people. The aim of this study was to investigate the factors related to the consumption of cocaine and cannabis among adolescents. A questionnaire was administered to 9,284 students between 14 and 18 years of age in Palma de Mallorca (47.1% boys and 52.9% girls), whose mean age was 15.59 years. Logistic regression and decision trees were used to model the consumption of cannabis and cocaine. The results show that the use of legal substances and committing fraud or theft are the main variables that raise the odds of consuming cannabis. In boys, cannabis consumption and a family history of drug use increase the odds of consuming cocaine, whereas in girls the use of alcohol, behaviours of fraud or theft and difficulty in some personal skills influence their odds of consuming cocaine. Finally, ease of access to the substance greatly raises the odds of consuming cocaine and cannabis in both genders. Decision trees highlight the role of consuming other substances and committing fraud or theft. The results of this study gain importance when it comes to putting into practice effective prevention programmes.

  12. Pruning for crop regulation in high density guava (Psidium guajava L.) plantation

    Energy Technology Data Exchange (ETDEWEB)

    Thakre, M.; Lal, S.; Uniyal, S.; Goswami, A.K.; Prakash, P.

    2016-11-01

    High density management and crop regulation are two important aspects of guava (Psidium guajava L.) production. Therefore, to find an economic way of managing high density planting and crop regulation, the present work was carried out on 6-year-old guava trees of cv. Pant Prabhat under a double-hedge row system of planting during 2009-10 and 2010-11. Seven different forms of pruning [FBT: flower bud thinning by hand, FBTT: flower bud thinning by hand followed by removal of terminal one leaf pair, RLFO: removal of leaves and flower buds by hand, retaining one leaf pair at the top, RLF: removal of all leaves and flowers by hand, OLPS: one leaf pair shoot pruning, FSP: full shoot pruning, OLPF: one leaf pair pruning of fruited shoots only] were studied along with a control (C). The minimum annual increase in tree volume (6.764 m³) was recorded with the treatment OLPF, which was 2.31 times lower than that of the control (15.682 m³). The highest yield during the winter season (55.30 kg/tree) and the highest total yield (59.87 kg/tree) were obtained from treatment OLPF. One leaf pair pruning of fruited shoots only (OLPF) was also found profitable relative to the other treatments, recording a cost:benefit ratio of 1:2.96. This treatment also recorded the highest return, distributed across the rainy and winter seasons. On the basis of these findings it can be concluded that one leaf pair pruning of fruited shoots only is suitable for profitable high density management as well as crop regulation of guava in a farmer-friendly manner. (Author)

  13. Olive Crown Porosity Measurement Based on Radiation Transmittance: An Assessment of Pruning Effect

    Directory of Open Access Journals (Sweden)

    Francisco J. Castillo-Ruiz

    2016-05-01

    Crown porosity influences radiation interception, air movement through the fruit orchard, spray penetration, and harvesting operation in fruit crops. The aim of the present study was to develop an accurate and reliable methodology based on transmitted radiation measurements to assess the porosity of traditional olive trees under different pruning treatments. Transmitted radiation was employed as an indirect method to measure crown porosity in two olive orchards of the Picual and Hojiblanca cultivars. Additionally, three different pruning treatments were considered to determine if the pruning system influences crown porosity. This study evaluated the accuracy and repeatability of four algorithms in measuring crown porosity under different solar zenith angles. From a 14° to 30° solar zenith angle, the selected algorithm produced an absolute error of less than 5% and a repeatability higher than 0.9. The described method and selected algorithm proved satisfactory in field results, making it possible to measure crown porosity at different solar zenith angles. However, pruning fresh weight did not show any relationship with crown porosity due to the great differences between removed branches. A robust and accurate algorithm was selected for crown porosity measurements in traditional olive trees, making it possible to discern between different pruning treatments.

  14. Can early thinning and pruning lessen the impact of pine plantations ...

    African Journals Online (AJOL)

    dwelling insects found in pine tree plantations in Patagonia. We compared the abundance, species richness and composition of the beetle and ant assemblages within 16-year-old pine stands (n = 10) subjected to early pruning and thinning (i.e. ...

  15. A Decision Tree-Based Clustering Approach to State Definition in an Excitation Modeling Framework for HMM-Based Speech Synthesis

    OpenAIRE

    Ranniery Maia; Tomoki Toda; Keiichi Tokuda; Shinsuke Sakai; Satoshi Nakamura

    2009-01-01

    This paper presents a decision tree-based algorithm to cluster residual segments assuming an excitation model based on state-dependent filtering of pulse train and white noise. The decision tree construction principle is the same as the one applied to speech recognition. Here parent nodes are split using the residual maximum likelihood criterion. Once these excitation decision trees are constructed for residual signals segmented by full context models, using questions related to the full conte...

  16. Trees

    Science.gov (United States)

    Al-Khaja, Nawal

    2007-01-01

    This is a thematic lesson plan for young learners about palm trees and the importance of taking care of them. The two part lesson teaches listening, reading and speaking skills. The lesson includes parts of a tree; the modal auxiliary, can; dialogues and a role play activity.

  17. The Over-Pruning Hypothesis of Autism

    Science.gov (United States)

    Thomas, Michael S. C.; Davis, Rachael; Karmiloff-Smith, Annette; Knowland, Victoria C. P.; Charman, Tony

    2016-01-01

    This article outlines the "over-pruning hypothesis" of autism. The hypothesis originates in a neurocomputational model of the regressive sub-type (Thomas, Knowland & Karmiloff-Smith, 2011a, 2011b). Here we develop a more general version of the over-pruning hypothesis to address heterogeneity in the timing of manifestation of ASD,…

  18. A Novel Treatment Decision Tree and Literature Review of Retrograde Peri-Implantitis.

    Science.gov (United States)

    Sarmast, Nima D; Wang, Howard H; Soldatos, Nikolaos K; Angelov, Nikola; Dorn, Samuel; Yukna, Raymond; Iacono, Vincent J

    2016-12-01

    Although retrograde peri-implantitis (RPI) is not a common sequela of dental implant surgery, its prevalence has been reported in the literature to be 0.26%. Incidence of RPI is reported to increase to 7.8% when teeth adjacent to the implant site have a previous history of root canal therapy, and it is correlated with distance between implant and adjacent tooth and/or with time from endodontic treatment of adjacent tooth to implant placement. Minimum 2 mm space between implant and adjacent tooth is needed to decrease incidence of apical RPI, with minimum 4 weeks between completion of endodontic treatment and actual implant placement. The purpose of this study is to compile all available treatment modalities and to provide a decision tree as a general guide for clinicians to aid in diagnosis and treatment of RPI. Literature search was performed for articles published in English on the topic of RPI. Articles selected were case reports with study populations ranging from 1 to 32 patients. Any case report or clinical trial that attempted to treat or rescue an implant diagnosed with RPI was included. Predominant diagnostic presentation of a lesion was presence of sinus tract at buccal or facial abscess of apical portion of implant, and subsequent periapical radiographs taken demonstrated a radiolucent lesion. On the basis of case reports analyzed, RPI was diagnosed between 1 week and 4 years after implant placement. Twelve of 20 studies reported that RPI lesions were diagnosed within 6 months after implant placement. A step-by-step decision tree is provided to allow clinicians to triage and properly manage cases of RPI on the basis of recommendations and successful treatments provided in analyzed case reports. It is divided between symptomatic and asymptomatic implants and adjacent teeth with vital and necrotic pulps. Most common etiology of apical RPI is endodontic infection from neighboring teeth, which was diagnosed within 6 months after implant placement. Most

  19. Decision tree-based learning to predict patient controlled analgesia consumption and readjustment

    Directory of Open Access Journals (Sweden)

    Hu Yuh-Jyh

    2012-11-01

    Background: Appropriate postoperative pain management contributes to earlier mobilization, shorter hospitalization, and reduced cost. The undertreatment of pain may impede short-term recovery and have a detrimental long-term effect on health. This study focuses on Patient Controlled Analgesia (PCA), which is a delivery system for pain medication. This study proposes and demonstrates how to use machine learning and data mining techniques to predict analgesic requirements and PCA readjustment. Methods: The sample in this study included 1099 patients. Every patient was described by 280 attributes, including the class attribute. In addition to commonly studied demographic and physiological factors, this study emphasizes attributes related to PCA. We used decision tree-based learning algorithms to predict analgesic consumption and PCA control readjustment based on the first few hours of PCA medications. We also developed a nearest neighbor-based data cleaning method to alleviate the class-imbalance problem in PCA setting readjustment prediction. Results: The prediction accuracies of total analgesic consumption (continuous dose and PCA dose) and PCA analgesic requirement (PCA dose only) by an ensemble of decision trees were 80.9% and 73.1%, respectively. Decision tree-based learning outperformed Artificial Neural Network, Support Vector Machine, Random Forest, Rotation Forest, and Naïve Bayesian classifiers in analgesic consumption prediction. The proposed data cleaning method improved the performance of every learning method in this study of PCA setting readjustment prediction. Comparative analysis identified the informative attributes from the data mining models and compared them with the correlates of analgesic requirement reported in previous works. Conclusion: This study presents a real-world application of data mining to anesthesiology. Unlike previous research, this study considers a wider variety of predictive factors, including PCA

  20. Predictability of the future development of aggressive behavior of cranial dural arteriovenous fistulas based on decision tree analysis.

    Science.gov (United States)

    Satomi, Junichiro; Ghaibeh, A Ammar; Moriguchi, Hiroki; Nagahiro, Shinji

    2015-07-01

    The severity of clinical signs and symptoms of cranial dural arteriovenous fistulas (DAVFs) are well correlated with their pattern of venous drainage. Although the presence of cortical venous drainage can be considered a potential predictor of aggressive DAVF behaviors, such as intracranial hemorrhage or progressive neurological deficits due to venous congestion, accurate statistical analyses are currently not available. Using a decision tree data mining method, the authors aimed at clarifying the predictability of the future development of aggressive behaviors of DAVF and at identifying the main causative factors. Of 266 DAVF patients, 89 were eligible for analysis. Under observational management, 51 patients presented with intracranial hemorrhage/infarction during the follow-up period. The authors created a decision tree able to assess the risk for the development of aggressive DAVF behavior. Evaluated by 10-fold cross-validation, the decision tree's accuracy, sensitivity, and specificity were 85.28%, 88.33%, and 80.83%, respectively. The tree shows that the main factor in symptomatic patients was the presence of cortical venous drainage. In its absence, the lesion location determined the risk of a DAVF developing aggressive behavior. Decision tree analysis accurately predicts the future development of aggressive DAVF behavior.
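
    The evaluation reported in this abstract (accuracy, sensitivity, and specificity under 10-fold cross-validation) can be sketched as below. This is a minimal, assumption-laden illustration: the fold assignment is a simple unshuffled stride, the function names are hypothetical, and the example labels are invented; only the cohort size of 89 and the fold count of 10 come from the abstract.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k roughly equal folds.

    Samples are assigned to folds by stride, without shuffling.
    """
    for fold in range(k):
        test_idx = list(range(fold, n_samples, k))
        held_out = set(test_idx)
        train_idx = [i for i in range(n_samples) if i not in held_out]
        yield train_idx, test_idx


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity from 0/1 labels, e.g. from
    predictions pooled over all cross-validation test folds."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
    }


# 89 eligible patients split into 10 folds, as in the abstract.
folds = list(k_fold_indices(89, 10))

# Invented labels, for illustration only.
metrics = binary_metrics([1, 1, 1, 0, 0, 0, 1, 0], [1, 1, 0, 0, 0, 1, 1, 0])
```

    Pooling predictions across folds before computing the three rates is one common convention; averaging per-fold rates is another, and the abstract does not say which was used.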

  1. Decision tree analysis to stratify risk of de novo non-melanoma skin cancer following liver transplantation.

    Science.gov (United States)

    Tanaka, Tomohiro; Voigt, Michael D

    2018-03-01

    Non-melanoma skin cancer (NMSC) is the most common de novo malignancy in liver transplant (LT) recipients; it behaves more aggressively and it increases mortality. We used decision tree analysis to develop a tool to stratify and quantify risk of NMSC in LT recipients. We performed Cox regression analysis to identify which predictive variables to enter into the decision tree analysis. Data were from the Organ Procurement Transplant Network (OPTN) STAR files of September 2016 (n = 102984). NMSC developed in 4556 of the 105984 recipients, a mean of 5.6 years after transplant. The 5/10/20-year rates of NMSC were 2.9/6.3/13.5%, respectively. Cox regression identified male gender, Caucasian race, age, body mass index (BMI) at LT, and sirolimus use as key predictive or protective factors for NMSC. These factors were entered into a decision tree analysis. The final tree stratified non-Caucasians as low risk (0.8%), and Caucasian males > 47 years, BMI decision tree model accurately stratifies the risk of developing NMSC in the long-term after LT.

  2. CorRECTreatment: a web-based decision support tool for rectal cancer treatment that uses the analytic hierarchy process and decision tree.

    Science.gov (United States)

    Suner, A; Karakülah, G; Dicle, O; Sökmen, S; Çelikoğlu, C C

    2015-01-01

    The selection of appropriate rectal cancer treatment is a complex multi-criteria decision making process, in which clinical decision support systems might be used to assist and enrich physicians' decision making. The objective of the study was to develop a web-based clinical decision support tool for physicians in the selection of potentially beneficial treatment options for patients with rectal cancer. The updated decision model contained 8 and 10 criteria in the first and second steps respectively. The decision support model, developed in our previous study by combining the Analytic Hierarchy Process (AHP) method which determines the priority of criteria and decision tree that formed using these priorities, was updated and applied to 388 patients data collected retrospectively. Later, a web-based decision support tool named corRECTreatment was developed. The compatibility of the treatment recommendations by the expert opinion and the decision support tool was examined for its consistency. Two surgeons were requested to recommend a treatment and an overall survival value for the treatment among 20 different cases that we selected and turned into a scenario among the most common and rare treatment options in the patient data set. In the AHP analyses of the criteria, it was found that the matrices, generated for both decision steps, were consistent (consistency ratiodecisions of experts, the consistency value for the most frequent cases was found to be 80% for the first decision step and 100% for the second decision step. Similarly, for rare cases consistency was 50% for the first decision step and 80% for the second decision step. The decision model and corRECTreatment, developed by applying these on real patient data, are expected to provide potential users with decision support in rectal cancer treatment processes and facilitate them in making projections about treatment options.

  3. Unusual presentation of prune belly syndrome: a case report.

    Science.gov (United States)

    Demisse, Abayneh Girma; Berhanu, Ashenafi; Tadesse, Temesgen

    2017-12-04

    Prune belly syndrome is a rare congenital malformation of unknown etiology, with the following triad of findings: abdominal muscle wall weakness, undescended testes, and urinary tract abnormalities. In most cases, detection of prune belly syndrome occurs during neonatal or infancy period. In this case report, we describe a 12-year-old boy from Ethiopia with the triad of findings of prune belly syndrome along with skeletal malformations. We are unaware of any previous report of prune belly syndrome in Ethiopia. A 12-year-old Amhara boy from the Northwest Gondar Amhara regional state presented to our referral hospital with a complaint of swelling over his left flank for the past 3 months. Maternal pregnancy course and medical history were noncontributory, and he had an attended birth at a health center. He has seven siblings, none of whom had similar symptoms. On examination he had a distended abdomen, asymmetric with bulging left flank, visible horizontal line, upward umbilical slit, and absent rectus abdominis muscles. His abdomen was soft with a tender cystic, bimanually palpable mass on the left flank measuring 13 × 11 cm. Both testes were undescended and he also has developmental dysplasia of the hips. An abdominal ultrasound revealed a large cystic mass in his left kidney area with echo debris and a hip X-ray showed bilateral developmental dysplasia of the hip. Intraoperative findings were cystic left kidney, both testes were intraperitoneal, tortuous left renal vein, enlarged bladder reaching above umbilicus, and left megaureter. Bilateral orchidectomy and left nephrectomy were done. He was given intravenously administered antibiotics for treatment of pyelonephritis and discharged home with an appointment for follow up and possible abdominoplasty. In the current report delayed presentation contributed to testicular atrophy and decision for orchidectomy. Furthermore, he will be at potential risk for sex hormone abnormality. Therefore, diagnosis of prune

  4. Evaluation with Decision Trees of Efficacy and Safety of Semirigid Ureteroscopy in the Treatment of Proximal Ureteral Calculi.

    Science.gov (United States)

    Sancak, Eyup Burak; Kılınç, Muhammet Fatih; Yücebaş, Sait Can

    2017-01-01

    The decision on the choice of proximal ureteral stone therapy depends on many factors, and sometimes urologists have difficulty in choosing the treatment option. This study is aimed at evaluating the factors affecting the success of semirigid ureterorenoscopy (URS) using the "decision tree" method. From January 2005 to November 2015, the data of consecutive patients treated for proximal ureteral stone were retrospectively analyzed. A total of 920 patients with proximal ureteral stone treated with semirigid URS were included in the study. All statistically significant attributes were tested using the decision tree method. The model created using decision tree had a sensitivity of 0.993 and an accuracy of 0.857. While URS treatment was successful in 752 patients (81.7%), it was unsuccessful in 168 patients (18.3%). According to the decision tree method, the most important factor affecting the success of URS is whether the stone is impacted to the ureteral wall. The second most important factor affecting treatment was intramural stricture requiring dilatation if the stone is impacted, and the size of the stone if not impacted. Our study suggests that the impacted stone, intramural stricture requiring dilatation and stone size may have a significant effect on the success rate of semirigid URS for proximal ureteral stone. Further studies with population-based and longitudinal design should be conducted to confirm this finding. © 2017 S. Karger AG, Basel.

  5. A New Information Measure Based on Example-Dependent Misclassification Costs and Its Application in Decision Tree Learning

    Directory of Open Access Journals (Sweden)

    Fritz Wysotzki

    2009-01-01

    Full Text Available This article describes how the costs of misclassification given with the individual training objects for classification learning can be used in the construction of decision trees for minimal cost instead of minimal error class decisions. This is demonstrated by defining modified, cost-dependent probabilities, a new, cost-dependent information measure, and using a cost-sensitive extension of the CAL5 algorithm for learning decision trees. The cost-dependent information measure ensures the selection of the (local) next best, that is, cost-minimizing, discriminating attribute in the sequential construction of the classification trees. This is shown to be a cost-dependent generalization of the classical information measure introduced by Shannon, which only depends on classical probabilities. It is therefore of general importance and extends classic information theory, knowledge processing, and cognitive science, since subjective evaluations of decision alternatives can be included in entropy and the transferred information. Decision trees can then be viewed as cost-minimizing decoders for class symbols emitted by a source and coded by feature vectors. Experiments with two artificial datasets and one application example show that this approach is more accurate than a method which uses class dependent costs given by experts a priori.

  6. Multi-output decision trees for lesion segmentation in multiple sclerosis

    Science.gov (United States)

    Jog, Amod; Carass, Aaron; Pham, Dzung L.; Prince, Jerry L.

    2015-03-01

    Multiple Sclerosis (MS) is a disease of the central nervous system in which the protective myelin sheath of the neurons is damaged. MS leads to the formation of lesions, predominantly in the white matter of the brain and the spinal cord. The number and volume of lesions visible in magnetic resonance (MR) imaging (MRI) are important criteria for diagnosing and tracking the progression of MS. Locating and delineating lesions manually requires the tedious and expensive efforts of highly trained raters. In this paper, we propose an automated algorithm to segment lesions in MR images using multi-output decision trees. We evaluated our algorithm on the publicly available MICCAI 2008 MS Lesion Segmentation Challenge training dataset of 20 subjects, and showed improved results in comparison to state-of-the-art methods. We also evaluated our algorithm on an in-house dataset of 49 subjects with a true positive rate of 0.41 and a positive predictive value of 0.36.

  7. Automated soil resources mapping based on decision tree and Bayesian predictive modeling.

    Science.gov (United States)

    Zhou, Bin; Zhang, Xin-Gang; Wang, Ren-Chao

    2004-07-01

    This article presents two approaches for automated building of knowledge bases of soil resources mapping. These methods used decision tree and Bayesian predictive modeling, respectively to generate knowledge from training data. With these methods, building a knowledge base for automated soil mapping is easier than using the conventional knowledge acquisition approach. The knowledge bases built by these two methods were used by the knowledge classifier for soil type classification of the Longyou area, Zhejiang Province, China using TM bi-temporal imageries and GIS data. To evaluate the performance of the resultant knowledge bases, the classification results were compared to existing soil map based on field survey. The accuracy assessment and analysis of the resultant soil maps suggested that the knowledge bases built by these two methods were of good quality for mapping distribution model of soil classes over the study area.

  8. Prediction of heart disease using apache spark analysing decision trees and gradient boosting algorithm

    Science.gov (United States)

    Chugh, Saryu; Arivu Selvan, K.; Nadesh, RK

    2017-11-01

    Numerous destructive things influence the working arrangement of human body as hypertension, smoking, obesity, inappropriate medication taking which causes many contrasting diseases as diabetes, thyroid, strokes and coronary diseases. The impermanence and horribleness of the environment situation is also the reason for the coronary disease. The structure of Apache Spark relies on the evolution which requires gathering of the data. To break down the significance of use programming focused on data structure the Apache Spark ought to be utilized and it gives various central focuses as it is fast in light as it uses memory worked in preparing. Apache Spark continues running on dispersed environment and chops down the data in bunches giving a high profitability rate. Utilizing mining procedure as a part of the determination of coronary disease has been exhaustively examined indicating worthy levels of precision. Decision trees, Neural Network, Gradient Boosting Algorithm are the various Apache Spark proficiencies which help in collecting the information.

  9. A hybrid model using decision tree and neural network for credit scoring problem

    Directory of Open Access Journals (Sweden)

    Amir Arzy Soltan

    2012-08-01

    Full Text Available Nowadays credit scoring is an important issue for financial and monetary organizations that has substantial impact on reduction of customer attraction risks. Identification of high risk customer can reduce finished cost. An accurate classification of customer and low type 1 and type 2 errors have been investigated in many studies. The primary objective of this paper is to develop a new method, which chooses the best neural network architecture based on one column hidden layer MLP, multiple columns hidden layers MLP, RBFN and decision trees and ensembling them with voting methods. The proposed method of this paper is run on an Australian credit data and a private bank in Iran called Export Development Bank of Iran and the results are used for making solution in low customer attraction risks.

  10. Diagnostic accuracy of hepatic hemangioma and its decision tree on ultrasonography, computed tomography and angiography

    Energy Technology Data Exchange (ETDEWEB)

    Uesaka, Katsuhiko; Takayasu, Kenichi; Muramatu, Yukio; Moriyama, Noriyuki; Matsue, Hiroto; Yamada, Tatsuya; Hasegawa, Hiroshi

    1988-08-01

    Fifty seven lesions in 31 patients with hepatic hemangioma were concurrently imaged with ultrasonography (US), computed tomography (CT), and angiography (AG). Rates of lesion detection and qualitative diagnosis were 73.7 % and 59.5 %, respectively, for US; 89.5 % and 88.2 % for CT; and 89.5 % and 90.2 % for AG. Each of the three imaging methods had a diagnostic rate of 100 % for tumors more than 5 cm. In diagnosing tumors 5 cm or less, US was less sensitive than CT and AG (62.5 % vs 85.0 %). The qualitative diagnostic rate of both CT and AG was 90 % regardless of tumor diameter. As for US, it was 76.5 % in lesions more than 5 cm, and 48.0 % in lesions 5 cm or less. The necessity of decision tree of CT and the other imaging methods in hepatic hemangioma is presented. (Namekawa, K.).

  11. Comparison of CIV, SIV and AIV using Decision Tree and SVM

    Directory of Open Access Journals (Sweden)

    Park Hyorin

    2016-01-01

    Full Text Available The H3N2, the canine influenza virus has numerous types of animal hosts that can live and reproduce on. They mostly settle on pigs and birds. However, some concerned voices are rising that there is high possibility that humans could be an additional victim for the canine flu. Consequently, our project group expect that the information about the H3N2’s DNA are valuable, since the information could attribute to development of vaccine and medicine. In the experiments of analysing the properties of CIV, Canine Influenza Virus with the comparison of SIV, Swine Influenza Virus and AIV, Avian Influenza Virus with the decision tree and SVM, Support Vector Machine. The result came out that CIV, SIV and AIV are alike but also different in some aspects.

  12. Enhancement of Fast Face Detection Algorithm Based on a Cascade of Decision Trees

    Science.gov (United States)

    Khryashchev, V. V.; Lebedev, A. A.; Priorov, A. L.

    2017-05-01

    Face detection algorithm based on a cascade of ensembles of decision trees (CEDT) is presented. The new approach allows detecting faces other than the front position through the use of multiple classifiers. Each classifier is trained for a specific range of angles of the rotation head. The results showed a high rate of productivity for CEDT on images with standard size. The algorithm increases the area under the ROC-curve of 13% compared to a standard Viola-Jones face detection algorithm. Final realization of given algorithm consist of 5 different cascades for frontal/non-frontal faces. One more thing which we take from the simulation results is a low computational complexity of CEDT algorithm in comparison with standard Viola-Jones approach. This could prove important in the embedded system and mobile device industries because it can reduce the cost of hardware and make battery life longer.

  13. Independent component analysis and decision trees for ECG holter recording de-noising.

    Directory of Open Access Journals (Sweden)

    Jakub Kuzilek

    Full Text Available We have developed a method focusing on ECG signal de-noising using Independent component analysis (ICA). This approach combines JADE source separation and binary decision tree for identification and subsequent ECG noise removal. In order to test the efficiency of this method in comparison to standard filtering, a wavelet-based de-noising method was used. Freely available data at Physionet medical data storage were evaluated. The evaluation criterion was the root mean square error (RMSE) between original ECG and filtered data contaminated with artificial noise. The proposed algorithm achieved comparable results in terms of standard noises (power line interference, base line wander, EMG), but significantly better results were achieved when uncommon noise (electrode cable movement artefact) was compared.
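    The evaluation criterion named above, the root mean square error between the original ECG and the filtered signal, can be sketched as follows. This is an illustrative implementation of the standard RMSE formula only; the function name and the toy sample values are assumptions, and the abstract does not describe how the authors computed it in practice.

```python
import math


def rmse(reference, filtered):
    """Root mean square error between two equally long sample sequences."""
    if len(reference) != len(filtered):
        raise ValueError("signals must have the same number of samples")
    if not reference:
        raise ValueError("signals must not be empty")
    squared_error = sum((r - f) ** 2 for r, f in zip(reference, filtered))
    return math.sqrt(squared_error / len(reference))


# Toy samples for illustration only (not ECG data from the study).
clean = [0.0, 0.5, 1.0, 0.5, 0.0]
denoised = [0.1, 0.4, 1.1, 0.5, -0.1]
error = rmse(clean, denoised)
```

    A lower value means the de-noised signal is closer to the clean reference, which is how two filtering methods would be compared under this criterion.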

  14. Preventing KPI Violations in Business Processes based on Decision Tree Learning and Proactive Runtime Adaptation

    Directory of Open Access Journals (Sweden)

    Dimka Karastoyanova

    2012-01-01

    Full Text Available The performance of business processes is measured and monitored in terms of Key Performance Indicators (KPIs). If the monitoring results show that the KPI targets are violated, the underlying reasons have to be identified and the process should be adapted accordingly to address the violations. In this paper we propose an integrated monitoring, prediction and adaptation approach for preventing KPI violations of business process instances. KPIs are monitored continuously while the process is executed. Additionally, based on KPI measurements of historical process instances we use decision tree learning to construct classification models which are then used to predict the KPI value of an instance while it is still running. If a KPI violation is predicted, we identify adaptation requirements and adaptation strategies in order to prevent the violation.

  15. Integrating individual trip planning in energy efficiency – Building decision tree models for Danish fisheries

    DEFF Research Database (Denmark)

    Bastardie, Francois; Nielsen, J. Rasmus; Andersen, Bo Sølgaard

    2013-01-01

    Danish fishermen have provided information on dynamics in their fuel consumption, running costs, and fishing patterns through a web-based questionnaire. This detailed documentation of the fishing practices is used in spatial modelling tools to improve advice and research for fisheries. The tools...... integrate detailed information on vessel distribution, catch and fuel consumption for different fisheries with a detailed resource distribution of targeted stocks from research surveys to evaluate the optimum consumption and efficiency to reduce fuel costs and the costs of displacement of effort. The energy...... efficiency for the value of catch per unit of fuel consumed is analysed by merging the questionnaire, logbook and VMS (vessel monitoring system) information. Logic decision trees and conditional behaviour probabilities are established from the responses of fishermen regarding a range of sequential...

  16. [Identification of subgroups with lower level of stroke knowledge using decision-tree analysis].

    Science.gov (United States)

    Kim, Hyun Kyung; Jeong, Seok Hee; Kang, Hyun Cheol

    2014-02-01

    This study was performed to explore levels of stroke knowledge and identify subgroups with lower levels of stroke knowledge among adults in Korea. A cross-sectional survey was used and data were collected in 2012. A national sample of 990 Koreans aged 20 to 74 years participated in this study. Knowledge of risk factors, warning signs, and first action for stroke were surveyed using face-to-face interviews. Descriptive statistics and decision tree analysis were performed using SPSS WIN 20.0 and Answer Tree 3.1. Mean score for stroke risk factor knowledge was 7.7 out of 10. The least recognized risk factor was diabetes and four subgroups with lower levels of knowledge were identified. Score for knowledge of stroke warning signs was 3.6 out of 6. The least recognized warning sign was sudden severe headache and six subgroups with lower levels of knowledge were identified. The first action for stroke was recognized by 65.7 percent of participants and four subgroups with lower levels of knowledge were identified. Multi-faceted education should be designed to improve stroke knowledge among Korean adults, particularly focusing on subgroups with lower levels of knowledge and less recognition of items in this study.

  17. Diagnostic assessment of intraoperative cytology for papillary thyroid carcinoma: using a decision tree analysis.

    Science.gov (United States)

    Pyo, J-S; Sohn, J H; Kang, G

    2017-03-01

    The aim of this study was to elucidate the cytological characteristics and the diagnostic usefulness of intraoperative cytology (IOC) for papillary thyroid carcinoma (PTC). In addition, using decision tree analysis, effective features for accurate cytological diagnosis were sought. We investigated cellularity, cytological features and diagnosis based on the Bethesda System for Reporting Thyroid Cytopathology in IOC of 240 conventional PTCs. The cytological features were evaluated in terms of nuclear score with nuclear features, and additional figures such as presence of swirling sheets, psammoma bodies, and multinucleated giant cells. The nuclear score (range 0-7) was made via seven nuclear features, including (1) enlarged, (2) oval or irregularly shaped nuclei, (3) longitudinal nuclear grooves, (4) intranuclear cytoplasmic pseudoinclusion, (5) pale nuclei with powdery chromatin, (6) nuclear membrane thickening, and (7) marginally placed micronucleoli. Nuclear scores in PTC, suspicious for malignancy, and atypia of undetermined significance cases were 6.18 ± 0.80, 4.48 ± 0.82, and 3.15 ± 0.67, respectively. Additional figures more frequent in PTC than in other diagnostic categories were identified. Cellularity of IOC significantly correlated with tumor size, nuclear score, and presence of additional figures. Also, IOCs with higher nuclear scores (4-7) significantly correlated with larger tumor size and presence of additional figures. In decision tree analysis, IOCs with nuclear score >5 and swirling sheets could be considered diagnostic for PTCs. Our study suggests that IOCs using nuclear features and additional figures could be useful with decreasing the likelihood of inconclusive results.
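    The scoring rule reported in this abstract can be written out as a short sketch. It assumes that each of the seven listed nuclear features contributes one point to the nuclear score, which the stated 0-7 range suggests but the abstract does not say explicitly. The identifiers below are hypothetical names chosen for illustration, and the sketch is not a diagnostic tool.

```python
# The seven nuclear features listed in the abstract (identifier names are hypothetical).
NUCLEAR_FEATURES = (
    "enlarged_nuclei",
    "oval_or_irregular_nuclei",
    "longitudinal_nuclear_grooves",
    "intranuclear_pseudoinclusion",
    "pale_nuclei_powdery_chromatin",
    "nuclear_membrane_thickening",
    "marginal_micronucleoli",
)


def nuclear_score(findings):
    """Count the listed features that are present (assumed one point each, range 0-7)."""
    return sum(1 for feature in NUCLEAR_FEATURES if findings.get(feature, False))


def meets_reported_rule(findings, swirling_sheets):
    """Apply the reported decision-tree criterion: nuclear score > 5 and swirling sheets."""
    return nuclear_score(findings) > 5 and bool(swirling_sheets)


# Hypothetical smear with six of the seven features present.
smear = {feature: True for feature in NUCLEAR_FEATURES}
smear["nuclear_membrane_thickening"] = False
flagged = meets_reported_rule(smear, swirling_sheets=True)
```

    Under this reading, a smear needs at least six of the seven features plus swirling sheets to satisfy the criterion.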

  18. Prediction model for demands of the health meteorological information using a decision tree method.

    Science.gov (United States)

    Oh, Jina; Kim, Byungsoo

    2010-09-01

    Climate change affects human health and calls for health meteorological services. The purpose of this study is to find the significant predictors for the demands of the health meteorological information. This study used a descriptive design through structured self-report questionnaires. Data from 956 participants who were at least 18 years old and living in Busan, Korea, were collected from June 1 to July 31, 2009. The data were analyzed using a decision tree method, one of the data mining techniques, with the SAS 9.1 and Enterprise Miner 4.3 programs. Two hundred and ninety participants (30.3%) demanded the information, and 505 of them (52.8%) perceived the necessity of health meteorological information. From the decision tree method, the predictors related to the demands of the health meteorological information were determined as "the perception of the necessity of health meteorological information," "the coping to the weather warnings" and "the importance of the weather forecasting in daily life." In particular, the significantly different variables in the perception of the necessity of health meteorological information were "female," "aged over 40" and "environmental diseases." Thus, the model derived in this study is considered for explaining and predicting the demands of health meteorological information. It can be effectively used as a reference model for future studies and is a suggested direction in health meteorological information service and policy development. We suggest health forecasting as a nursing service and a primary health care network for healthier and more comfortable life. Copyright © 2010 Korean Society of Nursing Science. Published by . All rights reserved.

  19. Supertrees Based on the Subtree Prune-and-Regraft Distance.

    Science.gov (United States)

    Whidden, Christopher; Zeh, Norbert; Beiko, Robert G

    2014-07-01

    Supertree methods reconcile a set of phylogenetic trees into a single structure that is often interpreted as a branching history of species. A key challenge is combining conflicting evolutionary histories that are due to artifacts of phylogenetic reconstruction and phenomena such as lateral gene transfer (LGT). Many supertree approaches use optimality criteria that do not reflect underlying processes, have known biases, and may be unduly influenced by LGT. We present the first method to construct supertrees by using the subtree prune-and-regraft (SPR) distance as an optimality criterion. Although calculating the rooted SPR distance between a pair of trees is NP-hard, our new maximum agreement forest-based methods can reconcile trees with hundreds of taxa and >50 transfers in fractions of a second, which enables repeated calculations during the course of an iterative search. Our approach can accommodate trees in which uncertain relationships have been collapsed to multifurcating nodes. Using a series of benchmark datasets simulated under plausible rates of LGT, we show that SPR supertrees are more similar to correct species histories than supertrees based on parsimony or Robinson-Foulds distance criteria. We successfully constructed an SPR supertree from a phylogenomic dataset of 40,631 gene trees that covered 244 genomes representing several major bacterial phyla. Our SPR-based approach also allowed direct inference of highways of gene transfer between bacterial classes and genera. A small number of these highways connect genera in different phyla and can highlight specific genes implicated in long-distance LGT. [Lateral gene transfer; matrix representation with parsimony; phylogenomics; prokaryotic phylogeny; Robinson-Foulds; subtree prune-and-regraft; supertrees.]. © The Author(s) 2014. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.

  20. 7 CFR 993.6 - Non-French prunes.

    Science.gov (United States)

    2010-01-01

    ... 7 Agriculture 8 2010-01-01 2010-01-01 false Non-French prunes. 993.6 Section 993.6 Agriculture Regulations of the Department of Agriculture (Continued) AGRICULTURAL MARKETING SERVICE (Marketing Agreements... Order Regulating Handling Definitions § 993.6 Non-French prunes. Non-French prunes means prunes commonly...

  1. A data mining approach to optimize pellets manufacturing process based on a decision tree algorithm.

    Science.gov (United States)

    Ronowicz, Joanna; Thommes, Markus; Kleinebudde, Peter; Krysiński, Jerzy

    2015-06-20

    The present study is focused on the thorough analysis of cause-effect relationships between pellet formulation characteristics (pellet composition as well as process parameters) and the selected quality attribute of the final product. The shape using the aspect ratio value expressed the quality of pellets. A data matrix for chemometric analysis consisted of 224 pellet formulations performed by means of eight different active pharmaceutical ingredients and several various excipients, using different extrusion/spheronization process conditions. The data set contained 14 input variables (both formulation and process variables) and one output variable (pellet aspect ratio). A tree regression algorithm consistent with the Quality by Design concept was applied to obtain deeper understanding and knowledge of formulation and process parameters affecting the final pellet sphericity. A clear, interpretable set of decision rules was generated. The spheronization speed, spheronization time, number of holes and water content of extrudate have been recognized as the key factors influencing pellet aspect ratio. The most spherical pellets were achieved by using a large number of holes during extrusion, a high spheronizer speed and longer time of spheronization. The described data mining approach enhances knowledge about pelletization process and simultaneously facilitates searching for the optimal process conditions which are necessary to achieve ideal spherical pellets, resulting in good flow characteristics. This data mining approach can be taken into consideration by industrial formulation scientists to support rational decision making in the field of pellets technology. Copyright © 2015 Elsevier B.V. All rights reserved.

  2. Learning Dispatching Rules for Scheduling: A Synergistic View Comprising Decision Trees, Tabu Search and Simulation

    Directory of Open Access Journals (Sweden)

    Atif Shahzad

    2016-02-01

    Full Text Available A promising approach for an effective shop scheduling that synergizes the benefits of the combinatorial optimization, supervised learning and discrete-event simulation is presented. Though dispatching rules are widely used by shop scheduling practitioners, only ordinary performance rules are known; hence, dynamic generation of dispatching rules is desired to make them more effective in changing shop conditions. Meta-heuristics are able to perform quite well and carry more knowledge of the problem domain, however at the cost of prohibitive computational effort in real-time. The primary purpose of this research lies in an offline extraction of this domain knowledge using decision trees to generate simple if-then rules that subsequently act as dispatching rules for scheduling in an online manner. We used similarity index to identify parametric and structural similarity in problem instances in order to implicitly support the learning algorithm for effective rule generation and quality index for relative ranking of the dispatching decisions. Maximum lateness is used as the scheduling objective in a job shop scheduling environment.

  3. Construction and validation of a decision tree for treating metabolic acidosis in calves with neonatal diarrhea.

    Science.gov (United States)

    Trefz, Florian M; Lorch, Annette; Feist, Melanie; Sauter-Louis, Carola; Lorenz, Ingrid

    2012-12-06

    The aim of the present prospective study was to investigate whether a decision tree based on basic clinical signs could be used to determine the treatment of metabolic acidosis in calves successfully without expensive laboratory equipment. A total of 121 calves with a diagnosis of neonatal diarrhea admitted to a veterinary teaching hospital were included in the study. The dosages of sodium bicarbonate administered followed simple guidelines based on the results of a previous retrospective analysis. Calves that were neither dehydrated nor assumed to be acidemic received an oral electrolyte solution. In cases in which intravenous correction of acidosis and/or dehydration was deemed necessary, the provided amount of sodium bicarbonate ranged from 250 to 750 mmol (depending on alterations in posture) and infusion volumes from 1 to 6.25 liters (depending on the degree of dehydration). Individual body weights of calves were disregarded. During the 24 hour study period the investigator was blinded to all laboratory findings. After being lifted, many calves were able to stand despite base excess levels below -20 mmol/l. Especially in those calves, metabolic acidosis was undercorrected with the provided amount of 500 mmol sodium bicarbonate, which was intended for calves standing insecurely. In 13 calves metabolic acidosis was not treated successfully as defined by an expected treatment failure or a measured base excess value below -5 mmol/l. By contrast, 24 hours after the initiation of therapy, a metabolic alkalosis was present in 55 calves (base excess levels above +5 mmol/l). However, the clinical status was not affected significantly by the metabolic alkalosis. Assuming re-evaluation of the calf after 24 hours, the tested decision tree can be recommended for the use in field practice with minor modifications. Calves that stand insecurely and are not able to correct their position if pushed require higher doses of sodium bicarbonate, if there is clinical evidence of a

  4. Construction and validation of a decision tree for treating metabolic acidosis in calves with neonatal diarrhea

    Science.gov (United States)

    2012-01-01

    Background The aim of the present prospective study was to investigate whether a decision tree based on basic clinical signs could be used to determine the treatment of metabolic acidosis in calves successfully without expensive laboratory equipment. A total of 121 calves with a diagnosis of neonatal diarrhea admitted to a veterinary teaching hospital were included in the study. The dosages of sodium bicarbonate administered followed simple guidelines based on the results of a previous retrospective analysis. Calves that were neither dehydrated nor assumed to be acidemic received an oral electrolyte solution. In cases in which intravenous correction of acidosis and/or dehydration was deemed necessary, the provided amount of sodium bicarbonate ranged from 250 to 750 mmol (depending on alterations in posture) and infusion volumes from 1 to 6.25 liters (depending on the degree of dehydration). Individual body weights of calves were disregarded. During the 24 hour study period the investigator was blinded to all laboratory findings. Results After being lifted, many calves were able to stand despite base excess levels below −20 mmol/l. Especially in those calves, metabolic acidosis was undercorrected with the provided amount of 500 mmol sodium bicarbonate, which was intended for calves standing insecurely. In 13 calves metabolic acidosis was not treated successfully as defined by an expected treatment failure or a measured base excess value below −5 mmol/l. By contrast, 24 hours after the initiation of therapy, a metabolic alkalosis was present in 55 calves (base excess levels above +5 mmol/l). However, the clinical status was not affected significantly by the metabolic alkalosis. Conclusions Assuming re-evaluation of the calf after 24 hours, the tested decision tree can be recommended for the use in field practice with minor modifications. Calves that stand insecurely and are not able to correct their position if pushed require higher doses of

  5. Construction and validation of a decision tree for treating metabolic acidosis in calves with neonatal diarrhea

    Directory of Open Access Journals (Sweden)

    Trefz Florian M

    2012-12-01

    Full Text Available Abstract Background The aim of the present prospective study was to investigate whether a decision tree based on basic clinical signs could be used to determine the treatment of metabolic acidosis in calves successfully without expensive laboratory equipment. A total of 121 calves with a diagnosis of neonatal diarrhea admitted to a veterinary teaching hospital were included in the study. The dosages of sodium bicarbonate administered followed simple guidelines based on the results of a previous retrospective analysis. Calves that were neither dehydrated nor assumed to be acidemic received an oral electrolyte solution. In cases in which intravenous correction of acidosis and/or dehydration was deemed necessary, the provided amount of sodium bicarbonate ranged from 250 to 750 mmol (depending on alterations in posture) and infusion volumes from 1 to 6.25 liters (depending on the degree of dehydration). Individual body weights of calves were disregarded. During the 24 hour study period the investigator was blinded to all laboratory findings. Results After being lifted, many calves were able to stand despite base excess levels below −20 mmol/l. Especially in those calves, metabolic acidosis was undercorrected with the provided amount of 500 mmol sodium bicarbonate, which was intended for calves standing insecurely. In 13 calves metabolic acidosis was not treated successfully as defined by an expected treatment failure or a measured base excess value below −5 mmol/l. By contrast, 24 hours after the initiation of therapy, a metabolic alkalosis was present in 55 calves (base excess levels above +5 mmol/l). However, the clinical status was not affected significantly by the metabolic alkalosis. Conclusions Assuming re-evaluation of the calf after 24 hours, the tested decision tree can be recommended for the use in field practice with minor modifications. Calves that stand insecurely and are not able to correct their position if pushed

  6. Determinants of farmers' tree planting investment decision as a degraded landscape management strategy in the central highlands of Ethiopia

    Science.gov (United States)

    Gessesse, B.; Bewket, W.; Bräuning, A.

    2015-11-01

    Land degradation due to a lack of sustainable land management practices is one of the critical challenges in many developing countries, including Ethiopia. This study explores the major determinants of farm-level tree planting decisions as a land management strategy in a typical farming and degraded landscape of the Modjo watershed, Ethiopia. The main data were generated from household surveys and analysed using descriptive statistics and a binary logistic regression model. The model significantly predicted farmers' tree planting decision (Chi-square = 37.29, df = 15, Pmanagement strategy. In this regard, the findings of the study show that local land users' willingness to adopt tree growing is a function of a wide range of biophysical, institutional, socioeconomic and household-level factors; however, household size, productive labour force availability, the disparity of schooling age, level of perception of the process of deforestation and the current land tenure system have a positive and significant influence on tree growing investment decisions in the study watershed. Eventually, the processes of land use conversion and land degradation are serious, which in turn has had adverse effects on agricultural productivity, local food security and the poverty trap nexus. Hence, devising sustainable and integrated land management policy options and implementing them would enhance ecological restoration and livelihood sustainability in the study watershed.

  7. What Satisfies Students? Mining Student-Opinion Data with Regression and Decision-Tree Analysis. AIR 2002 Forum Paper.

    Science.gov (United States)

    Thomas, Emily H.; Galambos, Nora

    To investigate how students' characteristics and experiences affect satisfaction, this study used regression and decision-tree analysis with the CHAID algorithm to analyze student opinion data from a sample of 1,783 college students. A data-mining approach identifies the specific aspects of students' university experience that most influence three…

  8. A Decision-Tree-Oriented Guidance Mechanism for Conducting Nature Science Observation Activities in a Context-Aware Ubiquitous Learning

    Science.gov (United States)

    Hwang, Gwo-Jen; Chu, Hui-Chun; Shih, Ju-Ling; Huang, Shu-Hsien; Tsai, Chin-Chung

    2010-01-01

    A context-aware ubiquitous learning environment is an authentic learning environment with personalized digital supports. While showing the potential of applying such a learning environment, researchers have also indicated the challenges of providing adaptive and dynamic support to individual students. In this paper, a decision-tree-oriented…

  9. VR-BFDT: A variance reduction based binary fuzzy decision tree induction method for protein function prediction.

    Science.gov (United States)

    Golzari, Fahimeh; Jalili, Saeed

    2015-07-21

    In protein function prediction (PFP) problem, the goal is to predict function of numerous well-sequenced known proteins whose function is not still known precisely. PFP is one of the special and complex problems in machine learning domain in which a protein (regarded as instance) may have more than one function simultaneously. Furthermore, the functions (regarded as classes) are dependent and also are organized in a hierarchical structure in the form of a tree or directed acyclic graph. One of the common learning methods proposed for solving this problem is decision trees in which, by partitioning data into sharp boundaries sets, small changes in the attribute values of a new instance may cause incorrect change in predicted label of the instance and finally misclassification. In this paper, a Variance Reduction based Binary Fuzzy Decision Tree (VR-BFDT) algorithm is proposed to predict functions of the proteins. This algorithm just fuzzifies the decision boundaries instead of converting the numeric attributes into fuzzy linguistic terms. It has the ability of assigning multiple functions to each protein simultaneously and preserves the hierarchy consistency between functional classes. It uses the label variance reduction as splitting criterion to select the best "attribute-value" at each node of the decision tree. The experimental results show that the overall performance of the proposed algorithm is promising. Copyright © 2015 Elsevier Ltd. All rights reserved.
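
    The abstract above names label variance reduction as the splitting criterion. As a rough illustration only, the sketch below scores crisp threshold splits on a single numeric attribute with a single scalar label. It is not the VR-BFDT algorithm: it ignores the fuzzy boundaries, the multi-label setting and the class hierarchy, and the function names `variance_reduction` and `best_threshold` are hypothetical.

```python
from statistics import pvariance

def variance_reduction(labels, left, right):
    """Drop in label variance when `labels` is partitioned into `left` and `right`."""
    n = len(labels)
    # Weighted average of the child variances (population variance).
    weighted = (len(left) / n) * pvariance(left) + (len(right) / n) * pvariance(right)
    return pvariance(labels) - weighted

def best_threshold(values, labels):
    """Return (threshold, reduction) for the crisp split with the largest reduction.

    Assumption: one numeric attribute and one scalar label per instance.
    Returns (None, 0.0) if no split reduces the variance.
    """
    pairs = sorted(zip(values, labels))
    best = (None, 0.0)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal attribute values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for x, y in pairs if x <= threshold]
        right = [y for x, y in pairs if x > threshold]
        gain = variance_reduction(labels, left, right)
        if gain > best[1]:
            best = (threshold, gain)
    return best

# Toy data (made up for illustration): the labels separate cleanly at 2.5.
print(best_threshold([1, 2, 3, 4], [0, 0, 1, 1]))  # (2.5, 0.25)
```

    A fuzzy variant would presumably replace the hard `x <= threshold` test with a membership degree near the boundary, which is the part the paper contributes and this sketch leaves out.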

  10. Classification of Parkinsonian syndromes from FDG-PET brain data using decision trees with SSM/PCA features.

    Science.gov (United States)

    Mudali, D; Teune, L K; Renken, R J; Leenders, K L; Roerdink, J B T M

    2015-01-01

    Medical imaging techniques like fluorodeoxyglucose positron emission tomography (FDG-PET) have been used to aid in the differential diagnosis of neurodegenerative brain diseases. In this study, the objective is to classify FDG-PET brain scans of subjects with Parkinsonian syndromes (Parkinson's disease, multiple system atrophy, and progressive supranuclear palsy) compared to healthy controls. The scaled subprofile model/principal component analysis (SSM/PCA) method was applied to FDG-PET brain image data to obtain covariance patterns and corresponding subject scores. The latter were used as features for supervised classification by the C4.5 decision tree method. Leave-one-out cross validation was applied to determine classifier performance. We carried out a comparison with other types of classifiers. The big advantage of decision tree classification is that the results are easy to understand by humans. A visual representation of decision trees strongly supports the interpretation process, which is very important in the context of medical diagnosis. Further improvements are suggested based on enlarging the number of the training data, enhancing the decision tree method by bagging, and adding additional features based on (f)MRI data.

  11. Identifying Risk Factors for Drug Use in an Iranian Treatment Sample: A Prediction Approach Using Decision Trees.

    Science.gov (United States)

    Amirabadizadeh, Alireza; Nezami, Hossein; Vaughn, Michael G; Nakhaee, Samaneh; Mehrpour, Omid

    2017-11-27

    Substance abuse exacts considerable social and health care burdens throughout the world. The aim of this study was to create a prediction model to better identify risk factors for drug use. A prospective cross-sectional study was conducted in South Khorasan Province, Iran. Of the total of 678 eligible subjects, 70% (n: 474) were randomly selected to provide a training set for constructing decision tree and multiple logistic regression (MLR) models. The remaining 30% (n: 204) were employed in a holdout sample to test the performance of the decision tree and MLR models. Predictive performance of different models was analyzed by the receiver operating characteristic (ROC) curve using the testing set. Independent variables were selected from demographic characteristics and history of drug use. For the decision tree model, the sensitivity and specificity for identifying people at risk for drug abuse were 66% and 75%, respectively, while the MLR model was somewhat less effective at 60% and 73%. Key independent variables in the analyses included first substance experience, age at first drug use, age, place of residence, history of cigarette use, and occupational and marital status. While study findings are exploratory and lack generalizability they do suggest that the decision tree model holds promise as an effective classification approach for identifying risk factors for drug use. Convergent with prior research in Western contexts is that age of drug use initiation was a critical factor predicting a substance use disorder.
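
    The evaluation described above (a random 70/30 split of 678 subjects, then sensitivity and specificity on the holdout set) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the seed and the toy labels are assumptions, and only the subject count and split fraction come from the abstract.

```python
import random

def split_holdout(items, train_fraction=0.7, seed=0):
    """Shuffle a copy of `items` and split it into (training, holdout) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed is an arbitrary choice
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels; 1 marks the at-risk class.

    Assumes both classes occur in `y_true`; otherwise a division by zero is raised.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# 678 subjects split 70/30, matching the sample sizes reported in the abstract.
train, holdout = split_holdout(range(678))
print(len(train), len(holdout))  # 474 204

# Toy labels (made up for illustration), not study data.
print(sensitivity_specificity([1, 1, 1, 0, 0, 0, 0, 1],
                              [1, 0, 1, 0, 0, 1, 0, 1]))  # (0.75, 0.75)
```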

  12. Knowledge discovery and data mining in psychology: Using decision trees to predict the Sensation Seeking Scale score

    Directory of Open Access Journals (Sweden)

    Andrej Kastrin

    2008-12-01

    Full Text Available Knowledge discovery from data is an interdisciplinary research field combining technology and knowledge from domains of statistics, databases, machine learning and artificial intelligence. Data mining is the most important part of knowledge discovery process. The objective of this paper is twofold. The first objective is to point out the qualitative shift in research methodology due to evolving knowledge discovery technology. The second objective is to introduce the technique of decision trees to psychological domain experts. We illustrate the utility of the decision trees on the prediction model of sensation seeking. Prediction of the Zuckerman's Sensation Seeking Scale (SSS-V) score was based on the bundle of Eysenck's personality traits and Pavlovian temperament properties. Predictors were operationalized on the basis of Eysenck Personality Questionnaire (EPQ) and Slovenian adaptation of the Pavlovian Temperament Survey (SVTP). The standard statistical technique of multiple regression was used as a baseline method to evaluate the decision trees methodology. The multiple regression model was the most accurate model in terms of predictive accuracy. However, the decision trees could serve as a powerful general method for initial exploratory data analysis, data visualization and knowledge discovery.

  13. Model-independent evaluation of tumor markers and a logistic-tree approach to diagnostic decision support.

    Science.gov (United States)

    Ni, Weizeng; Huang, Samuel H; Su, Qiang; Shi, Jinghua

    2014-01-01

    Sensitivity and specificity of using individual tumor markers hardly meet the clinical requirement. This challenge gave rise to many efforts, e.g., combining multiple tumor markers and employing machine learning algorithms. However, results from different studies are often inconsistent, which is partially attributed to the use of different evaluation criteria. Also, the wide use of model-dependent validation leads to a high possibility of data overfitting when complex models are used for diagnosis. We propose two model-independent criteria, namely, area under the curve (AUC) and Relief to evaluate the diagnostic values of individual and multiple tumor markers, respectively. For diagnostic decision support, we propose the use of logistic-tree which combines decision tree and logistic regression. Application on a colorectal cancer dataset shows that the proposed evaluation criteria produce results that are consistent with current knowledge. Furthermore, the simple and highly interpretable logistic-tree has diagnostic performance that is competitive with other complex models.

  14. Model-Independent Evaluation of Tumor Markers and a Logistic-Tree Approach to Diagnostic Decision Support

    Directory of Open Access Journals (Sweden)

    Weizeng Ni

    2014-01-01

    Full Text Available Sensitivity and specificity of using individual tumor markers hardly meet the clinical requirement. This challenge gave rise to many efforts, e.g., combining multiple tumor markers and employing machine learning algorithms. However, results from different studies are often inconsistent, which is partially attributed to the use of different evaluation criteria. Also, the wide use of model-dependent validation leads to a high possibility of data overfitting when complex models are used for diagnosis. We propose two model-independent criteria, namely, area under the curve (AUC) and Relief to evaluate the diagnostic values of individual and multiple tumor markers, respectively. For diagnostic decision support, we propose the use of logistic-tree which combines decision tree and logistic regression. Application on a colorectal cancer dataset shows that the proposed evaluation criteria produce results that are consistent with current knowledge. Furthermore, the simple and highly interpretable logistic-tree has diagnostic performance that is competitive with other complex models.

  15. Visualization of spatial decision tree for predicting hotspot occurrence in land and forest in Rokan Hilir District Riau

    Science.gov (United States)

    Primajaya, Aji; Sukaesih Sitanggang, Imas; Syaufina, Lailan

    2017-01-01

    Visualization is an important issue in data mining because it makes patterns extracted from a dataset easier to understand. This research applied the Bottom-Up Approach method to develop a visualization module for a spatial decision tree in a geographic information system. The spatial data used in this work consist of nine explanatory layers and one target layer. The explanatory layers include maximum daily temperature, daily precipitation, wind speed, distance to nearest river, distance to nearest road, land cover, peatland type, peatland depth and income source. The target layer contains hotspot and non-hotspot points that occurred in 2008. The result is a visualization module for the spatial decision tree whose main features include a mapping window, an interactive window, and tree node and tabular visualization for predicting hotspot occurrence.

  16. Options Evaluation for Remediation of the Gunnar Site Using a Decision-Tree Approach

    Energy Technology Data Exchange (ETDEWEB)

    Yankovich, Tamara L. [International Atomic Energy Agency, P.O. Box 100, 1400 Vienna (Austria); Hachkowski, Andrea [CH2M Hill Canada Limited, 1305 Kenaston Blvd, Winnipeg, Manitoba, R3P 2P2 (Canada); Klyashtorin, Alexey [Saskatchewan Research Council, 15 Innovation Blvd no.125, Saskatoon, Saskatchewan, S7N 2X8 (Canada)

    2014-07-01

    Current best practice in the nuclear industry involves proactive planning of activities from cradle-to-grave over the entire nuclear life cycle in accordance with national requirements and international guidance. This includes the development of detailed decommissioning plans (DDP) at an early stage to facilitate proactive, responsible decision-making as activities are being planned. It should be noted, however, that the current approach may not be applicable to historic nuclear legacy sites, such as abandoned uranium mines and mills, which had operated in the past under less stringent regulatory regimes. In such cases, records documenting past activities are often not available and monitoring data may not have been collected, thereby limiting knowledge of impacts related to past activities. This can lead to challenges in gaining regulatory and funding approvals related to the remediation of such sites, especially given the costs that can be associated with remediation and the uncertainties in characterizing the existing situation. The Gunnar Site, in northern Saskatchewan, is an example of an abandoned uranium mine/mill site, which was operated from the late 1950s to the early 1960s under a different regulatory regime than today. Due to the lack of monitoring data and records for the site, and the corresponding uncertainties, a number of precedent-setting approaches have been developed and applied as part of the environmental impact assessment (EIA) process. Specifically, unlike traditional environmental assessments for planned and operating facilities, it was not possible to identify a preferred and alternative remedial option. Instead, a step-wise decision-tree approach has been developed to identify all potentially feasible remedial options and to map out key decision points during the licensing phase of the project (following approval of the environmental assessment), when final remedial options will be selected. The presentation will provide

  17. The risk evaluation of difficult substances in USES 2.0 and EUSES. A decision tree for data gap filling of Kow, Koc and BCF

    NARCIS (Netherlands)

    Beelen P van; ECO

    2000-01-01

    This report presents a decision tree for the risk evaluation of the so-called "difficult" substances with the Uniform System for the Evaluation of Substances (USES). The decision tree gives practical guidelines for the regulatory authorities to evaluate notified substances like organometallic

  18. Classification and Progression Based on CFS-GA and C5.0 Boost Decision Tree of TCM Zheng in Chronic Hepatitis B.

    Science.gov (United States)

    Chen, Xiao Yu; Ma, Li Zhuang; Chu, Na; Zhou, Min; Hu, Yiyang

    2013-01-01

    Chronic hepatitis B (CHB) is a serious public health problem, and Traditional Chinese Medicine (TCM) plays an important role in the control and treatment for CHB. In the treatment of TCM, zheng discrimination is the most important step. In this paper, an approach based on CFS-GA (Correlation based Feature Selection and Genetic Algorithm) and C5.0 boost decision tree is used for zheng classification and progression in the TCM treatment of CHB. The CFS-GA performs better than the typical method of CFS. By CFS-GA, the acquired attribute subset is classified by C5.0 boost decision tree for TCM zheng classification of CHB, and C5.0 decision tree outperforms two typical decision trees of NBTree and REPTree on CFS-GA, CFS, and nonselection in comparison. Based on the critical indicators from C5.0 decision tree, important lab indicators in zheng progression are obtained by the method of stepwise discriminant analysis for expressing TCM zhengs in CHB, and alterations of the important indicators are also analyzed in zheng progression. In conclusion, all the three decision trees perform better on CFS-GA than on CFS and nonselection, and C5.0 decision tree outperforms the two typical decision trees both on attribute selection and nonselection.

  19. Analysis of meal patterns with the use of supervised data mining techniques--artificial neural networks and decision trees.

    Science.gov (United States)

    Hearty, Aine P; Gibney, Michael J

    2008-12-01

    At present, the analysis of dietary patterns is based on the intake of individual foods. This article demonstrates how a coding system at the meal level might be analyzed by using data mining techniques. The objective was to evaluate the usability of supervised data mining methods to predict an aspect of dietary quality based on dietary intake with a food-based coding system and a novel meal-based coding system. Food consumption databases from the North-South Ireland Food Consumption Survey 1997-1999 were used. This was a randomized cross-sectional study of 7-d recorded food and nutrient intakes of a representative sample of 1379 Irish adults. Meal definitions were recorded by the respondent. A healthy eating index (HEI) score was developed. Artificial neural networks (ANNs) and decision trees were used to predict quintiles of the HEI based on combinations of foods consumed at breakfast and main meals. This study applied both data mining techniques to the food and meal-based coding systems. The ANN had a slightly higher accuracy than did the decision tree in relation to its ability to predict HEI quintiles 1 and 5 based on the food coding system (78.7% compared with 76.9% and 71.9% compared with 70.1%, respectively). However, the decision tree had higher accuracies than did the ANN on the basis of the meal coding system (67.5% compared with 54.6% and 75.1% compared with 72.4%, respectively). ANNs and decision trees were successfully used to predict an aspect of dietary quality. However, further exploration of the use of ANNs and decision trees in dietary pattern analysis is warranted.

  20. Applying of Decision Tree Analysis to Risk Factors Associated with Pressure Ulcers in Long-Term Care Facilities.

    Science.gov (United States)

    Moon, Mikyung; Lee, Soo-Kyoung

    2017-01-01

    The purpose of this study was to use decision tree analysis to explore the factors associated with pressure ulcers (PUs) among elderly people admitted to Korean long-term care facilities. The data were extracted from the 2014 National Inpatient Sample (NIS) data of the Health Insurance Review and Assessment Service (HIRA). A MapReduce-based program was implemented to join and filter 5 tables of the NIS. The outcome predicted by the decision tree model was the prevalence of PUs as defined by the Korean Standard Classification of Disease-7 (KCD-7; code L89*). Using R 3.3.1, a decision tree was generated with the finalized 15,856 cases and 830 variables. The decision tree displayed 15 subgroups with 8 variables showing 0.804 accuracy, 0.820 sensitivity, and 0.787 specificity. The most significant primary predictor of PUs was length of stay less than 0.5 day. Other predictors were the presence of an infectious wound dressing, followed by having diagnoses numbering less than 3.5 and the presence of a simple dressing. Among diagnoses, "injuries to the hip and thigh" was the top predictor, ranking 5th overall. Total hospital cost exceeding 2,200,000 Korean won (US $2,000) rounded out the top 7. These results support previous studies that showed length of stay, comorbidity, and total hospital cost were associated with PUs. Moreover, wound dressings were commonly used to treat PUs. They also show that machine learning, such as a decision tree, could effectively predict PUs using big data.

  1. Decisions, Decisions!

    Science.gov (United States)

    McFadden, F. Lee

    1975-01-01

    A self-instructional program on decision making was used in conjunction with workshops to introduce the staff of an instructional materials company to the decision tree process as they used it to study their own film production problem. (Author/MS)

  2. Biomass production and essential oil yield from leaves, fine stems and resprouts using pruning the crown of Aniba canelilla (H.B.K.) (Lauraceae) in the Central Amazon

    OpenAIRE

    Manhães,Adriana Pellegrini; Veiga-Júnior,Valdir Florêncio da; Wiedemann,Larissa Silveira Moreira; Fernandes,Karenn Silveira; Sampaio,Paulo de Tarso Barbosa

    2012-01-01

    Aniba canelilla (H.B.K.) Mez. is a tree species from the Amazon that produces essential oil. Extracting oil from its leaves and stems can be an alternative that avoids cutting down the tree to produce essential oil. The aim of this study was to analyse factors that may influence the essential oil production and the biomass of resprouts after pruning the leaves and stems of A. canelilla trees. The tree crowns were pruned in the wet season and after nine months the leaves and stems of the re...

  3. Decision tree-based modeling of androgen pathway genes and prostate cancer risk.

    Science.gov (United States)

    Barnholtz-Sloan, Jill S; Guan, Xiaowei; Zeigler-Johnson, Charnita; Meropol, Neal J; Rebbeck, Timothy R

    2011-06-01

    Inherited variability in genes that influence androgen metabolism has been associated with risk of prostate cancer. The objective of this analysis was to evaluate interactions for prostate cancer risk by using classification and regression tree (CART) models (i.e., decision trees), and to evaluate whether these interactive effects add information about prostate cancer risk prediction beyond that of "traditional" risk factors. We compared CART models with traditional logistic regression (LR) models for associations of factors with prostate cancer risk using 1,084 prostate cancer cases and 941 controls. All analyses were stratified by race. We used unconditional LR to complement and compare with the race-stratified CART results using the area under curve (AUC) for the receiver operating characteristic curves. The CART modeling of prostate cancer risk showed different interaction profiles by race. For European Americans, interactions among CYP3A43 genotype, history of benign prostate hypertrophy, family history of prostate cancer, and age at consent revealed a distinct hierarchy of gene-environment and gene-gene interactions, whereas for African Americans, interactions among family history of prostate cancer, individual proportion of European ancestry, number of GGC androgen receptor repeats, and CYP3A4/CYP3A5 haplotype revealed distinct interaction effects from those found in European Americans. For European Americans, the CART model had the highest AUC whereas for African Americans, the LR model with the CART discovered factors had the largest AUC. These results provide new insight into underlying prostate cancer biology for European Americans and African Americans. ©2011 AACR.

  4. Megalourethra associated with prune-belly syndrome.

    Science.gov (United States)

    Gökalp, A; Gültekin, E Y

    1993-01-01

    A 14-day-old male infant with megalourethra is presented because of the rarity of the anomaly and its association with prune-belly syndrome. The lax, wrinkled appearance of the abdomen, bilateral cryptorchidism and severe dilatation of the urinary system are features included in the classic triad of the prune-belly syndrome. Our patient had the scaphoid variety of megalourethra since the penis appeared elongated and floppy in the fusiform form.

  5. Visual versus chemical evaluation: Effects of pruning wood decomposition on soil quality in a cherry orchard (Northeast Germany).

    Science.gov (United States)

    van Dongen, Renee; Germer, Sonja; Kern, Jürgen; Stoorvogel, Jetse

    2016-04-01

    Returning crop residues to the soil is a well-known practice for maintaining soil quality in agriculture. In an orchard, pruning material could be returned for soil and water conservation or could be removed for energy production. Pruning wood decomposition rates and their impact on soil quality and greenhouse-gas emissions depend on climate, soil type, land management and water availability. Changing the soil management from leaving wood prunings on the soil to removing them from the orchard is expected to result in a slow but lasting change of soil quality. Therefore a quick and cost-effective technique for soil quality evaluation is needed. This study aims to compare pruning wood decomposition effects on soil quality determined by soil chemistry (pH, C/N-ratio) or by Visual Soil Examination and Evaluation (VSEE). In addition, treatment effects on soil quality were compared for sampling positions in tree rows versus interrows. In a cherry orchard (Northeast Germany) six plots were established spreading over two planting rows. At each plot, three subplots with 1x (0.55 kg/m2), 2x (1.10 kg/m2) and 10x (5.50 kg/m2) the average pruning wood rates were installed in both tree rows and interrows. Five months later the soils were sampled and a Visual Soil Examination and Evaluation (VSEE) was applied. To relate wood decomposition to impacts on soil quality, wood bags were placed in each plot and were sampled at time intervals of 5 weeks (up to a maximum of 20 weeks). Wood decomposition was characterized by decomposition rates and changes in carbon and nitrogen contents. To assess environmental effects, CO2, N2O and CH4 emissions or uptake from soils with different pruning rates were determined with the closed chamber method. There were no significant differences in pH and C/N-ratio between the 3 pruning rates. However, pH was significantly higher in the tree row compared to the interrow for the 10-fold pruning rate. The 10-fold pruning rate had significantly higher VSEE

  6. Application Of Decision Tree Approach To Student Selection Model- A Case Study

    Science.gov (United States)

    Harwati; Sudiya, Amby

    2016-01-01

    The main purpose of the institution is to provide quality education to students and to improve the quality of managerial decisions. One way to improve the quality of students is to make the selection of new students more selective. This research takes as its case the selection of new students at the Islamic University of Indonesia, Yogyakarta, Indonesia. One of the university's selection routes is an administrative screening based on prospective students' high school records, without a written test. Currently, this kind of selection does not yet have a standard model and criteria. Selection is done only by comparing candidates' application files, so subjective assessment is very likely because there are no standard criteria that can differentiate the quality of one student from another. By applying data mining classification techniques, a selection model for new students can be built that includes standard criteria such as area of origin, school status, average score and so on. These criteria are determined using rules derived from classifying the academic achievement (GPA) of students in previous years who entered the university through the same route. The decision tree method with the C4.5 algorithm is used here. The results show that priority for admission is given to students who meet the following criteria: they come from the island of Java, attended a public school, majored in science, have an average score above 75, and have at least one achievement during their high school studies.

  7. Trees

    Directory of Open Access Journals (Sweden)

    Henri Epstein

    2016-11-01

    Full Text Available An algebraic formalism, developed with V. Glaser and R. Stora for the study of the generalized retarded functions of quantum field theory, is used to prove a factorization theorem which provides a complete description of the generalized retarded functions associated with any tree graph. Integrating over the variables associated to internal vertices to obtain the perturbative generalized retarded functions for interacting fields arising from such graphs is shown to be possible for a large category of space–times.

  8. Using farm trees for fuelwood

    Energy Technology Data Exchange (ETDEWEB)

    Poulsen, G.

    1983-01-01

    In the tropics, a significant proportion of wood supplies is obtained from trees on farmland rather than from forest. Reliable estimates of wood fuel resources are difficult to obtain by conventional mensuration techniques since such trees are often subjected to regular heavy pruning and pollarding. Productive potential of hedgerows and other small scrub vegetation used for fuel is also difficult to measure.

  9. Uninjured trees - a meaningful guide to white-pine weevil control decisions

    Science.gov (United States)

    William E. Waters

    1962-01-01

    The white-pine weevil, Pissodes strobi, is a particularly insidious forest pest that can render a stand of host trees virtually worthless. It rarely, if ever, kills a tree; but the crooks, forks, and internal defects that develop in attacked trees over a period of years may reduce the merchantable volume and value of the tree at harvest age to zero. Dollar losses are...

  10. Importance Sampling Based Decision Trees for Security Assessment and the Corresponding Preventive Control Schemes: the Danish Case Study

    DEFF Research Database (Denmark)

    Liu, Leo; Rather, Zakir Hussain; Chen, Zhe

    2013-01-01

    Decision Trees (DT) based security assessment helps Power System Operators (PSO) by providing them with the most significant system attributes and guiding them in implementing the corresponding emergency control actions to prevent system insecurity and blackouts. DT is obtained offline from time......-domain simulation and the process of data mining, which is then implemented online as guidelines for preventive control schemes. An algorithm named Classification and Regression Trees (CART) is used to train the DT and key to this approach lies on the accuracy of DT. This paper proposes contingency oriented DT...

  11. Comparison between Decision Tree and Genetic Programming to distinguish healthy from stroke postural sway patterns.

    Science.gov (United States)

    Marrega, Luiz H G; Silva, Simone M; Manffra, Elisangela F; Nievola, Julio C

    2015-01-01

    Maintaining balance is a motor task of crucial importance for humans to perform their daily activities safely and independently. Studies in the field of Artificial Intelligence have considered different classification methods in order to distinguish healthy subjects from patients with certain motor disorders based on their postural strategies during balance control. The main purpose of this paper is to compare the performance of Decision Tree (DT) and Genetic Programming (GP), both classification methods that are easy for health professionals to interpret, in distinguishing postural sway patterns produced by healthy and stroke individuals based on 16 widely used posturographic variables. For this purpose, we used a posturographic dataset of time-series of center-of-pressure displacements derived from 19 stroke patients and 19 healthy matched subjects in three quiet standing tasks of balance control. Then, DT and GP models were trained and tested under two different experiments where accuracy, sensitivity and specificity were adopted as performance metrics. The DT method performed statistically significantly better (P < 0.05) in both cases, showing for example an accuracy of 72.8% against 69.2% from GP in the second experiment of this paper.

  12. OmniGA: Optimized Omnivariate Decision Trees for Generalizable Classification Models

    KAUST Repository

    Magana-Mora, Arturo

    2017-06-14

    Classification problems from different domains vary in complexity, size, and imbalance of the number of samples from different classes. Although several classification models have been proposed, selecting the right model and parameters for a given classification task to achieve good performance is not trivial. Therefore, there is a constant interest in developing novel robust and efficient models suitable for a great variety of data. Here, we propose OmniGA, a framework for the optimization of omnivariate decision trees based on a parallel genetic algorithm, coupled with deep learning structure and ensemble learning methods. The performance of the OmniGA framework is evaluated on 12 different datasets taken mainly from biomedical problems and compared with the results obtained by several robust and commonly used machine-learning models with optimized parameters. The results show that OmniGA systematically outperformed these models for all the considered datasets, reducing the F score error in the range from 100% to 2.25%, compared to the best performing model. This demonstrates that OmniGA produces robust models with improved performance. OmniGA code and datasets are available at www.cbrc.kaust.edu.sa/omniga/.

  13. ARABIC TEXT CLASSIFICATION USING NEW STEMMER FOR FEATURE SELECTION AND DECISION TREES

    Directory of Open Access Journals (Sweden)

    SAID BAHASSINE

    2017-06-01

    Full Text Available Text classification is the process of assigning unclassified text to appropriate classes based on its content. The most prevalent representation for text classification is the bag-of-words vector. In this representation, the words that appear in documents often have multiple morphological structures and grammatical forms. In most cases, these morphological variants of a word belong to the same category. In the first part of this paper, a new stemming algorithm was developed in which each term of a given document is represented by its root. In the second part, a comparative study is conducted of the impact of two stemming algorithms, namely Khoja’s stemmer and our new stemmer (referred to hereafter as origin-stemmer), on Arabic text classification. This investigation was carried out using chi-square as a feature selection method to reduce the dimensionality of the feature space, and a decision tree classifier. In order to evaluate the performance of the classifier, this study used a corpus that consists of 5070 documents independently classified into six categories: sport, entertainment, business, Middle East, switch and world, on the WEKA toolkit. The recall, f-measure and precision measures are used to compare the performance of the obtained models. The experimental results show that text classification using the root stemmer outperforms classification using Khoja’s stemmer. The f-measure was 92.9% in the sport category and 89.1% in the business category.
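
    To make the chi-square feature selection step concrete, the sketch below scores each term against one category with the standard 2x2 chi-square statistic and ranks the terms. It is a minimal illustration, not the authors' WEKA pipeline; the function names and the toy documents are assumptions.

```python
def chi_square(n11, n10, n01, n00):
    """Chi-square statistic for a 2x2 term/category contingency table.

    n11: documents in the category that contain the term
    n10: documents outside the category that contain the term
    n01: documents in the category without the term
    n00: documents outside the category without the term
    """
    n = n11 + n10 + n01 + n00
    num = n * (n11 * n00 - n10 * n01) ** 2
    den = (n11 + n01) * (n10 + n00) * (n11 + n10) * (n01 + n00)
    return num / den if den else 0.0


def rank_terms(docs, labels, category):
    """Rank terms by chi-square score for `category`, best first.

    docs: list of sets of (stemmed) terms, one set per document
    labels: category label of each document
    """
    vocab = set().union(*docs)
    scores = {}
    for term in vocab:
        n11 = sum(1 for d, y in zip(docs, labels) if term in d and y == category)
        n10 = sum(1 for d, y in zip(docs, labels) if term in d and y != category)
        n01 = sum(1 for d, y in zip(docs, labels) if term not in d and y == category)
        n00 = len(docs) - n11 - n10 - n01
        scores[term] = chi_square(n11, n10, n01, n00)
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(scores, key=lambda t: (-scores[t], t))


# Toy corpus (hypothetical, already stemmed).
docs = [{"goal", "match"}, {"goal", "team"}, {"stock", "market"}, {"stock", "team"}]
labels = ["sport", "sport", "business", "business"]
print(rank_terms(docs, labels, "sport"))
```

    Keeping only the top-ranked terms reduces the dimensionality of the bag-of-words vector before a classifier is trained.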

  14. Artificial neural networks and decision tree model analysis of liver cancer proteomes.

    Science.gov (United States)

    Luk, John M; Lam, Brian Y; Lee, Nikki P Y; Ho, David W; Sham, Pak C; Chen, Lei; Peng, Jirun; Leng, Xisheng; Day, Philip J; Fan, Sheung-Tat

    2007-09-14

    Hepatocellular carcinoma (HCC) is a heterogeneous cancer and usually diagnosed at late advanced tumor stages of high lethality. The present study attempted to obtain a proteome-wide analysis of HCC in comparison with adjacent non-tumor liver tissues, in order to facilitate biomarkers' discovery and to investigate the mechanisms of HCC development. A cohort of 66 Chinese patients with HCC was included for proteomic profiling study by two-dimensional gel electrophoresis (2-DE) analysis. Artificial neural network (ANN) and decision tree (CART) data-mining methods were employed to analyze the profiling data and to delineate significant patterns and trends for discriminating HCC from non-malignant liver tissues. Protein markers were identified by tandem MS/MS. A total of 132 proteome datasets were generated by 2-DE expression profiling analysis, and each with 230 consolidated protein expression intensities. Both the data-mining algorithms successfully distinguished the HCC phenotype from other non-malignant liver samples. The detection sensitivity and specificity of ANN were 96.97% and 87.88%, while those of CART were 81.82% and 78.79%, respectively. The three biological classifiers in the CART model were identified as cytochrome b5, heat shock 70 kDa protein 8 isoform 2, and cathepsin B. The 2-DE-based proteomic profiling approach combined with the ANN or CART algorithm yielded satisfactory performance on identifying HCC and revealed potential candidate cancer biomarkers.
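
    For readers who want to see how the sensitivity and specificity figures are defined, the sketch below computes both from confusion-matrix counts. The counts in the usage example are hypothetical: they assume 66 tumor and 66 non-tumor samples, a split the abstract does not state explicitly, and they were chosen only because they reproduce the reported ANN percentages.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: diseased samples classified as diseased
    fn: diseased samples classified as healthy
    tn: healthy samples classified as healthy
    fp: healthy samples classified as diseased
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts, assuming 66 tumor and 66 non-tumor samples.
sens, spec = sensitivity_specificity(tp=64, fn=2, tn=58, fp=8)
print(f"{100 * sens:.2f}% {100 * spec:.2f}%")
```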

  15. [A prediction model for internet game addiction in adolescents: using a decision tree analysis].

    Science.gov (United States)

    Kim, Ki Sook; Kim, Kyung Hee

    2010-06-01

    This study was designed to build a theoretical frame to provide practical help to prevent and manage adolescent internet game addiction by developing a prediction model through a comprehensive analysis of related factors. The participants were 1,318 students studying in elementary, middle, and high schools in Seoul and Gyeonggi Province, Korea. Collected data were analyzed using the SPSS program. Decision Tree Analysis using the Clementine program was applied to build an optimum and significant prediction model to predict internet game addiction related to various factors, especially parent related factors. From the data analyses, the prediction model for factors related to internet game addiction presented with 5 pathways. Causative factors included gender, type of school, siblings, economic status, religion, time spent alone, gaming place, payment to Internet café, frequency, duration, parent's ability to use internet, occupation (mother), trust (father), expectations regarding adolescent's study (mother), supervising (both parents), rearing attitude (both parents). The results suggest preventive and managerial nursing programs for specific groups by path. Use of this predictive model can expand the role of school nurses, not only in counseling addicted adolescents but also, in developing and carrying out programs with parents and approaching adolescents individually through databases and computer programming.

  16. OmniGA: Optimized Omnivariate Decision Trees for Generalizable Classification Models.

    Science.gov (United States)

    Magana-Mora, Arturo; Bajic, Vladimir B

    2017-06-20

    Classification problems from different domains vary in complexity, size, and imbalance of the number of samples from different classes. Although several classification models have been proposed, selecting the right model and parameters for a given classification task to achieve good performance is not trivial. Therefore, there is a constant interest in developing novel robust and efficient models suitable for a great variety of data. Here, we propose OmniGA, a framework for the optimization of omnivariate decision trees based on a parallel genetic algorithm, coupled with deep learning structure and ensemble learning methods. The performance of the OmniGA framework is evaluated on 12 different datasets taken mainly from biomedical problems and compared with the results obtained by several robust and commonly used machine-learning models with optimized parameters. The results show that OmniGA systematically outperformed these models for all the considered datasets, reducing the F1 score error in the range from 100% to 2.25%, compared to the best performing model. This demonstrates that OmniGA produces robust models with improved performance. OmniGA code and datasets are available at www.cbrc.kaust.edu.sa/omniga/.

  17. Boosted Decision Trees in the CMS Level-1 Endcap Muon Trigger

    CERN Document Server

    Acosta, Darin Edward; Busch, Elena Laura; Carnes, Andrew Mathew; Furic, Ivan-Kresimir; Gleyzer, Sergei; Kotov, Khristian; Low, Jia Fu; Madorsky, Alexander; Rorie, Jamal Tildon; Scurlock, Bobby; Shi, Wei

    2017-01-01

    The first implementation of Boosted Decision Trees (BDTs) inside a Level-1 trigger system at the LHC is presented. The Endcap Muon Track Finder (EMTF) at CMS uses BDTs to infer the momentum of muons in the forward region of the detector, based on 25 different variables. Combinations of these variables are evaluated offline using regression BDTs, whose output is stored in 1.2 GB look-up tables (LUTs) in the EMTF hardware. These BDTs take advantage of complex correlations between variables, the inhomogeneous magnetic field, and non-linear effects such as inelastic scattering to distinguish high-momentum signal muons from the overwhelming low-momentum background. The LUTs are used to turn the complex BDT evaluation into a simple look-up operation in fixed low latency. The new momentum assignment algorithm has reduced the trigger rate by a factor of 3 at the 25 GeV trigger threshold with respect to the legacy system, with further improvements foreseen in the coming year.
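
    The idea of replacing an expensive model evaluation with a table look-up can be sketched as follows. This is a toy illustration only: the two quantized inputs, their bit widths and `toy_model` are hypothetical stand-ins, not the EMTF variables or the trained BDT.

```python
from itertools import product


def pack_address(values, bits_per_input):
    """Concatenate quantized inputs into a single integer LUT address."""
    addr = 0
    for v, b in zip(values, bits_per_input):
        if not 0 <= v < (1 << b):
            raise ValueError(f"value {v} does not fit in {b} bits")
        addr = (addr << b) | v
    return addr


def build_lut(model, bits_per_input):
    """Offline step: evaluate `model` once for every possible address."""
    lut = [0] * (1 << sum(bits_per_input))
    for values in product(*(range(1 << b) for b in bits_per_input)):
        lut[pack_address(values, bits_per_input)] = model(*values)
    return lut


def toy_model(dphi, dtheta):
    """Hypothetical stand-in for an offline-trained regression model."""
    return 100 // (1 + dphi) + dtheta


BITS = (3, 2)  # hypothetical bit widths of the two quantized inputs
lut = build_lut(toy_model, BITS)

# Online step: a single indexed read replaces the model evaluation.
print(lut[pack_address((5, 2), BITS)])
```

    The table size grows as 2 to the power of the total number of address bits, which is why the inputs must be quantized to a limited number of bits before the table is filled.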

  18. Plant MicroRNA Prediction by Supervised Machine Learning Using C5.0 Decision Trees

    Directory of Open Access Journals (Sweden)

    Philip H. Williams

    2012-01-01

    Full Text Available MicroRNAs (miRNAs) are nonprotein coding RNAs between 20 and 22 nucleotides long that attenuate protein production. Different types of sequence data are being investigated for novel miRNAs, including genomic and transcriptomic sequences. A variety of machine learning methods have successfully predicted miRNA precursors, mature miRNAs, and other nonprotein coding sequences. MirTools, mirDeep2, and miRanalyzer require “read count” to be included with the input sequences, which restricts their use to deep-sequencing data. Our aim was to train a predictor using a cross-section of different species to accurately predict miRNAs outside the training set. We wanted a system that did not require read-count for prediction and could therefore be applied to short sequences extracted from genomic, EST, or RNA-seq sources. A miRNA-predictive decision-tree model has been developed by supervised machine learning. It only requires that the corresponding genome or transcriptome is available within a sequence window that includes the precursor candidate so that the required sequence features can be collected. Some of the most critical features for training the predictor are the miRNA:miRNA∗ duplex energy and the number of mismatches in the duplex. We present a cross-species plant miRNA predictor with 84.08% sensitivity and 98.53% specificity based on rigorous testing by leave-one-out validation.

  19. Plant MicroRNA Prediction by Supervised Machine Learning Using C5.0 Decision Trees.

    Science.gov (United States)

    Williams, Philip H; Eyles, Rod; Weiller, Georg

    2012-01-01

    MicroRNAs (miRNAs) are nonprotein coding RNAs between 20 and 22 nucleotides long that attenuate protein production. Different types of sequence data are being investigated for novel miRNAs, including genomic and transcriptomic sequences. A variety of machine learning methods have successfully predicted miRNA precursors, mature miRNAs, and other nonprotein coding sequences. MirTools, mirDeep2, and miRanalyzer require "read count" to be included with the input sequences, which restricts their use to deep-sequencing data. Our aim was to train a predictor using a cross-section of different species to accurately predict miRNAs outside the training set. We wanted a system that did not require read-count for prediction and could therefore be applied to short sequences extracted from genomic, EST, or RNA-seq sources. A miRNA-predictive decision-tree model has been developed by supervised machine learning. It only requires that the corresponding genome or transcriptome is available within a sequence window that includes the precursor candidate so that the required sequence features can be collected. Some of the most critical features for training the predictor are the miRNA:miRNA(∗) duplex energy and the number of mismatches in the duplex. We present a cross-species plant miRNA predictor with 84.08% sensitivity and 98.53% specificity based on rigorous testing by leave-one-out validation.

  20. Detection of Spam Email by Combining Harmony Search Algorithm and Decision Tree

    Directory of Open Access Journals (Sweden)

    M. Z. Gashti

    2017-06-01

    Full Text Available Spam email is probably the main problem faced by most e-mail users. Spam email detection involves many features, some of which have little effect on detection and skew the detection and classification of spam email. Thus, Feature Selection (FS) is one of the key topics in spam email detection systems. Choosing the important and effective features can optimize classification performance. A feature selector has the task of finding a subset of features that improves the accuracy of predictions. In this paper, a hybrid of the Harmony Search Algorithm (HSA) and a decision tree is used for selecting the best features and for classification. The results obtained on the Spam-base dataset show that the recognition accuracy of the proposed model is 95.25%, which is high in comparison with models such as SVM, NB, J48 and MLP. Also, the accuracy of the proposed model on the Ling-spam and PU1 datasets is high in comparison with models such as NB, SVM and LR.

  1. Using boosted decision trees for tau identification in the ATLAS experiment

    CERN Document Server

    Godfrey, Jennifer

    The ATLAS detector will begin taking data from p-p collisions in 2009. This experiment will allow for many different physics measurements and searches. The production of tau leptons at the LHC is a key signature of the decay of both the standard model Higgs (via H → ττ) and SUSY particles. Taus have a short lifetime (cτ = 87 μm) and decay hadronically 65% of the time. Many QCD interactions produce similar hadronic showers and have cross-sections about 1 billion times larger than tau production. Multivariate techniques are therefore often used to distinguish taus from this background. Boosted Decision Trees (BDTs) are a machine-learning technique for developing cut-based discriminants which can significantly aid in extracting small signal samples from overwhelming backgrounds. In this study, BDTs are used for tau identification for the ATLAS experiment. They are a fast, flexible alternative to existing discriminants with comparable or better performance.

  2. CLASSIFICATION OF ENTREPRENEURIAL INTENTIONS BY NEURAL NETWORKS, DECISION TREES AND SUPPORT VECTOR MACHINES

    Directory of Open Access Journals (Sweden)

    Marijana Zekić-Sušac

    2010-12-01

    Full Text Available It is important to recognize students' entrepreneurial intentions during their studies in order to provide those students with an educational background that will support such intentions and lead them to successful entrepreneurship after their studies. The paper aims to develop a model that will classify students according to their entrepreneurial intentions by benchmarking three machine learning classifiers: neural networks, decision trees, and support vector machines. A survey was conducted at a Croatian university on a sample of first-year students. Input variables described students’ demographics, importance of business objectives, perception of an entrepreneurial career, and entrepreneurial predispositions. Due to the large dimension of the input space, a feature selection method was used in the pre-processing stage. For comparison, all tested models were validated on the same out-of-sample dataset, and a cross-validation procedure for testing the generalization ability of the models was conducted. The models were compared according to their classification accuracy, as well as according to input variable importance. The results show that although the best neural network model produced the highest average hit rate, the difference in performance is not statistically significant. All three models also extract a similar set of features relevant for classifying students, which universities could take into consideration when designing their academic programs.

  3. A reduction approach to improve the quantification of linked fault trees through binary decision diagrams

    Energy Technology Data Exchange (ETDEWEB)

    Ibanez-Llano, Cristina, E-mail: cristina.ibanez@iit.upcomillas.e [Instituto de Investigacion Tecnologica (IIT), Escuela Tecnica Superior de Ingenieria ICAI, Universidad Pontificia Comillas, C/Santa Cruz de Marcenado 26, 28015 Madrid (Spain); Rauzy, Antoine, E-mail: Antoine.RAUZY@3ds.co [Dassault Systemes, 10 rue Marcel Dassault CS 40501, 78946 Velizy Villacoublay, Cedex (France); Melendez, Enrique, E-mail: ema@csn.e [Consejo de Seguridad Nuclear (CSN), C/Justo Dorado 11, 28040 Madrid (Spain); Nieto, Francisco, E-mail: nieto@iit.upcomillas.e [Instituto de Investigacion Tecnologica (IIT), Escuela Tecnica Superior de Ingenieria ICAI, Universidad Pontificia Comillas, C/Santa Cruz de Marcenado 26, 28015 Madrid (Spain)

    2010-12-15

    Over the last two decades binary decision diagrams have been applied successfully to improve Boolean reliability models. In contrast to the classical approach based on the computation of the MCS, the BDD approach involves no approximation in the quantification of the model and is able to correctly handle negative logic. However, when models are sufficiently large and complex, as for example the ones coming from the PSA studies of the nuclear industry, it becomes infeasible to compute the BDD within a reasonable amount of time and computer memory. Therefore, simplification or reduction of the full model has to be considered in some way to adapt the application of the BDD technology to the assessment of such models in practice. This paper proposes a reduction process based on using information provided by the set of the most relevant minimal cutsets of the model in order to perform the reduction directly on it. This allows controlling the degree of reduction and therefore the impact of such simplification on the final quantification results. This reduction is integrated in an incremental procedure that is compatible with the dynamic generation of the event trees and therefore adaptable to the recent dynamic developments and extensions of the PSA studies. The proposed method has been applied to a real case study, and the results obtained confirm that the reduction enables the BDD computation while maintaining accuracy.

  4. Beef Quality Identification Using Thresholding Method and Decision Tree Classification Based on Android Smartphone

    Directory of Open Access Journals (Sweden)

    Kusworo Adi

    2017-01-01

    Full Text Available Beef is one of the animal food products that have high nutrition because it contains carbohydrates, proteins, fats, vitamins, and minerals. Therefore, the quality of beef should be maintained so that consumers get good beef quality. Determination of beef quality is commonly conducted visually by comparing the actual beef and reference pictures of each beef class. This process presents weaknesses, as it is subjective in nature and takes a considerable amount of time. Therefore, an automated system based on image processing that is capable of determining beef quality is required. This research aims to develop an image segmentation method by processing digital images. The system designed consists of image acquisition processes with varied distance, resolution, and angle. Image segmentation is done to separate the images of fat and meat using the Otsu thresholding method. Classification was carried out using the decision tree algorithm and the best accuracies were obtained at 90% for training and 84% for testing. Once developed, this system is then embedded into the android programming. Results show that the image processing technique is capable of proper marbling score identification.
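
    The Otsu thresholding step mentioned above picks the gray level that maximizes the between-class variance of the two resulting pixel groups. The sketch below is a minimal pure-Python version operating on a flat list of 8-bit gray values; the sample pixel values are hypothetical, and reading the brighter class as fat is an assumption made only for this illustration.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the darker class
    sum0 = 0    # sum of gray values in the darker class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # darker class still empty
        w1 = total - w0
        if w1 == 0:
            break             # brighter class is empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


# Hypothetical gray values: three dark (meat) and three bright (fat) pixels.
pixels = [10, 12, 11, 200, 210, 205]
t = otsu_threshold(pixels)
fat_mask = [p > t for p in pixels]
print(t, sum(fat_mask) / len(pixels))
```

    The fraction of pixels above the threshold is one simple feature that a downstream classifier such as a decision tree could use.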

  5. Object classification in images for Epo doping control based on fuzzy decision trees

    Science.gov (United States)

    Bajla, Ivan; Hollander, Igor; Heiss, Dorothea; Granec, Reinhard; Minichmayr, Markus

    2005-02-01

    Erythropoietin (Epo) is a hormone which can be misused as a doping substance. Its detection involves analysis of images containing specific objects (bands), whose position and intensity are critical for doping positivity. Within a research project of the World Anti-Doping Agency (WADA) we are implementing the GASepo software that should serve for Epo testing in doping control laboratories world-wide. For identification of the bands we have developed a segmentation procedure based on a sequence of filters and edge detectors. Whereas all true bands are properly segmented, the procedure generates a relatively high number of false positives (artefacts). To separate these artefacts we suggested a post-segmentation supervised classification using real-valued geometrical measures of objects. The method is based on the ID3 (Ross Quinlan's) rule generation method, where fuzzy representation is used for linking the linguistic terms to quantitative data. The fuzzy modification of the ID3 method provides a framework that generates fuzzy decision trees, as well as fuzzy sets for input data. Using the MLTTM software (Machine Learning Framework) we have generated a set of fuzzy rules explicitly describing bands and artefacts. The method eliminated most of the artefacts. The contribution includes a comparison of the obtained misclassification errors to the errors produced by some other statistical classification methods.

  6. Abdominoplasty in Prune Belly Syndrome.

    Science.gov (United States)

    Dénes, F T; Park, R; Lopes, R I; Moscardi, P R M; Srougi, M

    2015-10-01

    Many patients with Prune Belly Syndrome (PBS) require abdominoplasty alone or in combination with correction of any urogenital abnormalities. This video presents a simplified technique with which to treat the abdominal flaccidity in PBS. A longitudinal xypho-pubic fusiform figure is drawn on the abdomen, based on the area of skin and subcutaneous tissue to be removed. This is performed with preservation of the musculo-fascial layer and the umbilicus. A lateral elliptical single xypho-pubic line is drawn in the most lax side of the fascia, which is incised along this line. After urinary tract reconstruction and orchidopexy, closure is initiated by suturing the medial edge of the wider fascial flap laterally to the peritoneal side of the contralateral flap. Next, the now outer fascial flap is laid over the inner flap, and a buttonhole is made to expose the umbilicus. The subcutaneous tissue of the inner flap is laterally undermined to gain extra distance for the suture of the outer flap over the inner flap. The subcutaneous tissue and skin are sutured in the midline, incorporating the umbilicus. In a 30-year period, 43 PBS patients underwent this procedure with good cosmetic and long-term functional results. This abdominoplasty technique is simple and presents good functional and cosmetic results in PBS patients. Copyright © 2015 Journal of Pediatric Urology Company. Published by Elsevier Ltd. All rights reserved.

  7. Development and acceptability testing of decision trees for self-management of prosthetic socket fit in adults with lower limb amputation.

    Science.gov (United States)

    Lee, Daniel Joseph; Veneri, Diana A

    2018-05-01

    The most common complaint lower limb prosthesis users report is inadequacy of a proper socket fit. Adjustments to the residual limb-socket interface can be made by the prosthesis user without consultation of a clinician in many scenarios through skilled self-management. Decision trees guide prosthesis wearers through the self-management process, empowering them to rectify fit issues, or referring them to a clinician when necessary. This study examines the development and acceptability testing of patient-centered decision trees for lower limb prosthesis users. Decision trees underwent a four-stage process: literature review and expert consultation, designing, two-rounds of expert panel review and revisions, and target audience testing. Fifteen lower limb prosthesis users (average age 61 years) reviewed the decision trees and completed an acceptability questionnaire. Participants reported agreement of 80% or above in five of the eight questions related to acceptability of the decision trees. Disagreement was related to the level of experience of the respondent. Decision trees were found to be easy to use, illustrate correct solutions to common issues, and have terminology consistent with that of a new prosthesis user. Some users with greater than 1.5 years of experience would not use the decision trees based on their own self-management skills. Implications for Rehabilitation Discomfort of the residual limb-prosthetic socket interface is the most common reason for clinician visits. Prosthesis users can use decision trees to guide them through the process of obtaining a proper socket fit independently. Newer users may benefit from using the decision trees more than experienced users.

  8. Prognostic factors for survival in adult patients with recurrent glioblastoma: a decision-tree-based model.

    Science.gov (United States)

    Audureau, Etienne; Chivet, Anaïs; Ursu, Renata; Corns, Robert; Metellus, Philippe; Noel, Georges; Zouaoui, Sonia; Guyotat, Jacques; Le Reste, Pierre-Jean; Faillot, Thierry; Litre, Fabien; Desse, Nicolas; Petit, Antoine; Emery, Evelyne; Lechapt-Zalcman, Emmanuelle; Peltier, Johann; Duntze, Julien; Dezamis, Edouard; Voirin, Jimmy; Menei, Philippe; Caire, François; Dam Hieu, Phong; Barat, Jean-Luc; Langlois, Olivier; Vignes, Jean-Rodolphe; Fabbro-Peray, Pascale; Riondel, Adeline; Sorbets, Elodie; Zanello, Marc; Roux, Alexandre; Carpentier, Antoine; Bauchet, Luc; Pallud, Johan

    2017-11-20

    We assessed prognostic factors in relation to OS from progression in recurrent glioblastomas. Retrospective multicentric study enrolling 407 (training set) and 370 (external validation set) adult patients with a recurrent supratentorial glioblastoma treated by surgical resection and standard combined chemoradiotherapy as first-line treatment. Four complementary multivariate prognostic models were evaluated: Cox proportional hazards regression modeling, single-tree recursive partitioning, random survival forest, conditional random forest. Median overall survival from progression was 7.6 months (mean, 10.1; range, 0-86) and 8.0 months (mean, 8.5; range, 0-56) in the training and validation sets, respectively (p = 0.900). Using the Cox model in the training set, independent predictors of poorer overall survival from progression included increasing age at histopathological diagnosis (aHR, 1.47; 95% CI [1.03-2.08]; p = 0.032), RTOG-RPA V-VI classes (aHR, 1.38; 95% CI [1.11-1.73]; p = 0.004), decreasing KPS at progression (aHR, 3.46; 95% CI [2.10-5.72]; p < 0.001), while independent predictors of longer overall survival from progression included surgical resection (aHR, 0.57; 95% CI [0.44-0.73]; p < 0.001) and chemotherapy (aHR, 0.41; 95% CI [0.31-0.55]; p < 0.001). Single-tree recursive partitioning identified KPS at progression, surgical resection at progression, chemotherapy at progression, and RTOG-RPA class at histopathological diagnosis, as main survival predictors in the training set, yielding four risk categories highly predictive of overall survival from progression both in training (p < 0.0001) and validation (p < 0.0001) sets. Both random forest approaches identified KPS at progression as the most important survival predictor. Age, KPS at progression, RTOG-RPA classes, surgical resection at progression and chemotherapy at progression are prognostic for survival in recurrent glioblastomas and should inform the treatment decisions.

  9. Oblique decision trees for spatial pattern detection: optimal algorithm and application to malaria risk

    Directory of Open Access Journals (Sweden)

    Ranque Stéphane

    2005-07-01

    Full Text Available Background: In order to detect potential disease clusters where a putative source cannot be specified, classical procedures scan the geographical area with circular windows through a specified grid imposed on the map. However, the choice of the windows' shapes, sizes and centers is critical and different choices may not provide exactly the same results. The aim of our work was to use an Oblique Decision Tree (ODT) model, which provides potential clusters without pre-specifying shapes, sizes or centers. For this purpose, we have developed an ODT-algorithm to find an oblique partition of the space defined by the geographic coordinates. Methods: ODT is based on the classification and regression tree (CART). Whereas CART finds rectangular partitions of the covariate space, ODT provides oblique partitions maximizing the interclass variance of the independent variable. Since this is an NP-hard problem in R^N, classical ODT-algorithms use evolutionary procedures or heuristics. We have developed an optimal ODT-algorithm in R^2, based on the directions defined by each couple of point locations. This partition provided potential clusters which can be tested with Monte-Carlo inference. We applied the ODT-model to a dataset in order to identify potential high risk clusters of malaria in a village in Western Africa during the dry season. The ODT results were compared with those of Kulldorff's SaTScan™. Results: The ODT procedure provided four classes of risk of infection. In the first high risk class, 60% (95% confidence interval, CI95% [52.22–67.55]) of the children were infected. Monte-Carlo inference showed that the spatial pattern issued from the ODT-model was significant (p ...). SaTScan results yielded one significant cluster where the risk of disease was high, with an infection rate of 54.21%, CI95% [47.51–60.75]. Obviously, its center was located within the first high risk ODT class. Both procedures provided similar results identifying a high risk

  10. A Hybrid Approach of Stepwise Regression, Logistic Regression, Support Vector Machine, and Decision Tree for Forecasting Fraudulent Financial Statements

    Science.gov (United States)

    Goo, Yeong-Jia James; Shen, Zone-De

    2014-01-01

    As fraudulent financial statements of enterprises become increasingly serious, establishing a valid model for forecasting fraudulent financial statements has become an important question for academic research and financial practice. After screening the important variables using stepwise regression, the study applies logistic regression, support vector machine, and decision tree to construct classification models and compare them. The study adopts financial and nonfinancial variables to assist in establishing the forecasting model. The research objects are companies with fraudulent and nonfraudulent financial statements between 1998 and 2012. The findings are that financial and nonfinancial information can be used effectively to distinguish fraudulent financial statements, and that the decision tree C5.0 has the best classification performance, 85.71%. PMID:25302338

  11. Evaluation of the potential allergenicity of the enzyme microbial transglutaminase using the 2001 FAO/WHO Decision Tree

    DEFF Research Database (Denmark)

    Pedersen, Mona H; Hansen, Tine K; Sten, Eva

    2004-01-01

    All novel proteins must be assessed for their potential allergenicity before they are introduced into the food market. One method to achieve this is the 2001 FAO/WHO Decision Tree recommended for evaluation of proteins from genetically modified organisms (GMOs). It was the aim of this study...... to investigate the allergenicity of microbial transglutaminase (m-TG) from Streptoverticillium mobaraense. Amino acid sequence similarity to known allergens, pepsin resistance, and detection of protein binding to specific serum immunoglobulin E (IgE) (RAST) have been evaluated as recommended by the decision tree....... Allergenicity in the source material was thought unlikely, since no IgE-mediated allergy to any bacteria has been reported. m-TG is fully degraded after 5 min of pepsin treatment. A database search showed that the enzyme has no homology with known allergens, down to a match of six contiguous amino acids, which...

  12. Identification of Potential Sources of Mercury (Hg) in Farmland Soil Using a Decision Tree Method in China.

    Science.gov (United States)

    Zhong, Taiyang; Chen, Dongmei; Zhang, Xiuying

    2016-11-09

    Identification of the sources of soil mercury (Hg) on the provincial scale is helpful for enacting effective policies to prevent further contamination and take reclamation measures. The natural and anthropogenic sources of Hg in Chinese farmland soil, and their contributions, were identified based on a decision tree method. The results showed that the concentrations of Hg in parent materials were most strongly associated with the general spatial distribution pattern of Hg concentration on a provincial scale. The decision tree analysis achieved an 89.70% total accuracy in simulating the influence of human activities on the additions of Hg in farmland soil. Human activities (for example, the production of coke, application of fertilizers, discharge of wastewater, discharge of solid waste, and the production of non-ferrous metals) were the main external sources of a large amount of Hg in the farmland soil.

  13. A Hybrid Approach of Stepwise Regression, Logistic Regression, Support Vector Machine, and Decision Tree for Forecasting Fraudulent Financial Statements

    Directory of Open Access Journals (Sweden)

    Suduan Chen

    2014-01-01

    Full Text Available As fraudulent financial statements of enterprises become increasingly serious, establishing a valid model for forecasting fraudulent financial statements has become an important question for academic research and financial practice. After screening the important variables using stepwise regression, the study applies logistic regression, support vector machine, and decision tree to construct classification models and compare them. The study adopts financial and nonfinancial variables to assist in establishing the forecasting model. The research objects are companies with fraudulent and nonfraudulent financial statements between 1998 and 2012. The findings are that financial and nonfinancial information can be used effectively to distinguish fraudulent financial statements, and that the decision tree C5.0 has the best classification performance, 85.71%.

  14. Identification of Potential Sources of Mercury (Hg) in Farmland Soil Using a Decision Tree Method in China

    Directory of Open Access Journals (Sweden)

    Taiyang Zhong

    2016-11-01

    Full Text Available Identification of the sources of soil mercury (Hg) on the provincial scale is helpful for enacting effective policies to prevent further contamination and take reclamation measures. The natural and anthropogenic sources of Hg in Chinese farmland soil, and their contributions, were identified based on a decision tree method. The results showed that the concentrations of Hg in parent materials were most strongly associated with the general spatial distribution pattern of Hg concentration on a provincial scale. The decision tree analysis achieved an 89.70% total accuracy in simulating the influence of human activities on the additions of Hg in farmland soil. Human activities (for example, the production of coke, application of fertilizers, discharge of wastewater, discharge of solid waste, and the production of non-ferrous metals) were the main external sources of a large amount of Hg in the farmland soil.

  15. Interacting with mobile devices by fusion eye and hand gestures recognition systems based on decision tree approach

    Science.gov (United States)

    Elleuch, Hanene; Wali, Ali; Samet, Anis; Alimi, Adel M.

    2017-03-01

    Two systems for eye and hand gesture recognition are used to control mobile devices. Based on real-time video streaming captured from the device's camera, the first system recognizes the motion of the user's eyes and the second one detects static hand gestures. To avoid any confusion between natural and intentional movements, we developed a system to fuse the decisions coming from the eye and hand gesture recognition systems. The fusion phase was based on a decision tree approach. We conducted a study on 5 volunteers, and the results show that our system is robust and competitive.

  16. A Systematic Approach for Dynamic Security Assessment and the Corresponding Preventive Control Scheme Based on Decision Trees

    DEFF Research Database (Denmark)

    Liu, Leo; Sun, Kai; Rather, Zakir Hussain

    2014-01-01

    This paper proposes a decision tree (DT)-based systematic approach for cooperative online power system dynamic security assessment (DSA) and preventive control. This approach adopts a new methodology that trains two contingency-oriented DTs on a daily basis by the databases generated from power...... system simulations. Fed with real-time wide-area measurements, one DT of measurable variables is employed for online DSA to identify potential security issues, and the other DT of controllable variables provides online decision support on preventive control strategies against those issues. A cost...

  17. Analysis of Human Papillomavirus Using Datamining - Apriori, Decision Tree, and Support Vector Machine (SVM) and its Application Field

    OpenAIRE

    Cho Younghoon; Burm Seungwon; Choi Nayoung; Yoon Taeseon

    2016-01-01

    Human papillomavirus (HPV) has various types (compared to other viruses) and plays a key role in evoking diverse diseases, especially cervical cancer. In this study, we aim to distinguish the features of HPV types of different degrees of fatality by analyzing their DNA sequences. We used the Decision Tree Algorithm, the Apriori Algorithm, and Support Vector Machine in our experiment. By analyzing their DNA sequences, we discovered some relationships between certain types of HPV, especially on the most fatal ...

  18. Chi-squared Automatic Interaction Detection Decision Tree Analysis of Risk Factors for Infant Anemia in Beijing, China.

    Science.gov (United States)

    Ye, Fang; Chen, Zhi-Hua; Chen, Jie; Liu, Fang; Zhang, Yong; Fan, Qin-Ying; Wang, Lin

    2016-05-20

    In the past decades, studies on infant anemia have mainly focused on rural areas of China. With the increasing heterogeneity of population in recent years, available information on infant anemia is inconclusive in large cities of China, especially with comparison between native residents and floating population. This population-based cross-sectional study was implemented to determine the anemic status of infants as well as the risk factors in a representative downtown area of Beijing. As useful methods to build a predictive model, Chi-squared automatic interaction detection (CHAID) decision tree analysis and logistic regression analysis were introduced to explore risk factors of infant anemia. A total of 1091 infants aged 6-12 months together with their parents/caregivers living at Heping Avenue Subdistrict of Beijing were surveyed from January 1, 2013 to December 31, 2014. The prevalence of anemia was 12.60% with a range of 3.47%-40.00% in different subgroup characteristics. The CHAID decision tree model has demonstrated multilevel interaction among risk factors through stepwise pathways to detect anemia. Besides the three predictors identified by logistic regression model including maternal anemia during pregnancy, exclusive breastfeeding in the first 6 months, and floating population, CHAID decision tree analysis also identified the fourth risk factor, the maternal educational level, with higher overall classification accuracy and larger area below the receiver operating characteristic curve. The infant anemic status in metropolis is complex and should be carefully considered by the basic health care practitioners. CHAID decision tree analysis has demonstrated a better performance in hierarchical analysis of population with great heterogeneity. Risk factors identified by this study might be meaningful in the early detection and prompt treatment of infant anemia in large cities.

  19. ATLAAS: an automatic decision tree-based learning algorithm for advanced image segmentation in positron emission tomography.

    Science.gov (United States)

    Berthon, Beatrice; Marshall, Christopher; Evans, Mererid; Spezi, Emiliano

    2016-07-07

    Accurate and reliable tumour delineation on positron emission tomography (PET) is crucial for radiotherapy treatment planning. PET automatic segmentation (PET-AS) eliminates intra- and interobserver variability, but there is currently no consensus on the optimal method to use, as different algorithms appear to perform better for different types of tumours. This work aimed to develop a predictive segmentation model, trained to automatically select and apply the best PET-AS method, according to the tumour characteristics. ATLAAS, the automatic decision tree-based learning algorithm for advanced segmentation, is based on supervised machine learning using decision trees. The model includes nine PET-AS methods and was trained on 100 PET scans with known true contour. A decision tree was built for each PET-AS algorithm to predict its accuracy, quantified using the Dice similarity coefficient (DSC), according to the tumour volume, tumour peak to background SUV ratio and a regional texture metric. The performance of ATLAAS was evaluated for 85 PET scans obtained from fillable and printed subresolution sandwich phantoms. ATLAAS showed excellent accuracy across a wide range of phantom data and predicted the best or near-best segmentation algorithm in 93% of cases. ATLAAS outperformed all single PET-AS methods on fillable phantom data with a DSC of 0.881, while the DSC for H&N phantom data was 0.819. DSCs higher than 0.650 were achieved in all cases. ATLAAS is an advanced automatic image segmentation algorithm based on decision tree predictive modelling, which can be trained on images with known true contour, to predict the best PET-AS method when the true contour is unknown. ATLAAS provides robust and accurate image segmentation with potential applications to radiation oncology.
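
    The abstract above scores segmentation accuracy with the Dice similarity coefficient (DSC). As a minimal sketch of how that metric is usually defined (twice the overlap divided by the sum of the two region sizes), the following assumes binary masks flattened to 0/1 sequences; the function name and the toy masks are hypothetical and are not taken from the ATLAAS work.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient of two binary masks (flat 0/1 sequences)."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same length")
    size_a = sum(1 for v in mask_a if v)                         # voxels in A
    size_b = sum(1 for v in mask_b if v)                         # voxels in B
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)  # voxels in both
    if size_a + size_b == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * overlap / (size_a + size_b)


# Hypothetical toy masks: a "true contour" and a candidate segmentation.
reference = [0, 1, 1, 1, 0, 0]
candidate = [0, 0, 1, 1, 1, 0]
print(round(dice_coefficient(reference, candidate), 3))
```

    A DSC of 1 means the two regions coincide exactly and 0 means they do not overlap at all.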

  20. The risk of disabling, surgery and reoperation in Crohn's disease - A decision tree-based approach to prognosis.

    Science.gov (United States)

    Dias, Cláudia Camila; Pereira Rodrigues, Pedro; Fernandes, Samuel; Portela, Francisco; Ministro, Paula; Martins, Diana; Sousa, Paula; Lago, Paula; Rosa, Isadora; Correia, Luis; Moura Santos, Paula; Magro, Fernando

    2017-01-01

    Crohn's disease (CD) is a chronic inflammatory bowel disease known to carry a high risk of disabling and often requiring surgical interventions. This article describes a decision-tree based approach that defines the CD patients' risk of undergoing disabling events, surgical interventions and reoperations, based on clinical and demographic variables. This multicentric study involved 1547 CD patients retrospectively enrolled and divided into two cohorts: a derivation one (80%) and a validation one (20%). Decision trees were built upon applying the CHAIRT algorithm for the selection of variables. Three-level decision trees were built for the risk of disabling and reoperation, whereas the risk of surgery was described in a two-level one. A receiver operating characteristic (ROC) analysis was performed, and the area under the curve (AUC) was higher than 70% for all outcomes. The defined risk cut-off values show usefulness for the assessed outcomes: risk levels above 75% for disabling had an odds test positivity of 4.06 [3.50-4.71], whereas risk levels below 34% and 19% excluded surgery and reoperation with an odds test negativity of 0.15 [0.09-0.25] and 0.50 [0.24-1.01], respectively. Overall, patients with B2 or B3 phenotype had a higher proportion of disabling disease and surgery, while patients with later introduction of pharmacological therapeutic (1 months after initial surgery) had a higher proportion of reoperation. The decision-tree based approach used in this study, with demographic and clinical variables, has been shown to be a valid and useful approach to depict such risks of disabling, surgery and reoperation.

  1. Integrating olive grove maintenance and energy biomass recovery with a single-pass pruning and harvesting machine

    Energy Technology Data Exchange (ETDEWEB)

    Spinelli, Raffaele; Nati, Carla; Picchi, Gianni [CNR-IVALSA, Via Madonna del Piano 10, I 50019 Sesto Fiorentino, FI (Italy); Magagnotti, Natascia [DEIAGRA, University of Bologna, Via Fanin 50, Bologna (Italy); Cantini, Claudio; Sani, Graziano [CNR-IVALSA, Azienda S. Paolina, Follonica, GR (Italy); Biocca, Marcello [CRA-ISMA, Via della Pascolare 16, Monterotondo, Roma (Italy)

    2011-02-15

    In Italy, olive tree groves may offer up to a million tonnes of dry biomass per year as pruning residue. Searching for a cost-effective way to tap this potential, the authors tested a new machine, capable of recovering pruning residue at the same time as pruning. The pre-commercial prototype was tested on four different plots and compared to a simpler tractor-based mechanical pruning unit. The authors conducted detailed time-studies in order to determine machine productivity and residue recovery cost. The integrated machine can treat between 0.2 and 0.6 ha per hour, producing between 0.33 and 1.03 tonnes of fresh residue per hour. Its integrated residue recovery function does not slow the pruning, which actually proceeds faster than with the tractor-based unit, due to the more efficient multiple-disc cutting bar. The marginal cost of residue recovery hovers around 40-45 EUR per fresh tonne. However, the new machine must not be considered just as a biomass harvester, but rather as a mechanical pruning unit with an integrated biomass recovery function. Its main benefit derives from the capacity of performing a very effective mechanical pruning, and the residue recovery function is a secondary benefit as yet unavailable on standard pruning machines. Its deployment must be seen in the context of a general effort to modernize olive grove management and to develop an integrated biomass production system, rather than as a further attempt to build a specialised biomass supply chain. (author)

  2. Applying decision tree for identification of a low risk population for type 2 diabetes. Tehran Lipid and Glucose Study.

    Science.gov (United States)

    Ramezankhani, Azra; Pournik, Omid; Shahrabi, Jamal; Khalili, Davood; Azizi, Fereidoun; Hadaegh, Farzad

    2014-09-01

    The aim of this study was to create a prediction model using a data mining approach to identify individuals at low risk of incident type 2 diabetes, using the Tehran Lipid and Glucose Study (TLGS) database. For a population of 6647 individuals without diabetes, aged ≥20 years and followed for 12 years, a prediction model was developed using classification by the decision tree technique. Seven hundred and twenty-nine (11%) diabetes cases occurred during the follow-up. Predictor variables were selected from demographic characteristics, smoking status, medical and drug history and laboratory measures. We developed the predictive models by decision tree using 60 input variables and one output variable. The overall classification accuracy was 90.5%, with 31.1% sensitivity and 97.9% specificity; for the subjects without diabetes, precision and f-measure were 92% and 0.95, respectively. The identified variables included fasting plasma glucose, body mass index, triglycerides, mean arterial blood pressure, family history of diabetes, educational level and job status. In conclusion, decision tree analysis, using routine demographic, clinical, anthropometric and laboratory measurements, created a simple tool to predict individuals at low risk for type 2 diabetes. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
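
    Several records in this listing report accuracy, sensitivity, specificity, precision and f-measure for a decision tree classifier. The sketch below shows how these figures are conventionally derived from the four cells of a binary confusion matrix. The function name and the counts are hypothetical illustrations, not data from any of the studies listed, and the sketch assumes every denominator is non-zero.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts.

    tp/fn: positives predicted correctly/incorrectly;
    tn/fp: negatives predicted correctly/incorrectly.
    Assumes all denominators are non-zero.
    """
    sensitivity = tp / (tp + fn)            # recall on the positive class
    specificity = tn / (tn + fp)            # recall on the negative class
    precision = tp / (tp + fp)              # positive predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f_measure": f_measure,
    }


# Hypothetical counts for an imbalanced problem (100 positives, 900 negatives).
metrics = classification_metrics(tp=30, fp=20, tn=880, fn=70)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

    With imbalanced classes, as in this toy example, accuracy can stay high while sensitivity is low, which is why abstracts of this kind usually report the metrics together.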

  3. A Rough Set Decision Tree Based MLP-CNN for Very High Resolution Remotely Sensed Image Classification

    Science.gov (United States)

    Zhang, C.; Pan, X.; Zhang, S. Q.; Li, H. P.; Atkinson, P. M.

    2017-09-01

    Recent advances in remote sensing have witnessed a great amount of very high resolution (VHR) images acquired at sub-metre spatial resolution. These VHR remotely sensed data have posed enormous challenges in processing, analysing and classifying them effectively due to the high spatial complexity and heterogeneity. Although many computer-aided classification methods based on machine learning approaches have been developed over the past decades, most of them are developed toward pixel level spectral differentiation, e.g. Multi-Layer Perceptron (MLP), which are unable to exploit abundant spatial details within VHR images. This paper introduced a rough set model as a general framework to objectively characterize the uncertainty in CNN classification results, and further partition them into correctness and incorrectness on the map. The correct classification regions of CNN were trusted and maintained, whereas the misclassification areas were reclassified using a decision tree with both CNN and MLP. The effectiveness of the proposed rough set decision tree based MLP-CNN was tested using an urban area at Bournemouth, United Kingdom. The MLP-CNN, well capturing the complementarity between CNN and MLP through the rough set based decision tree, achieved the best classification performance both visually and numerically. Therefore, this research paves the way to achieve fully automatic and effective VHR image classification.

  4. hs-CRP is strongly associated with coronary heart disease (CHD): A data mining approach using decision tree algorithm.

    Science.gov (United States)

    Tayefi, Maryam; Tajfard, Mohammad; Saffar, Sara; Hanachi, Parichehr; Amirabadizadeh, Ali Reza; Esmaeily, Habibollah; Taghipour, Ali; Ferns, Gordon A; Moohebati, Mohsen; Ghayour-Mobarhan, Majid

    2017-04-01

    Coronary heart disease (CHD) is an important public health problem globally. Algorithms incorporating the assessment of clinical biomarkers together with several established traditional risk factors can help clinicians to predict CHD and support clinical decision making with respect to interventions. Decision tree (DT) is a data mining model for extracting hidden knowledge from large databases. We aimed to establish a predictive model for coronary heart disease using a decision tree algorithm. Here we used a dataset of 2346 individuals including 1159 healthy participants and 1187 participants who had undergone coronary angiography (405 participants with negative angiography and 782 participants with positive angiography). We entered 10 variables of a total of 12 variables into the DT algorithm (including age, sex, FBG, TG, hs-CRP, TC, HDL, LDL, SBP and DBP). Our model could identify the associated risk factors of CHD with sensitivity, specificity and accuracy of 96%, 87% and 94%, respectively. Serum hs-CRP level was at the top of the tree in our model, followed by FBG, gender and age. Our model appears to be an accurate, specific and sensitive model for identifying the presence of CHD, but will require validation in prospective studies. Copyright © 2017 Elsevier B.V. All rights reserved.

  5. Determinants of farmers' tree-planting investment decisions as a degraded landscape management strategy in the central highlands of Ethiopia

    Science.gov (United States)

    Gessesse, Berhan; Bewket, Woldeamlak; Bräuning, Achim

    2016-04-01

    Land degradation due to lack of sustainable land management practices is one of the critical challenges in many developing countries including Ethiopia. This study explored the major determinants of farm-level tree-planting decisions as a land management strategy in a typical farming and degraded landscape of the Modjo watershed, Ethiopia. The main data were generated from household surveys and analysed using descriptive statistics and a binary logistic regression model. The model significantly predicted farmers' tree-planting decisions (χ2 = 37.29, df = 15, P ... function of a wide range of biophysical, institutional, socioeconomic and household-level factors. In this regard, household size, productive labour force availability, the disparity of schooling age, level of perception of the process of deforestation and the current land tenure system had a critical influence on tree-growing investment decisions in the study watershed. Eventually, the processes of land-use conversion and land degradation were serious, which in turn have had adverse effects on agricultural productivity, local food security and the poverty trap nexus. Hence, the study recommended that devising and implementing sustainable land management policy options would enhance ecological restoration and livelihood sustainability in the study watershed.

  6. A ROUGH SET DECISION TREE BASED MLP-CNN FOR VERY HIGH RESOLUTION REMOTELY SENSED IMAGE CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    C. Zhang

    2017-09-01

    Full Text Available Recent advances in remote sensing have witnessed a great amount of very high resolution (VHR) images acquired at sub-metre spatial resolution. These VHR remotely sensed data have posed enormous challenges in processing, analysing and classifying them effectively due to the high spatial complexity and heterogeneity. Although many computer-aided classification methods based on machine learning approaches have been developed over the past decades, most of them are developed toward pixel level spectral differentiation, e.g. Multi-Layer Perceptron (MLP), which are unable to exploit abundant spatial details within VHR images. This paper introduced a rough set model as a general framework to objectively characterize the uncertainty in CNN classification results, and further partition them into correctness and incorrectness on the map. The correct classification regions of CNN were trusted and maintained, whereas the misclassification areas were reclassified using a decision tree with both CNN and MLP. The effectiveness of the proposed rough set decision tree based MLP-CNN was tested using an urban area at Bournemouth, United Kingdom. The MLP-CNN, well capturing the complementarity between CNN and MLP through the rough set based decision tree, achieved the best classification performance both visually and numerically. Therefore, this research paves the way to achieve fully automatic and effective VHR image classification.

  7. Accurate and interpretable nanoSAR models from genetic programming-based decision tree construction approaches.

    Science.gov (United States)

    Oksel, Ceyda; Winkler, David A; Ma, Cai Y; Wilkins, Terry; Wang, Xue Z

    2016-09-01

    The number of engineered nanomaterials (ENMs) being exploited commercially is growing rapidly, due to the novel properties they exhibit. Clearly, it is important to understand and minimize any risks to health or the environment posed by the presence of ENMs. Data-driven models that decode the relationships between the biological activities of ENMs and their physicochemical characteristics provide an attractive means of maximizing the value of scarce and expensive experimental data. Although such structure-activity relationship (SAR) methods have become very useful tools for modelling nanotoxicity endpoints (nanoSAR), they have limited robustness and predictivity and, most importantly, interpretation of the models they generate is often very difficult. New computational modelling tools or new ways of using existing tools are required to model the relatively sparse and sometimes lower quality data on the biological effects of ENMs. The most commonly used SAR modelling methods work best with large datasets, are not particularly good at feature selection, can be relatively opaque to interpretation, and may not account for nonlinearity in the structure-property relationships. To overcome these limitations, we describe the application of a novel algorithm, a genetic programming-based decision tree construction tool (GPTree) to nanoSAR modelling. We demonstrate the use of GPTree in the construction of accurate and interpretable nanoSAR models by applying it to four diverse literature datasets. We describe the algorithm and compare model results across the four studies. We show that GPTree generates models with accuracies equivalent to or superior to those of prior modelling studies on the same datasets. GPTree is a robust, automatic method for generation of accurate nanoSAR models with important advantages that it works with small datasets, automatically selects descriptors, and provides significantly improved interpretability of models.

  8. Decision tree analysis of genetic risk for clinically heterogeneous Alzheimer's disease.

    Science.gov (United States)

    Yokoyama, Jennifer S; Bonham, Luke W; Sears, Renee L; Klein, Eric; Karydas, Anna; Kramer, Joel H; Miller, Bruce L; Coppola, Giovanni

    2015-03-28

    Heritability of Alzheimer's disease (AD) is estimated at 74% and genetic contributors have been widely sought. The ε4 allele of apolipoprotein E (APOE) remains the strongest common risk factor for AD, with numerous other common variants contributing only modest risk for disease. Variability in clinical presentation of AD, which is typically amnestic (AmnAD) but can less commonly involve visuospatial, language and/or dysexecutive syndromes (atypical or AtAD), further complicates genetic analyses. Taking a multi-locus approach may increase the ability to identify individuals at highest risk for any AD syndrome. In this study, we sought to develop and investigate the utility of a multi-variant genetic risk assessment on a cohort of phenotypically heterogeneous patients with sporadic AD clinical diagnoses. We genotyped 75 variants in our cohort and, using a two-staged study design, we developed a 17-marker AD risk score in a Discovery cohort (n = 59 cases, n = 133 controls) then assessed its utility in a second Validation cohort (n = 126 cases, n = 150 controls). We also performed a data-driven decision tree analysis to identify genetic and/or demographic criteria that are most useful for accurately differentiating all AD cases from controls. We confirmed APOE ε4 as a strong risk factor for AD. A 17-marker risk panel predicted AD significantly better than APOE genotype alone (P ... risk in non-ε4 carriers. Our study suggests that APOE ε4 remains the best predictor of broad AD risk when compared to multiple other genetic factors with modest effects, that phenotypic heterogeneity in broad AD can complicate simple polygenic risk modeling, and supports the association between HFE and AD risk in individuals without APOE ε4.

  9. Introducing a Model for Suspicious Behaviors Detection in Electronic Banking by Using Decision Tree Algorithms

    Directory of Open Access Journals (Sweden)

    Rohulla Kosari Langari

    2014-02-01

    Full Text Available The changes brought about by information technology and the development of the Internet have created competitive knowledge in the field of electronic commerce and increased the competitive potential among organizations. Under these conditions, the growing rate of commercial transactions, with guaranteed speed and quality, depends on a dynamic electronic banking system that uses modern technology to facilitate electronic business processes. Internet banking is counted as a potential opportunity and as one of the fundamental pillars and determinants of e-banking, but in cyberspace it faces various obstacles and threats. One of these challenges is the uncertainty in guaranteeing the security of financial transactions, together with the existence of suspicious and unusual behavior, such as mail fraud, aimed at financial abuse. Various systems based on machine intelligence methods and data mining techniques have been designed to detect fraud in users' behavior and have been applied in industries such as insurance, medicine and banking. The main aim of this article is to recognize unusual user behavior in e-banking systems by detecting user behavior and categorizing the emerging patterns, in order to prepare the conditions for predicting unauthorized penetration and detecting suspicious behavior. Because detecting user behavior in Internet systems involves uncertainty, because transaction records can be useful for understanding these movements, and because the decision tree technique is a common machine learning tool for classification and prediction, this research first determines the effective banking variables and the weight of each in producing Internet behaviors and then, by combining various behavior patterns, extracts a model of inductive rules that makes it possible to recognize different behaviors. Finally, four algorithms (Chaid, ex_Chaid, C4.5 and C5.0) are compared and evaluated for classification and detection of exist

  10. Risk Factors Predicting Infectious Lactational Mastitis: Decision Tree Approach versus Logistic Regression Analysis.

    Science.gov (United States)

    Fernández, Leónides; Mediano, Pilar; García, Ricardo; Rodríguez, Juan M; Marín, María

    2016-09-01

    Objectives: Lactational mastitis frequently leads to a premature abandonment of breastfeeding; its development has been associated with several risk factors. This study aims to use a decision tree (DT) approach to establish the main risk factors involved in mastitis and to compare its performance for predicting this condition with a stepwise logistic regression (LR) model. Methods: Data from 368 cases (breastfeeding women with mastitis) and 148 controls were collected by a questionnaire about risk factors related to medical history of mother and infant, pregnancy, delivery, postpartum, and breastfeeding practices. The performance of the DT and LR analyses was compared using the area under the receiver operating characteristic (ROC) curve. Sensitivity, specificity and accuracy of both models were calculated. Results: Cracked nipples, antibiotics and antifungal drugs during breastfeeding, infant age, breast pumps, familial history of mastitis and throat infection were significant risk factors associated with mastitis in both analyses. Bottle-feeding and milk supply were related to mastitis for certain subgroups in the DT model. The areas under the ROC curves were similar for LR and DT models (0.870 and 0.835, respectively). The LR model had better classification accuracy and sensitivity than the DT model, but the latter presented better specificity at the optimal threshold of each curve. Conclusions: The DT and LR models constitute useful and complementary analytical tools to assess the risk of lactational infectious mastitis. The DT approach identifies high-risk subpopulations that need specific mastitis prevention programs and, therefore, it could be used to make the most of public health resources.
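    The comparison described in this record (area under the ROC curve, plus sensitivity and specificity at a threshold) can be sketched in a few lines. This is only an illustrative sketch: the function names, the toy labels and the two score lists are invented here and are not data or code from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    labels: 1 for a case, 0 for a control; scores: predicted risk per subject.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # case ranked above control
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold):
    """Classify score >= threshold as positive and return (sensitivity, specificity)."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data, invented for illustration only (not from the study).
labels = [1, 1, 1, 0, 0, 1, 0, 0]
lr_scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.35, 0.2, 0.7]   # hypothetical LR outputs
dt_scores = [0.8, 0.8, 0.5, 0.5, 0.2, 0.2, 0.2, 0.8]    # hypothetical DT leaf rates

print("LR AUC:", roc_auc(labels, lr_scores))
print("DT AUC:", roc_auc(labels, dt_scores))
print("LR sens/spec at 0.5:", sensitivity_specificity(labels, lr_scores, 0.5))
```

    A decision tree emits one score per leaf, so its scores contain many ties; counting ties as one half, as above, keeps the AUC comparable between the two kinds of model.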

  11. Pruning-Based Sparse Recovery for Electrocardiogram Reconstruction from Compressed Measurements

    Directory of Open Access Journals (Sweden)

    Jaeseok Lee

    2017-01-01

    Full Text Available Due to the necessity of the low-power implementation of newly-developed electrocardiogram (ECG) sensors, exact ECG data reconstruction from the compressed measurements has received much attention in recent years. Our interest lies in improving the compression ratio (CR), as well as the ECG reconstruction performance of the sparse signal recovery. To this end, we propose a sparse signal reconstruction method by pruning-based tree search, which attempts to choose the globally-optimal solution by minimizing the cost function. In order to achieve low complexity for the real-time implementation, we employ a novel pruning strategy to avoid exhaustive tree search. Through the restricted isometry property (RIP)-based analysis, we show that the exact recovery condition of our approach is more relaxed than any of the existing methods. Through the simulations, we demonstrate that the proposed approach outperforms the existing sparse recovery methods for ECG reconstruction.

  12. First steps in translating human cognitive processes of cane pruning grapevines into AI rules for automated robotic pruning

    Directory of Open Access Journals (Sweden)

    Saxton Valerie

    2014-01-01

    Full Text Available Cane pruning of grapevines is a skilled task for which, internationally, there is a dire shortage of human pruners. As part of a larger project developing an automated robotic pruner, we have used artificial intelligence (AI) algorithms to create an expert system for selecting new canes and cutting off unwanted canes. A domain and ontology has been created for AI, which reflects the expertise of expert human pruners. The first step in the creation of an expert system was to generate virtual vines, which were then ‘pruned’ by human pruners and also by the expert system in its infancy. Here we examined the decisions of 12 human pruners, for consistency of decision, on 60 virtual vines. 96.7% of the 12 pruners agreed on at least one cane choice after which there was diminishing agreement on which further canes to select for laying. Our results indicate that techniques developed in computational intelligence can be used to co-ordinate and synthesise the expertise of human pruners into a best practice format. This paper describes first steps in this knowledge elicitation process, and discusses the fit between cane pruning expertise and the expertise that can be elicited using AI based expert system techniques.

  13. Enhanced Context Recognition by Sensitivity Pruned Vocabularies

    DEFF Research Database (Denmark)

    Madsen, Rasmus Elsborg; Sigurdsson, Sigurdur; Hansen, Lars Kai

    2004-01-01

    Language independent 'bag-of-words' representations are surprisingly effective for text classification. The generic BOW approach is based on a high-dimensional vocabulary which may reduce the generalization performance of subsequent classifiers, e.g., based on ill-posed principal component...... transformations. In this communication our aim is to study the effect of sensitivity based pruning of the bag-of-words representation. We consider neural network based sensitivity maps for determination of term relevancy, when pruning the vocabularies. With reduced vocabularies documents are classified using...... a latent semantic indexing representation and a probabilistic neural network classifier. Pruning the vocabularies to approximately 20% of the original size, we find consistent context recognition enhancement for two mid size data-sets for a range of training set sizes. We also study the applicability...

  14. Including public-health benefits of trees in urban-forestry decision making

    Science.gov (United States)

    Geoffrey H. Donovan

    2017-01-01

    Research demonstrating the biophysical benefits of urban trees is often used to justify investments in urban forestry. Far less emphasis, however, is placed on the non-biophysical benefits such as improvements in public health. Indeed, the public-health benefits of trees may be significantly larger than the biophysical benefits, and, therefore, failure to account for...

  15. Influence of the renewal pruning and control of the rust in the carbohydrate reserves and production of precocious peach tree

    Directory of Open Access Journals (Sweden)

    João Paulo Campos de Araujo

    2008-06-01

    Full Text Available This work aimed to verify the influence of renewal pruning and rust control on the reserves of non-structural carbohydrates in branches and roots of the peach cultivar Flordaprince, as well as the possible effect on fruit production and quality. The work was conducted at the Departamento de Produção Vegetal of ESALQ-USP, in Piracicaba. The experimental design was seven randomized blocks with three treatments, each plot consisting of four plants. Treatment 1 consisted of renewal pruning, performed 45 days after harvest, in October 2003. In treatment 2, renewal pruning was not performed and rust was controlled. In treatment 3, neither renewal pruning nor rust control was performed, which caused early defoliation. The data were subjected to analysis of variance and comparison of means by Tukey's test. The spacing used was 3.0 by 1.2 m, corresponding to 2,777 plants ha⁻¹. The plants were trained in a central leader system and received the usual cultural practices. Samples of roots and branches were collected, dried, ground and submitted to laboratory analysis to determine the contents of non-structural carbohydrates. The concentration of soluble carbohydrates in roots and branches fluctuates according to the time of collection, and the soluble carbohydrate contents in the roots are always higher than those found in the branches. Treatment 2 showed higher fruit production and a greater number of fruits per plant. The treatments had no effect on the qualitative aspects of the fruits, such as diameter, length, color and soluble solids content.

  16. Effect of different winter pruning systems on grapes produced

    Directory of Open Access Journals (Sweden)

    Claudio Caprara

    2013-09-01

    Full Text Available The purpose of these trials was to evaluate the possible effects of the winter pruning system on the properties of grapes, particularly their physical and mechanical features. The following pruning techniques were carried out: manual pruning (m); mechanical pruning (M); mechanical pre-pruning with subsequent manual finishing (Mm); and mechanical pre-pruning with simultaneous fast manual finishing, using a wagon facility with two operators equipped with pneumatic scissors (Mw). The trials were carried out on Sangiovese trained to spurred cordon. The following were measured during the trials: time and cost of pruning, quality of pruning, and the vegetative-productive response of the vines. During grape harvesting, a consolidated analytical method of texture analysis was applied to evaluate the physical parameters of the grapevine cultivar: pedicel detachment, skin perforation, skin thickness, and grape features such as hardness, cohesiveness and springiness. Analysis of working time showed that manual pruning (m) required the most time, while mechanized pruning (M) reduced the time by 95%. The two mechanized prunings combined with manual finishing reduced the time by 21% (Mm) and 69% (Mw). The cost reduction was less marked but still important. Regarding the quality of pruning, increasing the level of mechanization increased the density of spurs and buds. A higher percentage of damaged and wrongly positioned spurs was also detected. Increasing levels of mechanization also produced smaller and sparser bunches with smaller berries. The study of the mechanical properties of the berries showed significant differences in mechanical behaviour among the pruning treatments. Mechanized pruning gave higher values for pedicel detachment, skin perforation and cohesiveness, and lower values for skin thickness and springiness. The results showed that mechanical pruning can modify properties of the berries which...

  17. A Case of Prune Belly Syndrome

    Directory of Open Access Journals (Sweden)

    Wei Xu

    2015-06-01

    Full Text Available Prune belly syndrome (PBS) is a rare congenital disorder characterized by deficient abdominal wall muscles, urinary tract malformation, and, in males, cryptorchidism. We present a case of PBS in China. The patient was a newborn baby boy who had wrinkled, “prune-like” abdominal skin, bilateral cryptorchidism, and urinary system malformation, complicated with hypoplasia of the lung and branch of the coronary artery–right ventricular fistula. His kidney function was inadequate. The patient subsequently died at age 28 days due to septicemia from a severe urinary tract infection.

  18. Effects of root pruning in sour cherry (Prunus cerasus) "Stevnsbaer"

    DEFF Research Database (Denmark)

    Toldam-Andersen, Torben; Jensen, Nauja Lisa; Dencker, Ivar Blücher

    2007-01-01

    (May), initial and final fruit set (June) were recorded. Root pruning had little effect on fruit set, fruit size and yield in the year of pruning. Only in one plot with a severe root pruning (26 cm from the trunks), negative effects were found. The effects on growth, flowering and fruit set in 2003...

  19. 7 CFR 993.149 - Receiving of prunes by handlers.

    Science.gov (United States)

    2010-01-01

    ... same procedure shall apply as set forth in paragraph (d)(1) of this section. For each day on which a... and usually received by a handler in any considerable volume as ranch deliveries, and at which there... samples of prunes drawn as prune plums and dehydrated in the same manner as the prunes to which they are...

  20. Lessons learned from Applications of a Decision Tree for Confronting Climate Change Uncertainty - the Short Term and the Long Term

    Science.gov (United States)

    Ray, P. A.; Wi, S.; Bonzanigo, L.; Taner, M. U.; Rodriguez, D.; Garcia, L.; Brown, C.

    2016-12-01

    The Decision Tree for Confronting Climate Change Uncertainty is a hierarchical, staged framework for accomplishing climate change risk management in water resources system investments. Since its development for the World Bank Water Group two years ago, the framework has been applied to pilot demonstration projects in Nepal (hydropower generation), Mexico (water supply), Kenya (multipurpose reservoir operation), and Indonesia (flood risks to dam infrastructure). An important finding of the Decision Tree demonstration projects has been the need to present the risks/opportunities of climate change to stakeholders and investors in proportion to risks/opportunities and hazards of other kinds. This presentation will provide an overview of tools and techniques used to quantify risks/opportunities to each of the project types listed above, with special attention to those found most useful for exploration of the risk space. Careful exploration of the risk/opportunity space shows that some interventions would be better taken now, whereas risks/opportunities of other types would be better instituted incrementally in order to maintain reversibility and flexibility. A number of factors contribute to the robustness/flexibility tradeoff: available capital, magnitude and imminence of potential risk/opportunity, modular (or not) character of investment, and risk aversion of the decision maker, among others. Finally, in each case, nuance was required in the translation of Decision Tree findings into actionable policy recommendations. Though the narrative of stakeholder solicitation, engagement, and ultimate partnership is unique to each case, summary lessons are available from the portfolio that can serve as a guideline to the community of climate change risk managers.

  1. Analysis of the impact of recreational trail usage for prioritising management decisions: a regression tree approach

    Science.gov (United States)

    Tomczyk, Aleksandra; Ewertowski, Marek; White, Piran; Kasprzak, Leszek

    2016-04-01

    The dual role of many Protected Natural Areas in providing benefits for both conservation and recreation poses challenges for management. Although recreation-based damage to ecosystems can occur very quickly, restoration can take many years. The protection of conservation interests at the same time as providing for recreation requires decisions to be made about how to prioritise and direct management actions. Trails are commonly used to divert visitors from the most important areas of a site, but high visitor pressure can lead to increases in trail width and a concomitant increase in soil erosion. Here we use detailed field data on condition of recreational trails in Gorce National Park, Poland, as the basis for a regression tree analysis to determine the factors influencing trail deterioration, and link specific trail impacts with environmental, use related and managerial factors. We distinguished 12 types of trails, characterised by four levels of degradation: (1) trails with an acceptable level of degradation; (2) threatened trails; (3) damaged trails; and (4) heavily damaged trails. Damaged trails were the most vulnerable of all trails and should be prioritised for appropriate conservation and restoration. We also proposed five types of monitoring of recreational trail conditions: (1) rapid inventory of negative impacts; (2) monitoring visitor numbers and variation in type of use; (3) change-oriented monitoring focusing on sections of trail which were subjected to changes in type or level of use or subjected to extreme weather events; (4) monitoring of dynamics of trail conditions; and (5) full assessment of trail conditions, to be carried out every 10-15 years. The application of the proposed framework can enhance the ability of Park managers to prioritise their trail management activities, enhancing trail conditions and visitor safety, while minimising adverse impacts on the conservation value of the ecosystem. A.M.T. was supported by the Polish Ministry of
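    A regression tree of the kind this record mentions is grown by repeatedly choosing the split that most reduces the spread of the response within the resulting groups. The sketch below shows that single step for one predictor, using the sum of squared errors as the criterion. It is a generic illustration, not the authors' analysis: the predictor (slope), the response (trail width) and all values are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys):
    """Find the threshold t that minimises SSE(y | x <= t) + SSE(y | x > t).

    Returns (threshold, total_sse); threshold is None if no split helps.
    """
    pairs = sorted(zip(xs, ys))
    best = (None, sse(ys))            # baseline: no split at all
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue                  # cannot split between equal x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        total = sse(left) + sse(right)
        if total < best[1]:
            best = (threshold, total)
    return best


# Hypothetical measurements: trail slope (degrees) and trail width (m).
slope = [2, 4, 5, 12, 14, 15]
width = [1.0, 1.1, 0.9, 2.0, 2.2, 2.1]

threshold, total = best_split(slope, width)
print("split at slope <=", threshold, "with total SSE", round(total, 3))
```

    A full tree applies the same search to every candidate predictor and then recurses into each side of the chosen split until a stopping rule is met.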

  2. Landslide Susceptibility Mapping of Tegucigalpa, Honduras Using Artificial Neural Network, Bayesian Network and Decision Trees

    Science.gov (United States)

    Garcia Urquia, E. L.; Braun, A.; Yamagishi, H.

    2016-12-01

    Tegucigalpa, the capital city of Honduras, experiences rainfall-induced landslides on a yearly basis. The high precipitation regime and the rugged topography on which the city has been built combine with the lack of a proper urban expansion plan to contribute to the occurrence of landslides during the rainy season. Thousands of inhabitants live at risk of losing their belongings due to the construction of precarious shelters in landslide-prone areas on mountainous terrains and next to the riverbanks. Therefore, the city is in need of landslide susceptibility and hazard maps to aid in the regulation of future development. Major challenges in the context of highly dynamic urbanizing areas are the overlap of natural and anthropogenic slope destabilizing factors, as well as the availability and accuracy of data. Data-driven multivariate techniques have proven to be powerful in discovering interrelations between factors, identifying important factors in large datasets, capturing non-linear problems and coping with noisy and incomplete data. This analysis focuses on the creation of a landslide susceptibility map using different methods from the field of data mining: Artificial Neural Networks (ANN), Bayesian Networks (BN) and Decision Trees (DT). The input dataset of the study contains geomorphological and hydrological factors derived from a digital elevation model with a 10 m resolution, lithological factors derived from a geological map, and anthropogenic factors, such as information on the development stage of the neighborhoods in Tegucigalpa and road density. Moreover, a landslide inventory map that was developed in 2014 through aerial photo interpretation was used as the target variable in the analysis. The analysis covers an area of roughly 100 km², of which 8.95 km² are occupied by landslides. In a first step, the dataset was explored by assessing and improving the data quality, identifying unimportant variables and finding interrelations. Then, based on a training

  3. Pruning the vocabulary for better context recognition

    DEFF Research Database (Denmark)

    Madsen, Rasmus Elsborg; Sigurdsson, Sigurdur; Hansen, Lars Kai

    2004-01-01

    of term relevancy, when pruning the vocabularies. With reduced vocabularies, documents are classified using a latent semantic indexing representation and a probabilistic neural network classifier. Reducing the bag-of-words vocabularies with 90%-98%, we find consistent classification improvement using two...

  4. Vocabulary Pruning for Improved Context Recognition

    DEFF Research Database (Denmark)

    Madsen, Rasmus Elsborg; Sigurdsson, Sigurdur; Hansen, Lars Kai

    2004-01-01

    of term relevancy, when pruning the vocabularies. With reduced vocabularies documents are classified using a latent semantic indexing representation and a probabilistic neural network classifier. Reducing the bag-of-words vocabularies with 90%-98%, we find consistent classification improvement using two...

  5. 21 CFR 145.190 - Canned prunes.

    Science.gov (United States)

    2010-04-01

    ... 21 Food and Drugs 2 2010-04-01 2010-04-01 false Canned prunes. 145.190 Section 145.190 Food and Drugs FOOD AND DRUG ADMINISTRATION, DEPARTMENT OF HEALTH AND HUMAN SERVICES (CONTINUED) FOOD FOR HUMAN...: (1) Natural and artificial flavors. (2) Spice. (3) Vinegar, lemon juice, or organic acids. (4...

  6. A New Decision Tree to Solve the Puzzle of Alzheimer's Disease Pathogenesis Through Standard Diagnosis Scoring System.

    Science.gov (United States)

    Kumar, Ashwani; Singh, Tiratha Raj

    2017-03-01

    Alzheimer's disease (AD) is a progressive, incurable and terminal neurodegenerative disorder of the brain and is associated with mutations in amyloid precursor protein, presenilin 1, presenilin 2 or apolipoprotein E, but its underlying mechanisms are still not fully understood. The healthcare sector is generating a large amount of information on the diagnosis, disease identification and treatment of individuals. Mining knowledge from clinical datasets to support scientific decision-making in the diagnosis and treatment of disease is therefore becoming increasingly necessary. The current study deals with the construction of classifiers for an AD gene dataset, using a decision tree, that are human readable as well as robust in performance. Classification models for different AD genes were generated according to Mini-Mental State Examination scores and all other vital parameters, in order to identify the expression levels of different disorder-related proteins that may determine the involvement of genes in various AD pathogenesis pathways. The effectiveness of the decision tree in AD diagnosis is determined by information gain with confidence value (0.96), specificity (92%), sensitivity (98%) and accuracy (77%). Besides this functional gene classification using different parameters and enrichment analysis, our findings indicate that the measures of all the genes assessed in single cohorts are sufficient to diagnose AD and will help in the prediction of important parameters for other relevant assessments.

  7. [Study on extraction method of Panax notoginseng plots in Wenshan of Yunnan province based on decision tree model].

    Science.gov (United States)

    Shi, Ting-Ting; Zhang, Xiao-Bo; Guo, Lan-Ping; Huang, Lu-Qi

    2017-11-01

    The herbs used as the material for traditional Chinese medicine are usually planted in mountainous areas where the natural environment is suitable. Because mountain terrain is complex and planting plots are scattered, it is difficult to obtain an accurate planting area with traditional survey methods. Studying methods for extracting Chinese herbal medicine planting areas from remote sensing, and thereby achieving dynamic monitoring and reserve estimation of Chinese herbal medicines, is of great significance for providing decision support for the conservation and utilization of traditional Chinese medicine resources. In this paper, taking the Panax notoginseng plots in Wenshan prefecture of Yunnan province as an example, China-made GF-1 multispectral remote sensing images with a 16 m×16 m resolution were obtained. Then, the time series that best reflects the spectral difference between P. notoginseng sheds and background objects was selected, and a decision tree model for extracting P. notoginseng plots was constructed according to the spectral characteristics of the surface features. The results showed that the remote sensing classification method based on the decision tree model could extract P. notoginseng plots in the study area effectively. The method can provide technical support for extraction of P. notoginseng plots at county level. Copyright© by the Chinese Pharmaceutical Association.

  8. Classification of Different Degrees of Disability Following Intracerebral Hemorrhage: A Decision Tree Analysis from VISTA-ICH Collaboration.

    Science.gov (United States)

    Phan, Thanh G; Chen, Jian; Beare, Richard; Ma, Henry; Clissold, Benjamin; Van Ly, John; Srikanth, Velandai

    2017-01-01

    Prognostication following intracerebral hemorrhage (ICH) has focused on poor outcome at the expense of lumping together mild and moderate disability. We aimed to develop a novel approach to classifying a range of disability following ICH. The Virtual International Stroke Trial Archive collaboration database was searched for patients with ICH and known volume of ICH on baseline CT scans. Disability was partitioned into mild [modified Rankin Scale (mRS) at 90 days of 0-2], moderate (mRS = 3-4), and severe disabilities (mRS = 5-6). We used binary and trichotomy decision tree methodology. The data were randomly divided into training (2/3 of data) and validation (1/3 of data) datasets. The area under the receiver operating characteristic curve (AUC) was used to calculate the accuracy of the decision tree model. We identified 957 patients, age 65.9 ± 12.3 years, 63.7% males, and ICH volume 22.6 ± 22.1 ml. The binary tree showed that lower ICH volume (27.9 ml), older age (>69.5 years), and low Glasgow Coma Scale (... tree showed that ICH volume, age, and serum glucose can separate mild, moderate, and severe disability groups with AUC 0.79 (95% CI 0.71-0.87). Both the binary and trichotomy methods provide equivalent discrimination of disability outcome after ICH. The trichotomy method can classify three categories at once, whereas this action was not possible with the binary method. The trichotomy method may be of use to clinicians and trialists for classifying a range of disability in ICH.
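    The outcome coding and the 2/3 to 1/3 data split described in this record are simple enough to write down directly. The sketch below is an assumption-laden illustration, not the authors' code: the function names and the fixed random seed are hypothetical, and only the mRS cut-offs and the split proportions come from the abstract.

```python
import random


def disability_class(mrs):
    """Map a 90-day modified Rankin Scale score (0-6) to the three groups
    given in the abstract: mild (0-2), moderate (3-4), severe (5-6)."""
    if not isinstance(mrs, int) or not 0 <= mrs <= 6:
        raise ValueError("mRS must be an integer from 0 to 6")
    if mrs <= 2:
        return "mild"
    if mrs <= 4:
        return "moderate"
    return "severe"


def train_validation_split(records, seed=0):
    """Randomly assign two thirds of the records to training and the rest
    to validation. The seed is an arbitrary choice for reproducibility."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = (2 * len(shuffled)) // 3
    return shuffled[:cut], shuffled[cut:]


# 957 is the cohort size reported in the abstract; the IDs are placeholders.
train, validation = train_validation_split(range(957))
print(len(train), len(validation))
print([disability_class(score) for score in range(7)])
```

    Coding the outcome as three ordered classes, as opposed to a single good/poor cut-off, is what lets one tree separate mild, moderate and severe disability in a single pass.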

  9. Decision-tree early warning score (DTEWS) validates the design of the National Early Warning Score (NEWS).

    Science.gov (United States)

    Badriyah, Tessy; Briggs, James S; Meredith, Paul; Jarvis, Stuart W; Schmidt, Paul E; Featherstone, Peter I; Prytherch, David R; Smith, Gary B

    2014-03-01

    To compare the performance of a human-generated, trial and error-optimised early warning score (EWS), i.e., National Early Warning Score (NEWS), with one generated entirely algorithmically using Decision Tree (DT) analysis. We used DT analysis to construct a decision-tree EWS (DTEWS) from a database of 198,755 vital signs observation sets collected from 35,585 consecutive, completed acute medical admissions. We evaluated the ability of DTEWS to discriminate patients at risk of cardiac arrest, unanticipated intensive care unit admission or death, each within 24h of a given vital signs observation. We compared the performance of DTEWS and NEWS using the area under the receiver-operating characteristic (AUROC) curve. The structures of DTEWS and NEWS were very similar. The AUROC (95% CI) for DTEWS for cardiac arrest, unanticipated ICU admission, death, and any of the outcomes, all within 24h, were 0.708 (0.669-0.747), 0.862 (0.852-0.872), 0.899 (0.892-0.907), and 0.877 (0.870-0.883), respectively. Values for NEWS were 0.722 (0.685-0.759) [cardiac arrest], 0.857 (0.847-0.868) [unanticipated ICU admission], 0.894 (0.887-0.902) [death], and 0.873 (0.866-0.879) [any outcome]. The decision-tree technique independently validates the composition and weightings of NEWS. The DT approach quickly provided an almost identical EWS to NEWS, although one that admittedly would benefit from fine-tuning using clinical knowledge. We believe that DT analysis could be used to quickly develop candidate models for disease-specific EWSs, which may be required in future. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  10. Predicting lung radiotherapy-induced pneumonitis using a model combining parametric Lyman probit with nonparametric decision trees.

    Science.gov (United States)

    Das, Shiva K; Zhou, Sumin; Zhang, Junan; Yin, Fang-Fang; Dewhirst, Mark W; Marks, Lawrence B

    2007-07-15

    To develop and test a model to predict for lung radiation-induced Grade 2+ pneumonitis. The model was built from a database of 234 lung cancer patients treated with radiotherapy (RT), of whom 43 were diagnosed with pneumonitis. The model augmented the predictive capability of the parametric dose-based Lyman normal tissue complication probability (LNTCP) metric by combining it with weighted nonparametric decision trees that use dose and nondose inputs. The decision trees were sequentially added to the model using a "boosting" process that enhances the accuracy of prediction. The model's predictive capability was estimated by 10-fold cross-validation. To facilitate dissemination, the cross-validation result was used to extract a simplified approximation to the complicated model architecture created by boosting. Application of the simplified model is demonstrated in two example cases. The area under the model receiver operating characteristics curve for cross-validation was 0.72, a significant improvement over the LNTCP area of 0.63 (p = 0.005). The simplified model used the following variables to output a measure of injury: LNTCP, gender, histologic type, chemotherapy schedule, and treatment schedule. For a given patient RT plan, injury prediction was highest for the combination of pre-RT chemotherapy, once-daily treatment, female gender and lowest for the combination of no pre-RT chemotherapy and nonsquamous cell histologic type. Application of the simplified model to the example cases revealed that injury prediction for a given treatment plan can range from very low to very high, depending on the settings of the nondose variables. Radiation pneumonitis prediction was significantly enhanced by decision trees that added the influence of nondose factors to the LNTCP formulation.

  11. Analysis of Human Papillomavirus Using Datamining - Apriori, Decision Tree, and Support Vector Machine (SVM) and its Application Field

    Directory of Open Access Journals (Sweden)

    Cho Younghoon

    2016-01-01

    Full Text Available Human Papillomavirus (HPV) has various types (compared to other viruses) and plays a key role in evoking diverse diseases, especially cervical cancer. In this study, we aim to distinguish the features of HPV types with different degrees of fatality by analyzing their DNA sequences. We used the Decision Tree Algorithm, the Apriori Algorithm, and the Support Vector Machine in our experiment. By analyzing their DNA sequences, we discovered some relationships between certain types of HPV, especially the most fatal types, 16 and 18. Moreover, we concluded that it would be possible for scientists to develop more potent HPV cures by applying these relationships and the features that HPV exhibits.

  12. Mapping mangrove forests using multi-tidal remotely-sensed data and a decision-tree-based procedure

    Science.gov (United States)

    Zhang, Xuehong; Treitz, Paul M.; Chen, Dongmei; Quan, Chang; Shi, Lixin; Li, Xinhui

    2017-10-01

    Mangrove forests grow in intertidal zones in tropical and subtropical regions and have suffered a dramatic decline globally over the past few decades. Remote sensing data, collected at various spatial resolutions, provide an effective way to map the spatial distribution of mangrove forests over time. However, the spectral signatures of mangrove forests are significantly affected by tide levels. Therefore, mangrove forests may not be accurately mapped with remote sensing data collected during a single-tidal event, especially if not acquired at low tide. This research reports how a decision-tree-based procedure was developed to map mangrove forests using multi-tidal Landsat 5 Thematic Mapper (TM) data and a Digital Elevation Model (DEM). Three indices, namely the Normalized Difference Moisture Index (NDMI), the Normalized Difference Vegetation Index (NDVI) and NDVI_L·NDMI_H (the product of NDVI_L and NDMI_H; L: low tide level, H: high tide level), were used in this algorithm to differentiate mangrove forests from other land-cover and land-use types in Fangchenggang City, China. Additionally, recent Landsat 8 OLI (Operational Land Imager) data were selected to validate the results and assess whether the methodology is reliable. The results demonstrate that short-term multi-tidal remotely-sensed data better represent the unique nearshore coastal wetland habitats of mangrove forests than single-tidal data. Furthermore, multi-tidal remotely-sensed data have led to improved accuracies using two classification approaches, i.e. decision trees and the maximum likelihood classification (MLC). Since mangrove forests are typically found at low elevations, the inclusion of elevation data in the two classification procedures was tested. Given that the decision-tree method does not assume strict data distribution parameters, it was able to optimize the application of multi-tidal and elevation data, resulting in higher classification accuracies of mangrove forests. When using multi
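
The three indices named in this abstract can be illustrated with a minimal sketch. It assumes the standard definitions NDVI = (NIR - Red) / (NIR + Red) and NDMI = (NIR - SWIR1) / (NIR + SWIR1); the function names and the sample reflectance values are hypothetical, and the abstract gives no decision-tree thresholds, so none are applied here.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def ndvi(nir, red):
    # Landsat 5 TM: NIR is band 4, red is band 3.
    return normalized_difference(nir, red)


def ndmi(nir, swir1):
    # Landsat 5 TM: NIR is band 4, SWIR1 is band 5.
    return normalized_difference(nir, swir1)


def multi_tidal_index(low_tide, high_tide):
    """NDVI at low tide multiplied by NDMI at high tide (NDVI_L * NDMI_H)."""
    ndvi_low = ndvi(low_tide["nir"], low_tide["red"])
    ndmi_high = ndmi(high_tide["nir"], high_tide["swir1"])
    return ndvi_low * ndmi_high


# Hypothetical surface reflectance for one pixel at two tide levels.
low = {"nir": 0.40, "red": 0.10, "swir1": 0.20}
high = {"nir": 0.30, "red": 0.10, "swir1": 0.10}
product = multi_tidal_index(low, high)
```

In a real workflow the same arithmetic would be applied per pixel to co-registered low-tide and high-tide scenes, and the resulting layers would feed the classifier together with the elevation data.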

  13. STBase: one million species trees for comparative biology.

    Directory of Open Access Journals (Sweden)

    Michelle M McMahon

    Comprehensively sampled phylogenetic trees provide the most compelling foundations for strong inferences in comparative evolutionary biology. Mismatches are common, however, between the taxa for which comparative data are available and the taxa sampled by published phylogenetic analyses. Moreover, many published phylogenies are gene trees, which cannot always be adapted immediately for species level comparisons because of discordance, gene duplication, and other confounding biological processes. A new database, STBase, lets comparative biologists quickly retrieve species level phylogenetic hypotheses in response to a query list of species names. The database consists of 1 million single- and multi-locus data sets, each with a confidence set of 1000 putative species trees, computed from GenBank sequence data for 413,000 eukaryotic taxa. Two bodies of theoretical work are leveraged to aid in the assembly of multi-locus concatenated data sets for species tree construction. First, multiply labeled gene trees are pruned to conflict-free singly-labeled species-level trees that can be combined between loci. Second, impacts of missing data in multi-locus data sets are ameliorated by assembling only decisive data sets. Data sets overlapping with the user's query are ranked using a scheme that depends on user-provided weights for tree quality and for taxonomic overlap of the tree with the query. Retrieval times are independent of the size of the database, typically a few seconds. Tree quality is assessed by a real-time evaluation of bootstrap support on just the overlapping subtree. Associated sequence alignments, tree files and metadata can be downloaded for subsequent analysis. STBase provides a tool for comparative biologists interested in exploiting the most relevant sequence data available for the taxa of interest. It may also serve as a prototype for future species tree oriented databases and as a resource for assembly of larger species phylogenies

  14. STBase: one million species trees for comparative biology.

    Science.gov (United States)

    McMahon, Michelle M; Deepak, Akshay; Fernández-Baca, David; Boss, Darren; Sanderson, Michael J

    2015-01-01

    Comprehensively sampled phylogenetic trees provide the most compelling foundations for strong inferences in comparative evolutionary biology. Mismatches are common, however, between the taxa for which comparative data are available and the taxa sampled by published phylogenetic analyses. Moreover, many published phylogenies are gene trees, which cannot always be adapted immediately for species level comparisons because of discordance, gene duplication, and other confounding biological processes. A new database, STBase, lets comparative biologists quickly retrieve species level phylogenetic hypotheses in response to a query list of species names. The database consists of 1 million single- and multi-locus data sets, each with a confidence set of 1000 putative species trees, computed from GenBank sequence data for 413,000 eukaryotic taxa. Two bodies of theoretical work are leveraged to aid in the assembly of multi-locus concatenated data sets for species tree construction. First, multiply labeled gene trees are pruned to conflict-free singly-labeled species-level trees that can be combined between loci. Second, impacts of missing data in multi-locus data sets are ameliorated by assembling only decisive data sets. Data sets overlapping with the user's query are ranked using a scheme that depends on user-provided weights for tree quality and for taxonomic overlap of the tree with the query. Retrieval times are independent of the size of the database, typically a few seconds. Tree quality is assessed by a real-time evaluation of bootstrap support on just the overlapping subtree. Associated sequence alignments, tree files and metadata can be downloaded for subsequent analysis. STBase provides a tool for comparative biologists interested in exploiting the most relevant sequence data available for the taxa of interest. It may also serve as a prototype for future species tree oriented databases and as a resource for assembly of larger species phylogenies from precomputed

  15. Impact of wood pruning to greenhouse gas emissions in three orchards and a vineyard

    Science.gov (United States)

    Germer, Sonja; Schleicher, Sarah; Bischoff, Wolf-Anno; Gomez Palermo, Maider; Kern, Jürgen

    2015-04-01

    Prunings from orchards and vineyards are usually burned or left on the soil for nutrient and organic carbon recycling. Recently, interest has grown in extracting prunings for energy use. Very few studies have analyzed the effects of pruning removal on soil physical and chemical characteristics. This is linked to the fact that changes are expected mainly in the long term, whereas project funding is typically restricted to 2 or 3 years. Some soil characteristics, however, such as organic carbon content and greenhouse gas emissions, might also change in the short term, as our literature review reveals. The main objective of this research is to determine whether pruning extraction from orchards and vineyards affects greenhouse gas emissions (N2O, CH4, and CO2) from soil to the atmosphere, changes soil nitrogen and carbon content, or affects nitrogen leaching. Results from our study and from the literature will be compiled to formulate best management practices for sustainable pruning utilization from orchards and vineyards. Here we compare four different study sites in a block design over two rows each with two parcels where prunings were extracted and two parcels where prunings were chipped and left on the soil (n=4). Comparisons were made for initial soil chemistry and greenhouse gas emissions in a cherry orchard without irrigation in Germany, a vineyard without irrigation in France, an almond orchard with drip irrigation in Spain and a peach orchard with flood irrigation in Spain. Soil greenhouse gas emissions depend on soil chemistry and soil moisture. These characteristics can be expected to vary between the tree rows and inter-rows of orchards. Therefore, we took soil samples from row and inter-row positions of each study site and analyzed them for chemical parameters (pH, total C, N, S, and H, and available PO4, NH4, NO3, K, Mg, Ca). Additionally, soil moisture and temperature data have been recorded for tree rows and inter-rows in the cherry orchard and the vineyard. Gas samples were

  16. Interactions between factors related to the decision of sex offenders to confess during police interrogation: a classification-tree approach.

    Science.gov (United States)

    Beauregard, Eric; Deslauriers-Varin, Nadine; St-Yves, Michel

    2010-09-01

    Most studies of confessions have looked at the influence of individual factors, neglecting the potential interactions between these factors and their impact on the decision to confess or not during an interrogation. Classification and regression tree analyses conducted on a sample of 624 convicted sex offenders showed that certain factors related to the offenders (e.g., personality, criminal career), victims (e.g., sex, relationship to offender), and case (e.g., time of day of the crime) were related to the decision to confess or not during the police interrogation. Several interactions were also observed between these factors. Results will be discussed in light of previous findings and interrogation strategies for sex offenders.

  17. Lost in space? Generalising subtree prune and regraft to spaces of phylogenetic networks.

    Science.gov (United States)

    Bordewich, Magnus; Linz, Simone; Semple, Charles

    2017-06-21

    Over the last fifteen years, phylogenetic networks have become a popular tool to analyse relationships between species whose past includes reticulation events such as hybridisation or horizontal gene transfer. However, the space of phylogenetic networks is significantly larger than that of phylogenetic trees, and how to analyse and search this enlarged space remains a poorly understood problem. Inspired by the widely-used rooted subtree prune and regraft (rSPR) operation on rooted phylogenetic trees, we propose a new operation, called subnet prune and regraft (SNPR), that induces a metric on the space of all rooted phylogenetic networks on a fixed set of leaves. We show that the spaces of several popular classes of rooted phylogenetic networks (e.g. tree child, reticulation visible, and tree based) are connected under SNPR and that connectedness remains for the subclasses of these networks with a fixed number of reticulations. Lastly, we bound the distance between two rooted phylogenetic networks under the SNPR operation, show that it is computationally hard to compute this distance exactly, and analyse how the SNPR-distance between two such networks relates to the rSPR-distance between rooted phylogenetic trees that are embedded in these networks. Copyright © 2017 Elsevier Ltd. All rights reserved.

  18. Use of decision trees to value investigation strategies for soil pollution problems

    NARCIS (Netherlands)

    Okx, J.P.; Stein, A.

    2000-01-01

    Remediation of a contaminated site usually requires costly actions, and several clean-up and sampling strategies may have to be compared by those involved in the decision-making process. In this paper several common environmental pollution problems have been addressed by using probabilistic decision

  19. Procalcitonin and C-reactive protein-based decision tree model for distinguishing PFAPA flares from acute infections

    Directory of Open Access Journals (Sweden)

    Barbara Kraszewska-Głomba

    2016-03-01

    As no specific laboratory test has been identified, PFAPA (periodic fever, aphthous stomatitis, pharyngitis and cervical adenitis) remains a diagnosis of exclusion. We searched for a practical use of procalcitonin (PCT) and C-reactive protein (CRP) in distinguishing PFAPA attacks from acute bacterial and viral infections. Levels of PCT and CRP were measured in 38 patients with PFAPA and 81 children diagnosed with an acute bacterial (n=42) or viral (n=39) infection. Statistical analysis with the use of the C4.5 algorithm resulted in the following decision tree: viral infection if CRP≤19.1 mg/L; otherwise, for cases with CRP>19.1 mg/L: bacterial infection if PCT>0.65 ng/mL, PFAPA if PCT≤0.65 ng/mL. The model was tested using 10-fold cross-validation and in an independent test cohort (n=30); the rule’s overall accuracy was 76.4% and 90%, respectively. Although limited by a small sample size, the obtained decision tree might present a potential diagnostic tool for distinguishing PFAPA flares from acute infections when interpreted cautiously and with reference to the clinical context.
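
The two-threshold rule quoted in this abstract can be written out as a short function. This is only a sketch: the function and argument names are hypothetical, the cut-offs are the ones stated in the abstract (CRP 19.1 mg/L, PCT 0.65 ng/mL), and it is not the authors' C4.5 model itself.

```python
def classify_febrile_episode(crp_mg_per_l, pct_ng_per_ml):
    """Apply the decision rule quoted in the abstract.

    crp_mg_per_l:  C-reactive protein in mg/L
    pct_ng_per_ml: procalcitonin in ng/mL
    """
    # Root split: low CRP points to a viral infection.
    if crp_mg_per_l <= 19.1:
        return "viral infection"
    # Second split (CRP > 19.1 mg/L): PCT separates bacterial from PFAPA.
    if pct_ng_per_ml > 0.65:
        return "bacterial infection"
    return "PFAPA"


# Hypothetical example values, one per leaf of the tree.
examples = [
    classify_febrile_episode(10.0, 2.0),
    classify_febrile_episode(50.0, 1.0),
    classify_febrile_episode(50.0, 0.3),
]
```

Each example input falls into a different leaf, so the three calls exercise all three outcomes of the rule.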

  20. Procalcitonin and C-reactive protein-based decision tree model for distinguishing PFAPA flares from acute infections

    Science.gov (United States)

    Kraszewska-Głomba, Barbara; Szymańska-Toczek, Zofia; Szenborn, Leszek

    2016-01-01

    As no specific laboratory test has been identified, PFAPA (periodic fever, aphthous stomatitis, pharyngitis and cervical adenitis) remains a diagnosis of exclusion. We searched for a practical use of procalcitonin (PCT) and C-reactive protein (CRP) in distinguishing PFAPA attacks from acute bacterial and viral infections. Levels of PCT and CRP were measured in 38 patients with PFAPA and 81 children diagnosed with an acute bacterial (n=42) or viral (n=39) infection. Statistical analysis with the use of the C4.5 algorithm resulted in the following decision tree: viral infection if CRP≤19.1 mg/L; otherwise, for cases with CRP>19.1 mg/L: bacterial infection if PCT>0.65 ng/mL, PFAPA if PCT≤0.65 ng/mL. The model was tested using 10-fold cross-validation and in an independent test cohort (n=30); the rule’s overall accuracy was 76.4% and 90%, respectively. Although limited by a small sample size, the obtained decision tree might present a potential diagnostic tool for distinguishing PFAPA flares from acute infections when interpreted cautiously and with reference to the clinical context. PMID:27131024