WorldWideScience

Sample records for additive classification tree

  1. Classification and regression trees

    CERN Document Server

    Breiman, Leo; Olshen, Richard A; Stone, Charles J

    1984-01-01

    The methodology used to construct tree structured rules is the focus of this monograph. Unlike many other statistical procedures, which moved from pencil and paper to calculators, this text's use of trees was unthinkable before computers. Both the practical and theoretical sides have been developed in the authors' study of tree methods. Classification and Regression Trees reflects these two sides, covering the use of trees as a data analysis method, and in a more mathematical framework, proving some of their fundamental properties.

  2. SPORT FOOD ADDITIVE CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    I. P. Prokopenko

    2015-01-01

    Full Text Available Correctly organized nutritive and pharmacological support is an important component of an athlete's preparation for competitions, an optimal shape maintenance, fast recovery and rehabilitation after traumas and defatigation. Special products of enhanced biological value (BAS for athletes nutrition are used with this purpose. Easy-to-use energy sources are administered into athlete's organism, yielded materials and biologically active substances which regulate and activate exchange reactions which proceed with difficulties during certain physical trainings. The article presents sport supplements classification which can be used before warm-up and trainings, after trainings and in competitions breaks.

  3. DIF Trees: Using Classification Trees to Detect Differential Item Functioning

    Science.gov (United States)

    Vaughn, Brandon K.; Wang, Qiu

    2010-01-01

    A nonparametric tree classification procedure is used to detect differential item functioning for items that are dichotomously scored. Classification trees are shown to be an alternative procedure to detect differential item functioning other than the use of traditional Mantel-Haenszel and logistic regression analysis. A nonparametric…

  4. The decision tree approach to classification

    Science.gov (United States)

    Wu, C.; Landgrebe, D. A.; Swain, P. H.

    1975-01-01

    A class of multistage decision tree classifiers is proposed and studied relative to the classification of multispectral remotely sensed data. The decision tree classifiers are shown to have the potential for improving both the classification accuracy and the computation efficiency. Dimensionality in pattern recognition is discussed and two theorems on the lower bound of logic computation for multiclass classification are derived. The automatic or optimization approach is emphasized. Experimental results on real data are reported, which clearly demonstrate the usefulness of decision tree classifiers.

  5. Predicting Battle Outcomes with Classification Trees

    National Research Council Canada - National Science Library

    Coban, Muzaffer

    2001-01-01

    ... from the actual battlefield, The models built by using classification trees reveal that the objective variables alone cannot explain the outcome of battles, Relative factors, such as leadership, have deep...

  6. Automated Decision Tree Classification of Corneal Shape

    Science.gov (United States)

    Twa, Michael D.; Parthasarathy, Srinivasan; Roberts, Cynthia; Mahmoud, Ashraf M.; Raasch, Thomas W.; Bullimore, Mark A.

    2011-01-01

    Purpose The volume and complexity of data produced during videokeratography examinations present a challenge of interpretation. As a consequence, results are often analyzed qualitatively by subjective pattern recognition or reduced to comparisons of summary indices. We describe the application of decision tree induction, an automated machine learning classification method, to discriminate between normal and keratoconic corneal shapes in an objective and quantitative way. We then compared this method with other known classification methods. Methods The corneal surface was modeled with a seventh-order Zernike polynomial for 132 normal eyes of 92 subjects and 112 eyes of 71 subjects diagnosed with keratoconus. A decision tree classifier was induced using the C4.5 algorithm, and its classification performance was compared with the modified Rabinowitz–McDonnell index, Schwiegerling’s Z3 index (Z3), Keratoconus Prediction Index (KPI), KISA%, and Cone Location and Magnitude Index using recommended classification thresholds for each method. We also evaluated the area under the receiver operator characteristic (ROC) curve for each classification method. Results Our decision tree classifier performed equal to or better than the other classifiers tested: accuracy was 92% and the area under the ROC curve was 0.97. Our decision tree classifier reduced the information needed to distinguish between normal and keratoconus eyes using four of 36 Zernike polynomial coefficients. The four surface features selected as classification attributes by the decision tree method were inferior elevation, greater sagittal depth, oblique toricity, and trefoil. Conclusions Automated decision tree classification of corneal shape through Zernike polynomials is an accurate quantitative method of classification that is interpretable and can be generated from any instrument platform capable of raw elevation data output. This method of pattern classification is extendable to other classification

  7. Fast Image Texture Classification Using Decision Trees

    Science.gov (United States)

    Thompson, David R.

    2011-01-01

    Texture analysis would permit improved autonomous, onboard science data interpretation for adaptive navigation, sampling, and downlink decisions. These analyses would assist with terrain analysis and instrument placement in both macroscopic and microscopic image data products. Unfortunately, most state-of-the-art texture analysis demands computationally expensive convolutions of filters involving many floating-point operations. This makes them infeasible for radiation- hardened computers and spaceflight hardware. A new method approximates traditional texture classification of each image pixel with a fast decision-tree classifier. The classifier uses image features derived from simple filtering operations involving integer arithmetic. The texture analysis method is therefore amenable to implementation on FPGA (field-programmable gate array) hardware. Image features based on the "integral image" transform produce descriptive and efficient texture descriptors. Training the decision tree on a set of training data yields a classification scheme that produces reasonable approximations of optimal "texton" analysis at a fraction of the computational cost. A decision-tree learning algorithm employing the traditional k-means criterion of inter-cluster variance is used to learn tree structure from training data. The result is an efficient and accurate summary of surface morphology in images. This work is an evolutionary advance that unites several previous algorithms (k-means clustering, integral images, decision trees) and applies them to a new problem domain (morphology analysis for autonomous science during remote exploration). Advantages include order-of-magnitude improvements in runtime, feasibility for FPGA hardware, and significant improvements in texture classification accuracy.

  8. Dissimilarity-based classification of anatomical tree structures

    DEFF Research Database (Denmark)

    Sørensen, Lauge; Lo, Pechin Chien Pau; Dirksen, Asger

    2011-01-01

    A novel method for classification of abnormality in anatomical tree structures is presented. A tree is classified based on direct comparisons with other trees in a dissimilarity-based classification scheme. The pair-wise dissimilarity measure between two trees is based on a linear assignment betw...

  9. Dissimilarity-based classification of anatomical tree structures

    DEFF Research Database (Denmark)

    Sørensen, Lauge Emil Borch Laurs; Lo, Pechin Chien Pau; Dirksen, Asger

    2011-01-01

    A novel method for classification of abnormality in anatomical tree structures is presented. A tree is classified based on direct comparisons with other trees in a dissimilarity-based classification scheme. The pair-wise dissimilarity measure between two trees is based on a linear assignment...

  10. Phylogenetic classification and the universal tree.

    Science.gov (United States)

    Doolittle, W F

    1999-06-25

    From comparative analyses of the nucleotide sequences of genes encoding ribosomal RNAs and several proteins, molecular phylogeneticists have constructed a "universal tree of life," taking it as the basis for a "natural" hierarchical classification of all living things. Although confidence in some of the tree's early branches has recently been shaken, new approaches could still resolve many methodological uncertainties. More challenging is evidence that most archaeal and bacterial genomes (and the inferred ancestral eukaryotic nuclear genome) contain genes from multiple sources. If "chimerism" or "lateral gene transfer" cannot be dismissed as trivial in extent or limited to special categories of genes, then no hierarchical universal classification can be taken as natural. Molecular phylogeneticists will have failed to find the "true tree," not because their methods are inadequate or because they have chosen the wrong genes, but because the history of life cannot properly be represented as a tree. However, taxonomies based on molecular sequences will remain indispensable, and understanding of the evolutionary process will ultimately be enriched, not impoverished.

  11. A new tree classification system for southern hardwoods

    Science.gov (United States)

    James S. Meadows; Daniel A. Jr. Skojac

    2008-01-01

    A new tree classification system for southern hardwoods is described. The new system is based on the Putnam tree classification system, originally developed by Putnam et al., 1960, Management ond inventory of southern hardwoods, Agriculture Handbook 181, US For. Sew., Washington, DC, which consists of four tree classes: (1) preferred growing stock, (2) reserve growing...

  12. Transferability of decision trees for land cover classification in a ...

    African Journals Online (AJOL)

    This paper attempts to derive classification rules from training data of four Landsat-8 scenes by using the classification and regression tree (CART) implementation of the decision tree algorithm. The transferability of the ruleset was evaluated by classifying two adjacent scenes. The classification of the four mosaicked scenes ...

  13. Tree Classification with Fused Mobile Laser Scanning and Hyperspectral Data

    Science.gov (United States)

    Puttonen, Eetu; Jaakkola, Anttoni; Litkey, Paula; Hyyppä, Juha

    2011-01-01

    Mobile Laser Scanning data were collected simultaneously with hyperspectral data using the Finnish Geodetic Institute Sensei system. The data were tested for tree species classification. The test area was an urban garden in the City of Espoo, Finland. Point clouds representing 168 individual tree specimens of 23 tree species were determined manually. The classification of the trees was done using first only the spatial data from point clouds, then with only the spectral data obtained with a spectrometer, and finally with the combined spatial and hyperspectral data from both sensors. Two classification tests were performed: the separation of coniferous and deciduous trees, and the identification of individual tree species. All determined tree specimens were used in distinguishing coniferous and deciduous trees. A subset of 133 trees and 10 tree species was used in the tree species classification. The best classification results for the fused data were 95.8% for the separation of the coniferous and deciduous classes. The best overall tree species classification succeeded with 83.5% accuracy for the best tested fused data feature combination. The respective results for paired structural features derived from the laser point cloud were 90.5% for the separation of the coniferous and deciduous classes and 65.4% for the species classification. Classification accuracies with paired hyperspectral reflectance value data were 90.5% for the separation of coniferous and deciduous classes and 62.4% for different species. The results are among the first of their kind and they show that mobile collected fused data outperformed single-sensor data in both classification tests and by a significant margin. PMID:22163894

  14. Decision tree approach for classification of remotely sensed satellite ...

    Indian Academy of Sciences (India)

    sensed satellite data using open source support. Richa Sharma .... Decision tree classification techniques have been .... the USGS Earth Resource Observation Systems. (EROS) ... for shallow water, 11% were for sparse and dense built-up ...

  15. Decision tree approach for classification of remotely sensed satellite

    Indian Academy of Sciences (India)

    DTC) algorithm for classification of remotely sensed satellite data (Landsat TM) using open source support. The decision tree is constructed by recursively partitioning the spectral distribution of the training dataset using WEKA, open source ...

  16. Decision tree methods: applications for classification and prediction.

    Science.gov (United States)

    Song, Yan-Yan; Lu, Ying

    2015-04-25

    Decision tree methodology is a commonly used data mining method for establishing classification systems based on multiple covariates or for developing prediction algorithms for a target variable. This method classifies a population into branch-like segments that construct an inverted tree with a root node, internal nodes, and leaf nodes. The algorithm is non-parametric and can efficiently deal with large, complicated datasets without imposing a complicated parametric structure. When the sample size is large enough, study data can be divided into training and validation datasets. Using the training dataset to build a decision tree model and a validation dataset to decide on the appropriate tree size needed to achieve the optimal final model. This paper introduces frequently used algorithms used to develop decision trees (including CART, C4.5, CHAID, and QUEST) and describes the SPSS and SAS programs that can be used to visualize tree structure.

  17. Boosted classification trees result in minor to modest improvement in the accuracy in classifying cardiovascular outcomes compared to conventional classification trees

    Science.gov (United States)

    Austin, Peter C; Lee, Douglas S

    2011-01-01

    Purpose: Classification trees are increasingly being used to classifying patients according to the presence or absence of a disease or health outcome. A limitation of classification trees is their limited predictive accuracy. In the data-mining and machine learning literature, boosting has been developed to improve classification. Boosting with classification trees iteratively grows classification trees in a sequence of reweighted datasets. In a given iteration, subjects that were misclassified in the previous iteration are weighted more highly than subjects that were correctly classified. Classifications from each of the classification trees in the sequence are combined through a weighted majority vote to produce a final classification. The authors' objective was to examine whether boosting improved the accuracy of classification trees for predicting outcomes in cardiovascular patients. Methods: We examined the utility of boosting classification trees for classifying 30-day mortality outcomes in patients hospitalized with either acute myocardial infarction or congestive heart failure. Results: Improvements in the misclassification rate using boosted classification trees were at best minor compared to when conventional classification trees were used. Minor to modest improvements to sensitivity were observed, with only a negligible reduction in specificity. For predicting cardiovascular mortality, boosted classification trees had high specificity, but low sensitivity. Conclusions: Gains in predictive accuracy for predicting cardiovascular outcomes were less impressive than gains in performance observed in the data mining literature. PMID:22254181

  18. Aneurysmal subarachnoid hemorrhage prognostic decision-making algorithm using classification and regression tree analysis.

    Science.gov (United States)

    Lo, Benjamin W Y; Fukuda, Hitoshi; Angle, Mark; Teitelbaum, Jeanne; Macdonald, R Loch; Farrokhyar, Forough; Thabane, Lehana; Levine, Mitchell A H

    2016-01-01

    Classification and regression tree analysis involves the creation of a decision tree by recursive partitioning of a dataset into more homogeneous subgroups. Thus far, there is scarce literature on using this technique to create clinical prediction tools for aneurysmal subarachnoid hemorrhage (SAH). The classification and regression tree analysis technique was applied to the multicenter Tirilazad database (3551 patients) in order to create the decision-making algorithm. In order to elucidate prognostic subgroups in aneurysmal SAH, neurologic, systemic, and demographic factors were taken into account. The dependent variable used for analysis was the dichotomized Glasgow Outcome Score at 3 months. Classification and regression tree analysis revealed seven prognostic subgroups. Neurological grade, occurrence of post-admission stroke, occurrence of post-admission fever, and age represented the explanatory nodes of this decision tree. Split sample validation revealed classification accuracy of 79% for the training dataset and 77% for the testing dataset. In addition, the occurrence of fever at 1-week post-aneurysmal SAH is associated with increased odds of post-admission stroke (odds ratio: 1.83, 95% confidence interval: 1.56-2.45, P tree was generated, which serves as a prediction tool to guide bedside prognostication and clinical treatment decision making. This prognostic decision-making algorithm also shed light on the complex interactions between a number of risk factors in determining outcome after aneurysmal SAH.

  19. Time Series of Images to Improve Tree Species Classification

    Science.gov (United States)

    Miyoshi, G. T.; Imai, N. N.; de Moraes, M. V. A.; Tommaselli, A. M. G.; Näsi, R.

    2017-10-01

    Tree species classification provides valuable information to forest monitoring and management. The high floristic variation of the tree species appears as a challenging issue in the tree species classification because the vegetation characteristics changes according to the season. To help to monitor this complex environment, the imaging spectroscopy has been largely applied since the development of miniaturized sensors attached to Unmanned Aerial Vehicles (UAV). Considering the seasonal changes in forests and the higher spectral and spatial resolution acquired with sensors attached to UAV, we present the use of time series of images to classify four tree species. The study area is an Atlantic Forest area located in the western part of São Paulo State. Images were acquired in August 2015 and August 2016, generating three data sets of images: only with the image spectra of 2015; only with the image spectra of 2016; with the layer stacking of images from 2015 and 2016. Four tree species were classified using Spectral angle mapper (SAM), Spectral information divergence (SID) and Random Forest (RF). The results showed that SAM and SID caused an overfitting of the data whereas RF showed better results and the use of the layer stacking improved the classification achieving a kappa coefficient of 18.26 %.

  20. Deep Multi-Task Learning for Tree Genera Classification

    Science.gov (United States)

    Ko, C.; Kang, J.; Sohn, G.

    2018-05-01

    The goal for our paper is to classify tree genera using airborne Light Detection and Ranging (LiDAR) data with Convolution Neural Network (CNN) - Multi-task Network (MTN) implementation. Unlike Single-task Network (STN) where only one task is assigned to the learning outcome, MTN is a deep learning architect for learning a main task (classification of tree genera) with other tasks (in our study, classification of coniferous and deciduous) simultaneously, with shared classification features. The main contribution of this paper is to improve classification accuracy from CNN-STN to CNN-MTN. This is achieved by introducing a concurrence loss (Lcd) to the designed MTN. This term regulates the overall network performance by minimizing the inconsistencies between the two tasks. Results show that we can increase the classification accuracy from 88.7 % to 91.0 % (from STN to MTN). The second goal of this paper is to solve the problem of small training sample size by multiple-view data generation. The motivation of this goal is to address one of the most common problems in implementing deep learning architecture, the insufficient number of training data. We address this problem by simulating training dataset with multiple-view approach. The promising results from this paper are providing a basis for classifying a larger number of dataset and number of classes in the future.

  1. Classification tree for the assessment of sedentary lifestyle among hypertensive.

    Science.gov (United States)

    Castelo Guedes Martins, Larissa; Venícios de Oliveira Lopes, Marcos; Gomes Guedes, Nirla; Paixão de Menezes, Angélica; de Oliveira Farias, Odaleia; Alves Dos Santos, Naftale

    2016-04-01

    To develop a classification tree of clinical indicators for the correct prediction of the nursing diagnosis "Sedentary lifestyle" (SL) in people with high blood pressure (HTN). A cross-sectional study conducted in an outpatient care center specializing in high blood pressure and Mellitus diabetes located in northeastern Brazil. The sample consisted of 285 people between 19 and 59 years old diagnosed with high blood pressure and was applied an interview and physical examination, obtaining socio-demographic information, related factors and signs and symptoms that made the defining characteristics for the diagnosis under study. The tree was generated using the CHAID algorithm (Chi-square Automatic Interaction Detection). The construction of the decision tree allowed establishing the interactions between clinical indicators that facilitate a probabilistic analysis of multiple situations allowing quantify the probability of an individual presenting a sedentary lifestyle. The tree included the clinical indicator Choose daily routine without exercise as the first node. People with this indicator showed a probability of 0.88 of presenting the SL. The second node was composed of the indicator Does not perform physical activity during leisure, with 0.99 probability of presenting the SL with these two indicators. The predictive capacity of the tree was established at 69.5%. Decision trees help nurses who care HTN people in decision-making in assessing the characteristics that increase the probability of SL nursing diagnosis, optimizing the time for diagnostic inference.

  2. Classification tree for the assessment of sedentary lifestyle among hypertensive

    Directory of Open Access Journals (Sweden)

    Larissa Castelo Guedes Martins

    Full Text Available Objective.To develop a classification tree of clinical indicators for the correct prediction of the nursing diagnosis "Sedentary lifestyle" (SL in people with high blood pressure (HTN. Methods. A cross-sectional study conducted in an outpatient care center specializing in high blood pressure and Mellitus diabetes located in northeastern Brazil. The sample consisted of 285 people between 19 and 59 years old diagnosed with high blood pressure and was applied an interview and physical examination, obtaining socio-demographic information, related factors and signs and symptoms that made the defining characteristics for the diagnosis under study. The tree was generated using the CHAID algorithm (Chi-square Automatic Interaction Detection. Results. The construction of the decision tree allowed establishing the interactions between clinical indicators that facilitate a probabilistic analysis of multiple situations allowing quantify the probability of an individual presenting a sedentary lifestyle. The tree included the clinical indicator Choose daily routine without exercise as the first node. People with this indicator showed a probability of 0.88 of presenting the SL. The second node was composed of the indicator Does not perform physical activity during leisure, with 0.99 probability of presenting the SL with these two indicators. The predictive capacity of the tree was established at 69.5%. Conclusion. Decision trees help nurses who care HTN people in decision-making in assessing the characteristics that increase the probability of SL nursing diagnosis, optimizing the time for diagnostic inference.

  3. Classification tree for the assessment of sedentary lifestyle among hypertensive

    OpenAIRE

    Castelo Guedes Martins, Larissa; Venícios de Oliveira Lopes, Marcos; Gomes Guedes, Nirla; Paixão de Menezes, Angélica; de Oliveira Farias, Odaleia; Alves dos Santos, Naftale

    2016-01-01

    Objective.To develop a classification tree of clinical indicators for the correct prediction of the nursing diagnosis "Sedentary lifestyle" (SL) in people with high blood pressure (HTN). Methods. A cross-sectional study conducted in an outpatient care center specializing in high blood pressure and Mellitus diabetes located in northeastern Brazil. The sample consisted of 285 people between 19 and 59 years old diagnosed with high blood pressure and was applied an interview and physical examinat...

  4. Hierarchical classification with a competitive evolutionary neural tree.

    Science.gov (United States)

    Adams, R G.; Butchart, K; Davey, N

    1999-04-01

    A new, dynamic, tree structured network, the Competitive Evolutionary Neural Tree (CENT) is introduced. The network is able to provide a hierarchical classification of unlabelled data sets. The main advantage that the CENT offers over other hierarchical competitive networks is its ability to self determine the number, and structure, of the competitive nodes in the network, without the need for externally set parameters. The network produces stable classificatory structures by halting its growth using locally calculated heuristics. The results of network simulations are presented over a range of data sets, including Anderson's IRIS data set. The CENT network demonstrates its ability to produce a representative hierarchical structure to classify a broad range of data sets.

  5. Prediction of Infertility Treatment Outcomes Using Classification Trees

    Directory of Open Access Journals (Sweden)

    Milewska Anna Justyna

    2016-12-01

    Full Text Available Infertility is currently a common problem with causes that are often unexplained, which complicates treatment. In many cases, the use of ART methods provides the only possibility of getting pregnant. Analysis of this type of data is very complex. More and more often, data mining methods or artificial intelligence techniques are appropriate for solving such problems. In this study, classification trees were used for analysis. This resulted in obtaining a group of patients characterized most likely to get pregnant while using in vitro fertilization.

  6. 1-Skeletons of the Spanning Tree Problems with Additional Constraints

    Directory of Open Access Journals (Sweden)

    V. A. Bondarenko

    2015-01-01

    Full Text Available In this paper, we study polyhedral properties of two spanning tree problems with additional constraints. In the first problem, it is required to find a tree with a minimum sum of edge weights among all spanning trees with the number of leaves less than or equal to a given value. In the second problem, an additional constraint is the assumption that the degree of all nodes of the spanning tree does not exceed a given value. The recognition versions of both problems are NP-complete. We consider polytopes of these problems and their 1-skeletons. We prove that in both cases it is a NP-complete problem to determine whether the vertices of 1-skeleton are adjacent. Although it is possible to obtain a superpolynomial lower bounds on the clique numbers of these graphs. These values characterize the time complexity in a broad class of algorithms based on linear comparisons. The results indicate a fundamental difference between combinatorial and geometric properties of the considered problems from the classical minimum spanning tree problem.

  7. Electronic Nose Odor Classification with Advanced Decision Tree Structures

    Directory of Open Access Journals (Sweden)

    S. Guney

    2013-09-01

    Full Text Available Electronic nose (e-nose is an electronic device which can measure chemical compounds in air and consequently classify different odors. In this paper, an e-nose device consisting of 8 different gas sensors was designed and constructed. Using this device, 104 different experiments involving 11 different odor classes (moth, angelica root, rose, mint, polis, lemon, rotten egg, egg, garlic, grass, and acetone were performed. The main contribution of this paper is the finding that using the chemical domain knowledge it is possible to train an accurate odor classification system. The domain knowledge about chemical compounds is represented by a decision tree whose nodes are composed of classifiers such as Support Vector Machines and k-Nearest Neighbor. The overall accuracy achieved with the proposed algorithm and the constructed e-nose device was 97.18 %. Training and testing data sets used in this paper are published online.

  8. Modeling Ecosystem Services for Park Trees: Sensitivity of i-Tree Eco Simulations to Light Exposure and Tree Species Classification

    Directory of Open Access Journals (Sweden)

    Rocco Pace

    2018-02-01

    Full Text Available Ecosystem modeling can help decision making regarding planting of urban trees for climate change mitigation and air pollution reduction. Algorithms and models that link the properties of plant functional types, species groups, or single species to their impact on specific ecosystem services have been developed. However, these models require a considerable effort for initialization that is inherently related to uncertainties originating from the high diversity of plant species in urban areas. We therefore suggest a new automated method to be used with the i-Tree Eco model to derive light competition for individual trees and investigate the importance of this property. Since competition depends also on the species, which is difficult to determine from increasingly used remote sensing methodologies, we also investigate the impact of uncertain tree species classification on the ecosystem services by comparing a species-specific inventory determined by field observation with a genus-specific categorization and a model initialization for the dominant deciduous and evergreen species only. Our results show how the simulation of competition affects the determination of carbon sequestration, leaf area, and related ecosystem services and that the proposed method provides a tool for improving estimations. Misclassifications of tree species can lead to large deviations in estimates of ecosystem impacts, particularly concerning biogenic volatile compound emissions. In our test case, monoterpene emissions almost doubled and isoprene emissions decreased to less than 10% when species were estimated to belong only to either two groups instead of being determined by species or genus. It is discussed that this uncertainty of emission estimates propagates further uncertainty in the estimation of potential ozone formation. Overall, we show the importance of using an individual light competition approach and explicitly parameterizing all ecosystem functions at the

  9. Univariate decision tree induction using maximum margin classification

    OpenAIRE

    Yıldız, Olcay Taner

    2012-01-01

    In many pattern recognition applications, first decision trees are used due to their simplicity and easily interpretable nature. In this paper, we propose a new decision tree learning algorithm called univariate margin tree where, for each continuous attribute, the best split is found using convex optimization. Our simulation results on 47 data sets show that the novel margin tree classifier performs at least as good as C4.5 and linear discriminant tree (LDT) with a similar time complexity. F...

  10. Tree species classification using within crown localization of waveform LiDAR attributes

    Science.gov (United States)

    Blomley, Rosmarie; Hovi, Aarne; Weinmann, Martin; Hinz, Stefan; Korpela, Ilkka; Jutzi, Boris

    2017-11-01

    Since forest planning is increasingly taking an ecological, diversity-oriented perspective into account, remote sensing technologies are becoming ever more important in assessing existing resources with reduced manual effort. While the light detection and ranging (LiDAR) technology provides a good basis for predictions of tree height and biomass, tree species identification based on this type of data is particularly challenging in structurally heterogeneous forests. In this paper, we analyse existing approaches with respect to the geometrical scale of feature extraction (whole tree, within crown partitions or within laser footprint) and conclude that currently features are always extracted separately from the different scales. Since multi-scale approaches however have proven successful in other applications, we aim to utilize the within-tree-crown distribution of within-footprint signal characteristics as additional features. To do so, a spin image algorithm, originally devised for the extraction of 3D surface features in object recognition, is adapted. This algorithm relies on spinning an image plane around a defined axis, e.g. the tree stem, collecting the number of LiDAR returns or mean values of returns attributes per pixel as respective values. Based on this representation, spin image features are extracted that comprise only those components of highest variability among a given set of library trees. The relative performance and the combined improvement of these spin image features with respect to non-spatial statistical metrics of the waveform (WF) attributes are evaluated for the tree species classification of Scots pine (Pinus sylvestris L.), Norway spruce (Picea abies (L.) Karst.) and Silver/Downy birch (Betula pendula Roth/Betula pubescens Ehrh.) in a boreal forest environment. This evaluation is performed for two WF LiDAR datasets that differ in footprint size, pulse density at ground, laser wavelength and pulse width. Furthermore, we evaluate the

  11. The process and utility of classification and regression tree methodology in nursing research.

    Science.gov (United States)

    Kuhn, Lisa; Page, Karen; Ward, John; Worrall-Carter, Linda

    2014-06-01

    This paper presents a discussion of classification and regression tree analysis and its utility in nursing research. Classification and regression tree analysis is an exploratory research method used to illustrate associations between variables not suited to traditional regression analysis. Complex interactions are demonstrated between covariates and variables of interest in inverted tree diagrams. Discussion paper. English language literature was sourced from eBooks, Medline Complete and CINAHL Plus databases, Google and Google Scholar, hard copy research texts and retrieved reference lists for terms including classification and regression tree* and derivatives and recursive partitioning from 1984-2013. Classification and regression tree analysis is an important method used to identify previously unknown patterns amongst data. Whilst there are several reasons to embrace this method as a means of exploratory quantitative research, issues regarding quality of data as well as the usefulness and validity of the findings should be considered. Classification and regression tree analysis is a valuable tool to guide nurses to reduce gaps in the application of evidence to practice. With the ever-expanding availability of data, it is important that nurses understand the utility and limitations of the research method. Classification and regression tree analysis is an easily interpreted method for modelling interactions between health-related variables that would otherwise remain obscured. Knowledge is presented graphically, providing insightful understanding of complex and hierarchical relationships in an accessible and useful way to nursing and other health professions. © 2013 The Authors. Journal of Advanced Nursing Published by John Wiley & Sons Ltd.

  12. Assessment and classification of hazardous street trees in University ...

    African Journals Online (AJOL)

    The study was carried out to assessed and classified hazardous trees within the University of Ibadan (UI) campus, Oyo State, Nigeria. The study population was 25 municipal tree species comprising of 420 individual trees located along the major roads of the study area, which were considered hazardous to the community.

  13. Predictive mapping of soil organic carbon in wet cultivated lands using classification-tree based models

    DEFF Research Database (Denmark)

    Kheir, Rania Bou; Greve, Mogens Humlekrog; Bøcher, Peder Klith

    2010-01-01

    the geographic distribution of SOC across Denmark using remote sensing (RS), geographic information systems (GISs) and decision-tree modeling (un-pruned and pruned classification trees). Seventeen parameters, i.e. parent material, soil type, landscape type, elevation, slope gradient, slope aspect, mean curvature...... field measurements in the area of interest (Denmark). A large number of tree-based classification models (588) were developed using (i) all of the parameters, (ii) all Digital Elevation Model (DEM) parameters only, (iii) the primary DEM parameters only, (iv), the remote sensing (RS) indices only, (v......) selected pairs of parameters, (vi) soil type, parent material and landscape type only, and (vii) the parameters having a high impact on SOC distribution in built pruned trees. The best constructed classification tree models (in the number of three) with the lowest misclassification error (ME...

  14. Multiple Additive Regression Trees a Methodology for Predictive Data Mining for Fraud Detection

    National Research Council Canada - National Science Library

    da

    2002-01-01

    ...) is using new and innovative techniques for fraud detection. Their primary techniques for fraud detection are the data mining tools of classification trees and neural networks as well as methods for pooling the results of multiple model fits...

  15. The Hybrid of Classification Tree and Extreme Learning Machine for Permeability Prediction in Oil Reservoir

    KAUST Repository

    Prasetyo Utomo, Chandra

    2011-01-01

    the permeability value. These are based on the well logs data. In order to handle the high range of the permeability value, a classification tree is utilized. A benefit of this innovation is that the tree represents knowledge in a clear and succinct fashion

  16. Classification of premalignant pancreatic cancer mass-spectrometry data using decision tree ensembles

    Directory of Open Access Journals (Sweden)

    Wong G William

    2008-06-01

    Full Text Available Abstract Background Pancreatic cancer is the fourth leading cause of cancer death in the United States. Consequently, identification of clinically relevant biomarkers for the early detection of this cancer type is urgently needed. In recent years, proteomics profiling techniques combined with various data analysis methods have been successfully used to gain critical insights into processes and mechanisms underlying pathologic conditions, particularly as they relate to cancer. However, the high dimensionality of proteomics data combined with their relatively small sample sizes poses a significant challenge to current data mining methodology where many of the standard methods cannot be applied directly. Here, we propose a novel methodological framework using machine learning method, in which decision tree based classifier ensembles coupled with feature selection methods, is applied to proteomics data generated from premalignant pancreatic cancer. Results This study explores the utility of three different feature selection schemas (Student t test, Wilcoxon rank sum test and genetic algorithm to reduce the high dimensionality of a pancreatic cancer proteomic dataset. Using the top features selected from each method, we compared the prediction performances of a single decision tree algorithm C4.5 with six different decision-tree based classifier ensembles (Random forest, Stacked generalization, Bagging, Adaboost, Logitboost and Multiboost. We show that ensemble classifiers always outperform single decision tree classifier in having greater accuracies and smaller prediction errors when applied to a pancreatic cancer proteomics dataset. Conclusion In our cross validation framework, classifier ensembles generally have better classification accuracies compared to that of a single decision tree when applied to a pancreatic cancer proteomic dataset, thus suggesting its utility in future proteomics data analysis. Additionally, the use of feature selection

  17. Vlsi implementation of flexible architecture for decision tree classification in data mining

    Science.gov (United States)

    Sharma, K. Venkatesh; Shewandagn, Behailu; Bhukya, Shankar Nayak

    2017-07-01

    The Data mining algorithms have become vital to researchers in science, engineering, medicine, business, search and security domains. In recent years, there has been a terrific raise in the size of the data being collected and analyzed. Classification is the main difficulty faced in data mining. In a number of the solutions developed for this problem, most accepted one is Decision Tree Classification (DTC) that gives high precision while handling very large amount of data. This paper presents VLSI implementation of flexible architecture for Decision Tree classification in data mining using c4.5 algorithm.

  18. Building classification trees to explain the radioactive contamination levels of the plants

    International Nuclear Information System (INIS)

    Briand, B.

    2008-04-01

    The objective of this thesis is the development of a method allowing the identification of factors leading to various radioactive contamination levels of the plants. The methodology suggested is based on the use of a radioecological transfer model of the radionuclides through the environment (A.S.T.R.A.L. computer code) and a classification-tree method. Particularly, to avoid the instability problems of classification trees and to preserve the tree structure, a node level stabilizing technique is used. Empirical comparisons are carried out between classification trees built by this method (called R.E.N. method) and those obtained by the C.A.R.T. method. A similarity measure is defined to compare the structure of two classification trees. This measure is used to study the stabilizing performance of the R.E.N. method. The methodology suggested is applied to a simplified contamination scenario. By the results obtained, we can identify the main variables responsible of the various radioactive contamination levels of four leafy-vegetables (lettuce, cabbage, spinach and leek). Some extracted rules from these classification trees can be usable in a post-accidental context. (author)

  19. PCA based feature reduction to improve the accuracy of decision tree c4.5 classification

    Science.gov (United States)

    Nasution, M. Z. F.; Sitompul, O. S.; Ramli, M.

    2018-03-01

    Splitting attribute is a major process in Decision Tree C4.5 classification. However, this process does not give a significant impact on the establishment of the decision tree in terms of removing irrelevant features. It is a major problem in decision tree classification process called over-fitting resulting from noisy data and irrelevant features. In turns, over-fitting creates misclassification and data imbalance. Many algorithms have been proposed to overcome misclassification and overfitting on classifications Decision Tree C4.5. Feature reduction is one of important issues in classification model which is intended to remove irrelevant data in order to improve accuracy. The feature reduction framework is used to simplify high dimensional data to low dimensional data with non-correlated attributes. In this research, we proposed a framework for selecting relevant and non-correlated feature subsets. We consider principal component analysis (PCA) for feature reduction to perform non-correlated feature selection and Decision Tree C4.5 algorithm for the classification. From the experiments conducted using available data sets from UCI Cervical cancer data set repository with 858 instances and 36 attributes, we evaluated the performance of our framework based on accuracy, specificity and precision. Experimental results show that our proposed framework is robust to enhance classification accuracy with 90.70% accuracy rates.

  20. Object-based methods for individual tree identification and tree species classification from high-spatial resolution imagery

    Science.gov (United States)

    Wang, Le

    2003-10-01

    Modern forest management poses an increasing need for detailed knowledge of forest information at different spatial scales. At the forest level, the information for tree species assemblage is desired whereas at or below the stand level, individual tree related information is preferred. Remote Sensing provides an effective tool to extract the above information at multiple spatial scales in the continuous time domain. To date, the increasing volume and readily availability of high-spatial-resolution data have lead to a much wider application of remotely sensed products. Nevertheless, to make effective use of the improving spatial resolution, conventional pixel-based classification methods are far from satisfactory. Correspondingly, developing object-based methods becomes a central challenge for researchers in the field of Remote Sensing. This thesis focuses on the development of methods for accurate individual tree identification and tree species classification. We develop a method in which individual tree crown boundaries and treetop locations are derived under a unified framework. We apply a two-stage approach with edge detection followed by marker-controlled watershed segmentation. Treetops are modeled from radiometry and geometry aspects. Specifically, treetops are assumed to be represented by local radiation maxima and to be located near the center of the tree-crown. As a result, a marker image was created from the derived treetop to guide a watershed segmentation to further differentiate overlapping trees and to produce a segmented image comprised of individual tree crowns. The image segmentation method developed achieves a promising result for a 256 x 256 CASI image. Then further effort is made to extend our methods to the multiscales which are constructed from a wavelet decomposition. A scale consistency and geometric consistency are designed to examine the gradients along the scale-space for the purpose of separating true crown boundary from unwanted

  1. Optimizing tree-species classification in hyperspectal images

    CSIR Research Space (South Africa)

    Barnard, E

    2010-11-01

    Full Text Available for classification. Scaling of these components so that all features have equal variance is found to be useful, and their best performance (88.9% accurate classification) is achieved with 15 scaled features and a support vector machine as classifier. A graphical...

  2. Towards a natural classification and backbone tree for Sordariomycete

    Digital Repository Service at National Institute of Oceanography (India)

    Maharachchikumbura, S.S.N.; Hyde, K.D.; Jones, E.B.G.; McKenzie, E.H.C.; Huang, S.-K.; Abdel-Wahab, M.A.; Daranagama, D.A.; Dayarathne, M.; D'souza, M.J.; Goonasekara, I.D.; Hongsanan, S.; Jayawardena, R.S.; Kirk, P.M.; Konta, S.; Liu, J.-K.; Liu, Z.-Y.; Norphanphoun, C.; Pang, K.-L.; Perera, R.H.; Senanayake, I.C.; Shang, Q.; Shenoy, B.D.; Xiao, Y.; Bahkali, A.H.; Kang, J.; Somrothipol, S.; Suetrong, S.; Wen, T.; Xu, J.

    , lichenized or lichenicolous taxa The class includes freshwater, marine and terrestrial taxa and has a worldwide distribution This paper provides an updated outline of the Sordariomycetes and a backbone tree incorporating asexual and sexual genera in the class...

  3. Lidar-based individual tree species classification using convolutional neural network

    Science.gov (United States)

    Mizoguchi, Tomohiro; Ishii, Akira; Nakamura, Hiroyuki; Inoue, Tsuyoshi; Takamatsu, Hisashi

    2017-06-01

    Terrestrial lidar is commonly used for detailed documentation in the field of forest inventory investigation. Recent improvements of point cloud processing techniques enabled efficient and precise computation of an individual tree shape parameters, such as breast-height diameter, height, and volume. However, tree species are manually specified by skilled workers to date. Previous works for automatic tree species classification mainly focused on aerial or satellite images, and few works have been reported for classification techniques using ground-based sensor data. Several candidate sensors can be considered for classification, such as RGB or multi/hyper spectral cameras. Above all candidates, we use terrestrial lidar because it can obtain high resolution point cloud in the dark forest. We selected bark texture for the classification criteria, since they clearly represent unique characteristics of each tree and do not change their appearance under seasonable variation and aged deterioration. In this paper, we propose a new method for automatic individual tree species classification based on terrestrial lidar using Convolutional Neural Network (CNN). The key component is the creation step of a depth image which well describe the characteristics of each species from a point cloud. We focus on Japanese cedar and cypress which cover the large part of domestic forest. Our experimental results demonstrate the effectiveness of our proposed method.

  4. Improving Land Use/Land Cover Classification by Integrating Pixel Unmixing and Decision Tree Methods

    Directory of Open Access Journals (Sweden)

    Chao Yang

    2017-11-01

    Full Text Available Decision tree classification is one of the most efficient methods for obtaining land use/land cover (LULC information from remotely sensed imageries. However, traditional decision tree classification methods cannot effectively eliminate the influence of mixed pixels. This study aimed to integrate pixel unmixing and decision tree to improve LULC classification by removing mixed pixel influence. The abundance and minimum noise fraction (MNF results that were obtained from mixed pixel decomposition were added to decision tree multi-features using a three-dimensional (3D Terrain model, which was created using an image fusion digital elevation model (DEM, to select training samples (ROIs, and improve ROI separability. A Landsat-8 OLI image of the Yunlong Reservoir Basin in Kunming was used to test this proposed method. Study results showed that the Kappa coefficient and the overall accuracy of integrated pixel unmixing and decision tree method increased by 0.093% and 10%, respectively, as compared with the original decision tree method. This proposed method could effectively eliminate the influence of mixed pixels and improve the accuracy in complex LULC classifications.

  5. Maximizing the Diversity of Ensemble Random Forests for Tree Genera Classification Using High Density LiDAR Data

    Directory of Open Access Journals (Sweden)

    Connie Ko

    2016-08-01

    Full Text Available Recent research into improving the effectiveness of forest inventory management using airborne LiDAR data has focused on developing advanced theories in data analytics. Furthermore, supervised learning as a predictive model for classifying tree genera (and species, where possible has been gaining popularity in order to minimize this labor-intensive task. However, bottlenecks remain that hinder the immediate adoption of supervised learning methods. With supervised classification, training samples are required for learning the parameters that govern the performance of a classifier, yet the selection of training data is often subjective and the quality of such samples is critically important. For LiDAR scanning in forest environments, the quantification of data quality is somewhat abstract, normally referring to some metric related to the completeness of individual tree crowns; however, this is not an issue that has received much attention in the literature. Intuitively the choice of training samples having varying quality will affect classification accuracy. In this paper a Diversity Index (DI is proposed that characterizes the diversity of data quality (Qi among selected training samples required for constructing a classification model of tree genera. The training sample is diversified in terms of data quality as opposed to the number of samples per class. The diversified training sample allows the classifier to better learn the positive and negative instances and; therefore; has a higher classification accuracy in discriminating the “unknown” class samples from the “known” samples. Our algorithm is implemented within the Random Forests base classifiers with six derived geometric features from LiDAR data. The training sample contains three tree genera (pine; poplar; and maple and the validation samples contains four labels (pine; poplar; maple; and “unknown”. Classification accuracy improved from 72.8%; when training samples were

  6. Automated method for identification and artery-venous classification of vessel trees in retinal vessel networks.

    Science.gov (United States)

    Joshi, Vinayak S; Reinhardt, Joseph M; Garvin, Mona K; Abramoff, Michael D

    2014-01-01

    The separation of the retinal vessel network into distinct arterial and venous vessel trees is of high interest. We propose an automated method for identification and separation of retinal vessel trees in a retinal color image by converting a vessel segmentation image into a vessel segment map and identifying the individual vessel trees by graph search. Orientation, width, and intensity of each vessel segment are utilized to find the optimal graph of vessel segments. The separated vessel trees are labeled as primary vessel or branches. We utilize the separated vessel trees for arterial-venous (AV) classification, based on the color properties of the vessels in each tree graph. We applied our approach to a dataset of 50 fundus images from 50 subjects. The proposed method resulted in an accuracy of 91.44% correctly classified vessel pixels as either artery or vein. The accuracy of correctly classified major vessel segments was 96.42%.

  7. CROWN-LEVEL TREE SPECIES CLASSIFICATION USING INTEGRATED AIRBORNE HYPERSPECTRAL AND LIDAR REMOTE SENSING DATA

    Directory of Open Access Journals (Sweden)

    Z. Wang

    2018-05-01

    Full Text Available Mapping tree species is essential for sustainable planning as well as to improve our understanding of the role of different trees as different ecological service. However, crown-level tree species automatic classification is a challenging task due to the spectral similarity among diversified tree species, fine-scale spatial variation, shadow, and underlying objects within a crown. Advanced remote sensing data such as airborne Light Detection and Ranging (LiDAR and hyperspectral imagery offer a great potential opportunity to derive crown spectral, structure and canopy physiological information at the individual crown scale, which can be useful for mapping tree species. In this paper, an innovative approach was developed for tree species classification at the crown level. The method utilized LiDAR data for individual tree crown delineation and morphological structure extraction, and Compact Airborne Spectrographic Imager (CASI hyperspectral imagery for pure crown-scale spectral extraction. Specifically, four steps were include: 1 A weighted mean filtering method was developed to improve the accuracy of the smoothed Canopy Height Model (CHM derived from LiDAR data; 2 The marker-controlled watershed segmentation algorithm was, therefore, also employed to delineate the tree-level canopy from the CHM image in this study, and then individual tree height and tree crown were calculated according to the delineated crown; 3 Spectral features within 3 × 3 neighborhood regions centered on the treetops detected by the treetop detection algorithm were derived from the spectrally normalized CASI imagery; 4 The shape characteristics related to their crown diameters and heights were established, and different crown-level tree species were classified using the combination of spectral and shape characteristics. Analysis of results suggests that the developed classification strategy in this paper (OA = 85.12 %, Kc = 0.90 performed better than Li

  8. Crown-Level Tree Species Classification Using Integrated Airborne Hyperspectral and LIDAR Remote Sensing Data

    Science.gov (United States)

    Wang, Z.; Wu, J.; Wang, Y.; Kong, X.; Bao, H.; Ni, Y.; Ma, L.; Jin, J.

    2018-05-01

    Mapping tree species is essential for sustainable planning as well as to improve our understanding of the role of different trees as different ecological service. However, crown-level tree species automatic classification is a challenging task due to the spectral similarity among diversified tree species, fine-scale spatial variation, shadow, and underlying objects within a crown. Advanced remote sensing data such as airborne Light Detection and Ranging (LiDAR) and hyperspectral imagery offer a great potential opportunity to derive crown spectral, structure and canopy physiological information at the individual crown scale, which can be useful for mapping tree species. In this paper, an innovative approach was developed for tree species classification at the crown level. The method utilized LiDAR data for individual tree crown delineation and morphological structure extraction, and Compact Airborne Spectrographic Imager (CASI) hyperspectral imagery for pure crown-scale spectral extraction. Specifically, four steps were include: 1) A weighted mean filtering method was developed to improve the accuracy of the smoothed Canopy Height Model (CHM) derived from LiDAR data; 2) The marker-controlled watershed segmentation algorithm was, therefore, also employed to delineate the tree-level canopy from the CHM image in this study, and then individual tree height and tree crown were calculated according to the delineated crown; 3) Spectral features within 3 × 3 neighborhood regions centered on the treetops detected by the treetop detection algorithm were derived from the spectrally normalized CASI imagery; 4) The shape characteristics related to their crown diameters and heights were established, and different crown-level tree species were classified using the combination of spectral and shape characteristics. Analysis of results suggests that the developed classification strategy in this paper (OA = 85.12 %, Kc = 0.90) performed better than LiDAR-metrics method (OA = 79

  9. Classification and Compression of Multi-Resolution Vectors: A Tree Structured Vector Quantizer Approach

    Science.gov (United States)

    2002-01-01

    their expression profile and for classification of cells into tumerous and non- tumerous classes. Then we will present a parallel tree method for... cancerous cells. We will use the same dataset and use tree structured classifiers with multi-resolution analysis for classifying cancerous from non- cancerous ...cells. We have the expressions of 4096 genes from 98 different cell types. Of these 98, 72 are cancerous while 26 are non- cancerous . We are interested

  10. Data Clustering and Evolving Fuzzy Decision Tree for Data Base Classification Problems

    Science.gov (United States)

    Chang, Pei-Chann; Fan, Chin-Yuan; Wang, Yen-Wen

    Data base classification suffers from two well known difficulties, i.e., the high dimensionality and non-stationary variations within the large historic data. This paper presents a hybrid classification model by integrating a case based reasoning technique, a Fuzzy Decision Tree (FDT), and Genetic Algorithms (GA) to construct a decision-making system for data classification in various data base applications. The model is major based on the idea that the historic data base can be transformed into a smaller case-base together with a group of fuzzy decision rules. As a result, the model can be more accurately respond to the current data under classifying from the inductions by these smaller cases based fuzzy decision trees. Hit rate is applied as a performance measure and the effectiveness of our proposed model is demonstrated by experimentally compared with other approaches on different data base classification applications. The average hit rate of our proposed model is the highest among others.

  11. A novel approach to internal crown characterization for coniferous tree species classification

    Science.gov (United States)

    Harikumar, A.; Bovolo, F.; Bruzzone, L.

    2016-10-01

    The knowledge about individual trees in forest is highly beneficial in forest management. High density small foot- print multi-return airborne Light Detection and Ranging (LiDAR) data can provide a very accurate information about the structural properties of individual trees in forests. Every tree species has a unique set of crown structural characteristics that can be used for tree species classification. In this paper, we use both the internal and external crown structural information of a conifer tree crown, derived from a high density small foot-print multi-return LiDAR data acquisition for species classification. Considering the fact that branches are the major building blocks of a conifer tree crown, we obtain the internal crown structural information using a branch level analysis. The structure of each conifer branch is represented using clusters in the LiDAR point cloud. We propose the joint use of the k-means clustering and geometric shape fitting, on the LiDAR data projected onto a novel 3-dimensional space, to identify branch clusters. After mapping the identified clusters back to the original space, six internal geometric features are estimated using a branch-level analysis. The external crown characteristics are modeled by using six least correlated features based on cone fitting and convex hull. Species classification is performed using a sparse Support Vector Machines (sparse SVM) classifier.

  12. An object-oriented classification method of high resolution imagery based on improved AdaTree

    International Nuclear Information System (INIS)

    Xiaohe, Zhang; Liang, Zhai; Jixian, Zhang; Huiyong, Sang

    2014-01-01

    With the popularity of the application using high spatial resolution remote sensing image, more and more studies paid attention to object-oriented classification on image segmentation as well as automatic classification after image segmentation. This paper proposed a fast method of object-oriented automatic classification. First, edge-based or FNEA-based segmentation was used to identify image objects and the values of most suitable attributes of image objects for classification were calculated. Then a certain number of samples from the image objects were selected as training data for improved AdaTree algorithm to get classification rules. Finally, the image objects could be classified easily using these rules. In the AdaTree, we mainly modified the final hypothesis to get classification rules. In the experiment with WorldView2 image, the result of the method based on AdaTree showed obvious accuracy and efficient improvement compared with the method based on SVM with the kappa coefficient achieving 0.9242

  13. Iqpc 2015 Track: Tree Separation and Classification in Mobile Mapping LIDAR Data

    Science.gov (United States)

    Gorte, B.; Oude Elberink, S.; Sirmacek, B.; Wang, J.

    2015-08-01

    The European FP7 project IQmulus yearly organizes several processing contests, where submissions are requested for novel algorithms for point cloud and other big geodata processing. This paper describes the set-up and execution of a contest having the purpose to evaluate state-of-the-art algorithms for Mobile Mapping System point clouds, in order to detect and identify (individual) trees. By the nature of MMS these are trees in the vicinity of the road network (rather than in forests). Therefore, part of the challenge is distinguishing between trees and other objects, such as buildings, street furniture, cars etc. Three submitted segmentation and classification algorithms are thus evaluated.

  14. Classification of Parkinsonian syndromes from FDG-PET brain data using decision trees with SSM/PCA features.

    Science.gov (United States)

    Mudali, D; Teune, L K; Renken, R J; Leenders, K L; Roerdink, J B T M

    2015-01-01

    Medical imaging techniques like fluorodeoxyglucose positron emission tomography (FDG-PET) have been used to aid in the differential diagnosis of neurodegenerative brain diseases. In this study, the objective is to classify FDG-PET brain scans of subjects with Parkinsonian syndromes (Parkinson's disease, multiple system atrophy, and progressive supranuclear palsy) compared to healthy controls. The scaled subprofile model/principal component analysis (SSM/PCA) method was applied to FDG-PET brain image data to obtain covariance patterns and corresponding subject scores. The latter were used as features for supervised classification by the C4.5 decision tree method. Leave-one-out cross validation was applied to determine classifier performance. We carried out a comparison with other types of classifiers. The big advantage of decision tree classification is that the results are easy to understand by humans. A visual representation of decision trees strongly supports the interpretation process, which is very important in the context of medical diagnosis. Further improvements are suggested based on enlarging the number of the training data, enhancing the decision tree method by bagging, and adding additional features based on (f)MRI data.

  15. Using the PDD Behavior Inventory as a Level 2 Screener: A Classification and Regression Trees Analysis

    Science.gov (United States)

    Cohen, Ira L.; Liu, Xudong; Hudson, Melissa; Gillis, Jennifer; Cavalari, Rachel N. S.; Romanczyk, Raymond G.; Karmel, Bernard Z.; Gardner, Judith M.

    2016-01-01

    In order to improve discrimination accuracy between Autism Spectrum Disorder (ASD) and similar neurodevelopmental disorders, a data mining procedure, Classification and Regression Trees (CART), was used on a large multi-site sample of PDD Behavior Inventory (PDDBI) forms on children with and without ASD. Discrimination accuracy exceeded 80%,…

  16. Effects of sample survey design on the accuracy of classification tree models in species distribution models

    Science.gov (United States)

    Thomas C. Edwards; D. Richard Cutler; Niklaus E. Zimmermann; Linda Geiser; Gretchen G. Moisen

    2006-01-01

    We evaluated the effects of probabilistic (hereafter DESIGN) and non-probabilistic (PURPOSIVE) sample surveys on resultant classification tree models for predicting the presence of four lichen species in the Pacific Northwest, USA. Models derived from both survey forms were assessed using an independent data set (EVALUATION). Measures of accuracy as gauged by...

  17. A novel transferable individual tree crown delineation model based on Fishing Net Dragging and boundary classification

    Science.gov (United States)

    Liu, Tao; Im, Jungho; Quackenbush, Lindi J.

    2015-12-01

    This study provides a novel approach to individual tree crown delineation (ITCD) using airborne Light Detection and Ranging (LiDAR) data in dense natural forests using two main steps: crown boundary refinement based on a proposed Fishing Net Dragging (FiND) method, and segment merging based on boundary classification. FiND starts with approximate tree crown boundaries derived using a traditional watershed method with Gaussian filtering and refines these boundaries using an algorithm that mimics how a fisherman drags a fishing net. Random forest machine learning is then used to classify boundary segments into two classes: boundaries between trees and boundaries between branches that belong to a single tree. Three groups of LiDAR-derived features-two from the pseudo waveform generated along with crown boundaries and one from a canopy height model (CHM)-were used in the classification. The proposed ITCD approach was tested using LiDAR data collected over a mountainous region in the Adirondack Park, NY, USA. Overall accuracy of boundary classification was 82.4%. Features derived from the CHM were generally more important in the classification than the features extracted from the pseudo waveform. A comprehensive accuracy assessment scheme for ITCD was also introduced by considering both area of crown overlap and crown centroids. Accuracy assessment using this new scheme shows the proposed ITCD achieved 74% and 78% as overall accuracy, respectively, for deciduous and mixed forest.

  18. Wall-to-wall tree type classification using airborne lidar data and CIR images

    DEFF Research Database (Denmark)

    Schumacher, Johannes; Nord-Larsen, Thomas

    2014-01-01

    analysed at the individual tree level (object-based). However, due to computational challenges, most object-based studies cover only smaller areas and experience of larger areas is lacking. We present an approach for an object-based, unsupervised classification of trees into broadleaf or conifer using......-based classification of the TST plots showed an overall accuracy of 84% and a kappa coefficient () of 0.61 when using all plots, and 92% and 0.79, respectively, when leaving out plots with larch. NFI plots were assigned to conifer- or broadleaf-dominated or mixed depending on the area covered by the segments...... of the two tree types. In areas where lidar data were collected specifically during leaf-off conditions, 71% of the NFI plots were assigned correctly into the three categories with = 0.53. Using only NFI plots dominated by one type (broadleaf or conifer), 78% were categorized correctly with = 0...

  19. Woodland Mapping at Single-Tree Levels Using Object-Oriented Classification of Unmanned Aerial Vehicle (uav) Images

    Science.gov (United States)

    Chenari, A.; Erfanifard, Y.; Dehghani, M.; Pourghasemi, H. R.

    2017-09-01

    Remotely sensed datasets offer a reliable means to precisely estimate biophysical characteristics of individual species sparsely distributed in open woodlands. Moreover, object-oriented classification has exhibited significant advantages over different classification methods for delineation of tree crowns and recognition of species in various types of ecosystems. However, it still is unclear if this widely-used classification method can have its advantages on unmanned aerial vehicle (UAV) digital images for mapping vegetation cover at single-tree levels. In this study, UAV orthoimagery was classified using object-oriented classification method for mapping a part of wild pistachio nature reserve in Zagros open woodlands, Fars Province, Iran. This research focused on recognizing two main species of the study area (i.e., wild pistachio and wild almond) and estimating their mean crown area. The orthoimage of study area was consisted of 1,076 images with spatial resolution of 3.47 cm which was georeferenced using 12 ground control points (RMSE=8 cm) gathered by real-time kinematic (RTK) method. The results showed that the UAV orthoimagery classified by object-oriented method efficiently estimated mean crown area of wild pistachios (52.09±24.67 m2) and wild almonds (3.97±1.69 m2) with no significant difference with their observed values (α=0.05). In addition, the results showed that wild pistachios (accuracy of 0.90 and precision of 0.92) and wild almonds (accuracy of 0.90 and precision of 0.89) were well recognized by image segmentation. In general, we concluded that UAV orthoimagery can efficiently produce precise biophysical data of vegetation stands at single-tree levels, which therefore is suitable for assessment and monitoring open woodlands.

  20. WOODLAND MAPPING AT SINGLE-TREE LEVELS USING OBJECT-ORIENTED CLASSIFICATION OF UNMANNED AERIAL VEHICLE (UAV IMAGES

    Directory of Open Access Journals (Sweden)

    A. Chenari

    2017-09-01

    Full Text Available Remotely sensed datasets offer a reliable means to precisely estimate biophysical characteristics of individual species sparsely distributed in open woodlands. Moreover, object-oriented classification has exhibited significant advantages over different classification methods for delineation of tree crowns and recognition of species in various types of ecosystems. However, it still is unclear if this widely-used classification method can have its advantages on unmanned aerial vehicle (UAV digital images for mapping vegetation cover at single-tree levels. In this study, UAV orthoimagery was classified using object-oriented classification method for mapping a part of wild pistachio nature reserve in Zagros open woodlands, Fars Province, Iran. This research focused on recognizing two main species of the study area (i.e., wild pistachio and wild almond and estimating their mean crown area. The orthoimage of study area was consisted of 1,076 images with spatial resolution of 3.47 cm which was georeferenced using 12 ground control points (RMSE=8 cm gathered by real-time kinematic (RTK method. The results showed that the UAV orthoimagery classified by object-oriented method efficiently estimated mean crown area of wild pistachios (52.09±24.67 m2 and wild almonds (3.97±1.69 m2 with no significant difference with their observed values (α=0.05. In addition, the results showed that wild pistachios (accuracy of 0.90 and precision of 0.92 and wild almonds (accuracy of 0.90 and precision of 0.89 were well recognized by image segmentation. In general, we concluded that UAV orthoimagery can efficiently produce precise biophysical data of vegetation stands at single-tree levels, which therefore is suitable for assessment and monitoring open woodlands.

  1. Exploring precrash maneuvers using classification trees and random forests.

    Science.gov (United States)

    Harb, Rami; Yan, Xuedong; Radwan, Essam; Su, Xiaogang

    2009-01-01

    Taking evasive actions vis-à-vis critical traffic situations impending to motor vehicle crashes endows drivers an opportunity to avoid the crash occurrence or at least diminish its severity. This study explores the drivers, vehicles, and environments' characteristics associated with crash avoidance maneuvers (i.e., evasive actions or no evasive actions). Rear-end collisions, head-on collisions, and angle collisions are analyzed separately using decision trees and the significance of the variables on the binary response variable (evasive actions or no evasive actions) is determined. Moreover, the random forests method is employed to rank the importance of the drivers/vehicles/environments characteristics on crash avoidance maneuvers. According to the exploratory analyses' results, drivers' visibility obstruction, drivers' physical impairment, drivers' distraction are associated with crash avoidance maneuvers in all three types of accidents. Moreover, speed limit is associated with rear-end collisions' avoidance maneuvers and vehicle type is correlated with head-on collisions and angle collisions' avoidance maneuvers. It is recommended that future research investigates further the explored trends (e.g., physically impaired drivers, visibility obstruction) using driving simulators which may help in legislative initiatives and in-vehicle technology recommendations.

  2. Multi-test decision tree and its application to microarray data classification.

    Science.gov (United States)

    Czajkowski, Marcin; Grześ, Marek; Kretowski, Marek

    2014-05-01

    The desirable property of tools used to investigate biological data is easy to understand models and predictive decisions. Decision trees are particularly promising in this regard due to their comprehensible nature that resembles the hierarchical process of human decision making. However, existing algorithms for learning decision trees have tendency to underfit gene expression data. The main aim of this work is to improve the performance and stability of decision trees with only a small increase in their complexity. We propose a multi-test decision tree (MTDT); our main contribution is the application of several univariate tests in each non-terminal node of the decision tree. We also search for alternative, lower-ranked features in order to obtain more stable and reliable predictions. Experimental validation was performed on several real-life gene expression datasets. Comparison results with eight classifiers show that MTDT has a statistically significantly higher accuracy than popular decision tree classifiers, and it was highly competitive with ensemble learning algorithms. The proposed solution managed to outperform its baseline algorithm on 14 datasets by an average 6%. A study performed on one of the datasets showed that the discovered genes used in the MTDT classification model are supported by biological evidence in the literature. This paper introduces a new type of decision tree which is more suitable for solving biological problems. MTDTs are relatively easy to analyze and much more powerful in modeling high dimensional microarray data than their popular counterparts. Copyright © 2014 Elsevier B.V. All rights reserved.

  3. Prospective identification of adolescent suicide ideation using classification tree analysis: Models for community-based screening.

    Science.gov (United States)

    Hill, Ryan M; Oosterhoff, Benjamin; Kaplow, Julie B

    2017-07-01

    Although a large number of risk markers for suicide ideation have been identified, little guidance has been provided to prospectively identify adolescents at risk for suicide ideation within community settings. The current study addressed this gap in the literature by utilizing classification tree analysis (CTA) to provide a decision-making model for screening adolescents at risk for suicide ideation. Participants were N = 4,799 youth (Mage = 16.15 years, SD = 1.63) who completed both Waves 1 and 2 of the National Longitudinal Study of Adolescent to Adult Health. CTA was used to generate a series of decision rules for identifying adolescents at risk for reporting suicide ideation at Wave 2. Findings revealed 3 distinct solutions with varying sensitivity and specificity for identifying adolescents who reported suicide ideation. Sensitivity of the classification trees ranged from 44.6% to 77.6%. The tree with greatest specificity and lowest sensitivity was based on a history of suicide ideation. The tree with moderate sensitivity and high specificity was based on depressive symptoms, suicide attempts or suicide among family and friends, and social support. The most sensitive but least specific tree utilized these factors and gender, ethnicity, hours of sleep, school-related factors, and future orientation. These classification trees offer community organizations options for instituting large-scale screenings for suicide ideation risk depending on the available resources and modality of services to be provided. This study provides a theoretically and empirically driven model for prospectively identifying adolescents at risk for suicide ideation and has implications for preventive interventions among at-risk youth. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  4. Temporal expansion of annual crop classification layers for the CONUS using the C5 decision tree classifier

    Science.gov (United States)

    Friesz, Aaron M.; Wylie, Bruce K.; Howard, Daniel M.

    2017-01-01

    Crop cover maps have become widely used in a range of research applications. Multiple crop cover maps have been developed to suite particular research interests. The National Agricultural Statistics Service (NASS) Cropland Data Layers (CDL) are a series of commonly used crop cover maps for the conterminous United States (CONUS) that span from 2008 to 2013. In this investigation, we sought to contribute to the availability of consistent CONUS crop cover maps by extending temporal coverage of the NASS CDL archive back eight additional years to 2000 by creating annual NASS CDL-like crop cover maps derived from a classification tree model algorithm. We used over 11 million records to train a classification tree algorithm and develop a crop classification model (CCM). The model was used to create crop cover maps for the CONUS for years 2000–2013 at 250 m spatial resolution. The CCM and the maps for years 2008–2013 were assessed for accuracy relative to resampled NASS CDLs. The CCM performed well against a withheld test data set with a model prediction accuracy of over 90%. The assessment of the crop cover maps indicated that the model performed well spatially, placing crop cover pixels within their known domains; however, the model did show a bias towards the ‘Other’ crop cover class, which caused frequent misclassifications of pixels around the periphery of large crop cover patch clusters and of pixels that form small, sparsely dispersed crop cover patches.

  5. Incorporating additional tree and environmental variables in a lodgepole pine stem profile model

    Science.gov (United States)

    John C. Byrne

    1993-01-01

    A new variable-form segmented stem profile model is developed for lodgepole pine (Pinus contorta) trees from the northern Rocky Mountains of the United States. I improved estimates of stem diameter by predicting two of the model coefficients with linear equations using a measure of tree form, defined as a ratio of dbh and total height. Additional improvements were...

  6. A classification model of Hyperion image base on SAM combined decision tree

    Science.gov (United States)

    Wang, Zhenghai; Hu, Guangdao; Zhou, YongZhang; Liu, Xin

    2009-10-01

    Monitoring the Earth using imaging spectrometers has necessitated more accurate analyses and new applications to remote sensing. A very high dimensional input space requires an exponentially large amount of data to adequately and reliably represent the classes in that space. On the other hand, with increase in the input dimensionality the hypothesis space grows exponentially, which makes the classification performance highly unreliable. Traditional classification algorithms Classification of hyperspectral images is challenging. New algorithms have to be developed for hyperspectral data classification. The Spectral Angle Mapper (SAM) is a physically-based spectral classification that uses an ndimensional angle to match pixels to reference spectra. The algorithm determines the spectral similarity between two spectra by calculating the angle between the spectra, treating them as vectors in a space with dimensionality equal to the number of bands. The key and difficulty is that we should artificial defining the threshold of SAM. The classification precision depends on the rationality of the threshold of SAM. In order to resolve this problem, this paper proposes a new automatic classification model of remote sensing image using SAM combined with decision tree. It can automatic choose the appropriate threshold of SAM and improve the classify precision of SAM base on the analyze of field spectrum. The test area located in Heqing Yunnan was imaged by EO_1 Hyperion imaging spectrometer using 224 bands in visual and near infrared. The area included limestone areas, rock fields, soil and forests. The area was classified into four different vegetation and soil types. The results show that this method choose the appropriate threshold of SAM and eliminates the disturbance and influence of unwanted objects effectively, so as to improve the classification precision. Compared with the likelihood classification by field survey data, the classification precision of this model

  7. Modeling time-to-event (survival) data using classification tree analysis.

    Science.gov (United States)

    Linden, Ariel; Yarnold, Paul R

    2017-12-01

    Time to the occurrence of an event is often studied in health research. Survival analysis differs from other designs in that follow-up times for individuals who do not experience the event by the end of the study (called censored) are accounted for in the analysis. Cox regression is the standard method for analysing censored data, but the assumptions required of these models are easily violated. In this paper, we introduce classification tree analysis (CTA) as a flexible alternative for modelling censored data. Classification tree analysis is a "decision-tree"-like classification model that provides parsimonious, transparent (ie, easy to visually display and interpret) decision rules that maximize predictive accuracy, derives exact P values via permutation tests, and evaluates model cross-generalizability. Using empirical data, we identify all statistically valid, reproducible, longitudinally consistent, and cross-generalizable CTA survival models and then compare their predictive accuracy to estimates derived via Cox regression and an unadjusted naïve model. Model performance is assessed using integrated Brier scores and a comparison between estimated survival curves. The Cox regression model best predicts average incidence of the outcome over time, whereas CTA survival models best predict either relatively high, or low, incidence of the outcome over time. Classification tree analysis survival models offer many advantages over Cox regression, such as explicit maximization of predictive accuracy, parsimony, statistical robustness, and transparency. Therefore, researchers interested in accurate prognoses and clear decision rules should consider developing models using the CTA-survival framework. © 2017 John Wiley & Sons, Ltd.

  8. Estimating population extinction thresholds with categorical classification trees for Louisiana black bears.

    Science.gov (United States)

    Laufenberg, Jared S; Clark, Joseph D; Chandler, Richard B

    2018-01-01

    Monitoring vulnerable species is critical for their conservation. Thresholds or tipping points are commonly used to indicate when populations become vulnerable to extinction and to trigger changes in conservation actions. However, quantitative methods to determine such thresholds have not been well explored. The Louisiana black bear (Ursus americanus luteolus) was removed from the list of threatened and endangered species under the U.S. Endangered Species Act in 2016 and our objectives were to determine the most appropriate parameters and thresholds for monitoring and management action. Capture mark recapture (CMR) data from 2006 to 2012 were used to estimate population parameters and variances. We used stochastic population simulations and conditional classification trees to identify demographic rates for monitoring that would be most indicative of heighted extinction risk. We then identified thresholds that would be reliable predictors of population viability. Conditional classification trees indicated that annual apparent survival rates for adult females averaged over 5 years ([Formula: see text]) was the best predictor of population persistence. Specifically, population persistence was estimated to be ≥95% over 100 years when [Formula: see text], suggesting that this statistic can be used as threshold to trigger management intervention. Our evaluation produced monitoring protocols that reliably predicted population persistence and was cost-effective. We conclude that population projections and conditional classification trees can be valuable tools for identifying extinction thresholds used in monitoring programs.

  9. Estimating population extinction thresholds with categorical classification trees for Louisiana black bears.

    Directory of Open Access Journals (Sweden)

    Jared S Laufenberg

    Full Text Available Monitoring vulnerable species is critical for their conservation. Thresholds or tipping points are commonly used to indicate when populations become vulnerable to extinction and to trigger changes in conservation actions. However, quantitative methods to determine such thresholds have not been well explored. The Louisiana black bear (Ursus americanus luteolus was removed from the list of threatened and endangered species under the U.S. Endangered Species Act in 2016 and our objectives were to determine the most appropriate parameters and thresholds for monitoring and management action. Capture mark recapture (CMR data from 2006 to 2012 were used to estimate population parameters and variances. We used stochastic population simulations and conditional classification trees to identify demographic rates for monitoring that would be most indicative of heighted extinction risk. We then identified thresholds that would be reliable predictors of population viability. Conditional classification trees indicated that annual apparent survival rates for adult females averaged over 5 years ([Formula: see text] was the best predictor of population persistence. Specifically, population persistence was estimated to be ≥95% over 100 years when [Formula: see text], suggesting that this statistic can be used as threshold to trigger management intervention. Our evaluation produced monitoring protocols that reliably predicted population persistence and was cost-effective. We conclude that population projections and conditional classification trees can be valuable tools for identifying extinction thresholds used in monitoring programs.

  10. Estimating population extinction thresholds with categorical classification trees for Louisiana black bears

    Science.gov (United States)

    Laufenberg, Jared S.; Clark, Joseph D.; Chandler, Richard B.

    2018-01-01

    Monitoring vulnerable species is critical for their conservation. Thresholds or tipping points are commonly used to indicate when populations become vulnerable to extinction and to trigger changes in conservation actions. However, quantitative methods to determine such thresholds have not been well explored. The Louisiana black bear (Ursus americanus luteolus) was removed from the list of threatened and endangered species under the U.S. Endangered Species Act in 2016 and our objectives were to determine the most appropriate parameters and thresholds for monitoring and management action. Capture mark recapture (CMR) data from 2006 to 2012 were used to estimate population parameters and variances. We used stochastic population simulations and conditional classification trees to identify demographic rates for monitoring that would be most indicative of heighted extinction risk. We then identified thresholds that would be reliable predictors of population viability. Conditional classification trees indicated that annual apparent survival rates for adult females averaged over 5 years () was the best predictor of population persistence. Specifically, population persistence was estimated to be ≥95% over 100 years when , suggesting that this statistic can be used as threshold to trigger management intervention. Our evaluation produced monitoring protocols that reliably predicted population persistence and was cost-effective. We conclude that population projections and conditional classification trees can be valuable tools for identifying extinction thresholds used in monitoring programs.

  11. Tree Species Abundance Predictions in a Tropical Agricultural Landscape with a Supervised Classification Model and Imbalanced Data

    Directory of Open Access Journals (Sweden)

    Sarah J. Graves

    2016-02-01

    Full Text Available Mapping species through classification of imaging spectroscopy data is facilitating research to understand tree species distributions at increasingly greater spatial scales. Classification requires a dataset of field observations matched to the image, which will often reflect natural species distributions, resulting in an imbalanced dataset with many samples for common species and few samples for less common species. Despite the high prevalence of imbalanced datasets in multiclass species predictions, the effect on species prediction accuracy and landscape species abundance has not yet been quantified. First, we trained and assessed the accuracy of a support vector machine (SVM model with a highly imbalanced dataset of 20 tropical species and one mixed-species class of 24 species identified in a hyperspectral image mosaic (350–2500 nm of Panamanian farmland and secondary forest fragments. The model, with an overall accuracy of 62% ± 2.3% and F-score of 59% ± 2.7%, was applied to the full image mosaic (23,000 ha at a 2-m resolution to produce a species prediction map, which suggested that this tropical agricultural landscape is more diverse than what has been presented in field-based studies. Second, we quantified the effect of class imbalance on model accuracy. Model assessment showed a trend where species with more samples were consistently over predicted while species with fewer samples were under predicted. Standardizing sample size reduced model accuracy, but also reduced the level of species over- and under-prediction. This study advances operational species mapping of diverse tropical landscapes by detailing the effect of imbalanced data on classification accuracy and providing estimates of tree species abundance in an agricultural landscape. Species maps using data and methods presented here can be used in landscape analyses of species distributions to understand human or environmental effects, in addition to focusing conservation

  12. Sentiment classification of Roman-Urdu opinions using Naïve Bayesian, Decision Tree and KNN classification techniques

    Directory of Open Access Journals (Sweden)

    Muhammad Bilal

    2016-07-01

    Full Text Available Sentiment mining is a field of text mining to determine the attitude of people about a particular product, topic, politician in newsgroup posts, review sites, comments on facebook posts twitter, etc. There are many issues involved in opinion mining. One important issue is that opinions could be in different languages (English, Urdu, Arabic, etc.. To tackle each language according to its orientation is a challenging task. Most of the research work in sentiment mining has been done in English language. Currently, limited research is being carried out on sentiment classification of other languages like Arabic, Italian, Urdu and Hindi. In this paper, three classification models are used for text classification using Waikato Environment for Knowledge Analysis (WEKA. Opinions written in Roman-Urdu and English are extracted from a blog. These extracted opinions are documented in text files to prepare a training dataset containing 150 positive and 150 negative opinions, as labeled examples. Testing data set is supplied to three different models and the results in each case are analyzed. The results show that Naïve Bayesian outperformed Decision Tree and KNN in terms of more accuracy, precision, recall and F-measure.

  13. Tree Species Classification in Temperate Forests Using Formosat-2 Satellite Image Time Series

    Directory of Open Access Journals (Sweden)

    David Sheeren

    2016-09-01

    Full Text Available Mapping forest composition is a major concern for forest management, biodiversity assessment and for understanding the potential impacts of climate change on tree species distribution. In this study, the suitability of a dense high spatial resolution multispectral Formosat-2 satellite image time-series (SITS to discriminate tree species in temperate forests is investigated. Based on a 17-date SITS acquired across one year, thirteen major tree species (8 broadleaves and 5 conifers are classified in a study area of southwest France. The performance of parametric (GMM and nonparametric (k-NN, RF, SVM methods are compared at three class hierarchy levels for different versions of the SITS: (i a smoothed noise-free version based on the Whittaker smoother; (ii a non-smoothed cloudy version including all the dates; (iii a non-smoothed noise-free version including only 14 dates. Noise refers to pixels contaminated by clouds and cloud shadows. The results of the 108 distinct classifications show a very high suitability of the SITS to identify the forest tree species based on phenological differences (average κ = 0 . 93 estimated by cross-validation based on 1235 field-collected plots. SVM is found to be the best classifier with very close results from the other classifiers. No clear benefit of removing noise by smoothing can be observed. Classification accuracy is even improved using the non-smoothed cloudy version of the SITS compared to the 14 cloud-free image time series. However conclusions of the results need to be considered with caution because of possible overfitting. Disagreements also appear between the maps produced by the classifiers for complex mixed forests, suggesting a higher classification uncertainty in these contexts. Our findings suggest that time-series data can be a good alternative to hyperspectral data for mapping forest types. It also demonstrates the potential contribution of the recently launched Sentinel-2 satellite for

  14. Comparing Methodologies for Developing an Early Warning System: Classification and Regression Tree Model versus Logistic Regression. REL 2015-077

    Science.gov (United States)

    Koon, Sharon; Petscher, Yaacov

    2015-01-01

    The purpose of this report was to explicate the use of logistic regression and classification and regression tree (CART) analysis in the development of early warning systems. It was motivated by state education leaders' interest in maintaining high classification accuracy while simultaneously improving practitioner understanding of the rules by…

  15. ARABIC TEXT CLASSIFICATION USING NEW STEMMER FOR FEATURE SELECTION AND DECISION TREES

    Directory of Open Access Journals (Sweden)

    SAID BAHASSINE

    2017-06-01

    Full Text Available Text classification is the process of assignment of unclassified text to appropriate classes based on their content. The most prevalent representation for text classification is the bag of words vector. In this representation, the words that appear in documents often have multiple morphological structures, grammatical forms. In most cases, this morphological variant of words belongs to the same category. In the first part of this paper, anew stemming algorithm was developed in which each term of a given document is represented by its root. In the second part, a comparative study is conducted of the impact of two stemming algorithms namely Khoja’s stemmer and our new stemmer (referred to hereafter by origin-stemmer on Arabic text classification. This investigation was carried out using chi-square as a feature of selection to reduce the dimensionality of the feature space and decision tree classifier. In order to evaluate the performance of the classifier, this study used a corpus that consists of 5070 documents independently classified into six categories: sport, entertainment, business, Middle East, switch and world on WEKA toolkit. The recall, f-measure and precision measures are used to compare the performance of the obtained models. The experimental results show that text classification using rout stemmer outperforms classification using Khoja’s stemmer. The f-measure was 92.9% in sport category and 89.1% in business category.

  16. APPLICATION OF MULTIPLE LOGISTIC REGRESSION, BAYESIAN LOGISTIC AND CLASSIFICATION TREE TO IDENTIFY THE SIGNIFICANT FACTORS INFLUENCING CRASH SEVERITY

    Directory of Open Access Journals (Sweden)

    MILAD TAZIK

    2017-11-01

    Full Text Available Identifying cases in which road crashes result in fatality or injury of drivers may help improve their safety. In this study, datasets of crashes happened in TehranQom freeway, Iran, were examined by three models (multiple logistic regression, Bayesian logistic and classification tree to analyse the contribution of several variables to fatal accidents. For multiple logistic regression and Bayesian logistic models, the odds ratio was calculated for each variable. The model which best suited the identification of accident severity was determined based on AIC and DIC criteria. Based on the results of these two models, rollover crashes (OR = 14.58, %95 CI: 6.8-28.6, not using of seat belt (OR = 5.79, %95 CI: 3.1-9.9, exceeding speed limits (OR = 4.02, %95 CI: 1.8-7.9 and being female (OR = 2.91, %95 CI: 1.1-6.1 were the most important factors in fatalities of drivers. In addition, the results of the classification tree model have verified the findings of the other models.

  17. Classification and Optimization of Decision Trees for Inconsistent Decision Tables Represented as MVD Tables

    KAUST Repository

    Azad, Mohammad

    2015-10-11

    Decision tree is a widely used technique to discover patterns from consistent data set. But if the data set is inconsistent, where there are groups of examples (objects) with equal values of conditional attributes but different decisions (values of the decision attribute), then to discover the essential patterns or knowledge from the data set is challenging. We consider three approaches (generalized, most common and many-valued decision) to handle such inconsistency. We created different greedy algorithms using various types of impurity and uncertainty measures to construct decision trees. We compared the three approaches based on the decision tree properties of the depth, average depth and number of nodes. Based on the result of the comparison, we choose to work with the many-valued decision approach. Now to determine which greedy algorithms are efficient, we compared them based on the optimization and classification results. It was found that some greedy algorithms Mult\\\\_ws\\\\_entSort, and Mult\\\\_ws\\\\_entML are good for both optimization and classification.

  18. Classification and Optimization of Decision Trees for Inconsistent Decision Tables Represented as MVD Tables

    KAUST Repository

    Azad, Mohammad; Moshkov, Mikhail

    2015-01-01

    Decision tree is a widely used technique to discover patterns from consistent data set. But if the data set is inconsistent, where there are groups of examples (objects) with equal values of conditional attributes but different decisions (values of the decision attribute), then to discover the essential patterns or knowledge from the data set is challenging. We consider three approaches (generalized, most common and many-valued decision) to handle such inconsistency. We created different greedy algorithms using various types of impurity and uncertainty measures to construct decision trees. We compared the three approaches based on the decision tree properties of the depth, average depth and number of nodes. Based on the result of the comparison, we choose to work with the many-valued decision approach. Now to determine which greedy algorithms are efficient, we compared them based on the optimization and classification results. It was found that some greedy algorithms Mult\\_ws\\_entSort, and Mult\\_ws\\_entML are good for both optimization and classification.

  19. Prediction of survival to discharge following cardiopulmonary resuscitation using classification and regression trees.

    Science.gov (United States)

    Ebell, Mark H; Afonso, Anna M; Geocadin, Romergryko G

    2013-12-01

    To predict the likelihood that an inpatient who experiences cardiopulmonary arrest and undergoes cardiopulmonary resuscitation survives to discharge with good neurologic function or with mild deficits (Cerebral Performance Category score = 1). Classification and Regression Trees were used to develop branching algorithms that optimize the ability of a series of tests to correctly classify patients into two or more groups. Data from 2007 to 2008 (n = 38,092) were used to develop candidate Classification and Regression Trees models to predict the outcome of inpatient cardiopulmonary resuscitation episodes and data from 2009 (n = 14,435) to evaluate the accuracy of the models and judge the degree of over fitting. Both supervised and unsupervised approaches to model development were used. 366 hospitals participating in the Get With the Guidelines-Resuscitation registry. Adult inpatients experiencing an index episode of cardiopulmonary arrest and undergoing cardiopulmonary resuscitation in the hospital. The five candidate models had between 8 and 21 nodes and an area under the receiver operating characteristic curve from 0.718 to 0.766 in the derivation group and from 0.683 to 0.746 in the validation group. One of the supervised models had 14 nodes and classified 27.9% of patients as very unlikely to survive neurologically intact or with mild deficits (Tree models that predict survival to discharge with good neurologic function or with mild deficits following in-hospital cardiopulmonary arrest. Models like this can assist physicians and patients who are considering do-not-resuscitate orders.

  20. Fusion of LiDAR and aerial imagery for the estimation of downed tree volume using Support Vector Machines classification and region based object fitting

    Science.gov (United States)

    Selvarajan, Sowmya

    The study classifies 3D small footprint full waveform digitized LiDAR fused with aerial imagery to downed trees using Support Vector Machines (SVM) algorithm. Using small footprint waveform LiDAR, airborne LiDAR systems can provide better canopy penetration and very high spatial resolution. The small footprint waveform scanner system Riegl LMS-Q680 is addition with an UltraCamX aerial camera are used to measure and map downed trees in a forest. The various data preprocessing steps helped in the identification of ground points from the dense LiDAR dataset and segment the LiDAR data to help reduce the complexity of the algorithm. The haze filtering process helped to differentiate the spectral signatures of the various classes within the aerial image. Such processes, helped to better select the features from both sensor data. The six features: LiDAR height, LiDAR intensity, LiDAR echo, and three image intensities are utilized. To do so, LiDAR derived, aerial image derived and fused LiDAR-aerial image derived features are used to organize the data for the SVM hypothesis formulation. Several variations of the SVM algorithm with different kernels and soft margin parameter C are experimented. The algorithm is implemented to classify downed trees over a pine trees zone. The LiDAR derived features provided an overall accuracy of 98% of downed trees but with no classification error of 86%. The image derived features provided an overall accuracy of 65% and fusion derived features resulted in an overall accuracy of 88%. The results are observed to be stable and robust. The SVM accuracies were accompanied by high false alarm rates, with the LiDAR classification producing 58.45%, image classification producing 95.74% and finally the fused classification producing 93% false alarm rates The Canny edge correction filter helped control the LiDAR false alarm to 35.99%, image false alarm to 48.56% and fused false alarm to 37.69% The implemented classifiers provided a powerful tool for

  1. Modelling of additive manufacturing processes: a review and classification

    Science.gov (United States)

    Stavropoulos, Panagiotis; Foteinopoulos, Panagis

    2018-03-01

    Additive manufacturing (AM) is a very promising technology; however, there are a number of open issues related to the different AM processes. The literature on modelling the existing AM processes is reviewed and classified. A categorization of the different AM processes in process groups, according to the process mechanism, has been conducted and the most important issues are stated. Suggestions are made as to which approach is more appropriate according to the key performance indicator desired to be modelled and a discussion is included as to the way that future modelling work can better contribute to improving today's AM process understanding.

  2. Schistosoma mansoni reinfection: Analysis of risk factors by classification and regression tree (CART modeling.

    Directory of Open Access Journals (Sweden)

    Andréa Gazzinelli

    Full Text Available Praziquantel (PZQ is an effective chemotherapy for schistosomiasis mansoni and a mainstay for its control and potential elimination. However, it does not prevent against reinfection, which can occur rapidly in areas with active transmission. A guide to ranking the risk factors for Schistosoma mansoni reinfection would greatly contribute to prioritizing resources and focusing prevention and control measures to prevent rapid reinfection. The objective of the current study was to explore the relationship among the socioeconomic, demographic, and epidemiological factors that can influence reinfection by S. mansoni one year after successful treatment with PZQ in school-aged children in Northeastern Minas Gerais state Brazil. Parasitological, socioeconomic, demographic, and water contact information were surveyed in 506 S. mansoni-infected individuals, aged 6 to 15 years, resident in these endemic areas. Eligible individuals were treated with PZQ until they were determined to be negative by the absence of S. mansoni eggs in the feces on two consecutive days of Kato-Katz fecal thick smear. These individuals were surveyed again 12 months from the date of successful treatment with PZQ. A classification and regression tree modeling (CART was then used to explore the relationship between socioeconomic, demographic, and epidemiological variables and their reinfection status. The most important risk factor identified for S. mansoni reinfection was their "heavy" infection at baseline. Additional analyses, excluding heavy infection status, showed that lower socioeconomic status and a lower level of education of the household head were also most important risk factors for S. mansoni reinfection. Our results provide an important contribution toward the control and possible elimination of schistosomiasis by identifying three major risk factors that can be used for targeted treatment and monitoring of reinfection. We suggest that control measures that target

  3. Crown-level tree species classification from AISA hyperspectral imagery using an innovative pixel-weighting approach

    Science.gov (United States)

    Liu, Haijian; Wu, Changshan

    2018-06-01

    Crown-level tree species classification is a challenging task due to the spectral similarity among different tree species. Shadow, underlying objects, and other materials within a crown may decrease the purity of extracted crown spectra and further reduce classification accuracy. To address this problem, an innovative pixel-weighting approach was developed for tree species classification at the crown level. The method utilized high density discrete LiDAR data for individual tree delineation and Airborne Imaging Spectrometer for Applications (AISA) hyperspectral imagery for pure crown-scale spectra extraction. Specifically, three steps were included: 1) individual tree identification using LiDAR data, 2) pixel-weighted representative crown spectra calculation using hyperspectral imagery, with which pixel-based illuminated-leaf fractions estimated using a linear spectral mixture analysis (LSMA) were employed as weighted factors, and 3) representative spectra based tree species classification was performed through applying a support vector machine (SVM) approach. Analysis of results suggests that the developed pixel-weighting approach (OA = 82.12%, Kc = 0.74) performed better than treetop-based (OA = 70.86%, Kc = 0.58) and pixel-majority methods (OA = 72.26, Kc = 0.62) in terms of classification accuracy. McNemar tests indicated the differences in accuracy between pixel-weighting and treetop-based approaches as well as that between pixel-weighting and pixel-majority approaches were statistically significant.

  4. Bayesian additive decision trees of biomarker by treatment interactions for predictive biomarker detection and subgroup identification.

    Science.gov (United States)

    Zhao, Yang; Zheng, Wei; Zhuo, Daisy Y; Lu, Yuefeng; Ma, Xiwen; Liu, Hengchang; Zeng, Zhen; Laird, Glen

    2017-10-11

    Personalized medicine, or tailored therapy, has been an active and important topic in recent medical research. Many methods have been proposed in the literature for predictive biomarker detection and subgroup identification. In this article, we propose a novel decision tree-based approach applicable in randomized clinical trials. We model the prognostic effects of the biomarkers using additive regression trees and the biomarker-by-treatment effect using a single regression tree. Bayesian approach is utilized to periodically revise the split variables and the split rules of the decision trees, which provides a better overall fitting. Gibbs sampler is implemented in the MCMC procedure, which updates the prognostic trees and the interaction tree separately. We use the posterior distribution of the interaction tree to construct the predictive scores of the biomarkers and to identify the subgroup where the treatment is superior to the control. Numerical simulations show that our proposed method performs well under various settings comparing to existing methods. We also demonstrate an application of our method in a real clinical trial.

  5. Malignancy Risk Assessment in Patients with Thyroid Nodules Using Classification and Regression Trees

    Directory of Open Access Journals (Sweden)

    Shokouh Taghipour Zahir

    2013-01-01

    Full Text Available Purpose. We sought to investigate the utility of classification and regression trees (CART classifier to differentiate benign from malignant nodules in patients referred for thyroid surgery. Methods. Clinical and demographic data of 271 patients referred to the Sadoughi Hospital during 2006–2011 were collected. In a two-step approach, a CART classifier was employed to differentiate patients with a high versus low risk of thyroid malignancy. The first step served as the screening procedure and was tailored to produce as few false negatives as possible. The second step identified those with the lowest risk of malignancy, chosen from a high risk population. Sensitivity, specificity, positive and negative predictive values (PPV and NPV of the optimal tree were calculated. Results. In the first step, age, sex, and nodule size contributed to the optimal tree. Ultrasonographic features were employed in the second step with hypoechogenicity and/or microcalcifications yielding the highest discriminatory ability. The combined tree produced a sensitivity and specificity of 80.0% (95% CI: 29.9–98.9 and 94.1% (95% CI: 78.9–99.0, respectively. NPV and PPV were 66.7% (41.1–85.6 and 97.0% (82.5–99.8, respectively. Conclusion. CART classifier reliably identifies patients with a low risk of malignancy who can avoid unnecessary surgery.

  6. Prognostic classification index in Iranian colorectal cancer patients: Survival tree analysis

    Directory of Open Access Journals (Sweden)

    Amal Saki Malehi

    2016-01-01

    Full Text Available Aims: The aim of this study was to determine the prognostic index for separating homogenous subgroups in colorectal cancer (CRC patients based on clinicopathological characteristics using survival tree analysis. Methods: The current study was conducted at the Research Center of Gastroenterology and Liver Disease, Shahid Beheshti Medical University in Tehran, between January 2004 and January 2009. A total of 739 patients who already have been diagnosed with CRC based on pathologic report were enrolled. The data included demographic and clinical-pathological characteristic of patients. Tree-structured survival analysis based on a recursive partitioning algorithm was implemented to evaluate prognostic factors. The probability curves were calculated according to the Kaplan-Meier method, and the hazard ratio was estimated as an interest effect size. Result: There were 526 males (71.2% of these patients. The mean survival time (from diagnosis time was 42.46± (3.4. Survival tree identified three variables as main prognostic factors and based on their four prognostic subgroups was constructed. The log-rank test showed good separation of survival curves. Patients with Stage I-IIIA and treated with surgery as the first treatment showed low risk (median = 34 months whereas patients with stage IIIB, IV, and more than 68 years have the worse survival outcome (median = 9.5 months. Conclusion: Constructing the prognostic classification index via survival tree can aid the researchers to assess interaction between clinical variables and determining the cumulative effect of these variables on survival outcome.

  7. The Hybrid of Classification Tree and Extreme Learning Machine for Permeability Prediction in Oil Reservoir

    KAUST Repository

    Prasetyo Utomo, Chandra

    2011-06-01

    Permeability is an important parameter connected with oil reservoir. Predicting the permeability could save millions of dollars. Unfortunately, petroleum engineers have faced numerous challenges arriving at cost-efficient predictions. Much work has been carried out to solve this problem. The main challenge is to handle the high range of permeability in each reservoir. For about a hundred year, mathematicians and engineers have tried to deliver best prediction models. However, none of them have produced satisfying results. In the last two decades, artificial intelligence models have been used. The current best prediction model in permeability prediction is extreme learning machine (ELM). It produces fairly good results but a clear explanation of the model is hard to come by because it is so complex. The aim of this research is to propose a way out of this complexity through the design of a hybrid intelligent model. In this proposal, the system combines classification and regression models to predict the permeability value. These are based on the well logs data. In order to handle the high range of the permeability value, a classification tree is utilized. A benefit of this innovation is that the tree represents knowledge in a clear and succinct fashion and thereby avoids the complexity of all previous models. Finally, it is important to note that the ELM is used as a final predictor. Results demonstrate that this proposed hybrid model performs better when compared with support vector machines (SVM) and ELM in term of correlation coefficient. Moreover, the classification tree model potentially leads to better communication among petroleum engineers concerning this important process and has wider implications for oil reservoir management efficiency.

  8. Analysis of powered two-wheeler crashes in Italy by classification trees and rules discovery.

    Science.gov (United States)

    Montella, Alfonso; Aria, Massimo; D'Ambrosio, Antonio; Mauriello, Filomena

    2012-11-01

    Aim of the study was the analysis of powered two-wheeler (PTW) crashes in Italy in order to detect interdependence as well as dissimilarities among crash characteristics and provide insights for the development of safety improvement strategies focused on PTWs. At this aim, data mining techniques were used to analyze the data relative to the 254,575 crashes involving PTWs occurred in Italy in the period 2006-2008. Classification trees analysis and rules discovery were performed. Tree-based methods are non-linear and non-parametric data mining tools for supervised classification and regression problems. They do not require a priori probabilistic knowledge about the phenomena under studying and consider conditional interactions among input data. Rules discovery is the identification of sets of items (i.e., crash patterns) that occur together in a given event (i.e., a crash in our study) more often than they would if they were independent of each other. Thus, the method can detect interdependence among crash characteristics. Due to the large number of patterns considered, both methods suffer from an extreme risk of finding patterns that appear due to chance alone. To overcome this problem, in our study we randomly split the sample data in two data sets and used well-established statistical practices to evaluate the statistical significance of the results. Both the classification trees and the rules discovery were effective in providing meaningful insights about PTW crash characteristics and their interdependencies. Even though in several cases different crash characteristics were highlighted, the results of the two the analysis methods were never contradictory. Furthermore, most of the findings of this study were consistent with the results of previous studies which used different analytical techniques, such as probabilistic models of crash injury severity. Basing on the analysis results, engineering countermeasures and policy initiatives to reduce PTW injuries and

  9. First Experience with Sentinel-2 Data for Crop and Tree Species Classifications in Central Europe

    OpenAIRE

    Markus Immitzer; Francesco Vuolo; Clement Atzberger

    2016-01-01

    The study presents the preliminary results of two classification exercises assessing the capabilities of pre-operational (August 2015) Sentinel-2 (S2) data for mapping crop types and tree species. In the first case study, an S2 image was used to map six summer crop species in Lower Austria as well as winter crops/bare soil. Crop type maps are needed to account for crop-specific water use and for agricultural statistics. Crop type information is also useful to parametrize crop growth models fo...

  10. GENERATION OF 2D LAND COVER MAPS FOR URBAN AREAS USING DECISION TREE CLASSIFICATION

    DEFF Research Database (Denmark)

    Höhle, Joachim

    2014-01-01

    A 2D land cover map can automatically and efficiently be generated from high-resolution multispectral aerial images. First, a digital surface model is produced and each cell of the elevation model is then supplemented with attributes. A decision tree classification is applied to extract map objects...... of stereo-observations of false-colour stereopairs. The stratified statistical assessment of the produced land cover map with six classes and based on 91 points per class reveals a high thematic accuracy for classes ‘building’ (99%, 95% CI: 95%-100%) and ‘road and parking lot’ (90%, 95% CI: 83%-95%). Some...

  11. Comparison of rule induction, decision trees and formal concept analysis approaches for classification

    Science.gov (United States)

    Kotelnikov, E. V.; Milov, V. R.

    2018-05-01

    Rule-based learning algorithms have higher transparency and easiness to interpret in comparison with neural networks and deep learning algorithms. These properties make it possible to effectively use such algorithms to solve descriptive tasks of data mining. The choice of an algorithm depends also on its ability to solve predictive tasks. The article compares the quality of the solution of the problems with binary and multiclass classification based on the experiments with six datasets from the UCI Machine Learning Repository. The authors investigate three algorithms: Ripper (rule induction), C4.5 (decision trees), In-Close (formal concept analysis). The results of the experiments show that In-Close demonstrates the best quality of classification in comparison with Ripper and C4.5, however the latter two generate more compact rule sets.

  12. A cross-cultural investigation of college student alcohol consumption: a classification tree analysis.

    Science.gov (United States)

    Kitsantas, Panagiota; Kitsantas, Anastasia; Anagnostopoulou, Tanya

    2008-01-01

    In this cross-cultural study, the authors attempted to identify high-risk subgroups for alcohol consumption among college students. American and Greek students (N = 132) answered questions about alcohol consumption, religious beliefs, attitudes toward drinking, advertisement influences, parental monitoring, and drinking consequences. Heavy drinkers in the American group were younger and less religious than were infrequent drinkers. In the Greek group, heavy drinkers tended to deny the negative results of drinking alcohol and use a permissive attitude to justify it, whereas infrequent drinkers were more likely to be monitored by their parents. These results suggest that parental monitoring and an emphasis on informing students about the negative effects of alcohol on their health and social and academic lives may be effective methods of reducing alcohol consumption. Classification tree analysis revealed that student attitudes toward drinking were important in the classification of American and Greek drinkers, indicating that this is a powerful predictor of alcohol consumption regardless of ethnic background.

  13. Epilepsy classification and additional definitions in occipital lobe epilepsy.

    Science.gov (United States)

    Yilmaz, Kutluhan; Karatoprak, Elif Yüksel

    2015-09-01

    To evaluate epileptic children with occipital lobe epilepsy (OLE) in the light of the characteristics of Panayiotopoulos syndrome and late-onset occipital lobe epilepsy of Gastaut (OLE-G). Patients were categorized into six groups: primary OLE with autonomic symptoms (Panayiotopoulos syndrome), primary OLE with visual symptoms (OLE-G), secondary OLE with autonomic symptoms (P-type sOLE), secondary OLE with visual symptoms (G-type sOLE), and non-categorized primary OLE and non-categorized secondary OLE according to characteristic ictal symptoms of both Panayiotopoulos syndrome and OLE-G, as well as aetiology (primary or secondary). Patients were compared with regards to seizure symptoms, aetiology, cranial imaging, EEG, treatment and outcome. Of 108 patients with OLE (6.4±3.9 years of age), 60 patients constituted primary groups (32 with Panayiotopoulos syndrome, 11 with OLE-G, and 17 with non-categorized primary OLE); the other 48 patients constituted secondary groups (eight with P-type sOLE, three with G-type sOLE, and 37 with non-categorized sOLE). Epileptiform activity was restricted to the occipital area in half of the patients. Generalized epileptiform activity was observed in three patients, including a patient with Panayiotopoulos syndrome (PS). Only one patient had refractory epilepsy in the primary groups while such patients made up 29% in the secondary groups. In OLE, typical autonomic or visual ictal symptoms of Panayiotopoulos syndrome and OLE-G do not necessarily indicate primary (i.e. genetic or idiopathic) aetiology. Moreover, primary OLE may not present with these symptoms. Since there are many patients with OLE who do not exhibit the characteristics of Panayiotopoulos syndrome or OLE-G, additional definitions and terminology appear to be necessary to differentiate between such patients in both clinical practice and studies.

  14. Remote sensing mapping of macroalgal farms by modifying thresholds in the classification tree

    KAUST Repository

    Zheng, Yuhan

    2018-05-07

    Remote sensing is the main approach used to classify and map aquatic vegetation, and classification tree (CT) analysis is superior to various classification methods. Based on previous studies, modified CT can be developed from traditional CT by adjusting the thresholds based on the statistical relationship between spectral features to classify different images without ground-truth data. However, no studies have yet employed this method to resolve marine vegetation. In this study, three Gao-Fen 1 satellite images obtained with the same sensor on January 30, 2014, November 5, 2014, and January 21, 2015 were selected, and two features were then employed to extract macroalgae from aquaculture farms from the seawater background. Besides, object-based classification and other image analysis methods were adopted to improve the classification accuracy in this study. Results show that the overall accuracies of traditional CTs for three images are 92.0%, 94.2% and 93.9%, respectively, whereas the overall accuracies of the two corresponding modified CTs for images obtained on January 21, 2015 and November 5, 2014 are 93.1% and 89.5%, respectively. This indicates modified CTs can help map macroalgae with multi-date imagery and monitor the spatiotemporal distribution of macroalgae in coastal environments.

  15. Discriminative Hierarchical K-Means Tree for Large-Scale Image Classification.

    Science.gov (United States)

    Chen, Shizhi; Yang, Xiaodong; Tian, Yingli

    2015-09-01

    A key challenge in large-scale image classification is how to achieve efficiency in terms of both computation and memory without compromising classification accuracy. The learning-based classifiers achieve the state-of-the-art accuracies, but have been criticized for the computational complexity that grows linearly with the number of classes. The nonparametric nearest neighbor (NN)-based classifiers naturally handle large numbers of categories, but incur prohibitively expensive computation and memory costs. In this brief, we present a novel classification scheme, i.e., discriminative hierarchical K-means tree (D-HKTree), which combines the advantages of both learning-based and NN-based classifiers. The complexity of the D-HKTree only grows sublinearly with the number of categories, which is much better than the recent hierarchical support vector machines-based methods. The memory requirement is the order of magnitude less than the recent Naïve Bayesian NN-based approaches. The proposed D-HKTree classification scheme is evaluated on several challenging benchmark databases and achieves the state-of-the-art accuracies, while with significantly lower computation cost and memory requirement.

  16. Real-time classification of humans versus animals using profiling sensors and hidden Markov tree model

    Science.gov (United States)

    Hossen, Jakir; Jacobs, Eddie L.; Chari, Srikant

    2015-07-01

    Linear pyroelectric array sensors have enabled useful classifications of objects such as humans and animals to be performed with relatively low-cost hardware in border and perimeter security applications. Ongoing research has sought to improve the performance of these sensors through signal processing algorithms. In the research presented here, we introduce the use of hidden Markov tree (HMT) models for object recognition in images generated by linear pyroelectric sensors. HMTs are trained to statistically model the wavelet features of individual objects through an expectation-maximization learning process. Human versus animal classification for a test object is made by evaluating its wavelet features against the trained HMTs using the maximum-likelihood criterion. The classification performance of this approach is compared to two other techniques; a texture, shape, and spectral component features (TSSF) based classifier and a speeded-up robust feature (SURF) classifier. The evaluation indicates that among the three techniques, the wavelet-based HMT model works well, is robust, and has improved classification performance compared to a SURF-based algorithm in equivalent computation time. When compared to the TSSF-based classifier, the HMT model has a slightly degraded performance but almost an order of magnitude improvement in computation time enabling real-time implementation.

  17. Remote sensing mapping of macroalgal farms by modifying thresholds in the classification tree

    KAUST Repository

    Zheng, Yuhan; Duarte, Carlos M.; Chen, Jiang; Li, Dan; Lou, Zhaohan; Wu, Jiaping

    2018-01-01

    Remote sensing is the main approach used to classify and map aquatic vegetation, and classification tree (CT) analysis is superior to various classification methods. Based on previous studies, modified CT can be developed from traditional CT by adjusting the thresholds based on the statistical relationship between spectral features to classify different images without ground-truth data. However, no studies have yet employed this method to resolve marine vegetation. In this study, three Gao-Fen 1 satellite images obtained with the same sensor on January 30, 2014, November 5, 2014, and January 21, 2015 were selected, and two features were then employed to extract macroalgae from aquaculture farms from the seawater background. Besides, object-based classification and other image analysis methods were adopted to improve the classification accuracy in this study. Results show that the overall accuracies of traditional CTs for three images are 92.0%, 94.2% and 93.9%, respectively, whereas the overall accuracies of the two corresponding modified CTs for images obtained on January 21, 2015 and November 5, 2014 are 93.1% and 89.5%, respectively. This indicates modified CTs can help map macroalgae with multi-date imagery and monitor the spatiotemporal distribution of macroalgae in coastal environments.

  18. Multivariate decision tree designing for the classification of multi-jet topologies in e sup + e sup - collisions

    CERN Document Server

    Mjahed, M

    2002-01-01

    The binary decision tree method is used to separate between several multi-jet topologies in e sup + e sup - collisions. Instead of the univariate process usually taken, a new design procedure for constructing multivariate decision trees is proposed. The segmentation is obtained by considering some features functions, where linear and non-linear discriminant functions and a minimal distance method are used. The classification focuses on ALEPH simulated events, with multi-jet topologies. Compared to a standard univariate tree, the multivariate decision trees offer significantly better performance.

  19. An Evaluation of Different Training Sample Allocation Schemes for Discrete and Continuous Land Cover Classification Using Decision Tree-Based Algorithms

    Directory of Open Access Journals (Sweden)

    René Roland Colditz

    2015-07-01

    Full Text Available Land cover mapping for large regions often employs satellite images of medium to coarse spatial resolution, which complicates mapping of discrete classes. Class memberships, which estimate the proportion of each class for every pixel, have been suggested as an alternative. This paper compares different strategies of training data allocation for discrete and continuous land cover mapping using classification and regression tree algorithms. In addition to measures of discrete and continuous map accuracy the correct estimation of the area is another important criteria. A subset of the 30 m national land cover dataset of 2006 (NLCD2006 of the United States was used as reference set to classify NADIR BRDF-adjusted surface reflectance time series of MODIS at 900 m spatial resolution. Results show that sampling of heterogeneous pixels and sample allocation according to the expected area of each class is best for classification trees. Regression trees for continuous land cover mapping should be trained with random allocation, and predictions should be normalized with a linear scaling function to correctly estimate the total area. From the tested algorithms random forest classification yields lower errors than boosted trees of C5.0, and Cubist shows higher accuracies than random forest regression.

  20. A fuzzy decision tree method for fault classification in the steam generator of a pressurized water reactor

    International Nuclear Information System (INIS)

    Zio, Enrico; Baraldi, Piero; Popescu, Irina Crenguta

    2009-01-01

    This paper extends a method previously introduced by the authors for building a transparent fault classification algorithm by combining the fuzzy clustering, fuzzy logic and decision trees techniques. The baseline method transforms an opaque, fuzzy clustering-based classification model into a fuzzy logic inference model based on linguistic rules which can be represented by a decision tree formalism. The classification model thereby obtained is transparent in that it allows direct interpretation and inspection of the model. An extension in the procedure for the development of the fuzzy logic inference model is introduced to allow the treatment of more complicated cases, e.g. splitted and overlapping clusters. The corresponding computational tool developed relies on a number of parameters which can be tuned by the user to optimally compromise the level of transparency of the classification process and its efficiency. A numerical application is presented with regards to the fault classification in the Steam Generator of a Pressurized Water Reactor.

  1. A classification tree for the prediction of benign versus malignant disease in patients with small renal masses.

    Science.gov (United States)

    Rendon, Ricardo A; Mason, Ross J; Kirkland, Susan; Lawen, Joseph G; Abdolell, Mohamed

    2014-08-01

    To develop a classification tree for the preoperative prediction of benign versus malignant disease in patients with small renal masses. This is a retrospective study including 395 consecutive patients who underwent surgical treatment for a renal mass classification tree to predict the risk of having a benign renal mass preoperatively was developed using recursive partitioning analysis for repeated measures outcomes. Age, sex, volume on preoperative imaging, tumor location (central/peripheral), degree of endophytic component (1%-100%), and tumor axis position were used as potential predictors to develop the model. Forty-five patients (11.4%) were found to have a benign mass postoperatively. A classification tree has been developed which can predict the risk of benign disease with an accuracy of 88.9% (95% CI: 85.3 to 91.8). The significant prognostic factors in the classification tree are tumor volume, degree of endophytic component and symptoms at diagnosis. As an example of its utilization, a renal mass with a volume of classification tree to predict the risk of benign disease in small renal masses has been developed to aid the clinician when deciding on treatment strategies for small renal masses.

  2. Classification of soil respiration in areas of sugarcane renewal using decision tree

    Directory of Open Access Journals (Sweden)

    Camila Viana Vieira Farhate

    Full Text Available ABSTRACT: The use of data mining is a promising alternative to predict soil respiration from correlated variables. Our objective was to build a model using variable selection and decision tree induction to predict different levels of soil respiration, taking into account physical, chemical and microbiological variables of soil as well as precipitation in renewal of sugarcane areas. The original dataset was composed of 19 variables (18 independent variables and one dependent (or response variable. The variable-target refers to soil respiration as the target classification. Due to a large number of variables, a procedure for variable selection was conducted to remove those with low correlation with the variable-target. For that purpose, four approaches of variable selection were evaluated: no variable selection, correlation-based feature selection (CFS, chisquare method (χ2 and Wrapper. To classify soil respiration, we used the decision tree induction technique available in the Weka software package. Our results showed that data mining techniques allow the development of a model for soil respiration classification with accuracy of 81 %, resulting in a knowledge base composed of 27 rules for prediction of soil respiration. In particular, the wrapper method for variable selection identified a subset of only five variables out of 18 available in the original dataset, and they had the following order of influence in determining soil respiration: soil temperature > precipitation > macroporosity > soil moisture > potential acidity.

  3. Application of classification-tree methods to identify nitrate sources in ground water

    Science.gov (United States)

    Spruill, T.B.; Showers, W.J.; Howe, S.S.

    2002-01-01

    A study was conducted to determine if nitrate sources in ground water (fertilizer on crops, fertilizer on golf courses, irrigation spray from hog (Sus scrofa) wastes, and leachate from poultry litter and septic systems) could be classified with 80% or greater success. Two statistical classification-tree models were devised from 48 water samples containing nitrate from five source categories. Model I was constructed by evaluating 32 variables and selecting four primary predictor variables (??15N, nitrate to ammonia ratio, sodium to potassium ratio, and zinc) to identify nitrate sources. A ??15N value of nitrate plus potassium 18.2 indicated inorganic or soil organic N. A nitrate to ammonia ratio 575 indicated nitrate from golf courses. A sodium to potassium ratio 3.2 indicated spray or poultry wastes. A value for zinc 2.8 indicated poultry wastes. Model 2 was devised by using all variables except ??15N. This model also included four variables (sodium plus potassium, nitrate to ammonia ratio, calcium to magnesium ratio, and sodium to potassium ratio) to distinguish categories. Both models were able to distinguish all five source categories with better than 80% overall success and with 71 to 100% success in individual categories using the learning samples. Seventeen water samples that were not used in model development were tested using Model 2 for three categories, and all were correctly classified. Classification-tree models show great potential in identifying sources of contamination and variables important in the source-identification process.

  4. First Experience with Sentinel-2 Data for Crop and Tree Species Classifications in Central Europe

    Directory of Open Access Journals (Sweden)

    Markus Immitzer

    2016-02-01

    Full Text Available The study presents the preliminary results of two classification exercises assessing the capabilities of pre-operational (August 2015 Sentinel-2 (S2 data for mapping crop types and tree species. In the first case study, an S2 image was used to map six summer crop species in Lower Austria as well as winter crops/bare soil. Crop type maps are needed to account for crop-specific water use and for agricultural statistics. Crop type information is also useful to parametrize crop growth models for yield estimation, as well as for the retrieval of vegetation biophysical variables using radiative transfer models. The second case study aimed to map seven different deciduous and coniferous tree species in Germany. Detailed information about tree species distribution is important for forest management and to assess potential impacts of climate change. In our S2 data assessment, crop and tree species maps were produced at 10 m spatial resolution by combining the ten S2 spectral channels with 10 and 20 m pixel size. A supervised Random Forest classifier (RF was deployed and trained with appropriate ground truth. In both case studies, S2 data confirmed its expected capabilities to produce reliable land cover maps. Cross-validated overall accuracies ranged between 65% (tree species and 76% (crop types. The study confirmed the high value of the red-edge and shortwave infrared (SWIR bands for vegetation mapping. Also, the blue band was important in both study sites. The S2-bands in the near infrared were amongst the least important channels. The object based image analysis (OBIA and the classical pixel-based classification achieved comparable results, mainly for the cropland. As only single date acquisitions were available for this study, the full potential of S2 data could not be assessed. In the future, the two twin S2 satellites will offer global coverage every five days and therefore permit to concurrently exploit unprecedented spectral and temporal

  5. The Bird Core for Minimum Cost Spanning Tree problems Revisited : Monotonicity and Additivity Aspects

    NARCIS (Netherlands)

    Tijs, S.H.; Moretti, S.; Brânzei, R.; Norde, H.W.

    2005-01-01

    A new way is presented to define for minimum cost spanning tree (mcst-) games the irreducible core, which is introduced by Bird in 1976.The Bird core correspondence turns out to have interesting monotonicity and additivity properties and each stable cost monotonic allocation rule for mcst-problems

  6. Using spectrotemporal indices to improve the fruit-tree crop classification accuracy

    Science.gov (United States)

    Peña, M. A.; Liao, R.; Brenning, A.

    2017-06-01

    This study assesses the potential of spectrotemporal indices derived from satellite image time series (SITS) to improve the classification accuracy of fruit-tree crops. Six major fruit-tree crop types in the Aconcagua Valley, Chile, were classified by applying various linear discriminant analysis (LDA) techniques on a Landsat-8 time series of nine images corresponding to the 2014-15 growing season. As features we not only used the complete spectral resolution of the SITS, but also all possible normalized difference indices (NDIs) that can be constructed from any two bands of the time series, a novel approach to derive features from SITS. Due to the high dimensionality of this "enhanced" feature set we used the lasso and ridge penalized variants of LDA (PLDA). Although classification accuracies yielded by the standard LDA applied on the full-band SITS were good (misclassification error rate, MER = 0.13), they were further improved by 23% (MER = 0.10) with ridge PLDA using the enhanced feature set. The most important bands to discriminate the crops of interest were mainly concentrated on the first two image dates of the time series, corresponding to the crops' greenup stage. Despite the high predictor weights provided by the red and near infrared bands, typically used to construct greenness spectral indices, other spectral regions were also found important for the discrimination, such as the shortwave infrared band at 2.11-2.19 μm, sensitive to foliar water changes. These findings support the usefulness of spectrotemporal indices in the context of SITS-based crop type classifications, which until now have been mainly constructed by the arithmetic combination of two bands of the same image date in order to derive greenness temporal profiles like those from the normalized difference vegetation index.

  7. OmniGA: Optimized Omnivariate Decision Trees for Generalizable Classification Models

    KAUST Repository

    Magana-Mora, Arturo

    2017-06-14

    Classification problems from different domains vary in complexity, size, and imbalance of the number of samples from different classes. Although several classification models have been proposed, selecting the right model and parameters for a given classification task to achieve good performance is not trivial. Therefore, there is a constant interest in developing novel robust and efficient models suitable for a great variety of data. Here, we propose OmniGA, a framework for the optimization of omnivariate decision trees based on a parallel genetic algorithm, coupled with deep learning structure and ensemble learning methods. The performance of the OmniGA framework is evaluated on 12 different datasets taken mainly from biomedical problems and compared with the results obtained by several robust and commonly used machine-learning models with optimized parameters. The results show that OmniGA systematically outperformed these models for all the considered datasets, reducing the F score error in the range from 100% to 2.25%, compared to the best performing model. This demonstrates that OmniGA produces robust models with improved performance. OmniGA code and datasets are available at www.cbrc.kaust.edu.sa/omniga/.

  8. OmniGA: Optimized Omnivariate Decision Trees for Generalizable Classification Models

    KAUST Repository

    Magana-Mora, Arturo; Bajic, Vladimir B.

    2017-01-01

    Classification problems from different domains vary in complexity, size, and imbalance of the number of samples from different classes. Although several classification models have been proposed, selecting the right model and parameters for a given classification task to achieve good performance is not trivial. Therefore, there is a constant interest in developing novel robust and efficient models suitable for a great variety of data. Here, we propose OmniGA, a framework for the optimization of omnivariate decision trees based on a parallel genetic algorithm, coupled with deep learning structure and ensemble learning methods. The performance of the OmniGA framework is evaluated on 12 different datasets taken mainly from biomedical problems and compared with the results obtained by several robust and commonly used machine-learning models with optimized parameters. The results show that OmniGA systematically outperformed these models for all the considered datasets, reducing the F score error in the range from 100% to 2.25%, compared to the best performing model. This demonstrates that OmniGA produces robust models with improved performance. OmniGA code and datasets are available at www.cbrc.kaust.edu.sa/omniga/.

  9. Automated Detection of Connective Tissue by Tissue Counter Analysis and Classification and Regression Trees

    Directory of Open Access Journals (Sweden)

    Josef Smolle

    2001-01-01

    Full Text Available Objective: To evaluate the feasibility of the CART (Classification and Regression Tree procedure for the recognition of microscopic structures in tissue counter analysis. Methods: Digital microscopic images of H&E stained slides of normal human skin and of primary malignant melanoma were overlayed with regularly distributed square measuring masks (elements and grey value, texture and colour features within each mask were recorded. In the learning set, elements were interactively labeled as representing either connective tissue of the reticular dermis, other tissue components or background. Subsequently, CART models were based on these data sets. Results: Implementation of the CART classification rules into the image analysis program showed that in an independent test set 94.1% of elements classified as connective tissue of the reticular dermis were correctly labeled. Automated measurements of the total amount of tissue and of the amount of connective tissue within a slide showed high reproducibility (r=0.97 and r=0.94, respectively; p < 0.001. Conclusions: CART procedure in tissue counter analysis yields simple and reproducible classification rules for tissue elements.

  10. Examination as to the classification of representative tree species at Satoyama coppice forest using multiwavelength range data observed from aircraft

    International Nuclear Information System (INIS)

    Setojima, M.; Imai, Y.; Funahashi, M.; Kawai, M.; Katsuki, T.

    2006-01-01

    In this study, we examined the possibility of classifying representative tree species at Satoyama coppice forest based on spectral reflectance of the tree species. We used the airborne hyperspectral data observed in exhibition leaf stage at the test forest (about 3.4ha) in Tama Forest Science Garden (Hachioji, Tokyo) , where the forest type similar to that of Satoyama is preserved. The classification accuracy was verified by comparing the results of interpretation of color aerial photographs taken in spring and autumn in chronological order and the field survey. As a result, the 534-556 nm (band 6 and band 7) in the visible range and 739-762 nm (band 15 and band 16), 785nm (band 17) in the near infrared range are effective bands for classification of the species of such trees as Castanopsis sieboldii, Quercus glauca, Zelkova serrata, Quercus serrata, Cryptomeria japonica, and Chamaecyparis obutusa, which are representative trees in Satoyama coppice forest in Tama district

  11. Classification decision tree in CT imaging: application to the differential diagnosis of solitary pulmonary nodules

    International Nuclear Information System (INIS)

    Ma Hongxia; Guo Yulin; Wang Qiuping; Qiang Yongqian; Liu Min; Guo Xiaojuan; Guo Youmin; Chen Qihang

    2008-01-01

    Objective: To establish classification and regression tree (CART) for differentiating benign from malignant solitary pulmonary nudules (SPN). Methods: One hundred and sixteen consecutive cases with 116 solitary pulmonary nodules, which finally were pathologically proven 54 malignant nodules and 62 benign nodules, were prospectively registered in this research. Twelve clinical presentations and 22 CT findings were collected as predictors. A classification tree was established to distinguish benign SPNs from malignant ones. In the observer test, two groups (one made of junior radiologists and one of senior radiologists) were independently presented with clinical information and CT images without knowing the pathologic and machine-learning results. Performance of observers and CART were compared by receiver operating characteristic analysis. Results: Receiver operating characteristic analysis showed areas under the curve of CART, senior radiologists and junior radiologists respectively were 0.910±0.029, 0.827±0.038, 0.612±0.052. Difference between areas(DBF) between CART and junior radiologists was 0.297(P<0.01). DBF between CART and senior radiologists was 0.083 (P<0.05). DBF between senior and junior radiologists was 0.214 (P<0.01). CART showed a best diagnostic efficiency, followed by junior radiologists, and then senior radiologists. Conclusion: Our data mining techniques using CART prove a high accuracy in differentiating benign from malignant pulmonary nodules based on clinical variables and CT findings. It will be a potentially useful tool in further application of artificial intelligence in the imaging diagnosis. (authors)

  12. Using classification tree modelling to investigate drug prescription practices at health facilities in rural Tanzania

    Directory of Open Access Journals (Sweden)

    Kajungu Dan K

    2012-09-01

    Full Text Available Abstract Background Drug prescription practices depend on several factors related to the patient, health worker and health facilities. A better understanding of the factors influencing prescription patterns is essential to develop strategies to mitigate the negative consequences associated with poor practices in both the public and private sectors. Methods A cross-sectional study was conducted in rural Tanzania among patients attending health facilities, and health workers. Patients, health workers and health facilities-related factors with the potential to influence drug prescription patterns were used to build a model of key predictors. Standard data mining methodology of classification tree analysis was used to define the importance of the different factors on prescription patterns. Results This analysis included 1,470 patients and 71 health workers practicing in 30 health facilities. Patients were mostly treated in dispensaries. Twenty two variables were used to construct two classification tree models: one for polypharmacy (prescription of ≥3 drugs on a single clinic visit and one for co-prescription of artemether-lumefantrine (AL with antibiotics. The most important predictor of polypharmacy was the diagnosis of several illnesses. Polypharmacy was also associated with little or no supervision of the health workers, administration of AL and private facilities. Co-prescription of AL with antibiotics was more frequent in children under five years of age and the other important predictors were transmission season, mode of diagnosis and the location of the health facility. Conclusion Standard data mining methodology is an easy-to-implement analytical approach that can be useful for decision-making. Polypharmacy is mainly due to the diagnosis of multiple illnesses.

  13. A machine learning approach to galaxy-LSS classification - I. Imprints on halo merger trees

    Science.gov (United States)

    Hui, Jianan; Aragon, Miguel; Cui, Xinping; Flegal, James M.

    2018-04-01

    The cosmic web plays a major role in the formation and evolution of galaxies and defines, to a large extent, their properties. However, the relation between galaxies and environment is still not well understood. Here, we present a machine learning approach to study imprints of environmental effects on the mass assembly of haloes. We present a galaxy-LSS machine learning classifier based on galaxy properties sensitive to the environment. We then use the classifier to assess the relevance of each property. Correlations between galaxy properties and their cosmic environment can be used to predict galaxy membership to void/wall or filament/cluster with an accuracy of 93 per cent. Our study unveils environmental information encoded in properties of haloes not normally considered directly dependent on the cosmic environment such as merger history and complexity. Understanding the physical mechanism by which the cosmic web is imprinted in a halo can lead to significant improvements in galaxy formation models. This is accomplished by extracting features from galaxy properties and merger trees, computing feature scores for each feature and then applying support vector machine (SVM) to different feature sets. To this end, we have discovered that the shape and depth of the merger tree, formation time, and density of the galaxy are strongly associated with the cosmic environment. We describe a significant improvement in the original classification algorithm by performing LU decomposition of the distance matrix computed by the feature vectors and then using the output of the decomposition as input vectors for SVM.

  14. KLASIFIKASI KARAKTERISTIK KECELAKAAN LALU LINTAS DI KOTA DENPASAR DENGAN PENDEKATAN CLASSIFICATION AND REGRESSION TREES (CART

    Directory of Open Access Journals (Sweden)

    I GEDE AGUS JIWADIANA

    2015-11-01

    Full Text Available The aim of this research is to determine the classification characteristics of traffic accidents in Denpasar city in January-July 2014 by using Classification And Regression Trees (CART. Then, for determine the explanatory variables into the main classifier of CART. The result showed that optimum CART generate three terminal node. First terminal node, there are 12 people were classified as heavy traffic accident characteritics with single accident, and second terminal nodes, there are 68 people were classified as minor traffic accident characteristics by type of traffic accident front-rear, front-front, front-side, pedestrians, side-side and location of traffic accident in district road and sub-district road. For third terminal node, there are 291 people were classified as medium traffic accident characteristics by type of traffic accident front-rear, front-front, front-side, pedestrians, side-side and location of traffic accident in municipality road and explanatory variables into the main splitter to make of CART is type of traffic accident with maximum homogeneity measure of 0.03252.

  15. Identification of Sexually Abused Female Adolescents at Risk for Suicidal Ideations: A Classification and Regression Tree Analysis

    Science.gov (United States)

    Brabant, Marie-Eve; Hebert, Martine; Chagnon, Francois

    2013-01-01

    This study explored the clinical profiles of 77 female teenager survivors of sexual abuse and examined the association of abuse-related and personal variables with suicidal ideations. Analyses revealed that 64% of participants experienced suicidal ideations. Findings from classification and regression tree analysis indicated that depression,…

  16. Exploiting machine learning algorithms for tree species classification in a semiarid woodland using RapidEye image

    CSIR Research Space (South Africa)

    Adelabu, S

    2013-11-01

    Full Text Available in semiarid environments. In this study, we examined the suitability of 5-band RapidEye satellite data for the classification of five tree species in mopane woodland of Botswana using machine leaning algorithms with limited training samples. We performed...

  17. A ROUGH SET DECISION TREE BASED MLP-CNN FOR VERY HIGH RESOLUTION REMOTELY SENSED IMAGE CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    C. Zhang

    2017-09-01

    Full Text Available Recent advances in remote sensing have witnessed a great amount of very high resolution (VHR images acquired at sub-metre spatial resolution. These VHR remotely sensed data has post enormous challenges in processing, analysing and classifying them effectively due to the high spatial complexity and heterogeneity. Although many computer-aid classification methods that based on machine learning approaches have been developed over the past decades, most of them are developed toward pixel level spectral differentiation, e.g. Multi-Layer Perceptron (MLP, which are unable to exploit abundant spatial details within VHR images. This paper introduced a rough set model as a general framework to objectively characterize the uncertainty in CNN classification results, and further partition them into correctness and incorrectness on the map. The correct classification regions of CNN were trusted and maintained, whereas the misclassification areas were reclassified using a decision tree with both CNN and MLP. The effectiveness of the proposed rough set decision tree based MLP-CNN was tested using an urban area at Bournemouth, United Kingdom. The MLP-CNN, well capturing the complementarity between CNN and MLP through the rough set based decision tree, achieved the best classification performance both visually and numerically. Therefore, this research paves the way to achieve fully automatic and effective VHR image classification.

  18. TREE SPECIES CLASSIFICATION OF BROADLEAVED FORESTS IN NAGANO, CENTRAL JAPAN, USING AIRBORNE LASER DATA AND MULTISPECTRAL IMAGES

    Directory of Open Access Journals (Sweden)

    S. Deng

    2017-10-01

    Full Text Available This study attempted to classify three coniferous and ten broadleaved tree species by combining airborne laser scanning (ALS data and multispectral images. The study area, located in Nagano, central Japan, is within the broadleaved forests of the Afan Woodland area. A total of 235 trees were surveyed in 2016, and we recorded the species, DBH, and tree height. The geographical position of each tree was collected using a Global Navigation Satellite System (GNSS device. Tree crowns were manually detected using GNSS position data, field photographs, true-color orthoimages with three bands (red-green-blue, RGB, 3D point clouds, and a canopy height model derived from ALS data. Then a total of 69 features, including 27 image-based and 42 point-based features, were extracted from the RGB images and the ALS data to classify tree species. Finally, the detected tree crowns were classified into two classes for the first level (coniferous and broadleaved trees, four classes for the second level (Pinus densiflora, Larix kaempferi, Cryptomeria japonica, and broadleaved trees, and 13 classes for the third level (three coniferous and ten broadleaved species, using the 27 image-based features, 42 point-based features, all 69 features, and the best combination of features identified using a neighborhood component analysis algorithm, respectively. The overall classification accuracies reached 90 % at the first and second levels but less than 60 % at the third level. The classifications using the best combinations of features had higher accuracies than those using the image-based and point-based features and the combination of all of the 69 features.

  19. Bayesian and Classical Machine Learning Methods: A Comparison for Tree Species Classification with LiDAR Waveform Signatures

    Directory of Open Access Journals (Sweden)

    Tan Zhou

    2017-12-01

    Full Text Available A plethora of information contained in full-waveform (FW Light Detection and Ranging (LiDAR data offers prospects for characterizing vegetation structures. This study aims to investigate the capacity of FW LiDAR data alone for tree species identification through the integration of waveform metrics with machine learning methods and Bayesian inference. Specifically, we first conducted automatic tree segmentation based on the waveform-based canopy height model (CHM using three approaches including TreeVaW, watershed algorithms and the combination of TreeVaW and watershed (TW algorithms. Subsequently, the Random forests (RF and Conditional inference forests (CF models were employed to identify important tree-level waveform metrics derived from three distinct sources, such as raw waveforms, composite waveforms, the waveform-based point cloud and the combined variables from these three sources. Further, we discriminated tree (gray pine, blue oak, interior live oak and shrub species through the RF, CF and Bayesian multinomial logistic regression (BMLR using important waveform metrics identified in this study. Results of the tree segmentation demonstrated that the TW algorithms outperformed other algorithms for delineating individual tree crowns. The CF model overcomes waveform metrics selection bias caused by the RF model which favors correlated metrics and enhances the accuracy of subsequent classification. We also found that composite waveforms are more informative than raw waveforms and waveform-based point cloud for characterizing tree species in our study area. Both classical machine learning methods (the RF and CF and the BMLR generated satisfactory average overall accuracy (74% for the RF, 77% for the CF and 81% for the BMLR and the BMLR slightly outperformed the other two methods. However, these three methods suffered from low individual classification accuracy for the blue oak which is prone to being misclassified as the interior live oak due

  20. A Decision-Tree-Based Algorithm for Speech/Music Classification and Segmentation

    Directory of Open Access Journals (Sweden)

    Lavner Yizhar

    2009-01-01

    Full Text Available We present an efficient algorithm for segmentation of audio signals into speech or music. The central motivation to our study is consumer audio applications, where various real-time enhancements are often applied. The algorithm consists of a learning phase and a classification phase. In the learning phase, predefined training data is used for computing various time-domain and frequency-domain features, for speech and music signals separately, and estimating the optimal speech/music thresholds, based on the probability density functions of the features. An automatic procedure is employed to select the best features for separation. In the test phase, initial classification is performed for each segment of the audio signal, using a three-stage sieve-like approach, applying both Bayesian and rule-based methods. To avoid erroneous rapid alternations in the classification, a smoothing technique is applied, averaging the decision on each segment with past segment decisions. Extensive evaluation of the algorithm, on a database of more than 12 hours of speech and more than 22 hours of music showed correct identification rates of 99.4% and 97.8%, respectively, and quick adjustment to alternating speech/music sections. In addition to its accuracy and robustness, the algorithm can be easily adapted to different audio types, and is suitable for real-time operation.

  1. Unified framework for triaxial accelerometer-based fall event detection and classification using cumulants and hierarchical decision tree classifier.

    Science.gov (United States)

    Kambhampati, Satya Samyukta; Singh, Vishal; Manikandan, M Sabarimalai; Ramkumar, Barathram

    2015-08-01

    In this Letter, the authors present a unified framework for fall event detection and classification using the cumulants extracted from the acceleration (ACC) signals acquired using a single waist-mounted triaxial accelerometer. The main objective of this Letter is to find suitable representative cumulants and classifiers in effectively detecting and classifying different types of fall and non-fall events. It was discovered that the first level of the proposed hierarchical decision tree algorithm implements fall detection using fifth-order cumulants and support vector machine (SVM) classifier. In the second level, the fall event classification algorithm uses the fifth-order cumulants and SVM. Finally, human activity classification is performed using the second-order cumulants and SVM. The detection and classification results are compared with those of the decision tree, naive Bayes, multilayer perceptron and SVM classifiers with different types of time-domain features including the second-, third-, fourth- and fifth-order cumulants and the signal magnitude vector and signal magnitude area. The experimental results demonstrate that the second- and fifth-order cumulant features and SVM classifier can achieve optimal detection and classification rates of above 95%, as well as the lowest false alarm rate of 1.03%.

  2. Classification tree analyses reveal limited potential for early targeted prevention against childhood overweight.

    Science.gov (United States)

    Beyerlein, Andreas; Kusian, Dennis; Ziegler, Anette-Gabriele; Schaffrath-Rosario, Angelika; von Kries, Rüdiger

    2014-02-01

    Whether specific combinations of risk factors in very early life might allow identification of high-risk target groups for overweight prevention programs was examined. Data of n = 8981 children from the German KiGGS study were analyzed. Using a classification tree approach, predictive risk factor combinations were assessed for overweight in 3-6, 7-10, and 11-17-year-old children. In preschool children, the subgroup with the highest overweight risk were migrant children with at least one obese parent, with a prevalence of 36.6 (95% confidence interval or CI: 22.9, 50.4)%, compared to an overall prevalence of 10.0 (8.9, 11.2)%. The prevalence of overweight increased from 18.3 (16.8, 19.8)% to 57.9 (46.6, 69.3)% in 7-10-year-old children, if at least one parent was obese and the child had been born large-for-gestational-age. In 11-17-year-olds, the overweight risk increased from 20.1 (18.9, 21.3)% to 63.0 (46.4, 79.7)% in the highest risk group. However, high prevalence ratios were found only in small subgroups, containing <10% of all overweight cases in the respective age group. Our results indicate only a limited potential for early targeted preventions against overweight in children and adolescents. Copyright © 2013 The Obesity Society.

  3. Predicting smear negative pulmonary tuberculosis with classification trees and logistic regression: a cross-sectional study

    Directory of Open Access Journals (Sweden)

    Kritski Afrânio

    2006-02-01

    Full Text Available Abstract Background Smear negative pulmonary tuberculosis (SNPT accounts for 30% of pulmonary tuberculosis cases reported yearly in Brazil. This study aimed to develop a prediction model for SNPT for outpatients in areas with scarce resources. Methods The study enrolled 551 patients with clinical-radiological suspicion of SNPT, in Rio de Janeiro, Brazil. The original data was divided into two equivalent samples for generation and validation of the prediction models. Symptoms, physical signs and chest X-rays were used for constructing logistic regression and classification and regression tree models. From the logistic regression, we generated a clinical and radiological prediction score. The area under the receiver operator characteristic curve, sensitivity, and specificity were used to evaluate the model's performance in both generation and validation samples. Results It was possible to generate predictive models for SNPT with sensitivity ranging from 64% to 71% and specificity ranging from 58% to 76%. Conclusion The results suggest that those models might be useful as screening tools for estimating the risk of SNPT, optimizing the utilization of more expensive tests, and avoiding costs of unnecessary anti-tuberculosis treatment. Those models might be cost-effective tools in a health care network with hierarchical distribution of scarce resources.

  4. City housing atmospheric pollutant impact on emergency visit for asthma: A classification and regression tree approach.

    Science.gov (United States)

    Mazenq, Julie; Dubus, Jean-Christophe; Gaudart, Jean; Charpin, Denis; Viudes, Gilles; Noel, Guilhem

    2017-11-01

    Particulate matter, nitrogen dioxide (NO 2 ) and ozone are recognized as the three pollutants that most significantly affect human health. Asthma is a multifactorial disease. However, the place of residence has rarely been investigated. We compared the impact of air pollution, measured near patients' homes, on emergency department (ED) visits for asthma or trauma (controls) within the Provence-Alpes-Côte-d'Azur region. Variables were selected using classification and regression trees on asthmatic and control population, 3-99 years, visiting ED from January 1 to December 31, 2013. Then in a nested case control study, randomization was based on the day of ED visit and on defined age groups. Pollution, meteorological, pollens and viral data measured that day were linked to the patient's ZIP code. A total of 794,884 visits were reported including 6250 for asthma and 278,192 for trauma. Factors associated with an excess risk of emergency visit for asthma included short-term exposure to NO 2 , female gender, high viral load and a combination of low temperature and high humidity. Short-term exposures to high NO 2 concentrations, as assessed close to the homes of the patients, were significantly associated with asthma-related ED visits in children and adults. Copyright © 2017 Elsevier Ltd. All rights reserved.

  5. Differential Diagnosis of Erythmato-Squamous Diseases Using Classification and Regression Tree.

    Science.gov (United States)

    Maghooli, Keivan; Langarizadeh, Mostafa; Shahmoradi, Leila; Habibi-Koolaee, Mahdi; Jebraeily, Mohamad; Bouraghi, Hamid

    2016-10-01

    Differential diagnosis of Erythmato-Squamous Diseases (ESD) is a major challenge in the field of dermatology. The ESD diseases are placed into six different classes. Data mining is the process for detection of hidden patterns. In the case of ESD, data mining help us to predict the diseases. Different algorithms were developed for this purpose. we aimed to use the Classification and Regression Tree (CART) to predict differential diagnosis of ESD. we used the Cross Industry Standard Process for Data Mining (CRISP-DM) methodology. For this purpose, the dermatology data set from machine learning repository, UCI was obtained. The Clementine 12.0 software from IBM Company was used for modelling. In order to evaluation of the model we calculate the accuracy, sensitivity and specificity of the model. The proposed model had an accuracy of 94.84% (. 24.42) in order to correct prediction of the ESD disease. Results indicated that using of this classifier could be useful. But, it would be strongly recommended that the combination of machine learning methods could be more useful in terms of prediction of ESD.

  6. Beef Quality Identification Using Thresholding Method and Decision Tree Classification Based on Android Smartphone

    Directory of Open Access Journals (Sweden)

    Kusworo Adi

    2017-01-01

    Full Text Available Beef is one of the animal food products that have high nutrition because it contains carbohydrates, proteins, fats, vitamins, and minerals. Therefore, the quality of beef should be maintained so that consumers get good beef quality. Determination of beef quality is commonly conducted visually by comparing the actual beef and reference pictures of each beef class. This process presents weaknesses, as it is subjective in nature and takes a considerable amount of time. Therefore, an automated system based on image processing that is capable of determining beef quality is required. This research aims to develop an image segmentation method by processing digital images. The system designed consists of image acquisition processes with varied distance, resolution, and angle. Image segmentation is done to separate the images of fat and meat using the Otsu thresholding method. Classification was carried out using the decision tree algorithm and the best accuracies were obtained at 90% for training and 84% for testing. Once developed, this system is then embedded into the android programming. Results show that the image processing technique is capable of proper marbling score identification.

  7. Groundwater level prediction of landslide based on classification and regression tree

    Directory of Open Access Journals (Sweden)

    Yannan Zhao

    2016-09-01

    Full Text Available According to groundwater level monitoring data of Shuping landslide in the Three Gorges Reservoir area, based on the response relationship between influential factors such as rainfall and reservoir level and the change of groundwater level, the influential factors of groundwater level were selected. Then the classification and regression tree (CART model was constructed by the subset and used to predict the groundwater level. Through the verification, the predictive results of the test sample were consistent with the actually measured values, and the mean absolute error and relative error is 0.28 m and 1.15% respectively. To compare the support vector machine (SVM model constructed using the same set of factors, the mean absolute error and relative error of predicted results is 1.53 m and 6.11% respectively. It is indicated that CART model has not only better fitting and generalization ability, but also strong advantages in the analysis of landslide groundwater dynamic characteristics and the screening of important variables. It is an effective method for prediction of ground water level in landslides.

  8. CLASSIFICATION OF ENTREPRENEURIAL INTENTIONS BY NEURAL NETWORKS, DECISION TREES AND SUPPORT VECTOR MACHINES

    Directory of Open Access Journals (Sweden)

    Marijana Zekić-Sušac

    2010-12-01

    Full Text Available Entrepreneurial intentions of students are important to recognize during the study in order to provide those students with educational background that will support such intentions and lead them to successful entrepreneurship after the study. The paper aims to develop a model that will classify students according to their entrepreneurial intentions by benchmarking three machine learning classifiers: neural networks, decision trees, and support vector machines. A survey was conducted at a Croatian university including a sample of students at the first year of study. Input variables described students’ demographics, importance of business objectives, perception of entrepreneurial carrier, and entrepreneurial predispositions. Due to a large dimension of input space, a feature selection method was used in the pre-processing stage. For comparison reasons, all tested models were validated on the same out-of-sample dataset, and a cross-validation procedure for testing generalization ability of the models was conducted. The models were compared according to its classification accuracy, as well according to input variable importance. The results show that although the best neural network model produced the highest average hit rate, the difference in performance is not statistically significant. All three models also extract similar set of features relevant for classifying students, which can be suggested to be taken into consideration by universities while designing their academic programs.

  9. Classification tree analysis of second neoplasms in survivors of childhood cancer

    International Nuclear Information System (INIS)

    Jazbec, Janez; Todorovski, Ljupčo; Jereb, Berta

    2007-01-01

    Reports on childhood cancer survivors estimated cumulative probability of developing secondary neoplasms vary from 3,3% to 25% at 25 years from diagnosis, and the risk of developing another cancer to several times greater than in the general population. In our retrospective study, we have used the classification tree multivariate method on a group of 849 first cancer survivors, to identify childhood cancer patients with the greatest risk for development of secondary neoplasms. In observed group of patients, 34 develop secondary neoplasm after treatment of primary cancer. Analysis of parameters present at the treatment of first cancer, exposed two groups of patients at the special risk for secondary neoplasm. First are female patients treated for Hodgkin's disease at the age between 10 and 15 years, whose treatment included radiotherapy. Second group at special risk were male patients with acute lymphoblastic leukemia who were treated at the age between 4,6 and 6,6 years of age. The risk groups identified in our study are similar to the results of studies that used more conventional approaches. Usefulness of our approach in study of occurrence of second neoplasms should be confirmed in larger sample study, but user friendly presentation of results makes it attractive for further studies

  10. Study and ranking of determinants of Taenia solium infections by classification tree models.

    Science.gov (United States)

    Mwape, Kabemba E; Phiri, Isaac K; Praet, Nicolas; Dorny, Pierre; Muma, John B; Zulu, Gideon; Speybroeck, Niko; Gabriël, Sarah

    2015-01-01

    Taenia solium taeniasis/cysticercosis is an important public health problem occurring mainly in developing countries. This work aimed to study the determinants of human T. solium infections in the Eastern province of Zambia and rank them in order of importance. A household (HH)-level questionnaire was administered to 680 HHs from 53 villages in two rural districts and the taeniasis and cysticercosis status determined. A classification tree model (CART) was used to define the relative importance and interactions between different predictor variables in their effect on taeniasis and cysticercosis. The Katete study area had a significantly higher taeniasis and cysticercosis prevalence than the Petauke area. The CART analysis for Katete showed that the most important determinant for cysticercosis infections was the number of HH inhabitants (6 to 10) and for taeniasis was the number of HH inhabitants > 6. The most important determinant in Petauke for cysticercosis was the age of head of household > 32 years and for taeniasis it was age taeniasis and cysticercosis infections was the number of HH inhabitants (6 to 10) in Katete district and age in Petauke. The results suggest that control measures should target HHs with a high number of inhabitants and older individuals. © The American Society of Tropical Medicine and Hygiene.

  11. Use of Binary Partition Tree and energy minimization for object-based classification of urban land cover

    Science.gov (United States)

    Li, Mengmeng; Bijker, Wietske; Stein, Alfred

    2015-04-01

    Two main challenges are faced when classifying urban land cover from very high resolution satellite images: obtaining an optimal image segmentation and distinguishing buildings from other man-made objects. For optimal segmentation, this work proposes a hierarchical representation of an image by means of a Binary Partition Tree (BPT) and an unsupervised evaluation of image segmentations by energy minimization. For building extraction, we apply fuzzy sets to create a fuzzy landscape of shadows which in turn involves a two-step procedure. The first step is a preliminarily image classification at a fine segmentation level to generate vegetation and shadow information. The second step models the directional relationship between building and shadow objects to extract building information at the optimal segmentation level. We conducted the experiments on two datasets of Pléiades images from Wuhan City, China. To demonstrate its performance, the proposed classification is compared at the optimal segmentation level with Maximum Likelihood Classification and Support Vector Machine classification. The results show that the proposed classification produced the highest overall accuracies and kappa coefficients, and the smallest over-classification and under-classification geometric errors. We conclude first that integrating BPT with energy minimization offers an effective means for image segmentation. Second, we conclude that the directional relationship between building and shadow objects represented by a fuzzy landscape is important for building extraction.

  12. Predicting the disease of Alzheimer with SNP biomarkers and clinical data using data mining classification approach: decision tree.

    Science.gov (United States)

    Erdoğan, Onur; Aydin Son, Yeşim

    2014-01-01

    Single Nucleotide Polymorphisms (SNPs) are the most common genomic variations where only a single nucleotide differs between individuals. Individual SNPs and SNP profiles associated with diseases can be utilized as biological markers. But there is a need to determine the SNP subsets and patients' clinical data which is informative for the diagnosis. Data mining approaches have the highest potential for extracting the knowledge from genomic datasets and selecting the representative SNPs as well as most effective and informative clinical features for the clinical diagnosis of the diseases. In this study, we have applied one of the widely used data mining classification methodology: "decision tree" for associating the SNP biomarkers and significant clinical data with the Alzheimer's disease (AD), which is the most common form of "dementia". Different tree construction parameters have been compared for the optimization, and the most accurate tree for predicting the AD is presented.

  13. Traditional Chinese medicine pharmacovigilance in signal detection: decision tree-based data classification.

    Science.gov (United States)

    Wei, Jian-Xiang; Wang, Jing; Zhu, Yun-Xia; Sun, Jun; Xu, Hou-Ming; Li, Ming

    2018-03-09

    Traditional Chinese Medicine (TCM) is a style of traditional medicine informed by modern medicine but built on a foundation of more than 2500 years of Chinese medical practice. According to statistics, TCM accounts for approximately 14% of total adverse drug reaction (ADR) spontaneous reporting data in China. Because of the complexity of the components in TCM formula, which makes it essentially different from Western medicine, it is critical to determine whether ADR reports of TCM should be analyzed independently. Reports in the Chinese spontaneous reporting database between 2010 and 2011 were selected. The dataset was processed and divided into the total sample (all data) and the subsample (including TCM data only). Four different ADR signal detection methods-PRR, ROR, MHRA and IC- currently widely used in China, were applied for signal detection on the two samples. By comparison of experimental results, three of them-PRR, MHRA and IC-were chosen to do the experiment. We designed several indicators for performance evaluation such as R (recall ratio), P (precision ratio), and D (discrepancy ratio) based on the reference database and then constructed a decision tree for data classification based on such indicators. For PRR: R 1 -R 2  = 0.72%, P 1 -P 2  = 0.16% and D = 0.92%; For MHRA: R 1 -R 2  = 0.97%, P 1 -P 2  = 0.20% and D = 1.18%; For IC: R 1 -R 2  = 1.44%, P 2 -P 1  = 4.06% and D = 4.72%. The threshold of R,Pand Dis set as 2%, 2% and 3% respectively. Based on the decision tree, the results are "separation" for PRR, MHRA and IC. In order to improve the efficiency and accuracy of signal detection, we suggest that TCM data should be separated from the total sample when conducting analyses.

  14. Active learning strategies for the deduplication of electronic patient data using classification trees.

    Science.gov (United States)

    Sariyar, M; Borg, A; Pommerening, K

    2012-10-01

    Supervised record linkage methods often require a clerical review to gain informative training data. Active learning means to actively prompt the user to label data with special characteristics in order to minimise the review costs. We conducted an empirical evaluation to investigate whether a simple active learning strategy using binary comparison patterns is sufficient or if string metrics together with a more sophisticated algorithm are necessary to achieve high accuracies with a small training set. Based on medical registry data with different numbers of attributes, we used active learning to acquire training sets for classification trees, which were then used to classify the remaining data. Active learning for binary patterns means that every distinct comparison pattern represents a stratum from which one item is sampled. Active learning for patterns consisting of the Levenshtein string metric values uses an iterative process where the most informative and representative examples are added to the training set. In this context, we extended the active learning strategy by Sarawagi and Bhamidipaty (2002). On the original data set, active learning based on binary comparison patterns leads to the best results. When dropping four or six attributes, using string metrics leads to better results. In both cases, not more than 200 manually reviewed training examples are necessary. In record linkage applications where only forename, name and birthday are available as attributes, we suggest the sophisticated active learning strategy based on string metrics in order to achieve highly accurate results. We recommend the simple strategy if more attributes are available, as in our study. In both cases, active learning significantly reduces the amount of manual involvement in training data selection compared to usual record linkage settings. Copyright © 2012 Elsevier Inc. All rights reserved.

  15. Analysis of effects of manhole covers on motorcycle driver maneuvers: a nonparametric classification tree approach.

    Science.gov (United States)

    Chang, Li-Yen

    2014-01-01

    A manhole cover is a removable plate forming the lid over the opening of a manhole to allow traffic to pass over the manhole and to prevent people from falling in. Because most manhole covers are placed in roadway traffic lanes, if these manhole covers are not appropriately installed or maintained, they can represent unexpected hazards on the road, especially for motorcycle drivers. The objective of this study is to identify the effects of manhole cover characteristics as well as driver factors and traffic and roadway conditions on motorcycle driver maneuvers. A video camera was used to record motorcycle drivers' maneuvers when they encountered an inappropriately installed or maintained manhole cover. Information on 3059 drivers' maneuver decisions was recorded. Classification and regression tree (CART) models were applied to explore factors that can significantly affect motorcycle driver maneuvers when passing a manhole cover. Nearly 50 percent of the motorcycle drivers decelerated or changed their driving path to reduce the effects of the manhole cover. The manhole cover characteristics including the level difference between manhole cover and pavement, the pavement condition over the manhole cover, and the size of the manhole cover can significantly affect motorcycle driver maneuvers. Other factors, including traffic conditions, lane width, motorcycle speed, and loading conditions, also have significant effects on motorcycle driver maneuvers. To reduce the effects and potential risks from the manhole covers, highway authorities not only need to make sure that any newly installed manhole covers are as level as possible but also need to regularly maintain all the manhole covers to ensure that they are in good condition. In the long run, the size of manhole covers should be kept as small as possible so that the impact of manhole covers on motorcycle drivers can be effectively reduced. Supplemental materials are available for this article. Go to the publisher

  16. Chronic subdural hematoma: Surgical management and outcome in 986 cases: A classification and regression tree approach

    Science.gov (United States)

    Rovlias, Aristedis; Theodoropoulos, Spyridon; Papoutsakis, Dimitrios

    2015-01-01

    Background: Chronic subdural hematoma (CSDH) is one of the most common clinical entities in daily neurosurgical practice which carries a most favorable prognosis. However, because of the advanced age and medical problems of patients, surgical therapy is frequently associated with various complications. This study evaluated the clinical features, radiological findings, and neurological outcome in a large series of patients with CSDH. Methods: A classification and regression tree (CART) technique was employed in the analysis of data from 986 patients who were operated at Asclepeion General Hospital of Athens from January 1986 to December 2011. Burr holes evacuation with closed system drainage has been the operative technique of first choice at our institution for 29 consecutive years. A total of 27 prognostic factors were examined to predict the outcome at 3-month postoperatively. Results: Our results indicated that neurological status on admission was the best predictor of outcome. With regard to the other data, age, brain atrophy, thickness and density of hematoma, subdural accumulation of air, and antiplatelet and anticoagulant therapy were found to correlate significantly with prognosis. The overall cross-validated predictive accuracy of CART model was 85.34%, with a cross-validated relative error of 0.326. Conclusions: Methodologically, CART technique is quite different from the more commonly used methods, with the primary benefit of illustrating the important prognostic variables as related to outcome. Since, the ideal therapy for the treatment of CSDH is still under debate, this technique may prove useful in developing new therapeutic strategies and approaches for patients with CSDH. PMID:26257985

  17. Voxel-based plaque classification in coronary intravascular optical coherence tomography images using decision trees

    Science.gov (United States)

    Kolluru, Chaitanya; Prabhu, David; Gharaibeh, Yazan; Wu, Hao; Wilson, David L.

    2018-02-01

    Intravascular Optical Coherence Tomography (IVOCT) is a high contrast, 3D microscopic imaging technique that can be used to assess atherosclerosis and guide stent interventions. Despite its advantages, IVOCT image interpretation is challenging and time consuming with over 500 image frames generated in a single pullback volume. We have developed a method to classify voxel plaque types in IVOCT images using machine learning. To train and test the classifier, we have used our unique database of labeled cadaver vessel IVOCT images accurately registered to gold standard cryoimages. This database currently contains 300 images and is growing. Each voxel is labeled as fibrotic, lipid-rich, calcified or other. Optical attenuation, intensity and texture features were extracted for each voxel and were used to build a decision tree classifier for multi-class classification. Five-fold cross-validation across images gave accuracies of 96 % +/- 0.01 %, 90 +/- 0.02% and 90 % +/- 0.01 % for fibrotic, lipid-rich and calcified classes respectively. To rectify performance degradation seen in left out vessel specimens as opposed to left out images, we are adding data and reducing features to limit overfitting. Following spatial noise cleaning, important vascular regions were unambiguous in display. We developed displays that enable physicians to make rapid determination of calcified and lipid regions. This will inform treatment decisions such as the need for devices (e.g., atherectomy or scoring balloon in the case of calcifications) or extended stent lengths to ensure coverage of lipid regions prone to injury at the edge of a stent.

  18. Classification and regression tree (CART) model to predict pulmonary tuberculosis in hospitalized patients.

    Science.gov (United States)

    Aguiar, Fabio S; Almeida, Luciana L; Ruffino-Netto, Antonio; Kritski, Afranio Lineu; Mello, Fernanda Cq; Werneck, Guilherme L

    2012-08-07

    Tuberculosis (TB) remains a public health issue worldwide. The lack of specific clinical symptoms to diagnose TB makes the correct decision to admit patients to respiratory isolation a difficult task for the clinician. Isolation of patients without the disease is common and increases health costs. Decision models for the diagnosis of TB in patients attending hospitals can increase the quality of care and decrease costs, without the risk of hospital transmission. We present a predictive model for predicting pulmonary TB in hospitalized patients in a high prevalence area in order to contribute to a more rational use of isolation rooms without increasing the risk of transmission. Cross sectional study of patients admitted to CFFH from March 2003 to December 2004. A classification and regression tree (CART) model was generated and validated. The area under the ROC curve (AUC), sensitivity, specificity, positive and negative predictive values were used to evaluate the performance of model. Validation of the model was performed with a different sample of patients admitted to the same hospital from January to December 2005. We studied 290 patients admitted with clinical suspicion of TB. Diagnosis was confirmed in 26.5% of them. Pulmonary TB was present in 83.7% of the patients with TB (62.3% with positive sputum smear) and HIV/AIDS was present in 56.9% of patients. The validated CART model showed sensitivity, specificity, positive predictive value and negative predictive value of 60.00%, 76.16%, 33.33%, and 90.55%, respectively. The AUC was 79.70%. The CART model developed for these hospitalized patients with clinical suspicion of TB had fair to good predictive performance for pulmonary TB. The most important variable for prediction of TB diagnosis was chest radiograph results. Prospective validation is still necessary, but our model offer an alternative for decision making in whether to isolate patients with clinical suspicion of TB in tertiary health facilities in

  19. Integrating classification trees with local logistic regression in Intensive Care prognosis.

    Science.gov (United States)

    Abu-Hanna, Ameen; de Keizer, Nicolette

    2003-01-01

    Health care effectiveness and efficiency are under constant scrutiny especially when treatment is quite costly as in the Intensive Care (IC). Currently there are various international quality of care programs for the evaluation of IC. At the heart of such quality of care programs lie prognostic models whose prediction of patient mortality can be used as a norm to which actual mortality is compared. The current generation of prognostic models in IC are statistical parametric models based on logistic regression. Given a description of a patient at admission, these models predict the probability of his or her survival. Typically, this patient description relies on an aggregate variable, called a score, that quantifies the severity of illness of the patient. The use of a parametric model and an aggregate score form adequate means to develop models when data is relatively scarce but it introduces the risk of bias. This paper motivates and suggests a method for studying and improving the performance behavior of current state-of-the-art IC prognostic models. Our method is based on machine learning and statistical ideas and relies on exploiting information that underlies a score variable. In particular, this underlying information is used to construct a classification tree whose nodes denote patient sub-populations. For these sub-populations, local models, most notably logistic regression ones, are developed using only the total score variable. We compare the performance of this hybrid model to that of a traditional global logistic regression model. We show that the hybrid model not only provides more insight into the data but also has a better performance. We pay special attention to the precision aspect of model performance and argue why precision is more important than discrimination ability.

  20. Trees

    Science.gov (United States)

    Al-Khaja, Nawal

    2007-01-01

    This is a thematic lesson plan for young learners about palm trees and the importance of taking care of them. The two part lesson teaches listening, reading and speaking skills. The lesson includes parts of a tree; the modal auxiliary, can; dialogues and a role play activity.

  1. Analysis of Chi-square Automatic Interaction Detection (CHAID) and Classification and Regression Tree (CRT) for Classification of Corn Production

    Science.gov (United States)

    Susanti, Yuliana; Zukhronah, Etik; Pratiwi, Hasih; Respatiwulan; Sri Sulistijowati, H.

    2017-11-01

    To achieve food resilience in Indonesia, food diversification by exploring potentials of local food is required. Corn is one of alternating staple food of Javanese society. For that reason, corn production needs to be improved by considering the influencing factors. CHAID and CRT are methods of data mining which can be used to classify the influencing variables. The present study seeks to dig up information on the potentials of local food availability of corn in regencies and cities in Java Island. CHAID analysis yields four classifications with accuracy of 78.8%, while CRT analysis yields seven classifications with accuracy of 79.6%.

  2. Mastectomy or breast conserving surgery? Factors affecting type of surgical treatment for breast cancer – a classification tree approach

    International Nuclear Information System (INIS)

    Martin, Michael A; Meyricke, Ramona; O'Neill, Terry; Roberts, Steven

    2006-01-01

    A critical choice facing breast cancer patients is which surgical treatment – mastectomy or breast conserving surgery (BCS) – is most appropriate. Several studies have investigated factors that impact the type of surgery chosen, identifying features such as place of residence, age at diagnosis, tumor size, socio-economic and racial/ethnic elements as relevant. Such assessment of 'propensity' is important in understanding issues such as a reported under-utilisation of BCS among women for whom such treatment was not contraindicated. Using Western Australian (WA) data, we further examine the factors associated with the type of surgical treatment for breast cancer using a classification tree approach. This approach deals naturally with complicated interactions between factors, and so allows flexible and interpretable models for treatment choice to be built that add to the current understanding of this complex decision process. Data was extracted from the WA Cancer Registry on women diagnosed with breast cancer in WA from 1990 to 2000. Subjects' treatment preferences were predicted from covariates using both classification trees and logistic regression. Tumor size was the primary determinant of patient choice, subjects with tumors smaller than 20 mm in diameter preferring BCS. For subjects with tumors greater than 20 mm in diameter factors such as patient age, nodal status, and tumor histology become relevant as predictors of patient choice. Classification trees perform as well as logistic regression for predicting patient choice, but are much easier to interpret for clinical use. The selected tree can inform clinicians' advice to patients

  3. Identification of pests and diseases of Dalbergia hainanensis based on EVI time series and classification of decision tree

    Science.gov (United States)

    Luo, Qiu; Xin, Wu; Qiming, Xiong

    2017-06-01

    In the process of vegetation remote sensing information extraction, the problem of phenological features and low performance of remote sensing analysis algorithm is not considered. To solve this problem, the method of remote sensing vegetation information based on EVI time-series and the classification of decision-tree of multi-source branch similarity is promoted. Firstly, to improve the time-series stability of recognition accuracy, the seasonal feature of vegetation is extracted based on the fitting span range of time-series. Secondly, the decision-tree similarity is distinguished by adaptive selection path or probability parameter of component prediction. As an index, it is to evaluate the degree of task association, decide whether to perform migration of multi-source decision tree, and ensure the speed of migration. Finally, the accuracy of classification and recognition of pests and diseases can reach 87%--98% of commercial forest in Dalbergia hainanensis, which is significantly better than that of MODIS coverage accuracy of 80%--96% in this area. Therefore, the validity of the proposed method can be verified.

  4. Classification and regression tree (CART model to predict pulmonary tuberculosis in hospitalized patients

    Directory of Open Access Journals (Sweden)

    Aguiar Fabio S

    2012-08-01

    Full Text Available Abstract Background Tuberculosis (TB remains a public health issue worldwide. The lack of specific clinical symptoms to diagnose TB makes the correct decision to admit patients to respiratory isolation a difficult task for the clinician. Isolation of patients without the disease is common and increases health costs. Decision models for the diagnosis of TB in patients attending hospitals can increase the quality of care and decrease costs, without the risk of hospital transmission. We present a predictive model for predicting pulmonary TB in hospitalized patients in a high prevalence area in order to contribute to a more rational use of isolation rooms without increasing the risk of transmission. Methods Cross sectional study of patients admitted to CFFH from March 2003 to December 2004. A classification and regression tree (CART model was generated and validated. The area under the ROC curve (AUC, sensitivity, specificity, positive and negative predictive values were used to evaluate the performance of model. Validation of the model was performed with a different sample of patients admitted to the same hospital from January to December 2005. Results We studied 290 patients admitted with clinical suspicion of TB. Diagnosis was confirmed in 26.5% of them. Pulmonary TB was present in 83.7% of the patients with TB (62.3% with positive sputum smear and HIV/AIDS was present in 56.9% of patients. The validated CART model showed sensitivity, specificity, positive predictive value and negative predictive value of 60.00%, 76.16%, 33.33%, and 90.55%, respectively. The AUC was 79.70%. Conclusions The CART model developed for these hospitalized patients with clinical suspicion of TB had fair to good predictive performance for pulmonary TB. The most important variable for prediction of TB diagnosis was chest radiograph results. Prospective validation is still necessary, but our model offer an alternative for decision making in whether to isolate patients with

  5. Comparative analysis of tree classification models for detecting fusarium oxysporum f. sp cubense (TR4) based on multi soil sensor parameters

    Science.gov (United States)

    Estuar, Maria Regina Justina; Victorino, John Noel; Coronel, Andrei; Co, Jerelyn; Tiausas, Francis; Señires, Chiara Veronica

    2017-09-01

    Use of wireless sensor networks and smartphone integration design to monitor environmental parameters surrounding plantations is made possible because of readily available and affordable sensors. Providing low cost monitoring devices would be beneficial, especially to small farm owners, in a developing country like the Philippines, where agriculture covers a significant amount of the labor market. This study discusses the integration of wireless soil sensor devices and smartphones to create an application that will use multidimensional analysis to detect the presence or absence of plant disease. Specifically, soil sensors are designed to collect soil quality parameters in a sink node from which the smartphone collects data from via Bluetooth. Given these, there is a need to develop a classification model on the mobile phone that will report infection status of a soil. Though tree classification is the most appropriate approach for continuous parameter-based datasets, there is a need to determine whether tree models will result to coherent results or not. Soil sensor data that resides on the phone is modeled using several variations of decision tree, namely: decision tree (DT), best-fit (BF) decision tree, functional tree (FT), Naive Bayes (NB) decision tree, J48, J48graft and LAD tree, where decision tree approaches the problem by considering all sensor nodes as one. Results show that there are significant differences among soil sensor parameters indicating that there are variances in scores between the infected and uninfected sites. Furthermore, analysis of variance in accuracy, recall, precision and F1 measure scores from tree classification models homogeneity among NBTree, J48graft and J48 tree classification models.

  6. Building classification trees to explain the radioactive contamination levels of the plants; Construction d'arbres de discrimination pour expliquer les niveaux de contamination radioactive des vegetaux

    Energy Technology Data Exchange (ETDEWEB)

    Briand, B

    2008-04-15

    The objective of this thesis is the development of a method allowing the identification of factors leading to various radioactive contamination levels of the plants. The methodology suggested is based on the use of a radioecological transfer model of the radionuclides through the environment (A.S.T.R.A.L. computer code) and a classification-tree method. Particularly, to avoid the instability problems of classification trees and to preserve the tree structure, a node level stabilizing technique is used. Empirical comparisons are carried out between classification trees built by this method (called R.E.N. method) and those obtained by the C.A.R.T. method. A similarity measure is defined to compare the structure of two classification trees. This measure is used to study the stabilizing performance of the R.E.N. method. The methodology suggested is applied to a simplified contamination scenario. By the results obtained, we can identify the main variables responsible of the various radioactive contamination levels of four leafy-vegetables (lettuce, cabbage, spinach and leek). Some extracted rules from these classification trees can be usable in a post-accidental context. (author)

  7. Construction and application of hierarchical decision tree for classification of ultrasonographic prostate images

    NARCIS (Netherlands)

    Giesen, R. J.; Huynen, A. L.; Aarnink, R. G.; de la Rosette, J. J.; Debruyne, F. M.; Wijkstra, H.

    1996-01-01

    A non-parametric algorithm is described for the construction of a binary decision tree classifier. This tree is used to correlate textural features, computed from ultrasonographic prostate images, with the histopathology of the imaged tissue. The algorithm consists of two parts; growing and pruning.

  8. Multi-pruning of decision trees for knowledge representation and classification

    KAUST Repository

    Azad, Mohammad

    2016-06-09

    We consider two important questions related to decision trees: first how to construct a decision tree with reasonable number of nodes and reasonable number of misclassification, and second how to improve the prediction accuracy of decision trees when they are used as classifiers. We have created a dynamic programming based approach for bi-criteria optimization of decision trees relative to the number of nodes and the number of misclassification. This approach allows us to construct the set of all Pareto optimal points and to derive, for each such point, decision trees with parameters corresponding to that point. Experiments on datasets from UCI ML Repository show that, very often, we can find a suitable Pareto optimal point and derive a decision tree with small number of nodes at the expense of small increment in number of misclassification. Based on the created approach we have proposed a multi-pruning procedure which constructs decision trees that, as classifiers, often outperform decision trees constructed by CART. © 2015 IEEE.

  9. Multi-pruning of decision trees for knowledge representation and classification

    KAUST Repository

    Azad, Mohammad; Chikalov, Igor; Hussain, Shahid; Moshkov, Mikhail

    2016-01-01

    We consider two important questions related to decision trees: first how to construct a decision tree with reasonable number of nodes and reasonable number of misclassification, and second how to improve the prediction accuracy of decision trees when they are used as classifiers. We have created a dynamic programming based approach for bi-criteria optimization of decision trees relative to the number of nodes and the number of misclassification. This approach allows us to construct the set of all Pareto optimal points and to derive, for each such point, decision trees with parameters corresponding to that point. Experiments on datasets from UCI ML Repository show that, very often, we can find a suitable Pareto optimal point and derive a decision tree with small number of nodes at the expense of small increment in number of misclassification. Based on the created approach we have proposed a multi-pruning procedure which constructs decision trees that, as classifiers, often outperform decision trees constructed by CART. © 2015 IEEE.

  10. The importance of chemosensory clues in Aguaruna tree classification and identification.

    Science.gov (United States)

    Jernigan, Kevin A

    2008-05-03

    The ethnobotanical literature still contains few detailed descriptions of the sensory criteria people use for judging membership in taxonomic categories. Olfactory criteria in particular have been explored very little. This paper will describe the importance of odor for woody plant taxonomy and identification among the Aguaruna Jívaro of the northern Peruvian Amazon, focusing on the Aguaruna category númi (trees excluding palms). Aguaruna informants almost always place trees that they consider to have a similar odor together as kumpají - 'companions,' a metaphor they use to describe trees that they consider to be related. The research took place in several Aguaruna communities in the upper Marañón region of the Peruvian Amazon. Structured interview data focus on informant criteria for membership in various folk taxa of trees. Informants were also asked to explain what members of each group of related companions had in common. This paper focuses on odor and taste criteria that came to light during these structured interviews. Botanical voucher specimens were collected, wherever possible. Of the 182 tree folk genera recorded in this study, 51 (28%) were widely considered to possess a distinctive odor. Thirty nine of those (76%) were said to have odors similar to some other tree, while the other 24% had unique odors. Aguaruna informants very rarely described tree odors in non-botanical terms. Taste was used mostly to describe trees with edible fruits. Trees judged to be related were nearly always in the same botanical family. The results of this study illustrate that odor of bark, sap, flowers, fruit and leaves are important clues that help the Aguaruna to judge the relatedness of trees found in their local environment. In contrast, taste appears to play a more limited role. The results suggest a more general ethnobotanical hypothesis that could be tested in other cultural settings: people tend to consider plants with similar odors to be related, but say that

  11. Which sociodemographic factors are important on smoking behaviour of high school students? The contribution of classification and regression tree methodology in a broad epidemiological survey.

    Science.gov (United States)

    Ozge, C; Toros, F; Bayramkaya, E; Camdeviren, H; Sasmaz, T

    2006-08-01

    The purpose of this study is to evaluate the most important sociodemographic factors on smoking status of high school students using a broad randomised epidemiological survey. Using in-class, self administered questionnaire about their sociodemographic variables and smoking behaviour, a representative sample of total 3304 students of preparatory, 9th, 10th, and 11th grades, from 22 randomly selected schools of Mersin, were evaluated and discriminative factors have been determined using appropriate statistics. In addition to binary logistic regression analysis, the study evaluated combined effects of these factors using classification and regression tree methodology, as a new statistical method. The data showed that 38% of the students reported lifetime smoking and 16.9% of them reported current smoking with a male predominancy and increasing prevalence by age. Second hand smoking was reported at a 74.3% frequency with father predominance (56.6%). The significantly important factors that affect current smoking in these age groups were increased by household size, late birth rank, certain school types, low academic performance, increased second hand smoking, and stress (especially reported as separation from a close friend or because of violence at home). Classification and regression tree methodology showed the importance of some neglected sociodemographic factors with a good classification capacity. It was concluded that, as closely related with sociocultural factors, smoking was a common problem in this young population, generating important academic and social burden in youth life and with increasing data about this behaviour and using new statistical methods, effective coping strategies could be composed.

  12. Learning machines and sleeping brains: Automatic sleep stage classification using decision-tree multi-class support vector machines.

    Science.gov (United States)

    Lajnef, Tarek; Chaibi, Sahbi; Ruby, Perrine; Aguera, Pierre-Emmanuel; Eichenlaub, Jean-Baptiste; Samet, Mounir; Kachouri, Abdennaceur; Jerbi, Karim

    2015-07-30

    Sleep staging is a critical step in a range of electrophysiological signal processing pipelines used in clinical routine as well as in sleep research. Although the results currently achievable with automatic sleep staging methods are promising, there is need for improvement, especially given the time-consuming and tedious nature of visual sleep scoring. Here we propose a sleep staging framework that consists of a multi-class support vector machine (SVM) classification based on a decision tree approach. The performance of the method was evaluated using polysomnographic data from 15 subjects (electroencephalogram (EEG), electrooculogram (EOG) and electromyogram (EMG) recordings). The decision tree, or dendrogram, was obtained using a hierarchical clustering technique and a wide range of time and frequency-domain features were extracted. Feature selection was carried out using forward sequential selection and classification was evaluated using k-fold cross-validation. The dendrogram-based SVM (DSVM) achieved mean specificity, sensitivity and overall accuracy of 0.92, 0.74 and 0.88 respectively, compared to expert visual scoring. Restricting DSVM classification to data where both experts' scoring was consistent (76.73% of the data) led to a mean specificity, sensitivity and overall accuracy of 0.94, 0.82 and 0.92 respectively. The DSVM framework outperforms classification with more standard multi-class "one-against-all" SVM and linear-discriminant analysis. The promising results of the proposed methodology suggest that it may be a valuable alternative to existing automatic methods and that it could accelerate visual scoring by providing a robust starting hypnogram that can be further fine-tuned by expert inspection. Copyright © 2015 Elsevier B.V. All rights reserved.

  13. Automated morphological analysis of bone marrow cells in microscopic images for diagnosis of leukemia: nucleus-plasma separation and cell classification using a hierarchical tree model of hematopoesis

    Science.gov (United States)

    Krappe, Sebastian; Wittenberg, Thomas; Haferlach, Torsten; Münzenmayer, Christian

    2016-03-01

    The morphological differentiation of bone marrow is fundamental for the diagnosis of leukemia. Currently, the counting and classification of the different types of bone marrow cells is done manually under the use of bright field microscopy. This is a time-consuming, subjective, tedious and error-prone process. Furthermore, repeated examinations of a slide may yield intra- and inter-observer variances. For that reason a computer assisted diagnosis system for bone marrow differentiation is pursued. In this work we focus (a) on a new method for the separation of nucleus and plasma parts and (b) on a knowledge-based hierarchical tree classifier for the differentiation of bone marrow cells in 16 different classes. Classification trees are easily interpretable and understandable and provide a classification together with an explanation. Using classification trees, expert knowledge (i.e. knowledge about similar classes and cell lines in the tree model of hematopoiesis) is integrated in the structure of the tree. The proposed segmentation method is evaluated with more than 10,000 manually segmented cells. For the evaluation of the proposed hierarchical classifier more than 140,000 automatically segmented bone marrow cells are used. Future automated solutions for the morphological analysis of bone marrow smears could potentially apply such an approach for the pre-classification of bone marrow cells and thereby shortening the examination time.

  14. The importance of chemosensory clues in Aguaruna tree classification and identification

    Directory of Open Access Journals (Sweden)

    Jernigan Kevin A

    2008-05-01

    Full Text Available Abstract Background The ethnobotanical literature still contains few detailed descriptions of the sensory criteria people use for judging membership in taxonomic categories. Olfactory criteria in particular have been explored very little. This paper will describe the importance of odor for woody plant taxonomy and identification among the Aguaruna Jívaro of the northern Peruvian Amazon, focusing on the Aguaruna category númi (trees excluding palms. Aguaruna informants almost always place trees that they consider to have a similar odor together as kumpají – 'companions,' a metaphor they use to describe trees that they consider to be related. Methods The research took place in several Aguaruna communities in the upper Marañón region of the Peruvian Amazon. Structured interview data focus on informant criteria for membership in various folk taxa of trees. Informants were also asked to explain what members of each group of related companions had in common. This paper focuses on odor and taste criteria that came to light during these structured interviews. Botanical voucher specimens were collected, wherever possible. Results Of the 182 tree folk genera recorded in this study, 51 (28% were widely considered to possess a distinctive odor. Thirty nine of those (76% were said to have odors similar to some other tree, while the other 24% had unique odors. Aguaruna informants very rarely described tree odors in non-botanical terms. Taste was used mostly to describe trees with edible fruits. Trees judged to be related were nearly always in the same botanical family. Conclusion The results of this study illustrate that odor of bark, sap, flowers, fruit and leaves are important clues that help the Aguaruna to judge the relatedness of trees found in their local environment. In contrast, taste appears to play a more limited role. The results suggest a more general ethnobotanical hypothesis that could be tested in other cultural settings: people tend to

  15. Computer-assisted detection of colonic polyps with CT colonography using neural networks and binary classification trees

    International Nuclear Information System (INIS)

    Jerebko, Anna K.; Summers, Ronald M.; Malley, James D.; Franaszek, Marek; Johnson, C. Daniel

    2003-01-01

    Detection of colonic polyps in CT colonography is problematic due to complexities of polyp shape and the surface of the normal colon. Published results indicate the feasibility of computer-aided detection of polyps but better classifiers are needed to improve specificity. In this paper we compare the classification results of two approaches: neural networks and recursive binary trees. As our starting point we collect surface geometry information from three-dimensional reconstruction of the colon, followed by a filter based on selected variables such as region density, Gaussian and average curvature and sphericity. The filter returns sites that are candidate polyps, based on earlier work using detection thresholds, to which the neural nets or the binary trees are applied. A data set of 39 polyps from 3 to 25 mm in size was used in our investigation. For both neural net and binary trees we use tenfold cross-validation to better estimate the true error rates. The backpropagation neural net with one hidden layer trained with Levenberg-Marquardt algorithm achieved the best results: sensitivity 90% and specificity 95% with 16 false positives per study

  16. Classification

    Science.gov (United States)

    Clary, Renee; Wandersee, James

    2013-01-01

    In this article, Renee Clary and James Wandersee describe the beginnings of "Classification," which lies at the very heart of science and depends upon pattern recognition. Clary and Wandersee approach patterns by first telling the story of the "Linnaean classification system," introduced by Carl Linnacus (1707-1778), who is…

  17. Remote sensing of aquatic vegetation distribution in Taihu Lake using an improved classification tree with modified thresholds.

    Science.gov (United States)

    Zhao, Dehua; Jiang, Hao; Yang, Tangwu; Cai, Ying; Xu, Delin; An, Shuqing

    2012-03-01

    Classification trees (CT) have been used successfully in the past to classify aquatic vegetation from spectral indices (SI) obtained from remotely-sensed images. However, applying CT models developed for certain image dates to other time periods within the same year or among different years can reduce the classification accuracy. In this study, we developed CT models with modified thresholds using extreme SI values (CT(m)) to improve the stability of the models when applying them to different time periods. A total of 903 ground-truth samples were obtained in September of 2009 and 2010 and classified as emergent, floating-leaf, or submerged vegetation or other cover types. Classification trees were developed for 2009 (Model-09) and 2010 (Model-10) using field samples and a combination of two images from winter and summer. Overall accuracies of these models were 92.8% and 94.9%, respectively, which confirmed the ability of CT analysis to map aquatic vegetation in Taihu Lake. However, Model-10 had only 58.9-71.6% classification accuracy and 31.1-58.3% agreement (i.e., pixels classified the same in the two maps) for aquatic vegetation when it was applied to image pairs from both a different time period in 2010 and a similar time period in 2009. We developed a method to estimate the effects of extrinsic (EF) and intrinsic (IF) factors on model uncertainty using Modis images. Results indicated that 71.1% of the instability in classification between time periods was due to EF, which might include changes in atmospheric conditions, sun-view angle and water quality. The remainder was due to IF, such as phenological and growth status differences between time periods. The modified version of Model-10 (i.e. CT(m)) performed better than traditional CT with different image dates. When applied to 2009 images, the CT(m) version of Model-10 had very similar thresholds and performance as Model-09, with overall accuracies of 92.8% and 90.5% for Model-09 and the CT(m) version of Model

  18. Classification and Progression Based on CFS-GA and C5.0 Boost Decision Tree of TCM Zheng in Chronic Hepatitis B.

    Science.gov (United States)

    Chen, Xiao Yu; Ma, Li Zhuang; Chu, Na; Zhou, Min; Hu, Yiyang

    2013-01-01

    Chronic hepatitis B (CHB) is a serious public health problem, and Traditional Chinese Medicine (TCM) plays an important role in the control and treatment for CHB. In the treatment of TCM, zheng discrimination is the most important step. In this paper, an approach based on CFS-GA (Correlation based Feature Selection and Genetic Algorithm) and C5.0 boost decision tree is used for zheng classification and progression in the TCM treatment of CHB. The CFS-GA performs better than the typical method of CFS. By CFS-GA, the acquired attribute subset is classified by C5.0 boost decision tree for TCM zheng classification of CHB, and C5.0 decision tree outperforms two typical decision trees of NBTree and REPTree on CFS-GA, CFS, and nonselection in comparison. Based on the critical indicators from C5.0 decision tree, important lab indicators in zheng progression are obtained by the method of stepwise discriminant analysis for expressing TCM zhengs in CHB, and alterations of the important indicators are also analyzed in zheng progression. In conclusion, all the three decision trees perform better on CFS-GA than on CFS and nonselection, and C5.0 decision tree outperforms the two typical decision trees both on attribute selection and nonselection.

  19. Visualization of Decision Tree State for the Classification of Parkinson's Disease

    NARCIS (Netherlands)

    Valentijn, E

    2016-01-01

    Decision trees have been shown to be effective at classifying subjects with Parkinson’s disease when provided with features (subject scores) derived from FDG-PET data. Such subject scores have strong discriminative power but are not intuitive to understand. We therefore augment each decision node

  20. Exploratory Use of Decision Tree Analysis in Classification of Outcome in Hypoxic-Ischemic Brain Injury.

    Science.gov (United States)

    Phan, Thanh G; Chen, Jian; Singhal, Shaloo; Ma, Henry; Clissold, Benjamin B; Ly, John; Beare, Richard

    2018-01-01

    Prognostication following hypoxic ischemic encephalopathy (brain injury) is important for clinical management. The aim of this exploratory study is to use a decision tree model to find clinical and MRI associates of severe disability and death in this condition. We evaluate clinical model and then the added value of MRI data. The inclusion criteria were as follows: age ≥17 years, cardio-respiratory arrest, and coma on admission (2003-2011). Decision tree analysis was used to find clinical [Glasgow Coma Score (GCS), features about cardiac arrest, therapeutic hypothermia, age, and sex] and MRI (infarct volume) associates of severe disability and death. We used the area under the ROC (auROC) to determine accuracy of model. There were 41 (63.7% males) patients having MRI imaging with the average age 51.5 ± 18.9 years old. The decision trees showed that infarct volume and age were important factors for discrimination between mild to moderate disability and severe disability and death at day 0 and day 2. The auROC for this model was 0.94 (95% CI 0.82-1.00). At day 7, GCS value was the only predictor; the auROC was 0.96 (95% CI 0.86-1.00). Our findings provide proof of concept for further exploration of the role of MR imaging and decision tree analysis in the early prognostication of hypoxic ischemic brain injury.

  1. Identification, classification and differential expression of oleosin genes in tung tree (Vernicia fordii).

    Science.gov (United States)

    Cao, Heping; Zhang, Lin; Tan, Xiaofeng; Long, Hongxu; Shockey, Jay M

    2014-01-01

    Triacylglycerols (TAG) are the major molecules of energy storage in eukaryotes. TAG are packed in subcellular structures called oil bodies or lipid droplets. Oleosins (OLE) are the major proteins in plant oil bodies. Multiple isoforms of OLE are present in plants such as tung tree (Vernicia fordii), whose seeds are rich in novel TAG with a wide range of industrial applications. The objectives of this study were to identify OLE genes, classify OLE proteins and analyze OLE gene expression in tung trees. We identified five tung tree OLE genes coding for small hydrophobic proteins. Genome-wide phylogenetic analysis and multiple sequence alignment demonstrated that the five tung OLE genes represented the five OLE subfamilies and all contained the "proline knot" motif (PX5SPX3P) shared among 65 OLE from 19 tree species, including the sequenced genomes of Prunus persica (peach), Populus trichocarpa (poplar), Ricinus communis (castor bean), Theobroma cacao (cacao) and Vitis vinifera (grapevine). Tung OLE1, OLE2 and OLE3 belong to the S type and OLE4 and OLE5 belong to the SM type of Arabidopsis OLE. TaqMan and SYBR Green qPCR methods were used to study the differential expression of OLE genes in tung tree tissues. Expression results demonstrated that 1) All five OLE genes were expressed in developing tung seeds, leaves and flowers; 2) OLE mRNA levels were much higher in seeds than leaves or flowers; 3) OLE1, OLE2 and OLE3 genes were expressed in tung seeds at much higher levels than OLE4 and OLE5 genes; 4) OLE mRNA levels rapidly increased during seed development; and 5) OLE gene expression was well-coordinated with tung oil accumulation in the seeds. These results suggest that tung OLE genes 1-3 probably play major roles in tung oil accumulation and/or oil body development. Therefore, they might be preferred targets for tung oil engineering in transgenic plants.

  2. Identification, classification and differential expression of oleosin genes in tung tree (Vernicia fordii.

    Directory of Open Access Journals (Sweden)

    Heping Cao

    Full Text Available Triacylglycerols (TAG are the major molecules of energy storage in eukaryotes. TAG are packed in subcellular structures called oil bodies or lipid droplets. Oleosins (OLE are the major proteins in plant oil bodies. Multiple isoforms of OLE are present in plants such as tung tree (Vernicia fordii, whose seeds are rich in novel TAG with a wide range of industrial applications. The objectives of this study were to identify OLE genes, classify OLE proteins and analyze OLE gene expression in tung trees. We identified five tung tree OLE genes coding for small hydrophobic proteins. Genome-wide phylogenetic analysis and multiple sequence alignment demonstrated that the five tung OLE genes represented the five OLE subfamilies and all contained the "proline knot" motif (PX5SPX3P shared among 65 OLE from 19 tree species, including the sequenced genomes of Prunus persica (peach, Populus trichocarpa (poplar, Ricinus communis (castor bean, Theobroma cacao (cacao and Vitis vinifera (grapevine. Tung OLE1, OLE2 and OLE3 belong to the S type and OLE4 and OLE5 belong to the SM type of Arabidopsis OLE. TaqMan and SYBR Green qPCR methods were used to study the differential expression of OLE genes in tung tree tissues. Expression results demonstrated that 1 All five OLE genes were expressed in developing tung seeds, leaves and flowers; 2 OLE mRNA levels were much higher in seeds than leaves or flowers; 3 OLE1, OLE2 and OLE3 genes were expressed in tung seeds at much higher levels than OLE4 and OLE5 genes; 4 OLE mRNA levels rapidly increased during seed development; and 5 OLE gene expression was well-coordinated with tung oil accumulation in the seeds. These results suggest that tung OLE genes 1-3 probably play major roles in tung oil accumulation and/or oil body development. Therefore, they might be preferred targets for tung oil engineering in transgenic plants.

  3. Classification

    DEFF Research Database (Denmark)

    Hjørland, Birger

    2017-01-01

    This article presents and discusses definitions of the term “classification” and the related concepts “Concept/conceptualization,”“categorization,” “ordering,” “taxonomy” and “typology.” It further presents and discusses theories of classification including the influences of Aristotle...... and Wittgenstein. It presents different views on forming classes, including logical division, numerical taxonomy, historical classification, hermeneutical and pragmatic/critical views. Finally, issues related to artificial versus natural classification and taxonomic monism versus taxonomic pluralism are briefly...

  4. Classification of Learning Styles in Virtual Learning Environment Using J48 Decision Tree

    Science.gov (United States)

    Maaliw, Renato R. III; Ballera, Melvin A.

    2017-01-01

    The usage of data mining has dramatically increased over the past few years and the education sector is leveraging this field in order to analyze and gain intuitive knowledge in terms of the vast accumulated data within its confines. The primary objective of this study is to compare the results of different classification techniques such as Naïve…

  5. Tree mortality based fire severity classification for forest inventories: A Pacific Northwest national forests example

    Science.gov (United States)

    Thomas R. Whittier; Andrew N. Gray

    2016-01-01

    Determining how the frequency, severity, and extent of forest fires are changing in response to changes in management and climate is a key concern in many regions where fire is an important natural disturbance. In the USA the only national-scale fire severity classification uses satellite image changedetection to produce maps for large (>400 ha) fires, and is...

  6. Exploratory Use of Decision Tree Analysis in Classification of Outcome in Hypoxic–Ischemic Brain Injury

    Directory of Open Access Journals (Sweden)

    Thanh G. Phan

    2018-03-01

    Full Text Available BackgroundPrognostication following hypoxic ischemic encephalopathy (brain injury is important for clinical management. The aim of this exploratory study is to use a decision tree model to find clinical and MRI associates of severe disability and death in this condition. We evaluate clinical model and then the added value of MRI data.MethodThe inclusion criteria were as follows: age ≥17 years, cardio-respiratory arrest, and coma on admission (2003–2011. Decision tree analysis was used to find clinical [Glasgow Coma Score (GCS, features about cardiac arrest, therapeutic hypothermia, age, and sex] and MRI (infarct volume associates of severe disability and death. We used the area under the ROC (auROC to determine accuracy of model. There were 41 (63.7% males patients having MRI imaging with the average age 51.5 ± 18.9 years old. The decision trees showed that infarct volume and age were important factors for discrimination between mild to moderate disability and severe disability and death at day 0 and day 2. The auROC for this model was 0.94 (95% CI 0.82–1.00. At day 7, GCS value was the only predictor; the auROC was 0.96 (95% CI 0.86–1.00.ConclusionOur findings provide proof of concept for further exploration of the role of MR imaging and decision tree analysis in the early prognostication of hypoxic ischemic brain injury.

  7. The Iqmulus Urban Showcase: Automatic Tree Classification and Identification in Huge Mobile Mapping Point Clouds

    Science.gov (United States)

    Böhm, J.; Bredif, M.; Gierlinger, T.; Krämer, M.; Lindenberg, R.; Liu, K.; Michel, F.; Sirmacek, B.

    2016-06-01

    Current 3D data capturing as implemented on for example airborne or mobile laser scanning systems is able to efficiently sample the surface of a city by billions of unselective points during one working day. What is still difficult is to extract and visualize meaningful information hidden in these point clouds with the same efficiency. This is where the FP7 IQmulus project enters the scene. IQmulus is an interactive facility for processing and visualizing big spatial data. In this study the potential of IQmulus is demonstrated on a laser mobile mapping point cloud of 1 billion points sampling ~ 10 km of street environment in Toulouse, France. After the data is uploaded to the IQmulus Hadoop Distributed File System, a workflow is defined by the user consisting of retiling the data followed by a PCA driven local dimensionality analysis, which runs efficiently on the IQmulus cloud facility using a Spark implementation. Points scattering in 3 directions are clustered in the tree class, and are separated next into individual trees. Five hours of processing at the 12 node computing cluster results in the automatic identification of 4000+ urban trees. Visualization of the results in the IQmulus fat client helps users to appreciate the results, and developers to identify remaining flaws in the processing workflow.

  8. THE IQMULUS URBAN SHOWCASE: AUTOMATIC TREE CLASSIFICATION AND IDENTIFICATION IN HUGE MOBILE MAPPING POINT CLOUDS

    Directory of Open Access Journals (Sweden)

    J. Böhm

    2016-06-01

    Full Text Available Current 3D data capturing as implemented on for example airborne or mobile laser scanning systems is able to efficiently sample the surface of a city by billions of unselective points during one working day. What is still difficult is to extract and visualize meaningful information hidden in these point clouds with the same efficiency. This is where the FP7 IQmulus project enters the scene. IQmulus is an interactive facility for processing and visualizing big spatial data. In this study the potential of IQmulus is demonstrated on a laser mobile mapping point cloud of 1 billion points sampling ~ 10 km of street environment in Toulouse, France. After the data is uploaded to the IQmulus Hadoop Distributed File System, a workflow is defined by the user consisting of retiling the data followed by a PCA driven local dimensionality analysis, which runs efficiently on the IQmulus cloud facility using a Spark implementation. Points scattering in 3 directions are clustered in the tree class, and are separated next into individual trees. Five hours of processing at the 12 node computing cluster results in the automatic identification of 4000+ urban trees. Visualization of the results in the IQmulus fat client helps users to appreciate the results, and developers to identify remaining flaws in the processing workflow.

  9. Combining logistic regression with classification and regression tree to predict quality of care in a home health nursing data set.

    Science.gov (United States)

    Guo, Huey-Ming; Shyu, Yea-Ing Lotus; Chang, Her-Kun

    2006-01-01

    In this article, the authors provide an overview of a research method to predict quality of care in home health nursing data set. The results of this study can be visualized through classification an regression tree (CART) graphs. The analysis was more effective, and the results were more informative since the home health nursing dataset was analyzed with a combination of the logistic regression and CART, these two techniques complete each other. And the results more informative that more patients' characters were related to quality of care in home care. The results contributed to home health nurse predict patient outcome in case management. Improved prediction is needed for interventions to be appropriately targeted for improved patient outcome and quality of care.

  10. [Analysis of dietary pattern and diabetes mellitus influencing factors identified by classification tree model in adults of Fujian].

    Science.gov (United States)

    Yu, F L; Ye, Y; Yan, Y S

    2017-05-10

    Objective: To find out the dietary patterns and explore the relationship between environmental factors (especially dietary patterns) and diabetes mellitus in the adults of Fujian. Methods: Multi-stage sampling method were used to survey residents aged ≥18 years by questionnaire, physical examination and laboratory detection in 10 disease surveillance points in Fujian. Factor analysis was used to identify the dietary patterns, while logistic regression model was applied to analyze relationship between dietary patterns and diabetes mellitus, and classification tree model was adopted to identify the influencing factors for diabetes mellitus. Results: There were four dietary patterns in the population, including meat, plant, high-quality protein, and fried food and beverages patterns. The result of logistic analysis showed that plant pattern, which has higher factor loading of fresh fruit-vegetables and cereal-tubers, was a protective factor for non-diabetes mellitus. The risk of diabetes mellitus in the population at T2 and T3 levels of factor score were 0.727 (95 %CI: 0.561-0.943) times and 0.736 (95 %CI : 0.573-0.944) times higher, respectively, than those whose factor score was in lowest quartile. Thirteen influencing factors and eleven group at high-risk for diabetes mellitus were identified by classification tree model. The influencing factors were dyslipidemia, age, family history of diabetes, hypertension, physical activity, career, sex, sedentary time, abdominal adiposity, BMI, marital status, sleep time and high-quality protein pattern. Conclusion: There is a close association between dietary patterns and diabetes mellitus. It is necessary to promote healthy and reasonable diet, strengthen the monitoring and control of blood lipids, blood pressure and body weight, and have good lifestyle for the prevention and control of diabetes mellitus.

  11. Classification of tree species based on longwave hyperspectral data from leaves, a case study for a tropical dry forest

    Science.gov (United States)

    Harrison, D.; Rivard, B.; Sánchez-Azofeifa, A.

    2018-04-01

    Remote sensing of the environment has utilized the visible, near and short-wave infrared (IR) regions of the electromagnetic (EM) spectrum to characterize vegetation health, vigor and distribution. However, relatively little research has focused on the use of the longwave infrared (LWIR, 8.0-12.5 μm) region for studies of vegetation. In this study LWIR leaf reflectance spectra were collected in the wet seasons (May through December) of 2013 and 2014 from twenty-six tree species located in a high species diversity environment, a tropical dry forest in Costa Rica. A continuous wavelet transformation (CWT) was applied to all spectra to minimize noise and broad amplitude variations attributable to non-compositional effects. Species discrimination was then explored with Random Forest classification and accuracy improved was observed with preprocessing of reflectance spectra with continuous wavelet transformation. Species were found to share common spectral features that formed the basis for five spectral types that were corroborated with linear discriminate analysis. The source of most of the observed spectral features is attributed to cell wall or cuticle compounds (cellulose, cutin, matrix glycan, silica and oleanolic acid). Spectral types could be advantageous for the analysis of airborne hyperspectral data because cavity effects will lower the spectral contrast thus increasing the reliance of classification efforts on dominant spectral features. Spectral types specifically derived from leaf level data are expected to support the labeling of spectral classes derived from imagery. The results of this study and that of Ribeiro Da Luz (2006), Ribeiro Da Luz and Crowley (2007, 2010), Ullah et al. (2012) and Rock et al. (2016) have now illustrated success in tree species discrimination across a range of ecosystems using leaf-level spectral observations. With advances in LWIR sensors and concurrent improvements in their signal to noise, applications to large-scale species

  12. Prediction of radiation levels in residences: A methodological comparison of CART [Classification and Regression Tree Analysis] and conventional regression

    International Nuclear Information System (INIS)

    Janssen, I.; Stebbings, J.H.

    1990-01-01

    In environmental epidemiology, trace and toxic substance concentrations frequently have very highly skewed distributions ranging over one or more orders of magnitude, and prediction by conventional regression is often poor. Classification and Regression Tree Analysis (CART) is an alternative in such contexts. To compare the techniques, two Pennsylvania data sets and three independent variables are used: house radon progeny (RnD) and gamma levels as predicted by construction characteristics in 1330 houses; and ∼200 house radon (Rn) measurements as predicted by topographic parameters. CART may identify structural variables of interest not identified by conventional regression, and vice versa, but in general the regression models are similar. CART has major advantages in dealing with other common characteristics of environmental data sets, such as missing values, continuous variables requiring transformations, and large sets of potential independent variables. CART is most useful in the identification and screening of independent variables, greatly reducing the need for cross-tabulations and nested breakdown analyses. There is no need to discard cases with missing values for the independent variables because surrogate variables are intrinsic to CART. The tree-structured approach is also independent of the scale on which the independent variables are measured, so that transformations are unnecessary. CART identifies important interactions as well as main effects. The major advantages of CART appear to be in exploring data. Once the important variables are identified, conventional regressions seem to lead to results similar but more interpretable by most audiences. 12 refs., 8 figs., 10 tabs

  13. Spatial prediction of landslides using a hybrid machine learning approach based on Random Subspace and Classification and Regression Trees

    Science.gov (United States)

    Pham, Binh Thai; Prakash, Indra; Tien Bui, Dieu

    2018-02-01

    A hybrid machine learning approach of Random Subspace (RSS) and Classification And Regression Trees (CART) is proposed to develop a model named RSSCART for spatial prediction of landslides. This model is a combination of the RSS method which is known as an efficient ensemble technique and the CART which is a state of the art classifier. The Luc Yen district of Yen Bai province, a prominent landslide prone area of Viet Nam, was selected for the model development. Performance of the RSSCART model was evaluated through the Receiver Operating Characteristic (ROC) curve, statistical analysis methods, and the Chi Square test. Results were compared with other benchmark landslide models namely Support Vector Machines (SVM), single CART, Naïve Bayes Trees (NBT), and Logistic Regression (LR). In the development of model, ten important landslide affecting factors related with geomorphology, geology and geo-environment were considered namely slope angles, elevation, slope aspect, curvature, lithology, distance to faults, distance to rivers, distance to roads, and rainfall. Performance of the RSSCART model (AUC = 0.841) is the best compared with other popular landslide models namely SVM (0.835), single CART (0.822), NBT (0.821), and LR (0.723). These results indicate that performance of the RSSCART is a promising method for spatial landslide prediction.

  14. Identifying Domain-General and Domain-Specific Predictors of Low Mathematics Performance: A Classification and Regression Tree Analysis

    Directory of Open Access Journals (Sweden)

    David J. Purpura

    2017-12-01

    Full Text Available Many children struggle to successfully acquire early mathematics skills. Theoretical and empirical evidence has pointed to deficits in domain-specific skills (e.g., non-symbolic mathematics skills or domain-general skills (e.g., executive functioning and language as underlying low mathematical performance. In the current study, we assessed a sample of 113 three- to five-year old preschool children on a battery of domain-specific and domain-general factors in the fall and spring of their preschool year to identify Time 1 (fall factors associated with low performance in mathematics knowledge at Time 2 (spring. We used the exploratory approach of classification and regression tree analyses, a strategy that uses step-wise partitioning to create subgroups from a larger sample using multiple predictors, to identify the factors that were the strongest classifiers of low performance for younger and older preschool children. Results indicated that the most consistent classifier of low mathematics performance at Time 2 was children’s Time 1 mathematical language skills. Further, other distinct classifiers of low performance emerged for younger and older children. These findings suggest that risk classification for low mathematics performance may differ depending on children’s age.

  15. Decision-Tree, Rule-Based, and Random Forest Classification of High-Resolution Multispectral Imagery for Wetland Mapping and Inventory

    Science.gov (United States)

    Efforts are increasingly being made to classify the world’s wetland resources, an important ecosystem and habitat that is diminishing in abundance. There are multiple remote sensing classification methods, including a suite of nonparametric classifiers such as decision-tree...

  16. Tree Biomass Allocation and Its Model Additivity for Casuarina equisetifolia in a Tropical Forest of Hainan Island, China.

    Science.gov (United States)

    Xue, Yang; Yang, Zhongyang; Wang, Xiaoyan; Lin, Zhipan; Li, Dunxi; Su, Shaofeng

    2016-01-01

    Casuarina equisetifolia is commonly planted and used in the construction of coastal shelterbelt protection in Hainan Island. Thus, it is critical to accurately estimate the tree biomass of Casuarina equisetifolia L. for forest managers to evaluate the biomass stock in Hainan. The data for this work consisted of 72 trees, which were divided into three age groups: young forest, middle-aged forest, and mature forest. The proportion of biomass from the trunk significantly increased with age (Pbiomass of the branch and leaf decreased, and the biomass of the root did not change. To test whether the crown radius (CR) can improve biomass estimates of C. equisetifolia, we introduced CR into the biomass models. Here, six models were used to estimate the biomass of each component, including the trunk, the branch, the leaf, and the root. In each group, we selected one model among these six models for each component. The results showed that including the CR greatly improved the model performance and reduced the error, especially for the young and mature forests. In addition, to ensure biomass additivity, the selected equation for each component was fitted as a system of equations using seemingly unrelated regression (SUR). The SUR method not only gave efficient and accurate estimates but also achieved the logical additivity. The results in this study provide a robust estimation of tree biomass components and total biomass over three groups of C. equisetifolia.

  17. Tree Biomass Allocation and Its Model Additivity for Casuarina equisetifolia in a Tropical Forest of Hainan Island, China

    Science.gov (United States)

    Xue, Yang; Yang, Zhongyang; Wang, Xiaoyan; Lin, Zhipan; Li, Dunxi; Su, Shaofeng

    2016-01-01

    Casuarina equisetifolia is commonly planted and used in the construction of coastal shelterbelt protection in Hainan Island. Thus, it is critical to accurately estimate the tree biomass of Casuarina equisetifolia L. for forest managers to evaluate the biomass stock in Hainan. The data for this work consisted of 72 trees, which were divided into three age groups: young forest, middle-aged forest, and mature forest. The proportion of biomass from the trunk significantly increased with age (Pbiomass of the branch and leaf decreased, and the biomass of the root did not change. To test whether the crown radius (CR) can improve biomass estimates of C. equisetifolia, we introduced CR into the biomass models. Here, six models were used to estimate the biomass of each component, including the trunk, the branch, the leaf, and the root. In each group, we selected one model among these six models for each component. The results showed that including the CR greatly improved the model performance and reduced the error, especially for the young and mature forests. In addition, to ensure biomass additivity, the selected equation for each component was fitted as a system of equations using seemingly unrelated regression (SUR). The SUR method not only gave efficient and accurate estimates but also achieved the logical additivity. The results in this study provide a robust estimation of tree biomass components and total biomass over three groups of C. equisetifolia. PMID:27002822

  18. Characterization of Escherichia coli isolates from different fecal sources by means of classification tree analysis of fatty acid methyl ester (FAME) profiles.

    Science.gov (United States)

    Seurinck, Sylvie; Deschepper, Ellen; Deboch, Bishaw; Verstraete, Willy; Siciliano, Steven

    2006-03-01

    Microbial source tracking (MST) methods need to be rapid, inexpensive and accurate. Unfortunately, many MST methods provide a wealth of information that is difficult to interpret by the regulators who use this information to make decisions. This paper describes the use of classification tree analysis to interpret the results of a MST method based on fatty acid methyl ester (FAME) profiles of Escherichia coli isolates, and to present results in a format readily interpretable by water quality managers. Raw sewage E. coli isolates and animal E. coli isolates from cow, dog, gull, and horse were isolated and their FAME profiles collected. Correct classification rates determined with leaveone-out cross-validation resulted in an overall low correct classification rate of 61%. A higher overall correct classification rate of 85% was obtained when the animal isolates were pooled together and compared to the raw sewage isolates. Bootstrap aggregation or adaptive resampling and combining of the FAME profile data increased correct classification rates substantially. Other MST methods may be better suited to differentiate between different fecal sources but classification tree analysis has enabled us to distinguish raw sewage from animal E. coli isolates, which previously had not been possible with other multivariate methods such as principal component analysis and cluster analysis.

  19. AERIAL IMAGES FROM AN UAV SYSTEM: 3D MODELING AND TREE SPECIES CLASSIFICATION IN A PARK AREA

    Directory of Open Access Journals (Sweden)

    R. Gini

    2012-07-01

    Full Text Available The use of aerial imagery acquired by Unmanned Aerial Vehicles (UAVs is scheduled within the FoGLIE project (Fruition of Goods Landscape in Interactive Environment: it starts from the need to enhance the natural, artistic and cultural heritage, to produce a better usability of it by employing audiovisual movable systems of 3D reconstruction and to improve monitoring procedures, by using new media for integrating the fruition phase with the preservation ones. The pilot project focus on a test area, Parco Adda Nord, which encloses various goods' types (small buildings, agricultural fields and different tree species and bushes. Multispectral high resolution images were taken by two digital compact cameras: a Pentax Optio A40 for RGB photos and a Sigma DP1 modified to acquire the NIR band. Then, some tests were performed in order to analyze the UAV images' quality with both photogrammetric and photo-interpretation purposes, to validate the vector-sensor system, the image block geometry and to study the feasibility of tree species classification. Many pre-signalized Control Points were surveyed through GPS to allow accuracy analysis. Aerial Triangulations (ATs were carried out with photogrammetric commercial software, Leica Photogrammetry Suite (LPS and PhotoModeler, with manual or automatic selection of Tie Points, to pick out pros and cons of each package in managing non conventional aerial imagery as well as the differences in the modeling approach. Further analysis were done on the differences between the EO parameters and the corresponding data coming from the on board UAV navigation system.

  20. Biodiversity among Lactobacillus helveticus Strains Isolated from Different Natural Whey Starter Cultures as Revealed by Classification Trees

    Science.gov (United States)

    Gatti, Monica; Trivisano, Carlo; Fabrizi, Enrico; Neviani, Erasmo; Gardini, Fausto

    2004-01-01

    Lactobacillus helveticus is a homofermentative thermophilic lactic acid bacterium used extensively for manufacturing Swiss type and aged Italian cheese. In this study, the phenotypic and genotypic diversity of strains isolated from different natural dairy starter cultures used for Grana Padano, Parmigiano Reggiano, and Provolone cheeses was investigated by a classification tree technique. A data set was used that consists of 119 L. helveticus strains, each of which was studied for its physiological characters, as well as surface protein profiles and hybridization with a species-specific DNA probe. The methodology employed in this work allowed the strains to be grouped into terminal nodes without difficult and subjective interpretation. In particular, good discrimination was obtained between L. helveticus strains isolated, respectively, from Grana Padano and from Provolone natural whey starter cultures. The method used in this work allowed identification of the main characteristics that permit discrimination of biotypes. In order to understand what kind of genes could code for phenotypes of technological relevance, evidence that specific DNA sequences are present only in particular biotypes may be of great interest. PMID:14711641

  1. Identifying the critical success factors in the coverage of low vision services using the classification analysis and regression tree methodology.

    Science.gov (United States)

    Chiang, Peggy Pei-Chia; Xie, Jing; Keeffe, Jill Elizabeth

    2011-04-25

    To identify the critical success factors (CSF) associated with coverage of low vision services. Data were collected from a survey distributed to Vision 2020 contacts, government, and non-government organizations (NGOs) in 195 countries. The Classification and Regression Tree Analysis (CART) was used to identify the critical success factors of low vision service coverage. Independent variables were sourced from the survey: policies, epidemiology, provision of services, equipment and infrastructure, barriers to services, human resources, and monitoring and evaluation. Socioeconomic and demographic independent variables: health expenditure, population statistics, development status, and human resources in general, were sourced from the World Health Organization (WHO), World Bank, and the United Nations (UN). The findings identified that having >50% of children obtaining devices when prescribed (χ(2) = 44; P 3 rehabilitation workers per 10 million of population (χ(2) = 4.50; P = 0.034), higher percentage of population urbanized (χ(2) = 14.54; P = 0.002), a level of private investment (χ(2) = 14.55; P = 0.015), and being fully funded by government (χ(2) = 6.02; P = 0.014), are critical success factors associated with coverage of low vision services. This study identified the most important predictors for countries with better low vision coverage. The CART is a useful and suitable methodology in survey research and is a novel way to simplify a complex global public health issue in eye care.

  2. Rapid Erosion Modeling in a Western Kenya Watershed using Visible Near Infrared Reflectance, Classification Tree Analysis and 137Cesium.

    Science.gov (United States)

    deGraffenried, Jeff B; Shepherd, Keith D

    2009-12-15

    Human induced soil erosion has severe economic and environmental impacts throughout the world. It is more severe in the tropics than elsewhere and results in diminished food production and security. Kenya has limited arable land and 30 percent of the country experiences severe to very severe human induced soil degradation. The purpose of this research was to test visible near infrared diffuse reflectance spectroscopy (VNIR) as a tool for rapid assessment and benchmarking of soil condition and erosion severity class. The study was conducted in the Saiwa River watershed in the northern Rift Valley Province of western Kenya, a tropical highland area. Soil 137 Cs concentration was measured to validate spectrally derived erosion classes and establish the background levels for difference land use types. Results indicate VNIR could be used to accurately evaluate a large and diverse soil data set and predict soil erosion characteristics. Soil condition was spectrally assessed and modeled. Analysis of mean raw spectra indicated significant reflectance differences between soil erosion classes. The largest differences occurred between 1,350 and 1,950 nm with the largest separation occurring at 1,920 nm. Classification and Regression Tree (CART) analysis indicated that the spectral model had practical predictive success (72%) with Receiver Operating Characteristic (ROC) of 0.74. The change in 137 Cs concentrations supported the premise that VNIR is an effective tool for rapid screening of soil erosion condition.

  3. Classification trees for identifying non-use of community-based long-term care services among older adults.

    Science.gov (United States)

    Penkunas, Michael James; Eom, Kirsten Yuna; Chan, Angelique Wei-Ming

    2017-10-01

    Home- and center-based long-term care (LTC) services allow older adults to remain in the community while simultaneously helping caregivers cope with the stresses associated with providing care. Despite these benefits, the uptake of community-based LTC services among older adults remains low. We analyzed data from a longitudinal study in Singapore to identify the characteristics of individuals with referrals to home-based LTC services or day rehabilitation services at the time of hospital discharge. Classification and regression tree analysis was employed to identify combinations of clinical and sociodemographic characteristics of patients and their caregivers for individuals who did not take up their referred services. Patients' level of limitation in activities of daily living (ADL) and caregivers' ethnicity and educational level were the most distinguishing characteristics for identifying older adults who failed to take up their referred home-based services. For day rehabilitation services, patients' level of ADL limitation, home size, age, and possession of a national medical savings account, as well as caregivers' education level, and gender were significant factors influencing service uptake. Identifying subgroups of patients with high rates of non-use can help clinicians target individuals who are need of community-based LTC services but unlikely to engage in formal treatment. Copyright © 2017 Elsevier B.V. All rights reserved.

  4. Mapping trees outside forests using high-resolution aerial imagery: a comparison of pixel- and object based classification approaches

    Science.gov (United States)

    Dacia M. Meneguzzo; Greg C. Liknes; Mark D. Nelson

    2013-01-01

    Discrete trees and small groups of trees in nonforest settings are considered an essential resource around the world and are collectively referred to as trees outside forests (ToF). ToF provide important functions across the landscape, such as protecting soil and water resources, providing wildlife habitat, and improving farmstead energy efficiency and aesthetics....

  5. Classification of suppressor additives based on synergistic and antagonistic ensemble effects

    Energy Technology Data Exchange (ETDEWEB)

    Broekmann, P., E-mail: peter.broekmann@iac.unibe.ch [BASF SE, Global Business Unit Electronic Materials, 67056 Ludwigshafen (Germany); Department of Chemistry and Biochemistry, University of Bern, Bern (Switzerland); Fluegel, A.; Emnet, C.; Arnold, M.; Roeger-Goepfert, C.; Wagner, A. [BASF SE, Global Business Unit Electronic Materials, 67056 Ludwigshafen (Germany); Hai, N.T.M. [Department of Chemistry and Biochemistry, University of Bern, Bern (Switzerland); Mayer, D. [BASF SE, Global Business Unit Electronic Materials, 67056 Ludwigshafen (Germany)

    2011-05-01

    Highlights: > Three fundamental types of suppressor additives for copper electroplating could be identified by means of potential transient measurements. > These suppressor additives differ in their synergistic and antagonistic interplay with anions that are chemisorbed on the metallic copper surface during electrodeposition. > In addition these suppressor chemistries reveal different barrier properties with respect to cupric ions and plating additives (Cl, SPS). - Abstract: Three fundamental types of suppressor additives for copper electroplating could be identified by means of potential transient measurements. These suppressor additives differ in their synergistic and antagonistic interplay with anions that are chemisorbed on the metallic copper surface during electrodeposition. In addition these suppressor chemistries reveal different barrier properties with respect to cupric ions and plating additives (Cl, SPS). While the type-I suppressor selectively forms efficient barriers for copper inter-diffusion on chloride-terminated electrode surfaces we identified a type-II suppressor that interacts non-selectively with any kind of anions chemisorbed on copper (chloride, sulfate, sulfonate). Type-I suppressors are vital for the superconformal copper growth mode in Damascene processing and show an antagonistic interaction with SPS (Bis-Sodium-Sulfopropyl-Disulfide) which involves the deactivation of this suppressor chemistry. This suppressor deactivation is rationalized in terms of compositional changes in the layer of the chemisorbed anions due to the competition of chloride and MPS (Mercaptopropane Sulfonic Acid) for adsorption sites on the metallic copper surface. MPS is the product of the dissociative SPS adsorption within the preexisting chloride matrix on the copper surface. The non-selectivity in the adsorption behavior of the type-II suppressor is rationalized in terms of anion/cation pairing effects of the poly-cationic suppressor and the anion-modified copper

  6. A novel classification and online platform for planning and documentation of medical applications of additive manufacturing.

    Science.gov (United States)

    Tuomi, Jukka; Paloheimo, Kaija-Stiina; Vehviläinen, Juho; Björkstrand, Roy; Salmi, Mika; Huotilainen, Eero; Kontio, Risto; Rouse, Stephen; Gibson, Ian; Mäkitie, Antti A

    2014-12-01

    Additive manufacturing technologies are widely used in industrial settings and now increasingly also in several areas of medicine. Various techniques and numerous types of materials are used for these applications. There is a clear need to unify and harmonize the patterns of their use worldwide. We present a 5-class system to aid planning of these applications and related scientific work as well as communication between various actors involved in this field. An online, matrix-based platform and a database were developed for planning and documentation of various solutions. This platform will help the medical community to structurally develop both research innovations and clinical applications of additive manufacturing. The online platform can be accessed through http://www.medicalam.info. © The Author(s) 2014.

  7. Predicting student satisfaction with courses based on log data from a virtual learning environment – a neural network and classification tree model

    Directory of Open Access Journals (Sweden)

    Ivana Đurđević Babić

    2015-03-01

    Full Text Available Student satisfaction with courses in academic institutions is an important issue and is recognized as a form of support in ensuring effective and quality education, as well as enhancing student course experience. This paper investigates whether there is a connection between student satisfaction with courses and log data on student courses in a virtual learning environment. Furthermore, it explores whether a successful classification model for predicting student satisfaction with course can be developed based on course log data and compares the results obtained from implemented methods. The research was conducted at the Faculty of Education in Osijek and included analysis of log data and course satisfaction on a sample of third and fourth year students. Multilayer Perceptron (MLP with different activation functions and Radial Basis Function (RBF neural networks as well as classification tree models were developed, trained and tested in order to classify students into one of two categories of course satisfaction. Type I and type II errors, and input variable importance were used for model comparison and classification accuracy. The results indicate that a successful classification model using tested methods can be created. The MLP model provides the highest average classification accuracy and the lowest preference in misclassification of students with a low level of course satisfaction, although a t-test for the difference in proportions showed that the difference in performance between the compared models is not statistically significant. Student involvement in forum discussions is recognized as a valuable predictor of student satisfaction with courses in all observed models.

  8. Comprehensive decision tree models in bioinformatics.

    Directory of Open Access Journals (Sweden)

    Gregor Stiglic

    Full Text Available PURPOSE: Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. METHODS: This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by so called one-button data mining approach where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model that is constrained exclusively by the dimensions of the produced decision tree. RESULTS: The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expected significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. CONCLUSIONS: The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm. In addition, our study demonstrates the suitability of visually tuned decision trees for datasets

  9. Comprehensive decision tree models in bioinformatics.

    Science.gov (United States)

    Stiglic, Gregor; Kocbek, Simon; Pernek, Igor; Kokol, Peter

    2012-01-01

    Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by so called one-button data mining approach where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model that is constrained exclusively by the dimensions of the produced decision tree. The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expected significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm. In addition, our study demonstrates the suitability of visually tuned decision trees for datasets with binary class attributes and a high number of possibly

  10. Classification and regression tree (CART) analyses of genomic signatures reveal sets of tetramers that discriminate temperature optima of archaea and bacteria

    Science.gov (United States)

    Dyer, Betsey D.; Kahn, Michael J.; LeBlanc, Mark D.

    2008-01-01

    Classification and regression tree (CART) analysis was applied to genome-wide tetranucleotide frequencies (genomic signatures) of 195 archaea and bacteria. Although genomic signatures have typically been used to classify evolutionary divergence, in this study, convergent evolution was the focus. Temperature optima for most of the organisms examined could be distinguished by CART analyses of tetranucleotide frequencies. This suggests that pervasive (nonlinear) qualities of genomes may reflect certain environmental conditions (such as temperature) in which those genomes evolved. The predominant use of GAGA and AGGA as the discriminating tetramers in CART models suggests that purine-loading and codon biases of thermophiles may explain some of the results. PMID:19054742

  11. A non-parametric, supervised classification of vegetation types on the Kaibab National Forest using decision trees

    Science.gov (United States)

    Suzanne M. Joy; R. M. Reich; Richard T. Reynolds

    2003-01-01

    Traditional land classification techniques for large areas that use Landsat Thematic Mapper (TM) imagery are typically limited to the fixed spatial resolution of the sensors (30m). However, the study of some ecological processes requires land cover classifications at finer spatial resolutions. We model forest vegetation types on the Kaibab National Forest (KNF) in...

  12. Calcium addition at the Hubbard Brook Experimental Forest increases the capacity for stress tolerance and carbon capture in red spruce (Picea rubens) trees during the cold season

    Science.gov (United States)

    Paul G. Schaberg; Rakesh Minocha; Stephanie Long; Joshua M. Halman; Gary J. Hawley; Christopher. Eagar

    2011-01-01

    Red spruce (Picea rubens Sarg.) trees are uniquely vulnerable to foliar freezing injury during the cold season (fall and winter), but are also capable of photosynthetic activity if temperatures moderate. To evaluate the influence of calcium (Ca) addition on the physiology of red spruce during the cold season, we measured concentrations of foliar...

  13. Land cover and forest formation distributions for St. Kitts, Nevis, St. Eustatius, Grenada and Barbados from decision tree classification of cloud-cleared satellite imagery

    Science.gov (United States)

    Helmer, E.H.; Kennaway, T.A.; Pedreros, D.H.; Clark, M.L.; Marcano-Vega, H.; Tieszen, L.L.; Ruzycki, T.R.; Schill, S.R.; Carrington, C.M.S.

    2008-01-01

    Satellite image-based mapping of tropical forests is vital to conservation planning. Standard methods for automated image classification, however, limit classification detail in complex tropical landscapes. In this study, we test an approach to Landsat image interpretation on four islands of the Lesser Antilles, including Grenada and St. Kitts, Nevis and St. Eustatius, testing a more detailed classification than earlier work in the latter three islands. Secondly, we estimate the extents of land cover and protected forest by formation for five islands and ask how land cover has changed over the second half of the 20th century. The image interpretation approach combines image mosaics and ancillary geographic data, classifying the resulting set of raster data with decision tree software. Cloud-free image mosaics for one or two seasons were created by applying regression tree normalization to scene dates that could fill cloudy areas in a base scene. Such mosaics are also known as cloud-filled, cloud-minimized or cloud-cleared imagery, mosaics, or composites. The approach accurately distinguished several classes that more standard methods would confuse; the seamless mosaics aided reference data collection; and the multiseason imagery allowed us to separate drought deciduous forests and woodlands from semi-deciduous ones. Cultivated land areas declined 60 to 100 percent from about 1945 to 2000 on several islands. Meanwhile, forest cover has increased 50 to 950%. This trend will likely continue where sugar cane cultivation has dominated. Like the island of Puerto Rico, most higher-elevation forest formations are protected in formal or informal reserves. Also similarly, lowland forests, which are drier forest types on these islands, are not well represented in reserves. Former cultivated lands in lowland areas could provide lands for new reserves of drier forest types. The land-use history of these islands may provide insight for planners in countries currently considering

  14. Application of classification trees for the qualitative differentiation of focal liver lesions suspicious for metastasis in gadolinium-EOB-DTPA-enhanced liver MR imaging

    Energy Technology Data Exchange (ETDEWEB)

    Schelhorn, J. [Sophien und Hufeland Klinikum, Weimar (Germany). Dept. of Radiology and Nuclear Medicine; Benndorf, M.; Dietzel, M.; Burmeister, H.P.; Kaiser, W.A.; Baltzer, P.A.T. [Jena Univ. (Germany). Inst. of Diagnostic and Interventional Radiology

    2012-09-15

    Purpose: To evaluate the diagnostic accuracy of qualitative descriptors alone and in combination for the classification of focal liver lesions (FLLs) suspicious for metastasis in gadolinium-EOB-DTPA-enhanced liver MR imaging. Materials and Methods: Consecutive patients with clinically suspected liver metastases were eligible for this retrospective investigation. 50 patients met the inclusion criteria. All underwent Gd-EOB-DTPA-enhanced liver MRI (T2w, chemical shift T1w, dynamic T1w). Primary liver malignancies or treated lesions were excluded. All investigations were read by two blinded observers (O1, O2). Both independently identified the presence of lesions and evaluated predefined qualitative lesion descriptors (signal intensities, enhancement pattern and morphology). A reference standard was determined under consideration of all clinical and follow-up information. Statistical analysis besides contingency tables (chi square, kappa statistics) included descriptor combinations using classification trees (CHAID methodology) as well as ROC analysis. Results: In 38 patients, 120 FLLs (52 benign, 68 malignant) were present. 115 (48 benign, 67 malignant) were identified by the observers. The enhancement pattern, relative SI upon T2w and late enhanced T1w images contributed significantly to the differentiation of FLLs. The overall classification accuracy was 91.3 % (O1) and 88.7 % (O2), kappa = 0.902. Conclusion: The combination of qualitative lesion descriptors proposed in this work revealed high diagnostic accuracy and interobserver agreement in the differentiation of focal liver lesions suspicious for metastases using Gd-EOB-DTPA-enhanced liver MRI. (orig.)

  15. Fenologics characteristics of the ‘Siciliano’ lemon tree on two rootstocks influenced by liming and boron addition

    Directory of Open Access Journals (Sweden)

    Hélio Grassi Filho

    2004-09-01

    Full Text Available The current study was developed with disturbed samples of an Oxisol, in which ‘Siciliano’ lemon trees seedlings (C. limon were grafted on sour orange tree (C. aurantium and rangpur lime tree (C. limonia. The experiment consisted of three basis saturation levels (50, 70 and 90 percent and three boron doses (0.5; 1.5 and 4.5 mg dm-3 in the planting with 3x3x2 factorial experimental design with four replications. Mineral composition of the "Siciliano" lemon leaves as well as root system development in sour orange tree were higher than the rangpur lime tree. There was no effect in the interaction basis saturarion level and the boron doses for any of the evaluated parameters.O presente estudo foi desenvolvido na UNESP/Botucatu, São Paulo, Brasil, num solo identificado como Oxisol, onde foram plantadas mudas de limoeiro ‘Siciliano’ (C. limon enxertadas em laranjeira ‘Azeda’ (C. aurantium e em limoeiro ‘Cravo’ (C. limonia. O experimento consistiu em três níveis de saturação por bases (50%, 70% e 90% e três doses de boro (0,5; 1,5 e 4,5 mg dm-3 no plantio em esquema fatorial de 3x3x2, com quatro repetições. Houve diferentes comportamentos entre os porta-enxertos no que se refere à composição mineral de folhas de limoeiro ‘Siciliano’, bem como, no desenvolvimento do sistema radicular, sendo maior na laranjeira azeda em relação ao limoeiro cravo. Não houve nenhum efeito na interação de níveis de saturação por bases e doses de boro para nenhum dos parâmetros avaliados.

  16. An Introduction to Recursive Partitioning: Rationale, Application, and Characteristics of Classification and Regression Trees, Bagging, and Random Forests

    Science.gov (United States)

    Strobl, Carolin; Malley, James; Tutz, Gerhard

    2009-01-01

    Recursive partitioning methods have become popular and widely used tools for nonparametric regression and classification in many scientific fields. Especially random forests, which can deal with large numbers of predictor variables even in the presence of complex interactions, have been applied successfully in genetics, clinical medicine, and…

  17. Diversity of shrub tree layer, leaf litter decomposition and N release in a Brazilian Cerrado under N, P and N plus P additions

    International Nuclear Information System (INIS)

    Khan Baiocchi Jacobson, Tamiel; Cunha Bustamante, Mercedes Maria da; Rodrigues Kozovits, Alessandra

    2011-01-01

    This study investigated changes in diversity of shrub-tree layer, leaf decomposition rates, nutrient release and soil NO fluxes of a Brazilian savanna (cerrado sensu stricto) under N, P and N plus P additions. Simultaneous addition of N and P affected density, dominance, richness and diversity patterns more significantly than addition of N or P separately. Leaf litter decomposition rates increased in P and NP plots but did not differ in N plots in comparison to control plots. N addition increased N mass loss, while the combined addition of N and P resulted in an immobilization of N in leaf litter. Soil NO emissions were also higher when N was applied without P. The results indicate that if the availability of P is not increased proportionally to the availability of N, the losses of N are intensified. - Highlights: → Simultaneous addition of N and P affected richness and diversity of the shrub-tree layer of a Brazilian savanna more significantly than addition of N or P separately. → Leaf litter decomposition rates increased in P and NP plots but did not differ in N plots in comparison to control plots. N addition increased N mass loss, while the combined addition of N and P resulted in an immobilization of N in leaf litter. Soil NO emissions were also higher when N was applied without P. → The results indicated that if increases in N deposition in Cerrado ecosystems are not accompanied by P additions, higher N losses through leaching and gas emissions can occur with other ecosystem impacts. - Shrub-tree diversity and functioning of Brazilian savanna are affected by increasing nutrient availability.

  18. Diversity of shrub tree layer, leaf litter decomposition and N release in a Brazilian Cerrado under N, P and N plus P additions

    Energy Technology Data Exchange (ETDEWEB)

    Khan Baiocchi Jacobson, Tamiel, E-mail: tamiel@unb.br [Departamento de Ecologia, Universidade de Brasilia, Brasilia-DF 70919-970 (Brazil); Cunha Bustamante, Mercedes Maria da, E-mail: mercedes@unb.br [Departamento de Ecologia, Universidade de Brasilia, Brasilia-DF 70919-970 (Brazil); Rodrigues Kozovits, Alessandra, E-mail: kozovits@icep.ufop.br [Departamento de Ecologia, Universidade de Brasilia, Brasilia-DF 70919-970 (Brazil)

    2011-10-15

    This study investigated changes in diversity of shrub-tree layer, leaf decomposition rates, nutrient release and soil NO fluxes of a Brazilian savanna (cerrado sensu stricto) under N, P and N plus P additions. Simultaneous addition of N and P affected density, dominance, richness and diversity patterns more significantly than addition of N or P separately. Leaf litter decomposition rates increased in P and NP plots but did not differ in N plots in comparison to control plots. N addition increased N mass loss, while the combined addition of N and P resulted in an immobilization of N in leaf litter. Soil NO emissions were also higher when N was applied without P. The results indicate that if the availability of P is not increased proportionally to the availability of N, the losses of N are intensified. - Highlights: > Simultaneous addition of N and P affected richness and diversity of the shrub-tree layer of a Brazilian savanna more significantly than addition of N or P separately. > Leaf litter decomposition rates increased in P and NP plots but did not differ in N plots in comparison to control plots. N addition increased N mass loss, while the combined addition of N and P resulted in an immobilization of N in leaf litter. Soil NO emissions were also higher when N was applied without P. > The results indicated that if increases in N deposition in Cerrado ecosystems are not accompanied by P additions, higher N losses through leaching and gas emissions can occur with other ecosystem impacts. - Shrub-tree diversity and functioning of Brazilian savanna are affected by increasing nutrient availability.

  19. Classification of Different Degrees of Disability Following Intracerebral Hemorrhage: A Decision Tree Analysis from VISTA-ICH Collaboration.

    Science.gov (United States)

    Phan, Thanh G; Chen, Jian; Beare, Richard; Ma, Henry; Clissold, Benjamin; Van Ly, John; Srikanth, Velandai

    2017-01-01

    Prognostication following intracerebral hemorrhage (ICH) has focused on poor outcome at the expense of lumping together mild and moderate disability. We aimed to develop a novel approach at classifying a range of disability following ICH. The Virtual International Stroke Trial Archive collaboration database was searched for patients with ICH and known volume of ICH on baseline CT scans. Disability was partitioned into mild [modified Rankin Scale (mRS) at 90 days of 0-2], moderate (mRS = 3-4), and severe disabilities (mRS = 5-6). We used binary and trichotomy decision tree methodology. The data were randomly divided into training (2/3 of data) and validation (1/3 data) datasets. The area under the receiver operating characteristic curve (AUC) was used to calculate the accuracy of the decision tree model. We identified 957 patients, age 65.9 ± 12.3 years, 63.7% males, and ICH volume 22.6 ± 22.1 ml. The binary tree showed that lower ICH volume (27.9 ml), older age (>69.5 years), and low Glasgow Coma Scale (tree showed that ICH volume, age, and serum glucose can separate mild, moderate, and severe disability groups with AUC 0.79 (95% CI 0.71-0.87). Both the binary and trichotomy methods provide equivalent discrimination of disability outcome after ICH. The trichotomy method can classify three categories at once, whereas this action was not possible with the binary method. The trichotomy method may be of use to clinicians and trialists for classifying a range of disability in ICH.

  20. A Methodology In Processing Descriptive Analytics Using MMDA Traffic Update Tweets Tokenization And Classification Tree In Discovering Knowledge

    Directory of Open Access Journals (Sweden)

    Tristan Jay P. Calaguas

    2015-08-01

    Full Text Available Traffic on National Capital Region of the Philippines is going as one of many problems facing by the local government and Filipino citizen who are residing in Metro Manila. In addition a Filipino citizen that is working in Metro Manila is experiencing a waste of Twenty Eight Thousand hours in traffic which results unproductivity. Due to traffic that causes long commutes it take away an individual from exercise activities that results fatigue in their health. In relation with this due to lack of exercise that causing by the traffic each year One Hundred Seventy Thousand Filipinos die from cardiovascular diseases up from Eighty Five Thousand more than Twenty years ago according to 2009 study by the Department of Health DOH. Population increase is one of many causes of traffic in Metro Manila. As population is growing the more car riders and commuters volume will be in the road including delivery trucks Pedi cabs jeeps and provincial buses that signify that there is a high employment rate in the country that causes traffic. However to sustain the public needs MMDA is the government agency that provides public services to Filipino citizens through providing updated public traffic information. For past years MMDA used Telephony lines and Television Broadcasting for traffic information dissemination which is very costly in maintenance that made them to adopt Twitter to post Traffic updates and advisories to the public .Since this government agency uses Twitter in disseminating information through posting tweet there is a need for a methodology on how these tweets will analyze so that citizens will have an insight in decision making to avoid specific time of traffic in metro manila. From this condition the researcher will adopt the use of MMDA tweets as the primary data source and apply the CRISP as the knowledge discovery standard processes that to be used in building methodology for descriptive analytics. In this experimental research several

  1. A right whale pootree: classification trees of faecal hormones identify reproductive states in North Atlantic right whales (Eubalaena glacialis).

    Science.gov (United States)

    Corkeron, Peter; Rolland, Rosalind M; Hunt, Kathleen E; Kraus, Scott D

    2017-01-01

    Immunoassay of hormone metabolites extracted from faecal samples of free-ranging large whales can provide biologically relevant information on reproductive state and stress responses. North Atlantic right whales ( Eubalaena glacialis Müller 1776) are an ideal model for testing the conservation value of faecal metabolites. Almost all North Atlantic right whales are individually identified, most of the population is sighted each year, and systematic survey effort extends back to 1986. North Atlantic right whales number trees as an alternative method of analysing multiple-hormone data sets, building on univariate models that have previously been used to describe hormone profiles of individual North Atlantic right whales of known reproductive state. Our tree correctly classified the age class, sex and reproductive state of 83% of 112 faecal samples from known individual whales. Pregnant females, lactating females and both mature and immature males were classified reliably using our model. Non-reproductive [i.e. 'resting' (not pregnant and not lactating) and immature] females proved the most unreliable to distinguish. There were three individual males that, given their age, would traditionally be considered immature but that our tree classed as mature males, possibly calling for a re-evaluation of their reproductive status. Our analysis reiterates the importance of considering the reproductive state of whales when assessing the relationship between cortisol concentrations and stress. Overall, these results confirm findings from previous univariate statistical analyses, but with a more robust multivariate approach that may prove useful for the multiple-analyte data sets that are increasingly used by conservation physiologists.

  2. Application of classification trees for the qualitative differentiation of focal liver lesions suspicious for metastasis in gadolinium-EOB-DTPA-enhanced liver MR imaging.

    Science.gov (United States)

    Schelhorn, J; Benndorf, M; Dietzel, M; Burmeister, H P; Kaiser, W A; Baltzer, P A T

    2012-09-01

    To evaluate the diagnostic accuracy of qualitative descriptors alone and in combination for the classification of focal liver lesions (FLLs) suspicious for metastasis in gadolinium-EOB-DTPA-enhanced liver MR imaging. Consecutive patients with clinically suspected liver metastases were eligible for this retrospective investigation. 50 patients met the inclusion criteria. All underwent Gd-EOB-DTPA-enhanced liver MRI (T2w, chemical shift T1w, dynamic T1w). Primary liver malignancies or treated lesions were excluded. All investigations were read by two blinded observers (O1, O2). Both independently identified the presence of lesions and evaluated predefined qualitative lesion descriptors (signal intensities, enhancement pattern and morphology). A reference standard was determined under consideration of all clinical and follow-up information. Statistical analysis besides contingency tables (chi square, kappa statistics) included descriptor combinations using classification trees (CHAID methodology) as well as ROC analysis. In 38 patients, 120 FLLs (52 benign, 68 malignant) were present. 115 (48 benign, 67 malignant) were identified by the observers. The enhancement pattern, relative SI upon T2w and late enhanced T1w images contributed significantly to the differentiation of FLLs. The overall classification accuracy was 91.3 % (O1) and 88.7 % (O2), kappa = 0.902. The combination of qualitative lesion descriptors proposed in this work revealed high diagnostic accuracy and interobserver agreement in the differentiation of focal liver lesions suspicious for metastases using Gd-EOB-DTPA-enhanced liver MRI. © Georg Thieme Verlag KG Stuttgart · New York.

  3. Using DRG to analyze hospital production: a re-classification model based on a linear tree-network topology

    Directory of Open Access Journals (Sweden)

    Achille Lanzarini

    2014-09-01

    Full Text Available Background: Hospital discharge records are widely classified through the Diagnosis Related Group (DRG system; the version currently used in Italy counts 538 different codes, including thousands of diagnosis and procedures. These numbers reflect the considerable effort of simplification, yet the current classification system is of little use to evaluate hospital production and performance.Methods: As the case-mix of a given Hospital Unit (HU is driven by its physicians’ specializations, a grouping of DRGs into a specialization-driven classification system has been conceived through the analysis of HUs discharging and the ICD-9-CM codes. We propose a three-folded classification, based on the analysis of 1,670,755 Hospital Discharge Cards (HDCs produced by Lombardy Hospitals in 2010; it consists of 32 specializations (e.g. Neurosurgery, 124 sub-specialization (e.g. skull surgery and 337 sub-sub-specialization (e.g. craniotomy.Results: We give a practical application of the three-layered approach, based on the production of a Neurosurgical HU; we observe synthetically the profile of production (1,305 hospital discharges for 79 different DRG codes of 16 different MDC are grouped in few groups of homogeneous DRG codes, a more informative production comparison (through process-specific comparisons, rather than crude or case-mix standardized comparisons and a potentially more adequate production planning (considering the Neurosurgical HUs of the same city, those produce a limited quote of the whole neurosurgical production, because the same activity can be realized by non-Neurosugical HUs.Conclusion: Our work may help to evaluate the hospital production for a rational planning of available resources, blunting information asymmetries between physicians and managers. 

  4. Exploration and classification of chromatographic fingerprints as additional tool for identification and quality control of several Artemisia species.

    Science.gov (United States)

    Alaerts, Goedele; Pieters, Sigrid; Logie, Hans; Van Erps, Jürgen; Merino-Arévalo, Maria; Dejaegher, Bieke; Smeyers-Verbeke, Johanna; Vander Heyden, Yvan

    2014-07-01

    -analysis technique. Samples of different quality could be indicated on the score plots. No multi-component analysis was required to reach the goal. Furthermore, differences related to the origin of some of the not-certified samples were shown. The importance of the specific herbal part used for its identification was also presented. In addition, no differences were observed among fingerprints of lyophilised or conditioned-air dried samples. Finally, a classification technique, Soft Independent Modelling by Class Analogy (SIMCA), was successfully evaluated as identification technique for unknown samples. Six additional Artemisia species (29 herbal samples) were identified as not belonging to any of the four modelled classes. The developed chromatographic fingerprints and the evaluation of the entire profiles provide an added value to the distinction, identification and quality control of the simultaneously investigated Artemisia species. Copyright © 2014 Elsevier B.V. All rights reserved.

  5. Effects of elevated carbon dioxide and nitrogen addition on foliar stoichiometry of nitrogen and phosphorus of five tree species in subtropical model forest ecosystems

    International Nuclear Information System (INIS)

    Huang Wenjuan; Zhou Guoyi; Liu Juxiu; Zhang Deqiang; Xu Zhihong; Liu Shizhong

    2012-01-01

    The effects of elevated carbon dioxide (CO 2 ) and nitrogen (N) addition on foliar N and phosphorus (P) stoichiometry were investigated in five native tree species (four non-N 2 fixers and one N 2 fixer) in open-top chambers in southern China from 2005 to 2009. The high foliar N:P ratios induced by high foliar N and low foliar P indicate that plants may be more limited by P than by N. The changes in foliar N:P ratios were largely determined by P dynamics rather than N under both elevated CO 2 and N addition. Foliar N:P ratios in the non-N 2 fixers showed some negative responses to elevated CO 2 , while N addition reduced foliar N:P ratios in the N 2 fixer. The results suggest that N addition would facilitate the N 2 fixer rather than the non-N 2 fixers to regulate the stoichiometric balance under elevated CO 2 . - Highlights: ► Five native tree species in southern China were more limited by P than by N. ► Shifts in foliar N:P ratios were driven by P dynamic under the global change. ► N addition lowered foliar N:P ratios in the N 2 fixer under elevated CO 2 . - N addition could facilitate the N 2 fixer rather than the non-N 2 fixers to regulate foliar N and P stoichiometry under elevated CO 2 in subtropical forests.

  6. Simple street tree sampling

    Science.gov (United States)

    David J. Nowak; Jeffrey T. Walton; James Baldwin; Jerry. Bond

    2015-01-01

    Information on street trees is critical for management of this important resource. Sampling of street tree populations provides an efficient means to obtain street tree population information. Long-term repeat measures of street tree samples supply additional information on street tree changes and can be used to report damages from catastrophic events. Analyses of...

  7. A Method for Application of Classification Tree Models to Map Aquatic Vegetation Using Remotely Sensed Images from Different Sensors and Dates

    Directory of Open Access Journals (Sweden)

    Ying Cai

    2012-09-01

    Full Text Available In previous attempts to identify aquatic vegetation from remotely-sensed images using classification trees (CT, the images used to apply CT models to different times or locations necessarily originated from the same satellite sensor as that from which the original images used in model development came, greatly limiting the application of CT. We have developed an effective normalization method to improve the robustness of CT models when applied to images originating from different sensors and dates. A total of 965 ground-truth samples of aquatic vegetation types were obtained in 2009 and 2010 in Taihu Lake, China. Using relevant spectral indices (SI as classifiers, we manually developed a stable CT model structure and then applied a standard CT algorithm to obtain quantitative (optimal thresholds from 2009 ground-truth data and images from Landsat7-ETM+, HJ-1B-CCD, Landsat5-TM and ALOS-AVNIR-2 sensors. Optimal CT thresholds produced average classification accuracies of 78.1%, 84.7% and 74.0% for emergent vegetation, floating-leaf vegetation and submerged vegetation, respectively. However, the optimal CT thresholds for different sensor images differed from each other, with an average relative variation (RV of 6.40%. We developed and evaluated three new approaches to normalizing the images. The best-performing method (Method of 0.1% index scaling normalized the SI images using tailored percentages of extreme pixel values. Using the images normalized by Method of 0.1% index scaling, CT models for a particular sensor in which thresholds were replaced by those from the models developed for images originating from other sensors provided average classification accuracies of 76.0%, 82.8% and 68.9% for emergent vegetation, floating-leaf vegetation and submerged vegetation, respectively. Applying the CT models developed for normalized 2009 images to 2010 images resulted in high classification (78.0%–93.3% and overall (92.0%–93.1% accuracies. Our

  8. Prediction of healthy blood with data mining classification by using Decision Tree, Naive Baysian and SVM approaches

    Science.gov (United States)

    Khalilinezhad, Mahdieh; Minaei, Behrooz; Vernazza, Gianni; Dellepiane, Silvana

    2015-03-01

    Data mining (DM) is the process of discovery knowledge from large databases. Applications of data mining in Blood Transfusion Organizations could be useful for improving the performance of blood donation service. The aim of this research is the prediction of healthiness of blood donors in Blood Transfusion Organization (BTO). For this goal, three famous algorithms such as Decision Tree C4.5, Naïve Bayesian classifier, and Support Vector Machine have been chosen and applied to a real database made of 11006 donors. Seven fields such as sex, age, job, education, marital status, type of donor, results of blood tests (doctors' comments and lab results about healthy or unhealthy blood donors) have been selected as input to these algorithms. The results of the three algorithms have been compared and an error cost analysis has been performed. According to this research and the obtained results, the best algorithm with low error cost and high accuracy is SVM. This research helps BTO to realize a model from blood donors in each area in order to predict the healthy blood or unhealthy blood of donors. This research could be useful if used in parallel with laboratory tests to better separate unhealthy blood.

  9. Photosynthetic and growth response of sugar maple (Acer saccharum Marsh.) mature trees and seedlings to calcium, magnesium, and nitrogen additions in the Catskill Mountains, NY, USA

    Science.gov (United States)

    Momen, Bahram; Behling, Shawna J; Lawrence, Gregory B.; Sullivan, Joseph H

    2015-01-01

    Decline of sugar maple in North American forests has been attributed to changes in soil calcium (Ca) and nitrogen (N) by acidic precipitation. Although N is an essential and usually a limiting factor in forests, atmospheric N deposition may cause N-saturation leading to loss of soil Ca. Such changes can affect carbon gain and growth of sugar maple trees and seedlings. We applied a 22 factorial arrangement of N and dolomitic limestone containing Ca and Magnesium (Mg) to 12 forest plots in the Catskill Mountain region of NY, USA. To quantify the short-term effects, we measured photosynthetic-light responses of sugar maple mature trees and seedlings two or three times during two summers. We estimated maximum net photosynthesis (An-max) and its related light intensity (PAR at An-max), apparent quantum efficiency (Aqe), and light compensation point (LCP). To quantify the long-term effects, we measured basal area of living mature trees before and 4 and 8 years after treatment applications. Soil and foliar chemistry variables were also measured. Dolomitic limestone increased Ca, Mg, and pH in the soil Oe horizon. Mg was increased in the B horizon when comparing the plots receiving N with those receiving CaMg. In mature trees, foliar Ca and Mg concentrations were higher in the CaMg and N+CaMg plots than in the reference or N plots; foliar Ca concentration was higher in the N+CaMg plots compared with the CaMg plots, foliar Mg was higher in the CaMg plots than the N+CaMg plots; An-max was maximized due to N+CaMg treatment; Aqe decreased by N addition; and PAR at An-max increased by N or CaMg treatments alone, but the increase was maximized by their combination. No treatment effect was detected on basal areas of living mature trees four or eight years after treatment applications. In seedlings, An-max was increased by N+CaMg addition. The reference plots had an open herbaceous layer, but the plots receiving N had a dense monoculture of common woodfern in the

  10. Photosynthetic and Growth Response of Sugar Maple (Acer saccharum Marsh.) Mature Trees and Seedlings to Calcium, Magnesium, and Nitrogen Additions in the Catskill Mountains, NY, USA.

    Science.gov (United States)

    Momen, Bahram; Behling, Shawna J; Lawrence, Greg B; Sullivan, Joseph H

    2015-01-01

    Decline of sugar maple in North American forests has been attributed to changes in soil calcium (Ca) and nitrogen (N) by acidic precipitation. Although N is an essential and usually a limiting factor in forests, atmospheric N deposition may cause N-saturation leading to loss of soil Ca. Such changes can affect carbon gain and growth of sugar maple trees and seedlings. We applied a 22 factorial arrangement of N and dolomitic limestone containing Ca and Magnesium (Mg) to 12 forest plots in the Catskill Mountain region of NY, USA. To quantify the short-term effects, we measured photosynthetic-light responses of sugar maple mature trees and seedlings two or three times during two summers. We estimated maximum net photosynthesis (An-max) and its related light intensity (PAR at An-max), apparent quantum efficiency (Aqe), and light compensation point (LCP). To quantify the long-term effects, we measured basal area of living mature trees before and 4 and 8 years after treatment applications. Soil and foliar chemistry variables were also measured. Dolomitic limestone increased Ca, Mg, and pH in the soil Oe horizon. Mg was increased in the B horizon when comparing the plots receiving N with those receiving CaMg. In mature trees, foliar Ca and Mg concentrations were higher in the CaMg and N+CaMg plots than in the reference or N plots; foliar Ca concentration was higher in the N+CaMg plots compared with the CaMg plots, foliar Mg was higher in the CaMg plots than the N+CaMg plots; An-max was maximized due to N+CaMg treatment; Aqe decreased by N addition; and PAR at An-max increased by N or CaMg treatments alone, but the increase was maximized by their combination. No treatment effect was detected on basal areas of living mature trees four or eight years after treatment applications. In seedlings, An-max was increased by N+CaMg addition. The reference plots had an open herbaceous layer, but the plots receiving N had a dense monoculture of common woodfern in the forest floor

  11. Modeling Forest Structural Parameters in the Mediterranean Pines of Central Spain using QuickBird-2 Imagery and Classification and Regression Tree Analysis (CART

    Directory of Open Access Journals (Sweden)

    José A. Delgado

    2012-01-01

    Full Text Available Forest structural parameters such as quadratic mean diameter, basal area, and number of trees per unit area are important for the assessment of wood volume and biomass and represent key forest inventory attributes. Forest inventory information is required to support sustainable management, carbon accounting, and policy development activities. Digital image processing of remotely sensed imagery is increasingly utilized to assist traditional, more manual, methods in the estimation of forest structural attributes over extensive areas, also enabling evaluation of change over time. Empirical attribute estimation with remotely sensed data is frequently employed, yet with known limitations, especially over complex environments such as Mediterranean forests. In this study, the capacity of high spatial resolution (HSR imagery and related techniques to model structural parameters at the stand level (n = 490 in Mediterranean pines in Central Spain is tested using data from the commercial satellite QuickBird-2. Spectral and spatial information derived from multispectral and panchromatic imagery (2.4 m and 0.68 m sided pixels, respectively served to model structural parameters. Classification and Regression Tree Analysis (CART was selected for the modeling of attributes. Accurate models were produced of quadratic mean diameter (QMD (R2 = 0.8; RMSE = 0.13 m with an average error of 17% while basal area (BA models produced an average error of 22% (RMSE = 5.79 m2/ha. When the measured number of trees per unit area (N was categorized, as per frequent forest management practices, CART models correctly classified 70% of the stands, with all other stands classified in an adjacent class. The accuracy of the attributes estimated here is expected to be better when canopy cover is more open and attribute values are at the lower end of the range present, as related in the pattern of the residuals found in this study. Our findings indicate that attributes derived from

  12. Effects of phosphorus addition on nitrogen cycle and fluxes of N2O and CH4 in tropical tree plantation soils in Thailand

    Directory of Open Access Journals (Sweden)

    Taiki Mori

    2017-04-01

    Full Text Available An incubation experiment was conducted to test the effects of phosphorus (P addition on nitrous oxide (N2O emissions and methane (CH4 uptakes, using tropical tree plantation soils in Thailand. Soil samples were taken from five forest stands—Acacia auriculiformis, Acacia mangium, Eucalyptus camaldulensis, Hopea odorata, and Xylia xylocarpa—and incubated at 80% water holding capacity. P addition stimulated N2O emissions only in Xylia xylocarpa soils. Since P addition tended to increase net ammonification rates in Xylia xylocarpa soils, the stimulated N2O emissions were suggested to be due to the stimulated nitrogen (N cycle by P addition and the higher N supply for nitrification and denitrification. In other soils, P addition had no effects on N2O emissions or soil N properties, except that P addition tended to increase the soil microbial biomass N in Acacia auriculiformis soils. No effects of P addition were observed on CH4 uptakes in any soil. It is suggested that P addition on N2O and CH4 fluxes at the study site were not significant, at least under laboratory conditions.

  13. Identifying changes in dissolved organic matter content and characteristics by fluorescence spectroscopy coupled with self-organizing map and classification and regression tree analysis during wastewater treatment.

    Science.gov (United States)

    Yu, Huibin; Song, Yonghui; Liu, Ruixia; Pan, Hongwei; Xiang, Liancheng; Qian, Feng

    2014-10-01

    The stabilization of latent tracers of dissolved organic matter (DOM) of wastewater was analyzed by three-dimensional excitation-emission matrix (EEM) fluorescence spectroscopy coupled with self-organizing map and classification and regression tree analysis (CART) in wastewater treatment performance. DOM of water samples collected from primary sedimentation, anaerobic, anoxic, oxic and secondary sedimentation tanks in a large-scale wastewater treatment plant contained four fluorescence components: tryptophan-like (C1), tyrosine-like (C2), microbial humic-like (C3) and fulvic-like (C4) materials extracted by self-organizing map. These components showed good positive linear correlations with dissolved organic carbon of DOM. C1 and C2 were representative components in the wastewater, and they were removed to a higher extent than those of C3 and C4 in the treatment process. C2 was a latent parameter determined by CART to differentiate water samples of oxic and secondary sedimentation tanks from the successive treatment units, indirectly proving that most of tyrosine-like material was degraded by anaerobic microorganisms. C1 was an accurate parameter to comprehensively separate the samples of the five treatment units from each other, indirectly indicating that tryptophan-like material was decomposed by anaerobic and aerobic bacteria. EEM fluorescence spectroscopy in combination with self-organizing map and CART analysis can be a nondestructive effective method for characterizing structural component of DOM fractions and monitoring organic matter removal in wastewater treatment process. Copyright © 2014 Elsevier Ltd. All rights reserved.

  14. Identification of imaging predictors discriminating different primary liver tumours in patients with chronic liver disease on gadoxetic acid-enhanced MRI: a classification tree analysis

    International Nuclear Information System (INIS)

    Park, Hyun Jeong; Jang, Kyung Mi; Kang, Tae Wook; Song, Kyoung Doo; Kim, Seong Hyun; Kim, Young Kon; Cha, Dong Ik; Kim, Joungyoun; Goo, Juna

    2016-01-01

    To identify predictors for the discrimination of intrahepatic cholangiocarcinoma (IMCC) and combined hepatocellular-cholangiocarcinoma (CHC) from hepatocellular carcinoma (HCC) for primary liver cancers on gadoxetic acid-enhanced MRI among high-risk chronic liver disease (CLD) patients using classification tree analysis (CTA). A total of 152 patients with histopathologically proven IMCC (n = 40), CHC (n = 24) and HCC (n = 91) were enrolled. Tumour marker and MRI variables including morphologic features, signal intensity, and enhancement pattern were used to identify tumours suspicious for IMCC and CHC using CTA. On CTA, arterial rim enhancement (ARE) was the initial splitting predictor for assessing the probability of tumours being IMCC or CHC. Of 43 tumours that were classified in a subgroup on CTA based on the presence of ARE, non-intralesional fat, and non-globular shape, 41 (95.3 %) were IMCCs (n = 29) or CHCs (n = 12). All 24 tumours showing fat on MRI were HCCs. The CTA model demonstrated sensitivity of 84.4 %, specificity of 97.8 %, and accuracy of 92.3 % for discriminating IMCCs and CHCs from HCCs. We established a simple CTA model for classifying a high-risk group of CLD patients with IMCC and CHC. This model may be useful for guiding diagnosis for primary liver cancers in patients with CLD. (orig.)

  15. A Classification Regression Tree Analysis to Reduce Balance Impairments and Falls in the Older population: Impact on Resource Utilization and Clinical Decision-Making in USA Rehabilitation Service Delivery

    Directory of Open Access Journals (Sweden)

    Lucinda Pfalzer

    2013-06-01

    Full Text Available Background/Purpose: Over 1/3 of adults over age 65 experiences at least one fall each year. This pilot report uses a classification regression tree analysis (CART to model the outcomes for balance/risk of falls from the Gentiva® Safe Strides® Program (SSP. Methods/Outcomes: SSP is a home-based balance/fall prevention program designed to treat root causes of a patient

  16. Predictors of success of external cephalic version and cephalic presentation at birth among 1253 women with non-cephalic presentation using logistic regression and classification tree analyses.

    Science.gov (United States)

    Hutton, Eileen K; Simioni, Julia C; Thabane, Lehana

    2017-08-01

    Among women with a fetus with a non-cephalic presentation, external cephalic version (ECV) has been shown to reduce the rate of breech presentation at birth and cesarean birth. Compared with ECV at term, beginning ECV prior to 37 weeks' gestation decreases the number of infants in a non-cephalic presentation at birth. The purpose of this secondary analysis was to investigate factors associated with a successful ECV procedure and to present this in a clinically useful format. Data were collected as part of the Early ECV Pilot and Early ECV2 Trials, which randomized 1776 women with a fetus in breech presentation to either early ECV (34-36 weeks' gestation) or delayed ECV (at or after 37 weeks). The outcome of interest was successful ECV, defined as the fetus being in a cephalic presentation immediately following the procedure, as well as at the time of birth. The importance of several factors in predicting successful ECV was investigated using two statistical methods: logistic regression and classification and regression tree (CART) analyses. Among nulliparas, non-engagement of the presenting part and an easily palpable fetal head were independently associated with success. Among multiparas, non-engagement of the presenting part, gestation less than 37 weeks and an easily palpable fetal head were found to be independent predictors of success. These findings were consistent with results of the CART analyses. Regardless of parity, descent of the presenting part was the most discriminating factor in predicting successful ECV and cephalic presentation at birth. © 2017 Nordic Federation of Societies of Obstetrics and Gynecology.

  17. Data mining methods in the prediction of Dementia: A real-data comparison of the accuracy, sensitivity and specificity of linear discriminant analysis, logistic regression, neural networks, support vector machines, classification trees and random forests

    Directory of Open Access Journals (Sweden)

    Santana Isabel

    2011-08-01

    Full Text Available Abstract Background Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures of Mild Cognitive Impairment (MCI, but has presently a limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning methods like Neural Networks, Support Vector Machines and Random Forests can improve accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven non parametric classifiers derived from data mining methods (Multilayer Perceptrons Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees and Random Forests were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression in terms of overall classification accuracy, specificity, sensitivity, Area under the ROC curve and Press'Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using the Friedman's nonparametric test. Results Press' Q test showed that all classifiers performed better than chance alone (p Conclusions When taking into account sensitivity, specificity and overall classification accuracy Random Forests and Linear Discriminant analysis rank first among all the classifiers tested in prediction of dementia using several neuropsychological tests. These methods may be used to improve accuracy, sensitivity and specificity of Dementia predictions from neuropsychological testing.

  18. Categorizing ideas about trees: a tree of trees.

    Science.gov (United States)

    Fisler, Marie; Lecointre, Guillaume

    2013-01-01

    The aim of this study is to explore whether matrices and MP trees used to produce systematic categories of organisms could be useful to produce categories of ideas in history of science. We study the history of the use of trees in systematics to represent the diversity of life from 1766 to 1991. We apply to those ideas a method inspired from coding homologous parts of organisms. We discretize conceptual parts of ideas, writings and drawings about trees contained in 41 main writings; we detect shared parts among authors and code them into a 91-characters matrix and use a tree representation to show who shares what with whom. In other words, we propose a hierarchical representation of the shared ideas about trees among authors: this produces a "tree of trees." Then, we categorize schools of tree-representations. Classical schools like "cladists" and "pheneticists" are recovered but others are not: "gradists" are separated into two blocks, one of them being called here "grade theoreticians." We propose new interesting categories like the "buffonian school," the "metaphoricians," and those using "strictly genealogical classifications." We consider that networks are not useful to represent shared ideas at the present step of the study. A cladogram is made for showing who is sharing what with whom, but also heterobathmy and homoplasy of characters. The present cladogram is not modelling processes of transmission of ideas about trees, and here it is mostly used to test for proximity of ideas of the same age and for categorization.

  19. [Study on sensitivity of climatic factors on influenza A (H1N1) based on classification and regression tree and wavelet analysis].

    Science.gov (United States)

    Xiao, Hong; Lin, Xiao-ling; Dai, Xiang-yu; Gao, Li-dong; Chen, Bi-yun; Zhang, Xi-xing; Zhu, Pei-juan; Tian, Huai-yu

    2012-05-01

    To analyze the periodicity of pandemic influenza A (H1N1) in Changsha in year 2009 and its correlation with sensitive climatic factors. The information of 5439 cases of influenza A (H1N1) and synchronous meteorological data during the period between May 22th and December 31st in year 2009 (223 days in total) in Changsha city were collected. The classification and regression tree (CART) was employed to screen the sensitive climatic factors on influenza A (H1N1); meanwhile, cross wavelet transform and wavelet coherence analysis were applied to assess and compare the periodicity of the pandemic disease and its association with the time-lag phase features of the sensitive climatic factors. The results of CART indicated that the daily minimum temperature and daily absolute humidity were the sensitive climatic factors for the popularity of influenza A (H1N1) in Changsha. The peak of the incidence of influenza A (H1N1) was in the period between October and December (Median (M) = 44.00 cases per day), simultaneously the daily minimum temperature (M = 13°C) and daily absolute humidity (M = 6.69 g/m(3)) were relatively low. The results of wavelet analysis demonstrated that a period of 16 days was found in the epidemic threshold in Changsha, while the daily minimum temperature and daily absolute humidity were the relatively sensitive climatic factors. The number of daily reported patients was statistically relevant to the daily minimum temperature and daily absolute humidity. The frequency domain was mostly in the period of (16 ± 2) days. In the initial stage of the disease (from August 9th and September 8th), a 6-day lag was found between the incidence and the daily minimum temperature. In the peak period of the disease, the daily minimum temperature and daily absolute humidity were negatively relevant to the incidence of the disease. In the pandemic period, the incidence of influenza A (H1N1) showed periodic features; and the sensitive climatic factors did have a "driving

  20. Indexing Density Models for Incremental Learning and Anytime Classification on Data Streams

    DEFF Research Database (Denmark)

    Seidl, Thomas; Assent, Ira; Kranen, Philipp

    2009-01-01

    Classification of streaming data faces three basic challenges: it has to deal with huge amounts of data, the varying time between two stream data items must be used best possible (anytime classification) and additional training data must be incrementally learned (anytime learning) for applying...... to the individual object to be classified) a hierarchy of mixture densities that represent kernel density estimators at successively coarser levels. Our probability density queries together with novel classification improvement strategies provide the necessary information for very effective classification at any...... point of interruption. Moreover, we propose a novel evaluation method for anytime classification using Poisson streams and demonstrate the anytime learning performance of the Bayes tree....

  1. Under which conditions, additional monitoring data are worth gathering for improving decision making? Application of the VOI theory in the Bayesian Event Tree eruption forecasting framework

    Science.gov (United States)

    Loschetter, Annick; Rohmer, Jérémy

    2016-04-01

    Standard and new generation of monitoring observations provide in almost real-time important information about the evolution of the volcanic system. These observations are used to update the model and contribute to a better hazard assessment and to support decision making concerning potential evacuation. The framework BET_EF (based on Bayesian Event Tree) developed by INGV enables dealing with the integration of information from monitoring with the prospect of decision making. Using this framework, the objectives of the present work are i. to propose a method to assess the added value of information (within the Value Of Information (VOI) theory) from monitoring; ii. to perform sensitivity analysis on the different parameters that influence the VOI from monitoring. VOI consists in assessing the possible increase in expected value provided by gathering information, for instance through monitoring. Basically, the VOI is the difference between the value with information and the value without additional information in a Cost-Benefit approach. This theory is well suited to deal with situations that can be represented in the form of a decision tree such as the BET_EF tool. Reference values and ranges of variation (for sensitivity analysis) were defined for input parameters, based on data from the MESIMEX exercise (performed at Vesuvio volcano in 2006). Complementary methods for sensitivity analyses were implemented: local, global using Sobol' indices and regional using Contribution to Sample Mean and Variance plots. The results (specific to the case considered) obtained with the different techniques are in good agreement and enable answering the following questions: i. Which characteristics of monitoring are important for early warning (reliability)? ii. How do experts' opinions influence the hazard assessment and thus the decision? Concerning the characteristics of monitoring, the more influent parameters are the means rather than the variances for the case considered

  2. Repeated measurements of blood lactate concentration as a prognostic marker in horses with acute colitis evaluated with classification and regression trees (CART) and random forest analysis

    DEFF Research Database (Denmark)

    Petersen, Mette Bisgaard; Tolver, Anders; Husted, Louise

    2016-01-01

    -off value of 7 mmol/L had a sensitivity of 0.66 and a specificity of 0.92 in predicting survival. In independent test data, the sensitivity was 0.69 and the specificity was 0.76. At the observed survival rate (38%), the optimal decision tree identified horses as non-survivors when the Lac at admission...... admitted with acute colitis (trees, as well as random...

  3. A simple classification system (the Tree flowchart) for breast MRI can reduce the number of unnecessary biopsies in MRI-only lesions

    Energy Technology Data Exchange (ETDEWEB)

    Woitek, Ramona; Spick, Claudio; Schernthaner, Melanie; Kapetas, Panagiotis; Bernathova, Maria; Furtner, Julia; Pinker, Katja; Helbich, Thomas H.; Baltzer, Pascal A.T. [Medical University of Vienna, Department of Biomedical Imaging and Image-Guided Therapy, Vienna (Austria); Rudas, Margaretha [Medical University of Vienna, Clinical Institute of Pathology, Vienna (Austria)

    2017-09-15

    To assess whether using the Tree flowchart obviates unnecessary magnetic resonance imaging (MRI)-guided biopsies in breast lesions only visible on MRI. This retrospective IRB-approved study evaluated consecutive suspicious (BI-RADS 4) breast lesions only visible on MRI that were referred to our institution for MRI-guided biopsy. All lesions were evaluated according to the Tree flowchart for breast MRI by experienced readers. The Tree flowchart is a decision rule that assigns levels of suspicion to specific combinations of diagnostic criteria. Receiver operating characteristic (ROC) curve analysis was used to evaluate diagnostic accuracy. To assess reproducibility by kappa statistics, a second reader rated a subset of 82 patients. There were 454 patients with 469 histopathologically verified lesions included (98 malignant, 371 benign lesions). The area under the curve (AUC) of the Tree flowchart was 0.873 (95% CI: 0.839-0.901). The inter-reader agreement was almost perfect (kappa: 0.944; 95% CI 0.889-0.998). ROC analysis revealed exclusively benign lesions if the Tree node was ≤2, potentially avoiding unnecessary biopsies in 103 cases (27.8%). Using the Tree flowchart in breast lesions only visible on MRI, more than 25% of biopsies could be avoided without missing any breast cancer. (orig.)

  4. 基于决策树的戈壁信息提取研究%Gobi information extraction based on decision tree classification method

    Institute of Scientific and Technical Information of China (English)

    冯益明; 智长贵; 姚爱冬

    2013-01-01

    Gobi is one of the main landscape types of earth' s surface in the arid region of northwestern parts of China, with the total area of 458 000-757 000 km2, accounting for the 4.8%-7.9% of China's total land area. The gobi holds abundant natural resources such as minerals, wind energy and solar power. Meanwhile, many modern cities and towns and some important traffic routes were also constructed on the gobi region. The gobi region plays an important role in the construction of western economy. Therefore, it is important to launch the gobi research under current social and economic conditions, and accurately revealing the distribution and area of gobi is the base and premise of launching the gobi research. At present, it is difficult to do fieldwork due to the execrable natural conditions and the sparse dweller in the gobi region, which leads to the scarcity of research documents on the situation, distribution, type classification, transformation and utilization of gobi. The studied region of this paper is a typical gobi distribution region, locating in Ejina County in Inner Mongolia, China, and its climatic characteristics include lack of rain, more evaporation, full sunshine, large temperature difference and frequent windy sand weather. Using Remote Sensing imageries Landsat TM5 and TM7 of plant growth season of 2005-2010, the DEM with 30 m spatial resolution, administrative map, present land use map, field investigation data and related documents as the basic data resource. Firstly, the non-gobi distribution regions were extracted in GIS software by analyzing DEM. Then, based on the analysis of spectral characteristics of difference typical ground objects, the information extraction model of Decision Tree based on knowledge was constructed to classify the remote sensing imageries, and eroded gobi and cumulated gobi were relatively accurately separated. The general accuracy of the extracted gobi information reached 91.57%. There were few materials in China on using

  5. The gravity apple tree

    International Nuclear Information System (INIS)

    Aldama, Mariana Espinosa

    2015-01-01

    The gravity apple tree is a genealogical tree of the gravitation theories developed during the past century. The graphic representation is full of information such as guides in heuristic principles, names of main proponents, dates and references for original articles (See under Supplementary Data for the graphic representation). This visual presentation and its particular classification allows a quick synthetic view for a plurality of theories, many of them well validated in the Solar System domain. Its diachronic structure organizes information in a shape of a tree following similarities through a formal concept analysis. It can be used for educational purposes or as a tool for philosophical discussion. (paper)

  6. Long-term calcium addition increases growth release, wound closure, and health of sugar maple (Acer saccharum) trees at the Hubbard Brook Experimental Forest

    Science.gov (United States)

    Brett A. Huggett; Paul G. Schaberg; Gary J. Hawley; Christopher Eager

    2007-01-01

    We surveyed and wounded forest-grown sugar maple (Acer saccharum Marsh.) trees in a long-term, replicated Ca manipulation study at the Hubbard Brook Experimental Forest in New Hampshire, USA. Plots received applications of Ca (to boost Ca availability above depleted ambient levels) or A1 (to compete with Ca uptake and further reduce Ca availability...

  7. Using cluster analysis and a classification and regression tree model to developed cover types in the Sky Islands of southeastern Arizona [Abstract

    Science.gov (United States)

    Jose M. Iniguez; Joseph L. Ganey; Peter J. Daugherty; John D. Bailey

    2005-01-01

    The objective of this study was to develop a rule based cover type classification system for the forest and woodland vegetation in the Sky Islands of southeastern Arizona. In order to develop such system we qualitatively and quantitatively compared a hierarchical (Ward’s) and a non-hierarchical (k-means) clustering method. Ecologically, unique groups and plots...

  8. Mapping Savanna Tree Species at Ecosystem Scales Using Support Vector Machine Classification and BRDF Correction on Airborne Hyperspectral and LiDAR Data

    Directory of Open Access Journals (Sweden)

    Gregory P. Asner

    2012-11-01

    Full Text Available Mapping the spatial distribution of plant species in savannas provides insight into the roles of competition, fire, herbivory, soils and climate in maintaining the biodiversity of these ecosystems. This study focuses on the challenges facing large-scale species mapping using a fusion of Light Detection and Ranging (LiDAR and hyperspectral imagery. Here we build upon previous work on airborne species detection by using a two-stage support vector machine (SVM classifier to first predict species from hyperspectral data at the pixel scale. Tree crowns are segmented from the lidar imagery such that crown-level information, such as maximum tree height, can then be combined with the pixel-level species probabilities to predict the species of each tree. An overall prediction accuracy of 76% was achieved for 15 species. We also show that bidirectional reflectance distribution (BRDF effects caused by anisotropic scattering properties of savanna vegetation can result in flight line artifacts evident in species probability maps, yet these can be largely mitigated by applying a semi-empirical BRDF model to the hyperspectral data. We find that confronting these three challenges—reflectance anisotropy, integration of pixel- and crown-level data, and crown delineation over large areas—enables species mapping at ecosystem scales for monitoring biodiversity and ecosystem function.

  9. Tree Colors: Color Schemes for Tree-Structured Data.

    Science.gov (United States)

    Tennekes, Martijn; de Jonge, Edwin

    2014-12-01

    We present a method to map tree structures to colors from the Hue-Chroma-Luminance color model, which is known for its well balanced perceptual properties. The Tree Colors method can be tuned with several parameters, whose effect on the resulting color schemes is discussed in detail. We provide a free and open source implementation with sensible parameter defaults. Categorical data are very common in statistical graphics, and often these categories form a classification tree. We evaluate applying Tree Colors to tree structured data with a survey on a large group of users from a national statistical institute. Our user study suggests that Tree Colors are useful, not only for improving node-link diagrams, but also for unveiling tree structure in non-hierarchical visualizations.

  10. Synthesis of phylogeny and taxonomy into a comprehensive tree of life

    Science.gov (United States)

    Hinchliff, Cody E.; Smith, Stephen A.; Allman, James F.; Burleigh, J. Gordon; Chaudhary, Ruchi; Coghill, Lyndon M.; Crandall, Keith A.; Deng, Jiabin; Drew, Bryan T.; Gazis, Romina; Gude, Karl; Hibbett, David S.; Katz, Laura A.; Laughinghouse, H. Dail; McTavish, Emily Jane; Midford, Peter E.; Owen, Christopher L.; Ree, Richard H.; Rees, Jonathan A.; Soltis, Douglas E.; Williams, Tiffani; Cranston, Karen A.

    2015-01-01

    Reconstructing the phylogenetic relationships that unite all lineages (the tree of life) is a grand challenge. The paucity of homologous character data across disparately related lineages currently renders direct phylogenetic inference untenable. To reconstruct a comprehensive tree of life, we therefore synthesized published phylogenies, together with taxonomic classifications for taxa never incorporated into a phylogeny. We present a draft tree containing 2.3 million tips—the Open Tree of Life. Realization of this tree required the assembly of two additional community resources: (i) a comprehensive global reference taxonomy and (ii) a database of published phylogenetic trees mapped to this taxonomy. Our open source framework facilitates community comment and contribution, enabling the tree to be continuously updated when new phylogenetic and taxonomic data become digitally available. Although data coverage and phylogenetic conflict across the Open Tree of Life illuminate gaps in both the underlying data available for phylogenetic reconstruction and the publication of trees as digital objects, the tree provides a compelling starting point for community contribution. This comprehensive tree will fuel fundamental research on the nature of biological diversity, ultimately providing up-to-date phylogenies for downstream applications in comparative biology, ecology, conservation biology, climate change, agriculture, and genomics. PMID:26385966

  11. Discriminant forest classification method and system

    Science.gov (United States)

    Chen, Barry Y.; Hanley, William G.; Lemmond, Tracy D.; Hiller, Lawrence J.; Knapp, David A.; Mugge, Marshall J.

    2012-11-06

    A hybrid machine learning methodology and system for classification that combines classical random forest (RF) methodology with discriminant analysis (DA) techniques to provide enhanced classification capability. A DA technique which uses feature measurements of an object to predict its class membership, such as linear discriminant analysis (LDA) or Andersen-Bahadur linear discriminant technique (AB), is used to split the data at each node in each of its classification trees to train and grow the trees and the forest. When training is finished, a set of n DA-based decision trees of a discriminant forest is produced for use in predicting the classification of new samples of unknown class.

  12. Ash addition after whole-tree harvesting on fertile sites in southern Sweden Establishment of two field trials; Askaaterfoering efter skogsbraensleuttag paa boerdig skogsmark i soedra Sverige Anlaeggning av tvaa faeltfoersoek

    Energy Technology Data Exchange (ETDEWEB)

    Sikstroem, Ulf; Hoegbom, Lars; Jacobson, Staffan; Lundstroem, Hagos; Ring, Eva

    2012-02-15

    In Sweden, the Swedish Forest Agency recommend ash recycling after whole-tree harvest (WTH), i.e. the whole tree above the stump is harvested. If this recommendation is complied, ash recycling will become a quite extensive forestry practice. To gain more knowledge regarding effects of ash on tree growth, storage of carbon (C) and nitrogen (N) in the upper soil, and soil-water chemistry, two field experiments were established. The experimental design was a randomized block design. Both experiments were installed on quite fertile sites in southern Sweden. On fertile sites (C/N less than 30), ash addition is hypothesized to increase tree growth, reduce the storage of C and N in the upper soil, and increase the concentrations of nitrate in soil water. One experiment (291 Guvarp) was established on a recently harvested clear cut. The sample plots were soil scarified by disc trenching and planted with Norway spruce (Picea abies L. Karst.) seedlings. Seedling height was measured at planting, and, two months later, seedling survival and height growth in 2011 was recorded. The other experiment (290 Bala) was located to a Norway spruce stand that was thinned by WTH. The initial registration and sampling comprised stand data and sampling of the upper soil (FH-layer). In addition, suction cups were installed at 50 cm depth for sampling of soil water. The experiments were designed for long-term monitoring, comprising several decades. For most of the variables registered, data did not show any significant differences among means for the plots representing different treatments. Pre-treatment data for individual plots showed acceptable comparability among plots within the blocks, which is a good starting point for detecting future changes caused by the treatments

  13. Spatial and temporal variation of light inside peach trees

    International Nuclear Information System (INIS)

    Genard, M.; Baret, F.

    1994-01-01

    Gap fractions measured with hemispherical photographs were used to describe spatial and temporal variations of diffuse and direct light fractions transmitted to shoots within peach trees. For both cultivars studied, spatial variability of daily diffuse and direct light transmitted to shoots was very high within the tree. Diffuse and daily direct light fractions transmitted to shoots increased with shoot height within the tree and for more erect shoots. Temporal variations of hourly direct light were also large among shoots. Hourly direct light fractions transmitted to shoots were analyzed using recent developments in multivariate exploratory analysis. A gradient was observed between shoots sunlit almost all day and other shoots almost never sunlit. Well sunlit shoots were mostly located at the top of the tree and were more erect. Shoots located in the outer parts of the tree crown were slightly but significantly more sunlit than others for one cultivar. Principal component analysis additionally discriminated shoots according to the time of the day they were sunlit. This classification was related to shoot compass position for one cultivar. Spatial location of the shoot in the tree explained only a small part of light climate variability. Consequences of modeling light climate within the tree are discussed

  14. Human action analysis with randomized trees

    CERN Document Server

    Yu, Gang; Liu, Zicheng

    2014-01-01

    This book will provide a comprehensive overview on human action analysis with randomized trees. It will cover both the supervised random trees and the unsupervised random trees. When there are sufficient amount of labeled data available, supervised random trees provides a fast method for space-time interest point matching. When labeled data is minimal as in the case of example-based action search, unsupervised random trees is used to leverage the unlabelled data. We describe how the randomized trees can be used for action classification, action detection, action search, and action prediction.

  15. From Google Maps to a fine-grained catalog of street trees

    Science.gov (United States)

    Branson, Steve; Wegner, Jan Dirk; Hall, David; Lang, Nico; Schindler, Konrad; Perona, Pietro

    2018-01-01

    Up-to-date catalogs of the urban tree population are of importance for municipalities to monitor and improve quality of life in cities. Despite much research on automation of tree mapping, mainly relying on dedicated airborne LiDAR or hyperspectral campaigns, tree detection and species recognition is still mostly done manually in practice. We present a fully automated tree detection and species recognition pipeline that can process thousands of trees within a few hours using publicly available aerial and street view images of Google MapsTM. These data provide rich information from different viewpoints and at different scales from global tree shapes to bark textures. Our work-flow is built around a supervised classification that automatically learns the most discriminative features from thousands of trees and corresponding, publicly available tree inventory data. In addition, we introduce a change tracker that recognizes changes of individual trees at city-scale, which is essential to keep an urban tree inventory up-to-date. The system takes street-level images of the same tree location at two different times and classifies the type of change (e.g., tree has been removed). Drawing on recent advances in computer vision and machine learning, we apply convolutional neural networks (CNN) for all classification tasks. We propose the following pipeline: download all available panoramas and overhead images of an area of interest, detect trees per image and combine multi-view detections in a probabilistic framework, adding prior knowledge; recognize fine-grained species of detected trees. In a later, separate module, track trees over time, detect significant changes and classify the type of change. We believe this is the first work to exploit publicly available image data for city-scale street tree detection, species recognition and change tracking, exhaustively over several square kilometers, respectively many thousands of trees. Experiments in the city of Pasadena

  16. 78 FR 68983 - Cotton Futures Classification: Optional Classification Procedure

    Science.gov (United States)

    2013-11-18

    ...-AD33 Cotton Futures Classification: Optional Classification Procedure AGENCY: Agricultural Marketing... regulations to allow for the addition of an optional cotton futures classification procedure--identified and... response to requests from the U.S. cotton industry and ICE, AMS will offer a futures classification option...

  17. The Application of Classification and Regression Trees for the Triage of Women for Referral to Colposcopy and the Estimation of Risk for Cervical Intraepithelial Neoplasia: A Study Based on 1625 Cases with Incomplete Data from Molecular Tests

    Directory of Open Access Journals (Sweden)

    Abraham Pouliakis

    2015-01-01

    Full Text Available Objective. Nowadays numerous ancillary techniques detecting HPV DNA and mRNA compete with cytology; however no perfect test exists; in this study we evaluated classification and regression trees (CARTs for the production of triage rules and estimate the risk for cervical intraepithelial neoplasia (CIN in cases with ASCUS+ in cytology. Study Design. We used 1625 cases. In contrast to other approaches we used missing data to increase the data volume, obtain more accurate results, and simulate real conditions in the everyday practice of gynecologic clinics and laboratories. The proposed CART was based on the cytological result, HPV DNA typing, HPV mRNA detection based on NASBA and flow cytometry, p16 immunocytochemical expression, and finally age and parous status. Results. Algorithms useful for the triage of women were produced; gynecologists could apply these in conjunction with available examination results and conclude to an estimation of the risk for a woman to harbor CIN expressed as a probability. Conclusions. The most important test was the cytological examination; however the CART handled cases with inadequate cytological outcome and increased the diagnostic accuracy by exploiting the results of ancillary techniques even if there were inadequate missing data. The CART performance was better than any other single test involved in this study.

  18. The Application of Classification and Regression Trees for the Triage of Women for Referral to Colposcopy and the Estimation of Risk for Cervical Intraepithelial Neoplasia: A Study Based on 1625 Cases with Incomplete Data from Molecular Tests.

    Science.gov (United States)

    Pouliakis, Abraham; Karakitsou, Efrossyni; Chrelias, Charalampos; Pappas, Asimakis; Panayiotides, Ioannis; Valasoulis, George; Kyrgiou, Maria; Paraskevaidis, Evangelos; Karakitsos, Petros

    2015-01-01

    Nowadays numerous ancillary techniques detecting HPV DNA and mRNA compete with cytology; however no perfect test exists; in this study we evaluated classification and regression trees (CARTs) for the production of triage rules and estimate the risk for cervical intraepithelial neoplasia (CIN) in cases with ASCUS+ in cytology. We used 1625 cases. In contrast to other approaches we used missing data to increase the data volume, obtain more accurate results, and simulate real conditions in the everyday practice of gynecologic clinics and laboratories. The proposed CART was based on the cytological result, HPV DNA typing, HPV mRNA detection based on NASBA and flow cytometry, p16 immunocytochemical expression, and finally age and parous status. Algorithms useful for the triage of women were produced; gynecologists could apply these in conjunction with available examination results and conclude to an estimation of the risk for a woman to harbor CIN expressed as a probability. The most important test was the cytological examination; however the CART handled cases with inadequate cytological outcome and increased the diagnostic accuracy by exploiting the results of ancillary techniques even if there were inadequate missing data. The CART performance was better than any other single test involved in this study.

  19. Classifying Classifications

    DEFF Research Database (Denmark)

    Debus, Michael S.

    2017-01-01

    This paper critically analyzes seventeen game classifications. The classifications were chosen on the basis of diversity, ranging from pre-digital classification (e.g. Murray 1952), over game studies classifications (e.g. Elverdam & Aarseth 2007) to classifications of drinking games (e.g. LaBrie et...... al. 2013). The analysis aims at three goals: The classifications’ internal consistency, the abstraction of classification criteria and the identification of differences in classification across fields and/or time. Especially the abstraction of classification criteria can be used in future endeavors...... into the topic of game classifications....

  20. Aproximación a la metodología basada en árboles de decisión (CART: Mortalidad hospitalaria del infarto agudo de miocardio Approach to the methodology of classification and regression trees

    Directory of Open Access Journals (Sweden)

    Javier Trujillano

    2008-02-01

    Full Text Available Objetivo: : Realizar una aproximación a la metodología de árboles de decisión tipo CART (Classification and Regression Trees desarrollando un modelo para calcular la probabilidad de muerte hospitalaria en infarto agudo de miocardio (IAM. Método: Se utiliza el conjunto mínimo básico de datos al alta hospitalaria (CMBD de Andalucía, Cataluña, Madrid y País Vasco de los años 2001 y 2002, que incluye los casos con IAM como diagnóstico principal. Los 33.203 pacientes se dividen aleatoriamente (70 y 30 % en grupo de desarrollo (GD = 23.277 y grupo de validación (GV = 9.926. Como CART se utiliza un modelo inductivo basado en el algoritmo de Breiman, con análisis de sensibilidad mediante el índice de Gini y sistema de validación cruzada. Se compara con un modelo de regresión logística (RL y una red neuronal artificial (RNA (multilayer perceptron. Los modelos desarrollados se contrastan en el GV y sus propiedades se comparan con el área bajo la curva ROC (ABC (intervalo de confianza del 95%. Resultados: En el GD el CART con ABC = 0,85 (0,86-0,88, RL 0,87 (0,86-0,88 y RNA 0,85 (0,85-0,86. En el GV el CART con ABC = 0,85 (0,85-0,88, RL 0,86 (0,85-0,88 y RNA 0,84 (0,83-0,86. Conclusiones: Los 3 modelos obtienen resultados similares en su capacidad de discriminación. El modelo CART ofrece como ventaja su simplicidad de uso y de interpretación, ya que las reglas de decisión que generan pueden aplicarse sin necesidad de procesos matemáticos.Objective: To provide an overview of decision trees based on CART (Classification and Regression Trees methodology. As an example, we developed a CART model intended to estimate the probability of intrahospital death from acute myocardial infarction (AMI. Method: We employed the minimum data set (MDS of Andalusia, Catalonia, Madrid and the Basque Country (2001-2002, which included 33,203 patients with a diagnosis of AMI. The 33,203 patients were randomly divided (70% and 30% into the development (DS

  1. Flowering Trees

    Indian Academy of Sciences (India)

    Flowering Trees. Boswellia serrata Roxb. ex Colebr. (Indian Frankincense tree) of Burseraceae is a large-sized deciduous tree that is native to India. Bark is thin, greenish-ash-coloured that exfoliates into smooth papery flakes. Stem exudes pinkish resin ... Fruit is a three-valved capsule. A green gum-resin exudes from the ...

  2. Flowering Trees

    Indian Academy of Sciences (India)

    IAS Admin

    Flowering Trees. Ailanthus excelsa Roxb. (INDIAN TREE OF. HEAVEN) of Simaroubaceae is a lofty tree with large pinnately compound alternate leaves, which are ... inflorescences, unisexual and greenish-yellow. Fruits are winged, wings many-nerved. Wood is used in making match sticks. 1. Male flower; 2. Female flower.

  3. Flowering Trees

    Indian Academy of Sciences (India)

    Flowering Trees. Gyrocarpus americanus Jacq. (Helicopter Tree) of Hernandiaceae is a moderate size deciduous tree that grows to about 12 m in height with a smooth, shining, greenish-white bark. The leaves are ovate, rarely irregularly ... flowers which are unpleasant smelling. Fruit is a woody nut with two long thin wings.

  4. Flowering Trees

    Indian Academy of Sciences (India)

    More Details Fulltext PDF. Volume 8 Issue 8 August 2003 pp 112-112 Flowering Trees. Zizyphus jujuba Lam. of Rhamnaceae · More Details Fulltext PDF. Volume 8 Issue 9 September 2003 pp 97-97 Flowering Trees. Moringa oleifera · More Details Fulltext PDF. Volume 8 Issue 10 October 2003 pp 100-100 Flowering Trees.

  5. Right putamen and age are the most discriminant features to diagnose Parkinson's disease by using 123I-FP-CIT brain SPET data by using an artificial neural network classifier, a classification tree (ClT).

    Science.gov (United States)

    Cascianelli, S; Tranfaglia, C; Fravolini, M L; Bianconi, F; Minestrini, M; Nuvoli, S; Tambasco, N; Dottorini, M E; Palumbo, B

    2017-01-01

    The differential diagnosis of Parkinson's disease (PD) and other conditions, such as essential tremor and drug-induced parkinsonian syndrome or normal aging brain, represents a diagnostic challenge. 123 I-FP-CIT brain SPET is able to contribute to the differential diagnosis. Semiquantitative analysis of radiopharmaceutical uptake in basal ganglia (caudate nuclei and putamina) is very useful to support the diagnostic process. An artificial neural network classifier using 123 I-FP-CIT brain SPET data, a classification tree (CIT), was applied. CIT is an automatic classifier composed of a set of logical rules, organized as a decision tree to produce an optimised threshold based classification of data to provide discriminative cut-off values. We applied a CIT to 123 I-FP-CIT brain SPET semiquantitave data, to obtain cut-off values of radiopharmaceutical uptake ratios in caudate nuclei and putamina with the aim to diagnose PD versus other conditions. We retrospectively investigated 187 patients undergoing 123 I-FP-CIT brain SPET (Millenium VG, G.E.M.S.) with semiquantitative analysis performed with Basal Ganglia (BasGan) V2 software according to EANM guidelines; among them 113 resulted affected by PD (PD group) and 74 (N group) by other non parkinsonian conditions, such as Essential Tremor and drug-induced PD. PD group included 113 subjects (60M and 53F of age: 60-81yrs) having Hoehn and Yahr score (HY): 0.5-1.5; Unified Parkinson Disease Rating Scale (UPDRS) score: 6-38; N group included 74 subjects (36M and 38 F range of age 60-80 yrs). All subjects were clinically followed for at least 6-18 months to confirm the diagnosis. To examinate data obtained by using CIT, for each of the 1,000 experiments carried out, 10% of patients were randomly selected as the CIT training set, while the remaining 90% validated the trained CIT, and the percentage of the validation data correctly classified in the two groups of patients was computed. The expected performance of an "average

  6. Tree compression with top trees

    DEFF Research Database (Denmark)

    Bille, Philip; Gørtz, Inge Li; Landau, Gad M.

    2013-01-01

    We introduce a new compression scheme for labeled trees based on top trees [3]. Our compression scheme is the first to simultaneously take advantage of internal repeats in the tree (as opposed to the classical DAG compression that only exploits rooted subtree repeats) while also supporting fast...

  7. Tree compression with top trees

    DEFF Research Database (Denmark)

    Bille, Philip; Gørtz, Inge Li; Landau, Gad M.

    2015-01-01

    We introduce a new compression scheme for labeled trees based on top trees. Our compression scheme is the first to simultaneously take advantage of internal repeats in the tree (as opposed to the classical DAG compression that only exploits rooted subtree repeats) while also supporting fast...

  8. Minimum Error Entropy Classification

    CERN Document Server

    Marques de Sá, Joaquim P; Santos, Jorge M F; Alexandre, Luís A

    2013-01-01

    This book explains the minimum error entropy (MEE) concept applied to data classification machines. Theoretical results on the inner workings of the MEE concept, in its application to solving a variety of classification problems, are presented in the wider realm of risk functionals. Researchers and practitioners also find in the book a detailed presentation of practical data classifiers using MEE. These include multi‐layer perceptrons, recurrent neural networks, complexvalued neural networks, modular neural networks, and decision trees. A clustering algorithm using a MEE‐like concept is also presented. Examples, tests, evaluation experiments and comparison with similar machines using classic approaches, complement the descriptions.

  9. Fragmentation of random trees

    International Nuclear Information System (INIS)

    Kalay, Z; Ben-Naim, E

    2015-01-01

    We study fragmentation of a random recursive tree into a forest by repeated removal of nodes. The initial tree consists of N nodes and it is generated by sequential addition of nodes with each new node attaching to a randomly-selected existing node. As nodes are removed from the tree, one at a time, the tree dissolves into an ensemble of separate trees, namely, a forest. We study statistical properties of trees and nodes in this heterogeneous forest, and find that the fraction of remaining nodes m characterizes the system in the limit N→∞. We obtain analytically the size density ϕ s of trees of size s. The size density has power-law tail ϕ s ∼s −α with exponent α=1+(1/m). Therefore, the tail becomes steeper as further nodes are removed, and the fragmentation process is unusual in that exponent α increases continuously with time. We also extend our analysis to the case where nodes are added as well as removed, and obtain the asymptotic size density for growing trees. (paper)

  10. Stock Picking via Nonsymmetrically Pruned Binary Decision Trees

    OpenAIRE

    Anton Andriyashin

    2008-01-01

    Stock picking is the field of financial analysis that is of particular interest for many professional investors and researchers. In this study stock picking is implemented via binary classification trees. Optimal tree size is believed to be the crucial factor in forecasting performance of the trees. While there exists a standard method of tree pruning, which is based on the cost-complexity tradeoff and used in the majority of studies employing binary decision trees, this paper introduces a no...

  11. Tree Nut Allergies

    Science.gov (United States)

    ... Blog Vision Awards Common Allergens Tree Nut Allergy Tree Nut Allergy Learn about tree nut allergy, how ... a Tree Nut Label card . Allergic Reactions to Tree Nuts Tree nuts can cause a severe and ...

  12. Munitions Classification Library

    Science.gov (United States)

    2016-04-04

    members of the community to make their own additions to any, or all, of the classification libraries . The next phase entailed data collection over less......Include area code) 04/04/2016 Final Report August 2014 - August 2015 MUNITIONS CLASSIFICATION LIBRARY Mr. Craig Murray, Parsons Dr. Thomas H. Bell, Leidos

  13. Spatial and Spectral Hybrid Image Classification for Rice Lodging Assessment through UAV Imagery

    Directory of Open Access Journals (Sweden)

    Ming-Der Yang

    2017-06-01

    Full Text Available Rice lodging identification relies on manual in situ assessment and often leads to a compensation dispute in agricultural disaster assessment. Therefore, this study proposes a comprehensive and efficient classification technique for agricultural lands that entails using unmanned aerial vehicle (UAV imagery. In addition to spectral information, digital surface model (DSM and texture information of the images was obtained through image-based modeling and texture analysis. Moreover, single feature probability (SFP values were computed to evaluate the contribution of spectral and spatial hybrid image information to classification accuracy. The SFP results revealed that texture information was beneficial for the classification of rice and water, DSM information was valuable for lodging and tree classification, and the combination of texture and DSM information was helpful in distinguishing between artificial surface and bare land. Furthermore, a decision tree classification model incorporating SFP values yielded optimal results, with an accuracy of 96.17% and a Kappa value of 0.941, compared with that of a maximum likelihood classification model (90.76%. The rice lodging ratio in paddies at the study site was successfully identified, with three paddies being eligible for disaster relief. The study demonstrated that the proposed spatial and spectral hybrid image classification technology is a promising tool for rice lodging assessment.

  14. Phylogenetic trees in bioinformatics

    Energy Technology Data Exchange (ETDEWEB)

    Burr, Tom L [Los Alamos National Laboratory

    2008-01-01

    Genetic data is often used to infer evolutionary relationships among a collection of viruses, bacteria, animal or plant species, or other operational taxonomic units (OTU). A phylogenetic tree depicts such relationships and provides a visual representation of the estimated branching order of the OTUs. Tree estimation is unique for several reasons, including: the types of data used to represent each OTU; the use ofprobabilistic nucleotide substitution models; the inference goals involving both tree topology and branch length, and the huge number of possible trees for a given sample of a very modest number of OTUs, which implies that fmding the best tree(s) to describe the genetic data for each OTU is computationally demanding. Bioinformatics is too large a field to review here. We focus on that aspect of bioinformatics that includes study of similarities in genetic data from multiple OTUs. Although research questions are diverse, a common underlying challenge is to estimate the evolutionary history of the OTUs. Therefore, this paper reviews the role of phylogenetic tree estimation in bioinformatics, available methods and software, and identifies areas for additional research and development.

  15. Flowering Trees

    Indian Academy of Sciences (India)

    medium-sized handsome tree with a straight bole that branches at the top. Leaves are once pinnate, with two to three pairs of leaflets. Young parts of the tree are velvety. Inflorescence is a branched raceme borne at the branch ends. Flowers are large, white, attractive, and fragrant. Corolla is funnel-shaped. Fruit is an ...

  16. Flowering Trees

    Indian Academy of Sciences (India)

    Cassia siamia Lamk. (Siamese tree senna) of Caesalpiniaceae is a small or medium size handsome tree. Leaves are alternate, pinnately compound and glandular, upto 18 cm long with 8–12 pairs of leaflets. Inflorescence is axillary or terminal and branched. Flowering lasts for a long period from March to February. Fruit is ...

  17. Flowering Trees

    Indian Academy of Sciences (India)

    Flowering Trees. Cerbera manghasL. (SEA MANGO) of Apocynaceae is a medium-sized evergreen coastal tree with milky latex. The bark is grey-brown, thick and ... Fruit is large. (5–10 cm long), oval containing two flattened seeds and resembles a mango, hence the name Mangas or. Manghas. Leaves and fruits contain ...

  18. Flowering Trees

    Indian Academy of Sciences (India)

    user

    Flowering Trees. Gliricidia sepium(Jacq.) Kunta ex Walp. (Quickstick) of Fabaceae is a small deciduous tree with. Pinnately compound leaves. Flower are prroduced in large number in early summer on terminal racemes. They are attractive, pinkish-white and typically like bean flowers. Fruit is a few-seeded flat pod.

  19. Flowering Trees

    Indian Academy of Sciences (India)

    Flowering Trees. Acrocarpus fraxinifolius Wight & Arn. (PINK CEDAR, AUSTRALIAN ASH) of. Caesalpiniaceae is a lofty unarmed deciduous native tree that attains a height of 30–60m with buttresses. Bark is thin and light grey. Leaves are compound and bright red when young. Flowers in dense, erect, axillary racemes.

  20. Talking Trees

    Science.gov (United States)

    Tolman, Marvin

    2005-01-01

    Students love outdoor activities and will love them even more when they build confidence in their tree identification and measurement skills. Through these activities, students will learn to identify the major characteristics of trees and discover how the pace--a nonstandard measuring unit--can be used to estimate not only distances but also the…

  1. Drawing Trees

    DEFF Research Database (Denmark)

    Halkjær From, Andreas; Schlichtkrull, Anders; Villadsen, Jørgen

    2018-01-01

    We formally prove in Isabelle/HOL two properties of an algorithm for laying out trees visually. The first property states that removing layout annotations recovers the original tree. The second property states that nodes are placed at least a unit of distance apart. We have yet to formalize three...

  2. Flowering Trees

    Indian Academy of Sciences (India)

    Srimath

    Grevillea robusta A. Cunn. ex R. Br. (Sil- ver Oak) of Proteaceae is a daintily lacy ornamental tree while young and growing into a mighty tree (45 m). Young shoots are silvery grey and the leaves are fern- like. Flowers are golden-yellow in one- sided racemes (10 cm). Fruit is a boat- shaped, woody follicle.

  3. Evaluation of the 8th TNM classification on p16-positive oropharyngeal squamous cell carcinomas in the Netherlands, and the importance of additional HPV DNA-testing.

    Science.gov (United States)

    Nauta, I H; Rietbergen, M M; van Bokhoven, A A J D; Bloemena, E; Witte, B I; Heideman, D A M; Baatenburg de Jong, R J; Brakenhoff, R H; Leemans, C R

    2018-02-09

    Oropharyngeal squamous cell carcinomas (OPSCCs) are traditionally caused by smoking and excessive alcohol consumption. However, in the last decades high-risk human papillomavirus (HR-HPV) infections play an increasingly important role in tumorigenesis. HPV-driven OPSCCs are known to have a more favorable prognosis, which has led to important and marked changes in the recently released TNM-8. In this edition, OPSCCs are divided based on p16-immunostaining, with p16-overexpression as surrogate marker for the presence of HPV. The aims of this study are to evaluate TNM-8 on a Dutch consecutive cohort of patients with p16-positive OPSCC and to determine the relevance of additional HPV DNA-testing. All OPSCC patients without distant metastases at diagnosis and treated with curative intent at VU University Medical Center (2000-2015) and Erasmus Medical Center (2000-2006) were included (N = 1,204). HPV-status was established by p16-immunostaining followed by HPV DNA-PCR on the p16-immunopositive cases. We compared TNM-7 and TNM-8 using the Harrell's C index. In total, 388 of 1,204 (32.2%) patients were p16-immunopositive. In these patients, TNM-8 had a markedly better predictive prognostic power than TNM-7 (Harrell's C index 0.63 versus 0.53). Of the 388 p16-positive OPSCCs, 48 tumors (12.4%) were HPV DNA-negative. This subgroup had distinct demographic, clinical and morphologic characteristics and showed a significantly worse five-year overall survival compared to the HPV DNA-positive tumors (P HPV DNA-negative subgroup with distinct features and a worse overall survival, indicating the importance to perform additional HPV DNA-testing when predicting prognosis and particularly for selecting patients for de-intensified treatment regimens. © The Author 2018. Published by Oxford University Press on behalf of the European Society for Medical Oncology. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  4. Trees in the city: valuing street trees in Portland, Oregon

    Science.gov (United States)

    G.H. Donovan; D.T. Butry

    2010-01-01

    We use a hedonic price model to simultaneously estimate the effects of street trees on the sales price and the time-on-market (TOM) of houses in Portland. Oregon. On average, street trees add $8,870 to sales price and reduce TOM by 1.7 days. In addition, we found that the benefits of street trees spill over to neighboring houses. Because the provision and maintenance...

  5. Applying Topographic Classification, Based on the Hydrological Process, to Design Habitat Linkages for Climate Change

    Directory of Open Access Journals (Sweden)

    Yongwon Mo

    2017-11-01

    Full Text Available The use of biodiversity surrogates has been discussed in the context of designing habitat linkages to support the migration of species affected by climate change. Topography has been proposed as a useful surrogate in the coarse-filter approach, as the hydrological process caused by topography such as erosion and accumulation is the basis of ecological processes. However, some studies that have designed topographic linkages as habitat linkages, so far have focused much on the shape of the topography (morphometric topographic classification with little emphasis on the hydrological processes (generic topographic classification to find such topographic linkages. We aimed to understand whether generic classification was valid for designing these linkages. First, we evaluated whether topographic classification is more appropriate for describing actual (coniferous and deciduous and potential (mammals and amphibians habitat distributions. Second, we analyzed the difference in the linkages between the morphometric and generic topographic classifications. The results showed that the generic classification represented the actual distribution of the trees, but neither the morphometric nor the generic classification could represent the potential animal distributions adequately. Our study demonstrated that the topographic classes, according to the generic classification, were arranged successively according to the flow of water, nutrients, and sediment; therefore, it would be advantageous to secure linkages with a width of 1 km or more. In addition, the edge effect would be smaller than with the morphometric classification. Accordingly, we suggest that topographic characteristics, based on the hydrological process, are required to design topographic linkages for climate change.

  6. Phylogenetic trees

    OpenAIRE

    Baños, Hector; Bushek, Nathaniel; Davidson, Ruth; Gross, Elizabeth; Harris, Pamela E.; Krone, Robert; Long, Colby; Stewart, Allen; Walker, Robert

    2016-01-01

    We introduce the package PhylogeneticTrees for Macaulay2 which allows users to compute phylogenetic invariants for group-based tree models. We provide some background information on phylogenetic algebraic geometry and show how the package PhylogeneticTrees can be used to calculate a generating set for a phylogenetic ideal as well as a lower bound for its dimension. Finally, we show how methods within the package can be used to compute a generating set for the join of any two ideals.

  7. Classification and Regression Tree Analysis of Clinical Patterns that Predict Survival in 127 Chinese Patients with Advanced Non-small Cell Lung Cancer Treated by Gefitinib Who Failed to Previous Chemotherapy

    Directory of Open Access Journals (Sweden)

    Ziping WANG

    2011-09-01

    Full Text Available Background and objective It has been proven that gefitinib produces only 10%-20% tumor regression in heavily pretreated, unselected non-small cell lung cancer (NSCLC patients as the second- and third-line setting. Asian, female, nonsmokers and adenocarcinoma are favorable factors; however, it is difficult to find a patient satisfying all the above clinical characteristics. The aim of this study is to identify novel predicting factors, and to explore the interactions between clinical variables and their impact on the survival of Chinese patients with advanced NSCLC who were heavily treated with gefitinib in the second- or third-line setting. Methods The clinical and follow-up data of 127 advanced NSCLC patients referred to the Cancer Hospital & Institute, Chinese Academy of Medical Sciences from March 2005 to March 2010 were analyzed. Multivariate analysis of progression-free survival (PFS was performed using recursive partitioning, which is referred to as the classification and regression tree (CART analysis. Results The median PFS of 127 eligible consecutive advanced NSCLC patients was 8.0 months (95%CI: 5.8-10.2. CART was performed with an initial split on first-line chemotherapy outcomes and a second split on patients’ age. Three terminal subgroups were formed. The median PFS of the three subsets ranged from 1.0 month (95%CI: 0.8-1.2 for those with progressive disease outcome after the first-line chemotherapy subgroup, 10 months (95%CI: 7.0-13.0 in patients with a partial response or stable disease in first-line chemotherapy and age <70, and 22.0 months for patients obtaining a partial response or stable disease in first-line chemotherapy at age 70-81 (95%CI: 3.8-40.1. Conclusion Partial response, stable disease in first-line chemotherapy and age ≥ 70 are closely correlated with long-term survival treated by gefitinib as a second- or third-line setting in advanced NSCLC. CART can be used to identify previously unappreciated patient

  8. The Tree of Industrial Life

    DEFF Research Database (Denmark)

    Andersen, Esben Sloth

    2002-01-01

    The purpose of this paper is to bring forth an interaction between evolutionary economics and industrial systematics. The suggested solution is to reconstruct the "family tree" of the industries. Such a tree is based on similarities, but it may also reflect the evolutionary history in industries....... For this purpose the paper shows how matrices of input-output coefficients can be transformed into binary characteristics matrices and to distance matrices, and it also discusses the possible evolutionary meaning of this translation. Then these derived matrices are used as inputs to algorithms for the heuristic...... finding of optimal industrial trees. The results are presented as taxonomic trees that can easily be compared with the hierarchical structure of existing systems of industrial classification....

  9. Residual tumor size and IGCCCG risk classification predict additional vascular procedures in patients with germ cell tumors and residual tumor resection: a multicenter analysis of the German Testicular Cancer Study Group.

    Science.gov (United States)

    Winter, Christian; Pfister, David; Busch, Jonas; Bingöl, Cigdem; Ranft, Ulrich; Schrader, Mark; Dieckmann, Klaus-Peter; Heidenreich, Axel; Albers, Peter

    2012-02-01

    Residual tumor resection (RTR) after chemotherapy in patients with advanced germ cell tumors (GCT) is an important part of the multimodal treatment. To provide a complete resection of residual tumor, additional surgical procedures are sometimes necessary. In particular, additional vascular interventions are high-risk procedures that require multidisciplinary planning and adequate resources to optimize outcome. The aim was to identify parameters that predict additional vascular procedures during RTR in GCT patients. A retrospective analysis was performed in 402 GCT patients who underwent 414 RTRs in 9 German Testicular Cancer Study Group (GTCSG) centers. Overall, 339 of 414 RTRs were evaluable with complete perioperative data sets. The RTR database was queried for additional vascular procedures (inferior vena cava [IVC] interventions, aortic prosthesis) and correlated to International Germ Cell Cancer Collaborative Group (IGCCCG) classification and residual tumor volume. In 40 RTRs, major vascular procedures (23 IVC resections with or without prosthesis, 11 partial IVC resections, and 6 aortic prostheses) were performed. In univariate analysis, the necessity of IVC intervention was significantly correlated with IGCCCG (14.1% intermediate/poor vs 4.8% good; p=0.0047) and residual tumor size (3.7% size risk features must initially be identified as high-risk patients for vascular procedures and therefore should be referred to specialized surgical centers with the ad hoc possibility of vascular interventions. Copyright © 2011 European Association of Urology. Published by Elsevier B.V. All rights reserved.

  10. Rate of tree carbon accumulation increases continuously with tree size.

    Science.gov (United States)

    Stephenson, N L; Das, A J; Condit, R; Russo, S E; Baker, P J; Beckman, N G; Coomes, D A; Lines, E R; Morris, W K; Rüger, N; Alvarez, E; Blundo, C; Bunyavejchewin, S; Chuyong, G; Davies, S J; Duque, A; Ewango, C N; Flores, O; Franklin, J F; Grau, H R; Hao, Z; Harmon, M E; Hubbell, S P; Kenfack, D; Lin, Y; Makana, J-R; Malizia, A; Malizia, L R; Pabst, R J; Pongpattananurak, N; Su, S-H; Sun, I-F; Tan, S; Thomas, D; van Mantgem, P J; Wang, X; Wiser, S K; Zavala, M A

    2014-03-06

    Forests are major components of the global carbon cycle, providing substantial feedback to atmospheric greenhouse gas concentrations. Our ability to understand and predict changes in the forest carbon cycle--particularly net primary productivity and carbon storage--increasingly relies on models that represent biological processes across several scales of biological organization, from tree leaves to forest stands. Yet, despite advances in our understanding of productivity at the scales of leaves and stands, no consensus exists about the nature of productivity at the scale of the individual tree, in part because we lack a broad empirical assessment of whether rates of absolute tree mass growth (and thus carbon accumulation) decrease, remain constant, or increase as trees increase in size and age. Here we present a global analysis of 403 tropical and temperate tree species, showing that for most species mass growth rate increases continuously with tree size. Thus, large, old trees do not act simply as senescent carbon reservoirs but actively fix large amounts of carbon compared to smaller trees; at the extreme, a single big tree can add the same amount of carbon to the forest within a year as is contained in an entire mid-sized tree. The apparent paradoxes of individual tree growth increasing with tree size despite declining leaf-level and stand-level productivity can be explained, respectively, by increases in a tree's total leaf area that outpace declines in productivity per unit of leaf area and, among other factors, age-related reductions in population density. Our results resolve conflicting assumptions about the nature of tree growth, inform efforts to undertand and model forest carbon dynamics, and have additional implications for theories of resource allocation and plant senescence.

  11. Classification and Analysis of Computer Network Traffic

    DEFF Research Database (Denmark)

    Bujlow, Tomasz

    2014-01-01

    various classification modes (decision trees, rulesets, boosting, softening thresholds) regarding the classification accuracy and the time required to create the classifier. We showed how to use our VBS tool to obtain per-flow, per-application, and per-content statistics of traffic in computer networks...

  12. Electron Tree

    DEFF Research Database (Denmark)

    Appelt, Ane L; Rønde, Heidi S

    2013-01-01

    The photo shows a close-up of a Lichtenberg figure – popularly called an “electron tree” – produced in a cylinder of polymethyl methacrylate (PMMA). Electron trees are created by irradiating a suitable insulating material, in this case PMMA, with an intense high energy electron beam. Upon discharge......, during dielectric breakdown in the material, the electrons generate branching chains of fractures on leaving the PMMA, producing the tree pattern seen. To be able to create electron trees with a clinical linear accelerator, one needs to access the primary electron beam used for photon treatments. We...... appropriated a linac that was being decommissioned in our department and dismantled the head to circumvent the target and ion chambers. This is one of 24 electron trees produced before we had to stop the fun and allow the rest of the accelerator to be disassembled....

  13. Flowering Trees

    Indian Academy of Sciences (India)

    Srimath

    shaped corolla. Fruit is large, ellipsoidal, green with a hard and smooth shell containing numerous flattened seeds, which are embedded in fleshy pulp. Calabash tree is commonly grown in the tropical gardens of the world as a botanical oddity.

  14. Tree felling 2014

    CERN Multimedia

    2014-01-01

    With a view to creating new landscapes and making its population of trees safer and healthier, this winter CERN will complete the tree-felling campaign started in 2010.   Tree felling will take place between 15 and 22 November on the Swiss part of the Meyrin site. This work is being carried out above all for safety reasons. The trees to be cut down are at risk of falling as they are too old and too tall to withstand the wind. In addition, the roots of poplar trees are very powerful and spread widely, potentially damaging underground networks, pavements and roadways. Compensatory tree planting campaigns will take place in the future, subject to the availability of funding, with the aim of creating coherent landscapes while also respecting the functional constraints of the site. These matters are being considered in close collaboration with the Geneva nature and countryside directorate (Direction générale de la nature et du paysage, DGNP). GS-SE Group

  15. Phylogenetic classification of bony fishes.

    Science.gov (United States)

    Betancur-R, Ricardo; Wiley, Edward O; Arratia, Gloria; Acero, Arturo; Bailly, Nicolas; Miya, Masaki; Lecointre, Guillaume; Ortí, Guillermo

    2017-07-06

    Fish classifications, as those of most other taxonomic groups, are being transformed drastically as new molecular phylogenies provide support for natural groups that were unanticipated by previous studies. A brief review of the main criteria used by ichthyologists to define their classifications during the last 50 years, however, reveals slow progress towards using an explicit phylogenetic framework. Instead, the trend has been to rely, in varying degrees, on deep-rooted anatomical concepts and authority, often mixing taxa with explicit phylogenetic support with arbitrary groupings. Two leading sources in ichthyology frequently used for fish classifications (JS Nelson's volumes of Fishes of the World and W. Eschmeyer's Catalog of Fishes) fail to adopt a global phylogenetic framework despite much recent progress made towards the resolution of the fish Tree of Life. The first explicit phylogenetic classification of bony fishes was published in 2013, based on a comprehensive molecular phylogeny ( www.deepfin.org ). We here update the first version of that classification by incorporating the most recent phylogenetic results. The updated classification presented here is based on phylogenies inferred using molecular and genomic data for nearly 2000 fishes. A total of 72 orders (and 79 suborders) are recognized in this version, compared with 66 orders in version 1. The phylogeny resolves placement of 410 families, or ~80% of the total of 514 families of bony fishes currently recognized. The ordinal status of 30 percomorph families included in this study, however, remains uncertain (incertae sedis in the series Carangaria, Ovalentaria, or Eupercaria). Comments to support taxonomic decisions and comparisons with conflicting taxonomic groups proposed by others are presented. We also highlight cases were morphological support exist for the groups being classified. This version of the phylogenetic classification of bony fishes is substantially improved, providing resolution

  16. Decision trees in epidemiological research

    Directory of Open Access Journals (Sweden)

    Ashwini Venkatasubramaniam

    2017-09-01

    Full Text Available Abstract Background In many studies, it is of interest to identify population subgroups that are relatively homogeneous with respect to an outcome. The nature of these subgroups can provide insight into effect mechanisms and suggest targets for tailored interventions. However, identifying relevant subgroups can be challenging with standard statistical methods. Main text We review the literature on decision trees, a family of techniques for partitioning the population, on the basis of covariates, into distinct subgroups who share similar values of an outcome variable. We compare two decision tree methods, the popular Classification and Regression tree (CART technique and the newer Conditional Inference tree (CTree technique, assessing their performance in a simulation study and using data from the Box Lunch Study, a randomized controlled trial of a portion size intervention. Both CART and CTree identify homogeneous population subgroups and offer improved prediction accuracy relative to regression-based approaches when subgroups are truly present in the data. An important distinction between CART and CTree is that the latter uses a formal statistical hypothesis testing framework in building decision trees, which simplifies the process of identifying and interpreting the final tree model. We also introduce a novel way to visualize the subgroups defined by decision trees. Our novel graphical visualization provides a more scientifically meaningful characterization of the subgroups identified by decision trees. Conclusions Decision trees are a useful tool for identifying homogeneous subgroups defined by combinations of individual characteristics. While all decision tree techniques generate subgroups, we advocate the use of the newer CTree technique due to its simplicity and ease of interpretation.

  17. Decision trees in epidemiological research.

    Science.gov (United States)

    Venkatasubramaniam, Ashwini; Wolfson, Julian; Mitchell, Nathan; Barnes, Timothy; JaKa, Meghan; French, Simone

    2017-01-01

    In many studies, it is of interest to identify population subgroups that are relatively homogeneous with respect to an outcome. The nature of these subgroups can provide insight into effect mechanisms and suggest targets for tailored interventions. However, identifying relevant subgroups can be challenging with standard statistical methods. We review the literature on decision trees, a family of techniques for partitioning the population, on the basis of covariates, into distinct subgroups who share similar values of an outcome variable. We compare two decision tree methods, the popular Classification and Regression tree (CART) technique and the newer Conditional Inference tree (CTree) technique, assessing their performance in a simulation study and using data from the Box Lunch Study, a randomized controlled trial of a portion size intervention. Both CART and CTree identify homogeneous population subgroups and offer improved prediction accuracy relative to regression-based approaches when subgroups are truly present in the data. An important distinction between CART and CTree is that the latter uses a formal statistical hypothesis testing framework in building decision trees, which simplifies the process of identifying and interpreting the final tree model. We also introduce a novel way to visualize the subgroups defined by decision trees. Our novel graphical visualization provides a more scientifically meaningful characterization of the subgroups identified by decision trees. Decision trees are a useful tool for identifying homogeneous subgroups defined by combinations of individual characteristics. While all decision tree techniques generate subgroups, we advocate the use of the newer CTree technique due to its simplicity and ease of interpretation.

  18. Building optimal regression tree by ant colony system-genetic algorithm: Application to modeling of melting points

    Energy Technology Data Exchange (ETDEWEB)

    Hemmateenejad, Bahram, E-mail: hemmatb@sums.ac.ir [Department of Chemistry, Shiraz University, Shiraz (Iran, Islamic Republic of); Medicinal and Natural Products Chemistry Research Center, Shiraz University of Medical Sciences, Shiraz (Iran, Islamic Republic of); Shamsipur, Mojtaba [Department of Chemistry, Razi University, Kermanshah (Iran, Islamic Republic of); Zare-Shahabadi, Vali [Young Researchers Club, Mahshahr Branch, Islamic Azad University, Mahshahr (Iran, Islamic Republic of); Akhond, Morteza [Department of Chemistry, Shiraz University, Shiraz (Iran, Islamic Republic of)

    2011-10-17

    Highlights: {yields} Ant colony systems help to build optimum classification and regression trees. {yields} Using of genetic algorithm operators in ant colony systems resulted in more appropriate models. {yields} Variable selection in each terminal node of the tree gives promising results. {yields} CART-ACS-GA could model the melting point of organic materials with prediction errors lower than previous models. - Abstract: The classification and regression trees (CART) possess the advantage of being able to handle large data sets and yield readily interpretable models. A conventional method of building a regression tree is recursive partitioning, which results in a good but not optimal tree. Ant colony system (ACS), which is a meta-heuristic algorithm and derived from the observation of real ants, can be used to overcome this problem. The purpose of this study was to explore the use of CART and its combination with ACS for modeling of melting points of a large variety of chemical compounds. Genetic algorithm (GA) operators (e.g., cross averring and mutation operators) were combined with ACS algorithm to select the best solution model. In addition, at each terminal node of the resulted tree, variable selection was done by ACS-GA algorithm to build an appropriate partial least squares (PLS) model. To test the ability of the resulted tree, a set of approximately 4173 structures and their melting points were used (3000 compounds as training set and 1173 as validation set). Further, an external test set containing of 277 drugs was used to validate the prediction ability of the tree. Comparison of the results obtained from both trees showed that the tree constructed by ACS-GA algorithm performs better than that produced by recursive partitioning procedure.

  19. Phylogenetic trees and Euclidean embeddings.

    Science.gov (United States)

    Layer, Mark; Rhodes, John A

    2017-01-01

    It was recently observed by de Vienne et al. (Syst Biol 60(6):826-832, 2011) that a simple square root transformation of distances between taxa on a phylogenetic tree allowed for an embedding of the taxa into Euclidean space. While the justification for this was based on a diffusion model of continuous character evolution along the tree, here we give a direct and elementary explanation for it that provides substantial additional insight. We use this embedding to reinterpret the differences between the NJ and BIONJ tree building algorithms, providing one illustration of how this embedding reflects tree structures in data.

  20. Rate of tree carbon accumulation increases continuously with tree size

    Science.gov (United States)

    Stephenson, N.L.; Das, A.J.; Condit, R.; Russo, S.E.; Baker, P.J.; Beckman, N.G.; Coomes, D.A.; Lines, E.R.; Morris, W.K.; Rüger, N.; Álvarez, E.; Blundo, C.; Bunyavejchewin, S.; Chuyong, G.; Davies, S.J.; Duque, Á.; Ewango, C.N.; Flores, O.; Franklin, J.F.; Grau, H.R.; Hao, Z.; Harmon, M.E.; Hubbell, S.P.; Kenfack, D.; Lin, Y.; Makana, J.-R.; Malizia, A.; Malizia, L.R.; Pabst, R.J.; Pongpattananurak, N.; Su, S.-H.; Sun, I-F.; Tan, S.; Thomas, D.; van Mantgem, P.J.; Wang, X.; Wiser, S.K.; Zavala, M.A.

    2014-01-01

    Forests are major components of the global carbon cycle, providing substantial feedback to atmospheric greenhouse gas concentrations. Our ability to understand and predict changes in the forest carbon cycle—particularly net primary productivity and carbon storage - increasingly relies on models that represent biological processes across several scales of biological organization, from tree leaves to forest stands. Yet, despite advances in our understanding of productivity at the scales of leaves and stands, no consensus exists about the nature of productivity at the scale of the individual tree, in part because we lack a broad empirical assessment of whether rates of absolute tree mass growth (and thus carbon accumulation) decrease, remain constant, or increase as trees increase in size and age. Here we present a global analysis of 403 tropical and temperate tree species, showing that for most species mass growth rate increases continuously with tree size. Thus, large, old trees do not act simply as senescent carbon reservoirs but actively fix large amounts of carbon compared to smaller trees; at the extreme, a single big tree can add the same amount of carbon to the forest within a year as is contained in an entire mid-sized tree. The apparent paradoxes of individual tree growth increasing with tree size despite declining leaf-level and stand-level productivity can be explained, respectively, by increases in a tree’s total leaf area that outpace declines in productivity per unit of leaf area and, among other factors, age-related reductions in population density. Our results resolve conflicting assumptions about the nature of tree growth, inform efforts to understand and model forest carbon dynamics, and have additional implications for theories of resource allocation and plant senescence.

  1. Land cover and forest formation distributions for St. Kitts, Nevis, St. Eustatius, Grenada and Barbados from decision tree classification of cloud-cleared satellite imagery. Caribbean Journal of Science. 44(2):175-198.

    Science.gov (United States)

    E.H. Helmer; T.A. Kennaway; D.H. Pedreros; M.L. Clark; H. Marcano-Vega; L.L. Tieszen; S.R. Schill; C.M.S. Carrington

    2008-01-01

    Satellite image-based mapping of tropical forests is vital to conservation planning. Standard methods for automated image classification, however, limit classification detail in complex tropical landscapes. In this study, we test an approach to Landsat image interpretation on four islands of the Lesser Antilles, including Grenada and St. Kitts, Nevis and St. Eustatius...

  2. FOREST TREE SPECIES DISTRIBUTION MAPPING USING LANDSAT SATELLITE IMAGERY AND TOPOGRAPHIC VARIABLES WITH THE MAXIMUM ENTROPY METHOD IN MONGOLIA

    Directory of Open Access Journals (Sweden)

    S. H. Chiang

    2016-06-01

    Full Text Available Forest is a very important ecosystem and natural resource for living things. Based on forest inventories, government is able to make decisions to converse, improve and manage forests in a sustainable way. Field work for forestry investigation is difficult and time consuming, because it needs intensive physical labor and the costs are high, especially surveying in remote mountainous regions. A reliable forest inventory can give us a more accurate and timely information to develop new and efficient approaches of forest management. The remote sensing technology has been recently used for forest investigation at a large scale. To produce an informative forest inventory, forest attributes, including tree species are unavoidably required to be considered. In this study the aim is to classify forest tree species in Erdenebulgan County, Huwsgul province in Mongolia, using Maximum Entropy method. The study area is covered by a dense forest which is almost 70% of total territorial extension of Erdenebulgan County and is located in a high mountain region in northern Mongolia. For this study, Landsat satellite imagery and a Digital Elevation Model (DEM were acquired to perform tree species mapping. The forest tree species inventory map was collected from the Forest Division of the Mongolian Ministry of Nature and Environment as training data and also used as ground truth to perform the accuracy assessment of the tree species classification. Landsat images and DEM were processed for maximum entropy modeling, and this study applied the model with two experiments. The first one is to use Landsat surface reflectance for tree species classification; and the second experiment incorporates terrain variables in addition to the Landsat surface reflectance to perform the tree species classification. All experimental results were compared with the tree species inventory to assess the classification accuracy. Results show that the second one which uses Landsat surface

  3. Forest Tree Species Distribution Mapping Using Landsat Satellite Imagery and Topographic Variables with the Maximum Entropy Method in Mongolia

    Science.gov (United States)

    Hao Chiang, Shou; Valdez, Miguel; Chen, Chi-Farn

    2016-06-01

    Forest is a very important ecosystem and natural resource for living things. Based on forest inventories, government is able to make decisions to converse, improve and manage forests in a sustainable way. Field work for forestry investigation is difficult and time consuming, because it needs intensive physical labor and the costs are high, especially surveying in remote mountainous regions. A reliable forest inventory can give us a more accurate and timely information to develop new and efficient approaches of forest management. The remote sensing technology has been recently used for forest investigation at a large scale. To produce an informative forest inventory, forest attributes, including tree species are unavoidably required to be considered. In this study the aim is to classify forest tree species in Erdenebulgan County, Huwsgul province in Mongolia, using Maximum Entropy method. The study area is covered by a dense forest which is almost 70% of total territorial extension of Erdenebulgan County and is located in a high mountain region in northern Mongolia. For this study, Landsat satellite imagery and a Digital Elevation Model (DEM) were acquired to perform tree species mapping. The forest tree species inventory map was collected from the Forest Division of the Mongolian Ministry of Nature and Environment as training data and also used as ground truth to perform the accuracy assessment of the tree species classification. Landsat images and DEM were processed for maximum entropy modeling, and this study applied the model with two experiments. The first one is to use Landsat surface reflectance for tree species classification; and the second experiment incorporates terrain variables in addition to the Landsat surface reflectance to perform the tree species classification. All experimental results were compared with the tree species inventory to assess the classification accuracy. Results show that the second one which uses Landsat surface reflectance coupled

  4. Multiquarks and Steiner trees

    International Nuclear Information System (INIS)

    Richard, Jean-Marc

    2010-01-01

    A brief review review is presented of models tentatively leading to stable multiquarks. A new attempt is presented, based on a Steiner-tree model of confinement, which is inspired by by QCD. It leads to more attraction than the empirical colour-additive model used in earlier multiquark calculations, and predict several multiquark states in configurations with different flavours.

  5. On Tree-Based Phylogenetic Networks.

    Science.gov (United States)

    Zhang, Louxin

    2016-07-01

    A large class of phylogenetic networks can be obtained from trees by the addition of horizontal edges between the tree edges. These networks are called tree-based networks. We present a simple necessary and sufficient condition for tree-based networks and prove that a universal tree-based network exists for any number of taxa that contains as its base every phylogenetic tree on the same set of taxa. This answers two problems posted by Francis and Steel recently. A byproduct is a computer program for generating random binary phylogenetic networks under the uniform distribution model.

  6. Spatial-temporal changes in trees outside forests

    DEFF Research Database (Denmark)

    Novotný, M.; Skaloš, J.; Plieninger, T.

    2017-01-01

    Trees outside forests act as ecologically valuable elements in the rural landscapes of Europe. This study proposes a new classification system for trees outside forest elements based on the shape and size of the patches and their location in fields. Using this system, the study evaluates the spat......Trees outside forests act as ecologically valuable elements in the rural landscapes of Europe. This study proposes a new classification system for trees outside forest elements based on the shape and size of the patches and their location in fields. Using this system, the study evaluates...

  7. Sampling the quality of hardwood trees

    Science.gov (United States)

    Adrian M. Gilbert

    1959-01-01

    Anyone acquainted with the conversion of hardwood trees into wood products knows that timber has a wide range in quality. Some trees will yield better products than others. So, in addition to rate of growth and size, tree values are affected by the quality of products yielded.

  8. Flowering Trees

    Indian Academy of Sciences (India)

    deciduous tree with irregularly-shaped trunk, greyish-white scaly bark and milky latex. Leaves in opposite pairs are simple, oblong and whitish beneath. Flowers that occur in branched inflorescence are white, 2–. 3cm across and fragrant. Calyx is glandular inside. Petals bear numerous linear white scales, the corollary.

  9. Flowering Trees

    Indian Academy of Sciences (India)

    Berrya cordifolia (Willd.) Burret (Syn. B. ammonilla Roxb.) – Trincomali Wood of Tiliaceae is a tall evergreen tree with straight trunk, smooth brownish-grey bark and simple broad leaves. Inflorescence is much branched with white flowers. Stamens are many with golden yellow anthers. Fruit is a capsule with six spreading ...

  10. Flowering Trees

    Indian Academy of Sciences (India)

    Canthium parviflorum Lam. of Rubiaceae is a large shrub that often grows into a small tree with conspicuous spines. Leaves are simple, in pairs at each node and are shiny. Inflorescence is an axillary few-flowered cymose fascicle. Flowers are small (less than 1 cm across), 4-merous and greenish-white. Fruit is ellipsoid ...

  11. Flowering Trees

    Indian Academy of Sciences (India)

    sriranga

    Hook.f. ex Brandis (Yellow. Cadamba) of Rubiaceae is a large and handsome deciduous tree. Leaves are simple, large, orbicular, and drawn abruptly at the apex. Flowers are small, yellowish and aggregate into small spherical heads. The corolla is funnel-shaped with five stamens inserted at its mouth. Fruit is a capsule.

  12. Flowering Trees

    Indian Academy of Sciences (India)

    Celtis tetrandra Roxb. of Ulmaceae is a moderately large handsome deciduous tree with green branchlets and grayish-brown bark. Leaves are simple with three to four secondary veins running parallel to the mid vein. Flowers are solitary, male, female and bisexual and inconspicuous. Fruit is berry-like, small and globose ...

  13. Flowering Trees

    Indian Academy of Sciences (India)

    IAS Admin

    Aglaia elaeagnoidea (A.Juss.) Benth. of Meliaceae is a small-sized evergreen tree of both moist and dry deciduous forests. The leaves are alternate and pinnately compound, terminating in a single leaflet. Leaflets are more or less elliptic with entire margin. Flowers are small on branched inflorescence. Fruit is a globose ...

  14. Flowering Trees

    Indian Academy of Sciences (India)

    user

    Flowers are borne on stiff bunches terminally on short shoots. They are 2-3 cm across, white, sweet-scented with light-brown hairy sepals and many stamens. Loquat fruits are round or pear-shaped, 3-5 cm long and are edible. A native of China, Loquat tree is grown in parks as an ornamental and also for its fruits.

  15. Flowering Trees

    Indian Academy of Sciences (India)

    mid-sized slow-growing evergreen tree with spreading branches that form a dense crown. The bark is smooth, thick, dark and flakes off in large shreds. Leaves are thick, oblong, leathery and bright red when young. The female flowers are drooping and are larger than male flowers. Fruit is large, red in color and velvety.

  16. Flowering Trees

    Indian Academy of Sciences (India)

    Andira inermis (wright) DC. , Dog Almond of Fabaceae is a handsome lofty evergreen tree. Leaves are alternate and pinnately compound with 4–7 pairs of leaflets. Flowers are fragrant and are borne on compact branched inflorescences. Fruit is ellipsoidal one-seeded drupe that is peculiar to members of this family.

  17. Flowering Trees

    Indian Academy of Sciences (India)

    narrow towards base. Flowers are large and attrac- tive, but emit unpleasant foetid smell. They appear in small numbers on erect terminal clusters and open at night. Stamens are numerous, pink or white. Style is slender and long, terminating in a small stigma. Fruit is green, ovoid and indistinctly lobed. Flowering Trees.

  18. Flowering Trees

    Indian Academy of Sciences (India)

    Muntingia calabura L. (Singapore cherry) of. Elaeocarpaceae is a medium size handsome ever- green tree. Leaves are simple and alternate with sticky hairs. Flowers are bisexual, bear numerous stamens, white in colour and arise in the leaf axils. Fruit is a berry, edible with several small seeds embedded in a fleshy pulp ...

  19. ~{owering 'Trees

    Indian Academy of Sciences (India)

    . Stamens are fused into a purple staminal tube that is toothed. Fruit is about 0.5 in. across, nearly globose, generally 5-seeded, green but yellow when ripe, quite smooth at first but wrinkled in drying, remaining long on the tree ajier ripening.

  20. Tree Mortality

    Science.gov (United States)

    Mark J. Ambrose

    2012-01-01

    Tree mortality is a natural process in all forest ecosystems. However, extremely high mortality also can be an indicator of forest health issues. On a regional scale, high mortality levels may indicate widespread insect or disease problems. High mortality may also occur if a large proportion of the forest in a particular region is made up of older, senescent stands....

  1. Flowering Trees

    Indian Academy of Sciences (India)

    Guaiacum officinale L. (LIGNUM-VITAE) of Zygophyllaceae is a dense-crowned, squat, knobbly, rough and twisted medium-sized ev- ergreen tree with mottled bark. The wood is very hard and resinous. Leaves are compound. The leaflets are smooth, leathery, ovate-ellipti- cal and appear in two pairs. Flowers (about 1.5.

  2. DupTree: a program for large-scale phylogenetic analyses using gene tree parsimony.

    Science.gov (United States)

    Wehe, André; Bansal, Mukul S; Burleigh, J Gordon; Eulenstein, Oliver

    2008-07-01

    DupTree is a new software program for inferring rooted species trees from collections of gene trees using the gene tree parsimony approach. The program implements a novel algorithm that significantly improves upon the run time of standard search heuristics for gene tree parsimony, and enables the first truly genome-scale phylogenetic analyses. In addition, DupTree allows users to examine alternate rootings and to weight the reconciliation costs for gene trees. DupTree is an open source project written in C++. DupTree for Mac OS X, Windows, and Linux along with a sample dataset and an on-line manual are available at http://genome.cs.iastate.edu/CBL/DupTree

  3. The transposition distance for phylogenetic trees

    OpenAIRE

    Rossello, Francesc; Valiente, Gabriel

    2006-01-01

    The search for similarity and dissimilarity measures on phylogenetic trees has been motivated by the computation of consensus trees, the search by similarity in phylogenetic databases, and the assessment of clustering results in bioinformatics. The transposition distance for fully resolved phylogenetic trees is a recent addition to the extensive collection of available metrics for comparing phylogenetic trees. In this paper, we generalize the transposition distance from fully resolved to arbi...

  4. Classification for Inconsistent Decision Tables

    KAUST Repository

    Azad, Mohammad; Moshkov, Mikhail

    2016-01-01

    Decision trees have been used widely to discover patterns from consistent data set. But if the data set is inconsistent, where there are groups of examples with equal values of conditional attributes but different labels, then to discover the essential patterns or knowledge from the data set is challenging. Three approaches (generalized, most common and many-valued decision) have been considered to handle such inconsistency. The decision tree model has been used to compare the classification results among three approaches. Many-valued decision approach outperforms other approaches, and M_ws_entM greedy algorithm gives faster and better prediction accuracy.

  5. Classification for Inconsistent Decision Tables

    KAUST Repository

    Azad, Mohammad

    2016-09-28

    Decision trees have been used widely to discover patterns from consistent data set. But if the data set is inconsistent, where there are groups of examples with equal values of conditional attributes but different labels, then to discover the essential patterns or knowledge from the data set is challenging. Three approaches (generalized, most common and many-valued decision) have been considered to handle such inconsistency. The decision tree model has been used to compare the classification results among three approaches. Many-valued decision approach outperforms other approaches, and M_ws_entM greedy algorithm gives faster and better prediction accuracy.

  6. Proactive data mining with decision trees

    CERN Document Server

    Dahan, Haim; Rokach, Lior; Maimon, Oded

    2014-01-01

    This book explores a proactive and domain-driven method to classification tasks. This novel proactive approach to data mining not only induces a model for predicting or explaining a phenomenon, but also utilizes specific problem/domain knowledge to suggest specific actions to achieve optimal changes in the value of the target attribute. In particular, the authors suggest a specific implementation of the domain-driven proactive approach for classification trees. The book centers on the core idea of moving observations from one branch of the tree to another. It introduces a novel splitting crite

  7. Combining evolutionary algorithms with oblique decision trees to detect bent-double galaxies

    Science.gov (United States)

    Cantu-Paz, Erick; Kamath, Chandrika

    2000-10-01

    Decision tress have long been popular in classification as they use simple and easy-to-understand tests at each node. Most variants of decision trees test a single attribute at a node, leading to axis- parallel trees, where the test results in a hyperplane which is parallel to one of the dimensions in the attribute space. These trees can be rather large and inaccurate in cases where the concept to be learned is best approximated by oblique hyperplanes. In such cases, it may be more appropriate to use an oblique decision tree, where the decision at each node is a linear combination of the attributes. Oblique decision trees have not gained wide popularity in part due to the complexity of constructing good oblique splits and the tendency of existing splitting algorithms to get stuck in local minima. Several alternatives have been proposed to handle these problems including randomization in conjunction wiht deterministic hill-climbing and the use of simulated annealing. In this paper, we use evolutionary algorithms (EAs) to determine the split. EAs are well suited for this problem because of their global search properties, their tolerance to noisy fitness evaluations, and their scalability to large dimensional search spaces. We demonstrate our technique on a synthetic data set, and then we apply it to a practical problem from astronomy, namely, the classification of galaxies with a bent-double morphology. In addition, we describe our experiences with several split evaluation criteria. Our results suggest that, in some cases, the evolutionary approach is faster and more accurate than existing oblique decision tree algorithms. However, for our astronomical data, the accuracy is not significantly different than the axis-parallel trees.

  8. Modelling tree biomasses in Finland

    Energy Technology Data Exchange (ETDEWEB)

    Repola, J.

    2013-06-01

    Biomass equations for above- and below-ground tree components of Scots pine (Pinus sylvestris L), Norway spruce (Picea abies [L.] Karst) and birch (Betula pendula Roth and Betula pubescens Ehrh.) were compiled using empirical material from a total of 102 stands. These stands (44 Scots pine, 34 Norway spruce and 24 birch stands) were located mainly on mineral soil sites representing a large part of Finland. The biomass models were based on data measured from 1648 sample trees, comprising 908 pine, 613 spruce and 127 birch trees. Biomass equations were derived for the total above-ground biomass and for the individual tree components: stem wood, stem bark, living and dead branches, needles, stump, and roots, as dependent variables. Three multivariate models with different numbers of independent variables for above-ground biomass and one for below-ground biomass were constructed. Variables that are normally measured in forest inventories were used as independent variables. The simplest model formulations, multivariate models (1) were mainly based on tree diameter and height as independent variables. In more elaborated multivariate models, (2) and (3), additional commonly measured tree variables such as age, crown length, bark thickness and radial growth rate were added. Tree biomass modelling includes consecutive phases, which cause unreliability in the prediction of biomass. First, biomasses of sample trees should be determined reliably to decrease the statistical errors caused by sub-sampling. In this study, methods to improve the accuracy of stem biomass estimates of the sample trees were developed. In addition, the reliability of the method applied to estimate sample-tree crown biomass was tested, and no systematic error was detected. Second, the whole information content of data should be utilized in order to achieve reliable parameter estimates and applicable and flexible model structure. In the modelling approach, the basic assumption was that the biomasses of

  9. Addition of ash and PK with or without N on a peatland in southern Sweden. Effects on tree growth and needle element concentrate; Tillfoersel av aska och PK med eller utan N paa en torvmark i soedra Sverige. Effekter paa traedtillvaext och aemneshalter i barr

    Energy Technology Data Exchange (ETDEWEB)

    Sikstroem, Ulf (Forestry Research Inst. of Sweden, Uppsala (Sweden))

    2008-04-15

    In Sweden, about 1.3 million tonnes of ash are produced annually. Out of that amount, 150 000 - 300 000 tonnes have been estimated to originate from forest fuels, i.e. ashes that potentially can be brought back to the forest. Apart from being a compensatory measure after intensive forest harvest (e.g. including tops and branches), the ash may be used to increase the tree growth on peat soils (ash fertilization). On peat soils, tree growth is generally increased after addition of ash or phosphorous (P) and potassium (K) fertilizers. However, in some cases also nitrogen (N) addition is required. On sparsely stocked mires, fertilization may promote the regeneration as well as increase the growth of the trees. Sufficient drainage and supply of plant-available N are some prerequisites for increasing tree growth by PK-addition. In 1982, Skogforsk established a field experiment (168 Perstorp) located in the province of Scania in a sparsely stocked Pinus Sylvestris (L.) sapling stand on a ditched mire where different nutrient regimes were tested. The experiment had a randomized block design including four blocks and seven treatments. Ash and different dozes of P (raw phosphate) and K (potassium chloride), with or without simultaneous N addition, were included as well as un untreated control. The aim of the present study was to evaluate the effects on tree growth and needle elemental concentrations 26 years after treatment. The addition of similar dozes of P (approx. 40 kg P/ha), as ash (2.5 tonnes/ha) or as PK-fertilizer rendered similar growth responses. The increase in stem-wood growth was in the order of 1,6 - 1,9 m3/ha/yr during the 26-year period. The N-addition had no additional effect. On the control plots, the growth was more or less negligible (approx. 0,04 m3sk/ha/yr). On average, the high dozes of raw-phosphate and potassium chloride (40 kg P/ha and 80 kg K/ha) gave a higher growth increase than the low dozes (20 kg P/ha and 40 kg K/ha), although this effect was

  10. An edit script for taxonomic classifications

    Directory of Open Access Journals (Sweden)

    Valiente Gabriel

    2005-08-01

    Full Text Available Abstract Background The NCBI taxonomy provides one of the most powerful ways to navigate sequence data bases but currently users are forced to formulate queries according to a single taxonomic classification. Given that there is not universal agreement on the classification of organisms, providing a single classification places constraints on the questions biologists can ask. However, maintaining multiple classifications is burdensome in the face of a constantly growing NCBI classification. Results In this paper, we present a solution to the problem of generating modifications of the NCBI taxonomy, based on the computation of an edit script that summarises the differences between two classification trees. Our algorithms find the shortest possible edit script based on the identification of all shared subtrees, and only take time quasi linear in the size of the trees because classification trees have unique node labels. Conclusion These algorithms have been recently implemented, and the software is freely available for download from http://darwin.zoology.gla.ac.uk/~rpage/forest/.

  11. MixtureTree annotator: a program for automatic colorization and visual annotation of MixtureTree.

    Directory of Open Access Journals (Sweden)

    Shu-Chuan Chen

    Full Text Available The MixtureTree Annotator, written in JAVA, allows the user to automatically color any phylogenetic tree in Newick format generated from any phylogeny reconstruction program and output the Nexus file. By providing the ability to automatically color the tree by sequence name, the MixtureTree Annotator provides a unique advantage over any other programs which perform a similar function. In addition, the MixtureTree Annotator is the only package that can efficiently annotate the output produced by MixtureTree with mutation information and coalescent time information. In order to visualize the resulting output file, a modified version of FigTree is used. Certain popular methods, which lack good built-in visualization tools, for example, MEGA, Mesquite, PHY-FI, TreeView, treeGraph and Geneious, may give results with human errors due to either manually adding colors to each node or with other limitations, for example only using color based on a number, such as branch length, or by taxonomy. In addition to allowing the user to automatically color any given Newick tree by sequence name, the MixtureTree Annotator is the only method that allows the user to automatically annotate the resulting tree created by the MixtureTree program. The MixtureTree Annotator is fast and easy-to-use, while still allowing the user full control over the coloring and annotating process.

  12. Surface tree languages and parallel derivation trees

    NARCIS (Netherlands)

    Engelfriet, Joost

    1976-01-01

    The surface tree languages obtained by top-down finite state transformation of monadic trees are exactly the frontier-preserving homomorphic images of sets of derivation trees of ETOL systems. The corresponding class of tree transformation languages is therefore equal to the class of ETOL languages.

  13. A Metric on Phylogenetic Tree Shapes.

    Science.gov (United States)

    Colijn, C; Plazzotta, G

    2018-01-01

    The shapes of evolutionary trees are influenced by the nature of the evolutionary process but comparisons of trees from different processes are hindered by the challenge of completely describing tree shape. We present a full characterization of the shapes of rooted branching trees in a form that lends itself to natural tree comparisons. We use this characterization to define a metric, in the sense of a true distance function, on tree shapes. The metric distinguishes trees from random models known to produce different tree shapes. It separates trees derived from tropical versus USA influenza A sequences, which reflect the differing epidemiology of tropical and seasonal flu. We describe several metrics based on the same core characterization, and illustrate how to extend the metric to incorporate trees' branch lengths or other features such as overall imbalance. Our approach allows us to construct addition and multiplication on trees, and to create a convex metric on tree shapes which formally allows computation of average tree shapes. © The Author(s) 2017. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.

  14. Decision Tree Technique for Particle Identification

    International Nuclear Information System (INIS)

    Quiller, Ryan

    2003-01-01

    Particle identification based on measurements such as the Cerenkov angle, momentum, and the rate of energy loss per unit distance (-dE/dx) is fundamental to the BaBar detector for particle physics experiments. It is particularly important to separate the charged forms of kaons and pions. Currently, the Neural Net, an algorithm based on mapping input variables to an output variable using hidden variables as intermediaries, is one of the primary tools used for identification. In this study, a decision tree classification technique implemented in the computer program, CART, was investigated and compared to the Neural Net over the range of momenta, 0.25 GeV/c to 5.0 GeV/c. For a given subinterval of momentum, three decision trees were made using different sets of input variables. The sensitivity and specificity were calculated for varying kaon acceptance thresholds. This data was used to plot Receiver Operating Characteristic curves (ROC curves) to compare the performance of the classification methods. Also, input variables used in constructing the decision trees were analyzed. It was found that the Neural Net was a significant contributor to decision trees using dE/dx and the Cerenkov angle as inputs. Furthermore, the Neural Net had poorer performance than the decision tree technique, but tended to improve decision tree performance when used as an input variable. These results suggest that the decision tree technique using Neural Net input may possibly increase accuracy of particle identification in BaBar

  15. Research on Classification of Chinese Text Data Based on SVM

    Science.gov (United States)

    Lin, Yuan; Yu, Hongzhi; Wan, Fucheng; Xu, Tao

    2017-09-01

    Data Mining has important application value in today’s industry and academia. Text classification is a very important technology in data mining. At present, there are many mature algorithms for text classification. KNN, NB, AB, SVM, decision tree and other classification methods all show good classification performance. Support Vector Machine’ (SVM) classification method is a good classifier in machine learning research. This paper will study the classification effect based on the SVM method in the Chinese text data, and use the support vector machine method in the chinese text to achieve the classify chinese text, and to able to combination of academia and practical application.

  16. Los Angeles 1-Million tree canopy cover assessment

    Science.gov (United States)

    Gregory E. McPherson; James R. Simpson; Qingfu Xiao; Wu Chunxia

    2008-01-01

    The Million Trees LA initiative intends to chart a course for sustainable growth through planting and stewardship of trees. The purpose of this study was to measure Los Angeles's existing tree canopy cover (TCC), determine if space exists for 1 million additional trees, and estimate future benefits from the planting. High resolution QuickBird remote sensing data,...

  17. AUTOMATIC TREE-CROWN DETECTION IN CHALLENGING SCENARIOS

    Directory of Open Access Journals (Sweden)

    D. Bulatov

    2016-06-01

    Full Text Available In this paper, a new procedure for individual tree detection and modeling is presented. The input of this procedure consists of a normalized digital surface model NDSM, and a possibly error-prone classification result. The procedure is modular so that the functionality, the advantages and the disadvantages for every single module will be explained. The most important technical contributions of the paper are: Employing watershed transformation combined with classification results, applying hotspots detectors for identifying treetops in groups of trees, and correcting NDSM by detecting and geometric reconstruction of small anomalies, such as earth walls. Two minor contributions are made up by a detailed literature research on available methods for individual tree detection and estimation of tree-crowns for clearly identified trees in order to reduce arbitrariness by assigning trees to one of the few types in the output model.

  18. Trees are good, but…

    Science.gov (United States)

    E.G. McPherson; F. Ferrini

    2010-01-01

    We know that “trees are good,” and most people believe this to be true. But if this is so, why are so many trees neglected, and so many tree wells empty? An individual’s attitude toward trees may result from their firsthand encounters with specific trees. Understanding how attitudes about trees are shaped, particularly aversion to trees, is critical to the business of...

  19. Atlas of United States Trees, Volume 2: Alaska Trees and Common Shrubs.

    Science.gov (United States)

    Viereck, Leslie A.; Little, Elbert L., Jr.

    This volume is the second in a series of atlases describing the natural distribution or range of native tree species in the United States. The 82 species maps include 32 of trees in Alaska, 6 of shrubs rarely reaching tree size, and 44 more of common shrubs. More than 20 additional maps summarize environmental factors and furnish general…

  20. Classification in Medical Imaging

    DEFF Research Database (Denmark)

    Chen, Chen

    Classification is extensively used in the context of medical image analysis for the purpose of diagnosis or prognosis. In order to classify image content correctly, one needs to extract efficient features with discriminative properties and build classifiers based on these features. In addition...... on characterizing human faces and emphysema disease in lung CT images....

  1. Vessel-guided airway tree segmentation

    DEFF Research Database (Denmark)

    Lo, Pechin Chien Pau; Sporring, Jon; Ashraf, Haseem

    2010-01-01

    This paper presents a method for airway tree segmentation that uses a combination of a trained airway appearance model, vessel and airway orientation information, and region growing. We propose a voxel classification approach for the appearance model, which uses a classifier that is trained to di...

  2. Multiscale Vessel-guided Airway Tree Segmentation

    DEFF Research Database (Denmark)

    Lo, Pechin Chien Pau; Sporring, Jon; de Bruijne, Marleen

    2009-01-01

    This paper presents a method for airway tree segmentation that uses a combination of a trained airway appearance model, vessel and airway orientation information, and region growing. The method uses a voxel classification based appearance model, which involves the use of a classifier that is trai...

  3. Binary classification of dyslipidemia from the waist-to-hip ratio and body mass index: a comparison of linear, logistic, and CART models

    Directory of Open Access Journals (Sweden)

    Paccaud Fred

    2004-04-01

    Full Text Available Abstract Background We sought to improve upon previously published statistical modeling strategies for binary classification of dyslipidemia for general population screening purposes based on the waist-to-hip circumference ratio and body mass index anthropometric measurements. Methods Study subjects were participants in WHO-MONICA population-based surveys conducted in two Swiss regions. Outcome variables were based on the total serum cholesterol to high density lipoprotein cholesterol ratio. The other potential predictor variables were gender, age, current cigarette smoking, and hypertension. The models investigated were: (i linear regression; (ii logistic classification; (iii regression trees; (iv classification trees (iii and iv are collectively known as "CART". Binary classification performance of the region-specific models was externally validated by classifying the subjects from the other region. Results Waist-to-hip circumference ratio and body mass index remained modest predictors of dyslipidemia. Correct classification rates for all models were 60–80%, with marked gender differences. Gender-specific models provided only small gains in classification. The external validations provided assurance about the stability of the models. Conclusions There were no striking differences between either the algebraic (i, ii vs. non-algebraic (iii, iv, or the regression (i, iii vs. classification (ii, iv modeling approaches. Anticipated advantages of the CART vs. simple additive linear and logistic models were less than expected in this particular application with a relatively small set of predictor variables. CART models may be more useful when considering main effects and interactions between larger sets of predictor variables.

  4. Tissue Classification

    DEFF Research Database (Denmark)

    Van Leemput, Koen; Puonti, Oula

    2015-01-01

    Computational methods for automatically segmenting magnetic resonance images of the brain have seen tremendous advances in recent years. So-called tissue classification techniques, aimed at extracting the three main brain tissue classes (white matter, gray matter, and cerebrospinal fluid), are now...... well established. In their simplest form, these methods classify voxels independently based on their intensity alone, although much more sophisticated models are typically used in practice. This article aims to give an overview of often-used computational techniques for brain tissue classification...

  5. Automatic Hierarchical Color Image Classification

    Directory of Open Access Journals (Sweden)

    Jing Huang

    2003-02-01

    Full Text Available Organizing images into semantic categories can be extremely useful for content-based image retrieval and image annotation. Grouping images into semantic classes is a difficult problem, however. Image classification attempts to solve this hard problem by using low-level image features. In this paper, we propose a method for hierarchical classification of images via supervised learning. This scheme relies on using a good low-level feature and subsequently performing feature-space reconfiguration using singular value decomposition to reduce noise and dimensionality. We use the training data to obtain a hierarchical classification tree that can be used to categorize new images. Our experimental results suggest that this scheme not only performs better than standard nearest-neighbor techniques, but also has both storage and computational advantages.

  6. In Vivo Pattern Classification of Ingestive Behavior in Ruminants Using FBG Sensors and Machine Learning

    Directory of Open Access Journals (Sweden)

    Vinicius Pegorini

    2015-11-01

    Full Text Available Pattern classification of ingestive behavior in grazing animals has extreme importance in studies related to animal nutrition, growth and health. In this paper, a system to classify chewing patterns of ruminants in in vivo experiments is developed. The proposal is based on data collected by optical fiber Bragg grating sensors (FBG that are processed by machine learning techniques. The FBG sensors measure the biomechanical strain during jaw movements, and a decision tree is responsible for the classification of the associated chewing pattern. In this study, patterns associated with food intake of dietary supplement, hay and ryegrass were considered. Additionally, two other important events for ingestive behavior were monitored: rumination and idleness. Experimental results show that the proposed approach for pattern classification is capable of differentiating the five patterns involved in the chewing process with an overall accuracy of 94%.

  7. In Vivo Pattern Classification of Ingestive Behavior in Ruminants Using FBG Sensors and Machine Learning.

    Science.gov (United States)

    Pegorini, Vinicius; Karam, Leandro Zen; Pitta, Christiano Santos Rocha; Cardoso, Rafael; da Silva, Jean Carlos Cardozo; Kalinowski, Hypolito José; Ribeiro, Richardson; Bertotti, Fábio Luiz; Assmann, Tangriani Simioni

    2015-11-11

    Pattern classification of ingestive behavior in grazing animals has extreme importance in studies related to animal nutrition, growth and health. In this paper, a system to classify chewing patterns of ruminants in in vivo experiments is developed. The proposal is based on data collected by optical fiber Bragg grating sensors (FBG) that are processed by machine learning techniques. The FBG sensors measure the biomechanical strain during jaw movements, and a decision tree is responsible for the classification of the associated chewing pattern. In this study, patterns associated with food intake of dietary supplement, hay and ryegrass were considered. Additionally, two other important events for ingestive behavior were monitored: rumination and idleness. Experimental results show that the proposed approach for pattern classification is capable of differentiating the five patterns involved in the chewing process with an overall accuracy of 94%.

  8. Coalescent methods for estimating phylogenetic trees.

    Science.gov (United States)

    Liu, Liang; Yu, Lili; Kubatko, Laura; Pearl, Dennis K; Edwards, Scott V

    2009-10-01

    We review recent models to estimate phylogenetic trees under the multispecies coalescent. Although the distinction between gene trees and species trees has come to the fore of phylogenetics, only recently have methods been developed that explicitly estimate species trees. Of the several factors that can cause gene tree heterogeneity and discordance with the species tree, deep coalescence due to random genetic drift in branches of the species tree has been modeled most thoroughly. Bayesian approaches to estimating species trees utilizes two likelihood functions, one of which has been widely used in traditional phylogenetics and involves the model of nucleotide substitution, and the second of which is less familiar to phylogeneticists and involves the probability distribution of gene trees given a species tree. Other recent parametric and nonparametric methods for estimating species trees involve parsimony criteria, summary statistics, supertree and consensus methods. Species tree approaches are an appropriate goal for systematics, appear to work well in some cases where concatenation can be misleading, and suggest that sampling many independent loci will be paramount. Such methods can also be challenging to implement because of the complexity of the models and computational time. In addition, further elaboration of the simplest of coalescent models will be required to incorporate commonly known issues such as deviation from the molecular clock, gene flow and other genetic forces.

  9. Modular tree automata

    DEFF Research Database (Denmark)

    Bahr, Patrick

    2012-01-01

    Tree automata are traditionally used to study properties of tree languages and tree transformations. In this paper, we consider tree automata as the basis for modular and extensible recursion schemes. We show, using well-known techniques, how to derive from standard tree automata highly modular...

  10. Using tree diversity to compare phylogenetic heuristics.

    Science.gov (United States)

    Sul, Seung-Jin; Matthews, Suzanne; Williams, Tiffani L

    2009-04-29

    Evolutionary trees are family trees that represent the relationships between a group of organisms. Phylogenetic heuristics are used to search stochastically for the best-scoring trees in tree space. Given that better tree scores are believed to be better approximations of the true phylogeny, traditional evaluation techniques have used tree scores to determine the heuristics that find the best scores in the fastest time. We develop new techniques to evaluate phylogenetic heuristics based on both tree scores and topologies to compare Pauprat and Rec-I-DCM3, two popular Maximum Parsimony search algorithms. Our results show that although Pauprat and Rec-I-DCM3 find the trees with the same best scores, topologically these trees are quite different. Furthermore, the Rec-I-DCM3 trees cluster distinctly from the Pauprat trees. In addition to our heatmap visualizations of using parsimony scores and the Robinson-Foulds distance to compare best-scoring trees found by the two heuristics, we also develop entropy-based methods to show the diversity of the trees found. Overall, Pauprat identifies more diverse trees than Rec-I-DCM3. Overall, our work shows that there is value to comparing heuristics beyond the parsimony scores that they find. Pauprat is a slower heuristic than Rec-I-DCM3. However, our work shows that there is tremendous value in using Pauprat to reconstruct trees-especially since it finds identical scoring but topologically distinct trees. Hence, instead of discounting Pauprat, effort should go in improving its implementation. Ultimately, improved performance measures lead to better phylogenetic heuristics and will result in better approximations of the true evolutionary history of the organisms of interest.

  11. Extensions and applications of ensemble-of-trees methods in machine learning

    Science.gov (United States)

    Bleich, Justin

    Ensemble-of-trees algorithms have emerged to the forefront of machine learning due to their ability to generate high forecasting accuracy for a wide array of regression and classification problems. Classic ensemble methodologies such as random forests (RF) and stochastic gradient boosting (SGB) rely on algorithmic procedures to generate fits to data. In contrast, more recent ensemble techniques such as Bayesian Additive Regression Trees (BART) and Dynamic Trees (DT) focus on an underlying Bayesian probability model to generate the fits. These new probability model-based approaches show much promise versus their algorithmic counterparts, but also offer substantial room for improvement. The first part of this thesis focuses on methodological advances for ensemble-of-trees techniques with an emphasis on the more recent Bayesian approaches. In particular, we focus on extensions of BART in four distinct ways. First, we develop a more robust implementation of BART for both research and application. We then develop a principled approach to variable selection for BART as well as the ability to naturally incorporate prior information on important covariates into the algorithm. Next, we propose a method for handling missing data that relies on the recursive structure of decision trees and does not require imputation. Last, we relax the assumption of homoskedasticity in the BART model to allow for parametric modeling of heteroskedasticity. The second part of this thesis returns to the classic algorithmic approaches in the context of classification problems with asymmetric costs of forecasting errors. First we consider the performance of RF and SGB more broadly and demonstrate its superiority to logistic regression for applications in criminology with asymmetric costs. Next, we use RF to forecast unplanned hospital readmissions upon patient discharge with asymmetric costs taken into account. Finally, we explore the construction of stable decision trees for forecasts of

  12. Application of Object Based Classification and High Resolution Satellite Imagery for Savanna Ecosystem Analysis

    Directory of Open Access Journals (Sweden)

    Jane Southworth

    2010-12-01

    Full Text Available Savanna ecosystems are an important component of dryland regions and yet are exceedingly difficult to study using satellite imagery. Savannas are composed are varying amounts of trees, shrubs and grasses and typically traditional classification schemes or vegetation indices cannot differentiate across class type. This research utilizes object based classification (OBC for a region in Namibia, using IKONOS imagery, to help differentiate tree canopies and therefore woodland savanna, from shrub or grasslands. The methodology involved the identification and isolation of tree canopies within the imagery and the creation of tree polygon layers had an overall accuracy of 84%. In addition, the results were scaled up to a corresponding Landsat image of the same region, and the OBC results compared to corresponding pixel values of NDVI. The results were not compelling, indicating once more the problems of these traditional image analysis techniques for savanna ecosystems. Overall, the use of the OBC holds great promise for this ecosystem and could be utilized more frequently in studies of vegetation structure.

  13. Nonbinary Tree-Based Phylogenetic Networks.

    Science.gov (United States)

    Jetten, Laura; van Iersel, Leo

    2018-01-01

    Rooted phylogenetic networks are used to describe evolutionary histories that contain non-treelike evolutionary events such as hybridization and horizontal gene transfer. In some cases, such histories can be described by a phylogenetic base-tree with additional linking arcs, which can, for example, represent gene transfer events. Such phylogenetic networks are called tree-based. Here, we consider two possible generalizations of this concept to nonbinary networks, which we call tree-based and strictly-tree-based nonbinary phylogenetic networks. We give simple graph-theoretic characterizations of tree-based and strictly-tree-based nonbinary phylogenetic networks. Moreover, we show for each of these two classes that it can be decided in polynomial time whether a given network is contained in the class. Our approach also provides a new view on tree-based binary phylogenetic networks. Finally, we discuss two examples of nonbinary phylogenetic networks in biology and show how our results can be applied to them.

  14. City of Pittsburgh Trees

    Data.gov (United States)

    Allegheny County / City of Pittsburgh / Western PA Regional Data Center — Trees cared for and managed by the City of Pittsburgh Department of Public Works Forestry Division. Tree Benefits are calculated using the National Tree Benefit...

  15. Distribution of cavity trees in midwestern old-growth and second-growth forests

    Science.gov (United States)

    Zhaofei Fan; Stephen R. Shifley; Martin A. Spetich; Frank R. Thompson; David R. Larsen

    2003-01-01

    We used classification and regression tree analysis to determine the primary variables associated with the occurrence of cavity trees and the hierarchical structure among those variables. We applied that information to develop logistic models predicting cavity tree probability as a function of diameter, species group, and decay class. Inventories of cavity abundance in...

  16. Transporter Classification Database (TCDB)

    Data.gov (United States)

    U.S. Department of Health & Human Services — The Transporter Classification Database details a comprehensive classification system for membrane transport proteins known as the Transporter Classification (TC)...

  17. Non-phytoseiid Mesostigmata within citrus orchards in Florida: species distribution, relative and seasonal abundance within trees, associated vines and ground cover plants and additional collection records of mites in citrus orchards.

    Science.gov (United States)

    Childers, Carl C; Ueckermann, Eduard A

    2015-03-01

    Seven citrus orchards on reduced- to no-pesticide spray programs in central and south central Florida were sampled for non-phytoseiid mesostigmatid mites. Inner and outer canopy leaves, fruits, twigs and trunk scrapings were sampled monthly between August 1994 and January 1996. Open flowers were sampled in March from five of the sites. A total of 431 samples from one or more of 82 vine or ground cover plants were sampled monthly in five of the seven orchards. Two of the seven orchards (Mixon I and II) were on full herbicide programs and vines and ground cover plants were absent. A total of 2,655 mites (26 species) within the families: Ascidae, Blattisociidae, Laelapidae, Macrochelidae, Melicharidae, Pachylaelapidae and Parasitidae were identified. A total of 685 mites in the genus Asca (nine species: family Ascidae) were collected from within tree samples, 79 from vine or ground cover plants. Six species of Blattisociidae were collected: Aceodromus convolvuli, Blattisocius dentriticus, B. keegani, Cheiroseius sp. near jamaicensis, Lasioseius athiashenriotae and L. dentatus. A total of 485 Blattisociidae were collected from within tree samples compared with 167 from vine or ground cover plants. Low numbers of Laelapidae and Macrochelidae were collected from within tree samples. One Zygoseius furciger (Pachylaelapidae) was collected from Eleusine indica. Four species of Melicharidae were identified from 34 mites collected from within tree samples and 1,190 from vine or ground cover plants: Proctolaelaps lobatus was the most abundant species with 1,177 specimens collected from seven ground cover plants. One Phorytocarpais fimetorum (Parasitidae) was collected from inner leaves and four from twigs. Species of Ascidae, Blattisociidae, Melicharidae, Laelapidae and Pachylaelapidae were collected from 31 of the 82 vine or ground cover plants sampled, representing only a small fraction of the total number of Phytoseiidae collected from the same plants. Including the

  18. On Internet Traffic Classification: A Two-Phased Machine Learning Approach

    Directory of Open Access Journals (Sweden)

    Taimur Bakhshi

    2016-01-01

    Full Text Available Traffic classification utilizing flow measurement enables operators to perform essential network management. Flow accounting methods such as NetFlow are, however, considered inadequate for classification requiring additional packet-level information, host behaviour analysis, and specialized hardware limiting their practical adoption. This paper aims to overcome these challenges by proposing two-phased machine learning classification mechanism with NetFlow as input. The individual flow classes are derived per application through k-means and are further used to train a C5.0 decision tree classifier. As part of validation, the initial unsupervised phase used flow records of fifteen popular Internet applications that were collected and independently subjected to k-means clustering to determine unique flow classes generated per application. The derived flow classes were afterwards used to train and test a supervised C5.0 based decision tree. The resulting classifier reported an average accuracy of 92.37% on approximately 3.4 million test cases increasing to 96.67% with adaptive boosting. The classifier specificity factor which accounted for differentiating content specific from supplementary flows ranged between 98.37% and 99.57%. Furthermore, the computational performance and accuracy of the proposed methodology in comparison with similar machine learning techniques lead us to recommend its extension to other applications in achieving highly granular real-time traffic classification.

  19. adabag: An R Package for Classification with Boosting and Bagging

    Directory of Open Access Journals (Sweden)

    Esteban Alfaro

    2013-09-01

    Full Text Available Boosting and bagging are two widely used ensemble methods for classification. Their common goal is to improve the accuracy of a classifier combining single classifiers which are slightly better than random guessing. Among the family of boosting algorithms, AdaBoost (adaptive boosting is the best known, although it is suitable only for dichotomous tasks. AdaBoost.M1 and SAMME (stagewise additive modeling using a multi-class exponential loss function are two easy and natural extensions to the general case of two or more classes. In this paper, the adabag R package is introduced. This version implements AdaBoost.M1, SAMME and bagging algorithms with classification trees as base classifiers. Once the ensembles have been trained, they can be used to predict the class of new samples. The accuracy of these classifiers can be estimated in a separated data set or through cross validation. Moreover, the evolution of the error as the ensemble grows can be analysed and the ensemble can be pruned. In addition, the margin in the class prediction and the probability of each class for the observations can be calculated. Finally, several classic examples in classification literature are shown to illustrate the use of this package.

  20. I - Multivariate Classification and Machine Learning in HEP

    CERN Multimedia

    CERN. Geneva

    2016-01-01

    Traditional multivariate methods for classification (Stochastic Gradient Boosted Decision Trees and Multi-Layer Perceptrons) are explained in theory and practise using examples from HEP. General aspects of multivariate classification are discussed, in particular different regularisation techniques. Afterwards, data-driven techniques are introduced and compared to MC-based methods.

  1. Integrating Human and Machine Intelligence in Galaxy Morphology Classification Tasks

    Science.gov (United States)

    Beck, Melanie Renee

    The large flood of data flowing from observatories presents significant challenges to astronomy and cosmology--challenges that will only be magnified by projects currently under development. Growth in both volume and velocity of astrophysics data is accelerating: whereas the Sloan Digital Sky Survey (SDSS) has produced 60 terabytes of data in the last decade, the upcoming Large Synoptic Survey Telescope (LSST) plans to register 30 terabytes per night starting in the year 2020. Additionally, the Euclid Mission will acquire imaging for 5 x 107 resolvable galaxies. The field of galaxy evolution faces a particularly challenging future as complete understanding often cannot be reached without analysis of detailed morphological galaxy features. Historically, morphological analysis has relied on visual classification by astronomers, accessing the human brains capacity for advanced pattern recognition. However, this accurate but inefficient method falters when confronted with many thousands (or millions) of images. In the SDSS era, efforts to automate morphological classifications of galaxies (e.g., Conselice et al., 2000; Lotz et al., 2004) are reasonably successful and can distinguish between elliptical and disk-dominated galaxies with accuracies of 80%. While this is statistically very useful, a key problem with these methods is that they often cannot say which 80% of their samples are accurate. Furthermore, when confronted with the more complex task of identifying key substructure within galaxies, automated classification algorithms begin to fail. The Galaxy Zoo project uses a highly innovative approach to solving the scalability problem of visual classification. Displaying images of SDSS galaxies to volunteers via a simple and engaging web interface, www.galaxyzoo.org asks people to classify images by eye. Within the first year hundreds of thousands of members of the general public had classified each of the 1 million SDSS galaxies an average of 40 times. Galaxy Zoo

  2. Community assessment of tropical tree biomass

    DEFF Research Database (Denmark)

    Theilade, Ida; Rutishauser, Ervan; Poulsen, Michael K.

    2015-01-01

    Background REDD+ programs rely on accurate forest carbon monitoring. Several REDD+ projects have recently shown that local communities can monitor above ground biomass as well as external professionals, but at lower costs. However, the precision and accuracy of carbon monitoring conducted by local...... communities have rarely been assessed in the tropics. The aim of this study was to investigate different sources of error in tree biomass measurements conducted by community monitors and determine the effect on biomass estimates. Furthermore, we explored the potential of local ecological knowledge to assess...... measurement, with special attention given to large and odd-shaped trees. A better understanding of traditional classification systems and concepts is required for local tree identifications and wood density estimates to become useful in monitoring of biomass and tree diversity....

  3. Interactive Tree Of Life v2: online annotation and display of phylogenetic trees made easy.

    Science.gov (United States)

    Letunic, Ivica; Bork, Peer

    2011-07-01

    Interactive Tree Of Life (http://itol.embl.de) is a web-based tool for the display, manipulation and annotation of phylogenetic trees. It is freely available and open to everyone. In addition to classical tree viewer functions, iTOL offers many novel ways of annotating trees with various additional data. Current version introduces numerous new features and greatly expands the number of supported data set types. Trees can be interactively manipulated and edited. A free personal account system is available, providing management and sharing of trees in user defined workspaces and projects. Export to various bitmap and vector graphics formats is supported. Batch access interface is available for programmatic access or inclusion of interactive trees into other web services.

  4. A new approach to enhance the performance of decision tree for classifying gene expression data.

    Science.gov (United States)

    Hassan, Md; Kotagiri, Ramamohanarao

    2013-12-20

    Gene expression data classification is a challenging task due to the large dimensionality and very small number of samples. Decision tree is one of the popular machine learning approaches to address such classification problems. However, the existing decision tree algorithms use a single gene feature at each node to split the data into its child nodes and hence might suffer from poor performance specially when classifying gene expression dataset. By using a new decision tree algorithm where, each node of the tree consists of more than one gene, we enhance the classification performance of traditional decision tree classifiers. Our method selects suitable genes that are combined using a linear function to form a derived composite feature. To determine the structure of the tree we use the area under the Receiver Operating Characteristics curve (AUC). Experimental analysis demonstrates higher classification accuracy using the new decision tree compared to the other existing decision trees in literature. We experimentally compare the effect of our scheme against other well known decision tree techniques. Experiments show that our algorithm can substantially boost the classification performance of the decision tree.

  5. Video genre classification using multimodal features

    Science.gov (United States)

    Jin, Sung Ho; Bae, Tae Meon; Choo, Jin Ho; Ro, Yong Man

    2003-12-01

    We propose a video genre classification method using multimodal features. The proposed method is applied for the preprocessing of automatic video summarization or the retrieval and classification of broadcasting video contents. Through a statistical analysis of low-level and middle-level audio-visual features in video, the proposed method can achieve good performance in classifying several broadcasting genres such as cartoon, drama, music video, news, and sports. In this paper, we adopt MPEG-7 audio-visual descriptors as multimodal features of video contents and evaluate the performance of the classification by feeding the features into a decision tree-based classifier which is trained by CART. The experimental results show that the proposed method can recognize several broadcasting video genres with a high accuracy and the classification performance with multimodal features is superior to the one with unimodal features in the genre classification.

  6. Nonbinary tree-based phylogenetic networks

    OpenAIRE

    Jetten, Laura; van Iersel, Leo

    2016-01-01

    Rooted phylogenetic networks are used to describe evolutionary histories that contain non-treelike evolutionary events such as hybridization and horizontal gene transfer. In some cases, such histories can be described by a phylogenetic base-tree with additional linking arcs, which can for example represent gene transfer events. Such phylogenetic networks are called tree-based. Here, we consider two possible generalizations of this concept to nonbinary networks, which we call tree-based and st...

  7. Border trees of complex networks

    International Nuclear Information System (INIS)

    Villas Boas, Paulino R; Rodrigues, Francisco A; Travieso, Gonzalo; Fontoura Costa, Luciano da

    2008-01-01

    The comprehensive characterization of the structure of complex networks is essential to understand the dynamical processes which guide their evolution. The discovery of the scale-free distribution and the small-world properties of real networks were fundamental to stimulate more realistic models and to understand important dynamical processes related to network growth. However, the properties of the network borders (nodes with degree equal to 1), one of its most fragile parts, remained little investigated and understood. The border nodes may be involved in the evolution of structures such as geographical networks. Here we analyze the border trees of complex networks, which are defined as the subgraphs without cycles connected to the remainder of the network (containing cycles) and terminating into border nodes. In addition to describing an algorithm for identification of such tree subgraphs, we also consider how their topological properties can be quantified in terms of their depth and number of leaves. We investigate the properties of border trees for several theoretical models as well as real-world networks. Among the obtained results, we found that more than half of the nodes of some real-world networks belong to the border trees. A power-law with cut-off was observed for the distribution of the depth and number of leaves of the border trees. An analysis of the local role of the nodes in the border trees was also performed

  8. Urban tree growth modeling

    Science.gov (United States)

    E. Gregory McPherson; Paula J. Peper

    2012-01-01

    This paper describes three long-term tree growth studies conducted to evaluate tree performance because repeated measurements of the same trees produce critical data for growth model calibration and validation. Several empirical and process-based approaches to modeling tree growth are reviewed. Modeling is more advanced in the fields of forestry and...

  9. Keeping trees as assets

    Science.gov (United States)

    Kevin T. Smith

    2009-01-01

    Landscape trees have real value and contribute to making livable communities. Making the most of that value requires providing trees with the proper care and attention. As potentially large and long-lived organisms, trees benefit from commitment to regular care that respects the natural tree system. This system captures, transforms, and uses energy to survive, grow,...

  10. Unrealistic phylogenetic trees may improve phylogenetic footprinting.

    Science.gov (United States)

    Nettling, Martin; Treutler, Hendrik; Cerquides, Jesus; Grosse, Ivo

    2017-06-01

    The computational investigation of DNA binding motifs from binding sites is one of the classic tasks in bioinformatics and a prerequisite for understanding gene regulation as a whole. Due to the development of sequencing technologies and the increasing number of available genomes, approaches based on phylogenetic footprinting become increasingly attractive. Phylogenetic footprinting requires phylogenetic trees with attached substitution probabilities for quantifying the evolution of binding sites, but these trees and substitution probabilities are typically not known and cannot be estimated easily. Here, we investigate the influence of phylogenetic trees with different substitution probabilities on the classification performance of phylogenetic footprinting using synthetic and real data. For synthetic data we find that the classification performance is highest when the substitution probability used for phylogenetic footprinting is similar to that used for data generation. For real data, however, we typically find that the classification performance of phylogenetic footprinting surprisingly increases with increasing substitution probabilities and is often highest for unrealistically high substitution probabilities close to one. This finding suggests that choosing realistic model assumptions might not always yield optimal predictions in general and that choosing unrealistically high substitution probabilities close to one might actually improve the classification performance of phylogenetic footprinting. The proposed PF is implemented in JAVA and can be downloaded from https://github.com/mgledi/PhyFoo. : martin.nettling@informatik.uni-halle.de. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press.

  11. Models of Marine Fish Biodiversity: Assessing Predictors from Three Habitat Classification Schemes.

    Science.gov (United States)

    Yates, Katherine L; Mellin, Camille; Caley, M Julian; Radford, Ben T; Meeuwig, Jessica J

    2016-01-01

    Prioritising biodiversity conservation requires knowledge of where biodiversity occurs. Such knowledge, however, is often lacking. New technologies for collecting biological and physical data coupled with advances in modelling techniques could help address these gaps and facilitate improved management outcomes. Here we examined the utility of environmental data, obtained using different methods, for developing models of both uni- and multivariate biodiversity metrics. We tested which biodiversity metrics could be predicted best and evaluated the performance of predictor variables generated from three types of habitat data: acoustic multibeam sonar imagery, predicted habitat classification, and direct observer habitat classification. We used boosted regression trees (BRT) to model metrics of fish species richness, abundance and biomass, and multivariate regression trees (MRT) to model biomass and abundance of fish functional groups. We compared model performance using different sets of predictors and estimated the relative influence of individual predictors. Models of total species richness and total abundance performed best; those developed for endemic species performed worst. Abundance models performed substantially better than corresponding biomass models. In general, BRT and MRTs developed using predicted habitat classifications performed less well than those using multibeam data. The most influential individual predictor was the abiotic categorical variable from direct observer habitat classification and models that incorporated predictors from direct observer habitat classification consistently outperformed those that did not. Our results show that while remotely sensed data can offer considerable utility for predictive modelling, the addition of direct observer habitat classification data can substantially improve model performance. Thus it appears that there are aspects of marine habitats that are important for modelling metrics of fish biodiversity that are

  12. Comparison of leaf-on and leaf-off ALS data for mapping riparian tree species

    Science.gov (United States)

    Laslier, Marianne; Ba, Antoine; Hubert-Moy, Laurence; Dufour, Simon

    2017-10-01

    Forest species composition is a fundamental indicator of forest study and management. However, describing forest species composition at large scales and of highly diverse populations remains an issue for which remote sensing can provide significant contribution, in particular, Airborne Laser Scanning (ALS) data. Riparian corridors are good examples of highly valuable ecosystems, with high species richness and large surface areas that can be time consuming and expensive to monitor with in situ measurements. Remote sensing could be useful to study them, but few studies have focused on monitoring riparian tree species using ALS data. This study aimed to determine which metrics derived from ALS data are best suited to identify and map riparian tree species. We acquired very high density leaf-on and leaf-off ALS data along the Sélune River (France). In addition, we inventoried eight main riparian deciduous tree species along the study site. After manual segmentation of the inventoried trees, we extracted 68 morphological and structural metrics from both leaf-on and leaf-off ALS point clouds. Some of these metrics were then selected using Sequential Forward Selection (SFS) algorithm. Support Vector Machine (SVM) classification results showed good accuracy with 7 metrics (0.77). Both leaf-on and leafoff metrics were kept as important metrics for distinguishing tree species. Results demonstrate the ability of 3D information derived from high density ALS data to identify riparian tree species using external and internal structural metrics. They also highlight the complementarity of leaf-on and leaf-off Lidar data for distinguishing riparian tree species.

  13. Fuzzy tree automata and syntactic pattern recognition.

    Science.gov (United States)

    Lee, E T

    1982-04-01

    An approach of representing patterns by trees and processing these trees by fuzzy tree automata is described. Fuzzy tree automata are defined and investigated. The results include that the class of fuzzy root-to-frontier recognizable ¿-trees is closed under intersection, union, and complementation. Thus, the class of fuzzy root-to-frontier recognizable ¿-trees forms a Boolean algebra. Fuzzy tree automata are applied to processing fuzzy tree representation of patterns based on syntactic pattern recognition. The grade of acceptance is defined and investigated. Quantitative measures of ``approximate isosceles triangle,'' ``approximate elongated isosceles triangle,'' ``approximate rectangle,'' and ``approximate cross'' are defined and used in the illustrative examples of this approach. By using these quantitative measures, a house, a house with high roof, and a church are also presented as illustrative examples. In addition, three fuzzy tree automata are constructed which have the capability of processing the fuzzy tree representations of ``fuzzy houses,'' ``houses with high roofs,'' and ``fuzzy churches,'' respectively. The results may have useful applications in pattern recognition, image processing, artificial intelligence, pattern database design and processing, image science, and pictorial information systems.

  14. A survey of decision tree classifier methodology

    Science.gov (United States)

    Safavian, S. R.; Landgrebe, David

    1991-01-01

    Decision tree classifiers (DTCs) are used successfully in many diverse areas such as radar signal classification, character recognition, remote sensing, medical diagnosis, expert systems, and speech recognition. Perhaps the most important feature of DTCs is their capability to break down a complex decision-making process into a collection of simpler decisions, thus providing a solution which is often easier to interpret. A survey of current methods is presented for DTC designs and the various existing issues. After considering potential advantages of DTCs over single-state classifiers, subjects of tree structure design, feature selection at each internal node, and decision and search strategies are discussed.

  15. Remote sensing survey of Chinese tallow tree in the Toledo Bend Reservoir area, Louisiana and Texas

    Science.gov (United States)

    Ramsey, Elijah W.; Rangoonwala, Amina; Bannister, Terri; Suzuoki, Yukihiro

    2013-01-01

    We applied Hyperion sensor satellite data acquired by the National Aeronautics and Space Administration’s Earth Observing-1 (EO-1) satellite in conjunction with reconnaissance surveys to map the occurrences of the invasive Chinese tallow tree (Triadica sebifera) in the Toledo Bend Reservoir study area of northwestern Louisiana and northeastern Texas. The rationale for application of high spectral resolution EO-1 Hyperion data was based on the successful use of Hyperion data in the mapping of Chinese tallow tree in southwestern Louisiana in 2005. In contrast to the single Hyperion image used in the 2005 project, more than 20 EO-1 Hyperion and Advanced Land Imager (ALI) images of the study area were collected in 2009 and 2010 during the fall senescence when Chinese tallow tree leaves turn red. Atmospherically corrected reflectance spectra of Hyperion imagery collected at ground and aerial observation locations provided the input datasets used in the program for spectral discrimination analysis. Discrimination analysis was used to identify spectral indicator sets to best explain variance contained in the input databases. The expectation was that at least one set of Hyperion-based indicator spectra would uniquely identify occurrences of red-leaf Chinese tallow tree; however, no combination of Hyperion-based reflectance datasets produced a unique identifier. The inability to discover a unique spectral indicator resulted primarily from relatively sparse coverage by red-leaf Chinese tallow tree within the study area (percentage of coverage was less than 5 percent per 30- by 30-meter Hyperion pixel). To enhance the performance of the spectral discrimination analysis, leaf and canopy spectra of Chinese tallow tree were added to the input datasets to guide the indicator selection. In addition, input databases were segregated by land class obtained from an ALI-based landcover classification in order to reduce the input variance and to promote spectral discrimination of red

  16. ETE: a python Environment for Tree Exploration.

    Science.gov (United States)

    Huerta-Cepas, Jaime; Dopazo, Joaquín; Gabaldón, Toni

    2010-01-13

    Many bioinformatics analyses, ranging from gene clustering to phylogenetics, produce hierarchical trees as their main result. These are used to represent the relationships among different biological entities, thus facilitating their analysis and interpretation. A number of standalone programs are available that focus on tree visualization or that perform specific analyses on them. However, such applications are rarely suitable for large-scale surveys, in which a higher level of automation is required. Currently, many genome-wide analyses rely on tree-like data representation and hence there is a growing need for scalable tools to handle tree structures at large scale. Here we present the Environment for Tree Exploration (ETE), a python programming toolkit that assists in the automated manipulation, analysis and visualization of hierarchical trees. ETE libraries provide a broad set of tree handling options as well as specific methods to analyze phylogenetic and clustering trees. Among other features, ETE allows for the independent analysis of tree partitions, has support for the extended newick format, provides an integrated node annotation system and permits to link trees to external data such as multiple sequence alignments or numerical arrays. In addition, ETE implements a number of built-in analytical tools, including phylogeny-based orthology prediction and cluster validation techniques. Finally, ETE's programmable tree drawing engine can be used to automate the graphical rendering of trees with customized node-specific visualizations. ETE provides a complete set of methods to manipulate tree data structures that extends current functionality in other bioinformatic toolkits of a more general purpose. ETE is free software and can be downloaded from http://ete.cgenomics.org.

  17. ETE: a python Environment for Tree Exploration

    Directory of Open Access Journals (Sweden)

    Gabaldón Toni

    2010-01-01

    Full Text Available Abstract Background Many bioinformatics analyses, ranging from gene clustering to phylogenetics, produce hierarchical trees as their main result. These are used to represent the relationships among different biological entities, thus facilitating their analysis and interpretation. A number of standalone programs are available that focus on tree visualization or that perform specific analyses on them. However, such applications are rarely suitable for large-scale surveys, in which a higher level of automation is required. Currently, many genome-wide analyses rely on tree-like data representation and hence there is a growing need for scalable tools to handle tree structures at large scale. Results Here we present the Environment for Tree Exploration (ETE, a python programming toolkit that assists in the automated manipulation, analysis and visualization of hierarchical trees. ETE libraries provide a broad set of tree handling options as well as specific methods to analyze phylogenetic and clustering trees. Among other features, ETE allows for the independent analysis of tree partitions, has support for the extended newick format, provides an integrated node annotation system and permits to link trees to external data such as multiple sequence alignments or numerical arrays. In addition, ETE implements a number of built-in analytical tools, including phylogeny-based orthology prediction and cluster validation techniques. Finally, ETE's programmable tree drawing engine can be used to automate the graphical rendering of trees with customized node-specific visualizations. Conclusions ETE provides a complete set of methods to manipulate tree data structures that extends current functionality in other bioinformatic toolkits of a more general purpose. ETE is free software and can be downloaded from http://ete.cgenomics.org.

  18. Fungal diseases of tree stands under urbanized conditions of Moscow

    Directory of Open Access Journals (Sweden)

    Smirnova Oksana G.

    2013-01-01

    Full Text Available Phytosanitary and ecological estimation of tree-stands has been con­ducted at the Forest Experimental Station of Moscow Agricultural Academy and parks of Northeast of Moscow in 2007-2011. Fomes fomentarius was proved to be a very serious pathogen of trees under conditions of Moscow, Piptoporus betulinus, Phellinus igniarius, and Fomitopsis pinicola also occurred and caused damage to trees. This rather bad phytosanitary situation depends on alarming ecological situation in Moscow. At the Forest Experimental Station of Moscow Agricultural Academy a number and cover of lichens decreased. In general, all trees in Moscow are in dynamic equilibrium with the urbanized environment. In connection with this, the following classification of tree-stands was proposed for the urbanized environment: 1 - healthy trees, 2 - affected trees which can be managed, 3 - dry woods, 3a - very diseased. Many tree-stands in investigated regions of Moscow are found to belong to the groups 2 and 3c. All tree-stands must be carefully monitored and managed in order to provide a well-timed decision on the support system for preservation of trees as ‘lungs of city’ and avoid unpredictable tree falling which put people and traffic at risk.

  19. Fault tree handbook

    International Nuclear Information System (INIS)

    Haasl, D.F.; Roberts, N.H.; Vesely, W.E.; Goldberg, F.F.

    1981-01-01

    This handbook describes a methodology for reliability analysis of complex systems such as those which comprise the engineered safety features of nuclear power generating stations. After an initial overview of the available system analysis approaches, the handbook focuses on a description of the deductive method known as fault tree analysis. The following aspects of fault tree analysis are covered: basic concepts for fault tree analysis; basic elements of a fault tree; fault tree construction; probability, statistics, and Boolean algebra for the fault tree analyst; qualitative and quantitative fault tree evaluation techniques; and computer codes for fault tree evaluation. Also discussed are several example problems illustrating the basic concepts of fault tree construction and evaluation

  20. Is overall similarity classification less effortful than single-dimension classification?

    Science.gov (United States)

    Wills, Andy J; Milton, Fraser; Longmore, Christopher A; Hester, Sarah; Robinson, Jo

    2013-01-01

    It is sometimes argued that the implementation of an overall similarity classification is less effortful than the implementation of a single-dimension classification. In the current article, we argue that the evidence securely in support of this view is limited, and report additional evidence in support of the opposite proposition--overall similarity classification is more effortful than single-dimension classification. Using a match-to-standards procedure, Experiments 1A, 1B and 2 demonstrate that concurrent load reduces the prevalence of overall similarity classification, and that this effect is robust to changes in the concurrent load task employed, the level of time pressure experienced, and the short-term memory requirements of the classification task. Experiment 3 demonstrates that participants who produced overall similarity classifications from the outset have larger working memory capacities than those who produced single-dimension classifications initially, and Experiment 4 demonstrates that instructions to respond meticulously increase the prevalence of overall similarity classification.

  1. There's Life in Hazard Trees

    Science.gov (United States)

    Mary Torsello; Toni McLellan

    The goals of hazard tree management programs are to maximize public safety and maintain a healthy sustainable tree resource. Although hazard tree management frequently targets removal of trees or parts of trees that attract wildlife, it can take into account a diversity of tree values. With just a little extra planning, hazard tree management can be highly beneficial...

  2. Forest Management Intensity Affects Aquatic Communities in Artificial Tree Holes.

    Science.gov (United States)

    Petermann, Jana S; Rohland, Anja; Sichardt, Nora; Lade, Peggy; Guidetti, Brenda; Weisser, Wolfgang W; Gossner, Martin M

    2016-01-01

    Forest management could potentially affect organisms in all forest habitats. However, aquatic communities in water-filled tree-holes may be especially sensitive because of small population sizes, the risk of drought and potential dispersal limitation. We set up artificial tree holes in forest stands subject to different management intensities in two regions in Germany and assessed the influence of local environmental properties (tree-hole opening type, tree diameter, water volume and water temperature) as well as regional drivers (forest management intensity, tree-hole density) on tree-hole insect communities (not considering other organisms such as nematodes or rotifers), detritus content, oxygen and nutrient concentrations. In addition, we compared data from artificial tree holes with data from natural tree holes in the same area to evaluate the methodological approach of using tree-hole analogues. We found that forest management had strong effects on communities in artificial tree holes in both regions and across the season. Abundance and species richness declined, community composition shifted and detritus content declined with increasing forest management intensity. Environmental variables, such as tree-hole density and tree diameter partly explained these changes. However, dispersal limitation, indicated by effects of tree-hole density, generally showed rather weak impacts on communities. Artificial tree holes had higher water temperatures (on average 2°C higher) and oxygen concentrations (on average 25% higher) than natural tree holes. The abundance of organisms was higher but species richness was lower in artificial tree holes. Community composition differed between artificial and natural tree holes. Negative management effects were detectable in both tree-hole systems, despite their abiotic and biotic differences. Our results indicate that forest management has substantial and pervasive effects on tree-hole communities and may alter their structure and

  3. Covering tree with stars

    DEFF Research Database (Denmark)

    Baumbach, Jan; Guo, Jiong; Ibragimov, Rashid

    2015-01-01

    We study the tree edit distance problem with edge deletions and edge insertions as edit operations. We reformulate a special case of this problem as Covering Tree with Stars (CTS): given a tree T and a set of stars, can we connect the stars in by adding edges between them such that the resulting...... tree is isomorphic to T? We prove that in the general setting, CST is NP-complete, which implies that the tree edit distance considered here is also NP-hard, even when both input trees having diameters bounded by 10. We also show that, when the number of distinct stars is bounded by a constant k, CTS...

  4. Monitoring stress-related mass variations in Amazon trees using accelerometers

    Science.gov (United States)

    van Emmerik, T. H. M.; Steele-Dunne, S. C.; Gentine, P.; Hut, R.; Guerin, M. F.; Leus, G.; Oliveira, R. S.; Van De Giesen, N.

    2016-12-01

    Containing half of the world's rainforests, the Amazon plays a key role in the global water and carbon budget. However, the Amazon remains poorly understood, but appears to be vulnerable to increasing moisture stress, and future droughts have the potential to considerably change the global water and carbon budget. Field measurements will allow further investigations of the effects of moisture stress and droughts on tree dynamics, and its impact on the water and carbon budget. This study focuses on studying the diurnal mass variations of seven Amazonian tree species. The mass of trees is influenced by physiological processes within the tree (e.g. transpiration and root water uptake), as well as external loads (e.g. intercepted precipitation). Depending on the physiological traits of an individual tree, moisture stress and drought affect processes such as photosynthesis, assimilation, transpiration, and root water uptake. In turn, these have their influence on diurnal mass variations of a tree. Our study uses measured three-dimensional displacement and acceleration of trees, to detect and quantify their diurnal (bio)mass variations. Nineteen accelerometers and dendrometers were installed on seven different tree species in the Amazon rainforest, covering an area of 250 x 250 m. The selected species span a wide range in wood density (0.5 - 1.1), diameter (15 - 40 cm) and height (25 - 60 m). Acceleration was measured with a frequency of 10 Hz, from August 2015 to June 2016, covering both the wet and dry season. On-site additional measurements of net radiation, wind speed at three heights, temperature, and precipitation as available every 15 minutes. Dendrometers measured variation in xylem and bark thickness every 5 minutes. The MUltiple SIgnal Classification (MUSIC) algorithm was applied to the acceleration time series to estimate the frequency spectrum of each tree. A correction was necessary to account for the dominant effect of wind. The resulting spectra reveal

  5. Building an asynchronous web-based tool for machine learning classification.

    Science.gov (United States)

    Weber, Griffin; Vinterbo, Staal; Ohno-Machado, Lucila

    2002-01-01

    Various unsupervised and supervised learning methods including support vector machines, classification trees, linear discriminant analysis and nearest neighbor classifiers have been used to classify high-throughput gene expression data. Simpler and more widely accepted statistical tools have not yet been used for this purpose, hence proper comparisons between classification methods have not been conducted. We developed free software that implements logistic regression with stepwise variable selection as a quick and simple method for initial exploration of important genetic markers in disease classification. To implement the algorithm and allow our collaborators in remote locations to evaluate and compare its results against those of other methods, we developed a user-friendly asynchronous web-based application with a minimal amount of programming using free, downloadable software tools. With this program, we show that classification using logistic regression can perform as well as other more sophisticated algorithms, and it has the advantages of being easy to interpret and reproduce. By making the tool freely and easily available, we hope to promote the comparison of classification methods. In addition, we believe our web application can be used as a model for other bioinformatics laboratories that need to develop web-based analysis tools in a short amount of time and on a limited budget.

  6. A Comparison of Machine Learning Methods in a High-Dimensional Classification Problem

    Directory of Open Access Journals (Sweden)

    Zekić-Sušac Marijana

    2014-09-01

    Full Text Available Background: Large-dimensional data modelling often relies on variable reduction methods in the pre-processing and in the post-processing stage. However, such a reduction usually provides less information and yields a lower accuracy of the model. Objectives: The aim of this paper is to assess the high-dimensional classification problem of recognizing entrepreneurial intentions of students by machine learning methods. Methods/Approach: Four methods were tested: artificial neural networks, CART classification trees, support vector machines, and k-nearest neighbour on the same dataset in order to compare their efficiency in the sense of classification accuracy. The performance of each method was compared on ten subsamples in a 10-fold cross-validation procedure in order to assess computing sensitivity and specificity of each model. Results: The artificial neural network model based on multilayer perceptron yielded a higher classification rate than the models produced by other methods. The pairwise t-test showed a statistical significance between the artificial neural network and the k-nearest neighbour model, while the difference among other methods was not statistically significant. Conclusions: Tested machine learning methods are able to learn fast and achieve high classification accuracy. However, further advancement can be assured by testing a few additional methodological refinements in machine learning methods.

  7. Classification of hand eczema

    DEFF Research Database (Denmark)

    Agner, T; Aalto-Korte, K; Andersen, K E

    2015-01-01

    BACKGROUND: Classification of hand eczema (HE) is mandatory in epidemiological and clinical studies, and also important in clinical work. OBJECTIVES: The aim was to test a recently proposed classification system of HE in clinical practice in a prospective multicentre study. METHODS: Patients were...... recruited from nine different tertiary referral centres. All patients underwent examination by specialists in dermatology and were checked using relevant allergy testing. Patients were classified into one of the six diagnostic subgroups of HE: allergic contact dermatitis, irritant contact dermatitis, atopic...... system investigated in the present study was useful, being able to give an appropriate main diagnosis for 89% of HE patients, and for another 7% when using two main diagnoses. The fact that more than half of the patients had one or more additional diagnoses illustrates that HE is a multifactorial disease....

  8. Classification in context

    DEFF Research Database (Denmark)

    Mai, Jens Erik

    2004-01-01

    This paper surveys classification research literature, discusses various classification theories, and shows that the focus has traditionally been on establishing a scientific foundation for classification research. This paper argues that a shift has taken place, and suggests that contemporary...... classification research focus on contextual information as the guide for the design and construction of classification schemes....

  9. Classification of the web

    DEFF Research Database (Denmark)

    Mai, Jens Erik

    2004-01-01

    This paper discusses the challenges faced by investigations into the classification of the Web and outlines inquiries that are needed to use principles for bibliographic classification to construct classifications of the Web. This paper suggests that the classification of the Web meets challenges...... that call for inquiries into the theoretical foundation of bibliographic classification theory....

  10. Relationships between diameter and height of trees in natural ...

    African Journals Online (AJOL)

    Relationships between diameter and height of trees in natural tropical forest in Tanzania. Wilson A Mugasha, Ole M Bollandsås, Tron Eid. Abstract. The relationship between tree height (h) and tree diameter at breast height (dbh) is an important element describing forest stands. In addition, h often is a required variable in ...

  11. Effects of nurse trees, spacing, and tree species on biomass production in mixed forest plantations

    DEFF Research Database (Denmark)

    Nord-Larsen, Thomas; Meilby, Henrik

    2016-01-01

    Growing concern about increasing concentrations of greenhouse gases in the atmosphere, and resulting global climate change, has spurred a growing demand for renewable energy. In this study, we hypothesized that a nurse tree crop may provide additional early yields of biomass for fuel, while...... was in most cases reduced due to competition. However, provided timely thinning of nurse trees, the qualitative development of the trees will allow for long-term timber production....

  12. Modular representation and analysis of fault trees

    Energy Technology Data Exchange (ETDEWEB)

    Olmos, J; Wolf, L [Massachusetts Inst. of Tech., Cambridge (USA). Dept. of Nuclear Engineering

    1978-08-01

    An analytical method to describe fault tree diagrams in terms of their modular compositions is developed. Fault tree structures are characterized by recursively relating the top tree event to all its basic component inputs through a set of equations defining each of the modulus for the fault tree. It is shown that such a modular description is an extremely valuable tool for making a quantitative analysis of fault trees. The modularization methodology has been implemented into the PL-MOD computer code, written in PL/1 language, which is capable of modularizing fault trees containing replicated components and replicated modular gates. PL-MOD in addition can handle mutually exclusive inputs and explicit higher order symmetric (k-out-of-n) gates. The step-by-step modularization of fault trees performed by PL-MOD is demonstrated and it is shown how this procedure is only made possible through an extensive use of the list processing tools available in PL/1. A number of nuclear reactor safety system fault trees were analyzed. PL-MOD performed the modularization and evaluation of the modular occurrence probabilities and Vesely-Fussell importance measures for these systems very efficiently. In particular its execution time for the modularization of a PWR High Pressure Injection System reduced fault tree was 25 times faster than that necessary to generate its equivalent minimal cut-set description using MOCUS, a code considered to be fast by present standards.

  13. The space of ultrametric phylogenetic trees.

    Science.gov (United States)

    Gavryushkin, Alex; Drummond, Alexei J

    2016-08-21

    The reliability of a phylogenetic inference method from genomic sequence data is ensured by its statistical consistency. Bayesian inference methods produce a sample of phylogenetic trees from the posterior distribution given sequence data. Hence the question of statistical consistency of such methods is equivalent to the consistency of the summary of the sample. More generally, statistical consistency is ensured by the tree space used to analyse the sample. In this paper, we consider two standard parameterisations of phylogenetic time-trees used in evolutionary models: inter-coalescent interval lengths and absolute times of divergence events. For each of these parameterisations we introduce a natural metric space on ultrametric phylogenetic trees. We compare the introduced spaces with existing models of tree space and formulate several formal requirements that a metric space on phylogenetic trees must possess in order to be a satisfactory space for statistical analysis, and justify them. We show that only a few known constructions of the space of phylogenetic trees satisfy these requirements. However, our results suggest that these basic requirements are not enough to distinguish between the two metric spaces we introduce and that the choice between metric spaces requires additional properties to be considered. Particularly, that the summary tree minimising the square distance to the trees from the sample might be different for different parameterisations. This suggests that further fundamental insight is needed into the problem of statistical consistency of phylogenetic inference methods. Copyright © 2016 The Authors. Published by Elsevier Ltd.. All rights reserved.

  14. Trees and highway safety.

    Science.gov (United States)

    2011-03-01

    To minimize the severity of run-off-road collisions of vehicles with trees, departments of transportation (DOTs) : commonly establish clear zones for trees and other fixed objects. Caltrans clear zone on freeways is 30 feet : minimum (40 feet pref...

  15. An overview of decision tree applied to power systems

    DEFF Research Database (Denmark)

    Liu, Leo; Rather, Zakir Hussain; Chen, Zhe

    2013-01-01

    The corrosive volume of available data in electric power systems motivate the adoption of data mining techniques in the emerging field of power system data analytics. The mainstream of data mining algorithm applied to power system, Decision Tree (DT), also named as Classification And Regression...... Tree (CART), has gained increasing interests because of its high performance in terms of computational efficiency, uncertainty manageability, and interpretability. This paper presents an overview of a variety of DT applications to power systems for better interfacing of power systems with data...... analytics. The fundamental knowledge of CART algorithm is also introduced which is then followed by examples of both classification tree and regression tree with the help of case study for security assessment of Danish power system....

  16. An automated approach to the design of decision tree classifiers

    Science.gov (United States)

    Argentiero, P.; Chin, R.; Beaudet, P.

    1982-01-01

    An automated technique is presented for designing effective decision tree classifiers predicated only on a priori class statistics. The procedure relies on linear feature extractions and Bayes table look-up decision rules. Associated error matrices are computed and utilized to provide an optimal design of the decision tree at each so-called 'node'. A by-product of this procedure is a simple algorithm for computing the global probability of correct classification assuming the statistical independence of the decision rules. Attention is given to a more precise definition of decision tree classification, the mathematical details on the technique for automated decision tree design, and an example of a simple application of the procedure using class statistics acquired from an actual Landsat scene.

  17. Two tree-formation methods for fast pattern search using nearest-neighbour and nearest-centroid matching

    NARCIS (Netherlands)

    Schomaker, Lambertus; Mangalagiu, D.; Vuurpijl, Louis; Weinfeld, M.; Schomaker, Lambert; Vuurpijl, Louis

    2000-01-01

    This paper describes tree­based classification of character images, comparing two methods of tree formation and two methods of matching: nearest neighbor and nearest centroid. The first method, Preprocess Using Relative Distances (PURD) is a tree­based reorganization of a flat list of patterns,

  18. Tree detection in urban regions from aerial imagery and DSM based on local maxima points

    Science.gov (United States)

    Korkmaz, Özgür; Yardımcı ćetin, Yasemin; Yilmaz, Erdal

    2017-05-01

    In this study, we propose an automatic approach for tree detection and classification in registered 3-band aerial images and associated digital surface models (DSM). The tree detection results can be used in 3D city modelling and urban planning. This problem is magnified when trees are in close proximity to each other or other objects such as rooftops in the scenes. This study presents a method for locating individual trees and estimation of crown size based on local maxima from DSM accompanied by color and texture information. For this purpose, segment level classifier trained for 10 classes and classification results are improved by analyzing the class probabilities of neighbour segments. Later, the tree classes under a certain height were eliminated using the Digital Terrain Model (DTM). For the tree classes, local maxima points are obtained and the tree radius estimate is made from the vertical and horizontal height profiles passing through these points. The final tree list containing the centers and radius of the trees is obtained by selecting from the list of tree candidates according to the overlapping and selection parameters. Although the limited number of train sets are used in this study, tree classification and localization results are competitive.

  19. Decision-Tree Program

    Science.gov (United States)

    Buntine, Wray

    1994-01-01

    IND computer program introduces Bayesian and Markov/maximum-likelihood (MML) methods and more-sophisticated methods of searching in growing trees. Produces more-accurate class-probability estimates important in applications like diagnosis. Provides range of features and styles with convenience for casual user, fine-tuning for advanced user or for those interested in research. Consists of four basic kinds of routines: data-manipulation, tree-generation, tree-testing, and tree-display. Written in C language.

  20. Hazard classification methodology

    International Nuclear Information System (INIS)

    Brereton, S.J.

    1996-01-01

    This document outlines the hazard classification methodology used to determine the hazard classification of the NIF LTAB, OAB, and the support facilities on the basis of radionuclides and chemicals. The hazard classification determines the safety analysis requirements for a facility

  1. Million trees Los Angeles canopy cover and benefit assessment

    Science.gov (United States)

    E.G. McPherson; J.R. Simpson; Q. Xiao; C. Wu

    2011-01-01

    The Million Trees LA initiative intends to improve Los Angeles’s environment through planting and stewardship of 1 million trees. The purpose of this study was to measure Los Angeles’s existing tree canopy cover (TCC), determine if space exists for 1 million additional trees, and estimate future benefits from the planting. High-resolution QuickBird remote sensing data...

  2. Comparing the performance of flat and hierarchical Habitat/Land-Cover classification models in a NATURA 2000 site

    Science.gov (United States)

    Gavish, Yoni; O'Connell, Jerome; Marsh, Charles J.; Tarantino, Cristina; Blonda, Palma; Tomaselli, Valeria; Kunin, William E.

    2018-02-01

    The increasing need for high quality Habitat/Land-Cover (H/LC) maps has triggered considerable research into novel machine-learning based classification models. In many cases, H/LC classes follow pre-defined hierarchical classification schemes (e.g., CORINE), in which fine H/LC categories are thematically nested within more general categories. However, none of the existing machine-learning algorithms account for this pre-defined hierarchical structure. Here we introduce a novel Random Forest (RF) based application of hierarchical classification, which fits a separate local classification model in every branching point of the thematic tree, and then integrates all the different local models to a single global prediction. We applied the hierarchal RF approach in a NATURA 2000 site in Italy, using two land-cover (CORINE, FAO-LCCS) and one habitat classification scheme (EUNIS) that differ from one another in the shape of the class hierarchy. For all 3 classification schemes, both the hierarchical model and a flat model alternative provided accurate predictions, with kappa values mostly above 0.9 (despite using only 2.2-3.2% of the study area as training cells). The flat approach slightly outperformed the hierarchical models when the hierarchy was relatively simple, while the hierarchical model worked better under more complex thematic hierarchies. Most misclassifications came from habitat pairs that are thematically distant yet spectrally similar. In 2 out of 3 classification schemes, the additional constraints of the hierarchical model resulted with fewer such serious misclassifications relative to the flat model. The hierarchical model also provided valuable information on variable importance which can shed light into "black-box" based machine learning algorithms like RF. We suggest various ways by which hierarchical classification models can increase the accuracy and interpretability of H/LC classification maps.

  3. Automated Diatom Classification (Part A: Handcrafted Feature Approaches

    Directory of Open Access Journals (Sweden)

    Gloria Bueno

    2017-07-01

    Full Text Available This paper deals with automatic taxa identification based on machine learning methods. The aim is therefore to automatically classify diatoms, in terms of pattern recognition terminology. Diatoms are a kind of algae microorganism with high biodiversity at the species level, which are useful for water quality assessment. The most relevant features for diatom description and classification have been selected using an extensive dataset of 80 taxa with a minimum of 100 samples/taxon augmented to 300 samples/taxon. In addition to published morphological, statistical and textural descriptors, a new textural descriptor, Local Binary Patterns (LBP, to characterize the diatom’s valves, and a log Gabor implementation not tested before for this purpose are introduced in this paper. Results show an overall accuracy of 98.11% using bagging decision trees and combinations of descriptors. Finally, some phycological features of diatoms that are still difficult to integrate in computer systems are discussed for future work.

  4. Minnesota's Forest Trees. Revised.

    Science.gov (United States)

    Miles, William R.; Fuller, Bruce L.

    This bulletin describes 46 of the more common trees found in Minnesota's forests and windbreaks. The bulletin contains two tree keys, a summer key and a winter key, to help the reader identify these trees. Besides the two keys, the bulletin includes an introduction, instructions for key use, illustrations of leaf characteristics and twig…

  5. D2-tree

    DEFF Research Database (Denmark)

    Brodal, Gerth Stølting; Sioutas, Spyros; Pantazos, Kostas

    2015-01-01

    We present a new overlay, called the Deterministic Decentralized tree (D2-tree). The D2-tree compares favorably to other overlays for the following reasons: (a) it provides matching and better complexities, which are deterministic for the supported operations; (b) the management of nodes (peers...

  6. Covering tree with stars

    DEFF Research Database (Denmark)

    Baumbach, Jan; Guo, Jian-Ying; Ibragimov, Rashid

    2013-01-01

    We study the tree edit distance problem with edge deletions and edge insertions as edit operations. We reformulate a special case of this problem as Covering Tree with Stars (CTS): given a tree T and a set of stars, can we connect the stars in by adding edges between them such that the resulting ...

  7. Winter Birch Trees

    Science.gov (United States)

    Sweeney, Debra; Rounds, Judy

    2011-01-01

    Trees are great inspiration for artists. Many art teachers find themselves inspired and maybe somewhat obsessed with the natural beauty and elegance of the lofty tree, and how it changes through the seasons. One such tree that grows in several regions and always looks magnificent, regardless of the time of year, is the birch. In this article, the…

  8. Total well dominated trees

    DEFF Research Database (Denmark)

    Finbow, Arthur; Frendrup, Allan; Vestergaard, Preben D.

    cardinality then G is a total well dominated graph. In this paper we study composition and decomposition of total well dominated trees. By a reversible process we prove that any total well dominated tree can both be reduced to and constructed from a family of three small trees....

  9. IND - THE IND DECISION TREE PACKAGE

    Science.gov (United States)

    Buntine, W.

    1994-01-01

    A common approach to supervised classification and prediction in artificial intelligence and statistical pattern recognition is the use of decision trees. A tree is "grown" from data using a recursive partitioning algorithm to create a tree which has good prediction of classes on new data. Standard algorithms are CART (by Breiman Friedman, Olshen and Stone) and ID3 and its successor C4 (by Quinlan). As well as reimplementing parts of these algorithms and offering experimental control suites, IND also introduces Bayesian and MML methods and more sophisticated search in growing trees. These produce more accurate class probability estimates that are important in applications like diagnosis. IND is applicable to most data sets consisting of independent instances, each described by a fixed length vector of attribute values. An attribute value may be a number, one of a set of attribute specific symbols, or it may be omitted. One of the attributes is delegated the "target" and IND grows trees to predict the target. Prediction can then be done on new data or the decision tree printed out for inspection. IND provides a range of features and styles with convenience for the casual user as well as fine-tuning for the advanced user or those interested in research. IND can be operated in a CART-like mode (but without regression trees, surrogate splits or multivariate splits), and in a mode like the early version of C4. Advanced features allow more extensive search, interactive control and display of tree growing, and Bayesian and MML algorithms for tree pruning and smoothing. These often produce more accurate class probability estimates at the leaves. IND also comes with a comprehensive experimental control suite. IND consists of four basic kinds of routines: data manipulation routines, tree generation routines, tree testing routines, and tree display routines. The data manipulation routines are used to partition a single large data set into smaller training and test sets. The

  10. TreePics: visualizing trees with pictures

    Directory of Open Access Journals (Sweden)

    Nicolas Puillandre

    2017-09-01

    Full Text Available While many programs are available to edit phylogenetic trees, associating pictures with branch tips in an efficient and automatic way is not an available option. Here, we present TreePics, a standalone software that uses a web browser to visualize phylogenetic trees in Newick format and that associates pictures (typically, pictures of the voucher specimens to the tip of each branch. Pictures are visualized as thumbnails and can be enlarged by a mouse rollover. Further, several pictures can be selected and displayed in a separate window for visual comparison. TreePics works either online or in a full standalone version, where it can display trees with several thousands of pictures (depending on the memory available. We argue that TreePics can be particularly useful in a preliminary stage of research, such as to quickly detect conflicts between a DNA-based phylogenetic tree and morphological variation, that may be due to contamination that needs to be removed prior to final analyses, or the presence of species complexes.

  11. Interpretation of Forest Resources at the Individual Tree Level at Purple Mountain, Nanjing City, China, Using WorldView-2 Imagery by Combining GPS, RS and GIS Technologies

    Directory of Open Access Journals (Sweden)

    Songqiu Deng

    2013-12-01

    Full Text Available This study attempted to measure forest resources at the individual tree level using high-resolution images by combining GPS, RS, and Geographic Information System (GIS technologies. The images were acquired by the WorldView-2 satellite with a resolution of 0.5 m in the panchromatic band and 2.0 m in the multispectral bands. Field data of 90 plots were used to verify the interpreted accuracy. The tops of trees in three groups, namely ≥10 cm, ≥15 cm, and ≥20 cm DBH (diameter at breast height, were extracted by the individual tree crown (ITC approach using filters with moving windows of 3 × 3 pixels, 5 × 5 pixels and 7 × 7 pixels, respectively. In the study area, there were 1,203,970 trees of DBH over 10 cm, and the interpreted accuracy was 73.68 ± 15.14% averaged over the 90 plots. The numbers of the trees that were ≥15 cm and ≥20 cm DBH were 727,887 and 548,919, with an average accuracy of 68.74 ± 17.21% and 71.92 ± 18.03%, respectively. The pixel-based classification showed that the classified accuracies of the 16 classes obtained using the eight multispectral bands were higher than those obtained using only the four standard bands. The increments ranged from 0.1% for the water class to 17.0% for Metasequoia glyptostroboides, with an average value of 4.8% for the 16 classes. In addition, to overcome the “mixed pixels” problem, a crown-based supervised classification, which can improve the classified accuracy of both dominant species and smaller classes, was used for generating a thematic map of tree species. The improvements of the crown- to pixel-based classification ranged from −1.6% for the open forest class to 34.3% for Metasequoia glyptostroboides, with an average value of 20.3% for the 10 classes. All tree tops were then annotated with the species attributes from the map, and a tree count of different species indicated that the forest of Purple Mountain is mainly dominated by Quercus acutissima, Liquidambar formosana

  12. Neuromuscular disease classification system

    Science.gov (United States)

    Sáez, Aurora; Acha, Begoña; Montero-Sánchez, Adoración; Rivas, Eloy; Escudero, Luis M.; Serrano, Carmen

    2013-06-01

    Diagnosis of neuromuscular diseases is based on subjective visual assessment of biopsies from patients by the pathologist specialist. A system for objective analysis and classification of muscular dystrophies and neurogenic atrophies through muscle biopsy images of fluorescence microscopy is presented. The procedure starts with an accurate segmentation of the muscle fibers using mathematical morphology and a watershed transform. A feature extraction step is carried out in two parts: 24 features that pathologists take into account to diagnose the diseases and 58 structural features that the human eye cannot see, based on the assumption that the biopsy is considered as a graph, where the nodes are represented by each fiber, and two nodes are connected if two fibers are adjacent. A feature selection using sequential forward selection and sequential backward selection methods, a classification using a Fuzzy ARTMAP neural network, and a study of grading the severity are performed on these two sets of features. A database consisting of 91 images was used: 71 images for the training step and 20 as the test. A classification error of 0% was obtained. It is concluded that the addition of features undetectable by the human visual inspection improves the categorization of atrophic patterns.

  13. Land use classification from Sentinel-2 imagery

    OpenAIRE

    Borràs, J.; Delegido, J.; Pezzola, A.; Pereira, M.; Morassi, G.; Camps-Valls, G.

    2017-01-01

    [EN] Sentinel-2 (S2), a new ESA satellite for Earth observation, accounts with 13 bands which provide high-quality radiometric images with an excellent spatial resolution (10 and 20 m) ideal for classification purposes. In this paper, two objectives have been addressed: to determine the best classification method for S2, and to quantify its improve-ment with respect to the SPOT operational mission. To do so, four classifiers (LDA, RF, Decision Trees, K-NN) have been selected and applied to tw...

  14. Additive manufacturing.

    Science.gov (United States)

    Mumith, A; Thomas, M; Shah, Z; Coathup, M; Blunn, G

    2018-04-01

    Increasing innovation in rapid prototyping (RP) and additive manufacturing (AM), also known as 3D printing, is bringing about major changes in translational surgical research. This review describes the current position in the use of additive manufacturing in orthopaedic surgery. Cite this article: Bone Joint J 2018;100-B:455-60.

  15. REMOTE SENSING IMAGE CLASSIFICATION APPLIED TO THE FIRST NATIONAL GEOGRAPHICAL INFORMATION CENSUS OF CHINA

    Directory of Open Access Journals (Sweden)

    X. Yu

    2016-06-01

    Full Text Available Image classification will still be a long way in the future, although it has gone almost half a century. In fact, researchers have gained many fruits in the image classification domain, but there is still a long distance between theory and practice. However, some new methods in the artificial intelligence domain will be absorbed into the image classification domain and draw on the strength of each to offset the weakness of the other, which will open up a new prospect. Usually, networks play the role of a high-level language, as is seen in Artificial Intelligence and statistics, because networks are used to build complex model from simple components. These years, Bayesian Networks, one of probabilistic networks, are a powerful data mining technique for handling uncertainty in complex domains. In this paper, we apply Tree Augmented Naive Bayesian Networks (TAN to texture classification of High-resolution remote sensing images and put up a new method to construct the network topology structure in terms of training accuracy based on the training samples. Since 2013, China government has started the first national geographical information census project, which mainly interprets geographical information based on high-resolution remote sensing images. Therefore, this paper tries to apply Bayesian network to remote sensing image classification, in order to improve image interpretation in the first national geographical information census project. In the experiment, we choose some remote sensing images in Beijing. Experimental results demonstrate TAN outperform than Naive Bayesian Classifier (NBC and Maximum Likelihood Classification Method (MLC in the overall classification accuracy. In addition, the proposed method can reduce the workload of field workers and improve the work efficiency. Although it is time consuming, it will be an attractive and effective method for assisting office operation of image interpretation.

  16. Spectra of chemical trees

    International Nuclear Information System (INIS)

    Balasubramanian, K.

    1982-01-01

    A method is developed for obtaining the spectra of trees of NMR and chemical interests. The characteristic polynomials of branched trees can be obtained in terms of the characteristic polynomials of unbranched trees and branches by pruning the tree at the joints. The unbranched trees can also be broken down further until a tree containing just two vertices is obtained. The effectively reduces the order of the secular determinant of the tree used at the beginning to determinants of orders atmost equal to the number of vertices in the branch containing the largest number of vertices. An illustrative example of a NMR graph is given for which the 22 x 22 secular determinant is reduced to determinants of orders atmost 4 x 4 in just the second step of the algorithm. The tree pruning algorithm can be applied even to trees with no symmetry elements and such a factoring can be achieved. Methods developed here can be elegantly used to find if two trees are cospectral and to construct cospectral trees

  17. Refining discordant gene trees.

    Science.gov (United States)

    Górecki, Pawel; Eulenstein, Oliver

    2014-01-01

    Evolutionary studies are complicated by discordance between gene trees and the species tree in which they evolved. Dealing with discordant trees often relies on comparison costs between gene and species trees, including the well-established Robinson-Foulds, gene duplication, and deep coalescence costs. While these costs have provided credible results for binary rooted gene trees, corresponding cost definitions for non-binary unrooted gene trees, which are frequently occurring in practice, are challenged by biological realism. We propose a natural extension of the well-established costs for comparing unrooted and non-binary gene trees with rooted binary species trees using a binary refinement model. For the duplication cost we describe an efficient algorithm that is based on a linear time reduction and also computes an optimal rooted binary refinement of the given gene tree. Finally, we show that similar reductions lead to solutions for computing the deep coalescence and the Robinson-Foulds costs. Our binary refinement of Robinson-Foulds, gene duplication, and deep coalescence costs for unrooted and non-binary gene trees together with the linear time reductions provided here for computing these costs significantly extends the range of trees that can be incorporated into approaches dealing with discordance.

  18. Classification algorithms using adaptive partitioning

    KAUST Repository

    Binev, Peter; Cohen, Albert; Dahmen, Wolfgang; DeVore, Ronald

    2014-01-01

    © 2014 Institute of Mathematical Statistics. Algorithms for binary classification based on adaptive tree partitioning are formulated and analyzed for both their risk performance and their friendliness to numerical implementation. The algorithms can be viewed as generating a set approximation to the Bayes set and thus fall into the general category of set estimators. In contrast with the most studied tree-based algorithms, which utilize piecewise constant approximation on the generated partition [IEEE Trans. Inform. Theory 52 (2006) 1335.1353; Mach. Learn. 66 (2007) 209.242], we consider decorated trees, which allow us to derive higher order methods. Convergence rates for these methods are derived in terms the parameter - of margin conditions and a rate s of best approximation of the Bayes set by decorated adaptive partitions. They can also be expressed in terms of the Besov smoothness β of the regression function that governs its approximability by piecewise polynomials on adaptive partition. The execution of the algorithms does not require knowledge of the smoothness or margin conditions. Besov smoothness conditions are weaker than the commonly used Holder conditions, which govern approximation by nonadaptive partitions, and therefore for a given regression function can result in a higher rate of convergence. This in turn mitigates the compatibility conflict between smoothness and margin parameters.

  19. Classification algorithms using adaptive partitioning

    KAUST Repository

    Binev, Peter

    2014-12-01

    © 2014 Institute of Mathematical Statistics. Algorithms for binary classification based on adaptive tree partitioning are formulated and analyzed for both their risk performance and their friendliness to numerical implementation. The algorithms can be viewed as generating a set approximation to the Bayes set and thus fall into the general category of set estimators. In contrast with the most studied tree-based algorithms, which utilize piecewise constant approximation on the generated partition [IEEE Trans. Inform. Theory 52 (2006) 1335.1353; Mach. Learn. 66 (2007) 209.242], we consider decorated trees, which allow us to derive higher order methods. Convergence rates for these methods are derived in terms the parameter - of margin conditions and a rate s of best approximation of the Bayes set by decorated adaptive partitions. They can also be expressed in terms of the Besov smoothness β of the regression function that governs its approximability by piecewise polynomials on adaptive partition. The execution of the algorithms does not require knowledge of the smoothness or margin conditions. Besov smoothness conditions are weaker than the commonly used Holder conditions, which govern approximation by nonadaptive partitions, and therefore for a given regression function can result in a higher rate of convergence. This in turn mitigates the compatibility conflict between smoothness and margin parameters.

  20. Radioecological investigations on tree rings of spruce

    International Nuclear Information System (INIS)

    Haas, G.; Mueller, A.

    1995-01-01

    Tree ring analysis contributes essentially to the explanation of physiological and element-specific transport phenomena in trees. After the accident in Chernobyl the behaviour of Cs-134 and Cs-137 in trees is most informative for the prediction of the future development of the distribution of these elements. In this study the uptake and the long term behaviour of Cs-134, Cs-137, Pb-210, Ra-226, Ra-228, K-40, Th-228, Th-230, Th-232, U-234 and U-238 in tree rings of spruce are examined by α- and γ-spectrometry. All samples are dried at 105C and ashed at 450C in a muffle furnace. The distributions found in the tree rings vary for different radionuclides. A soil profile from the spruce stand provides additional information

  1. Statistical Sensitive Data Protection and Inference Prevention with Decision Tree Methods

    National Research Council Canada - National Science Library

    Chang, LiWu

    2003-01-01

    .... We consider inference as correct classification and approach it with decision tree methods. As in our previous work, sensitive data are viewed as classes of those test data and non-sensitive data are the rest attribute values...

  2. Food additives

    Science.gov (United States)

    ... GO About MedlinePlus Site Map FAQs Customer Support Health Topics Drugs & Supplements Videos & Tools Español You Are Here: Home → Medical Encyclopedia → Food additives URL of this page: //medlineplus.gov/ency/article/ ...

  3. TESTING OF LAND COVER CLASSIFICATION FROM MULTISPECTRAL AIRBORNE LASER SCANNING DATA

    Directory of Open Access Journals (Sweden)

    K. Bakuła

    2016-06-01

    Full Text Available Multispectral Airborne Laser Scanning provides a new opportunity for airborne data collection. It provides high-density topographic surveying and is also a useful tool for land cover mapping. Use of a minimum of three intensity images from a multiwavelength laser scanner and 3D information included in the digital surface model has the potential for land cover/use classification and a discussion about the application of this type of data in land cover/use mapping has recently begun. In the test study, three laser reflectance intensity images (orthogonalized point cloud acquired in green, near-infrared and short-wave infrared bands, together with a digital surface model, were used in land cover/use classification where six classes were distinguished: water, sand and gravel, concrete and asphalt, low vegetation, trees and buildings. In the tested methods, different approaches for classification were applied: spectral (based only on laser reflectance intensity images, spectral with elevation data as additional input data, and spectro-textural, using morphological granulometry as a method of texture analysis of both types of data: spectral images and the digital surface model. The method of generating the intensity raster was also tested in the experiment. Reference data were created based on visual interpretation of ALS data and traditional optical aerial and satellite images. The results have shown that multispectral ALS data are unlike typical multispectral optical images, and they have a major potential for land cover/use classification. An overall accuracy of classification over 90% was achieved. The fusion of multi-wavelength laser intensity images and elevation data, with the additional use of textural information derived from granulometric analysis of images, helped to improve the accuracy of classification significantly. The method of interpolation for the intensity raster was not very helpful, and using intensity rasters with both first and

  4. Floral markers of strawberry tree (Arbutus unedo L.) honey.

    Science.gov (United States)

    Tuberoso, Carlo I G; Bifulco, Ersilia; Caboni, Pierluigi; Cottiglia, Filippo; Cabras, Paolo; Floris, Ignazio

    2010-01-13

    Strawberry tree honey, due to its characteristic bitter taste, is one of the most typical Mediterranean honeys, with Sardinia being one of the largest producers. According to specific chemical studies, homogentisic acid was identified as a possible marker of this honey. This work, based on HPLC-DAD-MS/MS analysis of strawberry tree (Arbutus unedo L.) honeys, previously selected by sensory evaluation and melissopalynological analysis, showed that, in addition to the above-mentioned acid, there were other high levels of substances useful for the botanical classification of this unifloral honey. Two of these compounds were isolated and identified as (+/-)-2-cis,4-trans-abscisic acid (c,t-ABA) and (+/-)-2-trans,4-trans-abscisic acid (t,t-ABA). A third compound, a new natural product named unedone, was characterized as an epoxidic derivative of the above-mentioned acids. Structures of c,t-ABA, t,t-ABA, and unedone were elucidated on the basis of extensive 1D and 2D NMR experiments, as well as HPLC-MS/MS and Q-TOF analysis. In selected honeys the average amounts of c,t-ABA, t,t-ABA, and unedone were 176.2+/-25.4, 162.3+/-21.1, and 32.9+/-7.1 mg/kg, respectively. Analysis of the A. unedo nectar confirmed the floral origin of these compounds found in the honey. Abscisic acids were found in other unifloral honeys but not in such high amount and with a constant ratio of about 1:1. For this reason, besides homogentisic acid, these compounds could be used as complementary markers of strawberry tree honey.

  5. Dual-tree complex wavelet for medical image watermarking

    International Nuclear Information System (INIS)

    Mavudila, K.R.; Ndaye, B.M.; Masmoudi, L.; Hassanain, N.; Cherkaoui, M.

    2010-01-01

    In order to transmit medical data between hospitals, we insert the information for each patient in the image and its diagnosis, the watermarking consist to insert a message in the image and try to find it with the maximum possible fidelity. This paper presents a blind watermarking scheme in wavelet transform domain dual tree (DTT), who increasing the robustness and preserves the image quality. This system is transparent to the user and allows image integrity control. In addition, it provides information on the location of potential alterations and an evaluation of image modifications which is of major importance in a medico-legal framework. An example using head magnetic resonance and mammography imaging illustrates the overall method. Wavelet techniques can be successfully applied in various image processing methods, namely in image de noising, segmentation, classification, watermarking and others. In this paper we discussed the application of dual tree complex wavelet transform (D T-CWT), which has significant advantages over classic discrete wavelet transform (DWT), for certain image processing problems. The D T-CWT is a form of discreet wavelet transform which generates complex coefficients by using a dual tree of wavelet filters to obtain their real and imaginary parts. The main part of the paper is devoted to profit the exceptional quality for D T-CWT, compared to classical DWT, for a blind medical image watermarking, our schemes are using for the performance bivariate shrinkage with local variance estimation and are robust of attacks and favourably preserves the visual quality. Experimental results show that embedded watermarks using CWT give good image quality and are robust in comparison with the classical DWT.

  6. Shopping intention prediction using decision trees

    Directory of Open Access Journals (Sweden)

    Dario Šebalj

    2017-09-01

    Full Text Available Introduction: The price is considered to be neglected marketing mix element due to the complexity of price management and sensitivity of customers on price changes. It pulls the fastest customer reactions to that change. Accordingly, the process of making shopping decisions can be very challenging for customer. Objective: The aim of this paper is to create a model that is able to predict shopping intention and classify respondents into one of the two categories, depending on whether they intend to shop or not. Methods: Data sample consists of 305 respondents, who are persons older than 18 years involved in buying groceries for their household. The research was conducted in February 2017. In order to create a model, the decision trees method was used with its several classification algorithms. Results: All models, except the one that used RandomTree algorithm, achieved relatively high classification rate (over the 80%. The highest classification accuracy of 84.75% gave J48 and RandomForest algorithms. Since there is no statistically significant difference between those two algorithms, authors decided to choose J48 algorithm and build a decision tree. Conclusions: The value for money and price level in the store were the most significant variables for classification of shopping intention. Future study plans to compare this model with some other data mining techniques, such as neural networks or support vector machines since these techniques achieved very good accuracy in some previous research in this field.

  7. Mastication and prescribed fire influences on tree mortality and predicted fire behavior in ponderosa pine

    Science.gov (United States)

    Alicia L. Reiner; Nicole M. Vaillant; Scott N. Dailey

    2012-01-01

    The purpose of this study was to provide land managers with information on potential wildfire behavior and tree mortality associated with mastication and masticated/fire treatments in a plantation. Additionally, the effect of pulling fuels away from tree boles before applying fire treatment was studied in relation to tree mortality. Fuel characteristics and tree...

  8. A method to study response of large trees to different amounts of available soil water

    Science.gov (United States)

    D.H. Marx; Shi-Jean S. Sung; J.S. Cunningham; M.D. Thompson; L.M. White

    1995-01-01

    A method was developed to manipulate available soil water on large trees by intercepting thrufall with gutters placed under tree canopies and irrigating the intercepted thrufall onto other trees. With this design, trees were exposed for 2 years to either 25% less thrufall, normal thrufall, or 25% additional thrufall.Undercanopy construction in these plots moderately...

  9. Fault tree analysis for vital area identification

    International Nuclear Information System (INIS)

    Varnado, G.B.; Ortiz, N.R.

    1978-01-01

    This paper discusses the use of fault tree analysis to identify those areas of nuclear fuel cycle facilities which must be protected to prevent acts of sabotage that could lead to sifnificant release of radioactive material. By proper manipulation of the fault trees for a plant, an analyst can identify vital areas in a manner consistent with regulatory definitions. This paper discusses the general procedures used in the analysis of any nuclear facility. In addition, a structured, generic approach to the development of the fault trees for nuclear power reactors is presented along with selected results of the application of the generic approach to several plants

  10. River and wetland classifications for freshwater conservation ...

    African Journals Online (AJOL)

    River and wetland classifications for freshwater conservation planning in KwaZulu-Natal, South Africa. ... regional- or provincial-scale conservation planning. The hierarchical structure of the classifications provides scope for finer resolution, by the addition of further levels, for application at a sub-regional or municipal scale.

  11. SAW Classification Algorithm for Chinese Text Classification

    OpenAIRE

    Xiaoli Guo; Huiyu Sun; Tiehua Zhou; Ling Wang; Zhaoyang Qu; Jiannan Zang

    2015-01-01

    Considering the explosive growth of data, the increased amount of text data’s effect on the performance of text categorization forward the need for higher requirements, such that the existing classification method cannot be satisfied. Based on the study of existing text classification technology and semantics, this paper puts forward a kind of Chinese text classification oriented SAW (Structural Auxiliary Word) algorithm. The algorithm uses the special space effect of Chinese text where words...

  12. Degree of susceptibility of industrial gases of tree and shrub species

    Energy Technology Data Exchange (ETDEWEB)

    Dobrovoljskii, I A

    1952-01-01

    The trees and shrubs of the iron smelting region of Krivoi Rog, in the Ukraine, were surveyed to determine susceptibility to air pollution damage. Most of the observations were made in parks and green belts in industrial areas. A classification of tree and shrub species is presented; they are separated into three classes according to their susceptibility to air pollutant injury.

  13. A Branch-and-Price approach to find optimal decision trees

    NARCIS (Netherlands)

    Firat, M.; Crognier, Guillaume; Gabor, Adriana; Zhang, Y.

    2018-01-01

    In Artificial Intelligence (AI) field, decision trees have gained certain importance due to their effectiveness in solving classification and regression problems. Recently, in the literature we see finding optimal decision trees are formulated as Mixed Integer Linear Programming (MILP) models. This

  14. The valuative tree

    CERN Document Server

    Favre, Charles

    2004-01-01

    This volume is devoted to a beautiful object, called the valuative tree and designed as a powerful tool for the study of singularities in two complex dimensions. Its intricate yet manageable structure can be analyzed by both algebraic and geometric means. Many types of singularities, including those of curves, ideals, and plurisubharmonic functions, can be encoded in terms of positive measures on the valuative tree. The construction of these measures uses a natural tree Laplace operator of independent interest.

  15. Detection, diagnosis, and treatment of accident conditions using response trees

    International Nuclear Information System (INIS)

    Nelson, W.R.

    1980-01-01

    Response Trees were developed at the LOFT facility in 1978 and included in the Plant Operating Manual (POM) to assist reactor operators in selecting emergency procedures. In an emergency situation the operator would manually gather data and evaluate the trees to select the appropriate procedures. As a portion of the LOFT Augmented Operator Capability (AOC) Program, the response tree methodology has been extended so that a computer can be used to evaluate the trees and recommend an appropriate response for an accident. Techniques for diagnosing failures within a cooling mode have also been investigated. This paper summarizes these additions to the response tree methodology

  16. Using different classification models in wheat grading utilizing visual features

    Science.gov (United States)

    Basati, Zahra; Rasekh, Mansour; Abbaspour-Gilandeh, Yousef

    2018-04-01

    Wheat is one of the most important strategic crops in Iran and in the world. The major component that distinguishes wheat from other grains is the gluten section. In Iran, sunn pest is one of the most important factors influencing the characteristics of wheat gluten and in removing it from a balanced state. The existence of bug-damaged grains in wheat will reduce the quality and price of the product. In addition, damaged grains reduce the enrichment of wheat and the quality of bread products. In this study, after preprocessing and segmentation of images, 25 features including 9 colour features, 10 morphological features, and 6 textual statistical features were extracted so as to classify healthy and bug-damaged wheat grains of Azar cultivar of four levels of moisture content (9, 11.5, 14 and 16.5% w.b.) and two lighting colours (yellow light, the composition of yellow and white lights). Using feature selection methods in the WEKA software and the CfsSubsetEval evaluator, 11 features were chosen as inputs of artificial neural network, decision tree and discriment analysis classifiers. The results showed that the decision tree with the J.48 algorithm had the highest classification accuracy of 90.20%. This was followed by artificial neural network classifier with the topology of 11-19-2 and discrimient analysis classifier at 87.46 and 81.81%, respectively

  17. Comparative study of PCA in classification of multichannel EMG signals.

    Science.gov (United States)

    Geethanjali, P

    2015-06-01

    Electromyographic (EMG) signals are abundantly used in the field of rehabilitation engineering in controlling the prosthetic device and significantly essential to find fast and accurate EMG pattern recognition system, to avoid intrusive delay. The main objective of this paper is to study the influence of Principal component analysis (PCA), a transformation technique, in pattern recognition of six hand movements using four channel surface EMG signals from ten healthy subjects. For this reason, time domain (TD) statistical as well as auto regression (AR) coefficients are extracted from the four channel EMG signals. The extracted statistical features as well as AR coefficients are transformed using PCA to 25, 50 and 75 % of corresponding original feature vector space. The classification accuracy of PCA transformed and non-PCA transformed TD statistical features as well as AR coefficients are studied with simple logistic regression (SLR), decision tree (DT) with J48 algorithm, logistic model tree (LMT), k nearest neighbor (kNN) and neural network (NN) classifiers in the identification of six different movements. The Kruskal-Wallis (KW) statistical test shows that there is a significant reduction (P PCA transformed features compared to non-PCA transformed features. SLR with non-PCA transformed time domain (TD) statistical features performs better in accuracy and computational power compared to other features considered in this study. In addition, the motion control of three drives for six movements of the hand is implemented with SLR using TD statistical features in off-line with TMSLF2407 digital signal controller (DSC).

  18. Motif-Based Text Mining of Microbial Metagenome Redundancy Profiling Data for Disease Classification.

    Science.gov (United States)

    Wang, Yin; Li, Rudong; Zhou, Yuhua; Ling, Zongxin; Guo, Xiaokui; Xie, Lu; Liu, Lei

    2016-01-01

    Text data of 16S rRNA are informative for classifications of microbiota-associated diseases. However, the raw text data need to be systematically processed so that features for classification can be defined/extracted; moreover, the high-dimension feature spaces generated by the text data also pose an additional difficulty. Here we present a Phylogenetic Tree-Based Motif Finding algorithm (PMF) to analyze 16S rRNA text data. By integrating phylogenetic rules and other statistical indexes for classification, we can effectively reduce the dimension of the large feature spaces generated by the text datasets. Using the retrieved motifs in combination with common classification methods, we can discriminate different samples of both pneumonia and dental caries better than other existing methods. We extend the phylogenetic approaches to perform supervised learning on microbiota text data to discriminate the pathological states for pneumonia and dental caries. The results have shown that PMF may enhance the efficiency and reliability in analyzing high-dimension text data.

  19. Motif-Based Text Mining of Microbial Metagenome Redundancy Profiling Data for Disease Classification

    Directory of Open Access Journals (Sweden)

    Yin Wang

    2016-01-01

    Full Text Available Background. Text data of 16S rRNA are informative for classifications of microbiota-associated diseases. However, the raw text data need to be systematically processed so that features for classification can be defined/extracted; moreover, the high-dimension feature spaces generated by the text data also pose an additional difficulty. Results. Here we present a Phylogenetic Tree-Based Motif Finding algorithm (PMF to analyze 16S rRNA text data. By integrating phylogenetic rules and other statistical indexes for classification, we can effectively reduce the dimension of the large feature spaces generated by the text datasets. Using the retrieved motifs in combination with common classification methods, we can discriminate different samples of both pneumonia and dental caries better than other existing methods. Conclusions. We extend the phylogenetic approaches to perform supervised learning on microbiota text data to discriminate the pathological states for pneumonia and dental caries. The results have shown that PMF may enhance the efficiency and reliability in analyzing high-dimension text data.

  20. Comparison of pixel and object-based classification for burned area mapping using SPOT-6 images

    Directory of Open Access Journals (Sweden)

    Elif Sertel

    2016-07-01

    Full Text Available On 30 May 2013, a forest fire occurred in Izmir, Turkey causing damage to both forest and fruit trees within the region. In this research, pre- and post-fire SPOT-6 images obtained on 30 April 2013 and 31 May 2013 were used to identify the extent of forest fire within the region. SPOT-6 images of the study region were orthorectified and classified using pixel and object-based classification (OBC algorithms to accurately delineate the boundaries of burned areas. The present results show that for OBC using only normalized difference vegetation index (NDVI thresholds is not sufficient enough to map the burn scars; however, creating a new and simple rule set that included mean brightness values of near infrared and red channels in addition to mean NDVI values of segments considerably improved the accuracy of classification. According to the accuracy assessment results, the burned area was mapped with a 0.9322 kappa value in OBC, while a 0.7433 kappa value was observed in pixel-based classification. Lastly, classification results were integrated with the forest management map to determine the effected forest types after the fire to be used by the National Forest Directorate for their operational activities to effectively manage the fire, response and recovery processes.

  1. Using methods from the data mining and machine learning literature for disease classification and prediction: A case study examining classification of heart failure sub-types

    Science.gov (United States)

    Austin, Peter C.; Tu, Jack V.; Ho, Jennifer E.; Levy, Daniel; Lee, Douglas S.

    2014-01-01

    Objective Physicians classify patients into those with or without a specific disease. Furthermore, there is often interest in classifying patients according to disease etiology or subtype. Classification trees are frequently used to classify patients according to the presence or absence of a disease. However, classification trees can suffer from limited accuracy. In the data-mining and machine learning literature, alternate classification schemes have been developed. These include bootstrap aggregation (bagging), boosting, random forests, and support vector machines. Study design and Setting We compared the performance of these classification methods with those of conventional classification trees to classify patients with heart failure according to the following sub-types: heart failure with preserved ejection fraction (HFPEF) vs. heart failure with reduced ejection fraction (HFREF). We also compared the ability of these methods to predict the probability of the presence of HFPEF with that of conventional logistic regression. Results We found that modern, flexible tree-based methods from the data mining literature offer substantial improvement in prediction and classification of heart failure sub-type compared to conventional classification and regression trees. However, conventional logistic regression had superior performance for predicting the probability of the presence of HFPEF compared to the methods proposed in the data mining literature. Conclusion The use of tree-based methods offers superior performance over conventional classification and regression trees for predicting and classifying heart failure subtypes in a population-based sample of patients from Ontario. However, these methods do not offer substantial improvements over logistic regression for predicting the presence of HFPEF. PMID:23384592

  2. Coded Splitting Tree Protocols

    DEFF Research Database (Denmark)

    Sørensen, Jesper Hemming; Stefanovic, Cedomir; Popovski, Petar

    2013-01-01

    This paper presents a novel approach to multiple access control called coded splitting tree protocol. The approach builds on the known tree splitting protocols, code structure and successive interference cancellation (SIC). Several instances of the tree splitting protocol are initiated, each...... instance is terminated prematurely and subsequently iterated. The combined set of leaves from all the tree instances can then be viewed as a graph code, which is decodable using belief propagation. The main design problem is determining the order of splitting, which enables successful decoding as early...

  3. Morocco - Fruit Tree Productivity

    Data.gov (United States)

    Millennium Challenge Corporation — Date Tree Irrigation Project: The specific objectives of this evaluation are threefold: - Performance evaluation of project activities, like the mid-term evaluation,...

  4. Transportation Modes Classification Using Sensors on Smartphones

    Directory of Open Access Journals (Sweden)

    Shih-Hau Fang

    2016-08-01

    Full Text Available This paper investigates the transportation and vehicular modes classification by using big data from smartphone sensors. The three types of sensors used in this paper include the accelerometer, magnetometer, and gyroscope. This study proposes improved features and uses three machine learning algorithms including decision trees, K-nearest neighbor, and support vector machine to classify the user’s transportation and vehicular modes. In the experiments, we discussed and compared the performance from different perspectives including the accuracy for both modes, the executive time, and the model size. Results show that the proposed features enhance the accuracy, in which the support vector machine provides the best performance in classification accuracy whereas it consumes the largest prediction time. This paper also investigates the vehicle classification mode and compares the results with that of the transportation modes.

  5. An ordinal classification approach for CTG categorization.

    Science.gov (United States)

    Georgoulas, George; Karvelis, Petros; Gavrilis, Dimitris; Stylios, Chrysostomos D; Nikolakopoulos, George

    2017-07-01

    Evaluation of cardiotocogram (CTG) is a standard approach employed during pregnancy and delivery. But, its interpretation requires high level expertise to decide whether the recording is Normal, Suspicious or Pathological. Therefore, a number of attempts have been carried out over the past three decades for development automated sophisticated systems. These systems are usually (multiclass) classification systems that assign a category to the respective CTG. However most of these systems usually do not take into consideration the natural ordering of the categories associated with CTG recordings. In this work, an algorithm that explicitly takes into consideration the ordering of CTG categories, based on binary decomposition method, is investigated. Achieved results, using as a base classifier the C4.5 decision tree classifier, prove that the ordinal classification approach is marginally better than the traditional multiclass classification approach, which utilizes the standard C4.5 algorithm for several performance criteria.

  6. Hyper-parameter tuning of a decision tree induction algorithm

    NARCIS (Netherlands)

    Mantovani, R.G.; Horváth, T.; Cerri, R.; Vanschoren, J.; de Carvalho, A.C.P.L.F.

    2017-01-01

    Supervised classification is the most studied task in Machine Learning. Among the many algorithms used in such task, Decision Tree algorithms are a popular choice, since they are robust and efficient to construct. Moreover, they have the advantage of producing comprehensible models and satisfactory

  7. Treelink: data integration, clustering and visualization of phylogenetic trees.

    Science.gov (United States)

    Allende, Christian; Sohn, Erik; Little, Cedric

    2015-12-29

    Phylogenetic trees are central to a wide range of biological studies. In many of these studies, tree nodes need to be associated with a variety of attributes. For example, in studies concerned with viral relationships, tree nodes are associated with epidemiological information, such as location, age and subtype. Gene trees used in comparative genomics are usually linked with taxonomic information, such as functional annotations and events. A wide variety of tree visualization and annotation tools have been developed in the past, however none of them are intended for an integrative and comparative analysis. Treelink is a platform-independent software for linking datasets and sequence files to phylogenetic trees. The application allows an automated integration of datasets to trees for operations such as classifying a tree based on a field or showing the distribution of selected data attributes in branches and leafs. Genomic and proteonomic sequences can also be linked to the tree and extracted from internal and external nodes. A novel clustering algorithm to simplify trees and display the most divergent clades was also developed, where validation can be achieved using the data integration and classification function. Integrated geographical information allows ancestral character reconstruction for phylogeographic plotting based on parsimony and likelihood algorithms. Our software can successfully integrate phylogenetic trees with different data sources, and perform operations to differentiate and visualize those differences within a tree. File support includes the most popular formats such as newick and csv. Exporting visualizations as images, cluster outputs and genomic sequences is supported. Treelink is available as a web and desktop application at http://www.treelinkapp.com .

  8. Deep learning for image classification

    Science.gov (United States)

    McCoppin, Ryan; Rizki, Mateen

    2014-06-01

    This paper provides an overview of deep learning and introduces the several subfields of deep learning including a specific tutorial of convolutional neural networks. Traditional methods for learning image features are compared to deep learning techniques. In addition, we present our preliminary classification results, our basic implementation of a convolutional restricted Boltzmann machine on the Mixed National Institute of Standards and Technology database (MNIST), and we explain how to use deep learning networks to assist in our development of a robust gender classification system.

  9. Recruiting Conventional Tree Architecture Models into State-of-the-Art LiDAR Mapping for Investigating Tree Growth Habits in Structure

    Directory of Open Access Journals (Sweden)

    Yi Lin

    2018-02-01

    Full Text Available Mensuration of tree growth habits is of considerable importance for understanding forest ecosystem processes and forest biophysical responses to climate changes. However, the complexity of tree crown morphology that is typically formed after many years of growth tends to render it a non-trivial task, even for the state-of-the-art 3D forest mapping technology—light detection and ranging (LiDAR. Fortunately, botanists have deduced the large structural diversity of tree forms into only a limited number of tree architecture models, which can present a-priori knowledge about tree structure, growth, and other attributes for different species. This study attempted to recruit Hallé architecture models (HAMs into LiDAR mapping to investigate tree growth habits in structure. First, following the HAM-characterized tree structure organization rules, we run the kernel procedure of tree species classification based on the LiDAR-collected point clouds using a support vector machine classifier in the leave-one-out-for-cross-validation mode. Then, the HAM corresponding to each of the classified tree species was identified based on expert knowledge, assisted by the comparison of the LiDAR-derived feature parameters. Next, the tree growth habits in structure for each of the tree species were derived from the determined HAM. In the case of four tree species growing in the boreal environment, the tests indicated that the classification accuracy reached 85.0%, and their growth habits could be derived by qualitative and quantitative means. Overall, the strategy of recruiting conventional HAMs into LiDAR mapping for investigating tree growth habits in structure was validated, thereby paving a new way for efficiently reflecting tree growth habits and projecting forest structure dynamics.

  10. Recruiting Conventional Tree Architecture Models into State-of-the-Art LiDAR Mapping for Investigating Tree Growth Habits in Structure.

    Science.gov (United States)

    Lin, Yi; Jiang, Miao; Pellikka, Petri; Heiskanen, Janne

    2018-01-01

    Mensuration of tree growth habits is of considerable importance for understanding forest ecosystem processes and forest biophysical responses to climate changes. However, the complexity of tree crown morphology that is typically formed after many years of growth tends to render it a non-trivial task, even for the state-of-the-art 3D forest mapping technology-light detection and ranging (LiDAR). Fortunately, botanists have deduced the large structural diversity of tree forms into only a limited number of tree architecture models, which can present a-priori knowledge about tree structure, growth, and other attributes for different species. This study attempted to recruit Hallé architecture models (HAMs) into LiDAR mapping to investigate tree growth habits in structure. First, following the HAM-characterized tree structure organization rules, we run the kernel procedure of tree species classification based on the LiDAR-collected point clouds using a support vector machine classifier in the leave-one-out-for-cross-validation mode. Then, the HAM corresponding to each of the classified tree species was identified based on expert knowledge, assisted by the comparison of the LiDAR-derived feature parameters. Next, the tree growth habits in structure for each of the tree species were derived from the determined HAM. In the case of four tree species growing in the boreal environment, the tests indicated that the classification accuracy reached 85.0%, and their growth habits could be derived by qualitative and quantitative means. Overall, the strategy of recruiting conventional HAMs into LiDAR mapping for investigating tree growth habits in structure was validated, thereby paving a new way for efficiently reflecting tree growth habits and projecting forest structure dynamics.

  11. A DIMENSION REDUCTION-BASED METHOD FOR CLASSIFICATION OF HYPERSPECTRAL AND LIDAR DATA

    Directory of Open Access Journals (Sweden)

    B. Abbasi

    2015-12-01

    Full Text Available The existence of various natural objects such as grass, trees, and rivers along with artificial manmade features such as buildings and roads, make it difficult to classify ground objects. Consequently using single data or simple classification approach cannot improve classification results in object identification. Also, using of a variety of data from different sensors; increase the accuracy of spatial and spectral information. In this paper, we proposed a classification algorithm on joint use of hyperspectral and Lidar (Light Detection and Ranging data based on dimension reduction. First, some feature extraction techniques are applied to achieve more information from Lidar and hyperspectral data. Also Principal component analysis (PCA and Minimum Noise Fraction (MNF have been utilized to reduce the dimension of spectral features. The number of 30 features containing the most information of the hyperspectral images is considered for both PCA and MNF. In addition, Normalized Difference Vegetation Index (NDVI has been measured to highlight the vegetation. Furthermore, the extracted features from Lidar data calculated based on relation between every pixel of data and surrounding pixels in local neighbourhood windows. The extracted features are based on the Grey Level Co-occurrence Matrix (GLCM matrix. In second step, classification is operated in all features which obtained by MNF, PCA, NDVI and GLCM and trained by class samples. After this step, two classification maps are obtained by SVM classifier with MNF+NDVI+GLCM features and PCA+NDVI+GLCM features, respectively. Finally, the classified images are fused together to create final classification map by decision fusion based majority voting strategy.

  12. Multiple endmember spectral-angle-mapper (SAM) analysis improves discrimination of Savanna tree species

    CSIR Research Space (South Africa)

    Cho, Moses A

    2009-08-01

    Full Text Available of this paper was to evaluate the classification performance of a multiple-endmember spectral angle mapper (SAM) classification approach in discriminating seven common African savanna tree species and to compare the results with the traditional SAM classifier...

  13. Are trees long-lived?

    Science.gov (United States)

    Kevin T. Smith

    2009-01-01

    Trees and tree care can capture the best of people's motivations and intentions. Trees are living memorials that help communities heal at sites of national tragedy, such as Oklahoma City and the World Trade Center. We mark the places of important historical events by the trees that grew nearby even if the original tree, such as the Charter Oak in Connecticut or...

  14. Transferability of decision trees for land cover classification in a ...

    African Journals Online (AJOL)

    GChandler

    heterogeneous, as geographical complexity can have a negative effect on the ... elevation, climate, environmental patterns and vegetation (Verhulp and Van Niekerk, 2016). .... As there were many clouds in scene 170/082, which could not be masked out, a cloud mask ..... terrain models and texture', Applied Geography, vol.

  15. Ensemble of randomized soft decision trees for robust classification

    Indian Academy of Sciences (India)

    G KISHOR KUMAR

    1Department of Information Technology, Rajeev Gandhi Memorial College of Engineering and Technology,. Nandyal ... Data Mining is a process which discovers knowledge from ... for many applications like diagnosis systems [23], video.

  16. Classification Trees and the Analysis of Helicopter Vibration Data

    National Research Council Canada - National Science Library

    Larson, Harold

    1997-01-01

    .... These systems monitor (and can record) various flight parameters, pilot conversations, engine exhaust debris, metallic chip detector levels in the lubrication system, rotor track and balance, as well as vibration levels at selected...

  17. Benefits of tree mixes in carbon plantings

    Science.gov (United States)

    Hulvey, Kristin B.; Hobbs, Richard J.; Standish, Rachel J.; Lindenmayer, David B.; Lach, Lori; Perring, Michael P.

    2013-10-01

    Increasingly governments and the private sector are using planted forests to offset carbon emissions. Few studies, however, examine how tree diversity -- defined here as species richness and/or stand composition -- affects carbon storage in these plantings. Using aboveground tree biomass as a proxy for carbon storage, we used meta-analysis to compare carbon storage in tree mixtures with monoculture plantings. Tree mixes stored at least as much carbon as monocultures consisting of the mixture's most productive species and at times outperformed monoculture plantings. In mixed-species stands, individual species, and in particular nitrogen-fixing trees, increased stand biomass. Further motivations for incorporating tree richness into planted forests include the contribution of diversity to total forest carbon-pool development, carbon-pool stability and the provision of extra ecosystem services. Our findings suggest a two-pronged strategy for designing carbon plantings including: (1) increased tree species richness; and (2) the addition of species that contribute to carbon storage and other target functions.

  18. The tree BVOC index

    Science.gov (United States)

    J.R. Simpson; E.G. McPherson

    2011-01-01

    Urban trees can produce a number of benefits, among them improved air quality. Biogenic volatile organic compounds (BVOCs) emitted by some species are ozone precursors. Modifying future tree planting to favor lower-emitting species can reduce these emissions and aid air management districts in meeting federally mandated emissions reductions for these compounds. Changes...

  19. Tree growth visualization

    Science.gov (United States)

    L. Linsen; B.J. Karis; E.G. McPherson; B. Hamann

    2005-01-01

    In computer graphics, models describing the fractal branching structure of trees typically exploit the modularity of tree structures. The models are based on local production rules, which are applied iteratively and simultaneously to create a complex branching system. The objective is to generate three-dimensional scenes of often many realistic- looking and non-...

  20. Flowering T Flowering Trees

    Indian Academy of Sciences (India)

    Adansonia digitata L. ( The Baobab Tree) of Bombacaceae is a tree with swollen trunk that attains a dia. of 10m. Leaves are digitately compound with leaflets up to 18cm. long. Flowers are large, solitary, waxy white, and open at dusk. They open in 30 seconds and are bat pollinated. Stamens are many. Fruit is about 30 cm ...

  1. Fault tree graphics

    International Nuclear Information System (INIS)

    Bass, L.; Wynholds, H.W.; Porterfield, W.R.

    1975-01-01

    Described is an operational system that enables the user, through an intelligent graphics terminal, to construct, modify, analyze, and store fault trees. With this system, complex engineering designs can be analyzed. This paper discusses the system and its capabilities. Included is a brief discussion of fault tree analysis, which represents an aspect of reliability and safety modeling

  2. Tree biology and dendrochemistry

    Science.gov (United States)

    Kevin T. Smith; Walter C. Shortle

    1996-01-01

    Dendrochemistry, the interpretation of elemental analysis of dated tree rings, can provide a temporal record of environmental change. Using the dendrochemical record requires an understanding of tree biology. In this review, we pose four questions concerning assumptions that underlie recent dendrochemical research: 1) Does the chemical composition of the wood directly...

  3. Individual tree control

    Science.gov (United States)

    Harvey A. Holt

    1989-01-01

    Controlling individual unwanted trees in forest stands is a readily accepted method for improving the value of future harvests. The practice is especially important in mixed hardwood forests where species differ considerably in value and within species individual trees differ in quality. Individual stem control is a mechanical or chemical weeding operation that...

  4. Trees and Climate Change

    OpenAIRE

    Dettenmaier, Megan; Kuhns, Michael; Unger, Bethany; McAvoy, Darren

    2017-01-01

    This fact sheet describes the complex relationship between forests and climate change based on current research. It explains ways that trees can mitigate some of the risks associated with climate change. It details the impacts that forests are having on the changing climate and discuss specific ways that trees can be used to reduce or counter carbon emissions directly and indirectly.

  5. Structural Equation Model Trees

    Science.gov (United States)

    Brandmaier, Andreas M.; von Oertzen, Timo; McArdle, John J.; Lindenberger, Ulman

    2013-01-01

    In the behavioral and social sciences, structural equation models (SEMs) have become widely accepted as a modeling tool for the relation between latent and observed variables. SEMs can be seen as a unification of several multivariate analysis techniques. SEM Trees combine the strengths of SEMs and the decision tree paradigm by building tree…

  6. Matching Subsequences in Trees

    DEFF Research Database (Denmark)

    Bille, Philip; Gørtz, Inge Li

    2009-01-01

    Given two rooted, labeled trees P and T the tree path subsequence problem is to determine which paths in P are subsequences of which paths in T. Here a path begins at the root and ends at a leaf. In this paper we propose this problem as a useful query primitive for XML data, and provide new...

  7. FT-Raman and chemometric tools for rapid determination of quality parameters in milk powder: Classification of samples for the presence of lactose and fraud detection by addition of maltodextrin.

    Science.gov (United States)

    Rodrigues Júnior, Paulo Henrique; de Sá Oliveira, Kamila; de Almeida, Carlos Eduardo Rocha; De Oliveira, Luiz Fernando Cappa; Stephani, Rodrigo; Pinto, Michele da Silva; de Carvalho, Antônio Fernandes; Perrone, Ítalo Tuler

    2016-04-01

    FT-Raman spectroscopy has been explored as a quick screening method to evaluate the presence of lactose and identify milk powder samples adulterated with maltodextrin (2.5-50% w/w). Raman measurements can easily differentiate samples of milk powder, without the need for sample preparation, while traditional quality control methods, including high performance liquid chromatography, are cumbersome and slow. FT-Raman spectra were obtained from samples of whole lactose and low-lactose milk powder, both without and with addition of maltodextrin. Differences were observed between the spectra involved in identifying samples with low lactose content, as well as adulterated samples. Exploratory data analysis using Raman spectroscopy and multivariate analysis was also developed to classify samples with PCA and PLS-DA. The PLS-DA models obtained allowed to correctly classify all samples. These results demonstrate the utility of FT-Raman spectroscopy in combination with chemometrics to infer about the quality of milk powder. Copyright © 2015 Elsevier Ltd. All rights reserved.

  8. Seafloor backscatter signal simulation and classification

    Digital Repository Service at National Institute of Oceanography (India)

    Mahale, V.; El Dine, W.G.; Chakraborty, B.

    . In this model a smooth echo envelope is generated then mixed up with multiplicative and additive noise. Several such echo signals were simulated for three types of seafloor. An Artificial Neural Network based classification technique is conceived to classify...

  9. Waste classification sampling plan

    International Nuclear Information System (INIS)

    Landsman, S.D.

    1998-01-01

    The purpose of this sampling is to explain the method used to collect and analyze data necessary to verify and/or determine the radionuclide content of the B-Cell decontamination and decommissioning waste stream so that the correct waste classification for the waste stream can be made, and to collect samples for studies of decontamination methods that could be used to remove fixed contamination present on the waste. The scope of this plan is to establish the technical basis for collecting samples and compiling quantitative data on the radioactive constituents present in waste generated during deactivation activities in B-Cell. Sampling and radioisotopic analysis will be performed on the fixed layers of contamination present on structural material and internal surfaces of process piping and tanks. In addition, dose rate measurements on existing waste material will be performed to determine the fraction of dose rate attributable to both removable and fixed contamination. Samples will also be collected to support studies of decontamination methods that are effective in removing the fixed contamination present on the waste. Sampling performed under this plan will meet criteria established in BNF-2596, Data Quality Objectives for the B-Cell Waste Stream Classification Sampling, J. M. Barnett, May 1998

  10. Environmental tritium in trees

    International Nuclear Information System (INIS)

    Brown, R.M.

    1979-01-01

    The distribution of environmental tritium in the free water and organically bound hydrogen of trees growing in the vicinity of the Chalk River Nuclear Laboratories (CRNL) has been studied. The regional dispersal of HTO in the atmosphere has been observed by surveying the tritium content of leaf moisture. Measurement of the distribution of organically bound tritium in the wood of tree ring sequences has given information on past concentrations of HTO taken up by trees growing in the CRNL Liquid Waste Disposal Area. For samples at background environmental levels, cellulose separation and analysis was done. The pattern of bomb tritium in precipitation of 1955-68 was observed to be preserved in the organically bound tritium of a tree ring sequence. Reactor tritium was discernible in a tree growing at a distance of 10 km from CRNL. These techniques provide convenient means of monitoring dispersal of HTO from nuclear facilities. (author)

  11. Generalising tree traversals and tree transformations to DAGs

    DEFF Research Database (Denmark)

    Bahr, Patrick; Axelsson, Emil

    2017-01-01

    We present a recursion scheme based on attribute grammars that can be transparently applied to trees and acyclic graphs. Our recursion scheme allows the programmer to implement a tree traversal or a tree transformation and then apply it to compact graph representations of trees instead. The resul......We present a recursion scheme based on attribute grammars that can be transparently applied to trees and acyclic graphs. Our recursion scheme allows the programmer to implement a tree traversal or a tree transformation and then apply it to compact graph representations of trees instead...... as the complementing theory with a number of examples....

  12. Asteroid taxonomic classifications

    International Nuclear Information System (INIS)

    Tholen, D.J.

    1989-01-01

    This paper reports on three taxonomic classification schemes developed and applied to the body of available color and albedo data. Asteroid taxonomic classifications according to two of these schemes are reproduced

  13. Land Cover and Land Use Classification with TWOPAC: towards Automated Processing for Pixel- and Object-Based Image Classification

    Directory of Open Access Journals (Sweden)

    Stefan Dech

    2012-09-01

    Full Text Available We present a novel and innovative automated processing environment for the derivation of land cover (LC and land use (LU information. This processing framework named TWOPAC (TWinned Object and Pixel based Automated classification Chain enables the standardized, independent, user-friendly, and comparable derivation of LC and LU information, with minimized manual classification labor. TWOPAC allows classification of multi-spectral and multi-temporal remote sensing imagery from different sensor types. TWOPAC enables not only pixel-based classification, but also allows classification based on object-based characteristics. Classification is based on a Decision Tree approach (DT for which the well-known C5.0 code has been implemented, which builds decision trees based on the concept of information entropy. TWOPAC enables automatic generation of the decision tree classifier based on a C5.0-retrieved ascii-file, as well as fully automatic validation of the classification output via sample based accuracy assessment.Envisaging the automated generation of standardized land cover products, as well as area-wide classification of large amounts of data in preferably a short processing time, standardized interfaces for process control, Web Processing Services (WPS, as introduced by the Open Geospatial Consortium (OGC, are utilized. TWOPAC’s functionality to process geospatial raster or vector data via web resources (server, network enables TWOPAC’s usability independent of any commercial client or desktop software and allows for large scale data processing on servers. Furthermore, the components of TWOPAC were built-up using open source code components and are implemented as a plug-in for Quantum GIS software for easy handling of the classification process from the user’s perspective.

  14. Hand eczema classification

    DEFF Research Database (Denmark)

    Diepgen, T L; Andersen, Klaus Ejner; Brandao, F M

    2008-01-01

    of the disease is rarely evidence based, and a classification system for different subdiagnoses of hand eczema is not agreed upon. Randomized controlled trials investigating the treatment of hand eczema are called for. For this, as well as for clinical purposes, a generally accepted classification system...... A classification system for hand eczema is proposed. Conclusions It is suggested that this classification be used in clinical work and in clinical trials....

  15. Posbist fault tree analysis of coherent systems

    International Nuclear Information System (INIS)

    Huang, H.-Z.; Tong Xin; Zuo, Ming J.

    2004-01-01

    When the failure probability of a system is extremely small or necessary statistical data from the system is scarce, it is very difficult or impossible to evaluate its reliability and safety with conventional fault tree analysis (FTA) techniques. New techniques are needed to predict and diagnose such a system's failures and evaluate its reliability and safety. In this paper, we first provide a concise overview of FTA. Then, based on the posbist reliability theory, event failure behavior is characterized in the context of possibility measures and the structure function of the posbist fault tree of a coherent system is defined. In addition, we define the AND operator and the OR operator based on the minimal cut of a posbist fault tree. Finally, a model of posbist fault tree analysis (posbist FTA) of coherent systems is presented. The use of the model for quantitative analysis is demonstrated with a real-life safety system

  16. Fault tree analysis of a research reactor

    International Nuclear Information System (INIS)

    Hall, J.A.; O'Dacre, D.F.; Chenier, R.J.; Arbique, G.M.

    1986-08-01

    Fault Tree Analysis Techniques have been used to assess the safety system of the ZED-2 Research Reactor at the Chalk River Nuclear Laboratories. This turned out to be a strong test of the techniques involved. The resulting fault tree was large and because of inter-links in the system structure the tree was not modularized. In addition, comprehensive documentation was required. After a brief overview of the reactor and the analysis, this paper concentrates on the computer tools that made the job work. Two types of tools were needed; text editing and forms management capability for large volumes of component and system data, and the fault tree codes themselves. The solutions (and failures) are discussed along with the tools we are already developing for the next analysis

  17. Classification with support hyperplanes

    NARCIS (Netherlands)

    G.I. Nalbantov (Georgi); J.C. Bioch (Cor); P.J.F. Groenen (Patrick)

    2006-01-01

    textabstractA new classification method is proposed, called Support Hy- perplanes (SHs). To solve the binary classification task, SHs consider the set of all hyperplanes that do not make classification mistakes, referred to as semi-consistent hyperplanes. A test object is classified using

  18. Standard classification: Physics

    International Nuclear Information System (INIS)

    1977-01-01

    This is a draft standard classification of physics. The conception is based on the physics part of the systematic catalogue of the Bayerische Staatsbibliothek and on the classification given in standard textbooks. The ICSU-AB classification now used worldwide by physics information services was not taken into account. (BJ) [de

  19. Analyzing and synthesizing phylogenies using tree alignment graphs.

    Directory of Open Access Journals (Sweden)

    Stephen A Smith

    Full Text Available Phylogenetic trees are used to analyze and visualize evolution. However, trees can be imperfect datatypes when summarizing multiple trees. This is especially problematic when accommodating for biological phenomena such as horizontal gene transfer, incomplete lineage sorting, and hybridization, as well as topological conflict between datasets. Additionally, researchers may want to combine information from sets of trees that have partially overlapping taxon sets. To address the problem of analyzing sets of trees with conflicting relationships and partially overlapping taxon sets, we introduce methods for aligning, synthesizing and analyzing rooted phylogenetic trees within a graph, called a tree alignment graph (TAG. The TAG can be queried and analyzed to explore uncertainty and conflict. It can also be synthesized to construct trees, presenting an alternative to supertrees approaches. We demonstrate these methods with two empirical datasets. In order to explore uncertainty, we constructed a TAG of the bootstrap trees from the Angiosperm Tree of Life project. Analysis of the resulting graph demonstrates that areas of the dataset that are unresolved in majority-rule consensus tree analyses can be understood in more detail within the context of a graph structure, using measures incorporating node degree and adjacency support. As an exercise in synthesis (i.e., summarization of a TAG constructed from the alignment trees, we also construct a TAG consisting of the taxonomy and source trees from a recent comprehensive bird study. We synthesized this graph into a tree that can be reconstructed in a repeatable fashion and where the underlying source information can be updated. The methods presented here are tractable for large scale analyses and serve as a basis for an alternative to consensus tree and supertree methods. Furthermore, the exploration of these graphs can expose structures and patterns within the dataset that are otherwise difficult to

  20. Analyzing and synthesizing phylogenies using tree alignment graphs.

    Science.gov (United States)

    Smith, Stephen A; Brown, Joseph W; Hinchliff, Cody E

    2013-01-01

    Phylogenetic trees are used to analyze and visualize evolution. However, trees can be imperfect datatypes when summarizing multiple trees. This is especially problematic when accommodating for biological phenomena such as horizontal gene transfer, incomplete lineage sorting, and hybridization, as well as topological conflict between datasets. Additionally, researchers may want to combine information from sets of trees that have partially overlapping taxon sets. To address the problem of analyzing sets of trees with conflicting relationships and partially overlapping taxon sets, we introduce methods for aligning, synthesizing and analyzing rooted phylogenetic trees within a graph, called a tree alignment graph (TAG). The TAG can be queried and analyzed to explore uncertainty and conflict. It can also be synthesized to construct trees, presenting an alternative to supertrees approaches. We demonstrate these methods with two empirical datasets. In order to explore uncertainty, we constructed a TAG of the bootstrap trees from the Angiosperm Tree of Life project. Analysis of the resulting graph demonstrates that areas of the dataset that are unresolved in majority-rule consensus tree analyses can be understood in more detail within the context of a graph structure, using measures incorporating node degree and adjacency support. As an exercise in synthesis (i.e., summarization of a TAG constructed from the alignment trees), we also construct a TAG consisting of the taxonomy and source trees from a recent comprehensive bird study. We synthesized this graph into a tree that can be reconstructed in a repeatable fashion and where the underlying source information can be updated. The methods presented here are tractable for large scale analyses and serve as a basis for an alternative to consensus tree and supertree methods. Furthermore, the exploration of these graphs can expose structures and patterns within the dataset that are otherwise difficult to observe.

  1. An ecological classification system for the central hardwoods region: The Hoosier National Forest

    Science.gov (United States)

    James E. Van Kley; George R. Parker

    1993-01-01

    This study, a multifactor ecological classification system, using vegetation, soil characteristics, and physiography, was developed for the landscape of the Hoosier National Forest in Southern Indiana. Measurements of ground flora, saplings, and canopy trees from selected stands older than 80 years were subjected to TWINSPAN classification and DECORANA ordination....

  2. Up in the tree--the overlooked richness of bryophytes and lichens in tree crowns.

    Directory of Open Access Journals (Sweden)

    Steffen Boch

    Full Text Available Assessing diversity is among the major tasks in ecology and conservation science. In ecological and conservation studies, epiphytic cryptogams are usually sampled up to accessible heights in forests. Thus, their diversity, especially of canopy specialists, likely is underestimated. If the proportion of those species differs among forest types, plot-based diversity assessments are biased and may result in misleading conservation recommendations. We sampled bryophytes and lichens in 30 forest plots of 20 m × 20 m in three German regions, considering all substrates, and including epiphytic litter fall. First, the sampling of epiphytic species was restricted to the lower 2 m of trees and shrubs. Then, on one representative tree per plot, we additionally recorded epiphytic species in the crown, using tree climbing techniques. Per tree, on average 54% of lichen and 20% of bryophyte species were overlooked if the crown was not been included. After sampling all substrates per plot, including the bark of all shrubs and trees, still 38% of the lichen and 4% of the bryophyte species were overlooked if the tree crown of the sampled tree was not included. The number of overlooked lichen species varied strongly among regions. Furthermore, the number of overlooked bryophyte and lichen species per plot was higher in European beech than in coniferous stands and increased with increasing diameter at breast height of the sampled tree. Thus, our results indicate a bias of comparative studies which might have led to misleading conservation recommendations of plot-based diversity assessments.

  3. Up in the Tree – The Overlooked Richness of Bryophytes and Lichens in Tree Crowns

    Science.gov (United States)

    Boch, Steffen; Müller, Jörg; Prati, Daniel; Blaser, Stefan; Fischer, Markus

    2013-01-01

    Assessing diversity is among the major tasks in ecology and conservation science. In ecological and conservation studies, epiphytic cryptogams are usually sampled up to accessible heights in forests. Thus, their diversity, especially of canopy specialists, likely is underestimated. If the proportion of those species differs among forest types, plot-based diversity assessments are biased and may result in misleading conservation recommendations. We sampled bryophytes and lichens in 30 forest plots of 20 m × 20 m in three German regions, considering all substrates, and including epiphytic litter fall. First, the sampling of epiphytic species was restricted to the lower 2 m of trees and shrubs. Then, on one representative tree per plot, we additionally recorded epiphytic species in the crown, using tree climbing techniques. Per tree, on average 54% of lichen and 20% of bryophyte species were overlooked if the crown was not been included. After sampling all substrates per plot, including the bark of all shrubs and trees, still 38% of the lichen and 4% of the bryophyte species were overlooked if the tree crown of the sampled tree was not included. The number of overlooked lichen species varied strongly among regions. Furthermore, the number of overlooked bryophyte and lichen species per plot was higher in European beech than in coniferous stands and increased with increasing diameter at breast height of the sampled tree. Thus, our results indicate a bias of comparative studies which might have led to misleading conservation recommendations of plot-based diversity assessments. PMID:24358373

  4. Up in the tree--the overlooked richness of bryophytes and lichens in tree crowns.

    Science.gov (United States)

    Boch, Steffen; Müller, Jörg; Prati, Daniel; Blaser, Stefan; Fischer, Markus

    2013-01-01

    Assessing diversity is among the major tasks in ecology and conservation science. In ecological and conservation studies, epiphytic cryptogams are usually sampled up to accessible heights in forests. Thus, their diversity, especially of canopy specialists, likely is underestimated. If the proportion of those species differs among forest types, plot-based diversity assessments are biased and may result in misleading conservation recommendations. We sampled bryophytes and lichens in 30 forest plots of 20 m × 20 m in three German regions, considering all substrates, and including epiphytic litter fall. First, the sampling of epiphytic species was restricted to the lower 2 m of trees and shrubs. Then, on one representative tree per plot, we additionally recorded epiphytic species in the crown, using tree climbing techniques. Per tree, on average 54% of lichen and 20% of bryophyte species were overlooked if the crown was not been included. After sampling all substrates per plot, including the bark of all shrubs and trees, still 38% of the lichen and 4% of the bryophyte species were overlooked if the tree crown of the sampled tree was not included. The number of overlooked lichen species varied strongly among regions. Furthermore, the number of overlooked bryophyte and lichen species per plot was higher in European beech than in coniferous stands and increased with increasing diameter at breast height of the sampled tree. Thus, our results indicate a bias of comparative studies which might have led to misleading conservation recommendations of plot-based diversity assessments.

  5. Phylogenetic tree reconstruction accuracy and model fit when proportions of variable sites change across the tree.

    Science.gov (United States)

    Shavit Grievink, Liat; Penny, David; Hendy, Michael D; Holland, Barbara R

    2010-05-01

    Commonly used phylogenetic models assume a homogeneous process through time in all parts of the tree. However, it is known that these models can be too simplistic as they do not account for nonhomogeneous lineage-specific properties. In particular, it is now widely recognized that as constraints on sequences evolve, the proportion and positions of variable sites can vary between lineages causing heterotachy. The extent to which this model misspecification affects tree reconstruction is still unknown. Here, we evaluate the effect of changes in the proportions and positions of variable sites on model fit and tree estimation. We consider 5 current models of nucleotide sequence evolution in a Bayesian Markov chain Monte Carlo framework as well as maximum parsimony (MP). We show that for a tree with 4 lineages where 2 nonsister taxa undergo a change in the proportion of variable sites tree reconstruction under the best-fitting model, which is chosen using a relative test, often results in the wrong tree. In this case, we found that an absolute test of model fit is a better predictor of tree estimation accuracy. We also found further evidence that MP is not immune to heterotachy. In addition, we show that increased sampling of taxa that have undergone a change in proportion and positions of variable sites is critical for accurate tree reconstruction.

  6. A recursive algorithm for trees and forests

    OpenAIRE

    Guo, Song; Guo, Victor J. W.

    2017-01-01

    Trees or rooted trees have been generously studied in the literature. A forest is a set of trees or rooted trees. Here we give recurrence relations between the number of some kind of rooted forest with $k$ roots and that with $k+1$ roots on $\\{1,2,\\ldots,n\\}$. Classical formulas for counting various trees such as rooted trees, bipartite trees, tripartite trees, plane trees, $k$-ary plane trees, $k$-edge colored trees follow immediately from our recursive relations.

  7. Skewed Binary Search Trees

    DEFF Research Database (Denmark)

    Brodal, Gerth Stølting; Moruz, Gabriel

    2006-01-01

    It is well-known that to minimize the number of comparisons a binary search tree should be perfectly balanced. Previous work has shown that a dominating factor over the running time for a search is the number of cache faults performed, and that an appropriate memory layout of a binary search tree...... can reduce the number of cache faults by several hundred percent. Motivated by the fact that during a search branching to the left or right at a node does not necessarily have the same cost, e.g. because of branch prediction schemes, we in this paper study the class of skewed binary search trees....... For all nodes in a skewed binary search tree the ratio between the size of the left subtree and the size of the tree is a fixed constant (a ratio of 1/2 gives perfect balanced trees). In this paper we present an experimental study of various memory layouts of static skewed binary search trees, where each...

  8. Beyond Tree Throw: Wind, Water, Rock and the Mechanics of Tree-Driven Bedrock Physical Weathering

    Science.gov (United States)

    Marshall, J. A.; Anderson, R. S.; Dawson, T. E.; Dietrich, W. E.; Minear, J. T.

    2017-12-01

    Tree throw is often invoked as the dominant process in converting bedrock to soil and thus helping to build the Critical Zone (CZ). In addition, observations of tree roots lifting sidewalk slabs, occupying cracks, and prying slabs of rock from cliff faces have led to a general belief in the power of plant growth forces. These common observations have led to conceptual models with trees at the center of the soil genesis process. This is despite the observation that tree throw is rare in many forested settings, and a dearth of field measurements that quantify the magnitude of growth forces. While few trees blow down, every tree grows roots, inserting many tens of percent of its mass below ground. Yet we lack data quantifying the role of trees in both damaging bedrock and detaching it (and thus producing soil). By combing force measurements at the tree-bedrock interface with precipitation, solar radiation, wind speed, and wind-driven tree sway data we quantified the magnitude and frequency of tree-driven soil-production mechanisms from two contrasting climatic and lithologic regimes (Boulder and Eel Creek CZ Observatories). Preliminary data suggests that in settings with relatively thin soils, trees can damage and detach rock due to diurnal fluctuations, wind response and rainfall events. Surprisingly, our data suggests that forces from roots and trunks growing against bedrock are insufficient to pry rock apart or damage bedrock although much more work is needed in this area. The frequency, magnitude and style of wind-driven tree forces at the bedrock interface varies considerably from one to another species. This suggests that tree properties such as mass, elasticity, stiffness and branch structure determine whether trees respond to gusts big or small, move at the same frequency as large wind gusts, or are able to self-dampen near-ground sway response to extended wind forces. Our measurements of precipitation-driven and daily fluctuations in root pressures exerted on

  9. Classification of refrigerants; Classification des fluides frigorigenes

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2001-07-01

    This document was made from the US standard ANSI/ASHRAE 34 published in 2001 and entitled 'designation and safety classification of refrigerants'. This classification allows to clearly organize in an international way the overall refrigerants used in the world thanks to a codification of the refrigerants in correspondence with their chemical composition. This note explains this codification: prefix, suffixes (hydrocarbons and derived fluids, azeotropic and non-azeotropic mixtures, various organic compounds, non-organic compounds), safety classification (toxicity, flammability, case of mixtures). (J.S.)

  10. Monitoring Urban Tree Cover Using Object-Based Image Analysis and Public Domain Remotely Sensed Data

    Directory of Open Access Journals (Sweden)

    Meghan Halabisky

    2011-10-01

    Full Text Available Urban forest ecosystems provide a range of social and ecological services, but due to the heterogeneity of these canopies their spatial extent is difficult to quantify and monitor. Traditional per-pixel classification methods have been used to map urban canopies, however, such techniques are not generally appropriate for assessing these highly variable landscapes. Landsat imagery has historically been used for per-pixel driven land use/land cover (LULC classifications, but the spatial resolution limits our ability to map small urban features. In such cases, hyperspatial resolution imagery such as aerial or satellite imagery with a resolution of 1 meter or below is preferred. Object-based image analysis (OBIA allows for use of additional variables such as texture, shape, context, and other cognitive information provided by the image analyst to segment and classify image features, and thus, improve classifications. As part of this research we created LULC classifications for a pilot study area in Seattle, WA, USA, using OBIA techniques and freely available public aerial photography. We analyzed the differences in accuracies which can be achieved with OBIA using multispectral and true-color imagery. We also compared our results to a satellite based OBIA LULC and discussed the implications of per-pixel driven vs. OBIA-driven field sampling campaigns. We demonstrated that the OBIA approach can generate good and repeatable LULC classifications suitable for tree cover assessment in urban areas. Another important finding is that spectral content appeared to be more important than spatial detail of hyperspatial data when it comes to an OBIA-driven LULC.

  11. On the Accuracy of Language Trees

    Science.gov (United States)

    Pompei, Simone; Loreto, Vittorio; Tria, Francesca

    2011-01-01

    Historical linguistics aims at inferring the most likely language phylogenetic tree starting from information concerning the evolutionary relatedness of languages. The available information are typically lists of homologous (lexical, phonological, syntactic) features or characters for many different languages: a set of parallel corpora whose compilation represents a paramount achievement in linguistics. From this perspective the reconstruction of language trees is an example of inverse problems: starting from present, incomplete and often noisy, information, one aims at inferring the most likely past evolutionary history. A fundamental issue in inverse problems is the evaluation of the inference made. A standard way of dealing with this question is to generate data with artificial models in order to have full access to the evolutionary process one is going to infer. This procedure presents an intrinsic limitation: when dealing with real data sets, one typically does not know which model of evolution is the most suitable for them. A possible way out is to compare algorithmic inference with expert classifications. This is the point of view we take here by conducting a thorough survey of the accuracy of reconstruction methods as compared with the Ethnologue expert classifications. We focus in particular on state-of-the-art distance-based methods for phylogeny reconstruction using worldwide linguistic databases. In order to assess the accuracy of the inferred trees we introduce and characterize two generalizations of standard definitions of distances between trees. Based on these scores we quantify the relative performances of the distance-based algorithms considered. Further we quantify how the completeness and the coverage of the available databases affect the accuracy of the reconstruction. Finally we draw some conclusions about where the accuracy of the reconstructions in historical linguistics stands and about the leading directions to improve it. PMID:21674034

  12. On the accuracy of language trees.

    Directory of Open Access Journals (Sweden)

    Simone Pompei

    Full Text Available Historical linguistics aims at inferring the most likely language phylogenetic tree starting from information concerning the evolutionary relatedness of languages. The available information are typically lists of homologous (lexical, phonological, syntactic features or characters for many different languages: a set of parallel corpora whose compilation represents a paramount achievement in linguistics. From this perspective the reconstruction of language trees is an example of inverse problems: starting from present, incomplete and often noisy, information, one aims at inferring the most likely past evolutionary history. A fundamental issue in inverse problems is the evaluation of the inference made. A standard way of dealing with this question is to generate data with artificial models in order to have full access to the evolutionary process one is going to infer. This procedure presents an intrinsic limitation: when dealing with real data sets, one typically does not know which model of evolution is the most suitable for them. A possible way out is to compare algorithmic inference with expert classifications. This is the point of view we take here by conducting a thorough survey of the accuracy of reconstruction methods as compared with the Ethnologue expert classifications. We focus in particular on state-of-the-art distance-based methods for phylogeny reconstruction using worldwide linguistic databases. In order to assess the accuracy of the inferred trees we introduce and characterize two generalizations of standard definitions of distances between trees. Based on these scores we quantify the relative performances of the distance-based algorithms considered. Further we quantify how the completeness and the coverage of the available databases affect the accuracy of the reconstruction. Finally we draw some conclusions about where the accuracy of the reconstructions in historical linguistics stands and about the leading directions to improve

  13. Handling Imbalanced Data Sets in Multistage Classification

    Science.gov (United States)

    López, M.

    Multistage classification is a logical approach, based on a divide-and-conquer solution, for dealing with problems with a high number of classes. The classification problem is divided into several sequential steps, each one associated to a single classifier that works with subgroups of the original classes. In each level, the current set of classes is split into smaller subgroups of classes until they (the subgroups) are composed of only one class. The resulting chain of classifiers can be represented as a tree, which (1) simplifies the classification process by using fewer categories in each classifier and (2) makes it possible to combine several algorithms or use different attributes in each stage. Most of the classification algorithms can be biased in the sense of selecting the most populated class in overlapping areas of the input space. This can degrade a multistage classifier performance if the training set sample frequencies do not reflect the real prevalence in the population. Several techniques such as applying prior probabilities, assigning weights to the classes, or replicating instances have been developed to overcome this handicap. Most of them are designed for two-class (accept-reject) problems. In this article, we evaluate several of these techniques as applied to multistage classification and analyze how they can be useful for astronomy. We compare the results obtained by classifying a data set based on Hipparcos with and without these methods.

  14. Voice based gender classification using machine learning

    Science.gov (United States)

    Raahul, A.; Sapthagiri, R.; Pankaj, K.; Vijayarajan, V.

    2017-11-01

    Gender identification is one of the major problem speech analysis today. Tracing the gender from acoustic data i.e., pitch, median, frequency etc. Machine learning gives promising results for classification problem in all the research domains. There are several performance metrics to evaluate algorithms of an area. Our Comparative model algorithm for evaluating 5 different machine learning algorithms based on eight different metrics in gender classification from acoustic data. Agenda is to identify gender, with five different algorithms: Linear Discriminant Analysis (LDA), K-Nearest Neighbour (KNN), Classification and Regression Trees (CART), Random Forest (RF), and Support Vector Machine (SVM) on basis of eight different metrics. The main parameter in evaluating any algorithms is its performance. Misclassification rate must be less in classification problems, which says that the accuracy rate must be high. Location and gender of the person have become very crucial in economic markets in the form of AdSense. Here with this comparative model algorithm, we are trying to assess the different ML algorithms and find the best fit for gender classification of acoustic data.

  15. Black smokers and the Tree of Life

    Science.gov (United States)

    Linich, Michael

    The molecular biology revolution has turned the classification of life on its head. Is Whittaker's five-kingdom scheme for the classification of living things no longer relevant to life science education? Coupled with this is the discovery that most microscopic life cannot yet be brought into culture. One of the key organisms making this knowledge possible is Methanococcus jannishi a microorganism found in black smokers. This workshop presents the development of the Universal Tree of Life in a historical context and then links together major concepts in the New South Wales senior science programs of Earth and Environmental Science and Biology by examining the biological and geological aspects of changes to black smokers over geological time.

  16. Visualizing phylogenetic tree landscapes.

    Science.gov (United States)

    Wilgenbusch, James C; Huang, Wen; Gallivan, Kyle A

    2017-02-02

    Genomic-scale sequence alignments are increasingly used to infer phylogenies in order to better understand the processes and patterns of evolution. Different partitions within these new alignments (e.g., genes, codon positions, and structural features) often favor hundreds if not thousands of competing phylogenies. Summarizing and comparing phylogenies obtained from multi-source data sets using current consensus tree methods discards valuable information and can disguise potential methodological problems. Discovery of efficient and accurate dimensionality reduction methods used to display at once in 2- or 3- dimensions the relationship among these competing phylogenies will help practitioners diagnose the limits of current evolutionary models and potential problems with phylogenetic reconstruction methods when analyzing large multi-source data sets. We introduce several dimensionality reduction methods to visualize in 2- and 3-dimensions the relationship among competing phylogenies obtained from gene partitions found in three mid- to large-size mitochondrial genome alignments. We test the performance of these dimensionality reduction methods by applying several goodness-of-fit measures. The intrinsic dimensionality of each data set is also estimated to determine whether projections in 2- and 3-dimensions can be expected to reveal meaningful relationships among trees from different data partitions. Several new approaches to aid in the comparison of different phylogenetic landscapes are presented. Curvilinear Components Analysis (CCA) and a stochastic gradient decent (SGD) optimization method give the best representation of the original tree-to-tree distance matrix for each of the three- mitochondrial genome alignments and greatly outperformed the method currently used to visualize tree landscapes. The CCA + SGD method converged at least as fast as previously applied methods for visualizing tree landscapes. We demonstrate for all three mtDNA alignments that 3D

  17. Carbon-14 in tree rings

    International Nuclear Information System (INIS)

    Cain, W.F.; Suess, H.E.

    1976-01-01

    In order to investigate how reliably the carbon 14 content of tree rings reflects that of atmospheric carbon dioxide, two types of determinations were carried out: (1) carbon 14 determinations in annual rings from the beginning of this century until 1974 and (2) carbon 14 determinations in synchronous wood from the North American bristlecone pine and from European oak trees, dendrochronologically dated to have grown in the third and fourth century B.C. The first series of measurements showed that bomb-produced radiocarbon was incorporated in wood at a time when it was converted from sapwood to heartwood, whenever radiocarbon from bomb testing was present in the atmosphere. The second series showed that wood more than 2000 years old and grown on two different continents at different altitudes had, within the limits of experimental error, the same radiocarbon content. This work and other experimental evidence, obtained in part by other laboratories, show that tree rings reflect the average radiocarbon content of global atmospheric carbon dioxide accurately within several parts per mil. In rare cases, deviations of up to 10 parts per thousand may be possible. This means that a typical single radiocarbon date for wood or charcoal possesses an intrinsic uncertainty (viz., an estimated ''one-sigma error'' in addition to all the other errors) of the order of +-50 years. This intrinsic uncertainty is independent of the absolute age of the sample. More accurate dates can, in principle, be obtained by the so-called method of ''wiggle matching.''

  18. Classification of Internet banking customers using data mining algorithms

    Directory of Open Access Journals (Sweden)

    Reza Radfar

    2014-03-01

    Full Text Available Classifying customers using data mining algorithms, enables banks to keep old customers loyality while attracting new ones. Using decision tree as a data mining technique, we can optimize customer classification provided that the appropriate decision tree is selected. In this article we have presented an appropriate model to classify customers who use internet banking service. The model is developed based on CRISP-DM standard and we have used real data of Sina bank’s Internet bank. In compare to other decision trees, ours is based on both optimization and accuracy factors that recognizes new potential internet banking customers using a three level classification, which is low/medium and high. This is a practical, documentary-based research. Mining customer rules enables managers to make policies based on found out patterns in order to have a better perception of what customers really desire.

  19. Classification, disease, and diagnosis.

    Science.gov (United States)

    Jutel, Annemarie

    2011-01-01

    Classification shapes medicine and guides its practice. Understanding classification must be part of the quest to better understand the social context and implications of diagnosis. Classifications are part of the human work that provides a foundation for the recognition and study of illness: deciding how the vast expanse of nature can be partitioned into meaningful chunks, stabilizing and structuring what is otherwise disordered. This article explores the aims of classification, their embodiment in medical diagnosis, and the historical traditions of medical classification. It provides a brief overview of the aims and principles of classification and their relevance to contemporary medicine. It also demonstrates how classifications operate as social framing devices that enable and disable communication, assert and refute authority, and are important items for sociological study.

  20. Tree-growth analyses to estimate tree species' drought tolerance

    NARCIS (Netherlands)

    Eilmann, B.; Rigling, A.

    2012-01-01

    Climate change is challenging forestry management and practices. Among other things, tree species with the ability to cope with more extreme climate conditions have to be identified. However, while environmental factors may severely limit tree growth or even cause tree death, assessing a tree

  1. Big trees, old trees, and growth factor tables

    Science.gov (United States)

    Kevin T. Smith

    2018-01-01

    The potential for a tree to reach a great size and to live a long life frequently captures the public's imagination. Sometimes the desire to know the age of an impressively large tree is simple curiosity. For others, the date-of-tree establishment can make a big diff erence for management, particularly for trees at historic sites or those mentioned in property...

  2. A bijection between phylogenetic trees and plane oriented recursive trees

    OpenAIRE

    Prodinger, Helmut

    2017-01-01

    Phylogenetic trees are binary nonplanar trees with labelled leaves, and plane oriented recursive trees are planar trees with an increasing labelling. Both families are enumerated by double factorials. A bijection is constructed, using the respective representations a 2-partitions and trapezoidal words.

  3. A Suffix Tree Or Not a Suffix Tree?

    DEFF Research Database (Denmark)

    Starikovskaya, Tatiana; Vildhøj, Hjalte Wedel

    2015-01-01

    In this paper we study the structure of suffix trees. Given an unlabeled tree r on n nodes and suffix links of its internal nodes, we ask the question “Is r a suffix tree?”, i.e., is there a string S whose suffix tree has the same topological structure as r? We place no restrictions on S, in part...

  4. A Framework For Analyzing And Mitigating The Vulnerabilities Of Complex Systems Via Attack And Protection Trees

    National Research Council Canada - National Science Library

    Edge, Kenneth S

    2007-01-01

    .... In addition to developing protection trees, this research improves the existing concept of attack trees and develops rule sets for the manipulation of metrics used in the security of complex systems...

  5. NLCD 2001 - Tree Canopy

    Data.gov (United States)

    Minnesota Department of Natural Resources — The National Land Cover Database 2001 tree canopy layer for Minnesota (mapping zones 39-42, 50-51) was produced through a cooperative project conducted by the...

  6. Trees for future forests

    DEFF Research Database (Denmark)

    Lobo, Albin

    Climate change creates new challenges in forest management. The increase in temperature may in the long run be beneficial for the forests in the northern latitudes, but the high rate at which climate change is predicted to proceed will make adaptation difficult because trees are long living sessile...... organisms. The aim of the present thesis is therefore to explore genetic resilience and phenotypic plasticity mechanisms that allows trees to adapt and evolve with changing climates. The thesis focus on the abiotic factors associated with climate change, especially raised temperatures and lack...... age of these tree species and the uncertainty around the pace and effect of climate, it remains an open question if the native populations can respond fast enough. Phenotypic plasticity through epigenetic regulation of spring phenology is found to be present in a tree species which might act...

  7. Value tree analysis

    International Nuclear Information System (INIS)

    Keeney, R.; Renn, O.; Winterfeldt, D. von; Kotte, U.

    1985-01-01

    What are the targets and criteria on which national energy policy should be based. What priorities should be set, and how can different social interests be matched. To answer these questions, a new instrument of decision theory is presented which has been applied with good results to controversial political issues in the USA. The new technique is known under the name of value tree analysis. Members of important West German organisations (BDI, VDI, RWE, the Catholic and Protestant Church, Deutscher Naturschutzring, and ecological research institutions) were asked about the goals of their organisations. These goals were then ordered systematically and arranged in a hierarchical tree structure. The value trees of different groups can be combined into a catalogue of social criteria of acceptability and policy assessment. The authors describe the philosophy and methodology of value tree analysis and give an outline of its application in the development of a socially acceptable energy policy. (orig.) [de

  8. PHOTOMETRIC SUPERNOVA CLASSIFICATION WITH MACHINE LEARNING

    International Nuclear Information System (INIS)

    Lochner, Michelle; Peiris, Hiranya V.; Lahav, Ofer; Winter, Max K.; McEwen, Jason D.

    2016-01-01

    Automated photometric supernova classification has become an active area of research in recent years in light of current and upcoming imaging surveys such as the Dark Energy Survey (DES) and the Large Synoptic Survey Telescope, given that spectroscopic confirmation of type for all supernovae discovered will be impossible. Here, we develop a multi-faceted classification pipeline, combining existing and new approaches. Our pipeline consists of two stages: extracting descriptive features from the light curves and classification using a machine learning algorithm. Our feature extraction methods vary from model-dependent techniques, namely SALT2 fits, to more independent techniques that fit parametric models to curves, to a completely model-independent wavelet approach. We cover a range of representative machine learning algorithms, including naive Bayes, k -nearest neighbors, support vector machines, artificial neural networks, and boosted decision trees (BDTs). We test the pipeline on simulated multi-band DES light curves from the Supernova Photometric Classification Challenge. Using the commonly used area under the curve (AUC) of the Receiver Operating Characteristic as a metric, we find that the SALT2 fits and the wavelet approach, with the BDTs algorithm, each achieve an AUC of 0.98, where 1 represents perfect classification. We find that a representative training set is essential for good classification, whatever the feature set or algorithm, with implications for spectroscopic follow-up. Importantly, we find that by using either the SALT2 or the wavelet feature sets with a BDT algorithm, accurate classification is possible purely from light curve data, without the need for any redshift information.

  9. PHOTOMETRIC SUPERNOVA CLASSIFICATION WITH MACHINE LEARNING

    Energy Technology Data Exchange (ETDEWEB)

    Lochner, Michelle; Peiris, Hiranya V.; Lahav, Ofer; Winter, Max K. [Department of Physics and Astronomy, University College London, Gower Street, London WC1E 6BT (United Kingdom); McEwen, Jason D., E-mail: dr.michelle.lochner@gmail.com [Mullard Space Science Laboratory, University College London, Surrey RH5 6NT (United Kingdom)

    2016-08-01

    Automated photometric supernova classification has become an active area of research in recent years in light of current and upcoming imaging surveys such as the Dark Energy Survey (DES) and the Large Synoptic Survey Telescope, given that spectroscopic confirmation of type for all supernovae discovered will be impossible. Here, we develop a multi-faceted classification pipeline, combining existing and new approaches. Our pipeline consists of two stages: extracting descriptive features from the light curves and classification using a machine learning algorithm. Our feature extraction methods vary from model-dependent techniques, namely SALT2 fits, to more independent techniques that fit parametric models to curves, to a completely model-independent wavelet approach. We cover a range of representative machine learning algorithms, including naive Bayes, k -nearest neighbors, support vector machines, artificial neural networks, and boosted decision trees (BDTs). We test the pipeline on simulated multi-band DES light curves from the Supernova Photometric Classification Challenge. Using the commonly used area under the curve (AUC) of the Receiver Operating Characteristic as a metric, we find that the SALT2 fits and the wavelet approach, with the BDTs algorithm, each achieve an AUC of 0.98, where 1 represents perfect classification. We find that a representative training set is essential for good classification, whatever the feature set or algorithm, with implications for spectroscopic follow-up. Importantly, we find that by using either the SALT2 or the wavelet feature sets with a BDT algorithm, accurate classification is possible purely from light curve data, without the need for any redshift information.

  10. 48 CFR 22.406-3 - Additional classifications.

    Science.gov (United States)

    2010-10-01

    ... SOCIOECONOMIC PROGRAMS APPLICATION OF LABOR LAWS TO GOVERNMENT ACQUISITIONS Labor Standards for Contracts... and wage rate be posted in accordance with paragraph (a) of the clause at 52.222-6 and that workers in...

  11. Multiscale singularity trees

    DEFF Research Database (Denmark)

    Somchaipeng, Kerawit; Sporring, Jon; Johansen, Peter

    2007-01-01

    We propose MultiScale Singularity Trees (MSSTs) as a structure to represent images, and we propose an algorithm for image comparison based on comparing MSSTs. The algorithm is tested on 3 public image databases and compared to 2 state-of-theart methods. We conclude that the computational complexity...... of our algorithm only allows for the comparison of small trees, and that the results of our method are comparable with state-of-the-art using much fewer parameters for image representation....

  12. Type extension trees

    DEFF Research Database (Denmark)

    Jaeger, Manfred

    2006-01-01

    We introduce type extension trees as a formal representation language for complex combinatorial features of relational data. Based on a very simple syntax this language provides a unified framework for expressing features as diverse as embedded subgraphs on the one hand, and marginal counts...... of attribute values on the other. We show by various examples how many existing relational data mining techniques can be expressed as the problem of constructing a type extension tree and a discriminant function....

  13. Data Fusion Research of Triaxial Human Body Motion Gesture based on Decision Tree

    Directory of Open Access Journals (Sweden)

    Feihong Zhou

    2014-05-01

    Full Text Available The development status of human body motion gesture data fusion domestic and overseas has been analyzed. A triaxial accelerometer is adopted to develop a wearable human body motion gesture monitoring system aimed at old people healthcare. On the basis of a brief introduction of decision tree algorithm, the WEKA workbench is adopted to generate a human body motion gesture decision tree. At last, the classification quality of the decision tree has been validated through experiments. The experimental results show that the decision tree algorithm could reach an average predicting accuracy of 97.5 % with lower time cost.

  14. Benefit-based tree valuation

    Science.gov (United States)

    E.G. McPherson

    2007-01-01

    Benefit-based tree valuation provides alternative estimates of the fair and reasonable value of trees while illustrating the relative contribution of different benefit types. This study compared estimates of tree value obtained using cost- and benefit-based approaches. The cost-based approach used the Council of Landscape and Tree Appraisers trunk formula method, and...

  15. Attack Trees with Sequential Conjunction

    NARCIS (Netherlands)

    Jhawar, Ravi; Kordy, Barbara; Mauw, Sjouke; Radomirović, Sasa; Trujillo-Rasua, Rolando

    2015-01-01

    We provide the first formal foundation of SAND attack trees which are a popular extension of the well-known attack trees. The SAND at- tack tree formalism increases the expressivity of attack trees by intro- ducing the sequential conjunctive operator SAND. This operator enables the modeling of

  16. Security classification of information

    Energy Technology Data Exchange (ETDEWEB)

    Quist, A.S.

    1993-04-01

    This document is the second of a planned four-volume work that comprehensively discusses the security classification of information. The main focus of Volume 2 is on the principles for classification of information. Included herein are descriptions of the two major types of information that governments classify for national security reasons (subjective and objective information), guidance to use when determining whether information under consideration for classification is controlled by the government (a necessary requirement for classification to be effective), information disclosure risks and benefits (the benefits and costs of classification), standards to use when balancing information disclosure risks and benefits, guidance for assigning classification levels (Top Secret, Secret, or Confidential) to classified information, guidance for determining how long information should be classified (classification duration), classification of associations of information, classification of compilations of information, and principles for declassifying and downgrading information. Rules or principles of certain areas of our legal system (e.g., trade secret law) are sometimes mentioned to .provide added support to some of those classification principles.

  17. Determination of the ecological connectivity between landscape patches obtained using the knowledge engineer (expert) classification technique

    Science.gov (United States)

    Selim, Serdar; Sonmez, Namik Kemal; Onur, Isin; Coslu, Mesut

    2017-10-01

    Connection of similar landscape patches with ecological corridors supports habitat quality of these patches, increases urban ecological quality, and constitutes an important living and expansion area for wild life. Furthermore, habitat connectivity provided by urban green areas is supporting biodiversity in urban areas. In this study, possible ecological connections between landscape patches, which were achieved by using Expert classification technique and modeled with probabilistic connection index. Firstly, the reflection responses of plants to various bands are used as data in hypotheses. One of the important features of this method is being able to use more than one image at the same time in the formation of the hypothesis. For this reason, before starting the application of the Expert classification, the base images are prepared. In addition to the main image, the hypothesis conditions were also created for each class with the NDVI image which is commonly used in the vegetation researches. Besides, the results of the previously conducted supervised classification were taken into account. We applied this classification method by using the raster imagery with user-defined variables. Hereupon, to provide ecological connections of the tree cover which was achieved from the classification, we used Probabilistic Connection (PC) index. The probabilistic connection model which is used for landscape planning and conservation studies via detecting and prioritization critical areas for ecological connection characterizes the possibility of direct connection between habitats. As a result we obtained over % 90 total accuracy in accuracy assessment analysis. We provided ecological connections with PC index and we created inter-connected green spaces system. Thus, we offered and implicated green infrastructure system model takes place in the agenda of recent years.

  18. Tree manipulation experiment

    Science.gov (United States)

    Nishina, K.; Takenaka, C.; Ishizuka, S.; Hashimoto, S.; Yagai, Y.

    2012-12-01

    Some forest operations such as thinning and harvesting management could cause changes in N cycling and N2O emission from soils, since thinning and harvesting managements are accompanied with changes in aboveground environments such as an increase of slash falling and solar radiation on the forest floor. However, a considerable uncertainty exists in effects of thinning and harvesting on N2O fluxes regarding changes in belowground environments by cutting trees. To focus on the effect of changes in belowground environments on the N2O emissions from soils, we conducted a tree manipulation experiment in Japanese cedar (Cryptomeria japonica) stand without soil compaction and slash falling near the chambers and measured N2O flux at 50 cm and 150 cm distances from the tree trunk (stump) before and after cutting. We targeted 5 trees for the manipulation and established the measurement chambers to the 4 directions around each targeted tree relative to upper slope (upper, left, right, lower positions). We evaluated the effect of logging on the emission by using hierarchical Bayesian model. HB model can evaluate the variability in observed data and their uncertainties in the estimation with various probability distributions. Moreover, the HB model can easily accommodate the non-linear relationship among the N2O emissions and the environmental factors, and explicitly take non-independent data (nested structure of data) for the estimation into account by using random effects in the model. Our results showed tree cutting stimulated N2O emission from soils, and also that the increase of N2O flux depended on the distance from the trunk (stump): the increase of N2O flux at 50 cm from the trunk (stump) was greater than that of 150 cm from the trunk. The posterior simulation of the HB model indicated that the stimulation of N2O emission by tree cut- ting could reach up to 200 cm in our experimental plot. By tree cutting, the estimated N2O emission at 0-40 cm from the trunk doubled

  19. The role of non-fig-wasp insects on fig tree biology, with a proposal of the F phase (Fallen figs)

    Science.gov (United States)

    Palmieri, Luciano; Pereira, Rodrigo Augusto Santinelo

    2018-07-01

    The two seminal papers by Galil and Eisikowitch describing the development of Ficus flowers and their sycophilous wasps (i.e., phases A-E) have been adopted in several ecological and evolutionary studies on a wide range of fig tree-insect interactions. Their classification, however, is not inclusive enough to encompass all the diversity of insects associated with the fig development, and the impact of this fauna on the fig-fig wasp mutualism is still unexplored. Here we describe the life history of the non-fig-wasp insects and propose an additional phase to fig-development classification, the F phase (Fallen figs). These figs are not consumed by frugivores while still on the parent tree, fall to the ground and turn into a resource for a diverse range of animals. To support the relevance of the F phase, we summarized a 5-years-period of field observations made on different biomes in three continents. Additionally, we compiled data from the literature of non-fig-wasp insects including only insects associated with inflorescences of wild fig tree species. We report 129 species of non-fig-wasp insects feeding on figs; they colonize the figs in different phases of development and some groups rely on the fallen figs to complete their life cycles. Their range of interaction varies from specialists - that use exclusively fig pulp or fig seeds in their diets - to generalists, opportunists and parasitoids species. The formalization of this additional phase will encourage new studies on fig tree ecology and improve our knowledge on the processes that affect the diversification of insects. It will also help us to understand the implications this fauna may have had on the origin and maintenance of mutualistic interactions.

  20. The View from the Trees: Nocturnal Bull Ants, Myrmecia midas, Use the Surrounding Panorama While Descending from Trees.

    Science.gov (United States)

    Freas, Cody A; Wystrach, Antione; Narendra, Ajay; Cheng, Ken

    2018-01-01

    Solitary foraging ants commonly use visual cues from their environment for navigation. Foragers are known to store visual scenes from the surrounding panorama for later guidance to known resources and to return successfully back to the nest. Several ant species travel not only on the ground, but also climb trees to locate resources. The navigational information that guides animals back home during their descent, while their body is perpendicular to the ground, is largely unknown. Here, we investigate in a nocturnal ant, Myrmecia midas , whether foragers travelling down a tree use visual information to return home. These ants establish nests at the base of a tree on which they forage and in addition, they also forage on nearby trees. We collected foragers and placed them on the trunk of the nest tree or a foraging tree in multiple compass directions. Regardless of the displacement location, upon release ants immediately moved to the side of the trunk facing the nest during their descent. When ants were released on non-foraging trees near the nest, displaced foragers again travelled around the tree to the side facing the nest. All the displaced foragers reached the correct side of the tree well before reaching the ground. However, when the terrestrial cues around the tree were blocked, foragers were unable to orient correctly, suggesting that the surrounding panorama is critical to successful orientation on the tree. Through analysis of panoramic pictures, we show that views acquired at the base of the foraging tree nest can provide reliable nest-ward orientation up to 1.75 m above the ground. We discuss, how animals descending from trees compare their current scene to a memorised scene and report on the similarities in visually guided behaviour while navigating on the ground and descending from trees.