WorldWideScience

Sample records for techniques documentation classification

  1. Vietnamese Document Representation and Classification

    Science.gov (United States)

    Nguyen, Giang-Son; Gao, Xiaoying; Andreae, Peter

    Vietnamese is very different from English, and little research has been done on Vietnamese document classification, or indeed on any kind of Vietnamese language processing; only a few small corpora are available for research. We created a large Vietnamese text corpus of about 18,000 documents and manually classified them based on different criteria, such as topic and style, giving several classification tasks of different difficulty levels. This paper introduces a new syllable-based document representation at the morphological level of the language for efficient classification. We tested the representation on our corpus with different classification tasks, using six classification algorithms and two feature selection techniques. Our experiments show that the new representation is effective for Vietnamese categorization, and suggest that the best performance is achieved using a syllable-pair document representation, an SVM with a polynomial kernel as the learning algorithm, and information gain with an external dictionary for feature selection.
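
    A minimal sketch of that best-reported configuration, assuming scikit-learn (the record names no toolkit, its corpus is not public, and mutual information stands in for the information-gain criterion; the texts and labels are placeholders):

        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.feature_selection import SelectKBest, mutual_info_classif
        from sklearn.pipeline import Pipeline
        from sklearn.svm import SVC

        texts = ["hoc sinh di hoc o truong", "bao chi dua tin trong ngay",
                 "hoc sinh lam bai tap", "bao dien tu cap nhat tin"]
        labels = [0, 1, 0, 1]  # placeholder topic labels

        pipeline = Pipeline([
            # Vietnamese whitespace tokens are syllables, so word unigrams and
            # bigrams approximate the syllable and syllable-pair representations.
            ("vec", CountVectorizer(ngram_range=(1, 2))),
            # Mutual information implements the information-gain criterion;
            # on a real corpus one would keep e.g. the top few thousand features.
            ("sel", SelectKBest(mutual_info_classif, k="all")),
            ("svm", SVC(kernel="poly", degree=2)),
        ])
        pipeline.fit(texts, labels)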

  2. Classification of forensic autopsy reports through conceptual graph-based document representation model.

    Science.gov (United States)

    Mujtaba, Ghulam; Shuib, Liyana; Raj, Ram Gopal; Rajandram, Retnagowri; Shaikh, Khairunisa; Al-Garadi, Mohammed Ali

    2018-06-01

    Text categorization has been used extensively in recent years to classify plain-text clinical reports. This study employs text categorization techniques for the classification of open narrative forensic autopsy reports. One of the key steps in text classification is document representation, in which a clinical report is transformed into a format that is suitable for classification. The traditional document representation technique for text categorization is the bag-of-words (BoW) technique. In this study, the traditional BoW technique proves ineffective in classifying forensic autopsy reports because it merely extracts frequent but not necessarily discriminative features from clinical reports. Moreover, this technique fails to capture word inversion, as well as word-level synonymy and polysemy, when classifying autopsy reports. Hence, the BoW technique suffers from low accuracy and low robustness unless it is improved with contextual and application-specific information. To overcome these limitations, this research aims to develop an effective conceptual graph-based document representation (CGDR) technique to classify 1,500 forensic autopsy reports from four (4) manners of death (MoD) and sixteen (16) causes of death (CoD). Term-based and Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT) based conceptual features were extracted and represented through graphs. These features were then used to train a two-level text classifier: the first level predicts the MoD, and the second level predicts the CoD, using the proposed conceptual graph-based document representation technique. To demonstrate the significance of the proposed technique, its results were compared with those of six (6) state-of-the-art document representation techniques. Lastly, this study compared the effects of one-level classification and two-level classification on the experimental results.
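
    A minimal sketch of the two-level scheme, assuming scikit-learn and plain TF-IDF features in place of the paper's conceptual-graph (CGDR) features; the function names and data layout are illustrative, not the authors':

        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.svm import LinearSVC

        def train_two_level(reports, mod_labels, cod_labels):
            vec = TfidfVectorizer()
            X = vec.fit_transform(reports)
            level1 = LinearSVC().fit(X, mod_labels)      # level 1: predict MoD
            level2 = {}
            for mod in set(mod_labels):                  # level 2: per-MoD CoD model
                idx = [i for i, m in enumerate(mod_labels) if m == mod]
                level2[mod] = LinearSVC().fit(X[idx], [cod_labels[i] for i in idx])
            return vec, level1, level2

        def predict_two_level(report, vec, level1, level2):
            x = vec.transform([report])
            mod = level1.predict(x)[0]
            return mod, level2[mod].predict(x)[0]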

  3. Text document classification

    Czech Academy of Sciences Publication Activity Database

    Novovičová, Jana

    č. 62 (2005), s. 53-54 ISSN 0926-4981 R&D Projects: GA AV ČR IAA2075302; GA AV ČR KSK1019101; GA MŠk 1M0572 Institutional research plan: CEZ:AV0Z10750506 Keywords : document representation * categorization * classification Subject RIV: BD - Theory of Information

  4. Classification of e-government documents based on cooperative expression of word vectors

    Science.gov (United States)

    Fu, Qianqian; Liu, Hao; Wei, Zhiqiang

    2017-03-01

    Effective document classification is a powerful technique for handling the huge volume of e-government documents automatically instead of processing them manually. The word-to-vector (word2vec) model, which converts semantic words into low-dimensional vectors, can be successfully employed to classify e-government documents. In this paper, we propose a cooperative expression of word vectors (Co-word-vector), whose multi-granularity integration explores the possibility of modeling documents in the semantic space. We also improve the weighted continuous-bag-of-words model based on the word2vec model, and the distributed representation of topic words based on the LDA model. Combining the two levels of word representation, experimental results show that our proposed method outperforms the traditional methods on e-government document classification.
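
    A common baseline in this spirit (not the record's exact Co-word-vector scheme): average the word2vec vectors of a document's tokens and feed them to a linear classifier. Assumes gensim and scikit-learn; the documents and labels are placeholders:

        import numpy as np
        from gensim.models import Word2Vec
        from sklearn.linear_model import LogisticRegression

        docs = [["tax", "declaration", "form"], ["road", "construction", "permit"],
                ["income", "tax", "return"], ["building", "permit", "request"]]
        labels = [0, 1, 0, 1]  # placeholder document categories

        w2v = Word2Vec(docs, vector_size=50, min_count=1, seed=0)

        def doc_vector(tokens, model):
            # Represent a document as the mean of its word vectors.
            vecs = [model.wv[t] for t in tokens if t in model.wv]
            return np.mean(vecs, axis=0) if vecs else np.zeros(model.vector_size)

        X = np.vstack([doc_vector(d, w2v) for d in docs])
        clf = LogisticRegression().fit(X, labels)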

  5. Semantic Document Image Classification Based on Valuable Text Pattern

    Directory of Open Access Journals (Sweden)

    Hossein Pourghassem

    2011-01-01

    Knowledge extraction from detected document images is a complex problem in the field of information technology. The problem becomes more intricate given that only a negligible percentage of the detected document images are valuable. In this paper, a segmentation-based classification algorithm is used to analyze the document image. In this algorithm, regions of the image are detected using a two-stage segmentation approach, and then classified into document and non-document (pure) regions in a hierarchical classification. A novel definition of value is proposed to classify document images into valuable or invaluable categories. The proposed algorithm is evaluated on a database consisting of document and non-document images collected from the Internet. Experimental results show the efficiency of the proposed algorithm in semantic document image classification: it achieves an accuracy rate of 98.8% on the valuable vs. invaluable document image classification problem.

  6. Classification process in a text document recommender system

    Directory of Open Access Journals (Sweden)

    Dan MUNTEANU

    2005-12-01

    This paper presents the classification process in a recommender system for textual documents taken mainly from the Web. In the classification process, the system uses a combination of content filters, event filters, and collaborative filters, and it uses implicit and explicit feedback for evaluating documents.

  7. Automatic classification of journalistic documents on the Internet

    Directory of Open Access Journals (Sweden)

    Elias OLIVEIRA

    Online journalism is increasing every day. There are many news agencies, newspapers, and magazines using digital publication in the global network. Documents published online are available to users, who use search engines to find them. In order to deliver documents that are relevant to the search, they must be indexed and classified. Due to the vast number of documents published online every day, a lot of research has been carried out to find ways to facilitate automatic document classification. The objective of the present study is to describe an experimental approach for the automatic classification of journalistic documents published on the Internet, using the Vector Space Model for document representation. The model was tested on a real journalism database, using algorithms that have been widely reported in the literature. This article also describes the metrics used to assess the performance of these algorithms and their required configurations. The results obtained show the efficiency of the method used and justify further research to find ways to facilitate the automatic classification of documents.

  8. 10 CFR 1016.32 - Classification and preparation of documents.

    Science.gov (United States)

    2010-01-01

    ... classification, such documents shall be safeguarded with the highest classification in question. (d... with a classification at least as high as its highest classified enclosure. When the contents of the... classification in letters at least one-fourth (1/4) inch in height at the top and bottom on the outside front...

  9. Automatic Classification Using Supervised Learning in a Medical Document Filtering Application.

    Science.gov (United States)

    Mostafa, J.; Lam, W.

    2000-01-01

    Presents a multilevel model of the information filtering process that permits document classification. Evaluates a document classification approach based on a supervised learning algorithm, measures the accuracy of the algorithm in a neural network that was trained to classify medical documents on cell biology, and discusses filtering…

  10. Prediction of cause of death from forensic autopsy reports using text classification techniques: A comparative study.

    Science.gov (United States)

    Mujtaba, Ghulam; Shuib, Liyana; Raj, Ram Gopal; Rajandram, Retnagowri; Shaikh, Khairunisa

    2018-07-01

    Automatic text classification techniques are useful for classifying plain-text medical documents. This study aims to automatically predict the cause of death from free-text forensic autopsy reports by comparing various schemes for feature extraction, term weighting (feature value representation), text classification, and feature reduction. For the experiments, autopsy reports belonging to eight different causes of death were collected, preprocessed, and converted into 43 master feature vectors using various schemes for feature extraction, representation, and reduction. Six different text classification techniques were applied to these 43 master feature vectors to construct a classification model that can predict the cause of death. Finally, classification model performance was evaluated using four performance measures, i.e., overall accuracy, macro-precision, macro-recall, and macro-F-measure. The experiments showed that unigram features obtained the highest performance compared to bigram, trigram, and hybrid-gram features. Among the feature representation schemes, term frequency and term frequency with inverse document frequency obtained similar and better results than binary frequency and normalized term frequency with inverse document frequency. Furthermore, the chi-square feature reduction approach outperformed the Pearson correlation and information gain approaches. Finally, among the text classification algorithms, the support vector machine classifier outperformed random forest, Naive Bayes, k-nearest neighbor, decision tree, and ensemble-voted classifiers. Our results and comparisons hold practical importance and serve as references for future work. Moreover, the comparison outputs will act as state-of-the-art baselines against which future proposals can be compared. Copyright © 2017 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
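
    A minimal sketch of one cell of that comparison grid (unigram features, TF.IDF weighting, chi-square reduction, two of the six classifiers), assuming scikit-learn; the reports and labels are placeholders:

        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.feature_selection import SelectKBest, chi2
        from sklearn.metrics import f1_score
        from sklearn.naive_bayes import MultinomialNB
        from sklearn.pipeline import make_pipeline
        from sklearn.svm import LinearSVC

        reports = ["stab wound to the chest", "gunshot wound to the head",
                   "blunt trauma to the skull", "multiple stab wounds"]
        causes = ["stab", "gunshot", "blunt", "stab"]  # placeholder CoD labels

        for name, clf in [("SVM", LinearSVC()), ("NB", MultinomialNB())]:
            pipe = make_pipeline(
                TfidfVectorizer(ngram_range=(1, 1)),  # unigrams won in this study
                SelectKBest(chi2, k="all"),           # chi-square reduction; tune k on real data
                clf,
            )
            pipe.fit(reports, causes)
            print(name, f1_score(causes, pipe.predict(reports), average="macro"))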

  11. Formalization of Technological Knowledge in the Field of Metallurgy using Document Classification Tools Supported with Semantic Techniques

    Directory of Open Access Journals (Sweden)

    Regulski K.

    2017-06-01

    The process of knowledge formalization is an essential part of decision support systems development. Creating a technological knowledge base in the field of metallurgy encounters problems in acquiring and codifying reusable computer artifacts from text documents. The aim of this work was to adapt algorithms for document classification and to develop a method of semantic integration for the created repository. The author used artificial intelligence tools: latent semantic indexing, rough sets, association rule learning, and ontologies as the integration tool. The developed methodology allowed for the creation of a semantic knowledge base in the field of metallurgy on the basis of documents in natural language.

  12. Text mining in the classification of digital documents

    Directory of Open Access Journals (Sweden)

    Marcial Contreras Barrera

    2016-11-01

    Objective: To develop an automated classifier for bibliographic material by means of text mining. Methodology: Text mining is used to develop the classifier, based on a supervised method comprising two phases, learning and recognition. In the learning phase, the classifier learns patterns through the analysis of bibliographic records of class Z (library science, information science, and information resources) retrieved from the LIBRUNAM database; this phase yields a classifier capable of recognizing the different subclasses (LC). In the recognition phase, the classifier is validated and evaluated through classification tests: bibliographic records of class Z are taken at random, classified by a cataloguer, and processed by the automated classifier in order to measure its precision. Results: The application of text mining achieved the development of the automated classifier through the supervised document classification method. The precision of the classifier, calculated by comparing the manually and automatically assigned topics, was 75.70%. Conclusions: The application of text mining facilitated the creation of the automated classifier, providing useful technology for classifying bibliographic material with the aim of improving and speeding up the process of organizing digital documents.

  13. A Hybrid Feature Selection Approach for Arabic Documents Classification

    NARCIS (Netherlands)

    Habib, Mena Badieh; Sarhan, Ahmed A. E.; Salem, Abdel-Badeeh M.; Fayed, Zaki T.; Gharib, Tarek F.

    Text Categorization (classification) is the process of classifying documents into a predefined set of categories based on their content. Text categorization algorithms usually represent documents as bags of words and consequently have to deal with a huge number of features. Feature selection tries to

  14. Automatic classification of noisy documents with low textual content; Classification automatique de documents bruités à faible contenu textuel

    OpenAIRE

    Laroum, Sami; Béchet, Nicolas; Hamza, Hatem; Roche, Mathieu

    2010-01-01

    National audience; The classification of digital documents is a complex task in an electronic document management workflow. However, the quantity of documents produced by OCR (Optical Character Recognition) retro-conversion poses a problem that does not make the classification task any easier. After studying and evaluating the descriptors best suited to documents produced by OCR, we propose a new approach for representing textual data…

  15. Document representations for classification of short web-page descriptions

    Directory of Open Access Journals (Sweden)

    Radovanović Miloš

    2008-01-01

    Motivated by the application of Text Categorization to the classification of Web search results, this paper describes an extensive experimental study of the impact of bag-of-words document representations on the performance of five major classifiers: Naïve Bayes, SVM, Voted Perceptron, kNN, and C4.5. The texts, representing short Web-page descriptions sorted into a large hierarchy of topics, are taken from the dmoz Open Directory Web-page ontology, and classifiers are trained to automatically determine the topics which may be relevant to a previously unseen Web-page. Different transformations of the input data (stemming, normalization, logtf and idf), together with dimensionality reduction, are found to have a statistically significant improving or degrading effect on classification performance, measured by the classical metrics accuracy, precision, recall, F1, and F2. The emphasis of the study is not on determining the best document representation for each classifier, but rather on describing the effects of every individual transformation on classification, together with their mutual relationships.
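
    A minimal sketch of these transformations, assuming scikit-learn (stemming would happen during tokenization, e.g. with an external stemmer, and is omitted; the documents and topics are placeholders):

        from sklearn.decomposition import TruncatedSVD
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.naive_bayes import GaussianNB
        from sklearn.pipeline import make_pipeline

        docs = ["open directory of web pages", "web page about directory services"]
        topics = [0, 1]  # placeholder dmoz-style topics

        pipe = make_pipeline(
            TfidfVectorizer(
                sublinear_tf=True,   # logtf: replace tf with 1 + log(tf)
                use_idf=True,        # idf weighting
                norm="l2",           # normalization
            ),
            TruncatedSVD(n_components=1),  # dimensionality reduction
            GaussianNB(),
        )
        pipe.fit(docs, topics)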

  16. A Similarity-Based Approach for Audiovisual Document Classification Using Temporal Relation Analysis

    Directory of Open Access Journals (Sweden)

    Ferrane Isabelle

    2011-01-01

    We propose a novel approach for video classification that is based on the analysis of the temporal relationships between the basic events in audiovisual documents. Starting from basic segmentation results, we define a new representation method called the Temporal Relation Matrix (TRM). Each document is then described by a set of TRMs, the analysis of which makes higher-level events stand out. This representation was first designed to analyze any audiovisual document in order to find events that may characterize its content and structure. The aim of this work is to use this representation to compute a similarity measure between two documents. Approaches for audiovisual document classification are presented and discussed. Experiments are conducted on a set of 242 video documents, and the results show the efficiency of our proposals.

  17. Cirse Quality Assurance Document and Standards for Classification of Complications: The Cirse Classification System.

    Science.gov (United States)

    Filippiadis, D K; Binkert, C; Pellerin, O; Hoffmann, R T; Krajina, A; Pereira, P L

    2017-08-01

    Interventional radiology provides a wide variety of vascular, nonvascular, musculoskeletal, and oncologic minimally invasive techniques aimed at therapy or palliation of a broad spectrum of pathologic conditions. Outcome data for these techniques are globally evaluated by hospitals, insurance companies, and government agencies aiming at a high-quality health care policy, including reimbursement strategies. To analyze the outcome of a technique effectively, accurate reporting of complications is necessary. Throughout the literature, numerous systems for grading and classifying complications have been reported. Until now, there has been no method for uniform reporting of complications, both in terms of definition and grading. The purpose of this CIRSE guideline is to provide a classification system of complications based on combining outcome and severity of sequelae. The ultimate challenge will be the adoption of this system by practitioners in different countries and health economies within the European Union and beyond.

  18. Side effects of cancer therapies. International classification and documentation systems

    International Nuclear Information System (INIS)

    Seegenschmiedt, M.H.

    1998-01-01

    The publication presents and explains verified, international classification and documentation systems for side effects induced by cancer treatments, applicable in general and clinical practice and clinical research, and covers in a clearly arranged manner the whole range of treatments, including acute and chronic side effects of chemotherapy and radiotherapy, surgery, or combined therapies. The book fills a long-felt need in tumor documentation and is a major contribution to quality assurance in clinical oncology in German-speaking countries. As most parts of the book are bilingual, presenting German and English texts and terminology, it satisfies the principles of interdisciplinarity and internationality. The tabulated form chosen for the presentation of classification systems and criteria facilitates the user's approach, as well as application in daily work. (orig./CB) [de]

  19. Classification algorithm of Web document in ionization radiation

    International Nuclear Information System (INIS)

    Geng Zengmin; Liu Wanchun

    2005-01-01

    Resources on the Internet are numerous. How to mine the resources of a particular field or trade more efficiently is one of the research directions of Web mining (WM). The paper studies the classification of Web documents on ionizing radiation (IR) based on the Bayes, Rocchio, and Widrow-Hoff algorithms, and analyses the experimental results. (authors)
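
    A minimal sketch of the Rocchio (centroid-based) classifier named above: each class is represented by the mean of its training vectors, and a new document goes to the nearest centroid. scikit-learn's NearestCentroid and the TF-IDF features are assumptions, as are the documents:

        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.neighbors import NearestCentroid
        from sklearn.pipeline import make_pipeline

        docs = ["dose rate measurement", "gamma spectrum analysis",
                "reactor shielding design", "shielding material tests"]
        labels = ["dosimetry", "dosimetry", "shielding", "shielding"]

        rocchio = make_pipeline(TfidfVectorizer(), NearestCentroid())
        rocchio.fit(docs, labels)
        print(rocchio.predict(["neutron dose measurement"]))  # expected: ['dosimetry']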

  20. The Effect of Preprocessing on Arabic Document Categorization

    Directory of Open Access Journals (Sweden)

    Abdullah Ayedh

    2016-04-01

    Preprocessing is one of the main components in a conventional document categorization (DC) framework. This paper aims to highlight the effect of preprocessing tasks on the efficiency of the Arabic DC system. In this study, three classification techniques are used, namely naive Bayes (NB), k-nearest neighbor (KNN), and support vector machine (SVM). Experimental analysis on Arabic datasets reveals that preprocessing techniques have a significant impact on classification accuracy, especially given the complicated morphological structure of the Arabic language. Choosing appropriate combinations of preprocessing tasks provides significant improvement in the accuracy of document categorization, depending on the feature size and classification techniques. The findings of this study show that the SVM technique outperformed the KNN and NB techniques, achieving a 96.74% micro-F1 value using the combination of normalization and stemming as preprocessing tasks.
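
    A sketch of typical Arabic normalization of the kind evaluated above (the record does not spell out its exact rules, so these are common, assumed ones): strip diacritics and tatweel, and unify alef, teh marbuta, and alef maqsura forms:

        import re

        DIACRITICS = re.compile(r"[\u064B-\u0652]")  # fathatan .. sukun
        TATWEEL = "\u0640"

        def normalize_arabic(text: str) -> str:
            text = DIACRITICS.sub("", text)                        # remove diacritics
            text = text.replace(TATWEEL, "")                       # remove elongation
            text = re.sub("[\u0622\u0623\u0625]", "\u0627", text)  # alef variants -> alef
            text = text.replace("\u0629", "\u0647")                # teh marbuta -> heh
            text = text.replace("\u0649", "\u064A")                # alef maqsura -> yeh
            return text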

  1. Documenting Community Engagement Practices and Outcomes: Insights from Recipients of the 2010 Carnegie Community Engagement Classification

    Science.gov (United States)

    Noel, Jana; Earwicker, David P.

    2015-01-01

    This study was performed to document the strategies and methods used by successful applicants for the 2010 Carnegie Community Engagement Classification and to document the cultural shifts connected with the application process and receipt of the Classification. Four major findings emerged: (1) Applicants benefited from a team approach; (2)…

  2. Biometric Authentication for Gender Classification Techniques: A Review

    Science.gov (United States)

    Mathivanan, P.; Poornima, K.

    2017-12-01

    One of the challenging biometric authentication applications is gender identification and age classification, which captures gait from a far distance and analyzes physical information about the subject, such as gender, race, and emotional state. It is found that most gender identification techniques have focused only on the frontal pose of the subject, the image size, and the type of database used in the process. The study also classifies different feature extraction processes, such as Principal Component Analysis (PCA) and Local Directional Pattern (LDP), that are used to extract the authentication features of a person. This paper aims to analyze different gender classification techniques, helping to evaluate the strengths and weaknesses of existing gender identification algorithms and therefore to develop a novel gender classification algorithm with lower computation cost and higher accuracy. An overview and classification of different gender identification techniques are first presented and then compared with other existing human identification systems by means of their performance.

  3. Page Layout Analysis of the Document Image Based on the Region Classification in a Decision Hierarchical Structure

    Directory of Open Access Journals (Sweden)

    Hossein Pourghassem

    2010-10-01

    The conversion of a document image to its electronic version is a very important problem in saving, searching, and retrieval applications in office automation systems. For this purpose, analysis of the document image is necessary. In this paper, a hierarchical classification structure based on a two-stage segmentation algorithm is proposed. In this structure, the image is segmented using the proposed two-stage segmentation algorithm; the type of each image region (document or non-document) is then determined using multiple classifiers in the hierarchical classification structure. The proposed segmentation algorithm uses two algorithms based on the wavelet transform and thresholding. Texture features such as correlation, homogeneity, and entropy extracted from the co-occurrence matrix, together with two new features based on the wavelet transform, are used to classify and label the regions of the image. The hierarchical classifier consists of two Multilayer Perceptron (MLP) classifiers and a Support Vector Machine (SVM) classifier. The proposed algorithm is evaluated on a database consisting of document and non-document images collected from the Internet. The experimental results show the efficiency of the proposed approach in region segmentation and classification: the proposed algorithm achieves an accuracy rate of 97.5% on the classification of the regions.
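
    A minimal sketch of the co-occurrence (GLCM) texture features named above, assuming scikit-image: correlation and homogeneity come from graycoprops, while entropy is computed directly from the normalized matrix; the image block is a placeholder:

        import numpy as np
        from skimage.feature import graycomatrix, graycoprops

        block = np.random.randint(0, 256, (32, 32), dtype=np.uint8)  # placeholder block

        glcm = graycomatrix(block, distances=[1], angles=[0], levels=256,
                            symmetric=True, normed=True)
        correlation = graycoprops(glcm, "correlation")[0, 0]
        homogeneity = graycoprops(glcm, "homogeneity")[0, 0]
        p = glcm[:, :, 0, 0]
        entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))
        features = [correlation, homogeneity, entropy]  # inputs to the MLP/SVM stage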

  4. Handwritten document age classification based on handwriting styles

    Science.gov (United States)

    Ramaiah, Chetan; Kumar, Gaurav; Govindaraju, Venu

    2012-01-01

    Handwriting styles are constantly changing over time. We approach the novel problem of estimating the approximate age of historical handwritten documents using handwriting styles. Such a system will have many applications in handwritten document processing engines, where specialized processing techniques can be applied based on the estimated age of the document. We propose to learn a distribution over styles across centuries using topic models, and to apply a classifier over the learned weights in order to estimate the approximate age of the documents. We present a comparison of different distance metrics, such as the Euclidean distance and the Hellinger distance, within this application.
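
    The Hellinger distance compared above, for two topic-weight vectors p and q, is H(p, q) = ||sqrt(p) - sqrt(q)|| / sqrt(2); a small self-contained version:

        import numpy as np

        def hellinger(p, q):
            p, q = np.asarray(p, float), np.asarray(q, float)
            p, q = p / p.sum(), q / q.sum()  # normalize to probability vectors
            return np.linalg.norm(np.sqrt(p) - np.sqrt(q)) / np.sqrt(2)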

  5. On the classification techniques in data mining for microarray data classification

    Science.gov (United States)

    Aydadenta, Husna; Adiwijaya

    2018-03-01

    Cancer is one of the deadliest diseases; according to WHO data, there were 8.8 million deaths caused by cancer in 2015, and this number will increase every year if the disease is not addressed earlier. Microarray data has become one of the most popular data sources for cancer-identification studies in the field of health, since it can be used to examine levels of gene expression in certain cell samples and to analyze thousands of genes simultaneously. Using data mining techniques, samples of microarray data can be classified as cancerous or not. In this paper we discuss research applying several data mining techniques to microarray data, such as Support Vector Machine (SVM), Artificial Neural Network (ANN), Naive Bayes, k-Nearest Neighbor (kNN), and C4.5, and a simulation of the Random Forest algorithm with dimensionality reduction using Relief. The results show that the accuracy of the Random Forest algorithm is higher than that of the other classification algorithms (SVM, ANN, Naive Bayes, kNN, and C4.5). It is hoped that this paper can provide some information about the speed, accuracy, performance, and computational cost of each data mining classification technique on microarray data.
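
    A minimal sketch of Relief-style feature weighting followed by Random Forest, in the spirit of the simulation above (this is the basic two-class Relief update; the paper's exact variant is not specified, and the expression data here is a random placeholder):

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier

        def relief_weights(X, y, n_iter=100, seed=0):
            rng = np.random.default_rng(seed)
            w = np.zeros(X.shape[1])
            for _ in range(n_iter):
                i = rng.integers(len(X))
                same, diff = (y == y[i]), (y != y[i])
                same[i] = False
                d = np.abs(X - X[i]).sum(axis=1)       # Manhattan distances
                hit = X[same][np.argmin(d[same])]      # nearest same-class sample
                miss = X[diff][np.argmin(d[diff])]     # nearest other-class sample
                w += np.abs(X[i] - miss) - np.abs(X[i] - hit)
            return w / n_iter

        X = np.random.rand(40, 200)                    # placeholder expression matrix
        y = np.random.randint(0, 2, 40)                # placeholder tumor labels
        top = np.argsort(relief_weights(X, y))[::-1][:20]  # keep the 20 best features
        clf = RandomForestClassifier(random_state=0).fit(X[:, top], y)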

  6. A methodology for semiautomatic taxonomy of concepts extraction from nuclear scientific documents using text mining techniques

    International Nuclear Information System (INIS)

    Braga, Fabiane dos Reis

    2013-01-01

    This thesis presents a text mining method for the semi-automatic extraction of a taxonomy of concepts from a textual corpus composed of scientific papers related to the nuclear area. Text classification is a natural human practice and a crucial task for working with large repositories. Document clustering techniques provide a logical and understandable framework that facilitates organization, browsing, and searching. Most clustering algorithms use the bag-of-words model to represent the content of a document. This model generates high-dimensional data, ignores the fact that different words can have the same meaning, and does not consider relationships between words, assuming that words are independent of each other. The methodology combines a concept-based model of document representation with a hierarchical document clustering method using the frequency of co-occurring concepts, and a technique for labeling clusters with their most representative concepts, with the objective of producing a taxonomy of concepts which may reflect the structure of the knowledge domain. It is hoped that this work will contribute to the conceptual mapping of the scientific production of the nuclear area and thus support the management of research activities in this area. (author)
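
    A minimal sketch of hierarchical clustering over concept frequencies with simple cluster labeling, in the spirit of the methodology above, assuming SciPy/NumPy; the concepts and counts are placeholders:

        import numpy as np
        from scipy.cluster.hierarchy import fcluster, linkage

        concepts = ["reactor", "fuel", "dosimetry", "shielding"]
        X = np.array([[3, 2, 0, 0],   # per-document concept frequencies (placeholders)
                      [2, 3, 0, 1],
                      [0, 0, 4, 2],
                      [0, 1, 3, 3]])

        Z = linkage(X, method="average", metric="cosine")
        clusters = fcluster(Z, t=2, criterion="maxclust")
        for c in sorted(set(clusters)):
            centroid = X[clusters == c].mean(axis=0)
            label = [concepts[i] for i in np.argsort(centroid)[::-1][:2]]
            print(f"cluster {c}: {label}")  # label = most frequent concepts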

  7. Empirical Studies On Machine Learning Based Text Classification Algorithms

    OpenAIRE

    Shweta C. Dharmadhikari; Maya Ingle; Parag Kulkarni

    2011-01-01

    Automatic classification of text documents has become an important research issue nowadays. Proper classification of text documents requires information retrieval, machine learning, and Natural Language Processing (NLP) techniques. Our aim is to focus on important approaches to automatic text classification based on machine learning techniques, viz. supervised, unsupervised, and semi-supervised. In this paper we present a review of various text classification approaches under the machine learning paradigm…

  8. Securing XML Documents

    Directory of Open Access Journals (Sweden)

    Charles Shoniregun

    2004-11-01

    XML (eXtensible Markup Language) is becoming the current standard for establishing interoperability on the Web. XML data are self-descriptive and syntax-extensible; this makes them very suitable for the representation and exchange of semi-structured data, and allows users to define new elements for their specific applications. As a result, the number of documents incorporating this standard is continuously increasing over the Web. The processing of XML documents may require a traversal of the whole document structure, and therefore the cost can be very high. A strong demand for efficient and effective XML processing has posed a new challenge for the database world. This paper discusses a fast and efficient indexing technique for XML documents, and introduces the XML graph numbering scheme, which can be used for indexing and securing the graph structure of XML documents. This technique provides an efficient method to speed up XML data processing. Furthermore, the paper explores the classification of existing methods and their impact on query processing and indexing.
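
    A minimal sketch of one common XML node-numbering scheme of the kind such indexes use (the paper's exact graph numbering is not given in the abstract, so this pre/post-order interval numbering is an assumption); a is an ancestor of b iff pre[a] < pre[b] and post[a] > post[b]:

        import xml.etree.ElementTree as ET
        from itertools import count

        def number_nodes(root):
            pre, post, counter = {}, {}, count()
            def visit(node):
                pre[node] = next(counter)    # number on entry (pre-order)
                for child in node:
                    visit(child)
                post[node] = next(counter)   # number on exit (post-order)
            visit(root)
            return pre, post

        doc = ET.fromstring("<a><b><c/></b><d/></a>")
        pre, post = number_nodes(doc)
        b, c = doc.find("b"), doc.find("b/c")
        print(pre[b] < pre[c] and post[b] > post[c])  # True: b is an ancestor of c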

  9. Alphanumerical classification for the subject files of the department of documentation of the Atomic Energy Commission; Classification alpha-numerique pour le fichier matieres du service de documentation du Commissariat a l'Energie Atomique

    Energy Technology Data Exchange (ETDEWEB)

    Braffort, P; Iung, J [Commissariat a l' Energie Atomique, Saclay (France). Centre d' Etudes Nucleaires

    1956-07-01

    The research activities of the Atomic Energy Commission cover a large variety of subjects, from theoretical physics and nuclear physics to biology, medicine, and geology, and about 350 scientific reviews are received and presented in the library. All those documents need to be classified to make the search for information easier for researchers. This paper describes the classification and codification of such a large quantity of documents. The classification uses a two-dimensional system: five columns (inter-scale phenomena, corpuscular scale, nuclear scale, atomic and molecular scale, and macroscopic scale) as subjects, and five rows (theoretical problems, production, measurement, description, and utilisation) as topics. Some of the rules are given and examples are presented. (M.P.)

  10. Elaboration of an alpha-numeric classification for file of matters of the documentation service of the CEA; Elaboration d'une classification alfha-numerique pour le fichier matieres du service de documentation du Commissariat a l'Energie Atomique

    Energy Technology Data Exchange (ETDEWEB)

    Braffort, P [Commissariat a l' Energie Atomique, Saclay(France). Centre d' Etudes Nucleaires

    1953-07-01

    We give the principles of a subject classification on a square (grid) basis, suited to the needs of the Documentation Service of the C.E.A. We then present the detailed categories in the order of the 'columns', i.e., the major scientific subdivisions of the C.E.A. (authors)

  11. Classification of protein-protein interaction full-text documents using text and citation network features.

    Science.gov (United States)

    Kolchinsky, Artemy; Abi-Haidar, Alaa; Kaur, Jasleen; Hamed, Ahmed Abdeen; Rocha, Luis M

    2010-01-01

    We participated (as Team 9) in the Article Classification Task of the BioCreative II.5 Challenge: binary classification of full-text documents relevant for protein-protein interaction. We used two distinct classifiers for the online and offline challenges: 1) the lightweight Variable Trigonometric Threshold (VTT) linear classifier we successfully introduced in BioCreative 2 for binary classification of abstracts, and 2) a novel Naive Bayes classifier using features from the citation network of the relevant literature. We supplemented the supplied training data with full-text documents from the MIPS database. The lightweight VTT classifier was very competitive in this new full-text scenario: it was a top-performing submission in this task, taking into account the rank product of the Area Under the interpolated precision and recall Curve, Accuracy, Balanced F-Score, and Matthews Correlation Coefficient performance measures. The novel citation network classifier for the biomedical text mining domain, while not a top-performing classifier in the challenge, performed above the central tendency of all submissions, and therefore indicates a promising new avenue to investigate further in bibliome informatics.

  12. A comparison of autonomous techniques for multispectral image analysis and classification

    Science.gov (United States)

    Valdiviezo-N., Juan C.; Urcid, Gonzalo; Toxqui-Quitl, Carina; Padilla-Vivanco, Alfonso

    2012-10-01

    Multispectral imaging has given rise to important applications related to the classification and identification of objects in a scene. Because multispectral instruments can be used to estimate the reflectance of materials in the scene, these techniques constitute fundamental tools for materials analysis and quality control. In recent years, a variety of algorithms has been developed to work with multispectral data, whose main purpose has been the correct classification of the objects in the scene. The present study gives a brief review of some classical techniques, as well as a novel one, that have been used for such purposes. The use of principal component analysis and K-means clustering as important classification algorithms is discussed. Moreover, a recent method based on the min-W and max-M lattice auto-associative memories, which was proposed for endmember determination in hyperspectral imagery, is introduced as a classification method. Besides a discussion of their mathematical foundations, we emphasize their main characteristics and the results achieved for two exemplar images composed of objects similar in appearance but spectrally different. The classification results show that the first components computed from principal component analysis can be used to highlight areas with different spectral characteristics. In addition, the use of lattice auto-associative memories provides good results for materials classification, even in cases where spectral similarities appear in the spectral responses.
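
    A minimal sketch of the two classical techniques discussed above applied to a multispectral cube, assuming scikit-learn: PCA to expose spectrally distinct areas, then k-means to cluster pixels by spectral signature (the cube is a random placeholder):

        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.decomposition import PCA

        cube = np.random.rand(64, 64, 8)               # placeholder: H x W x bands
        pixels = cube.reshape(-1, cube.shape[-1])      # one spectrum per pixel

        scores = PCA(n_components=3).fit_transform(pixels)  # first principal components
        labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(scores)
        class_map = labels.reshape(cube.shape[:2])     # per-pixel classification map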

  13. Motor Oil Classification using Color Histograms and Pattern Recognition Techniques.

    Science.gov (United States)

    Ahmadi, Shiva; Mani-Varnosfaderani, Ahmad; Habibi, Biuck

    2018-04-20

    Motor oil classification is important for quality control and the identification of oil adulteration. In this work, we propose a simple, rapid, inexpensive, and nondestructive approach based on image analysis and pattern recognition techniques for the classification of nine different types of motor oil according to their corresponding color histograms. For this, we applied color histograms in different color spaces, such as red-green-blue (RGB), grayscale, and hue-saturation-intensity (HSI), in order to extract features that can help with the classification procedure. These color histograms and their combinations were used as input for model development and were then statistically evaluated using linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and support vector machine (SVM) techniques. Here, two common solutions for solving a multiclass classification problem were applied: (1) transformation to a binary classification problem using a one-against-all (OAA) approach, and (2) extension from binary classifiers to a single globally optimized multilabel classification model. In the OAA strategy, LDA, QDA, and SVM reached up to 97% in terms of accuracy, sensitivity, and specificity for both the training and test sets. In the extension from the binary case, despite good performances by the SVM classification model, QDA and LDA provided better results, up to 92% for RGB-grayscale-HSI color histograms and up to 93% for the HSI color map, respectively. In order to reduce the number of independent variables for modeling, a principal component analysis algorithm was used. Our results suggest that the proposed method is promising for the identification and classification of different types of motor oils.
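
    A minimal sketch of the color-histogram features described above feeding a discriminant classifier (LDA), assuming NumPy and scikit-learn; the images and bin count are placeholders, and the HSI channels are omitted:

        import numpy as np
        from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

        def color_histogram_features(img, bins=32):
            feats = [np.histogram(img[..., c], bins=bins, range=(0, 255))[0]
                     for c in range(3)]                  # R, G, B histograms
            gray = img.mean(axis=-1)                     # grayscale channel
            feats.append(np.histogram(gray, bins=bins, range=(0, 255))[0])
            return np.concatenate(feats).astype(float)

        imgs = np.random.randint(0, 256, (10, 48, 48, 3))  # placeholder oil images
        X = np.vstack([color_histogram_features(im) for im in imgs])
        y = np.repeat(np.arange(2), 5)                     # placeholder oil types
        clf = LinearDiscriminantAnalysis().fit(X, y)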

  14. Analysis and classification of oncology activities on the way to workflow based single source documentation in clinical information systems.

    Science.gov (United States)

    Wagner, Stefan; Beckmann, Matthias W; Wullich, Bernd; Seggewies, Christof; Ries, Markus; Bürkle, Thomas; Prokosch, Hans-Ulrich

    2015-12-22

    Today, cancer documentation is still a tedious task involving many different information systems, even within a single institution, and it is rarely supported by appropriate documentation workflows. In a comprehensive 14-step analysis, we compiled diagnostic and therapeutic pathways for 13 cancer entities using a mixed approach of document analysis, workflow analysis, expert interviews, workflow modelling, and feedback loops. These pathways were stepwise classified and categorized to create a final set of grouped pathways and workflows, including electronic documentation forms. A total of 73 workflows for the 13 entities, based on 82 paper documentation forms in addition to computer-based documentation systems, were compiled in a 724-page document comprising 130 figures, 94 tables, and 23 tumour classifications, as well as 12 follow-up tables. Stepwise classification made it possible to derive grouped diagnostic and therapeutic pathways for three major classes: solid entities with surgical therapy, solid entities with surgical and additional therapeutic activities, and non-solid entities. For these classes it was possible to deduce common documentation workflows to support workflow-guided single-source documentation. Clinical documentation activities within a Comprehensive Cancer Center can likely be realized in a set of three documentation workflows with conditional branching in a modern workflow-supporting clinical information system.

  15. Hybrid image classification technique for land-cover mapping in the Arctic tundra, North Slope, Alaska

    Science.gov (United States)

    Chaudhuri, Debasish

    Remotely sensed image classification techniques are very useful for understanding vegetation patterns and species combinations in the vast and mostly inaccessible arctic region. Previous research on mapping land cover and vegetation in the remote areas of northern Alaska achieved considerably lower accuracies than in other biomes. The unique arctic tundra environment, with its short growing season, cloud cover, low sun angles, and snow and ice cover, hinders the effectiveness of remote sensing studies. Most image classification research in this area, as reported in the literature, used traditional unsupervised clustering techniques with Landsat MSS data. Previous researchers also emphasized that SPOT/HRV-XS data lacked the spectral resolution to identify the small arctic tundra vegetation parcels. Thus, there is a motivation and research need to apply a new classification technique to develop an updated, detailed, and accurate vegetation map at a higher spatial resolution, i.e., from SPOT-5 data. Traditional classification techniques in remotely sensed image interpretation are based on spectral reflectance values, with an assumption that the training data are normally distributed; hence it is difficult to incorporate ancillary data into the classification procedure to improve accuracy. The purpose of this dissertation was to develop a hybrid image classification approach that effectively integrates ancillary information into the classification process and combines ISODATA clustering, a rule-based classifier, and the Multilayer Perceptron (MLP) classifier, which uses an artificial neural network (ANN). The main goal was to find the best possible combination or sequence of classifiers for classifying tundra vegetation that yields higher accuracy than the existing classified vegetation map from SPOT data. Unsupervised ISODATA clustering and rule-based classification techniques were combined to produce an intermediate classified map which was

  16. Document Classification Using Distributed Machine Learning

    OpenAIRE

    Aydin, Galip; Hallac, Ibrahim Riza

    2018-01-01

    In this paper, we investigate the performance and success rates of the Naïve Bayes classification algorithm for the automatic classification of Turkish news into predetermined categories such as economy, life, and health. We use Apache Big Data technologies such as Hadoop, HDFS, Spark, and Mahout, and apply these distributed technologies to machine learning.

  17. International Standardization of Library and Documentation Techniques.

    Science.gov (United States)

    International Federation for Documentation, The Hague (Netherlands).

    This comparative study of the national and international standards, rules and regulations on library and documentation techniques adopted in various countries was conducted as a preliminary step in determining the minimal bases for facilitating national and international cooperation between documentalists and librarians. The study compares and…

  18. A novel Neuro-fuzzy classification technique for data mining

    Directory of Open Access Journals (Sweden)

    Soumadip Ghosh

    2014-11-01

    In our study, we propose a novel neuro-fuzzy classification technique for data mining. The inputs to the neuro-fuzzy classification system are fuzzified by applying a generalized bell-shaped membership function. The proposed method uses a fuzzification matrix in which the input patterns are associated with a degree of membership in different classes; based on the value of the degree of membership, a pattern is attributed to a specific category or class. We applied our method to ten benchmark data sets from the UCI machine learning repository for classification. Our objective was to analyze the proposed method and compare its performance with that of two powerful supervised classification algorithms: the Radial Basis Function Neural Network (RBFNN) and the Adaptive Neuro-fuzzy Inference System (ANFIS). We assessed the performance of these classification methods in terms of different performance measures, such as accuracy, root-mean-square error, kappa statistic, true positive rate, false positive rate, precision, recall, and F-measure. In every aspect the proposed method proved to be superior to the RBFNN and ANFIS algorithms.
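
    The generalized bell-shaped membership function used for fuzzification above is mu(x) = 1 / (1 + |(x - c) / a|^(2b)), with width a, slope b, and center c; a pattern's memberships across classes form one row of the fuzzification matrix:

        import numpy as np

        def gbell(x, a, b, c):
            # Generalized bell membership: 1 at the center c, falling off with |x - c|.
            return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))

        x = np.linspace(0, 10, 5)
        print(gbell(x, a=2.0, b=4.0, c=5.0))  # memberships peak at the center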

  1. Usefulness of the classification technique of cerebral artery for 2D/3D registration

    International Nuclear Information System (INIS)

    Takemura, Akihiro; Suzuki, Masayuki; Kikuchi, Yuzo; Okumura, Yusuke; Harauchi, Hajime

    2007-01-01

    Several papers have proposed 2D/3D registration methods for the cerebral artery using magnetic resonance angiography (MRA) and digital subtraction angiography (DSA). Since differences between the vessels in a DSA image and MRA volume data cause registration failure, we previously proposed a method to extract vessels from MRA volume data using a technique based on classification of the cerebral artery. In this paper, we assessed the usefulness of this classification technique by evaluating the reliability of this 2D/3D registration method. The classification method divides the cerebral artery in MRA volume data into 12 segments; according to the results of the classification, structures corresponding to vessels on a DSA image can then be extracted. We applied 2D/3D registration with and without classification to 16 pairs of MRA volume data and DSA images obtained from six patients. The registration results were scored into four levels (Excellent, Good, Fair, and Poor). The rates of successful registration (>Fair) were 37.5% for registration without classification and 81.3% for registration with classification. These findings suggest a low percentage of incorrectly extracted voxels, facilitating reliable registration. Thus, the classification technique was shown to be useful for feature-based 2D/3D registration. (author)

  2. Probabilistic topic modeling for the analysis and classification of genomic sequences

    Science.gov (United States)

    2015-01-01

    Background: Studies on genomic sequences for classification and taxonomic identification have a leading role in the biomedical field and in the analysis of biodiversity. These studies focus on the so-called barcode genes, representing a well-defined region of the whole genome. Recently, alignment-free techniques have been gaining importance because they are able to overcome the drawbacks of sequence alignment techniques. In this paper a new alignment-free method for DNA sequence clustering and classification is proposed. The method is based on k-mer representation and text mining techniques. Methods: The presented method is based on probabilistic topic modeling, a statistical technique originally proposed for text documents. Probabilistic topic models are able to find, in a document corpus, the topics (recurrent themes) characterizing classes of documents. This technique, applied to DNA sequences representing the documents, exploits the frequency of fixed-length k-mers and builds a generative model for a training group of sequences. This generative model, obtained through the Latent Dirichlet Allocation (LDA) algorithm, is then used to classify a large set of genomic sequences. Results and conclusions: We performed classification of over 7000 16S DNA barcode sequences taken from the Ribosomal Database Project (RDP) repository, training probabilistic topic models. The proposed method is compared to the RDP tool and the Support Vector Machine (SVM) classification algorithm in an extensive set of trials using both complete sequences and short sequence snippets (from 400 bp to 25 bp). Our method reaches very similar results to the RDP classifier and SVM for complete sequences. The most interesting results are obtained with short sequence snippets: in these conditions the proposed method outperforms RDP and SVM on ultra-short sequences, and it exhibits a smooth decrease of performance, at every taxonomic level, as the sequence length decreases. PMID:25916734
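
    A minimal sketch of the k-mer plus topic-model pipeline described above, assuming scikit-learn (the study's corpus and parameters are not reproduced; the sequences, k, and topic count are placeholders):

        from sklearn.decomposition import LatentDirichletAllocation
        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.neighbors import KNeighborsClassifier
        from sklearn.pipeline import make_pipeline

        def kmers(seq, k=4):
            # Tokenize a sequence into overlapping fixed-length k-mers.
            return " ".join(seq[i:i + k] for i in range(len(seq) - k + 1))

        seqs = ["ACGTACGTGGCA", "ACGTACGTGGCC", "TTGACCATTGCA", "TTGACCATTGCC"]
        taxa = ["A", "A", "B", "B"]  # placeholder taxonomic labels

        pipe = make_pipeline(
            CountVectorizer(analyzer=lambda s: s.split()),              # k-mer counts
            LatentDirichletAllocation(n_components=2, random_state=0),  # topic mixtures
            KNeighborsClassifier(n_neighbors=1),
        )
        pipe.fit([kmers(s) for s in seqs], taxa)
        print(pipe.predict([kmers("ACGTACGTGGCG")]))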

  3. Field-Testing a PC Electronic Documentation System using the Clinical Care Classification© System with Nursing Students

    Directory of Open Access Journals (Sweden)

    Jennifer E. Mannino

    2011-01-01

    Schools of nursing are slow in training their students to keep up with the fast-approaching era of electronic healthcare documentation. This paper discusses the importance of nursing documentation and describes the field-testing of an electronic health record, the Sabacare Clinical Care Classification (CCC©) system. The PC-CCC©, designed as a Microsoft Access® application, is an evidence-based electronic documentation system available via free download from the internet. A sample of baccalaureate nursing students from a mid-Atlantic private college used this program to document the nursing care they provided to patients during their sophomore-level clinical experience. This paper summarizes the design, training, and evaluation of using the system in practice.

  4. Comparative Study of Classification Techniques on Breast Cancer FNA Biopsy Data

    Directory of Open Access Journals (Sweden)

    George Rumbe

    2010-12-01

    Accurate diagnostic detection of cancerous cells in a patient is critical and may alter the subsequent treatment and increase the chances of survival. Machine learning techniques have been instrumental in disease detection and are currently being used in various classification problems due to their accurate prediction performance. Different techniques may provide different desired accuracies, and it is therefore imperative to use the most suitable method which provides the best desired results. This research seeks to provide a comparative analysis of the Support Vector Machine, a Bayesian classifier, and other artificial neural network classifiers (backpropagation, linear programming, learning vector quantization, and k-nearest neighbor) on the Wisconsin breast cancer classification problem.

  5. Toward an enhanced Arabic text classification using cosine similarity and Latent Semantic

    Directory of Open Access Journals (Sweden)

    Fawaz S. Al-Anzi

    2017-04-01

    Cosine similarity is one of the most popular similarity measures in text classification problems. In this paper, we used this important measure to investigate the performance of Arabic language text classification. For textual features, the vector space model (VSM) is generally used to represent textual information as numerical vectors. However, Latent Semantic Indexing (LSI) is a better textual representation technique, as it maintains semantic information between the words. Hence, we used the singular value decomposition (SVD) method to extract textual features based on LSI. In our experiments, we compared some well-known classification methods: Naïve Bayes, k-Nearest Neighbors, Neural Network, Random Forest, Support Vector Machine, and classification tree. We used a corpus that contains 4,000 documents on ten topics (400 documents per topic); the corpus contains 2,127,197 words, with about 139,168 unique words, and the testing set contains 400 documents, 40 per topic. As a weighting scheme, we used Term Frequency-Inverse Document Frequency (TF.IDF). This study reveals that the classification methods that use LSI features significantly outperform the TF.IDF-based methods. It also reveals that k-Nearest Neighbors (based on the cosine measure) and Support Vector Machine are the best performing classifiers.
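
    A minimal sketch of the LSI-plus-cosine setup described above, assuming scikit-learn: TF.IDF vectors are projected onto latent dimensions with truncated SVD, and a k-NN classifier compares documents by cosine similarity in that space (the corpus and dimensions are placeholders):

        from sklearn.decomposition import TruncatedSVD
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.neighbors import KNeighborsClassifier
        from sklearn.pipeline import make_pipeline

        docs = ["economy market news", "stock market report",
                "football match result", "league match report"]
        topics = ["economy", "economy", "sports", "sports"]

        lsi_knn = make_pipeline(
            TfidfVectorizer(),
            TruncatedSVD(n_components=2, random_state=0),  # LSI via truncated SVD
            KNeighborsClassifier(n_neighbors=1, metric="cosine"),
        )
        lsi_knn.fit(docs, topics)
        print(lsi_knn.predict(["market news report"]))  # expected: ['economy']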

  6. Segmentation of complex document

    Directory of Open Access Journals (Sweden)

    Souad Oudjemia

    2014-06-01

    In this paper we present a method for the segmentation of document images with complex structure. The technique, based on the GLCM (Grey Level Co-occurrence Matrix), is used to segment this type of document into three regions, namely 'graphics', 'background', and 'text'. Very briefly, the method divides the document image into blocks whose size is chosen after a series of tests, and then applies the co-occurrence matrix to each block in order to extract five textural parameters: energy, entropy, sum entropy, difference entropy, and standard deviation. These parameters are then used to classify the image into three regions using the k-means algorithm; the last step of segmentation is obtained by grouping connected pixels. Two performance measurements are performed for both the graphics and text zones: we obtained a classification rate of 98.3% and a misclassification rate of 1.79%.

  7. Classification of the financial sustainability of health insurance beneficiaries through data mining techniques

    Directory of Open Access Journals (Sweden)

    Sílvia Maria Dias Pedro Rebouças

    2016-09-01

    Advances in information technologies have led to the storage of large amounts of data by organizations. Analysis of this data through data mining techniques provides important support for decision-making. This article aims to apply techniques for the classification of the beneficiaries of a health insurance operator in Brazil, according to their financial sustainability, via their sociodemographic characteristics and their healthcare cost history. Beneficiaries with a loss ratio greater than 0.75 are considered unsustainable. The sample consists of 38,875 beneficiaries, active between the years 2011 and 2013. The techniques used were logistic regression and classification trees. The performance of the models was compared via accuracy rates and receiver operating characteristic (ROC) curves, by determining the area under the curve (AUC). The results showed that most of the sample is composed of sustainable beneficiaries. The logistic regression model had a 68.43% accuracy rate with an AUC of 0.7501, and the classification tree obtained 67.76% accuracy and an AUC of 0.6855. Age and the type of plan were the variables most related to the profile of the beneficiaries in the classification. The highlights with regard to healthcare costs were annual spending on consultations and on dental insurance.
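
    A minimal sketch of that model comparison, assuming scikit-learn and a synthetic stand-in for the beneficiaries' features (the loss-ratio labels, variables, and sizes are placeholders):

        from sklearn.datasets import make_classification
        from sklearn.linear_model import LogisticRegression
        from sklearn.metrics import accuracy_score, roc_auc_score
        from sklearn.model_selection import train_test_split
        from sklearn.tree import DecisionTreeClassifier

        X, y = make_classification(n_samples=1000, n_features=8, random_state=0)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

        for name, model in [("logistic regression", LogisticRegression(max_iter=1000)),
                            ("classification tree", DecisionTreeClassifier(max_depth=4))]:
            model.fit(X_tr, y_tr)
            acc = accuracy_score(y_te, model.predict(X_te))
            auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
            print(f"{name}: accuracy={acc:.3f}, AUC={auc:.3f}")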

  9. Activity identification using body-mounted sensors—a review of classification techniques

    International Nuclear Information System (INIS)

    Preece, Stephen J; Kenney, Laurence P J; Howard, Dave; Goulermas, John Y; Crompton, Robin; Meijer, Kenneth

    2009-01-01

    With the advent of miniaturized sensing technology, which can be body-worn, it is now possible to collect and store data on different aspects of human movement under the conditions of free living. This technology has the potential to be used in automated activity profiling systems which produce a continuous record of activity patterns over extended periods of time. Such activity profiling systems are dependent on classification algorithms which can effectively interpret body-worn sensor data and identify different activities. This article reviews the different techniques which have been used to classify normal activities and/or identify falls from body-worn sensor data. The review is structured according to the different analytical techniques and illustrates the variety of approaches which have previously been applied in this field. Although significant progress has been made in this important area, there is still significant scope for further work, particularly in the application of advanced classification techniques to problems involving many different activities. (topical review)

  10. Sentiment classification of Roman-Urdu opinions using Naïve Bayesian, Decision Tree and KNN classification techniques

    Directory of Open Access Journals (Sweden)

    Muhammad Bilal

    2016-07-01

    Full Text Available Sentiment mining is a field of text mining concerned with determining the attitude of people about a particular product, topic or politician in newsgroup posts, review sites, comments on Facebook posts, Twitter, etc. There are many issues involved in opinion mining. One important issue is that opinions can be in different languages (English, Urdu, Arabic, etc.). Tackling each language according to its orientation is a challenging task. Most of the research work in sentiment mining has been done in the English language. Currently, limited research is being carried out on sentiment classification of other languages like Arabic, Italian, Urdu and Hindi. In this paper, three classification models are used for text classification using the Waikato Environment for Knowledge Analysis (WEKA). Opinions written in Roman-Urdu and English are extracted from a blog. These extracted opinions are documented in text files to prepare a training dataset containing 150 positive and 150 negative opinions as labeled examples. The testing dataset is supplied to the three different models and the results in each case are analyzed. The results show that Naïve Bayesian outperformed Decision Tree and KNN in terms of accuracy, precision, recall and F-measure.
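
    The study ran Naïve Bayes, Decision Tree and KNN inside WEKA; a rough Python analogue of the same three-way comparison is sketched below. The tiny Roman-Urdu phrases and the bag-of-words setup are illustrative stand-ins, not the paper's data.

        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.model_selection import cross_val_score
        from sklearn.naive_bayes import MultinomialNB
        from sklearn.neighbors import KNeighborsClassifier
        from sklearn.pipeline import make_pipeline
        from sklearn.tree import DecisionTreeClassifier

        # Tiny stand-in for the 150 positive / 150 negative labeled opinions.
        opinions = ["zabardast mobile", "bohat acha kaam", "bura experience",
                    "bilkul bekar cheez", "achi service", "kharab quality"]
        labels = ["pos", "pos", "neg", "neg", "pos", "neg"]

        for clf in (MultinomialNB(), DecisionTreeClassifier(),
                    KNeighborsClassifier(n_neighbors=3)):
            pipe = make_pipeline(CountVectorizer(), clf)
            print(type(clf).__name__, cross_val_score(pipe, opinions, labels, cv=2).mean())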

  11. The 2016 WHO classification and diagnostic criteria for myeloproliferative neoplasms: document summary and in-depth discussion

    OpenAIRE

    Barbui, Tiziano; Thiele, Jürgen; Gisslinger, Heinz; Kvasnicka, Hans Michael; Vannucchi, Alessandro M.; Guglielmelli, Paola; Orazi, Attilio; Tefferi, Ayalew

    2018-01-01

    The new edition of the 2016 World Health Organization (WHO) classification system for tumors of the hematopoietic and lymphoid tissues was published in September 2017. Under the category of myeloproliferative neoplasms (MPNs), the revised document includes seven subcategories: chronic myeloid leukemia, chronic neutrophilic leukemia, polycythemia vera (PV), primary myelofibrosis (PMF), essential thrombocythemia (ET), chronic eosinophilic leukemia-not otherwise specified and MPN, unclassifiable...

  12. A Discrete Wavelet Based Feature Extraction and Hybrid Classification Technique for Microarray Data Analysis

    Directory of Open Access Journals (Sweden)

    Jaison Bennet

    2014-01-01

    Full Text Available Cancer classification by doctors and radiologists was formerly based on morphological and clinical features and had limited diagnostic ability. The recent arrival of DNA microarray technology has led to the concurrent monitoring of thousands of gene expressions in a single chip, which stimulates progress in cancer classification. In this paper, we propose a hybrid approach for microarray data classification based on k-nearest neighbor (KNN), naive Bayes, and support vector machine (SVM) classifiers. Feature selection prior to classification plays a vital role, and a feature selection technique which combines the discrete wavelet transform (DWT) and a moving window technique (MWT) is used. The performance of the proposed method is compared with the conventional classifiers: support vector machine, nearest neighbor, and naive Bayes. Experiments have been conducted on both real and benchmark datasets, and the results indicate that the ensemble approach produces higher classification accuracy than the conventional classifiers. This work serves as an automated system for the classification of cancer that can be applied by doctors in real cases, a boon to the medical community, and it further reduces the misclassification of cancers, which is unacceptable in cancer detection.

  13. Gasoline classification using near infrared (NIR) spectroscopy data: Comparison of multivariate techniques

    Energy Technology Data Exchange (ETDEWEB)

    Balabin, Roman M., E-mail: balabin@org.chem.ethz.ch [Department of Chemistry and Applied Biosciences, ETH Zurich, 8093 Zurich (Switzerland); Safieva, Ravilya Z. [Gubkin Russian State University of Oil and Gas, 119991 Moscow (Russian Federation); Lomakina, Ekaterina I. [Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, 119992 Moscow (Russian Federation)

    2010-06-25

    Near infrared (NIR) spectroscopy is a non-destructive (vibrational spectroscopy based) measurement technique for many multicomponent chemical systems, including products of petroleum (crude oil) refining and petrochemicals, food products (tea, fruits, e.g., apples, milk, wine, spirits, meat, bread, cheese, etc.), pharmaceuticals (drugs, tablets, bioreactor monitoring, etc.), and combustion products. In this paper we have compared the abilities of nine different multivariate classification methods: linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), regularized discriminant analysis (RDA), soft independent modeling of class analogy (SIMCA), partial least squares (PLS) classification, K-nearest neighbor (KNN), support vector machines (SVM), probabilistic neural network (PNN), and multilayer perceptron (ANN-MLP) - for gasoline classification. Three sets of near infrared (NIR) spectra (450, 415, and 345 spectra) were used for classification of gasolines into 3, 6, and 3 classes, respectively, according to their source (refinery or process) and type. The 14,000-8,000 cm⁻¹ NIR spectral region was chosen. In all cases NIR spectroscopy was found to be effective for gasoline classification purposes, when compared with nuclear magnetic resonance (NMR) spectroscopy or gas chromatography (GC). KNN, SVM, and PNN techniques for classification were found to be among the most effective ones. The artificial neural network (ANN-MLP) approach based on principal component analysis (PCA), which was believed to be efficient, has shown much worse results. We hope that the results obtained in this study will help both further chemometric (multivariate data analysis) investigations and investigations in the sphere of applied vibrational (infrared/IR, near-IR, and Raman) spectroscopy of sophisticated multicomponent systems.

  14. Gasoline classification using near infrared (NIR) spectroscopy data: Comparison of multivariate techniques

    International Nuclear Information System (INIS)

    Balabin, Roman M.; Safieva, Ravilya Z.; Lomakina, Ekaterina I.

    2010-01-01

    Near infrared (NIR) spectroscopy is a non-destructive (vibrational spectroscopy based) measurement technique for many multicomponent chemical systems, including products of petroleum (crude oil) refining and petrochemicals, food products (tea, fruits, e.g., apples, milk, wine, spirits, meat, bread, cheese, etc.), pharmaceuticals (drugs, tablets, bioreactor monitoring, etc.), and combustion products. In this paper we have compared the abilities of nine different multivariate classification methods: linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), regularized discriminant analysis (RDA), soft independent modeling of class analogy (SIMCA), partial least squares (PLS) classification, K-nearest neighbor (KNN), support vector machines (SVM), probabilistic neural network (PNN), and multilayer perceptron (ANN-MLP) - for gasoline classification. Three sets of near infrared (NIR) spectra (450, 415, and 345 spectra) were used for classification of gasolines into 3, 6, and 3 classes, respectively, according to their source (refinery or process) and type. The 14,000-8,000 cm⁻¹ NIR spectral region was chosen. In all cases NIR spectroscopy was found to be effective for gasoline classification purposes, when compared with nuclear magnetic resonance (NMR) spectroscopy or gas chromatography (GC). KNN, SVM, and PNN techniques for classification were found to be among the most effective ones. The artificial neural network (ANN-MLP) approach based on principal component analysis (PCA), which was believed to be efficient, has shown much worse results. We hope that the results obtained in this study will help both further chemometric (multivariate data analysis) investigations and investigations in the sphere of applied vibrational (infrared/IR, near-IR, and Raman) spectroscopy of sophisticated multicomponent systems.

  15. Using Pattern Classification and Recognition Techniques for Diagnostic and Prediction

    Directory of Open Access Journals (Sweden)

    MORARIU, N.

    2007-04-01

    Full Text Available The paper presents some aspects regarding the joint use of classification and recognition techniques for the diagnosis and prediction of activity evolution by means of a set of indexes. Starting from the set of indexes, a measure is defined on the set of patterns, a scalar value that characterizes the analyzed activity at each moment in time. A pattern is defined by the values of the index set at a given time. Over the set of classes obtained by means of the classification and recognition techniques, a relation is defined that allows the representation of the evolution from negative towards positive. For diagnosis and prediction the following tools are used: pattern recognition and the multilayer perceptron. The dataset used in the experiments describes pollution due to CO2 emissions from the consumption of fuels in Europe. The paper also presents the REFORME software written by the authors and the experimental results obtained with this software.

  16. Collective Classification in Network Data

    OpenAIRE

    Sen, Prithviraj; Namata, Galileo; Bilgic, Mustafa; Getoor, Lise; University of Maryland; Galligher, Brian; Eliassi-Rad, Tina

    2008-01-01

    Many real-world applications produce networked data such as the world-wide web (hypertext documents connected via hyperlinks), social networks (for example, people connected by friendship links), communication networks (computers connected via communication links) and biological networks (for example, protein interaction networks). A recent focus in machine learning research has been to extend traditional machine learning classification techniques to classify nodes in such networks. In this a...

  17. Improved Optical Document Security Techniques Based on Volume Holography and Lippmann Photography

    Science.gov (United States)

    Bjelkhagen, Hans I.

    Optical variable devices (OVDs), such as holograms, are now common in the field of document security. Up until now, mass-produced embossed holograms or other types of mass-produced OVDs have been used not only for banknotes but also for personalized documents, such as passports, ID cards, travel documents, driving licenses, credit cards, etc. This means that identical OVDs are used on documents issued to individuals. Today, there is a need for a higher degree of security on such documents, and this chapter covers new techniques to make improved mass-produced or personalized OVDs.

  18. Evaluating and comparing imaging techniques: a review and classification of study designs

    International Nuclear Information System (INIS)

    Freedman, L.S.

    1987-01-01

    The designs of studies to evaluate and compare imaging techniques are reviewed. Thirteen principles for the design of studies of diagnostic accuracy are given. Because of the 'independence principle', these studies cannot directly evaluate the contribution of a technique to clinical management. For the latter, the 'clinical value' study design is recommended. A classification of study designs is proposed in parallel with the standard classification of clinical trials. Studies of diagnostic accuracy are analogous to Phase II, whereas studies evaluating the contribution to clinical management correspond to the Phase III category. Currently the majority of published studies employ the Phase II design. More emphasis on Phase III studies is required. (author)

  19. Gene masking - a technique to improve accuracy for cancer classification with high dimensionality in microarray data.

    Science.gov (United States)

    Saini, Harsh; Lal, Sunil Pranit; Naidu, Vimal Vikash; Pickering, Vincel Wince; Singh, Gurmeet; Tsunoda, Tatsuhiko; Sharma, Alok

    2016-12-05

    A high-dimensional feature space generally degrades classification in several applications. In this paper, we propose a strategy called gene masking, in which non-contributing dimensions are heuristically removed from the data to improve classification accuracy. Gene masking is implemented via a binary-encoded genetic algorithm that can be integrated seamlessly with classifiers during the training phase of classification to perform feature selection. It can also be used to discriminate between features that contribute most to the classification, thereby allowing researchers to isolate features that may have special significance. This technique was applied on publicly available datasets, whereby it substantially reduced the number of features used for classification while maintaining high accuracies. The proposed technique can be extremely useful in feature selection as it heuristically removes non-contributing features to improve the performance of classifiers.
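
    The paper's binary-encoded genetic algorithm wrapping a classifier can be illustrated as follows. The population size, generation count, selection scheme and mutation rate are all assumptions; only the overall idea (a bit mask over genes, scored by cross-validated accuracy) follows the abstract.

        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.model_selection import cross_val_score
        from sklearn.svm import SVC

        rng = np.random.default_rng(0)
        X, y = make_classification(n_samples=100, n_features=50, n_informative=5,
                                   random_state=0)  # stand-in for microarray data

        def fitness(mask):
            """Cross-validated accuracy of the classifier on the unmasked genes."""
            return cross_val_score(SVC(), X[:, mask], y, cv=3).mean() if mask.any() else 0.0

        # Each chromosome is a binary gene mask (True = keep the feature).
        pop = rng.integers(0, 2, size=(20, X.shape[1])).astype(bool)
        for generation in range(15):
            scores = np.array([fitness(m) for m in pop])
            parents = pop[np.argsort(scores)[-10:]]        # truncation selection
            children = []
            for _ in range(10):
                a, b = parents[rng.integers(10, size=2)]
                cut = rng.integers(1, X.shape[1])          # one-point crossover
                child = np.concatenate([a[:cut], b[cut:]])
                child ^= rng.random(X.shape[1]) < 0.02     # bit-flip mutation
                children.append(child)
            pop = np.vstack([parents, children])

        best = max(pop, key=fitness)
        print("genes kept:", best.sum(), "accuracy:", round(fitness(best), 3))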

  20. Evaluation of nuclear power plant operating procedures classifications and interfaces: Problems and techniques for improvement

    International Nuclear Information System (INIS)

    Barnes, V.E.; Radford, L.R.

    1987-02-01

    This report presents activities and findings of a project designed to evaluate current practices and problems related to procedure classification schemes and procedure interfaces in commercial nuclear power plants. The phrase 'procedure classification scheme' refers to how plant operating procedures are categorized and indexed (e.g., normal, abnormal, emergency operating procedures). The term 'procedure interface' refers to how reactor operators are instructed to transition within and between procedures. The project consisted of four key tasks, including (1) a survey of literature regarding problems associated with procedure classifications and interfaces, as well as techniques for overcoming them; (2) interviews with experts in the nuclear industry to discuss the appropriate scope of different classes of operating procedures and techniques for managing interfaces between them; (3) a reanalysis of data gathered about nuclear power plant normal operating and off-normal operating procedures in a related project, 'Program Plan for Assessing and Upgrading Operating Procedures for Nuclear Power Plants'; and (4) solicitation of the comments and expert opinions of a peer review group on the draft project report and on proposed techniques for resolving classification and interface issues. In addition to describing these activities and their results, recommendations for NRC and utility actions to address procedure classification and interface problems are offered

  1. A Document Imaging Technique for Implementing Electronic Loan Approval Process

    Directory of Open Access Journals (Sweden)

    J. Manikandan

    2015-04-01

    Full Text Available Image processing is one of the leading technologies in computer applications. Image processing is a type of signal processing: the input to an image processor is an image or video frame, and the output is an image or a subset of the image [1]. Computer graphics and computer vision processes use image processing techniques. Image processing systems are used in various environments such as medicine, computer-aided design (CAD), research, crime investigation and the military. In this paper, we propose a document image processing technique for establishing an electronic loan approval process (E-LAP) [2]. The loan approval process has been a tedious one, and the E-LAP system attempts to reduce its complexity. Customers log in to fill in the loan application form online with all details and submit the form. The loan department then processes the submitted form and sends an acknowledgement mail via E-LAP to the requesting customer with the list of documents required for the loan approval process [3]. The customer can then upload scanned copies of all required documents. All of this interaction between customer and bank takes place through the E-LAP system.

  2. HClass: Automatic classification tool for health pathologies using artificial intelligence techniques.

    Science.gov (United States)

    Garcia-Chimeno, Yolanda; Garcia-Zapirain, Begonya

    2015-01-01

    The classification of subjects' pathologies enables rigour to be applied to the treatment of certain pathologies, as doctors at times juggle so many variables that they can end up confusing some illnesses with others. Thanks to Machine Learning techniques applied to a health-record database, such classification is possible using our algorithm, hClass. hClass performs non-linear classification of a supervised, unsupervised or semi-supervised type. The machine is configured using further techniques such as validation of the set to be classified (cross-validation), feature reduction (PCA) and committees for assessing the various classifiers. The tool is easy to use: the sample matrix and features that one wishes to classify, the number of iterations and the subjects who are going to be used to train the machine all need to be introduced as inputs. As a result, the success rate is shown either via a classifier or via a committee if one has been formed. A 90% success rate is obtained with the AdaBoost classifier and 89.7% in the case of a committee (comprising three classifiers) when PCA is applied. This tool can be expanded to allow the user to fully characterise the classifiers by adjusting them to each classification use.
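
    The combination the abstract mentions (PCA feature reduction, cross-validation, an AdaBoost classifier, and a committee of classifiers) can be outlined with scikit-learn. The dataset, the committee's members and the component count below are placeholders, not the hClass configuration.

        from sklearn.datasets import load_breast_cancer
        from sklearn.decomposition import PCA
        from sklearn.ensemble import AdaBoostClassifier, VotingClassifier
        from sklearn.model_selection import cross_val_score
        from sklearn.neighbors import KNeighborsClassifier
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler
        from sklearn.svm import SVC

        X, y = load_breast_cancer(return_X_y=True)  # stand-in health-record data

        def with_pca(clf):
            """Scale, reduce features with PCA, then classify (component count assumed)."""
            return make_pipeline(StandardScaler(), PCA(n_components=10), clf)

        ada = with_pca(AdaBoostClassifier())
        committee = VotingClassifier([      # three-classifier committee (members assumed)
            ("ada", with_pca(AdaBoostClassifier())),
            ("svm", with_pca(SVC(probability=True))),
            ("knn", with_pca(KNeighborsClassifier())),
        ], voting="soft")

        for name, model in [("AdaBoost", ada), ("committee", committee)]:
            print(name, cross_val_score(model, X, y, cv=5).mean())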

  3. Artificial intelligence techniques for embryo and oocyte classification.

    Science.gov (United States)

    Manna, Claudio; Nanni, Loris; Lumini, Alessandra; Pappalardo, Sebastiana

    2013-01-01

    One of the most relevant aspects of assisted reproduction technology is the possibility of characterizing and identifying the most viable oocytes or embryos. In most cases, embryologists select them by visual examination, and their evaluation is totally subjective. Recently, due to the rapid growth in the capacity to extract texture descriptors from a given image, a growing interest has been shown in the use of artificial intelligence methods for embryo or oocyte scoring/selection in IVF programmes. This work concentrates on the possible prediction of the quality of embryos and oocytes, starting from their images, in order to improve the performance of assisted reproduction technology. The artificial intelligence system proposed in this work is based on a set of Levenberg-Marquardt neural networks trained using textural descriptors (local binary patterns). The proposed system was tested on two datasets of 269 oocytes and 269 corresponding embryos from 104 women and compared with other machine learning methods already proposed in the past for similar classification problems. Although the results are only preliminary, they show an interesting classification performance. This technique may be of particular interest in those countries where legislation restricts embryo selection.
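
    A minimal sketch of the descriptor-plus-network idea: local binary pattern histograms feeding a small neural network. scikit-learn has no Levenberg-Marquardt solver, so an adam-trained MLP stands in, and the images and labels below are random placeholders.

        import numpy as np
        from skimage.feature import local_binary_pattern
        from sklearn.neural_network import MLPClassifier

        def lbp_histogram(gray, P=8, R=1):
            """Uniform LBP histogram, the texture descriptor named in the abstract."""
            codes = local_binary_pattern(gray, P, R, method="uniform")
            hist, _ = np.histogram(codes, bins=P + 2, range=(0, P + 2), density=True)
            return hist

        # Random stand-ins for oocyte/embryo crops and their quality labels.
        images = [np.random.randint(0, 256, (64, 64), dtype=np.uint8) for _ in range(20)]
        quality = np.random.randint(0, 2, 20)

        X = np.array([lbp_histogram(im) for im in images])
        clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000).fit(X, quality)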

  4. Contrast Enhancement Using Brightness Preserving Histogram Equalization Technique for Classification of Date Varieties

    Directory of Open Access Journals (Sweden)

    G Thomas

    2014-06-01

    Full Text Available Computer vision techniques are becoming popular for quality assessment of many products in the food industry. Image enhancement is the first step in analyzing images in order to obtain detailed information for the determination of quality. In this study, a brightness-preserving histogram equalization technique was used to enhance the features of grey-scale images to classify three date varieties (Khalas, Fard and Madina). Mean, entropy, kurtosis and skewness features were extracted from the original and enhanced images. Mean and entropy from the original images and kurtosis from the enhanced images were selected based on Lukka's feature selection approach. An overall classification efficiency of 93.72% was achieved with just three features. The brightness-preserving histogram equalization technique has great potential to improve classification across various quality attributes of food and agricultural products with minimal features.
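
    One common brightness-preserving variant, bi-histogram equalization (split the histogram at the image mean and equalize each half separately), is sketched below together with the four candidate features. Whether the study used exactly this variant is not stated, so treat the sketch as an illustration under that assumption.

        import numpy as np
        from scipy.stats import entropy, kurtosis, skew

        def bbhe(img):
            """Brightness-preserving bi-histogram equalization (one common variant).

            The histogram is split at the image mean and each half is equalized
            independently, which keeps the output mean close to the input mean.
            """
            img = np.asarray(img, dtype=np.uint8)
            m = int(img.mean())
            out = np.empty_like(img)
            for lo, hi, mask in [(0, m, img <= m), (m + 1, 255, img > m)]:
                if not mask.any():
                    continue
                hist, _ = np.histogram(img[mask], bins=hi - lo + 1, range=(lo, hi + 1))
                cdf = np.cumsum(hist) / hist.sum()
                lut = (lo + cdf * (hi - lo)).astype(np.uint8)
                out[mask] = lut[img[mask] - lo]
            return out

        # The four candidate features from the study, computed on the enhanced image.
        img = np.random.randint(0, 256, (128, 128), dtype=np.uint8)  # stand-in date image
        enh = bbhe(img)
        hist = np.histogram(enh, bins=256, range=(0, 256), density=True)[0]
        feats = [enh.mean(), entropy(hist + 1e-12), kurtosis(enh.ravel()), skew(enh.ravel())]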

  5. Hazard classification methodology

    International Nuclear Information System (INIS)

    Brereton, S.J.

    1996-01-01

    This document outlines the hazard classification methodology used to determine the hazard classification of the NIF LTAB, OAB, and the support facilities on the basis of radionuclides and chemicals. The hazard classification determines the safety analysis requirements for a facility.

  6. Text Mining in Biomedical Domain with Emphasis on Document Clustering.

    Science.gov (United States)

    Renganathan, Vinaitheerthan

    2017-07-01

    With the exponential increase in the number of articles published every year in the biomedical domain, there is a need to build automated systems to extract unknown information from the articles published. Text mining techniques enable the extraction of unknown knowledge from unstructured documents. This paper reviews text mining processes in detail and the software tools available to carry out text mining. It also reviews the roles and applications of text mining in the biomedical domain. Text mining processes, such as search and retrieval of documents, pre-processing of documents, natural language processing, methods for text clustering, and methods for text classification are described in detail. Text mining techniques can facilitate the mining of vast amounts of knowledge on a given topic from published biomedical research articles and draw meaningful conclusions that are not possible otherwise.

  7. A Review of Ground Target Detection and Classification Techniques in Forward Scattering Radars

    Directory of Open Access Journals (Sweden)

    M. E. A. Kanona

    2018-06-01

    Full Text Available This paper presents a review of target detection and classification in forward scattering radar (FSR), a special case of bistatic radar designed to detect and track moving targets in the narrow region along the transmitter-receiver baseline. FSR has advantages and distinctive features over other radar configurations. Previous studies have shown that FSR can be used as an alternative system for ground target detection and classification. The fundamentals of radar and FSR are addressed, and classification algorithms and techniques are discussed. The current and future applications and the limitations of FSR are also considered.

  8. Robust Automatic Modulation Classification Technique for Fading Channels via Deep Neural Network

    Directory of Open Access Journals (Sweden)

    Jung Hwan Lee

    2017-08-01

    Full Text Available In this paper, we propose a deep neural network (DNN)-based automatic modulation classification (AMC) technique for digital communications. While conventional AMC techniques perform well for additive white Gaussian noise (AWGN) channels, classification accuracy degrades for fading channels, where the amplitude and phase of the channel gain change in time. The key contributions of this paper come in two phases. First, we analyze the effectiveness of a variety of statistical features for the AMC task in fading channels. We reveal that the features that are effective for fading channels differ from those known to be good for AWGN channels. Second, we introduce a new enhanced AMC technique based on the DNN method. We use the extensive and diverse set of statistical features found in our study for the DNN-based classifier. A fully connected feedforward network with four hidden layers is trained to classify the modulation class for several fading scenarios. Numerical evaluation shows that the proposed technique offers significant performance gain over existing AMC methods in fading channels.
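
    The ingredients named in the abstract (hand-crafted statistical features feeding a fully connected network with four hidden layers) can be mocked up as follows. The particular features, layer widths and toy BPSK/QPSK data are assumptions; the paper's feature set is larger.

        import numpy as np
        from scipy.stats import kurtosis, skew
        from sklearn.neural_network import MLPClassifier

        def amc_features(iq):
            """A handful of amplitude/phase statistics (a subset of the paper's features)."""
            amp, phase = np.abs(iq), np.angle(iq)
            return [amp.std() / amp.mean(),        # amplitude spread
                    kurtosis(amp), skew(amp),      # higher-order amplitude statistics
                    phase.std(), kurtosis(phase)]  # phase statistics

        rng = np.random.default_rng(0)

        def symbols(qpsk, n=512):
            """Noisy BPSK (qpsk=False) or QPSK (qpsk=True) symbol stream."""
            bits = rng.integers(0, 4 if qpsk else 2, n)
            const = np.exp(1j * np.pi / 2 * bits) if qpsk else 2.0 * bits - 1
            return const + 0.1 * (rng.normal(size=n) + 1j * rng.normal(size=n))

        X = np.array([amc_features(symbols(bool(m % 2))) for m in range(200)])
        y = np.array([m % 2 for m in range(200)])

        # Fully connected feedforward net with four hidden layers, as in the paper.
        clf = MLPClassifier(hidden_layer_sizes=(64, 64, 32, 16), max_iter=2000).fit(X, y)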

  9. Monitoring nanotechnology using patent classifications: an overview and comparison of nanotechnology classification schemes

    Energy Technology Data Exchange (ETDEWEB)

    Jürgens, Björn, E-mail: bjurgens@agenciaidea.es [Agency of Innovation and Development of Andalusia, CITPIA PATLIB Centre (Spain); Herrero-Solana, Victor, E-mail: victorhs@ugr.es [University of Granada, SCImago-UGR (SEJ036) (Spain)

    2017-04-15

    Patents are an essential information source used to monitor, track, and analyze nanotechnology. When it comes to searching nanotechnology-related patents, a keyword search is often incomplete and struggles to cover such an interdisciplinary discipline. Patent classification schemes can reveal far better results since they are assigned by experts who classify the patent documents according to their technology. In this paper, we present the most important classifications for searching nanotechnology patents and analyze how nanotechnology is covered in the main patent classification systems used in search systems nowadays: the International Patent Classification (IPC), the United States Patent Classification (USPC), and the Cooperative Patent Classification (CPC). We conclude that nanotechnology has a significantly better patent coverage in the CPC, since considerably more nanotechnology documents were retrieved than by using other classifications, and thus recommend its use for all professionals involved in nanotechnology patent searches.

  10. Monitoring nanotechnology using patent classifications: an overview and comparison of nanotechnology classification schemes

    International Nuclear Information System (INIS)

    Jürgens, Björn; Herrero-Solana, Victor

    2017-01-01

    Patents are an essential information source used to monitor, track, and analyze nanotechnology. When it comes to searching nanotechnology-related patents, a keyword search is often incomplete and struggles to cover such an interdisciplinary discipline. Patent classification schemes can reveal far better results since they are assigned by experts who classify the patent documents according to their technology. In this paper, we present the most important classifications for searching nanotechnology patents and analyze how nanotechnology is covered in the main patent classification systems used in search systems nowadays: the International Patent Classification (IPC), the United States Patent Classification (USPC), and the Cooperative Patent Classification (CPC). We conclude that nanotechnology has a significantly better patent coverage in the CPC, since considerably more nanotechnology documents were retrieved than by using other classifications, and thus recommend its use for all professionals involved in nanotechnology patent searches.

  11. Exploitation of geospatial techniques for monitoring metropolitan population growth and classification of landcover features

    International Nuclear Information System (INIS)

    Almas, A.S.; Rahim, C.A.

    2006-01-01

    The present research relates to the exploitation of remote sensing and GIS techniques for studying the metropolitan expansion and land use/landcover classification of Lahore, the second largest city of Pakistan, where urbanization is taking place at a striking rate with inadequate development of the requisite infrastructure. Such sprawl gives rise to congestion, pollution and commuting-time issues. The metropolitan expansion, based on growth direction and distance from the city centre, was observed over a period of about thirty years. The classification of the complex spatial assemblage of the urban environment and its expanding precincts was done using temporally spaced satellite images geo-referenced to a common coordinate system, together with census data. Spatial categorization of the urban landscape into densely populated residential areas, sparsely inhabited regions, bare soil patches, water bodies, vegetation, parks, and mixed features was done with the help of the satellite images. Remote sensing and GIS techniques proved very efficient and effective for studying metropolitan growth patterns and for classifying urban features into prominent categories, and census data augments the usefulness of spatial techniques for carrying out such studies. (author)

  12. A New Classification Technique in Mobile Robot Navigation

    Directory of Open Access Journals (Sweden)

    Bambang Tutuko

    2011-12-01

    Full Text Available This paper presents a novel pattern recognition algorithm that uses the weightless neural network (WNN) technique. The technique plays the role of a situation classifier, judging the situation around the mobile robot's environment and making control decisions in mobile robot navigation. The WNN technique was chosen due to significant advantages over conventional neural networks: it can be easily implemented in hardware using standard RAM, is faster in the training phase and works with small resources. Using a simple classification algorithm, similar data are grouped with each other, making it possible to attach similar data classes to specific local areas in the mobile robot environment. This strategy is demonstrated on a simple mobile robot powered by low-cost microcontrollers with 512 bytes of RAM and low-cost sensors. Experimental results show that, as the number of neurons increases, the average environmental recognition rate rises from 87.6% to 98.5%. The WNN technique allows the mobile robot to recognize many different environmental patterns and avoid obstacles in real time. Moreover, using the proposed WNN technique the mobile robot successfully reached its goal in a dynamic environment, compared with a fuzzy logic technique and a logic function, coping with uncertainty in sensor readings and achieving good performance in control actions with a 0.56% error rate in mobile robot speed.
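
    A weightless (RAM-based) classifier in the WiSARD style can be written in a few lines, which is exactly why the approach suits small microcontrollers. The tuple size, random mapping and toy sensor bitmaps below are assumptions; only the RAM-node mechanism follows the abstract.

        import numpy as np

        class WiSARD:
            """Minimal weightless (RAM-based) classifier in the WiSARD style.

            Binary inputs are scrambled by a fixed random mapping, split into
            n-tuples, and each tuple addresses a one-bit RAM node; training
            writes 1s, and classification counts RAM hits per class.
            """
            def __init__(self, n_bits, classes, tuple_size=4, seed=0):
                self.mapping = np.random.default_rng(seed).permutation(n_bits)
                self.t = tuple_size
                self.rams = {c: [set() for _ in range(n_bits // tuple_size)]
                             for c in classes}

            def _addresses(self, x):
                bits = np.asarray(x)[self.mapping]
                for i in range(len(bits) // self.t):
                    yield i, int("".join(map(str, bits[i * self.t:(i + 1) * self.t])), 2)

            def train(self, x, label):
                for i, addr in self._addresses(x):
                    self.rams[label][i].add(addr)

            def predict(self, x):
                return max(self.rams, key=lambda c: sum(
                    addr in self.rams[c][i] for i, addr in self._addresses(x)))

        # Toy "sensor bitmap" patterns for two situations around the robot.
        w = WiSARD(n_bits=8, classes=["left", "right"])
        w.train([1, 1, 1, 1, 0, 0, 0, 0], "left")
        w.train([0, 0, 0, 0, 1, 1, 1, 1], "right")
        print(w.predict([1, 1, 1, 0, 0, 0, 0, 0]))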

  13. Performance Evaluation of Frequency Transform Based Block Classification of Compound Image Segmentation Techniques

    Science.gov (United States)

    Selwyn, Ebenezer Juliet; Florinabel, D. Jemi

    2018-04-01

    Compound image segmentation plays a vital role in the compression of computer screen images, which are images mixed with textual, graphical, or pictorial content. In this paper, we present a comparison of two transform-based block classification approaches for compound images, based on metrics such as classification speed, precision and recall rate. Block-based classification approaches normally divide the compound image into fixed-size, non-overlapping blocks. A frequency transform such as the Discrete Cosine Transform (DCT) or the Discrete Wavelet Transform (DWT) is then applied over each block. Mean and standard deviation are computed for each 8 × 8 block and are used as the feature set to classify the compound images into text/graphics and picture/background blocks. The classification accuracy of block-classification-based segmentation techniques is measured by evaluation metrics such as precision and recall rate. Compound images with smooth and complex backgrounds containing text of varying size, colour and orientation are considered for testing. Experimental evidence shows that DWT-based segmentation provides an improvement of approximately 2.3% in recall and precision rates over DCT-based segmentation, with an increase in block classification time, for both smooth and complex background images.

  14. Personalized Metaheuristic Clustering Onto Web Documents

    Institute of Scientific and Technical Information of China (English)

    Wookey Lee

    2004-01-01

    Optimal clustering of web documents is known to be a complicated combinatorial optimization problem, and it is hard to develop a generally applicable optimal algorithm. An accelerated simulated annealing algorithm is developed for automatic web document classification. The web document classification problem is addressed as the problem of best describing a match between a web query and a hypothesized web object. The normalized term frequency and inverse document frequency coefficient is used as a measure of the match. Test beds are generated on-line during the search by transforming model web sites. As a result, web sites can be clustered optimally in terms of the keyword vectors of the corresponding web documents.

  15. A COMPARATIVE ANALYSIS OF WEB INFORMATION EXTRACTION TECHNIQUES DEEP LEARNING vs. NAÏVE BAYES vs. BACK PROPAGATION NEURAL NETWORKS IN WEB DOCUMENT EXTRACTION

    Directory of Open Access Journals (Sweden)

    J. Sharmila

    2016-01-01

    Full Text Available Web mining research is becoming ever more essential, given the amount of information managed through the web, and web usage is expanding in an uncontrolled way. A dedicated framework is required for handling such a large amount of data in the web space. Web mining is divided into three major areas: web content mining, web usage mining and web structure mining. Tak-Lam Wong proposed a web content mining methodology with the aid of Bayesian Networks (BN), learning to extract web data and discover characteristics based on the Bayesian approach. Motivated by that investigation, we propose a web content mining methodology based on a Deep Learning algorithm. The Deep Learning algorithm is preferred over BN because BN does not involve a learning architecture of the kind used in the proposed system. The main objective of this investigation is web document extraction using different classification algorithms and their analysis. This work extracts data from web URLs and compares three classification algorithms: Deep Learning, the Bayesian algorithm and the BPNN algorithm. Deep Learning is a powerful family of methods for learning in neural networks, applied in areas such as computer vision, speech recognition, natural language processing and biometric systems; it is also a straightforward classification technique that requires less time for classification. Naïve Bayes classifiers are a group of simple probabilistic classifiers based on applying Bayes' theorem with strong independence assumptions between the features. The BPNN algorithm is then used for classification. Initially the training and testing datasets contain many URLs, from which we extract the content. The

  16. Classification of refrigerants; Classification des fluides frigorigenes

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2001-07-01

    This document is based on the US standard ANSI/ASHRAE 34, published in 2001 and entitled 'Designation and safety classification of refrigerants'. This classification provides a clear, international organization of all the refrigerants used in the world through a codification of refrigerants according to their chemical composition. This note explains the codification: prefix, suffixes (hydrocarbons and derived fluids, azeotropic and non-azeotropic mixtures, various organic compounds, non-organic compounds), and the safety classification (toxicity, flammability, the case of mixtures). (J.S.)

  17. Review and classification of variability analysis techniques with clinical applications

    Science.gov (United States)

    2011-01-01

    Analysis of patterns of variation of time-series, termed variability analysis, represents a rapidly evolving discipline with increasing applications in different fields of science. In medicine and in particular critical care, efforts have focussed on evaluating the clinical utility of variability. However, the growth and complexity of techniques applicable to this field have made interpretation and understanding of variability more challenging. Our objective is to provide an updated review of variability analysis techniques suitable for clinical applications. We review more than 70 variability techniques, providing for each technique a brief description of the underlying theory and assumptions, together with a summary of clinical applications. We propose a revised classification for the domains of variability techniques, which include statistical, geometric, energetic, informational, and invariant. We discuss the process of calculation, often necessitating a mathematical transform of the time-series. Our aims are to summarize a broad literature, promote a shared vocabulary that would improve the exchange of ideas, and the analyses of the results between different studies. We conclude with challenges for the evolving science of variability analysis. PMID:21985357

  18. Review and classification of variability analysis techniques with clinical applications.

    Science.gov (United States)

    Bravi, Andrea; Longtin, André; Seely, Andrew J E

    2011-10-10

    Analysis of patterns of variation of time-series, termed variability analysis, represents a rapidly evolving discipline with increasing applications in different fields of science. In medicine and in particular critical care, efforts have focussed on evaluating the clinical utility of variability. However, the growth and complexity of techniques applicable to this field have made interpretation and understanding of variability more challenging. Our objective is to provide an updated review of variability analysis techniques suitable for clinical applications. We review more than 70 variability techniques, providing for each technique a brief description of the underlying theory and assumptions, together with a summary of clinical applications. We propose a revised classification for the domains of variability techniques, which include statistical, geometric, energetic, informational, and invariant. We discuss the process of calculation, often necessitating a mathematical transform of the time-series. Our aims are to summarize a broad literature, promote a shared vocabulary that would improve the exchange of ideas, and the analyses of the results between different studies. We conclude with challenges for the evolving science of variability analysis.

  19. Statistical methods for segmentation and classification of images

    DEFF Research Database (Denmark)

    Rosholm, Anders

    1997-01-01

    The central matter of the present thesis is Bayesian statistical inference applied to classification of images. An initial review of Markov Random Fields relates to the modeling aspect of the indicated main subject. In that connection, emphasis is put on the relatively unknown sub-class of Pickard...... with a Pickard Random Field modeling of a considered (categorical) image phenomenon. An extension of the fast PRF based classification technique is presented. The modification introduces auto-correlation into the model of an involved noise process, which previously has been assumed independent. The suitability...... of the extended model is documented by tests on controlled image data containing auto-correlated noise....

  20. Alternatives for Developing User Documentation for Applications Software

    Science.gov (United States)

    1991-09-01

    Only fragments of the scanned abstract are recoverable: ... adults spontaneously adopt ... 'Natural egoism' ... Many writers have had difficulty adjusting to the change in the place and function of user documentation. ... become problematic. [Brockmann, 1990] Natural egoism is the final factor that can adversely affect documentation. A writer will not be effective until he ...

  1. Hybrid classification: the case of the acquisition procedure documentation within and ouside the Public Contracts information systems in Alto Adige Region

    Directory of Open Access Journals (Sweden)

    Francesca Delneri

    2017-05-01

    Full Text Available With reference to acquisition procedures, the related documentation is created and managed mostly on the platform of the subcontracting institution, and partly on the Alto Adige Public Sector Contracts information system. With a partial integration between this system and the platform made available by the Agenzia per i procedimenti e la vigilanza in materia di contratti pubblici di lavori, servizi e forniture, classification and filing information can be assigned to the documents coming from both platforms, while the reunification of documents related to the same process is transferred to the preservation system as the unique archive of the administration.

  2. How automated image analysis techniques help scientists in species identification and classification?

    Science.gov (United States)

    Yousef Kalafi, Elham; Town, Christopher; Kaur Dhillon, Sarinder

    2017-09-04

    Identification of taxonomy at a specific level is time consuming and reliant upon expert ecologists; hence the demand for automated species identification has increased over the last two decades. Automation of data classification is primarily focussed on images, and incorporating and analysing image data has recently become easier due to developments in computational technology. Research efforts in species identification include processing specimen images, extracting identifying features, and classifying them into the correct categories. In this paper, we discuss recent automated species identification systems, categorizing and evaluating their methods. We reviewed and compared the different methods in a step-by-step scheme of automated identification and classification systems for species images. The selection of methods is influenced by many variables, such as the level of classification, the amount of training data and the complexity of the images. The aim of this paper is to provide researchers and scientists with an extensive background on work related to automated species identification, focusing on pattern recognition techniques for building such systems for biodiversity studies.

  3. "What is relevant in a text document?": An interpretable machine learning approach.

    Directory of Open Access Journals (Sweden)

    Leila Arras

    Full Text Available Text documents can be described by a number of abstract concepts such as semantic category, writing style, or sentiment. Machine learning (ML) models have been trained to automatically map documents to these abstract concepts, making it possible to annotate very large text collections, more than could be processed by a human in a lifetime. Besides predicting the text's category very accurately, it is also highly desirable to understand how and why the categorization process takes place. In this paper, we demonstrate that such understanding can be achieved by tracing the classification decision back to individual words using layer-wise relevance propagation (LRP), a recently developed technique for explaining the predictions of complex non-linear classifiers. We train two word-based ML models, a convolutional neural network (CNN) and a bag-of-words SVM classifier, on a topic categorization task and adapt the LRP method to decompose the predictions of these models onto words. The resulting scores indicate how much individual words contribute to the overall classification decision. This enables one to distill relevant information from text documents without an explicit semantic information extraction step. We further use the word-wise relevance scores to generate novel vector-based document representations which capture semantic information. Based on these document vectors, we introduce a measure of model explanatory power and show that, although the SVM and CNN models perform similarly in terms of classification accuracy, the latter exhibits a higher level of explainability, which makes it more comprehensible for humans and potentially more useful for other applications.
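
    Full LRP for a CNN is beyond a snippet, but for a linear bag-of-words model the same idea (tracing the decision back to word contributions) reduces to weight times feature value, as sketched here on a toy corpus that is an assumption, not the paper's data.

        import numpy as np
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.svm import LinearSVC

        # For a *linear* model the relevance of word i in a document reduces to
        # weight * feature value (LRP on deep networks is considerably richer).
        docs = ["great match and a fine goal", "stocks fell as markets slid",
                "the team scored late", "central bank rates rose again"]
        labels = ["sport", "finance", "sport", "finance"]

        vec = TfidfVectorizer()
        X = vec.fit_transform(docs)
        svm = LinearSVC().fit(X, labels)

        doc = vec.transform(["the team beat the markets"])
        relevance = doc.multiply(svm.coef_).toarray()[0]   # per-word contributions
        words = np.array(vec.get_feature_names_out())
        for idx in np.argsort(np.abs(relevance))[::-1][:5]:
            print(f"{words[idx]:>10s}  {relevance[idx]:+.3f}")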

  4. Two-stage neural-network-based technique for Urdu character two-dimensional shape representation, classification, and recognition

    Science.gov (United States)

    Megherbi, Dalila B.; Lodhi, S. M.; Boulenouar, A. J.

    2001-03-01

    This work is in the field of automated document processing and addresses the problem of the representation and recognition of Urdu characters using a Fourier representation and a Neural Network architecture. In particular, a two-stage Neural Network scheme is used to classify 36 Urdu characters into seven sub-classes, characterized by seven proposed fuzzy features specifically related to Urdu characters. We show that Fourier Descriptors and Neural Networks provide a remarkably simple way to draw definite conclusions from vague, ambiguous, noisy or imprecise information. In particular, we illustrate the concept of interest regions and describe a framing method that makes the proposed Urdu character recognition technique robust and invariant to scaling and translation. Character rotation is dealt with by using the Hotelling transform, which is based upon the eigenvalue decomposition of the covariance matrix of an image and provides a method of determining the orientation of the major axis of an object within an image. Finally, experimental results are presented to show the power and robustness of the proposed two-stage Neural Network technique for Urdu character recognition, its fault tolerance, and its high recognition accuracy.

  5. SFM TECHNIQUE AND FOCUS STACKING FOR DIGITAL DOCUMENTATION OF ARCHAEOLOGICAL ARTIFACTS

    Directory of Open Access Journals (Sweden)

    P. Clini

    2016-06-01

    Full Text Available Digital documentation and high-quality 3D representation are increasingly requested in many disciplines and areas, due to the large number of technologies and the amount of data available for fast, detailed and quick documentation. This work investigates the area of medium- and small-sized artefacts and presents a fast, low-cost acquisition system that guarantees the creation of 3D models with a high level of detail, making the digitization of cultural heritage a simple and fast procedure. The 3D models of the artefacts are created with the photogrammetric technique Structure From Motion, which makes it possible to obtain, in addition to three-dimensional models, high-definition images for deeper study and understanding of the artefacts. For the survey of small objects (only a few centimetres), a macro lens is used together with focus stacking, a photographic technique that captures a stack of images at different focus planes for each camera pose, so that a final image with a greater depth of field can be obtained. The focus-stacking acquisition was finally validated against an acquisition with a Minolta laser triangulation scanner, demonstrating accuracy compatible with the allowable error in relation to the expected precision.
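
    The focus-stacking step can be approximated with a simple sharpness-selection merge: for each pixel, keep the frame with the strongest local Laplacian response. OpenCV is assumed available, frame alignment is skipped, and the kernel sizes are guesses rather than the authors' settings.

        import numpy as np
        import cv2  # OpenCV, assumed available

        def focus_stack(images):
            """Merge a focal stack by keeping, per pixel, the sharpest source frame.

            Sharpness is the absolute Laplacian response, smoothed to avoid
            speckle; a real pipeline would also align the frames first.
            """
            sharpness = []
            for img in images:
                gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
                lap = cv2.Laplacian(cv2.GaussianBlur(gray, (5, 5), 0), cv2.CV_64F)
                sharpness.append(cv2.GaussianBlur(np.abs(lap), (31, 31), 0))
            best = np.argmax(np.stack(sharpness), axis=0)  # index of sharpest frame
            rows, cols = np.indices(best.shape)
            return np.stack(images)[best, rows, cols]

        # images = [cv2.imread(p) for p in sorted(glob.glob("stack/*.jpg"))]
        # cv2.imwrite("stacked.jpg", focus_stack(images))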

  6. Assessing the Effectiveness of Statistical Classification Techniques in Predicting Future Employment of Participants in the Temporary Assistance for Needy Families Program

    Science.gov (United States)

    Montoya, Isaac D.

    2008-01-01

    Three classification techniques (Chi-square Automatic Interaction Detection [CHAID], Classification and Regression Tree [CART], and discriminant analysis) were tested to determine their accuracy in predicting Temporary Assistance for Needy Families program recipients' future employment. Technique evaluation was based on proportion of correctly…

  7. Classification of Phishing Email Using Random Forest Machine Learning Technique

    Directory of Open Access Journals (Sweden)

    Andronicus A. Akinyelu

    2014-01-01

    Full Text Available Phishing is one of the major challenges faced by the world of e-commerce today. Through phishing attacks, billions of dollars have been lost by many companies and individuals; in 2012, an online report put the loss due to phishing attacks at about $1.5 billion. The global impact of phishing attacks will continue to increase, and thus more efficient phishing detection techniques are required to curb the menace. This paper investigates and reports the use of the random forest machine learning algorithm in the classification of phishing attacks, with the major objective of developing an improved phishing email classifier with better prediction accuracy and fewer features. From a dataset consisting of 2,000 phishing and ham emails, a set of prominent phishing email features (identified from the literature) were extracted and used by the machine learning algorithm, with a resulting classification accuracy of 99.7% and low false negative (FN) and false positive (FP) rates.
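
    Training the random forest itself is a one-liner once features are extracted; the sketch below uses synthetic binary feature vectors in place of the paper's hand-crafted phishing features (link counts, IP-based URLs, urgency keywords, and so on), so the "phish" rule is purely illustrative.

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(1)
        X = rng.integers(0, 2, size=(2000, 10))          # binary feature vectors
        y = (X[:, 0] | (X[:, 1] & X[:, 2])).astype(int)  # synthetic "phish" rule

        rf = RandomForestClassifier(n_estimators=200, random_state=1)
        print("accuracy:", cross_val_score(rf, X, y, cv=5).mean())
        print("importances:", rf.fit(X, y).feature_importances_.round(3))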

  8. Text document classification based on mixture models

    Czech Academy of Sciences Publication Activity Database

    Novovičová, Jana; Malík, Antonín

    2004-01-01

    Roč. 40, č. 3 (2004), s. 293-304 ISSN 0023-5954 R&D Projects: GA AV ČR IAA2075302; GA ČR GA102/03/0049; GA AV ČR KSK1019101 Institutional research plan: CEZ:AV0Z1075907 Keywords : text classification * text categorization * multinomial mixture model Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.224, year: 2004

  9. A New Wavelet-Based Document Image Segmentation Scheme

    Institute of Scientific and Technical Information of China (English)

    赵健; 李道京; 俞卞章; 耿军平

    2002-01-01

    Document image segmentation is very useful for printing, faxing and data processing. An algorithm is developed for segmenting and classifying document images. The feature used for classification is based on the histogram distribution pattern of the different image classes. An important attribute of the algorithm is the use of a wavelet correlation image to enhance the raw image's pattern, so that classification accuracy is improved. In this paper, the document image is divided into four types: background, photo, text and graph. Firstly, the document image background is easily distinguished by a conventional method; secondly, the three remaining image types are distinguished by their typical histograms; in order to make the histogram features clearer, each resolution's HH wavelet subimage is added to the raw image at that resolution. Finally, the photo, text and graph regions are divided according to how well the feature fits the Laplacian distribution, measured by χ² and L. Simulations show that classification accuracy is significantly improved. A comparison with related work shows that our algorithm provides both lower classification error rates and better visual results.

  10. Lidar-based individual tree species classification using convolutional neural network

    Science.gov (United States)

    Mizoguchi, Tomohiro; Ishii, Akira; Nakamura, Hiroyuki; Inoue, Tsuyoshi; Takamatsu, Hisashi

    2017-06-01

    Terrestrial lidar is commonly used for detailed documentation in the field of forest inventory investigation. Recent improvements in point cloud processing techniques have enabled the efficient and precise computation of individual tree shape parameters, such as breast-height diameter, height, and volume. To date, however, tree species are manually identified by skilled workers. Previous work on automatic tree species classification has mainly focused on aerial or satellite images, and few studies have been reported on classification techniques using ground-based sensor data. Several candidate sensors can be considered for classification, such as RGB or multi/hyperspectral cameras. Among all candidates, we use terrestrial lidar because it can obtain high-resolution point clouds in the dark forest. We selected bark texture as the classification criterion, since it clearly represents the unique characteristics of each tree and does not change its appearance under seasonal variation or ageing. In this paper, we propose a new method for automatic individual tree species classification from terrestrial lidar using a Convolutional Neural Network (CNN). The key component is the creation of a depth image which describes well the characteristics of each species from a point cloud. We focus on Japanese cedar and cypress, which cover a large part of the domestic forest. Our experimental results demonstrate the effectiveness of the proposed method.

  11. Classification of remotely sensed data using OCR-inspired neural network techniques. [Optical Character Recognition

    Science.gov (United States)

    Kiang, Richard K.

    1992-01-01

    Neural networks have been applied to classifications of remotely sensed data with some success. To improve the performance of this approach, an examination was made of how neural networks are applied to the optical character recognition (OCR) of handwritten digits and letters. A three-layer, feedforward network, along with techniques adopted from OCR, was used to classify Landsat-4 Thematic Mapper data. Good results were obtained. To overcome the difficulties that are characteristic of remote sensing applications and to attain significant improvements in classification accuracy, a special network architecture may be required.

  12. IRIS COLOUR CLASSIFICATION SCALES--THEN AND NOW.

    Science.gov (United States)

    Grigore, Mariana; Avram, Alina

    2015-01-01

    Eye colour is one of the most obvious phenotypic traits of an individual. Since the first documented classification scale, developed in 1843, there have been numerous attempts to classify iris colour. In past centuries, iris colour classification scales have had various colour categories and mostly relied on comparing an individual's eye with painted glass eyes. Once photography techniques were refined, standard iris photographs replaced painted eyes, but this did not solve the problem of painted/printed colour variability over time. Early clinical scales were easy to use, but lacked objectivity and were not standardised or statistically tested for reproducibility. The era of automated iris colour classification systems came with technological development. Spectrophotometry, digital analysis of high-resolution iris images, hyperspectral analysis of the real human iris and dedicated iris colour analysis software have all accomplished objective, accurate iris colour classification, but are quite expensive and limited in use to research environments. Iris colour classification systems have evolved continuously due to their use in a wide range of studies, especially in the fields of anthropology, epidemiology and genetics. Despite the wide range of existing scales, up to the present there has been no generally accepted iris colour classification scale.

  13. 10 CFR 95.37 - Classification and preparation of documents.

    Science.gov (United States)

    2010-01-01

    ... not be the subject of retribution. (i) Files, folders or group of documents. Files, folders, binders... document which they contain. (j) Drafts and working papers. Drafts of documents and working papers which...

  14. FY71 Engineering Report on Surveillance Techniques for Civil Aviation Security

    Science.gov (United States)

    1971-11-01

    This document discusses the work performed by the TSC task group on surveillance techniques in FY71. The principal section is devoted to the technical description, classification and evaluation of commercial metal detectors for concealed weapons. It ...

  15. An Authentication Technique Based on Classification

    Institute of Scientific and Technical Information of China (English)

    李钢; 杨杰

    2004-01-01

    We present a novel watermarking approach based on classification for authentication, in which a watermark is embedded into the host image. When the marked image is modified, the extracted watermark also differs from the original watermark, and different kinds of modification lead to different extracted watermarks. In this paper, different kinds of modification are treated as classes, and a classification algorithm is used to recognize the modifications with high probability. Simulation results show that the proposed method is promising and effective.

  16. Automatic Classification of Sub-Techniques in Classical Cross-Country Skiing Using a Machine Learning Algorithm on Micro-Sensor Data

    Directory of Open Access Journals (Sweden)

    Ole Marius Hoel Rindal

    2017-12-01

    Full Text Available The automatic classification of sub-techniques in classical cross-country skiing provides unique possibilities for analyzing the biomechanical aspects of outdoor skiing. This is currently possible due to the miniaturization and flexibility of wearable inertial measurement units (IMUs) that allow researchers to bring the laboratory to the field. In this study, we aimed to optimize the accuracy of the automatic classification of classical cross-country skiing sub-techniques by using two IMUs attached to the skier’s arm and chest together with a machine learning algorithm. The novelty of our approach is the reliable detection of individual cycles using a gyroscope on the skier’s arm, while a neural network machine learning algorithm robustly classifies each cycle to a sub-technique using sensor data from an accelerometer on the chest. In this study, 24 datasets from 10 different participants were separated into training, validation and test data. Overall, we achieved a classification accuracy of 93.9% on the test data. Furthermore, we illustrate how an accurate classification of sub-techniques can be combined with data from standard sports equipment including position, altitude, speed and heart rate measuring systems. Combining this information has the potential to provide novel insight into physiological and biomechanical aspects valuable to coaches, athletes and researchers.
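
    A minimal sketch of the two-stage idea, under stated assumptions: a peak detector on the arm gyroscope segments individual cycles, and simple per-cycle statistics of the chest accelerometer become the classifier input. The 100 Hz rate, peak parameters, and synthetic signals are illustrative.

```python
import numpy as np
from scipy.signal import find_peaks

FS = 100  # assumed IMU sampling rate (Hz)

def segment_cycles(gyro_arm):
    """Detect one peak per movement cycle in the arm angular-velocity signal."""
    peaks, _ = find_peaks(gyro_arm, distance=int(0.8 * FS), prominence=1.0)
    return list(zip(peaks[:-1], peaks[1:]))           # (start, end) sample pairs

def cycle_features(acc_chest, span):
    """Simple per-cycle statistics of the 3-axis chest accelerometer."""
    seg = acc_chest[span[0]:span[1]]
    return np.concatenate([seg.mean(0), seg.std(0), seg.min(0), seg.max(0)])

# Synthetic demo: a 1 Hz arm swing plus noise stands in for real gyro data.
t = np.arange(0, 10, 1 / FS)
gyro = 3 * np.sin(2 * np.pi * 1.0 * t) + np.random.default_rng(0).normal(0, 0.2, t.size)
acc = np.random.default_rng(1).normal(size=(t.size, 3))
X = np.array([cycle_features(acc, c) for c in segment_cycles(gyro)])
# Each row of X (one cycle) would then feed a neural network over sub-technique
# labels such as diagonal stride or double poling, e.g. sklearn's MLPClassifier.
```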

  17. Document Organization Using Kohonen's Algorithm.

    Science.gov (United States)

    Guerrero Bote, Vicente P.; Moya Anegon, Felix de; Herrero Solana, Victor

    2002-01-01

    Discussion of the classification of documents from bibliographic databases focuses on a method of vectorizing reference documents from LISA (Library and Information Science Abstracts) which permits their topological organization using Kohonen's algorithm. Analyzes possibilities of this type of neural network with respect to the development of…
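
    For readers unfamiliar with the algorithm, a compact numpy sketch of a Kohonen self-organizing map over document vectors follows; grid size, learning schedule, and random initialization are illustrative assumptions.

```python
import numpy as np

def train_som(docs, rows=10, cols=10, iters=2000, lr0=0.5, sigma0=3.0):
    """docs: (n_docs, dim) array of vectorized documents."""
    rng = np.random.default_rng(0)
    weights = rng.random((rows, cols, docs.shape[1]))
    grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), -1)
    for t in range(iters):
        x = docs[rng.integers(len(docs))]
        # Best-matching unit: the node whose weight vector is closest to x.
        bmu = np.unravel_index(np.argmin(((weights - x) ** 2).sum(-1)), (rows, cols))
        lr = lr0 * np.exp(-t / iters)                  # decaying learning rate
        sigma = sigma0 * np.exp(-t / iters)            # shrinking neighbourhood
        d2 = ((grid - np.array(bmu)) ** 2).sum(-1)
        h = np.exp(-d2 / (2 * sigma ** 2))[..., None]  # Gaussian neighbourhood
        weights += lr * h * (x - weights)
    return weights

# Mapping each document to its best-matching unit afterwards yields the
# topological organization: similar documents land on neighbouring nodes.
```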

  18. Evolving Techniques of Documentation of a World Heritage Site in Lahore

    Science.gov (United States)

    Arif, R.; Essa, K.

    2017-08-01

    Lahore is an ancient, culturally rich city amidst which are embedded two world heritage sites. The state of historic preservation in the country is impoverished, with a dearth of training and poor documentation skills; thus these monuments are decaying and in dire need of attention. The Aga Khan Cultural Service - Pakistan (AKCSP) is one of the first organizations working in heritage conservation in the country. AKCSP is currently subjecting the UNESCO World Heritage site of the Mughal-era Lahore Fort to an intensive and multi-faceted architectural documentation process. This is presented here as a case study to chart the evolution of documentation techniques and enunciate the spectrum of challenges faced in documenting an intricate Mughal heritage site for conservation in the Pakistani context. 3D laser scanning is used for heritage conservation for the first time, and has since been utilised on heritage buildings and urban fabric in ongoing projects. These include the Lahore Fort and the Walled City of Lahore, as well as the Baltit Fort, a project restored in the past, assisting in the maintenance of conserved buildings. The documentation team is currently discovering the full potential of this technology, especially its use in heritage conservation, while overcoming the challenges faced, and is negotiating solutions to auto-generate 2D architectural drawings from the 3D point-cloud output. The historic architecture is juxtaposed with contemporary technology in a region where such a combination is rarely found. The goal is to continually develop the documentation methodologies whilst investigating other technologies in the future.

  19. EVOLVING TECHNIQUES OF DOCUMENTATION OF A WORLD HERITAGE SITE IN LAHORE

    Directory of Open Access Journals (Sweden)

    R. Arif

    2017-08-01

    Full Text Available Lahore is an ancient, culturally rich city amidst which are embedded two world heritage sites. The state of historic preservation in the country is impoverished, with a dearth of training and poor documentation skills; thus these monuments are decaying and in dire need of attention. The Aga Khan Cultural Service - Pakistan (AKCSP) is one of the first organizations working in heritage conservation in the country. AKCSP is currently subjecting the UNESCO World Heritage site of the Mughal-era Lahore Fort to an intensive and multi-faceted architectural documentation process. This is presented here as a case study to chart the evolution of documentation techniques and enunciate the spectrum of challenges faced in documenting an intricate Mughal heritage site for conservation in the Pakistani context. 3D laser scanning is used for heritage conservation for the first time, and has since been utilised on heritage buildings and urban fabric in ongoing projects. These include the Lahore Fort and the Walled City of Lahore, as well as the Baltit Fort, a project restored in the past, assisting in the maintenance of conserved buildings. The documentation team is currently discovering the full potential of this technology, especially its use in heritage conservation, while overcoming the challenges faced, and is negotiating solutions to auto-generate 2D architectural drawings from the 3D point-cloud output. The historic architecture is juxtaposed with contemporary technology in a region where such a combination is rarely found. The goal is to continually develop the documentation methodologies whilst investigating other technologies in the future.

  20. A New Classification Approach Based on Multiple Classification Rules

    OpenAIRE

    Zhongmei Zhou

    2014-01-01

    A good classifier can correctly predict new data for which the class label is unknown, so it is important to construct a high-accuracy classifier. Hence, classification techniques are very useful in ubiquitous computing. Associative classification achieves higher classification accuracy than some traditional rule-based classification approaches. However, the approach also has two major deficiencies. First, it generates a very large number of association classification rules, especially when t...

  1. Repair-oriented classification of aortic insufficiency: impact on surgical techniques and clinical outcomes.

    Science.gov (United States)

    Boodhwani, Munir; de Kerchove, Laurent; Glineur, David; Poncelet, Alain; Rubay, Jean; Astarci, Parla; Verhelst, Robert; Noirhomme, Philippe; El Khoury, Gébrine

    2009-02-01

    Valve repair for aortic insufficiency (AI) requires a tailored surgical approach determined by the leaflet and aortic disease. Over the past decade, we have developed a functional classification of AI, which guides repair strategy and can predict outcome. In this study, we analyze our experience with a systematic approach to aortic valve repair. From 1996 to 2007, 264 patients underwent elective aortic valve repair for aortic insufficiency (mean age 54 +/- 16 years; 79% male). The aortic valve (AV) was tricuspid in 171 patients, bicuspid in 90, and quadricuspid in 3. One hundred fifty-three patients had type I dysfunction (aortic dilatation), 134 had type II (cusp prolapse), and 40 had type III (restrictive). Thirty-six percent (96/264) of the patients had more than one identified mechanism. In-hospital mortality was 1.1% (3/264). Six patients experienced early repair failure; 3 underwent re-repair. The functional classification predicted the necessary repair techniques in 82-100% of patients, with adjunctive techniques being employed in up to 35% of patients. Mid-term follow-up (median [interquartile range]: 47 [29-73] months) revealed a late mortality rate of 4.2% (11/261, 10 cardiac). Five-year overall survival was 95 +/- 3%. Ten patients underwent aortic valve reoperation (1 re-repair). Freedom from recurrent AI (>2+) and from AV reoperation at 5 years was 88 +/- 3% and 92 +/- 4%, respectively, and patients with type I (82 +/- 9%; 93 +/- 5%) or II (95 +/- 5%; 94 +/- 6%) had better outcomes compared to type III (76 +/- 17%; 84 +/- 13%). Aortic valve repair is an acceptable therapeutic option for patients with aortic insufficiency. This functional classification allows a systematic approach to the repair of AI and can help to predict the surgical techniques required as well as the durability of repair. Restrictive cusp motion (type III), due to fibrosis or calcification, is an important predictor of recurrent AI following AV repair.

  2. A Novel Feature Extraction Technique Using Binarization of Bit Planes for Content Based Image Classification

    Directory of Open Access Journals (Sweden)

    Sudeep Thepade

    2014-01-01

    Full Text Available A number of techniques have been proposed for feature extraction using image binarization, whose efficiency depends on proper threshold selection for the binarization method. In this paper, a new feature extraction technique using image binarization is proposed. The technique binarizes the significant bit planes of an image by selecting local thresholds. The proposed algorithm was tested on a public dataset and compared with existing, widely used binarization-based feature extraction techniques. The results indicate that the proposed method outperforms the existing techniques and shows consistent classification performance.
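
    A minimal sketch of the bit-plane idea, under assumptions: the three most significant planes, 16x16 blocks, and a block-versus-plane mean rule stand in for the paper's local threshold selection.

```python
import numpy as np

def bitplane_features(img, planes=(7, 6, 5), block=16):
    """img: 2-D uint8 array; returns a binary feature vector for the image."""
    feats = []
    for p in planes:
        plane = (img >> p) & 1                        # extract bit plane p
        h, w = plane.shape
        for i in range(0, h - block + 1, block):
            for j in range(0, w - block + 1, block):
                tile = plane[i:i + block, j:j + block]
                # Local rule: a block is "on" when its density of set bits
                # exceeds the plane's global density.
                feats.append(float(tile.mean() > plane.mean()))
    return np.array(feats)

demo = (np.arange(64 * 64).reshape(64, 64) % 256).astype(np.uint8)
print(bitplane_features(demo).shape)                  # (48,) = 3 planes x 16 blocks
```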

  3. Preliminary Hazard Classification for the 105-B Reactor

    International Nuclear Information System (INIS)

    Kerr, N.R.

    1997-08-01

    This document summarizes the inventories of radioactive and hazardous materials present within the 105-B Reactor and uses the inventory information to determine the preliminary hazard classification for the facility's surveillance and maintenance activities. The preliminary hazard classification was determined to be Nuclear Category 3. Additional hazard and accident analysis will be documented in a separate report to define the hazard controls and final hazard classification.

  4. Border Lakes land-cover classification

    Science.gov (United States)

    Marvin Bauer; Brian Loeffelholz; Doug. Shinneman

    2009-01-01

    This document contains metadata and description of land-cover classification of approximately 5.1 million acres of land bordering Minnesota, U.S.A. and Ontario, Canada. The classification focused on the separation and identification of specific forest-cover types. Some separation of the nonforest classes also was performed. The classification was derived from multi-...

  5. Graph-based Techniques for Topic Classification of Tweets in Spanish

    Directory of Open Access Journals (Sweden)

    Hector Cordobés

    2014-03-01

    Full Text Available Topic classification of texts is one of the most interesting challenges in Natural Language Processing (NLP). Topic classifiers commonly use a bag-of-words approach, in which the classifier uses (and is trained with) selected terms from the input texts. In this work we present techniques based on graph similarity to classify short texts by topic. In our classifier we build graphs from the input texts, and then use properties of these graphs to classify them. We have tested the resulting algorithm by classifying Twitter messages in Spanish among a predefined set of topics, achieving more than 70% accuracy.
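
    The record does not spell out the exact graph similarity used, so the sketch below is an illustrative stand-in: each text becomes a word co-occurrence graph and topics are scored by Jaccard overlap of edges.

```python
import networkx as nx

def text_graph(text):
    words = text.lower().split()
    g = nx.Graph()
    g.add_edges_from(zip(words, words[1:]))       # adjacent words share an edge
    return g

def edge_set(g):
    return {tuple(sorted(e)) for e in g.edges()}  # normalize undirected edges

def similarity(g1, g2):
    e1, e2 = edge_set(g1), edge_set(g2)
    return len(e1 & e2) / (len(e1 | e2) or 1)     # Jaccard overlap of edges

def classify(text, topic_graphs):
    g = text_graph(text)
    return max(topic_graphs, key=lambda k: similarity(g, topic_graphs[k]))

topics = {
    "futbol": text_graph("el partido de futbol termina con gol"),
    "politica": text_graph("el congreso debate la nueva ley"),
}
print(classify("gran gol en el partido de futbol", topics))   # -> futbol
```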

  6. 10 CFR 1045.37 - Classification guides.

    Science.gov (United States)

    2010-01-01

    ... 10 Energy 4 2010-01-01 2010-01-01 false Classification guides. 1045.37 Section 1045.37 Energy DEPARTMENT OF ENERGY (GENERAL PROVISIONS) NUCLEAR CLASSIFICATION AND DECLASSIFICATION Generation and Review of Documents Containing Restricted Data and Formerly Restricted Data § 1045.37 Classification guides...

  7. IRIS COLOUR CLASSIFICATION SCALES – THEN AND NOW

    Science.gov (United States)

    Grigore, Mariana; Avram, Alina

    2015-01-01

    Eye colour is one of the most obvious phenotypic traits of an individual. Since the first documented classification scale, developed in 1843, there have been numerous attempts to classify iris colour. In past centuries, iris colour classification scales have had various colour categories and mostly relied on comparing an individual's eye with painted glass eyes. Once photography techniques were refined, standard iris photographs replaced painted eyes, but this did not solve the problem of painted/printed colour variability over time. Early clinical scales were easy to use but lacked objectivity and were not standardised or statistically tested for reproducibility. The era of automated iris colour classification systems came with technological development. Spectrophotometry, digital analysis of high-resolution iris images, hyperspectral analysis of the real human iris and dedicated iris colour analysis software have all accomplished objective, accurate iris colour classification, but they are quite expensive and largely limited to research environments. Iris colour classification systems have evolved continuously due to their use in a wide range of studies, especially in anthropology, epidemiology and genetics. Despite the wide range of existing scales, there is as yet no generally accepted iris colour classification scale. PMID:27373112

  8. On the query reformulation technique for effective MEDLINE document retrieval.

    Science.gov (United States)

    Yoo, Sooyoung; Choi, Jinwook

    2010-10-01

    Improving the retrieval accuracy of MEDLINE documents is still a challenging issue due to low retrieval precision. Focusing on a query expansion technique based on pseudo-relevance feedback (PRF), this paper addresses the problem by systematically examining the effects of expansion term selection and of adjusting the term weights of the expanded query, using a set of MEDLINE test documents called OHSUMED. Implementing a baseline information retrieval system based on the Okapi BM25 retrieval model, we compared six well-known term ranking algorithms for useful expansion term selection, and then compared traditional term reweighting algorithms with our new variant of the standard Rocchio feedback formula, which adopts a group-based weighting scheme. Our experimental results on the OHSUMED test collection showed a maximum improvement of 20.2% and 20.4% in mean average precision and recall over unexpanded queries when terms were expanded using a co-occurrence analysis-based term ranking algorithm in conjunction with our term reweighting algorithm, a statistically significant gain for MEDLINE document retrieval.
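
    A minimal sketch of Rocchio-style pseudo-relevance feedback follows; the alpha/beta weights, expansion size, and toy corpus are illustrative, and the paper's group-based reweighting variant is not reproduced here.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = ["heart failure treatment outcomes", "aspirin therapy heart disease",
        "renal failure dialysis", "cardiac surgery recovery outcomes"]
vec = TfidfVectorizer()
D = vec.fit_transform(docs)

def rocchio_expand(query, k=2, alpha=1.0, beta=0.75, n_terms=5):
    q = vec.transform([query])
    top = np.argsort(-cosine_similarity(q, D).ravel())[:k]    # pseudo-relevant set
    centroid = np.asarray(D[top].mean(axis=0))                # mean of top-k docs
    q_new = alpha * q.toarray() + beta * centroid             # Rocchio update
    terms = np.array(vec.get_feature_names_out())
    return q_new, terms[np.argsort(-q_new.ravel())[:n_terms]]

weights, expansion = rocchio_expand("heart outcomes")
print(expansion)   # original terms plus expansion terms drawn from top documents
```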

  9. Machine Learning Techniques for Stellar Light Curve Classification

    Science.gov (United States)

    Hinners, Trisha A.; Tat, Kevin; Thorp, Rachel

    2018-07-01

    We apply machine learning techniques in an attempt to predict and classify stellar properties from noisy and sparse time-series data. We preprocessed over 94 GB of Kepler light curves from the Mikulski Archive for Space Telescopes (MAST) to classify according to 10 distinct physical properties using both representation learning and feature engineering approaches. Studies using machine learning in the field have been primarily done on simulated data, making our study one of the first to use real light-curve data for machine learning approaches. We tuned our data using previous work with simulated data as a template and achieved mixed results between the two approaches. Representation learning using a long short-term memory recurrent neural network produced no successful predictions, but our work with feature engineering was successful for both classification and regression. In particular, we were able to achieve values for stellar density, stellar radius, and effective temperature with low error (∼2%–4%) and good accuracy (∼75%) for classifying the number of transits for a given star. The results show promise for improvement for both approaches upon using larger data sets with a larger minority class. This work has the potential to provide a foundation for future tools and techniques to aid in the analysis of astrophysical data.

  10. Ahmad's NPRT System: A Practical Innovation for Documenting Male Pattern Baldness

    OpenAIRE

    Ahmad, Muhammad

    2016-01-01

    Various classifications for male pattern baldness are mentioned in the literature. Norwood's classification is the most commonly used, but it has certain limitations. The new system includes three extra features that were not mentioned in any other classification. It provides an opportunity to document the full and correct picture when documenting male pattern baldness. It also aids in assessing the treatment for various degrees of baldness.

  11. Ahmad's NPRT system: A practical innovation for documenting male pattern baldness

    Directory of Open Access Journals (Sweden)

    Muhammad Ahmad

    2016-01-01

    Full Text Available Various classifications for male pattern baldness are mentioned in the literature. Norwood's classification is the most commonly used, but it has certain limitations. The new system includes three extra features that were not mentioned in any other classification. It provides an opportunity to document the full and correct picture when documenting male pattern baldness. It also aids in assessing the treatment for various degrees of baldness.

  12. Classification Technique for Ultrasonic Weld Inspection Signals using a Neural Network based on 2-Dimensional Fourier Transform and Principal Component Analysis

    International Nuclear Information System (INIS)

    Kim, Jae Joon

    2004-01-01

    Neural network-based signal classification systems are increasingly used in the analysis of large volumes of data obtained in NDE applications. Ultrasonic inspection methods, on the other hand, are commonly used in the nondestructive evaluation of welds to detect flaws. An important characteristic of ultrasonic inspection is the ability to identify the type of discontinuity that gives rise to a particular signal. Standard techniques rely on differences in individual A-scans to classify the signals. This paper proposes an ultrasonic signal classification technique based on the information lying in neighboring signals. The approach uses a 2-dimensional Fourier transform and principal component analysis to generate a reduced-dimensional feature vector for classification. Results of applying the technique to data obtained from the inspection of actual steel welds are presented.
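
    A minimal sketch of that pipeline, with the patch size, component count, and synthetic data as illustrative assumptions: neighbouring A-scans are stacked into a 2-D patch, the 2-D Fourier magnitude is taken, and PCA reduces it to a compact feature vector.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier

def patch_spectrum(patch):
    """patch: (n_scans, n_samples) block of neighbouring A-scans."""
    mag = np.abs(np.fft.fft2(patch))
    return np.fft.fftshift(mag).ravel()           # centre low frequencies, flatten

rng = np.random.default_rng(0)
patches = rng.standard_normal((200, 8, 64))       # synthetic neighbouring-scan blocks
labels = rng.integers(0, 3, 200)                  # e.g. crack / porosity / geometry

X = PCA(n_components=20).fit_transform(np.array([patch_spectrum(p) for p in patches]))
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0).fit(X, labels)
print(clf.score(X, labels))
```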

  13. Content Abstract Classification Using Naive Bayes

    Science.gov (United States)

    Latif, Syukriyanto; Suwardoyo, Untung; Aldrin Wihelmus Sanadi, Edwin

    2018-03-01

    This study aims to classify abstracts based on the most frequent words in the abstract content of English-language journals. The research uses text mining, which extracts text data to find information in a set of documents. A total of 120 abstracts were downloaded from www.computer.org and grouped into three categories: DM (Data Mining), ITS (Intelligent Transport System) and MM (Multimedia). The system was built using the naive Bayes algorithm to classify the abstracts, with a feature selection process using term weighting to give a weight to each word. Dimensionality reduction techniques were applied to discard words that rarely appear in the documents, with reduction parameters tested from 10% to 90% of the 5,344 words. The performance of the classification system was evaluated using a confusion matrix on held-out test data. The best classification results were obtained with 75% of the data used for training and 25% for testing. Accuracy for the DM, ITS and MM categories was 100%, 100% and 86%, respectively, with a dimensionality reduction parameter of 30% and a learning rate between 0.1 and 0.5.
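
    A minimal sketch of the pipeline described, with a three-abstract toy corpus standing in for the 120 abstracts and TF-IDF standing in for the term-weighting step:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

abstracts = [
    "mining frequent patterns from large transaction databases",
    "traffic flow prediction for intelligent transport systems",
    "video streaming and multimedia content delivery",
]
categories = ["DM", "ITS", "MM"]

model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(abstracts, categories)
print(model.predict(["association rule mining in databases"]))   # -> ['DM']
```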

  14. Proposal plan of classification faceted for federal universities

    Directory of Open Access Journals (Sweden)

    Renata Santos Brandão

    2017-09-01

    Full Text Available This study presents a faceted classification plan for the archival management of documents in the federal universities of Brazil. To this end, a literature review was conducted on archival management in Brazil, the types of classification plans and Ranganathan's theory of faceted classification, through searches in databases in the areas of Librarianship and Archivology. The classification plan used in the Federal Institutions of Higher Education was identified to represent the functional facet, and a structural classification plan was created to represent the structural facet. The two classification plans were inserted into a digital repository management system to give rise to the faceted classification plan. The system used was Tainacan, free WordPress-based software used in digital document management. The resulting faceted classification plan allows the user to choose, and even combine, ways of looking for information, which ensures greater efficiency in information retrieval.

  15. On the Feature Selection and Classification Based on Information Gain for Document Sentiment Analysis

    Directory of Open Access Journals (Sweden)

    Asriyanti Indah Pratiwi

    2018-01-01

    Full Text Available Sentiment analysis of movie reviews answers a need of today's lifestyle. Unfortunately, an enormous number of features makes sentiment analysis slow and less sensitive, and finding the optimal feature selection and classification is still a challenge. In order to handle an enormous number of features and provide better sentiment classification, an information-based feature selection and classification scheme is proposed. The proposed method removes more than 90% of unnecessary features, while the proposed classification scheme achieves 96% accuracy in sentiment classification. From the experimental results, it can be concluded that the combination of the proposed feature selection and classification achieves the best performance so far.
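
    A minimal sketch of information-based feature selection for sentiment data; mutual_info_classif serves here as the information-gain criterion, and the four toy reviews and the k=2 retained features are illustrative.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

reviews = ["brilliant film , brilliant acting", "dull film , dull plot",
           "brilliant and moving", "dull and tedious"]
labels = ["pos", "neg", "pos", "neg"]

model = make_pipeline(
    CountVectorizer(),
    SelectKBest(mutual_info_classif, k=2),   # keep only the most informative terms
    MultinomialNB(),
)
model.fit(reviews, labels)
print(model.predict(["a brilliant moving story"]))   # -> ['pos']
```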

  16. A methodology for semiautomatic taxonomy of concepts extraction from nuclear scientific documents using text mining techniques; Metodologia para extracao semiautomatica de uma taxonomia de conceitos a partir da producao cientifica da area nuclear utilizando tecnicas de mineracao de textos

    Energy Technology Data Exchange (ETDEWEB)

    Braga, Fabiane dos Reis

    2013-07-01

    This thesis presents a text mining method for the semi-automatic extraction of a taxonomy of concepts from a textual corpus composed of scientific papers related to the nuclear area. Text classification is a natural human practice and a crucial task for work with large repositories. Document clustering provides a logical and understandable framework that facilitates organization, browsing and searching. Most clustering algorithms use the bag-of-words model to represent the content of a document. This model generates high-dimensional data, ignores the fact that different words can have the same meaning, and does not consider the relationships between words, assuming that words are independent of each other. The methodology combines a concept-based document representation model with a hierarchical document clustering method that uses the co-occurrence frequency of concepts, and a technique for labelling clusters with their most representative concepts, with the objective of producing a taxonomy of concepts that reflects the structure of the knowledge domain. It is hoped that this work will contribute to the conceptual mapping of the scientific production of the nuclear area and thus support the management of research activities in this area. (author)

  17. 32 CFR 2400.16 - Derivative classification markings.

    Science.gov (United States)

    2010-07-01

    ... SECURITY PROGRAM Derivative Classification § 2400.16 Derivative classification markings. (a) Documents... 32 National Defense 6 2010-07-01 2010-07-01 false Derivative classification markings. 2400.16..., as described in § 2400.12 of this part, the information may not be used as a basis for derivative...

  18. Machine printed text and handwriting identification in noisy document images.

    Science.gov (United States)

    Zheng, Yefeng; Li, Huiping; Doermann, David

    2004-03-01

    In this paper, we address the problem of the identification of text in noisy document images. We are especially focused on segmenting and distinguishing between handwriting and machine printed text because: 1) handwriting in a document often indicates corrections, additions, or other supplemental information that should be treated differently from the main content and 2) the segmentation and recognition techniques required for machine printed and handwritten text are significantly different. A novel aspect of our approach is that we treat noise as a separate class and model noise based on selected features. Trained Fisher classifiers are used to identify machine printed text and handwriting from noise, and we further exploit context to refine the classification. A Markov Random Field-based (MRF) approach is used to model the geometrical structure of the printed text, handwriting, and noise to rectify misclassifications. Experimental results show that our approach is robust and can significantly improve page segmentation in noisy document collections.

  19. Search techniques in intelligent classification systems

    CERN Document Server

    Savchenko, Andrey V

    2016-01-01

    A unified methodology for categorizing various complex objects is presented in this book. Through probability theory, novel asymptotically minimax criteria suitable for practical applications in imaging and data analysis are examined including the special cases such as the Jensen-Shannon divergence and the probabilistic neural network. An optimal approximate nearest neighbor search algorithm, which allows faster classification of databases is featured. Rough set theory, sequential analysis and granular computing are used to improve performance of the hierarchical classifiers. Practical examples in face identification (including deep neural networks), isolated commands recognition in voice control system and classification of visemes captured by the Kinect depth camera are included. This approach creates fast and accurate search procedures by using exact probability densities of applied dissimilarity measures. This book can be used as a guide for independent study and as supplementary material for a technicall...

  20. Security classification of information

    Energy Technology Data Exchange (ETDEWEB)

    Quist, A.S.

    1993-04-01

    This document is the second of a planned four-volume work that comprehensively discusses the security classification of information. The main focus of Volume 2 is on the principles for classification of information. Included herein are descriptions of the two major types of information that governments classify for national security reasons (subjective and objective information), guidance to use when determining whether information under consideration for classification is controlled by the government (a necessary requirement for classification to be effective), information disclosure risks and benefits (the benefits and costs of classification), standards to use when balancing information disclosure risks and benefits, guidance for assigning classification levels (Top Secret, Secret, or Confidential) to classified information, guidance for determining how long information should be classified (classification duration), classification of associations of information, classification of compilations of information, and principles for declassifying and downgrading information. Rules or principles of certain areas of our legal system (e.g., trade secret law) are sometimes mentioned to provide added support to some of those classification principles.

  1. Computer-aided classification of lung nodules on computed tomography images via deep learning technique

    Directory of Open Access Journals (Sweden)

    Hua KL

    2015-08-01

    Full Text Available Kai-Lung Hua,1 Che-Hao Hsu,1 Shintami Chusnul Hidayati,1 Wen-Huang Cheng,2 Yu-Jen Chen3 1Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, 2Research Center for Information Technology Innovation, Academia Sinica, 3Department of Radiation Oncology, MacKay Memorial Hospital, Taipei, Taiwan Abstract: Lung cancer has a poor prognosis when not diagnosed early and unresectable lesions are present. The management of small lung nodules noted on computed tomography scan is controversial due to uncertain tumor characteristics. A conventional computer-aided diagnosis (CAD) scheme requires several image processing and pattern recognition steps to accomplish a quantitative tumor differentiation result. In such an ad hoc image analysis pipeline, every step depends heavily on the performance of the previous step. Accordingly, tuning of classification performance in a conventional CAD scheme is very complicated and arduous. Deep learning techniques, on the other hand, have the intrinsic advantage of automatic feature exploitation and tuning of performance in a seamless fashion. In this study, we attempted to simplify the image analysis pipeline of conventional CAD with deep learning techniques. Specifically, we introduced models of a deep belief network and a convolutional neural network in the context of nodule classification in computed tomography images. Two baseline methods with feature computing steps were implemented for comparison. The experimental results suggest that deep learning methods could achieve better discriminative results and hold promise in the CAD application domain. Keywords: nodule classification, deep learning, deep belief network, convolutional neural network

  2. Tissue Classification

    DEFF Research Database (Denmark)

    Van Leemput, Koen; Puonti, Oula

    2015-01-01

    Computational methods for automatically segmenting magnetic resonance images of the brain have seen tremendous advances in recent years. So-called tissue classification techniques, aimed at extracting the three main brain tissue classes (white matter, gray matter, and cerebrospinal fluid), are now well established. In their simplest form, these methods classify voxels independently based on their intensity alone, although much more sophisticated models are typically used in practice. This article aims to give an overview of often-used computational techniques for brain tissue classification.

  3. Testing photogrammetry-based techniques for three-dimensional surface documentation in forensic pathology.

    Science.gov (United States)

    Urbanová, Petra; Hejna, Petr; Jurda, Mikoláš

    2015-05-01

    Three-dimensional surface technologies, particularly close-range photogrammetry and optical surface scanning, have recently advanced into affordable, flexible and accurate techniques. Forensic postmortem investigation as performed on a daily basis, however, has not yet fully benefited from their potential. In the present paper, we tested two approaches to 3D external body documentation - digital camera-based photogrammetry combined with commercial Agisoft PhotoScan(®) software and stereophotogrammetry-based Vectra H1(®), a portable handheld surface scanner. Three human subjects were selected for the study: a living person, a 25-year-old female, and two forensic cases admitted for postmortem examination at the Department of Forensic Medicine, Hradec Králové, Czech Republic (both 63-year-old males), one dead of traumatic, self-inflicted injuries (suicide by hanging), the other diagnosed with heart failure. All three cases were photographed in a 360° manner with a Nikon 7000 digital camera and simultaneously documented with the handheld scanner. In addition to recording the pre-autopsy phase of the forensic cases, both techniques were employed at various stages of autopsy. The sets of collected digital images (approximately 100 per case) were further processed to generate point clouds and 3D meshes. The final 3D models (a pair per individual) were compared in terms of numbers of points and polygons, then assessed visually and compared quantitatively using an ICP alignment algorithm and a point-cloud comparison technique based on closest point-to-point distances. Both techniques proved easy to handle and equally laborious. While collecting the images at autopsy took around 20 min, the post-processing was much more time-demanding and required up to 10 h of computation time. Moreover, for full-body scanning the post-processing of the handheld scanner data required rather time-consuming manual image alignment. In all instances the applied approaches

  4. Classification of Tropical River Using Chemometrics Technique: Case Study in Pahang River, Malaysia

    International Nuclear Information System (INIS)

    Mohd Khairul Amri Kamarudin; Mohd Ekhwan Toriman; Nur Hishaam Sulaiman

    2015-01-01

    River classification is very important for understanding river characteristics in a study area, and such a database helps to explain the behaviour of the river. This article discusses river classification using chemometric techniques along the mainstream of the Pahang River. Based on a river survey and GIS and remote sensing databases, chemometric analysis techniques were used to identify clusters on the Pahang River using Hierarchical Agglomerative Cluster Analysis (HACA). A calibration and validation process using Discriminant Analysis (DA) was used to confirm the HACA result, and a Principal Component Analysis (PCA) study identified the variables with the strongest coefficients underlying the classification. The results indicated that the main Pahang River can be classed into three main clusters: upstream, middle stream and downstream. Based on DA, the calibration and validation models were 100% conclusive. PCA indicated three variables with significant correlations: dominant slope with R² 0.796, L/D ratio with R² -0.868 and sinuosity with R² 0.557. A map of the river classification with moving classes was also produced, in which the green class lies in the valley erosion zone, yellow on the low terraces of land near the channels, and red in the flood plain and valley deposition zone. From these results, basic information can be produced to understand the characteristics of the main Pahang River. The findings provide local authorities with basic data and guidelines for decision-making and for future study of the Pahang River specifically, and of tropical rivers in general. (author)

  5. Interagency Security Classification Appeals Panel (ISCAP) Decisions

    Data.gov (United States)

    National Archives and Records Administration — This online collection includes documents decided upon by the Interagency Security Classification Appeals Panel (ISCAP) starting in Fiscal Year 2012. The documents...

  6. Legal technique: approaches to section on types

    Directory of Open Access Journals (Sweden)

    І. Д. Шутак

    2015-11-01

    Full Text Available Legal technique is a branch of knowledge about the rules of doing legal work and creating, in the process, a variety of legal documents; it had previously been part of the theory of law. In modern conditions, legal technique has become a separate branch of legal science, focused on solving practical problems. The purpose of this article is to analyze the types of legal technique and, in particular, on the basis of theoretical propositions about legal technique, to identify the substantial characteristics and types of legal technique. O. Malko and M. Matuzov consider legal technique as a set of rules, techniques and methods for the preparation, creation and registration of legal documents, and for their classification and accounting, aimed at their excellence and efficient use. A similar meaning is given to this concept by Alekseev, who determines that legal technique is a set of tools and techniques used, in accordance with accepted rules, in the formulation and systematization of legal acts to ensure their perfection. Thus, legal technique is a theoretical and applied legal science which studies the regularities of rational legal practice in the creation, interpretation and implementation of law. As to the types of legal technique, different classifications have been proposed in the literature. For example, G. Muromtsev divides technique used only in the field of law into the technique of law-making (legislative technique), the technique of law enforcement, interpretation, the technique of judicial speech, interrogation, and notarial activities. V. Kartashov divides legal technique into law-making and law-enforcement varieties (interpretive, judicial or investigative, prosecutorial, and the like). Some authors clearly indicate the criterion by which types of legal technique are distinguished. Thus, S. Alekseev notes that legal technique is classified, from the point of view of the legal nature of the act, into: a) techniques of legal acts; b) the

  7. Document boundary determination using structural and lexical analysis

    Science.gov (United States)

    Taghva, Kazem; Cartright, Marc-Allen

    2009-01-01

    The document boundary determination problem is the process of identifying individual documents in a stack of papers. In this paper, we report on a classification system for automation of this process. The system employs features based on document structure and lexical content. We also report on experimental results to support the effectiveness of this system.

  8. Supervised Classification of Agricultural Land Cover Using a Modified k-NN Technique (MNN) and Landsat Remote Sensing Imagery

    Directory of Open Access Journals (Sweden)

    Karsten Schulz

    2009-11-01

    Full Text Available Nearest neighbor techniques are commonly used in remote sensing, pattern recognition and statistics to classify objects into a predefined number of categories based on a given set of predictors. These techniques are especially useful for highly nonlinear relationships between the variables. In most studies the distance measure is adopted a priori. In contrast, we propose a general procedure to find an adaptive metric that combines a local variance-reducing technique and a linear embedding of the observation space into an appropriate Euclidean space. To illustrate the application of this technique, two agricultural land cover classifications using mono-temporal and multi-temporal Landsat scenes are presented. The results of the study, compared with standard approaches used in remote sensing such as maximum likelihood (ML) or k-Nearest Neighbor (k-NN), indicate substantial improvement with regard to the overall accuracy and the cardinality of the calibration data set. Also, using MNN in a soft/fuzzy classification framework proved to be a very useful tool for deriving critical areas that need further attention and investment in additional calibration data.

  9. New technique for real-time distortion-invariant multiobject recognition and classification

    Science.gov (United States)

    Hong, Rutong; Li, Xiaoshun; Hong, En; Wang, Zuyi; Wei, Hongan

    2001-04-01

    A real-time hybrid distortion-invariant optical pattern recognition (OPR) system was established to perform 3D multiobject distortion-invariant automatic pattern recognition. A wavelet transform technique was used for digital preprocessing of the input scene, to suppress the noisy background and enhance the object to be recognized. A three-layer backpropagation artificial neural network was used in correlation signal post-processing to perform multiobject distortion-invariant recognition and classification. The C-80 and NOA real-time processing capability and multithread programming were used to perform high-speed parallel multitask processing and to speed up the post-processing rate for ROIs. The reference filter library (RFL) was constructed for distorted versions of the 3D object model images, based on measured tolerances for distortion parameters such as rotation, azimuth and scale. Real-time optical correlation recognition testing of this OPR system demonstrates that, by using the preprocessing, the post-processing, the nonlinear optimum-filtering algorithm, the RFL construction technique and multithread programming, a high recognition probability and recognition rate were obtained for the real-time multiobject distortion-invariant OPR system. The recognition reliability and rate were improved greatly. These techniques are very useful for automatic target recognition.

  10. Classification of rabbit meat obtained with industrial and organic breeding by means of spectrocolorimetric technique

    Science.gov (United States)

    Menesatti, P.; D'Andrea, S.; Negretti, P.

    2007-09-01

    Because of its nutritional characteristics, rabbit meat is a food corresponding to new models of consumption. Quality improvement is possible by integrating extensive organic breeding with suitable rabbit genetic typologies. The aim of this work (financed by a Project of the Lazio Region, Italy) was the characterization of rabbit meat by a statistical model able to distinguish rabbit meat obtained from organic breeding from that produced industrially. This was pursued through the analysis of spectral data and colorimetric values. Two genetic typologies of rabbit, Leprino Viterbese and a commercial hybrid, were studied. The Leprino Viterbese has been bred with two different systems, organic and industrial. The commercial hybrid has been bred only industrially because of its high susceptibility to disease. The device used for opto-electronic analysis is a VIS-NIR image spectrometer (range: 400-970 nm). The instrument has a stabilized light, it works in accordance with the standard CIE L*a*b* technique, and it measures spectral reflectance and colorimetric coordinate values. Statistical data analysis was performed with the Partial Least Squares (PLS) technique. Part of the measured data was used to create the statistical model and the remaining data were used in the test phase to verify correct model classification. The results showed a high percentage of correct classification (90%) for the two rabbit meat classes deriving from organic and industrial breeding. Moreover, concerning the different genetic typologies, the percentage of correct classification was 90%.

  11. 10 CFR 1045.9 - RD classification performance evaluation.

    Science.gov (United States)

    2010-01-01

    ... 10 Energy 4 2010-01-01 2010-01-01 false RD classification performance evaluation. 1045.9 Section... classification performance evaluation. (a) Heads of agencies shall ensure that RD management officials and those... RD or FRD documents shall have their personnel performance evaluated with respect to classification...

  12. Automated Classification of Heritage Buildings for As-Built Bim Using Machine Learning Techniques

    Science.gov (United States)

    Bassier, M.; Vergauwen, M.; Van Genechten, B.

    2017-08-01

    Semantically rich three-dimensional models such as Building Information Models (BIMs) are increasingly used in digital heritage. They provide the information required by varying stakeholders during the different stages of the historic building's life cycle, which is crucial in the conservation process. The creation of as-built BIM models is based on point cloud data. However, manually interpreting this data is labour intensive and often leads to misinterpretations. By automatically classifying the point cloud, the information can be processed more efficiently. A key aspect in this automated scan-to-BIM process is the classification of building objects. In this research we look to automatically recognise elements in existing buildings to create compact semantic information models. Our algorithm efficiently extracts the main structural components such as floors, ceilings, roofs, walls and beams despite the presence of significant clutter and occlusions. More specifically, Support Vector Machines (SVMs) are proposed for the classification. The algorithm is evaluated using real data from a variety of existing buildings. The results prove that the classifier recognizes the objects with both high precision and recall. As a result, entire data sets are reliably labelled at once. The approach enables experts to better document and process heritage assets.
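
    A minimal sketch of the classification step, assuming patches have already been segmented; the covariance-eigenvalue features and the class list are illustrative, not the paper's exact feature set.

```python
import numpy as np
from sklearn.svm import SVC

def segment_features(points):
    """points: (N, 3) pre-segmented patch. Eigenvalues of the covariance
    capture planarity; the normal's z-component separates floors from walls."""
    cov = np.cov(points.T)
    evals = np.sort(np.linalg.eigvalsh(cov))[::-1]     # l1 >= l2 >= l3
    normal = np.linalg.eigh(cov)[1][:, 0]              # eigvec of smallest eval
    return np.array([
        (evals[1] - evals[2]) / evals[0],              # planarity
        abs(normal[2]),                                # horizontal vs. vertical
        np.ptp(points[:, 2]),                          # height extent
    ])

# With labelled training patches, an RBF-kernel SVM does the labelling:
# X = np.vstack([segment_features(p) for p in patches])
# clf = SVC(kernel="rbf").fit(X, labels)   # labels: floor/ceiling/roof/wall/beam
```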

  13. Social Media Text Classification by Enhancing Well-Formed Text Trained Model

    Directory of Open Access Journals (Sweden)

    Phat Jotikabukkana

    2016-09-01

    Full Text Available Social media are a powerful communication tool in our era of digital information. The large amount of user-generated data is a useful novel source of data, even though it is not easy to extract the treasures from this vast and noisy trove. Since classification is an important part of text mining, many techniques have been proposed to classify this kind of information. We developed an effective technique for social media text classification by semi-supervised learning, utilizing an online news source consisting of well-formed text. The computer first automatically extracts news categories, well categorized by publishers, as classes for topic classification. A bag of words taken from news articles provides the initial keywords related to each category in the form of word vectors. The principal task is to retrieve a set of new productive keywords. Term Frequency-Inverse Document Frequency weighting (TF-IDF) and a Word Article Matrix (WAM) are used as the main methods. A modification of the WAM is recomputed until it becomes the most effective model for social media text classification. The key success factor was enhancing our model with effective keywords from social media. A promising result of 99.50% accuracy was achieved, with Precision, Recall, and F-measure all above 98.5% after updating the model three times.
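
    A minimal sketch of the TF-IDF-plus-WAM idea: the word-article matrix is collapsed to one centroid per news category and a post goes to the nearest centroid. The toy corpus is illustrative, and the paper's iterative WAM update is not reproduced here.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

news = ["parliament passes the budget vote", "election results announced by parliament",
        "team wins the championship final", "coach praises the team after the match",
        "new smartphone released this week", "the phone features a faster chip"]
cats = ["politics", "politics", "sport", "sport", "tech", "tech"]

vec = TfidfVectorizer()
A = vec.fit_transform(news).toarray()          # word-article matrix (TF-IDF weighted)

labels = list(dict.fromkeys(cats))             # unique categories, in order
wam = np.vstack([A[np.array(cats) == k].mean(0) for k in labels])

def classify(post):
    v = vec.transform([post]).toarray()
    return labels[int(cosine_similarity(v, wam).argmax())]

print(classify("parliament to vote on the new budget"))   # -> politics
```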

  14. Establishing structure-property correlations and classification of base oils using statistical techniques and artificial neural networks

    International Nuclear Information System (INIS)

    Kapur, G.S.; Sastry, M.I.S.; Jaiswal, A.K.; Sarpal, A.S.

    2004-01-01

    The present paper describes various classification techniques, such as cluster analysis and principal component (PC)/factor analysis, to classify different types of base stocks. The API classification of base oils (Group I-III) has been compared to a more detailed classification based on NMR-derived chemical compositional and molecular structural parameters, in order to point out the similarities of the base oils in the same group and the differences between the oils placed in different groups. The detailed compositional parameters have been generated using 1H and 13C nuclear magnetic resonance (NMR) spectroscopic methods. Further, oxidation stability, measured in terms of rotating bomb oxidation test (RBOT) life, of non-conventional base stocks and their blends with conventional base stocks, has been quantitatively correlated with their 1H NMR and elemental (sulphur and nitrogen) data with the help of multiple linear regression (MLR) and artificial neural network (ANN) techniques. The MLR-based model developed using NMR and elemental data showed a high correlation between the 'measured' and 'estimated' RBOT values for both training (R=0.859) and validation (R=0.880) data sets. The ANN-based model, developed using a smaller number of input variables (only 1H NMR data), also showed high correlation between the 'measured' and 'estimated' RBOT values for training (R=0.881), validation (R=0.860) and test (R=0.955) data sets.

  15. Discriminant forest classification method and system

    Science.gov (United States)

    Chen, Barry Y.; Hanley, William G.; Lemmond, Tracy D.; Hiller, Lawrence J.; Knapp, David A.; Mugge, Marshall J.

    2012-11-06

    A hybrid machine learning methodology and system for classification that combines classical random forest (RF) methodology with discriminant analysis (DA) techniques to provide enhanced classification capability. A DA technique which uses feature measurements of an object to predict its class membership, such as linear discriminant analysis (LDA) or Andersen-Bahadur linear discriminant technique (AB), is used to split the data at each node in each of its classification trees to train and grow the trees and the forest. When training is finished, a set of n DA-based decision trees of a discriminant forest is produced for use in predicting the classification of new samples of unknown class.

  16. Study and development of equipment supervision technique system and its management software for nuclear electricity production

    International Nuclear Information System (INIS)

    Zhang Liying; Zou Pingguo; Zhu Chenghu; Lu Haoliang; Wu Jie

    2008-01-01

    The equipment supervision technique system, which standardizes the behavior of supervision organizations in planning and implementing equipment supervision, is built up on the basis of equipment supervision technique documents, such as Quality Supervision Classifications, Special Supervision Plans and Supervision Guides. Furthermore, based on this research, an equipment supervision management information system has been developed using object-oriented programming; it consists of supervision information, supervision technique, supervision implementation, and quality statistics and analysis modules. (authors)

  17. Nuclear power plant systems, structures and components and their safety classification

    International Nuclear Information System (INIS)

    2000-01-01

    The assurance of a nuclear power plant's safety is based on the reliable functioning of the plant as well as on its appropriate maintenance and operation. To ensure the reliability of operation, special attention shall be paid to the design, manufacturing, commissioning and operation of the plant and its components. To control these functions the nuclear power plant is divided into structural and functional entities, i.e. systems. A system's safety class is determined by its safety significance. The safety class specifies the procedures to be employed in plant design, construction, monitoring and operation. The classification document contains all documentation related to the classification of the nuclear power plant. The principles of safety classification and the procedures pertaining to the classification document are presented in this guide. In the Appendix of the guide, examples of systems most typical of each safety class are given to clarify the safety classification principles.

  18. New Framework for Cross-Domain Document Classification

    Science.gov (United States)

    2011-03-01

    An article discussing the culinary arts of Japan will belong to both a cooking category and a Japan category. Two different domains, e.g., Wikipedia and NYT, may... Clustering analysis of the 300 topics obtained the "Arts" category group, with 50 topics obtained independently from each of its six subcategories.

  19. Chemometric techniques in oil classification from oil spill fingerprinting.

    Science.gov (United States)

    Ismail, Azimah; Toriman, Mohd Ekhwan; Juahir, Hafizan; Kassim, Azlina Md; Zain, Sharifuddin Md; Ahmad, Wan Kamaruzaman Wan; Wong, Kok Fah; Retnam, Ananthy; Zali, Munirah Abdul; Mokhtar, Mazlin; Yusri, Mohd Ayub

    2016-10-15

    Extended use of GC-FID and GC-MS in oil spill fingerprinting and matching is significantly important for classifying oil from spill sources collected from various areas of Peninsular Malaysia and Sabah (East Malaysia). Oil spill fingerprinting from GC-FID and GC-MS, coupled with chemometric techniques (discriminant analysis and principal component analysis), is used as a diagnostic tool to classify the types of oil polluting the water. Oil spill compounds in water from actual spill sites cluster and discriminate into four groups, viz. diesel, Heavy Fuel Oil (HFO), Mixture Oil containing Light Fuel Oil (MOLFO) and Waste Oil (WO), according to the similarity of their intrinsic chemical properties. Principal component analysis (PCA) demonstrates that diesel, HFO, MOLFO and WO are types of oil or oil products from complex oil mixtures, with a total variance of 85.34%, and are identified with various anthropogenic activities related to either intentional release or accidental discharge of oil into the environment. Our results show that the use of chemometric techniques is significant in providing independent validation for classifying the types of spilled oil in the investigation of oil spill pollution in Malaysia. This, in consequence, would result in cost and time savings in identifying oil spill sources.

  20. An unsupervised technique for optimal feature selection in attribute profiles for spectral-spatial classification of hyperspectral images

    Science.gov (United States)

    Bhardwaj, Kaushal; Patra, Swarnajyoti

    2018-04-01

    Inclusion of spatial information along with spectral features plays a significant role in the classification of remote sensing images. Attribute profiles have already proved their ability to represent spatial information. In order to incorporate proper spatial information, multiple attributes are required, and for each attribute large profiles need to be constructed by varying the filter parameter values within a wide range. Thus, the constructed profiles that represent the spectral-spatial information of a hyperspectral image have huge dimension, which leads to the Hughes phenomenon and increases the computational burden. To mitigate these problems, this work presents an unsupervised feature selection technique that selects, from the constructed high-dimensional multi-attribute profile, a subset of filtered images that is sufficiently informative to discriminate well among classes. To this end the proposed technique exploits genetic algorithms (GAs). The fitness function of the GA is defined in an unsupervised way with the help of mutual information. The effectiveness of the proposed technique is assessed using a one-against-all support vector machine classifier. The experiments conducted on three hyperspectral data sets show the robustness of the proposed method in terms of computation time and classification accuracy.
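
    A compact sketch of the GA selection loop follows; chromosomes are binary masks over the profile's filtered images, and since the exact mutual-information-based fitness is not reproduced here, a simple informativeness-minus-redundancy proxy built from correlations stands in for it. Population size and rates are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
profile = rng.standard_normal((40, 5000))   # 40 filtered images, flattened pixels

def fitness(mask):
    if mask.sum() < 2:
        return -np.inf
    sel = profile[mask.astype(bool)]
    corr = np.abs(np.corrcoef(sel))
    redundancy = (corr.sum() - len(sel)) / (len(sel) * (len(sel) - 1))
    return sel.std(axis=1).mean() - redundancy      # informative, non-redundant

def ga_select(n_feat=40, pop=30, gens=50, p_mut=0.05):
    population = (rng.random((pop, n_feat)) < 0.3).astype(int)
    for _ in range(gens):
        scores = np.array([fitness(m) for m in population])
        parents = population[np.argsort(-scores)[:pop // 2]]   # truncation selection
        cuts = rng.integers(1, n_feat, size=pop // 2)          # one-point crossover
        kids = np.array([np.r_[parents[i][:c], parents[(i + 1) % len(parents)][c:]]
                         for i, c in enumerate(cuts)])
        kids ^= (rng.random(kids.shape) < p_mut).astype(int)   # bit-flip mutation
        population = np.vstack([parents, kids])
    return population[np.argmax([fitness(m) for m in population])]

mask = ga_select()
print("kept", int(mask.sum()), "of 40 filtered images")
```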

  1. Sentiment classification with interpolated information diffusion kernels

    NARCIS (Netherlands)

    Raaijmakers, S.

    2007-01-01

    Information diffusion kernels - similarity metrics in non-Euclidean information spaces - have been found to produce state-of-the-art results for document classification. In this paper, we present a novel approach to global sentiment classification using these kernels. We carry out a large array of

  2. Towards Automatic Classification of Wikipedia Content

    Science.gov (United States)

    Szymański, Julian

    Wikipedia - the Free Encyclopedia - encounters the problem of proper classification of new articles every day. The process of assigning articles to categories is performed manually and is a time-consuming task. It requires knowledge about the Wikipedia structure, which is beyond typical editor competence, and this leads to human-caused mistakes - omitted or wrong assignments of articles to categories. The article presents the application of an SVM classifier for automatic classification of documents from The Free Encyclopedia. The classifier was tested using two text representations: inter-document connections (hyperlinks) and word content. The results of the experiments, evaluated on hand-crafted data, show that the Wikipedia classification process can be partially automated. The proposed approach can be used for building a decision support system which suggests to editors the best categories that fit new content entered into Wikipedia.

  3. Recognition techniques for extracting information from semistructured documents

    Science.gov (United States)

    Della Ventura, Anna; Gagliardi, Isabella; Zonta, Bruna

    2000-12-01

    Archives of optical documents are employed more and more massively, with demand driven also by new norms sanctioning the legal value of digital documents, provided they are stored on supports that are physically unalterable. On the supply side there is now a vast and technologically advanced market, where optical memories have solved the problem of the duration and permanence of data at costs comparable to those of magnetic memories. The remaining bottleneck in these systems is indexing. The indexing of documents with a variable structure, while still not completely automated, can be machine-supported to a large degree, with evident advantages both in the organization of the work and in extracting information, providing data that is much more detailed and potentially significant for the user. We present here a system for the automatic registration of correspondence to and from a public office. The system is based on a general methodology for the extraction, indexing, archiving, and retrieval of significant information from semi-structured documents. This information, in our prototype application, is distributed among the database fields of sender, addressee, subject, date, and body of the document.

  4. Application of spectroscopic techniques for the study of paper documents: A survey

    International Nuclear Information System (INIS)

    Manso, M.; Carvalho, M.L.

    2009-01-01

    For many centuries paper was the main material for recording cultural achievements all over the world. Paper is mostly made from cellulose with small amounts of organic and inorganic additives, which allow its identification and characterization and may also contribute to its degradation. Prior to 1850, paper was made entirely from rags, using hemp, flax and cotton fibres. After this period, due to the enormous increase in demand, wood pulp began to be commonly used as raw material, resulting in rapid degradation of paper. Spectroscopic techniques represent one of the most powerful tools to investigate the constituents of paper documents in order to establish its identification and its state of degradation. This review describes the application of selected spectroscopic techniques used for paper characterization and conservation. The spectroscopic techniques that have been used and will be reviewed include: Fourier-Transform Infrared spectroscopy, Raman spectroscopy, Nuclear Magnetic Resonance spectroscopy, X-Ray spectroscopy, Laser-based Spectroscopy, Inductively Coupled Mass Spectroscopy, Laser ablation, Atomic Absorption Spectroscopy and X-Ray Photoelectron Spectroscopy.

  5. Multi-Agent Information Classification Using Dynamic Acquaintance Lists.

    Science.gov (United States)

    Mukhopadhyay, Snehasis; Peng, Shengquan; Raje, Rajeev; Palakal, Mathew; Mostafa, Javed

    2003-01-01

    Discussion of automated information services focuses on information classification and collaborative agents, i.e. intelligent computer programs. Highlights include multi-agent systems; distributed artificial intelligence; thesauri; document representation and classification; agent modeling; acquaintances, or remote agents discovered through…

  6. Requirements for the data transfer during the examination of design documentation

    Directory of Open Access Journals (Sweden)

    Karakozova Irina

    2017-01-01

    Full Text Available When design documents are transferred to the examination office, the number of incompatible electronic documents increases dramatically. The article discusses how to solve the problem of transferring the text and graphic data of design documentation for state and non-state expertise, as well as verification of estimates and requirement management. Methods for the recognition of system elements and requirements for the transfer of text and graphic design documents are provided. The need to use classification and coding of the various elements of information systems (structures, objects, resources, requirements, contracts, etc.) in data transfer systems is indicated separately. The authors have developed a sequence for document processing and data transmission during the examination, and propose a language for describing the construction of the facility that takes into account the classification criteria of the structures and construction works.

  7. Compression of Probabilistic XML Documents

    Science.gov (United States)

    Veldman, Irma; de Keijzer, Ander; van Keulen, Maurice

    Database techniques to store, query and manipulate data that contains uncertainty receive increasing research interest. Such UDBMSs can be classified according to their underlying data model: relational, XML, or RDF. We focus on uncertain XML DBMSs, with the Probabilistic XML model (PXML) of [10,9] as a representative example. The size of a PXML document is obviously a factor in performance. There are PXML-specific techniques to reduce the size, such as a push-down mechanism that produces equivalent but more compact PXML documents. It can only be applied, however, where possibilities are dependent. For normal XML documents there also exist several techniques for compressing a document. Since Probabilistic XML is (a special form of) normal XML, it might benefit from these methods even more. In this paper, we show that existing compression mechanisms can be combined with PXML-specific compression techniques. We also show that the best compression rates are obtained with a combination of a PXML-specific technique and a rather simple generic DAG-compression technique.

  8. Uncovering Document Fraud in Maritime Freight Transport Based on Probabilistic Classification

    OpenAIRE

    Triepels , Ron; Feelders , Ad; Daniels , Hennie

    2015-01-01

    Part 4: Data Analysis and Information Retrieval; International audience; Deficient visibility in global supply chains causes significant risks for the customs brokerage practices of freight forwarders. One of the risks that freight forwarders face is that shipping documentation might contain document fraud and is used to declare a shipment. Traditional risk controls are ineffective in this regard since the creation of shipping documentation is uncontrollable by freight forwarders. In this pap...

  9. Coordinator, Records Management | CRDI - Centre de ...

    International Development Research Centre (IDRC) Digital Library (Canada)

    Classify, index, cross-reference, and identify documents relating to IDRC's projects and administrative activities by assigning, in accordance with the Centre's records classification plan, the appropriate file number and title to each file, together with the keywords used for retrieval.

  10. Data preprocessing techniques for classification without discrimination

    NARCIS (Netherlands)

    Kamiran, F.; Calders, T.G.K.

    2012-01-01

    Recently, the following Discrimination-Aware Classification Problem was introduced: Suppose we are given training data that exhibit unlawful discrimination; e.g., toward sensitive attributes such as gender or ethnicity. The task is to learn a classifier that optimizes accuracy, but does not have

  11. Hazard classification of environmental restoration activities at the INEL

    International Nuclear Information System (INIS)

    Peatross, R.G.

    1996-04-01

    The following documents require that a hazard classification be prepared for all activities for which the US Department of Energy (DOE) has assumed environmental, safety, and health responsibility: DOE Order 5481.1B, Safety Analysis and Review System, and DOE Order 5480.23, Nuclear Safety Analysis Reports. A hazard classification defines the level of hazard posed by an operation or activity, assuming an unmitigated release of radioactive and nonradioactive hazardous material. For environmental restoration activities, the release threshold criteria presented in Hazard Baseline Documentation (DOE-EM-STD-5502-94) are used to determine classifications, such as Radiological, Nonnuclear, and Other Industrial facilities. Based upon DOE-EM-STD-5502-94, environmental restoration activities in all but one of the sites addressed by the scope of this classification (see Section 2) can be classified as "Other Industrial Facility". DOE-EM-STD-5502-94 states that a Health and Safety Plan and compliance with the applicable Occupational Safety and Health Administration (OSHA) standards are sufficient safety controls for this classification.

  12. Feature-Free Activity Classification of Inertial Sensor Data With Machine Vision Techniques: Method, Development, and Evaluation.

    Science.gov (United States)

    Dominguez Veiga, Jose Juan; O'Reilly, Martin; Whelan, Darragh; Caulfield, Brian; Ward, Tomas E

    2017-08-04

    Inertial sensors are one of the most commonly used sources of data for human activity recognition (HAR) and exercise detection (ED) tasks. The time series produced by these sensors are generally analyzed through numerical methods. Machine learning techniques such as random forests or support vector machines are popular in this field for classification efforts, but they need to be supported through the isolation of a potentially large number of additionally crafted features derived from the raw data. This feature preprocessing step can involve nontrivial digital signal processing (DSP) techniques. However, in many cases, the researchers interested in this type of activity recognition problem do not possess the necessary technical background for this feature-set development. The study aimed to present a novel application of established machine vision methods to provide interested researchers with an easier entry path into the HAR and ED fields, by removing the need for deep DSP skills through the use of transfer learning: a pretrained convolutional neural network (CNN) developed for machine vision purposes is reused for the exercise classification effort. The new method simply requires researchers to generate plots of the signals that they would like to build classifiers with, store them as images, and then place them in folders according to their training label before retraining the network. We applied a CNN, an established machine vision technique, to the task of ED. Tensorflow, a high-level framework for machine learning, was used to facilitate infrastructure needs. Simple time series plots generated directly from accelerometer and gyroscope signals are used to retrain an openly available neural network (Inception), originally developed for machine vision tasks. Data from 82 healthy volunteers, performing 5 different exercises while wearing a lumbar-worn inertial measurement unit (IMU), was collected. The ability of the
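
    A minimal sketch of the transfer-learning recipe described above: plot the inertial signals, save the plots as images sorted into per-exercise folders, then retrain a pretrained CNN on those folders. The directory layout and image size are assumptions; the paper used the Inception network via Tensorflow, which InceptionV3 stands in for here.

```python
import tensorflow as tf

base = tf.keras.applications.InceptionV3(include_top=False, pooling="avg",
                                          input_shape=(299, 299, 3))
base.trainable = False                      # keep the machine-vision features frozen

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(5, activation="softmax"),   # 5 exercises in the study
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# hypothetical folder tree: signal_plots/train/<exercise_label>/*.png
train = tf.keras.utils.image_dataset_from_directory(
    "signal_plots/train", image_size=(299, 299))
train = train.map(lambda x, y:
                  (tf.keras.applications.inception_v3.preprocess_input(x), y))
model.fit(train, epochs=5)
```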

  13. Classification of protein profiles using fuzzy clustering techniques

    DEFF Research Database (Denmark)

    Karemore, Gopal; Mullick, Jhinuk B.; Sujatha, R.

    2010-01-01

    The present study compares PCA and fuzzy clustering techniques in classifying protein profiles (chromatograms) of homogenates of different tissue origins: ovarian, cervix, and oral cancers, which were acquired using the HPLC-LIF (High Performance Liquid Chromatography-Laser Induced Fluorescence) method developed in our laboratory. The study includes 11 chromatogram spectra each from oral, cervical, and ovarian cancers as well as healthy volunteers. Generally, multivariate analysis like PCA demands clear data that is devoid of day... PCA mapping in classifying various cancers from healthy spectra, with a classification rate of up to 95% from 60%. Methods are validated using various clustering indexes and show promising improvement in developing optical pathology like HPLC-LIF for early detection of various...

  14. The potential of 3D techniques for cultural heritage object documentation

    Science.gov (United States)

    Bitelli, Gabriele; Girelli, Valentina A.; Remondino, Fabio; Vittuari, Luca

    2007-01-01

    The generation of 3D models of objects has become an important research point in many fields of application, like industrial inspection, robotics, navigation and body scanning. Recently the techniques for generating photo-textured 3D digital models have also interested the field of Cultural Heritage, due to their capability to combine high-precision metrical information with a qualitative and photographic description of the objects. In fact this kind of product is a fundamental support for the documentation, study and restoration of works of art, up to the production of replicas by fast prototyping techniques. Close-range photogrammetric techniques are nowadays more and more frequently used for the generation of precise 3D models. With the advent of automated procedures and fully digital products in the 1990s, photogrammetry has become easier and cheaper to use, and nowadays a wide range of commercial software is available to calibrate, orient and reconstruct objects from images. This paper presents the complete process for the derivation of a photorealistic 3D model of an important basalt stela (about 70 x 60 x 25 cm) discovered in the archaeological site of Tilmen Höyük, in Turkey, dating back to the 2nd mill. BC. We report the modeling performed using passive and active sensors and the comparison of the achieved results.

  15. “The Naming of Cats”: Automated Genre Classification

    Directory of Open Access Journals (Sweden)

    Yunhyong Kim

    2007-07-01

    Full Text Available This paper builds on the work presented at ECDL 2006 in automated genre classification as a step toward automating metadata extraction from digital documents for ingest into digital repositories such as those run by archives, libraries and eprint services (Kim & Ross, 2006b). We have previously proposed dividing the features of a document into five types (features for visual layout, language model features, stylometric features, features for semantic structure, and contextual features as an object linked to previously classified objects and other external sources) and have examined visual and language model features. The current paper compares results from testing classifiers based on image and stylometric features in a binary classification to show that certain genres have strong image features which enable effective separation of documents belonging to the genre from a large pool of other documents.

  16. Document image retrieval through word shape coding.

    Science.gov (United States)

    Lu, Shijian; Li, Linlin; Tan, Chew Lim

    2008-11-01

    This paper presents a document retrieval technique that is capable of searching document images without OCR (optical character recognition). The proposed technique retrieves document images by a new word shape coding scheme, which captures the document content through annotating each word image by a word shape code. In particular, we annotate word images by using a set of topological shape features including character ascenders/descenders, character holes, and character water reservoirs. With the annotated word shape codes, document images can be retrieved by either query keywords or a query document image. Experimental results show that the proposed document image retrieval technique is fast, efficient, and tolerant to various types of document degradation.

  17. Documenting for Posterity: Advocating the Use of Advanced Recording Techniques for Documentation in the Field of Building Archaeology

    Directory of Open Access Journals (Sweden)

    P. J. De Vos

    2017-08-01

    Full Text Available Since the new millennium, living in historic cities has become extremely popular in the Netherlands. As a consequence, historic environments are being adapted to meet modern living standards. Houses are constantly subjected to development, restoration and renovation. Although most projects are carried out with great care and strive to preserve and respect as much historic material as possible, nevertheless a significant amount of historical fabric disappears. This puts enormous pressure on building archaeologists that struggle to rapidly and accurately capture in situ authentic material and historical evidence in the midst of construction works. In Leiden, a medieval city that flourished during the seventeenth century and that today counts over 3,000 listed monuments, a solution to the problem has been found with the implementation of advanced recording techniques. Since 2014, building archaeologists of the city council have experienced first-hand that new recording techniques, such as laser scanning and photogrammetry, have dramatically decreased time spent on site with documentation. Time they now use to uncover, analyse and interpret the recovered historical data. Nevertheless, within building archaeology education, a strong case is made for hand drawing as a method for understanding a building, emphasising the importance of close observation and physical contact with the subject. In this paper, the use of advanced recording techniques in building archaeology is being advocated, confronting traditional educational theory with practise, and research tradition with the rapid rise of new recording technologies.

  19. Interactive Classification of Construction Materials: Feedback Driven Framework for Annotation and Analysis of 3d Point Clouds

    Science.gov (United States)

    Hess, M. R.; Petrovic, V.; Kuester, F.

    2017-08-01

    Digital documentation of cultural heritage structures is increasingly common through the application of different imaging techniques. Many works have focused on the application of laser scanning and photogrammetry techniques for the acquisition of three-dimensional (3D) geometry detailing cultural heritage sites and structures. With an abundance of these 3D data assets, there must be a digital environment where these data can be visualized and analyzed. Presented here is a feedback-driven visualization framework that seamlessly enables interactive exploration and manipulation of massive point cloud data. The focus of this work is on the classification of different building materials, with the goal of building more accurate as-built information models of historical structures. User-defined functions have been tested within the interactive point cloud visualization framework to evaluate automated and semi-automated classification of 3D point data; a sketch of such a function follows. These functions include decisions based on observed color, laser intensity, normal vector or local surface geometry. Multiple case studies are presented here to demonstrate the flexibility and utility of the presented point cloud visualization framework to achieve classification objectives.
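
    A sketch of a user-defined function of the kind the abstract describes: label each point as one of two materials from simple, hand-tuned rules on color, laser intensity, and the local normal. The thresholds, array layout, and material names are hypothetical.

```python
import numpy as np

def classify_points(rgb, intensity, normals):
    """rgb: (N,3) in 0..255, intensity: (N,) in 0..1, normals: (N,3) unit vectors."""
    labels = np.zeros(len(rgb), dtype=np.uint8)        # 0 = unclassified
    grayish = rgb.std(axis=1) < 12                     # low color saturation
    bright = intensity > 0.6                           # strong laser return
    vertical = np.abs(normals[:, 2]) < 0.2             # near-vertical surface (wall)
    labels[grayish & bright & vertical] = 1            # e.g., stone masonry
    labels[~grayish & vertical] = 2                    # e.g., painted or brick surface
    return labels
```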

  20. Web document engineering

    International Nuclear Information System (INIS)

    White, B.

    1996-05-01

    This tutorial provides an overview of several document engineering techniques which are applicable to the authoring of World Wide Web documents. It illustrates how pre-WWW hypertext research is applicable to the development of WWW information resources

  1. Pre-optimization of radiotherapy treatment planning: an artificial neural network classification aided technique

    International Nuclear Information System (INIS)

    Hosseini-Ashrafi, M.E.; Bagherebadian, H.; Yahaqi, E.

    1999-01-01

    A method has been developed which, by using the geometric information from treatment sample cases, selects an initial treatment plan from a given data set as a step toward treatment plan optimization. The method uses an artificial neural network (ANN) classification technique to select the best matching plan from the 'optimized' ANN database. Separate back-propagation ANN classifiers were trained using 50, 60 and 77 examples for three groups of treatment case classes (up to 21 examples from each class were used). The performance of the classifiers in selecting the correct treatment class was tested using the leave-one-out method; the networks were optimized with respect to their architecture. For the three groups used in this study, successful classification fractions of 0.83, 0.98 and 0.93 were achieved by the optimized ANN classifiers. The automated response of the ANN may be used to arrive at a pre-plan in which many treatment parameters are already identified, so a significant reduction in the steps required to arrive at the optimum plan may be achieved. Treatment planning 'experience' and also results from lengthy calculations may be used for training the ANN. (author)
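
    A sketch of the evaluation protocol described above: a small neural-network classifier scored with the leave-one-out method. scikit-learn's MLP stands in for the back-propagation ANN, and the geometric treatment features are placeholders.

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 12))               # geometric features of sample cases
y = rng.integers(0, 3, size=60)             # three treatment case classes

ann = MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000, random_state=0)
score = cross_val_score(ann, X, y, cv=LeaveOneOut()).mean()
print("leave-one-out classification fraction:", score)
```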

  2. A New Method for Solving Supervised Data Classification Problems

    Directory of Open Access Journals (Sweden)

    Parvaneh Shabanzadeh

    2014-01-01

    Full Text Available Supervised data classification is one of the techniques used to extract nontrivial information from data. Classification is a widely used technique in various fields, including data mining, industry, medicine, science, and law. This paper considers a new algorithm for supervised data classification problems associated with cluster analysis. The mathematical formulations for this algorithm are based on nonsmooth, nonconvex optimization. A new algorithm for solving this optimization problem is utilized; it uses a derivative-free technique and is robust and efficient. To improve classification performance and the efficiency of generating the classification model, a new feature selection algorithm based on techniques of convex programming is suggested. The proposed methods are tested on real-world datasets. Results of the numerical experiments demonstrate the effectiveness of the proposed algorithms.

  3. Clustering and classification of email contents

    Directory of Open Access Journals (Sweden)

    Izzat Alsmadi

    2015-01-01

    Full Text Available Information users depend heavily on email systems as one of the major sources of communication. Their importance and usage are continuously growing despite the evolution of mobile applications, social networks, etc. Emails are used on both the personal and professional levels. They can be considered as official documents in communication among users. Email data mining and analysis can be conducted for several purposes, such as spam detection and classification, subject classification, etc. In this paper, a large set of personal emails is used for the purpose of folder and subject classification. Algorithms are developed to perform clustering and classification for this large text collection. Classification based on n-grams is shown to be the best for such a large text collection, especially as the text is bilingual (i.e., with English and Arabic content).
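
    A minimal sketch of n-gram based classification for a mixed English/Arabic mail collection, in the spirit of the finding above. Character n-grams sidestep language-specific tokenization; the example emails and folder labels are placeholders.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

emails = ["Meeting moved to Sunday", "اجتماع يوم الأحد", "Invoice attached"]
folders = ["work", "work", "finance"]

clf = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),  # character n-grams
    MultinomialNB(),
)
clf.fit(emails, folders)
print(clf.predict(["الفاتورة مرفقة"]))   # classify a new Arabic email
```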

  4. Angle's Molar Classification Revisited

    Directory of Open Access Journals (Sweden)

    Devanshi Yadav

    2014-01-01

    Results: Of the 500 pretreatment study casts assessed, 52.4% were definitive Class I, 23.6% were Class II, 2.6% were Class III, and 21% were ambiguous cases. The ambiguous cases could be easily classified with our method of classification. Conclusion: This improvised classification technique will help orthodontists make the classification of malocclusion accurate and simple.

  5. Determination of the ecological connectivity between landscape patches obtained using the knowledge engineer (expert) classification technique

    Science.gov (United States)

    Selim, Serdar; Sonmez, Namik Kemal; Onur, Isin; Coslu, Mesut

    2017-10-01

    Connecting similar landscape patches with ecological corridors supports the habitat quality of these patches, increases urban ecological quality, and constitutes an important living and expansion area for wildlife. Furthermore, habitat connectivity provided by urban green areas supports biodiversity in urban areas. In this study, possible ecological connections between landscape patches were derived using the Expert classification technique and modeled with the probabilistic connection index. Firstly, the reflection responses of plants in various bands are used as data in the hypotheses. One of the important features of this method is being able to use more than one image at the same time in the formation of the hypothesis. For this reason, before starting the application of the Expert classification, the base images are prepared. In addition to the main image, the hypothesis conditions were also created for each class with the NDVI image, which is commonly used in vegetation research. Besides, the results of a previously conducted supervised classification were taken into account. We applied this classification method by using the raster imagery with user-defined variables. Thereupon, to provide ecological connections for the tree cover obtained from the classification, we used the Probabilistic Connection (PC) index. The probabilistic connection model, which is used in landscape planning and conservation studies for detecting and prioritizing critical areas for ecological connection, characterizes the possibility of direct connection between habitats. As a result we obtained over 90% total accuracy in the accuracy assessment analysis. We provided ecological connections with the PC index and created an inter-connected green space system, thus offering and implementing the green infrastructure system model that has been on planning agendas in recent years.

  6. Uncovering document fraud in maritime freight transport based on probabilistic classification

    NARCIS (Netherlands)

    Triepels, Ron; Feelders, A. F.; Daniels, Hennie

    2015-01-01

    Deficient visibility in global supply chains causes significant risks for the customs brokerage practices of freight forwarders. One of the risks that freight forwarders face is that shipping documentation might contain document fraud and is used to declare a shipment. Traditional risk controls are

  8. Mapping forested wetlands in the Great Zhan River Basin through integrating optical, radar, and topographical data classification techniques.

    Science.gov (United States)

    Na, X D; Zang, S Y; Wu, C S; Li, W L

    2015-11-01

    Knowledge of the spatial extent of forested wetlands is essential to many studies including wetland functioning assessment, greenhouse gas flux estimation, and wildlife suitable habitat identification. For discriminating forested wetlands from their adjacent land cover types, researchers have resorted to image analysis techniques applied to numerous remotely sensed data. While with some success, there is still no consensus on the optimal approaches for mapping forested wetlands. To address this problem, we examined two machine learning approaches, random forest (RF) and K-nearest neighbor (KNN) algorithms, and applied these two approaches to the framework of pixel-based and object-based classifications. The RF and KNN algorithms were constructed using predictors derived from Landsat 8 imagery, Radarsat-2 advanced synthetic aperture radar (SAR), and topographical indices. The results show that the object-based classifications performed better than per-pixel classifications using the same algorithm (RF) in terms of overall accuracy, and the difference in their kappa coefficients is statistically significant (p ... wetlands based on the per-pixel classifications using the RF algorithm. As for the object-based image analysis, there were also statistically significant differences (p ... wetlands and omissions for agriculture land. This research proves that the object-based classification with RF using optical, radar, and topographical data improved the mapping accuracy of land covers and provided a feasible approach to discriminate the forested wetlands from the other land cover types in forestry areas.
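
    A per-pixel sketch of the approach described above: stack optical, SAR, and topographic layers, then classify each pixel with a random forest. The band counts and arrays are placeholders; in practice the training pixels would come from digitized reference polygons.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# placeholder stack: rows x cols x (Landsat bands + SAR + topographic indices)
stack = np.random.rand(100, 100, 10)
train_mask = np.zeros((100, 100), dtype=bool)
train_mask[:10, :10] = True                                # pretend reference pixels
train_labels = np.random.randint(0, 4, train_mask.sum())   # 4 land cover classes

X = stack[train_mask]                                      # (n_train, 10) features
rf = RandomForestClassifier(n_estimators=500, oob_score=True).fit(X, train_labels)
print("OOB accuracy:", rf.oob_score_)

land_cover = rf.predict(stack.reshape(-1, 10)).reshape(100, 100)  # full-scene map
```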

  9. Exploiting the systematic review protocol for classification of medical abstracts.

    Science.gov (United States)

    Frunza, Oana; Inkpen, Diana; Matwin, Stan; Klement, William; O'Blenis, Peter

    2011-01-01

    To determine whether the automatic classification of documents can be useful in systematic reviews on medical topics, and specifically whether the performance of the automatic classification can be enhanced by using the particular protocol of questions employed by the human reviewers to create multiple classifiers. The test collection is the data used in a large-scale systematic review on the topic of the dissemination strategy of health care services for elderly people. From a group of 47,274 abstracts marked by human reviewers to be included in or excluded from further screening, we randomly selected 20,000 as a training set, with the remaining 27,274 becoming a separate test set. As a machine learning algorithm we used complement naïve Bayes. We tested both a global classification method, where a single classifier is trained on instances of abstracts and their classification (i.e., included or excluded), and a novel per-question classification method that trains multiple classifiers for each abstract, exploiting the specific protocol (questions) of the systematic review. For the per-question method we tested four ways of combining the results of the classifiers trained for the individual questions. As evaluation measures, we calculated precision and recall for several settings of the two methods. It is most important not to exclude any relevant documents (i.e., to attain high recall for the class of interest) but also desirable to exclude most of the non-relevant documents (i.e., to attain high precision on the class of interest) in order to reduce human workload. For the global method, the highest recall was 67.8% and the highest precision was 37.9%. For the per-question method, the highest recall was 99.2%, and the highest precision was 63%. The human-machine workflow proposed in this paper achieved a recall value of 99.6%, and a precision value of 17.8%. The per-question method that combines classifiers following the specific protocol of the review leads to better
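
    A sketch of the two methods compared above, using scikit-learn's complement naïve Bayes. The per-question variant trains one classifier per screening question and ORs the "include" votes, which is only one plausible way of combining them; the abstracts, questions, and labels are placeholders.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import ComplementNB
from sklearn.pipeline import make_pipeline

abstracts = ["elderly care dissemination trial", "mouse gene knockout study"]
global_label = [1, 0]                         # include / exclude overall
per_question = {"population": [1, 0],         # hypothetical protocol questions
                "intervention": [1, 0]}

# global method: one classifier on the overall include/exclude label
global_clf = make_pipeline(TfidfVectorizer(), ComplementNB()).fit(abstracts, global_label)

# per-question method: one classifier per protocol question
question_clfs = {q: make_pipeline(TfidfVectorizer(), ComplementNB()).fit(abstracts, y)
                 for q, y in per_question.items()}

def per_question_predict(texts):
    votes = np.array([clf.predict(texts) for clf in question_clfs.values()])
    return votes.max(axis=0)   # include if any question-classifier votes include
```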

  10. High Classification Rates for Continuous Cow Activity Recognition using Low-cost GPS Positioning Sensors and Standard Machine Learning Techniques

    DEFF Research Database (Denmark)

    Godsk, Torben; Kjærgaard, Mikkel Baun

    2011-01-01

    activities. By preprocessing the raw cow position data, we obtain high classification rates using standard machine learning techniques to recognize cow activities. Our objectives were to (i) determine to what degree it is possible to robustly recognize cow activities from GPS positioning data, using low... and their activities manually logged to serve as ground truth. For our dataset we managed to obtain an average classification success rate of 86.2% for the four activities: eating/seeking (90.0%), walking (100%), lying (76.5%), and standing (75.8%) by optimizing both the preprocessing of the raw GPS data...

  11. The Effect of Structured Decision-Making Techniques and Gender on Student Reaction and Quality of Written Documents.

    Science.gov (United States)

    Neal, Joan; Echternacht, Lonnie

    1995-01-01

    Experimental groups used four decision-making techniques--reverse brainstorming (RS), dialectical inquiry (DI), devil's advocacy (DA), and consensus--in evaluating writing assignments. The control group produced a better-quality document. Student reactions to the negative features of RS, DI, and DA were not significant. (SK)

  12. Document Classification in Support of Automated Metadata Extraction Form Heterogeneous Collections

    Science.gov (United States)

    Flynn, Paul K.

    2014-01-01

    A number of federal agencies, universities, laboratories, and companies are placing their documents online and making them searchable via metadata fields such as author, title, and publishing organization. To enable this, every document in the collection must be catalogued using the metadata fields. Though time consuming, the task of identifying…

  13. Classification of radiological procedures

    International Nuclear Information System (INIS)

    1989-01-01

    A classification for departments in Danish hospitals that use radiological procedures. The classification codes consist of 4 digits, of which the first 2 are the codes for the main groups. The first digit represents the procedure's topographical object and the second the technique. The last 2 digits identify individual procedures. (CLS)

  14. Real-time network traffic classification technique for wireless local area networks based on compressed sensing

    Science.gov (United States)

    Balouchestani, Mohammadreza

    2017-05-01

    Network traffic, or data traffic, in a Wireless Local Area Network (WLAN) is the amount of network packets moving across the wireless network from one wireless node to another, and it constitutes the sampling load of the network. WLAN network traffic is the main component of network traffic measurement, network traffic control and simulation. Traffic classification is an essential tool for improving the Quality of Service (QoS) of different wireless networks in complex applications such as local area networks, wireless local area networks, wireless personal area networks, wireless metropolitan area networks, and wide area networks. Network traffic classification is also an essential component of products for QoS control in different wireless network systems and applications. Classifying network traffic in a WLAN allows us to see what kinds of traffic we have in each part of the network, to organize the various kinds of traffic on each path into different classes, and to generate a network traffic matrix in order to identify and organize network traffic, which is an important key to improving QoS. To achieve effective network traffic classification, a Real-time Network Traffic Classification (RNTC) algorithm for WLANs based on Compressed Sensing (CS) is presented in this paper. The fundamental goal of this algorithm is to solve difficult wireless network management problems. The proposed architecture reduces the False Detection Rate (FDR) to 25% and the Packet Delay (PD) to 15%. It also increases the accuracy of wireless transmission by 10%, which provides a good basis for establishing high-quality wireless local area networks.

  15. Hazard classification and auditable safety analysis for the 1300-N Emergency Dump Basin

    International Nuclear Information System (INIS)

    Kretzschmar, S.P.; Larson, A.R.

    1996-06-01

    This document combines the following four analytical functions: (1) a hazards baseline of the Emergency Dump Basin (EDB) in the quiescent state; (2) a preliminary hazard classification for intrusive activities (i.e., basin stabilization); (3) a final hazard classification for intrusive activities; and (4) an auditable safety analysis. This document describes the potential hazards contained within the EDB at the N Reactor complex and the vulnerabilities of those hazards during the quiescent state (when only surveillance and maintenance activities take place) and during basin stabilization activities. This document also identifies the inventory of both radioactive and hazardous material in the EDB. The result is that the final hazard classification for the EDB segment intrusive activities is Radiological.

  16. BOREAS TE-18 Landsat TM Physical Classification Image of the NSA

    Science.gov (United States)

    Hall, Forrest G. (Editor); Knapp, David

    2000-01-01

    The BOREAS TE-18 team focused its efforts on using remotely sensed data to characterize the successional and disturbance dynamics of the boreal forest for use in carbon modeling. The objective of this classification is to provide the BOREAS investigators with a data product that characterizes the land cover of the NSA. A Landsat-5 TM image from 21-Jun-1995 was used to derive the classification. A technique was implemented that uses reflectances of various land cover types along with a geometric optical canopy model to produce spectral trajectories. These trajectories are used in a way that is similar to training data to classify the image into the different land cover classes. The data are provided in a binary, image file format. The data files are available on a CD-ROM (see document number 20010000884), or from the Oak Ridge National Laboratory (ORNL) Distributed Active Archive Center (DAAC).

  17. An implementation of support vector machine on sentiment classification of movie reviews

    Science.gov (United States)

    Yulietha, I. M.; Faraby, S. A.; Adiwijaya; Widyaningtyas, W. C.

    2018-03-01

    With technological advances, all information about movies is available on the internet. If this information is processed properly, quality information can be obtained from it. This research proposes to classify sentiments in movie review documents. It uses the Support Vector Machine (SVM) method because SVM can handle the kind of high-dimensional data used in this research, namely text. The Support Vector Machine is a popular machine learning technique for text classification because it can learn from a collection of previously classified documents and provides good results. Among the dataset splits, the 90-10 composition gives the best result, 85.6%. Among SVM kernels, the linear kernel with constant 1 gives the best result, 84.9%.
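
    A minimal sketch matching the setup reported above: a linear-kernel SVM (C=1) on movie review text, evaluated with a 90-10 train-test split. The reviews shown are placeholders.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

reviews = ["great acting and story", "dull and far too long"] * 50
labels = [1, 0] * 50

X_train, X_test, y_train, y_test = train_test_split(
    reviews, labels, test_size=0.1, random_state=0)      # the 90-10 composition

clf = make_pipeline(TfidfVectorizer(), SVC(kernel="linear", C=1))
clf.fit(X_train, y_train)
print("accuracy:", clf.score(X_test, y_test))
```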

  18. Authentication of bee pollen grains in bright-field microscopy by combining one-class classification techniques and image processing.

    Science.gov (United States)

    Chica, Manuel

    2012-11-01

    A novel method for authenticating pollen grains in bright-field microscopic images is presented in this work. The usage of this new method is clear in many application fields, such as the bee-keeping sector, where laboratory experts need to identify fraudulent bee pollen samples against known local pollen types. Our system is based on image processing and one-class classification to reject unknown pollen grain objects. The latter classification technique allows us to tackle the major difficulty of the problem: the existence of many possible fraudulent pollen types and the impossibility of modeling all of them. Different one-class classification paradigms are compared to find the most suitable technique for solving the problem. In addition, feature selection algorithms are applied to reduce the complexity and increase the accuracy of the models. For each local pollen type, a one-class classifier is trained and aggregated into a multiclassifier model. This multiclassification scheme combines the outputs of all the one-class classifiers into a unique final response. The proposed method is validated by authenticating pollen grains belonging to different Spanish bee pollen types. The overall accuracy of the system in classifying fraudulent microscopic pollen grain objects is 92.3%. The system is able to rapidly reject pollen grains that belong to nonlocal pollen types, reducing laboratory work and effort. The number of possible applications of this authentication method in the microscopy research field is unlimited. Copyright © 2012 Wiley Periodicals, Inc.
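
    A sketch of the one-class rejection scheme described above: one classifier per local pollen type, trained only on that type's feature vectors; a grain is flagged as possibly fraudulent when every classifier rejects it. The feature vectors, which would come from the image-processing stage, are placeholders here, and the one-class SVM is just one of the paradigms the paper compares.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
local_types = {"type_A": rng.normal(0, 1, (50, 8)),
               "type_B": rng.normal(3, 1, (50, 8))}      # per-type training features

models = {name: OneClassSVM(nu=0.05, gamma="scale").fit(X)
          for name, X in local_types.items()}

def authenticate(grain_features):
    votes = {name: m.predict([grain_features])[0] for name, m in models.items()}
    accepted = [name for name, v in votes.items() if v == 1]   # +1 = inlier
    return accepted[0] if accepted else "reject (possible fraud)"

print(authenticate(rng.normal(0, 1, 8)))
```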

  19. Cellular image classification

    CERN Document Server

    Xu, Xiang; Lin, Feng

    2017-01-01

    This book introduces new techniques for cellular image feature extraction, pattern recognition and classification. The authors use the antinuclear antibodies (ANAs) in patient serum as the subjects and the Indirect Immunofluorescence (IIF) technique as the imaging protocol to illustrate the applications of the described methods. Throughout the book, the authors provide evaluations for the proposed methods on two publicly available human epithelial (HEp-2) cell datasets: ICPR2012 dataset from the ICPR'12 HEp-2 cell classification contest and ICIP2013 training dataset from the ICIP'13 Competition on cells classification by fluorescent image analysis. First, the reading of imaging results is significantly influenced by one’s qualification and reading systems, causing high intra- and inter-laboratory variance. The authors present a low-order LP21 fiber mode for optical single cell manipulation and imaging staining patterns of HEp-2 cells. A focused four-lobed mode distribution is stable and effective in optical...

  20. Archival classification: new usage scenarios among semantic web and traditio of digital samples

    Directory of Open Access Journals (Sweden)

    Alessandro Alfier

    2017-05-01

    Full Text Available Starting from the acknowledgement of the basic purpose assigned by tradition to classification within document management, the article addresses issues related to new needs and usage scenarios arising from the digital environment, which would allow classification to carry its tradition of effectiveness into a new digital setting. The key point of the article is an in-depth analysis of the possible synergies between classification-related activities and the International Standard for Describing Functions (ISDF), developed by the ICA in 2007. The article highlights how approaching classification from the ISDF perspective allows classification to be enriched with purposes and usages related to the semantic web, and with the traditio of digital documents.

  1. Objective Classification of Rainfall in Northern Europe for Online Operation of Urban Water Systems Based on Clustering Techniques

    DEFF Research Database (Denmark)

    Löwe, Roland; Madsen, Henrik; McSharry, Patrick

    2016-01-01

    operators to change modes of control of their facilities. A k-means clustering technique was applied to group events retrospectively and was able to distinguish events with clearly different temporal and spatial correlation properties. For online applications, techniques based on k-means clustering...... and quadratic discriminant analysis both provided a fast and reliable identification of rain events of "high" variability, while the k-means provided the smallest number of rain events falsely identified as being of "high" variability (false hits). A simple classification method based on a threshold...
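
    A sketch of the offline step described above: summarize each rain event by a few statistics and group the events with k-means. The two features used here are stand-ins for the temporal and spatial correlation properties mentioned in the abstract.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# one row per event: [temporal correlation length, spatial correlation length]
events = np.vstack([rng.normal([0.8, 0.9], 0.05, (30, 2)),    # smooth events
                    rng.normal([0.3, 0.4], 0.05, (30, 2))])   # "high" variability

km = KMeans(n_clusters=2, n_init=10, random_state=0)
km.fit(StandardScaler().fit_transform(events))
print(km.labels_[:5], km.labels_[-5:])   # cluster ids separate the two regimes
```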

  2. Stamp Detection in Color Document Images

    DEFF Research Database (Denmark)

    Micenkova, Barbora; van Beusekom, Joost

    2011-01-01

    , moreover, it can be imprinted with a variable quality and rotation. Previous methods were restricted to detection of stamps of particular shapes or colors. The method presented in the paper includes segmentation of the image by color clustering and subsequent classification of candidate solutions...... by geometrical and color-related features. The approach allows for differentiation of stamps from other color objects in the document such as logos or texts. For the purpose of evaluation, a data set of 400 document images has been collected, annotated and made public. With the proposed method, recall of 83...

  3. Creation of structured documentation templates using Natural Language Processing techniques.

    Science.gov (United States)

    Kashyap, Vipul; Turchin, Alexander; Morin, Laura; Chang, Frank; Li, Qi; Hongsermeier, Tonya

    2006-01-01

    Structured Clinical Documentation is a fundamental component of the healthcare enterprise, linking both clinical (e.g., electronic health record, clinical decision support) and administrative functions (e.g., evaluation and management coding, billing). One of the challenges in creating good quality documentation templates has been the inability to address specialized clinical disciplines and adapt to local clinical practices. A one-size-fits-all approach leads to poor adoption and inefficiencies in the documentation process. On the other hand, the cost associated with manual generation of documentation templates is significant. Consequently there is a need for at least partial automation of the template generation process. We propose an approach and methodology for the creation of structured documentation templates for diabetes using Natural Language Processing (NLP).

  4. Calibration of a Plastic Classification System with the Ccw Model

    International Nuclear Information System (INIS)

    Barcala Riveira, J. M.; Fernandez Marron, J. L.; Alberdi Primicia, J.; Navarrete Marin, J. J.; Oller Gonzalez, J. C.

    2003-01-01

    This document describes the calibration of a plastic classification system with the Ccw model (Classification by Quantum's built with Wavelet Coefficients). The method is applied to spectra of plastics usually present in domestic wastes. The obtained results are shown. (Author) 16 refs

  5. Data classification using metaheuristic Cuckoo Search technique for Levenberg Marquardt back propagation (CSLM) algorithm

    Science.gov (United States)

    Nawi, Nazri Mohd.; Khan, Abdullah; Rehman, M. Z.

    2015-05-01

    Nature-inspired metaheuristic techniques provide derivative-free solutions to complex problems. One of the latest additions to this group of nature-inspired optimization procedures is the Cuckoo Search (CS) algorithm. Artificial Neural Network (ANN) training is an optimization task, since it is desired to find the optimal weight set of a neural network during the training process. Traditional training algorithms have some limitations, such as getting trapped in local minima and a slow convergence rate. This study proposes a new technique, CSLM, which combines the best features of two known algorithms, back-propagation (BP) and the Levenberg-Marquardt (LM) algorithm, to improve the convergence speed of ANN training and avoid the local minima problem. Some selected benchmark classification datasets are used for simulation. The experimental results show that the proposed Cuckoo Search with Levenberg-Marquardt algorithm performs better than the other algorithms used in this study.

  6. A Semisupervised Cascade Classification Algorithm

    Directory of Open Access Journals (Sweden)

    Stamatis Karlos

    2016-01-01

    Full Text Available Classification is one of the most important tasks of data mining techniques, which have been adopted by several modern applications. The shortage of labeled data in the majority of these applications has shifted the interest towards semisupervised methods. Under such schemes, the use of collected unlabeled data combined with a clearly smaller set of labeled examples leads to similar or even better classification accuracy compared with supervised algorithms, which use labeled examples exclusively during the training phase. A novel approach for improving semisupervised classification using the Cascade Classifier technique is presented in this paper. The main characteristic of the Cascade Classifier strategy is the use of a base classifier to enlarge the feature space by adding either the predicted class or the probability class distribution of the initial data. The classifier of the second level is supplied with the new dataset and extracts the decision for each instance. In this work, a self-trained NB∇C4.5 classifier algorithm is presented, which combines the characteristics of Naive Bayes as a base classifier and the speed of C4.5 for the final classification. We performed an in-depth comparison with other well-known semisupervised classification methods on standard benchmark datasets and found that the presented technique has better accuracy in most cases.
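
    A sketch of the cascade idea described above: a base Naive Bayes model augments the feature space with its predicted class probabilities, and a second-level tree makes the final decision. A decision tree with the entropy criterion stands in for C4.5, which scikit-learn does not ship, and the data are placeholders; the self-training loop over unlabeled data is omitted.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

nb = GaussianNB().fit(X, y)                         # level-1 base classifier
X_aug = np.hstack([X, nb.predict_proba(X)])         # append class distribution

tree = DecisionTreeClassifier(criterion="entropy").fit(X_aug, y)   # level-2
x_new = rng.normal(size=(1, 6))
print(tree.predict(np.hstack([x_new, nb.predict_proba(x_new)])))
```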

  7. Technique for information retrieval using enhanced latent semantic analysis generating rank approximation matrix by factorizing the weighted morpheme-by-document matrix

    Science.gov (United States)

    Chew, Peter A; Bader, Brett W

    2012-10-16

    A technique for information retrieval includes parsing a corpus to identify a number of wordform instances within each document of the corpus. A weighted morpheme-by-document matrix is generated based at least in part on the number of wordform instances within each document of the corpus and based at least in part on a weighting function. The weighted morpheme-by-document matrix separately enumerates instances of stems and affixes. Additionally or alternatively, a term-by-term alignment matrix may be generated based at least in part on the number of wordform instances within each document of the corpus. At least one lower rank approximation matrix is generated by factorizing the weighted morpheme-by-document matrix and/or the term-by-term alignment matrix.
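
    A generic latent-semantic-analysis sketch in the spirit of this claim: weight a term-by-document matrix and factorize it into a low-rank approximation used for retrieval. This sketch works at the word level rather than the patent's morpheme level (stems and affixes), and the corpus is a placeholder.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = ["oil spill classification", "document image retrieval",
        "retrieval of spill reports"]

tfidf = TfidfVectorizer()                 # the weighting function
X = tfidf.fit_transform(docs)             # weighted term-by-document matrix (transposed)

svd = TruncatedSVD(n_components=2, random_state=0)   # low-rank factorization
Z = svd.fit_transform(X)                  # documents in the latent space

q = svd.transform(tfidf.transform(["spill retrieval"]))
print(cosine_similarity(q, Z).argmax())   # index of the best-matching document
```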

  8. Stellar Spectral Classification with Locality Preserving Projections ...

    Indian Academy of Sciences (India)

    With the help of computer tools and algorithms, automatic stellar spectral classification has become an area of current interest. The process of stellar spectral classification mainly includes two steps: dimension reduction and classification. As a popular dimensionality reduction technique, Principal Component Analysis (PCA) ...

  9. STANDARDIZATION OF MEDICAL DOCUMENT FLOW: PRINCIPLES AND FEATURES

    Directory of Open Access Journals (Sweden)

    Melentev Vladimir Anatolevich

    2013-04-01

    Full Text Available In presented article the questions connected with the general concepts and bases of functioning of document flow in borders of any economic object (the enterprise, establishment, the organization are considered. Gostirovanny definition of document flow, classification of types of documentary streams is given. The basic principles of creation of document flow, following which are considered allows to create optimum structure документопотока and nature of movement of documents; interrelation of external and internal influences. Further basic elements of medical document flow are considered; the main problems of medical document flow being, besides, major factors, distinguishing medical document flow from document flow of manufacturing enterprises or other economic objects are specified. From consideration of these problems the conclusion about an initial stage of their decision - standardization of the medical document flow, being, besides, is drawn by the first stage of creation of a common information space of medical branch.

  10. [Molecular classification of breast cancer patients obtained through the technique of chromogenic in situ hybridization (CISH)].

    Science.gov (United States)

    Fernández, Angel; Reigosa, Aldo

    2013-12-01

    Breast cancer is a heterogeneous disease composed of a growing number of biological subtypes, with substantial variability of the disease progression within each category. The aim of this research was to classify the samples under study into the molecular classes of breast cancer: luminal A, luminal B, HER2 and triple negative, based on the HER2 amplification status obtained by the technique of chromogenic in situ hybridization (CISH). The sample consisted of 200 biopsies fixed in 10% formalin, processed by standard techniques up to paraffin embedding, corresponding to patients diagnosed with invasive ductal carcinoma of the breast. These biopsies were obtained from patients from private practice and the Institute of Oncology "Dr. Miguel Pérez Carreño", with immunohistochemistry (IHC) of hormone receptors and HER2 performed in the Hospital Metropolitano del Norte, Valencia, Venezuela. The molecular classification of the patients' tumors, considering the expression of estrogen and progesterone receptors by IHC and HER2 amplification by CISH, allowed those cases originally classified as unknown, since they had an indeterminate (2+) result for HER2 expression by IHC, to be grouped into the different molecular classes. Also, this classification meant that some cases initially considered as belonging to one molecular class were assigned to another class after the revaluation of the HER2 status by CISH.

  11. The Classification of Corruption in Indonesia: A Behavioral Perspective

    Directory of Open Access Journals (Sweden)

    Hamdani Rizki

    2017-01-01

    Full Text Available This research aims to investigate and identify the pattern and classification of corruptors in Indonesia, especially imprisoned state officials. The research used the qualitative method. The data were collected through documentation and interviews, and the data sources were chosen by a purposive sampling technique. The researchers conducted in-depth interviews with 9 imprisoned suspects of corruption cases. The results show that the classification of corruptors in Indonesia includes all types of corruption constructed by the Association of Certified Fraud Examiners (ACFE), namely: conflict of interest, bribery, illegal gratuities, and economic extortion. Based on the interviews, the interviewees performed different patterns of corruption: some suspects performed more than one type of corruption, some performed a single corruption of the same type, and some performed a single corruption of a different type. In Indonesia, it is not only people in the executive, legislative, and judiciary branches who can perform corruption, but also people in the private sector.

  12. Sentiment Classification of Documents in Serbian: The Effects of Morphological Normalization and Word Embeddings

    Directory of Open Access Journals (Sweden)

    V. Batanović

    2017-11-01

    Full Text Available An open issue in the sentiment classification of texts written in Serbian is the effect of different forms of morphological normalization and the usefulness of leveraging large amounts of unlabeled texts. In this paper, we assess the impact of lemmatizers and stemmers for Serbian on classifiers trained and evaluated on the Serbian Movie Review Dataset. We also consider the effectiveness of using word embeddings, generated from a large unlabeled corpus, as classification features.
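
    As a rough illustration of how such a comparison can be set up, the sketch below wires a pluggable normalizer into a bag-of-words sentiment classifier; `stem_sr` is a stand-in for whichever Serbian stemmer or lemmatizer is being assessed, and the reviews are toy data, not the Serbian Movie Review Dataset.

```python
# Sketch: measuring the effect of a morphological normalizer on sentiment
# classification. Swap `stem_sr` for each normalization scheme under test.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def stem_sr(token):
    # Placeholder: substitute a real Serbian stemmer or lemmatizer here.
    return token.lower()

def analyzer(text):
    return [stem_sr(tok) for tok in text.split()]

# Toy labeled reviews (1 = positive, 0 = negative).
texts = ["odličan film, topla preporuka", "loš i dosadan film",
         "sjajna gluma", "slaba i dosadna priča"]
labels = [1, 0, 1, 0]

pipe = make_pipeline(TfidfVectorizer(analyzer=analyzer), LogisticRegression())
pipe.fit(texts, labels)
print(pipe.predict(["film je odličan"]))
# With a realistic corpus, cross-validated scores for different `stem_sr`
# variants quantify the impact of each normalization scheme.
```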

  13. Supervised Cross-Modal Factor Analysis for Multiple Modal Data Classification

    KAUST Repository

    Wang, Jingbin

    2015-10-09

    In this paper we study the problem of learning from multi-modal data for the purpose of document classification. In this problem, each document is composed of two different modalities of data, i.e., an image and a text. Cross-modal factor analysis (CFA) has been proposed to project the two different modalities of data to a shared data space, so that the classification of an image or a text can be performed directly in this space. A disadvantage of CFA is that it ignores the supervision information. In this paper, we improve CFA by incorporating the supervision information to represent and classify both the image and text modalities of documents. We project both image and text data to a shared data space by factor analysis, and then train a class label predictor in the shared space to use the class label information. The factor analysis parameter and the predictor parameter are learned jointly by solving one single objective function. With this objective function, we minimize the distance between the projections of the image and text of the same document, and the classification error of the projection, measured by a hinge loss function. The objective function is optimized by an alternating optimization strategy in an iterative algorithm. Experiments on two different multi-modal document data sets show the advantage of the proposed algorithm over other CFA methods.
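
    The joint objective can be summarized in a short numeric sketch. The notation below (X and T for image and text features, Wx and Wt for the projections, w and b for the shared-space predictor) is an assumption made for illustration, not the authors' implementation.

```python
# Sketch of a joint CFA-style objective: a matching term that keeps the two
# modalities close in the shared space, plus a hinge-loss classification term.
import numpy as np

rng = np.random.default_rng(0)
n, dx, dt, k = 8, 5, 7, 3
X, T = rng.normal(size=(n, dx)), rng.normal(size=(n, dt))  # image / text features
y = rng.choice([-1.0, 1.0], size=n)                        # class labels
Wx, Wt = rng.normal(size=(dx, k)), rng.normal(size=(dt, k))
w, b = rng.normal(size=k), 0.0

def joint_objective(Wx, Wt, w, b, lam=1.0):
    Px, Pt = X @ Wx, T @ Wt                          # projections of both modalities
    match = np.sum((Px - Pt) ** 2)                   # keep document pairs close
    scores = 0.5 * (Px + Pt) @ w + b                 # predict from the shared space
    hinge = np.maximum(0.0, 1.0 - y * scores).sum()  # classification error term
    return match + lam * hinge

print(joint_objective(Wx, Wt, w, b))
# An alternating scheme would fix (w, b) to update Wx and Wt, then vice versa,
# repeating until the objective stops decreasing.
```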

  14. I - Multivariate Classification and Machine Learning in HEP

    CERN Multimedia

    CERN. Geneva

    2016-01-01

    Traditional multivariate methods for classification (Stochastic Gradient Boosted Decision Trees and Multi-Layer Perceptrons) are explained in theory and practice using examples from HEP. General aspects of multivariate classification are discussed, in particular different regularisation techniques. Afterwards, data-driven techniques are introduced and compared to MC-based methods.

  15. State-of-the-art techniques for inventory of Great Lakes aquatic habitats and resources

    Science.gov (United States)

    Edsall, Thomas A.; Brock, R.H.; Bukata, R.P.; Dawson, J.J.; Horvath, F.J.; Busch, W.-Dieter N.; Sly, Peter G.

    1992-01-01

    This section of the Classification and Inventory of Great Lakes Aquatic Habitat report was prepared as a series of individually authored contributions that describe, in various levels of detail, state-of-the-art techniques that can be used alone or in combination to inventory aquatic habitats and resources in the Laurentian Great Lakes system. No attempt was made to review and evaluate techniques that are used routinely in limnological and fisheries surveys and inventories because it was felt that users of this document would be familiar with them.

  16. Document flow segmentation for business applications

    Science.gov (United States)

    Daher, Hani; Belaïd, Abdel

    2013-12-01

    The aim of this paper is to propose a supervised document flow segmentation approach applied to real-world heterogeneous documents. Our algorithm treats the flow of documents as couples of consecutive pages and studies the relationship that exists between them. First, sets of features are extracted from the pages, and we propose an approach to model the couple of pages as a single feature vector representation. This representation is provided to a binary classifier which classifies the relationship as either segmentation or continuity. In the case of segmentation, we consider that we have a complete document and the analysis of the flow continues by starting a new document. In the case of continuity, the couple of pages is assigned to the same document and the analysis continues on the flow. If there is uncertainty on whether the relationship between the couple of pages should be classified as continuity or segmentation, a rejection is decided and the pages analyzed up to this point are considered as a "fragment". The first classification already provides good results, approaching 90% on certain documents, which is high at this level of the system.
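
    A minimal sketch of the decision stage described above, assuming a generic `page_features` descriptor and toy data; the uncertainty band that triggers the "fragment" rejection is an illustrative threshold choice, not the paper's calibrated one.

```python
# Sketch: score each couple of consecutive pages with a binary classifier,
# rejecting uncertain couples as fragments.
import numpy as np
from sklearn.linear_model import LogisticRegression

def page_features(page):
    return np.asarray(page, dtype=float)  # placeholder page descriptor

def couple_vector(p1, p2):
    # Model the couple of pages as a single feature vector.
    f1, f2 = page_features(p1), page_features(p2)
    return np.concatenate([f1, f2, np.abs(f1 - f2)])

# Toy training data: label 1 = segmentation (a new document starts), 0 = continuity.
pairs = [([0.1, 0.9], [0.2, 0.8]), ([0.9, 0.1], [0.1, 0.95])]
X = np.array([couple_vector(a, b) for a, b in pairs])
y = np.array([0, 1])
clf = LogisticRegression().fit(X, y)

def decide(p1, p2, low=0.4, high=0.6):
    p = clf.predict_proba([couple_vector(p1, p2)])[0, 1]
    if low < p < high:
        return "fragment"  # uncertain relationship: reject
    return "segmentation" if p >= high else "continuity"

print(decide([0.85, 0.15], [0.1, 0.9]))
```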

  17. The Finnish background report for the EC documentation of best available techniques for tanning industry

    Energy Technology Data Exchange (ETDEWEB)

    Kustula, V.; Salo, H.; Witick, A.; Kaunismaa, P.

    2000-08-01

    The objective of this document is to identify best available techniques (BAT) for the reduction of emissions and energy use in the tanning industry in Finland. The leather tanning industry in Finland has long traditions, dating back for centuries, but today there are only nine tanneries of any importance left. The tanneries vary in size from small, family-owned ones to large on a Finnish scale, with a staff of about 70 persons. The production of finished leather in even the largest tannery in Finland is well below the production limit (12 tonnes finished leather a day) mentioned in the IPPC-directive (96/61/EC). The range of products manufactured by the Finnish leather industry is large and includes processed leather for, e.g. footwear, clothing, furniture and intermediate products, e.g. wet-blue and crust. The hides and skins of cows, lamb, elk, reindeer and occasionally skins from other animals, e.g. horses are the main raw materials used. Some of the tanneries carry out only a part of the processing and sell their products in a treated, but not finished state. Tanneries which undertake only a part of the preparation process are reviewed in this document as well. Because of the varying size of the Finnish tanneries, the quality and quantity of emissions and environmental impacts vary considerably. The parameters used by the authorities are suspended solids (SS), biological oxygen demand (BOD), and chromium and sulphide concentration in the effluents. The quantity of waste and waste water generated is also subject to assessment by the authorities. Direct regulation of emissions is practised in Finland by issuing permits containing emission limit values. Major waste streams generated are sludge from waste water treatment plants, animal residues from the beamhouse stage, residues from tanned leather and chemicals used. Various ways of residue separation and consequently reuse and recovery are practised in most Finnish tanneries. The proteinaceous residues from the

  18. Prediction of lung cancer patient survival via supervised machine learning classification techniques.

    Science.gov (United States)

    Lynch, Chip M; Abdollahi, Behnaz; Fuqua, Joshua D; de Carlo, Alexandra R; Bartholomai, James A; Balgemann, Rayeanne N; van Berkel, Victor H; Frieboes, Hermann B

    2017-12-01

    Outcomes for cancer patients have previously been estimated by applying various machine learning techniques to large datasets such as the Surveillance, Epidemiology, and End Results (SEER) program database. In particular for lung cancer, it is not well understood which types of techniques would yield more predictive information, and which data attributes should be used in order to determine this information. In this study, a number of supervised learning techniques are applied to the SEER database to classify lung cancer patients in terms of survival, including linear regression, Decision Trees, Gradient Boosting Machines (GBM), Support Vector Machines (SVM), and a custom ensemble. Key data attributes in applying these methods include tumor grade, tumor size, gender, age, stage, and number of primaries, with the goal of enabling comparison of predictive power between the various methods. The prediction is treated as a continuous target, rather than a classification into categories, as a first step towards improving survival prediction. The results show that the predicted values agree with actual values for low to moderate survival times, which constitute the majority of the data. The best performing technique was the custom ensemble with a Root Mean Square Error (RMSE) value of 15.05. The most influential model within the custom ensemble was GBM, while Decision Trees may be inapplicable as they had too few discrete outputs. The results further show that among the five individual models generated, the most accurate was GBM with an RMSE value of 15.32. Although SVM underperformed with an RMSE value of 15.82, statistical analysis singles out the SVM as the only model that generated a distinctive output. The results of the models are consistent with a classical Cox proportional hazards model used as a reference technique. We conclude that application of these supervised learning techniques to lung cancer data in the SEER database may be of use to estimate patient survival time.
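
    A minimal sketch of the survival-as-regression setup: a gradient boosting model predicts survival months from SEER-style attributes and is scored by RMSE. The column semantics and synthetic data below are assumptions for illustration, not the actual SEER schema.

```python
# Sketch: treat survival time as a continuous regression target and score by RMSE.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 500
X = np.column_stack([
    rng.integers(1, 5, n),    # tumor grade (toy)
    rng.uniform(5, 90, n),    # tumor size, mm (toy)
    rng.integers(0, 2, n),    # gender
    rng.integers(30, 90, n),  # age
    rng.integers(1, 5, n),    # stage
    rng.integers(1, 4, n),    # number of primaries
])
# Synthetic survival months, loosely driven by stage and tumor size.
y = 60 - 10 * X[:, 4] - 0.2 * X[:, 1] + rng.normal(0, 5, n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
rmse = mean_squared_error(y_te, model.predict(X_te)) ** 0.5
print(f"RMSE: {rmse:.2f}")
```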

  19. Natural Language Processing Based Instrument for Classification of Free Text Medical Records

    Directory of Open Access Journals (Sweden)

    Manana Khachidze

    2016-01-01

    Full Text Available According to the Ministry of Labor, Health and Social Affairs of Georgia, a new health management system has to be introduced in the near future. In this context arises the problem of structuring and classifying documents containing the complete history of medical services provided. The present work introduces an instrument for the classification of medical records based on the Georgian language. It is the first attempt at such classification of Georgian-language medical records. In total, 24,855 examination records were studied. The documents were classified into three main groups (ultrasonography, endoscopy, and X-ray) and 13 subgroups using two well-known methods: Support Vector Machine (SVM) and K-Nearest Neighbor (KNN). The results obtained demonstrated that both machine learning methods performed successfully, with SVM performing slightly better. In the process of classification a "shrink" method, based on feature selection, was introduced and applied. At the first stage of classification the results of the "shrink" case were better; however, at the second stage of classification into subclasses, 23% of all documents could not be linked to only one definite individual subclass (liver or biliary system) due to common features characterizing these subclasses. The overall results of the study were successful.
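
    As a rough sketch of the first-level grouping, the snippet below feeds TF-IDF features to both an SVM and a KNN, mirroring the two methods compared in the study; the records and labels are toy stand-ins, not the Georgian corpus.

```python
# Sketch: classify examination records into main groups with SVM and KNN.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

records = [
    "echo signs of liver steatosis",      # ultrasonography
    "gastric mucosa hyperemia observed",  # endoscopy
    "no focal opacity in both lungs",     # X-ray
    "common bile duct not dilated",       # ultrasonography
]
groups = ["US", "endoscopy", "xray", "US"]

svm = make_pipeline(TfidfVectorizer(), LinearSVC()).fit(records, groups)
knn = make_pipeline(TfidfVectorizer(), KNeighborsClassifier(n_neighbors=1)).fit(records, groups)
query = ["liver parenchyma echogenicity increased"]
print(svm.predict(query), knn.predict(query))
```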

  20. The Standard of Management and Application of Cultural Heritage Documentation

    Directory of Open Access Journals (Sweden)

    Yen Ya Ning

    2011-12-01

    Full Text Available Using digital technology for cultural heritage documentation is a global trend in the 21st century. Many important techniques are currently under development, including 3D digital imaging, reverse engineering and GIS (Geographic Information Systems). However, no system for overall management or data integration is yet available. Therefore, we urgently need such a system to efficiently manage and interpret data for the preservation of cultural heritage. This paper presents a digitizing process developed in Taiwan by the authors. To govern and manage cultural property, three phases of property conservation, namely registration, restoration and management, have been set up along a timeline. In accordance with the laws on cultural property, a structural system has been built for project management, including data classification and data interpretation with self-documenting characteristics. Through repository information and metadata, a system catalogue (also called a data dictionary) was created. The primary objective of the study is to create an integrated technology for efficient management of databases. Several benefits can be obtained from this structural standard: (1) cultural heritage management documentation can be centralized to minimize the possibility of data re-entry resulting in inconsistency, and to facilitate simultaneous updating of data; (2) since multiple data can be simultaneously retrieved and saved in real time, the incidence of errors can be reduced; (3) the system can be easily tailored to meet the administrative requirements for the standardization of documentation exchanged between cultural property institutions and various county and city governments.

  1. Base Oils Biodegradability Prediction with Data Mining Techniques

    Directory of Open Access Journals (Sweden)

    Malika Trabelsi

    2010-02-01

    Full Text Available In this paper, we apply various data mining techniques, including continuous numeric and discrete classification prediction models of base oil biodegradability, with emphasis on improving prediction accuracy. The results show that highly biodegradable oils can be better predicted through numeric models. In contrast, classification models did not uncover a similar dichotomy. With the exception of Memory Based Reasoning and Decision Trees, the tested classification techniques achieved high classification accuracy. However, the Decision Trees technique helped uncover the most significant predictors. A simple classification rule derived from the top predictor resulted in good classification accuracy. The application of this rule enables efficient classification of base oils into either low or high biodegradability classes with high accuracy. For the latter, a higher precision biodegradability prediction can be obtained using continuous modeling techniques.
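
    The single-predictor rule extraction can be illustrated with a depth-1 decision tree (a stump), whose learned split is exactly the threshold of such a rule; the predictor name and data below are hypothetical.

```python
# Sketch: derive a one-predictor threshold rule by fitting a decision stump.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(2)
viscosity = rng.uniform(10, 100, 200)      # hypothetical predictor
labels = (viscosity < 45).astype(int)      # 1 = highly biodegradable (toy rule)
X = viscosity.reshape(-1, 1)

stump = DecisionTreeClassifier(max_depth=1).fit(X, labels)
threshold = stump.tree_.threshold[0]       # the split learned at the root node
print(f"rule: highly biodegradable if predictor <= {threshold:.1f}")
```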

  2. Web Approach for Ontology-Based Classification, Integration, and Interdisciplinary Usage of Geoscience Metadata

    Directory of Open Access Journals (Sweden)

    B Ritschel

    2012-10-01

    Full Text Available The Semantic Web is a W3C approach that integrates the different sources of semantics within documents and services using ontology-based techniques. The main objective of this approach in the geoscience domain is the improvement of understanding, integration, and usage of Earth and space science related web content in terms of data, information, and knowledge for machines and people. The modeling and representation of semantic attributes and relations within and among documents can be realized by human-readable concept maps and machine-readable OWL documents. The objectives for the usage of the Semantic Web approach in the GFZ data center ISDC project are the design of an extended classification of metadata documents for product types related to instruments, platforms, and projects, as well as the integration of different types of metadata related to data product providers, users, and data centers. Sources of content and semantics for the description of Earth and space science product types and related classes are standardized metadata documents (e.g., DIF documents), publications, grey literature, and Web pages. Other sources are information provided by users, such as tagging data and social navigation information. The integration of controlled vocabularies as well as folksonomies plays an important role in the design of well-formed ontologies.

  3. Titulus Scuola: the new file classification schema for Italian schools

    Directory of Open Access Journals (Sweden)

    Gianni Penzo Doria

    2017-05-01

    Full Text Available This article presents the new national file classification schema for Italian schools, produced by the Italian Directorate General of Archives of the Ministry for Cultural Heritage within the project Titulus Scuola. This classification schema represents the starting point for a standard document management system aimed at digital administration.

  4. Land-cover classification with an expert classification algorithm using digital aerial photographs

    Directory of Open Access Journals (Sweden)

    José L. de la Cruz

    2010-05-01

    Full Text Available The purpose of this study was to evaluate the usefulness of the spectral information of digital aerial sensors in determining land-cover classification using new digital techniques. The land covers evaluated are the following: (1) bare soil; (2) cereals, including maize (Zea mays L.), oats (Avena sativa L.), rye (Secale cereale L.), wheat (Triticum aestivum L.) and barley (Hordeum vulgare L.); (3) high-protein crops, such as peas (Pisum sativum L.) and beans (Vicia faba L.); (4) alfalfa (Medicago sativa L.); (5) woodlands and scrublands, including holly oak (Quercus ilex L.) and common retama (Retama sphaerocarpa L.); (6) urban soil; (7) olive groves (Olea europaea L.); and (8) burnt crop stubble. The best result was obtained using an expert classification algorithm, achieving a reliability rate of 95%. This result shows that the images of digital airborne sensors hold considerable promise for the future of digital classification, because these images contain valuable information that takes advantage of the geometric viewpoint. Moreover, new classification techniques reduce the problems encountered when using high-resolution images, while achieving reliabilities better than those achieved with traditional methods.

  5. Classification of breast tumour using electrical impedance and machine learning techniques.

    Science.gov (United States)

    Al Amin, Abdullah; Parvin, Shahnaj; Kadir, M A; Tahmid, Tasmia; Alam, S Kaisar; Siddique-e Rabbani, K

    2014-06-01

    When a breast lump is detected through palpation, mammography or ultrasonography, the final test for characterization of the tumour, whether it is malignant or benign, is biopsy. This is invasive and carries the hazards associated with any surgical procedure. The present work was undertaken to study the feasibility of such characterization using non-invasive electrical impedance measurements and machine learning techniques. Because of changes in cell morphology of malignant and benign tumours, changes are expected in impedance at a fixed frequency and in its variation with the frequency of measurement. Tetrapolar impedance measurement (TPIM) using four electrodes at the corners of a square region of sides 4 cm was used for zone localization. Data of impedance in two orthogonal directions, measured at 5 and 200 kHz from 19 subjects, and their respective slopes with frequency were subjected to machine learning procedures through the use of feature plots. These patients had single or multiple tumours of various types in one or both breasts, and four of them had malignant tumours, as diagnosed by core biopsy. Although the size and depth of the tumours are expected to affect the measurements, this preliminary work ignored these effects. Selecting 12 features from the above measurements, feature plots were drawn for the 19 patients, which displayed considerable overlap between malignant and benign cases. However, based on the observed qualitative trend of the measured values, when all the feature values were divided by the respective ages, the two types of tumours separated out reasonably well. Using the K-NN classification method, the results obtained are: positive predictive value 60%, negative predictive value 93%, sensitivity 75%, specificity 87% and efficacy 84%, which are very good for such a test on a small sample size. A study on a larger sample is expected to give confidence in this technique, and further improvement of the technique may give it the ability to replace biopsy.
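
    A minimal sketch of the age-normalization step followed by K-NN classification, with synthetic data standing in for the study's 12 impedance-derived features.

```python
# Sketch: divide each impedance-derived feature by patient age, then classify
# with K-nearest neighbors.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(3)
n = 19
features = rng.normal(50, 10, size=(n, 12))        # impedance features (toy)
ages = rng.integers(25, 70, size=n).reshape(-1, 1)  # patient ages
labels = rng.integers(0, 2, size=n)                 # 1 = malignant (toy)

X = features / ages                                 # the age-normalization step
knn = KNeighborsClassifier(n_neighbors=3).fit(X, labels)
print(knn.predict(features[:2] / ages[:2]))
```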

  6. Classification of breast tumour using electrical impedance and machine learning techniques

    International Nuclear Information System (INIS)

    Amin, Abdullah Al; Parvin, Shahnaj; Kadir, M A; Tahmid, Tasmia; Alam, S Kaisar; Siddique-e Rabbani, K

    2014-01-01

    When a breast lump is detected through palpation, mammography or ultrasonography, the final test for characterization of the tumour, whether it is malignant or benign, is biopsy. This is invasive and carries the hazards associated with any surgical procedure. The present work was undertaken to study the feasibility of such characterization using non-invasive electrical impedance measurements and machine learning techniques. Because of changes in cell morphology of malignant and benign tumours, changes are expected in impedance at a fixed frequency and in its variation with the frequency of measurement. Tetrapolar impedance measurement (TPIM) using four electrodes at the corners of a square region of sides 4 cm was used for zone localization. Data of impedance in two orthogonal directions, measured at 5 and 200 kHz from 19 subjects, and their respective slopes with frequency were subjected to machine learning procedures through the use of feature plots. These patients had single or multiple tumours of various types in one or both breasts, and four of them had malignant tumours, as diagnosed by core biopsy. Although the size and depth of the tumours are expected to affect the measurements, this preliminary work ignored these effects. Selecting 12 features from the above measurements, feature plots were drawn for the 19 patients, which displayed considerable overlap between malignant and benign cases. However, based on the observed qualitative trend of the measured values, when all the feature values were divided by the respective ages, the two types of tumours separated out reasonably well. Using the K-NN classification method, the results obtained are: positive predictive value 60%, negative predictive value 93%, sensitivity 75%, specificity 87% and efficacy 84%, which are very good for such a test on a small sample size. A study on a larger sample is expected to give confidence in this technique, and further improvement of the technique may give it the ability to replace biopsy. (paper)

  7. Power quality event classification: an overview and key issues ...

    African Journals Online (AJOL)

    ... used for PQ events' classification. Various artificial intelligence techniques which are used in PQ event classification are also discussed. Major key issues and challenges in classifying PQ events are critically examined and outlined. Keywords: Power quality, PQ event classifiers, artificial intelligence techniques, PQ noise, ...

  8. Interdisciplinary consensus document for the treatment of fibromyalgia.

    Science.gov (United States)

    de Miquel, C Alegre; Campayo, J García; Flórez, M Tomás; Arguelles, J M Gómez; Tarrio, E Blanco; Montoya, M Gobbo; Martin, Á Pérez; Salio, A Martínez; Fuentes, J Vidal; Alberch, E Altarriba; de la Cámara, A Gómez

    2010-01-01

    Background. The elevated prevalence and enormous clinical and social impact of fibromyalgia, together with the complexity of its treatment, require consensuses on action that guide health care professionals. Although there are some similar documents in our language, most have been written from the perspective of a single discipline. Objective. To develop a consensus on the treatment of fibromyalgia made by selected representatives and supported by the principal medical associations that intervene in its treatment (rheumatology, neurology, psychiatry, rehabilitation and family medicine) and by representatives of patients' associations. In addition, the document stresses understanding the disease not as a homogeneous disorder but as the sum of different clinical subtypes with specific symptomatic characteristics and different therapeutic needs. This approach addressed a need perceived by clinicians and is a novelty with respect to previous consensuses. Methods. The different clinical classifications proposed for fibromyalgia and the scientific evidence for the treatments used in this disease were reviewed. For the selection of the classification used and the formulation of therapeutic recommendations, some of the usual consensus techniques (nominal group and brainstorming) were used. Conclusion. The classification of Giesecke, which divides fibromyalgia into 3 subgroups, seems to have the greatest scientific evidence and to be the most useful for the clinician. The guide offers a series of general recommendations for all patients with fibromyalgia. In addition, for each subgroup there are a series of specific recommendations of a pharmacological and psychological type, as well as on modification of the environment, which make possible a personalized approach to the patient with fibromyalgia in accordance with their individual clinical characteristics (pain, catastrophizing levels, etc.).

  9. Classification and disposal of radioactive wastes: History and legal and regulatory requirements

    International Nuclear Information System (INIS)

    Kocher, D.C.

    1990-01-01

    This document discusses the laws and regulations in the United States addressing classification of radioactive wastes and the requirements for disposal of different waste classes. This review emphasizes the relationship between waste classification and the requirements for permanent disposal.

  10. Improving MeSH classification of biomedical articles using citation contexts.

    Science.gov (United States)

    Aljaber, Bader; Martinez, David; Stokes, Nicola; Bailey, James

    2011-10-01

    Medical Subject Headings (MeSH) are used to index the majority of databases generated by the National Library of Medicine. Essentially, MeSH terms are designed to make information, such as scientific articles, more retrievable and accessible to users of systems such as PubMed. This paper proposes a novel method for automating the assignment of biomedical publications with MeSH terms that takes advantage of citation references to these publications. Our findings show that analysing the citation references that point to a document can provide a useful source of terms that are not present in the document. The use of these citation contexts, as they are known, can thus help to provide a richer document feature representation, which in turn can help improve text mining and information retrieval applications, in our case MeSH term classification. In this paper, we also explore new methods of selecting and utilising citation contexts. In particular, we assess the effect of weighting the importance of citation terms (found in the citation contexts) according to two aspects: (i) the section of the paper they appear in and (ii) their distance to the citation marker. We conduct intrinsic and extrinsic evaluations of citation term quality. For the intrinsic evaluation, we rely on the UMLS Metathesaurus conceptual database to explore the semantic characteristics of the mined citation terms. We also analyse the "informativeness" of these terms using a class-entropy measure. For the extrinsic evaluation, we run a series of automatic document classification experiments over MeSH terms. Our experimental evaluation shows that citation contexts contain terms that are related to the original document, and that the integration of this knowledge results in better classification performance compared to two state-of-the-art MeSH classification systems: MeSHUP and MTI. Our experiments also demonstrate that the consideration of Section and Distance factors can lead to statistically
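
    One way to realize the two weighting aspects is a per-term weight that multiplies a section factor by a decay with token distance from the citation marker; the section weights and decay constant below are illustrative assumptions, not the paper's tuned values.

```python
# Sketch: weight a citation-context term by (i) the section it appears in
# and (ii) its distance, in tokens, from the citation marker.
SECTION_WEIGHT = {"introduction": 0.8, "methods": 1.0,
                  "results": 1.0, "discussion": 0.6}  # assumed values

def citation_term_weight(section, distance_tokens, decay=0.05):
    section_factor = SECTION_WEIGHT.get(section, 0.5)
    distance_factor = 1.0 / (1.0 + decay * distance_tokens)
    return section_factor * distance_factor

# A term 10 tokens away from a citation in the methods section:
print(citation_term_weight("methods", 10))  # -> 0.666...
```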

  11. Waste classification - history, standards, and requirements for disposal

    International Nuclear Information System (INIS)

    Kocher, D.C.

    1989-01-01

    This document contains an outline of a presentation on the historical development in the US of the different classes (categories) of radioactive waste; on US laws and regulations regarding the classification of radioactive wastes and the requirements for disposal of the different waste classes; and on the application of the laws and regulations for hazardous chemical wastes to the classification and disposal of naturally occurring and accelerator-produced radioactive materials and of mixed radioactive and hazardous chemical wastes.

  12. ARABIC TEXT CLASSIFICATION USING NEW STEMMER FOR FEATURE SELECTION AND DECISION TREES

    Directory of Open Access Journals (Sweden)

    SAID BAHASSINE

    2017-06-01

    Full Text Available Text classification is the process of assigning unclassified text to appropriate classes based on its content. The most prevalent representation for text classification is the bag-of-words vector. In this representation, the words that appear in documents often have multiple morphological structures and grammatical forms. In most cases, these morphological variants of words belong to the same category. In the first part of this paper, a new stemming algorithm is developed in which each term of a given document is represented by its root. In the second part, a comparative study is conducted of the impact of two stemming algorithms, namely Khoja's stemmer and our new stemmer (referred to hereafter as the origin-stemmer), on Arabic text classification. This investigation was carried out using chi-square for feature selection, to reduce the dimensionality of the feature space, and a decision tree classifier. In order to evaluate the performance of the classifier, this study used a corpus that consists of 5070 documents independently classified into six categories: sport, entertainment, business, Middle East, switch and world, using the WEKA toolkit. The recall, f-measure and precision measures are used to compare the performance of the obtained models. The experimental results show that text classification using the origin-stemmer outperforms classification using Khoja's stemmer. The f-measure was 92.9% in the sport category and 89.1% in the business category.
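
    A compact sketch of the described pipeline, with a placeholder in place of the paper's root stemmer: tokens are reduced to roots, chi-square selects the most discriminative features, and a decision tree classifies.

```python
# Sketch: root stemming -> chi-square feature selection -> decision tree.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

def root_stem(token):
    return token  # placeholder: substitute a real Arabic root stemmer here

def analyzer(text):
    return [root_stem(tok) for tok in text.split()]

docs = ["فاز الفريق بالمباراة", "ارتفعت أسهم الشركة", "سجل اللاعب هدفين"]
cats = ["sport", "business", "sport"]

pipe = make_pipeline(CountVectorizer(analyzer=analyzer),
                     SelectKBest(chi2, k=2),          # keep the 2 best features
                     DecisionTreeClassifier(random_state=0))
pipe.fit(docs, cats)
print(pipe.predict(["هبطت أسهم البنك"]))
```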

  13. Rational kernels for Arabic Root Extraction and Text Classification

    Directory of Open Access Journals (Sweden)

    Attia Nehar

    2016-04-01

    Full Text Available In this paper, we address the problems of Arabic text classification and root extraction using transducers and rational kernels. We introduce a new root extraction approach based on the use of Arabic patterns (Pattern Based Stemmer). Transducers are used to model these patterns, and root extraction is done without relying on any dictionary. Using transducers for extracting roots, documents are transformed into finite state transducers. This document representation allows us to use and explore rational kernels as a framework for Arabic text classification. Root extraction experiments are conducted on three word collections and yield 75.6% accuracy. Classification experiments are done on the Saudi Press Agency dataset, and N-gram kernels are tested with different values of N. Accuracy and F1 are 90.79% and 62.93%, respectively. These results show that our approach, when compared with other approaches, is promising, especially in terms of accuracy and F1.

  14. Automated authorship attribution using advanced signal classification techniques.

    Directory of Open Access Journals (Sweden)

    Maryam Ebrahimpour

    Full Text Available In this paper, we develop two automated authorship attribution schemes, one based on Multiple Discriminant Analysis (MDA) and the other based on a Support Vector Machine (SVM). The classification features we exploit are based on word frequencies in the text. We preprocess each text by stripping it of all characters except a-z and space, in order to increase the portability of the software to different types of texts. We test the methodology on a corpus of undisputed English texts, and use leave-one-out cross-validation to demonstrate classification accuracies in excess of 90%. We further test our methods on the Federalist Papers, which have a partly disputed authorship and a fair degree of scholarly consensus. Finally, we apply our methodology to the question of the authorship of the Letter to the Hebrews by comparing it against a number of original Greek texts of known authorship. These tests identify where some of the limitations lie, motivating a number of open questions for future work. An open source implementation of our methodology is freely available for use at https://github.com/matthewberryman/author-detection.
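
    A minimal sketch of the protocol described above, on an assumed toy corpus: texts are stripped to a-z and space, word frequencies form the features, and a linear SVM is scored with leave-one-out cross-validation.

```python
# Sketch: character stripping -> word-frequency features -> SVM with LOOCV.
import re
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

def normalize(text):
    # Keep only a-z and space, as in the preprocessing described above.
    return re.sub(r"[^a-z ]", "", text.lower())

texts = ["It was the best of times, it was the worst of times.",
         "Call me Ishmael. Some years ago, never mind how long...",
         "It is a truth universally acknowledged...",
         "Whenever I find myself growing grim about the mouth..."]
authors = ["dickens", "melville", "austen", "melville"]

pipe = make_pipeline(CountVectorizer(preprocessor=normalize), LinearSVC())
scores = cross_val_score(pipe, texts, authors, cv=LeaveOneOut())
print(scores.mean())  # meaningful only with a realistically sized corpus
```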

  15. Simple Fully Automated Group Classification on Brain fMRI

    International Nuclear Information System (INIS)

    Honorio, J.; Goldstein, R.; Samaras, D.; Tomasi, D.; Goldstein, R.Z.

    2010-01-01

    We propose a simple, well-grounded classification technique which is suited for group classification on brain fMRI data sets that have high dimensionality, a small number of subjects, high noise level, high subject variability and imperfect registration, and that capture subtle cognitive effects. We propose threshold-split region as a new feature selection method and majority vote as the classification technique. Our method does not require a predefined set of regions of interest. We use averages across sessions, only one feature per experimental condition, a feature independence assumption, and simple classifiers. The seemingly counter-intuitive approach of using a simple design is supported by signal processing and statistical theory. Experimental results in two block-design data sets that capture brain function under distinct monetary rewards for cocaine-addicted and control subjects show that our method exhibits increased generalization accuracy compared to commonly used feature selection and classification techniques.

  16. Simple Fully Automated Group Classification on Brain fMRI

    Energy Technology Data Exchange (ETDEWEB)

    Honorio, J.; Goldstein, R.; Honorio, J.; Samaras, D.; Tomasi, D.; Goldstein, R.Z.

    2010-04-14

    We propose a simple, well-grounded classification technique which is suited for group classification on brain fMRI data sets that have high dimensionality, a small number of subjects, high noise level, high subject variability and imperfect registration, and that capture subtle cognitive effects. We propose threshold-split region as a new feature selection method and majority vote as the classification technique. Our method does not require a predefined set of regions of interest. We use averages across sessions, only one feature per experimental condition, a feature independence assumption, and simple classifiers. The seemingly counter-intuitive approach of using a simple design is supported by signal processing and statistical theory. Experimental results in two block-design data sets that capture brain function under distinct monetary rewards for cocaine-addicted and control subjects show that our method exhibits increased generalization accuracy compared to commonly used feature selection and classification techniques.

  17. Nonlinear filtering for character recognition in low quality document images

    Science.gov (United States)

    Diaz-Escobar, Julia; Kober, Vitaly

    2014-09-01

    Optical character recognition in scanned printed documents is a well-studied task, where the capture conditions, such as sheet position, illumination, contrast and resolution, are controlled. Nowadays, it is often more practical to use mobile devices for document capture than a scanner. As a consequence, the quality of document images is often poor owing to the presence of geometric distortions, nonhomogeneous illumination, low resolution, etc. In this work we propose to use multiple adaptive nonlinear composite filters for the detection and classification of characters. Computer simulation results obtained with the proposed system are presented and discussed.

  18. PROGRESSIVE DENSIFICATION AND REGION GROWING METHODS FOR LIDAR DATA CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    J. L. Pérez-García

    2012-07-01

    Full Text Available At present, airborne laser scanner systems are one of the most frequently used methods to obtain digital terrain elevation models. While they have the advantage of direct measurement on the object, the point cloud obtained needs its points classified according to whether or not they belong to the ground. This need to classify the raw data has led to the appearance of multiple filters focused on the classification of LiDAR information. Following this approach, this paper presents a classification method that combines LiDAR data segmentation techniques and progressive densification to locate the points belonging to the ground. The proposed methodology is tested on several datasets with different terrain characteristics and data availability. In all cases, we analyze the advantages and disadvantages obtained compared with the application of the individual techniques and, in particular, the benefits derived from the integration of both classification techniques. In order to provide a more comprehensive quality control of the classification process, the obtained results have been compared with those derived from a manual procedure, which is used as the reference classification. The results are also compared with other automatic classification methodologies included in some commercial software packages that are well proven by users for LiDAR data treatment.

  19. Text Classification and Distributional features techniques in Datamining and Warehousing

    OpenAIRE

    Bethu, Srikanth; Babu, G Charless; Vinoda, J; Priyadarshini, E; rao, M Raghavendra

    2013-01-01

    Text categorization is traditionally done by using term frequency and inverse document frequency. This type of method is not very good because some words which are not so important may appear in the document. The term frequency of unimportant words may increase, and the document may be classified in the wrong category. For reducing the error of classifying documents in the wrong category, distributional features are introduced. In the distributional features, the distribution of the words in ...
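
    To make the contrast concrete, the sketch below computes two simple distributional features for a word (how early it first appears and how spread out its occurrences are) alongside plain term frequency; the exact feature definitions are illustrative assumptions, not the paper's.

```python
# Sketch: distributional features of a word, beyond raw term frequency.
import numpy as np

def distributional_features(tokens, word):
    positions = [i / len(tokens) for i, t in enumerate(tokens) if t == word]
    if not positions:
        return 0.0, 1.0, 0.0
    freq = len(positions) / len(tokens)   # plain term frequency, for contrast
    first = positions[0]                  # how early the word appears
    spread = float(np.std(positions))     # how evenly it is distributed
    return freq, first, spread

doc = "the model trains fast and the model generalizes".split()
print(distributional_features(doc, "model"))
# A word that appears early and throughout the document is more likely to be
# topical than one with the same frequency clustered in a single passage.
```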

  20. Final hazard classification for the 183-C D&D project

    International Nuclear Information System (INIS)

    Zimmer, J.J.

    1995-01-01

    The intent of this document is to provide a final Hazard Classification for the Decontamination and Decommissioning (D&D) activities associated with the 183-C Filter Plant/Pump Room facility. The Hazard Classification was determined based upon DOE-EM-STD-5502-94, "DOE Limited Standard, Hazard Baseline Documentation," issued by the US Department of Energy. The 183-C Filter Plant/Pump Room facility was constructed to support operations of the 105-B and 105-C Reactors at the Hanford Site. Since shutdown of the 105-C Reactor in April 1969, the 183-C facility has been kept in a safe storage condition.

  1. Characterization and classification of the first meteorite fall in Varre-Sai town, southeast Brazil, using the X-ray microfluorescence technique

    Energy Technology Data Exchange (ETDEWEB)

    Alves, Haimon D.L. [Universidade Federal do Rio de Janeiro (COPPE/UFRJ), RJ (Brazil). Coordenacao dos Programas de Pos-Graduacao de Engenharia. Programa de Engenharia Nuclear; Assis, Joaquim T. de, E-mail: joaquim@iprj.uerj.b [Instituto Politecnico do Rio de Janeiro (IPRJ/UERJ), Nova Friburgo, RJ (Brazil); Valeriano, Claudio [Universidade do Estado do Rio de Janeiro (UERJ), RJ (Brazil). Dept. de Geologia; Turbay, Caio [Universidade Federal do Espirito Santo (UFES), Alegre, ES (Brazil). Dept. de Geologia

    2011-07-01

    On the night of June 19th, 2010, a meteorite fell near the town of Varre-Sai, Rio de Janeiro state, southeast Brazil. A small part of it was found and taken for analysis. A meteorite analysis can give researchers a better understanding of the origins of the Universe. However, some of the most traditional methods of characterization and classification of meteorites are destructive. In this paper we present the results of a chemical analysis and classification of this particular meteorite using X-ray microfluorescence (µXRF), a non-destructive technique that allows for a quick and easy elemental analysis at the micrometer scale. Both sides of the meteorite were measured, 35 points in total, using Artax, a state-of-the-art µXRF system developed by Bruker, at 50 kV voltage and 700 µA current. Quantitative analysis using the direct comparison of counting rates method showed combined concentrations of iron and nickel of roughly 7.86%. We found that it is possible to distinguish this meteorite from most of the categories and classify it as an ordinary L-type chondrite, but a more thorough analysis might be necessary to obtain a more detailed classification. (author)

  2. 3D laser scanning techniques applied to tunnel documentation and geological mapping at the Aespoe Hard Rock Laboratory, Sweden

    International Nuclear Information System (INIS)

    Feng, Q.; Wang, G.; Roeshoff, K.

    2008-01-01

    3D terrestrial laser scanning is nowadays one of the most attractive methods to apply to 3D mapping and documentation of rock faces and tunnels, and it shows the most potential to improve data quality and provide good solutions in rock engineering projects. In this paper, state-of-the-art methods are described for the different possibilities of tunnel documentation and geological mapping based on 3D laser scanning data. Some results are presented from the case study performed at the Hard Rock Laboratory, Aespoe, run by SKB, the Swedish Nuclear Fuel and Waste Management Co. Compared to traditional methods, 3D laser scanning techniques not only provide a rapid, 3D digital way of documenting tunnels, but also create the potential to achieve high-quality data, which may benefit different rock engineering project procedures, including field data acquisition, data processing, data retrieval and management, and also modeling and design. (authors)

  3. The Spinal Cord Injury-Interventions Classification System

    NARCIS (Netherlands)

    van Langeveld, A.H.B.

    2010-01-01

    Title: The Spinal Cord Injury-Interventions Classification System: development and evaluation of a documentation tool to record therapy to improve mobility and self-care in people with spinal cord injury. Background: Many rehabilitation researchers have emphasized the need to examine the actual

  4. A novel classification and online platform for planning and documentation of medical applications of additive manufacturing.

    Science.gov (United States)

    Tuomi, Jukka; Paloheimo, Kaija-Stiina; Vehviläinen, Juho; Björkstrand, Roy; Salmi, Mika; Huotilainen, Eero; Kontio, Risto; Rouse, Stephen; Gibson, Ian; Mäkitie, Antti A

    2014-12-01

    Additive manufacturing technologies are widely used in industrial settings and now increasingly also in several areas of medicine. Various techniques and numerous types of materials are used for these applications. There is a clear need to unify and harmonize the patterns of their use worldwide. We present a 5-class system to aid planning of these applications and related scientific work as well as communication between various actors involved in this field. An online, matrix-based platform and a database were developed for planning and documentation of various solutions. This platform will help the medical community to structurally develop both research innovations and clinical applications of additive manufacturing. The online platform can be accessed through http://www.medicalam.info. © The Author(s) 2014.

  5. A novel technique for estimation of skew in binary text document ...

    Indian Academy of Sciences (India)

    Gatos et al (1997) have proposed a new skew detection method based on the information ..... different books, magazines and journals. ..... Duda R O, Hart P E 1973 Pattern classification and scene analysis (New York: Wiley-Interscience).

  6. A New Binarization Algorithm for Historical Documents

    Directory of Open Access Journals (Sweden)

    Marcos Almeida

    2018-01-01

    Full Text Available Monochromatic documents require much less computer bandwidth for network transmission and storage space than their color or even grayscale equivalents. The binarization of historical documents is far more complex than that of recent ones, as paper aging, color, texture, translucency, stains, back-to-front interference, the kind and color of ink used in handwriting, the printing process and the digitization process are some of the factors that affect binarization. This article presents a new binarization algorithm for historical documents. The new global filter proposed is performed in four steps: filtering the image using a bilateral filter, splitting the image into its RGB components, decision-making for each RGB channel based on an adaptive binarization method inspired by Otsu's method with a choice of threshold level, and classification of the binarized images to decide which of the RGB components best preserved the document information in the foreground. The quantitative and qualitative assessment made with 23 binarization algorithms on three sets of "real world" documents showed very good results.
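
    A rough sketch of the four-step flow, assuming OpenCV (opencv-python) is available; the final "best channel" choice here is a simple stand-in criterion rather than the paper's classification step, and the file path is hypothetical.

```python
# Sketch: bilateral filtering, RGB split, Otsu-style threshold per channel,
# then a heuristic choice of the channel that best preserves the foreground.
import cv2
import numpy as np

def binarize(path):
    img = cv2.imread(path)                       # load the document image (BGR)
    img = cv2.bilateralFilter(img, 9, 75, 75)    # step 1: edge-preserving smoothing
    channels = cv2.split(img)                    # step 2: split into components
    results = []
    for ch in channels:                          # step 3: Otsu threshold per channel
        _, bw = cv2.threshold(ch, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        results.append(bw)
    # Step 4 (stand-in): keep the channel whose ink ratio is closest to a
    # typical value for text pages; 0.1 is an assumed constant.
    ink = [np.mean(bw == 0) for bw in results]
    best = int(np.argmin([abs(r - 0.1) for r in ink]))
    return results[best]

# usage: cv2.imwrite("out.png", binarize("historical_page.png"))
```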

  7. Involvement of Machine Learning for Breast Cancer Image Classification: A Survey.

    Science.gov (United States)

    Nahid, Abdullah-Al; Kong, Yinan

    2017-01-01

    Breast cancer is one of the largest causes of women's death in the world today. Advanced engineering of natural image classification techniques and Artificial Intelligence methods has largely been used for the breast-image classification task. The involvement of digital image classification allows doctors and physicians a second opinion, and it saves doctors' and physicians' time. Despite the various publications on breast image classification, very few review papers are available which provide a detailed description of breast cancer image classification techniques, feature extraction and selection procedures, classification measuring parameterizations, and image classification findings. We have put a special emphasis on the Convolutional Neural Network (CNN) method for breast image classification. Along with the CNN method we have also described the involvement of the conventional Neural Network (NN), Logic Based classifiers such as the Random Forest (RF) algorithm, Support Vector Machines (SVM), Bayesian methods, and a few of the semisupervised and unsupervised methods which have been used for breast image classification.

  8. Involvement of Machine Learning for Breast Cancer Image Classification: A Survey

    Directory of Open Access Journals (Sweden)

    Abdullah-Al Nahid

    2017-01-01

    Full Text Available Breast cancer is one of the largest causes of women’s death in the world today. Advanced engineering of natural image classification techniques and Artificial Intelligence methods has largely been used for the breast-image classification task. The involvement of digital image classification allows doctors and physicians a second opinion, and it saves doctors’ and physicians’ time. Despite the various publications on breast image classification, very few review papers are available which provide a detailed description of breast cancer image classification techniques, feature extraction and selection procedures, classification measuring parameterizations, and image classification findings. We have put a special emphasis on the Convolutional Neural Network (CNN) method for breast image classification. Along with the CNN method we have also described the involvement of the conventional Neural Network (NN), Logic Based classifiers such as the Random Forest (RF) algorithm, Support Vector Machines (SVM), Bayesian methods, and a few of the semisupervised and unsupervised methods which have been used for breast image classification.

  9. Safety classification of nuclear power plant systems, structures and components

    International Nuclear Information System (INIS)

    1992-01-01

    The Safety Classification principles used for the systems, structures and components of a nuclear power plant are detailed in the guide. For classification, the nuclear power plant is divided into structural and operational units called systems. Every structure and component under control is included in some system. The Safety Classes are 1, 2 and 3, plus the Class EYT (non-nuclear). Instructions on how to assign each system, structure and component to an appropriate safety class are given in the guide. The guide applies to new nuclear power plants and to the safety classification of systems, structures and components designed for the refitting of old nuclear power plants. The classification principles and procedures applying to the classification document are also given.

  10. PHOTOMETRIC SUPERNOVA CLASSIFICATION WITH MACHINE LEARNING

    Energy Technology Data Exchange (ETDEWEB)

    Lochner, Michelle; Peiris, Hiranya V.; Lahav, Ofer; Winter, Max K. [Department of Physics and Astronomy, University College London, Gower Street, London WC1E 6BT (United Kingdom); McEwen, Jason D., E-mail: dr.michelle.lochner@gmail.com [Mullard Space Science Laboratory, University College London, Surrey RH5 6NT (United Kingdom)

    2016-08-01

    Automated photometric supernova classification has become an active area of research in recent years in light of current and upcoming imaging surveys such as the Dark Energy Survey (DES) and the Large Synoptic Survey Telescope, given that spectroscopic confirmation of type for all supernovae discovered will be impossible. Here, we develop a multi-faceted classification pipeline, combining existing and new approaches. Our pipeline consists of two stages: extracting descriptive features from the light curves and classification using a machine learning algorithm. Our feature extraction methods vary from model-dependent techniques, namely SALT2 fits, to more independent techniques that fit parametric models to curves, to a completely model-independent wavelet approach. We cover a range of representative machine learning algorithms, including naive Bayes, k-nearest neighbors, support vector machines, artificial neural networks, and boosted decision trees (BDTs). We test the pipeline on simulated multi-band DES light curves from the Supernova Photometric Classification Challenge. Using the commonly used area under the curve (AUC) of the Receiver Operating Characteristic as a metric, we find that the SALT2 fits and the wavelet approach, with the BDTs algorithm, each achieve an AUC of 0.98, where 1 represents perfect classification. We find that a representative training set is essential for good classification, whatever the feature set or algorithm, with implications for spectroscopic follow-up. Importantly, we find that by using either the SALT2 or the wavelet feature sets with a BDT algorithm, accurate classification is possible purely from light curve data, without the need for any redshift information.
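
    A minimal sketch of the second pipeline stage, assuming light curves have already been reduced to feature vectors (e.g., SALT2 or wavelet coefficients): boosted decision trees are trained and scored by AUC. The features and labels below are synthetic placeholders.

```python
# Sketch: boosted decision trees on per-supernova feature vectors, scored by AUC.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(4)
n, d = 1000, 20
X = rng.normal(size=(n, d))  # extracted light-curve features (toy)
# Toy labels: 1 = type Ia, driven by two informative features plus noise.
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
bdt = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, bdt.predict_proba(X_te)[:, 1])
print(f"AUC: {auc:.3f}")  # 1.0 would represent perfect classification
```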

  11. PHOTOMETRIC SUPERNOVA CLASSIFICATION WITH MACHINE LEARNING

    International Nuclear Information System (INIS)

    Lochner, Michelle; Peiris, Hiranya V.; Lahav, Ofer; Winter, Max K.; McEwen, Jason D.

    2016-01-01

    Automated photometric supernova classification has become an active area of research in recent years in light of current and upcoming imaging surveys such as the Dark Energy Survey (DES) and the Large Synoptic Survey Telescope, given that spectroscopic confirmation of type for all supernovae discovered will be impossible. Here, we develop a multi-faceted classification pipeline, combining existing and new approaches. Our pipeline consists of two stages: extracting descriptive features from the light curves and classification using a machine learning algorithm. Our feature extraction methods vary from model-dependent techniques, namely SALT2 fits, to more independent techniques that fit parametric models to curves, to a completely model-independent wavelet approach. We cover a range of representative machine learning algorithms, including naive Bayes, k-nearest neighbors, support vector machines, artificial neural networks, and boosted decision trees (BDTs). We test the pipeline on simulated multi-band DES light curves from the Supernova Photometric Classification Challenge. Using the commonly used area under the curve (AUC) of the Receiver Operating Characteristic as a metric, we find that the SALT2 fits and the wavelet approach, with the BDTs algorithm, each achieve an AUC of 0.98, where 1 represents perfect classification. We find that a representative training set is essential for good classification, whatever the feature set or algorithm, with implications for spectroscopic follow-up. Importantly, we find that by using either the SALT2 or the wavelet feature sets with a BDT algorithm, accurate classification is possible purely from light curve data, without the need for any redshift information.

  12. Semantic Similarity between Web Documents Using Ontology

    Science.gov (United States)

    Chahal, Poonam; Singh Tomer, Manjeet; Kumar, Suresh

    2018-06-01

    The World Wide Web is the source of information available in the structure of interlinked web pages. However, the procedure of extracting significant information with the assistance of a search engine is incredibly critical. This is because web information is written mainly in natural language and is made available to individual humans. Several efforts have been made in semantic similarity computation between documents using words, concepts and concept relationships, but the outcomes available are still not as per user requirements. This paper proposes a novel technique for the computation of semantic similarity between documents that takes into account not only the concepts available in the documents but also the relationships that exist between those concepts. In our approach, documents are processed by building an ontology of each document using a base ontology and a dictionary containing concept records. Each such record is made up of the probable words which represent a given concept. Finally, the document ontologies are compared to find their semantic similarity by taking into account the relationships among concepts. Relevant concepts and relations between the concepts have been explored by capturing author and user intention. The proposed semantic analysis technique provides improved results as compared to the existing techniques.

  13. Semantic Similarity between Web Documents Using Ontology

    Science.gov (United States)

    Chahal, Poonam; Singh Tomer, Manjeet; Kumar, Suresh

    2018-03-01

    The World Wide Web is the source of information available in the structure of interlinked web pages. However, the procedure of extracting significant information with the assistance of a search engine is incredibly critical. This is because web information is written mainly in natural language and is made available to individual humans. Several efforts have been made in semantic similarity computation between documents using words, concepts and concept relationships, but the outcomes available are still not as per user requirements. This paper proposes a novel technique for the computation of semantic similarity between documents that takes into account not only the concepts available in the documents but also the relationships that exist between those concepts. In our approach, documents are processed by building an ontology of each document using a base ontology and a dictionary containing concept records. Each such record is made up of the probable words which represent a given concept. Finally, the document ontologies are compared to find their semantic similarity by taking into account the relationships among concepts. Relevant concepts and relations between the concepts have been explored by capturing author and user intention. The proposed semantic analysis technique provides improved results as compared to the existing techniques.

  14. 32 CFR 732.25 - Accounting classifications for nonnaval medical and dental care expenses.

    Science.gov (United States)

    2010-07-01

    ... and dental care expenses. 732.25 Section 732.25 National Defense Department of Defense (Continued) DEPARTMENT OF THE NAVY PERSONNEL NONNAVAL MEDICAL AND DENTAL CARE Accounting Classifications for Nonnaval Medical and Dental Care Expenses and Standard Document Numbers § 732.25 Accounting classifications for...

  15. Document retrieval on repetitive string collections.

    Science.gov (United States)

    Gagie, Travis; Hartikainen, Aleksi; Karhu, Kalle; Kärkkäinen, Juha; Navarro, Gonzalo; Puglisi, Simon J; Sirén, Jouni

    2017-01-01

    Most of the fastest-growing string collections today are repetitive, that is, most of the constituent documents are similar to many others. As these collections keep growing, a key approach to handling them is to exploit their repetitiveness, which can reduce their space usage by orders of magnitude. We study the problem of indexing repetitive string collections in order to perform efficient document retrieval operations on them. Document retrieval problems are routinely solved by search engines on large natural language collections, but the techniques are less developed on generic string collections. The case of repetitive string collections is even less understood, and there are very few existing solutions. We develop two novel ideas, interleaved LCPs and precomputed document lists, that yield highly compressed indexes solving the problem of document listing (find all the documents where a string appears), top-k document retrieval (find the k documents where a string appears most often), and document counting (count the number of documents where a string appears). We also show that a classical data structure supporting the latter query becomes highly compressible on repetitive data. Finally, we show how the tools we developed can be combined to solve ranked conjunctive and disjunctive multi-term queries under the simple tf-idf model of relevance. We thoroughly evaluate the resulting techniques in various real-life repetitiveness scenarios, and recommend the best choices for each case.

  16. Land cover classification using reformed fuzzy C-means

    Indian Academy of Sciences (India)

    This paper uses segmentation based on unsupervised clustering techniques for classification of land cover. ... unsupervised classification can be solved by FCM. ... They also act as input to the development and monitoring of a range of ...
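
    Since the record above survives only in fragments, a sketch of standard fuzzy C-means (FCM) may help fix ideas; the paper's "reformed" variant modifies this baseline in ways not reproduced here.

```python
# Minimal NumPy sketch of standard fuzzy C-means.
import numpy as np

def fcm(X, c=3, m=2.0, iters=100, eps=1e-5, rng=np.random.default_rng(0)):
    n = len(X)
    U = rng.random((c, n))
    U /= U.sum(axis=0)                      # memberships sum to 1 per pixel
    for _ in range(iters):
        Um = U ** m
        centers = (Um @ X) / Um.sum(axis=1, keepdims=True)
        d = np.linalg.norm(X[None, :, :] - centers[:, None, :], axis=2) + 1e-10
        U_new = 1.0 / (d ** (2 / (m - 1)))  # standard membership update
        U_new /= U_new.sum(axis=0)
        if np.abs(U_new - U).max() < eps:
            U = U_new
            break
        U = U_new
    return centers, U

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(mu, 0.3, (50, 2)) for mu in (0, 3, 6)])  # toy pixels
centers, U = fcm(X)
labels = U.argmax(axis=0)                   # hard classification from memberships
```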

  17. Improving Classification of Protein Interaction Articles Using Context Similarity-Based Feature Selection.

    Science.gov (United States)

    Chen, Yifei; Sun, Yuxing; Han, Bing-Qing

    2015-01-01

    Protein interaction article classification is a text classification task in the biological domain to determine which articles describe protein-protein interactions. Since the feature space in text classification is high-dimensional, feature selection is widely used for reducing the dimensionality of features to speed up computation without sacrificing classification performance. Many existing feature selection methods are based on the statistical measures of document frequency and term frequency. One potential drawback of these methods is that they treat features separately. Hence, we first design a similarity measure between the context information to take word co-occurrences and phrase chunks around the features into account. We then introduce the similarity of context information into the importance measure of the features, substituting for document and term frequency, and thereby propose new context similarity-based feature selection methods. Their performance is evaluated on two protein interaction article collections and compared against the frequency-based methods. The experimental results reveal that the context similarity-based methods perform better in terms of the F1 measure and the dimension reduction rate. Benefiting from the context information surrounding the features, the proposed methods can select distinctive features effectively for protein interaction article classification.
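
    A sketch of the frequency-based baseline the authors compare against; their contribution replaces the document-frequency score below with a context-similarity score. Texts are invented.

```python
# Document-frequency feature selection: the frequency-based baseline.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer

texts = ["protein A binds protein B", "cells were cultured overnight",
         "interaction of kinase with substrate", "the weather was sunny"]

vec = CountVectorizer()
X = vec.fit_transform(texts)
df = np.asarray((X > 0).sum(axis=0)).ravel()     # document frequency per term
top = np.argsort(df)[::-1][:5]                   # keep the 5 most frequent terms
selected = [vec.get_feature_names_out()[i] for i in top]
print(selected)
```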

  18. ERC hazard classification matrices for above ground structures and groundwater and soil remediation activities

    International Nuclear Information System (INIS)

    Curry, L.R.

    1997-01-01

    This document provides the status of the preliminary hazard classification (PHC) process for the Environmental Restoration Contractor (ERC) above ground structures and groundwater and soil remediation activities currently underway or planned for fiscal year (FY) 1997. This classification process is based on current US Department of Energy (DOE), Richland Operations Office (RL) guidance for the classification of facilities and activities containing radionuclide and nonradiological hazardous material inventories. The above ground structures presented in the matrices were drawn from the Bechtel Hanford, Inc. (BHI) Decontamination and Decommissioning (D and D) Project Facility List (DOE 1996), which identifies the facilities in the RL-Environmental Restoration baseline contract in 1997. This document contains the following two appendices: (1) Appendix A, which consists of a matrix identifying PHC documents that have been issued for BHI's above ground structures and groundwater and soil remediation activities underway or planned for FY 1997, and (2) Appendix B, which consists of a matrix showing anticipated PHCs for above ground structures and groundwater and soil remediation activities underway or planned for FY 1997. Appendix B also shows the schedule for finalization of PHCs for above ground structures with an anticipated classification of Nuclear

  19. Low-cost computer classification of land cover in the Portland area, Oregon, by signature extension techniques

    Science.gov (United States)

    Gaydos, Leonard

    1978-01-01

    Computer-aided techniques for interpreting multispectral data acquired by Landsat offer economies in the mapping of land cover. Even so, the actual establishment of the statistical classes, or "signatures," is one of the relatively more costly operations involved. Analysts have therefore been seeking cost-saving signature extension techniques that would accept training data acquired for one time or place and apply them to another. Opportunities to extend signatures occur in preprocessing steps and in the classification steps that follow. In the present example, land cover classes were derived by the simplest and most direct form of signature extension: Classes statistically derived from a Landsat scene for the Puget Sound area, Wash., were applied to the Portland area, Oreg., using data for the next Landsat scene acquired less than 25 seconds down orbit. Many features can be recognized on the reduced-scale version of the Portland land cover map shown in this report, although no statistical assessment of its accuracy is available.
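
    A sketch of this simplest form of signature extension, assuming Gaussian class signatures: statistics fitted on one scene are applied unchanged to pixels of another. The 4-band pixel values and class structure are synthetic stand-ins for the Landsat data.

```python
# Gaussian "signatures" from scene A applied to scene B (signature extension).
import numpy as np
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

rng = np.random.default_rng(0)
# Scene A: labeled training pixels for two hypothetical cover classes (4 bands).
water  = rng.normal([20, 15, 10, 5],  2, (300, 4))
forest = rng.normal([30, 45, 40, 70], 3, (300, 4))
X_a = np.vstack([water, forest])
y_a = np.array([0] * 300 + [1] * 300)

# Per-class mean and covariance fitted on scene A (maximum-likelihood classes).
clf = QuadraticDiscriminantAnalysis().fit(X_a, y_a)

# Scene B pixels, acquired seconds down orbit, classified with scene A classes.
scene_b = rng.normal([29, 44, 41, 69], 3, (1000, 4))
labels_b = clf.predict(scene_b)
```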

  20. Computer-aided classification of forest cover types from small scale aerial photography

    Science.gov (United States)

    Bliss, John C.; Bonnicksen, Thomas M.; Mace, Thomas H.

    1980-11-01

    The US National Park Service must map forest cover types over extensive areas in order to fulfill its goal of maintaining or reconstructing presettlement vegetation within national parks and monuments. Furthermore, such cover type maps must be updated on a regular basis to document vegetation changes. Computer-aided classification of small scale aerial photography is a promising technique for generating forest cover type maps efficiently and inexpensively. In this study, seven cover types were classified with an overall accuracy of 62 percent from a reproduction of a 1:120,000 color infrared transparency of a conifer-hardwood forest. The results were encouraging, given the degraded quality of the photograph and the fact that features were not centered, as well as the lack of information on lens vignetting characteristics to make corrections. Suggestions are made for resolving these problems in future research and applications. In addition, it is hypothesized that the overall accuracy is artificially low because the computer-aided classification more accurately portrayed the intermixing of cover types than the hand-drawn maps to which it was compared.

  1. 49 CFR 1105.6 - Classification of actions.

    Science.gov (United States)

    2010-10-01

    ... Classification of actions. (a) Environmental Impact Statements will normally be prepared for rail construction... action that would normally require environmental documentation (such as a construction or abandonment... environmental impacts; (6) Water carrier licensing under 49 U.S.C. 10922 that: (i) Involves a new operation (i.e...

  2. Gender classification under extended operating conditions

    Science.gov (United States)

    Rude, Howard N.; Rizki, Mateen

    2014-06-01

    Gender classification is a critical component of a robust image security system. Many techniques exist to perform gender classification using facial features. In contrast, this paper explores gender classification using body features extracted from clothed subjects. Several of the most effective types of features for gender classification identified in literature were implemented and applied to the newly developed Seasonal Weather And Gender (SWAG) dataset. SWAG contains video clips of approximately 2000 samples of human subjects captured over a period of several months. The subjects are wearing casual business attire and outer garments appropriate for the specific weather conditions observed in the Midwest. The results from a series of experiments are presented that compare the classification accuracy of systems that incorporate various types and combinations of features applied to multiple looks at subjects at different image resolutions to determine a baseline performance for gender classification.

  3. Photographic documentation of acute radiation-induced side effects of the oral mucosa

    International Nuclear Information System (INIS)

    Riesenbeck, D.; Doerr, W.; Feyerabend, T.; Richter, E.; Fietkau, R.; Henne, K.; Schendera, A.

    1998-01-01

    Background: Radiotherapy in cancer of the head and neck induces cutaneous and mucosal reactions. These must be carefully assessed and documented to control the accuracy of individual treatment, the overall toxicity of particular treatment schedules, and the efficacy of prophylaxis and treatment, and to determine the adequate therapy of treatment sequelae depending on the severity of the reactions. The accurate classification of lesions according to internationally accepted schedules (WHO/RTOG/CTC) is indispensable for the comparison of radiotherapy treatment results and the efficacy of supportive care. Methods: While the treatment of cancer depends on tumor stage and medical circumstances of the patient and is more or less standardized, prophylaxis and treatment of side effects is highly variable. In discussing an optimized prophylaxis and therapy of oral mucositis, the problem of accurate classification and documentation emerged. The verbal description of mucosal lesions is open to many subjective interpretations. Photographic documentation seems a suitable method to optimize the grading of toxicity. Results: A photographic survey of typical lesions for each grade of toxicity is a tool to reach several aims in one step. The toxicity of an individual patient may be compared with representative photographic examples in daily routine to decide quickly on the grade of toxicity. Subjective differences due to intra- and interpersonal variability of the evaluating radiooncologist will be reduced. The efficacy of treatment can be proven by accurate documentation. Randomized clinical studies concerning prophylaxis and treatment of mucositis will provide more reliable results if evaluation of toxicity grading is standardized by photographs. Conclusions: Photographic documentation of lesions of the oral mucosa might be the best means to reduce interindividual subjectivity in grading. It is a valuable appendix to standard classification systems and only concerns the visible signs of mucosal

  4. Validation of potential classification criteria for systemic sclerosis.

    NARCIS (Netherlands)

    Johnson, S.R.; Fransen, J.; Khanna, D.; Baron, M.; Hoogen, F. van den; Medsger TA, J.r.; Peschken, C.A.; Carreira, P.E.; Riemekasten, G.; Tyndall, A.; Matucci-Cerinic, M.; Pope, J.E.

    2012-01-01

    OBJECTIVE: Classification criteria for systemic sclerosis (SSc; scleroderma) are being updated jointly by the American College of Rheumatology and European League Against Rheumatism. Potential items for classification were reduced to 23 using Delphi and nominal group techniques. We evaluated the

  5. Binary Stochastic Representations for Large Multi-class Classification

    KAUST Repository

    Gerald, Thomas; Baskiotis, Nicolas; Denoyer, Ludovic

    2017-01-01

    Classification with a large number of classes is a key problem in machine learning and corresponds to many real-world applications like tagging of images or textual documents in social networks. If one-vs-all methods usually reach top performance

  6. Classification of high-resolution multi-swath hyperspectral data using Landsat 8 surface reflectance data as a calibration target and a novel histogram based unsupervised classification technique to determine natural classes from biophysically relevant fit parameters

    Science.gov (United States)

    McCann, C.; Repasky, K. S.; Morin, M.; Lawrence, R. L.; Powell, S. L.

    2016-12-01

    Compact, cost-effective, flight-based hyperspectral imaging systems can provide scientifically relevant data over large areas for a variety of applications such as ecosystem studies, precision agriculture, and land management. To fully realize this capability, unsupervised classification techniques based on radiometrically-calibrated data that cluster based on biophysical similarity rather than simply spectral similarity are needed. An automated technique to produce high-resolution, large-area, radiometrically-calibrated hyperspectral data sets based on the Landsat surface reflectance data product as a calibration target was developed and applied to three subsequent years of data covering approximately 1850 hectares. The radiometrically-calibrated data allow inter-comparison of the temporal series. Advantages of the radiometric calibration technique include the need for minimal site access, no ancillary instrumentation, and automated processing. Fitting the reflectance spectra of each pixel using a set of biophysically relevant basis functions reduces the data from 80 spectral bands to 9 parameters, providing noise reduction and data compression. Examination of histograms of these parameters allows for determination of natural splitting into biophysically similar clusters. This method creates clusters that are similar in terms of biophysical parameters, not simply spectral proximity. Furthermore, this method can be applied to other data sets, such as urban scenes, by developing other physically meaningful basis functions. The ability to use hyperspectral imaging for a variety of important applications requires the development of data processing techniques that can be automated. The radiometric calibration combined with the histogram-based unsupervised classification technique presented here provides one potential avenue for managing the big data associated with hyperspectral imaging.
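
    A toy sketch of the two-stage idea, with a low-order polynomial standing in for the biophysically relevant basis functions: each pixel's spectrum is compressed to a few fit parameters, and a histogram of one parameter is split at a valley to form clusters.

```python
# Compress spectra to fit parameters, then split a parameter histogram.
import numpy as np

rng = np.random.default_rng(0)
wavelengths = np.linspace(0.4, 1.0, 80)                   # 80 spectral bands
coeff_pool = np.array([[0.2, -0.1, 0.3], [-0.3, 0.4, 0.5]])  # two "materials"
spectra = np.stack([np.polyval(rng.choice(coeff_pool), wavelengths)
                    + rng.normal(0, 0.01, 80) for _ in range(500)])

# Fit parameters: degree-2 polynomial per pixel (80 bands -> 3 numbers).
params = np.polyfit(wavelengths, spectra.T, deg=2)        # shape (3, n_pixels)

# A valley between histogram modes marks a natural class split.
counts, edges = np.histogram(params[0], bins=40)
valley = np.argmin(counts[5:-5]) + 5                      # ignore histogram edges
threshold = edges[valley]
labels = (params[0] > threshold).astype(int)
```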

  7. Evaluation of the Waste Isolation Pilot Plant classification of systems, structures and components

    International Nuclear Information System (INIS)

    1985-07-01

    A review of the classification system for systems, structures, and components at the Waste Isolation Pilot Plant (WIPP) was performed using the WIPP Safety Analysis Report (SAR) and Bechtel document D-76-D-03 as primary source documents. The regulations of the US Nuclear Regulatory Commission (NRC) covering "Disposal of High-Level Radioactive Wastes in Geologic Repositories," 10 CFR 60, and the regulations relevant to nuclear power plant siting and construction (10 CFR 50, 51, 100) were used as standards to evaluate the WIPP design classification system, although it is recognized that the US Department of Energy (DOE) is not required to comply with these NRC regulations in the design and construction of WIPP. The DOE General Design Criteria Manual (DOE Order 6430.1) and the Safety Analysis and Review System for AL Operation document (AL 54f81.1A) were reviewed in part. This report includes a discussion of the historical basis for nuclear power plant requirements, a review of WIPP and nuclear power plant classification bases, and a comparison of the codes and standards applicable to each quality level. Observations made during the review of the WIPP SAR are noted in the text of this report. The conclusions reached by this review are: WIPP classification methodology is comparable to corresponding nuclear power procedures. The classification levels assigned to WIPP systems are qualitatively the same as those assigned to nuclear power plant systems

  8. A Cognitive Computing Approach for Classification of Complaints in the Insurance Industry

    Science.gov (United States)

    Forster, J.; Entrup, B.

    2017-10-01

    In this paper we present and evaluate a cognitive computing approach for the classification of dissatisfaction and four complaint-specific classes in correspondence documents between insurance clients and an insurance company. The cognitive computing approach combines classical natural language processing methods, machine learning algorithms and the evaluation of hypotheses. It combines a MaxEnt machine learning algorithm with language modelling, tf-idf and sentiment analytics to create a multi-label text classification model. The model is trained and tested with a set of 2500 original insurance communication documents written in German, which have been manually annotated by the partnering insurance company. With an F1-score of 0.9, a reliable text classification component has been implemented and evaluated. A final outlook towards a cognitive computing insurance assistant is given at the end.
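
    A hedged sketch of the core pipeline, with logistic regression playing the role of the MaxEnt model: tf-idf features feed a one-vs-rest wrapper for multi-label output. The texts and label names are invented; the German corpus is proprietary.

```python
# Multi-label text classification with tf-idf + MaxEnt (logistic regression).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

texts = ["the claim handling took far too long",
         "premium increase was never explained",
         "friendly staff, quick reimbursement"]
labels = [{"dissatisfaction", "processing_time"},
          {"dissatisfaction", "pricing"},
          set()]

Y = MultiLabelBinarizer().fit_transform(labels)
model = make_pipeline(TfidfVectorizer(),
                      OneVsRestClassifier(LogisticRegression(max_iter=1000)))
model.fit(texts, Y)
print(model.predict(["why was my claim not processed yet"]))
```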

  9. Discrete classification technique applied to TV advertisements liking recognition system based on low-cost EEG headsets.

    Science.gov (United States)

    Soria Morillo, Luis M; Alvarez-Garcia, Juan A; Gonzalez-Abril, Luis; Ortega Ramírez, Juan A

    2016-07-15

    In this paper a new approach is applied to the area of marketing research. The aim of this paper is to recognize how brain activity responds during the visualization of short video advertisements using discrete classification techniques. By means of low-cost electroencephalography (EEG) devices, the activation level of several brain regions has been studied while the ads are shown to users. One may ask how useful neuroscience knowledge is in marketing, what neuroscience could provide to the marketing sector, or why this approach can improve the accuracy and the final user acceptance compared to other works. By applying discrete techniques over the EEG frequency bands of a generated dataset, C4.5, an ANN, and a new recognition system based on Ameva, a discretization algorithm, are used to obtain the score given by subjects to each TV ad. The proposed technique reaches more than 75% accuracy, which is an excellent result taking into account the typology of EEG sensors used in this work. Furthermore, the time consumption of the proposed algorithm is reduced by up to 30% compared to the other techniques presented in this paper. This brings about a battery lifetime improvement on the devices where the algorithm is running, extending the experience in the ubiquitous context where the new approach has been tested.
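
    A sketch of the discrete-classification idea, with plain equal-width binning standing in for the Ameva discretizer and synthetic band powers in place of real EEG data.

```python
# Discretize continuous EEG band powers, then train a classifier on the bins.
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
band_powers = rng.random((200, 4))              # alpha/beta/theta/delta per trial
scores = (band_powers[:, 0] > 0.5).astype(int)  # synthetic liking score

disc = KBinsDiscretizer(n_bins=5, encode="ordinal", strategy="uniform")
X_disc = disc.fit_transform(band_powers)
clf = DecisionTreeClassifier(max_depth=3).fit(X_disc, scores)
print(clf.score(X_disc, scores))
```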

  10. An ensemble heterogeneous classification methodology for discovering health-related knowledge in social media messages.

    Science.gov (United States)

    Tuarob, Suppawong; Tucker, Conrad S; Salathe, Marcel; Ram, Nilam

    2014-06-01

    The role of social media as a source of timely and massive information has become more apparent since the era of Web 2.0. Multiple studies illustrated the use of information in social media to discover biomedical and health-related knowledge. Most methods proposed in the literature employ traditional document classification techniques that represent a document as a bag of words. These techniques work well when documents are rich in text and conform to standard English; however, they are not optimal for social media data where sparsity and noise are norms. This paper aims to address the limitations posed by the traditional bag-of-words based methods and proposes to use heterogeneous features in combination with ensemble machine learning techniques to discover health-related information, which could prove useful to multiple biomedical applications, especially those needing to discover health-related knowledge in large-scale social media data. Furthermore, the proposed methodology could be generalized to discover different types of information in various kinds of textual data. Social media data is characterized by an abundance of short social-oriented messages that do not conform to standard languages, both grammatically and syntactically. The problem of discovering health-related knowledge in social media data streams is then transformed into a text classification problem, where a text is identified as positive if it is health-related and negative otherwise. We first identify the limitations of the traditional methods which train machines with N-gram word features, then propose to overcome such limitations by utilizing the collaboration of machine learning based classifiers, each of which is trained to learn a semantically different aspect of the data. The parameter analysis for tuning each classifier is also reported. Three data sets are used in this research. The first data set comprises approximately 5000 hand-labeled tweets, and is used for cross validation of the
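
    A sketch of the ensemble idea with scikit-learn's VotingClassifier: several heterogeneous classifiers vote on whether a message is health-related. In the paper each learner sees a semantically different feature set; here, for brevity, all see the same toy texts.

```python
# Voting ensemble of heterogeneous text classifiers.
from sklearn.ensemble import VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

texts = ["got the flu, staying home", "great game last night",
         "my allergies are killing me", "new phone arrived today"]
y = [1, 0, 1, 0]                      # 1 = health-related

ensemble = VotingClassifier([
    ("nb",  make_pipeline(TfidfVectorizer(), MultinomialNB())),
    ("lr",  make_pipeline(TfidfVectorizer(), LogisticRegression())),
    ("svm", make_pipeline(TfidfVectorizer(), LinearSVC())),
], voting="hard")
ensemble.fit(texts, y)
print(ensemble.predict(["feeling feverish this morning"]))
```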

  11. Document Type Profiles in Nature, Science, and PNAS: Journal and Country Level

    Directory of Open Access Journals (Sweden)

    Jielan Ding

    2016-09-01

    Full Text Available Purpose: In this contribution, we want to detect the document type profiles of the three prestigious journals Nature, Science, and Proceedings of the National Academy of Sciences of the United States (PNAS) with regard to two levels: journal and country. Design/methodology/approach: Using relative values based on fractional counting, we investigate the distribution of publications across document types at both the journal and country level, and we use (cosine) document type profile similarity values to compare pairs of publication years within countries. Findings: Nature and Science mainly publish Editorial Material, Article, News Item and Letter, whereas the publications of PNAS are heavily concentrated on Article. The shares of Article for Nature and Science are decreasing slightly from 1999 to 2014, while the corresponding shares of Editorial Material are increasing. Most studied countries focus on Article and Letter in Nature, but on Letter in Science and PNAS. The document type profiles of some of the studied countries change to a relatively large extent over publication years. Research limitations: The main limitation of this research concerns the Web of Science classification of publications into document types. Since the analysis of the paper is based on document types of Web of Science, the classification in question is not free from errors, and the accuracy of the analysis might be affected. Practical implications: Results show that Nature and Science are quite diversified with regard to document types. In bibliometric assessments, where publications in Nature and Science play a role, other document types than Article and Review might therefore be taken into account. Originality/value: Results highlight the importance of other document types than Article and Review in Nature and Science. Large differences are also found when comparing the country document type profiles of the three journals with the corresponding profiles in all Web of
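
    The year-to-year comparison reduces to a cosine between document type share vectors; a small sketch with invented shares:

```python
# Cosine similarity between two document type profiles (shares per type).
import numpy as np

types = ["Article", "Editorial Material", "Letter", "News Item"]
profile_1999 = np.array([0.55, 0.15, 0.20, 0.10])
profile_2014 = np.array([0.45, 0.25, 0.18, 0.12])

cosine = profile_1999 @ profile_2014 / (
    np.linalg.norm(profile_1999) * np.linalg.norm(profile_2014))
print(round(cosine, 3))
```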

  12. Independent Comparison of Popular DPI Tools for Traffic Classification

    DEFF Research Database (Denmark)

    Bujlow, Tomasz; Carela-Español, Valentín; Barlet-Ros, Pere

    2015-01-01

    Deep Packet Inspection (DPI) is the state-of-the-art technology for traffic classification. According to the conventional wisdom, DPI is the most accurate classification technique. Consequently, most popular products, either commercial or open-source, rely on some sort of DPI for traffic classification (..., application and web service). We carefully built a labeled dataset with more than 750K flows, which contains traffic from popular applications. We used the Volunteer-Based System (VBS), developed at Aalborg University, to guarantee the correct labeling of the dataset. We released this dataset, including full

  13. Classification of deadlift biomechanics with wearable inertial measurement units.

    Science.gov (United States)

    O'Reilly, Martin A; Whelan, Darragh F; Ward, Tomas E; Delahunt, Eamonn; Caulfield, Brian M

    2017-06-14

    The deadlift is a compound full-body exercise that is fundamental in resistance training, rehabilitation programs and powerlifting competitions. Accurate quantification of deadlift biomechanics is important to reduce the risk of injury and ensure training and rehabilitation goals are achieved. This study sought to develop and evaluate deadlift exercise technique classification systems utilising Inertial Measurement Units (IMUs), recording at 51.2 Hz, worn on the lumbar spine, both thighs and both shanks. It also sought to compare classification quality when these IMUs are worn in combination and in isolation. Two datasets of IMU deadlift data were collected. Eighty participants first completed deadlifts with acceptable technique and 5 distinct, deliberately induced deviations from acceptable form. Fifty-five members of this group also completed a fatiguing protocol (3-Repetition Maximum test) to enable the collection of natural deadlift deviations. For both datasets, universal and personalised random-forest classifiers were developed and evaluated. Personalised classifiers outperformed universal classifiers in accuracy, sensitivity and specificity in the binary classification of acceptable or aberrant technique and in the multi-label classification of specific deadlift deviations. Whilst recent research has favoured universal classifiers due to the reduced overhead in setting them up for new system users, this work demonstrates that such techniques may not be appropriate for classifying deadlift technique due to the poor accuracy achieved. However, personalised classifiers perform very well in assessing deadlift technique, even when using data derived from a single lumbar-worn IMU to detect specific naturally occurring technique mistakes. Copyright © 2017 Elsevier Ltd. All rights reserved.
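
    A sketch of the universal-versus-personalised contrast: the universal classifier is evaluated on held-out subjects, the personalised one within each subject. Features, labels and group sizes are synthetic.

```python
# Universal (cross-subject) vs personalised (within-subject) evaluation.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GroupKFold, cross_val_score

rng = np.random.default_rng(0)
X = rng.random((400, 12))                  # summary features from 5 IMUs
y = rng.integers(0, 2, 400)                # acceptable vs aberrant technique
subject = np.repeat(np.arange(20), 20)     # 20 participants, 20 reps each

rf = RandomForestClassifier(n_estimators=100, random_state=0)
universal = cross_val_score(rf, X, y, cv=GroupKFold(n_splits=5),
                            groups=subject).mean()
personal = np.mean([cross_val_score(rf, X[subject == s], y[subject == s],
                                    cv=2).mean() for s in range(20)])
print(universal, personal)
```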

  14. A Chinese text classification system based on Naive Bayes algorithm

    Directory of Open Access Journals (Sweden)

    Cui Wei

    2016-01-01

    Full Text Available In this paper, addressing the characteristics of Chinese text classification, we use ICTCLAS (the Chinese lexical analysis system of the Chinese Academy of Sciences) for document segmentation, clean the data and filter stop words, and apply information gain and document frequency feature selection algorithms to select document features. On this basis, a text classifier is implemented with the Naive Bayes algorithm, and experiments and analysis are carried out on the system using the Chinese corpus of Fudan University.
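
    A sketch of the described pipeline under substitutions: scikit-learn's mutual information stands in for information gain, a document-frequency floor for the DF filter, and segmentation is assumed already done (tokens are space-separated).

```python
# Feature selection + Naive Bayes for (pre-segmented) Chinese text.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

docs = ["股市 上涨 经济 增长", "球队 比赛 胜利", "经济 政策 利率", "比赛 进球 球员"]
y = ["finance", "sports", "finance", "sports"]

model = make_pipeline(
    CountVectorizer(min_df=1),                   # min_df acts as the DF filter
    SelectKBest(mutual_info_classif, k=4),       # information-gain-style scores
    MultinomialNB())
model.fit(docs, y)
print(model.predict(["经济 增长 利率"]))
```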

  15. Class Association Rules in the Associative Classification Method

    Directory of Open Access Journals (Sweden)

    Eka Karyawati

    2011-11-01

    Full Text Available Frequent pattern (itemset) discovery is an important problem in associative classification rule mining. Different approaches have been proposed, such as the Apriori-like, Frequent Pattern (FP)-growth, and Transaction Data Location (Tid-list) Intersection algorithms. This paper focuses on surveying and comparing the state-of-the-art associative classification techniques with regard to the rule generation phase of associative classification algorithms. This phase includes frequent itemset discovery and rule mining/extraction methods to generate the set of class association rules (CARs). Several techniques have been proposed to improve the rule generation method. A technique utilizing the concept of the discriminative power of itemsets can reduce the size of the frequent itemsets by pruning the useless ones. The closed frequent itemset concept can be utilized to compress the rules into compact rules, which may reduce the size of the generated rules. Another technique concerns determining the support threshold value of the itemset: specifying not a single but multiple support threshold values with regard to the class label frequencies can give more appropriate support threshold values and may generate more accurate rules. An alternative technique for rule generation utilizes a vertical layout to represent the dataset. This method is very effective because it needs only one scan over the dataset, compared with other techniques that need multiple scans. However, one problem with these approaches is that the initial set of tid-lists may be too large to fit into main memory, and more sophisticated techniques are required to compress the tid-lists.
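
    A minimal Apriori-style sketch of CAR generation: itemsets whose support clears a threshold become rules itemset -> class when their confidence also clears a threshold. Transactions and thresholds are invented, and no level-wise pruning is attempted.

```python
# Brute-force class association rule (CAR) generation over a toy dataset.
from itertools import combinations

transactions = [({"bread", "milk"}, "buy"), ({"bread"}, "buy"),
                ({"beer", "chips"}, "skip"), ({"beer"}, "skip")]
MIN_SUP, MIN_CONF = 0.25, 0.6

items = sorted({i for t, _ in transactions for i in t})
cars = []
for size in (1, 2):
    for itemset in combinations(items, size):
        s = set(itemset)
        covers = [label for t, label in transactions if s <= t]
        support = len(covers) / len(transactions)
        if support < MIN_SUP:
            continue
        for label in set(covers):
            conf = covers.count(label) / len(covers)
            if conf >= MIN_CONF:
                cars.append((s, label, support, conf))

for rule in cars:
    print(rule)
```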

  16. International standards to document remaining autonomic function after spinal cord injury

    DEFF Research Database (Denmark)

    Krassioukov, Andrei; Biering-Sørensen, Fin; Donovan, William

    2012-01-01

    This is the first guideline describing the International Standards to document remaining Autonomic Function after Spinal Cord Injury (ISAFSCI). This guideline should be used as an adjunct to the International Standards for Neurological Classification of Spinal Cord Injury (ISNCSCI) including the ...

  17. Intelligent Computer Vision System for Automated Classification

    International Nuclear Information System (INIS)

    Jordanov, Ivan; Georgieva, Antoniya

    2010-01-01

    In this paper we investigate an Intelligent Computer Vision System applied to the recognition and classification of commercially available cork tiles. The system is capable of acquiring and processing gray images using several feature generation and analysis techniques. Its functionality includes image acquisition, feature extraction and preprocessing, and feature classification with neural networks (NN). We also discuss system test and validation results from the recognition and classification tasks. The system investigation also includes statistical feature processing (feature number and dimensionality reduction techniques) and classifier design (NN architecture, target coding, learning complexity and performance, and training with our own metaheuristic optimization method). The NNs trained with our genetic low-discrepancy search method (GLPτS) for global optimisation demonstrated very good generalisation abilities. In our view, the reported testing success rate of up to 95% is due to several factors: the combination of feature generation techniques; the application of Analysis of Variance (ANOVA) and Principal Component Analysis (PCA), which appeared to be very efficient for preprocessing the data; and the use of a suitable NN design and learning method.

  18. TEXT CLASSIFICATION USING NAIVE BAYES UPDATEABLE ALGORITHM IN SBMPTN TEST QUESTIONS

    Directory of Open Access Journals (Sweden)

    Ristu Saptono

    2017-01-01

    Full Text Available Document classification is a growing interest in text mining research. Classification can be done based on topics, languages, and so on. This study was conducted to determine how Naive Bayes Updateable performs in classifying SBMPTN exam questions based on their theme. Naive Bayes Updateable, an incremental variant of a classification algorithm often used in text classification, has the ability to learn from new data introduced to the system even after the classifier has been built from existing data. The Naive Bayes classifier assigns exam questions to fields of study by analyzing keywords that appear in the questions. The feature selection method DF-thresholding is implemented to improve classification performance. Evaluation of the classification with the Naive Bayes classifier algorithm produces 84.61% accuracy.
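
    A sketch of the updateable property using scikit-learn's partial_fit as a stand-in for Weka's NaiveBayesUpdateable: the classifier keeps learning from new question batches after it has been built.

```python
# Incremental Naive Bayes: the model keeps absorbing new labeled questions.
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.naive_bayes import MultinomialNB

themes = ["biology", "math"]
vec = HashingVectorizer(n_features=2**10, alternate_sign=False)
nb = MultinomialNB()

batch1 = ["photosynthesis occurs in chloroplasts", "solve the quadratic equation"]
nb.partial_fit(vec.transform(batch1), ["biology", "math"], classes=themes)

# Later, new questions arrive and the existing classifier keeps learning.
batch2 = ["mitochondria produce ATP", "integrate the polynomial"]
nb.partial_fit(vec.transform(batch2), ["biology", "math"])

print(nb.predict(vec.transform(["derivative of a function"])))
```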

  19. An Efficient Ensemble Learning Method for Gene Microarray Classification

    Directory of Open Access Journals (Sweden)

    Alireza Osareh

    2013-01-01

    Full Text Available Gene microarray analysis and classification have proven an effective approach for the diagnosis of diseases and cancers. However, it has also been revealed that basic classification techniques have intrinsic drawbacks in achieving accurate gene classification and cancer diagnosis. On the other hand, classifier ensembles have received increasing attention in various applications. Here, we address the gene classification issue using the RotBoost ensemble methodology. This method is a combination of the Rotation Forest and AdaBoost techniques, which in turn preserves both desirable features of an ensemble architecture, that is, accuracy and diversity. To select a concise subset of informative genes, 5 different feature selection algorithms are considered. To assess the efficiency of RotBoost, other non-ensemble/ensemble techniques including Decision Trees, Support Vector Machines, Rotation Forest, AdaBoost, and Bagging are also deployed. Experimental results have revealed that the combination of the fast correlation-based feature selection method with the ICA-based RotBoost ensemble is highly effective for gene classification. In fact, the proposed method can create ensemble classifiers which outperform not only the classifiers produced by conventional machine learning but also the classifiers generated by two widely used conventional ensemble learning methods, that is, Bagging and AdaBoost.
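
    A simplified sketch of the RotBoost idea: each ensemble member trains AdaBoost on a differently rotated feature space (a PCA fitted on a bootstrap sample supplies the rotation). The real Rotation Forest rotates random feature subsets separately, which is omitted here.

```python
# RotBoost-style ensemble: rotation of the feature space + AdaBoost members.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(n_samples=200, n_features=20, random_state=0)
rng = np.random.default_rng(0)

members = []
for _ in range(5):
    boot = rng.integers(0, len(X), len(X))          # bootstrap sample
    rot = PCA().fit(X[boot])                        # rotation of feature space
    ada = AdaBoostClassifier(n_estimators=50).fit(rot.transform(X), y)
    members.append((rot, ada))

proba = np.mean([ada.predict_proba(rot.transform(X)) for rot, ada in members],
                axis=0)
pred = proba.argmax(axis=1)
print((pred == y).mean())
```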

  20. Digital watermarks in electronic document circulation

    Directory of Open Access Journals (Sweden)

    Vitaliy Grigorievich Ivanenko

    2017-07-01

    Full Text Available This paper reviews different protection methods for electronic documents, with their strengths and weaknesses. Common attacks on electronic documents are analyzed. Digital signatures and ways of eliminating their flaws are studied. Different digital watermark embedding methods are described, divided into two types. The proposed solution to the protection of electronic documents is based on embedding digital watermarks. A comparative analysis of these methods is given. As a result, the most convenient method is suggested: reversible data hiding. It is noted that this technique excels at securing the integrity of the container and its digital watermark. A digital watermark embedding system should prevent illegal access to the digital watermark and its container. Digital watermark requirements for electronic document protection are formulated. The legal aspect of copyright protection is reviewed. The advantages of embedding digital watermarks in electronic documents are outlined. Modern reversible data hiding techniques are studied. Distinctive features of digital watermark use in Russia are highlighted. A digital watermark serves as an additional layer of defense that is in most cases unknown to the violator. With an embedded digital watermark, it is impossible to misappropriate the authorship of the document, even if the intruder signs his name on it. Therefore, digital watermarks can act as an effective additional tool to protect electronic documents.
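
    A sketch of one classic reversible data hiding scheme (histogram shifting), chosen because the record recommends reversible data hiding: pixels at the histogram peak carry the bits, and the shift is undone exactly on extraction. This is illustrative, not any specific published variant.

```python
# Histogram-shifting reversible data hiding: embed bits, then restore exactly.
import numpy as np

def embed(img, bits):
    hist = np.bincount(img.ravel(), minlength=256)
    peak = int(hist.argmax())
    zero = peak + 1 + int(hist[peak + 1:].argmin())   # empty bin above the peak
    out = img.copy()
    flat = out.ravel()
    flat[(flat > peak) & (flat < zero)] += 1          # open a gap at peak + 1
    carriers = np.flatnonzero(flat == peak)[:len(bits)]
    flat[carriers] += np.asarray(bits, dtype=out.dtype)  # 0 stays, 1 -> peak + 1
    return out, peak, zero

def extract(marked, peak, zero, n_bits):
    flat = marked.ravel().copy()
    carriers = np.flatnonzero((flat == peak) | (flat == peak + 1))[:n_bits]
    bits = (flat[carriers] == peak + 1).astype(int).tolist()
    flat[carriers] = peak                              # remove the payload
    flat[(flat > peak + 1) & (flat <= zero)] -= 1      # undo the shift exactly
    return bits, flat.reshape(marked.shape)

rng = np.random.default_rng(0)
img = np.clip(rng.normal(100, 12, (64, 64)), 0, 255).astype(np.uint8)
marked, peak, zero = embed(img, [1, 0, 1, 1])
bits, restored = extract(marked, peak, zero, 4)
print(bits, np.array_equal(restored, img))
```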

  1. Phenotype classification of zebrafish embryos by supervised learning.

    Directory of Open Access Journals (Sweden)

    Nathalie Jeanray

    Full Text Available Zebrafish is increasingly used to assess biological properties of chemical substances and thus is becoming a specific tool for toxicological and pharmacological studies. The effects of chemical substances on embryo survival and development are generally evaluated manually through microscopic observation by an expert and documented by several typical photographs. Here, we present a methodology to automatically classify brightfield images of wild-type zebrafish embryos according to their defects by using an image analysis approach based on supervised machine learning. We show that, compared to manual classification, automatic classification results in 90 to 100% agreement with consensus voting of biological experts in nine out of eleven considered defects in 3-day-old zebrafish larvae. Automation of the analysis and classification of zebrafish embryo pictures reduces the workload and time required for the biological expert and increases the reproducibility and objectivity of this classification.

  2. Realizing parameterless automatic classification of remote sensing imagery using ontology engineering and cyberinfrastructure techniques

    Science.gov (United States)

    Sun, Ziheng; Fang, Hui; Di, Liping; Yue, Peng

    2016-09-01

    It has long been an unattainable dream for remote sensing experts to realize fully automatic image classification without inputting any parameter values. Experts usually spend hours and hours tuning the input parameters of classification algorithms in order to obtain the best results. With the rapid development of knowledge engineering and cyberinfrastructure, many data processing and knowledge reasoning capabilities have become online accessible, shareable and interoperable. Based on these recent improvements, this paper presents an idea of parameterless automatic classification which only requires an image and automatically outputs a labeled vector. No parameters or operations are needed from endpoint consumers. An approach is proposed to realize the idea. It adopts an ontology database to store the experience of tuning values for classifiers. A sample database is used to record training samples of image segments. Geoprocessing Web services are used as functionality blocks to carry out the basic classification steps. Workflow technology is involved to turn the overall image classification into a fully automatic process. A Web-based prototypical system named PACS (Parameterless Automatic Classification System) is implemented. A number of images are fed into the system for evaluation purposes. The results show that the approach can automatically classify remote sensing images with fairly good average accuracy. It is indicated that the classified results will be more accurate if the two databases are of higher quality. Once the experiences and samples in the databases accumulate to the level of a human expert, the approach should be able to produce results of similar quality to those a human expert can achieve. Since the approach is fully automatic and parameterless, it can not only relieve remote sensing workers from the heavy and time-consuming parameter tuning work, but also significantly shorten the waiting time for consumers and facilitate them to engage in image

  3. Music genre classification using temporal domain features

    Science.gov (United States)

    Shiu, Yu; Kuo, C.-C. Jay

    2004-10-01

    Music genre provides an efficient way to index songs in a music database, and can be used as an effective means to retrieve music of a similar type, i.e. content-based music retrieval. In addition to other features, the temporal domain features of a music signal are exploited in this research so as to increase the classification rate. Three temporal techniques are examined in depth. First, the hidden Markov model (HMM) is used to emulate the time-varying properties of music signals. Second, to further increase the classification rate, we propose another feature set that focuses on the residual part of music signals. Third, the overall classification rate is enhanced by classifying smaller segments from a test material individually and making the decision via majority voting. Experimental results are given to demonstrate the performance of the proposed techniques.
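
    The third technique reduces to a few lines: classify segments independently, then take a majority vote. The per-segment classifier below is a placeholder for the HMM-based one.

```python
# Majority voting over per-segment genre decisions.
from collections import Counter

def classify_segment(segment):
    """Stand-in for the per-segment HMM/feature classifier (hypothetical)."""
    return "jazz" if sum(segment) % 2 else "rock"

def classify_track(track, segment_len=4):
    segments = [track[i:i + segment_len]
                for i in range(0, len(track), segment_len)]
    votes = Counter(classify_segment(s) for s in segments)
    return votes.most_common(1)[0][0]

print(classify_track([3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]))
```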

  4. Interdisciplinary documentation of treatment side effects in oncology. Present status and perspectives

    International Nuclear Information System (INIS)

    Seegenschmiedt, M.H.

    1998-01-01

    Background: The documentation of acute and chronic treatment sequelae is a decisive precondition for the appropriate evaluation of the treatment quality of any cancer therapy. Methods and results: Interdisciplinary (inter)national efforts have resulted in a new consensus for recording of treatment sequelae in oncology. While the acute treatment side effects (day 1 to 90 after treatment) are recommended to be documented and evaluated using the Common Toxicity Criteria (CTC), for the chronic treatment side effects (day 91 and thereafter) the Late Effect Normal Tissue (LENT) criteria are to be implemented. The latter classification system allows differentiation between the Subjective, Objective, Management and Analytic (SOMA) toxicity aspects. Both classification systems can be implemented not only for clinical applications using radiotherapy or chemotherapy alone but also for combinations with each other or with other treatment modalities. This allows for an effective interdisciplinary comparison between different treatment concepts not only within each institution but also in multicenter trials. Conclusions: Prospective documentation and evaluation of treatment toxicity in oncology should be intensified and systematically included in future mono- and multi-institutional clinical trials. (orig.)

  5. Guidance on classification for reproductive toxicity under the globally harmonized system of classification and labelling of chemicals (GHS).

    Science.gov (United States)

    Moore, Nigel P; Boogaard, Peter J; Bremer, Susanne; Buesen, Roland; Edwards, James; Fraysse, Benoit; Hallmark, Nina; Hemming, Helena; Langrand-Lerche, Carole; McKee, Richard H; Meisters, Marie-Louise; Parsons, Paul; Politano, Valerie; Reader, Stuart; Ridgway, Peter; Hennes, Christa

    2013-11-01

    The Globally Harmonised System of Classification (GHS) is a framework within which the intrinsic hazards of substances may be determined and communicated. It is not a legislative instrument per se, but is enacted into national legislation with the appropriate legislative instruments. GHS covers many aspects of effects upon health and the environment, including adverse effects upon sexual function and fertility or on development. Classification for these effects is based upon observations in humans or from properly designed experiments in animals, although only the latter is covered herein. The decision to classify a substance based upon experimental data, and the category of classification ascribed, is determined by the level of evidence that is available for an adverse effect on sexual function and fertility or on development that does not arise as a secondary non-specific consequence of other toxic effects. This document offers guidance on the determination of the level of concern as a measure of adversity, and the level of evidence needed to ascribe classification based on data from tests in laboratory animals.

  6. Intelligent feature selection techniques for pattern classification of Lamb wave signals

    International Nuclear Information System (INIS)

    Hinders, Mark K.; Miller, Corey A.

    2014-01-01

    Lamb wave interaction with flaws is a complex, three-dimensional phenomenon, which often frustrates signal interpretation schemes based on mode arrival time shifts predicted by dispersion curves. As the flaw severity increases, scattering and mode conversion effects will often dominate the time-domain signals, obscuring available information about flaws because multiple modes may arrive on top of each other. Even for idealized flaw geometries the scattering and mode conversion behavior of Lamb waves is very complex. Here, multi-mode Lamb waves in a metal plate are propagated across a rectangular flat-bottom hole in a sequence of pitch-catch measurements corresponding to the double crosshole tomography geometry. The flaw is sequentially deepened, with the Lamb wave measurements repeated at each flaw depth. Lamb wave tomography reconstructions are used to identify which waveforms have interacted with the flaw and thereby carry information about its depth. Multiple features are extracted from each of the Lamb wave signals using wavelets, which are then fed to statistical pattern classification algorithms that identify flaw severity. In order to achieve the highest classification accuracy, an optimal feature space is required, but it is never known a priori which features are going to be best. For structural health monitoring we make use of the fact that physical flaws, such as corrosion, will only increase over time. This allows us to identify feature vectors which are topologically well-behaved by requiring that sequential classes “line up” in feature vector space. An intelligent feature selection routine is illustrated that identifies favorable class distributions in multi-dimensional feature spaces using computational homology theory. Betti numbers and formal classification accuracies are calculated for each feature space subset to establish a correlation between the topology of the class distribution and the corresponding classification accuracy
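
    An illustrative stand-in for the feature-space search: every two-feature subset is scored by cross-validated accuracy, in place of the paper's homology-based (Betti number) criterion for well-behaved class distributions. Data is synthetic.

```python
# Exhaustive scoring of small feature subsets for flaw-severity classification.
from itertools import combinations

import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.random((120, 6))                     # wavelet features per waveform
y = np.repeat([0, 1, 2], 40)                 # three flaw-severity classes

best = max(combinations(range(6), 2),
           key=lambda f: cross_val_score(KNeighborsClassifier(),
                                         X[:, f], y, cv=3).mean())
print("best feature pair:", best)
```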

  7. Construction accident narrative classification: An evaluation of text mining techniques.

    Science.gov (United States)

    Goh, Yang Miang; Ubeynarayana, C U

    2017-11-01

    Learning from past accidents is fundamental to accident prevention. Thus, accident and near miss reporting are encouraged by organizations and regulators. However, for organizations managing large safety databases, the time taken to accurately classify accident and near miss narratives will be very significant. This study aims to evaluate the utility of various text mining classification techniques in classifying 1000 publicly available construction accident narratives obtained from the US OSHA website. The study evaluated six machine learning algorithms, including support vector machine (SVM), logistic regression (LR), random forest (RF), k-nearest neighbor (KNN), decision tree (DT) and Naive Bayes (NB), and found that SVM produced the best performance in classifying the test set of 251 cases. Further experimentation with tokenization of the processed text and non-linear SVM were also conducted. In addition, a grid search was conducted on the hyperparameters of the SVM models. It was found that the best performing classifiers were linear SVM with unigram tokenization and radial basis function (RBF) SVM with unigram tokenization. In view of its relative simplicity, the linear SVM is recommended. Across the 11 labels of accident causes or types, the precision of the linear SVM ranged from 0.5 to 1, recall ranged from 0.36 to 0.9 and the F1 score was between 0.45 and 0.92. The reasons for misclassification were discussed and suggestions on ways to improve the performance were provided. Copyright © 2017 Elsevier Ltd. All rights reserved.
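
    A sketch of the recommended setup: tf-idf unigrams feeding a linear SVM, with a grid search over the regularisation constant. The narratives and label set are invented miniatures of the OSHA data.

```python
# Linear SVM with unigram tf-idf and a small hyperparameter grid search.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

narratives = ["worker fell from scaffold and fractured leg",
              "roofer fell through opening onto the floor below",
              "employee struck by falling beam",
              "laborer struck by reversing dump truck",
              "electrician received shock from live wire",
              "worker electrocuted while servicing a panel",
              "laborer caught between crane and wall",
              "mechanic caught in between conveyor rollers"]
causes = ["falls", "falls", "struck_by", "struck_by",
          "electrocution", "electrocution",
          "caught_in_between", "caught_in_between"]

pipe = Pipeline([("tfidf", TfidfVectorizer(ngram_range=(1, 1))),
                 ("svm", LinearSVC())])
grid = GridSearchCV(pipe, {"svm__C": [0.1, 1, 10]}, cv=2)
grid.fit(narratives, causes)
print(grid.best_params_, grid.predict(["carpenter fell off a ladder"]))
```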

  8. Document reconstruction by layout analysis of snippets

    Science.gov (United States)

    Kleber, Florian; Diem, Markus; Sablatnig, Robert

    2010-02-01

    Document analysis is performed to analyze entire forms (e.g. intelligent form analysis, table detection) or to describe the layout/structure of a document. Skew detection of scanned documents is also performed to support OCR algorithms that are sensitive to skew. In this paper, document analysis is applied to snippets of torn documents to calculate features for reconstruction. Documents can be destroyed either intentionally, to make the printed content unavailable (e.g. tax fraud investigation, business crime), or through time-induced degradation of ancient documents (e.g. bad storage conditions). Current reconstruction methods for manually torn documents deal with shape matching, inpainting and texture synthesis techniques. In this paper, the possibility of using document analysis techniques on snippets to support the matching algorithm with additional features is shown. This comprises a rotational analysis, a color analysis and a line detection. As future work it is planned to extend the feature set with the paper type (blank, checked, lined), the type of the writing (handwritten vs. machine printed) and the text layout of a snippet (text size, line spacing). Preliminary results show that these pre-processing steps can be performed reliably on a real dataset consisting of 690 snippets.

  9. ICRS Recommendation Document

    DEFF Research Database (Denmark)

    Roos, Ewa M.; Engelhart, Luella; Ranstam, Jonas

    2011-01-01

    Objective: The purpose of this article is to describe and recommend patient-reported outcome instruments for use in patients with articular cartilage lesions undergoing cartilage repair interventions. Methods: Nonsystematic literature search identifying measures addressing pain and function evaluated for validity and psychometric properties in patients with articular cartilage lesions. Results: The knee-specific instruments, titled the International Knee Documentation Committee Subjective Knee Form and the Knee injury and Osteoarthritis Outcome Score, both fulfill the basic constructs at all levels according to the International Classification of Functioning. Conclusions: Because there is no obvious superiority of either instrument at this time, both outcome measures are recommended for use in cartilage repair. Rescaling of the Lysholm Scoring Scale has been suggested

  10. Impact of corpus domain for sentiment classification: An evaluation study using supervised machine learning techniques

    Science.gov (United States)

    Karsi, Redouane; Zaim, Mounia; El Alami, Jamila

    2017-07-01

    Thanks to the development of the internet, a large community now has the possibility to communicate and express its opinions and preferences through multiple media such as blogs, forums, social networks and e-commerce sites. Today, it is becoming clearer that opinions published on the web are a very valuable source for decision-making, so a rapidly growing field of research called “sentiment analysis” has emerged to address the problem of automatically determining the polarity (positive, negative, neutral, …) of textual opinions. People expressing themselves in a particular domain often use specific domain language expressions; thus, building a classifier which performs well in different domains is a challenging problem. The purpose of this paper is to evaluate the impact of domain for sentiment classification when using machine learning techniques. In our study three popular machine learning techniques, Support Vector Machines (SVM), Naive Bayes and K-nearest neighbors (KNN), were applied on datasets collected from different domains. Experimental results show that Support Vector Machines outperforms the other classifiers in all domains, since it achieved at least 74.75% accuracy with a standard deviation of 4.08.
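
    The evaluation design can be shown in a few lines: train on one domain, score in-domain and cross-domain. Reviews are invented; real experiments would use the paper's datasets.

```python
# Cross-domain sentiment classification: train on books, test on hotels.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

books  = (["gripping plot, loved it", "dull and overlong", "beautiful prose",
           "characters felt flat"], [1, 0, 1, 0])
hotels = (["spotless room and kind staff", "noisy and dirty",
           "great location", "rude reception"], [1, 0, 1, 0])

model = make_pipeline(TfidfVectorizer(), LinearSVC()).fit(*books)
print("in-domain   :", model.score(*books))
print("cross-domain:", model.score(*hotels))
```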

  11. Towards an outcome documentation in manual medicine: a first proposal of the International Classification of Functioning, Disability and Health (ICF) intervention categories for manual medicine based on a Delphi survey.

    Science.gov (United States)

    Kirchberger, I; Stucki, G; Böhni, U; Cieza, A; Kirschneck, M; Dvorak, J

    2009-09-01

    The International Classification of Functioning, Disability and Health (ICF) provides a useful framework for the comprehensive description of the patients' functional health. The aim of this study was to identify the ICF categories that represent the patients' problems treated by manual medicine practitioners in order to facilitate its application in manual medicine. This selection of ICF categories could be used for assessment, treatment documentation and quality management in manual medicine practice. Swiss manual medicine experts were asked about the patients' problems commonly treated by manual medicine practitioners in a three-round survey using the Delphi technique. Responses were linked to the ICF. Forty-eight manual medicine experts gave a total of 808 responses that were linked to 225 different ICF categories; 106 ICF categories which reached an agreement of at least 50% among the participants in the final Delphi-round were included in the set of ICF Intervention Categories for Manual Medicine; 42 (40%) of the categories are assigned to the ICF component body functions, 36 (34%) represent the ICF component body structures and 28 (26%) the ICF component activities and participation. A first proposal of ICF Intervention Categories for Manual Medicine was defined and needs to be validated in further studies.

  12. Automated Generation of Technical Documentation and Provenance for Reproducible Research

    Science.gov (United States)

    Jolly, B.; Medyckyj-Scott, D.; Spiekermann, R.; Ausseil, A. G.

    2017-12-01

    Data provenance and detailed technical documentation are essential components of high-quality reproducible research; however, they are often only partially addressed during a research project. Recording and maintaining this information during the course of a project can be a difficult task to get right, as it is a time-consuming and often boring process for the researchers involved. As a result, provenance records and technical documentation provided alongside research results can be incomplete or may not be completely consistent with the actual processes followed. While providing access to the data and code used by the original researchers goes some way toward enabling reproducibility, this does not count as, or replace, data provenance. Additionally, it can be a poor substitute for good technical documentation and is often more difficult for a third party to understand - particularly if they do not understand the programming language(s) used. We present and discuss a tool built from the ground up for the production of well-documented and reproducible spatial datasets that are created by applying a series of classification rules to a number of input layers. The internal model of the classification rules required by the tool to process the input data is exploited to also produce technical documentation and provenance records with minimal additional user input. Available provenance records that accompany input datasets are incorporated into those that describe the current process. As a result, each time a new iteration of the analysis is performed the documentation and provenance records are re-generated to provide an accurate description of the exact process followed. The generic nature of this tool, and the lessons learned during its creation, have wider application to other fields where the production of derivative datasets must be done in an open, defensible, and reproducible way.
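
    A toy sketch of the central idea: because the rule model itself produces the dataset, the provenance record can be regenerated on every run from the same model. Rule names and fields are illustrative, not the tool's actual schema.

```python
# Rule-based classification that emits its own provenance on every run.
import datetime
import json

rules = [("steep_forest", lambda c: c["slope"] > 30 and c["cover"] == "forest"),
         ("flat_pasture", lambda c: c["slope"] <= 30 and c["cover"] == "grass")]

def classify(cells):
    provenance = {
        "run_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "rules_applied": [name for name, _ in rules],
        "inputs": sorted({k for cell in cells for k in cell}),
    }
    labels = [next((name for name, test in rules if test(cell)), "unclassified")
              for cell in cells]
    return labels, provenance

labels, prov = classify([{"slope": 35, "cover": "forest"},
                         {"slope": 5, "cover": "grass"}])
print(labels)
print(json.dumps(prov, indent=2))   # regenerated documentation for this run
```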

  13. Advances in oriental document analysis and recognition techniques

    CERN Document Server

    Lee, Seong-Whan

    1999-01-01

    In recent years, rapid progress has been made in computer processing of oriental languages, and the research developments in this area have resulted in tremendous changes in handwriting processing, printed oriental character recognition, document analysis and recognition, automatic input methodologies for oriental languages, etc. Advances in computer processing of oriental languages can also be seen in multimedia computing and the World Wide Web. Many of the results in those domains are presented in this book.

  14. The classification of the Ricci tensor in the general theory of relativity

    International Nuclear Information System (INIS)

    Cormack, W.J.

    1979-10-01

    A comprehensive classification of the Ricci tensor in General Relativity using several techniques is given, and its connection with existing classifications is studied under the headings: canonical forms for the Ricci tensor, invariant 2-spaces in the classification of the Ricci tensor, Riemannian curvature and the classification of the Riemann and Ricci tensors, and spinor classifications of the Ricci tensor. (U.K.)
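
    For orientation, the algebraic starting point shared by these approaches (standard material, not the paper's own notation) is the eigenvalue problem of the mixed Ricci tensor:

```latex
% Eigenvalue problem underlying the canonical forms of the Ricci tensor:
\begin{equation}
  R^{a}{}_{b}\, v^{b} = \lambda\, v^{a}
\end{equation}
% The Segre notation records the eigenvalue multiplicities and invariant
% subspaces, e.g. [1,111] for four distinct eigenvalues (the comma separating
% the timelike eigendirection), degenerating to types such as [(1,1)11] or
% [211] as eigenvalues coincide or elementary divisors grow.
```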

  15. Olive oil sensory defects classification with data fusion of instrumental techniques and multivariate analysis (PLS-DA).

    Science.gov (United States)

    Borràs, Eva; Ferré, Joan; Boqué, Ricard; Mestres, Montserrat; Aceña, Laura; Calvo, Angels; Busto, Olga

    2016-07-15

    Three instrumental techniques, headspace-mass spectrometry (HS-MS), mid-infrared spectroscopy (MIR) and UV-visible spectrophotometry (UV-vis), have been combined to classify virgin olive oil samples based on the presence or absence of sensory defects. The reference sensory values were provided by an official taste panel. Different data fusion strategies were studied to improve the discrimination capability compared to using each instrumental technique individually. A general model was applied to discriminate high-quality non-defective olive oils (extra-virgin) and the lowest-quality olive oils considered non-edible (lampante). A specific identification of key off-flavours, such as musty, winey, fusty and rancid, was also studied. The data fusion of the three techniques improved the classification results in most of the cases. Low-level data fusion was the best strategy to discriminate musty, winey and fusty defects, using HS-MS, MIR and UV-vis, and the rancid defect using only HS-MS and MIR. The mid-level data fusion approach using partial least squares-discriminant analysis (PLS-DA) scores was found to be the best strategy for defective vs non-defective and edible vs non-edible oil discrimination. However, the data fusion did not sufficiently improve the results obtained by a single technique (HS-MS) to classify non-defective classes. These results indicate that instrumental data fusion can be useful for the identification of sensory defects in virgin olive oils. Copyright © 2016 Elsevier Ltd. All rights reserved.
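
    A sketch of the two fusion strategies compared, with PLSRegression on one-hot class labels standing in for PLS-DA: low-level fusion concatenates the raw signals; mid-level fusion concatenates per-technique PLS scores. Data is synthetic.

```python
# Low-level vs mid-level data fusion for a two-class PLS-DA problem.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
y = np.repeat([0, 1], 20)                        # defective vs non-defective
Y = np.eye(2)[y]                                 # one-hot targets for PLS-DA
ms, mir, uv = (rng.random((40, k)) for k in (50, 80, 30))

# Low-level fusion: concatenate raw data, then one PLS-DA model.
low = PLSRegression(n_components=3).fit(np.hstack([ms, mir, uv]), Y)

# Mid-level fusion: extract scores per technique, then fuse the scores.
scores = np.hstack([PLSRegression(n_components=3).fit(block, Y).transform(block)
                    for block in (ms, mir, uv)])
mid = PLSRegression(n_components=3).fit(scores, Y)

pred = mid.predict(scores).argmax(axis=1)        # class with largest response
print((pred == y).mean())
```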

  16. Scientific and General Subject Classifications in the Digital World

    CERN Document Server

    De Robbio, Antonella; Marini, A

    2001-01-01

    In the present work we discuss opportunities, problems, tools and techniques encountered when interconnecting discipline-specific subject classifications, primarily organized as search devices in bibliographic databases, with general classifications originally devised for book shelving in public libraries. We first state the fundamental distinction between topical (or subject) classifications and object classifications. Then we trace the structural limitations that have constrained subject classifications since their library origins, and the devices that were used to overcome the gap with genuine knowledge representation. After recalling some general notions on structure, dynamics and interferences of subject classifications and of the objects they refer to, we sketch a synthetic overview on discipline-specific classifications in Mathematics, Computing and Physics, on one hand, and on general classifications on the other. In this setting we present The Scientific Classifications Page, which collects groups of...

  17. 76 FR 69034 - Microbiology Devices; Classification of In Vitro Diagnostic Device for Yersinia Species Detection

    Science.gov (United States)

    2011-11-07

    ... Drug Administration 21 CFR Part 866 Microbiology Devices; Classification of In Vitro Diagnostic Device... CFR Part 866 [Docket No. FDA-2011-N-0729] Microbiology Devices; Classification of In Vitro Diagnostic... of the Microbiology Devices Advisory Panel (the panel). FDA is publishing in this document the...

  18. An Introduction to the Nuclear Document Crawling System

    International Nuclear Information System (INIS)

    Tae, Jae Woong; Yoon, Sung Ho; Shin, Dong Hoon

    2016-01-01

    The NSG (Nuclear Suppliers Group) guidelines state that controls on 'technology' transfer do not apply to information 'in the public domain' or to 'basic scientific research'. According to the guidelines, 'basic scientific research' is experimental or theoretical work undertaken principally to acquire new knowledge of the fundamental principles of phenomena and observable facts, not primarily directed towards a specific practical aim or objective. 'Technology in the public domain' means 'technology' or 'software' that has been made available without restrictions upon its further dissemination. It is difficult to determine whether a document is in the public domain or constitutes basic scientific research because the criteria are ambiguous and unclear. In this paper, we propose an approach to determine whether a document is open to the public or is basic scientific research, and we introduce the document crawling system we developed to collect open documents on the web. Open documents can thus be taken into the review process in a new way, which helps prevent reviewers from classifying an open document as a strategic technology and is expected to improve the reliability of classification results.

  19. An Introduction to the Nuclear Document Crawling System

    Energy Technology Data Exchange (ETDEWEB)

    Tae, Jae Woong; Yoon, Sung Ho; Shin, Dong Hoon [Korea Institute of Nuclear Nonproliferation and Control, Daejeon (Korea, Republic of)

    2016-05-15

    The NSG (Nuclear Suppliers Group) guidelines state that controls on 'technology' transfer do not apply to information 'in the public domain' or to 'basic scientific research'. According to the guidelines, 'basic scientific research' is experimental or theoretical work undertaken principally to acquire new knowledge of the fundamental principles of phenomena and observable facts, not primarily directed towards a specific practical aim or objective. 'Technology in the public domain' means 'technology' or 'software' that has been made available without restrictions upon its further dissemination. It is difficult to determine whether a document is in the public domain or constitutes basic scientific research because the criteria are ambiguous and unclear. In this paper, we propose an approach to determine whether a document is open to the public or is basic scientific research, and we introduce the document crawling system we developed to collect open documents on the web. Open documents can thus be taken into the review process in a new way, which helps prevent reviewers from classifying an open document as a strategic technology and is expected to improve the reliability of classification results.

  20. CLASSIFICATION OF THE MGR MUCK HANDLING SYSTEM

    International Nuclear Information System (INIS)

    R. Garrett

    1999-01-01

    The purpose of this analysis is to document the Quality Assurance (QA) classification of the Monitored Geologic Repository (MGR) muck handling system structures, systems and components (SSCs) performed by the MGR Safety Assurance Department. This analysis also provides the basis for revision of YMP/90-55Q, Q-List (YMP 1998). The Q-List identifies those MGR SSCs subject to the requirements of DOE/RW-0333P, ''Quality Assurance Requirements and Description'' (QARD) (DOE 1998). This QA classification incorporates the current MGR design and the results of the ''Preliminary Preclosure Design Basis Event Calculations for the Monitored Geologic Repository'' (CRWMS M&O 1998a).

  1. Field sampling and data analysis methods for development of ecological land classifications: an application on the Manistee National Forest.

    Science.gov (United States)

    George E. Host; Carl W. Ramm; Eunice A. Padley; Kurt S. Pregitzer; James B. Hart; David T. Cleland

    1992-01-01

    Presents technical documentation for development of an Ecological Classification System for the Manistee National Forest in northwest Lower Michigan, and suggests procedures applicable to other ecological land classification projects. Includes discussion of sampling design, field data collection, data summarization and analyses, development of classification units,...

  2. Effectiveness of Multivariate Time Series Classification Using Shapelets

    Directory of Open Access Journals (Sweden)

    A. P. Karpenko

    2015-01-01

    Full Text Available Typically, time series classifiers require signal pre-processing (filtering signals from noise, artifact removal, etc.), enhancement of signal features (amplitude, frequency, spectrum, etc.), and classification of signal features in space using classical techniques and classification algorithms for multivariate data. We consider a method of classifying time series which does not require enhancement of the signal features. The method uses shapelets of time series (time series shapelets), i.e. small fragments of the series which best reflect the properties of one of its classes. Despite the significant number of publications on the theory and applications of shapelets for time series classification, the task of evaluating the effectiveness of this technique remains relevant. The objective of this publication is to study the effectiveness of a number of modifications of the original shapelet method as applied to multivariate series classification, which is a little-studied problem. The paper presents the problem statement of multivariate time series classification using shapelets and describes the shapelet-based basic method of binary classification, as well as various generalizations and a proposed modification of the method. It also offers software that implements the modified method, and results of computational experiments confirming the effectiveness of the algorithmic and software solutions. The paper shows that the modified method and its software allow a classification accuracy of about 85% to be reached at best. The shapelet search time increases in proportion to the input data dimension.
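    The core primitive of the method is the distance from a series to a shapelet: the minimum Euclidean distance over all equal-length subsequences. A minimal sketch follows, with a hand-picked shapelet and threshold standing in for the values that the shapelet search and training would normally provide.

```python
import numpy as np

def shapelet_distance(series, shapelet):
    """Minimum Euclidean distance between the shapelet and every
    equal-length subsequence of the series."""
    m = len(shapelet)
    windows = np.lib.stride_tricks.sliding_window_view(series, m)
    return float(np.min(np.linalg.norm(windows - shapelet, axis=1)))

# Toy binary classifier: class 1 if the series contains a fragment
# close to the shapelet (both values below are assumed, not learned).
shapelet = np.array([0.0, 1.0, 0.0])
threshold = 0.5

def classify(series):
    return 1 if shapelet_distance(series, shapelet) < threshold else 0

print(classify(np.array([0.0, 0.0, 1.0, 0.0, 0.0])))  # 1: spike present
print(classify(np.zeros(5)))                          # 0: no spike
```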

  3. Steganalysis Techniques for Documents and Images

    Science.gov (United States)

    2005-05-01

    steganography. We then illustrated the efficacy of our model using variations of LSB steganography. For binary images, we have made significant progress in... efforts have focused on two areas. The first area is LSB steganalysis for grayscale images. Here, as we had proposed (as a challenging task), we have... generalized our previous steganalysis technique of sample pair analysis to a theoretical framework for the detection of LSB steganography. The new

  4. Deep learning for image classification

    Science.gov (United States)

    McCoppin, Ryan; Rizki, Mateen

    2014-06-01

    This paper provides an overview of deep learning and introduces several of its subfields, including a tutorial on convolutional neural networks. Traditional methods for learning image features are compared to deep learning techniques. In addition, we present our preliminary classification results from a basic implementation of a convolutional restricted Boltzmann machine on the Mixed National Institute of Standards and Technology (MNIST) database, and we explain how deep learning networks can assist in the development of a robust gender classification system.

  5. Authorization Basis Safety Classification of Transfer Bay Bridge Crane at the 105-K Basins

    International Nuclear Information System (INIS)

    CHAFFEE, G.A.

    2000-01-01

    This supporting document provides the bases for the safety classification of the K Basin transfer bay bridge crane and for the Structures, Systems, and Components (SSC) safety classification. A table is presented that delineates the safety significant components. This safety classification is based on a review of the Authorization Basis (AB) performed regarding AB and design baseline issues. The primary issues are: (1) What is the AB for the safety classification of the transfer bay bridge crane? (2) What does the SSC safety classification ''Safety Significant'' or ''Safety Significant for Design Only'' mean for design requirements and for quality requirements for procurement, installation and maintenance (including replacement of parts) activities for the crane during its expected lifetime? The AB information on the crane was identified based on a review of Department of Energy-Richland Office (RL) and Spent Nuclear Fuel (SNF) Project correspondence, the K Basin Safety Analysis Report (SAR), and RL Safety Evaluation Reports (SERs) of SNF Project SAR submittals. The relevant correspondence, actions and activities taken, and substantive directions or conclusions of these documents are provided in Appendix A.

  6. Can the Ni classification of vessels predict neoplasia?

    DEFF Research Database (Denmark)

    Mehlum, Camilla Slot; Rosenberg, Tine; Dyrvig, Anne-Kirstine

    2018-01-01

    OBJECTIVES: The Ni classification of vascular change from 2011 is well documented for evaluating pharyngeal and laryngeal lesions, primarily focusing on cancer. In the planning of surgery it may be more relevant to differentiate neoplasia from non-neoplasia. We aimed to evaluate the ability of the Ni classification to predict laryngeal or hypopharyngeal neoplasia and to investigate if a changed cutoff value would support the recent European Laryngological Society (ELS) proposal of perpendicular vascular changes as indicative of neoplasia. DATA SOURCES: PubMed, Embase, Cochrane, and Scopus. The pooled sensitivity and specificity of the Ni classification with two different cutoffs were calculated, and bubble and summary receiver operating characteristics plots were created. RESULTS: The combined sensitivity of five studies (n = 687) with Ni type IV-V defined as test-positive was 0.89 (95...

  7. Automatic Hierarchical Color Image Classification

    Directory of Open Access Journals (Sweden)

    Jing Huang

    2003-02-01

    Full Text Available Organizing images into semantic categories can be extremely useful for content-based image retrieval and image annotation. Grouping images into semantic classes is a difficult problem, however. Image classification attempts to solve this hard problem by using low-level image features. In this paper, we propose a method for hierarchical classification of images via supervised learning. This scheme relies on using a good low-level feature and subsequently performing feature-space reconfiguration using singular value decomposition to reduce noise and dimensionality. We use the training data to obtain a hierarchical classification tree that can be used to categorize new images. Our experimental results suggest that this scheme not only performs better than standard nearest-neighbor techniques, but also has both storage and computational advantages.
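    A minimal sketch of the feature-space reconfiguration step: truncated SVD reduces noise and dimensionality before a nearest-neighbour classifier is applied. The random features, component count, and classifier settings are illustrative assumptions; the construction of the hierarchical classification tree itself is not shown.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical data: one low-level colour feature vector per image.
rng = np.random.default_rng(0)
X_train, y_train = rng.random((200, 64)), rng.integers(0, 4, 200)
X_test = rng.random((10, 64))

# Reconfigure the feature space with truncated SVD to suppress noise
# and reduce dimensionality, then classify by nearest neighbours.
svd = TruncatedSVD(n_components=16, random_state=0)
Z_train = svd.fit_transform(X_train)
Z_test = svd.transform(X_test)

clf = KNeighborsClassifier(n_neighbors=5).fit(Z_train, y_train)
print(clf.predict(Z_test))
```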

  8. Computer Aided Design for Soil Classification Relational Database ...

    African Journals Online (AJOL)

    The paper focuses on the problems associated with classification, storage and retrieval of information on soil data, such as the incompatibility of soil data semantics, inadequate documentation, and lack of indexing, which make it difficult to efficiently access large databases. Consequently, information on soil is very difficult ...

  9. Evaluation of Advanced Signal Processing Techniques to Improve Detection and Identification of Embedded Defects

    Energy Technology Data Exchange (ETDEWEB)

    Clayton, Dwight A. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Santos-Villalobos, Hector J. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Baba, Justin S. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

    2016-09-01

    , or an improvement in contrast over conventional SAFT reconstructed images. This report documents our efforts on four fronts: 1) a comparative study between traditional SAFT and FBD SAFT for concrete specimens with and without Alkali-Silica Reaction (ASR) damage, 2) improvement of our Model-Based Iterative Reconstruction (MBIR) for thick reinforced concrete [5], 3) development of a universal framework for sharing, reconstruction, and visualization of ultrasound NDE datasets, and 4) application of machine learning techniques for automated detection of ASR inside concrete. Our comparative study between FBD and traditional SAFT reconstruction images shows a clear difference between images of ASR and non-ASR specimens. In particular, the left first harmonic shows increased contrast and sensitivity to ASR damage. For MBIR, we show the superiority of model-based techniques over delay-and-sum techniques such as SAFT. Improvements include the elimination of artifacts caused by direct arrival signals, and increased contrast and Signal to Noise Ratio. For the universal framework, we document a format for data storage based on the HDF5 file format, and propose a modular Graphical User Interface (GUI) for easy customization of data conversion, reconstruction, and visualization routines. Finally, two techniques for automated ASR detection are presented. The first technique is based on an analysis of the frequency content using the Hilbert Transform Indicator (HTI), and the second employs Artificial Neural Network (ANN) techniques for training and classification of ultrasound data into ASR-damaged and non-ASR-damaged classes. The ANN technique shows great potential, with classification accuracy above 95%. These approaches are extensible to the detection of additional defects and damage in reinforced, thick concrete.

  10. Fully Convolutional Networks for Ground Classification from LIDAR Point Clouds

    Science.gov (United States)

    Rizaldy, A.; Persello, C.; Gevaert, C. M.; Oude Elberink, S. J.

    2018-05-01

    Deep Learning has been massively used for image classification in recent years. The use of deep learning for ground classification from LIDAR point clouds has also been recently studied. However, point clouds need to be converted into an image in order to use Convolutional Neural Networks (CNNs). In state-of-the-art techniques, this conversion is slow because each point is converted into a separate image. This approach leads to highly redundant computation during conversion and classification. The goal of this study is to design a more efficient data conversion and ground classification. This goal is achieved by first converting the whole point cloud into a single image. The classification is then performed by a Fully Convolutional Network (FCN), a modified version of CNN designed for pixel-wise image classification. The proposed method is significantly faster than state-of-the-art techniques. On the ISPRS Filter Test dataset, it is 78 times faster for conversion and 16 times faster for classification. Our experimental analysis on the same dataset shows that the proposed method results in 5.22 % of total error, 4.10 % of type I error, and 15.07 % of type II error. Compared to the previous CNN-based technique and LAStools software, the proposed method reduces the total error and type I error (while type II error is slightly higher). The method was also tested on very high point density LIDAR point clouds, resulting in 4.02 % of total error, 2.15 % of type I error and 6.14 % of type II error.
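    The speed-up rests on converting the whole point cloud into one image instead of one image per point. Below is a toy sketch of such a conversion, assuming a simple minimum-height rasterisation onto a horizontal grid; the features actually fed to the FCN, which then labels each pixel as ground or non-ground, may differ.

```python
import numpy as np

def rasterize(points, cell=1.0):
    """Convert a whole point cloud (N x 3 array of x, y, z) into a single
    2-D image whose pixel value is the lowest z falling in each cell."""
    xy = ((points[:, :2] - points[:, :2].min(axis=0)) / cell).astype(int)
    h, w = xy[:, 1].max() + 1, xy[:, 0].max() + 1
    img = np.full((h, w), np.nan)
    for (x, y), z in zip(xy, points[:, 2]):
        img[y, x] = z if np.isnan(img[y, x]) else min(img[y, x], z)
    return img

cloud = np.random.rand(1000, 3) * [100, 100, 10]  # synthetic point cloud
image = rasterize(cloud)                          # one image for the FCN
print(image.shape)
```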

  11. Classification of coronary artery bifurcation lesions and treatments: Time for a consensus!

    DEFF Research Database (Denmark)

    Louvard, Yves; Thomas, Martyn; Dzavik, Vladimir

    2007-01-01

    by intention to treat, it is necessary to clearly define which vessel is the distal main branch and which is (are) the side branch(es), and to give each branch a distinct name. Each segment of the bifurcation has been named following the same pattern as the Medina classification. The classification......, heterogeneity, and inadequate description of techniques implemented. Methods: The aim is to propose a consensus established by the European Bifurcation Club (EBC) on the definition and classification of bifurcation lesions and treatments implemented, with the purpose of allowing comparisons between techniques...... in various anatomical and clinical settings. Results: A bifurcation lesion is a coronary artery narrowing occurring adjacent to, and/or involving, the origin of a significant side branch. The simple lesion classification proposed by Medina has been adopted. To analyze the outcomes of different techniques...

  12. Novelty detection for breast cancer image classification

    Science.gov (United States)

    Cichosz, Pawel; Jagodziński, Dariusz; Matysiewicz, Mateusz; Neumann, Łukasz; Nowak, Robert M.; Okuniewski, Rafał; Oleszkiewicz, Witold

    2016-09-01

    Using classification learning algorithms for medical applications may require not only refined model creation techniques and careful unbiased model evaluation, but also detecting the risk of misclassification at the time of model application. This is addressed by novelty detection, which identifies instances for which the training set is not sufficiently representative and for which it may be safer to restrain from classification and request a human expert diagnosis. The paper investigates two techniques for isolated instance identification, based on clustering and one-class support vector machines, which represent two different approaches to multidimensional outlier detection. The prediction quality for isolated instances in breast cancer image data is evaluated using the random forest algorithm and found to be substantially inferior to the prediction quality for non-isolated instances. Each of the two techniques is then used to create a novelty detection model which can be combined with a classification model and used at the time of prediction to detect instances for which the latter cannot be reliably applied. Novelty detection is demonstrated to improve random forest prediction quality and argued to deserve further investigation in medical applications.
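    A bare-bones sketch of the second approach: a one-class SVM fitted on the training data flags isolated instances, and the classifier defers on them. The data, the choice of random forest, and the nu parameter are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(1)
X_train, y_train = rng.normal(size=(300, 10)), rng.integers(0, 2, 300)

clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)
novelty = OneClassSVM(nu=0.05, gamma="scale").fit(X_train)

def predict_or_defer(x):
    """Return a class label, or defer to a human expert when the
    instance looks isolated from the training distribution."""
    if novelty.predict(x.reshape(1, -1))[0] == -1:
        return "defer to expert"
    return int(clf.predict(x.reshape(1, -1))[0])

print(predict_or_defer(X_train[0]))        # in-distribution: a label
print(predict_or_defer(np.full(10, 8.0)))  # far outlier: defer
```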

  13. The research on business rules classification and specification methods

    OpenAIRE

    Baltrušaitis, Egidijus

    2005-01-01

    The work is based on research into business rules classification and specification methods. The basics of the business rules approach are discussed. The most common business rules classification and modeling methods are analyzed. Business rules modeling techniques, and tools for supporting them in information systems, are presented. Based on the analysis results, a business rules classification method is proposed. Templates for every business rule type are presented. Business rules structuring ...

  14. Documenting Penicillin Allergy: The Impact of Inconsistency

    Science.gov (United States)

    Shah, Nirav S.; Ridgway, Jessica P.; Pettit, Natasha; Fahrenbach, John; Robicsek, Ari

    2016-01-01

    Background Allergy documentation is frequently inconsistent and incomplete. The impact of this variability on subsequent treatment is not well described. Objective To determine how allergy documentation affects subsequent antibiotic choice. Design Retrospective cohort study. Participants 232,616 adult patients seen by 199 primary care providers (PCPs) between January 1, 2009 and January 1, 2014 at an academic medical system. Main Measures Inter-physician variation in beta-lactam allergy documentation; antibiotic treatment following beta-lactam allergy documentation. Key Results 15.6% of patients had a reported beta-lactam allergy. Of those patients, 39.8% had a specific allergen identified and 22.7% had allergic reaction characteristics documented. Variation between PCPs was greater than would be expected by chance (all p…) in the identification of a specific allergen (e.g., “penicillins”) (24.0% to 58.2%) and in the documentation of the reaction characteristics (5.4% to 51.9%). After beta-lactam allergy documentation, patients were less likely to receive penicillins (Relative Risk [RR] 0.16 [95% Confidence Interval: 0.15–0.17]) and cephalosporins (RR 0.28 [95% CI 0.27–0.30]) and more likely to receive fluoroquinolones (RR 1.5 [95% CI 1.5–1.6]), clindamycin (RR 3.8 [95% CI 3.6–4.0]) and vancomycin (RR 5.0 [95% CI 4.3–5.8]). Among patients with beta-lactam allergy, rechallenge was more likely when a specific allergen was identified (RR 1.6 [95% CI 1.5–1.8]) and when reaction characteristics were documented (RR 2.0 [95% CI 1.8–2.2]). Conclusions Provider documentation of beta-lactam allergy is highly variable, and details of the allergy are infrequently documented. Classification of a patient as beta-lactam allergic and incomplete documentation regarding the details of the allergy lead to beta-lactam avoidance and use of other antimicrobial agents, behaviors that may adversely impact care quality and cost. PMID:26981866

  15. Histopathological Breast Cancer Image Classification by Deep Neural Network Techniques Guided by Local Clustering.

    Science.gov (United States)

    Nahid, Abdullah-Al; Mehrabi, Mohamad Ali; Kong, Yinan

    2018-01-01

    Breast cancer is a serious threat and one of the largest causes of death of women throughout the world. The identification of cancer largely depends on digital biomedical photography analysis, such as the analysis of histopathological images by doctors and physicians. Analyzing histopathological images is a nontrivial task, and decisions from investigation of these kinds of images always require specialised knowledge. However, Computer Aided Diagnosis (CAD) techniques can help the doctor make more reliable decisions. The state-of-the-art Deep Neural Network (DNN) has recently been introduced for biomedical image analysis. Normally each image contains structural and statistical information. This paper classifies a set of biomedical breast cancer images (the BreakHis dataset) using novel DNN techniques guided by structural and statistical information derived from the images. Specifically, a Convolutional Neural Network (CNN), a Long Short-Term Memory (LSTM) network, and a combination of CNN and LSTM are proposed for breast cancer image classification. Softmax and Support Vector Machine (SVM) layers have been used for the decision-making stage after extracting features utilising the proposed novel DNN models. In this experiment the best accuracy value of 91.00% is achieved on the 200x dataset, the best precision value of 96.00% is achieved on the 40x dataset, and the best F-measure value is achieved on both the 40x and 100x datasets.

  16. CLASSIFICATION OF THE MGR WASTE HANDLING BUILDING VENTILATION SYSTEM

    International Nuclear Information System (INIS)

    J.A. Ziegler

    2000-01-01

    The purpose of this analysis is to document the Quality Assurance (QA) classification of the Monitored Geologic Repository (MGR) waste handling building ventilation system structures, systems and components (SSCs) performed by the MGR Preclosure Safety and Systems Engineering Section. This analysis also provides the basis for revision of YMP/90-55Q, Q-List (YMP 2000). The Q-List identifies those MGR SSCs subject to the requirements of DOE/RW-0333P, ''Quality Assurance Requirements and Description'' (QARD) (DOE 2000). This QA classification incorporates the current MGR design and the results of the ''Design Basis Event Frequency and Dose Calculation for Site Recommendation'' (CRWMS M&O 2000a) and ''Bounding Individual Category 1 Design Basis Event Dose Calculation to Support Quality Assurance Classification'' (Gwyn 2000).

  17. REAL-TIME INTELLIGENT MULTILAYER ATTACK CLASSIFICATION SYSTEM

    Directory of Open Access Journals (Sweden)

    T. Subbhulakshmi

    2014-01-01

    Full Text Available Intrusion Detection Systems (IDS) take the lion's share of the current security infrastructure. Detection of intrusions is vital for initiating the defensive procedures. Intrusion detection was traditionally done by statistical and distance-based methods, which use a threshold value to indicate the level of normalcy; when the network traffic crosses that level, it is flagged as anomalous. Statistical techniques cannot detect new intrusion events, which are increasingly a key concern for system security. To overcome this issue, learning techniques are used, which help in identifying new intrusion activities in a computer system. The objective of the system proposed in this paper is to classify intrusions using an Intelligent Multi-Layered Attack Classification System (IMLACS), which helps in detecting and classifying intrusions with improved classification accuracy. The intelligent multi-layered approach contains three intelligent layers. The first layer involves binary Support Vector Machine classification for detecting normal traffic and attacks. The second layer involves neural network classification to classify the attacks into classes of attacks. The third layer involves a fuzzy inference system to classify the attacks into various subclasses. The proposed IMLACS is able to detect intrusion behavior in networks since the system contains three intelligent classification layers and a better set of rules. Feature selection is also used to improve the time of detection. The experimental results show that IMLACS achieves a classification rate of 97.31%.
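    A compressed sketch of the layered architecture with synthetic traffic features; only the first two layers are shown (the fuzzy third layer and feature selection are omitted), so this illustrates the cascade rather than IMLACS itself.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

# Synthetic stand-ins for network-traffic features and labels.
rng = np.random.default_rng(2)
X = rng.normal(size=(500, 20))
is_attack = rng.integers(0, 2, 500)     # layer-1 labels: normal vs attack
attack_class = rng.integers(0, 4, 500)  # layer-2 labels: attack type

# Layer 1: binary SVM separating normal traffic from attacks.
layer1 = SVC().fit(X, is_attack)

# Layer 2: neural network assigning attack records to attack classes.
mask = is_attack == 1
layer2 = MLPClassifier(max_iter=500, random_state=0).fit(
    X[mask], attack_class[mask])

def classify(x):
    x = x.reshape(1, -1)
    if layer1.predict(x)[0] == 0:
        return "normal"
    return f"attack class {layer2.predict(x)[0]}"

print(classify(X[0]))
```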

  18. BOREAS TE-18 Landsat TM Maximum Likelihood Classification Image of the NSA

    Science.gov (United States)

    Hall, Forrest G. (Editor); Knapp, David

    2000-01-01

    The BOREAS TE-18 team focused its efforts on using remotely sensed data to characterize the successional and disturbance dynamics of the boreal forest for use in carbon modeling. The objective of this classification is to provide the BOREAS investigators with a data product that characterizes the land cover of the NSA. A Landsat-5 TM image from 20-Aug-1988 was used to derive this classification. A standard supervised maximum likelihood classification approach was used to produce this classification. The data are provided in a binary image format file. The data files are available on a CD-ROM (see document number 20010000884), or from the Oak Ridge National Laboratory (ORNL) Distributed Activity Archive Center (DAAC).

  19. A Data Mining Classification Approach for Behavioral Malware Detection

    Directory of Open Access Journals (Sweden)

    Monire Norouzi

    2016-01-01

    Full Text Available Data mining techniques have numerous applications in malware detection. Classification is one of the most popular data mining techniques. In this paper we present a data mining classification approach to detect malware behavior. We propose different classification methods in order to detect malware based on the features and behavior of each malware. A dynamic analysis method is presented for identifying the malware features. A program is presented for converting a malware behavior execution history XML file into a suitable input for the WEKA tool. To illustrate the performance efficiency on training and test data, we apply the proposed approaches to a real case-study data set using the WEKA tool. The evaluation results demonstrate the effectiveness of the proposed data mining approach. Our proposed approach is also more efficient for detecting malware, and behavioral classification of malware can be useful for detecting malware in a behavioral antivirus.

  20. Available Tools and Challenges Classifying Cutting-Edge and Historical Astronomical Documents

    Science.gov (United States)

    Lagerstrom, Jill

    2015-08-01

    The STScI Library assists the Science Policies Division in evaluating and choosing scientific keywords and categories for proposals for the Hubble Space Telescope mission and the upcoming James Webb Space Telescope mission. In addition, we are often faced with the question “what is the shape of the astronomical literature?” However, subject classification in astronomy has not been cultivated in recent times. This talk will address the available tools and challenges of classifying cutting-edge as well as historical astronomical documents. In the process, we will give an overview of current and upcoming practices of subject classification in astronomy.

  1. Classification of the visual landscape for transmission planning

    Science.gov (United States)

    Curtis Miller; Nargis Jetha; Rod MacDonald

    1979-01-01

    The Visual Landscape Type Classification method of the Route and Site Selection Division, Ontario Hydro, defines and delineates the landscape into discrete visual units using parametric and judgmental data. This qualitative and quantitative information is documented in a prescribed format to give each of the approximately 1100 Landscape Types a unique description....

  2. CLASSIFICATION OF THE MGR WASTE EMPLACEMENT/RETRIEVAL SYSTEM

    International Nuclear Information System (INIS)

    J.A. Ziegler

    2000-01-01

    The purpose of this analysis is to document the Quality Assurance (QA) classification of the Monitored Geologic Repository (MGR) waste emplacement/retrieval system structures, systems and components (SSCs) performed by the MGR Preclosure Safety and Systems Engineering Section. This analysis also provides the basis for revision of YMP/90-55Q, Q-List (YMP 2000). The Q-List identifies those MGR SSCs subject to the requirements of DOE/RW-0333P, ''Quality Assurance Requirements and Description'' (QARD) (DOE 2000). This QA classification incorporates the current MGR design and the results of the ''Design Basis Event Frequency and Dose Calculation for Site Recommendation'' (CRWMS M&O 2000a). The content and technical approach of this analysis is in accordance with the development plan ''QA Classification of MGR Structures, Systems, and Components'' (CRWMS M&O 1999b).

  3. FULLY CONVOLUTIONAL NETWORKS FOR GROUND CLASSIFICATION FROM LIDAR POINT CLOUDS

    Directory of Open Access Journals (Sweden)

    A. Rizaldy

    2018-05-01

    Full Text Available Deep Learning has been massively used for image classification in recent years. The use of deep learning for ground classification from LIDAR point clouds has also been recently studied. However, point clouds need to be converted into an image in order to use Convolutional Neural Networks (CNNs). In state-of-the-art techniques, this conversion is slow because each point is converted into a separate image. This approach leads to highly redundant computation during conversion and classification. The goal of this study is to design a more efficient data conversion and ground classification. This goal is achieved by first converting the whole point cloud into a single image. The classification is then performed by a Fully Convolutional Network (FCN), a modified version of CNN designed for pixel-wise image classification. The proposed method is significantly faster than state-of-the-art techniques. On the ISPRS Filter Test dataset, it is 78 times faster for conversion and 16 times faster for classification. Our experimental analysis on the same dataset shows that the proposed method results in 5.22 % of total error, 4.10 % of type I error, and 15.07 % of type II error. Compared to the previous CNN-based technique and LAStools software, the proposed method reduces the total error and type I error (while type II error is slightly higher). The method was also tested on very high point density LIDAR point clouds, resulting in 4.02 % of total error, 2.15 % of type I error and 6.14 % of type II error.

  4. Staffs' documentation of participation for adults with profound intellectual disability or profound intellectual and multiple disabilities.

    Science.gov (United States)

    Talman, Lena; Gustafsson, Christine; Stier, Jonas; Wilder, Jenny

    2017-06-21

    This study investigated what areas of International Classification of Functioning, Disability and Health were documented in implementation plans for adults with profound intellectual disability or profound intellectual and multiple disabilities with focus on participation. A document analysis of 17 implementation plans was performed and International Classification of Functioning, Disability and Health was used as an analytic tool. One hundred and sixty-three different codes were identified, especially in the components Activities and participation and Environmental factors. Participation was most frequently coded in the chapters Community, social and civic life and Self-care. Overall, the results showed that focus in the implementation plans concerned Self-care and Community, social and civic life. The other life areas in Activities and participation were seldom, or not at all, documented. A deeper focus on participation in the implementation plans and all life areas in the component Activities and participation is needed. It is important that the documentation clearly shows what the adult wants, wishes, and likes in everyday life. It is also important to ensure that the job description for staff contains both life areas and individual preferences so that staff have the possibility to work to fulfill social and individual participation for the target group. Implications for rehabilitation There is a need for functioning working models to increase participation significantly for adults with profound intellectual disability or profound intellectual and multiple disabilities. For these adults, participation is achieved through the assistance of others and support and services carried out must be documented in an implementation plan. The International Classification of Functioning, Disability and Health can be used to support staff and ensure that information about the most important factors in an individual's functioning in their environment is not omitted in

  5. Significance of perceptually relevant image decolorization for scene classification

    Science.gov (United States)

    Viswanathan, Sowmya; Divakaran, Govind; Soman, Kutti Padanyl

    2017-11-01

    Color images contain luminance and chrominance components representing the intensity and color information, respectively. The objective of this paper is to show the significance of incorporating chrominance information to the task of scene classification. An improved color-to-grayscale image conversion algorithm that effectively incorporates chrominance information is proposed using the color-to-gray structure similarity index and singular value decomposition to improve the perceptual quality of the converted grayscale images. The experimental results based on an image quality assessment for image decolorization and its success rate (using the Cadik and COLOR250 datasets) show that the proposed image decolorization technique performs better than eight existing benchmark algorithms for image decolorization. In the second part of the paper, the effectiveness of incorporating the chrominance component for scene classification tasks is demonstrated using a deep belief network-based image classification system developed using dense scale-invariant feature transforms. The amount of chrominance information incorporated into the proposed image decolorization technique is confirmed with the improvement to the overall scene classification accuracy. Moreover, the overall scene classification performance improved by combining the models obtained using the proposed method and conventional decolorization methods.
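    To see why chrominance matters, the toy conversion below adds a fraction of the chroma magnitude to the standard Rec. 601 luminance, so iso-luminant colours remain distinguishable in the grayscale output. This is not the paper's CGSSI/SVD-based method; the chroma weight is an arbitrary assumption.

```python
import numpy as np

def decolorize(rgb, chroma_weight=0.25):
    """Toy colour-to-gray conversion: Rec. 601 luminance plus a share
    of the chroma magnitude, so colour contrast is not discarded."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    luminance = 0.299 * r + 0.587 * g + 0.114 * b
    chroma = np.max(rgb, axis=-1) - np.min(rgb, axis=-1)
    return np.clip(luminance + chroma_weight * chroma, 0.0, 1.0)

img = np.random.rand(4, 4, 3)  # synthetic RGB image with values in [0, 1]
print(decolorize(img).shape)   # (4, 4): single-channel grayscale
```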

  6. Changing techniques in crop plant classification: molecularization at the National Institute of Agricultural Botany during the 1980s.

    Science.gov (United States)

    Holmes, Matthew

    2017-04-01

    Modern methods of analysing biological materials, including protein and DNA sequencing, are increasingly the objects of historical study. Yet twentieth-century taxonomic techniques have been overlooked in one of their most important contexts: agricultural botany. This paper addresses this omission by harnessing unexamined archival material from the National Institute of Agricultural Botany (NIAB), a British plant science organization. During the 1980s the NIAB carried out three overlapping research programmes in crop identification and analysis: electrophoresis, near infrared spectroscopy (NIRS) and machine vision systems. For each of these three programmes, contemporary economic, statutory and scientific factors behind their uptake by the NIAB are discussed. This approach reveals significant links between taxonomic practice at the NIAB and historical questions around agricultural research, intellectual property and scientific values. Such links are of further importance given that the techniques developed by researchers at the NIAB during the 1980s remain part of crop classification guidelines issued by international bodies today.

  7. ILAE Classification of the Epilepsies Position Paper of the ILAE Commission for Classification and Terminology

    Science.gov (United States)

    Scheffer, Ingrid E; Berkovic, Samuel; Capovilla, Giuseppe; Connolly, Mary B; French, Jacqueline; Guilhoto, Laura; Hirsch, Edouard; Jain, Satish; Mathern, Gary W.; Moshé, Solomon L; Nordli, Douglas R; Perucca, Emilio; Tomson, Torbjörn; Wiebe, Samuel; Zhang, Yue-Hua; Zuberi, Sameer M

    2017-01-01

    Summary The ILAE Classification of the Epilepsies has been updated to reflect our gain in understanding of the epilepsies and their underlying mechanisms following the major scientific advances which have taken place since the last ratified classification in 1989. As a critical tool for the practising clinician, epilepsy classification must be relevant and dynamic to changes in thinking, yet robust and translatable to all areas of the globe. Its primary purpose is for diagnosis of patients, but it is also critical for epilepsy research, development of antiepileptic therapies and communication around the world. The new classification originates from a draft document submitted for public comments in 2013 which was revised to incorporate extensive feedback from the international epilepsy community over several rounds of consultation. It presents three levels, starting with seizure type where it assumes that the patient is having epileptic seizures as defined by the new 2017 ILAE Seizure Classification. After diagnosis of the seizure type, the next step is diagnosis of epilepsy type, including focal epilepsy, generalized epilepsy, combined generalized and focal epilepsy, and also an unknown epilepsy group. The third level is that of epilepsy syndrome where a specific syndromic diagnosis can be made. The new classification incorporates etiology along each stage, emphasizing the need to consider etiology at each step of diagnosis as it often carries significant treatment implications. Etiology is broken into six subgroups, selected because of their potential therapeutic consequences. New terminology is introduced such as developmental and epileptic encephalopathy. The term benign is replaced by the terms self-limited and pharmacoresponsive, to be used where appropriate. It is hoped that this new framework will assist in improving epilepsy care and research in the 21st century. PMID:28276062

  8. [The inadequacy of official classification of work accidents in Brazil].

    Science.gov (United States)

    Cordeiro, Ricardo

    2018-02-19

    Traditionally, work accidents in Brazil have been categorized in government documents and legal and academic texts as typical work accidents and commuting accidents. Given the increase in urban violence and the increasingly precarious work conditions in recent decades, this article addresses the conceptual inadequacy of this classification and its implications for the underestimation of work accidents in the country. An alternative classification is presented as an example and a contribution to the discussion on the improvement of statistics on work-related injuries in Brazil.

  9. A hybrid ensemble learning approach to star-galaxy classification

    Science.gov (United States)

    Kim, Edward J.; Brunner, Robert J.; Carrasco Kind, Matias

    2015-10-01

    There exist a variety of star-galaxy classification techniques, each with their own strengths and weaknesses. In this paper, we present a novel meta-classification framework that combines and fully exploits different techniques to produce a more robust star-galaxy classification. To demonstrate this hybrid, ensemble approach, we combine a purely morphological classifier, a supervised machine learning method based on random forest, an unsupervised machine learning method based on self-organizing maps, and a hierarchical Bayesian template-fitting method. Using data from the CFHTLenS survey (Canada-France-Hawaii Telescope Lensing Survey), we consider different scenarios: when a high-quality training set is available with spectroscopic labels from DEEP2 (Deep Extragalactic Evolutionary Probe Phase 2), SDSS (Sloan Digital Sky Survey), VIPERS (VIMOS Public Extragalactic Redshift Survey), and VVDS (VIMOS VLT Deep Survey), and when the demographics of sources in a low-quality training set do not match the demographics of objects in the test data set. We demonstrate that our Bayesian combination technique improves the overall performance over any individual classification method in these scenarios. Thus, strategies that combine the predictions of different classifiers may prove to be optimal in currently ongoing and forthcoming photometric surveys, such as the Dark Energy Survey and the Large Synoptic Survey Telescope.
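    A bare-bones sketch of combining heterogeneous classifiers: each column holds one classifier's estimated P(star) for an object, and a weighted average yields the meta-classification. The probabilities and weights here are invented; the paper's Bayesian combination is considerably more sophisticated.

```python
import numpy as np

# Hypothetical per-object P(star) from four classifiers: morphology,
# random forest, self-organising map, template fitting.
p = np.array([
    [0.9, 0.8, 0.7, 0.6],  # object 1
    [0.2, 0.1, 0.3, 0.2],  # object 2
])

weights = np.array([0.3, 0.3, 0.2, 0.2])  # assumed, tuned on validation data

p_star = p @ weights                      # weighted combination
print(np.where(p_star > 0.5, "star", "galaxy"))
```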

  10. Data Augmentation for Plant Classification

    NARCIS (Netherlands)

    Pawara, Pornntiwa; Okafor, Emmanuel; Schomaker, Lambertus; Wiering, Marco

    2017-01-01

    Data augmentation plays a crucial role in increasing the number of training images, which often aids to improve classification performances of deep learning techniques for computer vision problems. In this paper, we employ the deep learning framework and determine the effects of several
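    A minimal sketch of label-preserving augmentation, assuming images are NumPy arrays; the specific transformations the paper evaluates may differ.

```python
import numpy as np

def augment(image):
    """Generate extra training images via simple label-preserving
    transformations: a horizontal flip and 90-degree rotations."""
    return [image, np.fliplr(image)] + [np.rot90(image, k) for k in (1, 2, 3)]

img = np.random.rand(32, 32, 3)  # synthetic training image
print(len(augment(img)))         # 5 training images from 1 original
```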

  11. Involvement of Machine Learning for Breast Cancer Image Classification: A Survey

    OpenAIRE

    Nahid, Abdullah-Al; Kong, Yinan

    2017-01-01

    Breast cancer is one of the largest causes of women’s death in the world today. Advance engineering of natural image classification techniques and Artificial Intelligence methods has largely been used for the breast-image classification task. The involvement of digital image classification allows the doctor and the physicians a second opinion, and it saves the doctors’ and physicians’ time. Despite the various publications on breast image classification, very few review papers are available w...

  12. Auditable safety analysis and final hazard classification for Buildings 1310-N and 1314-N

    International Nuclear Information System (INIS)

    Kloster, G.L.

    1997-05-01

    This document is a graded auditable safety analysis (ASA) of the deactivation activities planned for the 100-N facility segment comprised of the Building 1310-N pump silo (part of the Liquid Radioactive Waste Treatment Facility) and the 1314-N Building (Liquid Waste Disposal Building). The ASA describes the hazards within the facility and evaluates the adequacy of the measures taken to reduce, control, or mitigate the identified hazards. This document also serves as the Final Hazard Classification (FHC) for the 1310-N pump silo and 1314-N Building segment. The FHC is radiological, based on the Preliminary Hazard Classification and the total inventory of radioactive and hazardous materials in the segment.

  13. Fractographic classification in metallic materials by using 3D processing and computer vision techniques

    Directory of Open Access Journals (Sweden)

    Maria Ximena Bastidas-Rodríguez

    2016-09-01

    Full Text Available Failure analysis aims at collecting information about how and why a failure is produced. The first step in this process is a visual inspection of the flaw surface that reveals the features, marks, and texture which characterize each type of fracture. This is generally carried out by inexperienced personnel who may lack the knowledge to do it. This paper proposes a classification method for three kinds of fractures in crystalline materials: brittle, fatigue, and ductile. The method uses 3D vision, and it is expected to support failure analysis. The features used in this work were: (i) Haralick's features and (ii) the fractal dimension. These features were applied to 3D images obtained from a confocal laser scanning microscope (Zeiss LSM 700). For the classification, we evaluated two classifiers: Artificial Neural Networks and Support Vector Machines. The performance evaluation was made by extracting four marginal relations from the confusion matrix: accuracy, sensitivity, specificity, and precision, plus three evaluation methods: the Receiver Operating Characteristic space, the Individual Classification Success Index, and Jaccard's coefficient. Although the classification percentage obtained by an expert is better than the one obtained with the algorithm, the algorithm achieves a classification percentage near or exceeding 60% accuracy for the analyzed failure modes. The results presented here provide a good approach to address future research on texture analysis using 3D data.

  14. LDA boost classification: boosting by topics

    Science.gov (United States)

    Lei, La; Qiao, Guo; Qimin, Cao; Qitao, Li

    2012-12-01

    AdaBoost is an efficacious classification algorithm, especially in text categorization (TC) tasks. The methodology of setting up a classifier committee and voting on the documents for classification can achieve high categorization precision. However, the traditional Vector Space Model can easily lead to the curse of dimensionality and feature sparsity problems, which seriously affect classification performance. This article proposes a novel classification algorithm called LDABoost, based on the boosting methodology, which uses Latent Dirichlet Allocation (LDA) to model the feature space. Instead of using words or phrases, LDABoost uses latent topics as the features. In this way, the feature dimension is significantly reduced. An improved Naïve Bayes (NB) classifier is designed as the weak learner, which keeps the efficiency advantage of the classic NB algorithm and has higher precision. Moreover, a two-stage iterative weighting method called Cute Integration is proposed for improving accuracy by integrating weak classifiers into the strong classifier in a more rational way; Mutual Information is used as the metric for weight allocation. The voting information and the categorization decisions made by the basis classifiers are fully utilized for generating the strong classifier. Experimental results reveal that LDABoost, categorizing in a low-dimensional space, has higher accuracy than traditional AdaBoost algorithms and many other classic classification algorithms. Moreover, its runtime consumption is lower than that of different versions of AdaBoost and of TC algorithms based on support vector machines and neural networks.
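    A rough sketch of the pipeline: LDA replaces the sparse bag-of-words space with a low-dimensional topic space, over which a boosted committee votes. The four-document corpus is a toy assumption, and scikit-learn's default decision stumps stand in for the paper's improved Naïve Bayes weak classifier and Cute Integration weighting.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.ensemble import AdaBoostClassifier

docs = ["stars orbit the galaxy", "the galaxy rotates",
        "stocks fell on the market", "the market rallied today"]
labels = [0, 0, 1, 1]  # 0 = astronomy, 1 = finance

# Represent each document as a mixture over latent topics.
counts = CountVectorizer().fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
X = lda.fit_transform(counts)

# Boost weak learners over the low-dimensional topic features.
clf = AdaBoostClassifier(n_estimators=20, random_state=0).fit(X, labels)
print(clf.predict(X))  # topic-space predictions for the training docs
```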

  15. Hazard classification criteria for non-nuclear facilities

    International Nuclear Information System (INIS)

    Mahn, J.A.; Walker, S.A.

    1997-01-01

    Sandia National Laboratories' Integrated Risk Management Department has developed a process for establishing the appropriate hazard classification of a new facility or operation, and thus the level of rigor required for the associated authorization basis safety documentation. This process is referred to as the Preliminary Hazard Screen. DOE Order 5481.1B contains the following hazard classification for non-nuclear facilities: high--having the potential for onsite or offsite impacts to large numbers of persons or for major impacts to the environment; moderate--having the potential for considerable onsite impacts but only minor offsite impacts to people or the environment; low--having the potential for only minor onsite and negligible offsite impacts to people or the environment. It is apparent that the application of such generic criteria is more than likely to be fraught with subjective judgment. One way to remove the subjectivity is to define health and safety classification thresholds for specific hazards that are based on the magnitude of the hazard, rather than on a qualitative assessment of possible accident consequences. This paper presents the results of such an approach to establishing a readily usable set of non-nuclear facility hazard classifications

  16. Handling Imbalanced Data Sets in Multistage Classification

    Science.gov (United States)

    López, M.

    Multistage classification is a logical approach, based on a divide-and-conquer solution, for dealing with problems with a high number of classes. The classification problem is divided into several sequential steps, each one associated to a single classifier that works with subgroups of the original classes. In each level, the current set of classes is split into smaller subgroups of classes until they (the subgroups) are composed of only one class. The resulting chain of classifiers can be represented as a tree, which (1) simplifies the classification process by using fewer categories in each classifier and (2) makes it possible to combine several algorithms or use different attributes in each stage. Most of the classification algorithms can be biased in the sense of selecting the most populated class in overlapping areas of the input space. This can degrade a multistage classifier performance if the training set sample frequencies do not reflect the real prevalence in the population. Several techniques such as applying prior probabilities, assigning weights to the classes, or replicating instances have been developed to overcome this handicap. Most of them are designed for two-class (accept-reject) problems. In this article, we evaluate several of these techniques as applied to multistage classification and analyze how they can be useful for astronomy. We compare the results obtained by classifying a data set based on Hipparcos with and without these methods.
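    A minimal two-stage sketch for three imbalanced classes: stage 1 separates the majority class from the rest, and stage 2 splits the remaining classes; class_weight="balanced" illustrates one of the reweighting techniques discussed. The data and classifier choice are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
X = rng.normal(size=(600, 8))
y = rng.choice([0, 1, 2], size=600, p=[0.7, 0.2, 0.1])  # imbalanced classes

# Stage 1: class 0 vs. classes {1, 2}; stage 2: class 1 vs. class 2.
# class_weight="balanced" counteracts the bias toward populated classes.
stage1 = LogisticRegression(class_weight="balanced").fit(X, (y > 0).astype(int))
mask = y > 0
stage2 = LogisticRegression(class_weight="balanced").fit(
    X[mask], (y[mask] == 2).astype(int))

def classify(x):
    x = x.reshape(1, -1)
    if stage1.predict(x)[0] == 0:
        return 0
    return 2 if stage2.predict(x)[0] == 1 else 1

print(classify(X[0]))
```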

  17. Classification of Pansharpened Urban Satellite Images

    DEFF Research Database (Denmark)

    Palsson, Frosti; Sveinsson, Johannes R.; Benediktsson, Jon Atli

    2012-01-01

    The classification of high resolution urban remote sensing imagery is addressed with the focus on classification of imagery that has been pansharpened by a number of different pansharpening methods. The pansharpening process introduces some spectral and spatial distortions in the resulting fused...... multispectral image, the amount of which highly varies depending on which pansharpening technique is used. In the majority of the pansharpening techniques that have been proposed, there is a compromise between the spatial enhancement and the spectral consistency. Here we study the effects of the spectral...... information from the panchromatic data. Random Forests (RF) and Support Vector Machines (SVM) will be used as classifiers. Experiments are done for three different datasets that have been obtained by two different imaging sensors, IKONOS and QuickBird. These sensors deliver multispectral images that have four...

  18. High-level waste tank farm set point document

    International Nuclear Information System (INIS)

    Anthony, J.A. III.

    1995-01-01

    Setpoints for nuclear safety-related instrumentation are required for actions determined by the design authorization basis. Minimum requirements need to be established for assuring that setpoints are established and held within specified limits. This document establishes the controlling methodology for changing setpoints of all classifications. The instrumentation under consideration involve the transfer, storage, and volume reduction of radioactive liquid waste in the F- and H-Area High-Level Radioactive Waste Tank Farms. The setpoint document will encompass the PROCESS AREA listed in the Safety Analysis Report (SAR) (DPSTSA-200-10 Sup 18) which includes the diversion box HDB-8 facility. In addition to the PROCESS AREAS listed in the SAR, Building 299-H and the Effluent Transfer Facility (ETF) are also included in the scope

  19. High-level waste tank farm set point document

    Energy Technology Data Exchange (ETDEWEB)

    Anthony, J.A. III

    1995-01-15

    Setpoints for nuclear safety-related instrumentation are required for actions determined by the design authorization basis. Minimum requirements need to be established for assuring that setpoints are established and held within specified limits. This document establishes the controlling methodology for changing setpoints of all classifications. The instrumentation under consideration involve the transfer, storage, and volume reduction of radioactive liquid waste in the F- and H-Area High-Level Radioactive Waste Tank Farms. The setpoint document will encompass the PROCESS AREA listed in the Safety Analysis Report (SAR) (DPSTSA-200-10 Sup 18) which includes the diversion box HDB-8 facility. In addition to the PROCESS AREAS listed in the SAR, Building 299-H and the Effluent Transfer Facility (ETF) are also included in the scope.

  20. Exploiting monotonicity constraints for active learning in ordinal classification

    NARCIS (Netherlands)

    Soons, Pieter; Feelders, Adrianus

    2014-01-01

    We consider ordinal classification and instance ranking problems where each attribute is known to have an increasing or decreasing relation with the class label or rank. For example, it stands to reason that the number of query terms occurring in a document has a positive influence on its relevance

  1. Adaptive swarm cluster-based dynamic multi-objective synthetic minority oversampling technique algorithm for tackling binary imbalanced datasets in biomedical data classification.

    Science.gov (United States)

    Li, Jinyan; Fong, Simon; Sung, Yunsick; Cho, Kyungeun; Wong, Raymond; Wong, Kelvin K L

    2016-01-01

    An imbalanced dataset is defined as a training dataset that has imbalanced proportions of data in both interesting and uninteresting classes. Often in biomedical applications, samples from the stimulating class are rare in a population, such as medical anomalies, positive clinical tests, and particular diseases. Although the target samples in the primitive dataset are small in number, the induction of a classification model over such training data leads to poor prediction performance due to insufficient training from the minority class. In this paper, we use a novel class-balancing method named adaptive swarm cluster-based dynamic multi-objective synthetic minority oversampling technique (ASCB_DmSMOTE) to solve this imbalanced dataset problem, which is common in biomedical applications. The proposed method combines under-sampling and over-sampling into a swarm optimisation algorithm and adaptively selects suitable parameters for the rebalancing algorithm to find the best solution. Compared with other versions of the SMOTE algorithm, significant improvements, which include higher accuracy and credibility, are observed with ASCB_DmSMOTE. Our proposed method tactfully combines two rebalancing techniques: it reasonably re-allocates the majority class in the details and dynamically optimises the two parameters of SMOTE to synthesise a reasonable scale of minority class for each clustered sub-imbalanced dataset. The proposed method ultimately outperforms other conventional methods, attaining higher credibility with even greater accuracy of the classification model.
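    For orientation, the sketch below applies plain SMOTE (from the imbalanced-learn package, assumed installed) to a synthetic 10:1 dataset; ASCB_DmSMOTE additionally clusters the data and tunes SMOTE's two parameters with swarm optimisation, which is not reproduced here.

```python
import numpy as np
from collections import Counter
from imblearn.over_sampling import SMOTE  # imbalanced-learn package

rng = np.random.default_rng(4)
X = rng.normal(size=(330, 5))
y = np.array([0] * 300 + [1] * 30)  # 10:1 class imbalance

# SMOTE synthesises minority samples by interpolating between
# minority-class nearest neighbours until the classes balance.
X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
print(Counter(y), "->", Counter(y_res))  # {0: 300, 1: 30} -> {0: 300, 1: 300}
```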

  2. Multisource data fusion for documenting archaeological sites

    Science.gov (United States)

    Knyaz, Vladimir; Chibunichev, Alexander; Zhuravlev, Denis

    2017-10-01

    The quality of archaeological site documentation is of great importance for preserving and investigating cultural heritage. The progress in developing new techniques and systems for data acquisition and processing creates an excellent basis for achieving a new quality of archaeological site documentation and visualization. Archaeological data has some specific features which have to be taken into account during acquisition, processing and management. First of all, it is necessary to gather as complete information as possible about findings, with no loss of information and no damage to artifacts. Remote sensing technologies are the most adequate and powerful means of satisfying this requirement. An approach to archaeological data acquisition and fusion based on remote sensing is proposed. It combines a set of photogrammetric techniques for obtaining geometrical and visual information at different scales and levels of detail with a pipeline for archaeological data documenting, structuring, fusion, and analysis. The proposed approach is applied to documenting the Bosporus archaeological expedition of the Russian State Historical Museum.

  3. SPAM CLASSIFICATION BASED ON SUPERVISED LEARNING USING MACHINE LEARNING TECHNIQUES

    Directory of Open Access Journals (Sweden)

    T. Hamsapriya

    2011-12-01

    Full Text Available E-mail is one of the most popular and frequently used ways of communication due to its worldwide accessibility, relatively fast message transfer, and low sending cost. The flaws in the e-mail protocols and the increasing volume of electronic business and financial transactions directly contribute to the increase in e-mail-based threats. Email spam is one of the major problems of today's Internet, bringing financial damage to companies and annoying individual users. Spam emails invade users without their consent and fill their mailboxes. They consume network capacity as well as the time spent checking and deleting spam mails. The vast majority of Internet users are outspoken in their disdain for spam, although enough of them respond to commercial offers that spam remains a viable source of income to spammers. While most users want to do the right thing to avoid and get rid of spam, they need clear and simple guidelines on how to behave. In spite of all the measures taken to eliminate spam, it has not yet been eradicated, and when the countermeasures are over-sensitive, even legitimate emails are eliminated. Among the approaches developed to stop spam, filtering is one of the most important techniques. Much research in spam filtering has been centered on the more sophisticated classifier-related issues. Machine learning for spam classification has recently become an important research issue. The proposed work explores and identifies the effectiveness of different learning algorithms for classifying spam messages in e-mail. A comparative analysis among the algorithms has also been presented.
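
    As an illustration of the classifier-based filtering surveyed above, a minimal multinomial Naive Bayes spam filter can be assembled with scikit-learn. This is only a sketch: the four-message corpus is hypothetical, and the paper's own comparison covers several learning algorithms, not just Naive Bayes.

    ```python
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    # hypothetical toy corpus; real experiments use large labeled e-mail corpora
    mails = ["win a free prize now", "cheap meds limited offer",
             "meeting agenda for monday", "project report attached"]
    labels = ["spam", "spam", "ham", "ham"]

    # bag-of-words features feeding a multinomial Naive Bayes classifier
    model = make_pipeline(CountVectorizer(), MultinomialNB())
    model.fit(mails, labels)
    print(model.predict(["free offer: claim your prize"]))  # -> ['spam']
    ```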

  4. A review on fault classification methodologies in power transmission systems: Part-II

    Directory of Open Access Journals (Sweden)

    Avagaddi Prasad

    2018-05-01

    Full Text Available The vast extent of power systems and their applications requires improvements in suitable techniques for fault classification in power transmission systems, to increase the efficiency of the systems and to avoid major damage. For this purpose, the technical literature proposes a large number of methods. This paper analyzes the technical literature, summarizing the most important methods that can be applied to fault classification in power transmission systems. Part II of the article, titled “A review on fault classification methodologies in power transmission systems”, discusses the advanced technologies developed by various researchers for fault classification in power transmission systems. Keywords: Transmission line protection, Protective relaying, Soft computing techniques

  5. Feasibility of a novel deformable image registration technique to facilitate classification, targeting, and monitoring of tumor and normal tissue

    International Nuclear Information System (INIS)

    Brock, Kristy K.; Dawson, Laura A.; Sharpe, Michael B.; Moseley, Douglas J.; Jaffray, David A.

    2006-01-01

    Purpose: To investigate the feasibility of a biomechanical-based deformable image registration technique for the integration of multimodality imaging, image guided treatment, and response monitoring. Methods and Materials: A multiorgan deformable image registration technique based on finite element modeling (FEM) and surface projection alignment of selected regions of interest with biomechanical material and interface models has been developed. FEM also provides an inherent method for directly tracking specified regions through treatment and follow-up. Results: The technique was demonstrated on 5 liver cancer patients. Differences of up to 1 cm of motion were seen between the diaphragm and the tumor center of mass after deformable image registration of exhale and inhale CT scans. Spatial differences of 5 mm or more were observed for up to 86% of the surface of the defined tumor after deformable image registration of the computed tomography (CT) and magnetic resonance images. Up to 6.8 mm of motion was observed for the tumor after deformable image registration of the CT and cone-beam CT scan after rigid registration of the liver. Deformable registration of the CT to the follow-up CT allowed a more accurate assessment of tumor response. Conclusions: This biomechanical-based deformable image registration technique incorporates classification, targeting, and monitoring of tumor and normal tissue using one methodology.

  6. Development of an auditable safety analysis in support of a radiological facility classification

    International Nuclear Information System (INIS)

    Kinney, M.D.; Young, B.

    1995-01-01

    In recent years, U.S. Department of Energy (DOE) facilities commonly have been classified as reactor, non-reactor nuclear, or nuclear facilities. Safety analysis documentation was prepared for these facilities, with few exceptions, using the requirements in either DOE Order 5481.1B, Safety Analysis and Review System; or DOE Order 5480.23, Nuclear Safety Analysis Reports. Traditionally, this has been accomplished by development of an extensive Safety Analysis Report (SAR), which identifies hazards, assesses risks of facility operation, describes and analyzes adequacy of measures taken to control hazards, and evaluates potential accidents and their associated risks. This process is complicated by analysis of secondary hazards and adequacy of backup (redundant) systems. The traditional SAR process is advantageous for DOE facilities with appreciable hazards or operational risks. SAR preparation for a low-risk facility or process can be cost-prohibitive and quite challenging because conventional safety analysis protocols may not readily be applied to a low-risk facility. The DOE Office of Environmental Restoration and Waste Management recognized this potential disadvantage and issued an EM limited technical standard, No. 5502-94, Hazard Baseline Documentation. This standard can be used for developing documentation for a facility classified as radiological, including preparation of an auditable (defensible) safety analysis. In support of the radiological facility classification process, the Uranium Mill Tailings Remedial Action (UMTRA) Project has developed an auditable safety analysis document based upon the postulation criteria and hazards analysis techniques defined in DOE Order 5480.23

  7. Malware distributed collection and pre-classification system using honeypot technology

    Science.gov (United States)

    Grégio, André R. A.; Oliveira, Isabela L.; Santos, Rafael D. C.; Cansian, Adriano M.; de Geus, Paulo L.

    2009-04-01

    Malware has become a major threat in recent years due to the ease with which it spreads through the Internet. Malware detection has become difficult with the use of compression, polymorphic methods, and techniques to detect and disable security software. Those and other obfuscation techniques pose a problem for detection and classification schemes that analyze malware behavior. In this paper we propose a distributed architecture to improve malware collection using different honeypot technologies to increase the variety of malware collected. We also present a daemon tool developed to grab malware distributed through spam and a pre-classification technique that uses antivirus technology to separate malware into generic classes.

  8. Technical training: CERN Document Server (CDS), Inspire and Library Services

    CERN Multimedia

    IT & GS Departments

    2012-01-01

    A new training course, “CERN Document Server (CDS), Inspire and Library Services”, has been available since the beginning of the year. The training course is given by members of CERN’s CDS Team (IT-CIS group) and the Library Services (GIS SIS group) and is intended for all members of personnel of CERN. This course presents CDS and inspirehep.net and the content, scope and scientific information available in or with CDS, as well as the classification and organization of the documents. It is intended to give you the training needed to use CDS most efficiently and in particular covers: the main characteristics and advanced features for searching documents (scientific, multimedia, etc.); the collaborative tools: baskets, alerts, comments, evaluation, etc.; and the submission of documents in CDS, with examples of workflows. An important part of the training is composed of various exercises, designed to acquire practical ability to work with CDS in cases similar to re...

  9. Textual Data Mining Applications in the Service Chain Knowledge Management of e-Government

    Directory of Open Access Journals (Sweden)

    Jalal Rezaeenour

    2017-03-01

    Full Text Available Systems related to knowledge management can improve the quality and efficiency of the knowledge used in decision-making processes. Approximately 80 percent of corporate information is stored in textual formats, which is why text mining is useful and important in service chain knowledge management. For example, one of the most important applications of text mining is in managing on-line sources of digital documents and analysing internal documents. This research is based on text documents and textual information, together with interviews processed using grounded theory. Clustering techniques were applied in the first step; in the second step, the Apriori technique was applied to discover and extract the most useful association rules. In other words, the integration of data mining techniques was emphasized to improve the accuracy and precision of classification. Using a decision tree alone for classification may reduce classification precision, but the proposed method showed a significant improvement in classification precision.

  10. Automatic Genre Classification of Musical Signals

    Science.gov (United States)

    Barbedo, Jayme Garcia Arnal; Lopes, Amauri

    2006-12-01

    We present a strategy to perform automatic genre classification of musical signals. The technique divides the signals into 21.3-millisecond frames, from which 4 features are extracted. The values of each feature are summarized over 1-second analysis segments. Some statistical results of the features along each analysis segment are used to determine a vector of summary features that characterizes the respective segment. Next, a classification procedure uses those vectors to differentiate between genres. The classification procedure has two main characteristics: (1) a very wide and deep taxonomy, which allows a very meticulous comparison between different genres, and (2) a wide pairwise comparison of genres, which allows emphasizing the differences between each pair of genres. The procedure points out the genre that best fits the characteristics of each segment. The final classification of the signal is given by the genre that appears most often across all signal segments. The approach has shown very good accuracy even for the lowest layers of the hierarchical structure.
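
    The final decision stage lends itself to a compact sketch: each 1-second segment receives a genre label, and a majority vote across segments labels the whole signal. The per-segment classifier below is a stub standing in for the pairwise comparison procedure described above.

    ```python
    from collections import Counter

    def classify_signal(segment_features, classify_segment):
        """Label each 1-second analysis segment, then return the genre that
        appears most often across all segments of the signal."""
        votes = [classify_segment(f) for f in segment_features]
        return Counter(votes).most_common(1)[0][0]

    # hypothetical usage with a stub per-segment classifier
    segments = [[0.1, 0.3], [0.2, 0.2], [0.1, 0.4]]
    genre = classify_signal(segments,
                            lambda f: "blues" if f[1] > 0.25 else "classical")
    print(genre)  # -> blues (2 of 3 segments vote blues)
    ```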

  11. Automotive System for Remote Surface Classification.

    Science.gov (United States)

    Bystrov, Aleksandr; Hoare, Edward; Tran, Thuy-Yung; Clarke, Nigel; Gashinova, Marina; Cherniakov, Mikhail

    2017-04-01

    In this paper we discuss a novel approach to road surface recognition, based on the analysis of backscattered microwave and ultrasonic signals. The novelty of our method lies in the fusion of sonar and polarimetric radar data, the extraction of features for separate swathes of illuminated surface (segmentation), and the use of a multi-stage artificial neural network for surface classification. The developed system consists of a 24 GHz radar and a 40 kHz ultrasonic sensor. Features are extracted from the backscattered signals, and the procedures of principal component analysis and supervised classification are then applied to the feature data. Special attention is paid to the multi-stage artificial neural network, which allows an overall increase in classification accuracy. The proposed technique was tested for recognition of a large number of real surfaces in different weather conditions, with an average correct-classification accuracy of 95%. The obtained results thereby demonstrate that the proposed system architecture and statistical methods allow reliable discrimination of various road surfaces in real conditions.

  12. Final hazard classification for the 116-F-4 (Terra Stor) soil retrieval activities

    International Nuclear Information System (INIS)

    Adam, W.J.

    1996-07-01

    The purpose of this document is to provide the final hazard classification for the remediation activities described in the Work Plan for the Retrieval of Contaminated Soil from the 116-F-4 Storage Unit. Based upon total inventories calculated from the characterization data, a preliminary hazard categorization of less than Hazard Category 3 was assigned. Based upon the material-at-risk, a final hazard classification of radiological was assigned

  13. Virtual Sensor of Surface Electromyography in a New Extensive Fault-Tolerant Classification System.

    Science.gov (United States)

    de Moura, Karina de O A; Balbinot, Alexandre

    2018-05-01

    A few prosthetic control systems in the scientific literature employ pattern recognition algorithms adapted to the changes that occur in the myoelectric signal over time and, frequently, such systems are not natural and intuitive. These are some of the several challenges for myoelectric prostheses for everyday use. The concept of the virtual sensor, whose fundamental objective is to estimate unavailable measures from other available measures, is already used in other fields of research. The virtual sensor technique applied to surface electromyography can help to minimize these problems, typically related to the degradation of the myoelectric signal that usually leads to a decrease in the classification accuracy of movements characterized by computational intelligent systems. This paper presents a virtual sensor in a new extensive fault-tolerant classification system designed to maintain classification accuracy after the occurrence of the following contaminants: ECG interference, electrode displacement, movement artifacts, power line interference, and saturation. The Time-Varying Autoregressive Moving Average (TVARMA) and Time-Varying Kalman filter (TVK) models are compared to define the most robust model for the virtual sensor. Movement classification results are presented comparing the usual classification techniques with the methods of degraded-signal replacement and classifier retraining. The experimental results were evaluated for these five noise types in 16 surface electromyography (sEMG) channel degradation case studies. Without classifier retraining, the proposed system recovered 4% to 38% of mean classification accuracy for electrode displacement, movement artifacts, and saturation noise. The best mean classification considering all signal contaminants and channel combinations evaluated was obtained using the retraining method, replacing the degraded channel by the virtual sensor TVARMA model. This method

  14. Automatic topic identification of health-related messages in online health community using text classification.

    Science.gov (United States)

    Lu, Yingjie

    2013-01-01

    To facilitate patient involvement in online health communities and help patients obtain the informational and emotional support they need, this paper proposes an approach for automatically identifying the topics of health-related messages in an online health community, thus assisting patients in efficiently reaching the messages most relevant to their queries. A feature-based classification framework is presented for automatic topic identification. We first collected messages related to some predefined topics in an online health community. We then combined three different types of features, n-gram-based features, domain-specific features and sentiment features, to build four feature sets for health-related text representation. Finally, three different text classification techniques, C4.5, Naïve Bayes and SVM, were adopted to evaluate our topic classification model. By comparing different feature sets and different classification techniques, we found that n-gram-based features, domain-specific features and sentiment features were all effective in distinguishing different types of health-related topics. In addition, a feature reduction technique based on information gain was also effective in improving topic classification performance. In terms of classification techniques, SVM outperformed C4.5 and Naïve Bayes significantly. The experimental results demonstrate that the proposed approach can identify the topics of online health-related messages efficiently.
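
    A minimal sketch of the feature-based framework, pairing word n-gram features with an SVM via scikit-learn, follows; the toy messages and topic labels are hypothetical, and the domain-specific and sentiment features the study adds on top are omitted.

    ```python
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    # hypothetical health-community messages with predefined topic labels
    msgs = ["what dosage should I take", "feeling anxious about my diagnosis",
            "side effects of this drug", "thank you all for the support"]
    topics = ["treatment", "emotional", "treatment", "emotional"]

    # unigram + bigram features feeding a linear SVM
    clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
    clf.fit(msgs, topics)
    print(clf.predict(["any advice on dosage changes"]))
    ```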

  15. An Alternative Approach to Mapping Thermophysical Units from Martian Thermal Inertia and Albedo Data Using a Combination of Unsupervised Classification Techniques

    Directory of Open Access Journals (Sweden)

    Eriita Jones

    2014-06-01

    Full Text Available Thermal inertia and albedo provide information on the distribution of surface materials on Mars. These parameters have been mapped globally on Mars by the Thermal Emission Spectrometer (TES onboard the Mars Global Surveyor. Two-dimensional clusters of thermal inertia and albedo reflect the thermophysical attributes of the dominant materials on the surface. In this paper three automated, non-deterministic, algorithmic classification methods are employed for defining thermophysical units: Expectation Maximisation of a Gaussian Mixture Model; Iterative Self-Organizing Data Analysis Technique (ISODATA; and Maximum Likelihood. We analyse the behaviour of the thermophysical classes resulting from the three classifiers, operating on the 2007 TES thermal inertia and albedo datasets. Producing a rigorous mapping of thermophysical classes at ~3 km/pixel resolution remains important for constraining the geologic processes that have shaped the Martian surface on a regional scale, and for choosing appropriate landing sites. The results from applying these algorithms are compared to geologic maps, surface data from lander missions, features derived from imaging, and previous classifications of thermophysical units which utilized manual (and potentially more time consuming classification methods. These comparisons comprise data suitable for validation of our classifications. Our work shows that a combination of the algorithms—ISODATA and Maximum Likelihood—optimises the sensitivity to the underlying dataspace, and that new information on Martian surface materials can be obtained by using these methods. We demonstrate that the algorithms used here can be applied to define a finer partitioning of albedo and thermal inertia for a more detailed mapping of surface materials, grain sizes and thermal behaviour of the Martian surface and shallow subsurface, at the ~3 km scale.
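
    As a sketch of the first of the three classifiers, Expectation Maximisation of a Gaussian Mixture Model over the two-dimensional (thermal inertia, albedo) space can be run with scikit-learn. The arrays below are synthetic stand-ins for the TES per-pixel values, not real data.

    ```python
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(0)
    # synthetic stand-in for per-pixel (thermal inertia, albedo) pairs:
    # two loose clusters mimicking two thermophysical units
    X = np.vstack([rng.normal([200.0, 0.25], [30.0, 0.02], size=(500, 2)),
                   rng.normal([450.0, 0.12], [50.0, 0.03], size=(500, 2))])

    # EM fits the mixture; each pixel then gets a thermophysical unit label
    gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
    units = gmm.predict(X)
    ```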

  16. A systematic literature review of automated clinical coding and classification systems.

    Science.gov (United States)

    Stanfill, Mary H; Williams, Margaret; Fenton, Susan H; Jenders, Robert A; Hersh, William R

    2010-01-01

    Clinical coding and classification processes transform natural language descriptions in clinical text into data that can subsequently be used for clinical care, research, and other purposes. This systematic literature review examined studies that evaluated all types of automated coding and classification systems to determine the performance of such systems. Studies indexed in Medline or other relevant databases prior to March 2009 were considered. The 113 studies included in this review show that automated tools exist for a variety of coding and classification purposes, focus on various healthcare specialties, and handle a wide variety of clinical document types. Automated coding and classification systems themselves are not generalizable, nor are the results of the studies evaluating them. Published research shows these systems hold promise, but these data must be considered in context, with performance relative to the complexity of the task and the desired outcome.

  17. RELIABLE COGNITIVE DIMENSIONAL DOCUMENT RANKING BY WEIGHTED STANDARD CAUCHY DISTRIBUTION

    Directory of Open Access Journals (Sweden)

    S Florence Vijila

    2017-04-01

    Full Text Available Categorization of cognitively uniform and consistent documents, such as university question papers, is in demand among e-learners. The literature indicates that the standard Cauchy distribution and values derived from it are extensively used for checking the uniformity and consistency of documents. The paper attempts to apply this technique for categorizing question papers according to four selective cognitive dimensions. For this purpose, cognitive dimensional keyword sets of these four categories (also termed portrayal concepts) are assumed, and an automatic procedure is developed to quantify these dimensions in question papers. The categorization is relatively accurate when checked against manual methods, so the simple and well-established term frequency/inverse document frequency (TF-IDF) technique is used to automate the categorization process. After categorization, the standard Cauchy formula is applied to rank-order the documents with the least differences among their Cauchy values, so as to obtain consistent and uniform documents in ranked order. For the experiments and social survey, seven question papers (documents) were designed with varying consistencies. To validate the proposed technique, a social survey was administered to selected samples of e-learners in Tamil Nadu, India. Results are encouraging, and the conclusions drawn from the experiments will be useful to researchers in concept mining and concept-based document categorization. The findings also offer utility value to e-learning system designers.

  18. Robust electrocardiogram (ECG) beat classification using discrete wavelet transform

    International Nuclear Information System (INIS)

    Minhas, Fayyaz-ul-Amir Afsar; Arif, Muhammad

    2008-01-01

    This paper presents a robust technique for the classification of six types of heartbeats through an electrocardiogram (ECG). Features extracted from the QRS complex of the ECG using a wavelet transform, along with the instantaneous RR-interval, are used for beat classification. The wavelet transform utilized for feature extraction in this paper can also be employed for QRS delineation, leading to a reduction in overall system complexity, as no separate feature extraction stage would be required in a practical implementation of the system. Only 11 features are used for beat classification, with a classification accuracy of ∼99.5% using a KNN classifier. Another main advantage of this method is its robustness to noise, which is illustrated in this paper through experimental results. Furthermore, principal component analysis (PCA) has been used for feature reduction, which reduces the number of features from 11 to 6 while retaining the high beat classification accuracy. Due to its reduced computational complexity (with six features, the time required is ∼4 ms per beat), simple classifier, and noise robustness (at a 10 dB signal-to-noise ratio, accuracy is 95%), this method offers substantial advantages over previous techniques for implementation in a practical ECG analyzer
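
    A minimal sketch of the pipeline, wavelet-energy features plus a KNN classifier, can be written with PyWavelets and scikit-learn. The wavelet, decomposition level, window length, and labels below are assumptions for illustration and may differ from the paper's exact choices.

    ```python
    import numpy as np
    import pywt
    from sklearn.neighbors import KNeighborsClassifier

    def beat_features(qrs_window, rr_interval, wavelet="db4", level=3):
        """Energy of each wavelet sub-band of a QRS window, plus the RR-interval."""
        coeffs = pywt.wavedec(qrs_window, wavelet, level=level)
        energies = [float(np.sum(c ** 2)) for c in coeffs]
        return np.array(energies + [rr_interval])

    # hypothetical beats: (QRS window, RR-interval, label) tuples
    rng = np.random.default_rng(0)
    beats = [(rng.normal(size=64), rng.uniform(0.6, 1.0), lbl)
             for lbl in ["N", "V"] * 20]
    X = np.array([beat_features(w, rr) for w, rr, _ in beats])
    y = [lbl for _, _, lbl in beats]
    knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
    ```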

  19. Deep-learnt classification of light curves

    DEFF Research Database (Denmark)

    Mahabal, Ashish; Gieseke, Fabian; Pai, Akshay Sadananda Uppinakudru

    2017-01-01

    Astronomy light curves are sparse, gappy, and heteroscedastic. As a result, standard time series methods regularly used for financial and similar datasets are of little help, and astronomers are usually left to their own instruments and techniques to classify light curves. A common approach is to derive statistical features from the time series and to use machine learning methods, generally supervised, to separate objects into a few of the standard classes. In this work, we transform the time series to two-dimensional light curve representations in order to classify them using modern deep learning techniques. In particular, we show that convolutional neural network based classifiers work well for broad characterization and classification. We use labeled datasets of periodic variables from the CRTS survey and show how this opens doors for a quick classification of diverse classes with several...

  20. Automated Tissue Classification Framework for Reproducible Chronic Wound Assessment

    Directory of Open Access Journals (Sweden)

    Rashmi Mukherjee

    2014-01-01

    Full Text Available The aim of this paper was to develop a computer-assisted tissue classification (granulation, necrotic, and slough) scheme for chronic wound (CW) evaluation using medical image processing and statistical machine learning techniques. The red-green-blue (RGB) wound images captured by a normal digital camera were first transformed into the HSI (hue, saturation, and intensity) color space, and the “S” component of the HSI color channels was then selected as it provided higher contrast. Wound areas from 6 different types of CW were segmented from whole images using fuzzy divergence based thresholding by minimizing edge ambiguity. A set of color and textural features describing granulation, necrotic, and slough tissues in the segmented wound area were extracted using various mathematical techniques. Finally, statistical learning algorithms, namely Bayesian classification and the support vector machine (SVM), were trained and tested for wound tissue classification in different CW images. The performance of the wound area segmentation protocol was further validated against ground truth images labeled by clinical experts. It was observed that an SVM with a 3rd-order polynomial kernel provided the highest accuracies, that is, 86.94%, 90.47%, and 75.53%, for classifying granulation, slough, and necrotic tissues, respectively. The proposed automated tissue classification technique achieved the highest overall accuracy, that is, 87.61%, with the highest kappa statistic value (0.793).
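
    The first preprocessing step, keeping the saturation channel of the HSI representation, follows directly from the standard RGB-to-HSI formula for S; a small sketch with a hypothetical image array:

    ```python
    import numpy as np

    def saturation_channel(rgb):
        """S component of HSI: S = 1 - 3 * min(R, G, B) / (R + G + B)."""
        rgb = rgb.astype(float) / 255.0
        return 1.0 - 3.0 * rgb.min(axis=-1) / (rgb.sum(axis=-1) + 1e-8)

    # hypothetical usage on an HxWx3 uint8 wound image
    img = (np.random.default_rng(0).random((4, 4, 3)) * 255).astype(np.uint8)
    S = saturation_channel(img)   # higher-contrast channel for segmentation
    ```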

  1. Visualization and classification in biomedical terahertz pulsed imaging

    International Nuclear Information System (INIS)

    Loeffler, Torsten; Siebert, Karsten; Czasch, Stephanie; Bauer, Tobias; Roskos, Hartmut G

    2002-01-01

    'Visualization' in imaging is the process of extracting useful information from raw data in such a way that meaningful physical contrasts are developed. 'Classification' is the subsequent process of defining parameter ranges which allow us to identify elements of images such as different tissues or different objects. In this paper, we explore techniques for visualization and classification in terahertz pulsed imaging (TPI) for biomedical applications. For archived (formalin-fixed, alcohol-dehydrated and paraffin-mounted) test samples, we investigate both time- and frequency-domain methods based on bright- and dark-field TPI. Successful tissue classification is demonstrated

  2. Classification of radioactive waste

    International Nuclear Information System (INIS)

    1994-01-01

    Radioactive wastes are generated in a number of different kinds of facilities and arise in a wide range of concentrations of radioactive materials and in a variety of physical and chemical forms. To simplify their management, a number of schemes have evolved for classifying radioactive waste according to the physical, chemical and radiological properties of significance to those facilities managing this waste. These schemes have led to a variety of terminologies, differing from country to country and even between facilities in the same country. This situation makes it difficult for those concerned to communicate with one another regarding waste management practices. This document revises and updates earlier IAEA references on radioactive waste classification systems given in IAEA Technical Reports Series and Safety Series. Guidance regarding exemption of materials from regulatory control is consistent with IAEA Safety Series and the RADWASS documents published under IAEA Safety Series. 11 refs, 2 figs, 2 tab

  3. Seafloor backscatter signal simulation and classification

    Digital Repository Service at National Institute of Oceanography (India)

    Mahale, V.; El Dine, W.G.; Chakraborty, B.

    In this model, a smooth echo envelope is generated and then mixed with multiplicative and additive noise. Several such echo signals were simulated for three types of seafloor. An Artificial Neural Network based classification technique is conceived to classify...
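
    A sketch of that simulation idea, with the envelope shape and noise distributions chosen purely for illustration (the record does not specify them):

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    t = np.linspace(0.0, 1.0, 500)
    envelope = np.exp(-((t - 0.3) ** 2) / 0.01)       # smooth echo envelope
    mult = rng.rayleigh(scale=0.3, size=t.size)       # multiplicative noise
    add = rng.normal(scale=0.02, size=t.size)         # additive noise
    echo = envelope * mult + add                      # simulated backscatter echo
    ```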

  4. Improving imbalanced scientific text classification using sampling strategies and dictionaries

    Directory of Open Access Journals (Sweden)

    Borrajo L.

    2011-12-01

    Full Text Available Many real applications have the imbalanced class distribution problem, where one of the classes is represented by a very small number of cases compared to the other classes. Among the systems affected are those related to the retrieval and classification of scientific documentation.

  5. Information Gain Based Dimensionality Selection for Classifying Text Documents

    Energy Technology Data Exchange (ETDEWEB)

    Dumidu Wijayasekara; Milos Manic; Miles McQueen

    2013-06-01

    Selecting the optimal dimensions for various knowledge extraction applications is an essential component of data mining. Dimensionality selection techniques are utilized in classification applications to increase the classification accuracy and reduce the computational complexity. In text classification, where the dimensionality of the dataset is extremely high, dimensionality selection is even more important. This paper presents a novel genetic algorithm based methodology for dimensionality selection in text mining applications that utilizes information gain. The presented methodology uses the information gain of each dimension to change the mutation probability of chromosomes dynamically. Since the information gain is calculated a priori, the computational complexity is not affected. The presented method was tested on a specific text classification problem and compared with conventional genetic algorithm based dimensionality selection. The results show an improvement of 3% in the true positives and 1.6% in the true negatives over conventional dimensionality selection methods.
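
    A sketch of the a priori information-gain computation for discrete features, whose normalised values could then scale the per-dimension mutation probabilities, might look as follows; the mutation-probability mapping at the end is a hypothetical choice, not the paper's exact scheme.

    ```python
    import numpy as np

    def entropy(y):
        _, counts = np.unique(y, return_counts=True)
        p = counts / counts.sum()
        return -np.sum(p * np.log2(p))

    def information_gain(x, y):
        """IG(Y; X) = H(Y) - sum_v P(X = v) * H(Y | X = v)."""
        ig = entropy(y)
        for v in np.unique(x):
            mask = x == v
            ig -= mask.mean() * entropy(y[mask])
        return ig

    # hypothetical binary term-presence features and class labels
    rng = np.random.default_rng(0)
    X = rng.integers(0, 2, size=(100, 5))
    y = rng.integers(0, 2, size=100)
    gains = np.array([information_gain(X[:, j], y) for j in range(X.shape[1])])
    # map normalised gain into a per-dimension mutation probability
    mutation_p = 0.01 + 0.09 * gains / (gains.max() + 1e-12)
    ```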

  6. AN IMPLEMENTATION OF EIS-SVM CLASSIFIER USING RESEARCH ARTICLES FOR TEXT CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    B Ramesh

    2016-04-01

    Full Text Available Automatic text classification is a prominent research topic in text mining. Text pre-processing plays a major role in a text classifier, and efficient pre-processing techniques increase its performance. In this paper, we implement the ECAS stemmer, Efficient Instance Selection, and a Pre-computed Kernel Support Vector Machine for the classification of recent research articles. We use pre-processing techniques such as the ECAS stemmer to find root words, Efficient Instance Selection for dimensionality reduction of the text data, and the Pre-computed Kernel Support Vector Machine for classification of the selected instances. Experiments were performed on 750 research articles in three classes: engineering articles, medical articles, and educational articles. The EIS-SVM classifier provides better performance in real-time research article classification.
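
    The pre-computed kernel step can be sketched with scikit-learn's SVC(kernel="precomputed"): the Gram matrix over the selected instances is built once up front and handed to the SVM. A linear kernel and random stand-in features are assumed below.

    ```python
    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X_train = rng.normal(size=(30, 10))    # stand-in article feature vectors
    y_train = rng.integers(0, 3, size=30)  # three classes of articles
    X_test = rng.normal(size=(5, 10))

    # linear Gram matrices computed once, up front
    K_train = X_train @ X_train.T          # shape (n_train, n_train)
    K_test = X_test @ X_train.T            # shape (n_test, n_train)

    svm = SVC(kernel="precomputed").fit(K_train, y_train)
    pred = svm.predict(K_test)
    ```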

  7. Failure diagnosis using deep belief learning based health state classification

    International Nuclear Information System (INIS)

    Tamilselvan, Prasanna; Wang, Pingfeng

    2013-01-01

    Effective health diagnosis provides multifarious benefits such as improved safety, improved reliability and reduced costs for operation and maintenance of complex engineered systems. This paper presents a novel multi-sensor health diagnosis method using a deep belief network (DBN). The DBN has recently become a popular approach in machine learning for its promised advantages such as fast inference and the ability to encode richer and higher-order network structures. The DBN employs a hierarchical structure with multiple stacked restricted Boltzmann machines and works through a layer-by-layer successive learning process. The proposed multi-sensor health diagnosis methodology using DBN based state classification can be structured in three consecutive stages: first, defining health states and preprocessing sensory data for DBN training and testing; second, developing DBN based classification models for diagnosis of predefined health states; third, validating DBN classification models with a testing sensory dataset. Health diagnosis using the DBN based health state classification technique is compared with four existing diagnosis techniques. Benchmark classification problems and two engineering health diagnosis applications, aircraft engine health diagnosis and electric power transformer health diagnosis, are employed to demonstrate the efficacy of the proposed approach

  8. Vulnerable land ecosystems classification using spatial context and spectral indices

    Science.gov (United States)

    Ibarrola-Ulzurrun, Edurne; Gonzalo-Martín, Consuelo; Marcello, Javier

    2017-10-01

    Natural habitats are exposed to growing pressure due to the intensification of land use and tourism development. Thus, obtaining information on the vegetation is necessary for conservation and management projects. In this context, remote sensing is an important tool for monitoring and managing habitats, with classification being a crucial stage. The majority of image classification techniques are based on the pixel-based approach. An alternative is the object-based (OBIA) approach, in which a previous segmentation step merges image pixels to create objects that are then classified. In addition, improved results may be gained by incorporating additional spatial information and specific spectral indices into the classification process. The main goal of this work was to implement and assess object-based classification techniques on very-high-resolution imagery, incorporating spectral indices and contextual spatial information in the classification models. The study area was Teide National Park in the Canary Islands (Spain), using Worldview-2 orthoready imagery. In the classification model, two common indices were selected, the Normalized Difference Vegetation Index (NDVI) and the Optimized Soil Adjusted Vegetation Index (OSAVI), as well as two specific Worldview-2 sensor indices, the Worldview Vegetation Index and the Worldview Soil Index. To include the contextual information, Grey Level Co-occurrence Matrices (GLCM) were used. The classification was performed by training a Support Vector Machine with a sufficient and representative number of vegetation samples (Spartocytisus supranubius, Pterocephalus lasiospermus, Descurainia bourgaeana and Pinus canariensis) as well as urban, road and bare soil classes. Confusion matrices were computed to evaluate the results of each classification model; the highest overall accuracy (90.07%) was obtained by combining both Worldview indices with GLCM dissimilarity.
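
    The two common indices added to the model follow directly from their standard definitions over the red and near-infrared bands; a sketch with hypothetical band arrays:

    ```python
    import numpy as np

    def ndvi(nir, red):
        """Normalized Difference Vegetation Index."""
        return (nir - red) / (nir + red + 1e-8)

    def osavi(nir, red, L=0.16):
        """Optimized Soil Adjusted Vegetation Index with the usual L = 0.16."""
        return (nir - red) / (nir + red + L)

    # hypothetical reflectance bands from the imagery
    rng = np.random.default_rng(0)
    nir, red = rng.random((2, 256, 256))
    index_layers = np.stack([ndvi(nir, red), osavi(nir, red)])
    ```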

  9. Land Cover Classification from Multispectral Data Using Computational Intelligence Tools: A Comparative Study

    Directory of Open Access Journals (Sweden)

    André Mora

    2017-11-01

    Full Text Available This article discusses how computational intelligence techniques are applied to fuse spectral images into a higher-level image of land cover distribution for remote sensing, specifically for satellite image classification. We compare a fuzzy-inference method with two other computational intelligence methods, decision trees and neural networks, using a case study of land cover classification from satellite images. Further, an unsupervised approach based on k-means clustering has also been taken into consideration for comparison. The fuzzy-inference method includes training the classifier with a fuzzy-fusion technique and then performing land cover classification using reinforcement aggregation operators. To assess the robustness of the four methods, a comparative study including three years of land cover maps for the district of Mandimba, Niassa province, Mozambique, was undertaken. Our results show that the fuzzy-fusion method performs similarly to decision trees, achieving reliable classifications; neural networks suffer from overfitting; while k-means clustering constitutes a promising technique to identify land cover types from unknown areas.

  10. Exploring different approaches for music genre classification

    Directory of Open Access Journals (Sweden)

    Antonio Jose Homsi Goulart

    2012-07-01

    Full Text Available In this letter, we present different approaches for music genre classification. The proposed techniques, which are composed of a feature extraction stage followed by a classification procedure, explore both the variations of parameters used as input and the classifier architecture. Tests were carried out with three styles of music, namely blues, classical, and lounge, which are considered informally by some musicians as being “big dividers” among music genres, showing the efficacy of the proposed algorithms and establishing a relationship between the relevance of each set of parameters for each music style and each classifier. In contrast to other works, entropies and fractal dimensions are the features adopted for the classifications.

  11. Iris Data Classification Using Quantum Neural Networks

    International Nuclear Information System (INIS)

    Sahni, Vishal; Patvardhan, C.

    2006-01-01

    Quantum computing is a novel paradigm that promises to be the future of computing. The performance of quantum algorithms has proved to be stunning. ANNs within the context of classical computation have been used for approximation and classification tasks with some success. This paper presents an idea of quantum neural networks along with the training algorithm and its convergence property. It synergizes the unique properties of quantum bits, or qubits, with the various techniques in vogue in neural networks. An example application to Fisher's Iris data set, a benchmark classification problem, is also presented. The results obtained amply demonstrate the classification capabilities of the quantum neuron and give an idea of its promising capabilities

  12. The composite sequential clustering technique for analysis of multispectral scanner data

    Science.gov (United States)

    Su, M. Y.

    1972-01-01

    The clustering technique consists of two parts: (1) a sequential statistical clustering which is essentially a sequential variance analysis, and (2) a generalized K-means clustering. In this composite clustering technique, the output of (1) is a set of initial clusters which are input to (2) for further improvement by an iterative scheme. This unsupervised composite technique was employed for automatic classification of two sets of remote multispectral earth resource observations. The classification accuracy by the unsupervised technique is found to be comparable to that by traditional supervised maximum likelihood classification techniques. The mathematical algorithms for the composite sequential clustering program and a detailed computer program description with job setup are given.
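
    A sketch of the two-part composite idea: a crude one-pass sequential stage opens a new cluster whenever a sample lies too far from every existing cluster mean, and its output seeds a standard K-means refinement. The sequential stage here is a simplification of the paper's sequential variance analysis.

    ```python
    import numpy as np
    from sklearn.cluster import KMeans

    def sequential_seeds(X, threshold):
        """One pass through the data, opening a new cluster whenever a sample
        is farther than `threshold` from every existing cluster mean."""
        means, counts = [X[0].copy()], [1]
        for x in X[1:]:
            d = [np.linalg.norm(x - m) for m in means]
            i = int(np.argmin(d))
            if d[i] > threshold:
                means.append(x.copy())
                counts.append(1)
            else:                          # running update of the nearest mean
                counts[i] += 1
                means[i] += (x - means[i]) / counts[i]
        return np.array(means)

    # hypothetical multispectral pixel vectors
    X = np.random.default_rng(0).normal(size=(300, 4))
    seeds = sequential_seeds(X, threshold=3.0)
    labels = KMeans(n_clusters=len(seeds), init=seeds, n_init=1).fit_predict(X)
    ```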

  13. Document Examination: Applications of Image Processing Systems.

    Science.gov (United States)

    Kopainsky, B

    1989-12-01

    Dealing with images is a familiar business for an expert in questioned documents: microscopic, photographic, infrared, and other optical techniques generate images containing the information he or she is looking for. A recent method for extracting most of this information is digital image processing, ranging from the simple contrast and contour enhancement to the advanced restoration of blurred texts. When combined with a sophisticated physical imaging system, an image processing system has proven to be a powerful and fast tool for routine non-destructive scanning of suspect documents. This article reviews frequent applications, comprising techniques to increase legibility, two-dimensional spectroscopy (ink discrimination, alterations, erased entries, etc.), comparison techniques (stamps, typescript letters, photo substitution), and densitometry. Computerized comparison of handwriting is not included. Copyright © 1989 Central Police University.

  14. Overlap and Nonoverlap Between the ICF Core Sets for Hearing Loss and Otology and Audiology Intake Documentation.

    Science.gov (United States)

    van Leeuwen, Lisette M; Merkus, Paul; Pronk, Marieke; van der Torn, Marein; Maré, Marcel; Goverts, S Theo; Kramer, Sophia E

    The International Classification of Functioning Disability and Health (ICF) Core Sets for Hearing Loss (HL) were developed to serve as a standard for the assessment and reporting of the functioning and health of patients with HL. The aim of the present study was to compare the content of the intake documentation currently used in secondary and tertiary hearing care settings in the Netherlands with the content of the ICF Core Sets for HL. Research questions were (1) to what extent are the ICF Core Sets for HL represented in the Dutch Otology and Audiology intake documentation? (2) are there any extra ICF categories expressed in the intake documentation that are currently not part of the ICF Core Sets for HL, or constructs expressed that are not part of the ICF? Multicenter patient record study including 176 adult patients from two secondary and two tertiary hearing care settings. The intake documentation was selected from anonymized patient records. The content was linked to the appropriate ICF category from the whole ICF classification using established linking rules. The extent to which the ICF Core Sets for HL were represented in the intake documentation was determined by assessing the overlap between the ICF categories in the Core Sets and the list of unique ICF categories extracted from the intake documentation. Any extra constructs that were expressed in the intake documentation but are not part of the Core Sets were described as well, differentiating between ICF categories that are not part of the Core Sets and constructs that are not part of the ICF classification. In total, otology and audiology intake documentation represented 24 of the 27 Brief ICF Core Set categories (i.e., 89%), and 60 of the 117 Comprehensive ICF Core Set categories (i.e., 51%). Various ICF Core Sets categories were not represented, including higher mental functions (Body Functions), civic life aspects (Activities and Participation), and support and attitudes of family (Environmental

  15. A Graph Cut Approach to Artery/Vein Classification in Ultra-Widefield Scanning Laser Ophthalmoscopy.

    Science.gov (United States)

    Pellegrini, Enrico; Robertson, Gavin; MacGillivray, Tom; van Hemert, Jano; Houston, Graeme; Trucco, Emanuele

    2018-02-01

    The classification of blood vessels into arterioles and venules is a fundamental step in the automatic investigation of retinal biomarkers for systemic diseases. In this paper, we present a novel technique for vessel classification on ultra-wide-field-of-view images of the retinal fundus acquired with a scanning laser ophthalmoscope. To the best of our knowledge, this is the first time that a fully automated artery/vein classification technique for this type of retinal imaging with no manual intervention has been presented. The proposed method exploits hand-crafted features based on local vessel intensity and vascular morphology to formulate a graph representation from which a globally optimal separation between the arterial and venular networks is computed by a graph cut approach. The technique was tested on three different data sets (one publicly available and two local) and achieved an average classification accuracy of 0.883 in the largest data set.
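
    The global separation step can be illustrated with a minimum s-t cut on a toy graph, here via networkx. The nodes, capacities, and seed labels below are hypothetical; in the paper the graph is built from vessel intensity and morphology features.

    ```python
    import networkx as nx

    # toy vessel-segment graph: capacities encode how strongly two
    # neighbouring segments are believed to share the same label
    edges = [("artery_seed", "s1", 3.0), ("s1", "s2", 2.5),
             ("s2", "vein_seed", 0.4), ("s1", "vein_seed", 0.3)]
    G = nx.DiGraph()
    for u, v, c in edges:
        G.add_edge(u, v, capacity=c)       # undirected affinity modelled
        G.add_edge(v, u, capacity=c)       # as a pair of directed arcs

    # globally optimal separation = minimum cut between the two seeds
    cut_value, (arteries, veins) = nx.minimum_cut(G, "artery_seed", "vein_seed")
    print(sorted(arteries), sorted(veins))  # s1 and s2 land on the artery side
    ```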

  16. Integrated tracking, classification, and sensor management theory and applications

    CERN Document Server

    Krishnamurthy, Vikram; Vo, Ba-Ngu

    2012-01-01

    A unique guide to the state of the art of tracking, classification, and sensor management. This book addresses the tremendous progress made over the last few decades in algorithm development and mathematical analysis for filtering, multi-target multi-sensor tracking, sensor management and control, and target classification. It provides for the first time an integrated treatment of these advanced topics, complete with careful mathematical formulation, clear description of the theory, and real-world applications. Written by experts in the field, Integrated Tracking, Classification, and Sensor Management provides readers with easy access to key Bayesian modeling and filtering methods, multi-target tracking approaches, target classification procedures, and large scale sensor management problem-solving techniques.

  17. Automated radial basis function neural network based image classification system for diabetic retinopathy detection in retinal images

    Science.gov (United States)

    Anitha, J.; Vijila, C. Kezi Selva; Hemanth, D. Jude

    2010-02-01

    Diabetic retinopathy (DR) is a chronic eye disease for which early detection is highly essential to avoid any fatal results. Image processing of retinal images emerges as a feasible tool for this early diagnosis. Digital image processing techniques involve image classification, a significant technique for detecting abnormality in the eye. Various automated classification systems have been developed in recent years, but most of them lack high classification accuracy. Artificial neural networks are the widely preferred artificial intelligence technique since they yield superior results in terms of classification accuracy. In this work, a Radial Basis Function (RBF) neural network based bi-level classification system is proposed to differentiate abnormal DR images and normal retinal images. The results are analyzed in terms of classification accuracy, sensitivity and specificity. A comparative analysis is performed against a probabilistic classifier, namely the Bayesian classifier, to show the superior nature of the neural classifier. Experimental results are promising for the neural classifier in terms of the performance measures.
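
    A common way to realise an RBF network, sketched below under assumptions not taken from the paper: Gaussian hidden units centred by k-means, with linear output weights fitted by least squares over hypothetical stand-in feature vectors.

    ```python
    import numpy as np
    from sklearn.cluster import KMeans

    def rbf_design(X, centers, sigma):
        """Gaussian hidden-layer activations for each sample."""
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / (2.0 * sigma ** 2))

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 8))            # stand-in retinal image features
    y = rng.integers(0, 2, size=100)         # 0 = normal, 1 = DR

    centers = KMeans(n_clusters=10, n_init=10,
                     random_state=0).fit(X).cluster_centers_
    H = rbf_design(X, centers, sigma=1.0)      # hidden-layer activations
    w, *_ = np.linalg.lstsq(H, y, rcond=None)  # linear output weights
    pred = (rbf_design(X, centers, sigma=1.0) @ w > 0.5).astype(int)
    ```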

  18. IDENTIFYING ROOF FALL PREDICTORS USING FUZZY CLASSIFICATION

    International Nuclear Information System (INIS)

    Bertoncini, C. A.; Hinders, M. K.

    2010-01-01

    Microseismic monitoring involves placing geophones on the rock surfaces of a mine to record seismic activity. Classification of microseismic mine data can be used to predict seismic events in a mine and mitigate mining hazards, such as roof falls, where properly bolting and bracing the roof is often an insufficient method of preventing weak roofs from destabilizing. In this study, six months of recorded acoustic waveforms from microseismic monitoring in a Pennsylvania limestone mine were analyzed using classification techniques to predict roof falls. Fuzzy classification using features selected for computational ease was applied to the mine data. Both large roof fall events could be predicted using a Roof Fall Index (RFI) metric calculated from the results of the fuzzy classification. RFI successfully resolved the two significant roof fall events and predicted both events at least 15 hours before visual signs of the roof falls were evident.

  19. Precise documentation of well-structured programs

    Energy Technology Data Exchange (ETDEWEB)

    Parnas, D.L.; Madey, J.; Iglewski, M. [McMaster Univ., Hamilton, Ontario (Canada)]

    1997-11-01

    This paper describes a new form of program documentation that is precise, systematic and readable. This documentation comprises a set of displays supplemented by a lexicon and an index. Each display presents a program fragment in such a way that its correctness can be examined without looking at any other display. Each display has three parts: (1) the specification of the program presented in the display, (2) the program itself, and (3) the specifications of programs invoked by this program. The displays are intended to be used by Software Engineers as a reference document during inspection and maintenance. This paper also introduces a specification technique that is a refinement of Mills' functional approach to program documentation and verification; programs are specified and described in tabular form.

  20. Transitioning Existing Content: inferring organisation-specific documents

    Directory of Open Access Journals (Sweden)

    Arijit Sengupta

    2000-11-01

    Full Text Available A definition for a document type within an organization represents an organizational norm about the way the organizational actors represent products and supporting evidence of organizational processes. Generating a good organization-specific document structure is, therefore, important since it can capture a shared understanding among the organizational actors about how certain business processes should be performed. Current tools that generate document type definitions focus on the underlying technology, emphasizing tags created in a single instance document. The tools, thus, fall short of capturing the shared understanding between organizational actors about how a given document type should be represented. We propose a method for inferring organization-specific document structures using multiple instance documents as inputs. The method consists of heuristics that combine individual document definitions, which may have been compiled using standard algorithms. We propose a number of heuristics utilizing artificial intelligence and natural language processing techniques. As the research progresses, the heuristics will be tested on a suite of test cases representing multiple instance documents for different document types. The complete methodology will be implemented as a research prototype.

  1. Fingerprint pattern classification approach based on the coordinate geometry of singularities

    CSIR Research Space (South Africa)

    Msiza, IS

    2009-10-01

    Full Text Available of fingerprint matching, it serves to reduce the duration of the query. The fingerprint classes discussed in this document are the Central Twins (CT), Tented Arch (TA), Left Loop (LL), Right Loop (RL) and the Plain Arch (PA). The classification rules employed...

  2. Fuzzy classification for strawberry diseases-infection using machine vision and soft-computing techniques

    Science.gov (United States)

    Altıparmak, Hamit; Al Shahadat, Mohamad; Kiani, Ehsan; Dimililer, Kamil

    2018-04-01

    Robotic agriculture requires smart and practical techniques to substitute machine intelligence for human intelligence. Strawberry is an important Mediterranean product, and enhancing its productivity requires modern, machine-based methods. Whereas a human identifies disease-infected leaves by eye, the machine should likewise be capable of vision-based disease identification. The objective of this paper is to practically verify the applicability of a new computer-vision method for discriminating between healthy and disease-infected strawberry leaves that requires neither a neural network nor time-consuming training. The proposed method was tested under outdoor lighting conditions using a regular DSLR camera without any particular lens. Since the type and degree of disease infection are approximated much as a human brain would, a fuzzy decision maker classifies the leaves from images captured on-site, mimicking the properties of human vision. Optimizing the fuzzy parameters for a typical strawberry production area at summer mid-day in Cyprus produced 96% accuracy for segmented iron deficiency and 93% accuracy for the second segmented condition, using a typical human instant-classification approximation as the benchmark, which is higher than the accuracy of a human eye identifier. The fuzzy-based classifier provides an approximate result for deciding whether a leaf is healthy or not.
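
    The fuzzy decision step can be sketched with simple triangular membership functions over a single colour feature; both the feature and the membership breakpoints below are hypothetical, since the abstract does not disclose the actual rule base.

    ```python
    def tri(x, a, b, c):
        """Triangular membership: rises from a, peaks at b, falls to c."""
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x < b else (c - x) / (c - b)

    def leaf_status(green_ratio):
        """Fuzzy degrees of 'healthy' vs 'infected' from a colour feature."""
        healthy = tri(green_ratio, 0.4, 0.8, 1.0)
        infected = tri(green_ratio, 0.0, 0.2, 0.6)
        return "healthy" if healthy >= infected else "infected"

    print(leaf_status(0.75))  # -> healthy
    ```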

  3. CORPORATE DOCUMENTS AND FILES: FROM THE DAY-BY-DAY ORGANIZATION TO THE EFFECTIVE MANAGEMENT

    Directory of Open Access Journals (Sweden)

    Elisabeth Adriana Dudziak

    2010-09-01

    Full Text Available The aim of this paper is to present and discuss the process of planning, organizing and managing business archives and documents from the standpoint of the executive secretary professional. It begins with an introduction to the current scenario, focusing on global dynamics, service demands, and the essential role of the executive secretary in the company. It presents a theoretical review of key concepts and internationally adopted standards, and sets out the need to go beyond the simple organization of archives towards the effective management of documents and archives. Based on this premise, we present conceptual bases for the implementation of a Documentation and Archive Service, the main document types in companies, and criteria for classification and document retention periods. We discuss the importance of strategic, tactical and operational planning in the management of archives. Finally, we describe the steps in implementing an Archive and Documentation Service, and the code of ethics of the document manager.

  4. Acute pancreatitis: international classification and nomenclature

    International Nuclear Information System (INIS)

    Bollen, T.L.

    2016-01-01

    The incidence of acute pancreatitis (AP) is increasing, and it represents a major healthcare concern. New insights into the pathophysiology, better imaging techniques, and novel treatment options for complicated AP prompted the update of the 1992 Atlanta Classification. Updated nomenclature for pancreatic collections based on imaging criteria is proposed. Adoption of the newly Revised Classification of Acute Pancreatitis 2012 by radiologists should help standardise reports and facilitate the accurate conveyance of relevant findings to referring physicians involved in the care of patients with AP. This review will clarify the nomenclature of pancreatic collections in the setting of AP.

  5. CLASSIFICATION OF THE MGR SITE WATER SYSTEM

    International Nuclear Information System (INIS)

    J.A. Ziegler

    1999-01-01

    The purpose of this analysis is to document the Quality Assurance (QA) classification of the Monitored Geologic Repository (MGR) site water system structures, systems and components (SSCs) performed by the MGR Safety Assurance Department. This analysis also provides the basis for revision of YMP/90-55Q, Q-List (YMP 1998). The Q-List identifies those MGR SSCs subject to the requirements of DOE/RW-0333P, ''Quality Assurance Requirements and Description'' (QARD) (DOE 1998)

  6. Classification of the MGR Assembly Transfer System

    International Nuclear Information System (INIS)

    S.E. Salzman

    1999-01-01

    The purpose of this analysis is to document the Quality Assurance (QA) classification of the Monitored Geologic Repository (MGR) assembly transfer system structures, systems and components (SSCs) performed by the MGR Safety Assurance Department. This analysis also provides the basis for revision of YMP/90-55Q, Q-List (YMP 1998). The Q-List identifies those MGR SSCs subject to the requirements of DOE/RW-0333P, ''Quality Assurance Requirements and Description'' (QARD) (DOE 1998)

  7. CLASSIFICATION OF THE MGR SITE OPERATIONS SYSTEM

    International Nuclear Information System (INIS)

    J.A. Ziegler

    1999-01-01

    The purpose of this analysis is to document the Quality Assurance (QA) classification of the Monitored Geologic Repository (MGR) site operations system structures, systems and components (SSCs) performed by the MGR Safety Assurance Department. This analysis also provides the basis for revision of YMP/90-55Q, Q-List (YMP 1998). The Q-List identifies those MGR SSCs subject to the requirements of DOE/RW-0333P, ''Quality Assurance Requirements and Description'' (QARD) (DOE 1998)

  8. CLASSIFICATION OF THE MGR SITE LAYOUT SYSTEM

    International Nuclear Information System (INIS)

    S.E. Salzman

    1999-01-01

    The purpose of this analysis is to document the Quality Assurance (QA) classification of the Monitored Geologic Repository (MGR) site layout system structures, systems and components (SSCs) performed by the MGR Safety Assurance Department. This analysis also provides the basis for revision of YMP/90-55Q, Q-List (YMP 1998). The Q-List identifies those MGR SSCs subject to the requirements of DOE/RW-0333P, ''Quality Assurance Requirements and Description'' (QARD) (DOE 1998)

  9. CLASSIFICATION OF THE MGR OFFSITE UTILITIES SYSTEM

    International Nuclear Information System (INIS)

    J.A. Ziegler

    1999-01-01

    The purpose of this analysis is to document the Quality Assurance (QA) classification of the Monitored Geologic Repository (MGR) offsite utilities system structures, systems and components (SSCs) performed by the MGR Safety Assurance Department. This analysis also provides the basis for revision of YMP/90-55Q, Q-List (YMP 1998). The Q-List identifies those MGR SSCs subject to the requirements of DOE/RW-0333P, ''Quality Assurance Requirements and Description'' (QARD) (DOE 1998)

  10. CLASSIFICATION OF THE MGR SUBSURFACE EXCAVATION SYSTEM

    International Nuclear Information System (INIS)

    R. Garrett

    1999-01-01

    The purpose of this analysis is to document the Quality Assurance (QA) classification of the Monitored Geologic Repository (MGR) subsurface excavation system structures, systems and components (SSCs) performed by the MGR Safety Assurance Department. This analysis also provides the basis for revision of YMP/90-55Q, Q-List (YMP 1998). The Q-List identifies those MGR SSCs subject to the requirements of DOE/RW-0333P, ''Quality Assurance Requirements and Description'' (QARD) (DOE 1998)

  11. Characteristics and application study of AP1000 NPPs equipment reliability classification method

    International Nuclear Information System (INIS)

    Guan Gao

    2013-01-01

    The AP1000 nuclear power plant applies an integrated approach to establish its equipment reliability classification, which includes probabilistic risk assessment, the maintenance rule administrative process, power production reliability classification and a functional equipment group bounding method, and eventually classifies equipment reliability into four levels. This classification process and its result are very different from those of classical RCM and streamlined RCM. The study examines the characteristics of the AP1000 equipment reliability classification approach, argues that equipment reliability classification should effectively support maintenance strategy development and work process control, and recommends using a combined RCM method to establish the future equipment reliability programme of AP1000 nuclear power plants. (authors)

  12. Gas Classification Using Deep Convolutional Neural Networks

    Science.gov (United States)

    Peng, Pai; Zhao, Xiaojin; Pan, Xiaofang; Ye, Wenbin

    2018-01-01

    In this work, we propose a novel Deep Convolutional Neural Network (DCNN) tailored for gas classification. Inspired by the great success of DCNNs in the field of computer vision, we designed a DCNN with up to 38 layers. In general, the proposed gas neural network, named GasNet, consists of six convolutional blocks (each consisting of six layers), a pooling layer, and a fully-connected layer. Together, these layers make up a powerful deep model for gas classification. Experimental results show that the proposed DCNN method is an effective technique for classifying electronic nose data. We also demonstrate that the DCNN method can provide higher classification accuracy than comparable Support Vector Machine (SVM) methods and a Multilayer Perceptron (MLP). PMID:29316723
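
    The abstract fixes GasNet's overall shape (six convolutional blocks, a pooling layer, a fully-connected layer) but not its hyperparameters. The following is a minimal PyTorch sketch of such a block-structured network for electronic-nose data; channel widths, kernel sizes, block depth and the class count are illustrative assumptions, not the published architecture.

        # Minimal sketch of a GasNet-style block CNN for electronic-nose data.
        # All layer sizes and the number of classes are illustrative assumptions;
        # the paper's exact architecture is not reproduced here.
        import torch
        import torch.nn as nn

        def conv_block(in_ch, out_ch):
            # One convolutional "block": conv -> batch norm -> ReLU.
            return nn.Sequential(
                nn.Conv1d(in_ch, out_ch, kernel_size=3, padding=1),
                nn.BatchNorm1d(out_ch),
                nn.ReLU(),
            )

        class GasNetSketch(nn.Module):
            def __init__(self, n_sensors=16, n_classes=10):
                super().__init__()
                chans = [n_sensors, 32, 64, 64, 128, 128, 128]
                # Six convolutional blocks, as described in the abstract.
                self.blocks = nn.Sequential(
                    *[conv_block(i, o) for i, o in zip(chans[:-1], chans[1:])]
                )
                self.pool = nn.AdaptiveAvgPool1d(1)        # pooling layer
                self.fc = nn.Linear(chans[-1], n_classes)  # fully-connected layer

            def forward(self, x):            # x: (batch, n_sensors, time_steps)
                h = self.pool(self.blocks(x)).squeeze(-1)
                return self.fc(h)            # raw class logits

        model = GasNetSketch()
        logits = model(torch.randn(8, 16, 100))  # 8 samples, 16 sensors, 100 steps
        print(logits.shape)                      # torch.Size([8, 10])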

  14. Feasibility study of stain-free classification of cell apoptosis based on diffraction imaging flow cytometry and supervised machine learning techniques.

    Science.gov (United States)

    Feng, Jingwen; Feng, Tong; Yang, Chengwen; Wang, Wei; Sa, Yu; Feng, Yuanming

    2018-06-01

    This study explored the feasibility of predicting and classifying cells in different stages of apoptosis with a stain-free method based on diffraction images and supervised machine learning. Apoptosis was induced in human chronic myelogenous leukemia K562 cells by cis-platinum (DDP). A newly developed technique of polarization diffraction imaging flow cytometry (p-DIFC) was used to acquire diffraction images of cells in three different statuses (viable, early apoptotic, and late apoptotic/necrotic) after cell separation through fluorescence-activated cell sorting with Annexin V-PE and SYTOX® Green double staining. The texture features of the diffraction images were extracted with in-house software based on the gray-level co-occurrence matrix (GLCM) algorithm to generate datasets for cell classification with supervised machine learning. The method was additionally verified on a hydrogen peroxide-induced apoptosis model of HL-60 cells. The results show that accuracies higher than 90% were achieved on independent test datasets from each cell type using logistic regression with ridge estimators, indicating that the p-DIFC system has great potential for predicting and classifying cells in different stages of apoptosis.
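
    The authors' in-house GLCM software is not public. As a rough stand-in, the sketch below pairs scikit-image's gray-level co-occurrence features with an L2-penalized (ridge) logistic regression, mirroring the feature-plus-classifier pipeline described above; the random arrays are placeholders for real diffraction images.

        # Sketch of the GLCM-texture + ridge logistic regression pipeline,
        # with scikit-image standing in for the authors' in-house software.
        import numpy as np
        from skimage.feature import graycomatrix, graycoprops
        from sklearn.linear_model import LogisticRegression

        def glcm_features(img):
            # img: 2-D uint8 diffraction image. Compute a gray-level
            # co-occurrence matrix and summarize it with standard statistics.
            glcm = graycomatrix(img, distances=[1, 2], angles=[0, np.pi / 2],
                                levels=256, symmetric=True, normed=True)
            props = ["contrast", "homogeneity", "energy", "correlation"]
            return np.hstack([graycoprops(glcm, p).ravel() for p in props])

        rng = np.random.default_rng(0)
        # Stand-in data: random "images" for three cell statuses (0, 1, 2).
        X = np.array([glcm_features(rng.integers(0, 256, (64, 64), dtype=np.uint8))
                      for _ in range(90)])
        y = np.repeat([0, 1, 2], 30)

        # An L2 penalty approximates "logistic regression with ridge estimators".
        clf = LogisticRegression(penalty="l2", C=1.0, max_iter=1000).fit(X, y)
        print(clf.score(X, y))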

  15. Using Apparent Density of Paper from Hardwood Kraft Pulps to Predict Sheet Properties, based on Unsupervised Classification and Multivariable Regression Techniques

    Directory of Open Access Journals (Sweden)

    Ofélia Anjos

    2015-07-01

    Paper properties determine the product's application potential and depend on the raw material, pulping conditions, and pulp refining. The aim of this study was to construct mathematical models that predict quantitative relations between paper density and various mechanical and optical properties of the paper. A dataset of properties of paper handsheets produced with pulps of Acacia dealbata, Acacia melanoxylon, and Eucalyptus globulus beaten at 500, 2500, and 4500 revolutions was used. Unsupervised classification techniques were combined to assess the need for separate prediction models for each species, and multivariable regression techniques were used to establish such prediction models. It was possible to develop models with a high goodness of fit using paper density as the independent variable (or predictor) for all variables except tear index and zero-span tensile strength, both dry and wet.

  16. Occupancy classification of position weight matrix-inferred transcription factor binding sites.

    Directory of Open Access Journals (Sweden)

    Hollis Wright

    BACKGROUND: Computational prediction of Transcription Factor Binding Sites (TFBS) from sequence data alone is difficult and error-prone. Machine learning techniques utilizing additional environmental information about a predicted binding site (such as distances from the site to particular chromatin features) to determine its occupancy/functionality class show promise as methods to achieve more accurate prediction of true TFBS in silico. We evaluate the Bayesian Network (BN) and Support Vector Machine (SVM) machine learning techniques on four distinct TFBS data sets and analyze their performance. We describe the features that are most useful for classification and contrast and compare these feature sets between the factors. RESULTS: Our results demonstrate good performance of classifiers both on TFBS for transcription factors used for initial training and for TFBS for other factors in cross-classification experiments. We find distances to chromatin modifications (specifically, histone modification islands), as well as distances between such modifications, to be effective predictors of TFBS occupancy, though the impact of individual predictors is largely TF-specific. In our experiments, Bayesian network classifiers outperform SVM classifiers. CONCLUSIONS: Our results demonstrate good performance of machine learning techniques on the problem of occupancy classification, and demonstrate that effective classification can be achieved using distances to chromatin features. We additionally demonstrate that cross-classification of TFBS is possible, suggesting the possibility of constructing a generalizable occupancy classifier capable of handling TFBS for many different transcription factors.

  17. Classifying Classifications

    DEFF Research Database (Denmark)

    Debus, Michael S.

    2017-01-01

    This paper critically analyzes seventeen game classifications. The classifications were chosen on the basis of diversity, ranging from pre-digital classifications (e.g. Murray 1952), over game studies classifications (e.g. Elverdam & Aarseth 2007), to classifications of drinking games (e.g. LaBrie et al. 2013). The analysis aims at three goals: the classifications' internal consistency, the abstraction of classification criteria, and the identification of differences in classification across fields and/or time. Especially the abstraction of classification criteria can be used in future endeavors into the topic of game classifications.

  18. Classification of JET Neutron and Gamma Emissivity Profiles

    Science.gov (United States)

    Craciunescu, T.; Murari, A.; Kiptily, V.; Vega, J.; Contributors, JET

    2016-05-01

    In thermonuclear plasmas, emission tomography uses integrated measurements along lines of sight (LOS) to determine the two-dimensional (2-D) spatial distribution of the volume emission intensity. Due to the availability of only a limited number of views and to the coarse sampling of the LOS, the tomographic inversion is a limited data set problem. Several techniques have been developed for tomographic reconstruction of the 2-D gamma and neutron emissivity on JET. In specific experimental conditions the availability of LOS is restricted to a single view. In this case an explicit reconstruction of the emissivity profile is no longer possible. However, machine learning classification methods can be used to derive the type of the distribution. In the present approach the classification is developed using the theory of belief functions, which provides the support to fuse the results of independent clustering and supervised classification. The method makes it possible to represent the uncertainty of the results provided by different independent techniques, to combine them, and to manage possible conflicts.

  19. Electronic structure classifications using scanning tunneling microscopy conductance imaging

    International Nuclear Information System (INIS)

    Horn, K.M.; Swartzentruber, B.S.; Osbourn, G.C.; Bouchard, A.; Bartholomew, J.W.

    1998-01-01

    The electronic structure of atomic surfaces is imaged by applying multivariate image classification techniques to multibias conductance data measured using scanning tunneling microscopy. Image pixels are grouped into classes according to shared conductance characteristics. The image pixels, when color coded by class, produce an image that chemically distinguishes surface electronic features over the entire area of a multibias conductance image. Such ''classed'' images reveal surface features not always evident in a topograph. This article describes the experimental technique used to record multibias conductance images, how image pixels are grouped in a mathematical, classification space, how a computed grouping algorithm can be employed to group pixels with similar conductance characteristics in any number of dimensions, and finally how the quality of the resulting classed images can be evaluated using a computed, combinatorial analysis of the full dimensional space in which the classification is performed. copyright 1998 American Institute of Physics

  20. Organizational Data Classification Based on the Importance Concept of Complex Networks.

    Science.gov (United States)

    Carneiro, Murillo Guimaraes; Zhao, Liang

    2017-08-01

    Data classification is a common task, which can be performed by both computers and human beings. However, a fundamental difference between them can be observed: computer-based classification considers only physical features (e.g., similarity, distance, or distribution) of input data; by contrast, brain-based classification takes into account not only physical features, but also the organizational structure of data. In this paper, we figure out the data organizational structure for classification using complex networks constructed from training data. Specifically, an unlabeled instance is classified by the importance concept characterized by Google's PageRank measure of the underlying data networks. Before a test data instance is classified, a network is constructed from vector-based data set and the test instance is inserted into the network in a proper manner. To this end, we also propose a measure, called spatio-structural differential efficiency, to combine the physical and topological features of the input data. Such a method allows for the classification technique to capture a variety of data patterns using the unique importance measure. Extensive experiments demonstrate that the proposed technique has promising predictive performance on the detection of heart abnormalities.
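
    A hedged sketch of the importance-based idea: build a k-nearest-neighbour network per class from the training data, insert the unlabeled instance, and assign the class whose network gives the new node the highest PageRank importance. The insertion and scoring rules below are simplified assumptions; the paper's spatio-structural differential efficiency measure is not reproduced.

        # Sketch of importance-based classification: insert a test point into a
        # k-NN network built per class and score it by Google's PageRank measure.
        import numpy as np
        import networkx as nx
        from sklearn.neighbors import NearestNeighbors

        def knn_graph(X, k=3):
            nbrs = NearestNeighbors(n_neighbors=k + 1).fit(X)
            _, idx = nbrs.kneighbors(X)
            g = nx.Graph()
            g.add_nodes_from(range(len(X)))
            for i, row in enumerate(idx):
                g.add_edges_from((i, int(j)) for j in row[1:])  # skip self
            return g

        def classify(x, class_data, k=3):
            scores = {}
            for label, X in class_data.items():
                g = knn_graph(X, k)
                d = np.linalg.norm(X - x, axis=1)
                test = len(X)  # index of the inserted test node
                for j in np.argsort(d)[:k]:
                    # Closer neighbors get stronger links to the inserted node.
                    g.add_edge(test, int(j), weight=1.0 / (d[j] + 1e-12))
                # Class score: PageRank importance assigned to the new node.
                scores[label] = nx.pagerank(g, weight="weight")[test]
            return max(scores, key=scores.get)

        rng = np.random.default_rng(1)
        data = {0: rng.normal(0, 1, (40, 2)), 1: rng.normal(4, 1, (40, 2))}
        print(classify(np.array([3.8, 4.1]), data))  # -> 1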

  1. Some Issues in the Automatic Classification of U.S. Patents Working Notes for the AAAI-98 Workshop on Learning for Text Categorization

    National Research Council Canada - National Science Library

    Larkey, Leah

    1998-01-01

    The classification of U.S. patents poses some special problems due to the enormous size of the corpus, the size and complex hierarchical structure of the classification system, and the size and structure of patent documents...

  2. A Classification Table for Achondrites

    Science.gov (United States)

    Chennaoui-Aoudjehane, H.; Larouci, N.; Jambon, A.; Mittlefehldt, D. W.

    2014-01-01

    Classifying chondrites is relatively easy and the criteria are well documented. It is based on mineral compositions, textural characteristics and, more recently, magnetic susceptibility. It can be more difficult to classify achondrites, especially those that are very similar to terrestrial igneous rocks, because mineralogical, textural and compositional properties can be quite variable. Achondrites contain essentially olivine, pyroxenes, plagioclases, oxides, sulphides and accessory minerals. Their origin is attributed to differentiated parent bodies: large asteroids (Vesta); planets (Mars); a satellite (the Moon); and numerous asteroids of unknown size. In most cases, achondrites are not witnessed falls and some do not have fusion crust. Because some achondrites are similar to terrestrial igneous rocks in mineralogy and magnetic susceptibility, it can be difficult for classifiers to confirm their extraterrestrial origin. We, as classifiers of meteorites, are confronted with this problem with every suspected achondrite we receive for identification. We are developing a "grid" of classification to provide an easier approach for initial classification. We use simple but reproducible criteria based on mineralogical, petrological and geochemical studies. We have presented the classes acapulcoites, lodranites, winonaites and Martian meteorites (shergottites, chassignites, nakhlites). In this work we complete the classification table by including the groups angrites, aubrites, brachinites, ureilites, HED (howardites, eucrites, and diogenites), lunar meteorites, pallasites and mesosiderites. Iron meteorites are not presented in this abstract.

  3. A Robust Geometric Model for Argument Classification

    Science.gov (United States)

    Giannone, Cristina; Croce, Danilo; Basili, Roberto; de Cao, Diego

    Argument classification is the task of assigning semantic roles to syntactic structures in natural language sentences. Supervised learning techniques for frame semantics have recently been shown to benefit from rich sets of syntactic features. However, argument classification is also highly dependent on the semantics of the lexical items involved. Empirical studies have shown that the domain dependence of lexical information causes large performance drops in out-of-domain tests. In this paper a distributional approach is proposed to improve the robustness of the learning model against out-of-domain lexical phenomena.

  4. Revue bibliographique: les méthodes chimiques d'identification et de classification des champignons

    Directory of Open Access Journals (Sweden)

    Verscheure M.

    2002-01-01

    Chemotaxonomy of fungi: a review. In recent years, advances in molecular methods and analytical techniques have enabled scientists to build a classification of microorganisms based on biochemical characteristics. This classification, called chemotaxonomy, encompasses molecular methods and chemical methods which provide additional data and lead to better identification and/or classification.

  5. World Gas Conference 1997. Sub-committee G 2. International classification for the gas industry

    International Nuclear Information System (INIS)

    1997-01-01

    This International Gas Union (IGU) classification is a documentary tool for the transfer of information relative to the gas industry. It is written in English and in French, the official languages of the IGU. In a resource centre, the classification can be used to set up a manual index file of documentary references selected for subsequent information retrieval. In the case of a computer file, the classification can be used for automatic classification of document references selected by field. The IGU classification is divided into ten chapters: O - Arts, science and technologies. A - Manufacture and treatment of gases. B - Gas supply. C - Gas utilizations. D - By-products of manufactured and natural gas. E - Personnel of the gas industry. F - Hygiene and safety in the gas industry. G - Administration and organization of the gas industry. H - Commercial questions of the gas industry. L - Geographical classification. Each chapter is subdivided into subjects, and each subject is further divided according to a hierarchical set of classification codes corresponding to a more detailed approach to the subject concerned. (EG)

  6. Hyperspectral Image Classification Using Kernel Fukunaga-Koontz Transform

    Directory of Open Access Journals (Sweden)

    Semih Dinç

    2013-01-01

    In the experiment section, the improved performance of the HSI classification technique, K-FKT, is tested in comparison with other methods such as the classical FKT and three types of support vector machines (SVMs).

  7. Forensic Analysis of Blue Ball point Pen Inks on Questioned Documents by High Performance Thin Layer Chromatography Technique (HPTLC)

    International Nuclear Information System (INIS)

    Lee, L.C.; Siti Mariam Nunurung; Abdul Aziz Ishak

    2014-01-01

    Nowadays, crimes related to forged documents are increasing. Any erasure, addition or modification in document content usually involves the use of a writing instrument such as a ball point pen. Hence, there is an evident need to develop a fast and accurate ink analysis protocol. This study aimed to determine the discrimination power of the high performance thin layer chromatography (HPTLC) technique for analyzing a set of blue ball point pen inks. Ink samples deposited on paper were extracted using methanol and separated via a solvent mixture of ethyl acetate, methanol and distilled water (70:35:30, v/v/v). With this method, a discrimination power of 89.40% was achieved, confirming that the proposed method is able to differentiate a significant number of pen-pair samples. In addition, the composition of the blue pen inks was found to be homogeneous (RSD < 2.5%), and the proposed method showed good repeatability and reproducibility (RSD < 3.0%). In conclusion, HPTLC is an effective tool for separating blue ball point pen inks. (author)

  8. Understanding the use of standardized nursing terminology and classification systems in published research: A case study using the International Classification for Nursing Practice(®).

    Science.gov (United States)

    Strudwick, Gillian; Hardiker, Nicholas R

    2016-10-01

    In the era of evidence-based healthcare, nursing is required to demonstrate that care provided by nurses is associated with optimal patient outcomes and a high degree of quality and safety. The use of standardized nursing terminologies and classification systems is a way that nursing documentation can be leveraged to generate evidence related to nursing practice. Several widely reported nursing-specific terminologies and classification systems currently exist, including the Clinical Care Classification System, the International Classification for Nursing Practice(®), the Nursing Intervention Classification, the Nursing Outcome Classification, the Omaha System, the Perioperative Nursing Data Set and NANDA International. However, the influence of these systems on demonstrating the value of nursing and the profession's impact on quality, safety and patient outcomes in published research is relatively unknown. This paper seeks to understand the use of standardized nursing terminology and classification systems in published research, using the International Classification for Nursing Practice(®) as a case study. A systematic review of international published empirical studies on, or using, the International Classification for Nursing Practice(®) was completed using Medline and the Cumulative Index for Nursing and Allied Health Literature. Since 2006, 38 studies have been published on the International Classification for Nursing Practice(®). The main objectives of the published studies have been to validate the appropriateness of the classification system for particular care areas or populations, to further develop the classification system, or to utilize it to support the generation of new nursing knowledge. To date, most studies have focused on the classification system itself, and fewer studies have used the system to generate information about the outcomes of nursing practice. Based on the published literature that features the International Classification for Nursing

  9. CLASSIFICATION OF THE MGR SUBSURFACE VENTILATION SYSTEM

    International Nuclear Information System (INIS)

    R.J. Garrett

    1999-01-01

    The purpose of this analysis is to document the Quality Assurance (QA) classification of the Monitored Geologic Repository (MGR) subsurface ventilation system structures, systems and components (SSCs) performed by the MGR Safety Assurance Department. This analysis also provides the basis for revision of YMP/90-55Q, Q-List (YMP 1998). The Q-List identifies those MGR SSCs subject to the requirements of DOE/RW-0333P, ''Quality Assurance Requirements and Description'' (QARD) (DOE 1998)

  10. CLASSIFICATION OF THE MGR EMERGENCY RESPONSE SYSTEM

    International Nuclear Information System (INIS)

    Zeigler, J.A.

    1999-01-01

    The purpose of this analysis is to document the Quality Assurance (QA) classification of the Monitored Geologic Repository (MGR) emergency response system structures, systems and components (SSCs) performed by the MGR Safety Assurance Department. This analysis also provides the basis for revision of YMP/90-55Q, Q-List (YMP 1998). The Q-List identifies those MGR SSCs subject to the requirements of DOE/RW-0333P, ''Quality Assurance Requirements and Description'' (QARD) (DOE 1998)

  11. Compression of Probabilistic XML documents

    NARCIS (Netherlands)

    Veldman, Irma

    2009-01-01

    Probabilistic XML (PXML) files resulting from data integration can become extremely large, which is undesired. For XML there are several techniques available to compress the document, and since probabilistic XML is in fact (a special form of) XML, it might benefit from these methods even more.

  12. Analysis of Different Classification Techniques for Two-Class Functional Near-Infrared Spectroscopy-Based Brain-Computer Interface

    Directory of Open Access Journals (Sweden)

    Noman Naseer

    2016-01-01

    We analyse and compare the classification accuracies of six different classifiers for a two-class mental task (mental arithmetic and rest) using functional near-infrared spectroscopy (fNIRS) signals. The signals of the mental arithmetic and rest tasks from the prefrontal cortex region of the brain for seven healthy subjects were acquired using a multichannel continuous-wave imaging system. After removal of the physiological noises, six features were extracted from the oxygenated hemoglobin (HbO) signals. Two- and three-dimensional combinations of those features were used for classification of mental tasks. In the classification, six different modalities, linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), k-nearest neighbour (kNN), the Naïve Bayes approach, support vector machine (SVM), and artificial neural networks (ANN), were utilized. With these classifiers, the average classification accuracies among the seven subjects for the 2- and 3-dimensional combinations of features were 71.6, 90.0, 69.7, 89.8, 89.5, and 91.4% and 79.6, 95.2, 64.5, 94.8, 95.2, and 96.3%, respectively. ANN showed the maximum classification accuracies: 91.4 and 96.3%. In order to validate the results, a statistical significance test was performed, which confirmed that the p values were statistically significant relative to all of the other classifiers (p < 0.005) using HbO signals.
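
    A minimal sketch of such a six-classifier comparison, with scikit-learn models standing in for the authors' implementations (MLPClassifier approximates the ANN) and synthetic three-feature data taking the place of the HbO feature combinations.

        # Sketch comparing the six classifier families named above on a generic
        # two-class feature matrix; scikit-learn models are stand-ins.
        from sklearn.datasets import make_classification
        from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                                   QuadraticDiscriminantAnalysis)
        from sklearn.model_selection import cross_val_score
        from sklearn.naive_bayes import GaussianNB
        from sklearn.neighbors import KNeighborsClassifier
        from sklearn.neural_network import MLPClassifier
        from sklearn.svm import SVC

        # Stand-in for 3-D combinations of HbO features (mental task vs. rest).
        X, y = make_classification(n_samples=200, n_features=3, n_informative=3,
                                   n_redundant=0, random_state=0)

        classifiers = {
            "LDA": LinearDiscriminantAnalysis(),
            "QDA": QuadraticDiscriminantAnalysis(),
            "kNN": KNeighborsClassifier(5),
            "NB": GaussianNB(),
            "SVM": SVC(kernel="rbf"),
            "ANN": MLPClassifier(hidden_layer_sizes=(20,), max_iter=2000,
                                 random_state=0),
        }
        for name, clf in classifiers.items():
            acc = cross_val_score(clf, X, y, cv=5).mean()
            print(f"{name}: {acc:.3f}")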

  13. A contextual image segmentation system using a priori information for automatic data classification in nuclear physics

    International Nuclear Information System (INIS)

    Benkirane, A.; Auger, G.; Chbihi, A.; Bloyet, D.; Plagnol, E.

    1994-01-01

    This paper presents an original approach to solve an automatic data classification problem by means of image processing techniques. The classification is achieved using image segmentation techniques for extracting the meaningful classes. Two types of information are merged for this purpose: the information contained in experimental images and a priori information derived from underlying physics (and adapted to image segmentation problem). This data fusion is widely used at different stages of the segmentation process. This approach yields interesting results in terms of segmentation performances, even in very noisy cases. Satisfactory classification results are obtained in cases where more ''classical'' automatic data classification methods fail. (authors). 25 refs., 14 figs., 1 append

  14. A contextual image segmentation system using a priori information for automatic data classification in nuclear physics

    Energy Technology Data Exchange (ETDEWEB)

    Benkirane, A; Auger, G; Chbihi, A [Grand Accelerateur National d` Ions Lourds (GANIL), 14 - Caen (France); Bloyet, D [Caen Univ., 14 (France); Plagnol, E [Paris-11 Univ., 91 - Orsay (France). Inst. de Physique Nucleaire

    1994-12-31

    This paper presents an original approach to solve an automatic data classification problem by means of image processing techniques. The classification is achieved using image segmentation techniques for extracting the meaningful classes. Two types of information are merged for this purpose: the information contained in experimental images and a priori information derived from underlying physics (and adapted to image segmentation problem). This data fusion is widely used at different stages of the segmentation process. This approach yields interesting results in terms of segmentation performances, even in very noisy cases. Satisfactory classification results are obtained in cases where more ''classical'' automatic data classification methods fail. (authors). 25 refs., 14 figs., 1 append.

  15. A new classification of post-sternotomy dehiscence

    Science.gov (United States)

    Anger, Jaime; Dantas, Daniel Chagas; Arnoni, Renato Tambellini; Farsky, Pedro Sílvio

    2015-01-01

    Dehiscence after median transsternal sternotomy, used as surgical access for cardiac surgery, is one of its complications and increases patient morbidity and mortality. A variety of surgical techniques have recently been described, resulting in the need for a classification that brings a measure of objectivity to the management of these complex and dangerous wounds. Previous classifications are based on the primary causal infection, but recently the anatomical description of the wound, including its depth and vertical extension, has proved more useful. We propose a new classification based only on the anatomical changes following sternotomy dehiscence and chronic wound formation, separating it into four types according to depth and into two sub-groups according to vertical extension, based on the inferior insertion of the pectoralis major muscle. PMID:25859875

  16. Simple adaptive sparse representation based classification schemes for EEG based brain-computer interface applications.

    Science.gov (United States)

    Shin, Younghak; Lee, Seungchan; Ahn, Minkyu; Cho, Hohyun; Jun, Sung Chan; Lee, Heung-No

    2015-11-01

    One of the main problems related to electroencephalogram (EEG) based brain-computer interface (BCI) systems is the non-stationarity of the underlying EEG signals. This results in the deterioration of the classification performance during experimental sessions. Therefore, adaptive classification techniques are required for EEG based BCI applications. In this paper, we propose simple adaptive sparse representation based classification (SRC) schemes. Supervised and unsupervised dictionary update techniques for new test data and a dictionary modification method by using the incoherence measure of the training data are investigated. The proposed methods are very simple and additional computation for the re-training of the classifier is not needed. The proposed adaptive SRC schemes are evaluated using two BCI experimental datasets. The proposed methods are assessed by comparing classification results with the conventional SRC and other adaptive classification methods. On the basis of the results, we find that the proposed adaptive schemes show relatively improved classification accuracy as compared to conventional methods without requiring additional computation. Copyright © 2015 Elsevier Ltd. All rights reserved.
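
    As a rough illustration of the SRC-plus-adaptation idea, the sketch below codes a test sample over the training dictionary with orthogonal matching pursuit, classifies by the smallest class-wise reconstruction residual, and appends confidently classified test samples to the dictionary. The sparse solver and the confidence rule are assumptions, not the authors' exact schemes.

        # Sketch of sparse representation based classification (SRC) with a
        # simple unsupervised dictionary update.
        import numpy as np
        from sklearn.linear_model import OrthogonalMatchingPursuit

        def src_classify(D, labels, x, n_nonzero=5):
            # Code x over the dictionary D (columns = training samples).
            omp = OrthogonalMatchingPursuit(n_nonzero_coefs=n_nonzero,
                                            fit_intercept=False).fit(D, x)
            coef = omp.coef_
            residuals = {}
            for c in np.unique(labels):
                mask = labels == c
                # Residual using only the coefficients of class c.
                residuals[c] = np.linalg.norm(x - D[:, mask] @ coef[mask])
            return min(residuals, key=residuals.get), residuals

        rng = np.random.default_rng(0)
        D = np.column_stack([rng.normal(m, 1, (30, 20)) for m in (0.0, 2.0)])
        labels = np.repeat([0, 1], 20)
        x = rng.normal(2.0, 1, 30)         # new test sample (class 1-like)

        pred, res = src_classify(D, labels, x)
        print("predicted class:", pred)

        # Unsupervised adaptation: if the decision is confident (large residual
        # margin), append the sample to the dictionary under its predicted label.
        if abs(res[0] - res[1]) > 0.5 * max(res.values()):
            D = np.column_stack([D, x])
            labels = np.append(labels, pred)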

  17. Faults Classification Of Power Electronic Circuits Based On A Support Vector Data Description Method

    Directory of Open Access Journals (Sweden)

    Cui Jiang

    2015-06-01

    Power electronic circuits (PECs) are prone to various failures, whose classification is of paramount importance. This paper presents a data-driven fault diagnosis technique, which employs a support vector data description (SVDD) method to perform fault classification of PECs. In the presented method, fault signals (e.g. currents and voltages) are collected from accessible nodes of the circuits, and then signal processing techniques (e.g. Fourier analysis and wavelet transforms) are adopted to extract feature samples, which are subsequently used to perform offline machine learning. Finally, the SVDD classifier is used to implement the fault classification task. However, in some cases the conventional SVDD cannot achieve good classification performance, because this classifier may generate some so-called refusal areas (RAs); in our design these RAs are resolved with the one-against-one support vector machine (SVM) classifier. The experimental results obtained from simulated and actual circuits demonstrate that the improved SVDD has a classification performance close to the conventional one-against-one SVM and can be applied to fault classification of PECs in practice.
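
    A hedged sketch of the two-stage design: one data description per fault class, with samples accepted by none or several boundaries (the refusal areas) deferred to a one-against-one SVM. scikit-learn's OneClassSVM, which is equivalent to SVDD for an RBF kernel, stands in for a dedicated SVDD implementation, and the feature data are synthetic placeholders.

        # Sketch of SVDD-style fault classification with a multi-class SVM
        # fallback for ambiguous "refusal area" samples.
        import numpy as np
        from sklearn.svm import OneClassSVM, SVC

        rng = np.random.default_rng(0)
        # Stand-in feature samples (e.g. wavelet features) for three fault classes.
        train = {c: rng.normal(3 * c, 1.0, (60, 4)) for c in range(3)}

        boundaries = {c: OneClassSVM(kernel="rbf", gamma=0.2, nu=0.05).fit(X)
                      for c, X in train.items()}
        backup = SVC(decision_function_shape="ovo").fit(
            np.vstack(list(train.values())), np.repeat(list(train.keys()), 60))

        def classify(x):
            accepted = [c for c, b in boundaries.items() if b.predict([x])[0] == 1]
            if len(accepted) == 1:
                return accepted[0]              # exactly one description accepts
            return int(backup.predict([x])[0])  # refusal area: defer to the SVM

        print(classify(rng.normal(3.0, 1.0, 4)))  # likely class 1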

  18. A review on fault classification methodologies in power transmission systems: Part—I

    Directory of Open Access Journals (Sweden)

    Avagaddi Prasad

    2018-05-01

    This paper presents a survey of different fault classification methodologies in transmission lines. Efforts have been made to include almost all the techniques and philosophies reported in the literature. Fault classification is necessary for reliable and high-speed protective relaying followed by digital distance protection; hence, a suitable review of these methods is needed. The contribution consists of two parts; this is Part 1 of the series. Part 1 gives a brief introduction to faults in transmission lines and reviews the scope of various earlier approaches in this field. Part 2 will focus on and present newly developed approaches. Keywords: Fault, Fault classification, Protection, Soft computing techniques, Transmission lines

  19. Text Categorization Using Weight Adjusted k-Nearest Neighbor Classification

    National Research Council Canada - National Science Library

    Han, Euihong; Karypis, George; Kumar, Vipin

    1999-01-01

    .... The authors present a nearest neighbor classification scheme for text categorization in which the importance of discriminating words is learned using mutual information and weight adjustment techniques...

  20. Multi-Temporal Land Cover Classification with Long Short-Term Memory Neural Networks

    Science.gov (United States)

    Rußwurm, M.; Körner, M.

    2017-05-01

    Land cover classification (LCC) is a central and wide field of research in earth observation and has already put forth a variety of classification techniques. Many approaches are based on classification techniques considering observations at single points in time. However, some land cover classes, such as crops, change their spectral characteristics due to environmental influences and can thus not be monitored effectively with classical mono-temporal approaches. Nevertheless, these temporal observations should be utilized to benefit the classification process. After extensive research has been conducted on modeling temporal dynamics by spectro-temporal profiles using vegetation indices, we propose a deep learning approach to utilize these temporal characteristics for classification tasks. In this work, we show how long short-term memory (LSTM) neural networks can be employed for crop identification purposes with SENTINEL 2A observations from large study areas and label information provided by local authorities. We compare these temporal neural network models, i.e., LSTM and recurrent neural network (RNN), with a classical non-temporal convolutional neural network (CNN) model and an additional support vector machine (SVM) baseline. With our rather straightforward LSTM variant, we exceeded state-of-the-art classification performance, thus opening promising potential for further research.
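
    A minimal sketch of the LSTM variant's overall shape: a recurrent layer folds a per-pixel sequence of spectral observations into a crop-class prediction. Band count, sequence length, hidden size and class count are illustrative assumptions, not the paper's configuration.

        # Minimal sketch of LSTM-based multi-temporal classification: a sequence
        # of per-date spectral vectors is folded into a crop-class prediction.
        import torch
        import torch.nn as nn

        class CropLSTM(nn.Module):
            def __init__(self, n_bands=10, hidden=64, n_classes=8):
                super().__init__()
                self.lstm = nn.LSTM(n_bands, hidden, batch_first=True)
                self.head = nn.Linear(hidden, n_classes)

            def forward(self, x):            # x: (batch, time_steps, n_bands)
                _, (h, _) = self.lstm(x)     # h: final hidden state per sequence
                return self.head(h[-1])      # class logits

        model = CropLSTM()
        # A batch of 4 pixels, each observed on 26 SENTINEL 2A-like dates.
        logits = model(torch.randn(4, 26, 10))
        print(logits.shape)                  # torch.Size([4, 8])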

  2. A Survey in Indexing and Searching XML Documents.

    Science.gov (United States)

    Luk, Robert W. P.; Leong, H. V.; Dillon, Tharam S.; Chan, Alvin T. S.; Croft, W. Bruce; Allan, James

    2002-01-01

    Discussion of XML focuses on indexing techniques for XML documents, grouping them into flat-file, semistructured, and structured indexing paradigms. Highlights include searching techniques, including full text search and multistage search; search result presentations; database and information retrieval system integration; XML query languages; and…

  3. Contribution of non-negative matrix factorization to the classification of remote sensing images

    Science.gov (United States)

    Karoui, M. S.; Deville, Y.; Hosseini, S.; Ouamri, A.; Ducrot, D.

    2008-10-01

    Remote sensing has become an unavoidable tool for better managing our environment, generally by producing maps of land cover using classification techniques. The classification process requires some pre-processing, especially for data size reduction. The most usual technique is Principal Component Analysis. Another approach consists in regarding each pixel of the multispectral image as a mixture of pure elements contained in the observed area. Using Blind Source Separation (BSS) methods, one can hope to unmix each pixel and to perform the recognition of the classes constituting the observed scene. Our contribution consists in using Non-negative Matrix Factorization (NMF) combined with sparse coding as a solution to BSS, in order to generate new images (which are at least partly separated images) using HRV SPOT images from the Oran area, Algeria. These images are then used as inputs of a supervised classifier integrating textural information. The classification results for these "separated" images show a clear improvement (correct pixel classification rate improved by more than 20%) compared to the classification of the initial (i.e. non-separated) images. These results show the contribution of NMF as an attractive pre-processing step for the classification of multispectral remote sensing imagery.
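
    The pre-processing idea can be sketched as follows, under stated assumptions: a sparsity-regularized NMF factors pixel spectra into non-negative endmember abundances, whose maps would then feed the supervised classifier. The synthetic mixing data and regularization settings are placeholders, not the authors' setup.

        # Sketch of NMF as a classification pre-processing step: each pixel's
        # spectrum is factored into non-negative abundances of a few "pure"
        # endmembers, and the abundance maps feed the supervised classifier.
        import numpy as np
        from sklearn.decomposition import NMF

        rng = np.random.default_rng(0)
        n_pixels, n_bands, n_endmembers = 1000, 4, 3  # e.g. HRV SPOT-like bands

        # Synthetic scene: non-negative mixtures of three endmember spectra.
        endmembers = rng.uniform(0.1, 1.0, (n_endmembers, n_bands))
        abundances = rng.dirichlet(np.ones(n_endmembers), size=n_pixels)
        X = abundances @ endmembers + 0.01 * rng.random((n_pixels, n_bands))

        # Sparse-regularized NMF: X ~ W @ H, with W the per-pixel abundances.
        model = NMF(n_components=n_endmembers, init="nndsvda",
                    alpha_W=0.01, l1_ratio=1.0, max_iter=500, random_state=0)
        W = model.fit_transform(X)  # "separated" images, one column per endmember
        H = model.components_       # recovered endmember spectra
        print(W.shape, H.shape)     # (1000, 3) (3, 4)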

  4. Classification of uranium reserves/resources

    International Nuclear Information System (INIS)

    1998-08-01

    Projections of the future availability of uranium to meet present and future nuclear power requirements depend on the reliability of uranium resource estimates. The lack of harmonized definitions of the different classes of uranium reserves and resources between countries makes the compilation and analysis of such information difficult. The problem was accentuated in the early 1990s with the entry of uranium producing countries from the former Soviet Union, eastern Europe and China into the world uranium supply market. The need for an internationally acceptable reserve/resource classification system and terminology using market based criteria is therefore obvious. This publication was compiled from participants' contributions and the findings of the Consultants Meetings on Harmonization of Uranium Resource Assessment Concepts held in Vienna from 22 to 25 June 1992, and two Consultants Meetings on the Development of a More Meaningful Classification of Uranium Resources held in Kiev, Ukraine on 24-26 April 1995 and 20-23 August 1996. This document includes 11 contributions, a summary, and the list of participants of the Consultants Meetings. Each contribution has been indexed and provided with an abstract

  5. Rule-guided human classification of Volunteered Geographic Information

    Science.gov (United States)

    Ali, Ahmed Loai; Falomir, Zoe; Schmid, Falko; Freksa, Christian

    2017-05-01

    During the last decade, web technologies and location-sensing devices have evolved, generating a form of crowdsourcing known as Volunteered Geographic Information (VGI). VGI acts as a platform for spatial data collection, in particular when a group of public participants is involved in collaborative mapping activities: they work together to collect, share, and use information about geographic features. VGI exploits participants' local knowledge to produce rich data sources. However, the resulting data inherit problematic data classification. In VGI projects, the challenges of data classification are due to the following: (i) data are likely prone to subjective classification; (ii) most projects rely on remote contributions and flexible contribution mechanisms; and (iii) spatial data are uncertain and the definitions of geographic features are not strict. These factors lead to various forms of problematic classification: inconsistent, incomplete, and imprecise data classification. This research addresses classification appropriateness. Whether the classification of an entity is appropriate or inappropriate is related to quantitative and/or qualitative observations. Small differences between observations may not be recognizable, particularly for non-expert participants. Hence, in this paper, the problem is tackled by developing a rule-guided classification approach. This approach exploits the data mining techniques of Association Classification (AC) to extract descriptive (qualitative) rules for specific geographic features. The rules are extracted based on the investigation of qualitative topological relations between target features and their context. Afterwards, the extracted rules are used to develop a recommendation system able to guide participants to the most appropriate classification. The approach proposes two scenarios to guide participants towards enhancing the quality of data classification. An empirical study is conducted to investigate the classification of grass

  6. Application of InterPro for the functional classification of the proteins ...

    Indian Academy of Sciences (India)

    InterPro (http://www.ebi.ac.uk/interpro/) is an integrated documentation resource for protein families, domains and sites, developed initially as a means of rationalizing the complementary efforts of the PROSITE, PRINTS, Pfam and ProDom database projects. It is a useful resource that aids the functional classification of ...

  7. Transforming landscape ecological evaluations using sub-pixel remote sensing classifications: A study of invasive saltcedar (Tamarix spp.)

    Science.gov (United States)

    Frazier, Amy E.

    Invasive species disrupt landscape patterns and compromise the functionality of ecosystem processes. Non-native saltcedar (Tamarix spp.) poses significant threats to native vegetation and groundwater resources in the southwestern U.S. and Mexico, and quantifying spatial and temporal distribution patterns is essential for monitoring its spread. Advanced remote sensing classification techniques such as sub-pixel classifications are able to detect and discriminate saltcedar from native vegetation with high accuracy, but these types of classifications are not compatible with landscape metrics, which are the primary tool available for statistically assessing distribution patterns, because they do not have discrete class boundaries. The objective of this research is to develop new methods that allow sub-pixel classifications to be analyzed using landscape metrics. The research will be carried out through three specific aims: (1) develop and test a method to transform continuous sub-pixel classifications into categorical representations that are compatible with widely used landscape metric tools, (2) establish a gradient-based concept of landscape using sub-pixel classifications and the technique developed in the first objective to explore the relationships between pattern and process, and (3) generate a new super-resolution mapping technique method to predict the spatial locations of fractional land covers within a pixel. Results show that the threshold gradient method is appropriate for discretizing sub-pixel data, and can be used to generate increased information about the landscape compared to traditional single-value metrics. Additionally, the super-resolution classification technique was also able to provide detailed sub-pixel mapping information, but additional work will be needed to develop rigorous validation and accuracy assessment techniques.

  8. Calibration of a Plastic Classification System with the CCW Model; Calibracion de un Sistema de Clasification de Plasticos segun el Modelo CCW

    Energy Technology Data Exchange (ETDEWEB)

    Barcala Riveira, J M; Fernandez Marron, J L; Alberdi Primicia, J; Navarrete Marin, J J; Oller Gonzalez, J C

    2003-07-01

    This document describes the calibration of a plastic classification system with the CCW model (Classification by Quaternions built from Wavelet Coefficients). The method is applied to spectra of plastics usually present in domestic waste. The results obtained are shown. (Author) 16 refs.

  9. Sensor Data Acquisition and Processing Parameters for Human Activity Classification

    Directory of Open Access Journals (Sweden)

    Sebastian D. Bersch

    2014-03-01

    It is known that parameter selection for the data sampling frequency and segmentation technique (including different methods and window sizes) has an impact on the classification accuracy. For Ambient Assisted Living (AAL), no clear information on how to select these parameters exists; hence a wide variety and inconsistency across today's literature is observed. This paper presents an empirical investigation of different data sampling rates, segmentation techniques and segmentation window sizes, and their effect on the accuracy of Activity of Daily Living (ADL) event classification and on computational load, for two different accelerometer sensor datasets. The study is conducted using an ANalysis Of VAriance (ANOVA) based on 32 different window sizes, three different segmentation algorithms (with and without overlap, totaling six different parameter settings) and six sampling frequencies, for nine common classification algorithms. The classification accuracy is based on a feature vector consisting of Root Mean Square (RMS), Mean, Signal Magnitude Area (SMA), Signal Vector Magnitude (here SMV), Energy, Entropy, FFTPeak, and Standard Deviation (STD). The results are presented alongside recommendations for parameter selection, on the basis of the best performing parameter combinations identified by means of the corresponding Pareto curve.
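
    As an illustration of the parameters under study, the sketch below segments a tri-axial accelerometer stream with a sliding window (window size and overlap being two of the varied parameters) and computes several of the listed features. The feature definitions follow common AAL usage and may differ in detail from those used in the paper.

        # Sketch of windowed segmentation and part of the feature vector named
        # above (RMS, mean, SMA, SMV, energy) for a tri-axial accelerometer.
        import numpy as np

        def segment(signal, window, overlap=0.5):
            # signal: (n_samples, 3) accelerometer data; yields fixed windows.
            step = max(1, int(window * (1 - overlap)))
            for start in range(0, len(signal) - window + 1, step):
                yield signal[start:start + window]

        def features(win):
            rms = np.sqrt((win ** 2).mean(axis=0))        # per-axis RMS
            mean = win.mean(axis=0)                       # per-axis mean
            sma = np.abs(win).sum() / len(win)            # signal magnitude area
            smv = np.sqrt((win ** 2).sum(axis=1)).mean()  # signal vector magnitude
            energy = (win ** 2).sum(axis=0) / len(win)    # per-axis energy
            return np.hstack([rms, mean, sma, smv, energy])

        rng = np.random.default_rng(0)
        acc = rng.normal(0, 1, (1000, 3))   # stand-in 3-axis recording
        X = np.array([features(w) for w in segment(acc, window=128, overlap=0.5)])
        print(X.shape)                      # (14, 11): one row per window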

  10. Efficient Feature Selection and Classification of Protein Sequence Data in Bioinformatics

    Science.gov (United States)

    Faye, Ibrahima; Samir, Brahim Belhaouari; Md Said, Abas

    2014-01-01

    Bioinformatics has been an emerging area of research for the last three decades. The ultimate aims of bioinformatics are to store and manage biological data and to develop and analyze computational tools to enhance their understanding. The size of the data accumulated under various sequencing projects is increasing exponentially, which presents difficulties for the experimental methods. To reduce the gap between newly sequenced proteins and proteins with known functions, many computational techniques involving classification and clustering algorithms have been proposed. The classification of protein sequences into existing superfamilies is helpful in predicting the structure and function of large amounts of newly discovered proteins. Existing classification results are unsatisfactory due to the huge size of the features obtained through various feature encoding methods. In this work, a statistical metric-based feature selection technique is proposed in order to reduce the size of the extracted feature vector. The proposed method of protein classification shows significant improvement in terms of performance measure metrics: accuracy, sensitivity, specificity, recall, F-measure, and so forth. PMID:25045727
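
    A minimal sketch of the encode-select-classify pipeline, assuming a simple amino-acid-composition encoding and using an ANOVA F-score in place of the paper's statistical metric; the toy sequences and the cut-off k are placeholders.

        # Sketch of feature encoding, metric-based feature selection and
        # classification for protein sequences.
        import numpy as np
        from sklearn.feature_selection import SelectKBest, f_classif
        from sklearn.pipeline import make_pipeline
        from sklearn.svm import SVC

        AMINO = "ACDEFGHIKLMNPQRSTVWY"

        def composition(seq):
            # 20-dimensional amino-acid composition of one sequence.
            return np.array([seq.count(a) for a in AMINO]) / max(len(seq), 1)

        rng = np.random.default_rng(0)
        def fake_seq(bias):     # toy sequences biased toward one residue
            return "".join(rng.choice(list(AMINO), 80)) + bias * 20

        X = np.array([composition(fake_seq("K")) for _ in range(50)] +
                     [composition(fake_seq("W")) for _ in range(50)])
        y = np.repeat([0, 1], 50)   # two "superfamilies"

        # Keep only the k highest-scoring features, then classify.
        clf = make_pipeline(SelectKBest(f_classif, k=8), SVC())
        print(clf.fit(X, y).score(X, y))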

  11. Event classification and optimization methods using artificial intelligence and other relevant techniques: Sharing the experiences

    Science.gov (United States)

    Mohamed, Abdul Aziz; Hasan, Abu Bakar; Ghazali, Abu Bakar Mhd.

    2017-01-01

    Classification of large data into their respective classes or groups can be carried out with the help of artificial intelligence (AI) tools readily available in the market. To get the optimum or best results, optimization tools can be applied to those data. Classification and optimization have been used by researchers throughout their work, and the outcomes were very encouraging indeed. Here, the authors share what they have experienced in three different areas of applied research.

  12. Manual of Documentation Practices Applicable to Defence-Aerospace Scientific and Technical Information. Volume 1. Section 1 - Acquisition and Sources. Section 2 - Descriptive Cataloguing. Section 3 - Abstracting and Subject Analysis

    Science.gov (United States)

    1978-08-01

    be added for new subcategories. The Dewey Decimal Classification, the Library of Congress Classification of the United States, and the Universal... to be published. A translation duplicate is a translation of a report or an article into another language. DDC (DOD/USA) Defense Documentation Center... controls need not be elaborate. The system described below has proved adequate over a number of years of operation at the Defense Documentation Center (DDC).

  13. Spectrum analysis on quality requirements consideration in software design documents.

    Science.gov (United States)

    Kaiya, Haruhiko; Umemura, Masahiro; Ogata, Shinpei; Kaijiri, Kenji

    2013-12-01

    Software quality requirements defined in the requirements analysis stage should be implemented in the final products, such as source codes and system deployment. To guarantee this meta-requirement, quality requirements should be considered in the intermediate stages, such as the design stage or the architectural definition stage. We propose a novel method for checking whether quality requirements are considered in the design stage. In this method, a technique called "spectrum analysis for quality requirements" is applied not only to requirements specifications but also to design documents. The technique enables us to derive the spectrum of a document, and quality requirements considerations in the document are numerically represented in the spectrum. We can thus objectively identify whether the considerations of quality requirements in a requirements document are adapted to its design document. To validate the method, we applied it to commercial software systems with the help of a supporting tool, and we confirmed that the method worked well.

  14. Projected estimators for robust semi-supervised classification

    NARCIS (Netherlands)

    Krijthe, J.H.; Loog, M.

    2017-01-01

    For semi-supervised techniques to be applied safely in practice we at least want methods to outperform their supervised counterparts. We study this question for classification using the well-known quadratic surrogate loss function. Unlike other approaches to semi-supervised learning, the

  15. Fossil Signatures Using Elemental Abundance Distributions and Bayesian Probabilistic Classification

    Science.gov (United States)

    Hoover, Richard B.; Storrie-Lombardi, Michael C.

    2004-01-01

    Elemental abundances (C6, N7, O8, Na11, Mg12, Al13, P15, S16, Cl17, K19, Ca20, Ti22, Mn25, Fe26, and Ni28) were obtained for a set of terrestrial fossils and the rock matrix surrounding them. Principal Component Analysis extracted five factors accounting for 92.5% of the data variance (i.e. the information content) of the elemental abundance data. Hierarchical Cluster Analysis provided unsupervised sample classification distinguishing fossil from matrix samples on the basis of either raw abundances or PCA input that agreed strongly with visual classification. A stochastic, non-linear Artificial Neural Network produced a Bayesian probability of correct sample classification. The results provide a quantitative probabilistic methodology for discriminating terrestrial fossils from the surrounding rock matrix using chemical information. To demonstrate the applicability of these techniques to the assessment of meteoritic samples or in situ extraterrestrial exploration, we present preliminary data on samples of the Orgueil meteorite. In both systems an elemental signature produces target classification decisions remarkably consistent with morphological classification by a human expert using only structural (visual) information. We discuss the possibility of implementing a complexity analysis metric capable of automating certain image analysis and pattern recognition abilities of the human eye using low-magnification optical microscopy images, and discuss the extension of this technique across multiple scales.
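
    The pipeline lends itself to a compact sketch: PCA to extract the dominant variance factors, hierarchical clustering for unsupervised fossil/matrix separation, and a small neural network yielding class-membership probabilities. The elemental data below are synthetic stand-ins, and MLPClassifier's probabilistic output only approximates the stochastic Bayesian ANN described.

        # Sketch of PCA + hierarchical clustering + probabilistic classification
        # on per-sample elemental abundance vectors.
        import numpy as np
        from scipy.cluster.hierarchy import fcluster, linkage
        from sklearn.decomposition import PCA
        from sklearn.neural_network import MLPClassifier

        rng = np.random.default_rng(0)
        # Synthetic stand-in: 15 elemental abundances per sample.
        fossil = rng.normal(1.0, 0.2, (25, 15))
        matrix = rng.normal(0.5, 0.2, (25, 15))
        X = np.vstack([fossil, matrix])
        y = np.repeat([1, 0], 25)           # 1 = fossil, 0 = rock matrix

        pca = PCA(n_components=5).fit(X)    # five factors, as in the abstract
        pcs = pca.transform(X)

        # Unsupervised: cut Ward's dendrogram into two clusters.
        clusters = fcluster(linkage(pcs, method="ward"), t=2, criterion="maxclust")
        print("cluster sizes:", np.bincount(clusters)[1:])

        # Supervised: a small ANN whose softmax output plays the role of a
        # probability of correct sample classification.
        ann = MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000,
                            random_state=0).fit(pcs, y)
        new_sample = rng.normal(1.0, 0.2, (1, 15))
        print("P(matrix), P(fossil):", ann.predict_proba(pca.transform(new_sample)))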

  16. MULTI-TEMPORAL CLASSIFICATION AND CHANGE DETECTION USING UAV IMAGES

    Directory of Open Access Journals (Sweden)

    S. Makuti

    2018-05-01

    In this paper, different methodologies for the classification and change detection of UAV image blocks are explored. The UAV is not only the cheapest platform for image acquisition but also the easiest platform to operate in repeated data collections over a changing area like a building construction site. Two change detection techniques have been evaluated in this study: the pre-classification and the post-classification algorithms. These methods are based on three main steps: feature extraction, classification and change detection. A set of state-of-the-art features has been used in the tests: colour features (HSV), textural features (GLCM) and 3D geometric features. For classification purposes, a Conditional Random Field (CRF) has been used: the unary potential was determined using the Random Forest algorithm, while the pairwise potential was defined by the fully connected CRF. In the performed tests, different feature configurations and settings have been considered to assess the performance of these methods in such a challenging task. Experimental results showed that the post-classification approach outperforms the pre-classification change detection method: in terms of overall accuracy, post-classification reached up to 62.6% while pre-classification change detection reached 46.5%. These results represent a first useful indication for future works and developments.

  17. Data Processing And Machine Learning Methods For Multi-Modal Operator State Classification Systems

    Science.gov (United States)

    Hearn, Tristan A.

    2015-01-01

    This document is intended as an introduction to a set of common signal processing and machine learning methods that may be used in the software portion of a functional crew state monitoring system. This includes overviews of both the theory of the methods involved and examples of implementation. Practical considerations are discussed for implementing modular, flexible, and scalable processing and classification software for a multi-modal, multi-channel monitoring system. Example source code is also given for all of the discussed processing and classification methods.

  18. Applying machine-learning techniques to Twitter data for automatic hazard-event classification.

    Science.gov (United States)

    Filgueira, R.; Bee, E. J.; Diaz-Doce, D.; Poole, J., Sr.; Singh, A.

    2017-12-01

    The constant flow of information offered by tweets provides valuable information about all sorts of events at a high temporal and spatial resolution. Over the past year we have been analyzing geological hazards/phenomena, such as earthquakes, volcanic eruptions, landslides, floods or the aurora, in real time as part of the GeoSocial project, by geo-locating tweets filtered by keywords in a web map. However, not all the filtered tweets are related to hazard/phenomenon events. This work explores two classification techniques for automatic hazard-event categorization based on tweets about the "Aurora". First, tweets were filtered using aurora-related keywords, removing stop words and selecting the ones written in English. For classifying the remaining tweets into "aurora-event" and "no-aurora-event" categories, we compared two state-of-the-art techniques: Support Vector Machine (SVM) and Deep Convolutional Neural Network (CNN) algorithms. Both approaches belong to the family of supervised learning algorithms, which make predictions based on a labelled training dataset. Therefore, we created a training dataset by tagging 1200 tweets between the two categories. The general form of the SVM is used to separate the two classes by a function (kernel). We compared the performance of four different kernels (Linear Regression, Logistic Regression, Multinomial Naïve Bayes and Stochastic Gradient Descent) provided by the Scikit-Learn library, using our training dataset to build the SVM classifier. The results showed that Logistic Regression (LR) achieves the best accuracy (87%). So, we selected the SVM-LR classifier to categorise a large collection of tweets using the "dispel4py" framework. Later, we developed a CNN classifier, in which the first layer embeds words into low-dimensional vectors. The next layer performs convolutions over the embedded word vectors. Results from the convolutional layer are max-pooled into a long feature vector, which is classified using a softmax layer. The CNN's accuracy
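
    The SVM-LR stage can be sketched along the following lines with Scikit-Learn, assuming a TF-IDF text representation (the abstract does not state the exact features); the example tweets and labels are invented placeholders.

        # Sketch of the keyword-filter + classifier stage: TF-IDF features with
        # a logistic-regression linear model, mirroring the SVM-LR choice above.
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.linear_model import LogisticRegression
        from sklearn.pipeline import make_pipeline

        tweets = [
            "amazing aurora borealis dancing over Tromso right now",
            "green lights across the whole sky, aurora visible to the north",
            "Aurora is my favourite Disney princess tbh",
            "new Aurora album dropping next week, can't wait",
        ]
        labels = [1, 1, 0, 0]   # 1 = "aurora-event", 0 = "no-aurora-event"

        clf = make_pipeline(
            TfidfVectorizer(ngram_range=(1, 2), stop_words="english"),
            LogisticRegression())
        clf.fit(tweets, labels)
        # Likely [1], i.e. "aurora-event".
        print(clf.predict(["huge display of lights visible over Scotland tonight"]))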

  19. A classification of user-generated content into consumer decision journey stages.

    Science.gov (United States)

    Vázquez, Silvia; Muñoz-García, Óscar; Campanella, Inés; Poch, Marc; Fisas, Beatriz; Bel, Nuria; Andreu, Gloria

    2014-10-01

    In the last decades, the availability of digital user-generated documents from social media has dramatically increased. This massive growth of user-generated content has also affected traditional shopping behaviour. Customers have embraced new communication channels such as microblogs and social networks that enable them not only to talk with friends and acquaintances about their shopping experience, but also to search for opinions expressed by complete strangers as part of their decision-making processes. Uncovering how customers feel about specific products or brands and detecting purchase habits and preferences has traditionally been a costly and highly time-consuming task which involved the use of methods such as focus groups and surveys. However, the new scenario calls for a deep assessment of current market research techniques in order to better interpret and profit from this ever-growing stream of attitudinal data. With this purpose, we present a novel analysis and classification of user-generated content in terms of its belonging to one of the four stages of the Consumer Decision Journey (Court et al., 2009), i.e. the purchase process from the moment when a customer is aware of the existence of the product to the moment when he or she buys, experiences and talks about it. Using a corpus of short texts written in English and Spanish and extracted from different social media, we identify a set of linguistic patterns for each purchase stage that are then used in a rule-based classifier. Additionally, we use machine learning algorithms to automatically identify business indicators such as the Marketing Mix elements (McCarthy and Brogowicz, 1981). The classification of the purchase stages achieves an average precision of 74%. The proposed classification of texts depending on the Marketing Mix elements expressed achieved an average precision of 75% for all the elements analysed. Copyright © 2014 Elsevier Ltd. All rights reserved.
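
    The following toy sketch illustrates a pattern-based stage classifier of the kind described; the regular expressions are invented placeholders, not the linguistic patterns identified in the paper.

```python
# Toy rule-based classifier for purchase stages; patterns are illustrative only.
import re

STAGE_PATTERNS = {
    "awareness":  [r"\bjust (saw|heard about)\b", r"\bnew\b.*\bphone\b"],
    "evaluation": [r"\bshould i (buy|get)\b", r"\bvs\.?\b", r"\bany recommendations\b"],
    "purchase":   [r"\bjust (bought|ordered)\b", r"\bfinally got\b"],
    "experience": [r"\b(love|hate|returning) (it|my)\b", r"\bafter a (week|month)\b"],
}

def classify_stage(text):
    """Return the stage whose patterns match the text most often."""
    scores = {stage: sum(bool(re.search(p, text.lower())) for p in pats)
              for stage, pats in STAGE_PATTERNS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unclassified"

print(classify_stage("Just bought the new X200, finally got it!"))  # -> purchase
```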

  20. Neural network classification of sweet potato embryos

    Science.gov (United States)

    Molto, Enrique; Harrell, Roy C.

    1993-05-01

    Somatic embryogenesis is a process that allows for the in vitro propagation of thousands of plants in sub-liter size vessels and has been successfully applied to many significant species. The heterogeneity of maturity and quality of embryos produced with this technique requires sorting to obtain a uniform product. An automated harvester is being developed at the University of Florida to sort embryos in vitro at different stages of maturation in a suspension culture. The system utilizes machine vision to characterize embryo morphology and a fluidic based separation device to isolate embryos associated with a pre-defined, targeted morphology. Two different backpropagation neural networks (BNN) were used to classify embryos based on information extracted from the vision system. One network utilized geometric features such as embryo area, length, and symmetry as inputs. The alternative network utilized polar coordinates of an embryo's perimeter with respect to its centroid as inputs. The performances of both techniques were compared with each other and with an embryo classification method based on linear discriminant analysis (LDA). Similar results were obtained with all three techniques. Classification efficiency was improved by reducing the dimension of the feature vector through a forward stepwise analysis by LDA. In order to enhance the purity of the sample selected as harvestable, a reject-to-classify option was introduced in the model and analyzed. The best classifier performances (76% overall correct classifications, 75% harvestable objects properly classified, homogeneity improvement ratio 1.5) were obtained using 8 features in a BNN.
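
    A hedged sketch of such a comparison, using scikit-learn's MLPClassifier as a stand-in for the backpropagation networks and LinearDiscriminantAnalysis for LDA; the synthetic features merely stand in for the geometric embryo descriptors.

```python
# Compare a small backpropagation network with LDA on synthetic features.
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=300, n_features=8, n_informative=5,
                           n_classes=2, random_state=0)  # toy "embryo" features
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

bnn = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
lda = LinearDiscriminantAnalysis()
for name, model in [("BNN", bnn), ("LDA", lda)]:
    model.fit(X_tr, y_tr)
    print(name, "accuracy:", model.score(X_te, y_te))
```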

  1. Automatic modulation classification principles, algorithms and applications

    CERN Document Server

    Zhu, Zhechen

    2014-01-01

    Automatic Modulation Classification (AMC) has been a key technology in many military, security, and civilian telecommunication applications for decades. In military and security applications, modulation often serves as another level of encryption; in modern civilian applications, multiple modulation types can be employed by a signal transmitter to control the data rate and link reliability. This book offers comprehensive documentation of AMC models, algorithms and implementations for successful modulation recognition. It provides an invaluable theoretical and numerical comparison of AMC algo

  2. Improved Management of Part Safety Classification System for Nuclear Power Plant

    Energy Technology Data Exchange (ETDEWEB)

    Park, Jin Young; Park, Youn Won; Park, Heung Gyu; Park, Hyo Chan [BEES Inc., Daejeon (Korea, Republic of)

    2016-10-15

    As, in recent years, many quality assurance (QA) related incidents, such as falsely certified parts and forged documentation, were reported in association with the supply of structures, systems, components and parts to nuclear power plants, a need was raised for better management of the safety classification system, so that it would be based more on the level of individual parts. Presently, the Korean nuclear power plants do not develop and apply relevant procedures for safety classification; rather, the safety classes of parts are determined solely based on the experience of equipment designers. This paper therefore proposes a better management plan for the safety equipment classification system, with the aim of strengthening quality management for parts. The plan was developed through the analysis of newly introduced technical criteria to be applied to parts of nuclear power plants.

  3. DTI measurements for Alzheimer’s classification

    Science.gov (United States)

    Maggipinto, Tommaso; Bellotti, Roberto; Amoroso, Nicola; Diacono, Domenico; Donvito, Giacinto; Lella, Eufemia; Monaco, Alfonso; Antonella Scelsi, Marzia; Tangaro, Sabina; Disease Neuroimaging Initiative, Alzheimer's.

    2017-03-01

    Diffusion tensor imaging (DTI) is a promising imaging technique that provides insight into white matter microstructure integrity, and it has greatly helped to identify white matter regions affected by Alzheimer's disease (AD) in its early stages. DTI can therefore be a valuable source of information when designing machine-learning strategies to discriminate between healthy control (HC) subjects, AD patients and subjects with mild cognitive impairment (MCI). Nonetheless, several studies have so far reported conflicting results, especially because of the adoption of biased feature selection strategies. In this paper we first analyzed DTI scans of 150 subjects from the Alzheimer's disease neuroimaging initiative (ADNI) database. We measured a significant effect of the feature selection bias on the classification performance, and then assessed the informative content provided by DTI measurements for AD classification. Classification performances and the biological insight, concerning brain regions related to the disease, provided by cross-validation analysis were both confirmed on the independent test.

  4. Automated classification of Permanent Scatterers time-series based on statistical characterization tests

    Science.gov (United States)

    Berti, Matteo; Corsini, Alessandro; Franceschini, Silvia; Iannacone, Jean Pascal

    2013-04-01

    The application of space-borne synthetic aperture radar interferometry has progressed, over the last two decades, from the pioneering use of single interferograms for analyzing changes on the earth's surface to the development of advanced multi-interferogram techniques to analyze any sort of natural phenomenon which involves movements of the ground. The success of multi-interferogram techniques in the analysis of natural hazards such as landslides and subsidence is widely documented in the scientific literature and demonstrated by the consensus among end-users. Despite the great potential of this technique, radar interpretation of slope movements is generally based on the sole analysis of average displacement velocities, while the information embraced in multi-interferogram time series is often overlooked if not completely neglected. The underuse of PS time series is probably due to the detrimental effect of residual atmospheric errors, which leave the PS time series affected by erratic, irregular fluctuations that are often difficult to interpret, and also to the difficulty of performing a visual, supervised analysis of the time series for a large dataset. In this work we present a procedure for automatic classification of PS time series based on a series of statistical characterization tests. The procedure classifies the time series into six distinctive target trends (0 = uncorrelated; 1 = linear; 2 = quadratic; 3 = bilinear; 4 = discontinuous without constant velocity; 5 = discontinuous with change in velocity) and retrieves for each trend a series of descriptive parameters which can be efficiently used to characterize the temporal changes of ground motion. The classification algorithms were developed and tested using an ENVISAT dataset available in the frame of the EPRS-E project (Extraordinary Plan of Environmental Remote Sensing) of the Italian Ministry of Environment (track "Modena", Northern Apennines). This dataset was generated using standard processing, then the
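
    The sketch below conveys the flavor of such a trend classification by fitting competing polynomial models to a displacement series and selecting the one with the lowest Bayesian information criterion; it is a simplification covering only the first three target trends, not the authors' actual battery of statistical tests.

```python
# Simplified trend classification of a displacement time series via BIC.
import numpy as np

def trend_class(t, d):
    """Return 'uncorrelated', 'linear' or 'quadratic' for series d(t)."""
    n = len(d)
    best, best_bic = None, np.inf
    for name, deg in [("uncorrelated", 0), ("linear", 1), ("quadratic", 2)]:
        coeffs = np.polyfit(t, d, deg)
        resid = d - np.polyval(coeffs, t)
        rss = np.sum(resid ** 2)
        bic = n * np.log(rss / n) + (deg + 1) * np.log(n)  # Gaussian BIC
        if bic < best_bic:
            best, best_bic = name, bic
    return best

t = np.linspace(0, 5, 60)  # years
print(trend_class(t, -3.0 * t + np.random.normal(0, 0.5, t.size)))  # -> 'linear'
```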

  5. Autonomous Non-Linear Classification of LPI Radar Signal Modulations

    National Research Council Canada - National Science Library

    Gulum, Taylan O

    2007-01-01

    ...) radar modulations is investigated. A software engineering architecture that allows a full investigation of various preprocessing algorithms and classification techniques is applied to a database of important LPI radar waveform...

  6. Automated Stellar Classification for Large Surveys with EKF and RBF Neural Networks

    Institute of Scientific and Technical Information of China (English)

    Ling Bai; Ping Guo; Zhan-Yi Hu

    2005-01-01

    An automated classification technique for large stellar surveys is proposed. It uses the extended Kalman filter as a feature selector and pre-classifier of the data, and radial basis function neural networks for the classification. Experiments with real data have shown that the correct classification rate can reach as high as 93%, which is quite satisfactory. When different system models are selected for the extended Kalman filter, the classification results are relatively stable. It is shown that for this particular case the result using the extended Kalman filter is better than that using principal component analysis.

  7. Frequency Optimization for Enhancement of Surface Defect Classification Using the Eddy Current Technique

    Science.gov (United States)

    Fan, Mengbao; Wang, Qi; Cao, Binghua; Ye, Bo; Sunny, Ali Imam; Tian, Guiyun

    2016-01-01

    Eddy current testing is quite a popular non-contact and cost-effective method for nondestructive evaluation of product quality and structural integrity. Excitation frequency is one of the key performance factors for defect characterization. In the literature, there are many interesting papers dealing with wide spectral content and optimal frequency in terms of detection sensitivity. However, research activity on frequency optimization with respect to characterization performance is lacking. In this paper, an investigation into the optimum excitation frequency has been conducted to enhance surface defect classification performance. The influence of excitation frequency on a group of defects was revealed in terms of detection sensitivity, contrast between defect features, and classification accuracy using kernel principal component analysis (KPCA) and a support vector machine (SVM). It is observed that probe signals are on the whole most sensitive for a group of defects when the excitation frequency is set near the frequency at which maximum probe signals are retrieved for the largest defect. After the use of KPCA, the margins between the defect features are optimal from the perspective of the SVM, which adopts optimal hyperplanes for structural risk minimization. As a result, the best classification accuracy is obtained. The main contribution is that the influence of excitation frequency on defect characterization is interpreted, and experiment-based procedures are proposed to determine the optimal excitation frequency for a group of defects, rather than a single defect, with respect to optimal characterization performance. PMID:27164112
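
    A minimal sketch of the KPCA + SVM chain, with synthetic feature vectors standing in for the eddy current probe signals; the component counts and kernel parameters are illustrative choices, not the paper's settings.

```python
# KPCA feature mapping followed by an SVM classifier (scikit-learn).
from sklearn.datasets import make_classification
from sklearn.decomposition import KernelPCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=20, n_classes=3,
                           n_informative=6, random_state=1)  # toy "probe signals"
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

clf = make_pipeline(KernelPCA(n_components=5, kernel="rbf", gamma=0.05),
                    SVC(kernel="rbf", C=10.0))
clf.fit(X_tr, y_tr)
print("classification accuracy:", clf.score(X_te, y_te))
```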

  8. A statistical approach to root system classification.

    Directory of Open Access Journals (Sweden)

    Gernot Bodner

    2013-08-01

    Full Text Available Plant root systems have a key role in ecology and agronomy. In spite of the fast increase in root studies, there is still no classification that allows distinguishing among the distinctive characteristics within the diversity of rooting strategies. Our hypothesis is that a multivariate approach for plant functional type identification in ecology can be applied to the classification of root systems. We demonstrate that combining principal component and cluster analysis yields a meaningful classification of rooting types based on morphological traits. The classification method presented is based on a data-defined statistical procedure without an a priori decision on the classifiers. Biplot inspection is used to determine key traits and to ensure stability in cluster-based grouping. The classification method is exemplified with simulated root architectures and morphological field data. Simulated root architectures showed that morphological attributes with spatial distribution parameters capture the most distinctive features within root system diversity. While developmental type (tap vs. shoot-borne systems) is a strong but coarse classifier, topological traits provide the most detailed differentiation among distinctive groups. The adequacy of commonly available morphological traits for classification is supported by field data. Three rooting types emerged from the measured data, distinguished by diameter/weight, density and spatial distribution respectively. Similarity of root systems within distinctive groups was the joint result of phylogenetic relation and environmental as well as human selection pressure. We concluded that the data-defined classification is appropriate for the integration of knowledge obtained with different root measurement methods and at various scales. Currently root morphology is the most promising basis for classification due to widely used common measurement protocols. To capture details of root diversity efforts in architectural measurement
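
    A compact sketch of the principal component plus cluster analysis workflow on a trait matrix (rows are root systems, columns are morphological traits); the data are synthetic, and the choice of two components and three clusters is illustrative.

```python
# PCA scores followed by k-means clustering into rooting types.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
traits = rng.normal(size=(90, 6))   # stand-in for measured morphological traits
traits[:30, 0] += 3.0               # nudge the data into three loose groups
traits[30:60, 1] -= 3.0

scores = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(traits))
rooting_type = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(scores)
print(np.bincount(rooting_type))    # sizes of the three rooting types
```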

  9. The theory and practice of the Dewey Decimal Classification system

    CERN Document Server

    Satija, M P

    2013-01-01

    The Dewey Decimal Classification system (DDC) is the world's most popular library classification system. The 23rd edition of the DDC was published in 2011. This second edition of The Theory and Practice of the Dewey Decimal Classification System examines the history, management and technical aspects of the DDC up to its latest edition. The book places emphasis on explaining the structure and number building techniques in the DDC and reviews all aspects of subject analysis and number building by the most recent version of the DDC. A history of, and introduction to, the DDC is followed by subjec

  10. Classification of assembly techniques for micro products

    DEFF Research Database (Denmark)

    Hansen, Hans Nørgaard; Tosello, Guido; Gegeckaite, Asta

    2005-01-01

    of components and level of integration are made. This paper describes a systematic characterization of micro assembly methods. The methodology offers the opportunity for cross comparison among different techniques, in order to derive a principle for choosing the most favourable micro assembly technology in a specific case...

  11. Adaptive phase k-means algorithm for waveform classification

    Science.gov (United States)

    Song, Chengyun; Liu, Zhining; Wang, Yaojun; Xu, Feng; Li, Xingming; Hu, Guangmin

    2018-01-01

    Waveform classification is a powerful technique for seismic facies analysis that describes the heterogeneity and compartments within a reservoir. Horizon interpretation is a critical step in waveform classification. However, the horizon often produces inconsistent waveform phase, and thus results in an unsatisfactory classification. To alleviate this problem, an adaptive phase waveform classification method called the adaptive phase k-means is introduced in this paper. Our method improves the traditional k-means algorithm by using an adaptive phase distance as the waveform similarity measure. The proposed distance is a measure with variable phase as it moves from sample to sample along the traces. Model traces are also updated with the best phase interference in the iterative process. Therefore, our method is robust to phase variations caused by the interpretation horizon. We tested the effectiveness of our algorithm by applying it to synthetic and real data. The satisfactory results reveal that the proposed method tolerates a certain waveform phase variation and is a good tool for seismic facies analysis.
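
    The sketch below illustrates the core idea with a constant-phase variant of the distance: the model trace is rotated over a grid of phase shifts via its analytic signal, and the best match is kept. The published algorithm lets the phase vary along the trace, so this is only an approximation of the idea, not the authors' implementation.

```python
# Constant-phase approximation of a phase-adaptive waveform distance.
import numpy as np
from scipy.signal import hilbert

def phase_adaptive_distance(trace, model, n_phases=36):
    analytic = hilbert(model)                        # analytic signal of model
    phases = np.linspace(0, 2 * np.pi, n_phases, endpoint=False)
    # All phase-rotated versions of the model trace, one per row.
    rotated = np.real(analytic[None, :] * np.exp(1j * phases)[:, None])
    return np.min(np.linalg.norm(rotated - trace[None, :], axis=1))

t = np.linspace(0, 1, 128)
wavelet = np.sin(2 * np.pi * 8 * t) * np.exp(-8 * (t - 0.5) ** 2)
shifted = np.real(hilbert(wavelet) * np.exp(1j * np.pi / 3))  # 60-degree rotation
print(phase_adaptive_distance(shifted, wavelet))     # ~0 despite the phase shift
```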

  12. Signal classification using global dynamical models, Part I: Theory

    International Nuclear Information System (INIS)

    Kadtke, J.; Kremliovsky, M.

    1996-01-01

    Detection and classification of signals is one of the principal areas of signal processing, and the utilization of nonlinear information has long been considered as a way of improving performance beyond standard linear (e.g. spectral) techniques. Here, we develop a method for using global models of chaotic dynamical systems theory to define a signal classification processing chain, which is sensitive to nonlinear correlations in the data. We use it to demonstrate classification in high noise regimes (negative SNR), and argue that classification probabilities can be directly computed from ensemble statistics in the model coefficient space. We also develop a modification for non-stationary signals (i.e. transients) using non-autonomous ODEs. In Part II of this paper, we demonstrate the analysis on actual open ocean acoustic data from marine biologics. copyright 1996 American Institute of Physics

  13. A review of supervised object-based land-cover image classification

    Science.gov (United States)

    Ma, Lei; Li, Manchun; Ma, Xiaoxue; Cheng, Liang; Du, Peijun; Liu, Yongxue

    2017-08-01

    Object-based image classification for land-cover mapping purposes using remote-sensing imagery has attracted significant attention in recent years. Numerous studies conducted over the past decade have investigated a broad array of sensors, feature selection, classifiers, and other factors of interest. However, these research results have not yet been synthesized to provide coherent guidance on the effect of different supervised object-based land-cover classification processes. In this study, we first construct a database with 28 fields using qualitative and quantitative information extracted from 254 experimental cases described in 173 scientific papers. Second, the results of the meta-analysis are reported, including general characteristics of the studies (e.g., the geographic range of relevant institutes, preferred journals) and the relationships between factors of interest (e.g., spatial resolution and study area or optimal segmentation scale, accuracy and number of targeted classes), especially with respect to the classification accuracy of different sensors, segmentation scale, training set size, supervised classifiers, and land-cover types. Third, useful data on supervised object-based image classification are determined from the meta-analysis. For example, we find that supervised object-based classification is currently experiencing rapid advances, while development of the fuzzy technique is limited in the object-based framework. Furthermore, spatial resolution correlates with the optimal segmentation scale and study area, and Random Forest (RF) shows the best performance in object-based classification. The area-based accuracy assessment method can obtain stable classification performance, and indicates a strong correlation between accuracy and training set size, while the accuracy of the point-based method is likely to be unstable due to mixed objects. In addition, the overall accuracy benefits from higher spatial resolution images (e.g., unmanned aerial

  14. Enabling Sex Workers to Document Violence (India and Cambodia ...

    International Development Research Centre (IDRC) Digital Library (Canada)

    Digital advocacy techniques have been used to document human rights abuses and communicate these to people of influence, but these have yet to be systematically applied to sex worker-led advocacy. This project is based on the assumption that enabling sex workers to document violations and amplify their advocacy ...

  15. Classification of phosphogypsum as a waste material from the aspect of environmental protection

    Directory of Open Access Journals (Sweden)

    Rajković Miloš B.

    2004-01-01

    Full Text Available Phosphogypsum is primarily classified as a heavy waste. The classification of phosphogypsum as dangerous waste may only be maintained under the condition that phosphates with the highest content of radionuclides are used in the production of H3PO4 by the so-called "wet procedure" (Morocco, Florida), which, due to the great quantity of radionuclides present, causes considerable environmental pollution by radon. The classification of phosphogypsum as a separate category of radioactive waste may be conditionally accepted, because phosphogypsum is not a radioactive waste. In the absence of legal recommendations, all instructions about the collection, documentation and storage of phosphogypsum on disposal sites, as well as its possible transport, must comply with the classification of phosphogypsum as dangerous waste.

  16. AN ADABOOST OPTIMIZED CCFIS BASED CLASSIFICATION MODEL FOR BREAST CANCER DETECTION

    Directory of Open Access Journals (Sweden)

    CHANDRASEKAR RAVI

    2017-06-01

    Full Text Available Classification is a Data Mining technique used for building a prototype of the data behaviour, using which unseen data can be classified into one of the defined classes. Several researchers have proposed classification techniques, but most of them do not place much emphasis on the misclassified instances and the storage space. In this paper, a classification model is proposed that takes both misclassified instances and storage space into account. The classification model is efficiently developed using a tree structure for reducing the storage complexity and uses a single scan of the dataset. During the training phase, Class-based Closed Frequent ItemSets (CCFIS) were mined from the training dataset in the form of a tree structure. The classification model has been developed using the CCFIS and a similarity measure based on the Longest Common Subsequence (LCS). Further, the Particle Swarm Optimization algorithm is applied on the generated CCFIS, which assigns weights to the itemsets and their associated classes. Most classifiers correctly classify the common instances but misclassify the rare ones. In view of that, the AdaBoost algorithm has been used to boost the weights of the instances misclassified in the previous round, so as to include them in the training phase and thus classify the rare instances. This improves the accuracy of the classification model. During the testing phase, the classification model is used to classify the instances of the test dataset. The Breast Cancer dataset from the UCI repository is used for the experiments. Experimental analysis shows that the accuracy of the proposed classification model outperforms the PSOAdaBoost-Sequence classifier by 7%, and that it is superior to other approaches such as the Naïve Bayes Classifier, Support Vector Machine Classifier, Instance Based Classifier, ID3 Classifier, J48 Classifier, etc.
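
    A standard dynamic-programming sketch of the LCS similarity used to compare instances with the mined itemsets; the normalization to [0, 1] is one common convention, not necessarily the paper's.

```python
# Longest-common-subsequence length and a normalized similarity from it.
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]

def lcs_similarity(a, b):
    """Normalized LCS similarity in [0, 1]."""
    return 2.0 * lcs_length(a, b) / (len(a) + len(b)) if (a and b) else 0.0

print(lcs_similarity(["f1", "f3", "f7"], ["f1", "f2", "f3", "f7"]))  # ~0.857
```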

  17. Applying Hypertext Structures to Software Documentation.

    Science.gov (United States)

    French, James C.; And Others

    1997-01-01

    Describes a prototype system for software documentation management called SLEUTH (Software Literacy Enhancing Usefulness to Humans) being developed at the University of Virginia. Highlights include information retrieval techniques, hypertext links that are installed automatically, a WAIS (Wide Area Information Server) search engine, user…

  18. Tumor taxonomy for the developmental lineage classification of neoplasms

    International Nuclear Information System (INIS)

    Berman, Jules J

    2004-01-01

    The new 'Developmental lineage classification of neoplasms' was described in a prior publication. The classification is simple (the entire hierarchy is described with just 39 classifiers), comprehensive (providing a place for every tumor of man), and consistent with recent attempts to characterize tumors by cytogenetic and molecular features. A taxonomy is a list of the instances that populate a classification. The taxonomy of neoplasia attempts to list every known term for every known tumor of man. The taxonomy provides each concept with a unique code and groups synonymous terms under the same concept. A Perl script validated successive drafts of the taxonomy, ensuring that: 1) each term occurs only once in the taxonomy; 2) each term occurs in only one tumor class; 3) each concept code occurs in one and only one hierarchical position in the classification; and 4) the file containing the classification and taxonomy is a well-formed XML (eXtensible Markup Language) document. The taxonomy currently contains 122,632 different terms encompassing 5,376 neoplasm concepts. Each concept has, on average, 23 synonyms. The taxonomy populates 'The developmental lineage classification of neoplasms,' and is available as an XML file, currently 9+ Megabytes in length. A representation of the classification/taxonomy listing each term followed by its code, followed by its full ancestry, is available as a flat file, 19+ Megabytes in length. The taxonomy is the largest nomenclature of neoplasms, with more than twice the number of neoplasm names found in other medical nomenclatures, including the 2004 version of the Unified Medical Language System, the Systematized Nomenclature of Medicine Clinical Terminology, the National Cancer Institute's Thesaurus, and the International Classification of Diseases for Oncology. This manuscript describes a comprehensive taxonomy of neoplasia that collects synonymous terms under a unique code number and assigns each

  19. A new classification scheme of plastic wastes based upon recycling labels

    Energy Technology Data Exchange (ETDEWEB)

    Özkan, Kemal, E-mail: kozkan@ogu.edu.tr [Computer Engineering Dept., Eskişehir Osmangazi University, 26480 Eskişehir (Turkey); Ergin, Semih, E-mail: sergin@ogu.edu.tr [Electrical Electronics Engineering Dept., Eskişehir Osmangazi University, 26480 Eskişehir (Turkey); Işık, Şahin, E-mail: sahini@ogu.edu.tr [Computer Engineering Dept., Eskişehir Osmangazi University, 26480 Eskişehir (Turkey); Işıklı, İdil, E-mail: idil.isikli@bilecik.edu.tr [Electrical Electronics Engineering Dept., Bilecik University, 11210 Bilecik (Turkey)

    2015-01-15

    Highlights: • PET, HDPE or PP types of plastics are considered. • An automated classification of plastic bottles based on feature extraction and classification methods is performed. • The decision mechanism consists of PCA, Kernel PCA, FLDA, SVD and Laplacian Eigenmaps methods. • An SVM is selected to achieve the classification task and a majority voting technique is used. - Abstract: Since recycling of materials is widely assumed to be environmentally and economically beneficial, reliable sorting and processing of waste packaging materials such as plastics is very important for recycling with high efficiency. An automated system that can quickly categorize these materials is certainly needed for obtaining maximum classification while maintaining high throughput. In this paper, first of all, photographs of the plastic bottles were taken and several preprocessing steps were carried out. The first preprocessing step is to extract the plastic area of a bottle from the background. Then, morphological image operations are implemented. These operations are edge detection, noise removal, hole removal, image enhancement, and image segmentation. These morphological operations can be generally defined in terms of combinations of erosion and dilation. The effects of bottle color as well as label are eliminated using these operations. Secondly, the pixel-wise intensity values of the plastic bottle images have been used, together with the most popular subspace and statistical feature extraction methods, to construct the feature vectors in this study. Only three types of plastics are considered, due to their higher prevalence compared with the other plastic types in the world. The decision mechanism consists of five different feature extraction methods, including Principal Component Analysis (PCA), Kernel PCA (KPCA), Fisher's Linear Discriminant Analysis (FLDA), Singular Value Decomposition (SVD) and Laplacian Eigenmaps (LEMAP), and uses a simple
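
    A hedged sketch of the decision mechanism: one SVM per feature-extraction method, combined by majority voting. Synthetic vectors stand in for the pixel-wise bottle features, and only three of the five extractors (PCA, KPCA, FLDA) are shown.

```python
# Per-extractor SVMs combined by majority voting (scikit-learn).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA, KernelPCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

X, y = make_classification(n_samples=600, n_features=50, n_classes=3,
                           n_informative=10, random_state=2)  # toy pixel vectors
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=2)

extractors = [PCA(n_components=10),
              KernelPCA(n_components=10, kernel="rbf"),
              LinearDiscriminantAnalysis(n_components=2)]
votes = []
for ext in extractors:
    clf = make_pipeline(ext, SVC())
    clf.fit(X_tr, y_tr)
    votes.append(clf.predict(X_te))

# Majority vote across the three per-extractor predictions.
stacked = np.vstack(votes)
majority = np.apply_along_axis(lambda c: np.bincount(c).argmax(), 0, stacked)
print("majority-vote accuracy:", np.mean(majority == y_te))
```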

  1. Recognition and Evaluation of Clinical Section Headings in Clinical Documents Using Token-Based Formulation with Conditional Random Fields

    Directory of Open Access Journals (Sweden)

    Hong-Jie Dai

    2015-01-01

    Full Text Available An electronic health record (EHR) is a digital data format that collects electronic health information about an individual patient or population. To enhance the meaningful use of EHRs, information extraction techniques have been developed to recognize clinical concepts mentioned in EHRs. Nevertheless, the clinical judgment of an EHR cannot be known solely from the recognized concepts without considering their contextual information. In order to improve the readability and accessibility of EHRs, this work developed a section heading recognition system for clinical documents. In contrast to formulating the section heading recognition task as a sentence classification problem, this work proposed a token-based formulation with the conditional random field (CRF) model. A standard section heading recognition corpus was compiled by annotators with clinical experience to evaluate the performance and compare it with sentence classification and dictionary-based approaches. The results of the experiments showed that the proposed method achieved a satisfactory F-score of 0.942, which outperformed the sentence-based approach and the best dictionary-based system by 0.087 and 0.096, respectively. One important advantage of our formulation over the sentence-based approach is that it presents an integrated solution without the need to develop additional heuristic rules for isolating the headings from the surrounding section contents.
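
    A minimal sketch of the token-based formulation using the third-party sklearn-crfsuite package; the feature template, toy tokens and BIO-style heading labels are illustrative, not the paper's actual feature set.

```python
# Token-based section-heading tagging with a linear-chain CRF.
import sklearn_crfsuite

def token_features(tokens, i):
    w = tokens[i]
    return {
        "lower": w.lower(),
        "is_upper": w.isupper(),
        "is_title": w.istitle(),
        "ends_colon": w.endswith(":"),
        "position": i,
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
    }

# One toy document: tokens with BIO labels marking heading vs. content.
tokens = ["CHIEF", "COMPLAINT", ":", "chest", "pain", "for", "two", "days"]
labels = ["B-HEAD", "I-HEAD", "I-HEAD", "O", "O", "O", "O", "O"]

X = [[token_features(tokens, i) for i in range(len(tokens))]]
y = [labels]

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=50)
crf.fit(X, y)
print(crf.predict(X)[0])
```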

  2. Combination of laser-induced breakdown spectroscopy and Raman spectroscopy for multivariate classification of bacteria

    Science.gov (United States)

    Prochazka, D.; Mazura, M.; Samek, O.; Rebrošová, K.; Pořízka, P.; Klus, J.; Prochazková, P.; Novotný, J.; Novotný, K.; Kaiser, J.

    2018-01-01

    In this work, we investigate the impact of data provided by complementary laser-based spectroscopic methods on multivariate classification accuracy. Discrimination and classification of five Staphylococcus bacterial strains and one strain of Escherichia coli is presented. The technique that we used for the measurements is a combination of Raman spectroscopy and Laser-Induced Breakdown Spectroscopy (LIBS). The obtained spectroscopic data were then processed using Multivariate Data Analysis algorithms. Principal Component Analysis (PCA) was selected as the most suitable technique for visualization of the bacterial strain data. To classify the bacterial strains, we used Neural Networks, namely a supervised version of Kohonen's self-organizing maps (SOM). We processed the results in three different ways: separately from the LIBS measurements, separately from the Raman measurements, and from the merged data of both methods. The three types of results were then compared. By applying PCA to the Raman spectroscopy data, we observed that two bacterial strains were fully distinguished from the rest of the data set. In the case of the LIBS data, three bacterial strains were fully discriminated. Using a combination of data from both methods, we achieved complete discrimination of all bacterial strains. All the data were classified with a high success rate using the SOM algorithm. The most accurate classification was obtained using a combination of data from both techniques. The classification accuracy varied, depending on the specific samples and techniques: for LIBS it ranged from 45% to 100%, for Raman spectroscopy from 50% to 100%, and in the case of the merged data all samples were classified correctly. Based on the results of the experiments presented in this work, we can assume that the combination of Raman spectroscopy and LIBS significantly enhances the discrimination and classification accuracy of bacterial species and strains. The reason is the complementarity in

  3. Three-dimensional measurement system for crime scene documentation

    Science.gov (United States)

    Adamczyk, Marcin; Hołowko, Elwira; Lech, Krzysztof; Michoński, Jakub; Mączkowski, Grzegorz; Bolewicki, Paweł; Januszkiewicz, Kamil; Sitnik, Robert

    2017-10-01

    Three-dimensional measurement techniques (such as photogrammetry, Time of Flight, Structure from Motion or Structured Light) are becoming a standard in the crime scene documentation process. The usage of 3D measurement techniques provides an opportunity to prepare a more insightful investigation and helps to show every trace in the context of the entire crime scene. In this paper we present a hierarchical, three-dimensional measurement system that is designed for the crime scene documentation process. Our system reflects the actual standards in crime scene documentation: it is designed to perform measurements in two stages. The first stage of documentation, the most general, is prepared with a scanner with relatively low spatial resolution but a big measuring volume; it is used for documentation of the whole scene. The second stage is much more detailed: high resolution but a smaller measuring volume, for areas that require a more detailed approach. The documentation process is supervised by a specialised application, CrimeView3D, which is a software platform for measurement management (connecting with scanners and carrying out measurements, automatic or semi-automatic data registration in real time) and data visualisation (3D visualisation of documented scenes). It also provides a series of useful tools for forensic technicians: a virtual measuring tape, searching for sources of blood spatter, a virtual walk through the crime scene, and many others. We also report the outcome of research on metrological validation of the scanners, performed according to the VDI/VDE standard, and the outcome of measurement sessions conducted at real crime scenes in cooperation with technicians from the Central Forensic Laboratory of the Police.

  4. Raster Vs. Point Cloud LiDAR Data Classification

    Science.gov (United States)

    El-Ashmawy, N.; Shaker, A.

    2014-09-01

    Airborne Laser Scanning systems with light detection and ranging (LiDAR) technology are among the fast and accurate 3D point data acquisition techniques. Generating accurate digital terrain and/or surface models (DTM/DSM) is the main application of collecting LiDAR range data. Recently, LiDAR range and intensity data have been used for land cover classification applications. Range and intensity data (the strength of the backscattered signals measured by the LiDAR system) are affected by the flying height, the ground elevation, the scanning angle and the physical characteristics of the object surfaces. These effects may lead to an uneven distribution of the point cloud or to gaps that may affect the classification process. Researchers have investigated the conversion of LiDAR range point data to raster images for terrain modelling. Interpolation techniques have been used to achieve the best representation of surfaces and to fill the gaps between the LiDAR footprints. Interpolation methods have also been investigated to generate LiDAR range and intensity image data for land cover classification applications. In this paper, a different approach has been followed to classify the LiDAR data (range and intensity) for land cover mapping. The methodology relies on classifying the point cloud data based on range and intensity and then converting the classified points into a raster image. The gaps in the data are filled based on the classes of the nearest neighbour. Land cover maps are produced using two approaches: (a) the conventional raster image data based on point interpolation; and (b) the proposed point data classification. A study area covering an urban district in Burnaby, British Columbia, Canada, is selected to compare the results of the two approaches. Five different land cover classes can be distinguished in that area: buildings, roads and parking areas, trees, low vegetation (grass), and bare soil. The results show that an improvement of around 10 % in the

  5. Classification of bacterial contamination using image processing and distributed computing.

    Science.gov (United States)

    Ahmed, W M; Bayraktar, B; Bhunia, A; Hirleman, E D; Robinson, J P; Rajwa, B

    2013-01-01

    Disease outbreaks due to contaminated food are a major concern not only for the food-processing industry but also for the public at large. Techniques for automated detection and classification of microorganisms can be a great help in preventing outbreaks and maintaining the safety of the nation's food supply. Identification and classification of foodborne pathogens using colony scatter patterns is a promising new label-free technique that utilizes image-analysis and machine-learning tools. However, the feature-extraction tools employed for this approach are computationally complex, and choosing the right combination of scatter-related features requires extensive testing with different feature combinations. In the presented work we used computer clusters to speed up the feature-extraction process, which enabled us to analyze the contribution of different scatter-based features to the overall classification accuracy. A set of 1000 scatter patterns representing ten different bacterial strains was used. Zernike and Chebyshev moments as well as Haralick texture features were computed from the available light-scatter patterns. The most promising features were first selected using Fisher's discriminant analysis, and subsequently a support-vector-machine (SVM) classifier with a linear kernel was used. With extensive testing we were able to identify a small subset of features that produced the desired results in terms of classification accuracy and execution speed. The use of distributed computing for scatter-pattern analysis, feature extraction, and selection provides a feasible mechanism for large-scale deployment of a light-scatter-based approach to bacterial classification.

  6. A New Insight into Land Use Classification Based on Aggregated Mobile Phone Data

    OpenAIRE

    Pei, Tao; Sobolevsky, Stanislav; Ratti, Carlo; Shaw, Shih-Lung; Zhou, Chenghu

    2013-01-01

    Land use classification is essential for urban planning. Urban land use types can be differentiated either by their physical characteristics (such as reflectivity and texture) or social functions. Remote sensing techniques have been recognized as a vital method for urban land use classification because of their ability to capture the physical characteristics of land use. Although significant progress has been achieved in remote sensing methods designed for urban land use classification, most ...

  7. CLASSIFICATION OF THE MGR SITE COMPRESSED AIR SYSTEM

    International Nuclear Information System (INIS)

    J.A. Ziegler

    1999-01-01

    The purpose of this analysis is to document the Quality Assurance (QA) classification of the Monitored Geologic Repository (MGR) site compressed air system structures, systems and components (SSCs) performed by the MGR Safety Assurance Department. This analysis also provides the basis for revision of YMP/90-55Q, Q-List (YMP 1998). The Q-List identifies those MGR SSCs subject to the requirements of DOE/RW-0333P, ''Quality Assurance Requirements and Description'' (QARD) (DOE 1998)

  8. CLASSIFICATION OF THE MGR MAINTENANCE AND SUPPLY SYSTEM

    International Nuclear Information System (INIS)

    J.A. Ziegler

    1999-01-01

    The purpose of this analysis is to document the Quality Assurance (QA) classification of the Monitored Geologic Repository (MGR) maintenance and supply system structures, systems and components (SSCs) performed by the MGR Safety Assurance Department. This analysis also provides the basis for revision of YMP/90-55Q, Q-List (YMP 1998). The Q-List identifies those MGR SSCs subject to the requirements of DOE/RW-0333P, ''Quality Assurance Requirements and Description'' (QARD) (DOE 1998)

  9. Albinism: classification, clinical characteristics, and recent findings.

    Science.gov (United States)

    Summers, C Gail

    2009-06-01

    To describe the clinical characteristics and recent findings in the heterogeneous group of inherited disorders of melanin biosynthesis grouped as "albinism." The current classification of albinism, and the cutaneous, ocular, and central nervous system characteristics are presented. Recent clinical findings are summarized. Albinism is now classified based on genes known to be responsible for albinism. Foveal hypoplasia is invariably present and individuals with albinism often have delayed visual development, reduced vision, nystagmus, a positive angle kappa, strabismus, iris transillumination, and absent or reduced melanin pigment in the fundi. A visual-evoked potential can document the excessive retinostriate decussation seen in albinism. Grating acuity can be used to document delayed visual development in preverbal children. Glasses are often needed to improve visual acuity and binocular alignment. Albinism is caused by several different genes. Heterogeneity in clinical phenotype indicates that expressivity is variable.

  10. Wavelet transform and ANNs for detection and classification of power signal disturbances

    International Nuclear Information System (INIS)

    Memon, A.P.; Uqaili, M.A.; Memon, Z.A.

    2012-01-01

    This article proposes a WT (Wavelet Transform) and ANN (Artificial Neural Network) based approach for detection and classification of EPQDs (Electrical Power Quality Disturbances). A modified WT known as the ST (Stockwell Transform) is suggested for feature extraction and a PNN (Probabilistic Neural Network) for pattern classification. The ST possesses outstanding time-frequency resolution characteristics, and its phase correction techniques refer the phase of the WT to the zero time point. The feature vectors for the input of the PNN are extracted using the ST technique, and these features are discrete, logical, and unaffected by the noisy data of distorted signals. The model data required to generate the distorted EPQ (Electrical Power Quality) signals are obtained within the ranges specified by IEEE 1159-1995. The feature vectors, extracted using the ST from noisy time-varying data under steady-state or transient conditions, are used to train the PNN for pattern classification. The simulation results demonstrate that the proposed methodology is successful and can classify EPQDs, even in a noisy environment, very efficiently, with an average classification accuracy of 96%. (author)
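
    A minimal probabilistic neural network, essentially a Parzen-window classifier, of the kind used here for pattern classification; the two-dimensional toy vectors stand in for the ST-derived features, and the class names are invented.

```python
# Minimal PNN: Gaussian kernel density per class, predict the densest class.
import numpy as np

class PNN:
    def __init__(self, sigma=0.5):
        self.sigma = sigma  # smoothing parameter of the Gaussian kernels

    def fit(self, X, y):
        self.classes = np.unique(y)
        self.X, self.y = np.asarray(X, dtype=float), np.asarray(y)
        return self

    def predict(self, X):
        out = []
        for x in np.asarray(X, dtype=float):
            # Kernel density of x under each class's training samples.
            d2 = np.sum((self.X - x) ** 2, axis=1)
            k = np.exp(-d2 / (2.0 * self.sigma ** 2))
            scores = [k[self.y == c].mean() for c in self.classes]
            out.append(self.classes[int(np.argmax(scores))])
        return np.array(out)

X = np.array([[0.0, 0.1], [0.2, 0.0], [1.0, 1.1], [0.9, 1.0]])
y = np.array(["sag", "sag", "swell", "swell"])  # hypothetical disturbance labels
print(PNN(sigma=0.3).fit(X, y).predict([[0.1, 0.0], [1.0, 0.9]]))
```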

  11. Classification and interdisciplinary browsing: the experience at the University of Pavia

    Directory of Open Access Journals (Sweden)

    Laura Pusterla

    2017-05-01

    Full Text Available The gradual adoption by the University of Pavia libraries of classified shelving based on the Dewey Decimal Classification is presented. This has allowed the development of the browsing interface SciGator, which offers the possibility to browse through the subjects in the libraries and to launch searches in the OPAC catalogue for documents shelved by their corresponding DDC classes or their local equivalents. SciGator also allows searching for documents indexed with their respective class or with a class linked to it in the network of class cross-references. The convenience and possible improvements of SciGator are discussed.

  12. Classification of structurally related commercial contrast media by near infrared spectroscopy.

    Science.gov (United States)

    Yip, Wai Lam; Soosainather, Tom Collin; Dyrstad, Knut; Sande, Sverre Arne

    2014-03-01

    Near infrared spectroscopy (NIRS) is a non-destructive measurement technique with broad application in the pharmaceutical industry. Correct identification of pharmaceutical ingredients is an important task for quality control. Failure in this step can have several adverse consequences, ranging from economic loss to negative impacts on patient safety. We have compared different methods for the classification of a set of commercially available, structurally related contrast media: Iodixanol (Visipaque®), Iohexol (Omnipaque®), Caldiamide Sodium and Gadodiamide (Omniscan®), using NIR spectroscopy. The performance of classification models developed by soft independent modelling of class analogy (SIMCA), partial least squares discriminant analysis (PLS-DA) and Main and Interactions of Individual Principal Components Regression (MIPCR) was compared. Different variable selection methods were applied to optimize the classification models. Models developed by backward variable elimination partial least squares regression (BVE-PLS) and MIPCR were found to be the most effective for classification of the set of contrast media. Fewer than 1.5% of samples from the independent test set were not recognized by the BVE-PLS and MIPCR models, compared with up to 15% when models developed by other techniques were applied. Copyright © 2013 Elsevier B.V. All rights reserved.

  13. NASA software documentation standard software engineering program

    Science.gov (United States)

    1991-01-01

    The NASA Software Documentation Standard (hereinafter referred to as Standard) can be applied to the documentation of all NASA software. This Standard is limited to documentation format and content requirements. It does not mandate specific management, engineering, or assurance standards or techniques. This Standard defines the format and content of documentation for software acquisition, development, and sustaining engineering. Format requirements address where information shall be recorded and content requirements address what information shall be recorded. This Standard provides a framework to allow consistency of documentation across NASA and visibility into the completeness of project documentation. This basic framework consists of four major sections (or volumes). The Management Plan contains all planning and business aspects of a software project, including engineering and assurance planning. The Product Specification contains all technical engineering information, including software requirements and design. The Assurance and Test Procedures contains all technical assurance information, including Test, Quality Assurance (QA), and Verification and Validation (V&V). The Management, Engineering, and Assurance Reports is the library and/or listing of all project reports.

  14. Application of Musical Information Retrieval (MIR) Techniques to Seismic Facies Classification. Examples in Hydrocarbon Exploration

    Directory of Open Access Journals (Sweden)

    Paolo Dell’Aversana

    2016-12-01

    Full Text Available In this paper, we introduce a novel approach for automatic pattern recognition and classification of geophysical data based on digital music technology. We import into the geophysical domain the same approaches commonly used for Musical Information Retrieval (MIR). After accurate conversion from geophysical formats (example: SEG-Y) to musical formats (example: Musical Instrument Digital Interface, or briefly MIDI), we extract musical features from the converted data. These can be single-valued attributes, such as pitch and sound intensity, or multi-valued attributes, such as pitch histograms and melodic, harmonic and rhythmic paths. Using a real data set, we show that these musical features can be diagnostic for seismic facies classification in a complex exploration area. They can be complementary with respect to "conventional" seismic attributes. Using a supervised machine learning approach based on the k-Nearest Neighbors algorithm and on Artificial Neural Networks, we classify three gas-bearing channels. The good performance of our classification approach is confirmed by borehole data available in the same area.

  15. Applications of Location Similarity Measures and Conceptual Spaces to Event Coreference and Classification

    Science.gov (United States)

    McConky, Katie Theresa

    2013-01-01

    This work covers topics in event coreference and event classification from spoken conversation. Event coreference is the process of identifying descriptions of the same event across sentences, documents, or structured databases. Existing event coreference work focuses on sentence similarity models or feature based similarity models requiring slot…

  16. Clustering document fragments using background color and texture information

    Science.gov (United States)

    Chanda, Sukalpa; Franke, Katrin; Pal, Umapada

    2012-01-01

    Forensic analysis of questioned documents can sometimes be extremely data intensive. A forensic expert might need to analyze a heap of document fragments, and in such cases, to ensure reliability, he or she should focus only on the relevant evidence hidden in those document fragments. Relevant document retrieval requires finding similar document fragments. One way of obtaining such similar documents is to use the document fragments' physical characteristics such as color and texture. In this article we propose an automatic scheme to retrieve similar document fragments based on the visual appearance of the document paper and its texture. Multispectral color characteristics using biologically inspired color differentiation techniques are implemented here, by projecting the document color characteristics into the Lab color space. Gabor filter-based texture analysis is used to identify the document texture. It is expected that document fragments from the same source will have similar color and texture. For clustering similar document fragments of our test dataset we use a Self-Organizing Map (SOM) of dimension 5×5, where the document color and texture information are used as features. We obtained an encouraging accuracy of 97.17% on 1063 test images.
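
    A sketch of the clustering step with a 5×5 SOM, here via the third-party MiniSom package; the feature vectors are synthetic stand-ins for the Lab color statistics and Gabor texture energies.

```python
# Cluster per-fragment feature vectors on a 5x5 self-organizing map.
import numpy as np
from minisom import MiniSom

rng = np.random.default_rng(1)
# Stand-ins for per-fragment features: 3 Lab color stats + 8 Gabor energies.
features = rng.normal(size=(200, 11))

som = MiniSom(5, 5, features.shape[1], sigma=1.0, learning_rate=0.5, random_seed=1)
som.random_weights_init(features)
som.train_random(features, 2000)

# Each fragment is assigned to its best-matching unit (a cell on the 5x5 grid).
clusters = [som.winner(f) for f in features]
print(clusters[:5])
```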

  17. Decision tree approach for classification of remotely sensed satellite ...

    Indian Academy of Sciences (India)

    sensed satellite data using open source support. Richa Sharma .... Decision tree classification techniques have been .... the USGS Earth Resource Observation Systems. (EROS) ... for shallow water, 11% were for sparse and dense built-up ...

  18. The computer integrated documentation project: A merge of hypermedia and AI techniques

    Science.gov (United States)

    Mathe, Nathalie; Boy, Guy

    1993-01-01

    To generate intelligent indexing that allows context-sensitive information retrieval, a system must be able to acquire knowledge directly through interaction with users. In this paper, we present the architecture of CID (Computer Integrated Documentation), a system that enables the integration of various technical documents in a hypertext framework and includes an intelligent browsing system that incorporates indexing in context. CID's knowledge-based indexing mechanism allows case-based knowledge acquisition by experimentation. It utilizes on-line user information requirements and suggestions either to reinforce current indexing in case of success or to generate new knowledge in case of failure. This allows CID's intelligent interface system to provide helpful responses based on previous experience (user feedback). We describe CID's current capabilities and provide an overview of our plans for extending the system.

  19. Multi-material classification of dry recyclables from municipal solid waste based on thermal imaging.

    Science.gov (United States)

    Gundupalli, Sathish Paulraj; Hait, Subrata; Thakur, Atul

    2017-12-01

    There has been a significant rise in municipal solid waste (MSW) generation in the last few decades due to rapid urbanization and industrialization. Due to the lack of source segregation practice, a need for automated segregation of recyclables from MSW exists in developing countries. This paper reports a thermal imaging based system for classifying useful recyclables from a simulated MSW sample. Experimental results have demonstrated the possibility of using the thermal imaging technique for classification, together with a robotic system for sorting of recyclables, in a single process step. The reported classification system yields an accuracy in the range of 85-96% and is comparable with existing single-material recyclable classification techniques. We believe that the reported thermal imaging based system can emerge as a viable and inexpensive large-scale classification-cum-sorting technology in recycling plants for processing MSW in developing countries. Copyright © 2017 Elsevier Ltd. All rights reserved.

  20. Modern radiosurgical and endovascular classification schemes for brain arteriovenous malformations.

    Science.gov (United States)

    Tayebi Meybodi, Ali; Lawton, Michael T

    2018-05-04

    Stereotactic radiosurgery (SRS) and endovascular techniques are commonly used for treating brain arteriovenous malformations (bAVMs). They are usually used as ancillary techniques to microsurgery but may also be used as solitary treatment options. Careful patient selection requires a clear estimate of the treatment efficacy and complication rates for the individual patient. As such, classification schemes are an essential part of patient selection paradigm for each treatment modality. While the Spetzler-Martin grading system and its subsequent modifications are commonly used for microsurgical outcome prediction for bAVMs, the same system(s) may not be easily applicable to SRS and endovascular therapy. Several radiosurgical- and endovascular-based grading scales have been proposed for bAVMs. However, a comprehensive review of these systems including a discussion on their relative advantages and disadvantages is missing. This paper is dedicated to modern classification schemes designed for SRS and endovascular techniques.

  1. Evaluation of normalization methods for cDNA microarray data by k-NN classification

    Energy Technology Data Exchange (ETDEWEB)

    Wu, Wei; Xing, Eric P; Myers, Connie; Mian, Saira; Bissell, Mina J

    2004-12-17

    Non-biological factors give rise to unwanted variations in cDNA microarray data. There are many normalization methods designed to remove such variations. However, to date there have been few published systematic evaluations of these techniques for removing variations arising from dye biases in the context of downstream, higher-order analytical tasks such as classification. Ten location normalization methods that adjust spatial- and/or intensity-dependent dye biases, and three scale methods that adjust scale differences, were applied, individually and in combination, to five distinct, published, cancer biology-related cDNA microarray data sets. Leave-one-out cross-validation (LOOCV) classification error was employed as the quantitative end-point for assessing the effectiveness of a normalization method. In particular, a k-nearest neighbor (k-NN) classifier was estimated from data normalized using a given technique, and the LOOCV error rate of the ensuing model was computed. We found that k-NN classifiers are sensitive to dye biases in the data. Using NONRM and GMEDIAN as baseline methods, our results show that single-bias-removal techniques which remove either the spatial-dependent dye bias (referred to later as the spatial effect) or the intensity-dependent dye bias (referred to later as the intensity effect) moderately reduce LOOCV classification errors, whereas double-bias-removal techniques which remove both the spatial and intensity effects reduce LOOCV classification errors even further. Of the 41 different strategies examined, three two-step processes, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, all of which removed the intensity effect globally and the spatial effect locally, appear to reduce LOOCV classification errors most consistently and effectively across all data sets. We also found that the investigated scale normalization methods do not reduce LOOCV classification error. Using LOOCV error of k-NNs as the evaluation criterion, three double
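
    The evaluation loop itself is straightforward; a sketch with scikit-learn, assuming each normalization method is a function from the raw expression matrix to a normalized one (the normalizer names are placeholders, not the paper's implementations):

    ```python
    import numpy as np
    from sklearn.model_selection import LeaveOneOut, cross_val_score
    from sklearn.neighbors import KNeighborsClassifier

    def loocv_error(X, y, k=3):
        """LOOCV classification error of a k-NN classifier on one normalized dataset."""
        knn = KNeighborsClassifier(n_neighbors=k)
        acc = cross_val_score(knn, X, y, cv=LeaveOneOut()).mean()
        return 1.0 - acc

    # normalizers maps a method name to a function X_raw -> X_normalized
    # (placeholder names; the paper evaluates ten location and three scale methods).
    def rank_normalizers(X_raw, y, normalizers):
        errors = {name: loocv_error(fn(X_raw), y) for name, fn in normalizers.items()}
        return sorted(errors.items(), key=lambda kv: kv[1])  # lowest LOOCV error first
    ```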

  2. Algorithms for Hyperspectral Endmember Extraction and Signature Classification with Morphological Dendritic Networks

    Science.gov (United States)

    Schmalz, M.; Ritter, G.

    Accurate multispectral or hyperspectral signature classification is key to the nonimaging detection and recognition of space objects. Additionally, signature classification accuracy depends on accurate spectral endmember determination [1]. Previous approaches to endmember computation and signature classification were based on linear operators or neural networks (NNs) expressed in terms of the algebra (R, +, x) [1,2]. Unfortunately, class separation in these methods tends to be suboptimal, and the number of signatures that can be accurately classified often depends linearly on the number of NN inputs. This can lead to poor endmember distinction, as well as potentially significant classification errors in the presence of noise or densely interleaved signatures. In contrast to traditional NNs, autoassociative morphological memories (AMMs) are a construct similar to Hopfield autoassociative memories defined on the (R, +, ∨, ∧) lattice algebra [3]. Unlimited storage and perfect recall of noiseless real-valued patterns have been proven for AMMs [4]. However, AMMs suffer from sensitivity to specific noise models, which can be characterized as erosive and dilative noise. On the other hand, the prior definition of a set of endmembers corresponds to material spectra lying on the vertices of the minimum convex region covering the image data. These vertices can be characterized as morphologically independent patterns. It has further been shown that AMMs can be based on dendritic computation [3,6]. These techniques yield improved accuracy and class segmentation/separation ability in the presence of highly interleaved signature data. In this paper, we present a procedure for endmember determination based on AMM noise sensitivity, which employs morphological dendritic computation. We show that detected endmembers can be exploited by AMM-based classification techniques to achieve accurate signature classification in the presence of noise, closely spaced or interleaved signatures, and
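
    For background, a minimal NumPy illustration of the autoassociative morphological memory W_XX from the lattice-algebra literature (not the paper's dendritic extension): storage takes a pointwise minimum of outer differences, recall is a max-plus product, and recall of stored noiseless patterns is exact.

    ```python
    import numpy as np

    def amm_store(patterns):
        """W_XX[i, j] = min over stored patterns x of (x[i] - x[j])."""
        X = np.asarray(patterns, dtype=float)       # shape (k, n)
        diffs = X[:, :, None] - X[:, None, :]       # (k, n, n) outer differences
        return diffs.min(axis=0)

    def amm_recall(W, x):
        """Max-plus product: y[i] = max over j of (W[i, j] + x[j])."""
        return (W + np.asarray(x, dtype=float)[None, :]).max(axis=1)

    X = [[1.0, 4.0, 2.0], [3.0, 0.0, 5.0]]
    W = amm_store(X)
    print(amm_recall(W, X[0]))   # exactly recovers [1., 4., 2.]
    ```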

  3. 78 FR 68983 - Cotton Futures Classification: Optional Classification Procedure

    Science.gov (United States)

    2013-11-18

    ...-AD33 Cotton Futures Classification: Optional Classification Procedure AGENCY: Agricultural Marketing... regulations to allow for the addition of an optional cotton futures classification procedure--identified and... response to requests from the U.S. cotton industry and ICE, AMS will offer a futures classification option...

  4. Equivalence classification by California sea lions using class-specific reinforcers.

    OpenAIRE

    Kastak, C R; Schusterman, R J; Kastak, D

    2001-01-01

    The ability to group dissimilar stimuli into categories on the basis of common stimulus relations (stimulus equivalence) or common functional relations (functional equivalence) has been convincingly demonstrated in verbally competent subjects. However, there are investigations with verbally limited humans and with nonhuman animals that suggest that the formation and use of classification schemes based on equivalence does not depend on linguistic skills. The present investigation documented th...

  5. Learning Supervised Topic Models for Classification and Regression from Crowds.

    Science.gov (United States)

    Rodrigues, Filipe; Lourenco, Mariana; Ribeiro, Bernardete; Pereira, Francisco C

    2017-12-01

    The growing need to analyze large collections of documents has led to great developments in topic modeling. Since documents are frequently associated with other related variables, such as labels or ratings, much interest has been placed on supervised topic models. However, most annotation tasks are prone to ambiguity and noise and often involve high volumes of documents, which makes learning under a single-annotator assumption unrealistic or impractical for most real-world applications. In this article, we propose two supervised topic models, one for classification and another for regression problems, which account for the heterogeneity and biases among the different annotators encountered in practice when learning from crowds. We develop an efficient stochastic variational inference algorithm that is able to scale to very large datasets, and we empirically demonstrate the advantages of the proposed model over state-of-the-art approaches.

  6. Analysis of a risk prevention document using dependability techniques: a first step towards an effectiveness model

    Directory of Open Access Journals (Sweden)

    L. Ferrer

    2018-04-01

    Major hazard prevention is a main challenge given that it is specifically based on information communicated to the public. In France, preventive information is notably provided by way of local regulatory documents. Unfortunately, the law requires only a few specifications concerning their content; therefore one can question the impact on the general population relative to the way the document is concretely created. Ergo, the purpose of our work is to propose an analytical methodology to evaluate preventive risk communication document effectiveness. The methodology is based on dependability approaches and is applied in this paper to the Document d'Information Communal sur les Risques Majeurs (DICRIM; in English, Municipal Information Document on Major Risks). DICRIM has to be made by mayors and addressed to the public to provide information on major hazards affecting their municipalities. An analysis of law compliance of the document is carried out thanks to the identification of regulatory detection elements. These are applied to a database of 30 DICRIMs. This analysis leads to a discussion on points such as the usefulness of the missing elements. External and internal function analysis permits the identification of the form and content requirements and the service and technical functions of the document and its components (here its sections). Their results are used to carry out an FMEA (failure modes and effects analysis), which allows us to define the failures and to identify detection elements. This permits the evaluation of the effectiveness of the form and content of each component of the document. The outputs are validated by experts from the different fields investigated. These results will be used to build, in future work, a decision support model for the municipality (or specialised consulting firms) in charge of drawing up such documents.

  7. Analysis of a risk prevention document using dependability techniques: a first step towards an effectiveness model

    Science.gov (United States)

    Ferrer, Laetitia; Curt, Corinne; Tacnet, Jean-Marc

    2018-04-01

    Major hazard prevention is a main challenge given that it is specifically based on information communicated to the public. In France, preventive information is notably provided by way of local regulatory documents. Unfortunately, the law requires only a few specifications concerning their content; therefore one can question the impact on the general population relative to the way the document is concretely created. Ergo, the purpose of our work is to propose an analytical methodology to evaluate preventive risk communication document effectiveness. The methodology is based on dependability approaches and is applied in this paper to the Document d'Information Communal sur les Risques Majeurs (DICRIM; in English, Municipal Information Document on Major Risks). DICRIM has to be made by mayors and addressed to the public to provide information on major hazards affecting their municipalities. An analysis of law compliance of the document is carried out thanks to the identification of regulatory detection elements. These are applied to a database of 30 DICRIMs. This analysis leads to a discussion on points such as the usefulness of the missing elements. External and internal function analysis permits the identification of the form and content requirements and the service and technical functions of the document and its components (here its sections). Their results are used to carry out an FMEA (failure modes and effects analysis), which allows us to define the failures and to identify detection elements. This permits the evaluation of the effectiveness of the form and content of each component of the document. The outputs are validated by experts from the different fields investigated. These results will be used to build, in future work, a decision support model for the municipality (or specialised consulting firms) in charge of drawing up such documents.

  8. Neural net classification of x-ray pistachio nut data

    Science.gov (United States)

    Casasent, David P.; Sipe, Michael A.; Schatzki, Thomas F.; Keagy, Pamela M.; Le, Lan Chau

    1996-12-01

    Classification results for agricultural products are presented using a new neural network. This neural network inherently produces higher-order decision surfaces. It achieves this with fewer hidden layer neurons than other classifiers require. This gives better generalization. It uses new techniques to select the number of hidden layer neurons and adaptive algorithms that avoid other such ad hoc parameter selection problems; it allows selection of the best classifier parameters without the need to analyze the test set results. The agriculture case study considered is the inspection and classification of pistachio nuts using x-ray imagery. Present inspection techniques cannot provide good rejection of worm damaged nuts without rejecting too many good nuts. X-ray imagery has the potential to provide 100% inspection of such agricultural products in real time. Only preliminary results are presented, but these indicate the potential to reduce major defects to 2% of the crop with 1% of good nuts rejected. Future image processing techniques that should provide better features to improve performance and allow inspection of a larger variety of nuts are noted. These techniques and variations of them have uses in a number of other agricultural product inspection problems.

  9. Multisensor multiresolution data fusion for improvement in classification

    Science.gov (United States)

    Rubeena, V.; Tiwari, K. C.

    2016-04-01

    The rapid advancements in technology have facilitated easy availability of multisensor and multiresolution remote sensing data. Multisensor, multiresolution data contain complementary information, and fusion of such data may reveal application-dependent significant information which may otherwise remain trapped within. The present work aims at improving classification by fusing features of coarse-resolution hyperspectral (1 m) LWIR and fine-resolution (20 cm) RGB data. The classification map comprises eight classes: Road, Trees, Red Roof, Grey Roof, Concrete Roof, Vegetation, Bare Soil and Unclassified. The processing methodology for the hyperspectral LWIR data comprises dimensionality reduction, resampling of the data by an interpolation technique to register the two images at the same spatial resolution, and extraction of spatial features to improve classification accuracy. For the fine-resolution RGB data, a vegetation index is computed for classifying the vegetation class and a morphological building index is calculated for buildings. To extract textural features, occurrence and co-occurrence statistics are considered and features are extracted from all three bands of the RGB data. After feature extraction, Support Vector Machines (SVMs) are used for training and classification. To increase classification accuracy, post-processing steps such as removal of spurious salt-and-pepper noise are applied, followed by majority-voting filtering within objects for better object classification.

  10. Scalable Packet Classification with Hash Tables

    Science.gov (United States)

    Wang, Pi-Chung

    In the last decade, the technique of packet classification has been widely deployed in various network devices, including routers, firewalls and network intrusion detection systems. In this work, we improve the performance of packet classification by using multiple hash tables. The existing hash-based algorithms have superior scalability with respect to the required space; however, their search performance may not be comparable to other algorithms. To improve the search performance, we propose a tuple reordering algorithm to minimize the number of accessed hash tables with the aid of bitmaps. We also use pre-computation to ensure the accuracy of our search procedure. Performance evaluation based on both real and synthetic filter databases shows that our scheme is effective and scalable and the pre-computation cost is moderate.
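
    As context, a toy sketch of hash-based tuple space search for two-field prefix rules (the baseline that tuple reordering and bitmaps improve on); the rule set and 32-bit field widths are invented for illustration:

    ```python
    from collections import defaultdict

    # A filter: (src_prefix_bits, src_len, dst_prefix_bits, dst_len, priority)
    FILTERS = [
        (0b1010 << 28, 4, 0b11 << 30, 2, 1),
        (0b10 << 30, 2, 0b110 << 29, 3, 2),
    ]

    def build_tuple_tables(filters):
        """One hash table per (src_len, dst_len) tuple, keyed by masked prefixes."""
        tables = defaultdict(dict)
        for src, sl, dst, dl, prio in filters:
            t = tables[(sl, dl)]
            key = (src, dst)
            if key not in t or prio < t[key]:   # keep best (lowest) priority per key
                t[key] = prio
        return tables

    def mask(addr, length, width=32):
        return addr & (((1 << length) - 1) << (width - length)) if length else 0

    def classify(tables, src, dst):
        """Probe every tuple's hash table; return the best-priority match."""
        best = None
        for (sl, dl), t in tables.items():
            prio = t.get((mask(src, sl), mask(dst, dl)))
            if prio is not None and (best is None or prio < best):
                best = prio
        return best

    tables = build_tuple_tables(FILTERS)
    print(classify(tables, 0xA0000001, 0xC0000001))  # matches the priority-1 filter
    ```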

  11. A NEW WASTE CLASSIFYING MODEL: HOW WASTE CLASSIFICATION CAN BECOME MORE OBJECTIVE?

    Directory of Open Access Journals (Sweden)

    Burcea Stefan Gabriel

    2015-07-01

    documents available in the virtual space, on the websites of certain international organizations involved in the wide and complex issue of waste management. The second part of the paper proposes a classification model with four main criteria intended to make waste classification a more objective process. The new classification model has the main role of transforming the traditional patterns of waste classification into an objective waste classification system, and a secondary role of eliminating the strong contextuality of current waste classification models.

  12. Probing spermiogenesis: a digital strategy for mouse acrosome classification.

    Science.gov (United States)

    Taloni, Alessandro; Font-Clos, Francesc; Guidetti, Luca; Milan, Simone; Ascagni, Miriam; Vasco, Chiara; Pasini, Maria Enrica; Gioria, Maria Rosa; Ciusani, Emilio; Zapperi, Stefano; La Porta, Caterina A M

    2017-06-16

    Classification of morphological features in biological samples is usually performed by a trained eye but the increasing amount of available digital images calls for semi-automatic classification techniques. Here we explore this possibility in the context of acrosome morphological analysis during spermiogenesis. Our method combines feature extraction from three dimensional reconstruction of confocal images with principal component analysis and machine learning. The method could be particularly useful in cases where the amount of data does not allow for a direct inspection by trained eye.

  13. Odor Classification using Agent Technology

    Directory of Open Access Journals (Sweden)

    Sigeru OMATU

    2014-03-01

    In order to measure and classify odors, a Quartz Crystal Microbalance (QCM) can be used. In the present study, seven QCM sensors and three different odors are used. The system has been developed as a virtual organization of agents using an agent platform called PANGEA (Platform for Automatic coNstruction of orGanizations of intElligent Agents). This is a platform for developing open multi-agent systems, specifically those including organizational aspects. The main reason for the use of agents is the scalability of the platform, i.e. the way in which it models the services. The system models functionalities as services inside the agents, or as Service Oriented Approach (SOA) architecture compliant services using Web Services. In this way, the odor classification system can be adapted with new algorithms, tools and classification techniques.

  14. A Classification-based Review Recommender

    Science.gov (United States)

    O'Mahony, Michael P.; Smyth, Barry

    Many online stores encourage their users to submit product/service reviews in order to guide future purchasing decisions. These reviews are often listed alongside product recommendations but, to date, limited attention has been paid to how best to present these reviews to the end-user. In this paper, we describe a supervised classification approach that is designed to identify and recommend the most helpful product reviews. Using the TripAdvisor service as a case study, we compare the performance of several classification techniques using a range of features derived from hotel reviews. We then describe how these classifiers can be used as the basis for a practical recommender that automatically suggests the most helpful contrasting reviews to end-users. We present an empirical evaluation which shows that our approach achieves a statistically significant improvement over alternative review ranking schemes.

  15. The First AO Classification System for Fractures of the Craniomaxillofacial Skeleton: Rationale, Methodological Background, Developmental Process, and Objectives.

    Science.gov (United States)

    Audigé, Laurent; Cornelius, Carl-Peter; Di Ieva, Antonio; Prein, Joachim

    2014-12-01

    Validated trauma classification systems are the sole means to provide the basis for reliable documentation and evaluation of patient care, which will open the gateway to evidence-based procedures and healthcare in the coming years. With the support of AO Investigation and Documentation, a classification group was established to develop and evaluate a comprehensive classification system for craniomaxillofacial (CMF) fractures. Blueprints for fracture classification in the major constituents of the human skull were drafted and then evaluated by a multispecialty group of experienced CMF surgeons and a radiologist in a structured process during iterative agreement sessions. At each session, surgeons independently classified the radiological imaging of up to 150 consecutive cases with CMF fractures. During subsequent review meetings, all discrepancies in the classification outcome were critically appraised for clarification and improvement until consensus was reached. The resulting CMF classification system is structured in a hierarchical fashion with three levels of increasing complexity. The most elementary level 1 simply distinguishes four fracture locations within the skull: mandible (code 91), midface (code 92), skull base (code 93), and cranial vault (code 94). Levels 2 and 3 focus on further defining the fracture locations and for fracture morphology, achieving an almost individual mapping of the fracture pattern. This introductory article describes the rationale for the comprehensive AO CMF classification system, discusses the methodological framework, and provides insight into the experiences and interactions during the evaluation process within the core groups. The details of this system in terms of anatomy and levels are presented in a series of focused tutorials illustrated with case examples in this special issue of the Journal.

  16. Applying data mining techniques to medical time series: an empirical case study in electroencephalography and stabilometry

    Directory of Open Access Journals (Sweden)

    A. Anguera

    2016-01-01

    This paper illustrates the application of different knowledge discovery techniques for classification in these two domains. The accuracy of this application for the two classes considered in each case is 99.86% and 98.11% for epilepsy diagnosis in the electroencephalography (EEG) domain, and 99.4% and 99.1% for early-age sports talent classification in the stabilometry domain. The KDD techniques achieve better results than traditional neural network-based classification techniques.

  17. Crowdsourcing as a novel technique for retinal fundus photography classification: analysis of images in the EPIC Norfolk cohort on behalf of the UK Biobank Eye and Vision Consortium.

    Science.gov (United States)

    Mitry, Danny; Peto, Tunde; Hayat, Shabina; Morgan, James E; Khaw, Kay-Tee; Foster, Paul J

    2013-01-01

    Crowdsourcing is the process of outsourcing numerous tasks to many untrained individuals. Our aim was to assess the performance and repeatability of crowdsourcing for the classification of retinal fundus photography. One hundred retinal fundus photograph images with pre-determined disease criteria were selected by experts from a large cohort study. After reading brief instructions and an example classification, we requested that knowledge workers (KWs) from a crowdsourcing platform classify each image as normal or abnormal with grades of severity. Each image was classified 20 times by different KWs. Four study designs were examined to assess the effect of varying incentive and KW experience on classification accuracy. All study designs were conducted twice to examine repeatability. Performance was assessed by comparing the sensitivity, specificity and area under the receiver operating characteristic curve (AUC). Without restriction on eligible participants, two thousand classifications of 100 images were received in under 24 hours at minimal cost. In trial 1, all study designs had an AUC (95%CI) of 0.701 (0.680-0.721) or greater for classification of normal/abnormal. In trial 1, the highest AUC (95%CI) for normal/abnormal classification was 0.757 (0.738-0.776) for KWs with moderate experience. Comparable results were observed in trial 2. In trial 1, between 64-86% of abnormal images were correctly classified by over half of all KWs. In trial 2, this ranged between 74-97%. Sensitivity was ≥ 96% for normal versus severely abnormal detections across all trials. Sensitivity for normal versus mildly abnormal varied between 61-79% across trials. With minimal training, crowdsourcing represents an accurate, rapid and cost-effective method of retinal image analysis which demonstrates good repeatability. Larger studies with more comprehensive participant training are needed to explore the utility of this compelling technique in large-scale medical image analysis.
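
    A sketch of the aggregation and evaluation step, assuming a hypothetical array of 20 worker labels per image and expert ground truth; scikit-learn supplies the ROC machinery:

    ```python
    import numpy as np
    from sklearn.metrics import roc_auc_score, confusion_matrix

    # labels: shape (n_images, n_workers), 0 = normal, 1 = abnormal (hypothetical)
    def crowd_scores(labels):
        """Fraction of workers calling each image abnormal: a soft crowd score."""
        return labels.mean(axis=1)

    def evaluate(labels, truth):
        scores = crowd_scores(labels)
        auc = roc_auc_score(truth, scores)
        majority = (scores > 0.5).astype(int)       # majority vote of the workers
        tn, fp, fn, tp = confusion_matrix(truth, majority).ravel()
        return auc, tp / (tp + fn), tn / (tn + fp)  # AUC, sensitivity, specificity

    # Synthetic demo: workers are right about 80% of the time.
    rng = np.random.default_rng(0)
    truth = rng.integers(0, 2, size=100)
    labels = (rng.random((100, 20)) < np.where(truth[:, None] == 1, 0.8, 0.2)).astype(int)
    print(evaluate(labels, truth))
    ```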

  18. An Ultrasonic Pattern Recognition Approach to Welding Defect Classification

    International Nuclear Information System (INIS)

    Song, Sung Jin

    1995-01-01

    Classification of flaws in weldments from their ultrasonic scattering signals is very important in quantitative nondestructive evaluation. This problem is ideally suited to a modern ultrasonic pattern recognition technique. Here, a brief discussion of a systematic approach to this methodology is presented, including ultrasonic feature extraction, feature selection and classification. A strong emphasis is placed on probabilistic neural networks as efficient classifiers for many practical classification problems. In an example, probabilistic neural networks are applied to classify flaws in weldments into three classes: cracks, porosity and slag inclusions. Probabilistic networks are shown to match the high performance of other classifiers without any training-time overhead. In addition, a forward selection scheme for sensitive features is addressed to enhance network performance.
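
    A probabilistic neural network is essentially a Parzen-window classifier; a compact NumPy sketch with a Gaussian kernel and one pattern unit per training example (the smoothing parameter sigma and the synthetic flaw features are illustrative):

    ```python
    import numpy as np

    class PNN:
        """Probabilistic neural network: class-conditional Parzen density estimates."""
        def __init__(self, sigma=0.5):
            self.sigma = sigma

        def fit(self, X, y):
            self.classes = np.unique(y)
            self.banks = [X[y == c] for c in self.classes]  # pattern layer per class
            return self

        def predict(self, X):
            preds = []
            for x in np.atleast_2d(X):
                # Gaussian kernel response of every stored pattern, averaged per class
                dens = [np.exp(-((bank - x) ** 2).sum(axis=1)
                               / (2 * self.sigma ** 2)).mean()
                        for bank in self.banks]
                preds.append(self.classes[int(np.argmax(dens))])
            return np.array(preds)

    # Toy usage with three flaw classes (crack=0, porosity=1, slag=2)
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(m, 0.3, size=(30, 4)) for m in (0.0, 1.0, 2.0)])
    y = np.repeat([0, 1, 2], 30)
    print((PNN().fit(X, y).predict(X) == y).mean())   # training accuracy near 1.0
    ```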

  19. PHOTOGRAMMETRIC AND LIDAR DOCUMENTATION OF THE ROYAL CHAPEL (CATHEDRAL-MOSQUE OF CORDOBA, SPAIN)

    Directory of Open Access Journals (Sweden)

    J. Cardenal

    2012-07-01

    At present, cultural heritage documentation projects use a variety of spatial data acquisition techniques such as conventional surveying, photogrammetry and terrestrial laser scanning (TLS). This paper deals with a full documentation project based on all of those techniques in the Royal Chapel located in the Cathedral-Mosque of Cordoba in Spain, declared a World Heritage Site by UNESCO. The Royal Chapel is currently under study for a detailed diagnostic analysis in order to evaluate the actual state of the chapel, its pathologies, construction phases, previous restoration works, material analysis, etc. To assist the evaluation, a documentation project with photogrammetric and laser scanning techniques has been carried out. For this purpose, accurate cartographic and 3D products, produced by integrating both image- and laser-based techniques, were needed to register all data collected during the diagnostic analysis.

  20. Classification of alarm processing techniques and human performance issues

    International Nuclear Information System (INIS)

    Kim, I.S.; O'Hara, J.M.

    1993-01-01

    Human factors reviews indicate that conventional alarm systems based on the one sensor, one alarm approach, have many human engineering deficiencies, a paramount example being too many alarms during major disturbances. As an effort to resolve these deficiencies, various alarm processing systems have been developed using different techniques. To ensure their contribution to operational safety, the impacts of those systems on operating crew performance should be carefully evaluated. This paper briefly reviews some of the human factors research issues associated with alarm processing techniques and then discusses a framework with which to classify the techniques. The dimensions of this framework can be used to explore the effects of alarm processing systems on human performance.

  1. Classification of alarm processing techniques and human performance issues

    Energy Technology Data Exchange (ETDEWEB)

    Kim, I.S.; O'Hara, J.M.

    1993-01-01

    Human factors reviews indicate that conventional alarm systems based on the one sensor, one alarm approach, have many human engineering deficiencies, a paramount example being too many alarms during major disturbances. As an effort to resolve these deficiencies, various alarm processing systems have been developed using different techniques. To ensure their contribution to operational safety, the impacts of those systems on operating crew performance should be carefully evaluated. This paper briefly reviews some of the human factors research issues associated with alarm processing techniques and then discusses a framework with which to classify the techniques. The dimensions of this framework can be used to explore the effects of alarm processing systems on human performance.

  3. Classification of reflected signals from cavitated tooth surfaces using an artificial intelligence technique incorporating a fiber optic displacement sensor

    Science.gov (United States)

    Rahman, Husna Abdul; Harun, Sulaiman Wadi; Arof, Hamzah; Irawati, Ninik; Musirin, Ismail; Ibrahim, Fatimah; Ahmad, Harith

    2014-05-01

    An enhanced dental cavity diameter measurement mechanism using an intensity-modulated fiber optic displacement sensor (FODS) scanning and imaging system, fuzzy logic, and a single-layer perceptron (SLP) neural network is presented. The SLP network was employed for the classification of the reflected signals, which were obtained from the surfaces of teeth samples and captured using the FODS. Two features were used for the classification of the reflected signals, one of them being the output of a fuzzy logic system. The test results showed that the combined fuzzy logic and SLP network methodology achieved 100% classification accuracy. The high classification accuracy demonstrates the suitability of the proposed features and SLP-based classification for classifying the reflected signals from teeth surfaces, enabling the sensor to accurately measure small tooth cavity diameters of up to 0.6 mm. The method remains simple enough to allow its easy integration in existing dental restoration support systems.
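
    A single-layer perceptron of this kind fits in a few lines of NumPy; this sketch uses the classic perceptron learning rule on two illustrative features (assumed stand-ins for the paper's raw-sensor and fuzzy-logic outputs):

    ```python
    import numpy as np

    def train_slp(X, y, epochs=100, lr=0.1):
        """Perceptron rule: w += lr * (target - prediction) * x, with a bias term."""
        Xb = np.hstack([X, np.ones((len(X), 1))])    # append bias input
        w = np.zeros(Xb.shape[1])
        for _ in range(epochs):
            for xi, ti in zip(Xb, y):
                pred = 1 if xi @ w > 0 else 0
                w += lr * (ti - pred) * xi
        return w

    def predict_slp(w, X):
        Xb = np.hstack([X, np.ones((len(X), 1))])
        return (Xb @ w > 0).astype(int)

    # Toy usage: two features per reflected signal, two linearly separable classes
    rng = np.random.default_rng(2)
    X = np.vstack([rng.normal(0, 0.2, (20, 2)), rng.normal(1, 0.2, (20, 2))])
    y = np.repeat([0, 1], 20)
    w = train_slp(X, y)
    print((predict_slp(w, X) == y).mean())           # expect 1.0 on this toy data
    ```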

  4. Chronic radiation effects on dental hard tissue ("radiation caries"). Classification and therapeutic strategies

    International Nuclear Information System (INIS)

    Groetz, K.A.; Brahm, R.; Al-Nawas, B.; Wagner, W.; Riesenbeck, D.; Willich, N.; Seegenschmiedt, M.H.

    2001-01-01

    Objectives: Since the first description of rapid destruction of dental hard tissues following head and neck radiotherapy 80 years ago, 'radiation caries' has been an established clinical finding. However, the internationally accepted clinical evaluation score RTOG/EORTC lacks a classification of this frequent radiogenic alteration. Material and Methods: Medical records, data and images of radiation effects on the teeth of more than 1,500 patients who underwent periradiotherapeutic care were analyzed. Macroscopic alterations regarding the grade of late lesions of tooth crowns were used for a classification into 4 grades according to the RTOG/EORTC guidelines. Results: No early radiation effects were found by macroscopic inspection. In the first 90 days following radiotherapy, one third of the patients complained of reversible hypersensitivity, which may be related to a temporary hyperemia of the pulp. It was possible to classify radiation caries as a late radiation effect on a graded scale as known from RTOG/EORTC for other organ systems. This is a prerequisite for the integration of radiation caries into the international nomenclature of the RTOG/EORTC classification. Conclusions: The documentation of early radiation effects on dental hard tissues seems to be dispensable. On the other hand, the documentation of late radiation effects has a high clinical impact. The identification of an initial lesion at the high-risk areas of the neck and incisal part of the tooth can lead to successful therapy, a major prerequisite for orofacial rehabilitation. An internationally standardized documentation is a basis for the evaluation of the side effects of radio-oncologic therapy as well as the effectiveness of protective and supportive procedures. (orig.)

  5. Decision theory for discrimination-aware classification

    KAUST Repository

    Kamiran, Faisal

    2012-12-01

    Social discrimination (e.g., against females) arising from data mining techniques is a growing concern worldwide. In recent years, several methods have been proposed for making classifiers learned over discriminatory data discrimination-aware. However, these methods suffer from two major shortcomings: (1) they require either modifying the discriminatory data or tweaking a specific classification algorithm, and (2) they are not flexible with respect to discrimination control and the handling of multiple sensitive attributes. In this paper, we present two solutions for discrimination-aware classification that require neither data modification nor classifier tweaking. Our first and second solutions exploit, respectively, the reject option of probabilistic classifier(s) and the disagreement region of general classifier ensembles to reduce discrimination. We relate both solutions to decision theory for a better understanding of the process. Our experiments using real-world datasets demonstrate that our solutions outperform existing state-of-the-art methods, especially at low discrimination levels, which is a significant advantage. The superior performance, coupled with flexible control over discrimination and easy applicability to multiple sensitive attributes, makes our solutions an important step forward in practical discrimination-aware classification. © 2012 IEEE.
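
    A minimal sketch of the first solution (the reject option of a probabilistic classifier), assuming a binary sensitive attribute and a logistic regression model; the threshold 0.6, the group coding, and the synthetic data are illustrative assumptions:

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def reject_option_predict(clf, X, sensitive, theta=0.6):
        """Within the low-confidence band max(p, 1-p) < theta, give the favorable
        label (1) to the deprived group and the unfavorable label (0) to the
        favored group; elsewhere keep the classifier's own decision."""
        proba = clf.predict_proba(X)[:, 1]
        pred = (proba >= 0.5).astype(int)
        uncertain = np.maximum(proba, 1 - proba) < theta
        pred[uncertain & (sensitive == 1)] = 1   # deprived group (coded 1, assumed)
        pred[uncertain & (sensitive == 0)] = 0   # favored group
        return pred

    # Toy usage with synthetic data; column 0 plays the sensitive attribute.
    rng = np.random.default_rng(3)
    X = rng.normal(size=(200, 3))
    sensitive = (X[:, 0] > 0).astype(int)
    y = (X[:, 1] + 0.5 * sensitive + rng.normal(0, 0.5, 200) > 0).astype(int)
    clf = LogisticRegression().fit(X, y)
    pred = reject_option_predict(clf, X, sensitive)
    ```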

  6. a Two-Step Classification Approach to Distinguishing Similar Objects in Mobile LIDAR Point Clouds

    Science.gov (United States)

    He, H.; Khoshelham, K.; Fraser, C.

    2017-09-01

    Nowadays, lidar is widely used in cultural heritage documentation, urban modeling, and driverless car technology for its fast and accurate 3D scanning ability. However, full exploitation of the potential of point cloud data for efficient and automatic object recognition remains elusive. Recently, feature-based methods have become very popular in object recognition on account of their good performance in capturing object details. Compared with global features describing the whole shape of the object, local features recording fractional details are more discriminative and are applicable to object classes with considerable similarity. In this paper, we propose a two-step classification approach based on point feature histograms and the bag-of-features method for automatic recognition of similar objects in mobile lidar point clouds. Lamp posts, street lights and traffic signs are grouped into one category in the first classification step because of their mutual similarity relative to trees and vehicles. A finer classification of the lamp posts, street lights and traffic signs, based on the result of the first step, is implemented in the second step. The proposed two-step classification approach is shown to yield a considerable improvement over the conventional one-step classification approach.

  7. Final hazard classification and auditable safety analysis for the 105-C Reactor Interim Safe Storage Project

    International Nuclear Information System (INIS)

    Rodovsky, T.J.; Larson, A.R.; Dexheimer, D.

    1996-12-01

    This document summarizes the inventories of radioactive and hazardous materials present in the 105-C Reactor Facility and the operations associated with the Interim Safe Storage Project, which includes decontamination and demolition and interim safe storage of the remaining facility. This document also establishes a final hazard classification and verifies that appropriate and adequate safety functions and controls are in place to reduce or mitigate the risk associated with those operations.

  8. Recording, ordering and controlling: diplomatic documents in the "Registra Secreta" of James II of Aragon

    Directory of Open Access Journals (Sweden)

    Péquignot, Stéphane

    2002-06-01

    Full Text Available The study of the internal organisation of Jayme II´s Registra secreta shows the existence of a specific «system» for the gathering of diplomatical registered papers in chapters called legatio. However, this method of classification does not exclude other ones and these registers can be considered as an experimental laboratory for the arrangement and the location of registered documents. The composition of the registers and the interventions of the scribes within conduce then to look for the functions they are assigned with: allowing an easier control of the uses of the documentation by the king´s ambassadors, the Registra secreta take part, at the chancery, in a living memory of the administration of diplomatical matters.

  9. Recognition for old Arabic manuscripts using spatial gray level dependence (SGLD)

    Directory of Open Access Journals (Sweden)

    Ahmad M. Abd Al-Aziz

    2011-03-01

    Texture analysis forms the basis of object recognition and classification in several domains. One of these domains is historical manuscripts, which carry our cultural heritage and include large numbers of undated documents. This paper presents results for the classification of old Arabic manuscripts using texture analysis and a segmentation-free approach. The main objective is to discriminate between historical documents of different writing styles from three different ages: Contemporary (Modern) Age, Ottoman Age and Mamluk Age. This classification depends on a Spatial Gray-Level Dependence (SGLD) technique which provides eight distinct texture features for each sample document. We applied Stepwise Discriminant Analysis and Multiple Discriminant Analysis to decrease the dimensionality of the features and to extract training feature vectors from the samples. Decision tree classification is then applied to classify historical documents into the three main historical age classes. The system has been tested on 48 Arabic historical manuscript documents from the Dar Al-Kotob Al-Masria Library. Our results so far yield 95.83% correct classification for the historical Arabic documents.
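
    SGLD features correspond to what scikit-image calls gray-level co-occurrence (GLCM) properties; a sketch of the feature-extraction and decision-tree stages under that assumption (distances, angles and the specific properties are illustrative choices):

    ```python
    import numpy as np
    from skimage.feature import graycomatrix, graycoprops
    from sklearn.tree import DecisionTreeClassifier

    PROPS = ("contrast", "dissimilarity", "homogeneity", "energy", "correlation", "ASM")

    def sgld_features(gray_uint8):
        """Co-occurrence texture features of one document image (levels 0..255)."""
        glcm = graycomatrix(gray_uint8, distances=[1, 2],
                            angles=[0, np.pi / 2], levels=256,
                            symmetric=True, normed=True)
        return np.concatenate([graycoprops(glcm, p).ravel() for p in PROPS])

    # Hypothetical usage: images is a list of 2-D uint8 arrays, ages the labels
    # (0 = Contemporary, 1 = Ottoman, 2 = Mamluk).
    # X = np.stack([sgld_features(img) for img in images])
    # clf = DecisionTreeClassifier(max_depth=5).fit(X, ages)
    ```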

  10. Analysis of Landsat-4 Thematic Mapper data for classification of forest stands in Baldwin County, Alabama

    Science.gov (United States)

    Hill, C. L.

    1984-01-01

    A computer-implemented classification has been derived from Landsat-4 Thematic Mapper data acquired over Baldwin County, Alabama on January 15, 1983. One set of spectral signatures was developed from the data by utilizing a 3x3 pixel sliding window approach. An analysis of the classification produced from this technique identified forested areas. Additional information regarding only the forested areas was then extracted by employing a pixel-by-pixel signature development program which derived spectral statistics only for pixels within the forested land covers. The spectral statistics from both approaches were integrated and the data classified. This classification was evaluated by comparing the spectral classes produced from the data against corresponding ground verification polygons. This iterative data analysis technique resulted in an overall classification accuracy of 88.4 percent correct for slash pine, young pine, loblolly pine, natural pine, and mixed hardwood-pine. An accuracy assessment matrix has been produced for the classification.

  11. Prediction of customer behaviour analysis using classification algorithms

    Science.gov (United States)

    Raju, Siva Subramanian; Dhandayudam, Prabha

    2018-04-01

    Customer relationship management plays a crucial role in analyzing customer behavior patterns and their value to an enterprise. Analysis of customer data can be performed efficiently using various data mining techniques, with the goal of developing business strategies and enhancing the business. In this paper, three classification models (NB, J48, and MLPNN) are studied and evaluated for our experimental purpose. The performance of the three classifiers is compared using three different parameters (accuracy, sensitivity, specificity), and the experimental results show that the J48 algorithm has better accuracy than the NB and MLPNN algorithms.
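
    A sketch of such a comparison using scikit-learn stand-ins (GaussianNB for NB, a decision tree in place of J48, MLPClassifier for MLPNN); the synthetic data and split are illustrative, and the metrics follow the paper's three parameters:

    ```python
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import confusion_matrix
    from sklearn.naive_bayes import GaussianNB
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.neural_network import MLPClassifier

    X, y = make_classification(n_samples=600, n_features=10, random_state=0)
    Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0)

    models = {
        "NB": GaussianNB(),
        "J48-like tree": DecisionTreeClassifier(random_state=0),
        "MLPNN": MLPClassifier(max_iter=1000, random_state=0),
    }
    for name, model in models.items():
        pred = model.fit(Xtr, ytr).predict(Xte)
        tn, fp, fn, tp = confusion_matrix(yte, pred).ravel()
        print(f"{name}: accuracy={(tp + tn) / len(yte):.3f} "
              f"sensitivity={tp / (tp + fn):.3f} specificity={tn / (tn + fp):.3f}")
    ```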

  12. Data classification based on the hybrid intellectual technology

    Directory of Open Access Journals (Sweden)

    Demidova Liliya

    2018-01-01

    In this paper a data classification technique, based on the sequential application of SVM and Parzen classifiers, is suggested. The Parzen classifier is applied to data which can be either correctly or erroneously classified by the SVM classifier and which are located in experimentally defined subareas near the hyperplane separating the classes. Herewith, the SVM classifier is used with its default parameter values, while the optimal parameter values of the Parzen classifier are determined using a genetic algorithm. Experimental results confirming the effectiveness of the proposed hybrid intellectual data classification technology are presented.
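
    A sketch of the two-stage idea: an SVM decides confidently far from the hyperplane, and per-class Parzen (kernel density) models take over in a band around it. The band width tau and kernel bandwidth are illustrative, and the genetic-algorithm tuning step is omitted:

    ```python
    import numpy as np
    from sklearn.svm import SVC
    from sklearn.neighbors import KernelDensity

    class SVMParzenHybrid:
        def __init__(self, tau=0.5, bandwidth=0.5):
            self.svm = SVC()                 # default parameters, as in the paper
            self.tau, self.bandwidth = tau, bandwidth

        def fit(self, X, y):
            self.svm.fit(X, y)
            self.classes = np.unique(y)
            # One Parzen (kernel density) model per class for the fallback stage
            self.kdes = [KernelDensity(bandwidth=self.bandwidth).fit(X[y == c])
                         for c in self.classes]
            return self

        def predict(self, X):
            pred = self.svm.predict(X)
            near = np.abs(self.svm.decision_function(X)) < self.tau  # near hyperplane
            if near.any():
                dens = np.column_stack([k.score_samples(X[near]) for k in self.kdes])
                pred[near] = self.classes[dens.argmax(axis=1)]
            return pred

    # Usage sketch:
    # clf = SVMParzenHybrid().fit(X_train, y_train); y_hat = clf.predict(X_test)
    ```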

  13. Risk assessments using the Strain Index and the TLV for HAL, Part I: Task and multi-task job exposure classifications.

    Science.gov (United States)

    Kapellusch, Jay M; Bao, Stephen S; Silverstein, Barbara A; Merryweather, Andrew S; Thiese, Mathew S; Hegmann, Kurt T; Garg, Arun

    2017-12-01

    The Strain Index (SI) and the American Conference of Governmental Industrial Hygienists (ACGIH) Threshold Limit Value for Hand Activity Level (TLV for HAL) use different constituent variables to quantify task physical exposures. Similarly, time-weighted-average (TWA), Peak, and Typical exposure techniques to quantify physical exposure from multi-task jobs make different assumptions about each task's contribution to the whole job exposure. Thus, task and job physical exposure classifications differ depending upon which model and technique are used for quantification. This study examines exposure classification agreement, disagreement, correlation, and magnitude of classification differences between these models and techniques. Data from 710 multi-task job workers performing 3,647 tasks were analyzed using the SI and TLV for HAL models, as well as with the TWA, Typical and Peak job exposure techniques. Physical exposures were classified as low, medium, and high using each model's recommended, or a priori limits. Exposure classification agreement and disagreement between models (SI, TLV for HAL) and between job exposure techniques (TWA, Typical, Peak) were described and analyzed. Regardless of technique, the SI classified more tasks as high exposure than the TLV for HAL, and the TLV for HAL classified more tasks as low exposure. The models agreed on 48.5% of task classifications (kappa = 0.28) with 15.5% of disagreement between low and high exposure categories. Between-technique (i.e., TWA, Typical, Peak) agreement ranged from 61-93% (kappa: 0.16-0.92) depending on whether the SI or TLV for HAL was used. There was disagreement between the SI and TLV for HAL and between the TWA, Typical and Peak techniques. Disagreement creates uncertainty for job design, job analysis, risk assessments, and developing interventions. Task exposure classifications from the SI and TLV for HAL might complement each other. However, TWA, Typical, and Peak job exposure techniques all have

  14. Automated classification of Acid Rock Drainage potential from Corescan drill core imagery

    Science.gov (United States)

    Cracknell, M. J.; Jackson, L.; Parbhakar-Fox, A.; Savinova, K.

    2017-12-01

    Classification of the acid forming potential of waste rock is important for managing environmental hazards associated with mining operations. Current methods for the classification of acid rock drainage (ARD) potential usually involve labour intensive and subjective assessment of drill core and/or hand specimens. Manual methods are subject to operator bias, human error and the amount of material that can be assessed within a given time frame is limited. The automated classification of ARD potential documented here is based on the ARD Index developed by Parbhakar-Fox et al. (2011). This ARD Index involves the combination of five indicators: A - sulphide content; B - sulphide alteration; C - sulphide morphology; D - primary neutraliser content; and E - sulphide mineral association. Several components of the ARD Index require accurate identification of sulphide minerals. This is achieved by classifying Corescan Red-Green-Blue true colour images into the presence or absence of sulphide minerals using supervised classification. Subsequently, sulphide classification images are processed and combined with Corescan SWIR-based mineral classifications to obtain information on sulphide content, indices representing sulphide textures (disseminated versus massive and degree of veining), and spatially associated minerals. This information is combined to calculate ARD Index indicator values that feed into the classification of ARD potential. Automated ARD potential classifications of drill core samples associated with a porphyry Cu-Au deposit are compared to manually derived classifications and those obtained by standard static geochemical testing and X-ray diffractometry analyses. Results indicate a high degree of similarity between automated and manual ARD potential classifications. Major differences between approaches are observed in sulphide and neutraliser mineral percentages, likely due to the subjective nature of manual estimates of mineral content. The automated approach

  15. Non-fuel assembly components: 10 CFR 61.55 classification for waste disposal

    International Nuclear Information System (INIS)

    Migliore, R.J.; Reid, B.D.; Fadeff, S.K.; Pauley, K.A.; Jenquin, U.P.

    1994-09-01

    This document reports the results of laboratory radionuclide measurements on a representative group of non-fuel assembly (NFA) components for the purposes of waste classification. This document also provides a methodology to estimate the radionuclide inventory of NFA components, including those located outside the fueled region of a nuclear reactor. These radionuclide estimates can then be used to determine the waste classification of NFA components for which there are no physical measurements. Previously, few radionuclide inventory measurements had been performed on NFA components. For this project, recommended scaling factors were selected for the ORIGEN2 computer code that result in conservative estimates of radionuclide concentrations in NFA components. These scaling factors were based upon experimental data obtained from the following NFA components: (1) a pressurized water reactor (PWR) burnable poison rod assembly, (2) a PWR rod cluster control assembly, and (3) a boiling water reactor cruciform control rod blade. As a whole, these components were found to be within Class C limits. Laboratory radionuclide measurements for these components are provided in detail.

  16. Preliminary Hazard Classification of the 1714-N, Lead Storage

    International Nuclear Information System (INIS)

    Kerr, N. R.

    1999-01-01

    The 1714-N, -NA, and -NB building segment was deactivated under the N Area Deactivation Project. During the deactivation the building was designated as an area to store recycled or reused lead products. This document presents the Preliminary Hazard Classification (PHC) for the continued storage of lead products by Bechtel Hanford, Inc. (BHI). Two types of hazardous substances are the focus of this PHC: lead and residual radiological contamination. An evaluation contained in this PHC concludes that there is little risk from the remaining hazardous substances. It was further concluded that standard institutional controls implemented under the BHI contract provide adequate protection to people and the environment. No further safety analysis documentation is required for the continued lead storage.

  17. Classification of gene expression data: A hubness-aware semi-supervised approach.

    Science.gov (United States)

    Buza, Krisztian

    2016-04-01

    Classification of gene expression data is the common denominator of various biomedical recognition tasks. However, obtaining class labels for large training samples may be difficult or even impossible in many cases. Therefore, semi-supervised classification techniques are required as semi-supervised classifiers take advantage of unlabeled data. Gene expression data is high-dimensional which gives rise to the phenomena known under the umbrella of the curse of dimensionality, one of its recently explored aspects being the presence of hubs or hubness for short. Therefore, hubness-aware classifiers have been developed recently, such as Naive Hubness-Bayesian k-Nearest Neighbor (NHBNN). In this paper, we propose a semi-supervised extension of NHBNN which follows the self-training schema. As one of the core components of self-training is the certainty score, we propose a new hubness-aware certainty score. We performed experiments on publicly available gene expression data. These experiments show that the proposed classifier outperforms its competitors. We investigated the impact of each of the components (classification algorithm, semi-supervised technique, hubness-aware certainty score) separately and showed that each of these components are relevant to the performance of the proposed approach. Our results imply that our approach may increase classification accuracy and reduce computational costs (i.e., runtime). Based on the promising results presented in the paper, we envision that hubness-aware techniques will be used in various other biomedical machine learning tasks. In order to accelerate this process, we made an implementation of hubness-aware machine learning techniques publicly available in the PyHubs software package (http://www.biointelligence.hu/pyhubs) implemented in Python, one of the most popular programming languages of data science. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
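
    The self-training schema itself is compact; a sketch with a k-NN base learner standing in for NHBNN and the predicted probability as a (non-hubness-aware) certainty score, both assumptions made for illustration:

    ```python
    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    def self_train(X_lab, y_lab, X_unlab, threshold=0.9, max_rounds=10):
        """Self-training: repeatedly add confidently classified unlabeled samples."""
        clf = KNeighborsClassifier(n_neighbors=5)
        for _ in range(max_rounds):
            clf.fit(X_lab, y_lab)
            if len(X_unlab) == 0:
                break
            proba = clf.predict_proba(X_unlab)
            certainty = proba.max(axis=1)     # stand-in for the hubness-aware score
            confident = certainty >= threshold
            if not confident.any():
                break
            X_lab = np.vstack([X_lab, X_unlab[confident]])
            y_lab = np.concatenate(
                [y_lab, clf.classes_[proba[confident].argmax(axis=1)]])
            X_unlab = X_unlab[~confident]
        return clf.fit(X_lab, y_lab)
    ```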

  18. Land Cover Classification Using ALOS Imagery For Penang, Malaysia

    International Nuclear Information System (INIS)

    Sim, C K; Abdullah, K; MatJafri, M Z; Lim, H S

    2014-01-01

    This paper presents the potential of integrating optical and radar remote sensing data to improve automatic land cover mapping. The analysis involved standard image processing, and consists of spectral signature extraction and application of a statistical decision rule to identify land cover categories. A maximum likelihood classifier is utilized to determine different land cover categories. Ground reference data from sites throughout the study area are collected for training and validation. The land cover information was extracted from the digital data using PCI Geomatica 10.3.2 software package. The variations in classification accuracy due to a number of radar imaging processing techniques are studied. The relationship between the processing window and the land classification is also investigated. The classification accuracies from the optical and radar feature combinations are studied. Our research finds that fusion of radar and optical significantly improved classification accuracies. This study indicates that the land cover/use can be mapped accurately by using this approach

  19. A Classification Framework Applied to Cancer Gene Expression Profiles

    Directory of Open Access Journals (Sweden)

    Hussein Hijazi

    2013-01-01

    Classification of cancer based on gene expression has provided insight into possible treatment strategies. Thus, developing machine learning methods that can successfully distinguish among cancer subtypes or normal versus cancer samples is important. This work discusses supervised learning techniques that have been employed to classify cancers. Furthermore, a two-step feature selection method based on an attribute estimation method (e.g., ReliefF) and a genetic algorithm was employed to find a set of genes that can best differentiate between cancer subtypes or normal versus cancer samples. The application of different classification methods (e.g., decision tree, k-nearest neighbor, support vector machine (SVM), bagging, and random forest) on 5 cancer datasets shows that no classification method universally outperforms all the others. However, k-nearest neighbor and linear SVM generally improve the classification performance over other classifiers. Finally, incorporating diverse types of genomic data (e.g., protein-protein interaction data and gene expression) increases the prediction accuracy as compared to using gene expression alone.

  20. Is overall similarity classification less effortful than single-dimension classification?

    Science.gov (United States)

    Wills, Andy J; Milton, Fraser; Longmore, Christopher A; Hester, Sarah; Robinson, Jo

    2013-01-01

    It is sometimes argued that the implementation of an overall similarity classification is less effortful than the implementation of a single-dimension classification. In the current article, we argue that the evidence securely in support of this view is limited, and report additional evidence in support of the opposite proposition--overall similarity classification is more effortful than single-dimension classification. Using a match-to-standards procedure, Experiments 1A, 1B and 2 demonstrate that concurrent load reduces the prevalence of overall similarity classification, and that this effect is robust to changes in the concurrent load task employed, the level of time pressure experienced, and the short-term memory requirements of the classification task. Experiment 3 demonstrates that participants who produced overall similarity classifications from the outset have larger working memory capacities than those who produced single-dimension classifications initially, and Experiment 4 demonstrates that instructions to respond meticulously increase the prevalence of overall similarity classification.