WorldWideScience

Sample records for classifying proteinlike sequences

  1. Optimization of short amino acid sequences classifier

    Science.gov (United States)

    Barcz, Aleksy; Szymański, Zbigniew

    This article describes processing methods used for classifying short amino acid sequences. The data processed are 9-symbol string representations of amino acid sequences, divided into 49 data sets, each containing samples labeled as reacting or not reacting with a given enzyme. The goal of the classification is to determine, for a single enzyme, whether an amino acid sequence would react with it. Each data set is processed separately. Feature selection is performed to reduce the number of dimensions for each data set. The method used for feature selection consists of two phases. During the first phase, significant positions are selected using Classification and Regression Trees. Afterwards, symbols appearing at the selected positions are substituted with numeric values of amino acid properties taken from the AAindex database. In the second phase, the new set of features is reduced using a correlation-based ranking formula and Gram-Schmidt orthogonalization. Finally, the preprocessed data is used for training LS-SVM classifiers. SPDE, an evolutionary algorithm, is used to obtain optimal hyperparameters for the LS-SVM classifier, such as the error penalty parameter C and kernel-specific hyperparameters. A simple score penalty is used to adapt the SPDE algorithm to the task of selecting classifiers with the best performance measure values.
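
    The second feature-selection phase described above can be illustrated with a minimal Python sketch. The abstract does not state the ranking formula, so the squared cosine between a feature and the current target residual is an assumed stand-in. The function name `gram_schmidt_rank`, the dictionary layout, and the toy data are hypothetical, and all columns are assumed to be mean-centered.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt_rank(features, target, n_select):
    """Rank features by relevance to the target, removing redundancy.

    features: dict mapping a (hypothetical) feature name to a list of values.
    target:   list of target values, same length as each feature column.
    Assumption: columns and target are mean-centered, so the squared cosine
    used as the score equals a squared correlation.
    """
    remaining = {name: list(col) for name, col in features.items()}
    residual = list(target)
    selected = []

    def score(col):
        nc, nr = dot(col, col), dot(residual, residual)
        if nc < 1e-12 or nr < 1e-12:
            return 0.0
        return dot(col, residual) ** 2 / (nc * nr)

    while remaining and len(selected) < n_select:
        best = max(remaining, key=lambda name: score(remaining[name]))
        if score(remaining[best]) <= 1e-12:
            break  # nothing left that explains the residual
        q = remaining.pop(best)
        qq = dot(q, q)
        selected.append(best)
        # Gram-Schmidt step: project the chosen direction out of the
        # target residual and out of every remaining feature.
        coef = dot(residual, q) / qq
        residual = [r - coef * x for r, x in zip(residual, q)]
        for name, col in remaining.items():
            c = dot(col, q) / qq
            remaining[name] = [v - c * x for v, x in zip(col, q)]
    return selected


# Toy data (hypothetical): "a_dup" is a scaled copy of "a" and is never picked.
toy = {"a": [1, -1, 1, -1], "b": [1, 1, -1, -1], "a_dup": [2, -2, 2, -2]}
print(gram_schmidt_rank(toy, [3, -1, 1, -3], 3))
```

    Because each selected feature is projected out of the others, a redundant column ends up with zero norm and is skipped, which is the reason orthogonalization is paired with a relevance ranking.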

  2. Classifying next-generation sequencing data using a zero-inflated Poisson model.

    Science.gov (United States)

    Zhou, Yan; Wan, Xiang; Zhang, Baoxue; Tong, Tiejun

    2018-04-15

    With the development of high-throughput techniques, RNA-sequencing (RNA-seq) is becoming increasingly popular as an alternative for gene expression analysis, such as RNA profiling and classification. Identifying which type of disease a new patient has from RNA-seq data has been recognized as a vital problem in medical research. As RNA-seq data are discrete, statistical methods developed for classifying microarray data cannot be readily applied to RNA-seq data classification. Witten proposed a Poisson linear discriminant analysis (PLDA) to classify RNA-seq data in 2011. Note, however, that count datasets are frequently characterized by excess zeros in real RNA-seq or microRNA sequence data (i.e. when the sequencing depth is insufficient, or for small RNAs with a length of 18-30 nucleotides). Therefore, it is desirable to develop a new model to analyze RNA-seq data with an excess of zeros. In this paper, we propose a Zero-Inflated Poisson Logistic Discriminant Analysis (ZIPLDA) for RNA-seq data with an excess of zeros. The new method assumes that the data are from a mixture of two distributions: one is a point mass at zero, and the other follows a Poisson distribution. We then consider a logistic relation between the probability of observing zeros and the mean of the genes and the sequencing depth in the model. Simulation studies show that the proposed method performs better than, or at least as well as, the existing methods in a wide range of settings. Two real datasets, a breast cancer RNA-seq dataset and a microRNA-seq dataset, are also analyzed, and they agree with the simulation results that our proposed method outperforms the existing competitors. The software is available at http://www.math.hkbu.edu.hk/~tongt. Contact: xwan@comp.hkbu.edu.hk or tongt@hkbu.edu.hk. Supplementary data are available at Bioinformatics online.
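
    The two-component mixture described above (a point mass at zero plus a Poisson distribution) can be written down directly. The following is a minimal sketch of the zero-inflated Poisson probability mass function only; it is not the ZIPLDA classifier, and the function name `zip_pmf` and the parameter names `pi` (probability of a structural zero) and `lam` (Poisson mean) are assumptions chosen for illustration.

```python
import math


def zip_pmf(k, pi, lam):
    """P(X = k) for a zero-inflated Poisson variable.

    With probability `pi` the count is a structural zero; otherwise it is
    drawn from Poisson(lam). Both parameter names are illustrative.
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Zeros come from both components of the mixture.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


# Excess zeros: compare the zero probability with and without inflation.
print(zip_pmf(0, 0.0, 2.0))  # plain Poisson(2)
print(zip_pmf(0, 0.3, 2.0))  # 30% structural zeros
```

    Setting `pi = 0` recovers the ordinary Poisson model, which makes it easy to see how much of the zero mass is attributed to the inflation component.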

  3. Binary classifiers and latent sequence models for emotion detection in suicide notes.

    Science.gov (United States)

    Cherry, Colin; Mohammad, Saif M; de Bruijn, Berry

    2012-01-01

    This paper describes the National Research Council of Canada's submission to the 2011 i2b2 NLP challenge on the detection of emotions in suicide notes. In this task, each sentence of a suicide note is annotated with zero or more emotions, making it a multi-label sentence classification task. We employ two distinct large-margin models capable of handling multiple labels. The first uses one classifier per emotion, and is built to simplify label balance issues and to allow extremely fast development. This approach is very effective, scoring an F-measure of 55.22 and placing fourth in the competition, making it the best system that does not use web-derived statistics or re-annotated training data. Second, we present a latent sequence model, which learns to segment the sentence into a number of emotion regions. This model is intended to gracefully handle sentences that convey multiple thoughts and emotions. Preliminary work with the latent sequence model shows promise, resulting in comparable performance using fewer features.

  4. Neural network and SVM classifiers accurately predict lipid binding proteins, irrespective of sequence homology.

    Science.gov (United States)

    Bakhtiarizadeh, Mohammad Reza; Moradi-Shahrbabak, Mohammad; Ebrahimi, Mansour; Ebrahimie, Esmaeil

    2014-09-07

    Due to the central roles of lipid binding proteins (LBPs) in many biological processes, sequence based identification of LBPs is of great interest. The major challenge is that LBPs are diverse in sequence, structure, and function which results in low accuracy of sequence homology based methods. Therefore, there is a need for developing alternative functional prediction methods irrespective of sequence similarity. To identify LBPs from non-LBPs, the performances of support vector machine (SVM) and neural network were compared in this study. Comprehensive protein features and various techniques were employed to create datasets. Five-fold cross-validation (CV) and independent evaluation (IE) tests were used to assess the validity of the two methods. The results indicated that SVM outperforms neural network. SVM achieved 89.28% (CV) and 89.55% (IE) overall accuracy in identification of LBPs from non-LBPs and 92.06% (CV) and 92.90% (IE) (in average) for classification of different LBPs classes. Increasing the number and the range of extracted protein features as well as optimization of the SVM parameters significantly increased the efficiency of LBPs class prediction in comparison to the only previous report in this field. Altogether, the results showed that the SVM algorithm can be run on broad, computationally calculated protein features and offers a promising tool in detection of LBPs classes. The proposed approach has the potential to integrate and improve the common sequence alignment based methods. Copyright © 2014 Elsevier Ltd. All rights reserved.

  5. Sequence Based Prediction of Antioxidant Proteins Using a Classifier Selection Strategy.

    Directory of Open Access Journals (Sweden)

    Lina Zhang

    Antioxidant proteins perform significant functions in maintaining oxidation/antioxidation balance and have therapeutic potential for some diseases. Accurate identification of antioxidant proteins could contribute to revealing physiological processes of oxidation/antioxidation balance and developing novel antioxidation-based drugs. In this study, an ensemble method is presented to predict antioxidant proteins with hybrid features, incorporating SSI (Secondary Structure Information), PSSM (Position Specific Scoring Matrix), RSA (Relative Solvent Accessibility), and CTD (Composition, Transition, Distribution). The prediction results of the ensemble predictor are determined by an average of prediction results of multiple base classifiers. Based on a classifier selection strategy, we obtain an optimal ensemble classifier composed of RF (Random Forest), SMO (Sequential Minimal Optimization), NNA (Nearest Neighbor Algorithm), and J48 with an accuracy of 0.925. A Relief combined with IFS (Incremental Feature Selection) method is adopted to obtain optimal features from hybrid features. With the optimal features, the ensemble method achieves improved performance with a sensitivity of 0.95, a specificity of 0.93, an accuracy of 0.94, and an MCC (Matthews Correlation Coefficient) of 0.880, far better than the existing method. To evaluate the prediction performance objectively, the proposed method is compared with existing methods on the same independent testing dataset. Encouragingly, our method performs better than previous studies. In addition, our method achieves more balanced performance with a sensitivity of 0.878 and a specificity of 0.860. These results suggest that the proposed ensemble method can be a potential candidate for antioxidant protein prediction. For public access, we develop a user-friendly web server for antioxidant protein identification that is freely accessible at http://antioxidant.weka.cc.
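
    The performance measures quoted above (sensitivity, specificity, accuracy, MCC) all derive from the four cells of a binary confusion matrix, and the ensemble decision is described as an average over base classifiers. The sketch below shows both computations under stated assumptions: the function names, the 0.5 decision threshold, and the confusion-matrix counts are hypothetical and are not taken from the study.

```python
import math


def average_ensemble(probabilities, threshold=0.5):
    """Average base-classifier probabilities and threshold the mean.

    The threshold of 0.5 is an assumption, not a value from the abstract.
    """
    return sum(probabilities) / len(probabilities) >= threshold


def binary_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        # MCC is defined as 0 when any marginal total is zero.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Hypothetical counts for a balanced test set of 100 positives and 100 negatives.
print(binary_metrics(tp=95, fp=7, tn=93, fn=5))
print(average_ensemble([0.9, 0.4, 0.7, 0.6]))
```

    Unlike accuracy, MCC uses all four cells, so it stays informative when the positive and negative classes differ in size.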

  6. Comparing methods of classifying life courses: Sequence analysis and latent class analysis

    NARCIS (Netherlands)

    Elzinga, C.H.; Liefbroer, Aart C.; Han, Sapphire

    2017-01-01

    We compare life course typology solutions generated by sequence analysis (SA) and latent class analysis (LCA). First, we construct an analytic protocol to arrive at typology solutions for both methodologies and present methods to compare the empirical quality of alternative typologies. We apply this

  7. Comparing methods of classifying life courses: sequence analysis and latent class analysis

    NARCIS (Netherlands)

    Han, Y.; Liefbroer, A.C.; Elzinga, C.

    2017-01-01

    We compare life course typology solutions generated by sequence analysis (SA) and latent class analysis (LCA). First, we construct an analytic protocol to arrive at typology solutions for both methodologies and present methods to compare the empirical quality of alternative typologies. We apply this

  8. An Effective Antifreeze Protein Predictor with Ensemble Classifiers and Comprehensive Sequence Descriptors

    Directory of Open Access Journals (Sweden)

    Runtao Yang

    2015-09-01

    Antifreeze proteins (AFPs) play a pivotal role in the antifreeze effect of overwintering organisms. They have a wide range of applications in numerous fields, such as improving the production of crops and the quality of frozen foods. Accurate identification of AFPs may provide important clues to decipher the underlying mechanisms of AFPs in ice-binding and to facilitate the selection of the most appropriate AFPs for several applications. Based on an ensemble learning technique, this study proposes an AFP identification system called AFP-Ensemble. In this system, random forest classifiers are trained on different training subsets and then aggregated into a consensus classifier by majority voting. The resulting predictor yields a sensitivity of 0.892, a specificity of 0.940, an accuracy of 0.938 and a balanced accuracy of 0.916 on an independent dataset, which are far better than the results obtained by previous methods. These results reveal that AFP-Ensemble is an effective and promising predictor for large-scale determination of AFPs. The detailed feature analysis in this study may give useful insights into the molecular mechanisms of AFP-ice interactions and provide guidance for the related experimental validation. A web server has been designed to implement the proposed method.
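
    Two steps in this abstract are simple enough to sketch: aggregating several classifiers by majority voting, and computing balanced accuracy as the mean of sensitivity and specificity. The function names below are hypothetical and the votes are made-up inputs; only the sensitivity and specificity values come from the abstract.

```python
from collections import Counter


def majority_vote(votes):
    """Return the label predicted by most base classifiers.

    Using an odd number of voters avoids ties for binary labels.
    """
    return Counter(votes).most_common(1)[0][0]


def balanced_accuracy(sensitivity, specificity):
    """Mean of the true-positive rate and the true-negative rate."""
    return (sensitivity + specificity) / 2


# Made-up votes from five base classifiers (1 = AFP, 0 = non-AFP).
print(majority_vote([1, 0, 1, 1, 0]))
# Sensitivity and specificity as reported in the abstract.
print(round(balanced_accuracy(0.892, 0.940), 3))
```

    The reported sensitivity of 0.892 and specificity of 0.940 average to 0.916, which agrees with the balanced accuracy stated in the abstract.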

  9. Can-Evo-Ens: Classifier stacking based evolutionary ensemble system for prediction of human breast cancer using amino acid sequences.

    Science.gov (United States)

    Ali, Safdar; Majid, Abdul

    2015-04-01

    The diagnosis of human breast cancer is an intricate process, and specific indicators may produce negative results. In order to avoid misleading results, an accurate and reliable diagnostic system for breast cancer is indispensable. Recently, several interesting machine-learning (ML) approaches have been proposed for prediction of breast cancer. To this end, we developed a novel classifier stacking based evolutionary ensemble system, "Can-Evo-Ens", for predicting amino acid sequences associated with breast cancer. In this paper, we first selected four diverse types of ML algorithms, Naïve Bayes, K-Nearest Neighbor, Support Vector Machines, and Random Forest, as base-level classifiers. These classifiers are trained individually in different feature spaces using physicochemical properties of amino acids. In order to exploit the decision spaces, the preliminary predictions of base-level classifiers are stacked. Genetic programming (GP) is then employed to develop a meta-classifier that optimally combines the predictions of the base classifiers. The most suitable threshold value of the best-evolved predictor is computed using the Particle Swarm Optimization technique. Our experiments have demonstrated the robustness of the Can-Evo-Ens system on an independent validation dataset. The proposed system achieved the highest Area Under the ROC Curve (AUC) value, 99.95%, for cancer prediction. The comparative results revealed that the proposed approach is better than individual ML approaches and the conventional ensemble approaches AdaBoostM1, Bagging, GentleBoost, and Random Subspace. It is expected that the proposed novel system would have a major impact on the fields of Biomedical, Genomics, Proteomics, Bioinformatics, and Drug Development. Copyright © 2015 Elsevier Inc. All rights reserved.

  10. DeepGO: predicting protein functions from sequence and interactions using a deep ontology-aware classifier

    KAUST Repository

    Kulmanov, Maxat

    2017-09-27

    Motivation: A large number of protein sequences are becoming available through the application of novel high-throughput sequencing technologies. Experimental functional characterization of these proteins is time-consuming and expensive, and is often only done rigorously for few selected model organisms. Computational function prediction approaches have been suggested to fill this gap. The functions of proteins are classified using the Gene Ontology (GO), which contains over 40 000 classes. Additionally, proteins have multiple functions, making function prediction a large-scale, multi-class, multi-label problem. Results: We have developed a novel method to predict protein function from sequence. We use deep learning to learn features from protein sequences as well as a cross-species protein–protein interaction network. Our approach specifically outputs information in the structure of the GO and utilizes the dependencies between GO classes as background information to construct a deep learning model. We evaluate our method using the standards established by the Computational Assessment of Function Annotation (CAFA) and demonstrate a significant improvement over baseline methods such as BLAST, in particular for predicting cellular locations.

  11. DeepGO: predicting protein functions from sequence and interactions using a deep ontology-aware classifier.

    Science.gov (United States)

    Kulmanov, Maxat; Khan, Mohammed Asif; Hoehndorf, Robert; Wren, Jonathan

    2018-02-15

    A large number of protein sequences are becoming available through the application of novel high-throughput sequencing technologies. Experimental functional characterization of these proteins is time-consuming and expensive, and is often only done rigorously for few selected model organisms. Computational function prediction approaches have been suggested to fill this gap. The functions of proteins are classified using the Gene Ontology (GO), which contains over 40 000 classes. Additionally, proteins have multiple functions, making function prediction a large-scale, multi-class, multi-label problem. We have developed a novel method to predict protein function from sequence. We use deep learning to learn features from protein sequences as well as a cross-species protein-protein interaction network. Our approach specifically outputs information in the structure of the GO and utilizes the dependencies between GO classes as background information to construct a deep learning model. We evaluate our method using the standards established by the Computational Assessment of Function Annotation (CAFA) and demonstrate a significant improvement over baseline methods such as BLAST, in particular for predicting cellular locations. Web server: http://deepgo.bio2vec.net. Source code: https://github.com/bio-ontology-research-group/deepgo. Contact: robert.hoehndorf@kaust.edu.sa. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.

  12. DeepGO: predicting protein functions from sequence and interactions using a deep ontology-aware classifier

    KAUST Repository

    Kulmanov, Maxat; Khan, Mohammed Asif; Hoehndorf, Robert

    2017-01-01

    A large number of protein sequences are becoming available through the application of novel high-throughput sequencing technologies. Experimental functional characterization of these proteins is time-consuming and expensive, and is often

  13. Methodology to classify accident sequences of an Individual Plant Examination according to the severe releases for BWR type reactors

    International Nuclear Information System (INIS)

    Sandoval V, S.

    2001-01-01

    The Light Water Reactor (LWR) operation regulations require every operating plant to perform an Individual Plant Examination (IPE) study. One of the main purposes of an IPE is to gain a more quantitative understanding of the overall probabilities of core damage and fission product releases. Probabilistic Safety Analysis (PSA) methodologies and Severe Accident Analysis are used to perform IPE studies. PSA methodologies are used to identify and analyse the set of event sequences that might lead to fission product release from a nuclear power plant; these methodologies are combinatorial in nature and generate thousands of sequences. Among other uses within an IPE, severe accident simulations are used to determine the characteristics of the fission product release for the identified sequences, so that the releases can be understood and characterized. A vast amount of resources is required to simulate and analyse every IPE sequence. This effort is unnecessary if similar sequences are grouped. The grouping scheme must achieve an efficient trade-off between problem reduction and accuracy. The methodology presented in this work enables an accurate characterization and analysis of the IPE fission product releases by using a reduced problem. The methodology encourages the use of specific plant simulations. (Author)

  14. Development and confirmation of potential gene classifiers of human clear cell renal cell carcinoma using next-generation RNA sequencing.

    Science.gov (United States)

    Eikrem, Oystein S; Strauss, Philipp; Beisland, Christian; Scherer, Andreas; Landolt, Lea; Flatberg, Arnar; Leh, Sabine; Beisvag, Vidar; Skogstrand, Trude; Hjelle, Karin; Shresta, Anjana; Marti, Hans-Peter

    2016-12-01

    A previous study by this group demonstrated the feasibility of RNA sequencing (RNAseq) technology for capturing disease biology of clear cell renal cell carcinoma (ccRCC), and presented initial results for carbonic anhydrase-9 (CA9) and tumor necrosis factor-α-induced protein-6 (TNFAIP6) as possible biomarkers of ccRCC (discovery set) [Eikrem et al. PLoS One 2016;11:e0149743]. To confirm these results, the previous study is expanded, and RNAseq data from additional matched ccRCC and normal renal biopsies are analyzed (confirmation set). Two core biopsies from patients (n = 12) undergoing partial or full nephrectomy were obtained with a 16 g needle. RNA sequencing libraries were generated with the Illumina TruSeq® Access library preparation protocol. Comparative analysis was done using linear modeling (voom/Limma; R Bioconductor). The formalin-fixed and paraffin-embedded discovery and confirmation data yielded 8957 and 11,047 detected transcripts, respectively. The two data sets shared 1193 differentially expressed genes. The average expression and the log2-fold changes of differentially expressed transcripts in both data sets correlated, with R² = .95 and R² = .94, respectively. Among transcripts with the highest fold changes were CA9, neuronal pentraxin-2 and uromodulin. Epithelial-mesenchymal transition was highlighted by differential expression of, for example, transforming growth factor-β1 and delta-like ligand-4. The diagnostic accuracy of CA9 was 100% and 93.9% when using the discovery set as the training set and the confirmation data as the test set, and vice versa, respectively. These data further support TNFAIP6 as a novel biomarker of ccRCC. TNFAIP6 had combined accuracy of 98.5% in the two data sets. This study provides confirmatory data on the potential use of CA9 and TNFAIP6 as biomarkers of ccRCC. Thus, next-generation sequencing expands the clinical application of tissue analyses.

  15. Classifying Microorganisms

    DEFF Research Database (Denmark)

    Sommerlund, Julie

    2006-01-01

    This paper describes the coexistence of two systems for classifying organisms and species: a dominant genetic system and an older naturalist system. The former classifies species and traces their evolution on the basis of genetic characteristics, while the latter employs physiological characteristics. The coexistence of the classification systems does not lead to a conflict between them. Rather, the systems seem to co-exist in different configurations, through which they are complementary, contradictory and inclusive in different situations, sometimes simultaneously. The systems come...

  16. A DNA-based pattern classifier with in vitro learning and associative recall for genomic characterization and biosensing without explicit sequence knowledge.

    Science.gov (United States)

    Lee, Ju Seok; Chen, Junghuei; Deaton, Russell; Kim, Jin-Woo

    2014-01-01

    Genetic material extracted from in situ microbial communities has high promise as an indicator of biological system status. However, the challenge is to access genomic information from all organisms at the population or community scale to monitor the biosystem's state. Hence, there is a need for a better diagnostic tool that provides a holistic view of a biosystem's genomic status. Here, we introduce an in vitro methodology for genomic pattern classification of biological samples that taps large amounts of genetic information from all genes present and uses that information to detect changes in genomic patterns and classify them. We developed a biosensing protocol, termed Biological Memory, that has in vitro computational capabilities to "learn" and "store" genomic sequence information directly from genomic samples without knowledge of their explicit sequences, and that discovers differences in vitro between previously unknown inputs and learned memory molecules. The Memory protocol was designed and optimized based upon (1) common in vitro recombinant DNA operations using 20-base random probes, including polymerization, nuclease digestion, and magnetic bead separation, to capture a snapshot of the genomic state of a biological sample as a DNA memory and (2) the thermal stability of DNA duplexes between new input and the memory to detect similarities and differences. For efficient read out, a microarray was used as an output method. When the microarray-based Memory protocol was implemented to test its capability and sensitivity using genomic DNA from two model bacterial strains, i.e., Escherichia coli K12 and Bacillus subtilis, results indicate that the Memory protocol can "learn" input DNA, "recall" similar DNA, differentiate between dissimilar DNA, and detect relatively small concentration differences in samples. This study demonstrated not only the in vitro information processing capabilities of DNA, but also its promise as a genomic pattern classifier that could

  17. Carbon classified?

    DEFF Research Database (Denmark)

    Lippert, Ingmar

    2012-01-01

    ... Using an actor-network theory (ANT) framework, the aim is to investigate the actors who bring together the elements needed to classify their carbon emission sources and unpack the heterogeneous relations drawn on. Based on an ethnographic study of corporate agents of ecological modernisation over a period of 13 months, this paper provides an exploration of three cases of enacting classification. Drawing on ANT, we problematise the silencing of a range of possible modalities of consumption facts and point to the ontological ethics involved in such performances. In a context of global warming...

  18. Molecular Dynamics Simulations of a Flexible Polyethylene: A Protein-Like Behaviour in a Water Solvent

    CERN Document Server

    Kretov, D A

    2005-01-01

    We used molecular dynamics (MD) simulations to study the density and the temperature behaviour of a flexible polyethylene (PE) subjected to various heating conditions and to investigate the PE chain conformational changes in a water solvent. First, we have considered the influence of the heating process on the final state of the polymeric system and the sensitivity of its thermodynamic characteristics (density, energy, etc.) for different heating regimes. For this purpose three different simulations were performed: fast, moderate, and slow heating. Second, we have investigated the PE chain conformational dynamics in water solvent for various simulation conditions and various configurations of the environment. From the obtained results we have got the pictures of the PE dynamical motions in water. We have observed a protein-like behaviour of the PE chain, like that of the DNA and the proteins in water, and have also estimated the rates of the conformational changes. For the MD simulations we used the optimized...

  19. Measurement of protein-like fluorescence in river and waste water using a handheld spectrophotometer.

    Science.gov (United States)

    Baker, Andy; Ward, David; Lieten, Shakti H; Periera, Ryan; Simpson, Ellie C; Slater, Malcolm

    2004-07-01

    Protein-like fluorescence intensity in rivers increases with increasing anthropogenic DOM inputs from sewerage and farm wastes. Here, a portable luminescence spectrophotometer was used to investigate whether this technology could provide field scientists with a rapid pollution monitoring tool and process control engineers with a portable waste water monitoring device, through the measurement of river and waste water tryptophan-like fluorescence from a range of rivers in NE England and from effluents from within two waste water treatment plants. The portable spectrophotometer determined that waste waters and sewerage effluents had the highest tryptophan-like fluorescence intensity, urban streams had an intermediate tryptophan-like fluorescence intensity, and the upstream river samples of good water quality had the lowest tryptophan-like fluorescence intensity. Replicate samples demonstrated that fluorescence intensity is reproducible to +/- 20% for low fluorescence, 'clean' river water samples and +/- 5% for urban water and waste waters. The correlation between fluorescence measured by the portable spectrophotometer and by a conventional bench machine was 0.91 (Spearman's rho, n = 143), demonstrating that the portable spectrophotometer does correlate with tryptophan-like fluorescence intensity measured using the bench spectrophotometer.
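
    The agreement statistic reported above, Spearman's rho, is the Pearson correlation of the ranks of the two measurement series. The following is a minimal pure-Python sketch with average ranks for ties; the function names and the paired readings are hypothetical and are not data from the study.

```python
import math


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical paired intensities: portable instrument vs bench instrument.
portable = [12.0, 35.5, 80.2, 150.0, 410.7]
bench = [10.1, 30.0, 95.3, 140.8, 390.2]
print(spearman_rho(portable, bench))
```

    Because only ranks enter the calculation, the statistic is insensitive to the two instruments reporting intensity on different scales, as long as they order the samples the same way.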

  20. Molecular dynamics simulations of a flexible polyethylene: a protein-like behaviour in a water solvent

    International Nuclear Information System (INIS)

    Kretov, D.A.; Kholmurodov, Kh.T.

    2005-01-01

    We used molecular dynamics (MD) simulations to study the density and the temperature behaviour of a flexible polyethylene (PE) subjected to various heating conditions and to investigate the PE chain conformational changes in a water solvent. First, we have considered the influence of the heating process on the final state of the polymeric system and the sensitivity of its thermodynamic characteristics (density, energy, etc.) for different heating regimes. For this purpose three different simulations were performed: fast, moderate, and slow heating. Second, we have investigated the PE chain conformational dynamics in water solvent for various simulation conditions and various configurations of the environment. From the obtained results we have got the pictures of the PE dynamical motions in water. We have observed a protein-like behaviour of the PE chain, like that of the DNA and the proteins in water, and have also estimated the rates of the conformational changes. For the MD simulations we used the optimized general-purpose DL_POLY code and the generic DREIDING force field. The MD simulations were performed on the parallel computers and special-purpose MDGRAPE-2 machine.

  1. Functional evolution in the plant SQUAMOSA-PROMOTER BINDING PROTEIN-LIKE (SPL) gene family

    Directory of Open Access Journals (Sweden)

    Jill Christine Preston

    2013-04-01

    The SQUAMOSA-PROMOTER BINDING PROTEIN-LIKE (SPL) family of transcription factors is functionally diverse, controlling a number of fundamental aspects of plant growth and development, including vegetative phase change, flowering time, branching, and leaf initiation rate. In natural plant populations, variation in flowering time and shoot architecture has major consequences for fitness. Likewise, in crop species, variation in branching and developmental rate impacts biomass and yield. Thus, studies aimed at dissecting how the various functions are partitioned among different SPL genes in diverse plant lineages are key to providing insight into the genetic basis of local adaptation and have already garnered attention from crop breeders. Here we use phylogenetic reconstruction to reveal nine major SPL gene lineages, each of which is described in terms of function and diversification. To assess evidence for ancestral and derived functions within each SPL gene lineage, we use ancestral character state reconstructions. Our analyses suggest an emerging pattern of sub-functionalization, neo-functionalization, and possible convergent evolution following both ancient and recent gene duplication. Based on these analyses we suggest future avenues of research that may prove fruitful for elucidating the importance of SPL gene evolution in plant growth and development.

  2. Guanine nucleotide binding protein-like 3 is a potential prognosis indicator of gastric cancer.

    Science.gov (United States)

    Chen, Jing; Dong, Shuang; Hu, Jiangfeng; Duan, Bensong; Yao, Jian; Zhang, Ruiyun; Zhou, Hongmei; Sheng, Haihui; Gao, Hengjun; Li, Shunlong; Zhang, Xianwen

    2015-01-01

    Guanine nucleotide binding protein-like 3 (GNL3) is a GTP-binding nuclear protein that has been reported to be involved in various biological processes, including cell proliferation, cellular senescence and tumorigenesis. This study aimed to investigate the expression level of GNL3 in gastric cancer and to evaluate the relationship between its expression and clinical variables and overall survival of gastric cancer patients. The expression level of GNL3 was examined in 89 human gastric cancer samples using immunohistochemistry (IHC) staining. GNL3 in gastric cancer tissues was significantly upregulated compared with paracancerous tissues. GNL3 expression in adjacent non-cancerous tissues was associated with sex and tumor size. Survival analyses showed that GNL3 expression in both gastric cancer and adjacent non-cancerous tissues was not related to overall survival. However, in the subgroup of patients with larger tumor size (≥ 6 cm), a close association was found between GNL3 expression in gastric cancer tissues and overall survival. GNL3-positive patients had a shorter survival than GNL3-negative patients. Our study suggests that GNL3 might play an important role in the progression of gastric cancer and serve as a biomarker for poor prognosis in gastric cancer patients.

  3. Purification and Characterization of a Novel Cold Shock Protein-Like Bacteriocin Synthesized by Bacillus thuringiensis.

    Science.gov (United States)

    Huang, Tianpei; Zhang, Xiaojuan; Pan, Jieru; Su, Xiaoyu; Jin, Xin; Guan, Xiong

    2016-10-20

    Bacillus thuringiensis (Bt), one of the most successful biopesticides, may expand its potential by producing bacteriocins (thuricins). The aim of this study was to investigate the antimicrobial potential of a novel Bt bacteriocin, thuricin BtCspB, produced by Bt BRC-ZYR2. The results showed that this bacteriocin has a high similarity with cold-shock protein B (CspB). BtCspB lost its activity after proteinase K treatment; however, it was active at 60 °C for 30 min and was stable in the pH range 5-7. The partial loss of activity after treatment with lipase II and catalase was likely due to the change in BtCspB structure and the partial degradation of BtCspB, respectively. The loss of activity at high temperatures and the activity variation at different pHs were not due to degradation or large conformational change. BtCspB did not inhibit four probiotics. It was only active against B. cereus strains 0938 and ATCC 10987 with MIC values of 3.125 μg/mL and 0.781 μg/mL, and MBC values of 12.5 μg/mL and 6.25 μg/mL, respectively. Taken together, these results provide new insights into a novel cold shock protein-like bacteriocin, BtCspB, which displayed promise for its use in food preservation and treatment of B. cereus-associated diseases.

  4. Classifying individuals based on a densely captured sequence of vital signs: An example using repeated blood pressure measurements during hemodialysis treatment.

    Science.gov (United States)

    Goldstein, Benjamin A; Chang, Tara I; Winkelmayer, Wolfgang C

    2015-10-01

    Electronic Health Records (EHRs) present the opportunity to observe serial measurements on patients. While potentially informative, analyzing these data can be challenging. In this work we present a means to classify individuals based on a series of measurements collected by an EHR. Using patients undergoing hemodialysis, we categorized people based on their intradialytic blood pressure. Our primary criteria were that the classifications were time dependent and independent of other subjects. We fit a curve of intradialytic blood pressure using regression splines and then calculated first and second derivatives to come up with four mutually exclusive classifications at different time points. We show that these classifications relate to near term risk of cardiac events and are moderately stable over a succeeding two-week period. This work has general application for analyzing dense EHR data. Copyright © 2015 Elsevier Inc. All rights reserved.
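
    The derivative-based labelling described in this abstract can be illustrated with a minimal sketch. It assumes that the four mutually exclusive classes are the four sign combinations of the first and second derivative (the abstract does not spell this out), and it approximates the derivatives of an already-fitted curve by central differences rather than differentiating a regression spline directly. The function names and the example curve are hypothetical.

```python
from typing import Callable, List


def classify_curve(curve: Callable[[float], float],
                   times: List[float],
                   h: float = 1e-2) -> List[str]:
    """Label each time point by the signs of the first and second derivative.

    Assumption: the four classes are rising/falling crossed with
    convex/concave; the source abstract does not define them.
    """
    labels = []
    for t in times:
        # Central-difference approximations of the derivatives at t.
        d1 = (curve(t + h) - curve(t - h)) / (2 * h)
        d2 = (curve(t + h) - 2 * curve(t) + curve(t - h)) / (h * h)
        direction = "rising" if d1 >= 0 else "falling"
        shape = "convex" if d2 >= 0 else "concave"
        labels.append(f"{direction}-{shape}")
    return labels


def example_curve(t: float) -> float:
    """Hypothetical fitted blood-pressure curve (mmHg) over hours of treatment."""
    return 140.0 - 12.0 * t + 2.5 * t * t


labels = classify_curve(example_curve, [0.5, 2.0, 3.5])
print(labels)
```

    Because each label depends only on the subject's own curve at a given time, the classification is time dependent and independent of other subjects, which matches the two criteria stated in the abstract.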

  5. No evidence for association of autism with rare heterozygous point mutations in Contactin-Associated Protein-Like 2 (CNTNAP2), or in Other Contactin-Associated Proteins or Contactins.

    Directory of Open Access Journals (Sweden)

    John D Murdoch

    2015-01-01

    Contactins and Contactin-Associated Proteins, and Contactin-Associated Protein-Like 2 (CNTNAP2) in particular, have been widely cited as autism risk genes based on findings from homozygosity mapping, molecular cytogenetics, copy number variation analyses, and both common and rare single nucleotide association studies. However, data specifically with regard to the contribution of heterozygous single nucleotide variants (SNVs) have been inconsistent. In an effort to clarify the role of rare point mutations in CNTNAP2 and related gene families, we have conducted targeted next-generation sequencing and evaluated existing sequence data in cohorts totaling 2704 cases and 2747 controls. We find no evidence for statistically significant association of rare heterozygous mutations in any of the CNTN or CNTNAP genes, including CNTNAP2, placing marked limits on the scale of their plausible contribution to risk.

  6. Molecular Characterization of SQUAMOSA PROMOTER BINDING PROTEIN-LIKE (SPL) Gene Family in Betula luminifera

    Directory of Open Access Journals (Sweden)

    Xiu-Yun Li

    2018-05-01

    As a major family of plant-specific transcription factors, SQUAMOSA PROMOTER BINDING PROTEIN-LIKE (SPL) genes play vital regulatory roles in plant growth, development and stress responses. In this study, 18 SPL genes were identified and cloned from Betula luminifera. Two zinc finger-like structures and a nuclear localization signal (NLS) segment were present in the SBP domains of all BlSPLs. Phylogenetic analysis showed that these genes clustered into nine groups (groups I-IX). The intron/exon structure and motif composition were highly conserved within the same group. Twelve of the 18 BlSPLs were experimentally verified as targets of miR156, and two cleavage sites were detected in these miR156-targeted BlSPL genes. Many putative cis-elements associated with light, stress and phytohormone responses were identified in the promoter regions of BlSPLs, suggesting that BlSPL genes are probably involved in important physiological processes and developmental events. Tissue-specific expression analysis showed that miR156-targeted BlSPLs exhibited a more differential expression pattern, while most miR156-nontargeted BlSPLs tended to be constitutively expressed, suggesting distinct roles of miR156-targeted and nontargeted BlSPLs in development and growth of B. luminifera. Further expression analysis revealed that miR156-targeted BlSPLs were dramatically up-regulated with age, whereas the mature BlmiR156 level declined with age, indicating that the miR156/SPL module plays important roles in vegetative phase change of B. luminifera. Moreover, yeast two-hybrid assays indicated that several miR156-targeted and nontargeted BlSPLs could interact with two DELLA proteins (BlRGA and BlRGL), which suggests that certain BlSPLs take part in GA-regulated processes through protein interaction with DELLA proteins. All these results provide an important basis for further exploring the biological functions of BlSPLs in B. luminifera.

  7. Effect of channel-protein interaction on translocation of a protein-like chain through a finite channel

    International Nuclear Information System (INIS)

    Sun Ting-Ting; Ma Hai-Zhu; Jiang Zhou-Ting

    2012-01-01

    We study the translocation of a protein-like chain through a finite cylindrical channel using the pruned-enriched Rosenbluth method (PERM) and the modified orientation-dependent monomer-monomer interaction (ODI) model. Attractive channels (ε_cp = −2.0, −1.0, −0.5), repulsive channels (ε_cp = 0.5, 1.0, 2.0), and a neutral channel (ε_cp = 0) are discussed. The results of the chain dimension and the energy show that Z_0 = 1.0 is an important case to distinguish the types of the channels. For the strong attractive channel, more contacts form during the process of translocation. It is also found that an external force is needed to drive the chain outside of the channel with the strong attraction. While for the neutral, the repulsive, and the weak attractive channels, the translocation is spontaneous. (interdisciplinary physics and related areas of science and technology)

  8. Molecular Architecture of Contactin-associated Protein-like 2 (CNTNAP2) and Its Interaction with Contactin 2 (CNTN2)

    Science.gov (United States)

    Lu, Zhuoyang; Reddy, M. V. V. V. Sekhar; Liu, Jianfang; Kalichava, Ana; Liu, Jiankang; Zhang, Lei; Chen, Fang; Wang, Yun; Holthauzen, Luis Marcelo F.; White, Mark A.; Seshadrinathan, Suchithra; Zhong, Xiaoying; Ren, Gang; Rudenko, Gabby

    2016-01-01

    Contactin-associated protein-like 2 (CNTNAP2) is a large multidomain neuronal adhesion molecule implicated in a number of neurological disorders, including epilepsy, schizophrenia, autism spectrum disorder, intellectual disability, and language delay. We reveal here by electron microscopy that the architecture of CNTNAP2 is composed of a large, medium, and small lobe that flex with respect to each other. Using epitope labeling and fragments, we assign the F58C, L1, and L2 domains to the large lobe, the FBG and L3 domains to the middle lobe, and the L4 domain to the small lobe of the CNTNAP2 molecular envelope. Our data reveal that CNTNAP2 has a very different architecture compared with neurexin 1α, a fellow member of the neurexin superfamily and a prototype, suggesting that CNTNAP2 uses a different strategy to integrate into the synaptic protein network. We show that the ectodomains of CNTNAP2 and contactin 2 (CNTN2) bind directly and specifically, with low nanomolar affinity. We show further that mutations in CNTNAP2 implicated in autism spectrum disorder are not segregated but are distributed over the whole ectodomain. The molecular shape and dimensions of CNTNAP2 place constraints on how CNTNAP2 integrates in the cleft of axo-glial and neuronal contact sites and how it functions as an organizing and adhesive molecule. PMID:27621318

  9. Paralogous SQUAMOSA PROMOTER BINDING PROTEIN-LIKE (SPL) genes differentially regulate leaf initiation and reproductive phase change in petunia.

    Science.gov (United States)

    Preston, Jill C; Jorgensen, Stacy A; Orozco, Rebecca; Hileman, Lena C

    2016-02-01

    Duplicated petunia clade-VI SPL genes differentially promote the timing of inflorescence and flower development, and leaf initiation rate. The timing of plant reproduction relative to favorable environmental conditions is a critical component of plant fitness, and is often associated with variation in plant architecture and habit. Recent studies have shown that overexpression of the microRNA miR156 in distantly related annual species results in plants with perennial characteristics, including late flowering, weak apical dominance, and abundant leaf production. These phenotypes are largely mediated through the negative regulation of a subset of genes belonging to the SQUAMOSA PROMOTER BINDING PROTEIN-LIKE (SPL) family of transcription factors. In order to determine how and to what extent paralogous SPL genes have partitioned their roles in plant growth and development, we functionally characterized petunia clade-VI SPL genes under different environmental conditions. Our results demonstrate that PhSBP1 and PhSBP2 differentially promote discrete stages of the reproductive transition, and that PhSBP1, and possibly PhCNR, accelerates leaf initiation rate. In contrast to the closest homologs in annual Arabidopsis thaliana and Mimulus guttatus, PhSBP1 and PhSBP2 transcription is not mediated by the gibberellic acid pathway, but is positively correlated with photoperiod and developmental age. The developmental functions of clade-VI SPL genes have, thus, evolved following both gene duplication and speciation within the core eudicots, likely through differential regulation and incomplete sub-functionalization.

  10. Derivation of a Markov state model of the dynamics of a protein-like chain immersed in an implicit solvent.

    Science.gov (United States)

    Schofield, Jeremy; Bayat, Hanif

    2014-09-07

    A Markov state model of the dynamics of a protein-like chain immersed in an implicit hard sphere solvent is derived from first principles for a system of monomers that interact via discontinuous potentials designed to account for local structure and bonding in a coarse-grained sense. The model is based on the assumption that the implicit solvent interacts on a fast time scale with the monomers of the chain compared to the time scale for structural rearrangements of the chain and provides sufficient friction so that the motion of monomers is governed by the Smoluchowski equation. A microscopic theory for the dynamics of the system is developed that reduces to a Markovian model of the kinetics under well-defined conditions. Microscopic expressions for the rate constants that appear in the Markov state model are analyzed and expressed in terms of a temperature-dependent linear combination of escape rates that themselves are independent of temperature. Excellent agreement is demonstrated between the theoretical predictions of the escape rates and those obtained through simulation of a stochastic model of the dynamics of bond formation. Finally, the Markov model is studied by analyzing the eigenvalues and eigenvectors of the matrix of transition rates, and the equilibration process for a simple helix-forming system from an ensemble of initially extended configurations to mainly folded configurations is investigated as a function of temperature for a number of different chain lengths. For short chains, the relaxation is primarily single-exponential and becomes independent of temperature in the low-temperature regime. The profile is more complicated for longer chains, where multi-exponential relaxation behavior is seen at intermediate temperatures followed by a low temperature regime in which the folding becomes rapid and single exponential. It is demonstrated that the behavior of the equilibration profile as the temperature is lowered can be understood in terms of the

  11. Classifying Returns as Extreme

    DEFF Research Database (Denmark)

    Christiansen, Charlotte

    2014-01-01

    I consider extreme returns for the stock and bond markets of 14 EU countries using two classification schemes: One, the univariate classification scheme from the previous literature that classifies extreme returns for each market separately, and two, a novel multivariate classification scheme tha...

  12. LCC: Light Curves Classifier

    Science.gov (United States)

    Vo, Martin

    2017-08-01

    Light Curves Classifier uses data mining and machine learning to obtain and classify desired objects. This task can be accomplished by attributes of light curves or any time series, including shapes, histograms, or variograms, or by other available information about the inspected objects, such as color indices, temperatures, and abundances. After specifying features which describe the objects to be searched, the software trains on a given training sample, and can then be used for unsupervised clustering for visualizing the natural separation of the sample. The package can also be used for automatic tuning of the parameters of the methods used (for example, number of hidden neurons or binning ratio). Trained classifiers can be used for filtering outputs from astronomical databases or data stored locally. The Light Curve Classifier can also be used for simple downloading of light curves and all available information of queried stars. It can natively connect to OgleII, OgleIII, ASAS, CoRoT, Kepler, Catalina and MACHO, and new connectors or descriptors can be implemented. In addition to direct usage of the package and command line UI, the program can be used through a web interface. Users can create jobs for "training" methods on given objects, querying databases and filtering outputs by trained filters. Preimplemented descriptors, classifiers and connectors can be picked by simple clicks and their parameters can be tuned by giving ranges of these values. All combinations are then calculated and the best one is used for creating the filter. Natural separation of the data can be visualized by unsupervised clustering.

  13. Intelligent Garbage Classifier

    Directory of Open Access Journals (Sweden)

    Ignacio Rodríguez Novelle

    2008-12-01

    IGC (Intelligent Garbage Classifier) is a system for visual classification and separation of solid waste products. Currently, an important part of the separation effort is based on manual work, from household separation to industrial waste management. Taking advantage of the technologies currently available, a system has been built that can analyze images from a camera and control a robot arm and conveyor belt to automatically separate different kinds of waste.

  14. Classifying Linear Canonical Relations

    OpenAIRE

    Lorand, Jonathan

    2015-01-01

    In this Master's thesis, we consider the problem of classifying, up to conjugation by linear symplectomorphisms, linear canonical relations (lagrangian correspondences) from a finite-dimensional symplectic vector space to itself. We give an elementary introduction to the theory of linear canonical relations and present partial results toward the classification problem. This exposition should be accessible to undergraduate students with a basic familiarity with linear algebra.

  15. Derivation of a Markov state model of the dynamics of a protein-like chain immersed in an implicit solvent

    Energy Technology Data Exchange (ETDEWEB)

    Schofield, Jeremy, E-mail: jmschofi@chem.utoronto.ca; Bayat, Hanif, E-mail: hbayat@chem.utoronto.ca [Chemical Physics Theory Group, Department of Chemistry, University of Toronto, Toronto, Ontario M5S 3H6 (Canada)

    2014-09-07

    A Markov state model of the dynamics of a protein-like chain immersed in an implicit hard sphere solvent is derived from first principles for a system of monomers that interact via discontinuous potentials designed to account for local structure and bonding in a coarse-grained sense. The model is based on the assumption that the implicit solvent interacts on a fast time scale with the monomers of the chain compared to the time scale for structural rearrangements of the chain and provides sufficient friction so that the motion of monomers is governed by the Smoluchowski equation. A microscopic theory for the dynamics of the system is developed that reduces to a Markovian model of the kinetics under well-defined conditions. Microscopic expressions for the rate constants that appear in the Markov state model are analyzed and expressed in terms of a temperature-dependent linear combination of escape rates that themselves are independent of temperature. Excellent agreement is demonstrated between the theoretical predictions of the escape rates and those obtained through simulation of a stochastic model of the dynamics of bond formation. Finally, the Markov model is studied by analyzing the eigenvalues and eigenvectors of the matrix of transition rates, and the equilibration process for a simple helix-forming system from an ensemble of initially extended configurations to mainly folded configurations is investigated as a function of temperature for a number of different chain lengths. For short chains, the relaxation is primarily single-exponential and becomes independent of temperature in the low-temperature regime. The profile is more complicated for longer chains, where multi-exponential relaxation behavior is seen at intermediate temperatures followed by a low temperature regime in which the folding becomes rapid and single exponential. It is demonstrated that the behavior of the equilibration profile as the temperature is lowered can be understood in terms of the

  16. Derivation of a Markov state model of the dynamics of a protein-like chain immersed in an implicit solvent

    International Nuclear Information System (INIS)

    Schofield, Jeremy; Bayat, Hanif

    2014-01-01

    A Markov state model of the dynamics of a protein-like chain immersed in an implicit hard sphere solvent is derived from first principles for a system of monomers that interact via discontinuous potentials designed to account for local structure and bonding in a coarse-grained sense. The model is based on the assumption that the implicit solvent interacts on a fast time scale with the monomers of the chain compared to the time scale for structural rearrangements of the chain and provides sufficient friction so that the motion of monomers is governed by the Smoluchowski equation. A microscopic theory for the dynamics of the system is developed that reduces to a Markovian model of the kinetics under well-defined conditions. Microscopic expressions for the rate constants that appear in the Markov state model are analyzed and expressed in terms of a temperature-dependent linear combination of escape rates that themselves are independent of temperature. Excellent agreement is demonstrated between the theoretical predictions of the escape rates and those obtained through simulation of a stochastic model of the dynamics of bond formation. Finally, the Markov model is studied by analyzing the eigenvalues and eigenvectors of the matrix of transition rates, and the equilibration process for a simple helix-forming system from an ensemble of initially extended configurations to mainly folded configurations is investigated as a function of temperature for a number of different chain lengths. For short chains, the relaxation is primarily single-exponential and becomes independent of temperature in the low-temperature regime. The profile is more complicated for longer chains, where multi-exponential relaxation behavior is seen at intermediate temperatures followed by a low temperature regime in which the folding becomes rapid and single exponential. It is demonstrated that the behavior of the equilibration profile as the temperature is lowered can be understood in terms of the

  17. Stack filter classifiers

    Energy Technology Data Exchange (ETDEWEB)

    Porter, Reid B [Los Alamos National Laboratory; Hush, Don [Los Alamos National Laboratory

    2009-01-01

    Just as linear models generalize the sample mean and weighted average, weighted order statistic models generalize the sample median and weighted median. This analogy can be continued informally to generalized additive models in the case of the mean, and Stack Filters in the case of the median. Both of these model classes have been extensively studied for signal and image processing, but it is surprising to find that for pattern classification, their treatment has been significantly one-sided. Generalized additive models are now a major tool in pattern classification and many different learning algorithms have been developed to fit model parameters to finite data. However, Stack Filters remain largely confined to signal and image processing and learning algorithms for classification are yet to be seen. This paper is a step towards Stack Filter Classifiers and it shows that the approach is interesting from both a theoretical and a practical perspective.
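
    The analogy drawn in this abstract between the weighted average and the weighted median can be made concrete with a small sketch. It shows only the two building blocks, not a Stack Filter classifier; the function names are illustrative, and the lower weighted median (the first sorted value whose cumulative weight reaches half the total) is one common convention chosen here as an assumption.

```python
from typing import Sequence


def weighted_average(values: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted mean: the linear-model building block."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_median(values: Sequence[float], weights: Sequence[float]) -> float:
    """Lower weighted median: the order-statistic building block.

    Returns the first value, in sorted order, at which the cumulative
    weight reaches half of the total weight.
    """
    half = sum(weights) / 2.0
    running = 0.0
    for value, weight in sorted(zip(values, weights)):
        running += weight
        if running >= half:
            return value
    raise ValueError("values must be non-empty")


samples = [3.0, 1.0, 100.0, 2.0, 4.0]
unit = [1, 1, 1, 1, 1]

# With unit weights the two functions reduce to the sample mean and median.
print(weighted_average(samples, unit))  # mean, pulled up by the outlier 100
print(weighted_median(samples, unit))   # median, unaffected by the outlier

# Raising the weight of one sample moves the weighted median onto it.
print(weighted_median(samples, [1, 1, 5, 1, 1]))
```

    With integer weights, the weighted median equals the ordinary median of the sample in which each value is repeated as many times as its weight.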

  18. Classifying Coding DNA with Nucleotide Statistics

    Directory of Open Access Journals (Sweden)

    Nicolas Carels

    2009-10-01

    In this report, we compared the success rate of classification of coding sequences (CDS) vs. introns by Codon Structure Factor (CSF) and by a method that we called Universal Feature Method (UFM). UFM is based on the scoring of purine bias (Rrr) and stop codon frequency. We show that the success rate of CDS/intron classification by UFM is higher than by CSF. UFM classifies ORFs as coding or non-coding through a score based on (i) the stop codon distribution, (ii) the product of purine probabilities in the three positions of nucleotide triplets, (iii) the product of Cytosine (C), Guanine (G), and Adenine (A) probabilities in the 1st, 2nd, and 3rd positions of triplets, respectively, (iv) the probabilities of G in 1st and 2nd position of triplets and (v) the distance of their GC3 vs. GC2 levels to the regression line of the universal correlation. More than 80% of CDSs (true positives) of Homo sapiens (>250 bp), Drosophila melanogaster (>250 bp) and Arabidopsis thaliana (>200 bp) are successfully classified with a false positive rate lower than or equal to 5%. The method releases coding sequences in their coding strand and coding frame, which allows their automatic translation into protein sequences with 95% confidence. The method is a natural consequence of the compositional bias of nucleotides in coding sequences.
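
    Two of the ingredients named in this abstract, the stop codon frequency and the product of purine probabilities at the three codon positions, can be sketched as follows. This is not the UFM score itself: how the paper combines and thresholds these quantities is not given here, so the sketch only computes the raw features for one reading frame. Function names and the toy sequence are hypothetical.

```python
import math
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
PURINES = {"A", "G"}


def codons(seq: str, frame: int = 0) -> List[str]:
    """Split a DNA sequence into complete triplets, starting at `frame`."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def stop_codon_frequency(seq: str, frame: int = 0) -> float:
    """Fraction of triplets in the given frame that are stop codons."""
    triplets = codons(seq, frame)
    if not triplets:
        return 0.0
    return sum(c in STOP_CODONS for c in triplets) / len(triplets)


def purine_probabilities(seq: str, frame: int = 0) -> Tuple[float, float, float]:
    """Probability of a purine (A or G) at codon positions 1, 2 and 3."""
    triplets = codons(seq, frame)
    if not triplets:
        return (0.0, 0.0, 0.0)
    n = len(triplets)
    p1, p2, p3 = (sum(c[pos] in PURINES for c in triplets) / n for pos in range(3))
    return (p1, p2, p3)


toy = "ATGGCAGAAGGCTAA"  # hypothetical 5-codon sequence ending in a stop codon
probs = purine_probabilities(toy)
print(stop_codon_frequency(toy, frame=0))  # one stop among five triplets
print(stop_codon_frequency(toy, frame=1))  # no stop in the shifted frame
print(probs, math.prod(probs))
```

    Computing the same features for all three frames on both strands is what would let a frame-aware score pick out the coding strand and frame, as the abstract describes.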

  19. Fingerprint prediction using classifier ensembles

    CSIR Research Space (South Africa)

    Molale, P

    2011-11-01

    ); logistic discrimination (LgD), k-nearest neighbour (k-NN), artificial neural network (ANN), association rules (AR), decision tree (DT), naive Bayes classifier (NBC) and the support vector machine (SVM). The performance of several multiple classifier systems...

  20. Classifying objects in LWIR imagery via CNNs

    Science.gov (United States)

    Rodger, Iain; Connor, Barry; Robertson, Neil M.

    2016-10-01

    The aim of the presented work is to demonstrate enhanced target recognition and improved false alarm rates for a mid to long range detection system, utilising a Long Wave Infrared (LWIR) sensor. By exploiting high quality thermal image data and recent techniques in machine learning, the system can provide automatic target recognition capabilities. A Convolutional Neural Network (CNN) is trained and the classifier achieves an overall accuracy of > 95% for 6 object classes related to land defence. While the highly accurate CNN struggles to recognise long range target classes, due to low signal quality, robust target discrimination is achieved for challenging candidates. The overall performance of the methodology presented is assessed using human ground truth information, generating classifier evaluation metrics for thermal image sequences.

  1. Classified

    CERN Multimedia

    Computer Security Team

    2011-01-01

    In the last issue of the Bulletin, we discussed recent implications for privacy on the Internet. But privacy of personal data is just one facet of data protection. Confidentiality is another one. However, confidentiality and data protection are often perceived as not relevant in the academic environment of CERN. But think twice! At CERN, your personal data, e-mails, medical records, financial and contractual documents, MARS forms, group meeting minutes (and of course your password!) are all considered to be sensitive, restricted or even confidential. And this is not all. Physics results, in particular when preliminary and pending scrutiny, are sensitive, too. Just recently, an ATLAS collaborator copy/pasted the abstract of an ATLAS note onto an external public blog, despite the fact that this document was clearly marked as an "Internal Note". Such an act was not only embarrassing to the ATLAS collaboration, and had a negative impact on CERN’s reputation --- i...

  2. Classifying Sluice Occurrences in Dialogue

    DEFF Research Database (Denmark)

    Baird, Austin; Hamza, Anissa; Hardt, Daniel

    2018-01-01

    perform manual annotation with acceptable inter-coder agreement. We build classifier models with Decision Trees and Naive Bayes, with accuracy of 67%. We deploy a classifier to automatically classify sluice occurrences in OpenSubtitles, resulting in a corpus with 1.7 million occurrences. This will support....... Despite this, the corpus can be of great use in research on sluicing and development of systems, and we are making the corpus freely available on request. Furthermore, we are in the process of improving the accuracy of sluice identification and annotation for the purpose of creating a subsequent version...

  3. Quantum ensembles of quantum classifiers.

    Science.gov (United States)

    Schuld, Maria; Petruccione, Francesco

    2018-02-09

    Quantum machine learning witnesses an increasing number of quantum algorithms for data-driven decision making, a problem with potential applications ranging from automated image recognition to medical diagnosis. Many of those algorithms are implementations of quantum classifiers, or models for the classification of data inputs with a quantum computer. Following the success of collective decision making with ensembles in classical machine learning, this paper introduces the concept of quantum ensembles of quantum classifiers. Creating the ensemble corresponds to a state preparation routine, after which the quantum classifiers are evaluated in parallel and their combined decision is accessed by a single-qubit measurement. This framework naturally allows for exponentially large ensembles in which - similar to Bayesian learning - the individual classifiers do not have to be trained. As an example, we analyse an exponentially large quantum ensemble in which each classifier is weighted according to its performance in classifying the training data, leading to new results for quantum as well as classical machine learning.
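
    The accuracy-weighted ensemble mentioned at the end of this abstract has a simple classical analogue, sketched below. It is a purely classical illustration, not the quantum routine: a small fixed family of untrained threshold classifiers is enumerated, each is weighted by its accuracy on the training data, and the prediction is the sign of the weighted vote. The classifier family, the weighting by raw accuracy, and all names are assumptions made for the example.

```python
from typing import Callable, List, Tuple

Classifier = Callable[[float], int]


def make_threshold_classifier(threshold: float, sign: int) -> Classifier:
    """Untrained 1-D classifier: returns `sign` above the threshold, `-sign` otherwise."""
    def classify(x: float) -> int:
        return sign if x > threshold else -sign
    return classify


def accuracy(clf: Classifier, data: List[Tuple[float, int]]) -> float:
    """Fraction of labelled points (x, y) that the classifier gets right."""
    return sum(clf(x) == y for x, y in data) / len(data)


def ensemble_predict(classifiers: List[Classifier],
                     weights: List[float],
                     x: float) -> int:
    """Sign of the accuracy-weighted vote of all ensemble members."""
    vote = sum(w * clf(x) for clf, w in zip(classifiers, weights))
    return 1 if vote >= 0 else -1


# Hypothetical training set with labels in {-1, +1}.
train = [(-2.0, -1), (-1.0, -1), (1.0, 1), (2.0, 1)]

# The ensemble is the whole parameter grid; no member is trained individually.
classifiers = [make_threshold_classifier(t, s)
               for t in (-1.5, 0.0, 1.5)
               for s in (1, -1)]
weights = [accuracy(clf, train) for clf in classifiers]

print(weights)
print(ensemble_predict(classifiers, weights, 0.5))
print(ensemble_predict(classifiers, weights, -0.5))
```

    The only data-dependent step is computing the weights, which is why the individual members need no training; in a real setting the family would be far larger than the six members used here.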

  4. IAEA safeguards and classified materials

    International Nuclear Information System (INIS)

    Pilat, J.F.; Eccleston, G.W.; Fearey, B.L.; Nicholas, N.J.; Tape, J.W.; Kratzer, M.

    1997-01-01

    The international community in the post-Cold War period has suggested that the International Atomic Energy Agency (IAEA) utilize its expertise in support of the arms control and disarmament process in unprecedented ways. The pledges of the US and Russian presidents to place excess defense materials, some of which are classified, under some type of international inspections raises the prospect of using IAEA safeguards approaches for monitoring classified materials. A traditional safeguards approach, based on nuclear material accountancy, would seem unavoidably to reveal classified information. However, further analysis of the IAEA's safeguards approaches is warranted in order to understand fully the scope and nature of any problems. The issues are complex and difficult, and it is expected that common technical understandings will be essential for their resolution. Accordingly, this paper examines and compares traditional safeguards item accounting of fuel at a nuclear power station (especially spent fuel) with the challenges presented by inspections of classified materials. This analysis is intended to delineate more clearly the problems as well as reveal possible approaches, techniques, and technologies that could allow the adaptation of safeguards to the unprecedented task of inspecting classified materials. It is also hoped that a discussion of these issues can advance ongoing political-technical debates on international inspections of excess classified materials

  5. Isolation and characterization of an RIP (ribosome-inactivating protein)-like protein from tobacco with dual enzymatic activity.

    Science.gov (United States)

    Sharma, Neelam; Park, Sang-Wook; Vepachedu, Ramarao; Barbieri, Luigi; Ciani, Marialibera; Stirpe, Fiorenzo; Savary, Brett J; Vivanco, Jorge M

    2004-01-01

    Ribosome-inactivating proteins (RIPs) are N-glycosidases that remove a specific adenine from the sarcin/ricin loop of the large rRNA, thus arresting protein synthesis at the translocation step. In the present study, a protein termed tobacco RIP (TRIP) was isolated from tobacco (Nicotiana tabacum) leaves and purified using ion exchange and gel filtration chromatography in combination with yeast ribosome depurination assays. TRIP has a molecular mass of 26 kD as evidenced by sodium dodecyl sulfate-polyacrylamide gel electrophoresis and showed strong N-glycosidase activity as manifested by the depurination of yeast rRNA. Purified TRIP showed immunoreactivity with antibodies of RIPs from Mirabilis expansa. TRIP released fewer adenine residues from ribosomal (Artemia sp. and rat ribosomes) and non-ribosomal substrates (herring sperm DNA, rRNA, and tRNA) compared with other RIPs. TRIP inhibited translation in wheat (Triticum aestivum) germ more efficiently than in rabbit reticulocytes, showing an IC50 at 30 ng in the former system. Antimicrobial assays using highly purified TRIP (50 μg/mL) conducted against various fungi and bacterial pathogens showed the strongest inhibitory activity against Trichoderma reesei and Pseudomonas solanacearum. A 15-amino acid internal polypeptide sequence of TRIP was identical with the internal sequences of the iron-superoxide dismutase (Fe-SOD) from wild tobacco (Nicotiana plumbaginifolia), Arabidopsis, and potato (Solanum tuberosum). Purified TRIP showed SOD activity, and Escherichia coli Fe-SOD was observed to have RIP activity too. Thus, TRIP may be considered a dual activity enzyme showing RIP-like activity and Fe-SOD characteristics.

  6. Hybrid classifiers methods of data, knowledge, and classifier combination

    CERN Document Server

    Wozniak, Michal

    2014-01-01

    This book gives a compact account of how hybridization can improve the quality of computer classification systems. To make the idea concrete, it focuses on the different levels of hybridization and on the problems that arise in such projects. It first treats hybridization at the level of data and knowledge, and then turns to combined classifiers, a still-growing area of classifier systems. The book covers these state-of-the-art topics together with the latest research results of the author and his team from the Department of Systems and Computer Networks, Wroclaw University of Technology, including classifiers based on feature space splitting, one-class classification, imbalanced data, and data stream classification.

  7. 3D Bayesian contextual classifiers

    DEFF Research Database (Denmark)

    Larsen, Rasmus

    2000-01-01

    We extend a series of multivariate Bayesian 2-D contextual classifiers to 3-D by specifying a simultaneous Gaussian distribution for the feature vectors as well as a prior distribution of the class variables of a pixel and its 6 nearest 3-D neighbours....

  8. Knowledge Uncertainty and Composed Classifier

    Czech Academy of Sciences Publication Activity Database

    Klimešová, Dana; Ocelíková, E.

    2007-01-01

    Vol. 1, No. 2 (2007), pp. 101-105 ISSN 1998-0140 Institutional research plan: CEZ:AV0Z10750506 Keywords: boosting architecture * contextual modelling * composed classifier * knowledge management * knowledge * uncertainty Subject RIV: IN - Informatics, Computer Science

  9. Correlation Dimension-Based Classifier

    Czech Academy of Sciences Publication Activity Database

    Jiřina, Marcel; Jiřina jr., M.

    2014-01-01

    Vol. 44, No. 12 (2014), pp. 2253-2263 ISSN 2168-2267 R&D Projects: GA MŠk(CZ) LG12020 Institutional support: RVO:67985807 Keywords: classifier * multidimensional data * correlation dimension * scaling exponent * polynomial expansion Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 3.469, year: 2014

  10. IL-1 receptor accessory protein-like 1 associated with mental retardation and autism mediates synapse formation by trans-synaptic interaction with protein tyrosine phosphatase δ.

    Science.gov (United States)

    Yoshida, Tomoyuki; Yasumura, Misato; Uemura, Takeshi; Lee, Sung-Jin; Ra, Moonjin; Taguchi, Ryo; Iwakura, Yoichiro; Mishina, Masayoshi

    2011-09-21

    Mental retardation (MR) and autism are highly heterogeneous neurodevelopmental disorders. IL-1-receptor accessory protein-like 1 (IL1RAPL1) is responsible for nonsyndromic MR and is associated with autism. Thus, the elucidation of the functional role of IL1RAPL1 will contribute to our understanding of the pathogenesis of these mental disorders. Here, we showed that knockdown of endogenous IL1RAPL1 in cultured cortical neurons suppressed the accumulation of punctate staining signals for active zone protein Bassoon and decreased the number of dendritic protrusions. Consistently, the expression of IL1RAPL1 in cultured neurons stimulated the accumulation of Bassoon and spinogenesis. The extracellular domain (ECD) of IL1RAPL1 was required and sufficient for the presynaptic differentiation-inducing activity, while both the ECD and cytoplasmic domain were essential for the spinogenic activity. Notably, the synaptogenic activity of IL1RAPL1 was specific for excitatory synapses. Furthermore, we identified presynaptic protein tyrosine phosphatase (PTP) δ as a major IL1RAPL1-ECD interacting protein by affinity chromatography. IL1RAPL1 interacted selectively with certain forms of PTPδ splice variants carrying mini-exon peptides in Ig-like domains. The synaptogenic activity of IL1RAPL1 was abolished in primary neurons from PTPδ knock-out mice. IL1RAPL1 showed robust synaptogenic activity in vivo when transfected into the cortical neurons of wild-type mice but not in PTPδ knock-out mice. These results suggest that IL1RAPL1 mediates synapse formation through trans-synaptic interaction with PTPδ. Our findings raise an intriguing possibility that the impairment of synapse formation may underlie certain forms of MR and autism as a common pathogenic pathway shared by these mental disorders.

  11. Classified facilities for environmental protection

    International Nuclear Information System (INIS)

    Anon.

    1993-02-01

    The legislation of the classified facilities governs most of the dangerous or polluting industries or fixed activities. It rests on the law of 9 July 1976 concerning facilities classified for environmental protection and its application decree of 21 September 1977. This legislation, the general texts of which appear in this volume 1, aims to prevent all the risks and the harmful effects coming from an installation (air, water or soil pollutions, wastes, even aesthetic breaches). The polluting or dangerous activities are defined in a list called nomenclature which subjects the facilities to a declaration or an authorization procedure. The authorization is delivered by the prefect at the end of an open and contradictory procedure after a public survey. In addition, the facilities can be subjected to technical regulations fixed by the Environment Minister (volume 2) or by the prefect for facilities subjected to declaration (volume 3). (A.B.)

  12. Energy-Efficient Neuromorphic Classifiers.

    Science.gov (United States)

    Martí, Daniel; Rigotti, Mattia; Seok, Mingoo; Fusi, Stefano

    2016-10-01

    Neuromorphic engineering combines the architectural and computational principles of systems neuroscience with semiconductor electronics, with the aim of building efficient and compact devices that mimic the synaptic and neural machinery of the brain. The energy consumptions promised by neuromorphic engineering are extremely low, comparable to those of the nervous system. Until now, however, the neuromorphic approach has been restricted to relatively simple circuits and specialized functions, thereby obfuscating a direct comparison of their energy consumption to that used by conventional von Neumann digital machines solving real-world tasks. Here we show that a recent technology developed by IBM can be leveraged to realize neuromorphic circuits that operate as classifiers of complex real-world stimuli. Specifically, we provide a set of general prescriptions to enable the practical implementation of neural architectures that compete with state-of-the-art classifiers. We also show that the energy consumption of these architectures, realized on the IBM chip, is typically two or more orders of magnitude lower than that of conventional digital machines implementing classifiers with comparable performance. Moreover, the spike-based dynamics display a trade-off between integration time and accuracy, which naturally translates into algorithms that can be flexibly deployed for either fast and approximate classifications, or more accurate classifications at the mere expense of longer running times and higher energy costs. This work finally proves that the neuromorphic approach can be efficiently used in real-world applications and has significant advantages over conventional digital devices when energy consumption is considered.

  13. 76 FR 34761 - Classified National Security Information

    Science.gov (United States)

    2011-06-14

    ... MARINE MAMMAL COMMISSION Classified National Security Information [Directive 11-01] AGENCY: Marine... Commission's (MMC) policy on classified information, as directed by Information Security Oversight Office... of Executive Order 13526, ``Classified National Security Information,'' and 32 CFR part 2001...

  14. Maximum margin classifier working in a set of strings.

    Science.gov (United States)

    Koyano, Hitoshi; Hayashida, Morihiro; Akutsu, Tatsuya

    2016-03-01

    Numbers and numerical vectors account for a large portion of data. However, recently, the amount of string data generated has increased dramatically. Consequently, classifying string data is a common problem in many fields. The most widely used approach to this problem is to convert strings into numerical vectors using string kernels and subsequently apply a support vector machine that works in a numerical vector space. However, this non-one-to-one conversion involves a loss of information and makes it impossible to evaluate, using probability theory, the generalization error of a learning machine, considering that the given data to train and test the machine are strings generated according to probability laws. In this study, we approach this classification problem by constructing a classifier that works in a set of strings. To evaluate the generalization error of such a classifier theoretically, probability theory for strings is required. Therefore, we first extend a limit theorem for a consensus sequence of strings demonstrated by one of the authors and co-workers in a previous study. Using the obtained result, we then demonstrate that our learning machine classifies strings in an asymptotically optimal manner. Furthermore, we demonstrate the usefulness of our machine in practical data analysis by applying it to predicting protein-protein interactions using amino acid sequences and classifying RNAs by the secondary structure using nucleotide sequences.

  15. 18 CFR 3a.71 - Accountability for classified material.

    Science.gov (United States)

    2010-04-01

    ... numbers assigned to top secret material will be separate from the sequence for other classified material... central control registry in calendar year 1969. TS 1006—Sixth Top Secret document controlled by the... control registry when the document is transferred. (e) For Top Secret documents only, an access register...

  16. LOCALIZATION AND RECOGNITION OF DYNAMIC HAND GESTURES BASED ON HIERARCHY OF MANIFOLD CLASSIFIERS

    OpenAIRE

    M. Favorskaya; A. Nosov; A. Popov

    2015-01-01

    Generally, the dynamic hand gestures are captured in continuous video sequences, and a gesture recognition system ought to extract the robust features automatically. This task involves the highly challenging spatio-temporal variations of dynamic hand gestures. The proposed method is based on two-level manifold classifiers including the trajectory classifiers in any time instants and the posture classifiers of sub-gestures in selected time instants. The trajectory classifiers contain skin dete...

  17. Waste classifying and separation device

    International Nuclear Information System (INIS)

    Kakiuchi, Hiroki.

    1997-01-01

    A flexible plastic bag containing solid wastes of indefinite shape is broken open and the wastes are classified. The bag-cutting portion of the device has an ultrasonic-type or a heater-type cutting means, and the cutting means moves in parallel with the transferring direction of the plastic bags. A classification portion separates and discriminates the plastic bag from the contents and conducts classification while rotating a classification table. Accordingly, a plastic bag containing solids of indefinite shape can be broken open and classification can be conducted efficiently and reliably. The device of the present invention has a simple structure which requires small installation space and enables easy maintenance. (T.M.)

  18. Defining and Classifying Interest Groups

    DEFF Research Database (Denmark)

    Baroni, Laura; Carroll, Brendan; Chalmers, Adam

    2014-01-01

    The interest group concept is defined in many different ways in the existing literature and a range of different classification schemes are employed. This complicates comparisons between different studies and their findings. One of the important tasks faced by interest group scholars engaged...... in large-N studies is therefore to define the concept of an interest group and to determine which classification scheme to use for different group types. After reviewing the existing literature, this article sets out to compare different approaches to defining and classifying interest groups with a sample...... in the organizational attributes of specific interest group types. As expected, our comparison of coding schemes reveals a closer link between group attributes and group type in narrower classification schemes based on group organizational characteristics than those based on a behavioral definition of lobbying....

  19. Composite Classifiers for Automatic Target Recognition

    National Research Council Canada - National Science Library

    Wang, Lin-Cheng

    1998-01-01

    ...) using forward-looking infrared (FLIR) imagery. Two existing classifiers, one based on learning vector quantization and the other on modular neural networks, are used as the building blocks for our composite classifiers...

  20. Aggregation Operator Based Fuzzy Pattern Classifier Design

    DEFF Research Database (Denmark)

    Mönks, Uwe; Larsen, Henrik Legind; Lohweg, Volker

    2009-01-01

    This paper presents a novel modular fuzzy pattern classifier design framework for intelligent automation systems, developed on the base of the established Modified Fuzzy Pattern Classifier (MFPC) and allows designing novel classifier models which are hardware-efficiently implementable....... The performances of novel classifiers using substitutes of MFPC's geometric mean aggregator are benchmarked in the scope of an image processing application against the MFPC to reveal classification improvement potentials for obtaining higher classification rates....

  1. 15 CFR 4.8 - Classified Information.

    Science.gov (United States)

    2010-01-01

    ... 15 Commerce and Foreign Trade 1 2010-01-01 2010-01-01 false Classified Information. 4.8 Section 4... INFORMATION Freedom of Information Act § 4.8 Classified Information. In processing a request for information..., the information shall be reviewed to determine whether it should remain classified. Ordinarily the...

  2. A Gene Expression Classifier of Node-Positive Colorectal Cancer

    Directory of Open Access Journals (Sweden)

    Paul F. Meeh

    2009-10-01

    We used digital long serial analysis of gene expression to discover gene expression differences between node-negative and node-positive colorectal tumors and developed a multigene classifier able to discriminate between these two tumor types. We prepared and sequenced long serial analysis of gene expression libraries from one node-negative and one node-positive colorectal tumor, sequenced to a depth of 26,060 unique tags, and identified 262 tags significantly differentially expressed between these two tumors (P < 2 × 10⁻⁶). We confirmed the tag-to-gene assignments and differential expression of 31 genes by quantitative real-time polymerase chain reaction, 12 of which were elevated in the node-positive tumor. We analyzed the expression levels of these 12 upregulated genes in a validation panel of 23 additional tumors and developed an optimized seven-gene logistic regression classifier. The classifier discriminated between node-negative and node-positive tumors with 86% sensitivity and 80% specificity. Receiver operating characteristic analysis of the classifier revealed an area under the curve of 0.86. Experimental manipulation of the function of one classification gene, Fibronectin, caused profound effects on invasion and migration of colorectal cancer cells in vitro. These results suggest that the development of node-positive colorectal cancer occurs in part through elevated epithelial FN1 expression and suggest novel strategies for the diagnosis and treatment of advanced disease.
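
    The sensitivity and specificity quoted above are ratios taken from a binary confusion matrix. The following is a minimal sketch of that arithmetic only; the function name and the label vectors are hypothetical and are not the study's data or code.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Convention assumed here: 1 = node-positive, 0 = node-negative.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    sensitivity = tp / (tp + fn)  # fraction of positives that were detected
    specificity = tn / (tn + fp)  # fraction of negatives that were cleared
    return sensitivity, specificity


# Hypothetical labels for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

    Both ratios depend on the decision threshold applied to the classifier's output, which is why the abstract also reports a threshold-independent area under the curve.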

  3. Error minimizing algorithms for nearest neighbor classifiers

    Energy Technology Data Exchange (ETDEWEB)

    Porter, Reid B [Los Alamos National Laboratory]; Hush, Don [Los Alamos National Laboratory]; Zimmer, G. Beate [TEXAS A&M]

    2011-01-03

    Stack Filters define a large class of discrete nonlinear filters first introduced in image and signal processing for noise removal. In recent years we have suggested their application to classification problems, and investigated their relationship to other types of discrete classifiers such as Decision Trees. In this paper we focus on a continuous domain version of Stack Filter Classifiers which we call Ordered Hypothesis Machines (OHM), and investigate their relationship to Nearest Neighbor classifiers. We show that OHM classifiers provide a novel framework in which to train Nearest Neighbor type classifiers by minimizing empirical error based loss functions. We use the framework to investigate a new cost sensitive loss function that allows us to train a Nearest Neighbor type classifier for low false alarm rate applications. We report results on both synthetic data and real-world image data.

  4. Generic Black-Box End-to-End Attack Against State of the Art API Call Based Malware Classifiers

    OpenAIRE

    Rosenberg, Ishai; Shabtai, Asaf; Rokach, Lior; Elovici, Yuval

    2017-01-01

    In this paper, we present a black-box attack against API call based machine learning malware classifiers, focusing on generating adversarial sequences combining API calls and static features (e.g., printable strings) that will be misclassified by the classifier without affecting the malware functionality. We show that this attack is effective against many classifiers due to the transferability principle between RNN variants, feed forward DNNs, and traditional machine learning classifiers such...

  5. Hierarchical mixtures of naive Bayes classifiers

    NARCIS (Netherlands)

    Wiering, M.A.

    2002-01-01

    Naive Bayes classifiers tend to perform very well on a large number of problem domains, although their representation power is quite limited compared to more sophisticated machine learning algorithms. In this paper we study combining multiple naive Bayes classifiers by using the hierarchical

  6. Comparing classifiers for pronunciation error detection

    NARCIS (Netherlands)

    Strik, H.; Truong, K.; Wet, F. de; Cucchiarini, C.

    2007-01-01

    Providing feedback on pronunciation errors in computer assisted language learning systems requires that pronunciation errors be detected automatically. In the present study we compare four types of classifiers that can be used for this purpose: two acoustic-phonetic classifiers (one of which employs

  7. Feature extraction for dynamic integration of classifiers

    NARCIS (Netherlands)

    Pechenizkiy, M.; Tsymbal, A.; Puuronen, S.; Patterson, D.W.

    2007-01-01

    Recent research has shown the integration of multiple classifiers to be one of the most important directions in machine learning and data mining. In this paper, we present an algorithm for the dynamic integration of classifiers in the space of extracted features (FEDIC). It is based on the technique

  8. Deconvolution When Classifying Noisy Data Involving Transformations

    KAUST Repository

    Carroll, Raymond

    2012-09-01

    In the present study, we consider the problem of classifying spatial data distorted by a linear transformation or convolution and contaminated by additive random noise. In this setting, we show that classifier performance can be improved if we carefully invert the data before the classifier is applied. However, the inverse transformation is not constructed so as to recover the original signal, and in fact, we show that taking the latter approach is generally inadvisable. We introduce a fully data-driven procedure based on cross-validation, and use several classifiers to illustrate numerical properties of our approach. Theoretical arguments are given in support of our claims. Our procedure is applied to data generated by light detection and ranging (Lidar) technology, where we improve on earlier approaches to classifying aerosols. This article has supplementary materials online.

  9. Deconvolution When Classifying Noisy Data Involving Transformations.

    Science.gov (United States)

    Carroll, Raymond; Delaigle, Aurore; Hall, Peter

    2012-09-01

    In the present study, we consider the problem of classifying spatial data distorted by a linear transformation or convolution and contaminated by additive random noise. In this setting, we show that classifier performance can be improved if we carefully invert the data before the classifier is applied. However, the inverse transformation is not constructed so as to recover the original signal, and in fact, we show that taking the latter approach is generally inadvisable. We introduce a fully data-driven procedure based on cross-validation, and use several classifiers to illustrate numerical properties of our approach. Theoretical arguments are given in support of our claims. Our procedure is applied to data generated by light detection and ranging (Lidar) technology, where we improve on earlier approaches to classifying aerosols. This article has supplementary materials online.

  10. Deconvolution When Classifying Noisy Data Involving Transformations

    KAUST Repository

    Carroll, Raymond; Delaigle, Aurore; Hall, Peter

    2012-01-01

    In the present study, we consider the problem of classifying spatial data distorted by a linear transformation or convolution and contaminated by additive random noise. In this setting, we show that classifier performance can be improved if we carefully invert the data before the classifier is applied. However, the inverse transformation is not constructed so as to recover the original signal, and in fact, we show that taking the latter approach is generally inadvisable. We introduce a fully data-driven procedure based on cross-validation, and use several classifiers to illustrate numerical properties of our approach. Theoretical arguments are given in support of our claims. Our procedure is applied to data generated by light detection and ranging (Lidar) technology, where we improve on earlier approaches to classifying aerosols. This article has supplementary materials online.

  11. Logarithmic learning for generalized classifier neural network.

    Science.gov (United States)

    Ozyildirim, Buse Melis; Avci, Mutlu

    2014-12-01

    The generalized classifier neural network has been introduced as an efficient classifier. Unless the initial smoothing parameter value is close to the optimal one, the generalized classifier neural network suffers from a convergence problem and requires quite a long time to converge. In this work, to overcome this problem, a logarithmic learning approach is proposed. The proposed method uses a logarithmic cost function instead of squared error. Minimization of this cost function reduces the number of iterations needed to reach the minimum. The proposed method is tested on 15 different data sets, and the performance of the logarithmic learning generalized classifier neural network is compared with that of the standard one. Owing to the operating range of the radial basis function included in the generalized classifier neural network, the proposed logarithmic cost function and its derivative take continuous values. This makes it possible for the proposed learning method to exploit the fast convergence of the logarithmic cost. Owing to this fast convergence, training time is reduced by as much as 99.2%. In addition to the decrease in training time, classification performance may also improve by up to 60%. According to the test results, the proposed method addresses the time requirement problem of the generalized classifier neural network and may also improve classification accuracy. It can therefore be considered an efficient way of reducing the time requirement of the generalized classifier neural network.

  12. A CLASSIFIER SYSTEM USING SMOOTH GRAPH COLORING

    Directory of Open Access Journals (Sweden)

    JORGE FLORES CRUZ

    2017-01-01

    Unsupervised classifiers allow clustering with little or no human intervention. It is therefore desirable to group a set of items with less data processing. This paper proposes an unsupervised classifier system using the model of soft graph coloring. The method was tested on some classic instances from the literature and the results were compared with classifications made with human intervention, yielding results as good as or better than supervised classifiers, and sometimes providing alternative classifications that consider additional information that humans did not consider.

  13. High dimensional classifiers in the imbalanced case

    DEFF Research Database (Denmark)

    Bak, Britta Anker; Jensen, Jens Ledet

    We consider the binary classification problem in the imbalanced case where the number of samples from the two groups differs. The classification problem is considered in the high dimensional case where the number of variables is much larger than the number of samples, and where the imbalance leads...... to a bias in the classification. A theoretical analysis of the independence classifier reveals the origin of the bias and based on this we suggest two new classifiers that can handle any imbalance ratio. The analytical results are supplemented by a simulation study, where the suggested classifiers in some...

  14. Arabic Handwriting Recognition Using Neural Network Classifier

    African Journals Online (AJOL)

    pc

    2018-03-05

    Mar 5, 2018 ... an OCR using Neural Network classifier preceded by a set of preprocessing .... Artificial Neural Networks (ANNs), which we adopt in this research, consist of ... advantage and disadvantages of each technique. In [9],. Khemiri ...

  15. Classifiers based on optimal decision rules

    KAUST Repository

    Amin, Talha

    2013-11-25

    Based on a dynamic programming approach, we design algorithms for sequential optimization of exact and approximate decision rules relative to the length and coverage [3, 4]. In this paper, we use optimal rules to construct classifiers, and study two questions: (i) which rules are better from the point of view of classification, exact or approximate; and (ii) which order of optimization gives better results of classifier work: length, length+coverage, coverage, or coverage+length. Experimental results show that, on average, classifiers based on exact rules are better than classifiers based on approximate rules, and sequential optimization (length+coverage or coverage+length) is better than the ordinary optimization (length or coverage).

  16. Classifiers based on optimal decision rules

    KAUST Repository

    Amin, Talha M.; Chikalov, Igor; Moshkov, Mikhail; Zielosko, Beata

    2013-01-01

    Based on a dynamic programming approach, we design algorithms for sequential optimization of exact and approximate decision rules relative to the length and coverage [3, 4]. In this paper, we use optimal rules to construct classifiers, and study two questions: (i) which rules are better from the point of view of classification, exact or approximate; and (ii) which order of optimization gives better results of classifier work: length, length+coverage, coverage, or coverage+length. Experimental results show that, on average, classifiers based on exact rules are better than classifiers based on approximate rules, and sequential optimization (length+coverage or coverage+length) is better than the ordinary optimization (length or coverage).

  17. Combining multiple classifiers for age classification

    CSIR Research Space (South Africa)

    Van Heerden, C

    2009-11-01

    Full Text Available The authors compare several different classifier combination methods on a single task, namely speaker age classification. This task is well suited to combination strategies, since significantly different feature classes are employed. Support vector...

  18. Neural Network Classifiers for Local Wind Prediction.

    Science.gov (United States)

    Kretzschmar, Ralf; Eckert, Pierre; Cattani, Daniel; Eggimann, Fritz

    2004-05-01

    This paper evaluates the quality of neural network classifiers for wind speed and wind gust prediction with prediction lead times between +1 and +24 h. The predictions were realized based on local time series and model data. The selection of appropriate input features was initiated by time series analysis and completed by empirical comparison of neural network classifiers trained on several choices of input features. The selected input features involved day time, yearday, features from a single wind observation device at the site of interest, and features derived from model data. The quality of the resulting classifiers was benchmarked against persistence for two different sites in Switzerland. The neural network classifiers exhibited superior quality when compared with persistence judged on a specific performance measure, hit and false-alarm rates.
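
    A persistence baseline of the kind used for benchmarking above can be sketched in a few lines. This is a rough illustration under stated assumptions, not the paper's evaluation code: the event series is hypothetical, the function names are invented here, and "false-alarm rate" is taken to mean false alarms divided by all non-events, which is one of two common conventions.

```python
def persistence_forecast(events, lead):
    """Pair each persistence forecast with its verifying observation.

    Persistence assumes the state observed at time t still holds at t + lead.
    Returns (forecasts, observations) as equally long lists.
    """
    if lead < 1 or lead >= len(events):
        raise ValueError("lead must be between 1 and len(events) - 1")
    return events[:-lead], events[lead:]


def hit_and_false_alarm_rate(forecasts, observations):
    """Hit rate = hits / observed events; false-alarm rate = false alarms / non-events."""
    pairs = list(zip(forecasts, observations))
    hits = sum(1 for f, o in pairs if f == 1 and o == 1)
    misses = sum(1 for f, o in pairs if f == 0 and o == 1)
    false_alarms = sum(1 for f, o in pairs if f == 1 and o == 0)
    correct_negatives = sum(1 for f, o in pairs if f == 0 and o == 0)
    hit_rate = hits / (hits + misses)
    false_alarm_rate = false_alarms / (false_alarms + correct_negatives)
    return hit_rate, false_alarm_rate


# Hypothetical hourly series: 1 = gust threshold exceeded, 0 = not exceeded.
events = [0, 0, 1, 1, 1, 0, 0, 1, 0, 0]
forecasts, observations = persistence_forecast(events, lead=1)
hit_rate, false_alarm_rate = hit_and_false_alarm_rate(forecasts, observations)
print(f"hit rate={hit_rate:.2f} false-alarm rate={false_alarm_rate:.2f}")
```

    A trained classifier is worth deploying only if it beats these two numbers at the same lead time, since persistence costs nothing to compute.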

  19. Consistency Analysis of Nearest Subspace Classifier

    OpenAIRE

    Wang, Yi

    2015-01-01

    The Nearest subspace classifier (NSS) finds an estimation of the underlying subspace within each class and assigns data points to the class that corresponds to its nearest subspace. This paper mainly studies how well NSS can be generalized to new samples. It is proved that NSS is strongly consistent under certain assumptions. For completeness, NSS is evaluated through experiments on various simulated and real data sets, in comparison with some other linear model based classifiers. It is also ...
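
    The decision rule described above (estimate one subspace per class, then assign a point to the class whose subspace is nearest) can be sketched as follows. The abstract does not say how each subspace is estimated; this sketch assumes, purely for illustration, that it is the span of the class's training vectors through the origin, orthonormalized by Gram-Schmidt. All function names and data are hypothetical.

```python
import math


def orthonormal_basis(vectors, tol=1e-10):
    """Modified Gram-Schmidt: orthonormal basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            coeff = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm > tol:  # skip vectors already inside the current span
            basis.append([wi / norm for wi in w])
    return basis


def distance_to_subspace(x, basis):
    """Euclidean norm of the part of x orthogonal to the subspace."""
    residual = list(x)
    for b in basis:
        coeff = sum(xi * bi for xi, bi in zip(x, b))
        residual = [ri - coeff * bi for ri, bi in zip(residual, b)]
    return math.sqrt(sum(ri * ri for ri in residual))


def fit(samples_by_class):
    """Estimate one subspace per class label (assumed: span of its samples)."""
    return {label: orthonormal_basis(vs) for label, vs in samples_by_class.items()}


def predict(x, subspaces):
    """Assign x to the class with the smallest distance to its subspace."""
    return min(subspaces, key=lambda label: distance_to_subspace(x, subspaces[label]))


# Hypothetical training data: class "a" spans the xy-plane, class "b" the z-axis.
subspaces = fit({
    "a": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    "b": [[0.0, 0.0, 1.0], [0.0, 0.0, 2.0]],
})
print(predict([3.0, 1.0, 0.2], subspaces))
print(predict([0.1, 0.0, 5.0], subspaces))
```

    A practical variant would usually fit a lower-dimensional subspace to noisy samples, for example with a truncated singular value decomposition, rather than take the exact span.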

  20. Comprehensive benchmarking and ensemble approaches for metagenomic classifiers.

    Science.gov (United States)

    McIntyre, Alexa B R; Ounit, Rachid; Afshinnekoo, Ebrahim; Prill, Robert J; Hénaff, Elizabeth; Alexander, Noah; Minot, Samuel S; Danko, David; Foox, Jonathan; Ahsanuddin, Sofia; Tighe, Scott; Hasan, Nur A; Subramanian, Poorani; Moffat, Kelly; Levy, Shawn; Lonardi, Stefano; Greenfield, Nick; Colwell, Rita R; Rosen, Gail L; Mason, Christopher E

    2017-09-21

    One of the main challenges in metagenomics is the identification of microorganisms in clinical and environmental samples. While an extensive and heterogeneous set of computational tools is available to classify microorganisms using whole-genome shotgun sequencing data, comprehensive comparisons of these methods are limited. In this study, we use the largest-to-date set of laboratory-generated and simulated controls across 846 species to evaluate the performance of 11 metagenomic classifiers. Tools were characterized on the basis of their ability to identify taxa at the genus, species, and strain levels, quantify relative abundances of taxa, and classify individual reads to the species level. Strikingly, the number of species identified by the 11 tools can differ by over three orders of magnitude on the same datasets. Various strategies can ameliorate taxonomic misclassification, including abundance filtering, ensemble approaches, and tool intersection. Nevertheless, these strategies were often insufficient to completely eliminate false positives from environmental samples, which are especially important where they concern medically relevant species. Overall, pairing tools with different classification strategies (k-mer, alignment, marker) can combine their respective advantages. This study provides positive and negative controls, titrated standards, and a guide for selecting tools for metagenomic analyses by comparing ranges of precision, accuracy, and recall. We show that proper experimental design and analysis parameters can reduce false positives, provide greater resolution of species in complex metagenomic samples, and improve the interpretation of results.
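
    Two of the strategies named above, abundance filtering and tool intersection, can be sketched with plain sets. This is a simplified illustration, not the benchmark's pipeline: the tool names, read counts, and the 2% threshold are all hypothetical.

```python
from collections import Counter


def abundance_filter(profile, min_fraction):
    """Keep taxa whose share of classified reads is at least `min_fraction`."""
    total = sum(profile.values())
    if total == 0:
        return set()
    return {taxon for taxon, count in profile.items() if count / total >= min_fraction}


def consensus(calls, min_tools):
    """Taxa reported by at least `min_tools` tools.

    With min_tools == len(calls) this is a strict intersection of the tools.
    """
    votes = Counter(taxon for taxa in calls for taxon in taxa)
    return {taxon for taxon, n in votes.items() if n >= min_tools}


# Hypothetical per-tool read counts for one sample.
profiles = {
    "tool_kmer": {"E. coli": 900, "S. aureus": 95, "B. anthracis": 5},
    "tool_alignment": {"E. coli": 850, "S. aureus": 140, "Y. pestis": 10},
    "tool_marker": {"E. coli": 700, "S. aureus": 300},
}

# Step 1: drop low-abundance calls within each tool.
filtered = [abundance_filter(profile, min_fraction=0.02) for profile in profiles.values()]
# Step 2: keep only taxa that every tool still reports.
agreed = consensus(filtered, min_tools=len(filtered))
print(sorted(agreed))
```

    Raising `min_fraction` or `min_tools` trades recall for precision, which matches the abstract's caution that such strategies reduce false positives but do not always eliminate them.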

  1. SPiCE: A web-based tool for sequence-based protein classification and exploration

    NARCIS (Netherlands)

    Van den Berg, B.A.; Reinders, M.J.; Roubos, J.A.; De Ridder, D.

    2014-01-01

    Background: Amino acid sequences and features extracted from such sequences have been used to predict many protein properties, such as subcellular localization or solubility, using classifier algorithms. Although software tools are available for both feature extraction and classifier construction,

  2. Characterization of the beta amyloid precursor protein-like gene in the central nervous system of the crab Chasmagnathus. Expression during memory consolidation

    Directory of Open Access Journals (Sweden)

    Fustiñana Maria

    2010-09-01

    Full Text Available Abstract Background: Human β-amyloid, the main component in the neuritic plaques found in patients with Alzheimer's disease, is generated by cleavage of the β-amyloid precursor protein. Beyond the role in pathology, members of this protein family are synaptic proteins and have been associated with synaptogenesis, neuronal plasticity and memory, both in vertebrates and in invertebrates. Consolidation is necessary to convert a short-term labile memory to a long-term and stable form. During consolidation, gene expression and de novo protein synthesis are regulated in order to produce key proteins for the maintenance of plastic changes produced during the acquisition of new information. Results: Here we partially cloned and sequenced the beta-amyloid precursor protein-like gene homologue in the crab Chasmagnathus (cappl), showing 37% identity with the fruit fly Drosophila melanogaster homologue and 23% with Homo sapiens, but with a much higher degree of sequence similarity in certain regions. We observed a wide distribution of cappl mRNA in the nervous system as well as in muscle and gills. The protein localized in all tissues analyzed with the exception of muscle. Immunofluorescence revealed localization of cAPPL in associative and sensory brain areas. We studied gene and protein expression during long-term memory consolidation using a well characterized memory model: the context-signal associative memory in this crab species. mRNA levels varied at different time points during long-term memory consolidation and correlated with cAPPL protein levels. Conclusions: cAPPL mRNA and protein are widely distributed in the central nervous system of the crab, and the time course of expression suggests a role of cAPPL during long-term memory formation.

  3. Characterization of the beta amyloid precursor protein-like gene in the central nervous system of the crab Chasmagnathus. Expression during memory consolidation.

    Science.gov (United States)

    Fustiñana, Maria Sol; Ariel, Pablo; Federman, Noel; Freudenthal, Ramiro; Romano, Arturo

    2010-09-01

    Human β-amyloid, the main component in the neuritic plaques found in patients with Alzheimer's disease, is generated by cleavage of the β-amyloid precursor protein. Beyond the role in pathology, members of this protein family are synaptic proteins and have been associated with synaptogenesis, neuronal plasticity and memory, both in vertebrates and in invertebrates. Consolidation is necessary to convert a short-term labile memory to a long-term and stable form. During consolidation, gene expression and de novo protein synthesis are regulated in order to produce key proteins for the maintenance of plastic changes produced during the acquisition of new information. Here we partially cloned and sequenced the beta-amyloid precursor protein-like gene homologue in the crab Chasmagnathus (cappl), showing 37% identity with the fruit fly Drosophila melanogaster homologue and 23% with Homo sapiens, but with a much higher degree of sequence similarity in certain regions. We observed a wide distribution of cappl mRNA in the nervous system as well as in muscle and gills. The protein localized in all tissues analyzed with the exception of muscle. Immunofluorescence revealed localization of cAPPL in associative and sensory brain areas. We studied gene and protein expression during long-term memory consolidation using a well characterized memory model: the context-signal associative memory in this crab species. mRNA levels varied at different time points during long-term memory consolidation and correlated with cAPPL protein levels. cAPPL mRNA and protein are widely distributed in the central nervous system of the crab, and the time course of expression suggests a role of cAPPL during long-term memory formation.

  4. Reinforcement Learning Based Artificial Immune Classifier

    Directory of Open Access Journals (Sweden)

    Mehmet Karakose

    2013-01-01

    Full Text Available Artificial immune systems are among the widely used methods for classification, which is a decision-making process. Modeled on the natural immune system, they can be successfully applied to classification, optimization, recognition, and learning in real-world problems. In this study, a reinforcement learning based artificial immune classifier is proposed as a new approach. This approach uses reinforcement learning to find better antibodies with immune operators. Compared with other methods in the literature, the proposed approach offers several advantages, such as effectiveness, fewer memory cells, high accuracy, speed, and data adaptability. The performance of the proposed approach is demonstrated by simulation and experimental results using real data in Matlab and on an FPGA. Some benchmark data and remote image data are used for the experimental results. Comparative results with supervised/unsupervised artificial immune systems, the negative selection classifier, and the resource limited artificial immune classifier are given to demonstrate the effectiveness of the proposed method.

  5. Classifier Fusion With Contextual Reliability Evaluation.

    Science.gov (United States)

    Liu, Zhunga; Pan, Quan; Dezert, Jean; Han, Jun-Wei; He, You

    2018-05-01

    Classifier fusion is an efficient strategy to improve the classification performance for the complex pattern recognition problem. In practice, the multiple classifiers to combine can have different reliabilities and the proper reliability evaluation plays an important role in the fusion process for getting the best classification performance. We propose a new method for classifier fusion with contextual reliability evaluation (CF-CRE) based on inner reliability and relative reliability concepts. The inner reliability, represented by a matrix, characterizes the probability of the object belonging to one class when it is classified to another class. The elements of this matrix are estimated from the K-nearest neighbors of the object. A cautious discounting rule is developed under the belief functions framework to revise the classification result according to the inner reliability. The relative reliability is evaluated based on a new incompatibility measure, which makes it possible to reduce the level of conflict between the classifiers by applying the classical evidence discounting rule to each classifier before their combination. The inner reliability and relative reliability capture different aspects of the classification reliability. The discounted classification results are combined with Dempster-Shafer's rule for the final class decision making support. The performance of CF-CRE has been evaluated and compared with that of the main classical fusion methods using real data sets. The experimental results show that CF-CRE can produce substantially higher accuracy than other fusion methods in general. Moreover, CF-CRE is robust to changes in the number of nearest neighbors chosen for estimating the reliability matrix, which is appealing for applications.
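
    The two classical building blocks this abstract relies on, evidence discounting and Dempster's rule of combination, can be sketched as follows. This is a minimal sketch of the textbook operations only; it does not implement the paper's cautious discounting rule or its incompatibility measure, and the reliability factors 0.9 and 0.5 and the two-class mass functions are assumed values for illustration.

```python
from itertools import product


def discount(mass, alpha, frame):
    """Classical (Shafer) discounting with reliability alpha in [0, 1].

    Each mass is scaled by alpha and the remainder goes to the whole frame.
    """
    out = {A: alpha * m for A, m in mass.items() if A != frame}
    out[frame] = alpha * mass.get(frame, 0.0) + (1.0 - alpha)
    return out


def dempster(m1, m2):
    """Dempster's rule: multiply masses of intersecting focal sets,
    then renormalise by 1 minus the conflicting mass."""
    combined, conflict = {}, 0.0
    for (A, a), (B, b) in product(m1.items(), m2.items()):
        inter = A & B
        if inter:
            combined[inter] = combined.get(inter, 0.0) + a * b
        else:
            conflict += a * b
    if conflict >= 1.0:
        raise ValueError("total conflict: sources cannot be combined")
    return {A: m / (1.0 - conflict) for A, m in combined.items()}


# Two hypothetical classifier outputs over classes "a" and "b".
frame = frozenset({"a", "b"})
c1 = {frozenset({"a"}): 0.9, frozenset({"b"}): 0.1}
c2 = {frozenset({"a"}): 0.2, frozenset({"b"}): 0.8}

# Assumed reliabilities: classifier 1 is trusted more than classifier 2.
fused = dempster(discount(c1, 0.9, frame), discount(c2, 0.5, frame))
decision = max((A for A in fused if len(A) == 1), key=fused.get)
print(fused, decision)
```

    Discounting the less reliable source before combination moves part of its mass to the whole frame, which lowers the conflict that Dempster's rule must normalise away.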

  6. Classifying sows' activity types from acceleration patterns

    DEFF Research Database (Denmark)

    Cornou, Cecile; Lundbye-Christensen, Søren

    2008-01-01

    An automated method of classifying sow activity using acceleration measurements would allow the individual sow's behavior to be monitored throughout the reproductive cycle; applications for detecting behaviors characteristic of estrus and farrowing or to monitor illness and welfare can be foreseen....... This article suggests a method of classifying five types of activity exhibited by group-housed sows. The method involves the measurement of acceleration in three dimensions. The five activities are: feeding, walking, rooting, lying laterally and lying sternally. Four time series of acceleration (the three...

  7. Data characteristics that determine classifier performance

    CSIR Research Space (South Africa)

    Van der Walt, Christiaan M

    2006-11-01

    Full Text Available available at [11]. The kNN uses a LinearNN nearest neighbour search algorithm with an Euclidean distance metric [8]. The optimal k value is determined by performing 10-fold cross-validation. An optimal k value between 1 and 10 is used for Experiments 1... classifiers. 10-fold cross-validation is used to evaluate and compare the performance of the classifiers on the different data sets. 3.1. Artificial data generation Multivariate Gaussian distributions are used to generate artificial data sets. We use d...

  8. A Customizable Text Classifier for Text Mining

    Directory of Open Access Journals (Sweden)

    Yun-liang Zhang

    2007-12-01

    Full Text Available Text mining deals with complex and unstructured texts. Usually a particular collection of texts that is specified to one or more domains is necessary. We have developed a customizable text classifier for users to mine the collection automatically. It derives from the sentence category of the HNC theory and corresponding techniques. It can start with a few texts, and it can adjust automatically or be adjusted by user. The user can also control the number of domains chosen and decide the standard with which to choose the texts based on demand and abundance of materials. The performance of the classifier varies with the user's choice.

  9. A survey of decision tree classifier methodology

    Science.gov (United States)

    Safavian, S. R.; Landgrebe, David

    1991-01-01

    Decision tree classifiers (DTCs) are used successfully in many diverse areas such as radar signal classification, character recognition, remote sensing, medical diagnosis, expert systems, and speech recognition. Perhaps the most important feature of DTCs is their capability to break down a complex decision-making process into a collection of simpler decisions, thus providing a solution which is often easier to interpret. A survey of current methods is presented for DTC designs and the various existing issues. After considering potential advantages of DTCs over single-stage classifiers, subjects of tree structure design, feature selection at each internal node, and decision and search strategies are discussed.

  10. A Bayesian method for comparing and combining binary classifiers in the absence of a gold standard

    Directory of Open Access Journals (Sweden)

    Keith Jonathan M

    2012-07-01

    Full Text Available Abstract Background: Many problems in bioinformatics involve classification based on features such as sequence, structure or morphology. Given multiple classifiers, two crucial questions arise: how does their performance compare, and how can they best be combined to produce a better classifier? A classifier can be evaluated in terms of sensitivity and specificity using benchmark, or gold standard, data, that is, data for which the true classification is known. However, a gold standard is not always available. Here we demonstrate that a Bayesian model for comparing medical diagnostics without a gold standard can be successfully applied in the bioinformatics domain, to genomic scale data sets. We present a new implementation, which unlike previous implementations is applicable to any number of classifiers. We apply this model, for the first time, to the problem of finding the globally optimal logical combination of classifiers. Results: We compared three classifiers of protein subcellular localisation, and evaluated our estimates of sensitivity and specificity against estimates obtained using a gold standard. The method overestimated sensitivity and specificity with only a small discrepancy, and correctly ranked the classifiers. Diagnostic tests for swine flu were then compared on a small data set. Lastly, classifiers for a genome-wide association study of macular degeneration with 541094 SNPs were analysed. In all cases, run times were feasible, and results precise. The optimal logical combination of classifiers was also determined for all three data sets. Code and data are available from http://bioinformatics.monash.edu.au/downloads/. Conclusions: The examples demonstrate the methods are suitable for both small and large data sets, applicable to the wide range of bioinformatics classification problems, and robust to dependence between classifiers.
In all three test cases, the globally optimal logical combination of the classifiers was found to be

  11. 75 FR 37253 - Classified National Security Information

    Science.gov (United States)

    2010-06-28

    ... ``Secret.'' (3) Each interior page of a classified document shall be marked at the top and bottom either... ``(TS)'' for Top Secret, ``(S)'' for Secret, and ``(C)'' for Confidential will be used. (2) Portions... from the informational text. (1) Conspicuously place the overall classification at the top and bottom...

  12. 75 FR 707 - Classified National Security Information

    Science.gov (United States)

    2010-01-05

    ... classified at one of the following three levels: (1) ``Top Secret'' shall be applied to information, the... exercise this authority. (2) ``Top Secret'' original classification authority may be delegated only by the... official has been delegated ``Top Secret'' original classification authority by the agency head. (4) Each...

  13. Neural Network Classifier Based on Growing Hyperspheres

    Czech Academy of Sciences Publication Activity Database

    Jiřina Jr., Marcel; Jiřina, Marcel

    2000-01-01

    Vol. 10, No. 3 (2000), pp. 417-428 ISSN 1210-0552. [Neural Network World 2000. Prague, 09.07.2000-12.07.2000] Grant - others: MŠMT ČR(CZ) VS96047; MPO(CZ) RP-4210 Institutional research plan: AV0Z1030915 Keywords: neural network * classifier * hyperspheres * big-dimensional data Subject RIV: BA - General Mathematics

  14. Histogram deconvolution - An aid to automated classifiers

    Science.gov (United States)

    Lorre, J. J.

    1983-01-01

    It is shown that N-dimensional histograms are convolved by the addition of noise in the picture domain. Three methods are described which provide the ability to deconvolve such noise-affected histograms. The purpose of the deconvolution is to provide automated classifiers with a higher quality N-dimensional histogram from which to obtain classification statistics.

  15. Classifying web pages with visual features

    NARCIS (Netherlands)

    de Boer, V.; van Someren, M.; Lupascu, T.; Filipe, J.; Cordeiro, J.

    2010-01-01

    To automatically classify and process web pages, current systems use the textual content of those pages, including both the displayed content and the underlying (HTML) code. However, a very important feature of a web page is its visual appearance. In this paper, we show that using generic visual

  16. Classifying features in CT imagery: accuracy for some single- and multiple-species classifiers

    Science.gov (United States)

    Daniel L. Schmoldt; Jing He; A. Lynn Abbott

    1998-01-01

    Our current approach to automatically label features in CT images of hardwood logs classifies each pixel of an image individually. These feature classifiers use a back-propagation artificial neural network (ANN) and feature vectors that include a small, local neighborhood of pixels and the distance of the target pixel to the center of the log. Initially, this type of...

  17. Disassembly and Sanitization of Classified Matter

    International Nuclear Information System (INIS)

    Stockham, Dwight J.; Saad, Max P.

    2008-01-01

    The Disassembly Sanitization Operation (DSO) process was implemented to support weapon disassembly and disposition by using recycling and waste minimization measures. This process was initiated by treaty agreements and reconfigurations within both the DOD and DOE Complexes. The DOE is faced with disassembling and disposing of a huge inventory of retired weapons, components, training equipment, spare parts, weapon maintenance equipment, and associated material. In addition, regulations have caused a dramatic increase in the need for information required to support the handling and disposition of these parts and materials. In the past, huge inventories of classified weapon components were required to have long-term storage at Sandia and at many other locations throughout the DOE Complex. These materials are placed in onsite storage units because of classification issues, and they may also contain radiological and/or hazardous components. Since no disposal options existed for this material, the only choice was long-term storage. Long-term storage is costly and somewhat problematic, requiring a secured storage area, monitoring, and auditing, and presenting the potential for loss or theft of the material. The DSO process has enabled 70 to 80% of the components sent through it to be recycled. These components are made of high quality materials, and once this material has been sanitized, the demand for the component metals for recycling efforts is very high. The DSO process for NGPF, classified components established the credibility of this technique for addressing the long-term storage requirements of the classified weapons component inventory. The success of this application has generated interest from other Sandia organizations and other locations throughout the complex. Other organizations are requesting the help of the DSO team, and the DSO is responding to these requests by expanding its scope to include Work-for-Other projects.
For example

  18. Comparing cosmic web classifiers using information theory

    International Nuclear Information System (INIS)

    Leclercq, Florent; Lavaux, Guilhem; Wandelt, Benjamin; Jasche, Jens

    2016-01-01

    We introduce a decision scheme for optimally choosing a classifier, which segments the cosmic web into different structure types (voids, sheets, filaments, and clusters). Our framework, based on information theory, accounts for the design aims of different classes of possible applications: (i) parameter inference, (ii) model selection, and (iii) prediction of new observations. As an illustration, we use cosmographic maps of web-types in the Sloan Digital Sky Survey to assess the relative performance of the classifiers T-WEB, DIVA and ORIGAMI for: (i) analyzing the morphology of the cosmic web, (ii) discriminating dark energy models, and (iii) predicting galaxy colors. Our study substantiates a data-supported connection between cosmic web analysis and information theory, and paves the path towards principled design of analysis procedures for the next generation of galaxy surveys. We have made the cosmic web maps, galaxy catalog, and analysis scripts used in this work publicly available.

  19. Design of Robust Neural Network Classifiers

    DEFF Research Database (Denmark)

    Larsen, Jan; Andersen, Lars Nonboe; Hintz-Madsen, Mads

    1998-01-01

    This paper addresses a new framework for designing robust neural network classifiers. The network is optimized using the maximum a posteriori technique, i.e., the cost function is the sum of the log-likelihood and a regularization term (prior). In order to perform robust classification, we present...... a modified likelihood function which incorporates the potential risk of outliers in the data. This leads to the introduction of a new parameter, the outlier probability. Designing the neural classifier involves optimization of network weights as well as outlier probability and regularization parameters. We...... suggest to adapt the outlier probability and regularisation parameters by minimizing the error on a validation set, and a simple gradient descent scheme is derived. In addition, the framework allows for constructing a simple outlier detector. Experiments with artificial data demonstrate the potential...

  20. Comparing cosmic web classifiers using information theory

    Energy Technology Data Exchange (ETDEWEB)

    Leclercq, Florent [Institute of Cosmology and Gravitation (ICG), University of Portsmouth, Dennis Sciama Building, Burnaby Road, Portsmouth PO1 3FX (United Kingdom); Lavaux, Guilhem; Wandelt, Benjamin [Institut d' Astrophysique de Paris (IAP), UMR 7095, CNRS – UPMC Université Paris 6, Sorbonne Universités, 98bis boulevard Arago, F-75014 Paris (France); Jasche, Jens, E-mail: florent.leclercq@polytechnique.org, E-mail: lavaux@iap.fr, E-mail: j.jasche@tum.de, E-mail: wandelt@iap.fr [Excellence Cluster Universe, Technische Universität München, Boltzmannstrasse 2, D-85748 Garching (Germany)

    2016-08-01

    We introduce a decision scheme for optimally choosing a classifier, which segments the cosmic web into different structure types (voids, sheets, filaments, and clusters). Our framework, based on information theory, accounts for the design aims of different classes of possible applications: (i) parameter inference, (ii) model selection, and (iii) prediction of new observations. As an illustration, we use cosmographic maps of web-types in the Sloan Digital Sky Survey to assess the relative performance of the classifiers T-WEB, DIVA and ORIGAMI for: (i) analyzing the morphology of the cosmic web, (ii) discriminating dark energy models, and (iii) predicting galaxy colors. Our study substantiates a data-supported connection between cosmic web analysis and information theory, and paves the path towards principled design of analysis procedures for the next generation of galaxy surveys. We have made the cosmic web maps, galaxy catalog, and analysis scripts used in this work publicly available.

  1. Detection of Fundus Lesions Using Classifier Selection

    Science.gov (United States)

    Nagayoshi, Hiroto; Hiramatsu, Yoshitaka; Sako, Hiroshi; Himaga, Mitsutoshi; Kato, Satoshi

    A system for detecting fundus lesions caused by diabetic retinopathy from fundus images is being developed. The system can screen the images in advance in order to reduce the inspection workload on doctors. One of the difficulties that must be addressed in completing this system is how to remove false positives (which tend to arise near blood vessels) without decreasing the detection rate of lesions in other areas. To overcome this difficulty, we developed classifier selection according to the position of a candidate lesion, and we introduced new features that can distinguish true lesions from false positives. A system incorporating classifier selection and these new features was tested in experiments using 55 fundus images with some lesions and 223 images without lesions. The results of the experiments confirm the effectiveness of the proposed system, namely, degrees of sensitivity and specificity of 98% and 81%, respectively.
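
    For reference, sensitivity and specificity follow directly from the confusion counts of a screening test. The sketch below uses hypothetical counts chosen only to match the test-set sizes mentioned above (55 images with lesions, 223 without); the abstract reports the rates, not the individual counts, so these numbers are an assumption.

```python
def screening_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, precision and accuracy from confusion counts."""
    return {
        "sensitivity": tp / (tp + fn),   # fraction of lesion images detected
        "specificity": tn / (tn + fp),   # fraction of lesion-free images passed
        "precision": tp / (tp + fp),     # fraction of alarms that are true lesions
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical counts: 55 images with lesions, 223 without (illustrative only).
metrics = screening_metrics(tp=54, fn=1, tn=181, fp=42)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

    With a test set this imbalanced, a high sensitivity can coexist with modest precision, which is why both rates are usually reported together for screening systems.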

  2. Learning for VMM + WTA Embedded Classifiers

    Science.gov (United States)

    2016-03-31

    Learning for VMM + WTA Embedded Classifiers Jennifer Hasler and Sahil Shah Electrical and Computer Engineering Georgia Institute of Technology...enabling correct classification of each novel acoustic signal (generator, idle car, and idle truck ). The classification structure requires, after...measured on our SoC FPAA IC. The test input is composed of signals from urban environment for 3 objects (generator, idle car, and idle truck

  3. Bayes classifiers for imbalanced traffic accidents datasets.

    Science.gov (United States)

    Mujalli, Randa Oqab; López, Griselda; Garach, Laura

    2016-03-01

    Traffic accident data sets are usually imbalanced: the number of instances classified under the killed or severe injuries class (minority) is much lower than the number classified under the slight injuries class (majority). This poses a challenging problem for classification algorithms and may yield a model that covers the slight injuries instances well while frequently misclassifying the killed or severe injuries instances. Based on traffic accident data collected on urban and suburban roads in Jordan over three years (2009-2011), three different data balancing techniques were used: under-sampling, which removes some instances of the majority class; oversampling, which creates new instances of the minority class; and a mixed technique that combines both. In addition, different Bayes classifiers were compared on the imbalanced and balanced data sets (Averaged One-Dependence Estimators, Weightily Average One-Dependence Estimators, and Bayesian networks) in order to identify factors that affect the severity of an accident. The results indicated that using the balanced data sets, especially those created using oversampling techniques, with Bayesian networks improved the classification of a traffic accident according to its severity and reduced the misclassification of killed and severe injuries instances. In addition, the following variables were found to contribute to the occurrence of a fatality or a severe injury in a traffic accident: number of vehicles involved, accident pattern, number of directions, accident type, lighting, surface condition, and speed limit. This work, to the knowledge of the authors, is the first that aims at analyzing historical data records for traffic accidents occurring in Jordan and the first to apply balancing techniques to analyze injury severity of traffic accidents. Copyright © 2015 Elsevier Ltd. All rights reserved.
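
    The two basic balancing techniques described above can be sketched with the standard library alone. This is a minimal sketch, assuming the simplest variants: random under-sampling of larger classes and random duplication of minority instances. The study may have used other oversampling methods that synthesise new instances; the function names and the 90/10 toy data are hypothetical.

```python
import random


def split_by_class(rows, label_of):
    """Group rows by their class label."""
    groups = {}
    for row in rows:
        groups.setdefault(label_of(row), []).append(row)
    return groups


def undersample(rows, label_of, rng):
    """Randomly keep only as many rows per class as the smallest class has."""
    groups = split_by_class(rows, label_of)
    n = min(len(g) for g in groups.values())
    return [row for g in groups.values() for row in rng.sample(g, n)]


def oversample(rows, label_of, rng):
    """Randomly duplicate rows of smaller classes up to the largest class size."""
    groups = split_by_class(rows, label_of)
    n = max(len(g) for g in groups.values())
    out = []
    for g in groups.values():
        out.extend(g)
        out.extend(rng.choices(g, k=n - len(g)))  # sampling with replacement
    return out


# Toy imbalanced data: 90 "slight" records and 10 "severe" records.
rows = [("slight", i) for i in range(90)] + [("severe", i) for i in range(10)]
rng = random.Random(0)  # fixed seed for reproducibility


def label(row):
    return row[0]


under = undersample(rows, label, rng)
over = oversample(rows, label, rng)
print(len(under), len(over))
```

    Balancing should be applied to the training split only; resampling before splitting lets duplicated minority records leak into the test set and inflates the measured performance.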

  4. A Bayesian classifier for symbol recognition

    OpenAIRE

    Barrat , Sabine; Tabbone , Salvatore; Nourrissier , Patrick

    2007-01-01

    URL : http://www.buyans.com/POL/UploadedFile/134_9977.pdf; International audience; We present in this paper an original adaptation of Bayesian networks to the symbol recognition problem. More precisely, we present a descriptor combination method that significantly improves the recognition rate compared to the recognition rates obtained by each descriptor alone. In this perspective, we use a simple Bayesian classifier, called naive Bayes. In fact, probabilistic graphical models, more spec...

  5. SVM classifier on chip for melanoma detection.

    Science.gov (United States)

    Afifi, Shereen; GholamHosseini, Hamid; Sinha, Roopak

    2017-07-01

    Support Vector Machine (SVM) is a common classifier used for efficient classification with high accuracy. SVM shows high accuracy for classifying melanoma (skin cancer) clinical images within computer-aided diagnosis systems used by skin cancer specialists to detect melanoma early and save lives. We aim to develop a medical low-cost handheld device that runs a real-time embedded SVM-based diagnosis system for use in primary care for early detection of melanoma. In this paper, an optimized SVM classifier is implemented onto a recent FPGA platform using the latest design methodology to be embedded into the proposed device for realizing online efficient melanoma detection on a single system on chip/device. The hardware implementation results demonstrate a high classification accuracy of 97.9% and a significant acceleration factor of 26 from equivalent software implementation on an embedded processor, with 34% of resources utilization and 2 watts for power consumption. Consequently, the implemented system meets crucial embedded systems constraints of high performance and low cost, resources utilization and power consumption, while achieving high classification accuracy.

  6. Autoimmune encephalitis with anti-leucine-rich glioma-inactivated 1 or anti-contactin-associated protein-like 2 antibodies (formerly called voltage-gated potassium channel-complex antibodies).

    Science.gov (United States)

    Bastiaansen, Anna E M; van Sonderen, Agnes; Titulaer, Maarten J

    2017-06-01

    Twenty years after the discovery of voltage-gated potassium channel (VGKC)-related autoimmunity, it is now known that the antibodies are directed not at the VGKC itself but at two closely associated proteins, leucine-rich glioma-inactivated 1 (LGI1) and contactin-associated protein-like 2 (Caspr2). Antibodies to LGI1 and Caspr2 give well-described clinical phenotypes. Anti-LGI1 encephalitis patients mostly have limbic symptoms, and anti-Caspr2 patients have variable syndromes with both central and peripheral symptoms. A large group of patients with heterogeneous symptoms are VGKC positive but do not have antibodies against LGI1 or Caspr2. The clinical relevance of VGKC positivity in these 'double-negative' patients is questionable. This review focusses on these three essentially different subgroups. The clinical phenotypes of anti-LGI1 encephalitis and anti-Caspr2 encephalitis have been described in more detail, including data on treatment and long-term follow-up. A specific human leukocyte antigen (HLA) association was found in nontumor anti-LGI1 encephalitis, but not clearly in those with tumors. There has been increasing interest in the VGKC-positive patients without LGI1/Caspr2 antibodies, questioning the relevance of this finding in clinical practice. Anti-LGI1 encephalitis and anti-Caspr2 encephalitis are separate clinical entities. Early recognition and treatment are necessary and rewarding. The term VGKC-complex antibodies, lumping patients with anti-LGI1, anti-Caspr2 antibodies or lacking both, should be considered obsolete.

  7. Robust Framework to Combine Diverse Classifiers Assigning Distributed Confidence to Individual Classifiers at Class Level

    Directory of Open Access Journals (Sweden)

    Shehzad Khalid

    2014-01-01

    Full Text Available We have presented a classification framework that combines multiple heterogeneous classifiers in the presence of class label noise. An extension of m-Mediods based modeling is presented that generates models of the various classes whilst identifying and filtering noisy training data. This noise-free data is further used to learn models for other classifiers such as GMM and SVM. A weight learning method is then introduced to learn weights on each class for the different classifiers to construct an ensemble. For this purpose, we applied a genetic algorithm to search for an optimal weight vector on which the classifier ensemble is expected to give the best accuracy. The proposed approach is evaluated on a variety of real-life datasets. It is also compared with existing standard ensemble techniques such as Adaboost, Bagging, and Random Subspace Methods. Experimental results show the superiority of the proposed ensemble method over its competitors, especially in the presence of class label noise and imbalanced classes.

  8. The Protection of Classified Information: The Legal Framework

    National Research Council Canada - National Science Library

    Elsea, Jennifer K

    2006-01-01

    Recent incidents involving leaks of classified information have heightened interest in the legal framework that governs security classification, access to classified information, and penalties for improper disclosure...

  9. Predicting protein subcellular locations using hierarchical ensemble of Bayesian classifiers based on Markov chains

    Directory of Open Access Journals (Sweden)

    Eils Roland

    2006-06-01

    Full Text Available Abstract Background: The subcellular location of a protein is closely related to its function. It would be worthwhile to develop a method to predict the subcellular location for a given protein when only the amino acid sequence of the protein is known. Although many efforts have been made to predict subcellular location from sequence information only, there is a need for further research to improve the accuracy of prediction. Results: A novel method called HensBC is introduced to predict protein subcellular location. HensBC is a recursive algorithm which constructs a hierarchical ensemble of classifiers. The classifiers used are Bayesian classifiers based on Markov chain models. We tested our method on six different datasets, among them a Gram-negative bacteria dataset, data for discriminating outer membrane proteins, and an apoptosis proteins dataset. We observed that our method can predict the subcellular location with high accuracy. Another advantage of the proposed method is that it can improve the accuracy of the prediction of some classes with few sequences in training and is therefore useful for datasets with an imbalanced distribution of classes. Conclusion: This study introduces an algorithm which uses only the primary sequence of a protein to predict its subcellular location. The proposed recursive scheme represents an interesting methodology for learning and combining classifiers. The method is computationally efficient and competitive with the previously reported approaches in terms of prediction accuracies, as empirical results indicate. The code for the software is available upon request.
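
    The base learner named in this abstract, a Bayesian classifier built on Markov chain models of amino acid sequences, can be sketched as follows. This is a minimal sketch, assuming a first-order chain per class with add-one smoothing; it is not the HensBC algorithm, which additionally builds a recursive hierarchical ensemble. The class name, the toy sequences, and the labels "membrane" and "soluble" are hypothetical.

```python
import math
from collections import defaultdict

AMINO = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


class MarkovChainClassifier:
    """One first-order Markov chain per class; predict the class with the
    highest log posterior (log prior plus sequence log-likelihood)."""

    def __init__(self, pseudocount=1.0):
        self.pseudocount = pseudocount
        self.models = {}  # label -> (log_prior, log_initial, log_transition)

    def fit(self, sequences, labels):
        by_class = defaultdict(list)
        for seq, label in zip(sequences, labels):
            by_class[label].append(seq)
        for label, seqs in by_class.items():
            # Start every count at the pseudocount so unseen events keep
            # a non-zero probability.
            init = {a: self.pseudocount for a in AMINO}
            trans = {a: {b: self.pseudocount for b in AMINO} for a in AMINO}
            for seq in seqs:
                init[seq[0]] += 1
                for a, b in zip(seq, seq[1:]):
                    trans[a][b] += 1
            init_total = sum(init.values())
            log_init = {a: math.log(c / init_total) for a, c in init.items()}
            log_trans = {}
            for a, row in trans.items():
                row_total = sum(row.values())
                log_trans[a] = {b: math.log(c / row_total) for b, c in row.items()}
            log_prior = math.log(len(seqs) / len(sequences))
            self.models[label] = (log_prior, log_init, log_trans)
        return self

    def log_posterior(self, seq, label):
        log_prior, log_init, log_trans = self.models[label]
        score = log_prior + log_init[seq[0]]
        for a, b in zip(seq, seq[1:]):
            score += log_trans[a][b]
        return score

    def predict(self, seq):
        return max(self.models, key=lambda label: self.log_posterior(seq, label))


# Toy training data (hypothetical): hydrophobic versus charged peptides.
train = ["LLVVLLIV", "VLLIVLLV", "ILVLLVIL", "KDEKDEKD", "DEKKDEDK", "EKDEKDKE"]
labels = ["membrane"] * 3 + ["soluble"] * 3
clf = MarkovChainClassifier().fit(train, labels)
print(clf.predict("LVLLIVLV"), clf.predict("KDEDKEKD"))
```

    The sketch assumes non-empty sequences over the 20 standard residues; ambiguous codes such as X or B would need an extra state or a preprocessing step. Higher-order chains follow the same pattern with longer contexts as transition keys.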

  10. Classifying smoking urges via machine learning.

    Science.gov (United States)

    Dumortier, Antoine; Beckjord, Ellen; Shiffman, Saul; Sejdić, Ervin

    2016-12-01

    Smoking is the largest preventable cause of death and diseases in the developed world, and advances in modern electronics and machine learning can help us deliver real-time intervention to smokers in novel ways. In this paper, we examine different machine learning approaches to use situational features associated with having or not having urges to smoke during a quit attempt in order to accurately classify high-urge states. To test our machine learning approaches, specifically, Bayes, discriminant analysis and decision tree learning methods, we used a dataset collected from over 300 participants who had initiated a quit attempt. The three classification approaches are evaluated observing sensitivity, specificity, accuracy and precision. The outcome of the analysis showed that algorithms based on feature selection make it possible to obtain high classification rates with only a few features selected from the entire dataset. The classification tree method outperformed the naive Bayes and discriminant analysis methods, with an accuracy of the classifications up to 86%. These numbers suggest that machine learning may be a suitable approach to deal with smoking cessation matters, and to predict smoking urges, outlining a potential use for mobile health applications. In conclusion, machine learning classifiers can help identify smoking situations, and the search for the best features and classifier parameters significantly improves the algorithms' performance. In addition, this study also supports the usefulness of new technologies in improving the effect of smoking cessation interventions, the management of time and patients by therapists, and thus the optimization of available health care resources. Future studies should focus on providing more adaptive and personalized support to people who really need it, in a minimum amount of time by developing novel expert systems capable of delivering real-time interventions. Copyright © 2016 Elsevier Ireland Ltd. All rights

  11. Classifying spaces of degenerating polarized Hodge structures

    CERN Document Server

    Kato, Kazuya

    2009-01-01

    In 1970, Phillip Griffiths envisioned that points at infinity could be added to the classifying space D of polarized Hodge structures. In this book, Kazuya Kato and Sampei Usui realize this dream by creating a logarithmic Hodge theory. They use the logarithmic structures begun by Fontaine-Illusie to revive nilpotent orbits as a logarithmic Hodge structure. The book focuses on two principal topics. First, Kato and Usui construct the fine moduli space of polarized logarithmic Hodge structures with additional structures. Even for a Hermitian symmetric domain D, the present theory is a refinem

  12. Gearbox Condition Monitoring Using Advanced Classifiers

    Directory of Open Access Journals (Sweden)

    P. Večeř

    2010-01-01

    Full Text Available New efficient and reliable methods for gearbox diagnostics are needed in the automotive industry because of growing demand for production quality. This paper presents the application of two different classifiers for gearbox diagnostics – Kohonen Neural Networks and the Adaptive-Network-based Fuzzy Inference System (ANFIS). Two different practical applications are presented. In the first application, the tested gearboxes are separated into two classes according to their condition indicators. In the second example, ANFIS is applied to label the tested gearboxes with a Quality Index according to the condition indicators. In both applications, the condition indicators were computed from the vibration of the gearbox housing.

  13. Cubical sets as a classifying topos

    DEFF Research Database (Denmark)

    Spitters, Bas

    Coquand’s cubical set model for homotopy type theory provides the basis for a computational interpretation of the univalence axiom and some higher inductive types, as implemented in the cubical proof assistant. We show that the underlying cube category is the opposite of the Lawvere theory of De...... Morgan algebras. The topos of cubical sets itself classifies the theory of ‘free De Morgan algebras’. This provides us with a topos with an internal ‘interval’. Using this interval we construct a model of type theory following van den Berg and Garner. We are currently investigating the precise relation...

  14. Double Ramp Loss Based Reject Option Classifier

    Science.gov (United States)

    2015-05-22

    of convex (DC) functions. To minimize it, we use DC programming approach [1]. The proposed method has following advantages: (1) the proposed loss LDR ...space constraints. We see that LDR does not put any restriction on ρ for it to be an upper bound of L0−d−1. 2.2 Risk Formulation Using LDR Let S = {(xn...classifier learnt using LDR based approach (C = 100, μ = 1, d = .2). Filled circles and triangles represent the support vectors. 4 Experimental Results We show

  15. A systematic comparison of supervised classifiers.

    Directory of Open Access Journals (Sweden)

    Diego Raphael Amancio

    Full Text Available Pattern recognition has been employed in a myriad of industrial, commercial and academic applications. Many techniques have been devised to tackle such a diversity of applications. Despite the long tradition of pattern recognition research, there is no technique that yields the best classification in all scenarios. Therefore, as many techniques as possible should be considered in high accuracy applications. Typical related works either focus on the performance of a given algorithm or compare various classification methods. On many occasions, however, researchers who are not experts in the field of machine learning have to deal with practical classification tasks without an in-depth knowledge about the underlying parameters. Actually, the adequate choice of classifiers and parameters in such practical circumstances constitutes a long-standing problem and is one of the subjects of the current paper. We carried out a performance study of nine well-known classifiers implemented in the Weka framework and compared the influence of the parameter configurations on the accuracy. The default configuration of parameters in Weka was found to provide near optimal performance for most cases, not including methods such as the support vector machine (SVM). In addition, the k-nearest neighbor method frequently allowed the best accuracy. In certain conditions, it was possible to improve the quality of SVM by more than 20% with respect to their default parameter configuration.

  16. STATISTICAL TOOLS FOR CLASSIFYING GALAXY GROUP DYNAMICS

    International Nuclear Information System (INIS)

    Hou, Annie; Parker, Laura C.; Harris, William E.; Wilman, David J.

    2009-01-01

    The dynamical state of galaxy groups at intermediate redshifts can provide information about the growth of structure in the universe. We examine three goodness-of-fit tests, the Anderson-Darling (A-D), Kolmogorov, and χ² tests, in order to determine which statistical tool is best able to distinguish between groups that are relaxed and those that are dynamically complex. We perform Monte Carlo simulations of these three tests and show that the χ² test is profoundly unreliable for groups with fewer than 30 members. Power studies of the Kolmogorov and A-D tests are conducted to test their robustness for various sample sizes. We then apply these tests to a sample of the second Canadian Network for Observational Cosmology Redshift Survey (CNOC2) galaxy groups and find that the A-D test is far more reliable and powerful at detecting real departures from an underlying Gaussian distribution than the more commonly used χ² and Kolmogorov tests. We use this statistic to classify a sample of the CNOC2 groups and find that 34 of 106 groups are inconsistent with an underlying Gaussian velocity distribution, and thus do not appear relaxed. In addition, we compute velocity dispersion profiles (VDPs) for all groups with more than 20 members and compare the overall features of the Gaussian and non-Gaussian groups, finding that the VDPs of the non-Gaussian groups are distinct from those classified as Gaussian.
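
    To make the A-D test concrete, the sketch below computes the standard Anderson-Darling statistic A² against a Gaussian whose mean and standard deviation are estimated from the sample. It is an illustration only: the two samples are synthetic, the function name is an assumption, and the critical values and small-sample corrections that a real analysis would need are omitted.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling_normal(sample):
    """A^2 statistic against a Gaussian fitted to the sample itself."""
    n = len(sample)
    fitted = NormalDist(mean(sample), stdev(sample))
    cdf = [fitted.cdf(x) for x in sorted(sample)]
    total = 0.0
    for i in range(1, n + 1):
        # Pair the i-th smallest value with the i-th largest value.
        total += (2 * i - 1) * (math.log(cdf[i - 1]) + math.log(1.0 - cdf[n - i]))
    return -n - total / n


# Synthetic "velocity offsets": one sample placed on Gaussian quantiles,
# one split into two well-separated clumps (a dynamically complex toy case).
gaussian_like = [NormalDist().inv_cdf((i + 0.5) / 40) for i in range(40)]
two_clumps = [-5.0] * 20 + [5.0] * 20

a2_gauss = anderson_darling_normal(gaussian_like)
a2_clumps = anderson_darling_normal(two_clumps)
print(a2_gauss, a2_clumps)
```

    Larger values of A² indicate a stronger departure from the fitted Gaussian; the statistic weights the tails of the distribution more heavily than the Kolmogorov statistic does.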

  17. Mercury⊕: An evidential reasoning image classifier

    Science.gov (United States)

    Peddle, Derek R.

    1995-12-01

    MERCURY⊕ is a multisource evidential reasoning classification software system based on the Dempster-Shafer theory of evidence. The design and implementation of this software package are described for improving the classification and analysis of multisource digital image data necessary for addressing advanced environmental and geoscience applications. In the remote-sensing context, the approach provides a more appropriate framework for classifying modern, multisource, and ancillary data sets which may contain a large number of disparate variables with different statistical properties, scales of measurement, and levels of error which cannot be handled using conventional Bayesian approaches. The software uses a nonparametric, supervised approach to classification, and provides a more objective and flexible interface to the evidential reasoning framework using a frequency-based method for computing support values from training data. The MERCURY⊕ software package has been implemented efficiently in the C programming language, with extensive use made of dynamic memory allocation procedures and compound linked list and hash-table data structures to optimize the storage and retrieval of evidence in a Knowledge Look-up Table. The software is complete with a full user interface and runs under Unix, Ultrix, VAX/VMS, MS-DOS, and Apple Macintosh operating systems. An example of classifying alpine land cover and permafrost active layer depth in northern Canada is presented to illustrate the use and application of these ideas.
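
    The core operation of Dempster-Shafer evidential reasoning is Dempster's rule of combination, which fuses mass functions from independent sources. The sketch below shows the generic rule in Python. It is not the MERCURY⊕ implementation (which is written in C), and the land-cover classes and mass values are made up for illustration.

```python
def combine(m1, m2):
    """Dempster's rule: fuse two mass functions keyed by frozenset hypotheses."""
    combined = {}
    conflict = 0.0
    for a, weight_a in m1.items():
        for b, weight_b in m2.items():
            overlap = a & b
            if overlap:
                combined[overlap] = combined.get(overlap, 0.0) + weight_a * weight_b
            else:
                # Mass assigned to contradictory hypotheses.
                conflict += weight_a * weight_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalize so the remaining masses sum to one.
    return {h: w / (1.0 - conflict) for h, w in combined.items()}


# Hypothetical frame of discernment and evidence from two data sources.
FRAME = frozenset({"forest", "tundra", "rock"})
spectral = {frozenset({"forest"}): 0.6, FRAME: 0.4}
elevation = {frozenset({"tundra"}): 0.5, FRAME: 0.5}

fused = combine(spectral, elevation)
for hypothesis, mass in fused.items():
    print(sorted(hypothesis), round(mass, 4))
```

    Mass left on the whole frame represents ignorance, which is what lets a source abstain rather than spread probability evenly over the classes.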

  18. 36 CFR 1256.46 - National security-classified information.

    Science.gov (United States)

    2010-07-01

    ... 36 Parks, Forests, and Public Property 3 2010-07-01 2010-07-01 false National security-classified... Restrictions § 1256.46 National security-classified information. In accordance with 5 U.S.C. 552(b)(1), NARA... properly classified under the provisions of the pertinent Executive Order on Classified National Security...

  19. Two channel EEG thought pattern classifier.

    Science.gov (United States)

    Craig, D A; Nguyen, H T; Burchey, H A

    2006-01-01

    This paper presents a real-time electro-encephalogram (EEG) identification system with the goal of achieving hands free control. With two EEG electrodes placed on the scalp of the user, EEG signals are amplified and digitised directly using a ProComp+ encoder and transferred to the host computer through the RS232 interface. Using a real-time multilayer neural network, the actual classification for the control of a powered wheelchair has a very fast response. It can detect changes in the user's thought pattern in 1 second. Using only two EEG electrodes at positions O(1) and C(4), the system can classify three mental commands (forward, left and right) with an accuracy of more than 79%.

  20. Classifying Drivers' Cognitive Load Using EEG Signals.

    Science.gov (United States)

    Barua, Shaibal; Ahmed, Mobyen Uddin; Begum, Shahina

    2017-01-01

    A growing traffic safety issue is the effect of cognitive loading activities on traffic safety and driving performance. To monitor drivers' mental state, understanding cognitive load is important, since performing cognitively loading secondary tasks while driving, for example talking on the phone, can affect performance in the primary task, i.e. driving. Electroencephalography (EEG) is one of the reliable measures of cognitive load and can detect changes in instantaneous load and the effect of a cognitively loading secondary task. In this driving simulator study, a 1-back task is carried out while the driver performs three different simulated driving scenarios. This paper presents an EEG based approach to classify drivers' level of cognitive load using Case-Based Reasoning (CBR). The results show that, for each individual scenario as well as for data combined from the different scenarios, the CBR based system achieved a classification accuracy of roughly 70% or more.

  1. Classifying prion and prion-like phenomena.

    Science.gov (United States)

    Harbi, Djamel; Harrison, Paul M

    2014-01-01

    The universe of prion and prion-like phenomena has expanded significantly in the past several years. Here, we overview the challenges in classifying this data informatically, given that terms such as "prion-like", "prion-related" or "prion-forming" do not have a stable meaning in the scientific literature. We examine the spectrum of proteins that have been described in the literature as forming prions, and discuss how "prion" can have a range of meaning, with a strict definition being for demonstration of infection with in vitro-derived recombinant prions. We suggest that although prion/prion-like phenomena can largely be apportioned into a small number of broad groups dependent on the type of transmissibility evidence for them, as new phenomena are discovered in the coming years, a detailed ontological approach might be necessary that allows for subtle definition of different "flavors" of prion / prion-like phenomena.

  2. Hybrid Neuro-Fuzzy Classifier Based On Nefclass Model

    Directory of Open Access Journals (Sweden)

    Bogdan Gliwa

    2011-01-01

    Full Text Available The paper presents a hybrid neuro-fuzzy classifier based on a modified NEFCLASS model. The presented classifier was compared to popular classifiers: neural networks and k-nearest neighbours. The efficiency of the modifications was compared with the learning methods used in the original NEFCLASS model. The accuracy of the classifier was tested using 3 datasets from the UCI Machine Learning Repository: iris, wine and breast cancer wisconsin. Moreover, the influence of ensemble classification methods on classification accuracy was presented.

  3. Classifying Transition Behaviour in Postural Activity Monitoring

    Directory of Open Access Journals (Sweden)

    James BRUSEY

    2009-10-01

    Full Text Available A few accelerometers positioned on different parts of the body can be used to accurately classify steady state behaviour, such as walking, running, or sitting. Such systems are usually built using supervised learning approaches. Transitions between postures are, however, difficult to deal with using posture classification systems proposed to date, since there is no label set for intermediary postures and also the exact point at which the transition occurs can sometimes be hard to pinpoint. The usual bypass when using supervised learning to train such systems is to discard a section of the dataset around each transition. This leads to poorer classification performance when the systems are deployed out of the laboratory and used on-line, particularly if the regimes monitored involve fast paced activity changes. Time-based filtering that takes advantage of sequential patterns is a potential mechanism to improve posture classification accuracy in such real-life applications. Also, such filtering should reduce the number of event messages needed to be sent across a wireless network to track posture remotely, hence extending the system’s life. To support time-based filtering, understanding transitions, which are the major event generators in a classification system, is a key. This work examines three approaches to post-process the output of a posture classifier using time-based filtering: a naïve voting scheme, an exponentially weighted voting scheme, and a Bayes filter. Best performance is obtained from the exponentially weighted voting scheme although it is suspected that a more sophisticated treatment of the Bayes filter might yield better results.
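
    The abstract does not give the details of the exponentially weighted voting scheme, so the sketch below shows one plausible formulation rather than the authors' exact method: every class keeps a score that decays by a constant factor at each time step, and the class reported by the raw classifier receives a vote of one. The decay value, the function names and the toy label stream are assumptions.

```python
def exponentially_weighted_vote(labels, decay=0.7):
    """Smooth a stream of per-sample class labels with decaying votes."""
    scores = {}
    smoothed = []
    for label in labels:
        for key in scores:
            scores[key] *= decay  # older votes count for less
        scores[label] = scores.get(label, 0.0) + 1.0
        smoothed.append(max(scores, key=scores.get))
    return smoothed


def count_transitions(labels):
    """Number of label changes, i.e. event messages a tracker would send."""
    return sum(1 for prev, cur in zip(labels, labels[1:]) if prev != cur)


# Toy classifier output: one spurious "walk" before a genuine transition.
raw = ["sit", "sit", "sit", "walk", "sit", "sit", "walk", "walk", "walk"]
smoothed = exponentially_weighted_vote(raw)

print(smoothed)
print(count_transitions(raw), count_transitions(smoothed))
```

    A decay closer to 1 suppresses more spurious flips but delays the detection of genuine transitions, which is the trade-off such filters have to balance in fast paced activity.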

  4. Just-in-time adaptive classifiers-part II: designing the classifier.

    Science.gov (United States)

    Alippi, Cesare; Roveri, Manuel

    2008-12-01

    Aging effects, environmental changes, thermal drifts, and soft and hard faults affect physical systems by changing their nature and behavior over time. To cope with a process evolution adaptive solutions must be envisaged to track its dynamics; in this direction, adaptive classifiers are generally designed by assuming the stationary hypothesis for the process generating the data with very few results addressing nonstationary environments. This paper proposes a methodology based on k-nearest neighbor (NN) classifiers for designing adaptive classification systems able to react to changing conditions just-in-time (JIT), i.e., exactly when it is needed. k-NN classifiers have been selected for their computational-free training phase, the possibility to easily estimate the model complexity k and keep under control the computational complexity of the classifier through suitable data reduction mechanisms. A JIT classifier requires a temporal detection of a (possible) process deviation (aspect tackled in a companion paper) followed by an adaptive management of the knowledge base (KB) of the classifier to cope with the process change. The novelty of the proposed approach resides in the general framework supporting the real-time update of the KB of the classification system in response to novel information coming from the process both in stationary conditions (accuracy improvement) and in nonstationary ones (process tracking) and in providing a suitable estimate of k. It is shown that the classification system grants consistency once the change targets the process generating the data in a new stationary state, as it is the case in many real applications.

  5. Sequence assembly

    DEFF Research Database (Denmark)

    Scheibye-Alsing, Karsten; Hoffmann, S.; Frankel, Annett Maria

    2009-01-01

    Despite the rapidly increasing number of sequenced and re-sequenced genomes, many issues regarding the computational assembly of large-scale sequencing data remain unresolved. Computational assembly is crucial in large genome projects as well as for the evolving high-throughput technologies and...... in genomic DNA, highly expressed genes and alternative transcripts in EST sequences. We summarize existing comparisons of different assemblers and provide detailed descriptions and directions for download of assembly programs at: http://genome.ku.dk/resources/assembly/methods.html....

  6. Genome Sequencing

    DEFF Research Database (Denmark)

    Sato, Shusei; Andersen, Stig Uggerhøj

    2014-01-01

    The current Lotus japonicus reference genome sequence is based on a hybrid assembly of Sanger TAC/BAC, Sanger shotgun and Illumina shotgun sequencing data generated from the Miyakojima-MG20 accession. It covers nearly all expressed L. japonicus genes and has been annotated mainly based on transcr......

  7. Classifying lipoproteins based on their polar profiles.

    Science.gov (United States)

    Polanco, Carlos; Castañón-González, Jorge Alberto; Buhse, Thomas; Uversky, Vladimir N; Amkie, Rafael Zonana

    2016-01-01

    The lipoproteins are an important group of cargo proteins known for their unique capability to transport lipids. By applying the Polarity index algorithm, which has a metric that only considers the polar profile of the linear sequences of the lipoprotein group, we obtained an analytical and structural differentiation of all the lipoproteins found in UniProt Database. Also, the functional groups of lipoproteins, and particularly of the set of lipoproteins relevant to atherosclerosis, were analyzed with the same method to reveal their structural preference, and the results of Polarity index analysis were verified by an alternate test, the Cumulative Distribution Function algorithm, applied to the same groups of lipoproteins.

  8. Species classifier choice is a key consideration when analysing low-complexity food microbiome data.

    Science.gov (United States)

    Walsh, Aaron M; Crispie, Fiona; O'Sullivan, Orla; Finnegan, Laura; Claesson, Marcus J; Cotter, Paul D

    2018-03-20

    The use of shotgun metagenomics to analyse low-complexity microbial communities in foods has the potential to be of considerable fundamental and applied value. However, there is currently no consensus with respect to choice of species classification tool, platform, or sequencing depth. Here, we benchmarked the performances of three high-throughput short-read sequencing platforms, the Illumina MiSeq, NextSeq 500, and Ion Proton, for shotgun metagenomics of food microbiota. Briefly, we sequenced six kefir DNA samples and a mock community DNA sample, the latter constructed by evenly mixing genomic DNA from 13 food-related bacterial species. A variety of bioinformatic tools were used to analyse the data generated, and the effects of sequencing depth on these analyses were tested by randomly subsampling reads. Compositional analysis results were consistent between the platforms at divergent sequencing depths. However, we observed pronounced differences in the predictions from species classification tools. Indeed, PERMANOVA indicated that there were no significant differences between the compositional results generated by the different sequencers (p = 0.693, R² = 0.011), but there was a significant difference between the results predicted by the species classifiers (p = 0.01, R² = 0.127). The relative abundances predicted by the classifiers, apart from MetaPhlAn2, were apparently biased by reference genome sizes. Additionally, we observed varying false-positive rates among the classifiers. MetaPhlAn2 had the lowest false-positive rate, whereas SLIMM had the greatest false-positive rate. Strain-level analysis results were also similar across platforms. Each platform correctly identified the strains present in the mock community, but accuracy was improved slightly with greater sequencing depth. Notably, PanPhlAn detected the dominant strains in each kefir sample above 500,000 reads per sample. Again, the outputs from functional profiling analysis using

  9. A random forest classifier for detecting rare variants in NGS data from viral populations

    Directory of Open Access Journals (Sweden)

    Raunaq Malhotra

    Full Text Available We propose a random forest classifier for distinguishing rare variants from sequencing errors in Next Generation Sequencing (NGS) data from viral populations. The method utilizes counts of k-mers of varying lengths from the reads of a viral population to train a random forest classifier, called MultiRes, that classifies k-mers as erroneous or rare variants. Our algorithm is rooted in concepts from signal processing and uses a frame-based representation of k-mers. Frames are sets of non-orthogonal basis functions that were traditionally used in signal processing for noise removal. We define discrete spatial signals for genomes and sequenced reads, and show that k-mers of a given size constitute a frame. We evaluate MultiRes on simulated and real viral population datasets, which consist of many low frequency variants, and compare it to the error detection methods used in correction tools known in the literature. MultiRes has 4 to 500 times fewer false positive k-mer predictions compared to other methods, essential for accurate estimation of viral population diversity and their de-novo assembly. It has high recall of the true k-mers, comparable to other error correction methods. MultiRes also has greater than 95% recall for detecting single nucleotide polymorphisms (SNPs) and fewer false positive SNPs, while detecting a higher number of rare variants compared to other variant calling methods for viral populations. The software is available freely from the GitHub link https://github.com/raunaq-m/MultiRes. Keywords: Sequencing error detection, Reference free methods, Next-generation sequencing, Viral populations, Multi-resolution frames, Random forest classifier
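
    The raw input to such a classifier is a table of k-mer counts at several values of k. The sketch below shows only that counting step on made-up reads. The function names are assumptions, and the frame-based representation and the random forest training of MultiRes are not reproduced here.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every length-k substring across all reads."""
    counts = Counter()
    for read in reads:
        for start in range(len(read) - k + 1):
            counts[read[start:start + k]] += 1
    return counts


def multi_k_counts(reads, ks):
    """Map each k to its k-mer count table."""
    return {k: kmer_counts(reads, k) for k in ks}


# Toy reads: the third read differs from the others at one position, so the
# k-mers that span the difference are observed only once.
reads = ["ACGTAC", "ACGTAC", "ACGTTC"]
tables = multi_k_counts(reads, ks=(3, 4))

for k, table in tables.items():
    print(k, sorted(table.items()))
```

    Low counts alone cannot separate a sequencing error from a genuine rare variant, which is why the abstract describes training a classifier on counts at several resolutions instead of applying a single threshold.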

  10. Classifying Adverse Events in the Dental Office.

    Science.gov (United States)

    Kalenderian, Elsbeth; Obadan-Udoh, Enihomo; Maramaldi, Peter; Etolue, Jini; Yansane, Alfa; Stewart, Denice; White, Joel; Vaderhobli, Ram; Kent, Karla; Hebballi, Nutan B; Delattre, Veronique; Kahn, Maria; Tokede, Oluwabunmi; Ramoni, Rachel B; Walji, Muhammad F

    2017-06-30

    Dentists strive to provide safe and effective oral healthcare. However, some patients may encounter an adverse event (AE) defined as "unnecessary harm due to dental treatment." In this research, we propose and evaluate two systems for categorizing the type and severity of AEs encountered at the dental office. Several existing medical AE type and severity classification systems were reviewed and adapted for dentistry. Using data collected in previous work, two initial dental AE type and severity classification systems were developed. Eight independent reviewers performed focused chart reviews, and AEs identified were used to evaluate and modify these newly developed classifications. A total of 958 charts were independently reviewed. Among the reviewed charts, 118 prospective AEs were found and 101 (85.6%) were verified as AEs through a consensus process. At the end of the study, a final AE type classification comprising 12 categories, and an AE severity classification comprising 7 categories emerged. Pain and infection were the most common AE types representing 73% of the cases reviewed (56% and 17%, respectively) and 88% were found to cause temporary, moderate to severe harm to the patient. Adverse events found during the chart review process were successfully classified using the novel dental AE type and severity classifications. Understanding the type of AEs and their severity are important steps if we are to learn from and prevent patient harm in the dental office.

  11. Is it important to classify ischaemic stroke?

    LENUS (Irish Health Repository)

    Iqbal, M

    2012-02-01

    Thirty-five percent of all ischemic events remain classified as cryptogenic. This study was conducted to ascertain the accuracy of diagnosis of ischaemic stroke based on information given in the medical notes. It was tested by applying the clinical information to the (TOAST) criteria. One hundred and five patients presented with acute stroke between January and June 2007. Data was collected on 90 patients. Male to female ratio was 39:51 with age range of 47-93 years. Sixty (67%) patients had total/partial anterior circulation stroke; 5 (5.6%) had a lacunar stroke and in 25 (28%) the mechanism of stroke could not be identified. Four (4.4%) patients with small vessel disease were anticoagulated; 5 (5.6%) with atrial fibrillation received antiplatelet therapy and 2 (2.2%) patients with atrial fibrillation underwent CEA. This study revealed deficiencies in the clinical assessment of patients and treatment was not tailored to the mechanism of stroke in some patients.

  12. Stress fracture development classified by bone scintigraphy

    International Nuclear Information System (INIS)

    Zwas, S.T.; Elkanovich, R.; Frank, G.; Aharonson, Z.

    1985-01-01

    There is no consensus on classifying stress fractures (SF) appearing on bone scans. The authors present a system of classification based on grading the severity and development of bone lesions by visual inspection, according to three main scintigraphic criteria: focality and size, intensity of uptake compared to adjacent bone, and local medullary extension. Four grades of development (I-IV) were ranked, ranging from ill-defined slightly increased cortical uptake to well-defined regions with markedly increased uptake extending transversely bicortically. 310 male subjects aged 19-2, suffering several weeks from leg pains occurring during intensive physical training underwent bone scans of the pelvis and lower extremities using Tc-99m-MDP. 76% of the scans were positive with 354 lesions, of which 88% were in the mild (I-II) grades and 12% in the moderate (III) and severe (IV) grades. Post-treatment scans were obtained in 65 cases having 78 lesions during 1- to 6-month intervals. Complete resolution was found after 1-2 months in 36% of the mild lesions but in only 12% of the moderate and severe ones, and after 3-6 months in 55% of the mild lesions and 15% of the severe ones. 75% of the moderate and severe lesions showed residual uptake in various stages throughout the follow-up period. Early recognition and treatment of mild SF lesions in this study prevented protracted disability and progression of the lesions and facilitated complete healing.

  13. 41 CFR 105-62.102 - Authority to originally classify.

    Science.gov (United States)

    2010-07-01

    ... originally classify. (a) Top secret, secret, and confidential. The authority to originally classify information as Top Secret, Secret, or Confidential may be exercised only by the Administrator and is delegable...

  14. Naive Bayesian classifiers for multinomial features: a theoretical analysis

    CSIR Research Space (South Africa)

    Van Dyk, E

    2007-11-01

    Full Text Available The authors investigate the use of naive Bayesian classifiers for multinomial feature spaces and derive error estimates for these classifiers. The error analysis is done by developing a mathematical model to estimate the probability density...

  15. Ensemble of classifiers based network intrusion detection system performance bound

    CSIR Research Space (South Africa)

    Mkuzangwe, Nenekazi NP

    2017-11-01

    Full Text Available This paper provides a performance bound of a network intrusion detection system (NIDS) that uses an ensemble of classifiers. Currently researchers rely on implementing the ensemble of classifiers based NIDS before they can determine the performance...

  16. Fast Most Similar Neighbor (MSN) classifiers for Mixed Data

    OpenAIRE

    Hernández Rodríguez, Selene

    2010-01-01

    The k nearest neighbor (k-NN) classifier has been extensively used in Pattern Recognition because of its simplicity and its good performance. However, in large datasets applications, the exhaustive k-NN classifier becomes impractical. Therefore, many fast k-NN classifiers have been developed; most of them rely on metric properties (usually the triangle inequality) to reduce the number of prototype comparisons. Hence, the existing fast k-NN classifiers are applicable only when the comparison f...

  17. Fault Diagnosis for Distribution Networks Using Enhanced Support Vector Machine Classifier with Classical Multidimensional Scaling

    Directory of Open Access Journals (Sweden)

    Ming-Yuan Cho

    2017-09-01

    Full Text Available In this paper, a new fault diagnosis technique based on the time domain reflectometry (TDR) method with pseudo-random binary sequence (PRBS) stimulus and a support vector machine (SVM) classifier has been investigated to recognize the different types of fault in radial distribution feeders. This novel technique considers the amplitude of reflected signals and the peaks of cross-correlation (CCR) between the reflected and incident waves for generating a fault current dataset for the SVM. Furthermore, this multi-layer enhanced SVM classifier is combined with the classical multidimensional scaling (CMDS) feature extraction algorithm and kernel parameter optimization to increase training speed and improve overall classification accuracy. The proposed technique has been tested on a radial distribution feeder to identify ten different types of fault considering 12 input features generated by using Simulink software and MATLAB Toolbox. The success rate of the SVM classifier is over 95%, which demonstrates the effectiveness and high accuracy of the proposed method.

  18. Combining MLC and SVM Classifiers for Learning Based Decision Making: Analysis and Evaluations.

    Science.gov (United States)

    Zhang, Yi; Ren, Jinchang; Jiang, Jianmin

    2015-01-01

    Maximum likelihood classifier (MLC) and support vector machines (SVM) are two commonly used approaches in machine learning. MLC is based on Bayesian theory in estimating parameters of a probabilistic model, whilst SVM is an optimization based nonparametric method in this context. Recently, it is found that SVM in some cases is equivalent to MLC in probabilistically modeling the learning process. In this paper, MLC and SVM are combined in learning and classification, which helps to yield probabilistic output for SVM and facilitate soft decision making. In total four groups of data are used for evaluations, covering sonar, vehicle, breast cancer, and DNA sequences. The data samples are characterized in terms of Gaussian/non-Gaussian distributed and balanced/unbalanced samples which are then further used for performance assessment in comparing the SVM and the combined SVM-MLC classifier. Interesting results are reported to indicate how the combined classifier may work under various conditions.

  19. Combining MLC and SVM Classifiers for Learning Based Decision Making: Analysis and Evaluations

    Directory of Open Access Journals (Sweden)

    Yi Zhang

    2015-01-01

    Full Text Available Maximum likelihood classifier (MLC) and support vector machines (SVM) are two commonly used approaches in machine learning. MLC is based on Bayesian theory in estimating parameters of a probabilistic model, whilst SVM is an optimization based nonparametric method in this context. Recently, it is found that SVM in some cases is equivalent to MLC in probabilistically modeling the learning process. In this paper, MLC and SVM are combined in learning and classification, which helps to yield probabilistic output for SVM and facilitate soft decision making. In total four groups of data are used for evaluations, covering sonar, vehicle, breast cancer, and DNA sequences. The data samples are characterized in terms of Gaussian/non-Gaussian distributed and balanced/unbalanced samples which are then further used for performance assessment in comparing the SVM and the combined SVM-MLC classifier. Interesting results are reported to indicate how the combined classifier may work under various conditions.

  20. Three data partitioning strategies for building local classifiers (Chapter 14)

    NARCIS (Netherlands)

    Zliobaite, I.; Okun, O.; Valentini, G.; Re, M.

    2011-01-01

    Divide-and-conquer approach has been recognized in multiple classifier systems aiming to utilize local expertise of individual classifiers. In this study we experimentally investigate three strategies for building local classifiers that are based on different routines of sampling data for training.

  1. Recognition of pornographic web pages by classifying texts and images.

    Science.gov (United States)

    Hu, Weiming; Wu, Ou; Chen, Zhouyao; Fu, Zhouyu; Maybank, Steve

    2007-06-01

    With the rapid development of the World Wide Web, people benefit more and more from the sharing of information. However, Web pages with obscene, harmful, or illegal content can be easily accessed. It is important to recognize such unsuitable, offensive, or pornographic Web pages. In this paper, a novel framework for recognizing pornographic Web pages is described. A C4.5 decision tree is used to divide Web pages, according to content representations, into continuous text pages, discrete text pages, and image pages. These three categories of Web pages are handled, respectively, by a continuous text classifier, a discrete text classifier, and an algorithm that fuses the results from the image classifier and the discrete text classifier. In the continuous text classifier, statistical and semantic features are used to recognize pornographic texts. In the discrete text classifier, the naive Bayes rule is used to calculate the probability that a discrete text is pornographic. In the image classifier, the object's contour-based features are extracted to recognize pornographic images. In the text and image fusion algorithm, the Bayes theory is used to combine the recognition results from images and texts. Experimental results demonstrate that the continuous text classifier outperforms the traditional keyword-statistics-based classifier, the contour-based image classifier outperforms the traditional skin-region-based image classifier, the results obtained by our fusion algorithm outperform those by either of the individual classifiers, and our framework can be adapted to different categories of Web pages.

  2. 32 CFR 2400.28 - Dissemination of classified information.

    Science.gov (United States)

    2010-07-01

    ... 32 National Defense 6 2010-07-01 2010-07-01 false Dissemination of classified information. 2400.28... SECURITY PROGRAM Safeguarding § 2400.28 Dissemination of classified information. Heads of OSTP offices... originating official may prescribe specific restrictions on dissemination of classified information when...

  3. Stability of halophilic proteins: from dipeptide attributes to discrimination classifier.

    Science.gov (United States)

    Zhang, Guangya; Huihua, Ge; Yi, Lin

    2013-02-01

    To investigate the molecular features responsible for protein halophilicity is of great significance for understanding the structure basis of protein halo-stability and would help to develop a practical strategy for designing halophilic proteins. In this work, we have systematically analyzed the dipeptide composition of the halophilic and non-halophilic protein sequences. We observed the halophilic proteins contained more DA, RA, AD, RR, AP, DD, PD, EA, VG and DV at the expense of LK, IL, II, IA, KK, IS, KA, GK, RK and AI. We identified some macromolecular signatures of halo-adaptation, and thought the dipeptide composition might contain more information than amino acid composition. Based on the dipeptide composition, we have developed a machine learning method for classifying halophilic and non-halophilic proteins for the first time. The accuracy of our method for the training dataset was 100.0%, and for the 10-fold cross-validation was 93.1%. We also discussed the influence of some specific dipeptides on prediction accuracy. Copyright © 2012 Elsevier B.V. All rights reserved.
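
    The dipeptide composition used as the feature vector here can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `dipeptide_composition` is hypothetical, the alphabet is the 20 standard amino acids, and pairs containing any other symbol are simply ignored, which may differ from the authors' preprocessing.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs, in a fixed order so feature vectors are comparable
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]

def dipeptide_composition(sequence):
    """Return the 400 relative frequencies of overlapping dipeptides."""
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # skip pairs with non-standard residues
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]
```

    A vector of this kind could then be passed to any standard classifier; the abstract does not say which learner was used, so none is assumed here.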

  4. IN-MACA-MCC: Integrated Multiple Attractor Cellular Automata with Modified Clonal Classifier for Human Protein Coding and Promoter Prediction

    Directory of Open Access Journals (Sweden)

    Kiran Sree Pokkuluri

    2014-01-01

    Full Text Available Protein coding and promoter region predictions are very important challenges of bioinformatics (Attwood and Teresa, 2000). The identification of these regions plays a crucial role in understanding the genes. Many novel computational and mathematical methods have been introduced, and existing methods are being refined, for predicting each of the regions separately; still, there is scope for improvement. We propose a classifier that is built with MACA (multiple attractor cellular automata) and MCC (modified clonal classifier) to predict both regions with a single classifier. The proposed classifier is trained and tested with Fickett and Tung (1992) datasets for protein coding region prediction for DNA sequences of lengths 54, 108, and 162. This classifier is trained and tested with MMCRI datasets for protein coding region prediction for DNA sequences of lengths 252 and 354. The proposed classifier is trained and tested with promoter sequences from the DBTSS (Yamashita et al., 2006) dataset and nonpromoters from the EID (Saxonov et al., 2000) and UTRdb (Pesole et al., 2002) datasets. The proposed model can predict both regions with an average accuracy of 90.5% for promoter and 89.6% for protein coding region predictions. The specificity and sensitivity values of promoter and protein coding region predictions are 0.89 and 0.92, respectively.

  5. Peat classified as slowly renewable biomass fuel

    International Nuclear Information System (INIS)

    2001-01-01

    thousands of years. The report states also that peat should be classified as biomass fuel instead of biofuels, such as wood, or fossil fuels such as coal. According to the report peat is a renewable biomass fuel like biofuels, but due to slow accumulation it should be considered as slowly renewable fuel. The report estimates that bonding of carbon in both virgin and forest drained peatlands are so high that it can compensate the emissions formed in combustion of energy peat

  6. Chameleon sequences in neurodegenerative diseases

    International Nuclear Information System (INIS)

    Bahramali, Golnaz; Goliaei, Bahram; Minuchehr, Zarrin; Salari, Ali

    2016-01-01

    Chameleon sequences can adopt either alpha helix sheet or a coil conformation. Defining chameleon sequences in the PDB (Protein Data Bank) may yield insight into the peptides and proteins responsible for neurodegeneration. In this research, we took advantage of the large PDB and performed a sequence analysis of chameleons, developing an algorithm to extract peptide segments with identical sequences but different structures. To find new chameleon sequences, we extracted a set of 8315 non-redundant protein sequences from the PDB with an identity of less than 25%. Our data were classified into “helix to strand (HE)”, “helix to coil (HC)” and “strand to coil (CE)” alterations. We also analyzed the occurrence of singlet and doublet amino acids and the solvent accessibility in the chameleon sequences; we then identified the proteins with the largest number of chameleon sequences and named them Chameleon Flexible Proteins (CFPs) in our dataset. Our data revealed that Gly, Val, Ile, Tyr and Phe are the major amino acids in chameleons. We also found that proteins such as Insulin Degrading Enzyme (IDE) and GTP-binding nuclear protein Ran (RAN) have the largest number of chameleons (640 and 405, respectively). These proteins have known roles in neurodegenerative diseases. Therefore, it can be inferred that other CFPs can serve as key proteins in neurodegeneration, and a study of them can shed light on curing and preventing neurodegenerative diseases.
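
    The extraction of peptide segments with identical sequences but different structures might look roughly like the sketch below. The abstract does not describe the authors' algorithm, so everything here is an assumption: the input format (pairs of residue string and per-residue secondary-structure string using H, E, and C), the fixed window length `k`, and the rule that a window counts only when its structure is uniform.

```python
from collections import defaultdict

def find_chameleons(chains, k=5):
    """Map each k-residue peptide to the set of uniform structures it adopts.

    chains: iterable of (sequence, secondary_structure) string pairs of equal
    length, with structure letters H (helix), E (strand), C (coil).
    Returns only peptides seen in at least two different structural states.
    """
    states = defaultdict(set)
    for seq, ss in chains:
        for i in range(len(seq) - k + 1):
            window_ss = ss[i:i + k]
            if len(set(window_ss)) == 1:  # keep structurally uniform windows
                states[seq[i:i + k]].add(window_ss[0])
    # Label such as "EH" means the peptide occurs as both strand and helix
    return {pep: "".join(sorted(s)) for pep, s in states.items() if len(s) > 1}
```

    For example, a peptide that is helical in one chain and a strand in another is reported with the label "EH", which would correspond to the helix/strand class described above.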

  7. Chameleon sequences in neurodegenerative diseases.

    Science.gov (United States)

    Bahramali, Golnaz; Goliaei, Bahram; Minuchehr, Zarrin; Salari, Ali

    2016-03-25

    Chameleon sequences can adopt either alpha helix sheet or a coil conformation. Defining chameleon sequences in the PDB (Protein Data Bank) may yield insight into the peptides and proteins responsible for neurodegeneration. In this research, we took advantage of the large PDB and performed a sequence analysis of chameleons, developing an algorithm to extract peptide segments with identical sequences but different structures. To find new chameleon sequences, we extracted a set of 8315 non-redundant protein sequences from the PDB with an identity of less than 25%. Our data were classified into "helix to strand (HE)", "helix to coil (HC)" and "strand to coil (CE)" alterations. We also analyzed the occurrence of singlet and doublet amino acids and the solvent accessibility in the chameleon sequences; we then identified the proteins with the largest number of chameleon sequences and named them Chameleon Flexible Proteins (CFPs) in our dataset. Our data revealed that Gly, Val, Ile, Tyr and Phe are the major amino acids in chameleons. We also found that proteins such as Insulin Degrading Enzyme (IDE) and GTP-binding nuclear protein Ran (RAN) have the largest number of chameleons (640 and 405, respectively). These proteins have known roles in neurodegenerative diseases. Therefore, it can be inferred that other CFPs can serve as key proteins in neurodegeneration, and a study of them can shed light on curing and preventing neurodegenerative diseases. Copyright © 2016 Elsevier Inc. All rights reserved.

  8. Chameleon sequences in neurodegenerative diseases

    Energy Technology Data Exchange (ETDEWEB)

    Bahramali, Golnaz [Institute of Biochemistry and Biophysics, University of Tehran, Tehran (Iran, Islamic Republic of); Goliaei, Bahram, E-mail: goliaei@ut.ac.ir [Institute of Biochemistry and Biophysics, University of Tehran, Tehran (Iran, Islamic Republic of); Minuchehr, Zarrin, E-mail: minuchehr@nigeb.ac.ir [Department of Systems Biotechnology, National Institute of Genetic Engineering and Biotechnology, (NIGEB), Tehran (Iran, Islamic Republic of); Salari, Ali [Department of Systems Biotechnology, National Institute of Genetic Engineering and Biotechnology, (NIGEB), Tehran (Iran, Islamic Republic of)

    2016-03-25

    Chameleon sequences can adopt either alpha helix sheet or a coil conformation. Defining chameleon sequences in the PDB (Protein Data Bank) may yield insight into the peptides and proteins responsible for neurodegeneration. In this research, we took advantage of the large PDB and performed a sequence analysis of chameleons, developing an algorithm to extract peptide segments with identical sequences but different structures. To find new chameleon sequences, we extracted a set of 8315 non-redundant protein sequences from the PDB with an identity of less than 25%. Our data were classified into “helix to strand (HE)”, “helix to coil (HC)” and “strand to coil (CE)” alterations. We also analyzed the occurrence of singlet and doublet amino acids and the solvent accessibility in the chameleon sequences; we then identified the proteins with the largest number of chameleon sequences and named them Chameleon Flexible Proteins (CFPs) in our dataset. Our data revealed that Gly, Val, Ile, Tyr and Phe are the major amino acids in chameleons. We also found that proteins such as Insulin Degrading Enzyme (IDE) and GTP-binding nuclear protein Ran (RAN) have the largest number of chameleons (640 and 405, respectively). These proteins have known roles in neurodegenerative diseases. Therefore, it can be inferred that other CFPs can serve as key proteins in neurodegeneration, and a study of them can shed light on curing and preventing neurodegenerative diseases.

  9. Re-interpreting Prominences Classified as Tornadoes

    Science.gov (United States)

    Martin, Sara F.; Venkataramanasastry, Aparna

    2015-04-01

    Some papers in the recent literature identify tornado prominences with barbs of quiescent prominences while papers in the much older historic literature include a second category of tornado prominence that does not correspond to a barb of a quiescent prominence. The latter are described as prominence mass rotating around a nearly vertical axis prior to its eruption and the rotation was verified by spectral measurements. From H alpha Doppler-shifted mass motions recorded at Helio Research or the Dutch Open Telescope, we illustrate how the apparent tornado-like motions, identified with barbs, are illusions in our mind’s eye resulting from poorly resolved counterstreaming threads of mass in the barbs of quiescent prominences. In contrast, we confirm the second category of rotational motion in prominences shortly before and during eruption. In addition, we identify this second category as part of the late phase of a phenomenon called the roll effect in erupting prominences. In these cases, the eruption begins with the sideways rolling of the top of a prominence. As the eruption proceeds the rolling motion propagates down one leg or both legs of the prominence depending on whether the eruption is asymmetric or symmetric respectively. As an asymmetric eruption continues, the longer lasting leg becomes nearly vertical and its rotational motion also continues. If only this phase of the eruption was observed, as in some historic cases, it was called a tornado prominence. However, when we now observe entire eruptions in time-lapse sequences, the similarity to terrestrial tornadoes is lost. We conclude that neither prominence barbs, that give the illusion of rotation, nor the cases of true rotational motion, in the legs of erupting prominences, are usefully described as tornado prominences when the complete prominence structure or complete erupting event is observed.

  10. Green's theorem and Gorenstein sequences

    OpenAIRE

    Ahn, Jeaman; Migliore, Juan C.; Shin, Yong-Su

    2016-01-01

    We study consequences, for a standard graded algebra, of extremal behavior in Green's Hyperplane Restriction Theorem. First, we extend his Theorem 4 from the case of a plane curve to the case of a hypersurface in a linear space. Second, assuming a certain Lefschetz condition, we give a connection to extremal behavior in Macaulay's theorem. We apply these results to show that $(1,19,17,19,1)$ is not a Gorenstein sequence, and as a result we classify the sequences of the form $(1,a,a-2,a,1)$ th...

  11. Localization and Recognition of Dynamic Hand Gestures Based on Hierarchy of Manifold Classifiers

    Science.gov (United States)

    Favorskaya, M.; Nosov, A.; Popov, A.

    2015-05-01

    Generally, dynamic hand gestures are captured in continuous video sequences, and a gesture recognition system ought to extract robust features automatically. This task involves the highly challenging spatio-temporal variations of dynamic hand gestures. The proposed method is based on two-level manifold classifiers, comprising trajectory classifiers at any time instant and posture classifiers of sub-gestures at selected time instants. The trajectory classifiers contain a skin detector, a normalized skeleton representation of one or two hands, and a motion history represented by motion vectors normalized through predetermined directions (8 and 16 in our case). Each dynamic gesture is separated into a set of sub-gestures in order to predict a trajectory and remove those gesture samples that do not fit the current trajectory. The posture classifiers involve the normalized skeleton representation of the palm and fingers and relative finger positions using fingertips. The min-max criterion is used for trajectory recognition, and the decision tree technique is applied for posture recognition of sub-gestures. For experiments, a dataset "Multi-modal Gesture Recognition Challenge 2013: Dataset and Results" including 393 dynamic hand gestures was chosen. The proposed method yielded 84-91% recognition accuracy, on average, for a restricted set of dynamic gestures.
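
    Normalizing motion vectors through a fixed set of directions can be read as quantizing each vector's angle into 8 or 16 equal sectors. The sketch below shows that reading only; the function names and the convention that bin 0 is centred on the positive x-axis are assumptions, not details taken from the paper.

```python
import math

def quantize_direction(dx, dy, n_bins=8):
    """Return the index of the direction sector (0..n_bins-1) for a vector."""
    angle = math.atan2(dy, dx) % (2 * math.pi)  # angle in [0, 2*pi)
    width = 2 * math.pi / n_bins
    # Shift by half a sector so that bin 0 is centred on the positive x-axis
    return int((angle + width / 2) // width) % n_bins

def direction_histogram(vectors, n_bins=8):
    """Relative frequency of each direction bin over a list of (dx, dy)."""
    counts = [0] * n_bins
    for dx, dy in vectors:
        if dx == 0 and dy == 0:
            continue  # zero motion carries no direction
        counts[quantize_direction(dx, dy, n_bins)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * n_bins
```

    With `n_bins=16` the same vectors fall into sectors half as wide, which trades robustness to noise for finer trajectory detail.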

  12. LOCALIZATION AND RECOGNITION OF DYNAMIC HAND GESTURES BASED ON HIERARCHY OF MANIFOLD CLASSIFIERS

    Directory of Open Access Journals (Sweden)

    M. Favorskaya

    2015-05-01

    Full Text Available Generally, dynamic hand gestures are captured in continuous video sequences, and a gesture recognition system ought to extract robust features automatically. This task involves the highly challenging spatio-temporal variations of dynamic hand gestures. The proposed method is based on two-level manifold classifiers, comprising trajectory classifiers at any time instant and posture classifiers of sub-gestures at selected time instants. The trajectory classifiers contain a skin detector, a normalized skeleton representation of one or two hands, and a motion history represented by motion vectors normalized through predetermined directions (8 and 16 in our case). Each dynamic gesture is separated into a set of sub-gestures in order to predict a trajectory and remove those gesture samples that do not fit the current trajectory. The posture classifiers involve the normalized skeleton representation of the palm and fingers and relative finger positions using fingertips. The min-max criterion is used for trajectory recognition, and the decision tree technique is applied for posture recognition of sub-gestures. For experiments, a dataset “Multi-modal Gesture Recognition Challenge 2013: Dataset and Results” including 393 dynamic hand gestures was chosen. The proposed method yielded 84–91% recognition accuracy, on average, for a restricted set of dynamic gestures.

  13. Classifying cognitive profiles using machine learning with privileged information in Mild Cognitive Impairment

    Directory of Open Access Journals (Sweden)

    Hanin Hamdan Alahmadi

    2016-11-01

    Full Text Available Early diagnosis of dementia is critical for assessing disease progression and potential treatment. State-of-the-art machine learning techniques have been increasingly employed to take on this diagnostic task. In this study, we employed Generalised Matrix Learning Vector Quantization (GMLVQ) classifiers to discriminate patients with Mild Cognitive Impairment (MCI) from healthy controls based on their cognitive skills. Further, we adopted a "Learning with privileged information" approach to combine cognitive and fMRI data for the classification task. The resulting classifier operates solely on the cognitive data while it incorporates the fMRI data as privileged information (PI) during training. This novel classifier is of practical use as the collection of brain imaging data is not always possible with patients and older participants. MCI patients and healthy age-matched controls were trained to extract structure from temporal sequences. We ask whether machine learning classifiers can be used to discriminate patients from controls based on the learning performance and whether differences between these groups relate to individual cognitive profiles. To this end, we tested participants in four cognitive tasks: working memory, cognitive inhibition, divided attention, and selective attention. We also collected fMRI data before and after training on the learning task and extracted fMRI responses and connectivity as features for machine learning classifiers. Our results show that the PI-guided GMLVQ classifiers outperform the baseline classifier that only used the cognitive data. In addition, we found that for the baseline classifier, divided attention is the only relevant cognitive feature. When PI was incorporated, divided attention remained the most relevant feature while cognitive inhibition also became relevant for the task. Interestingly, this analysis for the fMRI GMLVQ classifier suggests that (1) when overall fMRI signal for structured stimuli is

  14. Intermediate depth burial of classified transuranic wastes in arid alluvium

    International Nuclear Information System (INIS)

    Cochran, J.R.; Crowe, B.M.; Di Sanza, F.

    1999-01-01

    Intermediate depth disposal operations were conducted by the US Department of Energy (DOE) at the DOE's Nevada Test Site (NTS) from 1984 through 1989. These operations emplaced high-specific activity low-level wastes (LLW) and limited quantities of classified transuranic (TRU) wastes in 37 m (120-ft) deep, Greater Confinement Disposal (GCD) boreholes. The GCD boreholes are 3 m (10 ft) in diameter and founded in a thick sequence of arid alluvium. The bottom 15 m (50 ft) of each borehole was used for waste emplacement and the upper 21 m (70 ft) was backfilled with native alluvium. The bottom of each GCD borehole is almost 200 m (650 ft) above the water table. The GCD boreholes are located in one of the most arid portions of the US, with an average precipitation of 13 cm (5 inches) per year. The limited precipitation, coupled with generally warm temperatures and low humidities, results in a hydrologic system dominated by evapotranspiration. The US Environmental Protection Agency's (EPA's) 40 CFR 191 defines the requirements for protection of human health from disposed TRU wastes. This EPA standard sets a number of requirements, including probabilistic limits on the cumulative releases of radionuclides to the accessible environment for 10,000 years. The DOE Nevada Operations Office (DOE/NV) has contracted with Sandia National Laboratories (Sandia) to conduct a performance assessment (PA) to determine if the TRU wastes emplaced in the GCD boreholes comply with the EPA's 40 CFR 191 requirements. This paper describes DOE's actions undertaken to evaluate whether the TRU wastes in the GCD boreholes will, or will not, endanger human health. Based on preliminary modeling, the TRU wastes in the GCD boreholes meet the EPA's requirements, and are, therefore, protective of human health.

  15. A Supervised Multiclass Classifier for an Autocoding System

    Directory of Open Access Journals (Sweden)

    Yukako Toko

    2017-11-01

    Full Text Available Classification is often required in various contexts, including in the field of official statistics. In the previous study, we have developed a multiclass classifier that can classify short text descriptions with high accuracy. The algorithm borrows the concept of the naïve Bayes classifier and is so simple that its structure is easily understandable. The proposed classifier has the following two advantages. First, the processing times for both learning and classifying are extremely practical. Second, the proposed classifier yields high-accuracy results for a large portion of a dataset. We have previously developed an autocoding system for the Family Income and Expenditure Survey in Japan that has a better performing classifier. While the original system was developed in Perl in order to improve the efficiency of the coding process of short Japanese texts, the proposed system is implemented in the R programming language in order to explore versatility and is modified to make the system easily applicable to English text descriptions, in consideration of the increasing number of R users in the field of official statistics. We are planning to publish the proposed classifier as an R-package. The proposed classifier would be generally applicable to other classification tasks including coding activities in the field of official statistics, and it would contribute greatly to improving their efficiency.

  16. 18 CFR 3a.12 - Authority to classify official information.

    Science.gov (United States)

    2010-04-01

    ... efficient administration. (b) The authority to classify information or material originally as Top Secret is... classify information or material originally as Secret is exercised only by: (1) Officials who have Top... information or material originally as Confidential is exercised by officials who have Top Secret or Secret...

  17. Using Neural Networks to Classify Digitized Images of Galaxies

    Science.gov (United States)

    Goderya, S. N.; McGuire, P. C.

    2000-12-01

    Automated classification of galaxies into Hubble types is of paramount importance to study the large scale structure of the Universe, particularly as survey projects like the Sloan Digital Sky Survey complete their data acquisition of one million galaxies. At present it is not possible to find robust and efficient artificial intelligence based galaxy classifiers. In this study we will summarize progress made in the development of automated galaxy classifiers using neural networks as machine learning tools. We explore the Bayesian linear algorithm, the higher order probabilistic network, the multilayer perceptron neural network and the Support Vector Machine classifier. The performance of any machine classifier is dependent on the quality of the parameters that characterize the different groups of galaxies. Our effort is to develop geometric and invariant moment based parameters as input to the machine classifiers instead of the raw pixel data. Such an approach reduces the dimensionality of the classifier considerably, removes the effects of scaling and rotation, and makes it easier to solve for the unknown parameters in the galaxy classifier. To judge the quality of training and classification we develop the concept of Matthews coefficients for the galaxy classification community. Matthews coefficients are single numbers that quantify classifier performance even with unequal prior probabilities of the classes.
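
    Assuming the coefficient meant here is the standard Matthews correlation coefficient for a two-class confusion matrix, it can be computed as in the sketch below. The function name and the convention of returning 0.0 when the denominator vanishes are illustrative choices, not taken from the abstract.

```python
import math

def matthews_coefficient(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column is empty
    return (tp * tn - fp * fn) / denom
```

    The point about unequal priors shows up in a simple case: a classifier that labels every object as the majority class in a 95:5 sample reaches 95% accuracy, yet `matthews_coefficient(0, 95, 0, 5)` evaluates to 0.0 under the convention above.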

  18. Fisher classifier and its probability of error estimation

    Science.gov (United States)

    Chittineni, C. B.

    1979-01-01

    Computationally efficient expressions are derived for estimating the probability of error using the leave-one-out method. The optimal threshold for the classification of patterns projected onto Fisher's direction is derived. A simple generalization of the Fisher classifier to multiple classes is presented. Computational expressions are developed for estimating the probability of error of the multiclass Fisher classifier.
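
    For orientation, a two-class Fisher classifier with a brute-force leave-one-out error estimate is sketched below for 2-D points. It does not reproduce the efficient expressions or the optimal threshold derived in the paper: the model is simply refitted for each held-out sample, and the threshold is assumed to be the midpoint of the projected class means. All function names are hypothetical.

```python
def _mean(points):
    n = len(points)
    return [sum(p[j] for p in points) / n for j in range(2)]

def _scatter(points, m):
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = (p[0] - m[0], p[1] - m[1])
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fisher_fit(class0, class1):
    """Return Fisher's direction w = Sw^-1 (m1 - m0) and a midpoint threshold."""
    m0, m1 = _mean(class0), _mean(class1)
    s0, s1 = _scatter(class0, m0), _scatter(class1, m1)
    a, b = s0[0][0] + s1[0][0], s0[0][1] + s1[0][1]
    c, d = s0[1][0] + s1[1][0], s0[1][1] + s1[1][1]
    det = a * d - b * c
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    diff = (m1[0] - m0[0], m1[1] - m0[1])
    w = ((d * diff[0] - b * diff[1]) / det, (-c * diff[0] + a * diff[1]) / det)
    # Assumed threshold: midpoint between the projected class means
    threshold = 0.5 * ((w[0] * m0[0] + w[1] * m0[1]) + (w[0] * m1[0] + w[1] * m1[1]))
    return w, threshold

def fisher_predict(x, w, threshold):
    return 1 if w[0] * x[0] + w[1] * x[1] > threshold else 0

def leave_one_out_error(class0, class1):
    """Fraction of samples misclassified when each is held out in turn."""
    errors = 0
    for label, own in ((0, class0), (1, class1)):
        for i, x in enumerate(own):
            rest = own[:i] + own[i + 1:]
            w, t = fisher_fit(rest, class1) if label == 0 else fisher_fit(class0, rest)
            errors += fisher_predict(x, w, t) != label
    return errors / (len(class0) + len(class1))
```

    Refitting once per sample costs one matrix inversion each time, which is the expense that closed-form leave-one-out expressions of the kind the abstract mentions are meant to avoid.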

  19. Performance of classification confidence measures in dynamic classifier systems

    Czech Academy of Sciences Publication Activity Database

    Štefka, D.; Holeňa, Martin

    2013-01-01

    Roč. 23, č. 4 (2013), s. 299-319 ISSN 1210-0552 R&D Projects: GA ČR GA13-17187S Institutional support: RVO:67985807 Keywords : classifier combining * dynamic classifier systems * classification confidence Subject RIV: IN - Informatics, Computer Science Impact factor: 0.412, year: 2013

  20. 32 CFR 2400.30 - Reproduction of classified information.

    Science.gov (United States)

    2010-07-01

    ... 32 National Defense 6 2010-07-01 2010-07-01 false Reproduction of classified information. 2400.30... SECURITY PROGRAM Safeguarding § 2400.30 Reproduction of classified information. Documents or portions of... the originator or higher authority. Any stated prohibition against reproduction shall be strictly...

  1. Classifying spaces with virtually cyclic stabilizers for linear groups

    DEFF Research Database (Denmark)

    Degrijse, Dieter Dries; Köhl, Ralf; Petrosyan, Nansen

    2015-01-01

    We show that every discrete subgroup of GL(n, ℝ) admits a finite-dimensional classifying space with virtually cyclic stabilizers. Applying our methods to SL(3, ℤ), we obtain a four-dimensional classifying space with virtually cyclic stabilizers and a decomposition of the algebraic K-theory of its...

  2. Dynamic integration of classifiers in the space of principal components

    NARCIS (Netherlands)

    Tsymbal, A.; Pechenizkiy, M.; Puuronen, S.; Patterson, D.W.; Kalinichenko, L.A.; Manthey, R.; Thalheim, B.; Wloka, U.

    2003-01-01

    Recent research has shown the integration of multiple classifiers to be one of the most important directions in machine learning and data mining. It was shown that, for an ensemble to be successful, it should consist of accurate and diverse base classifiers. However, it is also important that the

  3. An ensemble of dissimilarity based classifiers for Mackerel gender determination

    International Nuclear Information System (INIS)

    Blanco, A; Rodriguez, R; Martinez-Maranon, I

    2014-01-01

    Mackerel is an undervalued fish captured by European fishing vessels. One way to add value to this species is to classify it by sex. Colour measurements were performed on gonads extracted from Mackerel females and males (fresh and defrosted) to find differences between the sexes. Several linear and nonlinear classifiers, such as Support Vector Machines (SVM), k Nearest Neighbors (k-NN) or Diagonal Linear Discriminant Analysis (DLDA), can be applied to this problem. However, they are usually based on Euclidean distances that fail to reflect accurately the sample proximities. Classifiers based on non-Euclidean dissimilarities misclassify a different set of patterns. We combine different kinds of dissimilarity-based classifiers. The diversity is induced by considering a set of complementary dissimilarities for each model. The experimental results suggest that our algorithm helps to improve on classifiers based on a single dissimilarity.
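
    The idea of combining classifiers that each rely on a different dissimilarity can be illustrated with a small voting ensemble. This is a sketch under assumptions, not the authors' algorithm: k-NN stands in for the base classifier, the three dissimilarities were chosen only for illustration, and the combination rule is a plain majority vote.

```python
import math
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_dissimilarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0 or nb == 0:
        return 1.0
    return 1.0 - dot / (na * nb)

def knn_predict(x, train, dissimilarity, k=3):
    """Majority label among the k training samples closest to x."""
    neighbors = sorted(train, key=lambda item: dissimilarity(x, item[0]))[:k]
    return Counter(label for _, label in neighbors).most_common(1)[0][0]

def ensemble_predict(x, train, dissimilarities, k=3):
    """One k-NN vote per dissimilarity, combined by majority vote."""
    votes = Counter(knn_predict(x, train, d, k) for d in dissimilarities)
    return votes.most_common(1)[0][0]
```

    Because each dissimilarity tends to misclassify a different set of patterns, the vote can correct errors that any single base classifier would make on its own.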

  4. An ensemble of dissimilarity based classifiers for Mackerel gender determination

    Science.gov (United States)

    Blanco, A.; Rodriguez, R.; Martinez-Maranon, I.

    2014-03-01

    Mackerel is an undervalued fish captured by European fishing vessels. One way to add value to this species is to classify it by sex. Colour measurements were performed on gonads extracted from Mackerel females and males (fresh and defrosted) to find differences between the sexes. Several linear and nonlinear classifiers, such as Support Vector Machines (SVM), k Nearest Neighbors (k-NN) or Diagonal Linear Discriminant Analysis (DLDA), can be applied to this problem. However, they are usually based on Euclidean distances that fail to reflect accurately the sample proximities. Classifiers based on non-Euclidean dissimilarities misclassify a different set of patterns. We combine different kinds of dissimilarity-based classifiers. The diversity is induced by considering a set of complementary dissimilarities for each model. The experimental results suggest that our algorithm helps to improve on classifiers based on a single dissimilarity.

  5. Just-in-time classifiers for recurrent concepts.

    Science.gov (United States)

    Alippi, Cesare; Boracchi, Giacomo; Roveri, Manuel

    2013-04-01

    Just-in-time (JIT) classifiers operate in evolving environments by classifying instances and reacting to concept drift. In stationary conditions, a JIT classifier improves its accuracy over time by exploiting additional supervised information coming from the field. In nonstationary conditions, however, the classifier reacts as soon as concept drift is detected; the current classification setup is discarded and a suitable one activated to keep the accuracy high. We present a novel generation of JIT classifiers able to deal with recurrent concept drift by means of a practical formalization of the concept representation and the definition of a set of operators working on such representations. The concept-drift detection activity, which is crucial in promptly reacting to changes exactly when needed, is advanced by considering change-detection tests monitoring both inputs and classes distributions.

  6. Prediction of small molecule binding property of protein domains with Bayesian classifiers based on Markov chains.

    Science.gov (United States)

    Bulashevska, Alla; Stein, Martin; Jackson, David; Eils, Roland

    2009-12-01

    Accurate computational methods that can help to predict the biological function of a protein from its sequence are of great interest to research biologists and pharmaceutical companies. One approach to inferring the function of proteins is to predict the interactions between proteins and other molecules. In this work, we propose a machine learning method that uses the primary sequence of a domain to predict its propensity for interaction with small molecules. By curating the Pfam database with respect to the small molecule binding ability of its component domains, we have constructed a dataset of small molecule binding and non-binding domains. This dataset was then used as a training set to learn a Bayesian classifier that distinguishes members of each class. The domain sequences of both classes are modelled with Markov chains. In a jack-knife test, our classification procedure achieved predictive accuracies of 77.2% and 66.7% for the binding and non-binding classes respectively. We demonstrate the applicability of our classifier by using it to identify previously unknown small molecule binding domains. Our predictions are available as supplementary material and can provide very useful information to drug discovery specialists. Given the ubiquitous and essential role small molecules play in biological processes, our method is important for identifying pharmaceutically relevant components of complete proteomes. The software is available from the author upon request.
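
    The idea of a Bayesian classifier whose class-conditional sequence models are Markov chains can be sketched as follows. This is a minimal illustration only: the first-order chain, the add-one smoothing, the toy sequences and the names `MarkovChainModel` and `classify` are assumptions made for the example, not details taken from the paper.

```python
import math
from collections import defaultdict

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


class MarkovChainModel:
    """First-order Markov chain over amino acids (assumed order, add-one smoothing)."""

    def __init__(self, sequences):
        self.n_seqs = len(sequences)
        self.start = defaultdict(int)                        # counts of first symbols
        self.trans = defaultdict(lambda: defaultdict(int))   # counts of a -> b transitions
        for seq in sequences:
            self.start[seq[0]] += 1
            for a, b in zip(seq, seq[1:]):
                self.trans[a][b] += 1

    def log_likelihood(self, seq):
        """Log-probability of a non-empty sequence under the smoothed chain."""
        k = len(ALPHABET)
        ll = math.log((self.start[seq[0]] + 1) / (self.n_seqs + k))
        for a, b in zip(seq, seq[1:]):
            row = self.trans[a]
            ll += math.log((row[b] + 1) / (sum(row.values()) + k))
        return ll


def classify(seq, models, priors):
    """Bayes rule: pick the class maximising log prior + log likelihood."""
    scores = {
        label: math.log(priors[label]) + model.log_likelihood(seq)
        for label, model in models.items()
    }
    return max(scores, key=scores.get)


# Toy training data (hypothetical, not real Pfam domains).
models = {
    "binding": MarkovChainModel(["ACACAC", "CACACA"]),
    "non-binding": MarkovChainModel(["GGGGGG", "GGGTGG"]),
}
priors = {"binding": 0.5, "non-binding": 0.5}

print(classify("ACAC", models, priors))
print(classify("GGGG", models, priors))
```

    Working in log space avoids numerical underflow on long domain sequences, and the smoothing keeps unseen transitions from driving a class probability to zero.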

  7. Class-specific Error Bounds for Ensemble Classifiers

    Energy Technology Data Exchange (ETDEWEB)

    Prenger, R; Lemmond, T; Varshney, K; Chen, B; Hanley, W

    2009-10-06

    The generalization error, or probability of misclassification, of ensemble classifiers has been shown to be bounded above by a function of the mean correlation between the constituent (i.e., base) classifiers and their average strength. This bound suggests that increasing the strength and/or decreasing the correlation of an ensemble's base classifiers may yield improved performance under the assumption of equal error costs. However, this and other existing bounds do not directly address application spaces in which error costs are inherently unequal. For applications involving binary classification, Receiver Operating Characteristic (ROC) curves, performance curves that explicitly trade off false alarms and missed detections, are often utilized to support decision making. To address performance optimization in this context, we have developed a lower bound for the entire ROC curve that can be expressed in terms of the class-specific strength and correlation of the base classifiers. We present empirical analyses demonstrating the efficacy of these bounds in predicting relative classifier performance. In addition, we specify performance regions of the ROC curve that are naturally delineated by the class-specific strengths of the base classifiers and show that each of these regions can be associated with a unique set of guidelines for performance optimization of binary classifiers within unequal error cost regimes.

  8. Frog sound identification using extended k-nearest neighbor classifier

    Science.gov (United States)

    Mukahar, Nordiana; Affendi Rosdi, Bakhtiar; Athiar Ramli, Dzati; Jaafar, Haryati

    2017-09-01

    Frog sound identification based on vocalization has become important for biological research and environmental monitoring. As a result, different types of feature extraction and classifiers have been employed to evaluate the accuracy of frog sound identification. This paper presents frog sound identification with an Extended k-Nearest Neighbor (EKNN) classifier. The EKNN classifier integrates the nearest neighbors and mutual sharing of neighborhood concepts, with the aim of improving classification performance. It makes a prediction based on which samples are the nearest neighbors of the testing sample and which samples consider the testing sample as one of their nearest neighbors. In order to evaluate the classification performance in frog sound identification, the EKNN classifier is compared with competing classifiers, k-Nearest Neighbor (KNN), Fuzzy k-Nearest Neighbor (FKNN), k-General Nearest Neighbor (KGNN) and Mutual k-Nearest Neighbor (MKNN), on the recorded sounds of 15 frog species obtained in Malaysian forest. The recorded sounds have been segmented using Short Time Energy and Short Time Average Zero Crossing Rate (STE+STAZCR), sinusoidal modeling (SM), manual segmentation and the combination of Energy (E) and Zero Crossing Rate (ZCR) (E+ZCR), while the features are extracted by Mel Frequency Cepstrum Coefficient (MFCC). The experimental results have shown that the EKNN classifier exhibits the best performance in terms of accuracy compared to the competing classifiers, KNN, FKNN, KGNN and MKNN, for all cases.
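
    The two-way neighborhood idea described above (who the test sample's nearest neighbors are, and who would count the test sample among their own nearest neighbors) can be sketched roughly as below. This is a simplified illustration, not the published EKNN rule: the plain majority vote over both neighbor sets, the Euclidean distance, and the function name `eknn_predict` are assumptions.

```python
import math
from collections import Counter


def eknn_predict(train_X, train_y, x, k=3):
    """Vote over (a) the k nearest training points of x and
    (b) training points that would have x among their own k nearest neighbours."""
    n = len(train_X)

    # (a) forward neighbours: the k training points closest to x
    order = sorted(range(n), key=lambda i: math.dist(train_X[i], x))
    forward = order[:k]

    # (b) reverse neighbours: x is at least as close to point i as
    # i's k-th nearest training neighbour is
    reverse = []
    for i in range(n):
        d_others = sorted(
            math.dist(train_X[i], train_X[j]) for j in range(n) if j != i
        )
        kth = d_others[k - 1] if len(d_others) >= k else math.inf
        if math.dist(train_X[i], x) <= kth:
            reverse.append(i)

    votes = Counter(train_y[i] for i in forward + reverse)
    return votes.most_common(1)[0][0]


# Toy 2-D feature vectors standing in for MFCC features (hypothetical data).
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["species_a", "species_a", "species_a",
           "species_b", "species_b", "species_b"]

print(eknn_predict(train_X, train_y, (0.5, 0.5)))
print(eknn_predict(train_X, train_y, (5.5, 5.5)))
```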

  9. Ship localization in Santa Barbara Channel using machine learning classifiers.

    Science.gov (United States)

    Niu, Haiqiang; Ozanich, Emma; Gerstoft, Peter

    2017-11-01

    Machine learning classifiers are shown to outperform conventional matched field processing for a deep water (600 m depth) ocean acoustic-based ship range estimation problem in the Santa Barbara Channel Experiment when limited environmental information is known. Recordings of three different ships of opportunity on a vertical array were used as training and test data for the feed-forward neural network and support vector machine classifiers, demonstrating the feasibility of machine learning methods to locate unseen sources. The classifiers perform well up to 10 km range whereas the conventional matched field processing fails at about 4 km range without accurate environmental information.

  10. Improving sequence segmentation learning by predicting trigrams

    NARCIS (Netherlands)

    van den Bosch, A.; Daelemans, W.; Dagan, I.; Gildea, D.

    2005-01-01

    Symbolic machine-learning classifiers are known to suffer from near-sightedness when performing sequence segmentation (chunking) tasks in natural language processing: without special architectural additions they are oblivious of the decisions they made earlier when making new ones. We introduce a

  11. Classifying hot water chemistry: Application of MULTIVARIATE STATISTICS

    OpenAIRE

    Sumintadireja, Prihadi; Irawan, Dasapta Erwin; Rezky, Yuanno; Gio, Prana Ugiana; Agustin, Anggita

    2016-01-01

    This file is the dataset for the following paper: "Classifying hot water chemistry: Application of MULTIVARIATE STATISTICS". Authors: Prihadi Sumintadireja, Dasapta Erwin Irawan, Yuano Rezky, Prana Ugiana Gio, Anggita Agustin

  12. Robust Combining of Disparate Classifiers Through Order Statistics

    Science.gov (United States)

    Tumer, Kagan; Ghosh, Joydeep

    2001-01-01

    Integrating the outputs of multiple classifiers via combiners or meta-learners has led to substantial improvements in several difficult pattern recognition problems. In this article we investigate a family of combiners based on order statistics, for robust handling of situations where there are large discrepancies in performance of individual classifiers. Based on a mathematical modeling of how the decision boundaries are affected by order statistic combiners, we derive expressions for the reductions in error expected when simple output combination methods based on the median, the maximum and, in general, the ith order statistic are used. Furthermore, we analyze the trim and spread combiners, both based on linear combinations of the ordered classifier outputs, and show that in the presence of uneven classifier performance, they often provide substantial gains over both linear and simple order statistics combiners. Experimental results on both real world data and standard public domain data sets corroborate these findings.
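
    As a rough illustration of order statistic combining, the sketch below sorts the per-class scores across classifiers and then takes the median, the maximum, or a trimmed mean. The function names, the toy scores, and the exact form of the trimmed mean (drop the `trim` lowest and highest scores, average the rest) are assumptions for the example; the article's own trim and spread definitions may differ.

```python
from statistics import median


def combine(outputs, rule="median", trim=1):
    """outputs[m][c] is the score classifier m assigns to class c.
    Returns one combined score per class."""
    n_classes = len(outputs[0])
    combined = []
    for c in range(n_classes):
        scores = sorted(clf[c] for clf in outputs)   # ordered classifier outputs
        if rule == "median":
            combined.append(median(scores))
        elif rule == "max":
            combined.append(scores[-1])
        elif rule == "trim":
            kept = scores[trim:len(scores) - trim]    # assumed trimmed-mean form
            combined.append(sum(kept) / len(kept))
        else:
            raise ValueError(f"unknown rule: {rule}")
    return combined


def predict(outputs, rule="median", trim=1):
    """Index of the class with the highest combined score."""
    combined = combine(outputs, rule, trim)
    return max(range(len(combined)), key=combined.__getitem__)


# Three classifiers roughly agree on class 0; a fourth is badly off (toy scores).
outputs = [
    [0.70, 0.30],
    [0.60, 0.40],
    [0.65, 0.35],
    [0.05, 0.95],
]

print(predict(outputs, "median"))
print(predict(outputs, "trim"))
print(predict(outputs, "max"))
```

    In this toy case the median and trimmed-mean rules ignore the outlying classifier, while the maximum rule follows it, which is the kind of uneven-performance situation the abstract refers to.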

  13. Using Statistical Process Control Methods to Classify Pilot Mental Workloads

    National Research Council Canada - National Science Library

    Kudo, Terence

    2001-01-01

    .... These include cardiac, ocular, respiratory, and brain activity measures. The focus of this effort is to apply statistical process control methodology on different psychophysiological features in an attempt to classify pilot mental workload...

  14. An ensemble classifier to predict track geometry degradation

    International Nuclear Information System (INIS)

    Cárdenas-Gallo, Iván; Sarmiento, Carlos A.; Morales, Gilberto A.; Bolivar, Manuel A.; Akhavan-Tabatabaei, Raha

    2017-01-01

    Railway operations are inherently complex and a source of several problems. In particular, track geometry defects are one of the leading causes of train accidents in the United States. This paper presents a solution approach which entails the construction of an ensemble classifier to forecast the degradation of track geometry. Our classifier is constructed by solving the problem from three different perspectives: deterioration, regression and classification. We considered a different model from each perspective and our results show that using an ensemble method improves the predictive performance. - Highlights: • We present an ensemble classifier to forecast the degradation of track geometry. • Our classifier considers three perspectives: deterioration, regression and classification. • We construct and test three models and our results show that using an ensemble method improves the predictive performance.

  15. A novel statistical method for classifying habitat generalists and specialists

    DEFF Research Database (Denmark)

    Chazdon, Robin L; Chao, Anne; Colwell, Robert K

    2011-01-01

    ...: (1) generalist; (2) habitat A specialist; (3) habitat B specialist; and (4) too rare to classify with confidence. We illustrate our multinomial classification method using two contrasting data sets: (1) bird abundance in woodland and heath habitats in southeastern Australia and (2) tree abundance in second-growth (SG) and old-growth (OG) rain forests in the Caribbean lowlands of northeastern Costa Rica. We evaluate the multinomial model in detail for the tree data set. Our results for birds were highly concordant with a previous nonstatistical classification, but our method classified a higher fraction (57.7%) of bird species with statistical confidence. Based on a conservative specialization threshold and adjustment for multiple comparisons, 64.4% of tree species in the full sample were too rare to classify with confidence. Among the species classified, OG specialists constituted the largest...

  16. 6 CFR 7.23 - Emergency release of classified information.

    Science.gov (United States)

    2010-01-01

    ... Classified Information Non-disclosure Form. In emergency situations requiring immediate verbal release of... information through approved communication channels by the most secure and expeditious method possible, or by...

  17. Constraint Satisfaction Inference : Non-probabilistic Global Inference for Sequence Labelling

    NARCIS (Netherlands)

    Canisius, S.V.M.; van den Bosch, A.; Daelemans, W.; Basili, R.; Moschitti, A.

    2006-01-01

    We present a new method for performing sequence labelling based on the idea of using a machine-learning classifier to generate several possible output sequences, and then applying an inference procedure to select the best sequence among those. Most sequence labelling methods following a similar

  18. DECISION TREE CLASSIFIERS FOR STAR/GALAXY SEPARATION

    International Nuclear Information System (INIS)

    Vasconcellos, E. C.; Ruiz, R. S. R.; De Carvalho, R. R.; Capelato, H. V.; Gal, R. R.; LaBarbera, F. L.; Frago Campos Velho, H.; Trevisan, M.

    2011-01-01

    We study the star/galaxy classification efficiency of 13 different decision tree algorithms applied to photometric objects in the Sloan Digital Sky Survey Data Release Seven (SDSS-DR7). Each algorithm is defined by a set of parameters which, when varied, produce different final classification trees. We extensively explore the parameter space of each algorithm, using the set of 884,126 SDSS objects with spectroscopic data as the training set. The efficiency of star-galaxy separation is measured using the completeness function. We find that the Functional Tree algorithm (FT) yields the best results as measured by the mean completeness in two magnitude intervals: 14 ≤ r ≤ 21 (85.2%) and r ≥ 19 (82.1%). We compare the performance of the tree generated with the optimal FT configuration to the classifications provided by the SDSS parametric classifier, 2DPHOT, and Ball et al. We find that our FT classifier is comparable to or better in completeness over the full magnitude range 15 ≤ r ≤ 21, with much lower contamination than all but the Ball et al. classifier. At the faintest magnitudes (r > 19), our classifier is the only one that maintains high completeness (>80%) while simultaneously achieving low contamination (∼2.5%). We also examine the SDSS parametric classifier (psfMag - modelMag) to see if the dividing line between stars and galaxies can be adjusted to improve the classifier. We find that currently stars in close pairs are often misclassified as galaxies, and suggest a new cut to improve the classifier. Finally, we apply our FT classifier to separate stars from galaxies in the full set of 69,545,326 SDSS photometric objects in the magnitude range 14 ≤ r ≤ 21.
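
    The two figures of merit quoted above can be computed from true and predicted labels as in the sketch below. It assumes the usual definitions (completeness: fraction of true galaxies recovered as galaxies; contamination: fraction of objects labelled as galaxies that are actually stars); the function name and the toy labels are hypothetical, and the paper's completeness function is evaluated per magnitude bin, which is not shown here.

```python
def completeness_contamination(true_labels, predicted_labels, target="galaxy"):
    """Return (completeness, contamination) for the target class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == target and p == target for t, p in pairs)  # galaxies kept
    fn = sum(t == target and p != target for t, p in pairs)  # galaxies lost
    fp = sum(t != target and p == target for t, p in pairs)  # stars let in
    completeness = tp / (tp + fn)
    contamination = fp / (tp + fp)
    return completeness, contamination


# Toy labels (hypothetical, not SDSS data).
true_labels = ["galaxy", "galaxy", "galaxy", "galaxy", "star", "star"]
predicted = ["galaxy", "galaxy", "galaxy", "star", "galaxy", "star"]

comp, cont = completeness_contamination(true_labels, predicted)
print(f"completeness={comp:.2f} contamination={cont:.2f}")
```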

  19. Local-global classifier fusion for screening chest radiographs

    Science.gov (United States)

    Ding, Meng; Antani, Sameer; Jaeger, Stefan; Xue, Zhiyun; Candemir, Sema; Kohli, Marc; Thoma, George

    2017-03-01

    Tuberculosis (TB) is a severe comorbidity of HIV and chest x-ray (CXR) analysis is a necessary step in screening for the infective disease. Automatic analysis of digital CXR images for detecting pulmonary abnormalities is critical for population screening, especially in medical resource constrained developing regions. In this article, we describe steps that improve previously reported performance of NLM's CXR screening algorithms and help advance the state of the art in the field. We propose a local-global classifier fusion method where two complementary classification systems are combined. The local classifier focuses on subtle and partial presentation of the disease leveraging information in radiology reports that roughly indicates locations of the abnormalities. In addition, the global classifier models the dominant spatial structure in the gestalt image using GIST descriptor for the semantic differentiation. Finally, the two complementary classifiers are combined using linear fusion, where the weight of each decision is calculated by the confidence probabilities from the two classifiers. We evaluated our method on three datasets in terms of the area under the Receiver Operating Characteristic (ROC) curve, sensitivity, specificity and accuracy. The evaluation demonstrates the superiority of our proposed local-global fusion method over any single classifier.
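
    Confidence-weighted linear fusion of two classifier outputs can be sketched as follows. The abstract does not give the weighting formula, so the confidence measure used here (distance of each probability from 0.5, normalised to sum to one) and the name `fuse` are assumptions made purely for illustration.

```python
def fuse(p_local, p_global):
    """Linearly fuse two abnormality probabilities, weighting each
    classifier by how far its output is from the undecided value 0.5."""
    conf_local = abs(p_local - 0.5) * 2      # 0 = undecided, 1 = fully confident
    conf_global = abs(p_global - 0.5) * 2
    total = conf_local + conf_global
    if total == 0:                           # both undecided: fall back to equal weights
        w_local = w_global = 0.5
    else:
        w_local = conf_local / total
        w_global = conf_global / total
    return w_local * p_local + w_global * p_global


# The local classifier is confident (0.9), the global one is nearly undecided (0.4).
fused = fuse(0.9, 0.4)
print(round(fused, 2))
```

    With these toy inputs the fused score stays close to the more confident classifier, which is the behaviour a confidence-based weight is meant to produce.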

  20. Verification of classified fissile material using unclassified attributes

    International Nuclear Information System (INIS)

    Nicholas, N.J.; Fearey, B.L.; Puckett, J.M.; Tape, J.W.

    1998-01-01

    This paper reports on the most recent efforts of US technical experts to explore verification by IAEA of unclassified attributes of classified excess fissile material. Two propositions are discussed: (1) that multiple unclassified attributes could be declared by the host nation and then verified (and reverified) by the IAEA in order to provide confidence in that declaration of a classified (or unclassified) inventory while protecting classified or sensitive information; and (2) that attributes could be measured, remeasured, or monitored to provide continuity of knowledge in a nonintrusive and unclassified manner. They believe attributes should relate to characteristics of excess weapons materials and should be verifiable and authenticatable with methods usable by IAEA inspectors. Further, attributes (along with the methods to measure them) must not reveal any classified information. The approach that the authors have taken is as follows: (1) assume certain attributes of classified excess material, (2) identify passive signatures, (3) determine range of applicable measurement physics, (4) develop a set of criteria to assess and select measurement technologies, (5) select existing instrumentation for proof-of-principle measurements and demonstration, and (6) develop and design information barriers to protect classified information. While the attribute verification concepts and measurements discussed in this paper appear promising, neither the attribute verification approach nor the measurement technologies have been fully developed, tested, and evaluated

  1. A cardiorespiratory classifier of voluntary and involuntary electrodermal activity

    Directory of Open Access Journals (Sweden)

    Sejdic Ervin

    2010-02-01

    Full Text Available. Abstract. Background: Electrodermal reactions (EDRs) can be attributed to many origins, including spontaneous fluctuations of electrodermal activity (EDA) and stimuli such as deep inspirations, voluntary mental activity and startling events. In fields that use EDA as a measure of psychophysiological state, the fact that EDRs may be elicited from many different stimuli is often ignored. This study attempts to classify observed EDRs as voluntary (i.e., generated from intentional respiratory or mental activity) or involuntary (i.e., generated from startling events or spontaneous electrodermal fluctuations). Methods: Eight able-bodied participants were subjected to conditions that would cause a change in EDA: music imagery, startling noises, and deep inspirations. A user-centered cardiorespiratory classifier consisting of (1) an EDR detector, (2) a respiratory filter and (3) a cardiorespiratory filter was developed to automatically detect a participant's EDRs and to classify the origin of their stimulation as voluntary or involuntary. Results: Detected EDRs were classified with a positive predictive value of 78%, a negative predictive value of 81% and an overall accuracy of 78%. Without the classifier, EDRs could only be correctly attributed as voluntary or involuntary with an accuracy of 50%. Conclusions: The proposed classifier may enable investigators to form more accurate interpretations of electrodermal activity as a measure of an individual's psychophysiological state.

  2. Balanced sensitivity functions for tuning multi-dimensional Bayesian network classifiers

    NARCIS (Netherlands)

    Bolt, J.H.; van der Gaag, L.C.

    Multi-dimensional Bayesian network classifiers are Bayesian networks of restricted topological structure, which are tailored to classifying data instances into multiple dimensions. Like more traditional classifiers, multi-dimensional classifiers are typically learned from data and may include

  3. Nonparametric, Coupled, Bayesian, Dictionary, and Classifier Learning for Hyperspectral Classification.

    Science.gov (United States)

    Akhtar, Naveed; Mian, Ajmal

    2017-10-03

    We present a principled approach to learn a discriminative dictionary along with a linear classifier for hyperspectral classification. Our approach places Gaussian Process priors over the dictionary to account for the relative smoothness of the natural spectra, whereas the classifier parameters are sampled from multivariate Gaussians. We employ two Beta-Bernoulli processes to jointly infer the dictionary and the classifier. These processes are coupled under the same sets of Bernoulli distributions. In our approach, these distributions signify the frequency of the dictionary atom usage in representing class-specific training spectra, which also makes the dictionary discriminative. Due to the coupling between the dictionary and the classifier, the popularity of the atoms for representing different classes gets encoded into the classifier. This helps in predicting the class labels of test spectra that are first represented over the dictionary by solving a simultaneous sparse optimization problem. The labels of the spectra are predicted by feeding the resulting representations to the classifier. Our approach exploits the nonparametric Bayesian framework to automatically infer the dictionary size, the key parameter in discriminative dictionary learning. Moreover, it also has the desirable property of adaptively learning the association between the dictionary atoms and the class labels by itself. We use Gibbs sampling to infer the posterior probability distributions over the dictionary and the classifier under the proposed model, for which we derive analytical expressions. To establish the effectiveness of our approach, we test it on benchmark hyperspectral images. The classification performance is compared with the state-of-the-art dictionary learning-based classification methods.

  4. Classifying a smoker scale in adult daily and nondaily smokers.

    Science.gov (United States)

    Pulvers, Kim; Scheuermann, Taneisha S; Romero, Devan R; Basora, Brittany; Luo, Xianghua; Ahluwalia, Jasjit S

    2014-05-01

    Smoker identity, or the strength of beliefs about oneself as a smoker, is a robust marker of smoking behavior. However, many nondaily smokers do not identify as smokers, underestimating their risk for tobacco-related disease and resulting in missed intervention opportunities. Assessing underlying beliefs about characteristics used to classify smokers may help explain the discrepancy between smoking behavior and smoker identity. This study examines the factor structure, reliability, and validity of the Classifying a Smoker scale among a racially diverse sample of adult smokers. A cross-sectional survey was administered through an online panel survey service to 2,376 current smokers who were at least 25 years of age. The sample was stratified to obtain equal numbers of 3 racial/ethnic groups (African American, Latino, and White) across smoking level (nondaily and daily smoking). The Classifying a Smoker scale displayed a single factor structure and excellent internal consistency (α = .91). Classifying a Smoker scores significantly increased at each level of smoking, F(3,2375) = 23.68, p smoker identity, stronger dependence on cigarettes, greater health risk perceptions, more smoking friends, and were more likely to carry cigarettes. Classifying a Smoker scores explained unique variance in smoking variables above and beyond that explained by smoker identity. The present study supports the use of the Classifying a Smoker scale among diverse, experienced smokers. Stronger endorsement of characteristics used to classify a smoker (i.e., stricter criteria) was positively associated with heavier smoking and related characteristics. Prospective studies are needed to inform prevention and treatment efforts.

  5. Representative Vector Machines: A Unified Framework for Classical Classifiers.

    Science.gov (United States)

    Gui, Jie; Liu, Tongliang; Tao, Dacheng; Sun, Zhenan; Tan, Tieniu

    2016-08-01

    Classifier design is a fundamental problem in pattern recognition. A variety of pattern classification methods such as the nearest neighbor (NN) classifier, support vector machine (SVM), and sparse representation-based classification (SRC) have been proposed in the literature. These typical and widely used classifiers were originally developed from different theory or application motivations and they are conventionally treated as independent and specific solutions for pattern classification. This paper proposes a novel pattern classification framework, namely, representative vector machines (or RVMs for short). The basic idea of RVMs is to assign the class label of a test example according to its nearest representative vector. The contributions of RVMs are twofold. On one hand, the proposed RVMs establish a unified framework of classical classifiers because NN, SVM, and SRC can be interpreted as the special cases of RVMs with different definitions of representative vectors. Thus, the underlying relationship among a number of classical classifiers is revealed for better understanding of pattern classification. On the other hand, novel and advanced classifiers are inspired in the framework of RVMs. For example, a robust pattern classification method called discriminant vector machine (DVM) is motivated from RVMs. Given a test example, DVM first finds its k -NNs and then performs classification based on the robust M-estimator and manifold regularization. Extensive experimental evaluations on a variety of visual recognition tasks such as face recognition (Yale and face recognition grand challenge databases), object categorization (Caltech-101 dataset), and action recognition (Action Similarity LAbeliNg) demonstrate the advantages of DVM over other classifiers.

  6. Training Classifiers with Shadow Features for Sensor-Based Human Activity Recognition.

    Science.gov (United States)

    Fong, Simon; Song, Wei; Cho, Kyungeun; Wong, Raymond; Wong, Kelvin K L

    2017-02-27

    In this paper, a novel training/testing process for building/using a classification model based on human activity recognition (HAR) is proposed. Traditionally, HAR has been accomplished by a classifier that learns the activities of a person by training with skeletal data obtained from a motion sensor, such as Microsoft Kinect. These skeletal data are the spatial coordinates (x, y, z) of different parts of the human body. The numeric information forms time series, temporal records of movement sequences that can be used for training a classifier. In addition to the spatial features that describe current positions in the skeletal data, new features called 'shadow features' are used to improve the supervised learning efficacy of the classifier. Shadow features are inferred from the dynamics of body movements, and thereby modelling the underlying momentum of the performed activities. They provide extra dimensions of information for characterising activities in the classification process, and thereby significantly improve the classification accuracy. Two cases of HAR are tested using a classification model trained with shadow features: one is by using wearable sensor and the other is by a Kinect-based remote sensor. Our experiments can demonstrate the advantages of the new method, which will have an impact on human activity detection research.

  7. Training Classifiers with Shadow Features for Sensor-Based Human Activity Recognition

    Directory of Open Access Journals (Sweden)

    Simon Fong

    2017-02-01

    Full Text Available. In this paper, a novel training/testing process for building/using a classification model based on human activity recognition (HAR) is proposed. Traditionally, HAR has been accomplished by a classifier that learns the activities of a person by training with skeletal data obtained from a motion sensor, such as Microsoft Kinect. These skeletal data are the spatial coordinates (x, y, z) of different parts of the human body. The numeric information forms time series, temporal records of movement sequences that can be used for training a classifier. In addition to the spatial features that describe current positions in the skeletal data, new features called ‘shadow features’ are used to improve the supervised learning efficacy of the classifier. Shadow features are inferred from the dynamics of body movements, and thereby modelling the underlying momentum of the performed activities. They provide extra dimensions of information for characterising activities in the classification process, and thereby significantly improve the classification accuracy. Two cases of HAR are tested using a classification model trained with shadow features: one is by using wearable sensor and the other is by a Kinect-based remote sensor. Our experiments can demonstrate the advantages of the new method, which will have an impact on human activity detection research.

  8. Current Directional Protection of Series Compensated Line Using Intelligent Classifier

    Directory of Open Access Journals (Sweden)

    M. Mollanezhad Heydarabadi

    2016-12-01

    Full Text Available. A current inversion condition leads to incorrect operation of a current-based directional relay in a power system with a series compensation device. This paper suggests applying an intelligent system to fault direction classification. A new current directional protection scheme based on an intelligent classifier is proposed for the series compensated line. The proposed scheme feeds the classifier with only half a cycle of pre-fault and post-fault current samples taken at the relay location. A large number of forward and backward fault simulations under different system conditions on a transmission line with a fixed series capacitor are carried out using PSCAD/EMTDC software. The applicability of the decision tree (DT), probabilistic neural network (PNN) and support vector machine (SVM) is investigated using simulated data under different system conditions. The performance comparison of the classifiers indicates that the SVM is the most suitable classifier for fault direction discrimination. Backward faults can be accurately distinguished from forward faults even under current inversion, without the need to detect the current inversion condition.

  9. Neural network classifier of attacks in IP telephony

    Science.gov (United States)

    Safarik, Jakub; Voznak, Miroslav; Mehic, Miralem; Partila, Pavol; Mikulec, Martin

    2014-05-01

    Various types of monitoring mechanisms allow us to detect and monitor the behavior of attackers in VoIP networks. Analysis of detected malicious traffic is crucial for further investigation and for hardening the network. This analysis is typically based on statistical methods; this article presents a solution based on a neural network. The proposed algorithm is used as a classifier of attacks in a distributed monitoring network of independent honeypot probes. Information about attacks on these honeypots is collected on a centralized server and then classified. This classification is based on different mechanisms. One of them is based on the multilayer perceptron neural network. The article describes the inner structure of the neural network used and its implementation. The learning set for this neural network is based on real attack data collected from an IP telephony honeypot called Dionaea. We prepare the learning set from real attack data after collecting, cleaning and aggregating this information. After proper learning, the neural network is capable of classifying the 6 most commonly used types of VoIP attacks. Using a neural network classifier yields more accurate attack classification in a distributed system of honeypots. With this approach it is possible to detect malicious behavior in different parts of networks that are logically or geographically divided, and to use the information from one network to harden security in other networks. The centralized server for the distributed set of nodes serves not only as a collector and classifier of attack data, but also as a mechanism for generating precautionary steps against attacks.

  10. Use of information barriers to protect classified information

    International Nuclear Information System (INIS)

    MacArthur, D.; Johnson, M.W.; Nicholas, N.J.; Whiteson, R.

    1998-01-01

    This paper discusses the detailed requirements for an information barrier (IB) for use with verification systems that employ intrusive measurement technologies. The IB would protect classified information in a bilateral or multilateral inspection of classified fissile material. Such a barrier must strike a balance between providing the inspecting party the confidence necessary to accept the measurement while protecting the inspected party's classified information. The authors discuss the structure required of an IB as well as the implications of the IB on detector system maintenance. A defense-in-depth approach is proposed which would provide assurance to the inspected party that all sensitive information is protected and to the inspecting party that the measurements are being performed as expected. The barrier could include elements of physical protection (such as locks, surveillance systems, and tamper indicators), hardening of key hardware components, assurance of capabilities and limitations of hardware and software systems, administrative controls, validation and verification of the systems, and error detection and resolution. Finally, an unclassified interface could be used to display and, possibly, record measurement results. The introduction of an IB into an analysis system may result in many otherwise innocuous components (detectors, analyzers, etc.) becoming classified and unavailable for routine maintenance by uncleared personnel. System maintenance and updating will be significantly simplified if the classification status of as many components as possible can be made reversible (i.e. the component can become unclassified following the removal of classified objects)

  11. Detection of microaneurysms in retinal images using an ensemble classifier

    Directory of Open Access Journals (Sweden)

    M.M. Habib

    2017-01-01

    This paper introduces, and reports on the performance of, a novel combination of algorithms for automated microaneurysm (MA) detection in retinal images. The presence of MAs in retinal images is a pathognomonic sign of Diabetic Retinopathy (DR), which is one of the leading causes of blindness amongst the working age population. An extensive survey of the literature is presented and current techniques in the field are summarised. The proposed technique first detects an initial set of candidates using a Gaussian Matched Filter and then classifies this set to reduce the number of false positives. A Tree Ensemble classifier is used with a set of 70 features (the most common features in the literature). A new set of 32 MA groundtruth images (with a total of 256 labelled MAs) based on images from the MESSIDOR dataset is introduced as a public dataset for benchmarking MA detection algorithms. We evaluate our algorithm on this dataset as well as another public dataset (DIARETDB1 v2.1) and compare it against the best available alternative. Results show that the proposed classifier is superior in terms of eliminating false positive MA detection from the initial set of candidates. The proposed method achieves an ROC score of 0.415 compared to 0.2636 achieved by the best available technique. Furthermore, results show that the classifier model maintains consistent performance across datasets, illustrating the generalisability of the classifier and that overfitting does not occur.

  12. Generalization in the XCSF classifier system: analysis, improvement, and extension.

    Science.gov (United States)

    Lanzi, Pier Luca; Loiacono, Daniele; Wilson, Stewart W; Goldberg, David E

    2007-01-01

    We analyze generalization in XCSF and introduce three improvements. We begin by showing that the types of generalizations evolved by XCSF can be influenced by the input range. To explain these results we present a theoretical analysis of the convergence of classifier weights in XCSF which highlights a broader issue. In XCSF, because of the mathematical properties of the Widrow-Hoff update, the convergence of classifier weights in a given subspace can be slow when the spread of the eigenvalues of the autocorrelation matrix associated with each classifier is large. As a major consequence, the system's accuracy pressure may act before classifier weights are adequately updated, so that XCSF may evolve piecewise constant approximations, instead of the intended, and more efficient, piecewise linear ones. We propose three different ways to update classifier weights in XCSF so as to increase the generalization capabilities of XCSF: one based on a condition-based normalization of the inputs, one based on linear least squares, and one based on the recursive version of linear least squares. Through a series of experiments we show that while all three approaches significantly improve XCSF, least squares approaches appear to be best performing and most robust. Finally we show how XCSF can be extended to include polynomial approximations.
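
    The convergence issue described in this abstract can be illustrated with a minimal sketch. It assumes a single classifier with a constant input term x0 = 1, a normalized Widrow-Hoff (delta-rule) update, and a standard recursive least squares update; the input range [10, 11], the target function, the learning rate `eta` and the initialisation constant `delta` are illustrative choices, not values from the paper. With inputs far from the origin, the eigenvalue spread of the autocorrelation matrix is large, so the Widrow-Hoff weights settle on a nearly constant prediction while the least squares weights recover the line.

```python
def features(x):
    """Input vector with a constant term x0 = 1 (an assumed, illustrative value)."""
    return (1.0, x)


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def widrow_hoff(samples, eta=0.2):
    """Normalized Widrow-Hoff (delta rule) update of two linear weights."""
    w = [0.0, 0.0]
    for x, y in samples:
        phi = features(x)
        err = y - dot(w, phi)
        norm = dot(phi, phi)
        w = [w[i] + eta * err * phi[i] / norm for i in range(2)]
    return w


def recursive_least_squares(samples, delta=1000.0):
    """Recursive least squares with P initialised to delta * I."""
    w = [0.0, 0.0]
    P = [[delta, 0.0], [0.0, delta]]
    for x, y in samples:
        phi = features(x)
        P_phi = [dot(P[0], phi), dot(P[1], phi)]
        denom = 1.0 + dot(phi, P_phi)
        gain = [P_phi[0] / denom, P_phi[1] / denom]
        err = y - dot(w, phi)
        w = [w[0] + gain[0] * err, w[1] + gain[1] * err]
        # P <- P - gain * (phi^T P); P is symmetric, so phi^T P equals P_phi^T.
        P = [[P[i][j] - gain[i] * P_phi[j] for j in range(2)] for i in range(2)]
    return w


def mean_abs_error(w, samples):
    return sum(abs(y - dot(w, features(x))) for x, y in samples) / len(samples)


# Inputs in [10, 11]: far from the origin, so the eigenvalue spread is large.
xs = [10.0 + (i % 11) / 10.0 for i in range(330)]
samples = [(x, 2.0 * x - 20.0) for x in xs]  # linear target, slope 2

w_lms = widrow_hoff(samples)
w_rls = recursive_least_squares(samples)
lms_mae = mean_abs_error(w_lms, samples)
rls_mae = mean_abs_error(w_rls, samples)
print("Widrow-Hoff slope:", w_lms[1], "MAE:", lms_mae)
print("RLS slope:", w_rls[1], "MAE:", rls_mae)
```

    In this setting the Widrow-Hoff slope stays far below the true value of 2 after 330 updates, which is the piecewise constant behaviour the abstract describes, whereas the recursive least squares fit is close to the target line.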

  13. Dynamic cluster generation for a fuzzy classifier with ellipsoidal regions.

    Science.gov (United States)

    Abe, S

    1998-01-01

    In this paper, we discuss a fuzzy classifier with ellipsoidal regions that dynamically generates clusters. First, for the data belonging to a class we define a fuzzy rule with an ellipsoidal region. Namely, using the training data for each class, we calculate the center and the covariance matrix of the ellipsoidal region for the class. Then we tune the fuzzy rules, i.e., the slopes of the membership functions, successively until there is no improvement in the recognition rate of the training data. Then if the number of the data belonging to a class that are misclassified into another class exceeds a prescribed number, we define a new cluster to which those data belong and the associated fuzzy rule. Then we tune the newly defined fuzzy rules in the similar way as stated above, fixing the already obtained fuzzy rules. We iterate generation of clusters and tuning of the newly generated fuzzy rules until the number of the data belonging to a class that are misclassified into another class does not exceed the prescribed number. We evaluate our method using thyroid data, Japanese Hiragana data of vehicle license plates, and blood cell data. By dynamic cluster generation, the generalization ability of the classifier is improved and the recognition rate of the fuzzy classifier for the test data is the best among the neural network classifiers and other fuzzy classifiers if there are no discrete input variables.
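
    The first step of the method, one fuzzy rule per class defined by the centre and covariance matrix of that class's training data, can be sketched as follows. This is a two-dimensional illustration under stated assumptions: the membership function exp(-d²/alpha), with d² the covariance-weighted distance to the centre and `alpha` standing in for the tunable slope, is an assumed form, and the toy data and class names are hypothetical. Slope tuning and dynamic cluster generation are not shown.

```python
import math


def fit_ellipsoid(points):
    """Centre and 2x2 covariance (sxx, sxy, syy) of a list of (x, y) points."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    sxx = sum((p[0] - cx) ** 2 for p in points) / n
    syy = sum((p[1] - cy) ** 2 for p in points) / n
    sxy = sum((p[0] - cx) * (p[1] - cy) for p in points) / n
    return (cx, cy), (sxx, sxy, syy)


def membership(x, centre, cov, alpha=1.0):
    """Assumed membership form: exp(-d^2 / alpha), d^2 weighted by the inverse covariance.

    A singular covariance matrix (det == 0) is not handled in this sketch.
    """
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = x[0] - centre[0], x[1] - centre[1]
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.exp(-d2 / alpha)


def classify(x, rules):
    """Return the class whose ellipsoidal rule gives the highest membership."""
    return max(rules, key=lambda c: membership(x, *rules[c]))


# Hypothetical training data for two classes.
training = {
    "A": [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)],
    "B": [(4.0, 4.0), (5.0, 4.0), (4.0, 5.0), (5.0, 5.0)],
}
# One rule per class: (centre, covariance, alpha).
rules = {c: fit_ellipsoid(pts) + (1.0,) for c, pts in training.items()}

print(classify((0.8, 0.6), rules))
print(classify((4.2, 4.9), rules))
```

    Tuning in this sketch would amount to adjusting each rule's `alpha`; generating a new cluster would add another (centre, covariance, alpha) entry fitted to the misclassified points of a class.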

  14. SpectraClassifier 1.0: a user friendly, automated MRS-based classifier-development system

    Directory of Open Access Journals (Sweden)

    Julià-Sapé Margarida

    2010-02-01

    Background: SpectraClassifier (SC) is a Java solution for designing and implementing Magnetic Resonance Spectroscopy (MRS)-based classifiers. The main goal of SC is to allow users with minimum background knowledge of multivariate statistics to perform a fully automated pattern recognition analysis. SC incorporates feature selection (greedy stepwise approach, either forward or backward) and feature extraction (PCA). Fisher Linear Discriminant Analysis is the method of choice for classification. Classifier evaluation is performed through various methods: display of the confusion matrix of the training and testing datasets; K-fold cross-validation, leave-one-out and bootstrapping; and Receiver Operating Characteristic (ROC) curves. Results: SC is composed of the following modules: Classifier design, Data exploration, Data visualisation, Classifier evaluation, Reports, and Classifier history. It is able to read low resolution in-vivo MRS (single-voxel and multi-voxel) and high resolution tissue MRS (HRMAS), processed with existing tools (jMRUI, INTERPRET, 3DiCSI or TopSpin). In addition, to facilitate exchanging data between applications, a standard format capable of storing all the information needed for a dataset was developed. Each functionality of SC has been specifically validated with real data for the purposes of bug-testing and methods validation. Data from the INTERPRET project was used. Conclusions: SC is user-friendly software designed to fulfil the needs of potential users in the MRS community. It accepts all kinds of pre-processed MRS data types and classifies them semi-automatically, allowing spectroscopists to concentrate on interpretation of results with the use of its visualisation tools.

  15. A History of Classified Activities at Oak Ridge National Laboratory

    Energy Technology Data Exchange (ETDEWEB)

    Quist, A.S.

    2001-01-30

    The facilities that became Oak Ridge National Laboratory (ORNL) were created in 1943 during the United States' super-secret World War II project to construct an atomic bomb (the Manhattan Project). During World War II and for several years thereafter, essentially all ORNL activities were classified. Now, in 2000, essentially all ORNL activities are unclassified. The major purpose of this report is to provide a brief history of ORNL's major classified activities from 1943 until the present (September 2000). This report is expected to be useful to the ORNL Classification Officer and to ORNL's Authorized Derivative Classifiers and Authorized Derivative Declassifiers in their classification review of ORNL documents, especially those documents that date from the 1940s and 1950s.

  16. COMPARISON OF SVM AND FUZZY CLASSIFIER FOR AN INDIAN SCRIPT

    Directory of Open Access Journals (Sweden)

    M. J. Baheti

    2012-01-01

    With the advent of the technological era, conversion of scanned documents (handwritten or printed) into machine-editable format has attracted many researchers. This paper deals with the problem of recognizing Gujarati handwritten numerals. Gujarati numeral recognition requires some specific preprocessing steps. For preprocessing, digitization, segmentation, normalization and thinning are performed, assuming that the image has almost no noise. Further, an affine invariant moments based model is used for feature extraction, and finally Support Vector Machine (SVM) and Fuzzy classifiers are used for numeral classification. A comparison of the SVM and Fuzzy classifiers shows that SVM produced better results than the Fuzzy classifier.

  17. Optimal threshold estimation for binary classifiers using game theory.

    Science.gov (United States)

    Sanchez, Ignacio Enrique

    2016-01-01

    Many bioinformatics algorithms can be understood as binary classifiers. They are usually compared using the area under the receiver operating characteristic (ROC) curve. On the other hand, choosing the best threshold for practical use is a complex task, due to uncertain and context-dependent skews in the abundance of positives in nature and in the yields/costs for correct/incorrect classification. We argue that considering a classifier as a player in a zero-sum game allows us to use the minimax principle from game theory to determine the optimal operating point. The proposed classifier threshold corresponds to the intersection between the ROC curve and the descending diagonal in ROC space and yields a minimax accuracy of 1-FPR. Our proposal can be readily implemented in practice, and reveals that the empirical condition for threshold estimation of "specificity equals sensitivity" maximizes robustness against uncertainties in the abundance of positives in nature and classification costs.
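
    The threshold rule stated in this abstract can be sketched on an empirical ROC curve. The sketch below assumes scores where higher means "more positive" and picks the cutoff whose ROC point lies closest to the descending diagonal TPR = 1 - FPR, i.e. where sensitivity is closest to specificity. The function names and the toy scores and labels are hypothetical; with finite data the curve may not hit the diagonal exactly, so the nearest point is used.

```python
def roc_points(scores, labels):
    """(threshold, FPR, TPR) for each distinct score; predict positive if score >= threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((t, fp / neg, tp / pos))
    return points


def minimax_threshold(scores, labels):
    """ROC point closest to the descending diagonal TPR = 1 - FPR."""
    return min(roc_points(scores, labels), key=lambda p: abs(p[2] - (1.0 - p[1])))


# Hypothetical classifier scores and true labels (1 = positive, 0 = negative).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

threshold, fpr, tpr = minimax_threshold(scores, labels)
print(threshold, fpr, tpr)  # 0.6 0.25 0.75
```

    On this toy data the selected cutoff gives sensitivity and specificity both equal to 0.75, which matches the abstract's accuracy of 1-FPR at the chosen operating point.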

  18. Statistical text classifier to detect specific type of medical incidents.

    Science.gov (United States)

    Wong, Zoie Shui-Yee; Akiyama, Masanori

    2013-01-01

    WHO Patient Safety has focused on increasing the coherence and expressiveness of patient safety classification with the foundation of the International Classification for Patient Safety (ICPS). Text classification and statistical approaches have been shown to be successful in identifying safety problems in the aviation industry using incident text information. It has been challenging to comprehend the taxonomy of medical incidents in a structured manner. Independent reporting mechanisms for patient safety incidents have been established in the UK, Canada, Australia, Japan, Hong Kong etc. This research demonstrates the potential to construct statistical text classifiers to detect specific types of medical incidents using incident text data. An illustrative example for classifying look-alike sound-alike (LASA) medication incidents using structured text from 227 advisories related to medication errors from Global Patient Safety Alerts (GPSA) is shown in this poster presentation. The classifier was built using a logistic regression model. The ROC curve and the AUC value indicated that this is a satisfactory model.

  19. A Topic Model Approach to Representing and Classifying Football Plays

    KAUST Repository

    Varadarajan, Jagannadan

    2013-09-09

    We address the problem of modeling and classifying American Football offense teams’ plays in video, a challenging example of group activity analysis. Automatic play classification will allow coaches to infer patterns and tendencies of opponents more efficiently, resulting in better strategy planning in a game. We define a football play as a unique combination of player trajectories. To this end, we develop a framework that uses player trajectories as inputs to MedLDA, a supervised topic model. The joint maximization of both likelihood and inter-class margins of MedLDA in learning the topics allows us to learn semantically meaningful play type templates, as well as classify different play types with 70% average accuracy. Furthermore, this method is extended to analyze individual player roles in classifying each play type. We validate our method on a large dataset comprising 271 play clips from real-world football games, which will be made publicly available for future comparisons.

  20. Defending Malicious Script Attacks Using Machine Learning Classifiers

    Directory of Open Access Journals (Sweden)

    Nayeem Khan

    2017-01-01

    Web applications have become a primary target for cyber criminals, who inject malware, especially JavaScript, to perform malicious activities for impersonation. It is therefore imperative to detect such malicious code in real time before any malicious activity is performed. This study proposes an efficient method of detecting previously unknown malicious JavaScript using an interceptor at the client side by classifying the key features of the malicious code. A feature subset was obtained by using a wrapper method for dimensionality reduction. Supervised machine learning classifiers were used on the dataset to achieve high accuracy. Experimental results show that our method can efficiently distinguish malicious code from benign code with promising results.

  1. A systems biology-based classifier for hepatocellular carcinoma diagnosis.

    Directory of Open Access Journals (Sweden)

    Yanqiong Zhang

    AIM: The diagnosis of hepatocellular carcinoma (HCC) in the early stage is crucial to the application of curative treatments, which are the only hope for increasing the life expectancy of patients. Recently, several large-scale studies have shed light on this problem through analysis of gene expression profiles to identify markers correlated with HCC progression. However, those marker sets shared few genes in common and were poorly validated using independent data. Therefore, we developed a systems biology based classifier by combining the differential gene expression with topological features of human protein interaction networks to enhance the ability of HCC diagnosis. METHODS AND RESULTS: In the Oncomine platform, genes differentially expressed in HCC tissues relative to their corresponding normal tissues were filtered by a corrected Q value cut-off and Concept filters. The identified genes that are common to different microarray datasets were chosen as the candidate markers. Then, their networks were analyzed by GeneGO Meta-Core software and the hub genes were chosen. After that, an HCC diagnostic classifier was constructed by Partial Least Squares modeling based on the microarray gene expression data of the hub genes. Validations of diagnostic performance showed that this classifier had high predictive accuracy (85.88∼92.71%) and area under the ROC curve (approximating 1.0), and that the network topological features integrated into this classifier contribute greatly to improving the predictive performance. Furthermore, it has been demonstrated that this modeling strategy is applicable not only to HCC, but also to other cancers. CONCLUSION: Our analysis suggests that the systems biology-based classifier that combines the differential gene expression and topological features of the human protein interaction network may enhance the diagnostic performance of the HCC classifier.

  2. Implications of physical symmetries in adaptive image classifiers

    DEFF Research Database (Denmark)

    Sams, Thomas; Hansen, Jonas Lundbek

    2000-01-01

    It is demonstrated that rotational invariance and reflection symmetry of image classifiers lead to a reduction in the number of free parameters in the classifier. When used in adaptive detectors, e.g. neural networks, this may be used to decrease the number of training samples necessary to learn...... a given classification task, or to improve generalization of the neural network. Notably, the symmetrization of the detector does not compromise the ability to distinguish objects that break the symmetry. (C) 2000 Elsevier Science Ltd. All rights reserved....

  3. Silicon nanowire arrays as learning chemical vapour classifiers

    International Nuclear Information System (INIS)

    Niskanen, A O; Colli, A; White, R; Li, H W; Spigone, E; Kivioja, J M

    2011-01-01

    Nanowire field-effect transistors are a promising class of devices for various sensing applications. Apart from detecting individual chemical or biological analytes, it is especially interesting to use multiple selective sensors to look at their collective response in order to perform classification into predetermined categories. We show that non-functionalised silicon nanowire arrays can be used to robustly classify different chemical vapours using simple statistical machine learning methods. We were able to distinguish between acetone, ethanol and water with 100% accuracy while methanol, ethanol and 2-propanol were classified with 96% accuracy in ambient conditions.

  4. Predicting membrane protein types using various decision tree classifiers based on various modes of general PseAAC for imbalanced datasets.

    Science.gov (United States)

    Sankari, E Siva; Manimegalai, D

    2017-12-21

    Predicting membrane protein types is an important and challenging research area in bioinformatics and proteomics. Traditional biophysical methods are used to classify membrane protein types. Due to the large number of uncharacterized protein sequences in databases, traditional methods are very time consuming, expensive and susceptible to errors. Hence, it is highly desirable to develop a robust, reliable, and efficient method to predict membrane protein types. Imbalanced datasets and large datasets are often handled well by decision tree classifiers. Since imbalanced datasets are used, the performance of various decision tree classifiers such as Decision Tree (DT), Classification And Regression Tree (CART), C4.5, Random tree and REP (Reduced Error Pruning) tree, and of ensemble methods such as Adaboost, RUS (Random Under Sampling) boost, Rotation forest and Random forest, is analysed. Among the various decision tree classifiers, Random forest performs well in less time with a good accuracy of 96.35%. Another inference is that the RUS boost decision tree classifier is able to classify one or two samples in the class with very few samples, while the other classifiers such as DT, Adaboost, Rotation forest and Random forest are not sensitive to the classes with fewer samples. The performance of decision tree classifiers is also compared with SVM (Support Vector Machine) and Naive Bayes classifiers. Copyright © 2017 Elsevier Ltd. All rights reserved.

  5. Pattern recognition in complex activity travel patterns : comparison of Euclidean distance, signal-processing theoretical, and multidimensional sequence alignment methods

    NARCIS (Netherlands)

    Joh, C.H.; Arentze, T.A.; Timmermans, H.J.P.

    2001-01-01

    The application of a multidimensional sequence alignment method for classifying activity travel patterns is reported. The method was developed as an alternative to the existing classification methods suggested in the transportation literature. The relevance of the multidimensional sequence alignment

  6. Minisatellites as DNA markers to classify bermudagrasses (Cynodon ...

    Indian Academy of Sciences (India)

    RESEARCH NOTE ... an inexpensive, PCR-based method to amplify minisatellite ... isatellite core primer sequences derived from other species, including ... important quantitative traits (Karaca et al. ... These problems in RAPD-PCR are mainly inherited from the .... isatellite sequence organization of 5.2 times repeated-core.

  7. Predicting protein subnuclear location with optimized evidence-theoretic K-nearest classifier and pseudo amino acid composition

    International Nuclear Information System (INIS)

    Shen Hongbin; Chou Kuochen

    2005-01-01

    The nucleus is the brain of eukaryotic cells that guides the life processes of the cell by issuing key instructions. For in-depth understanding of the biochemical process of the nucleus, the knowledge of localization of nuclear proteins is very important. With the avalanche of protein sequences generated in the post-genomic era, it is highly desired to develop an automated method for fast annotating the subnuclear locations for numerous newly found nuclear protein sequences so as to be able to timely utilize them for basic research and drug discovery. In view of this, a novel approach is developed for predicting the protein subnuclear location. It is featured by introducing a powerful classifier, the optimized evidence-theoretic K-nearest classifier, and using the pseudo amino acid composition [K.C. Chou, PROTEINS: Structure, Function, and Genetics, 43 (2001) 246], which can incorporate a considerable amount of sequence-order effects, to represent protein samples. As a demonstration, identifications were performed for 370 nuclear proteins among the following 9 subnuclear locations: (1) Cajal body, (2) chromatin, (3) heterochromatin, (4) nuclear diffuse, (5) nuclear pore, (6) nuclear speckle, (7) nucleolus, (8) PcG body, and (9) PML body. The overall success rates thus obtained by both the re-substitution test and jackknife cross-validation test are significantly higher than those by existing classifiers on the same working dataset. It is anticipated that the powerful approach may also become a useful high throughput vehicle to bridge the huge gap occurring in the post-genomic era between the number of gene sequences in databases and the number of gene products that have been functionally characterized. The OET-KNN classifier will be available at www.pami.sjtu.edu.cn/people/hbshen

  8. Improving the chances of successful protein structure determination with a random forest classifier

    Energy Technology Data Exchange (ETDEWEB)

    Jahandideh, Samad [Sanford-Burnham Medical Research Institute, 10901 North Torrey Pines Road, La Jolla, CA 92307 (United States); Joint Center for Structural Genomics, (United States); Jaroszewski, Lukasz; Godzik, Adam, E-mail: adam@burnham.org [Sanford-Burnham Medical Research Institute, 10901 North Torrey Pines Road, La Jolla, CA 92307 (United States); Joint Center for Structural Genomics, (United States); University of California, San Diego, La Jolla, California (United States)

    2014-03-01

    Using an extended set of protein features calculated separately for protein surface and interior, a new version of XtalPred based on a random forest classifier achieves a significant improvement in predicting the success of structure determination from the primary amino-acid sequence. Obtaining diffraction quality crystals remains one of the major bottlenecks in structural biology. The ability to predict the chances of crystallization from the amino-acid sequence of the protein can, at least partly, address this problem by allowing a crystallographer to select homologs that are more likely to succeed and/or to modify the sequence of the target to avoid features that are detrimental to successful crystallization. In 2007, the now widely used XtalPred algorithm [Slabinski et al. (2007), Protein Sci. 16, 2472–2482] was developed. XtalPred classifies proteins into five ‘crystallization classes’ based on a simple statistical analysis of the physicochemical features of a protein. Here, towards the same goal, advanced machine-learning methods are applied and, in addition, the predictive potential of additional protein features such as predicted surface ruggedness, hydrophobicity, side-chain entropy of surface residues and amino-acid composition of the predicted protein surface are tested. The new XtalPred-RF (random forest) achieves significant improvement of the prediction of crystallization success over the original XtalPred. To illustrate this, XtalPred-RF was tested by revisiting target selection from 271 Pfam families targeted by the Joint Center for Structural Genomics (JCSG) in PSI-2, and it was estimated that the number of targets entered into the protein-production and crystallization pipeline could have been reduced by 30% without lowering the number of families for which the first structures were solved. The prediction improvement depends on the subset of targets used as a testing set and reaches 100% (i.e. twofold) for the top class of predicted

  9. 18 CFR 367.18 - Criteria for classifying leases.

    Science.gov (United States)

    2010-04-01

    ... the lessee) must not give rise to a new classification of a lease for accounting purposes. ... classifying the lease. (4) The present value at the beginning of the lease term of the minimum lease payments... taxes to be paid by the lessor, including any related profit, equals or exceeds 90 percent of the excess...

  10. Discrimination-Aware Classifiers for Student Performance Prediction

    Science.gov (United States)

    Luo, Ling; Koprinska, Irena; Liu, Wei

    2015-01-01

    In this paper we consider discrimination-aware classification of educational data. Mining and using rules that distinguish groups of students based on sensitive attributes such as gender and nationality may lead to discrimination. It is desirable to keep the sensitive attributes during the training of a classifier to avoid information loss but…

  11. 29 CFR 1910.307 - Hazardous (classified) locations.

    Science.gov (United States)

    2010-07-01

    ... equipment at the location. (c) Electrical installations. Equipment, wiring methods, and installations of... covers the requirements for electric equipment and wiring in locations that are classified depending on... provisions of this section. (4) Division and zone classification. In Class I locations, an installation must...

  12. 29 CFR 1926.407 - Hazardous (classified) locations.

    Science.gov (United States)

    2010-07-01

    ...) locations, unless modified by provisions of this section. (b) Electrical installations. Equipment, wiring..., DEPARTMENT OF LABOR (CONTINUED) SAFETY AND HEALTH REGULATIONS FOR CONSTRUCTION Electrical Installation Safety... electric equipment and wiring in locations which are classified depending on the properties of the...

  13. Classifier fusion for VoIP attacks classification

    Science.gov (United States)

    Safarik, Jakub; Rezac, Filip

    2017-05-01

    SIP is one of the most successful protocols in the field of IP telephony communication. It establishes and manages VoIP calls. As the number of SIP implementation rises, we can expect a higher number of attacks on the communication system in the near future. This work aims at malicious SIP traffic classification. A number of various machine learning algorithms have been developed for attack classification. The paper presents a comparison of current research and the use of classifier fusion method leading to a potential decrease in classification error rate. Use of classifier combination makes a more robust solution without difficulties that may affect single algorithms. Different voting schemes, combination rules, and classifiers are discussed to improve the overall performance. All classifiers have been trained on real malicious traffic. The concept of traffic monitoring depends on the network of honeypot nodes. These honeypots run in several networks spread in different locations. Separation of honeypots allows us to gain an independent and trustworthy attack information.
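
    The idea of combining several classifiers through a voting scheme can be sketched as follows. This is a minimal illustration, not the fusion method evaluated in the paper: it shows plain majority voting and a weighted vote over the labels returned by individual classifiers for one traffic sample. The attack label names and the weights are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Label predicted by most classifiers; ties go to the earliest classifier's label."""
    counts = Counter(predictions)
    top = max(counts.values())
    for label in predictions:
        if counts[label] == top:
            return label


def weighted_vote(predictions, weights):
    """Label with the largest summed weight (e.g. each classifier's validation accuracy)."""
    totals = {}
    for label, weight in zip(predictions, weights):
        totals[label] = totals.get(label, 0.0) + weight
    return max(totals, key=totals.get)


# Hypothetical outputs of three base classifiers for one SIP traffic sample.
votes = ["register_scan", "invite_flood", "register_scan"]

print(majority_vote(votes))                   # register_scan
print(weighted_vote(votes, [0.2, 0.9, 0.3]))  # invite_flood
```

    The two rules can disagree, as in this example: a single classifier with a high weight outvotes two weaker ones, which is one reason different combination rules are worth comparing.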

  14. Bayesian Classifier for Medical Data from Doppler Unit

    Directory of Open Access Journals (Sweden)

    J. Málek

    2006-01-01

    Nowadays, hand-held ultrasonic Doppler units (probes) are often used for noninvasive screening of atherosclerosis in the arteries of the lower limbs. The mean velocity of blood flow in time and blood pressures are measured at several positions on each lower limb. By listening to the acoustic signal generated by the device or by reading the signal displayed on screen, a specialist can detect peripheral arterial disease (PAD). This project aims to design software that will be able to analyze data from such a device and classify it into several diagnostic classes. At the Department of Functional Diagnostics at the Regional Hospital in Liberec a database of several hundred signals was collected. In cooperation with the specialist, the signals were manually classified into four classes. For each class, selected signal features were extracted and then used for training a Bayesian classifier. Another set of signals was used for evaluating and optimizing the parameters of the classifier. A recognition rate slightly above 84% of diagnostic states was recently achieved on the test data.

  15. An Investigation to Improve Classifier Accuracy for Myo Collected Data

    Science.gov (United States)

    2017-02-01

    Bad Samples Effect on Classification Accuracy 7 5.1 Naïve Bayes (NB) Classifier Accuracy 7 5.2 Logistic Model Tree (LMT) 10 5.3 K-Nearest Neighbor...gesture, pitch feature, user 06. All samples exhibit reversed movement...20 Fig. A-2 Come gesture, pitch feature, user 14. All samples exhibit reversed movement

  16. Diagnosis of Broiler Livers by Classifying Image Patches

    DEFF Research Database (Denmark)

    Jørgensen, Anders; Fagertun, Jens; Moeslund, Thomas B.

    2017-01-01

    Manual health inspection is becoming the bottleneck at poultry processing plants. We present a computer vision method for automatic diagnosis of broiler livers. The non-rigid livers, of varying shape and sizes, are classified in patches by a convolutional neural network, outputting maps...

  17. Support vector machines classifiers of physical activities in preschoolers

    Science.gov (United States)

    The goal of this study is to develop, test, and compare multinomial logistic regression (MLR) and support vector machines (SVM) in classifying preschool-aged children physical activity data acquired from an accelerometer. In this study, 69 children aged 3-5 years old were asked to participate in a s...

  18. A Linguistic Image of Nature: The Burmese Numerative Classifier System

    Science.gov (United States)

    Becker, Alton L.

    1975-01-01

    The Burmese classifier system is coherent because it is based upon a single elementary semantic dimension: deixis. On that dimension, four distances are distinguished, distances which metaphorically substitute for other conceptual relations between people and other living beings, people and things, and people and concepts. (Author/RM)

  19. Data Stream Classification Based on the Gamma Classifier

    Directory of Open Access Journals (Sweden)

    Abril Valeria Uriarte-Arcia

    2015-01-01

    The ever-increasing generation of data confronts us with the problem of handling massive amounts of information online. One of the biggest challenges is how to extract valuable information from these massive continuous data streams in a single scan. In a data stream context, data arrive continuously at high speed; therefore the algorithms developed to address this context must be efficient regarding memory and time management and capable of detecting changes over time in the underlying distribution that generated the data. This work describes a novel method for the task of pattern classification over a continuous data stream based on an associative model. The proposed method is based on the Gamma classifier, which is inspired by the Alpha-Beta associative memories; both are supervised pattern recognition models. The proposed method is capable of handling the space and time constraints inherent to data stream scenarios. The Data Streaming Gamma classifier (DS-Gamma classifier) implements a sliding window approach to provide concept drift detection and a forgetting mechanism. In order to test the classifier, several experiments were performed using different data stream scenarios with real and synthetic data streams. The experimental results show that the method exhibits competitive performance when compared to other state-of-the-art algorithms.

  20. Building an automated SOAP classifier for emergency department reports.

    Science.gov (United States)

    Mowery, Danielle; Wiebe, Janyce; Visweswaran, Shyam; Harkema, Henk; Chapman, Wendy W

    2012-02-01

    Information extraction applications that extract structured event and entity information from unstructured text can leverage knowledge of clinical report structure to improve performance. The Subjective, Objective, Assessment, Plan (SOAP) framework, used to structure progress notes to facilitate problem-specific, clinical decision making by physicians, is one example of a well-known, canonical structure in the medical domain. Although its applicability to structuring data is understood, its contribution to information extraction tasks has not yet been determined. The first step to evaluating the SOAP framework's usefulness for clinical information extraction is to apply the model to clinical narratives and develop an automated SOAP classifier that classifies sentences from clinical reports. In this quantitative study, we applied the SOAP framework to sentences from emergency department reports, and trained and evaluated SOAP classifiers built with various linguistic features. We found the SOAP framework can be applied manually to emergency department reports with high agreement (Cohen's kappa coefficients over 0.70). Using a variety of features, we found classifiers for each SOAP class can be created with moderate to outstanding performance with F(1) scores of 93.9 (subjective), 94.5 (objective), 75.7 (assessment), and 77.0 (plan). We look forward to expanding the framework and applying the SOAP classification to clinical information extraction tasks. Copyright © 2011. Published by Elsevier Inc.
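
    The agreement figure quoted above is Cohen's kappa, which corrects the raw agreement between two annotators for the agreement expected by chance. The following is a minimal illustrative sketch, not the authors' code; the function name `cohens_kappa` and the toy S/O labels are assumptions made for this example.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each annotator's
    label frequencies.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Fraction of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    # Chance agreement: product of marginal label frequencies, summed over labels.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both annotators used one identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy annotation of four sentences (S = subjective, O = objective).
annotator_1 = ["S", "S", "O", "O"]
annotator_2 = ["S", "O", "O", "O"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

    In this toy case the annotators agree on 3 of 4 sentences (p_o = 0.75) while chance agreement is p_e = 0.5, so kappa is 0.5.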

  1. Learning to classify wakes from local sensory information

    Science.gov (United States)

    Alsalman, Mohamad; Colvert, Brendan; Kanso, Eva; Kanso Team

    2017-11-01

    Aquatic organisms exhibit remarkable abilities to sense local flow signals contained in their fluid environment and to surmise the origins of these flows. For example, fish can discern the information contained in various flow structures and utilize this information for obstacle avoidance and prey tracking. Flow structures created by flapping and swimming bodies are well characterized in the fluid dynamics literature; however, such characterization relies on classical methods that use an external observer to reconstruct global flow fields. The reconstructed flows, or wakes, are then classified according to the unsteady vortex patterns. Here, we propose a new approach for wake identification: we classify the wakes resulting from a flapping airfoil by applying machine learning algorithms to local flow information. In particular, we simulate the wakes of an oscillating airfoil in an incoming flow, extract the downstream vorticity information, and train a classifier to learn the different flow structures and classify new ones. This data-driven approach provides a promising framework for underwater navigation and detection in application to autonomous bio-inspired vehicles.

  2. The Closing of the Classified Catalog at Boston University

    Science.gov (United States)

    Hazen, Margaret Hindle

    1974-01-01

    Although the classified catalog at Boston University libraries has been a useful research tool, it has proven too expensive to keep current. The library has converted to a traditional alphabetic subject catalog and will receive catalog cards from the Ohio College Library Center through the New England Library Network. (Author/LS)

  3. Recognition of Arabic Sign Language Alphabet Using Polynomial Classifiers

    Directory of Open Access Journals (Sweden)

    M. Al-Rousan

    2005-08-01

    Building an accurate automatic sign language recognition system is of great importance in facilitating efficient communication with deaf people. In this paper, we propose the use of polynomial classifiers as a classification engine for the recognition of the Arabic sign language (ArSL) alphabet. Polynomial classifiers have several advantages over other classifiers in that they do not require iterative training, and that they are highly computationally scalable with the number of classes. Based on polynomial classifiers, we have built an ArSL system and measured its performance using real ArSL data collected from deaf people. We show that the proposed system provides superior recognition results when compared with previously published results using ANFIS-based classification on the same dataset and feature extraction methodology. The comparison is shown in terms of the number of misclassified test patterns. The reduction in the rate of misclassified patterns was very significant. In particular, we have achieved a 36% reduction of misclassifications on the training data and 57% on the test data.

  4. Reconfigurable support vector machine classifier with approximate computing

    NARCIS (Netherlands)

    van Leussen, M.J.; Huisken, J.; Wang, L.; Jiao, H.; De Gyvez, J.P.

    2017-01-01

    Support Vector Machine (SVM) is one of the most popular machine learning algorithms. An energy-efficient SVM classifier is proposed in this paper, where approximate computing is utilized to reduce energy consumption and silicon area. A hardware architecture with reconfigurable kernels and

  5. Classifying regularized sensor covariance matrices: An alternative to CSP

    NARCIS (Netherlands)

    Roijendijk, L.M.M.; Gielen, C.C.A.M.; Farquhar, J.D.R.

    2016-01-01

    Common spatial patterns (CSP) is a commonly used technique for classifying imagined movement type brain-computer interface (BCI) datasets. It has been very successful with many extensions and improvements on the basic technique. However, a drawback of CSP is that the signal processing pipeline

  6. Classifying regularised sensor covariance matrices: An alternative to CSP

    NARCIS (Netherlands)

    Roijendijk, L.M.M.; Gielen, C.C.A.M.; Farquhar, J.D.R.

    2016-01-01

    Common spatial patterns (CSP) is a commonly used technique for classifying imagined movement type brain computer interface (BCI) datasets. It has been very successful with many extensions and improvements on the basic technique. However, a drawback of CSP is that the signal processing pipeline

  7. Two-categorical bundles and their classifying spaces

    DEFF Research Database (Denmark)

    Baas, Nils A.; Bökstedt, M.; Kro, T.A.

    2012-01-01

    -category is a classifying space for the associated principal 2-bundles. In the process of proving this we develop a lot of powerful machinery which may be useful in further studies of 2-categorical topology. As a corollary we get a new proof of the classification of principal bundles. A calculation based...

  8. 3 CFR - Classified Information and Controlled Unclassified Information

    Science.gov (United States)

    2010-01-01

    ... on Transparency and Open Government and on the Freedom of Information Act, my Administration is... memoranda of January 21, 2009, on Transparency and Open Government and on the Freedom of Information Act; (B... 3 The President 1 2010-01-01 2010-01-01 false Classified Information and Controlled Unclassified...

  9. Comparison of Classifier Architectures for Online Neural Spike Sorting.

    Science.gov (United States)

    Saeed, Maryam; Khan, Amir Ali; Kamboh, Awais Mehmood

    2017-04-01

    High-density, intracranial recordings from micro-electrode arrays need to undergo spike sorting in order to associate the recorded neuronal spikes with particular neurons. This involves spike detection, feature extraction, and classification. To reduce the data transmission and power requirements, on-chip real-time processing is becoming very popular. However, high computational resources are required for classifiers in on-chip spike-sorters, making scalability a great challenge. In this review paper, we analyze several popular classifiers to propose five new hardware architectures using the off-chip training with on-chip classification approach. These include support vector classification, fuzzy C-means classification, self-organizing maps classification, moving-centroid K-means classification, and cosine distance classification. The performance of these architectures is analyzed in terms of accuracy and resource requirement. We establish that the neural-network-based Self-Organizing Maps classifier offers the most viable solution. A spike sorter based on the Self-Organizing Maps classifier requires only 7.83% of the computational resources of the best-reported spike sorter, hierarchical adaptive means, while offering a 3% better accuracy at 7 dB SNR.

  10. Cascaded lexicalised classifiers for second-person reference resolution

    NARCIS (Netherlands)

    Purver, M.; Fernández, R.; Frampton, M.; Peters, S.; Healey, P.; Pieraccini, R.; Byron, D.; Young, S.; Purver, M.

    2009-01-01

    This paper examines the resolution of the second person English pronoun you in multi-party dialogue. Following previous work, we attempt to classify instances as generic or referential, and in the latter case identify the singular or plural addressee. We show that accuracy and robustness can be

  11. Human Activity Recognition by Combining a Small Number of Classifiers.

    Science.gov (United States)

    Nazabal, Alfredo; Garcia-Moreno, Pablo; Artes-Rodriguez, Antonio; Ghahramani, Zoubin

    2016-09-01

    We consider the problem of daily human activity recognition (HAR) using multiple wireless inertial sensors, and specifically, HAR systems with a very low number of sensors, each one providing an estimation of the performed activities. We propose new Bayesian models to combine the output of the sensors. The models are based on a soft-output combination of individual classifiers to deal with the small number of sensors. We also incorporate the dynamic nature of human activities as a first-order homogeneous Markov chain. We develop both inductive and transductive inference methods for each model to be employed in supervised and semisupervised situations, respectively. Using different real HAR databases, we compare our classifier combination models against a single classifier that employs all the signals from the sensors. Our models consistently exhibit a reduction of the error rate and an increase of robustness against sensor failures. Our models also outperform other classifier combination models that do not consider soft outputs and a Markovian structure of the human activities.

  12. Evaluation of three classifiers in mapping forest stand types using ...

    African Journals Online (AJOL)

    EJIRO

    applied for classification of the image. Supervised classification technique using maximum likelihood algorithm is the most commonly and widely used method for land cover classification (Jia and Richards, 2006). In Australia, the maximum likelihood classifier was effectively used to map different forest stand types with high.

  13. Classifying patients' complaints for regulatory purposes : A Pilot Study

    NARCIS (Netherlands)

    Bouwman, R.J.R.; Bomhoff, Manja; Robben, Paul; Friele, R.D.

    2018-01-01

    Objectives: It is assumed that classifying and aggregated reporting of patients' complaints by regulators helps to identify problem areas, to respond better to patients and increase public accountability. This pilot study addresses what a classification of complaints in a regulatory setting

  14. Localizing genes to cerebellar layers by classifying ISH images.

    Directory of Open Access Journals (Sweden)

    Lior Kirsch

    Gene expression controls how the brain develops and functions. Understanding control processes in the brain is particularly hard since they involve numerous types of neurons and glia, and very little is known about which genes are expressed in which cells and brain layers. Here we describe an approach to detect genes whose expression is primarily localized to a specific brain layer and apply it to the mouse cerebellum. We learn typical spatial patterns of expression from a few markers that are known to be localized to specific layers, and use these patterns to predict localization for new genes. We analyze images of in-situ hybridization (ISH) experiments, which we represent using histograms of local binary patterns (LBP), and train image classifiers and gene classifiers for four layers of the cerebellum: the Purkinje, granular, molecular and white matter layers. On held-out data, the layer classifiers achieve accuracy above 94% (AUC) by representing each image at multiple scales and by combining multiple image scores into a single gene-level decision. When applied to the full mouse genome, the classifiers predict specific layer localization for hundreds of new genes in the Purkinje and granular layers. Many genes localized to the Purkinje layer are likely to be expressed in astrocytes, and many others are involved in lipid metabolism, possibly due to the unusual size of Purkinje cells.

  15. An ensemble self-training protein interaction article classifier.

    Science.gov (United States)

    Chen, Yifei; Hou, Ping; Manderick, Bernard

    2014-01-01

    Protein-protein interaction (PPI) is essential to understanding the fundamental processes governing cell biology. The mining and curation of PPI knowledge are critical for analyzing proteomics data. Hence it is desirable to automatically classify articles as PPI-related or not. In order to build interaction article classification systems, an annotated corpus is needed. However, it is usually the case that only a small number of labeled articles can be obtained manually. Meanwhile, a large number of unlabeled articles are available. By combining ensemble learning and semi-supervised self-training, an ensemble self-training interaction classifier called EST_IACer is designed to classify PPI-related articles based on a small number of labeled articles and a large number of unlabeled articles. A biological background based feature weighting strategy is extended using the category information from both labeled and unlabeled data. Moreover, a heuristic constraint is put forward to select optimal instances from unlabeled data to improve the performance further. Experimental results show that EST_IACer can classify PPI-related articles effectively and efficiently.

  16. Classifying Your Food as Acid, Low-Acid, or Acidified

    OpenAIRE

    Bacon, Karleigh

    2012-01-01

    As a food entrepreneur, you should be aware of how ingredients in your product make the food look, feel, and taste, as well as how the ingredients create environments for microorganisms like bacteria, yeast, and molds to survive and grow. This guide will help you classify your food as acid, low-acid, or acidified.

  17. Gene-expression Classifier in Papillary Thyroid Carcinoma

    DEFF Research Database (Denmark)

    Londero, Stefano Christian; Jespersen, Marie Louise; Krogdahl, Annelise

    2016-01-01

    BACKGROUND: No reliable biomarker for metastatic potential in the risk stratification of papillary thyroid carcinoma exists. We aimed to develop a gene-expression classifier for metastatic potential. MATERIALS AND METHODS: Genome-wide expression analyses were used. Development cohort: freshly...

  18. Abbreviations: Their Effects on Comprehension of Classified Advertisements.

    Science.gov (United States)

    Sokol, Kirstin R.

    Two experimental designs were used to test the hypothesis that abbreviations in classified advertisements decrease the reader's comprehension of such ads. In the first experimental design, 73 high school students read four ads (for employment, used cars, apartments for rent, and articles for sale) either with abbreviations or with all…

  19. Linear discriminant analysis of character sequences using occurrences of words

    KAUST Repository

    Dutta, Subhajit; Chaudhuri, Probal; Ghosh, Anil

    2014-01-01

    Classification of character sequences, where the characters come from a finite set, arises in disciplines such as molecular biology and computer science. For discriminant analysis of such character sequences, the Bayes classifier based on Markov models turns out to have class boundaries defined by linear functions of occurrences of words in the sequences. It is shown that for such classifiers based on Markov models with unknown orders, if the orders are estimated from the data using cross-validation, the resulting classifier has Bayes risk consistency under suitable conditions. Even when Markov models are not valid for the data, we develop methods for constructing classifiers based on linear functions of occurrences of words, where the word length is chosen by cross-validation. Such linear classifiers are constructed using ideas of support vector machines, regression depth, and distance weighted discrimination. We show that classifiers with linear class boundaries have certain optimal properties in terms of their asymptotic misclassification probabilities. The performance of these classifiers is demonstrated in various simulated and benchmark data sets.
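
    The claim that a Markov-model Bayes classifier has linear class boundaries in word occurrences can be checked numerically. The sketch below is only an illustration under stated assumptions: first-order chains, a two-letter alphabet, the initial-state term ignored, and two made-up transition tables (`trans_1`, `trans_2`); none of these values come from the paper.

```python
import math
from collections import Counter


def word_counts(seq, k=2):
    """Count occurrences of every overlapping word of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def transition_log_likelihood(seq, trans):
    """Log-likelihood of the transitions in seq under a first-order Markov chain."""
    return sum(math.log(trans[a][b]) for a, b in zip(seq, seq[1:]))


def linear_score(seq, trans_1, trans_2):
    """The same log-likelihood ratio, written as a linear function of 2-word counts.

    Each word w = ab contributes count(w) * [log P1(b|a) - log P2(b|a)],
    so the weights depend only on the models, not on the sequence.
    """
    counts = word_counts(seq, 2)
    return sum(
        n * (math.log(trans_1[w[0]][w[1]]) - math.log(trans_2[w[0]][w[1]]))
        for w, n in counts.items()
    )


# Hypothetical transition probabilities for two classes over the alphabet {A, B}.
trans_1 = {"A": {"A": 0.9, "B": 0.1}, "B": {"A": 0.5, "B": 0.5}}
trans_2 = {"A": {"A": 0.2, "B": 0.8}, "B": {"A": 0.7, "B": 0.3}}

seq = "AABABBA"
direct = transition_log_likelihood(seq, trans_1) - transition_log_likelihood(seq, trans_2)
via_counts = linear_score(seq, trans_1, trans_2)
```

    The two quantities agree up to floating-point rounding, so thresholding the log-likelihood ratio is equivalent to thresholding a weighted sum of word counts. For a chain of order m the same argument applies with words of length m + 1.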

  20. Linear discriminant analysis of character sequences using occurrences of words

    KAUST Repository

    Dutta, Subhajit

    2014-02-01

    Classification of character sequences, where the characters come from a finite set, arises in disciplines such as molecular biology and computer science. For discriminant analysis of such character sequences, the Bayes classifier based on Markov models turns out to have class boundaries defined by linear functions of occurrences of words in the sequences. It is shown that for such classifiers based on Markov models with unknown orders, if the orders are estimated from the data using cross-validation, the resulting classifier has Bayes risk consistency under suitable conditions. Even when Markov models are not valid for the data, we develop methods for constructing classifiers based on linear functions of occurrences of words, where the word length is chosen by cross-validation. Such linear classifiers are constructed using ideas of support vector machines, regression depth, and distance weighted discrimination. We show that classifiers with linear class boundaries have certain optimal properties in terms of their asymptotic misclassification probabilities. The performance of these classifiers is demonstrated in various simulated and benchmark data sets.

  1. Shotgun protein sequencing.

    Energy Technology Data Exchange (ETDEWEB)

    Faulon, Jean-Loup Michel; Heffelfinger, Grant S.

    2009-06-01

    A novel experimental and computational technique based on multiple enzymatic digestion of a protein or protein mixture that reconstructs protein sequences from sequences of overlapping peptides is described in this SAND report. This approach, analogous to shotgun sequencing of DNA, is to be used to sequence alternative spliced proteins, to identify post-translational modifications, and to sequence genetically engineered proteins.

  2. Multimodal sequence learning.

    Science.gov (United States)

    Kemény, Ferenc; Meier, Beat

    2016-02-01

    While sequence learning research models complex phenomena, previous studies have mostly focused on unimodal sequences. The goal of the current experiment is to put implicit sequence learning into a multimodal context: to test whether it can operate across different modalities. We used the Task Sequence Learning paradigm to test whether sequence learning varies across modalities, and whether participants are able to learn multimodal sequences. Our results show that implicit sequence learning is very similar regardless of the source modality. However, the presence of correlated task and response sequences was required for learning to take place. The experiment provides new evidence for implicit sequence learning of abstract conceptual representations. In general, the results suggest that correlated sequences are necessary for implicit sequence learning to occur. Moreover, they show that elements from different modalities can be automatically integrated into one unitary multimodal sequence. Copyright © 2015 Elsevier B.V. All rights reserved.

  3. Sequence Read Archive (SRA)

    Data.gov (United States)

    U.S. Department of Health & Human Services — The Sequence Read Archive (SRA) stores raw sequencing data from the next generation of sequencing platforms including Roche 454 GS System®, Illumina Genome...

  4. Classification of THz pulse signals using two-dimensional cross-correlation feature extraction and non-linear classifiers.

    Science.gov (United States)

    Siuly; Yin, Xiaoxia; Hadjiloucas, Sillas; Zhang, Yanchun

    2016-04-01

    This work provides a performance comparison of four different machine learning classifiers: multinomial logistic regression with ridge estimators (MLR), k-nearest neighbours (KNN), support vector machine (SVM) and naïve Bayes (NB), as applied to terahertz (THz) transient time domain sequences associated with pixelated images of different powder samples. The six substances considered have similar optical properties, but their complex insertion loss in the THz part of the spectrum differs significantly because of differences in their frequency-dependent THz extinction coefficients as well as in their refractive indices and scattering properties. As scattering can be unquantifiable in many spectroscopic experiments, classification based solely on differences in complex insertion loss can be inconclusive. The problem is addressed using two-dimensional (2-D) cross-correlations between background and sample interferograms; these ensure good noise suppression of the datasets and provide a range of statistical features that are subsequently used as inputs to the above classifiers. A cross-validation procedure is adopted to assess the performance of the classifiers. First, the measurements related to samples with thicknesses of 2 mm were classified, then samples with thicknesses of 4 mm, and after that 3 mm, and the success rate and consistency of each classifier were recorded. In addition, mixtures having thicknesses of 2 and 4 mm as well as mixtures of 2, 3 and 4 mm were presented simultaneously to all classifiers. This approach provided further cross-validation of the classification consistency of each algorithm. The results confirm the superiority in classification accuracy and robustness of the MLR (least accuracy 88.24%) and KNN (least accuracy 90.19%) algorithms, which consistently outperformed the SVM (least accuracy 74.51%) and NB (least accuracy 56.86%) classifiers for the same number of feature vectors across all studies.

  5. Supervised Sequence Labelling with Recurrent Neural Networks

    CERN Document Server

    Graves, Alex

    2012-01-01

    Supervised sequence labelling is a vital area of machine learning, encompassing tasks such as speech, handwriting and gesture recognition, protein secondary structure prediction and part-of-speech tagging. Recurrent neural networks are powerful sequence learning tools, robust to input noise and distortion and able to exploit long-range contextual information, that would seem ideally suited to such problems. However, their role in large-scale sequence labelling systems has so far been auxiliary. The goal of this book is a complete framework for classifying and transcribing sequential data with recurrent neural networks only. Three main innovations are introduced in order to realise this goal. Firstly, the connectionist temporal classification output layer allows the framework to be trained with unsegmented target sequences, such as phoneme-level speech transcriptions; this is in contrast to previous connectionist approaches, which were dependent on error-prone prior segmentation. Secondly, multidimensional...

  6. Crystallization and preliminary X-ray analysis of a monomeric mutant of Azami-Green (mAG), an Aequorea victoria green fluorescent protein-like green-emitting fluorescent protein from the stony coral Galaxea fascicularis

    International Nuclear Information System (INIS)

    Ebisawa, Tatsuki; Yamamura, Akihiro; Kameda, Yasuhiro; Hayakawa, Kou; Nagata, Koji; Tanokura, Masaru

    2009-01-01

    A monomeric mutant of Azami-Green from G. fascicularis was expressed, purified and crystallized using the sitting-drop vapour-diffusion method. The crystal belonged to space group P1 and diffracted X-rays to 2.20 Å resolution. Monomeric Azami-Green (mAG) from the stony coral Galaxea fascicularis is the first monomeric green-emitting fluorescent protein that is not a derivative of Aequorea victoria green fluorescent protein (avGFP). mAG and avGFP are 27% identical in amino-acid sequence. Diffraction-quality crystals of recombinant mAG were obtained by the sitting-drop vapour-diffusion method using PEG 3350 as the precipitant. The mAG crystal diffracted X-rays to 2.20 Å resolution on beamline AR-NW12A at the Photon Factory (Tsukuba, Japan). The crystal belonged to space group P1, with unit-cell parameters a = 41.78, b = 51.72, c = 52.89 Å, α = 90.96, β = 103.41, γ = 101.79°. The Matthews coefficient (V_M = 2.10 Å³ Da⁻¹) indicated that the crystal contained two mAG molecules per asymmetric unit.

  7. Applications of High Throughput Nucleotide Sequencing

    DEFF Research Database (Denmark)

    Waage, Johannes Eichler

    equally large demands in data handling, analysis and interpretation, perhaps defining the modern challenge of the computational biologist of the post-genomic era. The first part of this thesis consists of a general introduction to the history, common terms and challenges of next generation sequencing......-sequencing, a study of the effects on alternative RNA splicing of KO of the nonsense mediated RNA decay system in Mus, using digital gene expression and a custom-built exon-exon junction mapping pipeline is presented (article I). Evolved from this work, a Bioconductor package, spliceR, for classifying alternative...

  8. Deep Feature Learning and Cascaded Classifier for Large Scale Data

    DEFF Research Database (Denmark)

    Prasoon, Adhish

    from data rather than having a predefined feature set. We explore deep learning approach of convolutional neural network (CNN) for segmenting three dimensional medical images. We propose a novel system integrating three 2D CNNs, which have a one-to-one association with the xy, yz and zx planes of 3D......This thesis focuses on voxel/pixel classification based approaches for image segmentation. The main application is segmentation of articular cartilage in knee MRIs. The first major contribution of the thesis deals with large scale machine learning problems. Many medical imaging problems need huge...... amount of training data to cover sufficient biological variability. Learning methods scaling badly with number of training data points cannot be used in such scenarios. This may restrict the usage of many powerful classifiers having excellent generalization ability. We propose a cascaded classifier which...

  9. Scoring and Classifying Examinees Using Measurement Decision Theory

    Directory of Open Access Journals (Sweden)

    Lawrence M. Rudner

    2009-04-01

    This paper describes and evaluates the use of measurement decision theory (MDT) to classify examinees based on their item response patterns. The model has a simple framework that starts with the conditional probabilities of examinees in each category or mastery state responding correctly to each item. The presented evaluation investigates: (1) the classification accuracy of tests scored using decision theory; (2) the effectiveness of different sequential testing procedures; and (3) the number of items needed to make a classification. A large percentage of examinees can be classified accurately with very few items using decision theory. A Java Applet for self instruction and software for generating, calibrating and scoring MDT data are provided.
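
    The framework described above, which starts from the conditional probability of a correct response to each item in each mastery state, can be sketched with Bayes' rule. This is a minimal illustration and not the software mentioned in the record; it assumes items are conditionally independent given the mastery state, and the function name, the two states, the priors and all item probabilities are hypothetical.

```python
def classify_examinee(responses, p_correct, priors):
    """Return the most probable mastery state and all posterior probabilities.

    responses : list of 0/1 item scores for one examinee
    p_correct : dict mapping state -> list of P(correct | state), one per item
    priors    : dict mapping state -> prior probability of that state
    Assumes responses are conditionally independent given the state.
    """
    joint = {}
    for state, prior in priors.items():
        likelihood = prior
        for score, p in zip(responses, p_correct[state]):
            # Multiply in P(observed score | state) for this item.
            likelihood *= p if score else (1.0 - p)
        joint[state] = likelihood
    total = sum(joint.values())
    posteriors = {state: value / total for state, value in joint.items()}
    # Maximum a posteriori decision rule.
    best_state = max(posteriors, key=posteriors.get)
    return best_state, posteriors


# Hypothetical three-item test with two mastery states.
p_correct = {
    "master": [0.8, 0.9, 0.7],
    "non-master": [0.3, 0.4, 0.2],
}
priors = {"master": 0.5, "non-master": 0.5}

# Examinee answers items 1 and 2 correctly and item 3 incorrectly.
state, posteriors = classify_examinee([1, 1, 0], p_correct, priors)
```

    With these made-up numbers the joint probabilities are 0.108 for "master" and 0.048 for "non-master", so the examinee is classified as a master with posterior probability of about 0.69. A sequential testing procedure could apply the same update after each item and stop once one posterior exceeds a chosen threshold.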

  10. MAMMOGRAMS ANALYSIS USING SVM CLASSIFIER IN COMBINED TRANSFORMS DOMAIN

    Directory of Open Access Journals (Sweden)

    B.N. Prathibha

    2011-02-01

    Breast cancer is a primary cause of mortality and morbidity in women. Reports reveal that the earlier abnormalities are detected, the better the chances of survival. Digital mammograms are one of the most effective means of detecting possible breast anomalies at early stages. Digital mammograms supported by Computer Aided Diagnostic (CAD) systems help radiologists make reliable decisions. The proposed CAD system extracts wavelet features and spectral features for better classification of mammograms. The Support Vector Machine classifier is used to analyze 206 mammogram images from the MIAS database with respect to the severity of abnormality, i.e., benign and malign. The proposed system gives 93.14% accuracy for discrimination between normal and malign samples, 87.25% accuracy for normal and benign samples, and 89.22% accuracy for benign and malign samples. The study reveals that features extracted in the hybrid transform domain, combined with an SVM classifier, are a promising tool for the analysis of mammograms.

  11. Evaluation of LDA Ensembles Classifiers for Brain Computer Interface

    International Nuclear Information System (INIS)

    Arjona, Cristian; Pentácolo, José; Gareis, Iván; Atum, Yanina; Gentiletti, Gerardo; Acevedo, Rubén; Rufiner, Leonardo

    2011-01-01

    The Brain Computer Interface (BCI) translates brain activity into computer commands. To increase the performance of the BCI in decoding user intentions, it is necessary to improve the feature extraction and classification techniques. In this article the performance of an ensemble of three linear discriminant analysis (LDA) classifiers is studied. A system based on an ensemble can theoretically achieve better classification results than its individual counterpart, depending on the algorithm used to generate the individual classifiers and the procedures for combining their outputs. Classic ensemble-based algorithms such as bagging and boosting are discussed here. For the application to BCI, it was concluded that the results generated using ER and AUC as performance indices do not give enough information to establish which configuration is better.

  12. Security Enrichment in Intrusion Detection System Using Classifier Ensemble

    Directory of Open Access Journals (Sweden)

    Uma R. Salunkhe

    2017-01-01

    In the era of the Internet, and with an increasing number of people as its end users, a large number of attack categories are introduced daily. Hence, effective detection of various attacks with the help of Intrusion Detection Systems (IDS) is an emerging research trend. Existing studies show the effectiveness of machine learning approaches in Intrusion Detection Systems. In this work, we aim to enhance the detection rate of an Intrusion Detection System by using machine learning techniques. We propose a novel classifier-ensemble-based IDS that is constructed using a hybrid approach combining data-level and feature-level approaches. Classifier ensembles combine the opinions of different experts and improve the intrusion detection rate. Experimental results show improved detection rates for our system compared to the reference technique.

  13. The three-dimensional origin of the classifying algebra

    International Nuclear Information System (INIS)

    Fuchs, Juergen; Schweigert, Christoph; Stigner, Carl

    2010-01-01

    It is known that reflection coefficients for bulk fields of a rational conformal field theory in the presence of an elementary boundary condition can be obtained as representation matrices of irreducible representations of the classifying algebra, a semisimple commutative associative complex algebra. We show how this algebra arises naturally from the three-dimensional geometry of factorization of correlators of bulk fields on the disk. This allows us to derive explicit expressions for the structure constants of the classifying algebra as invariants of ribbon graphs in the three-manifold S² × S¹. Our result unravels a precise relation between intertwiners of the action of the mapping class group on spaces of conformal blocks and boundary conditions in rational conformal field theories.

  14. Machine learning classifiers and fMRI: a tutorial overview.

    Science.gov (United States)

    Pereira, Francisco; Mitchell, Tom; Botvinick, Matthew

    2009-03-01

    Interpreting brain image experiments requires analysis of complex, multivariate data. In recent years, one analysis approach that has grown in popularity is the use of machine learning algorithms to train classifiers to decode stimuli, mental states, behaviours and other variables of interest from fMRI data and thereby show the data contain information about them. In this tutorial overview we review some of the key choices faced in using this approach as well as how to derive statistically significant results, illustrating each point from a case study. Furthermore, we show how, in addition to answering the question of 'is there information about a variable of interest' (pattern discrimination), classifiers can be used to tackle other classes of question, namely 'where is the information' (pattern localization) and 'how is that information encoded' (pattern characterization).

  15. Lung Nodule Detection in CT Images using Neuro Fuzzy Classifier

    Directory of Open Access Journals (Sweden)

    M. Usman Akram

    2013-07-01

    Automated lung cancer detection using computer aided diagnosis (CAD) is an important area in clinical applications. As manual nodule detection is very time consuming and costly, computerized systems can be helpful for this purpose. In this paper, we propose a computerized system for lung nodule detection in CT scan images. The automated system consists of two stages, i.e., lung segmentation and enhancement, followed by feature extraction and classification. The segmentation process separates lung tissue from the rest of the image, and only the lung tissues under examination are considered as candidate regions for detecting malignant nodules in the lung. A feature vector is calculated for possible abnormal regions, and the regions are classified using a neuro fuzzy classifier. It is a fully automatic system that does not require any manual intervention, and experimental results show the validity of our system.

  16. A Bayesian Classifier for X-Ray Pulsars Recognition

    Directory of Open Access Journals (Sweden)

    Hao Liang

    2016-01-01

    Recognition of X-ray pulsars is important for the problem of spacecraft attitude determination by X-ray Pulsar Navigation (XPNAV). Using the nonhomogeneous Poisson model of the received photons and the minimum recognition error criterion, a classifier based on Bayes' theorem is proposed. For X-ray pulsar recognition with unknown Doppler frequency and initial phase, the features of every X-ray pulsar are extracted and the unknown parameters are estimated using the Maximum Likelihood (ML) method. In addition, a method to recognize unknown X-ray pulsars or X-ray disturbances is proposed. Simulation results confirm the validity of the proposed Bayesian classifier.
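
    The decision rule described in this abstract can be illustrated with a minimal sketch. It assumes photon arrivals already folded into phase bins, known expected counts per bin for each candidate pulsar, and equal priors unless stated otherwise; under those assumptions the minimum-error rule is the maximum a posteriori choice. The profile names and numbers are hypothetical, and the Doppler and initial-phase estimation mentioned in the abstract is not modeled here.

```python
import math

def poisson_log_likelihood(counts, rates):
    """Log-likelihood of binned photon counts under per-bin Poisson rates."""
    return sum(k * math.log(lam) - lam - math.lgamma(k + 1)
               for k, lam in zip(counts, rates))

def classify(counts, profiles, priors=None):
    """Return the profile name with the highest posterior (MAP rule)."""
    names = list(profiles)
    if priors is None:
        # Assumption: equal prior probability for every candidate source.
        priors = {name: 1.0 / len(names) for name in names}
    scores = {
        name: math.log(priors[name]) + poisson_log_likelihood(counts, profiles[name])
        for name in names
    }
    return max(scores, key=scores.get)

# Hypothetical expected counts per phase bin for two made-up pulsars.
profiles = {
    "pulsar_A": [2.0, 8.0, 2.0, 2.0],
    "pulsar_B": [2.0, 2.0, 8.0, 2.0],
}
observed = [1, 9, 3, 2]  # made-up photon counts per phase bin
print(classify(observed, profiles))
```

    Working in log space keeps the products of many small Poisson probabilities numerically stable.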

  17. Wavelet classifier used for diagnosing shock absorbers in cars

    Directory of Open Access Journals (Sweden)

    Janusz GARDULSKI

    2007-01-01

    The paper discusses some commonly used methods of hydraulic absorber testing. Disadvantages of the methods are described. A vibro-acoustic method is presented and recommended for practical use on existing test rigs. The method is based on continuous wavelet analysis combined with a neural classifier and a 25-neuron, one-way, three-layer back propagation network. The analysis satisfies the intended aim.

  18. Classified installations for environmental protection subject to declaration. Tome 2

    International Nuclear Information System (INIS)

    Anon.

    1992-01-01

    Legislation concerning classified installations governs most industries and dangerous or polluting activities. This legislation aims to prevent the risks and harmful effects arising from an installation: air pollution, water pollution, noise, wastes produced by installations, and even adverse aesthetic effects. Polluting or dangerous activities are defined in a list called the nomenclature, which subjects installations to a declaration or authorization regime. Technical regulations issued by the Secretary of State for the Environment are listed in Tome 2.

  19. Classified study and clinical value of the phase imaging features

    International Nuclear Information System (INIS)

    Dang Yaping; Ma Aiqun; Zheng Xiaopu; Yang Aimin; Xiao Jiang; Gao Xinyao

    2000-01-01

    445 patients with various heart diseases were examined by gated cardiac blood pool imaging, and the phase images were classified. The relationship of the seven types with left ventricular function index, clinical heart function, different heart diseases, and electrocardiograph was studied. The results showed that the phase image classification could be matched with the clinical heart function. It can visually, directly and accurately indicate clinical heart function and can be used in the diagnosis of heart disease.

  20. Evaluating Classifiers in Detecting 419 Scams in Bilingual Cybercriminal Communities

    OpenAIRE

    Mbaziira, Alex V.; Abozinadah, Ehab; Jones Jr, James H.

    2015-01-01

    Incidents of organized cybercrime are rising because criminals are reaping high financial rewards while incurring low costs to commit crime. As the digital landscape broadens to accommodate more internet-enabled devices and technologies like social media, more cybercriminals who are not native English speakers are invading cyberspace to cash in on quick exploits. In this paper we evaluate the performance of three machine learning classifiers in detecting 419 scams in a bilingual Nigerian c...

  1. Classifying Radio Galaxies with the Convolutional Neural Network

    International Nuclear Information System (INIS)

    Aniyan, A. K.; Thorat, K.

    2017-01-01

    We present the application of a deep machine learning technique to classify radio images of extended sources on a morphological basis using convolutional neural networks (CNN). In this study, we have taken the case of the Fanaroff–Riley (FR) class of radio galaxies as well as radio galaxies with bent-tailed morphology. We have used archival data from the Very Large Array (VLA)—Faint Images of the Radio Sky at Twenty Centimeters survey and existing visually classified samples available in the literature to train a neural network for morphological classification of these categories of radio sources. Our training sample size for each of these categories is ∼200 sources, which has been augmented by rotated versions of the same. Our study shows that CNNs can classify images of the FRI and FRII and bent-tailed radio galaxies with high accuracy (maximum precision at 95%) using well-defined samples and a “fusion classifier,” which combines the results of binary classifications, while allowing for a mechanism to find sources with unusual morphologies. The individual precision is highest for bent-tailed radio galaxies at 95% and is 91% and 75% for the FRI and FRII classes, respectively, whereas the recall is highest for FRI and FRIIs at 91% each, while the bent-tailed class has a recall of 79%. These results show that our results are comparable to that of manual classification, while being much faster. Finally, we discuss the computational and data-related challenges associated with the morphological classification of radio galaxies with CNNs.
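
    The per-class precision and recall figures quoted in this abstract follow from a confusion matrix. The sketch below shows the arithmetic only; the counts are made up for illustration and are not taken from the paper, and the function name is hypothetical.

```python
def precision_recall(confusion, labels):
    """Per-class (precision, recall).

    confusion[i][j] is the number of samples of true class i
    that were predicted as class j.
    """
    result = {}
    for idx, label in enumerate(labels):
        true_pos = confusion[idx][idx]
        predicted = sum(row[idx] for row in confusion)  # column sum
        actual = sum(confusion[idx])                    # row sum
        precision = true_pos / predicted if predicted else 0.0
        recall = true_pos / actual if actual else 0.0
        result[label] = (precision, recall)
    return result

labels = ["FRI", "FRII", "bent-tailed"]
# Made-up counts, 50 sources per true class.
confusion = [
    [45, 4, 1],
    [10, 38, 2],
    [5, 5, 40],
]
for label, (p, r) in precision_recall(confusion, labels).items():
    print(f"{label}: precision={p:.2f}, recall={r:.2f}")
```

    Precision divides by what the classifier predicted for a class, recall by what truly belongs to it, which is why one class can lead on precision while another leads on recall.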

  2. Classifying Radio Galaxies with the Convolutional Neural Network

    Energy Technology Data Exchange (ETDEWEB)

    Aniyan, A. K.; Thorat, K. [Department of Physics and Electronics, Rhodes University, Grahamstown (South Africa)]

    2017-06-01

    We present the application of a deep machine learning technique to classify radio images of extended sources on a morphological basis using convolutional neural networks (CNN). In this study, we have taken the case of the Fanaroff–Riley (FR) class of radio galaxies as well as radio galaxies with bent-tailed morphology. We have used archival data from the Very Large Array (VLA)—Faint Images of the Radio Sky at Twenty Centimeters survey and existing visually classified samples available in the literature to train a neural network for morphological classification of these categories of radio sources. Our training sample size for each of these categories is ∼200 sources, which has been augmented by rotated versions of the same. Our study shows that CNNs can classify images of the FRI and FRII and bent-tailed radio galaxies with high accuracy (maximum precision at 95%) using well-defined samples and a “fusion classifier,” which combines the results of binary classifications, while allowing for a mechanism to find sources with unusual morphologies. The individual precision is highest for bent-tailed radio galaxies at 95% and is 91% and 75% for the FRI and FRII classes, respectively, whereas the recall is highest for FRI and FRIIs at 91% each, while the bent-tailed class has a recall of 79%. These results show that our results are comparable to that of manual classification, while being much faster. Finally, we discuss the computational and data-related challenges associated with the morphological classification of radio galaxies with CNNs.

  3. Efficient Multi-Concept Visual Classifier Adaptation in Changing Environments

    Science.gov (United States)

    2016-09-01

    sets of images, hand annotated by humans with region boundary outlines followed by label assignment. This annotation is time consuming, and...performed as a necessary but time-consuming step to train supervised classifiers. Unsupervised or self-supervised approaches have been used to...time-consuming labeling process. However, the lack of human supervision has limited most of this work to binary classification (e.g., traversability

  4. Classifying apples by the means of fluorescence imaging

    OpenAIRE

    Codrea, Marius C.; Nevalainen, Olli S.; Tyystjärvi, Esa; VAN DE VEN, Martin; VALCKE, Roland

    2004-01-01

    Classification of harvested apples when predicting their storage potential is an important task. This paper describes how chlorophyll a fluorescence images taken in blue light through a red filter can be used to classify apples. In such an image, fluorescence appears as a relatively homogenous area broken by a number of small nonfluorescing spots, corresponding to normal corky tissue patches, lenticels, and to damaged areas that lower the quality of the apple. The damaged regions appear mor...

  5. Building Road-Sign Classifiers Using a Trainable Similarity Measure

    Czech Academy of Sciences Publication Activity Database

    Paclík, P.; Novovičová, Jana; Duin, R.P.W.

    2006-01-01

    Vol. 7, No. 3 (2006), pp. 309-321. ISSN 1524-9050. R&D Projects: GA AV ČR IAA2075302. EU Projects: European Commission (XE) 507752 - MUSCLE. Institutional research plan: CEZ:AV0Z10750506. Keywords: classifier system design; road-sign classification; similarity data representation. Subject RIV: BB - Applied Statistics, Operational Research. Impact factor: 1.434, year: 2006. http://www.ewh.ieee.org/tc/its/trans.html

  6. Classifying Radio Galaxies with the Convolutional Neural Network

    Science.gov (United States)

    Aniyan, A. K.; Thorat, K.

    2017-06-01

    We present the application of a deep machine learning technique to classify radio images of extended sources on a morphological basis using convolutional neural networks (CNN). In this study, we have taken the case of the Fanaroff-Riley (FR) class of radio galaxies as well as radio galaxies with bent-tailed morphology. We have used archival data from the Very Large Array (VLA) Faint Images of the Radio Sky at Twenty Centimeters survey and existing visually classified samples available in the literature to train a neural network for morphological classification of these categories of radio sources. Our training sample size for each of these categories is ∼200 sources, which has been augmented by rotated versions of the same. Our study shows that CNNs can classify images of the FRI and FRII and bent-tailed radio galaxies with high accuracy (maximum precision at 95%) using well-defined samples and a “fusion classifier,” which combines the results of binary classifications, while allowing for a mechanism to find sources with unusual morphologies. The individual precision is highest for bent-tailed radio galaxies at 95% and is 91% and 75% for the FRI and FRII classes, respectively, whereas the recall is highest for FRI and FRIIs at 91% each, while the bent-tailed class has a recall of 79%. These results show that our results are comparable to that of manual classification, while being much faster. Finally, we discuss the computational and data-related challenges associated with the morphological classification of radio galaxies with CNNs.

  7. Classifying Floating Potential Measurement Unit Data Products as Science Data

    Science.gov (United States)

    Coffey, Victoria; Minow, Joseph

    2015-01-01

    We are Co-Investigators for the Floating Potential Measurement Unit (FPMU) on the International Space Station (ISS) and members of the FPMU operations and data analysis team. We are providing this memo for the purpose of classifying raw and processed FPMU data products and ancillary data as NASA science data with unrestricted, public availability in order to best support science uses of the data.

  8. Snoring classified: The Munich-Passau Snore Sound Corpus.

    Science.gov (United States)

    Janott, Christoph; Schmitt, Maximilian; Zhang, Yue; Qian, Kun; Pandit, Vedhas; Zhang, Zixing; Heiser, Clemens; Hohenhorst, Winfried; Herzog, Michael; Hemmert, Werner; Schuller, Björn

    2018-03-01

    Snoring can be excited in different locations within the upper airways during sleep. It was hypothesised that the excitation locations are correlated with distinct acoustic characteristics of the snoring noise. To verify this hypothesis, a database of snore sounds is developed, labelled with the location of sound excitation. Video and audio recordings taken during drug induced sleep endoscopy (DISE) examinations from three medical centres have been semi-automatically screened for snore events, which subsequently have been classified by ENT experts into four classes based on the VOTE classification. The resulting dataset containing 828 snore events from 219 subjects has been split into Train, Development, and Test sets. An SVM classifier has been trained using low level descriptors (LLDs) related to energy, spectral features, mel frequency cepstral coefficients (MFCC), formants, voicing, harmonic-to-noise ratio (HNR), spectral harmonicity, pitch, and microprosodic features. An unweighted average recall (UAR) of 55.8% could be achieved using the full set of LLDs including formants. The best-performing subset is the MFCC-related set of LLDs. A strong difference in performance could be observed between the permutations of train, development, and test partition, which may be caused by the relatively low number of subjects included in the smaller classes of the strongly unbalanced data set. A database of snoring sounds is presented which are classified according to their sound excitation location based on objective criteria and verifiable video material. With the database, it could be demonstrated that machine classifiers can distinguish different excitation locations of snoring sounds in the upper airway based on acoustic parameters. Copyright © 2018 Elsevier Ltd. All rights reserved.
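
    Unweighted average recall (UAR), the metric reported in this abstract, is the plain mean of per-class recalls, so a small class counts as much as a large one. The following sketch shows the computation on made-up labels; the single-letter class names are only placeholders for the four VOTE classes, and the data are not from the corpus.

```python
from collections import defaultdict

def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls; every class counts equally regardless of size."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)

# Made-up, strongly unbalanced labels: class "V" dominates.
y_true = ["V"] * 8 + ["O"] * 2 + ["T"] * 1 + ["E"] * 1
y_pred = ["V"] * 8 + ["V", "O"] + ["V"] + ["E"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR={uar:.3f}, accuracy={accuracy:.3f}")
```

    On unbalanced data like this, plain accuracy is dominated by the largest class, while UAR exposes poor recall on the small classes.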

  9. Young module multiplicities and classifying the indecomposable Young permutation modules

    OpenAIRE

    Gill, Christopher C.

    2012-01-01

    We study the multiplicities of Young modules as direct summands of permutation modules on cosets of Young subgroups. Such multiplicities have become known as the p-Kostka numbers. We classify the indecomposable Young permutation modules, and, applying the Brauer construction for p-permutation modules, we give some new reductions for p-Kostka numbers. In particular we prove that p-Kostka numbers are preserved under multiplying partitions by p, and strengthen a known reduction given by Henke, c...

  10. BIOPHARMACEUTICS CLASSIFICATION SYSTEM: A STRATEGIC TOOL FOR CLASSIFYING DRUG SUBSTANCES

    OpenAIRE

    Rohilla Seema; Rohilla Ankur; Marwaha RK; Nanda Arun

    2011-01-01

    The biopharmaceutical classification system (BCS) is a scientific approach for classifying drug substances based on their dose/solubility ratio and intestinal permeability. The BCS has been developed to allow prediction of in vivo pharmacokinetic performance of drug products from measurements of permeability and solubility. Moreover, drugs can be categorized into four BCS classes on the basis of permeability and solubility, namely: high permeability high solubility, high permeability lo...

  11. The Sirenomelia Sequence: A Case History

    Directory of Open Access Journals (Sweden)

    Anis Fadhlaoui

    2010-01-01

    We report a case of sirenomelia sequence observed in an incident of preterm labor during the 29th gestational week. According to some authors, this syndrome should be classified separately from caudal regression syndrome and is likely to be the result of an abnormality taking place during the fourth gestational week, causing developmental abnormalities in the lower extremities, pelvis, genitalia, urinary tract and digestive organs. Despite recent progress in pathology, the etiopathogenesis of sirenomelia is still debated.

  12. The Sirenomelia Sequence: A Case History

    Science.gov (United States)

    Fadhlaoui, Anis; Khrouf, Mohamed; Gaigi, Soumaya; Zhioua, Fethi; Chaker, Anis

    2010-01-01

    We report a case of sirenomelia sequence observed in an incident of preterm labor during the 29th gestational week. According to some authors, this syndrome should be classified separately from caudal regression syndrome and is likely to be the result of an abnormality taking place during the fourth gestational week, causing developmental abnormalities in the lower extremities, pelvis, genitalia, urinary tract and digestive organs. Despite recent progress in pathology, the etiopathogenesis of sirenomelia is still debated. PMID:21769253

  13. The sirenomelia sequence: a case history.

    Science.gov (United States)

    Fadhlaoui, Anis; Khrouf, Mohamed; Gaigi, Soumaya; Zhioua, Fethi; Chaker, Anis

    2010-01-01

    We report a case of sirenomelia sequence observed in an incident of preterm labor during the 29th gestational week. According to some authors, this syndrome should be classified separately from caudal regression syndrome and is likely to be the result of an abnormality taking place during the fourth gestational week, causing developmental abnormalities in the lower extremities, pelvis, genitalia, urinary tract and digestive organs. Despite recent progress in pathology, the etiopathogenesis of sirenomelia is still debated.

  14. Self-organizing map classifier for stressed speech recognition

    Science.gov (United States)

    Partila, Pavol; Tovarek, Jaromir; Voznak, Miroslav

    2016-05-01

    This paper presents a method for detecting speech under stress using Self-Organizing Maps (SOM). Most people who are exposed to stressful situations cannot respond adequately to stimuli. The army, police, and fire departments make up the largest part of the environments that typically involve an increased number of stressful situations. The actions of personnel in the field are directed by a control center, and control commands should be adapted to the psychological state of the person in action. It is known that psychological changes in the human body are also reflected physiologically, which consequently means that stress affects speech. Therefore, a system for recognizing stress in speech is needed in the security forces. One possible classifier, popular for its flexibility, is the self-organizing map, a type of artificial neural network. Flexibility here means that the classifier is independent of the character of the input data, a feature that is well suited to speech processing. Human stress can be seen as a kind of emotional state. Mel-frequency cepstral coefficients, LPC coefficients, and prosody features were selected as input data because of their sensitivity to emotional changes. The parameters were calculated on speech recordings divided into two classes, namely stress-state recordings and normal-state recordings. The contribution of the experiment is a method that uses an SOM classifier for stressed speech detection. The results showed the advantage of this method, which is its input data flexibility.

  15. Deconstructing Cross-Entropy for Probabilistic Binary Classifiers

    Directory of Open Access Journals (Sweden)

    Daniel Ramos

    2018-03-01

    In this work, we analyze the cross-entropy function, widely used in classifiers both as a performance measure and as an optimization objective. We contextualize cross-entropy in the light of Bayesian decision theory, the formal probabilistic framework for making decisions, and we thoroughly analyze its motivation, meaning and interpretation from an information-theoretical point of view. In this sense, this article presents several contributions: First, we explicitly analyze the contribution to cross-entropy of (i) prior knowledge; and (ii) the value of the features in the form of a likelihood ratio. Second, we introduce a decomposition of cross-entropy into two components: discrimination and calibration. This decomposition enables the measurement of different performance aspects of a classifier in a more precise way; and justifies previously reported strategies to obtain reliable probabilities by means of the calibration of the output of a discriminating classifier. Third, we give different information-theoretical interpretations of cross-entropy, which can be useful in different application scenarios, and which are related to the concept of reference probabilities. Fourth, we present an analysis tool, the Empirical Cross-Entropy (ECE) plot, a compact representation of cross-entropy and its aforementioned decomposition. We show the power of ECE plots, as compared to other classical performance representations, in two diverse experimental examples: a speaker verification system, and a forensic case where some glass findings are present.
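
    The way prior knowledge and likelihood ratios combine in cross-entropy can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it assumes a binary problem with hypotheses here called h1 and h2, converts each likelihood ratio to a posterior through Bayes' rule with prior odds, and averages the negative base-2 log posterior of the true hypothesis per class, weighted by the priors. The function name and sample values are hypothetical, and the discrimination/calibration decomposition is not computed.

```python
import math

def empirical_cross_entropy(lrs_h1, lrs_h2, prior_h1):
    """Empirical cross-entropy (in bits) of likelihood ratios at a given prior.

    lrs_h1: likelihood ratios from trials where hypothesis h1 is true.
    lrs_h2: likelihood ratios from trials where hypothesis h2 is true.
    prior_h1: prior probability of h1, strictly between 0 and 1.
    """
    odds = prior_h1 / (1.0 - prior_h1)
    # Posterior of h1 is LR*odds / (1 + LR*odds), so -log2 of it is
    # log2(1 + 1 / (LR*odds)); for h2 the cost is log2(1 + LR*odds).
    cost_h1 = sum(math.log2(1.0 + 1.0 / (lr * odds)) for lr in lrs_h1) / len(lrs_h1)
    cost_h2 = sum(math.log2(1.0 + lr * odds) for lr in lrs_h2) / len(lrs_h2)
    return prior_h1 * cost_h1 + (1.0 - prior_h1) * cost_h2

# Uninformative likelihood ratios (LR = 1) leave the prior untouched.
print(empirical_cross_entropy([1.0, 1.0], [1.0, 1.0], prior_h1=0.5))
# Hypothetical, well-separated likelihood ratios reduce the cross-entropy.
print(empirical_cross_entropy([10.0, 20.0], [0.1, 0.05], prior_h1=0.5))
```

    Evaluating the function over a range of priors gives one curve of the kind an ECE plot displays; with LR = 1 everywhere the result reduces to the entropy of the prior.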

  16. General and Local: Averaged k-Dependence Bayesian Classifiers

    Directory of Open Access Journals (Sweden)

    Limin Wang

    2015-06-01

    The inference of a general Bayesian network has been shown to be an NP-hard problem, even for approximate solutions. Although the k-dependence Bayesian (KDB) classifier can be constructed at arbitrary points (values of k) along the attribute dependence spectrum, it cannot identify the changes of interdependencies when attributes take different values. Local KDB, which learns in the framework of KDB, is proposed in this study to describe the local dependencies implicated in each test instance. Based on the analysis of functional dependencies, substitution-elimination resolution, a new type of semi-naive Bayesian operation, is proposed to substitute or eliminate generalization to achieve accurate estimation of conditional probability distribution while reducing computational complexity. The final classifier, the averaged k-dependence Bayesian (AKDB) classifier, averages the output of KDB and local KDB. Experimental results on the repository of machine learning databases from the University of California Irvine (UCI) showed that AKDB has significant advantages in zero-one loss and bias relative to naive Bayes (NB), tree augmented naive Bayes (TAN), averaged one-dependence estimators (AODE), and KDB. Moreover, KDB and local KDB show mutually complementary characteristics with respect to variance.

  17. Evaluation of Polarimetric SAR Decomposition for Classifying Wetland Vegetation Types

    Directory of Open Access Journals (Sweden)

    Sang-Hoon Hong

    2015-07-01

    The Florida Everglades is the largest subtropical wetland system in the United States and, as with subtropical and tropical wetlands elsewhere, has been threatened by severe environmental stresses. It is very important to monitor such wetlands to inform management on the status of these fragile ecosystems. This study aims to examine the applicability of TerraSAR-X quadruple polarimetric (quad-pol) synthetic aperture radar (PolSAR) data for classifying wetland vegetation in the Everglades. We processed quad-pol data using the Hong & Wdowinski four-component decomposition, which accounts for double bounce scattering in the cross-polarization signal. The calculated decomposition images consist of four scattering mechanisms (single, co- and cross-pol double, and volume scattering). We applied an object-oriented image analysis approach to classify vegetation types with the decomposition results. We also used a high-resolution multispectral optical RapidEye image to compare statistics and classification results with Synthetic Aperture Radar (SAR) observations. The calculated classification accuracy was higher than 85%, suggesting that the TerraSAR-X quad-pol SAR signal had a high potential for distinguishing different vegetation types. Scattering components from SAR acquisition were particularly advantageous for classifying mangroves along tidal channels. We conclude that the typical scattering behaviors from model-based decomposition are useful for discriminating among different wetland vegetation types.

  18. A Novel Cascade Classifier for Automatic Microcalcification Detection.

    Directory of Open Access Journals (Sweden)

    Seung Yeon Shin

    In this paper, we present a novel cascaded classification framework for automatic detection of individual and clusters of microcalcifications (μC). Our framework comprises three classification stages: (i) a random forest (RF) classifier for simple features capturing the second order local structure of individual μCs, where non-μC pixels in the target mammogram are efficiently eliminated; (ii) a more complex discriminative restricted Boltzmann machine (DRBM) classifier for μC candidates determined in the RF stage, which automatically learns the detailed morphology of μC appearances for improved discriminative power; and (iii) a detector to detect clusters of μCs from the individual μC detection results, using two different criteria. From the two-stage RF-DRBM classifier, we are able to distinguish μCs using explicitly computed features, as well as learn implicit features that are able to further discriminate between confusing cases. Experimental evaluation is conducted on the original Mammographic Image Analysis Society (MIAS) and mini-MIAS databases, as well as our own Seoul National University Bundang Hospital digital mammographic database. It is shown that the proposed method outperforms comparable methods in terms of receiver operating characteristic (ROC) and precision-recall curves for detection of individual μCs and free-response receiver operating characteristic (FROC) curve for detection of clustered μCs.

  19. Patients on weaning trials classified with support vector machines

    International Nuclear Information System (INIS)

    Garde, Ainara; Caminal, Pere; Giraldo, Beatriz F; Schroeder, Rico; Voss, Andreas; Benito, Salvador

    2010-01-01

    The process of discontinuing mechanical ventilation is called weaning and is one of the most challenging problems in intensive care. An unnecessary delay in the discontinuation process and an early weaning trial are undesirable. This study aims to characterize the respiratory pattern through features that permit the identification of patients' conditions in weaning trials. Three groups of patients have been considered: 94 patients with successful weaning trials, who could maintain spontaneous breathing after 48 h (GSucc); 39 patients who failed the weaning trial (GFail) and 21 patients who had successful weaning trials, but required reintubation in less than 48 h (GRein). Patients are characterized by their cardiorespiratory interactions, which are described by joint symbolic dynamics (JSD) applied to the cardiac interbeat and breath durations. The most discriminating features in the classification of the different groups of patients (GSucc, GFail and GRein) are identified by support vector machines (SVMs). The SVM-based feature selection algorithm has an accuracy of 81% in classifying GSucc versus the rest of the patients, 83% in classifying GRein versus GSucc patients and 81% in classifying GRein versus the rest of the patients. Moreover, a good balance between sensitivity and specificity is achieved in all classifications

  20. Comparison of artificial intelligence classifiers for SIP attack data

    Science.gov (United States)

    Safarik, Jakub; Slachta, Jiri

    2016-05-01

    A honeypot application is a source of valuable data about attacks on the network. We run several SIP honeypots in various computer networks, which are separated geographically and logically. Each honeypot runs on a public IP address and uses standard SIP PBX ports. All information gathered via the honeypots is periodically sent to a centralized server. This server classifies all attack data with a neural network algorithm. The paper describes optimizations of the neural network classifier that lower the classification error. The article compares two neural network algorithms used for the classification of validation data. The first is the original implementation of the neural network described in recent work; the second uses further optimizations such as input normalization and a cross-entropy cost function. We also use other implementations of neural networks and machine learning classification algorithms. The comparison tests their capabilities on validation data to find the optimal classifier. The results show promise for further development of an accurate SIP attack classification engine.
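
    Two of the optimizations named in this abstract, input normalization and a cross-entropy cost function, can be sketched in a few lines. The abstract does not say which normalization was used, so min-max scaling is an assumption here, as are the function names and the sample values; the cost shown is the standard binary cross-entropy.

```python
import math

def min_max_normalize(columns):
    """Scale each feature column to [0, 1]; constant columns map to 0."""
    normalized = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = hi - lo
        normalized.append([(v - lo) / span if span else 0.0 for v in col])
    return normalized

def cross_entropy_cost(targets, outputs, eps=1e-12):
    """Mean binary cross-entropy between 0/1 targets and predicted probabilities."""
    total = 0.0
    for t, y in zip(targets, outputs):
        y = min(max(y, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(y) + (1 - t) * math.log(1.0 - y)
    return total / len(targets)

# Hypothetical features: one column of raw counts, one of durations in seconds.
features = [[3, 120, 45], [0.2, 30.0, 7.5]]
print(min_max_normalize(features))

# A confident wrong prediction costs far more than a confident right one.
print(cross_entropy_cost([1, 0], [0.9, 0.1]))
print(cross_entropy_cost([1, 0], [0.1, 0.9]))
```

    Normalization puts features with very different ranges on a common scale before they reach the network, and the cross-entropy cost penalizes confident mistakes steeply.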

  1. Classifier-Guided Sampling for Complex Energy System Optimization

    Energy Technology Data Exchange (ETDEWEB)

    Backlund, Peter B. [Sandia National Laboratories (SNL-NM), Albuquerque, NM (United States); Eddy, John P. [Sandia National Laboratories (SNL-NM), Albuquerque, NM (United States)

    2015-09-01

    This report documents the results of a Laboratory Directed Research and Development (LDRD) effort entitled "Classifier-Guided Sampling for Complex Energy System Optimization" that was conducted during FY 2014 and FY 2015. The goal of this project was to develop, implement, and test major improvements to the classifier-guided sampling (CGS) algorithm. CGS is a type of evolutionary algorithm for performing search and optimization over a set of discrete design variables in the face of one or more objective functions. Existing evolutionary algorithms, such as genetic algorithms, may require a large number of objective function evaluations to identify optimal or near-optimal solutions. Reducing the number of evaluations can result in significant time savings, especially if the objective function is computationally expensive. CGS reduces the evaluation count by using a Bayesian network classifier to filter out non-promising candidate designs, prior to evaluation, based on their posterior probabilities. In this project, both the single-objective and multi-objective versions of the CGS are developed and tested on a set of benchmark problems. As a domain-specific case study, CGS is used to design a microgrid for use in islanded mode during an extended bulk power grid outage.

  2. Application of the Naive Bayesian Classifier to optimize treatment decisions

    International Nuclear Information System (INIS)

    Kazmierska, Joanna; Malicki, Julian

    2008-01-01

    Background and purpose: To study the accuracy, specificity and sensitivity of the Naive Bayesian Classifier (NBC) in the assessment of individual risk of cancer relapse or progression after radiotherapy (RT). Materials and methods: Data of 142 brain tumour patients irradiated from 2000 to 2005 were analyzed. Ninety-six attributes related to disease, patient and treatment were chosen. Attributes in binary form constituted the training set for NBC learning. NBC calculated an individual conditional probability of being assigned to the relapse or progression group (1) or the no relapse or progression group (0). Accuracy, attribute selection and quality of classifier were determined by comparison with actual treatment results, leave-one-out and cross validation methods, respectively. The clinical setting test utilized data of 35 patients. Treatment results at the time of classification were unknown and were compared with classification results after 3 months. Results: High classification accuracy (84%), specificity (0.87) and sensitivity (0.80) were achieved, both for classifier training and in progressive clinical evaluation. Conclusions: NBC is a useful tool to support the assessment of individual risk of relapse or progression in patients diagnosed with brain tumour undergoing RT postoperatively.
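
    The conditional probability described in this record can be illustrated with a minimal sketch of a naive Bayes classifier over binary attributes. The function names, the toy data and the Laplace smoothing constant `alpha` are assumptions made for illustration; the record does not say how the authors estimated their probabilities.

```python
import math

def train_nbc(samples, labels, alpha=1.0):
    """Estimate class priors and per-attribute Bernoulli likelihoods.

    samples: list of equal-length lists of 0/1 attributes
    labels:  list of 0/1 outcomes (1 = relapse or progression)
    alpha:   Laplace smoothing constant (an assumption, not from the record)
    """
    n_attrs = len(samples[0])
    model = {}
    for cls in (0, 1):
        rows = [s for s, y in zip(samples, labels) if y == cls]
        prior = (len(rows) + alpha) / (len(samples) + 2 * alpha)
        # P(attribute_j = 1 | class), smoothed so no probability is 0 or 1
        p_one = [(sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
                 for j in range(n_attrs)]
        model[cls] = (prior, p_one)
    return model

def posterior_relapse(model, x):
    """Return P(class = 1 | x) under the naive independence assumption."""
    log_joint = {}
    for cls, (prior, p_one) in model.items():
        lp = math.log(prior)
        for xj, pj in zip(x, p_one):
            lp += math.log(pj if xj == 1 else 1.0 - pj)
        log_joint[cls] = lp
    # Normalise in log space for numerical stability
    m = max(log_joint.values())
    z = sum(math.exp(v - m) for v in log_joint.values())
    return math.exp(log_joint[1] - m) / z

# Toy data: three hypothetical binary attributes per patient
X = [[1, 1, 0], [1, 0, 0], [0, 0, 1], [0, 1, 1]]
y = [1, 1, 0, 0]
model = train_nbc(X, y)
p = posterior_relapse(model, [1, 1, 0])
```

    Working in log space avoids underflow when many attributes are multiplied together, which matters with the ninety-six attributes mentioned in the record.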

  3. A support vector machine (SVM) based voltage stability classifier

    Energy Technology Data Exchange (ETDEWEB)

    Dosano, R.D.; Song, H. [Kunsan National Univ., Kunsan, Jeonbuk (Korea, Republic of); Lee, B. [Korea Univ., Seoul (Korea, Republic of)

    2007-07-01

    Power system stability has become even more complex and critical with the advent of deregulated energy markets and the growing desire to fully employ existing transmission and infrastructure. The economic pressure on electricity markets forces the operation of power systems and components to their limit of capacity and performance. System conditions can be more exposed to instability due to greater uncertainty in day-to-day system operations and an increase in the number of potential components for system disturbances, potentially resulting in voltage instability. This paper proposed a support vector machine (SVM) based power system voltage stability classifier using local measurements of voltage and active power of load. It described the procedure for fast classification of long-term voltage stability using the SVM algorithm. The application of the SVM based voltage stability classifier was presented with reference to the choice of input parameters; input data preconditioning; moving window for feature vector; determination of learning samples; and other considerations in SVM applications. The paper presented a case study with numerical examples of an 11-bus test system. The test results for the feasibility study demonstrated that the classifier could offer an excellent performance in classification with time-series measurements in terms of long-term voltage stability. 9 refs., 14 figs.

  4. Entropy based classifier for cross-domain opinion mining

    Directory of Open Access Journals (Sweden)

    Jyoti S. Deshmukh

    2018-01-01

    In recent years, the growth of social networks has increased the interest of people in analyzing reviews and opinions for products before they buy them. Consequently, this has given rise to domain adaptation as a prominent area of research in sentiment analysis. A classifier trained on one domain often gives poor results on data from another domain. Expression of sentiment is different in every domain. Labeling each domain separately is very costly as well as time consuming. Therefore, this study has proposed an approach that extracts and classifies opinion words from one domain, called the source domain, and predicts opinion words of another domain, called the target domain, using a semi-supervised approach, which combines modified maximum entropy and bipartite graph clustering. A comparison of opinion classification on reviews on four different product domains is presented. The results demonstrate that the proposed method performs relatively well in comparison to the other methods. Comparison of SentiWordNet of domain-specific and domain-independent words reveals that on an average 72.6% and 88.4% words, respectively, are correctly classified.

  5. Improved target detection algorithm using Fukunaga-Koontz transform and distance classifier correlation filter

    Science.gov (United States)

    Bal, A.; Alam, M. S.; Aslan, M. S.

    2006-05-01

    Often sensor ego-motion or fast target movement causes the target to temporarily go out of the field-of-view, leading to a reappearing target detection problem in target tracking applications. Since the target goes out of the current frame and reenters at a later frame, the reentering location and variations in rotation, scale, and other 3D orientations of the target are not known, thus complicating detection. A detection algorithm has been developed using the Fukunaga-Koontz Transform (FKT) and distance classifier correlation filter (DCCF). The detection algorithm uses target and background information, extracted from training samples, to detect possible candidate target images. The detected candidate target images are then introduced into the second algorithm, DCCF, called the clutter rejection module, to determine the target coordinates. Once the target coordinates are detected, the tracking algorithm is initiated. The performance of the proposed FKT-DCCF based target detection algorithm has been tested using real-world forward looking infrared (FLIR) video sequences.

  6. ConSpeciFix: Classifying prokaryotic species based on gene flow.

    Science.gov (United States)

    Bobay, Louis-Marie; Ellis, Brian Shin-Hua; Ochman, Howard

    2018-05-16

    Classification of prokaryotic species is usually based on sequence similarity thresholds, which are easy to apply but lack a biologically-relevant foundation. Here, we present ConSpeciFix, a program that classifies prokaryotes into species using criteria set forth by the Biological Species Concept, thereby unifying species definition in all domains of life. ConSpeciFix's webserver is freely available at www.conspecifix.com. The local version of the program can be freely downloaded from https://github.com/Bobay-Ochman/ConSpeciFix. ConSpeciFix is written in Python 2.7 and requires the following dependencies: Usearch, MCL, MAFFT and RAxML. ljbobay@uncg.edu.

  7. Degeneration and height of cervical discs classified from MRI compared with precise height measurements from radiographs

    Energy Technology Data Exchange (ETDEWEB)

    Kolstad, Frode [National Centre of Spinal Disorders, Norwegian University of Science and Technology, University Hospital of Trondheim, 7006 Trondheim (Norway)]. E-mail: frode.kolstad@medisin.ntnu.no; Myhr, Gunnar [Department of Radiology, University Hospital of Trondheim, 7006 Trondheim (Norway); Kvistad, Kjell Arne [Department of Radiology, University Hospital of Trondheim, 7006 Trondheim (Norway); Nygaard, Oystein P. [National Centre of Spinal Disorders, Norwegian University of Science and Technology, University Hospital of Trondheim, 7006 Trondheim (Norway); Leivseth, Gunnar [Department of Neuromedicine, Faculty of Medicine, Norwegian University of Science and Technology, University Hospital of Trondheim, 7006 Trondheim (Norway)

    2005-09-01

    Study design: Descriptive study comparing MRI classifications with measurements from radiographs. Objectives: 1. Define the relationship between MRI classified cervical disc degeneration and objectively measured disc height. 2. Assess the level of inter- and intra-observer errors using MRI in defining cervical disc degeneration. Summary of background data: Cervical spine degeneration has been defined radiologically by loss of disc height, decreased disc and bone marrow signal intensity and disc protrusion/herniation on MRI. The intra- and inter-observer error using MRI in defining cervical degeneration influences data interpretation. Few previous studies have addressed this source of error. The relation and time sequence between cervical disc degeneration classified by MRI and cervical disc height decrease measured from radiographs is unclear. Methods: The MRI classification of degeneration was based on nucleus signal, prolapse identification and bone marrow signal. Two neuro-radiologists evaluated the MR images independently in a blinded fashion. The radiographic disc height measurements were done by a new computer-assisted method compensating for image distortion and permitting comparison with normal level-, age- and gender-appropriate disc height. Results/conclusions: 1. Progressing disc degeneration classified from MRI is on average significantly associated with a decrease of disc height as measured from radiographs. Within each MRI-defined category of degeneration, however, measured disc heights scatter over a wide range. 2. The inter-observer agreement between two neuro-radiologists in both defining degeneration and disc height by MRI was only moderate. Studies addressing questions related to cervical disc degeneration should take this into consideration.

  8. Degeneration and height of cervical discs classified from MRI compared with precise height measurements from radiographs

    International Nuclear Information System (INIS)

    Kolstad, Frode; Myhr, Gunnar; Kvistad, Kjell Arne; Nygaard, Oystein P.; Leivseth, Gunnar

    2005-01-01

    Study design: Descriptive study comparing MRI classifications with measurements from radiographs. Objectives: 1. Define the relationship between MRI classified cervical disc degeneration and objectively measured disc height. 2. Assess the level of inter- and intra-observer errors using MRI in defining cervical disc degeneration. Summary of background data: Cervical spine degeneration has been defined radiologically by loss of disc height, decreased disc and bone marrow signal intensity and disc protrusion/herniation on MRI. The intra- and inter-observer error using MRI in defining cervical degeneration influences data interpretation. Few previous studies have addressed this source of error. The relation and time sequence between cervical disc degeneration classified by MRI and cervical disc height decrease measured from radiographs is unclear. Methods: The MRI classification of degeneration was based on nucleus signal, prolapse identification and bone marrow signal. Two neuro-radiologists evaluated the MR images independently in a blinded fashion. The radiographic disc height measurements were done by a new computer-assisted method compensating for image distortion and permitting comparison with normal level-, age- and gender-appropriate disc height. Results/conclusions: 1. Progressing disc degeneration classified from MRI is on average significantly associated with a decrease of disc height as measured from radiographs. Within each MRI-defined category of degeneration, however, measured disc heights scatter over a wide range. 2. The inter-observer agreement between two neuro-radiologists in both defining degeneration and disc height by MRI was only moderate. Studies addressing questions related to cervical disc degeneration should take this into consideration.

  9. Anomaly detection in forward looking infrared imaging using one-class classifiers

    Science.gov (United States)

    Popescu, Mihail; Stone, Kevin; Havens, Timothy; Ho, Dominic; Keller, James

    2010-04-01

    In this paper we describe a method for generating cues of possible abnormal objects present in the field of view of an infrared (IR) camera installed on a moving vehicle. The proposed method has two steps. In the first step, for each frame, we generate a set of possible points of interest using a corner detection algorithm. In the second step, the points related to the background are discarded from the point set using a one-class classifier (OCC) trained on features extracted from a local neighborhood of each point. The advantage of using an OCC is that we do not need examples from the "abnormal object" class to train the classifier. Instead, the OCC is trained using corner points from images known to be free of abnormal objects, i.e., that contain only background scenes. To further reduce the number of false alarms we use a temporal fusion procedure: a region has to be detected as "interesting" in m out of n, mGM). The comparison is performed using a set of about 900 background point neighborhoods for training and 400 for testing. The best performing OCC is then used to detect abnormal objects in a set of IR video sequences obtained on a 1 mile long country road.
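
    The temporal fusion rule mentioned in this record can be sketched as an m-out-of-n vote over a sliding window. This is a minimal illustration that assumes the n frames are consecutive and that each frame yields one boolean flag per region; the function name and the sample values are hypothetical, and the record's own description of the rule is truncated.

```python
from collections import deque

def temporal_fusion(detections, m, n):
    """Return one boolean per frame: True when the region was flagged in at
    least m of the last n frames (the current frame included)."""
    window = deque(maxlen=n)  # keeps only the n most recent flags
    alarms = []
    for flagged in detections:
        window.append(bool(flagged))
        # No alarm is raised until a full window of n frames is available
        alarms.append(len(window) == n and sum(window) >= m)
    return alarms

# Per-frame detector output for one region (1 = flagged as interesting)
frames = [1, 0, 1, 1, 0, 0, 0, 1]
alarms = temporal_fusion(frames, m=2, n=3)
```

    Isolated single-frame detections, such as the last frame in this toy sequence, do not raise an alarm, which is how the vote suppresses false alarms.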

  10. Yellowfin Tuna (Thunnus albacares) Fishing Ground Forecasting Model Based On Bayes Classifier In The South China Sea

    Directory of Open Access Journals (Sweden)

    Zhou Wei-feng

    2017-08-01

    Using the yellowfin tuna (Thunnus albacares, YFT) longline fishing catch data in the open South China Sea (SCS) provided by WCPFC, the optimum interpolation sea surface temperature (OISST) from CPC/NOAA and the multi-satellite altimetric monthly averaged sea surface height (SSH) product released by CNES, eight alternative options based on a Bayes classifier were constructed in this paper, according to different strategies for the choice of environmental factors and the levels of fishing zones, to classify the YFT fishing ground in the open SCS. The classification results were compared with the actual ones for validation and analyzed to determine how the different options affect classification results and precision. The validation results showed that the precisions of the eight options were 71.4%, 75%, 70.8%, 74.4%, 66.7%, 68.5%, 57.7% and 63.7%, in sequence; the first six, all above 65%, would basically meet practical application needs. The options which use SST and SSH simultaneously as environmental factors have higher precision than those which use SST alone, and adding SSH can improve the model precision to a certain extent. The options which use CPUE's mean ± standard deviation as the threshold have higher precision than those which use CPUE's 33.3% and 66.7% quantiles as the thresholds.
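
    The two thresholding strategies compared in this record can be sketched as follows. This is an illustration only: the function names, the three level labels and the sample CPUE values are assumptions, and the record does not say how the authors computed their quantiles.

```python
import statistics

def thresholds_mean_std(cpue):
    """Lower and upper thresholds at mean - sd and mean + sd."""
    mu = statistics.mean(cpue)
    sd = statistics.stdev(cpue)  # sample standard deviation
    return mu - sd, mu + sd

def thresholds_terciles(cpue):
    """Lower and upper thresholds at the 33.3% and 66.7% quantiles."""
    low, high = statistics.quantiles(cpue, n=3)
    return low, high

def zone_level(value, low, high):
    """Assign a fishing-zone level from a CPUE value and two thresholds."""
    if value < low:
        return "low"
    if value > high:
        return "high"
    return "medium"

# Hypothetical CPUE values for nine grid cells
cpue = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
ms_low, ms_high = thresholds_mean_std(cpue)
q_low, q_high = thresholds_terciles(cpue)
levels = [zone_level(v, ms_low, ms_high) for v in cpue]
```

    With mean ± standard deviation the "medium" class is wider than with terciles, so the class balance seen by the Bayes classifier differs between the two strategies.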

  11. Automatic Human Facial Expression Recognition Based on Integrated Classifier From Monocular Video with Uncalibrated Camera

    Directory of Open Access Journals (Sweden)

    Yu Tao

    2017-01-01

    An automatic recognition framework for human facial expressions from a monocular video with an uncalibrated camera is proposed. The expression characteristics are first acquired from a kind of deformable template, similar to a facial muscle distribution. After associated regularization, the time sequences from the trait changes in space-time under complete expressional production are then arranged line by line in a matrix. Next, the matrix dimensionality is reduced by a method of manifold learning of neighborhood-preserving embedding. Finally, the refined matrix containing the expression trait information is recognized by a classifier that integrates the hidden conditional random field (HCRF) and support vector machine (SVM). In an experiment using the Cohn–Kanade database, the proposed method showed a comparatively higher recognition rate than the individual HCRF or SVM methods in direct recognition from two-dimensional human face traits. Moreover, the proposed method was shown to be more robust than the typical Kotsia method because the former contains more structural characteristics of the data to be classified in space-time.

  12. nRC: non-coding RNA Classifier based on structural features.

    Science.gov (United States)

    Fiannaca, Antonino; La Rosa, Massimo; La Paglia, Laura; Rizzo, Riccardo; Urso, Alfonso

    2017-01-01

    Non-coding RNA (ncRNA) are small non-coding sequences involved in gene expression regulation of many biological processes and diseases. The recent discovery of a large set of different ncRNAs with biologically relevant roles has opened the way to develop methods able to discriminate between the different ncRNA classes. Moreover, the lack of knowledge about the complete mechanisms in regulative processes, together with the development of high-throughput technologies, has required the help of bioinformatics tools in addressing biologists and clinicians with a deeper comprehension of the functional roles of ncRNAs. In this work, we introduce a new ncRNA classification tool, nRC (non-coding RNA Classifier). Our approach is based on features extraction from the ncRNA secondary structure together with a supervised classification algorithm implementing a deep learning architecture based on convolutional neural networks. We tested our approach for the classification of 13 different ncRNA classes. We obtained classification scores, using the most common statistical measures. In particular, we reach an accuracy and sensitivity score of about 74%. The proposed method outperforms other similar classification methods based on secondary structure features and machine learning algorithms, including the RNAcon tool that, to date, is the reference classifier. nRC tool is freely available as a docker image at https://hub.docker.com/r/tblab/nrc/. The source code of nRC tool is also available at https://github.com/IcarPA-TBlab/nrc.

  13. Improved Collaborative Representation Classifier Based on l2-Regularized for Human Action Recognition

    Directory of Open Access Journals (Sweden)

    Shirui Huo

    2017-01-01

    Human action recognition is an important and challenging task. Projecting depth images onto three depth motion maps (DMMs) and extracting deep convolutional neural network (DCNN) features yields discriminative descriptors that characterize the spatiotemporal information of a specific action from a sequence of depth images. In this paper, a unified improved collaborative representation framework is proposed in which the probability that a test sample belongs to the collaborative subspace of all classes can be well defined and calculated. The improved collaborative representation classifier (ICRC) based on l2 regularization for human action recognition is presented to maximize the likelihood that a test sample belongs to each class; theoretical investigation into ICRC then shows that it obtains a final classification by computing the likelihood for each class. Coupled with the DMMs and DCNN features, experiments on depth image-based action recognition, including the MSRAction3D and MSRGesture3D datasets, demonstrate that the proposed approach, which uses a distance-based representation classifier, achieves superior performance over the state-of-the-art methods, including SRC, CRC, and SVM.

  14. An Ensemble Method with Integration of Feature Selection and Classifier Selection to Detect the Landslides

    Science.gov (United States)

    Zhongqin, G.; Chen, Y.

    2017-12-01

    Quickly and automatically identifying the spatial distribution of landslides is essential for the prevention, mitigation and assessment of landslide hazards. It is still a challenging job owing to the complicated characteristics and vague boundaries of landslide areas in the image. High-resolution remote sensing images have multiple scales, complex spatial distributions and abundant features; object-oriented image classification methods can make full use of this information and thus effectively detect landslides after the hazard has happened. In this research we present a new semi-supervised workflow that takes advantage of recent object-oriented image analysis and machine learning algorithms to quickly locate landslides of different origins in some areas of southwest China. Besides a sequence of image segmentation, feature selection, object classification and error testing, this workflow integrates feature selection and classifier selection in an ensemble. The features used in this study were normalized difference vegetation index (NDVI) change, textural features derived from gray level co-occurrence matrices (GLCM), spectral features, etc. The results show that this algorithm removes some redundant features and that the classifiers are fully used. All these improvements lead to higher accuracy in determining the shape of landslides in high-resolution remote sensing images, and in particular to flexibility across different kinds of landslides.

  15. A comparative evaluation of sequence classification programs

    Directory of Open Access Journals (Sweden)

    Bazinet Adam L

    2012-05-01

    Background: A fundamental problem in modern genomics is to taxonomically or functionally classify DNA sequence fragments derived from environmental sampling (i.e., metagenomics). Several different methods have been proposed for doing this effectively and efficiently, and many have been implemented in software. In addition to varying their basic algorithmic approach to classification, some methods screen sequence reads for 'barcoding genes' like 16S rRNA, or various types of protein-coding genes. Due to the sheer number and complexity of methods, it can be difficult for a researcher to choose one that is well-suited for a particular analysis. Results: We divided the very large number of programs that have been released in recent years for solving the sequence classification problem into three main categories based on the general algorithm they use to compare a query sequence against a database of sequences. We also evaluated the performance of the leading programs in each category on data sets whose taxonomic and functional composition is known. Conclusions: We found significant variability in classification accuracy, precision, and resource consumption of sequence classification programs when used to analyze various metagenomics data sets. However, we observe some general trends and patterns that will be useful to researchers who use sequence classification programs.

  16. On the statistical assessment of classifiers using DNA microarray data

    Directory of Open Access Journals (Sweden)

    Carella M

    2006-08-01

    Background: In this paper we present a method for the statistical assessment of cancer predictors which make use of gene expression profiles. The methodology is applied to a new data set of microarray gene expression data collected in Casa Sollievo della Sofferenza Hospital, Foggia, Italy. The data set is made up of normal (22) and tumor (25) specimens extracted from 25 patients affected by colon cancer. We propose to give answers to some questions which are relevant for the automatic diagnosis of cancer such as: Is the size of the available data set sufficient to build accurate classifiers? What is the statistical significance of the associated error rates? In what ways can accuracy be considered dependent on the adopted classification scheme? How many genes are correlated with the pathology and how many are sufficient for an accurate colon cancer classification? The method we propose answers these questions whilst avoiding the potential pitfalls hidden in the analysis and interpretation of microarray data. Results: We estimate the generalization error, evaluated through the Leave-K-Out Cross Validation error, for three different classification schemes by varying the number of training examples and the number of the genes used. The statistical significance of the error rate is measured by using a permutation test. We provide a statistical analysis in terms of the frequencies of the genes involved in the classification. Using the whole set of genes, we found that the Weighted Voting Algorithm (WVA) classifier learns the distinction between normal and tumor specimens with 25 training examples, providing e = 21% (p = 0.045) as an error rate. This remains constant even when the number of examples increases. Moreover, Regularized Least Squares (RLS) and Support Vector Machines (SVM) classifiers can learn with only 15 training examples, with an error rate of e = 19% (p = 0.035) and e = 18% (p = 0.037), respectively. Moreover, the error rate
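
    The idea of attaching a p-value to a cross-validation error rate with a permutation test can be sketched as below. This is a simplified illustration, not the authors' procedure: it uses leave-one-out (K = 1) instead of Leave-K-Out, a one-dimensional nearest-class-mean classifier instead of WVA, RLS or SVM, and made-up expression values. All names are hypothetical.

```python
import random

def loo_error(values, labels):
    """Leave-one-out error of a 1-D nearest-class-mean classifier."""
    errors = 0
    for i in range(len(values)):
        means = {}
        for cls in set(labels):
            train = [v for j, (v, y) in enumerate(zip(values, labels))
                     if j != i and y == cls]
            if train:
                means[cls] = sum(train) / len(train)
        # Predict the class whose training mean is closest to the held-out value
        predicted = min(means, key=lambda c: abs(values[i] - means[c]))
        errors += predicted != labels[i]
    return errors / len(values)

def permutation_p_value(values, labels, n_perm=200, seed=0):
    """Fraction of label shufflings whose error is as low as the observed one."""
    rng = random.Random(seed)
    observed = loo_error(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if loo_error(values, shuffled) <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive
    return observed, (hits + 1) / (n_perm + 1)

# Made-up expression values for one gene: five normal (0), five tumor (1)
expression = [0.1, 0.3, 0.2, 0.4, 0.25, 2.1, 2.4, 1.9, 2.2, 2.0]
status = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
err, p = permutation_p_value(expression, status)
```

    A small p-value means that an error rate this low is rarely reached when the labels carry no information, which is the sense in which the record reports values such as p = 0.045 next to its error rates.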

  17. Can scientific journals be classified based on their citation profiles?

    Directory of Open Access Journals (Sweden)

    Sayed-Amir Marashi

    2015-03-01

    Classification of scientific publications is of great importance in biomedical research evaluation. However, accurate classification of research publications is challenging and normally is performed in a rather subjective way. In the present paper, we propose to classify biomedical publications into superfamilies, by analysing their citation profiles, i.e. the location of citations in the structure of citing articles. Such a classification may help authors to find the appropriate biomedical journal for publication, may make journal comparisons more rational, and may even help planners to better track the consequences of their policies on biomedical research.

  18. Classifying the future of universes with dark energy

    International Nuclear Information System (INIS)

    Chiba, Takeshi; Takahashi, Ryuichi; Sugiyama, Naoshi

    2005-01-01

    We classify the future of the universe for general cosmological models including matter and dark energy. If the equation of state of dark energy is less than -1, the age of the universe becomes finite. We compute the remaining lifetime of the universe for such models. The behaviour of the future growth of matter density perturbations is also studied. We find that the collapse of a spherical overdensity region is greatly changed if the equation of state of dark energy is less than -1.
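
    Why an equation of state below -1 gives a finite age can be seen from a rough worked estimate. It is not taken from the record and rests on stated assumptions: a spatially flat universe, a constant equation of state $w < -1$, present scale factor $a_0 = 1$, and matter neglected in the future evolution. The symbols $H_0$ (present Hubble rate), $\Omega_{\rm de}$ (present dark energy density fraction) and $t_{\rm end}$ (time at which the scale factor diverges) are introduced here for illustration.

```latex
% Friedmann equation with dark energy only, w < -1, a_0 = 1:
\[
  H^2 = H_0^2\,\Omega_{\rm de}\,a^{-3(1+w)}
  \quad\Longrightarrow\quad
  \dot a = a H = H_0\sqrt{\Omega_{\rm de}}\;a^{\,1+\frac{3}{2}|1+w|} .
\]
% Integrating from today (a = 1) to a -> infinity gives a finite time:
\[
  t_{\rm end} - t_0
  = \int_1^{\infty} \frac{da}{H_0\sqrt{\Omega_{\rm de}}\;a^{\,1+\frac{3}{2}|1+w|}}
  = \frac{2}{3\,|1+w|\,H_0\sqrt{\Omega_{\rm de}}} .
\]
```

    For example, with $w = -1.5$, $\Omega_{\rm de} = 0.7$ and $H_0^{-1} \approx 14$ Gyr, this estimate gives roughly 22 Gyr. For $w \ge -1$ the same integral diverges, so the remaining time is infinite.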

  19. DFRFT: A Classified Review of Recent Methods with Its Application

    Directory of Open Access Journals (Sweden)

    Ashutosh Kumar Singh

    2013-01-01

    In the literature, there are various algorithms available for computing the discrete fractional Fourier transform (DFRFT). In this paper, all the existing methods are reviewed, classified into four categories, and subsequently compared to find the best alternative from the viewpoint of minimal computational error, computational complexity, transform features, and additional features like security. Subsequently, the correlation theorem of FRFT has been utilized to significantly remove the Doppler shift caused by motion of the receiver in the DSB-SC AM signal. Finally, the role of DFRFT has been investigated in the area of steganography.

  20. Application of a naive Bayesians classifiers in assessing the supplier

    Directory of Open Access Journals (Sweden)

    Mijailović Snežana

    2017-01-01

    The paper considers the class of interactive knowledge-based systems whose main purpose is making proposals and assisting customers in making decisions. The mathematical model provides a set of learning examples about the delivered series of outflows from three suppliers, as well as an analysis of an illustrative example for assessing the supplier using a naive Bayesian classifier. The model was developed on the basis of the analysis of subjective probabilities, which are later revised with the help of new empirical information and Bayes' theorem on posterior probability, i.e. combining subjective and objective conditional probabilities in the choice of a reliable supplier.

  1. Interface Prostheses With Classifier-Feedback-Based User Training.

    Science.gov (United States)

    Fang, Yinfeng; Zhou, Dalin; Li, Kairu; Liu, Honghai

    2017-11-01

    It is evident that user training significantly affects performance of pattern-recognition-based myoelectric prosthetic device control. Despite plausible classification accuracy on offline datasets, online accuracy usually suffers from the changes in physiological conditions and electrode displacement. The user ability in generating consistent electromyographic (EMG) patterns can be enhanced via proper user training strategies in order to improve online performance. This study proposes a clustering-feedback strategy that provides real-time feedback to users by means of a visualized online EMG signal input as well as the centroids of the training samples, whose dimensionality is reduced to minimal number by dimension reduction. Clustering feedback provides a criterion that guides users to adjust motion gestures and muscle contraction forces intentionally. The experiment results have demonstrated that hand motion recognition accuracy increases steadily along the progress of the clustering-feedback-based user training, while conventional classifier-feedback methods, i.e., label feedback, hardly achieve any improvement. The result concludes that the use of proper classifier feedback can accelerate the process of user training, and implies prosperous future for the amputees with limited or no experience in pattern-recognition-based prosthetic device manipulation.

  2. Nonlinear Knowledge in Kernel-Based Multiple Criteria Programming Classifier

    Science.gov (United States)

    Zhang, Dongling; Tian, Yingjie; Shi, Yong

    The Kernel-based Multiple Criteria Linear Programming (KMCLP) model is used as a classification method that can learn from training examples. In the traditional machine learning area, by contrast, data sets are classified only by prior knowledge. Some works combine these two classification principles to overcome the shortcomings of each approach. In this paper, we propose a model that incorporates nonlinear knowledge into KMCLP in order to solve the problem in which the input consists of not only training examples but also nonlinear prior knowledge. On a real-world breast cancer diagnosis case, the model performs better than the model based solely on training data.

  3. On-line computing in a classified environment

    International Nuclear Information System (INIS)

    O'Callaghan, P.B.

    1982-01-01

    Westinghouse Hanford Company (WHC) recently developed a Department of Energy (DOE) approved real-time, on-line computer system to control nuclear material. The system simultaneously processes both classified and unclassified information. Implementation of this system required application of many security techniques. The system has a secure, but user friendly interface. Many software applications protect the integrity of the data base from malevolent or accidental errors. Programming practices ensure the integrity of the computer system software. The audit trail and the reports generation capability record user actions and status of the nuclear material inventory

  4. A Handbook for Derivative Classifiers at Los Alamos National Laboratory

    Energy Technology Data Exchange (ETDEWEB)

    Sinkula, Barbara Jean [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2018-02-23

    The Los Alamos Classification Office (within the SAFE-IP group) prepared this handbook as a resource for the Laboratory’s derivative classifiers (DCs). It contains information about United States Government (USG) classification policy, principles, and authorities as they relate to the LANL Classification Program in general, and to the LANL DC program specifically. At a working level, DCs review Laboratory documents and material that are subject to classification review requirements, while the Classification Office provides the training and resources for DCs to perform that vital function.

  5. Classifying BCI signals from novice users with extreme learning machine

    Directory of Open Access Journals (Sweden)

    Rodríguez-Bermúdez Germán

    2017-07-01

    A brain-computer interface (BCI) allows external devices to be controlled using only the electrical activity of the brain. Several approaches have been proposed to improve such systems. However, algorithms are usually tested with standard BCI signals from expert users or from repositories available on the Internet. In this work, the extreme learning machine (ELM) has been tested with signals from 5 novice users and compared with standard classification algorithms. Experimental results show that ELM is a suitable method for classifying electroencephalogram signals from novice users.

  6. A novel ensemble and composite approach for classifying proteins ...

    African Journals Online (AJOL)

    For the fact that the location of proteins gave some details about the function of a protein whose location was uncertain, protein classification was regarded as a very important task in the field of biological data mining. However, the success of a human genome project led to a protein sequence explosion. There is a great ...

  7. Nonparametric combinatorial sequence models.

    Science.gov (United States)

    Wauthier, Fabian L; Jordan, Michael I; Jojic, Nebojsa

    2011-11-01

    This work considers biological sequences that exhibit combinatorial structures in their composition: groups of positions of the aligned sequences are "linked" and covary as one unit across sequences. If multiple such groups exist, complex interactions can emerge between them. Sequences of this kind arise frequently in biology but methodologies for analyzing them are still being developed. This article presents a nonparametric prior on sequences which allows combinatorial structures to emerge and which induces a posterior distribution over factorized sequence representations. We carry out experiments on three biological sequence families which indicate that combinatorial structures are indeed present and that combinatorial sequence models can more succinctly describe them than simpler mixture models. We conclude with an application to MHC binding prediction which highlights the utility of the posterior distribution over sequence representations induced by the prior. By integrating out the posterior, our method compares favorably to leading binding predictors.

  8. Classifying and mapping wetlands and peat resources using digital cartography

    Science.gov (United States)

    Cameron, Cornelia C.; Emery, David A.

    1992-01-01

    Digital cartography allows the portrayal of spatial associations among diverse data types and is ideally suited for land use and resource analysis. We have developed methodology that uses digital cartography for the classification of wetlands and their associated peat resources and applied it to a 1:24 000 scale map area in New Hampshire. Classifying and mapping wetlands involves integrating the spatial distribution of wetlands types with depth variations in associated peat quality and character. A hierarchically structured classification that integrates the spatial distribution of variations in (1) vegetation, (2) soil type, (3) hydrology, (4) geologic aspects, and (5) peat characteristics has been developed and can be used to build digital cartographic files for resource and land use analysis. The first three parameters are the bases used by the National Wetlands Inventory to classify wetlands and deepwater habitats of the United States. The fourth parameter, geological aspects, includes slope, relief, depth of wetland (from surface to underlying rock or substrate), wetland stratigraphy, and the type and structure of solid and unconsolidated rock surrounding and underlying the wetland. The fifth parameter, peat characteristics, includes the subsurface variation in ash, acidity, moisture, heating value (Btu), sulfur content, and other chemical properties as shown in specimens obtained from core holes. These parameters can be shown as a series of map data overlays with tables that can be integrated for resource or land use analysis.

  9. Efficacy of MRI in classifying proximal focal femoral deficiency

    International Nuclear Information System (INIS)

    Maldjian, C.; Patel, T.Y.; Klein, R.M.; Smith, R.C.

    2007-01-01

    To evaluate the efficacy of MRI in classifying PFFD and to compare MRI to radiographic classification of PFFD. Radiographic and MRI classification of the cases was performed utilizing the Amstutz classification system. Retrospective evaluation of radiographs and MRI exams in nine hips of eight patients with proximal focal femoral deficiency was performed by two radiologists. The cases were classified by radiographs as Amstutz 1: n=3, Amstutz 3: n=3, Amstutz 4: n=1 and Amstutz 5: n=2. The classifications based on MRI were Amstutz 1: n=6, Amstutz 2: n=1, Amstutz 3: n=0, Amstutz 4: n=2 and Amstutz 5: n=0. Three hips demonstrated complete agreement. There were six discordant hips. In two of the discordant cases, follow-up radiographs of 6 months or greater intervals were available and helped to confirm MRI findings. Errors in radiographic evaluation consisted of overestimating the degree of deficiency. MRI is more accurate than radiographic evaluation for the classification of PFFD, particularly early on, prior to the ossification of cartilaginous components in the femurs. Since radiographic evaluation tends to overestimate the degree of deficiency, MRI is a more definitive modality for evaluation of PFFD. (orig.)

  10. REPTREE CLASSIFIER FOR IDENTIFYING LINK SPAM IN WEB SEARCH ENGINES

    Directory of Open Access Journals (Sweden)

    S.K. Jayanthi

    2013-01-01

    Search engines are used for retrieving information from the web. Most of the time, importance is placed on the top 10 results, which sometimes shrinks to the top 5 because of time constraints and reliance on the search engines. Users believe that the top 10 or 5 results are the most relevant. Here comes the problem of spamdexing, a method of subverting search result quality. Falsified metrics, such as inserting an enormous number of keywords or links in a website, may take that website to the top 10 or 5 positions. This paper proposes a classifier based on the Reptree (regression tree representative). As an initial step, link-based features such as neighbors, pagerank, truncated pagerank, trustrank and assortativity-related attributes are inferred. Based on these features, a tree is constructed. The tree uses the feature inference to differentiate spam sites from legitimate sites. The WEBSPAM-UK-2007 dataset is taken as a base. It is preprocessed and converted into five datasets: FEATA, FEATB, FEATC, FEATD and FEATE. Only link-based features are used in the experiments; this paper focuses on link spam alone. Finally, a representative tree is created that classifies the web spam entries more precisely. Results are given. Regression tree classification seems to perform well, as shown through experiments.

  11. Deposition of Nanostructured Thin Film from Size-Classified Nanoparticles

    Science.gov (United States)

    Camata, Renato P.; Cunningham, Nicholas C.; Seol, Kwang Soo; Okada, Yoshiki; Takeuchi, Kazuo

    2003-01-01

    Materials comprising nanometer-sized grains (approximately 1-50 nm) exhibit properties dramatically different from those of their homogeneous and uniform counterparts. These properties vary with size, shape, and composition of nanoscale grains. Thus, nanoparticles may be used as building blocks to engineer tailor-made artificial materials with desired properties, such as non-linear optical absorption, tunable light emission, charge-storage behavior, selective catalytic activity, and countless other characteristics. This bottom-up engineering approach requires exquisite control over nanoparticle size, shape, and composition. We describe the design and characterization of an aerosol system conceived for the deposition of size-classified nanoparticles whose performance is consistent with these strict demands. A nanoparticle aerosol is generated by laser ablation and sorted according to size using a differential mobility analyzer. Nanoparticles within a chosen window of sizes (e.g., (8.0 ± 0.6) nm) are deposited electrostatically on a surface forming a film of the desired material. The system allows the assembly and engineering of thin films using size-classified nanoparticles as building blocks.

  12. Speaker gender identification based on majority vote classifiers

    Science.gov (United States)

    Mezghani, Eya; Charfeddine, Maha; Nicolas, Henri; Ben Amar, Chokri

    2017-03-01

    Speaker gender identification is considered among the most important tools in several multimedia applications, namely automatic speech recognition, interactive voice response systems and audio browsing systems. The performance of gender identification systems is closely linked to the selected feature set and the employed classification model. Typical techniques are based on selecting the best performing classification method or searching for the optimum tuning of one classifier's parameters through experimentation. In this paper, we consider a relevant and rich set of features involving pitch, MFCCs as well as other temporal and frequency-domain descriptors. Five classification models, including decision tree, discriminant analysis, naive Bayes, support vector machine and k-nearest neighbor, were tested. The three best performing classifiers among the five then contribute by majority voting between their scores. Experiments were performed on three different datasets spoken in three languages (English, German and Arabic) in order to validate the language independence of the proposed scheme. Results confirm that the presented system has reached a satisfying accuracy rate and promising classification performance thanks to the discriminating abilities and diversity of the used features combined with mid-level statistics.
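    As a rough illustration of the majority-vote step described above, the sketch below fuses the outputs of three classifiers per utterance. It votes on hard labels, which is a simplification (the abstract speaks of voting between scores); the classifier names and predictions are made up for the example.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by most classifiers for one utterance."""
    label, _count = Counter(predictions).most_common(1)[0]
    return label


# Hypothetical per-utterance outputs of the three retained classifiers.
svm_out = ["female", "male", "male", "female"]
knn_out = ["female", "female", "male", "male"]
tree_out = ["male", "male", "male", "female"]

# zip(...) groups the three votes that belong to the same utterance.
fused = [majority_vote(votes) for votes in zip(svm_out, knn_out, tree_out)]
print(fused)
```

    With three voters and two classes a tie cannot occur, which is one practical reason for keeping an odd number of classifiers.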

  13. Spread-sheet application to classify radioactive material for shipment

    International Nuclear Information System (INIS)

    Brown, A.N.

    1998-01-01

    A spread-sheet application has been developed at the Idaho National Engineering and Environmental Laboratory to aid the shipper when classifying nuclide mixtures of normal form, radioactive materials. The results generated by this spread-sheet are used to confirm the proper US DOT classification when offering radioactive material packages for transport. The user must input to the spread-sheet the mass of the material being classified, the physical form (liquid or not) and the activity of each regulated nuclide. The spread-sheet uses these inputs to calculate two general values: (1) the specific activity of the material and (2) a summation calculation of the nuclide content. The specific activity is used to determine if the material exceeds the DOT minimal threshold for a radioactive material. If the material is calculated to be radioactive, the specific activity is also used to determine if the material meets the activity requirement for one of the three low specific activity designations (LSA-I, LSA-II, LSA-III, or not LSA). Again, if the material is calculated to be radioactive, the summation calculation is then used to determine which activity category the material will meet (Limited Quantity, Type A, Type B, or Highway Route Controlled Quantity). This spread-sheet has proven to be an invaluable aid for shippers of radioactive materials at the Idaho National Engineering and Environmental Laboratory. (authors)

  14. Identifying aggressive prostate cancer foci using a DNA methylation classifier.

    Science.gov (United States)

    Mundbjerg, Kamilla; Chopra, Sameer; Alemozaffar, Mehrdad; Duymich, Christopher; Lakshminarasimhan, Ranjani; Nichols, Peter W; Aron, Manju; Siegmund, Kimberly D; Ukimura, Osamu; Aron, Monish; Stern, Mariana; Gill, Parkash; Carpten, John D; Ørntoft, Torben F; Sørensen, Karina D; Weisenberger, Daniel J; Jones, Peter A; Duddalwar, Vinay; Gill, Inderbir; Liang, Gangning

    2017-01-12

    Slow-growing prostate cancer (PC) can be aggressive in a subset of cases. Therefore, prognostic tools to guide clinical decision-making and avoid overtreatment of indolent PC and undertreatment of aggressive disease are urgently needed. PC has a propensity to be multifocal with several different cancerous foci per gland. Here, we have taken advantage of the multifocal propensity of PC and categorized aggressiveness of individual PC foci based on DNA methylation patterns in primary PC foci and matched lymph node metastases. In a set of 14 patients, we demonstrate that over half of the cases have multiple epigenetically distinct subclones and determine the primary subclone from which the metastatic lesion(s) originated. Furthermore, we develop an aggressiveness classifier consisting of 25 DNA methylation probes to determine aggressive and non-aggressive subclones. Upon validation of the classifier in an independent cohort, the predicted aggressive tumors are significantly associated with the presence of lymph node metastases and invasive tumor stages. Overall, this study provides molecular-based support for determining PC aggressiveness with the potential to impact clinical decision-making, such as targeted biopsy approaches for early diagnosis and active surveillance, in addition to focal therapy.

  15. Spreadsheet application to classify radioactive material for shipment

    International Nuclear Information System (INIS)

    Brown, A.N.

    1997-12-01

    A spreadsheet application has been developed at the Idaho National Engineering and Environmental Laboratory to aid the shipper when classifying nuclide mixtures of normal form, radioactive materials. The results generated by this spreadsheet are used to confirm the proper US Department of Transportation (DOT) classification when offering radioactive material packages for transport. The user must input to the spreadsheet the mass of the material being classified, the physical form (liquid or not), and the activity of each regulated nuclide. The spreadsheet uses these inputs to calculate two general values: (1) the specific activity of the material, and (2) a summation calculation of the nuclide content. The specific activity is used to determine if the material exceeds the DOT minimal threshold for a radioactive material (Yes or No). If the material is calculated to be radioactive, the specific activity is also used to determine if the material meets the activity requirement for one of the three Low Specific Activity designations (LSA-I, LSA-II, LSA-III, or Not LSA). Again, if the material is calculated to be radioactive, the summation calculation is then used to determine which activity category the material will meet (Limited Quantity, Type A, Type B, or Highway Route Controlled Quantity)
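    The two values the spreadsheet computes can be illustrated with a minimal sketch. It assumes the "summation calculation" is a sum of per-nuclide fractions (activity divided by a per-nuclide limit), which the abstract does not spell out. All nuclide names, activities and limits below are placeholders chosen for arithmetic convenience; they are not regulatory values.

```python
def specific_activity(activities_bq, mass_g):
    """Total activity of the mixture per unit mass, in Bq/g."""
    return sum(activities_bq.values()) / mass_g


def sum_of_fractions(activities_bq, limits_bq):
    """Sum over nuclides of activity divided by that nuclide's limit."""
    return sum(a / limits_bq[name] for name, a in activities_bq.items())


# Placeholder inputs for illustration only; NOT regulatory values.
activities = {"nuclide_A": 2.0e6, "nuclide_B": 5.0e5}  # Bq
limits = {"nuclide_A": 1.0e7, "nuclide_B": 1.0e6}      # Bq, hypothetical
mass_g = 500.0

sa = specific_activity(activities, mass_g)
fraction = sum_of_fractions(activities, limits)
within_limit = fraction <= 1.0  # the mixture stays under the combined limit
print(sa, fraction, within_limit)
```

    A real tool would compare the specific activity and the fraction sum against the thresholds in the applicable transport regulations to pick the LSA designation and the activity category.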

  16. Classifying decommissioning wastes for allocation to appropriate final repositories

    International Nuclear Information System (INIS)

    Alder, J.C.; Tunaboylu, K.

    1982-01-01

    For the safe disposal of radioactive wastes in different repositories, it is of advantage to classify them in well-defined conditioned categories, appropriate for final disposal. These categories, the so-called waste sorts are characterized by similar radionuclide distribution, similar nuclide-specific activity concentrations and similar waste matrix. A methodology is presented for classifying decommissioning wastes and is applied to the decommissioning wastes arising from a Swiss program of 6 GWe. The amounts and nuclide-specific activity inventories of the decommissioning waste sorts have been estimated. A first allocation into two different repository types has been performed. Such a classification enables one to define the source parameters for repository safety analysis and allows one to allocate the different waste categories into appropriate final repositories. This work presents a first iteration to determine which waste sorts belong to which repository type. The characteristics of waste sorts have to be better defined and the protective strength of the repository barriers has to be optimized. 7 references, 2 figures, 4 tables

  17. Classifying magnetic resonance image modalities with convolutional neural networks

    Science.gov (United States)

    Remedios, Samuel; Pham, Dzung L.; Butman, John A.; Roy, Snehashis

    2018-02-01

    Magnetic Resonance (MR) imaging allows the acquisition of images with different contrast properties depending on the acquisition protocol and the magnetic properties of tissues. Many MR brain image processing techniques, such as tissue segmentation, require multiple MR contrasts as inputs, and each contrast is treated differently. Thus it is advantageous to automate the identification of image contrasts for various purposes, such as facilitating image processing pipelines, and managing and maintaining large databases via content-based image retrieval (CBIR). Most automated CBIR techniques focus on a two-step process: extracting features from data and classifying the image based on these features. We present a novel 3D deep convolutional neural network (CNN)-based method for MR image contrast classification. The proposed CNN automatically identifies the MR contrast of an input brain image volume. Specifically, we explored three classification problems: (1) identify T1-weighted (T1-w), T2-weighted (T2-w), and fluid-attenuated inversion recovery (FLAIR) contrasts, (2) identify pre- vs post-contrast T1, (3) identify pre- vs post-contrast FLAIR. A total of 3418 image volumes acquired from multiple sites and multiple scanners were used. To evaluate each task, the proposed model was trained on 2137 images and tested on the remaining 1281 images. Results showed that image volumes were correctly classified with 97.57% accuracy.

  18. Improved training for target detection using Fukunaga-Koontz transform and distance classifier correlation filter

    Science.gov (United States)

    Elbakary, M. I.; Alam, M. S.; Aslan, M. S.

    2008-03-01

    In a FLIR image sequence, a target may disappear permanently or may reappear after some frames, and crucial information related to the target, such as direction, position and size, is lost. If the target reappears at a later frame, it may not be tracked again because the 3D orientation, size and location of the target might have changed. To obtain information about the target before it disappears and to detect the target after it reappears, the distance classifier correlation filter (DCCF) is trained manually by selecting a number of chips randomly. This paper introduces a novel idea to eliminate the manual intervention in the training phase of DCCF. Instead of selecting the training chips manually and choosing the number of training chips randomly, we adopted the K-means algorithm to cluster the training frames and, based on the number of clusters, we select the training chips such that there is one training chip for each cluster. To detect and track the target after it reappears in the field of view, TBF and DCCF are employed. The experiments conducted using real FLIR sequences show results similar to the traditional algorithm, but eliminating the manual intervention is the advantage of the proposed algorithm.
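    The chip-selection idea (cluster the training frames with K-means, then keep one training chip per cluster) can be sketched as below. This is an assumption-laden toy: frames are represented by hypothetical 2-D descriptors, centroids are initialised deterministically from the first k points, and the chip chosen for a cluster is the frame nearest its centroid, which the abstract does not specify.

```python
from math import dist
from statistics import fmean


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; centroids start at the first k points."""
    centroids = list(points[:k])
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[idx].append(p)
        # Recompute each centroid; keep the old one if a cluster is empty.
        centroids = [
            tuple(fmean(axis) for axis in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


def pick_chips(points, centroids):
    """Index of the frame closest to each centroid: one chip per cluster."""
    return [
        min(range(len(points)), key=lambda j: dist(points[j], c))
        for c in centroids
    ]


# Hypothetical per-frame descriptors (e.g. target size, mean intensity).
frames = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (0.9, 1.1), (5.1, 4.9), (4.8, 5.2)]
centroids, clusters = kmeans(frames, k=2)
chips = pick_chips(frames, centroids)
print(chips)
```

    The selected frame indices would then be the chips used to train the correlation filter, replacing the manual, random selection.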

  19. Sequencing of the Litchi Downy Blight Pathogen Reveals It Is a Phytophthora Species With Downy Mildew-Like Characteristics.

    Science.gov (United States)

    Ye, Wenwu; Wang, Yang; Shen, Danyu; Li, Delong; Pu, Tianhuizi; Jiang, Zide; Zhang, Zhengguang; Zheng, Xiaobo; Tyler, Brett M; Wang, Yuanchao

    2016-07-01

    On the basis of its downy mildew-like morphology, the litchi downy blight pathogen was previously named Peronophythora litchii. Recently, however, it was proposed to transfer this pathogen to Phytophthora clade 4. To better characterize this unusual oomycete species and important fruit pathogen, we obtained the genome sequence of Phytophthora litchii and compared it to those from other oomycete species. P. litchii has a small genome with tightly spaced genes. On the basis of a multilocus phylogenetic analysis, the placement of P. litchii in the genus Phytophthora is strongly supported. Effector proteins predicted included 245 RxLR, 30 necrosis-and-ethylene-inducing protein-like, and 14 crinkler proteins. The typical motifs, phylogenies, and activities of these effectors were typical for a Phytophthora species. However, like the genome features of the analyzed downy mildews, P. litchii exhibited a streamlined genome with a relatively small number of genes in both core and species-specific protein families. The low GC content and slight codon preferences of P. litchii sequences were similar to those of the analyzed downy mildews and a subset of Phytophthora species. Taken together, these observations suggest that P. litchii is a Phytophthora pathogen that is in the process of acquiring downy mildew-like genomic and morphological features. Thus P. litchii may provide a novel model for investigating morphological development and genomic adaptation in oomycete pathogens.

  20. Genome Sequence Databases (Overview): Sequencing and Assembly

    Energy Technology Data Exchange (ETDEWEB)

    Lapidus, Alla L.

    2009-01-01

    From the date its role in heredity was discovered, DNA has been generating interest among scientists from different fields of knowledge: physicists have studied the three dimensional structure of the DNA molecule, biologists tried to decode the secrets of life hidden within these long molecules, and technologists invent and improve methods of DNA analysis. The analysis of the nucleotide sequence of DNA occupies a special place among the methods developed. Thanks to the variety of sequencing technologies available, the process of decoding the sequence of genomic DNA (or whole genome sequencing) has become robust and inexpensive. Meanwhile the assembly of whole genome sequences remains a challenging task. In addition to the need to assemble millions of DNA fragments of different length (from 35 bp (Solexa) to 800 bp (Sanger)), great interest in analysis of microbial communities (metagenomes) of different complexities raises new problems and pushes some new requirements for sequence assembly tools to the forefront. The genome assembly process can be divided into two steps: draft assembly and assembly improvement (finishing). Despite the fact that automatically performed assembly (or draft assembly) is capable of covering up to 98% of the genome, in most cases, it still contains incorrectly assembled reads. The error rate of the consensus sequence produced at this stage is about 1/2000 bp. A finished genome represents the genome assembly of much higher accuracy (with no gaps or incorrectly assembled areas) and quality (~1 error/10,000 bp), validated through a number of computer and laboratory experiments.

  1. On the evaluation of the fidelity of supervised classifiers in the prediction of chimeric RNAs.

    Science.gov (United States)

    Beaumeunier, Sacha; Audoux, Jérôme; Boureux, Anthony; Ruffle, Florence; Commes, Thérèse; Philippe, Nicolas; Alves, Ronnie

    2016-01-01

    High-throughput sequencing technology and bioinformatics have identified chimeric RNAs (chRNAs), raising the possibility that chRNAs expressed particularly in diseases can be used as potential biomarkers in both diagnosis and prognosis. The task of discriminating true chRNAs from false ones poses an interesting machine learning (ML) challenge. First of all, the sequencing data may contain false reads due to technical artifacts, and during the analysis process bioinformatics tools may generate false positives due to methodological biases. Moreover, even if we succeed in obtaining a proper set of observations (enough sequencing data) about true chRNAs, chances are that the devised model will not be able to generalize beyond it. As in any other machine learning problem, the first big issue is finding good data to build models. As far as we know, there is no common benchmark data available for chRNA detection. The definition of a classification baseline is lacking in the related literature too. In this work we move towards benchmark data and an evaluation of the fidelity of supervised classifiers in the prediction of chRNAs. We propose a modeling strategy that can be used to increase the performance of tools in the context of chRNA classification, based on a simulated data generator that permits the continuous integration of new complex chimeric events. The pipeline incorporates a genome mutation process and simulated RNA-seq data. The reads at distinct depths were aligned and analysed by CRAC, which integrates genomic location and local coverage, allowing biological predictions at the read scale. Additionally, these reads were functionally annotated and aggregated to form chRNA events, making it possible to evaluate the performance of ML methods (classifiers) at both the read and event levels. Ensemble learning strategies proved to be more robust for this classification problem, providing an average AUC performance of 95 % (ACC=94 %, Kappa=0.87 %). The

  2. A deep learning method for classifying mammographic breast density categories.

    Science.gov (United States)

    Mohamed, Aly A; Berg, Wendie A; Peng, Hong; Luo, Yahong; Jankowitz, Rachel C; Wu, Shandong

    2018-01-01

    Mammographic breast density is an established risk marker for breast cancer and is visually assessed by radiologists in routine mammogram image reading, using four qualitative Breast Imaging and Reporting Data System (BI-RADS) breast density categories. It is particularly difficult for radiologists to consistently distinguish the two most common and most variably assigned BI-RADS categories, i.e., "scattered density" and "heterogeneously dense". The aim of this work was to investigate a deep learning-based breast density classifier to consistently distinguish these two categories, aiming at providing a potential computerized tool to assist radiologists in assigning a BI-RADS category in current clinical workflow. In this study, we constructed a convolutional neural network (CNN)-based model coupled with a large (i.e., 22,000 images) digital mammogram imaging dataset to evaluate the classification performance between the two aforementioned breast density categories. All images were collected from a cohort of 1,427 women who underwent standard digital mammography screening from 2005 to 2016 at our institution. The truths of the density categories were based on standard clinical assessment made by board-certified breast imaging radiologists. Effects of direct training from scratch solely using digital mammogram images and transfer learning of a pretrained model on a large nonmedical imaging dataset were evaluated for the specific task of breast density classification. In order to measure the classification performance, the CNN classifier was also tested on a refined version of the mammogram image dataset by removing some potentially inaccurately labeled images. Receiver operating characteristic (ROC) curves and the area under the curve (AUC) were used to measure the accuracy of the classifier. 
    The AUC was 0.9421 when the CNN-model was trained from scratch on our own mammogram images, and the accuracy increased gradually along with an increased size of training samples

  3. Least Square Support Vector Machine Classifier vs a Logistic Regression Classifier on the Recognition of Numeric Digits

    Directory of Open Access Journals (Sweden)

    Danilo A. López-Sarmiento

    2013-11-01

    This paper compares the performance of a multi-class least squares support vector machine (LSSVM mc) with that of a multi-class logistic regression classifier on the problem of recognizing handwritten numeric digits (0-9). The comparison used a data set of 5000 images of handwritten numeric digits (500 images for each digit from 0 to 9), each image being 20 x 20 pixels. The inputs to each system were 400-dimensional vectors corresponding to each image (no feature extraction was performed). Both classifiers used the OneVsAll strategy to enable multi-class classification and a random cross-validation function for minimizing the cost function. The comparison metrics were precision and training time under the same computational conditions. Both techniques showed a precision above 95 %, with LS-SVM slightly more accurate. In computational cost, however, we found a marked difference: LS-SVM training requires 16.42 % less time than the logistic regression model under the same low computational conditions.
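    The one-vs-all strategy mentioned in this abstract can be sketched as follows: one binary scorer is kept per class, and the predicted class is the one whose scorer gives the highest value. The linear scorers, weights and the tiny 3-class, 2-feature setup are hypothetical stand-ins (the paper works with 10 classes and 400 input dimensions, and its trained models are not reproduced here).

```python
def one_vs_all_predict(x, weights, biases):
    """Score x with one linear binary model per class; return the top class."""
    scores = [
        sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
        for w, b in zip(weights, biases)
    ]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical per-class weight vectors and biases (assumption).
weights = [(1.0, 0.0), (0.0, 1.0), (-1.0, -1.0)]
biases = [0.0, 0.0, 0.0]

pred = one_vs_all_predict((0.2, 0.9), weights, biases)
print(pred)
```

    The same wrapper works for either classifier family in the comparison, since it only needs a per-class score.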

  4. Long sequence correlation coprocessor

    Science.gov (United States)

    Gage, Douglas W.

    1994-09-01

    A long sequence correlation coprocessor (LSCC) accelerates the bitwise correlation of arbitrarily long digital sequences by calculating in parallel the correlation score for 16, for example, adjacent bit alignments between two binary sequences. The LSCC integrated circuit is incorporated into a computer system with memory storage buffers and a separate general purpose computer processor which serves as its controller. Each of the LSCC's set of sequential counters simultaneously tallies a separate correlation coefficient. During each LSCC clock cycle, computer enable logic associated with each counter compares one bit of a first sequence with one bit of a second sequence to increment the counter if the bits are the same. A shift register assures that the same bit of the first sequence is simultaneously compared to different bits of the second sequence to simultaneously calculate the correlation coefficient by the different counters to represent different alignments of the two sequences.
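    A software analogue of the counting scheme described above may help: each counter tallies, for one alignment, how many bit positions of the two sequences agree. The sketch below is sequential, whereas the coprocessor evaluates all alignments in parallel on every clock cycle; the function name and the example sequences are illustrative only.

```python
def alignment_scores(first, second, n_alignments=16):
    """Count matching bits between `first` and `second` for each of
    `n_alignments` adjacent offsets of `second` (one counter per offset)."""
    counters = [0] * n_alignments
    for i, bit in enumerate(first):
        # The same bit of `first` is compared against a different bit of
        # `second` for every counter, mirroring the shift register.
        for shift in range(n_alignments):
            j = i + shift
            if j < len(second) and bit == second[j]:
                counters[shift] += 1
    return counters


first = [1, 0, 1, 1]
second = [0, 0, 1, 0, 1, 1, 0, 1]
scores = alignment_scores(first, second, n_alignments=4)
best_shift = scores.index(max(scores))
print(scores, best_shift)
```

    Here the third counter (shift 2) reaches the full score of 4 because `first` occurs verbatim in `second` at that offset.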

  5. Roles of repetitive sequences

    Energy Technology Data Exchange (ETDEWEB)

    Bell, G.I.

    1991-12-31

    The DNA of higher eukaryotes contains many repetitive sequences. The study of repetitive sequences is important, not only because many have important biological function, but also because they provide information on genome organization, evolution and dynamics. In this paper, I will first discuss some generic effects that repetitive sequences will have upon genome dynamics and evolution. In particular, it will be shown that repetitive sequences foster recombination among, and turnover of, the elements of a genome. I will then consider some examples of repetitive sequences, notably minisatellite sequences and telomere sequences as examples of tandem repeats, without and with respectively known function, and Alu sequences as an example of interspersed repeats. Some other examples will also be considered in less detail.

  6. Anomaly Detection in Sequences

    Data.gov (United States)

    National Aeronautics and Space Administration — We present a set of novel algorithms which we call sequenceMiner, that detect and characterize anomalies in large sets of high-dimensional symbol sequences that...

  7. DNA sequencing conference, 2

    Energy Technology Data Exchange (ETDEWEB)

    Cook-Deegan, R.M. [Georgetown Univ., Kennedy Inst. of Ethics, Washington, DC (United States); Venter, J.C. [National Inst. of Neurological Disorders and Strokes, Bethesda, MD (United States); Gilbert, W. [Harvard Univ., Cambridge, MA (United States); Mulligan, J. [Stanford Univ., CA (United States); Mansfield, B.K. [Oak Ridge National Lab., TN (United States)

    1991-06-19

    This conference focused on DNA sequencing, genetic linkage mapping, physical mapping, informatics and bioethics. Several were used to study this sequencing and mapping. This article also discusses computer hardware and software aiding in the mapping of genes.

  8. sequenceMiner algorithm

    Data.gov (United States)

    National Aeronautics and Space Administration — Detecting and describing anomalies in large repositories of discrete symbol sequences. sequenceMiner has been open-sourced! Download the file below to try it out....

  9. KinMutRF: a random forest classifier of sequence variants in the human protein kinase superfamily

    DEFF Research Database (Denmark)

    Pons, Tirso; Vazquez, Miguel; Matey-Hernandez, María Luisa

    2016-01-01

    annotations from UniProt, Phospho.ELM and FireDB. KinMutRF identifies disease-associated variants satisfactorily (Acc: 0.88, Prec:0.82, Rec:0.75, F-score:0.78, MCC:0.68) when trained and cross-validated with the 3689 human kinase variants from UniProt that have been annotated as neutral or pathogenic. All...

  10. Higher School Marketing Strategy Formation: Classifying the Factors

    Directory of Open Access Journals (Sweden)

    N. K. Shemetova

    2012-01-01

    The paper deals with the main trends of higher school management strategy formation. The author specifies the educational changes in the modern information society determining the strategy options. For each professional training level the author denotes the set of strategic factors affecting the educational service consumers and, therefore, the effectiveness of the higher school marketing. The given factors are classified from the standpoints of the providers and consumers of educational services (enrollees, students, graduates and postgraduates). The research methods include statistical analysis and general methods of scientific analysis, synthesis, induction, deduction, comparison, and classification. The author is convinced that university management should develop the necessary prerequisites for raising the graduates’ competitiveness in the labor market, and stimulate the active marketing policies of the relevant subdivisions and departments. In the author’s opinion, the above classification of marketing strategy factors can be used as a system of values for educational service providers.

  11. An automated approach to the design of decision tree classifiers

    Science.gov (United States)

    Argentiero, P.; Chin, R.; Beaudet, P.

    1982-01-01

    An automated technique is presented for designing effective decision tree classifiers predicated only on a priori class statistics. The procedure relies on linear feature extractions and Bayes table look-up decision rules. Associated error matrices are computed and utilized to provide an optimal design of the decision tree at each so-called 'node'. A by-product of this procedure is a simple algorithm for computing the global probability of correct classification assuming the statistical independence of the decision rules. Attention is given to a more precise definition of decision tree classification, the mathematical details on the technique for automated decision tree design, and an example of a simple application of the procedure using class statistics acquired from an actual Landsat scene.
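
    The by-product mentioned in this record, a global probability of correct classification under statistically independent decision rules, can be illustrated with a short sketch. This is not the authors' algorithm: the function name, the class names, the priors, and the per-node probabilities below are all hypothetical. The only idea taken from the abstract is that independence lets the per-node probabilities along a path be multiplied.

```python
from math import prod


def global_correct_probability(priors, path_probs):
    """Prior-weighted probability of correct classification for a decision tree.

    priors:     {class_name: prior probability of the class}
    path_probs: {class_name: [P(correct decision) at each node on the path
                              from the root to that class's leaf]}
    Independence of the decision rules is assumed, so the per-node
    probabilities along a path are simply multiplied.
    """
    return sum(priors[c] * prod(path_probs[c]) for c in priors)


# Hypothetical three-class tree: "water" is separated at the root,
# "forest" and "urban" need a second decision.
priors = {"water": 0.2, "forest": 0.5, "urban": 0.3}
path_probs = {"water": [0.98], "forest": [0.95, 0.90], "urban": [0.95, 0.85]}
print(round(global_correct_probability(priors, path_probs), 5))  # 0.86575
```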

  12. A robust dataset-agnostic heart disease classifier from Phonocardiogram.

    Science.gov (United States)

    Banerjee, Rohan; Dutta Choudhury, Anirban; Deshpande, Parijat; Bhattacharya, Sakyajit; Pal, Arpan; Mandana, K M

    2017-07-01

    Automatic classification of normal and abnormal heart sounds is a popular area of research. However, building a robust algorithm unaffected by signal quality and patient demography is a challenge. In this paper we have analysed a wide list of Phonocardiogram (PCG) features in time and frequency domain along with morphological and statistical features to construct a robust and discriminative feature set for dataset-agnostic classification of normal and cardiac patients. The large and open access database, made available in Physionet 2016 challenge was used for feature selection, internal validation and creation of training models. A second dataset of 41 PCG segments, collected using our in-house smart phone based digital stethoscope from an Indian hospital was used for performance evaluation. Our proposed methodology yielded sensitivity and specificity scores of 0.76 and 0.75 respectively on the test dataset in classifying cardiovascular diseases. The methodology also outperformed three popular prior art approaches, when applied on the same dataset.

  13. Business process modeling for processing classified documents using RFID technology

    Directory of Open Access Journals (Sweden)

    Koszela Jarosław

    2016-01-01

    The article outlines the application of the process approach to the functional description of a designed IT system supporting the operations of a secret office that processes classified documents. The article describes the application of incremental modeling of business processes according to the BPMN model to describe the processes currently carried out manually (“as is”) and the target processes (“to be”), which use RFID technology for the purpose of their automation. Additionally, examples of applying structural and dynamic analysis of the processes (process simulation) to verify their correctness and efficiency are presented. The process analysis method can be extended by applying a warehouse of processes and process mining methods.

  14. The Motivation of Betrayal by Leaking of Classified Information

    Directory of Open Access Journals (Sweden)

    Lăzăroiu Laurențiu-Leonard

    2017-03-01

    Forecasting human behavior requires knowledge of motivational theories, applied to the profile of each organization and, in particular, to each individual’s style. Anticipating personal attitudes is aimed not only at passive monitoring of professional activity, but also at improving risk avoidance, in accordance with a specific organizational environment. The emergence and development of motivational forms and values whose projections determine social crimes are risk factors that affect the professional activity of the person as well as the performance and stability of the institution. Moreover, if the motivation determines attitudes aimed at compromising classified information, the resulting actions may be considered threats to national security. Such threats can be prevented only by understanding the motivational mechanisms and external conditions that make it possible for personnel to transform intentions into real actions.

  15. Using point-set compression to classify folk songs

    DEFF Research Database (Denmark)

    Meredith, David

    2014-01-01

    -neighbour algorithm and leave-one-out cross-validation to classify the 360 melodies into tune families. The classifications produced by the algorithms were compared with a ground-truth classification prepared by expert musicologists. Twelve of the thirteen compressors used in the experiment were based...... compared. The highest classification success rate of 77–84% was achieved by COSIATEC, followed by 60–64% for Forth’s algorithm and then 52–58% for SIATECCompress. When the NCDs were calculated using bzip2, the success rate was only 12.5%. The results demonstrate that the effectiveness of NCD for measuring...... similarity between folk-songs for classification purposes is highly dependent upon the actual compressor chosen. Furthermore, it seems that compressors based on finding maximal repeated patterns in point-set representations of music show more promise for NCD-based music classification than general...
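
    For readers unfamiliar with the normalized compression distance (NCD) mentioned in this record, the sketch below computes it with Python's standard `bz2` module and uses it in a 1-nearest-neighbour, leave-one-out evaluation. It only illustrates the general technique with the general-purpose compressor that the record reports as performing worst (12.5%). It is not COSIATEC or any of the point-set compressors, and the byte-string encoding of melodies, the function names, and the toy data are assumptions.

```python
import bz2


def compressed_size(data: bytes) -> int:
    """Length in bytes of the bzip2-compressed input."""
    return len(bz2.compress(data))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    cx, cy, cxy = compressed_size(x), compressed_size(y), compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def leave_one_out_accuracy(items):
    """items: list of (data, label). Each item is labelled by its nearest other item."""
    correct = 0
    for i, (x, label) in enumerate(items):
        others = [j for j in range(len(items)) if j != i]
        nearest = min(others, key=lambda j: ncd(x, items[j][0]))
        correct += items[nearest][1] == label
    return correct / len(items)


# Toy "melodies" encoded as byte strings (hypothetical data, not the 360-melody corpus).
items = [
    (b"CDEFGABC" * 40, "family-1"),
    (b"CDEFGABD" * 40, "family-1"),
    (b"GGAGCBGG" * 40, "family-2"),
    (b"GGAGCBGA" * 40, "family-2"),
]
print(leave_one_out_accuracy(items))
```

    Swapping `compressed_size` for a different compressor changes the distances, which is the dependence on the compressor that the record highlights.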

  16. Sex Bias in Classifying Borderline and Narcissistic Personality Disorder.

    Science.gov (United States)

    Braamhorst, Wouter; Lobbestael, Jill; Emons, Wilco H M; Arntz, Arnoud; Witteman, Cilia L M; Bekker, Marrie H J

    2015-10-01

    This study investigated sex bias in the classification of borderline and narcissistic personality disorders. A sample of psychologists in training for a post-master degree (N = 180) read brief case histories (male or female version) and made DSM classification. To differentiate sex bias due to sex stereotyping or to base rate variation, we used different case histories, respectively: (1) non-ambiguous case histories with enough criteria of either borderline or narcissistic personality disorder to meet the threshold for classification, and (2) an ambiguous case with subthreshold features of both borderline and narcissistic personality disorder. Results showed significant differences due to sex of the patient in the ambiguous condition. Thus, when the diagnosis is not straightforward, as in the case of mixed subthreshold features, sex bias is present and is influenced by base-rate variation. These findings emphasize the need for caution in classifying personality disorders, especially borderline or narcissistic traits.

  17. Fisher information metrics for binary classifier evaluation and training

    CERN Multimedia

    CERN. Geneva

    2018-01-01

    Different evaluation metrics for binary classifiers are appropriate to different scientific domains and even to different problems within the same domain. This presentation focuses on the optimisation of event selection to minimise statistical errors in HEP parameter estimation, a problem that is best analysed in terms of the maximisation of Fisher information about the measured parameters. After describing a general formalism to derive evaluation metrics based on Fisher information, three more specific metrics are introduced for the measurements of signal cross sections in counting experiments (FIP1) or distribution fits (FIP2) and for the measurements of other parameters from distribution fits (FIP3). The FIP2 metric is particularly interesting because it can be derived from any ROC curve, provided that prevalence is also known. In addition to its relation to measurement errors when used as an evaluation criterion (which makes it more interesting than the ROC AUC), a further advantage of the FIP2 metric is ...

  18. Multivariate analysis of quantitative traits can effectively classify rapeseed germplasm

    Directory of Open Access Journals (Sweden)

    Jankulovska Mirjana

    2014-01-01

    This study presents the use of different multivariate approaches to classify rapeseed genotypes based on quantitative traits. Tree regression analysis, principal component analysis (PCA) and two-way cluster analysis were applied in order to describe and understand the extent of genetic variability in spring rapeseed genotype-by-trait data. The traits that most strongly influenced seed and oil yield in rapeseed were successfully identified by the tree regression analysis. The principal predictor for both response variables was the number of pods per plant (NP). NP and 1000 seed weight could help in the selection of high yielding genotypes. High values for both traits and oil content could lead to high oil yielding genotypes. These traits may serve as indirect selection criteria and can lead to improvement of seed and oil yield in rapeseed. Quantitative traits that explained most of the variability in the studied germplasm were classified using principal component analysis. In this data set, five PCs were identified, of which the first three explained 63% of the total variance. This helped in choosing the variables on which the clustering of genotypes could be based. The two-way cluster analysis simultaneously clustered genotypes and quantitative traits. The final number of clusters was determined using a bootstrapping technique. This approach provided a clear overview of the variability of the analyzed genotypes. Genotypes with similar performance for the traits included in this study can be easily detected on the heatmap. Genotypes grouped in clusters 1 and 8 had high values for seed and oil yield and a relatively short vegetative growth period, while those in cluster 9 combined moderate to low values for vegetative growth duration with moderate to high seed and oil yield. These genotypes should be further exploited and implemented in the rapeseed breeding program. The combined application of these multivariate methods

  19. Classifying and Visualising Roman Pottery using Computer-scanned Typologies

    Directory of Open Access Journals (Sweden)

    Jacqueline Christmas

    2018-05-01

    For many archaeological assemblages and type-series, accurate drawings of standardised pottery vessels have been recorded in consistent styles. This provides the opportunity to extract individual pot drawings and derive from them data that can be used for analysis and visualisation. Starting from PDF scans of the original pages of pot drawings, we have automated much of the process of locating each individual pot drawing, defining its boundaries, and extracting and orientating it. From these processed images, basic features such as width and height, the volume of the interior, the edges, and the shape of the cross-section outline are extracted and are then used to construct more complex features such as a measure of a pot's 'circularity'. Capturing these traits opens up new possibilities for (a) classifying vessel form in a way that is sensitive to the physical characteristics of pots relative to other vessels in an assemblage, and (b) visualising the results of quantifying assemblages using standard typologies. A frequently encountered problem when trying to compare pottery from different archaeological sites is that the pottery is classified into forms and labels using different standards. With a set of data from early Roman urban centres and related sites that has been labelled both with forms (e.g. 'platter' and 'bowl') and shape identifiers (based on the Camulodunum type-series), we use the extracted features from images to look both at how the pottery forms cluster for a given set of features, and at how the features may be used to compare finds from different sites.

  20. Deep Learning to Classify Radiology Free-Text Reports.

    Science.gov (United States)

    Chen, Matthew C; Ball, Robyn L; Yang, Lingyao; Moradzadeh, Nathaniel; Chapman, Brian E; Larson, David B; Langlotz, Curtis P; Amrhein, Timothy J; Lungren, Matthew P

    2018-03-01

    Purpose To evaluate the performance of a deep learning convolutional neural network (CNN) model compared with a traditional natural language processing (NLP) model in extracting pulmonary embolism (PE) findings from thoracic computed tomography (CT) reports from two institutions. Materials and Methods Contrast material-enhanced CT examinations of the chest performed between January 1, 1998, and January 1, 2016, were selected. Annotations by two human radiologists were made for three categories: the presence, chronicity, and location of PE. Classification of performance of a CNN model with an unsupervised learning algorithm for obtaining vector representations of words was compared with the open-source application PeFinder. Sensitivity, specificity, accuracy, and F1 scores for both the CNN model and PeFinder in the internal and external validation sets were determined. Results The CNN model demonstrated an accuracy of 99% and an area under the curve value of 0.97. For internal validation report data, the CNN model had a statistically significant larger F1 score (0.938) than did PeFinder (0.867) when classifying findings as either PE positive or PE negative, but no significant difference in sensitivity, specificity, or accuracy was found. For external validation report data, no statistical difference between the performance of the CNN model and PeFinder was found. Conclusion A deep learning CNN model can classify radiology free-text reports with accuracy equivalent to or beyond that of an existing traditional NLP model. © RSNA, 2017 Online supplemental material is available for this article.
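
    The four measures named in this record (sensitivity, specificity, accuracy, and F1 score) all follow from the counts of a binary confusion matrix. The sketch below shows the standard definitions. The function name and the example counts are hypothetical and are not taken from the study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts.

    tp/fn: positive cases classified correctly/incorrectly;
    tn/fp: negative cases classified correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)                  # recall on the positive class
    specificity = tn / (tn + fp)                  # recall on the negative class
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts for 200 reports (55 positive, 145 negative).
for name, value in binary_metrics(tp=45, fp=5, tn=140, fn=10).items():
    print(f"{name}: {value:.3f}")
```

    Because F1 ignores true negatives, two models can differ in F1 while their accuracy and specificity remain close, which is one way a result like the one reported for internal validation can arise.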

  1. Immunohistochemical analysis of breast tissue microarray images using contextual classifiers

    Directory of Open Access Journals (Sweden)

    Stephen J McKenna

    2013-01-01

    Background: Tissue microarrays (TMAs) are an important tool in translational research for examining multiple cancers for molecular and protein markers. Automatic immunohistochemical (IHC) scoring of breast TMA images remains a challenging problem. Methods: A two-stage approach that involves localization of regions of invasive and in-situ carcinoma followed by ordinal IHC scoring of nuclei in these regions is proposed. The localization stage classifies locations on a grid as tumor or non-tumor based on local image features. These classifications are then refined using an auto-context algorithm called spin-context. Spin-context uses a series of classifiers to integrate image feature information with spatial context information in the form of estimated class probabilities. This is achieved in a rotationally-invariant manner. The second stage estimates ordinal IHC scores in terms of the strength of staining and the proportion of nuclei stained. These estimates take the form of posterior probabilities, enabling images with uncertain scores to be referred for pathologist review. Results: The method was validated against manual pathologist scoring on two nuclear markers, progesterone receptor (PR) and estrogen receptor (ER). Errors for PR data were consistently lower than those achieved with ER data. Scoring was in terms of estimated proportion of cells that were positively stained (scored on an ordinal scale of 0-6) and perceived strength of staining (scored on an ordinal scale of 0-3). Average absolute differences between predicted scores and pathologist-assigned scores were 0.74 for proportion of cells and 0.35 for strength of staining (PR). Conclusions: The use of context information via spin-context improved the precision and recall of tumor localization. The combination of the spin-context localization method with the automated scoring method resulted in reduced IHC scoring errors.

  2. Novel overlapping coding sequences in Chlamydia trachomatis

    DEFF Research Database (Denmark)

    Jensen, Klaus Thorleif; Petersen, Lise; Falk, Søren

    2006-01-01

    that are in agreement with the primary annotation. Forty two genes from the primary annotation are not predicted by EasyGene. The majority of these genes are listed as hypothetical in the primary annotation. The 15 novel predicted genes all overlap with genes on the complementary strand. We find homologues of several...... of the novel genes in C. trachomatis Serovar A and Chlamydia muridarum. Several of the genes have typical gene-like and protein-like features. Furthermore, we confirm transcriptional activity from 10 of the putative genes. The combined evidence suggests that at least seven of the 15 are protein coding genes...

  3. Genome Sequence of Novel Human Parechovirus Type 17

    OpenAIRE

    Böttcher, Sindy; Obermeier, Patrick E.; Diedrich, Sabine; Kaboré, Yolande; D'Alfonso, Rossella; Pfister, Herbert; Kaiser, Rolf; Di Cristanziano, Veronica

    2017-01-01

    ABSTRACT Human parechoviruses (HPeV) circulate worldwide, causing a broad variety of symptoms, preferentially in early childhood. We report here the nearly complete genome sequence of a novel HPeV type, consisting of 7,062 nucleotides and encoding 2,179 amino acids. M36/CI/2014 was taxonomically classified as HPeV-17 by the picornavirus study group.

  4. MicroRNA classifier and nomogram for metastasis prediction in colon cancer.

    Science.gov (United States)

    Goossens-Beumer, Inès J; Derr, Remco S; Buermans, Henk P J; Goeman, Jelle J; Böhringer, Stefan; Morreau, Hans; Nitsche, Ulrich; Janssen, Klaus-Peter; van de Velde, Cornelis J H; Kuppen, Peter J K

    2015-01-01

    Colon cancer prognosis and treatment are currently based on a classification system still showing large heterogeneity in clinical outcome, especially in TNM stages II and III. Prognostic biomarkers for metastasis risk are warranted as development of distant recurrent disease mainly accounts for the high lethality rates of colon cancer. miRNAs have been proposed as potential biomarkers for cancer. Furthermore, a verified standard for normalization of the amount of input material in PCR-based relative quantification of miRNA expression is lacking. A selection of frozen tumor specimens from two independent patient cohorts with TNM stage II-III microsatellite stable primary adenocarcinomas was used for laser capture microdissection. Next-generation sequencing was performed on small RNAs isolated from colorectal tumors from the Dutch cohort (N = 50). Differential expression analysis comparing metastasized and nonmetastasized tumors identified prognostic miRNAs. Validation was performed on colon tumors from the German cohort (N = 43) using quantitative PCR (qPCR). miR25-3p and miR339-5p were identified and validated as independent prognostic markers and used to construct a multivariate nomogram for metastasis risk prediction. The nomogram showed good probability prediction in validation. In addition, we recommend combination of miR16-5p and miR26a-5p as standard for normalization in qPCR of colon cancer tissue-derived miRNA expression. In this international study, we identified and validated a miRNA classifier in primary cancers, and propose a nomogram capable of predicting metastasis risk in microsatellite stable TNM stage II-III colon cancer. In conjunction with TNM staging, by means of a nomogram, this miRNA classifier may allow for personalized treatment decisions based on individual tumor characteristics. ©2014 American Association for Cancer Research.

  5. Classified model and characteristics of strategies at tourist companies

    Directory of Open Access Journals (Sweden)

    I.V. Saukh

    2017-12-01

    The research assesses the scientific approaches to identifying the classification features of strategy and the strategy types distinguished according to those features. The research object is the activity of tourist companies, which determines the choice of strategies typical of the tourism field. It is substantiated that scientific approaches to the classification of strategies vary in the specialized literature because of ambiguity in the definition of strategy and the vagueness and plurality of its classification features. On this basis the authors have improved the classification model of strategies for tourist companies, which should support effective management decisions directed at developing enterprise potential in an unstable and unpredictable external environment. The paper singles out the following peculiarities of the functioning of the tourism sector: high sensitivity to changes in the external environment; a high level of competition in the field; dynamism and the lack of need for «far-seeing» strategies; insufficient information for applying traditional Western models and matrix methods of strategy development; a time gap between obtaining the service and its consumption; a great number of intermediaries; seasonal swings in demand; and sudden shifts in the external environment caused by cyclicity, globalization, the political decisions of individual countries, etc. The article shows essential differences in the development of financial strategies between small-scale enterprises and stock companies in the tourist business. It is substantiated that small-scale enterprises develop strategies directed at a higher level of personal service, occupational competence, ability and experience in designing, better knowledge of regional conditions, and flexible decisions driven by the peculiarities of the orders received. Taking into

  6. A lung cancer risk classifier comprising genome maintenance genes measured in normal bronchial epithelial cells.

    Science.gov (United States)

    Yeo, Jiyoun; Crawford, Erin L; Zhang, Xiaolu; Khuder, Sadik; Chen, Tian; Levin, Albert; Blomquist, Thomas M; Willey, James C

    2017-05-02

    Annual low dose CT (LDCT) screening of individuals at high demographic risk reduces lung cancer mortality by more than 20%. However, subjects selected for screening based on demographic criteria typically have less than a 10% lifetime risk for lung cancer. Thus, there is need for a biomarker that better stratifies subjects for LDCT screening. Toward this goal, we previously reported a lung cancer risk test (LCRT) biomarker comprising 14 genome-maintenance (GM) pathway genes measured in normal bronchial epithelial cells (NBEC) that accurately classified cancer (CA) from non-cancer (NC) subjects. The primary goal of the studies reported here was to optimize the LCRT biomarker for high specificity and ease of clinical implementation. Targeted competitive multiplex PCR amplicon libraries were prepared for next generation sequencing (NGS) analysis of transcript abundance at 68 sites among 33 GM target genes in NBEC specimens collected from a retrospective cohort of 120 subjects, including 61 CA cases and 59 NC controls. Genes were selected for analysis based on contribution to the previously reported LCRT biomarker and/or prior evidence for association with lung cancer risk. Linear discriminant analysis was used to identify the most accurate classifier suitable to stratify subjects for screening. After cross-validation, a model comprising expression values from 12 genes (CDKN1A, E2F1, ERCC1, ERCC4, ERCC5, GPX1, GSTP1, KEAP1, RB1, TP53, TP63, and XRCC1) and demographic factors age, gender, and pack-years smoking, had Receiver Operator Characteristic area under the curve (ROC AUC) of 0.975 (95% CI: 0.96-0.99). The overall classification accuracy was 93% (95% CI 88%-98%) with sensitivity 93.1%, specificity 92.9%, positive predictive value 93.1% and negative predictive value 93%. The ROC AUC for this classifier was significantly better (p < 0.0001) than the best model comprising demographic features alone. The LCRT biomarker reported here displayed high accuracy and ease

  7. The decision tree classifier - Design and potential. [for Landsat-1 data

    Science.gov (United States)

    Hauska, H.; Swain, P. H.

    1975-01-01

    A new classifier has been developed for the computerized analysis of remote sensor data. The decision tree classifier is essentially a maximum likelihood classifier using multistage decision logic. It is characterized by the fact that an unknown sample can be classified into a class using one or several decision functions in a successive manner. The classifier is applied to the analysis of data sensed by Landsat-1 over Kenosha Pass, Colorado. The classifier is illustrated by a tree diagram which for processing purposes is encoded as a string of symbols such that there is a unique one-to-one relationship between string and decision tree.
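
    The record notes that the tree diagram is encoded as a string of symbols with a one-to-one relationship between string and tree. The abstract does not give the actual encoding, so the sketch below uses a hypothetical parenthesized pre-order notation purely to illustrate what such a reversible encoding can look like. The node names and the tuple representation are assumptions.

```python
def encode(tree):
    """Serialize a tree to a string in pre-order.

    A leaf is a class name (str); an internal node is a pair
    (decision_name, [children]). The notation is hypothetical.
    """
    if isinstance(tree, str):
        return tree
    name, children = tree
    return name + "(" + ",".join(encode(child) for child in children) + ")"


def decode(text):
    """Rebuild the tree from the string produced by encode()."""
    def parse(pos):
        start = pos
        while pos < len(text) and text[pos] not in "(),":
            pos += 1
        name = text[start:pos]
        if pos < len(text) and text[pos] == "(":
            children = []
            pos += 1                      # skip "("
            while True:
                child, pos = parse(pos)
                children.append(child)
                if text[pos] == ",":
                    pos += 1              # another child follows
                else:
                    pos += 1              # skip ")"
                    break
            return (name, children), pos
        return name, pos                  # a leaf: just the class name

    tree, _ = parse(0)
    return tree


# Hypothetical two-stage tree: decision d1 splits off "water",
# decision d2 separates "forest" from "urban".
tree = ("d1", ["water", ("d2", ["forest", "urban"])])
text = encode(tree)
print(text)                  # d1(water,d2(forest,urban))
print(decode(text) == tree)  # True
```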

  8. Classifying images using restricted Boltzmann machines and convolutional neural networks

    Science.gov (United States)

    Zhao, Zhijun; Xu, Tongde; Dai, Chenyu

    2017-07-01

    To improve the feature recognition ability of deep model transfer learning, we propose a hybrid deep transfer learning method for image classification based on restricted Boltzmann machines (RBMs) and convolutional neural networks (CNNs). It integrates the learning abilities of the two models and performs subject classification by extracting structural higher-order statistical features of images. When the method transfers the trained convolutional neural networks to the target datasets, the fully-connected layers can be replaced by restricted Boltzmann machine layers; the restricted Boltzmann machine layers and the Softmax classifier are then retrained, and a BP neural network can be used to fine-tune the hybrid model. The restricted Boltzmann machine layers not only fully integrate the whole feature maps, but also learn the statistical features of the target datasets in the sense of maximum log-likelihood, thus removing the effects caused by content differences between datasets. The experimental results show that the proposed method improves the accuracy of image classification, outperforming other methods on the Pascal VOC2007 and Caltech101 datasets.

  9. Classifying and explaining democracy in the Muslim world

    Directory of Open Access Journals (Sweden)

    Rohaizan Baharuddin

    2012-12-01

    The purpose of this study is to classify and explain democracies in the 47 Muslim countries between the years 1998 and 2008 by using liberties and elections as independent variables. Specifically focusing on the context of the Muslim world, this study examines the performance of civil liberties and elections, the variation of democracy practised the most, and the transitions in elections, civil liberties and democracy together with the patterns that followed. Based on the quantitative data primarily collected from Freedom House, this study demonstrates the following aggregate findings: first, the “not free not fair” elections, the “limited” civil liberties and the “Illiberal Partial Democracy” were the dominant features of elections, civil liberties and democracy practised in the Muslim world; second, a total of 413 Muslim regimes out of 470 (47 regimes x 10 years) remained the same as their democratic origin points, without any transitions to a better or worse level of democracy, throughout these 10 years; and third, a slow, yet steady positive transition of both elections and civil liberties occurred in the Muslim world with changes in the nature of elections becoming much more progressive compared to the civil liberties’ transitions.

  10. An automatic classifier of emotions built from entropy of noise.

    Science.gov (United States)

    Ferreira, Jacqueline; Brás, Susana; Silva, Carlos F; Soares, Sandra C

    2017-04-01

    The electrocardiogram (ECG) signal has been widely used to study the physiological substrates of emotion. However, searching for better filtering techniques in order to obtain a signal with better quality and with the maximum relevant information remains an important issue for researchers in this field. Signal processing is largely performed for ECG analysis and interpretation, but this process can be susceptible to error in the delineation phase. In addition, it can lead to the loss of important information that is usually considered as noise and, consequently, discarded from the analysis. The goal of this study was to evaluate if the ECG noise allows for the classification of emotions, while using its entropy as an input in a decision tree classifier. We collected the ECG signal from 25 healthy participants while they were presented with videos eliciting negative (fear and disgust) and neutral emotions. The results indicated that the neutral condition showed a perfect identification (100%), whereas the classification of negative emotions indicated good identification performances (60% of sensitivity and 80% of specificity). These results suggest that the entropy of noise contains relevant information that can be useful to improve the analysis of the physiological correlates of emotion. © 2016 Society for Psychophysiological Research.
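
    The record describes using the entropy of the ECG noise as the input feature for a decision tree, without stating which entropy estimator was used. As an illustration only, the sketch below computes a histogram-based Shannon entropy of a list of noise samples. The choice of Shannon entropy, the fixed bin width, and the function name are assumptions rather than the authors' method.

```python
from collections import Counter
from math import log2


def shannon_entropy(samples, bin_width=0.1):
    """Histogram-based Shannon entropy (in bits) of a sequence of samples.

    Samples are grouped into bins of width `bin_width`; the entropy of the
    resulting empirical distribution is returned. Both the estimator and the
    bin width are illustrative assumptions.
    """
    bins = Counter(int(s // bin_width) for s in samples)
    n = len(samples)
    return -sum((count / n) * log2(count / n) for count in bins.values())


flat = [0.0] * 8                   # a constant "noise" signal: no uncertainty
spread = [0.05, 0.15, 0.25, 0.35]  # four samples falling in four different bins
print(shannon_entropy(flat), shannon_entropy(spread))
```

    A single value of this kind per recording could then serve as the feature passed to a classifier such as a decision tree.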

  11. Classifying the Optical Morphology of Shocked POststarburst Galaxies

    Science.gov (United States)

    Stewart, Tess; SPOGs Team

    2018-01-01

    The Shocked POststarburst Galaxy Survey (SPOGS) is a sample of galaxies in transition from blue, star forming spirals to red, inactive ellipticals. These galaxies are earlier in the transition than classical poststarburst samples. We have classified the physical characteristics of the full sample of 1067 SPOGs in 7 categories, covering (1) their shape; (2) the relative prominence of their nuclei; (3) the uniformity of their optical color; (4) whether the outskirts of the galaxy were indicative of on-going star formation; (5) whether they are engaged in interactions with other galaxies, and if so, (6) the kinds of galaxies with which they are interacting; and (7) the presence of asymmetrical features, possibly indicative of recent interactions. We determined that a plurality of SPOGs are in elliptical galaxies, indicating morphological transformations may tend to conclude before other indicators of transitions have faded. Further, early-type SPOGs also tend to have the brightest optical nuclei. Most galaxies do not show signs of current or recent interactions. We used these classifications to search for correlations between qualitative and quantitative characteristics of SPOGs using Sloan Digital Sky Survey and Wide-field Infrared Survey Explorer magnitudes. We find that relative optical nuclear brightness is not a good indicator of the presence of an active galactic nucleus and that galaxies with visible indications of active star formation also cluster in optical color and diagnostic line ratios.

  12. Asymptotic performance of regularized quadratic discriminant analysis based classifiers

    KAUST Repository

    Elkhalil, Khalil

    2017-12-13

    This paper carries out a large dimensional analysis of the standard regularized quadratic discriminant analysis (QDA) classifier designed on the assumption that data arise from a Gaussian mixture model. The analysis relies on fundamental results from random matrix theory (RMT) when both the number of features and the cardinality of the training data within each class grow large at the same pace. Under some mild assumptions, we show that the asymptotic classification error converges to a deterministic quantity that depends only on the covariances and means associated with each class as well as the problem dimensions. Such a result permits a better understanding of the performance of regularized QDA and can be used to determine the optimal regularization parameter that minimizes the misclassification error probability. Despite being valid only for Gaussian data, our theoretical findings are shown to yield a high accuracy in predicting the performances achieved with real data sets drawn from popular real data bases, thereby making an interesting connection between theory and practice.

  13. Molecular Characteristics in MRI-classified Group 1 Glioblastoma Multiforme

    Directory of Open Access Journals (Sweden)

    William E Haskins

    2013-07-01

    Glioblastoma multiforme (GBM) is a clinically and pathologically heterogeneous brain tumor. A previous study of MRI-classified GBM revealed a spatial relationship between Group 1 GBM (GBM1) and the subventricular zone (SVZ). The SVZ is an adult neural stem cell niche and is also suspected to be the origin of a subtype of brain tumor. The intimate contact between GBM1 and the SVZ raises the possibility that tumor cells in GBM1 may be most related to SVZ cells. In support of this notion, we found that neural stem cell and neuroblast markers are highly expressed in GBM1. Additionally, we identified molecular characteristics in this type of GBM that include up-regulation of metabolic enzymes, ribosomal proteins, heat shock proteins, and c-Myc oncoprotein. As GBM1 often recurs at great distances from the initial lesion, the rewiring of metabolism and ribosomal biogenesis may facilitate cancer cells’ growth and survival during tumor migration. Taken together, combining our findings with the MRI-based classification of GBM1 would offer better prediction and treatment for this multifocal GBM.

  14. The Complete Gabor-Fisher Classifier for Robust Face Recognition

    Directory of Open Access Journals (Sweden)

    Štruc Vitomir

    2010-01-01

    This paper develops a novel face recognition technique called the Complete Gabor Fisher Classifier (CGFC). Unlike existing techniques that use Gabor filters to derive the Gabor face representation, the proposed approach does not rely solely on Gabor magnitude information but also makes effective use of features computed from Gabor phase information. It represents one of the few successful attempts found in the literature at combining Gabor magnitude and phase information for robust face recognition. The novelty of the proposed CGFC technique comes from (1) the introduction of a Gabor phase-based face representation and (2) the combination of the recognition technique using the proposed representation with classical Gabor magnitude-based methods into a unified framework. The proposed face recognition framework is assessed in a series of face verification and identification experiments performed on the XM2VTS, Extended YaleB, FERET, and AR databases. The results of the assessment suggest that the proposed technique clearly outperforms state-of-the-art face recognition techniques from the literature and that its performance is almost unaffected by the presence of partial occlusions of the facial area, changes in facial expression, or severe illumination changes.

  15. A Large Dimensional Analysis of Regularized Discriminant Analysis Classifiers

    KAUST Repository

    Elkhalil, Khalil

    2017-11-01

    This article carries out a large dimensional analysis of standard regularized discriminant analysis classifiers designed on the assumption that data arise from a Gaussian mixture model with different means and covariances. The analysis relies on fundamental results from random matrix theory (RMT) when both the number of features and the cardinality of the training data within each class grow large at the same pace. Under mild assumptions, we show that the asymptotic classification error approaches a deterministic quantity that depends only on the means and covariances associated with each class as well as the problem dimensions. Such a result permits a better understanding of the performance of regularized discriminant analysis, in practical large but finite dimensions, and can be used to determine and pre-estimate the optimal regularization parameter that minimizes the misclassification error probability. Despite being theoretically valid only for Gaussian data, our findings are shown to yield a high accuracy in predicting the performances achieved with real data sets drawn from the popular USPS data base, thereby making an interesting connection between theory and practice.

  16. Classifying Taiwan Lianas with Radiating Plates of Xylem

    Directory of Open Access Journals (Sweden)

    Sheng-Zehn Yang

    2015-12-01

    Radiating plates of xylem are a cambial variant of lianas that occurs in 22 families. This study investigates 15 liana species representing nine families with radiating plates of xylem structures. The features of the transverse section and epidermis in fresh liana samples are documented, including shapes and colors of xylem and phloem, ray width and numbers, and skin morphology. Experimental results indicated that the shape of phloem fibers in Ampelopsis brevipedunculata var. hancei is gradually tapered and flame-like, which is in contrast with the other characteristics of this type, including those classified as rays. Both inner and outer cylinders of vascular bundles are found in Piper kwashoense, and the irregular inner cylinder persists yet gradually diminishes. Red crystals are numerous in the cortex of Celastrus kusanoi. Aristolochia shimadai and A. zollingeriana develop a combination of two cambium variants, radiating plates of xylem and a lobed xylem. The shape of phloem in Stauntonia obovatifoliola is square or truncate, and its rays are numerous. Meanwhile, that of Neoalsomitra integrifolia is blunt and its rays are fewer. As for the features of the stem surface within the same family, Cyclea ochiaiana is brownish in color and has a deep vertical depression with lenticels, whereas Pericampylus glaucus is greenish in color with a shallow vertical depression. Within the same genus, Aristolochia shimadai develops lenticels, which are absent in A. zollingeriana; the periderm developed in Clematis grata is a ring bark that tears easily, whereas that of Clematis tamura is thick and soft.

  17. Using biological indices to classify schizophrenia and other psychotic patients.

    Science.gov (United States)

    Sponheim, S R; Iacono, W G; Thuras, P D; Beiser, M

    2001-07-01

    Although classification of mental disorders using more than clinical description would be desirable, there is scant evidence that available laboratory tests (i.e. biological indices) would provide more valid classifications than current diagnostic systems (e.g. DSM-IV). We used cluster analysis of four biological variables to classify 163 psychotic patients and 83 nonpsychiatric comparison subjects. Analyses revealed a three-cluster solution with the first cluster reflecting electrodermal deviance, the second cluster representing nondeviant biological function, and the third cluster reflecting increased nailfold plexus visibility and ocular motor dysfunction. To assess the construct validity of proband clusters we examined ocular motor performance in 156 first-degree relatives as a function of proband cluster membership. First-degree relatives of third cluster probands exhibited worse ocular motor performance than relatives of other cluster probands. Additionally, better classification sensitivity and specificity were obtained for the relatives when they were grouped by proband cluster than by proband DSM-IV diagnosis. When a single proband characteristic (i.e. eyetracking performance) was used to group relatives, classification sensitivity and specificity failed to significantly increase over grouping by proband DSM-IV diagnosis. Multivariate biologically defined clusters may offer an advantage over DSM-IV classification when examining nosology and etiology of psychotic disorders.

  18. Passive Sonar Target Detection Using Statistical Classifier and Adaptive Threshold

    Directory of Open Access Journals (Sweden)

    Hamed Komari Alaie

    2018-01-01

    This paper presents the results of an experimental investigation of target detection with passive sonar in the Persian Gulf. Detecting sounds propagated in water is one of the basic challenges for researchers in the sonar field. The challenge becomes more complex in shallow water (like the Persian Gulf) and with noiseless vessels. Generally, in passive sonar, targets are detected by the sonar equation (with a constant threshold), which increases the detection error in shallow water. The purpose of this study is to propose a new method for detecting targets in passive sonar using an adaptive threshold. In this method, the target signal (sound) is processed in the time and frequency domains. For classification, Bayesian classification is used and the posterior distribution is estimated by the Maximum Likelihood Estimation algorithm. Finally, the target is detected by combining the detection points in both domains using a Least Mean Square (LMS) adaptive filter. The results show that the proposed method improved the true detection rate by about 24% compared with the best other detection method.
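
    The abstract above names the Least Mean Square (LMS) adaptive filter without giving its update rule. The following is a minimal, generic sketch of the standard LMS weight update, shown on a hypothetical noise-free system-identification task; it is not the paper's detection combiner, and `lms_filter`, the tap count, and the step size `mu` are illustrative assumptions.

```python
import random

def lms_filter(inputs, desired, n_taps=2, mu=0.1):
    """Generic LMS adaptive filter: returns final weights and the error sequence.

    At each step the filter output is y = w . x, the error is e = d - y,
    and the weights move along the instantaneous gradient: w <- w + mu * e * x.
    """
    w = [0.0] * n_taps
    errors = []
    for k in range(n_taps - 1, len(inputs)):
        x = [inputs[k - i] for i in range(n_taps)]      # most recent sample first
        y = sum(wi * xi for wi, xi in zip(w, x))        # filter output
        e = desired[k] - y                              # estimation error
        w = [wi + mu * e * xi for wi, xi in zip(w, x)]  # LMS weight update
        errors.append(e)
    return w, errors

# Hypothetical unknown system: d[k] = 0.5*x[k] - 0.3*x[k-1]
rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
d = [0.5 * x[k] - (0.3 * x[k - 1] if k > 0 else 0.0) for k in range(len(x))]
weights, errors = lms_filter(x, d)
```

    With a small enough step size the weights approach the coefficients of the unknown system, which is the same mechanism that lets an LMS stage adapt a combination rule or threshold to changing conditions.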

  19. Addressing the Challenge of Defining Valid Proteomic Biomarkers and Classifiers

    LENUS (Irish Health Repository)

    Dakna, Mohammed

    2010-12-10

    Background: The purpose of this manuscript is to provide, based on an extensive analysis of a proteomic data set, suggestions for proper statistical analysis for the discovery of sets of clinically relevant biomarkers. As a tractable example we define the measurable proteomic differences between apparently healthy adult males and females. We choose urine as the body fluid of interest and CE-MS, a thoroughly validated platform technology, allowing for routine analysis of a large number of samples. The second urine of the morning was collected from apparently healthy male and female volunteers (aged 21-40) in the course of the routine medical check-up before recruitment at the Hannover Medical School. Results: We found that the Wilcoxon test is best suited for the definition of potential biomarkers. Adjustment for multiple testing is necessary. Sample size estimation can be performed based on a small number of observations via resampling from pilot data. Machine learning algorithms appear ideally suited to generate classifiers. Assessment of any results in an independent test set is essential. Conclusions: Valid proteomic biomarkers for diagnosis and prognosis can only be defined by applying proper statistical data mining procedures. In particular, a justification of the sample size should be part of the study design.
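
    The abstract above states that adjustment for multiple testing is necessary but does not name the procedure. The following is a minimal sketch, assuming the Benjamini-Hochberg false discovery rate adjustment as one common choice; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    Assumption: the abstract does not say which adjustment the authors used;
    Benjamini-Hochberg is shown here only as a widely used example.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity of the result.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical per-peptide p-values, e.g. from Wilcoxon tests
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
```

    Candidate biomarkers would then be those whose adjusted p-value falls below the chosen false discovery rate.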

  20. Quantum Hooke's Law to Classify Pulse Laser Induced Ultrafast Melting

    Science.gov (United States)

    Hu, Hao; Ding, Hepeng; Liu, Feng

    2015-02-01

    Ultrafast crystal-to-liquid phase transition induced by femtosecond pulse laser excitation is an interesting material behavior manifesting the complexity of light-matter interaction. There exist two types of such phase transitions: one occurs at a time scale shorter than a picosecond via a nonthermal process mediated by electron-hole plasma formation; the other at a longer time scale via a thermal melting process mediated by electron-phonon interaction. However, it remains unclear which materials undergo which process, and why. Here, by exploiting the property of quantum electronic stress (QES) governed by quantum Hooke's law, we classify the transitions by two distinct classes of materials: the faster nonthermal process can only occur in materials like ice having an anomalous phase diagram characterized by dTm/dP < 0, where Tm is the melting temperature and P is pressure, above a high threshold laser fluence; while the slower thermal process may occur in all materials. In particular, the nonthermal transition is shown to be induced by the QES, acting like a negative internal pressure, which drives the crystal into a “super pressing” state to spontaneously transform into a higher-density liquid phase. Our findings significantly advance fundamental understanding of ultrafast crystal-to-liquid phase transitions, enabling quantitative a priori predictions.

  1. Classifying and Analyzing 3d Cell Motion in Jammed Microgels

    Science.gov (United States)

    Bhattacharjee, Tapomoy; Sawyer, W. Gregory; Angelini, Thomas

    Soft granular polyelectrolyte microgels swell in liquid cell growth media to form a continuous elastic solid that can easily transition between solid and fluid states under a low shear stress. Such liquid-like solids (LLS) have recently been used to create 3D cellular constructs as well as to support, culture and harvest cells in 3D. Current understanding of cell migration mechanics in 3D was established from experiments performed in natural and synthetic polymer networks. Spatial variation in network structure and the transience of degradable gels limit their usefulness in quantitative cell mechanics studies. By contrast, LLS growth media approximate a homogeneous continuum, enabling tractable cell mechanics measurements to be performed in 3D. Here, we introduce a process to understand and classify cytotoxic T cell motion in 3D by studying cellular motility in LLS media. General classification of T cell motion can be achieved with a very traditional statistical approach: the cell's mean squared displacement (MSD) as a function of delay time. We will also use Langevin approaches combined with the constitutive equations of the LLS medium to predict the statistics of T cell motion. National Science Foundation under Grant No. DMR-1352043.
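
    The mean squared displacement as a function of delay time, mentioned above, can be sketched as follows. This is a minimal illustration assuming a single 3D track sampled at a fixed time interval; the function name and the example track are hypothetical and not taken from the study.

```python
def mean_squared_displacement(track, max_lag=None):
    """Time-averaged MSD of one track.

    track: list of (x, y, z) positions sampled at a fixed interval.
    Returns {lag: msd}, where lag is the delay in units of the sampling interval.
    """
    n = len(track)
    max_lag = n - 1 if max_lag is None else min(max_lag, n - 1)
    msd = {}
    for lag in range(1, max_lag + 1):
        # Squared displacement for every pair of positions separated by `lag`
        squared = [
            sum((b - a) ** 2 for a, b in zip(track[i], track[i + lag]))
            for i in range(n - lag)
        ]
        msd[lag] = sum(squared) / len(squared)
    return msd

# Hypothetical ballistic track: constant velocity of one unit per frame along x
track = [(float(t), 0.0, 0.0) for t in range(6)]
msd = mean_squared_displacement(track)
```

    The scaling of MSD with delay time is what allows classification: MSD growing linearly with delay indicates diffusive motion, while quadratic growth, as in this example track, indicates directed (ballistic) motion.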

  2. Transforming Musical Signals through a Genre Classifying Convolutional Neural Network

    Science.gov (United States)

    Geng, S.; Ren, G.; Ogihara, M.

    2017-05-01

    Convolutional neural networks (CNNs) have been successfully applied on both discriminative and generative modeling for music-related tasks. For a particular task, the trained CNN contains information representing the decision making or the abstracting process. One can hope to manipulate existing music based on this 'informed' network and create music with new features corresponding to the knowledge obtained by the network. In this paper, we propose a method to utilize the stored information from a CNN trained on musical genre classification task. The network was composed of three convolutional layers, and was trained to classify five-second song clips into five different genres. After training, randomly selected clips were modified by maximizing the sum of outputs from the network layers. In addition to the potential of such CNNs to produce interesting audio transformation, more information about the network and the original music could be obtained from the analysis of the generated features since these features indicate how the network 'understands' the music.

  3. An Informed Framework for Training Classifiers from Social Media

    Directory of Open Access Journals (Sweden)

    Dong Seon Cheng

    2016-04-01

    Extracting information from social media has become a major focus of companies and researchers in recent years. Aside from the study of the social aspects, it has also been found feasible to exploit the collaborative strength of crowds to help solve classical machine learning problems like object recognition. In this work, we focus on the generally underappreciated problem of building effective datasets for training classifiers by automatically assembling data from social media. We detail some of the challenges of this approach and outline a framework that uses expanded search queries to retrieve more qualified data. In particular, we concentrate on collaboratively tagged media on the social platform Flickr, and on the problem of image classification to evaluate our approach. Finally, we describe a novel entropy-based method to incorporate an information-theoretic principle to guide our framework. Experimental validation against well-known public datasets shows the viability of this approach and marks an improvement over the state of the art in terms of simplicity and performance.

  4. Classifying the evolutionary and ecological features of neoplasms

    Science.gov (United States)

    Maley, Carlo C.; Aktipis, Athena; Graham, Trevor A.; Sottoriva, Andrea; Boddy, Amy M.; Janiszewska, Michalina; Silva, Ariosto S.; Gerlinger, Marco; Yuan, Yinyin; Pienta, Kenneth J.; Anderson, Karen S.; Gatenby, Robert; Swanton, Charles; Posada, David; Wu, Chung-I; Schiffman, Joshua D.; Hwang, E. Shelley; Polyak, Kornelia; Anderson, Alexander R. A.; Brown, Joel S.; Greaves, Mel; Shibata, Darryl

    2018-01-01

    Neoplasms change over time through a process of cell-level evolution, driven by genetic and epigenetic alterations. However, the ecology of the microenvironment of a neoplastic cell determines which changes provide adaptive benefits. There is widespread recognition of the importance of these evolutionary and ecological processes in cancer, but to date, no system has been proposed for drawing clinically relevant distinctions between how different tumours are evolving. On the basis of a consensus conference of experts in the fields of cancer evolution and cancer ecology, we propose a framework for classifying tumours that is based on four relevant components. These are the diversity of neoplastic cells (intratumoural heterogeneity) and changes over time in that diversity, which make up an evolutionary index (Evo-index), as well as the hazards to neoplastic cell survival and the resources available to neoplastic cells, which make up an ecological index (Eco-index). We review evidence demonstrating the importance of each of these factors and describe multiple methods that can be used to measure them. Development of this classification system holds promise for enabling clinicians to personalize optimal interventions based on the evolvability of the patient’s tumour. The Evo- and Eco-indices provide a common lexicon for communicating about how neoplasms change in response to interventions, with potential implications for clinical trials, personalized medicine and basic cancer research. PMID:28912577

  5. Learning multiscale and deep representations for classifying remotely sensed imagery

    Science.gov (United States)

    Zhao, Wenzhi; Du, Shihong

    2016-03-01

    It is widely agreed that spatial features can be combined with spectral properties to improve interpretation performance on very-high-resolution (VHR) images in urban areas. However, many existing methods for extracting spatial features can only generate low-level features and consider limited scales, leading to unsatisfactory classification results. In this study, a multiscale convolutional neural network (MCNN) algorithm is presented to learn spatial-related deep features for hyperspectral remote imagery classification. Unlike traditional methods for extracting spatial features, the MCNN first transforms the original data sets into a pyramid structure containing spatial information at multiple scales, and then automatically extracts high-level spatial features using multiscale training data sets. Specifically, the MCNN has two merits: (1) high-level spatial features can be effectively learned by using the hierarchical learning structure and (2) the multiscale learning scheme can capture contextual information at different scales. To evaluate the effectiveness of the proposed approach, the MCNN was applied to classify well-known hyperspectral data sets and compared with traditional methods. The experimental results showed a significant increase in classification accuracies, especially for urban areas.

  6. Executed Movement Using EEG Signals through a Naive Bayes Classifier

    Directory of Open Access Journals (Sweden)

    Juliano Machado

    2014-11-01

    Recent years have witnessed a rapid development of brain-computer interface (BCI) technology. An independent BCI is a communication system for controlling a device by human intention, e.g., a computer, a wheelchair or a neuroprosthesis, depending not on the brain’s normal output pathways of peripheral nerves and muscles, but on detectable signals that represent responsive or intentional brain activities. This paper presents a comparative study of the use of the linear discriminant analysis (LDA) and the naive Bayes (NB) classifiers in describing both right- and left-hand movement through electroencephalographic (EEG) signal acquisition. For the analysis, we considered the following input features: the energy of the segments of a band pass-filtered signal with the frequency band in sensorimotor rhythms and the components of the spectral energy obtained through the Welch method. We also used the common spatial pattern (CSP) filter, so as to increase the discriminatory activity among movement classes. By using the database generated by this experiment, we obtained hit rates up to 70%. The results are compatible with previous studies.

  7. Online Feature Selection for Classifying Emphysema in HRCT Images

    Directory of Open Access Journals (Sweden)

    M. Prasad

    2008-06-01

    Feature subset selection, applied as a pre-processing step to machine learning, is valuable in dimensionality reduction, eliminating irrelevant data and improving classifier performance. In the classic formulation of the feature selection problem, it is assumed that all the features are available at the beginning. However, in many real world problems, there are scenarios where not all features are present initially and must be integrated as they become available. In such scenarios, online feature selection provides an efficient way to sort through a large space of features. It is in this context that we introduce online feature selection for the classification of emphysema, a smoking related disease that appears as low attenuation regions in High Resolution Computer Tomography (HRCT) images. The technique was successfully evaluated on 61 HRCT scans and compared with different online feature selection approaches, including hill climbing, best first search, grafting, and correlation-based feature selection. The results were also compared against “density mask”, a standard approach used for emphysema detection in medical image analysis.

  8. Hyperspectral image classifier based on beach spectral feature

    International Nuclear Information System (INIS)

    Liang, Zhang; Lianru, Gao; Bing, Zhang

    2014-01-01

    The seashore, especially the coral bank, is sensitive to human activities and environmental changes. A multispectral image, with its coarse spectral resolution, is ill-suited to identifying subtle spectral distinctions between various beaches. By contrast, a hyperspectral image with narrow and consecutive channels increases our capability to retrieve minor spectral features, which makes it suitable for the identification and classification of surface materials on the shore. This paper used airborne hyperspectral data, in addition to ground spectral data, to study the beaches in Qingdao. The image data first went through image pretreatment to deal with noise, radiation inconsistency and distortion. Next, the reflection spectrum, the derivative spectrum and the spectral absorption features of the beach surface were inspected in search of diagnostic features. From these, spectral indices specific to the unique environment of the seashore were developed. According to expert decisions based on image spectra, the beaches are ultimately classified into sand beach, rock beach, vegetation beach, mud beach, bare land and water. In situ reflection spectra surveyed with a GER1500 field spectrometer validated the classification product. In conclusion, the classification approach under expert decision based on feature spectra is shown to be feasible for beaches.

  9. A dimensionless parameter for classifying hemodynamics in intracranial

    Science.gov (United States)

    Asgharzadeh, Hafez; Borazjani, Iman

    2015-11-01

    Rupture of an intracranial aneurysm (IA) is a disease with high rates of mortality. Given the risk associated with aneurysm surgery, quantifying the likelihood of aneurysm rupture is essential. There are many risk factors that could be implicated in the rupture of an aneurysm. However, the most important factors correlated with IA rupture are hemodynamic factors such as wall shear stress (WSS) and oscillatory shear index (OSI), which are affected by the IA flows. Here, we carry out three-dimensional high resolution simulations on representative IA models with simple geometries to test a dimensionless number (first proposed by Le et al., ASME J Biomech Eng, 2010), denoted as the An number, to classify the flow mode. The An number is defined as the ratio of the time the parent artery flow takes to transport across the IA neck to the time required for vortex ring formation. Based on the definition, the flow mode is vortex if An>1 and cavity if An<1 ... OSI on the human subject IA. This work was supported partly by the NIH grant R03EB014860, and the computational resources were partly provided by CCR at UB. We thank Prof. Hui Meng and Dr. Jianping Xiang for providing us the database of aneurysms and helpful discussions.
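
    The An number described above is a ratio of two time scales with a threshold at 1. The following is a minimal sketch of that classification rule; the function names, the argument names, and the label returned for the boundary case An = 1 are assumptions for illustration, since the abstract does not say how either time scale is computed or how the boundary is treated.

```python
def an_number(transport_time, vortex_formation_time):
    """Ratio of the time the parent artery flow takes to cross the aneurysm neck
    to the time required for vortex ring formation (both in the same units)."""
    if vortex_formation_time <= 0:
        raise ValueError("vortex_formation_time must be positive")
    return transport_time / vortex_formation_time

def flow_mode(an):
    """Classify the flow mode from the An number."""
    if an > 1:
        return "vortex"
    if an < 1:
        return "cavity"
    return "boundary"  # assumption: An == 1 is not covered by the abstract

# Hypothetical time scales in seconds
mode_a = flow_mode(an_number(0.30, 0.10))  # transport slower than vortex formation
mode_b = flow_mode(an_number(0.05, 0.10))  # flow crosses the neck before a ring forms
```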

  10. Instance Selection for Classifier Performance Estimation in Meta Learning

    Directory of Open Access Journals (Sweden)

    Marcin Blachnik

    2017-11-01

    Building an accurate prediction model is challenging and requires appropriate model selection. This process is very time consuming but can be accelerated with meta-learning: automatic model recommendation by estimating the performances of given prediction models without training them. Meta-learning utilizes metadata extracted from the dataset to effectively estimate the accuracy of the model in question. To achieve that goal, metadata descriptors must be gathered efficiently and must be informative enough to allow the precise estimation of prediction accuracy. In this paper, a new type of metadata descriptor is analyzed. These descriptors are based on the compression level obtained from instance selection methods at the data-preprocessing stage. To verify their suitability, two types of experiments on real-world datasets have been conducted. In the first one, 11 instance selection methods were examined in order to validate the compression–accuracy relation for three classifiers: k-nearest neighbors (kNN), support vector machine (SVM), and random forest. From this analysis, two methods are recommended, instance-based learning type 2 (IB2) and edited nearest neighbor (ENN), which are then compared with the state-of-the-art metaset descriptors. The obtained results confirm that the two suggested compression-based meta-features help to predict the accuracy of the base model much more accurately than the state-of-the-art solution.
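
    To illustrate how a compression-based descriptor might be obtained, the following is a minimal sketch of edited nearest neighbor (ENN) instance selection, which drops every instance whose label disagrees with the majority of its k nearest neighbors. The definition of compression as the fraction of removed instances, the function names, and the toy data are assumptions; the abstract does not give the exact formula.

```python
from collections import Counter

def edited_nearest_neighbor(points, labels, k=3):
    """Return the indices of instances kept by ENN (squared Euclidean distance)."""
    keep = []
    for i, (p, y) in enumerate(zip(points, labels)):
        # Distances from instance i to every other instance
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(p, q)), j)
            for j, q in enumerate(points) if j != i
        )
        votes = Counter(labels[j] for _, j in dists[:k])
        if votes.most_common(1)[0][0] == y:
            keep.append(i)  # label agrees with the neighborhood majority
    return keep

def compression(n_original, n_selected):
    """Assumed definition: fraction of instances removed by instance selection."""
    return 1.0 - n_selected / n_original

# Hypothetical 1D dataset: two clean clusters plus one mislabeled point (last)
points = [(0.0,), (0.1,), (0.2,), (0.3,), (5.0,), (5.1,), (5.2,), (5.3,), (0.15,)]
labels = ["a", "a", "a", "a", "b", "b", "b", "b", "b"]
kept = edited_nearest_neighbor(points, labels)
meta_feature = compression(len(points), len(kept))
```

    On a clean, well-separated dataset ENN removes almost nothing, so a low value of this descriptor hints that a simple classifier is likely to reach high accuracy.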

  11. Implementation of a classifier didactical machine for learning mechatronic processes

    Directory of Open Access Journals (Sweden)

    Alex De La Cruz

    2017-06-01

    The present article shows the design and construction of a didactic classifier machine based on artificial vision. The machine is intended to be used as a learning module for mechatronic processes. The project describes the theoretical aspects that relate concepts of mechanical design, electronic design and software management, which together constitute a popular field in science and technology: mechatronics. The design of the machine was developed based on the requirements of the user, through the concurrent design methodology, to define and materialize the appropriate hardware and software solutions. LabVIEW 2015 was used for high-speed image acquisition and analysis, as well as for establishing data communication with a programmable logic controller (PLC) via Ethernet and an open communications platform known as Open Platform Communications (OPC). In addition, the Arduino MEGA 2560 platform was used to control the movement of the stepper motor and the servo motors of the module. Finally, we assessed whether the equipment meets the technical specifications by running specific test protocols.

  12. Salient Region Detection via Feature Combination and Discriminative Classifier

    Directory of Open Access Journals (Sweden)

    Deming Kong

    2015-01-01

    We introduce a novel approach to detect salient regions of an image via feature combination and a discriminative classifier. Our method, which is based on hierarchical image abstraction, uses the logistic regression approach to map the regional feature vector to a saliency score. Four saliency cues are used in our approach: color contrast in a global context, center-boundary priors, spatially compact color distribution, and objectness, which serves as an atomic feature of each segmented region in the image. By mapping a four-dimensional regional feature to a fifteen-dimensional feature vector, we can linearly separate the salient regions from the clustered background by finding an optimal linear combination of feature coefficients in the fifteen-dimensional feature space, and finally fuse the saliency maps across multiple levels. Furthermore, we introduce the weighted salient image center into our saliency analysis task. Extensive experiments on two large benchmark datasets show that the proposed approach achieves the best performance over several state-of-the-art approaches.

  13. Complete Genome Sequence of an Avian Metapneumovirus Subtype A Strain Isolated from Chicken (Gallus gallus) in Brazil

    OpenAIRE

    Rizotto, Laís S.; Scagion, Guilherme P.; Cardoso, Tereza C.; Simão, Raphael M.; Caserta, Leonardo C.; Benassi, Julia C.; Keid, Lara B.; Oliveira, Trícia M. F. de S.; Soares, Rodrigo M.; Arns, Clarice W.; Van Borm, Steven; Ferreira, Helena L.

    2017-01-01

    We report here the complete genome sequence of an avian metapneumovirus (aMPV) isolated from a tracheal tissue sample of a commercial layer flock. The complete genome sequence of aMPV-A/chicken/Brazil-SP/669/2003 was obtained using MiSeq (Illumina, Inc.) sequencing. Phylogenetic analysis of the complete genome classified the isolate as avian metapneumovirus subtype A.

  14. Large scale identification and categorization of protein sequences using structured logistic regression

    DEFF Research Database (Denmark)

    Pedersen, Bjørn Panella; Ifrim, Georgiana; Liboriussen, Poul

    2014-01-01

    Abstract Background Structured Logistic Regression (SLR) is a newly developed machine learning tool first proposed in the context of text categorization. Current availability of extensive protein sequence databases calls for an automated method to reliably classify sequences and SLR seems well...... problem. Results Using SLR, we have built classifiers to identify and automatically categorize P-type ATPases into one of 11 pre-defined classes. The SLR-classifiers are compared to a Hidden Markov Model approach and shown to be highly accurate and scalable. Representing the bulk of currently known...... for further biochemical characterization and structural analysis....

  15. How to Name and Classify Your Phage: An Informal Guide

    Directory of Open Access Journals (Sweden)

    Evelien Adriaenssens

    2017-04-01

    Full Text Available With this informal guide, we try to assist both new and experienced phage researchers through two important stages that follow phage discovery; that is, naming and classification. Providing an appropriate name for a bacteriophage is not as trivial as it sounds, and the effects might be long-lasting in databases and in official taxon names. Phage classification is the responsibility of the Bacterial and Archaeal Viruses Subcommittee (BAVS) of the International Committee on the Taxonomy of Viruses (ICTV). While the BAVS aims at providing a holistic approach to phage taxonomy, for individual researchers who have isolated and sequenced a new phage, this can be a little overwhelming. We are now providing these researchers with an informal guide to phage naming and classification, taking a “bottom-up” approach from the phage isolate level.

  16. Robust Template Decomposition without Weight Restriction for Cellular Neural Networks Implementing Arbitrary Boolean Functions Using Support Vector Classifiers

    Directory of Open Access Journals (Sweden)

    Yih-Lon Lin

    2013-01-01

    Full Text Available If the given Boolean function is linearly separable, a robust uncoupled cellular neural network can be designed as a maximal margin classifier. On the other hand, if the given Boolean function is linearly separable but has a small geometric margin or it is not linearly separable, a popular approach is to find a sequence of robust uncoupled cellular neural networks implementing the given Boolean function. In the past research works using this approach, the control template parameters and thresholds are restricted to assume only a given finite set of integers, and this is certainly unnecessary for the template design. In this study, we try to remove this restriction. Minterm- and maxterm-based decomposition algorithms utilizing the soft margin and maximal margin support vector classifiers are proposed to design a sequence of robust templates implementing an arbitrary Boolean function. Several illustrative examples are simulated to demonstrate the efficiency of the proposed method by comparing our results with those produced by other decomposition methods with restricted weights.

  17. A GIS semiautomatic tool for classifying and mapping wetland soils

    Science.gov (United States)

    Moreno-Ramón, Héctor; Marqués-Mateu, Angel; Ibáñez-Asensio, Sara

    2016-04-01

    Wetlands are among the most productive and biodiverse ecosystems in the world. Water is the main resource and controls the relationships between the agents and factors that determine the quality of the wetland. However, vegetation, wildlife and soils are also essential to understanding these environments. Soils have possibly been the least studied resource because of the difficulty of sampling them. As a result, wetland soils have sometimes been classified only broadly. The traditional methodology states that homogeneous soil units should be based on the five soil-forming factors. A problem can appear when the variation of one soil-forming factor is too small to differentiate a change in soil units, or when another factor is not taken into account (e.g. a fluctuating water table). This is the case of the Albufera of Valencia, a coastal wetland located in the middle east of the Iberian Peninsula (Spain). The saline water table fluctuates throughout the year and generates differences in soils. To solve this problem, the objectives of this study were to establish a reliable methodology that avoids these problems, and to develop a GIS tool that would allow us to define homogeneous soil units in wetlands. This step is essential for the soil scientist, who has to decide the number of soil profiles in a study. The research was conducted with data from 133 soil pits of a previous study in the wetland. In that study, soil parameters of 401 samples (organic carbon, salinity, carbonates, n-value, etc.) were analysed. In a first stage, GIS layers were generated according to depth. The method employed was Bayesian Maximum Entropy. Subsequently, a program based on decision tree algorithms was designed in a GIS environment. The goal of this tool was to create a single layer, for each soil variable, according to the different diagnostic criteria of Soil Taxonomy (properties, horizons and diagnostic epipedons). At the end, the program

  18. Infrared dim moving target tracking via sparsity-based discriminative classifier and convolutional network

    Science.gov (United States)

    Qian, Kun; Zhou, Huixin; Wang, Bingjian; Song, Shangzhen; Zhao, Dong

    2017-11-01

    Infrared dim and small target tracking is a highly challenging task. The main challenge for target tracking is to account for appearance change of an object, which is submerged in the cluttered background. An efficient appearance model that exploits both the global template and local representation over infrared image sequences is constructed for dim moving target tracking. A Sparsity-based Discriminative Classifier (SDC) and a Convolutional Network-based Generative Model (CNGM) are combined with a prior model. In the SDC model, a sparse representation-based algorithm is adopted to calculate the confidence value that assigns more weight to target templates than to negative background templates. In the CNGM model, simple cell feature maps are obtained by calculating the convolution between target templates and fixed filters, which are extracted from the target region at the first frame. These maps measure similarities between each filter and local intensity patterns across the target template, therefore encoding its local structural information. Then, all the maps form a representation, preserving the inner geometric layout of a candidate template. Furthermore, the fixed target template set is processed via an efficient prior model. The same operation is applied to candidate templates in the CNGM model. The online update scheme not only accounts for appearance variations but also alleviates the migration problem. Finally, collaborative confidence values of particles are utilized to generate the particles' importance weights. Experiments on various infrared sequences have validated the tracking capability of the presented algorithm. Experimental results show that this algorithm runs in real time and provides a higher accuracy than state-of-the-art algorithms.

  19. Locating and classifying defects using an hybrid data base

    Energy Technology Data Exchange (ETDEWEB)

    Luna-Aviles, A; Diaz Pineda, A [Tecnologico de Estudios Superiores de Coacalco. Av. 16 de Septiembre 54, Col. Cabecera Municipal. C.P. 55700 (Mexico); Hernandez-Gomez, L H; Urriolagoitia-Calderon, G; Urriolagoitia-Sosa, G [Instituto Politecnico Nacional. ESIME-SEPI. Unidad Profesional ' Adolfo Lopez Mateos' Edificio 5, 30 Piso, Colonia Lindavista. Gustavo A. Madero. 07738 Mexico D.F. (Mexico); Durodola, J F [School of Technology, Oxford Brookes University, Headington Campus, Gipsy Lane, Oxford OX3 0BP (United Kingdom); Beltran Fernandez, J A, E-mail: alelunaav@hotmail.com, E-mail: luishector56@hotmail.com, E-mail: jdurodola@brookes.ac.uk

    2011-07-19

    A computational inverse technique was used in the localization and classification of defects. Postulated voids of two different sizes (2 mm and 4 mm diameter) were introduced in PMMA bars with and without a notch. The bar dimensions are 200x20x5 mm. One half of them were plain and the other half had a notch (3 mm x 4 mm) close to the defect area (19 mm x 16 mm). This analysis was done with an Artificial Neural Network (ANN), and its optimization was done with an Adaptive Neuro Fuzzy Procedure (ANFIS). A hybrid data base was developed with numerical and experimental results. Synthetic data was generated with the finite element method using the SOLID95 element of the ANSYS code. A parametric analysis was carried out. Only one defect in such bars was taken into account and the first five natural frequencies were calculated. 460 cases were evaluated. Half of them were plain and the other half had a notch. All the input data was classified in two groups. Each one has 230 cases and corresponds to one of the two sorts of voids mentioned above. On the other hand, experimental analysis was carried out with PMMA specimens of the same size. The first two natural frequencies of 40 cases were obtained with one void. The other three frequencies were obtained numerically. 20 of these bars were plain and the others had a notch. These experimental results were introduced into the synthetic data base. 400 cases were taken randomly and, with this information, the ANN was trained with the backpropagation algorithm. The accuracy of the results was tested with the 100 cases that were left. In the next stage of this work, the ANN output was optimized with ANFIS. Previous papers showed that the accuracy of localization and classification of defects was reduced as notches were introduced in such bars. In the case of this paper, improved results were obtained when a hybrid data base was used.
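
    The case counts in this abstract (460 synthetic plus 40 experimental cases, 400 drawn at random for training, 100 left for testing) can be checked with a short sketch. The tuple layout, the fixed seed and the helper name `split_cases` are assumptions for illustration; the record does not describe how the random draw was actually made.

```python
import random


def split_cases(cases, n_train, seed=0):
    """Shuffle a copy of the cases and cut it into training and test parts."""
    rng = random.Random(seed)   # fixed seed so the sketch is reproducible
    shuffled = list(cases)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical placeholders standing in for the frequency data of each case.
synthetic = [("synthetic", i) for i in range(460)]
experimental = [("experimental", i) for i in range(40)]

hybrid = synthetic + experimental          # the hybrid data base: 500 cases
train, test = split_cases(hybrid, n_train=400)
print(len(hybrid), len(train), len(test))
```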

  20. CLASSIFYING X-RAY BINARIES: A PROBABILISTIC APPROACH

    International Nuclear Information System (INIS)

    Gopalan, Giri; Bornn, Luke; Vrtilek, Saeqa Dil

    2015-01-01

    In X-ray binary star systems consisting of a compact object that accretes material from an orbiting secondary star, there is no straightforward means to decide whether the compact object is a black hole or a neutron star. To assist in this process, we develop a Bayesian statistical model that makes use of the fact that X-ray binary systems appear to cluster based on their compact object type when viewed from a three-dimensional coordinate system derived from X-ray spectral data where the first coordinate is the ratio of counts in the mid- to low-energy band (color 1), the second coordinate is the ratio of counts in the high- to low-energy band (color 2), and the third coordinate is the sum of counts in all three bands. We use this model to estimate the probabilities of an X-ray binary system containing a black hole, non-pulsing neutron star, or pulsing neutron star. In particular, we utilize a latent variable model in which the latent variables follow a Gaussian process prior distribution, and hence we are able to induce the spatial correlation which we believe exists between systems of the same type. The utility of this approach is demonstrated by the accurate prediction of system types using Rossi X-ray Timing Explorer All Sky Monitor data, but it is not flawless. In particular, non-pulsing neutron systems containing “bursters” that are close to the boundary demarcating systems containing black holes tend to be classified as black hole systems. As a byproduct of our analyses, we provide the astronomer with the public R code which can be used to predict the compact object type of XRBs given training data
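
    The three-dimensional coordinate system described above can be written down directly from the band counts. This is a minimal sketch of that mapping only, not of the Bayesian model; the function name `color_color_intensity` and the sample counts are hypothetical.

```python
def color_color_intensity(low, mid, high):
    """Map counts in three energy bands to (color 1, color 2, total counts)."""
    if low <= 0:
        raise ValueError("low-band counts must be positive to form the ratios")
    color1 = mid / low            # mid- to low-energy band ratio
    color2 = high / low           # high- to low-energy band ratio
    intensity = low + mid + high  # sum of counts in all three bands
    return color1, color2, intensity


# Hypothetical counts for one observation.
print(color_color_intensity(low=40.0, mid=20.0, high=10.0))
```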

  1. Development of multicriteria models to classify energy efficiency alternatives

    International Nuclear Information System (INIS)

    Neves, Luis Pires; Antunes, Carlos Henggeler; Dias, Luis Candido; Martins, Antonio Gomes

    2005-01-01

    This paper aims at describing a novel constructive approach to develop decision support models to classify energy efficiency initiatives, including traditional Demand-Side Management and Market Transformation initiatives, overcoming the limitations and drawbacks of Cost-Benefit Analysis. A multicriteria approach based on the ELECTRE-TRI method is used, focusing on four perspectives: - an independent Agency with the aim of promoting energy efficiency; - Distribution-only utilities under a regulated framework; - the Regulator; - Supply companies in a competitive liberalized market. These perspectives were chosen after a system analysis of the decision situation regarding the implementation of energy efficiency initiatives, looking for the main roles and power relations, with the purpose of structuring the decision problem by identifying the actors, the decision makers, the decision paradigm, and the relevant criteria. The multicriteria models developed allow considering different kinds of impacts, but avoiding difficult measurements and unit conversions due to the nature of the multicriteria method chosen. The decision is then based on all the significant effects of the initiative, both positive and negative ones, including ancillary effects often forgotten in cost-benefit analysis. The ELECTRE-TRI, as most multicriteria methods, provides to the Decision Maker the ability of controlling the relevance each impact can have on the final decision. The decision support process encompasses a robustness analysis, which, together with a good documentation of the parameters supplied into the model, should support sound decisions. The models were tested with a set of real-world initiatives and compared with possible decisions based on Cost-Benefit analysis

  2. Locating and classifying defects using an hybrid data base

    Science.gov (United States)

    Luna-Avilés, A.; Hernández-Gómez, L. H.; Durodola, J. F.; Urriolagoitia-Calderón, G.; Urriolagoitia-Sosa, G.; Beltrán Fernández, J. A.; Díaz Pineda, A.

    2011-07-01

    A computational inverse technique was used in the localization and classification of defects. Postulated voids of two different sizes (2 mm and 4 mm diameter) were introduced in PMMA bars with and without a notch. The bar dimensions are 200×20×5 mm. One half of them were plain and the other half had a notch (3 mm × 4 mm) close to the defect area (19 mm × 16 mm). This analysis was done with an Artificial Neural Network (ANN), and its optimization was done with an Adaptive Neuro Fuzzy Procedure (ANFIS). A hybrid data base was developed with numerical and experimental results. Synthetic data was generated with the finite element method using the SOLID95 element of the ANSYS code. A parametric analysis was carried out. Only one defect in such bars was taken into account and the first five natural frequencies were calculated. 460 cases were evaluated. Half of them were plain and the other half had a notch. All the input data was classified in two groups. Each one has 230 cases and corresponds to one of the two sorts of voids mentioned above. On the other hand, experimental analysis was carried out with PMMA specimens of the same size. The first two natural frequencies of 40 cases were obtained with one void. The other three frequencies were obtained numerically. 20 of these bars were plain and the others had a notch. These experimental results were introduced into the synthetic data base. 400 cases were taken randomly and, with this information, the ANN was trained with the backpropagation algorithm. The accuracy of the results was tested with the 100 cases that were left. In the next stage of this work, the ANN output was optimized with ANFIS. Previous papers showed that the accuracy of localization and classification of defects was reduced as notches were introduced in such bars. In the case of this paper, improved results were obtained when a hybrid data base was used.

  3. Multimodal fusion of polynomial classifiers for automatic person recognition

    Science.gov (United States)

    Broun, Charles C.; Zhang, Xiaozheng

    2001-03-01

    With the prevalence of the information age, privacy and personalization are at the forefront in today's society. As such, biometrics are viewed as essential components of current evolving technological systems. Consumers demand unobtrusive and non-invasive approaches. In our previous work, we have demonstrated a speaker verification system that meets these criteria. However, there are additional constraints for fielded systems. The required recognition transactions are often performed in adverse environments and across diverse populations, necessitating robust solutions. There are two significant problem areas in current generation speaker verification systems. The first is the difficulty in acquiring clean audio signals in all environments without encumbering the user with a head-mounted close-talking microphone. Second, unimodal biometric systems do not work with a significant percentage of the population. To combat these issues, multimodal techniques are being investigated to improve system robustness to environmental conditions, as well as improve overall accuracy across the population. We propose a multimodal approach that builds on our current state-of-the-art speaker verification technology. In order to maintain the transparent nature of the speech interface, we focus on optical sensing technology to provide the additional modality, giving us an audio-visual person recognition system. For the audio domain, we use our existing speaker verification system. For the visual domain, we focus on lip motion. This is chosen, rather than static face or iris recognition, because it provides dynamic information about the individual. In addition, the lip dynamics can aid speech recognition to provide liveness testing. The visual processing method makes use of both color and edge information, combined within a Markov random field (MRF) framework, to localize the lips. Geometric features are extracted and input to a polynomial classifier for the person recognition process. A late

  4. Machine learning algorithms to classify spinal muscular atrophy subtypes.

    Science.gov (United States)

    Srivastava, Tuhin; Darras, Basil T; Wu, Jim S; Rutkove, Seward B

    2012-07-24

    The development of better biomarkers for disease assessment remains an ongoing effort across the spectrum of neurologic illnesses. One approach for refining biomarkers is based on the concept of machine learning, in which individual, unrelated biomarkers are simultaneously evaluated. In this cross-sectional study, we assess the possibility of using machine learning, incorporating both quantitative muscle ultrasound (QMU) and electrical impedance myography (EIM) data, for classification of muscles affected by spinal muscular atrophy (SMA). Twenty-one normal subjects, 15 subjects with SMA type 2, and 10 subjects with SMA type 3 underwent EIM and QMU measurements of unilateral biceps, wrist extensors, quadriceps, and tibialis anterior. EIM and QMU parameters were then applied in combination using a support vector machine (SVM), a type of machine learning, in an attempt to accurately categorize 165 individual muscles. For all 3 classification problems, normal vs SMA, normal vs SMA 3, and SMA 2 vs SMA 3, use of SVM provided the greatest accuracy in discrimination, surpassing both EIM and QMU individually. For example, the accuracy, as measured by the receiver operating characteristic area under the curve (ROC-AUC) for the SVM discriminating SMA 2 muscles from SMA 3 muscles was 0.928; in comparison, the ROC-AUCs for EIM and QMU parameters alone were only 0.877 (p < 0.05) and 0.627 (p < 0.05), respectively. Combining EIM and QMU data categorizes individual SMA-affected muscles with very high accuracy. Further investigation of this approach for classifying and for following the progression of neuromuscular illness is warranted.
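
    The ROC-AUC figures quoted above have a simple interpretation: the probability that a randomly chosen positive case receives a higher classifier score than a randomly chosen negative case. The sketch below computes that quantity by pairwise comparison; the function name `roc_auc` and the score values are hypothetical and unrelated to the study's data.

```python
def roc_auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    if not scores_pos or not scores_neg:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical classifier outputs for muscles of two classes.
pos = [0.9, 0.8, 0.4]
neg = [0.7, 0.3, 0.2]
print(roc_auc(pos, neg))  # 8 of the 9 pairs are ordered correctly
```

    The pairwise form is quadratic in the number of cases, which is fine for a data set of 165 muscles; rank-based formulas give the same value faster on large data.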

  5. Unsupervised online classifier in sleep scoring for sleep deprivation studies.

    Science.gov (United States)

    Libourel, Paul-Antoine; Corneyllie, Alexandra; Luppi, Pierre-Hervé; Chouvet, Guy; Gervasoni, Damien

    2015-05-01

    This study was designed to evaluate an unsupervised adaptive algorithm for real-time detection of sleep and wake states in rodents. We designed a Bayesian classifier that automatically extracts electroencephalogram (EEG) and electromyogram (EMG) features and categorizes non-overlapping 5-s epochs into one of the three major sleep and wake states without any human supervision. This sleep-scoring algorithm is coupled online with a new device to perform selective paradoxical sleep deprivation (PSD). Controlled laboratory settings for chronic polygraphic sleep recordings and selective PSD. Ten adult Sprague-Dawley rats instrumented for chronic polysomnographic recordings. The performance of the algorithm is evaluated by comparison with the score obtained by a human expert reader. Online detection of PS is then validated with a PSD protocol with duration of 72 hours. Our algorithm gave a high concordance with human scoring with an average κ coefficient > 70%. Notably, the specificity to detect PS reached 92%. Selective PSD using real-time detection of PS strongly reduced PS amounts, leaving only brief PS bouts necessary for the detection of PS in EEG and EMG signals (4.7 ± 0.7% over 72 h, versus 8.9 ± 0.5% in baseline), and was followed by a significant PS rebound (23.3 ± 3.3% over 150 minutes). Our fully unsupervised data-driven algorithm overcomes some limitations of the other automated methods such as the selection of representative descriptors or threshold settings. When used online and coupled with our sleep deprivation device, it represents a better option for selective PSD than other methods like the tedious gentle handling or the platform method. © 2015 Associated Professional Sleep Societies, LLC.
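
    Agreement between an automatic scorer and a human reader, reported above as a κ coefficient, can be sketched with Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The abstract does not state which kappa variant was used, so treating it as Cohen's kappa is an assumption, as are the function name `cohen_kappa` and the epoch labels (W, S, P) below.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two label sequences of equal length."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    # Chance agreement: product of the two scorers' marginal label frequencies.
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both scorers used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical scores for six 5-s epochs (W = wake, S = slow-wave, P = paradoxical).
algorithm = ["W", "W", "S", "S", "P", "S"]
expert = ["W", "W", "S", "S", "P", "P"]
print(cohen_kappa(algorithm, expert))
```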

  6. Locating and classifying defects using an hybrid data base

    International Nuclear Information System (INIS)

    Luna-Aviles, A; Diaz Pineda, A; Hernandez-Gomez, L H; Urriolagoitia-Calderon, G; Urriolagoitia-Sosa, G; Durodola, J F; Beltran Fernandez, J A

    2011-01-01

    A computational inverse technique was used in the localization and classification of defects. Postulated voids of two different sizes (2 mm and 4 mm diameter) were introduced in PMMA bars with and without a notch. The bar dimensions are 200x20x5 mm. One half of them were plain and the other half had a notch (3 mm x 4 mm) close to the defect area (19 mm x 16 mm). This analysis was done with an Artificial Neural Network (ANN), and its optimization was done with an Adaptive Neuro Fuzzy Procedure (ANFIS). A hybrid data base was developed with numerical and experimental results. Synthetic data was generated with the finite element method using the SOLID95 element of the ANSYS code. A parametric analysis was carried out. Only one defect in such bars was taken into account and the first five natural frequencies were calculated. 460 cases were evaluated. Half of them were plain and the other half had a notch. All the input data was classified in two groups. Each one has 230 cases and corresponds to one of the two sorts of voids mentioned above. On the other hand, experimental analysis was carried out with PMMA specimens of the same size. The first two natural frequencies of 40 cases were obtained with one void. The other three frequencies were obtained numerically. 20 of these bars were plain and the others had a notch. These experimental results were introduced into the synthetic data base. 400 cases were taken randomly and, with this information, the ANN was trained with the backpropagation algorithm. The accuracy of the results was tested with the 100 cases that were left. In the next stage of this work, the ANN output was optimized with ANFIS. Previous papers showed that the accuracy of localization and classification of defects was reduced as notches were introduced in such bars. In the case of this paper, improved results were obtained when a hybrid data base was used.

  7. Classifying supersymmetric solutions in 3D maximal supergravity

    Science.gov (United States)

    de Boer, Jan; Mayerson, Daniel R.; Shigemori, Masaki

    2014-12-01

    String theory contains various extended objects. Among those, objects of codimension two (such as the D7-brane) are particularly interesting. Codimension-two objects carry non-Abelian charges which are elements of a discrete U-duality group and they may not admit a simple spacetime description, in which case they are known as exotic branes. A complete classification of consistent codimension-two objects in string theory is missing, even if we demand that they preserve some supersymmetry. As a step toward such a classification, we study the supersymmetric solutions of 3D maximal supergravity, which can be regarded as an approximate description of the geometry near codimension-two objects. We present a complete classification of the types of supersymmetric solutions that exist in this theory. We found that this problem reduces to that of classifying nilpotent orbits associated with the U-duality group, for which various mathematical results are known. We show that the only allowed supersymmetric configurations are 1/2, 1/4, 1/8, and 1/16 BPS, and determine the nilpotent orbits that they correspond to. One example of 1/16 BPS configurations is a generalization of the MSW system, where momentum runs along the intersection of seven M5-branes. On the other hand, it turns out to be exceedingly difficult to translate this classification into a simple criterion for supersymmetry in terms of the non-Abelian (monodromy) charges of the objects. For example, it can happen that a supersymmetric solution exists locally but cannot be extended all the way to the location of the object. To illustrate the various issues that arise in constructing supersymmetric solutions, we present a number of explicit examples.

  8. CLASSIFYING BENIGN AND MALIGNANT MASSES USING STATISTICAL MEASURES

    Directory of Open Access Journals (Sweden)

    B. Surendiran

    2011-11-01

    Full Text Available Breast cancer is the primary and most common disease found in women and causes the second highest rate of death after lung cancer. The digital mammogram is an X-ray of the breast captured for analysis, interpretation and diagnosis. According to the Breast Imaging Reporting and Data System (BIRADS), benign and malignant masses can be differentiated using their shape, size and density, which is how radiologists visualize mammograms. According to BIRADS mass shape characteristics, benign masses tend to be round, oval or lobular in shape, and malignant masses are lobular or irregular in shape. Measuring regular and irregular shapes mathematically is found to be a difficult task, since there is no single measure to differentiate various shapes. In this paper, the malignant and benign masses present in mammograms are classified using Hue, Saturation and Value (HSV) weight function based statistical measures. The weight function is robust against noise and captures the degree of gray content of the pixel. The statistical measures use the gray weight value instead of the gray pixel value to effectively discriminate masses. The 233 mammograms from the Digital Database for Screening Mammography (DDSM) benchmark dataset have been used. The PASW data mining modeler has been used for constructing a Neural Network to identify the importance of the statistical measures. Based on the most important statistical measures obtained, the C5.0 tree has been constructed with a 60-40 data split. The experimental results are found to be encouraging. Also, the results agree with the standard specified by the American College of Radiology BIRADS system.

  9. Binary naive Bayesian classifiers for correlated Gaussian features: a theoretical analysis

    CSIR Research Space (South Africa)

    Van Dyk, E

    2008-11-01

    Full Text Available classifier with Gaussian features while using any quadratic decision boundary. Therefore, the analysis is not restricted to Naive Bayesian classifiers alone and can, for instance, be used to calculate the Bayes error performance. We compare the analytical...

  10. 32 CFR 2004.21 - Protection of Classified Information [201(e)].

    Science.gov (United States)

    2010-07-01

    ... 32 National Defense 6 2010-07-01 2010-07-01 false Protection of Classified Information [201(e... PROGRAM DIRECTIVE NO. 1 Operations § 2004.21 Protection of Classified Information [201(e)]. Procedures for... coordination process. ...

  11. Evaluating the Performance of Multiple Classifier Systems: A Matrix Algebra Representation of Boolean Fusion Rules

    National Research Council Canada - National Science Library

    Hill, Justin

    2003-01-01

    ...., a logical OR, AND, or a majority vote of the classifiers in the system). An established method for evaluating a classifier is measuring some aspect of its Receiver Operating Characteristic (ROC...
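
    The Boolean fusion rules named in this record (logical OR, AND, and majority vote) can be illustrated on the binary decisions of several classifiers. This is a minimal sketch of the rules themselves, not of the matrix algebra representation the report develops; the function names and the sample decisions are hypothetical.

```python
def fuse_or(decisions):
    """System says 'target' if at least one classifier does."""
    return any(decisions)


def fuse_and(decisions):
    """System says 'target' only if every classifier does."""
    return all(decisions)


def fuse_majority(decisions):
    """System says 'target' if more than half of the classifiers do."""
    return sum(bool(d) for d in decisions) > len(decisions) / 2


# Hypothetical decisions of three classifiers on one input.
decisions = [True, False, True]
print(fuse_or(decisions), fuse_and(decisions), fuse_majority(decisions))
```

    OR raises the detection rate at the cost of more false alarms, AND does the opposite, and majority vote sits between them, which is why each rule traces a different operating point on the fused system's ROC.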

  12. Predicting Alzheimer's disease by classifying 3D-Brain MRI images using SVM and other well-defined classifiers

    International Nuclear Information System (INIS)

    Matoug, S; Abdel-Dayem, A; Passi, K; Gross, W; Alqarni, M

    2012-01-01

    Alzheimer's disease (AD) is the most common form of dementia affecting seniors age 65 and over. When AD is suspected, the diagnosis is usually confirmed with behavioural assessments and cognitive tests, often followed by a brain scan. Advanced medical imaging and pattern recognition techniques are good tools to create a learning database in the first step and to predict the class label of incoming data in order to assess the development of the disease, i.e., the conversion from prodromal stages (mild cognitive impairment) to Alzheimer's disease, which is the most critical brain disease for the senior population. Advanced medical imaging such as the volumetric MRI can detect changes in the size of brain regions due to the loss of the brain tissues. Measuring regions that atrophy during the progress of Alzheimer's disease can help neurologists in detecting and staging the disease. In the present investigation, we present a pseudo-automatic scheme that reads volumetric MRI, extracts the middle slices of the brain region, performs segmentation in order to detect the region of brain's ventricle, generates a feature vector that characterizes this region, creates an SQL database that contains the generated data, and finally classifies the images based on the extracted features. For our results, we have used the MRI data sets from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database.

  13. Learning Bayesian network classifiers for credit scoring using Markov Chain Monte Carlo search

    NARCIS (Netherlands)

    Baesens, B.; Egmont-Petersen, M.; Castelo, R.; Vanthienen, J.

    2001-01-01

    In this paper, we will evaluate the power and usefulness of Bayesian network classifiers for credit scoring. Various types of Bayesian network classifiers will be evaluated and contrasted including unrestricted Bayesian network classifiers learnt using Markov Chain Monte Carlo (MCMC) search.

  14. Intuitive Action Set Formation in Learning Classifier Systems with Memory Registers

    NARCIS (Netherlands)

    Simões, L.F.; Schut, M.C.; Haasdijk, E.W.

    2008-01-01

    An important design goal in Learning Classifier Systems (LCS) is to equally reinforce those classifiers which cause the level of reward supplied by the environment. In this paper, we propose a new method for action set formation in LCS. When applied to a Zeroth Level Classifier System with Memory

  15. 3 CFR - Implementation of the Executive Order, “Classified National Security Information”

    Science.gov (United States)

    2010-01-01

    ... 29, 2009 Implementation of the Executive Order, “Classified National Security Information” Memorandum..., “Classified National Security Information” (the “order”), which substantially advances my goals for reforming... or handles classified information shall provide the Director of the Information Security Oversight...

  16. 36 CFR 1256.70 - What controls access to national security-classified information?

    Science.gov (United States)

    2010-07-01

    ... national security-classified information? 1256.70 Section 1256.70 Parks, Forests, and Public Property... HISTORICAL MATERIALS Access to Materials Containing National Security-Classified Information § 1256.70 What controls access to national security-classified information? (a) The declassification of and public access...

  17. Evaluation of classifiers that score linear type traits and body condition score using common sires

    NARCIS (Netherlands)

    Veerkamp, R.F.; Gerritsen, C.L.M.; Koenen, E.P.C.; Hamoen, A.; Jong, de G.

    2002-01-01

    Subjective visual assessment of animals by classifiers is undertaken for several different traits in farm livestock, e.g., linear type traits, body condition score, or carcass conformation. One of the difficulties in assessment is the effect of an individual classifier. To ensure that classifiers

  18. EnsembleGASVR: A novel ensemble method for classifying missense single nucleotide polymorphisms

    KAUST Repository

    Rapakoulia, Trisevgeni

    2014-04-26

    Motivation: Single nucleotide polymorphisms (SNPs) are considered the most frequently occurring DNA sequence variations. Several computational methods have been proposed for the classification of missense SNPs to neutral and disease associated. However, existing computational approaches fail to select relevant features by choosing them arbitrarily without sufficient documentation. Moreover, they are limited to the problem of missing values, imbalance between the learning datasets and most of them do not support their predictions with confidence scores. Results: To overcome these limitations, a novel ensemble computational methodology is proposed. EnsembleGASVR facilitates a two-step algorithm, which in its first step applies a novel evolutionary embedded algorithm to locate close to optimal Support Vector Regression models. In its second step, these models are combined to extract a universal predictor, which is less prone to overfitting issues, systematizes the rebalancing of the learning sets and uses an internal approach for solving the missing values problem without loss of information. Confidence scores support all the predictions and the model becomes tunable by modifying the classification thresholds. An extensive study was performed for collecting the most relevant features for the problem of classifying SNPs, and a superset of 88 features was constructed. Experimental results show that the proposed framework outperforms well-known algorithms in terms of classification performance in the examined datasets. Finally, the proposed algorithmic framework was able to uncover the significant role of certain features such as the solvent accessibility feature, and the top-scored predictions were further validated by linking them with disease phenotypes. © The Author 2014.

  19. Action Recognition Using 3D Histograms of Texture and A Multi-Class Boosting Classifier.

    Science.gov (United States)

    Zhang, Baochang; Yang, Yun; Chen, Chen; Yang, Linlin; Han, Jungong; Shao, Ling

    2017-10-01

    Human action recognition is an important yet challenging task. This paper presents a low-cost descriptor called 3D histograms of texture (3DHoTs) to extract discriminant features from a sequence of depth maps. 3DHoTs are derived from projecting depth frames onto three orthogonal Cartesian planes, i.e., the frontal, side, and top planes, and thus compactly characterize the salient information of a specific action, on which texture features are calculated to represent the action. Besides this fast feature descriptor, a new multi-class boosting classifier (MBC) is also proposed to efficiently exploit different kinds of features in a unified framework for action classification. Compared with the existing boosting frameworks, we add a new multi-class constraint into the objective function, which helps to maintain a better margin distribution by maximizing the mean of margin, whereas still minimizing the variance of margin. Experiments on the MSRAction3D, MSRGesture3D, MSRActivity3D, and UTD-MHAD data sets demonstrate that the proposed system combining 3DHoTs and MBC is superior to the state of the art.

  20. Consensus of sample-balanced classifiers for identifying ligand-binding residue by co-evolutionary physicochemical characteristics of amino acids

    KAUST Repository

    Chen, Peng

    2013-01-01

    Protein-ligand binding is an important mechanism for some proteins to perform their functions, and those binding sites are the residues of proteins that physically bind to ligands. So far, the state-of-the-art methods search for similar, known structures of the query and predict the binding sites based on the solved structures. However, such structural information is not commonly available. In this paper, we propose a sequence-based approach to identify protein-ligand binding residues. Due to the highly imbalanced samples between the ligand-binding sites and non ligand-binding sites, we constructed several balanced data sets, for each of which a random forest (RF)-based classifier was trained. The ensemble of these RF classifiers formed a sequence-based protein-ligand binding site predictor. Experimental results on CASP9 targets demonstrated that our method compared favorably with the state-of-the-art. © Springer-Verlag Berlin Heidelberg 2013.
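
    The balanced-subset ensemble described in this record can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the majority class is undersampled at random (an assumption about how the balanced sets were built), and a simple nearest-centroid learner stands in for the random forest so that the example needs only the standard library.

```python
import random
from collections import Counter


def balanced_subsets(X, y, n_subsets, seed=0):
    """Yield balanced (X, y) subsets: every minority sample plus an
    equally sized random draw from the majority class (assumed scheme)."""
    rng = random.Random(seed)
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    for _ in range(n_subsets):
        idx = minority + rng.sample(majority, len(minority))
        yield [X[i] for i in idx], [y[i] for i in idx]


def fit_centroids(X, y):
    """Stand-in base learner (NOT a random forest): one mean vector per class."""
    model = {}
    for label in set(y):
        rows = [x for x, l in zip(X, y) if l == label]
        model[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return model


def predict_centroids(model, x):
    """Return the label whose centroid is closest to x."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(model, key=lambda label: sqdist(model[label]))


def fit_ensemble(X, y, n_subsets=5, seed=0):
    """Train one base learner per balanced subset."""
    return [fit_centroids(Xs, ys) for Xs, ys in balanced_subsets(X, y, n_subsets, seed)]


def predict_ensemble(models, x):
    """Majority vote over the base learners."""
    votes = Counter(predict_centroids(m, x) for m in models)
    return votes.most_common(1)[0][0]
```

    Swapping `fit_centroids` and `predict_centroids` for a real random forest would bring the sketch closer to the record's description; the sampling and voting steps would stay the same.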

  1. Sequences for Student Investigation

    Science.gov (United States)

    Barton, Jeffrey; Feil, David; Lartigue, David; Mullins, Bernadette

    2004-01-01

    We describe two classes of sequences that give rise to accessible problems for undergraduate research. These problems may be understood with virtually no prerequisites and are well suited for computer-aided investigation. The first sequence is a variation of one introduced by Stephen Wolfram in connection with his study of cellular automata. The…

  2. Drug target ontology to classify and integrate drug discovery data.

    Science.gov (United States)

    Lin, Yu; Mehta, Saurabh; Küçük-McGinty, Hande; Turner, John Paul; Vidovic, Dusica; Forlin, Michele; Koleti, Amar; Nguyen, Dac-Trung; Jensen, Lars Juhl; Guha, Rajarshi; Mathias, Stephen L; Ursu, Oleg; Stathias, Vasileios; Duan, Jianbin; Nabizadeh, Nooshin; Chung, Caty; Mader, Christopher; Visser, Ubbo; Yang, Jeremy J; Bologa, Cristian G; Oprea, Tudor I; Schürer, Stephan C

    2017-11-09

    One of the most successful approaches to develop new small molecule therapeutics has been to start from a validated druggable protein target. However, only a small subset of potentially druggable targets has attracted significant research and development resources. The Illuminating the Druggable Genome (IDG) project develops resources to catalyze the development of likely targetable, yet currently understudied prospective drug targets. A central component of the IDG program is a comprehensive knowledge resource of the druggable genome. As part of that effort, we have developed a framework to integrate, navigate, and analyze drug discovery data based on formalized and standardized classifications and annotations of druggable protein targets, the Drug Target Ontology (DTO). DTO was constructed by extensive curation and consolidation of various resources. DTO classifies the four major drug target protein families, GPCRs, kinases, ion channels and nuclear receptors, based on phylogenecity, function, target development level, disease association, tissue expression, chemical ligand and substrate characteristics, and target-family specific characteristics. The formal ontology was built using a new software tool to auto-generate most axioms from a database while supporting manual knowledge acquisition. A modular, hierarchical implementation facilitates ontology development and maintenance and makes use of various external ontologies, thus integrating the DTO into the ecosystem of biomedical ontologies. As a formal OWL-DL ontology, DTO contains asserted and inferred axioms. Modeling data from the Library of Integrated Network-based Cellular Signatures (LINCS) program illustrates the potential of DTO for contextual data integration and nuanced definition of important drug target characteristics. DTO has been implemented in the IDG user interface Portal, Pharos and the TIN-X explorer of protein target disease relationships.
DTO was built based on the need for a formal semantic

  3. Sequence History Update Tool

    Science.gov (United States)

    Khanampompan, Teerapat; Gladden, Roy; Fisher, Forest; DelGuercio, Chris

    2008-01-01

    The Sequence History Update Tool performs Web-based sequence statistics archiving for Mars Reconnaissance Orbiter (MRO). Using a single UNIX command, the software takes advantage of sequencing conventions to automatically extract the needed statistics from multiple files. This information is then used to populate a PHP database, which is then seamlessly formatted into a dynamic Web page. This tool replaces a previous tedious and error-prone process of manually editing HTML code to construct a Web-based table. Because the tool manages all of the statistics gathering and file delivery to and from multiple data sources spread across multiple servers, there is also a considerable time and effort savings. With the use of The Sequence History Update Tool what previously took minutes is now done in less than 30 seconds, and now provides a more accurate archival record of the sequence commanding for MRO.

  4. Maternal hemodynamics: a method to classify hypertensive disorders of pregnancy.

    Science.gov (United States)

    Ferrazzi, Enrico; Stampalija, Tamara; Monasta, Lorenzo; Di Martino, Daniela; Vonck, Sharona; Gyselaers, Wilfried

    2018-01-01

    The classification of hypertensive disorders of pregnancy is based on the time at the onset of hypertension, proteinuria, and other associated complications. Maternal hemodynamic interrogation in hypertensive disorders of pregnancy considers not only the peripheral blood pressure but also the entire cardiovascular system, and it might help to classify the different clinical phenotypes of this syndrome. This study aimed to examine cardiovascular parameters in a cohort of patients affected by hypertensive disorders of pregnancy according to the clinical phenotypes that prioritize fetoplacental characteristics and not the time at onset of hypertensive disorders of pregnancy. At the fetal-maternal medicine unit of Ziekenhuis Oost-Limburg (Genk, Belgium), maternal cardiovascular parameters were obtained through impedance cardiography using a noninvasive continuous cardiac output monitor with the patients placed in a standing position. The patients were classified as pregnant women with hypertensive disorders of pregnancy who delivered appropriate- and small-for-gestational-age fetuses. Normotensive pregnant women with an appropriate-for-gestational-age fetus at delivery were enrolled as the control group. The possible impact of obesity (body mass index ≥30 kg/m²) on maternal hemodynamics was reassessed in the same groups. Maternal age, parity, body mass index, and blood pressure were not significantly different between the hypertensive disorders of pregnancy/appropriate-for-gestational-age and hypertensive disorders of pregnancy/small-for-gestational-age groups. The mean uterine artery pulsatility index was significantly higher in the hypertensive disorders of pregnancy/small-for-gestational-age group.
The cardiac output and cardiac index were significantly lower in the hypertensive disorders of pregnancy/small-for-gestational-age group (cardiac output 6.5 L/min, cardiac index 3.6) than in the hypertensive disorders of pregnancy/appropriate-for-gestational-age group

  5. Defining reference sequences for Nocardia species by similarity and clustering analyses of 16S rRNA gene sequence data.

    Directory of Open Access Journals (Sweden)

    Manal Helal

    BACKGROUND: The intra- and inter-species genetic diversity of bacteria and the absence of 'reference', or the most representative, sequences of individual species present a significant challenge for sequence-based identification. The aims of this study were to determine the utility, and compare the performance of several clustering and classification algorithms to identify the species of 364 sequences of 16S rRNA gene with a defined species in GenBank, and 110 sequences of 16S rRNA gene with no defined species, all within the genus Nocardia. METHODS: A total of 364 16S rRNA gene sequences of Nocardia species were studied. In addition, 110 16S rRNA gene sequences assigned only to the Nocardia genus level at the time of submission to GenBank were used for machine learning classification experiments. Different clustering algorithms were compared with a novel algorithm or the linear mapping (LM) of the distance matrix. Principal Components Analysis was used for the dimensionality reduction and visualization. RESULTS: The LM algorithm achieved the highest performance and classified the set of 364 16S rRNA sequences into 80 clusters, the majority of which (83.52%) corresponded with the original species. The most representative 16S rRNA sequences for individual Nocardia species have been identified as 'centroids' in respective clusters from which the distances to all other sequences were minimized; 110 16S rRNA gene sequences with identifications recorded only at the genus level were classified using machine learning methods. Simple kNN machine learning demonstrated the highest performance and classified Nocardia species sequences with an accuracy of 92.7% and a mean frequency of 0.578. CONCLUSION: The identification of centroids of 16S rRNA gene sequence clusters using novel distance matrix clustering enables the identification of the most representative sequences for each individual species of Nocardia and allows the quantitation of inter- and intra
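
    The idea of a cluster 'centroid' as the sequence whose distances to all other cluster members are minimized can be illustrated with a short sketch. It assumes a precomputed symmetric distance matrix and existing cluster labels; the function name and the toy inputs are hypothetical, and the record's LM clustering step itself is not reproduced here.

```python
def cluster_medoids(dist, labels):
    """For each cluster, return the index of the member whose summed
    distance to all other members of that cluster is smallest.

    dist   -- square matrix (list of lists), dist[i][j] = distance between items i and j
    labels -- cluster label of each item, same order as the matrix rows
    """
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)

    medoids = {}
    for label, members in clusters.items():
        # dist[i][i] is 0, so including i itself in the sum changes nothing.
        medoids[label] = min(members, key=lambda i: sum(dist[i][j] for j in members))
    return medoids
```

    An unlabelled sequence could then be compared against these representatives, or classified with kNN over the full labelled set as the record reports.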

  6. Efficient DNA barcode regions for classifying Piper species (Piperaceae)

    Directory of Open Access Journals (Sweden)

    Arunrat Chaveerach

    2016-09-01

    Piper species are used for spices, in traditional and processed forms of medicines, in cosmetic compounds, in cultural activities and insecticides. Here barcode analysis was performed for identification of plant parts, young plants and modified forms of plants. Thirty-six Piper species were collected and the three barcode regions, matK, rbcL and psbA-trnH spacer, were amplified, sequenced and aligned to determine their genetic distances. For intraspecific genetic distances, the most effective values for the species identification ranged from no difference to very low distance values. However, P. betle had the highest values at 0.386 for the matK region. This finding may be due to P. betle being an economic and cultivated species, and thus is supported with growth factors, which may have affected its genetic distance. The interspecific genetic distances that were most effective for identification of different species were from the matK region and ranged from a low of 0.002 in 27 paired species to a high of 0.486. Eight species pairs, P. kraense and P. dominantinervium, P. magnibaccum and P. kraense, P. phuwuaense and P. dominantinervium, P. phuwuaense and P. kraense, P. pilobracteatum and P. dominantinervium, P. pilobracteatum and P. kraense, P. pilobracteatum and P. phuwuaense and P. sylvestre and P. polysyphonum, that presented a genetic distance of 0.000 and were identified by independently using each of the other two regions. Concisely, these three barcode regions are powerful for further efficient identification of the 36 Piper species.

  7. Efficient DNA barcode regions for classifying Piper species (Piperaceae).

    Science.gov (United States)

    Chaveerach, Arunrat; Tanee, Tawatchai; Sanubol, Arisa; Monkheang, Pansa; Sudmoon, Runglawan

    2016-01-01

    Piper species are used for spices, in traditional and processed forms of medicines, in cosmetic compounds, in cultural activities and insecticides. Here barcode analysis was performed for identification of plant parts, young plants and modified forms of plants. Thirty-six Piper species were collected and the three barcode regions, matK, rbcL and psbA-trnH spacer, were amplified, sequenced and aligned to determine their genetic distances. For intraspecific genetic distances, the most effective values for the species identification ranged from no difference to very low distance values. However, Piper betle had the highest values at 0.386 for the matK region. This finding may be due to Piper betle being an economic and cultivated species, and thus is supported with growth factors, which may have affected its genetic distance. The interspecific genetic distances that were most effective for identification of different species were from the matK region and ranged from a low of 0.002 in 27 paired species to a high of 0.486. Eight species pairs, Piper kraense and Piper dominantinervium, Piper magnibaccum and Piper kraense, Piper phuwuaense and Piper dominantinervium, Piper phuwuaense and Piper kraense, Piper pilobracteatum and Piper dominantinervium, Piper pilobracteatum and Piper kraense, Piper pilobracteatum and Piper phuwuaense and Piper sylvestre and Piper polysyphonum, that presented a genetic distance of 0.000 and were identified by independently using each of the other two regions. Concisely, these three barcode regions are powerful for further efficient identification of the 36 Piper species.

  8. Efficient DNA barcode regions for classifying Piper species (Piperaceae)

    Science.gov (United States)

    Chaveerach, Arunrat; Tanee, Tawatchai; Sanubol, Arisa; Monkheang, Pansa; Sudmoon, Runglawan

    2016-01-01

    Piper species are used for spices, in traditional and processed forms of medicines, in cosmetic compounds, in cultural activities and insecticides. Here barcode analysis was performed for identification of plant parts, young plants and modified forms of plants. Thirty-six Piper species were collected and the three barcode regions, matK, rbcL and psbA-trnH spacer, were amplified, sequenced and aligned to determine their genetic distances. For intraspecific genetic distances, the most effective values for the species identification ranged from no difference to very low distance values. However, Piper betle had the highest values at 0.386 for the matK region. This finding may be due to Piper betle being an economic and cultivated species, and thus is supported with growth factors, which may have affected its genetic distance. The interspecific genetic distances that were most effective for identification of different species were from the matK region and ranged from a low of 0.002 in 27 paired species to a high of 0.486. Eight species pairs, Piper kraense and Piper dominantinervium, Piper magnibaccum and Piper kraense, Piper phuwuaense and Piper dominantinervium, Piper phuwuaense and Piper kraense, Piper pilobracteatum and Piper dominantinervium, Piper pilobracteatum and Piper kraense, Piper pilobracteatum and Piper phuwuaense and Piper sylvestre and Piper polysyphonum, that presented a genetic distance of 0.000 and were identified by independently using each of the other two regions. Concisely, these three barcode regions are powerful for further efficient identification of the 36 Piper species. PMID:27829794

  9. Multivariate models to classify Tuscan virgin olive oils by zone.

    Directory of Open Access Journals (Sweden)

    Alessandri, Stefano

    1999-10-01

    In order to study and classify Tuscan virgin olive oils, 179 samples were collected. They were obtained from drupes harvested during the first half of November, from three different zones of the Region. The sampling was repeated for 5 years. Fatty acids, phytol, aliphatic and triterpenic alcohols, triterpenic dialcohols, sterols, squalene and tocopherols were analyzed. A subset of variables was considered. They were selected in a preceding work as the most effective and reliable, from the univariate point of view. The analytical data were transformed (except for the cycloartenol) to compensate for annual variations: the mean related to the East zone was subtracted from each value, within each year. Univariate three-class models were calculated and further variables discarded. Then multivariate three-zone models were evaluated, including phytol (that was always selected) and all the combinations of palmitic, palmitoleic and oleic acid, tetracosanol, cycloartenol and squalene. Models including from two to seven variables were studied. The best model shows by-zone classification errors less than 40%, by-zone within-year classification errors that are less than 45% and a global classification error equal to 30%. This model includes phytol, palmitic acid, tetracosanol and cycloartenol.

    To study and classify Tuscan virgin olive oils, 179 samples were used, obtained from fruits harvested during the first half of November in three different zones of the Region. Sampling was repeated for 5 years. Fatty acids, phytol, aliphatic and triterpenic alcohols, triterpenic dialcohols, sterols, squalene and tocopherols were analyzed. A subset of variables was considered, selected in an earlier work as the most effective and reliable from the univariate point of view. The analytical data were transformed (except for cycloartenol) to compensate for annual variations, ...
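
    The within-year transformation mentioned in this record, in which the East-zone mean is subtracted from each value within each year, can be sketched as below. The record layout (dicts with `year`, `zone` and `value` keys), the function name and the numbers used to try it out are assumptions made for illustration only; they are not data from the study.

```python
from collections import defaultdict


def center_on_reference_zone(samples, reference_zone="East"):
    """Subtract, within each year, the mean of the reference zone from every value.

    samples -- list of dicts with keys 'year', 'zone' and 'value' (assumed layout)
    Raises KeyError if a year has no sample from the reference zone.
    """
    reference_values = defaultdict(list)
    for s in samples:
        if s["zone"] == reference_zone:
            reference_values[s["year"]].append(s["value"])

    reference_mean = {year: sum(vals) / len(vals) for year, vals in reference_values.items()}

    # Return new records so the input list is left unchanged.
    return [dict(s, value=s["value"] - reference_mean[s["year"]]) for s in samples]
```

    After this step, values from different harvest years are expressed relative to the same yearly baseline, which is what lets a single model be fitted across years.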

  10. HIV Sequence Compendium 2015

    Energy Technology Data Exchange (ETDEWEB)

    Foley, Brian Thomas [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Leitner, Thomas Kenneth [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Apetrei, Cristian [Univ. of Pittsburgh, PA (United States); Hahn, Beatrice [Univ. of Pennsylvania, Philadelphia, PA (United States); Mizrachi, Ilene [National Center for Biotechnology Information, Bethesda, MD (United States); Mullins, James [Univ. of Washington, Seattle, WA (United States); Rambaut, Andrew [Univ. of Edinburgh, Scotland (United Kingdom); Wolinsky, Steven [Northwestern Univ., Evanston, IL (United States); Korber, Bette Tina Marie [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2015-10-05

    This compendium is an annual printed summary of the data contained in the HIV sequence database. We try to present a judicious selection of the data in such a way that it is of maximum utility to HIV researchers. Each of the alignments attempts to display the genetic variability within the different species, groups and subtypes of the virus. This compendium contains sequences published before January 1, 2015. Hence, though it is published in 2015 and called the 2015 Compendium, its contents correspond to the 2014 curated alignments on our website. The number of sequences in the HIV database is still increasing. In total, at the end of 2014, there were 624,121 sequences in the HIV Sequence Database, an increase of 7% since the previous year. This is the first year that the number of new sequences added to the database has decreased compared to the previous year. The number of near complete genomes (>7000 nucleotides) increased to 5834 by end of 2014. However, as in previous years, the compendium alignments contain only a fraction of these. A more complete version of all alignments is available on our website, http://www.hiv.lanl.gov/content/sequence/NEWALIGN/align.html. As always, we are open to complaints and suggestions for improvement. Inquiries and comments regarding the compendium should be addressed to seq-info@lanl.gov.

  11. Mapping sequences by parts

    Directory of Open Access Journals (Sweden)

    Guziolowski Carito

    2007-09-01

    Background: We present the N-map method, a pairwise and asymmetrical approach which allows us to compare sequences by taking into account evolutionary events that produce shuffled, reversed or repeated elements. Basically, the optimal N-map of a sequence s over a sequence t is the best way of partitioning the first sequence into N parts and placing them, possibly complementary reversed, over the second sequence in order to maximize the sum of their gapless alignment scores. Results: We introduce an algorithm computing an optimal N-map with time complexity O(|s| × |t| × N) using O(|s| × |t| × N) memory space. Among all the numbers of parts taken in a reasonable range, we select the value N for which the optimal N-map has the most significant score. To evaluate this significance, we study the empirical distributions of the scores of optimal N-maps and show that they can be approximated by normal distributions with a reasonable accuracy. We test the functionality of the approach over random sequences on which we apply artificial evolutionary events. Practical Application: The method is illustrated with four case studies of pairs of sequences involving non-standard evolutionary events.
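
    To make the definition of an optimal N-map concrete, the sketch below computes its score by brute-force dynamic programming. It is not the algorithm from the record and does not reach the stated O(|s| × |t| × N) complexity. The scoring scheme (+1 match, -1 mismatch), the requirement that the parts cover all of s, the freedom of parts to overlap on t, and all function names are assumptions made for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string over the alphabet ACGT."""
    return seq.translate(COMPLEMENT)[::-1]


def gapless_score(part, t, offset, match=1, mismatch=-1):
    """Score of laying `part` over t starting at `offset`, without gaps."""
    window = t[offset:offset + len(part)]
    return sum(match if a == b else mismatch for a, b in zip(part, window))


def best_placement(part, t):
    """Best gapless score of `part` over t, trying both strands and all offsets."""
    if len(part) > len(t):
        return float("-inf")  # the part cannot be placed at all
    candidates = (part, reverse_complement(part))
    offsets = range(len(t) - len(part) + 1)
    return max(gapless_score(p, t, o) for p in candidates for o in offsets)


def optimal_nmap_score(s, t, n_parts):
    """Maximum total score over all ways of cutting s into n_parts
    contiguous non-empty parts and placing each part over t."""
    neg_inf = float("-inf")
    # best[k][i] = best score for the first i symbols of s split into k parts
    best = [[neg_inf] * (len(s) + 1) for _ in range(n_parts + 1)]
    best[0][0] = 0
    for k in range(1, n_parts + 1):
        for i in range(k, len(s) + 1):
            for j in range(k - 1, i):
                if best[k - 1][j] == neg_inf:
                    continue
                candidate = best[k - 1][j] + best_placement(s[j:i], t)
                if candidate > best[k][i]:
                    best[k][i] = candidate
    return best[n_parts][len(s)]
```

    On a toy pair where s is t with its two halves swapped, one part cannot align well but two parts recover the full score, which is the kind of shuffling event the method is designed to capture.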

  12. The Colliding Beams Sequencer

    International Nuclear Information System (INIS)

    Johnson, D.E.; Johnson, R.P.

    1989-01-01

    The Colliding Beam Sequencer (CBS) is a computer program used to operate the pbar-p Collider by synchronizing the applications programs and simulating the activities of the accelerator operators during filling and storage. The Sequencer acts as a meta-program, running otherwise stand alone applications programs, to do the set-up, beam transfers, acceleration, low beta turn on, and diagnostics for the transfers and storage. The Sequencer and its operational performance will be described along with its special features which include a periodic scheduler and command logger. 14 refs., 3 figs

  13. Phylogenetic Trees From Sequences

    Science.gov (United States)

    Ryvkin, Paul; Wang, Li-San

    In this chapter, we review important concepts and approaches for phylogeny reconstruction from sequence data. We first cover some basic definitions and properties of phylogenetics, and briefly explain how scientists model sequence evolution and measure sequence divergence. We then discuss three major approaches for phylogenetic reconstruction: distance-based phylogenetic reconstruction, maximum parsimony, and maximum likelihood. In the third part of the chapter, we review how multiple phylogenies are compared by consensus methods and how to assess confidence using bootstrapping. At the end of the chapter are two sections that list popular software packages and additional reading.

  14. PREDICTION OF CHROMATIN STATES USING DNA SEQUENCE PROPERTIES

    KAUST Repository

    Bahabri, Rihab R.

    2013-06-01

    Activities of DNA are to a great extent controlled epigenetically through the internal structure of chromatin. This structure is dynamic and is influenced by different modifications of histone proteins. Various combinations of epigenetic modification of histones pinpoint to different functional regions of the DNA determining the so-called chromatin states. However, the characterization of chromatin states by the DNA sequence properties remains largely unknown. In this study we aim to explore whether DNA sequence patterns in the human genome can characterize different chromatin states. Using DNA sequence motifs we built binary classifiers for each chromatin state to evaluate whether a given genomic sequence is a good candidate for belonging to a particular chromatin state. Of four classification algorithms (C4.5, Naive Bayes, Random Forest, and SVM) used for this purpose, the decision tree based classifiers (C4.5 and Random Forest) yielded best results among those we evaluated. Our results suggest that in general these models lack sufficient predictive power, although for four chromatin states (insulators, heterochromatin, and two types of copy number variation) we found that presence of certain motifs in DNA sequences does imply an increased probability that such a sequence is one of these chromatin states.

  15. Gomphid DNA sequence data

    Data.gov (United States)

    U.S. Environmental Protection Agency — DNA sequence data for several genetic loci. This dataset is not publicly accessible because: It's already publicly available on GenBank. It can be accessed through...

  16. Yeast genome sequencing:

    DEFF Research Database (Denmark)

    Piskur, Jure; Langkjær, Rikke Breinhold

    2004-01-01

    For decades, unicellular yeasts have been general models to help understand the eukaryotic cell and also our own biology. Recently, over a dozen yeast genomes have been sequenced, providing the basis to resolve several complex biological questions. Analysis of the novel sequence data has shown...... of closely related species helps in gene annotation and to answer how many genes there really are within the genomes. Analysis of non-coding regions among closely related species has provided an example of how to determine novel gene regulatory sequences, which were previously difficult to analyse because...... they are short and degenerate and occupy different positions. Comparative genomics helps to understand the origin of yeasts and points out crucial molecular events in yeast evolutionary history, such as whole-genome duplication and horizontal gene transfer(s). In addition, the accumulating sequence data provide...

  17. Pixel Classification of SAR ice images using ANFIS-PSO Classifier

    Directory of Open Access Journals (Sweden)

    G. Vasumathi

    2016-12-01

    Synthetic Aperture Radar (SAR) plays a vital role in acquiring extremely high resolution radar images and is widely used to monitor ice-covered ocean regions. Sea monitoring is important for various purposes, including global climate systems and ship navigation. Classification of the ice-infested area yields important features that are useful for various monitoring processes around the ice regions. The main objective of this paper is to classify SAR ice images in a way that helps identify the regions around the ice-infested areas. Three stages are considered in the classification of SAR ice images. The first is preprocessing, in which the speckled SAR ice images are denoised using various speckle removal filters; these filters are compared to find the best one for speckle removal. The second stage is segmentation, in which different regions are segmented using the K-means and watershed segmentation algorithms; the two algorithms are compared to find the better one for segmenting SAR ice images. The last stage is pixel-based classification, which identifies and classifies the segmented regions using various supervised learning classifiers. The algorithms include Back Propagation Neural Networks (BPN), a Fuzzy Classifier, an Adaptive Neuro Fuzzy Inference System (ANFIS) classifier and the proposed ANFIS with Particle Swarm Optimization (PSO) classifier; all these classifiers are compared to determine which is best suited to classifying SAR ice images. Various evaluation metrics are computed separately at each of the three stages.

  18. Dynamic Sequence Assignment.

    Science.gov (United States)

    1983-12-01

    D-136 548 DYNAMIC SEQUENCE ASSIGNMENT (U) ADVANCED INFORMATION AND DECISION SYSTEMS, MOUNTAIN VIEW, CA. C. A. O'REILLY ET AL. UNCLASSIFIED DEC 83 AI/DS... ADVANCED INFORMATION & DECISION SYSTEMS, Mountain View, CA 94040... Unclassified. SECURITY CLASSIFICATION OF THIS PAGE. REPORT... reviews some important heuristic algorithms developed for faster solution of the sequence assignment problem. 3.1. DYNAMIC PROGRAMMING FORMULATION FOR

  19. HIV Sequence Compendium 2010

    Energy Technology Data Exchange (ETDEWEB)

    Kuiken, Carla [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Foley, Brian [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Leitner, Thomas [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Apetrei, Christian [Univ. of Pittsburgh, PA (United States); Hahn, Beatrice [Univ. of Alabama, Tuscaloosa, AL (United States); Mizrachi, Ilene [National Center for Biotechnology Information, Bethesda, MD (United States); Mullins, James [Univ. of Washington, Seattle, WA (United States); Rambaut, Andrew [Univ. of Edinburgh, Scotland (United Kingdom); Wolinsky, Steven [Northwestern Univ., Evanston, IL (United States); Korber, Bette [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2010-12-31

    This compendium is an annual printed summary of the data contained in the HIV sequence database. In these compendia we try to present a judicious selection of the data in such a way that it is of maximum utility to HIV researchers. Each of the alignments attempts to display the genetic variability within the different species, groups and subtypes of the virus. This compendium contains sequences published before January 1, 2010. Hence, though it is called the 2010 Compendium, its contents correspond to the 2009 curated alignments on our website. The number of sequences in the HIV database is still increasing exponentially. In total, at the time of printing, there were 339,306 sequences in the HIV Sequence Database, an increase of 45% since last year. The number of near complete genomes (>7000 nucleotides) increased to 2576 by end of 2009, reflecting a smaller increase than in previous years. However, as in previous years, the compendium alignments contain only a small fraction of these. Included in the alignments are a small number of sequences representing each of the subtypes and the more prevalent circulating recombinant forms (CRFs) such as 01 and 02, as well as a few outgroup sequences (group O and N and SIV-CPZ). Of the rarer CRFs we included one representative each. A more complete version of all alignments is available on our website, http://www.hiv.lanl.gov/content/sequence/NEWALIGN/align.html. Reprints are available from our website in the form of both HTML and PDF files. As always, we are open to complaints and suggestions for improvement. Inquiries and comments regarding the compendium should be addressed to seq-info@lanl.gov.

  20. General LTE Sequence

    OpenAIRE

    Billal, Masum

    2015-01-01

    In this paper, we characterize sequences that maintain the same property described in the Lifting the Exponent Lemma. The Lifting the Exponent Lemma is a very powerful tool in olympiad number theory and has recently become very popular. We generalize it to all sequences that maintain a similar property, i.e. if $p^{\alpha} \| a_k$ and $p^{\beta} \| n$, then $p^{\alpha+\beta} \| a_{nk}$.
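    As an illustration only, the property can be checked numerically for the classical case a_k = x^k - y^k with an odd prime p dividing x - y but neither x nor y. The choice of sequence, the default values x = 7 and y = 4, and the helper names below are assumptions made for this sketch; they are not taken from the paper.

```python
def p_adic_valuation(p, m):
    """Largest e such that p**e divides the nonzero integer m."""
    m = abs(m)
    if m == 0:
        raise ValueError("valuation of 0 is undefined here")
    e = 0
    while m % p == 0:
        m //= p
        e += 1
    return e


def a(k, x=7, y=4):
    """Hypothetical example sequence a_k = x**k - y**k (x - y = 3)."""
    return x ** k - y ** k


def lte_property_holds(p, k, n, x=7, y=4):
    """Check: p^alpha || a_k and p^beta || n  implies  p^(alpha+beta) || a_{nk}."""
    alpha = p_adic_valuation(p, a(k, x, y))
    beta = p_adic_valuation(p, n)
    if alpha < 1:
        # The property is only claimed when p actually divides a_k.
        raise ValueError("p must divide a_k")
    return p_adic_valuation(p, a(n * k, x, y)) == alpha + beta
```

    For example, `lte_property_holds(3, 1, 9)` compares the exponent of 3 in 7^9 - 4^9 with the sum of the exponents of 3 in 7 - 4 and in 9.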

  1. Pairwise Sequence Alignment Library

    Energy Technology Data Exchange (ETDEWEB)

    2015-05-20

    Vector extensions, such as SSE, have been part of the x86 CPU since the 1990s, with applications in graphics, signal processing, and scientific applications. Although many algorithms and applications can naturally benefit from automatic vectorization techniques, there are still many that are difficult to vectorize due to their dependence on irregular data structures, dense branch operations, or data dependencies. Sequence alignment, one of the most widely used operations in bioinformatics workflows, has a computational footprint that features complex data dependencies. The trend of widening vector registers adversely affects the state-of-the-art sequence alignment algorithm based on striped data layouts. Therefore, a novel SIMD implementation of a parallel scan-based sequence alignment algorithm that can better exploit wider SIMD units was implemented as part of the Parallel Sequence Alignment Library (parasail). Parasail features: Reference implementations of all known vectorized sequence alignment approaches. Implementations of Smith Waterman (SW), semi-global (SG), and Needleman Wunsch (NW) sequence alignment algorithms. Implementations across all modern CPU instruction sets including AVX2 and KNC. Language interfaces for C/C++ and Python.
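    For orientation, the recurrence behind Smith Waterman local alignment can be written as a plain scalar dynamic program. This is a minimal sketch with a linear gap penalty and scoring values chosen for illustration; it is not parasail's vectorized implementation and does not use its API.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score of strings a and b (linear gap penalty).

    The scoring parameters are illustrative assumptions, not library defaults.
    """
    prev = [0] * (len(b) + 1)  # previous row of the DP table
    best = 0
    for ca in a:
        cur = [0]  # column 0 is always 0 for local alignment
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            up = prev[j] + gap       # gap in b
            left = cur[j - 1] + gap  # gap in a
            cur.append(max(0, diag, up, left))
            best = max(best, cur[j])
        prev = cur
    return best
```

    Each cell depends on its left, upper, and diagonal neighbours, which is the data dependency that makes the inner loop hard to vectorize directly.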

  2. Classifier-ensemble incremental-learning procedure for nuclear transient identification at different operational conditions

    Energy Technology Data Exchange (ETDEWEB)

    Baraldi, Piero, E-mail: piero.baraldi@polimi.i [Dipartimento di Energia - Sezione Ingegneria Nucleare, Politecnico di Milano, via Ponzio 34/3, 20133 Milano (Italy); Razavi-Far, Roozbeh [Dipartimento di Energia - Sezione Ingegneria Nucleare, Politecnico di Milano, via Ponzio 34/3, 20133 Milano (Italy); Zio, Enrico [Dipartimento di Energia - Sezione Ingegneria Nucleare, Politecnico di Milano, via Ponzio 34/3, 20133 Milano (Italy); Ecole Centrale Paris-Supelec, Paris (France)

    2011-04-15

    An important requirement for the practical implementation of empirical diagnostic systems is the capability of classifying transients in all plant operational conditions. The present paper proposes an approach based on an ensemble of classifiers for incrementally learning transients under different operational conditions. New classifiers are added to the ensemble where transients occurring in new operational conditions are not satisfactorily classified. The construction of the ensemble is made by bagging; the base classifier is a supervised Fuzzy C Means (FCM) classifier whose outcomes are combined by majority voting. The incremental learning procedure is applied to the identification of simulated transients in the feedwater system of a Boiling Water Reactor (BWR) under different reactor power levels.
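    The bagging and majority-voting steps can be sketched as follows. A nearest-centroid rule stands in for the supervised Fuzzy C Means base classifier, which is an assumption made to keep the example short; the function names and the default of 11 ensemble members are likewise hypothetical.

```python
import random
from collections import Counter


def train_centroids(samples):
    """samples: list of (feature_list, label). Returns {label: centroid}."""
    sums, counts = {}, Counter()
    for x, label in samples:
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] += 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def predict_centroid(model, x):
    """Label of the centroid closest to x (squared Euclidean distance)."""
    return min(model, key=lambda lab: sum((c - v) ** 2 for c, v in zip(model[lab], x)))


def bagged_ensemble(samples, n_models=11, seed=0):
    """Train each member on a bootstrap resample of the training set."""
    rng = random.Random(seed)
    return [
        train_centroids([rng.choice(samples) for _ in samples])
        for _ in range(n_models)
    ]


def majority_vote(ensemble, x):
    """Combine the members' predictions by simple majority voting."""
    votes = Counter(predict_centroid(m, x) for m in ensemble)
    return votes.most_common(1)[0][0]
```

    In the incremental setting described above, new members trained on transients from a new operational condition would be appended to the list returned by `bagged_ensemble`.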

  3. Statistical and Machine-Learning Classifier Framework to Improve Pulse Shape Discrimination System Design

    Energy Technology Data Exchange (ETDEWEB)

    Wurtz, R. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Kaplan, A. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2015-10-28

    Pulse shape discrimination (PSD) is a variety of statistical classifier. Fully-realized statistical classifiers rely on a comprehensive set of tools for designing, building, and implementing. PSD advances rely on improvements to the implemented algorithm. PSD advances can be improved by using conventional statistical classifier or machine learning methods. This paper provides the reader with a glossary of classifier-building elements and their functions in a fully-designed and operational classifier framework that can be used to discover opportunities for improving PSD classifier projects. This paper recommends reporting the PSD classifier’s receiver operating characteristic (ROC) curve and its behavior at a gamma rejection rate (GRR) relevant for realistic applications.

  4. An Active Learning Classifier for Further Reducing Diabetic Retinopathy Screening System Cost

    Directory of Open Access Journals (Sweden)

    Yinan Zhang

    2016-01-01

    Diabetic retinopathy (DR) screening systems raise a financial problem. To further reduce DR screening cost, an active learning classifier is proposed in this paper. Our approach identifies retinal images based on features extracted by anatomical part recognition and lesion detection algorithms. Kernel extreme learning machine (KELM) is a rapid classifier for solving classification problems in high-dimensional space. Both active learning and the ensemble technique improve the performance of KELM when a small training dataset is used. To save cost, the committee refers only the necessary manual work to the doctor. On the publicly available Messidor database, our classifier is trained with 20%–35% of labeled retinal images and the comparative classifiers are trained with 80% of labeled retinal images. Results show that our classifier can achieve better classification accuracy than Classification and Regression Tree, radial basis function SVM, Multilayer Perceptron SVM, Linear SVM, and K Nearest Neighbor. Empirical experiments suggest that our active learning classifier is efficient for further reducing DR screening cost.

  5. Classifier-ensemble incremental-learning procedure for nuclear transient identification at different operational conditions

    International Nuclear Information System (INIS)

    Baraldi, Piero; Razavi-Far, Roozbeh; Zio, Enrico

    2011-01-01

    An important requirement for the practical implementation of empirical diagnostic systems is the capability of classifying transients in all plant operational conditions. The present paper proposes an approach based on an ensemble of classifiers for incrementally learning transients under different operational conditions. New classifiers are added to the ensemble where transients occurring in new operational conditions are not satisfactorily classified. The construction of the ensemble is made by bagging; the base classifier is a supervised Fuzzy C Means (FCM) classifier whose outcomes are combined by majority voting. The incremental learning procedure is applied to the identification of simulated transients in the feedwater system of a Boiling Water Reactor (BWR) under different reactor power levels.

  6. Using hierarchical clustering of secreted protein families to classify and rank candidate effectors of rust fungi.

    Directory of Open Access Journals (Sweden)

    Diane G O Saunders

    Rust fungi are obligate biotrophic pathogens that cause considerable damage on crop plants. Puccinia graminis f. sp. tritici, the causal agent of wheat stem rust, and Melampsora larici-populina, the poplar leaf rust pathogen, have strong deleterious impacts on wheat and poplar wood production, respectively. Filamentous pathogens such as rust fungi secrete molecules called disease effectors that act as modulators of host cell physiology and can suppress or trigger host immunity. Current knowledge on effectors from other filamentous plant pathogens can be exploited for the characterisation of effectors in the genome of recently sequenced rust fungi. We designed a comprehensive in silico analysis pipeline to identify the putative effector repertoire from the genome of two plant pathogenic rust fungi. The pipeline is based on the observation that known effector proteins from filamentous pathogens have at least one of the following properties: (i) contain a secretion signal, (ii) are encoded by in planta induced genes, (iii) have similarity to haustorial proteins, (iv) are small and cysteine rich, (v) contain a known effector motif or a nuclear localization signal, (vi) are encoded by genes with long intergenic regions, (vii) contain internal repeats, and (viii) do not contain PFAM domains, except those associated with pathogenicity. We used Markov clustering and hierarchical clustering to classify protein families of rust pathogens and rank them according to their likelihood of being effectors. Using this approach, we identified eight families of candidate effectors that we consider of high value for functional characterization. This study revealed a diverse set of candidate effectors, including families of haustorial expressed secreted proteins and small cysteine-rich proteins. This comprehensive classification of candidate effectors from these devastating rust pathogens is an initial step towards probing plant germplasm for novel resistance components.

  7. A comparison of graph- and kernel-based -omics data integration algorithms for classifying complex traits.

    Science.gov (United States)

    Yan, Kang K; Zhao, Hongyu; Pang, Herbert

    2017-12-06

    High-throughput sequencing data are widely collected and analyzed in the study of complex diseases in the quest to improve human health. Well-studied algorithms mostly deal with a single data source and cannot fully utilize the potential of these multi-omics data sources. In order to provide a holistic understanding of human health and diseases, it is necessary to integrate multiple data sources. Several algorithms have been proposed so far; however, a comprehensive comparison of data integration algorithms for classification of binary traits is currently lacking. In this paper, we focus on two common classes of integration algorithms: graph-based algorithms, which depict relationships with subjects denoted by nodes and relationships denoted by edges, and kernel-based algorithms, which can generate a classifier in feature space. Our paper provides a comprehensive comparison of their performance in terms of various measurements of classification accuracy and computation time. Seven different integration algorithms, including graph-based semi-supervised learning, graph sharpening integration, composite association network, Bayesian network, semi-definite programming-support vector machine (SDP-SVM), relevance vector machine (RVM) and Ada-boost relevance vector machine, are compared and evaluated with hypertension and two cancer data sets in our study. In general, kernel-based algorithms create more complex models and require longer computation time, but they tend to perform better than graph-based algorithms. Graph-based algorithms have the advantage of being computationally faster. The empirical results demonstrate that composite association network, relevance vector machine, and Ada-boost RVM are the better performers. We provide recommendations on how to choose an appropriate algorithm for integrating data from multiple sources.

  8. Adaptive Processing for Sequence Alignment

    KAUST Repository

    Zidan, Mohammed A.; Bonny, Talal; Salama, Khaled N.

    2012-01-01

    Disclosed are various embodiments for adaptive processing for sequence alignment. In one embodiment, among others, a method includes obtaining a query sequence and a plurality of database sequences. A first portion of the plurality of database sequences is distributed to a central processing unit (CPU) and a second portion of the plurality of database sequences is distributed to a graphical processing unit (GPU) based upon a predetermined splitting ratio associated with the plurality of database sequences, where the database sequences of the first portion are shorter than the database sequences of the second portion. A first alignment score for the query sequence is determined with the CPU based upon the first portion of the plurality of database sequences and a second alignment score for the query sequence is determined with the GPU based upon the second portion of the plurality of database sequences.
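    The distribution step can be sketched as a length-ordered split. Here the splitting ratio is interpreted as the fraction of database sequences sent to the CPU, which is an assumption of this sketch; the function name is hypothetical, and the actual alignment scoring on each device is omitted.

```python
def split_database(db_seqs, cpu_fraction):
    """Split database sequences into a CPU portion and a GPU portion.

    The shortest sequences go to the CPU and the longest to the GPU, so every
    CPU sequence is no longer than any GPU sequence. `cpu_fraction` is an
    assumed interpretation of the predetermined splitting ratio.
    """
    if not 0.0 <= cpu_fraction <= 1.0:
        raise ValueError("cpu_fraction must be between 0 and 1")
    ordered = sorted(db_seqs, key=len)
    cut = int(len(ordered) * cpu_fraction)
    return ordered[:cut], ordered[cut:]
```

    The query sequence would then be scored against each portion on its own device, yielding the first and second alignment scores mentioned above.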

  9. Adaptive Processing for Sequence Alignment

    KAUST Repository

    Zidan, Mohammed A.

    2012-01-26

    Disclosed are various embodiments for adaptive processing for sequence alignment. In one embodiment, among others, a method includes obtaining a query sequence and a plurality of database sequences. A first portion of the plurality of database sequences is distributed to a central processing unit (CPU) and a second portion of the plurality of database sequences is distributed to a graphical processing unit (GPU) based upon a predetermined splitting ratio associated with the plurality of database sequences, where the database sequences of the first portion are shorter than the database sequences of the second portion. A first alignment score for the query sequence is determined with the CPU based upon the first portion of the plurality of database sequences and a second alignment score for the query sequence is determined with the GPU based upon the second portion of the plurality of database sequences.

  10. Lung Nodule Image Classification Based on Local Difference Pattern and Combined Classifier.

    Science.gov (United States)

    Mao, Keming; Deng, Zhuofu

    2016-01-01

    This paper proposes a novel lung nodule classification method for low-dose CT images. The method includes two stages. First, Local Difference Pattern (LDP) is proposed to encode the feature representation, which is extracted by comparing intensity difference along circular regions centered at the lung nodule. Then, the single-center classifier is trained based on LDP. Due to the diversity of feature distribution for different class, the training images are further clustered into multiple cores and the multicenter classifier is constructed. The two classifiers are combined to make the final decision. Experimental results on public dataset show the superior performance of LDP and the combined classifier.

  11. Lung Nodule Image Classification Based on Local Difference Pattern and Combined Classifier

    Directory of Open Access Journals (Sweden)

    Keming Mao

    2016-01-01

    This paper proposes a novel lung nodule classification method for low-dose CT images. The method includes two stages. First, Local Difference Pattern (LDP) is proposed to encode the feature representation, which is extracted by comparing intensity difference along circular regions centered at the lung nodule. Then, the single-center classifier is trained based on LDP. Due to the diversity of feature distribution for different class, the training images are further clustered into multiple cores and the multicenter classifier is constructed. The two classifiers are combined to make the final decision. Experimental results on public dataset show the superior performance of LDP and the combined classifier.

  12. Reducing variability in the output of pattern classifiers using histogram shaping

    International Nuclear Information System (INIS)

    Gupta, Shalini; Kan, Chih-Wen; Markey, Mia K.

    2010-01-01

    Purpose: The authors present a novel technique based on histogram shaping to reduce the variability in the output and (sensitivity, specificity) pairs of pattern classifiers with identical ROC curves, but differently distributed outputs. Methods: The authors identify different sources of variability in the output of linear pattern classifiers with identical ROC curves, which also result in classifiers with differently distributed outputs. They theoretically develop a novel technique based on the matching of the histograms of these differently distributed pattern classifier outputs to reduce the variability in their (sensitivity, specificity) pairs at fixed decision thresholds, and to reduce the variability in their actual output values. They empirically demonstrate the efficacy of the proposed technique by means of analyses on the simulated data and real world mammography data. Results: For the simulated data, with three different known sources of variability, and for the real world mammography data with unknown sources of variability, the proposed classifier output calibration technique significantly reduced the variability in the classifiers' (sensitivity, specificity) pairs at fixed decision thresholds. Furthermore, for classifiers with monotonically or approximately monotonically related output variables, the histogram shaping technique also significantly reduced the variability in their actual output values. Conclusions: Classifier output calibration based on histogram shaping can be successfully employed to reduce the variability in the output values and (sensitivity, specificity) pairs of pattern classifiers with identical ROC curves, but differently distributed outputs.
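    One generic way to match the histogram of one classifier's outputs to another's is empirical quantile mapping, sketched below. This is a standard histogram-matching construction offered as an illustration; it is not claimed to be the authors' exact procedure, and the function name is hypothetical.

```python
import bisect


def match_histogram(scores, reference):
    """Map each score to the reference value at the same empirical quantile.

    The mapping is monotone non-decreasing, so the ranking of the scores
    (and therefore the ROC curve) is unchanged; only the output
    distribution is reshaped toward that of `reference`.
    """
    if not scores or not reference:
        raise ValueError("both score lists must be non-empty")
    src = sorted(scores)
    ref = sorted(reference)
    n, m = len(src), len(ref)
    matched = []
    for s in scores:
        rank = bisect.bisect_right(src, s)   # 1..n, count of scores <= s
        idx = (rank * m + n - 1) // n - 1    # ceil(rank * m / n) - 1, in 0..m-1
        matched.append(ref[idx])
    return matched
```

    After such a mapping, a fixed decision threshold cuts both classifiers' outputs at comparable quantiles, which is the intuition behind reducing the spread of (sensitivity, specificity) pairs.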

  13. Detection and Classification of Transformer Winding Mechanical Faults Using UWB Sensors and Bayesian Classifier

    Science.gov (United States)

    Alehosseini, Ali; A. Hejazi, Maryam; Mokhtari, Ghassem; B. Gharehpetian, Gevork; Mohammadi, Mohammad

    2015-06-01

    In this paper, the Bayesian classifier is used to detect and classify the radial deformation and axial displacement of transformer windings. The proposed method is tested on a model of transformer for different volumes of radial deformation and axial displacement. In this method, ultra-wideband (UWB) signal is sent to the simplified model of the transformer winding. The received signal from the winding model is recorded and used for training and testing of Bayesian classifier in different axial displacement and radial deformation states of the winding. It is shown that the proposed method has a good accuracy to detect and classify the axial displacement and radial deformation of the winding.

  14. Application of SVM classifier in thermographic image classification for early detection of breast cancer

    Science.gov (United States)

    Oleszkiewicz, Witold; Cichosz, Paweł; Jagodziński, Dariusz; Matysiewicz, Mateusz; Neumann, Łukasz; Nowak, Robert M.; Okuniewski, Rafał

    2016-09-01

    This article presents the application of machine learning algorithms for early detection of breast cancer on the basis of thermographic images. A supervised learning model, the support vector machine (SVM), was implemented together with the Sequential Minimal Optimization (SMO) algorithm for training the SVM classifier. The SVM classifier was included in a client-server application that makes it possible to create a training set of examinations and to apply classifiers (including SVM) for the diagnosis and early detection of breast cancer. The sensitivity and specificity of the SVM classifier were calculated based on the thermographic images from studies. Furthermore, a heuristic method for tuning the SVM's parameters was proposed.

  15. A degenerate primer MOB typing (DPMT) method to classify gamma-proteobacterial plasmids in clinical and environmental settings.

    Directory of Open Access Journals (Sweden)

    Andrés Alvarado

    Transmissible plasmids are responsible for the spread of genetic determinants, such as antibiotic resistance or virulence traits, causing a large ecological and epidemiological impact. Transmissible plasmids, either conjugative or mobilizable, have in common the presence of a relaxase gene. Relaxases were previously classified in six protein families according to their phylogeny. Degenerate primers hybridizing to coding sequences of conserved amino acid motifs were designed to amplify related relaxase genes from γ-Proteobacterial plasmids. Specificity and sensitivity of a selected set of 19 primer pairs were first tested using a collection of 33 reference relaxases, representing the diversity of γ-Proteobacterial plasmids. The validated set was then applied to the analysis of two plasmid collections obtained from clinical isolates. The relaxase screening method, which we call "Degenerate Primer MOB Typing" or DPMT, detected not only most known Inc/Rep groups, but also a plethora of plasmids not previously assigned to any Inc group or Rep-type.

  16. Main sequence mass loss

    International Nuclear Information System (INIS)

    Brunish, W.M.; Guzik, J.A.; Willson, L.A.; Bowen, G.

    1987-01-01

    It has been hypothesized that variable stars may experience mass loss, driven, at least in part, by oscillations. The class of stars we are discussing here is the δ Scuti variables. These are variable stars with masses between about 1.2 and 2.25 M_⊙, lying on or very near the main sequence. According to this theory, high rotation rates enhance the rate of mass loss, so main sequence stars born in this mass range would have a range of mass loss rates, depending on their initial rotation velocity and the amplitude of the oscillations. The stars would evolve rapidly down the main sequence until (at about 1.25 M_⊙) a surface convection zone began to form. The presence of this convective region would slow the rotation, perhaps allowing magnetic braking to occur, and thus sharply reduce the mass loss rate. 7 refs

  17. Applications of High-Throughput Nucleotide Sequencing (PhD)

    DEFF Research Database (Denmark)

    Waage, Johannes

    equally large demands in data handling, analysis and interpretation, perhaps defining the modern challenge of the computational biologist of the post-genomic era. The first part of this thesis consists of a general introduction to the history, common terms and challenges of next generation sequencing......-sequencing, a study of the effects on alternative RNA splicing of KO of the nonsense mediated RNA decay system in Mus, using digital gene expression and a custom-built exon-exon junction mapping pipeline is presented (article I). Evolved from this work, a Bioconductor package, spliceR, for classifying alternative...

  18. deFUME: Dynamic exploration of functional metagenomic sequencing data

    DEFF Research Database (Denmark)

    van der Helm, Eric; Geertz-Hansen, Henrik Marcus; Genee, Hans Jasper

    2015-01-01

    is time consuming and constitutes a major bottleneck for experimental researchers in the field. Here we present the deFUME web server, an easy-to-use web-based interface for processing, annotation and visualization of functional metagenomics sequencing data, tailored to meet the requirements of non......-bioinformaticians. The web-server integrates multiple analysis steps into one single workflow: read assembly, open reading frame prediction, and annotation with BLAST, InterPro and GO classifiers. Analysis results are visualized in an online dynamic web-interface. The deFUME webserver provides a fast track from raw sequence...

  19. Predicting tissue-specific expressions based on sequence characteristics

    KAUST Repository

    Paik, Hyojung; Ryu, Tae Woo; Heo, Hyoungsam; Seo, Seungwon; Lee, Doheon; Hur, Cheolgoo

    2011-01-01

    In multicellular organisms, including humans, understanding expression specificity at the tissue level is essential for interpreting protein function, such as tissue differentiation. We developed a prediction approach via generated sequence features from overrepresented patterns in housekeeping (HK) and tissue-specific (TS) genes to classify TS expression in humans. Using TS domains and transcriptional factor binding sites (TFBSs), sequence characteristics were used as indices of expressed tissues in a Random Forest algorithm by scoring exclusive patterns considering the biological intuition; TFBSs regulate gene expression, and the domains reflect the functional specificity of a TS gene. Our proposed approach displayed better performance than previous attempts and was validated using computational and experimental methods.

  20. Predicting tissue-specific expressions based on sequence characteristics

    KAUST Repository

    Paik, Hyojung

    2011-04-30

    In multicellular organisms, including humans, understanding expression specificity at the tissue level is essential for interpreting protein function, such as tissue differentiation. We developed a prediction approach via generated sequence features from overrepresented patterns in housekeeping (HK) and tissue-specific (TS) genes to classify TS expression in humans. Using TS domains and transcriptional factor binding sites (TFBSs), sequence characteristics were used as indices of expressed tissues in a Random Forest algorithm by scoring exclusive patterns considering the biological intuition; TFBSs regulate gene expression, and the domains reflect the functional specificity of a TS gene. Our proposed approach displayed better performance than previous attempts and was validated using computational and experimental methods.

  1. CREST--classification resources for environmental sequence tags.

    Directory of Open Access Journals (Sweden)

    Anders Lanzén

    Sequencing of taxonomic or phylogenetic markers is becoming a fast and efficient method for studying environmental microbial communities. This has resulted in a steadily growing collection of marker sequences, most notably of the small-subunit (SSU) ribosomal RNA gene, and an increased understanding of microbial phylogeny, diversity and community composition patterns. However, to utilize these large datasets together with new sequencing technologies, a reliable and flexible system for taxonomic classification is critical. We developed CREST (Classification Resources for Environmental Sequence Tags), a set of resources and tools for generating and utilizing custom taxonomies and reference datasets for classification of environmental sequences. CREST uses an alignment-based classification method with the lowest common ancestor algorithm. It also uses explicit rank similarity criteria to reduce false positives and identify novel taxa. We implemented this method in a web server, a command line tool and the graphical user interfaced program MEGAN. Further, we provide the SSU rRNA reference database and taxonomy SilvaMod, derived from the publicly available SILVA SSURef, for classification of sequences from bacteria, archaea and eukaryotes. Using cross-validation and environmental datasets, we compared the performance of CREST and SilvaMod to the RDP Classifier. We also utilized Greengenes as a reference database, both with CREST and the RDP Classifier. These analyses indicate that CREST performs better than alignment-free methods with higher recall rate (sensitivity) as well as precision, and with the ability to accurately identify most sequences from novel taxa. Classification using SilvaMod performed better than with Greengenes, particularly when applied to environmental sequences. CREST is freely available under a GNU General Public License (v3) from http://apps.cbu.uib.no/crest and http://lcaclassifier.googlecode.com.
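    The lowest common ancestor step can be illustrated with taxonomic lineages written as root-to-leaf lists. This is a minimal sketch of the general idea only; it is not CREST's code, it omits the rank similarity criteria, and the lineage representation is an assumption.

```python
def lowest_common_ancestor(lineages):
    """Longest shared prefix of several root-to-leaf taxonomic lineages.

    `lineages` is a list of lists such as ["Bacteria", "Proteobacteria", ...],
    one per reference sequence that the query aligned to well.
    """
    shared = []
    for ranks in zip(*lineages):
        if len(set(ranks)) != 1:
            break  # the hits disagree at this rank, so stop here
        shared.append(ranks[0])
    return shared
```

    A query whose best hits disagree below phylum level is thus assigned to the phylum rather than forced into one of the competing lower taxa.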

  2. Electricity sequence control

    International Nuclear Information System (INIS)

    Shin, Heung Ryeol

    2010-03-01

    The book covers: an introduction to control systems, including their classification and control signals; electrical switches, such as push-buttons and inductive and capacitive detection sensors; control machinery such as solenoid valves; ways of expressing sequences and types of electrical circuits using diagrams, time charts, markings and terms; logic circuits such as YES, NO, AND, OR and equivalence logic; basic electrical circuits; electrical sequence control; added conditions; special program control, covering selection and jumps within a program; motor control; additional circuits such as repeat circuits and pause circuits in a conveyor; and safety regulations and rules, covering the classification of electrical accidents and protective devices for insulation.

  3. Next-generation sequencing

    DEFF Research Database (Denmark)

    Rieneck, Klaus; Bak, Mads; Jønson, Lars

    2013-01-01

    , Illumina); several millions of PCR sequences were analyzed. RESULTS: The results demonstrated the feasibility of diagnosing the fetal KEL1 or KEL2 blood group from cell-free DNA purified from maternal plasma. CONCLUSION: This method requires only one primer pair, and the large amount of sequence...... information obtained allows well for statistical analysis of the data. This general approach can be integrated into current laboratory practice and has numerous applications. Besides DNA-based predictions of blood group phenotypes, platelet phenotypes, or sickle cell anemia, and the determination of zygosity...

  4. Boosted classification trees result in minor to modest improvement in the accuracy in classifying cardiovascular outcomes compared to conventional classification trees

    Science.gov (United States)

    Austin, Peter C; Lee, Douglas S

    2011-01-01

    Purpose: Classification trees are increasingly being used to classify patients according to the presence or absence of a disease or health outcome. A limitation of classification trees is their limited predictive accuracy. In the data-mining and machine learning literature, boosting has been developed to improve classification. Boosting with classification trees iteratively grows classification trees in a sequence of reweighted datasets. In a given iteration, subjects that were misclassified in the previous iteration are weighted more highly than subjects that were correctly classified. Classifications from each of the classification trees in the sequence are combined through a weighted majority vote to produce a final classification. The authors' objective was to examine whether boosting improved the accuracy of classification trees for predicting outcomes in cardiovascular patients. Methods: We examined the utility of boosting classification trees for classifying 30-day mortality outcomes in patients hospitalized with either acute myocardial infarction or congestive heart failure. Results: Improvements in the misclassification rate using boosted classification trees were at best minor compared to when conventional classification trees were used. Minor to modest improvements to sensitivity were observed, with only a negligible reduction in specificity. For predicting cardiovascular mortality, boosted classification trees had high specificity, but low sensitivity. Conclusions: Gains in predictive accuracy for predicting cardiovascular outcomes were less impressive than gains in performance observed in the data mining literature. PMID:22254181
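    The reweighting and weighted majority vote can be sketched with AdaBoost over one-split trees (decision stumps). The choice of AdaBoost, the use of stumps instead of full classification trees, and all function names are assumptions made for brevity; the study's own software and settings are not reproduced here.

```python
import math


def fit_stump(X, y, w):
    """Best weighted single-feature threshold rule; labels are +1 or -1."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for sign in (1, -1):
                pred = [sign if row[j] <= t else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    return best


def stump_predict(stump, row):
    _, j, t, sign = stump
    return sign if row[j] <= t else -sign


def adaboost(X, y, rounds=10):
    """Grow stumps on successively reweighted data; return (alpha, stump) pairs."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)     # weight of this stump's vote
        model.append((alpha, stump))
        # Misclassified subjects are weighted up, correctly classified ones down.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, yi, row in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def predict(model, row):
    """Weighted majority vote over all stumps in the sequence."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in model)
    return 1 if score >= 0 else -1
```

    On a toy one-dimensional problem whose positive class lies on both sides of the negative class, no single stump is correct everywhere, but a few boosted stumps can be.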

  5. 48 CFR 8.608 - Protection of classified and sensitive information.

    Science.gov (United States)

    2010-10-01

    ... Prison Industries, Inc. 8.608 Protection of classified and sensitive information. Agencies shall not enter into any contract with FPI that allows an inmate worker access to any— (a) Classified data; (b) Geographic data regarding the location of— (1) Surface and subsurface infrastructure providing communications...

  6. 40 CFR 260.32 - Variances to be classified as a boiler.

    Science.gov (United States)

    2010-07-01

    ... 40 Protection of Environment 25 2010-07-01 2010-07-01 false Variances to be classified as a boiler... be classified as a boiler. In accordance with the standards and criteria in § 260.10 (definition of “boiler”), and the procedures in § 260.33, the Administrator may determine on a case-by-case basis that...

  7. 46 CFR 108.187 - Ventilation for brush type electric motors in classified spaces.

    Science.gov (United States)

    2010-10-01

    ... 46 Shipping 4 2010-10-01 2010-10-01 false Ventilation for brush type electric motors in classified... Ventilation for brush type electric motors in classified spaces. Ventilation for brush type electric motors in... Electrical Equipment in Hazardous Locations”, except audible and visual alarms may be used if shutting down...

  8. Oblique decision trees using embedded support vector machines in classifier ensembles

    NARCIS (Netherlands)

    Menkovski, V.; Christou, I.; Efremidis, S.

    2008-01-01

    Classifier ensembles have emerged in recent years as a promising research area for boosting pattern recognition systems' performance. We present a new base classifier that utilizes oblique decision tree technology based on support vector machines for the construction of oblique (non-axis parallel)

  9. Adaptation in P300 brain-computer interfaces: A two-classifier cotraining approach

    DEFF Research Database (Denmark)

    Panicker, Rajesh C.; Sun, Ying; Puthusserypady, Sadasivan

    2010-01-01

    A cotraining-based approach is introduced for constructing high-performance classifiers for P300-based brain-computer interfaces (BCIs), which were trained from very little data. It uses two classifiers: Fisher's linear discriminant analysis and Bayesian linear discriminant analysis progressively...

  10. Analysis and minimization of overtraining effect in rule-based classifiers for computer-aided diagnosis

    International Nuclear Information System (INIS)

    Li Qiang; Doi Kunio

    2006-01-01

    Computer-aided diagnostic (CAD) schemes have been developed to assist radiologists in detecting various lesions in medical images. In CAD schemes, classifiers play a key role in achieving a high lesion detection rate and a low false-positive rate. Although many popular classifiers such as linear discriminant analysis and artificial neural networks have been employed in CAD schemes for reduction of false positives, a rule-based classifier has probably been the simplest and most frequently used one since the early days of development of various CAD schemes. However, existing rule-based classifiers have major disadvantages that significantly reduce their practicality and credibility. The disadvantages include manual design, poor reproducibility, poor evaluation methods such as resubstitution, and a large overtraining effect. An automated rule-based classifier with a minimized overtraining effect can overcome or significantly reduce the extent of the above-mentioned disadvantages. In this study, we developed an 'optimal' method for the selection of cutoff thresholds and a fully automated rule-based classifier. Experiments performed with Monte Carlo simulation and a real lung nodule CT data set demonstrated that the automated threshold selection method can completely eliminate the overtraining effect in the procedure of cutoff threshold selection, and thus can minimize the overall overtraining effect in the constructed rule-based classifier. We believe that this threshold selection method is very useful in the construction of automated rule-based classifiers with minimized overtraining effect.

  11. Automating the construction of scene classifiers for content-based video retrieval

    NARCIS (Netherlands)

    Khan, L.; Israël, Menno; Petrushin, V.A.; van den Broek, Egon; van der Putten, Peter

    2004-01-01

    This paper introduces a real time automatic scene classifier within content-based video retrieval. In our envisioned approach end users like documentalists, not image processing experts, build classifiers interactively, by simply indicating positive examples of a scene. Classification consists of a

  12. Identification of flooded area from satellite images using Hybrid Kohonen Fuzzy C-Means sigma classifier

    Directory of Open Access Journals (Sweden)

    Krishna Kant Singh

    2017-06-01

    A novel neuro-fuzzy classifier, Hybrid Kohonen Fuzzy C-Means-σ (HKFCM-σ), is proposed in this paper. The proposed classifier is a hybridization of the Kohonen Clustering Network (KCN) with the FCM-σ clustering algorithm. The network architecture of HKFCM-σ is similar to a simple KCN network, having only two layers, i.e., an input and an output layer. However, the winner neuron is selected based on the FCM-σ algorithm, thus embedding the features of both a neural network and a fuzzy clustering algorithm in the classifier. This hybridization results in a more efficient, less complex and faster classifier for classifying satellite images. HKFCM-σ is used to identify the flooding that occurred in the Kashmir area in September 2014. The HKFCM-σ classifier is applied to pre- and post-flooding Landsat 8 OLI images of Kashmir to detect the areas that were flooded due to the heavy rainfall of September 2014. The classifier is trained using the mean values of various spectral indices such as NDVI, NDWI and NDBI, and the first component of Principal Component Analysis. The error matrix was computed to test the performance of the method. The method yields high producer's accuracy, consumer's accuracy and kappa coefficient values, indicating that the proposed classifier is highly effective and efficient.
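    The spectral indices named in this abstract are all normalized band differences. The sketch below shows how they could be computed for a single pixel. The abstract does not give the formulas, so the commonly used definitions are assumed here: NDVI from NIR and red, NDWI from green and NIR, NDBI from SWIR and NIR. The function names are hypothetical.

```python
def normalized_difference(a, b):
    """(a - b) / (a + b), returning 0.0 when both reflectances are zero."""
    denom = a + b
    return 0.0 if denom == 0 else (a - b) / denom

def spectral_indices(green, red, nir, swir):
    """Per-pixel indices from surface reflectances (assumed definitions)."""
    return {
        "NDVI": normalized_difference(nir, red),    # vegetation
        "NDWI": normalized_difference(green, nir),  # open water
        "NDBI": normalized_difference(swir, nir),   # built-up area
    }

# Hypothetical reflectances for one vegetated pixel.
print(spectral_indices(green=0.1, red=0.1, nir=0.3, swir=0.2))
```

    For Landsat 8 OLI, the green, red, NIR and first SWIR bands are bands 3, 4, 5 and 6 respectively; which NDWI variant the paper used is not stated in the abstract.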

  13. Variants of the Borda count method for combining ranked classifier hypotheses

    NARCIS (Netherlands)

    van Erp, Merijn; Schomaker, Lambert; Vuurpijl, Louis

    2000-01-01

    The Borda count is a simple yet effective method of combining rankings. In pattern recognition, classifiers are often able to return a ranked set of results. Several experiments have been conducted to test the ability of the Borda count and two variant methods to combine these ranked classifier
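    As a minimal illustration of the basic Borda count (not of the two variants studied in the paper), the sketch below combines ranked label lists from several classifiers. Each classifier awards n-1 points to its top choice, n-2 to the next, and so on, where n is the length of its ranking. The function name and the alphabetical tie-break are assumptions made for the example.

```python
from collections import defaultdict

def borda_count(rankings):
    """Combine ranked label lists (best first) into one consensus ranking."""
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, label in enumerate(ranking):
            scores[label] += n - 1 - position
    # Highest total first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda label: (-scores[label], label))

rankings = [
    ["a", "b", "c"],   # classifier 1
    ["b", "a", "c"],   # classifier 2
    ["b", "c", "a"],   # classifier 3
]
print(borda_count(rankings))  # ['b', 'a', 'c']
```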

  14. Should OCD be classified as an anxiety disorder in DSM-V?

    NARCIS (Netherlands)

    Stein, Dan J.; Fineberg, Naomi A.; Bienvenu, O. Joseph; Denys, Damiaan; Lochner, Christine; Nestadt, Gerald; Leckman, James F.; Rauch, Scott L.; Phillips, Katharine A.

    2010-01-01

    In DSM-III, DSM-III-R, and DSM-IV, obsessive-compulsive disorder (OCD) was classified as an anxiety disorder. In ICD-10, OCD is classified separately from the anxiety disorders, although within the same larger category as anxiety disorders (as one of the "neurotic, stress-related, and somatoform

  15. An expert computer program for classifying stars on the MK spectral classification system

    International Nuclear Information System (INIS)

    Gray, R. O.; Corbally, C. J.

    2014-01-01

    This paper describes an expert computer program (MKCLASS) designed to classify stellar spectra on the MK Spectral Classification system in a way similar to humans—by direct comparison with the MK classification standards. Like an expert human classifier, the program first comes up with a rough spectral type, and then refines that spectral type by direct comparison with MK standards drawn from a standards library. A number of spectral peculiarities, including barium stars, Ap and Am stars, λ Bootis stars, carbon-rich giants, etc., can be detected and classified by the program. The program also evaluates the quality of the delivered spectral type. The program currently is capable of classifying spectra in the violet-green region in either the rectified or flux-calibrated format, although the accuracy of the flux calibration is not important. We report on tests of MKCLASS on spectra classified by human classifiers; those tests suggest that over the entire HR diagram, MKCLASS will classify in the temperature dimension with a precision of 0.6 spectral subclass, and in the luminosity dimension with a precision of about one half of a luminosity class. These results compare well with human classifiers.

  16. 75 FR 733 - Implementation of the Executive Order, ``Classified National Security Information''

    Science.gov (United States)

    2010-01-05

    ... of the Executive Order, ``Classified National Security Information'' Memorandum for the Heads of... Security Information'' (the ``order''), which substantially advances my goals for reforming the security... classified information shall provide the Director of the Information Security Oversight Office (ISOO) a copy...

  17. A distributed approach for optimizing cascaded classifier topologies in real-time stream mining systems.

    Science.gov (United States)

    Foo, Brian; van der Schaar, Mihaela

    2010-11-01

    In this paper, we discuss distributed optimization techniques for configuring classifiers in a real-time, informationally-distributed stream mining system. Due to the large volume of streaming data, stream mining systems must often cope with overload, which can lead to poor performance and intolerable processing delay for real-time applications. Furthermore, optimizing over an entire system of classifiers is a difficult task since changing the filtering process at one classifier can impact both the feature values of data arriving at classifiers further downstream and thus, the classification performance achieved by an ensemble of classifiers, as well as the end-to-end processing delay. To address this problem, this paper makes three main contributions: 1) Based on classification and queuing theoretic models, we propose a utility metric that captures both the performance and the delay of a binary filtering classifier system. 2) We introduce a low-complexity framework for estimating the system utility by observing, estimating, and/or exchanging parameters between the inter-related classifiers deployed across the system. 3) We provide distributed algorithms to reconfigure the system, and analyze the algorithms based on their convergence properties, optimality, information exchange overhead, and rate of adaptation to non-stationary data sources. We provide results using different video classifier systems.

  18. 14 CFR 1213.106 - Preventing release of classified information to the media.

    Science.gov (United States)

    2010-01-01

    ... ADMINISTRATION RELEASE OF INFORMATION TO NEWS AND INFORMATION MEDIA § 1213.106 Preventing release of classified... interviews, audio/visual) to the news media is prohibited. The disclosure of classified information to unauthorized individuals may be cause for prosecution and/or disciplinary action against the NASA employee...

  19. Feature selection for Bayesian network classifiers using the MDL-FS score

    NARCIS (Netherlands)

    Drugan, Madalina M.; Wiering, Marco A.

    When constructing a Bayesian network classifier from data, the more or less redundant features included in a dataset may bias the classifier and as a consequence may result in a relatively poor classification accuracy. In this paper, we study the problem of selecting appropriate subsets of features

  20. A unified classifier for robust face recognition based on combining multiple subspace algorithms

    Science.gov (United States)

    Ijaz Bajwa, Usama; Ahmad Taj, Imtiaz; Waqas Anwar, Muhammad

    2012-10-01

    Face recognition, the fastest growing biometric technology, has expanded manifold in the last few years. Various new algorithms and commercial systems have been proposed and developed. However, none of the proposed or developed algorithms is a complete solution, because an algorithm may work very well on one set of images with, say, illumination changes but may not work properly on another set of image variations such as expression variations. This study is motivated by the fact that no single classifier can claim to show generally better performance against all facial image variations. To overcome this shortcoming and achieve generality, combining several classifiers using various strategies has been studied extensively, also incorporating the question of the suitability of any classifier for this task. The study is based on the outcome of a comprehensive comparative analysis conducted on a combination of six subspace extraction algorithms and four distance metrics on three facial databases. The analysis leads to the selection of the most suitable classifiers, each of which performs better on one task or another. These classifiers are then combined into an ensemble classifier by two different strategies, weighted sum and re-ranking. The results of the ensemble classifier show that these strategies can be effectively used to construct a single classifier that can successfully handle varying facial image conditions of illumination, aging and facial expressions.
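    The weighted-sum strategy mentioned above can be sketched as a score-level fusion. The abstract does not describe the scoring, normalization or weights, so this example assumes each classifier returns a similarity score per candidate identity (higher is better), applies min-max normalization, and uses hypothetical weights.

```python
def weighted_sum_fusion(score_lists, weights):
    """Fuse per-classifier similarity scores into one decision.

    score_lists: one dict per classifier, mapping identity -> similarity score.
    weights:     one weight per classifier (hypothetical values).
    """
    fused = {}
    for scores, weight in zip(score_lists, weights):
        lo, hi = min(scores.values()), max(scores.values())
        span = hi - lo
        for identity, s in scores.items():
            # Min-max normalization puts every classifier on a 0..1 scale.
            norm = 0.0 if span == 0 else (s - lo) / span
            fused[identity] = fused.get(identity, 0.0) + weight * norm
    best = max(fused, key=fused.get)
    return best, fused

# Two hypothetical subspace classifiers that disagree on the best match.
best, fused = weighted_sum_fusion(
    [{"alice": 0.9, "bob": 0.1}, {"alice": 0.2, "bob": 0.8}],
    weights=[0.7, 0.3],
)
print(best)
```

    Distance-based classifiers, as used in the study, would first need their distances converted to similarities (for example by negation) before this kind of fusion.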

  1. Constrained parameter estimation for semi-supervised learning : The case of the nearest mean classifier

    NARCIS (Netherlands)

    Loog, M.

    2011-01-01

    A rather simple semi-supervised version of the equally simple nearest mean classifier is presented. However simple, the proposed approach is of practical interest as the nearest mean classifier remains a relevant tool in biomedical applications or other areas dealing with relatively high-dimensional

  2. An expert computer program for classifying stars on the MK spectral classification system

    Energy Technology Data Exchange (ETDEWEB)

    Gray, R. O. [Department of Physics and Astronomy, Appalachian State University, Boone, NC 26808 (United States); Corbally, C. J. [Vatican Observatory Research Group, Tucson, AZ 85721-0065 (United States)

    2014-04-01

    This paper describes an expert computer program (MKCLASS) designed to classify stellar spectra on the MK Spectral Classification system in a way similar to humans—by direct comparison with the MK classification standards. Like an expert human classifier, the program first comes up with a rough spectral type, and then refines that spectral type by direct comparison with MK standards drawn from a standards library. A number of spectral peculiarities, including barium stars, Ap and Am stars, λ Bootis stars, carbon-rich giants, etc., can be detected and classified by the program. The program also evaluates the quality of the delivered spectral type. The program currently is capable of classifying spectra in the violet-green region in either the rectified or flux-calibrated format, although the accuracy of the flux calibration is not important. We report on tests of MKCLASS on spectra classified by human classifiers; those tests suggest that over the entire HR diagram, MKCLASS will classify in the temperature dimension with a precision of 0.6 spectral subclass, and in the luminosity dimension with a precision of about one half of a luminosity class. These results compare well with human classifiers.

  3. A multiscale curvature algorithm for classifying discrete return LiDAR in forested environments

    Science.gov (United States)

    Jeffrey S. Evans; Andrew T. Hudak

    2007-01-01

    One prerequisite to the use of light detection and ranging (LiDAR) across disciplines is differentiating ground from nonground returns. The objective was to automatically and objectively classify points within unclassified LiDAR point clouds, with few model parameters and minimal postprocessing. Presented is an automated method for classifying LiDAR returns as ground...

  4. Biological sequence analysis

    DEFF Research Database (Denmark)

    Durbin, Richard; Eddy, Sean; Krogh, Anders Stærmose

    This book provides an up-to-date and tutorial-level overview of sequence analysis methods, with particular emphasis on probabilistic modelling. Discussed methods include pairwise alignment, hidden Markov models, multiple alignment, profile searches, RNA secondary structure analysis, and phylogene...

  5. THE RHIC SEQUENCER

    International Nuclear Information System (INIS)

    VAN ZEIJTS, J.; DOTTAVIO, T.; FRAK, B.; MICHNOFF, R.

    2001-01-01

    The Relativistic Heavy Ion Collider (RHIC) has a high-level asynchronous time-line driven by a controlling program called the "Sequencer". Most high-level magnet and beam related issues are orchestrated by this system. The system also plays an important role in coordinated data acquisition and saving. We present the program, operator interface, operational impact and experience.

  6. Twin anemia polycythemia sequence

    NARCIS (Netherlands)

    Slaghekke, Femke

    2014-01-01

    In this thesis we describe that Twin Anemia Polycythemia Sequence (TAPS) is a form of chronic feto-fetal transfusion in monochorionic (identical) twins based on a small amount of blood transfusion through very small anastomoses. For the antenatal diagnosis of TAPS, Middle Cerebral Artery – Peak

  7. simple sequence repeat (SSR)

    African Journals Online (AJOL)

    In the present study, 78 mapped simple sequence repeat (SSR) markers representing 11 linkage groups of adzuki bean were evaluated for transferability to mungbean and related Vigna spp. 41 markers amplified characteristic bands in at least one Vigna species. The transferability percentage across the genotypes ranged ...

  8. Performance analysis of a Principal Component Analysis ensemble classifier for Emotiv headset P300 spellers.

    Science.gov (United States)

    Elsawy, Amr S; Eldawlatly, Seif; Taher, Mohamed; Aly, Gamal M

    2014-01-01

    The current trend to use Brain-Computer Interfaces (BCIs) with mobile devices mandates the development of efficient EEG data processing methods. In this paper, we demonstrate the performance of a Principal Component Analysis (PCA) ensemble classifier for P300-based spellers. We recorded EEG data from multiple subjects using the Emotiv neuroheadset in the context of a classical oddball P300 speller paradigm. We compare the performance of the proposed ensemble classifier to the performance of traditional feature extraction and classifier methods. Our results demonstrate the capability of the PCA ensemble classifier to classify P300 data recorded using the Emotiv neuroheadset with an average accuracy of 86.29% on cross-validation data. In addition, offline testing of the recorded data reveals an average classification accuracy of 73.3% that is significantly higher than that achieved using traditional methods. Finally, we demonstrate the effect of the parameters of the P300 speller paradigm on the performance of the method.

  9. Accuracy Evaluation of C4.5 and Naive Bayes Classifiers Using Attribute Ranking Method

    Directory of Open Access Journals (Sweden)

    S. Sivakumari

    2009-03-01

    This paper intends to classify the Ljubljana Breast Cancer dataset using C4.5 Decision Tree and Naïve Bayes classifiers. In this work, classification is carried out using two methods. In the first method, the dataset is analysed using all of its attributes. In the second method, attributes are ranked using the information gain ranking technique and only the high-ranked attributes are used to build the classification model. We evaluate the results of the C4.5 Decision Tree and Naïve Bayes classifiers in terms of classifier accuracy for various folds of cross validation. Our results show that both classifiers achieve good accuracy on the dataset.
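    Information gain ranking, as used in the second method, scores each attribute by how much it reduces the entropy of the class label. The sketch below assumes categorical attributes stored as dictionaries. The attribute names and toy records are hypothetical and are not taken from the Ljubljana dataset.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(values, labels):
    """Entropy of the labels minus the expected entropy after splitting on values."""
    total = len(labels)
    remainder = 0.0
    for v in set(values):
        subset = [l for x, l in zip(values, labels) if x == v]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder

def rank_attributes(rows, labels):
    """Return attribute names ordered from most to least informative."""
    gains = {name: information_gain([r[name] for r in rows], labels)
             for name in rows[0]}
    return sorted(gains, key=lambda name: (-gains[name], name))

rows = [
    {"a": "x", "b": "p"},
    {"a": "x", "b": "q"},
    {"a": "y", "b": "p"},
    {"a": "y", "b": "q"},
]
labels = ["yes", "yes", "no", "no"]
print(rank_attributes(rows, labels))  # ['a', 'b']
```

    Only the top-ranked attributes would then be passed to the classifier; how many to keep is a choice the abstract does not specify.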

  10. Iceberg Semantics For Count Nouns And Mass Nouns: Classifiers, measures and portions

    Directory of Open Access Journals (Sweden)

    Fred Landman

    2016-12-01

    It is the analysis of complex NPs and their mass-count properties that is the focus of the second part of this paper. There I develop an analysis of English and Dutch pseudo-partitives, in particular, measure phrases like three liters of wine and classifier phrases like three glasses of wine. We will study measure interpretations and classifier interpretations of measures and classifiers, and different types of classifier interpretations: container interpretations, contents interpretations, and - indeed - portion interpretations. Rothstein 2011 argues that classifier interpretations (including portion interpretations) of pseudo-partitives pattern with count nouns, but that measure interpretations pattern with mass nouns. I will show that this distinction follows from the very basic architecture of Iceberg semantics.

  11. Novelty Detection Classifiers in Weed Mapping: Silybum marianum Detection on UAV Multispectral Images.

    Science.gov (United States)

    Alexandridis, Thomas K; Tamouridou, Afroditi Alexandra; Pantazi, Xanthoula Eirini; Lagopodi, Anastasia L; Kashefi, Javid; Ovakoglou, Georgios; Polychronos, Vassilios; Moshou, Dimitrios

    2017-09-01

    In the present study, the detection and mapping of Silybum marianum (L.) Gaertn. weed using novelty detection classifiers is reported. A multispectral camera (green-red-NIR) on board a fixed wing unmanned aerial vehicle (UAV) was employed for obtaining high-resolution images. Four novelty detection classifiers were used to identify S. marianum among other vegetation in a field. The classifiers were One Class Support Vector Machine (OC-SVM), One Class Self-Organizing Maps (OC-SOM), Autoencoders and One Class Principal Component Analysis (OC-PCA). As input features to the novelty detection classifiers, the three spectral bands and texture were used. The S. marianum identification using OC-SVM reached an overall accuracy of 96%. The results show the feasibility of effective S. marianum mapping by means of novelty detection classifiers acting on multispectral UAV imagery.

  12. Case base classification on digital mammograms: improving the performance of case base classifier

    Science.gov (United States)

    Raman, Valliappan; Then, H. H.; Sumari, Putra; Venkatesa Mohan, N.

    2011-10-01

    Breast cancer continues to be a significant public health problem in the world. Early detection is the key to improving breast cancer prognosis. The aim of the research presented here is twofold. The first stage of the research involves machine learning techniques, which segment and extract features from the mass in digital mammograms. The second stage is a problem-solving approach that classifies the mass with a performance-based case-base classifier. In this paper we build a case-based classifier in order to diagnose mammographic images. We explain the different methods and behaviors that have been added to the classifier to improve its performance. An initial performance-based classifier with bagging is proposed in the paper; it has been implemented and shows an improvement in specificity and sensitivity.

  13. Targeted sequencing of plant genomes

    Science.gov (United States)

    Mark D. Huynh

    2014-01-01

    Next-generation sequencing (NGS) has revolutionized the field of genetics by providing a means for fast and relatively affordable sequencing. With the advancement of NGS, whole-genome sequencing (WGS) has become more commonplace. However, sequencing an entire genome is still not cost effective or even beneficial in all cases. In studies that do not require a whole-...

  14. Almost convergence of triple sequences

    OpenAIRE

    Ayhan Esi; M.Necdet Catalbas

    2013-01-01

    In this paper we introduce and study the concepts of almost convergence and almost Cauchy for triple sequences. We show that the set of almost convergent triple sequences of 0's and 1's is of the first category and also that almost every triple sequence of 0's and 1's is not almost convergent. Keywords: almost convergence, P-convergent, triple sequence.

  15. A few Smarandache Integer Sequences

    OpenAIRE

    Ibstedt, Henry

    2010-01-01

    This paper deals with the analysis of a few Smarandache Integer Sequences which first appeared in Properties of the Numbers, F. Smarandache, University of Craiova Archives, 1975. The first four sequences are recurrence generated sequences while the last three are concatenation sequences.

  16. Allele Re-sequencing Technologies

    DEFF Research Database (Denmark)

    Byrne, Stephen; Farrell, Jacqueline Danielle; Asp, Torben

    2013-01-01

    The development of next-generation sequencing technologies has made sequencing an affordable approach for detection of genetic variations associated with various traits. However, the cost of whole genome re-sequencing still remains too high to be feasible for many plant species with large...... alternative to whole genome re-sequencing to identify causative genetic variations in plants. One challenge, however, will be efficient bioinformatics strategies for data handling and analysis from the increasing amount of sequence information....

  17. Classifying Classifications

    DEFF Research Database (Denmark)

    Debus, Michael S.

    2017-01-01

    This paper critically analyzes seventeen game classifications. The classifications were chosen on the basis of diversity, ranging from pre-digital classification (e.g. Murray 1952), over game studies classifications (e.g. Elverdam & Aarseth 2007) to classifications of drinking games (e.g. LaBrie et...... al. 2013). The analysis aims at three goals: The classifications’ internal consistency, the abstraction of classification criteria and the identification of differences in classification across fields and/or time. Especially the abstraction of classification criteria can be used in future endeavors...... into the topic of game classifications....

  18. Voice of the Classified Employee: A Descriptive Study to Determine Degree of Job Satisfaction of Classified Employees and to Design Systems of Support by School District Leaders

    Science.gov (United States)

    Barakos-Cartwright, Rebekah B.

    2012-01-01

    Classified employees comprise thirty two percent of the educational workforce in school districts in the state of California. Acknowledging these employees as a viable and untapped resource within the educational system will enrich job satisfaction for these employees and benefit the operations in school sites. As acknowledged and valued…

  19. How large a training set is needed to develop a classifier for microarray data?

    Science.gov (United States)

    Dobbin, Kevin K; Zhao, Yingdong; Simon, Richard M

    2008-01-01

    A common goal of gene expression microarray studies is the development of a classifier that can be used to divide patients into groups with different prognoses, or with different expected responses to a therapy. These types of classifiers are developed on a training set, which is the set of samples used to train a classifier. The question of how many samples are needed in the training set to produce a good classifier from high-dimensional microarray data is challenging. We present a model-based approach to determining the sample size required to adequately train a classifier. It is shown that sample size can be determined from three quantities: standardized fold change, class prevalence, and number of genes or features on the arrays. Numerous examples and important experimental design issues are discussed. The method is adapted to address ex post facto determination of whether the size of a training set used to develop a classifier was adequate. An interactive web site for performing the sample size calculations is provided. We showed that sample size calculations for classifier development from high-dimensional microarray data are feasible, discussed numerous important considerations, and presented examples.

  20. Can single classifiers be as useful as model ensembles to produce benthic seabed substratum maps?

    Science.gov (United States)

    Turner, Joseph A.; Babcock, Russell C.; Hovey, Renae; Kendrick, Gary A.

    2018-05-01

    Numerous machine-learning classifiers are available for benthic habitat map production, which can lead to different results. This study highlights the performance of the Random Forest (RF) classifier, which was significantly better than Classification Trees (CT), Naïve Bayes (NB), and a multi-model ensemble in terms of overall accuracy, Balanced Error Rate (BER), Kappa, and area under the curve (AUC) values. RF accuracy was often higher than 90% for each substratum class, even at the most detailed level of the substratum classification and AUC values also indicated excellent performance (0.8-1). Total agreement between classifiers was high at the broadest level of classification (75-80%) when differentiating between hard and soft substratum. However, this sharply declined as the number of substratum categories increased (19-45%) including a mix of rock, gravel, pebbles, and sand. The model ensemble, produced from the results of all three classifiers by majority voting, did not show any increase in predictive performance when compared to the single RF classifier. This study shows how a single classifier may be sufficient to produce benthic seabed maps and model ensembles of multiple classifiers.
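    The majority-voting ensemble described above can be sketched as follows. The abstract does not say how ties were resolved, so this example assumes ties go to the first-listed classifier among the tied labels. The function name and the sample predictions are hypothetical.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label lists into one list by majority voting.

    predictions: list of equally long label lists, one per classifier.
    Ties go to the earliest-listed classifier among the tied labels (assumption).
    """
    combined = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        top = max(counts.values())
        winner = next(label for label in labels if counts[label] == top)
        combined.append(winner)
    return combined

rf = ["rock", "sand", "gravel"]    # Random Forest predictions (hypothetical)
ct = ["rock", "rock", "sand"]      # Classification Tree predictions
nb = ["sand", "sand", "pebbles"]   # Naive Bayes predictions
print(majority_vote([rf, ct, nb]))  # ['rock', 'sand', 'gravel']
```

    The third sample shows why agreement matters: when all three classifiers disagree, the ensemble simply falls back on one of them, which is consistent with the finding that the ensemble added little over the single RF classifier.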

  1. Bias and Stability of Single Variable Classifiers for Feature Ranking and Selection.

    Science.gov (United States)

    Fakhraei, Shobeir; Soltanian-Zadeh, Hamid; Fotouhi, Farshad

    2014-11-01

    Feature rankings are often used for supervised dimension reduction especially when discriminating power of each feature is of interest, dimensionality of dataset is extremely high, or computational power is limited to perform more complicated methods. In practice, it is recommended to start dimension reduction via simple methods such as feature rankings before applying more complex approaches. Single Variable Classifier (SVC) ranking is a feature ranking based on the predictive performance of a classifier built using only a single feature. While benefiting from capabilities of classifiers, this ranking method is not as computationally intensive as wrappers. In this paper, we report the results of an extensive study on the bias and stability of such feature ranking method. We study whether the classifiers influence the SVC rankings or the discriminative power of features themselves has a dominant impact on the final rankings. We show the common intuition of using the same classifier for feature ranking and final classification does not always result in the best prediction performance. We then study if heterogeneous classifiers ensemble approaches provide more unbiased rankings and if they improve final classification performance. Furthermore, we calculate an empirical prediction performance loss for using the same classifier in SVC feature ranking and final classification from the optimal choices.
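    A Single Variable Classifier ranking can be sketched by training one small classifier per feature and ordering the features by its accuracy. The sketch below assumes a nearest-class-mean rule as the single-feature classifier and scores it on the training data. Both are simplifications chosen for brevity; the study compares several classifiers and a proper evaluation would use cross-validation.

```python
def nearest_mean_accuracy(values, labels):
    """Accuracy of a nearest-class-mean rule that sees only one feature."""
    classes = sorted(set(labels))
    means = {c: sum(v for v, l in zip(values, labels) if l == c) / labels.count(c)
             for c in classes}
    correct = 0
    for v, l in zip(values, labels):
        pred = min(classes, key=lambda c: abs(v - means[c]))
        correct += pred == l
    return correct / len(labels)

def svc_ranking(X, y):
    """Rank feature indices by the accuracy of their single-variable classifier."""
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        scores.append((nearest_mean_accuracy(column, y), j))
    # Best accuracy first; ties broken by feature index.
    return [j for acc, j in sorted(scores, key=lambda s: (-s[0], s[1]))]

# Feature 0 separates the classes; feature 1 carries no class information.
X = [[0.0, 5.0], [0.1, 1.0], [1.0, 5.0], [0.9, 1.0]]
y = [0, 0, 1, 1]
print(svc_ranking(X, y))  # [0, 1]
```

    Swapping `nearest_mean_accuracy` for a different single-feature classifier is how one would probe the paper's question of whether the ranking depends on the classifier used.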

  2. Classifier utility modeling and analysis of hypersonic inlet start/unstart considering training data costs

    Science.gov (United States)

    Chang, Juntao; Hu, Qinghua; Yu, Daren; Bao, Wen

    2011-11-01

    Start/unstart detection is one of the most important issues of hypersonic inlets and is also the foundation of protection control of scramjet. The inlet start/unstart detection can be attributed to a standard pattern classification problem, and the training sample costs have to be considered for the classifier modeling as the CFD numerical simulations and wind tunnel experiments of hypersonic inlets both cost time and money. To solve this problem, the CFD simulation of inlet is studied at first step, and the simulation results could provide the training data for pattern classification of hypersonic inlet start/unstart. Then the classifier modeling technology and maximum classifier utility theories are introduced to analyze the effect of training data cost on classifier utility. In conclusion, it is useful to introduce support vector machine algorithms to acquire the classifier model of hypersonic inlet start/unstart, and the minimum total cost of hypersonic inlet start/unstart classifier can be obtained by the maximum classifier utility theories.

  3. Multilocus Sequence Typing

    OpenAIRE

    Belén, Ana; Pavón, Ibarz; Maiden, Martin C.J.

    2009-01-01

    Multilocus sequence typing (MLST) was first proposed in 1998 as a typing approach that enables the unambiguous characterization of bacterial isolates in a standardized, reproducible, and portable manner using the human pathogen Neisseria meningitidis as the exemplar organism. Since then, the approach has been applied to a large and growing number of organisms by public health laboratories and research institutions. MLST data, shared by investigators over the world via the Internet, have been ...

  4. Achalasia Carcinoma Sequence

    OpenAIRE

    Makmun, Dadang

    2001-01-01

    We report a case of carcinoma of the esophagus in a 58-year-old woman with achalasia, which was diagnosed 30 years ago and initially treated surgically (myotomy), and whose symptoms recurred 3 years ago. Given the progression of the disease, malignancy was strongly suspected due to prolonged stasis and mucosal irritation caused by achalasia (achalasia carcinoma sequence). Because of these contributing factors for the development of serious complications such as Malignan...

  5. Grounding grammatical categories: attention bias in hand space influences grammatical congruency judgment of Chinese nominal classifiers.

    Science.gov (United States)

    Lobben, Marit; D'Ascenzo, Stefania

    2015-01-01

    Embodied cognitive theories predict that linguistic conceptual representations are grounded and continually represented in real-world, sensorimotor experiences. However, there is an ongoing debate on whether this also holds for abstract concepts. Grammar is the archetype of abstract knowledge, and therefore constitutes a test case against embodied theories of language representation. Previous studies have largely focussed on lexical-level embodied representations. In the present study we take the grounding-by-modality idea a step further by using reaction time (RT) data from the linguistic processing of nominal classifiers in Chinese. We take advantage of an independent body of research, which shows that attention in hand space is biased. Specifically, objects near the hand consistently yield shorter RTs as a function of readiness for action on graspable objects within reaching space, and the same biased attention inhibits attentional disengagement. We predicted that this attention bias would equally apply to the graspable object classifier but not to the big object classifier. Chinese speakers (N = 22) judged grammatical congruency of classifier-noun combinations in two conditions: graspable object classifier and big object classifier. We found that RTs for the graspable object classifier were significantly faster in congruent combinations, and significantly slower in incongruent combinations, than for the big object classifier. There was no main effect of grammatical violations, but rather an interaction effect of classifier type. Thus, we demonstrate here grammatical category-specific effects pertaining to the semantic content and, by extension, the visual and tactile modality underlying the acquisition of these categories. We conclude that abstract grammatical categories are subject to the same mechanisms as general cognitive and neurophysiological processes and may therefore be grounded.

  6. Using multivariate machine learning methods and structural MRI to classify childhood onset schizophrenia and healthy controls

    Directory of Open Access Journals (Sweden)

    Deanna Greenstein

    2012-06-01

    Introduction: Multivariate machine learning methods can be used to classify groups of schizophrenia patients and controls using structural magnetic resonance imaging (MRI). However, machine learning methods to date have not been extended beyond classification and contemporaneously applied in a meaningful way to clinical measures. We hypothesized that brain measures would classify groups, and that increased likelihood of being classified as a patient using regional brain measures would be positively related to illness severity, developmental delays and genetic risk. Methods: Using 74 anatomic brain MRI subregions and Random Forest, we classified 98 childhood onset schizophrenia (COS) patients and 99 age-, sex-, and ethnicity-matched healthy controls. We also used Random Forest to determine the likelihood of being classified as a schizophrenia patient based on MRI measures. We then explored relationships between brain-based probability of illness and symptoms, premorbid development, and presence of copy number variation associated with schizophrenia. Results: Brain regions jointly classified COS and control groups with 73.7% accuracy. Greater brain-based probability of illness was associated with worse functioning (p=0.0004) and fewer developmental delays (p=0.02). Presence of copy number variation (CNV) was associated with lower probability of being classified as schizophrenia (p=0.001). The regions that were most important in classifying groups included left temporal lobes, bilateral dorsolateral prefrontal regions, and left medial parietal lobes. Conclusions: Schizophrenia and control groups can be well classified using Random Forest and anatomic brain measures, and brain-based probability of illness has a positive relationship with illness severity and a negative relationship with developmental delays/problems and CNV-based risk.

  7. A support vector machine classifier reduces interscanner variation in the HRCT classification of regional disease pattern in diffuse lung disease: Comparison to a Bayesian classifier

    Energy Technology Data Exchange (ETDEWEB)

    Chang, Yongjun; Lim, Jonghyuck; Kim, Namkug; Seo, Joon Beom [Department of Radiology, University of Ulsan College of Medicine, 388-1 Pungnap2-dong, Songpa-gu, Seoul 138-736 (Korea, Republic of); Lynch, David A. [Department of Radiology, National Jewish Medical and Research Center, Denver, Colorado 80206 (United States)

    2013-05-15

    Purpose: To investigate the effect of using different computed tomography (CT) scanners on the accuracy of high-resolution CT (HRCT) images in classifying regional disease patterns in patients with diffuse lung disease, support vector machine (SVM) and Bayesian classifiers were applied to multicenter data. Methods: Two experienced radiologists marked sets of 600 rectangular 20 × 20 pixel regions of interest (ROIs) on HRCT images obtained from two scanners (GE and Siemens), including 100 ROIs for each of six local lung patterns: normal lung and five regional pulmonary disease patterns (ground-glass opacity, reticular opacity, honeycombing, emphysema, and consolidation). Each ROI was assessed using 22 quantitative features belonging to one of the following descriptors: histogram, gradient, run-length, gray level co-occurrence matrix, low-attenuation area cluster, and top-hat transform. For automatic classification, a Bayesian classifier and an SVM classifier were compared under three different conditions. First, classification accuracies were estimated using data from each scanner. Next, data from the GE and Siemens scanners were used for training and testing, respectively, and vice versa. Finally, all ROI data were integrated regardless of the scanner type and were then trained and tested together. All experiments were performed based on forward feature selection and fivefold cross-validation with 20 repetitions. Results: For each scanner, better classification accuracies were achieved with the SVM classifier than the Bayesian classifier (92% and 82%, respectively, for the GE scanner; and 92% and 86%, respectively, for the Siemens scanner). The classification accuracies were 82%/72% for training with GE data and testing with Siemens data, and 79%/72% for the reverse. The use of training and test data obtained from the HRCT images of different scanners lowered the classification accuracy compared to the use of HRCT images from the same scanner. For ...

  8. A support vector machine classifier reduces interscanner variation in the HRCT classification of regional disease pattern in diffuse lung disease: Comparison to a Bayesian classifier

    International Nuclear Information System (INIS)

    Chang, Yongjun; Lim, Jonghyuck; Kim, Namkug; Seo, Joon Beom; Lynch, David A.

    2013-01-01

    Purpose: To investigate the effect of using different computed tomography (CT) scanners on the accuracy of high-resolution CT (HRCT) images in classifying regional disease patterns in patients with diffuse lung disease, support vector machine (SVM) and Bayesian classifiers were applied to multicenter data. Methods: Two experienced radiologists marked sets of 600 rectangular 20 × 20 pixel regions of interest (ROIs) on HRCT images obtained from two scanners (GE and Siemens), including 100 ROIs for each of six local lung patterns: normal lung and five regional pulmonary disease patterns (ground-glass opacity, reticular opacity, honeycombing, emphysema, and consolidation). Each ROI was assessed using 22 quantitative features belonging to one of the following descriptors: histogram, gradient, run-length, gray level co-occurrence matrix, low-attenuation area cluster, and top-hat transform. For automatic classification, a Bayesian classifier and an SVM classifier were compared under three different conditions. First, classification accuracies were estimated using data from each scanner. Next, data from the GE and Siemens scanners were used for training and testing, respectively, and vice versa. Finally, all ROI data were integrated regardless of the scanner type and were then trained and tested together. All experiments were performed based on forward feature selection and fivefold cross-validation with 20 repetitions. Results: For each scanner, better classification accuracies were achieved with the SVM classifier than the Bayesian classifier (92% and 82%, respectively, for the GE scanner; and 92% and 86%, respectively, for the Siemens scanner). The classification accuracies were 82%/72% for training with GE data and testing with Siemens data, and 79%/72% for the reverse. The use of training and test data obtained from the HRCT images of different scanners lowered the classification accuracy compared to the use of HRCT images from the same scanner. For integrated ROI ...
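
    The evaluation protocol named in this abstract, fivefold cross-validation with 20 repetitions, can be sketched as an index generator. This is a minimal illustration under stated assumptions and not the authors' code: the function name `repeated_kfold` and the seed are hypothetical, the 600-sample size simply mirrors the ROI set size mentioned above, and the folds are not stratified by disease pattern.

```python
import random


def repeated_kfold(n_samples, k=5, repeats=20, seed=0):
    """Yield (train_idx, test_idx) for every fold of every repetition."""
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)  # a fresh random partition for each repetition
        folds = [idx[i::k] for i in range(k)]  # k disjoint folds covering all samples
        for f in range(k):
            test = folds[f]
            train = [i for g in range(k) if g != f for i in folds[g]]
            yield train, test


# 20 repetitions x 5 folds = 100 train/test splits over 600 samples.
splits = list(repeated_kfold(n_samples=600))
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

    A classifier would be fitted on each `train` index set and scored on the matching `test` set, and the accuracies averaged over all splits.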

  9. Sequencing BPS spectra

    Energy Technology Data Exchange (ETDEWEB)

    Gukov, Sergei [Walter Burke Institute for Theoretical Physics, California Institute of Technology,1200 E California Blvd, Pasadena, CA 91125 (United States); Max-Planck-Institut für Mathematik,Vivatsgasse 7, D-53111 Bonn (Germany); Nawata, Satoshi [Walter Burke Institute for Theoretical Physics, California Institute of Technology,1200 E California Blvd, Pasadena, CA 91125 (United States); Centre for Quantum Geometry of Moduli Spaces, University of Aarhus,Nordre Ringgade 1, DK-8000 (Denmark); Saberi, Ingmar [Walter Burke Institute for Theoretical Physics, California Institute of Technology,1200 E California Blvd, Pasadena, CA 91125 (United States); Stošić, Marko [CAMGSD, Departamento de Matemática, Instituto Superior Técnico,Av. Rovisco Pais, 1049-001 Lisbon (Portugal); Mathematical Institute SANU,Knez Mihajlova 36, 11000 Belgrade (Serbia); Sułkowski, Piotr [Walter Burke Institute for Theoretical Physics, California Institute of Technology,1200 E California Blvd, Pasadena, CA 91125 (United States); Faculty of Physics, University of Warsaw,ul. Pasteura 5, 02-093 Warsaw (Poland)

    2016-03-02

    This paper provides both a detailed study of color-dependence of link homologies, as realized in physics as certain spaces of BPS states, and a broad study of the behavior of BPS states in general. We consider how the spectrum of BPS states varies as continuous parameters of a theory are perturbed. This question can be posed in a wide variety of physical contexts, and we answer it by proposing that the relationship between unperturbed and perturbed BPS spectra is described by a spectral sequence. These general considerations unify previous applications of spectral sequence techniques to physics, and explain from a physical standpoint the appearance of many spectral sequences relating various link homology theories to one another. We also study structural properties of colored HOMFLY homology for links and evaluate Poincaré polynomials in numerous examples. Among these structural properties is a novel “sliding” property, which can be explained by using (refined) modular S-matrix. This leads to the identification of modular transformations in Chern-Simons theory and 3d N=2 theory via the 3d/3d correspondence. Lastly, we introduce the notion of associated varieties as classical limits of recursion relations of colored superpolynomials of links, and study their properties.

  10. Sequencing BPS spectra

    International Nuclear Information System (INIS)

    Gukov, Sergei; Nawata, Satoshi; Saberi, Ingmar; Stošić, Marko; Sułkowski, Piotr

    2016-01-01

    This paper provides both a detailed study of color-dependence of link homologies, as realized in physics as certain spaces of BPS states, and a broad study of the behavior of BPS states in general. We consider how the spectrum of BPS states varies as continuous parameters of a theory are perturbed. This question can be posed in a wide variety of physical contexts, and we answer it by proposing that the relationship between unperturbed and perturbed BPS spectra is described by a spectral sequence. These general considerations unify previous applications of spectral sequence techniques to physics, and explain from a physical standpoint the appearance of many spectral sequences relating various link homology theories to one another. We also study structural properties of colored HOMFLY homology for links and evaluate Poincaré polynomials in numerous examples. Among these structural properties is a novel “sliding” property, which can be explained by using (refined) modular S-matrix. This leads to the identification of modular transformations in Chern-Simons theory and 3d N=2 theory via the 3d/3d correspondence. Lastly, we introduce the notion of associated varieties as classical limits of recursion relations of colored superpolynomials of links, and study their properties.

  11. Image sequence analysis

    CERN Document Server

    1981-01-01

    The processing of image sequences has a broad spectrum of important applications including target tracking, robot navigation, bandwidth compression of TV conferencing video signals, studying the motion of biological cells using microcinematography, cloud tracking, and highway traffic monitoring. Image sequence processing involves a large amount of data. However, because of the progress in computer, LSI, and VLSI technologies, we have now reached a stage when many useful processing tasks can be done in a reasonable amount of time. As a result, research and development activities in image sequence analysis have recently been growing at a rapid pace. An IEEE Computer Society Workshop on Computer Analysis of Time-Varying Imagery was held in Philadelphia, April 5-6, 1979. A related special issue of the IEEE Transactions on Pattern Analysis and Machine Intelligence was published in November 1980. The IEEE Computer magazine has also published a special issue on the subject in 1981. The purpose of this book ...

  12. A New Adaptive Structural Signature for Symbol Recognition by Using a Galois Lattice as a Classifier.

    Science.gov (United States)

    Coustaty, M; Bertet, K; Visani, M; Ogier, J

    2011-08-01

    In this paper, we propose a new approach for symbol recognition using structural signatures and a Galois lattice as a classifier. The structural signatures are based on topological graphs computed from segments which are extracted from the symbol images by using an adapted Hough transform. These structural signatures, which can be seen as dynamic paths that carry high-level information, are robust toward various transformations. They are classified by using a Galois lattice as a classifier. The performance of the proposed approach is evaluated based on the GREC'03 symbol database, and the experimental results we obtain are encouraging.

  13. Construction of Pancreatic Cancer Classifier Based on SVM Optimized by Improved FOA

    Science.gov (United States)

    Ma, Xiaoqi

    2015-01-01

    A novel method is proposed to establish a pancreatic cancer classifier. First, quantum concepts and the fruit fly optimization algorithm (FOA) are introduced. Then FOA is improved by quantum coding and quantum operation, and a new smell concentration determination function is defined. Finally, the improved FOA is used to optimize the parameters of a support vector machine (SVM), and the classifier is established with the optimized SVM. To verify the effectiveness of the proposed method, SVM and other classification methods were chosen for comparison. The experimental results show that the proposed method can improve classifier performance and takes less time. PMID:26543867

  14. Multiple classifier systems in texton-based approach for the classification of CT images of Lung

    DEFF Research Database (Denmark)

    Gangeh, Mehrdad J.; Sørensen, Lauge; Shaker, Saher B.

    2010-01-01

    In this paper, we propose using texton signatures based on raw pixel representation along with a parallel multiple classifier system for the classification of emphysema in computed tomography images of the lung. The multiple classifier system is composed of support vector machines on the texton ... i.e., texton size and k value in k-means. Our results show that while aggregation of single decisions by SVMs over various k values using multiple classifier systems helps to improve the results compared to single SVMs, combining over different texton sizes is not beneficial. The performance of the proposed...

  15. Automatic construction of a recurrent neural network based classifier for vehicle passage detection

    Science.gov (United States)

    Burnaev, Evgeny; Koptelov, Ivan; Novikov, German; Khanipov, Timur

    2017-03-01

    Recurrent Neural Networks (RNNs) are extensively used for time-series modeling and prediction. We propose an approach for automatic construction of a binary classifier based on Long Short-Term Memory RNNs (LSTM-RNNs) for detection of a vehicle passage through a checkpoint. As input to the classifier we use multidimensional signals from various sensors installed at the checkpoint. The results obtained demonstrate that the previous approach of handcrafting a classifier, consisting of a set of deterministic rules, can be successfully replaced by automatic RNN training on appropriately labelled data.

  16. Foundations of Sequence-to-Sequence Modeling for Time Series

    OpenAIRE

    Kuznetsov, Vitaly; Mariet, Zelda

    2018-01-01

    The availability of large amounts of time series data, paired with the performance of deep-learning algorithms on a broad class of problems, has recently led to significant interest in the use of sequence-to-sequence models for time series forecasting. We provide the first theoretical analysis of this time series forecasting framework. We include a comparison of sequence-to-sequence modeling to classical time series models, and as such our theory can serve as a quantitative guide for practiti...

  17. Complete Genome Sequence of an Avian Metapneumovirus Subtype A Strain Isolated from Chicken (Gallus gallus) in Brazil.

    Science.gov (United States)

    Rizotto, Laís S; Scagion, Guilherme P; Cardoso, Tereza C; Simão, Raphael M; Caserta, Leonardo C; Benassi, Julia C; Keid, Lara B; Oliveira, Trícia M F de S; Soares, Rodrigo M; Arns, Clarice W; Van Borm, Steven; Ferreira, Helena L

    2017-07-20

    We report here the complete genome sequence of an avian metapneumovirus (aMPV) isolated from a tracheal tissue sample of a commercial layer flock. The complete genome sequence of aMPV-A/chicken/Brazil-SP/669/2003 was obtained using MiSeq (Illumina, Inc.) sequencing. Phylogenetic analysis of the complete genome classified the isolate as avian metapneumovirus subtype A. Copyright © 2017 Rizotto et al.

  18. 32 CFR 154.6 - Standards for access to classified information or assignment to sensitive duties.

    Science.gov (United States)

    2010-07-01

    ... OF THE SECRETARY OF DEFENSE SECURITY DEPARTMENT OF DEFENSE PERSONNEL SECURITY PROGRAM REGULATION... person's loyalty, reliability, and trustworthiness are such that entrusting the person with classified... reasonable basis for doubting the person's loyalty to the Government of the United States. ...

  19. A system for classifying wood-using industries and recording statistics for automatic data processing.

    Science.gov (United States)

    E.W. Fobes; R.W. Rowe

    1968-01-01

    A system for classifying wood-using industries and recording pertinent statistics for automatic data processing is described. Forms and coding instructions for recording data of primary processing plants are included.

  20. A Constrained Multi-Objective Learning Algorithm for Feed-Forward Neural Network Classifiers

    Directory of Open Access Journals (Sweden)

    M. Njah

    2017-06-01

    This paper proposes a new approach to address the optimal design of a Feed-forward Neural Network (FNN) based classifier. The originality of the proposed methodology, called CMOA, lies in the use of a new constraint handling technique based on a self-adaptive penalty procedure in order to direct the entire search effort towards finding only Pareto optimal solutions that are acceptable. Neurons and connections of the FNN classifier are dynamically built during the learning process. The approach includes differential evolution to create new individuals and then keeps only the non-dominated ones as the basis for the next generation. The designed FNN classifier is applied to six binary classification benchmark problems, obtained from the UCI repository, and the results indicate the advantages of the proposed approach over other existing multi-objective evolutionary neural network classifiers reported recently in the literature.
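
    The step of keeping only the non-dominated individuals can be sketched as a Pareto filter. This is a minimal illustration, not the CMOA implementation: it assumes every objective is minimized, the objective pair (classification error, number of connections) is a hypothetical choice, and the self-adaptive penalty and differential evolution steps are omitted.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(population):
    """Keep the individuals that no other individual dominates."""
    return [p for p in population if not any(dominates(q, p) for q in population)]


# Hypothetical candidates as (classification error, number of connections).
candidates = [(0.10, 12), (0.08, 20), (0.10, 15), (0.20, 5), (0.25, 6)]
front = non_dominated(candidates)
print(front)
```

    In this toy population, (0.10, 15) is dropped because (0.10, 12) matches its error with fewer connections, and (0.25, 6) is dropped because (0.20, 5) is better in both objectives.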