WorldWideScience

Sample records for classifying proteinlike sequences

  1. Classifying Genomic Sequences by Sequence Feature Analysis

    Institute of Scientific and Technical Information of China (English)

    Zhi-Hua Liu; Dian Jiao; Xiao Sun

    2005-01-01

    Traditional sequence analysis depends on sequence alignment. In this study, we analyzed various functional regions of the human genome based on sequence features, including word frequency, dinucleotide relative abundance, and base-base correlation. We analyzed human chromosome 22 and classified the upstream, exon, intron, downstream, and intergenic regions by principal component analysis and discriminant analysis of these features. The results show that the functional regions of the genome can be classified based on sequence features and discriminant analysis.
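
    As an illustration of two of the features named above, the following sketch computes word (k-mer) frequencies and dinucleotide relative abundance. It assumes the commonly used odds-ratio definition rho_xy = f_xy / (f_x * f_y); the abstract does not state which definition the authors used, and the function names are hypothetical.

```python
from collections import Counter
from itertools import product


def word_frequencies(seq, k):
    """Return relative frequencies of all overlapping k-mers in seq."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def dinucleotide_relative_abundance(seq):
    """Odds ratio rho_xy = f_xy / (f_x * f_y) for each dinucleotide xy.

    This is an assumed (standard) definition, not taken from the abstract.
    """
    f1 = word_frequencies(seq, 1)
    f2 = word_frequencies(seq, 2)
    rho = {}
    for x, y in product("ACGT", repeat=2):
        expected = f1.get(x, 0.0) * f1.get(y, 0.0)
        # Leave rho undefined (None) when a base is absent from the sequence.
        rho[x + y] = f2.get(x + y, 0.0) / expected if expected > 0 else None
    return rho
```

    The 16 values returned by `dinucleotide_relative_abundance` could serve as one block of a feature vector for a region, which would then be passed to principal component or discriminant analysis.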

  2. Analysis of Sequence Based Classifier Prediction for HIV Subtypes

    Directory of Open Access Journals (Sweden)

    S. Santhosh Kumar

    2012-10-01

    Human immunodeficiency virus (HIV) is a lentivirus that causes acquired immunodeficiency syndrome (AIDS). The main difficulty in the HIV treatment process is subtype prediction. The subtype and group classification of HIV is based on its genetic variability and location. HIV can be divided into two major types, HIV type 1 (HIV-1) and HIV type 2 (HIV-2). Many classifier approaches have been used to classify HIV subtypes based on their group, but some cases belong to two groups at once, which makes the classification more complex. The methodology used in this paper is based on HIV sequences. In this work, several classifier approaches are used to classify HIV-1 and HIV-2. For the implementation, a real-time patient database is used, experiments are run on the patient records, and the best classifier is identified as the one with the quickest response time and the lowest error rate.

  3. ELASTIC BEHAVIOR OF PROTEIN-LIKE SINGLE CHAIN

    Institute of Scientific and Technical Information of China (English)

    Wei-qi Yi; Lin-xi Zhang

    2005-01-01

    The conformational properties and elastic behaviors of protein-like single chains in the process of tensile elongation were investigated by means of the Monte Carlo method. The sequences of protein-like single chains contain two types of residues: hydrophobic (H) and hydrophilic (P). The average conformations and thermodynamic statistical properties of protein-like single chains with various elongation ratios λ were calculated. It was found that the mean-square end-to-end distance r increases with elongation ratio λ. The ratio of the tensor eigenvalues decreases with elongation ratio λ for short (HP)x protein-like polymers; however, the ratio increases with elongation ratio λ, especially for long (H)x sequences. Average energy per bond increases with elongation ratio λ, especially for (H)x protein-like single chains. Helmholtz free energy per bond also increases with elongation ratio λ. Elastic force (f), energy contribution to force (fU) and entropy contribution to force (fs) for different protein-like single chains were also calculated. These investigations may provide some insights into the elastic behaviors of proteins.

  4. Order parameter for design of proteinlike heteropolymers

    CERN Document Server

    Nelson, Erik D.; Ten Eyck, Lynn F.; Onuchic, José N.

    1998-01-01

    We define the energetics of proteinlike heteropolymers according to an ensemble of copolymer sequence interactions, in which (i) the sequences define a basis of orthogonal vectors belonging to an optimal class of bases, and (ii) the matrix of contact energies for each sequence has the Mattis (diagonal) form, which eliminates all energetic frustration loops along closed circuits of contacts within any configuration of the chain. This makes it possible to derive a set of physical order parameters which partition the configuration space into structurally similar statistical ensembles, each having low topological frustration. By applying this description to the statistics of homopolymeric chains (with length N = 16 - 128) we obtain a number of important results, which provide a simple explanation for the observed frequency dependence of hydrophobic domains in proteins, and suggest that the diagonal ensemble is sufficient to represent the energetics of minimally frustrated heteropolymers.

  5. RNA secondary structure prediction from sequence alignments using a network of k-nearest neighbor classifiers.

    Science.gov (United States)

    Bindewald, Eckart; Shapiro, Bruce A

    2006-03-01

    We present a machine learning method (a hierarchical network of k-nearest neighbor classifiers) that uses an RNA sequence alignment in order to predict a consensus RNA secondary structure. The input to the network is the mutual information, the fraction of complementary nucleotides, and a novel consensus RNAfold secondary structure prediction of a pair of alignment columns and its nearest neighbors. Given this input, the network computes a prediction as to whether a particular pair of alignment columns corresponds to a base pair. By using a comprehensive test set of 49 RFAM alignments, the program KNetFold achieves an average Matthews correlation coefficient of 0.81. This is a significant improvement compared with the secondary structure prediction methods PFOLD and RNAalifold. By using the example of archaeal RNase P, we show that the program can also predict pseudoknot interactions.
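
    Two of the inputs named above, the mutual information of a pair of alignment columns and their fraction of complementary nucleotides, can be sketched as follows. This is a minimal illustration and not the KNetFold implementation: the function names are hypothetical, gap characters are not treated specially, and counting G-U wobble pairs as complementary is an assumption.

```python
import math
from collections import Counter

# Watson-Crick pairs plus the G-U wobble pair (assumed definition of "complementary").
COMPLEMENTARY = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def column(alignment, i):
    """Extract column i from a list of equal-length aligned sequences."""
    return [row[i] for row in alignment]


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_i)
    n_i = Counter(col_i)
    n_j = Counter(col_j)
    n_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), n_ab in n_ij.items():
        # p_ab / (p_a * p_b) simplifies to n_ab * n / (n_a * n_b).
        mi += (n_ab / n) * math.log2(n_ab * n / (n_i[a] * n_j[b]))
    return mi


def complementary_fraction(col_i, col_j):
    """Fraction of rows whose two characters can form a base pair."""
    pairs = list(zip(col_i, col_j))
    return sum(pair in COMPLEMENTARY for pair in pairs) / len(pairs)
```

    For the toy alignment `["GAC", "CAG", "AAU", "UAA"]`, columns 0 and 2 covary perfectly and always pair, so both features are at their maximum for a four-letter alphabet, whereas the invariant column 1 carries no mutual information with column 0.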

  6. Improved Bevirimat resistance prediction by combination of structural and sequence-based classifiers

    Directory of Open Access Journals (Sweden)

    Dybowski J Nikolaj

    2011-11-01

    Background: Maturation inhibitors such as Bevirimat are a new class of antiretroviral drugs that hamper the cleavage of HIV-1 proteins into their functional active forms. They bind to these preproteins and inhibit their cleavage by the HIV-1 protease, resulting in non-functional virus particles. Nevertheless, there exist mutations in this region leading to resistance against Bevirimat. Highly specific and accurate tools to predict resistance to maturation inhibitors can help to identify patients who might benefit from the usage of these new drugs. Results: We tested several methods to improve Bevirimat resistance prediction in HIV-1. It turned out that combining structural and sequence-based information in classifier ensembles led to accurate and reliable predictions. Moreover, we were able to identify the most crucial regions for Bevirimat resistance computationally, which are in line with experimental results from other studies. Conclusions: Our analysis demonstrated the use of machine learning techniques to predict HIV-1 resistance against maturation inhibitors such as Bevirimat. New maturation inhibitors are already under development and might enlarge the arsenal of antiretroviral drugs in the future. Thus, accurate prediction tools are very useful to enable a personalized therapy.

  7. Coding-complete sequencing classifies parrot bornavirus 5 into a novel virus species.

    Science.gov (United States)

    Marton, Szilvia; Bányai, Krisztián; Gál, János; Ihász, Katalin; Kugler, Renáta; Lengyel, György; Jakab, Ferenc; Bakonyi, Tamás; Farkas, Szilvia L

    2015-11-01

    In this study, we determined the sequence of the coding region of an avian bornavirus detected in a blue-and-yellow macaw (Ara ararauna) with pathological/histopathological changes characteristic of proventricular dilatation disease. The genomic organization of the macaw bornavirus is similar to that of other bornaviruses, and its nucleotide sequence is nearly identical to the available partial parrot bornavirus 5 (PaBV-5) sequences. Phylogenetic analysis showed that these strains formed a monophyletic group distinct from other mammalian and avian bornaviruses and in calculations performed with matrix protein coding sequences, the PaBV-5 and PaBV-6 genotypes formed a common cluster, suggesting that according to the recently accepted classification system for bornaviruses, these two genotypes may belong to a new species, provisionally named Psittaciform 2 bornavirus.

  8. Neural network and SVM classifiers accurately predict lipid binding proteins, irrespective of sequence homology.

    Science.gov (United States)

    Bakhtiarizadeh, Mohammad Reza; Moradi-Shahrbabak, Mohammad; Ebrahimi, Mansour; Ebrahimie, Esmaeil

    2014-09-07

    Due to the central roles of lipid binding proteins (LBPs) in many biological processes, sequence-based identification of LBPs is of great interest. The major challenge is that LBPs are diverse in sequence, structure, and function, which results in low accuracy of sequence homology based methods. Therefore, there is a need for developing alternative functional prediction methods irrespective of sequence similarity. To distinguish LBPs from non-LBPs, the performances of support vector machine (SVM) and neural network classifiers were compared in this study. Comprehensive protein features and various techniques were employed to create datasets. Five-fold cross-validation (CV) and independent evaluation (IE) tests were used to assess the validity of the two methods. The results indicated that SVM outperforms the neural network. SVM achieved 89.28% (CV) and 89.55% (IE) overall accuracy in distinguishing LBPs from non-LBPs and 92.06% (CV) and 92.90% (IE) (on average) for classification of different LBP classes. Increasing the number and the range of extracted protein features as well as optimization of the SVM parameters significantly increased the efficiency of LBP class prediction in comparison to the only previous report in this field. Altogether, the results showed that the SVM algorithm can be run on broad, computationally calculated protein features and offers a promising tool for detection of LBP classes. The proposed approach has the potential to integrate with and improve the common sequence alignment based methods.

  9. An Effective Antifreeze Protein Predictor with Ensemble Classifiers and Comprehensive Sequence Descriptors

    Directory of Open Access Journals (Sweden)

    Runtao Yang

    2015-09-01

    Antifreeze proteins (AFPs) play a pivotal role in the antifreeze effect of overwintering organisms. They have a wide range of applications in numerous fields, such as improving the production of crops and the quality of frozen foods. Accurate identification of AFPs may provide important clues to decipher the underlying mechanisms of AFPs in ice-binding and to facilitate the selection of the most appropriate AFPs for several applications. Based on an ensemble learning technique, this study proposes an AFP identification system called AFP-Ensemble. In this system, random forest classifiers are trained on different training subsets and then aggregated into a consensus classifier by majority voting. The resulting predictor yields a sensitivity of 0.892, a specificity of 0.940, an accuracy of 0.938 and a balanced accuracy of 0.916 on an independent dataset, which are far better than the results obtained by previous methods. These results reveal that AFP-Ensemble is an effective and promising predictor for large-scale determination of AFPs. The detailed feature analysis in this study may give useful insights into the molecular mechanisms of AFP-ice interactions and provide guidance for the related experimental validation. A web server has been designed to implement the proposed method.
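
    The majority-voting step and the reported evaluation measures can be sketched as below. This is a generic illustration rather than the AFP-Ensemble code; the function names and the confusion-matrix counts used in any call are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine the labels predicted by several classifiers for one sample.

    With an odd number of binary classifiers there are no ties; otherwise
    the label seen first among the tied ones is returned.
    """
    return Counter(predictions).most_common(1)[0][0]


def confusion_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy, balanced accuracy)."""
    sensitivity = tp / (tp + fn)   # fraction of true AFPs recovered
    specificity = tn / (tn + fp)   # fraction of non-AFPs correctly rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    balanced_accuracy = (sensitivity + specificity) / 2
    return sensitivity, specificity, accuracy, balanced_accuracy
```

    Balanced accuracy is the mean of sensitivity and specificity, and the figures quoted in the abstract are consistent with that definition: (0.892 + 0.940) / 2 = 0.916.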

  10. Classifying life course trajectories : a comparison of latent class and sequence analysis

    NARCIS (Netherlands)

    Barban, Nicola; Billari, Francesco C.

    2012-01-01

    We compare two techniques that are widely used in the analysis of life course trajectories: latent class analysis and sequence analysis. In particular, we focus on the use of these techniques as devices to obtain classes of individual life course trajectories. We first compare the consistency of t

  11. Molecular characterization of SQUAMOSA PROMOTER BINDING PROTEIN-LIKE (SPL) gene family from Citrus and the effect of fruit load on their expression

    Directory of Open Access Journals (Sweden)

    Liron Shalom

    2015-05-01

    We recently identified a Citrus gene encoding a SQUAMOSA PROMOTER BINDING PROTEIN-LIKE (SPL) transcription factor that contained a sequence complementary to miR156. Genes of the SPL family are known to play a role in flowering regulation and phase transition. In Citrus, the mRNA levels of the gene were significantly altered by fruit load in buds; under heavy fruit load (ON-Crop trees), known to suppress next-year flowering, the mRNA levels were down-regulated, while fruit removal (de-fruiting), inducing next-year flowering, resulted in its up-regulation. In the current work, we set out to study the function of the gene. We showed that the Citrus SPL was able to promote flowering independently of photoperiod in Arabidopsis, while miR156 repressed its flowering-promoting activity. In order to find out if fruit load affected the expression of additional genes of the SPL family, we identified and classified all SPL members in the Citrus genome, and studied their seasonal expression patterns in buds and leaves, and in response to de-fruiting. Results showed that two additional SPL-like genes and miR172, known to be induced by SPLs in Arabidopsis, were altered by fruit load. The relationships between these factors in relation to the fruit-load effect on Citrus flowering are discussed.

  12. Proteinlike copolymers as encapsulating agents for small-molecule solutes.

    Science.gov (United States)

    Malik, Ravish; Genzer, Jan; Hall, Carol K

    2015-03-24

    We describe the utilization of proteinlike copolymers (PLCs) as encapsulating agents for small-molecule solutes. We perform Monte Carlo simulations on systems containing PLCs and model solute molecules in order to understand how PLCs assemble in solution and what system conditions promote solute encapsulation. Specifically, we explore how the chemical composition of the PLCs and the range and strength of molecular interactions between hydrophobic segments on the PLC and solute molecules affect the solute encapsulation efficiency. The composition profiles of the hydrophobic and hydrophilic segments, the solute, and implicit solvent (or voids) within the PLC globule are evaluated to gain a complete understanding of the behavior in the PLC/solute system. We find that a single-chain PLC encapsulates solute successfully by collapsing the macromolecule to a well-defined globular conformation when the hydrophobic/solute interaction is at least as strong as the interaction strength among hydrophobic segments and the interaction among solute molecules is at most as strong as the hydrophobic/solute interaction strength. Our results can be used by experimentalists as a framework for optimizing unimolecular PLC solute encapsulation and can be extended potentially to applications such as "drug" delivery via PLCs.

  13. Potential Autoepitope within the Extracellular Region of Contactin-Associated Protein-like 2 in Mice

    Science.gov (United States)

    Obregon, Demian F.; Zhu, Yuyan; Bailey, Antoinette R.; Portis, Samantha M.; Hou, Huayan; Zeng, Jin; Stock, Saundra L.; Murphy, Tanya K.; Bengtson, Michael A.; Tan, Jun

    2013-01-01

    Aims: Contactin-associated protein-like 2 (CNTNAP2) is implicated in autoimmune encephalitis, neuromyotonia and genetic forms of autism. Here we report that CNTNAP2 contains a potential autoepitope within the extracellular region. Methodology: CNTNAP2 sequence-similar regions (CSSRs) from human pathogens were identified. Sera from autistic and control children were obtained and analyzed for the presence of antibodies able to bind CSSRs. One such candidate CSSR was evaluated for evidence of autoimmune responses to CNTNAP2 in a mouse model of acute infection. Results: Sera from autistic and control children contained antibodies able to bind discrete regions of CNTNAP2. In a murine model of acute infection, a CSSR derived from the N-terminal extracellular region of CNTNAP2 resulted in anti-CNTNAP2 antibody production, proinflammatory cytokine elevation, cerebellar and cortical white matter T-cell infiltration as well as motor dysfunction. Conclusion: Taken together, these data suggest that CNTNAP2 contains a potential autoepitope within the extracellular region. PMID:24466509

  14. Classifying Sequences by the Optimized Dissimilarity Space Embedding Approach: a Case Study on the Solubility Analysis of the E. coli Proteome

    CERN Document Server

    Livi, Lorenzo; Sadeghian, Alireza

    2014-01-01

    We evaluate a version of the recently proposed Optimized Dissimilarity Space Embedding (ODSE) classification system that operates in the input space of sequences of generic objects. The ODSE system was originally presented as a labeled graph classification system. However, since it is founded on the dissimilarity space representation of the input data, the classifier can be easily adapted to any input domain where it is possible to define a meaningful dissimilarity measure. We demonstrate the effectiveness of the ODSE classifier for sequences in an application dealing with recognition of the solubility degree of the Escherichia coli proteome. Overall, the obtained results, which, we stress, were achieved with no context-dependent tuning of the ODSE system, confirm the validity and generality of the ODSE-based approach for structured data classification.
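
    The dissimilarity space representation mentioned above can be illustrated with a short sketch: each sequence is mapped to the vector of its dissimilarities to a fixed set of prototype sequences, and any vector-based classifier can then be applied. This shows only the basic embedding idea, not the ODSE system itself; the choice of Levenshtein distance as the dissimilarity measure and the function names are assumptions.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,             # deletion
                current[j - 1] + 1,          # insertion
                previous[j - 1] + (x != y),  # substitution or match
            ))
        previous = current
    return previous[-1]


def embed(sequence, prototypes, dissimilarity=edit_distance):
    """Map a sequence to its vector of dissimilarities to the prototypes."""
    return [dissimilarity(sequence, p) for p in prototypes]
```

    Because `embed` only requires a callable `dissimilarity`, the same code applies to any input domain for which a meaningful dissimilarity measure can be defined, which is the property the abstract emphasizes.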

  15. Linear Recurrent Double Sequences with Constant Border in M2(F2) are Classified According to Their Geometric Content

    Directory of Open Access Journals (Sweden)

    Mihai Prunescu

    2011-07-01

    The author used the automatic proof procedure introduced in [1] and verified that the 4096 homomorphic recurrent double sequences with constant borders defined over Klein’s Vierergruppe K and the 4096 linear recurrent double sequences with constant border defined over the matrix ring M2(F2) can also be produced by systems of substitutions with finitely many rules. This permits the definition of a sound notion of geometric content for most of these sequences, more exactly for those which are not primitive. We group the 4096 linear recurrent double sequences with constant border I over the ring M2(F2) into 90 geometric types. The classification over Klein’s Vierergruppe K is not explicitly displayed; it consists of the same geometric types as for M2(F2), but contains more exceptions. There are many cases of unsymmetric double sequences converging to symmetric geometric contents. We also display geometric types occurring in both a monochromatic and a dichromatic version.

  16. Identification of phosphatidylcholine transfer protein-like in the parasite Entamoeba histolytica.

    Science.gov (United States)

    Piña-Vázquez, Carolina; Reyes-López, Magda; Mendoza-Hernández, Guillermo; Bermúdez-Cruz, Rosa María; de la Garza, Mireya

    2014-12-01

    Caveolin is the protein marker of caveola-mediated endocytosis. Previously, we demonstrated by immunoblotting and immunofluorescence that an anti-chick embryo caveolin-1 monoclonal antibody (mAb) recognizes a protein in amoeba extracts. Nevertheless, the caveolin-1 gene is absent in the Entamoeba histolytica genome database. In this work, the goal was to isolate, identify and characterize the protein that cross-reacts with chick embryo caveolin-1. We identified the protein using a proteomic approach, and the complete gene was cloned and sequenced. The identified protein, E. histolytica phosphatidylcholine transfer protein-like (EhPCTP-L), is a member of the StAR-related lipid transfer (START) protein superfamily. The human homolog binds and transfers phosphatidylcholine (PC) and phosphatidylethanolamine (PE) between model membranes in vitro; however, the physiological role of PCTP-L remains elusive. Studies in silico showed that EhPCTP-L has a central START domain and also contains a C-terminal intrinsically disordered region. The anti-rEhPCTP-L antibody demonstrated that EhPCTP-L is found in the plasma membrane and cytosol, which is in agreement with previous reports on the human counterpart. This result points to the plasma membrane as one possible target membrane for EhPCTP-L. Furthermore, assays using filipin and nystatin showed down regulation of EhPCTP-L, in an apparently cholesterol-independent way. Interestingly, EhPCTP-L binds primarily to anionic phospholipids phosphatidylserine (PS) and phosphatidic acid (PA), while its mammalian counterpart HsPCTP-L binds neutral phospholipids PC and PE. The present study provides information that helps reveal the possible function and regulation of PCTP-L expression in the primitive eukaryotic parasite E. histolytica.

  17. Classifying Microorganisms

    DEFF Research Database (Denmark)

    Sommerlund, Julie

    2006-01-01

    This paper describes the coexistence of two systems for classifying organisms and species: a dominant genetic system and an older naturalist system. The former classifies species and traces their evolution on the basis of genetic characteristics, while the latter employs physiological characteris... and integration possible, the field of molecular biology seems to be overwhelmingly homogeneous, and in need of heterogeneity and conflict to add drive and momentum to the work being carried out. The paper is based on observations of daily life in a molecular microbiology laboratory at the Technical University...

  18. Classifying Motion.

    Science.gov (United States)

    Duzen, Carl; And Others

    1992-01-01

    Presents a series of activities that utilizes a leveling device to classify constant and accelerated motion. Applies this classification system to uniform circular motion and motion produced by gravitational force. (MDH)

  19. Protein-like fully reversible tetramerisation and super-association of an aminocellulose

    Science.gov (United States)

    Nikolajski, Melanie; Adams, Gary G.; Gillis, Richard B.; Besong, David Tabot; Rowe, Arthur J.; Heinze, Thomas; Harding, Stephen E.

    2014-01-01

    Unusual protein-like, partially reversible associative behaviour has recently been observed in solutions of the water soluble carbohydrates known as 6-deoxy-6-(ω-aminoalkyl)aminocelluloses, which produce controllable self-assembling films for enzyme immobilisation and other biotechnological applications. Now, for the first time, we have found a fully reversible self-association (tetramerisation) within this family of polysaccharides. Remarkably these carbohydrate tetramers are then seen to associate further in a regular way into supra-molecular complexes. Fully reversible oligomerisation has been hitherto completely unknown for carbohydrates and instead resembles in some respects the assembly of polypeptides and proteins like haemoglobin and its sickle cell mutation. Our traditional perceptions as to what might be considered "protein-like" and what might be considered as "carbohydrate-like" behaviour may need to be rendered more flexible, at least as far as interaction phenomena are concerned.

  20. Small RNA sequencing-microarray analyses in Parkinson leukocytes reveal deep brain stimulation-induced and splicing changes that classify brain region transcriptomes

    Directory of Open Access Journals (Sweden)

    Lilach Soreq

    2013-05-01

    MicroRNAs (miRNAs) are key post-transcriptional regulators of their multiple target genes. However, the detailed profile of miRNA expression in Parkinson's disease, the second most common neurodegenerative disease worldwide and the first motor disorder, has not been charted yet. Here, we report comprehensive miRNA profiling by next-generation small-RNA sequencing, combined with target inspection by splice-junction and exon arrays interrogating leukocyte RNA in Parkinson's disease patients before and after deep brain stimulation (DBS) treatment and in matched healthy control volunteers (HC). RNA-Seq analysis identified 254 miRNAs and 79 passenger strand forms as expressed in blood leukocytes, 16 of which were modified in patients pre-treatment as compared to HC. Eleven miRNAs were modified following brain stimulation, 5 of which were changed inversely to the disease-induced changes. Stimulation cessation further induced changes in 11 miRNAs. Transcript isoform abundance analysis yielded 332 changed isoforms in patients compared to HC, which classified brain transcriptomes of 47 PD and control independent microarrays. Functional enrichment analysis highlighted mitochondrion organization. DBS induced 155 splice changes, enriched in ubiquitin homeostasis. Cellular composition analysis revealed immune cell activity pre- and post-treatment. Overall, 217 disease and 74 treatment alternative isoforms were predictably targeted by modified miRNAs within both 3’ and 5’ untranslated ends and coding sequence sites. The stimulation-induced network sustained 4 miRNAs and 7 transcripts of the disease network. We believe that the presented dynamic networks provide a novel avenue for identifying disease and treatment-related therapeutic targets. Furthermore, the identification of these networks is a major step forward on the road to understanding the molecular basis of neurological and neurodegenerative diseases and assessment of the impact of brain stimulation

  1. Enhancement of spin polarization in transport through protein-like single-helical molecules

    Science.gov (United States)

    Wu, Hai-Na; Wang, Xiao; Zhang, Ya-Jing; Yi, Guang-Yu; Gong, Wei-Jiang

    2016-06-01

    We investigate the spin-polarized electron transport through the single-helical molecules connected with two normal metallic leads. On the basis of an effective model Hamiltonian, influences of the structural parameters on the conductance and the spin polarization are calculated by using the Landauer-Büttiker formula. The optimal structural parameters for the maximal spin polarization are analyzed. Our results show that the dephasing term is an important factor to enhance the spin polarization, in addition to the intrinsic parameters of the single-helical molecule. This work can be helpful in optimizing the spin polarization in the protein-like single-helical molecules.

  2. Molecular Dynamics Simulations of a Flexible Polyethylene: A Protein-Like Behaviour in a Water Solvent

    CERN Document Server

    Kretov, D A

    2005-01-01

    We used molecular dynamics (MD) simulations to study the density and temperature behaviour of a flexible polyethylene (PE) subjected to various heating conditions and to investigate the PE chain conformational changes in a water solvent. First, we considered the influence of the heating process on the final state of the polymeric system and the sensitivity of its thermodynamic characteristics (density, energy, etc.) to different heating regimes. For this purpose three different simulations were performed: fast, moderate, and slow heating. Second, we investigated the PE chain conformational dynamics in water solvent for various simulation conditions and various configurations of the environment. From the results we obtained pictures of the PE dynamical motions in water. We observed a protein-like behaviour of the PE chain, like that of DNA and proteins in water, and also estimated the rates of the conformational changes. For the MD simulations we used the optimized...

  3. Effective stiffness and formation of secondary structures in a protein-like model

    Science.gov (United States)

    Škrbić, Tatjana; Hoang, Trinh X.; Giacometti, Achille

    2016-08-01

    We use Wang-Landau and replica exchange techniques to study the effect of an increasing stiffness on the formation of secondary structures in protein-like systems. Two possible models are considered. In both models, a polymer chain is formed by tethered beads where non-consecutive backbone beads attract each other via a square-well potential representing the tendency of the chain to fold. In addition, smaller hard spheres are attached to each non-terminal backbone bead along the direction normal to the chain to mimic the steric hindrance of side chains in real proteins. The two models, however, differ in the way bending rigidity is enforced. In the first model, partial overlap between consecutive beads is allowed. This reduces the possible bending angle between consecutive bonds thus producing an effective entropic stiffness that competes with a short-range attraction, and leads to the formation of secondary structures characteristic of proteins. We discuss the low-temperature phase diagram as a function of increasing interpenetration and find a transition from a planar, beta-like structure, to helical shape. In the second model, an energetic stiffness is explicitly introduced by imposing an infinitely large energy penalty for bending above a critical angle between consecutive bonds, and no penalty below it. The low-temperature phase of this model does not show any sign of protein-like secondary structures. At intermediate temperatures, however, where the chain is still in the coil conformation but stiffness is significant, we find the two models to predict a quite similar dependence of the persistence length as a function of the stiffness. This behaviour is rationalized in terms of a simple geometrical mapping between the two models. Finally, we discuss the effect of shrinking side chains to zero and find the above mapping to still hold true.
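
    The two energy terms described for the second model can be written down compactly. The sketch below is illustrative only: the hard-core diameter `sigma`, the well range, the well depth `epsilon`, and the critical angle `theta_max` are assumed placeholder parameters, not values taken from the paper, and treating non-consecutive beads as having a hard core is likewise an assumption.

```python
import math


def square_well(r, sigma=1.0, well_range=1.5, epsilon=1.0):
    """Square-well pair energy between two non-consecutive backbone beads.

    Assumed form: hard core below sigma, constant attraction -epsilon up to
    well_range * sigma, and no interaction beyond that distance.
    """
    if r < sigma:
        return math.inf          # hard-core overlap is forbidden
    if r <= well_range * sigma:
        return -epsilon          # inside the attractive well
    return 0.0


def bending_energy(theta, theta_max):
    """Energetic stiffness of the second model.

    Infinite penalty for a bending angle above the critical angle theta_max
    between consecutive bonds, and no penalty below it.
    """
    return math.inf if theta > theta_max else 0.0
```

    In a Monte Carlo scheme, a trial configuration whose total energy contains an infinite term would simply be rejected, which is how such hard constraints are usually enforced.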

  4. Measurement of protein-like fluorescence in river and waste water using a handheld spectrophotometer.

    Science.gov (United States)

    Baker, Andy; Ward, David; Lieten, Shakti H; Periera, Ryan; Simpson, Ellie C; Slater, Malcolm

    2004-07-01

    Protein-like fluorescence intensity in rivers increases with increasing anthropogenic DOM inputs from sewerage and farm wastes. Here, a portable luminescence spectrophotometer was used to investigate whether this technology could provide both field scientists with a rapid pollution monitoring tool and process control engineers with a portable waste water monitoring device, through the measurement of river and waste water tryptophan-like fluorescence from a range of rivers in NE England and from effluents from within two waste water treatment plants. The portable spectrophotometer determined that waste waters and sewerage effluents had the highest tryptophan-like fluorescence intensity, urban streams had an intermediate tryptophan-like fluorescence intensity, and the upstream river samples of good water quality had the lowest tryptophan-like fluorescence intensity. Replicate samples demonstrated that fluorescence intensity is reproducible to ±20% for low-fluorescence, 'clean' river water samples and ±5% for urban water and waste waters. The correlation between fluorescence measured by the portable spectrophotometer and by a conventional bench machine was 0.91 (Spearman's rho, n = 143), demonstrating that the portable spectrophotometer does correlate with tryptophan-like fluorescence intensity measured using the bench spectrophotometer.

  5. Functional evolution in the plant SQUAMOSA-PROMOTER BINDING PROTEIN-LIKE (SPL) gene family

    Directory of Open Access Journals (Sweden)

    Jill Christine Preston

    2013-04-01

    The SQUAMOSA-PROMOTER BINDING PROTEIN-LIKE (SPL) family of transcription factors is functionally diverse, controlling a number of fundamental aspects of plant growth and development, including vegetative phase change, flowering time, branching, and leaf initiation rate. In natural plant populations, variation in flowering time and shoot architecture has major consequences for fitness. Likewise, in crop species, variation in branching and developmental rate impacts biomass and yield. Thus, studies aimed at dissecting how the various functions are partitioned among different SPL genes in diverse plant lineages are key to providing insight into the genetic basis of local adaptation and have already garnered attention from crop breeders. Here we use phylogenetic reconstruction to reveal nine major SPL gene lineages, each of which is described in terms of function and diversification. To assess evidence for ancestral and derived functions within each SPL gene lineage, we use ancestral character state reconstructions. Our analyses suggest an emerging pattern of sub-functionalization, neo-functionalization, and possible convergent evolution following both ancient and recent gene duplication. Based on these analyses we suggest future avenues of research that may prove fruitful for elucidating the importance of SPL gene evolution in plant growth and development.

  6. Guanine nucleotide binding protein-like 3 is a potential prognosis indicator of gastric cancer.

    Science.gov (United States)

    Chen, Jing; Dong, Shuang; Hu, Jiangfeng; Duan, Bensong; Yao, Jian; Zhang, Ruiyun; Zhou, Hongmei; Sheng, Haihui; Gao, Hengjun; Li, Shunlong; Zhang, Xianwen

    2015-01-01

    Guanine nucleotide binding protein-like 3 (GNL3) is a GTP-binding nuclear protein that has been reported to be involved in various biological processes, including cell proliferation, cellular senescence and tumorigenesis. This study aimed to investigate the expression level of GNL3 in gastric cancer and to evaluate the relationship between its expression and clinical variables and overall survival of gastric cancer patients. The expression level of GNL3 was examined in 89 human gastric cancer samples using immunohistochemistry (IHC) staining. GNL3 in gastric cancer tissues was significantly upregulated compared with paracancerous tissues. GNL3 expression in adjacent non-cancerous tissues was associated with sex and tumor size. Survival analyses showed that GNL3 expression in both gastric cancer and adjacent non-cancerous tissues was not related to overall survival. However, in the subgroup of patients with larger tumor size (≥ 6 cm), a close association was found between GNL3 expression in gastric cancer tissues and overall survival. GNL3-positive patients had a shorter survival than GNL3-negative patients. Our study suggests that GNL3 might play an important role in the progression of gastric cancer and serve as a biomarker for poor prognosis in gastric cancer patients.

  7. The raspberry model for protein-like particles: Ellipsoids and confinement in cylindrical pores

    Science.gov (United States)

    Ustach, Vincent D.; Faller, Roland

    2016-10-01

    The study of protein mass transport via atomistic simulation requires time and length scales beyond the computational capabilities of modern computer systems. The raspberry model for colloidal particles in combination with the mesoscopic hydrodynamic method of lattice Boltzmann facilitates coarse-grained simulations that are on the order of microseconds and hundreds of nanometers for the study of diffusive transport of protein-like colloid particles. The raspberry model reproduces linearity in resistance to motion versus particle size and correct enhanced drag within cylindrical pores at off-center coordinates for spherical particles. Owing to the high aspect ratio of many proteins, ellipsoidal raspberry colloid particles were constructed and reproduced the geometric resistance factors of Perrin and of Happel and Brenner in the laboratory-frame and in the moving body-frame. Accurate body-frame rotations during diffusive motion have been captured for the first time using projections of displacements. The spatial discretization of the fluid leads to a renormalization of the hydrodynamic radius; however, the data describe a self-consistent hydrodynamic frame within this renormalized system.

  8. Classifying Facial Actions

    Science.gov (United States)

    Donato, Gianluca; Bartlett, Marian Stewart; Hager, Joseph C.; Ekman, Paul; Sejnowski, Terrence J.

    2010-01-01

    The Facial Action Coding System (FACS) [23] is an objective method for quantifying facial movement in terms of component actions. This system is widely used in behavioral investigations of emotion, cognitive processes, and social interaction. The coding is presently performed by highly trained human experts. This paper explores and compares techniques for automatically recognizing facial actions in sequences of images. These techniques include analysis of facial motion through estimation of optical flow; holistic spatial analysis, such as principal component analysis, independent component analysis, local feature analysis, and linear discriminant analysis; and methods based on the outputs of local filters, such as Gabor wavelet representations and local principal components. Performance of these systems is compared to naive and expert human subjects. Best performances were obtained using the Gabor wavelet representation and the independent component representation, both of which achieved 96 percent accuracy for classifying 12 facial actions of the upper and lower face. The results provide converging evidence for the importance of using local filters, high spatial frequencies, and statistical independence for classifying facial actions. PMID:21188284

  9. No evidence for association of autism with rare heterozygous point mutations in Contactin-Associated Protein-Like 2 (CNTNAP2), or in Other Contactin-Associated Proteins or Contactins.

    Directory of Open Access Journals (Sweden)

    John D Murdoch

    2015-01-01

    Full Text Available Contactins and Contactin-Associated Proteins, and Contactin-Associated Protein-Like 2 (CNTNAP2) in particular, have been widely cited as autism risk genes based on findings from homozygosity mapping, molecular cytogenetics, copy number variation analyses, and both common and rare single nucleotide association studies. However, data specifically with regard to the contribution of heterozygous single nucleotide variants (SNVs) have been inconsistent. In an effort to clarify the role of rare point mutations in CNTNAP2 and related gene families, we have conducted targeted next-generation sequencing and evaluated existing sequence data in cohorts totaling 2704 cases and 2747 controls. We find no evidence for statistically significant association of rare heterozygous mutations in any of the CNTN or CNTNAP genes, including CNTNAP2, placing marked limits on the scale of their plausible contribution to risk.

  10. iPPBS-Opt: A Sequence-Based Ensemble Classifier for Identifying Protein-Protein Binding Sites by Optimizing Imbalanced Training Datasets

    Directory of Open Access Journals (Sweden)

    Jianhua Jia

    2016-01-01

    Full Text Available Knowledge of protein-protein interactions and their binding sites is indispensable for in-depth understanding of the networks in living cells. With the avalanche of protein sequences generated in the postgenomic age, it is critical to develop computational methods for identifying in a timely fashion the protein-protein binding sites (PPBSs) based on the sequence information alone because the information obtained in this way can be used for both biomedical research and drug development. To address such a challenge, we have proposed a new predictor, called iPPBS-Opt, in which we have used: (1) the K-Nearest Neighbors Cleaning (KNNC) and Inserting Hypothetical Training Samples (IHTS) treatments to optimize the training dataset; (2) the ensemble voting approach to select the most relevant features; and (3) the stationary wavelet transform to formulate the statistical samples. Cross-validation tests by targeting the experiment-confirmed results have demonstrated that the new predictor is very promising, implying that the aforementioned practices are indeed very effective. Particularly, the approach of using the wavelets to express protein/peptide sequences might be the key in grasping the problem's essence, fully consistent with the findings that many important biological functions of proteins can be elucidated with their low-frequency internal motions. To maximize the convenience of most experimental scientists, we have provided a step-by-step guide on how to use the predictor's web server (http://www.jci-bioinfo.cn/iPPBS-Opt) to get the desired results without the need to go through the complicated mathematical equations involved.

  11. Ecdysozoan phylogeny and Bayesian inference: first use of nearly complete 28S and 18S rRNA gene sequences to classify the arthropods and their kin.

    Science.gov (United States)

    Mallatt, Jon M; Garey, James R; Shultz, Jeffrey W

    2004-04-01

    Relationships among the ecdysozoans, or molting animals, have been difficult to resolve. Here, we use nearly complete 28S+18S ribosomal RNA gene sequences to estimate the relations of 35 ecdysozoan taxa, including newly obtained 28S sequences from 25 of these. The tree-building algorithms were likelihood-based Bayesian inference and minimum-evolution analysis of LogDet-transformed distances, and hypotheses were tested with parametric bootstrapping. Better taxonomic resolution and recovery of established taxa were obtained here, especially with Bayesian inference, than in previous parsimony-based studies that used 18S rRNA sequences (or 18S plus small parts of 28S). In our gene trees, priapulan worms represent the basal ecdysozoans, followed by nematomorphs, or nematomorphs plus nematodes, followed by Panarthropoda. Panarthropoda was monophyletic with high support, although the relationships among its three phyla (arthropods, onychophorans, tardigrades) remain uncertain. The four groups of arthropods, namely hexapods (insects and related forms), crustaceans, chelicerates (spiders, scorpions, horseshoe crabs), and myriapods (centipedes, millipedes, and relatives), formed two well-supported clades: Hexapoda in a paraphyletic crustacea (Pancrustacea), and 'Chelicerata+Myriapoda' (a clade that we name 'Paradoxopoda'). Pycnogonids (sea spiders) were either chelicerates or part of the 'chelicerate+myriapod' clade, but not basal arthropods. Certain clades derived from morphological taxonomy, such as Mandibulata, Atelocerata, Schizoramia, Maxillopoda and Cycloneuralia, are inconsistent with these rRNA data. The 28S gene contained more signal than the 18S gene, and contributed to the improved phylogenetic resolution. Our findings are similar to those obtained from mitochondrial and nuclear (e.g., elongation factor, RNA polymerase, Hox) protein-encoding genes, and should revive interest in using rRNA genes to study arthropod and ecdysozoan relationships.

  12. Purification and Characterization of a Novel Cold Shock Protein-Like Bacteriocin Synthesized by Bacillus thuringiensis

    Science.gov (United States)

    Huang, Tianpei; Zhang, Xiaojuan; Pan, Jieru; Su, Xiaoyu; Jin, Xin; Guan, Xiong

    2016-01-01

    Bacillus thuringiensis (Bt), one of the most successful biopesticides, may expand its potential by producing bacteriocins (thuricins). The aim of this study was to investigate the antimicrobial potential of a novel Bt bacteriocin, thuricin BtCspB, produced by Bt BRC-ZYR2. The results showed that this bacteriocin has a high similarity with cold-shock protein B (CspB). BtCspB lost its activity after proteinase K treatment; however, it was active at 60 °C for 30 min and was stable in the pH range 5–7. The partial loss of activity after the lipase II and catalase treatments was likely due to the change in BtCspB structure and the partial degradation of BtCspB, respectively. The loss of activity at high temperatures and the activity variation at different pHs were not due to degradation or large conformational change. BtCspB did not inhibit four probiotics. It was only active against B. cereus strains 0938 and ATCC 10987 with MIC values of 3.125 μg/mL and 0.781 μg/mL, and MBC values of 12.5 μg/mL and 6.25 μg/mL, respectively. Taken together, these results provide new insights into a novel cold shock protein-like bacteriocin, BtCspB, which displayed promise for its use in food preservation and treatment of B. cereus-associated diseases. PMID:27762322

  13. Construction of eukaryotic expression vector encoding ATP synthase lipid-binding protein-like protein gene of Sj and its expression in HeLa cells

    Institute of Scientific and Technical Information of China (English)

    Ouyang Danming; Hu Yongxuan; Li Mulan; Zeng Xiaojun; He Zhixiong; Yuan Caijia

    2008-01-01

    Objective: To clone and construct the recombinant plasmid containing the ATP synthase lipid-binding protein-like protein gene of Schistosoma japonicum (SjAslp) and transfer it into mammalian cells to express the objective protein. Methods: By polymerase chain reaction (PCR) technique, SjAslp was amplified from the constructed recombinant plasmid pBCSK+/SjAslp, and inserted into cloning vector pUCm-T. Then, SjAslp was subcloned into a eukaryotic expression vector pcDNA3.1(+). After identifying it by PCR, restriction enzyme digestion and DNA sequencing, the recombinant plasmid was transfected into HeLa cells using electroporation, and the expression of the recombinant protein was analyzed by immunocytochemical assay. Results: The specific gene fragment of 558 bp was successfully amplified. The DNA vaccine of SjAslp was successfully constructed. Immunocytochemical assay showed that SjAslp was expressed in the cytoplasm of HeLa cells. Conclusion: SjAslp gene can be expressed in eukaryotic system, which lays the foundation for development of the SjAslp DNA vaccine against schistosomiasis.

  14. Why is it so difficult to classify Renazzo-type (CR) carbonaceous chondrites? - Implications from TEM observations of matrices for the sequences of aqueous alteration

    Science.gov (United States)

    Abreu, Neyda M.

    2016-12-01

    A number of different classifications have been proposed for the CR chondrites; this study aims at reconciling these different schemes. Mineralogy-based classification has proved particularly challenging for weakly to moderately altered CRs because incipient mineral replacement and elemental mobilization arising from aqueous alteration only affected the most susceptible primary phases, which are generally located in the matrix. Secondary matrix phases are extremely fine-grained (generally sub-micron) and heterogeneously mixed with primary nebular materials. Compositional and isotopic classification parameters are fraught with confounding factors, such as terrestrial weathering, impact processes, and variable abundance of clasts from different regions of the CR parent body or from altogether different planetary bodies. Here, detailed TEM observations from eighteen FIB sections retrieved from the matrices of nine Antarctic CR chondrites (EET 96259, GRA 95229, GRO 95577, GRO 03116, LAP 02342, LAP 04516, LAP 04720, MIL 07525, and MIL 090001) are presented, representing a range of petrologic types. Amorphous Fe-Mg silicates are found to be the dominant phase in all but the most altered CR chondrite matrices, which still retain significant amounts of these amorphous materials. Amorphous Fe-Mg silicates are mixed with phyllosilicates at the nanometer scale. The ratio of amorphous Fe-Mg silicates to phyllosilicates decreases as: (1) the size of phyllosilicates, (2) abundance of magnetite, and (3) replacement of Fe-Ni sulfides increase. Carbonates are only abundant in the most altered CR chondrite, GRO 95577. Nanophase Fe-Ni metal and tochilinite are present in small abundances in most CR matrices. Based on the presence, abundance and size of phyllosilicates with respect to amorphous Fe-Mg silicates, the sub-micron features of CR chondrites have been linked to existing classification sequences, and possible reasons for inconsistencies among classification schemes are discussed.

  15. An evaluation of morphological and functional multi-parametric MRI sequences in classifying non-muscle and muscle invasive bladder cancer.

    Science.gov (United States)

    Panebianco, Valeria; De Berardinis, Ettore; Barchetti, Giovanni; Simone, Giuseppe; Leonardo, Constantino; Grompone, Marcello Domenico; Del Monte, Maurizio; Carano, Davide; Gallucci, Michele; Catto, James; Catalano, Carlo

    2017-09-01

    Our goal is to determine the ability of multi-parametric magnetic resonance imaging (mpMRI) to differentiate muscle invasive bladder cancer (MIBC) from non-muscle invasive bladder cancer (NMIBC). Patients underwent mpMRI before tumour resection. Four MRI sets, i.e. T2-weighted (T2W) + perfusion-weighted imaging (PWI), T2W + diffusion-weighted imaging (DWI), T2W + DWI + PWI, and T2W + DWI + PWI + diffusion tensor imaging (DTI) were interpreted qualitatively by two radiologists, blinded to histology results. PWI, DWI and DTI were also analysed quantitatively. Accuracy was determined using histopathology as the reference standard. A total of 82 tumours were analysed. Ninety-six percent of T1-labeled tumours by the T2W + DWI + PWI image set were confirmed to be NMIBC at histopathology. Overall accuracy of the complete mpMRI protocol was 94% in differentiating NMIBC from MIBC. PWI, DWI and DTI quantitative parameters were shown to be significantly different in cancerous versus non-cancerous areas within the bladder wall in T2-labelled lesions. MpMRI with DWI and DTI appears a reliable staging tool for bladder cancer. If our data are validated, then mpMRI could precede cystoscopic resection to allow a faster recognition of MIBC and accelerated treatment pathways. • A critical step in BCa staging is to differentiate NMIBC from MIBC. • Morphological and functional sequences are reliable techniques in differentiating NMIBC from MIBC. • Diffusion tensor imaging could be an additional tool in BCa staging.

  16. Brut: Automatic bubble classifier

    Science.gov (United States)

    Beaumont, Christopher; Goodman, Alyssa; Williams, Jonathan; Kendrew, Sarah; Simpson, Robert

    2014-07-01

    Brut, written in Python, identifies bubbles in infrared images of the Galactic midplane; it uses a database of known bubbles from the Milky Way Project and Spitzer images to build an automatic bubble classifier. The classifier is based on the Random Forest algorithm, and uses the WiseRF implementation of this algorithm.

  17. Dynamic system classifier

    Science.gov (United States)

    Pumpe, Daniel; Greiner, Maksim; Müller, Ewald; Enßlin, Torsten A.

    2016-07-01

    Stochastic differential equations describe well many physical, biological, and sociological systems, despite the simplification often made in their derivation. Here the usage of simple stochastic differential equations to characterize and classify complex dynamical systems is proposed within a Bayesian framework. To this end, we develop a dynamic system classifier (DSC). The DSC first abstracts training data of a system in terms of time-dependent coefficients of the descriptive stochastic differential equation. Thereby the DSC identifies unique correlation structures within the training data. For definiteness we restrict the presentation of the DSC to oscillation processes with a time-dependent frequency ω(t) and damping factor γ(t). Although real systems might be more complex, this simple oscillator captures many characteristic features. The ω and γ time lines represent the abstract system characterization and permit the construction of efficient signal classifiers. Numerical experiments show that such classifiers perform well even in the low signal-to-noise regime.
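
    To make the oscillator model concrete, the sketch below simulates a noisy damped oscillator with time-dependent ω(t) and γ(t). It assumes the standard form x'' + γ(t) x' + ω(t)² x = ξ(t) with white noise ξ, which may differ in detail from the equation used by the authors. It only generates signals; it does not implement the Bayesian inference of the DSC. The function name and all parameter values are hypothetical.

```python
import math
import random


def simulate_oscillator(omega, gamma, noise_std=0.0, dt=1e-3, steps=5000,
                        x0=1.0, v0=0.0, seed=0):
    """Integrate x'' + gamma(t) x' + omega(t)^2 x = noise with a
    semi-implicit Euler-Maruyama scheme; omega and gamma are callables of t."""
    rng = random.Random(seed)
    x, v = x0, v0
    path = [x]
    for n in range(steps):
        t = n * dt
        accel = -gamma(t) * v - omega(t) ** 2 * x
        # The velocity receives the deterministic drift plus a Wiener increment.
        v += accel * dt + noise_std * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # The position uses the updated velocity (semi-implicit step).
        x += v * dt
        path.append(x)
    return path


# Assumed example: a frequency that rises linearly in time, with weak damping.
chirp = simulate_oscillator(
    omega=lambda t: 2 * math.pi * (1.0 + 0.5 * t),
    gamma=lambda t: 0.3,
    noise_std=0.5,
)
print(len(chirp))
```

    Signals generated with different ω(t) and γ(t) time lines could serve as synthetic training classes for a classifier of this kind.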

  18. Dynamic system classifier

    CERN Document Server

    Pumpe, Daniel; Müller, Ewald; Enßlin, Torsten A

    2016-01-01

    Stochastic differential equations describe well many physical, biological and sociological systems, despite the simplification often made in their derivation. Here the usage of simple stochastic differential equations to characterize and classify complex dynamical systems is proposed within a Bayesian framework. To this end, we develop a dynamic system classifier (DSC). The DSC first abstracts training data of a system in terms of time dependent coefficients of the descriptive stochastic differential equation. Thereby the DSC identifies unique correlation structures within the training data. For definiteness we restrict the presentation of DSC to oscillation processes with a time dependent frequency ω(t) and damping factor γ(t). Although real systems might be more complex, this simple oscillator captures many characteristic features. The ω and γ timelines represent the abstract system characterization and permit the construction of efficient signal classifiers. Numerical experiment...

  19. Classifying Returns as Extreme

    DEFF Research Database (Denmark)

    Christiansen, Charlotte

    2014-01-01

    I consider extreme returns for the stock and bond markets of 14 EU countries using two classification schemes: One, the univariate classification scheme from the previous literature that classifies extreme returns for each market separately, and two, a novel multivariate classification scheme tha...

  20. LCC: Light Curves Classifier

    Science.gov (United States)

    Vo, Martin

    2017-08-01

    Light Curves Classifier uses data mining and machine learning to obtain and classify desired objects. This task can be accomplished by attributes of light curves or any time series, including shapes, histograms, or variograms, or by other available information about the inspected objects, such as color indices, temperatures, and abundances. After specifying features which describe the objects to be searched, the software trains on a given training sample, and can then be used for unsupervised clustering for visualizing the natural separation of the sample. The package can also be used for automatically tuning the parameters of the methods used (for example, number of hidden neurons or binning ratio). Trained classifiers can be used for filtering outputs from astronomical databases or data stored locally. The Light Curve Classifier can also be used for simple downloading of light curves and all available information of queried stars. It natively can connect to OgleII, OgleIII, ASAS, CoRoT, Kepler, Catalina and MACHO, and new connectors or descriptors can be implemented. In addition to direct usage of the package and command line UI, the program can be used through a web interface. Users can create jobs for "training" methods on given objects, querying databases and filtering outputs by trained filters. Preimplemented descriptors, classifiers and connectors can be picked by simple clicks and their parameters can be tuned by giving ranges of these values. All combinations are then calculated and the best one is used for creating the filter. Natural separation of the data can be visualized by unsupervised clustering.

  1. Classifier in Age classification

    Directory of Open Access Journals (Sweden)

    B. Santhi

    2012-12-01

    Full Text Available The face is an important feature of human beings. We can derive various properties of a human by analyzing the face. The objective of the study is to design a classifier for age using facial images. Age classification is essential in many applications like crime detection, employment and face detection. The proposed algorithm contains four phases: preprocessing, feature extraction, feature selection and classification. The classification employs two class labels, namely Child and Old. This study addresses the limitations in the existing classifiers, as it uses the Grey Level Co-occurrence Matrix (GLCM) for feature extraction and Support Vector Machine (SVM) for classification. This improves the accuracy of the classification as it outperforms the existing methods.
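
    As an illustration of the feature-extraction step, the sketch below builds a normalized grey level co-occurrence matrix for one pixel offset and derives two common texture features from it. The tiny image, the offset, and the function names are assumptions for illustration; the study's preprocessing, feature selection, and SVM stages are not reproduced.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix: entry [i][j] is the fraction of pixel
    pairs at offset (dx, dy) whose grey levels are i (first) and j (second)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """Weights each pair by its squared grey-level difference."""
    return sum(p[i][j] * (i - j) ** 2
               for i in range(len(p)) for j in range(len(p)))


def energy(p):
    """Sum of squared entries; large for uniform textures."""
    return sum(v * v for row in p for v in row)


# Hypothetical 4x4 image quantized to 4 grey levels.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(image, levels=4)          # horizontal neighbour, distance 1
features = [contrast(p), energy(p)]
print(features)
```

    A feature vector of this kind, computed per face image, is what a classifier such as an SVM would then take as input.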

  2. Classifying Linear Canonical Relations

    OpenAIRE

    Lorand, Jonathan

    2015-01-01

    In this Master's thesis, we consider the problem of classifying, up to conjugation by linear symplectomorphisms, linear canonical relations (lagrangian correspondences) from a finite-dimensional symplectic vector space to itself. We give an elementary introduction to the theory of linear canonical relations and present partial results toward the classification problem. This exposition should be accessible to undergraduate students with a basic familiarity with linear algebra.

  3. Intelligent Garbage Classifier

    Directory of Open Access Journals (Sweden)

    Ignacio Rodríguez Novelle

    2008-12-01

    Full Text Available IGC (Intelligent Garbage Classifier) is a system for visual classification and separation of solid waste products. Currently, an important part of the separation effort is based on manual work, from household separation to industrial waste management. Taking advantage of the technologies currently available, a system has been built that can analyze images from a camera and control a robot arm and conveyor belt to automatically separate different kinds of waste.

  4. Generalized classifier neural network.

    Science.gov (United States)

    Ozyildirim, Buse Melis; Avci, Mutlu

    2013-03-01

    In this work a new radial basis function based classification neural network, named the generalized classifier neural network, is proposed. The proposed generalized classifier neural network has five layers, unlike other radial basis function based neural networks such as the generalized regression neural network and the probabilistic neural network. They are the input, pattern, summation, normalization and output layers. In addition to the topological difference, the proposed neural network has gradient descent based optimization of the smoothing parameter and calculation improvements from an added diverge effect term. The diverge effect term is an improvement on the summation layer calculation to supply additional separation ability and flexibility. Performance of the generalized classifier neural network is compared with that of the probabilistic neural network, multilayer perceptron algorithm and radial basis function neural network on 9 different data sets, and with that of the generalized regression neural network on 3 different data sets that include only two classes, in the MATLAB environment. Better classification performance, up to 89%, is observed. The improved classification performance demonstrates the effectiveness of the proposed neural network.
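
    For orientation, the sketch below shows the probabilistic neural network that the abstract uses as a baseline, not the proposed generalized classifier neural network: a pattern layer of Gaussian kernels, a summation layer that averages kernel responses per class, and an output layer that picks the largest score. The normalization layer, the diverge effect term, and the gradient descent tuning of the smoothing parameter are omitted. The function name, training points, and sigma value are hypothetical.

```python
import math


def pnn_predict(train, x, sigma=0.5):
    """Classify x with a basic probabilistic neural network.
    train is a list of (feature_tuple, label); sigma is the smoothing parameter."""
    sums, counts = {}, {}
    for feats, label in train:
        # Pattern layer: one Gaussian kernel per training sample.
        d2 = sum((a - b) ** 2 for a, b in zip(feats, x))
        sums[label] = sums.get(label, 0.0) + math.exp(-d2 / (2 * sigma ** 2))
        counts[label] = counts.get(label, 0) + 1
    # Summation layer: average kernel response for each class.
    scores = {label: sums[label] / counts[label] for label in sums}
    # Output layer: the class with the largest score wins.
    return max(scores, key=scores.get), scores


# Hypothetical two-class training set in two dimensions.
train = [((0.0, 0.0), "a"), ((0.0, 1.0), "a"),
         ((5.0, 5.0), "b"), ((6.0, 5.0), "b")]
label, scores = pnn_predict(train, (0.2, 0.3))
print(label)
```

    The smoothing parameter sigma is fixed by hand here; the abstract's point is that optimizing it by gradient descent is one of the improvements of the proposed network.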

  5. Application of EST Data of Siderophore Regulation Protein-like Gene of Aspergillus oryzae RIB40

    Institute of Scientific and Technical Information of China (English)

    2003-01-01

    Objective: To acquire a new clear color phenotype of Aspergillus oryzae by the antisense strategy of the siderophore regulation protein (SREP)-like gene. Methods: Construct the cDNA library of Aspergillus oryzae RIB40 and amplify the fragment ac7336f, which had high homology with the SREP gene of other species, from the EST clone, then construct the eukaryotic expression vector with the SREP-like gene using the antisense strategy. Results: The sequence of this SREP-like gene was acquired, and the vector was successfully constructed. Conclusion: The deduced amino acid sequence of the SREP-like gene of Aspergillus oryzae indicated high homology with those of the SREP genes of Penicillium chrysogenum, Neurospora crassa and Schizosaccharomyces pombe. Objective: To obtain a new phenotype of Aspergillus oryzae by antisense RNA technology. Methods: The BLAST web service was used to compare Aspergillus oryzae EST data for homology; the fragment ac7336f, which is highly homologous to the siderophore regulation protein (SREP) genes of other species, was amplified by PCR and inserted in reverse orientation into the pUSA eukaryotic expression vector. Results: The sequence of the SREP-like gene was determined and the antisense expression vector was constructed. Conclusion: The amino acid sequence deduced from the DNA sequence of the SREP-like gene is highly homologous to the amino acid sequences of the SREP genes of Penicillium chrysogenum, Neurospora crassa, Schizosaccharomyces pombe and other species.

  6. High Performance Medical Classifiers

    Science.gov (United States)

    Fountoukis, S. G.; Bekakos, M. P.

    2009-08-01

    In this paper, parallelism methodologies for mapping rules derived by machine learning algorithms onto both software and hardware are investigated. When these algorithms are fed with patient disease data, they output medical diagnostic decision trees and their corresponding rules. These rules can be mapped onto multithreaded object oriented programs and hardware chips. The programs can simulate the working of the chips and can exhibit the inherent parallelism of the chip design. The circuit of a chip can consist of many blocks, which operate concurrently for various parts of the whole circuit. Threads and inter-thread communication can be used to simulate the blocks of the chips and the combination of block output signals. The chips and the corresponding parallel programs constitute medical classifiers, which can classify new patient instances. Measures taken from the patients can be fed both into chips and parallel programs and can be recognized according to the classification rules incorporated in the design of the chips and the programs. The chips and the programs constitute medical decision support systems and can be incorporated into portable micro devices, assisting physicians in their everyday diagnostic practice.

  7. Derivation of a Markov state model of the dynamics of a protein-like chain immersed in an implicit solvent.

    Science.gov (United States)

    Schofield, Jeremy; Bayat, Hanif

    2014-09-07

    A Markov state model of the dynamics of a protein-like chain immersed in an implicit hard sphere solvent is derived from first principles for a system of monomers that interact via discontinuous potentials designed to account for local structure and bonding in a coarse-grained sense. The model is based on the assumption that the implicit solvent interacts on a fast time scale with the monomers of the chain compared to the time scale for structural rearrangements of the chain and provides sufficient friction so that the motion of monomers is governed by the Smoluchowski equation. A microscopic theory for the dynamics of the system is developed that reduces to a Markovian model of the kinetics under well-defined conditions. Microscopic expressions for the rate constants that appear in the Markov state model are analyzed and expressed in terms of a temperature-dependent linear combination of escape rates that themselves are independent of temperature. Excellent agreement is demonstrated between the theoretical predictions of the escape rates and those obtained through simulation of a stochastic model of the dynamics of bond formation. Finally, the Markov model is studied by analyzing the eigenvalues and eigenvectors of the matrix of transition rates, and the equilibration process for a simple helix-forming system from an ensemble of initially extended configurations to mainly folded configurations is investigated as a function of temperature for a number of different chain lengths. For short chains, the relaxation is primarily single-exponential and becomes independent of temperature in the low-temperature regime. The profile is more complicated for longer chains, where multi-exponential relaxation behavior is seen at intermediate temperatures followed by a low temperature regime in which the folding becomes rapid and single exponential. It is demonstrated that the behavior of the equilibration profile as the temperature is lowered can be understood in terms of the

  8. Identification and evaluation of metastasis-related proteins, oxysterol binding protein-like 5 and calumenin, in lung tumors.

    Science.gov (United States)

    Nagano, Kazuya; Imai, Sunao; Zhao, Xiluli; Yamashita, Takuya; Yoshioka, Yasuo; Abe, Yasuhiro; Mukai, Yohei; Kamada, Haruhiko; Nakagawa, Shinsaku; Tsutsumi, Yasuo; Tsunoda, Shin-Ichi

    2015-07-01

    Metastasis is an important prognostic factor in lung cancer; therefore, it is imperative to identify target molecules and elucidate the molecular mechanism of metastasis for developing new therapeutics and diagnosis methods. We searched for metastasis-related proteins by utilizing a novel antibody proteome technology developed in our laboratory that facilitated efficient screening of useful target proteins. Two-dimensional differential in-gel electrophoresis (2D-DIGE) analysis identified sixteen proteins, which were highly expressed in metastatic lung cancer cells, as protein candidates. Monoclonal single-chain variable fragments (scFvs) binding to candidates were isolated from a scFv-displaying phage library by affinity selection. Tissue microarray analysis of scFvs binding to candidates revealed that oxysterol binding protein-like 5 (OSBPL5) and calumenin (CALU) were expressed at significantly higher levels in the lung tissues of metastasis-positive cases than in those of metastasis-negative cases (OSBPL5; p=0.0156, CALU; p=0.0055). Furthermore, 80% of OSBPL5 and CALU double-positive cases were positive for lymph node metastasis. Consistent with these observations, overexpression of OSBPL5 and CALU promoted invasiveness of lung cancer cells. Conversely, knockdown of these proteins using respective siRNAs reversed the invasiveness of the lung cancer cells. Moreover, these proteins were expressed in lung tumor tissues, but not in normal lung tissues. In conclusion, OSBPL5 and CALU are related to metastatic potential of lung cancer cells, and they could be useful targets for cancer diagnosis and also for development of drugs against metastasis.

  9. Classifiers and Plurality: evidence from a deictic classifier language

    Directory of Open Access Journals (Sweden)

    Filomena Sandalo

    2016-12-01

    Full Text Available This paper investigates the semantic contribution of plural morphology and its interaction with classifiers in Kadiwéu. We show that Kadiwéu, a Waikurúan language spoken in South America, is a classifier language similar to Chinese but classifiers are an obligatory ingredient of all determiner-like elements, such as quantifiers, numerals, and wh-words for arguments. What all elements with classifiers have in common is that they contribute an atomized/individualized interpretation of the NP. Furthermore, this paper revisits the relationship between classifiers and number marking and challenges the common assumption that classifiers and plurals are mutually exclusive.

  10. Stack filter classifiers

    Energy Technology Data Exchange (ETDEWEB)

    Porter, Reid B [Los Alamos National Laboratory]; Hush, Don [Los Alamos National Laboratory]

    2009-01-01

    Just as linear models generalize the sample mean and weighted average, weighted order statistic models generalize the sample median and weighted median. This analogy can be continued informally to generalized additive models in the case of the mean, and Stack Filters in the case of the median. Both of these model classes have been extensively studied for signal and image processing, but it is surprising to find that for pattern classification, their treatment has been significantly one-sided. Generalized additive models are now a major tool in pattern classification and many different learning algorithms have been developed to fit model parameters to finite data. However, Stack Filters remain largely confined to signal and image processing and learning algorithms for classification are yet to be seen. This paper is a step towards Stack Filter Classifiers and it shows that the approach is interesting from both a theoretical and a practical perspective.
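    The mean/median analogy in the abstract above can be made concrete with a small sketch. It is only an illustration of a weighted mean next to a weighted median, not the Stack Filter classifier of the paper; the function names and the "lower weighted median" convention are assumptions chosen for this example.

```python
def weighted_mean(values, weights):
    """Weighted average: the linear-model side of the analogy."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_median(values, weights):
    """Lower weighted median (assumed convention): the smallest value whose
    cumulative weight reaches at least half of the total weight."""
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and equal length")
    half = sum(weights) / 2.0
    cumulative = 0.0
    for value, weight in sorted(zip(values, weights)):
        cumulative += weight
        if cumulative >= half:
            return value
    return max(values)  # only reached through floating-point round-off


if __name__ == "__main__":
    data = [1, 2, 3, 4, 100]
    ones = [1, 1, 1, 1, 1]
    # With equal weights these reduce to the sample mean and sample median.
    print(weighted_mean(data, ones), weighted_median(data, ones))
```

    With equal weights the two functions reduce to the sample mean and sample median, and the outlier 100 shifts the mean far more than the median.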

  11. Classifying TDSS Stellar Variables

    Science.gov (United States)

    Amaro, Rachael Christina; Green, Paul J.; TDSS Collaboration

    2017-01-01

    The Time Domain Spectroscopic Survey (TDSS), a subprogram of SDSS-IV eBOSS, obtains classification/discovery spectra of point-source photometric variables selected from PanSTARRS and SDSS multi-color light curves regardless of object color or lightcurve shape. Tens of thousands of TDSS spectra are already available and have been spectroscopically classified both via pipeline and by visual inspection. About half of these spectra are quasars, half are stars. Our goal is to classify the stars with their correct variability types. We do this by acquiring public multi-epoch light curves for brighter stars (rSky Survey (CSS). We then run a number of light curve analyses from VARTOOLS, a program for analyzing astronomical time-series data, to constrain variable type both for broad statistics relevant to future surveys like the Transiting Exoplanet Survey Satellite (TESS) and the Large Synoptic Survey Telescope (LSST), and to find the inevitable exotic oddballs that warrant further follow-up. Specifically, the Lomb-Scargle Periodogram and the Box-Least Squares Method are being implemented and tested against their known variable classifications and parameters in the Catalina Surveys Periodic Variable Star Catalog. Variable star classifications include RR Lyr, close eclipsing binaries, CVs, pulsating white dwarfs, and other exotic systems. The key difference between our catalog and others is that along with the light curves, we will be using TDSS spectra to help in the classification of variable type, as spectra are rich with information allowing estimation of physical parameters like temperature, metallicity, gravity, etc. This work was supported by the SDSS Research Experience for Undergraduates program, which is funded by a grant from Sloan Foundation to the Astrophysical Research Consortium.

  12. Botnet analysis using ensemble classifier

    Directory of Open Access Journals (Sweden)

    Anchit Bijalwan

    2016-09-01

    Full Text Available This paper analyses botnet traffic using an ensemble-of-classifiers algorithm to find bot evidence. We used the ISCX dataset for training and testing. We extracted the features of both the training and testing datasets, divided them into two classes, normal traffic and botnet traffic, and labelled them. We then applied the ensemble-of-classifiers algorithm using a modern data mining tool. Our experimental results show that an ensemble of classifiers finds bot evidence better than a single classifier. Ensemble-based classifiers outperform a single classifier either by combining the strengths of multiple algorithms or by diversifying the same classifier through varied input. With the voting method of the ensemble-based classifier, accuracy increases from 93.37% to 96.41%.
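    The voting idea mentioned in the abstract above can be sketched as follows. This is a minimal illustration of plain majority voting, assuming each base classifier has already produced one label per traffic record; the labels and prediction lists are made-up toy data, not the ISCX dataset or the authors' tool.

```python
from collections import Counter


def majority_vote(predictions):
    """Fuse per-classifier label lists into one list by majority vote.

    predictions: list with one entry per classifier, each a list of labels
    for the same records in the same order. An odd number of classifiers
    avoids ties in the two-class case.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def accuracy(predicted, truth):
    """Fraction of records whose predicted label matches the true label."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


if __name__ == "__main__":
    truth = ["bot", "bot", "bot", "normal"]  # hypothetical ground truth
    clf_outputs = [                          # hypothetical base classifiers
        ["bot", "normal", "bot", "normal"],
        ["bot", "bot", "normal", "normal"],
        ["normal", "bot", "bot", "normal"],
    ]
    for out in clf_outputs:
        print("single:", accuracy(out, truth))
    print("ensemble:", accuracy(majority_vote(clf_outputs), truth))
```

    In this toy case each base classifier errs on a different record, so the vote corrects every individual mistake; real gains depend on how correlated the base classifiers' errors are.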

  13. Preparation of protein-like silver-cysteine hybrid nanowires and application in ultrasensitive immunoassay of cancer biomarker.

    Science.gov (United States)

    Chen, Wenjuan; Zheng, Liyan; Wang, Meilan; Chi, Yuwu; Chen, Guonan

    2013-10-15

    Novel protein-like silver-cysteine hybrid nanowires (p-SCNWs) have been synthesized by a green, simple, nontemplate, seedless, and one-step aqueous-phase approach. AgNO3 and l-cysteine were dissolved in distilled water, forming Ag-cysteine precipitates and HNO3. Under vigorous stirring, the pH of the solution was rapidly adjusted to 9.0 by addition of concentrated sodium hydroxide solution, leading to quick dissolution of the Ag-cysteine precipitates and sudden appearance of white precipitates of p-SCNWs. The p-SCNWs are monodispersed nanowires with diameter of 100 nm and length of tens of micrometers, and have abundant carboxyl (-COOH) and amine (-NH2) groups at their surfaces, large amounts of peptide-linkages and S-bonding silver ions (Ag(+)) inside, making them look and act like Ag-hybrid protein nanostructures. The abundant -COOH and -NH2 groups at the surfaces of p-SCNWs have been found to facilitate the reactions between the p-SCNWs and proteins including antibodies. Furthermore, the fact that the p-SCNWs contain large amounts of silver ions enables biofunctionalized p-SCNWs to be excellent signal amplifying chemiluminescence labels for ultrasensitive and highly selective detection of important antigens, such as cancer biomarkers. In this work, the immunoassay of carcinoembryonic antigen (CEA) in human serum was taken as an example to demonstrate the immunoassay applications of antibody-functionalized p-SCNWs. By the novel p-SCNW labels, CEA can be detected in the linear range from 5 to 400 fg/mL with a limit of detection (LOD) of 2.2 fg/mL (at signal-to-noise ratio of 3), which is much lower than that obtained by commercially available enzyme-linked immunosorbent assay (ELISA). Therefore, the synthesized p-SCNWs are envisioned to be an excellent carrier for proteins and related immunoassay strategy would have promising applications in ultrasensitive clinical screening of cancer biomarkers for early diagnostics of cancers.

  14. A Biologically Inspired Classifier

    CERN Document Server

    Bagnoli, Franco

    2007-01-01

    We present a method for measuring the distance among records based on the correlations of data stored in the corresponding database entries. The original method (F. Bagnoli, A. Berrones and F. Franci. Physica A 332 (2004) 509-518) was formulated in the context of opinion formation. The opinions expressed over a set of topics give rise to a "knowledge network" among individuals, where two individuals are nearer the more similar their expressed opinions are. Assuming that individuals' opinions are stored in a database, the authors show that it is possible to anticipate an opinion using the correlations in the database. This corresponds to approximating the overlap between the tastes of two individuals with the correlations of their expressed opinions. In this paper we extend this model to nonlinear matching functions, inspired by biological problems such as microarray (probe-sample pairing). We investigate numerically the error between the correlation and the overlap matrix for eight sequences of reference with r...

  15. Emergent behaviors of classifier systems

    Energy Technology Data Exchange (ETDEWEB)

    Forrest, S.; Miller, J.H.

    1989-01-01

    This paper discusses some examples of emergent behavior in classifier systems, describes some recently developed methods for studying them based on dynamical systems theory, and presents some initial results produced by the methodology. The goal of this work is to find techniques for noticing when interesting emergent behaviors of classifier systems emerge, to study how such behaviors might emerge over time, and make suggestions for designing classifier systems that exhibit preferred behaviors. 20 refs., 1 fig.

  16. Novel overlapping coding sequences in Chlamydia trachomatis

    DEFF Research Database (Denmark)

    Jensen, Klaus Thorleif; Petersen, Lise; Falk, Søren;

    2006-01-01

    Chlamydia trachomatis is the aetiological agent of trachoma and sexually transmitted infections. The C. trachomatis genome sequence revealed an organism adapted to the intracellular habitat with a high coding ratio and a small genome consisting of 1.042-kilobase (kb) with 895 annotated protein...... of the novel genes in C. trachomatis Serovar A and Chlamydia muridarum. Several of the genes have typical gene-like and protein-like features. Furthermore, we confirm transcriptional activity from 10 of the putative genes. The combined evidence suggests that at least seven of the 15 are protein coding genes...

  17. Classifying Coding DNA with Nucleotide Statistics

    Directory of Open Access Journals (Sweden)

    Nicolas Carels

    2009-10-01

    Full Text Available In this report, we compared the success rate of classification of coding sequences (CDS) vs. introns by Codon Structure Factor (CSF) and by a method that we called Universal Feature Method (UFM). UFM is based on the scoring of purine bias (Rrr) and stop codon frequency. We show that the success rate of CDS/intron classification by UFM is higher than by CSF. UFM classifies ORFs as coding or non-coding through a score based on (i) the stop codon distribution, (ii) the product of purine probabilities in the three positions of nucleotide triplets, (iii) the product of Cytosine (C), Guanine (G), and Adenine (A) probabilities in the 1st, 2nd, and 3rd positions of triplets, respectively, (iv) the probabilities of G in 1st and 2nd position of triplets and (v) the distance of their GC3 vs. GC2 levels to the regression line of the universal correlation. More than 80% of CDSs (true positives) of Homo sapiens (>250 bp), Drosophila melanogaster (>250 bp) and Arabidopsis thaliana (>200 bp) are successfully classified with a false positive rate lower or equal to 5%. The method releases coding sequences in their coding strand and coding frame, which allows their automatic translation into protein sequences with 95% confidence. The method is a natural consequence of the compositional bias of nucleotides in coding sequences.
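    Two of the ingredients named in the abstract above, stop codon frequency and purine content per codon position, can be computed with a short sketch. This is not the UFM score itself, whose exact formula is not given here; the function names and the example sequence are assumptions for illustration only.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code, DNA alphabet
PURINES = {"A", "G"}


def codons(seq, frame=0):
    """Split a DNA string into complete triplets starting at the given frame."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def stop_codon_frequency(seq, frame=0):
    """Fraction of triplets in the chosen frame that are stop codons."""
    triplets = codons(seq, frame)
    if not triplets:
        return 0.0
    return sum(t in STOP_CODONS for t in triplets) / len(triplets)


def purine_fraction_by_position(seq, frame=0):
    """Fraction of purines (A or G) at codon positions 1, 2 and 3."""
    triplets = codons(seq, frame)
    if not triplets:
        return (0.0, 0.0, 0.0)
    return tuple(
        sum(t[pos] in PURINES for t in triplets) / len(triplets)
        for pos in range(3)
    )


if __name__ == "__main__":
    example = "ATGGCCTAAGGA"  # hypothetical toy sequence
    for frame in range(3):
        print(frame, stop_codon_frequency(example, frame),
              purine_fraction_by_position(example, frame))
```

    Comparing these statistics across the three frames is one simple way to see why a frame with few stop codons and a characteristic purine pattern looks more coding-like than the others.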

  18. Feature Selection and Effective Classifiers.

    Science.gov (United States)

    Deogun, Jitender S.; Choubey, Suresh K.; Raghavan, Vijay V.; Sever, Hayri

    1998-01-01

    Develops and analyzes four algorithms for feature selection in the context of rough set methodology. Experimental results confirm the expected relationship between the time complexity of these algorithms and the classification accuracy of the resulting upper classifiers. When compared, results of upper classifiers perform better than lower…

  19. Large margin classifier-based ensemble tracking

    Science.gov (United States)

    Wang, Yuru; Liu, Qiaoyuan; Yin, Minghao; Wang, ShengSheng

    2016-07-01

    In recent years, many studies have treated visual tracking as a two-class classification problem. The key problem is to construct a classifier that is accurate enough to distinguish the target from its background and generalizes well enough to handle new frames. However, variable tracking conditions challenge existing methods. The difficulty mainly comes from the confused boundary between the foreground and background. This paper handles this difficulty by generalizing the classifier's learning step. By introducing the distribution data of samples, the classifier learns more essential characteristics for discriminating the two classes. Specifically, the samples are represented in a multiscale visual model. For features with different scales, several large margin distribution machines (LDMs) with adaptive kernels are combined in a Bayesian way as a strong classifier. To improve accuracy and generalization ability, not only the margin distance but also the sample distribution is optimized in the learning step. Comprehensive experiments are performed on several challenging video sequences; through parameter analysis and field comparison, the proposed LDM combined ensemble tracker is demonstrated to perform with sufficient accuracy and generalization ability in handling various typical tracking difficulties.

  20. Sampling Based Average Classifier Fusion

    Directory of Open Access Journals (Sweden)

    Jian Hou

    2014-01-01

    fusion algorithms have been proposed in the literature, average fusion is almost always selected as the baseline for comparison. Little is done on exploring the potential of average fusion and proposing a better baseline. In this paper we empirically investigate the behavior of soft labels and classifiers in average fusion. As a result, we find that, by proper sampling of soft labels and classifiers, the average fusion performance can be evidently improved. This result presents sampling based average fusion as a better baseline; that is, a newly proposed classifier fusion algorithm should at least perform better than this baseline in order to demonstrate its effectiveness.
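    Average fusion of soft labels, as discussed in the abstract above, can be sketched in a few lines. The sampling step shown here is a plain uniform random subset of classifiers, which is an assumption made for illustration; the abstract does not specify the authors' sampling scheme, and all names and numbers below are hypothetical.

```python
import random


def average_fusion(soft_labels):
    """Average the class-probability vectors of several classifiers.

    soft_labels: one probability vector per classifier, all for the same
    sample and with classes in the same order.
    """
    n = len(soft_labels)
    n_classes = len(soft_labels[0])
    return [sum(vec[c] for vec in soft_labels) / n for c in range(n_classes)]


def predict(fused):
    """Index of the class with the highest fused score."""
    return max(range(len(fused)), key=fused.__getitem__)


def sampled_average_fusion(soft_labels, k, rng):
    """Average fusion over a random subset of k classifiers (assumed scheme)."""
    return average_fusion(rng.sample(soft_labels, k))


if __name__ == "__main__":
    labels = [[0.75, 0.25], [0.5, 0.5], [0.25, 0.75], [1.0, 0.0]]
    fused = average_fusion(labels)
    print(fused, predict(fused))
    print(sampled_average_fusion(labels, 2, random.Random(0)))
```

    Passing an explicit `random.Random` instance keeps the subset selection reproducible, which matters when the fused result is used as a comparison baseline.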

  1. Classified

    CERN Multimedia

    Computer Security Team

    2011-01-01

    In the last issue of the Bulletin, we have discussed recent implications for privacy on the Internet. But privacy of personal data is just one facet of data protection. Confidentiality is another one. However, confidentiality and data protection are often perceived as not relevant in the academic environment of CERN. But think twice! At CERN, your personal data, e-mails, medical records, financial and contractual documents, MARS forms, group meeting minutes (and of course your password!) are all considered to be sensitive, restricted or even confidential. And this is not all. Physics results, in particular when being preliminary and pending scrutiny, are sensitive, too. Just recently, an ATLAS collaborator copy/pasted the abstract of an ATLAS note onto an external public blog, despite the fact that this document was clearly marked as an "Internal Note". Such an act was not only embarrassing to the ATLAS collaboration, and had a negative impact on CERN’s reputation --- i...

  2. Classifying objects in LWIR imagery via CNNs

    Science.gov (United States)

    Rodger, Iain; Connor, Barry; Robertson, Neil M.

    2016-10-01

    The aim of the presented work is to demonstrate enhanced target recognition and improved false alarm rates for a mid to long range detection system, utilising a Long Wave Infrared (LWIR) sensor. By exploiting high quality thermal image data and recent techniques in machine learning, the system can provide automatic target recognition capabilities. A Convolutional Neural Network (CNN) is trained and the classifier achieves an overall accuracy of > 95% for 6 object classes related to land defence. While the highly accurate CNN struggles to recognise long range target classes, due to low signal quality, robust target discrimination is achieved for challenging candidates. The overall performance of the methodology presented is assessed using human ground truth information, generating classifier evaluation metrics for thermal image sequences.

  3. Chemical evolution of life-like system under hydrothermal environments: prebiotic formation, degradation, and functions regarding protein-like molecules and RNA

    Science.gov (United States)

    Kawamura, Kunio

    The accumulation of biopolymers without enzymes is an essential step for the chemical evolution towards a primitive life-like system. Previously, we discussed the relationship between the RNA world hypothesis and the hydrothermal origin of life hypothesis on the basis of the empirical data of RNA behaviors under the hydrothermal environments examined using real-time monitoring technique for hydrothermal reactions within the millisecond to second time scale. On the other hand, we have also examined the stabilities and behaviors of amino acids, peptides, and proteins under the hydrothermal environments. These observations have shown the possibility that oligopeptides could have been accumulated under near submarine hydrothermal vent environments on primitive Earth within the relatively short time scale. However, the formation of oligopeptides under the simulated hydrothermal conditions is not so effective in the absence of catalysts and condensation agents. Thus, the investigation of the roles of mineral catalysis and condensation reagents are very important since these materials could have enhanced efficiently the formation of peptides and stabilize primitive protein-like molecules. Recently, we investigated the roles of condensation reagents for the elongation of oligopeptides in the presence of minerals. In addition, we have designed a mineral-mediated hydrothermal flow reactor system (MHFR), which enables monitoring hydrothermal reactions in the presence of solid particles. By using MHFR, we attempted to examine naturally occurring minerals, such as apatite and quartz, for the elongation of oligopeptides at temperatures over 200 °C within 10-30 sec. According to these data, the chemical evolution of protein-like molecules on primitive Earth will be discussed.

  4. Optimally Training a Cascade Classifier

    CERN Document Server

    Shen, Chunhua; Hengel, Anton van den

    2010-01-01

    Cascade classifiers are widely used in real-time object detection. Different from conventional classifiers that are designed for a low overall classification error rate, a classifier in each node of the cascade is required to achieve an extremely high detection rate and moderate false positive rate. Although there are a few reported methods addressing this requirement in the context of object detection, there is no principled feature selection method that explicitly takes into account this asymmetric node learning objective. We provide such an algorithm here. We show a special case of the biased minimax probability machine has the same formulation as the linear asymmetric classifier (LAC) of \cite{wu2005linear}. We then design a new boosting algorithm that directly optimizes the cost function of LAC. The resulting totally-corrective boosting algorithm is implemented by the column generation technique in convex optimization. Experimental results on object detection verify the effectiveness of the proposed bo...
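    The asymmetric node requirement described in the abstract above follows from how a cascade combines its nodes, which the sketch below illustrates. It assumes that a window is accepted only if every node accepts it and that node decisions are independent, so per-node rates multiply; the stage functions and thresholds are hypothetical and unrelated to the boosting algorithm of the paper.

```python
def cascade_predict(stages, x):
    """Accept x only if every stage accepts it; reject at the first failure.

    stages: list of (score_function, threshold) pairs, cheapest first.
    """
    for score_fn, threshold in stages:
        if score_fn(x) < threshold:
            return False  # early rejection: later stages are never evaluated
    return True


def cascade_rates(node_detection, node_false_positive, n_nodes):
    """Overall rates of n identical, independent nodes (simplifying assumption)."""
    return node_detection ** n_nodes, node_false_positive ** n_nodes


if __name__ == "__main__":
    # Hypothetical two-stage cascade on a scalar input.
    stages = [(lambda x: x, 0.2), (lambda x: x * x, 0.5)]
    print(cascade_predict(stages, 0.9), cascade_predict(stages, 0.1))
    # High per-node detection with moderate per-node false positives.
    print(cascade_rates(0.99, 0.5, 20))
```

    Because the overall detection rate is a product over nodes, even a small per-node loss compounds quickly, while a moderate per-node false positive rate is driven down geometrically; this is why each node favors detection rate over false positive rate.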

  5. Combining different types of classifiers

    OpenAIRE

    Gatnar, Eugeniusz

    2008-01-01

    Model fusion has proved to be a very successful strategy for obtaining accurate models in classification and regression. The key issue, however, is the diversity of the component classifiers because classification error of an ensemble depends on the correlation between its members. The majority of existing ensemble methods combine the same type of models, e.g. trees. In order to promote the diversity of the ensemble members, we propose to aggregate classifiers of different t...

  6. Optimal weighted nearest neighbour classifiers

    CERN Document Server

    Samworth, Richard J

    2011-01-01

    We derive an asymptotic expansion for the excess risk (regret) of a weighted nearest-neighbour classifier. This allows us to find the asymptotically optimal vector of non-negative weights, which has a rather simple form. We show that the ratio of the regret of this classifier to that of an unweighted $k$-nearest neighbour classifier depends asymptotically only on the dimension $d$ of the feature vectors, and not on the underlying population densities. The improvement is greatest when $d=4$, but thereafter decreases as $d \rightarrow \infty$. The popular bagged nearest neighbour classifier can also be regarded as a weighted nearest neighbour classifier, and we show that its corresponding weights are somewhat suboptimal when $d$ is small (in particular, worse than those of the unweighted $k$-nearest neighbour classifier when $d=1$), but are close to optimal when $d$ is large. Finally, we argue that improvements in the rate of convergence are possible under stronger smoothness assumptions, provided we allow nega...
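    A weighted nearest-neighbour classifier of the kind analysed in the abstract above can be sketched as follows for two classes. The weight vectors used here are arbitrary illustrative choices, not the asymptotically optimal weights derived in the paper, and the function name and toy data are assumptions.

```python
import math


def weighted_nn_classify(train_x, train_y, query, weights):
    """Two-class weighted nearest-neighbour rule.

    train_y holds labels 0 or 1. weights[i] is the weight given to the
    (i+1)-th nearest training point; points beyond len(weights) get weight 0.
    The query is assigned class 1 if the weighted vote for class 1 reaches
    half of the total weight.
    """
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    score = sum(w * train_y[i] for w, i in zip(weights, order))
    return 1 if score >= 0.5 * sum(weights) else 0


if __name__ == "__main__":
    xs = [(0.0,), (1.0,), (2.0,), (3.0,)]  # hypothetical 1-D training points
    ys = [0, 1, 1, 0]
    q = (0.4,)
    # Unweighted 3-NN is the special case of equal weights 1/k.
    print(weighted_nn_classify(xs, ys, q, [1 / 3] * 3))
    # Decreasing weights emphasise the closest neighbour.
    print(weighted_nn_classify(xs, ys, q, [0.6, 0.3, 0.1]))
```

    The two calls disagree on this toy query: equal weights follow the majority of the three nearest points, while decreasing weights let the single closest point dominate.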

  7. Hybrid classifiers methods of data, knowledge, and classifier combination

    CERN Document Server

    Wozniak, Michal

    2014-01-01

    This book offers a concise account of how hybridization can help improve the quality of computer classification systems. To give readers a clear understanding of hybridization, it focuses primarily on introducing the different levels of hybridization and explaining the problems encountered in such projects. It first addresses the data and knowledge incorporated in hybridization, and then considers a still-growing area of classifier systems known as combined classifiers. The book covers these state-of-the-art topics along with the latest research results of the author and his team from the Department of Systems and Computer Networks, Wroclaw University of Technology, including classifiers based on feature space splitting, one-class classification, imbalanced data, and data stream classification.

  8. The phylogenetic relationships of the hat-shaped ascospore-forming, nitrate-assimilating Pichia species, formerly classified in the genus Hansenula Sydow et Sydow, based on the partial sequences of 18S and 26S ribosomal RNAs (Saccharomycetaceae): the proposals of three new genera, Ogataea, Kuraishia, and Nakazawaea.

    Science.gov (United States)

    Yamada, Y; Maeda, K; Mikata, K

    1994-07-01

    The twenty-seven strains of the hat-shaped ascospore-forming, nitrate-assimilating species, formerly classified in the genus Hansenula, of the genus Pichia were examined for their 18S and 26S rRNA partial base sequencings. All the strains examined were separate phylogenetically from the type strain of P. membranaefaciens (type species of genus Pichia). Based on the sequence data obtained [by number of base differences (five or more) with P. anomala and base sequences on fingerprint segment] in the 18S rRNA partial base sequences, these species were divided into seven groups. Group I, including P. anomala (identical to H. anomala, type species of genus Hansenula), P. canadensis, P. muscicola, P. silvicola, P. subpelliculosa, P. americana, P. bimundalis, P. ciferrii, P. syndowiorum, P. bispora, and P. fabianii, corresponded to the genus Hansenula Sydow et Sydow. Groups II and III were comprised of P. capsulata and P. holstii, respectively. Group IV included P. angusta, P. minuta var. minuta, P. minuta var. nonfermentans, P. philodendra, P. glucozyma, and P. henricii. Groups V, VI, and VII included P. jadinii, P. petersonii, and P. dryadoides, respectively. The nitrate assimilation-negative species, P. wickerhamii was phylogenetically distant from P. membranaefaciens. The seven groupings are discussed phylogenetically and taxonomically. For Groups IV, II, and III, the three new genera were proposed as Ogataea, Kuraishia, and Nakazawaea, respectively, with the type species, O. minuta (identical to P. minuta), K. capsulata (identical to P. capsulata), and N. holstii (identical to P. holstii).

  9. 3D Bayesian contextual classifiers

    DEFF Research Database (Denmark)

    Larsen, Rasmus

    2000-01-01

    We extend a series of multivariate Bayesian 2-D contextual classifiers to 3-D by specifying a simultaneous Gaussian distribution for the feature vectors as well as a prior distribution of the class variables of a pixel and its 6 nearest 3-D neighbours.

  10. Maximum margin Bayesian network classifiers.

    Science.gov (United States)

    Pernkopf, Franz; Wohlmayr, Michael; Tschiatschek, Sebastian

    2012-03-01

    We present a maximum margin parameter learning algorithm for Bayesian network classifiers using a conjugate gradient (CG) method for optimization. In contrast to previous approaches, we maintain the normalization constraints on the parameters of the Bayesian network during optimization, i.e., the probabilistic interpretation of the model is not lost. This enables us to handle missing features in discriminatively optimized Bayesian networks. In experiments, we compare the classification performance of maximum margin parameter learning to conditional likelihood and maximum likelihood learning approaches. Discriminative parameter learning significantly outperforms generative maximum likelihood estimation for naive Bayes and tree augmented naive Bayes structures on all considered data sets. Furthermore, maximizing the margin dominates the conditional likelihood approach in terms of classification performance in most cases. We provide results for a recently proposed maximum margin optimization approach based on convex relaxation. While the classification results are highly similar, our CG-based optimization is computationally up to orders of magnitude faster. Margin-optimized Bayesian network classifiers achieve classification performance comparable to support vector machines (SVMs) using fewer parameters. Moreover, we show that unanticipated missing feature values during classification can be easily processed by discriminatively optimized Bayesian network classifiers, a case where discriminative classifiers usually require mechanisms to complete unknown feature values in the data first.

  11. Classifying Cereal Data (Earlier Methods)

    Science.gov (United States)

    The DSQ includes questions about cereal intake and allows respondents up to two responses on which cereals they consume. We classified each cereal reported first by hot or cold, and then along four dimensions: density of added sugars, whole grains, fiber, and calcium.

  12. On the computational complexity of sequence design problems

    Energy Technology Data Exchange (ETDEWEB)

    Hart, W.E. [Sandia National Labs., Albuquerque, NM (United States). Algorithms and Discrete Mathematics Dept.

    1996-12-31

    Inverse protein folding concerns the identification of an amino acid sequence that folds to a given structure. Sequence design problems attempt to avoid the apparent difficulty of inverse protein folding by defining an energy that can be minimized to find protein-like sequences. The authors evaluate the practical relevance of two sequence design problems by analyzing their computational complexity. They show that the canonical method of sequence design is intractable, and describe approximation algorithms for this problem. The authors also describe an efficient algorithm that exactly solves the grand canonical method. The analysis shows how sequence design problems can fail to reduce the difficulty of the inverse protein folding problem, and highlights the need to analyze these problems to evaluate their practical relevance.

  13. Classifying self-gravitating radiations

    CERN Document Server

    Kim, Hyeong-Chan

    2016-01-01

    We study static systems of self-gravitating radiations confined in a sphere by using numerical and analytic calculations. We classify and analyze the solutions systematically. Due to the scaling symmetry, any solution can be represented as a segment of a solution curve on a plane of two-dimensional scale invariant variables. We find that a system can be conveniently parametrized by three parameters representing the solution curve, the scaling, and the system size, instead of the parameters defined at the outer boundary. The solution curves are classified into three types representing regular solutions, and conically singular solutions with and without an object which resembles an event horizon up to causal disconnectedness. For the last type, the behavior of a self-gravitating system is simple enough to allow analytic calculations.

  14. 76 FR 34761 - Classified National Security Information

    Science.gov (United States)

    2011-06-14

    ... Classified National Security Information AGENCY: Marine Mammal Commission. ACTION: Notice. SUMMARY: This... information, as directed by Information Security Oversight Office regulations. FOR FURTHER INFORMATION CONTACT..., ``Classified National Security Information,'' and 32 CFR part 2001, ``Classified National Security......

  15. Energy-Efficient Neuromorphic Classifiers.

    Science.gov (United States)

    Martí, Daniel; Rigotti, Mattia; Seok, Mingoo; Fusi, Stefano

    2016-10-01

    Neuromorphic engineering combines the architectural and computational principles of systems neuroscience with semiconductor electronics, with the aim of building efficient and compact devices that mimic the synaptic and neural machinery of the brain. The energy consumptions promised by neuromorphic engineering are extremely low, comparable to those of the nervous system. Until now, however, the neuromorphic approach has been restricted to relatively simple circuits and specialized functions, thereby obfuscating a direct comparison of their energy consumption to that used by conventional von Neumann digital machines solving real-world tasks. Here we show that a recent technology developed by IBM can be leveraged to realize neuromorphic circuits that operate as classifiers of complex real-world stimuli. Specifically, we provide a set of general prescriptions to enable the practical implementation of neural architectures that compete with state-of-the-art classifiers. We also show that the energy consumption of these architectures, realized on the IBM chip, is typically two or more orders of magnitude lower than that of conventional digital machines implementing classifiers with comparable performance. Moreover, the spike-based dynamics display a trade-off between integration time and accuracy, which naturally translates into algorithms that can be flexibly deployed for either fast and approximate classifications, or more accurate classifications at the mere expense of longer running times and higher energy costs. This work finally proves that the neuromorphic approach can be efficiently used in real-world applications and has significant advantages over conventional digital devices when energy consumption is considered.

  16. ANALYSIS OF BAYESIAN CLASSIFIER ACCURACY

    Directory of Open Access Journals (Sweden)

    Felipe Schneider Costa

    2013-01-01

    Full Text Available The naïve Bayes classifier is considered one of the most effective classification algorithms today, competing with more modern and sophisticated classifiers. Despite being based on the unrealistic (naïve) assumption that all variables are independent given the output class, the classifier provides proper results. However, depending on the scenario (network structure, number of samples or training cases, number of variables), the network may not provide appropriate results. This study uses a variable selection process based on the chi-squared test to verify the existence of dependence between variables in the data model, in order to identify the reasons that prevent a Bayesian network from providing good performance. A detailed analysis of the data is also proposed, unlike other existing work, as well as adjustments in case of limit values between two adjacent classes. Furthermore, variable weights, calculated with the mutual information function, are used in the calculation of a posteriori probabilities. Tests were applied to both a naïve Bayesian network and a hierarchical Bayesian network. After testing, a significant reduction in error rate was observed. The naïve Bayesian network showed a drop in error rate from twenty-five percent to five percent relative to the initial results of the classification process. In the hierarchical network, the error rate not only dropped by fifteen percent, but the final result reached zero.
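    The chi-squared check for dependence between two variables, mentioned in the abstract above, can be sketched with Pearson's statistic on a contingency table. This is a generic textbook computation, not the authors' selection procedure; the tables are made-up examples, and the sketch assumes every row and column total is non-zero.

```python
def chi_squared_statistic(table):
    """Pearson chi-squared statistic for a contingency table of counts.

    table: list of rows, each a list of observed counts. Assumes all row
    and column totals are non-zero so that expected counts are positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def degrees_of_freedom(table):
    """(rows - 1) * (columns - 1) for a test of independence."""
    return (len(table) - 1) * (len(table[0]) - 1)


# Critical value of the chi-squared distribution, 1 degree of freedom, 5% level.
CRITICAL_1DOF_05 = 3.841

if __name__ == "__main__":
    independent = [[10, 10], [10, 10]]  # hypothetical counts
    dependent = [[20, 0], [0, 20]]      # hypothetical counts
    for t in (independent, dependent):
        stat = chi_squared_statistic(t)
        print(stat, degrees_of_freedom(t), stat > CRITICAL_1DOF_05)
```

    A statistic above the critical value suggests the two variables are not independent, which is the situation in which the naïve independence assumption is most likely to hurt the classifier.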

  17. Aggregation Operator Based Fuzzy Pattern Classifier Design

    DEFF Research Database (Denmark)

    Mönks, Uwe; Larsen, Henrik Legind

    2009-01-01

    This paper presents a novel modular fuzzy pattern classifier design framework for intelligent automation systems. It is developed on the basis of the established Modified Fuzzy Pattern Classifier (MFPC) and allows designing novel classifier models which are hardware-efficiently implementable...

  18. Using RNA Sequencing to Classify Organisms into Three Primary Kingdoms.

    Science.gov (United States)

    Evans, Robert H.

    1983-01-01

    Using the biochemical record to class archaebacteria, eukaryotes, and eubacteria involves abstractions difficult for the concrete learner. Therefore, a method is provided in which students discover some basic tenets of biochemical classification and apply them in a "hands-on" classification problem. The method involves use of RNA…

  19. Fos protein-like immunoreactive neurons induced by electrical stimulation in the trigeminal sensory nuclear complex of rats with chronically injured peripheral nerve.

    Science.gov (United States)

    Fujisawa, Naoko; Terayama, Ryuji; Yamaguchi, Daisuke; Omura, Shinji; Yamashiro, Takashi; Sugimoto, Tomosada

    2012-06-01

    The rat trigeminal sensory nuclear complex (TSNC) was examined for Fos protein-like immunoreactive (Fos-LI) neurons induced by electrical stimulation (ES) of the lingual nerve (LN) at 2 weeks after injury to the LN or the inferior alveolar nerve (IAN). An intensity-dependent increase in the number of Fos-LI neurons was observed in the subnucleus oralis (Vo) and caudalis (Vc) of the spinal trigeminal tract nucleus irrespective of nerve injury. The number of Fos-LI neurons induced by ES of the chronically injured LN at A-fiber intensity (0.1 mA) was significantly increased in the Vo but not the Vc. On the other hand, in rats with chronically injured IAN, the number of Fos-LI neurons induced by ES of the LN at C-fiber intensity (10 mA) was significantly increased in the Vc but not the Vo. These results indicated that injury of a nerve innervating intraoral structures increased the c-Fos response of Vo neurons to A-fiber intensity ES of the injured nerve. A similar nerve injury enhanced the c-Fos response of Vc neurons to C-fiber intensity ES of a spared uninjured nerve innervating an intraoral territory neighboring that of the injured nerve. The present results show that nerve injury causes differential effects on c-Fos expression in the Vo and Vc, which may explain the complexity of neuropathic pain symptoms in clinical cases.

  20. A Spiking Neural Learning Classifier System

    CERN Document Server

    Howard, Gerard; Lanzi, Pier-Luca

    2012-01-01

    Learning Classifier Systems (LCS) are population-based reinforcement learners used in a wide variety of applications. This paper presents an LCS where each traditional rule is represented by a spiking neural network, a type of network with dynamic internal state. We employ a constructivist model of growth of both neurons and dendrites that realise flexible learning by evolving structures of sufficient complexity to solve a well-known problem involving continuous, real-valued inputs. Additionally, we extend the system to enable temporal state decomposition. By allowing our LCS to chain together sequences of heterogeneous actions into macro-actions, it is shown to perform optimally in a problem where traditional methods can fail to find a solution in a reasonable amount of time. Our final system is tested on a simulated robotics platform.

  1. Human Segmentation Using Haar-Classifier

    Directory of Open Access Journals (Sweden)

    Dharani S

    2014-07-01

    Full Text Available Segmentation is an important process in many multimedia applications. Fast and accurate segmentation of moving objects in video sequences is a basic task in many computer vision and video investigation applications. Human detection in particular is an active research area in computer vision. Segmentation is very useful for tracking and recognizing objects in a moving clip. The motion segmentation problem is studied and the most important techniques are reviewed. We illustrate some common methods for segmenting moving objects, including background subtraction, temporal segmentation and edge detection. Contour and threshold methods are also commonly used for segmenting objects in a moving clip. These methods are widely exploited for moving object segmentation in many video surveillance applications, such as traffic monitoring and human motion capture. In this paper, a Haar classifier is used to detect humans in a moving video clip, using features such as face detection, eye detection, and full body, upper body and lower body detection.

  2. Defining and Classifying Interest Groups

    DEFF Research Database (Denmark)

    Baroni, Laura; Carroll, Brendan; Chalmers, Adam;

    2014-01-01

    The interest group concept is defined in many different ways in the existing literature and a range of different classification schemes are employed. This complicates comparisons between different studies and their findings. One of the important tasks faced by interest group scholars engaged in large-N studies is therefore to define the concept of an interest group and to determine which classification scheme to use for different group types. After reviewing the existing literature, this article sets out to compare different approaches to defining and classifying interest groups with a sample...

  3. Fingerprint prediction using classifier ensembles

    CSIR Research Space (South Africa)

    Molale, P

    2011-11-01

    Full Text Available -based learning algorithms. Machine Learning, 6: pp: 37-66. Amit, Y., D. Geman, and K. Wilder, 1997. Joint Induction of Shape Features and Tree Classifiers. IEEE Transc. on Pattern Anal. and machine Intell., 19 (11), pp: 1300- 1305. Breiman, L., 1996. Bagging.... NIST Technical Report NISTIR 5163. Cappelli, R., A. Lumini, D. Maio., and D. Maltoni, 1999. Fingerprint Classification by Direct image Partitioning. IEEE Transc. On Pattern Anal. and Machine Intell., 21 (5), pp: 402-421. Cox, D.R., 1966. Some...

  4. Using Fuzzy Hybrid Features to Classify Strokes in Interactive Sketches

    Directory of Open Access Journals (Sweden)

    Shuxia Wang

    2013-01-01

    Full Text Available A novel method is presented based on fuzzy hybrid-based features to classify strokes into 2D line drawings, and a human computer interactive system is developed for assisting designers in conceptual design stage. Fuzzy classifiers are built based on some geometric features and speed features. The prototype system can support rapid classification based on fuzzy classifiers, and the classified stroke is then fitted with a 2D geometry primitive which could be a line segment, polyline, circle, circular arc, ellipse, elliptical arc, hyperbola, and parabola. The human computer interaction can determine the ambiguous results and then revise the misrecognitions. The test results showed that the proposed method can support online freehand sketching based on conceptual design with no limitation on drawing sequence and direction while achieving a satisfactory interpretation rate.

  5. Recognition of Characters by Adaptive Combination of Classifiers

    Institute of Scientific and Technical Information of China (English)

    WANG Fei; LI Zai-ming

    2004-01-01

    In this paper, a visual feature space based on the long horizontals, the long verticals, and the radicals is given. An adaptive combination of classifiers, whose coefficients vary with the input pattern, is also proposed. Experiments show that the approach is promising for character recognition in video sequences.

  6. Contrasting Patterns in the Evolution of Vertebrate MLX Interacting Protein (MLXIP) and MLX Interacting Protein-Like (MLXIPL) Genes.

    Directory of Open Access Journals (Sweden)

    Parmveer Singh

    Full Text Available ChREBP and MondoA are glucose-sensitive transcription factors that regulate aspects of energy metabolism. Here we performed a phylogenomic analysis of Mlxip (encoding MondoA) and Mlxipl (encoding ChREBP) genes across vertebrates. Analysis of extant Mlxip and Mlxipl genes suggests that the most recent common ancestor of these genes was composed of 17 coding exons. Single copy genes encoding both ChREBP and MondoA, along with their interacting partner Mlx, were found in diverse vertebrate genomes, including fish that have experienced a genome duplication. This observation suggests that a single Mlx gene has been retained to maintain coordinate regulation of ChREBP and MondoA. The ChREBP-β isoform, the more potent and constitutively active isoform, appeared with the evolution of tetrapods and is absent from the Mlxipl genes of fish. Evaluation of the conservation of ChREBP and MondoA sequences demonstrates that MondoA is better conserved and potentially mediates a more ancient function in glucose metabolism.

  7. Major latex protein-like protein 43 (MLP43) functions as a positive regulator during abscisic acid responses and confers drought tolerance in Arabidopsis thaliana.

    Science.gov (United States)

    Wang, Yanping; Yang, Li; Chen, Xi; Ye, Tiantian; Zhong, Bao; Liu, Ruijie; Wu, Yan; Chan, Zhulong

    2016-01-01

    Drought stress is one of the disadvantageous environmental conditions for plant growth and reproduction. Given the importance of abscisic acid (ABA) to plant growth and abiotic stress responses, identification of novel components involved in ABA signalling transduction is critical. In this study, we screened numerous Arabidopsis thaliana mutants by seed germination assay and identified a mutant mlp43 (major latex protein-like 43) with decreased ABA sensitivity in seed germination. The mlp43 mutant was sensitive to drought stress while the MLP43-overexpressed transgenic plants were drought tolerant. The tissue-specific expression pattern analysis showed that MLP43 was predominantly expressed in cotyledons, primary roots and apical meristems, and a subcellular localization study indicated that MLP43 was localized in the nucleus and cytoplasm. Physiological and biochemical analyses indicated that MLP43 functioned as a positive regulator in ABA- and drought-stress responses in Arabidopsis through regulating water loss efficiency, electrolyte leakage, ROS levels, as well as ABA-responsive gene expression. Moreover, metabolite profiling analysis indicated that MLP43 could modulate the production of primary metabolites under drought stress conditions. Reconstitution of ABA signalling components in Arabidopsis protoplasts indicated that MLP43 was involved in ABA signalling transduction and acted upstream of SnRK2s by directly interacting with SnRK2.6 and ABF1 in a yeast two-hybrid assay. Moreover, ABA and drought stress down-regulated MLP43 expression as a negative feedback loop regulation to the performance of MLP43 in ABA and drought stress responses. Therefore, this study provided new insights for interpretation of physiological and molecular mechanisms of Arabidopsis MLP43 mediating ABA signalling transduction and drought stress responses.

  8. Regulation of the SQUAMOSA PROMOTER-BINDING PROTEIN-LIKE genes/microRNA 156 Module by the Homeodomain Proteins PENNYWISE and POUND-FOOLISH in Arabidopsis

    Institute of Scientific and Technical Information of China (English)

    Shruti Lal; Leo Bryan Pacis; Harley M.S. Smith

    2011-01-01

    The morphology of inflorescences is regulated in part by the temporal and spatial events that regulate flower specification. In Arabidopsis, an endogenous flowering time pathway mediated by a subset of SQUAMOSA PROMOTER-BINDING PROTEIN-LIKE (SPL) transcription factors, including SPL3, SPL4, and SPL5, function to specify flowers by activating floral meristem identity genes. During shoot development, SPL3, SPL4, and SPL5 are post-transcriptionally regulated by microRNA156 (miR156). The photoperiod regulated florigenic signal, FLOWERING LOCUS T (FT), promotes floral induction, in part by activating SPL3, SPL4, and SPL5. In turn, these SPLs function in parallel with FT to specify flower meristems. Two related BELL1-like homeobox genes PENNYWISE (PNY) and POUND-FOOLISH (PNF) expressed in the shoot apical meristem are absolutely required for the specification of floral meristems. Genetic studies show that the floral specification function of FT depends upon PNY and PNF; however, the interplay between these homeodomain proteins and SPLs is not known. In this manuscript, we show that the photoperiodic floral induction of SPL3, SPL4, and SPL5 is dependent upon PNY and PNF. Further, PNY and PNF also control SPL3, SPL4, and SPL5 expression by negatively regulating miR156. Lastly, ectopic expression of SPL4 partially rescues the pny pnf non-flower-producing phenotype, while overexpression of SPL3 or SPL5 in pny pnf plants was unable to restore flower specification. These results suggest that: (1) SPL3, SPL4, and SPL5 function is dependent upon PNY and PNF, or (2) expression of multiple SPLs is required for floral specification in pny pnf plants.

  9. Hybrid k-Nearest Neighbor Classifier.

    Science.gov (United States)

    Yu, Zhiwen; Chen, Hantao; Liuxs, Jiming; You, Jane; Leung, Hareton; Han, Guoqiang

    2016-06-01

    Conventional k-nearest neighbor (KNN) classification approaches have several limitations when dealing with problems caused by special datasets, such as the sparse problem, the imbalance problem, and the noise problem. In this paper, we first perform a brief survey of recent progress in KNN classification approaches. Then, the hybrid KNN (HBKNN) classification approach, which takes into account the local and global information of the query sample, is designed to address the problems raised by the special datasets. Next, the random subspace ensemble framework based on the HBKNN (RS-HBKNN) classifier is proposed to perform classification on datasets with noisy attributes in high-dimensional space. Finally, nonparametric tests are adopted to compare the proposed method with other classification approaches over multiple datasets. The experiments on the real-world datasets from the Knowledge Extraction based on Evolutionary Learning dataset repository demonstrate that RS-HBKNN works well on real datasets, and outperforms most of the state-of-the-art classification approaches.
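
    For orientation, the conventional KNN baseline that this abstract starts from can be sketched as follows. This is a minimal illustration assuming squared Euclidean distance and an unweighted majority vote. It is not the HBKNN or RS-HBKNN method, and the name `knn_predict` and the toy data are hypothetical.

```python
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Label a query by majority vote among its k nearest training points."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    # Indices of training points ordered from nearest to farthest.
    order = sorted(range(len(train_points)),
                   key=lambda i: sq_dist(train_points[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


points = [[0, 0], [0, 1], [1, 0], [5, 5], [6, 5]]
labels = ["a", "a", "a", "b", "b"]
near_origin = knn_predict(points, labels, [0.5, 0.5], k=3)
near_cluster_b = knn_predict(points, labels, [5, 6], k=3)
```

    Because the vote uses only the k closest points, sparse, imbalanced, or noisy neighborhoods can sway the result, which is the kind of limitation the abstract refers to.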

  10. 75 FR 707 - Classified National Security Information

    Science.gov (United States)

    2010-01-05

    ... National Security Information Memorandum of December 29, 2009--Implementation of the Executive Order ``Classified National Security Information'' Order of December 29, 2009--Original Classification Authority #0... 13526 of December 29, 2009 Classified National Security Information This order prescribes a...

  11. Classifier Assignment by Corpus-based Approach

    CERN Document Server

    Sornlertlamvanich, V; Meknavin, S; Sornlertlamvanich, Virach; Pantachat, Wantanee; Meknavin, Surapant

    1994-01-01

    This paper presents an algorithm for selecting an appropriate classifier word for a noun. In the Thai language, the choice of classifier for a given concrete noun frequently fluctuates, both across the whole speech community and for individual speakers. Basically, there is no exact rule for classifier selection. The most we can do in a rule-based approach is to give a default rule that picks a corresponding classifier for each noun. Registration of a classifier for each noun is limited to the unit classifier type, because other types are open due to the meaning of representation. We propose a corpus-based method (Biber, 1993; Nagao, 1993; Smadja, 1993) which generates Noun Classifier Associations (NCA) to overcome the problems in classifier assignment and semantic construction of noun phrases. The NCA is created statistically from a large corpus and recomposed under concept hierarchy constraints and frequency of occurrences.

  12. Aggregation Operator Based Fuzzy Pattern Classifier Design

    DEFF Research Database (Denmark)

    Mönks, Uwe; Larsen, Henrik Legind

    2009-01-01

    This paper presents a novel modular fuzzy pattern classifier design framework for intelligent automation systems, developed on the base of the established Modified Fuzzy Pattern Classifier (MFPC) and allows designing novel classifier models which are hardware-efficiently implementable. The performances of novel classifiers using substitutes of MFPC's geometric mean aggregator are benchmarked in the scope of an image processing application against the MFPC to reveal classification improvement potentials for obtaining higher classification rates.

  13. 15 CFR 4.8 - Classified Information.

    Science.gov (United States)

    2010-01-01

    ... 15 Commerce and Foreign Trade 1 2010-01-01 2010-01-01 false Classified Information. 4.8 Section 4... INFORMATION Freedom of Information Act § 4.8 Classified Information. In processing a request for information..., the information shall be reviewed to determine whether it should remain classified. Ordinarily...

  14. Data characteristics that determine classifier performance

    CSIR Research Space (South Africa)

    Van der Walt, Christiaan M

    2006-11-01

    Full Text Available classifiers. 10-fold cross-validation is used to evaluate and compare the performance of the classifiers on the different data sets. 3.1. Artificial data generation Multivariate Gaussian distributions are used to generate artificial data sets. We use d...NN) classifier [8], the multi- layer perceptron (MLP) and support vector machines (SVMs) [9]. The NB, DT, kNN, MLP and SVM classifiers are all implementations of the machine learning package Weka [10]. The Gaussian classifier is a Matlab implementation...

  15. A Neural Network Classifier of Volume Datasets

    CERN Document Server

    Zukić, Dženan; Kolb, Andreas

    2009-01-01

    Many state-of-the-art visualization techniques must be tailored to the specific type of dataset, its modality (CT, MRI, etc.), the recorded object or anatomical region (head, spine, abdomen, etc.) and other parameters related to the data acquisition process. While parts of the information (imaging modality and acquisition sequence) may be obtained from the meta-data stored with the volume scan, there is important information which is not stored explicitly (anatomical region, tracing compound). Also, meta-data might be incomplete, inappropriate or simply missing. This paper presents a novel and simple method of determining the type of dataset from previously defined categories. 2D histograms based on intensity and gradient magnitude of datasets are used as input to a neural network, which classifies it into one of several categories it was trained with. The proposed method is an important building block for visualization systems to be used autonomously by non-experts. The method has been tested on 80 datasets,...

  16. Dynamical Logic Driven by Classified Inferences Including Abduction

    Science.gov (United States)

    Sawa, Koji; Gunji, Yukio-Pegio

    2010-11-01

    We propose a dynamical model of formal logic which realizes a representation of logical inferences, deduction and induction. In addition, it also represents abduction which is classified by Peirce as the third inference following deduction and induction. The three types of inference are represented as transformations of a directed graph. The state of a relation between objects of the model fluctuates between the collective and the distinctive. In addition, the location of the relation in the sequence of the relation influences its state.

  17. eccCL: parallelized GPU implementation of Ensemble Classifier Chains.

    Science.gov (United States)

    Riemenschneider, Mona; Herbst, Alexander; Rasch, Ari; Gorlatch, Sergei; Heider, Dominik

    2017-08-17

    Multi-label classification has recently gained great attention in diverse fields of research, e.g., in biomedical applications such as protein function prediction or drug resistance testing in HIV. In this context, the concept of Classifier Chains has been shown to improve prediction accuracy, especially when applied as Ensemble Classifier Chains. However, these techniques lack computational efficiency when applied to large amounts of data, e.g., derived from next-generation sequencing experiments. By adapting algorithms for the use of graphics processing units, computational efficiency can be greatly improved due to parallelization of computations. Here, we provide a parallelized and optimized graphics processing unit implementation (eccCL) of Classifier Chains and Ensemble Classifier Chains. In addition to the OpenCL implementation, we provide an R package with an easy-to-use R interface for parallelized graphics processing unit usage. eccCL is a handy implementation of Classifier Chains on GPUs, which is able to process more than 25,000 instances per second, and thus can be used efficiently in high-throughput experiments. The software is available at http://www.heiderlab.de.

  18. 22 CFR 125.3 - Exports of classified technical data and classified defense articles.

    Science.gov (United States)

    2010-04-01

    ... 22 Foreign Relations 1 2010-04-01 2010-04-01 false Exports of classified technical data and... IN ARMS REGULATIONS LICENSES FOR THE EXPORT OF TECHNICAL DATA AND CLASSIFIED DEFENSE ARTICLES § 125.3 Exports of classified technical data and classified defense articles. (a) A request for authority...

  19. Pavement Crack Classifiers: A Comparative Study

    Directory of Open Access Journals (Sweden)

    S. Siddharth

    2012-12-01

    Full Text Available Non-Destructive Testing (NDT) is an analysis technique used to inspect metal sheets and components without harming the product. NDT does not cause any change after inspection; this technique saves money and time in product evaluation, research and troubleshooting. In this study the objective is to perform NDT using soft computing techniques. Digital images are taken, and a Gray Level Co-occurrence Matrix (GLCM) is used to extract features from these images. The extracted features are then fed into classifiers, which sort the images into those with and without cracks. Three major classifiers are used for the classification: neural networks, Support Vector Machine (SVM) and linear classifiers. The performance of these classifiers is assessed and the best classifier for the given data is chosen.
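
    The GLCM feature extraction step mentioned in this abstract can be sketched as below. This is a minimal illustration assuming an image already quantized to a small number of gray levels, a single non-symmetric horizontal offset, and contrast as the only texture feature. The function names and the 4x4 example image are hypothetical, and the record does not specify which offsets or features the study used.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count co-occurrences of gray levels (i, j) for pixel pairs
    separated by the offset (dy, dx); returns a levels x levels matrix."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    return counts


def contrast(counts):
    """Contrast feature: sum of (i - j)^2 weighted by normalized counts."""
    total = sum(sum(row) for row in counts)
    return sum(((i - j) ** 2) * n
               for i, row in enumerate(counts)
               for j, n in enumerate(row)) / total


image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
matrix = glcm(image, levels=4)
feature = contrast(matrix)
```

    A feature vector built from values like this one (typically several features over several offsets) is what would then be passed to the classifiers.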

  20. Comparing different classifiers for automatic age estimation.

    Science.gov (United States)

    Lanitis, Andreas; Draganova, Chrisina; Christodoulou, Chris

    2004-02-01

    We describe a quantitative evaluation of the performance of different classifiers in the task of automatic age estimation. In this context, we generate a statistical model of facial appearance, which is subsequently used as the basis for obtaining a compact parametric description of face images. The aim of our work is to design classifiers that accept the model-based representation of unseen images and produce an estimate of the age of the person in the corresponding face image. For this application, we have tested different classifiers: a classifier based on the use of quadratic functions for modeling the relationship between face model parameters and age, a shortest distance classifier, and artificial neural network based classifiers. We also describe variations to the basic method where we use age-specific and/or appearance specific age estimation methods. In this context, we use age estimation classifiers for each age group and/or classifiers for different clusters of subjects within our training set. In those cases, part of the classification procedure is devoted to choosing the most appropriate classifier for the subject/age range in question, so that more accurate age estimates can be obtained. We also present comparative results concerning the performance of humans and computers in the task of age estimation. Our results indicate that machines can estimate the age of a person almost as reliably as humans.

  1. A review of learning vector quantization classifiers

    CERN Document Server

    Nova, David

    2015-01-01

    In this work we present a review of the state of the art of Learning Vector Quantization (LVQ) classifiers. A taxonomy is proposed which integrates the most relevant LVQ approaches to date. The main concepts associated with modern LVQ approaches are defined. A comparison is made among eleven LVQ classifiers using one real-world and two artificial datasets.
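
    As background for the LVQ family reviewed here, the basic LVQ1 update can be sketched as follows. This is a minimal illustration assuming squared Euclidean distance and a fixed learning rate. The name `lvq1_step` and the toy prototypes are hypothetical, and the variants covered by the review differ in how they choose and update prototypes.

```python
def lvq1_step(prototypes, labels, x, y, lr=0.1):
    """Apply one LVQ1 update in place and return the winning prototype index.

    The nearest prototype moves toward x if its label matches y,
    and away from x otherwise.
    """
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    winner = min(range(len(prototypes)),
                 key=lambda k: sq_dist(prototypes[k], x))
    sign = 1.0 if labels[winner] == y else -1.0
    prototypes[winner] = [w + sign * lr * (xi - w)
                          for w, xi in zip(prototypes[winner], x)]
    return winner


protos = [[0.0, 0.0], [1.0, 1.0]]
proto_labels = ["a", "b"]
# A sample of class "a" near the first prototype pulls that prototype closer.
first_winner = lvq1_step(protos, proto_labels, [0.2, 0.0], "a", lr=0.5)
# A sample of class "a" near the "b" prototype pushes that prototype away.
second_winner = lvq1_step(protos, proto_labels, [1.0, 0.8], "a", lr=0.5)
```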

  2. Deconvolution When Classifying Noisy Data Involving Transformations

    KAUST Repository

    Carroll, Raymond

    2012-09-01

    In the present study, we consider the problem of classifying spatial data distorted by a linear transformation or convolution and contaminated by additive random noise. In this setting, we show that classifier performance can be improved if we carefully invert the data before the classifier is applied. However, the inverse transformation is not constructed so as to recover the original signal, and in fact, we show that taking the latter approach is generally inadvisable. We introduce a fully data-driven procedure based on cross-validation, and use several classifiers to illustrate numerical properties of our approach. Theoretical arguments are given in support of our claims. Our procedure is applied to data generated by light detection and ranging (Lidar) technology, where we improve on earlier approaches to classifying aerosols. This article has supplementary materials online.

  3. Logarithmic learning for generalized classifier neural network.

    Science.gov (United States)

    Ozyildirim, Buse Melis; Avci, Mutlu

    2014-12-01

    The generalized classifier neural network has been introduced as an efficient classifier. Unless the initial smoothing parameter value is close to the optimal one, however, the generalized classifier neural network suffers from a convergence problem and requires quite a long time to converge. In this work, a logarithmic learning approach is proposed to overcome this problem. The proposed method uses a logarithmic cost function instead of squared error. Minimization of this cost function reduces the number of iterations needed to reach the minima. The proposed method is tested on 15 different data sets, and the performance of the logarithmic learning generalized classifier neural network is compared with that of the standard one. Thanks to the operation range of the radial basis function included in the generalized classifier neural network, the proposed logarithmic approach and its derivative have continuous values. This makes it possible for the proposed learning method to exploit the fast convergence of the logarithmic cost function. Owing to this fast convergence, training time is decreased by up to 99.2%. In addition to the decrease in training time, classification performance may also be improved by up to 60%. According to the test results, the proposed method addresses the time requirement problem of the generalized classifier neural network and may also improve classification accuracy. The proposed method can be considered an efficient way of reducing the time requirement of the generalized classifier neural network. Copyright © 2014 Elsevier Ltd. All rights reserved.

  4. A Sequential Algorithm for Training Text Classifiers

    CERN Document Server

    Lewis, D D; Lewis, David D.; Gale, William A.

    1994-01-01

    The ability to cheaply train text classifiers is critical to their use in information retrieval, content analysis, natural language processing, and other tasks involving data which is partly or fully textual. An algorithm for sequential sampling during machine learning of statistical classifiers was developed and tested on a newswire text categorization task. This method, which we call uncertainty sampling, reduced by as much as 500-fold the amount of training data that would have to be manually classified to achieve a given level of effectiveness.
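
    The selection step of uncertainty sampling can be sketched as follows. This is a minimal illustration assuming a binary task and a classifier that outputs a positive-class probability for each unlabeled item. The name `select_uncertain` and the example scores are hypothetical and are not taken from the paper.

```python
def select_uncertain(probabilities, batch_size):
    """Return indices of the items whose predicted positive-class
    probability is closest to 0.5, i.e. the most uncertain ones."""
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: abs(probabilities[i] - 0.5))
    return ranked[:batch_size]


# Hypothetical classifier outputs for five unlabeled documents.
scores = [0.97, 0.52, 0.10, 0.45, 0.80]
chosen = select_uncertain(scores, 2)
```

    In a full loop, the chosen items would be labeled by a person, added to the training set, and the classifier retrained before the next selection round.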

  5. A CLASSIFIER SYSTEM USING SMOOTH GRAPH COLORING

    Directory of Open Access Journals (Sweden)

    JORGE FLORES CRUZ

    2017-01-01

    Full Text Available Unsupervised classifiers allow clustering with little or no human intervention. It is therefore desirable to group a set of items with less data processing. This paper proposes an unsupervised classifier system using the model of soft graph coloring. The method was tested on some classic instances from the literature, and the results were compared with classifications made with human intervention. It yielded results as good as or better than supervised classifiers, sometimes providing alternative classifications that consider additional information that humans did not consider.

  6. Nonlinear interpolation fractal classifier for multiple cardiac arrhythmias recognition

    Energy Technology Data Exchange (ETDEWEB)

    Lin, C.-H. [Department of Electrical Engineering, Kao-Yuan University, No. 1821, Jhongshan Rd., Lujhu Township, Kaohsiung County 821, Taiwan (China); Institute of Biomedical Engineering, National Cheng-Kung University, Tainan 70101, Taiwan (China)], E-mail: eechl53@cc.kyu.edu.tw; Du, Y.-C.; Chen Tainsong [Institute of Biomedical Engineering, National Cheng-Kung University, Tainan 70101, Taiwan (China)

    2009-11-30

    This paper proposes a method for cardiac arrhythmias recognition using the nonlinear interpolation fractal classifier. A typical electrocardiogram (ECG) consists of P-wave, QRS-complexes, and T-wave. Iterated function system (IFS) uses the nonlinear interpolation in the map and uses similarity maps to construct various data sequences including the fractal patterns of supraventricular ectopic beat, bundle branch ectopic beat, and ventricular ectopic beat. Grey relational analysis (GRA) is proposed to recognize normal heartbeat and cardiac arrhythmias. The nonlinear interpolation terms produce family functions with fractal dimension (FD), the so-called nonlinear interpolation function (NIF), and make fractal patterns more distinguishing between normal and ill subjects. The proposed QRS classifier is tested using the Massachusetts Institute of Technology-Beth Israel Hospital (MIT-BIH) arrhythmia database. Compared with other methods, the proposed hybrid methods demonstrate greater efficiency and higher accuracy in recognizing ECG signals.

  7. Efficient iris recognition via ICA feature and SVM classifier

    Institute of Scientific and Technical Information of China (English)

    Wang Yong; Xu Luping

    2007-01-01

    To improve the flexibility and reliability of iris recognition while maintaining the recognition success rate, an iris recognition approach that combines SVM with an ICA feature extraction model is presented. SVM is a kind of classifier that has demonstrated high generalization capability in object recognition problems. ICA is a feature extraction technique that can be considered a generalization of principal component analysis. In this paper, ICA is used to generate a set of subsequences of feature vectors for iris feature extraction. Each subsequence is then classified using support vector machine sequence kernels. Experiments on the CASIA iris database indicate that the combination of SVM and ICA can improve iris recognition flexibility and reliability while maintaining the recognition success rate.

  8. Genetic fuzzy classifier for sleep stage identification.

    Science.gov (United States)

    Jo, Han G; Park, Jin Y; Lee, Chung K; An, Suk K; Yoo, Sun K

    2010-07-01

    Soft-computing techniques are commonly used to detect medical phenomena and help with clinical diagnoses and treatment. In this work, we propose a design for a computerized sleep scoring method, which is based on a fuzzy classifier and a genetic algorithm (GA). We design the fuzzy classifier based on the GA using a single electroencephalogram (EEG) signal that detects differences in spectral features. Polysomnography was performed on four healthy young adults (males with a mean age of 27.5 years). The sleep classifier was designed using a sleep record and tested on the sleep records of the subjects. Our results show that the genetic fuzzy classifier (GFC) agreed with visual sleep staging approximately 84.6% of the time in detection of wakefulness (WA), shallow sleep (SS), deep sleep (DS), and rapid eye movement (REM) stages.

  9. Local Component Analysis for Nonparametric Bayes Classifier

    CERN Document Server

    Khademi, Mahmoud; safayani, Meharn

    2010-01-01

    The decision boundaries of Bayes classifier are optimal because they lead to maximum probability of correct decision. It means if we knew the prior probabilities and the class-conditional densities, we could design a classifier which gives the lowest probability of error. However, in classification based on nonparametric density estimation methods such as Parzen windows, the decision regions depend on the choice of parameters such as window width. Moreover, these methods suffer from curse of dimensionality of the feature space and small sample size problem which severely restricts their practical applications. In this paper, we address these problems by introducing a novel dimension reduction and classification method based on local component analysis. In this method, by adopting an iterative cross-validation algorithm, we simultaneously estimate the optimal transformation matrices (for dimension reduction) and classifier parameters based on local information. The proposed method can classify the data with co...
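    The dependence of a Parzen-window Bayes classifier on the window width can be illustrated with a minimal sketch. It is not the local component analysis method of the paper; the Gaussian window, the one-dimensional data, and all names (`parzen_density`, `classify`, `h`) are assumptions chosen for illustration.

```python
import math

def gaussian_kernel(u):
    """Standard normal density, used here as the Parzen window (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def parzen_density(x, samples, h):
    """Parzen-window estimate of a 1-D density at x with window width h."""
    return sum(gaussian_kernel((x - s) / h) for s in samples) / (len(samples) * h)

def classify(x, class_samples, priors, h):
    """Pick the class maximizing prior * estimated class-conditional density."""
    scores = {c: priors[c] * parzen_density(x, s, h) for c, s in class_samples.items()}
    return max(scores, key=scores.get)

# Hypothetical toy data: two classes on the real line.
train = {"a": [0.0, 0.2, 0.4], "b": [2.0, 2.2, 2.4]}
priors = {"a": 0.5, "b": 0.5}
label_near_a = classify(0.1, train, priors, h=0.5)
label_near_b = classify(2.3, train, priors, h=0.5)
```

    Changing `h` changes the estimated densities and therefore the decision regions, which is the parameter sensitivity the abstract refers to.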

  10. An Efficient and Effective Immune Based Classifier

    Directory of Open Access Journals (Sweden)

    Shahram Golzari

    2011-01-01

    Problem statement: The Artificial Immune Recognition System (AIRS) is the most popular and effective immune-inspired classifier. Resource competition is one stage of AIRS and is based on the number of allocated resources. AIRS uses a linear method to allocate resources, which increases the training time of the classifier. Approach: In this study, a new nonlinear resource allocation method is proposed to make AIRS more efficient. The new algorithm, AIRS with the proposed nonlinear method, is tested on benchmark datasets from the UCI machine learning repository. Results: Based on the experimental results, the proposed nonlinear resource allocation method decreases the training time and the number of memory cells without reducing the accuracy of AIRS. Conclusion: The proposed classifier is an efficient and effective classifier.

  11. Combining multiple classifiers for age classification

    CSIR Research Space (South Africa)

    Van Heerden, C

    2009-11-01

    The authors compare several different classifier combination methods on a single task, namely speaker age classification. This task is well suited to combination strategies, since significantly different feature classes are employed. Support vector...

  12. Classifiers based on optimal decision rules

    KAUST Repository

    Amin, Talha

    2013-11-25

    Based on a dynamic programming approach, we design algorithms for sequential optimization of exact and approximate decision rules relative to length and coverage [3, 4]. In this paper, we use optimal rules to construct classifiers and study two questions: (i) which rules are better from the point of view of classification, exact or approximate; and (ii) which order of optimization gives better classifier results: length, length+coverage, coverage, or coverage+length. Experimental results show that, on average, classifiers based on exact rules are better than classifiers based on approximate rules, and sequential optimization (length+coverage or coverage+length) is better than ordinary optimization (length or coverage).

  13. Pragmatics of classifier use in Chinese discourse

    African Journals Online (AJOL)

    KATEVG

    complex noun phrases (CNPs), and investigates the occurrence and ... classifier phrase from its head noun while a post-nominal RC in English does not ...... The present study takes a cognitive-functional approach to the analysis of a syntactic.

  14. Classifying the Quantum Phases of Matter

    Science.gov (United States)

    2015-01-01

    2013), arXiv:1305.2176. [10] J. Haah, Lattice quantum codes and exotic topological phases of matter, arXiv:1305.6973. [11] M. Hastings and S... CLASSIFYING THE QUANTUM PHASES OF MATTER. CALIFORNIA INSTITUTE OF TECHNOLOGY. JANUARY 2015. FINAL TECHNICAL REPORT... REPORT 3. DATES COVERED (From - To): JAN 2012 – AUG 2014. 4. TITLE AND SUBTITLE: CLASSIFYING THE QUANTUM PHASES OF MATTER. 5a. CONTRACT NUMBER: FA8750-12-2

  15. Searching and Classifying non-textual information

    OpenAIRE

    Arentz, Will Archer

    2004-01-01

    This dissertation contains a set of contributions that deal with search or classification of non-textual information. Each contribution can be considered a solution to a specific problem, in an attempt to map out a common ground. The problems cover a wide range of research fields, including search in music, classifying digitally sampled music, visualization and navigation in search results, and classifying images and Internet sites. On classification of digitally sampled music, a method for ex...

  16. COMBINING CLASSIFIERS FOR CREDIT RISK PREDICTION

    Institute of Scientific and Technical Information of China (English)

    Bhekisipho TWALA

    2009-01-01

    Credit risk prediction models seek to predict quality factors such as whether an individual will default on a loan (bad applicant) or not (good applicant). This can be treated as a machine learning (ML) problem. Recently, ML algorithms have proven to be of great practical value in solving a variety of risk problems, including credit risk prediction. One of the most active areas of recent ML research has been the use of ensemble (combined) classifiers. Research indicates that ensembles of individual classifiers lead to a significant improvement in classification performance by having them vote for the most popular class. This paper explores the predicted behaviour of five classifiers for different types of noise in terms of credit risk prediction accuracy, and how such accuracy could be improved by using pairs of classifier ensembles. Benchmarking results on five credit datasets are presented, together with a comparison against the predictive accuracy of each individual classifier at various attribute noise levels. The experimental evaluation shows that the ensemble-of-classifiers technique has the potential to improve prediction accuracy.
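    The idea of having classifiers "vote for the most popular class" can be sketched as plain majority voting. This is an illustrative assumption, not the paper's ensemble procedure; the function name `majority_vote` and the labels are hypothetical.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the most common label among the base classifiers' predictions.

    On a tie, Counter.most_common keeps first-seen order, so the label that
    appears first in `predictions` wins (a design choice of this sketch).
    """
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical outputs of three base classifiers for one loan applicant.
votes = ["good", "bad", "good"]
decision = majority_vote(votes)
```

    A weighted vote, where each classifier's vote counts in proportion to its validation accuracy, would be a natural variation.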

  17. A multi-class large margin classifier

    Institute of Scientific and Technical Information of China (English)

    Liang TANG; Qi XUAN; Rong XIONG; Tie-jun WU; Jian CHU

    2009-01-01

    Currently there are two approaches to building a multi-class support vector classifier (SVC). One is to construct and combine several binary classifiers, while the other is to consider all classes of data directly in one optimization formulation. For a K-class problem (K>2), the first approach has to construct at least K classifiers, and the second approach, with the algorithms developed so far, has to solve a much larger optimization problem whose size is proportional to K. In this paper, following the second approach, we present a novel multi-class large margin classifier (MLMC). This new machine can solve K-class problems in one optimization formulation without increasing the size of the quadratic programming (QP) problem in proportion to K. This property allows us to construct just one classifier, with as few variables in the QP problem as possible, to classify multi-class data, which gives a speed advantage especially when K is large. Our experiments indicate that MLMC works almost as well as (and sometimes better than) many other multi-class SVCs on some benchmark data classification problems, and obtains reasonable performance in a face recognition application on the AR face database.
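    The first approach mentioned above, combining K binary classifiers, is often realized as one-vs-rest: each class gets its own scorer and the highest score wins. The sketch below shows only that combination step under stated assumptions; the scorers are hypothetical stand-ins (negative distance to a class center), not trained SVCs, and MLMC itself is not implemented here.

```python
def one_vs_rest_predict(x, scorers):
    """Return the label whose binary scorer gives x the highest score.

    `scorers` maps each class label to a function returning a real-valued
    score, as a trained one-vs-rest binary classifier would.
    """
    return max(scorers, key=lambda label: scorers[label](x))

def make_center_scorer(center):
    """Hypothetical scorer: larger (less negative) when x is closer to center."""
    return lambda x: -abs(x - center)

# K = 3 classes, hence three binary scorers in this scheme.
scorers = {
    "low": make_center_scorer(0.0),
    "mid": make_center_scorer(5.0),
    "high": make_center_scorer(10.0),
}
predicted = one_vs_rest_predict(4.2, scorers)
```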

  18. Progress in Study on RecQ Protein-like 4 (research progress on the structure of the recql4 gene and the function of its protein)

    Institute of Scientific and Technical Information of China (English)

    郭建国; 卢卫红; 孙野青

    2009-01-01

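    recql4 (RecQ protein-like 4) is a member of the RecQ helicase family, which plays an important role in maintaining genome stability. In humans, mutations in this gene can cause Rothmund-Thomson Syndrome (RTS), an autosomal recessive genetic disorder. Studies have shown that the protein plays an important role in DNA replication and in the repair of DNA breaks, but the precise molecular mechanism is still not well understood. Several research groups have successively begun work on this gene, and this article reviews the research on it.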

  19. What are the Differences between Bayesian Classifiers and Mutual-Information Classifiers?

    CERN Document Server

    Hu, Bao-Gang

    2011-01-01

    In this study, both Bayesian classifiers and mutual information classifiers are examined for binary classifications with or without a reject option. The general decision rules in terms of distinctions on error types and reject types are derived for Bayesian classifiers. A formal analysis is conducted to reveal the parameter redundancy of cost terms when abstaining classifications are enforced. The redundancy implies an intrinsic problem of "non-consistency" for interpreting cost terms. If no data is given to the cost terms, we demonstrate the weakness of Bayesian classifiers in class-imbalanced classifications. On the contrary, mutual-information classifiers are able to provide an objective solution from the given data, which shows a reasonable balance among error types and reject types. Numerical examples of using two types of classifiers are given for confirming the theoretical differences, including the extremely-class-imbalanced cases. Finally, we briefly summarize the Bayesian classifiers and mutual-info...
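    A binary decision with a reject option can be sketched with a simple posterior threshold, in the spirit of Chow's rule. This is a swapped-in textbook rule used for illustration, not the cost-term formulation derived in the paper; the function name and the threshold value are assumptions.

```python
def decide_with_reject(posterior_pos, threshold):
    """Binary decision with abstention.

    posterior_pos: estimated P(positive | x), between 0 and 1.
    threshold: minimum posterior required to commit to a class (0.5 to 1).
    Abstains ("reject") when neither class is confident enough.
    """
    posterior_neg = 1.0 - posterior_pos
    if max(posterior_pos, posterior_neg) < threshold:
        return "reject"
    return "positive" if posterior_pos >= posterior_neg else "negative"

# With a threshold of 0.8, an uncertain posterior of 0.55 is rejected.
confident = decide_with_reject(0.9, 0.8)
uncertain = decide_with_reject(0.55, 0.8)
```

    Raising the threshold trades fewer errors for more rejections, which is the kind of balance between error types and reject types that the abstract discusses.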

  20. Feature Fusion Based SVM Classifier for Protein Subcellular Localization Prediction.

    Science.gov (United States)

    Rahman, Julia; Mondal, Md Nazrul Islam; Islam, Md Khaled Ben; Hasan, Md Al Mehedi

    2016-12-18

    Because of the importance of protein subcellular localization in different branches of life science and drug discovery, researchers have focused their attention on protein subcellular localization prediction. Effective representation of features from protein sequences plays a vital role in this prediction, especially in the case of machine learning techniques. Single feature representations, such as pseudo amino acid composition (PseAAC), physiochemical property models (PPM), and amino acid index distribution (AAID), contain insufficient information from protein sequences. To deal with this problem, we have proposed two feature fusion representations, AAIDPAAC and PPMPAAC, which fuse PseAAC with AAID and PPM, respectively, for use with Support Vector Machine classifiers. We have evaluated the performance of both single and fused feature representations on a Gram-negative bacterial dataset. AAIDPAAC achieved at least 3% higher actual accuracy, and PPMPAAC at least 2% higher locative accuracy, than the single feature representations.

  1. Reinforcement Learning Based Artificial Immune Classifier

    Directory of Open Access Journals (Sweden)

    Mehmet Karakose

    2013-01-01

    Artificial immune systems are among the widely used methods for classification, which is a decision-making process. Based on the natural immune system, artificial immune systems can be successfully applied to classification, optimization, recognition, and learning in real-world problems. In this study, a reinforcement learning based artificial immune classifier is proposed as a new approach. This approach uses reinforcement learning to find better antibodies with immune operators. Compared with other methods in the literature, the proposed approach offers several advantages, such as effectiveness, fewer memory cells, high accuracy, speed, and data adaptability. The performance of the proposed approach is demonstrated by simulation and experimental results using real data in Matlab and on an FPGA. Some benchmark data and remote image data are used for the experimental results. Comparative results with supervised/unsupervised artificial immune systems, the negative selection classifier, and the resource-limited artificial immune classifier are given to demonstrate the effectiveness of the proposed method.

  2. Evolving Classifiers: Methods for Incremental Learning

    CERN Document Server

    Hulley, Greg

    2007-01-01

    The ability of a classifier to take on new information and classes by evolving the classifier without it having to be fully retrained is known as incremental learning. Incremental learning has been successfully applied to many classification problems where the data is changing and is not all available at once. This paper compares Learn++, one of the most recent incremental learning algorithms, with the newly proposed method of Incremental Learning Using Genetic Algorithm (ILUGA). Learn++ has shown good incremental learning capabilities on the benchmark datasets on which the new ILUGA method has been tested. ILUGA has also shown good incremental learning ability using only a few classifiers and does not suffer from catastrophic forgetting. The results obtained for ILUGA on the Optical Character Recognition (OCR) and Wine datasets are good, with overall accuracies of 93% and 94%, respectively, showing a 4% improvement over Learn++.MT for the difficult multi-class OCR dataset.

  3. Averaged Extended Tree Augmented Naive Classifier

    Directory of Open Access Journals (Sweden)

    Aaron Meehan

    2015-07-01

    This work presents a new general purpose classifier named Averaged Extended Tree Augmented Naive Bayes (AETAN), which is based on combining the advantageous characteristics of the Extended Tree Augmented Naive Bayes (ETAN) and Averaged One-Dependence Estimator (AODE) classifiers. We describe the main properties of the approach and algorithms for learning it, along with an analysis of its computational time complexity. Empirical results with numerous data sets indicate that the new approach is superior to ETAN and AODE in terms of both zero-one classification accuracy and log loss. It also compares favourably against weighted AODE and hidden Naive Bayes. The learning phase of the new approach is slower than that of its competitors, while the time complexity for the testing phase is similar. Such characteristics suggest that the new classifier is ideal in scenarios where online learning is not required.

  4. Dynamic Bayesian Combination of Multiple Imperfect Classifiers

    CERN Document Server

    Simpson, Edwin; Psorakis, Ioannis; Smith, Arfon

    2012-01-01

    Classifier combination methods need to make best use of the outputs of multiple, imperfect classifiers to enable higher accuracy classifications. In many situations, such as when human decisions need to be combined, the base decisions can vary enormously in reliability. A Bayesian approach to such uncertain combination allows us to infer the differences in performance between individuals and to incorporate any available prior knowledge about their abilities when training data is sparse. In this paper we explore Bayesian classifier combination, using the computationally efficient framework of variational Bayesian inference. We apply the approach to real data from a large citizen science project, Galaxy Zoo Supernovae, and show that our method far outperforms other established approaches to imperfect decision combination. We go on to analyse the putative community structure of the decision makers, based on their inferred decision making strategies, and show that natural groupings are formed. Finally we present ...

  5. A Customizable Text Classifier for Text Mining

    Directory of Open Access Journals (Sweden)

    Yun-liang Zhang

    2007-12-01

    Text mining deals with complex and unstructured texts. Usually a particular collection of texts that is specific to one or more domains is necessary. We have developed a customizable text classifier that lets users mine such a collection automatically. It derives from the sentence category of the HNC theory and corresponding techniques. It can start with a few texts, and it can adjust automatically or be adjusted by the user. The user can also control the number of domains chosen and decide the standard by which texts are chosen, based on demand and the abundance of materials. The performance of the classifier varies with the user's choices.

  6. A survey of decision tree classifier methodology

    Science.gov (United States)

    Safavian, S. R.; Landgrebe, David

    1991-01-01

    Decision tree classifiers (DTCs) are used successfully in many diverse areas such as radar signal classification, character recognition, remote sensing, medical diagnosis, expert systems, and speech recognition. Perhaps the most important feature of DTCs is their capability to break down a complex decision-making process into a collection of simpler decisions, thus providing a solution which is often easier to interpret. A survey of current methods is presented for DTC designs and the various existing issues. After considering potential advantages of DTCs over single-state classifiers, subjects of tree structure design, feature selection at each internal node, and decision and search strategies are discussed.

  7. Design and evaluation of neural classifiers

    DEFF Research Database (Denmark)

    Hintz-Madsen, Mads; Pedersen, Morten With; Hansen, Lars Kai

    1996-01-01

    In this paper we propose a method for the design of feedforward neural classifiers based on regularization and adaptive architectures. Using a penalized maximum likelihood scheme we derive a modified form of the entropy error measure and an algebraic estimate of the test error. In conjunction...

  8. Face detection by aggregated Bayesian network classifiers

    NARCIS (Netherlands)

    Pham, T.V.; Worring, M.; Smeulders, A.W.M.

    2002-01-01

    A face detection system is presented. A new classification method using forest-structured Bayesian networks is used. The method is used in an aggregated classifier to discriminate face from non-face patterns. The process of generating non-face patterns is integrated with the construction of the aggr

  9. Neural Classifier Construction using Regularization, Pruning

    DEFF Research Database (Denmark)

    Hintz-Madsen, Mads; Hansen, Lars Kai; Larsen, Jan

    1998-01-01

    In this paper we propose a method for construction of feed-forward neural classifiers based on regularization and adaptive architectures. Using a penalized maximum likelihood scheme, we derive a modified form of the entropic error measure and an algebraic estimate of the test error. In conjunction...

  10. Adaptively robust filtering with classified adaptive factors

    Institute of Scientific and Technical Information of China (English)

    CUI Xianqiang; YANG Yuanxi

    2006-01-01

    The key problems in applying the adaptively robust filtering to navigation are to establish an equivalent weight matrix for the measurements and a suitable adaptive factor for balancing the contributions of the measurements and the predicted state information to the state parameter estimates. In this paper, an adaptively robust filtering with classified adaptive factors was proposed, based on the principles of the adaptively robust filtering and bi-factor robust estimation for correlated observations. According to the constant velocity model of Kalman filtering, the state parameter vector was divided into two groups, namely position and velocity. The estimator of the adaptively robust filtering with classified adaptive factors was derived, and the calculation expressions of the classified adaptive factors were presented. Test results show that the adaptively robust filtering with classified adaptive factors is not only robust in controlling the measurement outliers and the kinematic state disturbing but also reasonable in balancing the contributions of the predicted position and velocity, respectively, and its filtering accuracy is superior to the adaptively robust filter with single adaptive factor based on the discrepancy of the predicted position or the predicted velocity.

  11. Classifying Finitely Generated Indecomposable RA Loops

    CERN Document Server

    Cornelissen, Mariana

    2012-01-01

    In 1995, E. Jespers, G. Leal and C. Polcino Milies classified all finite ring alternative loops (RA loops for short) which are not direct products of proper subloops. In this paper we extend this result to finitely generated RA loops and provide an explicit description of all such loops.

  12. Visual Classifier Training for Text Document Retrieval.

    Science.gov (United States)

    Heimerl, F; Koch, S; Bosch, H; Ertl, T

    2012-12-01

    Performing exhaustive searches over a large number of text documents can be tedious, since it is very hard to formulate search queries or define filter criteria that capture an analyst's information need adequately. Classification through machine learning has the potential to improve search and filter tasks encompassing either complex or very specific information needs, individually. Unfortunately, analysts who are knowledgeable in their field are typically not machine learning specialists. Most classification methods, however, require a certain expertise regarding their parametrization to achieve good results. Supervised machine learning algorithms, in contrast, rely on labeled data, which can be provided by analysts. However, the effort for labeling can be very high, which shifts the problem from composing complex queries or defining accurate filters to another laborious task, in addition to the need for judging the trained classifier's quality. We therefore compare three approaches for interactive classifier training in a user study. All of the approaches are potential candidates for the integration into a larger retrieval system. They incorporate active learning to various degrees in order to reduce the labeling effort as well as to increase effectiveness. Two of them encompass interactive visualization for letting users explore the status of the classifier in context of the labeled documents, as well as for judging the quality of the classifier in iterative feedback loops. We see our work as a step towards introducing user controlled classification methods in addition to text search and filtering for increasing recall in analytics scenarios involving large corpora.

  13. Classifying web pages with visual features

    NARCIS (Netherlands)

    de Boer, V.; van Someren, M.; Lupascu, T.; Filipe, J.; Cordeiro, J.

    2010-01-01

    To automatically classify and process web pages, current systems use the textual content of those pages, including both the displayed content and the underlying (HTML) code. However, a very important feature of a web page is its visual appearance. In this paper, we show that using generic visual fea

  14. MScanner: a classifier for retrieving Medline citations

    Directory of Open Access Journals (Sweden)

    Altman Russ B

    2008-02-01

    Background: Keyword searching through PubMed and other systems is the standard means of retrieving information from Medline. However, ad-hoc retrieval systems do not meet all of the needs of databases that curate information from literature, or of text miners developing a corpus on a topic that has many terms indicative of relevance. Several databases have developed supervised learning methods that operate on a filtered subset of Medline, to classify Medline records so that fewer articles have to be manually reviewed for relevance. A few studies have considered generalisation of Medline classification to operate on the entire Medline database in a non-domain-specific manner, but existing applications lack speed, available implementations, or a means to measure performance in new domains. Results: MScanner is an implementation of a Bayesian classifier that provides a simple web interface for submitting a corpus of relevant training examples in the form of PubMed IDs and returning results ranked by decreasing probability of relevance. For maximum speed it uses the Medical Subject Headings (MeSH) and journal of publication as a concise document representation, and takes roughly 90 seconds to return results against the 16 million records in Medline. The web interface provides interactive exploration of the results, and cross validated performance evaluation on the relevant input against a random subset of Medline. We describe the classifier implementation, cross validate it on three domain-specific topics, and compare its performance to that of an expert PubMed query for a complex topic. In cross validation on the three sample topics against 100,000 random articles, the classifier achieved excellent separation of relevant and irrelevant article score distributions, ROC areas between 0.97 and 0.99, and averaged precision between 0.69 and 0.92. Conclusion: MScanner is an effective non-domain-specific classifier that operates on the entire Medline

  15. Semantic Features for Classifying Referring Search Terms

    Energy Technology Data Exchange (ETDEWEB)

    May, Chandler J.; Henry, Michael J.; McGrath, Liam R.; Bell, Eric B.; Marshall, Eric J.; Gregory, Michelle L.

    2012-05-11

    When an internet user clicks on a result in a search engine, a request is submitted to the destination web server that includes a referrer field containing the search terms given by the user. Using this information, website owners can analyze the search terms leading to their websites to better understand their visitors' needs. This work explores some of the features that can be used for classification-based analysis of such referring search terms. We present initial results for the example task of classifying the countries of origin of HTTP requests. A system that can accurately predict the country of origin from query text may be a valuable complement to IP lookup methods, which are susceptible to obfuscation by dereferrers or proxies. We suggest that the addition of semantic features improves classifier performance in this example application. We begin by looking at related work and presenting our approach. After describing initial experiments and results, we discuss paths forward for this work.

  16. Comparing cosmic web classifiers using information theory

    CERN Document Server

    Leclercq, Florent; Jasche, Jens; Wandelt, Benjamin

    2016-01-01

    We introduce a decision scheme for optimally choosing a classifier, which segments the cosmic web into different structure types (voids, sheets, filaments, and clusters). Our framework, based on information theory, accounts for the design aims of different classes of possible applications: (i) parameter inference, (ii) model selection, and (iii) prediction of new observations. As an illustration, we use cosmographic maps of web-types in the Sloan Digital Sky Survey to assess the relative performance of the classifiers T-web, DIVA and ORIGAMI for: (i) analyzing the morphology of the cosmic web, (ii) discriminating dark energy models, and (iii) predicting galaxy colors. Our study substantiates a data-supported connection between cosmic web analysis and information theory, and paves the path towards principled design of analysis procedures for the next generation of galaxy surveys. We have made the cosmic web maps, galaxy catalog, and analysis scripts used in this work publicly available.

  17. Classifying Star Forming Cores through Chemical Anomalies

    Science.gov (United States)

    Hoq, Sadia; Jackson, J.; Foster, J.

    2011-05-01

    The chemical makeup of Infrared Dark Clouds may offer a method to classify star forming cores. This study uses the molecular line maps from the Millimetre Astronomy Legacy Team 90 GHz (MALT90) Survey, observed using the 22-m ATNF Mopra Telescope. The relative abundances of the four molecules, N2H+, HNC, HCN and HCO+ are calculated for each of 500 cores to determine the chemical signatures of star forming cores in their early evolutionary stages, as deduced from Spitzer data. Cores are classified as prestellar, protostellar, or HII regions. Initial findings indicate that sources with relatively strong N2H+ lines are prestellar, whereas weak N2H+ lines may designate protostellar or HII regions. These chemical anomalies, where the N2H+ lines are either very prominent or weak are rare, suggesting that these are short-lived chemical phases.

  18. Comparing cosmic web classifiers using information theory

    Science.gov (United States)

    Leclercq, Florent; Lavaux, Guilhem; Jasche, Jens; Wandelt, Benjamin

    2016-08-01

    We introduce a decision scheme for optimally choosing a classifier, which segments the cosmic web into different structure types (voids, sheets, filaments, and clusters). Our framework, based on information theory, accounts for the design aims of different classes of possible applications: (i) parameter inference, (ii) model selection, and (iii) prediction of new observations. As an illustration, we use cosmographic maps of web-types in the Sloan Digital Sky Survey to assess the relative performance of the classifiers T-WEB, DIVA and ORIGAMI for: (i) analyzing the morphology of the cosmic web, (ii) discriminating dark energy models, and (iii) predicting galaxy colors. Our study substantiates a data-supported connection between cosmic web analysis and information theory, and paves the path towards principled design of analysis procedures for the next generation of galaxy surveys. We have made the cosmic web maps, galaxy catalog, and analysis scripts used in this work publicly available.

  19. Classifying sows' activity types from acceleration patterns

    DEFF Research Database (Denmark)

    Cornou, Cecile; Lundbye-Christensen, Søren

    2008-01-01

    An automated method of classifying sow activity using acceleration measurements would allow the individual sow's behavior to be monitored throughout the reproductive cycle; applications for detecting behaviors characteristic of estrus and farrowing or to monitor illness and welfare can be foreseen. This article suggests a method of classifying five types of activity exhibited by group-housed sows. The method involves the measurement of acceleration in three dimensions. The five activities are: feeding, walking, rooting, lying laterally and lying sternally. Four time series of acceleration (the three-dimensional axes, plus the length of the acceleration vector) are selected for each activity. Each time series is modeled using a Dynamic Linear Model with cyclic components. The classification method, based on a Multi-Process Kalman Filter (MPKF), is applied to a total of 15 time series of 120 observations...

  20. Max-margin based Bayesian classifier

    Institute of Scientific and Technical Information of China (English)

    Tao-cheng HU; Jin-hui YU

    2016-01-01

    There is a tradeoff between generalization capability and computational overhead in multi-class learning. We propose a generative probabilistic multi-class classifier, considering both the generalization capability and the learning/prediction rate. We show that the classifier has a max-margin property. Thus, prediction on future unseen data can nearly achieve the same performance as in the training stage. In addition, local variables are eliminated, which greatly simplifies the optimization problem. By convex and probabilistic analysis, an efficient online learning algorithm is developed. The algorithm aggregates rather than averages dualities, which is different from the classical situations. Empirical results indicate that our method has a good generalization capability and coverage rate.

  1. Letter identification and the neural image classifier.

    Science.gov (United States)

    Watson, Andrew B; Ahumada, Albert J

    2015-02-12

    Letter identification is an important visual task for both practical and theoretical reasons. To extend and test existing models, we have reviewed published data for contrast sensitivity for letter identification as a function of size and have also collected new data. Contrast sensitivity increases rapidly from the acuity limit but slows and asymptotes at a symbol size of about 1 degree. We recast these data in terms of contrast difference energy: the average of the squared distances between the letter images and the average letter image. In terms of sensitivity to contrast difference energy, and thus visual efficiency, there is a peak around ¼ degree, followed by a marked decline at larger sizes. These results are explained by a Neural Image Classifier model that includes optical filtering and retinal neural filtering, sampling, and noise, followed by an optimal classifier. As letters are enlarged, sensitivity declines because of the increasing size and spacing of the midget retinal ganglion cell receptive fields in the periphery.

  2. Combining supervised classifiers with unlabeled data

    Institute of Scientific and Technical Information of China (English)

    刘雪艳; 张雪英; 李凤莲; 黄丽霞

    2016-01-01

    Ensemble learning is a widely studied topic. Traditional ensemble techniques are typically used to seek better results with labeled data and base classifiers. They fail to address the ensemble task where only unlabeled data are available. A label propagation based ensemble (LPBE) approach is proposed to further combine base classification results with unlabeled data. First, a graph is constructed by taking unlabeled data as vertices, and the weights in the graph are calculated by a correntropy function. Average prediction results are obtained from the base classifiers, and then propagated under a regularization framework and adaptively enhanced over the graph. The proposed approach is further enriched when a small amount of labeled data is available. The proposed algorithms are evaluated on several UCI benchmark data sets. Simulation results show that the proposed algorithms achieve satisfactory performance compared with existing ensemble methods.

  3. Classifying bed inclination using pressure images.

    Science.gov (United States)

    Baran Pouyan, M; Ostadabbas, S; Nourani, M; Pompeo, M

    2014-01-01

    Pressure ulcers are one of the most prevalent problems for bed-bound patients in hospitals and nursing homes. They are painful for patients and costly for healthcare systems. Accurate in-bed posture analysis can significantly help in preventing pressure ulcers. Specifically, bed inclination (back angle) is a factor contributing to pressure ulcer development. In this paper, an efficient methodology is proposed to classify bed inclination. Our approach uses pressure values collected from a commercial pressure mat system. Then, by applying a number of image processing and machine learning techniques, the approximate bed angle is estimated and classified. The proposed algorithm was tested on 15 subjects of various sizes and weights. The experimental results indicate that our method predicts bed inclination in three classes with 80.3% average accuracy.

  4. Design of Robust Neural Network Classifiers

    DEFF Research Database (Denmark)

    Larsen, Jan; Andersen, Lars Nonboe; Hintz-Madsen, Mads

    1998-01-01

    This paper addresses a new framework for designing robust neural network classifiers. The network is optimized using the maximum a posteriori technique, i.e., the cost function is the sum of the log-likelihood and a regularization term (prior). In order to perform robust classification, we present a modified likelihood function which incorporates the potential risk of outliers in the data. This leads to the introduction of a new parameter, the outlier probability. Designing the neural classifier involves optimization of network weights as well as outlier probability and regularization parameters. We suggest to adapt the outlier probability and regularisation parameters by minimizing the error on a validation set, and a simple gradient descent scheme is derived. In addition, the framework allows for constructing a simple outlier detector. Experiments with artificial data demonstrate the potential...

  5. Classification Studies in an Advanced Air Classifier

    Science.gov (United States)

    Routray, Sunita; Bhima Rao, R.

    2016-10-01

    In the present paper, experiments are carried out using a VSK separator, an advanced air classifier, to recover heavy minerals from beach sand. In the classification experiments, the cage wheel speed and the feed rate are set, and the material is fed to the air cyclone and split into fine and coarse particles, which are collected in separate bags. The size distribution of each fraction is measured by sieve analysis. A model is developed to predict the performance of the air classifier. The objective of the present model is to predict the grade efficiency curve for a given set of operating parameters such as cage wheel speed and feed rate. The overall experimental data, with all variables studied in this investigation, are fitted to several models. The present model is found to fit the logistic model well.

  6. Improving 2D Boosted Classifiers Using Depth LDA Classifier for Robust Face Detection

    Directory of Open Access Journals (Sweden)

    Mahmood Rahat

    2012-05-01

    Full Text Available Face detection plays an important role in Human Robot Interaction. Many of the services provided by robots depend on face detection. This paper presents a novel face detection algorithm which uses depth data to improve the efficiency of a boosted classifier on 2D data and so reduce false positive alarms. The proposed method uses two levels of cascade classifiers. The classifiers of the first level deal with 2D data, and the classifiers of the second level use depth data captured by a stereo camera. The first level employs a conventional cascade of boosted classifiers, which eliminates many of the non-face sub-windows. The remaining sub-windows are used as input to the second level. After the corresponding depth model of the sub-windows is calculated, a heuristic classifier along with a Linear Discriminant Analysis (LDA) classifier is applied to the depth data to reject the remaining non-face sub-windows. Experimental results of the proposed method, using a Bumblebee-2 stereo vision system on a mobile platform for real-time detection of human faces in natural cluttered environments, reveal a significant reduction in the false positive alarms of the 2D face detector.

  7. Bayes classifiers for imbalanced traffic accidents datasets.

    Science.gov (United States)

    Mujalli, Randa Oqab; López, Griselda; Garach, Laura

    2016-03-01

    Traffic accident data sets are usually imbalanced: the number of instances classified under the killed or severe injuries class (minority) is much lower than the number classified under the slight injuries class (majority). This poses a challenging problem for classification algorithms and may yield a model that covers the slight injuries instances well while frequently misclassifying the killed or severe injuries instances. Based on traffic accident data collected on urban and suburban roads in Jordan over three years (2009-2011), three different data balancing techniques were used: under-sampling, which removes some instances of the majority class; oversampling, which creates new instances of the minority class; and a mixed technique that combines both. In addition, different Bayes classifiers were compared for the different imbalanced and balanced data sets (Averaged One-Dependence Estimators, Weightily Average One-Dependence Estimators, and Bayesian networks) in order to identify factors that affect the severity of an accident. The results indicated that using the balanced data sets, especially those created using oversampling techniques, with Bayesian networks improved the classification of a traffic accident according to its severity and reduced the misclassification of killed and severe injuries instances. In addition, the following variables were found to contribute to the occurrence of a killed casualty or a severe injury in a traffic accident: number of vehicles involved, accident pattern, number of directions, accident type, lighting, surface condition, and speed limit. This work, to the knowledge of the authors, is the first that aims at analyzing historical data records for traffic accidents occurring in Jordan and the first to apply balancing techniques to analyze injury severity of traffic accidents. Copyright © 2015 Elsevier Ltd. All rights reserved.
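
    The two basic balancing techniques named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy records, and the choice to oversample by plain random duplication are all assumptions made here (the abstract says only that oversampling "creates new instances of the minority class", which may refer to a synthetic method).

```python
import random
from collections import Counter


def group_by_class(rows, labels):
    """Collect rows into a dict keyed by class label."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return groups


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows so every class is as small as the smallest class."""
    rng = random.Random(seed)
    groups = group_by_class(rows, labels)
    target = min(len(group) for group in groups.values())
    return [(row, label)
            for label, group in groups.items()
            for row in rng.sample(group, target)]


def random_oversample(rows, labels, seed=0):
    """Randomly duplicate rows so every class is as large as the largest class.

    Assumption: duplication stands in for whatever oversampling the paper used.
    """
    rng = random.Random(seed)
    groups = group_by_class(rows, labels)
    target = max(len(group) for group in groups.values())
    balanced = []
    for label, group in groups.items():
        balanced.extend((row, label) for row in group)
        balanced.extend((rng.choice(group), label)
                        for _ in range(target - len(group)))
    return balanced


# Hypothetical toy data: 8 "slight" records and 2 "severe" records.
rows = list(range(10))
labels = ["slight"] * 8 + ["severe"] * 2
print(Counter(label for _, label in random_undersample(rows, labels)))
print(Counter(label for _, label in random_oversample(rows, labels)))
```

    On the toy data, under-sampling leaves two records per class and oversampling leaves eight per class; a mixed technique would stop each step at some intermediate class size.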

  8. Statistical Mechanics of Soft Margin Classifiers

    OpenAIRE

    Risau-Gusman, Sebastian; Gordon, Mirta B.

    2001-01-01

    We study the typical learning properties of the recently introduced Soft Margin Classifiers (SMCs), learning realizable and unrealizable tasks, with the tools of Statistical Mechanics. We derive analytically the behaviour of the learning curves in the regime of very large training sets. We obtain exponential and power laws for the decay of the generalization error towards the asymptotic value, depending on the task and on general characteristics of the distribution of stabilities of the patte...

  9. A Bayesian classifier for symbol recognition

    OpenAIRE

    Barrat, Sabine; Tabbone, Salvatore; Nourrissier, Patrick

    2007-01-01

    URL : http://www.buyans.com/POL/UploadedFile/134_9977.pdf; International audience; We present in this paper an original adaptation of Bayesian networks to the symbol recognition problem. More precisely, a descriptor combination method is presented which significantly improves the recognition rate compared to the recognition rates obtained by each descriptor alone. In this perspective, we use a simple Bayesian classifier, called naive Bayes. In fact, probabilistic graphical models, more spec...

  10. Deterministic Pattern Classifier Based on Genetic Programming

    Institute of Scientific and Technical Information of China (English)

    LI Jian-wu; LI Min-qiang; KOU Ji-song

    2001-01-01

    This paper proposes a supervised training-test method with Genetic Programming (GP) for pattern classification. In contrast to traditional methods for deterministic pattern classifiers, this method is valid for both linearly separable and linearly non-separable problems. For specific training samples, it can formulate the expression of the discriminant function well without any prior knowledge. Finally, an experiment is conducted, and the result reveals that this system is effective and practical.

  11. Evolving edited k-nearest neighbor classifiers.

    Science.gov (United States)

    Gil-Pita, Roberto; Yao, Xin

    2008-12-01

    The k-nearest neighbor method is a classifier based on the evaluation of the distances to each pattern in the training set. The edited version of this method consists of the application of this classifier with a subset of the complete training set in which some of the training patterns are excluded, in order to reduce the classification error rate. In recent works, genetic algorithms have been successfully applied to determine which patterns must be included in the edited subset. In this paper we propose a novel implementation of a genetic algorithm for designing edited k-nearest neighbor classifiers. It includes the definition of a novel mean square error based fitness function, a novel clustered crossover technique, and the proposal of a fast smart mutation scheme. In order to evaluate the performance of the proposed method, results using the breast cancer database, the diabetes database and the letter recognition database from the UCI machine learning benchmark repository have been included. Both error rate and computational cost have been considered in the analysis. Obtained results show the improvement achieved by the proposed editing method.
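
    To make the idea of an edited training set concrete, the following Python sketch scores one candidate subset, encoded as a 0/1 mask over the training patterns, by the k-nearest neighbor error rate it produces on held-out data. This is an illustration under stated assumptions, not the paper's method: the genetic algorithm itself is omitted, the plain error rate stands in for the mean square error based fitness the abstract mentions, and all names and data are hypothetical.

```python
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training patterns nearest to query.

    train is a list of (vector, label) pairs; distance is squared Euclidean.
    """
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def edited_error_rate(train, mask, validation, k=3):
    """Error rate of k-NN restricted to the training patterns kept by mask."""
    subset = [item for item, keep in zip(train, mask) if keep]
    if len(subset) < k:
        return 1.0  # too few patterns left to vote: treat as worst case
    wrong = sum(knn_predict(subset, x, k) != y for x, y in validation)
    return wrong / len(validation)


# Hypothetical 1-D data; the last training pattern is a mislabeled outlier.
train = [((0.0,), "a"), ((0.1,), "a"), ((0.2,), "a"),
         ((1.0,), "b"), ((1.1,), "b"), ((1.2,), "b"),
         ((0.05,), "b")]
validation = [((0.04,), "a"), ((1.05,), "b")]

keep_all = [1, 1, 1, 1, 1, 1, 1]
drop_outlier = [1, 1, 1, 1, 1, 1, 0]
print(edited_error_rate(train, keep_all, validation, k=1))
print(edited_error_rate(train, drop_outlier, validation, k=1))
```

    A search procedure such as a genetic algorithm would evaluate many such masks and keep those with the lowest score; here, dropping the mislabeled pattern removes the one validation error.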

  12. Prediction of Protein-Protein Interaction Sites Based on Naive Bayes Classifier

    Directory of Open Access Journals (Sweden)

    Haijiang Geng

    2015-01-01

    Full Text Available Proteins function through interactions with other proteins and biomolecules, and these interactions occur on the so-called interface residues of the protein sequences. Identifying interface residues helps us better understand the biological mechanism of protein interaction. Meanwhile, information about the interface residues contributes to the understanding of metabolic and signal transduction networks and indicates directions in drug design. In recent years, researchers have focused on developing new computational methods for predicting protein interface residues. Here we used a 181-dimension protein sequence feature vector as input to a Naive Bayes Classifier (NBC)-based method to predict interaction sites in protein-protein complexes. The prediction of interaction sites in protein interactions is treated as a binary classification problem over amino acid residues by applying NBC with protein sequence features. Independent test results suggested that the Naive Bayes Classifier-based method with the protein sequence features as input vectors performed well.

  13. Reconfiguration-based implementation of SVM classifier on FPGA for Classifying Microarray data.

    Science.gov (United States)

    Hussain, Hanaa M; Benkrid, Khaled; Seker, Huseyin

    2013-01-01

    Classifying Microarray data, which are of high dimensional nature, requires high computational power. Support Vector Machines-based classifier (SVM) is among the most common and successful classifiers used in the analysis of Microarray data but also requires high computational power due to its complex mathematical architecture. Implementing SVM on hardware exploits the parallelism available within the algorithm kernels to accelerate the classification of Microarray data. In this work, a flexible, dynamically and partially reconfigurable implementation of the SVM classifier on Field Programmable Gate Array (FPGA) is presented. The SVM architecture achieved up to 85× speed-up over equivalent general purpose processor (GPP) showing the capability of FPGAs in enhancing the performance of SVM-based analysis of Microarray data as well as future bioinformatics applications.

  14. Bioinformatic approaches to identifying and classifying Rab proteins.

    Science.gov (United States)

    Diekmann, Yoan; Pereira-Leal, José B

    2015-01-01

    The bioinformatic annotation of Rab GTPases is important, for example, to understand the evolution of the endomembrane system. However, Rabs are particularly challenging for standard annotation pipelines because they are similar to other small GTPases and form a large family with many paralogous subfamilies. Here, we describe a bioinformatic annotation pipeline specifically tailored to Rab GTPases. It proceeds in two steps: first, Rabs are distinguished from other proteins based on GTPase-specific motifs, overall sequence similarity to other Rabs, and the occurrence of Rab-specific motifs. Second, Rabs are classified taking either a more accurate but slower phylogenetic approach or a slightly less accurate but much faster bioinformatic approach. All necessary steps can either be performed locally or using the referenced online tools. An implementation of a slightly more involved version of the pipeline presented here is available at RabDB.org.

  15. Classifying smoking urges via machine learning.

    Science.gov (United States)

    Dumortier, Antoine; Beckjord, Ellen; Shiffman, Saul; Sejdić, Ervin

    2016-12-01

    Smoking is the largest preventable cause of death and diseases in the developed world, and advances in modern electronics and machine learning can help us deliver real-time intervention to smokers in novel ways. In this paper, we examine different machine learning approaches to use situational features associated with having or not having urges to smoke during a quit attempt in order to accurately classify high-urge states. To test our machine learning approaches, specifically, Bayes, discriminant analysis and decision tree learning methods, we used a dataset collected from over 300 participants who had initiated a quit attempt. The three classification approaches are evaluated observing sensitivity, specificity, accuracy and precision. The outcome of the analysis showed that algorithms based on feature selection make it possible to obtain high classification rates with only a few features selected from the entire dataset. The classification tree method outperformed the naive Bayes and discriminant analysis methods, with an accuracy of the classifications up to 86%. These numbers suggest that machine learning may be a suitable approach to deal with smoking cessation matters, and to predict smoking urges, outlining a potential use for mobile health applications. In conclusion, machine learning classifiers can help identify smoking situations, and the search for the best features and classifier parameters significantly improves the algorithms' performance. In addition, this study also supports the usefulness of new technologies in improving the effect of smoking cessation interventions, the management of time and patients by therapists, and thus the optimization of available health care resources. Future studies should focus on providing more adaptive and personalized support to people who really need it, in a minimum amount of time by developing novel expert systems capable of delivering real-time interventions. Copyright © 2016 Elsevier Ireland Ltd. All rights

  16. Gearbox Condition Monitoring Using Advanced Classifiers

    Directory of Open Access Journals (Sweden)

    P. Večeř

    2010-01-01

    Full Text Available New efficient and reliable methods for gearbox diagnostics are needed in the automotive industry because of growing demand for production quality. This paper presents the application of two different classifiers for gearbox diagnostics – Kohonen Neural Networks and the Adaptive-Network-based Fuzzy Inference System (ANFIS). Two different practical applications are presented. In the first application, the tested gearboxes are separated into two classes according to their condition indicators. In the second example, ANFIS is applied to label the tested gearboxes with a Quality Index according to the condition indicators. In both applications, the condition indicators were computed from the vibration of the gearbox housing.

  17. Learning Rates for -Regularized Kernel Classifiers

    Directory of Open Access Journals (Sweden)

    Hongzhi Tong

    2013-01-01

    Full Text Available We consider a family of classification algorithms generated from a regularization kernel scheme associated with -regularizer and convex loss function. Our main purpose is to provide an explicit convergence rate for the excess misclassification error of the produced classifiers. The error decomposition includes approximation error, hypothesis error, and sample error. We apply some novel techniques to estimate the hypothesis error and sample error. Learning rates are eventually derived under some assumptions on the kernel, the input space, the marginal distribution, and the approximation error.

  18. Intelligent neural network classifier for automatic testing

    Science.gov (United States)

    Bai, Baoxing; Yu, Heping

    1996-10-01

    This paper is concerned with an application of a multilayer feedforward neural network to the visual inspection of industrial images, and introduces a high-performance image processing and recognition system which can be used for real-time detection of blemishes, streaks, cracks, etc. on the inner walls of high-accuracy pipes. To take full advantage of the capabilities of the artificial neural network, such as distributed information memory, large-scale self-adapting parallel processing and high fault tolerance, this system uses a multilayer perceptron as a regular detector to extract features of the images to be inspected and classify them.

  19. Cubical sets as a classifying topos

    DEFF Research Database (Denmark)

    Spitters, Bas

    Coquand’s cubical set model for homotopy type theory provides the basis for a computational interpretation of the univalence axiom and some higher inductive types, as implemented in the cubical proof assistant. We show that the underlying cube category is the opposite of the Lawvere theory of De Morgan algebras. The topos of cubical sets itself classifies the theory of ‘free De Morgan algebras’. This provides us with a topos with an internal ‘interval’. Using this interval we construct a model of type theory following van den Berg and Garner. We are currently investigating the precise relation...

  20. Classifying spaces of degenerating polarized Hodge structures

    CERN Document Server

    Kato, Kazuya

    2009-01-01

    In 1970, Phillip Griffiths envisioned that points at infinity could be added to the classifying space D of polarized Hodge structures. In this book, Kazuya Kato and Sampei Usui realize this dream by creating a logarithmic Hodge theory. They use the logarithmic structures begun by Fontaine-Illusie to revive nilpotent orbits as a logarithmic Hodge structure. The book focuses on two principal topics. First, Kato and Usui construct the fine moduli space of polarized logarithmic Hodge structures with additional structures. Even for a Hermitian symmetric domain D, the present theory is a refinem

  1. Evaluation of the Vocal Tract Length Normalization Based Classifiers for Speaker Verification

    Directory of Open Access Journals (Sweden)

    Walid Hussein

    2016-12-01

    Full Text Available This paper proposes and evaluates classifiers based on Vocal Tract Length Normalization (VTLN) in a text-dependent speaker verification (SV) task with short testing utterances. This type of task is important in commercial applications and is not easily addressed with methods designed for long utterances such as JFA and i-Vectors. In contrast, VTLN is a speaker compensation scheme that can lead to significant improvements in speech recognition accuracy with just a few seconds of speech samples. A novel scheme to generate new classifiers is employed by incorporating the observation vector sequence compensated with VTLN. The modified sequence of feature vectors and the corresponding warping factors are used to generate classifiers whose scores are combined by a Support Vector Machine (SVM) based SV system. The proposed scheme can provide an average reduction in EER equal to 14% when compared with the baseline system based on the likelihood of observation vectors.

  2. Vision-based posture recognition using an ensemble classifier and a vote filter

    Science.gov (United States)

    Ji, Peng; Wu, Changcheng; Xu, Xiaonong; Song, Aiguo; Li, Huijun

    2016-10-01

    Posture recognition is an important means of Human-Robot Interaction (HRI). To segment the effective posture from an image, we propose an improved region grow algorithm combined with the Single Gauss Color Model. The experiment shows that the improved region grow algorithm extracts a more complete and accurate posture than the traditional Single Gauss Model and region grow algorithm, and that it can eliminate similar regions from the background at the same time. In the posture recognition part, we propose a CNN ensemble classifier to improve the recognition rate, and a vote filter, applied to the sequence of recognition results, to reduce misjudgments during continuous gesture control. The proposed CNN ensemble classifier yields a 96.27% recognition rate, which is better than that of a single CNN classifier, and the proposed vote filter improves the recognition result and reduces misjudgments during consecutive gesture switches.
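
    A vote filter over a sequence of recognition results can be sketched as a sliding-window majority vote. The Python sketch below is an assumption about the general idea only: the abstract does not specify the filter, and the window length, function name and gesture labels are hypothetical.

```python
from collections import Counter, deque


def vote_filter(predictions, window=5):
    """Replace each prediction with the majority label of the last `window` ones.

    Ties go to the label that appears first in the current window.
    """
    recent = deque(maxlen=window)  # the oldest prediction drops out automatically
    smoothed = []
    for label in predictions:
        recent.append(label)
        smoothed.append(Counter(recent).most_common(1)[0][0])
    return smoothed


# Hypothetical per-frame classifier outputs with one spurious "palm" at index 2.
raw = ["fist", "fist", "palm", "fist", "fist", "palm", "palm", "palm"]
print(vote_filter(raw, window=3))
```

    With a window of three frames the isolated "palm" is suppressed, while the genuine switch to "palm" at the end is reported one frame late; a longer window trades more delay for more smoothing.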

  3. Predicting protein subcellular locations using hierarchical ensemble of Bayesian classifiers based on Markov chains

    Directory of Open Access Journals (Sweden)

    Eils Roland

    2006-06-01

    Full Text Available Abstract Background: The subcellular location of a protein is closely related to its function. It would be worthwhile to develop a method to predict the subcellular location of a given protein when only its amino acid sequence is known. Although many efforts have been made to predict subcellular location from sequence information only, further research is needed to improve the accuracy of prediction. Results: A novel method called HensBC is introduced to predict protein subcellular location. HensBC is a recursive algorithm which constructs a hierarchical ensemble of classifiers. The classifiers used are Bayesian classifiers based on Markov chain models. We tested our method on six different datasets, among them a Gram-negative bacteria dataset, data for discriminating outer membrane proteins, and an apoptosis proteins dataset. We observed that our method can predict the subcellular location with high accuracy. Another advantage of the proposed method is that it can improve the prediction accuracy for classes with few sequences in training and is therefore useful for datasets with an imbalanced distribution of classes. Conclusion: This study introduces an algorithm which uses only the primary sequence of a protein to predict its subcellular location. The proposed recursive scheme represents an interesting methodology for learning and combining classifiers. The method is computationally efficient and, as empirical results indicate, competitive with previously reported approaches in terms of prediction accuracy. The code for the software is available upon request.

  4. Characterization of the beta amyloid precursor protein-like gene in the central nervous system of the crab Chasmagnathus. Expression during memory consolidation

    Directory of Open Access Journals (Sweden)

    Fustiñana Maria

    2010-09-01

    Full Text Available Abstract Background: Human β-amyloid, the main component in the neuritic plaques found in patients with Alzheimer's disease, is generated by cleavage of the β-amyloid precursor protein. Beyond their role in pathology, members of this protein family are synaptic proteins and have been associated with synaptogenesis, neuronal plasticity and memory, both in vertebrates and in invertebrates. Consolidation is necessary to convert a short-term labile memory to a long-term and stable form. During consolidation, gene expression and de novo protein synthesis are regulated in order to produce key proteins for the maintenance of plastic changes produced during the acquisition of new information. Results: Here we partially cloned and sequenced the beta-amyloid precursor protein-like gene homologue in the crab Chasmagnathus (cappl), showing 37% identity with the fruit fly Drosophila melanogaster homologue and 23% with Homo sapiens, but with a much higher degree of sequence similarity in certain regions. We observed a wide distribution of cappl mRNA in the nervous system as well as in muscle and gills. The protein was localized in all tissues analyzed with the exception of muscle. Immunofluorescence revealed localization of cAPPL in associative and sensory brain areas. We studied gene and protein expression during long-term memory consolidation using a well characterized memory model: the context-signal associative memory in this crab species. mRNA levels varied at different time points during long-term memory consolidation and correlated with cAPPL protein levels. Conclusions: cAPPL mRNA and protein are widely distributed in the central nervous system of the crab, and the time course of expression suggests a role of cAPPL during long-term memory formation.

  5. Objectively classifying Southern Hemisphere extratropical cyclones

    Science.gov (United States)

    Catto, Jennifer

    2016-04-01

    There has been a long tradition in attempting to separate extratropical cyclones into different classes depending on their cloud signatures, airflows, synoptic precursors, or upper-level flow features. Depending on these features, the cyclones may have different impacts, for example in their precipitation intensity. It is important, therefore, to understand how the distribution of different cyclone classes may change in the future. Many of the previous classifications have been performed manually. In order to be able to evaluate climate models and understand how extratropical cyclones might change in the future, we need to be able to use an automated method to classify cyclones. Extratropical cyclones have been identified in the Southern Hemisphere from the ERA-Interim reanalysis dataset with a commonly used identification and tracking algorithm that employs 850 hPa relative vorticity. A clustering method applied to large-scale fields from ERA-Interim at the time of cyclone genesis (when the cyclone is first detected), has been used to objectively classify identified cyclones. The results are compared to the manual classification of Sinclair and Revell (2000) and the four objectively identified classes shown in this presentation are found to match well. The relative importance of diabatic heating in the clusters is investigated, as well as the differing precipitation characteristics. The success of the objective classification shows its utility in climate model evaluation and climate change studies.

  6. Adaptive classifier for steel strip surface defects

    Science.gov (United States)

    Jiang, Mingming; Li, Guangyao; Xie, Li; Xiao, Mang; Yi, Li

    2017-01-01

    Surface defect detection systems have been receiving increased attention for their precision, speed and low cost. One of the main challenges is reacting to accuracy deterioration over time as equipment ages and processes change. These variations make only a tiny change to the real-world model but have a big impact on the classification result. In this paper, we propose a new adaptive classifier with a Bayes kernel (BYEC) which updates the model with a small sample so that it adapts to accuracy deterioration. Firstly, abundant features were introduced to cover as much information about the defects as possible. Secondly, we constructed a series of SVMs on random subspaces of the features. Then, a Bayes classifier was trained as an evolutionary kernel to fuse the results from the base SVMs. Finally, we proposed a method to update the Bayes evolutionary kernel. The proposed algorithm is experimentally compared with different algorithms; the experimental results demonstrate that the proposed method can be updated with a small sample and fits the changed model well. The experiments show robustness, a low requirement for samples and adaptivity.

  7. A systematic comparison of supervised classifiers.

    Directory of Open Access Journals (Sweden)

    Diego Raphael Amancio

    Full Text Available Pattern recognition has been employed in a myriad of industrial, commercial and academic applications. Many techniques have been devised to tackle such a diversity of applications. Despite the long tradition of pattern recognition research, there is no technique that yields the best classification in all scenarios. Therefore, as many techniques as possible should be considered in high accuracy applications. Typical related works either focus on the performance of a given algorithm or compare various classification methods. On many occasions, however, researchers who are not experts in the field of machine learning have to deal with practical classification tasks without in-depth knowledge of the underlying parameters. In fact, the adequate choice of classifiers and parameters in such practical circumstances constitutes a long-standing problem and is one of the subjects of the current paper. We carried out a performance study of nine well-known classifiers implemented in the Weka framework and compared the influence of the parameter configurations on the accuracy. The default configuration of parameters in Weka was found to provide near optimal performance for most cases, not including methods such as the support vector machine (SVM). In addition, the k-nearest neighbor method frequently gave the best accuracy. In certain conditions, it was possible to improve the quality of SVM by more than 20% with respect to its default parameter configuration.

  8. Focused crawler based on Bayesian classifier

    Directory of Open Access Journals (Sweden)

    JIA Haijun

    2013-12-01

    With the rapid development of the network, its information resources are increasingly large, and faced with a huge amount of information, search engines play an important role. Focused crawling, as the core component of a search engine, is used to calculate the relationship between search results and search topics, which is called correlation. Normally, a focused crawling method only calculates the correlation between web content and the search topics. In this paper, the focused crawling method computes the importance of links through link content and anchor text, then a Bayesian classifier is used to classify the links, and finally a cosine similarity function is used to calculate the relevance of web pages. If the correlation value is greater than the threshold, the page is considered to be associated with the predetermined topics; otherwise it is considered not relevant. Experimental results show that high accuracy can be obtained with the proposed crawling approach.
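
    The final relevance test in this abstract (cosine similarity compared against a threshold) can be sketched as follows. This is a minimal illustration, not the paper's code: it assumes plain term-frequency vectors built by whitespace tokenization, and the function names and the threshold value of 0.3 are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between term-frequency vectors of two texts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as unrelated
    return dot / (norm_a * norm_b)


def is_relevant(page_text, topic_text, threshold=0.3):
    """A page is on topic when its similarity exceeds the threshold."""
    return cosine_similarity(page_text, topic_text) > threshold
```

    A real crawler would more likely use TF-IDF weights and stemming; raw counts are used here only to keep the example self-contained.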

  9. Rotational Study of Ambiguous Taxonomic Classified Asteroids

    Science.gov (United States)

    Linder, Tyler R.; Sanchez, Rick; Wuerker, Wolfgang; Clayson, Timothy; Giles, Tucker

    2017-01-01

    The Sloan Digital Sky Survey (SDSS) moving object catalog (MOC4) provided the largest ever catalog of asteroid spectrophotometry observations. Carvano et al. (2010), while analyzing MOC4, discovered that individual observations of asteroids which were observed multiple times did not classify into the same photometric-based taxonomic class. A small subset of those asteroids were classified as having both the presence and absence of a 1 µm silicate absorption feature. If these variations are linked to differences in surface mineralogy, the prevailing assumption that an asteroid’s surface composition is predominantly homogeneous would need to be reexamined. Furthermore, our understanding of the evolution of the asteroid belt, as well as the linkage between certain asteroids and meteorite types, may need to be modified. This research is an investigation to determine the rotational rates of these taxonomically ambiguous asteroids. Initial questions to be answered: Do these asteroids have unique or nonstandard rotational rates? Is there any evidence in their light curves to suggest an abnormality? Observations were taken using PROMPT6, a 0.41-m telescope that is part of the SKYNET network at Cerro Tololo Inter-American Observatory (CTIO). Observations were calibrated and analyzed using Canopus software. Initial results will be presented at AAS.

  10. Classifying anatomical subtypes of subjective memory impairment.

    Science.gov (United States)

    Jung, Na-Yeon; Seo, Sang Won; Yoo, Heejin; Yang, Jin-Ju; Park, Seongbeom; Kim, Yeo Jin; Lee, Juyoun; Lee, Jin San; Jang, Young Kyoung; Lee, Jong Min; Kim, Sung Tae; Kim, Seonwoo; Kim, Eun-Joo; Na, Duk L; Kim, Hee Jin

    2016-12-01

    We aimed to categorize subjective memory impairment (SMI) individuals based on their patterns of cortical thickness and to propose simple models that can classify each subtype. We recruited 613 SMI individuals and 613 age- and gender-matched normal controls. Using hierarchical agglomerative cluster analysis, SMI individuals were divided into 3 subtypes: temporal atrophy (12.9%), minimal atrophy (52.4%), and diffuse atrophy (34.6%). Individuals in the temporal atrophy (Alzheimer's disease-like atrophy) subtype were older, had more vascular risk factors, and scored the lowest on neuropsychological tests. Combination of these factors classified the temporal atrophy subtype with 73.2% accuracy. On the other hand, individuals with the minimal atrophy (non-neurodegenerative) subtype were younger, were more likely to be female, and had depression. Combination of these factors discriminated the minimal atrophy subtype with 76.0% accuracy. We suggest that SMI can be largely categorized into 3 anatomical subtypes that have distinct clinical features. Our models may help physicians decide next steps when encountering SMI patients and may also be used in clinical trials.

  11. 5 CFR 1312.23 - Access to classified information.

    Science.gov (United States)

    2010-01-01

    ... 5 Administrative Personnel 3 2010-01-01 2010-01-01 false Access to classified information. 1312.23... Classified Information § 1312.23 Access to classified information. Classified information may be made... “need to know” and the access is essential to the accomplishment of official government duties....

  12. Comparison of Current Frame-Based Phoneme Classifiers

    Directory of Open Access Journals (Sweden)

    Vaclav Pfeifer

    2011-01-01

    This paper compares today’s most common frame-based classifiers. These classifiers can be divided into two main groups: generic classifiers, which create the most probable model based on the training data (for example GMM), and discriminative classifiers, which focus on creating a decision hyperplane. A lot of research has been done with GMM classifiers, and therefore this paper will be mainly focused on the frame-based classifiers. Two discriminative classifiers will be presented. These classifiers implement a hierarchical tree root structure over the input phoneme group, which has been shown to be effective. Based on these classifiers, two efficient training algorithms will be presented. We demonstrate the advantages of our training algorithms by evaluating all classifiers on the TIMIT speech corpus.

  13. Hybrid Neuro-Fuzzy Classifier Based On Nefclass Model

    Directory of Open Access Journals (Sweden)

    Bogdan Gliwa

    2011-01-01

    The paper presents a hybrid neuro-fuzzy classifier based on a modified NEFCLASS model. The presented classifier was compared to popular classifiers: neural networks and k-nearest neighbours. The efficiency of the modifications in the classifier was compared with the methods used in the original NEFCLASS model (learning methods). The accuracy of the classifier was tested using 3 datasets from the UCI Machine Learning Repository: iris, wine and breast cancer wisconsin. Moreover, the influence of ensemble classification methods on classification accuracy was presented.

  14. Classifying antiarrhythmic actions: by facts or speculation.

    Science.gov (United States)

    Vaughan Williams, E M

    1992-11-01

    Classification of antiarrhythmic actions is reviewed in the context of the results of the Cardiac Arrhythmia Suppression Trials, CAST 1 and 2. Six criticisms of the classification recently published (The Sicilian Gambit) are discussed in detail. The alternative classification, when stripped of speculative elements, is shown to be similar to the original classification. Claims that the classification failed to predict the efficacy of antiarrhythmic drugs for the selection of appropriate therapy have been tested by an example. The antiarrhythmic actions of cibenzoline were classified in 1980. A detailed review of confirmatory experiments and clinical trials during the past decade shows that predictions made at the time agree with subsequent results. Classification of the effects drugs actually have on functioning cardiac tissues provides a rational basis for finding the preferred treatment for a particular arrhythmia in accordance with the diagnosis.

  15. A cognitive approach to classifying perceived behaviors

    Science.gov (United States)

    Benjamin, Dale Paul; Lyons, Damian

    2010-04-01

    This paper describes our work on integrating distributed, concurrent control in a cognitive architecture, and using it to classify perceived behaviors. We are implementing the Robot Schemas (RS) language in Soar. RS is a CSP-type programming language for robotics that controls a hierarchy of concurrently executing schemas. The behavior of every RS schema is defined using port automata. This provides precision to the semantics and also a constructive means of reasoning about the behavior and meaning of schemas. Our implementation uses Soar operators to build, instantiate and connect port automata as needed. Our approach is to use comprehension through generation (similar to NLSoar) to search for ways to construct port automata that model perceived behaviors. The generality of RS permits us to model dynamic, concurrent behaviors. A virtual world (Ogre) is used to test the accuracy of these automata. Soar's chunking mechanism is used to generalize and save these automata. In this way, the robot learns to recognize new behaviors.

  16. Learning Vector Quantization for Classifying Astronomical Objects

    Institute of Scientific and Technical Information of China (English)

    2003-01-01

    The sizes of astronomical surveys in different wavebands are increasing rapidly. Therefore, automatic classification of objects is becoming ever more important. We explore the performance of learning vector quantization (LVQ) in classifying multi-wavelength data. Our analysis concentrates on separating active sources from non-active ones. Different classes of X-ray emitters populate distinct regions of a multidimensional parameter space. In order to explore the distribution of various objects in a multidimensional parameter space, we positionally cross-correlate the data of quasars, BL Lacs, active galaxies, stars and normal galaxies in the optical, X-ray and infrared bands. We then apply LVQ to classify them with the obtained data. Our results show that LVQ is an effective method for separating AGNs from stars and normal galaxies with multi-wavelength data.
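
    For readers unfamiliar with LVQ, the basic LVQ1 update rule can be sketched in a few lines. This is a generic textbook variant, not necessarily the configuration used in the paper: the linearly decaying learning rate, the function names, and the two-feature toy data with "star" and "AGN" labels are all assumptions made for illustration.

```python
def nearest_prototype(prototypes, x):
    """Index of the prototype closest to x (squared Euclidean distance)."""
    def sq_dist(vec):
        return sum((v - xi) ** 2 for v, xi in zip(vec, x))
    return min(range(len(prototypes)), key=lambda i: sq_dist(prototypes[i][0]))


def train_lvq1(samples, labels, prototypes, lr=0.1, epochs=20):
    """LVQ1: pull the winning prototype toward a sample of the same class,
    push it away from a sample of a different class."""
    protos = [(list(vec), label) for vec, label in prototypes]
    for epoch in range(epochs):
        rate = lr * (1 - epoch / epochs)  # linearly decaying learning rate
        for x, y in zip(samples, labels):
            vec, label = protos[nearest_prototype(protos, x)]
            sign = 1.0 if label == y else -1.0
            for d in range(len(vec)):
                vec[d] += sign * rate * (x[d] - vec[d])
    return protos


def predict(protos, x):
    """Label of the prototype nearest to x."""
    return protos[nearest_prototype(protos, x)][1]


# Toy two-feature data; the values and labels are purely illustrative.
samples = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (2.0, 2.0), (2.1, 1.8), (1.9, 2.2)]
labels = ["star", "star", "star", "AGN", "AGN", "AGN"]
initial = [((0.5, 0.5), "star"), ((1.5, 1.5), "AGN")]
trained = train_lvq1(samples, labels, initial)
```

    After training, classification is a nearest-prototype lookup, which is why LVQ scales well to large survey catalogs.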

  17. Speech Emotion Recognition Using Fuzzy Logic Classifier

    Directory of Open Access Journals (Sweden)

    Daniar aghsanavard

    2016-01-01

    Over the last two decades, emotion, speech recognition and signal processing have been among the most significant issues in the development of techniques to detect emotions. Each method has advantages and disadvantages. This paper proposes fuzzy speech emotion recognition based on the classification of speech signals, aiming at better recognition together with higher speed. The system uses a five-layer fuzzy logic system that combines a progressive neural network with the firefly optimization algorithm. First, speech samples are given to the input of the fuzzy system, and then the signals are analyzed and given a preliminary classification in a fuzzy framework. In this model, a pattern of signals is created for each class of signals, which reduces the dimension of the signal data and makes speech recognition easier. The experimental results show that the proposed method (categorized by firefly) improves the recognition of utterances.

  18. Classifying and ranking DMUs in interval DEA

    Institute of Scientific and Technical Information of China (English)

    GUO Jun-peng; WU Yu-hua; LI Wen-hua

    2005-01-01

    During efficiency evaluation by DEA, the inputs and outputs of DMUs may be intervals because of insufficient information or measurement error. For this reason, interval DEA is proposed. To make the efficiency scores more discriminative, this paper builds an Interval Modified DEA (IMDEA) model based on MDEA. Furthermore, models for obtaining the upper and lower bounds of the efficiency scores for each DMU are set up. Based on this, the DMUs are classified into three types. Next, a new order relation between intervals, which can express the DM's preference for the three types, is proposed. As a result, a full and more convincing ranking is made of all the DMUs. Finally, an example is given.
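
    The abstract does not state how the three types are defined. A convention often used in interval DEA, assumed here and not taken from the paper, is to compare the lower and upper efficiency bounds against 1: a DMU is efficient in every case, efficient only in the best case, or inefficient in every case. The sketch below shows only that classification step; the type labels, the function name, and the tolerance are hypothetical, and the paper's order relation for ranking is not reproduced.

```python
def classify_dmu(lower, upper, tol=1e-9):
    """Classify a DMU from the bounds of its interval efficiency score.

    Assumed convention (not confirmed by the abstract):
      "E++": lower bound reaches 1, so the DMU is efficient in every case
      "E+" : only the upper bound reaches 1, so it is efficient at best
      "E-" : even the upper bound stays below 1, so it is never efficient
    """
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if lower >= 1 - tol:
        return "E++"
    if upper >= 1 - tol:
        return "E+"
    return "E-"


# Hypothetical interval scores for three DMUs.
intervals = {"A": (1.0, 1.2), "B": (0.8, 1.0), "C": (0.5, 0.9)}
types = {name: classify_dmu(lo, hi) for name, (lo, hi) in intervals.items()}
```

    The comparison uses "at least 1" because modified (super-efficiency style) DEA scores may exceed 1.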

  19. Classifying prion and prion-like phenomena.

    Science.gov (United States)

    Harbi, Djamel; Harrison, Paul M

    2014-01-01

    The universe of prion and prion-like phenomena has expanded significantly in the past several years. Here, we overview the challenges in classifying this data informatically, given that terms such as "prion-like", "prion-related" or "prion-forming" do not have a stable meaning in the scientific literature. We examine the spectrum of proteins that have been described in the literature as forming prions, and discuss how "prion" can have a range of meaning, with a strict definition being for demonstration of infection with in vitro-derived recombinant prions. We suggest that although prion/prion-like phenomena can largely be apportioned into a small number of broad groups dependent on the type of transmissibility evidence for them, as new phenomena are discovered in the coming years, a detailed ontological approach might be necessary that allows for subtle definition of different "flavors" of prion / prion-like phenomena.

  20. CLASSIFYING MEDICAL IMAGES USING MORPHOLOGICAL APPEARANCE MANIFOLDS.

    Science.gov (United States)

    Varol, Erdem; Gaonkar, Bilwaj; Davatzikos, Christos

    2013-12-31

    Input features for medical image classification algorithms are extracted from raw images using a series of preprocessing steps. One common preprocessing step in computational neuroanatomy and functional brain mapping is the nonlinear registration of raw images to a common template space. Typically, the registration methods used are parametric and their output varies greatly with changes in parameters. Most results reported previously perform registration using a fixed parameter setting and use the results as input to the subsequent classification step. The variation in registration results due to choice of parameters thus translates to variation of performance of the classifiers that depend on the registration step for input. Analogous issues have been investigated in the computer vision literature, where image appearance varies with pose and illumination, thereby making classification vulnerable to these confounding parameters. The proposed methodology addresses this issue by sampling image appearances as registration parameters vary, and shows that better classification accuracies can be obtained this way, compared to the conventional approach.

  1. Segmentation of Fingerprint Images Using Linear Classifier

    Directory of Open Access Journals (Sweden)

    Xinjian Chen

    2004-04-01

    An algorithm for the segmentation of fingerprints and a criterion for evaluating the block feature are presented. The segmentation uses three block features: the block clusters degree, the block mean information, and the block variance. An optimal linear classifier has been trained for the classification per block, and the criterion of the minimal number of misclassified samples is used. Morphology has been applied as postprocessing to reduce the number of classification errors. The algorithm is tested on the FVC2002 database: only 2.45% of the blocks are misclassified, while the postprocessing further reduces this ratio. Experiments have shown that the proposed segmentation method performs very well in rejecting false fingerprint features from the noisy background.

  2. Sequence Patterns of Identity Authentication Protocols

    Institute of Scientific and Technical Information of China (English)

    Tao Hongcai; He Dake

    2006-01-01

    From the viewpoint of protocol sequence, analyses are made of the sequence patterns of possible identity authentication protocols under two cases: with or without the trusted third party (TTP). Ten feasible sequence patterns of authentication protocols with TTP and 5 sequence patterns without TTP are obtained. These sequence patterns meet the requirements for identity authentication and cover almost all the authentication protocols with TTP and without TTP at present. All of the sequence patterns obtained are classified into unilateral or bilateral authentication. Then, according to the sequence symmetry, several good sequence patterns with TTP are evaluated. The accomplished results can provide a reference for the design of new identity authentication protocols.

  3. Classifying supernovae using only galaxy data

    Energy Technology Data Exchange (ETDEWEB)

    Foley, Ryan J. [Astronomy Department, University of Illinois at Urbana-Champaign, 1002 West Green Street, Urbana, IL 61801 (United States)]; Mandel, Kaisey [Harvard-Smithsonian Center for Astrophysics, 60 Garden Street, Cambridge, MA 02138 (United States)]

    2013-12-01

    We present a new method for probabilistically classifying supernovae (SNe) without using SN spectral or photometric data. Unlike all previous studies to classify SNe without spectra, this technique does not use any SN photometry. Instead, the method relies on host-galaxy data. We build upon the well-known correlations between SN classes and host-galaxy properties, specifically that core-collapse SNe rarely occur in red, luminous, or early-type galaxies. Using the nearly spectroscopically complete Lick Observatory Supernova Search sample of SNe, we determine SN fractions as a function of host-galaxy properties. Using these data as inputs, we construct a Bayesian method for determining the probability that an SN is of a particular class. This method improves a common classification figure of merit by a factor of >2, comparable to the best light-curve classification techniques. Of the galaxy properties examined, morphology provides the most discriminating information. We further validate this method using SN samples from the Sloan Digital Sky Survey and the Palomar Transient Factory. We demonstrate that this method has wide-ranging applications, including separating different subclasses of SNe and determining the probability that an SN is of a particular class before photometry or even spectra can. Since this method uses completely independent data from light-curve techniques, there is potential to further improve the overall purity and completeness of SN samples and to test systematic biases of the light-curve techniques. Further enhancements to the host-galaxy method, including additional host-galaxy properties, combination with light-curve methods, and hybrid methods, should further improve the quality of SN samples from past, current, and future transient surveys.
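
    The Bayesian step in this abstract can be illustrated with a small worked example. It is a sketch under stated assumptions, not the authors' method: host-galaxy properties are treated as conditionally independent given the SN class (a naive Bayes simplification), and every number below is invented for illustration rather than taken from the Lick Observatory Supernova Search sample.

```python
def class_posterior(priors, likelihoods, observed):
    """Posterior P(class | host properties) by Bayes' rule.

    priors:      {class: P(class)}
    likelihoods: {class: {property: {value: P(value | class)}}}
    observed:    {property: value} for the host galaxy
    Properties are assumed conditionally independent given the class.
    """
    unnormalized = {}
    for sn_class, prior in priors.items():
        p = prior
        for prop, value in observed.items():
            p *= likelihoods[sn_class][prop][value]
        unnormalized[sn_class] = p
    total = sum(unnormalized.values())
    return {sn_class: p / total for sn_class, p in unnormalized.items()}


# Invented numbers: core-collapse (CC) SNe are made rare in elliptical hosts,
# mirroring the qualitative correlation the abstract describes.
priors = {"Ia": 0.3, "CC": 0.7}
likelihoods = {
    "Ia": {"morphology": {"elliptical": 0.25, "spiral": 0.75}},
    "CC": {"morphology": {"elliptical": 0.01, "spiral": 0.99}},
}
posterior = class_posterior(priors, likelihoods, {"morphology": "elliptical"})
```

    With these made-up inputs, an elliptical host shifts most of the probability to the Ia class even though the prior favours core-collapse events, which is the kind of leverage the abstract attributes to morphology.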

  4. Classifying transcription factor targets and discovering relevant biological features

    Directory of Open Access Journals (Sweden)

    DeLisi Charles

    2008-05-01

    Background: An important goal in post-genomic research is discovering the network of interactions between transcription factors (TFs) and the genes they regulate. We have previously reported the development of a supervised-learning approach to TF target identification, and used it to predict targets of 104 transcription factors in yeast. We now include a new sequence conservation measure, expand our predictions to include 59 new TFs, introduce a web-server, and implement an improved ranking method to reveal the biological features contributing to regulation. The classifiers combine 8 genomic datasets covering a broad range of measurements including sequence conservation, sequence overrepresentation, gene expression, and DNA structural properties. Principal Findings: (1) Application of the method yields an amplification of information about yeast regulators. The ratio of total targets to previously known targets is greater than 2 for 11 TFs, with several having larger gains: Ash1 (4), Ino2 (2.6), Yaf1 (2.4), and Yap6 (2.4). (2) Many predicted targets for TFs match well with the known biology of their regulators. As a case study we discuss the regulator Swi6, presenting evidence that it may be important in the DNA damage response, and that the previously uncharacterized gene YMR279C plays a role in DNA damage response and perhaps in cell-cycle progression. (3) A procedure based on recursive-feature-elimination is able to uncover from the large initial data sets those features that best distinguish targets for any TF, providing clues relevant to its biology. An analysis of Swi6 suggests a possible role in lipid metabolism, and more specifically in metabolism of ceramide, a bioactive lipid currently being investigated for anti-cancer properties. (4) An analysis of global network properties highlights the transcriptional network hubs; the factors which control the most genes and the genes which are bound by the largest set of regulators. Cell-cycle and

  5. Facial Expression Recognition Using SVM Classifier

    OpenAIRE

    2015-01-01

    Facial feature tracking and facial actions recognition from image sequence attracted great attention in computer vision field. Computational facial expression analysis is a challenging research topic in computer vision. It is required by many applications such as human-computer interaction, computer graphic animation and automatic facial expression recognition. In recent years, plenty of computer vision techniques have been developed to track or recognize the facial activities in three levels...

  6. Fuzzy Wavenet (FWN) classifier for medical images

    Directory of Open Access Journals (Sweden)

    Entather Mahos

    2005-01-01

    The combination of wavelet theory and neural networks has led to the development of wavelet networks. Wavelet networks are feed-forward neural networks that use wavelets as activation functions. Wavelet networks have been used in classification and identification problems with some success. In this work we propose a fuzzy wavenet network (FWN), which learns by the common back-propagation algorithm, to classify medical images. First, the library of medical images was analyzed. Second, two experimental rule tables provide an excellent opportunity to test the ability of the fuzzy wavenet network, owing to the high level of information variability often experienced with this type of image. The wavelet transformation is known to be more accurate for small-dimension problems, but image processing is a large-dimension problem, so we used a neural network. Results are presented on the application of the three-layer fuzzy wavenet to a vision system. They demonstrate a considerable improvement in performance from the proposed two rule tables for fuzzy and deterministic dilation and translation in wavelet transformation techniques.

  7. Classifying pronouns: the view from Romanian

    Directory of Open Access Journals (Sweden)

    Alexandra Cornilescu

    2014-05-01

    This paper is devoted to the analysis of (DP, AP, and PP) postnominal modifiers of personal pronouns, focusing especially on Romanian. Regarding the internal structure of personal pronouns, we adopt the traditional view that they actually do not have a nominal restriction; instead, they themselves are definite NPs that raise to the D-domain, thus coming to be DPs. By means of the suffixal definite article, Romanian provides a contrast between definite modifiers, which prove to be DP-internal, and non-definite modifiers, which prove to be DP-external. Non-definite modifiers are non-problematic: they are predicates in a small clause configuration. By contrast, the definite postpronominal modifiers are analysed as occupying the specifier position of a Classifier Phrase, present in the extended projection of DPs headed by pronouns and proper names (Cornilescu 2007); the modifier “classifies” the personal pronouns with respect to the kind of the pronoun’s referent (e.g. we linguists / Rom. noi lingviştii). Corroborative data from English and other Romance languages support the proposed analysis.

  8. Is it important to classify ischaemic stroke?

    LENUS (Irish Health Repository)

    Iqbal, M

    2012-02-01

    Thirty-five percent of all ischemic events remain classified as cryptogenic. This study was conducted to ascertain the accuracy of diagnosis of ischaemic stroke based on information given in the medical notes. It was tested by applying the clinical information to the (TOAST) criteria. One hundred and five patients presented with acute stroke between Jan-Jun 2007. Data was collected on 90 patients. Male to female ratio was 39:51 with age range of 47-93 years. Sixty (67%) patients had total/partial anterior circulation stroke; 5 (5.6%) had a lacunar stroke and in 25 (28%) the mechanism of stroke could not be identified. Four (4.4%) patients with small vessel disease were anticoagulated; 5 (5.6%) with atrial fibrillation received antiplatelet therapy and 2 (2.2%) patients with atrial fibrillation underwent CEA. This study revealed deficiencies in the clinical assessment of patients and treatment was not tailored to the mechanism of stroke in some patients.

  9. Combining classifiers for robust PICO element detection

    Directory of Open Access Journals (Sweden)

    Grad Roland

    2010-05-01

    Background: Formulating a clinical information need in terms of the four atomic parts, which are Population/Problem, Intervention, Comparison and Outcome (known as PICO elements), facilitates searching for a precise answer within a large medical citation database. However, using PICO-defined items in the information retrieval process requires a search engine to be able to detect and index PICO elements in the collection in order for the system to retrieve relevant documents. Methods: In this study, we tested multiple supervised classification algorithms and their combinations for detecting PICO elements within medical abstracts. Using the structural descriptors that are embedded in some medical abstracts, we have automatically gathered large training/testing data sets for each PICO element. Results: Combining multiple classifiers using a weighted linear combination of their prediction scores achieves promising results with an f-measure score of 86.3% for P, 67% for I and 56.6% for O. Conclusions: Our experiments on the identification of PICO elements showed that the task is very challenging. Nevertheless, the performance achieved by our identification method is competitive with previously published results and shows that this task can be achieved with a high accuracy for the P element but lower ones for I and O elements.
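
    The two quantities named in the Results can be sketched as follows: a weighted linear combination of classifier scores, and the f-measure used to evaluate it. This is an illustration only. Normalizing by the sum of the weights is a choice made here, and the function names, weights and scores are hypothetical, not values from the study.

```python
def combine_scores(scores, weights):
    """Weighted linear combination of per-classifier prediction scores,
    normalized by the total weight so the result stays on the input scale."""
    if len(scores) != len(weights):
        raise ValueError("one weight is needed per classifier score")
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


def f_measure(predicted, actual):
    """Balanced F-measure (harmonic mean of precision and recall)."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical scores from three classifiers for one sentence being a
# "Population" element, with the first classifier trusted twice as much.
combined = combine_scores([0.9, 0.4, 0.6], [2.0, 1.0, 1.0])
```

    In practice the weights would be tuned on held-out data, for example by choosing the combination that maximizes the f-measure for each PICO element separately.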

  10. Colorization by classifying the prior knowledge

    Institute of Scientific and Technical Information of China (English)

    DU Weiwei

    2011-01-01

    When a one-dimensional luminance scalar is replaced by a vector of a colorful multi-dimension for every pixel of a monochrome image,the process is called colorization.However,colorization is under-constrained.Therefore,the prior knowledge is considered and given to the monochrome image.Colorization using optimization algorithm is an effective algorithm for the above problem.However,it cannot effectively do with some images well without repeating experiments for confirming the place of scribbles.In this paper,a colorization algorithm is proposed,which can automatically generate the prior knowledge.The idea is that firstly,the prior knowledge crystallizes into some points of the prior knowledge which is automatically extracted by downsampling and upsampling method.And then some points of the prior knowledge are classified and given with corresponding colors.Lastly,the color image can be obtained by the color points of the prior knowledge.It is demonstrated that the proposal can not only effectively generate the prior knowledge but also colorize the monochrome image according to requirements of user with some experiments.

  12. Fault diagnosis with the Aladdin transient classifier

    Science.gov (United States)

    Roverso, Davide

    2003-08-01

    The purpose of Aladdin is to assist plant operators in the early detection and diagnosis of faults and anomalies in the plant that either have an impact on the plant performance, or that could lead to a plant shutdown or component damage if allowed to go unnoticed. The kind of early fault detection and diagnosis performed by Aladdin is aimed at allowing more time for decision making, increasing the operator awareness, reducing component damage, and supporting improved plant availability and reliability. In this paper we describe in broad lines the Aladdin transient classifier, which combines techniques such as recurrent neural network ensembles, Wavelet On-Line Pre-processing (WOLP), and Autonomous Recursive Task Decomposition (ARTD), in an attempt to improve the practical applicability and scalability of this type of systems to real processes and machinery. The paper focuses then on describing an application of Aladdin to a Nuclear Power Plant (NPP) through the use of the HAMBO experimental simulator of the Forsmark 3 boiling water reactor NPP in Sweden. It should be pointed out that Aladdin is not necessarily restricted to applications in NPPs. Other types of power plants, or even other types of processes, can also benefit from the diagnostic capabilities of Aladdin.

  13. Classifying Unidentified Gamma-ray Sources

    CERN Document Server

    Salvetti, David

    2016-01-01

    During its first 2 years of mission the Fermi-LAT instrument discovered more than 1,800 gamma-ray sources in the 100 MeV to 100 GeV range. Despite the application of advanced techniques to identify and associate the Fermi-LAT sources with counterparts at other wavelengths, about 40% of the LAT sources have no clear identification and remain "unassociated". The purpose of my Ph.D. work has been to pursue a statistical approach to identify the nature of each Fermi-LAT unassociated source. To this aim, we implemented advanced machine learning techniques, such as logistic regression and artificial neural networks, to classify these sources on the basis of all the available gamma-ray information about location, energy spectrum and time variability. These analyses have been used for selecting targets for AGN and pulsar searches and planning multi-wavelength follow-up observations. In particular, we have focused our attention on the search for possible radio-quiet millisecond pulsar (MSP) candidates in the sample of...

  14. Classifier-assisted metric for chromosome pairing.

    Science.gov (United States)

    Ventura, Rodrigo; Khmelinskii, Artem; Sanches, J

    2010-01-01

    Cytogenetics plays a central role in the detection of chromosomal abnormalities and in the diagnosis of genetic diseases. A karyogram is an image representation of human chromosomes arranged in order of decreasing size and paired in 23 classes. In this paper we propose an approach to automatically pair the chromosomes into a karyogram, using the information obtained in a rough SVM-based classification step to help the pairing process, which is mainly based on similarity metrics between the chromosomes. Using a set of geometric and band pattern features extracted from the chromosome images, the algorithm is formulated in a Bayesian framework, combining the similarity metric with the results from the classifier. The solution is obtained by solving a mixed integer program. Two datasets with contrasting quality levels and 836 chromosomes each were used to test and validate the algorithm. Relevant improvements with respect to the algorithm described by the authors in [1] were obtained, with average pairing rates above 92%, close to the rates obtained by human operators.

  15. Classifying lipoproteins based on their polar profiles.

    Science.gov (United States)

    Polanco, Carlos; Castañón-González, Jorge Alberto; Buhse, Thomas; Uversky, Vladimir N; Amkie, Rafael Zonana

    2016-01-01

    The lipoproteins are an important group of cargo proteins known for their unique capability to transport lipids. By applying the Polarity index algorithm, which has a metric that only considers the polar profile of the linear sequences of the lipoprotein group, we obtained an analytical and structural differentiation of all the lipoproteins found in UniProt Database. Also, the functional groups of lipoproteins, and particularly of the set of lipoproteins relevant to atherosclerosis, were analyzed with the same method to reveal their structural preference, and the results of Polarity index analysis were verified by an alternate test, the Cumulative Distribution Function algorithm, applied to the same groups of lipoproteins.

  16. Classifying Cognitive Profiles Using Machine Learning with Privileged Information in Mild Cognitive Impairment

    Science.gov (United States)

    Alahmadi, Hanin H.; Shen, Yuan; Fouad, Shereen; Luft, Caroline Di B.; Bentham, Peter; Kourtzi, Zoe; Tino, Peter

    2016-01-01

    Early diagnosis of dementia is critical for assessing disease progression and potential treatment. State-of-the-art machine learning techniques have been increasingly employed to take on this diagnostic task. In this study, we employed Generalized Matrix Learning Vector Quantization (GMLVQ) classifiers to discriminate patients with Mild Cognitive Impairment (MCI) from healthy controls based on their cognitive skills. Further, we adopted a “Learning with privileged information” approach to combine cognitive and fMRI data for the classification task. The resulting classifier operates solely on the cognitive data while it incorporates the fMRI data as privileged information (PI) during training. This novel classifier is of practical use as the collection of brain imaging data is not always possible with patients and older participants. MCI patients and healthy age-matched controls were trained to extract structure from temporal sequences. We ask whether machine learning classifiers can be used to discriminate patients from controls and whether differences between these groups relate to individual cognitive profiles. To this end, we tested participants in four cognitive tasks: working memory, cognitive inhibition, divided attention, and selective attention. We also collected fMRI data before and after training on a probabilistic sequence learning task and extracted fMRI responses and connectivity as features for machine learning classifiers. Our results show that the PI-guided GMLVQ classifiers outperform the baseline classifier that only used the cognitive data. In addition, we found that for the baseline classifier, divided attention is the only relevant cognitive feature. When PI was incorporated, divided attention remained the most relevant feature while cognitive inhibition also became relevant for the task. Interestingly, this analysis for the fMRI GMLVQ classifier suggests that (1) when overall fMRI signal is used as inputs to the classifier, the post

  17. Classifying Cognitive Profiles Using Machine Learning with Privileged Information in Mild Cognitive Impairment.

    Science.gov (United States)

    Alahmadi, Hanin H; Shen, Yuan; Fouad, Shereen; Luft, Caroline Di B; Bentham, Peter; Kourtzi, Zoe; Tino, Peter

    2016-01-01

    Early diagnosis of dementia is critical for assessing disease progression and potential treatment. State-of-the-art machine learning techniques have been increasingly employed to take on this diagnostic task. In this study, we employed Generalized Matrix Learning Vector Quantization (GMLVQ) classifiers to discriminate patients with Mild Cognitive Impairment (MCI) from healthy controls based on their cognitive skills. Further, we adopted a "Learning with privileged information" approach to combine cognitive and fMRI data for the classification task. The resulting classifier operates solely on the cognitive data while it incorporates the fMRI data as privileged information (PI) during training. This novel classifier is of practical use as the collection of brain imaging data is not always possible with patients and older participants. MCI patients and healthy age-matched controls were trained to extract structure from temporal sequences. We ask whether machine learning classifiers can be used to discriminate patients from controls and whether differences between these groups relate to individual cognitive profiles. To this end, we tested participants in four cognitive tasks: working memory, cognitive inhibition, divided attention, and selective attention. We also collected fMRI data before and after training on a probabilistic sequence learning task and extracted fMRI responses and connectivity as features for machine learning classifiers. Our results show that the PI-guided GMLVQ classifiers outperform the baseline classifier that only used the cognitive data. In addition, we found that for the baseline classifier, divided attention is the only relevant cognitive feature. When PI was incorporated, divided attention remained the most relevant feature while cognitive inhibition also became relevant for the task. Interestingly, this analysis for the fMRI GMLVQ classifier suggests that (1) when overall fMRI signal is used as inputs to the classifier, the post

  18. Optimized Radial Basis Function Classifier for Multi Modal Biometrics

    Directory of Open Access Journals (Sweden)

    Anand Viswanathan

    2014-07-01

    Biometric systems can be used for the identification or verification of humans based on their physiological or behavioral features. In these systems, biometric characteristics such as fingerprints, palm-prints, iris or speech are recorded and compared with stored samples for identification or verification. Multimodal biometric systems are more accurate and more resistant to spoof attacks than single-modal biometric systems. In this study, a multimodal biometric system using fingerprint images and finger-vein patterns is proposed, together with an optimized Radial Basis Function (RBF) kernel classifier to identify the authorized users. The features extracted from these modalities are selected by PCA and kernel PCA and combined for classification by the RBF classifier. The parameters of the RBF classifier are optimized using the BAT algorithm with local search. The performance of the proposed classifier is compared with the KNN classifier, the Naïve Bayesian classifier and the non-optimized RBF classifier.

  19. MISR Level 2 TOA/Cloud Classifier parameters V003

    Data.gov (United States)

    National Aeronautics and Space Administration — This is the Level 2 TOA/Cloud Classifiers Product. It contains the Angular Signature Cloud Mask (ASCM), Regional Cloud Classifiers, Cloud Shadow Mask, and...

  20. Bayesian technique for image classifying registration.

    Science.gov (United States)

    Hachama, Mohamed; Desolneux, Agnès; Richard, Frédéric J P

    2012-09-01

    In this paper, we address a complex image registration issue arising while the dependencies between intensities of images to be registered are not spatially homogeneous. Such a situation is frequently encountered in medical imaging when a pathology present in one of the images modifies locally intensity dependencies observed on normal tissues. Usual image registration models, which are based on a single global intensity similarity criterion, fail to register such images, as they are blind to local deviations of intensity dependencies. Such a limitation is also encountered in contrast-enhanced images where there exist multiple pixel classes having different properties of contrast agent absorption. In this paper, we propose a new model in which the similarity criterion is adapted locally to images by classification of image intensity dependencies. Defined in a Bayesian framework, the similarity criterion is a mixture of probability distributions describing dependencies on two classes. The model also includes a class map which locates pixels of the two classes and weighs the two mixture components. The registration problem is formulated both as an energy minimization problem and as a maximum a posteriori estimation problem. It is solved using a gradient descent algorithm. In the problem formulation and resolution, the image deformation and the class map are estimated simultaneously, leading to an original combination of registration and classification that we call image classifying registration. Whenever sufficient information about class location is available in applications, the registration can also be performed on its own by fixing a given class map. Finally, we illustrate the interest of our model on two real applications from medical imaging: template-based segmentation of contrast-enhanced images and lesion detection in mammograms. 
    We also conduct an evaluation of our model on simulated medical data and show its ability to take into account spatial variations

  1. Method of generating features optimal to a dataset and classifier

    Energy Technology Data Exchange (ETDEWEB)

    Bruillard, Paul J.; Gosink, Luke J.; Jarman, Kenneth D.

    2016-10-18

    A method of generating features optimal to a particular dataset and classifier is disclosed. A dataset of messages is inputted and a classifier is selected. An algebra of features is encoded. Computable features that are capable of describing the dataset from the algebra of features are selected. Irredundant features that are optimal for the classifier and the dataset are selected.

  2. Non-destructive Techniques for Classifying Aircraft Coating Degradation

    Science.gov (United States)

    2015-03-26

    applied to spectral data relevant in this project, identifies the spectral dimensions containing information pertinent to classifying degradation...mathematically distinct potential spectral responses. These test spectra are difficult to distinguish and classify in original feature space. As an example...neighboring spectral channels with similar degradation information will each be ranked similarly for selection. During classification, the inclusion of

  3. Recognition of pornographic web pages by classifying texts and images.

    Science.gov (United States)

    Hu, Weiming; Wu, Ou; Chen, Zhouyao; Fu, Zhouyu; Maybank, Steve

    2007-06-01

    With the rapid development of the World Wide Web, people benefit more and more from the sharing of information. However, Web pages with obscene, harmful, or illegal content can be easily accessed. It is important to recognize such unsuitable, offensive, or pornographic Web pages. In this paper, a novel framework for recognizing pornographic Web pages is described. A C4.5 decision tree is used to divide Web pages, according to content representations, into continuous text pages, discrete text pages, and image pages. These three categories of Web pages are handled, respectively, by a continuous text classifier, a discrete text classifier, and an algorithm that fuses the results from the image classifier and the discrete text classifier. In the continuous text classifier, statistical and semantic features are used to recognize pornographic texts. In the discrete text classifier, the naive Bayes rule is used to calculate the probability that a discrete text is pornographic. In the image classifier, the object's contour-based features are extracted to recognize pornographic images. In the text and image fusion algorithm, the Bayes theory is used to combine the recognition results from images and texts. Experimental results demonstrate that the continuous text classifier outperforms the traditional keyword-statistics-based classifier, the contour-based image classifier outperforms the traditional skin-region-based image classifier, the results obtained by our fusion algorithm outperform those by either of the individual classifiers, and our framework can be adapted to different categories of Web pages.
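
    The abstract does not spell out how Bayes theory fuses the text and image results. As an illustration only, the sketch below shows one common way to combine two posterior probabilities, assuming the text and image evidence are conditionally independent given the class. The function name `fuse_posteriors`, the `prior` parameter, and the example values are hypothetical and are not taken from the paper.

```python
def fuse_posteriors(p_text, p_image, prior=0.5):
    """Combine two posteriors P(class | text) and P(class | image).

    Assumes conditional independence of the two evidence sources given the
    class, so the posterior odds multiply and the prior odds are divided
    out once. Inputs must lie strictly between 0 and 1.
    """
    def odds(p):
        return p / (1.0 - p)

    fused_odds = odds(p_text) * odds(p_image) / odds(prior)
    return fused_odds / (1.0 + fused_odds)


# Hypothetical scores: both classifiers lean towards the positive class.
fused = fuse_posteriors(0.8, 0.8)
print(round(fused, 3))
```

    With a neutral prior of 0.5, two agreeing sources reinforce each other, while a source that returns exactly 0.5 leaves the other source's probability unchanged.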

  4. Counting, Measuring And The Semantics Of Classifiers

    Directory of Open Access Journals (Sweden)

    Susan Rothstein

    2010-12-01

    This paper makes two central claims. The first is that there is an intimate and non-trivial relation between the mass/count distinction on the one hand and the measure/individuation distinction on the other: a (if not the) defining property of mass nouns is that they denote sets of entities which can be measured, while count nouns denote sets of entities which can be counted. Crucially, this is a difference in grammatical perspective and not in ontological status. The second claim is that the mass/count distinction between two types of nominals has its direct correlate at the level of classifier phrases: classifier phrases like two bottles of wine are ambiguous between a counting, or individuating, reading and a measure reading. On the counting reading, this phrase has count semantics; on the measure reading it has mass semantics.

  5. Development of a combined GIS, neural network and Bayesian classifier methodology for classifying remotely sensed data

    Science.gov (United States)

    Schneider, Claudio Albert

    This research is aimed at the solution of two common but still largely unsolved problems in the classification of remotely sensed data: (1) Classification accuracy of remotely sensed data decreases significantly in mountainous terrain, where topography strongly influences the spectral response of the features on the ground; and (2) when attempting to obtain more detailed classifications, e.g. forest cover types or species, rather than just broad categories of forest such as coniferous or deciduous, the accuracy of the classification generally decreases significantly. The main objective of the study was to develop a widely applicable and efficient classification procedure for mapping forest and other cover types in mountainous terrain, using an integrated GIS/neural network/Bayesian classification approach. The performance of this new technique was compared to a standard supervised Maximum Likelihood classification technique, a "conventional" Bayesian/Maximum Likelihood classification, and to a "conventional" neural network classifier. Results indicate a considerable improvement of the new technique over the standard Maximum Likelihood classification technique, as well as a better accuracy than the "conventional" Bayesian/Maximum Likelihood classifier (13.08 percent improvement in overall accuracy), but the "conventional" neural network classifiers outperformed all the techniques compared in this study, with an overall accuracy improvement of 15.94 percent as compared to the standard Maximum Likelihood classifier (from 46.77 percent to 62.71 percent). However, the overall accuracies of all the classification techniques compared in this study were relatively low. It is believed that this was caused by problems related to the inadequacy of the reference data. On the other hand, the results also indicate the need to develop a different sampling design to more effectively cover the variability across all the parameters needed by the neural network classification technique

  6. A multiple classifier system for early melanoma diagnosis.

    Science.gov (United States)

    Sboner, Andrea; Eccher, Claudio; Blanzieri, Enrico; Bauer, Paolo; Cristofolini, Mario; Zumiani, Giuseppe; Forti, Stefano

    2003-01-01

    Melanoma is the most dangerous skin cancer and early diagnosis is the key factor in its successful treatment. Well-trained dermatologists reach a diagnosis via visual inspection, and reach sensitivity and specificity levels of about 80%. Several computerised diagnostic systems were reported in the literature using different classification algorithms. In this paper, we will illustrate a novel approach by which a suitable combination of different classifiers is used in order to improve the diagnostic performances of single classifiers. We used three different kinds of classifiers, namely linear discriminant analysis (LDA), k-nearest neighbour (k-NN) and a decision tree, the inputs of which are 38 geometric and colorimetric features automatically extracted from digital images of skin lesions. Multiple classifiers were generated by combining the diagnostic outputs of single classifiers with appropriate voting schemata. This approach was evaluated on a set of 152 digital skin images. We compared the performances of multiple classifiers (2- and 3-classifier groups) between them and with respect to single ones (1-classifier group). We further compared the classifiers' performances with those of eight dermatologists. Classifiers' performances were measured in terms of distance from the ideal classifier. Compared with 1- and 2-classifier groups, performances of 3-classifier systems were significantly higher (P...classifier groups (P=0.352). While the dermatologists group showed a level of performances significantly higher than the 1-classifier systems (P...classifier groups and the dermatologists groups, indicating comparable performances. This work suggests that a suitable combination of different kinds of classifiers can improve the performances of an automatic diagnostic system.
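
    The abstract does not say which voting schemata were used. As a minimal sketch, assuming plain majority voting over three classifiers, the combination step could look like the following. The function name `majority_vote`, the tie-breaking rule, and the example labels are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by most classifiers.

    Ties are broken in favour of the label that appears first in
    `predictions` (an arbitrary choice made for this sketch).
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


# Hypothetical per-lesion outputs of three classifiers
# (standing in for LDA, k-NN and a decision tree).
lda = ["melanoma", "benign", "benign", "melanoma"]
knn = ["melanoma", "melanoma", "benign", "benign"]
tree = ["benign", "melanoma", "benign", "melanoma"]

combined = [majority_vote(votes) for votes in zip(lda, knn, tree)]
print(combined)
```

    With an odd number of two-class voters no ties occur, which is one practical reason to prefer 3-classifier groups over 2-classifier groups in such a scheme.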

  7. Combining MLC and SVM Classifiers for Learning Based Decision Making: Analysis and Evaluations.

    Science.gov (United States)

    Zhang, Yi; Ren, Jinchang; Jiang, Jianmin

    2015-01-01

    Maximum likelihood classifier (MLC) and support vector machines (SVM) are two commonly used approaches in machine learning. MLC is based on Bayesian theory in estimating parameters of a probabilistic model, whilst SVM is an optimization based nonparametric method in this context. Recently, it is found that SVM in some cases is equivalent to MLC in probabilistically modeling the learning process. In this paper, MLC and SVM are combined in learning and classification, which helps to yield probabilistic output for SVM and facilitate soft decision making. In total four groups of data are used for evaluations, covering sonar, vehicle, breast cancer, and DNA sequences. The data samples are characterized in terms of Gaussian/non-Gaussian distributed and balanced/unbalanced samples which are then further used for performance assessment in comparing the SVM and the combined SVM-MLC classifier. Interesting results are reported to indicate how the combined classifier may work under various conditions.

  8. Combining MLC and SVM Classifiers for Learning Based Decision Making: Analysis and Evaluations

    Directory of Open Access Journals (Sweden)

    Yi Zhang

    2015-01-01

    Maximum likelihood classifier (MLC) and support vector machines (SVM) are two commonly used approaches in machine learning. MLC is based on Bayesian theory in estimating parameters of a probabilistic model, whilst SVM is an optimization based nonparametric method in this context. Recently, it is found that SVM in some cases is equivalent to MLC in probabilistically modeling the learning process. In this paper, MLC and SVM are combined in learning and classification, which helps to yield probabilistic output for SVM and facilitate soft decision making. In total four groups of data are used for evaluations, covering sonar, vehicle, breast cancer, and DNA sequences. The data samples are characterized in terms of Gaussian/non-Gaussian distributed and balanced/unbalanced samples which are then further used for performance assessment in comparing the SVM and the combined SVM-MLC classifier. Interesting results are reported to indicate how the combined classifier may work under various conditions.

  9. [Optimizing algorithm design of piecewise linear classifier for spectra].

    Science.gov (United States)

    Lan, Tian-Ge; Fang, Yong-Hua; Xiong, Wei; Kong, Chao; Li, Da-Cheng; Dong, Da-Ming

    2008-11-01

    Being able to identify pollutant gases quickly and accurately is a basic requirement that spectroscopic techniques for environmental monitoring place on a spectral classifier. A piecewise linear classifier is simple, needs little computational time, and approximates a nonlinear boundary well. By combining the piecewise linear classifier with the linear support vector machine, which is based on the principle of maximizing the margin, an optimizing algorithm for a single-side piecewise linear classifier was devised. Experimental results indicate that the piecewise linear classifier trained by the optimizing algorithm proposed in this paper can approximate a nonlinear boundary with fewer hyperplanes and has higher accuracy for classification and recognition.

  10. Automatic sequences

    CERN Document Server

    Haeseler, Friedrich

    2003-01-01

    Automatic sequences are sequences which are produced by a finite automaton. Although they are not random, they may look random. They are complicated in the sense of not being ultimately periodic, and they may also look rather complicated in the sense that it may not be easy to name the rule by which the sequence is generated; however, there exists a rule which generates the sequence. The concept of automatic sequences has special applications in algebra, number theory, finite automata and formal languages, and combinatorics on words. The text deals with different aspects of automatic sequences, in particular: a general introduction to automatic sequences; the basic (combinatorial) properties of automatic sequences; the algebraic approach to automatic sequences; and geometric objects related to automatic sequences.
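
    A standard example of an automatic sequence, chosen here for illustration and not mentioned in the record itself, is the Thue-Morse sequence. A two-state automaton reads the binary digits of n and flips its state on every 1; the final state is the n-th term. The function name `thue_morse` is hypothetical.

```python
def thue_morse(n):
    """n-th Thue-Morse term, computed by a 2-state finite automaton.

    The automaton starts in state 0, reads the base-2 digits of n, and
    switches state whenever it reads a 1. The output is the final state,
    i.e. the parity of the number of 1 bits in n.
    """
    state = 0
    for digit in bin(n)[2:]:
        if digit == "1":
            state = 1 - state
    return state


first_terms = [thue_morse(n) for n in range(16)]
print(first_terms)
```

    The sequence is generated by a very simple rule, yet it is not ultimately periodic, which matches the description above of sequences that look complicated while being fully deterministic.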

  11. A new approach to classifier fusion based on upper integral.

    Science.gov (United States)

    Wang, Xi-Zhao; Wang, Ran; Feng, Hui-Min; Wang, Hua-Chao

    2014-05-01

    Fusing a number of classifiers can generally improve the performance of individual classifiers, and the fuzzy integral, which can clearly express the interaction among the individual classifiers, has been acknowledged as an effective tool of fusion. In order to make the best use of the individual classifiers and their combinations, we propose in this paper a new scheme of classifier fusion based on upper integrals, which differs from all the existing models. Instead of being a fusion operator, the upper integral is used to reasonably arrange the finite resources, and thus to maximize the classification efficiency. By solving an optimization problem of upper integrals, we obtain a scheme for assigning proportions of examples to different individual classifiers and their combinations. According to these proportions, new examples could be classified by different individual classifiers and their combinations, and the combination of classifiers that specific examples should be submitted to depends on their performance. The definition of upper integral guarantees such a conclusion that the classification efficiency of the fused classifier is not less than that of any individual classifier theoretically. Furthermore, numerical simulations demonstrate that most existing fusion methodologies, such as bagging and boosting, can be improved by our upper integral model.

  12. Image Classifying Registration and Dynamic Region Merging

    Directory of Open Access Journals (Sweden)

    Himadri Nath Moulick

    2013-07-01

    In this paper, we address a complex image registration issue arising when the dependencies between intensities of images to be registered are not spatially homogeneous. Such a situation is frequently encountered in medical imaging when a pathology present in one of the images modifies locally intensity dependencies observed on normal tissues. Usual image registration models, which are based on a single global intensity similarity criterion, fail to register such images, as they are blind to local deviations of intensity dependencies. Such a limitation is also encountered in contrast-enhanced images where there exist multiple pixel classes having different properties of contrast agent absorption. In this paper, we propose a new model in which the similarity criterion is adapted locally to images by classification of image intensity dependencies. Defined in a Bayesian framework, the similarity criterion is a mixture of probability distributions describing dependencies on two classes. The model also includes a class map which locates pixels of the two classes and weights the two mixture components. The registration problem is formulated both as an energy minimization problem and as a Maximum A Posteriori (MAP) estimation problem. It is solved using a gradient descent algorithm. In the problem formulation and resolution, the image deformation and the class map are estimated at the same time, leading to an original combination of registration and classification that we call image classifying registration. Whenever sufficient information about class location is available in applications, the registration can also be performed on its own by fixing a given class map. Finally, we illustrate the interest of our model on two real applications from medical imaging: template-based segmentation of contrast-enhanced images and lesion detection in mammograms. We also conduct an evaluation of our model on simulated medical data and show its ability to take into

  13. Localization and Recognition of Dynamic Hand Gestures Based on Hierarchy of Manifold Classifiers

    Science.gov (United States)

    Favorskaya, M.; Nosov, A.; Popov, A.

    2015-05-01

    Generally, dynamic hand gestures are captured in continuous video sequences, and a gesture recognition system ought to extract robust features automatically. This task involves the highly challenging spatio-temporal variations of dynamic hand gestures. The proposed method is based on two-level manifold classifiers, including the trajectory classifiers at any time instants and the posture classifiers of sub-gestures at selected time instants. The trajectory classifiers contain a skin detector, a normalized skeleton representation of one or two hands, and a motion history represented by motion vectors normalized through predetermined directions (8 and 16 in our case). Each dynamic gesture is separated into a set of sub-gestures in order to predict a trajectory and remove those gesture samples which do not satisfy the current trajectory. The posture classifiers involve the normalized skeleton representation of palm and fingers and relative finger positions using fingertips. The min-max criterion is used for trajectory recognition, and the decision tree technique was applied for posture recognition of sub-gestures. For experiments, a dataset "Multi-modal Gesture Recognition Challenge 2013: Dataset and Results" including 393 dynamic hand-gestures was chosen. The proposed method yielded 84-91% recognition accuracy, on average, for a restricted set of dynamic gestures.

  14. LOCALIZATION AND RECOGNITION OF DYNAMIC HAND GESTURES BASED ON HIERARCHY OF MANIFOLD CLASSIFIERS

    Directory of Open Access Journals (Sweden)

    M. Favorskaya

    2015-05-01

    Generally, dynamic hand gestures are captured in continuous video sequences, and a gesture recognition system ought to extract robust features automatically. This task involves the highly challenging spatio-temporal variations of dynamic hand gestures. The proposed method is based on two-level manifold classifiers, including the trajectory classifiers at any time instants and the posture classifiers of sub-gestures at selected time instants. The trajectory classifiers contain a skin detector, a normalized skeleton representation of one or two hands, and a motion history represented by motion vectors normalized through predetermined directions (8 and 16 in our case). Each dynamic gesture is separated into a set of sub-gestures in order to predict a trajectory and remove those gesture samples which do not satisfy the current trajectory. The posture classifiers involve the normalized skeleton representation of palm and fingers and relative finger positions using fingertips. The min-max criterion is used for trajectory recognition, and the decision tree technique was applied for posture recognition of sub-gestures. For experiments, a dataset “Multi-modal Gesture Recognition Challenge 2013: Dataset and Results” including 393 dynamic hand-gestures was chosen. The proposed method yielded 84–91% recognition accuracy, on average, for a restricted set of dynamic gestures.

  15. Rule Based Ensembles Using Pair Wise Neural Network Classifiers

    Directory of Open Access Journals (Sweden)

    Moslem Mohammadi Jenghara

    2015-03-01

    In value estimation, the average of inexperienced people's estimates is a good approximation to the true value, provided that the individuals' answers are independent. A classifier ensemble implements this principle in classification tasks, which have been investigated from two aspects. In the first aspect, the feature space is divided into several local regions and each region is assigned a highly competent classifier; in the second, the base classifiers are applied in parallel and are equally experienced in some way, so as to achieve a group consensus. In this paper a combination of the two methods is used. An important consideration in classifier combination is that much better results can be achieved if diverse classifiers, rather than similar classifiers, are combined. To achieve diversity in the classifiers' outputs, a symmetric pairwise weighted feature space is used, and the outputs of the classifiers trained over the weighted feature space are combined to infer the final result. In this paper MLP classifiers are used as the base classifiers. The experimental results show that the applied method is promising.
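
    The opening claim about averaging independent estimates can be made precise with a short worked equation. This is a textbook result rather than something stated in the abstract, and it assumes that the n individual estimates x_i are unbiased, mutually independent, and share a common variance σ² (the symbols are introduced here for illustration only).

```latex
\hat{x} = \frac{1}{n}\sum_{i=1}^{n} x_i ,
\qquad
\mathbb{E}[\hat{x}] = \mu ,
\qquad
\operatorname{Var}(\hat{x})
  = \frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(x_i)
  = \frac{\sigma^{2}}{n}
```

    The variance of the average shrinks as 1/n only when the cross terms vanish, which is why independence of the individual answers, and diversity of the combined classifiers, matters.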

  16. Classifying cognitive profiles using machine learning with privileged information in Mild Cognitive Impairment

    Directory of Open Access Journals (Sweden)

    Hanin Hamdan Alahmadi

    2016-11-01

    Early diagnosis of dementia is critical for assessing disease progression and potential treatment. State-of-the-art machine learning techniques have been increasingly employed to take on this diagnostic task. In this study, we employed Generalised Matrix Learning Vector Quantization (GMLVQ) classifiers to discriminate patients with Mild Cognitive Impairment (MCI) from healthy controls based on their cognitive skills. Further, we adopted a "Learning with privileged information" approach to combine cognitive and fMRI data for the classification task. The resulting classifier operates solely on the cognitive data while it incorporates the fMRI data as privileged information (PI) during training. This novel classifier is of practical use as the collection of brain imaging data is not always possible with patients and older participants. MCI patients and healthy age-matched controls were trained to extract structure from temporal sequences. We ask whether machine learning classifiers can be used to discriminate patients from controls based on the learning performance and whether differences between these groups relate to individual cognitive profiles. To this end, we tested participants in four cognitive tasks: working memory, cognitive inhibition, divided attention, and selective attention. We also collected fMRI data before and after training on the learning task and extracted fMRI responses and connectivity as features for machine learning classifiers. Our results show that the PI-guided GMLVQ classifiers outperform the baseline classifier that only used the cognitive data. In addition, we found that for the baseline classifier, divided attention is the only relevant cognitive feature. When PI was incorporated, divided attention remained the most relevant feature while cognitive inhibition also became relevant for the task. Interestingly, this analysis for the fMRI GMLVQ classifier suggests that (1) when overall fMRI signal for structured stimuli is

  17. Predicting eukaryotic protein subcellular location by fusing optimized evidence-theoretic K-Nearest Neighbor classifiers.

    Science.gov (United States)

    Chou, Kuo-Chen; Shen, Hong-Bin

    2006-08-01

    Facing the explosion of newly generated protein sequences in the post-genomic era, we are challenged to develop an automated method for quickly and reliably annotating their subcellular locations. Knowledge of subcellular locations of proteins can provide useful hints for revealing their functions and understanding how they interact with each other in cellular networking. Unfortunately, it is both expensive and time-consuming to determine the localization of an uncharacterized protein in a living cell purely based on experiments. To tackle the challenge, a novel hybridization classifier was developed by fusing many basic individual classifiers through a voting system. The "engine" of these basic classifiers was operated by the OET-KNN (Optimized Evidence-Theoretic K-Nearest Neighbor) rule. As a demonstration, predictions were performed with the fusion classifier for proteins among the following 16 localizations: (1) cell wall, (2) centriole, (3) chloroplast, (4) cyanelle, (5) cytoplasm, (6) cytoskeleton, (7) endoplasmic reticulum, (8) extracell, (9) Golgi apparatus, (10) lysosome, (11) mitochondria, (12) nucleus, (13) peroxisome, (14) plasma membrane, (15) plastid, and (16) vacuole. To get rid of redundancy and homology bias, none of the proteins investigated here had ≥25% sequence identity to any other in the same subcellular location. The overall success rates thus obtained via the jack-knife cross-validation test and independent dataset test were 81.6% and 83.7%, respectively, which were 46-63% higher than those performed by the other existing methods on the same benchmark datasets. Also, it is clearly elucidated that the overwhelmingly high success rates obtained by the fusion classifier are by no means a trivial utilization of the GO annotations as prone to be misinterpreted because there is a huge number of proteins with given accession numbers and the corresponding GO numbers, but their subcellular locations are still unknown, and that the

  18. Stochastic margin-based structure learning of Bayesian network classifiers.

    Science.gov (United States)

    Pernkopf, Franz; Wohlmayr, Michael

    2013-02-01

    The margin criterion for parameter learning in graphical models has gained significant impact in recent years. We use the maximum margin score for discriminatively optimizing the structure of Bayesian network classifiers. Furthermore, greedy hill-climbing and simulated annealing search heuristics are applied to determine the classifier structures. In the experiments, we demonstrate the advantages of maximum margin optimized Bayesian network structures in terms of classification performance compared to traditionally used discriminative structure learning methods. Stochastic simulated annealing requires fewer score evaluations than greedy heuristics. Additionally, we compare generative and discriminative parameter learning on both generatively and discriminatively structured Bayesian network classifiers. Margin-optimized Bayesian network classifiers achieve classification performance similar to that of support vector machines. Moreover, missing feature values during classification can be handled by discriminatively optimized Bayesian network classifiers, a case where purely discriminative classifiers usually require mechanisms to complete unknown feature values in the data first.

  19. Bayesian classifiers applied to the Tennessee Eastman process.

    Science.gov (United States)

    Dos Santos, Edimilson Batista; Ebecken, Nelson F F; Hruschka, Estevam R; Elkamel, Ali; Madhuranthakam, Chandra M R

    2014-03-01

    Fault diagnosis includes the main task of classification. Bayesian networks (BNs) present several advantages in the classification task, and previous works have suggested their use as classifiers. Because a classifier is often only one part of a larger decision process, this article proposes, for industrial process diagnosis, the use of a Bayesian method called dynamic Markov blanket classifier that has as its main goal the induction of accurate Bayesian classifiers having dependable probability estimates and revealing actual relationships among the most relevant variables. In addition, a new method, named variable ordering multiple offspring sampling capable of inducing a BN to be used as a classifier, is presented. The performance of these methods is assessed on the data of a benchmark problem known as the Tennessee Eastman process. The obtained results are compared with naive Bayes and tree augmented network classifiers, and confirm that both proposed algorithms can provide good classification accuracies as well as knowledge about relevant variables.

  20. Investigating The Fusion of Classifiers Designed Under Different Bayes Errors

    Directory of Open Access Journals (Sweden)

    Fuad M. Alkoot

    2004-12-01

    Full Text Available We investigate a number of parameters commonly affecting the design of a multiple classifier system in order to find when fusing is most beneficial. We extend our previous investigation to the case where unequal classifiers are combined. Results indicate that Sum is not affected by this parameter; however, Vote degrades when a weaker classifier is introduced in the combining system. This is more obvious when estimation error with uniform distribution exists.
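    The difference between the Sum and Vote combiners mentioned above can be illustrated with a minimal sketch. The posterior values are hypothetical and chosen only for illustration; the function names `sum_rule` and `vote_rule` are assumptions and do not come from the paper.

```python
from collections import Counter

def sum_rule(posteriors):
    """Pick the class with the largest summed posterior across classifiers."""
    n_classes = len(posteriors[0])
    totals = [sum(p[c] for p in posteriors) for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: totals[c])

def vote_rule(posteriors):
    """Each classifier votes for its top class; the most-voted class wins."""
    votes = [max(range(len(p)), key=lambda c: p[c]) for p in posteriors]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical posterior estimates from three classifiers for one sample.
# The first classifier is confident in class 0; the other two lean weakly to class 1.
posteriors = [
    [0.90, 0.10],
    [0.45, 0.55],
    [0.40, 0.60],
]

sum_decision = sum_rule(posteriors)    # uses the confidence of each classifier
vote_decision = vote_rule(posteriors)  # uses only each classifier's top choice
```

    In this toy case the two rules disagree, because Vote discards how confident each classifier is. This only shows the mechanics of the two combiners; it does not reproduce the experiments or conclusions of the study.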

  1. Recognition of Arabic Sign Language Alphabet Using Polynomial Classifiers

    OpenAIRE

    2005-01-01

    Building an accurate automatic sign language recognition system is of great importance in facilitating efficient communication with deaf people. In this paper, we propose the use of polynomial classifiers as a classification engine for the recognition of Arabic sign language (ArSL) alphabet. Polynomial classifiers have several advantages over other classifiers in that they do not require iterative training, and that they are highly computationally scalable with the number of classes. Based on...

  2. Diagnosis of Broiler Livers by Classifying Image Patches

    DEFF Research Database (Denmark)

    Jørgensen, Anders; Moeslund, Thomas B.; Fagertun, Jens

    2017-01-01

    Manual health inspections are becoming the bottleneck at poultry processing plants. We present a computer vision method for automatic diagnosis of broiler livers. The non-rigid livers, of varying shapes and sizes, are classified in patches by a convolutional neural network, outputting maps...... with probabilities of the three most common diseases. A Random Forest classifier combines the maps into a single diagnosis. The method classifies 77.6% of livers correctly in a problem that is far from trivial....

  3. Sequence assembly

    DEFF Research Database (Denmark)

    Scheibye-Alsing, Karsten; Hoffmann, S.; Frankel, Annett Maria

    2009-01-01

    Despite the rapidly increasing number of sequenced and re-sequenced genomes, many issues regarding the computational assembly of large-scale sequencing data remain unresolved. Computational assembly is crucial in large genome projects as well as for the evolving high-throughput technologies...

  4. Application of Data Mining in Protein Sequence Classification

    Directory of Open Access Journals (Sweden)

    Suprativ Saha

    2012-11-01

    Full Text Available Protein sequence classification involves feature selection for accurate classification. Popular protein sequence classification techniques involve extraction of specific features from the sequences. Researchers apply well-known classification techniques such as neural networks, genetic algorithms, Fuzzy ARTMAP, and Rough Set Classifier for accurate classification. This paper presents a review of three different classification models: the neural network model, the fuzzy ARTMAP model and the Rough set classifier model. This is followed by a new technique for classifying protein sequences. The proposed model is implemented with a custom-designed tool and tries to reduce the computational overheads encountered by earlier approaches and increase the accuracy of classification.

  5. Optimal classifier for imbalanced data using Matthews Correlation Coefficient metric.

    Science.gov (United States)

    Boughorbel, Sabri; Jarray, Fethi; El-Anbari, Mohammed

    2017-01-01

    Data imbalance is frequently encountered in biomedical applications. Resampling techniques can be used in binary classification to tackle this issue. However, such solutions are not desired when the number of samples in the small class is limited. Moreover, the use of inadequate performance metrics, such as accuracy, leads to poor generalization results because the classifiers tend to predict the largest class. One good approach to deal with this issue is to optimize performance metrics that are designed to handle data imbalance. Matthews Correlation Coefficient (MCC) is widely used in Bioinformatics as a performance metric. We are interested in developing a new classifier based on the MCC metric to handle imbalanced data. We derive an optimal Bayes classifier for the MCC metric using an approach based on the Frechet derivative. We show that the proposed algorithm has the nice theoretical property of consistency. Using simulated data, we verify the correctness of our optimality result by searching in the space of all possible binary classifiers. The proposed classifier is evaluated on 64 datasets with a wide range of data imbalance. We compare both classification performance and CPU efficiency for three classifiers: 1) the proposed algorithm (MCC-classifier), 2) the Bayes classifier with a default threshold (MCC-base) and 3) imbalanced SVM (SVM-imba). The experimental evaluation shows that MCC-classifier has a close performance to SVM-imba while being simpler and more efficient.
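    To make the contrast between accuracy and MCC concrete, the following minimal sketch computes both from the four confusion-matrix counts using the standard MCC definition. The counts are hypothetical, and returning 0.0 when the denominator vanishes is a common convention assumed here, not something stated in the abstract.

```python
import math

def accuracy(tp, fp, tn, fn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)

def mcc(tp, fp, tn, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: report 0.0 when a row or column of the matrix is empty
        return 0.0
    return (tp * tn - fp * fn) / denom

# Hypothetical imbalanced test set: 5 positives, 95 negatives.
# A classifier that always predicts "negative" gets every positive wrong.
always_negative = dict(tp=0, fp=0, tn=95, fn=5)
acc_trivial = accuracy(**always_negative)  # high, despite missing every positive
mcc_trivial = mcc(**always_negative)       # 0.0 under the assumed convention
```

    The trivial majority-class predictor scores well on accuracy but not on MCC, which is the motivation the abstract gives for optimizing MCC directly. This sketch only evaluates the metric; it is not the optimal classifier derived in the paper.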

  6. Construction of unsupervised sentiment classifier on idioms resources

    Institute of Scientific and Technical Information of China (English)

    谢松县; 王挺

    2014-01-01

    Sentiment analysis is the computational study of how opinions, attitudes, emotions, and perspectives are expressed in language, and has been an important task of natural language processing. Sentiment analysis is highly valuable for both research and practical applications. The focus was on the difficulties in constructing sentiment classifiers, which normally need large amounts of labeled domain training data, and a novel unsupervised framework was proposed to make use of Chinese idiom resources to develop a general sentiment classifier. Furthermore, the domain adaptation of the general sentiment classifier was improved by taking the general classifier as the base of a self-training procedure to get a domain self-training sentiment classifier. To validate the effect of the unsupervised framework, several experiments were carried out on a publicly available Chinese online reviews dataset. The experiments show that the proposed framework is effective and achieves encouraging results. Specifically, the general classifier outperforms two baselines (a Naïve 50% baseline and a cross-domain classifier), and the bootstrapping self-training classifier approximates the upper bound domain-specific classifier with the lowest accuracy of 81.5%, but the performance is more stable and the framework needs no labeled training dataset.

  7. 6 CFR 7.23 - Emergency release of classified information.

    Science.gov (United States)

    2010-01-01

    ... Classified Information Non-disclosure Form. In emergency situations requiring immediate verbal release of... information through approved communication channels by the most secure and expeditious method possible, or...

  8. Multi-input distributed classifiers for synthetic genetic circuits.

    Directory of Open Access Journals (Sweden)

    Oleg Kanakov

    Full Text Available For practical construction of complex synthetic genetic networks able to perform elaborate functions it is important to have a pool of relatively simple modules with different functionality which can be compounded together. To complement engineering of very different existing synthetic genetic devices such as switches, oscillators or logical gates, we propose and develop here a design of synthetic multi-input classifier based on a recently introduced distributed classifier concept. A heterogeneous population of cells acts as a single classifier, whose output is obtained by summarizing the outputs of individual cells. The learning ability is achieved by pruning the population, instead of tuning parameters of an individual cell. The present paper is focused on evaluating two possible schemes of multi-input gene classifier circuits. We demonstrate their suitability for implementing a multi-input distributed classifier capable of separating data which are inseparable for single-input classifiers, and characterize performance of the classifiers by analytical and numerical results. The simpler scheme implements a linear classifier in a single cell and is targeted at separable classification problems with simple class borders. A hard learning strategy is used to train a distributed classifier by removing from the population any cell answering incorrectly to at least one training example. The other scheme implements a circuit with a bell-shaped response in a single cell to allow potentially arbitrary shape of the classification border in the input space of a distributed classifier. Inseparable classification problems are addressed using soft learning strategy, characterized by probabilistic decision to keep or discard a cell at each training iteration. We expect that our classifier design contributes to the development of robust and predictable synthetic biosensors, which have the potential to affect applications in a lot of fields, including that of

  9. Multi-input distributed classifiers for synthetic genetic circuits.

    Science.gov (United States)

    Kanakov, Oleg; Kotelnikov, Roman; Alsaedi, Ahmed; Tsimring, Lev; Huerta, Ramón; Zaikin, Alexey; Ivanchenko, Mikhail

    2015-01-01

    For practical construction of complex synthetic genetic networks able to perform elaborate functions it is important to have a pool of relatively simple modules with different functionality which can be compounded together. To complement engineering of very different existing synthetic genetic devices such as switches, oscillators or logical gates, we propose and develop here a design of synthetic multi-input classifier based on a recently introduced distributed classifier concept. A heterogeneous population of cells acts as a single classifier, whose output is obtained by summarizing the outputs of individual cells. The learning ability is achieved by pruning the population, instead of tuning parameters of an individual cell. The present paper is focused on evaluating two possible schemes of multi-input gene classifier circuits. We demonstrate their suitability for implementing a multi-input distributed classifier capable of separating data which are inseparable for single-input classifiers, and characterize performance of the classifiers by analytical and numerical results. The simpler scheme implements a linear classifier in a single cell and is targeted at separable classification problems with simple class borders. A hard learning strategy is used to train a distributed classifier by removing from the population any cell answering incorrectly to at least one training example. The other scheme implements a circuit with a bell-shaped response in a single cell to allow potentially arbitrary shape of the classification border in the input space of a distributed classifier. Inseparable classification problems are addressed using soft learning strategy, characterized by probabilistic decision to keep or discard a cell at each training iteration. 
We expect that our classifier design contributes to the development of robust and predictable synthetic biosensors, which have the potential to affect applications in a lot of fields, including that of medicine and industry.
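    The hard learning strategy described above (discard every cell that answers incorrectly to at least one training example) can be sketched in software as follows. This is an illustrative abstraction only: each "cell" is modeled as a random linear threshold unit, and the parameter ranges, population size and training examples are assumptions, not values from the paper.

```python
import random

def make_population(n_cells, n_inputs, rng):
    """Create cells as (weights, threshold) pairs with random parameters."""
    return [
        ([rng.uniform(-1, 1) for _ in range(n_inputs)], rng.uniform(-1, 1))
        for _ in range(n_cells)
    ]

def cell_output(cell, x):
    """A single cell acts as a linear classifier: 1 if w.x exceeds its threshold."""
    weights, threshold = cell
    return 1 if sum(w * xi for w, xi in zip(weights, x)) > threshold else 0

def hard_learning(population, examples):
    """Keep only the cells that answer every training example correctly."""
    return [
        cell for cell in population
        if all(cell_output(cell, x) == y for x, y in examples)
    ]

def population_output(population, x):
    """Summarize the population as the fraction of cells that output 1."""
    if not population:
        raise ValueError("population is empty")
    return sum(cell_output(cell, x) for cell in population) / len(population)

rng = random.Random(0)
population = make_population(500, 2, rng)          # heterogeneous population
examples = [((1.0, 1.0), 1), ((0.0, 0.0), 0)]      # hypothetical training set
survivors = hard_learning(population, examples)    # pruning, not parameter tuning
```

    After pruning, `population_output(survivors, x)` is 1.0 or 0.0 on the training inputs by construction, and for unseen inputs it gives the fraction of surviving cells that respond. The soft learning strategy and the bell-shaped response circuit from the abstract are not modeled here.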

  10. Error-free image compression algorithm using classifying-sequencing techniques.

    Science.gov (United States)

    He, J D; Dereniak, E L

    1992-05-10

    The development of a new error-free digital image compression algorithm is discussed. Without the help of any statistical information about the images being processed, this algorithm achieves average bits-per-word ratios near the entropy of the neighboring pixel differences. Because this algorithm does not involve statistical modeling, generation of a code book, or long integer or floating-point arithmetic, it is simpler and, therefore, faster than the studied statistical codes, such as the Huffman code or the arithmetic code.
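    The reference quantity named above, the entropy of the neighboring pixel differences, can be computed with a short sketch. The sample row is a hypothetical smooth ramp of pixel values, and the helper names are assumptions; the classifying-sequencing algorithm itself is not reproduced here.

```python
import math
from collections import Counter

def entropy_bits(symbols):
    """Empirical Shannon entropy of a symbol sequence, in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def neighbor_differences(row):
    """Differences between each pixel and its left neighbor along one row."""
    return [b - a for a, b in zip(row, row[1:])]

# Hypothetical image row: a smooth ramp, so raw values vary but differences do not
row = [100, 101, 102, 103, 104, 105, 106, 107, 108]

raw_entropy = entropy_bits(row)                          # every value is distinct
diff_entropy = entropy_bits(neighbor_differences(row))   # every difference is equal
```

    On smooth image regions the differences concentrate on a few values, so their entropy is much lower than that of the raw pixels. This is why the entropy of neighboring pixel differences is a natural yardstick for an error-free coder of this kind.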

  11. Classifying queries submitted to a vertical search engine

    NARCIS (Netherlands)

    Berendsen, R.; Kovachev, B.; Meij, E.; de Rijke, M.; Weerkamp, W.

    2011-01-01

    We propose and motivate a scheme for classifying queries submitted to a people search engine. We specify a number of features for automatically classifying people queries into the proposed classes and examine the effectiveness of these features. Our main finding is that classification is feasible and that

  12. Quantum classifying spaces and universal quantum characteristic classes

    CERN Document Server

    Durdevic, M

    1996-01-01

    A construction of the noncommutative-geometric counterparts of classical classifying spaces is presented, for general compact matrix quantum structure groups. A quantum analogue of the classical concept of the classifying map is introduced and analyzed. Interrelations with the abstract algebraic theory of quantum characteristic classes are discussed. Various non-equivalent approaches to defining universal characteristic classes are outlined.

  13. 25 CFR 304.3 - Classifying and marking of silver.

    Science.gov (United States)

    2010-04-01

    ... 25 Indians 2 2010-04-01 2010-04-01 false Classifying and marking of silver. 304.3 Section 304.3 Indians INDIAN ARTS AND CRAFTS BOARD, DEPARTMENT OF THE INTERIOR NAVAJO, PUEBLO, AND HOPI SILVER, USE OF GOVERNMENT MARK § 304.3 Classifying and marking of silver. For the present the Indian Arts and Crafts...

  14. 21 CFR 1402.4 - Information classified by another agency.

    Science.gov (United States)

    2010-04-01

    ... 21 Food and Drugs 9 2010-04-01 2010-04-01 false Information classified by another agency. 1402.4 Section 1402.4 Food and Drugs OFFICE OF NATIONAL DRUG CONTROL POLICY MANDATORY DECLASSIFICATION REVIEW § 1402.4 Information classified by another agency. When a request is received for information that...

  15. 40 CFR 152.175 - Pesticides classified for restricted use.

    Science.gov (United States)

    2010-07-01

    ... 40 Protection of Environment 23 2010-07-01 2010-07-01 false Pesticides classified for restricted...) PESTICIDE PROGRAMS PESTICIDE REGISTRATION AND CLASSIFICATION PROCEDURES Classification of Pesticides § 152.175 Pesticides classified for restricted use. The following uses of pesticide products containing the...

  16. Classifying spaces with virtually cyclic stabilizers for linear groups

    DEFF Research Database (Denmark)

    Degrijse, Dieter Dries; Köhl, Ralf; Petrosyan, Nansen

    2015-01-01

    We show that every discrete subgroup of GL(n, ℝ) admits a finite-dimensional classifying space with virtually cyclic stabilizers. Applying our methods to SL(3, ℤ), we obtain a four-dimensional classifying space with virtually cyclic stabilizers and a decomposition of the algebraic K-theory of its...

  17. 16 CFR 1610.4 - Requirements for classifying textiles.

    Science.gov (United States)

    2010-01-01

    ... 16 Commercial Practices 2 2010-01-01 2010-01-01 false Requirements for classifying textiles. 1610... REGULATIONS STANDARD FOR THE FLAMMABILITY OF CLOTHING TEXTILES The Standard § 1610.4 Requirements for classifying textiles. (a) Class 1, Normal Flammability. Class 1 textiles exhibit normal flammability and...

  18. Classifying spaces with virtually cyclic stabilizers for linear groups

    DEFF Research Database (Denmark)

    Degrijse, Dieter Dries; Köhl, Ralf; Petrosyan, Nansen

    2015-01-01

    We show that every discrete subgroup of GL(n, ℝ) admits a finite-dimensional classifying space with virtually cyclic stabilizers. Applying our methods to SL(3, ℤ), we obtain a four-dimensional classifying space with virtually cyclic stabilizers and a decomposition of the algebraic K-theory of its...

  19. Combining contextual and lexical features to classify UMLS concepts.

    Science.gov (United States)

    Fan, Jung-Wei; Friedman, Carol

    2007-10-11

    Semantic classification is important for biomedical terminologies and the many applications that depend on them. Previously we developed two classifiers for 8 broad clinically relevant classes to reclassify and validate UMLS concepts. We found them to be complementary, and then combined them using a manual approach. In this paper, we extended the classifiers by adding an "other" class to categorize concepts not belonging to any of the 8 classes. In addition, we focused on automating the method for combining the two classifiers by training a meta-classifier that performs dynamic combination to exploit the strength of each classifier. The automated method performed as well as manual combination, achieving classification accuracy of about 0.81.

  20. Genome Sequencing

    DEFF Research Database (Denmark)

    Sato, Shusei; Andersen, Stig Uggerhøj

    2014-01-01

    The current Lotus japonicus reference genome sequence is based on a hybrid assembly of Sanger TAC/BAC, Sanger shotgun and Illumina shotgun sequencing data generated from the Miyakojima-MG20 accession. It covers nearly all expressed L. japonicus genes and has been annotated mainly based on transcr...

  1. Identification of Mitral Annulus Hinge Point Based on Local Context Feature and Additive SVM Classifier

    Directory of Open Access Journals (Sweden)

    Jianming Zhang

    2015-01-01

    Full Text Available The position of the hinge point of mitral annulus (MA) is important for segmentation, modeling and multimodalities registration of cardiac structures. The main difficulties in identifying the hinge point of MA are the inherent noise and low resolution of echocardiography, and so on. This work aims to automatically detect the hinge point of MA by combining local context feature with additive support vector machines (SVM) classifier. The innovations are as follows: (1) designing a local context feature for MA in cardiac ultrasound image; (2) applying the additive kernel SVM classifier to identify the candidates of the hinge point of MA; (3) designing a weighted density field of candidates which represents the blocks of candidates; and (4) estimating an adaptive threshold on the weighted density field to get the position of the hinge point of MA and exclude the error from SVM classifier. The proposed algorithm is tested on echocardiographic four-chamber image sequences of 10 pediatric patients. Compared with the hinge points of MA manually selected by professional doctors, the mean error is 0.96 ± 1.04 mm. The additive SVM classifier can quickly and accurately identify the MA hinge point.

  2. Identification of Mitral Annulus Hinge Point Based on Local Context Feature and Additive SVM Classifier.

    Science.gov (United States)

    Zhang, Jianming; Liu, Yangchun; Xu, Wei

    2015-01-01

    The position of the hinge point of mitral annulus (MA) is important for segmentation, modeling and multimodalities registration of cardiac structures. The main difficulties in identifying the hinge point of MA are the inherent noise and low resolution of echocardiography, and so on. This work aims to automatically detect the hinge point of MA by combining local context feature with additive support vector machines (SVM) classifier. The innovations are as follows: (1) designing a local context feature for MA in cardiac ultrasound image; (2) applying the additive kernel SVM classifier to identify the candidates of the hinge point of MA; (3) designing a weighted density field of candidates which represents the blocks of candidates; and (4) estimating an adaptive threshold on the weighted density field to get the position of the hinge point of MA and exclude the error from SVM classifier. The proposed algorithm is tested on echocardiographic four-chamber image sequences of 10 pediatric patients. Compared with the hinge points of MA manually selected by professional doctors, the mean error is 0.96 ± 1.04 mm. The additive SVM classifier can quickly and accurately identify the MA hinge point.

  3. iFish: predicting the pathogenicity of human nonsynonymous variants using gene-specific/family-specific attributes and classifiers.

    Science.gov (United States)

    Wang, Meng; Wei, Liping

    2016-08-16

    Accurate prediction of the pathogenicity of genomic variants, especially nonsynonymous single nucleotide variants (nsSNVs), is essential in biomedical research and clinical genetics. Most current prediction methods build a generic classifier for all genes. However, different genes and gene families have different features. We investigated whether gene-specific and family-specific customized classifiers could improve prediction accuracy. Customized gene-specific and family-specific attributes were selected with AIC, BIC, and LASSO, and Support Vector Machine classifiers were generated for 254 genes and 152 gene families, covering a total of 5,985 genes. Our results showed that the customized attributes reflected key features of the genes and gene families, and the customized classifiers achieved higher prediction accuracy than the generic classifier. The customized classifiers and the generic classifier for other genes and families were integrated into a new tool named iFish (integrated Functional inference of SNVs in human, http://ifish.cbi.pku.edu.cn). iFish outperformed other methods on benchmark datasets as well as on prioritization of candidate causal variants from whole exome sequencing. iFish provides a user-friendly web-based interface and supports other functionalities such as integration of genetic evidence. iFish would facilitate high-throughput evaluation and prioritization of nsSNVs in human genetics research.

  4. Malignancy and Abnormality Detection of Mammograms using Classifier Ensembling

    Directory of Open Access Journals (Sweden)

    Nawazish Naveed

    2011-07-01

    Full Text Available Breast cancer detection and diagnosis is a critical and complex procedure that demands a high degree of accuracy. In computer-aided diagnostic systems, breast cancer detection is a two-stage procedure: in the first stage, mammograms are classified as malignant or benign, while in the second stage, the type of abnormality is detected. In this paper, we have developed a novel architecture to enhance the classification of malignant and benign mammograms using multi-classification of malignant mammograms into six abnormality classes. DWT (Discrete Wavelet Transformation) features are extracted from preprocessed images and passed through different classifiers. To improve accuracy, results generated by various classifiers are ensembled. The genetic algorithm is used to find optimal weights rather than assigning weights to the results of classifiers on the basis of heuristics. The mammograms declared as malignant by ensemble classifiers are divided into six classes. The ensemble classifiers are further used for multiclassification using the one-against-all technique for classification. The output of all ensemble classifiers is combined by product, median and mean rule. It has been observed that the accuracy of classification of abnormalities is more than 97% in the case of the mean rule. The Mammographic Image Analysis Society dataset is used for experimentation.

  5. Frog sound identification using extended k-nearest neighbor classifier

    Science.gov (United States)

    Mukahar, Nordiana; Affendi Rosdi, Bakhtiar; Athiar Ramli, Dzati; Jaafar, Haryati

    2017-09-01

    Frog sound identification based on vocalization has become important for biological research and environmental monitoring. As a result, different types of feature extraction methods and classifiers have been employed to evaluate the accuracy of frog sound identification. This paper presents frog sound identification with the Extended k-Nearest Neighbor (EKNN) classifier. The EKNN classifier integrates the nearest neighbors and mutual sharing of neighborhood concepts, with the aim of improving the classification performance. It makes a prediction based on which samples are the nearest neighbors of the testing sample and which samples consider the testing sample as their nearest neighbor. In order to evaluate the classification performance in frog sound identification, the EKNN classifier is compared with the competing classifiers k-Nearest Neighbor (KNN), Fuzzy k-Nearest Neighbor (FKNN), k-General Nearest Neighbor (KGNN) and Mutual k-Nearest Neighbor (MKNN) on the recorded sounds of 15 frog species obtained in Malaysian forest. The recorded sounds have been segmented using Short Time Energy and Short Time Average Zero Crossing Rate (STE+STAZCR), sinusoidal modeling (SM), manual segmentation and the combination of Energy (E) and Zero Crossing Rate (ZCR) (E+ZCR), while the features are extracted by Mel Frequency Cepstrum Coefficient (MFCC). The experimental results have shown that the EKNN classifier exhibits the best performance in terms of accuracy compared to the competing classifiers KNN, FKNN, KGNN and MKNN for all cases.
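    The two-way neighborhood idea described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' EKNN: the vote pools the query's k nearest training samples with every training sample that would count the query among its own k nearest neighbors, and then takes a plain majority. The toy points, labels and the function name `extended_knn_predict` are hypothetical.

```python
import math
from collections import Counter

def k_nearest(query, points, k, exclude=None):
    """Indices of the k points closest to query, optionally skipping one index."""
    idx = [i for i in range(len(points)) if i != exclude]
    idx.sort(key=lambda i: math.dist(query, points[i]))
    return idx[:k]

def extended_knn_predict(query, points, labels, k):
    """Majority vote over forward and backward neighbors of the query."""
    # Forward neighbors: training samples nearest to the query
    forward = set(k_nearest(query, points, k))
    # Backward neighbors: training samples for which the query is at least as
    # close as their current k-th nearest training neighbor
    backward = set()
    for i, p in enumerate(points):
        kth = k_nearest(p, points, k, exclude=i)[-1]
        if math.dist(p, query) <= math.dist(p, points[kth]):
            backward.add(i)
    votes = Counter(labels[i] for i in sorted(forward | backward))
    return votes.most_common(1)[0][0]

# Hypothetical 2-D feature vectors standing in for MFCC features of two species
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]

pred_near_a = extended_knn_predict((0.5, 0.5), points, labels, k=2)
pred_near_b = extended_knn_predict((5.5, 5.5), points, labels, k=2)
```

    Compared with plain KNN, the backward set lets training samples that "claim" the query take part in the vote even when they are not among its k nearest neighbors. The segmentation and MFCC feature extraction steps from the abstract are outside the scope of this sketch.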

  6. Representation of classifier distributions in terms of hypergeometric functions

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    This paper derives alternative analytical expressions for classifier product distributions in terms of Gauss hypergeometric function, 2F1, by considering feed distribution defined in terms of Gates-Gaudin-Schumann function and efficiency curve defined in terms of a logistic function. It is shown that classifier distributions under dispersed conditions of classification pivot at a common size and the distributions are difference similar. The paper also addresses an inverse problem of classifier distributions wherein the feed distribution and efficiency curve are identified from the measured product distributions without needing to know the solid flow split of particles to any of the product streams.

  7. Faint spatial object classifier construction based on data mining technology

    Science.gov (United States)

    Lou, Xin; Zhao, Yang; Liao, Yurong; Nie, Yong-ming

    2016-11-01

    Data mining can effectively obtain the faint spatial object's patterns and characteristics, the universal relations and other implicated data characteristics, the key of which is classifier construction. Faint spatial object classifier construction with spatial data mining technology for faint spatial target detection is proposed, based on a detailed theoretical analysis of design procedures and guidelines. To address the one-sidedness of this method in dealing with fuzziness and randomness, a cloud modal classifier is proposed. Simulation analysis results indicate that this method can realize classification quickly through feature combination and effectively resolve the one-sidedness problem.

  8. Classifying proteins into functional groups based on all-versus-all BLAST of 10 million proteins.

    Science.gov (United States)

    Kolker, Natali; Higdon, Roger; Broomall, William; Stanberry, Larissa; Welch, Dean; Lu, Wei; Haynes, Winston; Barga, Roger; Kolker, Eugene

    2011-01-01

    To address the monumental challenge of assigning function to millions of sequenced proteins, we completed the first-of-a-kind all-versus-all sequence alignment using BLAST for 9.9 million proteins in the UniRef100 database. Microsoft Windows Azure produced over 3 billion filtered records in 6 days using 475 eight-core virtual machines. Protein classification into functional groups was then performed using Hive and custom jars implemented on top of Apache Hadoop utilizing the MapReduce paradigm. First, using the Clusters of Orthologous Genes (COG) database, a length normalized bit score (LNBS) was determined to be the best similarity measure for classification of proteins. LNBS achieved sensitivity and specificity of 98% each. Second, out of 5.1 million bacterial proteins, about two-thirds were assigned to significantly extended COG groups, encompassing 30 times more assigned proteins. Third, the remaining proteins were classified into protein functional groups using an innovative implementation of a single-linkage algorithm on an in-house Hadoop compute cluster. This implementation significantly reduces the run time for nonindexed queries and optimizes efficient clustering on a large scale. The performance was also verified on Amazon Elastic MapReduce. This clustering assigned nearly 2 million proteins to approximately half a million different functional groups. A similar approach was applied to classify 2.8 million eukaryotic sequences, resulting in over 1 million proteins being assigned to existing KOG groups and the remainder clustered into 100,000 functional groups.
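
    Single-linkage clustering of pairwise similarity hits can be sketched on a single machine with a union-find structure. This is a minimal sketch, not the authors' Hadoop implementation: the function name, the integer protein indices and the threshold are hypothetical, and the scores are assumed to be some precomputed similarity measure such as the LNBS.

```python
def single_linkage_groups(n_items, scored_pairs, threshold):
    """Group items so that any pair scoring at or above `threshold`
    ends up in the same cluster (single linkage via union-find).

    scored_pairs: iterable of (index_a, index_b, similarity) tuples.
    """
    parent = list(range(n_items))

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b, score in scored_pairs:
        if score >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a  # merge the two clusters

    groups = {}
    for i in range(n_items):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

    Because single linkage only needs to know whether two items are connected by a chain of sufficiently similar pairs, the hits can be streamed in any order, which is one reason the method suits large all-versus-all result sets.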

  9. Prediction of small molecule binding property of protein domains with Bayesian classifiers based on Markov chains.

    Science.gov (United States)

    Bulashevska, Alla; Stein, Martin; Jackson, David; Eils, Roland

    2009-12-01

    Accurate computational methods that can help to predict the biological function of a protein from its sequence are of great interest to research biologists and pharmaceutical companies. One approach to inferring the function of proteins is to predict the interactions between proteins and other molecules. In this work, we propose a machine learning method that uses the primary sequence of a domain to predict its propensity for interaction with small molecules. By curating the Pfam database with respect to the small molecule binding ability of its component domains, we have constructed a dataset of small molecule binding and non-binding domains. This dataset was then used as a training set to learn a Bayesian classifier, which should distinguish members of each class. The domain sequences of both classes are modelled with Markov chains. In a jackknife test, our classification procedure achieved predictive accuracies of 77.2% and 66.7% for the binding and non-binding classes, respectively. We demonstrate the applicability of our classifier by using it to identify previously unknown small molecule binding domains. Our predictions are available as supplementary material and can provide very useful information to drug discovery specialists. Given the ubiquitous and essential role small molecules play in biological processes, our method is important for identifying pharmaceutically relevant components of complete proteomes. The software is available from the author upon request.
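
    A Bayesian classifier over per-class Markov chains can be sketched as below. The sketch assumes first-order chains, add-one smoothing and class priors taken from training frequencies, and it ignores the initial-state distribution; none of these choices is confirmed by the abstract, and the class name `MarkovChainClassifier` is hypothetical.

```python
import math
from collections import Counter, defaultdict


class MarkovChainClassifier:
    """One first-order Markov chain per class, combined with class priors."""

    def __init__(self, alphabet, pseudocount=1.0):
        self.alphabet = list(alphabet)
        self.pseudocount = pseudocount
        self.log_trans = {}   # label -> {(prev, cur): log probability}
        self.log_prior = {}   # label -> log prior

    def fit(self, sequences, labels):
        counts = {}
        for seq, label in zip(sequences, labels):
            trans = counts.setdefault(label, defaultdict(float))
            for prev, cur in zip(seq, seq[1:]):
                trans[(prev, cur)] += 1.0
        label_freq = Counter(labels)
        for label, trans in counts.items():
            table = {}
            for prev in self.alphabet:
                # Smoothed row total so unseen transitions keep nonzero mass.
                total = sum(trans[(prev, c)] for c in self.alphabet)
                total += self.pseudocount * len(self.alphabet)
                for cur in self.alphabet:
                    table[(prev, cur)] = math.log(
                        (trans[(prev, cur)] + self.pseudocount) / total)
            self.log_trans[label] = table
            self.log_prior[label] = math.log(label_freq[label] / len(labels))
        return self

    def predict(self, seq):
        # Symbols outside the alphabet would raise KeyError in this sketch.
        def score(label):
            table = self.log_trans[label]
            return self.log_prior[label] + sum(
                table[(p, c)] for p, c in zip(seq, seq[1:]))
        return max(self.log_trans, key=score)
```

    For protein domains the alphabet would be the 20 amino acid letters and the two labels would be binding and non-binding; a higher-order chain would only change how the transition keys are built.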

  10. 42 CFR 37.50 - Interpreting and classifying chest roentgenograms.

    Science.gov (United States)

    2010-10-01

    ... interpreted and classified in accordance with the ILO Classification system and recorded on a Roentgenographic... under the Act, shall have immediately available for reference a complete set of the ILO...

  11. NUMERICAL SIMULATION OF PARTICLE MOTION IN TURBO CLASSIFIER

    Institute of Scientific and Technical Information of China (English)

    Ning Xu; Guohua Li; Zhichu Huang

    2005-01-01

    Research on the flow field inside a turbo classifier is complicated though important. According to the stochastic trajectory model of particles in gas-solid two-phase flow, and adopting the PHOENICS code, numerical simulation is carried out on the flow field, including particle trajectory, in the inner cavity of a turbo classifier, using both straight and backward crooked elbow blades. Computation results show that when the backward crooked elbow blades are used, the mixed stream that passes through the two blades produces a vortex in the positive direction which counteracts the attached vortex in the opposite direction due to the high-speed turbo rotation, making the flow steadier, thus improving both the grade efficiency and precision of the turbo classifier. This research provides positive theoretical evidences for designing sub-micron particle classifiers with high efficiency and accuracy.

  12. A novel statistical method for classifying habitat generalists and specialists

    DEFF Research Database (Denmark)

    Chazdon, Robin L; Chao, Anne; Colwell, Robert K

    2011-01-01

    We develop a novel statistical approach for classifying generalists and specialists in two distinct habitats. Using a multinomial model based on estimated species relative abundance in two habitats, our method minimizes bias due to differences in sampling intensities between two habitat types...... as well as bias due to insufficient sampling within each habitat. The method permits a robust statistical classification of habitat specialists and generalists, without excluding rare species a priori. Based on a user-defined specialization threshold, the model classifies species into one of four groups...... fraction (57.7%) of bird species with statistical confidence. Based on a conservative specialization threshold and adjustment for multiple comparisons, 64.4% of tree species in the full sample were too rare to classify with confidence. Among the species classified, OG specialists constituted the largest...

  13. A NON-PARAMETER BAYESIAN CLASSIFIER FOR FACE RECOGNITION

    Institute of Scientific and Technical Information of China (English)

    Liu Qingshan; Lu Hanqing; Ma Songde

    2003-01-01

    A non-parametric Bayesian classifier based on Kernel Density Estimation (KDE) is presented for face recognition, which can be regarded as a weighted Nearest Neighbor (NN) classifier in formulation. The class conditional density is estimated by KDE, and the bandwidth of the kernel function is estimated by the Expectation Maximization (EM) algorithm. Two subspace analysis methods, linear Principal Component Analysis (PCA) and Kernel-based PCA (KPCA), are respectively used to extract features, and the proposed method is compared with Probabilistic Reasoning Models (PRM), Nearest Center (NC) and NN classifiers, which are widely used in face recognition systems. The experiments are performed on two benchmarks, and the experimental results show that the KDE classifier outperforms the PRM, NC and NN classifiers.
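
    The link between a KDE-based Bayes rule and a weighted nearest-neighbor vote can be illustrated with a short sketch. It assumes an isotropic Gaussian kernel with a fixed, hand-chosen bandwidth (the paper estimates the bandwidth with EM, which is not reproduced here) and class priors equal to training frequencies; the function name is hypothetical.

```python
import numpy as np


def kde_bayes_predict(X_train, y_train, x, bandwidth=1.0):
    """Pick the class with the largest prior-weighted kernel density at x."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    x = np.asarray(x, dtype=float)

    # Gaussian kernel weight of every training sample; the normalizing
    # constant is the same for all classes and is therefore dropped.
    sq_dist = np.sum((X_train - x) ** 2, axis=1)
    weights = np.exp(-sq_dist / (2.0 * bandwidth ** 2))

    # prior * density = (n_c / n) * (1 / n_c) * sum of weights in class c,
    # so the posterior is proportional to the summed weights per class.
    classes = np.unique(y_train)
    scores = np.array([weights[y_train == c].sum() for c in classes])
    return classes[np.argmax(scores)]
```

    Each training sample votes for its class with a weight that decays with distance, which is why the classifier can be read as a weighted NN rule; feature vectors would come from PCA or KPCA projections of the face images.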

  14. A novel ensemble and composite approach for classifying proteins ...

    African Journals Online (AJOL)

    African Journal of Biotechnology ... For the fact that the location of proteins gave some details about the function of a protein whose location was ... (K-NN) classifiers, each of which was defined in a different pseudo amino composition vector.

  15. A semi-automated approach to building text summarisation classifiers

    Directory of Open Access Journals (Sweden)

    Matias Garcia-Constantino

    2012-12-01

    An investigation into the extraction of useful information from the free text element of questionnaires, using a semi-automated summarisation extraction technique, is described. The summarisation technique utilises the concept of classification but with the support of domain/human experts during classifier construction. A realisation of the proposed technique, SARSET (Semi-Automated Rule Summarisation Extraction Tool), is presented and evaluated using real questionnaire data. The results of this evaluation are compared against the results obtained using two alternative techniques to build text summarisation classifiers. The first of these uses standard rule-based classifier generators, and the second is founded on the concept of building classifiers using secondary data. The results demonstrate that the proposed semi-automated approach outperforms the other two approaches considered.

  16. One pass learning for generalized classifier neural network.

    Science.gov (United States)

    Ozyildirim, Buse Melis; Avci, Mutlu

    2016-01-01

    Generalized classifier neural network introduced as a kind of radial basis function neural network, uses gradient descent based optimized smoothing parameter value to provide efficient classification. However, optimization consumes quite a long time and may cause a drawback. In this work, one pass learning for generalized classifier neural network is proposed to overcome this disadvantage. Proposed method utilizes standard deviation of each class to calculate corresponding smoothing parameter. Since different datasets may have different standard deviations and data distributions, proposed method tries to handle these differences by defining two functions for smoothing parameter calculation. Thresholding is applied to determine which function will be used. One of these functions is defined for datasets having different range of values. It provides balanced smoothing parameters for these datasets through logarithmic function and changing the operation range to lower boundary. On the other hand, the other function calculates smoothing parameter value for classes having standard deviation smaller than the threshold value. Proposed method is tested on 14 datasets and performance of one pass learning generalized classifier neural network is compared with that of probabilistic neural network, radial basis function neural network, extreme learning machines, and standard and logarithmic learning generalized classifier neural network in MATLAB environment. One pass learning generalized classifier neural network provides more than a thousand times faster classification than standard and logarithmic generalized classifier neural network. Due to its classification accuracy and speed, one pass generalized classifier neural network can be considered as an efficient alternative to probabilistic neural network. Test results show that proposed method overcomes computational drawback of generalized classifier neural network and may increase the classification performance. 

  17. DNA Sequencing

    Science.gov (United States)

    Tabor, Stanley; Richardson, Charles C.

    1995-04-25

    A method for sequencing a strand of DNA, including the steps of: providing the strand of DNA; annealing the strand with a primer able to hybridize to the strand to give an annealed mixture; incubating the mixture with four deoxyribonucleoside triphosphates, a DNA polymerase, and at least three deoxyribonucleoside triphosphates in different amounts, under conditions favoring primer extension to form nucleic acid fragments complementary to the DNA to be sequenced; labelling the nucleic acid fragments; separating them and determining the position of the deoxyribonucleoside triphosphates by differences in the intensity of the labels, thereby to determine the DNA sequence.

  18. Accuracy/diversity and ensemble MLP classifier design.

    Science.gov (United States)

    Windeatt, Terry

    2006-09-01

    The difficulties of tuning parameters of multilayer perceptrons (MLP) classifiers are well known. In this paper, a measure is described that is capable of predicting the number of classifier training epochs for achieving optimal performance in an ensemble of MLP classifiers. The measure is computed between pairs of patterns on the training data and is based on a spectral representation of a Boolean function. This representation characterizes the mapping from classifier decisions to target label and allows accuracy and diversity to be incorporated within a single measure. Results on many benchmark problems, including the Olivetti Research Laboratory (ORL) face database demonstrate that the measure is well correlated with base-classifier test error, and may be used to predict the optimal number of training epochs. While correlation with ensemble test error is not quite as strong, it is shown in this paper that the measure may be used to predict number of epochs for optimal ensemble performance. Although the technique is only applicable to two-class problems, it is extended here to multiclass through output coding. For the output-coding technique, a random code matrix is shown to give better performance than one-per-class code, even when the base classifier is well-tuned.

  19. A cardiorespiratory classifier of voluntary and involuntary electrodermal activity

    Directory of Open Access Journals (Sweden)

    Sejdic Ervin

    2010-02-01

    Background: Electrodermal reactions (EDRs) can be attributed to many origins, including spontaneous fluctuations of electrodermal activity (EDA) and stimuli such as deep inspirations, voluntary mental activity and startling events. In fields that use EDA as a measure of psychophysiological state, the fact that EDRs may be elicited from many different stimuli is often ignored. This study attempts to classify observed EDRs as voluntary (i.e., generated from intentional respiratory or mental activity) or involuntary (i.e., generated from startling events or spontaneous electrodermal fluctuations). Methods: Eight able-bodied participants were subjected to conditions that would cause a change in EDA: music imagery, startling noises, and deep inspirations. A user-centered cardiorespiratory classifier consisting of (1) an EDR detector, (2) a respiratory filter and (3) a cardiorespiratory filter was developed to automatically detect a participant's EDRs and to classify the origin of their stimulation as voluntary or involuntary. Results: Detected EDRs were classified with a positive predictive value of 78%, a negative predictive value of 81% and an overall accuracy of 78%. Without the classifier, EDRs could only be correctly attributed as voluntary or involuntary with an accuracy of 50%. Conclusions: The proposed classifier may enable investigators to form more accurate interpretations of electrodermal activity as a measure of an individual's psychophysiological state.
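
    The three reported figures follow from a binary confusion matrix. The sketch below shows the standard definitions; the function name and the counts used to exercise it are hypothetical and are not the study's data.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard summary statistics of a binary confusion matrix.

    tp/fp: predicted positive and actually positive/negative.
    tn/fn: predicted negative and actually negative/positive.
    """
    return {
        # Share of positive predictions that are correct.
        "ppv": tp / (tp + fp),
        # Share of negative predictions that are correct.
        "npv": tn / (tn + fn),
        # Share of all predictions that are correct.
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }
```

    Which class counts as "positive" (voluntary or involuntary) is a convention that must be fixed before the predictive values can be compared across studies.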

  20. LESS: a model-based classifier for sparse subspaces.

    Science.gov (United States)

    Veenman, Cor J; Tax, David M J

    2005-09-01

    In this paper, we specifically focus on high-dimensional data sets for which the number of dimensions is an order of magnitude higher than the number of objects. From a classifier design standpoint, such small sample size problems have some interesting challenges. The first challenge is to find, from all hyperplanes that separate the classes, a separating hyperplane which generalizes well for future data. A second important task is to determine which features are required to distinguish the classes. To attack these problems, we propose the LESS (Lowest Error in a Sparse Subspace) classifier that efficiently finds linear discriminants in a sparse subspace. In contrast with most classifiers for high-dimensional data sets, the LESS classifier incorporates a (simple) data model. Further, by means of a regularization parameter, the classifier establishes a suitable trade-off between subspace sparseness and classification accuracy. In the experiments, we show how LESS performs on several high-dimensional data sets and compare its performance to related state-of-the-art classifiers like, among others, linear ridge regression with the LASSO and the Support Vector Machine. It turns out that LESS performs competitively while using fewer dimensions.

  1. Classifying the embedded young stellar population in Perseus and Taurus and the LOMASS database

    DEFF Research Database (Denmark)

    Carney, M. T.; Yıldız, U. A.; Mottram, J. C.

    2016-01-01

    Context. The classification of young stellar objects (YSOs) is typically done using the infrared spectral slope or bolometric temperature, but either can result in contamination of samples. More accurate methods to determine the evolutionary stage of YSOs will improve the reliability of statistics...... in the protostellar envelopes. The spatial concentration of HCO+ J = 4-3 and 850 μm dust emission are used to classify the embedded nature of YSOs. Results. Approximately 30% of Class 0+I sources in Perseus and Taurus are not Stage I, but are likely to be more evolved Stage II pre-main sequence (PMS) stars with disks...

  2. Chameleon sequences in neurodegenerative diseases

    Energy Technology Data Exchange (ETDEWEB)

    Bahramali, Golnaz [Institute of Biochemistry and Biophysics, University of Tehran, Tehran (Iran, Islamic Republic of); Goliaei, Bahram, E-mail: goliaei@ut.ac.ir [Institute of Biochemistry and Biophysics, University of Tehran, Tehran (Iran, Islamic Republic of); Minuchehr, Zarrin, E-mail: minuchehr@nigeb.ac.ir [Department of Systems Biotechnology, National Institute of Genetic Engineering and Biotechnology, (NIGEB), Tehran (Iran, Islamic Republic of); Salari, Ali [Department of Systems Biotechnology, National Institute of Genetic Engineering and Biotechnology, (NIGEB), Tehran (Iran, Islamic Republic of)

    2016-03-25

    Chameleon sequences can adopt either an alpha helix, a beta sheet or a coil conformation. Defining chameleon sequences in the PDB (Protein Data Bank) may yield insight into the peptides and proteins responsible for neurodegeneration. In this research, we benefitted from the large PDB and performed a sequence analysis on chameleons, where we developed an algorithm to extract peptide segments with identical sequences but different structures. In order to find new chameleon sequences, we extracted a set of 8315 non-redundant protein sequences from the PDB with an identity of less than 25%. Our data was classified into “helix to strand (HE)”, “helix to coil (HC)” and “strand to coil (CE)” alterations. We also analyzed the occurrence of singlet and doublet amino acids and the solvent accessibility in the chameleon sequences; we then sorted out the proteins with the largest number of chameleon sequences and named them Chameleon Flexible Proteins (CFPs) in our dataset. Our data revealed that Gly, Val, Ile, Tyr and Phe are the major amino acids in chameleons. We also found that proteins such as Insulin Degrading Enzyme (IDE) and GTP-binding nuclear protein Ran (RAN) have the largest numbers of chameleons (640 and 405, respectively). These proteins have known roles in neurodegenerative diseases. It can therefore be inferred that other CFPs may serve as key proteins in neurodegeneration, and that studying them can shed light on curing and preventing neurodegenerative diseases.

  3. Decision Tree Classifiers for Star/Galaxy Separation

    Science.gov (United States)

    Vasconcellos, E. C.; de Carvalho, R. R.; Gal, R. R.; LaBarbera, F. L.; Capelato, H. V.; Frago Campos Velho, H.; Trevisan, M.; Ruiz, R. S. R.

    2011-06-01

    We study the star/galaxy classification efficiency of 13 different decision tree algorithms applied to photometric objects in the Sloan Digital Sky Survey Data Release Seven (SDSS-DR7). Each algorithm is defined by a set of parameters which, when varied, produce different final classification trees. We extensively explore the parameter space of each algorithm, using the set of 884,126 SDSS objects with spectroscopic data as the training set. The efficiency of star-galaxy separation is measured using the completeness function. We find that the Functional Tree algorithm (FT) yields the best results as measured by the mean completeness in two magnitude intervals: 14 = 19 (82.1%). We compare the performance of the tree generated with the optimal FT configuration to the classifications provided by the SDSS parametric classifier, 2DPHOT, and Ball et al. We find that our FT classifier is comparable to or better in completeness over the full magnitude range 15 19), our classifier is the only one that maintains high completeness (>80%) while simultaneously achieving low contamination (~2.5%). We also examine the SDSS parametric classifier (psfMag - modelMag) to see if the dividing line between stars and galaxies can be adjusted to improve the classifier. We find that currently stars in close pairs are often misclassified as galaxies, and suggest a new cut to improve the classifier. Finally, we apply our FT classifier to separate stars from galaxies in the full set of 69,545,326 SDSS photometric objects in the magnitude range 14 <= r <= 21.
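
    Completeness and contamination, the two quality measures used above, can be computed from labeled predictions as in this minimal sketch. It assumes the usual definitions (completeness: fraction of true targets recovered; contamination: fraction of the selected sample that is not a true target); the function name and label strings are hypothetical.

```python
def completeness_contamination(true_labels, predicted_labels, target="galaxy"):
    """Return (completeness, contamination) for one target class.

    Raises ZeroDivisionError if the target class is absent from the true
    labels or is never predicted; a real pipeline should guard for that.
    """
    pairs = list(zip(true_labels, predicted_labels))
    true_pos = sum(1 for t, p in pairs if t == target and p == target)
    n_true = sum(1 for t, _ in pairs if t == target)
    n_selected = sum(1 for _, p in pairs if p == target)

    completeness = true_pos / n_true
    contamination = (n_selected - true_pos) / n_selected
    return completeness, contamination
```

    Evaluating both numbers per magnitude bin, rather than over the whole sample, shows where a classifier starts to degrade at the faint end.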

  4. Objective Assessment of Physical Activity: Classifiers for Public Health.

    Science.gov (United States)

    Kerr, Jacqueline; Patterson, Ruth E; Ellis, Katherine; Godbole, Suneeta; Johnson, Eileen; Lanckriet, Gert; Staudenmayer, John

    2016-05-01

    Walking for health is recommended by health agencies, partly based on epidemiological studies of self-reported behaviors. Accelerometers are now replacing survey data, but it is not clear that intensity-based cut points reflect the behaviors previously reported. New computational techniques can help classify raw accelerometer data into behaviors meaningful for public health. Five hundred twenty days of triaxial 30-Hz accelerometer data from three studies (n = 78) were employed as training data. Study 1 included prescribed activities completed in natural settings. The other two studies included multiple days of free-living data with SenseCam-annotated ground truth. The two populations in the free-living data sets were demographically and physically different. Random forest classifiers were trained on each data set, and the classification accuracy on the training data set and that applied to the other available data sets were assessed. Accelerometer cut points were also compared with the ground truth from the three data sets. The random forest classified all behaviors with over 80% accuracy. Classifiers developed on the prescribed data performed with higher accuracy than the free-living data classifier, but these did not perform as well on the free-living data sets. Many of the observed behaviors occurred at different intensities compared with those identified by existing cut points. New machine learning classifiers developed from prescribed activities (study 1) were considerably less accurate when applied to free-living populations or to a functionally different population (studies 2 and 3). These classifiers, developed on free-living data, may have value when applied to large cohort studies with existing hip accelerometer data.

  5. Nonparametric, Coupled, Bayesian, Dictionary, and Classifier Learning for Hyperspectral Classification.

    Science.gov (United States)

    Akhtar, Naveed; Mian, Ajmal

    2017-10-03

    We present a principled approach to learn a discriminative dictionary along with a linear classifier for hyperspectral classification. Our approach places Gaussian Process priors over the dictionary to account for the relative smoothness of the natural spectra, whereas the classifier parameters are sampled from multivariate Gaussians. We employ two Beta-Bernoulli processes to jointly infer the dictionary and the classifier. These processes are coupled under the same sets of Bernoulli distributions. In our approach, these distributions signify the frequency of the dictionary atom usage in representing class-specific training spectra, which also makes the dictionary discriminative. Due to the coupling between the dictionary and the classifier, the popularity of the atoms for representing different classes gets encoded into the classifier. This helps in predicting the class labels of test spectra that are first represented over the dictionary by solving a simultaneous sparse optimization problem. The labels of the spectra are predicted by feeding the resulting representations to the classifier. Our approach exploits the nonparametric Bayesian framework to automatically infer the dictionary size--the key parameter in discriminative dictionary learning. Moreover, it also has the desirable property of adaptively learning the association between the dictionary atoms and the class labels by itself. We use Gibbs sampling to infer the posterior probability distributions over the dictionary and the classifier under the proposed model, for which we derive analytical expressions. To establish the effectiveness of our approach, we test it on benchmark hyperspectral images. The classification performance is compared with the state-of-the-art dictionary learning-based classification methods.

  6. [Horticultural plant diseases multispectral classification using combined classified methods].

    Science.gov (United States)

    Feng, Jie; Li, Hong-Ning; Yang, Wei-Ping; Hou, De-Dong; Liao, Ning-Fang

    2010-02-01

    Research on multispectral data processing is receiving more and more attention with the development of multispectral techniques, data capture capability and the application of multispectral techniques in agricultural practice. In the present paper, common diseases of the cultivated cucumber plant (Trichothecium roseum, Sphaerotheca fuliginea, Cladosporium cucumerinum, Corynespora cassiicola, Pseudoperonospora cubensis) are the research objects. Multispectral images of cucumber leaves in 14 visible light channels, a near infrared channel and a panchromatic channel were captured using a narrow-band multispectral imaging system under standard observation and illumination conditions, and 210 multispectral data samples, each consisting of the 16-band spectral reflectance of a cucumber disease, were obtained. The 210 samples were classified by distance, correlation and BP neural network methods in order to find an effective combination of classification methods for diagnosis. The results show that the combination of the distance and BP neural network classification methods performs better than either method alone and makes full use of the advantages of each. A workflow for recognizing horticultural plant diseases using combined classification methods is also presented.

  7. Classifying the Topology of AHL-Driven Quorum Sensing Circuits in Proteobacterial Genomes

    Directory of Open Access Journals (Sweden)

    Sándor Pongor

    2012-04-01

    Virulence and adaptability of many Gram-negative bacterial species are associated with an N-acylhomoserine lactone (AHL) gene regulation mechanism called quorum sensing (QS). The arrangement of quorum sensing genes is variable throughout bacterial genomes, although there are unifying themes that are common among the various topological arrangements. A bioinformatics survey of 1,403 complete bacterial genomes revealed characteristic gene topologies in 152 genomes that could be classified into 16 topological groups. We developed a concise notation for the patterns and show that the sequences of LuxR regulators and LuxI autoinducer synthase proteins cluster according to the topological patterns. The annotated topologies are deposited online with links to sequences and genome annotations at http://bacteria.itk.ppke.hu/QStopologies/.

  8. Training Classifiers with Shadow Features for Sensor-Based Human Activity Recognition

    Science.gov (United States)

    Fong, Simon; Song, Wei; Cho, Kyungeun; Wong, Raymond; Wong, Kelvin K. L.

    2017-01-01

    In this paper, a novel training/testing process for building/using a classification model based on human activity recognition (HAR) is proposed. Traditionally, HAR has been accomplished by a classifier that learns the activities of a person by training with skeletal data obtained from a motion sensor, such as Microsoft Kinect. These skeletal data are the spatial coordinates (x, y, z) of different parts of the human body. The numeric information forms time series, temporal records of movement sequences that can be used for training a classifier. In addition to the spatial features that describe current positions in the skeletal data, new features called ‘shadow features’ are used to improve the supervised learning efficacy of the classifier. Shadow features are inferred from the dynamics of body movements, thereby modelling the underlying momentum of the performed activities. They provide extra dimensions of information for characterising activities in the classification process, and thereby significantly improve the classification accuracy. Two cases of HAR are tested using a classification model trained with shadow features: one uses a wearable sensor and the other a Kinect-based remote sensor. Our experiments demonstrate the advantages of the new method, which will have an impact on human activity detection research. PMID:28264470
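
    One way to derive motion-based features from a coordinate time series is a first-order difference, sketched below. The abstract does not give the exact definition of shadow features, so treating them as frame-to-frame displacements is an assumption, and the function name `add_shadow_features` is hypothetical.

```python
import numpy as np


def add_shadow_features(positions):
    """Append frame-to-frame displacement columns to a (T, D) position array.

    positions: T time steps by D coordinates (e.g. x, y, z per joint).
    Returns a (T, 2 * D) array: original coordinates, then displacements.
    The first frame has no predecessor, so its displacement is zero.
    """
    positions = np.asarray(positions, dtype=float)
    # Prepending the first row makes the first difference exactly zero
    # and keeps the output aligned with the input frames.
    displacement = np.diff(positions, axis=0, prepend=positions[:1])
    return np.hstack([positions, displacement])
```

    The augmented rows can be passed to any standard classifier in place of the raw coordinates; smoothing the displacements over a short window would be a natural variation when sensor noise is high.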

  9. Training Classifiers with Shadow Features for Sensor-Based Human Activity Recognition

    Directory of Open Access Journals (Sweden)

    Simon Fong

    2017-02-01

    In this paper, a novel training/testing process for building/using a classification model based on human activity recognition (HAR) is proposed. Traditionally, HAR has been accomplished by a classifier that learns the activities of a person by training with skeletal data obtained from a motion sensor, such as Microsoft Kinect. These skeletal data are the spatial coordinates (x, y, z) of different parts of the human body. The numeric information forms time series, temporal records of movement sequences that can be used for training a classifier. In addition to the spatial features that describe current positions in the skeletal data, new features called ‘shadow features’ are used to improve the supervised learning efficacy of the classifier. Shadow features are inferred from the dynamics of body movements, thereby modelling the underlying momentum of the performed activities. They provide extra dimensions of information for characterising activities in the classification process, and thereby significantly improve the classification accuracy. Two cases of HAR are tested using a classification model trained with shadow features: one uses a wearable sensor and the other a Kinect-based remote sensor. Our experiments demonstrate the advantages of the new method, which will have an impact on human activity detection research.

  10. Classifying genotype F of hepatitis B virus into F1 and F2 subtypes

    Institute of Scientific and Technical Information of China (English)

    Hideaki Kato; Takanobu Kato; Yuzo Miyakawa; Masashi Mizokami; Kei Fujiwara; Robert G. Gish; Hiroshi Sakugawa; Hiroshi Yoshizawa; Fuminaka Sugauchi; Etsuro Orito; Ryuzo Ueda; Yasuhito Tanaka

    2005-01-01

    AIM: To explore the propriety of providing hepatitis B virus (HBV) genotypes F and H with two distinct genotypes. METHODS: Eleven HBV isolates of genotype F (HBV/F) were recovered from patients living in San Francisco, Japan, Panama, and Venezuela, and their full-length sequences were determined. Phylogenetic analysis was carried out among them along with HBV isolates previously reported. RESULTS: Seven of them clustered with reported HBV/F isolates in the phylogenetic tree constructed on the entire genomic sequence. The remaining four flocked on another branch along with three HBV isolates formerly reported as genotype H. These seven HBV isolates, including the four in this study and the three reported, had a sequence divergence of 7.3-9.5% from the other HBV/F isolates, and differed by >13.7% from HBV isolates of the other six genotypes (A-E and G). Based on a marked genomic divergence, falling just short of >8% separating the seven genotypes, these seven HBV/F isolates were classified into F2 subtype and the former seven into F1 subtype provisionally. In a pairwise comparison of the S-gene sequences among the 7 HBV/F2 isolates and against 47 HBV/F1 isolates as well as 136 representing the other six genotypes (A-E and G), two clusters separated by distinct genetic distances emerged. CONCLUSION: Based on these analyses, classifying HBV/F isolates into two subtypes (F1 and F2) would be more appropriate than providing them with two distinct genotypes (F and H).

  11. A native Bayesian classifier based routing protocol for VANETS

    Science.gov (United States)

    Bao, Zhenshan; Zhou, Keqin; Zhang, Wenbo; Gong, Xiaolei

    2016-12-01

    Geographic routing protocols are one of the hottest research areas in VANET (Vehicular Ad-hoc Network). However, few routing protocols take both transmission efficiency and the usage ratio into account. As we have noticed, different messages in VANET may require different quality of service. We therefore propose a Naive Bayesian Classifier based routing protocol (Naive Bayesian Classifier-Greedy, NBC-Greedy), which classifies and transmits different messages according to their degree of emergency. As a result, this protocol can balance transmission efficiency and the usage ratio. Based on Matlab simulation, we conclude that NBC-Greedy is more efficient and stable than LR-Greedy and GPSR.

  12. A Topic Model Approach to Representing and Classifying Football Plays

    KAUST Repository

    Varadarajan, Jagannadan

    2013-09-09

    We address the problem of modeling and classifying American Football offense teams’ plays in video, a challenging example of group activity analysis. Automatic play classification will allow coaches to infer patterns and tendencies of opponents more efficiently, resulting in better strategy planning in a game. We define a football play as a unique combination of player trajectories. To this end, we develop a framework that uses player trajectories as inputs to MedLDA, a supervised topic model. The joint maximization of both likelihood and inter-class margins of MedLDA in learning the topics allows us to learn semantically meaningful play type templates, as well as classify different play types with 70% average accuracy. Furthermore, this method is extended to analyze individual player roles in classifying each play type. We validate our method on a large dataset comprising 271 play clips from real-world football games, which will be made publicly available for future comparisons.

  13. COMPARISON OF SVM AND FUZZY CLASSIFIER FOR AN INDIAN SCRIPT

    Directory of Open Access Journals (Sweden)

    M. J. Baheti

    2012-01-01

    Full Text Available With the advent of the technological era, conversion of scanned documents (handwritten or printed) into machine-editable format has attracted many researchers. This paper deals with the problem of recognition of Gujarati handwritten numerals. Gujarati numeral recognition requires performing some specific steps as a part of preprocessing. For preprocessing, digitization, segmentation, normalization and thinning are done, assuming that the image has almost no noise. Further, an affine invariant moments based model is used for feature extraction, and finally Support Vector Machine (SVM) and Fuzzy classifiers are used for numeral classification. The comparison of SVM and Fuzzy classifiers shows that SVM achieved better results than the Fuzzy classifier.

  14. [A novel spectral classifier based on coherence measure].

    Science.gov (United States)

    Li, Xiang-ru; Wu, Fu-chao; Hu, Zhan-yi; Luo, A-li

    2005-11-01

    Classification and discovery of new types of celestial bodies from voluminous celestial spectra are two important issues in astronomy, and these two issues are treated separately in the literature to our knowledge. In the present paper, a novel coherence measure is introduced which can effectively measure the coherence of a new spectrum of unknown type with the training samples located within its neighbourhood, then a novel classifier is designed based on this coherence measure. The proposed classifier is capable of carrying out spectral classification and knowledge discovery simultaneously. In particular, it can effectively deal with the situation where different types of training spectra exist within the neighbourhood of a new spectrum, and the traditional k-nearest neighbour method usually fails to reach a correct classification. The satisfactory performance for classification and knowledge discovery has been obtained by the proposed novel classifier over active galactic nucleus (AGNs) and active galaxies (AGs) data.

  15. An n-ary λ-averaging based similarity classifier

    Directory of Open Access Journals (Sweden)

    Kurama Onesfole

    2016-06-01

    Full Text Available We introduce a new n-ary λ similarity classifier that is based on a new n-ary λ-averaging operator in the aggregation of similarities. This work is a natural extension of earlier research on similarity based classification in which aggregation is commonly performed by using the OWA-operator. So far λ-averaging has been used only in binary aggregation. Here the λ-averaging operator is extended to the n-ary aggregation case by using t-norms and t-conorms. We examine four different n-ary norms and test the new similarity classifier with five medical data sets. The new method seems to perform well when compared with the similarity classifier.
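
    For readers unfamiliar with the OWA (ordered weighted averaging) operator mentioned above as the usual aggregation step, a minimal sketch follows. It shows only the standard OWA definition (weights applied to the values sorted in descending order); the paper's n-ary λ-averaging operator is not reproduced here, and the function name `owa` is chosen for illustration.

```python
def owa(values, weights):
    """Ordered weighted average: weights apply to values sorted descending.

    weights must be non-negative and sum to 1.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))
```

    With all weight on the first position the operator returns the maximum similarity, with all weight on the last position the minimum, and with equal weights the arithmetic mean, so the weight vector controls how optimistic the aggregation is.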

  16. A History of Classified Activities at Oak Ridge National Laboratory

    Energy Technology Data Exchange (ETDEWEB)

    Quist, A.S.

    2001-01-30

    The facilities that became Oak Ridge National Laboratory (ORNL) were created in 1943 during the United States' super-secret World War II project to construct an atomic bomb (the Manhattan Project). During World War II and for several years thereafter, essentially all ORNL activities were classified. Now, in 2000, essentially all ORNL activities are unclassified. The major purpose of this report is to provide a brief history of ORNL's major classified activities from 1943 until the present (September 2000). This report is expected to be useful to the ORNL Classification Officer and to ORNL's Authorized Derivative Classifiers and Authorized Derivative Declassifiers in their classification review of ORNL documents, especially those documents that date from the 1940s and 1950s.

  17. Defending Malicious Script Attacks Using Machine Learning Classifiers

    Directory of Open Access Journals (Sweden)

    Nayeem Khan

    2017-01-01

    Full Text Available Web applications have become a primary target for cyber criminals, who inject malware, especially JavaScript, to perform malicious activities for impersonation. It is therefore imperative to detect such malicious code in real time before any malicious activity is performed. This study proposes an efficient method of detecting previously unknown malicious JavaScript using an interceptor at the client side by classifying the key features of the malicious code. A feature subset was obtained by using a wrapper method for dimensionality reduction. Supervised machine learning classifiers were used on the dataset to achieve high accuracy. Experimental results show that our method can efficiently distinguish malicious code from benign code with promising results.

  18. WORD SENSE DISAMBIGUATION BASED ON IMPROVED BAYESIAN CLASSIFIERS

    Institute of Scientific and Technical Information of China (English)

    2006-01-01

    Word Sense Disambiguation (WSD) is the task of deciding the sense of an ambiguous word in a particular context. Most current studies on WSD use only a few ambiguous words as test samples, which limits their practical application. In this paper, we perform a WSD study based on a large-scale real-world corpus using two unsupervised learning algorithms based on a ±n-improved Bayesian model and a Dependency Grammar (DG)-improved Bayesian model. ±n-improved classifiers reduce the window size of the context of ambiguous words with a close-distance feature extraction method and decrease the interference of useless features, thus clearly improving the accuracy, which reaches 83.18% (in open test). The DG-improved classifier can more effectively overcome the noise effect existing in the Naive-Bayesian classifier. Experimental results show that this approach does better on Chinese WSD, and the open test achieved an accuracy of 86.27%.

  19. Automatically Classifying the Role of Citations in Biomedical Articles

    Science.gov (United States)

    Agarwal, Shashank; Choubey, Lisha; Yu, Hong

    2010-01-01

    Citations are widely used in scientific literature. The traditional model of referencing considers all citations to be the same; however, semantically, citations play different roles. By studying the context in which citations appear, it is possible to determine the role that they play. Here, we report on the development of an eight-category classification scheme, annotation using that scheme, and development and evaluation of supervised machine-learning classifiers using the annotated data. We annotated 1,710 sentences using the annotation schema and our trained classifier obtained an average F1-score of 76.5%. The classifier is available for free as a Java API from http://citation.askhermes.org. PMID:21346931

  20. A cascade classifier for diagnosis of melanoma in clinical images.

    Science.gov (United States)

    Sabouri, P; GholamHosseini, H; Larsson, T; Collins, J

    2014-01-01

    Computer aided diagnosis of medical images can help physicians in better detecting and early diagnosis of many symptoms and therefore reducing the mortality rate. Realization of an efficient mobile device for semi-automatic diagnosis of melanoma would greatly enhance the applicability of medical image classification scheme and make it useful in clinical contexts. In this paper, interactive object recognition methodology is adopted for border segmentation of clinical skin lesion images. In addition, performance of five classifiers, KNN, Naïve Bayes, multi-layer perceptron, random forest and SVM are compared based on color and texture features for discriminating melanoma from benign nevus. The results show that a sensitivity of 82.6% and specificity of 83% can be achieved using a single SVM classifier. However, a better classification performance was achieved using a proposed cascade classifier with the sensitivity of 83.06% and specificity of 90.05% when performing ten-fold cross validation.

  1. A Film Classifier Based on Low-level Visual Features

    Directory of Open Access Journals (Sweden)

    Hui-Yu Huang

    2008-07-01

    Full Text Available We propose an approach to classifying films by using low-level features and visual features. This approach aims to classify films into genres. Our current domain of study is the movie preview. A movie preview often emphasizes the theme of a film and hence provides suitable information for the classification process. In our approach, we categorize films into three broad categories: action, drama, and thriller films. Four computable video features (average shot length, color variance, motion content and lighting key) and visual features (show and fast moving effects) are combined in our approach to provide useful information for determining the movie category. The experimental results show that visual features are useful cues for film classification. In addition, our approach can also be extended to other potential applications, including the browsing and retrieval of videos on the internet, video-on-demand, and video libraries.

  2. Iris Recognition Based on LBP and Combined LVQ Classifier

    CERN Document Server

    Shams, M Y; Nomir, O; El-Awady, R M; 10.5121/ijcsit.2011.3506

    2011-01-01

    Iris recognition is considered one of the best biometric methods for human identification and verification, because of its unique features that differ from one person to another and its importance in the security field. This paper proposes an algorithm for iris recognition and classification using a system based on Local Binary Pattern (LBP) and histogram properties as statistical approaches for feature extraction, and a Combined Learning Vector Quantization (LVQ) classifier as a neural network approach for classification, in order to build a hybrid model that depends on both features. The localization and segmentation techniques are presented using both Canny edge detection and the Hough Circular Transform in order to isolate an iris from the whole eye image and for noise detection. Feature vectors resulting from LBP are applied to a Combined LVQ classifier with different classes to determine the minimum acceptable performance, and the result is based on majority voting among several LVQ classifiers. Different iris da...
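
    The Local Binary Pattern step named in this abstract can be sketched in its basic 3x3 form: each interior pixel is compared with its eight neighbours, and the comparison bits form a code from 0 to 255 whose histogram serves as a feature vector. This is a generic illustration under assumed conventions (neighbour order, `>=` comparison, plain nested lists as the image); the paper's exact LBP variant is not given in the abstract.

```python
# Clockwise neighbour offsets starting at the top-left pixel (an assumed order).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_codes(image):
    """Return basic 3x3 LBP codes for the interior pixels of a 2-D list."""
    rows, cols = len(image), len(image[0])
    codes = []
    for r in range(1, rows - 1):
        row_codes = []
        for c in range(1, cols - 1):
            center = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(OFFSETS):
                # Set the bit when the neighbour is at least as bright as the centre.
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            row_codes.append(code)
        codes.append(row_codes)
    return codes

def lbp_histogram(codes):
    """Count how often each of the 256 possible codes occurs."""
    hist = [0] * 256
    for row in codes:
        for code in row:
            hist[code] += 1
    return hist
```

    The histogram, rather than the code image itself, is what would typically be passed to a classifier such as LVQ.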

  3. Optimal threshold estimation for binary classifiers using game theory.

    Science.gov (United States)

    Sanchez, Ignacio Enrique

    2016-01-01

    Many bioinformatics algorithms can be understood as binary classifiers. They are usually compared using the area under the receiver operating characteristic (ROC) curve. On the other hand, choosing the best threshold for practical use is a complex task, due to uncertain and context-dependent skews in the abundance of positives in nature and in the yields/costs for correct/incorrect classification. We argue that considering a classifier as a player in a zero-sum game allows us to use the minimax principle from game theory to determine the optimal operating point. The proposed classifier threshold corresponds to the intersection between the ROC curve and the descending diagonal in ROC space and yields a minimax accuracy of 1-FPR. Our proposal can be readily implemented in practice, and reveals that the empirical condition for threshold estimation of "specificity equals sensitivity" maximizes robustness against uncertainties in the abundance of positives in nature and classification costs.
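
    The "specificity equals sensitivity" operating point described above can be located on empirical data by scanning candidate thresholds and keeping the one where the two rates are closest. The sketch below assumes binary labels coded as 0/1 and the convention that a score at or above the threshold is a positive call; the function names are illustrative and not taken from the paper.

```python
def sens_spec(scores, labels, threshold):
    """Sensitivity and specificity when score >= threshold means positive."""
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative")
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    return tp / pos, tn / neg

def balanced_threshold(scores, labels):
    """Threshold whose sensitivity and specificity are closest to equal."""
    best = None
    for t in sorted(set(scores)):
        sens, spec = sens_spec(scores, labels, t)
        gap = abs(sens - spec)
        if best is None or gap < best[0]:
            best = (gap, t, sens, spec)
    return best[1], best[2], best[3]
```

    At the returned threshold the specificity (that is, 1-FPR) approximates the minimax accuracy discussed in the abstract; on finite data the two rates may not be exactly equal, so the closest pair is used.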

  4. Examining the significance of fingerprint-based classifiers

    Directory of Open Access Journals (Sweden)

    Collins Jack R

    2008-12-01

    Full Text Available Abstract Background: Experimental examinations of biofluids to measure concentrations of proteins or their fragments or metabolites are being explored as a means of early disease detection, distinguishing diseases with similar symptoms, and drug treatment efficacy. Many studies have produced classifiers with a high sensitivity and specificity, and it has been argued that accurate results necessarily imply some underlying biology-based features in the classifier. The simplest test of this conjecture is to examine datasets designed to contain no information with classifiers used in many published studies. Results: The classification accuracy of two fingerprint-based classifiers, a decision tree (DT) algorithm and a medoid classification algorithm (MCA), are examined. These methods are used to examine 30 artificial datasets that contain random concentration levels for 300 biomolecules. Each dataset contains between 30 and 300 Cases and Controls, and since the 300 observed concentrations are randomly generated, these datasets are constructed to contain no biological information. A modest search of decision trees containing at most seven decision nodes finds a large number of unique decision trees with an average sensitivity and specificity above 85% for datasets containing 60 Cases and 60 Controls or less, and for datasets with 90 Cases and 90 Controls many DTs have an average sensitivity and specificity above 80%. For even the largest dataset (300 Cases and 300 Controls) the MCA procedure finds several unique classifiers that have an average sensitivity and specificity above 88% using only six or seven features. Conclusion: While it has been argued that accurate classification results must imply some biological basis for the separation of Cases from Controls, our results show that this is not necessarily true. The DT and MCA classifiers are sufficiently flexible and can produce good results from datasets that are specifically constructed to contain no
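
    The effect described in this abstract, flexible classifiers fitting data that contain no information, can be illustrated on a much smaller scale. The sketch below is not the DT or MCA procedure from the paper: it searches single-feature threshold rules over purely random values and reports the best apparent (training) accuracy. The function name, the default sizes, and the seed are assumptions for illustration.

```python
import random

def best_single_feature_accuracy(n_per_class=10, n_features=300, seed=0):
    """Best training accuracy of any one-feature threshold rule on random data."""
    rng = random.Random(seed)
    labels = [1] * n_per_class + [0] * n_per_class  # Cases then Controls
    n = len(labels)
    best = 0.0
    for _ in range(n_features):
        # Random "concentrations" carry no information about the labels.
        values = [rng.random() for _ in range(n)]
        for t in values:
            correct = sum(1 for x, y in zip(values, labels) if (x >= t) == (y == 1))
            # Allow the rule in either direction (high means Case, or high means Control).
            accuracy = max(correct, n - correct) / n
            best = max(best, accuracy)
    return best
```

    Because the best rule is selected after looking at the labels, its accuracy exceeds chance even though no feature is informative, which is why such results need validation on data not used in the search.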

  5. Silicon nanowire arrays as learning chemical vapour classifiers

    Energy Technology Data Exchange (ETDEWEB)

    Niskanen, A O; Colli, A; White, R; Li, H W; Spigone, E; Kivioja, J M, E-mail: antti.niskanen@nokia.com [Nokia Research Center, Broers Building, 21 JJ Thomson Avenue, Cambridge CB3 0FA (United Kingdom)

    2011-07-22

    Nanowire field-effect transistors are a promising class of devices for various sensing applications. Apart from detecting individual chemical or biological analytes, it is especially interesting to use multiple selective sensors to look at their collective response in order to perform classification into predetermined categories. We show that non-functionalised silicon nanowire arrays can be used to robustly classify different chemical vapours using simple statistical machine learning methods. We were able to distinguish between acetone, ethanol and water with 100% accuracy while methanol, ethanol and 2-propanol were classified with 96% accuracy in ambient conditions.

  6. Learning Continuous Time Bayesian Network Classifiers Using MapReduce

    Directory of Open Access Journals (Sweden)

    Simone Villa

    2014-12-01

    Full Text Available Parameter and structural learning on continuous time Bayesian network classifiers are challenging tasks when you are dealing with big data. This paper describes an efficient scalable parallel algorithm for parameter and structural learning in the case of complete data using the MapReduce framework. Two popular instances of classifiers are analyzed, namely the continuous time naive Bayes and the continuous time tree augmented naive Bayes. Details of the proposed algorithm are presented using Hadoop, an open-source implementation of a distributed file system and the MapReduce framework for distributed data processing. Performance evaluation of the designed algorithm shows a robust parallel scaling.

  7. Face recognition using composite classifier with 2DPCA

    Science.gov (United States)

    Li, Jia; Yan, Ding

    2017-01-01

    In conventional face recognition, most researchers have focused on improving precision when the input data already belongs to the database. However, they have paid less attention to confirming whether the input data belongs to the database at all. This paper proposes an approach to face recognition using two-dimensional principal component analysis (2DPCA). It designs a novel composite classifier founded on statistical techniques. Moreover, this paper exploits the advantages of SVM and Logic Regression in the field of classification, which considerably improves accuracy. To test the performance of the composite classifier, experiments were carried out on the ORL and FERET databases, and the results are shown and evaluated.

  8. Classifying depth of anesthesia using EEG features, a comparison.

    Science.gov (United States)

    Esmaeili, Vahid; Shamsollahi, Mohammad Bagher; Arefian, Noor Mohammad; Assareh, Amin

    2007-01-01

    Various EEG features have been used in depth of anesthesia (DOA) studies. The objective of this study was to find the best features, or combination of them, that can discriminate between different anesthesia states. Conducting a clinical study on 22 patients, we could define 4 distinct anesthetic states: awake, moderate, general anesthesia, and isoelectric. We examined features that have been used in earlier studies using a single-channel EEG signal processing method. The maximum accuracy (99.02%) was achieved using approximate entropy as the feature. Some other features could discriminate a particular state of anesthesia well. We could completely classify the patterns by means of 3 features and a Bayesian classifier.
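
    Approximate entropy, the feature that gave the best accuracy here, measures how often patterns of length m that are similar remain similar at length m + 1. A minimal sketch of the standard definition follows; the embedding dimension `m`, the absolute tolerance `r`, and the function name are illustrative assumptions, since the abstract does not give the settings used in the study (in practice `r` is often set relative to the signal's standard deviation).

```python
import math

def approximate_entropy(signal, m=2, r=0.2):
    """Approximate entropy of a 1-D sequence with absolute tolerance r."""
    n = len(signal)
    if n <= m + 1:
        raise ValueError("signal too short for the chosen m")

    def phi(length):
        templates = [signal[i:i + length] for i in range(n - length + 1)]
        total = 0.0
        for a in templates:
            # Count templates within Chebyshev distance r (self-match included,
            # so the count is never zero and the logarithm is defined).
            matches = sum(
                1 for b in templates
                if max(abs(x - y) for x, y in zip(a, b)) <= r
            )
            total += math.log(matches / len(templates))
        return total / len(templates)

    return phi(m) - phi(m + 1)
```

    Regular signals give values near zero, while irregular signals give larger values, which is the property that makes the measure a candidate for separating anesthetic states.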

  9. Text Classification: Classifying Plain Source Files with Neural Network

    Directory of Open Access Journals (Sweden)

    Jaromir Veber

    2010-10-01

    Full Text Available Automated text file categorization has an important place in computer engineering, particularly in the process called data management automation. A lot has been written about text classification, and the methods allowing classification of these files are well known. Unfortunately, most studies are theoretical, and more research is needed for practical implementation. I decided to contribute research focused on creating a classifier for different kinds of programs (source files, scripts…). This paper describes a practical implementation of the classifier for text files depending on file content.

  10. Online classifier adaptation for cost-sensitive learning

    OpenAIRE

    Zhang, Junlin; Garcia, Jose

    2015-01-01

    In this paper, we propose the problem of online cost-sensitive classifier adaptation and the first algorithm to solve it. We assume we have a base classifier for a cost-sensitive classification problem, but it is trained with respect to a cost setting different to the desired one. Moreover, we also have some training data samples streaming to the algorithm one by one. The problem is to adapt the given base classifier to the desired cost setting using the streaming training samples online. ...

  11. Implications of physical symmetries in adaptive image classifiers

    DEFF Research Database (Denmark)

    Sams, Thomas; Hansen, Jonas Lundbek

    2000-01-01

    It is demonstrated that rotational invariance and reflection symmetry of image classifiers lead to a reduction in the number of free parameters in the classifier. When used in adaptive detectors, e.g. neural networks, this may be used to decrease the number of training samples necessary to learn a given classification task, or to improve generalization of the neural network. Notably, the symmetrization of the detector does not compromise the ability to distinguish objects that break the symmetry. (C) 2000 Elsevier Science Ltd. All rights reserved....

  12. Classifiers in Japanese-to-English Machine Translation

    CERN Document Server

    Bond, Francis; Ogura, Kentaro; Ikehara, Satoru

    1996-01-01

    This paper proposes an analysis of classifiers into four major types: UNIT, METRIC, GROUP and SPECIES, based on properties of both Japanese and English. The analysis makes possible a uniform and straightforward treatment of noun phrases headed by classifiers in Japanese-to-English machine translation, and has been implemented in the MT system ALT-J/E. Although the analysis is based on the characteristics of, and differences between, Japanese and English, it is shown to be also applicable to the unrelated language Thai.

  13. NClassG+: A classifier for non-classically secreted Gram-positive bacterial proteins

    Directory of Open Access Journals (Sweden)

    Pino Camilo

    2011-01-01

    Full Text Available Abstract Background: Most predictive methods currently available for the identification of protein secretion mechanisms have focused on classically secreted proteins. In fact, only two methods have been reported for predicting non-classically secreted proteins of Gram-positive bacteria. This study describes the implementation of a sequence-based classifier, denoted as NClassG+, for identifying non-classically secreted Gram-positive bacterial proteins. Results: Several feature-based classifiers were trained using different sequence transformation vectors (frequencies, dipeptides, physicochemical factors and PSSM) and Support Vector Machines (SVMs) with Linear, Polynomial and Gaussian kernel functions. Nested k-fold cross-validation (CV) was applied to select the best models, using the inner CV loop to tune the model parameters and the outer CV group to compute the error. The parameters and Kernel functions and the combinations between all possible feature vectors were optimized using grid search. Conclusions: The final model was tested against an independent set not previously seen by the model, obtaining better predictive performance compared to SecretomeP V2.0 and SecretP V2.0 for the identification of non-classically secreted proteins. NClassG+ is freely available on the web at http://www.biolisi.unal.edu.co/web-servers/nclassgpositive/

  14. Height of lumbar discs measured from radiographs compared with degeneration and height classified from MR images

    Energy Technology Data Exchange (ETDEWEB)

    Frobin, W.; Brinckmann, P. [Muenster Univ. (Germany). Inst. fuer Experimentelle Biomechanik; Kramer, M.; Hartwig, E. [Ulm Univ. (Germany). Sektion fuer Unfallchirurgische Forschung und Biomechanik

    2001-02-01

    The relation between height of lumbar discs (measured from lateral radiographic views) and disc degeneration (classified from MR images) deserves attention in view of the wide, often parallel or interchanged use of both methods. The time sequence of degenerative signs and decrease of disc height is controversial. To clarify the issue, this cross-sectional study documents the relation between disc degeneration and disc height in a selected cohort. Forty-three subjects were selected at random from a cohort examined for potential disc-related disease caused by long-term lifting and carrying. From each subject a lateral radiographic view of the lumbar spine as well as findings from an MR investigation of (in most cases) levels T12/L1 to L5/S1 were available; thus, n = 237 lumbar discs were available for measurement and classification. Disc height was measured from the radiographic views with a new protocol compensating for image distortion and permitting comparison with normal, age- and gender-appropriate disc height. Degeneration as well as disc height were classified twice from MR images by independent observers in a blinded fashion. Disc degeneration classified from MR images is not related to a measurable disc height loss in the first stage of degeneration, whereas progressive degeneration goes along with progressive loss of disc height, though with considerable interindividual variation. Loss of disc height classified from MR images is on average compatible with loss of disc height measured from radiographs. In individual discs, however, classification of height loss from MR images is imprecise. The first sign of disc degeneration (a moderate loss of nucleus signal) precedes disc height decrease. As degeneration progresses, disc height decreases. Disc height decrease and progress of degeneration, however, appear to be only loosely correlated. (orig.)

  15. CryoProtect: A Web Server for Classifying Antifreeze Proteins from Nonantifreeze Proteins

    Directory of Open Access Journals (Sweden)

    Reny Pratiwi

    2017-01-01

    Full Text Available Antifreeze protein (AFP) is an ice-binding protein that protects organisms from freezing in extremely cold environments. AFPs are found across a diverse range of species and, therefore, significantly differ in their structures. As there are no consensus sequences available for determining the ice-binding domain of AFPs, the prediction and characterization of AFPs from their sequence is a challenging task. This study addresses this issue by predicting AFPs directly from sequence on a large set of 478 AFPs and 9,139 non-AFPs using machine learning (e.g., random forest) as a function of interpretable features (e.g., amino acid composition, dipeptide composition, and physicochemical properties). Furthermore, AFPs were characterized using propensity scores and important physicochemical properties via statistical and principal component analysis. The predictive model afforded high performance with an accuracy of 88.28% and results revealed that AFPs are likely to be composed of hydrophobic amino acids as well as amino acids with hydroxyl and sulfhydryl side chains. The predictive model is provided as a free publicly available web server called CryoProtect for classifying a query protein sequence as being either AFP or non-AFP. The data set and source code for reproducing the results are provided on GitHub.
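
    Two of the interpretable features named above, amino acid composition and dipeptide composition, are simple frequency vectors over a protein sequence. The sketch below shows the usual definitions (20 residue fractions and 400 adjacent-pair fractions); it is not taken from the CryoProtect source code, and the function names and the choice to ignore non-standard residues are assumptions.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def amino_acid_composition(sequence):
    """Fraction of each standard amino acid in the sequence."""
    counts = {aa: 0 for aa in AMINO_ACIDS}
    for residue in sequence.upper():
        if residue in counts:  # non-standard symbols are ignored
            counts[residue] += 1
    total = sum(counts.values())
    return {aa: (counts[aa] / total if total else 0.0) for aa in AMINO_ACIDS}

def dipeptide_composition(sequence):
    """Fraction of each of the 400 adjacent residue pairs in the sequence."""
    sequence = sequence.upper()
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    for i in range(len(sequence) - 1):
        pair = sequence[i:i + 2]
        if pair in counts:
            counts[pair] += 1
    total = sum(counts.values())
    return {dp: (counts[dp] / total if total else 0.0) for dp in counts}
```

    Concatenating the two dictionaries' values in a fixed key order gives a 420-dimensional vector that a classifier such as a random forest can take as input.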

  16. Main: Sequences [KOME

    Lifescience Database Archive (English)

    Full Text Available Sequences Nucleotide Sequence Nucleotide sequence of full length cDNA (trimmed sequence) kome_ine_full_seq...uence_db.fasta.zip kome_ine_full_sequence_db.zip kome_ine_full_sequence_db ...

  17. Building an automated SOAP classifier for emergency department reports.

    Science.gov (United States)

    Mowery, Danielle; Wiebe, Janyce; Visweswaran, Shyam; Harkema, Henk; Chapman, Wendy W

    2012-02-01

    Information extraction applications that extract structured event and entity information from unstructured text can leverage knowledge of clinical report structure to improve performance. The Subjective, Objective, Assessment, Plan (SOAP) framework, used to structure progress notes to facilitate problem-specific, clinical decision making by physicians, is one example of a well-known, canonical structure in the medical domain. Although its applicability to structuring data is understood, its contribution to information extraction tasks has not yet been determined. The first step to evaluating the SOAP framework's usefulness for clinical information extraction is to apply the model to clinical narratives and develop an automated SOAP classifier that classifies sentences from clinical reports. In this quantitative study, we applied the SOAP framework to sentences from emergency department reports, and trained and evaluated SOAP classifiers built with various linguistic features. We found the SOAP framework can be applied manually to emergency department reports with high agreement (Cohen's kappa coefficients over 0.70). Using a variety of features, we found classifiers for each SOAP class can be created with moderate to outstanding performance with F(1) scores of 93.9 (subjective), 94.5 (objective), 75.7 (assessment), and 77.0 (plan). We look forward to expanding the framework and applying the SOAP classification to clinical information extraction tasks.
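
    The agreement figure quoted above is Cohen's kappa, which corrects the raw agreement between two annotators for the agreement expected by chance. A minimal sketch of the standard two-annotator formula follows; the function name and the example labels are illustrative and not drawn from the study's data.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    observed = sum(1 for a, b in zip(labels_a, labels_b) if a == b) / n
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    # Chance agreement: product of the annotators' marginal label frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1.0 - expected)
```

    A value of 1 means perfect agreement and 0 means agreement no better than chance, so coefficients over 0.70 indicate that the SOAP classes can be applied consistently by hand.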

  18. An ensemble self-training protein interaction article classifier.

    Science.gov (United States)

    Chen, Yifei; Hou, Ping; Manderick, Bernard

    2014-01-01

    Protein-protein interaction (PPI) is essential to understand the fundamental processes governing cell biology. The mining and curation of PPI knowledge are critical for analyzing proteomics data. Hence it is desirable to automatically classify articles as PPI-related or not. In order to build interaction article classification systems, an annotated corpus is needed. However, it is usually the case that only a small number of labeled articles can be obtained manually. Meanwhile, a large number of unlabeled articles are available. By combining ensemble learning and semi-supervised self-training, an ensemble self-training interaction classifier called EST_IACer is designed to classify PPI-related articles based on a small number of labeled articles and a large number of unlabeled articles. A biological background based feature weighting strategy is extended using the category information from both labeled and unlabeled data. Moreover, a heuristic constraint is put forward to select optimal instances from unlabeled data to improve the performance further. Experimental results show that the EST_IACer can classify the PPI related articles effectively and efficiently.

  19. A Multiple Classifier Fusion Algorithm Using Weighted Decision Templates

    Directory of Open Access Journals (Sweden)

    Aizhong Mi

    2016-01-01

    Full Text Available Fusing classifiers’ decisions can improve the performance of a pattern recognition system. Many application areas have adopted the methods of multiple classifier fusion to increase the classification accuracy in the recognition process. By fully considering the classifier performance differences and the training sample information, a multiple classifier fusion algorithm using weighted decision templates is proposed in this paper. The algorithm uses a statistical vector to measure the classifier’s performance and makes a weighted transform on each classifier according to the reliability of its output. To make a decision, the information in the training samples around an input sample is used by the k-nearest-neighbor rule if the algorithm evaluates the sample as being highly likely to be misclassified. An experimental comparison was performed on 15 data sets from the KDD’99, UCI, and ELENA databases. The experimental results indicate that the algorithm can achieve better classification performance. Next, the algorithm was applied to cataract grading in the cataract ultrasonic phacoemulsification operation. The application result indicates that the proposed algorithm is effective and can meet the practical requirements of the operation.

  20. Classifying aquatic macrophytes as indicators of eutrophication in European lakes

    NARCIS (Netherlands)

    Penning, W.E.; Mjelde, M.; Dudley, B.; Hellsten, S.; Hanganu, J.; Kolada, A.; van den Berg, Marcel S.; Poikane, S.; Phillips, G.; Willby, N.; Ecke, F.

    2008-01-01

    Aquatic macrophytes are one of the biological quality elements in the Water Framework Directive (WFD) for which status assessments must be defined. We tested two methods to classify macrophyte species and their response to eutrophication pressure: one based on percentiles of occurrence along a phosp

  1. Using predictive distributions to estimate uncertainty in classifying landmine targets

    Science.gov (United States)

    Close, Ryan; Watford, Ken; Glenn, Taylor; Gader, Paul; Wilson, Joseph

    2011-06-01

    Typical classification models used for detection of buried landmines estimate a singular discriminative output. This classification is based on a model or technique trained with a given set of training data available during system development. Regardless of how well the technique performs when classifying objects that are 'similar' to the training set, most models produce undesirable (and many times unpredictable) responses when presented with object classes different from the training data. This can cause mines or other explosive objects to be misclassified as clutter, or false alarms. Bayesian regression and classification models produce distributions as output, called the predictive distribution. This paper will discuss predictive distributions and their application to characterizing uncertainty in the classification decision, from the context of landmine detection. Specifically, experiments comparing the predictive variance produced by relevance vector machines and Gaussian processes will be described. We demonstrate that predictive variance can be used to determine the uncertainty of the model in classifying an object (i.e., the classifier will know when it's unable to reliably classify an object). The experimental results suggest that degenerate covariance models (such as the relevance vector machine) are not reliable in estimating the predictive variance. This necessitates the use of the Gaussian Process in creating the predictive distribution.

  2. Multiple classifier system for remote sensing image classification: a review.

    Science.gov (United States)

    Du, Peijun; Xia, Junshi; Zhang, Wei; Tan, Kun; Liu, Yi; Liu, Sicong

    2012-01-01

    Over the last two decades, multiple classifier system (MCS) or classifier ensemble has shown great potential to improve the accuracy and reliability of remote sensing image classification. Although there is a large body of literature covering MCS approaches, there is a lack of a comprehensive literature review which presents an overall architecture of the basic principles and trends behind the design of remote sensing classifier ensembles. Therefore, in order to give a reference point for MCS approaches, this paper attempts to explicitly review the remote sensing implementations of MCS and proposes some modified approaches. The effectiveness of existing and improved algorithms is analyzed and evaluated on multi-source remotely sensed images, including a high spatial resolution image (QuickBird), a hyperspectral image (OMISII) and a multi-spectral image (Landsat ETM+). Experimental results demonstrate that MCS can effectively improve the accuracy and stability of remote sensing image classification, and that diversity measures play an active role in the combination of multiple classifiers. Furthermore, this survey provides a roadmap to guide future research and algorithm enhancement and to facilitate knowledge accumulation on MCS in the remote sensing community.

  3. Subtractive fuzzy classifier based driver distraction levels classification using EEG.

    Science.gov (United States)

    Wali, Mousa Kadhim; Murugappan, Murugappan; Ahmad, Badlishah

    2013-09-01

    [Purpose] In earlier studies of driver distraction, researchers classified distraction into two levels (not distracted, and distracted). This study classified four levels of distraction (neutral, low, medium, high). [Subjects and Methods] Fifty Asian subjects (n=50, 43 males, 7 females), age range 20-35 years, who were free from any disease, participated in this study. Wireless EEG signals were recorded by 14 electrodes during four types of distraction stimuli (Global Positioning System (GPS), music player, short message service (SMS), and mental tasks). We derived the amplitude spectrum of three different frequency bands of the EEG: theta, alpha, and beta. Then, based on fusion of discrete wavelet packet transforms and fast Fourier transform yield, we extracted two features (power spectral density, spectral centroid frequency) of different wavelets (db4, db8, sym8, and coif5). Mean ± SD was calculated and analysis of variance (ANOVA) was performed. A fuzzy inference system classifier was applied to different wavelets using the two extracted features. [Results] The results indicate that the two features of sym8 possess highly significant discrimination across the four levels of distraction, and the best average accuracy achieved by the subtractive fuzzy classifier was 79.21% using the power spectral density feature extracted using the sym8 wavelet. [Conclusion] These findings suggest that EEG signals can be used to monitor distraction level intensity in order to alert drivers to high levels of distraction.

  4. Multiple Classifier System for Remote Sensing Image Classification: A Review

    Directory of Open Access Journals (Sweden)

    Yi Liu

    2012-04-01

    Full Text Available Over the last two decades, multiple classifier system (MCS) or classifier ensemble has shown great potential to improve the accuracy and reliability of remote sensing image classification. Although there is a large body of literature covering MCS approaches, there is a lack of a comprehensive literature review which presents an overall architecture of the basic principles and trends behind the design of remote sensing classifier ensembles. Therefore, in order to give a reference point for MCS approaches, this paper attempts to explicitly review the remote sensing implementations of MCS and proposes some modified approaches. The effectiveness of existing and improved algorithms is analyzed and evaluated on multi-source remotely sensed images, including a high spatial resolution image (QuickBird), a hyperspectral image (OMISII) and a multi-spectral image (Landsat ETM+). Experimental results demonstrate that MCS can effectively improve the accuracy and stability of remote sensing image classification, and that diversity measures play an active role in the combination of multiple classifiers. Furthermore, this survey provides a roadmap to guide future research and algorithm enhancement and to facilitate knowledge accumulation on MCS in the remote sensing community.

  5. Data Stream Classification Based on the Gamma Classifier

    Directory of Open Access Journals (Sweden)

    Abril Valeria Uriarte-Arcia

    2015-01-01

    Full Text Available The ever-increasing generation of data confronts us with the problem of handling massive amounts of information online. One of the biggest challenges is how to extract valuable information from these massive continuous data streams in a single scan. In a data stream context, data arrive continuously at high speed; therefore the algorithms developed to address this context must be efficient regarding memory and time management and capable of detecting changes over time in the underlying distribution that generated the data. This work describes a novel method for the task of pattern classification over a continuous data stream based on an associative model. The proposed method is based on the Gamma classifier, which is inspired by the Alpha-Beta associative memories; both are supervised pattern recognition models. The proposed method is capable of handling the space and time constraints inherent to data stream scenarios. The Data Streaming Gamma classifier (DS-Gamma classifier) implements a sliding window approach to provide concept drift detection and a forgetting mechanism. In order to test the classifier, several experiments were performed using different data stream scenarios with real and synthetic data streams. The experimental results show that the method exhibits competitive performance when compared to other state-of-the-art algorithms.

  6. Multiple-instance learning as a classifier combining problem

    DEFF Research Database (Denmark)

    Li, Yan; Tax, David M. J.; Duin, Robert P. W.

    2013-01-01

    In multiple-instance learning (MIL), an object is represented as a bag consisting of a set of feature vectors called instances. In the training set, the labels of bags are given, while the uncertainty comes from the unknown labels of instances in the bags. In this paper, we study MIL with the assumption that instances are drawn from a mixture distribution of the concept and the non-concept, which leads to a convenient way to solve MIL as a classifier combining problem. It is shown that instances can be classified with any standard supervised classifier by re-weighting the classification posteriors. Given the instance labels, the label of a bag can be obtained as a classifier combining problem. An optimal decision rule is derived that determines the threshold on the fraction of instances in a bag that is assigned to the concept class. We provide estimators for the two parameters in the model...
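
    The bag-level rule described in this abstract (label a bag by thresholding the fraction of its instances assigned to the concept class) can be illustrated with a minimal Python sketch. The function name `classify_bag` and both cutoff values are hypothetical placeholders chosen for illustration; the abstract does not give the optimal threshold or the estimators for the model parameters, so this is not the authors' derived rule.

```python
def classify_bag(instance_posteriors, instance_cutoff=0.5, fraction_threshold=0.3):
    """Label a bag positive when enough of its instances look like the concept.

    instance_posteriors: per-instance concept-class probabilities produced by
        any standard supervised classifier (assumed to be given).
    instance_cutoff: hypothetical probability above which an instance is
        assigned to the concept class.
    fraction_threshold: hypothetical threshold on the fraction of concept
        instances; the paper derives an optimal value, which is not used here.
    """
    if not instance_posteriors:
        raise ValueError("a bag must contain at least one instance")
    # Count instances assigned to the concept class.
    concept_count = sum(p >= instance_cutoff for p in instance_posteriors)
    fraction = concept_count / len(instance_posteriors)
    # The bag is positive when the concept fraction reaches the threshold.
    return fraction >= fraction_threshold


# Two of four instances exceed the cutoff, so the fraction is 0.5.
print(classify_bag([0.9, 0.1, 0.2, 0.8]))
# One of five instances exceeds the cutoff, so the fraction is 0.2.
print(classify_bag([0.1, 0.2, 0.6, 0.1, 0.1]))
```

    Keeping the instance cutoff and the bag-level fraction threshold as separate parameters mirrors the two-stage structure in the abstract: instances are classified first, and the bag label is then obtained by combining those instance decisions.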

  7. Gene-expression Classifier in Papillary Thyroid Carcinoma

    DEFF Research Database (Denmark)

    Londero, Stefano Christian; Jespersen, Marie Louise; Krogdahl, Annelise;

    2016-01-01

    BACKGROUND: No reliable biomarker for metastatic potential in the risk stratification of papillary thyroid carcinoma exists. We aimed to develop a gene-expression classifier for metastatic potential. MATERIALS AND METHODS: Genome-wide expression analyses were used. Development cohort: freshly...

  8. Scoring and Classifying Examinees Using Measurement Decision Theory

    Science.gov (United States)

    Rudner, Lawrence M.

    2009-01-01

    This paper describes and evaluates the use of measurement decision theory (MDT) to classify examinees based on their item response patterns. The model has a simple framework that starts with the conditional probabilities of examinees in each category or mastery state responding correctly to each item. The presented evaluation investigates: (1) the…

  9. Localizing genes to cerebellar layers by classifying ISH images.

    Directory of Open Access Journals (Sweden)

    Lior Kirsch

    Full Text Available Gene expression controls how the brain develops and functions. Understanding control processes in the brain is particularly hard since they involve numerous types of neurons and glia, and very little is known about which genes are expressed in which cells and brain layers. Here we describe an approach to detect genes whose expression is primarily localized to a specific brain layer and apply it to the mouse cerebellum. We learn typical spatial patterns of expression from a few markers that are known to be localized to specific layers, and use these patterns to predict localization for new genes. We analyze images of in-situ hybridization (ISH) experiments, which we represent using histograms of local binary patterns (LBP), and train image classifiers and gene classifiers for four layers of the cerebellum: the Purkinje, granular, molecular and white matter layer. On held-out data, the layer classifiers achieve accuracy above 94% (AUC) by representing each image at multiple scales and by combining multiple image scores into a single gene-level decision. When applied to the full mouse genome, the classifiers predict specific layer localization for hundreds of new genes in the Purkinje and granular layers. Many genes localized to the Purkinje layer are likely to be expressed in astrocytes, and many others are involved in lipid metabolism, possibly due to the unusual size of Purkinje cells.

  10. Weighted Hybrid Decision Tree Model for Random Forest Classifier

    Science.gov (United States)

    Kulkarni, Vrushali Y.; Sinha, Pradeep K.; Petare, Manisha C.

    2016-06-01

    Random Forest is an ensemble, supervised machine learning algorithm. An ensemble generates many classifiers and combines their results by majority voting. Random Forest uses decision trees as base classifiers. In decision tree induction, an attribute split/evaluation measure is used to decide the best split at each node of the decision tree. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation among them. The work presented in this paper is related to attribute split measures and is a two-step process. First, a theoretical study of five selected split measures is done and a comparison matrix is generated to understand the pros and cons of each measure. These theoretical results are verified by empirical analysis, in which a random forest is generated using each of the five selected split measures, one at a time (i.e., random forest using information gain, random forest using gain ratio, etc.). Next, based on this theoretical and empirical analysis, a new hybrid decision tree model for the random forest classifier is proposed. In this model, the individual decision trees in the Random Forest are generated using different split measures. The model is augmented by weighted voting based on the strength of each individual tree. The new approach has shown a notable increase in the accuracy of random forest.
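
    The difference between plain majority voting and the weighted voting mentioned in this abstract can be shown with a short Python sketch. The function `weighted_vote` and the example weights are hypothetical; the abstract does not say how tree strength is measured, so the weights here simply stand in for some per-tree quality score (for instance, a validation accuracy).

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Combine per-tree class predictions using one weight per tree.

    predictions: class label predicted by each tree for a single sample.
    weights: hypothetical per-tree strengths; equal weights reduce this
        to ordinary majority voting.
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per prediction")
    scores = defaultdict(float)
    # Accumulate each tree's weight on the label it voted for.
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    # Iterate labels in sorted order so ties resolve deterministically.
    return max(sorted(scores), key=scores.get)


tree_predictions = ["spam", "ham", "ham"]

# Equal weights: two of three trees vote "ham".
print(weighted_vote(tree_predictions, [1.0, 1.0, 1.0]))
# One strong tree (0.9) outweighs two weak ones (0.3 + 0.4 = 0.7).
print(weighted_vote(tree_predictions, [0.9, 0.3, 0.4]))
```

    With unequal weights a single strong tree can overrule a majority of weaker trees, which is the behavior that distinguishes weighted voting from the majority voting used by a standard random forest.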

  11. Automatic Classification of Cetacean Vocalizations Using an Aural Classifier

    Science.gov (United States)

    2013-09-30

    were inspired by research directed at discriminating the timbre of different musical instruments – a passive classification problem – which suggests...the method should be able to classify marine mammal vocalizations since these calls possess many of the acoustic attributes of music. APPROACH

  12. 18 CFR 367.18 - Criteria for classifying leases.

    Science.gov (United States)

    2010-04-01

    ... classification of the lease under the criteria in paragraph (a) of this section had the changed terms been in... the lessee) must not give rise to a new classification of a lease for accounting purposes. ... classifying leases. 367.18 Section 367.18 Conservation of Power and Water Resources FEDERAL ENERGY...

  13. Discrimination-Aware Classifiers for Student Performance Prediction

    Science.gov (United States)

    Luo, Ling; Koprinska, Irena; Liu, Wei

    2015-01-01

    In this paper we consider discrimination-aware classification of educational data. Mining and using rules that distinguish groups of students based on sensitive attributes such as gender and nationality may lead to discrimination. It is desirable to keep the sensitive attributes during the training of a classifier to avoid information loss but…

  14. 18 CFR 3a.12 - Authority to classify official information.

    Science.gov (United States)

    2010-04-01

    ... 18 Conservation of Power and Water Resources 1 2010-04-01 2010-04-01 false Authority to classify official information. 3a.12 Section 3a.12 Conservation of Power and Water Resources FEDERAL ENERGY REGULATORY COMMISSION, DEPARTMENT OF ENERGY GENERAL RULES NATIONAL SECURITY INFORMATION Classification §...

  15. 18 CFR 3a.71 - Accountability for classified material.

    Science.gov (United States)

    2010-04-01

    ... 18 Conservation of Power and Water Resources 1 2010-04-01 2010-04-01 false Accountability for classified material. 3a.71 Section 3a.71 Conservation of Power and Water Resources FEDERAL ENERGY REGULATORY COMMISSION, DEPARTMENT OF ENERGY GENERAL RULES NATIONAL SECURITY INFORMATION Accountability for...

  16. Recognition of Arabic Sign Language Alphabet Using Polynomial Classifiers

    Directory of Open Access Journals (Sweden)

    M. Al-Rousan

    2005-08-01

    Full Text Available Building an accurate automatic sign language recognition system is of great importance in facilitating efficient communication with deaf people. In this paper, we propose the use of polynomial classifiers as a classification engine for the recognition of the Arabic sign language (ArSL) alphabet. Polynomial classifiers have several advantages over other classifiers in that they do not require iterative training, and that they are highly computationally scalable with the number of classes. Based on polynomial classifiers, we have built an ArSL system and measured its performance using real ArSL data collected from deaf people. We show that the proposed system provides superior recognition results when compared with previously published results using ANFIS-based classification on the same dataset and feature extraction methodology. The comparison is shown in terms of the number of misclassified test patterns. The reduction in the rate of misclassified patterns was very significant. In particular, we have achieved a 36% reduction of misclassifications on the training data and 57% on the test data.

  17. Weakly supervised learning of a classifier for unusual event detection.

    Science.gov (United States)

    Jäger, Mark; Knoll, Christian; Hamprecht, Fred A

    2008-09-01

    In this paper, we present an automatic classification framework combining appearance based features and hidden Markov models (HMM) to detect unusual events in image sequences. One characteristic of the classification task is that anomalies are rare. This reflects the situation in the quality control of industrial processes, where error events are scarce by nature. As an additional restriction, class labels are only available for the complete image sequence, since frame-wise manual scanning of the recorded sequences for anomalies is too expensive and should, therefore, be avoided. The proposed framework reduces the feature space dimension of the image sequences by employing subspace methods and encodes characteristic temporal dynamics using continuous hidden Markov models (CHMMs). The applied learning procedure is as follows. 1) A generative model for the regular sequences is trained (one-class learning). 2) The regular sequence model (RSM) is used to locate potentially unusual segments within error sequences by means of a change detection algorithm (outlier detection). 3) Unusual segments are used to expand the RSM to an error sequence model (ESM). The complexity of the ESM is controlled by means of the Bayesian Information Criterion (BIC). The likelihood ratio of the data given the ESM and the RSM is used for the classification decision. This ratio is close to one for sequences without error events and increases for sequences containing error events. Experimental results are presented for image sequences recorded from industrial laser welding processes. We demonstrate that the learning procedure can significantly reduce the user interaction and that sequences with error events can be found with a small false positive rate. It has also been shown that a modeling of the temporal dynamics is necessary to reach these low error rates.
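
    The final classification step described in this abstract (comparing the likelihood of a sequence under the error sequence model and the regular sequence model) can be sketched in Python. The function `is_error_sequence`, the log-likelihood inputs, and the ratio threshold are all hypothetical; the abstract does not state a threshold, and the HMM training that would produce the log-likelihoods is assumed to have been done elsewhere.

```python
import math


def is_error_sequence(loglik_esm, loglik_rsm, ratio_threshold=2.0):
    """Flag a sequence when the ESM explains it much better than the RSM.

    loglik_esm: log-likelihood of the sequence under the error sequence model.
    loglik_rsm: log-likelihood of the sequence under the regular sequence model.
    ratio_threshold: hypothetical cutoff on the likelihood ratio; a ratio
        close to one indicates a sequence without error events.
    """
    # Work in log space: log(L_esm / L_rsm) = loglik_esm - loglik_rsm.
    log_ratio = loglik_esm - loglik_rsm
    return log_ratio > math.log(ratio_threshold)


# Both models explain the sequence equally well: ratio is 1, not flagged.
print(is_error_sequence(-100.0, -100.0))
# The ESM is far more likely than the RSM: ratio is exp(5), flagged.
print(is_error_sequence(-95.0, -100.0))
```

    Comparing log-likelihoods rather than raw likelihoods avoids numerical underflow, since the likelihood of a long image sequence under an HMM is typically an extremely small number.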

  18. Bayesian network classifiers for categorizing cortical GABAergic interneurons.

    Science.gov (United States)

    Mihaljević, Bojan; Benavides-Piccione, Ruth; Bielza, Concha; DeFelipe, Javier; Larrañaga, Pedro

    2015-04-01

    An accepted classification of GABAergic interneurons of the cerebral cortex is a major goal in neuroscience. A recently proposed taxonomy based on patterns of axonal arborization promises to be a pragmatic method for achieving this goal. It involves characterizing interneurons according to five axonal arborization features, called F1-F5, and classifying them into a set of predefined types, most of which are established in the literature. Unfortunately, there is little consensus among expert neuroscientists regarding the morphological definitions of some of the proposed types. While supervised classifiers were able to categorize the interneurons in accordance with experts' assignments, their accuracy was limited because they were trained with disputed labels. Thus, here we automatically classify interneuron subsets with different label reliability thresholds (i.e., such that every cell's label is backed by at least a certain (threshold) number of experts). We quantify the cells with parameters of axonal and dendritic morphologies and, in order to predict the type, also with axonal features F1-F4 provided by the experts. Using Bayesian network classifiers, we accurately characterize and classify the interneurons and identify useful predictor variables. In particular, we discriminate among reliable examples of common basket, horse-tail, large basket, and Martinotti cells with up to 89.52% accuracy, and single out the number of branches at 180 μm from the soma, the convex hull 2D area, and the axonal features F1-F4 as especially useful predictors for distinguishing among these types. These results open up new possibilities for an objective and pragmatic classification of interneurons.

  19. Enhancing atlas based segmentation with multiclass linear classifiers

    Energy Technology Data Exchange (ETDEWEB)

    Sdika, Michaël, E-mail: michael.sdika@creatis.insa-lyon.fr [Université de Lyon, CREATIS, CNRS UMR 5220, Inserm U1044, INSA-Lyon, Université Lyon 1, Villeurbanne 69300 (France)

    2015-12-15

    Purpose: To present a method to enrich atlases for atlas based segmentation. Such enriched atlases can then be used as a single atlas or within a multiatlas framework. Methods: In this paper, machine learning techniques have been used to enhance the atlas based segmentation approach. The enhanced atlas defined in this work is a pair composed of a gray level image alongside an image of multiclass classifiers with one classifier per voxel. Each classifier embeds local information from the whole training dataset that allows for the correction of some systematic errors in the segmentation and accounts for the possible local registration errors. The authors also propose to use these images of classifiers within a multiatlas framework: results produced by a set of such local classifier atlases can be combined using a label fusion method. Results: Experiments have been made on the in vivo images of the IBSR dataset and a comparison has been made with several state-of-the-art methods such as FreeSurfer and the multiatlas nonlocal patch based method of Coupé or Rousseau. These experiments show that their method is competitive with state-of-the-art methods while having a low computational cost. Further enhancement has also been obtained with a multiatlas version of their method. It is also shown that, in this case, nonlocal fusion is unnecessary. The multiatlas fusion can therefore be done efficiently. Conclusions: The single atlas version has similar quality to state-of-the-art multiatlas methods but with the computational cost of a naive single atlas segmentation. The multiatlas version offers an improvement in quality and can be done efficiently without a nonlocal strategy.

  20. Improving the chances of successful protein structure determination with a random forest classifier

    Energy Technology Data Exchange (ETDEWEB)

    Jahandideh, Samad [Sanford-Burnham Medical Research Institute, 10901 North Torrey Pines Road, La Jolla, CA 92307 (United States); Joint Center for Structural Genomics, (United States); Jaroszewski, Lukasz; Godzik, Adam, E-mail: adam@burnham.org [Sanford-Burnham Medical Research Institute, 10901 North Torrey Pines Road, La Jolla, CA 92307 (United States); Joint Center for Structural Genomics, (United States); University of California, San Diego, La Jolla, California (United States)

    2014-03-01

    Using an extended set of protein features calculated separately for protein surface and interior, a new version of XtalPred based on a random forest classifier achieves a significant improvement in predicting the success of structure determination from the primary amino-acid sequence. Obtaining diffraction quality crystals remains one of the major bottlenecks in structural biology. The ability to predict the chances of crystallization from the amino-acid sequence of the protein can, at least partly, address this problem by allowing a crystallographer to select homologs that are more likely to succeed and/or to modify the sequence of the target to avoid features that are detrimental to successful crystallization. In 2007, the now widely used XtalPred algorithm [Slabinski et al. (2007), Protein Sci. 16, 2472–2482] was developed. XtalPred classifies proteins into five ‘crystallization classes’ based on a simple statistical analysis of the physicochemical features of a protein. Here, towards the same goal, advanced machine-learning methods are applied and, in addition, the predictive potential of additional protein features such as predicted surface ruggedness, hydrophobicity, side-chain entropy of surface residues and amino-acid composition of the predicted protein surface are tested. The new XtalPred-RF (random forest) achieves significant improvement of the prediction of crystallization success over the original XtalPred. To illustrate this, XtalPred-RF was tested by revisiting target selection from 271 Pfam families targeted by the Joint Center for Structural Genomics (JCSG) in PSI-2, and it was estimated that the number of targets entered into the protein-production and crystallization pipeline could have been reduced by 30% without lowering the number of families for which the first structures were solved. The prediction improvement depends on the subset of targets used as a testing set and reaches 100% (i.e. twofold) for the top class of predicted

  1. Classification of THz pulse signals using two-dimensional cross-correlation feature extraction and non-linear classifiers.

    Science.gov (United States)

    Siuly; Yin, Xiaoxia; Hadjiloucas, Sillas; Zhang, Yanchun

    2016-04-01

    This work provides a performance comparison of four different machine learning classifiers: multinomial logistic regression with ridge estimators (MLR), k-nearest neighbours (KNN), support vector machine (SVM) and naïve Bayes (NB), as applied to terahertz (THz) transient time domain sequences associated with pixelated images of different powder samples. Although the six substances considered have similar optical properties, their complex insertion loss at the THz part of the spectrum is significantly different because of differences in their frequency-dependent THz extinction coefficient as well as in their refractive index and scattering properties. As scattering can be unquantifiable in many spectroscopic experiments, classification solely on differences in complex insertion loss can be inconclusive. The problem is addressed using two-dimensional (2-D) cross-correlations between background and sample interferograms; these ensure good noise suppression of the datasets and provide a range of statistical features that are subsequently used as inputs to the above classifiers. A cross-validation procedure is adopted to assess the performance of the classifiers. First, the measurements related to samples with a thickness of 2 mm were classified, then samples with thicknesses of 4 mm and, after that, 3 mm were classified, and the success rate and consistency of each classifier were recorded. In addition, mixtures having thicknesses of 2 and 4 mm as well as mixtures of 2, 3 and 4 mm were presented simultaneously to all classifiers. This approach provided further cross-validation of the classification consistency of each algorithm. The results confirm the superiority in classification accuracy and robustness of the MLR (least accuracy 88.24%) and KNN (least accuracy 90.19%) algorithms, which consistently outperformed the SVM (least accuracy 74.51%) and NB (least accuracy 56.86%) classifiers for the same number of feature vectors across all studies

  2. Predicting Contextual Sequences via Submodular Function Maximization

    CERN Document Server

    Dey, Debadeepta; Hebert, Martial; Bagnell, J Andrew

    2012-01-01

    Sequence optimization, where the items in a list are ordered to maximize some reward, has many applications such as web advertisement placement, search, and control libraries in robotics. Previous work in sequence optimization produces a static ordering that does not take any features of the item or context of the problem into account. In this work, we propose a general approach to order the items within the sequence based on the context (e.g., perceptual information, environment description, and goals). We take a simple, efficient, reduction-based approach where the choice and order of the items are established by repeatedly learning simple classifiers or regressors for each "slot" in the sequence. Our approach leverages recent work on submodular function maximization to provide a formal regret reduction from submodular sequence optimization to simple cost-sensitive prediction. We apply our contextual sequence prediction algorithm to optimize control libraries and demonstrate results on two robotics problems: ...

  3. Unascertained measurement classifying model of goaf collapse prediction

    Institute of Scientific and Technical Information of China (English)

    DONG Long-jun; PENG Gang-jian; FU Yu-hua; BAI Yun-fei; LIU You-fang

    2008-01-01

    Based on an optimized forecast method of unascertained classifying, an unascertained measurement classifying model (UMC) to predict mining-induced goaf collapse was established. The discriminating factors of the model are influential factors including overburden layer type, overburden layer thickness, the degree of complexity of the geologic structure, the inclination angle of the coal bed, the volume rate of the cavity region, the vertical goaf depth from the surface and the space superposition layer of the goaf region. The unascertained measurement (UM) function of each factor was calculated. The unascertained measurement indicating the classification center and the grade of each sample awaiting forecast was determined by the UM distance between the synthesis index of the samples awaiting forecast and the index of every classification. The training samples were tested by the established model, and the correct rate is 100%. Furthermore, the seven samples awaiting forecast were predicted by the UMC model. The results show that the forecast results are fully consistent with the actual situation.

  4. Security Enrichment in Intrusion Detection System Using Classifier Ensemble

    Directory of Open Access Journals (Sweden)

    Uma R. Salunkhe

    2017-01-01

    Full Text Available In the era of the Internet, with an increasing number of people as its end users, a large number of attack categories are introduced daily. Hence, effective detection of various attacks with the help of Intrusion Detection Systems is an emerging trend in research these days. Existing studies show the effectiveness of machine learning approaches in handling Intrusion Detection Systems. In this work, we aim to enhance the detection rate of an Intrusion Detection System by using machine learning techniques. We propose a novel classifier ensemble based IDS that is constructed using a hybrid approach which combines data level and feature level approaches. Classifier ensembles combine the opinions of different experts and improve the intrusion detection rate. Experimental results show the improved detection rates of our system compared to the reference technique.

  5. MAMMOGRAMS ANALYSIS USING SVM CLASSIFIER IN COMBINED TRANSFORMS DOMAIN

    Directory of Open Access Journals (Sweden)

    B.N. Prathibha

    2011-02-01

    Breast cancer is a primary cause of mortality and morbidity in women. Reports reveal that the earlier abnormalities are detected, the better the chance of survival. Digital mammograms are one of the most effective means of detecting possible breast anomalies at early stages. Digital mammograms supported by Computer Aided Diagnostic (CAD) systems help radiologists make reliable decisions. The proposed CAD system extracts wavelet features and spectral features for better classification of mammograms. A Support Vector Machine (SVM) classifier is used to analyze 206 mammogram images from the MIAS database with respect to the severity of abnormality, i.e., benign or malignant. The proposed system gives 93.14% accuracy for discriminating between normal and malignant samples, 87.25% accuracy for normal and benign samples, and 89.22% accuracy for benign and malignant samples. The study reveals that features extracted in the hybrid transform domain, combined with an SVM classifier, are a promising tool for the analysis of mammograms.

  6. Using Syntactic-Based Kernels for Classifying Temporal Relations

    Institute of Scientific and Technical Information of China (English)

    Seyed Abolghasem Mirroshandel; Gholamreza Ghassem-Sani; Mahdy Khayyamian

    2011-01-01

    Temporal relation classification is one of the demanding contemporary tasks of natural language processing. This task can be used in various applications such as question answering, summarization, and language-specific information retrieval. In this paper, we propose an improved algorithm for classifying temporal relations, between events or between events and time, using support vector machines (SVM). Along with gold-standard corpus features, the proposed method aims at exploiting some useful automatically generated syntactic features to improve the accuracy of classification. Accordingly, a number of novel kernel functions are introduced and evaluated. Our evaluations clearly demonstrate that adding syntactic features results in a considerable improvement over the state-of-the-art method of classifying temporal relations.

  7. Feasibility study for banking loan using association rule mining classifier

    Directory of Open Access Journals (Sweden)

    Agus Sasmito Aribowo

    2015-03-01

    The problem of bad loans in a cooperative (koperasi) can be reduced if the cooperative can detect whether a member will be able to repay the debt or not. The method used in this study to identify characteristic patterns of prospective borrowers is called the Association Rule Mining Classifier. Patterns of credit members are converted into knowledge and used to classify other applicants. The classification process separates applicants into two groups: good credit and bad credit. The research used prototyping to implement the design as an application using a programming language and development tool. The association rule mining process uses the Weighted Itemset Tidset (WIT-tree) method. The results show that the method can predict the credit of prospective customers. The training data set used 120 customers whose credit history was already known. The test data used 61 customers who applied for credit. The results concluded that 42 customers would pay off their loans and 19 would not.

  8. Scoring and Classifying Examinees Using Measurement Decision Theory

    Directory of Open Access Journals (Sweden)

    Lawrence M. Rudner

    2009-04-01

    This paper describes and evaluates the use of measurement decision theory (MDT) to classify examinees based on their item response patterns. The model has a simple framework that starts with the conditional probabilities of examinees in each category or mastery state responding correctly to each item. The presented evaluation investigates: (1) the classification accuracy of tests scored using decision theory; (2) the effectiveness of different sequential testing procedures; and (3) the number of items needed to make a classification. A large percentage of examinees can be classified accurately with very few items using decision theory. A Java applet for self-instruction and software for generating, calibrating, and scoring MDT data are provided.
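    The framework described above, which starts from the per-state probabilities of answering each item correctly, can be sketched as a direct application of Bayes' rule. The sketch below assumes that item responses are conditionally independent given the mastery state; the priors, probabilities, and the function name `mdt_posterior` are hypothetical illustration values, not taken from the paper.

```python
from math import prod

def mdt_posterior(priors, p_correct, responses):
    """Posterior probability of each mastery state given scored responses.

    priors[s]       -- prior probability of state s
    p_correct[s][i] -- P(correct answer on item i | state s)
    responses[i]    -- 1 if item i was answered correctly, else 0
    Assumption: responses are independent given the state.
    """
    joint = []
    for prior, item_probs in zip(priors, p_correct):
        likelihood = prod(p if r else 1.0 - p
                          for p, r in zip(item_probs, responses))
        joint.append(prior * likelihood)
    total = sum(joint)
    return [j / total for j in joint]

# Hypothetical two-state example: state 0 = non-master, state 1 = master.
priors = [0.5, 0.5]
p_correct = [[0.3, 0.4, 0.2],   # non-masters
             [0.8, 0.9, 0.7]]   # masters
posterior = mdt_posterior(priors, p_correct, [1, 1, 0])
best_state = max(range(len(posterior)), key=posterior.__getitem__)
print(posterior, best_state)
```

    Choosing the state with the largest posterior is one possible decision rule; a cost-weighted rule could be substituted without changing the posterior computation.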

  9. The fuzzy gene filter: A classifier performance assesment

    CERN Document Server

    Perez, Meir

    2011-01-01

    The Fuzzy Gene Filter (FGF) is an optimised Fuzzy Inference System designed to rank genes in order of differential expression, based on expression data generated in a microarray experiment. This paper examines the effectiveness of the FGF for feature selection using various classification architectures. The FGF is compared to three of the most common gene ranking algorithms: t-test, Wilcoxon test and ROC curve analysis. Four classification schemes are used to compare the performance of the FGF vis-a-vis the standard approaches: K Nearest Neighbour (KNN), Support Vector Machine (SVM), Naive Bayesian Classifier (NBC) and Artificial Neural Network (ANN). A nested stratified Leave-One-Out Cross Validation scheme is used to identify the optimal number of top-ranking genes, as well as the optimal classifier parameters. Two microarray data sets are used for the comparison: a prostate cancer data set and a lymphoma data set.

  10. Face Detection Using Adaboosted SVM-Based Component Classifier

    CERN Document Server

    Valiollahzadeh, Seyyed Majid; Nazari, Mohammad

    2008-01-01

    Recently, Adaboost has been widely used to improve the accuracy of any given learning algorithm. In this paper we focus on designing an algorithm that combines Adaboost with Support Vector Machines (SVM) as weak component classifiers for the face detection task. To obtain a set of effective SVM weak-learner classifiers, this algorithm adaptively adjusts the kernel parameter in the SVM instead of using a fixed one. The proposed combination generalizes better than SVM on imbalanced classification problems. The proposed method is compared, in terms of classification accuracy, to other commonly used Adaboost methods, such as Decision Trees and Neural Networks, on the CMU+MIT face database. Results indicate that the performance of the proposed method is overall superior to previous Adaboost approaches.

  11. Deep Feature Learning and Cascaded Classifier for Large Scale Data

    DEFF Research Database (Denmark)

    Prasoon, Adhish

    from data rather than having a predefined feature set. We explore deep learning approach of convolutional neural network (CNN) for segmenting three dimensional medical images. We propose a novel system integrating three 2D CNNs, which have a one-to-one association with the xy, yz and zx planes of 3D......This thesis focuses on voxel/pixel classification based approaches for image segmentation. The main application is segmentation of articular cartilage in knee MRIs. The first major contribution of the thesis deals with large scale machine learning problems. Many medical imaging problems need huge...... amount of training data to cover sufficient biological variability. Learning methods scaling badly with number of training data points cannot be used in such scenarios. This may restrict the usage of many powerful classifiers having excellent generalization ability. We propose a cascaded classifier which...

  12. Deep Feature Learning and Cascaded Classifier for Large Scale Data

    DEFF Research Database (Denmark)

    Prasoon, Adhish

    This thesis focuses on voxel/pixel classification based approaches for image segmentation. The main application is segmentation of articular cartilage in knee MRIs. The first major contribution of the thesis deals with large scale machine learning problems. Many medical imaging problems need huge...... to a state-of-the-art method for cartilage segmentation using one stage nearest neighbour classifier. Our method achieved better results than the state-of-the-art method for tibial as well as femoral cartilage segmentation. The next main contribution of the thesis deals with learning features autonomously...... image, respectively and this system is referred as triplanar convolutional neural network in the thesis. We applied the triplanar CNN for segmenting articular cartilage in knee MRI and compared its performance with the same state-of-the-art method which was used as a benchmark for cascaded classifier...

  13. Logarithmic Spiral-based Construction of RBF Classifiers

    Directory of Open Access Journals (Sweden)

    Mohamed Wajih Guerfala

    2017-02-01

    Clustering is defined as grouping similar objects together into homogeneous groups or clusters. Objects that belong to one cluster should be very similar to each other, while objects in different clusters should be dissimilar. It aims to simplify the representation of the initial data. Automatic classification covers all the methods that allow the automatic construction of such groups. This paper describes the design of radial basis function (RBF) neural classifiers using a new algorithm for characterizing the hidden layer structure. This algorithm, called k-means Mahalanobis distance, groups the training data class by class in order to calculate the optimal number of clusters of the hidden layer, using two validity indexes. To initialize the clusters of the k-means algorithm, the logarithmic spiral golden angle method is used. Two real data sets (Iris and Wine) are considered to demonstrate the efficiency of the proposed approach, and the obtained results are compared with basic classifiers from the literature.

  14. Evaluation of LDA Ensembles Classifiers for Brain Computer Interface

    Science.gov (United States)

    Arjona, Cristian; Pentácolo, José; Gareis, Iván; Atum, Yanina; Gentiletti, Gerardo; Acevedo, Rubén; Rufiner, Leonardo

    2011-12-01

    The Brain Computer Interface (BCI) translates brain activity into computer commands. To increase BCI performance in decoding user intentions, it is necessary to improve the feature extraction and classification techniques. In this article the performance of an ensemble of three linear discriminant analysis (LDA) classifiers is studied. An ensemble-based system can theoretically achieve better classification results than its individual counterpart, depending on the algorithm used to generate the individual classifiers and the procedure used to combine their outputs. Classic ensemble algorithms such as bagging and boosting are discussed here. For the BCI application, it was concluded that the results obtained using ER and AUC as performance indices do not give enough information to establish which configuration is better.

  15. Dendritic spine detection using curvilinear structure detector and LDA classifier.

    Science.gov (United States)

    Zhang, Yong; Zhou, Xiaobo; Witt, Rochelle M; Sabatini, Bernardo L; Adjeroh, Donald; Wong, Stephen T C

    2007-06-01

    Dendritic spines are small, bulbous cellular compartments that carry synapses. Biologists have been studying the biochemical pathways by examining the morphological and statistical changes of the dendritic spines at the intracellular level. In this paper a novel approach is presented for automated detection of dendritic spines in neuron images. The dendritic spines are recognized as small objects of variable shape, attached to or detached from multiple dendritic backbones, in the 2D projection of the image stack along the optical direction. We extend the curvilinear structure detector to extract the boundaries as well as the centerlines of the dendritic backbones and spines. We further build a classifier using Linear Discriminant Analysis (LDA) to classify the attached spines into valid and invalid types to improve the accuracy of the spine detection. We evaluate the proposed approach by comparing it with manual results in terms of backbone length, spine number, spine length, and spine density.

  16. Neural Networks Classifier for Data Selection in Statistical Machine Translation

    OpenAIRE

    Peris, Álvaro; Chinea-Rios, Mara; Casacuberta, Francisco

    2016-01-01

    We address the data selection problem in statistical machine translation (SMT) as a classification task. The new data selection method is based on a neural network classifier. We present a new method description and empirical results proving that our data selection method provides better translation quality, compared to a state-of-the-art method (i.e., Cross entropy). Moreover, the empirical results reported are coherent across different language pairs.

  17. Classifying Floating Potential Measurement Unit Data Products as Science Data

    Science.gov (United States)

    Coffey, Victoria; Minow, Joseph

    2015-01-01

    We are Co-Investigators for the Floating Potential Measurement Unit (FPMU) on the International Space Station (ISS) and members of the FPMU operations and data analysis team. We are providing this memo for the purpose of classifying raw and processed FPMU data products and ancillary data as NASA science data with unrestricted, public availability in order to best support science uses of the data.

  18. Mathematical Modeling and Analysis of Classified Marketing of Agricultural Products

    Institute of Scientific and Technical Information of China (English)

    Fengying WANG

    2014-01-01

    Classified marketing of agricultural products was analyzed using the Logistic Regression Model. This method can take full advantage of the information in an agricultural product database to find the factors that influence how well agricultural products sell, and to analyze them quantitatively. The model can also be used to predict sales of agricultural products and to provide a reference for mapping out individualized sales strategies for popularizing agricultural products.

  19. [A New HAC Unsupervised Classifier Based on Spectral Harmonic Analysis].

    Science.gov (United States)

    Yang, Ke-ming; Wei, Hua-feng; Shi, Gang-qiang; Sun, Yang-yang; Liu, Fei

    2015-07-01

    Hyperspectral image classification is one of the important methods for identifying image information and is of great significance for feature identification, dynamic monitoring, thematic information extraction, etc. Unsupervised classification, which requires no prior knowledge, is widely used in hyperspectral image classification. This article proposes a new unsupervised classification algorithm for hyperspectral images based on harmonic analysis (HA), called the harmonic analysis classifier (HAC). First, the HAC algorithm computes the first harmonic component and draws its histogram, so that the initial feature categories and the pixels of the cluster centers can be determined from the number and location of the peaks. Then, the algorithm maps the spectral waveform information of the pixels to be classified into a feature space made up of harmonic decomposition times, amplitude, and phase; similar features gather together in this feature space, and the pixels are classified according to the minimum-distance principle. Finally, the algorithm computes the Euclidean distance between these pixels and the cluster centers and merges the initial classes by setting a distance threshold, thereby classifying the hyperspectral image. The paper collects spectral curves of two feature categories and obtains the harmonic decomposition times, amplitude, and phase after harmonic analysis; the distribution of the HA components in the feature space verifies the correctness of the HAC. The HAC algorithm is then applied to an EO-1 satellite Hyperion hyperspectral image to obtain classification results. Comparing the hyperspectral image classification results of the K-MEANS, ISODATA, and HAC classifiers confirms that the HAC, as an unsupervised classification method, performs better in hyperspectral image classification.
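    The amplitude and phase features mentioned above can be illustrated with a discrete Fourier decomposition of a single pixel spectrum. This is a hedged sketch only: the abstract does not give the paper's exact harmonic analysis formulas, so a standard DFT is assumed here, and the synthetic spectrum and the function name `harmonic_components` are hypothetical.

```python
import cmath
import math

def harmonic_components(spectrum, n_harmonics):
    """Return (amplitude, phase) for harmonics 1..n_harmonics of a spectrum.

    Assumption: harmonics are taken from a plain discrete Fourier transform;
    the paper's own decomposition may differ in normalization.
    """
    n = len(spectrum)
    components = []
    for h in range(1, n_harmonics + 1):
        coeff = sum(v * cmath.exp(-2j * math.pi * h * k / n)
                    for k, v in enumerate(spectrum))
        amplitude = 2.0 * abs(coeff) / n   # amplitude of the h-th cosine term
        phase = cmath.phase(coeff)         # phase of the h-th harmonic
        components.append((amplitude, phase))
    return components

# Synthetic 16-band "spectrum": a constant offset plus one cosine cycle.
n_bands = 16
spectrum = [1.0 + 0.5 * math.cos(2 * math.pi * k / n_bands)
            for k in range(n_bands)]
components = harmonic_components(spectrum, 2)
print(components[0])
```

    For this synthetic input the first harmonic has amplitude 0.5 and phase close to zero, while the second harmonic is numerically negligible.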

  20. Adaptive statistical pattern classifiers for remotely sensed data

    Science.gov (United States)

    Gonzalez, R. C.; Pace, M. O.; Raulston, H. S.

    1975-01-01

    A technique for the adaptive estimation of nonstationary statistics necessary for Bayesian classification is developed. The basic approach to the adaptive estimation procedure consists of two steps: (1) an optimal stochastic approximation of the parameters of interest and (2) a projection of the parameters in time or position. A divergence criterion is developed to monitor algorithm performance. Comparative results of adaptive and nonadaptive classifier tests are presented for simulated four dimensional spectral scan data.
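    Step (1) above, stochastic approximation of a parameter, can be illustrated with a recursive mean estimate of the Robbins-Monro type. This is a minimal sketch under stated assumptions, not the authors' optimal estimator: the gain schedule, the optional constant `gain` for tracking drifting statistics, and the function name `recursive_mean` are all hypothetical choices.

```python
def recursive_mean(samples, gain=None):
    """Recursively estimate a mean: m_k = m_{k-1} + a_k * (x_k - m_{k-1}).

    With gain=None the step size is a_k = 1/k, which reproduces the sample mean.
    With a constant gain (0 < gain <= 1) recent samples are weighted more
    heavily, which is one simple way to follow nonstationary statistics.
    """
    estimate = 0.0
    for k, x in enumerate(samples, start=1):
        # Never use a step smaller than the constant gain once 1/k drops below it.
        step = 1.0 / k if gain is None else max(gain, 1.0 / k)
        estimate += step * (x - estimate)
    return estimate

stationary = recursive_mean([2.0, 4.0, 6.0])
tracking = recursive_mean([2.0, 4.0, 6.0], gain=0.5)
print(stationary, tracking)
```

    The constant-gain variant ends closer to the most recent sample, which shows the trade-off between averaging out noise and following a drift.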

  1. Manually Classified Errors in Czech-Slovak Translation

    OpenAIRE

    Galuščáková, Petra; Bojar, Ondřej

    2012-01-01

    Outputs of five Czech-Slovak machine translation systems (Česílko, Česílko 2, Google Translate and Moses with different settings) for the first 50 sentences of the WMT 2010 test set. The translations were manually processed and the errors were marked and classified according to the scheme by Vilar et al. (David Vilar, Jia Xu, Luis Fernando D’Haro, Hermann Ney: Error Analysis of Statistical Machine Translation Output, Proceedings of LREC-2006, 2006)

  2. Classifying paragraph types using linguistic features: Is paragraph positioning important?

    OpenAIRE

    Scott A. Crossley, Kyle Dempsey & Danielle S. McNamara

    2011-01-01

    This study examines the potential for computational tools and human raters to classify paragraphs based on positioning. In this study, a corpus of 182 paragraphs was collected from student, argumentative essays. The paragraphs selected were initial, middle, and final paragraphs and their positioning related to introductory, body, and concluding paragraphs. The paragraphs were analyzed by the computational tool Coh-Metrix on a variety of linguistic features with correlates to textual cohesion ...

  3. Classifying Radio Galaxies with the Convolutional Neural Network

    Science.gov (United States)

    Aniyan, A. K.; Thorat, K.

    2017-06-01

    We present the application of a deep machine learning technique to classify radio images of extended sources on a morphological basis using convolutional neural networks (CNN). In this study, we have taken the case of the Fanaroff-Riley (FR) class of radio galaxies as well as radio galaxies with bent-tailed morphology. We have used archival data from the Very Large Array (VLA) Faint Images of the Radio Sky at Twenty Centimeters survey and existing visually classified samples available in the literature to train a neural network for morphological classification of these categories of radio sources. Our training sample size for each of these categories is ~200 sources, which has been augmented by rotated versions of the same. Our study shows that CNNs can classify images of the FRI and FRII and bent-tailed radio galaxies with high accuracy (maximum precision at 95%) using well-defined samples and a “fusion classifier,” which combines the results of binary classifications, while allowing for a mechanism to find sources with unusual morphologies. The individual precision is highest for bent-tailed radio galaxies at 95% and is 91% and 75% for the FRI and FRII classes, respectively, whereas the recall is highest for FRI and FRIIs at 91% each, while the bent-tailed class has a recall of 79%. These results show that our classification is comparable to manual classification, while being much faster. Finally, we discuss the computational and data-related challenges associated with the morphological classification of radio galaxies with CNNs.
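    The per-class precision and recall figures quoted above follow the usual definitions: precision is the fraction of sources predicted as a class that truly belong to it, and recall is the fraction of true members of a class that are recovered. The sketch below computes both from label lists; the labels are made-up illustration data, not the paper's results, and `precision_recall` is a hypothetical helper name.

```python
from collections import Counter

def precision_recall(y_true, y_pred):
    """Return {class: (precision, recall)} for every class seen in either list."""
    true_count = Counter(y_true)
    pred_count = Counter(y_pred)
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    scores = {}
    for c in sorted(set(y_true) | set(y_pred)):
        precision = true_pos[c] / pred_count[c] if pred_count[c] else 0.0
        recall = true_pos[c] / true_count[c] if true_count[c] else 0.0
        scores[c] = (precision, recall)
    return scores

# Hypothetical labels: FRI, FRII and bent-tailed (BT) sources.
y_true = ["FRI", "FRI", "FRII", "FRII", "BT", "BT"]
y_pred = ["FRI", "FRII", "FRII", "FRII", "BT", "FRI"]
scores = precision_recall(y_true, y_pred)
print(scores)
```

    A class can have high precision and low recall at the same time, as the bent-tailed class does in this toy data, which is why the abstract reports the two measures separately.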

  4. Prediction of cardiac arrest recurrence using ensemble classifiers

    Indian Academy of Sciences (India)

    NACHIKET TAPAS; TUSHAR LONE; DAMODAR REDDY; VENKATANARESH KUPPILI

    2017-07-01

    The inability of the heart to contract effectively, or its failure to contract at all, prevents blood from circulating efficiently, causing circulatory arrest, also called cardiac arrest or cardiopulmonary arrest. Unexpected cardiac arrest is medically referred to as sudden cardiac arrest (SCA). The poor survival rate of patients with SCA is one of the most ubiquitous health care problems today. Recent studies show that heart-rate-derived features can act as early predictors of SCA. The addition of angiographic and electrophysiological features can increase the robustness of the prediction system. Early warning can save many lives. The risk of recurrent terminal cardiac arrest is high for out-of-hospital survivors. Previous studies indicate that recurrent cardiac events are time dependent and are highly probable during clinical follow-up, predominantly in the early phase. In this paper, we observe how the risk of subsequent cardiac arrest recurrence, and the influence of various clinical, angiographic and electrophysiological parameters on it, change with time. Various medical and synthetic datasets, such as an ECG dataset from PhysioNet, the Pima Indian Diabetes dataset from the UCI Machine Learning Repository and a gene expression dataset from GEO, are used, which is unique compared with related works. Various classifiers, such as LogitBoost with a simple regression function, random forest and multilayer perceptron, are used for recurrence risk prediction. Together, these classifiers form the ensemble classifier. Classifiers are compared based on various measures such as accuracy and precision. Based on the classification, risk scores are calculated using logistic regression with backward elimination. The proposed method is used for final risk estimation. The same datasets are used to develop the risk score calculation model. Experimental results are found to be encouraging.

  5. Review of the US Department of Energy Classified Visits Program

    Energy Technology Data Exchange (ETDEWEB)

    Martin, S W; Killinger, M H; Segura, M A

    1992-07-01

    This review examines the US Department of Energy (DOE) Classified Visits Program, which is administered by the Office of Safeguards and Security. The overall purpose of this analysis is to (1) ensure that DOE policy and implementing procedures are appropriate to maintain US national security intentions; (2) evaluate the effectiveness of the process used across the DOE complex; and (3) recommend changes which will enhance the overall efficiency of the process while maintaining the program's integrity.

  6. An Investigation to Improve Classifier Accuracy for Myo Collected Data

    Science.gov (United States)

    2017-02-01

    Michael H Lee, Andre V Harrison, and Robert P Winkler. Approved for public release; distribution is unlimited.

  7. Face Recognition Combining Eigen Features with a Parzen Classifier

    Institute of Scientific and Technical Information of China (English)

    SUN Xin; LIU Bing; LIU Ben-yong

    2005-01-01

    A face recognition scheme is proposed in which a face image is preprocessed by pixel averaging and energy normalization to reduce the data dimension and the effect of brightness variation, followed by a Fourier transform to estimate the spectrum of the preprocessed image. Principal component analysis is conducted on the spectra of the face images to obtain eigen features. Combining the eigen features with a Parzen classifier, experiments are conducted on the ORL face database.

  8. Image Classifying Registration for Gaussian & Bayesian Techniques: A Review

    Directory of Open Access Journals (Sweden)

    Rahul Godghate

    2014-04-01

    A Bayesian technique for image classifying registration is presented that performs image registration and pixel classification simultaneously. Medical image registration is critical for the fusion of complementary information about patient anatomy and physiology, for the longitudinal study of a human organ over time and the monitoring of disease development or treatment effect, for the statistical analysis of population variation in comparison to a so-called digital atlas, for image-guided therapy, etc. The Bayesian technique for image classifying registration is well suited to image pairs that contain two classes of pixels with different inter-image intensity relationships. We show through different experiments that the model can be applied in many different ways. For instance, if the class map is known, it can be used for template-based segmentation. If the full model is used, it can be applied to lesion detection by image comparison. Experiments have been conducted on both real and simulated data. They show that in the presence of an extra class, the classifying registration improves both the registration and the detection, especially when the deformations are small. The proposed model is defined using only two classes, but it is straightforward to extend it to an arbitrary number of classes.

  9. General and Local: Averaged k-Dependence Bayesian Classifiers

    Directory of Open Access Journals (Sweden)

    Limin Wang

    2015-06-01

    The inference of a general Bayesian network has been shown to be an NP-hard problem, even for approximate solutions. Although the k-dependence Bayesian (KDB) classifier can be constructed at arbitrary points (values of k) along the attribute dependence spectrum, it cannot identify the changes of interdependencies when attributes take different values. Local KDB, which learns in the framework of KDB, is proposed in this study to describe the local dependencies implicated in each test instance. Based on the analysis of functional dependencies, substitution-elimination resolution, a new type of semi-naive Bayesian operation, is proposed to substitute or eliminate generalization to achieve accurate estimation of the conditional probability distribution while reducing computational complexity. The final classifier, the averaged k-dependence Bayesian (AKDB) classifier, averages the output of KDB and local KDB. Experimental results on the repository of machine learning databases from the University of California Irvine (UCI) showed that AKDB has significant advantages in zero-one loss and bias relative to naive Bayes (NB), tree augmented naive Bayes (TAN), averaged one-dependence estimators (AODE), and KDB. Moreover, KDB and local KDB show mutually complementary characteristics with respect to variance.

  10. Weighted-Fusion-Based Representation Classifiers for Hyperspectral Imagery

    Directory of Open Access Journals (Sweden)

    Bing Peng

    2015-11-01

    Spatial texture features have been demonstrated to be very useful for the recently proposed representation-based classifiers, such as the sparse representation-based classifier (SRC) and nearest regularized subspace (NRS). In this work, a weighted residual-fusion-based strategy with multiple features is proposed for these classifiers. The multiple features include local binary patterns (LBP), Gabor features, and the original spectral signatures. In the proposed classification framework, the representation residuals obtained for a testing pixel with each type of feature are weighted to generate the final representation residual, and the label of the testing pixel is then determined as the class yielding the minimum final residual. The motivation of this work is that different features represent pixels from different perspectives and their fusion in the residual domain can enhance the discriminative ability. Experimental results on several real hyperspectral image datasets demonstrate that the proposed residual-based fusion outperforms the original NRS, SRC, support vector machine (SVM) with LBP, and SVM with Gabor features, even in small-sample-size (SSS) situations.
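    The decision rule described above, a weighted sum of per-feature residuals followed by a minimum-residual class choice, can be sketched as follows. The residual values, weights, class names, and the function name `fuse_and_classify` are hypothetical; the abstract does not state how the weights are chosen or how each residual is computed.

```python
def fuse_and_classify(residuals, weights):
    """Fuse per-feature class residuals and pick the class with the smallest result.

    residuals[feature][cls] -- representation residual of the test pixel for
                               class cls using that feature type
    weights[feature]        -- fusion weight of that feature type (assumed given)
    """
    classes = next(iter(residuals.values())).keys()
    fused = {c: sum(weights[f] * residuals[f][c] for f in residuals)
             for c in classes}
    label = min(fused, key=fused.get)
    return label, fused

# Hypothetical residuals of one test pixel for two classes and three feature types.
residuals = {
    "spectral": {"grass": 0.40, "soil": 0.35},
    "lbp":      {"grass": 0.10, "soil": 0.50},
    "gabor":    {"grass": 0.20, "soil": 0.30},
}
weights = {"spectral": 0.2, "lbp": 0.5, "gabor": 0.3}

label, fused = fuse_and_classify(residuals, weights)
print(label, fused)
```

    In this toy case the spectral feature alone would favor "soil", but the fused residual favors "grass", which illustrates how fusion in the residual domain can change the decision.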

  11. Self-organizing map classifier for stressed speech recognition

    Science.gov (United States)

    Partila, Pavol; Tovarek, Jaromir; Voznak, Miroslav

    2016-05-01

    This paper presents a method for detecting speech under stress using Self-Organizing Maps (SOM). Most people who are exposed to stressful situations cannot respond adequately to stimuli. The army, police, and fire services are the environments most typically associated with an increased number of stressful situations. Personnel in action are directed by a control center, and control commands should be adapted to the psychological state of the person in action. It is known that psychological changes in the human body are also reflected physiologically, which in turn affects speech under stress. A system for recognizing stress in speech is therefore needed in the security forces. One possible classifier, popular for its flexibility, is the self-organizing map, a type of artificial neural network. Flexibility here means that the classifier is independent of the character of the input data, a feature that is suitable for speech processing. Human stress can be seen as a kind of emotional state. Mel-frequency cepstral coefficients, LPC coefficients, and prosody features were selected as input data because of their sensitivity to emotional changes. The parameters were calculated on speech recordings divided into two classes: stress-state recordings and normal-state recordings. The contribution of the experiment is a method that uses a SOM classifier for stressed speech detection. Results showed the advantage of this method, which is its input data flexibility.

  12. A Distributed Fuzzy Associative Classifier for Big Data.

    Science.gov (United States)

    Segatori, Armando; Bechini, Alessio; Ducange, Pietro; Marcelloni, Francesco

    2017-09-19

    Fuzzy associative classification has not been widely analyzed in the literature, although associative classifiers (ACs) have proved to be very effective in different real domain applications. The main reason is that learning fuzzy ACs is a very heavy task, especially when dealing with large datasets. To overcome this drawback, in this paper, we propose an efficient distributed fuzzy associative classification approach based on the MapReduce paradigm. The approach exploits a novel distributed discretizer based on fuzzy entropy for efficiently generating fuzzy partitions of the attributes. Then, a set of candidate fuzzy association rules is generated by employing a distributed fuzzy extension of the well-known FP-Growth algorithm. Finally, this set is pruned by using three purposely adapted types of pruning. We implemented our approach on the popular Hadoop framework. Hadoop allows distributing storage and processing of very large data sets on computer clusters built from commodity hardware. We have performed extensive experiments and a detailed analysis of the results using six very large datasets with up to 11,000,000 instances. We have also experimented with different types of reasoning methods. Focusing on accuracy, model complexity, computation time, and scalability, we compare the results achieved by our approach with those obtained by two distributed nonfuzzy ACs recently proposed in the literature. We highlight that, although the accuracies are comparable, the complexity of the classifiers generated by the fuzzy distributed approach, evaluated in terms of number of rules, is lower than that of the nonfuzzy classifiers.

  13. Classifying the tropospheric precursor patterns of sudden stratospheric warmings

    Science.gov (United States)

    Bao, Ming; Tan, Xin; Hartmann, Dennis L.; Ceppi, Paulo

    2017-08-01

    Classifying the tropospheric precursor patterns of sudden stratospheric warmings (SSWs) may provide insight into the different physical mechanisms of SSWs. Based on 37 major SSWs during the 1958-2014 winters in the ERA reanalysis data sets, the self-organizing maps method is used to classify the tropospheric precursor patterns of SSWs. The cluster analysis indicates that one of the precursor patterns appears as a mixed pattern consisting of the negative-signed Western Hemisphere circulation pattern and the positive phase of the Pacific-North America pattern. The mixed pattern exhibits higher statistical significance as a precursor pattern of SSWs than other previously identified precursors such as the subpolar North Pacific low, Atlantic blocking, and the western Pacific pattern. Other clusters confirm northern European blocking and Gulf of Alaska blocking as precursors of SSWs. Linear interference with the climatological planetary waves provides a simple interpretation for the precursors. The relationship between the classified precursor patterns of SSWs and ENSO phases as well as the types of SSWs is discussed.

  14. Comparison of artificial intelligence classifiers for SIP attack data

    Science.gov (United States)

    Safarik, Jakub; Slachta, Jiri

    2016-05-01

    A honeypot application is a source of valuable data about attacks on the network. We run several SIP honeypots in various computer networks, which are separated geographically and logically. Each honeypot runs on a public IP address and uses standard SIP PBX ports. All information gathered via the honeypots is periodically sent to a centralized server. This server classifies all attack data with a neural network algorithm. The paper describes optimizations of a neural network classifier, which lower the classification error. The article contains a comparison of two neural network algorithms used for the classification of validation data. The first is the original implementation of the neural network described in recent work; the second neural network uses further optimizations such as input normalization and a cross-entropy cost function. We also use other implementations of neural networks and machine learning classification algorithms. The comparison tests their capabilities on validation data to find the optimal classifier. The results show promise for further development of an accurate SIP attack classification engine.

  15. A Novel Cascade Classifier for Automatic Microcalcification Detection.

    Directory of Open Access Journals (Sweden)

    Seung Yeon Shin

    Full Text Available In this paper, we present a novel cascaded classification framework for automatic detection of individual and clusters of microcalcifications (μC). Our framework comprises three classification stages: (i) a random forest (RF) classifier for simple features capturing the second order local structure of individual μCs, where non-μC pixels in the target mammogram are efficiently eliminated; (ii) a more complex discriminative restricted Boltzmann machine (DRBM) classifier for μC candidates determined in the RF stage, which automatically learns the detailed morphology of μC appearances for improved discriminative power; and (iii) a detector to detect clusters of μCs from the individual μC detection results, using two different criteria. From the two-stage RF-DRBM classifier, we are able to distinguish μCs using explicitly computed features, as well as learn implicit features that are able to further discriminate between confusing cases. Experimental evaluation is conducted on the original Mammographic Image Analysis Society (MIAS) and mini-MIAS databases, as well as our own Seoul National University Bundang Hospital digital mammographic database. It is shown that the proposed method outperforms comparable methods in terms of receiver operating characteristic (ROC) and precision-recall curves for detection of individual μCs and free-response receiver operating characteristic (FROC) curve for detection of clustered μCs.

  16. Evaluation of Polarimetric SAR Decomposition for Classifying Wetland Vegetation Types

    Directory of Open Access Journals (Sweden)

    Sang-Hoon Hong

    2015-07-01

    Full Text Available The Florida Everglades is the largest subtropical wetland system in the United States and, as with subtropical and tropical wetlands elsewhere, has been threatened by severe environmental stresses. It is very important to monitor such wetlands to inform management on the status of these fragile ecosystems. This study aims to examine the applicability of TerraSAR-X quadruple polarimetric (quad-pol) synthetic aperture radar (PolSAR) data for classifying wetland vegetation in the Everglades. We processed quad-pol data using the Hong & Wdowinski four-component decomposition, which accounts for double bounce scattering in the cross-polarization signal. The calculated decomposition images consist of four scattering mechanisms (single, co- and cross-pol double, and volume scattering). We applied an object-oriented image analysis approach to classify vegetation types with the decomposition results. We also used a high-resolution multispectral optical RapidEye image to compare statistics and classification results with Synthetic Aperture Radar (SAR) observations. The calculated classification accuracy was higher than 85%, suggesting that the TerraSAR-X quad-pol SAR signal had a high potential for distinguishing different vegetation types. Scattering components from SAR acquisition were particularly advantageous for classifying mangroves along tidal channels. We conclude that the typical scattering behaviors from model-based decomposition are useful for discriminating among different wetland vegetation types.

  17. Classifying prosthetic use via accelerometry in persons with transtibial amputations

    Directory of Open Access Journals (Sweden)

    Morgan T. Redfield, MSEE

    2013-12-01

    Full Text Available Knowledge of how persons with amputation use their prostheses and how this use changes over time may facilitate effective rehabilitation practices and enhance understanding of prosthesis functionality. Perpetual monitoring and classification of prosthesis use may also increase the health and quality of life for prosthetic users. Existing monitoring and classification systems are often limited in that they require the subject to manipulate the sensor (e.g., attach, remove, or reset a sensor), record data over relatively short time periods, and/or classify a limited number of activities and body postures of interest. In this study, a commercially available three-axis accelerometer (ActiLife ActiGraph GT3X+) was used to characterize the activities and body postures of individuals with transtibial amputation. Accelerometers were mounted on prosthetic pylons of 10 persons with transtibial amputation as they performed a preset routine of actions. Accelerometer data was postprocessed using a binary decision tree to identify when the prosthesis was being worn and to classify periods of use as movement (i.e., leg motion such as walking or stair climbing), standing (i.e., standing upright with limited leg motion), or sitting (i.e., seated with limited leg motion). Classifications were compared to visual observation by study researchers. The classifier achieved a mean +/– standard deviation accuracy of 96.6% +/– 3.0%.

  18. Classifying prosthetic use via accelerometry in persons with transtibial amputations.

    Science.gov (United States)

    Redfield, Morgan T; Cagle, John C; Hafner, Brian J; Sanders, Joan E

    2013-01-01

    Knowledge of how persons with amputation use their prostheses and how this use changes over time may facilitate effective rehabilitation practices and enhance understanding of prosthesis functionality. Perpetual monitoring and classification of prosthesis use may also increase the health and quality of life for prosthetic users. Existing monitoring and classification systems are often limited in that they require the subject to manipulate the sensor (e.g., attach, remove, or reset a sensor), record data over relatively short time periods, and/or classify a limited number of activities and body postures of interest. In this study, a commercially available three-axis accelerometer (ActiLife ActiGraph GT3X+) was used to characterize the activities and body postures of individuals with transtibial amputation. Accelerometers were mounted on prosthetic pylons of 10 persons with transtibial amputation as they performed a preset routine of actions. Accelerometer data was postprocessed using a binary decision tree to identify when the prosthesis was being worn and to classify periods of use as movement (i.e., leg motion such as walking or stair climbing), standing (i.e., standing upright with limited leg motion), or sitting (i.e., seated with limited leg motion). Classifications were compared to visual observation by study researchers. The classifier achieved a mean +/- standard deviation accuracy of 96.6% +/- 3.0%.

  19. Analysis of classifiers performance for classification of potential microcalcification

    Science.gov (United States)

    M. N., Arun K.; Sheshadri, H. S.

    2013-07-01

    Breast cancer is a significant public health problem in the world. According to the literature, early detection improves breast cancer prognosis. Mammography is a screening tool used for early detection of breast cancer. About 10-30% of cases are missed during the routine check, as it is difficult for radiologists to make an accurate analysis due to the large amount of data. Microcalcifications (MCs) are considered to be important signs of breast cancer. It has been reported in the literature that 30%-50% of breast cancers detected radiographically show MCs on mammograms. Histologic examinations report that 62% to 79% of breast carcinomas reveal MCs. MCs are tiny, vary in size, shape, and distribution, and may be closely connected to surrounding tissues. There is a major challenge in using traditional classifiers for the classification of individual potential MCs, as the processing of mammograms in the appropriate stage generates data sets with an unequal amount of information for both classes (i.e., MC and Not-MC). Most of the existing state-of-the-art classification approaches are developed by assuming the underlying training set is evenly distributed. However, they are faced with a severe bias problem when the training set is highly imbalanced in distribution. This paper addresses this issue by using classifiers which handle imbalanced data sets. In this paper, we also compare the performance of classifiers which are used in the classification of potential MCs.

  20. Binary Classifier Calibration Using an Ensemble of Linear Trend Estimation

    Science.gov (United States)

    Naeini, Mahdi Pakdaman; Cooper, Gregory F.

    2017-01-01

    Learning accurate probabilistic models from data is crucial in many practical tasks in data mining. In this paper we present a new non-parametric calibration method called ensemble of linear trend estimation (ELiTE). ELiTE utilizes the recently proposed ℓ1 trend filtering signal approximation method [22] to find the mapping from uncalibrated classification scores to the calibrated probability estimates. ELiTE is designed to address the key limitations of the histogram binning-based calibration methods which are (1) the use of a piecewise constant form of the calibration mapping using bins, and (2) the assumption of independence of predicted probabilities for the instances that are located in different bins. The method post-processes the output of a binary classifier to obtain calibrated probabilities. Thus, it can be applied with many existing classification models. We demonstrate the performance of ELiTE on real datasets for commonly used binary classification models. Experimental results show that the method outperforms several common binary-classifier calibration methods. In particular, ELiTE commonly performs statistically significantly better than the other methods, and never worse. Moreover, it is able to improve the calibration power of classifiers, while retaining their discrimination power. The method is also computationally tractable for large scale datasets, as it is practically O(N log N) time, where N is the number of samples.
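    The histogram binning baseline whose limitations the abstract lists can be sketched as follows. This is a minimal illustration of equal-width binning, not the ELiTE method itself; the function names, the bin count, the fallback for empty bins, and the toy scores are all assumptions made for the example.

```python
def fit_histogram_binning(scores, labels, n_bins=4):
    """Return one calibrated probability per equal-width bin on [0, 1]."""
    positives = [0] * n_bins
    counts = [0] * n_bins
    for s, y in zip(scores, labels):
        b = min(int(s * n_bins), n_bins - 1)  # a score of 1.0 goes to the last bin
        counts[b] += 1
        positives[b] += y
    # Empty bins fall back to the bin midpoint (an arbitrary illustrative choice).
    return [positives[b] / counts[b] if counts[b] else (b + 0.5) / n_bins
            for b in range(n_bins)]


def calibrate(score, bin_probs):
    """Map an uncalibrated score to the fraction of positives in its bin."""
    n_bins = len(bin_probs)
    return bin_probs[min(int(score * n_bins), n_bins - 1)]


# Toy uncalibrated scores and true labels (invented for illustration).
scores = [0.05, 0.15, 0.30, 0.35, 0.55, 0.65, 0.85, 0.95]
labels = [0, 0, 0, 1, 1, 0, 1, 1]
bin_probs = fit_histogram_binning(scores, labels, n_bins=4)
```

    The mapping is piecewise constant: every score in a bin receives the same probability, which is limitation (1) named in the abstract.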

  1. ASYMBOOST-BASED FISHER LINEAR CLASSIFIER FOR FACE RECOGNITION

    Institute of Scientific and Technical Information of China (English)

    Wang Xianji; Ye Xueyi; Li Bin; Li Xin; Zhuang Zhenquan

    2008-01-01

    When using AdaBoost to select discriminant features from some feature space (e.g. Gabor feature space) for face recognition, cascade structure is usually adopted to leverage the asymmetry in the distribution of positive and negative samples. Each node in the cascade structure is a classifier trained by AdaBoost with an asymmetric learning goal of high recognition rate but only moderately low false positive rate. One limitation of AdaBoost arises in the context of skewed example distribution and cascade classifiers: AdaBoost minimizes the classification error, which is not guaranteed to achieve the asymmetric node learning goal. In this paper, we propose to use the asymmetric AdaBoost (AsymBoost) as a mechanism to address the asymmetric node learning goal. Moreover, the two parts of the selecting features and forming ensemble classifiers are decoupled, both of which occur simultaneously in AsymBoost and AdaBoost. Fisher Linear Discriminant Analysis (FLDA) is used on the selected features to learn a linear discriminant function that maximizes the separability of data among the different classes, which we think can improve the recognition performance. The proposed algorithm is demonstrated with face recognition using a Gabor based representation on the FERET database. Experimental results show that the proposed algorithm yields better recognition performance than AdaBoost itself.

  2. Exploiting Language Models to Classify Events from Twitter

    Directory of Open Access Journals (Sweden)

    Duc-Thuan Vo

    2015-01-01

    Full Text Available Classifying events is challenging in Twitter because tweet texts have a large amount of temporal data with a lot of noise and various kinds of topics. In this paper, we propose a method to classify events from Twitter. We first find the distinguishing terms between tweets in events and measure their similarities with learning language models such as ConceptNet and a latent Dirichlet allocation method for selectional preferences (LDA-SP), which have been widely studied based on large text corpora within computational linguistic relations. The relationship of term words in tweets will be discovered by checking them under each model. We then propose a method to compute the similarity between tweets based on tweets’ features including common term words and relationships among their distinguishing term words. It will be explicit and convenient for applying to k-nearest neighbor techniques for classification. We carefully ran experiments on the Edinburgh Twitter Corpus to show that our method achieves competitive results for classifying events.

  3. GS-TEC: the Gaia Spectrophotometry Transient Events Classifier

    CERN Document Server

    Blagorodnova, Nadejda; Wyrzykowski, Łukasz; Irwin, Mike; Walton, Nicholas A

    2014-01-01

    We present an algorithm for classifying the nearby transient objects detected by the Gaia satellite. The algorithm will use the low-resolution spectra from the blue and red spectro-photometers on board of the satellite. Taking a Bayesian approach we model the spectra using the newly constructed reference spectral library and literature-driven priors. We find that for magnitudes brighter than 19 in Gaia $G$ magnitude, around 75% of the transients will be robustly classified. The efficiency of the algorithm for SNe type I is higher than 80% for magnitudes $G \leq 18$, dropping to approximately 60% at magnitude $G = 19$. For SNe type II, the efficiency varies from 75 to 60% for $G \leq 18$, falling to 50% at $G = 19$. The purity of our classifier is around 95% for SNe type I for all magnitudes. For SNe type II it is over 90% for objects with $G \leq 19$. GS-TEC also estimates the redshifts with errors of $\sigma_z \le 0.01$ and epochs with uncertainties $\sigma_t \simeq 13$ and 32 days for type SNe I and SNe II re...

  4. Multimodal biometric fusion using multiple-input correlation filter classifiers

    Science.gov (United States)

    Hennings, Pablo; Savvides, Marios; Vijaya Kumar, B. V. K.

    2005-03-01

    In this work we apply a computationally efficient, closed form design of a jointly optimized filter bank of correlation filter classifiers for biometric verification with the use of multiple biometrics from individuals. Advanced correlation filters have been used successfully for biometric classification, and have shown robustness in verifying faces, palmprints and fingerprints. In this study we address the issues of performing robust biometric verification when multiple biometrics from the same person are available at the moment of authentication; we implement biometric fusion by using a filter bank of correlation filter classifiers which are jointly optimized with each biometric, instead of designing separate independent correlation filter classifiers for each biometric and then fusing the resulting match scores. We present results using fingerprint and palmprint images from a data set of 40 people, showing a considerable advantage in verification performance producing a large margin of separation between the impostor and authentic match scores. The method proposed in this paper is a robust and secure method for authenticating an individual.

  5. Foreign object detection via texture recognition and a neural classifier

    Science.gov (United States)

    Patel, Devesh; Hannah, I.; Davies, E. R.

    1993-10-01

    It is rare to find pieces of stone, wood, metal, or glass in food packets, but when they occur, these foreign objects (FOs) cause distress to the consumer and concern to the manufacturer. Using x-ray imaging to detect FOs within food bags, hard contaminants such as stone or metal appear darker, whereas soft contaminants such as wood or rubber appear slightly lighter than the food substrate. In this paper we concentrate on the detection of soft contaminants such as small pieces of wood in bags of frozen corn kernels. Convolution masks are used to generate textural features which are then classified into corresponding homogeneous regions on the image using an artificial neural network (ANN) classifier. The separate ANN outputs are combined using a majority operator, and region discrepancies are removed by a median filter. Comparisons with classical classifiers showed the ANN approach to have the best overall combination of characteristics for our particular problem. The detected boundaries are in good agreement with the visually perceived segmentations.
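    The post-processing step described above (a majority operator over several classifier outputs, followed by a median filter) can be sketched as below. This is only an illustrative sketch: the abstract does not give the window size or the number of outputs, so the 3x3 window, the three toy label maps, and the function names are assumptions.

```python
from collections import Counter


def majority_vote(label_maps):
    """Combine several per-pixel label maps by taking the most common label."""
    rows, cols = len(label_maps[0]), len(label_maps[0][0])
    return [[Counter(m[r][c] for m in label_maps).most_common(1)[0][0]
             for c in range(cols)]
            for r in range(rows)]


def median_filter_3x3(image):
    """Replace each interior pixel by the median of its 3x3 neighbourhood.

    Border pixels are copied unchanged (a simplifying assumption).
    """
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            window = [image[r + dr][c + dc]
                      for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            out[r][c] = sorted(window)[4]  # middle of the 9 sorted values
    return out


# Three toy binary label maps standing in for separate classifier outputs.
map_a = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
map_b = [[0, 0, 0], [0, 1, 0], [0, 0, 1]]
map_c = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

voted = majority_vote([map_a, map_b, map_c])
smoothed = median_filter_3x3(voted)
```

    In this toy case the vote keeps the isolated centre pixel (two of three maps agree) and the median filter then removes it as a region discrepancy.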

  6. Classifying Human Body Acceleration Patterns Using a Hierarchical Temporal Memory

    Science.gov (United States)

    Sassi, Federico; Ascari, Luca; Cagnoni, Stefano

    This paper introduces a novel approach to the detection of human body movements during daily life. With the sole use of one wearable wireless triaxial accelerometer attached to one's chest, this approach aims at classifying raw acceleration data robustly, to detect many common human behaviors without requiring any specific a-priori knowledge about movements. The proposed approach consists of feeding sensory data into a specifically trained Hierarchical Temporal Memory (HTM) to extract invariant spatial-temporal patterns that characterize different body movements. The HTM output is then classified using a Support Vector Machine (SVM) into different categories. The performance of this new HTM+SVM combination is compared with a single SVM using real-world data corresponding to movements like "standing", "walking", "jumping" and "falling", acquired from a group of different people. Experimental results show that the HTM+SVM approach can detect behaviors with very high accuracy and is more robust, with respect to noise, than a classifier based solely on SVMs.

  7. Classifier-Guided Sampling for Complex Energy System Optimization

    Energy Technology Data Exchange (ETDEWEB)

    Backlund, Peter B. [Sandia National Laboratories (SNL-NM), Albuquerque, NM (United States); Eddy, John P. [Sandia National Laboratories (SNL-NM), Albuquerque, NM (United States)

    2015-09-01

    This report documents the results of a Laboratory Directed Research and Development (LDRD) effort entitled "Classifier-Guided Sampling for Complex Energy System Optimization" that was conducted during FY 2014 and FY 2015. The goal of this project was to develop, implement, and test major improvements to the classifier-guided sampling (CGS) algorithm. CGS is a type of evolutionary algorithm for performing search and optimization over a set of discrete design variables in the face of one or more objective functions. Existing evolutionary algorithms, such as genetic algorithms, may require a large number of objective function evaluations to identify optimal or near-optimal solutions. Reducing the number of evaluations can result in significant time savings, especially if the objective function is computationally expensive. CGS reduces the evaluation count by using a Bayesian network classifier to filter out non-promising candidate designs, prior to evaluation, based on their posterior probabilities. In this project, both the single-objective and multi-objective versions of the CGS are developed and tested on a set of benchmark problems. As a domain-specific case study, CGS is used to design a microgrid for use in islanded mode during an extended bulk power grid outage.
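    The filtering idea in this abstract (score candidate designs with a classifier and evaluate only the promising ones) can be sketched as follows. This is a simplified illustration, not the report's algorithm: a naive Bayes classifier stands in for the Bayesian network classifier, the evolutionary loop is omitted, and the objective, training designs, and 0.55 threshold are hypothetical.

```python
import itertools


def expensive_objective(design):
    """Hypothetical stand-in for a costly simulation; smaller is better."""
    return sum(design)


def train_naive_bayes(designs, is_good, n_values):
    """Estimate class priors and P(x_i = v | class) with Laplace smoothing."""
    model = {}
    for cls in (True, False):
        members = [d for d, g in zip(designs, is_good) if g == cls]
        prior = (len(members) + 1) / (len(designs) + 2)
        cond = [[(sum(1 for d in members if d[i] == v) + 1)
                 / (len(members) + n_values)
                 for v in range(n_values)]
                for i in range(len(designs[0]))]
        model[cls] = (prior, cond)
    return model


def posterior_good(model, design):
    """Posterior probability that a design belongs to the 'good' class."""
    joint = {}
    for cls, (prior, cond) in model.items():
        p = prior
        for i, v in enumerate(design):
            p *= cond[i][v]
        joint[cls] = p
    return joint[True] / (joint[True] + joint[False])


# Designs already evaluated, labelled good/bad (hypothetical training data).
evaluated = [(0, 0), (0, 1), (2, 2), (2, 1)]
labels = [True, True, False, False]
model = train_naive_bayes(evaluated, labels, n_values=3)

# Filter all candidate designs before spending any objective evaluations.
candidates = list(itertools.product(range(3), repeat=2))
promising = [d for d in candidates if posterior_good(model, d) > 0.55]
best = min(promising, key=expensive_objective)
```

    Only the designs in `promising` are passed to `expensive_objective`, which is where the saving in evaluation count comes from.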

  8. The use of hyperspectral data for tree species discrimination: Combining binary classifiers

    CSIR Research Space (South Africa)

    Dastile, X

    2010-11-01

    Full Text Available Construct classifier on training set; estimate error probability (proportion of misclassified samples) on test set. Training set (70%): make classifier; test set (30%): estimate P(error). RU SASA 2010. 8. Results of seven-class classifiers: Sets 10... Classifiers: nearest neighbour and neural networks; estimate of the error probability; binary classifiers; combining binary classifiers: Error Correcting Output Codes; discussion; references. 2. Hyperspectral Remote Sensing...
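    Two ideas named in this record, combining binary classifiers with error correcting output codes and estimating the error probability as the proportion of misclassified test samples, can be sketched as below. The four class names and the 7-bit code words are hypothetical and are not taken from the presentation, which reports seven-class results.

```python
# Hypothetical 7-bit code words: one row per class, one column per binary
# classifier. Any two rows differ in 4 positions, so one wrong bit is corrected.
CODE_WORDS = {
    "class_1": (1, 1, 1, 1, 1, 1, 1),
    "class_2": (0, 0, 0, 0, 1, 1, 1),
    "class_3": (0, 0, 1, 1, 0, 0, 1),
    "class_4": (0, 1, 0, 1, 0, 1, 0),
}


def hamming(a, b):
    """Number of positions in which two bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def ecoc_decode(bits):
    """Return the class whose code word is closest to the binary outputs."""
    return min(CODE_WORDS, key=lambda name: hamming(CODE_WORDS[name], bits))


def error_rate(predicted, actual):
    """Proportion of misclassified samples, as estimated on a test set."""
    wrong = sum(p != a for p, a in zip(predicted, actual))
    return wrong / len(actual)


# Outputs of the seven binary classifiers with the first bit flipped
# relative to the code word of class_3.
noisy_outputs = (1, 0, 1, 1, 0, 0, 1)
decoded = ecoc_decode(noisy_outputs)

test_error = error_rate(["class_3", "class_1", "class_2"],
                        ["class_3", "class_1", "class_4"])
```

    Decoding by nearest code word is what lets the combined classifier tolerate a mistake by one of its binary members.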

  9. Automatic Human Facial Expression Recognition Based on Integrated Classifier From Monocular Video with Uncalibrated Camera

    Directory of Open Access Journals (Sweden)

    Yu Tao

    2017-01-01

    Full Text Available An automatic recognition framework for human facial expressions from a monocular video with an uncalibrated camera is proposed. The expression characteristics are first acquired from a kind of deformable template, similar to a facial muscle distribution. After associated regularization, the time sequences from the trait changes in space-time under complete expressional production are then arranged line by line in a matrix. Next, the matrix dimensionality is reduced by a method of manifold learning of neighborhood-preserving embedding. Finally, the refined matrix containing the expression trait information is recognized by a classifier that integrates the hidden conditional random field (HCRF) and support vector machine (SVM). In an experiment using the Cohn–Kanade database, the proposed method showed a comparatively higher recognition rate than the individual HCRF or SVM methods in direct recognition from two-dimensional human face traits. Moreover, the proposed method was shown to be more robust than the typical Kotsia method because the former contains more structural characteristics of the data to be classified in space-time.

  10. nRC: non-coding RNA Classifier based on structural features.

    Science.gov (United States)

    Fiannaca, Antonino; La Rosa, Massimo; La Paglia, Laura; Rizzo, Riccardo; Urso, Alfonso

    2017-01-01

    Non-coding RNAs (ncRNAs) are small non-coding sequences involved in gene expression regulation of many biological processes and diseases. The recent discovery of a large set of different ncRNAs with biologically relevant roles has opened the way to develop methods able to discriminate between the different ncRNA classes. Moreover, the lack of knowledge about the complete mechanisms in regulative processes, together with the development of high-throughput technologies, has required bioinformatics tools to give biologists and clinicians a deeper comprehension of the functional roles of ncRNAs. In this work, we introduce a new ncRNA classification tool, nRC (non-coding RNA Classifier). Our approach is based on feature extraction from the ncRNA secondary structure together with a supervised classification algorithm implementing a deep learning architecture based on convolutional neural networks. We tested our approach for the classification of 13 different ncRNA classes. We obtained classification scores using the most common statistical measures. In particular, we reach an accuracy and sensitivity score of about 74%. The proposed method outperforms other similar classification methods based on secondary structure features and machine learning algorithms, including the RNAcon tool that, to date, is the reference classifier. The nRC tool is freely available as a docker image at https://hub.docker.com/r/tblab/nrc/. The source code of the nRC tool is also available at https://github.com/IcarPA-TBlab/nrc.

  11. Use of multivariate analysis to minimize collecting of infrared images and classify detected objects

    Science.gov (United States)

    Svensson, Thomas; Letalick, Dietmar

    2016-10-01

    An infrared image contains spatial and radiative information about objects in a scene. Two challenges are to classify pixels in a cluttered environment and to detect partly obscured or buried objects like mines and IEDs. Infrared image sequences provide additional temporal information, which can be utilized for a more robust object detection and an improved classification of object pixels. A manual evaluation of multi-dimensional data is generally time consuming and inefficient and therefore various algorithms are used. By a principal component analysis (PCA) most of the information is retained in a new, reduced system with fewer dimensions. The principal component coefficients (loadings) are here used both for classifying detected object pixels and for reducing the number of images in the analysis by computing of score vectors. For the datasets studied, the number of required images can be reduced significantly without loss of detection and classification ability. This allows for a more sparse sampling and scanning of larger areas when using a UAV, for example.
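    The loadings and score vectors mentioned above can be illustrated with a minimal PCA sketch. It treats each pixel as a row whose columns are its values in successive images and extracts only the first principal component by power iteration; the data, the function name, and the choice of power iteration are assumptions for illustration, not the authors' procedure.

```python
def first_principal_component(rows, iterations=50):
    """Return (loadings, scores) of the first principal component.

    rows: one list per pixel, one column per image in the sequence.
    Assumes the data are not constant, so the covariance is non-zero.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    # Sample covariance matrix between the d images.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration from an arbitrary non-zero starting vector.
    v = [1.0 / (j + 1) for j in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Score of each pixel: projection of its centered values on the loadings.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


# Four toy pixels observed in three images (values invented for illustration).
pixels = [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0], [3.0, 3.0, 3.0], [4.0, 4.0, 4.0]]
loadings, scores = first_principal_component(pixels)
```

    Here the three images carry the same information, so a single score per pixel retains all of the variance, which mirrors the abstract's point that the number of required images can be reduced.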

  12. Understanding and classifying metabolite space and metabolite-likeness.

    Directory of Open Access Journals (Sweden)

    Julio E Peironcely

    Full Text Available While the entirety of 'Chemical Space' is huge (and assumed to contain between 10^63 and 10^200 'small molecules'), distinct subsets of this space can nonetheless be defined according to certain structural parameters. An example of such a subspace is the chemical space spanned by endogenous metabolites, defined as 'naturally occurring' products of an organism's metabolism. In order to understand this part of chemical space in more detail, we analyzed the chemical space populated by human metabolites in two ways. Firstly, in order to understand metabolite space better, we performed Principal Component Analysis (PCA), hierarchical clustering and scaffold analysis of metabolites and non-metabolites in order to analyze which chemical features are characteristic for both classes of compounds. Here we found that heteroatom (both oxygen and nitrogen) content, as well as the presence of particular ring systems was able to distinguish both groups of compounds. Secondly, we established which molecular descriptors and classifiers are capable of distinguishing metabolites from non-metabolites, by assigning a 'metabolite-likeness' score. It was found that the combination of MDL Public Keys and Random Forest exhibited best overall classification performance with an AUC value of 99.13%, a specificity of 99.84% and a selectivity of 88.79%. This performance is slightly better than previous classifiers; and interestingly we found that drugs occupy two distinct areas of metabolite-likeness, the one being more 'synthetic' and the other being more 'metabolite-like'. Also, on a truly prospective dataset of 457 compounds, 95.84% correct classification was achieved. Overall, we are confident that we contributed to the tasks of classifying metabolites, as well as to understanding metabolite chemical space better. This knowledge can now be used in the development of new drugs that need to resemble metabolites, and in our work particularly for assessing the metabolite

  13. Decision Bayes Criteria for Optimal Classifier Based on Probabilistic Measures

    Institute of Scientific and Technical Information of China (English)

    Wissal Drira; Faouzi Ghorbel

    2014-01-01

    This paper addresses the high dimension sample problem in discriminant analysis under nonparametric and supervised assumptions. Since there is a kind of equivalence between the probabilistic dependence measure and the Bayes classification error probability, we propose to use an iterative algorithm to optimize the dimension reduction for classification with a probabilistic approach to achieve the Bayes classifier. The estimated probabilities of different errors encountered along the different phases of the system are realized by the kernel estimate, which is adjusted by means of the smoothing parameter. Experimental results suggest that the proposed approach performs well.

  14. Hierarchical classifier design in high-dimensional, numerous class cases

    Science.gov (United States)

    Kim, Byungyong; Landgrebe, David A.

    1991-01-01

    As progress in new sensor technology continues, increasingly high spectral resolution sensors are being developed. These sensors give more detailed and complex data for each picture element and greatly increase the dimensionality of data over past systems. Three methods for designing a decision tree classifier are discussed: a top down approach, a bottom up approach, and a hybrid approach. Three feature extraction techniques are implemented. Canonical and extended canonical techniques are mainly dependent on the mean difference between two classes. An autocorrelation technique is dependent on the correlation differences. The mathematical relationship among sample size, dimensionality, and risk value is derived.

  15. DFRFT: A Classified Review of Recent Methods with Its Application

    Directory of Open Access Journals (Sweden)

    Ashutosh Kumar Singh

    2013-01-01

    Full Text Available In the literature, there are various algorithms available for computing the discrete fractional Fourier transform (DFRFT). In this paper, all the existing methods are reviewed, classified into four categories, and subsequently compared to find out the best alternative from the view point of minimal computational error, computational complexity, transform features, and additional features like security. Subsequently, the correlation theorem of FRFT has been utilized to remove significantly the Doppler shift caused due to motion of receiver in the DSB-SC AM signal. Finally, the role of DFRFT has been investigated in the area of steganography.

  16. Classifying Cubic Edge-Transitive Graphs of Order 8p

    Indian Academy of Sciences (India)

    Mehdi Alaeiyan; M K Hosseinipoor

    2009-11-01

    A simple undirected graph is said to be semisymmetric if it is regular and edge-transitive but not vertex-transitive. Let p be a prime. It was shown by Folkman (J. Combin. Theory 3 (1967) 215--232) that a regular edge-transitive graph of order 2p or 2p^2 is necessarily vertex-transitive. In this paper, an extension of his result in the case of cubic graphs is given. It is proved that every cubic edge-transitive graph of order 8p is symmetric, and then all such graphs are classified.

  17. Colorfulness Enhancement Using Image Classifier Based on Chroma-histogram

    Institute of Scientific and Technical Information of China (English)

    Moon-cheol KIM; Kyoung-won LIM

    2010-01-01

    The paper proposes a colorfulness enhancement of pictorial images using an image classifier based on a chroma histogram. This approach first estimates the strength of colorfulness of images and their types. With this information, the algorithm automatically adjusts image colorfulness for a more natural image look. With the help of an additional detection of skin colors and a pixel chroma adaptive local processing, the algorithm produces a more natural image look. The algorithm performance was tested with an image quality judgment experiment with 20 persons. The experimental result indicates a better image preference.

  18. Can scientific journals be classified based on their citation profiles?

    Directory of Open Access Journals (Sweden)

    Sayed-Amir Marashi

    2015-03-01

    Classification of scientific publications is of great importance in biomedical research evaluation. However, accurate classification of research publications is challenging and normally is performed in a rather subjective way. In the present paper, we propose to classify biomedical publications into superfamilies, by analysing their citation profiles, i.e. the location of citations in the structure of citing articles. Such a classification may help authors to find the appropriate biomedical journal for publication, may make journal comparisons more rational, and may even help planners to better track the consequences of their policies on biomedical research.

  19. Classifying BCI signals from novice users with extreme learning machine

    Science.gov (United States)

    Rodríguez-Bermúdez, Germán; Bueno-Crespo, Andrés; José Martinez-Albaladejo, F.

    2017-07-01

    A brain computer interface (BCI) allows external devices to be controlled using only the electrical activity of the brain. In order to improve the system, several approaches have been proposed. However, it is usual to test algorithms with standard BCI signals from expert users or from repositories available on the Internet. In this work, the extreme learning machine (ELM) has been tested with signals from 5 novice users and compared with standard classification algorithms. Experimental results show that ELM is a suitable method to classify electroencephalogram signals from novice users.

  20. Support vector machine classifiers for large data sets.

    Energy Technology Data Exchange (ETDEWEB)

    Gertz, E. M.; Griffin, J. D.

    2006-01-31

    This report concerns the generation of support vector machine classifiers for solving the pattern recognition problem in machine learning. Several methods are proposed based on interior point methods for convex quadratic programming. Software implementations are developed by adapting the object-oriented package OOQP to the problem structure and by using the software package PETSc to perform time-intensive computations in a distributed setting. Linear systems arising from classification problems with moderately large numbers of features are solved by using two techniques: one a parallel direct solver, the other a Krylov-subspace method incorporating novel preconditioning strategies. Numerical results are provided, and computational experience is discussed.

  1. A Fast Scalable Classifier Tightly Integrated with RDBMS

    Institute of Scientific and Technical Information of China (English)

    刘红岩; 陆宏钧; 陈剑

    2002-01-01

    In this paper, we report our success in building efficient scalable classifiers by exploring the capabilities of modern relational database management systems (RDBMS). In addition to high classification accuracy, the unique features of the approach include its high training speed, linear scalability, and simplicity in implementation. More importantly, the major computation required in the approach can be implemented using standard functions provided by the modern relational DBMS. Besides, with the effective rule pruning strategy, the algorithm proposed in this paper can produce a compact set of classification rules. The results of experiments conducted for performance evaluation and analysis are presented.

  2. On the statistical assessment of classifiers using DNA microarray data

    Directory of Open Access Journals (Sweden)

    Carella M

    2006-08-01

    Background: In this paper we present a method for the statistical assessment of cancer predictors which make use of gene expression profiles. The methodology is applied to a new data set of microarray gene expression data collected in Casa Sollievo della Sofferenza Hospital, Foggia, Italy. The data set is made up of normal (22) and tumor (25) specimens extracted from 25 patients affected by colon cancer. We propose to give answers to some questions which are relevant for the automatic diagnosis of cancer, such as: Is the size of the available data set sufficient to build accurate classifiers? What is the statistical significance of the associated error rates? In what ways can accuracy be considered dependent on the adopted classification scheme? How many genes are correlated with the pathology and how many are sufficient for an accurate colon cancer classification? The method we propose answers these questions whilst avoiding the potential pitfalls hidden in the analysis and interpretation of microarray data. Results: We estimate the generalization error, evaluated through the Leave-K-Out Cross Validation error, for three different classification schemes by varying the number of training examples and the number of the genes used. The statistical significance of the error rate is measured by using a permutation test. We provide a statistical analysis in terms of the frequencies of the genes involved in the classification. Using the whole set of genes, we found that the Weighted Voting Algorithm (WVA) classifier learns the distinction between normal and tumor specimens with 25 training examples, providing e = 21% (p = 0.045) as an error rate. This remains constant even when the number of examples increases. Moreover, Regularized Least Squares (RLS) and Support Vector Machines (SVM) classifiers can learn with only 15 training examples, with an error rate of e = 19% (p = 0.035) and e = 18% (p = 0.037) respectively. Moreover, the error rate
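
    The permutation test mentioned in this abstract can be illustrated with a minimal sketch. It assumes a toy one-dimensional feature and a nearest-class-mean classifier scored by leave-one-out error; the function names `loo_error` and `permutation_p_value` are hypothetical, and none of this reproduces the WVA, RLS or SVM classifiers or the microarray data of the paper.

```python
import random


def loo_error(features, labels):
    """Leave-one-out error of a nearest-class-mean classifier on 1-D features.

    Assumes every class keeps at least one member when a sample is held out.
    """
    errors = 0
    for i, (x, y) in enumerate(zip(features, labels)):
        sums, counts = {}, {}
        for j, (xj, yj) in enumerate(zip(features, labels)):
            if j != i:  # hold out sample i
                sums[yj] = sums.get(yj, 0.0) + xj
                counts[yj] = counts.get(yj, 0) + 1
        # Predict the class whose mean is closest to the held-out value.
        predicted = min(sums, key=lambda c: abs(x - sums[c] / counts[c]))
        errors += predicted != y
    return errors / len(labels)


def permutation_p_value(labels, error_fn, n_permutations=1000, seed=0):
    """Fraction of label shuffles whose error is at most the observed error."""
    rng = random.Random(seed)
    observed = error_fn(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if error_fn(shuffled) <= observed:
            hits += 1
    # The add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Hypothetical, well-separated toy data (not from the paper).
    features = [0.1, 0.2, 0.3, 0.4, 1.1, 1.2, 1.3, 1.4]
    labels = [0, 0, 0, 0, 1, 1, 1, 1]
    error, p = permutation_p_value(labels, lambda y: loo_error(features, y))
    print(f"error = {error:.2f}, p = {p:.3f}")
```

    A small p-value here means that an error rate as low as the observed one rarely arises when the labels carry no information about the features.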

  3. Classifying BCI signals from novice users with extreme learning machine

    Directory of Open Access Journals (Sweden)

    Rodríguez-Bermúdez Germán

    2017-07-01

    A brain computer interface (BCI) allows external devices to be controlled using only the electrical activity of the brain. In order to improve the system, several approaches have been proposed. However, it is usual to test algorithms with standard BCI signals from expert users or from repositories available on the Internet. In this work, the extreme learning machine (ELM) has been tested with signals from 5 novice users and compared with standard classification algorithms. Experimental results show that ELM is a suitable method to classify electroencephalogram signals from novice users.

  4. BIOPHARMACEUTICS CLASSIFICATION SYSTEM: A STRATEGIC TOOL FOR CLASSIFYING DRUG SUBSTANCES

    Directory of Open Access Journals (Sweden)

    Rohilla Seema

    2011-07-01

    The biopharmaceutical classification system (BCS) is a scientific approach for classifying drug substances based on their dose/solubility ratio and intestinal permeability. The BCS has been developed to allow prediction of in vivo pharmacokinetic performance of drug products from measurements of permeability and solubility. Drugs can be categorized into four BCS classes on the basis of permeability and solubility, namely: high permeability high solubility, high permeability low solubility, low permeability high solubility, and low permeability low solubility. The present review summarizes the principles, objectives, benefits, classification and applications of BCS.

  5. Linear discriminant analysis of character sequences using occurrences of words

    KAUST Repository

    Dutta, Subhajit

    2014-02-01

    Classification of character sequences, where the characters come from a finite set, arises in disciplines such as molecular biology and computer science. For discriminant analysis of such character sequences, the Bayes classifier based on Markov models turns out to have class boundaries defined by linear functions of occurrences of words in the sequences. It is shown that for such classifiers based on Markov models with unknown orders, if the orders are estimated from the data using cross-validation, the resulting classifier has Bayes risk consistency under suitable conditions. Even when Markov models are not valid for the data, we develop methods for constructing classifiers based on linear functions of occurrences of words, where the word length is chosen by cross-validation. Such linear classifiers are constructed using ideas of support vector machines, regression depth, and distance weighted discrimination. We show that classifiers with linear class boundaries have certain optimal properties in terms of their asymptotic misclassification probabilities. The performance of these classifiers is demonstrated in various simulated and benchmark data sets.
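
    The statement that Markov-model classifiers have boundaries linear in word occurrences can be sketched for the first-order case: the log-likelihood ratio of two first-order Markov chains is a weighted sum of two-letter word counts (the initial-state term is ignored here). The function names and the transition probabilities below are hypothetical illustrations, not taken from the paper.

```python
import math
from collections import Counter
from itertools import product


def word_counts(sequence, word_length, alphabet):
    """Count overlapping occurrences of every word of the given length."""
    counts = Counter(sequence[i:i + word_length]
                     for i in range(len(sequence) - word_length + 1))
    words = ["".join(p) for p in product(alphabet, repeat=word_length)]
    return [counts[w] for w in words]


def markov_weights(p_class1, p_class2, alphabet):
    """Weights log(P1(b|a) / P2(b|a)) for each two-letter word ab."""
    # product() yields pairs in the same order as the words in word_counts.
    return [math.log(p_class1[a][b] / p_class2[a][b])
            for a, b in product(alphabet, repeat=2)]


def linear_score(features, weights, bias=0.0):
    """Linear discriminant: positive favours class 1, negative class 2."""
    return sum(w * f for w, f in zip(weights, features)) + bias


if __name__ == "__main__":
    alphabet = "AB"
    # Hypothetical transition matrices: class 1 alternates, class 2 repeats.
    alternating = {"A": {"A": 0.2, "B": 0.8}, "B": {"A": 0.8, "B": 0.2}}
    repeating = {"A": {"A": 0.8, "B": 0.2}, "B": {"A": 0.2, "B": 0.8}}
    weights = markov_weights(alternating, repeating, alphabet)
    for seq in ("ABABAB", "AAABBB"):
        print(seq, linear_score(word_counts(seq, 2, alphabet), weights))
```

    Higher-order models follow the same pattern with longer words; in the abstract the word length is chosen by cross-validation, which this sketch does not attempt.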

  6. Discriminating complex networks through supervised NDR and Bayesian classifier

    Science.gov (United States)

    Yan, Ke-Sheng; Rong, Li-Li; Yu, Kai

    2016-12-01

    Discriminating complex networks is a particularly important task for the systematic study of networks. In order to discriminate unknown networks accurately, a large set of network measurements needs to be taken into account to consider network properties comprehensively. However, as we demonstrate in this paper, these measurements are in general nonlinearly correlated with each other, resulting in a wide variety of redundant measurements which unintentionally explain the same aspects of network properties. To solve this problem, we adopt supervised nonlinear dimensionality reduction (NDR) to eliminate the nonlinear redundancy and visualize networks in a low-dimensional projection space. Though unsupervised NDR can achieve the same aim, we illustrate that supervised NDR is more appropriate than unsupervised NDR for the discrimination task. After that, we apply a Bayesian classifier (BC) in the projection space to discriminate the unknown network, using the projection score vectors as the input of the classifier. We also demonstrate the feasibility and effectiveness of the proposed method on six extensively researched real networks, ranging from technological to social or biological. Moreover, the effectiveness and advantage of the proposed method are shown by comparison experiments with the existing method.

  7. Occlusion Handling via Random Subspace Classifiers for Human Detection.

    Science.gov (United States)

    Marín, Javier; Vázquez, David; López, Antonio M; Amores, Jaume; Kuncheva, Ludmila I

    2014-03-01

    This paper describes a general method to address partial occlusions for human detection in still images. The random subspace method (RSM) is chosen for building a classifier ensemble robust against partial occlusions. The component classifiers are chosen on the basis of their individual and combined performance. The main contribution of this work lies in our approach's capability to improve the detection rate when partial occlusions are present without compromising the detection performance on non-occluded data. In contrast to many recent approaches, we propose a method which does not require manual labeling of body parts, defining any semantic spatial components, or using additional data coming from motion or stereo. Moreover, the method can be easily extended to other object classes. The experiments are performed on three large datasets: the INRIA person dataset, the Daimler Multicue dataset, and a new challenging dataset, called PobleSec, in which a considerable number of targets are partially occluded. The different approaches are evaluated at the classification and detection levels for both partially occluded and non-occluded data. The experimental results show that our detector outperforms state-of-the-art approaches in the presence of partial occlusions, while offering performance and reliability similar to those of the holistic approach on non-occluded data. The datasets used in our experiments have been made publicly available for benchmarking purposes.
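
    As a rough illustration of the random subspace method named in this abstract, the sketch below trains several nearest-centroid classifiers, each on a random subset of features, and combines them by majority vote. It is a plain RSM under assumed toy data; the performance-based selection of component classifiers described by the authors is not reproduced, and all function names are hypothetical.

```python
import random
from collections import Counter


def train_centroids(samples, labels, feature_idx):
    """Per-class mean vectors restricted to the chosen feature indices."""
    sums, counts = {}, Counter(labels)
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(feature_idx))
        for k, f in enumerate(feature_idx):
            acc[k] += x[f]
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def train_random_subspace(samples, labels, n_members=15, subspace_size=2, seed=0):
    """Build an ensemble in which each member sees a random feature subset."""
    rng = random.Random(seed)
    n_features = len(samples[0])
    ensemble = []
    for _ in range(n_members):
        idx = rng.sample(range(n_features), subspace_size)
        ensemble.append((idx, train_centroids(samples, labels, idx)))
    return ensemble


def predict(ensemble, x):
    """Majority vote over the members' nearest-centroid decisions."""
    votes = Counter()
    for idx, centroids in ensemble:
        nearest = min(
            centroids,
            key=lambda y: sum((x[f] - c) ** 2 for f, c in zip(idx, centroids[y])),
        )
        votes[nearest] += 1
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical four-feature toy data with two classes.
    samples = [[0.0, 0.0, 0.0, 0.0], [0.1, 0.2, 0.0, 0.1],
               [1.0, 1.0, 1.0, 1.0], [0.9, 1.0, 0.8, 1.0]]
    labels = [0, 0, 1, 1]
    ensemble = train_random_subspace(samples, labels)
    print(predict(ensemble, [0.1, 0.1, 0.1, 0.1]))
```

    Because each member ignores most features, a corrupted (for example, occluded) feature affects only the members whose subset contains it, which is the intuition behind using RSM for occlusion handling.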

  8. REPTREE CLASSIFIER FOR IDENTIFYING LINK SPAM IN WEB SEARCH ENGINES

    Directory of Open Access Journals (Sweden)

    S.K. Jayanthi

    2013-01-01

    Search engines are used for retrieving information from the web. Most of the time, importance is placed on the top 10 results, and sometimes only the top 5, because of time constraints and reliance on the search engines. Users believe that the top 10 or 5 results are the most relevant. Here arises the problem of spamdexing, a method of deceiving search result quality. Falsified metrics, such as inserting an enormous number of keywords or links in a website, may take that website to the top 10 or 5 positions. This paper proposes a classifier based on the Reptree (Regression tree representative). As an initial step, link-based features such as neighbors, pagerank, truncated pagerank, trustrank and assortativity-related attributes are inferred. Based on these features, a tree is constructed. The tree uses the feature inference to differentiate spam sites from legitimate sites. The WEBSPAM-UK-2007 dataset is taken as a base. It is preprocessed and converted into five datasets: FEATA, FEATB, FEATC, FEATD and FEATE. Only link-based features are used in the experiments; this paper focuses on link spam alone. Finally, a representative tree is created which classifies the web spam entries more precisely. Results are given. Regression tree classification seems to perform well, as shown through experiments.

  9. Automated morphological analysis approach for classifying colorectal microscopic images

    Science.gov (United States)

    Marghani, Khaled A.; Dlay, Satnam S.; Sharif, Bayan S.; Sims, Andrew J.

    2003-10-01

    Automated medical image diagnosis using quantitative measurements is extremely helpful for cancer prognosis to reach a high degree of accuracy and thus make reliable decisions. In this paper, six morphological features based on texture analysis were studied in order to categorize normal and cancer colon mucosa. They were derived after a series of pre-processing steps to generate a set of different shape measurements. Based on the shape and the size, six features known as Euler Number, Equivalent Diameter, Solidity, Extent, Elongation, and Shape Factor AR were extracted. Mathematical morphology is used first to remove background noise from segmented images and then to obtain different morphological measures to describe the shape, size, and texture of colon glands. The proposed automated system is tested by classifying 102 microscopic samples of colorectal tissues, which consist of 44 normal colon mucosa and 58 cancerous samples. The results were first statistically evaluated using the one-way ANOVA method in order to examine the significance of each feature extracted. Then significant features are selected in order to classify the dataset into two categories. Finally, using two discrimination methods, a linear method and k-means clustering, important classification factors were estimated. In brief, this study demonstrates that abnormalities in low-level power tissue morphology can be distinguished using quantitative image analysis. This investigation shows the potential of an automated vision system in histopathology. Furthermore, it has the advantage of being objective and, more importantly, of being a valuable diagnostic decision support tool.

  10. A New Qualitative Typology to Classify Treading Water Movement Patterns

    Directory of Open Access Journals (Sweden)

    Christophe Schnitzler, Chris Button, James L. Croft, Ludovic Seifert

    2015-09-01

    This study proposes a new qualitative typology that can be used to classify learners treading water into different skill-based categories. To establish the typology, 38 participants were videotaped while treading water and their movement patterns were qualitatively analyzed by two experienced biomechanists. Thirteen sport science students were then asked to classify eight of the original participants after watching a brief tutorial video about how to use the typology. To examine intra-rater consistency, each participant was presented in a random order three times. Generalizability (G) and Decision (D) studies were performed to estimate the importance of variance due to rater, occasion, video and the interactions between them, and to determine the reliability of the raters' answers. A typology of five general classes of coordination was defined amongst the original 38 participants. The G-study showed an accurate and reliable assessment of different pattern types, with a percentage of correct classification of 80.1%, an overall Fleiss' Kappa coefficient K = 0.6, and an overall generalizability φ coefficient of 0.99. This study showed that the new typology proposed to characterize the behaviour of individuals treading water was both accurate and highly reliable. Movement pattern classification using the typology might help practitioners distinguish between different skill-based behaviours and potentially guide instruction of key aquatic survival skills.
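
    For readers unfamiliar with the Fleiss' Kappa coefficient reported in this abstract, the following is a minimal sketch of the standard formula. The input layout (one row per rated subject, one column per category, each cell holding the number of raters choosing that category) and the example counts are assumptions for illustration; they are not the study's data.

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for a table where ratings[i][j] is the number of raters
    who assigned subject i to category j (same rater count for every subject).
    """
    n_subjects = len(ratings)
    n_raters = sum(ratings[0])
    n_categories = len(ratings[0])
    total = n_subjects * n_raters

    # Overall proportion of assignments falling into each category.
    p_cat = [sum(row[j] for row in ratings) / total for j in range(n_categories)]
    # Per-subject agreement: share of rater pairs that agree.
    p_subj = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
              for row in ratings]

    p_observed = sum(p_subj) / n_subjects
    p_expected = sum(p * p for p in p_cat)
    # Undefined (division by zero) if every rating falls into one category.
    return (p_observed - p_expected) / (1 - p_expected)


if __name__ == "__main__":
    # Hypothetical table: 4 videos, 3 raters, 2 pattern classes.
    table = [[3, 0], [2, 1], [0, 3], [1, 2]]
    print(round(fleiss_kappa(table), 3))
```

    Values near 1 indicate agreement well beyond chance, values near 0 indicate chance-level agreement, and negative values indicate less agreement than chance would produce.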

  11. Comparing Different Classifiers in Sensory Motor Brain Computer Interfaces.

    Directory of Open Access Journals (Sweden)

    Hossein Bashashati

    A problem that impedes progress in Brain-Computer Interface (BCI) research is the difficulty of reproducing the results of different papers. Comparing different algorithms at present is very difficult. Some improvements have been made by the use of standard datasets to evaluate different algorithms. However, a comparison framework is still lacking. In this paper, we construct a new general comparison framework to compare different algorithms on several standard datasets. All these datasets correspond to sensory motor BCIs, and are obtained from 21 subjects during their operation of synchronous BCIs and 8 subjects using self-paced BCIs. Other researchers can use our framework to compare their own algorithms on their own datasets. We have compared the performance of different popular classification algorithms over these 29 subjects and performed statistical tests to validate our results. Our findings suggest that, for a given subject, the choice of the classifier for a BCI system depends on the feature extraction method used in that BCI system. This is contrary to most publications in the field, which have used Linear Discriminant Analysis (LDA) as the classifier of choice for BCI systems.

  12. Comparing Different Classifiers in Sensory Motor Brain Computer Interfaces.

    Science.gov (United States)

    Bashashati, Hossein; Ward, Rabab K; Birch, Gary E; Bashashati, Ali

    2015-01-01

    A problem that impedes progress in Brain-Computer Interface (BCI) research is the difficulty of reproducing the results of different papers. Comparing different algorithms at present is very difficult. Some improvements have been made by the use of standard datasets to evaluate different algorithms. However, a comparison framework is still lacking. In this paper, we construct a new general comparison framework to compare different algorithms on several standard datasets. All these datasets correspond to sensory motor BCIs, and are obtained from 21 subjects during their operation of synchronous BCIs and 8 subjects using self-paced BCIs. Other researchers can use our framework to compare their own algorithms on their own datasets. We have compared the performance of different popular classification algorithms over these 29 subjects and performed statistical tests to validate our results. Our findings suggest that, for a given subject, the choice of the classifier for a BCI system depends on the feature extraction method used in that BCI system. This is contrary to most publications in the field, which have used Linear Discriminant Analysis (LDA) as the classifier of choice for BCI systems.

  13. Classifying paragraph types using linguistic features: Is paragraph positioning important?

    Directory of Open Access Journals (Sweden)

    Scott A. Crossley, Kyle Dempsey & Danielle S. McNamara

    2011-12-01

    This study examines the potential for computational tools and human raters to classify paragraphs based on positioning. In this study, a corpus of 182 paragraphs was collected from student argumentative essays. The paragraphs selected were initial, middle, and final paragraphs, and their positioning related to introductory, body, and concluding paragraphs. The paragraphs were analyzed by the computational tool Coh-Metrix on a variety of linguistic features with correlates to textual cohesion and lexical sophistication and then modeled using statistical techniques. The paragraphs were also classified by human raters based on paragraph positioning. The performance of the reported model was well above chance and reported an accuracy of classification that was similar to human judgments of paragraph type (66% accuracy for humans versus 65% accuracy for our model). The model's accuracy increased when longer paragraphs that provided more linguistic coverage and paragraphs judged by human raters to be of higher quality were examined. The findings support the notion that paragraph types contain specific linguistic features that allow them to be distinguished from one another. The findings reported in this study should prove beneficial in classroom writing instruction and in automated writing assessment.

  14. Guidelines to classify subject groups in sport-science research.

    Science.gov (United States)

    De Pauw, Kevin; Roelands, Bart; Cheung, Stephen S; de Geus, Bas; Rietjens, Gerard; Meeusen, Romain

    2013-03-01

    The aim of this systematic literature review was to outline the various preexperimental maximal cycle-test protocols, terminology, and performance indicators currently used to classify subject groups in sport-science research and to construct a classification system for cycling-related research. A database of 130 subject-group descriptions contains information on preexperimental maximal cycle-protocol designs, terminology of the subject groups, biometrical and physiological data, cycling experience, and parameters. Kolmogorov-Smirnov test, 1-way ANOVA, post hoc Bonferroni (P data on a subject group, researchers apply various terms to define the group. To solve this complexity, the authors introduced the neutral term performance levels 1 to 5, representing untrained, recreationally trained, trained, well-trained, and professional subject groups, respectively. The most cited parameter in literature to define subject groups is relative VO(2max), and therefore no overlap between different performance levels may occur for this principal parameter. Another significant cycling parameter is the absolute PPO. The description of additional physiological information and current and past cycling data is advised. This review clearly shows the need to standardize the procedure for classifying subject groups. Recommendations are formulated concerning preexperimental testing, terminology, and performance indicators.

  15. Even more Chironomid species for classifying lake nutrient status

    Directory of Open Access Journals (Sweden)

    Les Ruse

    2015-07-01

    The European Union Water Framework Directive (WFD) classifies the ecological status of a waterbody by the determination of its natural reference state to provide a measure of perturbation by human impacts based on taxonomic composition and abundance of aquatic species. Ruse (2010; 2011) has provided methods of assessing anthropogenic perturbations to lake ecological status, in terms of nutrient enrichment and acidification, by analysing collections of floating pupal exuviae discarded by emerging adult Chironomidae. The previous nutrient assessment method was derived from chironomid and environmental data collected during 178 lake surveys of all WFD types found in Britain. Canonical Correspondence Analysis provided species optima in relation to phosphate and nitrogen concentrations. Species found in fewer than three surveys were excluded from analysis in case of spurious association with environmental values. Since Ruse (2010), an additional 72 lakes have been surveyed, adding 31 more species for use in nutrient status assessment. These additional scoring species are reported here. The practical application of the Chironomid Pupal Exuvial Technique (CPET) to classify WFD lake nutrient status is demonstrated using CPET survey data from lakes in Poland.

  16. An Ocular Protein Triad Can Classify Four Complex Retinal Diseases

    Science.gov (United States)

    Kuiper, J. J. W.; Beretta, L.; Nierkens, S.; van Leeuwen, R.; ten Dam-van Loon, N. H.; Ossewaarde-van Norel, J.; Bartels, M. C.; de Groot-Mijnes, J. D. F.; Schellekens, P.; de Boer, J. H.; Radstake, T. R. D. J.

    2017-01-01

    Retinal diseases generally are vision-threatening conditions that warrant appropriate clinical decision-making, which currently depends solely upon extensive clinical screening by specialized ophthalmologists. In an era where molecular assessment has improved dramatically, we aimed at the identification of biomarkers in 175 ocular fluids to classify four archetypical ocular conditions affecting the retina (age-related macular degeneration, idiopathic non-infectious uveitis, primary vitreoretinal lymphoma, and rhegmatogenous retinal detachment) with one single test. Unsupervised clustering of ocular proteins revealed a classification strikingly similar to the clinical phenotypes of each disease group studied. We developed and independently validated a parsimonious model based merely on three proteins, interleukin (IL)-10, IL-21, and angiotensin converting enzyme (ACE), that could correctly classify patients with an overall accuracy, sensitivity and specificity of 86.7%, 79.4% and 92.5%, respectively. Here, we provide proof-of-concept for molecular profiling as a diagnostic aid for ophthalmologists in the care of patients with retinal conditions. PMID:28128370
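
    The accuracy, sensitivity and specificity figures quoted in this abstract are standard confusion-matrix quantities. The sketch below computes them for one class against the rest; how the authors aggregated the per-class values into overall figures is not stated in the abstract, so that step is left out. The function name, labels and predictions are hypothetical.

```python
def one_vs_rest_metrics(true_labels, predicted_labels, positive):
    """Accuracy, sensitivity and specificity treating `positive` as the
    target class and every other label as negative.

    Assumes at least one positive and one negative sample are present.
    """
    tp = fp = tn = fn = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if truth == positive:
            if guess == positive:
                tp += 1
            else:
                fn += 1
        else:
            if guess == positive:
                fp += 1
            else:
                tn += 1
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


if __name__ == "__main__":
    # Hypothetical diagnoses, not data from the study.
    truth = ["AMD", "AMD", "RD", "RD", "RD"]
    guess = ["AMD", "RD", "RD", "RD", "AMD"]
    print(one_vs_rest_metrics(truth, guess, positive="AMD"))
```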

  17. Interface Prostheses with Classifier-Feedback based User Training.

    Science.gov (United States)

    Fang, Yinfeng; Zhou, Dalin; Li, Kairu; Liu, Honghai

    2016-12-21

    It is evident that user training significantly affects the performance of pattern-recognition based myoelectric prosthetic device control. Despite plausible classification accuracy on offline datasets, online accuracy usually suffers from changes in physiological conditions and electrode displacement. The user's ability to generate consistent EMG patterns can be enhanced via proper user training strategies in order to improve online performance. This study proposes a clustering-feedback strategy that provides real-time feedback to users by means of a visualised online EMG signal input as well as the centroids of the training samples, whose dimensionality is reduced to a minimal number by dimension reduction. Clustering-feedback provides a criterion that guides users to adjust motion gestures and muscle contraction forces intentionally. The experimental results demonstrate that hand motion recognition accuracy increases steadily as the clustering-feedback based user training progresses, while conventional classifier-feedback methods, i.e. label-feedback, hardly achieve any improvement. The results indicate that the use of proper classifier-feedback can accelerate the process of user training, and suggest a promising future for amputees with limited or no experience in pattern-recognition based prosthetic device manipulation.

  18. Using Narrow Band Photometry to Classify Stars and Brown Dwarfs

    CERN Document Server

    Mainzer, A K; Sievers, J L; Young, E T; McLean, Ian S.

    2004-01-01

    We present a new system of narrow band filters in the near infrared that can be used to classify stars and brown dwarfs. This set of four filters, spanning the H band, can be used to identify molecular features unique to brown dwarfs, such as H2O and CH4. The four filters are centered at 1.495 um (H2O), 1.595 um (continuum), 1.66 um (CH4), and 1.75 um (H2O). Using two H2O filters allows us to solve for individual objects' reddenings. This can be accomplished by constructing a color-color-color cube and rotating it until the reddening vector disappears. We created a model of predicted color-color-color values for different spectral types by integrating filter bandpass data with spectra of known stars and brown dwarfs. We validated this model by making photometric measurements of seven known L and T dwarfs, ranging from L1 - T7.5. The photometric measurements agree with the model to within +/-0.1 mag, allowing us to create spectral indices for different spectral types. We can classify A through early M stars to...

  19. Accuracy of Birth Certificate Data for Classifying Preterm Birth.

    Science.gov (United States)

    Stout, Molly J; Macones, George A; Tuuli, Methodius G

    2017-05-01

    Classifying preterm birth as spontaneous or indicated is critical both for clinical care and research, yet the accuracy of classification based on different data sources is unclear. We examined the accuracy of preterm birth classification as spontaneous or indicated based on birth certificate data. This is a retrospective cohort study of 123 birth certificates from preterm births in Missouri. Correct classification of spontaneous or indicated preterm birth subtype was based on multi-provider (RN, MFM Fellow, MFM attending) consensus after full medical record review. A categorisation algorithm based on clinical data available in the birth certificate was designed a priori and classification was performed by a single investigator according to the algorithm. Accuracy of birth certificate classification as spontaneous or indicated was compared to the consensus classification. Errors in misclassification were explored. Classification based on birth certificates was correct for 66% of preterm births. Most errors in classification by birth certificate occurred in classifying a birth as spontaneous when it was in fact indicated. The vast majority of errors occurred when preterm rupture of membranes (≥12 h) was checked on the birth certificate causing classification as spontaneous when there was a maternal or fetal indication for delivery. Birth certificate classification overestimated spontaneous preterm birth and underestimated indicated preterm birth compared to classification performed from medical record review. Revisions to birth certificate clinical data would allow more accurate population level surveillance of preterm birth subtypes. © 2017 John Wiley & Sons Ltd.

  20. Using Bayesian neural networks to classify forest scenes

    Science.gov (United States)

    Vehtari, Aki; Heikkonen, Jukka; Lampinen, Jouko; Juujarvi, Jouni

    1998-10-01

    We present results that compare the performance of Bayesian learning methods for neural networks on the task of classifying forest scenes into trees and background. The classification task is demanding due to the texture richness of the trees, occlusions of the forest scene objects and diverse lighting conditions under operation. This makes it difficult to determine which image features are optimal for the classification. A natural way to proceed is to extract many different types of potentially suitable features, and to evaluate their usefulness in later processing stages. One approach to coping with a large number of features is to use Bayesian methods to control the model complexity. Bayesian learning uses a prior on model parameters, combines this with evidence from the training data, and then integrates over the resulting posterior to make predictions. With this method, we can use large networks and many features without fear of overfitting. For this classification task we compare two Bayesian learning methods for multi-layer perceptron (MLP) neural networks: (1) The evidence framework of MacKay uses a Gaussian approximation to the posterior weight distribution and maximizes with respect to hyperparameters. (2) In a Markov Chain Monte Carlo (MCMC) method due to Neal, the posterior distribution of the network parameters is numerically integrated using the MCMC method. As baseline classifiers for comparison we use (3) MLP early stop committee, (4) K-nearest-neighbor and (5) Classification And Regression Tree.

  1. Integrating language models into classifiers for BCI communication: a review

    Science.gov (United States)

    Speier, W.; Arnold, C.; Pouratian, N.

    2016-06-01

    Objective. The present review systematically examines the integration of language models to improve classifier performance in brain-computer interface (BCI) communication systems. Approach. The domain of natural language has been studied extensively in linguistics and has been used in the natural language processing field in applications including information extraction, machine translation, and speech recognition. While these methods have been used for years in traditional augmentative and assistive communication devices, information about the output domain has largely been ignored in BCI communication systems. Over the last few years, BCI communication systems have started to leverage this information through the inclusion of language models. Main results. Although this movement began only recently, studies have already shown the potential of language integration in BCI communication and it has become a growing field in BCI research. BCI communication systems using language models in their classifiers have progressed down several parallel paths, including: word completion; signal classification; integration of process models; dynamic stopping; unsupervised learning; error correction; and evaluation. Significance. Each of these methods has shown significant progress, but they have largely been addressed separately. Combining these methods could use the full potential of language models, yielding further performance improvements. This integration should be a priority as the field works to create a BCI system that meets the needs of the amyotrophic lateral sclerosis population.

  2. An Ocular Protein Triad Can Classify Four Complex Retinal Diseases

    Science.gov (United States)

    Kuiper, J. J. W.; Beretta, L.; Nierkens, S.; van Leeuwen, R.; Ten Dam-van Loon, N. H.; Ossewaarde-van Norel, J.; Bartels, M. C.; de Groot-Mijnes, J. D. F.; Schellekens, P.; de Boer, J. H.; Radstake, T. R. D. J.

    2017-01-01

    Retinal diseases generally are vision-threatening conditions that warrant appropriate clinical decision-making, which currently depends solely upon extensive clinical screening by specialized ophthalmologists. In an era where molecular assessment has improved dramatically, we aimed at the identification of biomarkers in 175 ocular fluids to classify four archetypical ocular conditions affecting the retina (age-related macular degeneration, idiopathic non-infectious uveitis, primary vitreoretinal lymphoma, and rhegmatogenous retinal detachment) with one single test. Unsupervised clustering of ocular proteins revealed a classification strikingly similar to the clinical phenotypes of each disease group studied. We developed and independently validated a parsimonious model based merely on three proteins, interleukin (IL)-10, IL-21, and angiotensin converting enzyme (ACE), that could correctly classify patients with an overall accuracy, sensitivity and specificity of 86.7%, 79.4% and 92.5%, respectively. Here, we provide proof-of-concept for molecular profiling as a diagnostic aid for ophthalmologists in the care of patients with retinal conditions.

  3. Decision Tree Classifiers for Star/Galaxy Separation

    CERN Document Server

    Vasconcellos, E C; Gal, R R; LaBarbera, F L; Capelato, H V; Velho, H F Campos; Trevisan, M; Ruiz, R S R

    2010-01-01

    We study the star/galaxy classification efficiency of 13 different decision tree algorithms applied to photometric objects in the Sloan Digital Sky Survey Data Release Seven (SDSS DR7). Each algorithm is defined by a set of parameters which, when varied, produce different final classification trees. We extensively explore the parameter space of each algorithm, using the set of $884,126$ SDSS objects with spectroscopic data as the training set. The efficiency of star-galaxy separation is measured using the completeness function. We find that the Functional Tree algorithm (FT) yields the best results as measured by the mean completeness in two magnitude intervals: $14 \le r \le 21$ ($85.2\%$) and $r \ge 19$ ($82.1\%$). We compare the performance of the tree generated with the optimal FT configuration to the classifications provided by the SDSS parametric classifier, 2DPHOT and Ball et al. (2006). We find that our FT classifier is comparable or better in completeness over the full magnitude range $15 \le r \le 21$, with m...

  4. Phenotype Recognition with Combined Features and Random Subspace Classifier Ensemble

    Directory of Open Access Journals (Sweden)

    Pham Tuan D

    2011-04-01

    Full Text Available Abstract Background Automated, image based high-content screening is a fundamental tool for discovery in biological science. Modern robotic fluorescence microscopes are able to capture thousands of images from massively parallel experiments such as RNA interference (RNAi) or small-molecule screens. As such, efficient computational methods are required for automatic cellular phenotype identification capable of dealing with large image data sets. In this paper we investigated an efficient method for the extraction of quantitative features from images by combining second order statistics, or Haralick features, with curvelet transform. A random subspace based classifier ensemble with multiple layer perceptron (MLP) as the base classifier was then exploited for classification. Haralick features estimate image properties related to second-order statistics based on the grey level co-occurrence matrix (GLCM), which has been extensively used for various image processing applications. The curvelet transform has a more sparse representation of the image than wavelet, thus offering a description with higher time frequency resolution and high degree of directionality and anisotropy, which is particularly appropriate for many images rich with edges and curves. A combined feature description from Haralick feature and curvelet transform can further increase the accuracy of classification by taking their complementary information. We then investigate the applicability of the random subspace (RS) ensemble method for phenotype classification based on microscopy images. A base classifier is trained with a RS sampled subset of the original feature set and the ensemble assigns a class label by majority voting. Results Experimental results on the phenotype recognition from three benchmarking image sets including HeLa, CHO and RNAi show the effectiveness of the proposed approach. The combined feature is better than any individual one in the classification accuracy. The

  5. A multiple classifier approach for spectral-spatial classification of hyperspectral data

    OpenAIRE

    Tarabalka, Yuliya; Benediktsson, Jon Atli; Tilton, James; Chanussot, Jocelyn

    2010-01-01

    International audience; A new multiple classifier method for spectral-spatial classification of hyperspectral images is proposed. Several classifiers are used independently to classify an image. For every pixel, if all the classifiers have assigned this pixel to the same class, the pixel is kept as a marker, i.e., a seed of the spatial region, with the corresponding class label. We propose to use spectral-spatial classifiers at the preliminary step of the marker selection procedure, each of t...
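
    The marker-selection rule described above (keep a pixel only when all classifiers agree on its class) can be sketched as follows. This is a minimal illustration assuming each classifier's output is already available as a 2D grid of integer labels; the function name and the toy label maps are hypothetical, and the spectral-spatial classifiers themselves are not modelled.

```python
def select_markers(label_maps):
    """Keep a pixel as a marker only where every classifier assigns the same class.

    label_maps: list of 2D lists (one per classifier) of integer class labels.
    Returns a 2D list with the agreed label, or None where classifiers disagree.
    """
    rows, cols = len(label_maps[0]), len(label_maps[0][0])
    markers = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            votes = {m[r][c] for m in label_maps}
            if len(votes) == 1:  # unanimous agreement
                markers[r][c] = votes.pop()
    return markers

# Three hypothetical classification maps of a 2 x 2 image.
maps = [
    [[1, 1], [2, 3]],
    [[1, 2], [2, 3]],
    [[1, 1], [2, 1]],
]
markers = select_markers(maps)
```

    Pixels left as None would then be labelled later by growing regions from the retained markers.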

  6. Least Square Support Vector Machine Classifier vs a Logistic Regression Classifier on the Recognition of Numeric Digits

    Directory of Open Access Journals (Sweden)

    Danilo A. López-Sarmiento

    2013-11-01

    Full Text Available This paper compares the performance of a multi-class least squares support vector machine (LSSVM mc) with that of a multi-class logistic regression classifier on the problem of recognizing handwritten numeric digits (0-9). The comparison used a data set of 5000 images of handwritten numeric digits (500 images for each digit from 0 to 9), each image being 20 x 20 pixels. The inputs to each system were 400-dimensional vectors corresponding to each image (no feature extraction was performed). Both classifiers used the OneVsAll strategy to enable multi-class classification and a random cross-validation function for the process of minimizing the cost function. The comparison metrics were precision and training time under the same computational conditions. Both techniques showed a precision above 95 %, with LS-SVM slightly more accurate. However, we found a marked difference in computational cost: LS-SVM training requires 16.42 % less time than the logistic regression model under the same low computational conditions.

  7. An Approach for Identifying Cytokines Based on a Novel Ensemble Classifier

    Directory of Open Access Journals (Sweden)

    Quan Zou

    2013-01-01

    Full Text Available In biology, it is meaningful and important to identify cytokines and investigate their various functions and biochemical mechanisms. However, several issues remain, including the large scale of benchmark datasets, serious imbalance of data, and discovery of new gene families. In this paper, we employ a machine learning approach based on a novel ensemble classifier to predict cytokines. We directly selected amino acid sequences as research objects. First, we pretreated the benchmark data accurately. Next, we analyzed the physicochemical properties and distribution of whole amino acids and then extracted a group of 120-dimensional (120D) valid features to represent sequences. Third, in view of the serious imbalance in benchmark datasets, we utilized a sampling approach based on the synthetic minority oversampling technique algorithm and K-means clustering undersampling algorithm to rebuild the training set. Finally, we built a library for dynamic selection and circulating combination based on clustering (LibD3C) and employed the new training set to realize cytokine classification. Experiments showed that the geometric mean of sensitivity and specificity obtained through our approach is as high as 93.3%, which indicates that our approach is effective for identifying cytokines.
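
    The reported metric, the geometric mean of sensitivity and specificity, is a common choice for imbalanced data because it is high only when both classes are recognized well. A minimal sketch of how it is computed from binary labels follows; the label vectors are made-up illustrations, not the study's data.

```python
import math

def sensitivity_specificity(y_true, y_pred):
    """Compute sensitivity (true positive rate) and specificity (true negative rate)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity."""
    sens, spec = sensitivity_specificity(y_true, y_pred)
    return math.sqrt(sens * spec)

# Hypothetical labels: 1 = cytokine, 0 = non-cytokine.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
score = g_mean(y_true, y_pred)
```

    A classifier that labels everything as the majority class would score zero on this metric, which is why it suits the imbalanced setting described above.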

  8. An efficient fully unsupervised video object segmentation scheme using an adaptive neural-network classifier architecture.

    Science.gov (United States)

    Doulamis, A; Doulamis, N; Ntalianis, K; Kollias, S

    2003-01-01

    In this paper, an unsupervised video object (VO) segmentation and tracking algorithm is proposed based on an adaptable neural-network architecture. The proposed scheme comprises: 1) a VO tracking module and 2) an initial VO estimation module. Object tracking is handled as a classification problem and implemented through an adaptive network classifier, which provides better results compared to conventional motion-based tracking algorithms. Network adaptation is accomplished through an efficient and cost effective weight updating algorithm, providing a minimum degradation of the previous network knowledge and taking into account the current content conditions. A retraining set is constructed and used for this purpose based on initial VO estimation results. Two different scenarios are investigated. The first concerns extraction of human entities in video conferencing applications, while the second exploits depth information to identify generic VOs in stereoscopic video sequences. Human face/ body detection based on Gaussian distributions is accomplished in the first scenario, while segmentation fusion is obtained using color and depth information in the second scenario. A decision mechanism is also incorporated to detect time instances for weight updating. Experimental results and comparisons indicate the good performance of the proposed scheme even in sequences with complicated content (object bending, occlusion).

  9. A Speedy Cardiovascular Diseases Classifier Using Multiple Criteria Decision Analysis

    Directory of Open Access Journals (Sweden)

    Wah Ching Lee

    2015-01-01

    Full Text Available Each year, some 30 percent of global deaths are caused by cardiovascular diseases. This figure is worsening due to both the increasing elderly population and severe shortages of medical personnel. The development of a cardiovascular diseases classifier (CDC) for auto-diagnosis will help address the problem. Former CDCs did not achieve quick evaluation of cardiovascular diseases. In this letter, a new CDC to achieve speedy detection is investigated. This investigation incorporates analytic hierarchy process (AHP)-based multiple criteria decision analysis (MCDA) to develop feature vectors using a Support Vector Machine. The MCDA facilitates the efficient assignment of appropriate weightings to potential patients, thus scaling down the number of features. Since the new CDC will only adopt the most meaningful features for discrimination between healthy persons and cardiovascular disease patients, a speedy detection of cardiovascular diseases has been successfully implemented.
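
    In the analytic hierarchy process, criterion weights are commonly taken from the principal eigenvector of a pairwise comparison matrix. The sketch below shows that step with plain power iteration. The matrix values and the three criteria they compare are hypothetical, since the abstract does not give the comparisons actually used, and this is not the authors' implementation.

```python
def ahp_weights(pairwise, iterations=100):
    """Approximate the normalized principal eigenvector of a pairwise
    comparison matrix by power iteration; the entries sum to 1."""
    n = len(pairwise)
    w = [1.0 / n] * n
    for _ in range(iterations):
        w_new = [sum(pairwise[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(w_new)
        w = [v / total for v in w_new]
    return w

# Hypothetical comparisons among three candidate features:
# entry [i][j] states how much more important feature i is than feature j.
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(pairwise)
```

    Features with small weights are natural candidates for removal, which matches the abstract's goal of scaling down the number of features.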

  10. Road network extraction in classified SAR images using genetic algorithm

    Institute of Scientific and Technical Information of China (English)

    肖志强; 鲍光淑; 蒋晓确

    2004-01-01

    Due to the complicated background of objectives and speckle noise, it is almost impossible to extract roads directly from original synthetic aperture radar (SAR) images. A method is proposed for extracting road networks from high-resolution SAR images. Firstly, fuzzy C-means is used to classify the filtered SAR image in an unsupervised manner, and the road pixels are isolated from the image to simplify the extraction of the road network. Secondly, according to the features of roads and the membership of pixels to roads, a road model is constructed, which reduces the extraction of the road network to a search for globally optimal continuous curves that pass through some seed points. Finally, regarding the curves as individuals and coding a chromosome using integer code of variance relative to coordinates, the genetic operations are used to search for globally optimal roads. The experimental results show that the algorithm can effectively extract road networks from high-resolution SAR images.

  11. Using point-set compression to classify folk songs

    DEFF Research Database (Denmark)

    Meredith, David

    2014-01-01

    -neighbour algorithm and leave-one-out cross-validation to classify the 360 melodies into tune families. The classifications produced by the algorithms were compared with a ground-truth classification prepared by expert musicologists. Twelve of the thirteen compressors used in the experiment were based...... on the discovery of translational equivalence classes (TECs) of maximal translatable patterns (MTPs) in point-set representations of the melodies. The twelve algorithms consisted of four variants of each of three basic algorithms, COSIATEC, SIATECCompress and Forth’s algorithm. The main difference between...... similarity between folk-songs for classification purposes is highly dependent upon the actual compressor chosen. Furthermore, it seems that compressors based on finding maximal repeated patterns in point-set representations of music show more promise for NCD-based music classification than general...
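
    The normalized compression distance (NCD) mentioned above can be sketched with any compressor. The example below uses zlib, a general-purpose compressor swapped in purely for illustration; it is not one of the point-set compressors (COSIATEC, SIATECCompress, Forth's algorithm) evaluated in the study. The choice of k = 1 for the nearest-neighbour rule and the toy byte-string "melodies" are assumptions as well.

```python
import zlib

def csize(data: bytes) -> int:
    """Compressed size in bytes, used as a stand-in for information content."""
    return len(zlib.compress(data, 9))

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx, cy, cxy = csize(x), csize(y), csize(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

def loo_1nn_accuracy(items):
    """Leave-one-out accuracy of a 1-nearest-neighbour rule under NCD.

    items: list of (bytes, label) pairs.
    """
    correct = 0
    for i, (xi, yi) in enumerate(items):
        others = [j for j in range(len(items)) if j != i]
        nearest = min(others, key=lambda j: ncd(xi, items[j][0]))
        correct += items[nearest][1] == yi
    return correct / len(items)

# Hypothetical pitch sequences labelled with made-up tune families.
melodies = [
    (bytes([60, 62, 64, 65, 67, 65, 64, 62] * 8), "family_a"),
    (bytes([60, 62, 64, 65, 67, 65, 64, 60] * 8), "family_a"),
    (bytes([72, 71, 69, 67, 69, 71, 72, 76] * 8), "family_b"),
    (bytes([72, 71, 69, 67, 69, 71, 72, 74] * 8), "family_b"),
]
accuracy = loo_1nn_accuracy(melodies)
```

    Swapping `csize` for a different compressor changes the resulting classification, which is the dependence on the chosen compressor that the abstract reports.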

  12. Statistical Mechanical Development of a Sparse Bayesian Classifier

    Science.gov (United States)

    Uda, Shinsuke; Kabashima, Yoshiyuki

    2005-08-01

    The demand for extracting rules from high dimensional real world data is increasing in various fields. However, the possible redundancy of such data sometimes makes it difficult to obtain a good generalization ability for novel samples. To resolve this problem, we provide a scheme that reduces the effective dimensions of data by pruning redundant components for bicategorical classification based on the Bayesian framework. First, the potential of the proposed method is confirmed in ideal situations using the replica method. Unfortunately, performing the scheme exactly is computationally difficult. So, we next develop a tractable approximation algorithm, which turns out to offer nearly optimal performance in ideal cases when the system size is large. Finally, the efficacy of the developed classifier is experimentally examined for a real world problem of colon cancer classification, which shows that the developed method can be practically useful.

  13. Handwritten Bangla Alphabet Recognition using an MLP Based Classifier

    CERN Document Server

    Basu, Subhadip; Sarkar, Ram; Kundu, Mahantapas; Nasipuri, Mita; Basu, Dipak Kumar

    2012-01-01

    The work presented here involves the design of a Multi Layer Perceptron (MLP) based classifier for recognition of the handwritten Bangla alphabet using a 76-element feature set. Bangla is the second most popular script and language in the Indian subcontinent and the fifth most popular language in the world. The feature set developed for representing handwritten characters of the Bangla alphabet includes 24 shadow features, 16 centroid features and 36 longest-run features. Recognition performances of the MLP designed to work with this feature set are experimentally observed as 86.46% and 75.05% on the samples of the training and the test sets respectively. The work has useful application in the development of a complete OCR system for handwritten Bangla text.

  14. Should Hypersexual Disorder be Classified as an Addiction?

    Science.gov (United States)

    Kor, Ariel; Fogel, Yehuda; Reid, Rory C; Potenza, Marc N

    2013-01-01

    Hypersexual behavior has been documented within clinical and research settings over the past decade. Despite recent research on hypersexuality and its associated features, many questions remain about how best to define and classify hypersexual behavior. Diagnostic criteria for Hypersexual Disorder (HD) have been proposed for the DSM-5 and a preliminary field trial has lent some support to the reliability and validity of the HD diagnosis. However, debate exists with respect to the extent to which the disorder might be categorized as a non-substance or behavioral addiction. In this article, we will discuss this debate in the context of data citing similarities and differences between hypersexual disorder, drug addictions, and pathological gambling. The authors of this paper conclude that despite many similarities between the features of hypersexual behavior and substance-related disorders, the research on HD is in its infancy and much remains to be learned before definitively characterizing HD as an addiction.

  15. Classifying orbits in the restricted three-body problem

    CERN Document Server

    Zotos, Euaggelos E

    2015-01-01

    The case of the planar circular restricted three-body problem is used as a test field in order to determine the character of the orbits of a small body which moves under the gravitational influence of the two heavy primary bodies. We conduct a thorough numerical analysis on the phase space mixing by classifying initial conditions of orbits and distinguishing between three types of motion: (i) bounded, (ii) escape and (iii) collisional. The presented outcomes reveal the high complexity of this dynamical system. Furthermore, our numerical analysis shows a remarkable presence of fractal basin boundaries along all the escape regimes. Interpreting the collisional motion as leaking in the phase space we related our results to both chaotic scattering and the theory of leaking Hamiltonian systems. We also determined the escape and collisional basins and computed the corresponding escape/collisional times. We hope our contribution to be useful for a further understanding of the escape and collisional mechanism of orbi...

  16. Early Detection of Breast Cancer using SVM Classifier Technique

    CERN Document Server

    Rejani, Y Ireaneus Anna

    2009-01-01

    This paper presents a tumor detection algorithm for mammograms. The proposed system focuses on the solution of two problems. One is how to detect tumors as suspicious regions with a very weak contrast to their background and the other is how to extract features which categorize tumors. The tumor detection method follows the scheme of (a) mammogram enhancement, (b) segmentation of the tumor area, (c) extraction of features from the segmented tumor area, and (d) use of an SVM classifier. The enhancement can be defined as conversion of the image quality to a better and more understandable level. The mammogram enhancement procedure includes filtering, top hat operation, and DWT. Contrast stretching is then used to increase the contrast of the image. The segmentation of mammogram images plays an important role in improving the detection and diagnosis of breast cancer. The most common segmentation method used is thresholding. The features are extracted from the segmented breast area. Next stage include,...

  17. Refining and classifying finite-time Lyapunov exponent ridges

    CERN Document Server

    Allshouse, Michael R

    2015-01-01

    While more rigorous and sophisticated methods for identifying Lagrangian based coherent structures exist, the finite-time Lyapunov exponent (FTLE) field remains a straightforward and popular method for gaining some insight into transport by complex, time-dependent two-dimensional flows. In light of its enduring appeal, and in support of good practice, we begin by investigating the effects of discretization and noise on two numerical approaches for calculating the FTLE field. A practical method to extract and refine FTLE ridges in two-dimensional flows, which builds on previous methods, is then presented. Seeking to better ascertain the role of an FTLE ridge in flow transport, we adapt an existing classification scheme and provide a thorough treatment of the challenges of classifying the types of deformation represented by an FTLE ridge. As a practical demonstration, the methods are applied to an ocean surface velocity field data set generated by a numerical model.
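
    For reference, the FTLE at a point is commonly computed as (1/|T|) ln sqrt(lambda_max), where lambda_max is the largest eigenvalue of the Cauchy-Green tensor C = F^T F and F is the gradient of the flow map over integration time T. The sketch below evaluates this with central finite differences on a linear saddle flow whose flow map is known analytically. The flow, the step size h, and the function names are illustrative assumptions and not the discretizations compared in the paper.

```python
import math

def flow_map(x, y, T):
    """Analytic flow map of the linear saddle dx/dt = x, dy/dt = -y."""
    return x * math.exp(T), y * math.exp(-T)

def ftle(x, y, T, h=1e-3):
    """Finite-time Lyapunov exponent at (x, y) for integration time T."""
    # Central finite differences approximate the flow-map gradient F.
    xr, yr = flow_map(x + h, y, T)
    xl, yl = flow_map(x - h, y, T)
    xu, yu = flow_map(x, y + h, T)
    xd, yd = flow_map(x, y - h, T)
    f11, f12 = (xr - xl) / (2 * h), (xu - xd) / (2 * h)
    f21, f22 = (yr - yl) / (2 * h), (yu - yd) / (2 * h)
    # Cauchy-Green tensor C = F^T F (symmetric 2 x 2).
    c11 = f11 * f11 + f21 * f21
    c12 = f11 * f12 + f21 * f22
    c22 = f12 * f12 + f22 * f22
    # Largest eigenvalue of a symmetric 2 x 2 matrix in closed form.
    mean = 0.5 * (c11 + c22)
    half_diff = 0.5 * (c11 - c22)
    lam_max = mean + math.sqrt(half_diff * half_diff + c12 * c12)
    return math.log(math.sqrt(lam_max)) / abs(T)

value = ftle(0.3, -0.7, 2.0)
```

    For this saddle the stretching rate is exactly 1 along the x direction, so the computed FTLE should be close to 1 everywhere; in a real flow the field varies in space and its ridges are the structures of interest.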

  18. Support vector classifier based on principal component analysis

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    The support vector classifier (SVC) has advantages for small-sample learning problems with high dimensions, especially better generalization ability. However, there is some redundancy among the high dimensions of the original samples, and the main features of the samples may be picked up first to improve the performance of SVC. A principal component analysis (PCA) is employed to reduce the feature dimensions of the original samples and the pre-selected main features efficiently, and an SVC is constructed in the selected feature space to improve the learning speed and identification rate of SVC. Furthermore, a heuristic genetic algorithm-based automatic model selection is proposed to determine the hyperparameters of SVC to evaluate the performance of the learning machines. Experiments performed on the Heart and Adult benchmark data sets demonstrate that the proposed PCA-based SVC not only reduces the test time drastically, but also improves the identification rate effectively.

  19. Classifying algorithms for SIFT-MS technology and medical diagnosis.

    Science.gov (United States)

    Moorhead, K T; Lee, D; Chase, J G; Moot, A R; Ledingham, K M; Scotter, J; Allardyce, R A; Senthilmohan, S T; Endre, Z

    2008-03-01

    Selected Ion Flow Tube-Mass Spectrometry (SIFT-MS) is an analytical technique for real-time quantification of trace gases in air or breath samples. SIFT-MS system thus offers unique potential for early, rapid detection of disease states. Identification of volatile organic compound (VOC) masses that contribute strongly towards a successful classification clearly highlights potential new biomarkers. A method utilising kernel density estimates is thus presented for classifying unknown samples. It is validated in a simple known case and a clinical setting before-after dialysis. The simple case with nitrogen in Tedlar bags returned a 100% success rate, as expected. The clinical proof-of-concept with seven tests on one patient had an ROC curve area of 0.89. These results validate the method presented and illustrate the emerging clinical potential of this technology.
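
    A classifier based on kernel density estimates can be sketched as follows: estimate one density per class from training samples and assign an unknown sample to the class whose density is highest at that point. The Gaussian kernel, the fixed bandwidth, the single feature, and the concentration values are all assumptions made for illustration; the abstract does not specify these details.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function that evaluates a 1D Gaussian kernel density estimate."""
    norm = len(samples) * bandwidth * math.sqrt(2 * math.pi)

    def density(x):
        return sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        ) / norm

    return density

def classify(x, class_samples, bandwidth=1.0):
    """Assign x to the class with the highest estimated density at x."""
    densities = {
        label: gaussian_kde(samples, bandwidth)(x)
        for label, samples in class_samples.items()
    }
    return max(densities, key=densities.get)

# Hypothetical concentrations of one compound before and after dialysis.
training = {
    "before": [9.1, 10.4, 11.0, 9.7],
    "after": [4.2, 5.1, 3.8, 4.9],
}
label_high = classify(10.0, training)
label_low = classify(4.5, training)
```

    With several measured masses, one such density ratio per mass also indicates which masses separate the classes most strongly.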

  20. Object localization based on smoothing preprocessing and cascade classifier

    Science.gov (United States)

    Zhang, Xingfu; Liu, Lei; Zhao, Feng

    2017-01-01

    An improved algorithm for image localization is proposed in this paper. Firstly, the image is smoothed and part of the noise is removed. Then a cascade classifier is used to train a template. Finally, the template is used to detect the related images. The advantage of the algorithm is that it is robust to noise and insensitive to changes in image proportions. The algorithm also has the advantage of fast computation. In this paper, real pictures of truck bottoms are chosen as the experimental objects. Images of normal components and faulty components are both included in the image sample. Experimental results show that the accuracy rate is more than 90 percent when the grade is more than 40. We therefore conclude that the algorithm proposed in this paper can be applied to practical image localization projects.

  1. Sex Bias in Classifying Borderline and Narcissistic Personality Disorder.

    Science.gov (United States)

    Braamhorst, Wouter; Lobbestael, Jill; Emons, Wilco H M; Arntz, Arnoud; Witteman, Cilia L M; Bekker, Marrie H J

    2015-10-01

    This study investigated sex bias in the classification of borderline and narcissistic personality disorders. A sample of psychologists in training for a post-master degree (N = 180) read brief case histories (male or female version) and made DSM classification. To differentiate sex bias due to sex stereotyping or to base rate variation, we used different case histories, respectively: (1) non-ambiguous case histories with enough criteria of either borderline or narcissistic personality disorder to meet the threshold for classification, and (2) an ambiguous case with subthreshold features of both borderline and narcissistic personality disorder. Results showed significant differences due to sex of the patient in the ambiguous condition. Thus, when the diagnosis is not straightforward, as in the case of mixed subthreshold features, sex bias is present and is influenced by base-rate variation. These findings emphasize the need for caution in classifying personality disorders, especially borderline or narcissistic traits.

  2. Business process modeling for processing classified documents using RFID technology

    Directory of Open Access Journals (Sweden)

    Koszela Jarosław

    2016-01-01

    Full Text Available The article outlines the application of the processing approach to the functional description of the designed IT system supporting the operations of the secret office, which processes classified documents. The article describes the application of the method of incremental modeling of business processes according to the BPMN model to the description of the processes currently implemented (“as is”) in a manual manner and target processes (“to be”), using the RFID technology for the purpose of their automation. Additionally, the examples of applying the method of structural and dynamic analysis of the processes (process simulation) to verify their correctness and efficiency were presented. The extension of the process analysis method is a possibility of applying the warehouse of processes and process mining methods.

  3. A non-parametric 2D deformable template classifier

    DEFF Research Database (Denmark)

    Schultz, Nette; Nielsen, Allan Aasbjerg; Conradsen, Knut;

    2005-01-01

    We introduce an interactive segmentation method for a sea floor survey. The method is based on a deformable template classifier and is developed to segment data from an echo sounder post-processor called RoxAnn. RoxAnn collects two different measures for each observation point, and in this 2D...... feature space the ship-master will be able to interactively define a segmentation map, which is refined and optimized by the deformable template algorithms. The deformable templates are defined as two-dimensional vector-cycles. Local random transformations are applied to the vector-cycles, and stochastic...... relaxation in a Bayesian scheme is used. In the Bayesian likelihood a class density function and its estimate hereof is introduced, which is designed to separate the feature space. The method is verified on data collected in Øresund, Scandinavia. The data come from four geographically different areas. Two...

  4. On the way of classifying new states of active matter

    Science.gov (United States)

    Menzel, Andreas M.

    2016-07-01

    With ongoing research into the collective behavior of self-propelled particles, new states of active matter are revealed. Some of them are entirely based on the non-equilibrium character and do not have an immediate equilibrium counterpart. In their recent work, Romanczuk et al (2016 New J. Phys. 18 063015) concentrate on the characterization of smectic-like states of active matter. A new type, referred to by the authors as smectic P, is described. In this state, the active particles form stacked layers and self-propel along them. Identifying and classifying states and phases of non-equilibrium matter, including the transitions between them, is an up-to-date effort that will certainly extend for a longer period into the future.

  5. Performance evaluation of artificial intelligence classifiers for the medical domain.

    Science.gov (United States)

    Smith, A E; Nugent, C D; McClean, S I

    2002-01-01

    The application of artificial intelligence systems is still not widespread in the medical field; however, there is an increasing necessity for these to handle the surfeit of information available. One drawback to their implementation is the lack of criteria or guidelines for the evaluation of these systems. This is the primary issue in their acceptability to clinicians, who require them for decision support and therefore need evidence that these systems meet the special safety-critical requirements of the domain. This paper shows evidence that the most prevalent form of intelligent system, neural networks, is generally not being evaluated rigorously regarding classification precision. A taxonomy of the types of evaluation tests that can be carried out to gauge the inherent performance of the outputs of intelligent systems has been assembled, and the results are presented in a clear and concise form, which should be applicable to all intelligent classifiers for medicine.

  6. Performance Evaluation of Bagged RBF Classifier for Data Mining Applications

    Directory of Open Access Journals (Sweden)

    M.Govindarajan

    2013-11-01

    Full Text Available Data mining is the use of algorithms to extract the information and patterns derived by the knowledge discovery in databases process. Classification maps data into predefined groups or classes. It is often referred to as supervised learning because the classes are determined before examining the data. The feasibility and the benefits of the proposed approaches are demonstrated by the means of data mining applications like intrusion detection, direct marketing, and signature verification. A variety of techniques have been employed for analysis ranging from traditional statistical methods to data mining approaches. Bagging and boosting are two relatively new but popular methods for producing ensembles. In this work, bagging is evaluated on real and benchmark data sets of intrusion detection, direct marketing, and signature verification in conjunction with radial basis function classifier as the base learner. The proposed bagged radial basis function is superior to individual approach for data mining applications in terms of classification accuracy.
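
    Bagging trains each ensemble member on a bootstrap resample of the training set and combines the members by majority vote. The sketch below uses a nearest-centroid rule as the base learner, swapped in for brevity in place of the radial basis function classifier used in the paper. The data points, labels, and parameter values are hypothetical.

```python
import random
from collections import Counter

def train_centroid(data):
    """Fit a nearest-centroid model. data: list of (features, label)."""
    sums, counts = {}, Counter()
    for x, y in data:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] += 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict_centroid(model, x):
    """Return the label whose centroid is closest to x (squared distance)."""
    return min(
        model,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(model[y], x)),
    )

def bagged_predict(data, x, n_models=25, seed=0):
    """Majority vote over base learners trained on bootstrap resamples."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_models):
        sample = [rng.choice(data) for _ in data]  # draw with replacement
        votes[predict_centroid(train_centroid(sample), x)] += 1
    return votes.most_common(1)[0][0]

# Hypothetical two-feature records for an intrusion detection task.
data = [
    ((0.1, 0.2), "normal"), ((0.3, -0.1), "normal"),
    ((-0.2, 0.0), "normal"), ((0.0, 0.4), "normal"),
    ((5.1, 4.8), "attack"), ((4.7, 5.3), "attack"),
    ((5.4, 5.0), "attack"), ((4.9, 4.6), "attack"),
]
prediction = bagged_predict(data, (0.2, 0.1))
```

    The vote smooths out the variation between resamples, which is the mechanism behind the accuracy gain over a single base classifier reported above.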

  7. Higher School Marketing Strategy Formation: Classifying the Factors

    Directory of Open Access Journals (Sweden)

    N. K. Shemetova

    2012-01-01

    Full Text Available The paper deals with the main trends of higher school management strategy formation. The author specifies the educational changes in the modern information society determining the strategy options. For each professional training level the author denotes the set of strategic factors affecting the educational service consumers and, therefore, the effectiveness of the higher school marketing. The given factors are classified from the standpoints of the providers and consumers of educational services (enrollees, students, graduates and postgraduates). The research methods include statistical analysis and general methods of scientific analysis, synthesis, induction, deduction, comparison, and classification. The author is convinced that university management should develop the necessary prerequisites for raising the graduates’ competitiveness in the labor market, and stimulate the active marketing policies of the related subdivisions and departments. In the author’s opinion, the above classification of marketing strategy factors can be used as the system of values for educational service providers.

  8. Multivariate analysis of quantitative traits can effectively classify rapeseed germplasm

    Directory of Open Access Journals (Sweden)

    Jankulovska Mirjana

    2014-01-01

    Full Text Available In this study, the use of different multivariate approaches to classify rapeseed genotypes based on quantitative traits has been presented. Tree regression analysis, PCA analysis and two-way cluster analysis were applied in order to describe and understand the extent of genetic variability in spring rapeseed genotype by trait data. The traits which highly influenced seed and oil yield in rapeseed were successfully identified by the tree regression analysis. The principal predictor for both response variables was the number of pods per plant (NP). NP and 1000 seed weight could help in the selection of high yielding genotypes. High values for both traits and oil content could lead to high oil yielding genotypes. These traits may serve as indirect selection criteria and can lead to improvement of seed and oil yield in rapeseed. Quantitative traits that explained most of the variability in the studied germplasm were classified using principal component analysis. In this data set, five PCs were identified, out of which the first three PCs explained 63% of the total variance. It helped in facilitating the choice of variables based on which the genotypes’ clustering could be performed. The two-way cluster analysis simultaneously clustered genotypes and quantitative traits. The final number of clusters was determined using a bootstrapping technique. This approach provided a clear overview of the variability of the analyzed genotypes. The genotypes that have similar performance regarding the traits included in this study can be easily detected on the heatmap. Genotypes grouped in clusters 1 and 8 had high values for seed and oil yield and a relatively short vegetative growth duration period, and those in cluster 9 combined moderate to low values for vegetative growth duration and moderate to high seed and oil yield. These genotypes should be further exploited and implemented in the rapeseed breeding program. The combined application of these multivariate methods

  9. Capability of geometric features to classify ships in SAR imagery

    Science.gov (United States)

    Lang, Haitao; Wu, Siwen; Lai, Quan; Ma, Li

    2016-10-01

    Ship classification in synthetic aperture radar (SAR) imagery has become a new hotspot in the remote sensing community for its valuable potential in many maritime applications. Several kinds of ship features, such as geometric features, polarimetric features, and scattering features, have been widely applied to ship classification tasks. Compared with polarimetric features and scattering features, which are subject to SAR parameters (e.g., sensor type, incidence angle, polarization, etc.) and environment factors (e.g., sea state, wind, wave, current, etc.), geometric features are relatively independent of SAR and environment factors, and can be extracted stably from SAR imagery. In this paper, the capability of geometric features to classify ships in SAR imagery with various resolutions has been investigated. Firstly, the relationship between the geometric feature extraction accuracy and the SAR imagery resolution is analyzed. It shows that the minimum bounding rectangle (MBR) of a ship can be extracted exactly in terms of absolute precision by the proposed automatic ship-sea segmentation method. Next, six simple but effective geometric features are extracted to build a ship representation for the subsequent classification task. These six geometric features are length (f1), width (f2), area (f3), perimeter (f4), elongatedness (f5) and compactness (f6). Among them, the two basic features, length (f1) and width (f2), are directly extracted from the MBR of the ship; the other four are derived from those two basic features. The capability of the utilized geometric features to classify ships is validated on two data sets with different image resolutions. The results show that the performance of ship classification solely by geometric features is close to that obtained by the state-of-the-art methods, which are obtained by a combination of multiple kinds of features, including scattering features and geometric features, after a complex feature selection process.
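
    As a rough illustration of how four features can be derived from the two basic ones, the sketch below computes f1 to f6 from an MBR length and width. The abstract does not give the formulas for elongatedness and compactness, so the definitions used here (aspect ratio, and squared perimeter over 4*pi*area) are assumptions, as are the function name and the sample dimensions.

```python
import math


def geometric_features(length: float, width: float) -> dict:
    """Derive six geometric features from the MBR length and width.

    f5 and f6 use ASSUMED definitions; the source abstract does not state them.
    """
    if length <= 0 or width <= 0:
        raise ValueError("length and width must be positive")
    area = length * width               # f3: area of the bounding rectangle
    perimeter = 2.0 * (length + width)  # f4: perimeter of the bounding rectangle
    elongatedness = length / width      # f5: assumed to be the aspect ratio
    # f6: assumed to be perimeter^2 / (4 * pi * area), which is 1 for a circle
    compactness = perimeter ** 2 / (4.0 * math.pi * area)
    return {
        "f1_length": length,
        "f2_width": width,
        "f3_area": area,
        "f4_perimeter": perimeter,
        "f5_elongatedness": elongatedness,
        "f6_compactness": compactness,
    }


# Hypothetical ship with a 200 m x 25 m bounding rectangle
features = geometric_features(200.0, 25.0)
```

    The resulting dictionary can be flattened into a six-element vector and passed to any classifier.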

  10. Building multiclass classifiers for remote homology detection and fold recognition

    Directory of Open Access Journals (Sweden)

    Karypis George

    2006-10-01

    Full Text Available Abstract Background Protein remote homology detection and fold recognition are central problems in computational biology. Supervised learning algorithms based on support vector machines are currently one of the most effective methods for solving these problems. These methods are primarily used to solve binary classification problems and they have not been extensively used to solve the more general multiclass remote homology prediction and fold recognition problems. Results We present a comprehensive evaluation of a number of methods for building SVM-based multiclass classification schemes in the context of the SCOP protein classification. These methods include schemes that directly build an SVM-based multiclass model, schemes that employ a second-level learning approach to combine the predictions generated by a set of binary SVM-based classifiers, and schemes that build and combine binary classifiers for various levels of the SCOP hierarchy beyond those defining the target classes. Conclusion Analyzing the performance achieved by the different approaches on four different datasets we show that most of the proposed multiclass SVM-based classification approaches are quite effective in solving the remote homology prediction and fold recognition problems and that the schemes that use predictions from binary models constructed for ancestral categories within the SCOP hierarchy tend to not only lead to lower error rates but also reduce the number of errors in which a superfamily is assigned to an entirely different fold and a fold is predicted as being from a different SCOP class. Our results also show that the limited size of the training data makes it hard to learn complex second-level models, and that models of moderate complexity lead to consistently better results.

  11. Boosting-Based On-Road Obstacle Sensing Using Discriminative Weak Classifiers

    Science.gov (United States)

    Adhikari, Shyam Prasad; Yoo, Hyeon-Joong; Kim, Hyongsuk

    2011-01-01

    This paper proposes an extension of the weak classifiers derived from the Haar-like features for their use in the Viola-Jones object detection system. These weak classifiers differ from the traditional single threshold ones, in that no specific threshold is needed and these classifiers give a more general solution to the non-trivial task of finding thresholds for the Haar-like features. The proposed quadratic discriminant analysis based extension prominently improves the ability of the weak classifiers to discriminate objects and non-objects. The proposed weak classifiers were evaluated by boosting a single stage classifier to detect the rear of a car. The experiments demonstrate that the object detector based on the proposed weak classifiers yields higher classification performance with fewer weak classifiers than the detector built with traditional single threshold weak classifiers. PMID:22163852

  12. Performance of Correspondence Algorithms in Vision-Based Driver Assistance Using an Online Image Sequence Database

    DEFF Research Database (Denmark)

    Klette, Reinhard; Krüger, Norbert; Vaudrey, Tobi

    2011-01-01

    that report on hours of driving, and multiple hours of long video data may be segmented into basic sequences and classified into situations. This paper prepares for this expected development. This paper uses three different evaluation approaches (prediction error, synthesized sequences, and labeled sequences...

  13. Supervised Sequence Labelling with Recurrent Neural Networks

    CERN Document Server

    Graves, Alex

    2012-01-01

    Supervised sequence labelling is a vital area of machine learning, encompassing tasks such as speech, handwriting and gesture recognition, protein secondary structure prediction and part-of-speech tagging. Recurrent neural networks are powerful sequence learning tools (robust to input noise and distortion, able to exploit long-range contextual information) that would seem ideally suited to such problems. However, their role in large-scale sequence labelling systems has so far been auxiliary. The goal of this book is a complete framework for classifying and transcribing sequential data with recurrent neural networks only. Three main innovations are introduced in order to realise this goal. Firstly, the connectionist temporal classification output layer allows the framework to be trained with unsegmented target sequences, such as phoneme-level speech transcriptions; this is in contrast to previous connectionist approaches, which were dependent on error-prone prior segmentation. Secondly, multidimensional...

  14. Mining Class-Correlated Patterns for Sequence Labeling

    Science.gov (United States)

    Hopf, Thomas; Kramer, Stefan

    Sequence labeling is the task of assigning a label sequence to an observation sequence. Since many methods to solve this problem depend on the specification of predictive features, automated methods for their derivation are desirable. Unlike in other areas of pattern-based classification, however, no algorithm to directly mine class-correlated patterns for sequence labeling has been proposed so far. We introduce the novel task of mining class-correlated sequence patterns for sequence labeling and present a supervised pattern growth algorithm to find all patterns in a set of observation sequences, which correlate with the assignment of a fixed sequence label no less than a user-specified minimum correlation constraint. From the resulting set of patterns, features for a variety of classifiers can be obtained in a straightforward manner. The efficiency of the approach and the influence of important parameters are shown in experiments on several biological datasets.

  15. MicroRNA classifier and nomogram for metastasis prediction in colon cancer.

    Science.gov (United States)

    Goossens-Beumer, Inès J; Derr, Remco S; Buermans, Henk P J; Goeman, Jelle J; Böhringer, Stefan; Morreau, Hans; Nitsche, Ulrich; Janssen, Klaus-Peter; van de Velde, Cornelis J H; Kuppen, Peter J K

    2015-01-01

    Colon cancer prognosis and treatment are currently based on a classification system still showing large heterogeneity in clinical outcome, especially in TNM stages II and III. Prognostic biomarkers for metastasis risk are warranted as development of distant recurrent disease mainly accounts for the high lethality rates of colon cancer. miRNAs have been proposed as potential biomarkers for cancer. Furthermore, a verified standard for normalization of the amount of input material in PCR-based relative quantification of miRNA expression is lacking. A selection of frozen tumor specimens from two independent patient cohorts with TNM stage II-III microsatellite stable primary adenocarcinomas was used for laser capture microdissection. Next-generation sequencing was performed on small RNAs isolated from colorectal tumors from the Dutch cohort (N = 50). Differential expression analysis, comparing in metastasized and nonmetastasized tumors, identified prognostic miRNAs. Validation was performed on colon tumors from the German cohort (N = 43) using quantitative PCR (qPCR). miR25-3p and miR339-5p were identified and validated as independent prognostic markers and used to construct a multivariate nomogram for metastasis risk prediction. The nomogram showed good probability prediction in validation. In addition, we recommend combination of miR16-5p and miR26a-5p as standard for normalization in qPCR of colon cancer tissue-derived miRNA expression. In this international study, we identified and validated a miRNA classifier in primary cancers, and propose a nomogram capable of predicting metastasis risk in microsatellite stable TNM stage II-III colon cancer. In conjunction with TNM staging, by means of a nomogram, this miRNA classifier may allow for personalized treatment decisions based on individual tumor characteristics. ©2014 American Association for Cancer Research.

  16. Linearly and Quadratically Separable Classifiers Using Adaptive Approach

    Institute of Scientific and Technical Information of China (English)

    Mohamed Abdel-Kawy Mohamed Ali Soliman; Rasha M. Abo-Bakr

    2011-01-01

    This paper presents a fast adaptive iterative algorithm to solve linearly separable classification problems in R^n. In each iteration, a subset of the sampling data (n points, where n is the number of features) is adaptively chosen and a hyperplane is constructed such that it separates the chosen n points at a margin e and best classifies the remaining points. The classification problem is formulated and the details of the algorithm are presented. Further, the algorithm is extended to solving quadratically separable classification problems. The basic idea is based on mapping the physical space to another larger one where the problem becomes linearly separable. Numerical illustrations show that few iteration steps are sufficient for convergence when classes are linearly separable. For nonlinearly separable data, given a specified maximum number of iteration steps, the algorithm returns the best hyperplane that minimizes the number of misclassified points occurring through these steps. Comparisons with other machine learning algorithms on practical and benchmark datasets are also presented, showing the performance of the proposed algorithm.
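
    The idea of mapping to a larger space where a quadratic boundary becomes linear can be sketched as follows. This is not the adaptive algorithm from the record above; it only illustrates the mapping step for two features, and the function names, the hand-picked hyperplane, and the sample points are all assumptions made for the example.

```python
def quadratic_map(x1: float, x2: float) -> tuple:
    """Map a 2-D point to the 5-D space of its linear and quadratic monomials."""
    return (x1, x2, x1 * x1, x1 * x2, x2 * x2)


def linear_classify(z: tuple, w: tuple, b: float) -> int:
    """Classify a mapped point by the sign of w . z + b."""
    score = sum(wi * zi for wi, zi in zip(w, z)) + b
    return 1 if score > 0 else -1


# In the mapped space the unit circle x1^2 + x2^2 = 1 is the hyperplane
# z3 + z5 - 1 = 0, so a linear rule separates inside from outside.
w = (0.0, 0.0, 1.0, 0.0, 1.0)
b = -1.0

points = [(0.1, 0.2), (-0.5, 0.5), (1.5, 0.0), (-1.0, -1.0)]
labels = [linear_classify(quadratic_map(*p), w, b) for p in points]
# The first two points lie inside the circle (label -1), the last two outside (label 1).
```

    Any linear separator, including one found iteratively, could be trained on the mapped points in place of the fixed w and b used here.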

  17. Classifying EEG Signals during Stereoscopic Visualization to Estimate Visual Comfort.

    Science.gov (United States)

    Frey, Jérémy; Appriou, Aurélien; Lotte, Fabien; Hachet, Martin

    2016-01-01

    With stereoscopic displays a sensation of depth that is too strong could impede visual comfort and may result in fatigue or pain. We used Electroencephalography (EEG) to develop a novel brain-computer interface that monitors users' states in order to reduce visual strain. We present the first system that discriminates comfortable conditions from uncomfortable ones during stereoscopic vision using EEG. In particular, we show that either changes in event-related potentials' (ERPs) amplitudes or changes in EEG oscillations power following stereoscopic objects presentation can be used to estimate visual comfort. Our system reacts within 1 s to depth variations, achieving 63% accuracy on average (up to 76%) and 74% on average when 7 consecutive variations are measured (up to 93%). Performances are stable (≈62.5%) when a simplified signal processing is used to simulate online analyses or when the number of EEG channels is lessened. This study could lead to adaptive systems that automatically suit stereoscopic displays to users and viewing conditions. For example, it could be possible to match the stereoscopic effect with users' state by modifying the overlap of left and right images according to the classifier output.

  18. Addressing the Challenge of Defining Valid Proteomic Biomarkers and Classifiers

    LENUS (Irish Health Repository)

    Dakna, Mohammed

    2010-12-10

    Abstract Background The purpose of this manuscript is to provide, based on an extensive analysis of a proteomic data set, suggestions for proper statistical analysis for the discovery of sets of clinically relevant biomarkers. As tractable example we define the measurable proteomic differences between apparently healthy adult males and females. We choose urine as body-fluid of interest and CE-MS, a thoroughly validated platform technology, allowing for routine analysis of a large number of samples. The second urine of the morning was collected from apparently healthy male and female volunteers (aged 21-40) in the course of the routine medical check-up before recruitment at the Hannover Medical School. Results We found that the Wilcoxon-test is best suited for the definition of potential biomarkers. Adjustment for multiple testing is necessary. Sample size estimation can be performed based on a small number of observations via resampling from pilot data. Machine learning algorithms appear ideally suited to generate classifiers. Assessment of any results in an independent test-set is essential. Conclusions Valid proteomic biomarkers for diagnosis and prognosis only can be defined by applying proper statistical data mining procedures. In particular, a justification of the sample size should be part of the study design.
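
    The abstract states that adjustment for multiple testing is necessary but does not name a procedure. As one common choice, the sketch below applies the Benjamini-Hochberg step-up rule to a list of p-values; the choice of this procedure, the function name, and the p-values themselves are assumptions for illustration only.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True if rejected at false discovery rate alpha."""
    m = len(p_values)
    # Indices of the p-values in ascending order
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k such that p_(k) <= (k / m) * alpha
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    # Reject every hypothesis whose rank is at most max_rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values, e.g. from per-peptide Wilcoxon tests
p_values = [0.001, 0.008, 0.039, 0.041, 0.20]
rejected = benjamini_hochberg(p_values, alpha=0.05)
unadjusted = [p <= 0.05 for p in p_values]
```

    With these made-up values, four candidates pass an unadjusted 0.05 cut-off but only two survive the adjustment, which is the kind of difference the abstract warns about.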

  19. Implementation of a classifier didactical machine for learning mechatronic processes

    Directory of Open Access Journals (Sweden)

    Alex De La Cruz

    2017-06-01

    Full Text Available The present article shows the design and construction of a didactic classifier machine based on artificial vision. The machine is intended to be used as a learning module for mechatronic processes. The project describes the theoretical aspects that relate concepts of mechanical design, electronic design and software management, which together constitute the popular field of science and technology known as mechatronics. The design of the machine was developed based on the requirements of the user, through the concurrent design methodology, to define and materialize the appropriate hardware and software solutions. LabVIEW 2015 was used for high-speed image acquisition and analysis, as well as for establishing data communication with a programmable logic controller (PLC) via Ethernet and an open communications platform known as Open Platform Communications (OPC). In addition, the Arduino MEGA 2560 platform was used to control the movement of the stepper motor and the servo motors of the module. Finally, we assessed whether the equipment meets the technical specifications by running specific test protocols.

  20. Learning multiscale and deep representations for classifying remotely sensed imagery

    Science.gov (United States)

    Zhao, Wenzhi; Du, Shihong

    2016-03-01

    It is widely agreed that spatial features can be combined with spectral properties for improving interpretation performance on very-high-resolution (VHR) images in urban areas. However, many existing methods for extracting spatial features can only generate low-level features and consider limited scales, leading to unsatisfactory classification results. In this study, a multiscale convolutional neural network (MCNN) algorithm was presented to learn spatial-related deep features for hyperspectral remote imagery classification. Unlike traditional methods for extracting spatial features, the MCNN first transforms the original data sets into a pyramid structure containing spatial information at multiple scales, and then automatically extracts high-level spatial features using multiscale training data sets. Specifically, the MCNN has two merits: (1) high-level spatial features can be effectively learned by using the hierarchical learning structure and (2) the multiscale learning scheme can capture contextual information at different scales. To evaluate the effectiveness of the proposed approach, the MCNN was applied to classify well-known hyperspectral data sets and compared with traditional methods. The experimental results showed a significant increase in classification accuracies, especially for urban areas.

  1. Quantum Hooke's law to classify pulse laser induced ultrafast melting.

    Science.gov (United States)

    Hu, Hao; Ding, Hepeng; Liu, Feng

    2015-02-03

    Ultrafast crystal-to-liquid phase transition induced by femtosecond pulse laser excitation is an interesting material behavior manifesting the complexity of light-matter interaction. There exist two types of such phase transitions: one occurs at a time scale shorter than a picosecond via a nonthermal process mediated by electron-hole plasma formation; the other at a longer time scale via a thermal melting process mediated by electron-phonon interaction. However, it remains unclear what material would undergo which process and why. Here, by exploiting the property of quantum electronic stress (QES) governed by quantum Hooke's law, we classify the transitions by two distinct classes of materials: the faster nonthermal process can only occur in materials like ice having an anomalous phase diagram characterized with dTm/dP < 0, where Tm is the melting temperature and P is pressure, above a high threshold laser fluence; while the slower thermal process may occur in all materials. Especially, the nonthermal transition is shown to be induced by the QES, acting like a negative internal pressure, which drives the crystal into a "super pressing" state to spontaneously transform into a higher-density liquid phase. Our findings significantly advance fundamental understanding of ultrafast crystal-to-liquid phase transitions, enabling quantitative a priori predictions.

  2. Executed Movement Using EEG Signals through a Naive Bayes Classifier

    Directory of Open Access Journals (Sweden)

    Juliano Machado

    2014-11-01

    Full Text Available Recent years have witnessed a rapid development of brain-computer interface (BCI) technology. An independent BCI is a communication system for controlling a device, e.g., a computer, a wheelchair or a neuroprosthesis, by human intention, not depending on the brain’s normal output pathways of peripheral nerves and muscles, but on detectable signals that represent responsive or intentional brain activities. This paper presents a comparative study of the usage of the linear discriminant analysis (LDA) and the naive Bayes (NB) classifiers on describing both right- and left-hand movement through electroencephalographic signal (EEG) acquisition. For the analysis, we considered the following input features: the energy of the segments of a band pass-filtered signal with the frequency band in sensorimotor rhythms and the components of the spectral energy obtained through the Welch method. We also used the common spatial pattern (CSP) filter, so as to increase the discriminatory activity among movement classes. By using the database generated by this experiment, we obtained hit rates up to 70%. The results are compatible with previous studies.
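
    To make the naive Bayes step concrete, here is a minimal Gaussian naive Bayes sketch in plain Python. It assumes each trial has already been reduced to a short vector of energy features; the Gaussian likelihood, the function names, and the toy feature values are assumptions and are not taken from the study.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate a log prior plus per-feature mean and variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        columns = list(zip(*rows))
        means = [sum(col) / len(col) for col in columns]
        # Small constant keeps the variance strictly positive
        variances = [
            sum((v - m) ** 2 for v in col) / len(col) + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (math.log(len(rows) / len(X)), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log posterior under feature independence."""
    best_label, best_score = None, -math.inf
    for label, (log_prior, means, variances) in model.items():
        score = log_prior
        for v, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy band-energy features (made up): two features per trial
X = [[1.0, 5.0], [1.2, 4.8], [0.9, 5.1], [3.0, 2.0], [3.2, 2.1], [2.9, 1.8]]
y = ["left", "left", "left", "right", "right", "right"]
model = fit_gaussian_nb(X, y)
prediction_a = predict(model, [1.1, 5.0])
prediction_b = predict(model, [3.1, 2.0])
```

    On real EEG data the classes overlap far more than in this toy set, which is consistent with hit rates well below 100%.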

  3. The Complete Gabor-Fisher Classifier for Robust Face Recognition

    Directory of Open Access Journals (Sweden)

    Štruc Vitomir

    2010-01-01

    Full Text Available Abstract This paper develops a novel face recognition technique called Complete Gabor Fisher Classifier (CGFC). Different from existing techniques that use Gabor filters for deriving the Gabor face representation, the proposed approach does not rely solely on Gabor magnitude information but effectively uses features computed based on Gabor phase information as well. It represents one of the few successful attempts found in the literature of combining Gabor magnitude and phase information for robust face recognition. The novelty of the proposed CGFC technique comes from (1) the introduction of a Gabor phase-based face representation and (2) the combination of the recognition technique using the proposed representation with classical Gabor magnitude-based methods into a unified framework. The proposed face recognition framework is assessed in a series of face verification and identification experiments performed on the XM2VTS, Extended YaleB, FERET, and AR databases. The results of the assessment suggest that the proposed technique clearly outperforms state-of-the-art face recognition techniques from the literature and that its performance is almost unaffected by the presence of partial occlusions of the facial area, changes in facial expression, or severe illumination changes.

  4. Not Color Blind Using Multiband Photometry to Classify Supernovae

    CERN Document Server

    Poznanski, Dovi; Gal-Yam, Avishay; Maoz, Dan; Filippenko, Alexei V.; Leonard, Douglas C.; Matheson, Thomas

    2002-01-01

    Large numbers of supernovae (SNe) have been discovered in recent years, and many more will be found in the near future. Once discovered, further study of a SN and its possible use as an astronomical tool (e.g., a distance estimator) require knowledge of the SN type. Current classification methods rely almost solely on the analysis of SN spectra to determine their type. However, spectroscopy may not be possible or practical when SNe are faint, very numerous, or discovered in archival studies. We present a classification method for SNe based on the comparison of their observed colors with synthetic ones, calculated from a large database of multi-epoch optical spectra of nearby events. We discuss the capabilities and limitations of this method. For example, type Ia SNe at redshifts z 100 days) stages. Broad-band photometry through standard Johnson-Cousins UBVRI filters can be useful to classify SNe up to z ~ 0.6. The use of Sloan Digital Sky Survey (SDSS) u'g'r'i'z' filters allows extending our classification m...

  5. A system-awareness decision classifier to automated MSN forensics

    Science.gov (United States)

    Chu, Yin-Teshou Tsao; Fan, Kuo-Pao; Cheng, Ya-Wen; Tseng, Po-Kai; Chen, Huan; Cheng, Bo-Chao

    2007-09-01

    Data collection is the most important stage in network forensics, but under resource-constrained conditions a good evidence collection mechanism is required to provide effective event collection in a high-traffic network environment. In the literature, a few network forensic tools offer MSN-messenger behavior reconstruction. Moreover, they do not have classification strategies at the collection stage when the system becomes saturated. The emphasis of this paper is to address these shortcomings and propose a solution that selects a better classification in order to ensure the integrity of the evidence in the collection stage under high-traffic network environments. A system-awareness decision classifier (SADC) mechanism is proposed in this paper. The MSN-shot sensor is able to adjust the amount of data to be collected according to the current system status and to keep evidence integrity as much as possible according to the file format and the current system status. Analytical results show that the proposed SADC, which implements selective collection (SC), incurs lower cost than full collection (FC) under heavy traffic scenarios. With the deployment of the proposed SADC mechanism, we believe that MSN-shot is able to reconstruct MSN-messenger behaviors perfectly in the context of the upcoming next generation network.

  6. Classifying and comparing fundraising performance for nonprofit hospitals.

    Science.gov (United States)

    Erwin, Cathleen O

    2013-01-01

    Charitable contributions are becoming increasingly important to nonprofit hospitals, yet fundraising can sometimes be one of the more troublesome aspects of management for nonprofit organizations. This study utilizes an organizational effectiveness and performance framework to identify groups of nonprofit organizations as a method of classifying organizations for performance evaluation and benchmarking that may be more informative than commonly used characteristics such as organizational age and size. Cluster analysis, ANOVA and chi-square analysis are used to study 401 organizations, which includes hospital foundations as well as nonprofit hospitals directly engaged in fundraising. Three distinct clusters of organizations are identified based on performance measures of productivity, efficiency, and complexity. A general profile is developed for each cluster based upon the cluster analysis variables and subsequent analysis of variance on measures of structure, maturity, and legitimacy as well as selected institutional characteristics. This is one of only a few studies to examine fundraising performance in hospitals and hospital foundations, and is the first to utilize data from an industry survey conducted by the leading general professional association for healthcare philanthropy. It has methodological implications for the study of fundraising as well as practical implications for the strategic management of fundraising for nonprofit hospital and hospital foundations.

  7. A Novel Performance Metric for Building an Optimized Classifier

    Directory of Open Access Journals (Sweden)

    Mohammad Hossin

    2011-01-01

    Full Text Available Problem statement: The accuracy metric is often applied for optimizing heuristic or stochastic classification models. However, the use of the accuracy metric might lead the searching process to sub-optimal solutions due to its less discriminating values, and it is also not robust to changes of class distribution. Approach: To solve these detrimental effects, we propose a novel performance metric which combines the beneficial properties of the accuracy metric with the extended recall and precision metrics. We call this new performance metric Optimized Accuracy with Recall-Precision (OARP). Results: In this study, we demonstrate that the OARP metric is theoretically better than the accuracy metric using four generated examples. We also demonstrate empirically that a naïve stochastic classification algorithm, the Monte Carlo Sampling (MCS) algorithm, trained with the OARP metric is able to obtain better predictive results than the one trained with the conventional accuracy metric. Additionally, the t-test analysis also shows a clear advantage of the MCS model trained with the OARP metric over the accuracy metric alone for all binary data sets. Conclusion: The experiments have proved that the OARP metric leads stochastic classifiers such as the MCS towards a better training model, which in turn will improve the predictive results of any heuristic or stochastic classification models.
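
    The claim that accuracy has "less discriminating values" can be illustrated with two confusion matrices that share the same accuracy but differ sharply in precision and recall. The sketch below does not implement OARP, whose formula is not given in the abstract; the function name and the counts are made up for the example.

```python
def metrics(tp: int, fp: int, fn: int, tn: int) -> tuple:
    """Return (accuracy, precision, recall) for a binary confusion matrix."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Define precision and recall as 0.0 when their denominator is zero
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical imbalanced data set: 10 positives, 90 negatives
model_a = metrics(tp=0, fp=0, fn=10, tn=90)  # predicts every sample as negative
model_b = metrics(tp=8, fp=8, fn=2, tn=82)   # recovers most of the positives
```

    Both models reach the same accuracy, so an optimizer guided by accuracy alone cannot prefer the second one, while precision and recall separate them clearly.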

  8. Salient Region Detection via Feature Combination and Discriminative Classifier

    Directory of Open Access Journals (Sweden)

    Deming Kong

    2015-01-01

    Full Text Available We introduce a novel approach to detect salient regions of an image via feature combination and discriminative classifier. Our method, which is based on hierarchical image abstraction, uses the logistic regression approach to map the regional feature vector to a saliency score. Four saliency cues are used in our approach, including color contrast in a global context, center-boundary priors, spatially compact color distribution, and objectness, which is as an atomic feature of segmented region in the image. By mapping a four-dimensional regional feature to fifteen-dimensional feature vector, we can linearly separate the salient regions from the clustered background by finding an optimal linear combination of feature coefficients in the fifteen-dimensional feature space and finally fuse the saliency maps across multiple levels. Furthermore, we introduce the weighted salient image center into our saliency analysis task. Extensive experiments on two large benchmark datasets show that the proposed approach achieves the best performance over several state-of-the-art approaches.

  9. Transforming Musical Signals through a Genre Classifying Convolutional Neural Network

    Science.gov (United States)

    Geng, S.; Ren, G.; Ogihara, M.

    2017-05-01

    Convolutional neural networks (CNNs) have been successfully applied on both discriminative and generative modeling for music-related tasks. For a particular task, the trained CNN contains information representing the decision making or the abstracting process. One can hope to manipulate existing music based on this 'informed' network and create music with new features corresponding to the knowledge obtained by the network. In this paper, we propose a method to utilize the stored information from a CNN trained on musical genre classification task. The network was composed of three convolutional layers, and was trained to classify five-second song clips into five different genres. After training, randomly selected clips were modified by maximizing the sum of outputs from the network layers. In addition to the potential of such CNNs to produce interesting audio transformation, more information about the network and the original music could be obtained from the analysis of the generated features since these features indicate how the network 'understands' the music.

  10. Wreck finding and classifying with a sonar filter

    Science.gov (United States)

    Agehed, Kenneth I.; Padgett, Mary Lou; Becanovic, Vlatko; Bornich, C.; Eide, Age J.; Engman, Per; Globoden, O.; Lindblad, Thomas; Lodgberg, K.; Waldemark, Karina E.

    1999-03-01

    Sonar detection and classification of sunken wrecks and other objects is of keen interest to many. This paper describes the use of neural networks (NN) for locating, classifying and determining the alignment of objects on a lakebed in Sweden. A complex program for data preprocessing and visualization was developed. Part of this program, The Sonar Viewer, facilitates training and testing of the NN using (1) the MATLAB Neural Networks Toolbox for multilayer perceptrons with backpropagation (BP) and (2) the neural network O-Algorithm (OA) developed by Age Eide and Thomas Lindblad. Comparison of the performance of the two neural network approaches indicates that, for this data, BP generalizes better than OA, but use of OA eliminates the need for training on non-target (lake bed) images. The OA algorithm does not work well with the smaller ships. Increasing the resolution to counteract this problem would slow down processing and require interpolation to suggest data values between the actual sonar measurements. In general, good results were obtained for recognizing large wrecks and determining their alignment. The programs developed are a useful tool for further study of sonar signals in many environments. Recent developments in pulse coupled neural network techniques provide an opportunity to extend the use in real-world applications where experimental data is difficult, expensive or time consuming to obtain.

  11. Estimating the crowding level with a neuro-fuzzy classifier

    Science.gov (United States)

    Boninsegna, Massimo; Coianiz, Tarcisio; Trentin, Edmondo

    1997-07-01

    This paper introduces a neuro-fuzzy system for the estimation of the crowding level in a scene. Monitoring the number of people present in a given indoor environment is a requirement in a variety of surveillance applications. In the present work, crowding has to be estimated from the image processing of visual scenes collected via a TV camera. A suitable preprocessing of the images, along with an ad hoc feature extraction process, is discussed. Estimation of the crowding level in the feature space is described in terms of a fuzzy decision rule, which relies on the membership of input patterns to a set of partially overlapping crowding classes, comprehensive of doubt classifications and outliers. A society of neural networks, either multilayer perceptrons or hyper radial basis functions, is trained to model individual class-membership functions. Integration of the neural nets within the fuzzy decision rule results in an overall neuro-fuzzy classifier. Important topics concerning the generalization ability, the robustness, the adaptivity and the performance evaluation of the system are explored. Experiments with real-world data were accomplished, comparing the present approach with statistical pattern recognition techniques, namely linear discriminant analysis and nearest neighbor. Experimental results validate the neuro-fuzzy approach to a large extent. The system is currently working successfully as a part of a monitoring system in the Dinegro underground station in Genoa, Italy.

  12. A novel molecular disease classifier for psoriasis and eczema.

    Science.gov (United States)

    Garzorz-Stark, Natalie; Krause, Linda; Lauffer, Felix; Atenhan, Anne; Thomas, Jenny; Stark, Sebastian P; Franz, Regina; Weidinger, Stephan; Balato, Anna; Mueller, Nikola S; Theis, Fabian J; Ring, Johannes; Schmidt-Weber, Carsten B; Biedermann, Tilo; Eyerich, Stefanie; Eyerich, Kilian

    2016-10-01

    Novel specific therapies for psoriasis and eczema have been developed, and they mark a new era in the treatment of these complex inflammatory skin diseases. However, within their broad clinical spectrum, psoriasis and eczema phenotypes overlap, making an accurate diagnosis impossible in special cases, let alone predicting the clinical outcome of an individual patient. Here, we present a novel robust molecular classifier (MC) consisting of the NOS2 and CCL27 genes that diagnosed psoriasis and eczema with a sensitivity and specificity of >95% in a cohort of 129 patients suffering from (i) classical forms; (ii) subtypes; and (iii) clinically and histologically indistinct variants of psoriasis and eczema. NOS2 and CCL27 correlated with clinical and histological hallmarks of psoriasis and eczema in a mutually antagonistic way, thus highlighting their biological relevance. In line with this, the MC could be transferred to the level of immunofluorescence stainings for iNOS and CCL27 protein on paraffin-embedded sections, where patients were diagnosed with sensitivity and specificity >88%. Our MC proved superior to current gold standard methods to distinguish psoriasis and eczema and may therefore build the basis for molecular diagnosis of chronic inflammatory skin diseases required to establish personalized medicine in the field. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  13. Classifying Taiwan Lianas with Radiating Plates of Xylem

    Directory of Open Access Journals (Sweden)

    Sheng-Zehn Yang

    2015-12-01

    Full Text Available Radiating plates of xylem are a cambial variant of lianas that occurs in 22 families. This study investigates 15 liana species representing nine families with radiating plates of xylem structures. The features of the transverse section and epidermis in fresh liana samples are documented, including shapes and colors of xylem and phloem, ray width and numbers, and skin morphology. Experimental results indicated that the shape of phloem fibers in Ampelopsis brevipedunculata var. hancei is gradually tapered and flame-like, which is in contrast with the other characteristics of this type, including those classified as rays. Both inner and outer cylinders of vascular bundles are found in Piper kwashoense, and the irregular inner cylinder persists yet gradually diminishes. Red crystals are numerous in the cortex of Celastrus kusanoi. Aristolochia shimadai and A. zollingeriana develop a combination of two cambium variants, radiating plates of xylem and a lobed xylem. The shape of phloem in Stauntonia obovatifoliola is square or truncate, and its rays are numerous. Meanwhile, that of Neoalsomitra integrifolia is blunt and its rays are fewer. As for the features of the stem surface within the same family, Cyclea ochiaiana is brownish in color and has a deep vertical depression with lenticels, whereas Pericampylus glaucus is greenish in color with a shallow vertical depression. Within the same genus, Aristolochia shimadai develops lenticels, which are absent in A. zollingeriana; although the periderm developed in Clematis grata is a ring bark and tears easily, that of Clematis tamura is thick and soft.

  14. Pulmonary nodule detection using a cascaded SVM classifier

    Science.gov (United States)

    Bergtholdt, Martin; Wiemker, Rafael; Klinder, Tobias

    2016-03-01

    Automatic detection of lung nodules from chest CT has been researched intensively over the last decades resulting also in several commercial products. However, solutions are adopted only slowly into daily clinical routine as many current CAD systems still potentially miss true nodules while at the same time generating too many false positives (FP). While many earlier approaches had to rely on rather few cases for development, larger databases become now available and can be used for algorithmic development. In this paper, we address the problem of lung nodule detection via a cascaded SVM classifier. The idea is to sequentially perform two classification tasks in order to select from an extremely large pool of potential candidates the few most likely ones. As the initial pool is allowed to contain thousands of candidates, very loose criteria could be applied during this pre-selection. In this way, the chances that a true nodule is falsely rejected as a candidate are reduced significantly. The final algorithm is trained and tested on the full LIDC/IDRI database. Comparison is done against two previously published CAD systems. Overall, the algorithm achieved sensitivity of 0.859 at 2.5 FP/volume where the other two achieved sensitivity values of 0.321 and 0.625, respectively. On low dose data sets, only slight increase in the number of FP/volume was observed, while the sensitivity was not affected.
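
    The two-stage idea described above (a loose pre-selection followed by a stricter second classifier) can be sketched generically. The code below is a minimal illustration, not the published system: the linear scorers stand in for trained SVM decision functions, and all names, weights, thresholds and candidate features are hypothetical.

```python
from typing import Callable, List, Sequence

Candidate = Sequence[float]
Scorer = Callable[[Candidate], float]


def linear_scorer(weights: Sequence[float], bias: float) -> Scorer:
    """Stand-in for a linear SVM decision function f(x) = w . x + b."""
    def score(x: Candidate) -> float:
        return sum(w * v for w, v in zip(weights, x)) + bias
    return score


def cascade(candidates: List[Candidate], stage1: Scorer, stage2: Scorer,
            loose_threshold: float, strict_threshold: float) -> List[Candidate]:
    """Keep candidates that pass a loose first stage, then a strict second stage."""
    # Stage 1: cheap, permissive filter so that true findings are rarely rejected.
    survivors = [c for c in candidates if stage1(c) >= loose_threshold]
    # Stage 2: stricter classifier applied only to the reduced pool.
    return [c for c in survivors if stage2(c) >= strict_threshold]


# Hypothetical two-feature candidates.
pool = [(0.1, 0.9), (0.5, 0.2), (0.9, 0.8), (0.3, 0.9)]
stage1 = linear_scorer([1.0, 0.0], -0.2)
stage2 = linear_scorer([0.5, 1.0], -1.0)
detections = cascade(pool, stage1, stage2, loose_threshold=0.0, strict_threshold=0.1)
print(detections)
```

    Running the expensive second stage only on first-stage survivors is what lets the initial pool be very large while keeping the overall cost manageable.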

  15. Classifying Adults with Binge Eating Disorder Based on Severity Levels.

    Science.gov (United States)

    Dakanalis, Antonios; Riva, Giuseppe; Serino, Silvia; Colmegna, Fabrizia; Clerici, Massimo

    2017-07-01

    The clinical utility of the severity criterion for binge eating disorder (BED), introduced in the DSM-5 as a means of addressing heterogeneity and variability in the severity of this disorder, was evaluated in 189 treatment-seeking adults with (DSM-5) BED. Participants classified with mild, moderate, severe and extreme severity of BED, based on their weekly frequency of binge eating episodes, differed significantly from each other in body mass index (BMI), eating disorder features, putative factors involved in the maintenance process of the disorder, comorbid mood, anxiety and personality disorders, psychological distress, social maladjustment and illness-specific functional impairment (medium-to-large effect sizes). They were also statistically distinguishable in metabolic syndrome prevalence, even after adjusting for BMI (large effect size), suggesting the possibility of non-BMI-mediated mechanisms. The implications of the findings, providing support for the utility of the binge frequency as a severity criterion for BED, and directions for future research are outlined. Copyright © 2017 John Wiley & Sons, Ltd and Eating Disorders Association.

  16. Using color histograms and SPA-LDA to classify bacteria.

    Science.gov (United States)

    de Almeida, Valber Elias; da Costa, Gean Bezerra; de Sousa Fernandes, David Douglas; Gonçalves Dias Diniz, Paulo Henrique; Brandão, Deysiane; de Medeiros, Ana Claudia Dantas; Véras, Germano

    2014-09-01

    In this work, a new approach is proposed to verify the differentiating characteristics of five bacteria (Escherichia coli, Enterococcus faecalis, Streptococcus salivarius, Streptococcus oralis, and Staphylococcus aureus) by using digital images obtained with a simple webcam and variable selection by the Successive Projections Algorithm associated with Linear Discriminant Analysis (SPA-LDA). In this sense, color histograms in the red-green-blue (RGB), hue-saturation-value (HSV), and grayscale channels and their combinations were used as input data, and statistically evaluated by using different multivariate classifiers (Soft Independent Modeling by Class Analogy (SIMCA), Principal Component Analysis-Linear Discriminant Analysis (PCA-LDA), Partial Least Squares Discriminant Analysis (PLS-DA) and Successive Projections Algorithm-Linear Discriminant Analysis (SPA-LDA)). The bacteria strains were cultivated in a nutritive blood agar base layer for 24 h by following the Brazilian Pharmacopoeia, maintaining the status of cell growth and the nature of nutrient solutions under the same conditions. The best result in classification was obtained by using RGB and SPA-LDA, which reached 94 and 100 % of classification accuracy in the training and test sets, respectively. This result is extremely positive from the viewpoint of routine clinical analyses, because it avoids bacterial identification based on phenotypic identification of the causative organism using Gram staining, culture, and biochemical proofs. Therefore, the proposed method presents inherent advantages, promoting a simpler, faster, and low-cost alternative for bacterial identification.

  17. Impacts of classifying New York City students as overweight.

    Science.gov (United States)

    Almond, Douglas; Lee, Ajin; Schwartz, Amy Ellen

    2016-03-29

    US schools increasingly report body mass index (BMI) to students and their parents in annual fitness "report cards." We obtained 3,592,026 BMI reports for New York City public school students for 2007-2012. We focus on female students whose BMI puts them close to their age-specific cutoff for categorization as overweight. Overweight students are notified that their BMI "falls outside a healthy weight" and they should review their BMI with a health care provider. Using a regression discontinuity design, we compare those classified as overweight but near to the overweight cutoff to those whose BMI narrowly earned them a "healthy" BMI grouping. We find that overweight categorization generates small impacts on girls' subsequent BMI and weight. Whereas presumably an intent of BMI report cards was to slow BMI growth among heavier students, BMIs and weights did not decline relative to healthy peers when assessed the following academic year. Our results speak to the discrete categorization as overweight for girls with BMIs near the overweight cutoff, not to the overall effect of BMI reporting in New York City.
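
    A regression discontinuity comparison of the kind described above can be sketched as two local linear fits, one on each side of the cutoff, with the estimated effect read off as the gap between the fitted values at the cutoff. This is only a simplified illustration with noiseless synthetic data; the cutoff value, bandwidth, variable names and the built-in jump of 0.3 are all hypothetical and are not taken from the study.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct x values")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def rd_estimate(running, outcome, cutoff, bandwidth):
    """Jump in the outcome at the cutoff, from separate linear fits on each side."""
    # Center the running variable so each intercept is the fitted value at the cutoff.
    left = [(r - cutoff, y) for r, y in zip(running, outcome)
            if cutoff - bandwidth <= r < cutoff]
    right = [(r - cutoff, y) for r, y in zip(running, outcome)
             if cutoff <= r <= cutoff + bandwidth]
    _, at_cutoff_left = fit_line(*zip(*left))
    _, at_cutoff_right = fit_line(*zip(*right))
    return at_cutoff_right - at_cutoff_left


# Synthetic data: a linear trend plus a made-up jump of 0.3 at the cutoff.
cutoff = 85.0
running = [cutoff + k * 0.1 for k in range(-10, 11)]
outcome = [2.0 + 0.5 * (r - cutoff) + (0.3 if r >= cutoff else 0.0) for r in running]
effect = rd_estimate(running, outcome, cutoff, bandwidth=1.0)
print(round(effect, 6))
```

    Restricting the fits to a narrow bandwidth is what makes the two groups comparable: units just above and just below the cutoff are assumed to differ only in their classification.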

  18. Main: Sequences [KOME

    Lifescience Database Archive (English)

    Full Text Available Sequences Amino Acid Sequence Amino Acid sequence of full length cDNA (Longest ORF) kome_ine_full_seq...uence_amino_db.fasta.zip kome_ine_full_sequence_amino_db.zip kome_ine_full_sequence_amino_db ...

  19. Shotgun protein sequencing.

    Energy Technology Data Exchange (ETDEWEB)

    Faulon, Jean-Loup Michel; Heffelfinger, Grant S.

    2009-06-01

    A novel experimental and computational technique based on multiple enzymatic digestion of a protein or protein mixture that reconstructs protein sequences from sequences of overlapping peptides is described in this SAND report. This approach, analogous to shotgun sequencing of DNA, is to be used to sequence alternative spliced proteins, to identify post-translational modifications, and to sequence genetically engineered proteins.
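
    Reconstructing a sequence from overlapping peptides can be sketched with a simple greedy overlap-merge, analogous to toy DNA shotgun assemblers. The report's actual algorithm is not described in this record, so the code below is only an assumed illustration; the peptide strings, the `min_overlap` setting and the function names are hypothetical.

```python
def overlap(a, b, min_len=1):
    """Length of the longest suffix of a that equals a prefix of b (0 if below min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def assemble(peptides, min_overlap=3):
    """Greedily merge the pair of fragments with the longest suffix/prefix overlap."""
    frags = list(dict.fromkeys(peptides))  # drop exact duplicates, keep order
    # Drop fragments fully contained in another fragment.
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_overlap)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps; return the unmerged contigs
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Two hypothetical digests of the same made-up protein, cut at different sites.
digest_a = ["MKTAYIAK", "QRQISFVK", "SHFSRQ"]
digest_b = ["MKTAY", "IAKQRQISF", "VKSHFSRQ"]
contigs = assemble(digest_a + digest_b, min_overlap=2)
print(contigs)
```

    Because different enzymes cut at different sites, fragments from one digest bridge the cut points of the other. Note that greedy merging with very short overlaps can misassemble repetitive sequences, so the minimum overlap is a trade-off.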

  20. MISR Level 2 FIRSTLOOK TOA/Cloud Classifier parameters V001

    Data.gov (United States)

    National Aeronautics and Space Administration — This is the Level 2 FIRSTLOOK TOA/Cloud Classifiers Product. It contains the Angular Signature Cloud Mask (ASCM), Cloud Classifiers, and Support Vector Machine...

  1. 75 FR 733 - Implementation of the Executive Order, ``Classified National Security Information''

    Science.gov (United States)

    2010-01-05

    ... December 29, 2009 Implementation of the Executive Order, ``Classified National Security Information... entitled, ``Classified National Security Information'' (the ``order''), which substantially advances my... Information Security Oversight Office (ISOO) a copy of the department or agency regulations implementing...

  2. Sequencing of the Litchi Downy Blight Pathogen Reveals It Is a Phytophthora Species With Downy Mildew-Like Characteristics.

    Science.gov (United States)

    Ye, Wenwu; Wang, Yang; Shen, Danyu; Li, Delong; Pu, Tianhuizi; Jiang, Zide; Zhang, Zhengguang; Zheng, Xiaobo; Tyler, Brett M; Wang, Yuanchao

    2016-07-01

    On the basis of its downy mildew-like morphology, the litchi downy blight pathogen was previously named Peronophythora litchii. Recently, however, it was proposed to transfer this pathogen to Phytophthora clade 4. To better characterize this unusual oomycete species and important fruit pathogen, we obtained the genome sequence of Phytophthora litchii and compared it to those from other oomycete species. P. litchii has a small genome with tightly spaced genes. On the basis of a multilocus phylogenetic analysis, the placement of P. litchii in the genus Phytophthora is strongly supported. Effector proteins predicted included 245 RxLR, 30 necrosis-and-ethylene-inducing protein-like, and 14 crinkler proteins. The typical motifs, phylogenies, and activities of these effectors were typical for a Phytophthora species. However, like the genome features of the analyzed downy mildews, P. litchii exhibited a streamlined genome with a relatively small number of genes in both core and species-specific protein families. The low GC content and slight codon preferences of P. litchii sequences were similar to those of the analyzed downy mildews and a subset of Phytophthora species. Taken together, these observations suggest that P. litchii is a Phytophthora pathogen that is in the process of acquiring downy mildew-like genomic and morphological features. Thus P. litchii may provide a novel model for investigating morphological development and genomic adaptation in oomycete pathogens.
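
    GC content, one of the genome features compared above, is simply the fraction of G and C bases among the unambiguous bases of a sequence. A minimal sketch follows; the function name and the example sequence are hypothetical, and ambiguous symbols such as N are skipped by assumption.

```python
def gc_content(sequence):
    """Fraction of G/C among unambiguous A/C/G/T bases (case-insensitive)."""
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    return sum(b in "GC" for b in bases) / len(bases)


print(gc_content("ATGCGCNNat"))
```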

  3. KinMutRF: a random forest classifier of sequence variants in the human protein kinase superfamily

    DEFF Research Database (Denmark)

    Pons, Tirso; Vazquez, Miguel; Matey-Hernandez, María Luisa

    2016-01-01

    as a random forest ponder a battery of features that characterize the variants: a) at the gene level, including membership to a Kinbase group and Gene Ontology terms; b) at the PFAM domain level; and c) at the residue level, the types of amino acids involved, changes in biochemical properties, functional...

  4. A GIS semiautomatic tool for classifying and mapping wetland soils

    Science.gov (United States)

    Moreno-Ramón, Héctor; Marqués-Mateu, Angel; Ibáñez-Asensio, Sara

    2016-04-01

    Wetlands are among the most productive and biodiverse ecosystems in the world. Water is the main resource and controls the relationships between agents and factors that determine the quality of the wetland. However, vegetation, wildlife and soils are also essential factors to understand these environments. Soils have possibly been the least studied resource because they are difficult to sample. As a result, wetland soils have sometimes been classified only broadly. The traditional methodology states that homogeneous soil units should be based on the five soil-forming factors. A problem can appear when the variation of one soil-forming factor is too small to differentiate a change in soil units, or when there is another factor that is not taken into account (e.g. a fluctuating water table). This is the case of the Albufera of Valencia, a coastal wetland located in the middle east of the Iberian Peninsula (Spain). The saline water table fluctuates throughout the year and generates differences in soils. To solve this problem, the objectives of this study were to establish a reliable methodology that avoids these problems, and to develop a GIS tool that would allow us to define homogeneous soil units in wetlands. This step is essential for the soil scientist, who has to decide the number of soil profiles in a study. The research was conducted with data from 133 soil pits of a previous study in the wetland. In that study, soil parameters of 401 samples (organic carbon, salinity, carbonates, n-value, etc.) were analysed. In a first stage, GIS layers were generated according to depth. The method employed was Bayesian Maximum Entropy. Subsequently, a program was designed in a GIS environment based on decision tree algorithms. The goal of this tool was to create a single layer, for each soil variable, according to the different diagnostic criteria of Soil Taxonomy (properties, horizons and diagnostic epipedons). At the end, the program

  5. CLASSIFYING BENIGN AND MALIGNANT MASSES USING STATISTICAL MEASURES

    Directory of Open Access Journals (Sweden)

    B. Surendiran

    2011-11-01

    Full Text Available Breast cancer is the primary and most common disease found in women and causes the second highest rate of death after lung cancer. The digital mammogram is the X-ray of the breast captured for analysis, interpretation and diagnosis. According to the Breast Imaging Reporting and Data System (BIRADS), benign and malignant masses can be differentiated using their shape, size and density, which is how radiologists visualize the mammograms. According to BIRADS mass shape characteristics, benign masses tend to be round, oval, or lobular in shape and malignant masses are lobular or irregular in shape. Measuring regular and irregular shapes mathematically is found to be a difficult task, since there is no single measure to differentiate various shapes. In this paper, the malignant and benign masses present in mammograms are classified using Hue, Saturation and Value (HSV) weight function based statistical measures. The weight function is robust against noise and captures the degree of gray content of the pixel. The statistical measures use the gray weight value instead of the gray pixel value to effectively discriminate masses. The 233 mammograms from the Digital Database for Screening Mammography (DDSM) benchmark dataset have been used. The PASW data mining modeler has been used for constructing a Neural Network for identifying the importance of statistical measures. Based on the obtained important statistical measure, the C5.0 tree has been constructed with a 60-40 data split. The experimental results are found to be encouraging. Also, the results agree with the standard specified by the American College of Radiology-BIRADS Systems.

  6. Expert system classifier for adaptive radiation therapy in prostate cancer.

    Science.gov (United States)

    Guidi, Gabriele; Maffei, Nicola; Vecchi, Claudio; Gottardi, Giovanni; Ciarmatori, Alberto; Mistretta, Grazia Maria; Mazzeo, Ercole; Giacobazzi, Patrizia; Lohr, Frank; Costi, Tiziana

    2017-06-01

    A classifier-based expert system was developed to compare delivered and planned radiation therapy in prostate cancer patients. Its aim is to automatically identify patients that can benefit from an adaptive treatment strategy. The study predominantly addresses dosimetric uncertainties and critical issues caused by motion of hollow organs. 1200 MVCT images of 38 prostate adenocarcinoma cases were analyzed. An automatic daily re-contouring of structures (i.e. rectum, bladder and femoral heads), rigid/deformable registration and dose warping was carried out to simulate dose and volume variations during therapy. Support vector machine, K-means clustering algorithms and similarity index analysis were used to create an unsupervised predictive tool to detect incorrect setup and/or morphological changes as a consequence of inadequate patient preparation due to stochastic physiological changes, supporting clinical decision-making. After training on a dataset that was considered sufficiently dosimetrically stable, the system identified two equally sized macro clusters with distinctly different volumetric and dosimetric baseline properties and defined thresholds for these two clusters. Application to the test cohort resulted in 25% of the patients located outside the two macro clusters thresholds and which were therefore suspected to be dosimetrically unstable. In these patients, over the treatment course, mean volumetric changes of 30 and 40% for rectum and bladder were detected which possibly represents values justifying adjustment of patient preparation, frequent re-planning or a plan-of-the-day strategy. Based on our research, by combining daily IGRT images with rigid/deformable registration and dose warping, it is possible to apply a machine learning approach to the clinical setting obtaining useful information for a decision regarding an individualized adaptive strategy. Especially for treatments influenced by the movement of hollow organs, this could reduce inadequate

  7. Unsupervised online classifier in sleep scoring for sleep deprivation studies.

    Science.gov (United States)

    Libourel, Paul-Antoine; Corneyllie, Alexandra; Luppi, Pierre-Hervé; Chouvet, Guy; Gervasoni, Damien

    2015-05-01

    This study was designed to evaluate an unsupervised adaptive algorithm for real-time detection of sleep and wake states in rodents. We designed a Bayesian classifier that automatically extracts electroencephalogram (EEG) and electromyogram (EMG) features and categorizes non-overlapping 5-s epochs into one of the three major sleep and wake states without any human supervision. This sleep-scoring algorithm is coupled online with a new device to perform selective paradoxical sleep deprivation (PSD). Controlled laboratory settings for chronic polygraphic sleep recordings and selective PSD. Ten adult Sprague-Dawley rats instrumented for chronic polysomnographic recordings. The performance of the algorithm is evaluated by comparison with the score obtained by a human expert reader. Online detection of PS is then validated with a PSD protocol with duration of 72 hours. Our algorithm gave a high concordance with human scoring with an average κ coefficient > 70%. Notably, the specificity to detect PS reached 92%. Selective PSD using real-time detection of PS strongly reduced PS amounts, leaving only brief PS bouts necessary for the detection of PS in EEG and EMG signals (4.7 ± 0.7% over 72 h, versus 8.9 ± 0.5% in baseline), and was followed by a significant PS rebound (23.3 ± 3.3% over 150 minutes). Our fully unsupervised data-driven algorithm overcomes some limitations of the other automated methods such as the selection of representative descriptors or threshold settings. When used online and coupled with our sleep deprivation device, it represents a better option for selective PSD than other methods like the tedious gentle handling or the platform method. © 2015 Associated Professional Sleep Societies, LLC.
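
    Agreement between an automatic scorer and a human reader, reported above as a κ coefficient, is commonly measured with Cohen's kappa, which corrects the observed agreement for the agreement expected by chance. The sketch below assumes that definition; the epoch labels (W for wake, S for slow-wave sleep, P for paradoxical sleep) and the two label sequences are made-up examples.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: (p_observed - p_expected) / (1 - p_expected)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    # Chance agreement: product of the two raters' marginal frequencies per state.
    expected = sum(count_a[s] * count_b[s] for s in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical scoring of ten 5-s epochs.
algorithm = ["W", "W", "S", "S", "P", "W", "S", "S", "P", "W"]
human = ["W", "W", "S", "S", "P", "W", "S", "W", "P", "W"]
kappa = cohen_kappa(algorithm, human)
print(kappa)
```

    Here the raters agree on 9 of 10 epochs, but because 0.36 of that agreement is expected by chance given the label frequencies, kappa is lower than the raw agreement.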

  8. Multimodal fusion of polynomial classifiers for automatic person recognition

    Science.gov (United States)

    Broun, Charles C.; Zhang, Xiaozheng

    2001-03-01

    With the prevalence of the information age, privacy and personalization are forefront in today's society. As such, biometrics are viewed as essential components of current evolving technological systems. Consumers demand unobtrusive and non-invasive approaches. In our previous work, we have demonstrated a speaker verification system that meets these criteria. However, there are additional constraints for fielded systems. The required recognition transactions are often performed in adverse environments and across diverse populations, necessitating robust solutions. There are two significant problem areas in current generation speaker verification systems. The first is the difficulty in acquiring clean audio signals in all environments without encumbering the user with a head-mounted close-talking microphone. Second, unimodal biometric systems do not work with a significant percentage of the population. To combat these issues, multimodal techniques are being investigated to improve system robustness to environmental conditions, as well as improve overall accuracy across the population. We propose a multimodal approach that builds on our current state-of-the-art speaker verification technology. In order to maintain the transparent nature of the speech interface, we focus on optical sensing technology to provide the additional modality, giving us an audio-visual person recognition system. For the audio domain, we use our existing speaker verification system. For the visual domain, we focus on lip motion. This is chosen, rather than static face or iris recognition, because it provides dynamic information about the individual. In addition, the lip dynamics can aid speech recognition to provide liveness testing. The visual processing method makes use of both color and edge information, combined within a Markov random field (MRF) framework, to localize the lips. Geometric features are extracted and input to a polynomial classifier for the person recognition process. A late

  9. Locating and classifying defects using an hybrid data base

    Energy Technology Data Exchange (ETDEWEB)

    Luna-Aviles, A; Diaz Pineda, A [Tecnologico de Estudios Superiores de Coacalco. Av. 16 de Septiembre 54, Col. Cabecera Municipal. C.P. 55700 (Mexico); Hernandez-Gomez, L H; Urriolagoitia-Calderon, G; Urriolagoitia-Sosa, G [Instituto Politecnico Nacional. ESIME-SEPI. Unidad Profesional ' Adolfo Lopez Mateos' Edificio 5, 30 Piso, Colonia Lindavista. Gustavo A. Madero. 07738 Mexico D.F. (Mexico); Durodola, J F [School of Technology, Oxford Brookes University, Headington Campus, Gipsy Lane, Oxford OX3 0BP (United Kingdom); Beltran Fernandez, J A, E-mail: alelunaav@hotmail.com, E-mail: luishector56@hotmail.com, E-mail: jdurodola@brookes.ac.uk

    2011-07-19

    A computational inverse technique was used in the localization and classification of defects. Postulated voids of two different sizes (2 mm and 4 mm diameter) were introduced in PMMA bars with and without a notch. The bar dimensions are 200x20x5 mm. One half of them were plain and the other half had a notch (3 mm x 4 mm) close to the defect area (19 mm x 16 mm). This analysis was done with an Artificial Neural Network (ANN) and its optimization was done with an Adaptive Neuro Fuzzy Procedure (ANFIS). A hybrid data base was developed with numerical and experimental results. Synthetic data was generated with the finite element method using the SOLID95 element of the ANSYS code. A parametric analysis was carried out. Only one defect in such bars was taken into account and the first five natural frequencies were calculated. 460 cases were evaluated. Half of them were plain and the other half had a notch. All the input data was classified in two groups. Each one has 230 cases and corresponds to one of the two sorts of voids mentioned above. On the other hand, experimental analysis was carried out with PMMA specimens of the same size. The first two natural frequencies of 40 cases were obtained with one void. The other three frequencies were obtained numerically. 20 of these bars were plain and the others had a notch. These experimental results were introduced in the synthetic data base. 400 cases were taken randomly and, with this information, the ANN was trained with the backpropagation algorithm. The accuracy of the results was tested with the 100 cases that were left. In the next stage of this work, the ANN output was optimized with ANFIS. Previous papers showed that localization and classification of defects was reduced as notches were introduced in such bars. In the case of this paper, improved results were obtained when a hybrid data base was used.

  10. Predicting Alzheimer's disease by classifying 3D-Brain MRI images using SVM and other well-defined classifiers

    Science.gov (United States)

    Matoug, S.; Abdel-Dayem, A.; Passi, K.; Gross, W.; Alqarni, M.

    2012-02-01

    Alzheimer's disease (AD) is the most common form of dementia affecting seniors age 65 and over. When AD is suspected, the diagnosis is usually confirmed with behavioural assessments and cognitive tests, often followed by a brain scan. Advanced medical imaging and pattern recognition techniques are good tools to create a learning database in the first step and to predict the class label of incoming data in order to assess the development of the disease, i.e., the conversion from prodromal stages (mild cognitive impairment) to Alzheimer's disease, which is the most critical brain disease for the senior population. Advanced medical imaging such as the volumetric MRI can detect changes in the size of brain regions due to the loss of the brain tissues. Measuring regions that atrophy during the progress of Alzheimer's disease can help neurologists in detecting and staging the disease. In the present investigation, we present a pseudo-automatic scheme that reads volumetric MRI, extracts the middle slices of the brain region, performs segmentation in order to detect the region of brain's ventricle, generates a feature vector that characterizes this region, creates an SQL database that contains the generated data, and finally classifies the images based on the extracted features. For our results, we have used the MRI data sets from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database.

  11. The Acquisition of Cantonese Classifiers by Preschool Children in Hong Kong

    Science.gov (United States)

    Tse, Shek Kam; Li, Hui; Leung, Shing On

    2007-01-01

    The Cantonese language has a complex classifier system and young learners need to pay attention to both the semantics and syntax of classifiers. This study investigated the repertoire of classifiers produced by 492 Cantonese-speaking preschoolers in three age groups (3;0, 4;0 and 5;0). Spontaneous utterances produced in 30-minute toy-play contexts…

  12. Classification of Cancer Gene Selection Using Random Forest and Neural Network Based Ensemble Classifier

    Directory of Open Access Journals (Sweden)

    Jogendra Kushwah

    2013-06-01

    Full Text Available The free radical gene classification of cancer diseases is a challenging job in biomedical data engineering. Various classifiers are used to improve the classification of selected cancer genes, but the classifications they produce are not validated. An ensemble classifier is therefore used for cancer gene classification, combining a neural network classifier with a random forest. The random forest is an ensemble technique in which a number of classifiers are combined through the class labels at their leaf nodes. In this paper we combine a neural network with a random forest ensemble classifier to classify selected cancer genes for the diagnostic analysis of cancer diseases. The proposed method differs from most ensemble classifier methods, which follow an input-output paradigm of neural networks in which the members of the ensemble are selected from a set of neural network classifiers. The number of classifiers is determined during the growing procedure of the forest. Furthermore, the proposed method produces an ensemble that is not only accurate but also diverse, ensuring the two important properties that should characterize an ensemble classifier. For the empirical evaluation of the proposed method we used the UCI cancer diseases data set. Our experimental results show better performance in comparison with random forest classification.

  13. 48 CFR 52.227-10 - Filing of Patent Applications-Classified Subject Matter.

    Science.gov (United States)

    2010-10-01

    ... Applications-Classified Subject Matter. 52.227-10 Section 52.227-10 Federal Acquisition Regulations System... Text of Provisions and Clauses 52.227-10 Filing of Patent Applications—Classified Subject Matter. As prescribed at 27.203-2, insert the following clause: Filing of Patent Applications—Classified Subject...

  14. Learning Bayesian network classifiers for credit scoring using Markov Chain Monte Carlo search

    NARCIS (Netherlands)

    Baesens, B.; Egmont-Petersen, M.; Castelo, R.; Vanthienen, J.

    2002-01-01

    In this paper, we will evaluate the power and usefulness of Bayesian network classifiers for credit scoring. Various types of Bayesian network classifiers will be evaluated and contrasted including unrestricted Bayesian network classifiers learnt using Markov Chain Monte Carlo (MCMC) search. The exp

  15. Sequence Read Archive (SRA)

    Data.gov (United States)

    U.S. Department of Health & Human Services — The Sequence Read Archive (SRA) stores raw sequencing data from the next generation of sequencing platforms including Roche 454 GS System®, Illumina Genome...

  16. Multimodal sequence learning.

    Science.gov (United States)

    Kemény, Ferenc; Meier, Beat

    2016-02-01

    While sequence learning research models complex phenomena, previous studies have mostly focused on unimodal sequences. The goal of the current experiment is to put implicit sequence learning into a multimodal context: to test whether it can operate across different modalities. We used the Task Sequence Learning paradigm to test whether sequence learning varies across modalities, and whether participants are able to learn multimodal sequences. Our results show that implicit sequence learning is very similar regardless of the source modality. However, the presence of correlated task and response sequences was required for learning to take place. The experiment provides new evidence for implicit sequence learning of abstract conceptual representations. In general, the results suggest that correlated sequences are necessary for implicit sequence learning to occur. Moreover, they show that elements from different modalities can be automatically integrated into one unitary multimodal sequence.

  17. How to Name and Classify Your Phage: An Informal Guide

    Directory of Open Access Journals (Sweden)

    Evelien Adriaenssens

    2017-04-01

    Full Text Available With this informal guide, we try to assist both new and experienced phage researchers through two important stages that follow phage discovery; that is, naming and classification. Providing an appropriate name for a bacteriophage is not as trivial as it sounds, and the effects might be long-lasting in databases and in official taxon names. Phage classification is the responsibility of the Bacterial and Archaeal Viruses Subcommittee (BAVS) of the International Committee on the Taxonomy of Viruses (ICTV). While the BAVS aims at providing a holistic approach to phage taxonomy, for individual researchers who have isolated and sequenced a new phage, this can be a little overwhelming. We are now providing these researchers with an informal guide to phage naming and classification, taking a “bottom-up” approach from the phage isolate level.

  18. Coordinate cytokine regulatory sequences

    Science.gov (United States)

    Frazer, Kelly A.; Rubin, Edward M.; Loots, Gabriela G.

    2005-05-10

    The present invention provides CNS sequences that regulate the cytokine gene expression, expression cassettes and vectors comprising or lacking the CNS sequences, host cells and non-human transgenic animals comprising the CNS sequences or lacking the CNS sequences. The present invention also provides methods for identifying compounds that modulate the functions of CNS sequences as well as methods for diagnosing defects in the CNS sequences of patients.

  19. ON SPECTRAL PROPERTIES OF A NEW OPERATOR OVER SEQUENCE SPACES c AND c0

    Institute of Scientific and Technical Information of China (English)

    Ezgi ERDOĞAN; Vatan KARAKAYA

    2014-01-01

    In this work, we classify and calculate the spectra (point spectrum, continuous spectrum and residual spectrum) over the sequence spaces ℓ∞, c and c0 for a new matrix operator W, which is obtained as a matrix product.

  20. Robust Template Decomposition without Weight Restriction for Cellular Neural Networks Implementing Arbitrary Boolean Functions Using Support Vector Classifiers

    Directory of Open Access Journals (Sweden)

    Yih-Lon Lin

    2013-01-01

    Full Text Available If the given Boolean function is linearly separable, a robust uncoupled cellular neural network can be designed as a maximal margin classifier. On the other hand, if the given Boolean function is linearly separable but has a small geometric margin or it is not linearly separable, a popular approach is to find a sequence of robust uncoupled cellular neural networks implementing the given Boolean function. In the past research works using this approach, the control template parameters and thresholds are restricted to assume only a given finite set of integers, and this is certainly unnecessary for the template design. In this study, we try to remove this restriction. Minterm- and maxterm-based decomposition algorithms utilizing the soft margin and maximal margin support vector classifiers are proposed to design a sequence of robust templates implementing an arbitrary Boolean function. Several illustrative examples are simulated to demonstrate the efficiency of the proposed method by comparing our results with those produced by other decomposition methods with restricted weights.

  1. Classification of traumatic brain injury severity using informed data reduction in a series of binary classifier algorithms.

    Science.gov (United States)

    Prichep, Leslie S; Jacquin, Arnaud; Filipenko, Julie; Dastidar, Samanwoy Ghosh; Zabele, Stephen; Vodencarević, Asmir; Rothman, Neil S

    2012-11-01

    Assessment of medical disorders is often aided by objective diagnostic tests which can lead to early intervention and appropriate treatment. In the case of brain dysfunction caused by head injury, there is an urgent need for quantitative evaluation methods to aid in acute triage of those subjects who have sustained traumatic brain injury (TBI). Current clinical tools to detect mild TBI (mTBI/concussion) are limited to subjective reports of symptoms and short neurocognitive batteries, offering little objective evidence for clinical decisions; or computed tomography (CT) scans, with radiation-risk, that are most often negative in mTBI. This paper describes a novel methodology for the development of algorithms to provide multi-class classification in a substantial population of brain injured subjects, across a broad age range and representative subpopulations. The method is based on age-regressed quantitative features (linear and nonlinear) extracted from brain electrical activity recorded from a limited montage of scalp electrodes. These features are used as input to a unique "informed data reduction" method, maximizing confidence of prospective validation and minimizing over-fitting. A training set for supervised learning was used, including: "normal control," "concussed," and "structural injury/CT positive (CT+)." The classifier function separating CT+ from the other groups demonstrated a sensitivity of 96% and specificity of 78%; the classifier separating "normal controls" from the other groups demonstrated a sensitivity of 81% and specificity of 74%, suggesting high utility of such classifiers in acute clinical settings. The use of a sequence of classifiers where the desired risk can be stratified further supports clinical utility.
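    The sensitivity and specificity figures quoted above are one-vs-rest measures: one group (for example CT+) is treated as positive and the remaining groups as negative. A minimal sketch of that calculation follows. The labels and predictions are invented toy data, not results from the study.

```python
def sensitivity_specificity(y_true, y_pred, positive):
    """One-vs-rest sensitivity and specificity for the class `positive`."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # positive case correctly flagged
            else:
                fn += 1  # positive case missed
        else:
            if pred == positive:
                fp += 1  # negative case wrongly flagged
            else:
                tn += 1  # negative case correctly left unflagged
    return tp / (tp + fn), tn / (tn + fp)

# Toy data: three groups, with "CT+" taken as the positive class.
y_true = ["CT+", "CT+", "concussed", "normal", "normal", "concussed"]
y_pred = ["CT+", "CT+", "CT+", "normal", "normal", "concussed"]
sens, spec = sensitivity_specificity(y_true, y_pred, "CT+")
print(sens, spec)
```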

  2. EnsembleGASVR: A novel ensemble method for classifying missense single nucleotide polymorphisms

    KAUST Repository

    Rapakoulia, Trisevgeni

    2014-04-26

    Motivation: Single nucleotide polymorphisms (SNPs) are considered the most frequently occurring DNA sequence variations. Several computational methods have been proposed for the classification of missense SNPs as neutral or disease-associated. However, existing computational approaches fail to select relevant features, choosing them arbitrarily without sufficient documentation. Moreover, they are limited by the problem of missing values and by imbalance between the learning datasets, and most of them do not support their predictions with confidence scores. Results: To overcome these limitations, a novel ensemble computational methodology is proposed. EnsembleGASVR employs a two-step algorithm, which in its first step applies a novel evolutionary embedded algorithm to locate close-to-optimal Support Vector Regression models. In its second step, these models are combined to extract a universal predictor, which is less prone to overfitting, systematizes the rebalancing of the learning sets and uses an internal approach for solving the missing values problem without loss of information. Confidence scores support all the predictions, and the model becomes tunable by modifying the classification thresholds. An extensive study was performed to collect the most relevant features for the problem of classifying SNPs, and a superset of 88 features was constructed. Experimental results show that the proposed framework outperforms well-known algorithms in terms of classification performance on the examined datasets. Finally, the proposed algorithmic framework was able to uncover the significant role of certain features such as the solvent accessibility feature, and the top-scored predictions were further validated by linking them with disease phenotypes. © The Author 2014.

  3. Genetic evidence and integration of various data sources for classifying uncertain variants into a single model.

    Science.gov (United States)

    Goldgar, David E; Easton, Douglas F; Byrnes, Graham B; Spurdle, Amanda B; Iversen, Edwin S; Greenblatt, Marc S

    2008-11-01

    Genetic testing often results in the finding of a variant whose clinical significance is unknown. A number of different approaches have been employed in the attempt to classify such variants. For some variants, case-control, segregation, family history, or other statistical studies can provide strong evidence of direct association with cancer risk. For most variants, other evidence is available that relates to properties of the protein or gene sequence. In this work we propose a Bayesian method for assessing the likelihood that a variant is pathogenic. We discuss the assessment of prior probability, and how to combine the various sources of data into a statistically valid integrated assessment with a posterior probability of pathogenicity. In particular, we propose the use of a two-component mixture model to integrate these various sources of data and to estimate the parameters related to sensitivity and specificity of specific kinds of evidence. Further, we discuss some of the issues involved in this process and the assumptions that underpin many of the methods used in the evaluation process.
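    Combining a prior probability with several sources of evidence into a posterior probability can be sketched with Bayes' rule in odds form. The sketch assumes that each evidence source is summarized by a likelihood ratio and that the sources are independent given the variant's true status. It does not reproduce the two-component mixture model the abstract proposes, and the numbers are invented.

```python
from math import prod

def posterior_probability(prior, likelihood_ratios):
    """Posterior probability of pathogenicity from a prior and likelihood ratios.

    Assumes the evidence sources are conditionally independent, so that
    posterior odds = prior odds * product of the likelihood ratios.
    """
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * prod(likelihood_ratios)
    return posterior_odds / (1.0 + posterior_odds)

# Hypothetical example: prior of 0.1 and two evidence sources, each with LR = 3.
p = posterior_probability(0.1, [3.0, 3.0])
print(round(p, 3))
```

    A likelihood ratio above 1 moves the posterior toward pathogenic and a ratio below 1 moves it toward neutral, so conflicting evidence partly cancels.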

  4. Combination of designed immune based classifiers for ERP assessment in a P300-based GKT

    Directory of Open Access Journals (Sweden)

    Mohammad Hassan Moradi

    2012-08-01

    Full Text Available Constructing a precise classifier is an important issue in pattern recognition tasks. Combining the decisions of several competing classifiers to improve classification accuracy has attracted interest in many research areas. In this study, the Artificial Immune System (AIS), an effective artificial intelligence technique, was used to design several efficient classifiers. The combination of multiple immune-based classifiers was tested on ERP assessment in a P300-based GKT (Guilty Knowledge Test). Experimental results showed that the proposed classifier, named Compact Artificial Immune System (CAIS), was a successful classification method and could compete with other classifiers such as K-nearest neighbour (KNN), Linear Discriminant Analysis (LDA) and Support Vector Machine (SVM). The experiments also showed that using decision fusion techniques to combine multiple classifiers led to better recognition results. The best recognition rate achieved by CAIS was 80.90%, an improvement over the other classification methods applied in our study.

  5. Win percentage: a novel measure for assessing the suitability of machine classifiers for biological problems

    Science.gov (United States)

    2012-01-01

    Background Selecting an appropriate classifier for a particular biological application poses a difficult problem for researchers and practitioners alike. In particular, choosing a classifier depends heavily on the features selected. For high-throughput biomedical datasets, feature selection is often a preprocessing step that gives an unfair advantage to the classifiers built with the same modeling assumptions. In this paper, we seek classifiers that are suitable to a particular problem independent of feature selection. We propose a novel measure, called "win percentage", for assessing the suitability of machine classifiers to a particular problem. We define win percentage as the probability a classifier will perform better than its peers on a finite random sample of feature sets, giving each classifier equal opportunity to find suitable features. Results First, we illustrate the difficulty in evaluating classifiers after feature selection. We show that several classifiers can each perform statistically significantly better than their peers given the right feature set among the top 0.001% of all feature sets. We illustrate the utility of win percentage using synthetic data, and evaluate six classifiers in analyzing eight microarray datasets representing three diseases: breast cancer, multiple myeloma, and neuroblastoma. After initially using all Gaussian gene-pairs, we show that precise estimates of win percentage (within 1%) can be achieved using a smaller random sample of all feature pairs. We show that for these data no single classifier can be considered the best without knowing the feature set. Instead, win percentage captures the non-zero probability that each classifier will outperform its peers based on an empirical estimate of performance. 
    Conclusions Fundamentally, we illustrate that the selection of the most suitable classifier (i.e., one that is more likely to perform better than its peers) not only depends on the dataset and application but also on the
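    The definition of win percentage given above can be approximated by simulation. In this sketch each classifier picks its best score within a random sample of feature sets, and the fraction of samples in which it beats its peers is recorded. The score table, the tie-splitting rule and the parameter names are assumptions for illustration, not the authors' procedure.

```python
import random

def win_percentage(scores, sample_size, n_trials=1000, seed=0):
    """Estimate how often each classifier wins on random samples of feature sets.

    scores maps a classifier name to a list of performance values, one per
    feature set (same ordering for every classifier).
    """
    rng = random.Random(seed)
    names = list(scores)
    n_feature_sets = len(next(iter(scores.values())))
    wins = {name: 0.0 for name in names}
    for _ in range(n_trials):
        sample = rng.sample(range(n_feature_sets), sample_size)
        # Every classifier gets the same sample and keeps its own best score.
        best = {name: max(scores[name][i] for i in sample) for name in names}
        top = max(best.values())
        winners = [name for name in names if best[name] == top]
        for name in winners:
            wins[name] += 1.0 / len(winners)  # ties share the win equally
    return {name: wins[name] / n_trials for name in names}

# Invented accuracies for two classifiers on four feature sets.
scores = {
    "knn": [0.70, 0.92, 0.65, 0.71],
    "svm": [0.80, 0.60, 0.85, 0.75],
}
wp = win_percentage(scores, sample_size=2)
print(wp)
```

    In this toy table neither classifier wins every sample, which mirrors the abstract's point that the better classifier depends on which feature sets are available.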

  6. Consensus of sample-balanced classifiers for identifying ligand-binding residue by co-evolutionary physicochemical characteristics of amino acids

    KAUST Repository

    Chen, Peng

    2013-01-01

    Protein-ligand binding is an important mechanism for some proteins to perform their functions, and those binding sites are the residues of proteins that physically bind to ligands. So far, the state-of-the-art methods search for similar, known structures of the query and predict the binding sites based on the solved structures. However, such structural information is not commonly available. In this paper, we propose a sequence-based approach to identify protein-ligand binding residues. Due to the highly imbalanced samples between the ligand-binding sites and non ligand-binding sites, we constructed several balanced data sets, for each of which a random forest (RF)-based classifier was trained. The ensemble of these RF classifiers formed a sequence-based protein-ligand binding site predictor. Experimental results on CASP9 targets demonstrated that our method compared favorably with the state-of-the-art. © Springer-Verlag Berlin Heidelberg 2013.
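    One common way to build several balanced data sets from imbalanced samples is to keep all minority-class residues and draw an equally sized random subset of the majority class for each set. The sketch below assumes that scheme and a simple majority vote over the per-set classifiers. The abstract specifies neither, so both are illustrative choices, and the random forest training itself is left out.

```python
import random

def balanced_subsets(labels, n_subsets, seed=0):
    """Return index lists, each holding all minority samples plus an
    equally sized random draw from the majority class (undersampling)."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    subsets = []
    for _ in range(n_subsets):
        sampled = rng.sample(majority, len(minority))
        subsets.append(sorted(minority + sampled))
    return subsets

def majority_vote(votes):
    """Combine 0/1 predictions from the per-subset classifiers."""
    return 1 if sum(votes) * 2 > len(votes) else 0

# Toy labels: 1 = binding residue, 0 = non-binding residue.
labels = [1, 0, 0, 0, 0, 1, 0, 0]
subsets = balanced_subsets(labels, n_subsets=3)
print(subsets)
```

    Each index list would be used to train one classifier, and `majority_vote` then merges their predictions for a query residue.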

  7. Fully automated stroke tissue estimation using random forest classifiers (FASTER).

    Science.gov (United States)

    McKinley, Richard; Häni, Levin; Gralla, Jan; El-Koussy, M; Bauer, S; Arnold, M; Fischer, U; Jung, S; Mattmann, Kaspar; Reyes, Mauricio; Wiest, Roland

    2017-08-01

    Several clinical trials have recently proven the efficacy of mechanical thrombectomy for treating ischemic stroke, within a six-hour window for therapy. To move beyond treatment windows and toward personalized risk assessment, it is essential to accurately identify the extent of tissue-at-risk ("penumbra"). We introduce a fully automated method to estimate the penumbra volume using multimodal MRI (diffusion-weighted imaging, a T2w- and T1w contrast-enhanced sequence, and dynamic susceptibility contrast perfusion MRI). The method estimates tissue-at-risk by predicting tissue damage in the case of both persistent occlusion and of complete recanalization. When applied to 19 test cases with a thrombolysis in cerebral infarction grading of 1-2a, mean overestimation of final lesion volume was 30 ml, compared with 121 ml for manually corrected thresholding. Predicted tissue-at-risk volume was positively correlated with final lesion volume ( p serve as an alternative method for identifying tissue-at-risk that may aid in treatment selection in ischemic stroke.

  8. Multivariate models to classify Tuscan virgin olive oils by zone.

    Directory of Open Access Journals (Sweden)

    Alessandri, Stefano

    1999-10-01

    Full Text Available In order to study and classify Tuscan virgin olive oils, 179 samples were collected. They were obtained from drupes harvested during the first half of November from three different zones of the Region. The sampling was repeated for 5 years. Fatty acids, phytol, aliphatic and triterpenic alcohols, triterpenic dialcohols, sterols, squalene and tocopherols were analyzed. A subset of variables was considered; these had been selected in a preceding work as the most effective and reliable from the univariate point of view. The analytical data were transformed (except for cycloartenol) to compensate for annual variations: the mean for the East zone was subtracted from each value, within each year. Univariate three-class models were calculated and further variables discarded. Then multivariate three-zone models were evaluated, including phytol (which was always selected) and all the combinations of palmitic, palmitoleic and oleic acid, tetracosanol, cycloartenol and squalene. Models including from two to seven variables were studied. The best model shows by-zone classification errors of less than 40%, by-zone within-year classification errors of less than 45% and a global classification error of 30%. This model includes phytol, palmitic acid, tetracosanol and cycloartenol.

    To study and classify Tuscan virgin olive oils, 179 samples were used, obtained from fruits harvested during the first half of November from three different zones of the Region. The sampling was repeated for 5 years. Fatty acids, phytol, aliphatic and triterpenic alcohols, triterpenic dialcohols, sterols, squalene and tocopherols were analyzed. A subset of variables was considered, selected in a previous work as the most effective and reliable from the univariate point of view. The analytical data were transformed (except for cycloartenol) to compensate for annual variations, …
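    The year-wise transformation described in this record, in which the East-zone mean is subtracted from every value within each year, can be sketched as follows. The tuple layout, the years, the "West" zone name and the measurements are invented for illustration; only the East reference zone comes from the abstract.

```python
from collections import defaultdict

def center_on_reference_zone(samples, reference_zone="East"):
    """Subtract, within each year, the reference-zone mean from every value.

    samples is a list of (year, zone, value) tuples for one analytical variable.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for year, zone, value in samples:
        if zone == reference_zone:
            sums[year] += value
            counts[year] += 1
    ref_mean = {year: sums[year] / counts[year] for year in sums}
    return [(year, zone, value - ref_mean[year]) for year, zone, value in samples]

# Invented measurements for two harvest years.
samples = [
    (1994, "East", 10.0), (1994, "East", 12.0), (1994, "West", 15.0),
    (1995, "East", 20.0), (1995, "West", 18.0),
]
centered = center_on_reference_zone(samples)
print(centered)
```

    After centering, a value expresses how far a sample lies from that year's East-zone average, so a year that is high or low across all zones no longer masks the differences between zones.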

  9. Efficient DNA barcode regions for classifying Piper species (Piperaceae).

    Science.gov (United States)

    Chaveerach, Arunrat; Tanee, Tawatchai; Sanubol, Arisa; Monkheang, Pansa; Sudmoon, Runglawan

    2016-01-01

    Piper species are used for spices, in traditional and processed forms of medicines, in cosmetic compounds, in cultural activities and insecticides. Here barcode analysis was performed for identification of plant parts, young plants and modified forms of plants. Thirty-six Piper species were collected and the three barcode regions, matK, rbcL and psbA-trnH spacer, were amplified, sequenced and aligned to determine their genetic distances. For intraspecific genetic distances, the most effective values for the species identification ranged from no difference to very low distance values. However, Piper betle had the highest values at 0.386 for the matK region. This finding may be due to Piper betle being an economic and cultivated species, and thus is supported with growth factors, which may have affected its genetic distance. The interspecific genetic distances that were most effective for identification of different species were from the matK region and ranged from a low of 0.002 in 27 paired species to a high of 0.486. Eight species pairs, Piper kraense and Piper dominantinervium, Piper magnibaccum and Piper kraense, Piper phuwuaense and Piper dominantinervium, Piper phuwuaense and Piper kraense, Piper pilobracteatum and Piper dominantinervium, Piper pilobracteatum and Piper kraense, Piper pilobracteatum and Piper phuwuaense and Piper sylvestre and Piper polysyphonum, that presented a genetic distance of 0.000 and were identified by independently using each of the other two regions. Concisely, these three barcode regions are powerful for further efficient identification of the 36 Piper species.

  10. Efficient DNA barcode regions for classifying Piper species (Piperaceae)

    Directory of Open Access Journals (Sweden)

    Arunrat Chaveerach

    2016-09-01

    Full Text Available Piper species are used for spices, in traditional and processed forms of medicines, in cosmetic compounds, in cultural activities and insecticides. Here barcode analysis was performed for identification of plant parts, young plants and modified forms of plants. Thirty-six Piper species were collected and the three barcode regions, matK, rbcL and psbA-trnH spacer, were amplified, sequenced and aligned to determine their genetic distances. For intraspecific genetic distances, the most effective values for the species identification ranged from no difference to very low distance values. However, P. betle had the highest values at 0.386 for the matK region. This finding may be due to P. betle being an economic and cultivated species, and thus is supported with growth factors, which may have affected its genetic distance. The interspecific genetic distances that were most effective for identification of different species were from the matK region and ranged from a low of 0.002 in 27 paired species to a high of 0.486. Eight species pairs, P. kraense and P. dominantinervium, P. magnibaccum and P. kraense, P. phuwuaense and P. dominantinervium, P. phuwuaense and P. kraense, P. pilobracteatum and P. dominantinervium, P. pilobracteatum and P. kraense, P. pilobracteatum and P. phuwuaense and P. sylvestre and P. polysyphonum, that presented a genetic distance of 0.000 and were identified by independently using each of the other two regions. Concisely, these three barcode regions are powerful for further efficient identification of the 36 Piper species.

  11. Efficient DNA barcode regions for classifying Piper species (Piperaceae)

    Science.gov (United States)

    Chaveerach, Arunrat; Tanee, Tawatchai; Sanubol, Arisa; Monkheang, Pansa; Sudmoon, Runglawan

    2016-01-01

    Abstract Piper species are used for spices, in traditional and processed forms of medicines, in cosmetic compounds, in cultural activities and insecticides. Here barcode analysis was performed for identification of plant parts, young plants and modified forms of plants. Thirty-six Piper species were collected and the three barcode regions, matK, rbcL and psbA-trnH spacer, were amplified, sequenced and aligned to determine their genetic distances. For intraspecific genetic distances, the most effective values for the species identification ranged from no difference to very low distance values. However, Piper betle had the highest values at 0.386 for the matK region. This finding may be due to Piper betle being an economic and cultivated species, and thus is supported with growth factors, which may have affected its genetic distance. The interspecific genetic distances that were most effective for identification of different species were from the matK region and ranged from a low of 0.002 in 27 paired species to a high of 0.486. Eight species pairs, Piper kraense and Piper dominantinervium, Piper magnibaccum and Piper kraense, Piper phuwuaense and Piper dominantinervium, Piper phuwuaense and Piper kraense, Piper pilobracteatum and Piper dominantinervium, Piper pilobracteatum and Piper kraense, Piper pilobracteatum and Piper phuwuaense and Piper sylvestre and Piper polysyphonum, that presented a genetic distance of 0.000 and were identified by independently using each of the other two regions. Concisely, these three barcode regions are powerful for further efficient identification of the 36 Piper species. PMID:27829794

  12. Pixel Classification of SAR ice images using ANFIS-PSO Classifier

    Directory of Open Access Journals (Sweden)

    G. Vasumathi

    2016-12-01

    Full Text Available Synthetic Aperture Radar (SAR) plays a vital role in acquiring extremely high-resolution radar images and is widely used to monitor ice-covered ocean regions. Sea monitoring is important for various purposes, including global climate systems and ship navigation. Classification of ice-infested areas yields important features that are useful for various monitoring processes around ice regions. The main objective of this paper is to classify SAR ice images in a way that helps identify the regions around ice-infested areas. Three stages are considered in the classification of SAR ice images. The first stage is preprocessing, in which the speckled SAR ice images are denoised using various speckle removal filters; these filters are compared to find the best one for speckle removal. The second stage is segmentation, in which different regions are segmented using the K-means and watershed segmentation algorithms; the two algorithms are compared to find the better one for segmenting SAR ice images. The last stage is pixel-based classification, which identifies and classifies the segmented regions using various supervised learning classifiers. The algorithms include Back Propagation Neural Networks (BPN), a Fuzzy Classifier, an Adaptive Neuro Fuzzy Inference System (ANFIS) classifier and the proposed ANFIS with Particle Swarm Optimization (PSO) classifier; all of these classifiers are compared to determine which is best suited to classifying SAR ice images. Various evaluation metrics are computed separately at each of the three stages.

  13. Expression and significance of echinoderm microtubule-associated protein-like 4 and anaplastic lymphoma kinase in lung adenocarcinoma

    Institute of Scientific and Technical Information of China (English)

    成克伦; 周永松; 陈秀婵; 熊云刚; 邹阳强; 姜森

    2015-01-01

    Objective To investigate the expression and clinical significance of the echinoderm microtubule-associated protein-like 4 and anaplastic lymphoma kinase (EML4-ALK) fusion gene in lung adenocarcinoma, based on basic and clinical research from April 2011 to April 2015. Methods The Wanfang Medicine Net and the Guizhou digital library database were searched to collect and analyze documents on EML4-ALK published in China over these five years. Results Twenty-one documents met the inclusion criteria, covering 2 550 cases. Overall, the detection rate of EML4-ALK in lung adenocarcinoma was 9.61%, with some differences between the results of different detection methods. Conclusion EML4-ALK shows a trend of low expression in lung adenocarcinoma and is associated with its occurrence and development. It can be regarded as a new therapeutic target, but the detection methods need further study and optimization.

  14. Mapping of mutation-sensitive sites in proteinlike chains

    DEFF Research Database (Denmark)

    Skorobogatiy, M.; Tiana, Guido

    1998-01-01

    ... kinds of residues. If the interaction matrix is dominated by the hydrophobic effect (a Miyazawa-Jernigan-like matrix), this distribution is very simple: all the hot sites can be found at the positions with the maximum number of closest nearest neighbors (bulk). If random or nonlinear corrections are added to such an interaction matrix, the distribution pattern changes. The rising of collective effects allows the hot sites to be found in places with a smaller number of nearest neighbors (surface), while the general trend of the hot sites to fall into a bulk part of a conformation still holds.

  15. PHAST: Protein-like heteropolymer analysis by statistical thermodynamics

    Science.gov (United States)

    Frigori, Rafael B.

    2017-06-01

    PHAST is a software package written in standard Fortran, with MPI and CUDA extensions, able to efficiently perform parallel multicanonical Monte Carlo simulations of single or multiple heteropolymeric chains, as coarse-grained models for proteins. The output data can be analyzed directly within its microcanonical Statistical Thermodynamics module, which computes the entropy, caloric curve, specific heat and free energies. As a case study, we investigate the aggregation of heteropolymers bioinspired by Aβ25-33 fragments and their cross-seeding with IAPP20-29 isoforms. Excellent parallel scaling is observed, even under numerically difficult first-order-like phase transitions, which are properly described by the built-in fully reconfigurable force fields. The package is free and open source, which should encourage users to adapt it to specific purposes.

  16. The Entire Quantile Path of a Risk-Agnostic SVM Classifier

    CERN Document Server

    Yu, Jin; Zhang, Jian

    2012-01-01

    A quantile binary classifier uses the rule: classify x as +1 if P(Y = 1|X = x) >= t, and as -1 otherwise, for a fixed quantile parameter t ∈ [0, 1]. It has been shown that Support Vector Machines (SVMs) in the limit are quantile classifiers with t = 1/2. In this paper, we show that by using an asymmetric cost of misclassification, SVMs can be appropriately extended to recover, in the limit, the quantile binary classifier for any t. We then present a principled algorithm to solve the extended SVM classifier for all values of t simultaneously. This has two implications: First, one can recover the entire conditional distribution P(Y = 1|X = x) = t for t ∈ [0, 1]. Second, we can build a risk-agnostic SVM classifier where the cost of misclassification need not be known a priori. Preliminary numerical experiments show the effectiveness of the proposed algorithm.
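
    The link between the quantile rule and asymmetric misclassification costs can be illustrated with a small sketch. It is not the paper's SVM algorithm: it assumes the posterior probability is already known, and the function names and the cost pair (t, 1 - t) are chosen here for illustration. Predicting +1 costs (1 - p) * cost_fp in expectation and predicting -1 costs p * cost_fn, so the cheaper label is +1 exactly when p >= cost_fp / (cost_fp + cost_fn).

```python
def quantile_classify(p_pos, t):
    """Quantile rule from the abstract: +1 if P(Y=1|X=x) >= t, else -1."""
    return 1 if p_pos >= t else -1


def min_cost_classify(p_pos, cost_fp, cost_fn):
    """Pick the label with the smaller expected misclassification cost."""
    expected_cost_pos = (1.0 - p_pos) * cost_fp  # loss if we predict +1 but Y = -1
    expected_cost_neg = p_pos * cost_fn          # loss if we predict -1 but Y = +1
    return 1 if expected_cost_pos <= expected_cost_neg else -1


def costs_for_quantile(t):
    """One (illustrative) cost pair whose implied threshold equals t."""
    return t, 1.0 - t  # cost_fp / (cost_fp + cost_fn) == t


if __name__ == "__main__":
    for t in (0.2, 0.5, 0.8):
        cost_fp, cost_fn = costs_for_quantile(t)
        for p in (0.1, 0.3, 0.6, 0.9):
            print(t, p, quantile_classify(p, t), min_cost_classify(p, cost_fp, cost_fn))
```

    With equal costs the implied threshold is 1/2, which matches the symmetric case the abstract attributes to standard SVMs in the limit.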

  17. Classifier-ensemble incremental-learning procedure for nuclear transient identification at different operational conditions

    Energy Technology Data Exchange (ETDEWEB)

    Baraldi, Piero, E-mail: piero.baraldi@polimi.i [Dipartimento di Energia - Sezione Ingegneria Nucleare, Politecnico di Milano, via Ponzio 34/3, 20133 Milano (Italy); Razavi-Far, Roozbeh [Dipartimento di Energia - Sezione Ingegneria Nucleare, Politecnico di Milano, via Ponzio 34/3, 20133 Milano (Italy); Zio, Enrico [Dipartimento di Energia - Sezione Ingegneria Nucleare, Politecnico di Milano, via Ponzio 34/3, 20133 Milano (Italy); Ecole Centrale Paris-Supelec, Paris (France)

    2011-04-15

    An important requirement for the practical implementation of empirical diagnostic systems is the capability of classifying transients in all plant operational conditions. The present paper proposes an approach based on an ensemble of classifiers for incrementally learning transients under different operational conditions. New classifiers are added to the ensemble when transients occurring in new operational conditions are not satisfactorily classified. The ensemble is built by bagging; the base classifier is a supervised Fuzzy C Means (FCM) classifier, and the base classifiers' outcomes are combined by majority voting. The incremental learning procedure is applied to the identification of simulated transients in the feedwater system of a Boiling Water Reactor (BWR) under different reactor power levels.
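
    The bagging, majority-voting and incremental steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a one-dimensional nearest-centroid learner stands in for the supervised FCM base classifier, and the accuracy threshold of 0.9 and all function names are hypothetical.

```python
import random
from collections import Counter


def train_centroid(samples):
    """Stand-in base learner (not FCM): one mean feature value per class."""
    sums, counts = {}, Counter()
    for x, label in samples:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] += 1
    return {label: sums[label] / counts[label] for label in sums}


def predict_centroid(model, x):
    """Return the label whose centroid is closest to x."""
    return min(model, key=lambda label: abs(model[label] - x))


def bagged_models(samples, n_models, rng):
    """Train each model on a bootstrap resample (drawn with replacement)."""
    return [train_centroid([rng.choice(samples) for _ in samples])
            for _ in range(n_models)]


def majority_vote(models, x):
    """Combine the base classifiers' outputs by majority voting."""
    votes = Counter(predict_centroid(m, x) for m in models)
    return votes.most_common(1)[0][0]


def accuracy(models, samples):
    """Fraction of samples the ensemble labels correctly."""
    return sum(majority_vote(models, x) == y for x, y in samples) / len(samples)


def learn_incrementally(models, new_samples, n_new, rng, threshold=0.9):
    """Add classifiers only if the new condition is poorly classified."""
    if accuracy(models, new_samples) < threshold:
        models = models + bagged_models(new_samples, n_new, rng)
    return models


if __name__ == "__main__":
    rng = random.Random(0)
    low_power = [(0.0, "A"), (1.0, "A"), (9.0, "B"), (10.0, "B")]
    ensemble = bagged_models(low_power, 5, rng)
    high_power = [(20.0, "A"), (21.0, "A"), (29.0, "B"), (30.0, "B")]
    ensemble = learn_incrementally(ensemble, high_power, 5, rng)
    print(len(ensemble))
```

    Existing members are never retrained in this sketch; the ensemble only grows when a new operating condition is classified below the threshold.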

  18. Combining classifiers generated by multi-gene genetic programming for protein fold recognition using genetic algorithm.

    Science.gov (United States)

    Bardsiri, Mahshid Khatibi; Eftekhari, Mahdi; Mousavi, Reza

    2015-01-01

    In this study the problem of protein fold recognition, which is a classification task, is solved via a hybrid of two evolutionary algorithms, namely multi-gene Genetic Programming (GP) and the Genetic Algorithm (GA). Our proposed method consists of two main stages and is performed on three datasets taken from the literature. Each dataset contains different feature groups and classes. In the first stage, multi-gene GP is used to produce binary classifiers based on various feature groups for each class. Then, the different classifiers obtained for each class are combined via weighted voting, with the weights determined through GA. At the end of the first stage, there is a separate binary classifier for each class. In the second stage, the obtained binary classifiers are combined via GA weighting in order to generate the overall classifier. The final classifier is superior to previous works found in the literature in terms of classification accuracy.

  19. Ensemble regularized linear discriminant analysis classifier for P300-based brain-computer interface.

    Science.gov (United States)

    Onishi, Akinari; Natsume, Kiyohisa

    2013-01-01

    This paper demonstrates the better classification performance of an ensemble classifier using a regularized linear discriminant analysis (LDA) for a P300-based brain-computer interface (BCI). The ensemble classifier with an LDA is sensitive to a lack of training data because covariance matrices are estimated imprecisely. One solution to the lack of training data is to employ a regularized LDA. Thus we employed the regularized LDA for the ensemble classifier of the P300-based BCI. Principal component analysis (PCA) was used for dimension reduction. As a result, an ensemble regularized LDA classifier showed significantly better classification performance than an ensemble un-regularized LDA classifier. Therefore the proposed ensemble regularized LDA classifier is robust against the lack of training data.
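
    Why regularization helps when training data are scarce can be shown with a small sketch. The abstract does not say which regularizer the authors used; the example below assumes a common choice, shrinking the sample covariance toward a scaled identity matrix, and the function names and the value of the shrinkage parameter are illustrative only.

```python
def sample_covariance(rows):
    """Unbiased sample covariance matrix of a list of feature vectors."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(d)] for i in range(d)]


def shrink_covariance(cov, lam):
    """Blend cov with (average variance) * identity; lam in [0, 1] (assumed form)."""
    d = len(cov)
    mean_var = sum(cov[i][i] for i in range(d)) / d
    return [[(1.0 - lam) * cov[i][j] + (lam * mean_var if i == j else 0.0)
             for j in range(d)] for i in range(d)]


def det2(m):
    """Determinant of a 2x2 matrix, used here only to check invertibility."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


if __name__ == "__main__":
    # Three perfectly correlated samples: the plain estimate is singular,
    # so the matrix inverse that LDA needs does not exist.
    rows = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
    cov = sample_covariance(rows)
    print(det2(cov))                          # zero determinant
    print(det2(shrink_covariance(cov, 0.5)))  # positive after shrinkage
```

    Setting lam to 0 recovers the un-regularized estimate, while lam equal to 1 ignores all correlations between features.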

  20. Statistical and Machine-Learning Classifier Framework to Improve Pulse Shape Discrimination System Design

    Energy Technology Data Exchange (ETDEWEB)

    Wurtz, R. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Kaplan, A. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2015-10-28

    Pulse shape discrimination (PSD) is a variety of statistical classifier. Fully-realized statistical classifiers rely on a comprehensive set of tools for designing, building, and implementing. PSD advances rely on improvements to the implemented algorithm. PSD advances can be improved by using conventional statistical classifier or machine learning methods. This paper provides the reader with a glossary of classifier-building elements and their functions in a fully-designed and operational classifier framework that can be used to discover opportunities for improving PSD classifier projects. This paper recommends reporting the PSD classifier’s receiver operating characteristic (ROC) curve and its behavior at a gamma rejection rate (GRR) relevant for realistic applications.
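
    The recommended ROC reporting can be sketched in a few lines. The example rests on assumptions that are not stated in the abstract: the class to be kept (for example, neutron pulses) is labeled 1, gamma pulses are labeled 0, and the gamma rejection rate is read as one minus the false-positive rate. The scores and function names are made up for illustration.

```python
def roc_points(scores, labels):
    """Return (false_positive_rate, true_positive_rate) pairs, one per threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing accepted
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def tpr_at_rejection(points, rejection_rate):
    """Best true-positive rate among points with FPR <= 1 - rejection_rate."""
    max_fpr = 1.0 - rejection_rate
    return max(tpr for fpr, tpr in points if fpr <= max_fpr)


if __name__ == "__main__":
    scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]  # hypothetical PSD scores
    labels = [1, 1, 0, 1, 0, 0]              # 1 = kept class, 0 = gamma (assumed)
    curve = roc_points(scores, labels)
    print(curve)
    print(tpr_at_rejection(curve, 1.0))  # efficiency when every gamma is rejected
```

    Reporting the whole curve together with one operating point makes it possible to compare classifiers that were tuned to different thresholds.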