WorldWideScience

Sample records for cancer classification based

  1. NIM: A Node Influence Based Method for Cancer Classification

    Directory of Open Access Journals (Sweden)

    Yiwen Wang

    2014-01-01

    Full Text Available The classification of different cancer types is of great significance in the medical field. However, the great majority of existing cancer classification methods are clinical-based and have relatively weak diagnostic ability. With the rapid development of gene expression technology, it has become possible to classify different kinds of cancers using DNA microarrays. Our main idea is to approach the problem of cancer classification using gene expression data from a graph-based view. Based on a new node influence model that we propose, this paper presents a novel high-accuracy method for cancer classification, which is composed of four parts: the first is to calculate the similarity matrix of all samples, the second is to compute the node influence of training samples, the third is to obtain the similarity between every test sample and each class using a weighted sum of node influence and similarity matrix, and the last is to classify each test sample based on its similarity to every class. The data sets used in our experiments are breast cancer, central nervous system, colon tumor, prostate cancer, acute lymphoblastic leukemia, and lung cancer. Experimental results showed that our node influence based method (NIM) is more efficient and robust than the support vector machine, K-nearest neighbor, C4.5, naive Bayes, and CART.
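
    The four steps listed in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not define the node influence model, so the sketch substitutes a placeholder (mean cosine similarity to same-class training samples). The function names, the cosine similarity measure, and the toy data are all hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two expression vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def similarity_matrix(samples):
    """Step 1: pairwise similarity of all samples (training and test)."""
    n = len(samples)
    return [[cosine(samples[i], samples[j]) for j in range(n)] for i in range(n)]


def node_influence(sim, train_idx, labels):
    """Step 2: influence of each training sample.

    ASSUMPTION: placeholder influence = mean similarity to the other
    training samples of the same class. The paper's model may differ.
    """
    influence = {}
    for i in train_idx:
        peers = [j for j in train_idx if j != i and labels[j] == labels[i]]
        influence[i] = sum(sim[i][j] for j in peers) / len(peers) if peers else 0.0
    return influence


def classify(samples, labels, train_idx, test_idx):
    """Steps 3 and 4: influence-weighted class similarity, then argmax."""
    sim = similarity_matrix(samples)
    infl = node_influence(sim, train_idx, labels)
    predictions = {}
    for t in test_idx:
        scores = {}
        for i in train_idx:
            scores[labels[i]] = scores.get(labels[i], 0.0) + infl[i] * sim[t][i]
        predictions[t] = max(scores, key=scores.get)
    return predictions


# Toy data: four labelled training samples and two unlabelled test samples.
samples = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9], [1.0, 0.0], [0.0, 1.0]]
labels = ["A", "A", "B", "B", None, None]
preds = classify(samples, labels, train_idx=[0, 1, 2, 3], test_idx=[4, 5])
print(preds)
```

    Because the class score is a plain sum over training samples, a larger class accumulates more terms; a real implementation would probably need some normalisation for unbalanced classes.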

  2. Pathological Bases for a Robust Application of Cancer Molecular Classification

    Directory of Open Access Journals (Sweden)

    Salvador J. Diaz-Cano

    2015-04-01

    Full Text Available Any robust classification system depends on its purpose and must refer to accepted standards, its strength relying on predictive values and a careful consideration of known factors that can affect its reliability. In this context, a molecular classification of human cancer must refer to the current gold standard (histological classification) and try to improve it with key prognosticators for metastatic potential, staging and grading. Although organ-specific examples have been published based on proteomics, transcriptomics and genomics evaluations, the most popular approach uses gene expression analysis as a direct correlate of cellular differentiation, which represents the key feature of the histological classification. RNA is a labile molecule that varies significantly according to the preservation protocol, its transcription reflects the adaptation of the tumor cells to the microenvironment, it can be passed through mechanisms of intercellular transference of genetic information (exosomes), and it is exposed to epigenetic modifications. More robust classifications should be based on stable molecules, at the genetic level represented by DNA to improve reliability, and its analysis must deal with the concept of intratumoral heterogeneity, which is at the origin of tumor progression and is the byproduct of the selection process during the clonal expansion and progression of neoplasms. The simultaneous analysis of multiple DNA targets and next generation sequencing offer the best practical approach for an analytical genomic classification of tumors.

  3. Cancer Classification Based on Support Vector Machine Optimized by Particle Swarm Optimization and Artificial Bee Colony.

    Science.gov (United States)

    Gao, Lingyun; Ye, Mingquan; Wu, Changrong

    2017-11-29

    Intelligent optimization algorithms have advantages in dealing with complex nonlinear problems accompanied by good flexibility and adaptability. In this paper, the FCBF (Fast Correlation-Based Feature selection) method is used to filter irrelevant and redundant features in order to improve the quality of cancer classification. Then, we perform classification based on SVM (Support Vector Machine) optimized by PSO (Particle Swarm Optimization) combined with ABC (Artificial Bee Colony) approaches, which is represented as PA-SVM. The proposed PA-SVM method is applied to nine cancer datasets, including five datasets of outcome prediction and a protein dataset of ovarian cancer. By comparison with other classification methods, the results demonstrate the effectiveness and the robustness of the proposed PA-SVM method in handling various types of data for cancer classification.

  4. Classification of cancerous cells based on the one-class problem approach

    Science.gov (United States)

    Murshed, Nabeel A.; Bortolozzi, Flavio; Sabourin, Robert

    1996-03-01

    One of the most important factors in reducing the effect of cancerous diseases is early diagnosis, which requires a good and robust method. With the advancement of computer technologies and digital image processing, the development of a computer-based system has become feasible. In this paper, we introduce a new approach for the detection of cancerous cells. This approach is based on the one-class problem approach, through which the classification system need only be trained with patterns of cancerous cells. This reduces the burden of the training task by about 50%. Based on this approach, a computer-based classification system is developed, based on the Fuzzy ARTMAP neural networks. Experiments were performed using a set of 542 patterns taken from a sample of breast cancer. Results of the experiments show 98% correct identification of cancerous cells and 95% correct identification of non-cancerous cells.

  5. Cross-platform analysis of cancer microarray data improves gene expression based classification of phenotypes

    Directory of Open Access Journals (Sweden)

    Eils Roland

    2005-11-01

    Full Text Available Abstract Background The extensive use of DNA microarray technology in the characterization of the cell transcriptome is leading to an ever increasing amount of microarray data from cancer studies. Although similar questions for the same type of cancer are addressed in these different studies, a comparative analysis of their results is hampered by the use of heterogeneous microarray platforms and analysis methods. Results In contrast to a meta-analysis approach where results of different studies are combined on an interpretative level, we investigate here how to directly integrate raw microarray data from different studies for the purpose of supervised classification analysis. We use median rank scores and quantile discretization to derive numerically comparable measures of gene expression from different platforms. These transformed data are then used for training of classifiers based on support vector machines. We apply this approach to six publicly available cancer microarray gene expression data sets, which consist of three pairs of studies, each examining the same type of cancer, i.e. breast cancer, prostate cancer or acute myeloid leukemia. For each pair, one study was performed by means of cDNA microarrays and the other by means of oligonucleotide microarrays. In each pair, high classification accuracies (> 85%) were achieved with training and testing on data instances randomly chosen from both data sets in a cross-validation analysis. To exemplify the potential of this cross-platform classification analysis, we use two leukemia microarray data sets to show that important genes with regard to the biology of leukemia are selected in an integrated analysis, which are missed in either single-set analysis. Conclusion Cross-platform classification of multiple cancer microarray data sets yields discriminative gene expression signatures that are found and validated on a large number of microarray samples, generated by different laboratories and
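
    The idea of making expression values from different platforms numerically comparable can be illustrated with a simplified rank-based quantile discretization. This is a sketch under assumptions, not the study's procedure: the abstract does not spell out how median rank scores are computed, so only a per-sample rank transform followed by equal-frequency binning is shown, and the function names, the bin count, and the toy values are hypothetical.

```python
def rank_scores(values):
    """Return the 0-based rank of each value (ties broken by position)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, i in enumerate(order):
        ranks[i] = rank
    return ranks


def quantile_discretize(values, n_bins=4):
    """Map each value to an equal-frequency bin index in 0..n_bins-1.

    ASSUMPTION: bins are defined per sample from the ranks of its genes,
    so the result depends only on the ordering, not on the platform scale.
    """
    ranks = rank_scores(values)
    n = len(values)
    return [min(r * n_bins // n, n_bins - 1) for r in ranks]


# One hypothetical sample measured for the same eight genes on two platforms:
# cDNA arrays report log ratios, oligonucleotide arrays report intensities.
cdna_log_ratios = [-1.2, 0.3, 2.1, -0.4, 0.9, 1.5, -2.0, 0.0]
oligo_intensities = [150, 820, 5400, 300, 1300, 2900, 90, 610]

cdna_bins = quantile_discretize(cdna_log_ratios)
oligo_bins = quantile_discretize(oligo_intensities)
print(cdna_bins)
print(oligo_bins)
```

    Both toy samples order the genes identically, so they receive the same bin vector even though the raw scales differ by orders of magnitude; discretized vectors of this kind could then be pooled to train one classifier.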

  6. FEATURE EXTRACTION BASED WAVELET TRANSFORM IN BREAST CANCER DIAGNOSIS USING FUZZY AND NON-FUZZY CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    Pelin GORGEL

    2013-01-01

    Full Text Available This study helps to provide a second eye to the expert radiologists for the classification of manually extracted breast masses taken from 60 digital mammograms. These mammograms have been acquired from Istanbul University Faculty of Medicine Hospital and have 78 masses. The diagnosis is implemented with pre-processing by using feature extraction based Fast Wavelet Transform (FWT). Afterwards Adaptive Neuro-Fuzzy Inference System (ANFIS) based fuzzy subtractive clustering and Support Vector Machines (SVM) methods are used for the classification. It is a comparative study which uses these methods respectively. According to the results of the study, ANFIS based subtractive clustering produces ??% while SVM produces ??% accuracy in malignant-benign classification. The results demonstrate that the developed system could help the radiologists for a true diagnosis and decrease the number of the missing cancerous regions or unnecessary biopsies.

  7. Classification of human cancers based on DNA copy number amplification modeling

    Directory of Open Access Journals (Sweden)

    Knuutila Sakari

    2008-05-01

    Full Text Available Abstract Background DNA amplifications alter gene dosage in cancer genomes by multiplying the gene copy number. Amplifications are quintessential in a considerable number of advanced cancers of various anatomical locations. The aims of this study were to classify human cancers based on their amplification patterns, explore the biological and clinical fundamentals behind their amplification-pattern based classification, and understand the characteristics in human genomic architecture that associate with amplification mechanisms. Methods We applied a machine learning approach to model DNA copy number amplifications using a data set of binary amplification records at chromosome sub-band resolution from 4400 cases that represent 82 cancer types. Amplification data was fused with background data: clinical, histological and biological classifications, and cytogenetic annotations. Statistical hypothesis testing was used to mine associations between the data sets. Results Probabilistic clustering of each chromosome identified 111 amplification models and divided the cancer cases into clusters. The distribution of classification terms in the amplification-model based clustering of cancer cases revealed cancer classes that were associated with specific DNA copy number amplification models. Amplification patterns – finite or bounded descriptions of the ranges of the amplifications in the chromosome – were extracted from the clustered data and expressed according to the original cytogenetic nomenclature. This was achieved by maximal frequent itemset mining using the cluster-specific data sets. The boundaries of amplification patterns were shown to be enriched with fragile sites, telomeres, centromeres, and light chromosome bands. Conclusions Our results demonstrate that amplifications are non-random chromosomal changes and specifically selected in tumor tissue microenvironment. Furthermore, statistical evidence showed that specific chromosomal features

  8. AN ADABOOST OPTIMIZED CCFIS BASED CLASSIFICATION MODEL FOR BREAST CANCER DETECTION

    Directory of Open Access Journals (Sweden)

    CHANDRASEKAR RAVI

    2017-06-01

    Full Text Available Classification is a Data Mining technique used for building a prototype of the data behaviour, with which unseen data can be classified into one of the defined classes. Several researchers have proposed classification techniques, but most of them did not place much emphasis on the misclassified instances and storage space. In this paper, a classification model is proposed that takes into account the misclassified instances and storage space. The classification model is efficiently developed using a tree structure for reducing the storage complexity and uses a single scan of the dataset. During the training phase, Class-based Closed Frequent ItemSets (CCFIS) were mined from the training dataset in the form of a tree structure. The classification model has been developed using the CCFIS and a similarity measure based on Longest Common Subsequence (LCS). Further, the Particle Swarm Optimization algorithm is applied on the generated CCFIS, which assigns weights to the itemsets and their associated classes. Most classifiers correctly classify the common instances but misclassify the rare instances. In view of that, the AdaBoost algorithm has been used to boost the weights of the instances misclassified in the previous round so as to include them in the training phase to classify the rare instances. This improves the accuracy of the classification model. During the testing phase, the classification model is used to classify the instances of the test dataset. The Breast Cancer dataset from the UCI repository is used for the experiment. Experimental analysis shows that the accuracy of the proposed classification model outperforms the PSOAdaBoost-Sequence classifier by 7% and is superior to other approaches such as Naïve Bayes Classifier, Support Vector Machine Classifier, Instance Based Classifier, ID3 Classifier, J48 Classifier, etc.
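
    A similarity measure based on the Longest Common Subsequence, as mentioned in this abstract, might look like the sketch below. The LCS length is computed with the standard dynamic programme; the normalisation by the longer sequence and the attribute=value item names are assumptions made for illustration, since the abstract does not give the exact definition.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(a, b):
    """LCS length scaled to [0, 1].

    ASSUMPTION: normalised by the length of the longer sequence.
    """
    longest = max(len(a), len(b))
    return lcs_length(a, b) / longest if longest else 1.0


# Hypothetical mined itemset and test instance, as ordered attribute=value items.
itemset = ["clump=5", "size=1", "shape=1"]
instance = ["clump=5", "size=1", "adhesion=2", "shape=1"]
score = lcs_similarity(itemset, instance)
print(score)
```

    In a classifier of this kind, a test instance would presumably be assigned the class of the (weighted) itemsets it is most similar to.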

  9. A protein and mRNA expression-based classification of gastric cancer.

    Science.gov (United States)

    Setia, Namrata; Agoston, Agoston T; Han, Hye S; Mullen, John T; Duda, Dan G; Clark, Jeffrey W; Deshpande, Vikram; Mino-Kenudson, Mari; Srivastava, Amitabh; Lennerz, Jochen K; Hong, Theodore S; Kwak, Eunice L; Lauwers, Gregory Y

    2016-07-01

    The overall survival of gastric carcinoma patients remains poor despite improved control over known risk factors and surveillance. This highlights the need for new classifications, driven towards identification of potential therapeutic targets. Using sophisticated molecular technologies and analysis, three groups recently provided genetic and epigenetic molecular classifications of gastric cancer (The Cancer Genome Atlas, 'Singapore-Duke' study, and Asian Cancer Research Group). Suggested by these classifications, here, we examined the expression of 14 biomarkers in a cohort of 146 gastric adenocarcinomas and performed unsupervised hierarchical clustering analysis using less expensive and widely available immunohistochemistry and in situ hybridization. Ultimately, we identified five groups of gastric cancers based on Epstein-Barr virus (EBV) positivity, microsatellite instability, aberrant E-cadherin, and p53 expression; the remaining cases constituted a group characterized by normal p53 expression. In addition, the five categories correspond to the reported molecular subgroups by virtue of clinicopathologic features. Furthermore, evaluation between these clusters and survival using the Cox proportional hazards model showed a trend for superior survival in the EBV and microsatellite-instable related adenocarcinomas. In conclusion, we offer as a proposal a simplified algorithm that is able to reproduce the recently proposed molecular subgroups of gastric adenocarcinoma, using immunohistochemical and in situ hybridization techniques.

  10. Training ANFIS structure using genetic algorithm for liver cancer classification based on microarray gene expression data

    Directory of Open Access Journals (Sweden)

    Bülent Haznedar

    2017-02-01

    Full Text Available Classification is an important data mining technique used in many fields, such as medicine, genetics and biomedical engineering. The number of studies on the classification of DNA microarray gene expression data has increased notably in recent years. However, because of the large number of genes in microarray gene expression data and the mostly nonlinear relations among those data, the success of conventional classification algorithms can be limited. For these reasons, interest in classification methods based on artificial intelligence has gradually increased in recent times. In this study, a hybrid approach based on Adaptive Neuro-Fuzzy Inference System (ANFIS) and Genetic Algorithm (GA) is suggested in order to classify a liver cancer microarray data set. Simulation results are compared with the results of other methods. According to the results obtained, the recommended method performs better than the other methods.

  11. Identifying colon cancer risk modules with better classification performance based on human signaling network.

    Science.gov (United States)

    Qu, Xiaoli; Xie, Ruiqiang; Chen, Lina; Feng, Chenchen; Zhou, Yanyan; Li, Wan; Huang, Hao; Jia, Xu; Lv, Junjie; He, Yuehan; Du, Youwen; Li, Weiguo; Shi, Yuchen; He, Weiming

    2014-10-01

    Identifying differences between normal and tumor samples from a modular perspective may help to improve our understanding of the mechanisms responsible for colon cancer. Many cancer studies have shown that signaling transduction and biological pathways are disturbed in disease states, and expression profiles can distinguish variations in diseases. In this study, we integrated a weighted human signaling network and gene expression profiles to select risk modules associated with tumor conditions. Risk modules as classification features by our method had a better classification performance than other methods, and one risk module for colon cancer had a good classification performance for distinguishing between normal/tumor samples and between tumor stages. All genes in the module were annotated to the biological process of positive regulation of cell proliferation, and were highly associated with colon cancer. These results suggested that these genes might be the potential risk genes for colon cancer. Copyright © 2013. Published by Elsevier Inc.

  12. Actionable gene-based classification toward precision medicine in gastric cancer

    Directory of Open Access Journals (Sweden)

    Hiroshi Ichikawa

    2017-10-01

    Full Text Available Abstract Background Intertumoral heterogeneity represents a significant hurdle to identifying optimized targeted therapies in gastric cancer (GC). To realize precision medicine for GC patients, an actionable gene alteration-based molecular classification that directly associates GCs with targeted therapies is needed. Methods A total of 207 Japanese patients with GC were included in this study. Formalin-fixed, paraffin-embedded (FFPE) tumor tissues were obtained from surgical or biopsy specimens and were subjected to DNA extraction. We generated comprehensive genomic profiling data using a 435-gene panel including 69 actionable genes paired with US Food and Drug Administration-approved targeted therapies, and the evaluation of Epstein-Barr virus (EBV) infection and microsatellite instability (MSI) status. Results Comprehensive genomic sequencing detected at least one alteration of 435 cancer-related genes in 194 GCs (93.7%) and of 69 actionable genes in 141 GCs (68.1%). We classified the 207 GCs into four The Cancer Genome Atlas (TCGA) subtypes using the genomic profiling data: EBV (N = 9), MSI (N = 17), chromosomal instability (N = 119), and genomically stable subtype (N = 62). Actionable gene alterations were not specific and were widely observed throughout all TCGA subtypes. To discover a novel classification which more precisely selects candidates for targeted therapies, 207 GCs were classified using hypermutated phenotype and the mutation profile of 69 actionable genes. We identified a hypermutated group (N = 32), while the others (N = 175) were sub-divided into six clusters including five with actionable gene alterations: ERBB2 (N = 25), CDKN2A and CDKN2B (N = 10), KRAS (N = 10), BRCA2 (N = 9), and ATM cluster (N = 12). The clinical utility of this classification was demonstrated by a case of unresectable GC with a remarkable response to anti-HER2 therapy in the ERBB2 cluster. Conclusions This actionable gene-based

  13. An Entropy-based gene selection method for cancer classification using microarray data

    Directory of Open Access Journals (Sweden)

    Krishnan Arun

    2005-03-01

    Full Text Available Abstract Background Accurate diagnosis of cancer subtypes remains a challenging problem. Building classifiers based on gene expression data is a promising approach; yet the selection of non-redundant but relevant genes is difficult. The selected gene set should be small enough to allow diagnosis even in regular clinical laboratories and ideally identify genes involved in cancer-specific regulatory pathways. Here an entropy-based method is proposed that selects genes related to the different cancer classes while at the same time reducing the redundancy among the genes. Results The present study identifies a subset of features by maximizing the relevance and minimizing the redundancy of the selected genes. A merit called normalized mutual information is employed to measure the relevance and the redundancy of the genes. In order to find a more representative subset of features, an iterative procedure is adopted that incorporates an initial clustering followed by data partitioning and the application of the algorithm to each of the partitions. A leave-one-out approach then selects the most commonly selected genes across all the different runs and the gene selection algorithm is applied again to pare down the list of selected genes until a minimal subset is obtained that gives a satisfactory accuracy of classification. The algorithm was applied to three different data sets and the results obtained were compared to work done by others using the same data sets. Conclusion This study presents an entropy-based iterative algorithm for selecting genes from microarray data that are able to classify various cancer sub-types with high accuracy. In addition, the feature set obtained is very compact, that is, the redundancy between genes is reduced to a large extent. This implies that classifiers can be built with a smaller subset of genes.
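
    The relevance and redundancy trade-off described in this abstract can be illustrated with a small greedy selection over discretized expression values. The sketch makes several assumptions that the abstract does not confirm: mutual information is normalised as 2I(X;Y)/(H(X)+H(Y)), the merit of a candidate gene is its relevance minus its mean redundancy with the genes already chosen, and the clustering, partitioning and leave-one-out stages are omitted. All names and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(xs):
    """Shannon entropy (in bits) of a discrete sequence."""
    n = len(xs)
    return -sum((c / n) * math.log2(c / n) for c in Counter(xs).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for discrete sequences."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def normalized_mi(xs, ys):
    """ASSUMPTION: normalisation 2*I(X;Y) / (H(X) + H(Y)), in [0, 1]."""
    denom = entropy(xs) + entropy(ys)
    return 2 * mutual_information(xs, ys) / denom if denom else 0.0


def select_genes(genes, labels, k):
    """Greedily pick k genes with high relevance and low redundancy."""
    selected = []
    candidates = list(genes)
    while candidates and len(selected) < k:

        def merit(name):
            relevance = normalized_mi(genes[name], labels)
            if not selected:
                return relevance
            redundancy = sum(
                normalized_mi(genes[name], genes[s]) for s in selected
            ) / len(selected)
            return relevance - redundancy

        best = max(candidates, key=merit)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data: four cancer classes, three discretized genes.
labels = [0, 0, 1, 1, 2, 2, 3, 3]
genes = {
    "g1": [0, 0, 0, 0, 1, 1, 1, 1],  # separates classes {0, 1} from {2, 3}
    "g2": [0, 0, 0, 0, 1, 1, 1, 1],  # exact copy of g1, fully redundant
    "g3": [0, 0, 1, 1, 0, 0, 1, 1],  # complementary information
}
chosen = select_genes(genes, labels, k=2)
print(chosen)
```

    In the toy data all three genes are equally relevant on their own, but the copy g2 is penalised for redundancy with g1, so the complementary gene g3 is picked second.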

  14. Multiclass classification for skin cancer profiling based on the integration of heterogeneous gene expression series.

    Science.gov (United States)

    Gálvez, Juan Manuel; Castillo, Daniel; Herrera, Luis Javier; San Román, Belén; Valenzuela, Olga; Ortuño, Francisco Manuel; Rojas, Ignacio

    2018-01-01

    Most of the research studies developed applying microarray technology to the characterization of different pathological states of any disease may fail in reaching statistically significant results. This is largely due to the small repertoire of analysed samples, and to the limitation in the number of states or pathologies usually addressed. Moreover, the influence of potential deviations on the gene expression quantification is usually disregarded. In spite of the continuous changes in omic sciences, reflected for instance in the emergence of new Next-Generation Sequencing-related technologies, the existing availability of a vast amount of gene expression microarray datasets should be properly exploited. Therefore, this work proposes a novel methodological approach involving the integration of several heterogeneous skin cancer series, and a later multiclass classifier design. This approach is thus a way to provide the clinicians with an intelligent diagnosis support tool based on the use of a robust set of selected biomarkers, which simultaneously distinguishes among different cancer-related skin states. To achieve this, a multi-platform combination of microarray datasets from Affymetrix and Illumina manufacturers was carried out. This integration is expected to strengthen the statistical robustness of the study as well as the finding of highly-reliable skin cancer biomarkers. Specifically, the designed operation pipeline has allowed the identification of a small subset of 17 differentially expressed genes (DEGs) from which to distinguish among 7 involved skin states. These genes were obtained from the assessment of a number of potential batch effects on the gene expression data. The biological interpretation of these genes was inspected in the specific literature to understand their underlying information in relation to skin cancer. 
    Finally, in order to assess their possible effectiveness in cancer diagnosis, a cross-validation Support Vector Machines (SVM)-based

  15. BCDForest: a boosting cascade deep forest model towards the classification of cancer subtypes based on gene expression data.

    Science.gov (United States)

    Guo, Yang; Liu, Shuhui; Li, Zhanhuai; Shang, Xuequn

    2018-04-11

    The classification of cancer subtypes is of great importance to cancer diagnosis and therapy. Many supervised learning approaches have been applied to cancer subtype classification in the past few years, especially deep learning based approaches. Recently, the deep forest model has been proposed as an alternative to deep neural networks to learn hyper-representations by using cascade ensemble decision trees. It has been shown that the deep forest model has competitive or even better performance than deep neural networks to some extent. However, the standard deep forest model may face overfitting and ensemble diversity challenges when dealing with small-sample-size, high-dimensional biology data. In this paper, we propose a deep learning model, called BCDForest, to address cancer subtype classification on small-scale biology datasets, which can be viewed as a modification of the standard deep forest model. BCDForest differs from the standard deep forest model in two main contributions: First, a multi-class-grained scanning method is proposed to train multiple binary classifiers to encourage ensemble diversity. Meanwhile, the fitting quality of each classifier is considered in representation learning. Second, we propose a boosting strategy to emphasize more important features in cascade forests, thereby propagating the benefits of discriminative features among cascade layers to improve the classification performance. Systematic comparison experiments on both microarray and RNA-Seq gene expression datasets demonstrate that our method consistently outperforms the state-of-the-art methods in cancer subtype classification. The multi-class-grained scanning and boosting strategy in our model provide an effective solution to ease the overfitting challenge and improve the robustness of the deep forest model on small-scale data. 
    Our model provides a useful approach to the classification of cancer subtypes

  16. Proteomic classification of breast cancer.

    LENUS (Irish Health Repository)

    Kamel, Dalia

    2012-11-01

    Being a significant health problem that affects patients in various age groups, breast cancer has been extensively studied to date. Recently, molecular breast cancer classification has advanced significantly with the availability of genomic profiling technologies. Proteomic technologies have also advanced from traditional protein assays including enzyme-linked immunosorbent assay, immunoblotting and immunohistochemistry to more comprehensive approaches including mass spectrometry and reverse phase protein lysate arrays (RPPA). The purpose of this manuscript is to review the current protein markers that influence breast cancer prediction and prognosis and to focus on novel advances in proteomic classification of breast cancer.

  17. Quantum Cascade Laser-Based Infrared Microscopy for Label-Free and Automated Cancer Classification in Tissue Sections.

    Science.gov (United States)

    Kuepper, Claus; Kallenbach-Thieltges, Angela; Juette, Hendrik; Tannapfel, Andrea; Großerueschkamp, Frederik; Gerwert, Klaus

    2018-05-16

    A feasibility study using a quantum cascade laser-based infrared microscope for the rapid and label-free classification of colorectal cancer tissues is presented. Infrared imaging is a reliable, robust, automated, and operator-independent tissue classification method that has been used for differential classification of tissue thin sections identifying tumorous regions. However, the long acquisition times of the FT-IR-based microscopes used so far have hampered the clinical translation of this technique. Here, the quantum cascade laser-based microscope provides infrared images for precise tissue classification within a few minutes. We analyzed 110 patients with UICC-Stage II and III colorectal cancer, showing 96% sensitivity and 100% specificity of this label-free method as compared to histopathology, the gold standard in routine clinical diagnostics. The main hurdle for the clinical translation of IR imaging is now overcome by the short acquisition time for high quality diagnostic images, which is in the same time range as frozen sections by pathologists.

  18. Genetic Fuzzy System (GFS) based wavelet co-occurrence feature selection in mammogram classification for breast cancer diagnosis

    Directory of Open Access Journals (Sweden)

    Meenakshi M. Pawar

    2016-09-01

    Full Text Available Breast cancer is a significant health problem diagnosed mostly in women worldwide. Therefore, early detection of breast cancer is performed with the help of digital mammography, which can reduce the mortality rate. This paper presents a wrapper based feature selection approach for wavelet co-occurrence features (WCF) using a Genetic Fuzzy System (GFS) in the mammogram classification problem. The performance of the GFS algorithm is explained using the mini-MIAS database. WCF features are obtained from the detail wavelet coefficients at each level of decomposition of the mammogram image. At the first level of decomposition, 18 features are applied to the GFS algorithm, which selects 5 features with an average classification success rate of 39.64%. Subsequently, at the second level it selects 9 features from 36 features and the classification success rate is improved to 56.75%. For the third level, 16 features are selected from 54 features and the average success rate is improved to 64.98%. Lastly, at the fourth level 72 features are applied to GFS, which selects 16 features, thereby increasing the average success rate to 89.47%. Hence, the GFS algorithm is an effective way of obtaining an optimal set of features in breast cancer diagnosis.

  19. Interval-value Based Particle Swarm Optimization algorithm for cancer-type specific gene selection and sample classification

    Directory of Open Access Journals (Sweden)

    D. Ramyachitra

    2015-09-01

    Full Text Available Microarray technology allows simultaneous measurement of the expression levels of thousands of genes within a biological tissue sample. The fundamental power of microarrays lies within the ability to conduct parallel surveys of gene expression using microarray data. The classification of tissue samples based on gene expression data is an important problem in medical diagnosis of diseases such as cancer. In gene expression data, the number of genes is usually very high compared to the number of data samples. Thus the difficulty lies in the high dimensionality of the data and the small sample size. This research work addresses the problem by classifying the resultant dataset using existing algorithms such as Support Vector Machine (SVM), K-nearest neighbor (KNN), Interval Valued Classification (IVC) and the improvised Interval Value based Particle Swarm Optimization (IVPSO) algorithm. The results show that the IVPSO algorithm outperformed the other algorithms under several performance evaluation functions.

  20. Interval-value Based Particle Swarm Optimization algorithm for cancer-type specific gene selection and sample classification.

    Science.gov (United States)

    Ramyachitra, D; Sofia, M; Manikandan, P

    2015-09-01

    Microarray technology allows simultaneous measurement of the expression levels of thousands of genes within a biological tissue sample. The fundamental power of microarrays lies within the ability to conduct parallel surveys of gene expression using microarray data. The classification of tissue samples based on gene expression data is an important problem in medical diagnosis of diseases such as cancer. In gene expression data, the number of genes is usually very high compared to the number of data samples. Thus the difficulty lies in the high dimensionality of the data and the small sample size. This research work addresses the problem by classifying the resultant dataset using existing algorithms such as Support Vector Machine (SVM), K-nearest neighbor (KNN), Interval Valued Classification (IVC) and the improvised Interval Value based Particle Swarm Optimization (IVPSO) algorithm. The results show that the IVPSO algorithm outperformed the other algorithms under several performance evaluation functions.

  1. A comprehensive comparison of random forests and support vector machines for microarray-based cancer classification

    Directory of Open Access Journals (Sweden)

    Wang Lily

    2008-07-01

    Full Text Available Abstract Background: Cancer diagnosis and clinical outcome prediction are among the most important emerging applications of gene expression microarray technology, with several molecular signatures on their way toward clinical deployment. Use of the most accurate classification algorithms available for microarray gene expression data is a critical ingredient in order to develop the best possible molecular signatures for patient care. As suggested by a large body of literature to date, support vector machines can be considered "best of class" algorithms for classification of such data. Recent work, however, suggests that random forest classifiers may outperform support vector machines in this domain. Results: In the present paper we identify methodological biases of prior work comparing random forests and support vector machines and conduct a new rigorous evaluation of the two algorithms that corrects these limitations. Our experiments use 22 diagnostic and prognostic datasets and show that support vector machines outperform random forests, often by a large margin. Our data also underline the importance of sound research design in benchmarking and comparison of bioinformatics algorithms. Conclusion: We found that both on average and in the majority of microarray datasets, random forests are outperformed by support vector machines both in the settings when no gene selection is performed and when several popular gene selection methods are used.

  2. Lung cancer gene expression database analysis incorporating prior knowledge with support vector machine-based classification method

    Directory of Open Access Journals (Sweden)

    Huang Desheng

    2009-07-01

    Full Text Available Abstract Background: A reliable and precise classification is essential for successful diagnosis and treatment of cancer. Gene expression microarrays have provided the high-throughput platform to discover genomic biomarkers for cancer diagnosis and prognosis. Rational use of the available bioinformation can not only effectively remove or suppress noise in gene chips, but also avoid one-sided results of separate experiments. However, only some studies have been aware of the importance of prior information in cancer classification. Methods: Together with the application of support vector machine as the discriminant approach, we proposed one modified method that incorporated prior knowledge into cancer classification based on gene expression data to improve accuracy. A public well-known dataset, Malignant pleural mesothelioma and lung adenocarcinoma gene expression database, was used in this study. Prior knowledge is viewed here as a means of directing the classifier using known lung adenocarcinoma related genes. The procedures were performed by software R 2.80. Results: The modified method performed better after incorporating prior knowledge. Accuracy of the modified method improved from 98.86% to 100% in the training set and from 98.51% to 99.06% in the test set. The standard deviations of the modified method decreased from 0.26% to 0 in the training set and from 3.04% to 2.10% in the test set. Conclusion: The method that incorporates prior knowledge into discriminant analysis could effectively improve the capacity and reduce the impact of noise. This idea may have a good future not only in practice but also in methodology.

  3. Towards precise classification of cancers based on robust gene functional expression profiles

    Directory of Open Access Journals (Sweden)

    Zhu Jing

    2005-03-01

    Full Text Available Abstract Background: Development of robust and efficient methods for analyzing and interpreting high dimension gene expression profiles continues to be a focus in computational biology. The accumulated experiment evidence supports the assumption that genes express and perform their functions in modular fashions in cells. Therefore, there is an open space for development of the timely and relevant computational algorithms that use robust functional expression profiles towards precise classification of complex human diseases at the modular level. Results: Inspired by the insight that genes act as a module to carry out a highly integrated cellular function, we thus define a low dimension functional expression profile for data reduction. After annotating each individual gene to functional categories defined in a proper gene function classification system such as Gene Ontology applied in this study, we identify those functional categories enriched with differentially expressed genes. For each functional category or functional module, we compute a summary measure (s) for the raw expression values of the annotated genes to capture the overall activity level of the module. In this way, we can treat the gene expressions within a functional module as an integrative data point to replace the multiple values of individual genes. We compare the classification performance of decision trees based on functional expression profiles with the conventional gene expression profiles using four publicly available datasets, which indicates that precise classification of tumour types and improved interpretation can be achieved with the reduced functional expression profiles. Conclusion: This modular approach is demonstrated to be a powerful alternative approach to analyzing high dimension microarray data and is robust to high measurement noise and intrinsic biological variance inherent in microarray data. Furthermore, efficient integration with current biological knowledge
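
    The summary-measure step described in this abstract can be sketched as follows. The abstract does not say which summary statistic is used, so this sketch assumes the arithmetic mean; the function name, gene names, module names, and expression values are all hypothetical.

```python
from statistics import mean

def functional_profile(expression, annotation):
    """Collapse per-gene expression values into one value per functional module.

    expression: dict mapping gene -> expression value for one sample
    annotation: dict mapping module -> list of genes annotated to it
    The summary statistic (arithmetic mean) is an assumption of this sketch.
    """
    profile = {}
    for module, genes in annotation.items():
        values = [expression[g] for g in genes if g in expression]
        if values:  # skip modules with no measured genes
            profile[module] = mean(values)
    return profile

# Hypothetical sample: four measured genes annotated to two modules
sample = {"geneA": 2.0, "geneB": 4.0, "geneC": 1.0, "geneD": 3.0}
modules = {"module_1": ["geneA", "geneB"], "module_2": ["geneC", "geneD", "geneE"]}
profile = functional_profile(sample, modules)
```

    The resulting dictionary has one entry per module rather than one per gene, which is the data reduction the abstract refers to; a classifier such as a decision tree would then be trained on these module-level values.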

  4. Pitch Based Sound Classification

    DEFF Research Database (Denmark)

    Nielsen, Andreas Brinch; Hansen, Lars Kai; Kjems, U

    2006-01-01

    A sound classification model is presented that can classify signals into music, noise and speech. The model extracts the pitch of the signal using the harmonic product spectrum. Based on the pitch estimate and a pitch error measure, features are created and used in a probabilistic model with soft-max output function. Both linear and quadratic inputs are used. The model is trained on 2 hours of sound and tested on publicly available data. A test classification error below 0.05 with 1 s classification windows is achieved. Furthermore, it is shown that linear input performs as well as a quadratic, and that even though classification gets marginally better, not much is achieved by increasing the window size beyond 1 s.
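
    As a rough illustration of pitch extraction with the harmonic product spectrum, the sketch below multiplies the magnitude spectrum by its downsampled copies and picks the bin where the product peaks. It is not the authors' implementation: the naive DFT, the number of harmonics, and the synthetic test tone are assumptions chosen to keep the example short and dependency-free.

```python
import cmath
import math

def magnitude_spectrum(signal):
    """Naive DFT magnitudes for bins 0..N/2 (adequate for short illustrative signals)."""
    n = len(signal)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        mags.append(abs(acc))
    return mags

def hps_pitch_bin(signal, harmonics=3):
    """Return the DFT bin that maximises the harmonic product spectrum."""
    mags = magnitude_spectrum(signal)
    limit = (len(mags) - 1) // harmonics  # highest bin whose harmonics all exist
    best_bin, best_value = 1, -1.0
    for k in range(1, limit + 1):
        product = 1.0
        for h in range(1, harmonics + 1):
            product *= mags[h * k]  # spectrum sampled at k, 2k, 3k, ...
        if product > best_value:
            best_bin, best_value = k, product
    return best_bin

# Synthetic tone: fundamental at bin 8 plus weaker 2nd and 3rd harmonics
N = 128
tone = [math.sin(2 * math.pi * 8 * t / N)
        + 0.5 * math.sin(2 * math.pi * 16 * t / N)
        + 0.25 * math.sin(2 * math.pi * 24 * t / N)
        for t in range(N)]
pitch_bin = hps_pitch_bin(tone)
```

    The bin index converts to a frequency as `pitch_bin * sample_rate / N`; a practical implementation would use an FFT and windowed frames instead of the quadratic-time DFT used here.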

  5. Laser Raman detection for oral cancer based on a Gaussian process classification method

    International Nuclear Information System (INIS)

    Du, Zhanwei; Yang, Yongjian; Bai, Yuan; Wang, Lijun; Zhang, Chijun; Chen, He; Luo, Yusheng; Su, Le; Chen, Yong; Li, Xianchang; Zhou, Xiaodong; Shen, Aiguo; Hu, Jiming; Jia, Jun

    2013-01-01

    Oral squamous cell carcinoma is the most common neoplasm of the oral cavity. The incidence rate accounts for 80% of total oral cancer and shows an upward trend in recent years. It has a high degree of malignancy and is difficult to detect in terms of differential diagnosis, as a consequence of which the timing of treatment is always delayed. In this work, Raman spectroscopy was adopted to differentially diagnose oral squamous cell carcinoma and oral gland carcinoma. In total, 852 entries of raw spectral data which consisted of 631 items from 36 oral squamous cell carcinoma patients, 87 items from four oral gland carcinoma patients and 134 items from five normal people were collected by utilizing an optical method on oral tissues. The probability distribution of the datasets corresponding to the spectral peaks of the oral squamous cell carcinoma tissue was analyzed and the experimental result showed that the data obeyed a normal distribution. Moreover, the distribution characteristic of the noise was also in compliance with a Gaussian distribution. A Gaussian process (GP) classification method was utilized to distinguish the normal people and the oral gland carcinoma patients from the oral squamous cell carcinoma patients. The experimental results showed that all the normal people could be recognized. 83.33% of the oral squamous cell carcinoma patients could be correctly diagnosed and the remaining ones would be diagnosed as having oral gland carcinoma. For the classification process of oral gland carcinoma and oral squamous cell carcinoma, the correct ratio was 66.67% and the erroneously diagnosed percentage was 33.33%. The total sensitivity was 80% and the specificity was 100%, with a Matthews correlation coefficient (MCC) of 0.447213595. Considering the numerical results above, the application prospects and clinical value of this technique are significantly impressive. (letter)

  6. Laser Raman detection for oral cancer based on a Gaussian process classification method

    Science.gov (United States)

    Du, Zhanwei; Yang, Yongjian; Bai, Yuan; Wang, Lijun; Zhang, Chijun; Chen, He; Luo, Yusheng; Su, Le; Chen, Yong; Li, Xianchang; Zhou, Xiaodong; Jia, Jun; Shen, Aiguo; Hu, Jiming

    2013-06-01

    Oral squamous cell carcinoma is the most common neoplasm of the oral cavity. The incidence rate accounts for 80% of total oral cancer and shows an upward trend in recent years. It has a high degree of malignancy and is difficult to detect in terms of differential diagnosis, as a consequence of which the timing of treatment is always delayed. In this work, Raman spectroscopy was adopted to differentially diagnose oral squamous cell carcinoma and oral gland carcinoma. In total, 852 entries of raw spectral data which consisted of 631 items from 36 oral squamous cell carcinoma patients, 87 items from four oral gland carcinoma patients and 134 items from five normal people were collected by utilizing an optical method on oral tissues. The probability distribution of the datasets corresponding to the spectral peaks of the oral squamous cell carcinoma tissue was analyzed and the experimental result showed that the data obeyed a normal distribution. Moreover, the distribution characteristic of the noise was also in compliance with a Gaussian distribution. A Gaussian process (GP) classification method was utilized to distinguish the normal people and the oral gland carcinoma patients from the oral squamous cell carcinoma patients. The experimental results showed that all the normal people could be recognized. 83.33% of the oral squamous cell carcinoma patients could be correctly diagnosed and the remaining ones would be diagnosed as having oral gland carcinoma. For the classification process of oral gland carcinoma and oral squamous cell carcinoma, the correct ratio was 66.67% and the erroneously diagnosed percentage was 33.33%. The total sensitivity was 80% and the specificity was 100%, with a Matthews correlation coefficient (MCC) of 0.447213595. Considering the numerical results above, the application prospects and clinical value of this technique are significantly impressive.
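
    For readers unfamiliar with the metrics quoted in this abstract, the sketch below shows how sensitivity, specificity, and the Matthews correlation coefficient follow from the four cells of a binary confusion matrix. The counts are hypothetical and are not the study's data; the abstract does not report the underlying confusion matrix.

```python
import math

def binary_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and Matthews correlation coefficient from counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally reported as 0 when any marginal total is zero
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc

# Hypothetical counts, chosen only to give 80% sensitivity and 100% specificity
sens, spec, mcc = binary_metrics(tp=8, fn=2, tn=5, fp=0)
```

    Because the MCC uses all four cells, two classifiers with the same sensitivity and specificity can have different MCC values when the class balance differs, which is why it is usually reported alongside the other two figures.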

  7. Using rule-based machine learning for candidate disease gene prioritization and sample classification of cancer gene expression data.

    Directory of Open Access Journals (Sweden)

    Enrico Glaab

    Full Text Available Microarray data analysis has been shown to provide an effective tool for studying cancer and genetic diseases. Although classical machine learning techniques have successfully been applied to find informative genes and to predict class labels for new samples, common restrictions of microarray analysis such as small sample sizes, a large attribute space and high noise levels still limit its scientific and clinical applications. Increasing the interpretability of prediction models while retaining a high accuracy would help to exploit the information content in microarray data more effectively. For this purpose, we evaluate our rule-based evolutionary machine learning systems, BioHEL and GAssist, on three public microarray cancer datasets, obtaining simple rule-based models for sample classification. A comparison with other benchmark microarray sample classifiers based on three diverse feature selection algorithms suggests that these evolutionary learning techniques can compete with state-of-the-art methods like support vector machines. The obtained models reach accuracies above 90% in two-level external cross-validation, with the added value of facilitating interpretation by using only combinations of simple if-then-else rules. As a further benefit, a literature mining analysis reveals that prioritizations of informative genes extracted from BioHEL's classification rule sets can outperform gene rankings obtained from a conventional ensemble feature selection in terms of the pointwise mutual information between relevant disease terms and the standardized names of top-ranked genes.

  8. Using rule-based machine learning for candidate disease gene prioritization and sample classification of cancer gene expression data.

    Science.gov (United States)

    Glaab, Enrico; Bacardit, Jaume; Garibaldi, Jonathan M; Krasnogor, Natalio

    2012-01-01

    Microarray data analysis has been shown to provide an effective tool for studying cancer and genetic diseases. Although classical machine learning techniques have successfully been applied to find informative genes and to predict class labels for new samples, common restrictions of microarray analysis such as small sample sizes, a large attribute space and high noise levels still limit its scientific and clinical applications. Increasing the interpretability of prediction models while retaining a high accuracy would help to exploit the information content in microarray data more effectively. For this purpose, we evaluate our rule-based evolutionary machine learning systems, BioHEL and GAssist, on three public microarray cancer datasets, obtaining simple rule-based models for sample classification. A comparison with other benchmark microarray sample classifiers based on three diverse feature selection algorithms suggests that these evolutionary learning techniques can compete with state-of-the-art methods like support vector machines. The obtained models reach accuracies above 90% in two-level external cross-validation, with the added value of facilitating interpretation by using only combinations of simple if-then-else rules. As a further benefit, a literature mining analysis reveals that prioritizations of informative genes extracted from BioHEL's classification rule sets can outperform gene rankings obtained from a conventional ensemble feature selection in terms of the pointwise mutual information between relevant disease terms and the standardized names of top-ranked genes.
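
    The pointwise mutual information used in the literature mining analysis can be illustrated with a small sketch. The counts below are hypothetical (documents mentioning a gene name, a disease term, or both), and the choice of a base-2 logarithm is an assumption; the abstract does not give the exact estimator.

```python
import math

def pmi(n_xy, n_x, n_y, n_total):
    """Pointwise mutual information (base 2) estimated from document counts.

    n_xy: documents mentioning both terms; n_x, n_y: documents mentioning each term;
    n_total: corpus size. Requires n_xy > 0, since log(0) is undefined.
    """
    p_xy = n_xy / n_total
    p_x = n_x / n_total
    p_y = n_y / n_total
    return math.log2(p_xy / (p_x * p_y))

# Hypothetical corpus: 1000 documents, gene in 100, disease term in 50, both in 20
score = pmi(n_xy=20, n_x=100, n_y=50, n_total=1000)
```

    A positive score means the two terms co-occur more often than independence would predict; ranking genes by such scores is one way to compare gene prioritizations against the literature.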

  9. Deep learning based classification for head and neck cancer detection with hyperspectral imaging in an animal model

    Science.gov (United States)

    Ma, Ling; Lu, Guolan; Wang, Dongsheng; Wang, Xu; Chen, Zhuo Georgia; Muller, Susan; Chen, Amy; Fei, Baowei

    2017-03-01

    Hyperspectral imaging (HSI) is an emerging imaging modality that can provide a noninvasive tool for cancer detection and image-guided surgery. HSI acquires high-resolution images at hundreds of spectral bands, providing big data for differentiating different types of tissue. We proposed a deep learning based method for the detection of head and neck cancer with hyperspectral images. Since the deep learning algorithm can learn features hierarchically, the learned features are more discriminative and concise than handcrafted features. In this study, we adopt convolutional neural networks (CNN) to learn the deep features of pixels for classifying each pixel into tumor or normal tissue. We evaluated our proposed classification method on a dataset containing hyperspectral images from 12 tumor-bearing mice. Experimental results show that our method achieved an average accuracy of 91.36%. The preliminary study demonstrated that our deep learning method can be applied to hyperspectral images for detecting head and neck tumors in animal models.

  10. Mechanism-based classification and physical therapy management of persons with cancer pain: A prospective case series

    Directory of Open Access Journals (Sweden)

    Senthil P Kumar

    2013-01-01

    Full Text Available Context: Mechanism-based classification (MBC) was established with current evidence and physical therapy (PT) management methods for both cancer and noncancer pain. Aims: This study aims to describe the efficacy of MBC-based PT in persons with primary complaints of cancer pain. Settings and Design: A prospective case series of patients who attended the physiotherapy department of a multispecialty university-affiliated teaching hospital. Material and Methods: A total of 24 adults (18 female, 6 male) aged 47.5 ± 10.6 years, with a primary diagnosis of a heterogeneous group of cancers and chief complaints of chronic disabling pain, were included in the study on their consent for participation. The patients were evaluated and classified on the basis of five predominant mechanisms for pain. Physical therapy interventions were recommended based on the mechanisms identified, and a home program was prescribed with a patient log to ensure compliance. Treatments were given in five consecutive weekly sessions over five weeks, each of 30 min duration. Statistical Analysis Used: Pre-post comparisons for the pain severity (PS) and pain interference (PI) subscales of the Brief pain inventory-Cancer pain (BPI-CP) and the European organization for research and treatment in cancer-quality of life questionnaire (EORTC-QLQ-C30) were done using the Wilcoxon signed-rank test at 95% confidence interval using SPSS for Windows version 16.0 (SPSS Inc, Chicago, IL). Results: There were statistically significant (P < 0.05) reductions in pain severity, pain interference and total BPI-CP scores, and the EORTC-QLQ-C30. Conclusion: MBC-PT was effective for improving BPI-CP and EORTC-QLQ-C30 scores in people with cancer pain.

  11. Cancer classification through filtering progressive transductive support vector machine based on gene expression data

    Science.gov (United States)

    Lu, Xinguo; Chen, Dan

    2017-08-01

    Traditional supervised classifiers work only with labeled data and neglect the large amount of data that lack sufficient follow-up information. Consequently, the small sample size limits the design of appropriate classifiers. In this paper, a transductive learning method that combines a filtering strategy in the transductive framework with a progressive labeling strategy is addressed. The progressive labeling strategy does not need to consider the distribution of labeled samples to evaluate the distribution of unlabeled samples, and can effectively solve the problem of evaluating the proportion of positive and negative samples in the working set. Our experimental results demonstrate that the proposed technique has great potential in cancer prediction based on gene expression.

  12. An iterated Laplacian based semi-supervised dimensionality reduction for classification of breast cancer on ultrasound images.

    Science.gov (United States)

    Liu, Xiao; Shi, Jun; Zhou, Shichong; Lu, Minhua

    2014-01-01

    Dimensionality reduction is an important step in ultrasound image based computer-aided diagnosis (CAD) for breast cancer. A newly proposed ℓ2,1-regularized correntropy algorithm for robust feature selection (CRFS) has achieved good performance for noise-corrupted data. Therefore, it has the potential to reduce the dimensions of ultrasound image features. However, in clinical practice, the collection of labeled instances is usually expensive and time-consuming, while it is relatively easy to acquire unlabeled or undetermined instances. Therefore, semi-supervised learning is very suitable for clinical CAD. The iterated Laplacian regularization (Iter-LR) is a new regularization method, which has been proved to outperform the traditional graph Laplacian regularization in semi-supervised classification and ranking. In this study, to augment the classification accuracy of breast ultrasound CAD based on texture features, we propose an Iter-LR-based semi-supervised CRFS (Iter-LR-CRFS) algorithm, and then apply it to reduce the feature dimensions of ultrasound images for breast CAD. We compared the Iter-LR-CRFS with LR-CRFS, original supervised CRFS, and principal component analysis. The experimental results indicate that the proposed Iter-LR-CRFS significantly outperforms all other algorithms.

  13. Lauren classification and individualized chemotherapy in gastric cancer

    OpenAIRE

    MA, JUNLI; SHEN, HONG; KAPESA, LINDA; ZENG, SHAN

    2016-01-01

    Gastric cancer is one of the most common malignancies worldwide. During the last 50 years, the histological classification of gastric carcinoma has been largely based on Lauren's criteria, in which gastric cancer is classified into two major histological subtypes, namely intestinal type and diffuse type adenocarcinoma. This classification was introduced in 1965, and remains currently widely accepted and employed, since it constitutes a simple and robust classification approach. The two histol...

  14. Granular loess classification based

    International Nuclear Information System (INIS)

    Browzin, B.S.

    1985-01-01

    This paper discusses how loess might be identified by two index properties: the granulometric composition and the dry unit weight. These two indices are necessary but not always sufficient for identification of loess. On the basis of analyses of samples from three continents, it was concluded that the 0.01-0.5-mm fraction deserves the name loessial fraction. Based on the loessial fraction concept, a granulometric classification of loess is proposed. A triangular chart is used to classify loess

  15. Normed kernel function-based fuzzy possibilistic C-means (NKFPCM) algorithm for high-dimensional breast cancer database classification with feature selection is based on Laplacian Score

    Science.gov (United States)

    Lestari, A. W.; Rustam, Z.

    2017-07-01

    In the last decade, breast cancer has become the focus of world attention as this disease is one of the leading causes of death for women. Therefore, it is necessary to have the correct precautions and treatment. In previous studies, the Fuzzy Kernel K-Medoid algorithm has been used for multi-class data. This paper proposes an algorithm to classify high-dimensional breast cancer data using Fuzzy Possibilistic C-means (FPCM) and a new method based on clustering analysis using Normed Kernel Function-Based Fuzzy Possibilistic C-Means (NKFPCM). The objective of this paper is to obtain the best accuracy in classification of breast cancer data. In order to improve the accuracy of the two methods, the candidate features are evaluated using feature selection, where the Laplacian Score is used. The results show a comparison of the accuracy and running time of FPCM and NKFPCM with and without feature selection.
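
    The Laplacian Score mentioned in this abstract ranks features by how well they preserve the local neighbourhood structure of the samples, with lower scores being better. The sketch below is a simplified illustration rather than the authors' pipeline: it assumes a fully connected heat-kernel graph instead of a k-nearest-neighbour graph, and the data, the function name, and the kernel width `t` are hypothetical.

```python
import math

def laplacian_scores(X, t=1.0):
    """Laplacian Score per feature (lower = better preserves local structure).

    X: list of samples, each a list of feature values.
    Uses heat-kernel weights on a fully connected graph, a simplifying assumption.
    """
    n, m = len(X), len(X[0])
    S = [[math.exp(-sum((a - b) ** 2 for a, b in zip(X[i], X[j])) / t) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    d = [sum(row) for row in S]          # degree of each sample in the graph
    total = sum(d)
    scores = []
    for r in range(m):
        f = [X[i][r] for i in range(n)]
        mu = sum(fi * di for fi, di in zip(f, d)) / total   # degree-weighted mean
        ft = [fi - mu for fi in f]
        # f^T L f written as a sum over graph edges
        num = 0.5 * sum(S[i][j] * (ft[i] - ft[j]) ** 2 for i in range(n) for j in range(n))
        den = sum(di * fi * fi for fi, di in zip(ft, d))    # f^T D f
        scores.append(num / den if den else float("inf"))
    return scores

# Hypothetical data: feature 0 separates two clusters, feature 1 varies within them
X = [[0, 0], [0, 1], [10, 0], [10, 1]]
scores = laplacian_scores(X)
```

    In this toy data the first feature receives the lower score because nearby samples share its value, so it would be kept ahead of the second feature before clustering.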

  16. Constructing Support Vector Machine Ensembles for Cancer Classification Based on Proteomic Profiling

    Institute of Scientific and Technical Information of China (English)

    Yong Mao; Xiao-Bo Zhou; Dao-Ying Pi; You-Xian Sun

    2005-01-01

    In this study, we present a constructive algorithm for training cooperative support vector machine ensembles (CSVMEs). CSVME combines ensemble architecture design with cooperative training for individual SVMs in ensembles. Unlike most previous studies on training ensembles, CSVME puts emphasis on both accuracy and collaboration among individual SVMs in an ensemble. A group of SVMs selected on the basis of recursive classifier elimination is used in CSVME, and the number of the individual SVMs selected to construct CSVME is determined by 10-fold cross-validation. This kind of SVME has been tested on two ovarian cancer datasets previously obtained by proteomic mass spectrometry. By combining several individual SVMs, the proposed method achieves better performance than the SVME of all base SVMs.
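
    The abstract does not state how the outputs of the individual SVMs are combined. As one common possibility, and purely as an assumption of this sketch, the example below combines base classifiers by majority vote; the threshold rules stand in for trained SVMs and the feature values are hypothetical.

```python
from collections import Counter

def majority_vote(classifiers, sample):
    """Return the label predicted by most base classifiers (ties: first label seen)."""
    votes = [clf(sample) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]

# Stand-in base classifiers: simple threshold rules on two hypothetical features
base = [
    lambda x: "cancer" if x[0] > 0.5 else "control",
    lambda x: "cancer" if x[1] > 0.3 else "control",
    lambda x: "cancer" if x[0] + x[1] > 1.2 else "control",
]
label = majority_vote(base, (0.7, 0.4))
```

    In an ensemble like the one described, the number of members passed to such a combiner would be the quantity tuned by cross-validation.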

  17. Computerized three-class classification of MRI-based prognostic markers for breast cancer

    Energy Technology Data Exchange (ETDEWEB)

    Bhooshan, Neha; Giger, Maryellen; Edwards, Darrin; Yuan Yading; Jansen, Sanaz; Li Hui; Lan Li; Newstead, Gillian [Department of Radiology, University of Chicago, Chicago, IL 60637 (United States); Sattar, Husain, E-mail: bhooshan@uchicago.edu [Department of Pathology, University of Chicago, Chicago, IL 60637 (United States)

    2011-09-21

    The purpose of this study is to investigate whether computerized analysis using three-class Bayesian artificial neural network (BANN) feature selection and classification can characterize tumor grades (grade 1, grade 2 and grade 3) of breast lesions for prognostic classification on DCE-MRI. A database of 26 IDC grade 1 lesions, 86 IDC grade 2 lesions and 58 IDC grade 3 lesions was collected. The computer automatically segmented the lesions, and kinetic and morphological lesion features were automatically extracted. The discrimination tasks (grade 1 versus grade 3, grade 2 versus grade 3, and grade 1 versus grade 2 lesions) were investigated. Step-wise feature selection was conducted by three-class BANNs. Classification was performed with three-class BANNs using leave-one-lesion-out cross-validation to yield computer-estimated probabilities of being grade 3 lesion, grade 2 lesion and grade 1 lesion. Two-class ROC analysis was used to evaluate the performances. We achieved AUC values of 0.80 ± 0.05, 0.78 ± 0.05 and 0.62 ± 0.05 for grade 1 versus grade 3, grade 1 versus grade 2, and grade 2 versus grade 3, respectively. This study shows the potential for (1) applying three-class BANN feature selection and classification to CADx and (2) expanding the role of DCE-MRI CADx from diagnostic to prognostic classification in distinguishing tumor grades.
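
    The two-class ROC analysis mentioned in this abstract summarizes each pairwise task with an AUC. A minimal sketch of that quantity, computed as the fraction of positive/negative pairs ranked correctly, is given below. The scores are hypothetical classifier outputs, not values from the study, and the study's own ROC fitting method is not specified here.

```python
def auc(scores_pos, scores_neg):
    """AUC as the probability that a positive case outranks a negative one.

    Tied pairs count as half a correct ranking.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical estimated probabilities of being a grade 3 lesion
grade3 = [0.9, 0.8, 0.6, 0.55]
grade1 = [0.7, 0.4, 0.3]
value = auc(grade3, grade1)
```

    An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which helps put figures such as 0.62 and 0.80 in context.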

  18. Identification of immune cell infiltration in hematoxylin-eosin stained breast cancer samples: texture-based classification of tissue morphologies

    Science.gov (United States)

    Turkki, Riku; Linder, Nina; Kovanen, Panu E.; Pellinen, Teijo; Lundin, Johan

    2016-03-01

    The characteristics of immune cells in the tumor microenvironment of breast cancer capture clinically important information. Despite the heterogeneity of tumor-infiltrating immune cells, it has been shown that the degree of infiltration assessed by visual evaluation of hematoxylin-eosin (H&E) stained samples has prognostic and possibly predictive value. However, quantification of the infiltration in H&E-stained tissue samples is currently dependent on visual scoring by an expert. Computer vision enables automated characterization of the components of the tumor microenvironment, and texture-based methods have successfully been used to discriminate between different tissue morphologies and cell phenotypes. In this study, we evaluate whether local binary pattern texture features with superpixel segmentation and classification with support vector machine can be utilized to identify immune cell infiltration in H&E-stained breast cancer samples. Guided with the pan-leukocyte CD45 marker, we annotated training and test sets from 20 primary breast cancer samples. In the training set of arbitrary sized image regions (n=1,116) a 3-fold cross-validation resulted in 98% accuracy and an area under the receiver-operating characteristic curve (AUC) of 0.98 to discriminate between immune cell-rich and -poor areas. In the test set (n=204), we achieved an accuracy of 96% and AUC of 0.99 to label cropped tissue regions correctly into immune cell-rich and -poor categories. The obtained results demonstrate strong discrimination between immune cell-rich and -poor tissue morphologies. The proposed method can provide a quantitative measurement of the degree of immune cell infiltration and can be applied to digitally scanned H&E-stained breast cancer samples for diagnostic purposes.
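
    The local binary pattern texture feature named in this abstract can be illustrated on a single 3x3 neighbourhood. This is the basic 8-neighbour variant, shown as a sketch; the abstract does not say which LBP variant, radius, or neighbour ordering the authors used, and the pixel values are hypothetical.

```python
def lbp_code(patch):
    """8-neighbour local binary pattern code for the centre pixel of a 3x3 patch."""
    center = patch[1][1]
    # Neighbours visited clockwise from the top-left corner (the ordering is a convention)
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        if patch[r][c] >= center:   # neighbour at least as bright as the centre
            code |= 1 << bit
    return code

# Hypothetical grey-level patch
patch = [
    [5, 9, 1],
    [4, 6, 7],
    [2, 8, 3],
]
code = lbp_code(patch)
```

    A texture descriptor for an image region is then typically the histogram of such codes over all pixels in the region, which is the kind of feature vector a support vector machine can classify.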

  19. Cancer classification in the genomic era: five contemporary problems.

    Science.gov (United States)

    Song, Qingxuan; Merajver, Sofia D; Li, Jun Z

    2015-10-19

    Classification is an everyday instinct as well as a full-fledged scientific discipline. Throughout the history of medicine, disease classification is central to how we develop knowledge, make diagnosis, and assign treatment. Here, we discuss the classification of cancer and the process of categorizing cancer subtypes based on their observed clinical and biological features. Traditionally, cancer nomenclature is primarily based on organ location, e.g., "lung cancer" designates a tumor originating in lung structures. Within each organ-specific major type, finer subgroups can be defined based on patient age, cell type, histological grades, and sometimes molecular markers, e.g., hormonal receptor status in breast cancer or microsatellite instability in colorectal cancer. In the past 15+ years, high-throughput technologies have generated rich new data regarding somatic variations in DNA, RNA, protein, or epigenomic features for many cancers. These data, collected for increasingly large tumor cohorts, have provided not only new insights into the biological diversity of human cancers but also exciting opportunities to discover previously unrecognized cancer subtypes. Meanwhile, the unprecedented volume and complexity of these data pose significant challenges for biostatisticians, cancer biologists, and clinicians alike. Here, we review five related issues that represent contemporary problems in cancer taxonomy and interpretation. (1) How many cancer subtypes are there? (2) How can we evaluate the robustness of a new classification system? (3) How are classification systems affected by intratumor heterogeneity and tumor evolution? (4) How should we interpret cancer subtypes? (5) Can multiple classification systems co-exist? While related issues have existed for a long time, we will focus on those aspects that have been magnified by the recent influx of complex multi-omics data. 
Exploration of these problems is essential for data-driven refinement of cancer classification

  20. Gene selection and classification for cancer microarray data based on machine learning and similarity measures

    Directory of Open Access Journals (Sweden)

    Liu Qingzhong

    2011-12-01

    Full Text Available Abstract Background: Microarray data have a high dimension of variables and a small sample size. In microarray data analyses, two important issues are how to choose genes, which provide reliable and good prediction for disease status, and how to determine the final gene set that is best for classification. Associations among genetic markers mean one can exploit information redundancy to potentially reduce classification cost in terms of time and money. Results: To deal with redundant information and improve classification, we propose a gene selection method, Recursive Feature Addition, which combines supervised learning and statistical similarity measures. To determine the final optimal gene set for prediction and classification, we propose an algorithm, Lagging Prediction Peephole Optimization. By using six benchmark microarray gene expression data sets, we compared Recursive Feature Addition with recently developed gene selection methods: Support Vector Machine Recursive Feature Elimination, Leave-One-Out Calculation Sequential Forward Selection and several others. Conclusions: On average, with the use of popular learning machines including Nearest Mean Scaled Classifier, Support Vector Machine, Naive Bayes Classifier and Random Forest, Recursive Feature Addition outperformed other methods. Our studies also showed that Lagging Prediction Peephole Optimization is superior to random strategy; Recursive Feature Addition with Lagging Prediction Peephole Optimization obtained better testing accuracies than the gene selection method varSelRF.

  1. Molecular Classification and Correlates in Colorectal Cancer

    OpenAIRE

    Ogino, Shuji; Goel, Ajay

    2008-01-01

    Molecular classification of colorectal cancer is evolving. As our understanding of colorectal carcinogenesis improves, we are incorporating new knowledge into the classification system. In particular, global genomic status [microsatellite instability (MSI) status and chromosomal instability (CIN) status] and epigenomic status [CpG island methylator phenotype (CIMP) status] play a significant role in determining clinical, pathological and biological characteristics of colorectal cancer. In thi...

  2. Lauren classification and individualized chemotherapy in gastric cancer.

    Science.gov (United States)

    Ma, Junli; Shen, Hong; Kapesa, Linda; Zeng, Shan

    2016-05-01

    Gastric cancer is one of the most common malignancies worldwide. During the last 50 years, the histological classification of gastric carcinoma has been largely based on Lauren's criteria, in which gastric cancer is classified into two major histological subtypes, namely intestinal type and diffuse type adenocarcinoma. This classification was introduced in 1965, and remains currently widely accepted and employed, since it constitutes a simple and robust classification approach. The two histological subtypes of gastric cancer proposed by the Lauren classification exhibit a number of distinct clinical and molecular characteristics, including histogenesis, cell differentiation, epidemiology, etiology, carcinogenesis, biological behaviors and prognosis. Gastric cancer exhibits varied sensitivity to chemotherapy drugs and significant heterogeneity; therefore, the disease may be a target for individualized therapy. The Lauren classification may provide the basis for individualized treatment for advanced gastric cancer, which is increasingly gaining attention in the scientific field. However, few studies have investigated individualized treatment that is guided by pathological classification. The aim of the current review is to analyze the two major histological subtypes of gastric cancer, as proposed by the Lauren classification, and to discuss the implications of this for personalized chemotherapy.

  3. Laser Raman detection for oral cancer based on an adaptive Gaussian process classification method with posterior probabilities

    International Nuclear Information System (INIS)

    Du, Zhanwei; Yang, Yongjian; Bai, Yuan; Wang, Lijun; Su, Le; Chen, Yong; Li, Xianchang; Zhou, Xiaodong; Shen, Aiguo; Hu, Jiming; Jia, Jun

    2013-01-01

    The existing methods for early and differential diagnosis of oral cancer are limited due to the unapparent early symptoms and the imperfect imaging examination methods. In this paper, the classification models of oral adenocarcinoma, carcinoma tissues and a control group with just four features are established by utilizing the hybrid Gaussian process (HGP) classification algorithm, with the introduction of the mechanisms of noise reduction and posterior probability. HGP shows much better performance in the experimental results. During the experimental process, oral tissues were divided into three groups, adenocarcinoma (n = 87), carcinoma (n = 100) and the control group (n = 134). The spectral data for these groups were collected. The prospective application of the proposed HGP classification method improved the diagnostic sensitivity to 56.35% and the specificity to about 70.00%, and resulted in a Matthews correlation coefficient (MCC) of 0.36. It is proved that the utilization of HGP in LRS detection analysis for the diagnosis of oral cancer gives accurate results. The prospect of application is also satisfactory.

  4. Laser Raman detection for oral cancer based on an adaptive Gaussian process classification method with posterior probabilities

    Science.gov (United States)

    Du, Zhanwei; Yang, Yongjian; Bai, Yuan; Wang, Lijun; Su, Le; Chen, Yong; Li, Xianchang; Zhou, Xiaodong; Jia, Jun; Shen, Aiguo; Hu, Jiming

    2013-03-01

    The existing methods for early and differential diagnosis of oral cancer are limited due to the unapparent early symptoms and the imperfect imaging examination methods. In this paper, the classification models of oral adenocarcinoma, carcinoma tissues and a control group with just four features are established by utilizing the hybrid Gaussian process (HGP) classification algorithm, with the introduction of the mechanisms of noise reduction and posterior probability. HGP shows much better performance in the experimental results. During the experimental process, oral tissues were divided into three groups, adenocarcinoma (n = 87), carcinoma (n = 100) and the control group (n = 134). The spectral data for these groups were collected. The prospective application of the proposed HGP classification method improved the diagnostic sensitivity to 56.35% and the specificity to about 70.00%, and resulted in a Matthews correlation coefficient (MCC) of 0.36. It is proved that the utilization of HGP in LRS detection analysis for the diagnosis of oral cancer gives accurate results. The prospect of application is also satisfactory.

  5. Cluster Based Text Classification Model

    DEFF Research Database (Denmark)

    Nizamani, Sarwat; Memon, Nasrullah; Wiil, Uffe Kock

    2011-01-01

    We propose a cluster based classification model for suspicious email detection and other text classification tasks. The text classification tasks comprise many training examples that require a complex classification model. Using clusters for classification makes the model simpler and increases the accuracy at the same time. The test example is classified using a simpler and smaller model. The training examples in a particular cluster share a common vocabulary. At the time of clustering, we do not take into account the labels of the training examples. After the clusters have been created, the classifier is trained on each cluster, which has reduced dimensionality and fewer examples. The experimental results show that the proposed model outperforms the existing classification models for the task of suspicious email detection and topic categorization on the Reuters-21578 and 20 Newsgroups...

  6. Novelty detection for breast cancer image classification

    Science.gov (United States)

    Cichosz, Pawel; Jagodziński, Dariusz; Matysiewicz, Mateusz; Neumann, Łukasz; Nowak, Robert M.; Okuniewski, Rafał; Oleszkiewicz, Witold

    2016-09-01

    Using classification learning algorithms for medical applications may require not only refined model creation techniques and careful unbiased model evaluation, but also detecting the risk of misclassification at the time of model application. This is addressed by novelty detection, which identifies instances for which the training set is not sufficiently representative and for which it may be safer to refrain from classification and request a human expert diagnosis. The paper investigates two techniques for isolated instance identification, based on clustering and one-class support vector machines, which represent two different approaches to multidimensional outlier detection. The prediction quality for isolated instances in breast cancer image data is evaluated using the random forest algorithm and found to be substantially inferior to the prediction quality for non-isolated instances. Each of the two techniques is then used to create a novelty detection model which can be combined with a classification model and used at the time of prediction to detect instances for which the latter cannot be reliably applied. Novelty detection is demonstrated to improve random forest prediction quality and argued to deserve further investigation in medical applications.

  7. Distinct clinical outcomes of two CIMP-positive colorectal cancer subtypes based on a revised CIMP classification system.

    Science.gov (United States)

    Bae, Jeong Mo; Kim, Jung Ho; Kwak, Yoonjin; Lee, Dae-Won; Cha, Yongjun; Wen, Xianyu; Lee, Tae Hun; Cho, Nam-Yun; Jeong, Seung-Yong; Park, Kyu Joo; Han, Sae Won; Lee, Hye Seung; Kim, Tae-You; Kang, Gyeong Hoon

    2017-04-11

    Colorectal cancer (CRC) is a heterogeneous disease in terms of molecular carcinogenic pathways. Based on recent findings regarding the multiple serrated neoplasia pathway, we revised an eight-marker panel for a new CIMP classification system. 1370 patients who received surgical resection for CRCs were classified into three CIMP subtypes (CIMP-N: 0-4 methylated markers, CIMP-P1: 5-6 methylated markers and CIMP-P2: 7-8 methylated markers). Our findings were validated in a separate set of high-risk stage II or stage III CRCs receiving adjuvant fluoropyrimidine plus oxaliplatin (n=950). A total of 1287/62/21 CRC cases were classified as CIMP-N/CIMP-P1/CIMP-P2, respectively. CIMP-N showed male predominance, distal location, lower T and N categories, and absence of BRAF mutation, microsatellite instability (MSI) and MLH1 methylation. CIMP-P1 showed female predominance, proximal location, advanced TNM stage, mild decrease of CK20 and CDX2 expression, mild increase of CK7 expression, BRAF mutation, MSI and MLH1 methylation. CIMP-P2 showed older age, female predominance, proximal location, advanced T category, markedly reduced CK20 and CDX2 expression, rare KRAS mutation, high frequency of CK7 expression, BRAF mutation, MSI and MLH1 methylation. CIMP-N showed better 5-year cancer-specific survival (CSS; HR=0.47; 95% CI: 0.28-0.78) in the discovery set and better 5-year relapse-free survival (RFS; HR=0.50; 95% CI: 0.29-0.88) in the validation set compared with CIMP-P1. CIMP-P2 showed marginally better 5-year CSS (HR=0.28, 95% CI: 0.07-1.22) in the discovery set and marginally better 5-year RFS (HR=0.21, 95% CI: 0.05-0.92) in the validation set compared with CIMP-P1. CIMP subtypes classified using our revised system showed different clinical outcomes, demonstrating the heterogeneity of multiple serrated precursors of CIMP-positive CRCs.
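
    The three-way CIMP rule stated in this abstract is a simple threshold on the number of methylated markers in the eight-marker panel. A minimal sketch of that rule follows; the function name is hypothetical, and only the cut-offs (0-4, 5-6, 7-8) come from the abstract.

```python
def cimp_subtype(n_methylated: int) -> str:
    """Map the count of methylated markers (0-8) to a CIMP subtype label."""
    if not 0 <= n_methylated <= 8:
        raise ValueError("the panel has eight markers; expected a count from 0 to 8")
    if n_methylated <= 4:
        return "CIMP-N"
    if n_methylated <= 6:
        return "CIMP-P1"
    return "CIMP-P2"

for count in (0, 4, 5, 6, 7, 8):
    print(count, cimp_subtype(count))
```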

  8. A New Classification Approach Based on Multiple Classification Rules

    OpenAIRE

    Zhongmei Zhou

    2014-01-01

    A good classifier can correctly predict new data for which the class label is unknown, so it is important to construct a high accuracy classifier. Hence, classification techniques are very useful in ubiquitous computing. Associative classification achieves higher classification accuracy than some traditional rule-based classification approaches. However, the approach also has two major deficiencies. First, it generates a very large number of association classification rules, especially when t...

  9. CLASSIFICATION OF SEVERAL SKIN CANCER TYPES BASED ON AUTOFLUORESCENCE INTENSITY OF VISIBLE LIGHT TO NEAR INFRARED RATIO

    Directory of Open Access Journals (Sweden)

    Aryo Tedjo

    2009-12-01

    Full Text Available Skin cancer is a malignant growth on the skin caused by many factors. The most common skin cancers are Basal Cell Cancer (BCC) and Squamous Cell Cancer (SCC). This research uses a discriminant analysis to classify some tissues of skin cancer based on criterion number of independent variables. The independent variables are combinations of excitation light sources (LED lamps), filters, and sensors used to measure the Autofluorescence Intensity (IAF) ratio of visible light to near infrared (VIS/NIR) in paraffin-embedded tissue biopsies from BCC, SCC, and Lipoma. The discriminant analysis shows that the discriminant function is determined by 4 (four) independent variables, i.e., Blue LED-Red Filter, Blue LED-Yellow Filter, UV LED-Blue Filter, and UV LED-Yellow Filter. The accuracy of the discriminant analysis in classifying the three tissue types is 100%.

  10. A Classification Framework Applied to Cancer Gene Expression Profiles

    Directory of Open Access Journals (Sweden)

    Hussein Hijazi

    2013-01-01

    Full Text Available Classification of cancer based on gene expression has provided insight into possible treatment strategies. Thus, developing machine learning methods that can successfully distinguish among cancer subtypes or normal versus cancer samples is important. This work discusses supervised learning techniques that have been employed to classify cancers. Furthermore, a two-step feature selection method based on an attribute estimation method (e.g., ReliefF) and a genetic algorithm was employed to find a set of genes that can best differentiate between cancer subtypes or normal versus cancer samples. The application of different classification methods (e.g., decision tree, k-nearest neighbor, support vector machine (SVM), bagging, and random forest) on 5 cancer datasets shows that no classification method universally outperforms all the others. However, k-nearest neighbor and linear SVM generally improve the classification performance over other classifiers. Finally, incorporating diverse types of genomic data (e.g., protein-protein interaction data and gene expression) increases the prediction accuracy as compared to using gene expression alone.
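
    Comparing classifiers on the same data, as this abstract describes, usually means scoring each one under an identical resampling scheme. The sketch below is only an illustration of that idea, not the authors' framework: it scores a k-nearest-neighbour classifier and a nearest-centroid classifier by leave-one-out accuracy. All names and the toy "expression" values are made up.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def centroid_predict(train, query):
    """Assign the label of the closest class centroid."""
    sums, counts = {}, Counter()
    for x, label in train:
        counts[label] += 1
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
    centroids = {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}
    return min(centroids, key=lambda lab: math.dist(centroids[lab], query))

def loo_accuracy(data, predict):
    """Leave-one-out accuracy: hold out each sample once, train on the rest."""
    hits = sum(predict(data[:i] + data[i + 1:], x) == label
               for i, (x, label) in enumerate(data))
    return hits / len(data)

# Toy two-gene samples (made-up values).
data = [((1.0, 1.2), "normal"), ((0.9, 1.0), "normal"),
        ((1.1, 0.8), "normal"), ((1.2, 1.1), "normal"),
        ((3.0, 3.1), "tumor"), ((2.8, 3.3), "tumor"),
        ((3.2, 2.9), "tumor"), ((3.1, 3.0), "tumor")]

for name, fn in (("3-NN", knn_predict), ("nearest centroid", centroid_predict)):
    print(name, loo_accuracy(data, fn))
```

    On real microarray data the ranking of classifiers typically changes from dataset to dataset, which is consistent with the abstract's observation that no method wins universally.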

  11. Magnetic resonance imaging texture analysis classification of primary breast cancer

    International Nuclear Information System (INIS)

    Waugh, S.A.; Lerski, R.A.; Purdie, C.A.; Jordan, L.B.; Vinnicombe, S.; Martin, P.; Thompson, A.M.

    2016-01-01

    Patient-tailored treatments for breast cancer are based on histological and immunohistochemical (IHC) subtypes. Magnetic Resonance Imaging (MRI) texture analysis (TA) may be useful in non-invasive lesion subtype classification. Women with newly diagnosed primary breast cancer underwent pre-treatment dynamic contrast-enhanced breast MRI. TA was performed using co-occurrence matrix (COM) features, by creating a model on retrospective training data, then prospectively applying to a test set. Analyses were blinded to breast pathology. Subtype classifications were performed using a cross-validated k-nearest-neighbour (k = 3) technique, with accuracy relative to pathology assessed and the area under the receiver operating characteristic curve (AUROC) calculated. Mann-Whitney U and Kruskal-Wallis tests were used to assess raw entropy feature values. Histological subtype classifications were similar across training (n = 148 cancers) and test sets (n = 73 lesions) using all COM features (training: 75 %, AUROC = 0.816; test: 72.5 %, AUROC = 0.823). Entropy features were significantly different between lobular and ductal cancers (p < 0.001; Mann-Whitney U). IHC classifications using COM features were also similar for training and test data (training: 57.2 %, AUROC = 0.754; test: 57.0 %, AUROC = 0.750). Hormone receptor positive and negative cancers demonstrated significantly different entropy features. Entropy features alone were unable to create a robust classification model. Textural differences on contrast-enhanced MR images may reflect underlying lesion subtypes, which merits testing against treatment response.
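
    The entropy feature mentioned in this abstract is computed from a grey-level co-occurrence matrix, which counts how often pairs of grey levels occur at a fixed pixel offset. The sketch below shows one common definition (Shannon entropy of the normalized, non-symmetric matrix); the offset, the number of grey levels and the tiny example images are assumptions, since the abstract does not give those settings.

```python
import math
from collections import Counter

def cooccurrence_entropy(image, dx=1, dy=0):
    """Entropy in bits of the normalized co-occurrence matrix for offset (dx, dy)."""
    pairs = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                pairs[(image[r][c], image[r2][c2])] += 1
    total = sum(pairs.values())
    return -sum((n / total) * math.log2(n / total) for n in pairs.values())

uniform = [[1, 1, 1], [1, 1, 1]]   # one grey level: no texture
mixed = [[0, 1, 2], [2, 0, 1]]     # several distinct neighbour pairs

print(cooccurrence_entropy(uniform), cooccurrence_entropy(mixed))
```

    A homogeneous region concentrates all counts in one matrix cell and gives zero entropy, while heterogeneous texture spreads the counts and raises it.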

  12. Magnetic resonance imaging texture analysis classification of primary breast cancer

    Energy Technology Data Exchange (ETDEWEB)

    Waugh, S.A.; Lerski, R.A. [Ninewells Hospital and Medical School, Department of Medical Physics, Dundee (United Kingdom); Purdie, C.A.; Jordan, L.B. [Ninewells Hospital and Medical School, Department of Pathology, Dundee (United Kingdom); Vinnicombe, S. [University of Dundee, Division of Imaging and Technology, Ninewells Hospital and Medical School, Dundee (United Kingdom); Martin, P. [Ninewells Hospital and Medical School, Department of Clinical Radiology, Dundee (United Kingdom); Thompson, A.M. [University of Texas MD Anderson Cancer Center, Department of Surgical Oncology, Houston, TX (United States)

    2016-02-15

    Patient-tailored treatments for breast cancer are based on histological and immunohistochemical (IHC) subtypes. Magnetic Resonance Imaging (MRI) texture analysis (TA) may be useful in non-invasive lesion subtype classification. Women with newly diagnosed primary breast cancer underwent pre-treatment dynamic contrast-enhanced breast MRI. TA was performed using co-occurrence matrix (COM) features, by creating a model on retrospective training data, then prospectively applying to a test set. Analyses were blinded to breast pathology. Subtype classifications were performed using a cross-validated k-nearest-neighbour (k = 3) technique, with accuracy relative to pathology assessed and the area under the receiver operating characteristic curve (AUROC) calculated. Mann-Whitney U and Kruskal-Wallis tests were used to assess raw entropy feature values. Histological subtype classifications were similar across training (n = 148 cancers) and test sets (n = 73 lesions) using all COM features (training: 75 %, AUROC = 0.816; test: 72.5 %, AUROC = 0.823). Entropy features were significantly different between lobular and ductal cancers (p < 0.001; Mann-Whitney U). IHC classifications using COM features were also similar for training and test data (training: 57.2 %, AUROC = 0.754; test: 57.0 %, AUROC = 0.750). Hormone receptor positive and negative cancers demonstrated significantly different entropy features. Entropy features alone were unable to create a robust classification model. Textural differences on contrast-enhanced MR images may reflect underlying lesion subtypes, which merits testing against treatment response.

  13. The classification of lung cancers and their degree of malignancy by FTIR, PCA-LDA analysis, and a physics-based computational model.

    Science.gov (United States)

    Kaznowska, E; Depciuch, J; Łach, K; Kołodziej, M; Koziorowska, A; Vongsvivut, J; Zawlik, I; Cholewa, M; Cebulski, J

    2018-08-15

    Lung cancer has the highest mortality rate of all malignant tumours. The current effects of cancer treatment, as well as its diagnostics, are unsatisfactory. Therefore it is very important to introduce modern diagnostic tools, which will allow for rapid classification of lung cancers and their degree of malignancy. For this purpose, the authors propose the use of Fourier Transform InfraRed (FTIR) spectroscopy combined with Principal Component Analysis-Linear Discriminant Analysis (PCA-LDA) and a physics-based computational model. The results obtained for lung cancer tissues, adenocarcinoma and squamous cell carcinoma FTIR spectra, show a shift in wavenumbers compared to control tissue FTIR spectra. Furthermore, in the FTIR spectra of adenocarcinoma there are no peaks corresponding to glutamate or phospholipid functional groups. Moreover, in the case of G2 and G3 malignancy of adenocarcinoma lung cancer, the absence of an OH groups peak was noticed. Thus, it seems that FTIR spectroscopy is a valuable tool to classify lung cancer and to determine the degree of its malignancy. Copyright © 2018 Elsevier B.V. All rights reserved.

  14. Pathohistological classification systems in gastric cancer: diagnostic relevance and prognostic value.

    Science.gov (United States)

    Berlth, Felix; Bollschweiler, Elfriede; Drebber, Uta; Hoelscher, Arnulf H; Moenig, Stefan

    2014-05-21

    Several pathohistological classification systems exist for the diagnosis of gastric cancer. Many studies have investigated the correlation between the pathohistological characteristics in gastric cancer and patient characteristics, disease specific criteria and overall outcome. It is still controversial as to which classification system imparts the most reliable information, and therefore, the choice of system may vary in clinical routine. In addition to the most common classification systems, such as the Laurén and the World Health Organization (WHO) classifications, other authors have tried to characterize and classify gastric cancer based on the microscopic morphology and in reference to the clinical outcome of the patients. In more than 50 years of systematic classification of the pathohistological characteristics of gastric cancer, there is no sole classification system that is consistently used worldwide in diagnostics and research. However, several national guidelines for the treatment of gastric cancer refer to the Laurén or the WHO classifications regarding therapeutic decision-making, which underlines the importance of a reliable classification system for gastric cancer. The latest results from gastric cancer studies indicate that it might be useful to integrate DNA- and RNA-based features of gastric cancer into the classification systems to establish prognostic relevance. This article reviews the diagnostic relevance and the prognostic value of different pathohistological classification systems in gastric cancer.

  15. Gastric cancer: epidemiology, prevention, classification, and treatment

    Directory of Open Access Journals (Sweden)

    Sitarz R

    2018-02-01

    Full Text Available Robert Sitarz (1–3), Małgorzata Skierucha (1,2), Jerzy Mielko (1), G Johan A Offerhaus (3), Ryszard Maciejewski (2), Wojciech P Polkowski (1). Affiliations: (1) Department of Surgical Oncology, Medical University of Lublin, Lublin, Poland; (2) Department of Human Anatomy, Medical University of Lublin, Lublin, Poland; (3) Department of Pathology, University Medical Centre, Utrecht, The Netherlands. Abstract: Gastric cancer is the second most common cause of cancer-related deaths in the world, the epidemiology of which has changed within the last decades. A trend of steady decline in gastric cancer incidence rates is the effect of the increased standards of hygiene, conscious nutrition, and Helicobacter pylori eradication, which together constitute primary prevention. Avoidance of gastric cancer remains a priority. However, patients with higher risk should be screened for early detection and chemoprevention. Surgical resection enhanced by standardized lymphadenectomy remains the gold standard in gastric cancer therapy. This review briefly summarizes the most important aspects of gastric cancers, which include epidemiology, risk factors, classification, diagnosis, prevention, and treatment. The paper is mostly addressed to physicians who are interested in updating the state of the art concerning gastric carcinoma from an easily accessible and credible source. Keywords: gastric cancer, epidemiology, classification, risk factors, treatment

  16. Influence of nuclei segmentation on breast cancer malignancy classification

    Science.gov (United States)

    Jelen, Lukasz; Fevens, Thomas; Krzyzak, Adam

    2009-02-01

    Breast Cancer is one of the most deadly cancers affecting middle-aged women. Accurate diagnosis and prognosis are crucial to reduce the high death rate. Nowadays there are numerous diagnostic tools for breast cancer diagnosis. In this paper we discuss a role of nuclear segmentation from fine needle aspiration biopsy (FNA) slides and its influence on malignancy classification. Classification of malignancy plays a very important role during the diagnosis process of breast cancer. Out of all cancer diagnostic tools, FNA slides provide the most valuable information about the cancer malignancy grade which helps to choose an appropriate treatment. This process involves assessing numerous nuclear features and therefore precise segmentation of nuclei is very important. In this work we compare three powerful segmentation approaches and test their impact on the classification of breast cancer malignancy. The studied approaches involve level set segmentation, fuzzy c-means segmentation and textural segmentation based on co-occurrence matrix. Segmented nuclei were used to extract nuclear features for malignancy classification. For classification purposes four different classifiers were trained and tested with previously extracted features. The compared classifiers are Multilayer Perceptron (MLP), Self-Organizing Maps (SOM), Principal Component-based Neural Network (PCA) and Support Vector Machines (SVM). The presented results show that level set segmentation yields the best results over the three compared approaches and leads to a good feature extraction with a lowest average error rate of 6.51% over four different classifiers. The best performance was recorded for multilayer perceptron with an error rate of 3.07% using fuzzy c-means segmentation.

  17. The eighth TNM classification system for lung cancer: A consideration based on the degree of pleural invasion and involved neighboring structures.

    Science.gov (United States)

    Sakakura, Noriaki; Mizuno, Tetsuya; Kuroda, Hiroaki; Arimura, Takaaki; Yatabe, Yasushi; Yoshimura, Kenichi; Sakao, Yukinori

    2018-04-01

    The eighth tumor-node-metastasis (TNM) classification system for lung cancer has been used since January 2017 and must be applied to an individual institution's database. We analyzed pathological stage data of 2756 patients who underwent resection of non-small-cell lung cancer, particularly in terms of the degree of visceral pleural invasion and involved neighboring structures. Few patients had stage IIA disease (103, 4%); stratification between stages IB and IIA was insufficient (p = 0.129). When T2a tumors were divided into PL1 and PL2 subgroups based on the degree of pleural invasion, there was a significant prognostic difference between the subgroups (p consideration. Copyright © 2018 Elsevier B.V. All rights reserved.

  18. [Assessment of functioning in patients with head and neck cancer based on the international classification of functioning, disability and health (ICF)].

    Science.gov (United States)

    Tschiesner, U

    2011-09-01

    This article addresses the question of how preservation of function after treatment of head and neck cancer (HNC) can be defined and measured across treatment approaches. On the basis of the "International Classification of Functioning, Disability and Health (ICF)", it summarizes a series of efforts to integrate all relevant aspects of the interdisciplinary team into a common concept. Different efforts on the development, validation and implementation of ICF Core Sets for head and neck cancer (ICF-HNC) are discussed. The ICF-HNC covers organ-based problems with food ingestion, breathing, and speech, as well as psychosocial difficulties. Relationships between the ICF-HNC and well-established outcome measures are illustrated. This enables the user to integrate different aspects of functional outcome into a consolidated approach towards preservation/rehabilitation of functioning after HNC, applicable to a variety of treatment approaches and health professions. Georg Thieme Verlag KG Stuttgart · New York.

  19. Tolerance to missing data using a likelihood ratio based classifier for computer-aided classification of breast cancer

    International Nuclear Information System (INIS)

    Bilska-Wolak, Anna O; Floyd, Carey E Jr

    2004-01-01

    While mammography is a highly sensitive method for detecting breast tumours, its ability to differentiate between malignant and benign lesions is low, which may result in as many as 70% unnecessary biopsies. The purpose of this study was to develop a highly specific computer-aided diagnosis algorithm to improve classification of mammographic masses. A classifier based on the likelihood ratio was developed to accommodate cases with missing data. Data for development included 671 biopsy cases (245 malignant), with biopsy-proved outcome. Sixteen features based on the BI-RADS™ lexicon and patient history had been recorded for the cases, with 1.3 ± 1.1 missing feature values per case. Classifier evaluation methods included receiver operating characteristic and leave-one-out bootstrap sampling. The classifier achieved 32% specificity at 100% sensitivity on the 671 cases with 16 features that had missing values. Utilizing just the seven features present for all cases resulted in decreased performance at 100% sensitivity with average 19% specificity. No cases and no feature data were omitted during classifier development, showing that it is more beneficial to utilize cases with missing values than to discard incomplete cases that cannot be handled by many algorithms. Classification of mammographic masses was commendable at high sensitivity levels, indicating that benign cases could be potentially spared from biopsy.
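
    One way a likelihood-ratio classifier can tolerate missing data is to let every missing feature contribute a neutral factor of 1. The sketch below illustrates that idea under a naive assumption of independent categorical features with additive smoothing; it is not the authors' algorithm, and the feature names, values and cases are hypothetical.

```python
from collections import defaultdict

def fit_likelihoods(cases, smoothing=1.0):
    """Estimate P(value | class) per feature, skipping missing values (None)."""
    counts = defaultdict(lambda: defaultdict(lambda: {True: 0, False: 0}))
    for features, malignant in cases:
        for name, value in features.items():
            if value is not None:
                counts[name][value][malignant] += 1
    model = {}
    for name, by_value in counts.items():
        n_values = len(by_value)
        totals = {cls: sum(c[cls] for c in by_value.values()) for cls in (True, False)}
        model[name] = {
            value: {cls: (c[cls] + smoothing) / (totals[cls] + smoothing * n_values)
                    for cls in (True, False)}
            for value, c in by_value.items()
        }
    return model

def likelihood_ratio(model, features):
    """Product of P(value|malignant) / P(value|benign); missing or unseen values count as 1."""
    lr = 1.0
    for name, value in features.items():
        probs = model.get(name, {}).get(value)
        if value is not None and probs is not None:
            lr *= probs[True] / probs[False]
    return lr

# Hypothetical training cases: (features, is_malignant).
cases = [
    ({"margin": "spiculated", "shape": "irregular"}, True),
    ({"margin": "spiculated", "shape": None}, True),
    ({"margin": "circumscribed", "shape": "round"}, False),
    ({"margin": None, "shape": "round"}, False),
]
model = fit_likelihoods(cases)
print(likelihood_ratio(model, {"margin": "spiculated", "shape": None}))
```

    A case is called malignant when its ratio exceeds a chosen threshold; lowering the threshold until no malignant training case is missed corresponds to operating at 100% sensitivity, the setting the abstract reports.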

  20. Association between gastric cancer and the Kyoto classification of gastritis.

    Science.gov (United States)

    Shichijo, Satoki; Hirata, Yoshihiro; Niikura, Ryota; Hayakawa, Yoku; Yamada, Atsuo; Koike, Kazuhiko

    2017-09-01

    Histological gastritis is associated with gastric cancer, but its diagnosis requires biopsy. Many classifications of endoscopic gastritis are available, but not all are useful for risk stratification of gastric cancer. The Kyoto Classification of Gastritis was proposed at the 85th Congress of the Japan Gastroenterological Endoscopy Society. This cross-sectional study evaluated the usefulness of the Kyoto Classification of Gastritis for risk stratification of gastric cancer. From August 2013 to September 2014, esophagogastroduodenoscopy was performed and the gastric findings evaluated according to the Kyoto Classification of Gastritis in a total of 4062 patients. The following five endoscopic findings were selected based on previous reports: atrophy, intestinal metaplasia, enlarged folds, nodularity, and diffuse redness. A total of 3392 patients (1746 [51%] men and 1646 [49%] women) were analyzed. Among them, 107 gastric cancers were diagnosed. Atrophy was found in 2585 (78%) and intestinal metaplasia in 924 (27%). Enlarged folds, nodularity, and diffuse redness were found in 197 (5.8%), 22 (0.6%), and 573 (17%), respectively. In univariate analyses, the severity of atrophy, intestinal metaplasia, diffuse redness, age, and male sex were associated with gastric cancer. In a multivariate analysis, atrophy and male sex were found to be independent risk factors. Younger age and severe atrophy were determined to be associated with diffuse-type gastric cancer. Endoscopic detection of atrophy was associated with the risk of gastric cancer. Thus, patients with severe atrophy should be examined carefully and may require intensive follow-up. © 2017 Journal of Gastroenterology and Hepatology Foundation and John Wiley & Sons Australia, Ltd.

  1. A multifactorial likelihood model for MMR gene variant classification incorporating probabilities based on sequence bioinformatics and tumor characteristics: a report from the Colon Cancer Family Registry.

    Science.gov (United States)

    Thompson, Bryony A; Goldgar, David E; Paterson, Carol; Clendenning, Mark; Walters, Rhiannon; Arnold, Sven; Parsons, Michael T; Walsh, Michael D; Gallinger, Steven; Haile, Robert W; Hopper, John L; Jenkins, Mark A; Lemarchand, Loic; Lindor, Noralane M; Newcomb, Polly A; Thibodeau, Stephen N; Young, Joanne P; Buchanan, Daniel D; Tavtigian, Sean V; Spurdle, Amanda B

    2013-01-01

    Mismatch repair (MMR) gene sequence variants of uncertain clinical significance are often identified in suspected Lynch syndrome families, and this constitutes a challenge for both researchers and clinicians. Multifactorial likelihood model approaches provide a quantitative measure of MMR variant pathogenicity, but first require input of likelihood ratios (LRs) for different MMR variation-associated characteristics from appropriate, well-characterized reference datasets. Microsatellite instability (MSI) and somatic BRAF tumor data for unselected colorectal cancer probands of known pathogenic variant status were used to derive LRs for tumor characteristics using the Colon Cancer Family Registry (CFR) resource. These tumor LRs were combined with variant segregation within families, and estimates of prior probability of pathogenicity based on sequence conservation and position, to analyze 44 unclassified variants identified initially in Australasian Colon CFR families. In addition, in vitro splicing analyses were conducted on the subset of variants based on bioinformatic splicing predictions. The LR in favor of pathogenicity was estimated to be ~12-fold for a colorectal tumor with a BRAF mutation-negative MSI-H phenotype. For 31 of the 44 variants, the posterior probabilities of pathogenicity were such that altered clinical management would be indicated. Our findings provide a working multifactorial likelihood model for classification that carefully considers mode of ascertainment for gene testing. © 2012 Wiley Periodicals, Inc.
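
    A multifactorial likelihood model of this kind combines a prior probability of pathogenicity with independent likelihood ratios using Bayes' rule on the odds scale. The sketch below shows that arithmetic only. The LR of 12 echoes the approximately 12-fold estimate quoted in the abstract; the prior of 0.1 and the segregation LR of 2.0 are hypothetical values chosen for illustration.

```python
from math import prod

def posterior_probability(prior, likelihood_ratios):
    """Posterior probability of pathogenicity from a prior and independent LRs."""
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * prod(likelihood_ratios)
    return posterior_odds / (1.0 + posterior_odds)

# 12.0: tumor-characteristic LR (from the abstract, approximate).
# 0.1 and 2.0: hypothetical prior and segregation LR.
p = posterior_probability(0.1, [12.0, 2.0])
print(round(p, 3))
```

    Working on the odds scale keeps each line of evidence as a single multiplicative factor, so new evidence types can be added without changing the rest of the calculation, provided they are independent.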

  2. Classification of breast cancer cytological specimen using convolutional neural network

    Science.gov (United States)

    Żejmo, Michał; Kowal, Marek; Korbicz, Józef; Monczak, Roman

    2017-01-01

    The paper presents a deep learning approach for automatic classification of breast tumors based on fine needle cytology. The main aim of the system is to distinguish benign from malignant cases based on microscopic images. The experiment was carried out on cytological samples derived from 50 patients (25 benign cases + 25 malignant cases) diagnosed at the Regional Hospital in Zielona Góra. To classify microscopic images, we used convolutional neural networks (CNN) of two types: GoogLeNet and AlexNet. Because the images of cytological specimens are very large (on average 200000 × 100000 pixels), they were divided into smaller patches of 256 × 256 pixels. Breast cancer classification is usually based on morphometric features of nuclei. Therefore, training and validation patches were selected using a Support Vector Machine (SVM) so that a suitable amount of cell material was depicted. Neural classifiers were tuned using a GPU-accelerated implementation of the gradient descent algorithm. Training error was defined as a cross-entropy classification loss. Classification accuracy was defined as the percentage ratio of successfully classified validation patches to the total number of validation patches. The best accuracy rate of 83% was obtained by the GoogLeNet model. We observed that more of the misclassified patches belonged to malignant cases.
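
    The patch extraction and the accuracy definition described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' pipeline: the tiling is assumed to be non-overlapping with ragged edges dropped, the SVM-based patch selection is omitted, and the function names are hypothetical.

```python
import numpy as np

PATCH = 256  # patch edge length in pixels, as stated in the abstract


def split_into_patches(image, patch=PATCH):
    """Cut a 2-D image into non-overlapping patch x patch tiles.

    Rows and columns that do not fill a whole tile are dropped
    (an assumption of this sketch).
    """
    rows, cols = image.shape[0] // patch, image.shape[1] // patch
    trimmed = image[: rows * patch, : cols * patch]
    tiles = trimmed.reshape(rows, patch, cols, patch).swapaxes(1, 2)
    return tiles.reshape(rows * cols, patch, patch)


def accuracy_percent(predicted, actual):
    """Percentage of validation patches whose predicted label is correct."""
    predicted = np.asarray(predicted)
    actual = np.asarray(actual)
    return 100.0 * float(np.mean(predicted == actual))


image = np.arange(600 * 520).reshape(600, 520)  # toy stand-in for a slide
patches = split_into_patches(image)
print(patches.shape)
print(accuracy_percent([1, 0, 1, 1], [1, 0, 0, 1]))
```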

  3. Network-Based Logistic Classification with an Enhanced L1/2 Solver Reveals Biomarker and Subnetwork Signatures for Diagnosing Lung Cancer

    Directory of Open Access Journals (Sweden)

    Hai-Hui Huang

    2015-01-01

    Full Text Available Identifying biomarkers and signaling pathways is a critical step in genomic studies, in which regularization is a widely used feature extraction approach. However, most regularizers are based on the L1-norm; their results are not sparse or interpretable enough and are asymptotically biased, especially in genomic research. Recently, a large amount of molecular interaction information about disease-related biological processes has been gathered in various databases covering many aspects of biological systems. In this paper, we use an enhanced L1/2 penalized solver to penalize a network-constrained logistic regression model, called an enhanced L1/2 net, where the predictors are based on gene-expression data with biologic network knowledge. Extensive simulation studies showed that our proposed approach outperforms L1 regularization, the old L1/2 penalized solver, and the Elastic net approaches in terms of classification accuracy and stability. Furthermore, we applied our method to lung cancer data analysis and found that it achieves higher predictive accuracy than L1 regularization, the old L1/2 penalized solver, and the Elastic net approaches, while selecting fewer but informative biomarkers and pathways.

  4. CT-based injury classification

    International Nuclear Information System (INIS)

    Mirvis, S.E.; Whitley, N.O.; Vainright, J.; Gens, D.

    1988-01-01

    Review of preoperative abdominal CT scans obtained in adults after blunt trauma during a 2.5-year period demonstrated isolated or predominant liver injury in 35 patients and splenic injury in 33 patients. CT-based injury scores, consisting of five levels of hepatic injury and four levels of splenic injury, were correlated with clinical outcome and surgical findings. Hepatic injury grades I-III, present in 33 of 35 patients, were associated with successful nonsurgical management in 27 (82%) or with findings at celiotomy not requiring surgical intervention in four (12%). Higher grades of splenic injury generally required early operative intervention, but eight (36%) of 22 patients with initial grade III or IV injury were managed without surgery, while four (36%) of 11 patients with grade I or II injury required delayed celiotomy and splenectomy (three patients) or emergent rehospitalization (one patient). CT-based injury classification is useful in guiding the nonoperative management of blunt hepatic injury in hemodynamically stable adults but appears to be less reliable in predicting the outcome of blunt splenic injury.

  5. Using fuzzy association rule mining in cancer classification

    International Nuclear Information System (INIS)

    Mahmoodian, Hamid; Marhaban, M.H.; Abdulrahim, Raha; Rosli, Rozita; Saripan, Iqbal

    2011-01-01

    Full text: The classification of cancer tumors based on gene expression profiles has been extensively studied. A wide variety of cancer datasets have been analyzed with various methods of gene selection and classification to identify the behavior of the genes in tumors and to find the relationships between them and the outcome of diseases. Interpretability of the model, which is developed by fuzzy rules and linguistic variables in this study, has rarely been considered. In addition, creating a high-performance fuzzy classifier that uses a subset of significant genes selected by different types of gene selection methods is another goal of this study. A new algorithm has been developed to identify the fuzzy rules and significant genes based on fuzzy association rule mining. First, different subsets of genes, selected by different methods, were used to generate primary fuzzy classifiers separately; the proposed algorithm was then applied to mix the genes associated in the primary classifiers and generate a new classifier. The results show that the fuzzy classifier can classify the tumors with high performance while presenting the relationships between the genes by linguistic variables.

  6. Molecular classification of gastric cancer: a new paradigm.

    Science.gov (United States)

    Shah, Manish A; Khanin, Raya; Tang, Laura; Janjigian, Yelena Y; Klimstra, David S; Gerdes, Hans; Kelsen, David P

    2011-05-01

    Gastric cancer may be subdivided into 3 distinct subtypes--proximal, diffuse, and distal gastric cancer--based on histopathologic and anatomic criteria. Each subtype is associated with unique epidemiology. Our aim is to test the hypothesis that these distinct gastric cancer subtypes may also be distinguished by gene expression analysis. Patients with localized gastric adenocarcinoma being screened for a phase II preoperative clinical trial (National Cancer Institute, NCI #5917) underwent endoscopic biopsy for fresh tumor procurement. Four to 6 targeted biopsies of the primary tumor were obtained. Macrodissection was carried out to ensure more than 80% carcinoma in the sample. HG-U133A GeneChip (Affymetrix) was used for cDNA expression analysis, and all arrays were processed and analyzed using the Bioconductor R-package. Between November 2003 and January 2006, 57 patients were screened to identify 36 patients with localized gastric cancer who had adequate RNA for expression analysis. Using supervised analysis, we built a classifier to distinguish the 3 gastric cancer subtypes, successfully classifying each into tightly grouped clusters. Leave-one-out cross-validation error was 0.14, suggesting that more than 85% of samples were classified correctly. Gene set analysis with the false discovery rate set at 0.25 identified several pathways that were differentially regulated when comparing each gastric cancer subtype to adjacent normal stomach. Subtypes of gastric cancer that have epidemiologic and histologic distinctions are also distinguished by gene expression data. These preliminary data suggest a new classification of gastric cancer with implications for improving our understanding of disease biology and identification of unique molecular drivers for each gastric cancer subtype. ©2011 AACR.
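
    Leave-one-out cross-validation, as reported above, holds out each sample in turn, trains on the rest, and counts the held-out mistakes; an error of 0.14 corresponds to 86% of samples classified correctly. The sketch below illustrates the procedure with a nearest-centroid classifier, which is a simple stand-in chosen for this example and not the classifier used in the study; the toy data are hypothetical.

```python
import numpy as np


def loocv_error(features, labels):
    """Leave-one-out error rate of a nearest-centroid classifier.

    The nearest-centroid rule is a stand-in for illustration only.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    mistakes = 0
    for held_out in range(len(labels)):
        keep = np.arange(len(labels)) != held_out
        train_x, train_y = features[keep], labels[keep]
        classes = np.unique(train_y)
        # One centroid per class, computed without the held-out sample.
        centroids = np.array([train_x[train_y == c].mean(axis=0) for c in classes])
        distances = np.linalg.norm(centroids - features[held_out], axis=1)
        if classes[np.argmin(distances)] != labels[held_out]:
            mistakes += 1
    return mistakes / len(labels)


# Hypothetical, well-separated toy data with two classes.
toy_x = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
toy_y = [0, 0, 0, 1, 1, 1]
print(loocv_error(toy_x, toy_y))
```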

  7. Classification of breast cancer histology images using Convolutional Neural Networks.

    Directory of Open Access Journals (Sweden)

    Teresa Araújo

    Full Text Available Breast cancer is one of the main causes of cancer death worldwide. The diagnosis of biopsy tissue with hematoxylin and eosin stained images is non-trivial and specialists often disagree on the final diagnosis. Computer-aided Diagnosis systems help reduce the cost and increase the efficiency of this process. Conventional classification approaches rely on feature extraction methods designed for a specific problem based on field-knowledge. To overcome the many difficulties of the feature-based approaches, deep learning methods are becoming important alternatives. A method for the classification of hematoxylin and eosin stained breast biopsy images using Convolutional Neural Networks (CNNs) is proposed. Images are classified in four classes, normal tissue, benign lesion, in situ carcinoma and invasive carcinoma, and in two classes, carcinoma and non-carcinoma. The architecture of the network is designed to retrieve information at different scales, including both nuclei and overall tissue organization. This design allows the extension of the proposed system to whole-slide histology images. The features extracted by the CNN are also used for training a Support Vector Machine classifier. Accuracies of 77.8% for four class and 83.3% for carcinoma/non-carcinoma are achieved. The sensitivity of our method for cancer cases is 95.6%.

  8. A lymphocyte spatial distribution graph-based method for automated classification of recurrence risk on lung cancer images

    Science.gov (United States)

    García-Arteaga, Juan D.; Corredor, Germán; Wang, Xiangxue; Velcheti, Vamsidhar; Madabhushi, Anant; Romero, Eduardo

    2017-11-01

    Tumor-infiltrating lymphocytes (TILs) arise when various classes of white blood cells migrate from the bloodstream towards the tumor and infiltrate it. The presence of TILs is predictive of the patient's response to therapy. In this paper, we show how the automatic detection of lymphocytes in digital H&E histopathological images and the quantitative evaluation of the global lymphocyte configuration, evaluated through global features extracted from non-parametric graphs constructed from the lymphocytes' detected positions, can be correlated to the patient's outcome in early-stage non-small cell lung cancer (NSCLC). The method was assessed on a tissue microarray cohort composed of 63 NSCLC cases. From the evaluated graphs, minimum spanning trees and K-nn showed the highest predictive ability, yielding F1 Scores of 0.75 and 0.72 and accuracies of 0.67 and 0.69, respectively. The predictive power of the proposed methodology indicates that graphs may be used to develop objective measures of the infiltration grade of tumors, which can, in turn, be used by pathologists to improve the decision making and treatment planning processes.
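
    As an illustration of extracting global features from a graph built on detected cell positions, the sketch below constructs a Euclidean minimum spanning tree with Prim's algorithm and summarizes its edge lengths. The choice of summary statistics (total, mean, standard deviation) and the function names are assumptions made for this example; the abstract does not list the features actually used.

```python
import numpy as np


def minimum_spanning_tree_edges(points):
    """Edge lengths of the Euclidean minimum spanning tree (Prim's algorithm)."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    # best[i] is the cheapest known link from point i to the growing tree.
    best = np.linalg.norm(points - points[0], axis=1)
    lengths = []
    for _ in range(n - 1):
        candidates = np.where(in_tree, np.inf, best)
        nxt = int(np.argmin(candidates))
        lengths.append(candidates[nxt])
        in_tree[nxt] = True
        best = np.minimum(best, np.linalg.norm(points - points[nxt], axis=1))
    return np.array(lengths)


def mst_features(points):
    """Global descriptors of the tree; this feature set is hypothetical."""
    lengths = minimum_spanning_tree_edges(points)
    return {
        "total": float(lengths.sum()),
        "mean": float(lengths.mean()),
        "std": float(lengths.std()),
    }


# Hypothetical lymphocyte centroids in pixel coordinates.
print(mst_features([(0, 0), (1, 0), (2, 0), (2, 3)]))
```

    The dense O(n^2) formulation is adequate for a few thousand detections per image; larger point sets would call for a spatial index or a Delaunay-based construction.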

  9. Application of machine learning on brain cancer multiclass classification

    Science.gov (United States)

    Panca, V.; Rustam, Z.

    2017-07-01

    Classification of brain cancer is a problem of multiclass classification. One approach to solving this problem is to first transform it into several binary problems. The microarray gene expression dataset has the two main characteristics of medical data: an extremely large number of features (genes) and only a small number of samples. The application of machine learning on microarray gene expression dataset mainly consists of two steps: feature selection and classification. In this paper, the features are selected using a method based on the support vector machine recursive feature elimination (SVM-RFE) principle, which is improved to solve multiclass classification and called multiple multiclass SVM-RFE. Instead of using only the selected features on a single classifier, this method combines the result of multiple classifiers. The features are divided into subsets and SVM-RFE is used on each subset. Then, the selected features on each subset are put on separate classifiers. This method enhances the feature selection ability of each single SVM-RFE. Twin support vector machine (TWSVM) is used as the method of the classifier to reduce computational complexity. While ordinary SVM finds a single optimum hyperplane, the main objective of Twin SVM is to find two non-parallel optimum hyperplanes. The experiment on the brain cancer microarray gene expression dataset shows this method could classify 71.4% of the overall test data correctly, using 100 and 1000 genes selected from the multiple multiclass SVM-RFE feature selection method. Furthermore, the per class results show that this method could classify data of normal and MD class with 100% accuracy.
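
    The recursive-elimination idea applied to feature subsets can be sketched as follows. To keep the example self-contained, ridge-regression weights are swapped in for SVM weights as the ranking criterion, so this is not SVM-RFE itself; the function names, the ridge parameter, and the toy data are all hypothetical.

```python
import numpy as np


def recursive_feature_elimination(x, y, n_keep, ridge=1.0):
    """Repeatedly fit a linear model and drop the feature with the smallest weight.

    Ridge regression stands in for the linear SVM of SVM-RFE (an assumption).
    Returns the column indices that survive.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    active = list(range(x.shape[1]))
    while len(active) > n_keep:
        xa = x[:, active]
        gram = xa.T @ xa + ridge * np.eye(len(active))
        weights = np.linalg.solve(gram, xa.T @ y)
        active.pop(int(np.argmin(weights ** 2)))
    return active


def select_per_subset(x, y, n_subsets, n_keep):
    """Split the features into subsets and run the elimination on each one."""
    x = np.asarray(x, dtype=float)
    subsets = np.array_split(np.arange(x.shape[1]), n_subsets)
    selected = []
    for cols in subsets:
        kept = recursive_feature_elimination(x[:, cols], y, n_keep)
        selected.append([int(cols[i]) for i in kept])
    return selected


# Toy data: columns 1 and 3 carry the label signal, columns 0 and 2 do not.
labels = np.array([1.0, 1.0, -1.0, -1.0])
data = np.column_stack([[1, -1, 1, -1], labels, [1, -1, -1, 1], 0.5 * labels])
print(select_per_subset(data, labels, n_subsets=2, n_keep=1))
```

    Each subset's surviving features would then feed a separate classifier whose outputs are combined, as the abstract describes.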

  10. Grading Dysphagia as a Toxicity of Head and Neck Cancer: Differences in Severity Classification Based on MBS DIGEST and Clinical CTCAE Grades.

    Science.gov (United States)

    Goepfert, Ryan P; Lewin, Jan S; Barrow, Martha P; Warneke, Carla L; Fuller, Clifton D; Lai, Stephen Y; Weber, Randal S; Hutcheson, Katherine A

    2018-04-01

    Clinician-reported toxicity grading through common terminology criteria for adverse events (CTCAE) stages dysphagia based on symptoms, diet, and tube dependence. The new dynamic imaging grade of swallowing toxicity (DIGEST) tool offers a similarly scaled five-point ordinal summary grade of pharyngeal swallowing as determined through results of a modified barium swallow (MBS) study. This study aims to inform clinicians on the similarities and differences between dysphagia severity according to clinical CTCAE and MBS-derived DIGEST grading. A cross-sectional sample of 95 MBS studies was randomly selected from a prospectively-acquired MBS database among patients treated with organ preservation strategies for head and neck cancer. MBS DIGEST and clinical CTCAE dysphagia grades were compared. DIGEST and CTCAE dysphagia grades had "fair" agreement per weighted κ of 0.358 (95% CI .231-.485). Using a threshold of DIGEST ≥ 3 as reference, CTCAE had an overall sensitivity of 0.50, specificity of 0.84, and area under the curve (AUC) of 0.67 to identify severe MBS-detected dysphagia. At less than 6 months, sensitivity was 0.72, specificity was 0.76, and AUC was 0.75 while at greater than 6 months, sensitivity was 0.22, specificity was 0.90, and AUC was 0.56 for CTCAE to detect dysphagia as determined by DIGEST. Classification of pharyngeal dysphagia on MBS using DIGEST augments our understanding of dysphagia severity according to the clinically-derived CTCAE while maintaining the simplicity of an ordinal scale. DIGEST likely complements CTCAE toxicity grading through improved specificity for physiologic dysphagia in the acute phase and improved sensitivity for dysphagia in the late-phase.
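
    Sensitivity and specificity against a reference threshold, as reported above, can be computed from paired grades as in the sketch below. The DIGEST ≥ 3 reference threshold comes from the abstract; the CTCAE cut-off of 3 and the grade values are hypothetical, chosen only to make the example runnable.

```python
def sensitivity_specificity(reference_positive, test_positive):
    """Sensitivity and specificity of a test against a binary reference."""
    pairs = list(zip(reference_positive, test_positive))
    tp = sum(1 for r, t in pairs if r and t)
    fn = sum(1 for r, t in pairs if r and not t)
    tn = sum(1 for r, t in pairs if not r and not t)
    fp = sum(1 for r, t in pairs if not r and t)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical paired grades for eight studies.
digest = [0, 1, 3, 4, 2, 3, 0, 1]
ctcae = [1, 1, 3, 2, 3, 3, 0, 2]

reference = [g >= 3 for g in digest]  # DIGEST >= 3, as in the abstract
flagged = [g >= 3 for g in ctcae]     # CTCAE cut-off is an assumption

sens, spec = sensitivity_specificity(reference, flagged)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```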

  11. Classification of neuropathic pain in cancer patients

    DEFF Research Database (Denmark)

    Brunelli, Cinzia; Bennett, Michael I; Kaasa, Stein

    2014-01-01

    and on the relevance of patient-reported outcome (PRO) descriptors for the screening of NP in this population. An international group of 42 experts was invited to participate in a consensus process through a modified 2-round Internet-based Delphi survey. Relevant topics investigated were: peculiarities of NP...... in patients with cancer, IASP NeuPSIG diagnostic criteria adaptation and assessment, and standardized PRO assessment for NP screening. Median consensus scores (MED) and interquartile ranges (IQR) were calculated to measure expert consensus after both rounds. Twenty-nine experts answered, and good agreement...... was proposed. Clinical research on PRO in the screening phase and on the application of the algorithm will be needed to examine their effectiveness in classifying NP in cancer patients....

  12. The classification of osteonecrosis in patients with cancer: validation of a new radiological classification system

    International Nuclear Information System (INIS)

    Niinimäki, T.; Niinimäki, J.; Halonen, J.; Hänninen, P.; Harila-Saari, A.; Niinimäki, R.

    2015-01-01

    Aim: To validate a new, non-joint-specific radiological classification system that is suitable regardless of the site of the osteonecrosis (ON) in patients with cancer. Material and methods: Critical deficiencies in the existing ON classification systems were identified and a new, non-joint-specific radiological classification system was developed. Seventy-two magnetic resonance imaging (MRI) images of patients with cancer and ON lesions were graded, and the validation of the new system was performed by assessing inter- and intra-observer reliability. Results: Intra-observer reliability of ON grading was good or very good, with kappa values of 0.79–0.86. Interobserver agreement was lower but still good, with kappa values of 0.62–0.77. Ninety-eight percent of all intra- or interobserver differences were within one grade. Interobserver reliability of assessing the location of ON was very good, with kappa values of 0.93–0.98. Conclusion: All the available radiological ON classification systems are joint specific. This limitation has spurred the development of multiple systems, which has led to the insufficient use of classifications in ON studies among patients with cancer. The introduced radiological classification system overcomes the problem of joint-specificity, was found to be reliable, and can be used to classify all ON lesions regardless of the affected site. - Highlights: • Patients with cancer may have osteonecrosis lesions at multiple sites. • There is no non-joint-specific osteonecrosis classification available. • We introduced a new non-joint-specific osteonecrosis classification. • The validation was performed by assessing inter- and intra-observer reliability. • The classification was reliable and could be used regardless of the affected site.
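
    Inter- and intra-observer reliability of a grading system is commonly summarized with Cohen's kappa, which corrects observed agreement for agreement expected by chance. The sketch below computes the unweighted statistic; whether the study used weighted or unweighted kappa is not stated in the abstract, and the ratings shown are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters grading the same cases."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from the product of each rater's marginal frequencies.
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    return (observed - expected) / (1.0 - expected)


# Hypothetical grades assigned by two readers to four lesions.
print(cohen_kappa([1, 1, 2, 2], [1, 1, 2, 1]))
```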

  13. Cloud field classification based on textural features

    Science.gov (United States)

    Sengupta, Sailes Kumar

    1989-01-01

    An essential component in global climate research is accurate cloud cover and type determination. Of the two approaches to texture-based classification (statistical and textural), only the former is effective in the classification of natural scenes such as land, ocean, and atmosphere. In the statistical approach that was adopted, parameters characterizing the stochastic properties of the spatial distribution of grey levels in an image are estimated and then used as features for cloud classification. Two types of textural measures were used. One is based on the distribution of the grey level difference vector (GLDV), and the other on a set of textural features derived from the MaxMin cooccurrence matrix (MMCM). The GLDV method looks at the difference D of grey levels at pixels separated by a horizontal distance d and computes several statistics based on this distribution. These are then used as features in subsequent classification. The MaxMin textural features, on the other hand, are based on the MMCM, a matrix whose (I,J)th entry gives the relative frequency of occurrences of the grey level pair (I,J) that are consecutive and thresholded local extremes separated by a given pixel distance d. Textural measures are then computed based on this matrix in much the same manner as is done in texture computation using the grey level cooccurrence matrix. The database consists of 37 cloud field scenes from LANDSAT imagery using a near IR visible channel. The classification algorithm used is the well-known Stepwise Discriminant Analysis. The overall accuracy was estimated by the percentage of correct classifications in each case. It turns out that both types of classifiers, at their best combination of features and at any given spatial resolution, give approximately the same classification accuracy. A neural network based classifier with a feed forward architecture and a back propagation training algorithm is used to increase the classification accuracy, using these two classes
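
    The GLDV idea described in this record (collect grey-level differences at horizontal distance d, then summarize their distribution) can be sketched as below. The abstract does not name the statistics used, so the four shown here (mean, contrast, angular second moment, entropy) and the use of absolute differences are assumptions of this example.

```python
import numpy as np


def gldv_features(image, d=1):
    """Statistics of absolute grey-level differences at horizontal distance d."""
    image = np.asarray(image, dtype=np.int64)
    diff = np.abs(image[:, d:] - image[:, :-d]).ravel()
    prob = np.bincount(diff) / diff.size  # estimated P(D = k)
    levels = np.arange(prob.size)
    nonzero = prob[prob > 0]
    return {
        "mean": float(np.sum(levels * prob)),
        "contrast": float(np.sum(levels ** 2 * prob)),
        "angular_second_moment": float(np.sum(prob ** 2)),
        "entropy": float(-np.sum(nonzero * np.log2(nonzero))),
    }


# Tiny hypothetical image with grey levels 0..3.
print(gldv_features([[0, 1, 3], [2, 2, 2]], d=1))
```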

  14. An Authentication Technique Based on Classification

    Institute of Scientific and Technical Information of China (English)

    李钢; 杨杰

    2004-01-01

    We present a novel watermarking approach based on classification for authentication, in which a watermark is embedded into the host image. When the marked image is modified, the extracted watermark differs from the original watermark, and different kinds of modification lead to different extracted watermarks. In this paper, different kinds of modification are treated as classes, and we use a classification algorithm to recognize the modifications with high probability. Simulation results show that the proposed method is promising and effective.

  15. Involvement of Machine Learning for Breast Cancer Image Classification: A Survey

    Directory of Open Access Journals (Sweden)

    Abdullah-Al Nahid

    2017-01-01

    Full Text Available Breast cancer is one of the largest causes of women’s death in the world today. Advance engineering of natural image classification techniques and Artificial Intelligence methods has largely been used for the breast-image classification task. The involvement of digital image classification allows the doctor and the physicians a second opinion, and it saves the doctors’ and physicians’ time. Despite the various publications on breast image classification, very few review papers are available which provide a detailed description of breast cancer image classification techniques, feature extraction and selection procedures, classification measuring parameterizations, and image classification findings. We have put a special emphasis on the Convolutional Neural Network (CNN) method for breast image classification. Along with the CNN method we have also described the involvement of the conventional Neural Network (NN), Logic Based classifiers such as the Random Forest (RF) algorithm, Support Vector Machines (SVM), Bayesian methods, and a few of the semisupervised and unsupervised methods which have been used for breast image classification.

  16. Involvement of Machine Learning for Breast Cancer Image Classification: A Survey.

    Science.gov (United States)

    Nahid, Abdullah-Al; Kong, Yinan

    2017-01-01

    Breast cancer is one of the largest causes of women's death in the world today. Advance engineering of natural image classification techniques and Artificial Intelligence methods has largely been used for the breast-image classification task. The involvement of digital image classification allows the doctor and the physicians a second opinion, and it saves the doctors' and physicians' time. Despite the various publications on breast image classification, very few review papers are available which provide a detailed description of breast cancer image classification techniques, feature extraction and selection procedures, classification measuring parameterizations, and image classification findings. We have put a special emphasis on the Convolutional Neural Network (CNN) method for breast image classification. Along with the CNN method we have also described the involvement of the conventional Neural Network (NN), Logic Based classifiers such as the Random Forest (RF) algorithm, Support Vector Machines (SVM), Bayesian methods, and a few of the semisupervised and unsupervised methods which have been used for breast image classification.

  17. EMG finger movement classification based on ANFIS

    Science.gov (United States)

    Caesarendra, W.; Tjahjowidodo, T.; Nico, Y.; Wahyudati, S.; Nurhasanah, L.

    2018-04-01

    The growing number of people suffering from stroke has spurred the rapid development of hand and finger exoskeletons to enable automatic physical therapy. Prior to the development of a finger exoskeleton, an important research topic, namely machine learning for finger gesture classification, is addressed. This paper presents a study on EMG signal classification of 5 finger gestures as a preliminary study toward finger exoskeleton design and development in Indonesia. The EMG signals of 5 finger gestures were acquired using a Myo EMG sensor. The EMG signal features were extracted and reduced using PCA. ANFIS-based learning is used to classify the reduced features of the 5 finger gestures. The result shows that the classification performance for finger gestures is lower than that for 7 hand gestures.

  18. A New Direction of Cancer Classification: Positive Effect of Low-Ranking MicroRNAs.

    Science.gov (United States)

    Li, Feifei; Piao, Minghao; Piao, Yongjun; Li, Meijing; Ryu, Keun Ho

    2014-10-01

    Many studies based on microRNA (miRNA) expression profiles have shown a new aspect of cancer classification. Because miRNA expression data are high-dimensional, feature selection methods have been used to facilitate dimensionality reduction. These feature selection methods have one shortcoming thus far: they consider only the cases where the feature-to-class relationship is 1:1 or n:1. However, because one miRNA may influence more than one type of cancer, such miRNAs tend to be ranked low by traditional feature selection methods and are removed most of the time. Given the limited number of miRNAs, low-ranking miRNAs are also important to cancer classification. We considered both high- and low-ranking features to cover all cases (1:1, n:1, 1:n, and m:n) in cancer classification. First, we used the correlation-based feature selection method to select the high-ranking miRNAs, and chose the support vector machine, Bayes network, decision tree, k-nearest-neighbor, and logistic classifiers to build cancer classifiers. Then, we chose Chi-square test, information gain, gain ratio, and Pearson's correlation feature selection methods to build the m:n feature subset, and used the selected miRNAs to determine cancer classification. The low-ranking miRNA expression profiles achieved higher classification accuracy than using only the high-ranking miRNAs from traditional feature selection methods. Our results demonstrate that the m:n feature subset reveals a positive effect of low-ranking miRNAs in cancer classification.

  19. Inventory classification based on decoupling points

    Directory of Open Access Journals (Sweden)

    Joakim Wikner

    2015-01-01

    Full Text Available The ideal state of continuous one-piece flow may never be achieved. Still the logistics manager can improve the flow by carefully positioning inventory to buffer against variations. Strategies such as lean, postponement, mass customization, and outsourcing all rely on strategic positioning of decoupling points to separate forecast-driven from customer-order-driven flows. Planning and scheduling of the flow are also based on classification of decoupling points as master scheduled or not. A comprehensive classification scheme for these types of decoupling points is introduced. The approach rests on identification of flows as being either demand based or supply based. The demand or supply is then combined with exogenous factors, classified as independent, or endogenous factors, classified as dependent. As a result, eight types of strategic as well as tactical decoupling points are identified resulting in a process-based framework for inventory classification that can be used for flow design.

  20. Evolving cancer classification in the era of personalized medicine: A primer for radiologists

    Energy Technology Data Exchange (ETDEWEB)

    O'Neill, Alibhe C.; Jagannathan, Jyothi P.; Ramaiya, Nikhil H. [Dept. of Imaging, Dana Farber Cancer Institute, Boston (United States)]

    2017-01-15

    Traditionally tumors were classified based on anatomic location but now specific genetic mutations in cancers are leading to treatment of tumors with molecular targeted therapies. This has led to a paradigm shift in the classification and treatment of cancer. Tumors treated with molecular targeted therapies often show morphological changes rather than change in size and are associated with class specific and drug specific toxicities, different from those encountered with conventional chemotherapeutic agents. It is important for the radiologists to be familiar with the new cancer classification and the various treatment strategies employed, in order to effectively communicate and participate in the multi-disciplinary care. In this paper we will focus on lung cancer as a prototype of the new molecular classification.

  1. Detection and classification of Breast Cancer in Wavelet Sub-bands of Fractal Segmented Cancerous Zones.

    Science.gov (United States)

    Shirazinodeh, Alireza; Noubari, Hossein Ahmadi; Rabbani, Hossein; Dehnavi, Alireza Mehri

    2015-01-01

    Recent studies on wavelet transform and fractal modeling applied on mammograms for the detection of cancerous tissues indicate that microcalcifications and masses can be utilized for the study of the morphology and diagnosis of cancerous cases. It is shown that the use of fractal modeling, as applied to a given image, can clearly discern cancerous zones from noncancerous areas. In this paper, for fractal modeling, the original image is first segmented into appropriate fractal boxes, followed by identifying the fractal dimension of each windowed section using a computationally efficient two-dimensional box-counting algorithm. Furthermore, using appropriate wavelet sub-bands and image reconstruction based on modified wavelet coefficients, it is shown that it is possible to arrive at enhanced features for detection of cancerous zones. In this paper, we have attempted to benefit from the advantages of both fractals and wavelets by introducing a new algorithm. By using a new algorithm named F1W2, the original image is first segmented into appropriate fractal boxes, and the fractal dimension of each windowed section is extracted. Following from that, by applying a maximum level threshold on the fractal dimensions matrix, the best-segmented boxes are selected. In the next step, the segmented candidate cancerous zones are decomposed by utilizing the standard orthogonal wavelet transform and the db2 wavelet at three different resolution levels, and after nullifying the wavelet coefficients of the image at the first scale and the low frequency band of the third scale, the modified reconstructed image is successfully utilized for detection of breast cancer regions by applying an appropriate threshold. For detection of cancerous zones, our simulations indicate accuracies of 90.9% for mass detection and 88.99% for microcalcification detection using the F1W2 method. For classification of detected microcalcifications into benign and malignant cases, eight features are identified and
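
    A two-dimensional box-counting estimate of fractal dimension, the building block named in this record, can be sketched as follows: count the boxes of side s that contain any foreground pixel, then fit the slope of log(count) against log(1/s). The sketch assumes a binary foreground mask with at least one foreground pixel and a hypothetical set of box sizes; it does not reproduce the F1W2 windowing or thresholding steps.

```python
import numpy as np


def box_counting_dimension(mask, sizes=(1, 2, 4, 8)):
    """Estimate the box-counting dimension of a non-empty binary mask."""
    mask = np.asarray(mask, dtype=bool)
    counts = []
    for s in sizes:
        rows, cols = mask.shape[0] // s, mask.shape[1] // s
        blocks = mask[: rows * s, : cols * s].reshape(rows, s, cols, s)
        # A box is occupied if any pixel inside it is foreground.
        counts.append(np.count_nonzero(blocks.any(axis=(1, 3))))
    scales = np.log(1.0 / np.asarray(sizes, dtype=float))
    slope, _intercept = np.polyfit(scales, np.log(counts), 1)
    return float(slope)


filled = np.ones((16, 16), dtype=bool)   # a filled square
line = np.zeros((16, 16), dtype=bool)
line[0, :] = True                        # a straight line
print(box_counting_dimension(filled), box_counting_dimension(line))
```

    A filled region yields a slope close to 2 and a straight line a slope close to 1, which is a quick sanity check before applying the estimate to image windows.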

  2. Cancer classification using the Immunoscore: a worldwide task force.

    Science.gov (United States)

    Galon, Jérôme; Pagès, Franck; Marincola, Francesco M; Angell, Helen K; Thurin, Magdalena; Lugli, Alessandro; Zlobec, Inti; Berger, Anne; Bifulco, Carlo; Botti, Gerardo; Tatangelo, Fabiana; Britten, Cedrik M; Kreiter, Sebastian; Chouchane, Lotfi; Delrio, Paolo; Arndt, Hartmann; Asslaber, Martin; Maio, Michele; Masucci, Giuseppe V; Mihm, Martin; Vidal-Vanaclocha, Fernando; Allison, James P; Gnjatic, Sacha; Hakansson, Leif; Huber, Christoph; Singh-Jasuja, Harpreet; Ottensmeier, Christian; Zwierzina, Heinz; Laghi, Luigi; Grizzi, Fabio; Ohashi, Pamela S; Shaw, Patricia A; Clarke, Blaise A; Wouters, Bradly G; Kawakami, Yutaka; Hazama, Shoichi; Okuno, Kiyotaka; Wang, Ena; O'Donnell-Tormey, Jill; Lagorce, Christine; Pawelec, Graham; Nishimura, Michael I; Hawkins, Robert; Lapointe, Réjean; Lundqvist, Andreas; Khleif, Samir N; Ogino, Shuji; Gibbs, Peter; Waring, Paul; Sato, Noriyuki; Torigoe, Toshihiko; Itoh, Kyogo; Patel, Prabhu S; Shukla, Shilin N; Palmqvist, Richard; Nagtegaal, Iris D; Wang, Yili; D'Arrigo, Corrado; Kopetz, Scott; Sinicrope, Frank A; Trinchieri, Giorgio; Gajewski, Thomas F; Ascierto, Paolo A; Fox, Bernard A

    2012-10-03

    Prediction of clinical outcome in cancer is usually achieved by histopathological evaluation of tissue samples obtained during surgical resection of the primary tumor. Traditional tumor staging (AJCC/UICC-TNM classification) summarizes data on tumor burden (T), presence of cancer cells in draining and regional lymph nodes (N) and evidence for metastases (M). However, it is now recognized that clinical outcome can significantly vary among patients within the same stage. The current classification provides limited prognostic information, and does not predict response to therapy. Recent literature has alluded to the importance of the host immune system in controlling tumor progression. Thus, evidence supports the notion to include immunological biomarkers, implemented as a tool for the prediction of prognosis and response to therapy. Accumulating data, collected from large cohorts of human cancers, has demonstrated the impact of immune-classification, which has a prognostic value that may add to the significance of the AJCC/UICC TNM-classification. It is therefore imperative to begin to incorporate the 'Immunoscore' into traditional classification, thus providing an essential prognostic and potentially predictive tool. Introduction of this parameter as a biomarker to classify cancers, as part of routine diagnostic and prognostic assessment of tumors, will facilitate clinical decision-making including rational stratification of patient treatment. Equally, the inherent complexity of quantitative immunohistochemistry, in conjunction with protocol variation across laboratories, analysis of different immune cell types, inconsistent region selection criteria, and variable ways to quantify immune infiltration, all underline the urgent requirement to reach assay harmonization. In an effort to promote the Immunoscore in routine clinical settings, an international task force was initiated. 
This review represents a follow-up of the announcement of this initiative, and of the J

  3. Voice based gender classification using machine learning

    Science.gov (United States)

    Raahul, A.; Sapthagiri, R.; Pankaj, K.; Vijayarajan, V.

    2017-11-01

    Gender identification is one of the major problems in speech analysis today. It involves tracing gender from acoustic data such as pitch, median, and frequency. Machine learning gives promising results for classification problems across research domains, and several performance metrics are available to evaluate algorithms in a given area. We present a comparative model for evaluating five different machine learning algorithms on eight different metrics for gender classification from acoustic data. The aim is to identify gender with five algorithms: Linear Discriminant Analysis (LDA), K-Nearest Neighbour (KNN), Classification and Regression Trees (CART), Random Forest (RF), and Support Vector Machine (SVM). The main criterion in evaluating any algorithm is its performance: in classification problems the misclassification rate must be low, which means the accuracy rate must be high. The location and gender of a person have become crucial in economic markets in the form of AdSense. With this comparative model, we assess the different ML algorithms and find the best fit for gender classification of acoustic data.

  4. Rough set classification based on quantum logic

    Science.gov (United States)

    Hassan, Yasser F.

    2017-11-01

    By combining the advantages of quantum computing and soft computing, the paper shows that rough sets can be used with quantum logic for classification and recognition systems. We suggest a new definition of rough set theory as quantum logic theory. Rough approximations are essential elements in rough set theory; the quantum rough set model for set-valued data directly constructs set approximations based on a kind of quantum similarity relation, which is presented here. Theoretical analyses demonstrate that the new model for quantum rough sets has a new type of decision rule with less redundancy, which can be used to give accurate classification using principles of quantum superposition and non-linear quantum relations. To our knowledge, this is the first attempt to define rough sets in a quantum representation rather than in terms of logic or sets. Experiments on data sets have demonstrated that the proposed model is more accurate than traditional rough sets in terms of finding optimal classifications.
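
For readers unfamiliar with the rough approximations mentioned in this abstract, here is a minimal Python sketch of the classical (non-quantum) lower and upper approximations. The quantum similarity relation of the paper is not reproduced; the object names, attributes, and target set are hypothetical.

```python
def equivalence_classes(objects, attributes):
    """Group objects that are indiscernible on the chosen attributes."""
    classes = {}
    for name, values in objects.items():
        key = tuple(values[a] for a in attributes)
        classes.setdefault(key, set()).add(name)
    return list(classes.values())

def approximations(objects, attributes, target):
    """Return the (lower, upper) approximations of `target`."""
    lower, upper = set(), set()
    for block in equivalence_classes(objects, attributes):
        if block <= target:   # block lies entirely inside the target
            lower |= block
        if block & target:    # block overlaps the target
            upper |= block
    return lower, upper

# Hypothetical decision table with two condition attributes.
objects = {
    "o1": {"a": 1, "b": 0},
    "o2": {"a": 1, "b": 0},
    "o3": {"a": 0, "b": 1},
    "o4": {"a": 0, "b": 1},
    "o5": {"a": 1, "b": 1},
}
target = {"o1", "o2", "o4"}

lower, upper = approximations(objects, ["a", "b"], target)
boundary = upper - lower  # objects that cannot be classified with certainty
```

Here `o3` and `o4` are indiscernible but only `o4` belongs to the target, so both fall into the boundary region.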

  5. Bladder cancer: Analysis of the 2004 WHO classification in ...

    African Journals Online (AJOL)

    Objectives: Bladder cancer (BCA) is a worldwide disease and shows a wide range of geographical variation. The aim of this study is to analyze the prevalence of schistosomal and non-schistosomal associated BCA as well as compare our findings with the 2004 WHO consensus classification of urothelial neoplasms and ...

  6. Classification of mitocans, anti-cancer drugs acting on mitochondria

    Czech Academy of Sciences Publication Activity Database

    Neužil, Jiří; Dong, L. F.; Rohlena, Jakub; Truksa, Jaroslav; Ralph, S. J.

    2013-01-01

    Roč. 13, č. 3 (2013), s. 199-208 ISSN 1567-7249 Institutional research plan: CEZ:AV0Z50520701 Keywords : Mitocans * Anti-cancer therapeutics * Classification Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 3.524, year: 2013

  7. Classification of mitocans, anti-cancer drugs acting on mitochondria

    Czech Academy of Sciences Publication Activity Database

    Neužil, Jiří; Dong, L. F.; Rohlena, Jakub; Truksa, Jaroslav; Ralph, S. J.

    2013-01-01

    Roč. 13, č. 3 (2013), s. 199-208 ISSN 1567-7249 Institutional research plan: CEZ:AV0Z50520701 Keywords : Mitocans * Anti-cancer therapeutics * Classification Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 3.524, year: 2013

  8. Prognostic classifications of lymph node involvement in lung cancer and current International Association for the Study of Lung Cancer descriptive classification in zones.

    Science.gov (United States)

    Riquet, Marc; Arame, Alex; Foucault, Christophe; Le Pimpec Barthes, Françoise

    2010-09-01

    The lymphatic drainage of solid organ tumors crosses through the lymph nodes (LNs), whose tumoral involvement may still be considered as local disease. In lung cancer, LN involvement may be intrapulmonary (N1), and mediastinal and/or extra-thoracic. More than 30 years ago, all involved mediastinal LNs were considered N2 and outside the scope of surgery. In 1978, Naruke presented an original article entitled 'Lymph node mapping and curability at various levels of metastasis in resected lung cancer', demonstrating that N2 was not a contraindication to surgery in all patients. The map made it possible to localize the favorable N2 on the lung cancer ipsilateral side of the mediastinum. Several maps ensued, aiming to discriminate between right and left involvement (1983) and to distinguish N2 (ipsilateral) and N3 (contralateral) mediastinal LN involvement (1983, 1986). The last map (1997 regional LN classification) was recently replaced by a descriptive classification in anatomical zones. This new LN map of the TNM classification for lung cancer is a step toward using anatomical viewpoints, which might be the best way to understand lung cancer lymphatic spread. Nowadays, the LNs are easily identified by current radiological imaging, and their resectability may be anticipated. Each LN chain may be removed by en-bloc lymphadenectomy performed during radical lung resection, a safe procedure which seems to be more oncologically based than sampling, and which avoids the source of discrepancies pointed out during the labeling of LN stations by surgeons.

  9. Gastric cancer: epidemiology, prevention, classification, and treatment

    OpenAIRE

    Sitarz, Robert; Skierucha, Małgorzata; Mielko, Jerzy; Offerhaus, G Johan A; Maciejewski, Ryszard; Polkowski, Wojciech P

    2018-01-01

    Robert Sitarz,1–3 Małgorzata Skierucha,1,2 Jerzy Mielko,1 G Johan A Offerhaus,3 Ryszard Maciejewski,2 Wojciech P Polkowski1 1Department of Surgical Oncology, Medical University of Lublin, Lublin, Poland; 2Department of Human Anatomy, Medical University of Lublin, Lublin, Poland; 3Department of Pathology, University Medical Centre, Utrecht, The Netherlands Abstract: Gastric cancer is the second most common cause of cancer-related deaths in the world, the epidemiology of which has ch...

  10. Vessel-guided airway segmentation based on voxel classification

    DEFF Research Database (Denmark)

    Lo, Pechin Chien Pau; Sporring, Jon; Ashraf, Haseem

    2008-01-01

    This paper presents a method for improving airway tree segmentation using vessel orientation information. We use the fact that an airway branch is always accompanied by an artery, with both structures having similar orientations. This work is based on a voxel classification airway segmentation method proposed previously. The probability of a voxel belonging to the airway, from the voxel classification method, is augmented with an orientation similarity measure as a criterion for region growing. The orientation similarity measure of a voxel indicates how similar the orientation of the surroundings of a voxel, estimated based on a tube model, is to that of a neighboring vessel. The proposed method is tested on 20 CT images from different subjects selected randomly from a lung cancer screening study. Length of the airway branches from the results of the proposed method are significantly...

  11. A New Feature Ensemble with a Multistage Classification Scheme for Breast Cancer Diagnosis

    Directory of Open Access Journals (Sweden)

    Idil Isikli Esener

    2017-01-01

    Full Text Available A new and effective feature ensemble with a multistage classification is proposed to be implemented in a computer-aided diagnosis (CAD) system for breast cancer diagnosis. A publicly available mammogram image dataset collected during the Image Retrieval in Medical Applications (IRMA) project is utilized to verify the suggested feature ensemble and multistage classification. In achieving the CAD system, feature extraction is performed on the mammogram region of interest (ROI) images, which are preprocessed by applying histogram equalization followed by nonlocal means filtering. The proposed feature ensemble is formed by concatenating the local configuration pattern-based, statistical, and frequency domain features. The classification process of these features is implemented in three cases: a one-stage study, a two-stage study, and a three-stage study. Eight well-known classifiers are used in all cases of this multistage classification scheme. Additionally, the results of the classifiers that provide the top three performances are combined via a majority voting technique to improve the recognition accuracy in both the two- and three-stage studies. Maximum classification accuracies of 85.47%, 88.79%, and 93.52% are attained by the one-, two-, and three-stage studies, respectively. The proposed multistage classification scheme is more effective than the single-stage classification for breast cancer diagnosis.
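
The majority voting step mentioned in this abstract can be sketched as follows. This is a minimal illustration, assuming three classifiers that each output one label per sample; the label lists are invented and do not come from the IRMA dataset.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label lists into one list by majority vote.

    `predictions` is a list of equal-length label lists, one per classifier.
    With an odd number of classifiers and two classes there are no ties;
    otherwise ties go to the label encountered first.
    """
    combined = []
    for labels in zip(*predictions):
        winner, _ = Counter(labels).most_common(1)[0]
        combined.append(winner)
    return combined

# Hypothetical outputs of the three best-performing classifiers.
clf_a = ["benign", "malignant", "malignant", "benign"]
clf_b = ["benign", "benign", "malignant", "malignant"]
clf_c = ["malignant", "malignant", "malignant", "benign"]

voted = majority_vote([clf_a, clf_b, clf_c])
```

Each classifier in this toy example makes one error at a different sample, so the vote corrects all of them. That is the usual motivation for combining classifiers whose mistakes are not strongly correlated.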

  12. Histological image classification using biologically interpretable shape-based features

    International Nuclear Information System (INIS)

    Kothari, Sonal; Phan, John H; Young, Andrew N; Wang, May D

    2013-01-01

    Automatic cancer diagnostic systems based on histological image classification are important for improving therapeutic decisions. Previous studies propose textural and morphological features for such systems. These features capture patterns in histological images that are useful for both cancer grading and subtyping. However, because many of these features lack a clear biological interpretation, pathologists may be reluctant to adopt these features for clinical diagnosis. We examine the utility of biologically interpretable shape-based features for classification of histological renal tumor images. Using Fourier shape descriptors, we extract shape-based features that capture the distribution of stain-enhanced cellular and tissue structures in each image and evaluate these features using a multi-class prediction model. We compare the predictive performance of the shape-based diagnostic model to that of traditional models, i.e., using textural, morphological and topological features. The shape-based model, with an average accuracy of 77%, outperforms or complements traditional models. We identify the most informative shapes for each renal tumor subtype from the top-selected features. Results suggest that these shapes are not only accurate diagnostic features, but also correlate with known biological characteristics of renal tumors. Shape-based analysis of histological renal tumor images accurately classifies disease subtypes and reveals biologically insightful discriminatory features. This method for shape-based analysis can be extended to other histological datasets to aid pathologists in diagnostic and therapeutic decisions

  13. Analysis of composition-based metagenomic classification.

    Science.gov (United States)

    Higashi, Susan; Barreto, André da Motta Salles; Cantão, Maurício Egidio; de Vasconcelos, Ana Tereza Ribeiro

    2012-01-01

    An essential step of a metagenomic study is the taxonomic classification, that is, the identification of the taxonomic lineage of the organisms in a given sample. The taxonomic classification process involves a series of decisions. Currently, in the context of metagenomics, such decisions are usually based on empirical studies that consider one specific type of classifier. In this study we propose a general framework for analyzing the impact that several decisions can have on the classification problem. Instead of focusing on any specific classifier, we define a generic score function that provides a measure of the difficulty of the classification task. Using this framework, we analyze the impact of the following parameters on the taxonomic classification problem: (i) the length of n-mers used to encode the metagenomic sequences, (ii) the similarity measure used to compare sequences, and (iii) the type of taxonomic classification, which can be conventional or hierarchical, depending on whether the classification process occurs in a single shot or in several steps according to the taxonomic tree. We defined a score function that measures the degree of separability of the taxonomic classes under a given configuration induced by the parameters above. We conducted an extensive computational experiment and found out that reasonable values for the parameters of interest could be (i) intermediate values of n, the length of the n-mers; (ii) any similarity measure, because all of them resulted in similar scores; and (iii) the hierarchical strategy, which performed better in all of the cases. As expected, short n-mers generate lower configuration scores because they give rise to frequency vectors that represent distinct sequences in a similar way. On the other hand, large values for n result in sparse frequency vectors that represent differently metagenomic fragments that are in fact similar, also leading to low configuration scores. Regarding the similarity measure, in
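
The effect of n-mer length described in this abstract can be reproduced on a toy example. The sketch below is only an illustration: the two sequences, the DNA alphabet restriction, and the use of cosine similarity are assumptions, not the score function of the study.

```python
import math
from collections import Counter
from itertools import product

def nmer_frequencies(seq, n, alphabet="ACGT"):
    """Relative frequency vector over all len(alphabet)**n possible n-mers."""
    counts = Counter(seq[i:i + n] for i in range(len(seq) - n + 1))
    total = sum(counts.values())
    return [counts["".join(p)] / total for p in product(alphabet, repeat=n)]

def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Two distinct toy sequences with identical base composition.
seq_a = "ACGTACGTACGT"
seq_b = "TGCATGCATGCA"

sim_n1 = cosine(nmer_frequencies(seq_a, 1), nmer_frequencies(seq_b, 1))
sim_n4 = cosine(nmer_frequencies(seq_a, 4), nmer_frequencies(seq_b, 4))
nonzero_n4 = sum(1 for f in nmer_frequencies(seq_a, 4) if f > 0)
```

With n = 1 the two different sequences map to the same vector, while with n = 4 only 4 of the 256 entries are non-zero and the vectors share no n-mer at all. This matches the abstract's point that both very short and very long n-mers reduce separability.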

  14. Mechanism-based drug exposure classification in pharmacoepidemiological studies

    NARCIS (Netherlands)

    Verdel, B.M.

    2010-01-01

    In pharmacoepidemiology and pharmacovigilance, the relation between drug exposure and clinical outcomes is crucial. Exposure classification in pharmacoepidemiological studies is traditionally based on

  15. Zone-specific logistic regression models improve classification of prostate cancer on multi-parametric MRI

    Energy Technology Data Exchange (ETDEWEB)

    Dikaios, Nikolaos; Halligan, Steve; Taylor, Stuart; Atkinson, David; Punwani, Shonit [University College London, Centre for Medical Imaging, London (United Kingdom); University College London Hospital, Departments of Radiology, London (United Kingdom); Alkalbani, Jokha; Sidhu, Harbir Singh [University College London, Centre for Medical Imaging, London (United Kingdom); Abd-Alazeez, Mohamed; Ahmed, Hashim U.; Emberton, Mark [University College London, Research Department of Urology, Division of Surgery and Interventional Science, London (United Kingdom); Kirkham, Alex [University College London Hospital, Departments of Radiology, London (United Kingdom); Freeman, Alex [University College London Hospital, Department of Histopathology, London (United Kingdom)

    2015-09-15

    To assess the interchangeability of zone-specific (peripheral-zone (PZ) and transition-zone (TZ)) multiparametric-MRI (mp-MRI) logistic-regression (LR) models for classification of prostate cancer. Two hundred and thirty-one patients (70 TZ training-cohort; 76 PZ training-cohort; 85 TZ temporal validation-cohort) underwent mp-MRI and transperineal-template-prostate-mapping biopsy. PZ and TZ uni/multi-variate mp-MRI LR-models for classification of significant cancer (any cancer-core-length (CCL) with Gleason > 3 + 3 or any grade with CCL ≥ 4 mm) were derived from the respective cohorts and validated within the same zone by leave-one-out analysis. Inter-zonal performance was tested by applying TZ models to the PZ training-cohort and vice-versa. Classification performance of TZ models for TZ cancer was further assessed in the TZ validation-cohort. ROC area-under-curve (ROC-AUC) analysis was used to compare models. The univariate parameters with the best classification performance were the normalised T2 signal (T2nSI) within the TZ (ROC-AUC = 0.77) and normalized early contrast-enhanced T1 signal (DCE-nSI) within the PZ (ROC-AUC = 0.79). Performance was not significantly improved by bi-variate/tri-variate modelling. PZ models that contained DCE-nSI performed poorly in classification of TZ cancer. The TZ model based solely on maximum-enhancement poorly classified PZ cancer. LR-models dependent on DCE-MRI parameters alone are not interchangeable between prostatic zones; however, models based exclusively on T2 and/or ADC are more robust for inter-zonal application. (orig.)
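
The ROC-AUC comparison used in this abstract can be illustrated with a small pairwise computation. This is a sketch under stated assumptions: the scores and labels are invented, and the function is a generic AUC estimate, not the authors' analysis code.

```python
def roc_auc(scores, labels):
    """ROC area under the curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as one half).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical model outputs for six lesions (1 = significant cancer).
scores = [0.9, 0.8, 0.35, 0.6, 0.4, 0.2]
labels = [1, 1, 1, 0, 0, 0]

auc = roc_auc(scores, labels)  # 7 of 9 positive/negative pairs are ordered correctly
```

To mimic the inter-zonal test, one would score cases from one zone with a model fitted on the other zone and compare the resulting AUC with the within-zone value.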

  16. SQL based cardiovascular ultrasound image classification.

    Science.gov (United States)

    Nandagopalan, S; Suryanarayana, Adiga B; Sudarshan, T S B; Chandrashekar, Dhanalakshmi; Manjunath, C N

    2013-01-01

    This paper proposes a novel method to analyze and classify cardiovascular ultrasound echocardiographic images using a Naïve-Bayesian model via database OLAP-SQL. Efficient data mining algorithms based on a tightly coupled model are used to extract features. Three algorithms are proposed for classification, namely the Naïve-Bayesian Classifier for Discrete variables (NBCD) with SQL, NBCD with OLAP-SQL, and the Naïve-Bayesian Classifier for Continuous variables (NBCC) using OLAP-SQL. The proposed model is trained with 207 patient images containing normal and abnormal categories. Of the three proposed algorithms, NBCC achieved the highest classification accuracy, 96.59%, which is better than that of earlier methods.
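
To give a feel for how a naive Bayes classifier over discrete variables can be driven by SQL counts, here is a minimal sketch using Python's built-in `sqlite3` module. It is not the NBCD algorithm of the paper: the table layout, the feature names `wall_motion` and `chamber_size`, the six training rows, and the add-one smoothing are all assumptions for illustration.

```python
import sqlite3

# Hypothetical discretised features; not the variables used in the paper.
ROWS = [
    ("normal", "normal", "normal"),
    ("normal", "normal", "normal"),
    ("normal", "enlarged", "normal"),
    ("reduced", "enlarged", "abnormal"),
    ("reduced", "enlarged", "abnormal"),
    ("reduced", "normal", "abnormal"),
]
FEATURES = ("wall_motion", "chamber_size")  # fixed whitelist of column names

def build_database():
    """Create an in-memory table holding the toy training samples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE samples (wall_motion TEXT, chamber_size TEXT, label TEXT)"
    )
    conn.executemany("INSERT INTO samples VALUES (?, ?, ?)", ROWS)
    return conn

def classify(conn, observation):
    """Return (best_label, scores) using counts obtained through SQL."""
    total = conn.execute("SELECT COUNT(*) FROM samples").fetchone()[0]
    class_counts = conn.execute(
        "SELECT label, COUNT(*) FROM samples GROUP BY label"
    ).fetchall()
    scores = {}
    for label, n_label in class_counts:
        score = n_label / total  # class prior
        for column in FEATURES:  # column names come only from the whitelist
            n_values = conn.execute(
                f"SELECT COUNT(DISTINCT {column}) FROM samples"
            ).fetchone()[0]
            n_match = conn.execute(
                f"SELECT COUNT(*) FROM samples WHERE label = ? AND {column} = ?",
                (label, observation[column]),
            ).fetchone()[0]
            # Add-one (Laplace) smoothing avoids zero likelihoods.
            score *= (n_match + 1) / (n_label + n_values)
        scores[label] = score
    return max(scores, key=scores.get), scores

conn = build_database()
label, scores = classify(conn, {"wall_motion": "reduced", "chamber_size": "enlarged"})
```

Keeping the counting inside the database is the point of the tightly coupled design the abstract refers to: the classifier only issues aggregate queries and never loads the raw table into application memory.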

  17. Cancer classification using the Immunoscore: a worldwide task force

    Directory of Open Access Journals (Sweden)

    Galon Jérôme

    2012-10-01

    Full Text Available Prediction of clinical outcome in cancer is usually achieved by histopathological evaluation of tissue samples obtained during surgical resection of the primary tumor. Traditional tumor staging (AJCC/UICC-TNM classification) summarizes data on tumor burden (T), presence of cancer cells in draining and regional lymph nodes (N) and evidence for metastases (M). However, it is now recognized that clinical outcome can significantly vary among patients within the same stage. The current classification provides limited prognostic information, and does not predict response to therapy. Recent literature has alluded to the importance of the host immune system in controlling tumor progression. Thus, evidence supports the notion to include immunological biomarkers, implemented as a tool for the prediction of prognosis and response to therapy. Accumulating data, collected from large cohorts of human cancers, has demonstrated the impact of immune-classification, which has a prognostic value that may add to the significance of the AJCC/UICC TNM-classification. It is therefore imperative to begin to incorporate the ‘Immunoscore’ into traditional classification, thus providing an essential prognostic and potentially predictive tool. Introduction of this parameter as a biomarker to classify cancers, as part of routine diagnostic and prognostic assessment of tumors, will facilitate clinical decision-making including rational stratification of patient treatment. Equally, the inherent complexity of quantitative immunohistochemistry, in conjunction with protocol variation across laboratories, analysis of different immune cell types, inconsistent region selection criteria, and variable ways to quantify immune infiltration, all underline the urgent requirement to reach assay harmonization. In an effort to promote the Immunoscore in routine clinical settings, an international task force was initiated. This review represents a follow-up of the announcement of

  18. An Approach for Leukemia Classification Based on Cooperative Game Theory

    Directory of Open Access Journals (Sweden)

    Atefeh Torkaman

    2011-01-01

    Full Text Available Hematological malignancies are the types of cancer that affect blood, bone marrow and lymph nodes. As these tissues are naturally connected through the immune system, a disease affecting one of them will often affect the others as well. The hematological malignancies include leukemia, lymphoma, and multiple myeloma. Among them, leukemia is a serious malignancy that starts in blood tissues, especially the bone marrow, where the blood is made. Research shows that leukemia is one of the common cancers in the world, so an emphasis on diagnostic techniques and the best treatments could provide better prognosis and survival for patients. In this paper, an automatic diagnosis recommender system for classifying leukemia based on a cooperative game is presented. Throughout this research, we analyze flow cytometry data toward the classification of leukemia into eight classes. We work on a real data set from different types of leukemia collected at the Iran Blood Transfusion Organization (IBTO). The data set contains 400 samples taken from human leukemic bone marrow. This study deals with a cooperative game used for classification according to different weights assigned to the markers. The proposed method is versatile, as there are no constraints on what the input or output represent. This means that it can be used to classify a population according to their contributions; in other words, it applies equally to other groups of data. The experimental results show an accuracy rate of 93.12% for classification, compared with 90.16% for a decision tree (C4.5). The result demonstrates that the cooperative game is very promising for direct use in the classification of leukemia as part of an Active Medical decision support system for interpretation of flow cytometry readout. This system could assist clinical hematologists to properly recognize different kinds of leukemia by preparing suggestions and this could improve the treatment

  19. An approach for leukemia classification based on cooperative game theory.

    Science.gov (United States)

    Torkaman, Atefeh; Charkari, Nasrollah Moghaddam; Aghaeipour, Mahnaz

    2011-01-01

    Hematological malignancies are the types of cancer that affect blood, bone marrow and lymph nodes. As these tissues are naturally connected through the immune system, a disease affecting one of them will often affect the others as well. The hematological malignancies include leukemia, lymphoma, and multiple myeloma. Among them, leukemia is a serious malignancy that starts in blood tissues, especially the bone marrow, where the blood is made. Research shows that leukemia is one of the common cancers in the world, so an emphasis on diagnostic techniques and the best treatments could provide better prognosis and survival for patients. In this paper, an automatic diagnosis recommender system for classifying leukemia based on a cooperative game is presented. Throughout this research, we analyze flow cytometry data toward the classification of leukemia into eight classes. We work on a real data set from different types of leukemia collected at the Iran Blood Transfusion Organization (IBTO). The data set contains 400 samples taken from human leukemic bone marrow. This study deals with a cooperative game used for classification according to different weights assigned to the markers. The proposed method is versatile, as there are no constraints on what the input or output represent. This means that it can be used to classify a population according to their contributions; in other words, it applies equally to other groups of data. The experimental results show an accuracy rate of 93.12% for classification, compared with 90.16% for a decision tree (C4.5). The result demonstrates that the cooperative game is very promising for direct use in the classification of leukemia as part of an Active Medical decision support system for interpretation of flow cytometry readout. This system could assist clinical hematologists to properly recognize different kinds of leukemia by preparing suggestions and this could improve the treatment of leukemic
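
The abstract does not state how marker weights are derived from the cooperative game. One common choice in cooperative game theory is the Shapley value, sketched below purely as an assumption. The marker names and the coalition values (imagined here as the accuracy reached with each marker subset) are hypothetical.

```python
from itertools import permutations

def shapley_values(players, value):
    """Exact Shapley values by averaging marginal contributions
    over all orderings of the players (feasible only for small games)."""
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            extended = coalition | {p}
            totals[p] += value(extended) - value(coalition)
            coalition = extended
    return {p: t / len(orders) for p, t in totals.items()}

# Hypothetical characteristic function: worth of each marker coalition.
WORTH = {
    frozenset(): 0.0,
    frozenset({"marker_a"}): 0.6,
    frozenset({"marker_b"}): 0.5,
    frozenset({"marker_c"}): 0.2,
    frozenset({"marker_a", "marker_b"}): 0.8,
    frozenset({"marker_a", "marker_c"}): 0.7,
    frozenset({"marker_b", "marker_c"}): 0.6,
    frozenset({"marker_a", "marker_b", "marker_c"}): 0.9,
}

weights = shapley_values(["marker_a", "marker_b", "marker_c"], WORTH.__getitem__)
```

The resulting weights sum to the worth of the full coalition (0.9 here) and rank the markers by average contribution, which is one way of obtaining "different weights assigned to the markers".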

  20. A Discrete Wavelet Based Feature Extraction and Hybrid Classification Technique for Microarray Data Analysis

    Directory of Open Access Journals (Sweden)

    Jaison Bennet

    2014-01-01

    Full Text Available Cancer classification by doctors and radiologists was based on morphological and clinical features and had limited diagnostic ability in the past. The recent arrival of DNA microarray technology has led to the concurrent monitoring of thousands of gene expressions in a single chip, which stimulates progress in cancer classification. In this paper, we have proposed a hybrid approach for microarray data classification based on nearest neighbor (KNN), naive Bayes, and support vector machine (SVM). Feature selection prior to classification plays a vital role, and a feature selection technique which combines the discrete wavelet transform (DWT) and the moving window technique (MWT) is used. The performance of the proposed method is compared with conventional classifiers such as support vector machine, nearest neighbor, and naive Bayes. Experiments have been conducted on both real and benchmark datasets, and the results indicate that the ensemble approach produces higher classification accuracy than conventional classifiers. The proposed method serves as an automated system for the classification of cancer and can be applied by doctors in real cases, which is a boon to the medical community. This work further reduces the misclassification of cancers, which is unacceptable in cancer detection.
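
As an illustration of how a discrete wavelet transform can shrink an expression profile before classification, the sketch below applies one level of the Haar transform. The abstract does not name the wavelet used, so Haar is an assumption, as are the toy profile values. The moving window technique is not shown.

```python
import math

def haar_dwt(signal):
    """One level of the orthonormal Haar DWT.

    Returns (approximation, detail) coefficient lists, each half the
    length of the input. The input length must be even.
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    root2 = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    return approx, detail

# Hypothetical expression profile of one sample over eight genes.
profile = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]

approx, detail = haar_dwt(profile)
energy_in = sum(x * x for x in profile)
energy_out = sum(x * x for x in approx) + sum(x * x for x in detail)
```

The approximation coefficients form a half-length summary that could be passed to a classifier, and the transform preserves signal energy. Repeating the step on `approx` gives coarser levels.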

  1. DNA methylation-based classification of central nervous system tumours

    DEFF Research Database (Denmark)

    Capper, David; Jones, David T.W.; Sill, Martin

    2018-01-01

    Accurate pathological diagnosis is crucial for optimal management of patients with cancer. For the approximately 100 known tumour types of the central nervous system, standardization of the diagnostic process has been shown to be particularly challenging - with substantial inter-observer variability in the histopathological diagnosis of many tumour types. Here we present a comprehensive approach for the DNA methylation-based classification of central nervous system tumours across all entities and age groups, and demonstrate its application in a routine diagnostic setting. We show...

  2. Changing Histopathological Diagnostics by Genome-Based Tumor Classification

    Directory of Open Access Journals (Sweden)

    Michael Kloth

    2014-05-01

    Full Text Available Traditionally, tumors are classified by histopathological criteria, i.e., based on their specific morphological appearances. Consequently, current therapeutic decisions in oncology are strongly influenced by histology rather than underlying molecular or genomic aberrations. However, the increase of information on molecular changes, enabled by the Human Genome Project and the International Cancer Genome Consortium as well as the manifold advances in molecular biology and high-throughput sequencing techniques, inaugurated the integration of genomic information into disease classification. Furthermore, in some cases it became evident that former classifications needed major revision and adaptation. Such adaptations are often required by understanding the pathogenesis of a disease from a specific molecular alteration, using this molecular driver for targeted and highly effective therapies. Altogether, reclassifications should lead to higher information content of the underlying diagnoses, reflecting their molecular pathogenesis and resulting in optimized and individual therapeutic decisions. The objective of this article is to summarize some particularly important examples of genome-based classification approaches and associated therapeutic concepts. In addition to reviewing disease-specific markers, we focus on potentially therapeutic or predictive markers and the relevance of molecular diagnostics in disease monitoring.

  3. Recursive Partitioning Analysis for New Classification of Patients With Esophageal Cancer Treated by Chemoradiotherapy

    International Nuclear Information System (INIS)

    Nomura, Motoo; Shitara, Kohei; Kodaira, Takeshi; Kondoh, Chihiro; Takahari, Daisuke; Ura, Takashi; Kojima, Hiroyuki; Kamata, Minoru; Muro, Kei; Sawada, Satoshi

    2012-01-01

    Background: The 7th edition of the American Joint Committee on Cancer staging system does not include lymph node size in the guidelines for staging patients with esophageal cancer. The objectives of this study were to determine the prognostic impact of the maximum metastatic lymph node diameter (ND) on survival and to develop and validate a new staging system for patients with esophageal squamous cell cancer who were treated with definitive chemoradiotherapy (CRT). Methods: Information on 402 patients with esophageal cancer undergoing CRT at two institutions was reviewed. Univariate and multivariate analyses of data from one institution were used to assess the impact of clinical factors on survival, and recursive partitioning analysis was performed to develop the new staging classification. To assess its clinical utility, the new classification was validated using data from the second institution. Results: By multivariate analysis, gender, T, N, and ND stages were independently and significantly associated with survival (p < 0.05). The resulting new staging classification was based on the T and ND. The four new stages led to good separation of survival curves in both the developmental and validation datasets (p < 0.05). Conclusions: Our results showed that lymph node size is a strong independent prognostic factor and that the new staging system, which incorporated lymph node size, provided good prognostic power, and discriminated effectively for patients with esophageal cancer undergoing CRT.

  4. Research on Classification of Chinese Text Data Based on SVM

    Science.gov (United States)

    Lin, Yuan; Yu, Hongzhi; Wan, Fucheng; Xu, Tao

    2017-09-01

    Data mining has important application value in today's industry and academia, and text classification is one of its key technologies. Many mature algorithms for text classification exist: KNN, NB, AB, SVM, decision trees, and other classification methods all show good classification performance. The support vector machine (SVM) is a well-regarded classifier in machine learning research. This paper studies the classification performance of the SVM method on Chinese text data, applies the support vector machine to classify Chinese text, and aims to combine academic research with practical application.

  5. [New molecular classification of colorectal cancer, pancreatic cancer and stomach cancer: Towards "à la carte" treatment?].

    Science.gov (United States)

    Dreyer, Chantal; Afchain, Pauline; Trouilloud, Isabelle; André, Thierry

    2016-01-01

    This review reports three recently published molecular classifications of the 3 main gastro-intestinal cancers: gastric, pancreatic and colorectal adenocarcinoma. In colorectal adenocarcinoma, 6 independent classifications were combined to yield 4 molecular sub-groups, Consensus Molecular Subtypes (CMS 1-4), linked to various clinical, molecular and survival data: CMS1 (14%: MSI with immune activation); CMS2 (37%: canonical with epithelial differentiation and activation of the WNT/MYC pathway); CMS3 (13%: metabolic with epithelial differentiation and RAS mutation); CMS4 (23%: mesenchymal with activation of TGFβ pathway and angiogenesis with stromal invasion). In gastric adenocarcinoma, 4 groups were established: subtype "EBV" (9%, high frequency of PIK3CA mutations, hypermethylation and amplification of JAK2, PD-L1 and PD-L2), subtype "MSI" (22%, high rate of mutation), subtype "genomically stable tumor" (20%, diffuse histology type and mutations of RAS and genes encoding integrins and adhesion proteins including CDH1) and subtype "tumors with chromosomal instability" (50%, intestinal type, aneuploidy and receptor tyrosine kinase amplification). In pancreatic adenocarcinomas, a classification in four sub-groups has been proposed, stable subtype (20%, aneuploidy), locally rearranged subtype (30%, focal event on one or two chromosomes), scattered subtype (36%,200 structural variation events, defects in DNA maintenance). Although currently far from routine patient care, these classifications open the way to "à la carte" treatment depending on molecular biology. Copyright © 2016 Société Française du Cancer. Published by Elsevier Masson SAS. All rights reserved.

  6. X-ray diagnosis of esophageal cancer and application of Borrmann's classification

    International Nuclear Information System (INIS)

    Chin, Soo Yil

    1985-01-01

    In 126 cases diagnosed as esophageal cancer and treated by radiation at Cancer Research Hospital, K. A. E. R. I., from January 1974 to July 1979, a study on the x-ray diagnosis of esophageal cancer was carried out, mainly with regard to type classification. The ordinary classification of esophageal cancer by x-ray picture was reviewed, an attempt was made to apply Borrmann's classification, used for gastric cancer, to the macroscopic classification of esophageal cancer, and the application of this classification to x-ray diagnosis was discussed. The effect of radiotherapy on each type of cancer according to the ordinary x-ray classification and Borrmann's classification was also studied. The results were as follows: 1. The ordinary x-ray classification was not simple, because the degree of progression of the cancer, differences in mural invasion, and the position and method of radiography could lead to misinterpretation of the type of cancer; the therapeutic effect of radiation for each type according to this classification did not show a significant characteristic either, although radiation was most effective in the polypoidal type and least effective in the funnel type. 2. Borrmann's classification was relatively easy to apply even on the radiogram because of little overlap between types, and the type became more evident on the resected specimen after operation. Some correlation was also recognized between the Borrmann type and the radiotherapeutic effect: the effect was best in type I and gradually decreased through types II, III, and IV, in that order. Radiotherapy was ineffective in about three quarters of type IV cases. 3. Borrmann's classification is now applied to carcinoma of the large bowel as well as to gastric cancer. If it is also applied to esophageal cancer, the macroscopic classification of cancers of the digestive tract can be systematized, which will be convenient in clinical study.

  7. Cancer survival classification using integrated data sets and intermediate information.

    Science.gov (United States)

    Kim, Shinuk; Park, Taesung; Kon, Mark

    2014-09-01

    Although numerous studies related to cancer survival have been published, increasing the prediction accuracy of survival classes still remains a challenge. Integration of different data sets, such as microRNA (miRNA) and mRNA, might increase the accuracy of survival class prediction. Therefore, we suggested a machine learning (ML) approach to integrate different data sets, and developed a novel method based on feature selection with Cox proportional hazard regression model (FSCOX) to improve the prediction of cancer survival time. FSCOX provides us with intermediate survival information, which is usually discarded when separating survival into 2 groups (short- and long-term), and allows us to perform survival analysis. We used an ML-based protocol for feature selection, integrating information from miRNA and mRNA expression profiles at the feature level. To predict survival phenotypes, we used the following classifiers: first, existing ML methods, support vector machine (SVM) and random forest (RF); second, a new median-based classifier using FSCOX (FSCOX_median); and third, an SVM classifier using FSCOX (FSCOX_SVM). We compared these methods using 3 types of cancer tissue data sets: (i) miRNA expression, (ii) mRNA expression, and (iii) combined miRNA and mRNA expression. The latter data set included features selected either from the combined miRNA/mRNA profile or independently from miRNAs and mRNAs profiles (IFS). In the ovarian data set, the accuracy of survival classification using the combined miRNA/mRNA profiles with IFS was 75% using RF, 86.36% using SVM, 84.09% using FSCOX_median, and 88.64% using FSCOX_SVM with a balanced 22 short-term and 22 long-term survivor data set. These accuracies are higher than those using miRNA alone (70.45%, RF; 75%, SVM; 75%, FSCOX_median; and 75%, FSCOX_SVM) or mRNA alone (65.91%, RF; 63.64%, SVM; 72.73%, FSCOX_median; and 70.45%, FSCOX_SVM). Similarly in the glioblastoma multiforme data, the accuracy of miRNA/mRNA using IFS

  8. Alternative Polyadenylation Patterns for Novel Gene Discovery and Classification in Cancer

    Directory of Open Access Journals (Sweden)

    Oguzhan Begik

    2017-07-01

    Full Text Available Certain aspects of diagnosis, prognosis, and treatment of cancer patients are still important challenges to be addressed. Therefore, we propose a pipeline to uncover patterns of alternative polyadenylation (APA), a hidden complexity in cancer transcriptomes, to further accelerate efforts to discover novel cancer genes and pathways. Here, we analyzed expression data for 1045 cancer patients and found a significant shift in usage of poly(A) signals in common tumor types (breast, colon, lung, prostate, gastric, and ovarian) compared to normal tissues. Using machine-learning techniques, we further defined specific subsets of APA events to efficiently classify cancer types. Furthermore, APA patterns were associated with altered protein levels in patients, revealed by antibody-based profiling data, suggesting functional significance. Overall, our study offers a computational approach for use of APA in novel gene discovery and classification in common tumor types, with important implications in basic research, biomarker discovery, and precision medicine approaches.

  9. Contextual segment-based classification of airborne laser scanner data

    NARCIS (Netherlands)

    Vosselman, George; Coenen, Maximilian; Rottensteiner, Franz

    2017-01-01

    Classification of point clouds is needed as a first step in the extraction of various types of geo-information from point clouds. We present a new approach to contextual classification of segmented airborne laser scanning data. Potential advantages of segment-based classification are easily offset

  10. Side effects of cancer therapies. International classification and documentation systems

    International Nuclear Information System (INIS)

    Seegenschmiedt, M.H.

    1998-01-01

    The publication presents and explains verified, international classification and documentation systems for side effects induced by cancer treatments, applicable in general and clinical practice and clinical research, and covers in a clearly arranged manner the whole range of treatments, including acute and chronic side effects of chemotherapy and radiotherapy, surgery, or combined therapies. The book fills a long-felt need in tumor documentation and is a major contribution to quality assurance in clinical oncology in German-speaking countries. As most parts of the book are bilingual, presenting German and English texts and terminology, it satisfies the principles of interdisciplinarity and internationality. The tabulated form chosen for presentation of classification systems and criteria facilitates the user's approach as well as application in daily work. (orig./CB) [de

  11. Call for a Computer-Aided Cancer Detection and Classification Research Initiative in Oman.

    Science.gov (United States)

    Mirzal, Andri; Chaudhry, Shafique Ahmad

    2016-01-01

    Cancer is a major health problem in Oman. It is reported that cancer incidence in Oman is the second highest after Saudi Arabia among Gulf Cooperation Council countries. Based on GLOBOCAN estimates, Oman is predicted to face an almost two-fold increase in cancer incidence in the period 2008-2020. However, cancer research in Oman is still in its infancy. This is due to the fact that medical institutions and infrastructure that play central roles in data collection and analysis are relatively new developments in Oman. We believe the country requires an organized plan and efforts to promote local cancer research. In this paper, we discuss current research progress in cancer diagnosis using machine learning techniques to optimize computer aided cancer detection and classification (CAD). We specifically discuss CAD using two major types of medical data, i.e., medical imaging and microarray gene expression profiling, because medical imaging modalities like mammography, MRI, and PET have been widely used in Oman for assisting radiologists in early cancer diagnosis and microarray data have been proven to be a reliable source for differential diagnosis. We also discuss future cancer research directions and the benefits to Oman's economy of entering the cancer research and treatment business, as it is a multi-billion dollar industry worldwide.

  12. Comparison of hand-craft feature based SVM and CNN based deep learning framework for automatic polyp classification.

    Science.gov (United States)

    Younghak Shin; Balasingham, Ilangko

    2017-07-01

    Colonoscopy is a standard method by which highly trained physicians screen for polyps. Polyps missed during colonoscopy are a potential risk factor for colorectal cancer. In this study, we investigate an automatic polyp classification framework. We aim to compare two different approaches: a hand-crafted feature method and a convolutional neural network (CNN) based deep learning method. Combined shape and color features are used for hand-crafted feature extraction, and a support vector machine (SVM) is adopted for classification. For the CNN approach, a deep learning framework based on three convolution and pooling layers is used for classification. The proposed framework is evaluated using three public polyp databases. The experimental results show that the CNN based deep learning framework achieves better classification performance than the hand-crafted feature based methods, with over 90% classification accuracy, sensitivity, specificity and precision.
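
    The four figures quoted in this abstract (accuracy, sensitivity, specificity, precision) all derive from the same binary confusion matrix. The following is a minimal sketch of that arithmetic only, not the authors' evaluation code; the function name `binary_metrics` and the label encoding (1 = polyp, 0 = non-polyp) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    sensitivity: float
    specificity: float
    precision: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute four standard metrics from paired true/predicted labels.

    The label encoding (positive=1) is an illustrative assumption.
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return BinaryMetrics(
        accuracy=ratio(tp + tn, tp + fp + tn + fn),
        sensitivity=ratio(tp, tp + fn),   # true positive rate (recall)
        specificity=ratio(tn, tn + fp),   # true negative rate
        precision=ratio(tp, tp + fp),     # positive predictive value
    )
```

    Reporting all four together, as the abstract does, guards against a classifier that scores well on accuracy simply because one class dominates the data set.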

  13. Superpixel-based classification of gastric chromoendoscopy images

    Science.gov (United States)

    Boschetto, Davide; Grisan, Enrico

    2017-03-01

    Chromoendoscopy (CH) is a gastroenterology imaging modality that involves the staining of tissues with methylene blue, which reacts with the internal walls of the gastrointestinal tract, improving the visual contrast in mucosal surfaces and thus enhancing a doctor's ability to screen precancerous lesions or early cancer. This technique helps identify areas that can be targeted for biopsy or treatment, and in this work we focus on gastric cancer detection. Gastric chromoendoscopy for cancer detection has several taxonomies available, one of which classifies CH images into three classes (normal, metaplasia, dysplasia) based on color, shape and regularity of pit patterns. Computer-assisted diagnosis is desirable to improve the reliability of tissue classification and abnormality detection. However, traditional computer vision methodologies, mainly segmentation, do not translate well to the specific visual characteristics of a gastroenterology imaging scenario. We propose a first unsupervised segmentation via superpixels, which group pixels into perceptually meaningful atomic regions and replace the rigid structure of the pixel grid. For each superpixel, a set of features is extracted and then fed to a random forest based classifier, which computes a model used to predict the class of each superpixel. The average general accuracy of our model is 92.05% in the pixel domain (86.62% in the superpixel domain), while detection accuracies on the normal and abnormal class are respectively 85.71% and 95%. Finally, the class of the whole image can be predicted through a majority vote over each superpixel's predicted class.
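
    The final step described above, predicting the image class by majority vote over superpixel predictions, can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the function name `image_class_by_majority` is hypothetical, and the tie-breaking rule (the first tied label encountered wins) is an assumption, since the abstract does not say how ties are resolved.

```python
from collections import Counter


def image_class_by_majority(superpixel_labels):
    """Return the most frequent predicted class among an image's superpixels.

    Tie-breaking (first tied label in input order) is an assumption.
    """
    if not superpixel_labels:
        raise ValueError("at least one superpixel prediction is required")
    counts = Counter(superpixel_labels)
    best = max(counts.values())
    for label in superpixel_labels:
        if counts[label] == best:
            return label
```

    For example, `image_class_by_majority(["normal", "dysplasia", "dysplasia"])` returns `"dysplasia"`. An unweighted vote treats every superpixel equally regardless of its area; weighting by pixel count would be a different design choice that the abstract does not describe.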

  14. Density Based Support Vector Machines for Classification

    OpenAIRE

    Zahra Nazari; Dongshik Kang

    2015-01-01

    Support Vector Machines (SVM) are among the most successful algorithms for classification problems. SVM learns the decision boundary from two classes (for binary classification) of training points. However, the training points sometimes include less meaningful samples, called outliers, which are corrupted by noise or lie on the wrong side of the boundary. These outliers affect the margin and classification performance, and the machine would do better to discard them. SVM as a popular and widely used cl...

  15. Classification of Cancer-related Death Certificates using Machine Learning

    Directory of Open Access Journals (Sweden)

    Luke Butt

    2013-05-01

    Full Text Available Background: Cancer monitoring and prevention relies on the critical aspect of timely notification of cancer cases. However, the abstraction and classification of cancer from the free-text of pathology reports and other relevant documents, such as death certificates, are complex and time-consuming activities. Aims: In this paper, approaches for the automatic detection of notifiable cancer cases as the cause of death from free-text death certificates supplied to Cancer Registries are investigated. Method: A number of machine learning classifiers were studied. Features were extracted using natural language techniques and the Medtex toolkit. The features encompassed stemmed words, bi-grams, and concepts from the SNOMED CT medical terminology. The baseline consisted of a keyword spotter using keywords extracted from the long description of ICD-10 cancer related codes. Results: Death certificates with notifiable cancer listed as the cause of death can be effectively identified with the methods studied in this paper. A Support Vector Machine (SVM) classifier achieved the best performance with an overall F-measure of 0.9866 when evaluated on a set of 5,000 free-text death certificates using the token stem feature set. The SNOMED CT concept plus token stem feature set reached the lowest variance (0.0032) and false negative rate (0.0297) while achieving an F-measure of 0.9864. The SVM classifier accounts for the first 18 of the top 40 evaluated runs, and is the most robust classifier with a variance of 0.001141, half the variance of the other classifiers. Conclusion: The selection of features had the greatest influence on the performance of the classifiers, although the type of classifier employed also affects performance. In contrast, the feature weighting schema had a negligible effect on performance. Specifically, it is found that stemmed tokens with or without SNOMED CT concepts create the most effective feature when combined with
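
    The keyword-spotter baseline and the F-measure used in this record can be illustrated with a short sketch. The keyword list below is hypothetical and purely illustrative (the study derived its keywords from ICD-10 long descriptions, which are not reproduced in the abstract), and the function names `spot_cancer` and `f_measure` are assumptions rather than part of the Medtex toolkit.

```python
import re

# Hypothetical keyword list for illustration only; the study extracted its
# keywords from the long descriptions of ICD-10 cancer-related codes.
HYPOTHETICAL_KEYWORDS = frozenset(
    {"carcinoma", "melanoma", "lymphoma", "leukaemia", "malignant neoplasm"}
)


def spot_cancer(text, keywords=HYPOTHETICAL_KEYWORDS):
    """Flag a free-text certificate if any keyword occurs as whole word(s)."""
    # Lowercase, drop punctuation, and collapse whitespace before matching.
    normalized = " ".join(re.findall(r"[a-z]+", text.lower()))
    padded = f" {normalized} "
    return any(f" {keyword} " in padded for keyword in keywords)


def f_measure(y_true, y_pred):
    """Balanced F-measure (harmonic mean of precision and recall)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

    A spotter of this kind has no notion of negation or context, which is one plausible reason a trained classifier over stemmed tokens would outperform it.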

  16. Visualization and tissue classification of human breast cancer images using ultrahigh-resolution OCT (Conference Presentation)

    Science.gov (United States)

    Yao, Xinwen; Gan, Yu; Chang, Ernest W.; Hibshoosh, Hanina; Feldman, Sheldon; Hendon, Christine P.

    2017-02-01

    We employed a home-built ultrahigh resolution (UHR) OCT system at 800 nm to image human breast cancer samples ex vivo. The system has an axial resolution of 2.72 µm and a lateral resolution of 5.52 µm with an extended imaging range of 1.78 mm. Over 900 UHR OCT volumes were generated on specimens from 23 breast cancer cases. With better spatial resolution, detailed structures in the breast tissue were better defined. Different types of breast cancer as well as healthy breast tissue can be well delineated from the UHR OCT images. To quantitatively evaluate the advantages of UHR OCT imaging of breast cancer, features derived from OCT intensity images were used as inputs to a machine learning model, the relevance vector machine. A trained machine learning model was employed to evaluate the performance of tissue classification based on UHR OCT images for differentiating tissue types in the breast samples, including adipose tissue, healthy stroma and cancerous region. For adipose tissue, grid-based local features were extracted from OCT intensity data, including standard deviation, entropy, and homogeneity. We showed that it was possible to enhance the classification performance on distinguishing fat tissue from non-fat tissue by using the UHR images when compared with the results based on OCT images from a commercial 1300 nm OCT system. For invasive ductal carcinoma (IDC) and normal stroma differentiation, the classification was based on frame-based features that portray signal penetration depth and tissue reflectivity. The confusion matrix indicated a sensitivity of 97.5% and a specificity of 77.8%.
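
    Two of the grid-based local features named above, standard deviation and entropy, can be sketched for a single intensity image as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the cell size, the 8-level quantization, the 0-255 intensity range, and the function names are all hypothetical, and homogeneity is omitted because it requires a co-occurrence matrix whose parameters the abstract does not give.

```python
import math
from collections import Counter
from statistics import pstdev


def patch_features(patch, levels=8, max_value=255):
    """Return (standard deviation, Shannon entropy in bits) for one patch.

    `levels` and `max_value` are illustrative assumptions.
    """
    pixels = [value for row in patch for value in row]
    std = pstdev(pixels)
    # Quantize intensities into `levels` bins before estimating entropy.
    bins = Counter(
        min(int(value * levels / (max_value + 1)), levels - 1) for value in pixels
    )
    total = len(pixels)
    entropy = -sum((n / total) * math.log2(n / total) for n in bins.values())
    return std, entropy


def grid_features(image, cell):
    """Tile a 2D image into non-overlapping cell x cell patches, row-major."""
    rows, cols = len(image), len(image[0])
    features = []
    for r in range(0, rows - cell + 1, cell):
        for c in range(0, cols - cell + 1, cell):
            patch = [row[c:c + cell] for row in image[r:r + cell]]
            features.append(patch_features(patch))
    return features
```

    Each grid cell yields one feature tuple, so a classifier can label tissue cell by cell rather than for the image as a whole.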

  17. Classification of treatment-related mortality in children with cancer

    DEFF Research Database (Denmark)

    Alexander, Sarah; Pole, Jason D; Gibson, Paul

    2015-01-01

    Treatment-related mortality is an important outcome in paediatric cancer clinical trials. An international group of experts in supportive care in paediatric cancer developed a consensus-based definition of treatment-related mortality and a cause-of-death attribution system. The reliability and va...

  18. Classification of research reactors and discussion of thinking of safety regulation based on the classification

    International Nuclear Information System (INIS)

    Song Chenxiu; Zhu Lixin

    2013-01-01

    Research reactors have different characteristics in the fields of reactor type, use, power level, design principle, operation model and safety performance, etc, and also have significant discrepancy in the aspect of nuclear safety regulation. This paper introduces classification of research reactors and discusses thinking of safety regulation based on the classification of research reactors. (authors)

  19. An NRG Oncology/GOG study of molecular classification for risk prediction in endometrioid endometrial cancer.

    Science.gov (United States)

    Cosgrove, Casey M; Tritchler, David L; Cohn, David E; Mutch, David G; Rush, Craig M; Lankes, Heather A; Creasman, William T; Miller, David S; Ramirez, Nilsa C; Geller, Melissa A; Powell, Matthew A; Backes, Floor J; Landrum, Lisa M; Timmers, Cynthia; Suarez, Adrian A; Zaino, Richard J; Pearl, Michael L; DiSilvestro, Paul A; Lele, Shashikant B; Goodfellow, Paul J

    2018-01-01

    The purpose of this study was to assess the prognostic significance of a simplified, clinically accessible classification system for endometrioid endometrial cancers combining Lynch syndrome screening and molecular risk stratification. Tumors from NRG/GOG GOG210 were evaluated for mismatch repair defects (MSI, MMR IHC, and MLH1 methylation), POLE mutations, and loss of heterozygosity. TP53 was evaluated in a subset of cases. Tumors were assigned to four molecular classes. Relationships between molecular classes and clinicopathologic variables were assessed using contingency tests and Cox proportional methods. Molecular classification was successful for 982 tumors. Based on the NCI consensus MSI panel assessing MSI and loss of heterozygosity combined with POLE testing, 49% of tumors were classified copy number stable (CNS), 39% MMR deficient, 8% copy number altered (CNA) and 4% POLE mutant. Cancer-specific mortality occurred in 5% of patients with CNS tumors; 2.6% with POLE tumors; 7.6% with MMR deficient tumors and 19% with CNA tumors. The CNA group had worse progression-free (HR 2.31, 95%CI 1.53-3.49) and cancer-specific survival (HR 3.95; 95%CI 2.10-7.44). The POLE group had improved outcomes, but the differences were not statistically significant. CNA class remained significant for cancer-specific survival (HR 2.11; 95%CI 1.04-4.26) in multivariable analysis. The CNA molecular class was associated with TP53 mutation and expression status. A simple molecular classification for endometrioid endometrial cancers that can be easily combined with Lynch syndrome screening provides important prognostic information. These findings support prospective clinical validation and further studies on the predictive value of a simplified molecular classification system. Copyright © 2017 Elsevier Inc. All rights reserved.

  20. Sentiment classification technology based on Markov logic networks

    Science.gov (United States)

    He, Hui; Li, Zhigang; Yao, Chongchong; Zhang, Weizhe

    2016-07-01

    With diverse online media emerging, there is growing interest in the sentiment classification problem. At present, text sentiment classification mainly utilizes supervised machine learning methods, which exhibit a certain domain dependency. On the basis of Markov logic networks (MLNs), this study proposed a cross-domain multi-task text sentiment classification method rooted in transfer learning. Through many-to-one knowledge transfer, labeled text sentiment classification knowledge was successfully transferred into other domains, and the precision of the sentiment classification analysis in the text tendency domain was improved. The experimental results revealed the following: (1) The model based on an MLN demonstrated higher precision than the single individual learning plan model. (2) Multi-task transfer learning based on Markov logic networks could acquire more knowledge than self-domain learning. The cross-domain text sentiment classification model could significantly improve the precision and efficiency of text sentiment classification.

  1. Characteristics of Differently Located Colorectal Cancers Support Proximal and Distal Classification: A Population-Based Study of 57,847 Patients.

    Directory of Open Access Journals (Sweden)

    Jiao Yang

    Full Text Available It has been suggested that colorectal cancer be regarded as several subgroups defined according to tumor location rather than as a single entity. The current study aimed to identify the most useful method for grouping colorectal cancer by tumor location according to both baseline and survival characteristics. Cases of pathologically confirmed colorectal adenocarcinoma diagnosed from 2000 to 2012 were identified from the Surveillance, Epidemiology, and End Results database and categorized into three groups: right colon cancer (RCC), left colon cancer (LCC), and rectal cancer (ReC). Adjusted hazard ratios for known predictors of disease-specific survival (DSS) in colorectal cancer were obtained using a Cox proportional hazards regression model. The study included 57847 patients: 43.5% with RCC, 37.7% with LCC, and 18.8% with ReC. Compared with LCC and ReC, RCC was more likely to affect older patients and women, and to be at advanced stage, poorly differentiated or un-differentiated, and mucinous. Patients with LCC or ReC had better DSS than those with RCC in subgroups including stage III or IV disease, age ≤70 years and non-mucinous adenocarcinoma. Conversely, patients with LCC or ReC had worse DSS than those with RCC in subgroups including age >70 years and mucinous adenocarcinoma. RCC differed from both LCC and ReC in several clinicopathologic characteristics and in DSS. It seems reasonable to group colorectal cancer into right-sided (i.e., proximal) and left-sided (i.e., distal) ones.

  2. Cross-Disciplinary Analysis of Lymph Node Classification in Lung Cancer on CT Scanning.

    Science.gov (United States)

    El-Sherief, Ahmed H; Lau, Charles T; Obuchowski, Nancy A; Mehta, Atul C; Rice, Thomas W; Blackstone, Eugene H

    2017-04-01

    Accurate and consistent regional lymph node classification is an important element in the staging and multidisciplinary management of lung cancer. Regional lymph node definition sets (lymph node maps) have been created to standardize regional lymph node classification. In 2009, the International Association for the Study of Lung Cancer (IASLC) introduced a lymph node map to supersede all preexisting lymph node maps. Our aim was to study if and how lung cancer specialists apply the IASLC lymph node map when classifying thoracic lymph nodes encountered on CT scans during lung cancer staging. From April 2013 through July 2013, invitations were distributed to all members of the Fleischner Society, Society of Thoracic Radiology, General Thoracic Surgical Club, and the American Association of Bronchology and Interventional Pulmonology to participate in an anonymous online image-based and text-based 20-question survey regarding lymph node classification for lung cancer staging on CT imaging. Three hundred thirty-seven people responded (approximately 25% participation). Respondents consisted of self-reported thoracic radiologists (n = 158), thoracic surgeons (n = 102), and pulmonologists who perform endobronchial ultrasonography (n = 77). Half of the respondents (50%; 95% CI, 44%-55%) reported using the IASLC lymph node map in daily practice, with no significant differences between subspecialties. A disparity was observed between the IASLC definition sets and their interpretation and application on CT scans, in particular for lymph nodes near the thoracic inlet, anterior to the trachea, anterior to the tracheal bifurcation, near the ligamentum arteriosum, between the bronchus intermedius and esophagus, in the internal mammary space, and adjacent to the heart. Use of older lymph node maps and inconsistencies in interpretation and application of definitions in the IASLC lymph node map may potentially lead to misclassification of stage and suboptimal management of lung

  3. Dissimilarity-based classification of anatomical tree structures

    DEFF Research Database (Denmark)

    Sørensen, Lauge; Lo, Pechin Chien Pau; Dirksen, Asger

    2011-01-01

    A novel method for classification of abnormality in anatomical tree structures is presented. A tree is classified based on direct comparisons with other trees in a dissimilarity-based classification scheme. The pair-wise dissimilarity measure between two trees is based on a linear assignment betw...

  4. Dissimilarity-based classification of anatomical tree structures

    DEFF Research Database (Denmark)

    Sørensen, Lauge Emil Borch Laurs; Lo, Pechin Chien Pau; Dirksen, Asger

    2011-01-01

    A novel method for classification of abnormality in anatomical tree structures is presented. A tree is classified based on direct comparisons with other trees in a dissimilarity-based classification scheme. The pair-wise dissimilarity measure between two trees is based on a linear assignment...

  5. Breast cancer molecular subtype classification using deep features: preliminary results

    Science.gov (United States)

    Zhu, Zhe; Albadawy, Ehab; Saha, Ashirbani; Zhang, Jun; Harowicz, Michael R.; Mazurowski, Maciej A.

    2018-02-01

    Radiogenomics is a field of investigation that attempts to examine the relationship between imaging characteristics of cancerous lesions and their genomic composition. This could offer a noninvasive alternative to establishing genomic characteristics of tumors and aid cancer treatment planning. While deep learning has shown its superiority in many detection and classification tasks, breast cancer radiogenomic data suffers from a very limited number of training examples, which renders the training of the neural network for this problem directly and with no pretraining a very difficult task. In this study, we investigated an alternative deep learning approach referred to as deep features or off-the-shelf network approach to classify breast cancer molecular subtypes using breast dynamic contrast enhanced MRIs. We used the feature maps of different convolution layers and fully connected layers as features and trained support vector machines using these features for prediction. For the feature maps that have multiple layers, max-pooling was performed along each channel. We focused on distinguishing the Luminal A subtype from other subtypes. To evaluate the models, 10 fold cross-validation was performed and the final AUC was obtained by averaging the performance of all the folds. The highest average AUC obtained was 0.64 (0.95 CI: 0.57-0.71), using the feature maps of the last fully connected layer. This indicates the promise of using this approach to predict the breast cancer molecular subtypes. Since the best performance appears in the last fully connected layer, it also implies that breast cancer molecular subtypes may relate to high level image features
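
    Two steps mentioned in this abstract lend themselves to a small sketch: max-pooling a multi-channel feature map down to one value per channel, and averaging per-fold AUC values. The code below is an illustrative sketch under stated assumptions, not the authors' code. The function names are hypothetical, feature maps are represented as plain nested lists, labels are assumed to be 1 (e.g., Luminal A) and 0 (other), and the AUC is computed with the rank-based (Mann-Whitney) formulation, which the abstract does not specify.

```python
def global_max_pool(feature_map):
    """Reduce each 2D channel of a feature map to its maximum activation."""
    return [max(max(row) for row in channel) for channel in feature_map]


def auc(scores, labels):
    """Rank-based AUC: probability a positive outscores a negative.

    Ties between a positive and a negative score count as one half.
    """
    positives = [s for s, l in zip(scores, labels) if l == 1]
    negatives = [s for s, l in zip(scores, labels) if l == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def mean_fold_auc(folds):
    """Average the AUC over folds given as (scores, labels) pairs."""
    values = [auc(scores, labels) for scores, labels in folds]
    return sum(values) / len(values)
```

    Averaging per-fold AUCs, as described in the abstract, weights every fold equally; pooling all held-out scores and computing a single AUC is an alternative that can give a slightly different number.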

  6. Risk-based classification system of nanomaterials

    International Nuclear Information System (INIS)

    Tervonen, Tommi; Linkov, Igor; Figueira, Jose Rui; Steevens, Jeffery; Chappell, Mark; Merad, Myriam

    2009-01-01

    Various stakeholders are increasingly interested in the potential toxicity and other risks associated with nanomaterials throughout the different stages of a product's life cycle (e.g., development, production, use, disposal). Risk assessment methods and tools developed and applied to chemical and biological materials may not be readily adaptable for nanomaterials because of the current uncertainty in identifying the relevant physico-chemical and biological properties that adequately describe the materials. Such uncertainty is further driven by the substantial variations in the properties of the original material due to variable manufacturing processes employed in nanomaterial production. To guide scientists and engineers in nanomaterial research and application as well as to promote the safe handling and use of these materials, we propose a decision support system for classifying nanomaterials into different risk categories. The classification system is based on a set of performance metrics that measure both the toxicity and physico-chemical characteristics of the original materials, as well as the expected environmental impacts through the product life cycle. Stochastic multicriteria acceptability analysis (SMAA-TRI), a formal decision analysis method, was used as the foundation for this task. This method allowed us to cluster various nanomaterials in different ecological risk categories based on our current knowledge of nanomaterial physico-chemical characteristics, variation in produced material, and best professional judgments. SMAA-TRI uses Monte Carlo simulations to explore all feasible values for weights, criteria measurements, and other model parameters to assess the robustness of nanomaterial grouping for risk management purposes.

  7. Structure-Based Algorithms for Microvessel Classification

    KAUST Repository

    Smith, Amy F.

    2015-02-01

    © 2014 The Authors. Microcirculation published by John Wiley & Sons Ltd. Objective: Recent developments in high-resolution imaging techniques have enabled digital reconstruction of three-dimensional sections of microvascular networks down to the capillary scale. To better interpret these large data sets, our goal is to distinguish branching trees of arterioles and venules from capillaries. Methods: Two novel algorithms are presented for classifying vessels in microvascular anatomical data sets without requiring flow information. The algorithms are compared with a classification based on observed flow directions (considered the gold standard), and with an existing resistance-based method that relies only on structural data. Results: The first algorithm, developed for networks with one arteriolar and one venular tree, performs well in identifying arterioles and venules and is robust to parameter changes, but incorrectly labels a significant number of capillaries as arterioles or venules. The second algorithm, developed for networks with multiple inlets and outlets, correctly identifies more arterioles and venules, but is more sensitive to parameter changes. Conclusions: The algorithms presented here can be used to classify microvessels in large microvascular data sets lacking flow information. This provides a basis for analyzing the distinct geometrical properties and modelling the functional behavior of arterioles, capillaries, and venules.

  8. Risk-based classification system of nanomaterials

    Energy Technology Data Exchange (ETDEWEB)

    Tervonen, Tommi, E-mail: t.p.tervonen@rug.n [University of Groningen, Faculty of Economics and Business (Netherlands); Linkov, Igor, E-mail: igor.linkov@usace.army.mi [US Army Research and Development Center (United States); Figueira, Jose Rui, E-mail: figueira@ist.utl.p [Technical University of Lisbon, CEG-IST, Centre for Management Studies, Instituto Superior Tecnico (Portugal); Steevens, Jeffery, E-mail: jeffery.a.steevens@usace.army.mil; Chappell, Mark, E-mail: mark.a.chappell@usace.army.mi [US Army Research and Development Center (United States); Merad, Myriam, E-mail: myriam.merad@ineris.f [INERIS BP 2, Societal Management of Risks Unit/Accidental Risks Division (France)

    2009-05-15

    Various stakeholders are increasingly interested in the potential toxicity and other risks associated with nanomaterials throughout the different stages of a product's life cycle (e.g., development, production, use, disposal). Risk assessment methods and tools developed and applied to chemical and biological materials may not be readily adaptable for nanomaterials because of the current uncertainty in identifying the relevant physico-chemical and biological properties that adequately describe the materials. Such uncertainty is further driven by the substantial variations in the properties of the original material due to variable manufacturing processes employed in nanomaterial production. To guide scientists and engineers in nanomaterial research and application as well as to promote the safe handling and use of these materials, we propose a decision support system for classifying nanomaterials into different risk categories. The classification system is based on a set of performance metrics that measure both the toxicity and physico-chemical characteristics of the original materials, as well as the expected environmental impacts through the product life cycle. Stochastic multicriteria acceptability analysis (SMAA-TRI), a formal decision analysis method, was used as the foundation for this task. This method allowed us to cluster various nanomaterials in different ecological risk categories based on our current knowledge of nanomaterial physico-chemical characteristics, variation in produced material, and best professional judgments. SMAA-TRI uses Monte Carlo simulations to explore all feasible values for weights, criteria measurements, and other model parameters to assess the robustness of nanomaterial grouping for risk management purposes.

  9. Classification of breast cancer patients using somatic mutation profiles and machine learning approaches.

    Science.gov (United States)

    Vural, Suleyman; Wang, Xiaosheng; Guda, Chittibabu

    2016-08-26

    The high degree of heterogeneity observed in breast cancers makes it very difficult to classify the cancer patients into distinct clinical subgroups and consequently limits the ability to devise effective therapeutic strategies. Several classification strategies based on ER/PR/HER2 expression or the expression profiles of a panel of genes have helped, but such methods often produce misleading results due to their dynamic nature. In contrast, somatic DNA mutations are relatively stable and lead to initiation and progression of many sporadic cancers. Hence in this study, we explore the use of gene mutation profiles to classify, characterize and predict the subgroups of breast cancers. We analyzed the whole exome sequencing data from 358 ethnically similar breast cancer patients in The Cancer Genome Atlas (TCGA) project. Somatic and non-synonymous single nucleotide variants identified from each patient were assigned a quantitative score (C-score) that represents the extent of negative impact on the gene function. Using these scores with the non-negative matrix factorization method, we clustered the patients into three subgroups. By comparing the clinical stage of patients, we identified an early-stage-enriched and a late-stage-enriched subgroup. Comparison of the mutation scores of the early- and late-stage-enriched subgroups identified 358 genes that carry significantly higher mutation rates in the late-stage-enriched subgroup. Functional characterization of these genes revealed important functional gene families that carry a heavy mutational load in the late-stage-enriched subgroup of patients. Finally, using the identified subgroups, we also developed a supervised classification model to predict the stage of the patients. This study demonstrates that gene mutation profiles can be effectively used with unsupervised machine-learning methods to identify clinically distinguishable breast cancer subgroups. The classification model developed in this method could provide a reasonable

  10. KNN BASED CLASSIFICATION OF DIGITAL MODULATED SIGNALS

    Directory of Open Access Journals (Sweden)

    Sajjad Ahmed Ghauri

    2016-11-01

    Demodulation without knowledge of the modulation scheme requires Automatic Modulation Classification (AMC). When the receiver has limited information about the received signal, AMC becomes an essential process. AMC has an important place in many civil and military fields, such as modern electronic warfare, interfering source recognition, frequency management, and link adaptation. In this paper we explore the use of K-nearest neighbor (KNN) for modulation classification with different distance measurement methods. Five modulation schemes are used for classification: Binary Phase Shift Keying (BPSK), Quadrature Phase Shift Keying (QPSK), Quadrature Amplitude Modulation (QAM), 16-QAM and 64-QAM. Higher order cumulants (HOC) are used as the input feature set to the classifier. Simulation results show that the proposed classification method provides better results for the considered modulation formats.
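
    As a rough illustration of cumulant features fed to a KNN classifier, the sketch below estimates the fourth-order cumulants C40 and C42 from complex baseband samples and classifies by majority vote among the nearest neighbours. Several choices are assumptions rather than details from the abstract: only BPSK and QPSK are simulated, only C40 and C42 are used, the distance is Euclidean, and the helper names (`moment`, `cumulant_features`, `symbols`, `knn_predict`) are hypothetical.

```python
import cmath
import math
import random

def moment(x, p, q):
    """Mixed moment M_pq = E[x^(p-q) * conj(x)^q], estimated from samples."""
    return sum(s ** (p - q) * s.conjugate() ** q for s in x) / len(x)

def cumulant_features(x):
    """Return (|C40|, |C42|) after normalising the signal to unit power."""
    power = moment(x, 2, 1).real
    x = [s / math.sqrt(power) for s in x]
    m20, m21 = moment(x, 2, 0), moment(x, 2, 1)
    c40 = moment(x, 4, 0) - 3 * m20 ** 2
    c42 = moment(x, 4, 2) - abs(m20) ** 2 - 2 * m21 ** 2
    return (abs(c40), abs(c42))

def symbols(scheme, n, rng, noise_std=0.1):
    """Simulate n noisy symbols (illustrative channel: additive Gaussian noise only)."""
    if scheme == "BPSK":
        points = [1 + 0j, -1 + 0j]
    elif scheme == "QPSK":
        points = [cmath.exp(1j * (math.pi / 4 + k * math.pi / 2)) for k in range(4)]
    else:
        raise ValueError(f"unsupported scheme: {scheme}")
    return [rng.choice(points) + complex(rng.gauss(0, noise_std), rng.gauss(0, noise_std))
            for _ in range(n)]

def knn_predict(train, query, k=3):
    """Majority vote among the k training points closest to the query (Euclidean)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)

rng = random.Random(0)
training_set = [(cumulant_features(symbols(name, 500, rng)), name)
                for name in ("BPSK", "QPSK") for _ in range(5)]
guess = knn_predict(training_set, cumulant_features(symbols("QPSK", 500, rng)))
```

    Swapping `math.dist` for another metric is the natural place to try the different distance measures the abstract mentions.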

  11. Classification of Breast Cancer Subtypes by combining Gene Expression and DNA Methylation Data

    Directory of Open Access Journals (Sweden)

    List Markus

    2014-06-01

    Selecting the most promising treatment strategy for breast cancer crucially depends on determining the correct subtype. In recent years, gene expression profiling has been investigated as an alternative to histochemical methods. Since databases like TCGA provide easy and unrestricted access to gene expression data for hundreds of patients, the challenge is to extract a minimal optimal set of genes with good prognostic properties from a large bulk of genes making a moderate contribution to classification. Several studies have successfully applied machine learning algorithms to solve this so-called gene selection problem. However, more diverse data from other OMICS technologies are available, including methylation. We hypothesize that combining methylation and gene expression data could already lead to a largely improved classification model, since the resulting model will reflect differences not only on the transcriptomic, but also on an epigenetic level. We compared so-called random forest derived classification models based on gene expression and methylation data alone, to a model based on the combined features and to a model based on the gold standard PAM50. We obtained bootstrap errors of 10-20% and classification error of 1-50%, depending on breast cancer subtype and model. The gene expression model was clearly superior to the methylation model, which was also reflected in the combined model, which mainly selected features from gene expression data. However, the methylation model was able to identify unique features not considered as relevant by the gene expression model, which might provide deeper insights into breast cancer subtype differentiation on an epigenetic level.

  12. Integrating Globality and Locality for Robust Representation Based Classification

    Directory of Open Access Journals (Sweden)

    Zheng Zhang

    2014-01-01

    The representation based classification method (RBCM) has shown huge potential for face recognition since it first emerged. The linear regression classification (LRC) method and the collaborative representation classification (CRC) method are two well-known RBCMs. LRC and CRC exploit the training samples of each class and all the training samples, respectively, to represent the testing sample, and subsequently conduct classification on the basis of the representation residual. The LRC method can be viewed as a “locality representation” method because it uses only the training samples of each class to represent the testing sample, and it cannot embody the effectiveness of the “globality representation.” Conversely, the CRC method appears unable to benefit from the locality of the general RBCM. Thus we propose to integrate CRC and LRC to perform more robust representation based classification. The experimental results on benchmark face databases substantially demonstrate that the proposed method achieves high classification accuracy.
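
    The difference between the two residuals can be made concrete with a small sketch. The abstract does not say how CRC and LRC are integrated, so the weighted sum of residuals in `classify` (with a hypothetical parameter `alpha`) is purely an assumption, as is the small ridge term `lam` added to both fits for numerical stability. All function names are illustrative.

```python
import math

def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ridge_coefficients(samples, y, lam=1e-3):
    """Coefficients c minimising ||y - sum_i c_i * samples[i]||^2 + lam * ||c||^2."""
    gram = [[sum(p * q for p, q in zip(si, sj)) + (lam if i == j else 0.0)
             for j, sj in enumerate(samples)] for i, si in enumerate(samples)]
    rhs = [sum(p * q for p, q in zip(si, y)) for si in samples]
    return solve(gram, rhs)

def residual(samples, coeffs, y):
    """Euclidean distance between y and its reconstruction from the given samples."""
    recon = [sum(c * s[d] for c, s in zip(coeffs, samples)) for d in range(len(y))]
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(y, recon)))

def lrc_residuals(train, y):
    """Locality: represent y with each class's own training samples only."""
    return {lab: residual(s, ridge_coefficients(s, y), y) for lab, s in train.items()}

def crc_residuals(train, y):
    """Globality: represent y with all samples, then keep each class's share."""
    tagged = [(lab, s) for lab, group in train.items() for s in group]
    coeffs = ridge_coefficients([s for _, s in tagged], y)
    out = {}
    for lab in train:
        idx = [i for i, (tag, _) in enumerate(tagged) if tag == lab]
        out[lab] = residual([tagged[i][1] for i in idx], [coeffs[i] for i in idx], y)
    return out

def classify(train, y, alpha=0.5):
    """Assumed fusion rule: pick the class with the smallest weighted residual sum."""
    lrc, crc = lrc_residuals(train, y), crc_residuals(train, y)
    return min(train, key=lambda lab: alpha * lrc[lab] + (1 - alpha) * crc[lab])

faces = {"a": [[1.0, 0.0, 0.0, 0.0], [0.9, 0.1, 0.0, 0.0]],
         "b": [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.9, 0.1]]}
label = classify(faces, [1.0, 0.05, 0.0, 0.0])
```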

  13. DNA methylation-based classification of central nervous system tumours.

    Science.gov (United States)

    Capper, David; Jones, David T W; Sill, Martin; Hovestadt, Volker; Schrimpf, Daniel; Sturm, Dominik; Koelsche, Christian; Sahm, Felix; Chavez, Lukas; Reuss, David E; Kratz, Annekathrin; Wefers, Annika K; Huang, Kristin; Pajtler, Kristian W; Schweizer, Leonille; Stichel, Damian; Olar, Adriana; Engel, Nils W; Lindenberg, Kerstin; Harter, Patrick N; Braczynski, Anne K; Plate, Karl H; Dohmen, Hildegard; Garvalov, Boyan K; Coras, Roland; Hölsken, Annett; Hewer, Ekkehard; Bewerunge-Hudler, Melanie; Schick, Matthias; Fischer, Roger; Beschorner, Rudi; Schittenhelm, Jens; Staszewski, Ori; Wani, Khalida; Varlet, Pascale; Pages, Melanie; Temming, Petra; Lohmann, Dietmar; Selt, Florian; Witt, Hendrik; Milde, Till; Witt, Olaf; Aronica, Eleonora; Giangaspero, Felice; Rushing, Elisabeth; Scheurlen, Wolfram; Geisenberger, Christoph; Rodriguez, Fausto J; Becker, Albert; Preusser, Matthias; Haberler, Christine; Bjerkvig, Rolf; Cryan, Jane; Farrell, Michael; Deckert, Martina; Hench, Jürgen; Frank, Stephan; Serrano, Jonathan; Kannan, Kasthuri; Tsirigos, Aristotelis; Brück, Wolfgang; Hofer, Silvia; Brehmer, Stefanie; Seiz-Rosenhagen, Marcel; Hänggi, Daniel; Hans, Volkmar; Rozsnoki, Stephanie; Hansford, Jordan R; Kohlhof, Patricia; Kristensen, Bjarne W; Lechner, Matt; Lopes, Beatriz; Mawrin, Christian; Ketter, Ralf; Kulozik, Andreas; Khatib, Ziad; Heppner, Frank; Koch, Arend; Jouvet, Anne; Keohane, Catherine; Mühleisen, Helmut; Mueller, Wolf; Pohl, Ute; Prinz, Marco; Benner, Axel; Zapatka, Marc; Gottardo, Nicholas G; Driever, Pablo Hernáiz; Kramm, Christof M; Müller, Hermann L; Rutkowski, Stefan; von Hoff, Katja; Frühwald, Michael C; Gnekow, Astrid; Fleischhack, Gudrun; Tippelt, Stephan; Calaminus, Gabriele; Monoranu, Camelia-Maria; Perry, Arie; Jones, Chris; Jacques, Thomas S; Radlwimmer, Bernhard; Gessi, Marco; Pietsch, Torsten; Schramm, Johannes; Schackert, Gabriele; Westphal, Manfred; Reifenberger, Guido; Wesseling, Pieter; Weller, Michael; Collins, Vincent Peter; Blümcke, Ingmar; Bendszus, Martin; Debus, Jürgen; Huang, Annie; Jabado, Nada; Northcott, Paul A; Paulus, Werner; Gajjar, Amar; Robinson, Giles W; Taylor, Michael D; Jaunmuktane, Zane; Ryzhova, Marina; Platten, Michael; Unterberg, Andreas; Wick, Wolfgang; Karajannis, Matthias A; Mittelbronn, Michel; Acker, Till; Hartmann, Christian; Aldape, Kenneth; Schüller, Ulrich; Buslei, Rolf; Lichter, Peter; Kool, Marcel; Herold-Mende, Christel; Ellison, David W; Hasselblatt, Martin; Snuderl, Matija; Brandner, Sebastian; Korshunov, Andrey; von Deimling, Andreas; Pfister, Stefan M

    2018-03-22

    Accurate pathological diagnosis is crucial for optimal management of patients with cancer. For the approximately 100 known tumour types of the central nervous system, standardization of the diagnostic process has been shown to be particularly challenging, with substantial inter-observer variability in the histopathological diagnosis of many tumour types. Here we present a comprehensive approach for the DNA methylation-based classification of central nervous system tumours across all entities and age groups, and demonstrate its application in a routine diagnostic setting. We show that the availability of this method may have a substantial impact on diagnostic precision compared to standard methods, resulting in a change of diagnosis in up to 12% of prospective cases. For broader accessibility, we have designed a free online classifier tool, the use of which does not require any additional onsite data processing. Our results provide a blueprint for the generation of machine-learning-based tumour classifiers across other cancer entities, with the potential to fundamentally transform tumour pathology.

  14. Involvement of Machine Learning for Breast Cancer Image Classification: A Survey

    OpenAIRE

    Nahid, Abdullah-Al; Kong, Yinan

    2017-01-01

    Breast cancer is one of the leading causes of death among women in the world today. Advanced natural image classification techniques and Artificial Intelligence methods have been widely used for the breast-image classification task. Digital image classification gives doctors and physicians a second opinion and saves them time. Despite the various publications on breast image classification, very few review papers are available w...

  15. Identification of high-risk cutaneous melanoma tumors is improved when combining the online American Joint Committee on Cancer Individualized Melanoma Patient Outcome Prediction Tool with a 31-gene expression profile-based classification.

    Science.gov (United States)

    Ferris, Laura K; Farberg, Aaron S; Middlebrook, Brooke; Johnson, Clare E; Lassen, Natalie; Oelschlager, Kristen M; Maetzold, Derek J; Cook, Robert W; Rigel, Darrell S; Gerami, Pedram

    2017-05-01

    A significant proportion of patients with American Joint Committee on Cancer (AJCC)-defined early-stage cutaneous melanoma have disease recurrence and die. A 31-gene expression profile (GEP) that accurately assesses metastatic risk associated with primary cutaneous melanomas has been described. We sought to compare accuracy of the GEP in combination with risk determined using the web-based AJCC Individualized Melanoma Patient Outcome Prediction Tool. GEP results from 205 stage I/II cutaneous melanomas with sufficient clinical data for prognostication using the AJCC tool were classified as low (class 1) or high (class 2) risk. Two 5-year overall survival cutoffs (AJCC 79% and 68%), reflecting survival for patients with stage IIA or IIB disease, respectively, were assigned for binary AJCC risk. Cox univariate analysis revealed significant risk classification of distant metastasis-free and overall survival (hazard ratio range 3.2-9.4, P risk by GEP but low risk by AJCC. Specimens reflect tertiary care center referrals; more effective therapies have been approved for clinical use after accrual. The GEP provides valuable prognostic information and improves identification of high-risk melanomas when used together with the AJCC online prediction tool. Copyright © 2016 American Academy of Dermatology, Inc. Published by Elsevier Inc. All rights reserved.

  16. Gene selection for cancer classification with the help of bees.

    Science.gov (United States)

    Moosa, Johra Muhammad; Shakur, Rameen; Kaykobad, Mohammad; Rahman, Mohammad Sohel

    2016-08-10

    Development of biologically relevant models from gene expression data, notably microarray data, has become a topic of great interest in bioinformatics, clinical genetics and oncology. Only a small number of gene expression data, compared to the total number of genes explored, possess a significant correlation with a certain phenotype. Gene selection enables researchers to obtain substantial insight into the genetic nature of the disease and the mechanisms responsible for it. Besides improving the performance of cancer classification, it can also cut down the time and cost of medical diagnoses. This study presents a modified Artificial Bee Colony (ABC) algorithm that selects a minimum number of genes deemed significant for cancer while improving predictive accuracy. The search equation of ABC is believed to be good at exploration but poor at exploitation. To overcome this limitation we have modified the ABC algorithm by incorporating the concept of pheromones, which is one of the major components of the Ant Colony Optimization (ACO) algorithm, and a new operation in which successive bees communicate to share their findings. The proposed algorithm is evaluated using a suite of ten publicly available datasets after the parameters are tuned scientifically with one of the datasets. The results obtained are compared to other works that used the same datasets. The performance of the proposed method is shown to be superior. The method presented in this paper can provide a subset of genes leading to more accurate classification results while the number of selected genes is smaller. Additionally, the proposed modified Artificial Bee Colony Algorithm could conceivably be applied to problems in other areas as well.

  17. Sparse Representation Based Multi-Instance Learning for Breast Ultrasound Image Classification.

    Science.gov (United States)

    Bing, Lu; Wang, Wei

    2017-01-01

    We propose a novel method based on sparse representation for breast ultrasound image classification under the framework of multi-instance learning (MIL). After image enhancement and segmentation, concentric circle is used to extract the global and local features for improving the accuracy in diagnosis and prediction. The classification problem of ultrasound image is converted to sparse representation based MIL problem. Each instance of a bag is represented as a sparse linear combination of all basis vectors in the dictionary, and then the bag is represented by one feature vector which is obtained via sparse representations of all instances within the bag. The sparse and MIL problem is further converted to a conventional learning problem that is solved by relevance vector machine (RVM). Results of single classifiers are combined to be used for classification. Experimental results on the breast cancer datasets demonstrate the superiority of the proposed method in terms of classification accuracy as compared with state-of-the-art MIL methods.

  18. Cancer Feature Selection and Classification Using a Binary Quantum-Behaved Particle Swarm Optimization and Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Maolong Xi

    2016-01-01

    This paper focuses on the feature gene selection for cancer classification, which employs an optimization algorithm to select a subset of the genes. We propose a binary quantum-behaved particle swarm optimization (BQPSO) for cancer feature gene selection, coupling support vector machine (SVM) for cancer classification. First, the proposed BQPSO algorithm is described, which is a discretized version of original QPSO for binary 0-1 optimization problems. Then, we present the principle and procedure for cancer feature gene selection and cancer classification based on BQPSO and SVM with leave-one-out cross validation (LOOCV). Finally, the BQPSO coupling SVM (BQPSO/SVM), binary PSO coupling SVM (BPSO/SVM), and genetic algorithm coupling SVM (GA/SVM) are tested for feature gene selection and cancer classification on five microarray data sets, namely, Leukemia, Prostate, Colon, Lung, and Lymphoma. The experimental results show that BQPSO/SVM has significant advantages in accuracy, robustness, and the number of feature genes selected compared with the other two algorithms.

  19. Cancer Feature Selection and Classification Using a Binary Quantum-Behaved Particle Swarm Optimization and Support Vector Machine

    Science.gov (United States)

    Sun, Jun; Liu, Li; Fan, Fangyun; Wu, Xiaojun

    2016-01-01

    This paper focuses on the feature gene selection for cancer classification, which employs an optimization algorithm to select a subset of the genes. We propose a binary quantum-behaved particle swarm optimization (BQPSO) for cancer feature gene selection, coupling support vector machine (SVM) for cancer classification. First, the proposed BQPSO algorithm is described, which is a discretized version of original QPSO for binary 0-1 optimization problems. Then, we present the principle and procedure for cancer feature gene selection and cancer classification based on BQPSO and SVM with leave-one-out cross validation (LOOCV). Finally, the BQPSO coupling SVM (BQPSO/SVM), binary PSO coupling SVM (BPSO/SVM), and genetic algorithm coupling SVM (GA/SVM) are tested for feature gene selection and cancer classification on five microarray data sets, namely, Leukemia, Prostate, Colon, Lung, and Lymphoma. The experimental results show that BQPSO/SVM has significant advantages in accuracy, robustness, and the number of feature genes selected compared with the other two algorithms. PMID:27642363
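
    One way to picture how a binary particle is scored is sketched below: a 0-1 mask selects genes, and leave-one-out cross validation measures how well a classifier does on the reduced data. This is not the authors' implementation. A nearest-centroid classifier stands in for the SVM to keep the example dependency-free, the size penalty in `fitness` is a hypothetical objective, and the BQPSO update itself is not shown.

```python
def nearest_centroid_predict(train_x, train_y, x):
    """Stand-in classifier (the paper uses an SVM): pick the closest class mean."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, lab in zip(train_x, train_y) if lab == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def loocv_accuracy(data, labels, mask):
    """Leave-one-out accuracy using only the genes where mask[j] == 1."""
    selected = [j for j, bit in enumerate(mask) if bit]
    if not selected:
        return 0.0
    reduced = [[row[j] for j in selected] for row in data]
    hits = 0
    for i in range(len(reduced)):
        # Hold out sample i, fit on the rest, and test on the held-out sample.
        train_x = reduced[:i] + reduced[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        hits += nearest_centroid_predict(train_x, train_y, reduced[i]) == labels[i]
    return hits / len(reduced)

def fitness(data, labels, mask, gene_penalty=0.01):
    """Hypothetical objective: reward accuracy, lightly penalise subset size."""
    return loocv_accuracy(data, labels, mask) - gene_penalty * sum(mask)

# Made-up expression values: gene 0 separates the classes, genes 1 and 2 do not.
expression = [[0.1, 5, 1], [0.2, 1, 4], [0.0, 3, 2],
              [1.0, 4, 1], [0.9, 2, 5], [1.1, 5, 3]]
classes = ["normal"] * 3 + ["tumor"] * 3
score = fitness(expression, classes, [1, 0, 0])
```

    A swarm optimizer would call `fitness` once per particle per iteration, so the cost of the inner cross validation dominates the run time.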

  20. Case base classification on digital mammograms: improving the performance of case base classifier

    Science.gov (United States)

    Raman, Valliappan; Then, H. H.; Sumari, Putra; Venkatesa Mohan, N.

    2011-10-01

    Breast cancer continues to be a significant public health problem in the world. Early detection is the key to improving breast cancer prognosis. The aim of the research presented here is twofold. The first stage involves machine learning techniques that segment masses in digital mammograms and extract features from them. The second stage is a problem-solving approach that classifies the masses with a performance-based case-base classifier. In this paper we build a case-based classifier in order to diagnose mammographic images. We explain the different methods and behaviors that have been added to the classifier to improve its performance. An initial performance-based classifier with bagging is proposed and implemented, and it shows an improvement in specificity and sensitivity.

  1. A comprehensive sensitivity analysis of microarray breast cancer classification under feature variability

    Directory of Open Access Journals (Sweden)

    Reinders Marcel JT

    2009-11-01

    Background: Large discrepancies in signature composition and outcome concordance have been observed between different microarray breast cancer expression profiling studies. This is often ascribed to differences in array platform as well as biological variability. We conjecture that other reasons for the observed discrepancies are the measurement error associated with each feature and the choice of preprocessing method. Microarray data are known to be subject to technical variation and the confidence intervals around individual point estimates of expression levels can be wide. Furthermore, the estimated expression values also vary depending on the selected preprocessing scheme. In microarray breast cancer classification studies, however, these two forms of feature variability are almost always ignored and hence their exact role is unclear. Results: We have performed a comprehensive sensitivity analysis of microarray breast cancer classification under the two types of feature variability mentioned above. We used data from six state-of-the-art preprocessing methods, using a compendium consisting of eight different datasets, involving 1131 hybridizations, containing data from both one- and two-color array technology. For a wide range of classifiers, we performed a joint study on performance, concordance and stability. In the stability analysis we explicitly tested classifiers for their noise tolerance by using perturbed expression profiles that are based on uncertainty information directly related to the preprocessing methods. Our results indicate that signature composition is strongly influenced by feature variability, even if the array platform and the stratification of patient samples are identical. In addition, we show that there is often a high level of discordance between individual class assignments for signatures constructed on data coming from different preprocessing schemes, even if the actual signature composition is identical

  2. Quantum Ensemble Classification: A Sampling-Based Learning Control Approach.

    Science.gov (United States)

    Chen, Chunlin; Dong, Daoyi; Qi, Bo; Petersen, Ian R; Rabitz, Herschel

    2017-06-01

    Quantum ensemble classification (QEC) has significant applications in discrimination of atoms (or molecules), separation of isotopes, and quantum information extraction. However, quantum mechanics forbids deterministic discrimination among nonorthogonal states. The classification of inhomogeneous quantum ensembles is very challenging, since there exist variations in the parameters characterizing the members within different classes. In this paper, we recast QEC as a supervised quantum learning problem. A systematic classification methodology is presented by using a sampling-based learning control (SLC) approach for quantum discrimination. The classification task is accomplished via simultaneously steering members belonging to different classes to their corresponding target states (e.g., mutually orthogonal states). First, a new discrimination method is proposed for two similar quantum systems. Then, an SLC method is presented for QEC. Numerical results demonstrate the effectiveness of the proposed approach for the binary classification of two-level quantum ensembles and the multiclass classification of multilevel quantum ensembles.

  3. Classification of normal and abnormal images of lung cancer

    Science.gov (United States)

    Bhatnagar, Divyesh; Tiwari, Amit Kumar; Vijayarajan, V.; Krishnamoorthy, A.

    2017-11-01

    Identifying the exact symptoms of lung cancer is difficult because of the way cancerous tissue forms: large tissue structures intersect in different ways. This problem can be evaluated with the help of digital images. In this approach, images are examined with the basic operations of the PCA algorithm. In this paper, the GLCM method is used for image pre-processing and feature extraction, and to determine at an early stage of the disease whether a patient's image is normal or abnormal. The stage of the cancer is evaluated from the result. With the help of the dataset and the result, the survival rate of cancer patients can be estimated. The result is based on the correct and incorrect arrangement of tissue patterns.
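
    For readers unfamiliar with the gray-level co-occurrence matrix (GLCM), a minimal sketch follows. It counts how often pairs of gray levels occur at a fixed pixel offset and derives two common texture features, contrast and energy. The offset, the symmetric counting and the choice of features are assumptions for illustration; the abstract does not specify them.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalised, symmetric gray-level co-occurrence matrix for one pixel offset."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                counts[a][b] += 1
                counts[b][a] += 1  # count the pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def contrast(p):
    """Weighted sum of squared gray-level differences; 0 for a uniform image."""
    return sum(p[i][j] * (i - j) ** 2 for i in range(len(p)) for j in range(len(p)))

def energy(p):
    """Sum of squared entries; 1 when a single gray-level pair dominates."""
    return sum(v * v for row in p for v in row)

# Tiny made-up image with gray levels 0..2.
patch = [[0, 0, 1],
         [1, 2, 2],
         [0, 1, 2]]
features = (contrast(glcm(patch, levels=3)), energy(glcm(patch, levels=3)))
```

    Feature vectors of this kind, computed per image or per region, are what a downstream normal/abnormal classifier would consume.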

  4. Preliminary Research on Grassland Fine-classification Based on MODIS

    International Nuclear Information System (INIS)

    Hu, Z W; Zhang, S; Yu, X Y; Wang, X S

    2014-01-01

    Grassland ecosystems are important for climate regulation and for conserving soil and water. Research on grassland monitoring methods can provide an effective reference for grassland resource investigation. In this study, we used the vegetation index method for grassland classification. Because China has several types of climate, we used China's Main Climate Zone Maps to divide the study region into four climate zones. Based on the grassland classification system of the first nation-wide grass resource survey in China, we established a new grassland classification system suitable only for this research. We used MODIS images as the basic data source and the expert classifier method to perform grassland classification. Based on the 1:1,000,000 Grassland Resource Map of China, we obtained the basic distribution of all the grassland types and selected 20 samples evenly distributed in each type, then used the NDVI/EVI product to summarize the spectral features of the different grassland types. Finally, we introduced auxiliary classification data, such as elevation, accumulated temperature (AT), humidity index (HI) and rainfall. China's nation-wide grassland classification map was obtained by merging the grassland maps of the different climate zones. The overall classification accuracy is 60.4%. The result indicates that the expert classifier is suitable for nation-wide grassland classification, but the classification accuracy needs to be improved.
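
    A rule-based ("expert") classifier over a vegetation index can be sketched in a few lines. The NDVI formula is standard; everything else here is hypothetical: the class names, the threshold values, and the use of elevation as the only auxiliary variable. None of these come from the study.

```python
def ndvi(nir, red):
    """Normalised difference vegetation index from NIR and red reflectance."""
    return (nir - red) / (nir + red) if (nir + red) else 0.0

def classify_pixel(nir, red, elevation_m, rules):
    """Return the first class whose NDVI and elevation ranges contain the pixel."""
    value = ndvi(nir, red)
    for name, (ndvi_lo, ndvi_hi), (elev_lo, elev_hi) in rules:
        if ndvi_lo <= value < ndvi_hi and elev_lo <= elevation_m < elev_hi:
            return name
    return "unclassified"

# Hypothetical rule table: (class, NDVI range, elevation range in metres).
RULES = [
    ("grass_type_a", (0.3, 1.0), (3000, 9000)),
    ("grass_type_b", (0.2, 0.6), (0, 3000)),
    ("grass_type_c", (0.05, 0.2), (0, 3000)),
]

label = classify_pixel(nir=0.5, red=0.1, elevation_m=3500, rules=RULES)
```

    In a setup like the one described, a separate rule table per climate zone would be applied and the per-zone results merged afterwards.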

  5. Classification of Dukes' B and C colorectal cancers using expression arrays

    DEFF Research Database (Denmark)

    Frederiksen, C.M.; Knudsen, Steen; Laurberg, S.

    2003-01-01

    Purpose. Colorectal cancer is one of the most common malignancies. Substaging of the cancer is of importance not only to prognosis but also to treatment. Classification of substages based on DNA microarray technology is currently the most promising approach. We therefore investigated if gene...... expression microarrays could be used to classify colorectal tumors. Methods. We used the Affymetrix oligonucleotide arrays to analyze the expression of more than 5,000 genes in samples from the sigmoid and upper rectum of the left colon. Five samples were from normal mucosa and five samples from each...... expression of one of the most common malignancies, colorectal cancer, now seems to be within reach. The data indicates that it is possible at least to classify Dukes' B and C colorectal tumors with microarrays....

  6. Artificial neural networks as classification and diagnostic tools for lymph node-negative breast cancers

    Energy Technology Data Exchange (ETDEWEB)

    Eswari J, Satya; Chandrakar, Neha [National Institute of Technology Raipur, Raipur (India)

    2016-04-15

    Artificial neural networks (ANNs) can be used to develop a technique to classify lymph node-negative breast cancer that is prone to distant metastases based on gene expression signatures. The neural network used is a multilayered feed-forward network that employs the back-propagation algorithm. Once trained with DNA microarray-based gene expression profiles of genes that were predictive of distant metastasis recurrence of lymph node-negative breast cancer, the ANNs became capable of correctly classifying all samples and recognizing the genes most appropriate to the classification. To test the ability of the trained ANN models to recognize lymph node-negative breast cancer, we analyzed additional idle samples that had not been used in the training procedure and obtained correctly classified results in the validation set. For a more substantial result, bootstrapping of the training and testing datasets was performed as external validation. This study illustrates the potential application of ANN for breast tumor diagnosis and the identification of candidate targets in patients for therapy.

  7. Automatic classification of ovarian cancer types from cytological images using deep convolutional neural networks.

    Science.gov (United States)

    Wu, Miao; Yan, Chuanbo; Liu, Huiqiang; Liu, Qian

    2018-06-29

    Ovarian cancer is one of the most common gynecologic malignancies. Accurate classification of ovarian cancer types (serous carcinoma, mucinous carcinoma, endometrioid carcinoma, clear cell carcinoma) is an essential part of the differential diagnosis. Computer-aided diagnosis (CADx) can provide useful advice to help pathologists reach the correct diagnosis. In our study, we employed a deep convolutional neural network (DCNN) based on AlexNet to automatically classify the different types of ovarian cancer from cytological images. The DCNN consists of five convolutional layers, three max pooling layers, and two fully connected layers. We trained the model on two groups of input data separately: one was the original image data and the other was augmented image data, produced by image enhancement and image rotation. The testing results were obtained by 10-fold cross-validation and show that classification accuracy improved from 72.76% to 78.20% when augmented images were used as training data. The developed scheme was useful for classifying ovarian cancers from cytological images. © 2018 The Author(s).
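
    The 10-fold cross-validation protocol named in this abstract can be sketched as follows. This is a generic illustration, not the authors' pipeline: `train_fn` and `predict_fn` are hypothetical callables standing in for model training and inference, and the fold assignment by shuffled striding is one of several reasonable choices.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Split range(n_samples) into k disjoint test folds of near-equal size."""
    if n_samples < k:
        raise ValueError("need at least k samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at offset i.
    return [indices[i::k] for i in range(k)]


def cross_validate(samples, labels, train_fn, predict_fn, k=10, seed=0):
    """Return the mean test accuracy over k folds.

    train_fn(train_x, train_y) -> model and predict_fn(model, x) -> label
    are hypothetical callables supplied by the caller.
    """
    accuracies = []
    for test_idx in k_fold_indices(len(samples), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(samples)) if i not in held_out]
        model = train_fn([samples[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct = sum(predict_fn(model, samples[i]) == labels[i]
                      for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / len(accuracies)
```

    When augmentation is used, generating the augmented images inside `train_fn` (from the training fold only) keeps rotated or enhanced copies of a test image out of the training data.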

  8. A proposed data base system for detection, classification and ...

    African Journals Online (AJOL)

    A proposed data base system for detection, classification and location of fault on electricity company of Ghana electrical distribution system. Isaac Owusu-Nyarko, Mensah-Ananoo Eugine. Abstract. No Abstract. Keywords: database, classification of fault, power, distribution system, SCADA, ECG.

  9. AN OBJECT-BASED METHOD FOR CHINESE LANDFORM TYPES CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    H. Ding

    2016-06-01

    Full Text Available Landform classification is a necessary task for various fields of landscape and regional planning, for example landscape evaluation, erosion studies, and hazard prediction. This study proposes an improved object-based classification for Chinese landform types using the factor importance analysis of random forest and the gray-level co-occurrence matrix (GLCM). In this research, based on a 1 km DEM of China, combinations of the terrain factors extracted from the DEM are selected by correlation analysis and Sheffield's entropy method. A random forest classification tree is applied to evaluate the importance of the terrain factors, which are used as multi-scale segmentation thresholds. The GLCM is then computed to build the knowledge base for classification. The classification result was checked using the 1:4,000,000 Chinese Geomorphological Map as reference. The overall classification accuracy of the proposed method is 5.7% higher than ISODATA unsupervised classification and 15.7% higher than the traditional object-based classification method.
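
    For readers unfamiliar with the gray-level co-occurrence matrix, the following minimal sketch shows how one is counted and how a texture statistic (contrast) is derived from it. The offset, the tiny example raster, and the function names are illustrative assumptions; the abstract does not state which offsets or GLCM statistics the authors used.

```python
def glcm(image, dx=1, dy=0, levels=None):
    """Count how often gray level i has gray level j at offset (dx, dy).

    image is a list of rows of non-negative integer gray levels.
    """
    if levels is None:
        levels = max(max(row) for row in image) + 1
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            # Only count pairs whose neighbor lies inside the raster.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def glcm_contrast(matrix):
    """Contrast: co-occurrence-weighted mean of squared level differences."""
    total = sum(sum(row) for row in matrix)
    weighted = sum((i - j) ** 2 * count
                   for i, row in enumerate(matrix)
                   for j, count in enumerate(row))
    return weighted / total


if __name__ == "__main__":
    raster = [[0, 0, 1],
              [1, 1, 0]]
    m = glcm(raster)          # horizontal neighbor, one cell to the right
    print(m, glcm_contrast(m))
```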

  10. Does the use of the 2009 FIGO classification of endometrial cancer impact on indications of the sentinel node biopsy?

    Directory of Open Access Journals (Sweden)

    Ballester Marcos

    2010-08-01

    Full Text Available Abstract Background Lymphadenectomy is debated in early-stage endometrial cancer. Moreover, a new FIGO classification of endometrial cancer, merging stages IA and IB, has recently been published. Therefore, the aims of the present study were to evaluate the relevance of the sentinel node (SN) procedure in women with endometrial cancer and to discuss whether the use of the 2009 FIGO classification could modify the indications for the SN procedure. Methods Eighty-five patients with endometrial cancer underwent the SN procedure followed by pelvic lymphadenectomy. SNs were detected with a dual or single labelling method in 74 and 11 cases, respectively. All SNs were analysed by both H&E staining and immunohistochemistry. Presumed stage before surgery was assessed for all patients based on MR imaging features using the 1988 FIGO classification and the 2009 FIGO classification. Results An SN was detected in 88.2% of cases (75/85 women). Among the fourteen patients with lymph node metastases, one-half were detected by serial sectioning and immunohistochemical analysis. There were no false-negative cases. Using the 1988 FIGO classification and the 2009 FIGO classification, the correlation between preoperative MRI staging and final histology was moderate with Kappa = 0.24 and Kappa = 0.45, respectively. None of the patients with grade 1 endometrioid carcinoma on biopsy and IA 2009 FIGO stage on MR imaging exhibited a positive SN. In patients with grade 2-3 endometrioid carcinoma and stage IA on MR imaging, the rate of positive SN reached 16.6% with an incidence of micrometastases of 50%. Conclusions The present study suggests that sentinel node biopsy is an adequate technique to evaluate lymph node status. The use of the 2009 FIGO classification increases the accuracy of MR imaging in staging patients with early-stage endometrial cancer and helps clarify the indication for SN biopsy according to tumour grade and histological type.
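
    The agreement statistic reported above can be made concrete with a short sketch. It assumes the reported Kappa is Cohen's kappa for two raters (here, preoperative MRI stage versus final histological stage); the abstract does not name the variant, and the stage labels in the example are made up for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two label lists."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty lists of equal length")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected if both raters labelled independently at random
    # with their own marginal frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    mri = ["IA", "IA", "IB", "IB"]        # hypothetical preoperative stages
    histology = ["IA", "IB", "IB", "IB"]  # hypothetical final stages
    print(cohens_kappa(mri, histology))
```

    In this toy example three of four stages agree (observed agreement 0.75) while 0.5 agreement is expected by chance, so kappa is 0.5.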

  11. Does the use of the 2009 FIGO classification of endometrial cancer impact on indications of the sentinel node biopsy?

    International Nuclear Information System (INIS)

    Ballester, Marcos; Koskas, Martin; Coutant, Charles; Chéreau, Elisabeth; Seror, Jeremy; Rouzier, Roman; Daraï, Emile

    2010-01-01

    Lymphadenectomy is debated in early-stage endometrial cancer. Moreover, a new FIGO classification of endometrial cancer, merging stages IA and IB, has recently been published. Therefore, the aims of the present study were to evaluate the relevance of the sentinel node (SN) procedure in women with endometrial cancer and to discuss whether the use of the 2009 FIGO classification could modify the indications for the SN procedure. Eighty-five patients with endometrial cancer underwent the SN procedure followed by pelvic lymphadenectomy. SNs were detected with a dual or single labelling method in 74 and 11 cases, respectively. All SNs were analysed by both H&E staining and immunohistochemistry. Presumed stage before surgery was assessed for all patients based on MR imaging features using the 1988 FIGO classification and the 2009 FIGO classification. An SN was detected in 88.2% of cases (75/85 women). Among the fourteen patients with lymph node metastases, one-half were detected by serial sectioning and immunohistochemical analysis. There were no false-negative cases. Using the 1988 FIGO classification and the 2009 FIGO classification, the correlation between preoperative MRI staging and final histology was moderate with Kappa = 0.24 and Kappa = 0.45, respectively. None of the patients with grade 1 endometrioid carcinoma on biopsy and IA 2009 FIGO stage on MR imaging exhibited a positive SN. In patients with grade 2-3 endometrioid carcinoma and stage IA on MR imaging, the rate of positive SN reached 16.6% with an incidence of micrometastases of 50%. The present study suggests that sentinel node biopsy is an adequate technique to evaluate lymph node status. The use of the 2009 FIGO classification increases the accuracy of MR imaging in staging patients with early-stage endometrial cancer and helps clarify the indication for SN biopsy according to tumour grade and histological type.

  12. CrossLink: a novel method for cross-condition classification of cancer subtypes.

    Science.gov (United States)

    Ma, Chifeng; Sastry, Konduru S; Flore, Mario; Gehani, Salah; Al-Bozom, Issam; Feng, Yusheng; Serpedin, Erchin; Chouchane, Lotfi; Chen, Yidong; Huang, Yufei

    2016-08-22

    We considered the prediction of cancer classes (e.g. subtypes) using patient gene expression profiles that contain both systematic and condition-specific biases when compared with the training reference dataset. Conventional normalization-based approaches cannot guarantee that the gene signatures in the reference and prediction datasets always have the same distribution under all conditions, because the class-specific gene signatures change with the condition. As a result, a trained classifier may work well under one condition but not under another. To address this problem of current normalization approaches, we propose a novel algorithm called CrossLink (CL). CL recognizes that there is no universal, condition-independent normalization mapping of signatures. Instead, it exploits the fact that the signature is unique to its associated class under any condition and employs an unsupervised clustering algorithm to discover this unique signature. We assessed the performance of CL for cross-condition predictions of PAM50 subtypes of breast cancer by using a simulated dataset modeled after TCGA BRCA tumor samples with a cross-validation scheme, and datasets with known and unknown PAM50 classification. CL achieved prediction accuracy >73%, the highest among the methods we evaluated. We also applied the algorithm to a set of breast cancer tumors from an Arab population to assign a PAM50 classification to each tumor based on its gene expression profile. A novel algorithm, CrossLink, for cross-condition prediction of cancer classes was proposed. In all test datasets, CL showed robust and consistent improvement in prediction performance over other state-of-the-art normalization and classification algorithms.

  13. Improved prognostic classification of breast cancer defined by antagonistic activation patterns of immune response pathway modules

    International Nuclear Information System (INIS)

    Teschendorff, Andrew E; Gomez, Sergio; Arenas, Alex; El-Ashry, Dorraya; Schmidt, Marcus; Gehrmann, Mathias; Caldas, Carlos

    2010-01-01

    Elucidating the activation pattern of molecular pathways across a given tumour type is a key challenge necessary for understanding the heterogeneity in clinical response and for developing novel, more effective therapies. Gene expression signatures of molecular pathway activation derived from perturbation experiments in model systems as well as structural models of molecular interactions ('model signatures') constitute an important resource for estimating corresponding activation levels in tumours. However, relatively few strategies for estimating pathway activity from such model signatures exist, and only a few studies have used activation patterns of pathways to refine molecular classifications of cancer. Here we propose a novel network-based method for estimating pathway activation in tumours from model signatures. We find that although the pathway networks inferred from cancer expression data are highly consistent with the prior information contained in the model signatures, they also exhibit a highly modular structure, and that estimation of pathway activity is dependent on this modular structure. We apply our methodology to a panel of 438 estrogen receptor negative (ER-) and 785 estrogen receptor positive (ER+) breast cancers to infer activation patterns of important cancer-related molecular pathways. We show that in ER negative basal and HER2+ breast cancer, gene expression modules reflecting T-cell helper-1 (Th1) and T-cell helper-2 (Th2) mediated immune responses play antagonistic roles as major risk factors for distant metastasis. Using Boolean interaction Cox-regression models to identify non-linear pathway combinations associated with clinical outcome, we show that simultaneous high activation of Th1 and low activation of a TGF-beta pathway module defines a subtype of particularly good prognosis and that this classification provides a better prognostic model than those based on the individual pathways. In ER+ breast cancer, we find that

  14. Stepwise classification of cancer samples using clinical and molecular data

    Directory of Open Access Journals (Sweden)

    Obulkasim Askar

    2011-10-01

    Full Text Available Abstract Background Combining clinical and molecular data types may potentially improve the prediction accuracy of a classifier. However, there is currently a shortage of effective and efficient statistical and bioinformatic tools for true integrative data analysis. Existing integrative classifiers have two main disadvantages. First, coarse combination may cause subtle contributions of one data type to be overshadowed by more obvious contributions of the other. Second, the need to measure both data types for all patients may be both impractical and cost-inefficient. Results We introduce a novel classification method, a stepwise classifier, which takes advantage of the distinct classification power of clinical data and high-dimensional molecular data. We apply classification algorithms to the two data types independently, starting with the traditional clinical risk factors. We only turn to the relatively expensive molecular data when the uncertainty of the prediction from clinical data exceeds a predefined limit. Experimental results show that our approach is adaptive: the proportion of samples that need to be re-classified using molecular data depends on how much we expect the predictive accuracy to increase when re-classifying those samples. Conclusions Our method yields a more cost-efficient classifier that is at least as good as, and sometimes better than, one based on clinical or molecular data alone. Hence our approach is not just a classifier that minimizes a particular loss function. Instead, it aims to be cost-efficient by avoiding molecular tests for a potentially large subgroup of individuals; moreover, for these individuals a test result would be quickly available, which may lead to reduced waiting times (for diagnosis) and hence lower patients' distress. Stepwise classification is implemented in the R package stepwiseCM, available at the Bioconductor website.
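
    The stepwise idea described above (trust the clinical prediction unless it is too uncertain, and only then pay for molecular data) can be sketched in a few lines. This is not the stepwiseCM interface or its actual uncertainty measure: the function name, the probability-margin rule, and the `molecular_predict` callable are all assumptions made for illustration.

```python
def stepwise_predict(clinical_prob, molecular_predict, margin=0.2):
    """Return (label, used_molecular) for one patient.

    clinical_prob: probability of class 1 from a clinical-data model.
    molecular_predict: hypothetical zero-argument callable that runs the
        (expensive) molecular classifier and returns 0 or 1; it is only
        called when the clinical prediction is too uncertain.
    margin: assumed certainty limit; predictions closer than this to 0.5
        are treated as uncertain.
    """
    if abs(clinical_prob - 0.5) >= margin:
        return int(clinical_prob >= 0.5), False
    return molecular_predict(), True


if __name__ == "__main__":
    # A confident clinical prediction: the molecular test is never run.
    print(stepwise_predict(0.9, lambda: 0))
    # An uncertain clinical prediction: fall back to molecular data.
    print(stepwise_predict(0.55, lambda: 0))
```

    Passing the molecular classifier as a callable keeps the cost saving explicit: the molecular measurement is requested only for patients who fall inside the uncertainty band.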

  15. Classification of premalignant pancreatic cancer mass-spectrometry data using decision tree ensembles

    Directory of Open Access Journals (Sweden)

    Wong G William

    2008-06-01

    Full Text Available Abstract Background Pancreatic cancer is the fourth leading cause of cancer death in the United States. Consequently, identification of clinically relevant biomarkers for the early detection of this cancer type is urgently needed. In recent years, proteomics profiling techniques combined with various data analysis methods have been successfully used to gain critical insights into processes and mechanisms underlying pathologic conditions, particularly as they relate to cancer. However, the high dimensionality of proteomics data combined with their relatively small sample sizes poses a significant challenge to current data mining methodology, where many of the standard methods cannot be applied directly. Here, we propose a novel methodological framework using machine learning, in which decision tree-based classifier ensembles coupled with feature selection methods are applied to proteomics data generated from premalignant pancreatic cancer. Results This study explores the utility of three different feature selection schemas (Student t test, Wilcoxon rank sum test and genetic algorithm) to reduce the high dimensionality of a pancreatic cancer proteomic dataset. Using the top features selected by each method, we compared the prediction performance of a single decision tree algorithm, C4.5, with six different decision tree-based classifier ensembles (Random forest, Stacked generalization, Bagging, Adaboost, Logitboost and Multiboost). We show that ensemble classifiers always outperform a single decision tree classifier, with greater accuracies and smaller prediction errors, when applied to a pancreatic cancer proteomics dataset. Conclusion In our cross-validation framework, classifier ensembles generally have better classification accuracies than a single decision tree when applied to a pancreatic cancer proteomic dataset, suggesting their utility in future proteomics data analysis. Additionally, the use of feature selection

  16. Gene Expression Profiles for Predicting Metastasis in Breast Cancer: A Cross-Study Comparison of Classification Methods

    Directory of Open Access Journals (Sweden)

    Mark Burton

    2012-01-01

    Full Text Available Machine learning has increasingly been used with microarray gene expression data to develop classifiers using a variety of methods. However, method comparisons in cross-study datasets are very scarce. This study compares the performance of seven classification methods and the effect of voting for predicting metastasis outcome in breast cancer patients, in three situations: within the same dataset, or across datasets on similar or dissimilar microarray platforms. Combining the classification results from seven classifiers into one voting decision performed significantly better than the underlying classification methods during internal validation as well as during external validation on similar microarray platforms. When validating between different microarray platforms, random forest, another voting-based method, proved to be the best performing method. We conclude that voting-based classifiers provided an advantage with respect to classifying metastasis outcome in breast cancer patients.
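
    Combining several classifiers into one voting decision, as described above, can be sketched as a plain majority vote. The abstract does not specify the voting rule or how ties are handled, so the unweighted vote and the tie-break (the label of the earliest classifier among those tied) are assumptions.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier predictions into one label per sample.

    predictions: list of label lists, one list per classifier, all of the
    same length (one entry per sample).
    """
    voted = []
    for sample_votes in zip(*predictions):
        counts = Counter(sample_votes)
        top = max(counts.values())
        # Tie-break (assumed): keep the label of the earliest classifier
        # whose label reached the top count.
        voted.append(next(v for v in sample_votes if counts[v] == top))
    return voted


if __name__ == "__main__":
    # Three hypothetical classifiers, three patients (1 = metastasis).
    print(majority_vote([[1, 0, 1],
                         [1, 1, 0],
                         [0, 0, 1]]))
```

    With an odd number of binary classifiers, such as the seven compared in the study, a tie cannot occur.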

  17. Classification of Ovarian Cancer Surgery Facilitates Treatment Decisions in a Gynecological Multidisciplinary Team

    DEFF Research Database (Denmark)

    Bjørn, Signe Frahm; Schnack, Tine Henrichsen; Lajer, Henrik

    2017-01-01

    multidisciplinary team (MDT) decisions. Materials and Methods Four hundred eighteen women diagnosed with ovarian cancers (n = 351) or borderline tumors (n = 66) were selected for primary debulking surgery from January 2008 to July 2013. At an MDT meeting, women were allocated into 3 groups named "pre-COVA" 1 to 3...... classifying the expected extent of the primary surgery and need for postoperative care. On the basis of the operative procedures performed, women were allocated into 1 of the 3 corresponding COVA 1 to 3 groups. The outcome measure was the predictive value of the pre-COVA score compared with the actual COVA......-COVA classification predicted the actual COVA group in 79 (49%) FIGO stages I to IIIB and in 85 (45%) FIGO stages IIIC to IV. Conclusions The COVA classification system is a simple and useful tool in the MDT setting where specialists make treatment decisions based on advanced technology. The use of pre...

  18. Prediction of Depression in Cancer Patients With Different Classification Criteria, Linear Discriminant Analysis versus Logistic Regression.

    Science.gov (United States)

    Shayan, Zahra; Mohammad Gholi Mezerji, Naser; Shayan, Leila; Naseri, Parisa

    2015-11-03

    Logistic regression (LR) and linear discriminant analysis (LDA) are two popular statistical models for prediction of group membership. Although they are very similar, LDA makes more assumptions about the data. When categorical and continuous variables are used simultaneously, the optimal choice between the two models is questionable. In most studies, classification error (CE) is used to discriminate between subjects in several groups, but this index is not suitable for predicting the accuracy of the outcome. The present study compared LR and LDA models using classification indices. This cross-sectional study selected 243 cancer patients. Sample sets of different sizes (n = 50, 100, 150, 200, 220) were randomly selected and the CE, B, and Q classification indices were calculated by the LR and LDA models. CE revealed a lack of superiority of one model over the other, but the results showed that LR performed better than LDA for the B and Q indices in all situations. No significant effect of sample size on CE was noted for selection of an optimal model. Assessment of the accuracy of prediction of real data indicated that the B and Q indices are appropriate for selection of an optimal model. The results of this study showed that LR performs better in some cases and LDA in others when based on CE. The CE index is not appropriate for classification, whereas the B and Q indices performed better and offered more efficient criteria for comparison and discrimination between groups.

  19. A Classification-based Review Recommender

    Science.gov (United States)

    O'Mahony, Michael P.; Smyth, Barry

    Many online stores encourage their users to submit product/service reviews in order to guide future purchasing decisions. These reviews are often listed alongside product recommendations but, to date, limited attention has been paid to how best to present these reviews to the end-user. In this paper, we describe a supervised classification approach that is designed to identify and recommend the most helpful product reviews. Using the TripAdvisor service as a case study, we compare the performance of several classification techniques using a range of features derived from hotel reviews. We then describe how these classifiers can be used as the basis for a practical recommender that automatically suggests the most helpful contrasting reviews to end-users. We present an empirical evaluation which shows that our approach achieves a statistically significant improvement over alternative review ranking schemes.

  20. Text document classification based on mixture models

    Czech Academy of Sciences Publication Activity Database

    Novovičová, Jana; Malík, Antonín

    2004-01-01

    Vol. 40, No. 3 (2004), pp. 293-304 ISSN 0023-5954 R&D Projects: GA AV ČR IAA2075302; GA ČR GA102/03/0049; GA AV ČR KSK1019101 Institutional research plan: CEZ:AV0Z1075907 Keywords: text classification * text categorization * multinomial mixture model Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.224, year: 2004

  1. Classification

    Science.gov (United States)

    Clary, Renee; Wandersee, James

    2013-01-01

    In this article, Renee Clary and James Wandersee describe the beginnings of "Classification," which lies at the very heart of science and depends upon pattern recognition. Clary and Wandersee approach patterns by first telling the story of the "Linnaean classification system," introduced by Carl Linnaeus (1707-1778), who is…

  2. Improving breast cancer classification with mammography, supported on an appropriate variable selection analysis

    Science.gov (United States)

    Pérez, Noel; Guevara, Miguel A.; Silva, Augusto

    2013-02-01

    This work addresses the issue of variable selection within the context of breast cancer classification with mammography. A comprehensive repository of feature vectors was used, including a hybrid subset gathering image-based and clinical features. The aim was to gather experimental evidence on variable selection in terms of cardinality and type, and to find a classification scheme that provides the best Area Under the Receiver Operating Characteristic Curve (AUC) scores using the ranked feature subsets. We evaluated and classified a total of 300 subsets of features formed by the application of Chi-Square Discretization, Information-Gain, One-Rule and RELIEF methods in association with Feed-Forward Backpropagation Neural Network (FFBP), Support Vector Machine (SVM) and Decision Tree J48 (DTJ48) Machine Learning Algorithms (MLA) for a comparative performance evaluation based on AUC scores. A variable selection analysis was performed for Single-View Ranking and Multi-View Ranking groups of features. Feature subsets representing Microcalcifications (MCs), Masses, and both MCs and Masses lesions achieved AUC scores of 0.91, 0.954 and 0.934, respectively. Experimental evidence demonstrated that classification performance was improved by combining image-based and clinical features. The most important clinical and image-based features were StromaDistortion and Circularity, respectively. Other features, less important but worth using because of their consistency, were Contrast, Perimeter, Microcalcification, Correlation and Elongation.
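
    Since every comparison in this abstract is expressed as an AUC score, a minimal sketch of how AUC can be computed from classifier scores may help. It uses the standard rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, ties counting one half); the function name and the example scores are illustrative only.

```python
def auc_score(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for positive (e.g. malignant), 0 for negative.
    scores: classifier outputs, higher meaning more likely positive.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    # Each positive/negative pair counts 1 if ranked correctly, 0.5 if tied.
    wins = sum((p > q) + 0.5 * (p == q)
               for p in positives for q in negatives)
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    print(auc_score([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

    The pairwise form is quadratic in the number of samples, which is fine for small feature-ranking experiments; a sort-based rank sum gives the same value for larger sets.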

  3. Three-class classification in computer-aided diagnosis of breast cancer by support vector machine

    Science.gov (United States)

    Sun, Xuejun; Qian, Wei; Song, Dansheng

    2004-05-01

    The design of the classifier in a computer-aided diagnosis (CAD) scheme for breast cancer plays an important role in the scheme's overall sensitivity and specificity. Classification of a detected object on a mammogram as a malignant lesion, a benign lesion, or normal tissue is a typical three-class pattern recognition problem. This paper presents a three-class classification approach that uses a two-stage classifier combined with the support vector machine (SVM) learning algorithm for classification of breast cancer on mammograms. The first classification stage separates abnormal areas from normal breast tissue, and the second stage classifies the detected abnormal objects as malignant or benign. A series of spatial, morphological and texture features was extracted from the detected object areas. Using a genetic algorithm (GA), different feature groups were investigated for the different classification stages. Computerized free-response receiver operating characteristic (FROC) and receiver operating characteristic (ROC) analyses were employed in the different classification stages. The results showed a clear improvement in both sensitivity and specificity for the proposed approach compared with conventional two-class classification approaches, indicating its effectiveness in classification of breast cancer on mammograms.
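
    The two-stage structure described above can be sketched as a simple cascade. The stage classifiers are passed in as hypothetical callables (in the paper they are SVMs, each with its own GA-selected feature group); the feature names and thresholds in the example are invented for illustration.

```python
def two_stage_classify(features, is_abnormal, is_malignant):
    """Three-class decision built from two binary classifiers.

    is_abnormal(features) -> bool   stage 1: abnormal area vs. normal tissue
    is_malignant(features) -> bool  stage 2: malignant vs. benign, only
                                    evaluated for objects flagged in stage 1
    """
    if not is_abnormal(features):
        return "normal"
    return "malignant" if is_malignant(features) else "benign"


if __name__ == "__main__":
    # Hypothetical stand-ins for the two trained stage classifiers.
    stage1 = lambda f: f["abnormality"] > 0.5
    stage2 = lambda f: f["malignancy"] > 0.5
    print(two_stage_classify({"abnormality": 0.8, "malignancy": 0.3},
                             stage1, stage2))
```

    Because each stage is a separate binary problem, each can be trained on the feature subset that best serves its own decision.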

  4. TENSOR MODELING BASED FOR AIRBORNE LiDAR DATA CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    N. Li

    2016-06-01

    Full Text Available Feature selection and description is a key factor in the classification of Earth observation data. In this paper a classification method based on tensor decomposition is proposed. First, multiple features are extracted from the raw LiDAR point cloud, and raster LiDAR images are derived by accumulating features or the “raw” data attributes. Then, the feature rasters of the LiDAR data are stored as a tensor, and tensor decomposition is used to select component features. This tensor representation preserves the initial spatial structure and ensures that the neighborhood is taken into account. Based on a small number of component features, a k-nearest-neighbor classification is applied.
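
    The final k-nearest-neighbor step can be sketched as follows, assuming Euclidean distance on the component features and an unweighted vote; the abstract does not state the distance metric or the value of k, and the feature vectors and class names in the example are made up.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label a query point by majority vote among its k nearest neighbors."""
    # Sort training points by Euclidean distance to the query.
    by_distance = sorted((math.dist(x, query), y)
                         for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical two-component feature vectors per raster cell.
    cells = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
    classes = ["ground", "ground", "building", "building"]
    print(knn_predict(cells, classes, (0.2, 0.3), k=3))
```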

  5. MO-DE-207B-03: Improved Cancer Classification Using Patient-Specific Biological Pathway Information Via Gene Expression Data

    Energy Technology Data Exchange (ETDEWEB)

    Young, M; Craft, D [Massachusetts General Hospital and Harvard Medical School, Boston, MA (United States)

    2016-06-15

    Purpose: To develop an efficient, pathway-based classification system using network biology statistics to assist in patient-specific response predictions to radiation and drug therapies across multiple cancer types. Methods: We developed PICS (Pathway Informed Classification System), a novel two-step cancer classification algorithm. In PICS, a matrix m of mRNA expression values for a patient cohort is collapsed into a matrix p of biological pathways. The entries of p, which we term pathway scores, are obtained from either principal component analysis (PCA), normal tissue centroid (NTC), or gene expression deviation (GED). The pathway score matrix is clustered using both k-means and hierarchical clustering, and a clustering is judged by how well it groups patients into distinct survival classes. The most effective pathway scoring/clustering combination, per clustering p-value, thus generates various ‘signatures’ for conventional and functional cancer classification. Results: PICS successfully regularized high-dimensional gene data, separated normal and cancerous tissues, and clustered a large patient cohort spanning six cancer types. Furthermore, PICS clustered patient cohorts into distinct, statistically significant survival groups. For a suboptimally debulked ovarian cancer set, the pathway-classified Kaplan-Meier survival curve (p = .00127) showed significant improvement over that of a prior gene expression-classified study (p = .0179). For a pancreatic cancer set, the pathway-classified Kaplan-Meier survival curve (p = .00141) showed significant improvement over that of a prior gene expression-classified study (p = .04). Pathway-based classification confirmed biomarkers for the pyrimidine, WNT-signaling, glycerophosphoglycerol, beta-alanine, and pantothenic acid pathways for ovarian cancer. Despite its robust nature, PICS requires significantly less run time than current pathway scoring methods. Conclusion: This work validates the PICS method to improve

  6. Cellular based cancer vaccines

    DEFF Research Database (Denmark)

    Hansen, M; Met, Ö; Svane, I M

    2012-01-01

    Cancer vaccines designed to re-calibrate the existing host-tumour interaction, tipping the balance from tumor acceptance towards tumor control, hold huge potential to complement traditional cancer therapies. In general, limited success has been achieved with vaccines composed of tumor...... to transiently affect in vitro migration via autocrine receptor-mediated endocytosis of CCR7. In the current review, we discuss optimal design of DC maturation focused on pre-clinical as well as clinical results from standard and polarized dendritic cell based cancer vaccines....

  7. Iris Image Classification Based on Hierarchical Visual Codebook.

    Science.gov (United States)

    Zhenan Sun; Hui Zhang; Tieniu Tan; Jianyu Wang

    2014-06-01

    Iris recognition, as a reliable method for personal identification, has been well studied with the objective of assigning the class label of each iris image to a unique subject. In contrast, iris image classification aims to classify an iris image into an application-specific category, e.g., iris liveness detection (classification of genuine and fake iris images), race classification (e.g., classification of iris images of Asian and non-Asian subjects), and coarse-to-fine iris identification (classification of all iris images in the central database into multiple categories). This paper proposes a general framework for iris image classification based on texture analysis. A novel texture pattern representation method called Hierarchical Visual Codebook (HVC) is proposed to encode the texture primitives of iris images. The proposed HVC method is an integration of two existing Bag-of-Words models, namely Vocabulary Tree (VT) and Locality-constrained Linear Coding (LLC). The HVC adopts a coarse-to-fine visual coding strategy and takes advantage of both VT and LLC for accurate and sparse representation of iris texture. Extensive experimental results demonstrate that the proposed iris image classification method achieves state-of-the-art performance for iris liveness detection, race classification, and coarse-to-fine iris identification. A comprehensive fake iris image database simulating four types of iris spoof attacks is developed as the benchmark for research on iris liveness detection.

  8. Classification

    DEFF Research Database (Denmark)

    Hjørland, Birger

    2017-01-01

    This article presents and discusses definitions of the term “classification” and the related concepts “concept/conceptualization,” “categorization,” “ordering,” “taxonomy” and “typology.” It further presents and discusses theories of classification including the influences of Aristotle...... and Wittgenstein. It presents different views on forming classes, including logical division, numerical taxonomy, historical classification, hermeneutical and pragmatic/critical views. Finally, issues related to artificial versus natural classification and taxonomic monism versus taxonomic pluralism are briefly...

  9. Fluorescently labeled bevacizumab in human breast cancer: defining the classification threshold

    Science.gov (United States)

    Koch, Maximilian; de Jong, Johannes S.; Glatz, Jürgen; Symvoulidis, Panagiotis; Lamberts, Laetitia E.; Adams, Arthur L. L.; Kranendonk, Mariëtte E. G.; Terwisscha van Scheltinga, Anton G. T.; Aichler, Michaela; Jansen, Liesbeth; de Vries, Jakob; Lub-de Hooge, Marjolijn N.; Schröder, Carolien P.; Jorritsma-Smit, Annelies; Linssen, Matthijs D.; de Boer, Esther; van der Vegt, Bert; Nagengast, Wouter B.; Elias, Sjoerd G.; Oliveira, Sabrina; Witkamp, Arjen J.; Mali, Willem P. Th. M.; Van der Wall, Elsken; Garcia-Allende, P. Beatriz; van Diest, Paul J.; de Vries, Elisabeth G. E.; Walch, Axel; van Dam, Gooitzen M.; Ntziachristos, Vasilis

    2017-07-01

    In-vivo fluorescently labelled drug (bevacizumab) breast cancer specimens were obtained from patients. We propose a new structured method to determine the optimal classification threshold in targeted fluorescence intra-operative imaging.

  10. Ebolavirus Classification Based on Natural Vectors

    Science.gov (United States)

    Zheng, Hui; Yin, Changchuan; Hoang, Tung; He, Rong Lucy; Yang, Jie

    2015-01-01

    According to the WHO, ebolaviruses have resulted in 8818 human deaths in West Africa as of January 2015. To better understand the evolutionary relationship of the ebolaviruses and infer virulence from the relationship, we applied the alignment-free natural vector method to classify the newest ebolaviruses. The dataset includes three new Guinea viruses as well as 99 viruses from Sierra Leone. For the viruses of the family of Filoviridae, both genus label classification and species label classification achieve an accuracy rate of 100%. We represented the relationships among Filoviridae viruses by Unweighted Pair Group Method with Arithmetic Mean (UPGMA) phylogenetic trees and found that the filoviruses can be separated well by three genera. We performed the phylogenetic analysis on the relationship among different species of Ebolavirus by their coding-complete genomes and seven viral protein genes (glycoprotein [GP], nucleoprotein [NP], VP24, VP30, VP35, VP40, and RNA polymerase [L]). The topology of the phylogenetic tree by the viral protein VP24 shows consistency with the variations of virulence of ebolaviruses. The result suggests that VP24 be a pharmaceutical target for treating or preventing ebolaviruses. PMID:25803489
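
    The alignment-free representation mentioned above can be illustrated with a minimal sketch. It assumes the commonly used 12-dimensional natural vector (per-nucleotide counts, mean positions, and normalized second central moments) and a nearest-neighbour rule under Euclidean distance; the function names and the toy sequences are hypothetical, and the abstract does not state that the authors used exactly this variant.

```python
import math

def natural_vector(seq, alphabet="ACGT"):
    """12-dim natural vector: counts, mean positions, normalized 2nd central moments.

    Positions are 1-based. The second moment of nucleotide k is divided by
    n_k * n (n = sequence length), the usual normalization; this is an assumption.
    """
    n = len(seq)
    positions = {k: [] for k in alphabet}
    for i, ch in enumerate(seq.upper(), start=1):
        if ch in positions:
            positions[ch].append(i)
    counts, means, moments = [], [], []
    for k in alphabet:
        pos = positions[k]
        nk = len(pos)
        mu = sum(pos) / nk if nk else 0.0
        d2 = sum((p - mu) ** 2 for p in pos) / (nk * n) if nk else 0.0
        counts.append(float(nk))
        means.append(mu)
        moments.append(d2)
    return counts + means + moments

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def nearest_label(query_seq, labelled):
    """Label of the reference sequence whose natural vector is closest to the query."""
    q = natural_vector(query_seq)
    return min(labelled, key=lambda item: euclidean(q, natural_vector(item[1])))[0]
```

    Because every sequence maps to a fixed-length vector, no alignment is needed before distances are computed, which is what makes the approach fast on whole genomes.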

  11. EPA's program for risk assessment guidelines: Cancer classification issues

    Energy Technology Data Exchange (ETDEWEB)

    Wiltse, J. [Environmental Protection Agency, Washington, DC (United States)]

    1990-12-31

    Issues presented are related to classification of weight of evidence in cancer risk assessments. The focus in this paper is on lines of evidence used in constructing a conclusion about potential human carcinogenicity. The paper also discusses issues that are mistakenly addressed as classification issues but are really part of the risk assessment process. 2 figs.

  12. Hot complaint intelligent classification based on text mining

    Directory of Open Access Journals (Sweden)

    XIA Haifeng

    2013-10-01

    Full Text Available The complaint recognizer system plays an important role in ensuring the correct classification of hot complaints and in improving the service quality of the telecommunications industry. Customer complaints in the telecommunications industry have the particular requirement that they be handled within a limited time, which causes errors in the classification of hot complaints. The paper presents a model for intelligent hot complaint classification based on text mining, which can classify a hot complaint at the correct level of the complaint navigation. The examples show that the model classifies complaint text efficiently.

  13. Radar Target Classification using Recursive Knowledge-Based Methods

    DEFF Research Database (Denmark)

    Jochumsen, Lars Wurtz

    The topic of this thesis is target classification of radar tracks from a 2D mechanically scanning coastal surveillance radar. The measurements provided by the radar are position data and therefore the classification is mainly based on kinematic data, which is deduced from the position. The target...... been terminated. Therefore, an update of the classification results must be made for each measurement of the target. The data for this work are collected throughout the PhD and are both collected from radars and other sensors such as GPS....

  14. Key-phrase based classification of public health web pages.

    Science.gov (United States)

    Dolamic, Ljiljana; Boyer, Célia

    2013-01-01

    This paper describes and evaluates a classification model for public health web pages based on key phrase extraction and matching. Easily extensible to both new classes and new languages, this method proves to be a good solution for text classification when training data are entirely lacking. To evaluate the proposed solution we used a small collection of public health related web pages created by a double-blind manual classification. Our experiments have shown that by choosing an adequate threshold value, the desired value for either precision or recall can be achieved.
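
    The precision/recall trade-off controlled by the threshold can be sketched as follows. This is an illustration only: the scoring rule (fraction of a class's key phrases found in the page text), the phrase lists, and all function names are assumptions, not the authors' model.

```python
def classify(text, class_phrases, threshold):
    """Return the classes whose fraction of matched key phrases reaches the threshold."""
    text = text.lower()
    assigned = set()
    for label, phrases in class_phrases.items():
        hits = sum(1 for p in phrases if p.lower() in text)
        if phrases and hits / len(phrases) >= threshold:
            assigned.add(label)
    return assigned

def precision_recall(predicted, gold):
    """Micro-averaged precision and recall over parallel lists of label sets."""
    tp = sum(len(p & g) for p, g in zip(predicted, gold))
    n_pred = sum(len(p) for p in predicted)
    n_gold = sum(len(g) for g in gold)
    precision = tp / n_pred if n_pred else 1.0
    recall = tp / n_gold if n_gold else 1.0
    return precision, recall

# Hypothetical key phrases and pages, for illustration only.
PHRASES = {
    "vaccination": ["vaccine", "immunization", "booster dose"],
    "nutrition": ["diet", "vitamin", "calorie intake"],
}
PAGES = [
    "Vaccine schedules and booster dose timing; a healthy diet helps.",
    "Vitamin supplements and diet plans reduce calorie intake.",
]
GOLD = [{"vaccination"}, {"nutrition"}]

def evaluate(threshold):
    """Precision and recall on the toy pages at a given threshold."""
    predicted = [classify(page, PHRASES, threshold) for page in PAGES]
    return precision_recall(predicted, GOLD)
```

    On this toy data a low threshold favours recall at the cost of precision, and a high threshold does the opposite, which is the behaviour the abstract describes.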

  15. Gene expression classification of colon cancer into molecular subtypes: characterization, validation, and prognostic value.

    Directory of Open Access Journals (Sweden)

    Laetitia Marisa

    Full Text Available Colon cancer (CC) pathological staging fails to accurately predict recurrence, and to date, no gene expression signature has proven reliable for prognosis stratification in clinical practice, perhaps because CC is a heterogeneous disease. The aim of this study was to establish a comprehensive molecular classification of CC based on mRNA expression profile analyses. Fresh-frozen primary tumor samples from a large multicenter cohort of 750 patients with stage I to IV CC who underwent surgery between 1987 and 2007 in seven centers were characterized for common DNA alterations, including BRAF, KRAS, and TP53 mutations, CpG island methylator phenotype, mismatch repair status, and chromosomal instability status, and were screened with whole genome and transcriptome arrays. 566 samples fulfilled RNA quality requirements. Unsupervised consensus hierarchical clustering applied to gene expression data from a discovery subset of 443 CC samples identified six molecular subtypes. These subtypes were associated with distinct clinicopathological characteristics, molecular alterations, specific enrichments of supervised gene expression signatures (stem cell phenotype-like, normal-like, serrated CC phenotype-like), and deregulated signaling pathways. Based on their main biological characteristics, we distinguished a deficient mismatch repair subtype, a KRAS mutant subtype, a cancer stem cell subtype, and three chromosomal instability subtypes, including one associated with down-regulated immune pathways, one with up-regulation of the Wnt pathway, and one displaying a normal-like gene expression profile. The classification was validated in the remaining 123 samples plus an independent set of 1,058 CC samples, including eight public datasets. Furthermore, prognosis was analyzed in the subset of stage II-III CC samples. The subtypes C4 and C6, but not the subtypes C1, C2, C3, and C5, were independently associated with shorter relapse-free survival, even after

  16. Empirical Studies On Machine Learning Based Text Classification Algorithms

    OpenAIRE

    Shweta C. Dharmadhikari; Maya Ingle; Parag Kulkarni

    2011-01-01

    Automatic classification of text documents has become an important research issue nowadays. Proper classification of text documents requires information retrieval, machine learning and Natural language processing (NLP) techniques. Our aim is to focus on important approaches to automatic text classification based on machine learning techniques viz. supervised, unsupervised and semi supervised. In this paper we present a review of various text classification approaches under machine learning paradig...

  17. Immunogenomic Classification of Colorectal Cancer and Therapeutic Implications

    Directory of Open Access Journals (Sweden)

    Jessica Roelands

    2017-10-01

    Full Text Available The immune system has a substantial effect on colorectal cancer (CRC) progression. Additionally, the response to immunotherapeutics and conventional treatment options (e.g., chemotherapy, radiotherapy and targeted therapies) is influenced by the immune system. The molecular characterization of CRC has led to the identification of favorable and unfavorable immunological attributes linked to clinical outcome. With the definition of consensus molecular subtypes (CMSs) based on transcriptomic profiles, multiple characteristics have been proposed to be responsible for the development of the tumor immune microenvironment and corresponding mechanisms of immune escape. In this review, a detailed description of proposed immune phenotypes as well as their interaction with different therapeutic modalities will be provided. Finally, possible strategies to shift the CRC immune phenotype towards a reactive, anti-tumor orientation are proposed per CMS.

  18. Polarimetric SAR image classification based on discriminative dictionary learning model

    Science.gov (United States)

    Sang, Cheng Wei; Sun, Hong

    2018-03-01

    Polarimetric SAR (PolSAR) image classification is one of the important applications of PolSAR remote sensing. It is a difficult high-dimensional nonlinear mapping problem, and sparse representations based on a learned overcomplete dictionary have shown great potential for solving it. The overcomplete dictionary plays an important role in PolSAR image classification. However, in complex PolSAR scenes, features shared by different classes weaken the discrimination of the learned dictionary and thus degrade classification performance. In this paper, we propose a novel overcomplete dictionary learning model to enhance the discrimination of the dictionary. The overcomplete dictionary learned by the proposed model is more discriminative and well suited to PolSAR classification.

  19. Semantic Document Image Classification Based on Valuable Text Pattern

    Directory of Open Access Journals (Sweden)

    Hossein Pourghassem

    2011-01-01

    Full Text Available Knowledge extraction from detected document images is a complex problem in the field of information technology. This problem becomes more intricate when we know that only a negligible percentage of the detected document images are valuable. In this paper, a segmentation-based classification algorithm is used to analyze the document image. In this algorithm, using a two-stage segmentation approach, regions of the image are detected and then classified into document and non-document (pure region) regions in a hierarchical classification. In this paper, a novel definition of "valuable" is proposed to classify document images into valuable or invaluable categories. The proposed algorithm is evaluated on a database of document and non-document images obtained from the Internet. Experimental results show the efficiency of the proposed algorithm in semantic document image classification. The proposed algorithm provides an accuracy rate of 98.8% for the valuable and invaluable document image classification problem.

  20. Video based object representation and classification using multiple covariance matrices.

    Science.gov (United States)

    Zhang, Yurong; Liu, Quan

    2017-01-01

    Video based object recognition and classification has been widely studied in the computer vision and image processing area. One main issue of this task is to develop an effective representation for video. This problem can generally be formulated as image set representation. In this paper, we present a new method called Multiple Covariance Discriminative Learning (MCDL) for the image set representation and classification problem. The core idea of MCDL is to represent an image set using multiple covariance matrices, with each covariance matrix representing one cluster of images. First, we use the Nonnegative Matrix Factorization (NMF) method to do image clustering within each image set, and then adopt Covariance Discriminative Learning on each cluster (subset) of images. Finally, we adopt KLDA and the nearest neighbor classification method for image set classification. Promising experimental results on several datasets show the effectiveness of our MCDL method.

  1. PCA based feature reduction to improve the accuracy of decision tree c4.5 classification

    Science.gov (United States)

    Nasution, M. Z. F.; Sitompul, O. S.; Ramli, M.

    2018-03-01

    Attribute splitting is a major process in Decision Tree C4.5 classification. However, this process does not have a significant impact on the construction of the decision tree in terms of removing irrelevant features. A major problem in the decision tree classification process is over-fitting, which results from noisy data and irrelevant features. In turn, over-fitting creates misclassification and data imbalance. Many algorithms have been proposed to overcome misclassification and over-fitting in Decision Tree C4.5 classification. Feature reduction is one of the important issues in classification models; it is intended to remove irrelevant data in order to improve accuracy. The feature reduction framework is used to simplify high dimensional data to low dimensional data with non-correlated attributes. In this research, we propose a framework for selecting relevant and non-correlated feature subsets. We consider principal component analysis (PCA) for feature reduction to perform non-correlated feature selection and the Decision Tree C4.5 algorithm for the classification. In experiments on the cervical cancer data set from the UCI repository, with 858 instances and 36 attributes, we evaluated the performance of our framework based on accuracy, specificity and precision. Experimental results show that our proposed framework robustly enhances classification accuracy, reaching an accuracy of 90.70%.

  2. Classification of BCI Users Based on Cognition

    Directory of Open Access Journals (Sweden)

    N. Firat Ozkan

    2018-01-01

    Full Text Available Brain-Computer Interfaces (BCI) are systems originally developed to assist paralyzed patients, allowing commands to be given to the computer through brain activity. This study aims to examine cognitive state with an objective, easy-to-use, and easy-to-interpret method utilizing Brain-Computer Interface systems. Seventy healthy participants completed six tasks using a Brain-Computer Interface system, and participants’ pupil dilation, blink rate, and Galvanic Skin Response (GSR) data were collected simultaneously. Participants filled in NASA-TLX forms following each task, and the task performances of participants were also measured. Cognitive state clusters were created from the collected data using the K-means method. Taking these clusters and task performances into account, the general cognitive state of each participant was classified as low risk or high risk. Logistic Regression, Decision Tree, and Neural Networks were also used to classify the same data in order to measure the consistency of this classification with other techniques, and the method showed a consistency between 87.1% and 100% with other techniques.
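
    The K-means step can be sketched in a few lines. This is a generic Lloyd-style K-means with a deterministic initialization (the first k points), chosen here only for reproducibility; the feature values are made up, and the study's actual preprocessing and number of clusters are not given in the abstract.

```python
def kmeans(points, k, iters=100):
    """Plain K-means; returns (labels, centroids).

    Initial centroids are the first k points, an assumption made so the
    sketch is deterministic. Real use would normally randomize and restart.
    """
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:  # assignments stable, so the algorithm has converged
            break
        labels = new_labels
    return labels, centroids

# Hypothetical, already normalized (pupil dilation, GSR) pairs for four participants.
FEATURES = [(0.1, 0.2), (0.9, 0.8), (0.2, 0.1), (0.8, 0.9)]
LABELS, CENTROIDS = kmeans(FEATURES, k=2)
```

    In the study the resulting clusters were then combined with task performance to label each participant as low or high risk; that second step is not reproduced here.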

  3. On the International Agency for Research on Cancer classification of glyphosate as a probable human carcinogen.

    Science.gov (United States)

    Tarone, Robert E

    2018-01-01

    The recent classification by International Agency for Research on Cancer (IARC) of the herbicide glyphosate as a probable human carcinogen has generated considerable discussion. The classification is at variance with evaluations of the carcinogenic potential of glyphosate by several national and international regulatory bodies. The basis for the IARC classification is examined under the assumptions that the IARC criteria are reasonable and that the body of scientific studies determined by IARC staff to be relevant to the evaluation of glyphosate by the Monograph Working Group is sufficiently complete. It is shown that the classification of glyphosate as a probable human carcinogen was the result of a flawed and incomplete summary of the experimental evidence evaluated by the Working Group. Rational and effective cancer prevention activities depend on scientifically sound and unbiased assessments of the carcinogenic potential of suspected agents. Implications of the erroneous classification of glyphosate with respect to the IARC Monograph Working Group deliberative process are discussed.

  4. The Study of Land Use Classification Based on SPOT6 High Resolution Data

    OpenAIRE

    Wu Song; Jiang Qigang

    2016-01-01

    A method is presented for quickly extracting and classifying land use types in agricultural areas. It is based on SPOT6 high-resolution remote sensing data and exploits the good nonlinear classification ability of the support vector machine. The results show that the SPOT6 high-resolution remote sensing data can support efficient land classification: the overall classification accuracy reached 88.79% and the Kappa coefficient is 0.8632, which means that the classif...
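
    For readers unfamiliar with the two figures reported above, the following sketch shows how overall accuracy and Cohen's Kappa are computed from a confusion matrix. The matrix is invented for illustration and is unrelated to the SPOT6 experiment.

```python
def overall_accuracy_and_kappa(cm):
    """Overall accuracy and Cohen's Kappa from a square confusion matrix.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in cm)
    observed = sum(cm[i][i] for i in range(len(cm))) / total
    row_sums = [sum(row) for row in cm]
    col_sums = [sum(col) for col in zip(*cm)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical two-class confusion matrix (e.g. cropland vs. other).
ACCURACY, KAPPA = overall_accuracy_and_kappa([[45, 5], [10, 40]])
```

    Kappa discounts the agreement that would occur by chance, which is why it is lower than the overall accuracy for the same matrix.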

  5. Efficacy of hidden markov model over support vector machine on multiclass classification of healthy and cancerous cervical tissues

    Science.gov (United States)

    Mukhopadhyay, Sabyasachi; Kurmi, Indrajit; Pratiher, Sawon; Mukherjee, Sukanya; Barman, Ritwik; Ghosh, Nirmalya; Panigrahi, Prasanta K.

    2018-02-01

    In this paper, a comparative study between SVM and HMM has been carried out for multiclass classification of cervical healthy and cancerous tissues. In our study, the HMM methodology is more promising to produce higher accuracy in classification.

  6. Recurrent neural networks for breast lesion classification based on DCE-MRIs

    Science.gov (United States)

    Antropova, Natasha; Huynh, Benjamin; Giger, Maryellen

    2018-02-01

    Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) plays a significant role in breast cancer screening, cancer staging, and monitoring response to therapy. Recently, deep learning methods are being rapidly incorporated in image-based breast cancer diagnosis and prognosis. However, most of the current deep learning methods make clinical decisions based on 2-dimensional (2D) or 3D images and are not well suited for temporal image data. In this study, we develop a deep learning methodology that enables integration of clinically valuable temporal components of DCE-MRIs into deep learning-based lesion classification. Our work is performed on a database of 703 DCE-MRI cases for the task of distinguishing benign and malignant lesions, and uses the area under the ROC curve (AUC) as the performance metric in conducting that task. We train a recurrent neural network, specifically a long short-term memory network (LSTM), on sequences of image features extracted from the dynamic MRI sequences. These features are extracted with VGGNet, a convolutional neural network pre-trained on a large dataset of natural images (ImageNet). The features are obtained from various levels of the network, to capture low-, mid-, and high-level information about the lesion. Compared to a classification method that takes as input only images at a single time-point (yielding an AUC = 0.81 (se = 0.04)), our LSTM method improves lesion classification with an AUC of 0.85 (se = 0.03).
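
    The AUC used as the performance metric can be computed directly from classifier scores. The sketch below uses the rank-based (Mann-Whitney) formulation: the AUC equals the fraction of malignant/benign pairs in which the malignant case receives the higher score, with ties counted as one half. The scores and labels are invented, and the standard errors quoted in the abstract are not reproduced.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (label 1 = positive)."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical malignancy scores for four lesions (1 = malignant, 0 = benign).
EXAMPLE_AUC = auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
```

    The pairwise form is quadratic in the number of cases, which is acceptable for a few hundred lesions; a sort-based version would be preferable for large datasets.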

  7. A new circulation type classification based upon Lagrangian air trajectories

    Directory of Open Access Journals (Sweden)

    Alexandre M. Ramos

    2014-10-01

    Full Text Available A new classification method of the large-scale circulation characteristic for a specific target area (NW Iberian Peninsula) is presented, based on the analysis of 90-h backward trajectories arriving in this area calculated with the 3-D Lagrangian particle dispersion model FLEXPART. A cluster analysis is applied to separate the backward trajectories in up to five representative air streams for each day. Specific measures are then used to characterise the distinct air streams (e.g., curvature of the trajectories, cyclonic or anticyclonic flow, moisture evolution, origin and length of the trajectories). The robustness of the presented method is demonstrated in comparison with the Eulerian Lamb weather type classification. A case study of the 2003 heatwave is discussed in terms of the new Lagrangian circulation and the Lamb weather type classifications. It is shown that the new classification method adds valuable information about the pertinent meteorological conditions, which are missing in an Eulerian approach. The new method is climatologically evaluated for the five-year time period from December 1999 to November 2004. The ability of the method to capture the inter-seasonal circulation variability in the target region is shown. Furthermore, the multi-dimensional character of the classification is briefly discussed, in particular with respect to inter-seasonal differences. Finally, the relationship between the new Lagrangian classification and the precipitation in the target area is studied.

  8. Intelligence system based classification approach for medical disease diagnosis

    Science.gov (United States)

    Sagir, Abdu Masanawa; Sathasivam, Saratha

    2017-08-01

    The prediction of breast cancer in women who have no signs or symptoms of the disease, as well as of survivability after undergoing certain surgery, has been a challenging problem for medical researchers. The decision about the presence or absence of disease depends more on the physician's intuition, experience and skill in comparing current indicators with previous ones than on the knowledge-rich data hidden in a database. This is a crucial and challenging task. The goal is to predict patient condition by using an adaptive neuro fuzzy inference system (ANFIS) pre-processed by grid partitioning. To achieve an accurate diagnosis at this complex stage of symptom analysis, the physician may need an efficient diagnosis system. The framework describes a methodology for designing and evaluating the classification performance of two discrete ANFIS systems with hybrid learning algorithms, combining least squares estimates with the modified Levenberg-Marquardt and gradient descent algorithms, that can be used by physicians to accelerate the diagnosis process. The proposed method's performance was evaluated on training and test datasets from the mammographic mass and Haberman's survival datasets, obtained from the benchmark datasets of the University of California at Irvine's (UCI) machine learning repository. The robustness of the performance, measured by total accuracy, sensitivity and specificity, is examined. In comparison, the proposed method achieves superior performance to the conventional ANFIS based on the gradient descent algorithm and to some related existing methods. The software used for the implementation is MATLAB R2014a (version 8.3), executed on a PC with an Intel Pentium IV E7400 processor with 2.80 GHz speed and 2.0 GB of RAM.
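
    The three performance measures named above have standard definitions, sketched here for a binary diagnosis task. The labels are invented, and nothing in this example reproduces the ANFIS models themselves.

```python
def binary_metrics(y_true, y_pred):
    """Total accuracy, sensitivity and specificity for 0/1 labels (1 = disease present)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of diseased cases that are detected
        "specificity": tn / (tn + fp),  # share of healthy cases that are cleared
    }

# Hypothetical ground truth and predictions for eight patients.
METRICS = binary_metrics([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1])
```

    Reporting sensitivity and specificity alongside accuracy matters for screening data, where the classes are often imbalanced and accuracy alone can be misleading.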

  9. Atmospheric circulation classification comparison based on wildfires in Portugal

    Science.gov (United States)

    Pereira, M. G.; Trigo, R. M.

    2009-04-01

    Atmospheric circulation classifications are not a simple description of atmospheric states but a tool to understand and interpret the atmospheric processes and to model the relation between atmospheric circulation and surface climate and other related variables (Radan Huth et al., 2008). Classifications were initially developed for weather forecasting purposes; however, with the progress in computer processing capability, new and more robust objective methods were developed and applied to large datasets, making atmospheric circulation classification methods one of the most important fields in synoptic and statistical climatology. Classification studies have been extensively used in climate change studies (e.g. reconstructed past climates, recent observed changes and future climates), in bioclimatological research (e.g. relating human mortality to climatic factors) and in a wide variety of synoptic climatological applications (e.g. comparison between datasets, air pollution, snow avalanches, wine quality, fish captures and forest fires). Likewise, atmospheric circulation classifications are important for the study of the role of weather in wildfire occurrence in Portugal because the daily synoptic variability is the most important driver of local weather conditions (Pereira et al., 2005). In particular, the objective classification scheme developed by Trigo and DaCamara (2000) to classify the atmospheric circulation affecting Portugal has proved to be quite useful in discriminating the occurrence and development of wildfires as well as the distribution over Portugal of surface climatic variables with impact on wildfire activity such as maximum and minimum temperature and precipitation. This work aims to present: (i) an overview of the existing circulation classifications for the Iberian Peninsula, and (ii) the results of a comparison study between these atmospheric circulation classifications based on their relation with wildfires and relevant meteorological

  10. Failure diagnosis using deep belief learning based health state classification

    International Nuclear Information System (INIS)

    Tamilselvan, Prasanna; Wang, Pingfeng

    2013-01-01

    Effective health diagnosis provides multifarious benefits such as improved safety, improved reliability and reduced costs for operation and maintenance of complex engineered systems. This paper presents a novel multi-sensor health diagnosis method using deep belief network (DBN). DBN has recently become a popular approach in machine learning for its promised advantages such as fast inference and the ability to encode richer and higher order network structures. The DBN employs a hierarchical structure with multiple stacked restricted Boltzmann machines and works through a layer by layer successive learning process. The proposed multi-sensor health diagnosis methodology using DBN based state classification can be structured in three consecutive stages: first, defining health states and preprocessing sensory data for DBN training and testing; second, developing DBN based classification models for diagnosis of predefined health states; third, validating DBN classification models with testing sensory dataset. Health diagnosis using DBN based health state classification technique is compared with four existing diagnosis techniques. Benchmark classification problems and two engineering health diagnosis applications: aircraft engine health diagnosis and electric power transformer health diagnosis are employed to demonstrate the efficacy of the proposed approach

  11. Point Based Emotion Classification Using SVM

    OpenAIRE

    Swinkels, Wout

    2016-01-01

    The detection of emotions is a hot topic in the area of computer vision. Emotions are based on subtle changes in the face that are intuitively detected and interpreted by humans. Detecting these subtle changes, based on mathematical models, is a great challenge in the area of computer vision. In this thesis a new method is proposed to achieve state-of-the-art emotion detection performance. This method is based on facial feature points to monitor subtle changes in the face. Therefore the c...

  12. Efficacy of the Kyoto Classification of Gastritis in Identifying Patients at High Risk for Gastric Cancer.

    Science.gov (United States)

    Sugimoto, Mitsushige; Ban, Hiromitsu; Ichikawa, Hitomi; Sahara, Shu; Otsuka, Taketo; Inatomi, Osamu; Bamba, Shigeki; Furuta, Takahisa; Andoh, Akira

    2017-01-01

    Objective The Kyoto gastritis classification categorizes the endoscopic characteristics of Helicobacter pylori (H. pylori) infection-associated gastritis and identifies patterns associated with a high risk of gastric cancer. We investigated its efficacy, comparing scores in patients with H. pylori-associated gastritis and with gastric cancer. Methods A total of 1,200 patients with H. pylori-positive gastritis alone (n=932), early-stage H. pylori-positive gastric cancer (n=189), and successfully treated H. pylori-negative cancer (n=79) were endoscopically graded according to the Kyoto gastritis classification for atrophy, intestinal metaplasia, fold hypertrophy, nodularity, and diffuse redness. Results The prevalence of O-II/O-III-type atrophy according to the Kimura-Takemoto classification in early-stage H. pylori-positive gastric cancer and successfully treated H. pylori-negative cancer groups was 45.1%, which was significantly higher than in subjects with gastritis alone (12.7%, p[...] gastritis scores of atrophy and intestinal metaplasia in the H. pylori-positive cancer group were significantly higher than in subjects with gastritis alone (all p[...] gastritis classification may thus be useful for detecting these patients.

  13. Cancer cell detection and classification using transformation invariant template learning methods

    International Nuclear Information System (INIS)

    Talware, Rajendra; Abhyankar, Aditya

    2011-01-01

    In traditional cancer cell detection, pathologists examine biopsies to make diagnostic assessments, largely based on cell morphology and tissue distribution. Image acquisition is highly subjective, and the pattern undergoes unknown or random transformations during data acquisition (e.g. variation in illumination, orientation, translation and perspective), which results in a high degree of variability. Transformed Component Analysis (TCA) incorporates a discrete, hidden variable that accounts for transformations and uses the Expectation Maximization (EM) algorithm to jointly extract components and normalize for transformations. Further, the TEMPLAR framework takes advantage of hierarchical pattern models and adds probabilistic modeling for local transformations. Pattern classification is based on the Expectation Maximization algorithm and Generalized Likelihood Ratio Tests (GLRT). The performance of TEMPLAR is improved by defining the area of interest on the slide a priori. Performance can be further enhanced by making the kernel function adaptive during learning. (author)

  14. ICF-based classification and measurement of functioning.

    Science.gov (United States)

    Stucki, G; Kostanjsek, N; Ustün, B; Cieza, A

    2008-09-01

    If we aim towards a comprehensive understanding of human functioning and the development of comprehensive programs to optimize functioning of individuals and populations, we need to develop suitable measures. The approval of the International Classification of Functioning, Disability and Health (ICF) in 2001 by the 54th World Health Assembly as the first universally shared model and classification of functioning, disability and health therefore marks an important step in the development of measurement instruments and ultimately for our understanding of functioning, disability and health. The acceptance and use of the ICF as a reference framework and classification has been facilitated by its development in a worldwide, comprehensive consensus process and the increasing evidence regarding its validity. However, the broad acceptance and use of the ICF as a reference framework and classification will also depend on the resolution of conceptual and methodological challenges relevant for the classification and measurement of functioning. This paper therefore describes first how the ICF categories can serve as building blocks for the measurement of functioning and then the current state of the development of ICF based practical tools and international standards such as the ICF Core Sets. Finally it illustrates how to map the world of measures to the ICF and vice versa and the methodological principles relevant for the transformation of information obtained with a clinical test or a patient-oriented instrument to the ICF as well as the development of ICF-based clinical and self-reported measurement instruments.

  15. Chinese Sentence Classification Based on Convolutional Neural Network

    Science.gov (United States)

    Gu, Chengwei; Wu, Ming; Zhang, Chuang

    2017-10-01

    Sentence classification is one of the significant issues in Natural Language Processing (NLP). Feature extraction is often regarded as the key point for natural language processing. Traditional machine learning approaches, such as the Naive Bayesian model, cannot take high-level features into consideration. A neural network for sentence classification can make use of contextual information to achieve better results in sentence classification tasks. In this paper, we focus on classifying Chinese sentences. Most importantly, we propose a novel Convolutional Neural Network (CNN) architecture for Chinese sentence classification. In particular, whereas most previous methods use a softmax classifier for prediction, we embed a linear support vector machine in place of softmax in the deep neural network model, minimizing a margin-based loss to get a better result. We also use tanh as the activation function instead of ReLU. The CNN model improves the results of Chinese sentence classification tasks. Experimental results on the Chinese news title database validate the effectiveness of our model.
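    The margin-based loss mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the multiclass hinge loss below, the function names and the default margin of 1.0 are assumptions chosen for illustration rather than the authors' implementation.

```python
def multiclass_hinge_loss(scores, true_class, margin=1.0):
    """Sum of margin violations of every wrong class against the true class.

    `scores` is the list of raw class scores from the last linear layer.
    This particular hinge formulation is an illustrative assumption.
    """
    true_score = scores[true_class]
    return sum(
        max(0.0, score - true_score + margin)
        for j, score in enumerate(scores)
        if j != true_class
    )


def predict(scores):
    """Predicted class is the index of the largest raw score (no softmax needed)."""
    return max(range(len(scores)), key=lambda j: scores[j])
```

    Unlike a softmax cross-entropy loss, this loss is exactly zero once the true class beats every other class by at least the margin, so well-separated examples stop contributing to the gradient.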

  16. An enhanced topologically significant directed random walk in cancer classification using gene expression datasets

    Directory of Open Access Journals (Sweden)

    Choon Sen Seah

    2017-12-01

    Full Text Available Microarray technology has become one of the elementary tools for researchers to study the genome of organisms. As the complexity and heterogeneity of cancer are increasingly appreciated through genomic analysis, cancer classification is an important emerging trend. Significant directed random walk is proposed as a cancer classification approach with higher sensitivity of risk gene prediction and higher accuracy of cancer classification. In this paper, the methodology and material used for the experiment are presented. A tuning parameter selection method and weight as a parameter are applied in the proposed approach. A gene expression dataset is used as the input dataset, while a pathway dataset is used as the reference dataset to build a directed graph and complete the bias process in the random walk approach. In addition, we demonstrate that our approach can improve the sensitivity of predictions, with higher accuracy and biologically meaningful classification results. Significant directed random walk is compared with directed random walk to show the improvement in terms of sensitivity of prediction and accuracy of cancer classification.

  17. Some improved classification-based ridge parameter of Hoerl and ...

    African Journals Online (AJOL)

    Some improved classification-based ridge parameter of Hoerl and Kennard estimation techniques. ... This assumption is often violated and Ridge Regression estimator introduced by [2] has been identified to be more efficient than ordinary least square (OLS) in handling it. However, it requires a ridge parameter, K, of which ...

  18. Classification and Target Group Selection Based Upon Frequent Patterns

    NARCIS (Netherlands)

    W.H.L.M. Pijls (Wim); R. Potharst (Rob)

    2000-01-01

    In this technical report, two new algorithms based upon frequent patterns are proposed. One algorithm is a classification method. The other one is an algorithm for target group selection. In both algorithms, first of all, the collection of frequent patterns in the training set is

  19. Torrent classification - Base of rational management of erosive regions

    International Nuclear Information System (INIS)

    Gavrilovic, Zoran; Stefanovic, Milutin; Milovanovic, Irina; Cotric, Jelena; Milojevic, Mileta

    2008-01-01

    A complex methodology for torrents and erosion and the associated calculations was developed during the second half of the twentieth century in Serbia. It was the 'Erosion Potential Method'. One of the modules of that complex method was focused on torrent classification. The module enables the identification of hydrographic, climate and erosion characteristics. The method makes it possible for each torrent, regardless of its magnitude, to be simply and recognizably described by the 'Formula of torrentially'. The above torrent classification is the base on which a set of optimisation calculations is developed for the required scope of erosion-control works and measures, the application of which enables the management of significantly larger erosion and torrential regions compared to the previous period. This paper will present the procedure and the method of torrent classification.

  20. Classification of scintigrams on the base of an automatic analysis

    International Nuclear Information System (INIS)

    Vidyukov, V.I.; Kasatkin, Yu.N.; Kal'nitskaya, E.F.; Mironov, S.P.; Rotenberg, E.M.

    1980-01-01

    The stages of designing a discriminative system based on self-learning for the automatic analysis of scintigrams are considered. The classification of 240 liver scintigrams into "normal", "diffuse lesions" and "focal lesions" was evaluated by medical experts and by computer. The accuracy of the computerized classification was 91.7%, and that of the experts was 85%. The methods for automatic analysis of liver scintigrams were implemented using the specialized MDS data processing system. The quality of the discriminative system was assessed on 125 scintigrams, on which the classification accuracy was 89.6%. The use of the self-learning methods made it possible to single out two subclasses depending on the severity of diffuse lesions.

  1. Hyperspectral image classification based on local binary patterns and PCANet

    Science.gov (United States)

    Yang, Huizhen; Gao, Feng; Dong, Junyu; Yang, Yang

    2018-04-01

    Hyperspectral image classification has been well acknowledged as one of the challenging tasks of hyperspectral data processing. In this paper, we propose a novel hyperspectral image classification framework based on local binary pattern (LBP) features and PCANet. In the proposed method, linear prediction error (LPE) is first employed to select a subset of informative bands, and LBP is utilized to extract texture features. Then, spectral and texture features are stacked into high-dimensional vectors. Next, the extracted features of a specified position are transformed to a 2-D image. The obtained images of all pixels are fed into PCANet for classification. Experimental results on a real hyperspectral dataset demonstrate the effectiveness of the proposed method.
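    The LBP texture step in the abstract above can be sketched as follows. This is the basic 3x3 local binary pattern, assuming a clockwise neighbour ordering starting at the top-left corner and a "greater than or equal" threshold. The abstract does not state which LBP variant the authors use, so both choices and the function names are illustrative.

```python
def lbp_code(patch):
    """Return the 8-bit LBP code for the centre pixel of a 3x3 patch.

    `patch` is a list of three rows with three intensities each.
    """
    center = patch[1][1]
    # Neighbours visited clockwise from the top-left corner (assumed ordering).
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        if patch[r][c] >= center:
            code |= 1 << bit
    return code


def lbp_map(image):
    """Compute LBP codes for every interior pixel of a 2-D list of intensities."""
    rows, cols = len(image), len(image[0])
    return [
        [
            lbp_code([row[c - 1:c + 2] for row in image[r - 1:r + 2]])
            for c in range(1, cols - 1)
        ]
        for r in range(1, rows - 1)
    ]
```

    A histogram of the codes in a local window is the usual texture descriptor built on top of such a map.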

  2. Remote Sensing Image Classification Based on Stacked Denoising Autoencoder

    Directory of Open Access Journals (Sweden)

    Peng Liang

    2017-12-01

    Full Text Available To address the accuracy bottleneck of conventional remote sensing image classification methods, a new remote sensing image classification method inspired by deep learning is proposed, based on the Stacked Denoising Autoencoder. First, the deep network model is built by stacking layers of Denoising Autoencoders. Then, with noised input, the unsupervised greedy layer-wise training algorithm is used to train each layer in turn to obtain a more robust representation, features are obtained in supervised learning by a Back Propagation (BP) neural network, and the whole network is optimized by error back propagation. Finally, Gaofen-1 satellite (GF-1) remote sensing data are used for evaluation, and the total accuracy and kappa accuracy reach 95.7% and 0.955, respectively, which are higher than those of the Support Vector Machine and the Back Propagation neural network. The experimental results show that the proposed method can effectively improve the accuracy of remote sensing image classification.
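    The two evaluation figures quoted above, total accuracy and kappa, are conventionally derived from a confusion matrix. The sketch below assumes the standard definition of Cohen's kappa. The function name is illustrative and the abstract does not say how the authors computed their values.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] counts samples of true class i predicted as class j.
    Kappa is undefined (division by zero) when chance agreement equals 1.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa
```

    Kappa discounts the agreement that a classifier would reach by chance given the class proportions, which is why it is usually reported alongside the raw accuracy.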

  3. Torrent classification - Base of rational management of erosive regions

    Energy Technology Data Exchange (ETDEWEB)

    Gavrilovic, Zoran; Stefanovic, Milutin; Milovanovic, Irina; Cotric, Jelena; Milojevic, Mileta [Institute for the Development of Water Resources ' Jaroslav Cerni' , 11226 Beograd (Pinosava), Jaroslava Cernog 80 (Serbia)], E-mail: gavrilovicz@sbb.rs

    2008-11-01

    A complex methodology for torrents and erosion and the associated calculations was developed during the second half of the twentieth century in Serbia. It was the 'Erosion Potential Method'. One of the modules of that complex method was focused on torrent classification. The module enables the identification of hydrographic, climate and erosion characteristics. The method makes it possible for each torrent, regardless of its magnitude, to be simply and recognizably described by the 'Formula of torrentially'. The above torrent classification is the base on which a set of optimisation calculations is developed for the required scope of erosion-control works and measures, the application of which enables the management of significantly larger erosion and torrential regions compared to the previous period. This paper will present the procedure and the method of torrent classification.

  4. Deep learning for EEG-Based preference classification

    Science.gov (United States)

    Teo, Jason; Hou, Chew Lin; Mountstephens, James

    2017-10-01

    Electroencephalogram (EEG)-based emotion classification is rapidly becoming one of the most intensely studied areas of brain-computer interfacing (BCI). The ability to passively identify yet accurately correlate brainwaves with our immediate emotions opens up truly meaningful and previously unattainable human-computer interactions such as in forensic neuroscience, rehabilitative medicine, affective entertainment and neuro-marketing. One particularly useful yet rarely explored area of EEG-based emotion classification is preference recognition [1], which is simply the detection of like versus dislike. Within the limited investigations into preference classification, all reported studies were based on musically-induced stimuli except for a single study which used 2D images. The main objective of this study is to apply deep learning, which has been shown to produce state-of-the-art results in diverse hard problems such as in computer vision, natural language processing and audio recognition, to 3D object preference classification over a larger group of test subjects. A cohort of 16 users was shown 60 bracelet-like objects as rotating visual stimuli on a computer display while their preferences and EEGs were recorded. After training a variety of machine learning approaches which included deep neural networks, we then attempted to classify the users' preferences for the 3D visual stimuli based on their EEGs. Here, we show that deep learning outperforms a variety of other machine learning classifiers for this EEG-based preference classification task, particularly in a highly challenging dataset with large inter- and intra-subject variability.

  5. Comparison of Computational Algorithms for the Classification of Liver Cancer using SELDI Mass Spectrometry: A Case Study

    Directory of Open Access Journals (Sweden)

    Robert J Hickey

    2007-01-01

    Full Text Available Introduction: As an alternative to DNA microarrays, mass spectrometry based analysis of proteomic patterns has shown great potential in cancer diagnosis. The ultimate application of this technique in clinical settings relies on the advancement of the technology itself and the maturity of the computational tools used to analyze the data. A number of computational algorithms constructed on different principles are available for the classification of disease status based on proteomic patterns. Nevertheless, few studies have addressed the difference in the performance of these approaches. In this report, we describe a comparative case study on the classification accuracy of hepatocellular carcinoma based on the serum proteomic pattern generated from a Surface Enhanced Laser Desorption/Ionization (SELDI) mass spectrometer. Methods: Nine supervised classification algorithms are implemented in R software and compared for the classification accuracy. Results: We found that the support vector machine with radial function is preferable as a tool for classification of hepatocellular carcinoma using features in SELDI mass spectra. Among the rest of the methods, random forest and prediction analysis of microarrays have better performance. A permutation-based technique reveals that the support vector machine with a radial function seems intrinsically superior in learning from the training data since it has a lower prediction error than others when there is essentially no differential signal. On the other hand, the performance of the random forest and prediction analysis of microarrays relies on their capability of capturing the signals with substantial differentiation between groups. Conclusions: Our finding is similar to a previous study, where classification methods based on the Matrix Assisted Laser Desorption/Ionization (MALDI) mass spectrometry are compared for the prediction accuracy of ovarian cancer. The support vector machine, random forest and prediction

  6. Hardware Accelerators Targeting a Novel Group Based Packet Classification Algorithm

    Directory of Open Access Journals (Sweden)

    O. Ahmed

    2013-01-01

    Full Text Available Packet classification is a ubiquitous and key building block for many critical network devices. However, it remains one of the main bottlenecks faced when designing fast network devices. In this paper, we propose a novel Group Based Search packet classification Algorithm (GBSA) that is scalable, fast, and efficient. GBSA consumes an average of 0.4 megabytes of memory for a 10 k rule set. The worst-case classification time per packet is 2 microseconds, and the preprocessing speed is 3 M rules/second based on a Xeon processor operating at 3.4 GHz. When compared with other state-of-the-art classification techniques, the results showed that GBSA outperforms the competition with respect to speed, memory usage, and processing time. Moreover, GBSA is amenable to implementation in hardware. Three different hardware implementations are also presented in this paper, including an Application Specific Instruction Set Processor (ASIP) implementation and two pure Register-Transfer Level (RTL) implementations based on Impulse-C and Handel-C flows, respectively. Speedups achieved with these hardware accelerators ranged from 9x to 18x compared with a pure software implementation running on a Xeon processor.

  7. Sparse Representation Based Binary Hypothesis Model for Hyperspectral Image Classification

    Directory of Open Access Journals (Sweden)

    Yidong Tang

    2016-01-01

    Full Text Available The sparse representation based classifier (SRC) and its kernel version (KSRC) have been employed for hyperspectral image (HSI) classification. However, the state-of-the-art SRC often aims at extended surface objects with linear mixture in smooth scenes and assumes that the number of classes is given. Considering small targets with complex backgrounds, a sparse representation based binary hypothesis (SRBBH) model is established in this paper. In this model, a query pixel is represented in two ways: by a background dictionary and by a union dictionary. The background dictionary is composed of samples selected from the local dual concentric window centered at the query pixel. Thus, for each pixel the classification issue becomes an adaptive multiclass classification problem, where only the number of desired classes is required. Furthermore, the kernel method is employed to improve the interclass separability. In kernel space, the coding vector is obtained by using the kernel-based orthogonal matching pursuit (KOMP) algorithm. Then the query pixel can be labeled by the characteristics of the coding vectors. Instead of directly using the reconstruction residuals, the different impacts the background dictionary and union dictionary have on reconstruction are used for validation and classification. This enhances the discrimination and hence improves the performance.

  8. Energy-efficiency based classification of the manufacturing workstation

    Science.gov (United States)

    Frumuşanu, G.; Afteni, C.; Badea, N.; Epureanu, A.

    2017-08-01

    EU Directive 92/75/EC established for the first time an energy consumption labelling scheme, further implemented by several other directives. As a consequence, many products (e.g. home appliances, tyres, light bulbs, houses) now carry an EU Energy Label when offered for sale or rent. Several energy consumption models of manufacturing equipment have also been developed. This paper proposes an energy-efficiency-based classification of the manufacturing workstation, aiming to characterize its energetic behaviour. The concept of energy efficiency of the manufacturing workstation is defined. On this basis, a classification methodology has been developed. It refers to specific criteria and their evaluation modalities, together with the definition and delimitation of energy efficiency classes. The position of the energy class is defined by the amount of energy needed by the workstation at the middle point of its operating domain, while its extent is determined by the value of the first coefficient of the Taylor series that approximates the dependence between the energy consumption and the chosen parameter of the working regime. The main domain of interest for this classification appears to be the optimization of manufacturing activity planning and programming. A case study on the classification of an actual lathe from the energy efficiency point of view, based on two different approaches (analytical and numerical), is also included.

  9. Knowledge-based approach to video content classification

    Science.gov (United States)

    Chen, Yu; Wong, Edward K.

    2001-01-01

    A framework for video content classification using a knowledge-based approach is herein proposed. This approach is motivated by the fact that videos are rich in semantic contents, which can best be interpreted and analyzed by human experts. We demonstrate the concept by implementing a prototype video classification system using the rule-based programming language CLIPS 6.05. Knowledge for video classification is encoded as a set of rules in the rule base. The left-hand sides of rules contain high-level and low-level features, while the right-hand sides of rules contain intermediate results or conclusions. Our current implementation includes features computed from motion, color, and text extracted from video frames. Our current rule set allows us to classify input video into one of five classes: news, weather reporting, commercial, basketball and football. We use MYCIN's inexact reasoning method to combine evidence and to handle the uncertainties in the features and in the classification results. We obtained good results in a preliminary experiment, which demonstrated the validity of the proposed approach.
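    The evidence combination mentioned in the abstract above can be sketched with the commonly cited certainty-factor rule associated with MYCIN-style systems. The abstract does not show the rules or the exact variant used in the CLIPS prototype, so the function below and the class names in its usage note are hypothetical illustrations.

```python
def combine_cf(cf1, cf2):
    """Combine two certainty factors in [-1, 1] for the same hypothesis.

    Uses the commonly cited MYCIN-style combination rule. The mixed-sign
    case is undefined when the factors are exactly +1 and -1.
    """
    if cf1 >= 0 and cf2 >= 0:
        # Two supporting pieces of evidence reinforce each other.
        return cf1 + cf2 * (1 - cf1)
    if cf1 < 0 and cf2 < 0:
        # Two opposing pieces of evidence reinforce the disbelief.
        return cf1 + cf2 * (1 + cf1)
    # Conflicting evidence partially cancels out.
    return (cf1 + cf2) / (1 - min(abs(cf1), abs(cf2)))
```

    For instance, if a hypothetical motion rule supports "basketball" with certainty 0.6 and a hypothetical color rule supports it with 0.5, the combined certainty is about 0.8, higher than either rule alone but still below 1.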

  10. Bayesian outcome-based strategy classification.

    Science.gov (United States)

    Lee, Michael D

    2016-03-01

    Hilbig and Moshagen (Psychonomic Bulletin & Review, 21, 1431-1443, 2014) recently developed a method for making inferences about the decision processes people use in multi-attribute forced choice tasks. Their paper makes a number of worthwhile theoretical and methodological contributions. Theoretically, they provide an insightful psychological motivation for a probabilistic extension of the widely-used "weighted additive" (WADD) model, and show how this model, as well as other important models like "take-the-best" (TTB), can and should be expressed in terms of meaningful priors. Methodologically, they develop an inference approach based on the Minimum Description Length (MDL) principles that balances both the goodness-of-fit and complexity of the decision models they consider. This paper aims to preserve these useful contributions, but provide a complementary Bayesian approach with some theoretical and methodological advantages. We develop a simple graphical model, implemented in JAGS, that allows for fully Bayesian inferences about which models people use to make decisions. To demonstrate the Bayesian approach, we apply it to the models and data considered by Hilbig and Moshagen (Psychonomic Bulletin & Review, 21, 1431-1443, 2014), showing how a prior predictive analysis of the models, and posterior inferences about which models people use and the parameter settings at which they use them, can contribute to our understanding of human decision making.

  11. Support vector machine for breast cancer classification using diffusion-weighted MRI histogram features: Preliminary study.

    Science.gov (United States)

    Vidić, Igor; Egnell, Liv; Jerome, Neil P; Teruel, Jose R; Sjøbakk, Torill E; Østlie, Agnes; Fjøsne, Hans E; Bathen, Tone F; Goa, Pål Erik

    2018-05-01

    Diffusion-weighted MRI (DWI) is currently one of the fastest developing MRI-based techniques in oncology. Histogram properties from model fitting of DWI are useful features for differentiation of lesions, and classification can potentially be improved by machine learning. To evaluate classification of malignant and benign tumors and breast cancer subtypes using support vector machine (SVM). Prospective. Fifty-one patients with benign (n = 23) and malignant (n = 28) breast tumors (26 ER+, whereof six were HER2+). Patients were imaged with DW-MRI (3T) using twice refocused spin-echo echo-planar imaging with repetition time / echo time (TR/TE) = 9000/86 msec, 90 × 90 matrix size, 2 × 2 mm in-plane resolution, 2.5 mm slice thickness, and 13 b-values. Apparent diffusion coefficient (ADC), relative enhanced diffusivity (RED), and the intravoxel incoherent motion (IVIM) parameters diffusivity (D), pseudo-diffusivity (D*), and perfusion fraction (f) were calculated. The histogram properties (median, mean, standard deviation, skewness, kurtosis) were used as features in SVM (10-fold cross-validation) for differentiation of lesions and subtyping. Accuracies of the SVM classifications were calculated to find the combination of features with highest prediction accuracy. Mann-Whitney tests were performed for univariate comparisons. For benign versus malignant tumors, univariate analysis found 11 histogram properties to be significant differentiators. Using SVM, the highest accuracy (0.96) was achieved from a single feature (mean of RED), or from three feature combinations of IVIM or ADC. Combining features from all models gave perfect classification. No single feature predicted HER2 status of ER+ tumors (univariate or SVM), although high accuracy (0.90) was achieved with SVM combining several features. Importantly, these features had to include higher-order statistics (kurtosis and skewness), indicating the importance of accounting for heterogeneity. Our
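    The five histogram properties named in the abstract above can be computed from the voxel values of a parameter map as in the sketch below. It assumes population (biased) moments and excess kurtosis. The abstract does not specify the estimator conventions, and the function name is illustrative.

```python
import statistics


def histogram_features(values):
    """Return median, mean, standard deviation, skewness and excess kurtosis.

    `values` is a flat list of voxel values from one lesion. Population
    moments are assumed, and constant input would divide by zero.
    """
    n = len(values)
    mean = statistics.fmean(values)
    # Central moments of order 2, 3 and 4.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "median": statistics.median(values),
        "mean": mean,
        "std": m2 ** 0.5,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3.0,
    }
```

    Skewness and kurtosis describe the shape of the distribution rather than its centre, which is why they can capture lesion heterogeneity that the mean or median alone would miss.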

  12. Optical beam classification using deep learning: a comparison with rule- and feature-based classification

    Science.gov (United States)

    Alom, Md. Zahangir; Awwal, Abdul A. S.; Lowe-Webb, Roger; Taha, Tarek M.

    2017-08-01

    Vector Machine (SVM). The experimental results show around 96% classification accuracy using CNN; the CNN approach also provides comparable recognition results compared to the present feature-based off-normal detection. The feature-based solution was developed to capture the expertise of a human expert in classifying the images. The misclassified results are further studied to explain the differences and discover any discrepancies or inconsistencies in current classification.

  13. A Chinese text classification system based on Naive Bayes algorithm

    Directory of Open Access Journals (Sweden)

    Cui Wei

    2016-01-01

    Full Text Available In this paper, addressing the characteristics of Chinese text classification, we use ICTCLAS (the Chinese lexical analysis system of the Chinese Academy of Sciences) for document segmentation, clean the data and filter stop words, and apply the information gain and document frequency feature selection algorithms to select document features. On this basis, we implement a text classifier based on the Naive Bayes algorithm and use the Chinese corpus of Fudan University to carry out experiments and analysis on the system.
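    The classification stage described above can be sketched as a multinomial Naive Bayes classifier over already segmented tokens. Laplace (add-one) smoothing, the function names and the toy English tokens in the usage note are assumptions for illustration. Segmentation, stop-word filtering and feature selection are taken as done beforehand and are not shown.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs):
    """Build a model from a list of (tokens, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def classify_nb(model, tokens):
    """Return the label with the highest log posterior for `tokens`."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)  # log prior
        for token in tokens:
            if token in vocab:  # tokens unseen in training are ignored
                count = word_counts[label][token]
                # Add-one smoothing keeps unseen class/word pairs non-zero.
                score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

    With toy training documents such as `(["goal", "match", "team"], "sports")` and `(["stock", "market", "bank"], "finance")`, calling `classify_nb(model, ["team", "match"])` returns `"sports"`. In a real system the tokens would come from the segmenter rather than from hand-written lists.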

  14. Median Filter Noise Reduction of Image and Backpropagation Neural Network Model for Cervical Cancer Classification

    Science.gov (United States)

    Wutsqa, D. U.; Marwah, M.

    2017-06-01

    In this paper, we consider spatial operation median filter to reduce the noise in the cervical images yielded by colposcopy tool. The backpropagation neural network (BPNN) model is applied to the colposcopy images to classify cervical cancer. The classification process requires an image extraction by using a gray level co-occurrence matrix (GLCM) method to obtain image features that are used as inputs of BPNN model. The advantage of noise reduction is evaluated by comparing the performances of BPNN models with and without spatial operation median filter. The experimental result shows that the spatial operation median filter can improve the accuracy of the BPNN model for cervical cancer classification.
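    The noise-reduction step described above can be illustrated with a minimal median filter over a grayscale image stored as a 2-D list. The 3x3 default window and the clipping of the window at the image border are assumptions, since the abstract does not state the window size or border handling used by the authors.

```python
import statistics


def median_filter(image, radius=1):
    """Replace each pixel by the median of its (2*radius+1)^2 neighbourhood.

    Near the border the window is clipped to the pixels that exist.
    median_low is used so every output value is an actual input intensity.
    """
    rows, cols = len(image), len(image[0])
    filtered = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out_row.append(statistics.median_low(window))
        filtered.append(out_row)
    return filtered
```

    Because the median ignores extreme values in the window, an isolated bright or dark pixel is removed while edges are blurred less than with a mean filter. The filtered image would then be passed to the GLCM feature extraction.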

  15. Comparative Study of Classification Techniques on Breast Cancer FNA Biopsy Data

    Directory of Open Access Journals (Sweden)

    George Rumbe

    2010-12-01

    Full Text Available Accurate diagnostic detection of the cancerous cells in a patient is critical and may alter the subsequent treatment and increase the chances of survival. Machine learning techniques have been instrumental in disease detection and are currently being used in various classification problems due to their accurate prediction performance. Different techniques may provide different accuracies, and it is therefore imperative to use the most suitable method, which provides the best results. This research seeks to provide a comparative analysis of Support Vector Machine, Bayesian classifier and other artificial neural network classifiers (Backpropagation, linear programming, Learning vector quantization, and K nearest neighbor) on the Wisconsin breast cancer classification problem.

  16. Pathogenesis of Gastric Cancer: Genetics and Molecular Classification.

    Science.gov (United States)

    Figueiredo, Ceu; Camargo, M C; Leite, Marina; Fuentes-Pananá, Ezequiel M; Rabkin, Charles S; Machado, José C

    Gastric cancer is the fifth most incident cancer and the third most common cause of cancer-related death in the world. Infection with Helicobacter pylori is the major risk factor for this disease. Gastric cancer is the final outcome of a cascade of events that takes decades to occur and results from the accumulation of multiple genetic and epigenetic alterations. These changes are crucial for tumor cells to expedite and sustain the array of pathways involved in cancer development, such as cell cycle, DNA repair, metabolism, cell-to-cell and cell-to-matrix interactions, apoptosis, angiogenesis, and immune surveillance. Comprehensive molecular analyses of gastric cancer have disclosed the complex heterogeneity of this disease. In particular, these analyses have confirmed that Epstein-Barr virus (EBV)-positive gastric cancer is a distinct entity. The identification of gastric cancer subtypes characterized by recognizable molecular profiles may pave the way for more personalized clinical management and for the identification of novel therapeutic targets and biomarkers for screening, prognosis, prediction of response to treatment, and monitoring of gastric cancer progression.

  17. Group-Based Active Learning of Classification Models.

    Science.gov (United States)

    Luo, Zhipeng; Hauskrecht, Milos

    2017-05-01

    Learning of classification models from real-world data often requires additional human expert effort to annotate the data. However, this process can be rather costly and finding ways of reducing the human annotation effort is critical for this task. The objective of this paper is to develop and study new ways of providing human feedback for efficient learning of classification models by labeling groups of examples. Briefly, unlike traditional active learning methods that seek feedback on individual examples, we develop a new group-based active learning framework that solicits label information on groups of multiple examples. In order to describe groups in a user-friendly way, conjunctive patterns are used to compactly represent groups. Our empirical study on 12 UCI data sets demonstrates the advantages and superiority of our approach over both classic instance-based active learning work, as well as existing group-based active-learning methods.

  18. Finding Combination of Features from Promoter Regions for Ovarian Cancer-related Gene Group Classification

    KAUST Repository

    Olayan, Rawan S.

    2012-12-01

    In classification problems, it is always important to use the suitable combination of features that will be employed by classifiers. Generating the right combination of features usually results in good classifiers. In the situation when the problem is not well understood, data items are usually described by many features in the hope that some of these may be the relevant or most relevant ones. In this study, we focus on one such problem related to genes implicated in ovarian cancer (OC). We try to recognize two important OC-related gene groups: oncogenes, which support the development and progression of OC, and oncosuppressors, which oppose such tendencies. For this, we use the properties of promoters of these genes. We identified potential “regulatory features” that characterize OC-related oncogenes and oncosuppressors promoters. In our study, we used 211 oncogenes and 39 oncosuppressors. For these, we identified 538 characteristic sequence motifs from their promoters. Promoters are annotated by these motifs and derived feature vectors used to develop classification models. We made a comparison of a number of classification models in their ability to distinguish oncogenes from oncosuppressors. Based on 10-fold cross-validation, the resultant model was able to separate the two classes with sensitivity of 96% and specificity of 100% with the complete set of features. Moreover, we developed another recognition model where we attempted to distinguish oncogenes and oncosuppressors as one group from other OC-related genes. That model achieved accuracy of 82%. We believe that the results of this study will help in discovering other OC-related oncogenes and oncosuppressors not identified as yet.

  19. Finding Combination of Features from Promoter Regions for Ovarian Cancer-related Gene Group Classification

    KAUST Repository

    Olayan, Rawan S.

    2012-01-01

    In classification problems, it is always important to use the suitable combination of features that will be employed by classifiers. Generating the right combination of features usually results in good classifiers. In the situation when the problem is not well understood, data items are usually described by many features in the hope that some of these may be the relevant or most relevant ones. In this study, we focus on one such problem related to genes implicated in ovarian cancer (OC). We try to recognize two important OC-related gene groups: oncogenes, which support the development and progression of OC, and oncosuppressors, which oppose such tendencies. For this, we use the properties of promoters of these genes. We identified potential “regulatory features” that characterize OC-related oncogenes and oncosuppressors promoters. In our study, we used 211 oncogenes and 39 oncosuppressors. For these, we identified 538 characteristic sequence motifs from their promoters. Promoters are annotated by these motifs and derived feature vectors used to develop classification models. We made a comparison of a number of classification models in their ability to distinguish oncogenes from oncosuppressors. Based on 10-fold cross-validation, the resultant model was able to separate the two classes with sensitivity of 96% and specificity of 100% with the complete set of features. Moreover, we developed another recognition model where we attempted to distinguish oncogenes and oncosuppressors as one group from other OC-related genes. That model achieved accuracy of 82%. We believe that the results of this study will help in discovering other OC-related oncogenes and oncosuppressors not identified as yet.

  20. Sparse Representation Based Multi-Instance Learning for Breast Ultrasound Image Classification

    Directory of Open Access Journals (Sweden)

    Lu Bing

    2017-01-01

    Full Text Available We propose a novel method based on sparse representation for breast ultrasound image classification under the framework of multi-instance learning (MIL). After image enhancement and segmentation, concentric circles are used to extract global and local features to improve the accuracy of diagnosis and prediction. The classification problem for ultrasound images is converted to a sparse-representation-based MIL problem. Each instance of a bag is represented as a sparse linear combination of all basis vectors in the dictionary, and the bag is then represented by one feature vector obtained from the sparse representations of all instances within the bag. The sparse MIL problem is further converted to a conventional learning problem that is solved by a relevance vector machine (RVM). Results of single classifiers are combined for classification. Experimental results on the breast cancer datasets demonstrate the superiority of the proposed method in terms of classification accuracy as compared with state-of-the-art MIL methods.

  1. Classification of Hearing Loss Disorders Using Teoae-Based Descriptors

    Science.gov (United States)

    Hatzopoulos, Stavros Dimitris

    Transiently Evoked Otoacoustic Emissions (TEOAE) are signals produced by the cochlea upon stimulation by an acoustic click. Within the context of this dissertation, it was hypothesized that the relationship between the TEOAEs and the functional status of the OHCs provided an opportunity for designing a TEOAE-based clinical procedure that could be used to assess cochlear function. To understand the nature of the TEOAE signals in the time and the frequency domain, several different analyses were performed. Using normative Input-Output (IO) curves, short-time FFT analyses and cochlear computer simulations, it was found that for optimization of the hearing loss classification it is necessary to use a complete 20 ms TEOAE segment. It was also determined that various 2-D filtering methods (median and averaging filtering masks, LP-FFT) used to enhance the TEOAE S/N offered minimal improvement (less than 6 dB per stimulus level). Higher S/N improvements resulted in TEOAE sequences that were over-smoothed. The final classification algorithm was based on a statistical analysis of raw FFT data and, when applied to a sample set of clinically obtained TEOAE recordings (from 56 normal and 66 hearing-loss subjects), correctly identified 94.3% of the normal and 90% of the hearing-loss subjects at the 80 dB SPL stimulus level. To enhance the discrimination between the conductive and the sensorineural populations, data from the 68 dB SPL stimulus level were used, which yielded a normal classification of 90.2%, a hearing loss classification of 87.5% and a conductive-sensorineural classification of 87%. Among the hearing-loss populations, the best discrimination was obtained in the otosclerosis group and the worst in the acute acoustic trauma group.

  2. Comparison Of Power Quality Disturbances Classification Based On Neural Network

    Directory of Open Access Journals (Sweden)

    Nway Nway Kyaw Win

    2015-07-01

    Full Text Available Power quality disturbances (PQDs) cause serious problems for the reliability, safety and economy of power system networks. To improve electric power quality, PQDs must be detected and classified by type of transient fault. This work presents software analysis based on the wavelet transform with a multiresolution analysis (MRA) algorithm, together with a methodology based on feed-forward neural networks, both probabilistic (PNN) and multilayer feed-forward (MLFF), for automatic classification of eight types of PQ signals: flicker, harmonics, sag, swell, impulse, fluctuation, notch and oscillatory. The Db4 wavelet is chosen to calculate the detail energy distributions used as input features for classification because it performs well in detecting and localizing various types of PQ disturbances. The classifiers identify the disturbance type according to the energy distribution. The results show that the PNN can analyze different power disturbance types efficiently and has better classification accuracy than the MLFF.
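
    The idea of using per-level detail energies from a multiresolution wavelet decomposition as classifier inputs can be sketched as follows. Note the substitution: the abstract uses the Db4 wavelet, whereas this sketch uses the simpler Haar wavelet so that it stays dependency-free. The function names and the toy signals are assumptions for illustration.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    scale = 1 / math.sqrt(2)
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal) - 1, 2)]
    approx = [(a + b) * scale for a, b in pairs]
    detail = [(a - b) * scale for a, b in pairs]
    return approx, detail


def detail_energies(signal, levels):
    """Energy (sum of squares) of the detail coefficients at each level."""
    energies = []
    approx = list(signal)
    for _ in range(levels):
        if len(approx) < 2:
            break  # nothing left to decompose
        approx, detail = haar_step(approx)
        energies.append(sum(d * d for d in detail))
    return energies


# Toy signals (not measured data): a flat waveform and a rapidly alternating one.
flat = detail_energies([1.0] * 8, levels=3)
alternating = detail_energies([1.0, -1.0] * 4, levels=3)
```

    In this toy case the alternating signal puts all of its energy into the first (finest) detail level, while the flat signal has none at any level; a vector of such energies is the kind of feature a PNN or MLFF classifier would consume.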

  3. Structure-based classification and ontology in chemistry

    Directory of Open Access Journals (Sweden)

    Hastings Janna

    2012-04-01

    Full Text Available Background: Recent years have seen an explosion in the availability of data in the chemistry domain. With this information explosion, however, retrieving relevant results from the available information, and organising those results, become even harder problems. Computational processing is essential to filter and organise the available resources so as to better facilitate the work of scientists. Ontologies encode expert domain knowledge in a hierarchically organised machine-processable format. One such ontology for the chemical domain is ChEBI. ChEBI provides a classification of chemicals based on their structural features and a role or activity-based classification. An example of a structure-based class is 'pentacyclic compound' (compounds containing five-ring structures), while an example of a role-based class is 'analgesic', since many different chemicals can act as analgesics without sharing structural features. Structure-based classification in chemistry exploits elegant regularities and symmetries in the underlying chemical domain. As yet, there has been neither a systematic analysis of the types of structural classification in use in chemistry nor a comparison to the capabilities of available technologies. Results: We analyze the different categories of structural classes in chemistry, presenting a list of patterns for features found in class definitions. We compare these patterns of class definition to tools which allow for automation of hierarchy construction within cheminformatics and within logic-based ontology technology, going into detail in the latter case with respect to the expressive capabilities of the Web Ontology Language and recent extensions for modelling structured objects. Finally we discuss the relationships and interactions between cheminformatics approaches and logic-based approaches. Conclusion: Systems that perform intelligent reasoning tasks on chemistry data require a diverse set of underlying computational

  4. Significance and Application of Digital Breast Tomosynthesis for the BI-RADS Classification of Breast Cancer.

    Science.gov (United States)

    Cai, Si-Qing; Yan, Jian-Xiang; Chen, Qing-Shi; Huang, Mei-Ling; Cai, Dong-Lu

    2015-01-01

    Full-field digital mammography (FFDM) of dense breasts has a high rate of missed diagnoses, and digital breast tomosynthesis (DBT) can reduce tissue overlap and provide more reliable images for BI-RADS classification. This study explores the effect and significance of COMBO (FFDM+DBT) for BI-RADS classification of breast cancer. We selected 832 patients who had been treated from May 2013 to November 2013. FFDM and COMBO examinations were classified separately according to BI-RADS, and differences in gland assessment, display of mass characteristics and indirect signs were compared in the images of the same patient. A paired Wilcoxon rank sum test was used in 79 breast cancer patients to find differences between the two examination methods. The results indicated that COMBO shows more detail in the distribution of glands when estimating content. The paired Wilcoxon rank sum test showed that the overall classification level of COMBO is significantly higher than that of FFDM for BI-RADS diagnosis and classification of breast (PBI-RADS classification in breast cancer in clinical.

  5. Gradient Evolution-based Support Vector Machine Algorithm for Classification

    Science.gov (United States)

    Zulvia, Ferani E.; Kuo, R. J.

    2018-03-01

    This paper proposes a classification algorithm based on a support vector machine (SVM) and gradient evolution (GE) algorithms. SVM algorithm has been widely used in classification. However, its result is significantly influenced by the parameters. Therefore, this paper aims to propose an improvement of SVM algorithm which can find the best SVMs’ parameters automatically. The proposed algorithm employs a GE algorithm to automatically determine the SVMs’ parameters. The GE algorithm takes a role as a global optimizer in finding the best parameter which will be used by SVM algorithm. The proposed GE-SVM algorithm is verified using some benchmark datasets and compared with other metaheuristic-based SVM algorithms. The experimental results show that the proposed GE-SVM algorithm obtains better results than other algorithms tested in this paper.

  6. A strategy learning model for autonomous agents based on classification

    Directory of Open Access Journals (Sweden)

    Śnieżyński Bartłomiej

    2015-09-01

    Full Text Available In this paper we propose a strategy learning model for autonomous agents based on classification. In the literature, the most commonly used learning method in agent-based systems is reinforcement learning. In our opinion, classification can be considered a good alternative. This type of supervised learning can be used to generate a classifier that allows the agent to choose an appropriate action for execution. Experimental results show that this model can be successfully applied for strategy generation even if rewards are delayed. We compare the efficiency of the proposed model and reinforcement learning using the farmer-pest domain and configurations of various complexity. In complex environments, supervised learning can improve the performance of agents much faster than reinforcement learning. If an appropriate knowledge representation is used, the learned knowledge may be analyzed by humans, which allows tracking the learning process.

  7. A Sieving ANN for Emotion-Based Movie Clip Classification

    Science.gov (United States)

    Watanapa, Saowaluk C.; Thipakorn, Bundit; Charoenkitkarn, Nipon

    Effective classification and analysis of semantic contents are very important for the content-based indexing and retrieval of video database. Our research attempts to classify movie clips into three groups of commonly elicited emotions, namely excitement, joy and sadness, based on a set of abstract-level semantic features extracted from the film sequence. In particular, these features consist of six visual and audio measures grounded on the artistic film theories. A unique sieving-structured neural network is proposed to be the classifying model due to its robustness. The performance of the proposed model is tested with 101 movie clips excerpted from 24 award-winning and well-known Hollywood feature films. The experimental result of 97.8% correct classification rate, measured against the collected human-judges, indicates the great potential of using abstract-level semantic features as an engineered tool for the application of video-content retrieval/indexing.

  8. Land Cover and Land Use Classification with TWOPAC: towards Automated Processing for Pixel- and Object-Based Image Classification

    Directory of Open Access Journals (Sweden)

    Stefan Dech

    2012-09-01

    Full Text Available We present a novel and innovative automated processing environment for the derivation of land cover (LC) and land use (LU) information. This processing framework named TWOPAC (TWinned Object and Pixel based Automated classification Chain) enables the standardized, independent, user-friendly, and comparable derivation of LC and LU information, with minimized manual classification labor. TWOPAC allows classification of multi-spectral and multi-temporal remote sensing imagery from different sensor types. TWOPAC enables not only pixel-based classification, but also allows classification based on object-based characteristics. Classification is based on a Decision Tree approach (DT) for which the well-known C5.0 code has been implemented, which builds decision trees based on the concept of information entropy. TWOPAC enables automatic generation of the decision tree classifier based on a C5.0-retrieved ASCII file, as well as fully automatic validation of the classification output via sample-based accuracy assessment. Envisaging the automated generation of standardized land cover products, as well as area-wide classification of large amounts of data in preferably a short processing time, standardized interfaces for process control, Web Processing Services (WPS), as introduced by the Open Geospatial Consortium (OGC), are utilized. TWOPAC's functionality to process geospatial raster or vector data via web resources (server, network) enables TWOPAC's usability independent of any commercial client or desktop software and allows for large scale data processing on servers. Furthermore, the components of TWOPAC were built up using open source code components and are implemented as a plug-in for Quantum GIS software for easy handling of the classification process from the user's perspective.

  9. Immunogenomic Classification of Colorectal Cancer and Therapeutic Implications

    NARCIS (Netherlands)

    Roelands, Jessica; Kuppen, Peter J. K.; Vermeulen, Louis; Maccalli, Cristina; Decock, Julie; Wang, Ena; Marincola, Francesco M.; Bedognetti, Davide; Hendrickx, Wouter

    2017-01-01

    The immune system has a substantial effect on colorectal cancer (CRC) progression. Additionally, the response to immunotherapeutics and conventional treatment options (e.g., chemotherapy, radiotherapy and targeted therapies) is influenced by the immune system. The molecular characterization of

  10. A new gammagraphic and functional-based classification for hyperthyroidism

    International Nuclear Information System (INIS)

    Sanchez, J.; Lamata, F.; Cerdan, R.; Agilella, V.; Gastaminza, R.; Abusada, R.; Gonzales, M.; Martinez, M.

    2000-01-01

    The absence of a universal classification for hyperthyroidism (HT) gives rise to inadequate interpretation of series and trials, and prevents decision making. We offer a tentative classification based on gammagraphic and functional findings. Clinical records from patients who underwent thyroidectomy in our Department from 1967 to 1997 were reviewed. Those with functional measurements of hyperthyroidism were considered. All were managed according to the same preestablished guidelines. HT was the surgical indication in 694 (27.1%) of the 2559 thyroidectomies. Based on gammagraphic studies, we classified HTs into: parenchymatous increased-uptake, which could be diffuse, diffuse with cold nodules or diffuse with at least one nodule; and nodular increased-uptake (Autonomous Functioning Thyroid Nodes, AFTN), divided into solitary AFTN or toxic adenoma and multiple AFTN or toxic multi-nodular goiter. This gammagraphic-based classification is useful and has high sensitivity for detecting these nodules and assessing their activity, allowing therapeutic decisions to be made and, in some cases, the surgical technique to be chosen. (authors)

  11. Clinical classification of cancer cachexia: phenotypic correlates in human skeletal muscle.

    Directory of Open Access Journals (Sweden)

    Neil Johns

    Full Text Available BACKGROUND: Cachexia affects the majority of patients with advanced cancer and is associated with a reduction in treatment tolerance, response to therapy, and duration of survival. One impediment towards the effective treatment of cachexia is a validated classification system. METHODS: 41 patients with resectable upper gastrointestinal (GI) or pancreatic cancer underwent characterisation for cachexia based on weight-loss (WL) and/or low muscularity (LM). Four diagnostic criteria were used: >5%WL, >10%WL, LM, and LM+>2%WL. All patients underwent biopsy of the rectus muscle. Analysis included immunohistochemistry for fibre size and type, protein and nucleic acid concentration, and Western blots for markers of autophagy, SMAD signalling, and inflammation. FINDINGS: In patients with LM or LM+>2%WL, mean muscle fibre diameter was reduced by about 25% compared with non-cachectic cancer patients (p = 0.02 and p = 0.001, respectively). No significant difference in fibre diameter was observed if patients had WL alone. Regardless of classification, there was no difference in fibre number or proportion of fibre type across all myosin heavy chain isoforms. Mean muscle protein content was reduced and the ratio of RNA/DNA decreased in patients with either >5%WL or LM+>2%WL. Compared with non-cachectic patients, SMAD3 protein levels were increased in patients with >5%WL (p = 0.022), and with >10%WL, beclin (p = 0.05) and ATG5 (p = 0.01) protein levels were increased. There were no differences in phospho-NFkB or phospho-STAT3 levels across any of the groups. CONCLUSION: Muscle fibre size, biochemical composition and pathway phenotype can vary according to whether the diagnostic criteria for cachexia are based on weight loss alone, a measure of low muscularity alone or a combination of the two. For intervention trials where the primary end-point is a change in muscle mass or function, use of combined diagnostic criteria may allow identification of a more

  12. G0-WISHART Distribution Based Classification from Polarimetric SAR Images

    Science.gov (United States)

    Hu, G. C.; Zhao, Q. H.

    2017-09-01

    Enormous scientific and technical developments have been carried out over decades to improve remote sensing, particularly the Polarimetric Synthetic Aperture Radar (PolSAR) technique, so classification methods based on PolSAR images have received much attention from scholars and related departments around the world. The multilook polarimetric G0-Wishart model is a flexible model that describes homogeneous, heterogeneous and extremely heterogeneous regions in the image. Moreover, the polarimetric G0-Wishart distribution does not include the modified Bessel function of the second kind. It is a simple statistical distribution model with few parameters. To prove its feasibility, a classification process was tested on a fully polarized Synthetic Aperture Radar (SAR) image. First, multilook polarimetric SAR data processing and a speckle filter are applied to reduce the influence of speckle on the classification result. The image is then initially classified into sixteen classes by H/A/α decomposition. Finally, the ICM algorithm is used to classify features based on the G0-Wishart distance. Qualitative and quantitative results show that the proposed method can classify polarimetric SAR data effectively and efficiently.

  13. Design and implementation based on the classification protection vulnerability scanning system

    International Nuclear Information System (INIS)

    Wang Chao; Lu Zhigang; Liu Baoxu

    2010-01-01

    With the application and spread of classification protection, network security vulnerability scanning should consider efficiency and functional expansion. This paper proposes a view of system vulnerability from the perspective of classification protection and elaborates the design and implementation of a vulnerability scanning system that is oriented to classification protection and based on vulnerability classification plug-in technology. According to the experiment, the application of classification protection gives the system good adaptability and scalability, and it also improves the efficiency of scanning. (authors)

  14. Soil classification basing on the spectral characteristics of topsoil samples

    Science.gov (United States)

    Liu, Huanjun; Zhang, Xiaokang; Zhang, Xinle

    2016-04-01

    Soil taxonomy plays an important role in soil use and management, but China has only a coarse soil map based on 1980s data. New technology, e.g. spectroscopy, could simplify soil classification. This study tries to classify soils based on the spectral characteristics of topsoil samples. 148 topsoil samples of typical soils, including Black soil, Chernozem, Blown soil and Meadow soil, were collected from the Songnen plain, Northeast China, and the laboratory-measured spectral reflectance in the visible and near infrared region (400-2500 nm) was processed with a weighted moving average, a resampling technique, and continuum removal. Spectral indices were extracted from the soil spectral characteristics, including the second absorption position of the spectral curve, the area of the first absorption valley, and the slope of the spectral curve at 500-600 nm and 1340-1360 nm. K-means clustering and a decision tree were then each used to build a soil classification model. The results indicated that 1) the second absorption positions of Black soil and Chernozem were located at 610 nm and 650 nm respectively; 2) the spectral curve of Meadow soil is similar to that of its adjacent soil, which could be due to soil erosion; 3) the decision tree model showed higher classification accuracy, with accuracies for Black soil, Chernozem, Blown soil and Meadow soil of 100%, 88%, 97% and 50% respectively, and the accuracy for Blown soil could be increased to 100% by adding one more spectral index (the area of the first two valleys) to the model, which showed that the model could be used for soil classification and soil mapping in the near future.
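
    To make the K-means step concrete, here is a small dependency-free sketch that clusters samples described by two spectral indices. The feature pairs (an absorption position in nm and a hypothetical valley area), the starting centroids and the function names are assumptions for illustration; only the 610 nm and 650 nm positions echo values reported in the abstract.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda k: squared_distance(point, centroids[k]))


def kmeans(points, initial_centroids, iterations=20):
    """Plain Lloyd's algorithm with caller-supplied starting centroids."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = []
    for _ in range(iterations):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        if updated == centroids:
            break  # assignments have stabilised
        centroids = updated
    return labels, centroids


# Toy samples: (second absorption position in nm, hypothetical valley area).
samples = [(610.0, 0.10), (612.0, 0.12), (650.0, 0.30), (648.0, 0.28)]
labels, centroids = kmeans(samples, initial_centroids=[samples[0], samples[2]])
```

    The two features here live on very different scales, so in practice each index would usually be standardized before clustering; otherwise the wavelength position dominates the distance.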

  15. hemaClass.org: Online One-By-One Microarray Normalization and Classification of Hematological Cancers for Precision Medicine.

    Science.gov (United States)

    Falgreen, Steffen; Ellern Bilgrau, Anders; Brøndum, Rasmus Froberg; Hjort Jakobsen, Lasse; Have, Jonas; Lindblad Nielsen, Kasper; El-Galaly, Tarec Christoffer; Bødker, Julie Støve; Schmitz, Alexander; H Young, Ken; Johnsen, Hans Erik; Dybkær, Karen; Bøgsted, Martin

    2016-01-01

    Dozens of omics-based cancer classification systems have been introduced with prognostic, diagnostic, and predictive capabilities. However, they often employ complex algorithms and are only applicable to whole cohorts of patients, making them difficult to apply in a personalized clinical setting. This prompted us to create hemaClass.org, an online web application providing an easy interface to one-by-one RMA normalization of microarrays and subsequent risk classifications of diffuse large B-cell lymphoma (DLBCL) into cell-of-origin and chemotherapeutic sensitivity classes. Classification results for one-by-one array pre-processing with and without a laboratory-specific RMA reference dataset were compared to cohort-based classifiers in 4 publicly available datasets. Classifications showed high agreement between one-by-one and whole-cohort pre-processed data when a laboratory-specific reference set was supplied. The website is essentially the R-package hemaClass accompanied by a Shiny web application. The well-documented package can be used to run the website locally or to use the developed methods programmatically. The website and R-package are relevant for biological and clinical lymphoma researchers using Affymetrix U-133 Plus 2 arrays, as they provide reliable and swift methods for calculation of disease subclasses. The proposed one-by-one pre-processing method is relevant for all researchers using microarrays.

  16. Prognostic classification index in Iranian colorectal cancer patients: Survival tree analysis

    Directory of Open Access Journals (Sweden)

    Amal Saki Malehi

    2016-01-01

    Full Text Available Aims: The aim of this study was to determine the prognostic index for separating homogeneous subgroups in colorectal cancer (CRC) patients based on clinicopathological characteristics using survival tree analysis. Methods: The current study was conducted at the Research Center of Gastroenterology and Liver Disease, Shahid Beheshti Medical University in Tehran, between January 2004 and January 2009. A total of 739 patients who had already been diagnosed with CRC based on a pathology report were enrolled. The data included demographic and clinicopathological characteristics of the patients. Tree-structured survival analysis based on a recursive partitioning algorithm was implemented to evaluate prognostic factors. The probability curves were calculated according to the Kaplan-Meier method, and the hazard ratio was estimated as the effect size of interest. Results: There were 526 males (71.2%) among these patients. The mean survival time (from time of diagnosis) was 42.46 ± 3.4. The survival tree identified three variables as the main prognostic factors, and four prognostic subgroups were constructed based on them. The log-rank test showed good separation of the survival curves. Patients with stage I-IIIA disease who were treated with surgery as the first treatment showed low risk (median = 34 months), whereas patients with stage IIIB or IV disease who were older than 68 years had the worst survival outcome (median = 9.5 months). Conclusion: Constructing the prognostic classification index via a survival tree can help researchers assess interactions between clinical variables and determine the cumulative effect of these variables on survival outcome.
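
    The Kaplan-Meier method mentioned above can be illustrated with a short product-limit estimator. This is a minimal sketch under stated assumptions: the function name and the toy follow-up times are invented for illustration and are unrelated to the study's 739 patients.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time for each subject
    events : 1 if the event (death) was observed, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # both deaths and censored subjects leave the risk set
    return curve


# Toy follow-up in months; the second subject at month 8 is censored.
curve = kaplan_meier(times=[5, 8, 8, 12, 20], events=[1, 1, 0, 1, 0])
```

    A survival tree would compute such curves within each candidate subgroup and choose the splits that separate them best, typically judged with a log-rank statistic.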

  17. Evaluation Methodology between Globalization and Localization Features Approaches for Skin Cancer Lesions Classification

    Science.gov (United States)

    Ahmed, H. M.; Al-azawi, R. J.; Abdulhameed, A. A.

    2018-05-01

    Considerable effort has been put into developing diagnostic methods for skin cancer. In this paper, two different approaches to detecting skin cancer in dermoscopy images are addressed. The first is a global approach that uses global features to classify skin lesions, whereas the second is a local approach that uses local features. The aim of this paper is to select the better approach for skin lesion classification. The dataset used in this paper consists of 200 dermoscopy images from Pedro Hispano Hospital (PH2). For the globalization approach, the achieved results are a sensitivity of about 96%, specificity of about 100%, precision of about 100%, and accuracy of about 97%; for the localization approach, sensitivity, specificity, precision, and accuracy are all about 100%. These results show that the localization approach achieves acceptable accuracy and performs better than the globalization approach for skin cancer lesion classification.

  18. Breast cancer tumor classification using LASSO method selection approach

    International Nuclear Information System (INIS)

    Celaya P, J. M.; Ortiz M, J. A.; Martinez B, M. R.; Solis S, L. O.; Castaneda M, R.; Garza V, I.; Martinez F, M.; Ortiz R, J. M.

    2016-10-01

    Breast cancer is one of the leading causes of death worldwide among women. Early tumor detection is key in reducing breast cancer deaths, and screening mammography is the most widely available method for early detection. Mammography is the most common and effective breast cancer screening test. However, the rate of positive findings is very low, making the radiologic interpretation monotonous and biased toward errors. In an attempt to alleviate radiological workload, this work presents a computer-aided diagnosis (CADx) method aimed at automatically classifying tumor lesions as malign or benign as a means to a second opinion. The CADx method extracts image features and classifies the screening mammogram abnormality into one of two categories: subject at risk of having a malignant tumor (malign), and healthy subject (benign). In this study, 143 abnormal segmentations (57 malign and 86 benign) from the Breast Cancer Digital Repository (BCDR) public database were used to train and evaluate the CADx system. Percentile-rank (p-rank) was used to standardize the data. Using the LASSO feature selection methodology, the model achieved a leave-one-out cross-validation area under the receiver operating characteristic curve (AUC) of 0.950. The proposed method has the potential to rank abnormal lesions with high probability of malignant findings, aiding in the detection of potential malign cases as a second opinion to the radiologist. (Author)
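
    Two of the ingredients named above, percentile-rank standardization and the area under the ROC curve, can be sketched in a few lines. The exact p-rank definition used by the authors is not given, so the version below (fraction of values less than or equal to each value) is an assumption, as are the function names and toy scores; LASSO selection and the leave-one-out loop are not reproduced.

```python
def percentile_rank(values):
    """Map each value to the fraction of values that are <= it (assumed definition)."""
    n = len(values)
    return [sum(1 for other in values if other <= v) / n for v in values]


def auc(scores, labels):
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as half a win (Mann-Whitney formulation).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy classifier scores (not study data): 1 = malign, 0 = benign.
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
raw_auc = auc(scores, labels)
ranked_auc = auc(percentile_rank(scores), labels)
```

    Because AUC depends only on the ordering of the scores, applying a monotone transform such as the percentile rank to the scores themselves leaves it unchanged; standardizing the input features, as in the abstract, is a separate step that affects the fitted model.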

  19. Breast cancer tumor classification using LASSO method selection approach

    Energy Technology Data Exchange (ETDEWEB)

    Celaya P, J. M.; Ortiz M, J. A.; Martinez B, M. R.; Solis S, L. O.; Castaneda M, R.; Garza V, I.; Martinez F, M.; Ortiz R, J. M., E-mail: morvymm@yahoo.com.mx [Universidad Autonoma de Zacatecas, Av. Ramon Lopez Velarde 801, Col. Centro, 98000 Zacatecas, Zac. (Mexico)

    2016-10-15

    Breast cancer is one of the leading causes of death worldwide among women. Early tumor detection is key in reducing breast cancer deaths, and screening mammography is the most widely available method for early detection. Mammography is the most common and effective breast cancer screening test. However, the rate of positive findings is very low, making the radiologic interpretation monotonous and biased toward errors. In an attempt to alleviate radiological workload, this work presents a computer-aided diagnosis (CADx) method aimed at automatically classifying tumor lesions as malign or benign as a means to a second opinion. The CADx method extracts image features and classifies the screening mammogram abnormality into one of two categories: subject at risk of having a malignant tumor (malign), and healthy subject (benign). In this study, 143 abnormal segmentations (57 malign and 86 benign) from the Breast Cancer Digital Repository (BCDR) public database were used to train and evaluate the CADx system. Percentile-rank (p-rank) was used to standardize the data. Using the LASSO feature selection methodology, the model achieved a leave-one-out cross-validation area under the receiver operating characteristic curve (AUC) of 0.950. The proposed method has the potential to rank abnormal lesions with high probability of malignant findings, aiding in the detection of potential malign cases as a second opinion to the radiologist. (Author)

  20. Breast Cancer Survival Defined by the ER/PR/HER2 Subtypes and a Surrogate Classification according to Tumor Grade and Immunohistochemical Biomarkers

    International Nuclear Information System (INIS)

    Parise, C. A.; Caggiano, V.

    2014-01-01

    ER, PR, and HER2 are routinely available in breast cancer specimens. The purpose of this study is to contrast breast cancer-specific survival for the eight ER/PR/HER2 subtypes with survival of an immunohistochemical surrogate for the molecular subtype based on the ER/PR/HER2 subtypes and tumor grade. Methods. We identified 123,780 cases of stages 1-3 primary female invasive breast cancer from California Cancer Registry. The surrogate classification was derived using ER/PR/HER2 and tumor grade. Kaplan-Meier survival analysis and Cox proportional hazards modeling were used to assess differences in survival and risk of mortality for the ER/PR/HER2 subtypes and surrogate classification within each stage. Results. The luminal B/HER2− surrogate classification had a higher risk of mortality than the luminal B/HER2+ for all stages of disease. There was no difference in risk of mortality between the ER+/PR+/HER2− and ER+/PR+/HER2+ in stage 3. With one exception in stage 3, the ER-negative subtypes all had an increased risk of mortality when compared with the ER-positive subtypes. Conclusions. Assessment of survival using ER/PR/HER2 illustrates the heterogeneity of HER2+ subtypes. The surrogate classification provides clear separation in survival and adjusted mortality but underestimates the wide variability within the subtypes that make up the classification.

  1. Breast Cancer Survival Defined by the ER/PR/HER2 Subtypes and a Surrogate Classification according to Tumor Grade and Immunohistochemical Biomarkers

    Directory of Open Access Journals (Sweden)

    Carol A. Parise

    2014-01-01

    Full Text Available Introduction. ER, PR, and HER2 are routinely available in breast cancer specimens. The purpose of this study is to contrast breast cancer-specific survival for the eight ER/PR/HER2 subtypes with survival of an immunohistochemical surrogate for the molecular subtype based on the ER/PR/HER2 subtypes and tumor grade. Methods. We identified 123,780 cases of stages 1–3 primary female invasive breast cancer from California Cancer Registry. The surrogate classification was derived using ER/PR/HER2 and tumor grade. Kaplan-Meier survival analysis and Cox proportional hazards modeling were used to assess differences in survival and risk of mortality for the ER/PR/HER2 subtypes and surrogate classification within each stage. Results. The luminal B/HER2− surrogate classification had a higher risk of mortality than the luminal B/HER2+ for all stages of disease. There was no difference in risk of mortality between the ER+/PR+/HER2− and ER+/PR+/HER2+ in stage 3. With one exception in stage 3, the ER-negative subtypes all had an increased risk of mortality when compared with the ER-positive subtypes. Conclusions. Assessment of survival using ER/PR/HER2 illustrates the heterogeneity of HER2+ subtypes. The surrogate classification provides clear separation in survival and adjusted mortality but underestimates the wide variability within the subtypes that make up the classification.

  2. Cell nuclei attributed relational graphs for efficient representation and classification of gastric cancer in digital histopathology

    Science.gov (United States)

    Sharma, Harshita; Zerbe, Norman; Heim, Daniel; Wienert, Stephan; Lohmann, Sebastian; Hellwich, Olaf; Hufnagl, Peter

    2016-03-01

    This paper describes a novel graph-based method for efficient representation and subsequent classification in histological whole slide images of gastric cancer. Her2/neu immunohistochemically stained and haematoxylin and eosin stained histological sections of gastric carcinoma are digitized. Immunohistochemical staining is used in practice by pathologists to determine the extent of malignancy; however, it is laborious to visually discriminate the corresponding malignancy levels in the more commonly used haematoxylin and eosin stain, and this study attempts to solve this problem using a computer-based method. Cell nuclei are first isolated at high magnification using an automatic cell nuclei segmentation strategy, followed by construction of cell nuclei attributed relational graphs of the tissue regions. These graphs represent tissue architecture comprehensively, as they contain information about cell nuclei morphology as vertex attributes, along with knowledge of neighborhood in the form of edge linking and edge attributes. Global graph characteristics are derived and ensemble learning is used to discriminate between three types of malignancy levels, namely, non-tumor, Her2/neu positive tumor and Her2/neu negative tumor. Performance is compared with state-of-the-art methods including four texture feature groups (Haralick, Gabor, Local Binary Patterns and Varma Zisserman features), color and intensity features, and Voronoi diagram and Delaunay triangulation. Texture, color and intensity information is also combined with graph-based knowledge, followed by correlation analysis. Quantitative assessment is performed using two cross-validation strategies. The experimental results indicate that the proposed method provides a promising way for computer-based analysis of histopathological images of gastric cancer.

  3. Radiographic classification for fractures of the fifth metatarsal base

    International Nuclear Information System (INIS)

    Mehlhorn, Alexander T.; Zwingmann, Joern; Hirschmueller, Anja; Suedkamp, Norbert P.; Schmal, Hagen

    2014-01-01

    Avulsion fractures of the fifth metatarsal base (MTB5) are common forefoot injuries. Based on a radiomorphometric analysis reflecting the risk for a secondary displacement, a new classification was developed. A cohort of 95 healthy, athletic, and young patients (age ≤ 50 years) with avulsion fractures of the MTB5 was included in the study and divided into groups with non-displaced, primary-displaced, and secondary-displaced fractures. Radiomorphometric data obtained using standard oblique and dorso-plantar views were analyzed in association with secondary displacement. Based on this, a classification was developed and checked for reproducibility. Fractures with a longer distance between the lateral edge of the styloid process and the lateral fracture step-off and fractures with a more medial joint entry of the fracture line at the MTB5 are at higher risk of secondary displacement. Based on these findings, all fractures were divided into three types: type I with a fracture entry in the lateral third; type II in the middle third; and type III in the medial third of the MTB5. Additionally, the three types were subdivided into an A-type with a fracture displacement < 2 mm and a B-type with a fracture displacement ≥ 2 mm. A substantial level of interobserver agreement was found in the assignment of all 95 fractures to the six fracture types (κ = 0.72). The secondary displacement of fractures was confirmed by all examiners in 100% of cases. Radiomorphometric data may identify fractures at risk for secondary displacement of the MTB5. Based on this, a reliable classification was developed. (orig.)
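
The six-type scheme described in this abstract (entry third I/II/III combined with displacement subtype A/B) can be written as a small decision rule. This is a sketch only: the function name and the normalized `entry_position` parameter (0.0 at the lateral edge of the MTB5 joint surface, 1.0 at the medial edge) are hypothetical conventions introduced here, and the abstract does not describe how the thirds are measured on the radiograph.

```python
def classify_mtb5_fracture(entry_position, displacement_mm):
    """Return a label such as 'IA' or 'IIIB' for an MTB5 avulsion fracture.

    entry_position: hypothetical normalized joint-entry position of the
        fracture line, 0.0 = lateral edge, 1.0 = medial edge.
    displacement_mm: fracture displacement in millimetres.
    """
    if not 0.0 <= entry_position <= 1.0:
        raise ValueError("entry_position must lie in [0, 1]")
    if displacement_mm < 0:
        raise ValueError("displacement_mm must be non-negative")

    # Type by the third of the MTB5 in which the fracture line enters.
    if entry_position < 1 / 3:
        fracture_type = "I"      # lateral third
    elif entry_position < 2 / 3:
        fracture_type = "II"     # middle third
    else:
        fracture_type = "III"    # medial third

    # Subtype by displacement: A is < 2 mm, B is >= 2 mm.
    subtype = "A" if displacement_mm < 2.0 else "B"
    return fracture_type + subtype
```

How boundary cases exactly at one third or two thirds are assigned is an arbitrary choice in this sketch.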

  4. Radiographic classification for fractures of the fifth metatarsal base

    Energy Technology Data Exchange (ETDEWEB)

    Mehlhorn, Alexander T.; Zwingmann, Joern; Hirschmueller, Anja; Suedkamp, Norbert P.; Schmal, Hagen [University of Freiburg Medical Center, Department of Orthopaedic Surgery, Freiburg (Germany)]

    2014-04-15

    Avulsion fractures of the fifth metatarsal base (MTB5) are common forefoot injuries. Based on a radiomorphometric analysis reflecting the risk for a secondary displacement, a new classification was developed. A cohort of 95 healthy, athletic, and young patients (age ≤ 50 years) with avulsion fractures of the MTB5 was included in the study and divided into groups with non-displaced, primary-displaced, and secondary-displaced fractures. Radiomorphometric data obtained using standard oblique and dorso-plantar views were analyzed in association with secondary displacement. Based on this, a classification was developed and checked for reproducibility. Fractures with a longer distance between the lateral edge of the styloid process and the lateral fracture step-off and fractures with a more medial joint entry of the fracture line at the MTB5 are at higher risk of secondary displacement. Based on these findings, all fractures were divided into three types: type I with a fracture entry in the lateral third; type II in the middle third; and type III in the medial third of the MTB5. Additionally, the three types were subdivided into an A-type with a fracture displacement < 2 mm and a B-type with a fracture displacement ≥ 2 mm. A substantial level of interobserver agreement was found in the assignment of all 95 fractures to the six fracture types (κ = 0.72). The secondary displacement of fractures was confirmed by all examiners in 100% of cases. Radiomorphometric data may identify fractures at risk for secondary displacement of the MTB5. Based on this, a reliable classification was developed. (orig.)

  5. Risk Classification and Risk-based Safety and Mission Assurance

    Science.gov (United States)

    Leitner, Jesse A.

    2014-01-01

    Recent activities to revamp and emphasize the need to streamline processes and activities for Class D missions across the agency have led to various interpretations of Class D, including the lumping of a variety of low-cost projects into Class D. Sometimes terms such as Class D minus are used. In this presentation, mission risk classifications will be traced to official requirements and definitions as a measure to ensure that projects and programs align with the guidance and requirements that are commensurate with their defined risk posture. As part of this, the full suite of risk classifications, formal and informal, will be defined, followed by an introduction to the new GPR 8705.4 that is currently under review. GPR 8705.4 lays out guidance for the mission success activities performed at Classes A-D for NPR 7120.5 projects as well as for projects not under NPR 7120.5. Furthermore, the trends in stepping from Class A into higher risk posture classifications will be discussed. The talk will conclude with a discussion about risk-based safety and mission assurance at GSFC.

  6. Overfitting Reduction of Text Classification Based on AdaBELM

    Directory of Open Access Journals (Sweden)

    Xiaoyue Feng

    2017-07-01

    Full Text Available Overfitting is an important problem in machine learning. Several algorithms, such as the extreme learning machine (ELM), suffer from this issue when facing high-dimensional sparse data, e.g., in text classification. One common issue is that the extent of overfitting is not well quantified. In this paper, we propose a quantitative measure of overfitting referred to as the rate of overfitting (RO) and a novel model, named AdaBELM, to reduce the overfitting. With RO, the overfitting problem can be quantitatively measured and identified. The newly proposed model can achieve high performance on multi-class text classification. To evaluate the generalizability of the new model, we designed experiments based on three datasets, i.e., the 20 Newsgroups, Reuters-21578, and BioMed corpora, which represent balanced, unbalanced, and real application data, respectively. Experiment results demonstrate that AdaBELM can reduce overfitting and outperform classical ELM, decision tree, random forests, and AdaBoost on all three text-classification datasets; for example, it can achieve 62.2% higher accuracy than ELM. Therefore, the proposed model has good generalizability.

  7. Image Classification Based on Convolutional Denoising Sparse Autoencoder

    Directory of Open Access Journals (Sweden)

    Shuangshuang Chen

    2017-01-01

    Full Text Available Image classification aims to group images into corresponding semantic categories. Due to the difficulties of interclass similarity and intraclass variability, it is a challenging issue in computer vision. In this paper, an unsupervised feature learning approach called convolutional denoising sparse autoencoder (CDSAE) is proposed based on the theory of visual attention mechanism and deep learning methods. Firstly, a saliency detection method is utilized to get training samples for unsupervised feature learning. Next, these samples are sent to the denoising sparse autoencoder (DSAE), followed by a convolutional layer and a local contrast normalization layer. Generally, a prior for a specific task is helpful for solving that task. Therefore, a new pooling strategy, spatial pyramid pooling (SPP) fused with a center-bias prior, is introduced into our approach. Experimental results on two common image datasets (STL-10 and CIFAR-10) demonstrate that our approach is effective in image classification. They also demonstrate that none of these three components (local contrast normalization, SPP fused with center-bias prior, and l2 vector normalization) can be excluded from our proposed approach. They jointly improve image representation and classification performance.

  8. Tongue Images Classification Based on Constrained High Dispersal Network

    Directory of Open Access Journals (Sweden)

    Dan Meng

    2017-01-01

    Full Text Available Computer-aided tongue diagnosis has great potential to play important roles in traditional Chinese medicine (TCM). However, the majority of the existing tongue image analyses and classification methods are based on low-level features, which may not provide a holistic view of the tongue. Inspired by the deep convolutional neural network (CNN), we propose a novel feature extraction framework called constrained high dispersal neural networks (CHDNet) to extract unbiased features and reduce human labor for tongue diagnosis in TCM. Previous CNN models have mostly focused on learning convolutional filters and adapting weights between them, but these models have two major issues: redundancy and insufficient capability in handling unbalanced sample distribution. We introduce high dispersal and local response normalization operations to address the issue of redundancy. We also add multiscale feature analysis to avoid the problem of sensitivity to deformation. Our proposed CHDNet learns high-level features and provides more classification information during training time, which may result in higher accuracy when predicting testing samples. We tested the proposed method on a set of 267 gastritis patients and a control group of 48 healthy volunteers. Test results show that CHDNet is a promising method in tongue image classification for the TCM study.

  9. Examining applying high performance genetic data feature selection and classification algorithms for colon cancer diagnosis.

    Science.gov (United States)

    Al-Rajab, Murad; Lu, Joan; Xu, Qiang

    2017-07-01

    This paper examines the accuracy and efficiency (time complexity) of high performance genetic data feature selection and classification algorithms for colon cancer diagnosis. The need for this research derives from the urgent and increasing need for accurate and efficient algorithms. Colon cancer is a leading cause of death worldwide, hence it is vitally important for the cancer tissues to be expertly identified and classified in a rapid and timely manner, both to assure fast detection of the disease and to expedite the drug discovery process. In this research, a three-phase approach was proposed and implemented: Phases One and Two examined the feature selection algorithms and classification algorithms employed separately, and Phase Three examined the performance of the combination of these. It was found from Phase One that the Particle Swarm Optimization (PSO) algorithm performed best with the colon dataset as a feature selection method (29 genes selected) and from Phase Two that the Support Vector Machine (SVM) algorithm outperformed other classifiers, with an accuracy of almost 86%. It was also found from Phase Three that the combined use of PSO and SVM surpassed other algorithms in accuracy and performance, and was faster in terms of time analysis (94%). It is concluded that applying feature selection algorithms prior to classification algorithms results in better accuracy than when the latter are applied alone. This conclusion is important and significant to industry and society. Copyright © 2017 Elsevier B.V. All rights reserved.

  10. Classification of Noisy Data: An Approach Based on Genetic Algorithms and Voronoi Tessellation

    DEFF Research Database (Denmark)

    Khan, Abdul Rauf; Schiøler, Henrik; Knudsen, Torben

    Classification is one of the major constituents of the data-mining toolkit. The well-known methods for classification are built on either the principle of logic or statistical/mathematical reasoning for classification. In this article we propose: (1) a different strategy, which is based on the po...

  11. Contaminant classification using cosine distances based on multiple conventional sensors.

    Science.gov (United States)

    Liu, Shuming; Che, Han; Smith, Kate; Chang, Tian

    2015-02-01

    Emergent contamination events have a significant impact on water systems. After contamination detection, it is important to classify the type of contaminant quickly to provide support for remediation attempts. Conventional methods generally either rely on laboratory-based analysis, which requires a long analysis time, or on multivariable-based geometry analysis and sequence analysis, which is prone to being affected by the contaminant concentration. This paper proposes a new contaminant classification method, which discriminates contaminants in real time, independent of the contaminant concentration. The proposed method quantifies the similarities or dissimilarities between sensors' responses to different types of contaminants. The performance of the proposed method was evaluated using data from contaminant injection experiments in a laboratory and compared with a Euclidean distance-based method. The robustness of the proposed method was evaluated using an uncertainty analysis. The results show that the proposed method performed better in identifying the type of contaminant than the Euclidean distance-based method and that it could classify the type of contaminant in minutes without significantly compromising the correct classification rate (CCR).
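
Why a cosine distance can be insensitive to contaminant concentration, while a Euclidean distance is not, can be seen in a short sketch. It assumes, purely for illustration, that a sensor-response vector scales proportionally with concentration; the reference library, the contaminant names, and the nearest-reference rule below are hypothetical and are not taken from the paper.

```python
import math


def cosine_distance(a, b):
    """1 - cos(angle between a and b); unchanged when a or b is rescaled."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance; depends on the magnitude of the response."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(response, library, distance):
    """Return the name of the reference response closest to `response`."""
    return min(library, key=lambda name: distance(response, library[name]))


# Hypothetical reference responses of three sensors to two contaminants.
LIBRARY = {
    "contaminant_A": [1.0, 0.5, -0.2],
    "contaminant_B": [0.1, 0.3, 0.2],
}

# Contaminant A at one tenth of the reference concentration (assumed linear).
low_concentration_a = [0.1, 0.05, -0.02]
```

With these made-up numbers, `classify(low_concentration_a, LIBRARY, cosine_distance)` picks `contaminant_A`, whereas the Euclidean rule is pulled toward `contaminant_B` simply because its reference vector has a smaller magnitude.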

  12. Application of Bayesian Classification to Content-Based Data Management

    Science.gov (United States)

    Lynnes, Christopher; Berrick, S.; Gopalan, A.; Hua, X.; Shen, S.; Smith, P.; Yang, K-Y.; Wheeler, K.; Curry, C.

    2004-01-01

    The high volume of Earth Observing System data has proven to be challenging to manage for data centers and users alike. At the Goddard Earth Sciences Distributed Active Archive Center (GES DAAC), about 1 TB of new data are archived each day. Distribution to users is also about 1 TB/day. A substantial portion of this distribution is MODIS calibrated radiance data, which has a wide variety of uses. However, much of the data is not useful for a particular user's needs: for example, ocean color users typically need oceanic pixels that are free of cloud and sun-glint. The GES DAAC is using a simple Bayesian classification scheme to rapidly classify each pixel in the scene in order to support several experimental content-based data services for near-real-time MODIS calibrated radiance products (from Direct Readout stations). Content-based subsetting would allow distribution of, say, only clear pixels to the user if desired. Content-based subscriptions would distribute data to users only when they fit the user's usability criteria in their area of interest within the scene. Content-based cache management would retain more useful data on disk for easy online access. The classification may even be exploited in an automated quality assessment of the geolocation product. Though initially to be demonstrated at the GES DAAC, these techniques have applicability in other resource-limited environments, such as spaceborne data systems.

  13. Impact of full field digital mammography on the classification and mammographic characteristics of interval breast cancers

    Energy Technology Data Exchange (ETDEWEB)

    Knox, Mark, E-mail: marktknox@gmail.com; O’Brien, Angela, E-mail: angelaobrien@doctors.org.uk; Szabó, Endre, E-mail: endrebacsi@freemail.hu; Smith, Clare S., E-mail: csmith@mater.ie; Fenlon, Helen M., E-mail: helen.fenlon@cancerscreening.ie; McNicholas, Michelle M., E-mail: michelle.mcnicholas@cancerscreening.ie; Flanagan, Fidelma L., E-mail: fidelma.flanagan@cancerscreening.ie

    2015-06-15

    Highlights: • Digital mammography has changed the presentation of interval breast cancer. • Fewer interval breast cancers are associated with microcalcifications following FFDM. • Interval breast cancer audit remains a key feature of any breast screening program. - Abstract: Objective: Full field digital mammography (FFDM) is increasingly replacing screen film mammography (SFM) in breast screening programs. Interval breast cancers are an issue in all screening programs and the purpose of our study is to assess the impact of FFDM on the classification of interval breast cancers at independent blind review and to compare the mammographic features of interval cancers at FFDM and SFM. Materials and methods: This study included 138 cases of interval breast cancer, 76 following an FFDM screening examination and 62 following screening with SFM. The prior screening mammogram was assessed by each of five consultant breast radiologists who were blinded to the site of subsequent cancer. Subsequent review of the diagnostic mammogram was performed and cases were classified as missed, minimal signs, occult or true interval. Mammographic features of the interval cancer at diagnosis and any abnormality identified on the prior screening mammogram were recorded. Results: The percentages of cancers classified as missed at FFDM and SFM did not differ significantly, 10.5% (8 of 76) at FFDM and 8.1% (5 of 62) at SFM (p = .77). There were significantly fewer interval cancers presenting as microcalcifications (alone or in association with another abnormality) following screening with FFDM, 16% (12 of 76), than following an SFM examination, 32% (20 of 62) (p = .02). Conclusion: Interval breast cancers continue to pose a problem at FFDM. The switch to FFDM has changed the mammographic presentation of interval breast cancer, with fewer interval cancers presenting in association with microcalcifications.
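
The percentages quoted in the Results follow directly from the reported counts, as the following arithmetic check shows. The helper name is hypothetical, and the significance tests behind the reported p-values are not reproduced because the abstract does not name the test used.

```python
def percent(count, total, digits=1):
    """Return count/total as a percentage rounded to `digits` decimals."""
    return round(100.0 * count / total, digits)


# Counts as reported in the abstract.
FFDM_TOTAL, SFM_TOTAL = 76, 62

missed_ffdm = percent(8, FFDM_TOTAL)               # missed cancers at FFDM
missed_sfm = percent(5, SFM_TOTAL)                 # missed cancers at SFM
microcalc_ffdm = percent(12, FFDM_TOTAL, digits=0)  # microcalcifications, FFDM
microcalc_sfm = percent(20, SFM_TOTAL, digits=0)    # microcalcifications, SFM
```

The two group sizes also add up to the 138 interval cancers stated in the Materials and methods.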

  14. Impact of full field digital mammography on the classification and mammographic characteristics of interval breast cancers

    International Nuclear Information System (INIS)

    Knox, Mark; O’Brien, Angela; Szabó, Endre; Smith, Clare S.; Fenlon, Helen M.; McNicholas, Michelle M.; Flanagan, Fidelma L.

    2015-01-01

    Highlights: • Digital mammography has changed the presentation of interval breast cancer. • Fewer interval breast cancers are associated with microcalcifications following FFDM. • Interval breast cancer audit remains a key feature of any breast screening program. - Abstract: Objective: Full field digital mammography (FFDM) is increasingly replacing screen film mammography (SFM) in breast screening programs. Interval breast cancers are an issue in all screening programs and the purpose of our study is to assess the impact of FFDM on the classification of interval breast cancers at independent blind review and to compare the mammographic features of interval cancers at FFDM and SFM. Materials and methods: This study included 138 cases of interval breast cancer, 76 following an FFDM screening examination and 62 following screening with SFM. The prior screening mammogram was assessed by each of five consultant breast radiologists who were blinded to the site of subsequent cancer. Subsequent review of the diagnostic mammogram was performed and cases were classified as missed, minimal signs, occult or true interval. Mammographic features of the interval cancer at diagnosis and any abnormality identified on the prior screening mammogram were recorded. Results: The percentages of cancers classified as missed at FFDM and SFM did not differ significantly, 10.5% (8 of 76) at FFDM and 8.1% (5 of 62) at SFM (p = .77). There were significantly fewer interval cancers presenting as microcalcifications (alone or in association with another abnormality) following screening with FFDM, 16% (12 of 76), than following an SFM examination, 32% (20 of 62) (p = .02). Conclusion: Interval breast cancers continue to pose a problem at FFDM. The switch to FFDM has changed the mammographic presentation of interval breast cancer, with fewer interval cancers presenting in association with microcalcifications.

  15. Object-based Dimensionality Reduction in Land Surface Phenology Classification

    Directory of Open Access Journals (Sweden)

    Brian E. Bunker

    2016-11-01

    Full Text Available Unsupervised classification or clustering of multi-decadal land surface phenology provides a spatio-temporal synopsis of natural and agricultural vegetation response to environmental variability and anthropogenic activities. Notwithstanding the detailed temporal information available in calibrated bi-monthly normalized difference vegetation index (NDVI) and comparable time series, typical pre-classification workflows average a pixel’s bi-monthly index within the larger multi-decadal time series. While this process is one practical way to reduce the dimensionality of time series with many hundreds of image epochs, it effectively dampens temporal variation from both intra- and inter-annual observations related to land surface phenology. Through a novel application of object-based segmentation aimed at spatial (not temporal) dimensionality reduction, all 294 image epochs from a Moderate Resolution Imaging Spectroradiometer (MODIS) bi-monthly NDVI time series covering the northern Fertile Crescent were retained (in homogeneous landscape units) as unsupervised classification inputs. Given the inherent challenges of in situ or manual image interpretation of land surface phenology classes, a cluster validation approach based on transformed divergence enabled comparison between traditional and novel techniques. Improved intra-annual contrast was clearly manifest in rain-fed agriculture and inter-annual trajectories showed increased cluster cohesion, reducing the overall number of classes identified in the Fertile Crescent study area from 24 to 10. Given careful segmentation parameters, this spatial dimensionality reduction technique augments the value of unsupervised learning to generate homogeneous land surface phenology units. By combining recent scalable computational approaches to image segmentation, future work can pursue new global land surface phenology products based on the high temporal resolution signatures of vegetation index time series.

  16. From Molecular Classification to Targeted Therapeutics: The Changing Face of Systemic Therapy in Metastatic Gastroesophageal Cancer

    Directory of Open Access Journals (Sweden)

    Adrian Murphy

    2015-01-01

    Full Text Available Histological classification of adenocarcinoma or squamous cell carcinoma for esophageal cancer or using the Lauren classification for intestinal and diffuse type gastric cancer has limited clinical utility in the management of advanced disease. Germline mutations in E-cadherin (CDH1) or mismatch repair genes (Lynch syndrome) were identified many years ago but given their rarity, the identification of these molecular alterations does not substantially impact treatment in the advanced setting. Recent molecular profiling studies of upper GI tumors have added to our knowledge of the underlying biology but have not led to an alternative classification system which can guide clinicians’ therapeutic decisions. Recently the Cancer Genome Atlas Research Network has proposed four subtypes of gastric cancer dividing tumors into those positive for Epstein-Barr virus, microsatellite unstable tumors, genomically stable tumors, and tumors with chromosomal instability. Unfortunately, to date, many phase III clinical trials involving molecularly targeted agents have failed to meet their survival endpoints due to their use in unselected populations. Future clinical trials should utilize molecular profiling of individual tumors in order to determine the optimal use of targeted therapies in preselected patients.

  17. Hydrophobicity classification of polymeric materials based on fractal dimension

    Directory of Open Access Journals (Sweden)

    Daniel Thomazini

    2008-12-01

    Full Text Available This study proposes a new method to obtain the hydrophobicity classification (HC) of high voltage polymer insulators. In this method, the HC was analyzed by fractal dimension (fd), and its processing time was evaluated with a view to application in mobile devices. Texture images were created by spraying solutions made from mixtures of isopropyl alcohol and distilled water in proportions ranging from 0 to 100% volume of alcohol (%AIA). Based on these solutions, the contact angles of the drops were measured and the textures were used as patterns for fractal dimension calculations.
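
A fractal dimension of a binary texture image is often estimated by box counting. The sketch below shows that technique as an assumption for illustration only; the abstract does not state which fractal dimension estimator the authors used, and the function names and box sizes are hypothetical.

```python
import math


def box_count(image, size):
    """Count boxes of side `size` that contain at least one foreground pixel."""
    rows, cols = len(image), len(image[0])
    count = 0
    for r in range(0, rows, size):
        for c in range(0, cols, size):
            if any(
                image[i][j]
                for i in range(r, min(r + size, rows))
                for j in range(c, min(c + size, cols))
            ):
                count += 1
    return count


def fractal_dimension(image, sizes=(1, 2, 4, 8)):
    """Least-squares slope of log(box count) against log(1 / box size)."""
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(image, s)) for s in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


# Two sanity-check images on a 16 x 16 grid: a filled square and a single row.
filled = [[1] * 16 for _ in range(16)]
line = [[1] * 16 if r == 0 else [0] * 16 for r in range(16)]
```

The filled square yields a dimension of 2 and the single row a dimension of 1, which is the expected behaviour of the estimator on these simple shapes; droplet textures would fall somewhere in between.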

  18. Parametric classification of handvein patterns based on texture features

    Science.gov (United States)

    Al Mahafzah, Harbi; Imran, Mohammad; Supreetha Gowda H., D.

    2018-04-01

    In this paper, we have developed a biometric recognition system based on the hand-based modality Handvein, which has a unique pattern for each individual and, as an internal feature, is impossible to counterfeit or fabricate. For feature extraction we chose LBP (a visual descriptor), LPQ (a blur-insensitive texture operator), and Log-Gabor (a texture descriptor). For classification we chose the well-known classifiers KNN and SVM. We experimented with and tabulated the recognition rate of each single algorithm for Handvein under different distance measures and kernel options. Feature-level fusion was then carried out, which increased the performance level.

  19. MODEL-BASED CLUSTERING FOR CLASSIFICATION OF AQUATIC SYSTEMS AND DIAGNOSIS OF ECOLOGICAL STRESS

    Science.gov (United States)

    Clustering approaches were developed using the classification likelihood, the mixture likelihood, and also using a randomization approach with a model index. Using a clustering approach based on the mixture and classification likelihoods, we have developed an algorithm that...

  20. Stratification and Prognostic Relevance of Jass’s Molecular Classification of Colorectal Cancer

    International Nuclear Information System (INIS)

    Zlobec, Inti; Bihl, Michel P.; Foerster, Anja; Rufle, Alex; Terracciano, Luigi; Lugli, Alessandro

    2012-01-01

    Background: The current proposed model of colorectal tumorigenesis is based primarily on CpG island methylator phenotype (CIMP), microsatellite instability (MSI), KRAS, BRAF, and methylation status of O-6-Methylguanine DNA Methyltransferase (MGMT) and classifies tumors into five subgroups. The aim of this study is to validate this molecular classification and test its prognostic relevance. Methods: Three hundred two patients were included in this study. Molecular analysis was performed for five CIMP-related promoters (CRABP1, MLH1, p16INK4a, CACNA1G, NEUROG1), MGMT, MSI, KRAS, and BRAF. Methylation in at least 4 promoters or in one to three promoters was considered CIMP-high and CIMP-low (CIMP-H/L), respectively. Results: CIMP-H, CIMP-L, and CIMP-negative were found in 7.1, 43, and 49.9% of cases, respectively. One hundred twenty-three tumors (41%) could not be classified into any one of the proposed molecular subgroups, including 107 CIMP-L, 14 CIMP-H, and two CIMP-negative cases. The 10-year survival rate for CIMP-high patients [22.6% (95%CI: 7–43)] was significantly lower than for CIMP-L or CIMP-negative (p = 0.0295). Only the combined analysis of BRAF and CIMP (negative versus L/H) led to distinct prognostic subgroups. Conclusion: Although CIMP status has an effect on outcome, our results underline the need for standardized definitions of low- and high-level CIMP, the lack of which clearly hinders an effective prognostic and molecular classification of colorectal cancer.
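
The CIMP thresholds stated in the Methods can be expressed as a short rule on the number of methylated promoters in the five-marker panel. The function name is hypothetical, and treating zero methylated promoters as CIMP-negative is an inference from the stated thresholds rather than an explicit statement in the abstract.

```python
# The five CIMP-related promoters listed in the abstract.
CIMP_PANEL = ("CRABP1", "MLH1", "p16INK4a", "CACNA1G", "NEUROG1")


def cimp_status(methylated_promoters):
    """Classify a tumor from the set of methylated CIMP panel promoters.

    >= 4 methylated promoters: CIMP-high
    1 to 3 methylated promoters: CIMP-low
    0 methylated promoters: CIMP-negative (assumed, see lead-in)
    """
    unknown = set(methylated_promoters) - set(CIMP_PANEL)
    if unknown:
        raise ValueError(f"not in the CIMP panel: {sorted(unknown)}")
    n = len(set(methylated_promoters))
    if n >= 4:
        return "CIMP-high"
    if n >= 1:
        return "CIMP-low"
    return "CIMP-negative"
```

For example, a tumor with only MLH1 and NEUROG1 methylated would be labeled CIMP-low under this rule.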

  1. Stratification and Prognostic Relevance of Jass’s Molecular Classification of Colorectal Cancer

    Energy Technology Data Exchange (ETDEWEB)

    Zlobec, Inti [Institute of Pathology, University of Bern, Bern (Switzerland); Institute for Pathology, University Hospital Basel, Basel (Switzerland); Bihl, Michel P.; Foerster, Anja; Rufle, Alex; Terracciano, Luigi [Institute for Pathology, University Hospital Basel, Basel (Switzerland); Lugli, Alessandro, E-mail: inti.zlobec@pathology.unibe.ch [Institute of Pathology, University of Bern, Bern (Switzerland); Institute for Pathology, University Hospital Basel, Basel (Switzerland)

    2012-02-27

    Background: The current proposed model of colorectal tumorigenesis is based primarily on CpG island methylator phenotype (CIMP), microsatellite instability (MSI), KRAS, BRAF, and methylation status of O-6-Methylguanine DNA Methyltransferase (MGMT) and classifies tumors into five subgroups. The aim of this study is to validate this molecular classification and test its prognostic relevance. Methods: Three hundred two patients were included in this study. Molecular analysis was performed for five CIMP-related promoters (CRABP1, MLH1, p16INK4a, CACNA1G, NEUROG1), MGMT, MSI, KRAS, and BRAF. Methylation in at least 4 promoters or in one to three promoters was considered CIMP-high and CIMP-low (CIMP-H/L), respectively. Results: CIMP-H, CIMP-L, and CIMP-negative were found in 7.1, 43, and 49.9% of cases, respectively. One hundred twenty-three tumors (41%) could not be classified into any one of the proposed molecular subgroups, including 107 CIMP-L, 14 CIMP-H, and two CIMP-negative cases. The 10-year survival rate for CIMP-high patients [22.6% (95%CI: 7–43)] was significantly lower than for CIMP-L or CIMP-negative (p = 0.0295). Only the combined analysis of BRAF and CIMP (negative versus L/H) led to distinct prognostic subgroups. Conclusion: Although CIMP status has an effect on outcome, our results underline the need for standardized definitions of low- and high-level CIMP, the lack of which clearly hinders an effective prognostic and molecular classification of colorectal cancer.

  2. Stratification and prognostic relevance of Jass’s molecular classification of colorectal cancer

    Directory of Open Access Journals (Sweden)

    Inti Zlobec

    2012-02-01

    Full Text Available Background: The current proposed model of colorectal tumorigenesis is based primarily on CpG island methylator phenotype (CIMP), microsatellite instability (MSI), KRAS, BRAF, and methylation status of O6-methylguanine DNA methyltransferase (MGMT) and classifies tumors into 5 subgroups. The aim of this study is to validate this molecular classification and test its prognostic relevance. Methods: 302 patients were included in this study. Molecular analysis was performed for 5 CIMP-related promoters (CRABP1, MLH1, p16INK4a, CACNA1G, NEUROG1), MGMT, MSI, KRAS and BRAF. Tumors were CIMP-high or CIMP-low if ≥4 and 1-3 promoters were methylated, respectively. Results: CIMP-high, CIMP-low and CIMP-negative were found in 7.1%, 43% and 49.9% of cases, respectively. 123 tumors (41%) could not be classified into any one of the proposed molecular subgroups, including 107 CIMP-low, 14 CIMP-high and 2 CIMP-negative cases. The 10-year survival rate for CIMP-high patients (22.6%; 95% CI: 7-43) was significantly lower than for CIMP-low or CIMP-negative patients (p=0.0295). Only the combined analysis of BRAF and CIMP (negative versus low/high) led to distinct prognostic subgroups. Conclusion: Although CIMP status has an effect on outcome, our results underline the need for standardized definitions of low- and high-level CIMP, the lack of which clearly hinders an effective prognostic and molecular classification of colorectal cancer.

  3. Forest Classification Based on Forest texture in Northwest Yunnan Province

    Science.gov (United States)

    Wang, Jinliang; Gao, Yan; Wang, Xiaohua; Fu, Lei

    2014-03-01

    Forest texture is an intrinsic characteristic and an important visual feature of a forest ecological system. Full utilization of forest texture will be a great help in increasing the accuracy of forest classification based on remotely sensed data. Taking Shangri-La as a study area, forest classification has been based on the texture. The results show that: (1) In terms of texture abundance, texture boundary, entropy, and visual interpretation, the combination of the grayscale-gradient co-occurrence matrix and wavelet transformation is much better than either method alone for forest texture information extraction; (2) During forest texture information extraction, the size of the texture-suitable window determined by the semi-variogram method depends on the forest type (evergreen broadleaf forest is 3×3, deciduous broadleaf forest is 5×5, etc.); (3) When classifying forest based on forest texture information, the texture factor assembly differs among forests: Variance Heterogeneity and Correlation should be selected when the window is between 3×3 and 5×5; Mean, Correlation, and Entropy should be used when the window is in the range of 7×7 to 19×19; and Correlation, Second Moment, and Variance should be used when the window is larger than 21×21.

  4. Forest Classification Based on Forest texture in Northwest Yunnan Province

    International Nuclear Information System (INIS)

    Wang, Jinliang; Gao, Yan; Fu, Lei; Wang, Xiaohua

    2014-01-01

    Forest texture is an intrinsic characteristic and an important visual feature of a forest ecological system. Full utilization of forest texture will be a great help in increasing the accuracy of forest classification based on remotely sensed data. Taking Shangri-La as a study area, forest classification has been based on the texture. The results show that: (1) In terms of texture abundance, texture boundary, entropy, and visual interpretation, the combination of the grayscale-gradient co-occurrence matrix and wavelet transformation is much better than either method alone for forest texture information extraction; (2) During forest texture information extraction, the size of the texture-suitable window determined by the semi-variogram method depends on the forest type (evergreen broadleaf forest is 3×3, deciduous broadleaf forest is 5×5, etc.); (3) When classifying forest based on forest texture information, the texture factor assembly differs among forests: Variance Heterogeneity and Correlation should be selected when the window is between 3×3 and 5×5; Mean, Correlation, and Entropy should be used when the window is in the range of 7×7 to 19×19; and Correlation, Second Moment, and Variance should be used when the window is larger than 21×21.

  5. Task Classification Based Energy-Aware Consolidation in Clouds

    Directory of Open Access Journals (Sweden)

    HeeSeok Choi

    2016-01-01

    Full Text Available We consider a cloud data center, in which the service provider supplies virtual machines (VMs) on hosts or physical machines (PMs) to its subscribers for computation in an on-demand fashion. For the cloud data center, we propose a task consolidation algorithm based on task classification (i.e., computation-intensive and data-intensive) and resource utilization (e.g., CPU and RAM). Furthermore, we design a VM consolidation algorithm to balance task execution time and energy consumption without violating a predefined service level agreement (SLA). Unlike the existing research on VM consolidation or scheduling that applies no threshold or a single threshold scheme, we focus on a double threshold (upper and lower) scheme, which is used for VM consolidation. More specifically, when a host operates with resource utilization below the lower threshold, all the VMs on the host will be scheduled to be migrated to other hosts and then the host will be powered down, while when a host operates with resource utilization above the upper threshold, a VM will be migrated to avoid using 100% of resource utilization. Based on experimental performance evaluations with real-world traces, we prove that our task classification based energy-aware consolidation algorithm (TCEA) achieves a significant energy reduction without incurring predefined SLA violations.
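
    The double threshold rule described above can be illustrated with a small decision function. This is a sketch under stated assumptions rather than the TCEA algorithm itself: the `Host` class, the action names, and the threshold values 0.2 and 0.8 are hypothetical, since the abstract gives no numeric thresholds and does not say how migration targets are chosen.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    utilization: float  # fraction of capacity in use, 0.0 to 1.0
    vm_count: int


def consolidation_action(host: Host, lower: float = 0.2, upper: float = 0.8) -> str:
    """Return the action the double threshold scheme would take for one host.

    lower and upper are illustrative values only (assumptions).
    """
    if host.vm_count == 0:
        return "idle"
    if host.utilization < lower:
        # Under-utilized: move every VM elsewhere, then power the host down.
        return "migrate_all_and_power_down"
    if host.utilization > upper:
        # Over-utilized: move one VM away to stay clear of 100% utilization.
        return "migrate_one_vm"
    return "keep"


if __name__ == "__main__":
    hosts = [Host("pm-1", 0.10, 2), Host("pm-2", 0.55, 4), Host("pm-3", 0.93, 6)]
    for h in hosts:
        print(h.name, consolidation_action(h))
```

    A single threshold scheme would only cover one of the two migration branches; the second threshold is what lets the scheme both power down lightly loaded hosts and relieve overloaded ones.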

  6. Feature selection gait-based gender classification under different circumstances

    Science.gov (United States)

    Sabir, Azhin; Al-Jawad, Naseer; Jassim, Sabah

    2014-05-01

    This paper proposes a gender classification method based on human gait features and investigates two variations in addition to the normal gait sequence: clothing (wearing coats) and carrying a bag. The feature vectors in the proposed system are constructed after applying a wavelet transform. Three different feature sets are proposed. The first is the spatio-temporal distance, which describes the distances between different parts of the human body (such as feet, knees, hands, height, and shoulders) during one gait cycle. The second and third feature sets are constructed from the approximation and non-approximation coefficients of the human body, respectively. To extract these two feature sets, we divided the human body into upper and lower parts based on the golden ratio proportion. We adopted a statistical method for constructing the feature vector from the above sets. The dimension of the constructed feature vector is reduced using the Fisher score as a feature selection method to optimize discriminating significance. Finally, k-Nearest Neighbor is applied as the classification method. Experimental results demonstrate that our approach provides a more realistic scenario and relatively better performance compared with existing approaches.
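
    As an illustration of Fisher-score feature selection, the sketch below ranks features by the ratio of between-class to within-class scatter and keeps the top k. It uses one common definition of the Fisher score; the abstract does not state which variant the authors used, and the function names and toy data are assumptions.

```python
from statistics import fmean, pvariance


def fisher_scores(samples, labels):
    """Fisher score per feature: between-class scatter / within-class scatter."""
    classes = sorted(set(labels))
    scores = []
    for j in range(len(samples[0])):
        column = [row[j] for row in samples]
        overall_mean = fmean(column)
        between = 0.0
        within = 0.0
        for c in classes:
            values = [row[j] for row, y in zip(samples, labels) if y == c]
            between += len(values) * (fmean(values) - overall_mean) ** 2
            within += len(values) * pvariance(values)
        if within > 0:
            scores.append(between / within)
        else:
            # No within-class spread: perfectly separating or constant feature.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def top_k_features(samples, labels, k):
    """Indices of the k features with the highest Fisher score."""
    scores = fisher_scores(samples, labels)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


if __name__ == "__main__":
    # Toy data (hypothetical): feature 0 separates the classes, feature 1 does not.
    X = [[1.0, 5.0], [1.2, 1.0], [3.0, 5.2], [3.2, 0.8]]
    y = ["a", "a", "b", "b"]
    print(fisher_scores(X, y), top_k_features(X, y, 1))
```

    The retained feature indices would then be used to project each gait feature vector before a k-Nearest Neighbor classifier is applied.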

  7. Joint Probability-Based Neuronal Spike Train Classification

    Directory of Open Access Journals (Sweden)

    Yan Chen

    2009-01-01

    Full Text Available Neuronal spike trains are used by the nervous system to encode and transmit information. Euclidean distance-based methods (EDBMs) have been applied to quantify the similarity between temporally-discretized spike trains and model responses. In this study, using the same discretization procedure, we developed and applied a joint probability-based method (JPBM) to classify individual spike trains of slowly adapting pulmonary stretch receptors (SARs). The activity of individual SARs was recorded in anaesthetized, paralysed adult male rabbits, which were artificially-ventilated at constant rate and one of three different volumes. Two-thirds of the responses to the 600 stimuli presented at each volume were used to construct three response models (one for each stimulus volume) consisting of a series of time bins, each with spike probabilities. The remaining one-third of the responses were used as test responses to be classified into one of the three model responses. This was done by computing the joint probability of observing the same series of events (spikes or no spikes), dictated by the test response, in a given model and determining which probability of the three was highest. The JPBM generally produced better classification accuracy than the EDBM, and both performed well above chance. Both methods were similarly affected by variations in discretization parameters, response epoch duration, and two different response alignment strategies. Increasing bin widths increased classification accuracy, which also improved with increased observation time, but primarily during periods of increasing lung inflation. Thus, the JPBM is a simple and effective method for performing spike train classification.
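
    The joint probability computation described above can be sketched as follows. This is an illustrative reading of the abstract, not the authors' implementation: it assumes bins are treated as independent, works with log probabilities to avoid numerical underflow, and clips probabilities with a hypothetical `eps` so that a bin with no training spikes does not give a probability of exactly zero. The toy responses are invented.

```python
import math


def build_model(training_responses, eps=1e-3):
    """Per-bin spike probability estimated from binary training responses."""
    n = len(training_responses)
    model = []
    for i in range(len(training_responses[0])):
        p = sum(r[i] for r in training_responses) / n
        model.append(min(max(p, eps), 1.0 - eps))  # clipping is an assumption
    return model


def log_joint_probability(response, model):
    """Log probability of the observed spike / no-spike series under a model."""
    return sum(math.log(p if spike else 1.0 - p) for spike, p in zip(response, model))


def classify(response, models):
    """Name of the model under which the test response is most probable."""
    return max(models, key=lambda name: log_joint_probability(response, models[name]))


if __name__ == "__main__":
    models = {
        "low": build_model([[1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]),
        "high": build_model([[1, 1, 1, 0], [1, 1, 1, 1], [1, 1, 0, 1]]),
    }
    print(classify([1, 1, 1, 0], models), classify([1, 0, 0, 0], models))
```

    Unlike a Euclidean distance, this score weights each bin by how reliably the model spikes there, so a mismatch in a bin with probability near 0 or 1 is penalized far more than one in an uncertain bin.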

  8. Computed tomography and the TNM classification of lung cancer

    International Nuclear Information System (INIS)

    Sparup, J.; Friis, M.; Brenoee, J.; Vejlsted, H.; Villumsen, B.; Olesen, K.P.; Borgeskov, S.; Bertelsen, S.

    1990-01-01

    Computed tomography (CT) of the thorax and upper abdomen was prospectively evaluated in 84 patients with potentially operable lung cancer. Invasion into the thoracic wall and the mediastinal structures was not accurately demonstrated by CT. For metastatic mediastinal lymph nodes, the sensitivity and specificity of CT were, respectively, 86 per cent and 61 per cent and the positive and negative predictive indices 49 per cent and 91 per cent. For T1, T2 and T3 tumours the negative indices were 100 per cent, 96 per cent and 71 per cent. Positive predictive index did not differ between squamous cell carcinoma and adenocarcinoma. Adrenal metastases were CT-suspected in 17 cases and liver metastases in eight, but were verified by ultrasonography in only one and four cases. CT should be used in preoperative investigation of lung cancer, irrespective of stage. Demonstration of thoracic-wall or mediastinal invasion need not exclude tumour resection. Preoperative mediastinoscopy is indicated if CT shows nodal metastases or if there are signs of tumour invasion, but not in CT-negative T1 or T2 tumour. Abdominal metastases indicated by CT should be investigated with CT-guided needle biopsy. (authors)
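
    For readers less familiar with the four indices quoted above, the sketch below shows how they follow from a 2×2 table of test result against verified disease status. The counts in the example are purely illustrative and are not the study's data, which the abstract does not report; the function name is an assumption.

```python
def diagnostic_indices(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Sensitivity, specificity, and predictive values from a 2x2 table.

    tp/fn: diseased cases called positive/negative by the test.
    tn/fp: disease-free cases called negative/positive by the test.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "positive_predictive_value": tp / (tp + fp),
        "negative_predictive_value": tn / (tn + fn),
    }


if __name__ == "__main__":
    # Illustrative counts only (hypothetical, not taken from the study).
    for name, value in diagnostic_indices(tp=8, fp=4, tn=6, fn=2).items():
        print(f"{name}: {value:.2f}")
```

    A high negative predictive value combined with a modest positive predictive value, as reported in the abstract, is the pattern that supports confirming CT-positive nodes by mediastinoscopy while accepting CT-negative findings in T1 and T2 tumours.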

  9. Soft computing based feature selection for environmental sound classification

    NARCIS (Netherlands)

    Shakoor, A.; May, T.M.; Van Schijndel, N.H.

    2010-01-01

    Environmental sound classification has a wide range of applications, like hearing aids, mobile communication devices, portable media players, and auditory protection devices. Sound classification systems typically extract features from the input sound. Using too many features increases complexity

  10. Chemometric classification of casework arson samples based on gasoline content.

    Science.gov (United States)

    Sinkov, Nikolai A; Sandercock, P Mark L; Harynuk, James J

    2014-02-01

    Detection and identification of ignitable liquids (ILs) in arson debris is a critical part of arson investigations. The challenge of this task is due to the complex and unpredictable chemical nature of arson debris, which also contains pyrolysis products from the fire. ILs, most commonly gasoline, are complex chemical mixtures containing hundreds of compounds that will be consumed or otherwise weathered by the fire to varying extents depending on factors such as temperature, air flow, the surface on which the IL was placed, etc. While methods such as ASTM E-1618 are effective, data interpretation can be a costly bottleneck in the analytical process for some laboratories. In this study, we address this issue through the application of chemometric tools. Prior to the application of chemometric tools such as PLS-DA and SIMCA, issues of chromatographic alignment and variable selection need to be addressed. Here we use an alignment strategy based on a ladder consisting of perdeuterated n-alkanes. Variable selection and model optimization were automated using a hybrid backward elimination (BE) and forward selection (FS) approach guided by the cluster resolution (CR) metric. In this work, we demonstrate the automated construction, optimization, and application of chemometric tools to casework arson data. The resulting PLS-DA and SIMCA classification models, trained with 165 training set samples, have provided classification of 55 validation set samples based on gasoline content with 100% specificity and sensitivity. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  11. Interactive classification and content-based retrieval of tissue images

    Science.gov (United States)

    Aksoy, Selim; Marchisio, Giovanni B.; Tusk, Carsten; Koperski, Krzysztof

    2002-11-01

    We describe a system for interactive classification and retrieval of microscopic tissue images. Our system models tissues in pixel, region and image levels. Pixel level features are generated using unsupervised clustering of color and texture values. Region level features include shape information and statistics of pixel level feature values. Image level features include statistics and spatial relationships of regions. To reduce the gap between low-level features and high-level expert knowledge, we define the concept of prototype regions. The system learns the prototype regions in an image collection using model-based clustering and density estimation. Different tissue types are modeled using spatial relationships of these regions. Spatial relationships are represented by fuzzy membership functions. The system automatically selects significant relationships from training data and builds models which can also be updated using user relevance feedback. A Bayesian framework is used to classify tissues based on these models. Preliminary experiments show that the spatial relationship models we developed provide a flexible and powerful framework for classification and retrieval of tissue images.

  12. Drunk driving detection based on classification of multivariate time series.

    Science.gov (United States)

    Li, Zhenlong; Jin, Xue; Zhao, Xiaohua

    2015-09-01

    This paper addresses the problem of detecting drunk driving based on classification of multivariate time series. First, driving performance measures were collected from a test in a driving simulator located in the Traffic Research Center, Beijing University of Technology. Lateral position and steering angle were used to detect drunk driving. Second, multivariate time series analysis was performed to extract the features. A piecewise linear representation was used to represent multivariate time series. A bottom-up algorithm was then employed to separate multivariate time series. The slope and time interval of each segment were extracted as the features for classification. Third, a support vector machine classifier was used to classify driver's state into two classes (normal or drunk) according to the extracted features. The proposed approach achieved an accuracy of 80.0%. Drunk driving detection based on the analysis of multivariate time series is feasible and effective. The approach has implications for drunk driving detection. Copyright © 2015 Elsevier Ltd and National Safety Council. All rights reserved.
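
    The feature extraction step (piecewise linear representation, bottom-up segmentation, then slope and time interval per segment) can be sketched as below. This is a simplified illustration, not the authors' method: it handles a single variable rather than a multivariate series, starts from one breakpoint per sample, uses the squared interpolation error as the merge cost, and the `max_error` tolerance and sample series are hypothetical.

```python
def interpolation_error(series, start, end):
    """Squared error of the straight line joining series[start] and series[end]."""
    span = end - start
    error = 0.0
    for i in range(start, end + 1):
        predicted = series[start] + (series[end] - series[start]) * (i - start) / span
        error += (series[i] - predicted) ** 2
    return error


def bottom_up_segments(series, max_error):
    """Merge adjacent segments, cheapest first, until a merge would exceed max_error."""
    breakpoints = list(range(len(series)))  # finest segmentation: every sample
    while len(breakpoints) > 2:
        # Cost of dropping each interior breakpoint, i.e. merging its two neighbours.
        costs = [
            interpolation_error(series, breakpoints[k - 1], breakpoints[k + 1])
            for k in range(1, len(breakpoints) - 1)
        ]
        best = min(range(len(costs)), key=costs.__getitem__)
        if costs[best] > max_error:
            break
        del breakpoints[best + 1]
    return list(zip(breakpoints[:-1], breakpoints[1:]))


def segment_features(series, segments, dt=1.0):
    """(slope, time interval) for each segment, the features named in the abstract."""
    return [((series[e] - series[s]) / ((e - s) * dt), (e - s) * dt) for s, e in segments]


if __name__ == "__main__":
    steering = [0, 1, 2, 3, 2, 1, 0]  # hypothetical steering-angle samples
    segs = bottom_up_segments(steering, max_error=0.01)
    print(segs, segment_features(steering, segs))
```

    The resulting slope and duration pairs would be the inputs to a classifier such as the support vector machine mentioned in the abstract.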

  13. Multiple kernel boosting framework based on information measure for classification

    International Nuclear Information System (INIS)

    Qi, Chengming; Wang, Yuping; Tian, Wenjie; Wang, Qun

    2016-01-01

    The performance of kernel-based methods, such as the support vector machine (SVM), is greatly affected by the choice of kernel function. Multiple kernel learning (MKL) is a promising family of machine learning algorithms and has attracted much attention in recent years. MKL combines multiple sub-kernels to seek better results compared to single kernel learning. In order to improve the efficiency of SVM and MKL, in this paper, the Kullback–Leibler kernel function is derived to develop SVM. The proposed method employs an improved ensemble learning framework, named KLMKB, which applies AdaBoost to learning multiple kernel-based classifiers. In the experiment for hyperspectral remote sensing image classification, we employ features selected through the Optimum Index Factor (OIF) to classify the satellite image. We extensively examine the performance of our approach in comparison to some relevant and state-of-the-art algorithms on a number of benchmark classification data sets and a hyperspectral remote sensing image data set. Experimental results show that our method has stable behavior and noticeable accuracy across different data sets.

  14. [Galaxy/quasar classification based on nearest neighbor method].

    Science.gov (United States)

    Li, Xiang-Ru; Lu, Yu; Zhou, Jian-Ming; Wang, Yong-Jun

    2011-09-01

    With the wide application of high-quality CCDs in celestial spectrum imagery and the implementation of many large sky survey programs (e.g., Sloan Digital Sky Survey (SDSS), Two-degree-Field Galaxy Redshift Survey (2dF), Spectroscopic Survey Telescope (SST), Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) program and Large Synoptic Survey Telescope (LSST) program, etc.), celestial observational data are pouring in like torrential rain. Therefore, to utilize them effectively and fully, research on automated processing methods for celestial data is imperative. In the present work, we investigated how to recognize galaxies and quasars from spectra based on the nearest neighbor method. Galaxies and quasars are extragalactic objects; they are far away from Earth, and their spectra are usually contaminated by various kinds of noise. Therefore, recognizing these two types of spectra is a typical problem in automatic spectra classification. Furthermore, the method used, nearest neighbor, is one of the most typical, classic, and mature algorithms in pattern recognition and data mining, and is often used as a benchmark in developing novel algorithms. Regarding applicability in practice, it is shown that the recognition ratio of the nearest neighbor method (NN) is comparable to the best results reported in the literature based on more complicated methods, and the advantage of NN is that it does not need to be trained, which is useful in incremental learning and parallel computation in mass spectral data processing. In conclusion, the results in this work are helpful for studying the classification of galaxy and quasar spectra.
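
    A nearest neighbor classifier is short enough to show in full. The sketch below assumes spectra are already resampled onto a common wavelength grid and compared with Euclidean distance; the abstract does not specify the distance measure or any preprocessing, and the toy flux vectors are invented.

```python
import math


def nearest_neighbor_label(query, reference_spectra, reference_labels):
    """Label of the reference spectrum closest to the query (Euclidean distance)."""
    best = min(
        range(len(reference_spectra)),
        key=lambda i: math.dist(query, reference_spectra[i]),
    )
    return reference_labels[best]


if __name__ == "__main__":
    # Toy flux vectors on a shared wavelength grid (hypothetical values).
    references = [[1.0, 0.9, 0.8, 0.7], [0.2, 0.3, 1.5, 0.3], [1.1, 1.0, 0.8, 0.6]]
    labels = ["galaxy", "quasar", "galaxy"]
    print(nearest_neighbor_label([0.3, 0.3, 1.4, 0.2], references, labels))
```

    There is no fitting step: adding newly labeled spectra only means appending to the reference lists, which is the incremental-learning property the abstract points to, and the distance computations are independent and so parallelize naturally.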

  15. Robust Pedestrian Classification Based on Hierarchical Kernel Sparse Representation

    Directory of Open Access Journals (Sweden)

    Rui Sun

    2016-08-01

    Full Text Available Vision-based pedestrian detection has become an active topic in computer vision and autonomous vehicles. It aims at detecting pedestrians appearing ahead of the vehicle using a camera so that autonomous vehicles can assess the danger and take action. Due to varied illumination and appearance, complex backgrounds, and occlusion, pedestrian detection in outdoor environments is a difficult problem. In this paper, we propose a novel hierarchical feature extraction and weighted kernel sparse representation model for pedestrian classification. Initially, hierarchical feature extraction based on a CENTRIST descriptor is used to capture discriminative structures. A max pooling operation is used to enhance invariance to varying appearance. Then, a kernel sparse representation model is proposed to fully exploit the discrimination information embedded in the hierarchical local features, with a Gaussian weight function used as the measure to effectively handle occlusion in pedestrian images. Extensive experiments are conducted on benchmark databases, including INRIA, Daimler, an artificially generated dataset and a real occluded dataset, demonstrating the more robust performance of the proposed method compared to state-of-the-art pedestrian classification methods.

  16. Style-based classification of Chinese ink and wash paintings

    Science.gov (United States)

    Sheng, Jiachuan; Jiang, Jianmin

    2013-09-01

    Following the fact that a large collection of ink and wash paintings (IWP) is being digitized and made available on the Internet, their automated content description, analysis, and management are attracting attention across research communities. While existing research in relevant areas is primarily focused on image processing approaches, a style-based algorithm is proposed to classify IWPs automatically by their authors. As IWPs do not have colors or even tones, the proposed algorithm applies edge detection to locate the local region and detect painting strokes to enable histogram-based feature extraction and capture of important cues to reflect the styles of different artists. Such features are then applied to drive a number of neural networks in parallel to complete the classification, and an information entropy balanced fusion is proposed to make an integrated decision for the multiple neural network classification results in which the entropy is used as a pointer to combine the global and local features. Evaluations via experiments support that the proposed algorithm achieves good performances, providing excellent potential for computerized analysis and management of IWPs.

  17. On the Feature Selection and Classification Based on Information Gain for Document Sentiment Analysis

    Directory of Open Access Journals (Sweden)

    Asriyanti Indah Pratiwi

    2018-01-01

    Full Text Available Sentiment analysis of movie reviews has become a need of today's lifestyle. Unfortunately, an enormous number of features makes sentiment analysis slow and less sensitive. Finding the optimum feature selection and classification is still a challenge. In order to handle an enormous number of features and provide better sentiment classification, an information-based feature selection and classification scheme is proposed. The proposed method removes more than 90% of unnecessary features, while the proposed classification scheme achieves 96% accuracy in sentiment classification. From the experimental results, it can be concluded that the combination of the proposed feature selection and classification achieves the best performance so far.
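
    Information gain as a feature-selection criterion can be shown with a few lines of code. The sketch below computes the standard information gain of a discrete feature (here, term presence) with respect to the class labels; it is not the authors' full selection and classification scheme, and the terms and labels are made-up toy data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy obtained by splitting on a discrete feature."""
    total = len(labels)
    conditional = 0.0
    for value in set(feature_values):
        subset = [y for x, y in zip(feature_values, labels) if x == value]
        conditional += len(subset) / total * entropy(subset)
    return entropy(labels) - conditional


if __name__ == "__main__":
    labels = ["pos", "pos", "neg", "neg"]
    # 1 means the term occurs in the review (toy data, hypothetical terms).
    term_presence = {"great": [1, 1, 0, 0], "movie": [1, 0, 1, 0]}
    ranked = sorted(
        term_presence,
        key=lambda t: information_gain(term_presence[t], labels),
        reverse=True,
    )
    print(ranked)
```

    Terms whose gain is near zero carry little information about the sentiment label, so discarding them is one way a feature set could be cut by a large fraction without hurting accuracy.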

  18. Cluster Validity Classification Approaches Based on Geometric Probability and Application in the Classification of Remotely Sensed Images

    Directory of Open Access Journals (Sweden)

    LI Jian-Wei

    2014-08-01

    Full Text Available On the basis of the cluster validity function based on geometric probability in the literature [1, 2], this paper proposes a cluster analysis method based on geometric probability to process large amounts of data in a rectangular area. The basic idea is top-down stepwise refinement: first categories, then subcategories. At every clustering level, the cluster validity function based on geometric probability is applied first to determine the clusters and the gathering direction, and then the cluster centers and cluster borders are determined. Using TM remote sensing image classification examples, the method is compared with supervised and unsupervised classification in ERDAS and with the cluster analysis method based on geometric probability in a two-dimensional square proposed in [2]. Results show that the proposed method can significantly improve the classification accuracy.

  19. Rough set soft computing cancer classification and network: one stone, two birds.

    Science.gov (United States)

    Zhang, Yue

    2010-07-15

    Gene expression profiling provides tremendous information to help unravel the complexity of cancer. The selection of the most informative genes from huge noise for cancer classification has taken centre stage, along with predicting the function of such identified genes and the construction of direct gene regulatory networks at different system levels with a tuneable parameter. A new study by Wang and Gotoh described a novel Variable Precision Rough Sets-rooted robust soft computing method to successfully address these problems and has yielded some new insights. The significance of this progress and its perspectives will be discussed in this article.

  20. Multiplatform analysis of 12 cancer types reveals molecular classification within and across tissues of origin

    DEFF Research Database (Denmark)

    Hoadley, Katherine A; Yau, Christina; Wolf, Denise M

    2014-01-01

    Recent genomic analyses of pathologically defined tumor types identify "within-a-tissue" disease subtypes. However, the extent to which genomic signatures are shared across tissues is still unclear. We performed an integrative analysis using five genome-wide platforms and one proteomic platform...... on 3,527 specimens from 12 cancer types, revealing a unified classification into 11 major subtypes. Five subtypes were nearly identical to their tissue-of-origin counterparts, but several distinct cancer types were found to converge into common subtypes. Lung squamous, head and neck, and a subset...

  1. Bearing Fault Classification Based on Conditional Random Field

    Directory of Open Access Journals (Sweden)

    Guofeng Wang

    2013-01-01

    Full Text Available Condition monitoring of rolling element bearings is paramount for predicting the lifetime and performing effective maintenance of mechanical equipment. To overcome the drawbacks of the hidden Markov model (HMM) and improve the diagnosis accuracy, a conditional random field (CRF) model-based classifier is proposed. In this model, the feature vector sequences and the fault categories are linked by an undirected graphical model in which their relationship is represented by a global conditional probability distribution. In comparison with the HMM, the main advantage of the CRF model is that it can depict the temporal dynamic information between the observation sequences and state sequences without assuming the independence of the input feature vectors. Therefore, the interrelationship between adjacent observation vectors can also be depicted and integrated into the model, which makes the classifier more robust and accurate than the HMM. To evaluate the effectiveness of the proposed method, four kinds of bearing vibration signals, corresponding to normal, inner race pit, outer race pit, and roller pit conditions, are collected from the test rig. The CRF and HMM models are then built to perform fault classification, taking the sub-band energy features of wavelet packet decomposition (WPD) as the observation sequences. Moreover, the K-fold cross validation method is adopted to improve the evaluation accuracy of the classifier. The analysis and comparison under different numbers of folds show that the classification accuracy of the CRF model is higher than that of the HMM. This method sheds new light on the accurate classification of bearing faults.
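
    The K-fold cross validation used for evaluation can be illustrated independently of the CRF and HMM models. The following sketch only generates the train/test index splits; the interleaved fold assignment and the function name are assumptions, since the abstract does not describe how samples were assigned to folds.

```python
def k_fold_indices(n_samples: int, k: int):
    """Yield (train_indices, test_indices) for each of the k folds.

    Samples are dealt to folds in round-robin order (an assumption); every
    sample appears in exactly one test fold.
    """
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        train = sorted(j for f, fold in enumerate(folds) if f != i for j in fold)
        yield train, folds[i]


if __name__ == "__main__":
    for train, test in k_fold_indices(6, 3):
        print("train:", train, "test:", test)
```

    Averaging a classifier's accuracy over the k test folds gives a less optimistic estimate than a single split, and repeating the procedure for several values of k corresponds to the comparison "under different numbers of folds" mentioned above.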

  2. Early detection of lung cancer from CT images: nodule segmentation and classification using deep learning

    Science.gov (United States)

    Sharma, Manu; Bhatt, Jignesh S.; Joshi, Manjunath V.

    2018-04-01

    Lung cancer is one of the leading causes of cancer deaths worldwide. It has a low survival rate, mainly due to late diagnosis. With the hardware advancements in computed tomography (CT) technology, it is now possible to capture high resolution images of the lung region. However, this needs to be augmented by efficient algorithms to detect lung cancer in the earlier stages using the acquired CT images. To this end, we propose a two-step algorithm for early detection of lung cancer. Given the CT image, we first extract the patch from the center location of the nodule and segment the lung nodule region. We propose to use Otsu's method followed by morphological operations for the segmentation. This step enables accurate segmentation due to the use of a data-driven threshold. Unlike other methods, we perform the segmentation without using the complete contour information of the nodule. In the second step, a deep convolutional neural network (CNN) is used to classify the nodule present in the segmented patch as malignant or benign. Accurate segmentation of even a tiny nodule followed by classification using a deep CNN enables the early detection of lung cancer. Experiments have been conducted using 6306 CT images of the LIDC-IDRI database. We achieved a test accuracy of 84.13%, with sensitivity and specificity of 91.69% and 73.16%, respectively, clearly outperforming the state-of-the-art algorithms.
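
    The data-driven threshold in the first step refers to Otsu's method, which picks the gray level that maximizes the between-class variance of the two resulting pixel groups. The sketch below is a plain-Python illustration on a flat list of 8-bit intensities; it omits the morphological operations and the CNN, and the patch values are invented rather than taken from LIDC-IDRI.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t maximizing between-class variance.

    Pixels with intensity <= t form one class, pixels > t the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    weight_low = 0
    sum_low = 0
    best_t, best_variance = 0, -1.0
    for t in range(levels):
        weight_low += hist[t]
        sum_low += t * hist[t]
        weight_high = total - weight_low
        if weight_low == 0:
            continue
        if weight_high == 0:
            break
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        variance = weight_low * weight_high * (mean_low - mean_high) ** 2
        if variance > best_variance:
            best_variance, best_t = variance, t
    return best_t


if __name__ == "__main__":
    # Hypothetical patch: dark lung background plus a brighter nodule region.
    patch = [10] * 50 + [12] * 20 + [200] * 25 + [210] * 5
    t = otsu_threshold(patch)
    mask = [p > t for p in patch]
    print(t, sum(mask))
```

    Because the threshold is derived from each patch's own histogram, no fixed intensity cut-off has to be tuned by hand, which is what "data-driven" means in the abstract.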

  3. A collection of annotated and harmonized human breast cancer transcriptome datasets, including immunologic classification [version 2; referees: 2 approved]

    Directory of Open Access Journals (Sweden)

    Jessica Roelands

    2018-02-01

    Full Text Available The increased application of high-throughput approaches in translational research has expanded the number of publicly available data repositories. Gathering additional valuable information contained in the datasets represents a crucial opportunity in the biomedical field. To facilitate and stimulate utilization of these datasets, we have recently developed an interactive data browsing and visualization web application, the Gene Expression Browser (GXB). In this note, we describe a curated compendium of 13 public datasets on human breast cancer, representing a total of 2142 transcriptome profiles. We classified the samples according to different immune-based classification systems and integrated this information into the datasets. Annotated and harmonized datasets were uploaded to GXB. Study samples were categorized in different groups based on their immunologic tumor response profiles, intrinsic molecular subtypes and multiple clinical parameters. Ranked gene lists were generated based on relevant group comparisons. In this data note, we demonstrate the utility of GXB to evaluate the expression of a gene of interest, find differential gene expression between groups and investigate potential associations between variables with a specific focus on immunologic classification in breast cancer. This interactive resource is publicly available online at: http://breastcancer.gxbsidra.org/dm3/geneBrowser/list.

  4. Long-term Prostate-specific Antigen Velocity in Improved Classification of Prostate Cancer Risk and Mortality

    DEFF Research Database (Denmark)

    Ørsted, David Dynnes; Bojesen, Stig E; Kamstrup, Pia R

    2013-01-01

    BACKGROUND: It remains unclear whether adding long-term prostate-specific antigen velocity (PSAV) to baseline PSA values improves classification of prostate cancer (PCa) risk and mortality in the general population. OBJECTIVE: To determine whether long-term PSAV improves classification of PCa risk...

  5. Setting a generalized functional linear model (GFLM) for the classification of different types of cancer

    Directory of Open Access Journals (Sweden)

    Miguel Flores

    2016-11-01

    Full Text Available This work aims to classify DNA sequences as healthy or malignant. For this, supervised and unsupervised classification methods from a functional context are used; i.e., each strand of DNA is an observation. The observations are discretized; for that reason, different ways to represent these observations with functions are evaluated. In addition, an exploratory study is done, estimating the functional mean and variance of each type of cancer. For the unsupervised classification method, hierarchical clustering with different measures of functional distance is used. For the supervised classification method, a functional generalized linear model is used. For this model, the first and second derivatives are included as discriminating variables. It has been verified that one advantage of working in the functional context is that the resulting model classifies the cancers with 100% accuracy. The methods were implemented with the fda.usc R package, which includes all the functional data analysis techniques used in this work, along with others developed in recent decades. For more details on these techniques, see Ramsay and Silverman (2005) and Ferraty et al. (2006).

  6. Evidence-based cancer imaging

    Energy Technology Data Exchange (ETDEWEB)

    Shinagare, Atul B.; Khorasani, Ramin [Dept. of Radiology, Brigham and Women's Hospital, Boston (Korea, Republic of)]

    2017-01-15

    With the advances in the field of oncology, imaging is increasingly used in the follow-up of cancer patients, leading to concerns about over-utilization. Therefore, it has become imperative to make imaging more evidence-based, efficient, cost-effective and equitable. This review explores the strategies and tools to make diagnostic imaging more evidence-based, mainly in the context of follow-up of cancer patients.

  7. A novel method for human age group classification based on

    Directory of Open Access Journals (Sweden)

    Anuradha Yarlagadda

    2015-10-01

    In the computer vision community, easy categorization of a person’s facial image into various age groups is often quite precise and is not pursued effectively. To address this problem, which is an important area of research, the present paper proposes an innovative age group classification method based on the Correlation Fractal Dimension (FD) of a complex facial image. Wrinkles appear on the face with aging, thereby changing the facial edges of the image. The proposed method is rotation and pose invariant. The paper concentrates on developing a technique that classifies facial images into four categories, i.e. child image (0–15), young adult image (15–30), middle-aged adult image (31–50), and senior adult image (>50), based on the correlation FD value of a facial edge image.

  8. Automatic classification of visual evoked potentials based on wavelet decomposition

    Science.gov (United States)

    Stasiakiewicz, Paweł; Dobrowolski, Andrzej P.; Tomczykiewicz, Kazimierz

    2017-04-01

    Diagnosis of the part of the visual system that is responsible for conducting the compound action potential is generally based on visual evoked potentials generated by stimulating the eye with an external light source. The condition of the patient's visual pathway is assessed by a set of parameters that describe the characteristic extremes in the time domain, called waves. The decision process is complex, so the diagnosis depends significantly on the experience of the doctor. The authors developed a procedure, based on wavelet decomposition and linear discriminant analysis, that provides automatic classification of visual evoked potentials. The algorithm assigns an individual case to the normal or the pathological class. The proposed classifier has a sensitivity of 96.4% at a 10.4% probability of false alarm in a group of 220 cases, and the area under the ROC curve equals 0.96, which, from the medical point of view, is a very good result.
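    The two figures of merit quoted in this abstract, sensitivity and probability of false alarm, follow directly from confusion-matrix counts. The sketch below shows the arithmetic; the function name and the counts are hypothetical and are not taken from the study.

```python
def sensitivity_and_false_alarm(tp, fn, fp, tn):
    """Return (sensitivity, false-alarm probability) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # fraction of pathological cases detected
    false_alarm = fp / (fp + tn)   # fraction of normal cases flagged as pathological
    return sensitivity, false_alarm


# Hypothetical counts, chosen only for illustration (not the study's data).
sens, fa = sensitivity_and_false_alarm(tp=90, fn=10, fp=12, tn=88)
print(f"sensitivity={sens:.3f}, false alarm={fa:.3f}")
```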

  9. Breast cancer surgery and diagnosis-related groups (DRGs): patient classification and hospital reimbursement in 11 European countries.

    Science.gov (United States)

    Scheller-Kreinsen, David; Quentin, Wilm; Geissler, Alexander; Busse, Reinhard

    2013-10-01

    Researchers from eleven countries (i.e. Austria, England, Estonia, Finland, France, Germany, Ireland, Netherlands, Poland, Spain, and Sweden) compared how their DRG systems deal with breast cancer surgery patients. DRG algorithms and indicators of resource consumption were assessed for those DRGs that individually contain at least 1% of all breast cancer surgery patients. Six standardised case vignettes were defined and quasi prices according to national DRG-based hospital payment systems were ascertained. European DRG systems classify breast cancer surgery patients according to different sets of classification variables into three to seven DRGs. Quasi prices for an index case treated with partial mastectomy range from €577 in Poland to €5780 in the Netherlands. Countries award their highest payments for very different kinds of patients. Breast cancer specialists and national DRG authorities should consider how other countries' DRG systems classify breast cancer patients in order to identify potential scope for improvement and to ensure fair and appropriate reimbursement. Copyright © 2012 Elsevier Ltd. All rights reserved.

  10. Apparent diffusion coefficient value of gastric cancer by diffusion-weighted imaging: Correlations with the histological differentiation and Lauren classification

    International Nuclear Information System (INIS)

    Liu, Song; Guan, Wenxian; Wang, Hao; Pan, Liang; Zhou, Zhuping; Yu, Haiping; Liu, Tian; Yang, Xiaofeng; He, Jian; Zhou, Zhengyang

    2014-01-01

    Highlights: • Gastric cancers’ ADC values were significantly lower than normal gastric wall. • Gastric adenocarcinomas with different differentiation had different ADC values. • Gastric adenocarcinomas’ ADC values correlated with histologic differentiations. • Gastric cancers’ ADC values correlated with Lauren classifications. • Mean ADC value was better than min ADC value in characterizing gastric cancers. - Abstract: Objective: The purpose of this study was to evaluate the correlations between histological differentiation and Lauren classification of gastric cancer and the apparent diffusion coefficient (ADC) value of diffusion weighted imaging (DWI). Materials and methods: Sixty-nine patients with gastric cancer lesions underwent preoperative magnetic resonance imaging (MRI) (3.0T) and surgical resection. DWI was obtained with a single-shot, echo-planar imaging sequence in the axial plane (b values: 0 and 1000 s/mm²). Mean and minimum ADC values were obtained for each gastric cancer and normal gastric walls by two radiologists, who were blinded to the histological findings. Histological type, degree of differentiation and Lauren classification of each resected specimen were determined by one pathologist. Mean and minimum ADC values of gastric cancers with different histological types, degrees of differentiation and Lauren classifications were compared. Correlations between ADC values and histological differentiation and Lauren classification were analyzed. Results: The mean and minimum ADC values of gastric cancers, as a whole and separately, were significantly lower than those of normal gastric walls (all p values <0.001). There were significant differences in the mean and minimum ADC values among gastric cancers with different histological types, degrees of differentiation and Lauren classifications (p < 0.05). Mean and minimum ADC values correlated significantly (all p < 0.001) with histological differentiation (r = 0.564, 0.578) and Lauren

  11. A scale space approach for unsupervised feature selection in mass spectra classification for ovarian cancer detection.

    Science.gov (United States)

    Ceccarelli, Michele; d'Acierno, Antonio; Facchiano, Angelo

    2009-10-15

    Mass spectrometry spectra, widely used in proteomics studies as a screening tool for protein profiling and to detect discriminatory signals, are high dimensional data. A large number of local maxima (a.k.a. peaks) have to be analyzed as part of computational pipelines aimed at the realization of efficient predictive and screening protocols. With data of this dimensionality and sample size, the risk of over-fitting and selection bias is pervasive. Therefore the development of bioinformatics methods based on unsupervised feature extraction can lead to general tools which can be applied to several fields of predictive proteomics. We propose a method for feature selection and extraction grounded on the theory of multi-scale spaces for high resolution spectra derived from analysis of serum. Then we use support vector machines for classification. In particular we use a database containing 216 sample spectra divided into 115 cancer and 91 control samples. The overall accuracy averaged over a large cross validation study is 98.18. The area under the ROC curve of the best selected model is 0.9962. We improved on previously known results for the same problem and data, with the advantage that the proposed method has an unsupervised feature selection phase. All the developed code, as MATLAB scripts, can be downloaded from http://medeaserver.isa.cnr.it/dacierno/spectracode.htm.

  12. Automated Glioblastoma Segmentation Based on a Multiparametric Structured Unsupervised Classification

    Science.gov (United States)

    Juan-Albarracín, Javier; Fuster-Garcia, Elies; Manjón, José V.; Robles, Montserrat; Aparici, F.; Martí-Bonmatí, L.; García-Gómez, Juan M.

    2015-01-01

    Automatic brain tumour segmentation has become a key component for the future of brain tumour treatment. Currently, most brain tumour segmentation approaches arise from the supervised learning standpoint, which requires a labelled training dataset from which to infer the models of the classes. The performance of these models is directly determined by the size and quality of the training corpus, whose retrieval becomes a tedious and time-consuming task. On the other hand, unsupervised approaches avoid these limitations but often do not reach results comparable to those of the supervised methods. In this sense, we propose an automated unsupervised method for brain tumour segmentation based on anatomical Magnetic Resonance (MR) images. Four unsupervised classification algorithms, grouped by their structured or non-structured condition, were evaluated within our pipeline. Considering the non-structured algorithms, we evaluated K-means, Fuzzy K-means and Gaussian Mixture Model (GMM), whereas as a structured classification algorithm we evaluated Gaussian Hidden Markov Random Field (GHMRF). An automated postprocess based on a statistical approach supported by tissue probability maps is proposed to automatically identify the tumour classes after the segmentations. We evaluated our brain tumour segmentation method with the public BRAin Tumor Segmentation (BRATS) 2013 Test and Leaderboard datasets. Our approach based on the GMM model improves the results obtained by most of the supervised methods evaluated with the Leaderboard set and reaches the second position in the ranking. Our variant based on the GHMRF achieves the first position in the Test ranking of the unsupervised approaches and the seventh position in the general Test ranking, which confirms the method as a viable alternative for brain tumour segmentation. PMID:25978453

  13. Classification of follicular lymphoma images: a holistic approach with symbol-based machine learning methods.

    Science.gov (United States)

    Zorman, Milan; Sánchez de la Rosa, José Luis; Dinevski, Dejan

    2011-12-01

    It is uncommon to see a symbol-based machine learning approach used for image classification and recognition. In this paper we present such an approach, which we first used on follicular lymphoma images. Lymphoma is a broad term encompassing a variety of cancers of the lymphatic system. Lymphoma is differentiated by the type of cell that multiplies and how the cancer presents itself. It is very important to get an exact diagnosis regarding lymphoma and to determine the treatments that will be most effective for the patient's condition. Our work was focused on the identification of lymphomas by finding follicles in microscopy images provided by the Laboratory of Pathology in the University Hospital of Tenerife, Spain. We divided our work into two stages: in the first stage we did image pre-processing and feature extraction, and in the second stage we used different symbolic machine learning approaches for pixel classification. Symbolic machine learning approaches are often neglected when looking for image analysis tools. They are not only known for a very appropriate knowledge representation, but also claimed to lack computational power. The results we got are very promising and show that symbolic approaches can be successful in image analysis applications.

  14. Neighborhood Hypergraph Based Classification Algorithm for Incomplete Information System

    Directory of Open Access Journals (Sweden)

    Feng Hu

    2015-01-01

    The problem of classification in incomplete information systems is a hot issue in intelligent information processing. Hypergraph is a new intelligent method for machine learning. However, it is hard to process an incomplete information system with the traditional hypergraph, for two reasons: (1) the hyperedges are generated randomly in the traditional hypergraph model; (2) the existing methods are unsuitable for incomplete information systems because of the missing values they contain. In this paper, we propose a novel classification algorithm for incomplete information systems based on the hypergraph model and rough set theory. First, we initialize the hypergraph. Second, we classify the training set by neighborhood hypergraph. Third, under the guidance of rough set theory, we replace the poor hyperedges. After that, we can obtain a good classifier. The proposed approach is tested on 15 data sets from the UCI machine learning repository. Furthermore, it is compared with some existing methods, such as C4.5, SVM, NaiveBayes, and KNN. The experimental results show that the proposed algorithm has better performance in terms of Precision, Recall, AUC, and F-measure.

  15. Comparison Effectiveness of Pixel Based Classification and Object Based Classification Using High Resolution Image In Floristic Composition Mapping (Study Case: Gunung Tidar Magelang City)

    Science.gov (United States)

    Ardha Aryaguna, Prama; Danoedoro, Projo

    2016-11-01

    Remote sensing analysis has developed alongside advances in technology, especially in sensors and platforms. Many images now have high spatial and radiometric resolution and therefore carry a lot of information. Analysis of vegetation objects, such as floristic composition, benefits greatly from this development. Floristic composition can be interpreted with many methods, such as pixel-based classification and object-based classification. The problem with pixel-based methods on high spatial resolution images is the salt-and-pepper effect that appears in the classification result. The purpose of this research is to compare the effectiveness of pixel-based and object-based classification for vegetation composition mapping on a high-resolution Worldview-2 image. The results show that pixel-based classification with a 5×5 majority kernel window gives the highest accuracy among the classifications tested. The highest accuracy, 73.32%, was obtained from the Worldview-2 image radiometrically corrected to surface reflectance level; for overall accuracy in every class, however, the object-based method is the best among the methods. In terms of effectiveness, pixel-based classification is more effective than object-based classification for vegetation composition mapping in the Tidar forest.
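    The 5×5 majority kernel mentioned in this abstract can be sketched as a simple post-classification filter. The code below is an illustrative assumption, not the authors' implementation: the function name, the edge-padding choice, and the toy class map are all hypothetical. Each pixel's label is replaced by the most frequent label in its window, which removes isolated salt-and-pepper pixels.

```python
import numpy as np


def majority_filter(labels, size=5):
    """Replace each pixel's class with the most frequent class in a size x size window."""
    pad = size // 2
    padded = np.pad(labels, pad, mode="edge")  # repeat border pixels at the edges
    out = np.empty_like(labels)
    rows, cols = labels.shape
    for r in range(rows):
        for c in range(cols):
            window = padded[r:r + size, c:c + size].ravel()
            values, counts = np.unique(window, return_counts=True)
            out[r, c] = values[np.argmax(counts)]  # ties resolve to the smallest label
    return out


# Hypothetical 7x7 class map: uniform class 1 with a single isolated class-2 pixel.
class_map = np.ones((7, 7), dtype=int)
class_map[3, 3] = 2
smoothed = majority_filter(class_map, size=5)
```

    The double loop keeps the sketch readable; for full-size imagery a vectorized or library-based filter would be the more practical choice.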

  16. Simple adaptive sparse representation based classification schemes for EEG based brain-computer interface applications.

    Science.gov (United States)

    Shin, Younghak; Lee, Seungchan; Ahn, Minkyu; Cho, Hohyun; Jun, Sung Chan; Lee, Heung-No

    2015-11-01

    One of the main problems related to electroencephalogram (EEG) based brain-computer interface (BCI) systems is the non-stationarity of the underlying EEG signals. This results in the deterioration of the classification performance during experimental sessions. Therefore, adaptive classification techniques are required for EEG based BCI applications. In this paper, we propose simple adaptive sparse representation based classification (SRC) schemes. Supervised and unsupervised dictionary update techniques for new test data and a dictionary modification method by using the incoherence measure of the training data are investigated. The proposed methods are very simple and additional computation for the re-training of the classifier is not needed. The proposed adaptive SRC schemes are evaluated using two BCI experimental datasets. The proposed methods are assessed by comparing classification results with the conventional SRC and other adaptive classification methods. On the basis of the results, we find that the proposed adaptive schemes show relatively improved classification accuracy as compared to conventional methods without requiring additional computation. Copyright © 2015 Elsevier Ltd. All rights reserved.

  17. Radiological classification of renal angiomyolipomas based on 127 tumors

    Directory of Open Access Journals (Sweden)

    Prando Adilson

    2003-01-01

    PURPOSE: Demonstrate radiological findings of 127 angiomyolipomas (AMLs) and propose a classification based on the radiological evidence of fat. MATERIALS AND METHODS: The imaging findings of 85 consecutive patients with AMLs: isolated (n = 73), multiple without tuberous sclerosis (TS) (n = 4) and multiple with TS (n = 8), were retrospectively reviewed. Eighteen AMLs (14%) presented with hemorrhage. All patients underwent dedicated helical CT or magnetic resonance studies. All hemorrhagic and non-hemorrhagic lesions were grouped together since our objective was to analyze the presence of detectable fat. Out of 85 patients, 53 were monitored and 32 were treated surgically due to large perirenal component (n = 13), hemorrhage (n = 11) and impossibility of an adequate preoperative characterization (n = 8). There was not a case of renal cell carcinoma (RCC) with fat component in this group of patients. RESULTS: Based on the presence and amount of detectable fat within the lesion, AMLs were classified in 4 distinct radiological patterns: Pattern-I, predominantly fatty (usually less than 2 cm in diameter and intrarenal): 54%; Pattern-II, partially fatty (intrarenal or exophytic): 29%; Pattern-III, minimally fatty (most exophytic and perirenal): 11%; and Pattern-IV, without fat (most exophytic and perirenal): 6%. CONCLUSIONS: This proposed classification might be useful to understand the imaging manifestations of AMLs, their differential diagnosis and determine when further radiological evaluation would be necessary. Small (< 1.5 cm), pattern-I AMLs tend to be intra-renal, homogeneous and predominantly fatty. As they grow they tend to be partially or completely exophytic and heterogeneous (patterns II and III). The rare pattern-IV AMLs, however, can be small or large, intra-renal or exophytic but are always homogeneous and hyperdense masses. Since no renal cell carcinoma was found in our series, from an evidence-based practice, all renal mass with detectable

  18. Deep neural network and noise classification-based speech enhancement

    Science.gov (United States)

    Shi, Wenhua; Zhang, Xiongwei; Zou, Xia; Han, Wei

    2017-07-01

    In this paper, a speech enhancement method using noise classification and Deep Neural Network (DNN) was proposed. Gaussian mixture model (GMM) was employed to determine the noise type in speech-absent frames. DNN was used to model the relationship between noisy observation and clean speech. Once the noise type was determined, the corresponding DNN model was applied to enhance the noisy speech. GMM was trained with mel-frequency cepstrum coefficients (MFCC) and the parameters were estimated with an iterative expectation-maximization (EM) algorithm. Noise type was updated by spectrum entropy-based voice activity detection (VAD). Experimental results demonstrate that the proposed method could achieve better objective speech quality and smaller distortion under stationary and non-stationary conditions.

  19. A robust probabilistic collaborative representation based classification for multimodal biometrics

    Science.gov (United States)

    Zhang, Jing; Liu, Huanxi; Ding, Derui; Xiao, Jianli

    2018-04-01

    Most traditional biometric recognition systems perform recognition with a single biometric indicator. These systems suffer from noisy data, interclass variations, unacceptable error rates, forged identity, and so on. Because of these inherent problems, attempts to enhance the performance of unimodal biometric systems with single features are of limited validity. Thus, multimodal biometrics is investigated to reduce some of these defects. This paper proposes a new multimodal biometric recognition approach that fuses faces and fingerprints. For more recognizable features, the proposed method extracts block local binary pattern features for all modalities, and then combines them into a single framework. For better classification, it employs the robust probabilistic collaborative representation based classifier to recognize individuals. Experimental results indicate that the proposed method improves recognition accuracy compared to unimodal biometrics.

  20. Machine Learning Based Localization and Classification with Atomic Magnetometers

    Science.gov (United States)

    Deans, Cameron; Griffin, Lewis D.; Marmugi, Luca; Renzoni, Ferruccio

    2018-01-01

    We demonstrate identification of position, material, orientation, and shape of objects imaged by a ⁸⁵Rb atomic magnetometer performing electromagnetic induction imaging supported by machine learning. Machine learning maximizes the information extracted from the images created by the magnetometer, demonstrating the use of hidden data. Localization 2.6 times better than the spatial resolution of the imaging system and successful classification up to 97% are obtained. This circumvents the need of solving the inverse problem and demonstrates the extension of machine learning to diffusive systems, such as low-frequency electrodynamics in media. Automated collection of task-relevant information from quantum-based electromagnetic imaging will have a relevant impact from biomedicine to security.

  1. Fines Classification Based on Sensitivity to Pore-Fluid Chemistry

    KAUST Repository

    Jang, Junbong

    2015-12-28

    The 75-μm particle size is used to discriminate between fine and coarse grains. Further analysis of fine grains is typically based on the plasticity chart. Whereas pore-fluid-chemistry-dependent soil response is a salient and distinguishing characteristic of fine grains, pore-fluid chemistry is not addressed in current classification systems. Liquid limits obtained with electrically contrasting pore fluids (deionized water, 2-M NaCl brine, and kerosene) are combined to define the soil "electrical sensitivity." Liquid limit and electrical sensitivity can be effectively used to classify fine grains according to their fluid-soil response into no-, low-, intermediate-, or high-plasticity fine grains of low, intermediate, or high electrical sensitivity. The proposed methodology benefits from the accumulated experience with liquid limit in the field and addresses the needs of a broader range of geotechnical engineering problems. © ASCE.

  2. Fines classification based on sensitivity to pore-fluid chemistry

    Science.gov (United States)

    Jang, Junbong; Santamarina, J. Carlos

    2016-01-01

    The 75-μm particle size is used to discriminate between fine and coarse grains. Further analysis of fine grains is typically based on the plasticity chart. Whereas pore-fluid-chemistry-dependent soil response is a salient and distinguishing characteristic of fine grains, pore-fluid chemistry is not addressed in current classification systems. Liquid limits obtained with electrically contrasting pore fluids (deionized water, 2-M NaCl brine, and kerosene) are combined to define the soil “electrical sensitivity.” Liquid limit and electrical sensitivity can be effectively used to classify fine grains according to their fluid-soil response into no-, low-, intermediate-, or high-plasticity fine grains of low, intermediate, or high electrical sensitivity. The proposed methodology benefits from the accumulated experience with liquid limit in the field and addresses the needs of a broader range of geotechnical engineering problems.

  3. New classification system-based visual outcome in Eales′ disease

    Directory of Open Access Journals (Sweden)

    Saxena Sandeep

    2007-01-01

    Purpose: A retrospective tertiary care center-based study was undertaken to evaluate the visual outcome in Eales′ disease, based on a new classification system, for the first time. Materials and Methods: One hundred and fifty-nine consecutive cases of Eales′ disease were included. All the eyes were staged according to the new classification: Stage 1: periphlebitis of small (1a) and large (1b) caliber vessels with superficial retinal hemorrhages; Stage 2a: capillary non-perfusion, 2b: neovascularization elsewhere/of the disc; Stage 3a: fibrovascular proliferation, 3b: vitreous hemorrhage; Stage 4a: traction/combined rhegmatogenous retinal detachment and 4b: rubeosis iridis, neovascular glaucoma, complicated cataract and optic atrophy. Visual acuity was graded as: Grade I 20/20 or better; Grade II 20/30 to 20/40; Grade III 20/60 to 20/120 and Grade IV 20/200 or worse. All the cases were managed by medical therapy, photocoagulation and/or vitreoretinal surgery. Visual acuity was converted into decimal scale, denoting 20/20=1 and 20/800=0.01. Paired t-test / Wilcoxon signed-rank tests were used for statistical analysis. Results: Vitreous hemorrhage was the commonest presenting feature (49.32%). Cases with Stages 1 to 3 and 4a and 4b achieved final visual acuity ranging from 20/15 to 20/40; 20/80 to 20/400 and 20/200 to 20/400, respectively. Statistically significant improvement in visual acuities was observed in all the stages of the disease except Stages 1a and 4b. Conclusion: Significant improvement in visual acuities was observed in the majority of stages of Eales′ disease following treatment. This study adds to the limited evidence on treatment effects in the literature and may have an effect on patient care and health policy in Eales′ disease.
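    The visual acuity grading listed in this abstract can be written as a small lookup on the Snellen denominator. This is only a sketch of the stated grade boundaries: the function name is hypothetical, and because the abstract does not say how values that fall between the listed ranges (for example 20/50) are graded, the sketch returns None for them rather than guessing.

```python
def acuity_grade(snellen_denominator):
    """Map a 20/x Snellen acuity to the abstract's Grade I-IV, or None if unlisted."""
    x = snellen_denominator
    if x <= 20:
        return "I"      # 20/20 or better
    if 30 <= x <= 40:
        return "II"     # 20/30 to 20/40
    if 60 <= x <= 120:
        return "III"    # 20/60 to 20/120
    if x >= 200:
        return "IV"     # 20/200 or worse
    return None         # range not covered by the abstract (assumption: leave ungraded)


for denominator in (15, 40, 80, 400, 50):
    print(f"20/{denominator}: grade {acuity_grade(denominator)}")
```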

  4. Classification of Breast Cancer Subtypes by combining Gene Expression and DNA Methylation Data

    DEFF Research Database (Denmark)

    List, Markus; Hauschild, Anne-Christin; Tan, Qihua

    2014-01-01

    expression data for hundreds of patients, the challenge is to extract a minimal optimal set of genes with good prognostic properties from a large bulk of genes making a moderate contribution to classification. Several studies have successfully applied machine learning algorithms to solve this so-called gene...... on the transcriptomic, but also on an epigenetic level. We compared so-called random forest derived classification models based on gene expression and methylation data alone, to a model based on the combined features and to a model based on the gold standard PAM50. We obtained bootstrap errors of 10...

  5. Quality-Oriented Classification of Aircraft Material Based on SVM

    Directory of Open Access Journals (Sweden)

    Hongxia Cai

    2014-01-01

    The existing material classification is proposed to improve inventory management. However, different materials have different quality-related attributes, especially in the aircraft industry. In order to reduce cost without sacrificing quality, we propose a quality-oriented material classification system considering the material quality character, quality cost, and quality influence. Analytic Hierarchy Process helps to make feature selection and classification decisions. We use the improved Kraljic Portfolio Matrix to establish the three-dimensional classification model. The aircraft materials can be divided into eight types, including general type, key type, risk type, and leveraged type. Aiming to improve the classification accuracy of various materials, the algorithm of Support Vector Machine is introduced. Finally, we compare the SVM and BP neural network in the application. The results prove that the SVM algorithm is more efficient and accurate and the quality-oriented material classification is valuable.

  6. Subtype classification for prediction of prognosis of breast cancer from a biomarker panel: correlations and indications

    Directory of Open Access Journals (Sweden)

    Chen C

    2014-02-01

    Chuang Chen,1 Jing-Ping Yuan,2,3 Wen Wei,1 Yi Tu,1 Feng Yao,1 Xue-Qin Yang,4 Jin-Zhong Sun,1 Sheng-Rong Sun,1 Yan Li2 1Department of Breast and Thyroid Surgery, Wuhan University, Renmin Hospital, Wuhan, 2Department of Oncology, Zhongnan Hospital of Wuhan University and Hubei Key Laboratory of Tumor Biological Behaviors and Hubei Cancer Clinical Study Center, Wuhan, 3Department of Pathology, The Central Hospital of Wuhan, Wuhan, 4Medical School of Jingchu University of Technology, Jingmen, People’s Republic of China. Background: Hormone receptors, including the estrogen receptor and progesterone receptor, human epidermal growth factor receptor 2 (HER2), and other biomarkers like Ki67, epidermal growth factor receptor (EGFR, also known as HER1), the androgen receptor, and p53, are key molecules in breast cancer. This study evaluated the relationship between HER2 and hormone receptors and explored the additional prognostic value of Ki67, EGFR, the androgen receptor, and p53. Methods: Quantitative determination of HER2 and EGFR was performed in 240 invasive breast cancer tissue microarray specimens using quantum dot (QD)-based nanotechnology. We identified two subtypes of HER2, ie, high total HER2 load (HTH2) and low total HER2 load (LTH2), and three subtypes of hormone receptor, ie, high hormone receptor (HHR), low hormone receptor (LHR), and no hormone receptor (NHR). Therefore, breast cancer patients could be divided into five subtypes according to HER2 and hormone receptor status. Ki67, p53, and the androgen receptor were determined by traditional immunohistochemistry techniques. The relationship between hormone receptors and HER2 was investigated and the additional value of Ki67, EGFR, the androgen receptor, and p53 for prediction of 5-year disease-free survival was assessed. Results: In all patients, quantitative determination showed a statistically significant (P<0.001) negative correlation between HER2 and the hormone receptors and a significant

  7. GA Based Optimal Feature Extraction Method for Functional Data Classification

    OpenAIRE

    Jun Wan; Zehua Chen; Yingwu Chen; Zhidong Bai

    2010-01-01

    Classification is an interesting problem in functional data analysis (FDA), because many scientific and applied problems reduce to classification problems, such as recognition, prediction, control, decision making, and management. Because functional data (FD) are high-dimensional and highly correlated, a key problem is to extract features from FD while keeping their global characteristics, which strongly affects classification efficiency and precision. In this paper...

  8. [Classification and characteristics of interval cancers in the Principality of Asturias's Breast Cancer Screening Program].

    Science.gov (United States)

    Prieto García, M A; Delgado Sevillano, R; Baldó Sierra, C; González Díaz, E; López Secades, A; Llavona Amor, J A; Vidal Marín, B

    2013-09-01

    To review and classify the interval cancers found in the Principality of Asturias's Breast Cancer Screening Program (PDPCM). A secondary objective was to determine the histological characteristics, size, and stage of the interval cancers at the time of diagnosis. We included the interval cancers in the PDPCM in the period 2003-2007. Interval cancers were classified according to the breast cancer screening program protocol, with double reading without consensus, without blinding, with arbitration. Mammograms were interpreted by 10 radiologists in the PDPCM. A total of 33.7% of the interval cancers could not be classified; of the interval cancers that could be classified, 40.67% were labeled true interval cancers, 31.4% were labeled false negatives on screening, 23.7% had minimal signs, and 4.23% were considered occult. A total of 70% of the interval cancers were diagnosed in the year of the period between screening examinations and 71.7% were diagnosed after subsequent screening. A total of 76.9% were invasive ductal carcinomas, 61.1% were stage II when detected, and 78.7% were larger than 10mm when detected. The rate of interval cancers and the rate of false negatives in the PDPCM are higher than those recommended in the European guidelines. Interval cancers are diagnosed later than the tumors detected at screening. Studying interval cancers provides significant training for the radiologists in the PDPCM. Copyright © 2011 SERAM. Published by Elsevier Espana. All rights reserved.

  9. mRMR-ABC: A Hybrid Gene Selection Algorithm for Cancer Classification Using Microarray Gene Expression Profiling

    Directory of Open Access Journals (Sweden)

    Hala Alshamlan

    2015-01-01

    Full Text Available An artificial bee colony (ABC) is a relatively recent swarm intelligence optimization approach. In this paper, we propose the first attempt at applying ABC algorithm in analyzing a microarray gene expression profile. In addition, we propose an innovative feature selection algorithm, minimum redundancy maximum relevance (mRMR), and combine it with an ABC algorithm, mRMR-ABC, to select informative genes from microarray profile. The new approach is based on a support vector machine (SVM) algorithm to measure the classification accuracy for selected genes. We evaluate the performance of the proposed mRMR-ABC algorithm by conducting extensive experiments on six binary and multiclass gene expression microarray datasets. Furthermore, we compare our proposed mRMR-ABC algorithm with previously known techniques. We reimplemented two of these techniques for the sake of a fair comparison using the same parameters. These two techniques are mRMR when combined with a genetic algorithm (mRMR-GA) and mRMR when combined with a particle swarm optimization algorithm (mRMR-PSO). The experimental results prove that the proposed mRMR-ABC algorithm achieves accurate classification performance using small number of predictive genes when tested using both datasets and compared to previously suggested methods. This shows that mRMR-ABC is a promising approach for solving gene selection and cancer classification problems.

  10. mRMR-ABC: A Hybrid Gene Selection Algorithm for Cancer Classification Using Microarray Gene Expression Profiling.

    Science.gov (United States)

    Alshamlan, Hala; Badr, Ghada; Alohali, Yousef

    2015-01-01

    An artificial bee colony (ABC) is a relatively recent swarm intelligence optimization approach. In this paper, we propose the first attempt at applying ABC algorithm in analyzing a microarray gene expression profile. In addition, we propose an innovative feature selection algorithm, minimum redundancy maximum relevance (mRMR), and combine it with an ABC algorithm, mRMR-ABC, to select informative genes from microarray profile. The new approach is based on a support vector machine (SVM) algorithm to measure the classification accuracy for selected genes. We evaluate the performance of the proposed mRMR-ABC algorithm by conducting extensive experiments on six binary and multiclass gene expression microarray datasets. Furthermore, we compare our proposed mRMR-ABC algorithm with previously known techniques. We reimplemented two of these techniques for the sake of a fair comparison using the same parameters. These two techniques are mRMR when combined with a genetic algorithm (mRMR-GA) and mRMR when combined with a particle swarm optimization algorithm (mRMR-PSO). The experimental results prove that the proposed mRMR-ABC algorithm achieves accurate classification performance using small number of predictive genes when tested using both datasets and compared to previously suggested methods. This shows that mRMR-ABC is a promising approach for solving gene selection and cancer classification problems.

  11. In vivo subsite classification and diagnosis of oral cancers using Raman spectroscopy

    Directory of Open Access Journals (Sweden)

    Aditi Sahu

    2016-09-01

    Full Text Available Oral cancers suffer from poor disease-free survival rates due to delayed diagnosis. Noninvasive, rapid, objective approaches as adjuncts to visual inspection can help in better management of oral cancers. Raman spectroscopy (RS) has shown potential in identification of oral premalignant and malignant conditions and also in the detection of early cancer changes like cancer-field-effects (CFE) at buccal mucosa subsite. Anatomic differences between different oral subsites have also been reported using RS. In this study, anatomical differences between subsites and their possible influence on healthy vs pathological classification were evaluated on 85 oral cancer and 72 healthy subjects. Spectra were acquired from buccal mucosa, lip and tongue in healthy, contralateral (internal healthy control), premalignant and cancer conditions using fiber-optic Raman spectrometer. Mean spectra indicate predominance of lipids in healthy buccal mucosa, contribution of both lipids and proteins in lip while major dominance of protein in tongue spectra. From healthy to tumor, changes in protein secondary-structure, DNA and heme-related features were observed. Principal component linear discriminant analysis (PC-LDA) followed by leave-one-out-cross-validation (LOOCV) was used for data analysis. Findings indicate buccal mucosa and tongue are distinct entities, while lip misclassifies with both these subsites. Additionally, the diagnostic algorithm for individual subsites gave improved classification efficiencies with respect to the pooled subsites model. However, as the pooled subsites model yielded 98% specificity and 100% sensitivity, this model may be more useful for preliminary screening applications. Large-scale validation studies are a pre-requisite before envisaging future clinical applications.

  12. Classification of prostate cancer grade using temporal ultrasound: in vivo feasibility study

    Science.gov (United States)

    Ghavidel, Sahar; Imani, Farhad; Khallaghi, Siavash; Gibson, Eli; Khojaste, Amir; Gaed, Mena; Moussa, Madeleine; Gomez, Jose A.; Siemens, D. Robert; Leveridge, Michael; Chang, Silvia; Fenster, Aaron; Ward, Aaron D.; Abolmaesumi, Purang; Mousavi, Parvin

    2016-03-01

    Temporal ultrasound has been shown to have high classification accuracy in differentiating cancer from benign tissue. In this paper, we extend the temporal ultrasound method to classify lower grade Prostate Cancer (PCa) from all other grades. We use a group of nine patients with mostly lower grade PCa, where cancerous regions are also limited. A critical challenge is to train a classifier with limited aggressive cancerous tissue compared to low grade cancerous tissue. To resolve the problem of imbalanced data, we use Synthetic Minority Oversampling Technique (SMOTE) to generate synthetic samples for the minority class. We calculate spectral features of temporal ultrasound data and perform feature selection using Random Forests. In leave-one-patient-out cross-validation strategy, an area under receiver operating characteristic curve (AUC) of 0.74 is achieved with overall sensitivity and specificity of 70%. Using an unsupervised learning approach prior to proposed method improves sensitivity and AUC to 80% and 0.79. This work represents promising results to classify lower and higher grade PCa with limited cancerous training samples, using temporal ultrasound.
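
    The oversampling step named in this abstract can be illustrated with a minimal sketch of the SMOTE idea: each synthetic sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours. This is not the authors' code; the function name `smote_sketch`, the value of `k`, and the toy feature vectors are all assumptions made for illustration.

```python
import random


def smote_sketch(minority, n_synthetic, k=2, seed=0):
    """Create synthetic minority samples by interpolating toward near neighbours.

    `minority` is a list of feature vectors (lists of floats). All names and
    defaults here are illustrative assumptions, not values from the abstract.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base sample (excluding itself),
        # ranked by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to target
        synthetic.append([a + gap * (b - a) for a, b in zip(base, target)])
    return synthetic


# Toy minority-class feature vectors (made up for this example).
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_samples = smote_sketch(minority, n_synthetic=6)
```

    Because every synthetic point lies between two real minority samples, the new samples stay inside the region the minority class already occupies rather than being exact copies.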

  13. Classification tree analysis of second neoplasms in survivors of childhood cancer

    International Nuclear Information System (INIS)

    Jazbec, Janez; Todorovski, Ljupčo; Jereb, Berta

    2007-01-01

    Reports on childhood cancer survivors estimate that the cumulative probability of developing secondary neoplasms varies from 3.3% to 25% at 25 years from diagnosis, and that the risk of developing another cancer is several times greater than in the general population. In our retrospective study, we used the classification tree multivariate method on a group of 849 first cancer survivors to identify childhood cancer patients with the greatest risk of developing secondary neoplasms. In the observed group of patients, 34 developed a secondary neoplasm after treatment of the primary cancer. Analysis of parameters present at the treatment of the first cancer exposed two groups of patients at special risk for secondary neoplasm. The first are female patients treated for Hodgkin's disease at the age between 10 and 15 years, whose treatment included radiotherapy. The second group at special risk were male patients with acute lymphoblastic leukemia who were treated between 4.6 and 6.6 years of age. The risk groups identified in our study are similar to the results of studies that used more conventional approaches. The usefulness of our approach in the study of second neoplasms should be confirmed in a larger sample study, but the user-friendly presentation of results makes it attractive for further studies.

  14. The generalization ability of online SVM classification based on Markov sampling.

    Science.gov (United States)

    Xu, Jie; Yan Tang, Yuan; Zou, Bin; Xu, Zongben; Li, Luoqing; Lu, Yang

    2015-03-01

    In this paper, we consider online support vector machine (SVM) classification learning algorithms with uniformly ergodic Markov chain (u.e.M.c.) samples. We establish the bound on the misclassification error of an online SVM classification algorithm with u.e.M.c. samples based on reproducing kernel Hilbert spaces and obtain a satisfactory convergence rate. We also introduce a novel online SVM classification algorithm based on Markov sampling, and present the numerical studies on the learning ability of online SVM classification based on Markov sampling for benchmark repository. The numerical studies show that the learning performance of the online SVM classification algorithm based on Markov sampling is better than that of classical online SVM classification based on random sampling as the size of training samples is larger.

  15. Classification of types of stuttering symptoms based on brain activity.

    Directory of Open Access Journals (Sweden)

    Jing Jiang

    Full Text Available Among the non-fluencies seen in speech, some are more typical (MT) of stuttering speakers, whereas others are less typical (LT) and are common to both stuttering and fluent speakers. No neuroimaging work has evaluated the neural basis for grouping these symptom types. Another long-debated issue is which type (LT, MT) whole-word repetitions (WWR) should be placed in. In this study, a sentence completion task was performed by twenty stuttering patients who were scanned using an event-related design. This task elicited stuttering in these patients. Each stuttered trial from each patient was sorted into the MT or LT types with WWR put aside. Pattern classification was employed to train a patient-specific single trial model to automatically classify each trial as MT or LT using the corresponding fMRI data. This model was then validated by using test data that were independent of the training data. In a subsequent analysis, the classification model, just established, was used to determine which type the WWR should be placed in. The results showed that the LT and the MT could be separated with high accuracy based on their brain activity. The brain regions that made most contribution to the separation of the types were: the left inferior frontal cortex and bilateral precuneus, both of which showed higher activity in the MT than in the LT; and the left putamen and right cerebellum which showed the opposite activity pattern. The results also showed that the brain activity for WWR was more similar to that of the LT and fluent speech than to that of the MT. These findings provide a neurological basis for separating the MT and the LT types, and support the widely-used MT/LT symptom grouping scheme. In addition, WWR play a similar role as the LT, and thus should be placed in the LT type.

  16. Sequence-based classification using discriminatory motif feature selection.

    Directory of Open Access Journals (Sweden)

    Hao Xiong

    Full Text Available Most existing methods for sequence-based classification use exhaustive feature generation, employing, for example, all k-mer patterns. The motivation behind such (enumerative) approaches is to minimize the potential for overlooking important features. However, there are shortcomings to this strategy. First, practical constraints limit the scope of exhaustive feature generation to patterns of length ≤ k, such that potentially important, longer (> k) predictors are not considered. Second, features so generated exhibit strong dependencies, which can complicate understanding of derived classification rules. Third, and most importantly, numerous irrelevant features are created. These concerns can compromise prediction and interpretation. While remedies have been proposed, they tend to be problem-specific and not broadly applicable. Here, we develop a generally applicable methodology, and an attendant software pipeline, that is predicated on discriminatory motif finding. In addition to the traditional training and validation partitions, our framework entails a third level of data partitioning, a discovery partition. A discriminatory motif finder is used on sequences and associated class labels in the discovery partition to yield a (small) set of features. These features are then used as inputs to a classifier in the training partition. Finally, performance assessment occurs on the validation partition. Important attributes of our approach are its modularity (any discriminatory motif finder and any classifier can be deployed) and its universality (all data, including sequences that are unaligned and/or of unequal length, can be accommodated). We illustrate our approach on two nucleosome occupancy datasets and a protein solubility dataset, previously analyzed using enumerative feature generation. Our method achieves excellent performance results, with and without optimization of classifier tuning parameters. A Python pipeline implementing the approach is

  17. User Classification in Crowdsourcing-Based Cooperative Spectrum Sensing

    Directory of Open Access Journals (Sweden)

    Linbo Zhai

    2017-07-01

    Full Text Available This paper studies cooperative spectrum sensing based on crowdsourcing in cognitive radio networks. Since intelligent mobile users such as smartphones and tablets can sense the wireless spectrum, channel sensing tasks can be assigned to these mobile users. This is referred to as the crowdsourcing method. However, there may be some malicious mobile users that send false sensing reports deliberately, for their own purposes. False sensing reports will influence decisions about channel state. Therefore, it is necessary to classify mobile users in order to distinguish malicious users. According to the sensing reports, mobile users should not just be divided into two classes (honest and malicious. There are two reasons for this: on the one hand, honest users in different positions may have different sensing outcomes, as shadowing, multi-path fading, and other issues may influence the sensing results; on the other hand, there may be more than one type of malicious users, acting differently in the network. Therefore, it is necessary to classify mobile users into more than two classes. Due to the lack of prior information of the number of user classes, this paper casts the problem of mobile user classification as a dynamic clustering problem that is NP-hard. The paper uses the interdistance-to-intradistance ratio of clusters as the fitness function, and aims to maximize the fitness function. To cast this optimization problem, this paper proposes a distributed algorithm for user classification in order to obtain bounded close-to-optimal solutions, and analyzes the approximation ratio of the proposed algorithm. Simulations show the distributed algorithm achieves higher performance than other algorithms.
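
    The fitness function described in this abstract, a ratio of interdistance to intradistance, can be sketched as follows. The abstract does not give the exact definitions, so this sketch assumes that interdistance is the mean distance between cluster centroids and intradistance is the mean distance from each point to its own centroid; the function names and the toy sensing reports are hypothetical.

```python
import math
from itertools import combinations


def centroid(points):
    """Coordinate-wise mean of a list of equal-length vectors."""
    return [sum(coords) / len(points) for coords in zip(*points)]


def fitness(clusters):
    """Interdistance-to-intradistance ratio (assumed definitions, see lead-in).

    Requires at least two clusters and at least one point that is not exactly
    at its centroid; otherwise a division by zero occurs.
    """
    centroids = [centroid(c) for c in clusters]
    inter = [math.dist(a, b) for a, b in combinations(centroids, 2)]
    intra = [
        math.dist(point, centre)
        for points, centre in zip(clusters, centroids)
        for point in points
    ]
    return (sum(inter) / len(inter)) / (sum(intra) / len(intra))


# Made-up sensing reports from four users, grouped in two different ways.
good_split = [[[0, 0], [0, 1]], [[5, 5], [5, 6]]]
bad_split = [[[0, 0], [5, 5]], [[0, 1], [5, 6]]]
good_fitness = fitness(good_split)
bad_fitness = fitness(bad_split)
```

    A grouping that keeps similar reports together scores higher than one that mixes them, which is why maximizing this ratio favours compact, well-separated user classes.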

  18. Classification of Types of Stuttering Symptoms Based on Brain Activity

    Science.gov (United States)

    Jiang, Jing; Lu, Chunming; Peng, Danling; Zhu, Chaozhe; Howell, Peter

    2012-01-01

    Among the non-fluencies seen in speech, some are more typical (MT) of stuttering speakers, whereas others are less typical (LT) and are common to both stuttering and fluent speakers. No neuroimaging work has evaluated the neural basis for grouping these symptom types. Another long-debated issue is which type (LT, MT) whole-word repetitions (WWR) should be placed in. In this study, a sentence completion task was performed by twenty stuttering patients who were scanned using an event-related design. This task elicited stuttering in these patients. Each stuttered trial from each patient was sorted into the MT or LT types with WWR put aside. Pattern classification was employed to train a patient-specific single trial model to automatically classify each trial as MT or LT using the corresponding fMRI data. This model was then validated by using test data that were independent of the training data. In a subsequent analysis, the classification model, just established, was used to determine which type the WWR should be placed in. The results showed that the LT and the MT could be separated with high accuracy based on their brain activity. The brain regions that made most contribution to the separation of the types were: the left inferior frontal cortex and bilateral precuneus, both of which showed higher activity in the MT than in the LT; and the left putamen and right cerebellum which showed the opposite activity pattern. The results also showed that the brain activity for WWR was more similar to that of the LT and fluent speech than to that of the MT. These findings provide a neurological basis for separating the MT and the LT types, and support the widely-used MT/LT symptom grouping scheme. In addition, WWR play a similar role as the LT, and thus should be placed in the LT type. PMID:22761887

  19. Sequence-based classification and identification of Fungi.

    Science.gov (United States)

    Hibbett, David; Abarenkov, Kessy; Kõljalg, Urmas; Öpik, Maarja; Chai, Benli; Cole, James; Wang, Qiong; Crous, Pedro; Robert, Vincent; Helgason, Thorunn; Herr, Joshua R; Kirk, Paul; Lueschow, Shiloh; O'Donnell, Kerry; Nilsson, R Henrik; Oono, Ryoko; Schoch, Conrad; Smyth, Christopher; Walker, Donald M; Porras-Alfaro, Andrea; Taylor, John W; Geiser, David M

    Fungal taxonomy and ecology have been revolutionized by the application of molecular methods and both have increasing connections to genomics and functional biology. However, data streams from traditional specimen- and culture-based systematics are not yet fully integrated with those from metagenomic and metatranscriptomic studies, which limits understanding of the taxonomic diversity and metabolic properties of fungal communities. This article reviews current resources, needs, and opportunities for sequence-based classification and identification (SBCI) in fungi as well as related efforts in prokaryotes. To realize the full potential of fungal SBCI it will be necessary to make advances in multiple areas. Improvements in sequencing methods, including long-read and single-cell technologies, will empower fungal molecular ecologists to look beyond ITS and current shotgun metagenomics approaches. Data quality and accessibility will be enhanced by attention to data and metadata standards and rigorous enforcement of policies for deposition of data and workflows. Taxonomic communities will need to develop best practices for molecular characterization in their focal clades, while also contributing to globally useful datasets including ITS. Changes to nomenclatural rules are needed to enable valid publication of sequence-based taxon descriptions. Finally, cultural shifts are necessary to promote adoption of SBCI and to accord professional credit to individuals who contribute to community resources.

  20. [Clinical Study of 2014 ISUP New Grade Group Classification for Prostate Cancer Patients Treated by Androgen Deprivation Therapy].

    Science.gov (United States)

    Uno, Masahiro; Kawase, Makoto; Kato, Daiki; Ishida, Takashi; Kato, Seiichi; Fujimoto, Yoshinori

    2018-01-01

    The 2014 International Society of Urological Pathology (ISUP) has proposed a new grade group (GG) classification for Gleason scores (GS). The usefulness of the new GG classification was investigated with 518 prostate cancer patients who underwent androgen deprivation therapy. According to the new GG classification, Stages B‒D and the new GG classification relapse-free rate for each stage were calculated using the Kaplan‒Meier method. The new GG classification revealed a significant difference for the relapse-free rate only between some groups. Analysis using the Cox proportional hazards model indicated that the risk of relapse was higher in GGs 4 and 5 than in GG 1. The usefulness of the 2014 ISUP new grade group classification for predicting the relapse-free rate under androgen deprivation therapy awaits future examination.

  1. Comparison of the prevalence of malnutrition diagnosis in head and neck, gastrointestinal and lung cancer patients by three classification methods

    Science.gov (United States)

    Platek, Mary E.; Popp KPf, Johann V.; Possinger, Candi S.; DeNysschen, Carol A.; Horvath, Peter; Brown, Jean K.

    2011-01-01

    Background: Malnutrition is prevalent among patients within certain cancer types. There is lack of universal standard of care for nutrition screening, lack of agreement on an operational definition and on validity of malnutrition indicators. Objective: In a secondary data analysis, we investigated prevalence of malnutrition diagnosis by three classification methods using data from medical records of a National Cancer Institute (NCI)-designated comprehensive cancer center. Interventions/Methods: Records of 227 patients hospitalized during 1998 with head and neck, gastrointestinal or lung cancer were reviewed for malnutrition based on three methods: 1) physician diagnosed malnutrition related ICD-9 codes; 2) in-hospital nutritional assessment summary conducted by Registered Dietitians; and 3) body mass index (BMI). For patients with multiple admissions, only data from the first hospitalization was included. Results: Prevalence of malnutrition diagnosis ranged from 8.8% based on BMI to approximately 26% of all cases based on dietitian assessment. Kappa coefficients between any methods indicated a weak (kappa=0.23, BMI and Dietitians and kappa=0.28, Dietitians and Physicians) to fair strength of agreement (kappa=0.38, BMI and Physicians). Conclusions: Available methods to identify patients with malnutrition in an NCI designated comprehensive cancer center resulted in varied prevalence of malnutrition diagnosis. Universal standard of care for nutrition screening that utilizes validated tools is needed. Implications for Practice: The Joint Commission on the Accreditation of Healthcare Organizations requires nutritional screening of patients within 24 hours of admission. For this purpose, implementation of a validated tool that can be used by various healthcare practitioners, including nurses, needs to be considered. PMID:21242767
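
    The agreement statistic reported in this abstract can be illustrated with a short sketch. It assumes the kappa in question is Cohen's kappa for two raters, computed as (observed agreement − chance agreement) / (1 − chance agreement); the abstract does not state the variant. The patient labels below are invented for the example and are not data from the study.

```python
def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long lists of categorical labels.

    Undefined (division by zero) when chance agreement equals 1, for example
    when both raters assign the same single category to every case.
    """
    n = len(labels_a)
    categories = set(labels_a) | set(labels_b)
    # Fraction of cases on which the two methods agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance from each method's marginal frequencies.
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)


# Invented labels for ten patients: 1 = malnourished, 0 = not malnourished.
by_bmi = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
by_dietitian = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
kappa = cohens_kappa(by_bmi, by_dietitian)
```

    In this toy case the two methods agree on 8 of 10 patients, yet kappa is well below 0.8 because most of that agreement is expected by chance when the condition is uncommon under both methods.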

  2. Breast tissue classification in digital tomosynthesis images based on global gradient minimization and texture features

    Science.gov (United States)

    Qin, Xulei; Lu, Guolan; Sechopoulos, Ioannis; Fei, Baowei

    2014-03-01

    Digital breast tomosynthesis (DBT) is a pseudo-three-dimensional x-ray imaging modality proposed to decrease the effect of tissue superposition present in mammography, potentially resulting in an increase in clinical performance for the detection and diagnosis of breast cancer. Tissue classification in DBT images can be useful in risk assessment, computer-aided detection and radiation dosimetry, among other aspects. However, classifying breast tissue in DBT is a challenging problem because DBT images include complicated structures, image noise, and out-of-plane artifacts due to limited angular tomographic sampling. In this project, we propose an automatic method to classify fatty and glandular tissue in DBT images. First, the DBT images are pre-processed to enhance the tissue structures and to decrease image noise and artifacts. Second, a global smooth filter based on L0 gradient minimization is applied to eliminate detailed structures and enhance large-scale ones. Third, the similar structure regions are extracted and labeled by fuzzy C-means (FCM) classification. At the same time, the texture features are also calculated. Finally, each region is classified into different tissue types based on both intensity and texture features. The proposed method is validated using five patient DBT images using manual segmentation as the gold standard. The Dice scores and the confusion matrix are utilized to evaluate the classified results. The evaluation results demonstrated the feasibility of the proposed method for classifying breast glandular and fat tissue on DBT images.
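
    The Dice score used for evaluation in this abstract measures overlap between a predicted region and a reference region as 2|A ∩ B| / (|A| + |B|). A minimal sketch, assuming binary masks flattened to lists of 0 and 1; the masks below are made up and the function name is illustrative.

```python
def dice_score(predicted, reference):
    """Dice coefficient between two binary masks given as flat lists of 0/1."""
    overlap = sum(p and r for p, r in zip(predicted, reference))
    total = sum(predicted) + sum(reference)
    # Two empty masks are treated as a perfect match by convention here.
    return 1.0 if total == 0 else 2 * overlap / total


# Made-up six-pixel masks: 1 = glandular tissue, 0 = fatty tissue.
predicted_mask = [1, 1, 0, 0, 1, 0]
manual_mask = [1, 0, 0, 0, 1, 1]
dice = dice_score(predicted_mask, manual_mask)
```

    A value of 1 means the automatic and manual segmentations coincide exactly, and 0 means they do not overlap at all.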

  3. Classification between normal and tumor tissues based on the pair-wise gene expression ratio

    International Nuclear Information System (INIS)

    Yap, YeeLeng; Zhang, XueWu; Ling, MT; Wang, XiangHong; Wong, YC; Danchin, Antoine

    2004-01-01

    Precise classification of cancer types is critically important for early cancer diagnosis and treatment. Numerous efforts have been made to use gene expression profiles to improve precision of tumor classification. However, reliable cancer-related signals are generally lacking. Using recent datasets on colon and prostate cancer, a data transformation procedure from single gene expression to pair-wise gene expression ratio is proposed. Making use of the internal consistency of each expression profiling dataset, this transformation improves the signal to noise ratio of the dataset and uncovers new relevant cancer-related signals (features). The efficiency in using the transformed dataset to perform normal/tumor classification was investigated using feature partitioning with informative features (gene annotation) as discriminating axes (single gene expression or pair-wise gene expression ratio). Classification results were compared to the original datasets for up to 10-feature model classifiers. 82 and 262 genes that have high correlation to tissue phenotype were selected from the colon and prostate datasets respectively. Remarkably, data transformation of the highly noisy expression data successfully lowered the coefficient of variation (CV) for the within-class samples and improved the correlation with tissue phenotypes. The transformed dataset exhibited lower CV when compared to that of single gene expression. In the colon cancer set, the minimum CV decreased from 45.3% to 16.5%. In prostate cancer, comparable CV was achieved with and without transformation. This improvement in CV, coupled with the improved correlation between the pair-wise gene expression ratio and tissue phenotypes, yielded higher classification efficiency, especially with the colon dataset – from 87.1% to 93.5%. Over 90% of the top ten discriminating axes in both datasets showed significant improvement after data transformation. The high classification efficiency achieved suggested

  4. Genetic Bee Colony (GBC) algorithm: A new gene selection method for microarray cancer classification.

    Science.gov (United States)

    Alshamlan, Hala M; Badr, Ghada H; Alohali, Yousef A

    2015-06-01

    Naturally inspired evolutionary algorithms have proven effective when used for solving feature selection and classification problems. Artificial Bee Colony (ABC) is a relatively new swarm intelligence method. In this paper, we propose a new hybrid gene selection method, namely Genetic Bee Colony (GBC) algorithm. The proposed algorithm combines the use of a Genetic Algorithm (GA) along with Artificial Bee Colony (ABC) algorithm. The goal is to integrate the advantages of both algorithms. The proposed algorithm is applied to a microarray gene expression profile in order to select the most predictive and informative genes for cancer classification. In order to test the accuracy performance of the proposed algorithm, extensive experiments were conducted. Three binary microarray datasets are used, which include: colon, leukemia, and lung. In addition, another three multi-class microarray datasets are used, which are: SRBCT, lymphoma, and leukemia. Results of the GBC algorithm are compared with our recently proposed technique: mRMR when combined with the Artificial Bee Colony algorithm (mRMR-ABC). We also compared the combination of mRMR with GA (mRMR-GA) and Particle Swarm Optimization (mRMR-PSO) algorithms. In addition, we compared the GBC algorithm with other related algorithms that have been recently published in the literature, using all benchmark datasets. The GBC algorithm shows superior performance as it achieved the highest classification accuracy along with the lowest average number of selected genes. This proves that the GBC algorithm is a promising approach for solving the gene selection problem in both binary and multi-class cancer classification. Copyright © 2015 Elsevier Ltd. All rights reserved.

  5. Improving Generalization Based on l1-Norm Regularization for EEG-Based Motor Imagery Classification

    Directory of Open Access Journals (Sweden)

    Yuwei Zhao

    2018-05-01

    Full Text Available Multichannel electroencephalography (EEG) is widely used in typical brain-computer interface (BCI) systems. In general, a number of parameters are essential for an EEG classification algorithm due to redundant features involved in EEG signals. However, the generalization of the EEG method is often adversely affected by the model complexity, which is closely tied to its number of undetermined parameters, further leading to heavy overfitting. To decrease the complexity and improve the generalization of the EEG method, we present a novel l1-norm-based approach to combine the decision value obtained from each EEG channel directly. By extracting the information from different channels on independent frequency bands (FB) with l1-norm regularization, the proposed method fits the training data with far fewer parameters than common spatial pattern (CSP) methods in order to reduce overfitting. Moreover, an effective and efficient solution to minimize the optimization objective is proposed. The experimental results on dataset IVa of BCI competition III and dataset I of BCI competition IV show that the proposed method contributes to high classification accuracy and increases generalization performance for the classification of MI EEG. As the training set ratio decreases from 80 to 20%, the average classification accuracy on the two datasets changes from 85.86 and 86.13% to 84.81 and 76.59%, respectively. The classification performance and generalization of the proposed method contribute to the practical application of MI based BCI systems.

  6. Data Stream Classification Based on the Gamma Classifier

    Directory of Open Access Journals (Sweden)

    Abril Valeria Uriarte-Arcia

    2015-01-01

    Full Text Available The ever increasing data generation confronts us with the problem of handling online massive amounts of information. One of the biggest challenges is how to extract valuable information from these massive continuous data streams during single scanning. In a data stream context, data arrive continuously at high speed; therefore the algorithms developed to address this context must be efficient regarding memory and time management and capable of detecting changes over time in the underlying distribution that generated the data. This work describes a novel method for the task of pattern classification over a continuous data stream based on an associative model. The proposed method is based on the Gamma classifier, which is inspired by the Alpha-Beta associative memories, which are both supervised pattern recognition models. The proposed method is capable of handling the space and time constraints inherent to data stream scenarios. The Data Streaming Gamma classifier (DS-Gamma classifier) implements a sliding window approach to provide concept drift detection and a forgetting mechanism. In order to test the classifier, several experiments were performed using different data stream scenarios with real and synthetic data streams. The experimental results show that the method exhibits competitive performance when compared to other state-of-the-art algorithms.
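
    The sliding window and forgetting mechanism mentioned in this abstract can be sketched in a few lines. This is not the Gamma classifier: a nearest-neighbour rule is swapped in as a stand-in so the example stays short, and the class name, window size, and toy stream are assumptions. The point shown is only that a fixed-size window bounds memory and lets old samples drop out when the data distribution changes.

```python
from collections import deque


class SlidingWindowClassifier:
    """Keeps only the most recent labelled samples (stand-in 1-NN rule)."""

    def __init__(self, window_size):
        # A deque with maxlen discards the oldest sample automatically,
        # which acts as the forgetting mechanism.
        self.window = deque(maxlen=window_size)

    def learn(self, features, label):
        self.window.append((features, label))

    def predict(self, features):
        if not self.window:
            return None
        # Label of the closest sample in the window (squared Euclidean distance).
        nearest = min(
            self.window,
            key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], features)),
        )
        return nearest[1]


clf = SlidingWindowClassifier(window_size=2)
clf.learn([0.0], "low")
clf.learn([1.0], "high")
before_drift = clf.predict([0.1])

# The concept changes: the same regions of feature space now carry swapped labels.
clf.learn([0.0], "high")
clf.learn([1.0], "low")
after_drift = clf.predict([0.1])
```

    After the two new samples arrive, the window no longer contains the old concept, so the prediction for the same input follows the new labelling.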

  7. Estimation of Compaction Parameters Based on Soil Classification

    Science.gov (United States)

    Lubis, A. S.; Muis, Z. A.; Hastuty, I. P.; Siregar, I. M.

    2018-02-01

    Factors that must be considered in soil compaction works are the type of soil material, field control, maintenance, and availability of funds. These problems raised the question of how to estimate soil density with a proper, fast, and economical procedure. This study aims to estimate the compaction parameters, i.e. the maximum dry unit weight (γdmax) and optimum water content (wopt), based on soil classification. Each of 30 samples was tested for its index properties and compaction characteristics. All of the laboratory test results were used to estimate the compaction parameter values using linear regression and the Goswami model. According to the results, the soil types were A-4, A-6, and A-7 under AASHTO and SC, SC-SM, and CL under USCS. By linear regression, the estimated maximum dry unit weight is γdmax* = 1.862 − 0.005·FINES − 0.003·LL and the estimated optimum water content is wopt* = −0.607 + 0.362·FINES + 0.161·LL. By the Goswami model (with equation Y = m log G + k), the estimate of the maximum dry unit weight (γdmax*) uses m = −0.376 and k = 2.482, and the estimate of the optimum water content (wopt*) uses m = 21.265 and k = −32.421. For both of these equations a 95% confidence interval was obtained.
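
    The two linear regression equations can be evaluated directly, as in the sketch below, with the coefficients written using decimal points. The abstract does not state units, so the code assumes that FINES (fines content) and LL (liquid limit) are given in percent; the function name and the sample inputs are hypothetical.

```python
def estimate_compaction(fines_pct, liquid_limit_pct):
    """Evaluate the abstract's linear regression equations.

    Assumption: both inputs are percentages; the abstract gives no units.
    Returns (estimated max dry unit weight, estimated optimum water content).
    """
    gamma_dmax = 1.862 - 0.005 * fines_pct - 0.003 * liquid_limit_pct
    w_opt = -0.607 + 0.362 * fines_pct + 0.161 * liquid_limit_pct
    return gamma_dmax, w_opt


# Hypothetical soil: 50 % fines, liquid limit of 40 %
gamma_dmax, w_opt = estimate_compaction(50.0, 40.0)
print(f"gamma_dmax* = {gamma_dmax:.3f}, w_opt* = {w_opt:.3f}")
```

    The Goswami-model variant is left out because the abstract does not define the variable G or the base of the logarithm.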

  8. Toward a Safety Risk-Based Classification of Unmanned Aircraft

    Science.gov (United States)

    Torres-Pomales, Wilfredo

    2016-01-01

    There is a trend of growing interest and demand for greater access of unmanned aircraft (UA) to the National Airspace System (NAS) as the ongoing development of UA technology has created the potential for significant economic benefits. However, the lack of a comprehensive and efficient UA regulatory framework has constrained the number and kinds of UA operations that can be performed. This report presents initial results of a study aimed at defining a safety-risk-based UA classification as a plausible basis for a regulatory framework for UA operating in the NAS. Much of the study up to this point has been at a conceptual high level. The report includes a survey of contextual topics, analysis of safety risk considerations, and initial recommendations for a risk-based approach to safe UA operations in the NAS. The next phase of the study will develop and leverage deeper clarity and insight into practical engineering and regulatory considerations for ensuring that UA operations have an acceptable level of safety.

  9. Comprehensive Study on Lexicon-based Ensemble Classification Sentiment Analysis

    Directory of Open Access Journals (Sweden)

    Łukasz Augustyniak

    2015-12-01

    Full Text Available We propose a novel method for computing sentiment orientation that outperforms supervised learning approaches in time and memory complexity and is not statistically significantly different from them in accuracy. Our method consists of a novel approach to generating unigram, bigram and trigram lexicons. The proposed method, called frequentiment, is based on calculating the frequency of features (words) in the document and averaging their impact on the sentiment score as opposed to documents that do not contain these features. Afterwards, we use ensemble classification to improve the overall accuracy of the method. Importantly, the frequentiment-based lexicons with sentiment threshold selection outperform other popular lexicons and some supervised learners, while being 3–5 times faster than the supervised approach. We compare 37 methods (lexicons, ensembles with lexicons' predictions as input, and supervised learners) applied to 10 Amazon review data sets and provide the first statistical comparison of sentiment annotation methods that includes ensemble approaches. It is one of the most comprehensive comparisons of domain sentiment analysis in the literature.

  10. Simultaneous detection and classification of breast masses in digital mammograms via a deep learning YOLO-based CAD system.

    Science.gov (United States)

    Al-Masni, Mohammed A; Al-Antari, Mugahed A; Park, Jeong-Min; Gi, Geon; Kim, Tae-Yeon; Rivera, Patricio; Valarezo, Edwin; Choi, Mun-Taek; Han, Seung-Moo; Kim, Tae-Seong

    2018-04-01

    Automatic detection and classification of masses in mammograms remain a major challenge and play a crucial role in assisting radiologists with accurate diagnosis. In this paper, we propose a novel Computer-Aided Diagnosis (CAD) system based on one of the regional deep learning techniques, a ROI-based Convolutional Neural Network (CNN) called You Only Look Once (YOLO). Although most previous studies deal only with the classification of masses, our proposed YOLO-based CAD system can handle detection and classification simultaneously in one framework. The proposed CAD system contains four main stages: preprocessing of mammograms, feature extraction utilizing deep convolutional networks, mass detection with confidence, and finally mass classification using Fully Connected Neural Networks (FC-NNs). In this study, we utilized 600 original mammograms from the Digital Database for Screening Mammography (DDSM) and 2,400 augmented mammograms, with the information of the masses and their types, for training and testing our CAD system. The trained YOLO-based CAD system detects the masses and then classifies them as benign or malignant. Our results with five-fold cross validation tests show that the proposed CAD system detects the mass location with an overall accuracy of 99.7%. The system also distinguishes between benign and malignant lesions with an overall accuracy of 97%. Our proposed system even works on some challenging breast cancer cases where the masses lie over the pectoral muscles or in dense regions. Copyright © 2018 Elsevier B.V. All rights reserved.

  11. Assessing Unmet Information Needs of Breast Cancer Survivors: Exploratory Study of Online Health Forums Using Text Classification and Retrieval.

    Science.gov (United States)

    McRoy, Susan; Rastegar-Mojarad, Majid; Wang, Yanshan; Ruddy, Kathryn J; Haddad, Tufia C; Liu, Hongfang

    2018-05-15

    Patient education materials given to breast cancer survivors may not be a good fit for their information needs. Needs may change over time, be forgotten, or be misreported, for a variety of reasons. An automated content analysis of survivors' postings to online health forums can identify expressed information needs over a span of time and be repeated regularly at low cost. Identifying these unmet needs can guide improvements to existing education materials and the creation of new resources. The primary goals of this project are to assess the unmet information needs of breast cancer survivors from their own perspectives and to identify gaps between information needs and current education materials. This approach employs computational methods for content modeling and supervised text classification to data from online health forums to identify explicit and implicit requests for health-related information. Potential gaps between needs and education materials are identified using techniques from information retrieval. We provide a new taxonomy for the classification of sentences in online health forum data. 260 postings from two online health forums were selected, yielding 4179 sentences for coding. After annotation of data and training alternative one-versus-others classifiers, a random forest-based approach achieved F1 scores from 66% (Other, dataset2) to 90% (Medical, dataset1) on the primary information types. 136 expressions of need were used to generate queries to indexed education materials. Upon examination of the best two pages retrieved for each query, 12% (17/136) of queries were found to have relevant content by all coders, and 33% (45/136) were judged to have relevant content by at least one. Text from online health forums can be analyzed effectively using automated methods. 
    Our analysis confirms that breast cancer survivors have many information needs that are not covered by the written documents they typically receive, as our results suggest that at most

  12. Knowledge-based sea ice classification by polarimetric SAR

    DEFF Research Database (Denmark)

    Skriver, Henning; Dierking, Wolfgang

    2004-01-01

    Polarimetric SAR images acquired at C- and L-band over sea ice in the Greenland Sea, Baltic Sea, and Beaufort Sea have been analysed with respect to their potential for ice type classification. The polarimetric data were gathered by the Danish EMISAR and the US AIRSAR which both are airborne...... systems. A hierarchical classification scheme was chosen for sea ice because our knowledge about magnitudes, variations, and dependences of sea ice signatures can be directly considered. The optimal sequence of classification rules and the rules themselves depend on the ice conditions/regimes. The use...... of the polarimetric phase information improves the classification only in the case of thin ice types but is not necessary for thicker ice (above about 30 cm thickness)...

  13. Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks

    Science.gov (United States)

    The purpose of this study was to develop a method of classifying cancers to specific diagnostic categories based on their gene expression signatures using artificial neural networks (ANNs). We trained the ANNs using the small, round blue-cell tumors (SRBCTs) as a model. These cancers belong to four distinct diagnostic categories and often present diagnostic dilemmas in

  14. Palm-vein classification based on principal orientation features.

    Directory of Open Access Journals (Sweden)

    Yujia Zhou

    Full Text Available Personal recognition using palm-vein patterns has emerged as a promising alternative for human recognition because of its uniqueness, stability, live body identification, flexibility, and difficulty to cheat. With the expanding application of palm-vein pattern recognition, the corresponding growth of the database has resulted in a long response time. To shorten the response time of identification, this paper proposes a simple and useful classification for palm-vein identification based on principal direction features. In the registration process, the Gaussian-Radon transform is adopted to extract the orientation matrix and then compute the principal direction of a palm-vein image based on the orientation matrix. The database can be classified into six bins based on the value of the principal direction. In the identification process, the principal direction of the test sample is first extracted to ascertain the corresponding bin. One-by-one matching with the training samples is then performed in the bin. To improve recognition efficiency while maintaining better recognition accuracy, two neighborhood bins of the corresponding bin are continuously searched to identify the input palm-vein image. Evaluation experiments are conducted on three different databases, namely, PolyU, CASIA, and the database of this study. Experimental results show that the searching range of one test sample in PolyU, CASIA and our database by the proposed method for palm-vein identification can be reduced to 14.29%, 14.50%, and 14.28%, with retrieval accuracy of 96.67%, 96.00%, and 97.71%, respectively. With 10,000 training samples in the database, the execution time of the identification process by the traditional method is 18.56 s, while that by the proposed approach is 3.16 s. The experimental results confirm that the proposed approach is more efficient than the traditional method, especially for a large database.
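
    The bin-based retrieval step can be sketched as follows. The abstract says only that the database is divided into six bins by principal direction and that the two neighbouring bins are also searched; the equal-width 30° bins over [0°, 180°), the circular neighbourhood, and all names below are assumptions made for illustration.

```python
from collections import defaultdict

NUM_BINS = 6
BIN_WIDTH_DEG = 180.0 / NUM_BINS  # assumed equal-width bins over [0, 180)


def bin_index(direction_deg):
    """Map a principal direction in degrees to one of the six bins."""
    return int((direction_deg % 180.0) // BIN_WIDTH_DEG)


def build_index(templates):
    """Registration: group (template_id, principal_direction_deg) pairs by bin."""
    index = defaultdict(list)
    for template_id, direction_deg in templates:
        index[bin_index(direction_deg)].append(template_id)
    return index


def candidate_ids(index, query_direction_deg):
    """Identification: yield IDs from the query's bin, then its two neighbours."""
    home = bin_index(query_direction_deg)
    for offset in (0, -1, 1):
        yield from index.get((home + offset) % NUM_BINS, [])


# Hypothetical enrolled templates with their principal directions
index = build_index([("a", 5), ("b", 35), ("c", 95), ("d", 175)])
print(list(candidate_ids(index, 10)))
```

    One-by-one matching would then run only over the yielded candidates, and a caller can stop as soon as a match is found in the home bin, which is where the reduction in search range comes from.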

  15. Trace elements based classification on clinkers. Application to Spanish clinkers

    Directory of Open Access Journals (Sweden)

    Tamás, F. D.

    2001-12-01

    Full Text Available The qualitative identification of Spanish clinkers to determine their origin (i.e. manufacturing factory) is described. The classification of clinkers produced in different factories can be based on their trace element content. Approximately fifteen clinker sorts, collected from 11 Spanish cement factories, are analysed to determine their Mg, Sr, Ba, Mn, Ti, Zr, Zn and V content. An expert system formulated as a binary decision tree is designed based on the collected data. The performance of the resulting classifier was measured by ten-fold cross validation. The results show that the proposed method is useful for deriving an easy-to-use expert system that is able to determine the origin of a clinker based on its trace element content.


  16. Event-Based User Classification in Weibo Media

    Directory of Open Access Journals (Sweden)

    Liang Guo

    2014-01-01

    Full Text Available Weibo media, known as real-time microblogging services, have attracted massive attention and support from social network users. The Weibo platform offers people an opportunity to access information and significantly changes the way people acquire and disseminate it. It also enables people to respond to social events more conveniently. Much of the information in Weibo media is related to specific events. Users who post different content and exhibit different behaviors or attitudes may contribute differently to a specific event. Therefore, automatically classifying the large number of uncategorized social circles generated in Weibo media from the perspective of events is a promising task. To effectively organize and manage the huge number of users, and thereby their content, we address the task of user classification with a more granular, event-based approach in this paper. By analyzing real data collected from Sina Weibo, we investigate the Weibo properties and utilize both content information and social network information to classify the numerous users into four primary groups: celebrities, organizations/media accounts, grassroots stars, and ordinary individuals. The experimental results show that our method identifies the user categories accurately.

  17. Event-based user classification in Weibo media.

    Science.gov (United States)

    Guo, Liang; Wang, Wendong; Cheng, Shiduan; Que, Xirong

    2014-01-01

    Weibo media, known as real-time microblogging services, have attracted massive attention and support from social network users. The Weibo platform offers people an opportunity to access information and significantly changes the way people acquire and disseminate it. It also enables people to respond to social events more conveniently. Much of the information in Weibo media is related to specific events. Users who post different content and exhibit different behaviors or attitudes may contribute differently to a specific event. Therefore, automatically classifying the large number of uncategorized social circles generated in Weibo media from the perspective of events is a promising task. To effectively organize and manage the huge number of users, and thereby their content, we address the task of user classification with a more granular, event-based approach in this paper. By analyzing real data collected from Sina Weibo, we investigate the Weibo properties and utilize both content information and social network information to classify the numerous users into four primary groups: celebrities, organizations/media accounts, grassroots stars, and ordinary individuals. The experimental results show that our method identifies the user categories accurately.

  18. Radar-Derived Quantitative Precipitation Estimation Based on Precipitation Classification

    Directory of Open Access Journals (Sweden)

    Lili Yang

    2016-01-01

    Full Text Available A method for improving radar-derived quantitative precipitation estimation is proposed. Tropical vertical profiles of reflectivity (VPRs) are first determined from multiple VPRs. Upon identifying a tropical VPR, the event can be further classified as either tropical-stratiform or tropical-convective rainfall by a fuzzy logic (FL) algorithm. Based on the precipitation-type fields, the reflectivity values are converted into rainfall rate using a Z-R relationship. To evaluate the performance of this rainfall classification scheme, three experiments were conducted using three months of data and two study cases. In Experiment I, the Weather Surveillance Radar-1988 Doppler (WSR-88D) default Z-R relationship was applied. In Experiment II, the precipitation regime was separated into convective and stratiform rainfall using the FL algorithm, and the corresponding Z-R relationships were used. In Experiment III, the precipitation regime was separated into convective, stratiform, and tropical rainfall, and the corresponding Z-R relationships were applied. The results show that the rainfall rates obtained from all three experiments match the gauge observations closely. Experiment II alleviated the underestimation seen in Experiment I, while Experiment III significantly reduced this underestimation and generated the most accurate radar estimates of rain rate among the three experiments.
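
    A Z-R relationship has the power-law form Z = a·R^b, where Z is the linear reflectivity factor (mm^6 m^-3) and R is the rain rate (mm/h). The sketch below inverts it per precipitation type. The abstract does not list the coefficients used in the study, so commonly cited values are assumed here: the WSR-88D default convective relation (a = 300, b = 1.4), Marshall-Palmer for stratiform rain (a = 200, b = 1.6), and the Rosenfeld tropical relation (a = 250, b = 1.2).

```python
import math

# Assumed (a, b) pairs for Z = a * R**b; NOT taken from the abstract.
ZR_COEFFICIENTS = {
    "convective": (300.0, 1.4),  # WSR-88D default
    "stratiform": (200.0, 1.6),  # Marshall-Palmer
    "tropical": (250.0, 1.2),    # Rosenfeld tropical
}


def rain_rate_mm_per_h(reflectivity_dbz, precip_type):
    """Convert reflectivity in dBZ to rain rate for a classified pixel."""
    a, b = ZR_COEFFICIENTS[precip_type]
    z_linear = math.pow(10.0, reflectivity_dbz / 10.0)  # dBZ -> mm^6 m^-3
    return math.pow(z_linear / a, 1.0 / b)              # invert Z = a * R**b


for precip_type in ZR_COEFFICIENTS:
    print(precip_type, round(rain_rate_mm_per_h(40.0, precip_type), 1))
```

    With these assumed coefficients, the tropical relation yields a higher rain rate than the convective default at the same reflectivity, which is the mechanism by which a tropical class can reduce underestimation.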

  19. Treatment of esophageal motility disorders based on the Chicago Classification.

    Science.gov (United States)

    Maradey-Romero, Carla; Gabbard, Scott; Fass, Ronnie

    2014-12-01

    The Chicago Classification divides esophageal motor disorders based on the recorded value of the integrated relaxation pressure (IRP). The first group includes those with an elevated mean IRP that is associated with peristaltic abnormalities such as achalasia and esophagogastric junction outflow obstruction. The second group includes those with a normal mean IRP that is associated with esophageal hypermotility disorders such as distal esophageal spasm, hypercontractile esophagus (jackhammer esophagus), and hypertensive peristalsis (nutcracker esophagus). The third group includes those with a normal mean IRP that is associated with esophageal hypomotility peristaltic abnormalities such as absent peristalsis, weak peristalsis with small or large breaks, and frequent failed peristalsis. The therapeutic options vary greatly between the different groups of esophageal motor disorders. In achalasia patients, potential treatment strategies comprise medical therapy (calcium channel blockers, nitrates, and phosphodiesterase 5 inhibitors), endoscopic procedures (botulinum toxin A injection, pneumatic dilation, or peroral endoscopic myotomy) or surgery (Heller myotomy). Patients with a normal IRP and esophageal hypermotility disorder are candidates for medical therapy (nitrates, calcium channel blockers, phosphodiesterase 5 inhibitors, cimetropium/ipratropium bromide, proton pump inhibitors, benzodiazepines, tricyclic antidepressants, trazodone, selective serotonin reuptake inhibitors, and serotonin-norepinephrine reuptake inhibitors), endoscopic procedures (botulinum toxin A injection and peroral endoscopic myotomy), or surgery (Heller myotomy). Lastly, in patients with a normal IRP and esophageal hypomotility disorder, treatment is primarily focused on controlling the presence of gastroesophageal reflux with proton pump inhibitors and lifestyle modifications (soft and liquid diet and eating in the upright position) to address patient's dysphagia.

  20. China's Classification-Based Forest Management: Procedures, Problems, and Prospects

    Science.gov (United States)

    Dai, Limin; Zhao, Fuqiang; Shao, Guofan; Zhou, Li; Tang, Lina

    2009-06-01

    China’s new Classification-Based Forest Management (CFM) is a two-class system, including Commodity Forest (CoF) and Ecological Welfare Forest (EWF) lands, so named according to differences in their distinct functions and services. The purposes of CFM are to improve forestry economic systems, strengthen resource management in a market economy, ease the conflicts between wood demands and public welfare, and meet the diversified needs for forest services in China. The formative process of China’s CFM has involved a series of trials and revisions. China’s central government accelerated the reform of CFM in the year 2000 and completed the final version in 2003. CFM was implemented at the provincial level with the aid of subsidies from the central government. About a quarter of the forestland in China was approved as National EWF lands by the State Forestry Administration in 2006 and 2007. Logging is prohibited on National EWF lands, and their landowners or managers receive subsidies of about 70 RMB (US$10) per hectare from the central government. CFM represents a new forestry strategy in China and its implementation inevitably faces challenges in promoting the understanding of forest ecological services, generalizing nationwide criteria for identifying EWF and CoF lands, setting up forest-specific compensation mechanisms for ecological benefits, enhancing the knowledge of administrators and the general public about CFM, and sustaining EWF lands under China’s current forestland tenure system. CFM does, however, offer a viable pathway toward sustainable forest management in China.

  1. Classification of CT brain images based on deep learning networks.

    Science.gov (United States)

    Gao, Xiaohong W; Hui, Rui; Tian, Zengmin

    2017-01-01

    While computerised tomography (CT) may have been the first imaging tool to study human brain, it has not yet been implemented into clinical decision making process for diagnosis of Alzheimer's disease (AD). On the other hand, with the nature of being prevalent, inexpensive and non-invasive, CT does present diagnostic features of AD to a great extent. This study explores the significance and impact on the application of the burgeoning deep learning techniques to the task of classification of CT brain images, in particular utilising convolutional neural network (CNN), aiming at providing supplementary information for the early diagnosis of Alzheimer's disease. Towards this end, three categories of CT images (N = 285) are clustered into three groups, which are AD, lesion (e.g. tumour) and normal ageing. In addition, considering the characteristics of this collection with larger thickness along the direction of depth (z) (~3-5 mm), an advanced CNN architecture is established integrating both 2D and 3D CNN networks. The fusion of the two CNN networks is subsequently coordinated based on the average of Softmax scores obtained from both networks consolidating 2D images along spatial axial directions and 3D segmented blocks respectively. As a result, the classification accuracy rates rendered by this elaborated CNN architecture are 85.2%, 80% and 95.3% for classes of AD, lesion and normal respectively with an average of 87.6%. Additionally, this improved CNN network appears to outperform the others when in comparison with 2D version only of CNN network as well as a number of state of the art hand-crafted approaches. As a result, these approaches deliver accuracy rates in percentage of 86.3, 85.6 ± 1.10, 86.3 ± 1.04, 85.2 ± 1.60, 83.1 ± 0.35 for 2D CNN, 2D SIFT, 2D KAZE, 3D SIFT and 3D KAZE respectively. The two major contributions of the paper constitute a new 3-D approach while applying deep learning technique to extract signature information
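
    The score-level fusion described above, averaging the Softmax scores of the 2D and 3D networks, can be sketched as below. The logits are made-up numbers and the function names are hypothetical; the networks themselves are not reproduced.

```python
import math

CLASSES = ("AD", "lesion", "normal")


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def fuse_and_classify(logits_2d, logits_3d):
    """Average the two networks' softmax scores and pick the top class."""
    probs_2d = softmax(logits_2d)
    probs_3d = softmax(logits_3d)
    fused = [(p2 + p3) / 2.0 for p2, p3 in zip(probs_2d, probs_3d)]
    best = max(range(len(fused)), key=fused.__getitem__)
    return CLASSES[best], fused


# Hypothetical outputs of the 2D and 3D branches for one scan
label, fused = fuse_and_classify([2.0, 1.0, 0.0], [0.0, 0.0, 3.0])
print(label, [round(p, 3) for p in fused])
```

    Averaging probabilities rather than raw logits keeps the two branches on a common scale, so a confident branch can outvote an uncertain one.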

  2. ParSel: Parallel Selection of Micro-RNAs for Survival Classification in Cancers.

    Science.gov (United States)

    Sinha, Debajyoti; Sengupta, Debarka; Bandyopadhyay, Sanghamitra

    2017-07-01

    It is known that tumor micro-RNAs (miRNA) can define patient survival and treatment response. We present a framework to identify miRNAs which are predictive of cancer survival. The framework attempts to rank the miRNAs by exploring their collaborative role in gene regulation. Our approach tests a significantly large number of combinatorial cases leveraging parallel computation. We carefully avoided parametric assumptions involved in evaluations of miRNA expressions but used rigorous statistical computation to assign an importance score to a miRNA. Experimental results on three cancer types namely, KIRC, OV and GBM verify that the top ranked miRNAs obtained using the proposed framework produce better classification accuracy as compared to some best practice variable selection methods. Some of these top ranked miRNA are also known to be associated with related diseases. © 2017 Wiley‐VCH Verlag GmbH & Co. KGaA, Weinheim.

  3. Basic Hand Gestures Classification Based on Surface Electromyography

    Directory of Open Access Journals (Sweden)

    Aleksander Palkowski

    2016-01-01

    Full Text Available This paper presents an innovative classification system for hand gestures using 2-channel surface electromyography analysis. The system developed uses the Support Vector Machine classifier, for which the kernel function and parameter optimisation are conducted additionally by the Cuckoo Search swarm algorithm. The system developed is compared with standard Support Vector Machine classifiers with various kernel functions. The average classification rate of 98.12% has been achieved for the proposed method.

  4. Ligand and structure-based classification models for Prediction of P-glycoprotein inhibitors

    DEFF Research Database (Denmark)

    Klepsch, Freya; Poongavanam, Vasanthanathan; Ecker, Gerhard Franz

    2014-01-01

    an algorithm based on Euclidean distance. Results show that random forest and SVM performed best for classification of P-gp inhibitors and non-inhibitors, correctly predicting 73/75 % of the external test set compounds. Classification based on the docking experiments using the scoring function Chem...

  5. Tweet-based Target Market Classification Using Ensemble Method

    Directory of Open Access Journals (Sweden)

    Muhammad Adi Khairul Anshary

    2016-09-01

    Full Text Available Target market classification is aimed at focusing marketing activities on the right targets. Classification of target markets can be done through data mining and by utilizing data from social media, e.g. Twitter. The end results of data mining are learning models that can classify new data. Ensemble methods can improve the accuracy of the models and therefore provide better results. In this study, classification of target markets was conducted on a dataset of 3000 tweets in order to extract features. Classification models were constructed to manipulate the training data using two ensemble methods (bagging and boosting). To investigate the effectiveness of the ensemble methods, this study used the CART (classification and regression tree) algorithm for comparison. Three categories of consumer goods (computers, mobile phones and cameras) and three categories of sentiments (positive, negative and neutral) were classified towards three target-market categories. Machine learning was performed using Weka 3.6.9. The results on the test data showed that the bagging method improved the accuracy of CART by 1.9% (to 85.20%). On the other hand, for sentiment classification, the ensemble methods were not successful in increasing the accuracy of CART. The results of this study may be taken into consideration by companies that approach their customers through social media, especially Twitter.

  6. Research on Remote Sensing Image Classification Based on Feature Level Fusion

    Science.gov (United States)

    Yuan, L.; Zhu, G.

    2018-04-01

    Remote sensing image classification, as an important direction of remote sensing image processing and application, has been widely studied. However, existing classification algorithms still suffer from misclassification and missing points, so the final classification accuracy is not high. In this paper, we select Sentinel-1A and Landsat8 OLI images as data sources and propose a classification method based on feature level fusion. We compare three kinds of feature level fusion algorithms (i.e., Gram-Schmidt spectral sharpening, Principal Component Analysis transform and Brovey transform) and then select the best fused image for the classification experiment. In the classification process, we choose four kinds of image classification algorithms (i.e. Minimum distance, Mahalanobis distance, Support Vector Machine and ISODATA) for a comparative experiment. We use overall classification precision and the Kappa coefficient as the classification accuracy evaluation criteria, and the four classification results of the fused image are analysed. The experimental results show that the fusion effect of Gram-Schmidt spectral sharpening is better than that of the other methods. Among the four classification algorithms, the fused image is best suited to Support Vector Machine classification, with an overall classification precision of 94.01 % and a Kappa coefficient of 0.91. The fused image from Sentinel-1A and Landsat8 OLI not only has more spatial information and spectral texture characteristics, but also enhances the distinguishing features of the images. The proposed method helps to improve the accuracy and stability of remote sensing image classification.
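
    The two evaluation criteria named above, overall classification precision and the Kappa coefficient, are both computed from a confusion matrix. The sketch below uses the standard Cohen's kappa definition; the 2×2 matrix is made-up illustration data, not results from the study.

```python
def overall_accuracy_and_kappa(confusion):
    """Compute overall accuracy and Cohen's kappa from a square confusion matrix.

    confusion[i][j] = number of samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance, from the row and column marginals
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical two-class confusion matrix
accuracy, kappa = overall_accuracy_and_kappa([[45, 5], [10, 40]])
print(f"overall accuracy = {accuracy:.2f}, kappa = {kappa:.2f}")
```

    Kappa discounts the agreement expected by chance, which is why it is lower than the overall accuracy for the same matrix.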

  7. Ultrasonographic characteristics and BI-RADS-US classification of BRCA1 mutation-associated breast cancer in Guangxi, China.

    Science.gov (United States)

    Li, Cheng; Liu, Junjie; Wang, Sida; Chen, Yuanyuan; Yuan, Zhigang; Zeng, Jian; Li, Zhixian

    2015-01-01

    To retrospectively analyze and compare the ultrasonographic characteristics and BI-RADS-US classification between patients with BRCA1 mutation-associated breast cancer and those without BRCA1 gene mutation in Guangxi, China. The study was performed on 36 lesions from 34 BRCA1 mutation-associated breast cancer patients. A total of 422 lesions from 422 breast cancer patients without BRCA1 mutations served as the control group. The ultrasonographic features and BI-RADS-US classification of the two groups were compared. A more complex inner echo was found in BRCA1 mutation-associated breast cancer patients (χ² = 4.741, P = 0.029). The BI-RADS classification of BRCA1 mutation-associated breast cancer was lower (U = 6094.0, P = 0.022). BRCA1 mutation-associated breast cancer frequently displays a microlobulated margin and complex echo. It also shows more benign characteristics in morphology, and its BI-RADS classification is prone to be underestimated.

  8. [Molecular classification of breast cancer patients obtained through the technique of chromogenic in situ hybridization (CISH)].

    Science.gov (United States)

    Fernández, Angel; Reigosa, Aldo

    2013-12-01

    Breast cancer is a heterogeneous disease composed of a growing number of biological subtypes, with substantial variability of disease progression within each category. The aim of this research was to classify the study samples according to the molecular classes of breast cancer (luminal A, luminal B, HER2 and triple negative) based on the state of HER2 amplification obtained by the technique of chromogenic in situ hybridization (CISH). The sample consisted of 200 biopsies fixed in 10% formalin and processed by standard techniques up to paraffin embedding, corresponding to patients diagnosed with invasive ductal carcinoma of the breast. These biopsies were obtained from patients from private practice and the Institute of Oncology "Dr. Miguel Pérez Carreño", for immunohistochemistry (IHC) of hormone receptors and HER2 performed at the Hospital Metropolitano del Norte, Valencia, Venezuela. The molecular classification of the patients' tumors, considering the expression of estrogen and progesterone receptors by IHC and HER2 amplification by CISH, allowed those cases originally classified as unknown, because they had an indeterminate (2+) outcome for HER2 expression by IHC, to be grouped into the different molecular classes. This classification also allowed some cases initially considered as belonging to one molecular class to be reassigned to another class after the re-evaluation of HER2 status by CISH.

  9. Histopathological Breast Cancer Image Classification by Deep Neural Network Techniques Guided by Local Clustering.

    Science.gov (United States)

    Nahid, Abdullah-Al; Mehrabi, Mohamad Ali; Kong, Yinan

    2018-01-01

    Breast cancer is a serious threat and one of the leading causes of death among women throughout the world. The identification of cancer largely depends on the analysis of digital biomedical images, such as histopathological images, by doctors and physicians. Analyzing histopathological images is a nontrivial task, and decisions from investigation of these kinds of images always require specialised knowledge. However, Computer Aided Diagnosis (CAD) techniques can help the doctor make more reliable decisions. The state-of-the-art Deep Neural Network (DNN) has recently been introduced for biomedical image analysis. Normally each image contains structural and statistical information. This paper classifies a set of biomedical breast cancer images (BreakHis dataset) using novel DNN techniques guided by structural and statistical information derived from the images. Specifically, a Convolutional Neural Network (CNN), a Long-Short-Term-Memory (LSTM), and a combination of CNN and LSTM are proposed for breast cancer image classification. Softmax and Support Vector Machine (SVM) layers have been used for the decision-making stage after extracting features utilising the proposed novel DNN models. In this experiment the best Accuracy value of 91.00% is achieved on the 200x dataset, the best Precision value of 96.00% is achieved on the 40x dataset, and the best F-Measure value is achieved on both the 40x and 100x datasets.

  10. Are preoperative histology and MRI useful for classification of endometrial cancer risk?

    International Nuclear Information System (INIS)

    Body, Noemie; Lavoué, Vincent; De Kerdaniel, Olivier; Foucher, Fabrice; Henno, Sébastien; Cauchois, Aurélie; Laviolle, Bruno; Leblanc, Marc; Levêque, Jean

    2016-01-01

    The 2010 guidelines of the French National Cancer Institute (INCa) classify patients with endometrial cancer into three risk groups for lymph node invasion and recurrence on the basis of MRI and histological analysis of an endometrial specimen obtained preoperatively. The classification guides therapeutic choices, which may include pelvic and/or para-aortic lymphadenectomy. The purpose of this study was to evaluate the diagnostic performance of preoperative assessment to help identify intermediate- or high-risk patients requiring lymphadenectomy. The study included all patients who underwent surgery for endometrial cancer between January 2010 and December 2013 at either Rennes University Hospital or Vannes Regional Hospital. The criteria for eligibility included a preoperative assessment with MRI and histological examination of an endometrial sample. A histological comparison was made between the preoperative and surgical specimens. Among the 91 patients who underwent a full preoperative assessment, the diagnosis of intermediate- or high-risk endometrial cancer was established by MRI and histology with a sensitivity of 70 %, specificity of 82 %, positive predictive value (PPV) of 87 %, negative predictive value (NPV) of 61 %, positive likelihood ratio (LR+) of 3.8 and negative likelihood ratio (LR-) of 0.3. The risk group was underestimated in 32 % of patients and overestimated in 7 % of patients. MRI underestimated endometrial cancer stage in 20 % of cases, while endometrial sampling underestimated the histological type in 4 % of cases and the grade in 9 % of cases. The preoperative assessment overestimated or underestimated the risk of recurrence in nearly 40 % of cases, with errors in lesion type, grade or stage. Erroneous preoperative risk assessment leads to suboptimal initial surgical management of patients with endometrial cancer
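
    The likelihood ratios quoted in this abstract follow from the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The sketch below is only an illustration of that arithmetic; the function name is hypothetical, and because it starts from the rounded percentages (70 % and 82 %) its output agrees with the reported 3.8 and 0.3 only approximately.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a binary diagnostic test.

    Both arguments are proportions in the open interval (0, 1).
    """
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# Rounded values taken from the abstract above (assumption: the authors
# computed their ratios from unrounded counts, so small differences remain).
lr_pos, lr_neg = likelihood_ratios(0.70, 0.82)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

    An LR+ close to 4 means a positive preoperative assessment raises the odds of intermediate- or high-risk disease roughly fourfold, which helps explain why a notable share of patients are still misclassified.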

  11. Hierarchical structure for audio-video based semantic classification of sports video sequences

    Science.gov (United States)

    Kolekar, M. H.; Sengupta, S.

    2005-07-01

    A hierarchical structure for sports event classification based on audio and video content analysis is proposed in this paper. Compared to event classification in other games, that of cricket is very challenging and as yet unexplored. We have successfully solved the cricket video classification problem using a six-level hierarchical structure. The first level performs event detection based on the audio energy and Zero Crossing Rate (ZCR) of the short-time audio signal. In the subsequent levels, we classify the events based on video features using a Hidden Markov Model implemented through Dynamic Programming (HMM-DP), using color or motion as a likelihood function. For some of the game-specific decisions, a rule-based classification is also performed. Our proposed hierarchical structure can easily be applied to other sports. Our results are very promising, and we have moved a step forward towards addressing semantic classification problems in general.
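
    The two first-level audio features named in this abstract, short-time energy and zero crossing rate, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name, the non-overlapping framing, and the synthetic test signal are all hypothetical.

```python
import math


def frame_features(samples, frame_len):
    """Return a list of (energy, zcr) pairs for non-overlapping frames.

    energy: mean squared amplitude of the frame.
    zcr:    fraction of adjacent sample pairs whose signs differ.
    """
    features = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        features.append((energy, crossings / (frame_len - 1)))
    return features


# Hypothetical signal: a quiet, slowly varying frame followed by a loud,
# rapidly oscillating frame (standing in for a burst of crowd noise).
quiet = [0.1 * math.sin(2 * math.pi * 2 * n / 100) for n in range(100)]
loud = [0.9 * math.sin(2 * math.pi * 20 * n / 100) for n in range(100)]
feats = frame_features(quiet + loud, frame_len=100)
for energy, zcr in feats:
    print(f"energy={energy:.3f} zcr={zcr:.3f}")
```

    A detector in this style would flag frames whose energy and ZCR exceed chosen thresholds; the thresholds themselves are not given in the abstract.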

  12. SPAM CLASSIFICATION BASED ON SUPERVISED LEARNING USING MACHINE LEARNING TECHNIQUES

    Directory of Open Access Journals (Sweden)

    T. Hamsapriya

    2011-12-01

    Full Text Available E-mail is one of the most popular and frequently used means of communication due to its worldwide accessibility, relatively fast message transfer, and low sending cost. Flaws in the e-mail protocols and the increasing volume of electronic business and financial transactions directly contribute to the increase in e-mail-based threats. E-mail spam is one of the major problems of today's Internet, causing financial damage to companies and annoying individual users. Spam e-mails reach users without their consent and fill their mailboxes. They consume network capacity as well as the time spent checking and deleting them. The vast majority of Internet users are outspoken in their disdain for spam, although enough of them respond to commercial offers that spam remains a viable source of income for spammers. While most users want to do the right thing to avoid and get rid of spam, they need clear and simple guidelines on how to behave. In spite of all the measures taken to eliminate spam, it has not yet been eradicated. Moreover, when the countermeasures are oversensitive, even legitimate e-mails are eliminated. Among the approaches developed to stop spam, filtering is one of the most important techniques. Much research in spam filtering has centered on the more sophisticated classifier-related issues. Recently, machine learning for spam classification has become an important research topic. The proposed work explores and identifies the use of different learning algorithms for classifying spam messages from e-mail. A comparative analysis of the algorithms is also presented.

  13. Feature selection and classification of MAQC-II breast cancer and multiple myeloma microarray gene expression data.

    Directory of Open Access Journals (Sweden)

    Qingzhong Liu

    Full Text Available Microarray data has a high dimension of variables but available datasets usually have only a small number of samples, thereby making the study of such datasets interesting and challenging. In the task of analyzing microarray data for the purpose of, e.g., predicting gene-disease association, feature selection is very important because it provides a way to handle the high dimensionality by exploiting information redundancy induced by associations among genetic markers. Judicious feature selection in microarray data analysis can result in significant reduction of cost while maintaining or improving the classification or prediction accuracy of learning machines that are employed to sort out the datasets. In this paper, we propose a gene selection method called Recursive Feature Addition (RFA), which combines supervised learning and statistical similarity measures. We compare our method with the following gene selection methods: Support Vector Machine Recursive Feature Elimination (SVMRFE), Leave-One-Out Calculation Sequential Forward Selection (LOOCSFS), and Gradient based Leave-one-out Gene Selection (GLGS). To evaluate the performance of these gene selection methods, we employ several popular learning classifiers on the MicroArray Quality Control phase II on predictive modeling (MAQC-II) breast cancer dataset and the MAQC-II multiple myeloma dataset. Experimental results show that gene selection is strictly paired with learning classifier. Overall, our approach outperforms other compared methods. The biological functional analysis based on the MAQC-II breast cancer dataset convinced us to apply our method for phenotype prediction. Additionally, learning classifiers also play important roles in the classification of microarray data and our experimental results indicate that the Nearest Mean Scale Classifier (NMSC) is a good choice due to its prediction reliability and its stability across the three performance measurements: Testing accuracy, MCC values, and
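
    One of the performance measurements listed in this record is MCC, taken here to mean the Matthews correlation coefficient as commonly used in the MAQC-II evaluations. The sketch below shows the standard formula on a binary confusion matrix; the function name and the convention of returning 0 for an undefined denominator are assumptions made for illustration.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (chance level)
    to +1 (perfect prediction).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: a degenerate matrix is scored as chance level.
        return 0.0
    return (tp * tn - fp * fn) / denom


print(matthews_corrcoef(tp=5, tn=5, fp=0, fn=0))  # perfect prediction
print(matthews_corrcoef(tp=5, tn=5, fp=5, fn=5))  # chance-level prediction
```

    Unlike plain accuracy, this measure stays informative when the two classes are very unequal in size, which is common in microarray datasets.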

  14. Classification of urine sediment based on convolution neural network

    Science.gov (United States)

    Pan, Jingjing; Jiang, Cunbo; Zhu, Tiantian

    2018-04-01

    By designing a new convolutional neural network framework, this paper removes the constraints of the original convolutional neural network framework, which requires a large number of training samples and samples of the same size. The input images are shifted and cropped to generate sub-images of the same size. Dropout is then applied to the generated sub-images, increasing the diversity of samples and preventing overfitting. Proper subsets of the sub-image set are selected at random such that all subsets contain the same number of elements but no two subsets are identical. These proper subsets are used as inputs to the convolutional neural network. After the convolution layer, the pooling layer, the fully connected layer and the output layer, the classification loss rates on the test set and the training set are obtained. In a classification experiment on red blood cells, white blood cells and calcium oxalate crystals, the classification accuracy is 97% or higher.

  15. Data Clustering and Evolving Fuzzy Decision Tree for Data Base Classification Problems

    Science.gov (United States)

    Chang, Pei-Chann; Fan, Chin-Yuan; Wang, Yen-Wen

    Data base classification suffers from two well-known difficulties, i.e., high dimensionality and non-stationary variations within large historic data. This paper presents a hybrid classification model that integrates a case-based reasoning technique, a Fuzzy Decision Tree (FDT), and Genetic Algorithms (GA) to construct a decision-making system for data classification in various data base applications. The model is mainly based on the idea that the historic data base can be transformed into a smaller case base together with a group of fuzzy decision rules. As a result, the model can respond more accurately to the data currently being classified, using the inductions from these smaller case-based fuzzy decision trees. Hit rate is applied as a performance measure, and the effectiveness of the proposed model is demonstrated by experimental comparison with other approaches on different data base classification applications. The average hit rate of the proposed model is the highest among the compared approaches.

  16. Data classification based on the hybrid intellectual technology

    Directory of Open Access Journals (Sweden)

    Demidova Liliya

    2018-01-01

    Full Text Available In this paper a data classification technique based on the consecutive application of the SVM and Parzen classifiers is suggested. The Parzen classifier is applied to data that may be either correctly or erroneously classified by the SVM classifier and that are located in experimentally defined subareas near the hyperplane separating the classes. Here, the SVM classifier is used with its default parameter values, and the optimal parameter values of the Parzen classifier are determined using a genetic algorithm. Experimental results confirming the effectiveness of the proposed hybrid intellectual data classification technology are presented.
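
    The two-stage idea in this abstract (keep the SVM decision far from the hyperplane, defer to a Parzen classifier close to it) can be sketched as below. This is a simplified illustration, not the authors' method: the SVM decision value is assumed to come from any previously trained SVM, the band width `band` and kernel width `h` are hypothetical parameters that the paper would instead tune experimentally and with a genetic algorithm, and the Parzen stage uses an unnormalised Gaussian kernel sum.

```python
import math


def parzen_predict(x, train_points, train_labels, h=1.0):
    """Pick the class with the largest summed Gaussian kernel response at x
    (an unnormalised Parzen-window estimate)."""
    scores = {}
    for point, label in zip(train_points, train_labels):
        dist2 = sum((a - b) ** 2 for a, b in zip(x, point))
        scores[label] = scores.get(label, 0.0) + math.exp(-dist2 / (2 * h * h))
    return max(scores, key=scores.get)


def hybrid_predict(x, decision_value, train_points, train_labels,
                   band=0.5, h=1.0):
    """Use the sign of the SVM decision value outside the band around the
    hyperplane; inside the band, defer to the Parzen classifier."""
    if abs(decision_value) >= band:
        return 1 if decision_value > 0 else -1
    return parzen_predict(x, train_points, train_labels, h)


# Hypothetical training data with labels -1 and +1.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
labels = [-1, -1, 1, 1]

# A point near the hyperplane: the SVM value is weakly positive, but the
# Parzen stage assigns it to the nearby -1 cluster.
near = hybrid_predict((0.2, 0.3), 0.1, points, labels)
# A point far from the hyperplane keeps the SVM decision.
far = hybrid_predict((4.0, 4.0), 2.0, points, labels)
print(near, far)
```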

  17. Woven fabric defects detection based on texture classification algorithm

    International Nuclear Information System (INIS)

    Ben Salem, Y.; Nasri, S.

    2011-01-01

    In this paper we compare two well-known texture classification methods for recognizing and classifying defects that occur in textile manufacturing: the local binary patterns (LBP) method and the co-occurrence matrix. The classifier used is the support vector machine (SVM). The system has been tested using the TILDA database. The results show that LBP is a good method for defect recognition and classification, and that its running time is suitable for real-time applications.
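
    For readers unfamiliar with local binary patterns, the basic 3x3 operator can be sketched as follows: each of the eight neighbours is compared with the centre pixel, the comparison results form an 8-bit code, and the histogram of codes over the image serves as the texture feature. The bit ordering, the `>=` comparison, and the function names are assumptions of this sketch (conventions vary), and it is not the specific LBP variant used in the paper.

```python
def lbp_code(img, r, c):
    """8-bit local binary pattern code of pixel (r, c) in a 2D list."""
    center = img[r][c]
    # Neighbours visited clockwise, starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist


# Hypothetical 3x3 patch: only the top-left neighbour is >= the centre.
patch = [[9, 1, 1],
         [1, 5, 1],
         [1, 1, 1]]
print(lbp_code(patch, 1, 1))
```

    The resulting histogram (or a concatenation of histograms over image blocks) would then be passed to a classifier such as an SVM.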

  18. Classification of Gait Types Based on the Duty-factor

    DEFF Research Database (Denmark)

    Fihl, Preben; Moeslund, Thomas B.

    2007-01-01

    on the speed of the human, the camera setup etc. and hence a robust descriptor for gait classification. The duty-factor is basically a matter of measuring the ground support of the feet with respect to the stride. We estimate this by comparing the incoming silhouettes to a database of silhouettes with known...... ground support. Silhouettes are extracted using the Codebook method and represented using Shape Contexts. The matching with database silhouettes is done using the Hungarian method. While manually estimated duty-factors show a clear classification the presented system contains misclassifications due...

  19. SVM-based Partial Discharge Pattern Classification for GIS

    Science.gov (United States)

    Ling, Yin; Bai, Demeng; Wang, Menglin; Gong, Xiaojin; Gu, Chao

    2018-01-01

    Partial discharges (PD) occur when there are localized dielectric breakdowns in small regions of gas insulated substations (GIS). It is highly important to recognize PD patterns, through which the defects caused by different sources can be diagnosed so that predictive maintenance can be conducted to prevent unplanned power outages. In this paper, we propose an approach to partial discharge pattern classification. It first recovers the PRPD matrices from the PRPD2D images; statistical features are then extracted from each recovered PRPD matrix and fed into an SVM for classification. Experiments conducted on a dataset containing thousands of images demonstrate the high effectiveness of the method.

  20. Ensemble Classification of Data Streams Based on Attribute Reduction and a Sliding Window

    Directory of Open Access Journals (Sweden)

    Yingchun Chen

    2018-04-01

    Full Text Available With the current increasing volume and dimensionality of data, traditional data classification algorithms are unable to satisfy the demands of practical classification applications of data streams. To deal with noise and concept drift in data streams, we propose an ensemble classification algorithm based on attribute reduction and a sliding window in this paper. Using mutual information, an approximate attribute reduction algorithm based on rough sets is used to reduce data dimensionality and increase the diversity of reduced results in the algorithm. A double-threshold concept drift detection method and a three-stage sliding window control strategy are introduced to improve the performance of the algorithm when dealing with both noise and concept drift. The classification precision is further improved by updating the base classifiers and their nonlinear weights. Experiments on synthetic datasets and actual datasets demonstrate the performance of the algorithm in terms of classification precision, memory use, and time efficiency.
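
    The abstract relies on mutual information to guide attribute reduction. As a rough illustration of that ingredient only, the sketch below estimates the mutual information between one discrete attribute and the class label from co-occurrence counts; an attribute with higher mutual information tells more about the class. The function name and toy data are hypothetical, and the rough-set reduction, drift detection and sliding-window control described in the paper are not reproduced here.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y) in bits for two equal-length
    sequences of discrete values."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) )
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Hypothetical window of four labelled examples.
labels = [0, 0, 1, 1]
informative = [0, 0, 1, 1]    # determines the label completely
uninformative = [0, 1, 0, 1]  # independent of the label
print(mutual_information(informative, labels))
print(mutual_information(uninformative, labels))
```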

  1. An application-based classification to understand buyer-seller interaction in business services

    NARCIS (Netherlands)

    Valk, van der W.; Wynstra, J.Y.F.; Axelsson, B.

    2006-01-01

    Abstract: Purpose – Most existing classifications of business services have taken the perspective of the supplier as opposed to that of the buyer. To address this imbalance, the purpose of this paper is to propose a classification of business services based on how the buying company applies the

  2. Initial steps towards an evidence-based classification system for golfers with a physical impairment

    NARCIS (Netherlands)

    Stoter, Inge K.; Hettinga, Florentina J.; Altmann, Viola; Eisma, Wim; Arendzen, Hans; Bennett, Tony; van der Woude, Lucas H.; Dekker, Rienk

    2017-01-01

    Purpose: The present narrative review aims to make a first step towards an evidence-based classification system in handigolf following the International Paralympic Committee (IPC). It intends to create a conceptual framework of classification for handigolf and an agenda for future research. Method:

  3. Vision-Based Perception and Classification of Mosquitoes Using Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Masataka Fuchida

    2017-01-01

    Full Text Available The need for a novel automated mosquito perception and classification method has become increasingly pressing in recent years, with the steeply increasing number of mosquito-borne diseases and associated casualties. There exist remote sensing and GIS-based methods for mapping potential mosquito habitats and locations that are prone to mosquito-borne diseases, but these methods generally do not account for species-wise identification of mosquitoes in closed-perimeter regions. Traditional methods for mosquito classification involve highly manual processes requiring tedious sample collection and supervised laboratory analysis. In this research work, we present the design and experimental validation of an automated vision-based mosquito classification module that can be deployed in closed-perimeter mosquito habitats. The module is capable of distinguishing mosquitoes from other bugs such as bees and flies by extracting morphological features, followed by support vector machine-based classification. In addition, this paper presents the results of three variants of the support vector machine classifier in the context of the mosquito classification problem. This vision-based approach to the mosquito classification problem presents an efficient alternative to the conventional methods for mosquito surveillance, mapping and sample image collection. Experimental results involving classification between mosquitoes and a predefined set of other bugs using multiple classification strategies demonstrate the efficacy and validity of the proposed approach, with a maximum recall of 98%.

  4. Clinical application of a microfluidic chip for immunocapture and quantification of circulating exosomes to assist breast cancer diagnosis and molecular classification.

    Science.gov (United States)

    Fang, Shimeng; Tian, Hongzhu; Li, Xiancheng; Jin, Dong; Li, Xiaojie; Kong, Jing; Yang, Chun; Yang, Xuesong; Lu, Yao; Luo, Yong; Lin, Bingcheng; Niu, Weidong; Liu, Tingjiao

    2017-01-01

    Exosomes have attracted increasing attention in blood-based diagnosis because cancer cells release more exosomes into serum than normal cells, and these exosomes overexpress a number of cancer-related biomarkers. However, capture and biomarker analysis of exosomes for clinical application are technically challenging. In this study, we developed a microfluidic chip for immunocapture and quantification of circulating exosomes from a small sample volume and applied this device in a clinical study. Circulating EpCAM-positive exosomes were measured in 6 breast cancer patients and 3 healthy controls to assist diagnosis. A significant increase in the EpCAM-positive exosome level was detected in these patients compared to healthy controls. Furthermore, we quantified circulating HER2-positive exosomes in 19 breast cancer patients for molecular classification. We demonstrated that the exosomal HER2 expression levels were largely consistent with those in tumor tissues assessed by immunohistochemical staining. The microfluidic chip might provide a new platform to assist breast cancer diagnosis and molecular classification.

  5. Exploring high dimensional data with Butterfly: a novel classification algorithm based on discrete dynamical systems.

    Science.gov (United States)

    Geraci, Joseph; Dharsee, Moyez; Nuin, Paulo; Haslehurst, Alexandria; Koti, Madhuri; Feilotter, Harriet E; Evans, Ken

    2014-03-01

    We introduce a novel method for visualizing high dimensional data via a discrete dynamical system. This method provides a 2D representation of the relationship between subjects according to a set of variables without geometric projections, transformed axes or principal components. The algorithm exploits a memory-type mechanism inherent in a certain class of discrete dynamical systems collectively referred to as the chaos game that are closely related to iterative function systems. The goal of the algorithm was to create a human readable representation of high dimensional patient data that was capable of detecting unrevealed subclusters of patients from within anticipated classifications. This provides a mechanism to further pursue a more personalized exploration of pathology when used with medical data. For clustering and classification protocols, the dynamical system portion of the algorithm is designed to come after some feature selection filter and before some model evaluation (e.g. clustering accuracy) protocol. In the version given here, a univariate features selection step is performed (in practice more complex feature selection methods are used), a discrete dynamical system is driven by this reduced set of variables (which results in a set of 2D cluster models), these models are evaluated for their accuracy (according to a user-defined binary classification) and finally a visual representation of the top classification models are returned. Thus, in addition to the visualization component, this methodology can be used for both supervised and unsupervised machine learning as the top performing models are returned in the protocol we describe here. Butterfly, the algorithm we introduce and provide working code for, uses a discrete dynamical system to classify high dimensional data and provide a 2D representation of the relationship between subjects. We report results on three datasets (two in the article; one in the appendix) including a public lung cancer

  6. Multi-label literature classification based on the Gene Ontology graph

    Directory of Open Access Journals (Sweden)

    Lu Xinghua

    2008-12-01

    Full Text Available Abstract Background: The Gene Ontology is a controlled vocabulary for representing knowledge related to genes and proteins in a computable form. The current effort of manually annotating proteins with the Gene Ontology is outpaced by the rate of accumulation of biomedical knowledge in literature, which urges the development of text mining approaches to facilitate the process by automatically extracting the Gene Ontology annotation from literature. The task is usually cast as a text classification problem, and contemporary methods are confronted with unbalanced training data and the difficulties associated with multi-label classification. Results: In this research, we investigated the methods of enhancing automatic multi-label classification of biomedical literature by utilizing the structure of the Gene Ontology graph. We have studied three graph-based multi-label classification algorithms, including a novel stochastic algorithm and two top-down hierarchical classification methods for multi-label literature classification. We systematically evaluated and compared these graph-based classification algorithms to a conventional flat multi-label algorithm. The results indicate that, through utilizing the information from the structure of the Gene Ontology graph, the graph-based multi-label classification methods can significantly improve predictions of the Gene Ontology terms implied by the analyzed text. Furthermore, the graph-based multi-label classifiers are capable of suggesting Gene Ontology annotations (to curators) that are closely related to the true annotations even if they fail to predict the true ones directly. A software package implementing the studied algorithms is available for the research community. Conclusion: Through utilizing the information from the structure of the Gene Ontology graph, the graph-based multi-label classification methods have better potential than the conventional flat multi-label classification approach to facilitate

  7. Analysis and classification of oncology activities on the way to workflow based single source documentation in clinical information systems.

    Science.gov (United States)

    Wagner, Stefan; Beckmann, Matthias W; Wullich, Bernd; Seggewies, Christof; Ries, Markus; Bürkle, Thomas; Prokosch, Hans-Ulrich

    2015-12-22

    Today, cancer documentation is still a tedious task involving many different information systems even within a single institution, and it is rarely supported by appropriate documentation workflows. In a comprehensive 14-step analysis we compiled diagnostic and therapeutic pathways for 13 cancer entities using a mixed approach of document analysis, workflow analysis, expert interviews, workflow modelling and feedback loops. These pathways were classified and categorized stepwise to create a final set of grouped pathways and workflows including electronic documentation forms. A total of 73 workflows for the 13 entities, based on 82 paper documentation forms in addition to computer-based documentation systems, were compiled in a 724-page document comprising 130 figures, 94 tables and 23 tumour classifications as well as 12 follow-up tables. Stepwise classification made it possible to derive grouped diagnostic and therapeutic pathways for three major classes: solid entities with surgical therapy, solid entities with surgical and additional therapeutic activities, and non-solid entities. For these classes it was possible to deduce common documentation workflows to support workflow-guided single-source documentation. Clinical documentation activities within a Comprehensive Cancer Center can likely be realized in a set of three documentation workflows with conditional branching in a modern workflow-supporting clinical information system.

  8. Colour based off-road environment and terrain type classification

    NARCIS (Netherlands)

    Jansen, P.; Mark, W. van der; Heuvel, J.C. van den; Groen, F.C.A.

    2005-01-01

    Terrain classification is an important problem that still remains to be solved for off-road autonomous robot vehicle guidance. Often, obstacle detection systems are used which cannot distinguish between solid obstacles such as rocks or soft obstacles such as tall patches of grass. Terrain

  9. Emotion of Physiological Signals Classification Based on TS Feature Selection

    Institute of Scientific and Technical Information of China (English)

    Wang Yujing; Mo Jianlin

    2015-01-01

    This paper proposes a TS-MLP method for emotion recognition from physiological signals. It recognizes emotion by using Tabu search to select features of the physiological signals and a multilayer perceptron to classify the emotion. Simulation shows that it achieves good emotion classification performance.

  10. A vegetation-based hierarchical classification for seasonally pulsed ...

    African Journals Online (AJOL)

    A classification scheme is presented for seasonal floodplains of the Boro-Xudum distributary of the Okavango Delta, Botswana. This distributary is subject to an annual flood-pulse, the inundated area varying from a mean low of 3 600 km2 to a mean high of 5 400 km2 between 2000 and 2006. A stratified random sample of ...

  11. A Classification System for Hospital-Based Infection Outbreaks

    Directory of Open Access Journals (Sweden)

    Paul S. Ganney

    2010-01-01

    Full Text Available Outbreaks of infection within semi-closed environments such as hospitals, whether inherent in the environment (such as Clostridium difficile (C.Diff) or Methicillin-resistant Staphylococcus aureus (MRSA)) or imported from the wider community (such as Norwalk-like viruses (NLVs)), are difficult to manage. As part of our work on modelling such outbreaks, we have developed a classification system to describe the impact of a particular outbreak upon an organization. This classification system may then be used in comparing appropriate computer models to real outbreaks, as well as in comparing different real outbreaks in, for example, the comparison of differing management and containment techniques and strategies. Data from NLV outbreaks in the Hull and East Yorkshire Hospitals NHS Trust (the Trust) over several previous years are analysed and classified, both for infection within staff (where the end of infection date may not be known) and within patients (where it generally is known). A classification system consisting of seven elements is described, along with a goodness-of-fit method for comparing a new classification to previously known ones, for use in evaluating a simulation against history and thereby determining how ‘realistic’ (or otherwise) it is.

  12. A classification system for hospital-based infection outbreaks.

    Science.gov (United States)

    Ganney, Paul S; Madeo, Maurice; Phillips, Roger

    2010-12-01

    Outbreaks of infection within semi-closed environments such as hospitals, whether inherent in the environment (such as Clostridium difficile (C.Diff) or Methicillin-resistant Staphylococcus aureus (MRSA)) or imported from the wider community (such as Norwalk-like viruses (NLVs)), are difficult to manage. As part of our work on modelling such outbreaks, we have developed a classification system to describe the impact of a particular outbreak upon an organization. This classification system may then be used in comparing appropriate computer models to real outbreaks, as well as in comparing different real outbreaks in, for example, the comparison of differing management and containment techniques and strategies. Data from NLV outbreaks in the Hull and East Yorkshire Hospitals NHS Trust (the Trust) over several previous years are analysed and classified, both for infection within staff (where the end of infection date may not be known) and within patients (where it generally is known). A classification system consisting of seven elements is described, along with a goodness-of-fit method for comparing a new classification to previously known ones, for use in evaluating a simulation against history and thereby determining how 'realistic' (or otherwise) it is.

  13. Proposing a Hybrid Model Based on Robson's Classification for Better Impact on Trends of Cesarean Deliveries.

    Science.gov (United States)

    Hans, Punit; Rohatgi, Renu

    2017-06-01

    To construct a hybrid model classification for cesarean section (CS) deliveries based on woman characteristics (Robson's classification with additional layers of indications for CS), keeping in view the low-resource settings found in India. This is a cross-sectional study conducted at Nalanda Medical College, Patna. All women who delivered in the labor ward from January 2016 to May 2016 were included. The results were compared with the values obtained for India from the secondary analysis of the WHO multi-country survey (2010-2011) by Joshua Vogel and colleagues, published in "The Lancet Global Health." The three classifications (indication-based, Robson's and hybrid model) were applied to categorize the cesarean deliveries from the same sample of data, and a semiqualitative evaluation was done, considering the main characteristics, strengths and weaknesses of each classification system. The total number of women who delivered during the study period was 1462, of which 471 were CS deliveries. The overall CS rate calculated for NMCH in this period was 32.21% (p = 0.001). The hybrid model scored 23/23, and the scores of the Robson classification and the indication-based classification were 21/23 and 10/23, respectively. The single study centre and referral bias are the limitations of the study. Given the flexibility of the classifications, we constructed a hybrid model based on the woman-characteristics system with additional layers of another classification. Indication-based classification answers why, Robson classification answers on whom, while our hybrid model shows both why and on whom cesarean deliveries are being performed.

  14. Applying Topographic Classification, Based on the Hydrological Process, to Design Habitat Linkages for Climate Change

    Directory of Open Access Journals (Sweden)

    Yongwon Mo

    2017-11-01

    Full Text Available The use of biodiversity surrogates has been discussed in the context of designing habitat linkages to support the migration of species affected by climate change. Topography has been proposed as a useful surrogate in the coarse-filter approach, as the hydrological processes caused by topography, such as erosion and accumulation, are the basis of ecological processes. However, studies that have designed topographic linkages as habitat linkages have so far focused mainly on the shape of the topography (morphometric topographic classification), with little emphasis on the hydrological processes (generic topographic classification), to find such topographic linkages. We aimed to understand whether generic classification was valid for designing these linkages. First, we evaluated which topographic classification is more appropriate for describing actual (coniferous and deciduous) and potential (mammals and amphibians) habitat distributions. Second, we analyzed the difference in the linkages between the morphometric and generic topographic classifications. The results showed that the generic classification represented the actual distribution of the trees, but neither the morphometric nor the generic classification could represent the potential animal distributions adequately. Our study demonstrated that the topographic classes, according to the generic classification, were arranged successively according to the flow of water, nutrients, and sediment; therefore, it would be advantageous to secure linkages with a width of 1 km or more. In addition, the edge effect would be smaller than with the morphometric classification. Accordingly, we suggest that topographic characteristics, based on the hydrological process, are required to design topographic linkages for climate change.

  15. Stratification and prognostic relevance of Jass’s molecular classification of colorectal cancer

    OpenAIRE

    Inti Zlobec; Michel P Bihl; Anja Foerster; Alex Rufle; Luigi Terracciano; Alessandro Lugli

    2012-01-01

    Background: The current proposed model of colorectal tumorigenesis is based primarily on CpG island methylator phenotype (CIMP), microsatellite instability (MSI), KRAS, BRAF, and methylation status of O-6-Methylguanine DNA Methyltransferase (MGMT) and classifies tumors into 5 subgroups. The aim of this study is to validate this molecular classification and test its prognostic relevance. Methods: 302 patients were included in this study. Molecular analysis was performed for 5 CIMP-related pro...

  16. A kernel-based multivariate feature selection method for microarray data classification.

    Directory of Open Access Journals (Sweden)

    Shiquan Sun

    Full Text Available High dimensionality and small sample sizes, and their inherent risk of overfitting, pose great challenges for constructing efficient classifiers in microarray data classification. Therefore a feature selection technique should be applied prior to data classification to enhance prediction performance. In general, filter methods can be considered as the principal or auxiliary selection mechanism because of their simplicity, scalability, and low computational complexity. However, a series of trivial examples show that filter methods result in less accurate performance because they ignore the dependencies among features. Although a few publications have sought to reveal the relationships among features by multivariate-based methods, these methods describe those relationships only linearly, and simple linear combinations restrict the improvement in performance. In this paper, we used a kernel method to discover inherent nonlinear correlations among features as well as between features and the target. Moreover, the number of orthogonal components was determined by kernel Fisher's linear discriminant analysis (FLDA) in a self-adaptive manner rather than by manual parameter settings. To demonstrate the effectiveness of our method, we performed several experiments and compared the results of our method with those of other competitive multivariate-based feature selectors. In our comparison, we used two classifiers (support vector machine, k-nearest neighbor) on two groups of datasets, namely two-class and multi-class datasets. Experimental results demonstrate that the performance of our method is better than that of the others, especially on three hard-to-classify datasets, namely Wang's Breast Cancer, Gordon's Lung Adenocarcinoma and Pomeroy's Medulloblastoma.

  17. Hydrologic-Process-Based Soil Texture Classifications for Improved Visualization of Landscape Function

    Science.gov (United States)

    Groenendyk, Derek G.; Ferré, Ty P.A.; Thorp, Kelly R.; Rice, Amy K.

    2015-01-01

    Soils lie at the interface between the atmosphere and the subsurface and are a key component that control ecosystem services, food production, and many other processes at the Earth’s surface. There is a long-established convention for identifying and mapping soils by texture. These readily available, georeferenced soil maps and databases are used widely in environmental sciences. Here, we show that these traditional soil classifications can be inappropriate, contributing to bias and uncertainty in applications from slope stability to water resource management. We suggest a new approach to soil classification, with a detailed example from the science of hydrology. Hydrologic simulations based on common meteorological conditions were performed using HYDRUS-1D, spanning textures identified by the United States Department of Agriculture soil texture triangle. We consider these common conditions to be: drainage from saturation, infiltration onto a drained soil, and combined infiltration and drainage events. Using a k-means clustering algorithm, we created soil classifications based on the modeled hydrologic responses of these soils. The hydrologic-process-based classifications were compared to those based on soil texture and a single hydraulic property, Ks. Differences in classifications based on hydrologic response versus soil texture demonstrate that traditional soil texture classification is a poor predictor of hydrologic response. We then developed a QGIS plugin to construct soil maps combining a classification with georeferenced soil data from the Natural Resource Conservation Service. The spatial patterns of hydrologic response were more immediately informative, much simpler, and less ambiguous, for use in applications ranging from trafficability to irrigation management to flood control. The ease with which hydrologic-process-based classifications can be made, along with the improved quantitative predictions of soil responses and visualization of landscape

  18. Cell-based therapy technology classifications and translational challenges

    Science.gov (United States)

    Mount, Natalie M.; Ward, Stephen J.; Kefalas, Panos; Hyllner, Johan

    2015-01-01

    Cell therapies offer the promise of treating and altering the course of diseases which cannot be addressed adequately by existing pharmaceuticals. Cell therapies are a diverse group across cell types and therapeutic indications and have been an active area of research for many years but are now strongly emerging through translation and towards successful commercial development and patient access. In this article, we present a description of a classification of cell therapies on the basis of their underlying technologies rather than the more commonly used classification by cell type because the regulatory path and manufacturing solutions are often similar within a technology area due to the nature of the methods used. We analyse the progress of new cell therapies towards clinical translation, examine how they are addressing the clinical, regulatory, manufacturing and reimbursement requirements, describe some of the remaining challenges and provide perspectives on how the field may progress for the future. PMID:26416686

  19. Support vector machine classification and validation of cancer tissue samples using microarray expression data.

    Science.gov (United States)

    Furey, T S; Cristianini, N; Duffy, N; Bednarski, D W; Schummer, M; Haussler, D

    2000-10-01

    DNA microarray experiments, generating thousands of gene expression measurements, are being used to gather information from tissue and cell samples regarding gene expression differences that will be useful in diagnosing disease. We have developed a new method to analyse this kind of data using support vector machines (SVMs). This analysis consists of both classification of the tissue samples, and an exploration of the data for mis-labeled or questionable tissue results. We demonstrate the method in detail on samples consisting of ovarian cancer tissues, normal ovarian tissues, and other normal tissues. The dataset consists of expression experiment results for 97,802 cDNAs for each tissue. As a result of computational analysis, a tissue sample is discovered and confirmed to be wrongly labeled. Upon correction of this mistake and the removal of an outlier, perfect classification of tissues is achieved, but not with high confidence. We identify and analyse a subset of genes from the ovarian dataset whose expression is highly differentiated between the types of tissues. To show robustness of the SVM method, two previously published datasets from other types of tissues or cells are analysed. The results are comparable to those previously obtained. We show that other machine learning methods also perform comparably to the SVM on many of those datasets. The SVM software is available at http://www.cs.columbia.edu/~bgrundy/svm.

  20. Support Vector Machine Based Tool for Plant Species Taxonomic Classification

    OpenAIRE

    Manimekalai .K; Vijaya.MS

    2014-01-01

    Plant species are living things and are generally categorized in terms of Domain, Kingdom, Phylum, Class, Order, Family, Genus and name of Species in a hierarchical fashion. This paper formulates the taxonomic leaf categorization problem as the hierarchical classification task and provides a suitable solution using a supervised learning technique namely support vector machine. Features are extracted from scanned images of plant leaves and trained using SVM. Only class, order, family of plants...

  1. Image Analysis and Classification Based on Soil Strength

    Science.gov (United States)

    2016-08-01

    Impact Hammer, which is light, easy to operate, and cost effective. The Clegg Impact Hammer measures stiffness of the soil surface by dropping a... effect on out-of-scene classifications. More statistical analysis should, however, be done to compare the measured field spectra, the WV2 training...

  2. Three-Class Mammogram Classification Based on Descriptive CNN Features

    Directory of Open Access Journals (Sweden)

    M. Mohsin Jadoon

    2017-01-01

    Full Text Available In this paper, a novel classification technique for large data sets of mammograms using a deep learning method is proposed. The proposed model targets a three-class classification study (normal, malignant, and benign cases). In our model we have presented two methods, namely, convolutional neural network-discrete wavelet (CNN-DW) and convolutional neural network-curvelet transform (CNN-CT). An augmented data set is generated by using mammogram patches. To enhance the contrast of mammogram images, the data set is filtered by contrast limited adaptive histogram equalization (CLAHE). In the CNN-DW method, enhanced mammogram images are decomposed into four subbands by means of the two-dimensional discrete wavelet transform (2D-DWT), while in the second method the discrete curvelet transform (DCT) is used. In both methods, dense scale invariant features (DSIFT) are extracted for all subbands. An input data matrix containing these subband features of all the mammogram patches is created and processed as input to a convolutional neural network (CNN). A softmax layer and a support vector machine (SVM) layer are used to train the CNN for classification. The proposed methods have been compared with existing methods in terms of accuracy rate, error rate, and various validation assessment measures. CNN-DW and CNN-CT have achieved accuracy rates of 81.83% and 83.74%, respectively. Simulation results clearly validate the significance and impact of our proposed model as compared to other well-known existing techniques.
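
    The four-subband decomposition mentioned in the abstract above can be illustrated with a one-level 2D Haar transform. This is a minimal sketch only: the abstract does not say which wavelet the authors used, so the Haar filter, the function name haar_dwt2 and the subband keys are all assumptions made for illustration.

```python
def haar_dwt2(image):
    """One-level 2D Haar transform of a 2D list with even height and width.

    Hypothetical helper: the Haar wavelet is an assumption, not the
    wavelet named in the abstract. Returns four half-size subbands.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must both be even")
    bands = {"approx": [], "col_diff": [], "row_diff": [], "diag": []}
    for r in range(0, rows, 2):
        out = {name: [] for name in bands}
        for c in range(0, cols, 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            # Orthonormal Haar: each coefficient is a signed sum divided by 2.
            out["approx"].append((a + b + d + e) / 2)    # local average
            out["col_diff"].append((a - b + d - e) / 2)  # change across columns
            out["row_diff"].append((a + b - d - e) / 2)  # change across rows
            out["diag"].append((a - b - d + e) / 2)      # diagonal change
        for name in bands:
            bands[name].append(out[name])
    return bands
```

    Because this version of the transform is orthonormal, the sum of squared coefficients over the four subbands equals the sum of squared pixel values, which is a quick sanity check on any implementation.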

  3. Interrater reliability of a Pilates movement-based classification system.

    Science.gov (United States)

    Yu, Kwan Kenny; Tulloch, Evelyn; Hendrick, Paul

    2015-01-01

    To determine the interrater reliability for identification of a specific movement pattern using a Pilates Classification system. Videos of 5 subjects performing specific movement tasks were sent to raters trained in the DMA-CP classification system. Ninety-six raters completed the survey. Interrater reliability for the detection of a directional bias was excellent (Pi = 0.92, and K(free) = 0.89). Interrater reliability for classifying an individual into a specific subgroup was moderate (Pi = 0.64, K(free) = 0.55); however, raters who had completed levels 1-4 of the DMA-CP training and reported using the assessment daily demonstrated excellent reliability (Pi = 0.89 and K(free) = 0.87). For experienced raters, the classification system demonstrated almost perfect agreement in determining the existence of a specific movement pattern and classifying into a subgroup. There was a trend for greater reliability associated with increased levels of training and experience of the raters. Copyright © 2014 Elsevier Ltd. All rights reserved.
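
    The two agreement statistics reported above can be sketched as follows, assuming that "Pi" denotes the fixed-marginal multirater statistic (Fleiss-style) and "K(free)" the free-marginal multirater kappa; the abstract does not spell out either formula, and the function names and input layout below are illustrative choices.

```python
def observed_agreement(counts):
    """counts[i][j] = number of raters who placed subject i in category j.

    Every subject is assumed to be rated by the same number of raters (>= 2).
    """
    n_raters = sum(counts[0])
    pairs = n_raters * (n_raters - 1)
    per_subject = [sum(c * (c - 1) for c in row) / pairs for row in counts]
    return sum(per_subject) / len(counts)


def free_marginal_kappa(counts):
    """Chance agreement is 1/k: raters are free to use any category."""
    k = len(counts[0])
    p_o = observed_agreement(counts)
    p_e = 1.0 / k
    return (p_o - p_e) / (1.0 - p_e)


def fixed_marginal_pi(counts):
    """Chance agreement is estimated from the pooled category proportions."""
    n_ratings = len(counts) * sum(counts[0])
    k = len(counts[0])
    shares = [sum(row[j] for row in counts) / n_ratings for j in range(k)]
    p_o = observed_agreement(counts)
    p_e = sum(s * s for s in shares)
    return (p_o - p_e) / (1.0 - p_e)
```

    The two statistics differ only in the chance-agreement term, which is why they can diverge when raters use the categories unevenly.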

  4. SB certification handout material requirements, test methods, responsibilities, and minimum classification levels for mixture-based specification for flexible base.

    Science.gov (United States)

    2012-10-01

    A handout with tables representing the material requirements, test methods, responsibilities, and minimum classification levels for the mixture-based specification for flexible base, and details on aggregate and test methods employed, along with agency and co...

  5. An Approach for Leukemia Classification Based on Cooperative Game Theory

    OpenAIRE

    Torkaman, Atefeh; Charkari, Nasrollah Moghaddam; Aghaeipour, Mahnaz

    2011-01-01

    Hematological malignancies are the types of cancer that affect blood, bone marrow and lymph nodes. As these tissues are naturally connected through the immune system, a disease affecting one of them will often affect the others as well. The hematological malignancies include leukemia, lymphoma and multiple myeloma. Among them, leukemia is a serious malignancy that starts in blood tissues, especially the bone marrow, where the blood is made. Research shows that leukemia is one of the common cancers ...

  6. Prognostic classification with laboratory parameters or imaging techniques in small-cell lung cancer

    NARCIS (Netherlands)

    de Jong, Wouter K.; Fidler, Vaclav; Groen, Harry J. M.

    PURPOSE: Our aim in this study was to compare prognostic models based on laboratory tests with a model including imaging information in small-cell lung cancer. PATIENTS AND METHODS: A retrospective analysis was performed on 156 consecutive patients. Three existing models based on laboratory tests

  7. Graph-based semi-supervised learning with genomic data integration using condition-responsive genes applied to phenotype classification.

    Science.gov (United States)

    Doostparast Torshizi, Abolfazl; Petzold, Linda R

    2018-01-01

    Data integration methods that combine data from different molecular levels such as genome, epigenome, transcriptome, etc., have received a great deal of interest in the past few years. It has been demonstrated that the synergistic effects of different biological data types can boost learning capabilities and lead to a better understanding of the underlying interactions among molecular levels. In this paper we present a graph-based semi-supervised classification algorithm that incorporates latent biological knowledge in the form of biological pathways with gene expression and DNA methylation data. The process of graph construction from biological pathways is based on detecting condition-responsive genes, where 3 sets of genes are finally extracted: all condition responsive genes, high-frequency condition-responsive genes, and P-value-filtered genes. The proposed approach is applied to ovarian cancer data downloaded from the Human Genome Atlas. Extensive numerical experiments demonstrate superior performance of the proposed approach compared to other state-of-the-art algorithms, including the latest graph-based classification techniques. Simulation results demonstrate that integrating various data types enhances classification performance and leads to a better understanding of interrelations between diverse omics data types. The proposed approach outperforms many of the state-of-the-art data integration algorithms. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  8. Support Vector Machine and Parametric Wavelet-Based Texture Classification of Stem Cell Images

    National Research Council Canada - National Science Library

    Jeffreys, Christopher

    2004-01-01

    .... Since colony texture is a major discriminating feature in determining quality, we introduce a non-invasive, semi-automated texture-based stem cell colony classification methodology to aid researchers...

  9. Single-labelled music genre classification using content-based features

    CSIR Research Space (South Africa)

    Ajoodha, R

    2015-11-01

    Full Text Available In this paper we use content-based features to perform automatic classification of music pieces into genres. We categorise these features into four groups: features extracted from the Fourier transform’s magnitude spectrum, features designed...

  10. Desert plains classification based on Geomorphometrical parameters (Case study: Aghda, Yazd)

    Science.gov (United States)

    Tazeh, mahdi; Kalantari, Saeideh

    2013-04-01

    This research focuses on plains. Several methods and classifications have been presented for plain classification. One natural-resource-based classification, widely used in Iran, divides plains into three types: Erosional Pediment, Denudation Pediment, and Aggradational Piedmont. Qualitative and quantitative factors are used to differentiate them from one another. In this study, geomorphometrical parameters effective in differentiating landforms were applied to plains. Geomorphometrical parameters are calculable and can be extracted from a digital elevation model using mathematical equations and the corresponding relations. The geomorphometrical parameters used in this study were Percent of Slope, Plan Curvature, Profile Curvature, Minimum Curvature, Maximum Curvature, Cross-sectional Curvature, Longitudinal Curvature and Gaussian Curvature. The results indicated that the most important geomorphometrical parameters for plain and desert classification are Percent of Slope, Minimum Curvature, Profile Curvature, and Longitudinal Curvature. Key Words: Plain, Geomorphometry, Classification, Biophysical, Yazd Khezarabad.

  11. [Classification of cell-based medicinal products and legal implications: An overview and an update].

    Science.gov (United States)

    Scherer, Jürgen; Flory, Egbert

    2015-11-01

    In general, cell-based medicinal products do not represent a uniform class of medicinal products, but instead comprise medicinal products with diverse regulatory classification as advanced-therapy medicinal products (ATMP), medicinal products (MP), tissue preparations, or blood products. Due to the legal and scientific consequences of the development and approval of MPs, classification should be clarified as early as possible. This paper describes the legal situation in Germany and highlights specific criteria and concepts for classification, with a focus on, but not limited to, ATMPs and non-ATMPs. Depending on the stage of product development and the specific application submitted to a competent authority, legally binding classification is done by the German Länder Authorities, Paul-Ehrlich-Institut, or European Medicines Agency. On request by the applicants, the Committee for Advanced Therapies may issue scientific recommendations for classification.

  12. Polsar Land Cover Classification Based on Hidden Polarimetric Features in Rotation Domain and Svm Classifier

    Science.gov (United States)

    Tao, C.-S.; Chen, S.-W.; Li, Y.-Z.; Xiao, S.-P.

    2017-09-01

    Land cover classification is an important application for polarimetric synthetic aperture radar (PolSAR) data utilization. Roll-invariant polarimetric features such as H / Ani / ᾱ / Span are commonly adopted in PolSAR land cover classification. However, the target orientation diversity effect makes PolSAR image understanding and interpretation difficult. Using only the roll-invariant polarimetric features may introduce ambiguity in the interpretation of targets' scattering mechanisms and limit the subsequent classification accuracy. To address this problem, this work firstly focuses on hidden polarimetric feature mining in the rotation domain along the radar line of sight using the recently reported uniform polarimetric matrix rotation theory and the visualization and characterization tool of polarimetric coherence pattern. The former rotates the acquired polarimetric matrix along the radar line of sight and fully describes the rotation characteristics of each entry of the matrix. Sets of new polarimetric features are derived to describe the hidden scattering information of the target in the rotation domain. The latter extends the traditional polarimetric coherence at a given rotation angle to the rotation domain for complete interpretation. A visualization and characterization tool is established to derive new polarimetric features for hidden information exploration. Then, a classification scheme is developed combining both the selected new hidden polarimetric features in the rotation domain and the commonly used roll-invariant polarimetric features with a support vector machine (SVM) classifier. Comparison experiments based on AIRSAR and multi-temporal UAVSAR data demonstrate that, compared with the conventional classification scheme which only uses the roll-invariant polarimetric features, the proposed classification scheme achieves both higher classification accuracy and better robustness. For AIRSAR data, the overall classification

  13. POLSAR LAND COVER CLASSIFICATION BASED ON HIDDEN POLARIMETRIC FEATURES IN ROTATION DOMAIN AND SVM CLASSIFIER

    Directory of Open Access Journals (Sweden)

    C.-S. Tao

    2017-09-01

    Full Text Available Land cover classification is an important application for polarimetric synthetic aperture radar (PolSAR) data utilization. Roll-invariant polarimetric features such as H / Ani / α / Span are commonly adopted in PolSAR land cover classification. However, the target orientation diversity effect makes PolSAR image understanding and interpretation difficult. Using only the roll-invariant polarimetric features may introduce ambiguity in the interpretation of targets’ scattering mechanisms and limit the subsequent classification accuracy. To address this problem, this work firstly focuses on hidden polarimetric feature mining in the rotation domain along the radar line of sight using the recently reported uniform polarimetric matrix rotation theory and the visualization and characterization tool of polarimetric coherence pattern. The former rotates the acquired polarimetric matrix along the radar line of sight and fully describes the rotation characteristics of each entry of the matrix. Sets of new polarimetric features are derived to describe the hidden scattering information of the target in the rotation domain. The latter extends the traditional polarimetric coherence at a given rotation angle to the rotation domain for complete interpretation. A visualization and characterization tool is established to derive new polarimetric features for hidden information exploration. Then, a classification scheme is developed combining both the selected new hidden polarimetric features in the rotation domain and the commonly used roll-invariant polarimetric features with a support vector machine (SVM) classifier. Comparison experiments based on AIRSAR and multi-temporal UAVSAR data demonstrate that, compared with the conventional classification scheme which only uses the roll-invariant polarimetric features, the proposed classification scheme achieves both higher classification accuracy and better robustness. For AIRSAR data, the overall classification accuracy

  14. A classification model of Hyperion image base on SAM combined decision tree

    Science.gov (United States)

    Wang, Zhenghai; Hu, Guangdao; Zhou, YongZhang; Liu, Xin

    2009-10-01

    Monitoring the Earth using imaging spectrometers has necessitated more accurate analyses and new applications of remote sensing. A very high dimensional input space requires an exponentially large amount of data to adequately and reliably represent the classes in that space. On the other hand, as the input dimensionality increases the hypothesis space grows exponentially, which makes the classification performance highly unreliable. Classification of hyperspectral images with traditional classification algorithms is therefore challenging, and new algorithms have to be developed for hyperspectral data classification. The Spectral Angle Mapper (SAM) is a physically-based spectral classification that uses an n-dimensional angle to match pixels to reference spectra. The algorithm determines the spectral similarity between two spectra by calculating the angle between the spectra, treating them as vectors in a space with dimensionality equal to the number of bands. The key difficulty is that the SAM threshold must be defined manually, and the classification precision depends on how reasonable that threshold is. To resolve this problem, this paper proposes a new automatic classification model for remote sensing images using SAM combined with a decision tree. It can automatically choose the appropriate SAM threshold and improve the classification precision of SAM based on the analysis of field spectra. The test area, located in Heqing, Yunnan, was imaged by the EO-1 Hyperion imaging spectrometer using 224 bands in the visible and near infrared. The area included limestone areas, rock fields, soil and forests. The area was classified into four different vegetation and soil types. The results show that this method chooses the appropriate SAM threshold and effectively eliminates the disturbance and influence of unwanted objects, thereby improving the classification precision. Compared with the likelihood classification by field survey data, the classification precision of this model
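
    The spectral-angle computation described in the abstract above (the angle between two spectra treated as vectors with one component per band) can be sketched as below. The function names, the dictionary of reference spectra and the fixed threshold argument are illustrative assumptions; the paper's decision-tree procedure for choosing the threshold automatically is not reproduced here.

```python
import math


def spectral_angle(pixel, reference):
    """Angle in radians between two spectra of equal length.

    Both spectra must be non-zero vectors, otherwise the norm product is 0.
    """
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm_p = math.sqrt(sum(p * p for p in pixel))
    norm_r = math.sqrt(sum(r * r for r in reference))
    cosine = dot / (norm_p * norm_r)
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    return math.acos(max(-1.0, min(1.0, cosine)))


def classify_pixel(pixel, references, threshold):
    """Return the label of the closest reference spectrum, or None if even
    the smallest angle exceeds the (hypothetical, user-supplied) threshold."""
    best_label, best_angle = None, float("inf")
    for label, reference in references.items():
        angle = spectral_angle(pixel, reference)
        if angle < best_angle:
            best_label, best_angle = label, angle
    return best_label if best_angle <= threshold else None
```

    Because the angle ignores vector length, a spectrum and a uniformly brighter or darker copy of it have an angle of zero, which is why the threshold, rather than overall brightness, decides whether a pixel is accepted.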

  15. Performance Evaluation of Frequency Transform Based Block Classification of Compound Image Segmentation Techniques

    Science.gov (United States)

    Selwyn, Ebenezer Juliet; Florinabel, D. Jemi

    2018-04-01

    Compound image segmentation plays a vital role in the compression of computer screen images. Computer screen images are images that mix textual, graphical, and pictorial contents. In this paper, we present a comparison of two transform-based block classification techniques for compound images based on metrics such as speed of classification, precision and recall rate. Block-based classification approaches normally divide the compound images into fixed-size, non-overlapping blocks. Then a frequency transform such as the Discrete Cosine Transform (DCT) or the Discrete Wavelet Transform (DWT) is applied over each block. Mean and standard deviation are computed for each 8 × 8 block and are used as the feature set to classify the compound images into text/graphics and picture/background blocks. The classification accuracy of block-classification-based segmentation techniques is measured by evaluation metrics such as precision and recall rate. Compound images with smooth backgrounds and complex backgrounds, containing text of varying size, colour and orientation, are considered for testing. Experimental evidence shows that DWT-based segmentation improves recall rate and precision rate by approximately 2.3% over DCT-based segmentation, with an increase in block classification time, for both smooth and complex background images.
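
    The per-block features and the evaluation metrics named in the abstract above can be sketched as follows. This is a simplified illustration: the statistics are computed directly on whatever 2D values are passed in (pixels or already-transformed coefficients), the transform step itself is omitted, and the function names and the "text" label are assumptions.

```python
import math


def block_features(values, size=8):
    """Mean and standard deviation of each non-overlapping size x size block.

    values is a 2D list; trailing rows or columns that do not fill a whole
    block are ignored. Blocks are visited row by row, left to right.
    """
    rows, cols = len(values), len(values[0])
    features = []
    for r in range(0, rows - size + 1, size):
        for c in range(0, cols - size + 1, size):
            block = [values[r + i][c + j]
                     for i in range(size) for j in range(size)]
            mean = sum(block) / len(block)
            variance = sum((v - mean) ** 2 for v in block) / len(block)
            features.append((mean, math.sqrt(variance)))
    return features


def precision_recall(predicted, actual, positive="text"):
    """Precision and recall of the positive block label."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == positive and a == positive)
    fp = sum(1 for p, a in pairs if p == positive and a != positive)
    fn = sum(1 for p, a in pairs if p != positive and a == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

    A flat block has a standard deviation of zero while a block with sharp alternating values has a large one, which is the intuition behind using these two statistics to separate smooth background from text-like content.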

  16. An Analysis of Social Class Classification Based on Linguistic Variables

    Institute of Scientific and Technical Information of China (English)

    QU Xia-sha

    2016-01-01

    Since language is an influential tool in social interaction, the relationship between speech and social factors such as social class, gender, and even age is worth studying. People employ different linguistic variables to signal their social class, status and identity in social interaction. Linguistic variation thus involves vocabulary, sounds, grammatical constructions, dialects and so on. As a result, the classification of social class draws people’s attention. Linguistic variables in speech interactions indicate the social relationships between people. This paper attempts to illustrate three main linguistic variables that influence social class, and argues that further sociolinguistic studies need to pay more attention to them.

  17. Extreme Facial Expressions Classification Based on Reality Parameters

    Science.gov (United States)

    Rahim, Mohd Shafry Mohd; Rad, Abdolvahab Ehsani; Rehman, Amjad; Altameem, Ayman

    2014-09-01

    Extreme expressions are a type of emotional expression stimulated by strong emotion. An example of such an extreme expression is satisfaction expressed through tears. To render these features, additional elements such as a fluid mechanism (particle system) and physics techniques such as smoothed-particle hydrodynamics (SPH) are introduced. The fusion of facial animation with SPH exhibits promising results. Accordingly, the proposed fluid technique combined with facial animation is the core of this research, aimed at producing complex expressions such as laughing, smiling, crying (the emergence of tears) or sadness escalating to strong crying, as a classification of the extreme expressions that appear on the human face in some cases.

  18. Conditional Mutual Information Based Feature Selection for Classification Task

    Czech Academy of Sciences Publication Activity Database

    Novovičová, Jana; Somol, Petr; Haindl, Michal; Pudil, Pavel

    2007-01-01

    Roč. 45, č. 4756 (2007), s. 417-426 ISSN 0302-9743 R&D Projects: GA MŠk 1M0572; GA AV ČR IAA2075302 EU Projects: European Commission(XE) 507752 - MUSCLE Grant - others:GA MŠk(CZ) 2C06019 Institutional research plan: CEZ:AV0Z10750506 Keywords : Pattern classification * feature selection * conditional mutual information * text categorization Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.402, year: 2005

  19. Automatic Classification of Normal and Cancer Lung CT Images Using Multiscale AM-FM Features

    Directory of Open Access Journals (Sweden)

    Eman Magdy

    2015-01-01

    Full Text Available Computer-aided diagnostic (CAD) systems provide fast and reliable diagnosis for medical images. In this paper, a CAD system is proposed to analyze and automatically segment the lungs and classify each lung as normal or cancerous. Using a lung CT dataset from 70 different patients, Wiener filtering is first applied to the original CT images as a preprocessing step. Secondly, we combine histogram analysis with thresholding and morphological operations to segment the lung regions and extract each lung separately. Thirdly, the Amplitude-Modulation Frequency-Modulation (AM-FM) method is used to extract features for the ROIs. Then, the significant AM-FM features are selected using Partial Least Squares Regression (PLSR) for the classification step. Finally, K-nearest neighbour (KNN), support vector machine (SVM), naïve Bayes, and linear classifiers are used with the selected AM-FM features. The performance of each classifier in terms of accuracy, sensitivity, and specificity is evaluated. The results indicate that our proposed CAD system succeeded in differentiating between normal and cancerous lungs and achieved 95% accuracy with the linear classifier.
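
    The three performance measures named in the abstract above follow from the counts of a two-class confusion matrix. The sketch below shows the standard definitions; the function name and the label strings are illustrative assumptions, and no data from the paper are used.

```python
def diagnostic_metrics(predicted, actual, positive="cancer"):
    """Accuracy, sensitivity and specificity for a two-class problem.

    Sensitivity is the fraction of truly positive cases that were detected;
    specificity is the fraction of truly negative cases that were cleared.
    """
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == positive and a == positive)
    tn = sum(1 for p, a in pairs if p != positive and a != positive)
    fp = sum(1 for p, a in pairs if p == positive and a != positive)
    fn = sum(1 for p, a in pairs if p != positive and a == positive)
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }
```

    Reporting all three matters because accuracy alone can look high on an unbalanced dataset even when most positive cases are missed.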

  20. Classification of samples into two or more ordered populations with application to a cancer trial.

    Science.gov (United States)

    Conde, D; Fernández, M A; Rueda, C; Salvador, B

    2012-12-10

    In many applications, especially in cancer treatment and diagnosis, investigators are interested in classifying patients into various diagnosis groups on the basis of molecular data such as gene expression or proteomic data. Often, some of the diagnosis groups are known to be related to higher or lower values of some of the predictors. The standard methods of classifying patients into various groups do not take into account the underlying order. This could potentially result in high misclassification rates, especially when the number of groups is larger than two. In this article, we develop classification procedures that exploit the underlying order among the mean values of the predictor variables and the diagnostic groups by using ideas from order-restricted inference. We generalize the existing methodology on discrimination under restrictions and provide empirical evidence to demonstrate that the proposed methodology improves over the existing unrestricted methodology. The proposed methodology is applied to a bladder cancer data set where the researchers are interested in classifying patients into various groups. Copyright © 2012 John Wiley & Sons, Ltd.

  1. A canonical correlation analysis based EMG classification algorithm for eliminating electrode shift effect.

    Science.gov (United States)

    Zhe Fan; Zhong Wang; Guanglin Li; Ruomei Wang

    2016-08-01

    Motion classification systems based on surface electromyography (sEMG) pattern recognition have achieved good results under experimental conditions, but clinical implementation and practical application remain a challenge. Many factors contribute to the difficulty of clinical use of EMG-based dexterous control. The most obvious and important is the noise in the EMG signal caused by electrode shift, muscle fatigue, motion artifact, inherent instability of the signal, and other biological signals such as the electrocardiogram. In this paper, a novel method based on Canonical Correlation Analysis (CCA) was developed to eliminate the reduction in classification accuracy caused by electrode shift. The average classification accuracy of our method was above 95% for the healthy subjects. In the process, we validated the influence of electrode shift on motion classification accuracy and found a strong correlation (correlation coefficient >0.9) between shift position data and normal position data.

  2. Ship Classification with High Resolution TerraSAR-X Imagery Based on Analytic Hierarchy Process

    Directory of Open Access Journals (Sweden)

    Zhi Zhao

    2013-01-01

    Full Text Available Ship surveillance using space-borne synthetic aperture radar (SAR), which takes advantage of high resolution over wide swaths and all-weather working capability, has attracted worldwide attention. Recent activity in this field has concentrated mainly on ship detection, while classification remains largely an open problem. In this paper, we propose a novel ship classification scheme based on the analytic hierarchy process (AHP) in order to achieve better performance. The main idea is to apply AHP to both feature selection and the classification decision. On the one hand, the AHP-based feature selection constructs a selection decision problem based on several feature evaluation measures (e.g., discriminability, stability, and information measure) and provides objective criteria for making comprehensive, quantitative decisions about their combinations. On the other hand, we take the selected feature sets as the input of KNN classifiers and fuse the multiple classification results based on AHP, in which the feature sets’ confidence is taken into account when the AHP-based classification decision is made. We analyze the proposed classification scheme and demonstrate its results on a ship dataset that comes from TerraSAR-X SAR images.
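
As a rough illustration of how AHP turns pairwise judgments into criterion weights, the sketch below uses the common row geometric-mean approximation of the principal eigenvector. The abstract does not say which weighting procedure the authors used, and the comparison values for the three evaluation measures are invented for the example.

```python
from math import prod


def ahp_weights(pairwise):
    """Approximate AHP priority weights by normalised row geometric means."""
    n = len(pairwise)
    geometric_means = [prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geometric_means)
    return [g / total for g in geometric_means]


# Hypothetical pairwise comparison matrix (entry [i][j] = importance of
# criterion i relative to criterion j). Order of criteria:
# discriminability, stability, information measure.
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(pairwise)
print(weights)
```

Because this example matrix is perfectly consistent, the weights come out proportional to 4 : 2 : 1. With inconsistent judgments the geometric-mean weights are only an approximation, and a consistency check would normally accompany them.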

  3. A space-based classification system for RF transients

    International Nuclear Information System (INIS)

    Moore, K.R.; Call, D.; Johnson, S.; Payne, T.; Ford, W.; Spencer, K.; Wilkerson, J.F.; Baumgart, C.

    1993-01-01

    The FORTE (Fast On-Orbit Recording of Transient Events) small satellite is scheduled for launch in mid 1995. The mission is to measure and classify VHF (30-300 MHz) electromagnetic pulses, primarily due to lightning, within a high noise environment dominated by continuous wave carriers such as TV and FM stations. The FORTE Event Classifier will use specialized hardware to implement signal processing and neural network algorithms that perform onboard classification of RF transients and carriers. Lightning events will also be characterized with optical data telemetered to the ground. A primary mission science goal is to develop a comprehensive understanding of the correlation between the optical flash and the VHF emissions from lightning. By combining FORTE measurements with ground measurements and/or active transmitters, other science issues can be addressed. Examples include the correlation of global precipitation rates with lightning flash rates and location, the effects of large scale structures within the ionosphere (such as traveling ionospheric disturbances and horizontal gradients in the total electron content) on the propagation of broad bandwidth RF signals, and various areas of lightning physics. Event classification is a key feature of the FORTE mission. Neural networks are promising candidates for this application. The authors describe the proposed FORTE Event Classifier flight system, which consists of a commercially available digital signal processing board and a custom board, and discuss work on signal processing and neural network algorithms.

  4. Development of a computer aided diagnosis model for prostate cancer classification on multi-parametric MRI

    Science.gov (United States)

    Alfano, R.; Soetemans, D.; Bauman, G. S.; Gibson, E.; Gaed, M.; Moussa, M.; Gomez, J. A.; Chin, J. L.; Pautler, S.; Ward, A. D.

    2018-02-01

    Multi-parametric MRI (mp-MRI) is becoming a standard in contemporary prostate cancer screening and diagnosis, and has been shown to aid physicians in cancer detection. It offers many advantages over traditional systematic biopsy, which has been shown to have very high clinical false-negative rates of up to 23% at all stages of the disease. However beneficial, mp-MRI is relatively complex to interpret and suffers from inter-observer variability in lesion localization and grading. Computer-aided diagnosis (CAD) systems have been developed as a solution, as they have the power to perform deterministic quantitative image analysis. We measured the accuracy of such a system validated using accurately co-registered whole-mount digitized histology. We trained a logistic linear classifier (LOGLC), support vector machine (SVC), k-nearest neighbour (KNN) and random forest classifier (RFC) in a four-part ROI-based experiment against: 1) cancer vs. non-cancer, 2) high-grade (Gleason score ≥4+3) vs. low-grade cancer (Gleason score work will form the basis for a tool that enhances the radiologist's ability to detect malignancies, potentially improving biopsy guidance, treatment selection, and focal therapy for prostate cancer patients, maximizing the potential for cure and increasing quality of life.

  5. Classification of high resolution imagery based on fusion of multiscale texture features

    International Nuclear Information System (INIS)

    Liu, Jinxiu; Liu, Huiping; Lv, Ying; Xue, Xiaojuan

    2014-01-01

    In high resolution data classification, combining texture features with spectral bands can effectively improve classification accuracy. However, the window size, which is difficult to choose, is an important factor influencing overall accuracy in textural classification, and current approaches to image texture analysis depend on a single moving window, which ignores the different scale features of various land cover types. In this paper, we propose a new method based on the fusion of multiscale texture features to overcome these problems. The main steps of the new method are the classification of fixed-window-size spectral/textural images from 3×3 to 15×15 and the comparison of all the posterior probability values for every pixel; the largest probability value is assigned to the pixel, which is then automatically assigned to the corresponding land cover type. The proposed approach is tested on University of Pavia ROSIS data. The results indicate that the new method improves classification accuracy compared with methods based on fixed-window-size textural classification.
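
The fusion rule described above (compare the posterior probabilities from every window size and keep the largest) can be sketched for a single pixel as follows. This is an illustrative reading of the abstract, not the authors' implementation; the function name `fuse_pixel`, the class names, and the probability values are hypothetical.

```python
def fuse_pixel(posteriors_by_window):
    """Pick the class with the largest posterior across all window sizes.

    posteriors_by_window maps a window size (e.g. 3 for 3x3) to a dict
    of {class label: posterior probability} produced by the classifier
    trained on texture features of that window size.
    Returns (label, probability, window size).
    """
    best = None
    for window, posteriors in posteriors_by_window.items():
        for label, probability in posteriors.items():
            if best is None or probability > best[1]:
                best = (label, probability, window)
    return best


# Hypothetical posteriors for one pixel at three window sizes.
pixel = {
    3: {"meadow": 0.55, "asphalt": 0.45},
    9: {"meadow": 0.30, "asphalt": 0.70},
    15: {"meadow": 0.82, "asphalt": 0.18},
}
label, probability, window = fuse_pixel(pixel)
print(label, probability, window)
```

In this made-up case the 15×15 window gives the most confident prediction, so the pixel is labelled "meadow" even though the 9×9 classifier favoured "asphalt".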

  6. Graph-Based Semi-Supervised Hyperspectral Image Classification Using Spatial Information

    Science.gov (United States)

    Jamshidpour, N.; Homayouni, S.; Safari, A.

    2017-09-01

    Hyperspectral image classification has been one of the most popular research areas in the remote sensing community in the past decades. However, there are still some problems that need specific attention. For example, the lack of enough labeled samples and the high dimensionality problem are two of the most important issues, and both degrade the performance of supervised classification dramatically. The main idea of semi-supervised learning (SSL) is to overcome these issues by exploiting unlabeled samples, which are available in enormous amounts. In this paper, we propose a graph-based semi-supervised classification method that uses both spectral and spatial information for hyperspectral image classification. More specifically, two graphs were designed and constructed in order to exploit the relationships among pixels in the spectral and spatial spaces, respectively. Then, the Laplacians of both graphs were merged to form a weighted joint graph. The experiments were carried out on two different benchmark hyperspectral data sets. The proposed method performed significantly better than well-known supervised classification methods, such as SVM. The assessments consisted of both accuracy and homogeneity analyses of the produced classification maps. The proposed spectral-spatial SSL method considerably increased the classification accuracy when the labeled training data set was very scarce. When there were only five labeled samples for each class, the performance improved by 5.92% and 10.76% compared to spatial graph-based SSL, for the AVIRIS Indian Pine and Pavia University data sets, respectively.
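
The step of merging the two graph Laplacians into a weighted joint graph can be sketched as a convex combination. The abstract does not give the exact form, so the unnormalized Laplacian L = D - W, the mixing weight `mu`, and the toy affinity matrices below are all assumptions made for illustration.

```python
def laplacian(weights):
    """Unnormalized graph Laplacian L = D - W of a symmetric weight matrix."""
    n = len(weights)
    return [
        [(sum(weights[i]) if i == j else 0.0) - weights[i][j] for j in range(n)]
        for i in range(n)
    ]


def joint_laplacian(w_spectral, w_spatial, mu=0.5):
    """Assumed merge rule: mu * L_spectral + (1 - mu) * L_spatial."""
    l_spec, l_spat = laplacian(w_spectral), laplacian(w_spatial)
    n = len(l_spec)
    return [
        [mu * l_spec[i][j] + (1.0 - mu) * l_spat[i][j] for j in range(n)]
        for i in range(n)
    ]


# Toy example with three pixels (hypothetical affinities).
w_spectral = [[0.0, 0.8, 0.1], [0.8, 0.0, 0.2], [0.1, 0.2, 0.0]]  # spectral similarity
w_spatial = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]   # spatial adjacency
l_joint = joint_laplacian(w_spectral, w_spatial, mu=0.5)
for row in l_joint:
    print(row)
```

A quick sanity check on any Laplacian built this way is that it is symmetric and each row sums to zero; both properties survive the weighted merge.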

  7. GRAPH-BASED SEMI-SUPERVISED HYPERSPECTRAL IMAGE CLASSIFICATION USING SPATIAL INFORMATION

    Directory of Open Access Journals (Sweden)

    N. Jamshidpour

    2017-09-01

    Full Text Available Hyperspectral image classification has been one of the most popular research areas in the remote sensing community in the past decades. However, there are still some problems that need specific attention. For example, the lack of enough labeled samples and the high dimensionality problem are two of the most important issues, and both degrade the performance of supervised classification dramatically. The main idea of semi-supervised learning (SSL) is to overcome these issues by exploiting unlabeled samples, which are available in enormous amounts. In this paper, we propose a graph-based semi-supervised classification method that uses both spectral and spatial information for hyperspectral image classification. More specifically, two graphs were designed and constructed in order to exploit the relationships among pixels in the spectral and spatial spaces, respectively. Then, the Laplacians of both graphs were merged to form a weighted joint graph. The experiments were carried out on two different benchmark hyperspectral data sets. The proposed method performed significantly better than well-known supervised classification methods, such as SVM. The assessments consisted of both accuracy and homogeneity analyses of the produced classification maps. The proposed spectral-spatial SSL method considerably increased the classification accuracy when the labeled training data set was very scarce. When there were only five labeled samples for each class, the performance improved by 5.92% and 10.76% compared to spatial graph-based SSL, for the AVIRIS Indian Pine and Pavia University data sets, respectively.

  8. Reliability of a treatment-based classification system for subgrouping people with low back pain.

    Science.gov (United States)

    Henry, Sharon M; Fritz, Julie M; Trombley, Andrea R; Bunn, Janice Y

    2012-09-01

    Observational, cross-sectional reliability study. To examine the interrater reliability of novice raters in their use of the treatment-based classification (TBC) system for low back pain and to explore the patterns of disagreement in classification errors. Although the interrater reliability of individual test items in the TBC system is moderate to good, some error persists in classification decision making. Understanding which classification errors are common could direct further refinement of the TBC system. Using previously recorded patient data (n = 24), 12 novice raters classified patients according to the TBC schema. These classification results were combined with those of 7 other raters, allowing examination of the overall agreement using the kappa statistic, as well as agreement/disagreement among pairwise comparisons in classification assignments. A chi-square test examined differences in percent agreement between the novice and more experienced raters and differences in classification distributions between these 2 groups of raters. Among 12 novice raters, there was 80.9% agreement in the pairs of classification (κ = 0.62; 95% confidence interval: 0.59, 0.65) and an overall 75.5% agreement (κ = 0.57; 95% confidence interval: 0.55, 0.69) for the combined data set. Raters were least likely to agree on a classification of stabilization (77.5% agreement). The overall percentage of pairwise classification judgments that disagreed was 24.5%, with the most common disagreement being between manipulation and stabilization (11.0%), followed by a mismatch between stabilization and specific exercise (8.2%). Additional refinement is needed to reduce rater disagreement that persists in the TBC decision-making algorithm, particularly in the stabilization category. J Orthop Sports Phys Ther 2012;42(9):797-805, Epub 7 June 2012. doi:10.2519/jospt.2012.4078.
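
Percent agreement and the kappa statistic reported above can be illustrated for a single pair of raters. The sketch below computes Cohen's kappa, which corrects observed agreement for the agreement expected by chance; it is a simplification of the study's multi-rater pairwise analysis, and the ratings are invented.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Return (observed agreement, Cohen's kappa) for two raters."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the raters' marginal proportions per class.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return observed, float("nan")  # kappa is undefined in this case
    return observed, (observed - expected) / (1.0 - expected)


# Hypothetical classifications of 10 patients by two raters.
rater_a = ["manip", "manip", "stab", "stab", "exer",
           "exer", "manip", "stab", "exer", "manip"]
rater_b = ["manip", "stab", "stab", "stab", "exer",
           "manip", "manip", "stab", "exer", "manip"]
agreement, kappa = cohen_kappa(rater_a, rater_b)
print(agreement, round(kappa, 3))
```

Here the raters agree on 8 of 10 patients (80%), chance agreement is 0.34, and kappa is about 0.70, which shows why kappa is lower than raw percent agreement.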

  9. A review of supervised object-based land-cover image classification

    Science.gov (United States)

    Ma, Lei; Li, Manchun; Ma, Xiaoxue; Cheng, Liang; Du, Peijun; Liu, Yongxue

    2017-08-01

    Object-based image classification for land-cover mapping purposes using remote-sensing imagery has attracted significant attention in recent years. Numerous studies conducted over the past decade have investigated a broad array of sensors, feature selection, classifiers, and other factors of interest. However, these research results have not yet been synthesized to provide coherent guidance on the effect of different supervised object-based land-cover classification processes. In this study, we first construct a database with 28 fields using qualitative and quantitative information extracted from 254 experimental cases described in 173 scientific papers. Second, the results of the meta-analysis are reported, including general characteristics of the studies (e.g., the geographic range of relevant institutes, preferred journals) and the relationships between factors of interest (e.g., spatial resolution and study area or optimal segmentation scale, accuracy and number of targeted classes), especially with respect to the classification accuracy of different sensors, segmentation scale, training set size, supervised classifiers, and land-cover types. Third, useful data on supervised object-based image classification are determined from the meta-analysis. For example, we find that supervised object-based classification is currently experiencing rapid advances, while development of the fuzzy technique is limited in the object-based framework. Furthermore, spatial resolution correlates with the optimal segmentation scale and study area, and Random Forest (RF) shows the best performance in object-based classification. The area-based accuracy assessment method can obtain stable classification performance, and indicates a strong correlation between accuracy and training set size, while the accuracy of the point-based method is likely to be unstable due to mixed objects. In addition, the overall accuracy benefits from higher spatial resolution images (e.g., unmanned aerial

  10. A Quantum Hybrid PSO Combined with Fuzzy k-NN Approach to Feature Selection and Cell Classification in Cervical Cancer Detection

    Directory of Open Access Journals (Sweden)

    Abdullah M. Iliyasu

    2017-12-01

    Full Text Available A quantum hybrid (QH) intelligent approach that blends the adaptive search capability of the quantum-behaved particle swarm optimisation (QPSO) method with the intuitionistic rationality of the traditional fuzzy k-nearest neighbours (Fuzzy k-NN) algorithm (known simply as the Q-Fuzzy approach) is proposed for efficient feature selection and classification of cells in cervical smeared (CS) images. From an initial multitude of 17 features describing the geometry, colour, and texture of the CS images, the QPSO stage of our proposed technique is used to select the best subset of features (i.e., global best particles), which represents a pruned-down collection of seven features. Using a dataset of almost 1000 images, performance evaluation of our proposed Q-Fuzzy approach assesses the impact of our feature selection on classification accuracy by way of three experimental scenarios that are compared alongside two other approaches: the All-features approach (i.e., classification without prior feature selection) and another hybrid technique combining the standard PSO algorithm with the Fuzzy k-NN technique (the P-Fuzzy approach). In the first and second scenarios, we further divided the assessment criteria into classification accuracy based on the choice of best features and classification accuracy for the different categories of cervical cells. In the third scenario, we introduced new QH hybrid techniques, i.e., QPSO combined with other supervised learning methods, and compared their classification accuracy with that of our proposed Q-Fuzzy approach. Furthermore, we employed statistical approaches to establish qualitative agreement with regard to the feature selection in experimental scenarios 1 and 3. The synergy between the QPSO and Fuzzy k-NN in the proposed Q-Fuzzy approach improves classification accuracy, as manifest in the reduction in the number of cell features, which is crucial for effective cervical cancer detection and diagnosis.

  11. Improving the Computational Performance of Ontology-Based Classification Using Graph Databases

    Directory of Open Access Journals (Sweden)

    Thomas J. Lampoltshammer

    2015-07-01

    Full Text Available The increasing availability of very high-resolution remote sensing imagery (i.e., from satellites, airborne laser scanning, or aerial photography) represents both a blessing and a curse for researchers. The manual classification of these images, or other similar geo-sensor data, is time-consuming and leads to subjective and non-deterministic results. For this reason, (semi-)automated classification approaches are in high demand in the affected research areas. Ontologies provide a suitable means of automated classification for various kinds of sensor data, including remotely sensed data. However, the processing of data entities (so-called individuals) is one of the most cost-intensive computational operations within ontology reasoning. Therefore, an approach based on graph databases is proposed to overcome the high time consumption of the classification task. The introduced approach shifts the classification task from the classical Protégé environment and its common reasoners to the proposed graph-based approaches. For validation, the authors tested the approach on a simulation scenario based on a real-world example. The results demonstrate a promising improvement in classification speed, up to 80,000 times faster than the Protégé-based approach.

  12. An object-oriented classification method of high resolution imagery based on improved AdaTree

    International Nuclear Information System (INIS)

    Xiaohe, Zhang; Liang, Zhai; Jixian, Zhang; Huiyong, Sang

    2014-01-01

    With the growing use of high spatial resolution remote sensing images, more and more studies have paid attention to object-oriented classification, covering both image segmentation and automatic classification after segmentation. This paper proposes a fast method of object-oriented automatic classification. First, edge-based or FNEA-based segmentation was used to identify image objects, and the values of the attributes of the image objects most suitable for classification were calculated. Then a certain number of samples from the image objects were selected as training data for the improved AdaTree algorithm to obtain classification rules. Finally, the image objects could be classified easily using these rules. In AdaTree, we mainly modified the final hypothesis to obtain classification rules. In the experiment with a WorldView2 image, the method based on AdaTree showed a clear improvement in accuracy and efficiency compared with the method based on SVM, with the kappa coefficient reaching 0.9242.

  13. Multi-material classification of dry recyclables from municipal solid waste based on thermal imaging.

    Science.gov (United States)

    Gundupalli, Sathish Paulraj; Hait, Subrata; Thakur, Atul

    2017-12-01

    There has been a significant rise in municipal solid waste (MSW) generation in the last few decades due to rapid urbanization and industrialization. Due to the lack of source segregation practice, a need for automated segregation of recyclables from MSW exists in developing countries. This paper reports a thermal imaging based system for classifying useful recyclables from a simulated MSW sample. Experimental results have demonstrated the possibility of using a thermal imaging technique for classification and a robotic system for sorting of recyclables in a single process step. The reported classification system yields an accuracy in the range of 85-96% and is comparable with the existing single-material recyclable classification techniques. We believe that the reported thermal imaging based system can emerge as a viable and inexpensive large-scale classification-cum-sorting technology in recycling plants for processing MSW in developing countries. Copyright © 2017 Elsevier Ltd. All rights reserved.

  14. A Novel Imbalanced Data Classification Approach Based on Logistic Regression and Fisher Discriminant

    Directory of Open Access Journals (Sweden)

    Baofeng Shi

    2015-01-01

    Full Text Available We introduce an imbalanced data classification approach based on logistic regression significant discriminant and Fisher discriminant. First, a key indicator extraction model based on logistic regression significant discriminant and correlation analysis is derived to extract features for customer classification. Second, a customer scoring model is established on the basis of linear weighting using the Fisher discriminant. Then, a customer rating model is constructed in which the number of customers across all ratings follows a normal distribution. The performance of the proposed model and the classical SVM classification method is evaluated in terms of their ability to correctly classify consumers as default or nondefault customers. Empirical results using the data of 2157 customers in financial engineering suggest that the proposed approach performs better than the SVM model in dealing with imbalanced data classification. Moreover, our approach contributes to locating qualified customers for banks and bond investors.

  15. UAV-Based Crops Classification with Joint Features from Orthoimage and DSM Data

    Science.gov (United States)

    Liu, B.; Shi, Y.; Duan, Y.; Wu, W.

    2018-04-01

    Accurate crop classification remains a challenging task because the same crop can exhibit different spectra and different crops can exhibit the same spectrum. Recently, UAV-based remote sensing has gained popularity not only for its high spatial and temporal resolution, but also for its ability to obtain spectral and spatial data at the same time. This paper focuses on how to take full advantage of spatial and spectral features to improve crop classification accuracy, based on a UAV platform equipped with a general digital camera. Texture and spatial features extracted from the RGB orthoimage and the digital surface model of the monitoring area are analysed and integrated within an SVM classification framework. Extensive experimental results indicate that the overall classification accuracy improves drastically, from 72.9 % to 94.5 %, when the spatial features are combined, which verifies the feasibility and effectiveness of the proposed method.

  16. Resting State fMRI Functional Connectivity-Based Classification Using a Convolutional Neural Network Architecture.

    Science.gov (United States)

    Meszlényi, Regina J; Buza, Krisztian; Vidnyánszky, Zoltán

    2017-01-01

    Machine learning techniques have become increasingly popular in the field of resting state fMRI (functional magnetic resonance imaging) network based classification. However, the application of convolutional networks has been proposed only very recently and has remained largely unexplored. In this paper we describe a convolutional neural network architecture for functional connectome classification called connectome-convolutional neural network (CCNN). Our results on simulated datasets and a publicly available dataset for amnestic mild cognitive impairment classification demonstrate that our CCNN model can efficiently distinguish between subject groups. We also show that the connectome-convolutional network is capable of combining information from diverse functional connectivity metrics and that models using a combination of different connectivity descriptors are able to outperform classifiers using only one metric. It follows from this flexibility that our proposed CCNN model can be easily adapted to a wide range of connectome based classification or regression tasks by varying which connectivity descriptor combinations are used to train the network.

  17. Accurate crop classification using hierarchical genetic fuzzy rule-based systems

    Science.gov (United States)

    Topaloglou, Charalampos A.; Mylonas, Stelios K.; Stavrakoudis, Dimitris G.; Mastorocostas, Paris A.; Theocharis, John B.

    2014-10-01

    This paper investigates the effectiveness of an advanced classification system for accurate crop classification using very high resolution (VHR) satellite imagery. Specifically, a recently proposed genetic fuzzy rule-based classification system (GFRBCS) is employed, namely, the Hierarchical Rule-based Linguistic Classifier (HiRLiC). HiRLiC's model comprises a small set of simple IF-THEN fuzzy rules, easily interpretable by humans. One of its most important attributes is that its learning algorithm requires minimum user interaction, since the most important learning parameters affecting the classification accuracy are determined by the learning algorithm automatically. HiRLiC is applied in a challenging crop classification task, using a SPOT5 satellite image over an intensively cultivated area in a lake-wetland ecosystem in northern Greece. A rich set of higher-order spectral and textural features is derived from the initial bands of the (pan-sharpened) image, resulting in an input space comprising 119 features. The experimental analysis shows that HiRLiC compares favorably to other interpretable classifiers in the literature, both in terms of structural complexity and classification accuracy. Its testing accuracy was very close to that obtained by complex state-of-the-art classification systems, such as the support vector machine (SVM) and random forest (RF) classifiers. Nevertheless, visual inspection of the derived classification maps shows that HiRLiC is characterized by higher generalization properties, providing more homogeneous classifications than the competitors. Moreover, the runtime requirements for producing the thematic map were orders of magnitude lower than those of the competitors.

  18. Efficacy measures associated to a plantar pressure based classification system in diabetic foot medicine.

    Science.gov (United States)

    Deschamps, Kevin; Matricali, Giovanni Arnoldo; Desmet, Dirk; Roosen, Philip; Keijsers, Noel; Nobels, Frank; Bruyninckx, Herman; Staes, Filip

    2016-09-01

    The concept of 'classification' has, as in many other diseases, been found to be fundamental in the field of diabetic medicine. In the current study, we aimed at determining efficacy measures of a recently published plantar pressure based classification system. Technical efficacy of the classification system was investigated by applying a high resolution, pixel-level analysis on the normalized plantar pressure pedobarographic fields of the original experimental dataset consisting of 97 patients with diabetes and 33 persons without diabetes. Clinical efficacy was assessed by considering the occurrence of foot ulcers at the plantar aspect of the forefoot in this dataset. Classification efficacy was assessed by determining the classification recognition rate as well as its sensitivity and specificity using cross-validation subsets of the experimental dataset together with a novel cohort of 12 patients with diabetes. Pixel-level comparison of the four groups associated with the classification system highlighted distinct regional differences. Retrospective analysis showed the occurrence of eleven foot ulcers in the experimental dataset since their gait analysis. Eight out of the eleven ulcers developed in a region of the foot which had the highest forces. Overall classification recognition rate exceeded 90% for all cross-validation subsets. Sensitivity and specificity of the four groups associated with the classification system exceeded the 0.7 and 0.8 levels, respectively, in all cross-validation subsets. The results of the current study support the use of the novel plantar pressure based classification system in diabetic foot medicine. It may particularly serve in communication, diagnosis and clinical decision making. Copyright © 2016 Elsevier B.V. All rights reserved.

  19. Esophagus cancer

    International Nuclear Information System (INIS)

    Anon.

    1989-01-01

    Routes of metastatic spread of esophagus cancer, depending on the segmental division of the esophagus, are considered. Classification of esophagus cancer according to morphological structure, the domestic clinical classification according to stages, and the international classification according to the TNM system are presented. Diagnosis of esophagus cancer should be comprehensive and based on the results of clinical examination of patients and of radiological, endoscopic and morphological investigations. Radiological, surgical and combined (preoperative radiotherapy followed by surgery) methods of treatment are used for esophagus cancer. Versions of preoperative radiotherapy are given. Favourable results of combining surgical treatment with preoperative radiotherapy are shown.

  20. Vehicle Maneuver Detection with Accelerometer-Based Classification

    Directory of Open Access Journals (Sweden)

    Javier Cervantes-Villanueva

    2016-09-01

    Full Text Available In the mobile computing era, smartphones have become instrumental tools to develop innovative mobile context-aware systems. In that sense, their usage in the vehicular domain eases the development of novel and personal transportation solutions. In this frame, the present work introduces an innovative mechanism to perceive the current kinematic state of a vehicle on the basis of the accelerometer data from a smartphone mounted in the vehicle. Unlike previous proposals, the introduced architecture targets the computational limitations of such devices to carry out the detection process following an incremental approach. For its realization, we have evaluated different classification algorithms to act as agents within the architecture. Finally, our approach has been tested with a real-world dataset collected by means of the ad hoc mobile application developed.

  1. Voting-based Classification for E-mail Spam Detection

    Directory of Open Access Journals (Sweden)

    Bashar Awad Al-Shboul

    2016-06-01

    Full Text Available The problem of spam e-mail has gained a tremendous amount of attention. Although entities tend to use e-mail spam filter applications to filter out received spam e-mails, marketing companies still tend to send unsolicited e-mails in bulk and users still receive a reasonable amount of spam e-mail despite those filtering applications. This work proposes a new method for classifying e-mails into spam and non-spam. First, several e-mail content features are extracted and then those features are used for classifying each e-mail individually. The classification results of three different classifiers (i.e. Decision Trees, Random Forests and k-Nearest Neighbor) are combined in various voting schemes (i.e. majority vote, average probability, product of probabilities, minimum probability and maximum probability) for making the final decision. To validate our method, two different spam e-mail collections were used.
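
    The five voting schemes named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `combine_votes`, the 0.5 vote threshold, and the example probabilities are all assumptions made here, and each classifier is assumed to output a probability that the e-mail is spam.

```python
from math import prod


def combine_votes(spam_probs, scheme="majority"):
    """Combine per-classifier P(spam) values into one decision.

    spam_probs: one probability per classifier (hypothetical inputs).
    Returns True for spam, False for non-spam.
    """
    ham_probs = [1.0 - p for p in spam_probs]
    n = len(spam_probs)

    if scheme == "majority":
        # Each classifier casts one vote; assumed threshold of 0.5.
        votes = sum(p > 0.5 for p in spam_probs)
        return votes > n / 2
    if scheme == "average":
        spam, ham = sum(spam_probs) / n, sum(ham_probs) / n
    elif scheme == "product":
        spam, ham = prod(spam_probs), prod(ham_probs)
    elif scheme == "minimum":
        spam, ham = min(spam_probs), min(ham_probs)
    elif scheme == "maximum":
        spam, ham = max(spam_probs), max(ham_probs)
    else:
        raise ValueError(f"unknown voting scheme: {scheme}")
    # For the probability-based schemes, pick the class with the larger score.
    return spam > ham


# Hypothetical outputs of three classifiers for a single e-mail.
probs = [0.9, 0.4, 0.45]
for name in ("majority", "average", "product", "minimum", "maximum"):
    print(name, combine_votes(probs, name))
```

    With these made-up inputs the majority vote disagrees with the four probability-based schemes, which is why comparing schemes on real collections is informative.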

  2. A CNN Based Approach for Garments Texture Design Classification

    Directory of Open Access Journals (Sweden)

    S.M. Sofiqul Islam

    2017-05-01

    Full Text Available Identifying garment texture designs automatically in order to recommend fashion trends is increasingly important because of the rapid growth of online shopping. By learning the properties of images efficiently, a machine can achieve better classification accuracy. Several hand-engineered feature coding schemes exist for identifying garment design classes. Recently, deep Convolutional Neural Networks (CNNs) have shown better performance on various object recognition tasks. A deep CNN uses multiple levels of representation and abstraction, which help a machine to understand the data more accurately. In this paper, a CNN model for identifying garment design classes is proposed. Experimental results on two different datasets show better results than two existing well-known CNN models (AlexNet and VGGNet) and some state-of-the-art hand-engineered feature extraction methods.

  3. Enhancement of force patterns classification based on Gaussian distributions.

    Science.gov (United States)

    Ertelt, Thomas; Solomonovs, Ilja; Gronwald, Thomas

    2018-01-23

    Description of the patterns of ground reaction force is a standard method in areas such as medicine, biomechanics and robotics. The fundamental parameter is the time course of the force, which is classified visually in particular in the field of clinical diagnostics. Here, the knowledge and experience of the diagnostician is relevant for its assessment. For an objective and valid discrimination of the ground reaction force pattern, a generic method, especially in the medical field, is absolutely necessary to describe the qualities of the time-course. The aim of the presented method was to combine the approaches of two existing procedures from the fields of machine learning and the Gauss approximation in order to take advantages of both methods for the classification of ground reaction force patterns. The current limitations of both methods could be eliminated by an overarching method. Twenty-nine male athletes from different sports were examined. Each participant was given the task of performing a one-legged stopping maneuver on a force plate from the maximum possible starting speed. The individual time course of the ground reaction force of each subject was registered and approximated on the basis of eight Gaussian distributions. The descriptive coefficients were then classified using Bayesian regulated neural networks. The different sports served as the distinguishing feature. Although the athletes were all given the same task, all sports referred to a different quality in the time course of ground reaction force. Meanwhile within each sport, the athletes were homogeneous. With an overall prediction (R = 0.938) all subjects/sports were classified correctly with 94.29% accuracy. The combination of the two methods: the mathematical description of the time course of ground reaction forces on the basis of Gaussian distributions and their classification by means of Bayesian regulated neural networks, seems an adequate and promising method to discriminate the

  4. Deep Galaxy: Classification of Galaxies based on Deep Convolutional Neural Networks

    OpenAIRE

    Khalifa, Nour Eldeen M.; Taha, Mohamed Hamed N.; Hassanien, Aboul Ella; Selim, I. M.

    2017-01-01

    In this paper, a deep convolutional neural network architecture for galaxy classification is presented. A galaxy can be classified based on its features into three main categories: Elliptical, Spiral, and Irregular. The proposed deep galaxies architecture consists of 8 layers, one main convolutional layer for feature extraction with 96 filters, followed by two principal fully connected layers for classification. It is trained over 1356 images and achieved 97.272% in testing accuracy. A c...

  5. [Surgical treatment of chronic pancreatitis based on classification of M. Buchler and coworkers].

    Science.gov (United States)

    Krivoruchko, I A; Boĭko, V V; Goncharova, N N; Andreeshchev, S A

    2011-08-01

    The results of surgical treatment of 452 patients suffering from chronic pancreatitis (CHP) were analyzed. The CHP classification elaborated by M. Buchler and coworkers (2009), based on clinical signs, morphological peculiarities and pancreatic function analysis, contains scientifically substantiated recommendations for the choice of diagnostic methods and complex treatment of the disease. The proposed classification is simple to apply and constitutes an instrument for studying and comparing the severity of the CHP course, patient prognosis and treatment.

  6. Deep learning based classification of breast tumors with shear-wave elastography.

    Science.gov (United States)

    Zhang, Qi; Xiao, Yang; Dai, Wei; Suo, Jingfeng; Wang, Congzhi; Shi, Jun; Zheng, Hairong

    2016-12-01

    This study aims to build a deep learning (DL) architecture for automated extraction of learned-from-data image features from the shear-wave elastography (SWE), and to evaluate the DL architecture in differentiation between benign and malignant breast tumors. We construct a two-layer DL architecture for SWE feature extraction, comprised of the point-wise gated Boltzmann machine (PGBM) and the restricted Boltzmann machine (RBM). The PGBM contains task-relevant and task-irrelevant hidden units, and the task-relevant units are connected to the RBM. Experimental evaluation was performed with five-fold cross validation on a set of 227 SWE images, 135 of benign tumors and 92 of malignant tumors, from 121 patients. The features learned with our DL architecture were compared with the statistical features quantifying image intensity and texture. Results showed that the DL features achieved better classification performance with an accuracy of 93.4%, a sensitivity of 88.6%, a specificity of 97.1%, and an area under the receiver operating characteristic curve of 0.947. The DL-based method integrates feature learning with feature selection on SWE. It may be potentially used in clinical computer-aided diagnosis of breast cancer. Copyright © 2016 Elsevier B.V. All rights reserved.
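
    The accuracy, sensitivity and specificity reported in this abstract are standard quantities derived from a binary confusion matrix. A minimal sketch follows, with malignant taken as the positive class; the function name and the counts are hypothetical and are not the study's data.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: positive (e.g. malignant) cases classified correctly/incorrectly.
    tn/fp: negative (e.g. benign) cases classified correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts for illustration only.
sens, spec, acc = binary_metrics(tp=8, fn=2, tn=18, fp=2)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

    In a five-fold cross-validation such as the one described, these values would typically be computed per fold and then averaged.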

  7. Statistical Redundancy Testing for Improved Gene Selection in Cancer Classification Using Microarray Data

    Directory of Open Access Journals (Sweden)

    J. Sunil Rao

    2007-01-01

    Full Text Available In gene selection for cancer classification using microarray data, we define an eigenvalue-ratio statistic to measure a gene’s contribution to the joint discriminability when this gene is included into a set of genes. Based on this eigenvalue-ratio statistic, we define a novel hypothesis testing for gene statistical redundancy and propose two gene selection methods. Simulation studies illustrate the agreement between statistical redundancy testing and gene selection methods. Real data examples show the proposed gene selection methods can select a compact gene subset which can not only be used to build high quality cancer classifiers but also show biological relevance.

  8. Application of SVM classifier in thermographic image classification for early detection of breast cancer

    Science.gov (United States)

    Oleszkiewicz, Witold; Cichosz, Paweł; Jagodziński, Dariusz; Matysiewicz, Mateusz; Neumann, Łukasz; Nowak, Robert M.; Okuniewski, Rafał

    2016-09-01

    This article presents the application of machine learning algorithms for early detection of breast cancer on the basis of thermographic images. A supervised learning model, the support vector machine (SVM), was implemented together with the Sequential Minimal Optimization (SMO) algorithm for training the SVM classifier. The SVM classifier was included in a client-server application which makes it possible to create a training set of examinations and to apply classifiers (including SVM) for the diagnosis and early detection of breast cancer. The sensitivity and specificity of the SVM classifier were calculated based on the thermographic images from studies. Furthermore, a heuristic method for tuning the SVM's parameters was proposed.

  9. Classification of right-hand grasp movement based on EMOTIV Epoc+

    Science.gov (United States)

    Tobing, T. A. M. L.; Prawito, Wijaya, S. K.

    2017-07-01

    Combinations of BCT elements for right-hand grasp movement have been obtained, providing the average value of their classification accuracy. The aim of this study is to find a suitable combination for the best classification accuracy of right-hand grasp movement based on an EEG headset, the EMOTIV Epoc+. There are three movement classes: grasping hand, relax, and opening hand. These classifications take advantage of the Event-Related Desynchronization (ERD) phenomenon, which makes it possible to distinguish relaxation, imagery, and movement states from each other. The combinations of elements are the use of Independent Component Analysis (ICA), spectrum analysis by Fast Fourier Transform (FFT), maximum mu and beta power with their frequencies as features, and the Probabilistic Neural Network (PNN) and Radial Basis Function (RBF) classifiers. The average values of classification accuracy are approximately 83% for training and approximately 57% for testing. To give a better understanding of the signal quality recorded by the EMOTIV Epoc+, the classification accuracy for left- or right-hand grasping movement EEG signals (provided by Physionet) is also given, i.e. approximately 85% for training and approximately 70% for testing. The comparison of accuracy values for each combination, experimental condition, and external EEG data is provided for the purpose of analyzing classification accuracy.
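
    The "maximum mu and beta power with their frequency" features can be illustrated with a plain discrete Fourier transform. The sketch below is an assumption-laden example rather than the study's pipeline: the band limits (8-12 Hz for mu, 13-30 Hz for beta), the 128 Hz sampling rate, and the synthetic signal are all chosen here for illustration, and a naive DFT stands in for an FFT library call.

```python
import cmath
import math


def power_spectrum(signal, fs):
    """Naive DFT power spectrum; returns (frequencies in Hz, power per bin)."""
    n = len(signal)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2 / n)
    return freqs, power


def band_peak(freqs, power, low, high):
    """Return (peak power, frequency of the peak) within [low, high] Hz."""
    in_band = [(p, f) for f, p in zip(freqs, power) if low <= f <= high]
    return max(in_band)


# Synthetic one-second signal: a 10 Hz and a weaker 20 Hz component.
fs = 128  # assumed sampling rate in Hz
signal = [math.sin(2 * math.pi * 10 * t / fs)
          + 0.5 * math.sin(2 * math.pi * 20 * t / fs) for t in range(fs)]

freqs, power = power_spectrum(signal, fs)
mu_power, mu_freq = band_peak(freqs, power, 8, 12)        # assumed mu band
beta_power, beta_freq = band_peak(freqs, power, 13, 30)   # assumed beta band
features = [mu_power, mu_freq, beta_power, beta_freq]
print(features)
```

    A feature vector of this kind, computed per channel, is what a classifier such as a PNN or an RBF network would receive as input.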

  10. Multi-Frequency Polarimetric SAR Classification Based on Riemannian Manifold and Simultaneous Sparse Representation

    Directory of Open Access Journals (Sweden)

    Fan Yang

    2015-07-01

    Full Text Available Normally, polarimetric SAR classification is a high-dimensional nonlinear mapping problem. In the realm of pattern recognition, sparse representation is a very efficacious and powerful approach. As classical descriptors of polarimetric SAR, covariance and coherency matrices are Hermitian semidefinite and form a Riemannian manifold. Conventional Euclidean metrics are not suitable for a Riemannian manifold, and hence, normal sparse representation classification cannot be applied to polarimetric SAR directly. This paper proposes a new land cover classification approach for polarimetric SAR. There are two principal novelties in this paper. First, a Stein kernel on a Riemannian manifold instead of Euclidean metrics, combined with sparse representation, is employed for polarimetric SAR land cover classification. This approach is named Stein-sparse representation-based classification (SRC). Second, using simultaneous sparse representation and reasonable assumptions of the correlation of representation among different frequency bands, Stein-SRC is generalized to simultaneous Stein-SRC for multi-frequency polarimetric SAR classification. These classifiers are assessed using polarimetric SAR images from the Airborne Synthetic Aperture Radar (AIRSAR) sensor of the Jet Propulsion Laboratory (JPL) and the Electromagnetics Institute Synthetic Aperture Radar (EMISAR) sensor of the Technical University of Denmark (DTU). Experiments on single-band and multi-band data both show that these approaches acquire more accurate classification results in comparison to many conventional and advanced classifiers.
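
    The abstract does not give the kernel's formula. A common definition, assumed here, builds the Stein kernel from the Stein (Jensen-Bregman LogDet) divergence between two positive definite matrices, S(X, Y) = log det((X + Y)/2) - (1/2) log det(XY), with k(X, Y) = exp(-sigma * S(X, Y)). The sketch below uses that definition; the function names, the value of `sigma`, and the example matrices are illustrative only, and the kernel is known to be positive definite only for certain values of sigma.

```python
import math


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting.

    Works for real or complex entries.
    """
    a = [list(row) for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) == 0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def log_det(matrix):
    # The determinant of a Hermitian positive definite matrix is real and
    # positive, so only the real part is kept.
    return math.log(det(matrix).real)


def stein_divergence(x, y):
    """Stein divergence between two positive definite matrices."""
    n = len(x)
    mean = [[(x[i][j] + y[i][j]) / 2 for j in range(n)] for i in range(n)]
    return log_det(mean) - 0.5 * (log_det(x) + log_det(y))


def stein_kernel(x, y, sigma=1.0):
    """Assumed kernel form: exp(-sigma * Stein divergence)."""
    return math.exp(-sigma * stein_divergence(x, y))


# Illustrative matrices: two real diagonal ones and one complex Hermitian one.
x = [[2, 0], [0, 2]]
y = [[8, 0], [0, 8]]
c = [[2, 1j], [-1j, 2]]
print(stein_kernel(x, y), stein_kernel(x, c))
```

    In a sparse-representation classifier, a kernel of this kind replaces the Euclidean inner product when comparing a test matrix with dictionary atoms.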

  11. The development of a classification schema for arts-based approaches to knowledge translation.

    Science.gov (United States)

    Archibald, Mandy M; Caine, Vera; Scott, Shannon D

    2014-10-01

    Arts-based approaches to knowledge translation are emerging as powerful interprofessional strategies with potential to facilitate evidence uptake, communication, knowledge, attitude, and behavior change across healthcare provider and consumer groups. These strategies are in the early stages of development. To date, no classification system for arts-based knowledge translation exists, which limits development and understandings of effectiveness in evidence syntheses. We developed a classification schema of arts-based knowledge translation strategies based on two mechanisms by which these approaches function: (a) the degree of precision in key message delivery, and (b) the degree of end-user participation. We demonstrate how this classification is necessary to explore how context, time, and location shape arts-based knowledge translation strategies. Classifying arts-based knowledge translation strategies according to their core attributes extends understandings of the appropriateness of these approaches for various healthcare settings and provider groups. The classification schema developed may enhance understanding of how, where, and for whom arts-based knowledge translation approaches are effective, and enable theorizing of essential knowledge translation constructs, such as the influence of context, time, and location on utilization strategies. The classification schema developed may encourage systematic inquiry into the effectiveness of these approaches in diverse interprofessional contexts. © 2014 Sigma Theta Tau International.

  12. Non-target adjacent stimuli classification improves performance of classical ERP-based brain computer interface

    Science.gov (United States)

    Ceballos, G. A.; Hernández, L. F.

    2015-04-01

    Objective. The classical ERP-based speller, or P300 Speller, is one of the most commonly used paradigms in the field of Brain Computer Interfaces (BCI). Several alterations to the visual stimuli presentation system have been developed to avoid unfavorable effects elicited by adjacent stimuli. However, there has been little, if any, regard to useful information contained in responses to adjacent stimuli about spatial location of target symbols. This paper aims to demonstrate that combining the classification of non-target adjacent stimuli with standard classification (target versus non-target) significantly improves classical ERP-based speller efficiency. Approach. Four SWLDA classifiers were trained and combined with the standard classifier: the lower row, upper row, right column and left column classifiers. This new feature extraction procedure and the classification method were carried out on three open databases: the UAM P300 database (Universidad Autonoma Metropolitana, Mexico), BCI competition II (dataset IIb) and BCI competition III (dataset II). Main results. The inclusion of the classification of non-target adjacent stimuli improves target classification in the classical row/column paradigm. A gain in mean single trial classification of 9.6% and an overall improvement of 25% in simulated spelling speed was achieved. Significance. We have provided further evidence that the ERPs produced by adjacent stimuli present discriminable features, which could provide additional information about the spatial location of intended symbols. This work promotes the searching of information on the peripheral stimulation responses to improve the performance of emerging visual ERP-based spellers.

  13. Validation of the RTOG recursive partitioning analysis (RPA) classification for small-cell lung cancer-only brain metastases

    International Nuclear Information System (INIS)

    Videtic, Gregory M.M.; Adelstein, David J.; Mekhail, Tarek M.; Rice, Thomas W.; Stevens, Glen H.J.; Lee, S.-Y.; Suh, John H.

    2007-01-01

    Purpose: Radiation Therapy Oncology Group (RTOG) developed a prognostic classification based on a recursive partitioning analysis (RPA) of patient pretreatment characteristics from three completed brain metastases randomized trials. Clinical trials for patients with brain metastases generally exclude small-cell lung cancer (SCLC) cases. We hypothesize that the RPA classes are valid in the setting of SCLC brain metastases. Methods and Materials: A retrospective review of 154 SCLC patients with brain metastases treated between April 1983 and May 2005 was performed. RPA criteria used for class assignment were Karnofsky performance status (KPS), primary tumor status (PT), presence of extracranial metastases (ED), and age. Results: Median survival was 4.9 months, with 4 patients (2.6%) alive at analysis. Median follow-up was 4.7 months (range, 0.3-40.3 months). Median age was 65 (range, 42-85 years). Median KPS was 70 (range, 40-100). Number of patients with controlled PT and no ED was 20 (13%) and with ED, 27 (18%); without controlled PT and ED, 34 (22%) and with ED, 73 (47%). RPA class distribution was: Class I: 8 (5%); Class II: 96 (62%); Class III: 51 (33%). Median survivals (in months) by RPA class were: Class I: 8.6; Class II: 4.2; Class III: 2.3 (p = 0.0023). Conclusions: Survivals for SCLC-only brain metastases replicate the results from the RTOG RPA classification. These classes are therefore valid for brain metastases from SCLC, support the inclusion of SCLC patients in future brain metastases trials, and may also serve as a basis for historical comparisons
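
    The abstract names the four RPA criteria but not the cut-offs. The sketch below assumes the commonly cited RTOG RPA definition (Class I: KPS of at least 70, age under 65, controlled primary tumor, no extracranial metastases; Class III: KPS under 70; Class II: all other patients). The function name and the example patients are hypothetical.

```python
def rpa_class(kps, age, primary_controlled, extracranial_mets):
    """Assign an RTOG RPA class (1, 2 or 3) under the assumed cut-offs.

    kps: Karnofsky performance status (0-100).
    primary_controlled: True if the primary tumor (PT) is controlled.
    extracranial_mets: True if extracranial metastases (ED) are present.
    """
    if kps < 70:
        return 3
    if age < 65 and primary_controlled and not extracranial_mets:
        return 1
    return 2


# Hypothetical patients for illustration only.
print(rpa_class(kps=90, age=58, primary_controlled=True, extracranial_mets=False))
print(rpa_class(kps=80, age=70, primary_controlled=True, extracranial_mets=False))
print(rpa_class(kps=60, age=50, primary_controlled=True, extracranial_mets=False))
```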

  14. Abridged republication of FIGO's staging classification for cancer of the ovary, fallopian tube, and peritoneum.

    Science.gov (United States)

    Prat, Jaime

    2015-10-01

    Ovarian, fallopian tube, and peritoneal cancers have a similar clinical presentation and are treated similarly, and current evidence supports staging all 3 cancers in a single system. The primary site (i.e. ovary, fallopian tube, or peritoneum) should be designated where possible. The histologic type should be recorded. Intraoperative rupture ("surgical spill") is IC1; capsule ruptured before surgery or tumor on ovarian or fallopian tube surface is IC2; and positive peritoneal cytology with or without rupture is IC3. The new staging includes a revision of stage III patients; assignment to stage IIIA1 is based on spread to the retroperitoneal lymph nodes without intraperitoneal dissemination. Extension of tumor from omentum to spleen or liver (stage IIIC) should be differentiated from isolated parenchymal metastases (stage IVB). © 2015 American Cancer Society.

  15. Prediction of lung cancer patient survival via supervised machine learning classification techniques.

    Science.gov (United States)

    Lynch, Chip M; Abdollahi, Behnaz; Fuqua, Joshua D; de Carlo, Alexandra R; Bartholomai, James A; Balgemann, Rayeanne N; van Berkel, Victor H; Frieboes, Hermann B

    2017-12-01

    Outcomes for cancer patients have been previously estimated by applying various machine learning techniques to large datasets such as the Surveillance, Epidemiology, and End Results (SEER) program database. In particular for lung cancer, it is not well understood which types of techniques would yield more predictive information, and which data attributes should be used in order to determine this information. In this study, a number of supervised learning techniques are applied to the SEER database to classify lung cancer patients in terms of survival, including linear regression, Decision Trees, Gradient Boosting Machines (GBM), Support Vector Machines (SVM), and a custom ensemble. Key data attributes in applying these methods include tumor grade, tumor size, gender, age, stage, and number of primaries, with the goal to enable comparison of predictive power between the various methods. The prediction is treated like a continuous target, rather than a classification into categories, as a first step towards improving survival prediction. The results show that the predicted values agree with actual values for low to moderate survival times, which constitute the majority of the data. The best performing technique was the custom ensemble with a Root Mean Square Error (RMSE) value of 15.05. The most influential model within the custom ensemble was GBM, while Decision Trees may be inapplicable as it had too few discrete outputs. The results further show that among the five individual models generated, the most accurate was GBM with an RMSE value of 15.32. Although SVM underperformed with an RMSE value of 15.82, statistical analysis singles the SVM as the only model that generated a distinctive output. The results of the models are consistent with a classical Cox proportional hazards model used as a reference technique. We conclude that application of these supervised learning techniques to lung cancer data in the SEER database may be of use to estimate patient survival time
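
    The models in this abstract are compared by Root Mean Square Error. A minimal sketch of that metric follows; the survival times below are made up for illustration and are not SEER data.

```python
import math


def rmse(actual, predicted):
    """Root Mean Square Error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical survival times for three patients.
actual = [10, 20, 30]
predicted = [12, 18, 33]
print(round(rmse(actual, predicted), 3))
```

    Because the error is squared before averaging, a few large misses on long survival times can dominate the score, which fits the abstract's remark that agreement is best for low to moderate survival times.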

  16. An approach for classification of hydrogeological systems at the regional scale based on groundwater hydrographs

    Science.gov (United States)

    Haaf, Ezra; Barthel, Roland

    2016-04-01

    When assessing hydrogeological conditions at the regional scale, the analyst is often confronted with uncertainty of structures, inputs and processes while having to base inference on scarce and patchy data. Haaf and Barthel (2015) proposed a concept for handling this predicament by developing a groundwater systems classification framework, where information is transferred from similar, but well-explored and better understood to poorly described systems. The concept is based on the central hypothesis that similar systems react similarly to the same inputs and vice versa. It is conceptually related to PUB (Prediction in ungauged basins) where organization of systems and processes by quantitative methods is intended and used to improve understanding and prediction. Furthermore, using the framework it is expected that regional conceptual and numerical models can be checked or enriched by ensemble generated data from neighborhood-based estimators. In a first step, groundwater hydrographs from a large dataset in Southern Germany are compared in an effort to identify structural similarity in groundwater dynamics. A number of approaches to group hydrographs, mostly based on a similarity measure - which have previously only been used in local-scale studies, can be found in the literature. These are tested alongside different global feature extraction techniques. The resulting classifications are then compared to a visual "expert assessment"-based classification which serves as a reference. A ranking of the classification methods is carried out and differences shown. Selected groups from the classifications are related to geological descriptors. Here we present the most promising results from a comparison of classifications based on series correlation, different series distances and series features, such as the coefficients of the discrete Fourier transform and the intrinsic mode functions of empirical mode decomposition. Additionally, we show examples of classes

  17. A fingerprint classification algorithm based on combination of local and global information

    Science.gov (United States)

    Liu, Chongjin; Fu, Xiang; Bian, Junjie; Feng, Jufu

    2011-12-01

    Fingerprint recognition is one of the most important technologies in biometric identification and has been widely applied in commercial and forensic areas. Fingerprint classification, as the fundamental procedure in fingerprint recognition, can sharply decrease the number of candidates for fingerprint matching and improve the efficiency of fingerprint recognition. Most fingerprint classification algorithms are based on the number and position of singular points. Because singular point detection methods commonly consider only local information, these classification algorithms are sensitive to noise. In this paper, we propose a novel fingerprint classification algorithm combining the local and global information of a fingerprint. First, we use local information to detect singular points and measure their quality, considering orientation structure and image texture in adjacent areas. Next, a global orientation model is adopted to measure the reliability of the group of singular points. Finally, the local quality and global reliability are weighted to classify the fingerprint. Experiments demonstrate the accuracy and effectiveness of our algorithm, especially for poor quality fingerprint images.

  18. Hyperspectral Image Classification Based on the Combination of Spatial-spectral Feature and Sparse Representation

    Directory of Open Access Journals (Sweden)

    YANG Zhaoxia

    2015-07-01

    Full Text Available In order to avoid the problem of being over-dependent on high-dimensional spectral features in traditional hyperspectral image classification, a novel approach based on the combination of spatial-spectral features and sparse representation is proposed in this paper. Firstly, we extract the spatial-spectral feature by reorganizing the local image patch with the first d principal components (PCs) into a vector representation, followed by a sorting scheme to make the vector invariant to local image rotation. Secondly, we learn the dictionary through a supervised method, and use it to code the features from test samples afterwards. Finally, we embed the resulting sparse feature coding into the support vector machine (SVM) for hyperspectral image classification. Experiments using three hyperspectral data sets show that the proposed method can effectively improve the classification accuracy compared with traditional classification methods.
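
    One way to read the first step is that sorting the patch values removes their dependence on pixel position, so a rotated patch yields the same vector. The sketch below illustrates that idea only; the per-component sort is an assumption, since the abstract does not specify the authors' sorting scheme, and the patch values are made up.

```python
def sorted_patch_feature(patch):
    """Flatten a patch into a rotation-invariant vector.

    patch[row][col] is a list of d principal-component values for one pixel.
    For each component, the patch values are sorted and then concatenated.
    """
    d = len(patch[0][0])
    feature = []
    for band in range(d):
        values = sorted(pixel[band] for row in patch for pixel in row)
        feature.extend(values)
    return feature


def rotate90(patch):
    """Rotate a square patch by 90 degrees clockwise."""
    return [list(row) for row in zip(*patch[::-1])]


# Hypothetical 2x2 patch with d = 2 principal components per pixel.
patch = [[[0.3, 1.0], [0.1, 4.0]],
         [[0.7, 2.0], [0.5, 3.0]]]
print(sorted_patch_feature(patch))
print(sorted_patch_feature(rotate90(patch)))
```

    Both calls print the same vector, which is the invariance property the abstract describes; the cost is that the spatial arrangement within the patch is discarded.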

  19. Rule-based land cover classification from very high-resolution satellite image with multiresolution segmentation

    Science.gov (United States)

    Haque, Md. Enamul; Al-Ramadan, Baqer; Johnson, Brian A.

    2016-07-01

    Multiresolution segmentation and rule-based classification techniques are used to classify objects from very high-resolution satellite images of urban areas. Custom rules are developed using different spectral, geometric, and textural features with five scale parameters, which exploit varying classification accuracy. Principal component analysis is used to select the most important features out of a total of 207 different features. In particular, seven different object types are considered for classification. The overall classification accuracy achieved for the rule-based method is 95.55% and 98.95% for seven and five classes, respectively. Other classifiers that are not using rules perform at 84.17% and 97.3% accuracy for seven and five classes, respectively. The results exploit coarse segmentation for higher scale parameter and fine segmentation for lower scale parameter. The major contribution of this research is the development of rule sets and the identification of major features for satellite image classification where the rule sets are transferable and the parameters are tunable for different types of imagery. Additionally, the individual objectwise classification and principal component analysis help to identify the required object from an arbitrary number of objects within images given ground truth data for the training.

  20. Classification of Two Comic Books based on Convolutional Neural Networks

    Directory of Open Access Journals (Sweden)

    Miki UENO

    2017-03-01

    Full Text Available Non-photographic images are powerful representations that depict various situations. Thus, understanding intellectual products such as comics and picture books is one of the important topics in the field of artificial intelligence. Hence, stepwise analysis of a comic story, i.e., features of a part of the image, information features, features relating to continuous scenes, etc., was pursued. In particular, the length and each scene of four-scene comics are limited so as to ensure a clear interpretation of the contents. In this study, as the first step in this direction, the problem of classifying two four-scene comics by the same artists was taken as an example. Several classifiers were constructed using a Convolutional Neural Network (CNN), and the results of classification by a human annotator and by a computational method were compared. From these experiments, we have shown that a CNN is an efficient way to classify non-photographic grayscale images, and we identified characteristic features of the images that were classified incorrectly.

  1. MR imaging-based diagnosis and classification of meniscal tears.

    Science.gov (United States)

    Nguyen, Jie C; De Smet, Arthur A; Graf, Ben K; Rosas, Humberto G

    2014-01-01

    Magnetic resonance (MR) imaging is currently the modality of choice for detecting meniscal injuries and planning subsequent treatment. A thorough understanding of the imaging protocols, normal meniscal anatomy, surrounding anatomic structures, and anatomic variants and pitfalls is critical to ensure diagnostic accuracy and prevent unnecessary surgery. High-spatial-resolution imaging of the meniscus can be performed using fast spin-echo and three-dimensional MR imaging sequences. Normal anatomic structures that can mimic a tear include the meniscal ligament, meniscofemoral ligaments, popliteomeniscal fascicles, and meniscomeniscal ligament. Anatomic variants and pitfalls that can mimic a tear include discoid meniscus, meniscal flounce, a meniscal ossicle, and chondrocalcinosis. When a meniscal tear is identified, accurate description and classification of the tear pattern can guide the referring clinician in patient education and surgical planning. For example, longitudinal tears are often amenable to repair, whereas horizontal and radial tears may require partial meniscectomy. Tear patterns include horizontal, longitudinal, radial, root, complex, displaced, and bucket-handle tears. Occasionally, meniscal tears can be difficult to detect at imaging; however, secondary indirect signs, such as a parameniscal cyst, meniscal extrusion, or linear subchondral bone marrow edema, should increase the radiologist's suspicion for an underlying tear. Awareness of common diagnostic errors can ensure accurate diagnosis of meniscal tears. Online supplemental material is available for this article. ©RSNA, 2014.

  2. Object-Based Classification as an Alternative Approach to the Traditional Pixel-Based Classification to Identify Potential Habitat of the Grasshopper Sparrow

    Science.gov (United States)

    Jobin, Benoît; Labrecque, Sandra; Grenier, Marcelle; Falardeau, Gilles

    2008-01-01

    The traditional method of identifying wildlife habitat distribution over large regions consists of pixel-based classification of satellite images into a suite of habitat classes used to select suitable habitat patches. Object-based classification is a new method that can achieve the same objective based on the segmentation of spectral bands of the image, creating polygons that are homogeneous with regard to spatial or spectral characteristics. The segmentation algorithm does not rely solely on the single pixel value, but also on shape, texture, and pixel spatial continuity. Object-based classification is a knowledge-based process in which an interpretation key is developed using ground control points and objects are assigned to specific classes according to threshold values of determined spectral and/or spatial attributes. We developed a model using the eCognition software to identify suitable habitats for the Grasshopper Sparrow, a rare and declining species found in southwestern Québec. The model was developed in a region with known breeding sites and applied to other images covering adjacent regions where potential breeding habitats may be present. We were successful in locating potential habitats in areas where dairy farming prevailed but failed in an adjacent region covered by a distinct Landsat scene and dominated by annual crops. We discuss the added value of this method, such as the possibility of using the contextual information associated with objects and the ability to eliminate unsuitable areas in the segmentation and land cover classification processes, as well as technical and logistical constraints. A series of recommendations on the use of this method and on conservation issues of Grasshopper Sparrow habitat is also provided.

  3. Waste-acceptance criteria and risk-based thinking for radioactive-waste classification

    International Nuclear Information System (INIS)

    Lowenthal, M.D.

    1998-01-01

    The US system of radioactive-waste classification and its development provide a reference point for the discussion of risk-based thinking in waste classification. The official US system is described and waste-acceptance criteria for disposal sites are introduced because they constitute a form of de facto waste classification. Risk-based classification is explored and it is found that a truly risk-based system is context-dependent: risk depends not only on the waste-management activity but, for some activities such as disposal, also on the specific physical context. Some of the elements of the official US system incorporate risk-based thinking but, like many proposed alternative schemes, ignore the physical context of disposal. The waste-acceptance criteria for disposal sites do account for this context dependence and could be used as a risk-based classification scheme for disposal. While different classes would be necessary for different management activities, the waste-acceptance criteria would obviate the need for the current system and could better match wastes to disposal environments, saving money or improving safety or both.

  4. Breast cancer: new concepts in classification

    Directory of Open Access Journals (Sweden)

    Daniella Serafin Couto Vieira

    2008-01-01

    Full Text Available Breast carcinoma is the most common malignant neoplasm in women. Molecular studies of breast carcinoma, based on gene expression profiling by cDNA microarray, have defined at least five distinct subgroups: luminal A, luminal B, HER2-overexpressing, basal, and normal breast-like. The tissue microarray (TMA) technique, first described in 1998, has made it possible to study the protein expression profiles of different neoplasms across many carcinoma samples. In breast carcinoma, TMAs have been used to validate the findings of the preliminary studies, thereby identifying new phenotypic subtypes of breast carcinoma. Among the classically described subtypes, the basal group is one of the most intriguing tumor subtypes and is frequently associated with a worse prognosis and an absence of defined therapeutic targets. The histopathological classification of breast carcinoma has poor predictive value. Therefore, combining histological diagnosis with molecular techniques in anatomic pathology laboratories, by means of immunohistochemistry, can determine the molecular profile of breast carcinoma, with the aim of improving therapeutic response. This study aimed to summarize the most recent knowledge underlying the new concepts in the classification of breast carcinoma.

  5. Color Independent Components Based SIFT Descriptors for Object/Scene Classification

    Science.gov (United States)

    Ai, Dan-Ni; Han, Xian-Hua; Ruan, Xiang; Chen, Yen-Wei

    In this paper, we present a novel color independent components based SIFT descriptor (termed CIC-SIFT) for object/scene classification. We first learn an efficient color transformation matrix based on independent component analysis (ICA), which is adaptive to each category in a database. The ICA-based color transformation can enhance contrast between the objects and the background in an image. Then we compute CIC-SIFT descriptors over all three transformed color independent components. Since the ICA-based color transformation can boost the objects and suppress the background, the proposed CIC-SIFT can extract more effective and discriminative local features for object/scene classification. The comparison is performed among seven SIFT descriptors, and the experimental classification results show that our proposed CIC-SIFT is superior to other conventional SIFT descriptors.

  6. Body mass index: different nutritional status according to WHO, OPAS and Lipschitz classifications in gastrointestinal cancer patients.

    Science.gov (United States)

    Barao, Katia; Forones, Nora Manoukian

    2012-01-01

    The body mass index (BMI) is the most common marker used to diagnose nutritional status. Its great advantages are ease of measurement, low cost, good correlation with fat mass, and association with morbidity and mortality. The aim was to compare BMI differences according to the WHO, OPAS and Lipschitz classifications. A prospective study of 352 patients with esophageal, gastric or colorectal cancer was carried out. The BMI was calculated and analyzed using the WHO, Lipschitz and OPAS classifications. The mean age was 62.1 ± 12.4 years, and 59% of the patients were older than 59 years. The BMI did not differ between the genders in cancer patients older than 65 years. A different cut-off must be used for these patients, because undernourished patients may be wrongly considered well nourished.
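
    A minimal sketch of how the three classifications can disagree for the same patient. The cut-offs below are the values commonly cited for each scheme; they are assumptions here, since the abstract does not list them, and the function names and the example patient are hypothetical.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in metres squared."""
    return weight_kg / height_m ** 2


def classify_who(value: float) -> str:
    # Assumed WHO adult cut-offs: 18.5, 25 and 30 kg/m^2.
    if value < 18.5:
        return "underweight"
    if value < 25.0:
        return "normal"
    if value < 30.0:
        return "overweight"
    return "obese"


def classify_lipschitz(value: float) -> str:
    # Assumed Lipschitz cut-offs for older adults: below 22 and above 27 kg/m^2.
    if value < 22.0:
        return "underweight"
    if value <= 27.0:
        return "normal"
    return "overweight"


def classify_opas(value: float) -> str:
    # Assumed OPAS cut-offs for older adults: 23, 28 and 30 kg/m^2.
    if value <= 23.0:
        return "underweight"
    if value < 28.0:
        return "normal"
    if value < 30.0:
        return "overweight"
    return "obese"


# Hypothetical patient: 58 kg, 1.65 m.
value = bmi(58.0, 1.65)
for name, rule in [("WHO", classify_who),
                   ("Lipschitz", classify_lipschitz),
                   ("OPAS", classify_opas)]:
    print(f"{name}: BMI {value:.1f} -> {rule(value)}")
```

    Under these assumed cut-offs the same BMI counts as normal by WHO but as underweight by the two schemes designed for older adults, which is the kind of disagreement the abstract warns about.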

  7. A texton-based approach for the classification of lung parenchyma in CT images

    DEFF Research Database (Denmark)

    Gangeh, Mehrdad J.; Sørensen, Lauge; Shaker, Saher B.

    2010-01-01

    In this paper, a texton-based classification system based on raw pixel representation along with a support vector machine with radial basis function kernel is proposed for the classification of emphysema in computed tomography images of the lung. The proposed approach is tested on 168 annotated...... regions of interest consisting of normal tissue, centrilobular emphysema, and paraseptal emphysema. The results show the superiority of the proposed approach to common techniques in the literature including moments of the histogram of filter responses based on Gaussian derivatives. The performance...

  8. Sequential Classification of Palm Gestures Based on A* Algorithm and MLP Neural Network for Quadrocopter Control

    Directory of Open Access Journals (Sweden)

    Wodziński Marek

    2017-06-01

    Full Text Available This paper presents an alternative approach to sequential data classification, based on traditional machine learning algorithms (neural networks, principal component analysis, multivariate Gaussian anomaly detector) and on finding the shortest path in a directed acyclic graph using the A* algorithm with a regression-based heuristic. Palm gestures were used as an example of the sequential data and a quadrocopter was the controlled object. The study includes creation of a conceptual model and practical construction of a system using the GPU to ensure real-time operation. The results present the classification accuracy of chosen gestures and a comparison of the computation time between the CPU- and GPU-based solutions.
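
    The shortest-path step can be sketched as a generic A* search over a small directed acyclic graph. This is an illustration only: the graph, the edge costs and the heuristic table are hypothetical, and a hand-written table stands in for the regression-based heuristic described in the abstract.

```python
import heapq


def a_star(graph, heuristic, start, goal):
    """Return (cost, path) of the cheapest route from start to goal, or None.

    graph maps a node to a list of (neighbor, edge_cost) pairs;
    heuristic maps a node to an estimate of the remaining cost to the goal.
    """
    frontier = [(heuristic[start], 0.0, start, [start])]
    best_cost = {start: 0.0}
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if cost > best_cost.get(node, float("inf")):
            continue  # a cheaper route to this node was already expanded
        for neighbor, step in graph.get(node, []):
            new_cost = cost + step
            if new_cost < best_cost.get(neighbor, float("inf")):
                best_cost[neighbor] = new_cost
                priority = new_cost + heuristic[neighbor]
                heapq.heappush(frontier, (priority, new_cost, neighbor, path + [neighbor]))
    return None


# Hypothetical DAG; the heuristic never overestimates the remaining cost.
graph = {
    "s": [("a", 1.0), ("b", 4.0)],
    "a": [("b", 1.0), ("g", 5.0)],
    "b": [("g", 1.0)],
    "g": [],
}
heuristic = {"s": 2.0, "a": 2.0, "b": 1.0, "g": 0.0}
print(a_star(graph, heuristic, "s", "g"))
```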

  9. RRHGE: A Novel Approach to Classify the Estrogen Receptor Based Breast Cancer Subtypes

    Directory of Open Access Journals (Sweden)

    Ashish Saini

    2014-01-01

    Full Text Available Background. Breast cancer is the most common type of cancer among females and has a high mortality rate. It is essential to classify the estrogen receptor based breast cancer subtypes into correct subclasses, so that the right treatments can be applied to lower the mortality rate. Using gene signatures derived from gene interaction networks to classify breast cancers has proven to be more reproducible and can achieve higher classification performance. However, gene interaction networks usually contain many false-positive interactions that have no biological meaning. Therefore, it is a challenge to incorporate the reliability assessment of interactions when deriving gene signatures from gene interaction networks. How to effectively extract gene signatures from available resources is critical to the success of cancer classification. Methods. We propose a novel method to measure and extract the reliable (biologically true or valid) interactions from gene interaction networks and incorporate the extracted reliable gene interactions into our proposed RRHGE algorithm to identify significant gene signatures from microarray gene expression data for classifying ER+ and ER− breast cancer samples. Results. The evaluation on real breast cancer samples showed that our RRHGE algorithm achieved higher classification accuracy than the existing approaches.

  10. Feature selection for neural network based defect classification of ceramic components using high frequency ultrasound.

    Science.gov (United States)

    Kesharaju, Manasa; Nagarajah, Romesh

    2015-09-01

    The motivation for this research stems from a need for a non-destructive testing method capable of detecting and locating any defects and microstructural variations within armour ceramic components before issuing them to the soldiers who rely on them for their survival. The development of an automated ultrasonic inspection based classification system would make it possible to check each ceramic component and immediately alert the operator to the presence of defects. Generally, in many classification problems a choice of features or dimensionality reduction is significant and simultaneously very difficult, as a substantial computational effort is required to evaluate possible feature subsets. In this research, a combination of artificial neural networks and genetic algorithms is used to optimize the feature subset used in classification of various defects in reaction-sintered silicon carbide ceramic components. Initially, wavelet-based feature extraction is performed on the region of interest. An artificial neural network classifier is employed to evaluate the performance of these features. Genetic algorithm based feature selection is then performed. Principal Component Analysis, a popular technique for feature selection, is compared with the genetic algorithm based technique in terms of classification accuracy and selection of the optimal number of features. The experimental results confirm that features identified by Principal Component Analysis lead to better classification performance (96%) than those identified by the genetic algorithm (94%). Copyright © 2015 Elsevier B.V. All rights reserved.

  11. Classification and Quality Evaluation of Tobacco Leaves Based on Image Processing and Fuzzy Comprehensive Evaluation

    Science.gov (United States)

    Zhang, Fan; Zhang, Xinhong

    2011-01-01

    Most classification, quality evaluation and grading of flue-cured tobacco leaves is performed manually, relying on the judgmental experience of experts, and is inevitably limited by personal, physical and environmental factors. The classification and the quality evaluation are therefore subjective and experience-based. In this paper, an automatic classification method for tobacco leaves based on digital image processing and fuzzy set theory is presented. A grading system based on image processing techniques was developed for automatically inspecting and grading flue-cured tobacco leaves. This system uses machine vision for the extraction and analysis of color, size, shape and surface texture. Fuzzy comprehensive evaluation provides a high level of confidence in decision making based on fuzzy logic. A neural network is used to estimate and forecast the membership function of the features of tobacco leaves in the fuzzy sets. The experimental results of the two-level fuzzy comprehensive evaluation (FCE) show that the classification accuracy is about 94% for the trained tobacco leaves and about 72% for the non-trained tobacco leaves. We believe that fuzzy comprehensive evaluation is a viable approach to the automatic classification and quality evaluation of tobacco leaves. PMID:22163744
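
    A two-level fuzzy comprehensive evaluation can be sketched as follows, assuming the weighted-average composition operator (the abstract does not say which operator the authors used). The feature groups, weights, membership degrees and grade names are all hypothetical; in the paper the memberships are estimated by a neural network.

```python
def fuzzy_evaluate(weights, membership):
    """Combine a membership matrix (one row per factor, one column per grade)
    with factor weights, using the weighted-average operator."""
    n_grades = len(membership[0])
    return [sum(w * row[j] for w, row in zip(weights, membership))
            for j in range(n_grades)]


def best_grade(scores, grades):
    """Pick the grade with the largest evaluation score."""
    return grades[max(range(len(scores)), key=scores.__getitem__)]


grades = ["A", "B", "C"]  # hypothetical quality grades

# Level 1: evaluate each feature group separately (hypothetical numbers).
color = fuzzy_evaluate([0.6, 0.4], [[0.7, 0.2, 0.1],
                                    [0.5, 0.4, 0.1]])
shape = fuzzy_evaluate([0.5, 0.5], [[0.2, 0.6, 0.2],
                                    [0.4, 0.4, 0.2]])

# Level 2: the level-1 results become the rows of a new membership matrix.
overall = fuzzy_evaluate([0.7, 0.3], [color, shape])
print(overall, best_grade(overall, grades))
```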

  12. Chinese wine classification system based on micrograph using combination of shape and structure features

    Science.gov (United States)

    Wan, Yi

    2011-06-01

    Chinese wines can be classified or graded by their micrographs. Micrographs of Chinese wines show floccules, sticks and granules of varying shape and size. Because different wines have different microstructures and micrographs, we study the classification of Chinese wines based on the micrographs. The shape and structure of the wine particles in the microstructure are the most important features for recognition and classification of wines. We therefore introduce a feature extraction method that describes the structure and region shape of a micrograph efficiently. First, the micrographs are enhanced using total variation denoising and segmented using a modified Otsu's method based on the Rayleigh distribution. Then features are extracted using the method proposed in the paper, based on area, perimeter and traditional shape features. A total of 26 features of eight kinds are selected. Finally, a Chinese wine classification system based on micrographs, using a combination of shape and structure features and a BP neural network, is presented. We compare the recognition results for different choices of features (traditional shape features or proposed features). The experimental results show that a better classification rate is achieved using the combined features proposed in this paper.
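
    For orientation, the standard Otsu threshold can be sketched as below. This is the classical method, not the Rayleigh-based modification used in the paper, and the eight-level histogram is a hypothetical toy input.

```python
def otsu_threshold(histogram):
    """Return the grey level t that maximizes the between-class variance
    when pixels with level <= t form one class and the rest the other."""
    total = sum(histogram)
    total_sum = sum(level * count for level, count in enumerate(histogram))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0.0
    for t, count in enumerate(histogram):
        w0 += count          # number of pixels in the lower class
        sum0 += t * count    # sum of grey levels in the lower class
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue         # one class is empty, variance undefined
        mean0 = sum0 / w0
        mean1 = (total_sum - sum0) / w1
        # Between-class variance up to the constant factor 1 / total**2.
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Hypothetical bimodal histogram over grey levels 0..7.
print(otsu_threshold([5, 5, 0, 0, 0, 0, 5, 5]))
```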

  13. Improving Classification of Protein Interaction Articles Using Context Similarity-Based Feature Selection.

    Science.gov (United States)

    Chen, Yifei; Sun, Yuxing; Han, Bing-Qing

    2015-01-01

    Protein interaction article classification is a text classification task in the biological domain to determine which articles describe protein-protein interactions. Since the feature space in text classification is high-dimensional, feature selection is widely used for reducing the dimensionality of features to speed up computation without sacrificing classification performance. Many existing feature selection methods are based on the statistical measure of document frequency and term frequency. One potential drawback of these methods is that they treat features separately. Hence, first we design a similarity measure between the context information to take word cooccurrences and phrase chunks around the features into account. Then we introduce the similarity of context information to the importance measure of the features to substitute the document and term frequency. Hence we propose new context similarity-based feature selection methods. Their performance is evaluated on two protein interaction article collections and compared against the frequency-based methods. The experimental results reveal that the context similarity-based methods perform better in terms of the F1 measure and the dimension reduction rate. Benefiting from the context information surrounding the features, the proposed methods can select distinctive features effectively for protein interaction article classification.

  14. Prognostic significance of visceral pleural invasion in the forthcoming (seventh) edition of TNM classification for lung cancer.

    Science.gov (United States)

    Shim, Hyo Sup; Park, In Kyu; Lee, Chang Young; Chung, Kyung Young

    2009-08-01

    The next revision to the TNM classification for lung cancer (the seventh edition) is scheduled to be released in 2009. However, the definition of visceral pleural invasion (VPI), which is a non-size-based T2 descriptor, still lacks detail, and its validation is not included. We analyzed 1046 cases of non-small cell lung cancer (NSCLC) with T1, T2, or T3 disease from 1990 to 2005, subclassified into p0-p3 according to the degree of pleural invasion. Survival analyses were performed using the Kaplan-Meier method. Then, all patients were subdivided into nine groups according to tumor size and pleural invasion, and we compared survival differences, primarily focusing on T2a and T2b disease according to the seventh edition. There was no survival difference between patients with p1 and p2; thus we regarded p1 or p2 as VPI. There was a survival difference between two groups that are expected to be classified as T2b. The behavior of tumors larger than 5 cm but 7 cm or less with VPI was similar to that of T3 tumors. VPI is a poor prognostic factor in NSCLC, and penetration through the elastic layer of the visceral pleura regardless of its exposure on the pleural surface (p1 and p2) should be defined as VPI. This study also indicates that VPI influences T stage depending on tumor size, and it can be suggested that tumors larger than 5 cm but 7 cm or less with VPI should be upgraded to T3 stage.
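
    The Kaplan-Meier estimate mentioned above can be sketched in a few lines. The follow-up times and event flags are hypothetical toy data, not values from the study, and the function name is an assumption.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each time where at least one death occurred.

    events[i] is True if a death was observed at times[i],
    and False if the patient was censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += int(data[i][1])
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # deaths and censored patients both leave the risk set
    return curve


# Hypothetical follow-up in months; False marks a censored observation.
print(kaplan_meier([2, 3, 3, 5, 8], [True, True, False, True, False]))
```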

  15. Yarn-dyed fabric defect classification based on convolutional neural network

    Science.gov (United States)

    Jing, Junfeng; Dong, Amei; Li, Pengfei; Zhang, Kaibing

    2017-09-01

    Considering that manual inspection of yarn-dyed fabric can be time-consuming and inefficient, we propose a yarn-dyed fabric defect classification method that uses a convolutional neural network (CNN) based on a modified AlexNet. The CNN shows a powerful ability to perform feature extraction and fusion by simulating the learning mechanism of the human brain. The local response normalization layers in AlexNet are replaced by batch normalization layers, which can enhance both the computational efficiency and the classification accuracy. In the training process of the network, the characteristics of the defect are extracted step by step, and the essential features of the image can be obtained from the fusion of the edge details with several convolution operations. Then the max-pooling layers, the dropout layers, and the fully connected layers are employed in the classification model to reduce the computation cost and extract more precise features of the defective fabric. Finally, the results of the defect classification are predicted by the softmax function. The experimental results show promising performance with an acceptable average classification rate and strong robustness on yarn-dyed fabric defect classification.
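
    Only the final prediction step is sketched here: a numerically stable softmax that turns the scores of the last fully connected layer into class probabilities. The three scores and the defect class names are hypothetical; the network itself is not reproduced.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


classes = ["hole", "stain", "broken yarn"]   # hypothetical defect classes
scores = [2.0, 1.0, 0.1]                     # hypothetical network outputs
probs = softmax(scores)
predicted = classes[max(range(len(probs)), key=probs.__getitem__)]
print(probs, predicted)
```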

  16. Wavelet-based multicomponent denoising on GPU to improve the classification of hyperspectral images

    Science.gov (United States)

    Quesada-Barriuso, Pablo; Heras, Dora B.; Argüello, Francisco; Mouriño, J. C.

    2017-10-01

    Supervised classification allows handling a wide range of remote sensing hyperspectral applications. Enhancing the spatial organization of the pixels over the image has proven to be beneficial for the interpretation of the image content, thus increasing the classification accuracy. Denoising in the spatial domain of the image has been shown to enhance the structures in the image. This paper proposes a multi-component denoising approach in order to increase the classification accuracy when a classification method is applied. It is computed on multicore CPUs and NVIDIA GPUs. The method combines feature extraction based on a 1D discrete wavelet transform (DWT) applied in the spectral dimension, followed by an Extended Morphological Profile (EMP) and a classifier (SVM or ELM). The multi-component noise reduction is applied to the EMP just before the classification. The denoising recursively applies a separable 2D DWT, after which the number of wavelet coefficients is reduced by using a threshold. Finally, inverse 2D-DWT filters are applied to reconstruct the noise-free original component. The computational cost of the classifiers, as well as the cost of the whole classification chain, is high, but it is reduced, achieving real-time behavior for some applications, through computation on NVIDIA multi-GPU platforms.
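
    The transform, threshold and inverse-transform idea can be illustrated in one dimension. This sketch assumes a single-level Haar wavelet and hard thresholding on an even-length signal; the paper applies a recursive separable 2D DWT on GPUs and does not state the wavelet or threshold rule in the abstract, so those choices and all names here are assumptions.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(signal):
    """Single-level Haar DWT of an even-length signal: (approximation, detail)."""
    pairs = range(0, len(signal), 2)
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in pairs]
    return approx, detail


def haar_inverse(approx, detail):
    """Reconstruct the signal from its Haar coefficients."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return out


def denoise(signal, threshold):
    """Zero the detail coefficients whose magnitude is at most the threshold."""
    approx, detail = haar_forward(signal)
    kept = [d if abs(d) > threshold else 0.0 for d in detail]
    return haar_inverse(approx, kept)


# Hypothetical signal: a small fluctuation (4.0, 4.2) and a real edge (10.0, 2.0).
print(denoise([4.0, 4.2, 10.0, 2.0], threshold=1.0))
```

    The small fluctuation is smoothed to its mean while the large jump survives, because only its detail coefficient exceeds the threshold.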

  17. Faults Classification Of Power Electronic Circuits Based On A Support Vector Data Description Method

    Directory of Open Access Journals (Sweden)

    Cui Jiang

    2015-06-01

    Full Text Available Power electronic circuits (PECs) are prone to various failures, whose classification is of paramount importance. This paper presents a data-driven fault diagnosis technique, which employs a support vector data description (SVDD) method to perform fault classification of PECs. In the presented method, fault signals (e.g. currents, voltages, etc.) are collected from accessible nodes of circuits, and then signal processing techniques (e.g. Fourier analysis, wavelet transform, etc.) are adopted to extract feature samples, which are subsequently used to perform offline machine learning. Finally, the SVDD classifier is used to implement the fault classification task. However, in some cases, the conventional SVDD cannot achieve good classification performance, because this classifier may generate some so-called refusal areas (RAs), and in our design these RAs are resolved with the one-against-one support vector machine (SVM) classifier. The experimental results obtained from simulated and actual circuits demonstrate that the improved SVDD has a classification performance close to the conventional one-against-one SVM, and can be applied to fault classification of PECs in practice.

  18. Classification Framework for ICT-Based Learning Technologies for Disabled People

    Science.gov (United States)

    Hersh, Marion

    2017-01-01

    The paper presents the first systematic approach to the classification of inclusive information and communication technologies (ICT)-based learning technologies and ICT-based learning technologies for disabled people which covers both assistive and general learning technologies, is valid for all disabled people and considers the full range of…

  19. International Association for the Study of Lung Cancer/American Thoracic Society/European Respiratory Society international multidisciplinary classification of lung adenocarcinoma.

    Science.gov (United States)

    Travis, William D; Brambilla, Elisabeth; Noguchi, Masayuki; Nicholson, Andrew G; Geisinger, Kim R; Yatabe, Yasushi; Beer, David G; Powell, Charles A; Riely, Gregory J; Van Schil, Paul E; Garg, Kavita; Austin, John H M; Asamura, Hisao; Rusch, Valerie W; Hirsch, Fred R; Scagliotti, Giorgio; Mitsudomi, Tetsuya; Huber, Rudolf M; Ishikawa, Yuichi; Jett, James; Sanchez-Cespedes, Montserrat; Sculier, Jean-Paul; Takahashi, Takashi; Tsuboi, Masahiro; Vansteenkiste, Johan; Wistuba, Ignacio; Yang, Pan-Chyr; Aberle, Denise; Brambilla, Christian; Flieder, Douglas; Franklin, Wilbur; Gazdar, Adi; Gould, Michael; Hasleton, Philip; Henderson, Douglas; Johnson, Bruce; Johnson, David; Kerr, Keith; Kuriyama, Keiko; Lee, Jin Soo; Miller, Vincent A; Petersen, Iver; Roggli, Victor; Rosell, Rafael; Saijo, Nagahiro; Thunnissen, Erik; Tsao, Ming; Yankelewitz, David

    2011-02-01

    % disease-specific survival, respectively. AIS and MIA are usually nonmucinous but rarely may be mucinous. Invasive adenocarcinomas are classified by predominant pattern after using comprehensive histologic subtyping with lepidic (formerly most mixed subtype tumors with nonmucinous BAC), acinar, papillary, and solid patterns; micropapillary is added as a new histologic subtype. Variants include invasive mucinous adenocarcinoma (formerly mucinous BAC), colloid, fetal, and enteric adenocarcinoma. This classification provides guidance for small biopsies and cytology specimens, as approximately 70% of lung cancers are diagnosed in such samples. Non-small cell lung carcinomas (NSCLCs), in patients with advanced-stage disease, are to be classified into more specific types such as adenocarcinoma or squamous cell carcinoma, whenever possible for several reasons: (1) adenocarcinoma or NSCLC not otherwise specified should be tested for epidermal growth factor receptor (EGFR) mutations as the presence of these mutations is predictive of responsiveness to EGFR tyrosine kinase inhibitors, (2) adenocarcinoma histology is a strong predictor for improved outcome with pemetrexed therapy compared with squamous cell carcinoma, and (3) potential life-threatening hemorrhage may occur in patients with squamous cell carcinoma who receive bevacizumab. If the tumor cannot be classified based on light microscopy alone, special studies such as immunohistochemistry and/or mucin stains should be applied to classify the tumor further. Use of the term NSCLC not otherwise specified should be minimized. This new classification strategy is based on a multidisciplinary approach to diagnosis of lung adenocarcinoma that incorporates clinical, molecular, radiologic, and surgical issues, but it is primarily based on histology. This classification is intended to support clinical practice, and research investigation and clinical trials. 
As EGFR mutation is a validated predictive marker for response and progression

  20. Organizational Data Classification Based on the Importance Concept of Complex Networks.

    Science.gov (United States)

    Carneiro, Murillo Guimaraes; Zhao, Liang

    2017-08-01

    Data classification is a common task, which can be performed by both computers and human beings. However, a fundamental difference between them can be observed: computer-based classification considers only physical features (e.g., similarity, distance, or distribution) of input data; by contrast, brain-based classification takes into account not only physical features, but also the organizational structure of data. In this paper, we figure out the data organizational structure for classification using complex networks constructed from training data. Specifically, an unlabeled instance is classified by the importance concept characterized by Google's PageRank measure of the underlying data networks. Before a test data instance is classified, a network is constructed from vector-based data set and the test instance is inserted into the network in a proper manner. To this end, we also propose a measure, called spatio-structural differential efficiency, to combine the physical and topological features of the input data. Such a method allows for the classification technique to capture a variety of data patterns using the unique importance measure. Extensive experiments demonstrate that the proposed technique has promising predictive performance on the detection of heart abnormalities.
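
    The importance measure named in the abstract, PageRank, can be sketched by power iteration on a small directed network. The three-node network and the damping factor of 0.85 are assumptions for illustration; the construction of the network from training data and the spatio-structural differential efficiency measure are not reproduced.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank.

    links maps each node to the list of nodes it points to.
    Rank held by nodes without outgoing links is spread evenly over all nodes.
    """
    nodes = sorted(set(links) | {v for targets in links.values() for v in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        dangling = sum(rank[u] for u in nodes if not links.get(u))
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n
                    for node in nodes}
        for u in nodes:
            targets = links.get(u, [])
            for v in targets:
                new_rank[v] += damping * rank[u] / len(targets)
        rank = new_rank
    return rank


# Hypothetical network: "a" and "b" point to "c", and "c" points back to "a".
ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
print(ranks)
```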

  1. An application to pulmonary emphysema classification based on model of texton learning by sparse representation

    Science.gov (United States)

    Zhang, Min; Zhou, Xiangrong; Goshima, Satoshi; Chen, Huayue; Muramatsu, Chisako; Hara, Takeshi; Yokoyama, Ryojiro; Kanematsu, Masayuki; Fujita, Hiroshi

    2012-03-01

    We aim to use a new texton-based texture classification method for the classification of pulmonary emphysema in computed tomography (CT) images of the lungs. Unlike conventional computer-aided diagnosis (CAD) methods for pulmonary emphysema classification, in this paper the dictionary of textons is first learned by applying sparse representation (SR) to image patches in the training dataset. Then the SR coefficients of the test images over the dictionary are used to construct the histograms for texture representations. Finally, classification is performed using a nearest neighbor classifier with a histogram dissimilarity measure as the distance. The proposed approach is tested on 3840 annotated regions of interest consisting of normal tissue and mild, moderate and severe pulmonary emphysema of three subtypes. The performance of the proposed system, with an accuracy of about 88%, is higher than that of the state-of-the-art method based on basic rotation-invariant local binary pattern histograms and that of the texture classification method based on texton learning by k-means, which performs almost the best among other approaches in the literature.
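
    The last step, nearest neighbor classification over histograms, can be sketched as follows. The abstract does not name the dissimilarity measure, so the chi-square distance used here is an assumption, and the three-bin histograms and labels are hypothetical.

```python
def chi_square(h1, h2):
    """Chi-square dissimilarity between two histograms of equal length."""
    return 0.5 * sum((a - b) ** 2 / (a + b)
                     for a, b in zip(h1, h2) if a + b > 0)


def nearest_neighbor(query, training):
    """Return the label of the training histogram closest to the query.

    training is a list of (histogram, label) pairs.
    """
    return min(training, key=lambda item: chi_square(query, item[0]))[1]


# Hypothetical normalized texton histograms for two regions of interest.
training = [
    ([0.7, 0.2, 0.1], "normal"),
    ([0.1, 0.3, 0.6], "emphysema"),
]
print(nearest_neighbor([0.6, 0.3, 0.1], training))
```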

  2. Natural Language Processing Based Instrument for Classification of Free Text Medical Records

    Directory of Open Access Journals (Sweden)

    Manana Khachidze

    2016-01-01

    Full Text Available According to the Ministry of Labor, Health and Social Affairs of Georgia, a new health management system has to be introduced in the near future. In this context arises the problem of structuring and classifying documents containing the full history of medical services provided. The present work introduces an instrument for the classification of medical records written in the Georgian language. It is the first attempt at such classification of Georgian-language medical records. In total, 24,855 examination records were studied. The documents were classified into three main groups (ultrasonography, endoscopy, and X-ray) and 13 subgroups using two well-known methods: Support Vector Machine (SVM) and K-Nearest Neighbor (KNN). The results demonstrated that both machine learning methods performed successfully, with SVM slightly ahead. In the process of classification a “shrink” method, based on feature selection, was introduced and applied. At the first stage of classification the results of the “shrink” case were better; however, at the second stage of classification into subclasses, 23% of all documents could not be linked to only one definite individual subclass (liver or binary system) due to common features characterizing these subclasses. The overall results of the study were successful.

  3. Online Learning for Classification of Alzheimer Disease based on Cortical Thickness and Hippocampal Shape Analysis.

    Science.gov (United States)

    Lee, Ga-Young; Kim, Jeonghun; Kim, Ju Han; Kim, Kiwoong; Seong, Joon-Kyung

    2014-01-01

    Mobile healthcare applications are a growing trend, and the prevalence of dementia in modern society is also growing steadily. Among degenerative brain diseases that cause dementia, Alzheimer disease (AD) is the most common. The purpose of this study was to identify AD patients using magnetic resonance imaging in the mobile environment. We propose an incremental classification for mobile healthcare systems. Our classification method is based on incremental learning for AD diagnosis and AD prediction using cortical thickness data and hippocampal shape. We constructed a classifier based on principal component analysis and linear discriminant analysis. We performed initial learning and mobile subject classification. Initial learning is the group learning part on our server. Our smartphone agent implements the mobile classification and shows various results. With cortical thickness data analysis alone, the discrimination accuracy was 87.33% (sensitivity 96.49% and specificity 64.33%). When cortical thickness data and hippocampal shape were analyzed together, the achieved accuracy was 87.52% (sensitivity 96.79% and specificity 63.24%). In this paper, we presented a classification method based on online learning for AD diagnosis that employs both cortical thickness data and hippocampal shape analysis data. Our method was implemented on smartphone devices and discriminated AD patients from the normal group.
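
    As a reminder of how the three reported figures relate, the sketch below computes accuracy, sensitivity and specificity from a confusion matrix. The counts are hypothetical and are not the study's data; they only show that high sensitivity can coexist with much lower specificity.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts.

    tp / fn: patients correctly / wrongly classified;
    tn / fp: normal subjects correctly / wrongly classified.
    """
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),   # share of patients detected
        "specificity": tn / (tn + fp),   # share of normal subjects cleared
    }


# Hypothetical counts: 100 patients and 50 normal subjects.
metrics = diagnostic_metrics(tp=90, fn=10, tn=30, fp=20)
print(metrics)
```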

  4. Machine learning versus knowledge based classification of legal texts

    NARCIS (Netherlands)

    de Maat, E.; Krabben, K.; Winkels, R.; Winkels, R.G.F.

    2010-01-01

    This paper presents results of an experiment in which we used machine learning (ML) techniques to classify sentences in Dutch legislation. These results are compared to the results of a pattern-based classifier. Overall, the ML classifier performs as accurate (>90%) as the pattern based one, but

  5. Aided diagnosis methods of breast cancer based on machine learning

    Science.gov (United States)

    Zhao, Yue; Wang, Nian; Cui, Xiaoyu

    2017-08-01

    In medicine, quickly and accurately determining whether a patient's tumor is malignant or benign is the key to treatment. In this paper, K-Nearest Neighbor (KNN), Linear Discriminant Analysis (LDA), and Logistic Regression were applied to predict the classification of thyroid, Her-2, PR, ER, Ki67, metastasis and lymph nodes in breast cancer, in order to distinguish benign from malignant breast tumors and support the aided diagnosis of breast cancer. The highest classification accuracy of LDA was 88.56%, while KNN and Logistic Regression performed better than LDA, with the best accuracy reaching 96.30%.

  6. Task-Driven Dictionary Learning Based on Mutual Information for Medical Image Classification.

    Science.gov (United States)

    Diamant, Idit; Klang, Eyal; Amitai, Michal; Konen, Eli; Goldberger, Jacob; Greenspan, Hayit

    2017-06-01

    We present a novel variant of the bag-of-visual-words (BoVW) method for automated medical image classification. Our approach improves the BoVW model by learning a task-driven dictionary of the most relevant visual words per task using a mutual information-based criterion. Additionally, we generate relevance maps to visualize and localize the decision of the automatic classification algorithm. These maps demonstrate how the algorithm works and show the spatial layout of the most relevant words. We applied our algorithm to three different tasks: chest x-ray pathology identification (of four pathologies: cardiomegaly, enlarged mediastinum, right consolidation, and left consolidation), liver lesion classification into four categories in computed tomography (CT) images, and benign/malignant classification of clusters of microcalcifications (MCs) in breast mammograms. Validation was conducted on three datasets: 443 chest x-rays, 118 portal phase CT images of liver lesions, and 260 mammography MCs. The proposed method improves on the classical BoVW method for all tested applications. For chest x-ray, an area under the curve of 0.876 was obtained for enlarged mediastinum identification compared to 0.855 using classical BoVW (with p-value 0.01). For MC classification, a significant improvement of 4% was achieved using our new approach (with p-value = 0.03). For liver lesion classification, improvements of 6% in sensitivity and 2% in specificity were obtained (with p-value 0.001). We demonstrated that classification based on an informative, selected set of words results in significant improvement. Our new BoVW approach shows promising results in clinically important domains. Additionally, it can discover relevant parts of images for the task at hand without explicit annotations for training data. This can provide computer-aided support for medical experts in challenging image analysis tasks.
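
    The abstract does not spell out the mutual information criterion, so the following is only a minimal sketch of the general idea: score each visual word by the mutual information between its presence in an image and the image label, then keep the top-scoring words. The binary presence encoding, the function names, and the toy data are all assumptions made for illustration, not the authors' formulation.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        pxy = count / n
        mi += pxy * math.log2(pxy / ((px[x] / n) * (py[y] / n)))
    return mi


def select_words(presence, labels, k):
    """Rank visual words by MI with the label and return the top k indices.

    presence[i][j] is 1 if visual word j occurs in image i, else 0.
    """
    n_words = len(presence[0])
    scores = [
        mutual_information([row[j] for row in presence], labels)
        for j in range(n_words)
    ]
    ranked = sorted(range(n_words), key=lambda j: scores[j], reverse=True)
    return ranked[:k], scores


# Toy data: word 0 tracks the label exactly, words 1 and 2 are uninformative.
presence = [[1, 0, 1], [1, 1, 0], [0, 0, 1], [0, 1, 0]]
labels = [1, 1, 0, 0]
top_words, scores = select_words(presence, labels, k=1)
```

    In this toy case only word 0 carries information about the label, so it is the one retained; a real dictionary would be built from clustered local descriptors rather than hand-written indicators.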

  7. Cancer survivors' experience of exercise-based cancer rehabilitation

    DEFF Research Database (Denmark)

    Midtgaard, Julie; Hammer, Nanna Maria; Andersen, Christina

    2015-01-01

    BACKGROUND: Evidence for the safety and benefits of exercise training as a therapeutic and rehabilitative intervention for cancer survivors is accumulating. However, whereas the evidence for the efficacy of exercise training has been established in several meta-analyses, synthesis of qualitative...... research is lacking. In order to extend healthcare professionals' understanding of the meaningfulness of exercise in cancer survivorship care, this paper aims to identify, appraise and synthesize qualitative studies on cancer survivors' experience of participation in exercise-based rehabilitation. MATERIAL......-based rehabilitation according to cancer survivors. Accordingly, the potential of rebuilding structure in everyday life, creating a normal context and enabling the individual to re-establish confidentiality and trust in their own body and physical potential constitute substantial qualities fundamental...

  8. Tumor Size Evaluation according to the T Component of the Seventh Edition of the International Association for the Study of Lung Cancer's TNM Classification: Interobserver Agreement between Radiologists and Computer-Aided Diagnosis System in Patients with Lung Cancer

    International Nuclear Information System (INIS)

    Kim, Jin Kyoung; Chong, Se Min; Seo, Jae Seung; Lee, Sun Jin; Han, Heon

    2011-01-01

    To assess the interobserver agreement for tumor size evaluation between radiologists and the computer-aided diagnosis (CAD) system based on the 7th edition of the TNM classification by the International Association for the Study of Lung Cancer in patients with lung cancer. We evaluated 20 patients who underwent a lobectomy or pneumonectomy for primary lung cancer. The maximum diameter of each primary tumor was measured by two radiologists and a CAD system on CT, and was staged based on the 7th edition of the TNM classification. The CT size and T-staging of the primary tumors were compared with the pathologic size and staging, and the variability in the sizes and T stages of primary tumors was statistically analyzed between each radiologist's measurement or CAD estimation and the pathologic results. There was no statistically significant interobserver difference in CT size between the two radiologists, between pathologic size and the CT size estimated by the radiologists, or between pathologic and CT staging by the radiologists and the CAD system. However, there was a statistically significant difference between pathologic size and the CT size estimated by the CAD system (p = 0.003). No significant differences were found in the measurement of tumor size between the radiologists or in the assessment of T-staging by the radiologists and the CAD system.

  9. Locality-preserving sparse representation-based classification in hyperspectral imagery

    Science.gov (United States)

    Gao, Lianru; Yu, Haoyang; Zhang, Bing; Li, Qingting

    2016-10-01

    This paper proposes to combine locality-preserving projections (LPP) and sparse representation (SR) for hyperspectral image classification. The LPP is first used to reduce the dimensionality of all the training and testing data by finding the optimal linear approximations to the eigenfunctions of the Laplace Beltrami operator on the manifold, where the high-dimensional data lies. Then, SR codes the projected testing pixels as sparse linear combinations of all the training samples to classify the testing pixels by evaluating which class leads to the minimum approximation error. The integration of LPP and SR represents an innovative contribution to the literature. The proposed approach, called locality-preserving SR-based classification, addresses the imbalance between high dimensionality of hyperspectral data and the limited number of training samples. Experimental results on three real hyperspectral data sets demonstrate that the proposed approach outperforms the original counterpart, i.e., SR-based classification.
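
    The decision rule described above (code a test pixel sparsely over all training samples, then pick the class whose samples give the smallest approximation error) can be sketched as follows. This is an illustrative sketch only: it assumes NumPy is available, uses orthogonal matching pursuit as the sparse coder (the abstract does not name one), omits the LPP projection step entirely, and all function names and data are hypothetical.

```python
import numpy as np


def omp(D, y, n_nonzero):
    """Greedy sparse code of y over the columns of D (orthogonal matching pursuit)."""
    residual = y.astype(float).copy()
    support = []
    coef = np.zeros(D.shape[1])
    sol = np.zeros(0)
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j in support:
            break
        support.append(j)
        # Refit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
    if support:
        coef[support] = sol
    return coef


def src_classify(D, atom_labels, y, n_nonzero=2):
    """Assign y to the class whose atoms alone reconstruct it with least error."""
    coef = omp(D, y, n_nonzero)
    errors = {}
    for c in np.unique(atom_labels):
        mask = atom_labels == c
        errors[int(c)] = float(np.linalg.norm(y - D[:, mask] @ coef[mask]))
    return min(errors, key=errors.get), errors


# Toy dictionary: two unit-norm training samples per class, three "bands".
D = np.array([[1.0, 0.9, 0.0, 0.1],
              [0.0, 0.1, 1.0, 0.9],
              [0.0, 0.0, 0.0, 0.0]])
D = D / np.linalg.norm(D, axis=0)
atom_labels = np.array([0, 0, 1, 1])
label, errors = src_classify(D, atom_labels, np.array([1.0, 0.05, 0.0]))
```

    In the paper's setting the columns of `D` and the test pixel would first be projected by LPP; here they are used in their original space to keep the sketch short.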

  10. Segmentation of Clinical Endoscopic Images Based on the Classification of Topological Vector Features

    Directory of Open Access Journals (Sweden)

    O. A. Dunaeva

    2013-01-01

    Full Text Available In this work, we describe a prototype of a system for the automatic segmentation and annotation of endoscopy images. The algorithm is based on the classification of vectors of topological features of the original image. The image processing scheme includes image preprocessing, calculation of vector descriptors defined for every point of the source image, and the subsequent classification of the descriptors. Image preprocessing includes finding and selecting artifacts and equalizing the image brightness. We give the detailed algorithm for constructing the topological descriptors and the procedure for building the classifier, which combines the AdaBoost scheme with a naive Bayes classifier. In the final section, we show the results of classifying real endoscopic images.

  11. A Feature Selection Method for Large-Scale Network Traffic Classification Based on Spark

    Directory of Open Access Journals (Sweden)

    Yong Wang

    2016-02-01

    Full Text Available Currently, with the rapid growth of data scale in network traffic classification, selecting traffic features efficiently is becoming a major challenge. Although a number of traditional feature selection methods using the Hadoop-MapReduce framework have been proposed, their execution time remains unsatisfactory because of the numerous iterative computations during processing. To address this issue, an efficient feature selection method for network traffic based on a new parallel computing framework called Spark is proposed in this paper. In our approach, the complete feature set is first preprocessed based on Fisher score, and a sequential forward search strategy is employed to generate candidate subsets. The optimal feature subset is then selected using the continuous iterations of the Spark computing framework. The implementation demonstrates that, while maintaining classification accuracy, our method reduces the time cost of modeling and classification and significantly improves the execution efficiency of feature selection.
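
    The two selection steps named above, Fisher-score ranking followed by sequential forward search, can be illustrated on a single machine. This sketch assumes NumPy, uses one common form of the Fisher score (between-class scatter over within-class scatter per feature), and leaves out Spark entirely; the `evaluate` callback and the toy data are hypothetical stand-ins for whatever subset-quality measure the paper uses.

```python
import numpy as np


def fisher_scores(X, y):
    """Per-feature Fisher score: between-class over within-class scatter."""
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        between += len(Xc) * (Xc.mean(axis=0) - overall_mean) ** 2
        within += len(Xc) * Xc.var(axis=0)
    # Small constant avoids division by zero for constant features.
    return between / (within + 1e-12)


def sequential_forward_search(candidates, evaluate, max_features):
    """Greedily add the feature that most improves evaluate(subset)."""
    selected = []
    best = float("-inf")
    remaining = list(candidates)
    while remaining and len(selected) < max_features:
        score, feature = max((evaluate(selected + [f]), f) for f in remaining)
        if score <= best:
            break  # no candidate improves the current subset
        selected.append(feature)
        remaining.remove(feature)
        best = score
    return selected


# Toy data: feature 0 separates the classes, feature 1 does not.
X = np.array([[0.0, 5.0], [0.1, 1.0], [1.0, 5.0], [1.1, 1.0]])
y = np.array([0, 0, 1, 1])
scores = fisher_scores(X, y)
chosen = sequential_forward_search(
    range(X.shape[1]),
    evaluate=lambda subset: sum(scores[f] for f in subset),
    max_features=2,
)
```

    Here the search stops after selecting feature 0 because adding feature 1 does not raise the subset score. In practice `evaluate` would usually be a classifier's validation accuracy on the candidate subset, which is the expensive, repeated computation that a distributed framework is meant to speed up.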

  12. SVM classification model in depression recognition based on mutation PSO parameter optimization

    Directory of Open Access Journals (Sweden)

    Zhang Ming

    2017-01-01

    Full Text Available At present, the clinical diagnosis of depression relies mainly on structured interviews by psychiatrists; the lack of objective diagnostic methods leads to a higher rate of misdiagnosis. In this paper, a method of depression recognition based on a support vector machine (SVM) and a particle swarm optimization (PSO) algorithm with mutation is proposed. To address the problem that the PSO algorithm is easily trapped in local optima, we propose a feedback mutation PSO algorithm (FBPSO) that balances local search and global exploration so that the parameters of the classification model are optimal. We compared the depression classification accuracy of different PSO mutation algorithms and found that the SVM classifier based on the feedback mutation PSO algorithm achieved the highest accuracy. Our study provides an important reference for establishing auxiliary diagnostic tools for depression recognition in clinical diagnosis.
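
    The abstract gives no details of the feedback mutation rule, so the sketch below is not FBPSO. It is a plain PSO with one simple, assumed mutation: when the global best has not improved for `patience` iterations, the worst particle is re-seeded at a random position. All names, parameter defaults, and the quadratic test objective are hypothetical; for SVM tuning the objective would instead be something like the cross-validated error as a function of the SVM parameters.

```python
import random


def pso_with_mutation(objective, bounds, n_particles=20, n_iter=100,
                      w=0.7, c1=1.5, c2=1.5, patience=10, seed=0):
    """Minimize objective over a box; re-seed the worst particle on stagnation."""
    rng = random.Random(seed)
    dim = len(bounds)

    def random_position():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    pos = [random_position() for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    stall = 0

    for _ in range(n_iter):
        improved = False
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Standard velocity update: inertia + cognitive + social terms.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
            if val < gbest_val:
                gbest, gbest_val = pos[i][:], val
                improved = True
        stall = 0 if improved else stall + 1
        if stall >= patience:
            # Assumed mutation step: restart the particle with the worst record.
            worst = max(range(n_particles), key=lambda i: pbest_val[i])
            pos[worst] = random_position()
            vel[worst] = [0.0] * dim
            stall = 0
    return gbest, gbest_val


# Hypothetical smooth objective with its minimum at (3, -1).
best, best_val = pso_with_mutation(
    lambda p: (p[0] - 3.0) ** 2 + (p[1] + 1.0) ** 2,
    bounds=[(-10.0, 10.0), (-10.0, 10.0)],
)
```

    Because the global best is never discarded, the mutation can only add exploration; whether that helps on a real SVM parameter surface depends on how multimodal the cross-validation error is.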

  13. Restaging and Survival Analysis of 4036 Ovarian Cancer Patients According to the 2013 FIGO Classification for Ovarian, Fallopian Tube, and Primary Peritoneal Cancer

    DEFF Research Database (Denmark)

    Rosendahl, Mikkel; Høgdall, Claus Kim; Mosgaard, Berit Jul

    2016-01-01

    OBJECTIVE: With the 2013 International Federation of Gynecology and Obstetrics (FIGO) staging for ovarian, fallopian tube, and primary peritoneal cancer, the number of substages changed from 10 to 14. Any classification of a malignancy should easily assign patients to prognostic groups, refer....... MATERIALS AND METHODS: Demographic, surgical, histological, and survival data from 4036 ovarian cancer patients were used in the analysis. Five-year survival rates (5YSR) and hazard ratios for the old and revised FIGO staging were calculated using Kaplan-Meier curves and Cox regression. RESULTS: A total...

  14. Comparison of Enzymes / Non-Enzymes Proteins Classification Models Based on 3D, Composition, Sequences and Topological Indices

    OpenAIRE

    Munteanu, Cristian Robert

    2014-01-01

    Comparison of Enzymes / Non-Enzymes Proteins Classification Models Based on 3D, Composition, Sequences and Topological Indices, German Conference on Bioinformatics (GCB), Potsdam, Germany (September, 2007)

  15. Application of In-Segment Multiple Sampling in Object-Based Classification

    Directory of Open Access Journals (Sweden)

    Nataša Đurić

    2014-12-01

    Full Text Available When object-based analysis is applied to very high-resolution imagery, pixels within the segments reveal large spectral inhomogeneity; their distribution can be considered complex rather than normal. When normality is violated, the classification methods that rely on the assumption of normally distributed data are not as successful or accurate. It is hard to detect normality violations in small samples. The segmentation process produces segments that vary highly in size; samples can be very big or very small. This paper investigates whether the complexity within the segment can be addressed using multiple random sampling of segment pixels and multiple calculations of similarity measures. In order to analyze the effect sampling has on classification results, statistics and probability value equations of non-parametric two-sample Kolmogorov-Smirnov test and