WorldWideScience

Sample records for cancer classification based

  1. NIM: A Node Influence Based Method for Cancer Classification

    Directory of Open Access Journals (Sweden)

    Yiwen Wang

    2014-01-01

    Full Text Available The classification of different cancer types is of great significance in the medical field. However, the great majority of existing cancer classification methods are clinical-based and have relatively weak diagnostic ability. With the rapid development of gene expression technology, it has become possible to classify different kinds of cancers using DNA microarrays. Our main idea is to approach the problem of cancer classification using gene expression data from a graph-based view. Based on a new node influence model we propose, this paper presents a novel high-accuracy method for cancer classification, which is composed of four parts: the first is to calculate the similarity matrix of all samples, the second is to compute the node influence of training samples, the third is to obtain the similarity between every test sample and each class using a weighted sum of node influence and similarity matrix, and the last is to classify each test sample based on its similarity to every class. The data sets used in our experiments are breast cancer, central nervous system, colon tumor, prostate cancer, acute lymphoblastic leukemia, and lung cancer. Experimental results showed that our node influence based method (NIM) is more efficient and robust than the support vector machine, K-nearest neighbor, C4.5, naive Bayes, and CART.
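The four steps named in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not define the node influence model, so the Gaussian similarity and the "mean similarity to same-class peers" influence used here are hypothetical stand-ins, and the function names are invented for this example.

```python
import math

def similarity(a, b):
    """Gaussian similarity between two expression vectors (assumed choice)."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-d2)

def node_influence(train_x, train_y):
    """Stand-in influence: mean similarity of each training sample to the
    other training samples of its own class (NOT the paper's model)."""
    influence = []
    for i, (xi, yi) in enumerate(zip(train_x, train_y)):
        peers = [
            similarity(xi, xj)
            for j, (xj, yj) in enumerate(zip(train_x, train_y))
            if j != i and yj == yi
        ]
        # A sample with no same-class peers gets a neutral weight of 1.0.
        influence.append(sum(peers) / len(peers) if peers else 1.0)
    return influence

def classify(test_x, train_x, train_y):
    """Score each class by an influence-weighted sum of similarities
    and return the class with the highest score."""
    influence = node_influence(train_x, train_y)
    scores = {}
    for xi, yi, wi in zip(train_x, train_y, influence):
        scores[yi] = scores.get(yi, 0.0) + wi * similarity(test_x, xi)
    return max(scores, key=scores.get)

# Toy data: two well-separated classes in two dimensions.
train_x = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.0, 0.9]]
train_y = ["A", "A", "B", "B"]
label_near_a = classify([0.05, 0.05], train_x, train_y)
label_near_b = classify([0.95, 1.0], train_x, train_y)
```

Because the class score is a plain sum, larger classes accumulate higher scores in this sketch; a real implementation would likely normalize by class size or follow the paper's own weighting.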

  2. Classification of cancerous cells based on the one-class problem approach

    Science.gov (United States)

    Murshed, Nabeel A.; Bortolozzi, Flavio; Sabourin, Robert

    1996-03-01

    One of the most important factors in reducing the effect of cancerous diseases is early diagnosis, which requires a good and robust method. With the advancement of computer technologies and digital image processing, the development of a computer-based system has become feasible. In this paper, we introduce a new approach for the detection of cancerous cells. This approach is based on the one-class problem approach, through which the classification system need only be trained with patterns of cancerous cells. This reduces the burden of the training task by about 50%. Based on this approach, a computer-based classification system is developed, based on the Fuzzy ARTMAP neural networks. Experiments were performed using a set of 542 patterns taken from a sample of breast cancer. Results of the experiment show 98% correct identification of cancerous cells and 95% correct identification of non-cancerous cells.

  3. Cancer Classification Based on Support Vector Machine Optimized by Particle Swarm Optimization and Artificial Bee Colony.

    Science.gov (United States)

    Gao, Lingyun; Ye, Mingquan; Wu, Changrong

    2017-11-29

    Intelligent optimization algorithms have advantages in dealing with complex nonlinear problems accompanied by good flexibility and adaptability. In this paper, the FCBF (Fast Correlation-Based Feature selection) method is used to filter irrelevant and redundant features in order to improve the quality of cancer classification. Then, we perform classification based on SVM (Support Vector Machine) optimized by PSO (Particle Swarm Optimization) combined with ABC (Artificial Bee Colony) approaches, which is represented as PA-SVM. The proposed PA-SVM method is applied to nine cancer datasets, including five datasets of outcome prediction and a protein dataset of ovarian cancer. By comparison with other classification methods, the results demonstrate the effectiveness and the robustness of the proposed PA-SVM method in handling various types of data for cancer classification.

  4. Classification of human cancers based on DNA copy number amplification modeling

    Directory of Open Access Journals (Sweden)

    Knuutila Sakari

    2008-05-01

    Full Text Available Background: DNA amplifications alter gene dosage in cancer genomes by multiplying the gene copy number. Amplifications are quintessential in a considerable number of advanced cancers of various anatomical locations. The aims of this study were to classify human cancers based on their amplification patterns, explore the biological and clinical fundamentals behind their amplification-pattern based classification, and understand the characteristics in human genomic architecture that associate with amplification mechanisms. Methods: We applied a machine learning approach to model DNA copy number amplifications using a data set of binary amplification records at chromosome sub-band resolution from 4400 cases that represent 82 cancer types. Amplification data was fused with background data: clinical, histological and biological classifications, and cytogenetic annotations. Statistical hypothesis testing was used to mine associations between the data sets. Results: Probabilistic clustering of each chromosome identified 111 amplification models and divided the cancer cases into clusters. The distribution of classification terms in the amplification-model based clustering of cancer cases revealed cancer classes that were associated with specific DNA copy number amplification models. Amplification patterns (finite or bounded descriptions of the ranges of the amplifications in the chromosome) were extracted from the clustered data and expressed according to the original cytogenetic nomenclature. This was achieved by maximal frequent itemset mining using the cluster-specific data sets. The boundaries of amplification patterns were shown to be enriched with fragile sites, telomeres, centromeres, and light chromosome bands. Conclusions: Our results demonstrate that amplifications are non-random chromosomal changes and specifically selected in tumor tissue microenvironment. Furthermore, statistical evidence showed that specific chromosomal features

  5. A protein and mRNA expression-based classification of gastric cancer.

    Science.gov (United States)

    Setia, Namrata; Agoston, Agoston T; Han, Hye S; Mullen, John T; Duda, Dan G; Clark, Jeffrey W; Deshpande, Vikram; Mino-Kenudson, Mari; Srivastava, Amitabh; Lennerz, Jochen K; Hong, Theodore S; Kwak, Eunice L; Lauwers, Gregory Y

    2016-07-01

    The overall survival of gastric carcinoma patients remains poor despite improved control over known risk factors and surveillance. This highlights the need for new classifications, driven towards identification of potential therapeutic targets. Using sophisticated molecular technologies and analysis, three groups recently provided genetic and epigenetic molecular classifications of gastric cancer (The Cancer Genome Atlas, 'Singapore-Duke' study, and Asian Cancer Research Group). Suggested by these classifications, here, we examined the expression of 14 biomarkers in a cohort of 146 gastric adenocarcinomas and performed unsupervised hierarchical clustering analysis using less expensive and widely available immunohistochemistry and in situ hybridization. Ultimately, we identified five groups of gastric cancers based on Epstein-Barr virus (EBV) positivity, microsatellite instability, aberrant E-cadherin, and p53 expression; the remaining cases constituted a group characterized by normal p53 expression. In addition, the five categories correspond to the reported molecular subgroups by virtue of clinicopathologic features. Furthermore, evaluation between these clusters and survival using the Cox proportional hazards model showed a trend for superior survival in the EBV and microsatellite-instable related adenocarcinomas. In conclusion, we offer as a proposal a simplified algorithm that is able to reproduce the recently proposed molecular subgroups of gastric adenocarcinoma, using immunohistochemical and in situ hybridization techniques.

  6. Cancer classification in the genomic era: five contemporary problems.

    Science.gov (United States)

    Song, Qingxuan; Merajver, Sofia D; Li, Jun Z

    2015-10-19

    Classification is an everyday instinct as well as a full-fledged scientific discipline. Throughout the history of medicine, disease classification is central to how we develop knowledge, make diagnosis, and assign treatment. Here, we discuss the classification of cancer and the process of categorizing cancer subtypes based on their observed clinical and biological features. Traditionally, cancer nomenclature is primarily based on organ location, e.g., "lung cancer" designates a tumor originating in lung structures. Within each organ-specific major type, finer subgroups can be defined based on patient age, cell type, histological grades, and sometimes molecular markers, e.g., hormonal receptor status in breast cancer or microsatellite instability in colorectal cancer. In the past 15+ years, high-throughput technologies have generated rich new data regarding somatic variations in DNA, RNA, protein, or epigenomic features for many cancers. These data, collected for increasingly large tumor cohorts, have provided not only new insights into the biological diversity of human cancers but also exciting opportunities to discover previously unrecognized cancer subtypes. Meanwhile, the unprecedented volume and complexity of these data pose significant challenges for biostatisticians, cancer biologists, and clinicians alike. Here, we review five related issues that represent contemporary problems in cancer taxonomy and interpretation. (1) How many cancer subtypes are there? (2) How can we evaluate the robustness of a new classification system? (3) How are classification systems affected by intratumor heterogeneity and tumor evolution? (4) How should we interpret cancer subtypes? (5) Can multiple classification systems co-exist? While related issues have existed for a long time, we will focus on those aspects that have been magnified by the recent influx of complex multi-omics data. Exploration of these problems is essential for data-driven refinement of cancer classification

  7. A Classification Framework Applied to Cancer Gene Expression Profiles

    Directory of Open Access Journals (Sweden)

    Hussein Hijazi

    2013-01-01

    Full Text Available Classification of cancer based on gene expression has provided insight into possible treatment strategies. Thus, developing machine learning methods that can successfully distinguish among cancer subtypes or normal versus cancer samples is important. This work discusses supervised learning techniques that have been employed to classify cancers. Furthermore, a two-step feature selection method based on an attribute estimation method (e.g., ReliefF) and a genetic algorithm was employed to find a set of genes that can best differentiate between cancer subtypes or normal versus cancer samples. The application of different classification methods (e.g., decision tree, k-nearest neighbor, support vector machine (SVM), bagging, and random forest) on 5 cancer datasets shows that no classification method universally outperforms all the others. However, k-nearest neighbor and linear SVM generally improve the classification performance over other classifiers. Finally, incorporating diverse types of genomic data (e.g., protein-protein interaction data and gene expression) increases the prediction accuracy as compared to using gene expression alone.

  8. Lauren classification and individualized chemotherapy in gastric cancer.

    Science.gov (United States)

    Ma, Junli; Shen, Hong; Kapesa, Linda; Zeng, Shan

    2016-05-01

    Gastric cancer is one of the most common malignancies worldwide. During the last 50 years, the histological classification of gastric carcinoma has been largely based on Lauren's criteria, in which gastric cancer is classified into two major histological subtypes, namely intestinal type and diffuse type adenocarcinoma. This classification was introduced in 1965, and remains currently widely accepted and employed, since it constitutes a simple and robust classification approach. The two histological subtypes of gastric cancer proposed by the Lauren classification exhibit a number of distinct clinical and molecular characteristics, including histogenesis, cell differentiation, epidemiology, etiology, carcinogenesis, biological behaviors and prognosis. Gastric cancer exhibits varied sensitivity to chemotherapy drugs and significant heterogeneity; therefore, the disease may be a target for individualized therapy. The Lauren classification may provide the basis for individualized treatment for advanced gastric cancer, which is increasingly gaining attention in the scientific field. However, few studies have investigated individualized treatment that is guided by pathological classification. The aim of the current review is to analyze the two major histological subtypes of gastric cancer, as proposed by the Lauren classification, and to discuss the implications of this for personalized chemotherapy.

  9. Pathological Bases for a Robust Application of Cancer Molecular Classification

    Directory of Open Access Journals (Sweden)

    Salvador J. Diaz-Cano

    2015-04-01

    Full Text Available Any robust classification system depends on its purpose and must refer to accepted standards, its strength relying on predictive values and a careful consideration of known factors that can affect its reliability. In this context, a molecular classification of human cancer must refer to the current gold standard (histological classification) and try to improve it with key prognosticators for metastatic potential, staging and grading. Although organ-specific examples have been published based on proteomics, transcriptomics and genomics evaluations, the most popular approach uses gene expression analysis as a direct correlate of cellular differentiation, which represents the key feature of the histological classification. RNA is a labile molecule that varies significantly according to the preservation protocol; its transcription reflects the adaptation of the tumor cells to the microenvironment, it can be passed through mechanisms of intercellular transference of genetic information (exosomes), and it is exposed to epigenetic modifications. More robust classifications should be based on stable molecules, at the genetic level represented by DNA to improve reliability, and its analysis must deal with the concept of intratumoral heterogeneity, which is at the origin of tumor progression and is the byproduct of the selection process during the clonal expansion and progression of neoplasms. The simultaneous analysis of multiple DNA targets and next generation sequencing offer the best practical approach for an analytical genomic classification of tumors.

  10. Actionable gene-based classification toward precision medicine in gastric cancer

    Directory of Open Access Journals (Sweden)

    Hiroshi Ichikawa

    2017-10-01

    Full Text Available Background: Intertumoral heterogeneity represents a significant hurdle to identifying optimized targeted therapies in gastric cancer (GC). To realize precision medicine for GC patients, an actionable gene alteration-based molecular classification that directly associates GCs with targeted therapies is needed. Methods: A total of 207 Japanese patients with GC were included in this study. Formalin-fixed, paraffin-embedded (FFPE) tumor tissues were obtained from surgical or biopsy specimens and were subjected to DNA extraction. We generated comprehensive genomic profiling data using a 435-gene panel including 69 actionable genes paired with US Food and Drug Administration-approved targeted therapies, and the evaluation of Epstein-Barr virus (EBV) infection and microsatellite instability (MSI) status. Results: Comprehensive genomic sequencing detected at least one alteration of 435 cancer-related genes in 194 GCs (93.7%) and of 69 actionable genes in 141 GCs (68.1%). We classified the 207 GCs into four The Cancer Genome Atlas (TCGA) subtypes using the genomic profiling data: EBV (N = 9), MSI (N = 17), chromosomal instability (N = 119), and genomically stable subtype (N = 62). Actionable gene alterations were not specific and were widely observed throughout all TCGA subtypes. To discover a novel classification which more precisely selects candidates for targeted therapies, 207 GCs were classified using hypermutated phenotype and the mutation profile of 69 actionable genes. We identified a hypermutated group (N = 32), while the others (N = 175) were sub-divided into six clusters including five with actionable gene alterations: ERBB2 (N = 25), CDKN2A and CDKN2B (N = 10), KRAS (N = 10), BRCA2 (N = 9), and ATM cluster (N = 12). The clinical utility of this classification was demonstrated by a case of unresectable GC with a remarkable response to anti-HER2 therapy in the ERBB2 cluster. Conclusions: This actionable gene-based
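The counts and percentages reported in the record above are internally consistent, which a short arithmetic check makes explicit. The numbers are taken from the abstract; the helper name `percent` is introduced only for this illustration.

```python
def percent(count, total):
    """Percentage of `count` in `total`, rounded to one decimal place."""
    return round(100.0 * count / total, 1)

total_gc = 207

# Share of tumors with at least one alteration, as reported in the abstract.
any_alteration = percent(194, total_gc)         # 435-gene panel
actionable_alteration = percent(141, total_gc)  # 69 actionable genes

# TCGA subtype counts: EBV, MSI, chromosomal instability, genomically stable.
tcga_counts = [9, 17, 119, 62]

# Hypermutated group versus all remaining cases.
hypermutated, others = 32, 175
```

The TCGA subtype counts and the hypermutated/other split each add up to the full cohort of 207 cases.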

  11. Pathohistological classification systems in gastric cancer: diagnostic relevance and prognostic value.

    Science.gov (United States)

    Berlth, Felix; Bollschweiler, Elfriede; Drebber, Uta; Hoelscher, Arnulf H; Moenig, Stefan

    2014-05-21

    Several pathohistological classification systems exist for the diagnosis of gastric cancer. Many studies have investigated the correlation between the pathohistological characteristics in gastric cancer and patient characteristics, disease specific criteria and overall outcome. It is still controversial as to which classification system imparts the most reliable information, and therefore, the choice of system may vary in clinical routine. In addition to the most common classification systems, such as the Laurén and the World Health Organization (WHO) classifications, other authors have tried to characterize and classify gastric cancer based on the microscopic morphology and in reference to the clinical outcome of the patients. In more than 50 years of systematic classification of the pathohistological characteristics of gastric cancer, there is no sole classification system that is consistently used worldwide in diagnostics and research. However, several national guidelines for the treatment of gastric cancer refer to the Laurén or the WHO classifications regarding therapeutic decision-making, which underlines the importance of a reliable classification system for gastric cancer. The latest results from gastric cancer studies indicate that it might be useful to integrate DNA- and RNA-based features of gastric cancer into the classification systems to establish prognostic relevance. This article reviews the diagnostic relevance and the prognostic value of different pathohistological classification systems in gastric cancer.

  12. Lauren classification and individualized chemotherapy in gastric cancer

    OpenAIRE

    MA, JUNLI; SHEN, HONG; KAPESA, LINDA; ZENG, SHAN

    2016-01-01

    Gastric cancer is one of the most common malignancies worldwide. During the last 50 years, the histological classification of gastric carcinoma has been largely based on Lauren's criteria, in which gastric cancer is classified into two major histological subtypes, namely intestinal type and diffuse type adenocarcinoma. This classification was introduced in 1965, and remains currently widely accepted and employed, since it constitutes a simple and robust classification approach. The two histol...

  13. Quantum Cascade Laser-Based Infrared Microscopy for Label-Free and Automated Cancer Classification in Tissue Sections.

    Science.gov (United States)

    Kuepper, Claus; Kallenbach-Thieltges, Angela; Juette, Hendrik; Tannapfel, Andrea; Großerueschkamp, Frederik; Gerwert, Klaus

    2018-05-16

    A feasibility study using a quantum cascade laser-based infrared microscope for the rapid and label-free classification of colorectal cancer tissues is presented. Infrared imaging is a reliable, robust, automated, and operator-independent tissue classification method that has been used for differential classification of tissue thin sections identifying tumorous regions. However, the long acquisition time of the FT-IR-based microscopes used so far has hampered the clinical translation of this technique. Here, the quantum cascade laser-based microscope provides infrared images for precise tissue classification within a few minutes. We analyzed 110 patients with UICC-Stage II and III colorectal cancer, showing 96% sensitivity and 100% specificity of this label-free method as compared to histopathology, the gold standard in routine clinical diagnostics. The main hurdle for the clinical translation of IR-Imaging is now overcome by the short acquisition time for high quality diagnostic images, which is in the same time range as frozen sections by pathologists.

  14. AN ADABOOST OPTIMIZED CCFIS BASED CLASSIFICATION MODEL FOR BREAST CANCER DETECTION

    Directory of Open Access Journals (Sweden)

    CHANDRASEKAR RAVI

    2017-06-01

    Full Text Available Classification is a Data Mining technique used for building a prototype of the data behaviour, using which unseen data can be classified into one of the defined classes. Several researchers have proposed classification techniques, but most of them did not place much emphasis on the misclassified instances and storage space. In this paper, a classification model is proposed that takes into account the misclassified instances and storage space. The classification model is efficiently developed using a tree structure for reducing the storage complexity and uses a single scan of the dataset. During the training phase, Class-based Closed Frequent ItemSets (CCFIS) were mined from the training dataset in the form of a tree structure. The classification model has been developed using the CCFIS and a similarity measure based on Longest Common Subsequence (LCS). Further, the Particle Swarm Optimization algorithm is applied on the generated CCFIS, which assigns weights to the itemsets and their associated classes. Most classifiers correctly classify the common instances but misclassify the rare instances. In view of that, the AdaBoost algorithm has been used to boost the weights of the instances misclassified in the previous round so as to include them in the training phase to classify the rare instances. This improves the accuracy of the classification model. During the testing phase, the classification model is used to classify the instances of the test dataset. The Breast Cancer dataset from the UCI repository is used for the experiment. Experimental analysis shows that the accuracy of the proposed classification model outperforms the PSOAdaBoost-Sequence classifier by 7% and is superior to other approaches like Naïve Bayes Classifier, Support Vector Machine Classifier, Instance Based Classifier, ID3 Classifier, J48 Classifier, etc.
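The record above mentions a similarity measure based on the Longest Common Subsequence (LCS). A minimal sketch of such a measure follows; the LCS length is computed with the standard dynamic program, while the normalization by the longer sequence is an assumption made here, since the abstract does not say how the measure is scaled.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences,
    using the standard row-by-row dynamic program."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def lcs_similarity(a, b):
    """LCS length divided by the longer sequence length (assumed scaling);
    returns 0.0 when either sequence is empty."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))

# Hypothetical itemsets, written as ordered lists of item labels.
itemset = ["age>50", "tumor-size=large", "inv-nodes=yes"]
instance = ["age>50", "inv-nodes=yes"]
score = lcs_similarity(itemset, instance)
```

Here two of the three items in the longer sequence appear in order in the shorter one, so the score is two thirds.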

  15. Influence of nuclei segmentation on breast cancer malignancy classification

    Science.gov (United States)

    Jelen, Lukasz; Fevens, Thomas; Krzyzak, Adam

    2009-02-01

    Breast Cancer is one of the most deadly cancers affecting middle-aged women. Accurate diagnosis and prognosis are crucial to reduce the high death rate. Nowadays there are numerous diagnostic tools for breast cancer diagnosis. In this paper we discuss a role of nuclear segmentation from fine needle aspiration biopsy (FNA) slides and its influence on malignancy classification. Classification of malignancy plays a very important role during the diagnosis process of breast cancer. Out of all cancer diagnostic tools, FNA slides provide the most valuable information about the cancer malignancy grade which helps to choose an appropriate treatment. This process involves assessing numerous nuclear features and therefore precise segmentation of nuclei is very important. In this work we compare three powerful segmentation approaches and test their impact on the classification of breast cancer malignancy. The studied approaches involve level set segmentation, fuzzy c-means segmentation and textural segmentation based on co-occurrence matrix. Segmented nuclei were used to extract nuclear features for malignancy classification. For classification purposes four different classifiers were trained and tested with previously extracted features. The compared classifiers are Multilayer Perceptron (MLP), Self-Organizing Maps (SOM), Principal Component-based Neural Network (PCA) and Support Vector Machines (SVM). The presented results show that level set segmentation yields the best results over the three compared approaches and leads to a good feature extraction with a lowest average error rate of 6.51% over four different classifiers. The best performance was recorded for multilayer perceptron with an error rate of 3.07% using fuzzy c-means segmentation.

  16. Cross-platform analysis of cancer microarray data improves gene expression based classification of phenotypes

    Directory of Open Access Journals (Sweden)

    Eils Roland

    2005-11-01

    Full Text Available Background: The extensive use of DNA microarray technology in the characterization of the cell transcriptome is leading to an ever increasing amount of microarray data from cancer studies. Although similar questions for the same type of cancer are addressed in these different studies, a comparative analysis of their results is hampered by the use of heterogeneous microarray platforms and analysis methods. Results: In contrast to a meta-analysis approach where results of different studies are combined on an interpretative level, we investigate here how to directly integrate raw microarray data from different studies for the purpose of supervised classification analysis. We use median rank scores and quantile discretization to derive numerically comparable measures of gene expression from different platforms. These transformed data are then used for training of classifiers based on support vector machines. We apply this approach to six publicly available cancer microarray gene expression data sets, which consist of three pairs of studies, each examining the same type of cancer, i.e. breast cancer, prostate cancer or acute myeloid leukemia. For each pair, one study was performed by means of cDNA microarrays and the other by means of oligonucleotide microarrays. In each pair, high classification accuracies (> 85%) were achieved with training and testing on data instances randomly chosen from both data sets in a cross-validation analysis. To exemplify the potential of this cross-platform classification analysis, we use two leukemia microarray data sets to show that important genes with regard to the biology of leukemia are selected in an integrated analysis, which are missed in either single-set analysis. Conclusion: Cross-platform classification of multiple cancer microarray data sets yields discriminative gene expression signatures that are found and validated on a large number of microarray samples, generated by different laboratories and
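The idea of making expression values from different platforms numerically comparable can be illustrated with a simple rank-based quantile discretization. This is a simplified sketch, not the study's procedure: it ranks genes within one sample and maps each rank to a bin, and it does not implement the median rank score step. The function name and the sample values are hypothetical.

```python
def quantile_discretize(values, n_bins=4):
    """Map each value to a bin index in 0..n_bins-1 according to its rank
    within the sample. Ties are broken by position (stable sort)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins

# The same four genes measured on two platforms with very different scales
# (hypothetical numbers): log-ratios from cDNA arrays, intensities from
# oligonucleotide arrays.
cdna_sample = [0.2, 1.5, -0.3, 0.9]
oligo_sample = [120.0, 900.0, 35.0, 410.0]

cdna_bins = quantile_discretize(cdna_sample)
oligo_bins = quantile_discretize(oligo_sample)
```

Both samples order the genes the same way, so they discretize to identical bin vectors even though the raw scales differ; a classifier trained on the bins sees the two platforms in a common representation.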

  17. Proteomic classification of breast cancer.

    LENUS (Irish Health Repository)

    Kamel, Dalia

    2012-11-01

    Being a significant health problem that affects patients in various age groups, breast cancer has been extensively studied to date. Recently, molecular breast cancer classification has advanced significantly with the availability of genomic profiling technologies. Proteomic technologies have also advanced from traditional protein assays including enzyme-linked immunosorbent assay, immunoblotting and immunohistochemistry to more comprehensive approaches including mass spectrometry and reverse phase protein lysate arrays (RPPA). The purpose of this manuscript is to review the current protein markers that influence breast cancer prediction and prognosis and to focus on novel advances in proteomic classification of breast cancer.

  18. FEATURE EXTRACTION BASED WAVELET TRANSFORM IN BREAST CANCER DIAGNOSIS USING FUZZY AND NON-FUZZY CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    Pelin GORGEL

    2013-01-01

    Full Text Available This study helps to provide a second eye to the expert radiologists for the classification of manually extracted breast masses taken from 60 digital mammograms. These mammograms have been acquired from Istanbul University Faculty of Medicine Hospital and have 78 masses. The diagnosis is implemented with pre-processing by using feature extraction based on the Fast Wavelet Transform (FWT). Afterwards, Adaptive Neuro-Fuzzy Inference System (ANFIS) based fuzzy subtractive clustering and Support Vector Machines (SVM) methods are used for the classification. It is a comparative study which uses these methods respectively. According to the results of the study, ANFIS based subtractive clustering produces ??% while SVM produces ??% accuracy in malignant-benign classification. The results demonstrate that the developed system could help the radiologists reach a true diagnosis and decrease the number of missed cancerous regions or unnecessary biopsies.

  19. Identifying colon cancer risk modules with better classification performance based on human signaling network.

    Science.gov (United States)

    Qu, Xiaoli; Xie, Ruiqiang; Chen, Lina; Feng, Chenchen; Zhou, Yanyan; Li, Wan; Huang, Hao; Jia, Xu; Lv, Junjie; He, Yuehan; Du, Youwen; Li, Weiguo; Shi, Yuchen; He, Weiming

    2014-10-01

    Identifying differences between normal and tumor samples from a modular perspective may help to improve our understanding of the mechanisms responsible for colon cancer. Many cancer studies have shown that signaling transduction and biological pathways are disturbed in disease states, and expression profiles can distinguish variations in diseases. In this study, we integrated a weighted human signaling network and gene expression profiles to select risk modules associated with tumor conditions. Risk modules as classification features by our method had a better classification performance than other methods, and one risk module for colon cancer had a good classification performance for distinguishing between normal/tumor samples and between tumor stages. All genes in the module were annotated to the biological process of positive regulation of cell proliferation, and were highly associated with colon cancer. These results suggested that these genes might be the potential risk genes for colon cancer.

  20. BCDForest: a boosting cascade deep forest model towards the classification of cancer subtypes based on gene expression data.

    Science.gov (United States)

    Guo, Yang; Liu, Shuhui; Li, Zhanhuai; Shang, Xuequn

    2018-04-11

    The classification of cancer subtypes is of great importance to cancer disease diagnosis and therapy. Many supervised learning approaches have been applied to cancer subtype classification in the past few years, especially deep learning based approaches. Recently, the deep forest model has been proposed as an alternative to deep neural networks to learn hyper-representations by using cascade ensemble decision trees. It has been shown that the deep forest model has competitive or even better performance than deep neural networks to some extent. However, the standard deep forest model may face overfitting and ensemble diversity challenges when dealing with small sample size and high-dimensional biology data. In this paper, we propose a deep learning model, so-called BCDForest, to address cancer subtype classification on small-scale biology datasets, which can be viewed as a modification of the standard deep forest model. BCDForest differs from the standard deep forest model in the following two main contributions: First, a method named multi-class-grained scanning is proposed to train multiple binary classifiers to encourage diversity of the ensemble. Meanwhile, the fitting quality of each classifier is considered in representation learning. Second, we propose a boosting strategy to emphasize more important features in cascade forests, thus propagating the benefits of discriminative features among cascade layers to improve the classification performance. Systematic comparison experiments on both microarray and RNA-Seq gene expression datasets demonstrate that our method consistently outperforms the state-of-the-art methods in application of cancer subtype classification. The multi-class-grained scanning and boosting strategy in our model provide an effective solution to ease the overfitting challenge and improve the robustness of the deep forest model working on small-scale data. Our model provides a useful approach to the classification of cancer subtypes

  1. Training ANFIS structure using genetic algorithm for liver cancer classification based on microarray gene expression data

    Directory of Open Access Journals (Sweden)

    Bülent Haznedar

    2017-02-01

    Full Text Available Classification is an important data mining technique used in many fields, such as medicine, genetics and biomedical engineering. The number of studies on the classification of DNA microarray gene expression data has increased notably in recent years. However, because microarray gene expression data contain a large number of genes and the relations among them are mostly nonlinear, the success of conventional classification algorithms can be limited. For these reasons, interest in classification methods based on artificial intelligence has gradually increased in recent times. In this study, a hybrid approach based on the Adaptive Neuro-Fuzzy Inference System (ANFIS) and the Genetic Algorithm (GA) is suggested in order to classify a liver cancer microarray data set. Simulation results are compared with the results of other methods. According to the results obtained, the recommended method performs better than the other methods.

  2. A New Direction of Cancer Classification: Positive Effect of Low-Ranking MicroRNAs.

    Science.gov (United States)

    Li, Feifei; Piao, Minghao; Piao, Yongjun; Li, Meijing; Ryu, Keun Ho

    2014-10-01

    Many studies based on microRNA (miRNA) expression profiles have shown a new aspect of cancer classification. Because miRNA expression data are high-dimensional, feature selection methods have been used to facilitate dimensionality reduction. These feature selection methods have one shortcoming thus far: they consider only the cases where the feature-to-class relation is 1:1 or n:1. However, because one miRNA may influence more than one type of cancer, human miRNAs are often ranked low by traditional feature selection methods and are removed most of the time. Given the limited number of miRNAs, low-ranking miRNAs are also important to cancer classification. We considered both high- and low-ranking features to cover all cases (1:1, n:1, 1:n, and m:n) in cancer classification. First, we used the correlation-based feature selection method to select the high-ranking miRNAs, and chose the support vector machine, Bayes network, decision tree, k-nearest-neighbor, and logistic classifier to build the cancer classification models. Then, we chose the Chi-square test, information gain, gain ratio, and Pearson's correlation feature selection methods to build the m:n feature subset, and used the selected miRNAs to determine cancer classification. The low-ranking miRNA expression profiles achieved higher classification accuracy than using only the high-ranking miRNAs from traditional feature selection methods. Our results demonstrate that the m:n feature subset reveals a positive effect of low-ranking miRNAs in cancer classification.
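    Information gain, one of the ranking criteria named in the abstract above, can be illustrated with a small sketch. The miRNA names, the discretized expression levels, and the helper functions are hypothetical and chosen only for illustration; the study's actual data and software are not described in the record.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(feature, labels):
    """Entropy of the labels minus the entropy left after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def rank_features(features, labels):
    """Return feature names sorted from highest to lowest information gain."""
    scores = {name: information_gain(col, labels) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical discretized expression levels for three miRNAs in six samples.
labels = ["A", "A", "B", "B", "C", "C"]
features = {
    "mir_x": ["hi", "hi", "lo", "lo", "lo", "lo"],  # separates class A from the rest
    "mir_y": ["hi", "lo", "hi", "lo", "hi", "lo"],  # carries no class information
    "mir_z": ["a", "a", "b", "b", "c", "c"],        # separates all three classes
}
ranking = rank_features(features, labels)
```

    In this toy case a filter that keeps only the top-ranked feature would discard `mir_x`, even though it still carries class information, which is the kind of low-ranking feature the abstract argues should be reconsidered.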

  3. Magnetic resonance imaging texture analysis classification of primary breast cancer

    International Nuclear Information System (INIS)

    Waugh, S.A.; Lerski, R.A.; Purdie, C.A.; Jordan, L.B.; Vinnicombe, S.; Martin, P.; Thompson, A.M.

    2016-01-01

    Patient-tailored treatments for breast cancer are based on histological and immunohistochemical (IHC) subtypes. Magnetic Resonance Imaging (MRI) texture analysis (TA) may be useful in non-invasive lesion subtype classification. Women with newly diagnosed primary breast cancer underwent pre-treatment dynamic contrast-enhanced breast MRI. TA was performed using co-occurrence matrix (COM) features, by creating a model on retrospective training data, then prospectively applying to a test set. Analyses were blinded to breast pathology. Subtype classifications were performed using a cross-validated k-nearest-neighbour (k = 3) technique, with accuracy relative to pathology assessed and receiver operator curve (AUROC) calculated. Mann-Whitney U and Kruskal-Wallis tests were used to assess raw entropy feature values. Histological subtype classifications were similar across training (n = 148 cancers) and test sets (n = 73 lesions) using all COM features (training: 75 %, AUROC = 0.816; test: 72.5 %, AUROC = 0.823). Entropy features were significantly different between lobular and ductal cancers (p < 0.001; Mann-Whitney U). IHC classifications using COM features were also similar for training and test data (training: 57.2 %, AUROC = 0.754; test: 57.0 %, AUROC = 0.750). Hormone receptor positive and negative cancers demonstrated significantly different entropy features. Entropy features alone were unable to create a robust classification model. Textural differences on contrast-enhanced MR images may reflect underlying lesion subtypes, which merits testing against treatment response. (orig.)
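    The entropy of a co-occurrence matrix, the texture feature discussed in the abstract above, can be sketched as follows. This is a minimal illustration on tiny integer images with a single horizontal offset; the offset, the grey-level quantization, and the function names are assumptions, not the settings used in the study.

```python
import math
from collections import Counter

def cooccurrence(image, dx=1, dy=0):
    """Count grey-level pairs (a, b) where b lies at offset (dx, dy) from a."""
    rows, cols = len(image), len(image[0])
    counts = Counter()
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    return counts

def com_entropy(image, dx=1, dy=0):
    """Shannon entropy (bits) of the normalized co-occurrence counts."""
    counts = cooccurrence(image, dx, dy)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Two hypothetical 3x3 "images": one flat, one with varied neighbouring levels.
uniform = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
textured = [[0, 1, 2], [2, 0, 1], [1, 2, 0]]
flat_entropy = com_entropy(uniform)
textured_entropy = com_entropy(textured)
```

    A flat region yields zero entropy because only one grey-level pair occurs, while the more heterogeneous region spreads its counts over several pairs and scores higher.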

  4. Magnetic resonance imaging texture analysis classification of primary breast cancer

    Energy Technology Data Exchange (ETDEWEB)

    Waugh, S.A.; Lerski, R.A. [Ninewells Hospital and Medical School, Department of Medical Physics, Dundee (United Kingdom); Purdie, C.A.; Jordan, L.B. [Ninewells Hospital and Medical School, Department of Pathology, Dundee (United Kingdom); Vinnicombe, S. [University of Dundee, Division of Imaging and Technology, Ninewells Hospital and Medical School, Dundee (United Kingdom); Martin, P. [Ninewells Hospital and Medical School, Department of Clinical Radiology, Dundee (United Kingdom); Thompson, A.M. [University of Texas MD Anderson Cancer Center, Department of Surgical Oncology, Houston, TX (United States)

    2016-02-15

    Patient-tailored treatments for breast cancer are based on histological and immunohistochemical (IHC) subtypes. Magnetic Resonance Imaging (MRI) texture analysis (TA) may be useful in non-invasive lesion subtype classification. Women with newly diagnosed primary breast cancer underwent pre-treatment dynamic contrast-enhanced breast MRI. TA was performed using co-occurrence matrix (COM) features, by creating a model on retrospective training data, then prospectively applying to a test set. Analyses were blinded to breast pathology. Subtype classifications were performed using a cross-validated k-nearest-neighbour (k = 3) technique, with accuracy relative to pathology assessed and receiver operator curve (AUROC) calculated. Mann-Whitney U and Kruskal-Wallis tests were used to assess raw entropy feature values. Histological subtype classifications were similar across training (n = 148 cancers) and test sets (n = 73 lesions) using all COM features (training: 75 %, AUROC = 0.816; test: 72.5 %, AUROC = 0.823). Entropy features were significantly different between lobular and ductal cancers (p < 0.001; Mann-Whitney U). IHC classifications using COM features were also similar for training and test data (training: 57.2 %, AUROC = 0.754; test: 57.0 %, AUROC = 0.750). Hormone receptor positive and negative cancers demonstrated significantly different entropy features. Entropy features alone were unable to create a robust classification model. Textural differences on contrast-enhanced MR images may reflect underlying lesion subtypes, which merits testing against treatment response. (orig.)

  5. Involvement of Machine Learning for Breast Cancer Image Classification: A Survey.

    Science.gov (United States)

    Nahid, Abdullah-Al; Kong, Yinan

    2017-01-01

    Breast cancer is one of the largest causes of women's death in the world today. Advance engineering of natural image classification techniques and Artificial Intelligence methods has largely been used for the breast-image classification task. The involvement of digital image classification allows the doctor and the physicians a second opinion, and it saves the doctors' and physicians' time. Despite the various publications on breast image classification, very few review papers are available which provide a detailed description of breast cancer image classification techniques, feature extraction and selection procedures, classification measuring parameterizations, and image classification findings. We have put a special emphasis on the Convolutional Neural Network (CNN) method for breast image classification. Along with the CNN method we have also described the involvement of the conventional Neural Network (NN), Logic Based classifiers such as the Random Forest (RF) algorithm, Support Vector Machines (SVM), Bayesian methods, and a few of the semisupervised and unsupervised methods which have been used for breast image classification.

  6. Involvement of Machine Learning for Breast Cancer Image Classification: A Survey

    Directory of Open Access Journals (Sweden)

    Abdullah-Al Nahid

    2017-01-01

    Full Text Available Breast cancer is one of the largest causes of women’s death in the world today. Advance engineering of natural image classification techniques and Artificial Intelligence methods has largely been used for the breast-image classification task. The involvement of digital image classification allows the doctor and the physicians a second opinion, and it saves the doctors’ and physicians’ time. Despite the various publications on breast image classification, very few review papers are available which provide a detailed description of breast cancer image classification techniques, feature extraction and selection procedures, classification measuring parameterizations, and image classification findings. We have put a special emphasis on the Convolutional Neural Network (CNN) method for breast image classification. Along with the CNN method we have also described the involvement of the conventional Neural Network (NN), Logic Based classifiers such as the Random Forest (RF) algorithm, Support Vector Machines (SVM), Bayesian methods, and a few of the semisupervised and unsupervised methods which have been used for breast image classification.

  7. Genetic Fuzzy System (GFS) based wavelet co-occurrence feature selection in mammogram classification for breast cancer diagnosis

    Directory of Open Access Journals (Sweden)

    Meenakshi M. Pawar

    2016-09-01

    Full Text Available Breast cancer is a significant health problem diagnosed mostly in women worldwide. Early detection of breast cancer with the help of digital mammography can reduce the mortality rate. This paper presents a wrapper-based feature selection approach for wavelet co-occurrence features (WCF) using a Genetic Fuzzy System (GFS) in the mammogram classification problem. The performance of the GFS algorithm is demonstrated using the mini-MIAS database. WCF features are obtained from the detail wavelet coefficients at each level of decomposition of the mammogram image. At the first level of decomposition, 18 features are applied to the GFS algorithm, which selects 5 features with an average classification success rate of 39.64%. Subsequently, at the second level it selects 9 features from 36 features and the classification success rate improves to 56.75%. At the third level, 16 features are selected from 54 features and the average success rate improves to 64.98%. Lastly, at the fourth level 72 features are applied to GFS, which selects 16 features, increasing the average success rate to 89.47%. Hence, the GFS algorithm is an effective way of obtaining an optimal set of features in breast cancer diagnosis.
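    The idea of collecting detail coefficients at successive decomposition levels can be sketched with a simple Haar transform. The record does not say which wavelet the paper uses, so the Haar filter, the averaging normalization, and the function names here are assumptions made only for illustration.

```python
def haar_level(image):
    """One level of an unnormalized 2-D Haar transform on a list-of-lists image.

    Returns (approx, row_diff, col_diff, diag_diff); each is half the input size.
    """
    approx, row_diff, col_diff, diag_diff = [], [], [], []
    for r in range(0, len(image) - 1, 2):
        a_row, r_row, c_row, d_row = [], [], [], []
        for c in range(0, len(image[0]) - 1, 2):
            tl, tr = image[r][c], image[r][c + 1]
            bl, br = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((tl + tr + bl + br) / 4)  # local average
            r_row.append((tl + tr - bl - br) / 4)  # top row minus bottom row
            c_row.append((tl - tr + bl - br) / 4)  # left column minus right column
            d_row.append((tl - tr - bl + br) / 4)  # diagonal difference
        approx.append(a_row)
        row_diff.append(r_row)
        col_diff.append(c_row)
        diag_diff.append(d_row)
    return approx, row_diff, col_diff, diag_diff

def detail_bands(image, levels):
    """Collect the three detail sub-bands produced at each decomposition level."""
    bands, current = [], image
    for _ in range(levels):
        current, rd, cd, dd = haar_level(current)
        bands.extend([rd, cd, dd])
    return bands

# Hypothetical 4x4 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 8, 8]] * 4
bands = detail_bands(image, levels=2)
```

    Here the edge only shows up in the column-difference band at the second level, which hints at why features taken from deeper levels can add information that the first level misses.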

  8. X-ray diagnosis of esophageal cancer and application of Borrmann's classification

    International Nuclear Information System (INIS)

    Chin, Soo Yil

    1985-01-01

    A study on the x-ray diagnosis of esophageal cancer, mainly concerning type classification, was carried out in 126 patients who were diagnosed with esophageal cancer and treated by radiation at Cancer Research Hospital, K. A. E. R. I., from January 1974 to July 1979. The ordinary classification of esophageal cancer by x-ray picture was reviewed, Borrmann's classification for gastric cancer was applied to the macroscopic classification of esophageal cancer, and the application of this classification to x-ray diagnosis was discussed. The effect of radiotherapy on each type of cancer according to the ordinary x-ray classification and Borrmann's classification was also studied. The results were as follows: 1. The ordinary x-ray classification was not simple, because the degree of progression of the cancer, differences in mural invasion, and the position and method of radiography could lead to misinterpretation of the type of cancer. The therapeutic effect of radiation on each type according to this classification did not show a significant characteristic either, although radiation was most effective in the polypoidal type and least effective in the funnel type. 2. Borrmann's classification was relatively easy even on the radiogram because of little overlap between the types, and the type became more evident on the resected specimen after operation. Some correlation was also recognized between the Borrmann type and the radiotherapeutic effect. The effect was best in type I and gradually decreased in types II, III, and IV, in that order. Radiotherapy was ineffective in about three quarters of type IV. 3. Borrmann's classification is now applied to carcinoma of the large bowel as well as to gastric cancer. If it is applied to esophageal cancer, the macroscopic classification of cancers of the digestive tract can be systematized, which will be convenient in clinical study.

  9. Association between gastric cancer and the Kyoto classification of gastritis.

    Science.gov (United States)

    Shichijo, Satoki; Hirata, Yoshihiro; Niikura, Ryota; Hayakawa, Yoku; Yamada, Atsuo; Koike, Kazuhiko

    2017-09-01

    Histological gastritis is associated with gastric cancer, but its diagnosis requires biopsy. Many classifications of endoscopic gastritis are available, but not all are useful for risk stratification of gastric cancer. The Kyoto Classification of Gastritis was proposed at the 85th Congress of the Japan Gastroenterological Endoscopy Society. This cross-sectional study evaluated the usefulness of the Kyoto Classification of Gastritis for risk stratification of gastric cancer. From August 2013 to September 2014, esophagogastroduodenoscopy was performed and the gastric findings evaluated according to the Kyoto Classification of Gastritis in a total of 4062 patients. The following five endoscopic findings were selected based on previous reports: atrophy, intestinal metaplasia, enlarged folds, nodularity, and diffuse redness. A total of 3392 patients (1746 [51%] men and 1646 [49%] women) were analyzed. Among them, 107 gastric cancers were diagnosed. Atrophy was found in 2585 (78%) and intestinal metaplasia in 924 (27%). Enlarged folds, nodularity, and diffuse redness were found in 197 (5.8%), 22 (0.6%), and 573 (17%), respectively. In univariate analyses, the severity of atrophy, intestinal metaplasia, diffuse redness, age, and male sex were associated with gastric cancer. In a multivariate analysis, atrophy and male sex were found to be independent risk factors. Younger age and severe atrophy were determined to be associated with diffuse-type gastric cancer. Endoscopic detection of atrophy was associated with the risk of gastric cancer. Thus, patients with severe atrophy should be examined carefully and may require intensive follow-up. © 2017 Journal of Gastroenterology and Hepatology Foundation and John Wiley & Sons Australia, Ltd.

  10. Lung cancer gene expression database analysis incorporating prior knowledge with support vector machine-based classification method

    Directory of Open Access Journals (Sweden)

    Huang Desheng

    2009-07-01

    Full Text Available Background: A reliable and precise classification is essential for successful diagnosis and treatment of cancer. Gene expression microarrays have provided the high-throughput platform to discover genomic biomarkers for cancer diagnosis and prognosis. Rational use of the available bioinformation can not only effectively remove or suppress noise in gene chips, but also avoid one-sided results from separate experiments. However, only a few studies have recognized the importance of prior information in cancer classification. Methods: Using the support vector machine as the discriminant approach, we propose a modified method that incorporates prior knowledge into cancer classification based on gene expression data to improve accuracy. A well-known public dataset, the malignant pleural mesothelioma and lung adenocarcinoma gene expression database, was used in this study. Prior knowledge is viewed here as a means of directing the classifier using known lung adenocarcinoma related genes. The procedures were performed with the software R 2.80. Results: The modified method performed better after incorporating prior knowledge. Accuracy of the modified method improved from 98.86% to 100% in the training set and from 98.51% to 99.06% in the test set. The standard deviations of the modified method decreased from 0.26% to 0 in the training set and from 3.04% to 2.10% in the test set. Conclusion: The method that incorporates prior knowledge into discriminant analysis could effectively improve the capacity and reduce the impact of noise. This idea may have a promising future not only in practice but also in methodology.

  11. Evolving cancer classification in the era of personalized medicine: A primer for radiologists

    Energy Technology Data Exchange (ETDEWEB)

    O'Neill, Alibhe C.; Jagannathan, Jyothi P.; Ramaiya, Nikhil H. [Dept. of Imaging, Dana Farber Cancer Institute, Boston (United States)

    2017-01-15

    Traditionally tumors were classified based on anatomic location but now specific genetic mutations in cancers are leading to treatment of tumors with molecular targeted therapies. This has led to a paradigm shift in the classification and treatment of cancer. Tumors treated with molecular targeted therapies often show morphological changes rather than change in size and are associated with class specific and drug specific toxicities, different from those encountered with conventional chemotherapeutic agents. It is important for the radiologists to be familiar with the new cancer classification and the various treatment strategies employed, in order to effectively communicate and participate in the multi-disciplinary care. In this paper we will focus on lung cancer as a prototype of the new molecular classification.

  12. Molecular Classification and Correlates in Colorectal Cancer

    OpenAIRE

    Ogino, Shuji; Goel, Ajay

    2008-01-01

    Molecular classification of colorectal cancer is evolving. As our understanding of colorectal carcinogenesis improves, we are incorporating new knowledge into the classification system. In particular, global genomic status [microsatellite instability (MSI) status and chromosomal instability (CIN) status] and epigenomic status [CpG island methylator phenotype (CIMP) status] play a significant role in determining clinical, pathological and biological characteristics of colorectal cancer. In thi...

  13. Using fuzzy association rule mining in cancer classification

    International Nuclear Information System (INIS)

    Mahmoodian, Hamid; Marhaban, M.H.; Abdulrahim, Raha; Rosli, Rozita; Saripan, Iqbal

    2011-01-01

    Full text: The classification of cancer tumors based on gene expression profiles has been extensively studied. A wide variety of cancer datasets have been analyzed with various methods of gene selection and classification to identify the behavior of the genes in tumors and to find the relationships between them and disease outcome. Interpretability of the model, which in this study is developed with fuzzy rules and linguistic variables, has rarely been considered. In addition, creating a high-performance fuzzy classifier that uses a subset of significant genes selected by different types of gene selection methods is another goal of this study. A new algorithm has been developed to identify the fuzzy rules and significant genes based on fuzzy association rule mining. First, different subsets of genes selected by different methods were used to generate primary fuzzy classifiers separately; the proposed algorithm was then applied to combine the genes associated in the primary classifiers and generate a new classifier. The results show that the fuzzy classifier can classify the tumors with high performance while presenting the relationships between the genes through linguistic variables.

  14. A Discrete Wavelet Based Feature Extraction and Hybrid Classification Technique for Microarray Data Analysis

    Directory of Open Access Journals (Sweden)

    Jaison Bennet

    2014-01-01

    Full Text Available In the past, cancer classification by doctors and radiologists was based on morphological and clinical features and had limited diagnostic ability. The recent arrival of DNA microarray technology has enabled the concurrent monitoring of thousands of gene expressions on a single chip, which stimulates progress in cancer classification. In this paper, we propose a hybrid approach for microarray data classification based on nearest neighbor (KNN), naive Bayes, and support vector machine (SVM). Feature selection prior to classification plays a vital role, and a feature selection technique which combines the discrete wavelet transform (DWT) and the moving window technique (MWT) is used. The performance of the proposed method is compared with conventional classifiers such as support vector machine, nearest neighbor, and naive Bayes. Experiments have been conducted on both real and benchmark datasets, and the results indicate that the ensemble approach produces higher classification accuracy than conventional classifiers. This work serves as an automated system for the classification of cancer and can be applied by doctors in real cases, which is a boon to the medical community. It further reduces the misclassification of cancers, which is highly undesirable in cancer detection.
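    One common way to combine several classifiers into an ensemble is majority voting. The record does not state how the paper combines KNN, naive Bayes, and SVM, so the voting rule, the tie-breaking choice, and the example predictions below are assumptions used only to make the idea concrete.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample.

    predictions: list of lists, one inner list of labels per classifier.
    Ties are broken in favour of the earliest classifier in the list.
    """
    combined = []
    for votes in zip(*predictions):
        counts = Counter(votes)
        top = max(counts.values())
        # Scan the votes in classifier order so ties resolve deterministically.
        combined.append(next(v for v in votes if counts[v] == top))
    return combined

# Hypothetical predictions from three base classifiers on four samples.
knn_pred = ["tumor", "normal", "tumor", "normal"]
nb_pred = ["tumor", "tumor", "normal", "tumor"]
svm_pred = ["normal", "tumor", "normal", "other"]
ensemble = majority_vote([knn_pred, nb_pred, svm_pred])
```

    The last sample shows the tie case: all three classifiers disagree, so the sketch falls back to the first classifier's label.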

  15. The classification of osteonecrosis in patients with cancer: validation of a new radiological classification system

    International Nuclear Information System (INIS)

    Niinimäki, T.; Niinimäki, J.; Halonen, J.; Hänninen, P.; Harila-Saari, A.; Niinimäki, R.

    2015-01-01

    Aim: To validate a new, non-joint-specific radiological classification system that is suitable regardless of the site of the osteonecrosis (ON) in patients with cancer. Material and methods: Critical deficiencies in the existing ON classification systems were identified and a new, non-joint-specific radiological classification system was developed. Seventy-two magnetic resonance imaging (MRI) images of patients with cancer and ON lesions were graded, and the validation of the new system was performed by assessing inter- and intra-observer reliability. Results: Intra-observer reliability of ON grading was good or very good, with kappa values of 0.79–0.86. Interobserver agreement was lower but still good, with kappa values of 0.62–0.77. Ninety-eight percent of all intra- or interobserver differences were within one grade. Interobserver reliability of assessing the location of ON was very good, with kappa values of 0.93–0.98. Conclusion: All the available radiological ON classification systems are joint specific. This limitation has spurred the development of multiple systems, which has led to the insufficient use of classifications in ON studies among patients with cancer. The introduced radiological classification system overcomes the problem of joint-specificity, was found to be reliable, and can be used to classify all ON lesions regardless of the affected site. - Highlights: • Patients with cancer may have osteonecrosis lesions at multiple sites. • There is no non-joint-specific osteonecrosis classification available. • We introduced a new non-joint-specific osteonecrosis classification. • The validation was performed by assessing inter- and intra-observer reliability. • The classification was reliable and could be used regardless of the affected site.
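    The kappa values quoted above measure agreement between graders beyond what chance alone would produce. The sketch below computes unweighted Cohen's kappa for two raters; the record does not say whether the study used a weighted or unweighted variant, and the grades shown are invented for illustration.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of categorical grades."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same grade independently.
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical grades assigned by two readers to eight lesions.
reader_1 = [1, 1, 2, 2, 3, 3, 1, 2]
reader_2 = [1, 1, 2, 3, 3, 3, 1, 2]
kappa = cohen_kappa(reader_1, reader_2)
```

    With seven of eight grades matching, the raw agreement is 0.875, but kappa is lower because part of that agreement would be expected by chance.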

  16. Zone-specific logistic regression models improve classification of prostate cancer on multi-parametric MRI

    Energy Technology Data Exchange (ETDEWEB)

    Dikaios, Nikolaos; Halligan, Steve; Taylor, Stuart; Atkinson, David; Punwani, Shonit [University College London, Centre for Medical Imaging, London (United Kingdom); University College London Hospital, Departments of Radiology, London (United Kingdom); Alkalbani, Jokha; Sidhu, Harbir Singh [University College London, Centre for Medical Imaging, London (United Kingdom); Abd-Alazeez, Mohamed; Ahmed, Hashim U.; Emberton, Mark [University College London, Research Department of Urology, Division of Surgery and Interventional Science, London (United Kingdom); Kirkham, Alex [University College London Hospital, Departments of Radiology, London (United Kingdom); Freeman, Alex [University College London Hospital, Department of Histopathology, London (United Kingdom)

    2015-09-15

    To assess the interchangeability of zone-specific (peripheral-zone (PZ) and transition-zone (TZ)) multiparametric-MRI (mp-MRI) logistic-regression (LR) models for classification of prostate cancer. Two hundred and thirty-one patients (70 TZ training-cohort; 76 PZ training-cohort; 85 TZ temporal validation-cohort) underwent mp-MRI and transperineal-template-prostate-mapping biopsy. PZ and TZ uni/multi-variate mp-MRI LR-models for classification of significant cancer (any cancer-core-length (CCL) with Gleason > 3 + 3 or any grade with CCL ≥ 4 mm) were derived from the respective cohorts and validated within the same zone by leave-one-out analysis. Inter-zonal performance was tested by applying TZ models to the PZ training-cohort and vice-versa. Classification performance of TZ models for TZ cancer was further assessed in the TZ validation-cohort. ROC area-under-curve (ROC-AUC) analysis was used to compare models. The univariate parameters with the best classification performance were the normalised T2 signal (T2nSI) within the TZ (ROC-AUC = 0.77) and normalized early contrast-enhanced T1 signal (DCE-nSI) within the PZ (ROC-AUC = 0.79). Performance was not significantly improved by bi-variate/tri-variate modelling. PZ models that contained DCE-nSI performed poorly in classification of TZ cancer. The TZ model based solely on maximum-enhancement poorly classified PZ cancer. LR-models dependent on DCE-MRI parameters alone are not interchangeable between prostatic zones; however, models based exclusively on T2 and/or ADC are more robust for inter-zonal application. (orig.)
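    Two pieces of the abstract above lend themselves to a short sketch: the rule defining significant cancer and the ROC-AUC used to compare models. Reading "Gleason > 3 + 3" as a Gleason sum above 6 is an assumption, as are the function names and the example scores; the AUC is computed with the standard pairwise (Mann-Whitney) formulation.

```python
def is_significant(gleason_primary, gleason_secondary, ccl_mm):
    """Significant cancer: Gleason above 3 + 3 at any core length, or CCL >= 4 mm.

    Assumes 'Gleason > 3 + 3' means a Gleason sum greater than 6.
    """
    if ccl_mm <= 0:
        return False  # no cancer core present
    return gleason_primary + gleason_secondary > 6 or ccl_mm >= 4

def roc_auc(scores, labels):
    """Probability that a random positive scores above a random negative."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical model outputs for five lesions (1 = significant cancer).
scores = [0.9, 0.8, 0.4, 0.3, 0.2]
labels = [1, 0, 1, 0, 0]
auc = roc_auc(scores, labels)
```

    Testing inter-zonal performance then amounts to scoring one zone's lesions with the other zone's model and comparing the resulting AUC values.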

  17. An Entropy-based gene selection method for cancer classification using microarray data

    Directory of Open Access Journals (Sweden)

    Krishnan Arun

    2005-03-01

    Full Text Available Background: Accurate diagnosis of cancer subtypes remains a challenging problem. Building classifiers based on gene expression data is a promising approach; yet the selection of non-redundant but relevant genes is difficult. The selected gene set should be small enough to allow diagnosis even in regular clinical laboratories and ideally identify genes involved in cancer-specific regulatory pathways. Here an entropy-based method is proposed that selects genes related to the different cancer classes while at the same time reducing the redundancy among the genes. Results: The present study identifies a subset of features by maximizing the relevance and minimizing the redundancy of the selected genes. A merit called normalized mutual information is employed to measure the relevance and the redundancy of the genes. In order to find a more representative subset of features, an iterative procedure is adopted that incorporates an initial clustering followed by data partitioning and the application of the algorithm to each of the partitions. A leave-one-out approach then selects the most commonly selected genes across all the different runs and the gene selection algorithm is applied again to pare down the list of selected genes until a minimal subset is obtained that gives a satisfactory accuracy of classification. The algorithm was applied to three different data sets and the results obtained were compared to work done by others using the same data sets. Conclusion: This study presents an entropy-based iterative algorithm for selecting genes from microarray data that are able to classify various cancer sub-types with high accuracy. In addition, the feature set obtained is very compact, that is, the redundancy between genes is reduced to a large extent. This implies that classifiers can be built with a smaller subset of genes.
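    The relevance-versus-redundancy trade-off described above can be sketched with a greedy selection loop. The normalization used here (mutual information divided by the geometric mean of the two entropies), the difference-based score, and the gene names are assumptions; the record does not give the paper's exact merit function, and the clustering and partitioning steps are omitted.

```python
import math
from collections import Counter

def entropy(xs):
    """Shannon entropy (bits) of a list of discrete values."""
    n = len(xs)
    return -sum(c / n * math.log2(c / n) for c in Counter(xs).values())

def nmi(xs, ys):
    """Mutual information normalized by sqrt(H(X) * H(Y)); 0 if either is constant."""
    hx, hy = entropy(xs), entropy(ys)
    if hx == 0 or hy == 0:
        return 0.0
    mutual = hx + hy - entropy(list(zip(xs, ys)))
    return mutual / math.sqrt(hx * hy)

def select_genes(genes, labels, k):
    """Greedily pick k genes with high relevance to labels and low mutual redundancy."""
    selected, candidates = [], list(genes)

    def score(name):
        relevance = nmi(genes[name], labels)
        if not selected:
            return relevance
        redundancy = sum(nmi(genes[name], genes[s]) for s in selected) / len(selected)
        return relevance - redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

# Hypothetical discretized expression values for eight samples in four classes.
labels = ["a", "a", "b", "b", "c", "c", "d", "d"]
genes = {
    "g1": [0, 0, 0, 0, 1, 1, 1, 1],
    "g1_copy": [0, 0, 0, 0, 1, 1, 1, 1],  # fully redundant with g1
    "g2": [0, 0, 1, 1, 0, 0, 1, 1],       # complementary to g1
}
chosen = select_genes(genes, labels, k=2)
```

    All three genes are equally relevant on their own, yet the redundant copy is skipped in favour of the complementary gene, giving a compact set that still separates all four classes.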

  18. Prognostic classifications of lymph node involvement in lung cancer and current International Association for the Study of Lung Cancer descriptive classification in zones.

    Science.gov (United States)

    Riquet, Marc; Arame, Alex; Foucault, Christophe; Le Pimpec Barthes, Françoise

    2010-09-01

    The lymphatic drainage of solid organ tumors crosses through the lymph nodes (LNs) whose tumoral involvement may still be considered as local disease. Concerning lung cancer, LN involvement may be intrapulmonary (N1), and mediastinal and/or extra-thoracic. More than 30 years ago, mediastinal involved LNs were all considered as N2, and outside the scope of surgery. In 1978, Naruke presented an original article entitled 'Lymph node mapping and curability at various levels of metastasis in resected lung cancer', demonstrating that N2 was not a contraindication to surgery in all patients. The map permitted to localize the favorable N2 on the lung cancer ipsilateral side of the mediastinum. Several maps ensued aiming to discriminate between right and left involvement (1983), and to distinguish N2 (ipsilateral) and N3 (contralateral) mediastinal LN involvement (1983, 1986). The last map (1997 regional LN classification) was recently replaced by a descriptive classification in anatomical zones. This new LN map of the TNM classification for lung cancer is a step toward using anatomical view points which might be the best way to better understand lung cancer lymphatic spread. Nowadays, the LNs are easily identified by current radiological imaging, and their resectability may be anticipated. Each LN chain may be removed by en-bloc lymphadenectomy performed during radical lung resection, a safe procedure which seems to be more oncological based than sampling, and which avoids the source of discrepancies pointed out during the labeling of LN stations by surgeons.

  19. Cancer Feature Selection and Classification Using a Binary Quantum-Behaved Particle Swarm Optimization and Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Maolong Xi

    2016-01-01

    Full Text Available This paper focuses on the feature gene selection for cancer classification, which employs an optimization algorithm to select a subset of the genes. We propose a binary quantum-behaved particle swarm optimization (BQPSO) for cancer feature gene selection, coupling support vector machine (SVM) for cancer classification. First, the proposed BQPSO algorithm is described, which is a discretized version of original QPSO for binary 0-1 optimization problems. Then, we present the principle and procedure for cancer feature gene selection and cancer classification based on BQPSO and SVM with leave-one-out cross validation (LOOCV). Finally, the BQPSO coupling SVM (BQPSO/SVM), binary PSO coupling SVM (BPSO/SVM), and genetic algorithm coupling SVM (GA/SVM) are tested for feature gene selection and cancer classification on five microarray data sets, namely, Leukemia, Prostate, Colon, Lung, and Lymphoma. The experimental results show that BQPSO/SVM has significant advantages in accuracy, robustness, and the number of feature genes selected compared with the other two algorithms.

  20. Cancer Feature Selection and Classification Using a Binary Quantum-Behaved Particle Swarm Optimization and Support Vector Machine

    Science.gov (United States)

    Sun, Jun; Liu, Li; Fan, Fangyun; Wu, Xiaojun

    2016-01-01

    This paper focuses on the feature gene selection for cancer classification, which employs an optimization algorithm to select a subset of the genes. We propose a binary quantum-behaved particle swarm optimization (BQPSO) for cancer feature gene selection, coupling support vector machine (SVM) for cancer classification. First, the proposed BQPSO algorithm is described, which is a discretized version of original QPSO for binary 0-1 optimization problems. Then, we present the principle and procedure for cancer feature gene selection and cancer classification based on BQPSO and SVM with leave-one-out cross validation (LOOCV). Finally, the BQPSO coupling SVM (BQPSO/SVM), binary PSO coupling SVM (BPSO/SVM), and genetic algorithm coupling SVM (GA/SVM) are tested for feature gene selection and cancer classification on five microarray data sets, namely, Leukemia, Prostate, Colon, Lung, and Lymphoma. The experimental results show that BQPSO/SVM has significant advantages in accuracy, robustness, and the number of feature genes selected compared with the other two algorithms. PMID:27642363
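
    The evaluation step described above, scoring a binary 0-1 gene mask by leave-one-out cross validation, can be sketched as follows. This is only an illustration under stated assumptions: a nearest-centroid rule stands in for the SVM, the toy data are invented, and the names `select`, `nearest_centroid_predict`, and `loocv_accuracy` are hypothetical rather than taken from the paper.

```python
from typing import List, Sequence


def select(row: Sequence[float], mask: Sequence[int]) -> List[float]:
    """Keep only the features (genes) whose mask bit is 1."""
    return [x for x, bit in zip(row, mask) if bit]


def nearest_centroid_predict(train_x, train_y, query):
    """Stand-in classifier (NOT an SVM): label of the closest class centroid."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        members = [x for x, y in zip(train_x, train_y) if y == label]
        centroid = [sum(col) / len(members) for col in zip(*members)]
        dist = sum((q - c) ** 2 for q, c in zip(query, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_accuracy(samples, labels, mask):
    """Leave-one-out accuracy of the stand-in classifier on the masked features."""
    if not any(mask):
        return 0.0  # an empty gene subset cannot classify anything
    reduced = [select(row, mask) for row in samples]
    hits = 0
    for i in range(len(reduced)):
        # Hold out sample i, train on the rest, then test on sample i.
        train_x = reduced[:i] + reduced[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if nearest_centroid_predict(train_x, train_y, reduced[i]) == labels[i]:
            hits += 1
    return hits / len(reduced)


# Invented toy data: feature 0 separates the classes, feature 1 is noise.
samples = [[0.1, 5.0], [0.2, 1.0], [0.9, 5.1], [1.0, 0.9]]
labels = ["A", "A", "B", "B"]
```

    In a wrapper method of this kind, a search procedure such as a binary particle swarm would call `loocv_accuracy` as the fitness of each candidate mask; the search itself is not shown here.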

  1. Interval-value Based Particle Swarm Optimization algorithm for cancer-type specific gene selection and sample classification

    Directory of Open Access Journals (Sweden)

    D. Ramyachitra

    2015-09-01

    Full Text Available Microarray technology allows simultaneous measurement of the expression levels of thousands of genes within a biological tissue sample. The fundamental power of microarrays lies within the ability to conduct parallel surveys of gene expression using microarray data. The classification of tissue samples based on gene expression data is an important problem in medical diagnosis of diseases such as cancer. In gene expression data, the number of genes is usually very high compared to the number of data samples. Thus the difficulty that lies with data are of high dimensionality and the sample size is small. This research work addresses the problem by classifying resultant dataset using the existing algorithms such as Support Vector Machine (SVM), K-nearest neighbor (KNN), Interval Valued Classification (IVC) and the improvised Interval Value based Particle Swarm Optimization (IVPSO) algorithm. Thus the results show that the IVPSO algorithm outperformed compared with other algorithms under several performance evaluation functions.

  2. Interval-value Based Particle Swarm Optimization algorithm for cancer-type specific gene selection and sample classification.

    Science.gov (United States)

    Ramyachitra, D; Sofia, M; Manikandan, P

    2015-09-01

    Microarray technology allows simultaneous measurement of the expression levels of thousands of genes within a biological tissue sample. The fundamental power of microarrays lies within the ability to conduct parallel surveys of gene expression using microarray data. The classification of tissue samples based on gene expression data is an important problem in medical diagnosis of diseases such as cancer. In gene expression data, the number of genes is usually very high compared to the number of data samples. Thus the difficulty that lies with data are of high dimensionality and the sample size is small. This research work addresses the problem by classifying resultant dataset using the existing algorithms such as Support Vector Machine (SVM), K-nearest neighbor (KNN), Interval Valued Classification (IVC) and the improvised Interval Value based Particle Swarm Optimization (IVPSO) algorithm. Thus the results show that the IVPSO algorithm outperformed compared with other algorithms under several performance evaluation functions.

  3. Gastric cancer: epidemiology, prevention, classification, and treatment

    Directory of Open Access Journals (Sweden)

    Sitarz R

    2018-02-01

    Full Text Available Robert Sitarz,1–3 Małgorzata Skierucha,1,2 Jerzy Mielko,1 G Johan A Offerhaus,3 Ryszard Maciejewski,2 Wojciech P Polkowski1 1Department of Surgical Oncology, Medical University of Lublin, Lublin, Poland; 2Department of Human Anatomy, Medical University of Lublin, Lublin, Poland; 3Department of Pathology, University Medical Centre, Utrecht, The Netherlands Abstract: Gastric cancer is the second most common cause of cancer-related deaths in the world, the epidemiology of which has changed within last decades. A trend of steady decline in gastric cancer incidence rates is the effect of the increased standards of hygiene, conscious nutrition, and Helicobacter pylori eradication, which together constitute primary prevention. Avoidance of gastric cancer remains a priority. However, patients with higher risk should be screened for early detection and chemoprevention. Surgical resection enhanced by standardized lymphadenectomy remains the gold standard in gastric cancer therapy. This review briefly summarizes the most important aspects of gastric cancers, which include epidemiology, risk factors, classification, diagnosis, prevention, and treatment. The paper is mostly addressed to physicians who are interested in updating the state of art concerning gastric carcinoma from easily accessible and credible source. Keywords: gastric cancer, epidemiology, classification, risk factors, treatment

  4. Visualization and tissue classification of human breast cancer images using ultrahigh-resolution OCT (Conference Presentation)

    Science.gov (United States)

    Yao, Xinwen; Gan, Yu; Chang, Ernest W.; Hibshoosh, Hanina; Feldman, Sheldon; Hendon, Christine P.

    2017-02-01

    We employed a home-built ultrahigh resolution (UHR) OCT system at 800nm to image human breast cancer samples ex vivo. The system has an axial resolution of 2.72µm and a lateral resolution of 5.52µm with an extended imaging range of 1.78mm. Over 900 UHR OCT volumes were generated on specimens from 23 breast cancer cases. With better spatial resolution, detailed structures in the breast tissue were better defined. Different types of breast cancer as well as healthy breast tissue can be well delineated from the UHR OCT images. To quantitatively evaluate the advantages of UHR OCT imaging of breast cancer, features derived from OCT intensity images were used as inputs to a machine learning model, the relevance vector machine. A trained machine learning model was employed to evaluate the performance of tissue classification based on UHR OCT images for differentiating tissue types in the breast samples, including adipose tissue, healthy stroma and cancerous region. For adipose tissue, grid-based local features were extracted from OCT intensity data, including standard deviation, entropy, and homogeneity. We showed that it was possible to enhance the classification performance on distinguishing fat tissue from non-fat tissue by using the UHR images when compared with the results based on OCT images from a commercial 1300 nm OCT system. For invasive ductal carcinoma (IDC) and normal stroma differentiation, the classification was based on frame-based features that portray signal penetration depth and tissue reflectivity. The confusion matrix indicated a sensitivity of 97.5% and a sensitivity of 77.8%.

  5. Classification of Breast Cancer Subtypes by combining Gene Expression and DNA Methylation Data

    Directory of Open Access Journals (Sweden)

    List Markus

    2014-06-01

    Full Text Available Selecting the most promising treatment strategy for breast cancer crucially depends on determining the correct subtype. In recent years, gene expression profiling has been investigated as an alternative to histochemical methods. Since databases like TCGA provide easy and unrestricted access to gene expression data for hundreds of patients, the challenge is to extract a minimal optimal set of genes with good prognostic properties from a large bulk of genes making a moderate contribution to classification. Several studies have successfully applied machine learning algorithms to solve this so-called gene selection problem. However, more diverse data from other OMICS technologies are available, including methylation. We hypothesize that combining methylation and gene expression data could already lead to a largely improved classification model, since the resulting model will reflect differences not only on the transcriptomic, but also on an epigenetic level. We compared so-called random forest derived classification models based on gene expression and methylation data alone, to a model based on the combined features and to a model based on the gold standard PAM50. We obtained bootstrap errors of 10-20% and classification error of 1-50%, depending on breast cancer subtype and model. The gene expression model was clearly superior to the methylation model, which was also reflected in the combined model, which mainly selected features from gene expression data. However, the methylation model was able to identify unique features not considered as relevant by the gene expression model, which might provide deeper insights into breast cancer subtype differentiation on an epigenetic level.

  6. Classification of breast cancer histology images using Convolutional Neural Networks.

    Directory of Open Access Journals (Sweden)

    Teresa Araújo

    Full Text Available Breast cancer is one of the main causes of cancer death worldwide. The diagnosis of biopsy tissue with hematoxylin and eosin stained images is non-trivial and specialists often disagree on the final diagnosis. Computer-aided Diagnosis systems contribute to reduce the cost and increase the efficiency of this process. Conventional classification approaches rely on feature extraction methods designed for a specific problem based on field-knowledge. To overcome the many difficulties of the feature-based approaches, deep learning methods are becoming important alternatives. A method for the classification of hematoxylin and eosin stained breast biopsy images using Convolutional Neural Networks (CNNs) is proposed. Images are classified in four classes, normal tissue, benign lesion, in situ carcinoma and invasive carcinoma, and in two classes, carcinoma and non-carcinoma. The architecture of the network is designed to retrieve information at different scales, including both nuclei and overall tissue organization. This design allows the extension of the proposed system to whole-slide histology images. The features extracted by the CNN are also used for training a Support Vector Machine classifier. Accuracies of 77.8% for four class and 83.3% for carcinoma/non-carcinoma are achieved. The sensitivity of our method for cancer cases is 95.6%.

  7. Mechanism-based classification and physical therapy management of persons with cancer pain: A prospective case series

    Directory of Open Access Journals (Sweden)

    Senthil P Kumar

    2013-01-01

    Full Text Available Context: Mechanism-based classification (MBC) was established with current evidence and physical therapy (PT) management methods for both cancer and for noncancer pain. Aims: This study aims to describe the efficacy of MBC-based PT in persons with primary complaints of cancer pain. Settings and Design: A prospective case series of patients who attended the physiotherapy department of a multispecialty university-affiliated teaching hospital. Material and Methods: A total of 24 adults (18 female, 6 male) aged 47.5 ± 10.6 years, with primary diagnosis of heterogeneous group of cancer, chief complaints of chronic disabling pain were included in the study on their consent for participation. The patients were evaluated and classified on the basis of five predominant mechanisms for pain. Physical therapy interventions were recommended based on mechanisms identified and home program was prescribed with a patient log to ensure compliance. Treatments were given in five consecutive weekly sessions for five weeks each of 30 min duration. Statistical Analysis Used: Pre-post comparisons for pain severity (PS) and pain interference (PI) subscales of Brief pain inventory-Cancer pain (BPI-CP) and, European organization for research and treatment in cancer-quality of life questionnaire (EORTC-QLQ-C30) were done using Wilcoxon signed-rank test at 95% confidence interval using SPSS for Windows version 16.0 (SPSS Inc, Chicago, IL). Results: There were statistically significant (P < 0.05) reduction in pain severity, pain interference and total BPI-CP scores, and the EORTC-QLQ-C30. Conclusion: MBC-PT was effective for improving BPI-CP and EORTC-QLQ-C30 scores in people with cancer pain.

  8. Using rule-based machine learning for candidate disease gene prioritization and sample classification of cancer gene expression data.

    Science.gov (United States)

    Glaab, Enrico; Bacardit, Jaume; Garibaldi, Jonathan M; Krasnogor, Natalio

    2012-01-01

    Microarray data analysis has been shown to provide an effective tool for studying cancer and genetic diseases. Although classical machine learning techniques have successfully been applied to find informative genes and to predict class labels for new samples, common restrictions of microarray analysis such as small sample sizes, a large attribute space and high noise levels still limit its scientific and clinical applications. Increasing the interpretability of prediction models while retaining a high accuracy would help to exploit the information content in microarray data more effectively. For this purpose, we evaluate our rule-based evolutionary machine learning systems, BioHEL and GAssist, on three public microarray cancer datasets, obtaining simple rule-based models for sample classification. A comparison with other benchmark microarray sample classifiers based on three diverse feature selection algorithms suggests that these evolutionary learning techniques can compete with state-of-the-art methods like support vector machines. The obtained models reach accuracies above 90% in two-level external cross-validation, with the added value of facilitating interpretation by using only combinations of simple if-then-else rules. As a further benefit, a literature mining analysis reveals that prioritizations of informative genes extracted from BioHEL's classification rule sets can outperform gene rankings obtained from a conventional ensemble feature selection in terms of the pointwise mutual information between relevant disease terms and the standardized names of top-ranked genes.

  9. A New Feature Ensemble with a Multistage Classification Scheme for Breast Cancer Diagnosis

    Directory of Open Access Journals (Sweden)

    Idil Isikli Esener

    2017-01-01

    Full Text Available A new and effective feature ensemble with a multistage classification is proposed to be implemented in a computer-aided diagnosis (CAD) system for breast cancer diagnosis. A publicly available mammogram image dataset collected during the Image Retrieval in Medical Applications (IRMA) project is utilized to verify the suggested feature ensemble and multistage classification. In achieving the CAD system, feature extraction is performed on the mammogram region of interest (ROI) images which are preprocessed by applying a histogram equalization followed by a nonlocal means filtering. The proposed feature ensemble is formed by concatenating the local configuration pattern-based, statistical, and frequency domain features. The classification process of these features is implemented in three cases: a one-stage study, a two-stage study, and a three-stage study. Eight well-known classifiers are used in all cases of this multistage classification scheme. Additionally, the results of the classifiers that provide the top three performances are combined via a majority voting technique to improve the recognition accuracy on both two- and three-stage studies. A maximum of 85.47%, 88.79%, and 93.52% classification accuracies are attained by the one-, two-, and three-stage studies, respectively. The proposed multistage classification scheme is more effective than the single-stage classification for breast cancer diagnosis.
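
    The majority voting step mentioned above can be sketched in a few lines. This is a generic illustration, not the authors' implementation: the function name `majority_vote` and the example labels are hypothetical, and the tie-breaking behavior (first label encountered wins) is an assumption that only matters when more than two classes split the vote evenly.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-classifier label lists (all the same length) sample by sample."""
    voted = []
    for sample_votes in zip(*predictions):
        # most_common(1) returns the label with the highest count;
        # on a tie, the label seen first among the tied ones is returned.
        voted.append(Counter(sample_votes).most_common(1)[0][0])
    return voted


# Hypothetical outputs of three classifiers on three samples.
combined = majority_vote([
    ["benign", "malignant", "malignant"],
    ["benign", "benign", "malignant"],
    ["malignant", "benign", "malignant"],
])
```

    With three voters and two classes a tie cannot occur, which is one practical reason to combine an odd number of classifiers.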

  10. MO-DE-207B-03: Improved Cancer Classification Using Patient-Specific Biological Pathway Information Via Gene Expression Data

    Energy Technology Data Exchange (ETDEWEB)

    Young, M; Craft, D [Massachusetts General Hospital and Harvard Medical School, Boston, MA (United States)

    2016-06-15

    Purpose: To develop an efficient, pathway-based classification system using network biology statistics to assist in patient-specific response predictions to radiation and drug therapies across multiple cancer types. Methods: We developed PICS (Pathway Informed Classification System), a novel two-step cancer classification algorithm. In PICS, a matrix m of mRNA expression values for a patient cohort is collapsed into a matrix p of biological pathways. The entries of p, which we term pathway scores, are obtained from either principal component analysis (PCA), normal tissue centroid (NTC), or gene expression deviation (GED). The pathway score matrix is clustered using both k-means and hierarchical clustering, and a clustering is judged by how well it groups patients into distinct survival classes. The most effective pathway scoring/clustering combination, per clustering p-value, thus generates various ‘signatures’ for conventional and functional cancer classification. Results: PICS successfully regularized large dimension gene data, separated normal and cancerous tissues, and clustered a large patient cohort spanning six cancer types. Furthermore, PICS clustered patient cohorts into distinct, statistically-significant survival groups. For a suboptimally-debulked ovarian cancer set, the pathway-classified Kaplan-Meier survival curve (p = .00127) showed significant improvement over that of a prior gene expression-classified study (p = .0179). For a pancreatic cancer set, the pathway-classified Kaplan-Meier survival curve (p = .00141) showed significant improvement over that of a prior gene expression-classified study (p = .04). Pathway-based classification confirmed biomarkers for the pyrimidine, WNT-signaling, glycerophosphoglycerol, beta-alanine, and panthothenic acid pathways for ovarian cancer. Despite its robust nature, PICS requires significantly less run time than current pathway scoring methods. Conclusion: This work validates the PICS method to improve

  11. Cancer classification using the Immunoscore: a worldwide task force.

    Science.gov (United States)

    Galon, Jérôme; Pagès, Franck; Marincola, Francesco M; Angell, Helen K; Thurin, Magdalena; Lugli, Alessandro; Zlobec, Inti; Berger, Anne; Bifulco, Carlo; Botti, Gerardo; Tatangelo, Fabiana; Britten, Cedrik M; Kreiter, Sebastian; Chouchane, Lotfi; Delrio, Paolo; Arndt, Hartmann; Asslaber, Martin; Maio, Michele; Masucci, Giuseppe V; Mihm, Martin; Vidal-Vanaclocha, Fernando; Allison, James P; Gnjatic, Sacha; Hakansson, Leif; Huber, Christoph; Singh-Jasuja, Harpreet; Ottensmeier, Christian; Zwierzina, Heinz; Laghi, Luigi; Grizzi, Fabio; Ohashi, Pamela S; Shaw, Patricia A; Clarke, Blaise A; Wouters, Bradly G; Kawakami, Yutaka; Hazama, Shoichi; Okuno, Kiyotaka; Wang, Ena; O'Donnell-Tormey, Jill; Lagorce, Christine; Pawelec, Graham; Nishimura, Michael I; Hawkins, Robert; Lapointe, Réjean; Lundqvist, Andreas; Khleif, Samir N; Ogino, Shuji; Gibbs, Peter; Waring, Paul; Sato, Noriyuki; Torigoe, Toshihiko; Itoh, Kyogo; Patel, Prabhu S; Shukla, Shilin N; Palmqvist, Richard; Nagtegaal, Iris D; Wang, Yili; D'Arrigo, Corrado; Kopetz, Scott; Sinicrope, Frank A; Trinchieri, Giorgio; Gajewski, Thomas F; Ascierto, Paolo A; Fox, Bernard A

    2012-10-03

    Prediction of clinical outcome in cancer is usually achieved by histopathological evaluation of tissue samples obtained during surgical resection of the primary tumor. Traditional tumor staging (AJCC/UICC-TNM classification) summarizes data on tumor burden (T), presence of cancer cells in draining and regional lymph nodes (N) and evidence for metastases (M). However, it is now recognized that clinical outcome can significantly vary among patients within the same stage. The current classification provides limited prognostic information, and does not predict response to therapy. Recent literature has alluded to the importance of the host immune system in controlling tumor progression. Thus, evidence supports the notion to include immunological biomarkers, implemented as a tool for the prediction of prognosis and response to therapy. Accumulating data, collected from large cohorts of human cancers, has demonstrated the impact of immune-classification, which has a prognostic value that may add to the significance of the AJCC/UICC TNM-classification. It is therefore imperative to begin to incorporate the 'Immunoscore' into traditional classification, thus providing an essential prognostic and potentially predictive tool. Introduction of this parameter as a biomarker to classify cancers, as part of routine diagnostic and prognostic assessment of tumors, will facilitate clinical decision-making including rational stratification of patient treatment. Equally, the inherent complexity of quantitative immunohistochemistry, in conjunction with protocol variation across laboratories, analysis of different immune cell types, inconsistent region selection criteria, and variable ways to quantify immune infiltration, all underline the urgent requirement to reach assay harmonization. In an effort to promote the Immunoscore in routine clinical settings, an international task force was initiated. 
    This review represents a follow-up of the announcement of this initiative, and of the J

  12. Cancer classification using the Immunoscore: a worldwide task force

    Directory of Open Access Journals (Sweden)

    Galon Jérôme

    2012-10-01

    Full Text Available Abstract Prediction of clinical outcome in cancer is usually achieved by histopathological evaluation of tissue samples obtained during surgical resection of the primary tumor. Traditional tumor staging (AJCC/UICC-TNM classification) summarizes data on tumor burden (T), presence of cancer cells in draining and regional lymph nodes (N) and evidence for metastases (M). However, it is now recognized that clinical outcome can significantly vary among patients within the same stage. The current classification provides limited prognostic information, and does not predict response to therapy. Recent literature has alluded to the importance of the host immune system in controlling tumor progression. Thus, evidence supports the notion to include immunological biomarkers, implemented as a tool for the prediction of prognosis and response to therapy. Accumulating data, collected from large cohorts of human cancers, has demonstrated the impact of immune-classification, which has a prognostic value that may add to the significance of the AJCC/UICC TNM-classification. It is therefore imperative to begin to incorporate the ‘Immunoscore’ into traditional classification, thus providing an essential prognostic and potentially predictive tool. Introduction of this parameter as a biomarker to classify cancers, as part of routine diagnostic and prognostic assessment of tumors, will facilitate clinical decision-making including rational stratification of patient treatment. Equally, the inherent complexity of quantitative immunohistochemistry, in conjunction with protocol variation across laboratories, analysis of different immune cell types, inconsistent region selection criteria, and variable ways to quantify immune infiltration, all underline the urgent requirement to reach assay harmonization. In an effort to promote the Immunoscore in routine clinical settings, an international task force was initiated. This review represents a follow-up of the announcement of

  13. An iterated Laplacian based semi-supervised dimensionality reduction for classification of breast cancer on ultrasound images.

    Science.gov (United States)

    Liu, Xiao; Shi, Jun; Zhou, Shichong; Lu, Minhua

    2014-01-01

    The dimensionality reduction is an important step in ultrasound image based computer-aided diagnosis (CAD) for breast cancer. A newly proposed l2,1 regularized correntropy algorithm for robust feature selection (CRFS) has achieved good performance for noise corrupted data. Therefore, it has the potential to reduce the dimensions of ultrasound image features. However, in clinical practice, the collection of labeled instances is usually expensive and time costing, while it is relatively easy to acquire the unlabeled or undetermined instances. Therefore, the semi-supervised learning is very suitable for clinical CAD. The iterated Laplacian regularization (Iter-LR) is a new regularization method, which has been proved to outperform the traditional graph Laplacian regularization in semi-supervised classification and ranking. In this study, to augment the classification accuracy of the breast ultrasound CAD based on texture feature, we propose an Iter-LR-based semi-supervised CRFS (Iter-LR-CRFS) algorithm, and then apply it to reduce the feature dimensions of ultrasound images for breast CAD. We compared the Iter-LR-CRFS with LR-CRFS, original supervised CRFS, and principal component analysis. The experimental results indicate that the proposed Iter-LR-CRFS significantly outperforms all other algorithms.

  14. Classification of breast cancer patients using somatic mutation profiles and machine learning approaches.

    Science.gov (United States)

    Vural, Suleyman; Wang, Xiaosheng; Guda, Chittibabu

    2016-08-26

    The high degree of heterogeneity observed in breast cancers makes it very difficult to classify the cancer patients into distinct clinical subgroups and consequently limits the ability to devise effective therapeutic strategies. Several classification strategies based on ER/PR/HER2 expression or the expression profiles of a panel of genes have helped, but such methods often produce misleading results due to their dynamic nature. In contrast, somatic DNA mutations are relatively stable and lead to initiation and progression of many sporadic cancers. Hence in this study, we explore the use of gene mutation profiles to classify, characterize and predict the subgroups of breast cancers. We analyzed the whole exome sequencing data from 358 ethnically similar breast cancer patients in The Cancer Genome Atlas (TCGA) project. Somatic and non-synonymous single nucleotide variants identified from each patient were assigned a quantitative score (C-score) that represents the extent of negative impact on the gene function. Using these scores with non-negative matrix factorization method, we clustered the patients into three subgroups. By comparing the clinical stage of patients, we identified an early-stage-enriched and a late-stage-enriched subgroup. Comparison of the mutation scores of early and late-stage-enriched subgroups identified 358 genes that carry significantly higher mutations rates in the late stage subgroup. Functional characterization of these genes revealed important functional gene families that carry a heavy mutational load in the late state rich subgroup of patients. Finally, using the identified subgroups, we also developed a supervised classification model to predict the stage of the patients. This study demonstrates that gene mutation profiles can be effectively used with unsupervised machine-learning methods to identify clinically distinguishable breast cancer subgroups. The classification model developed in this method could provide a reasonable

  15. CrossLink: a novel method for cross-condition classification of cancer subtypes.

    Science.gov (United States)

    Ma, Chifeng; Sastry, Konduru S; Flore, Mario; Gehani, Salah; Al-Bozom, Issam; Feng, Yusheng; Serpedin, Erchin; Chouchane, Lotfi; Chen, Yidong; Huang, Yufei

    2016-08-22

    We considered the prediction of cancer classes (e.g. subtypes) using patient gene expression profiles that contain both systematic and condition-specific biases when compared with the training reference dataset. The conventional normalization-based approaches cannot guarantee that the gene signatures in the reference and prediction datasets always have the same distribution for all different conditions as the class-specific gene signatures change with the condition. Therefore, the trained classifier would work well under one condition but not under another. To address the problem of current normalization approaches, we propose a novel algorithm called CrossLink (CL). CL recognizes that there is no universal, condition-independent normalization mapping of signatures. In contrast, it exploits the fact that the signature is unique to its associated class under any condition and thus employs an unsupervised clustering algorithm to discover this unique signature. We assessed the performance of CL for cross-condition predictions of PAM50 subtypes of breast cancer by using a simulated dataset modeled after TCGA BRCA tumor samples with a cross-validation scheme, and datasets with known and unknown PAM50 classification. CL achieved prediction accuracy >73 %, highest among other methods we evaluated. We also applied the algorithm to a set of breast cancer tumors derived from Arabic population to assign a PAM50 classification to each tumor based on their gene expression profiles. A novel algorithm CrossLink for cross-condition prediction of cancer classes was proposed. In all test datasets, CL showed robust and consistent improvement in prediction performance over other state-of-the-art normalization and classification algorithms.

  16. Comparison of hand-craft feature based SVM and CNN based deep learning framework for automatic polyp classification.

    Science.gov (United States)

    Younghak Shin; Balasingham, Ilangko

    2017-07-01

    Colonoscopy, performed by highly trained physicians, is a standard method for polyp screening. Polyps missed during colonoscopy are a potential risk factor for colorectal cancer. In this study, we investigate an automatic polyp classification framework. We aim to compare two different approaches: a hand-crafted feature method and a convolutional neural network (CNN) based deep learning method. Combined shape and color features are used for hand-crafted feature extraction, and a support vector machine (SVM) is adopted for classification. For the CNN approach, a deep learning framework based on three convolution and pooling layers is used for classification. The proposed framework is evaluated using three public polyp databases. The experimental results show that the CNN based deep learning framework achieves better classification performance than the hand-crafted feature based methods. It achieves over 90% classification accuracy, sensitivity, specificity and precision.
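
    For reference, the four reported metrics follow directly from the counts of a binary confusion matrix. The sketch below uses the standard definitions; the function name `binary_metrics` and the counts are invented for illustration and do not come from the study.

```python
def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Standard metrics from true/false positive and negative counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,      # all correct decisions
        "sensitivity": tp / (tp + fn),      # positives that were found
        "specificity": tn / (tn + fp),      # negatives that were rejected
        "precision": tp / (tp + fp),        # positive calls that were right
    }


# Invented counts, chosen only to make the arithmetic easy to follow.
metrics = binary_metrics(tp=8, fn=2, tn=9, fp=1)
```

    With these counts, accuracy is 17/20, sensitivity 8/10, specificity 9/10, and precision 8/9.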

  17. Novelty detection for breast cancer image classification

    Science.gov (United States)

    Cichosz, Pawel; Jagodziński, Dariusz; Matysiewicz, Mateusz; Neumann, Łukasz; Nowak, Robert M.; Okuniewski, Rafał; Oleszkiewicz, Witold

    2016-09-01

    Using classification learning algorithms for medical applications may require not only refined model creation techniques and careful unbiased model evaluation, but also detecting the risk of misclassification at the time of model application. This is addressed by novelty detection, which identifies instances for which the training set is not sufficiently representative and for which it may be safer to restrain from classification and request a human expert diagnosis. The paper investigates two techniques for isolated instance identification, based on clustering and one-class support vector machines, which represent two different approaches to multidimensional outlier detection. The prediction quality for isolated instances in breast cancer image data is evaluated using the random forest algorithm and found to be substantially inferior to the prediction quality for non-isolated instances. Each of the two techniques is then used to create a novelty detection model which can be combined with a classification model and used at the time of prediction to detect instances for which the latter cannot be reliably applied. Novelty detection is demonstrated to improve random forest prediction quality and argued to deserve further investigation in medical applications.
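
    The idea of abstaining on isolated instances can be sketched as a wrapper around any classifier. The study uses clustering and one-class support vector machines to flag isolated instances; the sketch below swaps in a much simpler nearest-neighbor distance threshold purely for illustration, and the names `is_isolated` and `predict_or_abstain`, the threshold, and the data are all hypothetical.

```python
import math
from typing import Callable, Optional, Sequence, Tuple

Point = Tuple[float, ...]


def is_isolated(query: Point, training: Sequence[Point], threshold: float) -> bool:
    """Flag a query whose nearest training point is farther than the threshold."""
    nearest = min(math.dist(query, t) for t in training)
    return nearest > threshold


def predict_or_abstain(query: Point, training: Sequence[Point], threshold: float,
                       classify: Callable[[Point], str]) -> Optional[str]:
    """Return the classifier's label, or None to request expert review."""
    if is_isolated(query, training, threshold):
        return None
    return classify(query)


# Invented training points and a dummy classifier that always answers "benign".
training = [(0.0, 0.0), (1.0, 0.0)]
near = predict_or_abstain((0.5, 0.0), training, 1.0, lambda q: "benign")
far = predict_or_abstain((5.0, 5.0), training, 1.0, lambda q: "benign")
```

    Here `near` receives a label because the query lies within the threshold of a training point, while `far` is `None`, signaling that the case should go to a human expert.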

  18. Classification of breast cancer cytological specimen using convolutional neural network

    Science.gov (United States)

    Żejmo, Michał; Kowal, Marek; Korbicz, Józef; Monczak, Roman

    2017-01-01

    The paper presents a deep learning approach for automatic classification of breast tumors based on fine needle cytology. The main aim of the system is to distinguish benign from malignant cases based on microscopic images. The experiment was carried out on cytological samples derived from 50 patients (25 benign cases + 25 malignant cases) diagnosed in the Regional Hospital in Zielona Góra. To classify microscopic images, we used convolutional neural networks (CNN) of two types: GoogLeNet and AlexNet. Due to the very large size of the images of cytological specimens (on average 200000 × 100000 pixels), they were divided into smaller patches of size 256 × 256 pixels. Breast cancer classification is usually based on morphometric features of nuclei. Therefore, training and validation patches were selected using a Support Vector Machine (SVM) so that a suitable amount of cell material was depicted. Neural classifiers were tuned using a GPU-accelerated implementation of the gradient descent algorithm. Training error was defined as a cross-entropy classification loss. Classification accuracy was defined as the percentage ratio of successfully classified validation patches to the total number of validation patches. The best accuracy rate of 83% was obtained by the GoogLeNet model. We observed that more misclassified patches belong to malignant cases.
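
    To give a sense of scale for the patch extraction and the accuracy definition in this abstract, the sketch below counts full 256 × 256 patches in an image of the stated average size and computes accuracy as a percentage. It assumes non-overlapping tiling with partial edge patches discarded, which the abstract does not specify; the function names are hypothetical.

```python
def count_full_patches(width: int, height: int, patch: int = 256) -> int:
    """Number of non-overlapping patch x patch tiles that fit entirely in the image."""
    return (width // patch) * (height // patch)


def patch_accuracy(predicted, actual) -> float:
    """Percentage of validation patches whose predicted label matches the true label."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


# Average specimen size quoted in the abstract: 200000 x 100000 pixels.
n_patches = count_full_patches(200000, 100000)  # 781 columns * 390 rows = 304590
```

    Under these assumptions a single specimen yields roughly 300,000 candidate patches, which is consistent with the abstract's use of an SVM to preselect patches containing enough cell material.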

  19. Application of machine learning on brain cancer multiclass classification

    Science.gov (United States)

    Panca, V.; Rustam, Z.

    2017-07-01

    Classification of brain cancer is a problem of multiclass classification. One approach to solving this problem is to first transform it into several binary problems. The microarray gene expression dataset has the two main characteristics of medical data: an extremely large number of features (genes) and only a small number of samples. The application of machine learning to a microarray gene expression dataset mainly consists of two steps: feature selection and classification. In this paper, the features are selected using a method based on the support vector machine recursive feature elimination (SVM-RFE) principle, extended to handle multiclass classification and called multiple multiclass SVM-RFE. Instead of using only the selected features on a single classifier, this method combines the results of multiple classifiers. The features are divided into subsets and SVM-RFE is applied to each subset. Then, the features selected from each subset are passed to separate classifiers. This method enhances the feature selection ability of each single SVM-RFE. Twin support vector machine (TWSVM) is used as the classifier to reduce computational complexity. While ordinary SVM finds a single optimum hyperplane, the main objective of twin SVM is to find two non-parallel optimum hyperplanes. The experiment on the brain cancer microarray gene expression dataset shows that this method could classify 71.4% of the overall test data correctly, using 100 and 1000 genes selected by the multiple multiclass SVM-RFE feature selection method. Furthermore, the per-class results show that this method could classify data of the normal and MD classes with 100% accuracy.
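
    Two of the preparatory steps described in this abstract, turning a multiclass problem into binary ones and dividing the features into subsets, can be sketched as below. The abstract does not say which decomposition or which partitioning rule is used, so one-vs-rest relabelling and contiguous index blocks are assumptions, as are the function names.

```python
def one_vs_rest_labels(labels):
    """Map each class to a binary (+1 / -1) relabelling of all samples."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else -1 for y in labels] for c in classes}


def split_features(n_features: int, n_subsets: int):
    """Partition feature indices 0..n_features-1 into contiguous, near-equal subsets."""
    size, extra = divmod(n_features, n_subsets)
    subsets, start = [], 0
    for i in range(n_subsets):
        end = start + size + (1 if i < extra else 0)  # first `extra` subsets get one more
        subsets.append(list(range(start, end)))
        start = end
    return subsets
```

    Each feature subset would then be passed to its own feature-selection run and classifier, with the per-classifier outputs combined afterwards.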

  20. Cross-Disciplinary Analysis of Lymph Node Classification in Lung Cancer on CT Scanning.

    Science.gov (United States)

    El-Sherief, Ahmed H; Lau, Charles T; Obuchowski, Nancy A; Mehta, Atul C; Rice, Thomas W; Blackstone, Eugene H

    2017-04-01

    Accurate and consistent regional lymph node classification is an important element in the staging and multidisciplinary management of lung cancer. Regional lymph node definition sets (lymph node maps) have been created to standardize regional lymph node classification. In 2009, the International Association for the Study of Lung Cancer (IASLC) introduced a lymph node map to supersede all preexisting lymph node maps. Our aim was to study if and how lung cancer specialists apply the IASLC lymph node map when classifying thoracic lymph nodes encountered on CT scans during lung cancer staging. From April 2013 through July 2013, invitations were distributed to all members of the Fleischner Society, Society of Thoracic Radiology, General Thoracic Surgical Club, and the American Association of Bronchology and Interventional Pulmonology to participate in an anonymous online image-based and text-based 20-question survey regarding lymph node classification for lung cancer staging on CT imaging. Three hundred thirty-seven people responded (approximately 25% participation). Respondents consisted of self-reported thoracic radiologists (n = 158), thoracic surgeons (n = 102), and pulmonologists who perform endobronchial ultrasonography (n = 77). Half of the respondents (50%; 95% CI, 44%-55%) reported using the IASLC lymph node map in daily practice, with no significant differences between subspecialties. A disparity was observed between the IASLC definition sets and their interpretation and application on CT scans, in particular for lymph nodes near the thoracic inlet, anterior to the trachea, anterior to the tracheal bifurcation, near the ligamentum arteriosum, between the bronchus intermedius and esophagus, in the internal mammary space, and adjacent to the heart. Use of older lymph node maps and inconsistencies in interpretation and application of definitions in the IASLC lymph node map may potentially lead to misclassification of stage and suboptimal management of lung

  1. Using rule-based machine learning for candidate disease gene prioritization and sample classification of cancer gene expression data.

    Directory of Open Access Journals (Sweden)

    Enrico Glaab

    Full Text Available Microarray data analysis has been shown to provide an effective tool for studying cancer and genetic diseases. Although classical machine learning techniques have successfully been applied to find informative genes and to predict class labels for new samples, common restrictions of microarray analysis such as small sample sizes, a large attribute space and high noise levels still limit its scientific and clinical applications. Increasing the interpretability of prediction models while retaining a high accuracy would help to exploit the information content in microarray data more effectively. For this purpose, we evaluate our rule-based evolutionary machine learning systems, BioHEL and GAssist, on three public microarray cancer datasets, obtaining simple rule-based models for sample classification. A comparison with other benchmark microarray sample classifiers based on three diverse feature selection algorithms suggests that these evolutionary learning techniques can compete with state-of-the-art methods like support vector machines. The obtained models reach accuracies above 90% in two-level external cross-validation, with the added value of facilitating interpretation by using only combinations of simple if-then-else rules. As a further benefit, a literature mining analysis reveals that prioritizations of informative genes extracted from BioHEL's classification rule sets can outperform gene rankings obtained from a conventional ensemble feature selection in terms of the pointwise mutual information between relevant disease terms and the standardized names of top-ranked genes.

  2. Gene Expression Profiles for Predicting Metastasis in Breast Cancer: A Cross-Study Comparison of Classification Methods

    Directory of Open Access Journals (Sweden)

    Mark Burton

    2012-01-01

    Full Text Available Machine learning has increasingly been used with microarray gene expression data and for the development of classifiers using a variety of methods. However, method comparisons in cross-study datasets are very scarce. This study compares the performance of seven classification methods and the effect of voting for predicting metastasis outcome in breast cancer patients, in three situations: within the same dataset or across datasets on similar or dissimilar microarray platforms. Combining classification results from seven classifiers into one voting decision performed significantly better during internal validation as well as external validation in similar microarray platforms than the underlying classification methods. When validating between different microarray platforms, random forest, another voting-based method, proved to be the best performing method. We conclude that voting based classifiers provided an advantage with respect to classifying metastasis outcome in breast cancer patients.
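
    Combining several classifiers into one voting decision, as described in this abstract, can be sketched with a simple majority vote. The abstract does not state the exact voting rule, so unweighted majority voting is an assumption, and the prediction values below are invented for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """predictions: one list of labels per classifier, aligned by sample index."""
    voted = []
    for sample_votes in zip(*predictions):
        # most_common(1) returns [(label, count)] for the most frequent label
        voted.append(Counter(sample_votes).most_common(1)[0][0])
    return voted


# Hypothetical outputs of seven classifiers on three patients (1 = metastasis).
seven_classifiers = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 0, 1],
    [1, 1, 1],
    [0, 0, 0],
    [1, 0, 1],
    [1, 1, 0],
]
combined = majority_vote(seven_classifiers)
```

    With an odd number of classifiers and two classes, as here, a tie cannot occur, which keeps the rule unambiguous.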

  3. Normed kernel function-based fuzzy possibilistic C-means (NKFPCM) algorithm for high-dimensional breast cancer database classification with feature selection is based on Laplacian Score

    Science.gov (United States)

    Lestari, A. W.; Rustam, Z.

    2017-07-01

    In the last decade, breast cancer has become the focus of world attention, as this disease is one of the leading causes of death for women. Therefore, it is necessary to have the correct precautions and treatment. In previous studies, the Fuzzy Kernel K-Medoid algorithm has been used for multi-class data. This paper proposes an algorithm to classify high-dimensional breast cancer data using Fuzzy Possibilistic C-means (FPCM) and a new method based on clustering analysis using Normed Kernel Function-Based Fuzzy Possibilistic C-Means (NKFPCM). The objective of this paper is to obtain the best accuracy in classification of breast cancer data. In order to improve the accuracy of the two methods, the candidate features are evaluated by feature selection using the Laplacian Score. The results compare the accuracy and running time of FPCM and NKFPCM with and without feature selection.

  4. Recurrent neural networks for breast lesion classification based on DCE-MRIs

    Science.gov (United States)

    Antropova, Natasha; Huynh, Benjamin; Giger, Maryellen

    2018-02-01

    Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) plays a significant role in breast cancer screening, cancer staging, and monitoring response to therapy. Deep learning methods are rapidly being incorporated into image-based breast cancer diagnosis and prognosis. However, most of the current deep learning methods make clinical decisions based on 2-dimensional (2D) or 3D images and are not well suited for temporal image data. In this study, we develop a deep learning methodology that enables integration of clinically valuable temporal components of DCE-MRIs into deep learning-based lesion classification. Our work is performed on a database of 703 DCE-MRI cases for the task of distinguishing benign and malignant lesions, and uses the area under the ROC curve (AUC) as the performance metric in conducting that task. We train a recurrent neural network, specifically a long short-term memory network (LSTM), on sequences of image features extracted from the dynamic MRI sequences. These features are extracted with VGGNet, a convolutional neural network pre-trained on a large dataset of natural images, ImageNet. The features are obtained from various levels of the network, to capture low-, mid-, and high-level information about the lesion. Compared to a classification method that takes as input only images at a single time-point (yielding an AUC = 0.81 (se = 0.04)), our LSTM method improves lesion classification with an AUC of 0.85 (se = 0.03).
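
    The AUC used as the performance metric here can be read as the probability that a randomly chosen malignant lesion receives a higher score than a randomly chosen benign one. The sketch below computes it directly from that definition; the scores are invented, and the standard errors quoted in the abstract are not reproduced.

```python
def auc(scores_pos, scores_neg) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical classifier outputs for three malignant and three benign lesions.
example_auc = auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])  # 8 of 9 pairs ordered correctly
```

    This pairwise form is quadratic in the number of cases, which is fine for a few hundred lesions; rank-based formulas give the same value more efficiently on larger sets.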

  5. Recursive Partitioning Analysis for New Classification of Patients With Esophageal Cancer Treated by Chemoradiotherapy

    International Nuclear Information System (INIS)

    Nomura, Motoo; Shitara, Kohei; Kodaira, Takeshi; Kondoh, Chihiro; Takahari, Daisuke; Ura, Takashi; Kojima, Hiroyuki; Kamata, Minoru; Muro, Kei; Sawada, Satoshi

    2012-01-01

    Background: The 7th edition of the American Joint Committee on Cancer staging system does not include lymph node size in the guidelines for staging patients with esophageal cancer. The objectives of this study were to determine the prognostic impact of the maximum metastatic lymph node diameter (ND) on survival and to develop and validate a new staging system for patients with esophageal squamous cell cancer who were treated with definitive chemoradiotherapy (CRT). Methods: Information on 402 patients with esophageal cancer undergoing CRT at two institutions was reviewed. Univariate and multivariate analyses of data from one institution were used to assess the impact of clinical factors on survival, and recursive partitioning analysis was performed to develop the new staging classification. To assess its clinical utility, the new classification was validated using data from the second institution. Results: By multivariate analysis, gender, T, N, and ND stages were independently and significantly associated with survival (p < 0.05). The resulting new staging classification was based on the T and ND. The four new stages led to good separation of survival curves in both the developmental and validation datasets (p < 0.05). Conclusions: Our results showed that lymph node size is a strong independent prognostic factor and that the new staging system, which incorporated lymph node size, provided good prognostic power, and discriminated effectively for patients with esophageal cancer undergoing CRT.

  6. Laser Raman detection for oral cancer based on an adaptive Gaussian process classification method with posterior probabilities

    International Nuclear Information System (INIS)

    Du, Zhanwei; Yang, Yongjian; Bai, Yuan; Wang, Lijun; Su, Le; Chen, Yong; Li, Xianchang; Zhou, Xiaodong; Shen, Aiguo; Hu, Jiming; Jia, Jun

    2013-01-01

    The existing methods for early and differential diagnosis of oral cancer are limited due to the unapparent early symptoms and the imperfect imaging examination methods. In this paper, the classification models of oral adenocarcinoma, carcinoma tissues and a control group with just four features are established by utilizing the hybrid Gaussian process (HGP) classification algorithm, with the introduction of the mechanisms of noise reduction and posterior probability. HGP shows much better performance in the experimental results. During the experimental process, oral tissues were divided into three groups, adenocarcinoma (n = 87), carcinoma (n = 100) and the control group (n = 134). The spectral data for these groups were collected. The prospective application of the proposed HGP classification method improved the diagnostic sensitivity to 56.35% and the specificity to about 70.00%, and resulted in a Matthews correlation coefficient (MCC) of 0.36. It is proved that the utilization of HGP in LRS detection analysis for the diagnosis of oral cancer gives accurate results. The prospect of application is also satisfactory.
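
    The Matthews correlation coefficient mentioned in this abstract summarizes a confusion matrix in a single value between -1 and 1. The sketch below gives the standard binary formula; the study itself involves three groups, so this is an illustration of the measure rather than a reconstruction of the reported 0.36.

```python
import math


def mcc(tp: int, fp: int, tn: int, fn: int) -> float:
    """Binary Matthews correlation coefficient; 0.0 when any marginal is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # conventional value when the coefficient is undefined
    return (tp * tn - fp * fn) / denom
```

    A value of 1 means perfect agreement, 0 means no better than chance, and -1 means every prediction is inverted.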

  7. Laser Raman detection for oral cancer based on an adaptive Gaussian process classification method with posterior probabilities

    Science.gov (United States)

    Du, Zhanwei; Yang, Yongjian; Bai, Yuan; Wang, Lijun; Su, Le; Chen, Yong; Li, Xianchang; Zhou, Xiaodong; Jia, Jun; Shen, Aiguo; Hu, Jiming

    2013-03-01

    The existing methods for early and differential diagnosis of oral cancer are limited due to the unapparent early symptoms and the imperfect imaging examination methods. In this paper, the classification models of oral adenocarcinoma, carcinoma tissues and a control group with just four features are established by utilizing the hybrid Gaussian process (HGP) classification algorithm, with the introduction of the mechanisms of noise reduction and posterior probability. HGP shows much better performance in the experimental results. During the experimental process, oral tissues were divided into three groups, adenocarcinoma (n = 87), carcinoma (n = 100) and the control group (n = 134). The spectral data for these groups were collected. The prospective application of the proposed HGP classification method improved the diagnostic sensitivity to 56.35% and the specificity to about 70.00%, and resulted in a Matthews correlation coefficient (MCC) of 0.36. It is proved that the utilization of HGP in LRS detection analysis for the diagnosis of oral cancer gives accurate results. The prospect of application is also satisfactory.

  8. An enhanced topologically significant directed random walk in cancer classification using gene expression datasets

    Directory of Open Access Journals (Sweden)

    Choon Sen Seah

    2017-12-01

    Full Text Available Microarray technology has become one of the elementary tools for researchers to study the genome of organisms. As the complexity and heterogeneity of cancer are increasingly appreciated through genomic analysis, cancer classification is emerging as an important trend. Significant directed random walk is proposed as a cancer classification approach that offers higher sensitivity in risk gene prediction and higher accuracy in cancer classification. In this paper, the methodology and materials used for the experiment are presented. A tuning parameter selection method and the use of weight as a parameter are applied in the proposed approach. A gene expression dataset is used as the input, while a pathway dataset is used as the reference to build a directed graph that completes the bias process in the random walk approach. In addition, we demonstrate that our approach can improve the sensitivity of predictions and yield more accurate, biologically meaningful classification results. Significant directed random walk is compared with directed random walk to show the improvement in sensitivity of prediction and accuracy of cancer classification.
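
    The abstract does not give the update rule of its random walk, so the sketch below shows only the generic idea of a random walk with restart on a directed graph, where node weights are propagated along edges and partly reset to their initial values at each step. The restart probability, the handling of nodes without outgoing edges, and the three-gene graph are all assumptions.

```python
def random_walk_with_restart(edges, start, restart=0.3, iters=100):
    """edges: node -> list of successor nodes; start: initial weights summing to 1."""
    nodes = list(start)
    w = dict(start)
    for _ in range(iters):
        nxt = {n: restart * start[n] for n in nodes}  # restart term
        for n in nodes:
            succ = edges.get(n, [])
            if succ:
                share = (1 - restart) * w[n] / len(succ)
                for s in succ:
                    nxt[s] += share  # pass weight along each outgoing edge
            else:
                # Node with no outgoing edge: return its weight to the start distribution.
                for m in nodes:
                    nxt[m] += (1 - restart) * w[n] * start[m]
        w = nxt
    return w


# Hypothetical pathway fragment: gene a regulates b, which regulates c.
toy_edges = {"a": ["b"], "b": ["c"], "c": []}
toy_start = {"a": 1 / 3, "b": 1 / 3, "c": 1 / 3}
toy_weights = random_walk_with_restart(toy_edges, toy_start)
```

    In this toy graph the downstream genes b and c end up with more weight than a, which illustrates how edge direction biases the final gene weights.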

  9. Multiclass classification for skin cancer profiling based on the integration of heterogeneous gene expression series.

    Science.gov (United States)

    Gálvez, Juan Manuel; Castillo, Daniel; Herrera, Luis Javier; San Román, Belén; Valenzuela, Olga; Ortuño, Francisco Manuel; Rojas, Ignacio

    2018-01-01

    Most of the research studies developed applying microarray technology to the characterization of different pathological states of any disease may fail in reaching statistically significant results. This is largely due to the small repertoire of analysed samples, and to the limitation in the number of states or pathologies usually addressed. Moreover, the influence of potential deviations on the gene expression quantification is usually disregarded. In spite of the continuous changes in omic sciences, reflected for instance in the emergence of new Next-Generation Sequencing-related technologies, the existing availability of a vast amount of gene expression microarray datasets should be properly exploited. Therefore, this work proposes a novel methodological approach involving the integration of several heterogeneous skin cancer series, and a later multiclass classifier design. This approach is thus a way to provide the clinicians with an intelligent diagnosis support tool based on the use of a robust set of selected biomarkers, which simultaneously distinguishes among different cancer-related skin states. To achieve this, a multi-platform combination of microarray datasets from Affymetrix and Illumina manufacturers was carried out. This integration is expected to strengthen the statistical robustness of the study as well as the finding of highly-reliable skin cancer biomarkers. Specifically, the designed operation pipeline has allowed the identification of a small subset of 17 differentially expressed genes (DEGs) from which to distinguish among 7 involved skin states. These genes were obtained from the assessment of a number of potential batch effects on the gene expression data. The biological interpretation of these genes was inspected in the specific literature to understand their underlying information in relation to skin cancer. 
    Finally, in order to assess their possible effectiveness in cancer diagnosis, a cross-validation Support Vector Machines (SVM)-based

  10. Histological image classification using biologically interpretable shape-based features

    International Nuclear Information System (INIS)

    Kothari, Sonal; Phan, John H; Young, Andrew N; Wang, May D

    2013-01-01

    Automatic cancer diagnostic systems based on histological image classification are important for improving therapeutic decisions. Previous studies propose textural and morphological features for such systems. These features capture patterns in histological images that are useful for both cancer grading and subtyping. However, because many of these features lack a clear biological interpretation, pathologists may be reluctant to adopt these features for clinical diagnosis. We examine the utility of biologically interpretable shape-based features for classification of histological renal tumor images. Using Fourier shape descriptors, we extract shape-based features that capture the distribution of stain-enhanced cellular and tissue structures in each image and evaluate these features using a multi-class prediction model. We compare the predictive performance of the shape-based diagnostic model to that of traditional models, i.e., using textural, morphological and topological features. The shape-based model, with an average accuracy of 77%, outperforms or complements traditional models. We identify the most informative shapes for each renal tumor subtype from the top-selected features. Results suggest that these shapes are not only accurate diagnostic features, but also correlate with known biological characteristics of renal tumors. Shape-based analysis of histological renal tumor images accurately classifies disease subtypes and reveals biologically insightful discriminatory features. This method for shape-based analysis can be extended to other histological datasets to aid pathologists in diagnostic and therapeutic decisions.
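
    Fourier shape descriptors, as named in this abstract, are commonly obtained by treating a closed contour as a sequence of complex numbers and taking its discrete Fourier transform. The sketch below follows that textbook formulation; whether the paper normalizes its descriptors in the same way is not stated, so the normalization and the function name are assumptions.

```python
import cmath


def fourier_descriptors(contour, k: int):
    """contour: list of (x, y) boundary points in order; returns k normalized magnitudes."""
    z = [complex(x, y) for x, y in contour]
    n = len(z)
    # Discrete Fourier transform of the complex boundary sequence.
    coeffs = [
        sum(z[t] * cmath.exp(-2j * cmath.pi * u * t / n) for t in range(n)) / n
        for u in range(n)
    ]
    # Skipping coeffs[0] removes position; dividing by |coeffs[1]| removes scale;
    # taking magnitudes removes rotation and the choice of starting point.
    scale = abs(coeffs[1])
    return [abs(coeffs[u]) / scale for u in range(2, 2 + k)]
```

    Because of this normalization, a shape and a translated, uniformly scaled copy of it produce the same descriptor vector, which is what makes such features usable across images.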

  11. Three-class classification in computer-aided diagnosis of breast cancer by support vector machine

    Science.gov (United States)

    Sun, Xuejun; Qian, Wei; Song, Dansheng

    2004-05-01

    The design of the classifier in a computer-aided diagnosis (CAD) scheme for breast cancer plays an important role in its overall sensitivity and specificity. Classification of a detected object as malignant lesion, benign lesion, or normal tissue on a mammogram is a typical three-class pattern recognition problem. This paper presents a three-class classification approach using a two-stage classifier combined with a support vector machine (SVM) learning algorithm for classification of breast cancer on mammograms. The first classification stage separates abnormal areas from normal breast tissue, and the second stage classifies detected abnormal objects as malignant or benign. A series of spatial, morphological and texture features were extracted from the detected object areas. Using a genetic algorithm (GA), different feature groups were investigated for the different classification stages. Computerized free-response receiver operating characteristic (FROC) and receiver operating characteristic (ROC) analyses were employed in the different classification stages. The results show a clear improvement in both sensitivity and specificity for the proposed approach compared with conventional two-class classification approaches, indicating its effectiveness in classification of breast cancer on mammograms.
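
    The two-stage structure described in this abstract can be sketched as a cascade of two binary decisions. The stage classifiers here are placeholder threshold rules on invented feature names, standing in for the trained SVMs of the paper.

```python
def two_stage_classify(features, is_abnormal, is_malignant) -> str:
    """Stage 1 separates normal from abnormal; stage 2 runs only on abnormal objects."""
    if not is_abnormal(features):
        return "normal"
    return "malignant" if is_malignant(features) else "benign"


# Hypothetical stand-ins for the two trained stage classifiers.
def stage1(f):
    return f["contrast"] > 0.5


def stage2(f):
    return f["irregularity"] > 0.7
```

    One consequence of this design is that each stage can use its own feature group, which matches the abstract's use of a genetic algorithm to choose features per stage.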

  12. Vessel-guided airway segmentation based on voxel classification

    DEFF Research Database (Denmark)

    Lo, Pechin Chien Pau; Sporring, Jon; Ashraf, Haseem

    2008-01-01

    This paper presents a method for improving airway tree segmentation using vessel orientation information. We use the fact that an airway branch is always accompanied by an artery, with both structures having similar orientations. This work is based on a voxel classification airway segmentation method proposed previously. The probability of a voxel belonging to the airway, from the voxel classification method, is augmented with an orientation similarity measure as a criterion for region growing. The orientation similarity measure of a voxel indicates how similar the orientation of the surroundings of a voxel, estimated based on a tube model, is to that of a neighboring vessel. The proposed method is tested on 20 CT images from different subjects selected randomly from a lung cancer screening study. Length of the airway branches from the results of the proposed method are significantly...

  13. Involvement of Machine Learning for Breast Cancer Image Classification: A Survey

    OpenAIRE

    Nahid, Abdullah-Al; Kong, Yinan

    2017-01-01

    Breast cancer is one of the largest causes of women’s death in the world today. Advanced engineering of natural image classification techniques and Artificial Intelligence methods has largely been used for the breast-image classification task. The involvement of digital image classification gives doctors and physicians a second opinion, and it saves their time. Despite the various publications on breast image classification, very few review papers are available w...

  14. Case base classification on digital mammograms: improving the performance of case base classifier

    Science.gov (United States)

    Raman, Valliappan; Then, H. H.; Sumari, Putra; Venkatesa Mohan, N.

    2011-10-01

    Breast cancer continues to be a significant public health problem in the world. Early detection is the key to improving breast cancer prognosis. The aim of the research presented here is twofold. The first stage involves machine learning techniques that segment the masses in digital mammograms and extract features from them. The second stage is a problem-solving approach that classifies masses using a performance-based case-base classifier. In this paper we build a case-based classifier to diagnose mammographic images. We explain the different methods and behaviors that have been added to the classifier to improve its performance. An initial performance-based classifier with bagging is proposed and implemented, and it shows an improvement in specificity and sensitivity.

  15. Automatic classification of ovarian cancer types from cytological images using deep convolutional neural networks.

    Science.gov (United States)

    Wu, Miao; Yan, Chuanbo; Liu, Huiqiang; Liu, Qian

    2018-06-29

    Ovarian cancer is one of the most common gynecologic malignancies. Accurate classification of ovarian cancer types (serous carcinoma, mucous carcinoma, endometrioid carcinoma, transparent cell carcinoma) is an essential part of differential diagnosis. Computer-aided diagnosis (CADx) can provide useful advice for pathologists to determine the diagnosis correctly. In our study, we employed a deep convolutional neural network (DCNN) based on AlexNet to automatically classify the different types of ovarian cancers from cytological images. The DCNN consists of five convolutional layers, three max pooling layers, and two fully connected layers. We then trained the model separately on two groups of input data: the original image data, and augmented image data produced by image enhancement and image rotation. The testing results, obtained by 10-fold cross-validation, show that the accuracy of the classification models improved from 72.76 to 78.20% when augmented images were used as training data. The developed scheme was useful for classifying ovarian cancers from cytological images. © 2018 The Author(s).
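
    Augmentation by image rotation, one of the two techniques named in this abstract, can be illustrated on a small pixel grid. The rotation angles used in the study are not given, so multiples of 90 degrees are an assumption, and the image enhancement step is not shown.

```python
def rotate90(img):
    """Rotate a 2D list of pixels by 90 degrees clockwise."""
    # Reversing the row order and transposing yields a clockwise rotation.
    return [list(row) for row in zip(*img[::-1])]


def augment_with_rotations(img):
    """Return the original image plus its 90, 180 and 270 degree rotations."""
    out = [img]
    for _ in range(3):
        out.append(rotate90(out[-1]))
    return out


# Hypothetical 2 x 2 grayscale patch.
patch = [[1, 2], [3, 4]]
augmented = augment_with_rotations(patch)
```

    Each source image contributes four training samples under this scheme, which enlarges the training set without collecting new specimens.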

  16. Sparse Representation Based Multi-Instance Learning for Breast Ultrasound Image Classification

    Directory of Open Access Journals (Sweden)

    Lu Bing

    2017-01-01

    Full Text Available We propose a novel method based on sparse representation for breast ultrasound image classification under the framework of multi-instance learning (MIL). After image enhancement and segmentation, concentric circle is used to extract the global and local features for improving the accuracy in diagnosis and prediction. The classification problem of ultrasound image is converted to sparse representation based MIL problem. Each instance of a bag is represented as a sparse linear combination of all basis vectors in the dictionary, and then the bag is represented by one feature vector which is obtained via sparse representations of all instances within the bag. The sparse and MIL problem is further converted to a conventional learning problem that is solved by relevance vector machine (RVM). Results of single classifiers are combined to be used for classification. Experimental results on the breast cancer datasets demonstrate the superiority of the proposed method in terms of classification accuracy as compared with state-of-the-art MIL methods.

  17. Sparse Representation Based Multi-Instance Learning for Breast Ultrasound Image Classification.

    Science.gov (United States)

    Bing, Lu; Wang, Wei

    2017-01-01

    We propose a novel method based on sparse representation for breast ultrasound image classification under the framework of multi-instance learning (MIL). After image enhancement and segmentation, concentric circle is used to extract the global and local features for improving the accuracy in diagnosis and prediction. The classification problem of ultrasound image is converted to sparse representation based MIL problem. Each instance of a bag is represented as a sparse linear combination of all basis vectors in the dictionary, and then the bag is represented by one feature vector which is obtained via sparse representations of all instances within the bag. The sparse and MIL problem is further converted to a conventional learning problem that is solved by relevance vector machine (RVM). Results of single classifiers are combined to be used for classification. Experimental results on the breast cancer datasets demonstrate the superiority of the proposed method in terms of classification accuracy as compared with state-of-the-art MIL methods.

  18. Molecular classification of gastric cancer: a new paradigm.

    Science.gov (United States)

    Shah, Manish A; Khanin, Raya; Tang, Laura; Janjigian, Yelena Y; Klimstra, David S; Gerdes, Hans; Kelsen, David P

    2011-05-01

    Gastric cancer may be subdivided into 3 distinct subtypes--proximal, diffuse, and distal gastric cancer--based on histopathologic and anatomic criteria. Each subtype is associated with unique epidemiology. Our aim is to test the hypothesis that these distinct gastric cancer subtypes may also be distinguished by gene expression analysis. Patients with localized gastric adenocarcinoma being screened for a phase II preoperative clinical trial (National Cancer Institute, NCI #5917) underwent endoscopic biopsy for fresh tumor procurement. Four to 6 targeted biopsies of the primary tumor were obtained. Macrodissection was carried out to ensure more than 80% carcinoma in the sample. HG-U133A GeneChip (Affymetrix) was used for cDNA expression analysis, and all arrays were processed and analyzed using the Bioconductor R-package. Between November 2003 and January 2006, 57 patients were screened to identify 36 patients with localized gastric cancer who had adequate RNA for expression analysis. Using supervised analysis, we built a classifier to distinguish the 3 gastric cancer subtypes, successfully classifying each into tightly grouped clusters. Leave-one-out cross-validation error was 0.14, suggesting that more than 85% of samples were classified correctly. Gene set analysis with the false discovery rate set at 0.25 identified several pathways that were differentially regulated when comparing each gastric cancer subtype to adjacent normal stomach. Subtypes of gastric cancer that have epidemiologic and histologic distinctions are also distinguished by gene expression data. These preliminary data suggest a new classification of gastric cancer with implications for improving our understanding of disease biology and identification of unique molecular drivers for each gastric cancer subtype. ©2011 AACR.
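
    Leave-one-out cross-validation, the error estimate quoted in this abstract, holds out each sample in turn, trains on the rest, and counts how often the held-out sample is misclassified. The abstract does not name the classifier, so the sketch below uses a nearest-centroid rule purely as a stand-in, with invented two-feature data.

```python
import math


def nearest_centroid_predict(train_x, train_y, x):
    """Predict the class whose mean training vector is closest to x."""
    sums, counts = {}, {}
    for xi, yi in zip(train_x, train_y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
        counts[yi] = counts.get(yi, 0) + 1
    centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
    return min(centroids, key=lambda c: math.dist(x, centroids[c]))


def loocv_error(xs, ys) -> float:
    """Fraction of samples misclassified when each is held out in turn."""
    errors = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) != ys[i]:
            errors += 1
    return errors / len(xs)


# Hypothetical data: two well-separated groups plus one sample that sits in the wrong group.
xs = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5), (5.5, 5.5)]
ys = ["a", "a", "a", "b", "b", "b", "a"]
```

    For scale, an error of 0.14 on 36 samples would correspond to about five misclassified samples, which is consistent with the abstract's statement that more than 85% were classified correctly.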

  19. Alternative Polyadenylation Patterns for Novel Gene Discovery and Classification in Cancer

    Directory of Open Access Journals (Sweden)

    Oguzhan Begik

    2017-07-01

    Full Text Available Certain aspects of diagnosis, prognosis, and treatment of cancer patients are still important challenges to be addressed. Therefore, we propose a pipeline to uncover patterns of alternative polyadenylation (APA), a hidden complexity in cancer transcriptomes, to further accelerate efforts to discover novel cancer genes and pathways. Here, we analyzed expression data for 1045 cancer patients and found a significant shift in usage of poly(A) signals in common tumor types (breast, colon, lung, prostate, gastric, and ovarian) compared to normal tissues. Using machine-learning techniques, we further defined specific subsets of APA events to efficiently classify cancer types. Furthermore, APA patterns were associated with altered protein levels in patients, revealed by antibody-based profiling data, suggesting functional significance. Overall, our study offers a computational approach for use of APA in novel gene discovery and classification in common tumor types, with important implications in basic research, biomarker discovery, and precision medicine approaches.

  20. The classification of lung cancers and their degree of malignancy by FTIR, PCA-LDA analysis, and a physics-based computational model.

    Science.gov (United States)

    Kaznowska, E; Depciuch, J; Łach, K; Kołodziej, M; Koziorowska, A; Vongsvivut, J; Zawlik, I; Cholewa, M; Cebulski, J

    2018-08-15

    Lung cancer has the highest mortality rate of all malignant tumours. The current effects of cancer treatment, as well as its diagnostics, are unsatisfactory. Therefore it is very important to introduce modern diagnostic tools, which will allow for rapid classification of lung cancers and their degree of malignancy. For this purpose, the authors propose the use of Fourier Transform InfraRed (FTIR) spectroscopy combined with Principal Component Analysis-Linear Discriminant Analysis (PCA-LDA) and a physics-based computational model. The results obtained for lung cancer tissues, adenocarcinoma and squamous cell carcinoma FTIR spectra, show a shift in wavenumbers compared to control tissue FTIR spectra. Furthermore, in the FTIR spectra of adenocarcinoma there are no peaks corresponding to glutamate or phospholipid functional groups. Moreover, in the case of G2 and G3 malignancy of adenocarcinoma lung cancer, the absence of an OH groups peak was noticed. Thus, it seems that FTIR spectroscopy is a valuable tool to classify lung cancer and to determine the degree of its malignancy. Copyright © 2018 Elsevier B.V. All rights reserved.

  1. Deep learning based classification for head and neck cancer detection with hyperspectral imaging in an animal model

    Science.gov (United States)

    Ma, Ling; Lu, Guolan; Wang, Dongsheng; Wang, Xu; Chen, Zhuo Georgia; Muller, Susan; Chen, Amy; Fei, Baowei

    2017-03-01

    Hyperspectral imaging (HSI) is an emerging imaging modality that can provide a noninvasive tool for cancer detection and image-guided surgery. HSI acquires high-resolution images at hundreds of spectral bands, providing big data for differentiating different types of tissue. We proposed a deep learning based method for the detection of head and neck cancer with hyperspectral images. Since the deep learning algorithm can learn features hierarchically, the learned features are more discriminative and concise than handcrafted features. In this study, we adopt convolutional neural networks (CNN) to learn the deep features of pixels for classifying each pixel as tumor or normal tissue. We evaluated our proposed classification method on a dataset containing hyperspectral images from 12 tumor-bearing mice. Experimental results show that our method achieved an average accuracy of 91.36%. The preliminary study demonstrated that our deep learning method can be applied to hyperspectral images for detecting head and neck tumors in animal models.

  2. Changing Histopathological Diagnostics by Genome-Based Tumor Classification

    Directory of Open Access Journals (Sweden)

    Michael Kloth

    2014-05-01

    Full Text Available Traditionally, tumors are classified by histopathological criteria, i.e., based on their specific morphological appearances. Consequently, current therapeutic decisions in oncology are strongly influenced by histology rather than underlying molecular or genomic aberrations. The increase of information on molecular changes however, enabled by the Human Genome Project and the International Cancer Genome Consortium as well as the manifold advances in molecular biology and high-throughput sequencing techniques, inaugurated the integration of genomic information into disease classification. Furthermore, in some cases it became evident that former classifications needed major revision and adaption. Such adaptations are often required by understanding the pathogenesis of a disease from a specific molecular alteration, using this molecular driver for targeted and highly effective therapies. Altogether, reclassifications should lead to higher information content of the underlying diagnoses, reflecting their molecular pathogenesis and resulting in optimized and individual therapeutic decisions. The objective of this article is to summarize some particularly important examples of genome-based classification approaches and associated therapeutic concepts. In addition to reviewing disease specific markers, we focus on potentially therapeutic or predictive markers and the relevance of molecular diagnostics in disease monitoring.

  3. A New Classification Approach Based on Multiple Classification Rules

    OpenAIRE

    Zhongmei Zhou

    2014-01-01

    A good classifier can correctly predict new data for which the class label is unknown, so it is important to construct a high accuracy classifier. Hence, classification techniques are very useful in ubiquitous computing. Associative classification achieves higher classification accuracy than some traditional rule-based classification approaches. However, the approach also has two major deficiencies. First, it generates a very large number of association classification rules, especially when t...

  4. EPA's program for risk assessment guidelines: Cancer classification issues

    Energy Technology Data Exchange (ETDEWEB)

    Wiltse, J. [Environmental Protection Agency, Washington, DC (United States)]

    1990-12-31

    Issues presented are related to classification of weight of evidence in cancer risk assessments. The focus in this paper is on lines of evidence used in constructing a conclusion about potential human carcinogenicity. The paper also discusses issues that are mistakenly addressed as classification issues but are really part of the risk assessment process. 2 figs.

  5. Classification of Dukes' B and C colorectal cancers using expression arrays

    DEFF Research Database (Denmark)

    Frederiksen, C.M.; Knudsen, Steen; Laurberg, S.

    2003-01-01

    Purpose. Colorectal cancer is one of the most common malignancies. Substaging of the cancer is of importance not only to prognosis but also to treatment. Classification of substages based on DNA microarray technology is currently the most promising approach. We therefore investigated if gene...... expression microarrays could be used to classify colorectal tumors. Methods. We used the Affymetrix oligonucleotide arrays to analyze the expression of more than 5,000 genes in samples from the sigmoid and upper rectum of the left colon. Five samples were from normal mucosa and five samples from each...... expression of one of the most common malignancies, colorectal cancer, now seems to be within reach. The data indicates that it is possible at least to classify Dukes' B and C colorectal tumors with microarrays....

  6. Detection and classification of Breast Cancer in Wavelet Sub-bands of Fractal Segmented Cancerous Zones.

    Science.gov (United States)

    Shirazinodeh, Alireza; Noubari, Hossein Ahmadi; Rabbani, Hossein; Dehnavi, Alireza Mehri

    2015-01-01

    Recent studies on wavelet transform and fractal modeling applied on mammograms for the detection of cancerous tissues indicate that microcalcifications and masses can be utilized for the study of the morphology and diagnosis of cancerous cases. It is shown that the use of fractal modeling, as applied to a given image, can clearly discern cancerous zones from noncancerous areas. In this paper, for fractal modeling, the original image is first segmented into appropriate fractal boxes followed by identifying the fractal dimension of each windowed section using a computationally efficient two-dimensional box-counting algorithm. Furthermore, using appropriate wavelet sub-bands and image reconstruction based on modified wavelet coefficients, it is shown that it is possible to arrive at enhanced features for detection of cancerous zones. In this paper, we have attempted to benefit from the advantages of both fractals and wavelets by introducing a new algorithm. By using a new algorithm named F1W2, the original image is first segmented into appropriate fractal boxes, and the fractal dimension of each windowed section is extracted. Following from that, by applying a maximum level threshold on the fractal dimension matrix, the best-segmented boxes are selected. In the next step, the candidate cancerous zones are decomposed by utilizing standard orthogonal wavelet transform and db2 wavelet in three different resolution levels, and after nullifying wavelet coefficients of the image at the first scale and low frequency band of the third scale, the modified reconstructed image is successfully utilized for detection of breast cancer regions by applying an appropriate threshold. For detection of cancerous zones, our simulations indicate an accuracy of 90.9% for masses and 88.99% for microcalcifications using the F1W2 method. For classification of detected microcalcification into benign and malignant cases, eight features are identified and
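    The two-dimensional box-counting step mentioned in this abstract can be illustrated with a short sketch. It assumes a binary image given as a set of (row, column) foreground pixels and estimates the dimension as the least-squares slope of log(box count) against log(1 / box size). The function names and the box sizes are hypothetical choices for illustration; this is not the F1W2 algorithm itself, whose details the abstract does not give.

```python
from math import log


def box_count(pixels, size):
    """Number of size-by-size boxes that contain at least one foreground pixel."""
    return len({(row // size, col // size) for row, col in pixels})


def box_counting_dimension(pixels, sizes=(1, 2, 4, 8)):
    """Least-squares slope of log(count) against log(1 / size)."""
    xs = [log(1 / size) for size in sizes]
    ys = [log(box_count(pixels, size)) for size in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


# Two sanity checks on hypothetical 16-pixel-wide shapes:
# a filled square should come out near 2, a straight line near 1.
filled_square = {(r, c) for r in range(16) for c in range(16)}
straight_line = {(0, c) for c in range(16)}

print(round(box_counting_dimension(filled_square), 3))
print(round(box_counting_dimension(straight_line), 3))
```

    Applied per window, the same estimate would yield a matrix of fractal dimensions to which a threshold such as the one described above could be applied.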

  7. The eighth TNM classification system for lung cancer: A consideration based on the degree of pleural invasion and involved neighboring structures.

    Science.gov (United States)

    Sakakura, Noriaki; Mizuno, Tetsuya; Kuroda, Hiroaki; Arimura, Takaaki; Yatabe, Yasushi; Yoshimura, Kenichi; Sakao, Yukinori

    2018-04-01

    The eighth tumor-node-metastasis (TNM) classification system for lung cancer has been used since January 2017 and must be applied to an individual institution's database. We analyzed pathological stage data of 2756 patients who underwent resection of non-small-cell lung cancer, particularly in terms of the degree of visceral pleural invasion and involved neighboring structures. Few patients had stage IIA disease (103, 4%); stratification between stages IB and IIA was insufficient (p = 0.129). When T2a tumors were divided into PL1 and PL2 subgroups based on the degree of pleural invasion, there was a significant prognostic difference between the subgroups (p consideration. Copyright © 2018 Elsevier B.V. All rights reserved.

  8. Does the use of the 2009 FIGO classification of endometrial cancer impact on indications of the sentinel node biopsy?

    Directory of Open Access Journals (Sweden)

    Ballester Marcos

    2010-08-01

    Full Text Available Background: Lymphadenectomy is debated in early stages endometrial cancer. Moreover, a new FIGO classification of endometrial cancer, merging stages IA and IB, has been recently published. Therefore, the aims of the present study were to evaluate the relevance of the sentinel node (SN) procedure in women with endometrial cancer and to discuss whether the use of the 2009 FIGO classification could modify the indications for SN procedure. Methods: Eighty-five patients with endometrial cancer underwent the SN procedure followed by pelvic lymphadenectomy. SNs were detected with a dual or single labelling method in 74 and 11 cases, respectively. All SNs were analysed by both H&E staining and immunohistochemistry. Presumed stage before surgery was assessed for all patients based on MR imaging features using the 1988 FIGO classification and the 2009 FIGO classification. Results: An SN was detected in 88.2% of cases (75/85 women). Among the fourteen patients with lymph node metastases, one-half were detected by serial sectioning and immunohistochemical analysis. There were no false negative cases. Using the 1988 FIGO classification and the 2009 FIGO classification, the correlation between preoperative MRI staging and final histology was moderate with Kappa = 0.24 and Kappa = 0.45, respectively. None of the patients with grade 1 endometrioid carcinoma on biopsy and IA 2009 FIGO stage on MR imaging exhibited positive SN. In patients with grade 2-3 endometrioid carcinoma and stage IA on MR imaging, the rate of positive SN reached 16.6% with an incidence of micrometastases of 50%. Conclusions: The present study suggests that sentinel node biopsy is an adequate technique to evaluate lymph node status. The use of the 2009 FIGO classification increases the accuracy of MR imaging to stage patients with early stages of endometrial cancer and contributes to clarify the indication of SN biopsy according to tumour grade and histological type.
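    The Kappa values reported above measure agreement between preoperative MRI staging and final histology beyond what chance would produce. A minimal sketch of Cohen's kappa from an agreement table follows; the counts are hypothetical and are not the study's data, which the abstract does not provide.

```python
def cohen_kappa(table):
    """Cohen's kappa from a square agreement table.

    Rows index the first rating (for example MRI stage) and columns the
    second (for example histological stage); cells hold patient counts.
    """
    total = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(len(table))) / total
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Agreement expected by chance if the two ratings were independent.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical two-stage table: 35 of 50 patients staged identically.
toy_table = [[20, 5],
             [10, 15]]
print(round(cohen_kappa(toy_table), 3))
```

    Here the observed agreement is 0.7 and the chance agreement is 0.5, so kappa is (0.7 - 0.5) / (1 - 0.5).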

  9. Does the use of the 2009 FIGO classification of endometrial cancer impact on indications of the sentinel node biopsy?

    International Nuclear Information System (INIS)

    Ballester, Marcos; Koskas, Martin; Coutant, Charles; Chéreau, Elisabeth; Seror, Jeremy; Rouzier, Roman; Daraï, Emile

    2010-01-01

    Lymphadenectomy is debated in early stages endometrial cancer. Moreover, a new FIGO classification of endometrial cancer, merging stages IA and IB, has been recently published. Therefore, the aims of the present study were to evaluate the relevance of the sentinel node (SN) procedure in women with endometrial cancer and to discuss whether the use of the 2009 FIGO classification could modify the indications for SN procedure. Eighty-five patients with endometrial cancer underwent the SN procedure followed by pelvic lymphadenectomy. SNs were detected with a dual or single labelling method in 74 and 11 cases, respectively. All SNs were analysed by both H&E staining and immunohistochemistry. Presumed stage before surgery was assessed for all patients based on MR imaging features using the 1988 FIGO classification and the 2009 FIGO classification. An SN was detected in 88.2% of cases (75/85 women). Among the fourteen patients with lymph node metastases, one-half were detected by serial sectioning and immunohistochemical analysis. There were no false negative cases. Using the 1988 FIGO classification and the 2009 FIGO classification, the correlation between preoperative MRI staging and final histology was moderate with Kappa = 0.24 and Kappa = 0.45, respectively. None of the patients with grade 1 endometrioid carcinoma on biopsy and IA 2009 FIGO stage on MR imaging exhibited positive SN. In patients with grade 2-3 endometrioid carcinoma and stage IA on MR imaging, the rate of positive SN reached 16.6% with an incidence of micrometastases of 50%. The present study suggests that sentinel node biopsy is an adequate technique to evaluate lymph node status. The use of the 2009 FIGO classification increases the accuracy of MR imaging to stage patients with early stages of endometrial cancer and contributes to clarify the indication of SN biopsy according to tumour grade and histological type.

  10. An NRG Oncology/GOG study of molecular classification for risk prediction in endometrioid endometrial cancer.

    Science.gov (United States)

    Cosgrove, Casey M; Tritchler, David L; Cohn, David E; Mutch, David G; Rush, Craig M; Lankes, Heather A; Creasman, William T; Miller, David S; Ramirez, Nilsa C; Geller, Melissa A; Powell, Matthew A; Backes, Floor J; Landrum, Lisa M; Timmers, Cynthia; Suarez, Adrian A; Zaino, Richard J; Pearl, Michael L; DiSilvestro, Paul A; Lele, Shashikant B; Goodfellow, Paul J

    2018-01-01

    The purpose of this study was to assess the prognostic significance of a simplified, clinically accessible classification system for endometrioid endometrial cancers combining Lynch syndrome screening and molecular risk stratification. Tumors from NRG/GOG GOG210 were evaluated for mismatch repair defects (MSI, MMR IHC, and MLH1 methylation), POLE mutations, and loss of heterozygosity. TP53 was evaluated in a subset of cases. Tumors were assigned to four molecular classes. Relationships between molecular classes and clinicopathologic variables were assessed using contingency tests and Cox proportional methods. Molecular classification was successful for 982 tumors. Based on the NCI consensus MSI panel assessing MSI and loss of heterozygosity combined with POLE testing, 49% of tumors were classified copy number stable (CNS), 39% MMR deficient, 8% copy number altered (CNA) and 4% POLE mutant. Cancer-specific mortality occurred in 5% of patients with CNS tumors; 2.6% with POLE tumors; 7.6% with MMR deficient tumors and 19% with CNA tumors. The CNA group had worse progression-free (HR 2.31, 95%CI 1.53-3.49) and cancer-specific survival (HR 3.95; 95%CI 2.10-7.44). The POLE group had improved outcomes, but the differences were not statistically significant. CNA class remained significant for cancer-specific survival (HR 2.11; 95%CI 1.04-4.26) in multivariable analysis. The CNA molecular class was associated with TP53 mutation and expression status. A simple molecular classification for endometrioid endometrial cancers that can be easily combined with Lynch syndrome screening provides important prognostic information. These findings support prospective clinical validation and further studies on the predictive value of a simplified molecular classification system. Copyright © 2017 Elsevier Inc. All rights reserved.

  11. Efficacy of the Kyoto Classification of Gastritis in Identifying Patients at High Risk for Gastric Cancer.

    Science.gov (United States)

    Sugimoto, Mitsushige; Ban, Hiromitsu; Ichikawa, Hitomi; Sahara, Shu; Otsuka, Taketo; Inatomi, Osamu; Bamba, Shigeki; Furuta, Takahisa; Andoh, Akira

    2017-01-01

    Objective The Kyoto gastritis classification categorizes the endoscopic characteristics of Helicobacter pylori (H. pylori) infection-associated gastritis and identifies patterns associated with a high risk of gastric cancer. We investigated its efficacy, comparing scores in patients with H. pylori-associated gastritis and with gastric cancer. Methods A total of 1,200 patients with H. pylori-positive gastritis alone (n=932), early-stage H. pylori-positive gastric cancer (n=189), and successfully treated H. pylori-negative cancer (n=79) were endoscopically graded according to the Kyoto gastritis classification for atrophy, intestinal metaplasia, fold hypertrophy, nodularity, and diffuse redness. Results The prevalence of O-II/O-III-type atrophy according to the Kimura-Takemoto classification in early-stage H. pylori-positive gastric cancer and successfully treated H. pylori-negative cancer groups was 45.1%, which was significantly higher than in subjects with gastritis alone (12.7%, pgastritis scores of atrophy and intestinal metaplasia in the H. pylori-positive cancer group were significantly higher than in subjects with gastritis alone (all pgastritis classification may thus be useful for detecting these patients.

  12. A comprehensive sensitivity analysis of microarray breast cancer classification under feature variability

    Directory of Open Access Journals (Sweden)

    Reinders Marcel JT

    2009-11-01

    Full Text Available Background: Large discrepancies in signature composition and outcome concordance have been observed between different microarray breast cancer expression profiling studies. This is often ascribed to differences in array platform as well as biological variability. We conjecture that other reasons for the observed discrepancies are the measurement error associated with each feature and the choice of preprocessing method. Microarray data are known to be subject to technical variation and the confidence intervals around individual point estimates of expression levels can be wide. Furthermore, the estimated expression values also vary depending on the selected preprocessing scheme. In microarray breast cancer classification studies, however, these two forms of feature variability are almost always ignored and hence their exact role is unclear. Results: We have performed a comprehensive sensitivity analysis of microarray breast cancer classification under the two types of feature variability mentioned above. We used data from six state of the art preprocessing methods, using a compendium consisting of eight different datasets, involving 1131 hybridizations, containing data from both one and two-color array technology. For a wide range of classifiers, we performed a joint study on performance, concordance and stability. In the stability analysis we explicitly tested classifiers for their noise tolerance by using perturbed expression profiles that are based on uncertainty information directly related to the preprocessing methods. Our results indicate that signature composition is strongly influenced by feature variability, even if the array platform and the stratification of patient samples are identical. In addition, we show that there is often a high level of discordance between individual class assignments for signatures constructed on data coming from different preprocessing schemes, even if the actual signature composition is identical

  13. Classification of mitocans, anti-cancer drugs acting on mitochondria

    Czech Academy of Sciences Publication Activity Database

    Neužil, Jiří; Dong, L. F.; Rohlena, Jakub; Truksa, Jaroslav; Ralph, S. J.

    2013-01-01

    Vol. 13, No. 3 (2013), pp. 199-208 ISSN 1567-7249 Institutional research plan: CEZ:AV0Z50520701 Keywords: Mitocans * Anti-cancer therapeutics * Classification Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 3.524, year: 2013

  14. Ultrasonographic characteristics and BI-RADS-US classification of BRCA1 mutation-associated breast cancer in Guangxi, China.

    Science.gov (United States)

    Li, Cheng; Liu, Junjie; Wang, Sida; Chen, Yuanyuan; Yuan, Zhigang; Zeng, Jian; Li, Zhixian

    2015-01-01

    To retrospectively analyze and compare the ultrasonographic characteristics and BI-RADS-US classification between patients with BRCA1 mutation-associated breast cancer and those without BRCA1 gene mutation in Guangxi, China. The study was performed in 36 lesions from 34 BRCA1 mutation-associated breast cancer patients. A total of 422 lesions from 422 breast cancer patients without BRCA1 mutations served as the control group. The ultrasonographic features and BI-RADS-US classification were compared between the two groups. More complex inner echo was found in BRCA1 mutation-associated breast cancer patients (χ² = 4.741, P = 0.029). The BI-RADS classification of BRCA1 mutation-associated breast cancer was lower (U = 6094.0, P = 0.022). BRCA1 mutation-associated breast cancer frequently displays a microlobulated margin and complex echo. It also shows more benign characteristics in morphology, and the BI-RADS classification is prone to be underestimated.

  15. Classification of mitocans, anti-cancer drugs acting on mitochondria

    Czech Academy of Sciences Publication Activity Database

    Neužil, Jiří; Dong, L. F.; Rohlena, Jakub; Truksa, Jaroslav; Ralph, S. J.

    2013-01-01

    Vol. 13, No. 3 (2013), pp. 199-208 ISSN 1567-7249 Institutional research plan: CEZ:AV0Z50520701 Keywords: Mitocans * Anti-cancer therapeutics * Classification Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 3.524, year: 2013

  16. Comparison of Computational Algorithms for the Classification of Liver Cancer using SELDI Mass Spectrometry: A Case Study

    Directory of Open Access Journals (Sweden)

    Robert J Hickey

    2007-01-01

    Full Text Available Introduction: As an alternative to DNA microarrays, mass spectrometry based analysis of proteomic patterns has shown great potential in cancer diagnosis. The ultimate application of this technique in clinical settings relies on the advancement of the technology itself and the maturity of the computational tools used to analyze the data. A number of computational algorithms constructed on different principles are available for the classification of disease status based on proteomic patterns. Nevertheless, few studies have addressed the difference in the performance of these approaches. In this report, we describe a comparative case study on the classification accuracy of hepatocellular carcinoma based on the serum proteomic pattern generated from a Surface Enhanced Laser Desorption/Ionization (SELDI) mass spectrometer. Methods: Nine supervised classification algorithms are implemented in R software and compared for classification accuracy. Results: We found that the support vector machine with radial function is preferable as a tool for classification of hepatocellular carcinoma using features in SELDI mass spectra. Among the rest of the methods, random forest and prediction analysis of microarrays have better performance. A permutation-based technique reveals that the support vector machine with a radial function seems intrinsically superior in learning from the training data since it has a lower prediction error than others when there is essentially no differential signal. On the other hand, the performance of the random forest and prediction analysis of microarrays relies on their capability of capturing the signals with substantial differentiation between groups. Conclusions: Our finding is similar to a previous study, where classification methods based on Matrix Assisted Laser Desorption/Ionization (MALDI) mass spectrometry are compared for the prediction accuracy of ovarian cancer. The support vector machine, random forest and prediction

  17. Classification between normal and tumor tissues based on the pair-wise gene expression ratio

    International Nuclear Information System (INIS)

    Yap, YeeLeng; Zhang, XueWu; Ling, MT; Wang, XiangHong; Wong, YC; Danchin, Antoine

    2004-01-01

    Precise classification of cancer types is critically important for early cancer diagnosis and treatment. Numerous efforts have been made to use gene expression profiles to improve precision of tumor classification. However, reliable cancer-related signals are generally lacking. Using recent datasets on colon and prostate cancer, a data transformation procedure from single gene expression to pair-wise gene expression ratio is proposed. Making use of the internal consistency of each expression profiling dataset, this transformation improves the signal to noise ratio of the dataset and uncovers new relevant cancer-related signals (features). The efficiency in using the transformed dataset to perform normal/tumor classification was investigated using feature partitioning with informative features (gene annotation) as discriminating axes (single gene expression or pair-wise gene expression ratio). Classification results were compared to the original datasets for up to 10-feature model classifiers. 82 and 262 genes that have high correlation to tissue phenotype were selected from the colon and prostate datasets respectively. Remarkably, data transformation of the highly noisy expression data successfully lowered the coefficient of variation (CV) for the within-class samples and improved the correlation with tissue phenotypes. The transformed dataset exhibited lower CV when compared to that of single gene expression. In the colon cancer set, the minimum CV decreased from 45.3% to 16.5%. In prostate cancer, comparable CV was achieved with and without transformation. This improvement in CV, coupled with the improved correlation between the pair-wise gene expression ratio and tissue phenotypes, yielded higher classification efficiency, especially with the colon dataset – from 87.1% to 93.5%. Over 90% of the top ten discriminating axes in both datasets showed significant improvement after data transformation. The high classification efficiency achieved suggested
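    One way to see why a pair-wise expression ratio can have a lower coefficient of variation (CV) than either gene alone: a sample-wide scaling factor that inflates both genes cancels in the ratio. The sketch below uses hypothetical intensities chosen to make the cancellation exact; the function names are illustrative and the numbers are not from the colon or prostate datasets.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return pstdev(values) / mean(values)


def pairwise_ratio(gene_a, gene_b):
    """Per-sample ratio of two genes' expression levels."""
    return [a / b for a, b in zip(gene_a, gene_b)]


# Hypothetical within-class intensities for three samples. Each sample is
# scaled by a different factor (1x, 2x, 4x) that affects both genes alike.
gene_a = [100.0, 200.0, 400.0]
gene_b = [50.0, 100.0, 200.0]
ratio = pairwise_ratio(gene_a, gene_b)

print("CV of gene A:", round(coefficient_of_variation(gene_a), 3))
print("CV of gene B:", round(coefficient_of_variation(gene_b), 3))
print("CV of A / B: ", coefficient_of_variation(ratio))
```

    Real data would not cancel this cleanly, which is consistent with the abstract reporting a reduced rather than vanishing CV for colon and a comparable CV for prostate.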

  18. Study design requirements for RNA sequencing-based breast cancer diagnostics.

    Science.gov (United States)

    Mer, Arvind Singh; Klevebring, Daniel; Grönberg, Henrik; Rantalainen, Mattias

    2016-02-01

    Sequencing-based molecular characterization of tumors provides information required for individualized cancer treatment. There are well-defined molecular subtypes of breast cancer that provide improved prognostication compared to routine biomarkers. However, molecular subtyping is not yet implemented in routine breast cancer care. Clinical translation is dependent on subtype prediction models providing high sensitivity and specificity. In this study we evaluate sample size and RNA-sequencing read requirements for breast cancer subtyping to facilitate rational design of translational studies. We applied subsampling to ascertain the effect of training sample size and the number of RNA sequencing reads on classification accuracy of molecular subtype and routine biomarker prediction models (unsupervised and supervised). Subtype classification accuracy improved with increasing sample size up to N = 750 (accuracy = 0.93), although with a modest improvement beyond N = 350 (accuracy = 0.92). Prediction of routine biomarkers achieved accuracy of 0.94 (ER) and 0.92 (Her2) at N = 200. Subtype classification improved with RNA-sequencing library size up to 5 million reads. Development of molecular subtyping models for cancer diagnostics requires well-designed studies. Sample size and the number of RNA sequencing reads directly influence accuracy of molecular subtyping. Results in this study provide key information for rational design of translational studies aiming to bring sequencing-based diagnostics to the clinic.

  19. Significance and Application of Digital Breast Tomosynthesis for the BI-RADS Classification of Breast Cancer.

    Science.gov (United States)

    Cai, Si-Qing; Yan, Jian-Xiang; Chen, Qing-Shi; Huang, Mei-Ling; Cai, Dong-Lu

    2015-01-01

    Full-field digital mammography (FFDM) of dense breasts has a high rate of missed diagnosis, and digital breast tomosynthesis (DBT) could reduce tissue overlap and provide more reliable images for BI-RADS classification. This study aims to explore the effect and significance of COMBO (FFDM+DBT) for BI-RADS classification of breast cancer. In this study, we selected 832 patients who had been treated from May 2013 to November 2013. FFDM and COMBO examinations were classified separately according to BI-RADS, and the two were compared for the same patient in terms of gland assessment, display of mass characteristics, and indirect signs. A paired Wilcoxon rank sum test was used in 79 breast cancer patients to find differences between the two examination methods. The results indicated that the COMBO pattern shows more detail in the distribution of glands when estimating content. The paired Wilcoxon rank sum test showed that the overall classification level of COMBO is significantly higher than that of FFDM for BI-RADS diagnosis and classification of breast (PBI-RADS classification in breast cancer in clinical.

  20. [New molecular classification of colorectal cancer, pancreatic cancer and stomach cancer: Towards "à la carte" treatment?].

    Science.gov (United States)

    Dreyer, Chantal; Afchain, Pauline; Trouilloud, Isabelle; André, Thierry

    2016-01-01

    This review reports three recently published molecular classifications of the 3 main gastro-intestinal cancers: gastric, pancreatic and colorectal adenocarcinoma. In colorectal adenocarcinoma, 6 independent classifications were combined to finally hold 4 molecular sub-groups, Consensus Molecular Subtypes (CMS 1-4), linked to various clinical, molecular and survival data. CMS1 (14% MSI with immune activation); CMS2 (37%: canonical with epithelial differentiation and activation of the WNT/MYC pathway); CMS3 (13% metabolic with epithelial differentiation and RAS mutation); CMS4 (23%: mesenchymal with activation of TGFβ pathway and angiogenesis with stromal invasion). In gastric adenocarcinoma, 4 groups were established: subtype "EBV" (9%, high frequency of PIK3CA mutations, hypermethylation and amplification of JAK2, PD-L1 and PD-L2), subtype "MSI" (22%, high rate of mutation), subtype "genomically stable tumor" (20%, diffuse histology type and mutations of RAS and genes encoding integrins and adhesion proteins including CDH1) and subtype "tumors with chromosomal instability" (50%, intestinal type, aneuploidy and receptor tyrosine kinase amplification). In pancreatic adenocarcinomas, a classification in four sub-groups has been proposed, stable subtype (20%, aneuploidy), locally rearranged subtype (30%, focal event on one or two chromosomes), scattered subtype (36%,200 structural variation events, defects in DNA maintenance). Although currently away from the care of patients, these classifications open the way to "à la carte" treatment depending on molecular biology. Copyright © 2016 Société Française du Cancer. Published by Elsevier Masson SAS. All rights reserved.

  1. Comparative Study of Classification Techniques on Breast Cancer FNA Biopsy Data

    Directory of Open Access Journals (Sweden)

    George Rumbe

    2010-12-01

    Full Text Available Accurate diagnostic detection of cancerous cells in a patient is critical and may alter subsequent treatment and increase the chances of survival. Machine learning techniques have been instrumental in disease detection and are currently used in various classification problems because of their accurate prediction performance. Different techniques may provide different accuracies, and it is therefore imperative to use the most suitable method, the one which provides the best results. This research seeks to provide a comparative analysis of Support Vector Machine, Bayesian classifier and other artificial neural network classifiers (backpropagation, linear programming, learning vector quantization, and K nearest neighborhood) on the Wisconsin breast cancer classification problem.

  2. DNA methylation-based classification of central nervous system tumours

    DEFF Research Database (Denmark)

    Capper, David; Jones, David T.W.; Sill, Martin

    2018-01-01

    Accurate pathological diagnosis is crucial for optimal management of patients with cancer. For the approximately 100 known tumour types of the central nervous system, standardization of the diagnostic process has been shown to be particularly challenging - with substantial inter-observer variability in the histopathological diagnosis of many tumour types. Here we present a comprehensive approach for the DNA methylation-based classification of central nervous system tumours across all entities and age groups, and demonstrate its application in a routine diagnostic setting. We show...

  3. Bladder cancer: Analysis of the 2004 WHO classification in ...

    African Journals Online (AJOL)

    Objectives: Bladder cancer (BCA) is a worldwide disease and shows a wide range of geographical variation. The aim of this study is to analyze the prevalence of schistosomal and non-schistosomal associated BCA as well as compare our findings with the 2004 WHO consensus classification of urothelial neoplasms and ...

  4. Fluorescently labeled bevacizumab in human breast cancer: defining the classification threshold

    Science.gov (United States)

    Koch, Maximilian; de Jong, Johannes S.; Glatz, Jürgen; Symvoulidis, Panagiotis; Lamberts, Laetitia E.; Adams, Arthur L. L.; Kranendonk, Mariëtte E. G.; Terwisscha van Scheltinga, Anton G. T.; Aichler, Michaela; Jansen, Liesbeth; de Vries, Jakob; Lub-de Hooge, Marjolijn N.; Schröder, Carolien P.; Jorritsma-Smit, Annelies; Linssen, Matthijs D.; de Boer, Esther; van der Vegt, Bert; Nagengast, Wouter B.; Elias, Sjoerd G.; Oliveira, Sabrina; Witkamp, Arjen J.; Mali, Willem P. Th. M.; Van der Wall, Elsken; Garcia-Allende, P. Beatriz; van Diest, Paul J.; de Vries, Elisabeth G. E.; Walch, Axel; van Dam, Gooitzen M.; Ntziachristos, Vasilis

    2017-07-01

    Breast cancer specimens labelled in vivo with a fluorescent drug (bevacizumab) were obtained from patients. We propose a new structured method to determine the optimal classification threshold in targeted fluorescence intra-operative imaging.

  5. Call for a Computer-Aided Cancer Detection and Classification Research Initiative in Oman.

    Science.gov (United States)

    Mirzal, Andri; Chaudhry, Shafique Ahmad

    2016-01-01

    Cancer is a major health problem in Oman. It is reported that cancer incidence in Oman is the second highest after Saudi Arabia among Gulf Cooperation Council countries. Based on GLOBOCAN estimates, Oman is predicted to face an almost two-fold increase in cancer incidence in the period 2008-2020. However, cancer research in Oman is still in its infancy. This is because the medical institutions and infrastructure that play central roles in data collection and analysis are relatively new developments in Oman. We believe the country requires an organized plan and efforts to promote local cancer research. In this paper, we discuss current research progress in cancer diagnosis using machine learning techniques to optimize computer aided cancer detection and classification (CAD). We specifically discuss CAD using two major types of medical data, i.e., medical imaging and microarray gene expression profiling, because medical imaging modalities like mammography, MRI, and PET have been widely used in Oman for assisting radiologists in early cancer diagnosis and microarray data have been proven to be a reliable source for differential diagnosis. We also discuss future cancer research directions and the benefits to Oman's economy of entering the cancer research and treatment business, as it is a multi-billion dollar industry worldwide.

  6. Improving breast cancer classification with mammography, supported on an appropriate variable selection analysis

    Science.gov (United States)

    Pérez, Noel; Guevara, Miguel A.; Silva, Augusto

    2013-02-01

    This work addresses the issue of variable selection within the context of breast cancer classification with mammography. A comprehensive repository of feature vectors was used including a hybrid subset gathering image-based and clinical features. It aimed to gather experimental evidence of variable selection in terms of cardinality and type, and to find a classification scheme that provides the best performance over the Area Under Receiver Operating Characteristics Curve (AUC) scores using the ranked features subset. We evaluated and classified a total of 300 subsets of features formed by the application of Chi-Square Discretization, Information-Gain, One-Rule and RELIEF methods in association with Feed-Forward Backpropagation Neural Network (FFBP), Support Vector Machine (SVM) and Decision Tree J48 (DTJ48) Machine Learning Algorithms (MLA) for a comparative performance evaluation based on AUC scores. A variable selection analysis was performed for Single-View Ranking and Multi-View Ranking groups of features. Feature subsets representing Microcalcifications (MCs), Masses and both MCs and Masses lesions achieved AUC scores of 0.91, 0.954 and 0.934 respectively. Experimental evidence demonstrated that classification performance was improved by combining image-based and clinical features. The most important clinical and image-based features were StromaDistortion and Circularity respectively. Other features, less important but worth using because of their consistency, were Contrast, Perimeter, Microcalcification, Correlation and Elongation.
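
The AUC score that this abstract uses to compare feature subsets can be computed directly from classifier scores. The sketch below is an illustration only, not the authors' evaluation code: it uses the pairwise (Mann-Whitney) formulation of the AUC, and the function name `auc_score` and the toy labels and scores are hypothetical.

```python
def auc_score(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as half a win).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = malignant, 0 = benign.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc_score(labels, scores))  # 0.75
```

Three of the four positive/negative pairs are ordered correctly here, which gives 0.75. The double loop is quadratic in the number of cases; a rank-based version would be preferable for large data sets.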

  7. DNA methylation-based classification of central nervous system tumours.

    Science.gov (United States)

    Capper, David; Jones, David T W; Sill, Martin; Hovestadt, Volker; Schrimpf, Daniel; Sturm, Dominik; Koelsche, Christian; Sahm, Felix; Chavez, Lukas; Reuss, David E; Kratz, Annekathrin; Wefers, Annika K; Huang, Kristin; Pajtler, Kristian W; Schweizer, Leonille; Stichel, Damian; Olar, Adriana; Engel, Nils W; Lindenberg, Kerstin; Harter, Patrick N; Braczynski, Anne K; Plate, Karl H; Dohmen, Hildegard; Garvalov, Boyan K; Coras, Roland; Hölsken, Annett; Hewer, Ekkehard; Bewerunge-Hudler, Melanie; Schick, Matthias; Fischer, Roger; Beschorner, Rudi; Schittenhelm, Jens; Staszewski, Ori; Wani, Khalida; Varlet, Pascale; Pages, Melanie; Temming, Petra; Lohmann, Dietmar; Selt, Florian; Witt, Hendrik; Milde, Till; Witt, Olaf; Aronica, Eleonora; Giangaspero, Felice; Rushing, Elisabeth; Scheurlen, Wolfram; Geisenberger, Christoph; Rodriguez, Fausto J; Becker, Albert; Preusser, Matthias; Haberler, Christine; Bjerkvig, Rolf; Cryan, Jane; Farrell, Michael; Deckert, Martina; Hench, Jürgen; Frank, Stephan; Serrano, Jonathan; Kannan, Kasthuri; Tsirigos, Aristotelis; Brück, Wolfgang; Hofer, Silvia; Brehmer, Stefanie; Seiz-Rosenhagen, Marcel; Hänggi, Daniel; Hans, Volkmar; Rozsnoki, Stephanie; Hansford, Jordan R; Kohlhof, Patricia; Kristensen, Bjarne W; Lechner, Matt; Lopes, Beatriz; Mawrin, Christian; Ketter, Ralf; Kulozik, Andreas; Khatib, Ziad; Heppner, Frank; Koch, Arend; Jouvet, Anne; Keohane, Catherine; Mühleisen, Helmut; Mueller, Wolf; Pohl, Ute; Prinz, Marco; Benner, Axel; Zapatka, Marc; Gottardo, Nicholas G; Driever, Pablo Hernáiz; Kramm, Christof M; Müller, Hermann L; Rutkowski, Stefan; von Hoff, Katja; Frühwald, Michael C; Gnekow, Astrid; Fleischhack, Gudrun; Tippelt, Stephan; Calaminus, Gabriele; Monoranu, Camelia-Maria; Perry, Arie; Jones, Chris; Jacques, Thomas S; Radlwimmer, Bernhard; Gessi, Marco; Pietsch, Torsten; Schramm, Johannes; Schackert, Gabriele; Westphal, Manfred; Reifenberger, Guido; Wesseling, Pieter; Weller, Michael; Collins, Vincent Peter; Blümcke, Ingmar; Bendszus, Martin; Debus, Jürgen; Huang, Annie; Jabado, Nada; Northcott, Paul A; Paulus, Werner; Gajjar, Amar; Robinson, Giles W; Taylor, Michael D; Jaunmuktane, Zane; Ryzhova, Marina; Platten, Michael; Unterberg, Andreas; Wick, Wolfgang; Karajannis, Matthias A; Mittelbronn, Michel; Acker, Till; Hartmann, Christian; Aldape, Kenneth; Schüller, Ulrich; Buslei, Rolf; Lichter, Peter; Kool, Marcel; Herold-Mende, Christel; Ellison, David W; Hasselblatt, Martin; Snuderl, Matija; Brandner, Sebastian; Korshunov, Andrey; von Deimling, Andreas; Pfister, Stefan M

    2018-03-22

    Accurate pathological diagnosis is crucial for optimal management of patients with cancer. For the approximately 100 known tumour types of the central nervous system, standardization of the diagnostic process has been shown to be particularly challenging, with substantial inter-observer variability in the histopathological diagnosis of many tumour types. Here we present a comprehensive approach for the DNA methylation-based classification of central nervous system tumours across all entities and age groups, and demonstrate its application in a routine diagnostic setting. We show that the availability of this method may have a substantial impact on diagnostic precision compared to standard methods, resulting in a change of diagnosis in up to 12% of prospective cases. For broader accessibility, we have designed a free online classifier tool, the use of which does not require any additional onsite data processing. Our results provide a blueprint for the generation of machine-learning-based tumour classifiers across other cancer entities, with the potential to fundamentally transform tumour pathology.

  8. Pitch Based Sound Classification

    DEFF Research Database (Denmark)

    Nielsen, Andreas Brinch; Hansen, Lars Kai; Kjems, U

    2006-01-01

    A sound classification model is presented that can classify signals into music, noise and speech. The model extracts the pitch of the signal using the harmonic product spectrum. Based on the pitch estimate and a pitch error measure, features are created and used in a probabilistic model with soft-max output function. Both linear and quadratic inputs are used. The model is trained on 2 hours of sound and tested on publicly available data. A test classification error below 0.05 with 1 s classification windows is achieved. Furthermore, it is shown that linear input performs as well as a quadratic, and that even though classification gets marginally better, not much is achieved by increasing the window size beyond 1 s.
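
The pitch-extraction step named in this abstract, the harmonic product spectrum (HPS), can be sketched in a few lines. This is a minimal illustration and not the authors' model: it covers only the pitch estimate (no pitch error measure, features, or soft-max classifier), it uses a naive DFT to stay within the standard library, and the names `magnitude_spectrum` and `hps_pitch`, the sampling rate, and the synthetic test tone are all assumptions.

```python
import cmath
import math


def magnitude_spectrum(x):
    """Magnitudes of the first len(x) // 2 DFT bins (naive O(n^2) DFT)."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2)
    ]


def hps_pitch(x, fs, harmonics=3):
    """Estimate the fundamental frequency in Hz.

    For each candidate bin k, multiply the magnitudes at k, 2k, ..., Hk.
    A true fundamental has energy at all of its harmonics, so its
    product is the largest.
    """
    mag = magnitude_spectrum(x)
    limit = len(mag) // harmonics  # keeps harmonics * k inside the spectrum
    best_k = max(
        range(1, limit),
        key=lambda k: math.prod(mag[h * k] for h in range(1, harmonics + 1)),
    )
    return best_k * fs / len(x)


# Synthetic tone: 40 Hz fundamental whose second harmonic is the strongest
# component, so picking the largest single peak would give 80 Hz instead.
fs = 512
signal = [
    0.5 * math.sin(2 * math.pi * 40 * t / fs)
    + 1.0 * math.sin(2 * math.pi * 80 * t / fs)
    + 0.8 * math.sin(2 * math.pi * 120 * t / fs)
    for t in range(512)
]
print(hps_pitch(signal, fs))  # 40.0
```

In practice an FFT would replace the naive DFT, and the signal would be windowed before the transform.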

  9. Improved prognostic classification of breast cancer defined by antagonistic activation patterns of immune response pathway modules

    International Nuclear Information System (INIS)

    Teschendorff, Andrew E; Gomez, Sergio; Arenas, Alex; El-Ashry, Dorraya; Schmidt, Marcus; Gehrmann, Mathias; Caldas, Carlos

    2010-01-01

    Elucidating the activation pattern of molecular pathways across a given tumour type is a key challenge necessary for understanding the heterogeneity in clinical response and for developing novel more effective therapies. Gene expression signatures of molecular pathway activation derived from perturbation experiments in model systems as well as structural models of molecular interactions ('model signatures') constitute an important resource for estimating corresponding activation levels in tumours. However, relatively few strategies for estimating pathway activity from such model signatures exist and only few studies have used activation patterns of pathways to refine molecular classifications of cancer. Here we propose a novel network-based method for estimating pathway activation in tumours from model signatures. We find that although the pathway networks inferred from cancer expression data are highly consistent with the prior information contained in the model signatures, they also exhibit a highly modular structure and that estimation of pathway activity is dependent on this modular structure. We apply our methodology to a panel of 438 estrogen receptor negative (ER-) and 785 estrogen receptor positive (ER+) breast cancers to infer activation patterns of important cancer related molecular pathways. We show that in ER negative basal and HER2+ breast cancer, gene expression modules reflecting T-cell helper-1 (Th1) and T-cell helper-2 (Th2) mediated immune responses play antagonistic roles as major risk factors for distant metastasis. Using Boolean interaction Cox-regression models to identify non-linear pathway combinations associated with clinical outcome, we show that simultaneous high activation of Th1 and low activation of a TGF-beta pathway module defines a subtype of particularly good prognosis and that this classification provides a better prognostic model than those based on the individual pathways. In ER+ breast cancer, we find that

  10. Apparent diffusion coefficient value of gastric cancer by diffusion-weighted imaging: Correlations with the histological differentiation and Lauren classification

    International Nuclear Information System (INIS)

    Liu, Song; Guan, Wenxian; Wang, Hao; Pan, Liang; Zhou, Zhuping; Yu, Haiping; Liu, Tian; Yang, Xiaofeng; He, Jian; Zhou, Zhengyang

    2014-01-01

    Highlights: • Gastric cancers’ ADC values were significantly lower than normal gastric wall. • Gastric adenocarcinomas with different differentiation had different ADC values. • Gastric adenocarcinomas’ ADC values correlated with histologic differentiations. • Gastric cancers’ ADC values correlated with Lauren classifications. • Mean ADC value was better than min ADC value in characterizing gastric cancers. - Abstract: Objective: The purpose of this study was to evaluate the correlations between histological differentiation and Lauren classification of gastric cancer and the apparent diffusion coefficient (ADC) value of diffusion weighted imaging (DWI). Materials and methods: Sixty-nine patients with gastric cancer lesions underwent preoperative magnetic resonance imaging (MRI) (3.0T) and surgical resection. DWI was obtained with a single-shot, echo-planar imaging sequence in the axial plane (b values: 0 and 1000 s/mm²). Mean and minimum ADC values were obtained for each gastric cancer and normal gastric walls by two radiologists, who were blinded to the histological findings. Histological type, degree of differentiation and Lauren classification of each resected specimen were determined by one pathologist. Mean and minimum ADC values of gastric cancers with different histological types, degrees of differentiation and Lauren classifications were compared. Correlations between ADC values and histological differentiation and Lauren classification were analyzed. Results: The mean and minimum ADC values of gastric cancers, as a whole and separately, were significantly lower than those of normal gastric walls (all p values <0.001). There were significant differences in the mean and minimum ADC values among gastric cancers with different histological types, degrees of differentiation and Lauren classifications (p < 0.05). Mean and minimum ADC values correlated significantly (all p < 0.001) with histological differentiation (r = 0.564, 0.578) and Lauren

  11. [Assessment of functioning in patients with head and neck cancer based on the international classification of functioning, disability and health (ICF)].

    Science.gov (United States)

    Tschiesner, U

    2011-09-01

    The article addresses the question of how preservation of function after treatment of head and neck cancer (HNC) can be defined and measured across treatment approaches. On the basis of the "International Classification of Functioning, Disability and Health (ICF)", a series of efforts is summarized on how all relevant aspects of the interdisciplinary team can be integrated into a common concept. Different efforts on the development, validation and implementation of ICF Core Sets for head and neck cancer (ICF-HNC) are discussed. The ICF-HNC covers organ-based problems with food ingestion, breathing, and speech, as well as psychosocial difficulties. Relationships between the ICF-HNC and well-established outcome measures are illustrated. This enables the user to integrate different aspects of functional outcome into a consolidated approach towards preservation/rehabilitation of functioning after HNC - applicable for a variety of treatment approaches and health professions. Georg Thieme Verlag KG Stuttgart · New York.

  12. Cluster Based Text Classification Model

    DEFF Research Database (Denmark)

    Nizamani, Sarwat; Memon, Nasrullah; Wiil, Uffe Kock

    2011-01-01

    We propose a cluster based classification model for suspicious email detection and other text classification tasks. The text classification tasks comprise many training examples that require a complex classification model. Using clusters for classification makes the model simpler and increases the accuracy at the same time. The test example is classified using a simpler and smaller model. The training examples in a particular cluster share a common vocabulary. At the time of clustering, we do not take into account the labels of the training examples. After the clusters have been created, the classifier is trained on each cluster, which has reduced dimensionality and fewer examples. The experimental results show that the proposed model outperforms the existing classification models for the task of suspicious email detection and topic categorization on the Reuters-21578 and 20 Newsgroups...

  13. Breast Cancer Survival Defined by the ER/PR/HER2 Subtypes and a Surrogate Classification according to Tumor Grade and Immunohistochemical Biomarkers

    International Nuclear Information System (INIS)

    Parise, C. A.; Caggiano, V.

    2014-01-01

    ER, PR, and HER2 are routinely available in breast cancer specimens. The purpose of this study is to contrast breast cancer-specific survival for the eight ER/PR/HER2 subtypes with survival of an immunohistochemical surrogate for the molecular subtype based on the ER/PR/HER2 subtypes and tumor grade. Methods. We identified 123,780 cases of stages 1-3 primary female invasive breast cancer from California Cancer Registry. The surrogate classification was derived using ER/PR/HER2 and tumor grade. Kaplan-Meier survival analysis and Cox proportional hazards modeling were used to assess differences in survival and risk of mortality for the ER/PR/HER2 subtypes and surrogate classification within each stage. Results. The luminal B/HER2− surrogate classification had a higher risk of mortality than the luminal B/HER2+ for all stages of disease. There was no difference in risk of mortality between the ER+/PR+/HER2− and ER+/PR+/HER2+ in stage 3. With one exception in stage 3, the ER-negative subtypes all had an increased risk of mortality when compared with the ER-positive subtypes. Conclusions. Assessment of survival using ER/PR/HER2 illustrates the heterogeneity of HER2+ subtypes. The surrogate classification provides clear separation in survival and adjusted mortality but underestimates the wide variability within the subtypes that make up the classification.
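
The Kaplan-Meier analysis mentioned in this abstract estimates survival as a running product over observed event times, S(t) = Π (1 − d_i / n_i), where d_i is the number of deaths at time t_i and n_i the number of patients still at risk. The sketch below illustrates that estimator only; it is not the registry analysis, and the function name `kaplan_meier` and the five-patient data set are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with an observed death.

    times  -- follow-up time for each patient
    events -- 1 if death was observed at that time, 0 if censored
    """
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        # Patients whose follow-up reaches t are still at risk at t.
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
    return curve


# Hypothetical follow-up in years; patients 3 and 5 are censored.
times = [1, 2, 2, 3, 4]
events = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(times, events):
    print(t, round(s, 3))
```

With these inputs the estimate steps down to 0.8, 0.6 and 0.3 at years 1, 2 and 3. Censored patients contribute to the at-risk count up to their last follow-up but never trigger a step.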

  14. Clinical application of a microfluidic chip for immunocapture and quantification of circulating exosomes to assist breast cancer diagnosis and molecular classification.

    Science.gov (United States)

    Fang, Shimeng; Tian, Hongzhu; Li, Xiancheng; Jin, Dong; Li, Xiaojie; Kong, Jing; Yang, Chun; Yang, Xuesong; Lu, Yao; Luo, Yong; Lin, Bingcheng; Niu, Weidong; Liu, Tingjiao

    2017-01-01

    Exosomes have attracted increasing attention in blood-based diagnosis because cancer cells release more exosomes in serum than normal cells and these exosomes overexpress a certain number of cancer-related biomarkers. However, capture and biomarker analysis of exosomes for clinical application are technically challenging. In this study, we developed a microfluidic chip for immunocapture and quantification of circulating exosomes from small sample volumes and applied this device in a clinical study. Circulating EpCAM-positive exosomes were measured in 6 breast cancer patients and 3 healthy controls to assist diagnosis. A significant increase in the EpCAM-positive exosome level in these patients was detected, compared to healthy controls. Furthermore, we quantified circulating HER2-positive exosomes in 19 breast cancer patients for molecular classification. We demonstrated that the exosomal HER2 expression levels were largely consistent with those in tumor tissues assessed by immunohistochemical staining. The microfluidic chip might provide a new platform to assist breast cancer diagnosis and molecular classification.

  15. Gene expression classification of colon cancer into molecular subtypes: characterization, validation, and prognostic value.

    Directory of Open Access Journals (Sweden)

    Laetitia Marisa

    Full Text Available Colon cancer (CC) pathological staging fails to accurately predict recurrence, and to date, no gene expression signature has proven reliable for prognosis stratification in clinical practice, perhaps because CC is a heterogeneous disease. The aim of this study was to establish a comprehensive molecular classification of CC based on mRNA expression profile analyses. Fresh-frozen primary tumor samples from a large multicenter cohort of 750 patients with stage I to IV CC who underwent surgery between 1987 and 2007 in seven centers were characterized for common DNA alterations, including BRAF, KRAS, and TP53 mutations, CpG island methylator phenotype, mismatch repair status, and chromosomal instability status, and were screened with whole genome and transcriptome arrays. 566 samples fulfilled RNA quality requirements. Unsupervised consensus hierarchical clustering applied to gene expression data from a discovery subset of 443 CC samples identified six molecular subtypes. These subtypes were associated with distinct clinicopathological characteristics, molecular alterations, specific enrichments of supervised gene expression signatures (stem cell phenotype-like, normal-like, serrated CC phenotype-like), and deregulated signaling pathways. Based on their main biological characteristics, we distinguished a deficient mismatch repair subtype, a KRAS mutant subtype, a cancer stem cell subtype, and three chromosomal instability subtypes, including one associated with down-regulated immune pathways, one with up-regulation of the Wnt pathway, and one displaying a normal-like gene expression profile. The classification was validated in the remaining 123 samples plus an independent set of 1,058 CC samples, including eight public datasets. Furthermore, prognosis was analyzed in the subset of stage II-III CC samples. The subtypes C4 and C6, but not the subtypes C1, C2, C3, and C5, were independently associated with shorter relapse-free survival, even after

  16. Breast Cancer Survival Defined by the ER/PR/HER2 Subtypes and a Surrogate Classification according to Tumor Grade and Immunohistochemical Biomarkers

    Directory of Open Access Journals (Sweden)

    Carol A. Parise

    2014-01-01

    Full Text Available Introduction. ER, PR, and HER2 are routinely available in breast cancer specimens. The purpose of this study is to contrast breast cancer-specific survival for the eight ER/PR/HER2 subtypes with survival of an immunohistochemical surrogate for the molecular subtype based on the ER/PR/HER2 subtypes and tumor grade. Methods. We identified 123,780 cases of stages 1–3 primary female invasive breast cancer from California Cancer Registry. The surrogate classification was derived using ER/PR/HER2 and tumor grade. Kaplan-Meier survival analysis and Cox proportional hazards modeling were used to assess differences in survival and risk of mortality for the ER/PR/HER2 subtypes and surrogate classification within each stage. Results. The luminal B/HER2− surrogate classification had a higher risk of mortality than the luminal B/HER2+ for all stages of disease. There was no difference in risk of mortality between the ER+/PR+/HER2− and ER+/PR+/HER2+ in stage 3. With one exception in stage 3, the ER-negative subtypes all had an increased risk of mortality when compared with the ER-positive subtypes. Conclusions. Assessment of survival using ER/PR/HER2 illustrates the heterogeneity of HER2+ subtypes. The surrogate classification provides clear separation in survival and adjusted mortality but underestimates the wide variability within the subtypes that make up the classification.

  17. An approach for leukemia classification based on cooperative game theory.

    Science.gov (United States)

    Torkaman, Atefeh; Charkari, Nasrollah Moghaddam; Aghaeipour, Mahnaz

    2011-01-01

    Hematological malignancies are the types of cancer that affect blood, bone marrow and lymph nodes. As these tissues are naturally connected through the immune system, a disease affecting one of them will often affect the others as well. The hematological malignancies include leukemia, lymphoma, and multiple myeloma. Among them, leukemia is a serious malignancy that starts in blood tissues, especially the bone marrow, where the blood is made. Research shows that leukemia is one of the common cancers in the world, so an emphasis on diagnostic techniques and best treatments could provide better prognosis and survival for patients. In this paper, an automatic diagnosis recommender system for classifying leukemia based on cooperative game theory is presented. Throughout this research, we analyze flow cytometry data toward the classification of leukemia into eight classes. We work on a real data set from different types of leukemia collected at the Iran Blood Transfusion Organization (IBTO). The data set contains 400 samples taken from human leukemic bone marrow. This study deals with a cooperative game used for classification according to different weights assigned to the markers. The proposed method is versatile as there are no constraints on what the input or output represent. This means that it can be used to classify a population according to their contributions. In other words, it applies equally to other groups of data. The experimental results show an accuracy rate of 93.12% for classification, compared with 90.16% for a decision tree (C4.5). The result demonstrates that the cooperative game is very promising for direct use in the classification of leukemia as part of an active medical decision support system for interpretation of flow cytometry readout. This system could assist clinical hematologists to properly recognize different kinds of leukemia by preparing suggestions and this could improve the treatment of leukemic

  18. Classification of premalignant pancreatic cancer mass-spectrometry data using decision tree ensembles

    Directory of Open Access Journals (Sweden)

    Wong G William

    2008-06-01

    Full Text Available Abstract Background: Pancreatic cancer is the fourth leading cause of cancer death in the United States. Consequently, identification of clinically relevant biomarkers for the early detection of this cancer type is urgently needed. In recent years, proteomics profiling techniques combined with various data analysis methods have been successfully used to gain critical insights into processes and mechanisms underlying pathologic conditions, particularly as they relate to cancer. However, the high dimensionality of proteomics data combined with their relatively small sample sizes poses a significant challenge to current data mining methodology, where many of the standard methods cannot be applied directly. Here, we propose a novel machine learning framework in which decision-tree-based classifier ensembles, coupled with feature selection methods, are applied to proteomics data generated from premalignant pancreatic cancer. Results: This study explores the utility of three different feature selection schemas (Student t test, Wilcoxon rank sum test and genetic algorithm) to reduce the high dimensionality of a pancreatic cancer proteomic dataset. Using the top features selected from each method, we compared the prediction performances of a single decision tree algorithm, C4.5, with six different decision-tree-based classifier ensembles (Random forest, Stacked generalization, Bagging, Adaboost, Logitboost and Multiboost). We show that ensemble classifiers always outperform a single decision tree classifier in having greater accuracies and smaller prediction errors when applied to a pancreatic cancer proteomics dataset. Conclusion: In our cross validation framework, classifier ensembles generally have better classification accuracies compared to that of a single decision tree when applied to a pancreatic cancer proteomic dataset, thus suggesting their utility in future proteomics data analysis. Additionally, the use of feature selection

  19. Clinical classification of cancer cachexia: phenotypic correlates in human skeletal muscle.

    Directory of Open Access Journals (Sweden)

    Neil Johns

    Full Text Available BACKGROUND: Cachexia affects the majority of patients with advanced cancer and is associated with a reduction in treatment tolerance, response to therapy, and duration of survival. One impediment towards the effective treatment of cachexia is a validated classification system. METHODS: 41 patients with resectable upper gastrointestinal (GI) or pancreatic cancer underwent characterisation for cachexia based on weight-loss (WL) and/or low muscularity (LM). Four diagnostic criteria were used: >5%WL, >10%WL, LM, and LM+>2%WL. All patients underwent biopsy of the rectus muscle. Analysis included immunohistochemistry for fibre size and type, protein and nucleic acid concentration, and Western blots for markers of autophagy, SMAD signalling, and inflammation. FINDINGS: Compared with non-cachectic cancer patients, in patients with LM or LM+>2%WL, mean muscle fibre diameter was reduced by about 25% (p = 0.02 and p = 0.001 respectively). No significant difference in fibre diameter was observed if patients had WL alone. Regardless of classification, there was no difference in fibre number or proportion of fibre type across all myosin heavy chain isoforms. Mean muscle protein content was reduced and the ratio of RNA/DNA decreased in patients with either >5%WL or LM+>2%WL. Compared with non-cachectic patients, SMAD3 protein levels were increased in patients with >5%WL (p = 0.022), and with >10%WL, beclin (p = 0.05) and ATG5 (p = 0.01) protein levels were increased. There were no differences in phospho-NFkB or phospho-STAT3 levels across any of the groups. CONCLUSION: Muscle fibre size, biochemical composition and pathway phenotype can vary according to whether the diagnostic criteria for cachexia are based on weight loss alone, a measure of low muscularity alone or a combination of the two. For intervention trials where the primary end-point is a change in muscle mass or function, use of combined diagnostic criteria may allow identification of a more

  20. Long-term Prostate-specific Antigen Velocity in Improved Classification of Prostate Cancer Risk and Mortality

    DEFF Research Database (Denmark)

    Ørsted, David Dynnes; Bojesen, Stig E; Kamstrup, Pia R

    2013-01-01

    BACKGROUND: It remains unclear whether adding long-term prostate-specific antigen velocity (PSAV) to baseline PSA values improves classification of prostate cancer (PCa) risk and mortality in the general population. OBJECTIVE: To determine whether long-term PSAV improves classification of PCa risk...

  1. A deep learning-based multi-model ensemble method for cancer prediction.

    Science.gov (United States)

    Xiao, Yawen; Wu, Jun; Lin, Zongli; Zhao, Xiaodong

    2018-01-01

    Cancer is a complex worldwide health problem associated with high mortality. With the rapid development of the high-throughput sequencing technology and the application of various machine learning methods that have emerged in recent years, progress in cancer prediction has been increasingly made based on gene expression, providing insight into effective and accurate treatment decision making. Thus, developing machine learning methods, which can successfully distinguish cancer patients from healthy persons, is of great current interest. However, among the classification methods applied to cancer prediction so far, no one method outperforms all the others. In this paper, we demonstrate a new strategy, which applies deep learning to an ensemble approach that incorporates multiple different machine learning models. We supply informative gene data selected by differential gene expression analysis to five different classification models. Then, a deep learning method is employed to ensemble the outputs of the five classifiers. The proposed deep learning-based multi-model ensemble method was tested on three public RNA-seq data sets of three kinds of cancers, Lung Adenocarcinoma, Stomach Adenocarcinoma and Breast Invasive Carcinoma. The test results indicate that it increases the prediction accuracy of cancer for all the tested RNA-seq data sets as compared to using a single classifier or the majority voting algorithm. By taking full advantage of different classifiers, the proposed deep learning-based multi-model ensemble method is shown to be accurate and effective for cancer prediction. Copyright © 2017 Elsevier B.V. All rights reserved.
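
    The contrast drawn above between majority voting and a learned combination of classifier outputs can be sketched as follows. This is a minimal illustration, not the authors' implementation: a logistic-regression meta-learner stands in for the deep network, the function names are hypothetical, and the toy outputs are deliberately constructed so that four of the five base models are unreliable. Predictions are checked on the training rows only, which a real study would replace with held-out data.

```python
import math

def majority_vote(outputs):
    """Return 1 if more than half of the base classifiers predict 1."""
    return 1 if sum(outputs) * 2 > len(outputs) else 0

def train_meta_learner(meta_x, y, lr=0.5, epochs=200):
    """Fit logistic regression on base-classifier outputs by gradient descent.

    Assumption: this simple model replaces the deep network used for stacking.
    """
    n_features = len(meta_x[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for row, target in zip(meta_x, y):
            z = bias + sum(w * v for w, v in zip(weights, row))
            error = 1.0 / (1.0 + math.exp(-z)) - target
            for j, v in enumerate(row):
                grad_w[j] += error * v
            grad_b += error
        weights = [w - lr * g / len(y) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(y)
    return weights, bias

def stacked_predict(weights, bias, outputs):
    """Predict 1 when the learned linear score is positive."""
    z = bias + sum(w * v for w, v in zip(weights, outputs))
    return 1 if z > 0 else 0

# Hypothetical outputs of five base classifiers (columns) for eight samples.
# Column 0 is always right; columns 1-4 are mostly wrong.
meta_x = [
    (1, 1, 0, 0, 0), (1, 0, 1, 0, 0), (1, 0, 0, 1, 0), (1, 0, 0, 0, 1),
    (0, 0, 1, 1, 1), (0, 1, 0, 1, 1), (0, 1, 1, 0, 1), (0, 1, 1, 1, 0),
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

weights, bias = train_meta_learner(meta_x, y)
voted = [majority_vote(row) for row in meta_x]
stacked = [stacked_predict(weights, bias, row) for row in meta_x]
```

    On this toy data the vote is misled by the unreliable models, while the meta-learner learns to rely on the first column.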

  2. Superpixel-based classification of gastric chromoendoscopy images

    Science.gov (United States)

    Boschetto, Davide; Grisan, Enrico

    2017-03-01

    Chromoendoscopy (CH) is a gastroenterology imaging modality that involves the staining of tissues with methylene blue, which reacts with the internal walls of the gastrointestinal tract, improving the visual contrast in mucosal surfaces and thus enhancing a doctor's ability to screen precancerous lesions or early cancer. This technique helps identify areas that can be targeted for biopsy or treatment, and in this work we focus on gastric cancer detection. Gastric chromoendoscopy for cancer detection has several taxonomies available, one of which classifies CH images into three classes (normal, metaplasia, dysplasia) based on color, shape and regularity of pit patterns. Computer-assisted diagnosis is desirable to help us improve the reliability of the tissue classification and abnormalities detection. However, traditional computer vision methodologies, mainly segmentation, do not translate well to the specific visual characteristics of a gastroenterology imaging scenario. We propose the exploitation of a first unsupervised segmentation via superpixel, which groups pixels into perceptually meaningful atomic regions, used to replace the rigid structure of the pixel grid. For each superpixel, a set of features is extracted and then fed to a random forest based classifier, which computes a model used to predict the class of each superpixel. The average general accuracy of our model is 92.05% in the pixel domain (86.62% in the superpixel domain), while detection accuracies on the normal and abnormal class are respectively 85.71% and 95%. Eventually, the whole image class can be predicted through a majority vote on each superpixel's predicted class.
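
    The final step described above, predicting the class of a whole image by majority vote over its superpixels, can be sketched in a few lines. The function name and the example predictions are hypothetical; the per-superpixel labels are assumed to come from a classifier trained beforehand.

```python
from collections import Counter

def image_class_by_majority(superpixel_labels):
    """Return the most frequent predicted class among an image's superpixels."""
    if not superpixel_labels:
        raise ValueError("at least one superpixel prediction is required")
    counts = Counter(superpixel_labels)
    # most_common(1) returns [(label, count)]; ties go to the label seen first.
    return counts.most_common(1)[0][0]

# Hypothetical per-superpixel predictions for one image.
predictions = ["normal", "dysplasia", "dysplasia", "metaplasia", "dysplasia"]
image_label = image_class_by_majority(predictions)
```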

  3. Identification of immune cell infiltration in hematoxylin-eosin stained breast cancer samples: texture-based classification of tissue morphologies

    Science.gov (United States)

    Turkki, Riku; Linder, Nina; Kovanen, Panu E.; Pellinen, Teijo; Lundin, Johan

    2016-03-01

    The characteristics of immune cells in the tumor microenvironment of breast cancer capture clinically important information. Despite the heterogeneity of tumor-infiltrating immune cells, it has been shown that the degree of infiltration assessed by visual evaluation of hematoxylin-eosin (H&E) stained samples has prognostic and possibly predictive value. However, quantification of the infiltration in H&E-stained tissue samples is currently dependent on visual scoring by an expert. Computer vision enables automated characterization of the components of the tumor microenvironment, and texture-based methods have successfully been used to discriminate between different tissue morphologies and cell phenotypes. In this study, we evaluate whether local binary pattern texture features with superpixel segmentation and classification with support vector machine can be utilized to identify immune cell infiltration in H&E-stained breast cancer samples. Guided by the pan-leukocyte CD45 marker, we annotated training and test sets from 20 primary breast cancer samples. In the training set of arbitrary sized image regions (n=1,116), a 3-fold cross-validation resulted in 98% accuracy and an area under the receiver-operating characteristic curve (AUC) of 0.98 to discriminate between immune cell-rich and -poor areas. In the test set (n=204), we achieved an accuracy of 96% and AUC of 0.99 to label cropped tissue regions correctly into immune cell-rich and -poor categories. The obtained results demonstrate strong discrimination between immune cell-rich and -poor tissue morphologies. The proposed method can provide a quantitative measurement of the degree of immune cell infiltration and can be applied to digitally scanned H&E-stained breast cancer samples for diagnostic purposes.
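
    As a rough illustration of the local binary pattern (LBP) texture features mentioned above, the sketch below computes the basic 8-neighbour LBP code and a 256-bin histogram for a small grey-level patch. The abstract does not state which LBP variant, radius or neighbourhood was used, so the basic 3×3 operator, the function names and the sample patch are all assumptions.

```python
def lbp_code(image, r, c):
    """Basic 8-neighbour LBP code for the interior pixel at (r, c)."""
    center = image[r][c]
    # Neighbours are visited clockwise, starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit.
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(image):
    """Count LBP codes over all interior pixels; the histogram is the texture feature."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist

# Hypothetical 3x3 grey-level patch with a single interior pixel.
patch = [
    [5, 9, 1],
    [4, 6, 7],
    [2, 6, 3],
]
code = lbp_code(patch, 1, 1)  # bits 1, 3 and 5 are set: 2 + 8 + 32 = 42
```

    In a pipeline like the one described, such histograms would be computed per superpixel and passed to the classifier.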

  4. On the International Agency for Research on Cancer classification of glyphosate as a probable human carcinogen.

    Science.gov (United States)

    Tarone, Robert E

    2018-01-01

    The recent classification by International Agency for Research on Cancer (IARC) of the herbicide glyphosate as a probable human carcinogen has generated considerable discussion. The classification is at variance with evaluations of the carcinogenic potential of glyphosate by several national and international regulatory bodies. The basis for the IARC classification is examined under the assumptions that the IARC criteria are reasonable and that the body of scientific studies determined by IARC staff to be relevant to the evaluation of glyphosate by the Monograph Working Group is sufficiently complete. It is shown that the classification of glyphosate as a probable human carcinogen was the result of a flawed and incomplete summary of the experimental evidence evaluated by the Working Group. Rational and effective cancer prevention activities depend on scientifically sound and unbiased assessments of the carcinogenic potential of suspected agents. Implications of the erroneous classification of glyphosate with respect to the IARC Monograph Working Group deliberative process are discussed.

  5. Esophagus cancer

    International Nuclear Information System (INIS)

    Anon.

    1989-01-01

    Ways of metastatic spreading of esophagus cancer, depending on segmental division of esophagus are considered. Classification of esophagus cancer according to morphological structure, domestic clinical classification according to stages and international classification according to TNM system are presented. Diagnosis of esophagus cancer should be complex and based on results of clinical examination of patients, radiological, endoscopic and morphological investigations. Radiological, surgical and combined (preoperative radiotherapy with successive operation) methods of treatment are used in the case of esophagus cancer. Versions of preoperative radiotherapy are given. Favourable results of applying combined surgical treatment with preoperative radiotherapy are shown

  6. RRHGE: A Novel Approach to Classify the Estrogen Receptor Based Breast Cancer Subtypes

    Directory of Open Access Journals (Sweden)

    Ashish Saini

    2014-01-01

    Full Text Available Background. Breast cancer is the most common type of cancer among females with a high mortality rate. It is essential to classify the estrogen receptor based breast cancer subtypes into correct subclasses, so that the right treatments can be applied to lower the mortality rate. Using gene signatures derived from gene interaction networks to classify breast cancers has proven to be more reproducible and can achieve higher classification performance. However, the interactions in the gene interaction network usually contain many false-positive interactions that do not have any biological meaning. Therefore, it is a challenge to incorporate the reliability assessment of interactions when deriving gene signatures from gene interaction networks. How to effectively extract gene signatures from available resources is critical to the success of cancer classification. Methods. We propose a novel method to measure and extract the reliable (biologically true or valid) interactions from gene interaction networks and incorporate the extracted reliable gene interactions into our proposed RRHGE algorithm to identify significant gene signatures from microarray gene expression data for classifying ER+ and ER− breast cancer samples. Results. The evaluation on real breast cancer samples showed that our RRHGE algorithm achieved higher classification accuracy than the existing approaches.

  7. Dissimilarity-based classification of anatomical tree structures

    DEFF Research Database (Denmark)

    Sørensen, Lauge; Lo, Pechin Chien Pau; Dirksen, Asger

    2011-01-01

    A novel method for classification of abnormality in anatomical tree structures is presented. A tree is classified based on direct comparisons with other trees in a dissimilarity-based classification scheme. The pair-wise dissimilarity measure between two trees is based on a linear assignment betw...

  8. Dissimilarity-based classification of anatomical tree structures

    DEFF Research Database (Denmark)

    Sørensen, Lauge Emil Borch Laurs; Lo, Pechin Chien Pau; Dirksen, Asger

    2011-01-01

    A novel method for classification of abnormality in anatomical tree structures is presented. A tree is classified based on direct comparisons with other trees in a dissimilarity-based classification scheme. The pair-wise dissimilarity measure between two trees is based on a linear assignment...

  9. Towards precise classification of cancers based on robust gene functional expression profiles

    Directory of Open Access Journals (Sweden)

    Zhu Jing

    2005-03-01

    Full Text Available Abstract. Background: Development of robust and efficient methods for analyzing and interpreting high dimension gene expression profiles continues to be a focus in computational biology. The accumulated experimental evidence supports the assumption that genes express and perform their functions in modular fashions in cells. Therefore, there is an open space for development of timely and relevant computational algorithms that use robust functional expression profiles towards precise classification of complex human diseases at the modular level. Results: Inspired by the insight that genes act as a module to carry out a highly integrated cellular function, we define a low dimension functional expression profile for data reduction. After annotating each individual gene to functional categories defined in a proper gene function classification system, such as Gene Ontology applied in this study, we identify those functional categories enriched with differentially expressed genes. For each functional category or functional module, we compute a summary measure (s) for the raw expression values of the annotated genes to capture the overall activity level of the module. In this way, we can treat the gene expressions within a functional module as an integrative data point to replace the multiple values of individual genes. We compare the classification performance of decision trees based on functional expression profiles with the conventional gene expression profiles using four publicly available datasets, which indicates that precise classification of tumour types and improved interpretation can be achieved with the reduced functional expression profiles. Conclusion: This modular approach is demonstrated to be a powerful alternative approach to analyzing high dimension microarray data and is robust to high measurement noise and intrinsic biological variance inherent in microarray data. Furthermore, efficient integration with current biological knowledge
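
    The data-reduction idea described above, replacing the expression values of all genes in a functional module with one summary value, can be sketched as below. The abstract does not say which summary measure was used, so the arithmetic mean here is an assumption, and the gene names, module names and function name are hypothetical.

```python
from statistics import mean

def functional_profile(expression, modules):
    """Collapse one sample's gene-level values into one value per module.

    Assumption: the mean stands in for the unspecified summary measure.
    """
    profile = {}
    for module, genes in modules.items():
        values = [expression[g] for g in genes if g in expression]
        if values:  # skip modules with no measured genes
            profile[module] = mean(values)
    return profile

# Hypothetical expression values for one sample and two annotated modules.
sample = {"geneA": 2.0, "geneB": 4.0, "geneC": 1.0, "geneD": 3.0}
modules = {
    "module_1": ["geneA", "geneB"],
    "module_2": ["geneC", "geneD", "geneX"],  # geneX was not measured
}
profile = functional_profile(sample, modules)
```

    The resulting low-dimensional profile, one value per module, is what a classifier such as a decision tree would then be trained on.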

  10. Cloud field classification based on textural features

    Science.gov (United States)

    Sengupta, Sailes Kumar

    1989-01-01

    An essential component in global climate research is accurate cloud cover and type determination. Of the two approaches to texture-based classification (statistical and textural), only the former is effective in the classification of natural scenes such as land, ocean, and atmosphere. In the statistical approach that was adopted, parameters characterizing the stochastic properties of the spatial distribution of grey levels in an image are estimated and then used as features for cloud classification. Two types of textural measures were used. One is based on the distribution of the grey level difference vector (GLDV), and the other on a set of textural features derived from the MaxMin cooccurrence matrix (MMCM). The GLDV method looks at the difference D of grey levels at pixels separated by a horizontal distance d and computes several statistics based on this distribution. These are then used as features in subsequent classification. The MaxMin textural features, on the other hand, are based on the MMCM, a matrix whose (I,J)th entry gives the relative frequency of occurrences of the grey level pair (I,J) that are consecutive and thresholded local extremes separated by a given pixel distance d. Textural measures are then computed based on this matrix in much the same manner as is done in texture computation using the grey level cooccurrence matrix. The database consists of 37 cloud field scenes from LANDSAT imagery using a near IR visible channel. The classification algorithm used is the well known Stepwise Discriminant Analysis. The overall accuracy was estimated by the percentage of correct classifications in each case. It turns out that both types of classifiers, at their best combination of features, and at any given spatial resolution give approximately the same classification accuracy. A neural network based classifier with a feed forward architecture and a back propagation training algorithm is used to increase the classification accuracy, using these two classes
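
    The GLDV step described above can be sketched as follows: collect the grey-level differences between pixels a horizontal distance d apart, normalise them into a distribution, and compute statistics on it. Using the absolute difference and the two statistics shown (mean and contrast) is an assumption, since the abstract only says that several statistics are computed; the function names and sample image are hypothetical.

```python
from collections import Counter

def gldv(image, d=1):
    """Relative frequency of absolute grey-level differences at horizontal distance d."""
    counts = Counter()
    for row in image:
        for c in range(len(row) - d):
            counts[abs(row[c] - row[c + d])] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("image is narrower than the pixel distance d")
    return {diff: n / total for diff, n in sorted(counts.items())}

def gldv_features(dist):
    """Two example statistics of the difference distribution."""
    mean = sum(k * p for k, p in dist.items())
    contrast = sum(k * k * p for k, p in dist.items())
    return {"mean": mean, "contrast": contrast}

# Hypothetical 2x3 grey-level image.
image = [
    [0, 1, 3],
    [2, 2, 0],
]
dist = gldv(image, d=1)         # differences observed: 1, 2, 0, 2
features = gldv_features(dist)  # candidate inputs to a classifier
```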

  11. Dual-modal cancer detection based on optical pH sensing and Raman spectroscopy.

    Science.gov (United States)

    Kim, Soogeun; Lee, Seung Ho; Min, Sun Young; Byun, Kyung Min; Lee, Soo Yeol

    2017-10-01

    A dual-modal approach using Raman spectroscopy and optical pH sensing was investigated to discriminate between normal and cancerous tissues. Raman spectroscopy has demonstrated the potential for in vivo cancer detection. However, Raman spectroscopy has suffered from strong fluorescence background of biological samples and subtle spectral differences between normal and disease tissues. To overcome those issues, pH sensing is adopted to Raman spectroscopy as a dual-modal approach. Based on the fact that the pH level in cancerous tissues is lower than that in normal tissues due to insufficient vasculature formation, the dual-modal approach combining the chemical information of Raman spectrum and the metabolic information of pH level can improve the specificity of cancer diagnosis. From human breast tissue samples, Raman spectra and pH levels are measured using fiber-optic-based Raman and pH probes, respectively. The pH sensing is based on the dependence of pH level on optical transmission spectrum. Multivariate statistical analysis is performed to evaluate the classification capability of the dual-modal method. The analytical results show that the dual-modal method based on Raman spectroscopy and optical pH sensing can improve the performance of cancer classification. (2017) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE).

  13. Evaluation Methodology between Globalization and Localization Features Approaches for Skin Cancer Lesions Classification

    Science.gov (United States)

    Ahmed, H. M.; Al-azawi, R. J.; Abdulhameed, A. A.

    2018-05-01

    Considerable effort has gone into developing diagnostic methods for skin cancer. In this paper, two approaches to detecting skin cancer in dermoscopy images are compared. The first is a global method that classifies skin lesions using global features, whereas the second is a local method that uses local features. The aim of this paper is to select the better approach for skin lesion classification. The dataset used in this paper consists of 200 dermoscopy images from Pedro Hispano Hospital (PH2). The achieved results are: sensitivity about 96%, specificity about 100%, precision about 100%, and accuracy about 97% for the globalization approach; and sensitivity about 100%, specificity about 100%, precision about 100%, and accuracy about 100% for the localization approach. These results show that the localization approach achieved acceptable accuracy and outperformed the globalization approach for skin cancer lesion classification.
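
    For reference, the four measures reported above are all derived from the counts of a binary confusion matrix. The sketch below shows the standard definitions; the counts are made up for illustration and are not taken from the paper.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard binary-classification measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # share of diseased cases detected
        "specificity": tn / (tn + fp),   # share of healthy cases cleared
        "precision": tp / (tp + fp),     # share of positive calls that are right
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Hypothetical counts, chosen only to exercise the formulas.
metrics = diagnostic_metrics(tp=45, fp=5, tn=140, fn=10)
```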

  14. An Approach for Leukemia Classification Based on Cooperative Game Theory

    Directory of Open Access Journals (Sweden)

    Atefeh Torkaman

    2011-01-01

    Full Text Available Hematological malignancies are the types of cancer that affect blood, bone marrow and lymph nodes. As these tissues are naturally connected through the immune system, a disease affecting one of them will often affect the others as well. The hematological malignancies include leukemia, lymphoma and multiple myeloma. Among them, leukemia is a serious malignancy that starts in blood tissues, especially the bone marrow, where the blood is made. Research shows that leukemia is one of the most common cancers in the world. An emphasis on diagnostic techniques and best treatments could therefore provide better prognosis and survival for patients. In this paper, an automatic diagnosis recommender system for classifying leukemia based on a cooperative game is presented. Throughout this research, we analyze flow cytometry data toward the classification of leukemia into eight classes. We work on a real data set from different types of leukemia collected at the Iran Blood Transfusion Organization (IBTO). The data set contains 400 samples taken from human leukemic bone marrow. This study deals with a cooperative game used for classification according to different weights assigned to the markers. The proposed method is versatile as there are no constraints on what the input or output represent. This means that it can be used to classify a population according to their contributions. In other words, it applies equally to other groups of data. The experimental results show an accuracy rate of 93.12% for classification, compared to 90.16% for a decision tree (C4.5). The result demonstrates that the cooperative game is very promising to be used directly for classification of leukemia as a part of an Active Medical decision support system for interpretation of flow cytometry readout. This system could assist clinical hematologists to properly recognize different kinds of leukemia by preparing suggestions and this could improve the treatment

  15. AN OBJECT-BASED METHOD FOR CHINESE LANDFORM TYPES CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    H. Ding

    2016-06-01

    Full Text Available Landform classification is a necessary task for various fields of landscape and regional planning, for example for landscape evaluation, erosion studies, hazard prediction, etc. This study proposes an improved object-based classification for Chinese landform types using the factor importance analysis of random forest and the gray-level co-occurrence matrix (GLCM). In this research, based on a 1 km DEM of China, the combination of terrain factors extracted from the DEM is selected by correlation analysis and Sheffield's entropy method. A random forest classification tree is applied to evaluate the importance of the terrain factors, which are used as multi-scale segmentation thresholds. Then the GLCM is computed for the knowledge base of classification. The classification result was checked using the 1:4,000,000 Chinese Geomorphological Map as reference. The overall classification accuracy of the proposed method is 5.7% higher than ISODATA unsupervised classification, and 15.7% higher than the traditional object-based classification method.

  16. From Molecular Classification to Targeted Therapeutics: The Changing Face of Systemic Therapy in Metastatic Gastroesophageal Cancer

    Directory of Open Access Journals (Sweden)

    Adrian Murphy

    2015-01-01

    Full Text Available Histological classification of adenocarcinoma or squamous cell carcinoma for esophageal cancer or using the Lauren classification for intestinal and diffuse type gastric cancer has limited clinical utility in the management of advanced disease. Germline mutations in E-cadherin (CDH1) or mismatch repair genes (Lynch syndrome) were identified many years ago but given their rarity, the identification of these molecular alterations does not substantially impact treatment in the advanced setting. Recent molecular profiling studies of upper GI tumors have added to our knowledge of the underlying biology but have not led to an alternative classification system which can guide clinicians' therapeutic decisions. Recently the Cancer Genome Atlas Research Network has proposed four subtypes of gastric cancer dividing tumors into those positive for Epstein-Barr virus, microsatellite unstable tumors, genomically stable tumors, and tumors with chromosomal instability. Unfortunately to date, many phase III clinical trials involving molecularly targeted agents have failed to meet their survival endpoints due to their use in unselected populations. Future clinical trials should utilize molecular profiling of individual tumors in order to determine the optimal use of targeted therapies in preselected patients.

  17. On the classification techniques in data mining for microarray data classification

    Science.gov (United States)

    Aydadenta, Husna; Adiwijaya

    2018-03-01

    Cancer is one of the deadliest diseases: according to WHO data, cancer caused 8.8 million deaths in 2015, and this number will increase every year if the problem is not addressed early. Microarray data have become one of the most popular bases for cancer-identification studies in the health field, since they can be used to examine gene expression levels in particular cell samples and thus to analyze thousands of genes simultaneously. Using data mining techniques, we can classify a microarray sample and thereby determine whether or not it indicates cancer. In this paper we discuss research applying several data mining techniques to microarray data, such as Support Vector Machine (SVM), Artificial Neural Network (ANN), Naive Bayes, k-Nearest Neighbor (kNN), and C4.5, together with a simulation of the Random Forest algorithm with dimension reduction using Relief. The results of this paper show the performance measure (accuracy) of each classification algorithm (SVM, ANN, Naive Bayes, kNN, C4.5, and Random Forest). They show that the accuracy of the Random Forest algorithm is higher than that of the other classification algorithms (SVM, ANN, Naive Bayes, kNN, and C4.5). It is hoped that this paper can provide some information about the speed, accuracy, performance and computational cost of each data mining classification technique based on microarray data.

  18. In vivo subsite classification and diagnosis of oral cancers using Raman spectroscopy

    Directory of Open Access Journals (Sweden)

    Aditi Sahu

    2016-09-01

    Full Text Available Oral cancers suffer from poor disease-free survival rates due to delayed diagnosis. Noninvasive, rapid, objective approaches as adjuncts to visual inspection can help in better management of oral cancers. Raman spectroscopy (RS) has shown potential in identification of oral premalignant and malignant conditions and also in the detection of early cancer changes like cancer-field-effects (CFE) at the buccal mucosa subsite. Anatomic differences between different oral subsites have also been reported using RS. In this study, anatomical differences between subsites and their possible influence on healthy vs pathological classification were evaluated on 85 oral cancer and 72 healthy subjects. Spectra were acquired from buccal mucosa, lip and tongue in healthy, contralateral (internal healthy control), premalignant and cancer conditions using a fiber-optic Raman spectrometer. Mean spectra indicate predominance of lipids in healthy buccal mucosa, contribution of both lipids and proteins in lip, and major dominance of protein in tongue spectra. From healthy to tumor, changes in protein secondary-structure, DNA and heme-related features were observed. Principal component linear discriminant analysis (PC-LDA) followed by leave-one-out cross-validation (LOOCV) was used for data analysis. Findings indicate buccal mucosa and tongue are distinct entities, while lip misclassifies with both these subsites. Additionally, the diagnostic algorithm for individual subsites gave improved classification efficiencies with respect to the pooled subsites model. However, as the pooled subsites model yielded 98% specificity and 100% sensitivity, this model may be more useful for preliminary screening applications. Large-scale validation studies are a pre-requisite before envisaging future clinical applications.
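
    The leave-one-out cross-validation (LOOCV) procedure mentioned above can be sketched as below: each sample is held out in turn, a classifier is fitted on the rest, and the held-out prediction is scored. A nearest-centroid classifier is used here purely as a simple stand-in for PC-LDA, and the two-dimensional feature vectors, labels and function names are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, x):
    """Assign x to the class whose mean feature vector is closest (stand-in for PC-LDA)."""
    sums, counts = {}, {}
    for features, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, v in enumerate(features):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1

    def sq_dist(label):
        return sum((s / counts[label] - v) ** 2 for s, v in zip(sums[label], x))

    return min(sums, key=sq_dist)

def loocv_accuracy(xs, ys):
    """Hold out each sample once, train on the others, and score the prediction."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)

# Hypothetical, well-separated feature vectors for two classes.
xs = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [3.0, 3.0], [3.1, 2.9], [2.8, 3.2]]
ys = ["healthy"] * 3 + ["tumor"] * 3
accuracy = loocv_accuracy(xs, ys)
```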

  19. Comparison Effectiveness of Pixel Based Classification and Object Based Classification Using High Resolution Image In Floristic Composition Mapping (Study Case: Gunung Tidar Magelang City)

    Science.gov (United States)

    Ardha Aryaguna, Prama; Danoedoro, Projo

    2016-11-01

    Developments in remote sensing analysis have gone hand in hand with developments in technology, especially in sensors and platforms. Many images now have high spatial and radiometric resolution and therefore carry a great deal of information. Analysis of vegetation objects such as floristic composition benefits greatly from this development. Floristic composition can be interpreted using many methods, such as pixel-based classification and object-based classification. The problem with pixel-based methods on high spatial resolution images is the salt-and-pepper noise that appears in the classification result. The purpose of this research is to compare the effectiveness of pixel-based and object-based classification for vegetation composition mapping on high-resolution WorldView-2 imagery. The results show that pixel-based classification using a majority 5×5 kernel window gives the highest accuracy among the classifications. The highest accuracy is 73.32%, from the WorldView-2 image radiometrically corrected to surface reflectance level, but for the accuracy of each individual class, the object-based method is the best among the methods. In terms of effectiveness, pixel-based classification is more effective than object-based classification for vegetation composition mapping in the Tidar forest.
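
    The majority filter mentioned above is a common way to suppress isolated misclassified pixels in a pixel-based result. The sketch below replaces each pixel's class with the most frequent class in its 5×5 window; clipping the window at the image border and the tie-breaking rule are assumptions, and the function name and sample map are hypothetical.

```python
from collections import Counter

def majority_filter(labels, size=5):
    """Replace each pixel's class with the most frequent class in its window."""
    half = size // 2
    rows, cols = len(labels), len(labels[0])
    out = [row[:] for row in labels]
    for r in range(rows):
        for c in range(cols):
            # The window is clipped at the image border (an assumption).
            window = [
                labels[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            ]
            out[r][c] = Counter(window).most_common(1)[0][0]
    return out

# Hypothetical 5x5 class map with one isolated pixel of class 2.
classified = [[1] * 5 for _ in range(5)]
classified[2][2] = 2
smoothed = majority_filter(classified)
```

    In this example the isolated class-2 pixel is absorbed into the surrounding class 1.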

  20. Artificial neural networks as classification and diagnostic tools for lymph node-negative breast cancers

    Energy Technology Data Exchange (ETDEWEB)

    Eswari J, Satya; Chandrakar, Neha [National Institute of Technology Raipur, Raipur (India)

    2016-04-15

    Artificial neural networks (ANNs) can be used to develop a technique to classify lymph node negative breast cancer that is prone to distant metastases based on gene expression signatures. The neural network used is a multilayered feed forward network that employs the back-propagation algorithm. Once trained with DNA microarray-based gene expression profiles of genes that were predictive of distant metastasis recurrence of lymph node negative breast cancer, the ANNs became capable of correctly classifying all samples and recognizing the genes most appropriate to the classification. To test the ability of the trained ANN models to recognize lymph node negative breast cancer, we analyzed additional samples that were not used beforehand for the training procedure and obtained correctly classified results in the validation set. For a more substantial result, bootstrapping of the training and testing datasets was performed as external validation. This study illustrates the potential application of ANN for breast tumor diagnosis and the identification of candidate targets in patients for therapy.

  1. Classification of Ovarian Cancer Surgery Facilitates Treatment Decisions in a Gynecological Multidisciplinary Team

    DEFF Research Database (Denmark)

    Bjørn, Signe Frahm; Schnack, Tine Henrichsen; Lajer, Henrik

    2017-01-01

    multidisciplinary team (MDT) decisions. Materials and Methods Four hundred eighteen women diagnosed with ovarian cancers (n = 351) or borderline tumors (n = 66) were selected for primary debulking surgery from January 2008 to July 2013. At an MDT meeting, women were allocated into 3 groups named "pre-COVA" 1 to 3...... classifying the expected extent of the primary surgery and need for postoperative care. On the basis of the operative procedures performed, women were allocated into 1 of the 3 corresponding COVA 1 to 3 groups. The outcome measure was the predictive value of the pre-COVA score compared with the actual COVA......-COVA classification predicted the actual COVA group in 79 (49%) FIGO stages I to IIIB and in 85 (45%) FIGO stages IIIC to IV. Conclusions The COVA classification system is a simple and useful tool in the MDT setting where specialists make treatment decisions based on advanced technology. The use of pre...

  2. Inventory classification based on decoupling points

    Directory of Open Access Journals (Sweden)

    Joakim Wikner

    2015-01-01

    Full Text Available The ideal state of continuous one-piece flow may never be achieved. Still the logistics manager can improve the flow by carefully positioning inventory to buffer against variations. Strategies such as lean, postponement, mass customization, and outsourcing all rely on strategic positioning of decoupling points to separate forecast-driven from customer-order-driven flows. Planning and scheduling of the flow are also based on classification of decoupling points as master scheduled or not. A comprehensive classification scheme for these types of decoupling points is introduced. The approach rests on identification of flows as being either demand based or supply based. The demand or supply is then combined with exogenous factors, classified as independent, or endogenous factors, classified as dependent. As a result, eight types of strategic as well as tactical decoupling points are identified resulting in a process-based framework for inventory classification that can be used for flow design.

  3. Sentiment classification technology based on Markov logic networks

    Science.gov (United States)

    He, Hui; Li, Zhigang; Yao, Chongchong; Zhang, Weizhe

    2016-07-01

    With diverse online media emerging, there is growing interest in the sentiment classification problem. At present, text sentiment classification mainly relies on supervised machine learning methods, which exhibit a certain domain dependency. On the basis of Markov logic networks (MLNs), this study proposed a cross-domain multi-task text sentiment classification method rooted in transfer learning. Through many-to-one knowledge transfer, labeled text sentiment classification knowledge was successfully transferred into other domains, and the precision of the sentiment classification analysis in the text tendency domain was improved. The experimental results revealed the following: (1) the model based on an MLN demonstrated higher precision than the single individual learning plan model. (2) Multi-task transfer learning based on Markov logic networks could acquire more knowledge than self-domain learning. The cross-domain text sentiment classification model could significantly improve the precision and efficiency of text sentiment classification.

  4. Side effects of cancer therapies. International classification and documentation systems

    International Nuclear Information System (INIS)

    Seegenschmiedt, M.H.

    1998-01-01

    The publication presents and explains verified, international classification and documentation systems for side effects induced by cancer treatments, applicable in general and clinical practice and clinical research, and covers in a clearly arranged manner the whole range of treatments, including acute and chronic side effects of chemotherapy and radiotherapy, surgery, or combined therapies. The book fills a long-felt need in tumor documentation and is a major contribution to quality assurance in clinical oncology in German-speaking countries. As most parts of the book are bilingual, presenting German and English texts and terminology, it satisfies the principles of interdisciplinarity and internationality. The tabulated form chosen for presentation of classification systems and criteria facilitates the user's approach as well as application in daily work. (orig./CB)

  5. Mechanism-based drug exposure classification in pharmacoepidemiological studies

    NARCIS (Netherlands)

    Verdel, B.M.

    2010-01-01

    Mechanism-based classification of drug exposure in pharmacoepidemiological studies In pharmacoepidemiology and pharmacovigilance, the relation between drug exposure and clinical outcomes is crucial. Exposure classification in pharmacoepidemiological studies is traditionally based on

  6. Median Filter Noise Reduction of Image and Backpropagation Neural Network Model for Cervical Cancer Classification

    Science.gov (United States)

    Wutsqa, D. U.; Marwah, M.

    2017-06-01

    In this paper, we use a spatial median filter to reduce the noise in cervical images produced by a colposcopy tool. The backpropagation neural network (BPNN) model is applied to the colposcopy images to classify cervical cancer. The classification process requires feature extraction using a gray level co-occurrence matrix (GLCM) method to obtain the image features that are used as inputs of the BPNN model. The advantage of noise reduction is evaluated by comparing the performances of BPNN models with and without the spatial median filter. The experimental results show that the spatial median filter can improve the accuracy of the BPNN model for cervical cancer classification.
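
    As a rough illustration of the noise-reduction step described in this record, the following is a minimal sketch of a spatial median filter on a small grayscale patch. The 3x3 window, the border handling, and the sample patch are assumptions made for illustration; the record does not state the settings used in the study.

```python
from statistics import median

def median_filter(image, size=3):
    """Apply a size x size median filter to a 2D list of gray levels.

    Border pixels use only the neighbours that fall inside the image
    (an assumption; other border policies are equally common).
    """
    rows, cols = len(image), len(image[0])
    half = size // 2
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            ]
            out[r][c] = median(window)
    return out

# Hypothetical 5x5 patch with a single salt-noise pixel in the centre.
noisy = [[10] * 5 for _ in range(5)]
noisy[2][2] = 255
clean = median_filter(noisy)
```

    In this toy patch the isolated bright pixel is replaced by the value of its neighbours, which is the behaviour that makes median filtering attractive before texture features such as GLCM statistics are computed.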

  7. Breast cancer surgery and diagnosis-related groups (DRGs): patient classification and hospital reimbursement in 11 European countries.

    Science.gov (United States)

    Scheller-Kreinsen, David; Quentin, Wilm; Geissler, Alexander; Busse, Reinhard

    2013-10-01

    Researchers from eleven countries (i.e. Austria, England, Estonia, Finland, France, Germany, Ireland, Netherlands, Poland, Spain, and Sweden) compared how their DRG systems deal with breast cancer surgery patients. DRG algorithms and indicators of resource consumption were assessed for those DRGs that individually contain at least 1% of all breast cancer surgery patients. Six standardised case vignettes were defined and quasi prices according to national DRG-based hospital payment systems were ascertained. European DRG systems classify breast cancer surgery patients according to different sets of classification variables into three to seven DRGs. Quasi prices for an index case treated with partial mastectomy range from €577 in Poland to €5780 in the Netherlands. Countries award their highest payments for very different kinds of patients. Breast cancer specialists and national DRG authorities should consider how other countries' DRG systems classify breast cancer patients in order to identify potential scope for improvement and to ensure fair and appropriate reimbursement. Copyright © 2012 Elsevier Ltd. All rights reserved.

  8. Cancer cell detection and classification using transformation invariant template learning methods

    International Nuclear Information System (INIS)

    Talware, Rajendra; Abhyankar, Aditya

    2011-01-01

    In traditional cancer cell detection, pathologists examine biopsies to make diagnostic assessments, largely based on cell morphology and tissue distribution. The process of image acquisition is highly subjective, and the pattern undergoes unknown or random transformations during data acquisition (e.g. variation in illumination, orientation, translation and perspective), which results in a high degree of variability. Transformed Component Analysis (TCA) incorporates a discrete, hidden variable that accounts for transformations and uses the Expectation Maximization (EM) algorithm to jointly extract components and normalize for transformations. Further, the TEMPLAR framework takes advantage of hierarchical pattern models and adds probabilistic modeling for local transformations. Pattern classification is based on the Expectation Maximization algorithm and General Likelihood Ratio Tests (GLRT). Performance of TEMPLAR is improved by defining the area of interest on the slide a priori. Performance can be further enhanced by making the kernel function adaptive during learning. (author)

  9. Examining applying high performance genetic data feature selection and classification algorithms for colon cancer diagnosis.

    Science.gov (United States)

    Al-Rajab, Murad; Lu, Joan; Xu, Qiang

    2017-07-01

    This paper examines the accuracy and efficiency (time complexity) of high performance genetic data feature selection and classification algorithms for colon cancer diagnosis. The need for this research derives from the urgent and increasing need for accurate and efficient algorithms. Colon cancer is a leading cause of death worldwide, hence it is vitally important for the cancer tissues to be expertly identified and classified in a rapid and timely manner, both to assure fast detection of the disease and to expedite the drug discovery process. In this research, a three-phase approach was proposed and implemented: Phases One and Two examined the feature selection algorithms and classification algorithms employed separately, and Phase Three examined the performance of the combination of these. It was found from Phase One that the Particle Swarm Optimization (PSO) algorithm performed best with the colon dataset as a feature selection method (29 genes selected) and from Phase Two that the Support Vector Machine (SVM) algorithm outperformed other classifiers, with an accuracy of almost 86%. It was also found from Phase Three that the combined use of PSO and SVM surpassed other algorithms in accuracy and performance, and was faster in terms of time analysis (94%). It is concluded that applying feature selection algorithms prior to classification algorithms results in better accuracy than when the latter are applied alone. This conclusion is important and significant to industry and society. Copyright © 2017 Elsevier B.V. All rights reserved.

  10. Thermogram breast cancer prediction approach based on Neutrosophic sets and fuzzy c-means algorithm.

    Science.gov (United States)

    Gaber, Tarek; Ismail, Gehad; Anter, Ahmed; Soliman, Mona; Ali, Mona; Semary, Noura; Hassanien, Aboul Ella; Snasel, Vaclav

    2015-08-01

    Early detection of breast cancer helps many women survive. In this paper, a CAD system classifying breast cancer thermograms into normal and abnormal is proposed. This approach consists of two main phases: automatic segmentation and classification. For the former phase, an improved segmentation approach based on both Neutrosophic sets (NS) and an optimized Fast Fuzzy c-means (F-FCM) algorithm was proposed. Also, a post-segmentation process was suggested to segment breast parenchyma (i.e. ROI) from thermogram images. For the classification, different kernel functions of the Support Vector Machine (SVM) were used to classify breast parenchyma into normal or abnormal cases. Using a benchmark database, the proposed CAD system was evaluated based on precision, recall, and accuracy as well as a comparison with related work. The experimental results showed that our system would be a very promising step toward automatic diagnosis of breast cancer using thermograms, as the accuracy reached 100%.
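
    The segmentation phase in this record builds on fuzzy c-means clustering. The sketch below shows only the plain, textbook fuzzy c-means updates on one-dimensional intensities; it is not the optimized F-FCM or the Neutrosophic-set variant from the paper, and the fuzzifier, starting centres, and sample intensities are assumptions.

```python
def fuzzy_c_means(values, centers, m=2.0, iterations=50):
    """Plain fuzzy c-means on 1D data (for example pixel intensities).

    values  : list of floats
    centers : initial cluster centres (list of floats)
    m       : fuzzifier (> 1); m = 2 is a common default
    Returns (centers, memberships), where memberships[k][i] is the degree
    to which values[k] belongs to cluster i.
    """
    eps = 1e-12  # avoids division by zero when a point sits on a centre
    centers = list(centers)
    memberships = []
    for _ in range(iterations):
        # Membership update: closer centres receive a larger share.
        memberships = []
        for x in values:
            dists = [abs(x - c) + eps for c in centers]
            row = [
                1.0 / sum((d_i / d_j) ** (2.0 / (m - 1.0)) for d_j in dists)
                for d_i in dists
            ]
            memberships.append(row)
        # Centre update: mean of the data weighted by membership ** m.
        for i in range(len(centers)):
            weights = [row[i] ** m for row in memberships]
            centers[i] = sum(w * x for w, x in zip(weights, values)) / sum(weights)
    return centers, memberships

# Hypothetical intensities forming a dark group and a bright group.
intensities = [1.0, 1.2, 0.8, 9.0, 9.2, 8.8]
centers, memberships = fuzzy_c_means(intensities, centers=[0.0, 10.0])
```

    Unlike hard k-means, each value keeps a graded membership in every cluster, which is why the method is often preferred for images with soft region boundaries.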

  11. A comprehensive comparison of random forests and support vector machines for microarray-based cancer classification

    Directory of Open Access Journals (Sweden)

    Wang Lily

    2008-07-01

    Full Text Available Abstract. Background: Cancer diagnosis and clinical outcome prediction are among the most important emerging applications of gene expression microarray technology with several molecular signatures on their way toward clinical deployment. Use of the most accurate classification algorithms available for microarray gene expression data is a critical ingredient in order to develop the best possible molecular signatures for patient care. As suggested by a large body of literature to date, support vector machines can be considered "best of class" algorithms for classification of such data. Recent work, however, suggests that random forest classifiers may outperform support vector machines in this domain. Results: In the present paper we identify methodological biases of prior work comparing random forests and support vector machines and conduct a new rigorous evaluation of the two algorithms that corrects these limitations. Our experiments use 22 diagnostic and prognostic datasets and show that support vector machines outperform random forests, often by a large margin. Our data also underline the importance of sound research design in benchmarking and comparison of bioinformatics algorithms. Conclusion: We found that both on average and in the majority of microarray datasets, random forests are outperformed by support vector machines both in the settings when no gene selection is performed and when several popular gene selection methods are used.

  12. Efficacy of hidden markov model over support vector machine on multiclass classification of healthy and cancerous cervical tissues

    Science.gov (United States)

    Mukhopadhyay, Sabyasachi; Kurmi, Indrajit; Pratiher, Sawon; Mukherjee, Sukanya; Barman, Ritwik; Ghosh, Nirmalya; Panigrahi, Prasanta K.

    2018-02-01

    In this paper, a comparative study between SVM and HMM has been carried out for multiclass classification of cervical healthy and cancerous tissues. In our study, the HMM methodology is more promising to produce higher accuracy in classification.

  13. IDM-PhyChm-Ens: intelligent decision-making ensemble methodology for classification of human breast cancer using physicochemical properties of amino acids.

    Science.gov (United States)

    Ali, Safdar; Majid, Abdul; Khan, Asifullah

    2014-04-01

    Development of an accurate and reliable intelligent decision-making method for the construction of cancer diagnosis system is one of the fast growing research areas of health sciences. Such decision-making system can provide adequate information for cancer diagnosis and drug discovery. Descriptors derived from physicochemical properties of protein sequences are very useful for classifying cancerous proteins. Recently, several interesting research studies have been reported on breast cancer classification. To this end, we propose the exploitation of the physicochemical properties of amino acids in protein primary sequences such as hydrophobicity (Hd) and hydrophilicity (Hb) for breast cancer classification. Hd and Hb properties of amino acids, in recent literature, are reported to be quite effective in characterizing the constituent amino acids and are used to study protein foldings, interactions, structures, and sequence-order effects. Especially, using these physicochemical properties, we observed that proline, serine, tyrosine, cysteine, arginine, and asparagine amino acids offer high discrimination between cancerous and healthy proteins. In addition, unlike traditional ensemble classification approaches, the proposed 'IDM-PhyChm-Ens' method was developed by combining the decision spaces of a specific classifier trained on different feature spaces. The different feature spaces used were amino acid composition, split amino acid composition, and pseudo amino acid composition. Consequently, we have exploited different feature spaces using Hd and Hb properties of amino acids to develop an accurate method for classification of cancerous protein sequences. We developed ensemble classifiers using diverse learning algorithms such as random forest (RF), support vector machines (SVM), and K-nearest neighbor (KNN) trained on different feature spaces. We observed that ensemble-RF, in case of cancer classification, performed better than ensemble-SVM and ensemble-KNN. Our

  14. Setting a generalized functional linear model (GFLM) for the classification of different types of cancer

    Directory of Open Access Journals (Sweden)

    Miguel Flores

    2016-11-01

    Full Text Available This work aims to classify DNA sequences as healthy or malignant (cancerous). For this, supervised and unsupervised classification methods from a functional context are used; i.e. each strand of DNA is an observation. The observations are discretized, so different ways of representing these observations with functions are evaluated. In addition, an exploratory study is done, estimating the functional mean and variance for each type of cancer. For the unsupervised classification method, hierarchical clustering with different measures of functional distance is used. For the supervised classification method, a functional generalized linear model is used, in which the first and second derivatives are included as discriminating variables. One verified advantage of working in the functional context is that the resulting model classifies the cancers with 100% accuracy. The methods were implemented with the fda.usc R package, which includes all the functional data analysis techniques used in this work as well as others developed in recent decades. For more details on these techniques, see Ramsay, J. O. and Silverman (2005) and Ferraty et al. (2006).

  15. PCA based feature reduction to improve the accuracy of decision tree c4.5 classification

    Science.gov (United States)

    Nasution, M. Z. F.; Sitompul, O. S.; Ramli, M.

    2018-03-01

    Attribute splitting is a major process in Decision Tree C4.5 classification. However, this process does not have a significant impact on the construction of the decision tree in terms of removing irrelevant features. This leads to a major problem in the decision tree classification process called over-fitting, which results from noisy data and irrelevant features. In turn, over-fitting creates misclassification and data imbalance. Many algorithms have been proposed to overcome misclassification and over-fitting in Decision Tree C4.5 classification. Feature reduction is one of the important issues in classification models and is intended to remove irrelevant data in order to improve accuracy. The feature reduction framework is used to simplify high dimensional data to low dimensional data with non-correlated attributes. In this research, we proposed a framework for selecting relevant and non-correlated feature subsets. We consider principal component analysis (PCA) for feature reduction to perform non-correlated feature selection and the Decision Tree C4.5 algorithm for the classification. In experiments conducted on the cervical cancer data set from the UCI repository, with 858 instances and 36 attributes, we evaluated the performance of our framework based on accuracy, specificity and precision. Experimental results show that our proposed framework enhances classification accuracy, reaching an accuracy rate of 90.70%.
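
    To make the PCA step in this record concrete, here is a minimal sketch that finds the first principal component by power iteration and projects correlated attributes onto it. The two-attribute sample data, the iteration count, and the function names are assumptions for illustration; the record does not describe the authors' implementation, and the decision tree stage is not shown.

```python
def first_principal_component(rows, iterations=200):
    """Return (means, unit vector of largest variance) for a list of feature rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    # Power iteration converges to the dominant eigenvector of cov.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return means, v

def project(row, means, component):
    """Score of one row on the component (a single decorrelated feature)."""
    return sum((x - m) * c for x, m, c in zip(row, means, component))

# Hypothetical data: two strongly correlated attributes.
data = [[1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.9], [5.0, 5.1]]
means, pc1 = first_principal_component(data)
scores = [project(r, means, pc1) for r in data]
```

    The resulting scores replace the two redundant attributes with one feature, which is the kind of reduced input a tree learner such as C4.5 could then split on.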

  16. Laser Raman detection for oral cancer based on a Gaussian process classification method

    Science.gov (United States)

    Du, Zhanwei; Yang, Yongjian; Bai, Yuan; Wang, Lijun; Zhang, Chijun; Chen, He; Luo, Yusheng; Su, Le; Chen, Yong; Li, Xianchang; Zhou, Xiaodong; Jia, Jun; Shen, Aiguo; Hu, Jiming

    2013-06-01

    Oral squamous cell carcinoma is the most common neoplasm of the oral cavity. The incidence rate accounts for 80% of total oral cancer and shows an upward trend in recent years. It has a high degree of malignancy and is difficult to detect in terms of differential diagnosis, as a consequence of which the timing of treatment is always delayed. In this work, Raman spectroscopy was adopted to differentially diagnose oral squamous cell carcinoma and oral gland carcinoma. In total, 852 entries of raw spectral data which consisted of 631 items from 36 oral squamous cell carcinoma patients, 87 items from four oral gland carcinoma patients and 134 items from five normal people were collected by utilizing an optical method on oral tissues. The probability distribution of the datasets corresponding to the spectral peaks of the oral squamous cell carcinoma tissue was analyzed and the experimental result showed that the data obeyed a normal distribution. Moreover, the distribution characteristic of the noise was also in compliance with a Gaussian distribution. A Gaussian process (GP) classification method was utilized to distinguish the normal people and the oral gland carcinoma patients from the oral squamous cell carcinoma patients. The experimental results showed that all the normal people could be recognized. 83.33% of the oral squamous cell carcinoma patients could be correctly diagnosed and the remaining ones would be diagnosed as having oral gland carcinoma. For the classification process of oral gland carcinoma and oral squamous cell carcinoma, the correct ratio was 66.67% and the erroneously diagnosed percentage was 33.33%. The total sensitivity was 80% and the specificity was 100% with the Matthews correlation coefficient (MCC) set to 0.447 213 595. Considering the numerical results above, the application prospects and clinical value of this technique are significantly impressive.
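
    The sensitivity, specificity and Matthews correlation coefficient quoted in this record are all functions of a confusion matrix. The sketch below computes them from four counts. The counts used are hypothetical: they were chosen only because they are numerically consistent with the reported 80%, 100% and 0.447 213 595, and the study's actual confusion matrix is not given in the abstract.

```python
from math import sqrt

def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, MCC) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the MCC is reported as 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc

# Hypothetical counts (not taken from the study).
sens, spec, mcc = diagnostic_metrics(tp=12, fn=3, tn=1, fp=0)
```

    With these illustrative counts the MCC equals 1/sqrt(5), which shows how a high sensitivity and a perfect specificity can still yield a modest MCC when the negative class is very small.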

  17. Laser Raman detection for oral cancer based on a Gaussian process classification method

    International Nuclear Information System (INIS)

    Du, Zhanwei; Yang, Yongjian; Bai, Yuan; Wang, Lijun; Zhang, Chijun; Chen, He; Luo, Yusheng; Su, Le; Chen, Yong; Li, Xianchang; Zhou, Xiaodong; Shen, Aiguo; Hu, Jiming; Jia, Jun

    2013-01-01

    Oral squamous cell carcinoma is the most common neoplasm of the oral cavity. The incidence rate accounts for 80% of total oral cancer and shows an upward trend in recent years. It has a high degree of malignancy and is difficult to detect in terms of differential diagnosis, as a consequence of which the timing of treatment is always delayed. In this work, Raman spectroscopy was adopted to differentially diagnose oral squamous cell carcinoma and oral gland carcinoma. In total, 852 entries of raw spectral data which consisted of 631 items from 36 oral squamous cell carcinoma patients, 87 items from four oral gland carcinoma patients and 134 items from five normal people were collected by utilizing an optical method on oral tissues. The probability distribution of the datasets corresponding to the spectral peaks of the oral squamous cell carcinoma tissue was analyzed and the experimental result showed that the data obeyed a normal distribution. Moreover, the distribution characteristic of the noise was also in compliance with a Gaussian distribution. A Gaussian process (GP) classification method was utilized to distinguish the normal people and the oral gland carcinoma patients from the oral squamous cell carcinoma patients. The experimental results showed that all the normal people could be recognized. 83.33% of the oral squamous cell carcinoma patients could be correctly diagnosed and the remaining ones would be diagnosed as having oral gland carcinoma. For the classification process of oral gland carcinoma and oral squamous cell carcinoma, the correct ratio was 66.67% and the erroneously diagnosed percentage was 33.33%. The total sensitivity was 80% and the specificity was 100% with the Matthews correlation coefficient (MCC) set to 0.447 213 595. Considering the numerical results above, the application prospects and clinical value of this technique are significantly impressive. (letter)

  18. A collection of annotated and harmonized human breast cancer transcriptome datasets, including immunologic classification [version 2; referees: 2 approved]

    Directory of Open Access Journals (Sweden)

    Jessica Roelands

    2018-02-01

    Full Text Available The increased application of high-throughput approaches in translational research has expanded the number of publicly available data repositories. Gathering additional valuable information contained in the datasets represents a crucial opportunity in the biomedical field. To facilitate and stimulate utilization of these datasets, we have recently developed an interactive data browsing and visualization web application, the Gene Expression Browser (GXB). In this note, we describe a curated compendium of 13 public datasets on human breast cancer, representing a total of 2142 transcriptome profiles. We classified the samples according to different immune based classification systems and integrated this information into the datasets. Annotated and harmonized datasets were uploaded to GXB. Study samples were categorized in different groups based on their immunologic tumor response profiles, intrinsic molecular subtypes and multiple clinical parameters. Ranked gene lists were generated based on relevant group comparisons. In this data note, we demonstrate the utility of GXB to evaluate the expression of a gene of interest, find differential gene expression between groups and investigate potential associations between variables with a specific focus on immunologic classification in breast cancer. This interactive resource is publicly available online at: http://breastcancer.gxbsidra.org/dm3/geneBrowser/list.

  19. hemaClass.org: Online One-By-One Microarray Normalization and Classification of Hematological Cancers for Precision Medicine.

    Science.gov (United States)

    Falgreen, Steffen; Ellern Bilgrau, Anders; Brøndum, Rasmus Froberg; Hjort Jakobsen, Lasse; Have, Jonas; Lindblad Nielsen, Kasper; El-Galaly, Tarec Christoffer; Bødker, Julie Støve; Schmitz, Alexander; H Young, Ken; Johnsen, Hans Erik; Dybkær, Karen; Bøgsted, Martin

    2016-01-01

    Dozens of omics-based cancer classification systems have been introduced with prognostic, diagnostic, and predictive capabilities. However, they often employ complex algorithms and are applicable only to whole cohorts of patients, making them difficult to apply in a personalized clinical setting. This prompted us to create hemaClass.org, an online web application providing an easy interface to one-by-one RMA normalization of microarrays and subsequent risk classifications of diffuse large B-cell lymphoma (DLBCL) into cell-of-origin and chemotherapeutic sensitivity classes. Classification results for one-by-one array pre-processing with and without a laboratory specific RMA reference dataset were compared to cohort based classifiers in 4 publicly available datasets. Classifications showed high agreement between one-by-one and whole cohort pre-processed data when a laboratory specific reference set was supplied. The website is essentially the R-package hemaClass accompanied by a Shiny web application. The well-documented package can be used to run the website locally or to use the developed methods programmatically. The website and R-package are relevant for biological and clinical lymphoma researchers using Affymetrix U-133 Plus 2 arrays, as they provide reliable and swift methods for calculation of disease subclasses. The proposed one-by-one pre-processing method is relevant for all researchers using microarrays.
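
    One ingredient of RMA is quantile normalization, and the one-by-one idea in this record can be pictured as normalizing a single new array against quantiles stored from a reference set. The sketch below illustrates only that idea. It is not the hemaClass API; the function name, the toy intensities, and the simple handling of ties are all assumptions.

```python
def normalize_to_reference(array, reference_quantiles):
    """Replace each intensity by the reference quantile of the same rank.

    array               : intensities of one new microarray (list of floats)
    reference_quantiles : sorted target values of the same length, for
                          example derived from a laboratory reference set
    Ties are broken by position, a simplification of real implementations.
    """
    if len(array) != len(reference_quantiles):
        raise ValueError("array and reference must have the same length")
    order = sorted(range(len(array)), key=lambda i: array[i])
    normalized = [0.0] * len(array)
    for rank, index in enumerate(order):
        normalized[index] = reference_quantiles[rank]
    return normalized

# Hypothetical probe intensities and reference quantiles.
new_array = [5.0, 1.0, 3.0]
reference = [10.0, 20.0, 30.0]
normalized = normalize_to_reference(new_array, reference)
```

    Because the reference quantiles are fixed in advance, a single patient's array can be processed without re-normalizing an entire cohort, which is the property the record emphasizes for personalized use.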

  20. Deep learning for EEG-Based preference classification

    Science.gov (United States)

    Teo, Jason; Hou, Chew Lin; Mountstephens, James

    2017-10-01

    Electroencephalogram (EEG)-based emotion classification is rapidly becoming one of the most intensely studied areas of brain-computer interfacing (BCI). The ability to passively identify yet accurately correlate brainwaves with our immediate emotions opens up truly meaningful and previously unattainable human-computer interactions such as in forensic neuroscience, rehabilitative medicine, affective entertainment and neuro-marketing. One particularly useful yet rarely explored area of EEG-based emotion classification is preference recognition [1], which is simply the detection of like versus dislike. Within the limited investigations into preference classification, all reported studies were based on musically-induced stimuli except for a single study which used 2D images. The main objective of this study is to apply deep learning, which has been shown to produce state-of-the-art results in diverse hard problems such as in computer vision, natural language processing and audio recognition, to 3D object preference classification over a larger group of test subjects. A cohort of 16 users was shown 60 bracelet-like objects as rotating visual stimuli on a computer display while their preferences and EEGs were recorded. After training a variety of machine learning approaches which included deep neural networks, we then attempted to classify the users' preferences for the 3D visual stimuli based on their EEGs. Here, we show that deep learning outperforms a variety of other machine learning classifiers for this EEG-based preference classification task, particularly in a highly challenging dataset with large inter- and intra-subject variability.

  1. Computerized three-class classification of MRI-based prognostic markers for breast cancer

    Energy Technology Data Exchange (ETDEWEB)

    Bhooshan, Neha; Giger, Maryellen; Edwards, Darrin; Yuan Yading; Jansen, Sanaz; Li Hui; Lan Li; Newstead, Gillian [Department of Radiology, University of Chicago, Chicago, IL 60637 (United States); Sattar, Husain, E-mail: bhooshan@uchicago.edu [Department of Pathology, University of Chicago, Chicago, IL 60637 (United States)

    2011-09-21

    The purpose of this study is to investigate whether computerized analysis using three-class Bayesian artificial neural network (BANN) feature selection and classification can characterize tumor grades (grade 1, grade 2 and grade 3) of breast lesions for prognostic classification on DCE-MRI. A database of 26 IDC grade 1 lesions, 86 IDC grade 2 lesions and 58 IDC grade 3 lesions was collected. The computer automatically segmented the lesions, and kinetic and morphological lesion features were automatically extracted. The discrimination tasks (grade 1 versus grade 3, grade 2 versus grade 3, and grade 1 versus grade 2 lesions) were investigated. Step-wise feature selection was conducted by three-class BANNs. Classification was performed with three-class BANNs using leave-one-lesion-out cross-validation to yield computer-estimated probabilities of being grade 3 lesion, grade 2 lesion and grade 1 lesion. Two-class ROC analysis was used to evaluate the performances. We achieved AUC values of 0.80 ± 0.05, 0.78 ± 0.05 and 0.62 ± 0.05 for grade 1 versus grade 3, grade 1 versus grade 2, and grade 2 versus grade 3, respectively. This study shows the potential for (1) applying three-class BANN feature selection and classification to CADx and (2) expanding the role of DCE-MRI CADx from diagnostic to prognostic classification in distinguishing tumor grades.
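
    The two-class ROC analysis mentioned in this record summarizes each pairwise task with an AUC value. A minimal sketch of the empirical AUC follows, computed as the fraction of correctly ordered positive/negative score pairs (the Mann-Whitney interpretation). The scores are hypothetical and the standard errors reported in the record are not reproduced here.

```python
def auc(positive_scores, negative_scores):
    """Empirical area under the ROC curve from two lists of classifier scores.

    Counts the fraction of (positive, negative) pairs in which the positive
    case receives the higher score; ties count as half a win.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))

# Hypothetical computer-estimated probabilities of being a grade 3 lesion.
grade3_scores = [0.9, 0.8, 0.6, 0.4]
grade1_scores = [0.7, 0.3, 0.2, 0.1]
area = auc(grade3_scores, grade1_scores)
```

    An AUC of 0.5 corresponds to chance ordering and 1.0 to perfect separation, which gives a scale for reading the three values reported above.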

  2. Knowledge-based approach to video content classification

    Science.gov (United States)

    Chen, Yu; Wong, Edward K.

    2001-01-01

    A framework for video content classification using a knowledge-based approach is herein proposed. This approach is motivated by the fact that videos are rich in semantic contents, which can best be interpreted and analyzed by human experts. We demonstrate the concept by implementing a prototype video classification system using the rule-based programming language CLIPS 6.05. Knowledge for video classification is encoded as a set of rules in the rule base. The left-hand sides of rules contain high level and low level features, while the right-hand sides of rules contain intermediate results or conclusions. Our current implementation includes features computed from motion, color, and text extracted from video frames. Our current rule set allows us to classify input video into one of five classes: news, weather, reporting, commercial, basketball and football. We use MYCIN's inexact reasoning method for combining evidence and for handling the uncertainties in the features and in the classification results. We obtained good results in a preliminary experiment, which demonstrated the validity of the proposed approach.
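
    The inexact reasoning referred to in this record combines certainty factors from several rules that support or oppose the same conclusion. The sketch below uses the commonly cited certainty-factor combination rule (in its EMYCIN-style form for mixed signs). Whether the prototype used exactly this variant is an assumption, and the example factors are hypothetical.

```python
def combine_cf(cf1, cf2):
    """Combine two certainty factors in [-1, 1] for the same hypothesis.

    The mixed-sign case is undefined when one factor is +1 and the
    other is -1 (total contradiction), so that case raises an error.
    """
    if cf1 >= 0 and cf2 >= 0:
        return cf1 + cf2 * (1 - cf1)
    if cf1 < 0 and cf2 < 0:
        return cf1 + cf2 * (1 + cf1)
    denominator = 1 - min(abs(cf1), abs(cf2))
    if denominator == 0:
        raise ValueError("cannot combine fully contradictory evidence")
    return (cf1 + cf2) / denominator

# Hypothetical rules: a motion cue and a text cue both suggest "basketball".
supporting = combine_cf(0.6, 0.4)
# A colour cue that argues against the same class.
conflicting = combine_cf(0.8, -0.3)
```

    Two agreeing rules reinforce each other without ever exceeding 1, while a conflicting rule pulls the combined belief back toward 0.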

  3. Land Cover and Land Use Classification with TWOPAC: towards Automated Processing for Pixel- and Object-Based Image Classification

    Directory of Open Access Journals (Sweden)

    Stefan Dech

    2012-09-01

    Full Text Available We present a novel and innovative automated processing environment for the derivation of land cover (LC) and land use (LU) information. This processing framework named TWOPAC (TWinned Object and Pixel based Automated classification Chain) enables the standardized, independent, user-friendly, and comparable derivation of LC and LU information, with minimized manual classification labor. TWOPAC allows classification of multi-spectral and multi-temporal remote sensing imagery from different sensor types. TWOPAC enables not only pixel-based classification, but also allows classification based on object-based characteristics. Classification is based on a Decision Tree approach (DT) for which the well-known C5.0 code has been implemented, which builds decision trees based on the concept of information entropy. TWOPAC enables automatic generation of the decision tree classifier based on a C5.0-retrieved ascii-file, as well as fully automatic validation of the classification output via sample based accuracy assessment. Envisaging the automated generation of standardized land cover products, as well as area-wide classification of large amounts of data in preferably a short processing time, standardized interfaces for process control, Web Processing Services (WPS), as introduced by the Open Geospatial Consortium (OGC), are utilized. TWOPAC’s functionality to process geospatial raster or vector data via web resources (server, network) enables TWOPAC’s usability independent of any commercial client or desktop software and allows for large scale data processing on servers. Furthermore, the components of TWOPAC were built-up using open source code components and are implemented as a plug-in for Quantum GIS software for easy handling of the classification process from the user’s perspective.
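
    The record notes that the decision trees are built on the concept of information entropy. As a simplified sketch, the code below computes Shannon entropy and the plain information gain of one threshold split. C4.5-family learners refine this criterion (for example with a gain ratio), and the feature values, class labels, and thresholds here are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(values, labels, threshold):
    """Entropy reduction from splitting at values <= threshold."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    weighted = sum(
        len(part) / len(labels) * entropy(part)
        for part in (left, right)
        if part  # an empty side contributes nothing
    )
    return entropy(labels) - weighted

# Hypothetical vegetation-index values for six training pixels.
index_values = [0.1, 0.2, 0.3, 0.6, 0.7, 0.8]
classes = ["water", "water", "water", "forest", "forest", "forest"]
best = information_gain(index_values, classes, threshold=0.45)
worse = information_gain(index_values, classes, threshold=0.15)
```

    A tree learner evaluates candidate thresholds like these for every feature and keeps the split with the largest gain, then repeats the process on each branch.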

  4. A network-based biomarker approach for molecular investigation and diagnosis of lung cancer

    Directory of Open Access Journals (Sweden)

    Chen Bor-Sen

    2011-01-01

    Full Text Available Abstract Background Lung cancer is the leading cause of cancer deaths worldwide. Many studies have investigated the carcinogenic process and identified the biomarkers for signature classification. However, based on the research dedicated to this field, there is no highly sensitive network-based method for carcinogenesis characterization and diagnosis from the systems perspective. Methods In this study, a systems biology approach integrating microarray gene expression profiles and protein-protein interaction information was proposed to develop a network-based biomarker for molecular investigation into the network mechanism of lung carcinogenesis and diagnosis of lung cancer. The network-based biomarker consists of two protein association networks constructed for cancer samples and non-cancer samples. Results Based on the network-based biomarker, a total of 40 significant proteins in lung carcinogenesis were identified with carcinogenesis relevance values (CRVs. In addition, the network-based biomarker, acting as the screening test, proved to be effective in diagnosing smokers with signs of lung cancer. Conclusions A network-based biomarker using constructed protein association networks is a useful tool to highlight the pathways and mechanisms of the lung carcinogenic process and, more importantly, provides potential therapeutic targets to combat cancer.

  5. The generalization ability of online SVM classification based on Markov sampling.

    Science.gov (United States)

    Xu, Jie; Yan Tang, Yuan; Zou, Bin; Xu, Zongben; Li, Luoqing; Lu, Yang

    2015-03-01

    In this paper, we consider online support vector machine (SVM) classification learning algorithms with uniformly ergodic Markov chain (u.e.M.c.) samples. We establish the bound on the misclassification error of an online SVM classification algorithm with u.e.M.c. samples based on reproducing kernel Hilbert spaces and obtain a satisfactory convergence rate. We also introduce a novel online SVM classification algorithm based on Markov sampling, and present the numerical studies on the learning ability of online SVM classification based on Markov sampling for benchmark repository. The numerical studies show that the learning performance of the online SVM classification algorithm based on Markov sampling is better than that of classical online SVM classification based on random sampling as the size of training samples is larger.
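
    Illustrative sketch (not part of the record): an online SVM processes one sample at a time and updates its decision function after each one. The snippet below is a minimal linear version with a regularized hinge-loss update and a decaying step size. It is an assumption-laden simplification: the paper works in reproducing kernel Hilbert spaces and with Markov chain samples, neither of which is modelled here, and the function names and toy data are hypothetical.

```python
def online_svm(stream, dim, lam=0.01):
    """One pass of regularized hinge-loss updates over (x, y) pairs, y in {-1, +1}."""
    w = [0.0] * dim
    for t, (x, y) in enumerate(stream, start=1):
        eta = 1.0 / (lam * t)  # decaying step size (an assumed, Pegasos-style choice)
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        # Shrink w (regularization term), then correct it if the margin is violated.
        w = [(1.0 - eta * lam) * wi for wi in w]
        if margin < 1.0:
            w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w

def predict(w, x):
    """Sign of the linear decision function."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1

# Hypothetical, linearly separable toy stream presented five times.
stream = [([2.0, 1.0], 1), ([-2.0, -1.0], -1), ([1.5, 0.5], 1), ([-1.0, -2.0], -1)] * 5
w = online_svm(stream, dim=2)
```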

  6. Classification of prostate cancer grade using temporal ultrasound: in vivo feasibility study

    Science.gov (United States)

    Ghavidel, Sahar; Imani, Farhad; Khallaghi, Siavash; Gibson, Eli; Khojaste, Amir; Gaed, Mena; Moussa, Madeleine; Gomez, Jose A.; Siemens, D. Robert; Leveridge, Michael; Chang, Silvia; Fenster, Aaron; Ward, Aaron D.; Abolmaesumi, Purang; Mousavi, Parvin

    2016-03-01

    Temporal ultrasound has been shown to have high classification accuracy in differentiating cancer from benign tissue. In this paper, we extend the temporal ultrasound method to classify lower grade Prostate Cancer (PCa) from all other grades. We use a group of nine patients with mostly lower grade PCa, where cancerous regions are also limited. A critical challenge is to train a classifier with limited aggressive cancerous tissue compared to low grade cancerous tissue. To resolve the problem of imbalanced data, we use Synthetic Minority Oversampling Technique (SMOTE) to generate synthetic samples for the minority class. We calculate spectral features of temporal ultrasound data and perform feature selection using Random Forests. In leave-one-patient-out cross-validation strategy, an area under receiver operating characteristic curve (AUC) of 0.74 is achieved with overall sensitivity and specificity of 70%. Using an unsupervised learning approach prior to proposed method improves sensitivity and AUC to 80% and 0.79. This work represents promising results to classify lower and higher grade PCa with limited cancerous training samples, using temporal ultrasound.
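
    Illustrative sketch (not part of the record): SMOTE, named in the abstract above, creates synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours. The snippet below is a minimal pure-Python version of that idea; the function name, the parameter defaults and the toy feature vectors are hypothetical and do not come from the study.

```python
import random

def smote(minority, n_synthetic, k=2, seed=0):
    """Create synthetic minority samples by interpolating toward near neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample (excluding itself).
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        u = rng.random()  # position along the segment from base to neighbour
        synthetic.append([a + u * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Hypothetical two-feature minority class.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_samples = smote(minority, n_synthetic=6)
```

    Because every synthetic point lies on a segment between two real minority samples, it stays inside the region already occupied by the minority class. In a leave-one-patient-out evaluation, oversampling would normally be applied to the training folds only.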

  7. Rough set soft computing cancer classification and network: one stone, two birds.

    Science.gov (United States)

    Zhang, Yue

    2010-07-15

    Gene expression profiling provides tremendous information to help unravel the complexity of cancer. The selection of the most informative genes from huge noise for cancer classification has taken centre stage, along with predicting the function of such identified genes and the construction of direct gene regulatory networks at different system levels with a tuneable parameter. A new study by Wang and Gotoh described a novel Variable Precision Rough Sets-rooted robust soft computing method to successfully address these problems and has yielded some new insights. The significance of this progress and its perspectives will be discussed in this article.

  8. North American Magazine Coverage of Skin Cancer and Recreational Tanning Before and After the WHO/IARC 2009 Classification of Indoor Tanning Devices as Carcinogenic.

    Science.gov (United States)

    McWhirter, Jennifer E; Hoffman-Goetz, Laurie

    2015-09-01

    The mass media is an influential source of skin cancer information for the public. In 2009, the World Health Organization's International Agency for Research on Cancer classified UV radiation from tanning devices as carcinogenic. Our objective was to determine if media coverage of skin cancer and recreational tanning increased in volume or changed in nature after this classification. We conducted a directed content analysis on 29 North American popular magazines (2007-2012) to investigate the overall volume of articles on skin cancer and recreational tanning and, more specifically, the presence of skin cancer risk factors, UV behaviors, and early detection information in article text (n = 410) and images (n = 714). The volume of coverage on skin cancer and recreational tanning did not increase significantly after the 2009 classification of tanning beds as carcinogenic. Key-related messages, including that UV exposure is a risk factor for skin cancer and that indoor tanning should be avoided, were not reported more frequently after the classification, but the promotion of the tanned look as attractive was conveyed more often in images afterwards (p skin cancer risk factors, other UV behaviors, or early detection information over time. The classification of indoor tanning beds as carcinogenic had no significant impact on the volume or nature of skin cancer and recreational tanning coverage in magazines.

  9. Multi-label literature classification based on the Gene Ontology graph

    Directory of Open Access Journals (Sweden)

    Lu Xinghua

    2008-12-01

    Full Text Available Abstract Background The Gene Ontology is a controlled vocabulary for representing knowledge related to genes and proteins in a computable form. The current effort of manually annotating proteins with the Gene Ontology is outpaced by the rate of accumulation of biomedical knowledge in literature, which urges the development of text mining approaches to facilitate the process by automatically extracting the Gene Ontology annotation from literature. The task is usually cast as a text classification problem, and contemporary methods are confronted with unbalanced training data and the difficulties associated with multi-label classification. Results In this research, we investigated the methods of enhancing automatic multi-label classification of biomedical literature by utilizing the structure of the Gene Ontology graph. We have studied three graph-based multi-label classification algorithms, including a novel stochastic algorithm and two top-down hierarchical classification methods for multi-label literature classification. We systematically evaluated and compared these graph-based classification algorithms to a conventional flat multi-label algorithm. The results indicate that, through utilizing the information from the structure of the Gene Ontology graph, the graph-based multi-label classification methods can significantly improve predictions of the Gene Ontology terms implied by the analyzed text. Furthermore, the graph-based multi-label classifiers are capable of suggesting Gene Ontology annotations (to curators) that are closely related to the true annotations even if they fail to predict the true ones directly. A software package implementing the studied algorithms is available for the research community. Conclusion Through utilizing the information from the structure of the Gene Ontology graph, the graph-based multi-label classification methods have better potential than the conventional flat multi-label classification approach to facilitate

  10. A review of supervised object-based land-cover image classification

    Science.gov (United States)

    Ma, Lei; Li, Manchun; Ma, Xiaoxue; Cheng, Liang; Du, Peijun; Liu, Yongxue

    2017-08-01

    Object-based image classification for land-cover mapping purposes using remote-sensing imagery has attracted significant attention in recent years. Numerous studies conducted over the past decade have investigated a broad array of sensors, feature selection, classifiers, and other factors of interest. However, these research results have not yet been synthesized to provide coherent guidance on the effect of different supervised object-based land-cover classification processes. In this study, we first construct a database with 28 fields using qualitative and quantitative information extracted from 254 experimental cases described in 173 scientific papers. Second, the results of the meta-analysis are reported, including general characteristics of the studies (e.g., the geographic range of relevant institutes, preferred journals) and the relationships between factors of interest (e.g., spatial resolution and study area or optimal segmentation scale, accuracy and number of targeted classes), especially with respect to the classification accuracy of different sensors, segmentation scale, training set size, supervised classifiers, and land-cover types. Third, useful data on supervised object-based image classification are determined from the meta-analysis. For example, we find that supervised object-based classification is currently experiencing rapid advances, while development of the fuzzy technique is limited in the object-based framework. Furthermore, spatial resolution correlates with the optimal segmentation scale and study area, and Random Forest (RF) shows the best performance in object-based classification. The area-based accuracy assessment method can obtain stable classification performance, and indicates a strong correlation between accuracy and training set size, while the accuracy of the point-based method is likely to be unstable due to mixed objects. In addition, the overall accuracy benefits from higher spatial resolution images (e.g., unmanned aerial

  11. Classification tree analysis of second neoplasms in survivors of childhood cancer

    International Nuclear Information System (INIS)

    Jazbec, Janez; Todorovski, Ljupčo; Jereb, Berta

    2007-01-01

    Reports on childhood cancer survivors estimate the cumulative probability of developing secondary neoplasms at between 3.3% and 25% at 25 years from diagnosis, and the risk of developing another cancer at several times that of the general population. In our retrospective study, we used the classification tree multivariate method on a group of 849 first cancer survivors to identify childhood cancer patients with the greatest risk of developing secondary neoplasms. In the observed group of patients, 34 developed a secondary neoplasm after treatment of the primary cancer. Analysis of the parameters present at the treatment of the first cancer exposed two groups of patients at special risk for secondary neoplasm. The first are female patients treated for Hodgkin's disease between the ages of 10 and 15 years, whose treatment included radiotherapy. The second group at special risk were male patients with acute lymphoblastic leukemia who were treated between the ages of 4.6 and 6.6 years. The risk groups identified in our study are similar to the results of studies that used more conventional approaches. The usefulness of our approach in the study of the occurrence of second neoplasms should be confirmed in a larger sample study, but the user-friendly presentation of results makes it attractive for further studies.

  12. Structure-based classification and ontology in chemistry

    Directory of Open Access Journals (Sweden)

    Hastings Janna

    2012-04-01

    Full Text Available Abstract Background Recent years have seen an explosion in the availability of data in the chemistry domain. With this information explosion, however, retrieving relevant results from the available information, and organising those results, become even harder problems. Computational processing is essential to filter and organise the available resources so as to better facilitate the work of scientists. Ontologies encode expert domain knowledge in a hierarchically organised machine-processable format. One such ontology for the chemical domain is ChEBI. ChEBI provides a classification of chemicals based on their structural features and a role or activity-based classification. An example of a structure-based class is 'pentacyclic compound' (compounds containing five-ring structures, while an example of a role-based class is 'analgesic', since many different chemicals can act as analgesics without sharing structural features. Structure-based classification in chemistry exploits elegant regularities and symmetries in the underlying chemical domain. As yet, there has been neither a systematic analysis of the types of structural classification in use in chemistry nor a comparison to the capabilities of available technologies. Results We analyze the different categories of structural classes in chemistry, presenting a list of patterns for features found in class definitions. We compare these patterns of class definition to tools which allow for automation of hierarchy construction within cheminformatics and within logic-based ontology technology, going into detail in the latter case with respect to the expressive capabilities of the Web Ontology Language and recent extensions for modelling structured objects. Finally we discuss the relationships and interactions between cheminformatics approaches and logic-based approaches. Conclusion Systems that perform intelligent reasoning tasks on chemistry data require a diverse set of underlying computational

  13. Contextual segment-based classification of airborne laser scanner data

    NARCIS (Netherlands)

    Vosselman, George; Coenen, Maximilian; Rottensteiner, Franz

    2017-01-01

    Classification of point clouds is needed as a first step in the extraction of various types of geo-information from point clouds. We present a new approach to contextual classification of segmented airborne laser scanning data. Potential advantages of segment-based classification are easily offset

  14. Analysis and classification of oncology activities on the way to workflow based single source documentation in clinical information systems.

    Science.gov (United States)

    Wagner, Stefan; Beckmann, Matthias W; Wullich, Bernd; Seggewies, Christof; Ries, Markus; Bürkle, Thomas; Prokosch, Hans-Ulrich

    2015-12-22

    Today, cancer documentation is still a tedious task involving many different information systems even within a single institution and it is rarely supported by appropriate documentation workflows. In a comprehensive 14 step analysis we compiled diagnostic and therapeutic pathways for 13 cancer entities using a mixed approach of document analysis, workflow analysis, expert interviews, workflow modelling and feedback loops. These pathways were stepwise classified and categorized to create a final set of grouped pathways and workflows including electronic documentation forms. A total of 73 workflows for the 13 entities, based on 82 paper documentation forms in addition to computer based documentation systems, were compiled in a 724 page document comprising 130 figures, 94 tables and 23 tumour classifications as well as 12 follow-up tables. Stepwise classification made it possible to derive grouped diagnostic and therapeutic pathways for the three major classes: solid entities with surgical therapy, solid entities with surgical and additional therapeutic activities, and non-solid entities. For these classes it was possible to deduce common documentation workflows to support workflow-guided single-source documentation. Clinical documentation activities within a Comprehensive Cancer Center can likely be realized in a set of three documentation workflows with conditional branching in a modern workflow supporting clinical information system.

  15. A comparative study of breast cancer diagnosis based on neural network ensemble via improved training algorithms.

    Science.gov (United States)

    Azami, Hamed; Escudero, Javier

    2015-08-01

    Breast cancer is one of the most common types of cancer in women all over the world. Early diagnosis of this kind of cancer can significantly increase the chances of long-term survival. Since diagnosis of breast cancer is a complex problem, neural network (NN) approaches have been used as a promising solution. Considering the low speed of the back-propagation (BP) algorithm to train a feed-forward NN, we consider a number of improved NN trainings for the Wisconsin breast cancer dataset: BP with momentum, BP with adaptive learning rate, BP with adaptive learning rate and momentum, Polak-Ribière conjugate gradient algorithm (CGA), Fletcher-Reeves CGA, Powell-Beale CGA, scaled CGA, resilient BP (RBP), one-step secant and quasi-Newton methods. An NN ensemble, which is a learning paradigm to combine a number of NN outputs, is used to improve the accuracy of the classification task. Results demonstrate that NN ensemble-based classification methods have better performance than NN-based algorithms. The highest overall average accuracy is 97.68% obtained by NN ensemble trained by RBP for 50%-50% training-test evaluation method.
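
    Illustrative sketch (not part of the record): the abstract above describes an ensemble that combines the outputs of several neural networks. The abstract does not state the combination rule, so the snippet below assumes plain majority voting over predicted labels, which is only one of several possible rules. The function name and the per-network predictions are hypothetical.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-model label lists into one label per sample by plurality."""
    combined = []
    for sample_votes in zip(*predictions):
        combined.append(Counter(sample_votes).most_common(1)[0][0])
    return combined

# Hypothetical outputs of three networks on five samples (B = benign, M = malignant).
net_a = ["B", "M", "M", "B", "M"]
net_b = ["B", "M", "B", "B", "M"]
net_c = ["M", "M", "M", "B", "B"]
ensemble = majority_vote([net_a, net_b, net_c])
```

    With an odd number of members and two classes there are no ties, and a single network's mistake on a sample is outvoted whenever the other two agree.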

  16. Integrating Globality and Locality for Robust Representation Based Classification

    Directory of Open Access Journals (Sweden)

    Zheng Zhang

    2014-01-01

    Full Text Available The representation based classification method (RBCM) has shown huge potential for face recognition since it first emerged. Linear regression classification (LRC) method and collaborative representation classification (CRC) method are two well-known RBCMs. LRC and CRC exploit training samples of each class and all the training samples to represent the testing sample, respectively, and subsequently conduct classification on the basis of the representation residual. LRC method can be viewed as a “locality representation” method because it just uses the training samples of each class to represent the testing sample and it cannot embody the effectiveness of the “globality representation.” On the contrary, it seems that CRC method cannot own the benefit of locality of the general RBCM. Thus we propose to integrate CRC and LRC to perform more robust representation based classification. The experimental results on benchmark face databases substantially demonstrate that the proposed method achieves high classification accuracy.

  17. An Efficient Ensemble Learning Method for Gene Microarray Classification

    Directory of Open Access Journals (Sweden)

    Alireza Osareh

    2013-01-01

    Full Text Available Gene microarray analysis and classification have proven an effective route to the diagnosis of diseases and cancers. However, it has also been revealed that the basic classification techniques have intrinsic drawbacks in achieving accurate gene classification and cancer diagnosis. On the other hand, classifier ensembles have received increasing attention in various applications. Here, we address the gene classification issue using RotBoost ensemble methodology. This method is a combination of Rotation Forest and AdaBoost techniques which in turn preserve both desirable features of an ensemble architecture, that is, accuracy and diversity. To select a concise subset of informative genes, 5 different feature selection algorithms are considered. To assess the efficiency of the RotBoost, other nonensemble/ensemble techniques including Decision Trees, Support Vector Machines, Rotation Forest, AdaBoost, and Bagging are also deployed. Experimental results have revealed that the combination of the fast correlation-based feature selection method with ICA-based RotBoost ensemble is highly effective for gene classification. In fact, the proposed method can create ensemble classifiers which outperform not only the classifiers produced by the conventional machine learning but also the classifiers generated by two widely used conventional ensemble learning methods, that is, Bagging and AdaBoost.

  18. EMG finger movement classification based on ANFIS

    Science.gov (United States)

    Caesarendra, W.; Tjahjowidodo, T.; Nico, Y.; Wahyudati, S.; Nurhasanah, L.

    2018-04-01

    An increasing number of people suffering from stroke has driven the rapid development of finger and hand exoskeletons to enable automatic physical therapy. Prior to the development of a finger exoskeleton, an important research topic, namely machine learning for finger gesture classification, is addressed. This paper presents a study on EMG signal classification of 5 finger gestures as a preliminary study toward finger exoskeleton design and development in Indonesia. The EMG signals of 5 finger gestures were acquired using a Myo EMG sensor. The EMG signal features were extracted and reduced using PCA. ANFIS based learning is used to classify the reduced features of the 5 finger gestures. The result shows that the classification accuracy for the finger gestures is lower than that for 7 hand gestures.

  19. Aided diagnosis methods of breast cancer based on machine learning

    Science.gov (United States)

    Zhao, Yue; Wang, Nian; Cui, Xiaoyu

    2017-08-01

    In the field of medicine, quickly and accurately determining whether a patient's tumor is malignant or benign is the key to treatment. In this paper, K-Nearest Neighbor (KNN), Linear Discriminant Analysis (LDA), and Logistic Regression were applied to predict the classification of thyroid, Her-2, PR, ER, Ki67, metastasis and lymph nodes in breast cancer, in order to recognize benign and malignant breast tumors and achieve the purpose of aided diagnosis of breast cancer. The results showed that the highest classification accuracy of LDA was 88.56%, while the classification performance of KNN and Logistic Regression was better than that of LDA, with the best accuracy reaching 96.30%.

  20. Chinese Sentence Classification Based on Convolutional Neural Network

    Science.gov (United States)

    Gu, Chengwei; Wu, Ming; Zhang, Chuang

    2017-10-01

    Sentence classification is one of the significant issues in Natural Language Processing (NLP). Feature extraction is often regarded as the key point for natural language processing. Traditional machine learning approaches, such as the Naive Bayesian model, cannot take high-level features into consideration. A neural network for sentence classification can make use of contextual information to achieve better results in sentence classification tasks. In this paper, we focus on classifying Chinese sentences. Most importantly, we propose a novel Convolutional Neural Network (CNN) architecture for Chinese sentence classification. In particular, whereas most previous methods use a softmax classifier for prediction, we embed a linear support vector machine in place of softmax in the deep neural network model, minimizing a margin-based loss to get a better result. We also use tanh as the activation function instead of ReLU. The CNN model improves the results of Chinese sentence classification tasks. Experimental results on the Chinese news title database validate the effectiveness of our model.
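
    Illustrative sketch (not part of the record): the abstract above replaces the softmax output with a linear SVM trained on a margin-based loss. The exact loss is not given in the abstract, so the snippet below assumes a common multi-class hinge formulation (summing the margin violations of all wrong classes) and contrasts it with the softmax cross-entropy. The function names and score values are hypothetical.

```python
import math

def multiclass_hinge_loss(scores, target, margin=1.0):
    """Sum of margin violations of every wrong class against the target class."""
    return sum(
        max(0.0, margin + s - scores[target])
        for j, s in enumerate(scores)
        if j != target
    )

def softmax_cross_entropy(scores, target):
    """Negative log-probability of the target class under a softmax."""
    shift = max(scores)  # subtract the maximum for numerical stability
    log_norm = shift + math.log(sum(math.exp(s - shift) for s in scores))
    return log_norm - scores[target]

# Hypothetical class scores from the last layer for one sentence; class 0 is correct.
scores = [2.0, 0.5, 1.8]
hinge = multiclass_hinge_loss(scores, target=0)
xent = softmax_cross_entropy(scores, target=0)
```

    The hinge loss becomes exactly zero once the correct class leads every other class by the margin, whereas the cross-entropy stays positive for any finite scores.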

  1. Oral epithelial dysplasia classification systems

    DEFF Research Database (Denmark)

    Warnakulasuriya, S; Reibel, J; Bouquot, J

    2008-01-01

    At a workshop coordinated by the WHO Collaborating Centre for Oral Cancer and Precancer in the United Kingdom issues related to potentially malignant disorders of the oral cavity were discussed by an expert group. The consensus views of the Working Group are presented in a series of papers....... In this report, we review the oral epithelial dysplasia classification systems. The three classification schemes [oral epithelial dysplasia scoring system, squamous intraepithelial neoplasia and Ljubljana classification] were presented and the Working Group recommended epithelial dysplasia grading for routine...... use. Although most oral pathologists possibly recognize and accept the criteria for grading epithelial dysplasia, firstly based on architectural features and then of cytology, there is great variability in their interpretation of the presence, degree and significance of the individual criteria...

  2. Preliminary Research on Grassland Fine-classification Based on MODIS

    International Nuclear Information System (INIS)

    Hu, Z W; Zhang, S; Yu, X Y; Wang, X S

    2014-01-01

    Grassland ecosystems are important for climatic regulation and for maintaining soil and water. Research on grassland monitoring methods could provide an effective reference for grassland resource investigation. In this study, we used the vegetation index method for grassland classification. There are several types of climate in China. Therefore, we used China's Main Climate Zone Maps to divide the study region into four climate zones. Based on the grassland classification system of the first nation-wide grass resource survey in China, we established a new grassland classification system suited only to this research. We used MODIS images as the basic data resource and used the expert classifier method to perform grassland classification. Based on the 1:1,000,000 Grassland Resource Map of China, we obtained the basic distribution of all the grassland types and selected 20 samples evenly distributed in each type, then used the NDVI/EVI product to summarize the spectral features of the different grassland types. Finally, we introduced other auxiliary classification data, such as elevation, accumulated temperature (AT), humidity index (HI) and rainfall. China's nation-wide grassland classification map results from merging the grassland classifications of the different climate zones. The overall classification accuracy is 60.4%. The result indicates that the expert classifier is suitable for nation-wide grassland classification, but the classification accuracy needs to be improved.
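
    Illustrative sketch (not part of the record): the study above relies on vegetation indices and an expert (rule-based) classifier. The snippet below computes the standard NDVI, (NIR - Red) / (NIR + Red), and feeds it to a toy rule set. The rule thresholds, the class names and the reflectance values are hypothetical and are not taken from the study.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    denom = nir + red
    return 0.0 if denom == 0 else (nir - red) / denom

def expert_rule(ndvi_value, elevation_m):
    """Toy rule set; thresholds and class names are hypothetical."""
    if ndvi_value < 0.2:
        return "sparse grassland"
    if elevation_m > 3000:
        return "alpine meadow"
    return "dense grassland"

# Hypothetical reflectance values and elevation for a single pixel.
pixel_ndvi = ndvi(nir=0.45, red=0.15)
label = expert_rule(pixel_ndvi, elevation_m=3500)
```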

  3. An Authentication Technique Based on Classification

    Institute of Scientific and Technical Information of China (English)

    李钢; 杨杰

    2004-01-01

    We present a novel watermarking approach based on classification for authentication, in which a watermark is embedded into the host image. When the marked image is modified, the extracted watermark differs from the original watermark, and different kinds of modification lead to different extracted watermarks. In this paper, different kinds of modification are treated as classes, and we use a classification algorithm to recognize the modifications with high probability. Simulation results show that the proposed method is promising and effective.

  4. Genetic Bee Colony (GBC) algorithm: A new gene selection method for microarray cancer classification.

    Science.gov (United States)

    Alshamlan, Hala M; Badr, Ghada H; Alohali, Yousef A

    2015-06-01

    Naturally inspired evolutionary algorithms have proven effective when used for solving feature selection and classification problems. Artificial Bee Colony (ABC) is a relatively new swarm intelligence method. In this paper, we propose a new hybrid gene selection method, namely the Genetic Bee Colony (GBC) algorithm. The proposed algorithm combines the use of a Genetic Algorithm (GA) with the Artificial Bee Colony (ABC) algorithm. The goal is to integrate the advantages of both algorithms. The proposed algorithm is applied to a microarray gene expression profile in order to select the most predictive and informative genes for cancer classification. In order to test the accuracy performance of the proposed algorithm, extensive experiments were conducted. Three binary microarray datasets are used: colon, leukemia, and lung. In addition, another three multi-class microarray datasets are used: SRBCT, lymphoma, and leukemia. Results of the GBC algorithm are compared with our recently proposed technique: mRMR when combined with the Artificial Bee Colony algorithm (mRMR-ABC). We also compared the combination of mRMR with GA (mRMR-GA) and Particle Swarm Optimization (mRMR-PSO) algorithms. In addition, we compared the GBC algorithm with other related algorithms that have been recently published in the literature, using all benchmark datasets. The GBC algorithm shows superior performance as it achieved the highest classification accuracy along with the lowest average number of selected genes. This proves that the GBC algorithm is a promising approach for solving the gene selection problem in both binary and multi-class cancer classification. Copyright © 2015 Elsevier Ltd. All rights reserved.

  5. Classification of follicular lymphoma images: a holistic approach with symbol-based machine learning methods.

    Science.gov (United States)

    Zorman, Milan; Sánchez de la Rosa, José Luis; Dinevski, Dejan

    2011-12-01

    Symbol-based machine learning approaches are seldom used for image classification and recognition. In this paper we present such an approach, which we first used on follicular lymphoma images. Lymphoma is a broad term encompassing a variety of cancers of the lymphatic system. Lymphoma is differentiated by the type of cell that multiplies and how the cancer presents itself. It is very important to get an exact diagnosis regarding lymphoma and to determine the treatments that will be most effective for the patient's condition. Our work was focused on the identification of lymphomas by finding follicles in microscopy images provided by the Laboratory of Pathology in the University Hospital of Tenerife, Spain. We divided our work into two stages: in the first stage we did image pre-processing and feature extraction, and in the second stage we used different symbolic machine learning approaches for pixel classification. Symbolic machine learning approaches are often neglected when looking for image analysis tools. They are not only known for a very appropriate knowledge representation, but also claimed to lack computational power. The results we got are very promising and show that symbolic approaches can be successful in image analysis applications.

  6. [Clinical Study of 2014 ISUP New Grade Group Classification for Prostate Cancer Patients Treated by Androgen Deprivation Therapy].

    Science.gov (United States)

    Uno, Masahiro; Kawase, Makoto; Kato, Daiki; Ishida, Takashi; Kato, Seiichi; Fujimoto, Yoshinori

    2018-01-01

    The 2014 International Society of Urological Pathology (ISUP) has proposed a new grade group (GG) classification for Gleason scores (GS). The usefulness of the new GG classification was investigated with 518 prostate cancer patients who underwent androgen deprivation therapy. According to the new GG classification, Stages B‒D and the new GG classification relapse-free rate for each stage were calculated using the Kaplan‒Meier method. The new GG classification revealed a significant difference for the relapse-free rate only between some groups. Analysis using the Cox proportional hazards model indicated that the risk of relapse was higher in GGs 4 and 5 than in GG 1. The usefulness of the 2014 ISUP new grade group classification for the relapse-free rate under androgen deprivation therapy awaits future examination.
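
    Illustrative sketch (not part of the record): the relapse-free rates above were calculated with the Kaplan‒Meier method. The snippet below is a minimal product-limit estimator: at each time with at least one relapse, the running survival is multiplied by (1 - relapses / patients still at risk). The function name and the follow-up data are hypothetical and unrelated to the 518 patients in the study.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each time where at least one event occurred.

    events[i] is True for a relapse at times[i], False for a censored patient.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        relapses = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)
        if relapses > 0:
            survival *= 1.0 - relapses / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # relapsed and censored patients both leave the risk set
    return curve

# Hypothetical follow-up in months; False marks a censored observation.
months = [6, 12, 12, 18, 24, 30]
relapsed = [True, True, False, True, False, True]
curve = kaplan_meier(months, relapsed)
```

    Censored patients lower the number at risk without lowering the curve, which is what distinguishes this estimator from a simple proportion of relapse-free patients.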

  7. Simple adaptive sparse representation based classification schemes for EEG based brain-computer interface applications.

    Science.gov (United States)

    Shin, Younghak; Lee, Seungchan; Ahn, Minkyu; Cho, Hohyun; Jun, Sung Chan; Lee, Heung-No

    2015-11-01

    One of the main problems related to electroencephalogram (EEG) based brain-computer interface (BCI) systems is the non-stationarity of the underlying EEG signals. This results in the deterioration of the classification performance during experimental sessions. Therefore, adaptive classification techniques are required for EEG based BCI applications. In this paper, we propose simple adaptive sparse representation based classification (SRC) schemes. Supervised and unsupervised dictionary update techniques for new test data and a dictionary modification method by using the incoherence measure of the training data are investigated. The proposed methods are very simple and additional computation for the re-training of the classifier is not needed. The proposed adaptive SRC schemes are evaluated using two BCI experimental datasets. The proposed methods are assessed by comparing classification results with the conventional SRC and other adaptive classification methods. On the basis of the results, we find that the proposed adaptive schemes show relatively improved classification accuracy as compared to conventional methods without requiring additional computation. Copyright © 2015 Elsevier Ltd. All rights reserved.

  8. Early detection of lung cancer from CT images: nodule segmentation and classification using deep learning

    Science.gov (United States)

    Sharma, Manu; Bhatt, Jignesh S.; Joshi, Manjunath V.

    2018-04-01

    Lung cancer is one of the leading causes of cancer deaths worldwide. It has a low survival rate, mainly due to late diagnosis. With hardware advancements in computed tomography (CT) technology, it is now possible to capture high-resolution images of the lung region. However, this needs to be augmented by efficient algorithms that detect lung cancer in its earlier stages from the acquired CT images. To this end, we propose a two-step algorithm for early detection of lung cancer. Given the CT image, we first extract the patch from the center location of the nodule and segment the lung nodule region. We propose to use the Otsu method followed by morphological operations for the segmentation. This step enables accurate segmentation due to the use of a data-driven threshold. Unlike other methods, we perform the segmentation without using the complete contour information of the nodule. In the second step, a deep convolutional neural network (CNN) is used for better classification (malignant or benign) of the nodule present in the segmented patch. Accurate segmentation of even a tiny nodule followed by better classification using a deep CNN enables the early detection of lung cancer. Experiments were conducted using 6306 CT images from the LIDC-IDRI database. We achieved a test accuracy of 84.13%, with sensitivity and specificity of 91.69% and 73.16%, respectively, clearly outperforming the state-of-the-art algorithms.
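
    The segmentation step described above (Otsu thresholding followed by morphological operations) might look roughly like the sketch below. The abstract does not say which morphological operations were used, so the 3x3 opening here is an assumption, as are the function names and the 8-bit patch format.

```python
import numpy as np


def otsu_threshold(patch):
    """Return the 8-bit grey level that maximizes between-class variance."""
    hist = np.bincount(patch.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                    # probability of class "<= t"
    mu = np.cumsum(prob * np.arange(256))      # cumulative mean up to t
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu[-1] * omega - mu) ** 2 / denom
    sigma_b[denom == 0] = 0.0                  # thresholds with an empty class
    return int(np.argmax(sigma_b))


def _shifted_windows(mask):
    """Yield the nine 3x3-neighbourhood shifts of a zero-padded boolean mask."""
    padded = np.pad(mask, 1, constant_values=False)
    h, w = mask.shape
    for dy in range(3):
        for dx in range(3):
            yield padded[dy:dy + h, dx:dx + w]


def erode(mask):
    out = np.ones_like(mask, dtype=bool)
    for window in _shifted_windows(mask):
        out &= window
    return out


def dilate(mask):
    out = np.zeros_like(mask, dtype=bool)
    for window in _shifted_windows(mask):
        out |= window
    return out


def segment_nodule(patch):
    """Threshold a uint8 patch with Otsu, then clean it with a 3x3 opening."""
    mask = patch > otsu_threshold(patch)
    return dilate(erode(mask))                 # opening removes isolated pixels
```

    Because the threshold is derived from the patch histogram itself, no fixed intensity cut-off has to be chosen in advance, which is the data-driven property the abstract refers to.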

  9. Breast cancer molecular subtype classification using deep features: preliminary results

    Science.gov (United States)

    Zhu, Zhe; Albadawy, Ehab; Saha, Ashirbani; Zhang, Jun; Harowicz, Michael R.; Mazurowski, Maciej A.

    2018-02-01

    Radiogenomics is a field of investigation that attempts to examine the relationship between imaging characteristics of cancerous lesions and their genomic composition. This could offer a noninvasive alternative to establishing genomic characteristics of tumors and aid cancer treatment planning. While deep learning has shown its superiority in many detection and classification tasks, breast cancer radiogenomic data suffers from a very limited number of training examples, which renders the training of the neural network for this problem directly and with no pretraining a very difficult task. In this study, we investigated an alternative deep learning approach referred to as deep features or off-the-shelf network approach to classify breast cancer molecular subtypes using breast dynamic contrast enhanced MRIs. We used the feature maps of different convolution layers and fully connected layers as features and trained support vector machines using these features for prediction. For the feature maps that have multiple layers, max-pooling was performed along each channel. We focused on distinguishing the Luminal A subtype from other subtypes. To evaluate the models, 10-fold cross-validation was performed and the final AUC was obtained by averaging the performance of all the folds. The highest average AUC obtained was 0.64 (0.95 CI: 0.57-0.71), using the feature maps of the last fully connected layer. This indicates the promise of using this approach to predict the breast cancer molecular subtypes. Since the best performance appears in the last fully connected layer, it also implies that breast cancer molecular subtypes may relate to high level image features.
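
    One plausible reading of "max-pooling was performed along each channel" is that each channel of a convolutional feature map is reduced to its maximum activation, giving one feature per channel for the support vector machine. The sketch below assumes that reading and a (channels, height, width) layout; the function name is hypothetical.

```python
import numpy as np


def channelwise_max_pool(feature_map):
    """Reduce a (channels, height, width) feature map to one value per channel."""
    fm = np.asarray(feature_map, dtype=float)
    # Flatten the spatial dimensions, then keep the strongest activation
    # in each channel.
    return fm.reshape(fm.shape[0], -1).max(axis=1)


if __name__ == "__main__":
    # Toy feature map with 2 channels of size 3x4.
    toy = np.arange(24).reshape(2, 3, 4)
    print(channelwise_max_pool(toy))
```

    The resulting fixed-length vectors can be fed to any standard classifier regardless of the spatial size of the original feature map.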

  10. AAPT Diagnostic Criteria for Chronic Cancer Pain Conditions

    OpenAIRE

    Paice, Judith A.; Mulvey, Matt; Bennett, Michael; Dougherty, Patrick M.; Farrar, John T.; Mantyh, Patrick W.; Miaskowski, Christine; Schmidt, Brian; Smith, Thomas J.

    2016-01-01

    Chronic cancer pain is a serious complication of malignancy or its treatment. Currently, no comprehensive, universally accepted cancer pain classification system exists. Clarity in classification of common cancer pain syndromes would improve clinical assessment and management. Moreover, an evidence-based taxonomy would enhance cancer pain research efforts by providing consistent diagnostic criteria, ensuring comparability across clinical trials. As part of a collaborative effort between the A...

  11. ICF-based classification and measurement of functioning.

    Science.gov (United States)

    Stucki, G; Kostanjsek, N; Ustün, B; Cieza, A

    2008-09-01

    If we aim towards a comprehensive understanding of human functioning and the development of comprehensive programs to optimize functioning of individuals and populations, we need to develop suitable measures. The approval of the International Classification of Functioning, Disability and Health (ICF) in 2001 by the 54th World Health Assembly as the first universally shared model and classification of functioning, disability and health therefore marks an important step in the development of measurement instruments and ultimately for our understanding of functioning, disability and health. The acceptance and use of the ICF as a reference framework and classification has been facilitated by its development in a worldwide, comprehensive consensus process and the increasing evidence regarding its validity. However, the broad acceptance and use of the ICF as a reference framework and classification will also depend on the resolution of conceptual and methodological challenges relevant for the classification and measurement of functioning. This paper therefore describes first how the ICF categories can serve as building blocks for the measurement of functioning and then the current state of the development of ICF-based practical tools and international standards such as the ICF Core Sets. Finally, it illustrates how to map the world of measures to the ICF and vice versa, and the methodological principles relevant for the transformation of information obtained with a clinical test or a patient-oriented instrument to the ICF as well as the development of ICF-based clinical and self-reported measurement instruments.

  12. Voice based gender classification using machine learning

    Science.gov (United States)

    Raahul, A.; Sapthagiri, R.; Pankaj, K.; Vijayarajan, V.

    2017-11-01

    Gender identification is one of the major problems in speech analysis today. It involves tracing the gender from acoustic data, i.e., pitch, median, frequency, etc. Machine learning gives promising results for classification problems in all research domains. There are several performance metrics for evaluating the algorithms of an area. We present a comparative model algorithm for evaluating five different machine learning algorithms based on eight different metrics in gender classification from acoustic data. The aim is to identify gender with five different algorithms: Linear Discriminant Analysis (LDA), K-Nearest Neighbour (KNN), Classification and Regression Trees (CART), Random Forest (RF), and Support Vector Machine (SVM), on the basis of eight different metrics. The main parameter in evaluating any algorithm is its performance. The misclassification rate must be low in classification problems, which means that the accuracy rate must be high. Location and gender of the person have become crucial in economic markets in the form of AdSense. With this comparative model algorithm, we try to assess the different ML algorithms and find the best fit for gender classification of acoustic data.

  13. Support vector machine for breast cancer classification using diffusion-weighted MRI histogram features: Preliminary study.

    Science.gov (United States)

    Vidić, Igor; Egnell, Liv; Jerome, Neil P; Teruel, Jose R; Sjøbakk, Torill E; Østlie, Agnes; Fjøsne, Hans E; Bathen, Tone F; Goa, Pål Erik

    2018-05-01

    Diffusion-weighted MRI (DWI) is currently one of the fastest developing MRI-based techniques in oncology. Histogram properties from model fitting of DWI are useful features for differentiation of lesions, and classification can potentially be improved by machine learning. To evaluate classification of malignant and benign tumors and breast cancer subtypes using support vector machine (SVM). Prospective. Fifty-one patients with benign (n = 23) and malignant (n = 28) breast tumors (26 ER+, whereof six were HER2+). Patients were imaged with DW-MRI (3T) using twice refocused spin-echo echo-planar imaging with repetition time / echo time (TR/TE) = 9000/86 msec, 90 × 90 matrix size, 2 × 2 mm in-plane resolution, 2.5 mm slice thickness, and 13 b-values. Apparent diffusion coefficient (ADC), relative enhanced diffusivity (RED), and the intravoxel incoherent motion (IVIM) parameters diffusivity (D), pseudo-diffusivity (D*), and perfusion fraction (f) were calculated. The histogram properties (median, mean, standard deviation, skewness, kurtosis) were used as features in SVM (10-fold cross-validation) for differentiation of lesions and subtyping. Accuracies of the SVM classifications were calculated to find the combination of features with highest prediction accuracy. Mann-Whitney tests were performed for univariate comparisons. For benign versus malignant tumors, univariate analysis found 11 histogram properties to be significant differentiators. Using SVM, the highest accuracy (0.96) was achieved from a single feature (mean of RED), or from three feature combinations of IVIM or ADC. Combining features from all models gave perfect classification. No single feature predicted HER2 status of ER+ tumors (univariate or SVM), although high accuracy (0.90) was achieved with SVM combining several features. Importantly, these features had to include higher-order statistics (kurtosis and skewness), indicating the importance of accounting for heterogeneity. Our
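
    The five histogram properties named in this abstract can be computed from the voxel values of a lesion as in the sketch below. The function name and the choice of population (not sample) moments and non-excess kurtosis are assumptions; the abstract does not state which conventions the authors used.

```python
import numpy as np


def histogram_features(values):
    """Return median, mean, standard deviation, skewness and kurtosis."""
    v = np.asarray(values, dtype=float).ravel()
    mean = v.mean()
    std = v.std()                     # population standard deviation
    z = (v - mean) / std              # standardized values
    return {
        "median": float(np.median(v)),
        "mean": float(mean),
        "std": float(std),
        "skewness": float(np.mean(z ** 3)),
        "kurtosis": float(np.mean(z ** 4)),   # non-excess: 3 for a Gaussian
    }


if __name__ == "__main__":
    # Hypothetical parameter values inside one lesion.
    print(histogram_features([1, 2, 3, 4, 5]))
```

    Skewness and kurtosis are the higher-order statistics the abstract singles out; they describe the shape of the value distribution within a lesion rather than its typical level.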

  14. Classification of Cancer-related Death Certificates using Machine Learning

    Directory of Open Access Journals (Sweden)

    Luke Butt

    2013-05-01

    Full Text Available Background: Cancer monitoring and prevention relies on the timely notification of cancer cases. However, the abstraction and classification of cancer from the free text of pathology reports and other relevant documents, such as death certificates, are complex and time-consuming activities. Aims: In this paper, approaches for the automatic detection of notifiable cancer cases as the cause of death from free-text death certificates supplied to Cancer Registries are investigated. Method: A number of machine learning classifiers were studied. Features were extracted using natural language techniques and the Medtex toolkit. The features encompassed stemmed words, bi-grams, and concepts from the SNOMED CT medical terminology. The baseline consisted of a keyword spotter using keywords extracted from the long description of ICD-10 cancer-related codes. Results: Death certificates with notifiable cancer listed as the cause of death can be effectively identified with the methods studied in this paper. A Support Vector Machine (SVM) classifier achieved the best performance with an overall F-measure of 0.9866 when evaluated on a set of 5,000 free-text death certificates using the token stem feature set. The SNOMED CT concept plus token stem feature set reached the lowest variance (0.0032) and false negative rate (0.0297) while achieving an F-measure of 0.9864. The SVM classifier accounts for the first 18 of the top 40 evaluated runs, and is the most robust classifier with a variance of 0.001141, half the variance of the other classifiers. Conclusion: The selection of features had the greatest influence on the performance of the classifiers, although the type of classifier employed also affects performance. In contrast, the feature weighting schema had a negligible effect on performance. Specifically, it is found that stemmed tokens with or without SNOMED CT concepts create the most effective feature when combined with
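
    The two metrics reported in this abstract, F-measure and false negative rate, follow directly from confusion-matrix counts. The sketch below uses the standard balanced F-measure (F1); the counts in the usage example are made up for illustration and are not taken from the study.

```python
def f_measure(tp, fp, fn):
    """Balanced F-measure (harmonic mean of precision and recall)."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def false_negative_rate(tp, fn):
    """Share of true cancer certificates that the classifier missed."""
    return fn / (tp + fn) if (tp + fn) else 0.0


if __name__ == "__main__":
    # Hypothetical counts: 90 hits, 10 false alarms, 10 misses.
    print(f_measure(90, 10, 10), false_negative_rate(90, 10))
```

    For cancer notification the false negative rate matters in its own right, since a missed certificate is a cancer death that never reaches the registry.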

  15. Cluster Validity Classification Approaches Based on Geometric Probability and Application in the Classification of Remotely Sensed Images

    Directory of Open Access Journals (Sweden)

    LI Jian-Wei

    2014-08-01

    Full Text Available On the basis of the cluster validity function based on geometric probability in the literature [1, 2], we propose a cluster analysis method based on geometric probability to process large amounts of data in a rectangular area. The basic idea is top-down stepwise refinement: first categories, then subcategories. At every clustering level, the cluster validity function based on geometric probability is applied first to determine the clusters and the gathering direction; the cluster centers and cluster borders are then determined. Using TM remote sensing image classification examples, the method is compared with the supervised and unsupervised classification in ERDAS and with the cluster analysis method based on geometric probability in a two-dimensional square proposed in literature [2]. Results show that the proposed method can significantly improve the classification accuracy.

  16. Validation of the prognostic gene portfolio, ClinicoMolecular Triad Classification, using an independent prospective breast cancer cohort and external patient populations.

    Science.gov (United States)

    Wang, Dong-Yu; Done, Susan J; Mc Cready, David R; Leong, Wey L

    2014-07-04

    Using genome-wide expression profiles of a prospective training cohort of breast cancer patients, ClinicoMolecular Triad Classification (CMTC) was recently developed to classify breast cancers into three clinically relevant groups to aid treatment decisions. CMTC was found to be both prognostic and predictive in a large external breast cancer cohort in that study. This study serves to validate the reproducibility of CMTC and its prognostic value using independent patient cohorts. An independent internal cohort (n = 284) and a new external cohort (n = 2,181) were used to validate the association of CMTC between clinicopathological factors, 12 known gene signatures, two molecular subtype classifiers, and 19 oncogenic signalling pathway activities, and to reproduce the abilities of CMTC to predict clinical outcomes of breast cancer. In addition, we also updated the outcome data of the original training cohort (n = 147). The original training cohort reached a statistically significant difference (p value of the triad classification was reproduced in the second independent internal cohort and the new external validation cohort. CMTC achieved even higher prognostic significance when all available patients were analyzed (n = 4,851). Oncogenic pathways Myc, E2F1, Ras and β-catenin were again implicated in the high-risk groups. Both prospective internal cohorts and the independent external cohorts reproduced the triad classification of CMTC and its prognostic significance. CMTC is an independent prognostic predictor, and it outperformed 12 other known prognostic gene signatures, molecular subtype classifications, and all other standard prognostic clinicopathological factors. Our results support further development of CMTC portfolio into a guide for personalized breast cancer treatments.

  17. Prediction of Depression in Cancer Patients With Different Classification Criteria, Linear Discriminant Analysis versus Logistic Regression.

    Science.gov (United States)

    Shayan, Zahra; Mohammad Gholi Mezerji, Naser; Shayan, Leila; Naseri, Parisa

    2015-11-03

    Logistic regression (LR) and linear discriminant analysis (LDA) are two popular statistical models for prediction of group membership. Although they are very similar, the LDA makes more assumptions about the data. When categorical and continuous variables are used simultaneously, the optimal choice between the two models is questionable. In most studies, classification error (CE) is used to discriminate between subjects in several groups, but this index is not suitable to predict the accuracy of the outcome. The present study compared LR and LDA models using classification indices. This cross-sectional study selected 243 cancer patients. Sample sets of different sizes (n = 50, 100, 150, 200, 220) were randomly selected and the CE, B, and Q classification indices were calculated by the LR and LDA models. CE revealed a lack of superiority for one model over the other, but the results showed that LR performed better than LDA for the B and Q indices in all situations. No significant effect for sample size on CE was noted for selection of an optimal model. Assessment of the accuracy of prediction of real data indicated that the B and Q indices are appropriate for selection of an optimal model. The results of this study showed that LR performs better in some cases and LDA in others when based on CE. The CE index is not appropriate for classification, although the B and Q indices performed better and offered more efficient criteria for comparison and discrimination between groups.

  18. Simultaneous detection and classification of breast masses in digital mammograms via a deep learning YOLO-based CAD system.

    Science.gov (United States)

    Al-Masni, Mohammed A; Al-Antari, Mugahed A; Park, Jeong-Min; Gi, Geon; Kim, Tae-Yeon; Rivera, Patricio; Valarezo, Edwin; Choi, Mun-Taek; Han, Seung-Moo; Kim, Tae-Seong

    2018-04-01

    Automatic detection and classification of masses in mammograms remain a major challenge and play a crucial role in assisting radiologists with accurate diagnosis. In this paper, we propose a novel Computer-Aided Diagnosis (CAD) system based on one of the regional deep learning techniques, a ROI-based Convolutional Neural Network (CNN) called You Only Look Once (YOLO). Although most previous studies only deal with classification of masses, our proposed YOLO-based CAD system can handle detection and classification simultaneously in one framework. The proposed CAD system contains four main stages: preprocessing of mammograms, feature extraction utilizing deep convolutional networks, mass detection with confidence, and finally mass classification using Fully Connected Neural Networks (FC-NNs). In this study, we utilized 600 original mammograms from the Digital Database for Screening Mammography (DDSM) and their 2,400 augmented mammograms, with the information of the masses and their types, in training and testing our CAD. The trained YOLO-based CAD system detects the masses and then classifies their types as benign or malignant. Our results with five-fold cross-validation tests show that the proposed CAD system detects the mass location with an overall accuracy of 99.7%. The system also distinguishes between benign and malignant lesions with an overall accuracy of 97%. Our proposed system even works on some challenging breast cancer cases where the masses exist over the pectoral muscles or dense regions. Copyright © 2018 Elsevier B.V. All rights reserved.

  19. Network-Based Logistic Classification with an Enhanced L1/2 Solver Reveals Biomarker and Subnetwork Signatures for Diagnosing Lung Cancer

    Directory of Open Access Journals (Sweden)

    Hai-Hui Huang

    2015-01-01

    Full Text Available Identifying biomarkers and signaling pathways is a critical step in genomic studies, in which the regularization method is a widely used feature extraction approach. However, most regularizers are based on the L1-norm; their results are not good enough in terms of sparsity and interpretation, and they are asymptotically biased, especially in genomic research. Recently, a large amount of molecular interaction information about disease-related biological processes has been gained and gathered in various databases, which focus on many aspects of biological systems. In this paper, we use an enhanced L1/2 penalized solver to penalize a network-constrained logistic regression model, called an enhanced L1/2 net, where the predictors are based on gene-expression data with biologic network knowledge. Extensive simulation studies showed that our proposed approach outperforms L1 regularization, the old L1/2 penalized solver, and the Elastic net approaches in terms of classification accuracy and stability. Furthermore, we applied our method to lung cancer data analysis and found that it achieves higher predictive accuracy than L1 regularization, the old L1/2 penalized solver, and the Elastic net approaches, while selecting fewer but informative biomarkers and pathways.

  20. Vision-Based Perception and Classification of Mosquitoes Using Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Masataka Fuchida

    2017-01-01

    Full Text Available The need for a novel automated mosquito perception and classification method has become increasingly pressing in recent years, with a steeply increasing number of mosquito-borne diseases and associated casualties. There exist remote sensing and GIS-based methods for mapping potential mosquito habitats and locations that are prone to mosquito-borne diseases, but these methods generally do not account for species-wise identification of mosquitoes in closed-perimeter regions. Traditional methods for mosquito classification involve highly manual processes requiring tedious sample collection and supervised laboratory analysis. In this research work, we present the design and experimental validation of an automated vision-based mosquito classification module that can be deployed in closed-perimeter mosquito habitats. The module is capable of distinguishing mosquitoes from other bugs such as bees and flies by extracting morphological features, followed by support vector machine-based classification. In addition, this paper presents the results of three variants of the support vector machine classifier in the context of the mosquito classification problem. This vision-based approach to the mosquito classification problem presents an efficient alternative to the conventional methods for mosquito surveillance, mapping and sample image collection. Experimental results involving classification between mosquitoes and a predefined set of other bugs using multiple classification strategies demonstrate the efficacy and validity of the proposed approach with a maximum recall of 98%.

  1. Novel personalized pathway-based metabolomics models reveal key metabolic pathways for breast cancer diagnosis

    DEFF Research Database (Denmark)

    Huang, Sijia; Chong, Nicole; Lewis, Nathan

    2016-01-01

    diagnosis. We applied this method to predict breast cancer occurrence, in combination with correlation feature selection (CFS) and classification methods. Results: The resulting all-stage and early-stage diagnosis models are highly accurate in two sets of testing blood samples, with average AUCs (Area Under.......993. Moreover, important metabolic pathways, such as taurine and hypotaurine metabolism and the alanine, aspartate, and glutamate pathway, are revealed as critical biological pathways for early diagnosis of breast cancer. Conclusions: We have successfully developed a new type of pathway-based model to study...... metabolomics data for disease diagnosis. Applying this method to blood-based breast cancer metabolomics data, we have discovered crucial metabolic pathway signatures for breast cancer diagnosis, especially early diagnosis. Further, this modeling approach may be generalized to other omics data types for disease...

  2. Biobank classification in an Australian setting.

    Science.gov (United States)

    Rush, Amanda; Christiansen, Jeffrey H; Farrell, Jake P; Goode, Susan M; Scott, Rodney J; Spring, Kevin J; Byrne, Jennifer A

    2015-06-01

    In 2011, Watson and Barnes proposed a schema for classifying biobanks into 3 groups (mono-, oligo-, and poly-user), primarily based upon biospecimen access policies. We used results from a recent comprehensive survey of cancer biobanks in New South Wales, Australia to assess the applicability of this biobank classification schema in an Australian setting. Cancer biobanks were identified using publicly available data, and by consulting with research managers. A comprehensive survey was developed and administered face to face. Data were analyzed using Microsoft Excel™ 2010 and IBM SPSS Statistics™ version 21.0. The cancer biobank cohort (n=23) represented 5 mono-user biobanks, 7 oligo-user biobanks, and 11 poly-user biobanks, and was analyzed as two groups (mono-/oligo- versus poly-user biobanks). Poly-user biobanks employed significantly more full-time equivalent staff, and were significantly more likely to have a website, share staff between biobanks, access governance support, utilize quality control measures, be aware of biobanking best practice documents, and offer staff training. Mono-/oligo-user biobanks were significantly more likely to seek advice from other biobanks. Our results further delineate a biobank classification system that is primarily based on access policy, and demonstrate its relevance in an Australian setting.

  3. Classification of research reactors and discussion of thinking of safety regulation based on the classification

    International Nuclear Information System (INIS)

    Song Chenxiu; Zhu Lixin

    2013-01-01

    Research reactors differ in reactor type, use, power level, design principle, operation model, safety performance, and so on, and they also differ significantly with respect to nuclear safety regulation. This paper introduces a classification of research reactors and discusses an approach to safety regulation based on that classification. (authors)

  4. Radar Target Classification using Recursive Knowledge-Based Methods

    DEFF Research Database (Denmark)

    Jochumsen, Lars Wurtz

    The topic of this thesis is target classification of radar tracks from a 2D mechanically scanning coastal surveillance radar. The measurements provided by the radar are position data and therefore the classification is mainly based on kinematic data, which is deduced from the position. The target...... been terminated. Therefore, an update of the classification results must be made for each measurement of the target. The data for this work are collected throughout the PhD and are both collected from radars and other sensors such as GPS....

  5. Energy-efficiency based classification of the manufacturing workstation

    Science.gov (United States)

    Frumuşanu, G.; Afteni, C.; Badea, N.; Epureanu, A.

    2017-08-01

    EU Directive 92/75/EC established for the first time an energy consumption labelling scheme, further implemented by several other directives. As a consequence, many products (e.g. home appliances, tyres, light bulbs, houses) now carry an EU Energy Label when offered for sale or rent. Several energy consumption models of manufacturing equipment have also been developed. This paper proposes an energy-efficiency-based classification of the manufacturing workstation, aiming to characterize its energetic behaviour. The concept of energy efficiency of the manufacturing workstation is defined. On this basis, a classification methodology has been developed. It covers specific criteria and their evaluation modalities, together with the definition and delimitation of energy efficiency classes. The position of the energy class is defined by the amount of energy needed by the workstation at the middle point of its operating domain, while its extension is determined by the value of the first coefficient of the Taylor series that approximates the dependence between the energy consumption and the chosen parameter of the working regime. The main domain of interest for this classification appears to be the optimization of manufacturing activity planning and programming. A case study on the classification of an actual lathe from the energy efficiency point of view, based on two different approaches (analytical and numerical), is also included.

  6. OC-2-KB: A software pipeline to build an evidence-based obesity and cancer knowledge base.

    Science.gov (United States)

    Lossio-Ventura, Juan Antonio; Hogan, William; Modave, François; Guo, Yi; He, Zhe; Hicks, Amanda; Bian, Jiang

    2017-11-01

    Obesity has been linked to several types of cancer. Access to adequate health information activates people's participation in managing their own health, which ultimately improves their health outcomes. Nevertheless, the existing online information about the relationship between obesity and cancer is heterogeneous and poorly organized. A formal knowledge representation can help better organize and deliver quality health information. Currently, there are several efforts in the biomedical domain to convert unstructured data to structured data and store them in Semantic Web knowledge bases (KB). In this demo paper, we present OC-2-KB (Obesity and Cancer to Knowledge Base), a system tailored to guide automatic KB construction for managing obesity and cancer knowledge from free-text scientific literature (i.e., PubMed abstracts) in a systematic way. OC-2-KB has two important modules, which perform the acquisition of entities and the extraction and then classification of relationships among these entities. We tested the OC-2-KB system on a data set of 23 manually annotated obesity and cancer PubMed abstracts and created a preliminary KB with 765 triples. We conducted a preliminary evaluation on this sample of triples and reported our evaluation results.

  7. International Association for the Study of Lung Cancer/American Thoracic Society/European Respiratory Society international multidisciplinary classification of lung adenocarcinoma.

    Science.gov (United States)

    Travis, William D; Brambilla, Elisabeth; Noguchi, Masayuki; Nicholson, Andrew G; Geisinger, Kim R; Yatabe, Yasushi; Beer, David G; Powell, Charles A; Riely, Gregory J; Van Schil, Paul E; Garg, Kavita; Austin, John H M; Asamura, Hisao; Rusch, Valerie W; Hirsch, Fred R; Scagliotti, Giorgio; Mitsudomi, Tetsuya; Huber, Rudolf M; Ishikawa, Yuichi; Jett, James; Sanchez-Cespedes, Montserrat; Sculier, Jean-Paul; Takahashi, Takashi; Tsuboi, Masahiro; Vansteenkiste, Johan; Wistuba, Ignacio; Yang, Pan-Chyr; Aberle, Denise; Brambilla, Christian; Flieder, Douglas; Franklin, Wilbur; Gazdar, Adi; Gould, Michael; Hasleton, Philip; Henderson, Douglas; Johnson, Bruce; Johnson, David; Kerr, Keith; Kuriyama, Keiko; Lee, Jin Soo; Miller, Vincent A; Petersen, Iver; Roggli, Victor; Rosell, Rafael; Saijo, Nagahiro; Thunnissen, Erik; Tsao, Ming; Yankelewitz, David

    2011-02-01

    % disease-specific survival, respectively. AIS and MIA are usually nonmucinous but rarely may be mucinous. Invasive adenocarcinomas are classified by predominant pattern after using comprehensive histologic subtyping with lepidic (formerly most mixed subtype tumors with nonmucinous BAC), acinar, papillary, and solid patterns; micropapillary is added as a new histologic subtype. Variants include invasive mucinous adenocarcinoma (formerly mucinous BAC), colloid, fetal, and enteric adenocarcinoma. This classification provides guidance for small biopsies and cytology specimens, as approximately 70% of lung cancers are diagnosed in such samples. Non-small cell lung carcinomas (NSCLCs), in patients with advanced-stage disease, are to be classified into more specific types such as adenocarcinoma or squamous cell carcinoma, whenever possible for several reasons: (1) adenocarcinoma or NSCLC not otherwise specified should be tested for epidermal growth factor receptor (EGFR) mutations as the presence of these mutations is predictive of responsiveness to EGFR tyrosine kinase inhibitors, (2) adenocarcinoma histology is a strong predictor for improved outcome with pemetrexed therapy compared with squamous cell carcinoma, and (3) potential life-threatening hemorrhage may occur in patients with squamous cell carcinoma who receive bevacizumab. If the tumor cannot be classified based on light microscopy alone, special studies such as immunohistochemistry and/or mucin stains should be applied to classify the tumor further. Use of the term NSCLC not otherwise specified should be minimized. This new classification strategy is based on a multidisciplinary approach to diagnosis of lung adenocarcinoma that incorporates clinical, molecular, radiologic, and surgical issues, but it is primarily based on histology. This classification is intended to support clinical practice, and research investigation and clinical trials. 
As EGFR mutation is a validated predictive marker for response and progression

  8. Implementation of several mathematical algorithms to breast tissue density classification

    International Nuclear Information System (INIS)

    Quintana, C.; Redondo, M.; Tirao, G.

    2014-01-01

    The accuracy of mammographic abnormality detection methods is strongly dependent on breast tissue characteristics: dense breast tissue can hide lesions, causing cancer to be detected at later stages. In addition, breast tissue density is widely accepted to be an important risk indicator for the development of breast cancer. This paper presents the implementation and the performance of different mathematical algorithms designed to standardize the categorization of mammographic images according to the American College of Radiology classifications. These mathematical techniques use intrinsic-property calculations and comparison with an ideal homogeneous image (joint entropy, mutual information, normalized cross correlation and index Q) as categorization parameters. The algorithms were evaluated on 100 cases from the mammographic data sets provided by the Ministerio de Salud de la Provincia de Córdoba, Argentina—Programa de Prevención del Cáncer de Mama (Department of Public Health, Córdoba, Argentina, Breast Cancer Prevention Program). The resulting breast classifications were compared with expert medical diagnoses and showed good performance. The implemented algorithms showed high potential for classifying breasts into tissue density categories. - Highlights: • Breast density classification can be obtained by suitable mathematical algorithms. • Mathematical processing helps radiologists to obtain the BI-RADS classification. • The entropy and joint entropy show high performance for density classification
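
    The categorization parameters named in this abstract can be illustrated with a minimal sketch. It assumes images are flattened lists of discrete gray levels; the function names and the toy pixel values are hypothetical, and the index Q and the way the authors build the ideal homogeneous reference are not described here, so they are left out.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete gray levels."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def joint_entropy(a, b):
    """Shannon entropy (in bits) of the pixel-wise pairs of two equal-size images."""
    return entropy(list(zip(a, b)))


def mutual_information(a, b):
    """I(A;B) = H(A) + H(B) - H(A,B)."""
    return entropy(a) + entropy(b) - joint_entropy(a, b)


def normalized_cross_correlation(a, b):
    """Zero-mean normalized cross correlation; 0.0 if either image is constant."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b))
    return num / den if den else 0.0


# Hypothetical toy data: a 4-pixel "image" and a constant (homogeneous) reference.
image = [0, 0, 1, 1]
homogeneous = [5, 5, 5, 5]
print(entropy(image), joint_entropy(image, homogeneous), mutual_information(image, image))
```

    A constant reference carries no information, so its entropy is zero and the joint entropy with it reduces to the entropy of the image itself.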

  9. TENSOR MODELING BASED FOR AIRBORNE LiDAR DATA CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    N. Li

    2016-06-01

    Full Text Available Feature selection and description is a key factor in classification of Earth observation data. In this paper a classification method based on tensor decomposition is proposed. First, multiple features are extracted from the raw LiDAR point cloud, and raster LiDAR images are derived by accumulating features or the “raw” data attributes. Then, the feature rasters of LiDAR data are stored as a tensor, and tensor decomposition is used to select component features. This tensor representation can preserve the initial spatial structure and ensure that the neighborhood is taken into account. Finally, a k-nearest-neighbor classification is applied to a small number of component features.
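
    The final step of this pipeline, k-nearest-neighbor classification on a few component features, can be sketched as follows. The tensor decomposition itself is not reproduced; the feature vectors, class names and the function `knn_predict` are hypothetical stand-ins for the component features the abstract describes.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Return the majority label among the k training samples closest to `query`."""
    # Sort training samples by Euclidean distance to the query point.
    by_distance = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-component features for four labelled raster cells.
train_features = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1)]
train_labels = ["ground", "ground", "building", "building"]
print(knn_predict(train_features, train_labels, (0.2, 0.1), k=3))
```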

  10. Numeric pathologic lymph node classification shows prognostic superiority to topographic pN classification in esophageal squamous cell carcinoma.

    Science.gov (United States)

    Sugawara, Kotaro; Yamashita, Hiroharu; Uemura, Yukari; Mitsui, Takashi; Yagi, Koichi; Nishida, Masato; Aikou, Susumu; Mori, Kazuhiko; Nomura, Sachiyo; Seto, Yasuyuki

    2017-10-01

    The current eighth tumor node metastasis lymph node category pathologic lymph node staging system for esophageal squamous cell carcinoma is based solely on the number of metastatic nodes and does not consider anatomic distribution. We aimed to assess the prognostic capability of the eighth tumor node metastasis pathologic lymph node staging system (numeric-based) compared with the 11th Japan Esophageal Society (topography-based) pathologic lymph node staging system in patients with esophageal squamous cell carcinoma. We retrospectively reviewed the clinical records of 289 patients with esophageal squamous cell carcinoma who underwent esophagectomy with extended lymph node dissection during the period from January 2006 through June 2016. We compared discrimination abilities for overall survival, recurrence-free survival, and cancer-specific survival between these 2 staging systems using C-statistics. The median number of dissected and metastatic nodes was 61 (25% to 75% quartile range, 45 to 79) and 1 (25% to 75% quartile range, 0 to 3), respectively. The eighth tumor node metastasis pathologic lymph node staging system had a greater ability to accurately determine overall survival (C-statistics: tumor node metastasis classification, 0.69, 95% confidence interval, 0.62-0.76; Japan Esophageal Society classification, 0.65, 95% confidence interval, 0.58-0.71; P = .014) and cancer-specific survival (C-statistics: tumor node metastasis classification, 0.78, 95% confidence interval, 0.70-0.87; Japan Esophageal Society classification, 0.72, 95% confidence interval, 0.64-0.80; P = .018). Rates of total recurrence rose as the eighth tumor node metastasis pathologic lymph node stage increased, while stratification of patients according to the topography-based node classification system was not feasible. Numeric nodal staging is an essential tool for stratifying the oncologic outcomes of patients with esophageal squamous cell carcinoma even in the cohort in which adequate
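
    The C-statistic used to compare the two staging systems can be illustrated with a minimal sketch of Harrell's concordance index. This is one common definition and is assumed here; the abstract does not say which estimator the authors used, and the follow-up times, event flags and stage values below are invented for illustration.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: fraction of comparable pairs in which the higher-risk patient fails first.

    A pair (i, j) is comparable when patient i has an observed event
    and a shorter follow-up time than patient j. Tied risk scores count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical cohort: months of follow-up, event flag (1 = died), nodal stage as risk score.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
stage = [3, 2, 1, 0]
print(concordance_index(times, events, stage))
```

    A value of 0.5 corresponds to a staging system that orders patients no better than chance, which is why the reported differences between 0.65 and 0.69 are read as differences in discrimination.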

  11. Current Trends in the Molecular Classification of Renal Neoplasms

    Directory of Open Access Journals (Sweden)

    Andrew N. Young

    2006-01-01

    Full Text Available Renal cell carcinoma (RCC) is the most common form of kidney cancer in adults. RCC is a significant challenge for pathologic diagnosis and clinical management. The primary approach to diagnosis is by light microscopy, using the World Health Organization (WHO) classification system, which defines histopathologic tumor subtypes with distinct clinical behavior and underlying genetic mutations. However, light microscopic diagnosis of RCC subtypes is often difficult due to variable histology. In addition, the clinical behavior of RCC is highly variable and therapeutic response rates are poor. Few clinical assays are available to predict outcome in RCC or correlate behavior with histology. Therefore, novel RCC classification systems based on gene expression should be useful for diagnosis, prognosis, and treatment. Recent microarray studies have shown that renal tumors are characterized by distinct gene expression profiles, which can be used to discover novel diagnostic and prognostic biomarkers. Here, we review clinical features of kidney cancer, the WHO classification system, and the growing role of molecular classification for diagnosis, prognosis, and therapy of this disease.

  12. Waste-acceptance criteria and risk-based thinking for radioactive-waste classification

    International Nuclear Information System (INIS)

    Lowenthal, M.D.

    1998-01-01

    The US system of radioactive-waste classification and its development provide a reference point for the discussion of risk-based thinking in waste classification. The official US system is described and waste-acceptance criteria for disposal sites are introduced because they constitute a form of de facto waste classification. Risk-based classification is explored and it is found that a truly risk-based system is context-dependent: risk depends not only on the waste-management activity but, for some activities such as disposal, it depends on the specific physical context. Some of the elements of the official US system incorporate risk-based thinking, but like many proposed alternative schemes, the physical context of disposal is ignored. The waste-acceptance criteria for disposal sites do account for this context dependence and could be used as a risk-based classification scheme for disposal. While different classes would be necessary for different management activities, the waste-acceptance criteria would obviate the need for the current system and could better match wastes to disposal environments saving money or improving safety or both

  13. Research on Classification of Chinese Text Data Based on SVM

    Science.gov (United States)

    Lin, Yuan; Yu, Hongzhi; Wan, Fucheng; Xu, Tao

    2017-09-01

    Data mining has important application value in today’s industry and academia, and text classification is one of its key technologies. Many mature algorithms for text classification exist: KNN, NB, AB, SVM, decision trees and other classification methods all show good classification performance. The support vector machine (SVM) is a well-regarded classifier in machine learning research. This paper studies the classification performance of the SVM method on Chinese text data, applies the SVM to classify Chinese text, and aims to combine academic research with practical application.

  14. Iris Image Classification Based on Hierarchical Visual Codebook.

    Science.gov (United States)

    Zhenan Sun; Hui Zhang; Tieniu Tan; Jianyu Wang

    2014-06-01

    Iris recognition as a reliable method for personal identification has been well-studied with the objective to assign the class label of each iris image to a unique subject. In contrast, iris image classification aims to classify an iris image to an application specific category, e.g., iris liveness detection (classification of genuine and fake iris images), race classification (e.g., classification of iris images of Asian and non-Asian subjects), coarse-to-fine iris identification (classification of all iris images in the central database into multiple categories). This paper proposes a general framework for iris image classification based on texture analysis. A novel texture pattern representation method called Hierarchical Visual Codebook (HVC) is proposed to encode the texture primitives of iris images. The proposed HVC method is an integration of two existing Bag-of-Words models, namely Vocabulary Tree (VT), and Locality-constrained Linear Coding (LLC). The HVC adopts a coarse-to-fine visual coding strategy and takes advantages of both VT and LLC for accurate and sparse representation of iris texture. Extensive experimental results demonstrate that the proposed iris image classification method achieves state-of-the-art performance for iris liveness detection, race classification, and coarse-to-fine iris identification. A comprehensive fake iris image database simulating four types of iris spoof attacks is developed as the benchmark for research of iris liveness detection.

  15. Improving the Computational Performance of Ontology-Based Classification Using Graph Databases

    Directory of Open Access Journals (Sweden)

    Thomas J. Lampoltshammer

    2015-07-01

    Full Text Available The increasing availability of very high-resolution remote sensing imagery (i.e., from satellites, airborne laser scanning, or aerial photography) represents both a blessing and a curse for researchers. The manual classification of these images, or other similar geo-sensor data, is time-consuming and leads to subjective and non-deterministic results. Due to this fact, (semi-)automated classification approaches are in high demand in affected research areas. Ontologies provide a proper way of automated classification for various kinds of sensor data, including remotely sensed data. However, the processing of data entities—so-called individuals—is one of the most cost-intensive computational operations within ontology reasoning. Therefore, an approach based on graph databases is proposed to overcome the issue of a high time consumption regarding the classification task. The introduced approach shifts the classification task from the classical Protégé environment and its common reasoners to the proposed graph-based approaches. For the validation, the authors tested the approach on a simulation scenario based on a real-world example. The results demonstrate a quite promising improvement of classification speed—up to 80,000 times faster than the Protégé-based approach.

  16. A multifactorial likelihood model for MMR gene variant classification incorporating probabilities based on sequence bioinformatics and tumor characteristics: a report from the Colon Cancer Family Registry.

    Science.gov (United States)

    Thompson, Bryony A; Goldgar, David E; Paterson, Carol; Clendenning, Mark; Walters, Rhiannon; Arnold, Sven; Parsons, Michael T; Walsh, Michael D; Gallinger, Steven; Haile, Robert W; Hopper, John L; Jenkins, Mark A; Lemarchand, Loic; Lindor, Noralane M; Newcomb, Polly A; Thibodeau, Stephen N; Young, Joanne P; Buchanan, Daniel D; Tavtigian, Sean V; Spurdle, Amanda B

    2013-01-01

    Mismatch repair (MMR) gene sequence variants of uncertain clinical significance are often identified in suspected Lynch syndrome families, and this constitutes a challenge for both researchers and clinicians. Multifactorial likelihood model approaches provide a quantitative measure of MMR variant pathogenicity, but first require input of likelihood ratios (LRs) for different MMR variation-associated characteristics from appropriate, well-characterized reference datasets. Microsatellite instability (MSI) and somatic BRAF tumor data for unselected colorectal cancer probands of known pathogenic variant status were used to derive LRs for tumor characteristics using the Colon Cancer Family Registry (CFR) resource. These tumor LRs were combined with variant segregation within families, and estimates of prior probability of pathogenicity based on sequence conservation and position, to analyze 44 unclassified variants identified initially in Australasian Colon CFR families. In addition, in vitro splicing analyses were conducted on the subset of variants based on bioinformatic splicing predictions. The LR in favor of pathogenicity was estimated to be ~12-fold for a colorectal tumor with a BRAF mutation-negative MSI-H phenotype. For 31 of the 44 variants, the posterior probabilities of pathogenicity were such that altered clinical management would be indicated. Our findings provide a working multifactorial likelihood model for classification that carefully considers mode of ascertainment for gene testing. © 2012 Wiley Periodicals, Inc.
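
    The arithmetic behind a multifactorial likelihood model can be sketched with Bayes' rule in odds form: posterior odds equal prior odds multiplied by the likelihood ratios of each line of evidence. The sketch assumes the evidence sources are independent; the prior of 0.5 and the function name are hypothetical, while the LR of about 12 for a BRAF mutation-negative MSI-H tumor comes from the abstract.

```python
import math


def posterior_probability(prior, likelihood_ratios):
    """Combine a prior probability of pathogenicity with independent likelihood ratios."""
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * math.prod(likelihood_ratios)
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical variant with an uninformative prior and one tumor-phenotype LR of 12.
print(posterior_probability(0.5, [12.0]))
```

    Working in odds keeps each additional evidence source a single multiplication, so segregation or splicing LRs can be appended to the list without changing the function.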

  17. Validation of the prognostic gene portfolio, ClinicoMolecular Triad Classification, using an independent prospective breast cancer cohort and external patient populations

    Science.gov (United States)

    2014-01-01

    Introduction Using genome-wide expression profiles of a prospective training cohort of breast cancer patients, ClinicoMolecular Triad Classification (CMTC) was recently developed to classify breast cancers into three clinically relevant groups to aid treatment decisions. CMTC was found to be both prognostic and predictive in a large external breast cancer cohort in that study. This study serves to validate the reproducibility of CMTC and its prognostic value using independent patient cohorts. Methods An independent internal cohort (n = 284) and a new external cohort (n = 2,181) were used to validate the association of CMTC between clinicopathological factors, 12 known gene signatures, two molecular subtype classifiers, and 19 oncogenic signalling pathway activities, and to reproduce the abilities of CMTC to predict clinical outcomes of breast cancer. In addition, we also updated the outcome data of the original training cohort (n = 147). Results The original training cohort reached a statistically significant difference (p risk groups. Conclusions Both prospective internal cohorts and the independent external cohorts reproduced the triad classification of CMTC and its prognostic significance. CMTC is an independent prognostic predictor, and it outperformed 12 other known prognostic gene signatures, molecular subtype classifications, and all other standard prognostic clinicopathological factors. Our results support further development of CMTC portfolio into a guide for personalized breast cancer treatments. PMID:24996446

  18. Hot complaint intelligent classification based on text mining

    Directory of Open Access Journals (Sweden)

    XIA Haifeng

    2013-10-01

    Full Text Available The complaint recognizer system plays an important role in ensuring the correct classification of hot complaints and improving the service quality of the telecommunications industry. Customer complaints in the telecommunications industry have the particular constraint that they must be handled within a limited time, which causes errors in the classification of hot complaints. The paper presents a model for intelligent hot complaint classification based on text mining, which can classify a hot complaint into the correct level of the complaint navigation. The examples show that the model classifies complaint text efficiently.

  19. Gene selection for cancer classification with the help of bees.

    Science.gov (United States)

    Moosa, Johra Muhammad; Shakur, Rameen; Kaykobad, Mohammad; Rahman, Mohammad Sohel

    2016-08-10

    Development of biologically relevant models from gene expression data, notably microarray data, has become a topic of great interest in the field of bioinformatics and clinical genetics and oncology. Only a small number of gene expression data compared to the total number of genes explored possess a significant correlation with a certain phenotype. Gene selection enables researchers to obtain substantial insight into the genetic nature of the disease and the mechanisms responsible for it. Besides improving the performance of cancer classification, it can also cut down the time and cost of medical diagnoses. This study presents a modified Artificial Bee Colony Algorithm (ABC) to select the minimum number of genes that are deemed to be significant for cancer while improving predictive accuracy. The search equation of ABC is believed to be good at exploration but poor at exploitation. To overcome this limitation we have modified the ABC algorithm by incorporating the concept of pheromones, which is one of the major components of the Ant Colony Optimization (ACO) algorithm, and a new operation in which successive bees communicate to share their findings. The proposed algorithm is evaluated using a suite of ten publicly available datasets after the parameters are tuned scientifically with one of the datasets. The results are compared to other works that used the same datasets, and the proposed method is shown to be superior. The method presented in this paper can provide a subset of genes leading to more accurate classification results while the number of selected genes is smaller. Additionally, the proposed modified Artificial Bee Colony Algorithm could conceivably be applied to problems in other areas as well.

  20. Distinct clinical outcomes of two CIMP-positive colorectal cancer subtypes based on a revised CIMP classification system.

    Science.gov (United States)

    Bae, Jeong Mo; Kim, Jung Ho; Kwak, Yoonjin; Lee, Dae-Won; Cha, Yongjun; Wen, Xianyu; Lee, Tae Hun; Cho, Nam-Yun; Jeong, Seung-Yong; Park, Kyu Joo; Han, Sae Won; Lee, Hye Seung; Kim, Tae-You; Kang, Gyeong Hoon

    2017-04-11

    Colorectal cancer (CRC) is a heterogeneous disease in terms of molecular carcinogenic pathways. Based on recent findings regarding the multiple serrated neoplasia pathway, we revised an eight-marker panel for a new CIMP classification system. 1370 patients who received surgical resection for CRCs were classified into three CIMP subtypes (CIMP-N: 0-4 methylated markers, CIMP-P1: 5-6 methylated markers and CIMP-P2: 7-8 methylated markers). Our findings were validated in a separate set of high-risk stage II or stage III CRCs receiving adjuvant fluoropyrimidine plus oxaliplatin (n=950). A total of 1287/62/21 CRCs cases were classified as CIMP-N/CIMP-P1/CIMP-P2, respectively. CIMP-N showed male predominance, distal location, lower T, N category and devoid of BRAF mutation, microsatellite instability (MSI) and MLH1 methylation. CIMP-P1 showed female predominance, proximal location, advanced TNM stage, mild decrease of CK20 and CDX2 expression, mild increase of CK7 expression, BRAF mutation, MSI and MLH1 methylation. CIMP-P2 showed older age, female predominance, proximal location, advanced T category, markedly reduced CK20 and CDX2 expression, rare KRAS mutation, high frequency of CK7 expression, BRAF mutation, MSI and MLH1 methylation. CIMP-N showed better 5-year cancer-specific survival (CSS; HR=0.47; 95% CI: 0.28-0.78) in discovery set and better 5-year relapse-free survival (RFS; HR=0.50; 95% CI: 0.29-0.88) in validation set compared with CIMP-P1. CIMP-P2 showed marginally better 5-year CSS (HR=0.28, 95% CI: 0.07-1.22) in discovery set and marginally better 5-year RFS (HR=0.21, 95% CI: 0.05-0.92) in validation set compared with CIMP-P1. CIMP subtypes classified using our revised system showed different clinical outcomes, demonstrating the heterogeneity of multiple serrated precursors of CIMP-positive CRCs.
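
    The revised three-way CIMP assignment stated in this abstract is a threshold rule on the number of methylated markers in the eight-marker panel, which a short sketch makes explicit. The cut-offs (0-4, 5-6, 7-8) are taken from the abstract; the function name and input checking are illustrative assumptions.

```python
def cimp_subtype(n_methylated):
    """Map the number of methylated markers (0-8) to the CIMP subtype from the abstract."""
    if not 0 <= n_methylated <= 8:
        raise ValueError("the panel has eight markers, so the count must be 0-8")
    if n_methylated <= 4:
        return "CIMP-N"
    if n_methylated <= 6:
        return "CIMP-P1"
    return "CIMP-P2"


# Classify every possible marker count.
for count in range(9):
    print(count, cimp_subtype(count))
```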

  1. A proposed data base system for detection, classification and ...

    African Journals Online (AJOL)

    A proposed data base system for detection, classification and location of fault on electricity company of Ghana electrical distribution system. Isaac Owusu-Nyarko, Mensah-Ananoo Eugine. Keywords: database, classification of fault, power, distribution system, SCADA, ECG.

  2. Multiplatform analysis of 12 cancer types reveals molecular classification within and across tissues of origin

    DEFF Research Database (Denmark)

    Hoadley, Katherine A; Yau, Christina; Wolf, Denise M

    2014-01-01

    Recent genomic analyses of pathologically defined tumor types identify "within-a-tissue" disease subtypes. However, the extent to which genomic signatures are shared across tissues is still unclear. We performed an integrative analysis using five genome-wide platforms and one proteomic platform...... on 3,527 specimens from 12 cancer types, revealing a unified classification into 11 major subtypes. Five subtypes were nearly identical to their tissue-of-origin counterparts, but several distinct cancer types were found to converge into common subtypes. Lung squamous, head and neck, and a subset...

  3. Hydrologic-Process-Based Soil Texture Classifications for Improved Visualization of Landscape Function

    Science.gov (United States)

    Groenendyk, Derek G.; Ferré, Ty P.A.; Thorp, Kelly R.; Rice, Amy K.

    2015-01-01

    Soils lie at the interface between the atmosphere and the subsurface and are a key component that controls ecosystem services, food production, and many other processes at the Earth’s surface. There is a long-established convention for identifying and mapping soils by texture. These readily available, georeferenced soil maps and databases are used widely in environmental sciences. Here, we show that these traditional soil classifications can be inappropriate, contributing to bias and uncertainty in applications from slope stability to water resource management. We suggest a new approach to soil classification, with a detailed example from the science of hydrology. Hydrologic simulations based on common meteorological conditions were performed using HYDRUS-1D, spanning textures identified by the United States Department of Agriculture soil texture triangle. We consider these common conditions to be: drainage from saturation, infiltration onto a drained soil, and combined infiltration and drainage events. Using a k-means clustering algorithm, we created soil classifications based on the modeled hydrologic responses of these soils. The hydrologic-process-based classifications were compared to those based on soil texture and a single hydraulic property, Ks. Differences in classifications based on hydrologic response versus soil texture demonstrate that traditional soil texture classification is a poor predictor of hydrologic response. We then developed a QGIS plugin to construct soil maps combining a classification with georeferenced soil data from the Natural Resource Conservation Service. The spatial patterns of hydrologic response were more immediately informative, much simpler, and less ambiguous, for use in applications ranging from trafficability to irrigation management to flood control. The ease with which hydrologic-process-based classifications can be made, along with the improved quantitative predictions of soil responses and visualization of landscape
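
    The clustering step can be sketched with a plain k-means (Lloyd's algorithm) on response vectors. This is a minimal illustration, not the authors' workflow: the two-number response summaries are invented, the HYDRUS-1D simulations are not reproduced, and the centroids are initialized naively from the first k points so that the result is deterministic.

```python
import math


def kmeans(points, k, iterations=50):
    """Cluster points with Lloyd's algorithm; returns (labels, centroids)."""
    centroids = [list(p) for p in points[:k]]  # naive deterministic initialization
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical response summaries per soil (e.g., drainage time, infiltration depth).
responses = [(0.1, 0.2), (5.0, 5.1), (0.2, 0.1), (5.2, 4.9), (0.0, 0.0), (4.9, 5.0)]
labels, centroids = kmeans(responses, k=2)
print(labels)
```

    Soils that share a label respond alike under the simulated conditions, regardless of where they sit on the texture triangle.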

  4. Quantum Ensemble Classification: A Sampling-Based Learning Control Approach.

    Science.gov (United States)

    Chen, Chunlin; Dong, Daoyi; Qi, Bo; Petersen, Ian R; Rabitz, Herschel

    2017-06-01

    Quantum ensemble classification (QEC) has significant applications in discrimination of atoms (or molecules), separation of isotopes, and quantum information extraction. However, quantum mechanics forbids deterministic discrimination among nonorthogonal states. The classification of inhomogeneous quantum ensembles is very challenging, since there exist variations in the parameters characterizing the members within different classes. In this paper, we recast QEC as a supervised quantum learning problem. A systematic classification methodology is presented by using a sampling-based learning control (SLC) approach for quantum discrimination. The classification task is accomplished via simultaneously steering members belonging to different classes to their corresponding target states (e.g., mutually orthogonal states). First, a new discrimination method is proposed for two similar quantum systems. Then, an SLC method is presented for QEC. Numerical results demonstrate the effectiveness of the proposed approach for the binary classification of two-level quantum ensembles and the multiclass classification of multilevel quantum ensembles.

  5. Granular loess classification based

    International Nuclear Information System (INIS)

    Browzin, B.S.

    1985-01-01

    This paper discusses how loess might be identified by two index properties: the granulometric composition and the dry unit weight. These two indices are necessary but not always sufficient for identification of loess. On the basis of analyses of samples from three continents, it was concluded that the 0.01-0.5-mm fraction deserves the name loessial fraction. Based on the loessial fraction concept, a granulometric classification of loess is proposed. A triangular chart is used to classify loess

  6. mRMR-ABC: A Hybrid Gene Selection Algorithm for Cancer Classification Using Microarray Gene Expression Profiling

    Directory of Open Access Journals (Sweden)

    Hala Alshamlan

    2015-01-01

    Full Text Available An artificial bee colony (ABC) is a relatively recent swarm intelligence optimization approach. In this paper, we propose the first attempt at applying ABC algorithm in analyzing a microarray gene expression profile. In addition, we propose an innovative feature selection algorithm, minimum redundancy maximum relevance (mRMR), and combine it with an ABC algorithm, mRMR-ABC, to select informative genes from microarray profile. The new approach is based on a support vector machine (SVM) algorithm to measure the classification accuracy for selected genes. We evaluate the performance of the proposed mRMR-ABC algorithm by conducting extensive experiments on six binary and multiclass gene expression microarray datasets. Furthermore, we compare our proposed mRMR-ABC algorithm with previously known techniques. We reimplemented two of these techniques for the sake of a fair comparison using the same parameters. These two techniques are mRMR when combined with a genetic algorithm (mRMR-GA) and mRMR when combined with a particle swarm optimization algorithm (mRMR-PSO). The experimental results prove that the proposed mRMR-ABC algorithm achieves accurate classification performance using small number of predictive genes when tested using both datasets and compared to previously suggested methods. This shows that mRMR-ABC is a promising approach for solving gene selection and cancer classification problems.

  7. mRMR-ABC: A Hybrid Gene Selection Algorithm for Cancer Classification Using Microarray Gene Expression Profiling.

    Science.gov (United States)

    Alshamlan, Hala; Badr, Ghada; Alohali, Yousef

    2015-01-01

    An artificial bee colony (ABC) is a relatively recent swarm intelligence optimization approach. In this paper, we propose the first attempt at applying ABC algorithm in analyzing a microarray gene expression profile. In addition, we propose an innovative feature selection algorithm, minimum redundancy maximum relevance (mRMR), and combine it with an ABC algorithm, mRMR-ABC, to select informative genes from microarray profile. The new approach is based on a support vector machine (SVM) algorithm to measure the classification accuracy for selected genes. We evaluate the performance of the proposed mRMR-ABC algorithm by conducting extensive experiments on six binary and multiclass gene expression microarray datasets. Furthermore, we compare our proposed mRMR-ABC algorithm with previously known techniques. We reimplemented two of these techniques for the sake of a fair comparison using the same parameters. These two techniques are mRMR when combined with a genetic algorithm (mRMR-GA) and mRMR when combined with a particle swarm optimization algorithm (mRMR-PSO). The experimental results prove that the proposed mRMR-ABC algorithm achieves accurate classification performance using small number of predictive genes when tested using both datasets and compared to previously suggested methods. This shows that mRMR-ABC is a promising approach for solving gene selection and cancer classification problems.
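
    The mRMR filter that precedes the ABC search can be sketched as a greedy selection that maximizes relevance to the class label minus mean redundancy with genes already chosen. This sketch assumes discretized expression values and the difference form of the criterion; the ABC search and the SVM evaluation are not included, and the gene names and data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Mutual information (in bits) between two discrete sequences of equal length."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * math.log2((c / n) / ((px[a] / n) * (py[b] / n)))
        for (a, b), c in pxy.items()
    )


def mrmr(features, labels, n_select):
    """Greedily pick feature names by relevance minus mean redundancy."""
    selected = []
    remaining = list(features)

    def score(name):
        relevance = mutual_information(features[name], labels)
        if not selected:
            return relevance
        redundancy = sum(
            mutual_information(features[name], features[s]) for s in selected
        ) / len(selected)
        return relevance - redundancy

    while remaining and len(selected) < n_select:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical binarized expression: gene_b duplicates gene_a, gene_c is weaker but distinct.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "gene_a": [0, 0, 0, 1, 1, 1, 1, 1],
    "gene_b": [0, 0, 0, 1, 1, 1, 1, 1],
    "gene_c": [0, 0, 1, 0, 1, 1, 0, 1],
}
print(mrmr(features, labels, n_select=2))
```

    The duplicate gene is passed over in the second round because its redundancy with the first pick outweighs its relevance, which is the behavior the "minimum redundancy" term is meant to produce.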

  8. Body mass index: different nutritional status according to WHO, OPAS and Lipschitz classifications in gastrointestinal cancer patients.

    Science.gov (United States)

    Barao, Katia; Forones, Nora Manoukian

    2012-01-01

    The body mass index (BMI) is the most common marker used in the diagnosis of nutritional status. The great advantages of this index are its ease of measurement, its low cost, its good correlation with fat mass and its association with morbidity and mortality. To compare the BMI differences according to the WHO, OPAS and Lipschitz classifications, a prospective study of 352 patients with esophageal, gastric or colorectal cancer was done. The BMI was calculated and analyzed by the classifications of WHO, Lipschitz and OPAS. The mean age was 62.1 ± 12.4 years and 59% of the patients were older than 59 years. The BMI had not difference between the genders in patients cancer had more than 65 years. A different cutoff must be used for these patients, because undernourished patients may be wrongly considered well nourished.
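
    How the same BMI can fall into different nutritional categories is easy to show in a short sketch. The abstract does not list the cut-offs, so the values below are assumptions: the WHO adult categories (18.5, 25, 30 kg/m²) and the Lipschitz categories commonly cited for older adults (below 22 underweight, 22 to 27 eutrophic, above 27 overweight). The OPAS scheme is omitted, and the patient data are hypothetical.

```python
def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


def who_category(value):
    """WHO adult BMI categories (assumed cut-offs: 18.5, 25, 30)."""
    if value < 18.5:
        return "underweight"
    if value < 25.0:
        return "normal weight"
    if value < 30.0:
        return "overweight"
    return "obese"


def lipschitz_category(value):
    """Lipschitz categories for older adults (assumed cut-offs: 22 and 27)."""
    if value < 22.0:
        return "underweight"
    if value <= 27.0:
        return "eutrophic"
    return "overweight"


# Hypothetical patient: 60 kg, 1.70 m.
value = bmi(60.0, 1.70)
print(round(value, 1), who_category(value), lipschitz_category(value))
```

    Under these assumed cut-offs the hypothetical patient is normal weight by WHO but underweight by Lipschitz, which is the kind of disagreement the abstract warns about.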

  9. Stratification and Prognostic Relevance of Jass’s Molecular Classification of Colorectal Cancer

    International Nuclear Information System (INIS)

    Zlobec, Inti; Bihl, Michel P.; Foerster, Anja; Rufle, Alex; Terracciano, Luigi; Lugli, Alessandro

    2012-01-01

    Background: The current proposed model of colorectal tumorigenesis is based primarily on CpG island methylator phenotype (CIMP), microsatellite instability (MSI), KRAS, BRAF, and methylation status of O-6-Methylguanine DNA Methyltransferase (MGMT) and classifies tumors into five subgroups. The aim of this study is to validate this molecular classification and test its prognostic relevance. Methods: Three hundred two patients were included in this study. Molecular analysis was performed for five CIMP-related promoters (CRABP1, MLH1, p16INK4a, CACNA1G, NEUROG1), MGMT, MSI, KRAS, and BRAF. Methylation in at least four promoters or in one to three promoters was considered CIMP-high (CIMP-H) or CIMP-low (CIMP-L), respectively. Results: CIMP-H, CIMP-L, and CIMP-negative were found in 7.1, 43, and 49.9% of cases, respectively. One hundred twenty-three tumors (41%) could not be classified into any one of the proposed molecular subgroups, including 107 CIMP-L, 14 CIMP-H, and two CIMP-negative cases. The 10-year survival rate for CIMP-high patients [22.6% (95%CI: 7–43)] was significantly lower than for CIMP-L or CIMP-negative (p = 0.0295). Only the combined analysis of BRAF and CIMP (negative versus L/H) led to distinct prognostic subgroups. Conclusion: Although CIMP status has an effect on outcome, our results underline the need for standardized definitions of low- and high-level CIMP, the lack of which clearly hinders an effective prognostic and molecular classification of colorectal cancer.
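
    The CIMP thresholds stated in the Methods can be written as a small rule. This is a sketch only: the function name is hypothetical, and treating zero methylated promoters as CIMP-negative is an assumption inferred from the three categories reported in the Results.

```python
def cimp_status(methylated_promoters: int, panel_size: int = 5) -> str:
    """Map the number of methylated CIMP-related promoters to a CIMP category."""
    if not 0 <= methylated_promoters <= panel_size:
        raise ValueError("count must lie between 0 and the panel size")
    if methylated_promoters >= 4:
        return "CIMP-high"  # at least 4 of the 5 promoters methylated
    if methylated_promoters >= 1:
        return "CIMP-low"  # one to three promoters methylated
    return "CIMP-negative"  # assumed: no promoter methylated


statuses = [cimp_status(n) for n in range(6)]
```

    The rule makes the abstract's point concrete: the category depends entirely on where the cut between "low" and "high" is drawn, so a different definition shifts tumors between groups.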

  10. Stratification and Prognostic Relevance of Jass’s Molecular Classification of Colorectal Cancer

    Energy Technology Data Exchange (ETDEWEB)

    Zlobec, Inti [Institute of Pathology, University of Bern, Bern (Switzerland); Institute for Pathology, University Hospital Basel, Basel (Switzerland); Bihl, Michel P.; Foerster, Anja; Rufle, Alex; Terracciano, Luigi [Institute for Pathology, University Hospital Basel, Basel (Switzerland); Lugli, Alessandro, E-mail: inti.zlobec@pathology.unibe.ch [Institute of Pathology, University of Bern, Bern (Switzerland); Institute for Pathology, University Hospital Basel, Basel (Switzerland)

    2012-02-27

    Background: The current proposed model of colorectal tumorigenesis is based primarily on CpG island methylator phenotype (CIMP), microsatellite instability (MSI), KRAS, BRAF, and methylation status of O-6-Methylguanine DNA Methyltransferase (MGMT) and classifies tumors into five subgroups. The aim of this study is to validate this molecular classification and test its prognostic relevance. Methods: Three hundred two patients were included in this study. Molecular analysis was performed for five CIMP-related promoters (CRABP1, MLH1, p16INK4a, CACNA1G, NEUROG1), MGMT, MSI, KRAS, and BRAF. Methylation in at least four promoters or in one to three promoters was considered CIMP-high (CIMP-H) or CIMP-low (CIMP-L), respectively. Results: CIMP-H, CIMP-L, and CIMP-negative were found in 7.1, 43, and 49.9% of cases, respectively. One hundred twenty-three tumors (41%) could not be classified into any one of the proposed molecular subgroups, including 107 CIMP-L, 14 CIMP-H, and two CIMP-negative cases. The 10-year survival rate for CIMP-high patients [22.6% (95%CI: 7–43)] was significantly lower than for CIMP-L or CIMP-negative (p = 0.0295). Only the combined analysis of BRAF and CIMP (negative versus L/H) led to distinct prognostic subgroups. Conclusion: Although CIMP status has an effect on outcome, our results underline the need for standardized definitions of low- and high-level CIMP, the lack of which clearly hinders an effective prognostic and molecular classification of colorectal cancer.

  11. Stratification and prognostic relevance of Jass’s molecular classification of colorectal cancer

    Directory of Open Access Journals (Sweden)

    Inti Zlobec

    2012-02-01

    Full Text Available Background: The current proposed model of colorectal tumorigenesis is based primarily on CpG island methylator phenotype (CIMP), microsatellite instability (MSI), KRAS, BRAF, and methylation status of O-6-Methylguanine DNA Methyltransferase (MGMT) and classifies tumors into 5 subgroups. The aim of this study is to validate this molecular classification and test its prognostic relevance. Methods: 302 patients were included in this study. Molecular analysis was performed for 5 CIMP-related promoters (CRABP1, MLH1, p16INK4a, CACNA1G, NEUROG1), MGMT, MSI, KRAS and BRAF. Tumors were CIMP-high or CIMP-low if ≥4 and 1-3 promoters were methylated, respectively. Results: CIMP-high, CIMP-low and CIMP-negative were found in 7.1%, 43% and 49.9% of cases, respectively. 123 tumors (41%) could not be classified into any one of the proposed molecular subgroups, including 107 CIMP-low, 14 CIMP-high and 2 CIMP-negative cases. The 10-year survival rate for CIMP-high patients (22.6% (95%CI: 7-43)) was significantly lower than for CIMP-low or CIMP-negative (p=0.0295). Only the combined analysis of BRAF and CIMP (negative versus low/high) led to distinct prognostic subgroups. Conclusion: Although CIMP status has an effect on outcome, our results underline the need for standardized definitions of low- and high-level CIMP, the lack of which clearly hinders an effective prognostic and molecular classification of colorectal cancer.

  12. Failure diagnosis using deep belief learning based health state classification

    International Nuclear Information System (INIS)

    Tamilselvan, Prasanna; Wang, Pingfeng

    2013-01-01

    Effective health diagnosis provides multifarious benefits such as improved safety, improved reliability and reduced costs for operation and maintenance of complex engineered systems. This paper presents a novel multi-sensor health diagnosis method using deep belief network (DBN). DBN has recently become a popular approach in machine learning for its promised advantages such as fast inference and the ability to encode richer and higher order network structures. The DBN employs a hierarchical structure with multiple stacked restricted Boltzmann machines and works through a layer by layer successive learning process. The proposed multi-sensor health diagnosis methodology using DBN based state classification can be structured in three consecutive stages: first, defining health states and preprocessing sensory data for DBN training and testing; second, developing DBN based classification models for diagnosis of predefined health states; third, validating DBN classification models with testing sensory dataset. Health diagnosis using DBN based health state classification technique is compared with four existing diagnosis techniques. Benchmark classification problems and two engineering health diagnosis applications: aircraft engine health diagnosis and electric power transformer health diagnosis are employed to demonstrate the efficacy of the proposed approach

  13. A case-oriented web-based training system for breast cancer diagnosis.

    Science.gov (United States)

    Huang, Qinghua; Huang, Xianhai; Liu, Longzhong; Lin, Yidi; Long, Xingzhang; Li, Xuelong

    2018-03-01

    Breast cancer is still considered the most common form of cancer as well as the leading cause of cancer deaths among women all over the world. We aim to provide a web-based breast ultrasound database for online training of inexperienced radiologists and for giving computer-assisted diagnostic information for detection and classification of breast tumors. We introduce a web database which stores breast ultrasound images from breast cancer patients as well as their diagnostic information. A web-based training system using a feature scoring scheme based on the Breast Imaging Reporting and Data System (BI-RADS) US lexicon was designed. A computer-aided diagnosis (CAD) subsystem was developed to assist the radiologists in scoring the BI-RADS features for an input case. The training system contains 1669 scored cases, of which 412 are benign and 1257 are malignant. It was tested by 31 users including 12 interns, 11 junior radiologists, and 8 experienced senior radiologists. This online training system automatically creates case-based exercises to train and guide newly employed or resident radiologists in the diagnosis of breast cancer using breast ultrasound images based on the BI-RADS. After the training, the interns and junior radiologists show significant improvement in the diagnosis of breast tumors with ultrasound imaging (p-value  .05). The online training system can improve the capabilities of early-career radiologists in distinguishing between benign and malignant lesions and reduce the misdiagnosis of breast cancer in a quick, convenient and effective manner. Copyright © 2017 Elsevier Ireland Ltd. All rights reserved.

  14. A resampling-based meta-analysis for detection of differential gene expression in breast cancer

    International Nuclear Information System (INIS)

    Gur-Dedeoglu, Bala; Konu, Ozlen; Kir, Serkan; Ozturk, Ahmet Rasit; Bozkurt, Betul; Ergul, Gulusan; Yulug, Isik G

    2008-01-01

    Accuracy in the diagnosis of breast cancer and classification of cancer subtypes has improved over the years with the development of well-established immunohistopathological criteria. More recently, diagnostic gene-sets at the mRNA expression level have been tested as better predictors of disease state. However, breast cancer is heterogeneous in nature; thus extraction of differentially expressed gene-sets that stably distinguish normal tissue from various pathologies poses challenges. Meta-analysis of high-throughput expression data using a collection of statistical methodologies leads to the identification of robust tumor gene expression signatures. A resampling-based meta-analysis strategy, which involves the use of resampling and application of distribution statistics in combination to assess the degree of significance in differential expression between sample classes, was developed. Two independent microarray datasets that contain normal breast, invasive ductal carcinoma (IDC), and invasive lobular carcinoma (ILC) samples were used for the meta-analysis. Expression of the genes, selected from the gene list for classification of normal breast samples and breast tumors encompassing both the ILC and IDC subtypes were tested on 10 independent primary IDC samples and matched non-tumor controls by real-time qRT-PCR. Other existing breast cancer microarray datasets were used in support of the resampling-based meta-analysis. The two independent microarray studies were found to be comparable, although differing in their experimental methodologies (Pearson correlation coefficient, R = 0.9389 and R = 0.8465 for ductal and lobular samples, respectively). The resampling-based meta-analysis has led to the identification of a highly stable set of genes for classification of normal breast samples and breast tumors encompassing both the ILC and IDC subtypes. The expression results of the selected genes obtained through real-time qRT-PCR supported the meta-analysis results. 

  15. A resampling-based meta-analysis for detection of differential gene expression in breast cancer

    Directory of Open Access Journals (Sweden)

    Ergul Gulusan

    2008-12-01

    Full Text Available Background: Accuracy in the diagnosis of breast cancer and classification of cancer subtypes has improved over the years with the development of well-established immunohistopathological criteria. More recently, diagnostic gene-sets at the mRNA expression level have been tested as better predictors of disease state. However, breast cancer is heterogeneous in nature; thus extraction of differentially expressed gene-sets that stably distinguish normal tissue from various pathologies poses challenges. Meta-analysis of high-throughput expression data using a collection of statistical methodologies leads to the identification of robust tumor gene expression signatures. Methods: A resampling-based meta-analysis strategy, which involves the use of resampling and application of distribution statistics in combination to assess the degree of significance in differential expression between sample classes, was developed. Two independent microarray datasets that contain normal breast, invasive ductal carcinoma (IDC), and invasive lobular carcinoma (ILC) samples were used for the meta-analysis. Expression of the genes, selected from the gene list for classification of normal breast samples and breast tumors encompassing both the ILC and IDC subtypes were tested on 10 independent primary IDC samples and matched non-tumor controls by real-time qRT-PCR. Other existing breast cancer microarray datasets were used in support of the resampling-based meta-analysis. Results: The two independent microarray studies were found to be comparable, although differing in their experimental methodologies (Pearson correlation coefficient, R = 0.9389 and R = 0.8465 for ductal and lobular samples, respectively). The resampling-based meta-analysis has led to the identification of a highly stable set of genes for classification of normal breast samples and breast tumors encompassing both the ILC and IDC subtypes.
The expression results of the selected genes obtained through real

  16. Graph-based semi-supervised learning with genomic data integration using condition-responsive genes applied to phenotype classification.

    Science.gov (United States)

    Doostparast Torshizi, Abolfazl; Petzold, Linda R

    2018-01-01

    Data integration methods that combine data from different molecular levels such as genome, epigenome, transcriptome, etc., have received a great deal of interest in the past few years. It has been demonstrated that the synergistic effects of different biological data types can boost learning capabilities and lead to a better understanding of the underlying interactions among molecular levels. In this paper we present a graph-based semi-supervised classification algorithm that incorporates latent biological knowledge in the form of biological pathways with gene expression and DNA methylation data. The process of graph construction from biological pathways is based on detecting condition-responsive genes, where 3 sets of genes are finally extracted: all condition responsive genes, high-frequency condition-responsive genes, and P-value-filtered genes. The proposed approach is applied to ovarian cancer data downloaded from the Human Genome Atlas. Extensive numerical experiments demonstrate superior performance of the proposed approach compared to other state-of-the-art algorithms, including the latest graph-based classification techniques. Simulation results demonstrate that integrating various data types enhances classification performance and leads to a better understanding of interrelations between diverse omics data types. The proposed approach outperforms many of the state-of-the-art data integration algorithms. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com
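
    The general idea behind graph-based semi-supervised classification can be sketched with plain label propagation: labeled samples keep their class, and each unlabeled sample repeatedly takes the weighted average of its neighbours' scores. This is a generic textbook technique shown for orientation, not the algorithm proposed in the paper; it uses no pathway knowledge, and the function name, the two-class +1/-1 encoding and the toy graph are assumptions.

```python
def propagate_labels(weights, labels, iterations=50):
    """Iterative label propagation on a weighted graph.

    weights: symmetric matrix (list of lists) of non-negative edge weights.
    labels:  +1 / -1 for labeled nodes, None for unlabeled nodes.
    Returns one score per node; the sign is the predicted class.
    """
    n = len(weights)
    scores = [0.0 if y is None else float(y) for y in labels]
    for _ in range(iterations):
        updated = []
        for i in range(n):
            if labels[i] is not None:
                updated.append(float(labels[i]))  # clamp known labels
                continue
            total = sum(weights[i])
            if total == 0:
                updated.append(0.0)  # isolated node: no evidence either way
                continue
            updated.append(sum(w * s for w, s in zip(weights[i], scores)) / total)
        scores = updated
    return scores


# Toy chain of four samples: 0 - 1 (strong), 1 - 2 (weak), 2 - 3 (strong).
weights = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.2, 0.0],
    [0.0, 0.2, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
labels = [1, None, None, -1]
scores = propagate_labels(weights, labels)
```

    In the toy chain, each unlabeled sample ends up with the sign of the labeled sample it is strongly connected to, which is how edge weights built from several data types could influence the outcome.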

  17. Feature selection and classification of MAQC-II breast cancer and multiple myeloma microarray gene expression data.

    Directory of Open Access Journals (Sweden)

    Qingzhong Liu

    Full Text Available Microarray data has a high dimension of variables but available datasets usually have only a small number of samples, thereby making the study of such datasets interesting and challenging. In the task of analyzing microarray data for the purpose of, e.g., predicting gene-disease association, feature selection is very important because it provides a way to handle the high dimensionality by exploiting information redundancy induced by associations among genetic markers. Judicious feature selection in microarray data analysis can result in significant reduction of cost while maintaining or improving the classification or prediction accuracy of learning machines that are employed to sort out the datasets. In this paper, we propose a gene selection method called Recursive Feature Addition (RFA), which combines supervised learning and statistical similarity measures. We compare our method with the following gene selection methods: Support Vector Machine Recursive Feature Elimination (SVMRFE), Leave-One-Out Calculation Sequential Forward Selection (LOOCSFS), Gradient based Leave-one-out Gene Selection (GLGS). To evaluate the performance of these gene selection methods, we employ several popular learning classifiers on the MicroArray Quality Control phase II on predictive modeling (MAQC-II) breast cancer dataset and the MAQC-II multiple myeloma dataset. Experimental results show that gene selection is strictly paired with learning classifier. Overall, our approach outperforms other compared methods. The biological functional analysis based on the MAQC-II breast cancer dataset convinced us to apply our method for phenotype prediction.
Additionally, learning classifiers also play important roles in the classification of microarray data and our experimental results indicate that the Nearest Mean Scale Classifier (NMSC) is a good choice due to its prediction reliability and its stability across the three performance measurements: Testing accuracy, MCC values, and

  18. Classification of Noisy Data: An Approach Based on Genetic Algorithms and Voronoi Tessellation

    DEFF Research Database (Denmark)

    Khan, Abdul Rauf; Schiøler, Henrik; Knudsen, Torben

    Classification is one of the major constituents of the data-mining toolkit. The well-known methods for classification are built on either the principle of logic or statistical/mathematical reasoning for classification. In this article we propose: (1) a different strategy, which is based on the po...

  19. Cancer survival classification using integrated data sets and intermediate information.

    Science.gov (United States)

    Kim, Shinuk; Park, Taesung; Kon, Mark

    2014-09-01

    Although numerous studies related to cancer survival have been published, increasing the prediction accuracy of survival classes still remains a challenge. Integration of different data sets, such as microRNA (miRNA) and mRNA, might increase the accuracy of survival class prediction. Therefore, we suggested a machine learning (ML) approach to integrate different data sets, and developed a novel method based on feature selection with Cox proportional hazard regression model (FSCOX) to improve the prediction of cancer survival time. FSCOX provides us with intermediate survival information, which is usually discarded when separating survival into 2 groups (short- and long-term), and allows us to perform survival analysis. We used an ML-based protocol for feature selection, integrating information from miRNA and mRNA expression profiles at the feature level. To predict survival phenotypes, we used the following classifiers, first, existing ML methods, support vector machine (SVM) and random forest (RF), second, a new median-based classifier using FSCOX (FSCOX_median), and third, an SVM classifier using FSCOX (FSCOX_SVM). We compared these methods using 3 types of cancer tissue data sets: (i) miRNA expression, (ii) mRNA expression, and (iii) combined miRNA and mRNA expression. The latter data set included features selected either from the combined miRNA/mRNA profile or independently from miRNAs and mRNAs profiles (IFS). In the ovarian data set, the accuracy of survival classification using the combined miRNA/mRNA profiles with IFS was 75% using RF, 86.36% using SVM, 84.09% using FSCOX_median, and 88.64% using FSCOX_SVM with a balanced 22 short-term and 22 long-term survivor data set. These accuracies are higher than those using miRNA alone (70.45%, RF; 75%, SVM; 75%, FSCOX_median; and 75%, FSCOX_SVM) or mRNA alone (65.91%, RF; 63.64%, SVM; 72.73%, FSCOX_median; and 70.45%, FSCOX_SVM). Similarly in the glioblastoma multiforme data, the accuracy of miRNA/mRNA using IFS

  20. Color Independent Components Based SIFT Descriptors for Object/Scene Classification

    Science.gov (United States)

    Ai, Dan-Ni; Han, Xian-Hua; Ruan, Xiang; Chen, Yen-Wei

    In this paper, we present a novel color independent components based SIFT descriptor (termed CIC-SIFT) for object/scene classification. We first learn an efficient color transformation matrix based on independent component analysis (ICA), which is adaptive to each category in a database. The ICA-based color transformation can enhance contrast between the objects and the background in an image. Then we compute CIC-SIFT descriptors over all three transformed color independent components. Since the ICA-based color transformation can boost the objects and suppress the background, the proposed CIC-SIFT can extract more effective and discriminative local features for object/scene classification. The comparison is performed among seven SIFT descriptors, and the experimental classification results show that our proposed CIC-SIFT is superior to other conventional SIFT descriptors.

  1. Object-Based Classification as an Alternative Approach to the Traditional Pixel-Based Classification to Identify Potential Habitat of the Grasshopper Sparrow

    Science.gov (United States)

    Jobin, Benoît; Labrecque, Sandra; Grenier, Marcelle; Falardeau, Gilles

    2008-01-01

    The traditional method of identifying wildlife habitat distribution over large regions consists of pixel-based classification of satellite images into a suite of habitat classes used to select suitable habitat patches. Object-based classification is a new method that can achieve the same objective based on the segmentation of spectral bands of the image creating homogeneous polygons with regard to spatial or spectral characteristics. The segmentation algorithm does not solely rely on the single pixel value, but also on shape, texture, and pixel spatial continuity. The object-based classification is a knowledge base process where an interpretation key is developed using ground control points and objects are assigned to specific classes according to threshold values of determined spectral and/or spatial attributes. We developed a model using the eCognition software to identify suitable habitats for the Grasshopper Sparrow, a rare and declining species found in southwestern Québec. The model was developed in a region with known breeding sites and applied on other images covering adjacent regions where potential breeding habitats may be present. We were successful in locating potential habitats in areas where dairy farming prevailed but failed in an adjacent region covered by a distinct Landsat scene and dominated by annual crops. We discuss the added value of this method, such as the possibility to use the contextual information associated to objects and the ability to eliminate unsuitable areas in the segmentation and land cover classification processes, as well as technical and logistical constraints. A series of recommendations on the use of this method and on conservation issues of Grasshopper Sparrow habitat is also provided.

  2. Comparison of clinical and survival characteristics between prostate cancer patients of PSA-based screening and clinical diagnosis in China.

    Science.gov (United States)

    Xu, Libo; Wang, Jinguo; Guo, Baofeng; Zhang, Haixia; Wang, Kaichen; Wang, Ding; Dai, Chang; Zhang, Ling; Zhao, Xuejian

    2018-01-02

    Prostate-specific antigen (PSA)-based mass screening remains the most controversial topic in prostate cancer. PSA-based mass screening has not been widely used in China yet. The aim of our study was to evaluate the effect of PSA-based screening in China. The cohort consisted of 1,012 prostate cancer patients. Data were retrospectively collected and clinical characteristics of the cohorts were investigated. Survival was analyzed for prostatic carcinoma of both PSA-screened and clinically diagnosed patients according to clinical characteristics and the National Comprehensive Cancer Network (NCCN) risk classification. Cox proportional hazards model analysis was done for risk predictor identification. The median age was 71 years. Five-year overall and prostate-cancer-specific survival in prostatic adenocarcinoma patients were 77.52% and 79.65%; 10-year survivals were 62.57% and 68.60%, respectively. Survival was significantly poorer in patients with metastases and non-curative management. T staging and Gleason score by NCCN classification effectively stratified prostatic adenocarcinoma patients into different risk groups. T staging was a significant predictor of survival by the Cox proportional hazards model. PSA-screened patients had a significantly higher percentage diagnosed in early stage. PSA-screened prostatic adenocarcinoma patients had a better prognosis in both overall and prostate-cancer-specific survival. This Chinese cohort had lower overall and prostate-cancer-specific survival rates than those reported in Western countries. The incidence of early-stage prostate cancer found in PSA-based mass screening was high and there were significant differences in both overall and prostate-cancer-specific survival between the PSA-screened and clinically diagnosed patients.

  3. Accurate crop classification using hierarchical genetic fuzzy rule-based systems

    Science.gov (United States)

    Topaloglou, Charalampos A.; Mylonas, Stelios K.; Stavrakoudis, Dimitris G.; Mastorocostas, Paris A.; Theocharis, John B.

    2014-10-01

    This paper investigates the effectiveness of an advanced classification system for accurate crop classification using very high resolution (VHR) satellite imagery. Specifically, a recently proposed genetic fuzzy rule-based classification system (GFRBCS) is employed, namely, the Hierarchical Rule-based Linguistic Classifier (HiRLiC). HiRLiC's model comprises a small set of simple IF-THEN fuzzy rules, easily interpretable by humans. One of its most important attributes is that its learning algorithm requires minimum user interaction, since the most important learning parameters affecting the classification accuracy are determined by the learning algorithm automatically. HiRLiC is applied in a challenging crop classification task, using a SPOT5 satellite image over an intensively cultivated area in a lake-wetland ecosystem in northern Greece. A rich set of higher-order spectral and textural features is derived from the initial bands of the (pan-sharpened) image, resulting in an input space comprising 119 features. The experimental analysis shows that HiRLiC compares favorably to other interpretable classifiers of the literature, both in terms of structural complexity and classification accuracy. Its testing accuracy was very close to that obtained by complex state-of-the-art classification systems, such as the support vector machines (SVM) and random forest (RF) classifiers. Nevertheless, visual inspection of the derived classification maps shows that HiRLiC is characterized by higher generalization properties, providing more homogeneous classifications than the competitors. Moreover, the runtime requirements for producing the thematic map were orders of magnitude lower than those of the competitors.

  4. Classification of masses on mammograms using support vector machine

    Science.gov (United States)

    Chu, Yong; Li, Lihua; Goldgof, Dmitry B.; Qui, Yan; Clark, Robert A.

    2003-05-01

    Mammography is the most effective method for early detection of breast cancer. However, the positive predictive value for classification of malignant and benign lesions from mammographic images is not very high. Clinical studies have shown that the proportion of biopsies that are positive for cancer is very low, between 15% and 30%. It is important to increase the diagnostic accuracy by improving the positive predictive value to reduce the number of unnecessary biopsies. In this paper, a new classification method is proposed to distinguish malignant from benign masses in mammography using a Support Vector Machine (SVM). Thirteen features were selected based on receiver operating characteristic (ROC) analysis of classification using each individual feature. These features include four shape features, two gradient features and seven Laws features. With these features, an SVM with a Gaussian kernel, trained by sequential minimal optimization, was used to classify the masses into two categories, benign and malignant. The data set used in this study consists of 193 cases, of which 96 are benign and 97 are malignant. The SVM classifier was evaluated by leave-one-out. The results show that the positive predictive value of the presented method is 81.6%, with a sensitivity of 83.7% and a false-positive rate of 30.2%. This demonstrates that the SVM-based classifier is effective in mass classification.
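
    The three figures quoted at the end of this abstract are all ratios taken from a two-by-two confusion matrix. The sketch below shows the standard definitions; the function name is hypothetical and the counts in the example are made up for illustration, not taken from the study.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard ratios for a binary classifier, with 'malignant' as the positive class."""
    return {
        # Share of truly malignant cases that the classifier flags.
        "sensitivity": tp / (tp + fn),
        # Share of truly benign cases that the classifier wrongly flags.
        "false_positive_rate": fp / (fp + tn),
        # Share of flagged cases that are truly malignant.
        "positive_predictive_value": tp / (tp + fp),
    }


# Made-up counts for illustration only.
metrics = diagnostic_metrics(tp=80, fp=20, tn=80, fn=20)
```

    A higher positive predictive value means fewer flagged masses turn out to be benign, which is the link the abstract draws to reducing unnecessary biopsies.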

  5. Hierarchical structure for audio-video based semantic classification of sports video sequences

    Science.gov (United States)

    Kolekar, M. H.; Sengupta, S.

    2005-07-01

    A hierarchical structure for sports event classification based on audio and video content analysis is proposed in this paper. Compared to event classification in other games, that of cricket is very challenging and as yet unexplored. We have successfully solved the cricket video classification problem using a six-level hierarchical structure. The first level performs event detection based on the audio energy and Zero Crossing Rate (ZCR) of the short-time audio signal. In the subsequent levels, we classify the events based on video features using a Hidden Markov Model implemented through Dynamic Programming (HMM-DP) using color or motion as a likelihood function. For some of the game-specific decisions, a rule-based classification is also performed. Our proposed hierarchical structure can easily be applied to other sports. Our results are very promising and we have moved a step forward towards addressing semantic classification problems in general.
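
    The two audio features used at the first level can be computed per frame as follows. This is a minimal sketch using the usual textbook definitions; the function names, the normalisation choices and the toy frames are assumptions, and the paper's frame length and decision thresholds are not given in the abstract.

```python
def short_time_energy(frame):
    """Mean squared amplitude of one audio frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of consecutive sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


# Toy frames: a loud, rapidly alternating signal and a quiet, all-positive one.
loud = [1.0, -1.0, 1.0, -1.0]
quiet = [0.1, 0.2, 0.1, 0.2]
features = {
    "loud": (short_time_energy(loud), zero_crossing_rate(loud)),
    "quiet": (short_time_energy(quiet), zero_crossing_rate(quiet)),
}
```

    A detector of the kind described would compare such per-frame values against thresholds to decide whether an event is present.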

  6. SQL based cardiovascular ultrasound image classification.

    Science.gov (United States)

    Nandagopalan, S; Suryanarayana, Adiga B; Sudarshan, T S B; Chandrashekar, Dhanalakshmi; Manjunath, C N

    2013-01-01

    This paper proposes a novel method to analyze and classify cardiovascular ultrasound echocardiographic images using a Naïve-Bayesian model via database OLAP-SQL. Efficient data mining algorithms based on a tightly-coupled model are used to extract features. Three algorithms are proposed for classification, namely the Naïve-Bayesian Classifier for Discrete variables (NBCD) with SQL, NBCD with OLAP-SQL, and the Naïve-Bayesian Classifier for Continuous variables (NBCC) using OLAP-SQL. The proposed model is trained with 207 patient images containing normal and abnormal categories. Of the three proposed algorithms, NBCC achieved the highest classification accuracy, 96.59%, which is better than the earlier methods.
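
    How a discrete Naïve-Bayes classifier can be driven by SQL aggregates is sketched below with Python's built-in sqlite3 module. This is an illustration of the general idea only, not the paper's NBCD or NBCC algorithms: the table layout, the feature names wall_motion and chamber_size, the add-one smoothing and the four example rows are all hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a few made-up, discretised cases."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cases (case_id INTEGER PRIMARY KEY, label TEXT)")
    conn.execute("CREATE TABLE features (case_id INTEGER, name TEXT, value TEXT)")
    rows = [
        (1, "normal", "regular", "normal"),
        (2, "normal", "regular", "normal"),
        (3, "abnormal", "irregular", "dilated"),
        (4, "abnormal", "irregular", "normal"),
    ]
    for case_id, label, motion, chamber in rows:
        conn.execute("INSERT INTO cases VALUES (?, ?)", (case_id, label))
        conn.execute("INSERT INTO features VALUES (?, 'wall_motion', ?)", (case_id, motion))
        conn.execute("INSERT INTO features VALUES (?, 'chamber_size', ?)", (case_id, chamber))
    return conn


def naive_bayes_scores(conn, observation):
    """Return unnormalised P(label) * product of P(value | label), with add-one smoothing."""
    class_counts = dict(conn.execute("SELECT label, COUNT(*) FROM cases GROUP BY label"))
    total = sum(class_counts.values())
    scores = {}
    for label, n_label in class_counts.items():
        score = n_label / total  # prior P(label)
        for name, value in observation.items():
            # Number of training cases of this class with the observed value.
            (n_match,) = conn.execute(
                "SELECT COUNT(*) FROM features f JOIN cases c ON c.case_id = f.case_id "
                "WHERE c.label = ? AND f.name = ? AND f.value = ?",
                (label, name, value),
            ).fetchone()
            # Number of distinct values of this feature, used for smoothing.
            (n_values,) = conn.execute(
                "SELECT COUNT(DISTINCT value) FROM features WHERE name = ?", (name,)
            ).fetchone()
            score *= (n_match + 1) / (n_label + n_values)
        scores[label] = score
    return scores


conn = build_demo_db()
scores = naive_bayes_scores(conn, {"wall_motion": "irregular", "chamber_size": "dilated"})
predicted = max(scores, key=scores.get)
```

    Keeping the counting inside the database means only small aggregate results cross into the application, which is one plausible reading of the "tightly-coupled" design the abstract mentions.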

  7. Design and implementation based on the classification protection vulnerability scanning system

    International Nuclear Information System (INIS)

    Wang Chao; Lu Zhigang; Liu Baoxu

    2010-01-01

    With the application and spread of classification protection, network security vulnerability scanning should consider efficiency and functional expansion. This paper proposes a classification of system vulnerabilities derived from classification protection, and elaborates the design and implementation of a vulnerability scanning system that is oriented to classification protection and based on vulnerability classification plug-in technology. According to the experiment, applying classification protection gives the system good adaptability and scalability, and it also improves the efficiency of scanning. (authors)

  8. Changes in classification of genetic variants in BRCA1 and BRCA2.

    Science.gov (United States)

    Kast, Karin; Wimberger, Pauline; Arnold, Norbert

    2018-02-01

    Classification of variants of unknown significance (VUS) in the breast cancer genes BRCA1 and BRCA2 changes with accumulating evidence for clinical relevance. In most cases down-staging towards neutral variants without clinical significance is possible. We searched the database of the German Consortium for Hereditary Breast and Ovarian Cancer (GC-HBOC) for changes in classification of genetic variants as an update to our earlier publication on genetic variants in the Centre of Dresden. Changes between 2015 and 2017 were recorded. In the group of variants of unclassified significance (VUS, Class 3, uncertain), only changes of classification towards neutral genetic variants were noted. In BRCA1, 25% of the Class 3 variants (n = 2/8) changed to Class 2 (likely benign) and Class 1 (benign). In BRCA2, in 50% of the Class 3 variants (n = 16/32), a change to Class 2 (n = 10/16) or Class 1 (n = 6/16) was observed. No change in classification was noted in Class 4 (likely pathogenic) and Class 5 (pathogenic) genetic variants in both genes. No up-staging from Class 1, Class 2 or Class 3 to more clinical significance was observed. All variants with a change in classification in our cohort were down-staged towards no clinical significance by a panel of experts of the German Consortium for Hereditary Breast and Ovarian Cancer (GC-HBOC). Prevention in families with Class 3 variants should be based on pedigree based risks and should not be guided by the presence of a VUS.

  9. CLASSIFICATION OF SEVERAL SKIN CANCER TYPES BASED ON AUTOFLUORESCENCE INTENSITY OF VISIBLE LIGHT TO NEAR INFRARED RATIO

    Directory of Open Access Journals (Sweden)

    Aryo Tedjo

    2009-12-01

    Skin cancer is a malignant growth on the skin caused by many factors. The most common skin cancers are Basal Cell Cancer (BCC) and Squamous Cell Cancer (SCC). This research uses discriminant analysis to classify skin cancer tissues based on a number of independent variables. The independent variables are combinations of excitation light sources (LED lamps), filters, and sensors used to measure the Autofluorescence Intensity (IAF) visible light to near infrared (VIS/NIR) ratio of paraffin-embedded tissue biopsies from BCC, SCC, and Lipoma. The discriminant analysis shows that the discriminant function is determined by four independent variables, i.e., Blue LED-Red Filter, Blue LED-Yellow Filter, UV LED-Blue Filter, and UV LED-Yellow Filter. The accuracy of the discriminant analysis in classifying the three tissue types is 100%.

  10. Deep convolutional neural networks for automatic classification of gastric carcinoma using whole slide images in digital histopathology.

    Science.gov (United States)

    Sharma, Harshita; Zerbe, Norman; Klempert, Iris; Hellwich, Olaf; Hufnagl, Peter

    2017-11-01

    Deep learning using convolutional neural networks is an actively emerging field in histological image analysis. This study explores deep learning methods for computer-aided classification in H&E stained histopathological whole slide images of gastric carcinoma. An introductory convolutional neural network architecture is proposed for two computerized applications, namely, cancer classification based on immunohistochemical response and necrosis detection based on the existence of tumor necrosis in the tissue. Classification performance of the developed deep learning approach is quantitatively compared with traditional image analysis methods in digital histopathology requiring prior computation of handcrafted features, such as statistical measures using gray level co-occurrence matrix, Gabor filter-bank responses, LBP histograms, gray histograms, HSV histograms and RGB histograms, followed by random forest machine learning. Additionally, the widely known AlexNet deep convolutional framework is comparatively analyzed for the corresponding classification problems. The proposed convolutional neural network architecture reports favorable results, with an overall classification accuracy of 0.6990 for cancer classification and 0.8144 for necrosis detection.

  11. Analysis of composition-based metagenomic classification.

    Science.gov (United States)

    Higashi, Susan; Barreto, André da Motta Salles; Cantão, Maurício Egidio; de Vasconcelos, Ana Tereza Ribeiro

    2012-01-01

    An essential step of a metagenomic study is the taxonomic classification, that is, the identification of the taxonomic lineage of the organisms in a given sample. The taxonomic classification process involves a series of decisions. Currently, in the context of metagenomics, such decisions are usually based on empirical studies that consider one specific type of classifier. In this study we propose a general framework for analyzing the impact that several decisions can have on the classification problem. Instead of focusing on any specific classifier, we define a generic score function that provides a measure of the difficulty of the classification task. Using this framework, we analyze the impact of the following parameters on the taxonomic classification problem: (i) the length of n-mers used to encode the metagenomic sequences, (ii) the similarity measure used to compare sequences, and (iii) the type of taxonomic classification, which can be conventional or hierarchical, depending on whether the classification process occurs in a single shot or in several steps according to the taxonomic tree. We defined a score function that measures the degree of separability of the taxonomic classes under a given configuration induced by the parameters above. We conducted an extensive computational experiment and found out that reasonable values for the parameters of interest could be (i) intermediate values of n, the length of the n-mers; (ii) any similarity measure, because all of them resulted in similar scores; and (iii) the hierarchical strategy, which performed better in all of the cases. As expected, short n-mers generate lower configuration scores because they give rise to frequency vectors that represent distinct sequences in a similar way. On the other hand, large values for n result in sparse frequency vectors that represent differently metagenomic fragments that are in fact similar, also leading to low configuration scores. Regarding the similarity measure, in
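
    To make parameters (i) and (ii) above concrete, here is a minimal sketch of encoding a sequence as an n-mer frequency vector and comparing two such vectors. Cosine similarity is used only as one example of a similarity measure; the abstract does not say which measures were tested, and the function names are hypothetical.

```python
from collections import Counter
from math import sqrt


def nmer_frequencies(seq, n):
    """Relative frequency of every overlapping n-mer in seq (len(seq) >= n)."""
    counts = Counter(seq[i:i + n] for i in range(len(seq) - n + 1))
    total = sum(counts.values())
    return {nmer: count / total for nmer, count in counts.items()}


def cosine_similarity(p, q):
    """Cosine similarity of two sparse frequency vectors stored as dicts."""
    dot = sum(value * q.get(nmer, 0.0) for nmer, value in p.items())
    norm_p = sqrt(sum(v * v for v in p.values()))
    norm_q = sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q)
```

    The trade-off noted in the abstract shows up directly: with small n almost every sequence shares the same few n-mers, while with large n the dictionaries become sparse and rarely overlap.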

  12. Group-Based Active Learning of Classification Models.

    Science.gov (United States)

    Luo, Zhipeng; Hauskrecht, Milos

    2017-05-01

    Learning of classification models from real-world data often requires additional human expert effort to annotate the data. However, this process can be rather costly and finding ways of reducing the human annotation effort is critical for this task. The objective of this paper is to develop and study new ways of providing human feedback for efficient learning of classification models by labeling groups of examples. Briefly, unlike traditional active learning methods that seek feedback on individual examples, we develop a new group-based active learning framework that solicits label information on groups of multiple examples. In order to describe groups in a user-friendly way, conjunctive patterns are used to compactly represent groups. Our empirical study on 12 UCI data sets demonstrates the advantages and superiority of our approach over both classic instance-based active learning work, as well as existing group-based active-learning methods.

  13. Classification across gene expression microarray studies

    Directory of Open Access Journals (Sweden)

    Kuner Ruprecht

    2009-12-01

    Background: The increasing number of gene expression microarray studies represents an important resource in biomedical research. As a result, gene expression based diagnosis has entered clinical practice for patient stratification in breast cancer. However, the integration and combined analysis of microarray studies remains a challenge. We assessed the potential benefit of data integration on the classification accuracy and systematically evaluated the generalization performance of selected methods on four breast cancer studies comprising almost 1000 independent samples. To this end, we introduced an evaluation framework which aims to establish good statistical practice and a graphical way to monitor differences. The classification goal was to correctly predict estrogen receptor status (negative/positive) and histological grade (low/high) of each tumor sample in an independent study which was not used for the training. For the classification we chose support vector machines (SVM), predictive analysis of microarrays (PAM), random forest (RF) and k-top scoring pairs (kTSP). Guided by considerations relevant for classification across studies we developed a generalization of kTSP which we evaluated in addition. Our derived version (DV) aims to improve the robustness of the intrinsic invariance of kTSP with respect to technologies and preprocessing. Results: For each individual study the generalization error was benchmarked via complete cross-validation and was found to be similar for all classification methods. The misclassification rates were substantially higher in classification across studies, when each single study was used as an independent test set while all remaining studies were combined for the training of the classifier. However, with increasing number of independent microarray studies used in the training, the overall classification performance improved. DV performed better than the average and showed slightly less variance. In
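
    The invariance of kTSP mentioned above comes from comparing the order of two genes within a sample rather than their absolute values. The sketch below finds a single top scoring pair (the k = 1 case) for two classes labelled 0 and 1. It is a simplified illustration, not the authors' derived version (DV), and the function name is hypothetical.

```python
from itertools import combinations


def top_scoring_pair(samples, labels):
    """Return ((i, j), score) for the gene pair whose ordering best separates classes.

    score = |P(x_i < x_j | class 0) - P(x_i < x_j | class 1)|
    """
    best_pair, best_score = None, -1.0
    n_genes = len(samples[0])
    for i, j in combinations(range(n_genes), 2):
        freq = {}
        for cls in (0, 1):
            members = [s for s, y in zip(samples, labels) if y == cls]
            freq[cls] = sum(1 for s in members if s[i] < s[j]) / len(members)
        score = abs(freq[0] - freq[1])
        if score > best_score:
            best_pair, best_score = (i, j), score
    return best_pair, best_score
```

    Because only within-sample orderings are used, any monotone rescaling of a sample (for example a different normalization per study) leaves the selected pair unchanged.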

  14. Prognostic classification index in Iranian colorectal cancer patients: Survival tree analysis

    Directory of Open Access Journals (Sweden)

    Amal Saki Malehi

    2016-01-01

    Aims: The aim of this study was to determine a prognostic index for separating homogeneous subgroups of colorectal cancer (CRC) patients based on clinicopathological characteristics using survival tree analysis. Methods: The current study was conducted at the Research Center of Gastroenterology and Liver Disease, Shahid Beheshti Medical University in Tehran, between January 2004 and January 2009. A total of 739 patients who had already been diagnosed with CRC based on pathology reports were enrolled. The data included demographic and clinicopathological characteristics of the patients. Tree-structured survival analysis based on a recursive partitioning algorithm was implemented to evaluate prognostic factors. The probability curves were calculated according to the Kaplan-Meier method, and the hazard ratio was estimated as the effect size of interest. Results: Of these patients, 526 (71.2%) were male. The mean survival time (from the time of diagnosis) was 42.46 ± 3.4. The survival tree identified three variables as the main prognostic factors, and four prognostic subgroups were constructed based on them. The log-rank test showed good separation of the survival curves. Patients with stage I-IIIA disease who were treated with surgery as the first treatment showed low risk (median = 34 months), whereas patients with stage IIIB or IV disease who were older than 68 years had the worst survival outcome (median = 9.5 months). Conclusion: Constructing a prognostic classification index via a survival tree can help researchers assess interactions between clinical variables and determine the cumulative effect of these variables on survival outcome.
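
    The Kaplan-Meier method named above can be sketched in a few lines. This is a generic product-limit estimator, not the study's code; the input format (one follow-up time and one event flag per patient) is an assumption.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one observed event.

    times  : follow-up duration per patient
    events : 1 if the event (death) was observed, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed  # censored patients leave the risk set too
    return curve
```

    Computing one such curve per subgroup produced by the survival tree gives the curves that a log-rank test would then compare.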

  15. Markerless gating for lung cancer radiotherapy based on machine learning techniques

    International Nuclear Information System (INIS)

    Lin Tong; Li Ruijiang; Tang Xiaoli; Jiang, Steve B; Dy, Jennifer G

    2009-01-01

    In lung cancer radiotherapy, radiation to a mobile target can be delivered by respiratory gating, for which we need to know whether the target is inside or outside a predefined gating window at any time point during the treatment. This can be achieved by tracking one or more fiducial markers implanted inside or near the target, either fluoroscopically or electromagnetically. However, the clinical implementation of marker tracking is limited for lung cancer radiotherapy mainly due to the risk of pneumothorax. Therefore, gating without implanted fiducial markers is a promising clinical direction. We have developed several template-matching methods for fluoroscopic marker-less gating. Recently, we have modeled the gating problem as a binary pattern classification problem, in which principal component analysis (PCA) and support vector machine (SVM) are combined to perform the classification task. Following the same framework, we investigated different combinations of dimensionality reduction techniques (PCA and four nonlinear manifold learning methods) and two machine learning classification methods (artificial neural networks-ANN and SVM). Performance was evaluated on ten fluoroscopic image sequences of nine lung cancer patients. We found that among all combinations of dimensionality reduction techniques and classification methods, PCA combined with either ANN or SVM achieved a better performance than the other nonlinear manifold learning methods. ANN when combined with PCA achieves a better performance than SVM in terms of classification accuracy and recall rate, although the target coverage is similar for the two classification methods. Furthermore, the running time for both ANN and SVM with PCA is within tolerance for real-time applications. Overall, ANN combined with PCA is a better candidate than other combinations we investigated in this work for real-time gated radiotherapy.

  16. Data Clustering and Evolving Fuzzy Decision Tree for Data Base Classification Problems

    Science.gov (United States)

    Chang, Pei-Chann; Fan, Chin-Yuan; Wang, Yen-Wen

    Data base classification suffers from two well-known difficulties, i.e., high dimensionality and non-stationary variations within large historic data. This paper presents a hybrid classification model that integrates a case-based reasoning technique, a Fuzzy Decision Tree (FDT), and Genetic Algorithms (GA) to construct a decision-making system for data classification in various data base applications. The model is mainly based on the idea that the historic data base can be transformed into a smaller case base together with a group of fuzzy decision rules. As a result, the model can respond more accurately to the data currently being classified, using the inductions from these smaller case-based fuzzy decision trees. Hit rate is applied as the performance measure, and the effectiveness of our proposed model is demonstrated by experimental comparison with other approaches on different data base classification applications. The average hit rate of our proposed model is the highest among the compared approaches.

  17. Comparison of Back propagation neural network and Back propagation neural network Based Particle Swarm intelligence in Diagnostic Breast Cancer

    Directory of Open Access Journals (Sweden)

    Farahnaz SADOUGHI

    2014-03-01

    Breast cancer is the most commonly diagnosed cancer and the most common cause of death in women all over the world. The use of computer technology to support breast cancer diagnosis is now widespread across a broad range of medical areas. Early diagnosis of this disease can greatly enhance the chances of long-term survival of breast cancer victims. Artificial Neural Networks (ANN), as a main method, play an important role in the early diagnosis of breast cancer. This paper studies the Levenberg-Marquardt Backpropagation (LMBP) neural network and Levenberg-Marquardt Backpropagation based Particle Swarm Optimization (LMBP-PSO) for the diagnosis of breast cancer. The obtained results show that both LMBP and the LMBP-based PSO system provide higher classification efficiency, but LMBP-based PSO needs minimum training and testing time. It helps in developing a Medical Decision System (MDS) for breast cancer diagnosis. It can also be used as a secondary observer in clinical decision making.

  18. An opto-electronic joint detection system based on DSP aiming at early cervical cancer screening

    Science.gov (United States)

    Wang, Weiya; Jia, Mengyu; Gao, Feng; Yang, Lihong; Qu, Pengpeng; Zou, Changping; Liu, Pengxi; Zhao, Huijuan

    2015-02-01

    Cervical cancer screening at a pre-cancer stage helps reduce mortality among women. An opto-electronic joint detection system based on DSP aiming at early cervical cancer screening is introduced in this paper. In this system, three electrodes alternately discharge to the cervical tissue and three light emitting diodes of different wavelengths alternately irradiate the cervical tissue. The relative optical reflectance and the electrical voltage attenuation curve are then obtained by optical and electrical detection, respectively. The system is based on DSP in order to achieve a portable and inexpensive instrument. Using the relative reflectance and the voltage attenuation constant, a classification algorithm based on Support Vector Machine (SVM) discriminates abnormal cervical tissue from normal tissue. We use particle swarm optimization to optimize the two key parameters of the SVM, i.e. the kernel parameter and the cost factor. Clinical data were collected from 313 patients to build a clinical database of tissue responses under optical and electrical stimulation, with histopathologic examination as the gold standard. The classification results show that opto-electronic joint detection has a higher total coincidence rate than separate optical detection or separate electrical detection. The sensitivity, specificity, and total coincidence rate increase as the number of samples in the training set increases. The average total coincidence rate of the system can reach 85.1% compared with histopathologic examination.
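
    The three figures reported above can be computed from paired predictions and gold-standard labels as in the sketch below. It assumes that "total coincidence rate" means overall agreement with histopathology, which the abstract implies but does not define; the function name and input format are hypothetical.

```python
def screening_metrics(predicted, reference):
    """Compare binary predictions with gold-standard labels (1 = abnormal).

    Assumes both classes occur in the reference, otherwise a division by
    zero would occur.
    """
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)
    tn = sum(1 for p, r in pairs if not p and not r)
    fp = sum(1 for p, r in pairs if p and not r)
    fn = sum(1 for p, r in pairs if not p and r)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        # Assumed definition: share of cases where the system agrees
        # with the histopathologic examination.
        "total_coincidence_rate": (tp + tn) / len(pairs),
    }
```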

  19. Mastectomy or breast conserving surgery? Factors affecting type of surgical treatment for breast cancer – a classification tree approach

    International Nuclear Information System (INIS)

    Martin, Michael A; Meyricke, Ramona; O'Neill, Terry; Roberts, Steven

    2006-01-01

    A critical choice facing breast cancer patients is which surgical treatment – mastectomy or breast conserving surgery (BCS) – is most appropriate. Several studies have investigated factors that impact the type of surgery chosen, identifying features such as place of residence, age at diagnosis, tumor size, socio-economic and racial/ethnic elements as relevant. Such assessment of 'propensity' is important in understanding issues such as a reported under-utilisation of BCS among women for whom such treatment was not contraindicated. Using Western Australian (WA) data, we further examine the factors associated with the type of surgical treatment for breast cancer using a classification tree approach. This approach deals naturally with complicated interactions between factors, and so allows flexible and interpretable models for treatment choice to be built that add to the current understanding of this complex decision process. Data was extracted from the WA Cancer Registry on women diagnosed with breast cancer in WA from 1990 to 2000. Subjects' treatment preferences were predicted from covariates using both classification trees and logistic regression. Tumor size was the primary determinant of patient choice, subjects with tumors smaller than 20 mm in diameter preferring BCS. For subjects with tumors greater than 20 mm in diameter factors such as patient age, nodal status, and tumor histology become relevant as predictors of patient choice. Classification trees perform as well as logistic regression for predicting patient choice, but are much easier to interpret for clinical use. The selected tree can inform clinicians' advice to patients

  20. A new gammagraphic and functional-based classification for hyperthyroidism

    International Nuclear Information System (INIS)

    Sanchez, J.; Lamata, F.; Cerdan, R.; Agilella, V.; Gastaminza, R.; Abusada, R.; Gonzales, M.; Martinez, M.

    2000-01-01

    The absence of a universal classification for hyperthyroidism (HT) gives rise to inadequate interpretation of series and trials, and hinders decision making. We offer a tentative classification based on gammagraphic and functional findings. Clinical records of patients who underwent thyroidectomy in our Department from 1967 to 1997 were reviewed. Those with functional measurements of hyperthyroidism were considered. All were managed according to the same pre-established guidelines. HT was the surgical indication in 694 (27.1%) of the 2559 thyroidectomies. Based on gammagraphic studies, we classified HT into: parenchymatous increased-uptake, which could be diffuse, diffuse with cold nodules or diffuse with at least one nodule; and nodular increased-uptake (Autonomous Functioning Thyroid Nodes, AFTN), divided into solitary AFTN or toxic adenoma and multiple AFTN or toxic multi-nodular goiter. This gammagraphic-based classification is useful and has high sensitivity for detecting these nodules and assessing their activity, allowing us to make therapeutic decisions and, in some cases, to choose the surgical technique. (authors)

  1. A kernel-based multivariate feature selection method for microarray data classification.

    Directory of Open Access Journals (Sweden)

    Shiquan Sun

    High dimensionality and small sample sizes, and their inherent risk of overfitting, pose great challenges for constructing efficient classifiers in microarray data classification. Therefore a feature selection technique should be applied prior to data classification to enhance prediction performance. In general, filter methods can be considered as principal or auxiliary selection mechanisms because of their simplicity, scalability, and low computational complexity. However, a series of trivial examples show that filter methods result in less accurate performance because they ignore the dependencies between features. Although a few publications have sought to reveal the relationships among features with multivariate-based methods, these methods describe the relationships only linearly, and simple linear combinations restrict the improvement in performance. In this paper, we used a kernel method to discover inherent nonlinear correlations among features as well as between features and the target. Moreover, the number of orthogonal components was determined by kernel Fisher's linear discriminant analysis (FLDA) in a self-adaptive manner rather than by manual parameter settings. To demonstrate the effectiveness of our method, we performed several experiments and compared the results of our method with those of other competitive multivariate-based feature selectors. In our comparison, we used two classifiers (support vector machine and k-nearest neighbor) on two groups of datasets, namely two-class and multi-class datasets. Experimental results demonstrate that our method performs better than the others, especially on three hard-to-classify datasets, namely Wang's Breast Cancer, Gordon's Lung Adenocarcinoma and Pomeroy's Medulloblastoma.

  2. Restaging and Survival Analysis of 4036 Ovarian Cancer Patients According to the 2013 FIGO Classification for Ovarian, Fallopian Tube, and Primary Peritoneal Cancer

    DEFF Research Database (Denmark)

    Rosendahl, Mikkel; Høgdall, Claus Kim; Mosgaard, Berit Jul

    2016-01-01

    OBJECTIVE: With the 2013 International Federation of Gynecology and Obstetrics (FIGO) staging for ovarian, fallopian tube, and primary peritoneal cancer, the number of substages changed from 10 to 14. Any classification of a malignancy should easily assign patients to prognostic groups, refer....... MATERIALS AND METHODS: Demographic, surgical, histological, and survival data from 4036 ovarian cancer patients were used in the analysis. Five-year survival rates (5YSR) and hazard ratios for the old and revised FIGO staging were calculated using Kaplan-Meier curves and Cox regression. RESULTS: A total...

  3. Finding Combination of Features from Promoter Regions for Ovarian Cancer-related Gene Group Classification

    KAUST Repository

    Olayan, Rawan S.

    2012-01-01

    In classification problems, it is always important to use the suitable combination of features that will be employed by classifiers. Generating the right combination of features usually results in good classifiers. In the situation when the problem is not well understood, data items are usually described by many features in the hope that some of these may be the relevant or most relevant ones. In this study, we focus on one such problem related to genes implicated in ovarian cancer (OC). We try to recognize two important OC-related gene groups: oncogenes, which support the development and progression of OC, and oncosuppressors, which oppose such tendencies. For this, we use the properties of promoters of these genes. We identified potential “regulatory features” that characterize OC-related oncogenes and oncosuppressors promoters. In our study, we used 211 oncogenes and 39 oncosuppressors. For these, we identified 538 characteristic sequence motifs from their promoters. Promoters are annotated by these motifs and derived feature vectors used to develop classification models. We made a comparison of a number of classification models in their ability to distinguish oncogenes from oncosuppressors. Based on 10-fold cross-validation, the resultant model was able to separate the two classes with sensitivity of 96% and specificity of 100% with the complete set of features. Moreover, we developed another recognition model where we attempted to distinguish oncogenes and oncosuppressors as one group from other OC-related genes. That model achieved accuracy of 82%. We believe that the results of this study will help in discovering other OC-related oncogenes and oncosuppressors not identified as yet.
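
    The 10-fold cross-validation mentioned above can be sketched as an index splitter. This is a plain, unstratified split with a hypothetical function name; with 211 oncogenes and only 39 oncosuppressors, a stratified split would probably be preferable in practice, and the abstract does not say which variant was used.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists for k-fold cross-validation.

    Sample i is assigned to fold i % k, so fold sizes differ by at most one.
    """
    folds = [list(range(start, n_samples, k)) for start in range(k)]
    for test in folds:
        held_out = set(test)
        train = [i for i in range(n_samples) if i not in held_out]
        yield train, test
```

    For the 250 genes in the study (211 + 39) each fold holds out 25 samples, and every sample is tested exactly once.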

  5. Comparison of the prevalence of malnutrition diagnosis in head and neck, gastrointestinal and lung cancer patients by three classification methods

    Science.gov (United States)

    Platek, Mary E.; Popp KPf, Johann V.; Possinger, Candi S.; DeNysschen, Carol A.; Horvath, Peter; Brown, Jean K.

    2011-01-01

    Background: Malnutrition is prevalent among patients within certain cancer types. There is a lack of a universal standard of care for nutrition screening, and a lack of agreement on an operational definition and on the validity of malnutrition indicators. Objective: In a secondary data analysis, we investigated the prevalence of malnutrition diagnosis by three classification methods using data from medical records of a National Cancer Institute (NCI)-designated comprehensive cancer center. Interventions/Methods: Records of 227 patients hospitalized during 1998 with head and neck, gastrointestinal or lung cancer were reviewed for malnutrition based on three methods: 1) physician-diagnosed malnutrition-related ICD-9 codes; 2) in-hospital nutritional assessment summaries conducted by Registered Dietitians; and 3) body mass index (BMI). For patients with multiple admissions, only data from the first hospitalization were included. Results: Prevalence of malnutrition diagnosis ranged from 8.8% based on BMI to approximately 26% of all cases based on dietitian assessment. Kappa coefficients between any two methods indicated a weak (kappa=0.23, BMI and Dietitians; kappa=0.28, Dietitians and Physicians) to fair strength of agreement (kappa=0.38, BMI and Physicians). Conclusions: Available methods to identify patients with malnutrition in an NCI-designated comprehensive cancer center resulted in varied prevalence of malnutrition diagnosis. A universal standard of care for nutrition screening that utilizes validated tools is needed. Implications for Practice: The Joint Commission on the Accreditation of Healthcare Organizations requires nutritional screening of patients within 24 hours of admission. For this purpose, implementation of a validated tool that can be used by various healthcare practitioners, including nurses, needs to be considered. PMID:21242767
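
    The kappa coefficients quoted above measure agreement between two classification methods beyond what chance alone would produce. A minimal sketch of Cohen's kappa for two raters follows; the example labels in the usage note are invented, not taken from the study.

```python
def cohens_kappa(a, b):
    """Cohen's kappa for two equal-length label sequences.

    Undefined (division by zero) when chance agreement equals 1, i.e. when
    both raters assign the same single label to every case.
    """
    n = len(a)
    labels = set(a) | set(b)
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    # Chance agreement from each rater's marginal label frequencies.
    expected = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    return (observed - expected) / (1 - expected)
```

    For example, `cohens_kappa([1, 1, 0, 0], [1, 0, 0, 0])` gives 0.5: the raters agree on three of four cases (0.75), chance agreement is 0.5, so kappa is (0.75 - 0.5) / (1 - 0.5).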

  6. Performance Evaluation of Frequency Transform Based Block Classification of Compound Image Segmentation Techniques

    Science.gov (United States)

    Selwyn, Ebenezer Juliet; Florinabel, D. Jemi

    2018-04-01

    Compound image segmentation plays a vital role in the compression of computer screen images. Computer screen images are images that mix textual, graphical, and pictorial content. In this paper, we present a comparison of two transform-based block classification methods for compound images, using metrics such as speed of classification, precision and recall rate. Block-based classification approaches normally divide the compound image into fixed-size, non-overlapping blocks. A frequency transform such as the Discrete Cosine Transform (DCT) or the Discrete Wavelet Transform (DWT) is then applied to each block. The mean and standard deviation are computed for each 8 × 8 block and are used as the feature set to classify the blocks into text/graphics and picture/background blocks. The classification accuracy of block classification based segmentation techniques is measured by evaluation metrics such as precision and recall rate. Compound images with smooth backgrounds and with complex backgrounds, containing text of varying size, colour and orientation, are considered for testing. Experimental evidence shows that DWT-based segmentation improves recall and precision by approximately 2.3% over DCT-based segmentation, at the cost of increased block classification time, for both smooth and complex background images.
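
    The block partitioning and per-block statistics described above can be sketched as follows. To stay short, this example computes the mean and standard deviation directly on pixel values and omits the DCT/DWT step that the paper applies first; the image format (a list of rows of grey levels) and the function name are assumptions.

```python
from statistics import mean, pstdev


def block_features(image, block=8):
    """Return (mean, standard deviation) for each non-overlapping block.

    image is a list of equal-length rows of grey levels. Blocks are visited
    row by row; partial blocks at the right or bottom edge are skipped.
    """
    features = []
    rows, cols = len(image), len(image[0])
    for r in range(0, rows - block + 1, block):
        for c in range(0, cols - block + 1, block):
            pixels = [
                image[y][x]
                for y in range(r, r + block)
                for x in range(c, c + block)
            ]
            features.append((mean(pixels), pstdev(pixels)))
    return features
```

    A flat background block yields a standard deviation near zero, while a block containing text edges yields a much larger one, which is the kind of contrast a text/graphics versus picture/background classifier can exploit.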

  7. Clinical significance of combined detection of CYFRA21-1, NSE and CEA in classification and staging of patients with lung cancer

    International Nuclear Information System (INIS)

    Hu He; Li Yanhua; Liang Weida; Zhang Qin

    2011-01-01

    To explore the clinical value of combined detection of CYFRA21-1, NSE and CEA in the classification and staging of patients with lung cancer, CYFRA21-1, NSE and CEA levels in pleural effusion were measured by electrochemiluminescence in 330 patients with lung cancer and 43 patients with benign disease. CYFRA21-1, NSE and CEA levels in pleural effusion were significantly higher in the lung cancer group than in the benign group (P<0.01). Positive rates of the tumor markers differed by pathological type: the CYFRA21-1 positive rate was highest in the squamous cell cancer group (65.5%), the CEA positive rate was highest in the glands cancer group (65.0%), and the NSE positive rate was highest in the differentiation cancer group (79.5%). The positive rate of combined detection of the three markers was higher than that of any single marker. Tumor marker levels in lung cancer correlated positively with clinical stage: the later the clinical stage, the higher the marker levels, and levels in stages III∼IV were clearly higher than in stages I∼II (P<0.05). Combined detection of CYFRA21-1, NSE and CEA may raise the positive rate in lung cancer detection and may have significant clinical value in the classification and staging of patients with lung cancer. (authors)

  8. The correlation study of radiological findings with pathological classification of superficial depressed (IIc type) early gastric cancer

    International Nuclear Information System (INIS)

    Liu Linxiang; Deng Bingxing; Liu Yujin; Iinuma, G.; Moriyama, N.

    2007-01-01

    Objective: To investigate the relationship between radiological findings and the pathological classification of superficial depressed (IIc type) early gastric cancer. Methods: Radiological features on subtonic double contrast barium examination and the endoscopic pictures of early gastric cancer were prospectively compared with the global pathological specimens and micro-pathological features. Together with the gastric endoscopic pictures, the sharpness of the lesion margins, the changes in converging mucosal folds, and the changes in the depressed surface on double contrast barium films were analyzed. The correlation between the radiological features and the histological classification of gastric cancer, including well differentiated tubular adenocarcinoma (tub1), moderately differentiated tubular adenocarcinoma (tub2), poorly differentiated adenocarcinoma (por) and signet-ring cell carcinoma (sig), was studied. Results: Of the 102 cases of IIc type early gastric cancer, 27 were tub1, 11 tub2, 26 por and 38 sig histologically. The margins of the depressed lesions in tub1 (24 cases) and tub2 (9 cases) cancers were mostly indistinct or had a fine spicular border, while the margins of por (15 cases) and sig (31 cases) lesions were mostly clearly and sharply demarcated, a statistically significant difference (P<0.01). The depressed surface of tub1 and tub2 lesions (17 cases) showed little unevenness, sometimes with even granulation, a single nodule or a scar-like depression, while that of por and sig lesions (41 cases) showed nodules of varying sizes, a statistically significant difference (P<0.01). Conclusion: The radiological findings of superficial depressed early gastric cancer differ between histological types, so the likely histological type can be inferred from the radiological findings of the lesions. (authors)

  9. Graph-Based Semi-Supervised Hyperspectral Image Classification Using Spatial Information

    Science.gov (United States)

    Jamshidpour, N.; Homayouni, S.; Safari, A.

    2017-09-01

    Hyperspectral image classification has been one of the most popular research areas in the remote sensing community in the past decades. However, there are still some problems that need specific attention. For example, the lack of enough labeled samples and the high dimensionality problem are two of the most important issues, and they degrade the performance of supervised classification dramatically. The main idea of semi-supervised learning is to overcome these issues with the contribution of unlabeled samples, which are available in enormous amounts. In this paper, we propose a graph-based semi-supervised classification method, which uses both spectral and spatial information for hyperspectral image classification. More specifically, two graphs were designed and constructed in order to exploit the relationship among pixels in spectral and spatial spaces respectively. Then, the Laplacians of both graphs were merged to form a weighted joint graph. The experiments were carried out on two different benchmark hyperspectral data sets. The proposed method performed significantly better than the well-known supervised classification methods, such as SVM. The assessments consisted of both accuracy and homogeneity analyses of the produced classification maps. The proposed spectral-spatial SSL method considerably increased the classification accuracy when the labeled training data set was scarce. When there were only five labeled samples for each class, the performance improved by 5.92% and 10.76% compared to spatial graph-based SSL, for the AVIRIS Indian Pine and Pavia University data sets respectively.
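    The abstract states that the Laplacians of the spectral and spatial graphs are merged into a weighted joint graph but does not give the weighting. A minimal sketch, assuming unnormalized Laplacians (L = D - W) and a convex combination with a hypothetical weight `mu`, could look like this; the tiny 3-pixel graphs are illustrative only.

```python
def laplacian(w):
    """Unnormalized graph Laplacian L = D - W for a symmetric weight matrix."""
    n = len(w)
    return [
        [(sum(w[i]) if i == j else 0.0) - w[i][j] for j in range(n)]
        for i in range(n)
    ]


def joint_laplacian(w_spectral, w_spatial, mu):
    """Convex combination of two Laplacians; mu is an assumed mixing weight."""
    l_spec = laplacian(w_spectral)
    l_spat = laplacian(w_spatial)
    n = len(l_spec)
    return [
        [mu * l_spec[i][j] + (1.0 - mu) * l_spat[i][j] for j in range(n)]
        for i in range(n)
    ]


# Hypothetical 3-pixel example: spectral neighbours form a chain 0-1-2,
# spatial neighbours link pixels 0 and 2 only.
w_spectral = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
w_spatial = [[0, 0, 1], [0, 0, 0], [1, 0, 0]]
joint = joint_laplacian(w_spectral, w_spatial, mu=0.5)
for row in joint:
    print(row)
```

    Each row of the result sums to zero, as expected for a graph Laplacian, so the merged matrix can be used wherever a single-graph Laplacian would be.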

  10. GRAPH-BASED SEMI-SUPERVISED HYPERSPECTRAL IMAGE CLASSIFICATION USING SPATIAL INFORMATION

    Directory of Open Access Journals (Sweden)

    N. Jamshidpour

    2017-09-01

    Hyperspectral image classification has been one of the most popular research areas in the remote sensing community in the past decades. However, there are still some problems that need specific attention. For example, the lack of enough labeled samples and the high dimensionality problem are two of the most important issues, and they degrade the performance of supervised classification dramatically. The main idea of semi-supervised learning is to overcome these issues with the contribution of unlabeled samples, which are available in enormous amounts. In this paper, we propose a graph-based semi-supervised classification method, which uses both spectral and spatial information for hyperspectral image classification. More specifically, two graphs were designed and constructed in order to exploit the relationship among pixels in spectral and spatial spaces respectively. Then, the Laplacians of both graphs were merged to form a weighted joint graph. The experiments were carried out on two different benchmark hyperspectral data sets. The proposed method performed significantly better than the well-known supervised classification methods, such as SVM. The assessments consisted of both accuracy and homogeneity analyses of the produced classification maps. The proposed spectral-spatial SSL method considerably increased the classification accuracy when the labeled training data set was scarce. When there were only five labeled samples for each class, the performance improved by 5.92% and 10.76% compared to spatial graph-based SSL, for the AVIRIS Indian Pine and Pavia University data sets respectively.

  11. Key-phrase based classification of public health web pages.

    Science.gov (United States)

    Dolamic, Ljiljana; Boyer, Célia

    2013-01-01

    This paper describes and evaluates the public health web pages classification model based on key phrase extraction and matching. Easily extendible both in terms of new classes as well as the new language this method proves to be a good solution for text classification faced with the total lack of training data. To evaluate the proposed solution we have used a small collection of public health related web pages created by a double blind manual classification. Our experiments have shown that by choosing the adequate threshold value the desired value for either precision or recall can be achieved.
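    The abstract describes classification by key-phrase matching with a tunable threshold, without giving the scoring rule. As a rough sketch only, assuming the score of a class is the fraction of its key phrases found in the page (an assumption, as are the phrase lists and the function name `classify_page`), the precision/recall trade-off controlled by the threshold can be illustrated like this:

```python
def classify_page(text, class_phrases, threshold):
    """Return the classes whose share of matched key phrases reaches threshold."""
    lowered = text.lower()
    assigned = []
    for label, phrases in class_phrases.items():
        hits = sum(1 for phrase in phrases if phrase.lower() in lowered)
        score = hits / len(phrases)
        if score >= threshold:
            assigned.append(label)
    return assigned


# Hypothetical key-phrase lists; no training data is needed to define them.
PHRASES = {
    "vaccination": ["vaccine", "immunization schedule", "booster dose"],
    "nutrition": ["balanced diet", "vitamin", "food safety"],
}
page = "The immunization schedule lists each vaccine; a vitamin supplement is optional."

strict = classify_page(page, PHRASES, threshold=0.6)  # favours precision
loose = classify_page(page, PHRASES, threshold=0.3)   # favours recall
print(strict, loose)
```

    Raising the threshold keeps only classes with strong phrase evidence, while lowering it admits weaker matches; adding a class or a language only requires a new phrase list.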

  12. Classification and risk assessment of individuals with familial polyposis, Gardner's syndrome, and familial non-polyposis colon cancer from [3H]thymidine labeling patterns in colonic epithelial cells

    International Nuclear Information System (INIS)

    Lipkin, M.; Blattner, W.A.; Gardner, E.J.; Burt, R.W.; Lynch, H.; Deschner, E.; Winawer, S.; Fraumeni, J.F. Jr.

    1984-01-01

    A probabilistic analysis has been developed to assist the binary classification and risk assessment of members of familial colon cancer kindreds. The analysis is based on the microautoradiographic observation of [3H]thymidine-labeled epithelial cells in colonic mucosa of the kindred members. From biopsies of colonic mucosa which are labeled with [3H]thymidine in vitro, the degree of similarity of each subject's cell-labeling pattern measured over entire crypts was automatically compared to the labeling patterns of high-risk and low-risk reference populations. Each individual was then presumptively classified and assigned to one of the reference populations, and a degree of risk for the classification was provided. In carrying out the analysis, a linear score was calculated for each individual relative to each of the reference populations, and the classification was based on the polarity of the score difference; the degree of risk was then quantitated from the magnitude of the score difference. When the method was applied to kindreds having either familial polyposis or familial non-polyposis colon cancer, it effectively segregated individuals affected with disease from others at low risk, with sensitivity and specificity ranging from 71 to 92%. Further application of the method to asymptomatic family members believed to be at 50% risk on the basis of pedigree evaluation revealed a bimodal distribution to nearly zero or full risk. The accuracy and simplicity of this approach and its capability of revealing early stages of abnormal colonic epithelial cell development indicate potential for preclinical screening of subjects at risk in cancer-prone kindreds and for assisting the analysis of modes of inheritance.

  13. The Study of Land Use Classification Based on SPOT6 High Resolution Data

    OpenAIRE

    Wu Song; Jiang Qigang

    2016-01-01

    A method is presented for quickly classifying and extracting land use types in agricultural areas. It is based on SPOT6 high resolution remote sensing data and exploits the good nonlinear classification ability of the support vector machine. The results show that SPOT6 high resolution remote sensing data can support efficient land classification: the overall classification accuracy reached 88.79% and the Kappa coefficient is 0.8632, which means that the classif...

  14. Rough set classification based on quantum logic

    Science.gov (United States)

    Hassan, Yasser F.

    2017-11-01

    By combining the advantages of quantum computing and soft computing, the paper shows that rough sets can be used with quantum logic for classification and recognition systems. We suggest a new definition of rough set theory as a quantum logic theory. Rough approximations are essential elements of rough set theory; the quantum rough set model for set-valued data directly constructs set approximations based on a kind of quantum similarity relation, which is presented here. Theoretical analyses demonstrate that the new model for quantum rough sets yields a new type of decision rule with less redundancy, which can be used to give accurate classification using the principles of quantum superposition and non-linear quantum relations. To our knowledge, this is the first attempt to define rough sets in a quantum representation rather than in terms of logic or sets. Experiments on data sets have demonstrated that the proposed model is more accurate than traditional rough sets in terms of finding optimal classifications.

  15. ParSel: Parallel Selection of Micro-RNAs for Survival Classification in Cancers.

    Science.gov (United States)

    Sinha, Debajyoti; Sengupta, Debarka; Bandyopadhyay, Sanghamitra

    2017-07-01

    It is known that tumor micro-RNAs (miRNA) can define patient survival and treatment response. We present a framework to identify miRNAs which are predictive of cancer survival. The framework attempts to rank the miRNAs by exploring their collaborative role in gene regulation. Our approach tests a significantly large number of combinatorial cases leveraging parallel computation. We carefully avoided parametric assumptions involved in evaluations of miRNA expressions but used rigorous statistical computation to assign an importance score to a miRNA. Experimental results on three cancer types namely, KIRC, OV and GBM verify that the top ranked miRNAs obtained using the proposed framework produce better classification accuracy as compared to some best practice variable selection methods. Some of these top ranked miRNA are also known to be associated with related diseases. © 2017 Wiley‐VCH Verlag GmbH & Co. KGaA, Weinheim.

  16. Organizational Data Classification Based on the Importance Concept of Complex Networks.

    Science.gov (United States)

    Carneiro, Murillo Guimaraes; Zhao, Liang

    2017-08-01

    Data classification is a common task, which can be performed by both computers and human beings. However, a fundamental difference between them can be observed: computer-based classification considers only physical features (e.g., similarity, distance, or distribution) of input data; by contrast, brain-based classification takes into account not only physical features, but also the organizational structure of data. In this paper, we figure out the data organizational structure for classification using complex networks constructed from training data. Specifically, an unlabeled instance is classified by the importance concept characterized by Google's PageRank measure of the underlying data networks. Before a test data instance is classified, a network is constructed from vector-based data set and the test instance is inserted into the network in a proper manner. To this end, we also propose a measure, called spatio-structural differential efficiency, to combine the physical and topological features of the input data. Such a method allows for the classification technique to capture a variety of data patterns using the unique importance measure. Extensive experiments demonstrate that the proposed technique has promising predictive performance on the detection of heart abnormalities.
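    The importance concept in this abstract is characterized by PageRank computed on a network built from the data. The sketch below shows only that ingredient, a plain power-iteration PageRank on an adjacency matrix; the network construction and the spatio-structural differential efficiency measure are not described in enough detail here to reproduce, and the damping value 0.85 and the small star graph are assumptions for illustration.

```python
def pagerank(adj, damping=0.85, iters=100):
    """Power-iteration PageRank on a (possibly weighted) adjacency matrix."""
    n = len(adj)
    rank = [1.0 / n] * n
    out_weight = [sum(row) for row in adj]
    for _ in range(iters):
        new_rank = [(1.0 - damping) / n] * n
        for i in range(n):
            if out_weight[i] == 0:
                # Dangling node: spread its rank evenly over all nodes.
                for j in range(n):
                    new_rank[j] += damping * rank[i] / n
            else:
                for j in range(n):
                    if adj[i][j]:
                        new_rank[j] += damping * rank[i] * adj[i][j] / out_weight[i]
        rank = new_rank
    return rank


# Hypothetical network: node 0 is a hub linked to three peripheral nodes.
star = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
ranks = pagerank(star)
print([round(r, 3) for r in ranks])
```

    The hub receives the largest score, which is the sense in which PageRank captures a node's structural importance rather than only its distance to other points.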

  17. Combined Kernel-Based BDT-SMO Classification of Hyperspectral Fused Images

    Directory of Open Access Journals (Sweden)

    Fenghua Huang

    2014-01-01

    To solve the poor generalization and flexibility problems that single kernel SVM classifiers have when classifying combined spectral and spatial features, this paper proposes a solution to improve the classification accuracy and efficiency of hyperspectral fused images: (1) different radial basis kernel functions (RBFs) are employed for spectral and textural features, and a new combined radial basis kernel function (CRBF) is proposed by combining them in a weighted manner; (2) the binary decision tree-based multiclass SMO (BDT-SMO) is used in the classification of hyperspectral fused images; (3) experiments are carried out, where the single radial basis function (SRBF)-based BDT-SMO classifier and the CRBF-based BDT-SMO classifier are used, respectively, to classify the land usages of hyperspectral fused images, and genetic algorithms (GA) are used to optimize the kernel parameters of the classifiers. The results show that, compared with SRBF, CRBF-based BDT-SMO classifiers display greater classification accuracy and efficiency.
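    The combined kernel in point (1) weights one RBF kernel on spectral features against another on textural features. The abstract does not give the exact formula, so the following is a sketch under the assumption of a convex combination with a hypothetical weight `weight` and separate widths `gamma_spec` and `gamma_tex`; the feature vectors are made up.

```python
import math


def rbf(x, y, gamma):
    """Gaussian radial basis kernel exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def combined_rbf(spec_x, spec_y, tex_x, tex_y, gamma_spec, gamma_tex, weight):
    """Weighted sum of a spectral RBF kernel and a textural RBF kernel.

    With 0 <= weight <= 1 the result is again a valid kernel, because a
    non-negative combination of positive semi-definite kernels is itself
    positive semi-definite.
    """
    return (weight * rbf(spec_x, spec_y, gamma_spec)
            + (1.0 - weight) * rbf(tex_x, tex_y, gamma_tex))


# Hypothetical pixels: 3 spectral bands and 2 texture measures each.
k_same = combined_rbf([0.2, 0.4, 0.1], [0.2, 0.4, 0.1], [1.0, 0.5], [1.0, 0.5],
                      gamma_spec=2.0, gamma_tex=0.5, weight=0.7)
k_diff = combined_rbf([0.2, 0.4, 0.1], [0.9, 0.1, 0.6], [1.0, 0.5], [0.2, 1.5],
                      gamma_spec=2.0, gamma_tex=0.5, weight=0.7)
print(k_same, k_diff)
```

    In a setup like the one described, the two widths and the weight would be the kernel parameters left to an optimizer such as a genetic algorithm.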

  18. Assessing Unmet Information Needs of Breast Cancer Survivors: Exploratory Study of Online Health Forums Using Text Classification and Retrieval.

    Science.gov (United States)

    McRoy, Susan; Rastegar-Mojarad, Majid; Wang, Yanshan; Ruddy, Kathryn J; Haddad, Tufia C; Liu, Hongfang

    2018-05-15

    Patient education materials given to breast cancer survivors may not be a good fit for their information needs. Needs may change over time, be forgotten, or be misreported, for a variety of reasons. An automated content analysis of survivors' postings to online health forums can identify expressed information needs over a span of time and be repeated regularly at low cost. Identifying these unmet needs can guide improvements to existing education materials and the creation of new resources. The primary goals of this project are to assess the unmet information needs of breast cancer survivors from their own perspectives and to identify gaps between information needs and current education materials. This approach employs computational methods for content modeling and supervised text classification to data from online health forums to identify explicit and implicit requests for health-related information. Potential gaps between needs and education materials are identified using techniques from information retrieval. We provide a new taxonomy for the classification of sentences in online health forum data. 260 postings from two online health forums were selected, yielding 4179 sentences for coding. After annotation of data and training alternative one-versus-others classifiers, a random forest-based approach achieved F1 scores from 66% (Other, dataset2) to 90% (Medical, dataset1) on the primary information types. 136 expressions of need were used to generate queries to indexed education materials. Upon examination of the best two pages retrieved for each query, 12% (17/136) of queries were found to have relevant content by all coders, and 33% (45/136) were judged to have relevant content by at least one. Text from online health forums can be analyzed effectively using automated methods. 
    Our analysis confirms that breast cancer survivors have many information needs that are not covered by the written documents they typically receive, as our results suggest that at most

  19. Automated classification of cell morphology by coherence-controlled holographic microscopy

    Science.gov (United States)

    Strbkova, Lenka; Zicha, Daniel; Vesely, Pavel; Chmelik, Radim

    2017-08-01

    In the last few years, classification of cells by machine learning has become frequently used in biology. However, most of the approaches are based on morphometric (MO) features, which are not quantitative in terms of cell mass. This may result in poor classification accuracy. Here, we study the potential contribution of coherence-controlled holographic microscopy enabling quantitative phase imaging for the classification of cell morphologies. We compare our approach with the commonly used method based on MO features. We tested both classification approaches in an experiment with nutritionally deprived cancer tissue cells, while employing several supervised machine learning algorithms. Most of the classifiers provided higher performance when quantitative phase features were employed. Based on the results, it can be concluded that the quantitative phase features played an important role in improving the performance of the classification. The methodology could be valuable help in refining the monitoring of live cells in an automated fashion. We believe that coherence-controlled holographic microscopy, as a tool for quantitative phase imaging, offers all preconditions for the accurate automated analysis of live cell behavior while enabling noninvasive label-free imaging with sufficient contrast and high-spatiotemporal phase sensitivity.

  20. Hardware Accelerators Targeting a Novel Group Based Packet Classification Algorithm

    Directory of Open Access Journals (Sweden)

    O. Ahmed

    2013-01-01

    Packet classification is a ubiquitous and key building block for many critical network devices. However, it remains one of the main bottlenecks faced when designing fast network devices. In this paper, we propose a novel Group Based Search packet classification Algorithm (GBSA) that is scalable, fast, and efficient. GBSA consumes an average of 0.4 megabytes of memory for a 10 k rule set. The worst-case classification time per packet is 2 microseconds, and the preprocessing speed is 3 M rules/second based on a Xeon processor operating at 3.4 GHz. When compared with other state-of-the-art classification techniques, the results showed that GBSA outperforms the competition with respect to speed, memory usage, and processing time. Moreover, GBSA is amenable to implementation in hardware. Three different hardware implementations are also presented in this paper, including an Application Specific Instruction Set Processor (ASIP) implementation and two pure Register-Transfer Level (RTL) implementations based on Impulse-C and Handel-C flows, respectively. Speedups achieved with these hardware accelerators ranged from 9x to 18x compared with a pure software implementation running on a Xeon processor.

  1. Classification of high resolution imagery based on fusion of multiscale texture features

    International Nuclear Information System (INIS)

    Liu, Jinxiu; Liu, Huiping; Lv, Ying; Xue, Xiaojuan

    2014-01-01

    In high resolution data classification, combining texture features with spectral bands can effectively improve classification accuracy. However, the window size, which is difficult to choose, is an important factor influencing overall accuracy in textural classification, and current approaches to image texture analysis depend on a single moving window, which ignores the different scale features of various land cover types. In this paper, we propose a new method based on the fusion of multiscale texture features to overcome these problems. The main steps of the new method are the classification of spectral/textural images at fixed window sizes from 3×3 to 15×15 and the comparison of all the posterior probability values for every pixel; the largest probability value is assigned to the pixel, which is thereby automatically assigned to a land cover type. The proposed approach is tested on University of Pavia ROSIS data. The results indicate that the new method improves classification accuracy compared with methods based on textural classification with a fixed window size.
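    The fusion step described above compares, for every pixel, the posterior values obtained with each window size and keeps the class with the largest one. A minimal sketch of that selection rule follows; the data layout (a dictionary from window size to per-pixel class posteriors) and all numbers are assumptions made for illustration.

```python
def fuse_multiscale(posteriors_by_window):
    """Pick, per pixel, the class with the largest posterior over all windows.

    posteriors_by_window maps a window size to a list with one posterior
    vector (one value per class) for each pixel. Returns a list of
    (class_index, window_size) pairs, one per pixel.
    """
    windows = sorted(posteriors_by_window)
    n_pixels = len(posteriors_by_window[windows[0]])
    fused = []
    for px in range(n_pixels):
        best_p, best_class, best_window = -1.0, None, None
        for w in windows:
            for c, p in enumerate(posteriors_by_window[w][px]):
                if p > best_p:  # ties keep the smaller window seen first
                    best_p, best_class, best_window = p, c, w
        fused.append((best_class, best_window))
    return fused


# Hypothetical posteriors for 2 pixels, 2 classes, and 3 window sizes.
posteriors = {
    3: [[0.60, 0.40], [0.55, 0.45]],
    7: [[0.30, 0.70], [0.80, 0.20]],
    15: [[0.20, 0.80], [0.50, 0.50]],
}
fused = fuse_multiscale(posteriors)
print(fused)
```

    Here the first pixel is labeled by the 15×15 result and the second by the 7×7 result, which illustrates how different land cover types can end up being decided at different texture scales.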

  2. Empirical Studies On Machine Learning Based Text Classification Algorithms

    OpenAIRE

    Shweta C. Dharmadhikari; Maya Ingle; Parag Kulkarni

    2011-01-01

    Automatic classification of text documents has become an important research issue nowadays. Proper classification of text documents requires information retrieval, machine learning and natural language processing (NLP) techniques. Our aim is to focus on important approaches to automatic text classification based on machine learning techniques, viz. supervised, unsupervised and semi-supervised. In this paper we present a review of various text classification approaches under machine learning paradig...

  3. Locality-preserving sparse representation-based classification in hyperspectral imagery

    Science.gov (United States)

    Gao, Lianru; Yu, Haoyang; Zhang, Bing; Li, Qingting

    2016-10-01

    This paper proposes to combine locality-preserving projections (LPP) and sparse representation (SR) for hyperspectral image classification. The LPP is first used to reduce the dimensionality of all the training and testing data by finding the optimal linear approximations to the eigenfunctions of the Laplace Beltrami operator on the manifold, where the high-dimensional data lies. Then, SR codes the projected testing pixels as sparse linear combinations of all the training samples to classify the testing pixels by evaluating which class leads to the minimum approximation error. The integration of LPP and SR represents an innovative contribution to the literature. The proposed approach, called locality-preserving SR-based classification, addresses the imbalance between high dimensionality of hyperspectral data and the limited number of training samples. Experimental results on three real hyperspectral data sets demonstrate that the proposed approach outperforms the original counterpart, i.e., SR-based classification.

  4. Tumor Size Evaluation according to the T Component of the Seventh Edition of the International Association for the Study of Lung Cancer's TNM Classification: Interobserver Agreement between Radiologists and Computer-Aided Diagnosis System in Patients with Lung Cancer

    International Nuclear Information System (INIS)

    Kim, Jin Kyoung; Chong, Se Min; Seo, Jae Seung; Lee, Sun Jin; Han, Heon

    2011-01-01

    To assess the interobserver agreement for tumor size evaluation between radiologists and the computer-aided diagnosis (CAD) system based on the 7th edition of the TNM classification by the International Association for the Study of Lung Cancer in patients with lung cancer. We evaluated 20 patients who underwent a lobectomy or pneumonectomy for primary lung cancer. The maximum diameter of each primary tumor was measured by two radiologists and a CAD system on CT, and was staged based on the 7th edition of the TNM classification. The CT size and T-staging of the primary tumors was compared with the pathologic size and staging and the variability in the sizes and T stages of primary tumors was statistically analyzed between each radiologist's measurement or CAD estimation and the pathologic results. There was no statistically significant interobserver difference for the CT size among the two radiologists, between pathologic and CT size estimated by the radiologists, and between pathologic and CT staging by the radiologists and CAD system. However, there was a statistically significant interobserver difference between pathologic size and the CT size estimated by the CAD system (p = 0.003). No significant differences were found in the measurement of tumor size among radiologists or in the assessment of T-staging by radiologists and the CAD system.

  5. Ligand and structure-based classification models for Prediction of P-glycoprotein inhibitors

    DEFF Research Database (Denmark)

    Klepsch, Freya; Poongavanam, Vasanthanathan; Ecker, Gerhard Franz

    2014-01-01

    an algorithm based on Euclidean distance. Results show that random forest and SVM performed best for classification of P-gp inhibitors and non-inhibitors, correctly predicting 73/75 % of the external test set compounds. Classification based on the docking experiments using the scoring function Chem...

  6. Improving Classification of Protein Interaction Articles Using Context Similarity-Based Feature Selection.

    Science.gov (United States)

    Chen, Yifei; Sun, Yuxing; Han, Bing-Qing

    2015-01-01

    Protein interaction article classification is a text classification task in the biological domain to determine which articles describe protein-protein interactions. Since the feature space in text classification is high-dimensional, feature selection is widely used for reducing the dimensionality of features to speed up computation without sacrificing classification performance. Many existing feature selection methods are based on the statistical measure of document frequency and term frequency. One potential drawback of these methods is that they treat features separately. Hence, first we design a similarity measure between the context information to take word cooccurrences and phrase chunks around the features into account. Then we introduce the similarity of context information to the importance measure of the features to substitute the document and term frequency. Hence we propose new context similarity-based feature selection methods. Their performance is evaluated on two protein interaction article collections and compared against the frequency-based methods. The experimental results reveal that the context similarity-based methods perform better in terms of the F1 measure and the dimension reduction rate. Benefiting from the context information surrounding the features, the proposed methods can select distinctive features effectively for protein interaction article classification.

  7. Polarimetric SAR image classification based on discriminative dictionary learning model

    Science.gov (United States)

    Sang, Cheng Wei; Sun, Hong

    2018-03-01

    Polarimetric SAR (PolSAR) image classification is one of the important applications of PolSAR remote sensing. It is a difficult high-dimension nonlinear mapping problem, the sparse representations based on learning overcomplete dictionary have shown great potential to solve such problem. The overcomplete dictionary plays an important role in PolSAR image classification, however for PolSAR image complex scenes, features shared by different classes will weaken the discrimination of learned dictionary, so as to degrade classification performance. In this paper, we propose a novel overcomplete dictionary learning model to enhance the discrimination of dictionary. The learned overcomplete dictionary by the proposed model is more discriminative and very suitable for PolSAR classification.

  8. Are preoperative histology and MRI useful for classification of endometrial cancer risk?

    International Nuclear Information System (INIS)

    Body, Noemie; Lavoué, Vincent; De Kerdaniel, Olivier; Foucher, Fabrice; Henno, Sébastien; Cauchois, Aurélie; Laviolle, Bruno; Leblanc, Marc; Levêque, Jean

    2016-01-01

    The 2010 guidelines of the French National Cancer Institute (INCa) classify patients with endometrial cancer into three risk groups for lymph node invasion and recurrence on the basis of MRI and histological analysis of an endometrial specimen obtained preoperatively. The classification guides therapeutic choices, which may include pelvic and/or para-aortic lymphadenectomy. The purpose of this study was to evaluate the diagnostic performance of preoperative assessment to help identify intermediate- or high-risk patients requiring lymphadenectomy. The study included all patients who underwent surgery for endometrial cancer between January 2010 and December 2013 at either Rennes University Hospital or Vannes Regional Hospital. The criteria for eligibility included a preoperative assessment with MRI and histological examination of an endometrial sample. A histological comparison was made between the preoperative and surgical specimens. Among the 91 patients who underwent a full preoperative assessment, the diagnosis of intermediate- or high-risk endometrial cancer was established by MRI and histology with a sensitivity of 70 %, specificity of 82 %, positive predictive value (PPV) of 87 %, negative predictive value (NPV) of 61 %, positive likelihood ratio (LR+) of 3.8 and negative likelihood ratio (LR-) of 0.3. The risk group was underestimated in 32 % of patients and overestimated in 7 % of patients. MRI underestimated endometrial cancer stage in 20 % of cases, while endometrial sampling underestimated the histological type in 4 % of cases and the grade in 9 % of cases. The preoperative assessment overestimated or underestimated the risk of recurrence in nearly 40 % of cases, with errors in lesion type, grade or stage. Erroneous preoperative risk assessment leads to suboptimal initial surgical management of patients with endometrial cancer
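    The performance figures in this abstract all derive from a 2×2 table of preoperative risk group against final surgical risk group. The abstract does not report the underlying counts, so the sketch below uses hypothetical counts chosen only to give a sensitivity of 70% and a specificity of 82%; it shows how the likelihood ratios follow from those two values.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic accuracy measures from a 2x2 confusion table.

    Assumes specificity is neither 0 nor 1, otherwise a likelihood ratio
    would require division by zero.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_pos": sensitivity / (1.0 - specificity),
        "lr_neg": (1.0 - sensitivity) / specificity,
    }


# Hypothetical counts (not the study's data): 100 true intermediate/high-risk
# patients and 100 true low-risk patients.
m = diagnostic_metrics(tp=70, fp=18, fn=30, tn=82)
print(round(m["lr_pos"], 2), round(m["lr_neg"], 2))
```

    With these inputs LR+ is about 3.89 and LR- about 0.37, close to the reported 3.8 and 0.3. PPV and NPV, unlike the likelihood ratios, depend on the share of intermediate- or high-risk patients in the sample, so the hypothetical counts are not expected to reproduce the reported 87% and 61%.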

  9. An object-oriented classification method of high resolution imagery based on improved AdaTree

    International Nuclear Information System (INIS)

    Xiaohe, Zhang; Liang, Zhai; Jixian, Zhang; Huiyong, Sang

    2014-01-01

    With the growing use of high spatial resolution remote sensing images, more and more studies have paid attention to object-oriented classification, covering both image segmentation and automatic classification after segmentation. This paper proposes a fast method of object-oriented automatic classification. First, edge-based or FNEA-based segmentation was used to identify image objects, and the values of the attributes of image objects most suitable for classification were calculated. Then a number of the image objects were selected as training samples for the improved AdaTree algorithm to obtain classification rules. Finally, the image objects could be classified easily using these rules. In AdaTree, we mainly modified the final hypothesis to obtain classification rules. In the experiment with a WorldView2 image, the AdaTree-based method showed clear improvements in accuracy and efficiency over the SVM-based method, with the kappa coefficient reaching 0.9242.

  10. Building an asynchronous web-based tool for machine learning classification.

    Science.gov (United States)

    Weber, Griffin; Vinterbo, Staal; Ohno-Machado, Lucila

    2002-01-01

    Various unsupervised and supervised learning methods including support vector machines, classification trees, linear discriminant analysis and nearest neighbor classifiers have been used to classify high-throughput gene expression data. Simpler and more widely accepted statistical tools have not yet been used for this purpose, hence proper comparisons between classification methods have not been conducted. We developed free software that implements logistic regression with stepwise variable selection as a quick and simple method for initial exploration of important genetic markers in disease classification. To implement the algorithm and allow our collaborators in remote locations to evaluate and compare its results against those of other methods, we developed a user-friendly asynchronous web-based application with a minimal amount of programming using free, downloadable software tools. With this program, we show that classification using logistic regression can perform as well as other more sophisticated algorithms, and it has the advantages of being easy to interpret and reproduce. By making the tool freely and easily available, we hope to promote the comparison of classification methods. In addition, we believe our web application can be used as a model for other bioinformatics laboratories that need to develop web-based analysis tools in a short amount of time and on a limited budget.

  11. Application of In-Segment Multiple Sampling in Object-Based Classification

    Directory of Open Access Journals (Sweden)

    Nataša Đurić

    2014-12-01

    Full Text Available When object-based analysis is applied to very high-resolution imagery, pixels within the segments reveal large spectral inhomogeneity; their distribution can be considered complex rather than normal. When normality is violated, classification methods that rely on the assumption of normally distributed data are less successful and less accurate. It is hard to detect normality violations in small samples. The segmentation process produces segments that vary widely in size, so samples can be very big or very small. This paper investigates whether the complexity within the segment can be addressed using multiple random sampling of segment pixels and multiple calculations of similarity measures. In order to analyze the effect sampling has on classification results, the statistics and probability value equations of the non-parametric two-sample Kolmogorov-Smirnov test and the parametric Student’s t-test are selected as similarity measures in the classification process. The performance of both classifiers was assessed on a WorldView-2 image for four land cover classes (roads, buildings, grass and trees) and compared to two commonly used object-based classifiers: k-Nearest Neighbor (k-NN) and Support Vector Machine (SVM). Both proposed classifiers showed a slight improvement in the overall classification accuracies and produced more accurate classification maps when compared to the ground truth image.
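
    The idea of repeatedly sampling segment pixels and recomputing a similarity measure can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: only the Kolmogorov-Smirnov statistic is used (not its probability value, and not the t-test), the repeated results are simply averaged, and all function names, class names, sample sizes and pixel values are hypothetical.

```python
import random
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def mean_ks_over_samples(segment, reference, sample_size=30, repeats=50, seed=0):
    """Average the KS statistic over repeated random subsamples of the segment pixels."""
    rng = random.Random(seed)
    k = min(sample_size, len(segment))
    total = 0.0
    for _ in range(repeats):
        total += ks_statistic(rng.sample(segment, k), reference)
    return total / repeats


def classify_segment(segment, class_references, **kwargs):
    """Assign the class whose reference pixels are most similar (smallest mean KS)."""
    scores = {
        name: mean_ks_over_samples(segment, ref, **kwargs)
        for name, ref in class_references.items()
    }
    return min(scores, key=scores.get)


# Synthetic single-band pixel values; purely illustrative, not data from the study.
rng = random.Random(1)
references = {
    "grass": [rng.gauss(0.60, 0.05) for _ in range(200)],
    "road": [rng.gauss(0.20, 0.05) for _ in range(200)],
}
segment = [rng.gauss(0.58, 0.08) for _ in range(500)]
label = classify_segment(segment, references)
print(label)
```

    Subsampling keeps the sample size fixed regardless of segment size, which is one plausible way to address the size variation the abstract describes.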

  12. Lidar-based individual tree species classification using convolutional neural network

    Science.gov (United States)

    Mizoguchi, Tomohiro; Ishii, Akira; Nakamura, Hiroyuki; Inoue, Tsuyoshi; Takamatsu, Hisashi

    2017-06-01

    Terrestrial lidar is commonly used for detailed documentation in forest inventory investigation. Recent improvements in point cloud processing techniques have enabled efficient and precise computation of individual tree shape parameters, such as breast-height diameter, height, and volume. However, tree species are still specified manually by skilled workers. Previous work on automatic tree species classification has mainly focused on aerial or satellite images, and few studies have reported classification techniques using ground-based sensor data. Several candidate sensors can be considered for classification, such as RGB or multi/hyperspectral cameras. Among these candidates, we use terrestrial lidar because it can obtain a high-resolution point cloud in a dark forest. We selected bark texture as the classification criterion, since it clearly represents the unique characteristics of each tree and does not change in appearance with seasonal variation or age-related deterioration. In this paper, we propose a new method for automatic individual tree species classification based on terrestrial lidar using a Convolutional Neural Network (CNN). The key component is the step that creates, from a point cloud, a depth image that describes the characteristics of each species well. We focus on Japanese cedar and cypress, which cover a large part of the domestic forest. Our experimental results demonstrate the effectiveness of our proposed method.

  13. Development of the cancer registration system in Belarus

    International Nuclear Information System (INIS)

    Okeanov, A.E.; Polyakov, S.M.; Sobolev, A.V.; Winkelmann, R.A.; Storm, H.H.

    1996-01-01

    Cancer registration was established in Belarus in 1953 but was not complete until the 1970s. In 1973 a computerized central cancer registry was established (files available only from 1978) based on coded and anonymous information received from each of the 12 oncological dispensaries in the country. In 1985 a computer system of dispensary control for cancer patients was set up in the oncological dispensaries in Belarus, which made it possible to identify individual cancer patients in the cancer registry. The Belarussian cancer registry records all cases of cancer, including those of the lymph-hematopoietic system, and carcinoma in situ. The registry is person-based, with information on all tumors and their treatment in a given individual. Coding and classification are carried out in accordance with ICD-9. For histology a local classification is used. The registration system is currently being modernized in order to achieve full correspondence with internationally accepted standards and to allow easy linkage to the Belarussian Chernobyl Registry.

  14. Classification of treatment-related mortality in children with cancer

    DEFF Research Database (Denmark)

    Alexander, Sarah; Pole, Jason D; Gibson, Paul

    2015-01-01

    Treatment-related mortality is an important outcome in paediatric cancer clinical trials. An international group of experts in supportive care in paediatric cancer developed a consensus-based definition of treatment-related mortality and a cause-of-death attribution system. The reliability and va...

  15. Sparse Representation Based Binary Hypothesis Model for Hyperspectral Image Classification

    Directory of Open Access Journals (Sweden)

    Yidong Tang

    2016-01-01

    Full Text Available The sparse representation based classifier (SRC) and its kernel version (KSRC) have been employed for hyperspectral image (HSI) classification. However, the state-of-the-art SRC often aims at extended surface objects with linear mixture in a smooth scene and assumes that the number of classes is given. Considering small targets against a complex background, a sparse representation based binary hypothesis (SRBBH) model is established in this paper. In this model, a query pixel is represented in two ways: by a background dictionary and by a union dictionary. The background dictionary is composed of samples selected from the local dual concentric window centered at the query pixel. Thus, for each pixel the classification issue becomes an adaptive multiclass classification problem, where only the number of desired classes is required. Furthermore, the kernel method is employed to improve the interclass separability. In kernel space, the coding vector is obtained by using the kernel-based orthogonal matching pursuit (KOMP) algorithm. Then the query pixel can be labeled by the characteristics of the coding vectors. Instead of directly using the reconstruction residuals, the different impacts the background dictionary and union dictionary have on reconstruction are used for validation and classification. This enhances the discrimination and hence improves the performance.

  16. The development of a classification schema for arts-based approaches to knowledge translation.

    Science.gov (United States)

    Archibald, Mandy M; Caine, Vera; Scott, Shannon D

    2014-10-01

    Arts-based approaches to knowledge translation are emerging as powerful interprofessional strategies with potential to facilitate evidence uptake, communication, knowledge, attitude, and behavior change across healthcare provider and consumer groups. These strategies are in the early stages of development. To date, no classification system for arts-based knowledge translation exists, which limits development and understandings of effectiveness in evidence syntheses. We developed a classification schema of arts-based knowledge translation strategies based on two mechanisms by which these approaches function: (a) the degree of precision in key message delivery, and (b) the degree of end-user participation. We demonstrate how this classification is necessary to explore how context, time, and location shape arts-based knowledge translation strategies. Classifying arts-based knowledge translation strategies according to their core attributes extends understandings of the appropriateness of these approaches for various healthcare settings and provider groups. The classification schema developed may enhance understanding of how, where, and for whom arts-based knowledge translation approaches are effective, and enable theorizing of essential knowledge translation constructs, such as the influence of context, time, and location on utilization strategies. The classification schema developed may encourage systematic inquiry into the effectiveness of these approaches in diverse interprofessional contexts. © 2014 Sigma Theta Tau International.

  17. Research on Remote Sensing Image Classification Based on Feature Level Fusion

    Science.gov (United States)

    Yuan, L.; Zhu, G.

    2018-04-01

    Remote sensing image classification, an important direction of remote sensing image processing and application, has been widely studied. However, existing classification algorithms still suffer from misclassification and missing points, which lowers the final classification accuracy. In this paper, we select Sentinel-1A and Landsat8 OLI images as data sources and propose a classification method based on feature level fusion. We compare three kinds of feature level fusion algorithms (i.e., Gram-Schmidt spectral sharpening, Principal Component Analysis transform and Brovey transform) and then select the best fused image for the classification experiment. In the classification process, we choose four image classification algorithms (i.e., Minimum distance, Mahalanobis distance, Support Vector Machine and ISODATA) for a comparison experiment. We use overall classification precision and the Kappa coefficient as the classification accuracy evaluation criteria, and the four classification results of the fused image are analysed. The experimental results show that the fusion effect of Gram-Schmidt spectral sharpening is better than that of the other methods. Among the four classification algorithms, the fused image is best suited to Support Vector Machine classification: the overall classification precision is 94.01 % and the Kappa coefficient is 0.91. The fused Sentinel-1A and Landsat8 OLI image not only has more spatial information and spectral texture characteristics, but also enhances the distinguishing features of the images. The proposed method helps improve the accuracy and stability of remote sensing image classification.
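
    The two evaluation criteria named above, overall accuracy and the Kappa coefficient, are both computed from a confusion matrix. The sketch below uses the standard Cohen's kappa definition, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The function name and the 2x2 matrix are hypothetical illustrations, not results from the study.

```python
def overall_accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    confusion[i][j] = number of samples of true class i predicted as class j.
    """
    n = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(len(confusion))) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Chance agreement: product of marginal proportions, summed over classes.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical two-class confusion matrix, for illustration only.
accuracy, kappa = overall_accuracy_and_kappa([[45, 5], [5, 45]])
print(f"overall accuracy = {accuracy:.2f}, kappa = {kappa:.2f}")
```

    Kappa is lower than overall accuracy here because it discounts the agreement that balanced classes would reach by chance alone.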

  18. Implementing a Childhood Cancer Outcomes Surveillance System Within a Population-Based Cancer Registry

    Directory of Open Access Journals (Sweden)

    Oscar Ramirez

    2018-03-01

    Full Text Available Purpose: Approximately 80% of cases of childhood cancer occur in low- and middle-income countries and are associated with high mortality rates. Assessing outcomes is essential for designing effective strategies to improve outcomes equally worldwide. We implemented a real-time surveillance system, VIGICANCER, embedded in a population-based cancer registry (PBCR) to assess childhood cancer outcomes. Methods: VIGICANCER was established in 2009 as an integral part of Cali’s PBCR to collect real-time data on outcomes of patients (age < 19 years) with a new diagnosis of cancer treated in pediatric oncology units in Cali, Colombia. Baseline and follow-up data (death, relapse, treatment abandonment, second neoplasms) were collected from medical records, hospital discharge logs, pathology reports, death certificates, and the National Public Health Insurance database. A quality assurance process was implemented for the system. Results: From 2009 to 2013, data from 1,242 patients were included in VIGICANCER: 32% of patients were younger than 5 years, 55% were male, and 15% were Afro-descendants. International Classification of Childhood Cancer group I diagnoses predominated in all age groups except children younger than 1 year old, in whom CNS tumors predominated. Five-year overall survival for all cancers was 51.7% (95% CI, 47.9% to 55.4%) for children (< 15 years), and 39.4% (95% CI, 29.8% to 50.5%) for adolescents (15 to 18.9 years). Five-year overall survival for acute lymphoblastic leukemia was 55.6% (95% CI, 48.5% to 62.2%). Conclusion: Our study demonstrates the feasibility of implementing a real-time childhood cancer outcomes surveillance system embedded in a PBCR that can guide interventions to improve clinical outcomes in low- and middle-income countries.

  19. A Quantum Hybrid PSO Combined with Fuzzy k-NN Approach to Feature Selection and Cell Classification in Cervical Cancer Detection

    Directory of Open Access Journals (Sweden)

    Abdullah M. Iliyasu

    2017-12-01

    Full Text Available A quantum hybrid (QH) intelligent approach that blends the adaptive search capability of the quantum-behaved particle swarm optimisation (QPSO) method with the intuitionistic rationality of the traditional fuzzy k-nearest neighbours (Fuzzy k-NN) algorithm (known simply as the Q-Fuzzy approach) is proposed for efficient feature selection and classification of cells in cervical smeared (CS) images. From an initial multitude of 17 features describing the geometry, colour, and texture of the CS images, the QPSO stage of our proposed technique is used to select the best subset features (i.e., global best particles) that represent a pruned down collection of seven features. Using a dataset of almost 1000 images, performance evaluation of our proposed Q-Fuzzy approach assesses the impact of our feature selection on classification accuracy by way of three experimental scenarios that are compared alongside two other approaches: the All-features (i.e., classification without prior feature selection) and another hybrid technique combining the standard PSO algorithm with the Fuzzy k-NN technique (P-Fuzzy approach). In the first and second scenarios, we further divided the assessment criteria in terms of classification accuracy based on the choice of best features and those in terms of the different categories of the cervical cells. In the third scenario, we introduced new QH hybrid techniques, i.e., QPSO combined with other supervised learning methods, and compared the classification accuracy alongside our proposed Q-Fuzzy approach. Furthermore, we employed statistical approaches to establish qualitative agreement with regard to the feature selection in the experimental scenarios 1 and 3. The synergy between the QPSO and Fuzzy k-NN in the proposed Q-Fuzzy approach improves classification accuracy as manifest in the reduction in the number of cell features, which is crucial for effective cervical cancer detection and diagnosis.
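
    For readers unfamiliar with the Fuzzy k-NN stage, the sketch below shows the commonly used inverse-distance membership rule with fuzzifier m, in which each of the k nearest neighbours votes with weight 1 / d^(2/(m-1)). It is a generic illustration under assumptions, not the paper's code: training labels are treated as crisp, the QPSO feature selection stage is not shown, and the function name, feature vectors and class labels are hypothetical.

```python
import math


def fuzzy_knn(train, query, k=3, m=2.0):
    """Return a dict mapping each class label to its fuzzy membership for `query`.

    train: list of (feature_vector, label) pairs with crisp labels (an assumption).
    m: fuzzifier (> 1) controlling how strongly closer neighbours dominate.
    """
    neighbours = sorted(
        ((math.dist(x, query), label) for x, label in train),
        key=lambda pair: pair[0],
    )[:k]
    exponent = 2.0 / (m - 1.0)
    # Guard against a zero distance when the query coincides with a training point.
    weights = [(1.0 / max(d, 1e-12) ** exponent, label) for d, label in neighbours]
    total = sum(w for w, _ in weights)
    memberships = {}
    for w, label in weights:
        memberships[label] = memberships.get(label, 0.0) + w / total
    return memberships


# Hypothetical two-feature cell descriptors, for illustration only.
train = [
    ((0.0, 0.0), "normal"),
    ((0.0, 1.0), "normal"),
    ((5.0, 5.0), "abnormal"),
    ((5.0, 6.0), "abnormal"),
]
memberships = fuzzy_knn(train, (0.5, 0.5), k=3)
predicted = max(memberships, key=memberships.get)
print(predicted, memberships)
```

    Unlike plain k-NN, the output is a graded membership per class rather than a single vote count, so borderline cells can be flagged instead of silently assigned.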

  20. Cell nuclei attributed relational graphs for efficient representation and classification of gastric cancer in digital histopathology

    Science.gov (United States)

    Sharma, Harshita; Zerbe, Norman; Heim, Daniel; Wienert, Stephan; Lohmann, Sebastian; Hellwich, Olaf; Hufnagl, Peter

    2016-03-01

    This paper describes a novel graph-based method for efficient representation and subsequent classification in histological whole slide images of gastric cancer. Her2/neu immunohistochemically stained and haematoxylin and eosin stained histological sections of gastric carcinoma are digitized. Immunohistochemical staining is used in practice by pathologists to determine extent of malignancy, however, it is laborious to visually discriminate the corresponding malignancy levels in the more commonly used haematoxylin and eosin stain, and this study attempts to solve this problem using a computer-based method. Cell nuclei are first isolated at high magnification using an automatic cell nuclei segmentation strategy, followed by construction of cell nuclei attributed relational graphs of the tissue regions. These graphs represent tissue architecture comprehensively, as they contain information about cell nuclei morphology as vertex attributes, along with knowledge of neighborhood in the form of edge linking and edge attributes. Global graph characteristics are derived and ensemble learning is used to discriminate between three types of malignancy levels, namely, non-tumor, Her2/neu positive tumor and Her2/neu negative tumor. Performance is compared with state of the art methods including four texture feature groups (Haralick, Gabor, Local Binary Patterns and Varma Zisserman features), color and intensity features, and Voronoi diagram and Delaunay triangulation. Texture, color and intensity information is also combined with graph-based knowledge, followed by correlation analysis. Quantitative assessment is performed using two cross validation strategies. On investigating the experimental results, it can be concluded that the proposed method provides a promising way for computer-based analysis of histopathological images of gastric cancer.

  1. Modern classification of neoplasms: reconciling differences between morphologic and molecular approaches

    International Nuclear Information System (INIS)

    Berman, Jules

    2005-01-01

    For over 150 years, pathologists have relied on histomorphology to classify and diagnose neoplasms. Their success has been stunning, permitting the accurate diagnosis of thousands of different types of neoplasms using only a microscope and a trained eye. In the past two decades, cancer genomics has challenged the supremacy of histomorphology by identifying genetic alterations shared by morphologically diverse tumors and by finding genetic features that distinguish subgroups of morphologically homogeneous tumors. The Developmental Lineage Classification and Taxonomy of Neoplasms groups neoplasms by their embryologic origin. The putative value of this classification is based on the expectation that tumors of a common developmental lineage will share common metabolic pathways and common responses to drugs that target these pathways. The purpose of this manuscript is to show that grouping tumors according to their developmental lineage can reconcile certain fundamental discrepancies resulting from morphologic and molecular approaches to neoplasm classification. In this study, six issues in tumor classification are described that exemplify the growing rift between morphologic and molecular approaches to tumor classification: 1) the morphologic separation between epithelial and non-epithelial tumors; 2) the grouping of tumors based on shared cellular functions; 3) the distinction between germ cell tumors and pluripotent tumors of non-germ cell origin; 4) the distinction between tumors that have lost their differentiation and tumors that arise from uncommitted stem cells; 5) the molecular properties shared by morphologically disparate tumors that have a common developmental lineage, and 6) the problem of re-classifying morphologically identical but clinically distinct subsets of tumors. The discussion of these issues in the context of describing different methods of tumor classification is intended to underscore the clinical value of a robust tumor classification. A

  2. Classification of Breast Cancer Subtypes by combining Gene Expression and DNA Methylation Data

    DEFF Research Database (Denmark)

    List, Markus; Hauschild, Anne-Christin; Tan, Qihua

    2014-01-01

    expression data for hundreds of patients, the challenge is to extract a minimal optimal set of genes with good prognostic properties from a large bulk of genes making a moderate contribution to classification. Several studies have successfully applied machine learning algorithms to solve this so-called gene...... on the transcriptomic, but also on an epigenetic level. We compared so-called random forest derived classification models based on gene expression and methylation data alone, to a model based on the combined features and to a model based on the gold standard PAM50. We obtained bootstrap errors of 10...

  3. Semantic Document Image Classification Based on Valuable Text Pattern

    Directory of Open Access Journals (Sweden)

    Hossein Pourghassem

    2011-01-01

    Full Text Available Knowledge extraction from detected document images is a complex problem in the field of information technology. This problem becomes more intricate when we consider that only a negligible percentage of the detected document images are valuable. In this paper, a segmentation-based classification algorithm is used to analyze the document image. In this algorithm, using a two-stage segmentation approach, regions of the image are detected and then classified into document and non-document (pure region) regions in a hierarchical classification. A novel definition of value is proposed to classify document images into valuable or invaluable categories. The proposed algorithm is evaluated on a database of document and non-document images obtained from the Internet. Experimental results show the efficiency of the proposed algorithm in semantic document image classification. The proposed algorithm provides an accuracy rate of 98.8% for the valuable and invaluable document image classification problem.

  4. Ship Classification with High Resolution TerraSAR-X Imagery Based on Analytic Hierarchy Process

    Directory of Open Access Journals (Sweden)

    Zhi Zhao

    2013-01-01

    Full Text Available Ship surveillance using space-borne synthetic aperture radar (SAR), taking advantage of high resolution over wide swaths and all-weather working capability, has attracted worldwide attention. Recent activity in this field has concentrated mainly on the study of ship detection, but classification is still largely an open problem. In this paper, we propose a novel ship classification scheme based on the analytic hierarchy process (AHP) in order to achieve better performance. The main idea is to apply AHP to both feature selection and the classification decision. On the one hand, the AHP-based feature selection constructs a selection decision problem based on several feature evaluation measures (e.g., discriminability, stability, and information measure) and provides objective criteria to make comprehensive decisions about their combinations quantitatively. On the other hand, we take the selected feature sets as the input of KNN classifiers and fuse the multiple classification results based on AHP, in which the feature sets’ confidence is taken into account when the AHP-based classification decision is made. We analyze the proposed classification scheme and demonstrate its results on a ship dataset that comes from TerraSAR-X SAR images.
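
    At the core of AHP is the step that turns a reciprocal pairwise comparison matrix into priority weights. The sketch below uses the row geometric mean method, a common approximation to the principal-eigenvector weights. It is an illustration only: the abstract does not say which derivation the authors used, the function name is hypothetical, and the pairwise judgments among the three evaluation measures are invented for the example.

```python
import math


def ahp_weights(pairwise):
    """Priority weights from a reciprocal pairwise comparison matrix.

    pairwise[i][j] states how much more important criterion i is than criterion j,
    with pairwise[j][i] == 1 / pairwise[i][j]. Uses the row geometric mean method.
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Hypothetical judgments: discriminability vs. stability vs. information measure.
criteria = ["discriminability", "stability", "information measure"]
judgments = [
    [1.0, 3.0, 5.0],
    [1.0 / 3.0, 1.0, 2.0],
    [1.0 / 5.0, 1.0 / 2.0, 1.0],
]
weights = ahp_weights(judgments)
for name, w in zip(criteria, weights):
    print(f"{name}: {w:.3f}")
```

    In a scheme like the one described, such weights could then be used to combine the per-measure scores of candidate feature sets, or the votes of the KNN classifiers, into a single ranking.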

  5. Profiling cancer

    DEFF Research Database (Denmark)

    Ciro, Marco; Bracken, Adrian P; Helin, Kristian

    2003-01-01

    In the past couple of years, several very exciting studies have demonstrated the enormous power of gene-expression profiling for cancer classification and prediction of patient survival. In addition to promising a more accurate classification of cancer and therefore better treatment of patients......, gene-expression profiling can result in the identification of novel potential targets for cancer therapy and a better understanding of the molecular mechanisms leading to cancer....

  6. Texture-based classification of different gastric tumors at contrast-enhanced CT

    Energy Technology Data Exchange (ETDEWEB)

    Ba-Ssalamah, Ahmed, E-mail: ahmed.ba-ssalamah@meduniwien.ac.at [Department of Radiology, Medical University of Vienna (Austria); Muin, Dina; Schernthaner, Ruediger; Kulinna-Cosentini, Christiana; Bastati, Nina [Department of Radiology, Medical University of Vienna (Austria); Stift, Judith [Department of Pathology, Medical University of Vienna (Austria); Gore, Richard [Department of Radiology, University of Chicago Pritzker School of Medicine, Chicago, IL (United States); Mayerhoefer, Marius E. [Department of Radiology, Medical University of Vienna (Austria)

    2013-10-01

    Purpose: To determine the feasibility of texture analysis for the classification of gastric adenocarcinoma, lymphoma, and gastrointestinal stromal tumors on contrast-enhanced hydrodynamic-MDCT images. Materials and methods: The arterial phase scans of 47 patients with adenocarcinoma (AC) and a histologic tumor grade of [AC-G1, n = 4; AC-G2, n = 7; AC-G3, n = 16]; GIST, n = 15; and lymphoma, n = 5, and the venous phase scans of 48 patients with AC-G1, n = 3; AC-G2, n = 6; AC-G3, n = 14; GIST, n = 17; lymphoma, n = 8, were retrospectively reviewed. Based on regions of interest, texture analysis was performed, and features derived from the gray-level histogram, run-length and co-occurrence matrix, absolute gradient, autoregressive model, and wavelet transform were calculated. Fisher coefficients, probability of classification error, average correlation coefficients, and mutual information coefficients were used to create combinations of texture features that were optimized for tumor differentiation. Linear discriminant analysis in combination with a k-nearest neighbor classifier was used for tumor classification. Results: On arterial-phase scans, texture-based lesion classification was highly successful in differentiating between AC and lymphoma, and GIST and lymphoma, with misclassification rates of 3.1% and 0%, respectively. On venous-phase scans, texture-based classification was slightly less successful for AC vs. lymphoma (9.7% misclassification) and GIST vs. lymphoma (8% misclassification), but enabled the differentiation between AC and GIST (10% misclassification), and between the different grades of AC (4.4% misclassification). No texture feature combination was able to adequately distinguish between all three tumor types. Conclusion: Classification of different gastric tumors based on textural information may aid radiologists in establishing the correct diagnosis, at least in cases where the differential diagnosis can be narrowed down to two

  7. Texture-based classification of different gastric tumors at contrast-enhanced CT

    International Nuclear Information System (INIS)

    Ba-Ssalamah, Ahmed; Muin, Dina; Schernthaner, Ruediger; Kulinna-Cosentini, Christiana; Bastati, Nina; Stift, Judith; Gore, Richard; Mayerhoefer, Marius E.

    2013-01-01

    Purpose: To determine the feasibility of texture analysis for the classification of gastric adenocarcinoma, lymphoma, and gastrointestinal stromal tumors on contrast-enhanced hydrodynamic-MDCT images. Materials and methods: The arterial phase scans of 47 patients with adenocarcinoma (AC) and a histologic tumor grade of [AC-G1, n = 4; AC-G2, n = 7; AC-G3, n = 16]; GIST, n = 15; and lymphoma, n = 5, and the venous phase scans of 48 patients with AC-G1, n = 3; AC-G2, n = 6; AC-G3, n = 14; GIST, n = 17; lymphoma, n = 8, were retrospectively reviewed. Based on regions of interest, texture analysis was performed, and features derived from the gray-level histogram, run-length and co-occurrence matrix, absolute gradient, autoregressive model, and wavelet transform were calculated. Fisher coefficients, probability of classification error, average correlation coefficients, and mutual information coefficients were used to create combinations of texture features that were optimized for tumor differentiation. Linear discriminant analysis in combination with a k-nearest neighbor classifier was used for tumor classification. Results: On arterial-phase scans, texture-based lesion classification was highly successful in differentiating between AC and lymphoma, and GIST and lymphoma, with misclassification rates of 3.1% and 0%, respectively. On venous-phase scans, texture-based classification was slightly less successful for AC vs. lymphoma (9.7% misclassification) and GIST vs. lymphoma (8% misclassification), but enabled the differentiation between AC and GIST (10% misclassification), and between the different grades of AC (4.4% misclassification). No texture feature combination was able to adequately distinguish between all three tumor types. Conclusion: Classification of different gastric tumors based on textural information may aid radiologists in establishing the correct diagnosis, at least in cases where the differential diagnosis can be narrowed down to two

  8. Automated classification of cell morphology by coherence-controlled holographic microscopy.

    Science.gov (United States)

    Strbkova, Lenka; Zicha, Daniel; Vesely, Pavel; Chmelik, Radim

    2017-08-01

    In the last few years, classification of cells by machine learning has become frequently used in biology. However, most of the approaches are based on morphometric (MO) features, which are not quantitative in terms of cell mass. This may result in poor classification accuracy. Here, we study the potential contribution of coherence-controlled holographic microscopy enabling quantitative phase imaging for the classification of cell morphologies. We compare our approach with the commonly used method based on MO features. We tested both classification approaches in an experiment with nutritionally deprived cancer tissue cells, while employing several supervised machine learning algorithms. Most of the classifiers provided higher performance when quantitative phase features were employed. Based on the results, it can be concluded that the quantitative phase features played an important role in improving the performance of the classification. The methodology could be a valuable aid in refining the monitoring of live cells in an automated fashion. We believe that coherence-controlled holographic microscopy, as a tool for quantitative phase imaging, offers all preconditions for the accurate automated analysis of live cell behavior while enabling noninvasive label-free imaging with sufficient contrast and high spatiotemporal phase sensitivity. (2017) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE).

  9. Radiomic features analysis in computed tomography images of lung nodule classification.

    Directory of Open Access Journals (Sweden)

    Chia-Hung Chen

    Full Text Available Radiomics, which extracts a large number of quantitative image features from diagnostic medical images, has been widely used for prognostication, treatment response prediction and cancer detection. The treatment options for lung nodules depend on their diagnosis, benign or malignant. Conventionally, lung nodule diagnosis is based on invasive biopsy. Recently, radiomics features, a non-invasive method based on clinical images, have shown high potential in lesion classification and treatment outcome prediction. Lung nodule classification using radiomics based on Computed Tomography (CT) image data was investigated, and a 4-feature signature was introduced for lung nodule classification. Retrospectively, 72 patients with 75 pulmonary nodules were collected. Radiomics feature extraction was performed on non-enhanced CT images with contours that were delineated by an experienced radiation oncologist. Among the 750 image features in each case, 76 features were found to have significant differences between benign and malignant lesions. A radiomics signature was composed of the best 4 features, which included Laws_LSL_min, Laws_SLL_energy, Laws_SSL_skewness and Laws_EEL_uniformity. The accuracy using the signature in benign or malignant classification was 84%, with a sensitivity of 92.85% and a specificity of 72.73%. The classification signature based on radiomics features demonstrated very good accuracy and high potential in clinical application.

  10. Ensemble Classification of Data Streams Based on Attribute Reduction and a Sliding Window

    Directory of Open Access Journals (Sweden)

    Yingchun Chen

    2018-04-01

    Full Text Available With the current increasing volume and dimensionality of data, traditional data classification algorithms are unable to satisfy the demands of practical classification applications of data streams. To deal with noise and concept drift in data streams, we propose an ensemble classification algorithm based on attribute reduction and a sliding window in this paper. Using mutual information, an approximate attribute reduction algorithm based on rough sets is used to reduce data dimensionality and increase the diversity of reduced results in the algorithm. A double-threshold concept drift detection method and a three-stage sliding window control strategy are introduced to improve the performance of the algorithm when dealing with both noise and concept drift. The classification precision is further improved by updating the base classifiers and their nonlinear weights. Experiments on synthetic datasets and actual datasets demonstrate the performance of the algorithm in terms of classification precision, memory use, and time efficiency.

  11. Classification of right-hand grasp movement based on EMOTIV Epoc+

    Science.gov (United States)

    Tobing, T. A. M. L.; Prawito, Wijaya, S. K.

    2017-07-01

    Combinations of BCT elements for right-hand grasp movement have been obtained, together with the average value of their classification accuracy. The aim of this study is to find the combination that gives the best classification accuracy of right-hand grasp movement based on an EEG headset, the EMOTIV Epoc+. There are three movement classes: grasping hand, relax, and opening hand. These classifications take advantage of the Event-Related Desynchronization (ERD) phenomenon, which makes it possible to distinguish relaxation, imagery, and movement states from each other. The combined elements are Independent Component Analysis (ICA), spectrum analysis by Fast Fourier Transform (FFT), maximum mu and beta power with their frequencies as features, and the classifiers Probabilistic Neural Network (PNN) and Radial Basis Function (RBF). The average classification accuracy is approximately 83% for training and 57% for testing. To give a better understanding of the signal quality recorded by the EMOTIV Epoc+, the classification accuracy for left- or right-hand grasping movement EEG signals (provided by Physionet) is also given, i.e. approximately 85% for training and 70% for testing. A comparison of the accuracy values from each combination, experimental condition, and external EEG data is provided for analysis of the classification accuracy.
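The "maximum mu and beta power with their frequency" features can be illustrated with a short sketch. This is not the authors' pipeline: it uses a naive discrete Fourier transform in place of an FFT library (the spectrum is the same, only slower), and the sampling rate (128 Hz), the band edges (mu 8 to 12 Hz, beta 13 to 30 Hz), and the synthetic test signal are all assumptions not stated in the abstract.

```python
import cmath
import math


def band_power_peak(signal, fs, f_lo, f_hi):
    """Return (maximum power, frequency of that maximum) within [f_lo, f_hi] Hz."""
    n = len(signal)
    best_power, best_freq = 0.0, None
    for k in range(n // 2 + 1):
        freq = k * fs / n
        if f_lo <= freq <= f_hi:
            # Naive DFT coefficient for frequency bin k.
            coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                        for t, x in enumerate(signal))
            power = abs(coeff) ** 2 / n
            if power > best_power:
                best_power, best_freq = power, freq
    return best_power, best_freq


fs = 128  # assumed sampling rate in Hz
# One second of synthetic data: a strong 10 Hz and a weaker 20 Hz component.
signal = [math.sin(2 * math.pi * 10 * t / fs)
          + 0.5 * math.sin(2 * math.pi * 20 * t / fs) for t in range(fs)]

mu_power, mu_freq = band_power_peak(signal, fs, 8, 12)        # assumed mu band
beta_power, beta_freq = band_power_peak(signal, fs, 13, 30)   # assumed beta band
features = [mu_power, mu_freq, beta_power, beta_freq]
print(features)
```

A feature vector of this shape, one per channel and trial, is the kind of input a PNN or RBF classifier would then receive.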

  12. Tolerance to missing data using a likelihood ratio based classifier for computer-aided classification of breast cancer

    International Nuclear Information System (INIS)

    Bilska-Wolak, Anna O; Floyd, Carey E Jr

    2004-01-01

    While mammography is a highly sensitive method for detecting breast tumours, its ability to differentiate between malignant and benign lesions is low, which may result in as many as 70% of biopsies being unnecessary. The purpose of this study was to develop a highly specific computer-aided diagnosis algorithm to improve classification of mammographic masses. A classifier based on the likelihood ratio was developed to accommodate cases with missing data. Data for development included 671 biopsy cases (245 malignant), with biopsy-proven outcome. Sixteen features based on the BI-RADS™ lexicon and patient history had been recorded for the cases, with 1.3 ± 1.1 missing feature values per case. Classifier evaluation methods included receiver operating characteristic and leave-one-out bootstrap sampling. The classifier achieved 32% specificity at 100% sensitivity on the 671 cases with 16 features that had missing values. Utilizing just the seven features present for all cases resulted in decreased performance at 100% sensitivity, with an average specificity of 19%. No cases and no feature data were omitted during classifier development, showing that it is more beneficial to utilize cases with missing values than to discard incomplete cases that cannot be handled by many algorithms. Classification of mammographic masses was commendable at high sensitivity levels, indicating that benign cases could potentially be spared from biopsy.
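One simple way a likelihood-ratio classifier can tolerate missing values is sketched below. The abstract does not describe the form of the authors' classifier, so this is an assumption: features are treated as conditionally independent, each observed feature contributes the ratio P(value | malignant) / P(value | benign), and a missing feature contributes a neutral factor of 1. The feature names and probabilities are hypothetical.

```python
def likelihood_ratio(case, p_malignant, p_benign):
    """Product of per-feature likelihood ratios, skipping missing features."""
    lr = 1.0
    for feature, value in case.items():
        if value is None:
            # Missing value: no evidence either way, so the factor is 1.
            continue
        lr *= p_malignant[feature][value] / p_benign[feature][value]
    return lr


# Hypothetical conditional probabilities P(value | class), for illustration only.
p_malignant = {"margin": {"spiculated": 0.6, "circumscribed": 0.4},
               "age_group": {"under_50": 0.3, "50_plus": 0.7}}
p_benign = {"margin": {"spiculated": 0.1, "circumscribed": 0.9},
            "age_group": {"under_50": 0.6, "50_plus": 0.4}}

complete = {"margin": "spiculated", "age_group": "50_plus"}
partial = {"margin": "spiculated", "age_group": None}   # age not recorded

lr_complete = likelihood_ratio(complete, p_malignant, p_benign)
lr_partial = likelihood_ratio(partial, p_malignant, p_benign)
print(lr_complete, lr_partial)
```

Under this scheme an incomplete case is still scored from whatever was recorded, which is the behaviour the abstract argues for over discarding such cases.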

  13. Gene selection and classification for cancer microarray data based on machine learning and similarity measures

    Directory of Open Access Journals (Sweden)

    Liu Qingzhong

    2011-12-01

    Full Text Available Abstract. Background: Microarray data have a high dimension of variables and a small sample size. In microarray data analyses, two important issues are how to choose genes, which provide reliable and good prediction for disease status, and how to determine the final gene set that is best for classification. Associations among genetic markers mean one can exploit information redundancy to potentially reduce classification cost in terms of time and money. Results: To deal with redundant information and improve classification, we propose a gene selection method, Recursive Feature Addition, which combines supervised learning and statistical similarity measures. To determine the final optimal gene set for prediction and classification, we propose an algorithm, Lagging Prediction Peephole Optimization. By using six benchmark microarray gene expression data sets, we compared Recursive Feature Addition with recently developed gene selection methods: Support Vector Machine Recursive Feature Elimination, Leave-One-Out Calculation Sequential Forward Selection and several others. Conclusions: On average, with the use of popular learning machines including Nearest Mean Scaled Classifier, Support Vector Machine, Naive Bayes Classifier and Random Forest, Recursive Feature Addition outperformed other methods. Our studies also showed that Lagging Prediction Peephole Optimization is superior to random strategy; Recursive Feature Addition with Lagging Prediction Peephole Optimization obtained better testing accuracies than the gene selection method varSelRF.

  14. Constructing Support Vector Machine Ensembles for Cancer Classification Based on Proteomic Profiling

    Institute of Scientific and Technical Information of China (English)

    Yong Mao; Xiao-Bo Zhou; Dao-Ying Pi; You-Xian Sun

    2005-01-01

    In this study, we present a constructive algorithm for training cooperative support vector machine ensembles (CSVMEs). CSVME combines ensemble architecture design with cooperative training for individual SVMs in ensembles. Unlike most previous studies on training ensembles, CSVME puts emphasis on both accuracy and collaboration among individual SVMs in an ensemble. A group of SVMs selected on the basis of recursive classifier elimination is used in CSVME, and the number of the individual SVMs selected to construct CSVME is determined by 10-fold cross-validation. This kind of SVME has been tested on two ovarian cancer datasets previously obtained by proteomic mass spectrometry. By combining several individual SVMs, the proposed method achieves better performance than the SVME of all base SVMs.

  15. A DIMENSION REDUCTION-BASED METHOD FOR CLASSIFICATION OF HYPERSPECTRAL AND LIDAR DATA

    Directory of Open Access Journals (Sweden)

    B. Abbasi

    2015-12-01

    Full Text Available The existence of various natural objects such as grass, trees, and rivers, along with artificial man-made features such as buildings and roads, makes it difficult to classify ground objects. Consequently, using a single data source or a simple classification approach cannot improve classification results in object identification. Using a variety of data from different sensors, by contrast, increases the accuracy of spatial and spectral information. In this paper, we propose a classification algorithm for the joint use of hyperspectral and Lidar (Light Detection and Ranging) data based on dimension reduction. First, feature extraction techniques are applied to obtain more information from the Lidar and hyperspectral data. Principal Component Analysis (PCA) and Minimum Noise Fraction (MNF) are used to reduce the dimension of the spectral features; the 30 features containing the most information of the hyperspectral images are retained for both PCA and MNF. In addition, the Normalized Difference Vegetation Index (NDVI) is computed to highlight vegetation. Furthermore, features are extracted from the Lidar data based on the relation between every pixel and its surrounding pixels in local neighbourhood windows, using the Grey Level Co-occurrence Matrix (GLCM). In the second step, classification is performed on all features obtained by MNF, PCA, NDVI and GLCM, trained by class samples. After this step, two classification maps are obtained by an SVM classifier with MNF+NDVI+GLCM features and PCA+NDVI+GLCM features, respectively. Finally, the classified images are fused to create the final classification map by a decision fusion strategy based on majority voting.
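The NDVI step mentioned above follows the standard definition NDVI = (NIR - Red) / (NIR + Red). The sketch below applies it pixel by pixel to two small reflectance grids. The reflectance values are made up for illustration, and the choice of which hyperspectral bands stand in for "NIR" and "Red" is not given in the abstract.

```python
def ndvi(nir, red):
    """NDVI for one pixel; returns 0.0 when both reflectances are zero."""
    total = nir + red
    return (nir - red) / total if total != 0 else 0.0


# Made-up reflectance values for a 2 x 2 image (top row vegetated, bottom row not).
nir_band = [[0.50, 0.45], [0.10, 0.30]]
red_band = [[0.10, 0.05], [0.10, 0.30]]

ndvi_map = [[ndvi(n, r) for n, r in zip(nir_row, red_row)]
            for nir_row, red_row in zip(nir_band, red_band)]
print(ndvi_map)
```

Vegetated pixels reflect strongly in the near infrared and weakly in red, so they score close to 1, while bare or built surfaces score near 0.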

  16. Yarn-dyed fabric defect classification based on convolutional neural network

    Science.gov (United States)

    Jing, Junfeng; Dong, Amei; Li, Pengfei; Zhang, Kaibing

    2017-09-01

    Considering that manual inspection of the yarn-dyed fabric can be time consuming and inefficient, we propose a yarn-dyed fabric defect classification method by using a convolutional neural network (CNN) based on a modified AlexNet. CNN shows powerful ability in performing feature extraction and fusion by simulating the learning mechanism of human brain. The local response normalization layers in AlexNet are replaced by the batch normalization layers, which can enhance both the computational efficiency and classification accuracy. In the training process of the network, the characteristics of the defect are extracted step by step and the essential features of the image can be obtained from the fusion of the edge details with several convolution operations. Then the max-pooling layers, the dropout layers, and the fully connected layers are employed in the classification model to reduce the computation cost and extract more precise features of the defective fabric. Finally, the results of the defect classification are predicted by the softmax function. The experimental results show promising performance with an acceptable average classification rate and strong robustness on yarn-dyed fabric defect classification.
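The final softmax step converts the network's raw class scores into probabilities that sum to one. A minimal sketch follows; the three scores are hypothetical and do not come from the paper, and the max-subtraction is a common numerical-stability trick rather than something the abstract specifies.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(logits)                       # subtract the max to avoid overflow
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 0.5, -1.0]                 # hypothetical scores for three defect classes
probs = softmax(logits)
predicted = probs.index(max(probs))       # index of the most probable class
print(probs, predicted)
```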

  17. Sequencing-based breast cancer diagnostics as an alternative to routine biomarkers.

    Science.gov (United States)

    Rantalainen, Mattias; Klevebring, Daniel; Lindberg, Johan; Ivansson, Emma; Rosin, Gustaf; Kis, Lorand; Celebioglu, Fuat; Fredriksson, Irma; Czene, Kamila; Frisell, Jan; Hartman, Johan; Bergh, Jonas; Grönberg, Henrik

    2016-11-30

    Sequencing-based breast cancer diagnostics have the potential to replace routine biomarkers and provide molecular characterization that enables personalized precision medicine. Here we investigate the concordance between sequencing-based and routine diagnostic biomarkers and to what extent tumor sequencing contributes clinically actionable information. We applied DNA- and RNA-sequencing to characterize tumors from 307 breast cancer patients with replication in up to 739 patients. We developed models to predict status of routine biomarkers (ER, HER2, Ki-67, histological grade) from sequencing data. Non-routine biomarkers, including mutations in BRCA1, BRCA2 and ERBB2 (HER2), and additional clinically actionable somatic alterations were also investigated. Concordance with routine diagnostic biomarkers was high for ER status (AUC = 0.95; AUC(replication) = 0.97) and HER2 status (AUC = 0.97; AUC(replication) = 0.92). The transcriptomic grade model enabled classification of histological grade 1 and histological grade 3 tumors with high accuracy (AUC = 0.98; AUC(replication) = 0.94). Clinically actionable mutations in BRCA1, BRCA2 and ERBB2 (HER2) were detected in 5.5% of patients, while 53% had genomic alterations matching ongoing or concluded breast cancer studies. Sequencing-based molecular profiling can be applied as an alternative to histopathology to determine ER and HER2 status, in addition to providing improved tumor grading and clinically actionable mutations and molecular subtypes. Our results suggest that sequencing-based breast cancer diagnostics can replace routine biomarkers in the near future.
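The AUC values quoted above can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. The sketch below computes AUC directly from that definition by comparing every positive/negative pair. The scores are invented for illustration and have no connection to the study's data or models.

```python
def auc(scores_pos, scores_neg):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Invented model scores for illustration only.
er_positive = [0.9, 0.8, 0.7, 0.4]
er_negative = [0.6, 0.3, 0.2]
value = auc(er_positive, er_negative)
print(round(value, 3))
```

Only one of the twelve pairs is ranked wrongly here (0.4 versus 0.6), so the result is 11/12.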

  18. Generative embedding for model-based classification of fMRI data.

    Directory of Open Access Journals (Sweden)

    Kay H Brodersen

    2011-06-01

    Full Text Available Decoding models, such as those underlying multivariate classification algorithms, have been increasingly used to infer cognitive or clinical brain states from measures of brain activity obtained by functional magnetic resonance imaging (fMRI). The practicality of current classifiers, however, is restricted by two major challenges. First, due to the high data dimensionality and low sample size, algorithms struggle to separate informative from uninformative features, resulting in poor generalization performance. Second, popular discriminative methods such as support vector machines (SVMs) rarely afford mechanistic interpretability. In this paper, we address these issues by proposing a novel generative-embedding approach that incorporates neurobiologically interpretable generative models into discriminative classifiers. Our approach extends previous work on trial-by-trial classification for electrophysiological recordings to subject-by-subject classification for fMRI and offers two key advantages over conventional methods: it may provide more accurate predictions by exploiting discriminative information encoded in 'hidden' physiological quantities such as synaptic connection strengths; and it affords mechanistic interpretability of clinical classifications. Here, we introduce generative embedding for fMRI using a combination of dynamic causal models (DCMs) and SVMs. We propose a general procedure of DCM-based generative embedding for subject-wise classification, provide a concrete implementation, and suggest good-practice guidelines for unbiased application of generative embedding in the context of fMRI. We illustrate the utility of our approach by a clinical example in which we classify moderately aphasic patients and healthy controls using a DCM of thalamo-temporal regions during speech processing. Generative embedding achieves a near-perfect balanced classification accuracy of 98% and significantly outperforms conventional activation-based and

  19. Object-Based Crop Species Classification Based on the Combination of Airborne Hyperspectral Images and LiDAR Data

    Directory of Open Access Journals (Sweden)

    Xiaolong Liu

    2015-01-01

    Full Text Available Identification of crop species is an important issue in agricultural management. In recent years, many studies have explored this topic using multi-spectral and hyperspectral remote sensing data. In this study, we propose a framework for mapping crop species by combining hyperspectral and Light Detection and Ranging (LiDAR) data in an object-based image analysis (OBIA) paradigm. The aims of this work were the following: (i) to understand the performances of different spectral dimension-reduced features from hyperspectral data and their combination with LiDAR-derived height information in image segmentation; (ii) to understand what classification accuracies of crop species can be achieved by combining hyperspectral and LiDAR data in an OBIA paradigm, especially in regions that have fragmented agricultural landscape and complicated crop planting structure; and (iii) to understand the contributions of the crop height that is derived from LiDAR data, as well as the geometric and textural features of image objects, to the crop species' separabilities. The study region was an irrigated agricultural area in the central Heihe river basin, which is characterized by many crop species, complicated crop planting structures, and fragmented landscape. The airborne hyperspectral data acquired by the Compact Airborne Spectrographic Imager (CASI) with a 1 m spatial resolution and the Canopy Height Model (CHM) data derived from the LiDAR data acquired by the airborne Leica ALS70 LiDAR system were used for this study. The image segmentation accuracies of different feature combination schemes (very high-resolution imagery (VHR), VHR/CHM, and minimum noise fractional transformed data (MNF)/CHM) were evaluated and analyzed. The results showed that VHR/CHM outperformed the other two combination schemes with a segmentation accuracy of 84.8%. The object-based crop species classification results of different feature integrations indicated that

  20. Improving Generalization Based on l1-Norm Regularization for EEG-Based Motor Imagery Classification

    Directory of Open Access Journals (Sweden)

    Yuwei Zhao

    2018-05-01

    Full Text Available Multichannel electroencephalography (EEG) is widely used in typical brain-computer interface (BCI) systems. In general, an EEG classification algorithm requires a large number of parameters because of the redundant features involved in EEG signals. However, the generalization of an EEG method is often adversely affected by model complexity, which is closely tied to the number of undetermined parameters and can lead to heavy overfitting. To decrease the complexity and improve the generalization of the EEG method, we present a novel l1-norm-based approach that combines the decision values obtained from each EEG channel directly. By extracting the information from different channels on independent frequency bands (FB) with l1-norm regularization, the proposed method fits the training data with far fewer parameters than common spatial pattern (CSP) methods, in order to reduce overfitting. Moreover, an effective and efficient solution for minimizing the optimization objective is proposed. The experimental results on dataset IVa of BCI competition III and dataset I of BCI competition IV show that the proposed method achieves high classification accuracy and improves generalization performance for the classification of MI EEG. As the training set ratio decreases from 80 to 20%, the average classification accuracy on the two datasets changes from 85.86 and 86.13% to 84.81 and 76.59%, respectively. The classification performance and generalization of the proposed method contribute to the practical application of MI-based BCI systems.

  1. Video based object representation and classification using multiple covariance matrices.

    Science.gov (United States)

    Zhang, Yurong; Liu, Quan

    2017-01-01

    Video-based object recognition and classification has been widely studied in the computer vision and image processing areas. One main issue of this task is to develop an effective representation for video. This problem can generally be formulated as image set representation. In this paper, we present a new method called Multiple Covariance Discriminative Learning (MCDL) for the image set representation and classification problem. The core idea of MCDL is to represent an image set using multiple covariance matrices, with each covariance matrix representing one cluster of images. First, we use the Nonnegative Matrix Factorization (NMF) method to cluster the images within each image set, and then adopt Covariance Discriminative Learning on each cluster (subset) of images. Finally, we adopt KLDA and the nearest neighbor classification method for image set classification. Promising experimental results on several datasets show the effectiveness of our MCDL method.

  2. Torrent classification - Base of rational management of erosive regions

    International Nuclear Information System (INIS)

    Gavrilovic, Zoran; Stefanovic, Milutin; Milovanovic, Irina; Cotric, Jelena; Milojevic, Mileta

    2008-01-01

    A complex methodology for torrents and erosion and the associated calculations was developed during the second half of the twentieth century in Serbia. It was the 'Erosion Potential Method'. One of the modules of that complex method was focused on torrent classification. The module enables the identification of hydrographic, climatic and erosion characteristics. The method makes it possible for each torrent, regardless of its magnitude, to be simply and recognizably described by the 'Formula of torrentially'. The above torrent classification is the base on which a set of optimisation calculations is developed for the required scope of erosion-control works and measures, the application of which enables the management of significantly larger erosion and torrential regions compared to the previous period. This paper will present the procedure and the method of torrent classification.

  3. [Role of contemporary pathological diagnostics in the personalized treatment of cancer].

    Science.gov (United States)

    Tímár, József

    2013-03-01

    Due to the developments of pathology in the past decades (immunohistochemistry and molecular pathology), the classification of cancers has changed fundamentally, laying the groundwork for personalized management of cancer patients. Our picture of cancer is more complex today, identifying the genetic basis of the morphological variants. On the other hand, this picture has a much higher resolution, enabling us to subclassify similar histological cancer types based on molecular markers. This redefined classification of cancers helps us to better predict the possible biological behavior of the disease and/or the therapeutic sensitivity, opening the way toward a more personalized treatment of this disease. The redefined molecular classification of cancer may affect the universal application of treatment protocols. To achieve this goal, molecular diagnostics must be an integral and reimbursed part of routine pathological diagnostics. On the other hand, it is time to extend the multidisciplinary team with a molecular pathologist to improve the decision-making process in the management of cancer patients.

  4. A classification model of Hyperion image base on SAM combined decision tree

    Science.gov (United States)

    Wang, Zhenghai; Hu, Guangdao; Zhou, YongZhang; Liu, Xin

    2009-10-01

    Monitoring the Earth using imaging spectrometers has necessitated more accurate analyses and new applications to remote sensing. A very high dimensional input space requires an exponentially large amount of data to adequately and reliably represent the classes in that space. On the other hand, with increase in the input dimensionality the hypothesis space grows exponentially, which makes the classification performance highly unreliable. Classification of hyperspectral images with traditional classification algorithms is challenging, and new algorithms have to be developed for hyperspectral data classification. The Spectral Angle Mapper (SAM) is a physically-based spectral classification that uses an n-dimensional angle to match pixels to reference spectra. The algorithm determines the spectral similarity between two spectra by calculating the angle between the spectra, treating them as vectors in a space with dimensionality equal to the number of bands. The key difficulty is that the SAM threshold has to be defined manually, and the classification precision depends on how reasonable that threshold is. To resolve this problem, this paper proposes a new automatic classification model for remote sensing images using SAM combined with a decision tree. It can automatically choose the appropriate SAM threshold and improve the classification precision of SAM based on the analysis of field spectra. The test area, located in Heqing, Yunnan, was imaged by the EO-1 Hyperion imaging spectrometer using 224 bands in the visible and near infrared. The area included limestone areas, rock fields, soil and forests. The area was classified into four different vegetation and soil types. The results show that this method chooses the appropriate SAM threshold and effectively eliminates the disturbance and influence of unwanted objects, so as to improve the classification precision. Compared with the likelihood classification by field survey data, the classification precision of this model
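The spectral angle at the core of SAM is the angle between a pixel spectrum and a reference spectrum, both treated as vectors with one component per band. The sketch below computes that angle and assigns the closest reference class, rejecting the pixel when the angle exceeds a threshold. The threshold here is a fixed, hand-picked value, which is exactly the manual step the paper aims to automate; the decision-tree threshold selection is not reproduced. The four-band spectra and class names are hypothetical.

```python
import math


def spectral_angle(pixel, reference):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm_p = math.sqrt(sum(p * p for p in pixel))
    norm_r = math.sqrt(sum(r * r for r in reference))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_angle = max(-1.0, min(1.0, dot / (norm_p * norm_r)))
    return math.acos(cos_angle)


def classify(pixel, references, threshold):
    """Return the label of the closest reference, or None if no angle is within threshold."""
    label, angle = min(((name, spectral_angle(pixel, ref))
                        for name, ref in references.items()),
                       key=lambda item: item[1])
    return label if angle <= threshold else None


# Hypothetical 4-band reference spectra.
references = {"soil": [0.30, 0.35, 0.40, 0.45],
              "forest": [0.05, 0.08, 0.04, 0.50]}
pixel = [0.15, 0.18, 0.20, 0.22]      # roughly a darker copy of the soil spectrum
label = classify(pixel, references, threshold=0.10)
print(label)
```

Because the angle ignores vector length, a darker pixel with the same spectral shape still matches its reference, which is why SAM is relatively insensitive to illumination differences.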

  5. Joint Probability-Based Neuronal Spike Train Classification

    Directory of Open Access Journals (Sweden)

    Yan Chen

    2009-01-01

    Full Text Available Neuronal spike trains are used by the nervous system to encode and transmit information. Euclidean distance-based methods (EDBMs) have been applied to quantify the similarity between temporally-discretized spike trains and model responses. In this study, using the same discretization procedure, we developed and applied a joint probability-based method (JPBM) to classify individual spike trains of slowly adapting pulmonary stretch receptors (SARs). The activity of individual SARs was recorded in anaesthetized, paralysed adult male rabbits, which were artificially ventilated at a constant rate and one of three different volumes. Two-thirds of the responses to the 600 stimuli presented at each volume were used to construct three response models (one for each stimulus volume), each consisting of a series of time bins with associated spike probabilities. The remaining one-third of the responses were used as test responses to be classified into one of the three model responses. This was done by computing the joint probability of observing the same series of events (spikes or no spikes, dictated by the test response) in a given model and determining which of the three probabilities was highest. The JPBM generally produced better classification accuracy than the EDBM, and both performed well above chance. Both methods were similarly affected by variations in discretization parameters, response epoch duration, and two different response alignment strategies. Increasing bin widths increased classification accuracy, which also improved with increased observation time, but primarily during periods of increasing lung inflation. Thus, the JPBM is a simple and effective method for performing spike train classification.
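The joint-probability classification described above can be sketched as follows. Each model is a list of per-bin spike probabilities; a test response is a list of 0/1 events; the score is the probability of that exact event sequence under the model. Two things here are assumptions rather than details from the abstract: bins are treated as independent, and log-probabilities are summed instead of multiplying raw probabilities (this leaves the winner unchanged but avoids numerical underflow). The model names and probabilities are hypothetical.

```python
import math


def log_joint_probability(test_bins, model_probs, eps=1e-9):
    """Log-probability of a 0/1 spike sequence under per-bin spike probabilities."""
    total = 0.0
    for spike, p in zip(test_bins, model_probs):
        p = min(max(p, eps), 1.0 - eps)       # keep log() finite for p = 0 or 1
        total += math.log(p if spike else 1.0 - p)
    return total


def classify(test_bins, models):
    """Return the name of the model under which the test response is most probable."""
    return max(models, key=lambda name: log_joint_probability(test_bins, models[name]))


# Hypothetical per-bin spike probabilities for three stimulus volumes.
models = {"low": [0.1, 0.2, 0.1, 0.2],
          "medium": [0.5, 0.5, 0.5, 0.5],
          "high": [0.9, 0.8, 0.9, 0.8]}
test_response = [1, 0, 1, 1]                  # spike, no spike, spike, spike
best = classify(test_response, models)
print(best)
```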

  6. Natural Language Processing Based Instrument for Classification of Free Text Medical Records

    Directory of Open Access Journals (Sweden)

    Manana Khachidze

    2016-01-01

    Full Text Available According to the Ministry of Labor, Health and Social Affairs of Georgia, a new health management system has to be introduced in the near future. In this context arises the problem of structuring and classifying documents containing the full history of medical services provided. The present work introduces an instrument for the classification of medical records written in the Georgian language. It is the first attempt at such classification of Georgian-language medical records. In total, 24,855 examination records were studied. The documents were classified into three main groups (ultrasonography, endoscopy, and X-ray) and 13 subgroups using two well-known methods: Support Vector Machine (SVM) and K-Nearest Neighbor (KNN). The results demonstrated that both machine learning methods performed successfully, with SVM slightly ahead. In the process of classification a “shrink” method, based on feature selection, was introduced and applied. At the first stage of classification the results of the “shrink” case were better; however, at the second stage of classification into subclasses, 23% of all documents could not be linked to only one definite individual subclass (liver or binary system) due to common features characterizing these subclasses. The overall results of the study were successful.

  7. Ensemble based system for whole-slide prostate cancer probability mapping using color texture features.

    LENUS (Irish Health Repository)

    DiFranco, Matthew D

    2011-01-01

    We present a tile-based approach for producing clinically relevant probability maps of prostatic carcinoma in histological sections from radical prostatectomy. Our methodology incorporates ensemble learning for feature selection and classification on expert-annotated images. Random forest feature selection performed over varying training sets provides a subset of generalized CIEL*a*b* co-occurrence texture features, while sample selection strategies with minimal constraints reduce training data requirements to achieve reliable results. Ensembles of classifiers are built using expert-annotated tiles from training images, and scores for the probability of cancer presence are calculated from the responses of each classifier in the ensemble. Spatial filtering of tile-based texture features prior to classification results in increased heat-map coherence as well as AUC values of 95% using ensembles of either random forests or support vector machines. Our approach is designed for adaptation to different imaging modalities, image features, and histological decision domains.

  8. The IASLC Lung Cancer Staging Project

    DEFF Research Database (Denmark)

    Chansky, Kari; Detterbeck, Frank C; Nicholson, Andrew G

    2017-01-01

    INTRODUCTION: Revisions to the TNM stage classifications for lung cancer, informed by the international database (N = 94,708) of the International Association for the Study of Lung Cancer (IASLC) Staging and Prognostic Factors Committee, need external validation. The objective was to externally...... demonstrated consistent ability to discriminate TNM categories and stage groups for clinical and pathologic stage. CONCLUSIONS: The IASLC revisions made for the eighth edition of lung cancer staging are validated by this analysis of the NCDB database by the ordering, statistical differences, and homogeneity...... validate the revisions by using the National Cancer Data Base (NCDB) of the American College of Surgeons. METHODS: Cases presenting from 2000 through 2012 were drawn from the NCDB and reclassified according to the eighth edition stage classification. Clinically and pathologically staged subsets of NSCLC...

  9. Classification of Hearing Loss Disorders Using Teoae-Based Descriptors

    Science.gov (United States)

    Hatzopoulos, Stavros Dimitris

    Transiently Evoked Otoacoustic Emissions (TEOAE) are signals produced by the cochlea upon stimulation by an acoustic click. Within the context of this dissertation, it was hypothesized that the relationship between the TEOAEs and the functional status of the OHCs provided an opportunity for designing a TEOAE-based clinical procedure that could be used to assess cochlear function. To understand the nature of the TEOAE signals in the time and the frequency domain, several different analyses were performed. Using normative Input-Output (IO) curves, short-time FFT analyses and cochlear computer simulations, it was found that optimizing the hearing loss classification requires a complete 20 ms TEOAE segment. It was also determined that various 2-D filtering methods (median and averaging filtering masks, LP-FFT) used to enhance the TEOAE S/N offered minimal improvement (less than 6 dB per stimulus level). Higher S/N improvements resulted in TEOAE sequences that were over-smoothed. The final classification algorithm was based on a statistical analysis of raw FFT data and, when applied to a sample set of clinically obtained TEOAE recordings (from 56 normal and 66 hearing-loss subjects), correctly identified 94.3% of the normal and 90% of the hearing-loss subjects at the 80 dB SPL stimulus level. To enhance the discrimination between the conductive and the sensorineural populations, data from the 68 dB SPL stimulus level were used, which yielded a normal classification of 90.2%, a hearing-loss classification of 87.5% and a conductive-sensorineural classification of 87%. Among the hearing-loss populations, the best discrimination was obtained in the otosclerosis group and the worst in the acute acoustic trauma group.

  10. Radiographic classification for fractures of the fifth metatarsal base

    International Nuclear Information System (INIS)

    Mehlhorn, Alexander T.; Zwingmann, Joern; Hirschmueller, Anja; Suedkamp, Norbert P.; Schmal, Hagen

    2014-01-01

    Avulsion fractures of the fifth metatarsal base (MTB5) are common forefoot injuries. Based on a radiomorphometric analysis reflecting the risk for a secondary displacement, a new classification was developed. A cohort of 95 healthy, athletic, and young patients (age ≤ 50 years) with avulsion fractures of the MTB5 was included in the study and divided into groups with non-displaced, primary-displaced, and secondary-displaced fractures. Radiomorphometric data obtained using standard oblique and dorso-plantar views were analyzed in association with secondary displacement. Based on this, a classification was developed and checked for reproducibility. Fractures with a longer distance between the lateral edge of the styloid process and the lateral fracture step-off and fractures with a more medial joint entry of the fracture line at the MTB5 are at higher risk of secondary displacement. Based on these findings, all fractures were divided into three types: type I with a fracture entry in the lateral third; type II in the middle third; and type III in the medial third of the MTB5. Additionally, the three types were subdivided into an A-type with a fracture displacement <2 mm and a B-type with a fracture displacement ≥ 2 mm. A substantial level of interobserver agreement was found in the assignment of all 95 fractures to the six fracture types (κ = 0.72). The secondary displacement of fractures was confirmed by all examiners in 100 % of cases. Radiomorphometric data may identify fractures at risk for secondary displacement of the MTB5. Based on this, a reliable classification was developed. (orig.)
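
    The agreement statistic quoted above (κ = 0.72) can be illustrated with a minimal sketch of Cohen's kappa for two raters. The abstract does not say which kappa variant the authors used for their examiners, so the two-rater form is an assumption here, and the function name and the fracture labels below are hypothetical illustration data, not values from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical assignments of eight fractures to the six types (I-III, A/B).
rater_1 = ["IA", "IA", "IIA", "IIB", "IIIA", "IIIB", "IB", "IIA"]
rater_2 = ["IA", "IB", "IIA", "IIB", "IIIA", "IIIA", "IB", "IIA"]
print(f"kappa = {cohen_kappa(rater_1, rater_2):.2f}")
```

    On the commonly cited Landis and Koch scale, values from 0.61 to 0.80 are read as substantial agreement, which matches the wording of the abstract.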

  11. Radiographic classification for fractures of the fifth metatarsal base

    Energy Technology Data Exchange (ETDEWEB)

    Mehlhorn, Alexander T.; Zwingmann, Joern; Hirschmueller, Anja; Suedkamp, Norbert P.; Schmal, Hagen [University of Freiburg Medical Center, Department of Orthopaedic Surgery, Freiburg (Germany)

    2014-04-15

    Avulsion fractures of the fifth metatarsal base (MTB5) are common forefoot injuries. Based on a radiomorphometric analysis reflecting the risk for a secondary displacement, a new classification was developed. A cohort of 95 healthy, athletic, and young patients (age ≤ 50 years) with avulsion fractures of the MTB5 was included in the study and divided into groups with non-displaced, primary-displaced, and secondary-displaced fractures. Radiomorphometric data obtained using standard oblique and dorso-plantar views were analyzed in association with secondary displacement. Based on this, a classification was developed and checked for reproducibility. Fractures with a longer distance between the lateral edge of the styloid process and the lateral fracture step-off and fractures with a more medial joint entry of the fracture line at the MTB5 are at higher risk of secondary displacement. Based on these findings, all fractures were divided into three types: type I with a fracture entry in the lateral third; type II in the middle third; and type III in the medial third of the MTB5. Additionally, the three types were subdivided into an A-type with a fracture displacement <2 mm and a B-type with a fracture displacement ≥ 2 mm. A substantial level of interobserver agreement was found in the assignment of all 95 fractures to the six fracture types (κ = 0.72). The secondary displacement of fractures was confirmed by all examiners in 100 % of cases. Radiomorphometric data may identify fractures at risk for secondary displacement of the MTB5. Based on this, a reliable classification was developed. (orig.)

  12. Network-Based Isoform Quantification with RNA-Seq Data for Cancer Transcriptome Analysis.

    Directory of Open Access Journals (Sweden)

    Wei Zhang

    2015-12-01

    High-throughput mRNA sequencing (RNA-Seq) is widely used for transcript quantification of gene isoforms. Since RNA-Seq data alone is often not sufficient to accurately identify the read origins from the isoforms for quantification, we propose to explore protein domain-domain interactions as prior knowledge for integrative analysis with RNA-Seq data. We introduce a Network-based method for RNA-Seq-based Transcript Quantification (Net-RSTQ) to integrate protein domain-domain interaction network with short read alignments for transcript abundance estimation. Based on our observation that the abundances of the neighboring isoforms by domain-domain interactions in the network are positively correlated, Net-RSTQ models the expression of the neighboring transcripts as Dirichlet priors on the likelihood of the observed read alignments against the transcripts in one gene. The transcript abundances of all the genes are then jointly estimated with alternating optimization of multiple EM problems. In simulation Net-RSTQ effectively improved isoform transcript quantifications when isoform co-expressions correlate with their interactions. qRT-PCR results on 25 multi-isoform genes in a stem cell line, an ovarian cancer cell line, and a breast cancer cell line also showed that Net-RSTQ estimated more consistent isoform proportions with RNA-Seq data. In the experiments on the RNA-Seq data in The Cancer Genome Atlas (TCGA), the transcript abundances estimated by Net-RSTQ are more informative for patient sample classification of ovarian cancer, breast cancer and lung cancer. All experimental results collectively support that Net-RSTQ is a promising approach for isoform quantification. Net-RSTQ toolbox is available at http://compbio.cs.umn.edu/Net-RSTQ/.

  13. Chinese wine classification system based on micrograph using combination of shape and structure features

    Science.gov (United States)

    Wan, Yi

    2011-06-01

    Chinese wines can be classified or graded by their micrographs. Micrographs of Chinese wines show floccules, sticks and granules of varying shape and size. Because different wines have different microstructures and micrographs, we study the classification of Chinese wines based on their micrographs. The shape and structure of the wine particles in the microstructure are the most important features for recognizing and classifying wines, so we introduce a feature extraction method that describes the structure and region shape of a micrograph efficiently. First, the micrographs are enhanced using total variation denoising and segmented using a modified Otsu's method based on the Rayleigh distribution. Then features are extracted with the proposed method, based on area, perimeter and traditional shape features. A total of 26 features of eight kinds are selected. Finally, a Chinese wine classification system based on micrographs, combining shape and structure features with a BP neural network, is presented. We compare the recognition results for different choices of features (traditional shape features or the proposed features). The experimental results show that a better classification rate is achieved with the combined features proposed in this paper.
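
    The segmentation step can be pictured with a minimal sketch of the standard Otsu threshold, which picks the grey level that maximises the between-class variance of the histogram. This is the textbook method, not the Rayleigh-based modification described in the abstract, whose details are not given here; the function name and the toy pixel values are assumptions for illustration.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t maximising between-class variance.

    Pixels with value <= t form one class, the rest the other.
    `pixels` is a flat sequence of integers in [0, levels).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the lower class so far
    sum0 = 0    # sum of grey levels in the lower class so far
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break               # upper class empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance up to the constant factor 1 / total**2.
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Toy data: dark background pixels and a few bright particle pixels.
toy_pixels = [12, 15, 14, 13, 200, 210, 205, 11, 198, 16]
t = otsu_threshold(toy_pixels)
mask = [p > t for p in toy_pixels]  # True marks a particle pixel
print(t, mask)
```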

  14. Desert plains classification based on Geomorphometrical parameters (Case study: Aghda, Yazd)

    Science.gov (United States)

    Tazeh, mahdi; Kalantari, Saeideh

    2013-04-01

    This research focuses on plains. Several methods have been presented for plain classification. One natural-resource-based classification that is widely used in Iran divides plains into three types: Erosional Pediment, Denudation Pediment and Aggradational Piedmont. Qualitative and quantitative factors are used to differentiate these types from each other. In this study, geomorphometrical parameters that are effective in differentiating landforms were applied to plains. Geomorphometrical parameters are calculable and can be extracted from a digital elevation model using mathematical equations and the corresponding relations. The geomorphometrical parameters used in this study were Percent of Slope, Plan Curvature, Profile Curvature, Minimum Curvature, Maximum Curvature, Cross-sectional Curvature, Longitudinal Curvature and Gaussian Curvature. The results indicated that the most important geomorphometrical parameters for plain and desert classification are Percent of Slope, Minimum Curvature, Profile Curvature and Longitudinal Curvature. Key Words: Plain, Geomorphometry, Classification, Biophysical, Yazd Khezarabad.

  15. A canonical correlation analysis based EMG classification algorithm for eliminating electrode shift effect.

    Science.gov (United States)

    Zhe Fan; Zhong Wang; Guanglin Li; Ruomei Wang

    2016-08-01

    Motion classification systems based on surface electromyography (sEMG) pattern recognition have achieved good results under experimental conditions, but clinical implementation and practical application remain a challenge. Many factors contribute to the difficulty of clinical use of EMG-based dexterous control. The most obvious and important is noise in the EMG signal caused by electrode shift, muscle fatigue, motion artifact, inherent instability of the signal and other biological signals such as the electrocardiogram. In this paper, a novel method based on Canonical Correlation Analysis (CCA) was developed to eliminate the reduction in classification accuracy caused by electrode shift. The average classification accuracy of our method was above 95% for the healthy subjects. In the process, we validated the influence of electrode shift on motion classification accuracy and found a strong correlation (correlation coefficient >0.9) between shift-position data and normal-position data.

  16. Can-CSC-GBE: Developing Cost-sensitive Classifier with Gentleboost Ensemble for breast cancer classification using protein amino acids and imbalanced data.

    Science.gov (United States)

    Ali, Safdar; Majid, Abdul; Javed, Syed Gibran; Sattar, Mohsin

    2016-06-01

    Early prediction of breast cancer is important for effective treatment and survival. We developed an effective Cost-Sensitive Classifier with GentleBoost Ensemble (Can-CSC-GBE) for the classification of breast cancer using protein amino acid features. In this work, first, discriminant information of the protein sequences related to breast tissue is extracted. Then, the physicochemical properties of amino acids, hydrophobicity and hydrophilicity, are employed to generate molecule descriptors in different feature spaces. For comparison, we obtained results by combining Cost-Sensitive learning with conventional ensembles of AdaBoostM1 and Bagging. The proposed Can-CSC-GBE system effectively reduced the misclassification costs and thereby improved the overall classification performance. Our approach showed promising results compared with state-of-the-art ensemble approaches. Copyright © 2016 Elsevier Ltd. All rights reserved.

  17. Distinction of gastric cancer tissue based on surface-enhanced Raman spectroscopy

    Science.gov (United States)

    Ma, Jun; Zhou, Hanjing; Gong, Longjing; Liu, Shu; Zhou, Zhenghua; Mao, Weizheng; Zheng, Rong-er

    2012-12-01

    Gastric cancer is one of the most common malignant tumors in China, with high recurrence and mortality rates. This study aimed to evaluate the diagnostic capability of surface-enhanced Raman spectroscopy (SERS) based on gold colloids for distinguishing gastric tissues. Gold colloids were mixed directly with the supernatant of homogenized tissues to heighten the Raman signal of various biomolecules. A total of 56 samples were collected, 30 normal and 26 cancerous. Raman spectra were obtained with 785 nm excitation in the range of 600-1800 cm⁻¹. Significant spectral differences in SERS are mainly attributed to nucleic acids, proteins and lipids, particularly at 653, 726, 828, 963, 1004, 1032, 1088, 1130, 1243, 1369, 1474, 1596 and 1723 cm⁻¹. PCA-LDA algorithms with leave-one-patient-out cross validation yielded a diagnostic sensitivity of 90% (27/30), a specificity of 88.5% (23/26) and an accuracy of 89.3% (50/56) for the classification of normal and cancer tissues. The area under the receiver operating characteristic (ROC) curve is 0.917, illustrating the diagnostic utility of SERS together with PCA-LDA for distinguishing gastric cancer from normal tissue. This work demonstrated that SERS techniques can be useful for gastric cancer detection and are a potential means of accurately identifying cancerous tumors, which is of considerable clinical importance for real-time diagnosis.
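
    The three percentages above follow directly from the reported fractions. The short sketch below recomputes them from the counts given in the abstract (27/30, 23/26 and 50/56), without assuming anything further about how the classes were assigned; the helper name `rate` is hypothetical.

```python
def rate(correct, total):
    """Fraction of correctly classified samples in a group."""
    if total <= 0:
        raise ValueError("total must be positive")
    return correct / total


# Counts exactly as reported in the abstract.
sensitivity = rate(27, 30)
specificity = rate(23, 26)
accuracy = rate(27 + 23, 30 + 26)

for name, value in [("sensitivity", sensitivity),
                    ("specificity", specificity),
                    ("accuracy", accuracy)]:
    print(f"{name}: {value * 100:.1f}%")
```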

  18. Torrent classification - Base of rational management of erosive regions

    Energy Technology Data Exchange (ETDEWEB)

    Gavrilovic, Zoran; Stefanovic, Milutin; Milovanovic, Irina; Cotric, Jelena; Milojevic, Mileta [Institute for the Development of Water Resources ' Jaroslav Cerni' , 11226 Beograd (Pinosava), Jaroslava Cernog 80 (Serbia)], E-mail: gavrilovicz@sbb.rs

    2008-11-01

    A complex methodology for torrents and erosion, together with the associated calculations, was developed in Serbia during the second half of the twentieth century: the 'Erosion Potential Method'. One module of that method focuses on torrent classification. The module enables the identification of hydrographic, climate and erosion characteristics. The method makes it possible for each torrent, regardless of its magnitude, to be described simply and recognizably by the 'Formula of torrentially'. This torrent classification is the base on which a set of optimisation calculations is developed for the required scope of erosion-control works and measures, the application of which enables the management of significantly larger erosion and torrential regions than in the previous period. This paper presents the procedure and the method of torrent classification.

  19. Automated classification of mouse pup isolation syllables: from cluster analysis to an Excel based ‘mouse pup syllable classification calculator’

    Directory of Open Access Journals (Sweden)

    Jasmine eGrimsley

    2013-01-01

    Mouse pups vocalize at high rates when they are cold or isolated from the nest. The proportions of each syllable type produced carry information about disease state and are being used as behavioral markers for the internal state of animals. Manual classifications of these vocalizations identified ten syllable types based on their spectro-temporal features. However, manual classification of mouse syllables is time consuming and vulnerable to experimenter bias. This study uses an automated cluster analysis to identify acoustically distinct syllable types produced by CBA/CaJ mouse pups, and then compares the results to prior manual classification methods. The cluster analysis identified two syllable types, based on their frequency bands, that have continuous frequency-time structure, and two syllable types featuring abrupt frequency transitions. Although cluster analysis computed fewer syllable types than manual classification, the clusters represented well the probability distributions of the acoustic features within syllables. These probability distributions indicate that some of the manually classified syllable types are not statistically distinct. The characteristics of the four classified clusters were used to generate a Microsoft Excel-based mouse syllable classifier that rapidly categorizes syllables, with over a 90% match, into the syllable types determined by cluster analysis.

  20. UAV-Based Crops Classification with Joint Features from Orthoimage and DSM Data

    Science.gov (United States)

    Liu, B.; Shi, Y.; Duan, Y.; Wu, W.

    2018-04-01

    Accurate crop classification remains a challenging task because the same crop can show different spectra and different crops can show the same spectrum. Recently, UAV-based remote sensing has gained popularity not only for its high spatial and temporal resolution, but also for its ability to obtain spectral and spatial data at the same time. This paper focuses on how to take full advantage of spatial and spectral features to improve crop classification accuracy, based on a UAV platform equipped with a general digital camera. Texture and spatial features extracted from the RGB orthoimage and the digital surface model of the monitoring area are analysed and integrated within an SVM classification framework. Extensive experimental results indicate that the overall classification accuracy is drastically improved from 72.9 % to 94.5 % when the spatial features are combined, which verifies the feasibility and effectiveness of the proposed method.

  1. Hyperspectral image classification based on local binary patterns and PCANet

    Science.gov (United States)

    Yang, Huizhen; Gao, Feng; Dong, Junyu; Yang, Yang

    2018-04-01

    Hyperspectral image classification is widely acknowledged as one of the challenging tasks of hyperspectral data processing. In this paper, we propose a novel hyperspectral image classification framework based on local binary pattern (LBP) features and PCANet. In the proposed method, linear prediction error (LPE) is first employed to select a subset of informative bands, and LBP is utilized to extract texture features. Then, spectral and texture features are stacked into high-dimensional vectors. Next, the extracted features at each pixel position are transformed into a 2-D image. The resulting images for all pixels are fed into PCANet for classification. Experimental results on a real hyperspectral dataset demonstrate the effectiveness of the proposed method.
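
    As a rough illustration of the texture step, the sketch below computes the basic 8-neighbour local binary pattern code of a 3x3 patch and applies it across a small single-band image. The abstract does not state which LBP variant, radius or bit ordering the authors used, so the clockwise ordering, the `>=` comparison and the function names are assumptions.

```python
# Neighbour positions in a 3x3 patch, clockwise from the top-left corner.
NEIGHBOURS = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]


def lbp_code(patch):
    """Basic LBP code (0-255) for the centre pixel of a 3x3 patch."""
    center = patch[1][1]
    code = 0
    for bit, (r, c) in enumerate(NEIGHBOURS):
        # A neighbour at least as bright as the centre sets its bit.
        if patch[r][c] >= center:
            code |= 1 << bit
    return code


def lbp_image(img):
    """LBP codes for every interior pixel of a 2-D list of grey values."""
    h, w = len(img), len(img[0])
    out = []
    for r in range(1, h - 1):
        row = []
        for c in range(1, w - 1):
            patch = [img[r + dr][c - 1:c + 2] for dr in (-1, 0, 1)]
            row.append(lbp_code(patch))
        out.append(row)
    return out


# Hypothetical 3x3 band excerpt.
example = [[6, 5, 2],
           [7, 6, 1],
           [9, 8, 7]]
print(lbp_image(example))
```

    A histogram of these codes over a window is what is typically used as the texture feature vector.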

  2. AAPT Diagnostic Criteria for Chronic Cancer Pain Conditions.

    Science.gov (United States)

    Paice, Judith A; Mulvey, Matt; Bennett, Michael; Dougherty, Patrick M; Farrar, John T; Mantyh, Patrick W; Miaskowski, Christine; Schmidt, Brian; Smith, Thomas J

    2017-03-01

    Chronic cancer pain is a serious complication of malignancy or its treatment. Currently, no comprehensive, universally accepted cancer pain classification system exists. Clarity in classification of common cancer pain syndromes would improve clinical assessment and management. Moreover, an evidence-based taxonomy would enhance cancer pain research efforts by providing consistent diagnostic criteria, ensuring comparability across clinical trials. As part of a collaborative effort between the Analgesic, Anesthetic, and Addiction Clinical Trial Translations, Innovations, Opportunities, and Networks (ACTTION) and the American Pain Society (APS), the ACTTION-APS Pain Taxonomy initiative worked to develop the characteristics of an optimal diagnostic system. After the establishment of these characteristics, a working group consisting of clinicians and clinical and basic scientists with expertise in cancer and cancer-related pain was convened to generate core diagnostic criteria for an illustrative sample of 3 chronic pain syndromes associated with cancer (ie, bone pain and pancreatic cancer pain as models of pain related to a tumor) or its treatment (ie, chemotherapy-induced peripheral neuropathy). A systematic review and synthesis was conducted to provide evidence for the dimensions that comprise this cancer pain taxonomy. Future efforts will subject these diagnostic categories and criteria to systematic empirical evaluation of their feasibility, reliability, and validity and extension to other cancer-related pain syndromes. The ACTTION-APS chronic cancer pain taxonomy provides an evidence-based classification for 3 prevalent syndromes, namely malignant bone pain, pancreatic cancer pain, and chemotherapy-induced peripheral neuropathy. This taxonomy provides consistent diagnostic criteria, common features, comorbidities, consequences, and putative mechanisms for these potentially serious cancer pain conditions that can be extended and applied with other cancer

  3. Optical beam classification using deep learning: a comparison with rule- and feature-based classification

    Science.gov (United States)

    Alom, Md. Zahangir; Awwal, Abdul A. S.; Lowe-Webb, Roger; Taha, Tarek M.

    2017-08-01

    Vector Machine (SVM). The experimental results show around 96% classification accuracy using CNN; the CNN approach also provides comparable recognition results compared to the present feature-based off-normal detection. The feature-based solution was developed to capture the expertise of a human expert in classifying the images. The misclassified results are further studied to explain the differences and discover any discrepancies or inconsistencies in current classification.

  4. Classification of forensic autopsy reports through conceptual graph-based document representation model.

    Science.gov (United States)

    Mujtaba, Ghulam; Shuib, Liyana; Raj, Ram Gopal; Rajandram, Retnagowri; Shaikh, Khairunisa; Al-Garadi, Mohammed Ali

    2018-06-01

    Text categorization has been used extensively in recent years to classify plain-text clinical reports. This study employs text categorization techniques for the classification of open narrative forensic autopsy reports. One of the key steps in text classification is document representation. In document representation, a clinical report is transformed into a format that is suitable for classification. The traditional document representation technique for text categorization is the bag-of-words (BoW) technique. In this study, the traditional BoW technique is ineffective in classifying forensic autopsy reports because it merely extracts frequent but discriminative features from clinical reports. Moreover, this technique fails to capture word inversion, as well as word-level synonymy and polysemy, when classifying autopsy reports. Hence, the BoW technique suffers from low accuracy and low robustness unless it is improved with contextual and application-specific information. To overcome the aforementioned limitations of the BoW technique, this research aims to develop an effective conceptual graph-based document representation (CGDR) technique to classify 1500 forensic autopsy reports from four (4) manners of death (MoD) and sixteen (16) causes of death (CoD). Term-based and Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT) based conceptual features were extracted and represented through graphs. These features were then used to train a two-level text classifier. The first level classifier was responsible for predicting MoD. In addition, the second level classifier was responsible for predicting CoD using the proposed conceptual graph-based document representation technique. To demonstrate the significance of the proposed technique, its results were compared with those of six (6) state-of-the-art document representation techniques. 
    Lastly, this study compared the effects of one-level classification and two-level classification on the experimental results
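
    The limitation attributed to bag-of-words above, that it cannot capture word inversion, is easy to see in a minimal sketch. The tokenizer, the function names and the two example sentences are hypothetical and are not taken from the study's data.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case a report and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bow_vectors(docs):
    """Return (vocabulary, count vectors) for a list of documents."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


# Two hypothetical report fragments with opposite causal direction.
docs = ["Fracture caused hemorrhage", "Hemorrhage caused fracture"]
vocab, vectors = bow_vectors(docs)
print(vocab)
print(vectors)  # both rows are identical: word order is lost
```

    Because both fragments map to the same count vector, a classifier trained on these features cannot tell them apart, which is the kind of gap a graph-based representation is meant to close.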

  5. Proposing a Hybrid Model Based on Robson's Classification for Better Impact on Trends of Cesarean Deliveries.

    Science.gov (United States)

    Hans, Punit; Rohatgi, Renu

    2017-06-01

    To construct a hybrid model classification for cesarean section (CS) deliveries based on woman characteristics (Robson's classification with additional layers of indications for CS, keeping in view the low-resource settings found in India). This is a cross-sectional study conducted at Nalanda Medical College, Patna. All women who delivered in the labor ward from January 2016 to May 2016 were included. The results were compared with the values obtained for India from the secondary analysis of the WHO multi-country survey (2010-2011) by Joshua Vogel and colleagues, published in "The Lancet Global Health." The three classifications (indication-based, Robson's and hybrid model) were applied to categorize the cesarean deliveries from the same sample of data, and a semiqualitative evaluation was done, considering the main characteristics, strengths and weaknesses of each classification system. The total number of women who delivered during the study period was 1462, of which 471 were CS deliveries. The overall CS rate calculated for NMCH in this period was 32.21% (p = 0.001). The hybrid model scored 23/23, and the Robson classification and indication-based classification scored 21/23 and 10/23, respectively. The single study centre and referral bias are limitations of the study. Given the flexibility of the classifications, we constructed a hybrid model based on the woman-characteristics system with additional layers from other classifications. Indication-based classification answers why, Robson classification answers on whom, while our hybrid model shows both why and on whom cesarean deliveries are being performed.

  6. Breast cancer: new concepts in classification

    Directory of Open Access Journals (Sweden)

    Daniella Serafin Couto Vieira

    2008-01-01

    Breast carcinoma is the most common malignant neoplasm in women. Molecular studies of breast carcinoma, based on gene expression profiling by cDNA microarray, have defined at least five distinct subgroups: luminal A, luminal B, HER2-overexpressing, basal, and normal breast-like. The tissue microarray (TMA) technique, first described in 1998, has made it possible to study the protein expression profiles of different neoplasms across many carcinoma samples. In breast carcinoma, TMAs have been used to validate the findings of preliminary studies, thereby identifying new phenotypic subtypes of breast carcinoma. Among the classically described subtypes, the basal group is one of the most intriguing tumor subtypes and is frequently associated with a worse prognosis and the absence of defined therapeutic targets. The histopathological classification of breast carcinoma has poor predictive value. Therefore, combining histological diagnosis with molecular techniques in anatomic pathology laboratories, through immunohistochemical study, can determine the molecular profile of breast carcinoma, with the aim of improving therapeutic response. This study aimed to summarize the most recent knowledge underlying the new concepts in the classification of breast carcinoma.

  7. Image Classification Based on Convolutional Denoising Sparse Autoencoder

    Directory of Open Access Journals (Sweden)

    Shuangshuang Chen

    2017-01-01

    Image classification aims to group images into corresponding semantic categories. Because of interclass similarity and intraclass variability, it is a challenging problem in computer vision. In this paper, an unsupervised feature learning approach called convolutional denoising sparse autoencoder (CDSAE) is proposed, based on the theory of the visual attention mechanism and deep learning methods. First, a saliency detection method is used to obtain training samples for unsupervised feature learning. Next, these samples are sent to the denoising sparse autoencoder (DSAE), followed by a convolutional layer and a local contrast normalization layer. Generally, a prior for a specific task is helpful for solving that task. Therefore, a new pooling strategy, spatial pyramid pooling (SPP) fused with a center-bias prior, is introduced into our approach. Experimental results on two common image datasets (STL-10 and CIFAR-10) demonstrate that our approach is effective in image classification. They also demonstrate that none of these three components (local contrast normalization, SPP fused with the center-bias prior, and l2 vector normalization) can be excluded from the proposed approach. They jointly improve image representation and classification performance.
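
    The pooling idea can be sketched in a few lines: plain spatial pyramid pooling max-pools a feature map over progressively finer grids and concatenates the results, so the output length does not depend on the map size. The sketch below shows only that plain form; the center-bias weighting described in the abstract is not specified here and is left out, and the function name and the pyramid levels are assumptions.

```python
def spatial_pyramid_pool(fmap, levels=(1, 2)):
    """Max-pool a 2-D feature map over n x n grids and concatenate.

    `fmap` is a list of equal-length rows; `levels` lists the grid sizes.
    The output has sum(n * n for n in levels) values for any map size.
    """
    h, w = len(fmap), len(fmap[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Floor for the start and ceiling for the end keep every
            # cell non-empty, even when h or w is not divisible by n.
            r0, r1 = (i * h) // n, ((i + 1) * h + n - 1) // n
            for j in range(n):
                c0, c1 = (j * w) // n, ((j + 1) * w + n - 1) // n
                pooled.append(max(fmap[r][c]
                                  for r in range(r0, r1)
                                  for c in range(c0, c1)))
    return pooled


# Hypothetical 4x4 feature map.
fmap = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16]]
print(spatial_pyramid_pool(fmap))
```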

  8. Feature-Based Classification of Amino Acid Substitutions outside Conserved Functional Protein Domains

    Directory of Open Access Journals (Sweden)

    Branislava Gemovic

    2013-01-01

    There are more than 500 amino acid substitutions in each human genome, and bioinformatics tools contribute indispensably to determining their functional effects. We have developed a feature-based algorithm for the detection of mutations outside conserved functional domains (CFDs) and compared its classification efficacy with that of the most commonly used phylogeny-based tools, PolyPhen-2 and SIFT. The new algorithm is based on the informational spectrum method (ISM), a feature-based technique, and statistical analysis. Our dataset contained neutral polymorphisms and mutations associated with myeloid malignancies from the epigenetic regulators ASXL1, DNMT3A, EZH2, and TET2. PolyPhen-2 and SIFT had significantly lower accuracies than expected in predicting the effects of amino acid substitutions outside CFDs, with especially low sensitivity. In contrast, only the ISM algorithm showed statistically significant classification of these sequences. It outperformed PolyPhen-2 and SIFT by 15% and 13%, respectively. These results suggest that feature-based methods, like ISM, are more suitable than phylogeny-based tools for the classification of amino acid substitutions outside CFDs.

  9. Impact of full field digital mammography on the classification and mammographic characteristics of interval breast cancers

    Energy Technology Data Exchange (ETDEWEB)

    Knox, Mark, E-mail: marktknox@gmail.com; O’Brien, Angela, E-mail: angelaobrien@doctors.org.uk; Szabó, Endre, E-mail: endrebacsi@freemail.hu; Smith, Clare S., E-mail: csmith@mater.ie; Fenlon, Helen M., E-mail: helen.fenlon@cancerscreening.ie; McNicholas, Michelle M., E-mail: michelle.mcnicholas@cancerscreening.ie; Flanagan, Fidelma L., E-mail: fidelma.flanagan@cancerscreening.ie

    2015-06-15

    Highlights: • Digital mammography has changed the presentation of interval breast cancer. • Fewer interval breast cancers are associated with microcalcifications following FFDM. • Interval breast cancer audit remains a key feature of any breast screening program. - Abstract: Objective: Full field digital mammography (FFDM) is increasingly replacing screen film mammography (SFM) in breast screening programs. Interval breast cancers are an issue in all screening programs, and the purpose of our study is to assess the impact of FFDM on the classification of interval breast cancers at independent blind review and to compare the mammographic features of interval cancers at FFDM and SFM. Materials and methods: This study included 138 cases of interval breast cancer, 76 following an FFDM screening examination and 62 following screening with SFM. The prior screening mammogram was assessed by each of five consultant breast radiologists who were blinded to the site of subsequent cancer. Subsequent review of the diagnostic mammogram was performed and cases were classified as missed, minimal signs, occult or true interval. Mammographic features of the interval cancer at diagnosis and any abnormality identified on the prior screening mammogram were recorded. Results: The percentages of cancers classified as missed at FFDM and SFM did not differ significantly, 10.5% (8 of 76) at FFDM and 8.1% (5 of 62) at SFM (p = .77). There were significantly fewer interval cancers presenting as microcalcifications (alone or in association with another abnormality) following screening with FFDM, 16% (12 of 76), than following an SFM examination, 32% (20 of 62) (p = .02). Conclusion: Interval breast cancers continue to pose a problem at FFDM. The switch to FFDM has changed the mammographic presentation of interval breast cancer, with fewer interval cancers presenting in association with microcalcifications.
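
    The two comparisons in the Results can be laid out as 2x2 tables. The abstract does not say which significance test produced the quoted p-values, so the two-sided Fisher exact test below is an assumption chosen because it needs only the standard library; the function name is hypothetical, and the counts are the ones reported above.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability of x "events" in row 1 given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small tolerance so ties with the observed table are included.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


# Missed cancers: 8 of 76 at FFDM versus 5 of 62 at SFM.
p_missed = fisher_exact_two_sided(8, 76 - 8, 5, 62 - 5)
# Microcalcifications: 12 of 76 at FFDM versus 20 of 62 at SFM.
p_calc = fisher_exact_two_sided(12, 76 - 12, 20, 62 - 20)
print(f"missed: p = {p_missed:.2f}, microcalcifications: p = {p_calc:.3f}")
```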

  10. Impact of full field digital mammography on the classification and mammographic characteristics of interval breast cancers

    International Nuclear Information System (INIS)

    Knox, Mark; O’Brien, Angela; Szabó, Endre; Smith, Clare S.; Fenlon, Helen M.; McNicholas, Michelle M.; Flanagan, Fidelma L.

    2015-01-01

    Highlights: • Digital mammography has changed the presentation of interval breast cancer. • Fewer interval breast cancers are associated with microcalcifications following FFDM. • Interval breast cancer audit remains a key feature of any breast screening program. - Abstract: Objective: Full field digital mammography (FFDM) is increasingly replacing screen film mammography (SFM) in breast screening programs. Interval breast cancers are an issue in all screening programs, and the purpose of our study is to assess the impact of FFDM on the classification of interval breast cancers at independent blind review and to compare the mammographic features of interval cancers at FFDM and SFM. Materials and methods: This study included 138 cases of interval breast cancer, 76 following an FFDM screening examination and 62 following screening with SFM. The prior screening mammogram was assessed by each of five consultant breast radiologists who were blinded to the site of subsequent cancer. Subsequent review of the diagnostic mammogram was performed and cases were classified as missed, minimal signs, occult or true interval. Mammographic features of the interval cancer at diagnosis and any abnormality identified on the prior screening mammogram were recorded. Results: The percentages of cancers classified as missed at FFDM and SFM did not differ significantly, 10.5% (8 of 76) at FFDM and 8.1% (5 of 62) at SFM (p = .77). There were significantly fewer interval cancers presenting as microcalcifications (alone or in association with another abnormality) following screening with FFDM, 16% (12 of 76), than following an SFM examination, 32% (20 of 62) (p = .02). Conclusion: Interval breast cancers continue to pose a problem at FFDM. The switch to FFDM has changed the mammographic presentation of interval breast cancer, with fewer interval cancers presenting in association with microcalcifications.

  11. Prognostic Performance and Reproducibility of the 1973 and 2004/2016 World Health Organization Grading Classification Systems in Non-muscle-invasive Bladder Cancer: A European Association of Urology Non-muscle Invasive Bladder Cancer Guidelines Panel Systematic Review.

    Science.gov (United States)

    Soukup, Viktor; Čapoun, Otakar; Cohen, Daniel; Hernández, Virginia; Babjuk, Marek; Burger, Max; Compérat, Eva; Gontero, Paolo; Lam, Thomas; MacLennan, Steven; Mostafid, A Hugh; Palou, Joan; van Rhijn, Bas W G; Rouprêt, Morgan; Shariat, Shahrokh F; Sylvester, Richard; Yuan, Yuhong; Zigeuner, Richard

    2017-11-01

    Tumour grade is an important prognostic indicator in non-muscle-invasive bladder cancer (NMIBC). Histopathological classifications are limited by interobserver variability (reproducibility), which may have prognostic implications. European Association of Urology NMIBC guidelines suggest concurrent use of both 1973 and 2004/2016 World Health Organization (WHO) classifications. To compare the prognostic performance and reproducibility of the 1973 and 2004/2016 WHO grading systems for NMIBC. A systematic literature search was undertaken incorporating Medline, Embase, and the Cochrane Library. Studies were critically appraised for risk of bias (QUIPS). For prognosis, the primary outcome was progression to muscle-invasive or metastatic disease. Secondary outcomes were disease recurrence, and overall and cancer-specific survival. For reproducibility, the primary outcome was interobserver variability between pathologists. Secondary outcome was intraobserver variability (repeatability) by the same pathologist. Of 3593 articles identified, 20 were included in the prognostic review; three were eligible for the reproducibility review. Increasing tumour grade in both classifications was associated with higher disease progression and recurrence rates. Progression rates in grade 1 patients were similar to those in low-grade patients; progression rates in grade 3 patients were higher than those in high-grade patients. Survival data were limited. Reproducibility of the 2004/2016 system was marginally better than that of the 1973 system. Two studies on repeatability showed conflicting results. Most studies had a moderate to high risk of bias. Current grading classifications in NMIBC are suboptimal. The 1973 system identifies more aggressive tumours. Intra- and interobserver variability was slightly less in the 2004/2016 classification. We could not confirm that the 2004/2016 classification outperforms the 1973 classification in prediction of recurrence and progression. 

  12. Gradient Evolution-based Support Vector Machine Algorithm for Classification

    Science.gov (United States)

    Zulvia, Ferani E.; Kuo, R. J.

    2018-03-01

    This paper proposes a classification algorithm based on the support vector machine (SVM) and gradient evolution (GE) algorithms. The SVM algorithm has been widely used in classification. However, its results are significantly influenced by its parameters. Therefore, this paper proposes an improved SVM algorithm that finds the best SVM parameters automatically. The proposed algorithm employs a GE algorithm as a global optimizer to determine the parameters that the SVM algorithm will use. The proposed GE-SVM algorithm is verified using some benchmark datasets and compared with other metaheuristic-based SVM algorithms. The experimental results show that the proposed GE-SVM algorithm obtains better results than the other algorithms tested in this paper.

  13. Compensatory neurofuzzy model for discrete data classification in biomedical

    Science.gov (United States)

    Ceylan, Rahime

    2015-03-01

    Biomedical data fall into two main categories: signals and discrete data. Studies in this area therefore address either biomedical signal classification or biomedical discrete data classification. There are artificial intelligence models for the classification of ECG, EMG or EEG signals. Likewise, many models exist in the literature for the classification of discrete data, whose sample values can be the results of blood analysis or biopsy in a medical process. Not every algorithm achieves a high accuracy rate on both signal and discrete data. In this study, a compensatory neurofuzzy network model is presented for the classification of discrete data in the biomedical pattern recognition area. The compensatory neurofuzzy network is a hybrid, binary classifier. In this system, the parameters of the fuzzy systems are updated by the backpropagation algorithm. The classifier model is applied to two benchmark datasets (the Wisconsin Breast Cancer dataset and the Pima Indian Diabetes dataset). Experimental studies show that the compensatory neurofuzzy network model achieved a 96.11% accuracy rate in classification of the breast cancer dataset, and a 69.08% accuracy rate was obtained on the diabetes dataset, with only 10 iterations.

  14. Atmospheric circulation classification comparison based on wildfires in Portugal

    Science.gov (United States)

    Pereira, M. G.; Trigo, R. M.

    2009-04-01

    Atmospheric circulation classifications are not a simple description of atmospheric states but a tool to understand and interpret atmospheric processes and to model the relation between atmospheric circulation and surface climate and other related variables (Radan Huth et al., 2008). Classifications were initially developed for weather forecasting purposes; however, with the progress in computer processing capability, new and more robust objective methods were developed and applied to large datasets, promoting atmospheric circulation classification methods to one of the most important fields in synoptic and statistical climatology. Classification studies have been extensively used in climate change studies (e.g. reconstructed past climates, recent observed changes and future climates), in bioclimatological research (e.g. relating human mortality to climatic factors) and in a wide variety of synoptic climatological applications (e.g. comparison between datasets, air pollution, snow avalanches, wine quality, fish captures and forest fires). Likewise, atmospheric circulation classifications are important for the study of the role of weather in wildfire occurrence in Portugal because the daily synoptic variability is the most important driver of local weather conditions (Pereira et al., 2005). In particular, the objective classification scheme developed by Trigo and DaCamara (2000) to classify the atmospheric circulation affecting Portugal has proved to be quite useful in discriminating the occurrence and development of wildfires as well as the distribution over Portugal of surface climatic variables with an impact on wildfire activity, such as maximum and minimum temperature and precipitation. This work aims to present: (i) an overview of the existing circulation classifications for the Iberian Peninsula, and (ii) the results of a comparison study between these atmospheric circulation classifications based on their relation with wildfires and relevant meteorological

  15. CT-based injury classification

    International Nuclear Information System (INIS)

    Mirvis, S.E.; Whitley, N.O.; Vainright, J.; Gens, D.

    1988-01-01

    Review of preoperative abdominal CT scans obtained in adults after blunt trauma during a 2.5-year period demonstrated isolated or predominant liver injury in 35 patients and splenic injury in 33 patients. CT-based injury scores, consisting of five levels of hepatic injury and four levels of splenic injury, were correlated with clinical outcome and surgical findings. Hepatic injury grades I-III, present in 33 of 35 patients, were associated with successful nonsurgical management in 27 (82%) or with findings at celiotomy not requiring surgical intervention in four (12%). Higher grades of splenic injury generally required early operative intervention, but eight (36%) of 22 patients with initial grade III or IV injury were managed without surgery, while four (36%) of 11 patients with grade I or II injury required delayed celiotomy and splenectomy (three patients) or emergent rehospitalization (one patient). CT-based injury classification is useful in guiding the nonoperative management of blunt hepatic injury in hemodynamically stable adults but appears to be less reliable in predicting the outcome of blunt splenic injury

  16. Classification of normal and abnormal images of lung cancer

    Science.gov (United States)

    Bhatnagar, Divyesh; Tiwari, Amit Kumar; Vijayarajan, V.; Krishnamoorthy, A.

    2017-11-01

    Identifying the exact symptoms of lung cancer is difficult because of the way cancer tissue forms, with large tissue structures intersecting in different ways. This problem can be assessed with the help of digital images. In this approach, images are examined using the basic operation of the PCA algorithm. In this paper, the GLCM method is used for image pre-processing and feature extraction, and to test the level of a patient's disease at an early stage in order to determine whether the image is normal or abnormal. The stage of the cancer is evaluated from the result. With the help of the dataset and the result, the survival rate of a cancer patient can be estimated. The result is based on the correct and incorrect classification of tissue patterns.

  17. A new circulation type classification based upon Lagrangian air trajectories

    Directory of Open Access Journals (Sweden)

    Alexandre M. Ramos

    2014-10-01

    Full Text Available A new classification method of the large-scale circulation characteristic for a specific target area (NW Iberian Peninsula) is presented, based on the analysis of 90-h backward trajectories arriving in this area calculated with the 3-D Lagrangian particle dispersion model FLEXPART. A cluster analysis is applied to separate the backward trajectories into up to five representative air streams for each day. Specific measures are then used to characterise the distinct air streams (e.g., curvature of the trajectories, cyclonic or anticyclonic flow, moisture evolution, origin and length of the trajectories). The robustness of the presented method is demonstrated in comparison with the Eulerian Lamb weather type classification. A case study of the 2003 heatwave is discussed in terms of the new Lagrangian circulation and the Lamb weather type classifications. It is shown that the new classification method adds valuable information about the pertinent meteorological conditions, which is missing in an Eulerian approach. The new method is climatologically evaluated for the five-year time period from December 1999 to November 2004. The ability of the method to capture the inter-seasonal circulation variability in the target region is shown. Furthermore, the multi-dimensional character of the classification is briefly discussed, in particular with respect to inter-seasonal differences. Finally, the relationship between the new Lagrangian classification and the precipitation in the target area is studied.

  18. Classification of neuropathic pain in cancer patients

    DEFF Research Database (Denmark)

    Brunelli, Cinzia; Bennett, Michael I; Kaasa, Stein

    2014-01-01

    and on the relevance of patient-reported outcome (PRO) descriptors for the screening of NP in this population. An international group of 42 experts was invited to participate in a consensus process through a modified 2-round Internet-based Delphi survey. Relevant topics investigated were: peculiarities of NP...... in patients with cancer, IASP NeuPSIG diagnostic criteria adaptation and assessment, and standardized PRO assessment for NP screening. Median consensus scores (MED) and interquartile ranges (IQR) were calculated to measure expert consensus after both rounds. Twenty-nine experts answered, and good agreement...... was proposed. Clinical research on PRO in the screening phase and on the application of the algorithm will be needed to examine their effectiveness in classifying NP in cancer patients....

  19. Classification of scintigrams on the base of an automatic analysis

    International Nuclear Information System (INIS)

    Vidyukov, V.I.; Kasatkin, Yu.N.; Kal'nitskaya, E.F.; Mironov, S.P.; Rotenberg, E.M.

    1980-01-01

    The stages of designing a self-learning discriminative system for the automatic analysis of scintigrams are considered. The results of classifying 240 liver scintigrams as "normal", "diffuse lesions" or "focal lesions" were evaluated by medical experts and by computer. The accuracy of the computerized classification was 91.7%, that of the experts 85%. The automatic analysis methods for liver scintigrams were implemented using the specialized MDS data processing system. The quality of the discriminative system was assessed on 125 scintigrams; the classification accuracy was 89.6%. The use of self-learning methods made it possible to single out two subclasses depending on the severity of diffuse lesions.

  20. An application-based classification to understand buyer-seller interaction in business services

    NARCIS (Netherlands)

    Valk, van der W.; Wynstra, J.Y.F.; Axelsson, B.

    2006-01-01

    Abstract: Purpose – Most existing classifications of business services have taken the perspective of the supplier as opposed to that of the buyer. To address this imbalance, the purpose of this paper is to propose a classification of business services based on how the buying company applies the

  1. Patent Keyword Extraction Algorithm Based on Distributed Representation for Patent Classification

    Directory of Open Access Journals (Sweden)

    Jie Hu

    2018-02-01

    Full Text Available Many text mining tasks such as text retrieval, text summarization, and text comparisons depend on the extraction of representative keywords from the main text. Most existing keyword extraction algorithms are based on a discrete bag-of-words type of word representation of the text. In this paper, we propose a patent keyword extraction algorithm (PKEA) based on the distributed Skip-gram model for patent classification. We also develop a set of quantitative performance measures for keyword extraction evaluation based on information gain and cross-validation, based on Support Vector Machine (SVM) classification, which are valuable when human-annotated keywords are not available. We used a standard benchmark dataset and a homemade patent dataset to evaluate the performance of PKEA. Our patent dataset includes 2500 patents from five distinct technological fields related to autonomous cars (GPS systems, lidar systems, object recognition systems, radar systems, and vehicle control systems). We compared our method with Frequency, Term Frequency-Inverse Document Frequency (TF-IDF), TextRank and Rapid Automatic Keyword Extraction (RAKE). The experimental results show that our proposed algorithm provides a promising way to extract keywords from patent texts for patent classification.
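
    For orientation, TF-IDF, one of the baselines named in the abstract, can be sketched in a few lines. This is not the PKEA method itself; the whitespace tokenizer, the function name `tfidf_keywords` and the toy documents are assumptions made for illustration.

```python
import math
from collections import Counter

def tfidf_keywords(docs, top_k=3):
    """Rank each document's terms by TF-IDF and return the top_k terms.

    Illustrative baseline only: whitespace tokenization, no stop-word removal.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    results = []
    for tokens in tokenized:
        tf = Counter(tokens)
        scores = {
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Highest score first; ties are broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:top_k])
    return results

# Toy documents (hypothetical, not the patent dataset described above).
docs = [
    "lidar sensor measures range with laser pulses",
    "radar sensor measures range with radio waves",
    "vehicle control uses sensor range estimates",
]
keywords = tfidf_keywords(docs, top_k=2)
print(keywords)
```

    Terms that occur in every document, such as "sensor" and "range" here, receive a score of zero, which is why a discrete bag-of-words baseline of this kind favours terms unique to one document.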

  2. A comparison of the accuracy of pixel based and object based classifications of integrated optical and LiDAR data

    Science.gov (United States)

    Gajda, Agnieszka; Wójtowicz-Nowakowska, Anna

    2013-04-01

    Land cover maps are generally produced on the basis of high resolution imagery. Recently, LiDAR (Light Detection and Ranging) data have been brought into use in diverse applications including land cover mapping. In this study we attempted to assess the accuracy of land cover classification using both high resolution aerial imagery and LiDAR data (airborne laser scanning, ALS), testing two classification approaches: a pixel-based classification and object-oriented image analysis (OBIA). The study was conducted on three test areas (3 km² each) in the administrative area of Kraków, Poland, along the course of the Vistula River. They represent three different dominating land cover types of the Vistula River valley. Test site 1 had semi-natural vegetation, with riparian forests and shrubs, test site 2 represented a densely built-up area, and test site 3 was an industrial site. Point clouds from ALS and orthophotomaps were both captured in November 2007. Point cloud density was on average 16 pt/m² and the point clouds contained additional information about intensity and encoded RGB values. Orthophotomaps had a spatial resolution of 10 cm. From the point clouds two raster maps were generated: (1) intensity and (2) normalised Digital Surface Model (nDSM), both with a spatial resolution of 50 cm. To classify the aerial data, a supervised classification approach was selected. Pixel-based classification was carried out in ERDAS Imagine software. Orthophotomaps and intensity and nDSM rasters were used in classification. 15 homogeneous training areas representing each cover class were chosen. Classified pixels were clumped to avoid the salt-and-pepper effect. Object-oriented image classification was carried out in eCognition software, which implements both the optical and ALS data. Elevation layers (intensity, first/last reflection, etc.) were used at the segmentation stage due to

  3. G0-WISHART Distribution Based Classification from Polarimetric SAR Images

    Science.gov (United States)

    Hu, G. C.; Zhao, Q. H.

    2017-09-01

    Enormous scientific and technical developments have been carried out for decades to further improve remote sensing, particularly the Polarimetric Synthetic Aperture Radar (PolSAR) technique, so classification methods based on PolSAR images have received much more attention from scholars and related departments around the world. The multilook polarimetric G0-Wishart model is a more flexible model that describes homogeneous, heterogeneous and extremely heterogeneous regions in the image. Moreover, the polarimetric G0-Wishart distribution does not include the modified Bessel function of the second kind. It is a simple statistical distribution model with fewer parameters. To prove its feasibility, a classification process based on the method has been tested on a fully polarized Synthetic Aperture Radar (SAR) image. First, multilook polarimetric SAR data processing and a speckle filter are applied to reduce the influence of speckle on the classification result. Then the image is initially classified into sixteen classes by H/A/α decomposition. Finally, the ICM algorithm is used to classify features based on the G0-Wishart distance. Qualitative and quantitative results show that the proposed method can classify polarimetric SAR data effectively and efficiently.

  4. Semi-supervised vibration-based classification and condition monitoring of compressors

    Science.gov (United States)

    Potočnik, Primož; Govekar, Edvard

    2017-09-01

    Semi-supervised vibration-based classification and condition monitoring of the reciprocating compressors installed in refrigeration appliances is proposed in this paper. The method addresses the problem of industrial condition monitoring where prior class definitions are often not available or difficult to obtain from local experts. The proposed method combines feature extraction, principal component analysis, and statistical analysis for the extraction of initial class representatives, and compares the capability of various classification methods, including discriminant analysis (DA), neural networks (NN), support vector machines (SVM), and extreme learning machines (ELM). The use of the method is demonstrated on a case study which was based on industrially acquired vibration measurements of reciprocating compressors during the production of refrigeration appliances. The paper presents a comparative qualitative analysis of the applied classifiers, confirming the good performance of several nonlinear classifiers. If the model parameters are properly selected, then very good classification performance can be obtained from NN trained by Bayesian regularization, SVM and ELM classifiers. The method can be effectively applied for the industrial condition monitoring of compressors.

  5. Histopathological Breast Cancer Image Classification by Deep Neural Network Techniques Guided by Local Clustering.

    Science.gov (United States)

    Nahid, Abdullah-Al; Mehrabi, Mohamad Ali; Kong, Yinan

    2018-01-01

    Breast cancer is a serious threat and one of the largest causes of death of women throughout the world. The identification of cancer largely depends on digital biomedical photography analysis such as histopathological images by doctors and physicians. Analyzing histopathological images is a nontrivial task, and decisions from investigation of these kinds of images always require specialised knowledge. However, Computer Aided Diagnosis (CAD) techniques can help the doctor make more reliable decisions. The state-of-the-art Deep Neural Network (DNN) has been recently introduced for biomedical image analysis. Normally each image contains structural and statistical information. This paper classifies a set of biomedical breast cancer images (BreakHis dataset) using novel DNN techniques guided by structural and statistical information derived from the images. Specifically a Convolutional Neural Network (CNN), a Long-Short-Term-Memory (LSTM), and a combination of CNN and LSTM are proposed for breast cancer image classification. Softmax and Support Vector Machine (SVM) layers have been used for the decision-making stage after extracting features utilising the proposed novel DNN models. In this experiment the best Accuracy value of 91.00% is achieved on the 200x dataset, the best Precision value of 96.00% is achieved on the 40x dataset, and the best F-Measure value is achieved on both the 40x and 100x datasets.
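
    The three figures of merit quoted above (Accuracy, Precision, F-Measure) follow the standard binary-classification definitions. The sketch below shows how they are usually computed; the label vectors are hypothetical toy data, not results from the paper, and the function name is illustrative.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F-measure for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-measure (F1): harmonic mean of precision and recall.
    f_measure = (
        2 * precision * recall / (precision + recall) if precision + recall else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }

# Hypothetical labels: 1 = malignant, 0 = benign.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```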

  6. Single-labelled music genre classification using content-based features

    CSIR Research Space (South Africa)

    Ajoodha, R

    2015-11-01

    Full Text Available In this paper we use content-based features to perform automatic classification of music pieces into genres. We categorise these features into four groups: features extracted from the Fourier transform’s magnitude spectrum, features designed...

  7. Breast tissue classification in digital tomosynthesis images based on global gradient minimization and texture features

    Science.gov (United States)

    Qin, Xulei; Lu, Guolan; Sechopoulos, Ioannis; Fei, Baowei

    2014-03-01

    Digital breast tomosynthesis (DBT) is a pseudo-three-dimensional x-ray imaging modality proposed to decrease the effect of tissue superposition present in mammography, potentially resulting in an increase in clinical performance for the detection and diagnosis of breast cancer. Tissue classification in DBT images can be useful in risk assessment, computer-aided detection and radiation dosimetry, among other aspects. However, classifying breast tissue in DBT is a challenging problem because DBT images include complicated structures, image noise, and out-of-plane artifacts due to limited angular tomographic sampling. In this project, we propose an automatic method to classify fatty and glandular tissue in DBT images. First, the DBT images are pre-processed to enhance the tissue structures and to decrease image noise and artifacts. Second, a global smooth filter based on L0 gradient minimization is applied to eliminate detailed structures and enhance large-scale ones. Third, the similar structure regions are extracted and labeled by fuzzy C-means (FCM) classification. At the same time, the texture features are also calculated. Finally, each region is classified into different tissue types based on both intensity and texture features. The proposed method is validated using five patient DBT images using manual segmentation as the gold standard. The Dice scores and the confusion matrix are utilized to evaluate the classified results. The evaluation results demonstrated the feasibility of the proposed method for classifying breast glandular and fat tissue on DBT images.
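
    The Dice score used for evaluation measures the overlap between a predicted region and the manually segmented one. A minimal sketch follows, assuming flattened label masks and a hypothetical encoding (1 = glandular, 0 = fat); neither the encoding nor the toy masks come from the paper.

```python
def dice_score(pred, truth, label):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for one label in two flat masks."""
    pred_idx = {i for i, v in enumerate(pred) if v == label}
    truth_idx = {i for i, v in enumerate(truth) if v == label}
    if not pred_idx and not truth_idx:
        return 1.0  # label absent from both masks: treat as perfect agreement
    return 2 * len(pred_idx & truth_idx) / (len(pred_idx) + len(truth_idx))

# Hypothetical flattened masks: 1 = glandular tissue, 0 = fat.
truth = [1, 1, 1, 0, 0, 0, 1, 0]
pred = [1, 1, 0, 0, 0, 1, 1, 0]
dice_glandular = dice_score(pred, truth, label=1)
dice_fat = dice_score(pred, truth, label=0)
print(dice_glandular, dice_fat)
```

    A score of 1 means the two regions coincide exactly and 0 means they do not overlap at all.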

  8. On the Feature Selection and Classification Based on Information Gain for Document Sentiment Analysis

    Directory of Open Access Journals (Sweden)

    Asriyanti Indah Pratiwi

    2018-01-01

    Full Text Available Sentiment analysis of movie reviews is a need of today's lifestyle. Unfortunately, an enormous number of features makes sentiment analysis slow and less sensitive. Finding the optimum feature selection and classification is still a challenge. In order to handle an enormous number of features and provide better sentiment classification, an information-based feature selection and classification method is proposed. The proposed method removes more than 90% of unnecessary features, while the proposed classification scheme achieves 96% accuracy of sentiment classification. From the experimental results, it can be concluded that the combination of the proposed feature selection and classification achieves the best performance so far.
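
    Information gain, the criterion named in the title of this record, scores a feature by how much knowing it reduces the entropy of the class labels. The sketch below uses the textbook definition for a binary "term present" feature; the toy reviews and function names are hypothetical, and the paper's exact selection procedure is not given in the abstract.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_present, labels):
    """Entropy reduction obtained by splitting labels on a boolean feature."""
    n = len(labels)
    gain = entropy(labels)
    for value in (True, False):
        subset = [y for x, y in zip(feature_present, labels) if x == value]
        if subset:
            gain -= len(subset) / n * entropy(subset)
    return gain

# Hypothetical toy data: four reviews with sentiment labels.
labels = ["pos", "pos", "neg", "neg"]
has_great = [True, True, False, False]   # occurs only in positive reviews
has_movie = [True, False, True, False]   # occurs equally in both classes

ig_great = information_gain(has_great, labels)
ig_movie = information_gain(has_movie, labels)
print(ig_great, ig_movie)
```

    Features with gain near zero, like "movie" in this toy example, are the ones a selection step of this kind would discard.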

  9. Ovarian cancer: Novel molecular aspects for clinical assessment.

    Science.gov (United States)

    Palmirotta, Raffaele; Silvestris, Erica; D'Oronzo, Stella; Cardascia, Angela; Silvestris, Franco

    2017-09-01

    Ovarian cancer is a very heterogeneous tumor which has been traditionally characterized according to the different histological subtypes and differentiation degree. In recent years, innovative molecular screening biotechnologies have allowed the identification of further subtypes of this cancer based on gene expression profiles, mutational features, and epigenetic factors. These novel classification systems emphasizing the molecular signatures within the broad spectrum of ovarian cancer have allowed not only a more precise prognostic prediction, but also proper therapeutic strategies for specific subgroups of patients. The bulk of available scientific data and the high refinement of molecular classifications of ovarian cancers can today direct research towards innovative drugs, with the adoption of targeted therapies tailored for single molecular profiles leading to a better prediction of therapeutic response. Here, we summarize the current state of knowledge on the molecular bases of ovarian cancer, from the description of its molecular subtypes derived from wide high-throughput analyses to the latest discoveries of the ovarian cancer stem cells. The latest personalized treatment options are also presented with recent advances in using PARP inhibitors, anti-angiogenic, anti-folate receptor and anti-cancer stem cells treatment approaches.

  10. Implementation of several mathematical algorithms to breast tissue density classification

    Science.gov (United States)

    Quintana, C.; Redondo, M.; Tirao, G.

    2014-02-01

    The accuracy of mammographic abnormality detection methods is strongly dependent on breast tissue characteristics: dense breast tissue can hide lesions, causing cancer to be detected at later stages. In addition, breast tissue density is widely accepted to be an important risk indicator for the development of breast cancer. This paper presents the implementation and the performance of different mathematical algorithms designed to standardize the categorization of mammographic images, according to the American College of Radiology classifications. These mathematical techniques are based on intrinsic properties calculations and on comparison with an ideal homogeneous image (joint entropy, mutual information, normalized cross correlation and index Q) as categorization parameters. The algorithms were evaluated on 100 cases of the mammographic data sets provided by the Ministerio de Salud de la Provincia de Córdoba, Argentina, Programa de Prevención del Cáncer de Mama (Department of Public Health, Córdoba, Argentina, Breast Cancer Prevention Program). The obtained breast classifications were compared with the expert medical diagnostics, showing good performance. The implemented algorithms showed high potential to classify breasts into tissue density categories.
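
    Two of the comparison parameters listed above, joint entropy and mutual information, can be computed from the grey-level co-occurrence counts of two images. The sketch below uses the standard definitions on flattened toy images; the pixel values and function names are hypothetical, and the abstract does not describe how the authors constructed their reference image.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (bits) of the grey-level distribution of one image."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def joint_entropy(a, b):
    """Shannon entropy (bits) of the joint grey-level distribution of two images."""
    n = len(a)
    return -sum((c / n) * math.log2(c / n) for c in Counter(zip(a, b)).values())

def mutual_information(a, b):
    """I(A; B) = H(A) + H(B) - H(A, B)."""
    return entropy(a) + entropy(b) - joint_entropy(a, b)

# Hypothetical flattened 2x2 images with two grey levels.
image = [0, 0, 1, 1]
same = [0, 0, 1, 1]        # identical to `image`
unrelated = [0, 1, 0, 1]   # statistically independent of `image`

mi_same = mutual_information(image, same)
mi_unrelated = mutual_information(image, unrelated)
print(mi_same, mi_unrelated, joint_entropy(image, unrelated))
```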

  11. MODEL-BASED CLUSTERING FOR CLASSIFICATION OF AQUATIC SYSTEMS AND DIAGNOSIS OF ECOLOGICAL STRESS

    Science.gov (United States)

    Clustering approaches were developed using the classification likelihood, the mixture likelihood, and also using a randomization approach with a model index. Using a clustering approach based on the mixture and classification likelihoods, we have developed an algorithm that...

  12. Remote Sensing Image Classification Based on Stacked Denoising Autoencoder

    Directory of Open Access Journals (Sweden)

    Peng Liang

    2017-12-01

    Full Text Available To address the accuracy bottleneck that conventional remote sensing image classification methods have run into, a new remote sensing image classification method inspired by deep learning is proposed, based on the Stacked Denoising Autoencoder. First, the deep network model is built from stacked layers of Denoising Autoencoders. Then, with noised input, the unsupervised greedy layer-wise training algorithm is used to train each layer in turn for more robust representation, features are obtained by supervised learning with a Back Propagation (BP) neural network, and the whole network is optimized by error back propagation. Finally, Gaofen-1 (GF-1) satellite remote sensing data are used for evaluation, and the total accuracy and kappa accuracy reach 95.7% and 0.955, respectively, which are higher than those of the Support Vector Machine and the Back Propagation neural network. The experimental results show that the proposed method can effectively improve the accuracy of remote sensing image classification.
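
    The two evaluation figures reported above, total (overall) accuracy and the kappa coefficient, are both derived from a confusion matrix. The sketch below uses the standard Cohen's kappa definition on a hypothetical two-class matrix; the numbers are not from the paper.

```python
def overall_accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    confusion[i][j] = number of samples of true class i predicted as class j.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(k)) / n
    # Chance agreement: product of row and column marginals for each class.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(k)
    ) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical confusion matrix for two land-cover classes.
confusion = [
    [45, 5],
    [10, 40],
]
accuracy, kappa = overall_accuracy_and_kappa(confusion)
print(accuracy, kappa)
```

    Kappa discounts the agreement expected by chance, so it is normally lower than the overall accuracy computed from the same matrix.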

  13. FPGA-Based Online PQD Detection and Classification through DWT, Mathematical Morphology and SVD

    Directory of Open Access Journals (Sweden)

    Misael Lopez-Ramirez

    2018-03-01

    Full Text Available Power quality disturbances (PQD) in electric distribution systems can be produced by the utilization of non-linear loads or environmental circumstances, causing electrical equipment malfunction and reduction of its useful life. Detecting and classifying different PQDs implies great efforts in planning and structuring the monitoring system. The main disadvantage of most works in the literature is that they treat a limited number of electrical disturbances through personal computer (PC)-based computation techniques, which makes it difficult to perform an online PQD classification. In this work, the novel contribution is a methodology for PQD recognition and classification through discrete wavelet transform, mathematical morphology, decomposition of singular values, and statistical analysis. Furthermore, the timely and reliable classification of different disturbances is necessary; hence, a field programmable gate array (FPGA)-based integrated circuit is developed to offer a portable hardware processing unit to perform fast, online PQD classification. The obtained numerical and experimental results demonstrate that the proposed method guarantees high effectiveness during online PQD detection and classification of real voltage/current signals.

  14. [Molecular classification of breast cancer patients obtained through the technique of chromogenic in situ hybridization (CISH)].

    Science.gov (United States)

    Fernández, Angel; Reigosa, Aldo

    2013-12-01

    Breast cancer is a heterogeneous disease composed of a growing number of biological subtypes, with substantial variability of the disease progression within each category. The aim of this research was to classify the samples object of study according to the molecular classes of breast cancer: luminal A, luminal B, HER2 and triple negative, as a result of the state of HER2 amplification obtained by the technique of chromogenic in situ hybridization (CISH). The sample consisted of 200 biopsies fixed in 10% formalin, processed by standard techniques up to paraffin embedding, corresponding to patients diagnosed with invasive ductal carcinoma of the breast. These biopsies were obtained from patients from private practice and the Institute of Oncology "Dr. Miguel Pérez Carreño", for immunohistochemistry (IHC) of hormone receptors and HER2 made in the Hospital Metropolitano del Norte, Valencia, Venezuela. The molecular classification of the patient's tumors considering the expression of estrogen and progesterone receptors by IHC and HER2 amplification by CISH, allowed those cases originally classified as unknown, since they had an indeterminate (2+) outcome for HER2 expression by IHC, to be grouped into the different molecular classes. Also, this classification permitted that some cases, initially considered as belonging to a molecular class, were assigned to another class, after the revaluation of the HER2 status by CISH.

  15. CDX2 prognostic value in stage II/III resected colon cancer is related to CMS classification.

    Science.gov (United States)

    Pilati, C; Taieb, J; Balogoun, R; Marisa, L; de Reyniès, A; Laurent-Puig, P

    2017-05-01

    Caudal-type homeobox transcription factor 2 (CDX2) is involved in colon cancer (CC) oncogenesis and has been proposed as a prognostic biomarker in patients with stage II or III CC. We analyzed CDX2 expression in a series of 469 CC typed for the new international consensus molecular subtype (CMS) classification, and we confirmed results in a series of 90 CC. Here, we show that lack of CDX2 expression is only present in the mesenchymal subgroup (CMS4) and in MSI-immune tumors (CMS1) and not in CMS2 and CMS3 colon cancer. Although CDX2 expression was a globally independent prognostic factor, loss of CDX2 expression is not associated with a worse prognosis in the CMS1 group, but is highly prognostic in CMS4 patients for both relapse free and overall survival. Similarly, lack of CDX2 expression was a bad prognostic factor in MSS patients, but not in MSI. Our work suggests that combination of the consensual CMS classification and lack of CDX2 expression could be a useful marker to identify CMS4/CDX2-negative patients with a very poor prognosis. © The Author 2017. Published by Oxford University Press on behalf of the European Society for Medical Oncology. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  16. GMDH-Based Semi-Supervised Feature Selection for Electricity Load Classification Forecasting

    Directory of Open Access Journals (Sweden)

    Lintao Yang

    2018-01-01

    Full Text Available With the development of smart power grids, communication network technology and sensor technology, there has been an exponential growth in complex electricity load data. Irregular electricity load fluctuations caused by the weather and holiday factors disrupt the daily operation of the power companies. To deal with these challenges, this paper investigates a day-ahead electricity peak load interval forecasting problem. It transforms the conventional continuous forecasting problem into a novel interval forecasting problem, and then further converts the interval forecasting problem into the classification forecasting problem. In addition, an indicator system influencing the electricity load is established from three dimensions, namely the load series, calendar data, and weather data. A semi-supervised feature selection algorithm is proposed to address an electricity load classification forecasting issue based on the group method of data handling (GMDH) technology. The proposed algorithm consists of three main stages: (1) training the basic classifier; (2) selectively marking the most suitable samples from the unclassified label data, and adding them to an initial training set; and (3) training the classification models on the final training set and classifying the test samples. An empirical analysis of electricity load dataset from four Chinese cities is conducted. Results show that the proposed model can address the electricity load classification forecasting problem more efficiently and effectively than the FW-Semi FS (forward semi-supervised feature selection) and GMDH-U (GMDH-based semi-supervised feature selection for customer classification) models.

  17. Assay based on electrical impedance spectroscopy to discriminate between normal and cancerous mammalian cells

    Science.gov (United States)

    Giana, Fabián Eduardo; Bonetto, Fabián José; Bellotti, Mariela Inés

    2018-03-01

    In this work we present an assay to discriminate between normal and cancerous cells. The method is based on the measurement of electrical impedance spectra of in vitro cell cultures. We developed a protocol consisting of four consecutive measurement phases, each of them designed to obtain different information about the cell cultures. Through the analysis of the measured data, 26 characteristic features were obtained for both cell types. From the complete set of features, we selected the most relevant in terms of their discriminant capacity by means of conventional statistical tests. A linear discriminant analysis was then carried out on the selected features, allowing the classification of the samples as normal or cancerous with 4.5% false positives and no false negatives.

  18. Comparison Of Power Quality Disturbances Classification Based On Neural Network

    Directory of Open Access Journals (Sweden)

    Nway Nway Kyaw Win

    2015-07-01

    Full Text Available Power quality disturbances (PQDs) cause serious problems for the reliability, safety and economy of the power system network. To improve electric power quality, PQDs must be detected and classified by type of transient fault. A software methodology based on the wavelet transform with a multiresolution analysis (MRA) algorithm and two feed-forward neural networks, a probabilistic neural network (PNN) and a multilayer feed-forward (MLFF) neural network, is presented for automatic classification of eight types of PQ signals: flicker, harmonics, sag, swell, impulse, fluctuation, notch and oscillatory. The wavelet family Db4 is chosen in this system to calculate the values of detailed energy distributions as input features for classification because it performs well in detecting and localizing various types of PQ disturbances. This technique classifies the types of PQD events. The classifiers classify and identify the disturbance type according to the energy distribution. The results show that the PNN can analyze different power disturbance types efficiently and has better classification accuracy than MLFF.

  19. A minimum spanning forest based classification method for dedicated breast CT images

    International Nuclear Information System (INIS)

    Pike, Robert; Sechopoulos, Ioannis; Fei, Baowei

    2015-01-01

    Purpose: To develop and test an automated algorithm to classify different types of tissue in dedicated breast CT images. Methods: Images of a single breast of five different patients were acquired with a dedicated breast CT clinical prototype. The breast CT images were processed by a multiscale bilateral filter to reduce noise while keeping edge information and were corrected to overcome cupping artifacts. As skin and glandular tissue have similar CT values on breast CT images, morphologic processing is used to identify the skin based on its position information. A support vector machine (SVM) is trained and the resulting model used to create a pixelwise classification map of fat and glandular tissue. By combining the results of the skin mask with the SVM results, the breast tissue is classified as skin, fat, and glandular tissue. This map is then used to identify markers for a minimum spanning forest that is grown to segment the image using spatial and intensity information. To evaluate the authors’ classification method, they use DICE overlap ratios to compare the results of the automated classification to those obtained by manual segmentation on five patient images. Results: Comparison between the automatic and the manual segmentation shows that the minimum spanning forest based classification method was able to successfully classify dedicated breast CT image with average DICE ratios of 96.9%, 89.8%, and 89.5% for fat, glandular, and skin tissue, respectively. Conclusions: A 2D minimum spanning forest based classification method was proposed and evaluated for classifying the fat, skin, and glandular tissue in dedicated breast CT images. The classification method can be used for dense breast tissue quantification, radiation dose assessment, and other applications in breast imaging
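
    The DICE overlap ratio used for evaluation in this abstract has a simple closed form, 2|A ∩ M| / (|A| + |M|), computed per tissue class between the automatic and manual label maps. The sketch below assumes label maps stored as flat lists of class names; the function name and the six-pixel toy data are hypothetical and are not the study's images.

```python
def dice_per_class(auto_labels, manual_labels, classes):
    """Per-class DICE overlap between two equally sized label maps."""
    scores = {}
    for c in classes:
        auto = {i for i, lab in enumerate(auto_labels) if lab == c}
        manual = {i for i, lab in enumerate(manual_labels) if lab == c}
        denom = len(auto) + len(manual)
        # A class absent from both maps counts as perfect agreement.
        scores[c] = 2 * len(auto & manual) / denom if denom else 1.0
    return scores

# Toy label maps (made up): one pixel labelled fat automatically, glandular manually.
auto = ["fat", "fat", "gland", "skin", "fat", "gland"]
manual = ["fat", "fat", "gland", "skin", "gland", "gland"]
scores = dice_per_class(auto, manual, ["fat", "gland", "skin"])
```

    A single disagreeing pixel lowers the score of both classes involved, which is why per-class ratios are reported separately for fat, glandular and skin tissue.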

  20. Genome profiling (GP) method based classification of insects: congruence with that of classical phenotype-based one.

    Directory of Open Access Journals (Sweden)

    Shamim Ahmed

    Full Text Available Ribosomal RNAs have been widely used for identification and classification of species, and have produced data giving new insights into phylogenetic relationships. Recently, multilocus genotyping and even whole genome sequencing-based technologies have been adopted in ambitious comparative biology studies. However, such technologies are still far from routine use in species classification studies due to their high costs in terms of labor, equipment and consumables. Here, we describe a simple and powerful approach for species classification called genome profiling (GP). The GP method, composed of random PCR, temperature gradient gel electrophoresis (TGGE) and computer-aided gel image processing, is highly informative and less laborious. For demonstration, we classified 26 species of insects using GP and 18S rDNA-sequencing approaches. Judged by a congruence value, the GP method was found to give a better correspondence to the classical phenotype-based approach than did 18S rDNA sequencing. To our surprise, use of a single probe in GP was sufficient to identify the relationships between the insect species, making this approach more straightforward. The data gathered here, together with those of previous studies, show that GP is a simple and powerful method that can be applied almost universally for identifying and classifying species. The current success supports our previous proposal that a GP-based web database can be constructed and would be effective for the global identification/classification of species.

  1. Intelligence system based classification approach for medical disease diagnosis

    Science.gov (United States)

    Sagir, Abdu Masanawa; Sathasivam, Saratha

    2017-08-01

    The prediction of breast cancer in women who have no signs or symptoms of the disease, as well as of survivability after surgery, has been a challenging problem for medical researchers. The decision about the presence or absence of disease often depends more on the physician's intuition, experience and skill in comparing current indicators with previous ones than on the knowledge-rich data hidden in a database. This is a crucial and challenging task. The goal is to predict patient condition by using an adaptive neuro fuzzy inference system (ANFIS) pre-processed by grid partitioning. To achieve an accurate diagnosis at this complex stage of symptom analysis, the physician may need an efficient diagnosis system. A framework describes the methodology for designing and evaluating the classification performance of two discrete ANFIS systems with hybrid learning algorithms, least square estimates with Modified Levenberg-Marquardt and with Gradient descent, that physicians can use to accelerate the diagnosis process. The proposed method's performance was evaluated on training and test datasets from the mammographic mass and Haberman's survival datasets, obtained from the benchmark datasets of the University of California at Irvine (UCI) machine learning repository. The robustness of the performance, measured by total accuracy, sensitivity and specificity, is examined. The proposed method achieves superior performance compared to the conventional ANFIS based gradient descent algorithm and some related existing methods. The software used for the implementation is MATLAB R2014a (version 8.3), executed on a PC with an Intel Pentium IV E7400 processor at 2.80 GHz and 2.0 GB of RAM.

  2. RESEARCH ON REMOTE SENSING GEOLOGICAL INFORMATION EXTRACTION BASED ON OBJECT ORIENTED CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    H. Gao

    2018-04-01

    Full Text Available Northern Tibet belongs to the sub-cold arid climate zone of the plateau. It is rarely visited by people, and the geological working conditions are very poor. However, the stratum exposures are good and human interference is very small. Therefore, research on the automatic classification and extraction of remote sensing geological information there has typical significance and good application prospects. Based on object-oriented classification in Northern Tibet, using Worldview2 high-resolution remote sensing data combined with tectonic information and image enhancement, the lithological spectral features, shape features, spatial locations and topological relations of various geological information are extracted. By setting thresholds within a hierarchical classification, eight kinds of geological information were classified and extracted. An accuracy analysis against the existing geological maps shows that the overall accuracy reached 87.8561%, indicating that the object-oriented classification method is effective and feasible for this study area and provides a new idea for the automatic extraction of remote sensing geological information.

  3. The 2015 World Health Organization Classification of Lung Tumors: Impact of Genetic, Clinical and Radiologic Advances Since the 2004 Classification.

    Science.gov (United States)

    Travis, William D; Brambilla, Elisabeth; Nicholson, Andrew G; Yatabe, Yasushi; Austin, John H M; Beasley, Mary Beth; Chirieac, Lucian R; Dacic, Sanja; Duhig, Edwina; Flieder, Douglas B; Geisinger, Kim; Hirsch, Fred R; Ishikawa, Yuichi; Kerr, Keith M; Noguchi, Masayuki; Pelosi, Giuseppe; Powell, Charles A; Tsao, Ming Sound; Wistuba, Ignacio

    2015-09-01

    The 2015 World Health Organization (WHO) Classification of Tumors of the Lung, Pleura, Thymus and Heart has just been published with numerous important changes from the 2004 WHO classification. The most significant changes in this edition involve (1) use of immunohistochemistry throughout the classification, (2) a new emphasis on genetic studies, in particular, integration of molecular testing to help personalize treatment strategies for advanced lung cancer patients, (3) a new classification for small biopsies and cytology similar to that proposed in the 2011 Association for the Study of Lung Cancer/American Thoracic Society/European Respiratory Society classification, (4) a completely different approach to lung adenocarcinoma as proposed by the 2011 Association for the Study of Lung Cancer/American Thoracic Society/European Respiratory Society classification, (5) restricting the diagnosis of large cell carcinoma only to resected tumors that lack any clear morphologic or immunohistochemical differentiation with reclassification of the remaining former large cell carcinoma subtypes into different categories, (6) reclassifying squamous cell carcinomas into keratinizing, nonkeratinizing, and basaloid subtypes with the nonkeratinizing tumors requiring immunohistochemistry proof of squamous differentiation, (7) grouping of neuroendocrine tumors together in one category, (8) adding NUT carcinoma, (9) changing the term sclerosing hemangioma to sclerosing pneumocytoma, (10) changing the name hamartoma to "pulmonary hamartoma," (11) creating a group of PEComatous tumors that include (a) lymphangioleiomyomatosis, (b) PEComa, benign (with clear cell tumor as a variant) and (c) PEComa, malignant, (12) introducing the entity pulmonary myxoid sarcoma with an EWSR1-CREB1 translocation, (13) adding the entities myoepithelioma and myoepithelial carcinomas, which can show EWSR1 gene rearrangements, (14) recognition of usefulness of WWTR1-CAMTA1 fusions in diagnosis of epithelioid

  4. Wearable-Sensor-Based Classification Models of Faller Status in Older Adults.

    Directory of Open Access Journals (Sweden)

    Jennifer Howcroft

    Full Text Available Wearable sensors have potential for quantitative, gait-based, point-of-care fall risk assessment that can be easily and quickly implemented in clinical-care and older-adult living environments. This investigation generated models for wearable-sensor based fall-risk classification in older adults and identified the optimal sensor type, location, combination, and modelling method for walking with and without a cognitive load task. A convenience sample of 100 older individuals (75.5 ± 6.7 years; 76 non-fallers, 24 fallers based on 6 month retrospective fall occurrence) walked 7.62 m under single-task and dual-task conditions while wearing pressure-sensing insoles and tri-axial accelerometers at the head, pelvis, and left and right shanks. Participants also completed the Activities-specific Balance Confidence scale, Community Health Activities Model Program for Seniors questionnaire, six minute walk test, and ranked their fear of falling. Fall risk classification models were assessed for all sensor combinations and three model types: multi-layer perceptron neural network, naïve Bayesian, and support vector machine. The best performing model was a multi-layer perceptron neural network with input parameters from pressure-sensing insoles and head, pelvis, and left shank accelerometers (accuracy = 84%, F1 score = 0.600, MCC score = 0.521). Head sensor-based models had the best performance of the single-sensor models for single-task gait assessment. Single-task gait assessment models outperformed models based on dual-task walking or clinical assessment data. Support vector machines and neural networks were the best modelling techniques for fall risk classification. Fall risk classification models developed for point-of-care environments should be developed using support vector machines and neural networks, with a multi-sensor single-task gait assessment.

  5. A Sieving ANN for Emotion-Based Movie Clip Classification

    Science.gov (United States)

    Watanapa, Saowaluk C.; Thipakorn, Bundit; Charoenkitkarn, Nipon

    Effective classification and analysis of semantic contents are very important for the content-based indexing and retrieval of video database. Our research attempts to classify movie clips into three groups of commonly elicited emotions, namely excitement, joy and sadness, based on a set of abstract-level semantic features extracted from the film sequence. In particular, these features consist of six visual and audio measures grounded on the artistic film theories. A unique sieving-structured neural network is proposed to be the classifying model due to its robustness. The performance of the proposed model is tested with 101 movie clips excerpted from 24 award-winning and well-known Hollywood feature films. The experimental result of 97.8% correct classification rate, measured against the collected human-judges, indicates the great potential of using abstract-level semantic features as an engineered tool for the application of video-content retrieval/indexing.

  6. Faller Classification in Older Adults Using Wearable Sensors Based on Turn and Straight-Walking Accelerometer-Based Features.

    Science.gov (United States)

    Drover, Dylan; Howcroft, Jennifer; Kofman, Jonathan; Lemaire, Edward D

    2017-06-07

    Faller classification in elderly populations can facilitate preventative care before a fall occurs. A novel wearable-sensor based faller classification method for the elderly was developed using accelerometer-based features from straight walking and turns. Seventy-six older individuals (74.15 ± 7.0 years), categorized as prospective fallers and non-fallers, completed a six-minute walk test with accelerometers attached to their lower legs and pelvis. After segmenting straight and turn sections, cross validation tests were conducted on straight and turn walking features to assess classification performance. The best "classifier model-feature selector" combination used turn data, random forest classifier, and select-5-best feature selector (73.4% accuracy, 60.5% sensitivity, 82.0% specificity, and 0.44 Matthews correlation coefficient (MCC)). Using only the most frequently occurring features, a feature subset (minimum of anterior-posterior ratio of even/odd harmonics for right shank, standard deviation (SD) of anterior left shank acceleration SD, SD of mean anterior left shank acceleration, maximum of medial-lateral first quartile of Fourier transform (FQFFT) for lower back, maximum of anterior-posterior FQFFT for lower back) achieved better classification results, with 77.3% accuracy, 66.1% sensitivity, 84.7% specificity, and 0.52 MCC score. All classification performance metrics improved when turn data were used for faller classification, compared to straight walking data. Combining turn and straight walking features decreased performance metrics compared to turn features for similar classifier model-feature selector combinations.

  7. Classification Framework for ICT-Based Learning Technologies for Disabled People

    Science.gov (United States)

    Hersh, Marion

    2017-01-01

    The paper presents the first systematic approach to the classification of inclusive information and communication technologies (ICT)-based learning technologies and ICT-based learning technologies for disabled people which covers both assistive and general learning technologies, is valid for all disabled people and considers the full range of…

  8. Essential drugs for cancer chemotherapy. WHO consultation.

    OpenAIRE

    1994-01-01

    The WHO recommendation on essential drugs for cancer chemotherapy has been updated. General principles on the proper role of cancer chemotherapeutic agents in relation to efficacy and on the classification of tumours with respect to their curative potential are discussed. Curable cancers and those cancers where the cost-benefit ratio clearly favours drug treatment can be managed appropriately based on only 24 drugs. Fourteen of them should ideally be available for the treatment of the ten mos...

  9. Convolution-based classification of audio and symbolic representations of music

    DEFF Research Database (Denmark)

    Velarde, Gissel; Cancino Chacón, Carlos; Meredith, David

    2018-01-01

    We present a novel convolution-based method for classification of audio and symbolic representations of music, which we apply to classification of music by style. Pieces of music are first sampled to pitch–time representations (piano-rolls or spectrograms) and then convolved with a Gaussian filter......-class composer identification, methods specialised for classifying symbolic representations of music are more effective. We also performed experiments on symbolic representations, synthetic audio and two different recordings of The Well-Tempered Clavier by J. S. Bach to study the method’s capacity to distinguish...

  10. Cancer classification through filtering progressive transductive support vector machine based on gene expression data

    Science.gov (United States)

    Lu, Xinguo; Chen, Dan

    2017-08-01

    Traditional supervised classifiers work only with labeled data and neglect the large amount of data that lacks sufficient follow-up information. Consequently, the small sample size limits progress in designing appropriate classifiers. In this paper, a transductive learning method is presented that combines a filtering strategy within the transductive framework with a progressive labeling strategy. The progressive labeling strategy does not need the distribution of labeled samples to estimate the distribution of unlabeled samples, and it can effectively solve the problem of estimating the proportion of positive and negative samples in the working set. Our experimental results demonstrate that the proposed technique has great potential in cancer prediction based on gene expression.

  11. Insights into the classification of small GTPases

    Directory of Open Access Journals (Sweden)

    Dominik Heider

    2010-05-01

    Full Text Available Dominik Heider (1), Sascha Hauke (3), Martin Pyka (4), Daniel Kessler (2). (1) Department of Bioinformatics, Center for Medical Biotechnology, (2) Institute of Cell Biology (Cancer Research), University of Duisburg-Essen, Essen, Germany; (3) Institute of Computer Science, University of Münster, Münster, Germany; (4) Interdisciplinary Center for Clinical Research, University Hospital of Münster, Münster, Germany. Abstract: In this study we used a Random Forest-based approach for an assignment of small guanosine triphosphate proteins (GTPases) to specific subgroups. Small GTPases represent an important functional group of proteins that serve as molecular switches in a wide range of fundamental cellular processes, including intracellular transport, movement and signaling events. These proteins have further gained a special emphasis in cancer research, because within the last decades a huge variety of small GTPases from different subgroups could be related to the development of all types of tumors. Using a random forest approach, we were able to identify the most important amino acid positions for the classification process within the small GTPases superfamily and its subgroups. These positions are in line with the results of earlier studies and have been shown to be the essential elements for the different functionalities of the GTPase families. Furthermore, we provide an accurate and reliable software tool (GTPasePred) to identify potential novel GTPases and demonstrate its application to genome sequences. Keywords: cancer, machine learning, classification, Random Forests, proteins

  12. Deep Galaxy: Classification of Galaxies based on Deep Convolutional Neural Networks

    OpenAIRE

    Khalifa, Nour Eldeen M.; Taha, Mohamed Hamed N.; Hassanien, Aboul Ella; Selim, I. M.

    2017-01-01

    In this paper, a deep convolutional neural network architecture for galaxy classification is presented. A galaxy can be classified based on its features into three main categories: Elliptical, Spiral, and Irregular. The proposed deep galaxies architecture consists of 8 layers, one main convolutional layer for feature extraction with 96 filters, followed by two principal fully connected layers for classification. It is trained over 1356 images and achieved 97.272% in testing accuracy. A c...

  13. Task Classification Based Energy-Aware Consolidation in Clouds

    Directory of Open Access Journals (Sweden)

    HeeSeok Choi

    2016-01-01

    Full Text Available We consider a cloud data center, in which the service provider supplies virtual machines (VMs) on hosts or physical machines (PMs) to its subscribers for computation in an on-demand fashion. For the cloud data center, we propose a task consolidation algorithm based on task classification (i.e., computation-intensive and data-intensive) and resource utilization (e.g., CPU and RAM). Furthermore, we design a VM consolidation algorithm to balance task execution time and energy consumption without violating a predefined service level agreement (SLA). Unlike the existing research on VM consolidation or scheduling that applies none or single threshold schemes, we focus on a double threshold (upper and lower) scheme, which is used for VM consolidation. More specifically, when a host operates with resource utilization below the lower threshold, all the VMs on the host will be scheduled to be migrated to other hosts and then the host will be powered down, while when a host operates with resource utilization above the upper threshold, a VM will be migrated to avoid using 100% of resource utilization. Based on experimental performance evaluations with real-world traces, we prove that our task classification based energy-aware consolidation algorithm (TCEA) achieves a significant energy reduction without incurring predefined SLA violations.
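
    The double threshold rule described in this abstract can be sketched as a simple per-host decision. This is a minimal illustration, not the TCEA algorithm itself: the function name, the action labels and the threshold values 0.2 and 0.8 are assumptions, and the abstract does not say how the VM to migrate or the destination host is chosen.

```python
def plan_consolidation(host_utilization, lower=0.2, upper=0.8):
    """Map each host to an action under a double (lower/upper) threshold rule.

    host_utilization maps a host name to its resource utilization in [0, 1].
    The threshold defaults are hypothetical, not taken from the paper.
    """
    actions = {}
    for host, utilization in host_utilization.items():
        if utilization < lower:
            # Underloaded: move every VM elsewhere, then power the host down.
            actions[host] = "migrate_all_and_power_down"
        elif utilization > upper:
            # Overloaded: move one VM away to stay clear of 100% utilization.
            actions[host] = "migrate_one_vm"
        else:
            actions[host] = "keep"
    return actions

# Made-up utilization figures for three hosts.
plan = plan_consolidation({"pm1": 0.10, "pm2": 0.55, "pm3": 0.93})
```

    Compared with a single threshold, the lower bound is what allows idle hosts to be switched off, which is where the energy saving comes from.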

  14. Model-based object classification using unification grammars and abstract representations

    Science.gov (United States)

    Liburdy, Kathleen A.; Schalkoff, Robert J.

    1993-04-01

    The design and implementation of a high level computer vision system which performs object classification is described. General object labelling and functional analysis require models of classes which display a wide range of geometric variations. A large representational gap exists between abstract criteria such as `graspable' and current geometric image descriptions. The vision system developed and described in this work addresses this problem and implements solutions based on a fusion of semantics, unification, and formal language theory. Object models are represented using unification grammars, which provide a framework for the integration of structure and semantics. A methodology for the derivation of symbolic image descriptions capable of interacting with the grammar-based models is described and implemented. A unification-based parser developed for this system achieves object classification by determining if the symbolic image description can be unified with the abstract criteria of an object model. Future research directions are indicated.

  15. A texton-based approach for the classification of lung parenchyma in CT images

    DEFF Research Database (Denmark)

    Gangeh, Mehrdad J.; Sørensen, Lauge; Shaker, Saher B.

    2010-01-01

    In this paper, a texton-based classification system based on raw pixel representation along with a support vector machine with radial basis function kernel is proposed for the classification of emphysema in computed tomography images of the lung. The proposed approach is tested on 168 annotated...... regions of interest consisting of normal tissue, centrilobular emphysema, and paraseptal emphysema. The results show the superiority of the proposed approach to common techniques in the literature including moments of the histogram of filter responses based on Gaussian derivatives. The performance...

  16. WHO/ISUP classification of the urothelial tumors of the urinary bladder

    Directory of Open Access Journals (Sweden)

    Zdenka Ovčak

    2005-09-01

    Full Text Available Background: The authors present the current classification of urothelial neoplasms of the urinary bladder. The 1973 classification of urothelial tumors of the urinary bladder was, despite some imperfections, used relatively successfully for more than thirty years. The three-grade classification of papillary urothelial tumors without invasion was based on evaluation of variations in the architecture of the covering epithelium and tumor cell anaplasia. As recommended by the International Society of Urological Pathologists (ISUP), the World Health Organisation (WHO) accepted the new WHO/ISUP classification in 1998, which was revised in 2002 and finally published in 2004. With the intention of avoiding unnecessary diagnosis of cancer in patients having papillary urothelial tumors with rare invasive or metastatic growth, this classification introduced a new entity, the papillary urothelial neoplasia of low malignant potential (PUNLMP). The additional change in classification was the division of invasive urothelial neoplasms into only low and high grade urothelial carcinomas. Conclusions: The authors' opinion is that although the old classification is no longer recommended for use, the new one does not resolve the fundamental objections to the previous classification, such as terminological unsuitability and insufficient scientific reasoning. Our proposed solution in classification of papillary urothelial neoplasms would be the application of criteria analogous to those used in diagnostics of papillary noninvasive tumors of the head and neck or alimentary tract.

  17. SAW Classification Algorithm for Chinese Text Classification

    OpenAIRE

    Xiaoli Guo; Huiyu Sun; Tiehua Zhou; Ling Wang; Zhaoyang Qu; Jiannan Zang

    2015-01-01

    Given the explosive growth of text data, text categorization faces higher performance requirements that existing classification methods cannot satisfy. Based on a study of existing text classification technology and semantics, this paper puts forward a Chinese-text-oriented classification algorithm, SAW (Structural Auxiliary Word). The algorithm uses the special space effect of Chinese text where words...

  18. A patch-based convolutional neural network for remote sensing image classification.

    Science.gov (United States)

    Sharma, Atharva; Liu, Xiuwen; Yang, Xiaojun; Shi, Di

    2017-11-01

    Availability of accurate land cover information over large areas is essential to the global environment sustainability; digital classification using medium-resolution remote sensing data would provide an effective method to generate the required land cover information. However, low accuracy of existing per-pixel based classification methods for medium-resolution data is a fundamental limiting factor. While convolutional neural networks (CNNs) with deep layers have achieved unprecedented improvements in object recognition applications that rely on fine image structures, they cannot be applied directly to medium-resolution data due to lack of such fine structures. In this paper, considering the spatial relation of a pixel to its neighborhood, we propose a new deep patch-based CNN system tailored for medium-resolution remote sensing data. The system is designed by incorporating distinctive characteristics of medium-resolution data; in particular, the system computes patch-based samples from multidimensional top of atmosphere reflectance data. With a test site from the Florida Everglades area (with a size of 771 square kilometers), the proposed new system has outperformed pixel-based neural network, pixel-based CNN and patch-based neural network by 24.36%, 24.23% and 11.52%, respectively, in overall classification accuracy. By combining the proposed deep CNN and the huge collection of medium-resolution remote sensing data, we believe that much more accurate land cover datasets can be produced over large areas. Copyright © 2017 Elsevier Ltd. All rights reserved.

  19. Task-Driven Dictionary Learning Based on Mutual Information for Medical Image Classification.

    Science.gov (United States)

    Diamant, Idit; Klang, Eyal; Amitai, Michal; Konen, Eli; Goldberger, Jacob; Greenspan, Hayit

    2017-06-01

    We present a novel variant of the bag-of-visual-words (BoVW) method for automated medical image classification. Our approach improves the BoVW model by learning a task-driven dictionary of the most relevant visual words per task using a mutual information-based criterion. Additionally, we generate relevance maps to visualize and localize the decision of the automatic classification algorithm. These maps demonstrate how the algorithm works and show the spatial layout of the most relevant words. We applied our algorithm to three different tasks: chest x-ray pathology identification (of four pathologies: cardiomegaly, enlarged mediastinum, right consolidation, and left consolidation), liver lesion classification into four categories in computed tomography (CT) images and benign/malignant clusters of microcalcifications (MCs) classification in breast mammograms. Validation was conducted on three datasets: 443 chest x-rays, 118 portal phase CT images of liver lesions, and 260 mammography MCs. The proposed method improves the classical BoVW method for all tested applications. For chest x-ray, area under curve of 0.876 was obtained for enlarged mediastinum identification compared to 0.855 using classical BoVW (with p-value 0.01). For MC classification, a significant improvement of 4% was achieved using our new approach (with p-value = 0.03). For liver lesion classification, an improvement of 6% in sensitivity and 2% in specificity were obtained (with p-value 0.001). We demonstrated that classification based on informative selected set of words results in significant improvement. Our new BoVW approach shows promising results in clinically important domains. Additionally, it can discover relevant parts of images for the task at hand without explicit annotations for training data. This can provide computer-aided support for medical experts in challenging image analysis tasks.
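
    The word-selection idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the binarised word-presence indicator, the function names `mutual_information` and `select_words`, and the toy histograms are all assumptions made for illustration. Each visual word is scored by the mutual information between its presence in an image and the class label, and the top-scoring words are kept.

```python
import math
from collections import Counter


def mutual_information(presence, labels):
    """Mutual information (in bits) between a binary presence indicator and labels."""
    n = len(labels)
    joint = Counter(zip(presence, labels))
    p_word = Counter(presence)
    p_label = Counter(labels)
    mi = 0.0
    for (w, c), count in joint.items():
        p_wc = count / n
        mi += p_wc * math.log2(p_wc / ((p_word[w] / n) * (p_label[c] / n)))
    return mi


def select_words(histograms, labels, k):
    """Return the indices of the k visual words with the highest MI score."""
    n_words = len(histograms[0])
    scores = []
    for j in range(n_words):
        # Assumption: a word counts as "present" if it occurs at least once.
        presence = [1 if h[j] > 0 else 0 for h in histograms]
        scores.append((mutual_information(presence, labels), j))
    scores.sort(key=lambda s: (-s[0], s[1]))  # highest score first, ties by index
    return [j for _, j in scores[:k]]


# Hypothetical toy data: 4 images, 3 visual words; word 0 tracks the label exactly.
histograms = [[3, 1, 0], [2, 0, 1], [0, 1, 0], [0, 0, 2]]
labels = [1, 1, 0, 0]
selected = select_words(histograms, labels, k=1)
```

    In this toy case word 0 carries one full bit of information about the label while the other two words carry none, so it is the only word retained.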

  20. Multi-sparse dictionary colorization algorithm based on the feature classification and detail enhancement

    Science.gov (United States)

    Yan, Dan; Bai, Lianfa; Zhang, Yi; Han, Jing

    2018-02-01

    For the problems of missing details and performance of the colorization based on sparse representation, we propose a conceptual model framework for colorizing gray-scale images, and then a multi-sparse dictionary colorization algorithm based on the feature classification and detail enhancement (CEMDC) is proposed based on this framework. The algorithm can achieve a natural colorized effect for a gray-scale image, and it is consistent with the human vision. First, the algorithm establishes a multi-sparse dictionary classification colorization model. Then, to improve the accuracy rate of the classification, the corresponding local constraint algorithm is proposed. Finally, we propose a detail enhancement based on Laplacian Pyramid, which is effective in solving the problem of missing details and improving the speed of image colorization. In addition, the algorithm not only realizes the colorization of the visual gray-scale image, but also can be applied to the other areas, such as color transfer between color images, colorizing gray fusion images, and infrared images.

  1. SVM classification model in depression recognition based on mutation PSO parameter optimization

    Directory of Open Access Journals (Sweden)

    Zhang Ming

    2017-01-01

    Full Text Available At present, the clinical diagnosis of depression relies mainly on structured interviews by psychiatrists and lacks objective diagnostic methods, which leads to a higher rate of misdiagnosis. In this paper, a method of depression recognition based on SVM and a mutation particle swarm optimization algorithm is proposed. To address the problem that the particle swarm optimization (PSO) algorithm is easily trapped in local optima, we propose a feedback mutation PSO algorithm (FBPSO) to balance local search and global exploration ability, so that the parameters of the classification model are optimal. We compared the depression classification accuracy of different PSO mutation algorithms and found that the support vector machine (SVM) classifier based on the feedback mutation PSO algorithm achieves the highest classification accuracy. Our study provides an important reference for establishing auxiliary diagnostic tools for depression recognition in clinical diagnosis.
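
    The abstract does not specify the feedback mutation rule, so the following is only a generic sketch of PSO with a stagnation-triggered mutation, not FBPSO itself. The function name `pso_with_mutation`, the inertia and acceleration constants, and the `stall_limit` trigger are assumptions. In the setting described above the objective would be something like the cross-validated error of an SVM as a function of its parameters; here a simple sphere function stands in for it.

```python
import random


def pso_with_mutation(objective, dim, bounds, n_particles=20, iters=50,
                      stall_limit=5, seed=0):
    """Minimise `objective` with PSO; mutate particles when the swarm stagnates."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = [gbest_val]
    stall = 0
    for _ in range(iters):
        improved = False
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * r1 * (pbest[i][d] - pos[i][d])
                             + 1.5 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the new position to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
                    improved = True
        stall = 0 if improved else stall + 1
        if stall >= stall_limit:
            # Assumed mutation rule: re-randomise one coordinate of half the swarm
            # to restore global exploration after the global best stops improving.
            for i in rng.sample(range(n_particles), n_particles // 2):
                pos[i][rng.randrange(dim)] = rng.uniform(lo, hi)
            stall = 0
        history.append(gbest_val)
    return gbest, gbest_val, history


def sphere(x):
    """Stand-in objective with its minimum at the origin."""
    return sum(v * v for v in x)


best, best_val, history = pso_with_mutation(sphere, dim=2, bounds=(-5.0, 5.0))
```

    Because the mutation only perturbs current positions and never overwrites the stored personal or global bests, the recorded best value cannot get worse from one iteration to the next.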

  2. A Novel Imbalanced Data Classification Approach Based on Logistic Regression and Fisher Discriminant

    Directory of Open Access Journals (Sweden)

    Baofeng Shi

    2015-01-01

    Full Text Available We introduce an imbalanced data classification approach based on logistic regression significant discriminant and Fisher discriminant. First, a key indicator extraction model based on logistic regression significant discriminant and correlation analysis is derived to extract features for customer classification. Second, a customer scoring model is established on the basis of linear weighting using the Fisher discriminant. Then, a customer rating model is constructed in which the number of customers across all ratings follows a normal distribution. The performance of the proposed model and the classical SVM classification method are evaluated in terms of their ability to correctly classify consumers as default or nondefault customers. Empirical results using the data of 2157 customers in financial engineering suggest that the proposed approach performs better than the SVM model in dealing with imbalanced data classification. Moreover, our approach contributes to locating the qualified customers for the banks and the bond investors.

  3. Rule-based land cover classification from very high-resolution satellite image with multiresolution segmentation

    Science.gov (United States)

    Haque, Md. Enamul; Al-Ramadan, Baqer; Johnson, Brian A.

    2016-07-01

    Multiresolution segmentation and rule-based classification techniques are used to classify objects from very high-resolution satellite images of urban areas. Custom rules are developed using different spectral, geometric, and textural features with five scale parameters, which exploit varying classification accuracy. Principal component analysis is used to select the most important features out of a total of 207 different features. In particular, seven different object types are considered for classification. The overall classification accuracy achieved for the rule-based method is 95.55% and 98.95% for seven and five classes, respectively. Other classifiers that are not using rules perform at 84.17% and 97.3% accuracy for seven and five classes, respectively. The results exploit coarse segmentation for higher scale parameter and fine segmentation for lower scale parameter. The major contribution of this research is the development of rule sets and the identification of major features for satellite image classification where the rule sets are transferable and the parameters are tunable for different types of imagery. Additionally, the individual objectwise classification and principal component analysis help to identify the required object from an arbitrary number of objects within images given ground truth data for the training.

  4. Application of Metabolomics in Thyroid Cancer Research

    Directory of Open Access Journals (Sweden)

    Anna Wojakowska

    2015-01-01

    Full Text Available Thyroid cancer is the most common endocrine malignancy with four major types distinguished on the basis of histopathological features: papillary, follicular, medullary, and anaplastic. Classification of thyroid cancer is the primary step in the assessment of prognosis and selection of the treatment. However, in some cases, cytological and histological patterns are inconclusive; hence, classification based on histopathology could be supported by molecular biomarkers, including markers identified with the use of high-throughput “omics” techniques. Besides genomics, transcriptomics, and proteomics, metabolomics is emerging as the most downstream approach, reflecting phenotypic changes and alterations in pathophysiological states of biological systems. Metabolomics using mass spectrometry and magnetic resonance spectroscopy techniques allows qualitative and quantitative profiling of small molecules present in biological systems. This approach can be applied to reveal metabolic differences between different types of thyroid cancer and to identify new potential candidates for molecular biomarkers. In this review, we consider current results concerning the application of metabolomics in the field of thyroid cancer research. Recent studies show that metabolomics can provide significant information for discriminating between different types of thyroid lesions. In the near future, one could expect further progress in thyroid cancer metabolomics, leading to the development of molecular markers and improved classification and diagnosis of tumor types.

  5. Thermographic image analysis for classification of ACL rupture disease, bone cancer, and feline hyperthyroid, with Gabor filters

    Science.gov (United States)

    Alvandipour, Mehrdad; Umbaugh, Scott E.; Mishra, Deependra K.; Dahal, Rohini; Lama, Norsang; Marino, Dominic J.; Sackman, Joseph

    2017-05-01

    Thermography and pattern classification techniques are used to classify three different pathologies in veterinary images. Thermographic images of both normal and diseased animals were provided by the Long Island Veterinary Specialists (LIVS). The three pathologies are ACL rupture disease, bone cancer, and feline hyperthyroid. The diagnosis of these diseases usually involves radiology and laboratory tests, while the method that we propose uses thermographic images and image analysis techniques and is intended for use as a prescreening tool. Images in each category of pathologies are first filtered by Gabor filters, and then various features are extracted and used for classification into normal and abnormal classes. Gabor filters are linear filters that can be characterized by the two parameters wavelength λ and orientation θ. With two different wavelengths and five different orientations, a total of ten different filters were studied. Different combinations of camera views, filters, feature vectors, normalization methods, and classification methods produce different tests that were examined, and the sensitivity, specificity and success rate for each test were produced. Using the Gabor features alone, sensitivity, specificity, and overall success rates of 85% were achieved for each of the pathologies.
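
    The filter bank described above (two wavelengths times five orientations, ten filters in total) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the abstract gives neither the wavelength values nor the orientation angles, so the values below, the kernel size, and the envelope parameters `sigma` and `gamma` are all hypothetical. The kernel is the real part of the standard Gabor function.

```python
import math


def gabor_kernel(wavelength, theta, size=9, sigma=2.0, gamma=0.5):
    """Real part of a 2-D Gabor filter with the given wavelength and orientation."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates by the filter orientation theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + (gamma * yr) ** 2) / (2 * sigma ** 2))
            row.append(envelope * math.cos(2 * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel


# Assumed values: the abstract states only that 2 wavelengths and 5 orientations were used.
wavelengths = [4.0, 8.0]
orientations = [k * math.pi / 5 for k in range(5)]
bank = {(lam, th): gabor_kernel(lam, th) for lam in wavelengths for th in orientations}
```

    Each image would then be convolved with every kernel in `bank`, and features would be extracted from the ten filtered responses.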

  6. [Severity classification of chronic obstructive pulmonary disease based on deep learning].

    Science.gov (United States)

    Ying, Jun; Yang, Ceyuan; Li, Quanzheng; Xue, Wanguo; Li, Tanshi; Cao, Wenzhe

    2017-12-01

    In this paper, a deep learning method is proposed to build an automatic classification algorithm for the severity of chronic obstructive pulmonary disease. Large-sample clinical data were used as input features and analyzed for their weights in classification. Through feature selection, model training, parameter optimization and model testing, a classification prediction model based on a deep belief network was built to predict the severity classification criteria defined by the Global Initiative for Chronic Obstructive Lung Disease (GOLD). We obtained accuracy over 90% in prediction for two different standardized versions of the severity criteria, issued in 2007 and 2011 respectively. Moreover, we obtained the contribution ranking of the different input features by analyzing the model coefficient matrix and confirmed that there was a certain degree of agreement between the more contributive input features and clinical diagnostic knowledge. This result supports the validity of the deep belief network model. This study provides an effective solution for the application of deep learning methods in automatic diagnostic decision making.

  7. Classification and Target Group Selection Based Upon Frequent Patterns

    NARCIS (Netherlands)

    W.H.L.M. Pijls (Wim); R. Potharst (Rob)

    2000-01-01

    In this technical report, two new algorithms based upon frequent patterns are proposed. One algorithm is a classification method. The other one is an algorithm for target group selection. In both algorithms, first of all, the collection of frequent patterns in the training set is

  8. Classification-based comparison of pre-processing methods for interpretation of mass spectrometry generated clinical datasets

    Directory of Open Access Journals (Sweden)

    Hoefsloot Huub CJ

    2009-05-01

    Full Text Available Abstract. Background: Mass spectrometry is increasingly being used to discover proteins or protein profiles associated with disease. Experimental design of mass-spectrometry studies has come under close scrutiny and the importance of strict protocols for sample collection is now understood. However, the question of how best to process the large quantities of data generated is still unanswered. Main challenges for the analysis are the choice of proper pre-processing and classification methods. While these two issues have been investigated in isolation, we propose to use the classification of patient samples as a clinically relevant benchmark for the evaluation of pre-processing methods. Results: Two in-house generated clinical SELDI-TOF MS datasets are used in this study as an example of high throughput mass-spectrometry data. We perform a systematic comparison of two commonly used pre-processing methods as implemented in Ciphergen ProteinChip Software and in the Cromwell package. With respect to reproducibility, Ciphergen and Cromwell pre-processing are largely comparable. We find that the overlap between peaks detected by either Ciphergen ProteinChip Software or Cromwell is large. This is especially the case for the more stringent peak detection settings. Moreover, similarity of the estimated intensities between matched peaks is high. We evaluate the pre-processing methods using five different classification methods. Classification is done in a double cross-validation protocol using repeated random sampling to obtain an unbiased estimate of classification accuracy. No pre-processing method significantly outperforms the other for all peak detection settings evaluated. Conclusion: We use classification of patient samples as a clinically relevant benchmark for the evaluation of pre-processing methods. Both pre-processing methods lead to similar classification results on an ovarian cancer and a Gaucher disease dataset. However, the settings for pre

  9. Density Based Support Vector Machines for Classification

    OpenAIRE

    Zahra Nazari; Dongshik Kang

    2015-01-01

    Support Vector Machines (SVM) is the most successful algorithm for classification problems. SVM learns the decision boundary from two classes (for Binary Classification) of training points. However, the training points sometimes include less meaningful samples, called outliers, which are corrupted by noise or lie on the wrong side of the boundary. These outliers affect the margin and classification performance, and the machine would do better to discard them. SVM as a popular and widely used cl...

  10. Risk Classification and Risk-based Safety and Mission Assurance

    Science.gov (United States)

    Leitner, Jesse A.

    2014-01-01

    Recent activities to revamp and emphasize the need to streamline processes and activities for Class D missions across the agency have led to various interpretations of Class D, including the lumping of a variety of low-cost projects into Class D. Sometimes terms such as Class D minus are used. In this presentation, mission risk classifications will be traced to official requirements and definitions as a measure to ensure that projects and programs align with the guidance and requirements that are commensurate for their defined risk posture. As part of this, the full suite of risk classifications, formal and informal, will be defined, followed by an introduction to the new GPR 8705.4 that is currently under review. GPR 8705.4 lays out guidance for the mission success activities performed at the Classes A-D for NPR 7120.5 projects as well as for projects not under NPR 7120.5. Furthermore, the trends in stepping from Class A into higher risk posture classifications will be discussed. The talk will conclude with a discussion about risk-based safety and mission assurance at GSFC.

  11. Overfitting Reduction of Text Classification Based on AdaBELM

    Directory of Open Access Journals (Sweden)

    Xiaoyue Feng

    2017-07-01

    Full Text Available Overfitting is an important problem in machine learning. Several algorithms, such as the extreme learning machine (ELM), suffer from this issue when facing high-dimensional sparse data, e.g., in text classification. One common issue is that the extent of overfitting is not well quantified. In this paper, we propose a quantitative measure of overfitting referred to as the rate of overfitting (RO) and a novel model, named AdaBELM, to reduce the overfitting. With RO, the overfitting problem can be quantitatively measured and identified. The newly proposed model can achieve high performance on multi-class text classification. To evaluate the generalizability of the new model, we designed experiments based on three datasets, i.e., the 20 Newsgroups, Reuters-21578, and BioMed corpora, which represent balanced, unbalanced, and real application data, respectively. Experiment results demonstrate that AdaBELM can reduce overfitting and outperform classical ELM, decision tree, random forests, and AdaBoost on all three text-classification datasets; for example, it can achieve 62.2% higher accuracy than ELM. Therefore, the proposed model has good generalizability.

  12. Stepwise classification of cancer samples using clinical and molecular data

    Directory of Open Access Journals (Sweden)

    Obulkasim Askar

    2011-10-01

    Full Text Available Abstract. Background: Combining clinical and molecular data types may potentially improve prediction accuracy of a classifier. However, currently there is a shortage of effective and efficient statistical and bioinformatic tools for true integrative data analysis. Existing integrative classifiers have two main disadvantages: First, coarse combination may lead to subtle contributions of one data type being overshadowed by more obvious contributions of the other. Second, the need to measure both data types for all patients may be both impractical and (cost) inefficient. Results: We introduce a novel classification method, a stepwise classifier, which takes advantage of the distinct classification power of clinical data and high-dimensional molecular data. We apply classification algorithms to two data types independently, starting with the traditional clinical risk factors. We only turn to relatively expensive molecular data when the uncertainty of the prediction result from clinical data exceeds a predefined limit. Experimental results show that our approach is adaptive: the proportion of samples that needs to be re-classified using molecular data depends on how much we expect the predictive accuracy to increase when re-classifying those samples. Conclusions: Our method renders a more cost-efficient classifier that is at least as good as, and sometimes better than, one based on clinical or molecular data alone. Hence our approach is not just a classifier that minimizes a particular loss function. Instead, it aims to be cost-efficient by avoiding molecular tests for a potentially large subgroup of individuals; moreover, for these individuals a test result would be quickly available, which may lead to reduced waiting times (for diagnosis) and hence lower the patients' distress. Stepwise classification is implemented in R-package stepwiseCM and available at the Bioconductor website.
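
    The stepwise decision rule described above can be sketched in a few lines. This is not the stepwiseCM API (that package is written in R); the function name `stepwise_classify`, the particular uncertainty measure, and the `uncertainty_limit` value are assumptions made for illustration. The clinical prediction is accepted when it is confident enough, and the more expensive molecular classifier is called only otherwise.

```python
def stepwise_classify(clinical_prob, molecular_classifier, sample,
                      uncertainty_limit=0.2):
    """Return (label, source); fall back to molecular data only when needed.

    clinical_prob is the clinical model's estimated probability of class 1.
    """
    # Assumed uncertainty measure: 0 when the probability is 0 or 1, 1 when it is 0.5.
    uncertainty = 1.0 - abs(2.0 * clinical_prob - 1.0)
    if uncertainty <= uncertainty_limit:
        return (1 if clinical_prob >= 0.5 else 0), "clinical"
    return molecular_classifier(sample), "molecular"


# Hypothetical stand-in for an expensive molecular test; records how often it is used.
molecular_calls = []


def molecular_stub(sample):
    molecular_calls.append(sample)
    return 1


results = [
    stepwise_classify(0.95, molecular_stub, "patient_a"),
    stepwise_classify(0.55, molecular_stub, "patient_b"),
    stepwise_classify(0.05, molecular_stub, "patient_c"),
]
```

    In this toy run only the second patient, whose clinical prediction is close to 0.5, triggers the molecular test, which is the source of the cost saving the abstract describes.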

  13. An application to pulmonary emphysema classification based on model of texton learning by sparse representation

    Science.gov (United States)

    Zhang, Min; Zhou, Xiangrong; Goshima, Satoshi; Chen, Huayue; Muramatsu, Chisako; Hara, Takeshi; Yokoyama, Ryojiro; Kanematsu, Masayuki; Fujita, Hiroshi

    2012-03-01

    We aim to use a new texton-based texture classification method for the classification of pulmonary emphysema in computed tomography (CT) images of the lungs. Unlike conventional computer-aided diagnosis (CAD) methods for pulmonary emphysema classification, in this paper the dictionary of textons is first learned by applying sparse representation (SR) to image patches in the training dataset. Then the SR coefficients of the test images over the dictionary are used to construct the histograms for texture representation. Finally, classification is performed using a nearest neighbor classifier with a histogram dissimilarity measure as distance. The proposed approach is tested on 3840 annotated regions of interest consisting of normal tissue and mild, moderate and severe pulmonary emphysema of three subtypes. The proposed system, with an accuracy of about 88%, outperforms the state-of-the-art method based on basic rotation-invariant local binary pattern histograms and the texture classification method based on texton learning by k-means, which performs almost the best among other approaches in the literature.

  14. [Magnetic resonance semiotics of prostate cancer according to the PI-RADS classification. The clinical diagnostic algorithm of a study].

    Science.gov (United States)

    Korobkin, A S; Shariya, M A; Chaban, A S; Voskanvan, G A; Vinarov, A Z

    2015-01-01

    To elaborate the magnetic resonance imaging (MRI) signs of prostate cancer (PC) in accordance with the PI-RADS classification during multiparametric MRI (mpMRI). A total of 89 men aged 20 to 82 years were examined. A control group consisted of 8 (9%) healthy volunteers younger than 30 years of age with no urological history to obtain control images and MRI plots and 20 (22.5%) men aged 26-76 years, whose morphological changes were inflammatory and hyperplastic. The second age-matched group included 61 (68.5%) patients diagnosed with prostate cancer at morphological examination. A set of studies included digital rectal examination, serum prostate-specific antigen, and transrectal ultrasound-guided prostate biopsy. All the patients underwent prostate mpMRI using a 3.0 T Achieva MRI scanner (Philips, the Netherlands). The patients were found to have mpMRI signs that were typical of PC; its MRI semiotics according to the PI-RADS classification is presented. Each mpMRI procedure was determined to be of importance and informative value in detecting PC. The comprehensive mpMRI approach to diagnosing PC improves the quality and diagnostic value of prostate MRI.

  15. Soil classification basing on the spectral characteristics of topsoil samples

    Science.gov (United States)

    Liu, Huanjun; Zhang, Xiaokang; Zhang, Xinle

    2016-04-01

    Soil taxonomy plays an important role in soil utility and management, but China has only a coarse soil map based on 1980s data. New technology, e.g. spectroscopy, could simplify soil classification. This study tries to classify soils based on the spectral characteristics of topsoil samples. 148 topsoil samples of typical soils, including Black soil, Chernozem, Blown soil and Meadow soil, were collected from the Songnen plain, Northeast China, and the room spectral reflectance in the visible and near infrared region (400-2500 nm) was processed with weighted moving average, resampling, and continuum removal. Spectral indices were extracted from the soil spectral characteristics, including the second absorption position of the spectral curve, the area of the first absorption valley, and the slope of the spectral curve at 500-600 nm and 1340-1360 nm. Then K-means clustering and a decision tree were each used to build a soil classification model. The results indicated that 1) the second absorption positions of Black soil and Chernozem were located at 610 nm and 650 nm respectively; 2) the spectral curve of Meadow soil is similar to that of its adjacent soil, which could be due to soil erosion; 3) the decision tree model showed higher classification accuracy, with accuracies for Black soil, Chernozem, Blown soil and Meadow soil of 100%, 88%, 97% and 50% respectively, and the accuracy for Blown soil could be increased to 100% by adding one more spectral index (the area of the first two valleys) to the model, which showed that the model could be used for soil classification and soil mapping in the near future.

  16. Epidemiological bases and molecular mechanisms linking obesity, diabetes, and cancer.

    Science.gov (United States)

    Gutiérrez-Salmerón, María; Chocarro-Calvo, Ana; García-Martínez, José Manuel; de la Vieja, Antonio; García-Jiménez, Custodia

    2017-02-01

    The association between diabetes and cancer was hypothesized almost one century ago. Today, a vast number of epidemiological studies support that obese and diabetic populations are more likely to experience tissue-specific cancers, but the underlying molecular mechanisms remain unknown. Obesity, diabetes, and cancer share many hormonal, immune, and metabolic changes that may account for the relationship between diabetes and cancer. In addition, antidiabetic treatments may have an impact on the occurrence and course of some cancers. Moreover, some anticancer treatments may induce diabetes. These observations aroused a great controversy because of the ethical implications and the associated commercial interests. We report an epidemiological update from a mechanistic perspective that suggests the existence of many common and differential individual mechanisms linking obesity and type 1 and 2 diabetes mellitus to certain cancers. The challenge today is to identify the molecular links responsible for this association. Classification of cancers by their molecular signatures may facilitate future mechanistic and epidemiological studies. Copyright © 2016 SEEN. Publicado por Elsevier España, S.L.U. All rights reserved.

  17. A Novel Algorithm for Imbalance Data Classification Based on Neighborhood Hypergraph

    Directory of Open Access Journals (Sweden)

    Feng Hu

    2014-01-01

    Full Text Available The classification problem for imbalanced data is receiving increasing attention. So far, many significant methods have been proposed and applied in many fields, but more efficient methods are still needed. Although the hypergraph is an efficient tool for knowledge discovery, it may not be powerful enough to deal with data in the boundary region. In this paper, the neighborhood hypergraph is presented, combining rough set theory and hypergraph. After that, a novel classification algorithm for imbalanced data based on the neighborhood hypergraph is developed, which is composed of three steps: initialization of hyperedges, classification of the training data set, and substitution of hyperedges. In an experiment with 10-fold cross validation on 18 data sets, the proposed algorithm achieved higher average accuracy than the others.

  18. Segmentation of Clinical Endoscopic Images Based on the Classification of Topological Vector Features

    Directory of Open Access Journals (Sweden)

    O. A. Dunaeva

    2013-01-01

    Full Text Available In this work, we describe a prototype of a system for automatic segmentation and annotation of endoscopy images. The algorithm used is based on the classification of vectors of the topological features of the original image. We use an image processing scheme which includes image preprocessing, calculation of vector descriptors defined for every point of the source image, and the subsequent classification of descriptors. Image preprocessing includes finding and selecting artifacts and equalizing the image brightness. In this work, we give the detailed algorithm for constructing topological descriptors and the procedure for creating the classifier, based on the combined use of the AdaBoost scheme and a naive Bayes classifier. In the final section, we show the results of the classification of real endoscopic images.

  19. Some improved classification-based ridge parameter of Hoerl and ...

    African Journals Online (AJOL)

    Some improved classification-based ridge parameter of Hoerl and Kennard estimation techniques. ... This assumption is often violated and the ridge regression estimator introduced by [2] has been identified to be more efficient than ordinary least squares (OLS) in handling it. However, it requires a ridge parameter, K, of which ...

  20. Contaminant classification using cosine distances based on multiple conventional sensors.

    Science.gov (United States)

    Liu, Shuming; Che, Han; Smith, Kate; Chang, Tian

    2015-02-01

    Emergent contamination events have a significant impact on water systems. After contamination detection, it is important to classify the type of contaminant quickly to provide support for remediation attempts. Conventional methods generally either rely on laboratory-based analysis, which requires a long analysis time, or on multivariable-based geometry analysis and sequence analysis, which is prone to being affected by the contaminant concentration. This paper proposes a new contaminant classification method, which discriminates contaminants in a real time manner independent of the contaminant concentration. The proposed method quantifies the similarities or dissimilarities between sensors' responses to different types of contaminants. The performance of the proposed method was evaluated using data from contaminant injection experiments in a laboratory and compared with a Euclidean distance-based method. The robustness of the proposed method was evaluated using an uncertainty analysis. The results show that the proposed method performed better in identifying the type of contaminant than the Euclidean distance based method and that it could classify the type of contaminant in minutes without significantly compromising the correct classification rate (CCR).
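
    The concentration independence claimed above follows from the geometry of cosine distance: scaling a response vector does not change its angle to a reference pattern. The sketch below illustrates that idea only. The sensor set, the reference patterns, and the assumption that responses scale linearly with concentration are all hypothetical and are not taken from the paper.

```python
import numpy as np

def cosine_distance(a, b):
    """Return 1 - cos(angle) between two sensor response vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return 1.0 - float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

def classify(response, library):
    """Pick the library contaminant whose reference pattern is closest in angle."""
    return min(library, key=lambda name: cosine_distance(response, library[name]))

# Hypothetical reference patterns: change in (pH, conductivity, turbidity, chlorine).
library = {
    "contaminant_A": [0.2, 1.0, 0.1, -0.8],
    "contaminant_B": [-0.5, 0.3, 0.9, -0.1],
}

# Assumed linear scaling: the same contaminant at a low and at a high concentration.
low_dose = 0.5 * np.array(library["contaminant_B"])
high_dose = 4.0 * np.array(library["contaminant_B"])

label_low = classify(low_dose, library)
label_high = classify(high_dose, library)
```

    Both doses receive the same label because their cosine distance to the reference is zero regardless of magnitude. Real sensor responses are noisy and not exactly proportional to concentration, so this is an idealised picture.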

  1. Object-based Dimensionality Reduction in Land Surface Phenology Classification

    Directory of Open Access Journals (Sweden)

    Brian E. Bunker

    2016-11-01

    Full Text Available Unsupervised classification or clustering of multi-decadal land surface phenology provides a spatio-temporal synopsis of natural and agricultural vegetation response to environmental variability and anthropogenic activities. Notwithstanding the detailed temporal information available in calibrated bi-monthly normalized difference vegetation index (NDVI) and comparable time series, typical pre-classification workflows average a pixel’s bi-monthly index within the larger multi-decadal time series. While this process is one practical way to reduce the dimensionality of time series with many hundreds of image epochs, it effectively dampens temporal variation from both intra- and inter-annual observations related to land surface phenology. Through a novel application of object-based segmentation aimed at spatial (not temporal) dimensionality reduction, all 294 image epochs from a Moderate Resolution Imaging Spectroradiometer (MODIS) bi-monthly NDVI time series covering the northern Fertile Crescent were retained (in homogeneous landscape units) as unsupervised classification inputs. Given the inherent challenges of in situ or manual image interpretation of land surface phenology classes, a cluster validation approach based on transformed divergence enabled comparison between traditional and novel techniques. Improved intra-annual contrast was clearly manifest in rain-fed agriculture and inter-annual trajectories showed increased cluster cohesion, reducing the overall number of classes identified in the Fertile Crescent study area from 24 to 10. Given careful segmentation parameters, this spatial dimensionality reduction technique augments the value of unsupervised learning to generate homogeneous land surface phenology units. By combining recent scalable computational approaches to image segmentation, future work can pursue new global land surface phenology products based on the high temporal resolution signatures of vegetation index time series.

  2. Efficacy measures associated to a plantar pressure based classification system in diabetic foot medicine.

    Science.gov (United States)

    Deschamps, Kevin; Matricali, Giovanni Arnoldo; Desmet, Dirk; Roosen, Philip; Keijsers, Noel; Nobels, Frank; Bruyninckx, Herman; Staes, Filip

    2016-09-01

    As in many other areas of medicine, the concept of 'classification' has been found to be fundamental in the field of diabetic medicine. In the current study, we aimed to determine efficacy measures of a recently published plantar pressure based classification system. Technical efficacy of the classification system was investigated by applying a high resolution, pixel-level analysis to the normalized plantar pressure pedobarographic fields of the original experimental dataset consisting of 97 patients with diabetes and 33 persons without diabetes. Clinical efficacy was assessed by considering the occurrence of foot ulcers at the plantar aspect of the forefoot in this dataset. Classification efficacy was assessed by determining the classification recognition rate as well as its sensitivity and specificity using cross-validation subsets of the experimental dataset together with a novel cohort of 12 patients with diabetes. Pixel-level comparison of the four groups associated with the classification system highlighted distinct regional differences. Retrospective analysis showed the occurrence of eleven foot ulcers in the experimental dataset since their gait analysis. Eight of the eleven ulcers developed in the region of the foot that had the highest forces. Overall classification recognition rate exceeded 90% for all cross-validation subsets. Sensitivity and specificity of the four groups associated with the classification system exceeded 0.7 and 0.8, respectively, in all cross-validation subsets. The results of the current study support the use of the novel plantar pressure based classification system in diabetic foot medicine. It may particularly serve in communication, diagnosis and clinical decision making. Copyright © 2016 Elsevier B.V. All rights reserved.

  3. A new web-based system for unsupervised classification of satellite images from the Google Maps engine

    Science.gov (United States)

    Ferrán, Ángel; Bernabé, Sergio; García-Rodríguez, Pablo; Plaza, Antonio

    2012-10-01

    In this paper, we develop a new web-based system for unsupervised classification of satellite images available from the Google Maps engine. The system has been developed using the Google Maps API and incorporates functionalities such as unsupervised classification of image portions selected by the user (at the desired zoom level). For this purpose, we use a processing chain made up of the well-known ISODATA and k-means algorithms, followed by spatial post-processing based on majority voting. The system is currently hosted on a high performance server which performs the execution of classification algorithms and returns the obtained classification results in a very efficient way. The previous functionalities are necessary to use efficient techniques for the classification of images and the incorporation of content-based image retrieval (CBIR). Several experimental validation types of the classification results with the proposed system are performed by comparing the classification accuracy of the proposed chain by means of techniques available in the well-known Environment for Visualizing Images (ENVI) software package. The server has access to a cluster of commodity graphics processing units (GPUs), hence in future work we plan to perform the processing in parallel by taking advantage of the cluster.
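
    The spatial post-processing step mentioned above (majority voting over a clustered label map) can be sketched as follows. This is a minimal illustration, not the system's implementation: the window size, the tie-breaking rule, and the toy label map are assumptions, and the clustering step that would produce the labels (ISODATA or k-means) is omitted.

```python
import numpy as np

def majority_filter(labels, radius=1):
    """Replace each pixel label by the most frequent label in its square window.

    Windows are truncated at the image border. Ties go to the smallest label,
    because np.unique returns the labels in sorted order.
    """
    labels = np.asarray(labels)
    rows, cols = labels.shape
    out = np.empty_like(labels)
    for r in range(rows):
        for c in range(cols):
            window = labels[max(0, r - radius): r + radius + 1,
                            max(0, c - radius): c + radius + 1]
            values, counts = np.unique(window, return_counts=True)
            out[r, c] = values[np.argmax(counts)]
    return out

# Toy cluster map with two isolated, probably misclassified pixels.
noisy = np.array([
    [0, 0, 0, 1, 1],
    [0, 1, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
])
smoothed = majority_filter(noisy)
```

    The isolated pixels at positions (1, 1) and (3, 4) take the label of their neighbourhood, which is the usual motivation for this kind of post-processing.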

  4. Attribute-based classification for zero-shot visual object categorization.

    Science.gov (United States)

    Lampert, Christoph H; Nickisch, Hannes; Harmeling, Stefan

    2014-03-01

    We study the problem of object recognition for categories for which we have no training examples, a task also called zero-data or zero-shot learning. This situation has hardly been studied in computer vision research, even though it occurs frequently; the world contains tens of thousands of different object classes, and image collections have been formed and suitably annotated for only a few of them. To tackle the problem, we introduce attribute-based classification: Objects are identified based on a high-level description that is phrased in terms of semantic attributes, such as the object's color or shape. Because the identification of each such property transcends the specific learning task at hand, the attribute classifiers can be prelearned independently, for example, from existing image data sets unrelated to the current task. Afterward, new classes can be detected based on their attribute representation, without the need for a new training phase. In this paper, we also introduce a new data set, Animals with Attributes, of over 30,000 images of 50 animal classes, annotated with 85 semantic attributes. Extensive experiments on this and two more data sets show that attribute-based classification indeed is able to categorize images without access to any training images of the target classes.
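
    A minimal sketch of the idea described above: pre-trained attribute classifiers output a probability per attribute, and an unseen class is chosen by comparing those probabilities with each class's attribute signature. The attribute names, signatures, and probabilities are hypothetical, and the scoring rule is a simplified product of per-attribute likelihoods that ignores attribute priors, so it should not be read as the paper's exact formulation.

```python
import numpy as np

def zero_shot_predict(attr_probs, class_signatures):
    """Return the class whose binary attribute signature best matches the predictions."""
    # Clip so that log() never sees exactly 0 or 1.
    p = np.clip(np.asarray(attr_probs, dtype=float), 1e-9, 1.0 - 1e-9)
    scores = {}
    for name, signature in class_signatures.items():
        has_attr = np.asarray(signature, dtype=bool)
        # Log-likelihood of the signature under independent attribute predictions.
        scores[name] = float(np.sum(np.log(np.where(has_attr, p, 1.0 - p))))
    return max(scores, key=scores.get), scores

# Hypothetical attributes: (striped, four_legged, aquatic).
signatures = {
    "zebra": [1, 1, 0],
    "dolphin": [0, 0, 1],
}

# Hypothetical outputs of attribute classifiers for one test image.
attr_probs = [0.9, 0.8, 0.1]
predicted, scores = zero_shot_predict(attr_probs, signatures)
```

    No image of either class is needed to add it to `signatures`; only its attribute description is required, which is the point of the approach.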

  5. Does glyphosate cause cancer?

    OpenAIRE

    German Federal Institute for Risk Assessment

    2015-01-01

    In its recent evaluation from March 2015, the International Agency for Research on Cancer (IARC), the specialized cancer agency of the World Health Organization (WHO), came to the conclusion that glyphosate should now be classified as a carcinogenic substance in Group 2A (probably carcinogenic to humans), based on "limited evidence" in human experiments and "sufficient evidence" in animal experiments. This classification was published in a short report in the "Lancet" journal on 20 March 201...

  6. KNN BASED CLASSIFICATION OF DIGITAL MODULATED SIGNALS

    Directory of Open Access Journals (Sweden)

    Sajjad Ahmed Ghauri

    2016-11-01

    Full Text Available Demodulation without knowledge of the modulation scheme requires Automatic Modulation Classification (AMC). When the receiver has limited information about the received signal, AMC becomes an essential process. AMC has an important place in many civil and military fields, such as modern electronic warfare, interfering source recognition, frequency management, and link adaptation. In this paper we explore the use of K-nearest neighbor (KNN) for modulation classification with different distance measurement methods. Five modulation schemes are used for classification: Binary Phase Shift Keying (BPSK), Quadrature Phase Shift Keying (QPSK), Quadrature Amplitude Modulation (QAM), 16-QAM and 64-QAM. Higher order cumulants (HOC) are used as the input feature set for the classifier. Simulation results show that the proposed classification method provides better results for the considered modulation formats.
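
    To make the feature-plus-KNN pipeline concrete, the sketch below computes two commonly used fourth-order cumulants (C40 and C42, standard textbook definitions for zero-mean signals) and feeds them to a nearest-neighbour vote. Which cumulants, which distance measures, and which value of k the paper uses are not stated in the abstract, so these choices, the noise level, and the two-class setup are assumptions.

```python
import numpy as np

def cumulant_features(x):
    """Return (|C40|, C42) of a zero-mean complex baseband signal at unit power."""
    x = np.asarray(x, dtype=complex)
    x = x / np.sqrt(np.mean(np.abs(x) ** 2))  # normalise to unit power
    c20 = np.mean(x ** 2)
    c21 = np.mean(np.abs(x) ** 2)
    c40 = np.mean(x ** 4) - 3 * c20 ** 2
    c42 = np.mean(np.abs(x) ** 4) - np.abs(c20) ** 2 - 2 * c21 ** 2
    return np.array([np.abs(c40), c42])

def knn_predict(query, train_features, train_labels, k=1):
    """Majority vote among the k training points nearest in Euclidean distance."""
    dists = np.linalg.norm(np.asarray(train_features) - query, axis=1)
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(np.asarray(train_labels)[nearest], return_counts=True)
    return str(labels[np.argmax(counts)])

# Noise-free constellations serve as (tiny) training examples.
bpsk = np.array([1, -1], dtype=complex)
qpsk = np.array([1, 1j, -1, -1j], dtype=complex)
train_features = np.array([cumulant_features(bpsk), cumulant_features(qpsk)])
train_labels = ["BPSK", "QPSK"]

# A noisy BPSK burst as the unknown signal (noise level chosen for illustration).
rng = np.random.default_rng(0)
n = 2000
noise = 0.05 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
received = rng.choice([1.0, -1.0], size=n) + noise
predicted = knn_predict(cumulant_features(received), train_features, train_labels)
```

    For ideal unit-power constellations the features are (2, -2) for BPSK and (1, -1) for QPSK, so the two classes are well separated in this feature space. Higher-order QAM formats lie closer together and generally need more features and more training data.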

  7. A k-mer-based barcode DNA classification methodology based on spectral representation and a neural gas network.

    Science.gov (United States)

    Fiannaca, Antonino; La Rosa, Massimo; Rizzo, Riccardo; Urso, Alfonso

    2015-07-01

    In this paper, an alignment-free method for DNA barcode classification that is based on both a spectral representation and a neural gas network for unsupervised clustering is proposed. In the proposed methodology, distinctive words are identified from a spectral representation of DNA sequences. A taxonomic classification of the DNA sequence is then performed using the sequence signature, i.e., the smallest set of k-mers that can assign a DNA sequence to its proper taxonomic category. Experiments were then performed to compare our method with other supervised machine learning classification algorithms, such as support vector machine, random forest, ripper, naïve Bayes, ridor, and classification tree, which also consider short DNA sequence fragments of 200 and 300 base pairs (bp). The experimental tests were conducted over 10 real barcode datasets belonging to different animal species, which were provided by the on-line resource "Barcode of Life Database". The experimental results showed that our k-mer-based approach is directly comparable, in terms of accuracy, recall and precision metrics, with the other classifiers when considering full-length sequences. In addition, we demonstrate the robustness of our method when a classification task is performed with a set of short DNA sequences that were randomly extracted from the original data. For example, the proposed method can reach an accuracy of 64.8% at the species level with 200-bp fragments. Under the same conditions, the best other classifier (random forest) reaches an accuracy of 20.9%. Our results indicate that we obtained a clear improvement over the other classifiers for the study of short DNA barcode sequence fragments. Copyright © 2015 Elsevier B.V. All rights reserved.
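
    The spectral representation referred to above is, in its simplest form, a vector of k-mer frequencies. The sketch below shows one way to build such a vector. The choice of k, the normalisation, and the handling of ambiguous bases are assumptions, and the signature selection and neural gas clustering steps are not reproduced.

```python
from collections import Counter
from itertools import product

def kmer_spectrum(sequence, k=3):
    """Return relative frequencies of all 4**k k-mers over ACGT, in lexicographic order.

    k-mers containing other symbols (for example N) are ignored.
    """
    sequence = sequence.upper()
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]

# "ACGTACGT" contains the 2-mers AC, CG, GT, TA, AC, CG, GT.
spectrum = kmer_spectrum("ACGTACGT", k=2)
```

    Because the vector has a fixed length of 4**k regardless of sequence length, fragments of 200 bp and full-length barcodes can be compared directly without alignment.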

  8. Automatic Classification of Normal and Cancer Lung CT Images Using Multiscale AM-FM Features

    Directory of Open Access Journals (Sweden)

    Eman Magdy

    2015-01-01

    Full Text Available Computer-aided diagnostic (CAD) systems provide fast and reliable diagnosis for medical images. In this paper, a CAD system is proposed to analyze and automatically segment the lungs and classify each lung as normal or cancerous. Using a lung CT dataset from 70 different patients, Wiener filtering is first applied to the original CT images as a preprocessing step. Second, we combine histogram analysis with thresholding and morphological operations to segment the lung regions and extract each lung separately. Third, the Amplitude-Modulation Frequency-Modulation (AM-FM) method is used to extract features for the ROIs. Then, the significant AM-FM features are selected using Partial Least Squares Regression (PLSR) for the classification step. Finally, K-nearest neighbour (KNN), support vector machine (SVM), naïve Bayes, and linear classifiers are used with the selected AM-FM features. The performance of each classifier is evaluated in terms of accuracy, sensitivity, and specificity. The results indicate that the proposed CAD system succeeded in differentiating between normal and cancer lungs and achieved 95% accuracy with the linear classifier.

  9. Classification and Quality Evaluation of Tobacco Leaves Based on Image Processing and Fuzzy Comprehensive Evaluation

    Science.gov (United States)

    Zhang, Fan; Zhang, Xinhong

    2011-01-01

    Most classification, quality evaluation and grading of flue-cured tobacco leaves is performed manually; it relies on the judgmental experience of experts and is inevitably limited by personal, physical and environmental factors. The classification and the quality evaluation are therefore subjective and experientially based. In this paper, an automatic classification method for tobacco leaves based on digital image processing and fuzzy set theory is presented. A grading system based on image processing techniques was developed for automatically inspecting and grading flue-cured tobacco leaves. This system uses machine vision for the extraction and analysis of color, size, shape and surface texture. Fuzzy comprehensive evaluation provides a high level of confidence in decision making based on fuzzy logic. A neural network is used to estimate and forecast the membership function of the features of tobacco leaves in the fuzzy sets. The experimental results of the two-level fuzzy comprehensive evaluation (FCE) show that the accuracy rate of classification is about 94% for the trained tobacco leaves, and the accuracy rate for the non-trained tobacco leaves is about 72%. We believe that fuzzy comprehensive evaluation is a viable way to perform automatic classification and quality evaluation of tobacco leaves. PMID:22163744

  10. Optimal Couple Projections for Domain Adaptive Sparse Representation-based Classification.

    Science.gov (United States)

    Zhang, Guoqing; Sun, Huaijiang; Porikli, Fatih; Liu, Yazhou; Sun, Quansen

    2017-08-29

    In recent years, sparse representation based classification (SRC) has been one of the most successful methods and has shown impressive performance in various classification tasks. However, when the training data has a different distribution than the testing data, the learned sparse representation may not be optimal, and the performance of SRC will be degraded significantly. To address this problem, in this paper, we propose an optimal couple projections for domain-adaptive sparse representation-based classification (OCPD-SRC) method, in which the discriminative features of data in the two domains are simultaneously learned with the dictionary that can succinctly represent the training and testing data in the projected space. OCPD-SRC is designed based on the decision rule of SRC, with the objective of learning coupled projection matrices and a common discriminative dictionary such that the between-class sparse reconstruction residuals of data from both domains are maximized, and the within-class sparse reconstruction residuals of data are minimized in the projected low-dimensional space. Thus, the resulting representations fit SRC well and simultaneously have a better discriminant ability. In addition, our method can be easily extended to multiple domains and can be kernelized to deal with the nonlinear structure of data. The optimal solution for the proposed method can be efficiently obtained with an alternating optimization method. Extensive experimental results on a series of benchmark databases show that our method is better than or comparable to many state-of-the-art methods.

  11. An approach for classification of hydrogeological systems at the regional scale based on groundwater hydrographs

    Science.gov (United States)

    Haaf, Ezra; Barthel, Roland

    2016-04-01

    When assessing hydrogeological conditions at the regional scale, the analyst is often confronted with uncertainty of structures, inputs and processes while having to base inference on scarce and patchy data. Haaf and Barthel (2015) proposed a concept for handling this predicament by developing a groundwater systems classification framework, where information is transferred from similar, but well-explored and better understood to poorly described systems. The concept is based on the central hypothesis that similar systems react similarly to the same inputs and vice versa. It is conceptually related to PUB (Prediction in ungauged basins) where organization of systems and processes by quantitative methods is intended and used to improve understanding and prediction. Furthermore, using the framework it is expected that regional conceptual and numerical models can be checked or enriched by ensemble generated data from neighborhood-based estimators. In a first step, groundwater hydrographs from a large dataset in Southern Germany are compared in an effort to identify structural similarity in groundwater dynamics. A number of approaches to group hydrographs, mostly based on a similarity measure - which have previously only been used in local-scale studies, can be found in the literature. These are tested alongside different global feature extraction techniques. The resulting classifications are then compared to a visual "expert assessment"-based classification which serves as a reference. A ranking of the classification methods is carried out and differences shown. Selected groups from the classifications are related to geological descriptors. Here we present the most promising results from a comparison of classifications based on series correlation, different series distances and series features, such as the coefficients of the discrete Fourier transform and the intrinsic mode functions of empirical mode decomposition. Additionally, we show examples of classes

  12. Supervised classification of combined copy number and gene expression data

    Directory of Open Access Journals (Sweden)

    Riccadonna S.

    2007-12-01

    Full Text Available In this paper we apply a predictive profiling method to genome copy number aberrations (CNA) in combination with gene expression and clinical data to identify molecular patterns of cancer pathophysiology. Predictive models and optimal feature lists for the platforms are developed by a complete validation SVM-based machine learning system. Ranked lists of genome CNA sites (assessed by comparative genomic hybridization arrays, aCGH) and of differentially expressed genes (assessed by microarray profiling with Affy HG-U133A chips) are computed and combined on a breast cancer dataset for the discrimination of Luminal/ER+ (Lum/ER+) and Basal-like/ER- classes. Different encodings are developed and applied to the CNA data, and predictive variable selection is discussed. We analyze the combination of profiling information between the platforms, also considering the pathophysiological data. A specific subset of patients is identified that has a different response to classification by chromosomal gains and losses and by differentially expressed genes, corroborating the idea that genomic CNA can represent an independent source for tumor classification.

  13. A strategy learning model for autonomous agents based on classification

    Directory of Open Access Journals (Sweden)

    Śnieżyński Bartłomiej

    2015-09-01

    Full Text Available In this paper we propose a strategy learning model for autonomous agents based on classification. In the literature, the most commonly used learning method in agent-based systems is reinforcement learning. In our opinion, classification can be considered a good alternative. This type of supervised learning can be used to generate a classifier that allows the agent to choose an appropriate action for execution. Experimental results show that this model can be successfully applied for strategy generation even if rewards are delayed. We compare the efficiency of the proposed model and reinforcement learning using the farmer-pest domain and configurations of various complexity. In complex environments, supervised learning can improve the performance of agents much faster than reinforcement learning. If an appropriate knowledge representation is used, the learned knowledge may be analyzed by humans, which allows tracking the learning process.

  14. Optimal preprocessing of serum and urine metabolomic data fusion for staging prostate cancer through design of experiment

    International Nuclear Information System (INIS)

    Zheng, Hong; Cai, Aimin; Zhou, Qi; Xu, Pengtao; Zhao, Liangcai; Li, Chen; Dong, Baijun; Gao, Hongchang

    2017-01-01

    Accurate classification of cancer stages will achieve precision treatment for cancer. Metabolomics presents biological phenotypes at the metabolite level and holds a great potential for cancer classification. Since metabolomic data can be obtained from different samples or analytical techniques, data fusion has been applied to improve classification accuracy. Data preprocessing is an essential step during metabolomic data analysis. Therefore, we developed an innovative optimization method to select a proper data preprocessing strategy for metabolomic data fusion using a design of experiment approach for improving the classification of prostate cancer (PCa) stages. In this study, urine and serum samples were collected from participants at five phases of PCa and analyzed using a ¹H NMR-based metabolomic approach. Partial least squares-discriminant analysis (PLS-DA) was used as a classification model and its performance was assessed by goodness of fit (R²) and predictive ability (Q²). Results show that data preprocessing significantly affects classification performance and depends on data properties. Using the fused metabolomic data from urine and serum, the PLS-DA model with the optimal data preprocessing (R² = 0.729, Q² = 0.504, P < 0.0001) can effectively improve model performance and achieve a better classification result for PCa stages as compared with that without data preprocessing (R² = 0.139, Q² = 0.006, P = 0.450). Therefore, we propose that metabolomic data fusion integrated with an optimal data preprocessing strategy can significantly improve the classification of cancer stages for precision treatment. - Highlights: • NMR metabolomic analysis of body fluids can be used for staging prostate cancer. • Data preprocessing is an essential step for metabolomic analysis. • Data fusion improves information recovery for cancer classification. • Design of experiment achieves optimal preprocessing of metabolomic data fusion.
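
    For readers unfamiliar with the two figures of merit quoted above, the definitions below are the ones commonly used for PLS models. The abstract does not state the exact formulas or the cross-validation scheme, so this is a reference sketch under that assumption. Here $y_i$ is the observed response for sample $i$, $\hat{y}_i$ the fitted value from the full model, $\hat{y}_{(i)}$ the prediction for sample $i$ from a model fitted without it, and $\bar{y}$ the mean response.

```latex
R^2 = 1 - \frac{\sum_i \left(y_i - \hat{y}_i\right)^2}{\sum_i \left(y_i - \bar{y}\right)^2},
\qquad
Q^2 = 1 - \frac{\sum_i \left(y_i - \hat{y}_{(i)}\right)^2}{\sum_i \left(y_i - \bar{y}\right)^2}
```

    Under these definitions $Q^2$ cannot exceed what the model achieves on held-out samples, so a value near zero (as in the unpreprocessed case) indicates essentially no predictive ability even when some fit is present.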

  15. Initial steps towards an evidence-based classification system for golfers with a physical impairment

    NARCIS (Netherlands)

    Stoter, Inge K.; Hettinga, Florentina J.; Altmann, Viola; Eisma, Wim; Arendzen, Hans; Bennett, Tony; van der Woude, Lucas H.; Dekker, Rienk

    2017-01-01

    Purpose: The present narrative review aims to make a first step towards an evidence-based classification system in handigolf following the International Paralympic Committee (IPC). It intends to create a conceptual framework of classification for handigolf and an agenda for future research. Method:

  16. Classification and global distribution of ocean precipitation types based on satellite passive microwave signatures

    Science.gov (United States)

    Gautam, Nitin

    The main objectives of this thesis are to develop a robust statistical method for the classification of ocean precipitation based on physical properties to which the SSM/I is sensitive and to examine how these properties vary globally and seasonally. A two step approach is adopted for the classification of oceanic precipitation classes from multispectral SSM/I data: (1) we subjectively define precipitation classes using a priori information about the precipitating system and its possible distinct signature on SSM/I data such as scattering by ice particles aloft in the precipitating cloud, emission by liquid rain water below freezing level, the difference of polarization at 19 GHz (an indirect measure of optical depth), etc.; (2) we then develop an objective classification scheme which is found to reproduce the subjective classification with high accuracy. This hybrid strategy allows us to use the characteristics of the data to define and encode classes and helps retain the physical interpretation of classes. Classification methods based on k-nearest neighbor and neural networks are developed to objectively classify six precipitation classes. It is found that the classification method based on a neural network yields high accuracy for all precipitation classes. An inversion method based on a minimum variance approach was used to retrieve gross microphysical properties of these precipitation classes such as column integrated liquid water path, column integrated ice water path, and column integrated min water path. This classification method is then applied to 2 years (1991-92) of SSM/I data to examine and document the seasonal and global distribution of precipitation frequency corresponding to each of these objectively defined six classes. The characteristics of the distribution are found to be consistent with assumptions used in defining these six precipitation classes and also with well known climatological patterns of precipitation regions. The seasonal and global

  17. Multi-material classification of dry recyclables from municipal solid waste based on thermal imaging.

    Science.gov (United States)

    Gundupalli, Sathish Paulraj; Hait, Subrata; Thakur, Atul

    2017-12-01

    There has been a significant rise in municipal solid waste (MSW) generation in the last few decades due to rapid urbanization and industrialization. Due to the lack of source segregation practice, a need for automated segregation of recyclables from MSW exists in the developing countries. This paper reports a thermal imaging based system for classifying useful recyclables from simulated MSW sample. Experimental results have demonstrated the possibility to use thermal imaging technique for classification and a robotic system for sorting of recyclables in a single process step. The reported classification system yields an accuracy in the range of 85-96% and is comparable with the existing single-material recyclable classification techniques. We believe that the reported thermal imaging based system can emerge as a viable and inexpensive large-scale classification-cum-sorting technology in recycling plants for processing MSW in developing countries. Copyright © 2017 Elsevier Ltd. All rights reserved.

  18. A kernel-based multi-feature image representation for histopathology image classification

    International Nuclear Information System (INIS)

    Moreno J; Caicedo J; Gonzalez F

    2010-01-01

    This paper presents a novel strategy for building a high-dimensional feature space to represent histopathology image contents. Histogram features, related to colors, textures and edges, are combined together in a unique image representation space using kernel functions. This feature space is further enhanced by the application of latent semantic analysis, to model hidden relationships among visual patterns. All that information is included in the new image representation space. Then, support vector machine classifiers are used to assign semantic labels to images. Processing and classification algorithms operate on top of kernel functions, so that the structure of the feature space is completely controlled using similarity measures and a dual representation. The proposed approach has shown a successful performance in a classification task using a dataset with 1,502 real histopathology images in 18 different classes. The results show that our approach for histological image classification obtains an improved average performance of 20.6% when compared to a conventional classification approach based on SVM directly applied to the original kernel.

  19. A KERNEL-BASED MULTI-FEATURE IMAGE REPRESENTATION FOR HISTOPATHOLOGY IMAGE CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    J Carlos Moreno

    2010-09-01

    Full Text Available This paper presents a novel strategy for building a high-dimensional feature space to represent histopathology image contents. Histogram features, related to colors, textures and edges, are combined together in a unique image representation space using kernel functions. This feature space is further enhanced by the application of Latent Semantic Analysis, to model hidden relationships among visual patterns. All that information is included in the new image representation space. Then, Support Vector Machine classifiers are used to assign semantic labels to images. Processing and classification algorithms operate on top of kernel functions, so that the structure of the feature space is completely controlled using similarity measures and a dual representation. The proposed approach has shown a successful performance in a classification task using a dataset with 1,502 real histopathology images in 18 different classes. The results show that our approach for histological image classification obtains an improved average performance of 20.6% when compared to a conventional classification approach based on SVM directly applied to the original kernel.

  20. Non-target adjacent stimuli classification improves performance of classical ERP-based brain computer interface

    Science.gov (United States)

    Ceballos, G. A.; Hernández, L. F.

    2015-04-01

    Objective. The classical ERP-based speller, or P300 Speller, is one of the most commonly used paradigms in the field of Brain Computer Interfaces (BCI). Several alterations to the visual stimuli presentation system have been developed to avoid unfavorable effects elicited by adjacent stimuli. However, there has been little, if any, regard to useful information contained in responses to adjacent stimuli about spatial location of target symbols. This paper aims to demonstrate that combining the classification of non-target adjacent stimuli with standard classification (target versus non-target) significantly improves classical ERP-based speller efficiency. Approach. Four SWLDA classifiers were trained and combined with the standard classifier: the lower row, upper row, right column and left column classifiers. This new feature extraction procedure and the classification method were carried out on three open databases: the UAM P300 database (Universidad Autonoma Metropolitana, Mexico), BCI competition II (dataset IIb) and BCI competition III (dataset II). Main results. The inclusion of the classification of non-target adjacent stimuli improves target classification in the classical row/column paradigm. A gain in mean single trial classification of 9.6% and an overall improvement of 25% in simulated spelling speed was achieved. Significance. We have provided further evidence that the ERPs produced by adjacent stimuli present discriminable features, which could provide additional information about the spatial location of intended symbols. This work promotes the searching of information on the peripheral stimulation responses to improve the performance of emerging visual ERP-based spellers.

  1. The Discriminative validity of "nociceptive," "peripheral neuropathic," and "central sensitization" as mechanisms-based classifications of musculoskeletal pain.

    LENUS (Irish Health Repository)

    Smart, Keith M

    2012-02-01

    OBJECTIVES: Empirical evidence of discriminative validity is required to justify the use of mechanisms-based classifications of musculoskeletal pain in clinical practice. The purpose of this study was to evaluate the discriminative validity of mechanisms-based classifications of pain by identifying discriminatory clusters of clinical criteria predictive of "nociceptive," "peripheral neuropathic," and "central sensitization" pain in patients with low back (+/- leg) pain disorders. METHODS: This study was a cross-sectional, between-patients design using the extreme-groups method. Four hundred sixty-four patients with low back (+/- leg) pain were assessed using a standardized assessment protocol. After each assessment, patients' pain was assigned a mechanisms-based classification. Clinicians then completed a clinical criteria checklist indicating the presence/absence of various clinical criteria. RESULTS: Multivariate analyses using binary logistic regression with Bayesian model averaging identified a discriminative cluster of 7, 3, and 4 symptoms and signs predictive of a dominance of "nociceptive," "peripheral neuropathic," and "central sensitization" pain, respectively. Each cluster was found to have high levels of classification accuracy (sensitivity, specificity, positive/negative predictive values, positive/negative likelihood ratios). DISCUSSION: By identifying a discriminatory cluster of symptoms and signs predictive of "nociceptive," "peripheral neuropathic," and "central" pain, this study provides some preliminary discriminative validity evidence for mechanisms-based classifications of musculoskeletal pain. Classification system validation requires the accumulation of validity evidence before their use in clinical practice can be recommended. Further studies are required to evaluate the construct and criterion validity of mechanisms-based classifications of musculoskeletal pain.
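
    The classification accuracy measures listed in this abstract all derive from a 2x2 table of predicted versus reference classifications. The sketch below shows the standard definitions. The function name `diagnostic_metrics` and the counts in the example are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures from a 2x2 confusion table.

    tp/fn: reference-positive cases classified positive/negative.
    fp/tn: reference-negative cases classified positive/negative.
    """
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    false_positive_rate = fp / (fp + tn)  # equals 1 - specificity
    false_negative_rate = fn / (tp + fn)  # equals 1 - sensitivity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),            # positive predictive value
        "npv": tn / (tn + fn),            # negative predictive value
        # Likelihood ratios; undefined ratios are reported as infinity.
        "lr_positive": (sensitivity / false_positive_rate
                        if false_positive_rate > 0 else float("inf")),
        "lr_negative": (false_negative_rate / specificity
                        if specificity > 0 else float("inf")),
    }


# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=90, fp=20, fn=10, tn=180)
```

    Unlike the predictive values, the likelihood ratios do not depend on how common the condition is in the sample, which is one reason to report both.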

  2. From learning taxonomies to phylogenetic learning: Integration of 16S rRNA gene data into FAME-based bacterial classification

    Science.gov (United States)

    2010-01-01

    Background. Machine learning techniques have been shown to improve bacterial species classification based on fatty acid methyl ester (FAME) data. Nonetheless, FAME analysis has a limited resolution for discrimination of bacteria at the species level. In this paper, we approach the species classification problem from a taxonomic point of view. Such a taxonomy or tree is typically obtained by applying clustering algorithms on FAME data or on 16S rRNA gene data. The knowledge gained from the tree can then be used to evaluate FAME-based classifiers, resulting in a novel framework for bacterial species classification. Results. In view of learning in a taxonomic framework, we consider two types of trees. First, a FAME tree is constructed with a supervised divisive clustering algorithm. Subsequently, based on 16S rRNA gene sequence analysis, phylogenetic trees are inferred by the NJ and UPGMA methods. In this second approach, the species classification problem is based on the combination of two different types of data. Herein, 16S rRNA gene sequence data is used for phylogenetic tree inference and the corresponding binary tree splits are learned based on FAME data. We call this learning approach 'phylogenetic learning'. Supervised Random Forest models are developed to train the classification tasks in a stratified cross-validation setting. In this way, better classification results are obtained for species that are typically hard to distinguish by a single or flat multi-class classification model. Conclusions. FAME-based bacterial species classification is successfully evaluated in a taxonomic framework. Although the proposed approach does not improve the overall accuracy compared to flat multi-class classification, it has some distinct advantages. First, it has better capabilities for distinguishing species on which flat multi-class classification fails. Secondly, the hierarchical classification structure makes it easy to evaluate and visualize the resolution of FAME data for

  3. From learning taxonomies to phylogenetic learning: Integration of 16S rRNA gene data into FAME-based bacterial classification

    Directory of Open Access Journals (Sweden)

    Dawyndt Peter

    2010-01-01

    Background. Machine learning techniques have been shown to improve bacterial species classification based on fatty acid methyl ester (FAME) data. Nonetheless, FAME analysis has a limited resolution for discrimination of bacteria at the species level. In this paper, we approach the species classification problem from a taxonomic point of view. Such a taxonomy or tree is typically obtained by applying clustering algorithms on FAME data or on 16S rRNA gene data. The knowledge gained from the tree can then be used to evaluate FAME-based classifiers, resulting in a novel framework for bacterial species classification. Results. In view of learning in a taxonomic framework, we consider two types of trees. First, a FAME tree is constructed with a supervised divisive clustering algorithm. Subsequently, based on 16S rRNA gene sequence analysis, phylogenetic trees are inferred by the NJ and UPGMA methods. In this second approach, the species classification problem is based on the combination of two different types of data. Herein, 16S rRNA gene sequence data is used for phylogenetic tree inference and the corresponding binary tree splits are learned based on FAME data. We call this learning approach 'phylogenetic learning'. Supervised Random Forest models are developed to train the classification tasks in a stratified cross-validation setting. In this way, better classification results are obtained for species that are typically hard to distinguish by a single or flat multi-class classification model. Conclusions. FAME-based bacterial species classification is successfully evaluated in a taxonomic framework. Although the proposed approach does not improve the overall accuracy compared to flat multi-class classification, it has some distinct advantages. First, it has better capabilities for distinguishing species on which flat multi-class classification fails. Secondly, the hierarchical classification structure makes it easy to evaluate and visualize the

  4. From learning taxonomies to phylogenetic learning: integration of 16S rRNA gene data into FAME-based bacterial classification.

    Science.gov (United States)

    Slabbinck, Bram; Waegeman, Willem; Dawyndt, Peter; De Vos, Paul; De Baets, Bernard

    2010-01-30

    Machine learning techniques have been shown to improve bacterial species classification based on fatty acid methyl ester (FAME) data. Nonetheless, FAME analysis has a limited resolution for discrimination of bacteria at the species level. In this paper, we approach the species classification problem from a taxonomic point of view. Such a taxonomy or tree is typically obtained by applying clustering algorithms on FAME data or on 16S rRNA gene data. The knowledge gained from the tree can then be used to evaluate FAME-based classifiers, resulting in a novel framework for bacterial species classification. In view of learning in a taxonomic framework, we consider two types of trees. First, a FAME tree is constructed with a supervised divisive clustering algorithm. Subsequently, based on 16S rRNA gene sequence analysis, phylogenetic trees are inferred by the NJ and UPGMA methods. In this second approach, the species classification problem is based on the combination of two different types of data. Herein, 16S rRNA gene sequence data is used for phylogenetic tree inference and the corresponding binary tree splits are learned based on FAME data. We call this learning approach 'phylogenetic learning'. Supervised Random Forest models are developed to train the classification tasks in a stratified cross-validation setting. In this way, better classification results are obtained for species that are typically hard to distinguish by a single or flat multi-class classification model. FAME-based bacterial species classification is successfully evaluated in a taxonomic framework. Although the proposed approach does not improve the overall accuracy compared to flat multi-class classification, it has some distinct advantages. First, it has better capabilities for distinguishing species on which flat multi-class classification fails. Secondly, the hierarchical classification structure makes it easy to evaluate and visualize the resolution of FAME data for the discrimination of bacterial
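
    The idea of learning the binary splits of a fixed tree from a second data type, as described in this abstract, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation. A nearest-centroid rule stands in for the Random Forest models to keep the example dependency-free. The tree, the species names `A`, `B`, `C`, and the two-dimensional "profiles" are hypothetical toy data, and all class and function names are invented for this sketch.

```python
from math import dist  # Euclidean distance, Python 3.8+


class Leaf:
    """A species at the tip of the tree."""
    def __init__(self, species):
        self.species = species

    def leaves(self):
        return {self.species}


class Split:
    """An internal binary split of the tree (fixed in advance)."""
    def __init__(self, left, right):
        self.left, self.right = left, right
        self.centroids = None  # filled in by fit()

    def leaves(self):
        return self.left.leaves() | self.right.leaves()


def centroid(rows):
    return [sum(column) / len(rows) for column in zip(*rows)]


def fit(node, profiles, labels):
    """Learn one binary classifier per split from the profile data."""
    if isinstance(node, Leaf):
        return
    left_species, right_species = node.left.leaves(), node.right.leaves()
    left_rows = [p for p, y in zip(profiles, labels) if y in left_species]
    right_rows = [p for p, y in zip(profiles, labels) if y in right_species]
    node.centroids = (centroid(left_rows), centroid(right_rows))
    fit(node.left, profiles, labels)
    fit(node.right, profiles, labels)


def predict(node, profile):
    """Walk from the root to a leaf, choosing a branch at every split."""
    while isinstance(node, Split):
        left_c, right_c = node.centroids
        closer_left = dist(profile, left_c) <= dist(profile, right_c)
        node = node.left if closer_left else node.right
    return node.species


# Hypothetical tree ((A, B), C), e.g. as inferred from sequence data.
tree = Split(Split(Leaf("A"), Leaf("B")), Leaf("C"))
profiles = [[1.0, 0.0], [1.2, 0.1], [0.8, 1.0], [1.0, 1.1], [5.0, 5.0], [5.2, 4.9]]
labels = ["A", "A", "B", "B", "C", "C"]
fit(tree, profiles, labels)
```

    In this toy setup the root split only has to separate `C` from the `A`/`B` group, and the harder `A` versus `B` decision is handled by its own classifier further down. That is the structural point of the hierarchical approach.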

  5. The paradox of atheoretical classification

    DEFF Research Database (Denmark)

    Hjørland, Birger

    2016-01-01

    A distinction can be made between “artificial classifications” and “natural classifications,” where artificial classifications may adequately serve some limited purposes, but natural classifications are overall most fruitful by allowing inference and thus many different purposes. There is strong support for the view that a natural classification should be based on a theory (and, of course, that the most fruitful theory provides the most fruitful classification). Nevertheless, atheoretical (or “descriptive”) classifications are often produced. Paradoxically, atheoretical classifications may be very successful. The best example of a successful “atheoretical” classification is probably the prestigious Diagnostic and Statistical Manual of Mental Disorders (DSM) since its third edition from 1980. Based on such successes one may ask: Should the claim that classifications ideally are natural...

  6. A Transform-Based Feature Extraction Approach for Motor Imagery Tasks Classification

    Science.gov (United States)

    Khorshidtalab, Aida; Mesbah, Mostefa; Salami, Momoh J. E.

    2015-01-01

    In this paper, we present a new motor imagery classification method in the context of electroencephalography (EEG)-based brain–computer interface (BCI). This method uses a signal-dependent orthogonal transform, referred to as linear prediction singular value decomposition (LP-SVD), for feature extraction. The transform defines the mapping as the left singular vectors of the LP coefficient filter impulse response matrix. Using a logistic tree-based model classifier, the extracted features are classified into one of four motor imagery movements. The proposed approach was first benchmarked against two related state-of-the-art feature extraction approaches, namely, discrete cosine transform (DCT) and adaptive autoregressive (AAR)-based methods. By achieving an accuracy of 67.35%, the LP-SVD approach outperformed the other approaches by large margins (25% compared with DCT and 6% compared with AAR-based methods). To further improve the discriminatory capability of the extracted features and reduce the computational complexity, we enlarged the extracted feature subset by incorporating two extra features, namely, Q- and the Hotelling’s $T^{2}$ statistics of the transformed EEG and introduced a new EEG channel selection method. The performance of the EEG classification based on the expanded feature set and channel selection method was compared with that of a number of the state-of-the-art classification methods previously reported with the BCI IIIa competition data set. Our method came second with an average accuracy of 81.38%. PMID:27170898

  7. Interactive classification and content-based retrieval of tissue images

    Science.gov (United States)

    Aksoy, Selim; Marchisio, Giovanni B.; Tusk, Carsten; Koperski, Krzysztof

    2002-11-01

    We describe a system for interactive classification and retrieval of microscopic tissue images. Our syst