WorldWideScience

Sample records for cancer classification based

  1. Gene expression based cancer classification

    OpenAIRE

    Sara Tarek; Reda Abd Elwahab; Mahmoud Shoman

    2017-01-01

    Cancer classification based on molecular-level investigation has gained the interest of researchers because it provides a systematic, accurate and objective diagnosis for different cancer types. Several recent studies have addressed the problem of cancer classification using data mining methods, machine learning algorithms and statistical methods to analyze gene expression profiles efficiently. Studying the characteristics of thousands of genes simultaneously offered a deep in...

  2. Pathway-based classification of cancer subtypes

    Directory of Open Access Journals (Sweden)

    Kim Shinuk

    2012-07-01

    Background: Molecular markers based on gene expression profiles have been used in experimental and clinical settings to distinguish cancerous tumors in stage, grade, survival time, metastasis, and drug sensitivity. However, most significant gene markers are unstable (not reproducible) among data sets. We introduce a standardized method for representing cancer markers as 2-level hierarchical feature vectors, with a basic gene level as well as a second level of (more stable) pathway markers, for the purpose of discriminating cancer subtypes. This extends standard gene expression arrays with new pathway-level activation features obtained directly from off-the-shelf gene set enrichment algorithms such as GSEA. Such so-called pathway-based expression arrays are significantly more reproducible across datasets. Such reproducibility will be important for clinical usefulness of genomic markers, and will augment currently accepted cancer classification protocols. Results: The present method produced more stable (reproducible) pathway-based markers for discriminating breast cancer metastasis and ovarian cancer survival time. Between two datasets for breast cancer metastasis, the intersection of standard significant gene biomarkers totaled 7.47% of selected genes, compared to 17.65% using pathway-based markers; the corresponding percentages for ovarian cancer datasets were 20.65% and 33.33% respectively. Three pathways, consisting of Type_1_diabetes mellitus, Cytokine-cytokine_receptor_interaction and Hedgehog_signaling (all previously implicated in cancer), are enriched in both the ovarian long survival and breast non-metastasis groups. In addition, integrating pathway and gene information, we identified five (ID4, ANXA4, CXCL9, MYLK, FBXL7) and six (SQLE, E2F1, PTTG1, TSTA3, BUB1B, MAD2L1) known cancer genes significant for ovarian and breast cancer respectively. Conclusions: Standardizing the analysis of genomic data in the process of cancer staging

  3. Nominated Texture Based Cervical Cancer Classification

    Directory of Open Access Journals (Sweden)

    Edwin Jayasingh Mariarputham

    2015-01-01

    Accurate classification of Pap smear images is a challenging task in medical image processing. It can be improved in two ways: by selecting suitable, well-defined, specific features, and by selecting the best classifier. This paper presents a nominated texture based cervical cancer (NTCC) classification system which classifies Pap smear images into one of seven classes. This is achieved by extracting well-defined texture features and selecting the best classifier. Seven sets of texture features (24 features) are extracted, which include relative size of nucleus and cytoplasm, dynamic range and first four moments of intensities of nucleus and cytoplasm, relative displacement of nucleus within the cytoplasm, gray level co-occurrence matrix, local binary pattern histogram, Tamura features, and edge orientation histogram. Several types of support vector machine (SVM) and neural network (NN) classifiers are used for the classification. The performance of the NTCC algorithm is tested and compared to other algorithms on the public image database of Herlev University Hospital, Denmark, with 917 Pap smear images. The SVM output is found to be best for most of the classes and gives better results for the remaining classes.

  4. Human Cancer Classification: A Systems Biology- Based Model Integrating Morphology, Cancer Stem Cells, Proteomics, and Genomics

    Directory of Open Access Journals (Sweden)

    Halliday A Idikio

    2011-01-01

    Human cancer classification is currently based on the idea of cell of origin, light and electron microscopic attributes of the cancer. What is not yet integrated into cancer classification are the functional attributes of these cancer cells. Recent innovative techniques in biology have provided a wealth of information on the genomic, transcriptomic and proteomic changes in cancer cells. The emergence of the concept of cancer stem cells needs to be included in a classification model to capture the known attributes of cancer stem cells and their potential contribution to treatment response, and metastases. The integrated model of cancer classification presented here incorporates all morphology, cancer stem cell contributions, genetic, and functional attributes of cancer. Integrated cancer classification models could eliminate the unclassifiable cancers as used in current classifications. Future cancer treatment may be advanced by using an integrated model of cancer classification.

  5. Human Cancer Classification: A Systems Biology- Based Model Integrating Morphology, Cancer Stem Cells, Proteomics, and Genomics

    OpenAIRE

    Halliday A Idikio

    2011-01-01

    Human cancer classification is currently based on the idea of cell of origin, light and electron microscopic attributes of the cancer. What is not yet integrated into cancer classification are the functional attributes of these cancer cells. Recent innovative techniques in biology have provided a wealth of information on the genomic, transcriptomic and proteomic changes in cancer cells. The emergence of the concept of cancer stem cells needs to be included in a classification model to capture...

  6. Pathway-based classification of cancer subtypes

    OpenAIRE

    Kim, Shinuk; Kon, Mark; DeLisi, Charles

    2012-01-01

    Background: Molecular markers based on gene expression profiles have been used in experimental and clinical settings to distinguish cancerous tumors in stage, grade, survival time, metastasis, and drug sensitivity. However, most significant gene markers are unstable (not reproducible) among data sets. We introduce a standardized method for representing cancer markers as 2-level hierarchical feature vectors, with a basic gene level as well as a second level of (more stable) pathway mar...

  7. NIM: A Node Influence Based Method for Cancer Classification

    Directory of Open Access Journals (Sweden)

    Yiwen Wang

    2014-01-01

    The classification of different cancer types is of great significance in the medical field. However, the great majority of existing cancer classification methods are clinical-based and have relatively weak diagnostic ability. With the rapid development of gene expression technology, it has become possible to classify different kinds of cancers using DNA microarrays. Our main idea is to confront the problem of cancer classification using gene expression data from a graph-based view. Based on a new node influence model that we propose, this paper presents a novel high-accuracy method for cancer classification, which is composed of four parts: the first is to calculate the similarity matrix of all samples, the second is to compute the node influence of training samples, the third is to obtain the similarity between every test sample and each class using a weighted sum of node influence and the similarity matrix, and the last is to classify each test sample based on its similarity to every class. The data sets used in our experiments are breast cancer, central nervous system, colon tumor, prostate cancer, acute lymphoblastic leukemia, and lung cancer. Experimental results showed that our node influence based method (NIM) is more efficient and robust than the support vector machine, K-nearest neighbor, C4.5, naive Bayes, and CART.
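
The four steps listed in this abstract can be illustrated with a minimal sketch. The abstract does not define the node influence model or the similarity measure, so both are assumptions here: cosine similarity stands in for the similarity matrix, and a training sample's influence is taken, purely hypothetically, as its mean similarity to the other training samples of its own class. The function names `cosine`, `node_influence` and `classify` are illustrative and do not come from the paper.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two expression vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def node_influence(train_x, train_y):
    """Hypothetical influence: mean similarity to same-class training peers."""
    influence = []
    for i, (xi, yi) in enumerate(zip(train_x, train_y)):
        peers = [
            cosine(xi, xj)
            for j, (xj, yj) in enumerate(zip(train_x, train_y))
            if j != i and yj == yi
        ]
        # A sample with no same-class peer gets a neutral weight of 1.0.
        influence.append(sum(peers) / len(peers) if peers else 1.0)
    return influence


def classify(test_x, train_x, train_y):
    """Assign the class with the largest influence-weighted mean similarity."""
    influence = node_influence(train_x, train_y)
    score = defaultdict(float)
    count = defaultdict(int)
    for xi, yi, wi in zip(train_x, train_y, influence):
        score[yi] += wi * cosine(test_x, xi)
        count[yi] += 1
    return max(score, key=lambda c: score[c] / count[c])


# Toy data, not taken from the paper: two classes in a 2-gene space.
train_x = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
train_y = ["A", "A", "B", "B"]
predicted = classify([1.0, 0.0], train_x, train_y)
```

Dividing each class score by the class size keeps a large class from winning only because it contributes more terms to the weighted sum; whether the published method normalizes this way is not stated in the abstract.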

  8. Pathological Bases for a Robust Application of Cancer Molecular Classification

    Directory of Open Access Journals (Sweden)

    Salvador J. Diaz-Cano

    2015-04-01

    Any robust classification system depends on its purpose and must refer to accepted standards, its strength relying on predictive values and a careful consideration of known factors that can affect its reliability. In this context, a molecular classification of human cancer must refer to the current gold standard (histological classification) and try to improve it with key prognosticators for metastatic potential, staging and grading. Although organ-specific examples have been published based on proteomics, transcriptomics and genomics evaluations, the most popular approach uses gene expression analysis as a direct correlate of cellular differentiation, which represents the key feature of the histological classification. RNA is a labile molecule that varies significantly according to the preservation protocol; its transcription reflects the adaptation of the tumor cells to the microenvironment; it can be passed through mechanisms of intercellular transference of genetic information (exosomes); and it is exposed to epigenetic modifications. More robust classifications should be based on stable molecules, at the genetic level represented by DNA to improve reliability, and its analysis must deal with the concept of intratumoral heterogeneity, which is at the origin of tumor progression and is the byproduct of the selection process during the clonal expansion and progression of neoplasms. The simultaneous analysis of multiple DNA targets and next generation sequencing offer the best practical approach for an analytical genomic classification of tumors.

  9. Application of wavelet transformation and adaptive neighborhood based modified backpropagation (ANMBP) for classification of brain cancer

    Science.gov (United States)

    Werdiningsih, Indah; Zaman, Badrus; Nuqoba, Barry

    2017-08-01

    This paper presents the classification of brain cancer using wavelet transformation and Adaptive Neighborhood Based Modified Backpropagation (ANMBP). The process has three stages: feature extraction, feature reduction, and classification. Wavelet transformation is used for feature extraction and ANMBP is used for classification. The result of feature extraction is a set of feature vectors. Feature reduction used 100 energy values per feature and 10 energy values per feature. The classes are normal, Alzheimer, glioma, and carcinoma. Based on simulation results, 10 energy values per feature can be used to classify brain cancer correctly. The correct classification rate of the proposed system is 95%. This research demonstrated that wavelet transformation can be used for feature extraction and ANMBP can be used for classification of brain cancer.

  10. Gene expression-based diagnostics for molecular cancer classification of difficult to diagnose tumors.

    Science.gov (United States)

    Schnabel, Catherine A; Erlander, Mark G

    2012-09-01

    Standardized methods for accurate tumor classification are of critical importance for cancer diagnosis and treatment, particularly in diagnostically-challenging cases where site-directed therapies are an option. Molecular diagnostics for tumor classification, subclassification and site of origin determination based on advances in gene expression profiling have translated into clinical practice as complementary approaches to clinicopathological evaluations. In this review, the foundational science of gene expression-based cancer classification, technical and clinical considerations for clinical translation, and an overview of molecular signatures of tumor classification that are available for clinical use will be discussed. Proposed approaches will also be described for further integration of molecular tests for cancer classification into the diagnostic paradigm using a tissue-based strategy as a key component to direct evaluation. Increasing evidence of improved patient outcomes with the application of site and molecularly-targeted cancer therapy through use of molecular tools highlights the growing potential for these gene expression-based diagnostics to positively impact patient management. Looking forward, the availability of adequate tissue will be a significant issue and limiting factor as cancer diagnosis progresses; when the tumor specimen is limited, use of molecular classification may be a reasonable early step in the evaluation, particularly if the tumor is poorly-differentiated and has atypical features.

  11. Molecular cancer classification using a meta-sample-based regularized robust coding method.

    Science.gov (United States)

    Wang, Shu-Lin; Sun, Liuchao; Fang, Jianwen

    2014-01-01

    Previous studies have demonstrated that machine learning based molecular cancer classification using gene expression profiling (GEP) data is promising for the clinical diagnosis and treatment of cancer. Novel classification methods with high efficiency and prediction accuracy are still needed to deal with the high dimensionality and small sample size of typical GEP data. Recently the sparse representation (SR) method has been successfully applied to cancer classification. Nevertheless, its efficiency needs to be improved when analyzing large-scale GEP data. In this paper we present the meta-sample-based regularized robust coding classification (MRRCC), a novel effective cancer classification technique that combines the idea of the meta-sample-based cluster method with the regularized robust coding (RRC) method. It assumes that the coding residual and the coding coefficient are respectively independent and identically distributed. Similar to meta-sample-based SR classification (MSRC), MRRCC extracts a set of meta-samples from the training samples, and then encodes a testing sample as the sparse linear combination of these meta-samples. The representation fidelity is measured by the l2-norm or l1-norm of the coding residual. Extensive experiments on publicly available GEP datasets demonstrate that the proposed method is more efficient while its prediction accuracy is equivalent to existing MSRC-based methods and better than other state-of-the-art dimension reduction based methods.

  12. Comparison of linear discriminant analysis methods for the classification of cancer based on gene expression data.

    Science.gov (United States)

    Huang, Desheng; Quan, Yu; He, Miao; Zhou, Baosen

    2009-12-10

    Many studies based on gene expression data have been reported in great detail; however, one major challenge for methodologists is the choice of classification methods. The main purpose of this research was to compare the performance of linear discriminant analysis (LDA) and its modification methods for the classification of cancer based on gene expression data. The classification performance of LDA and its modification methods was evaluated by applying these methods to six public cancer gene expression datasets. These methods included linear discriminant analysis (LDA), prediction analysis for microarrays (PAM), shrinkage centroid regularized discriminant analysis (SCRDA), shrinkage linear discriminant analysis (SLDA) and shrinkage diagonal discriminant analysis (SDDA). The procedures were performed using the software R 2.80. PAM picked out fewer feature genes than the other methods from most datasets, except the Brain dataset. For the two methods of shrinkage discriminant analysis, SLDA selected more genes than SDDA from most datasets, except the 2-class lung cancer dataset. When comparing SLDA with SCRDA, SLDA selected more genes than SCRDA from the 2-class lung cancer, SRBCT and Brain datasets; the result was the opposite for the remaining datasets. The average test error of the LDA modification methods was lower than that of the LDA method. The classification performance of the LDA modification methods was superior to that of traditional LDA with respect to the average error, and there was no significant difference between these modification methods.

  13. Cross-platform analysis of cancer microarray data improves gene expression based classification of phenotypes

    Directory of Open Access Journals (Sweden)

    Eils Roland

    2005-11-01

    Background: The extensive use of DNA microarray technology in the characterization of the cell transcriptome is leading to an ever increasing amount of microarray data from cancer studies. Although similar questions for the same type of cancer are addressed in these different studies, a comparative analysis of their results is hampered by the use of heterogeneous microarray platforms and analysis methods. Results: In contrast to a meta-analysis approach where results of different studies are combined on an interpretative level, we investigate here how to directly integrate raw microarray data from different studies for the purpose of supervised classification analysis. We use median rank scores and quantile discretization to derive numerically comparable measures of gene expression from different platforms. These transformed data are then used for training of classifiers based on support vector machines. We apply this approach to six publicly available cancer microarray gene expression data sets, which consist of three pairs of studies, each examining the same type of cancer, i.e. breast cancer, prostate cancer or acute myeloid leukemia. For each pair, one study was performed by means of cDNA microarrays and the other by means of oligonucleotide microarrays. In each pair, high classification accuracies (> 85%) were achieved with training and testing on data instances randomly chosen from both data sets in a cross-validation analysis. To exemplify the potential of this cross-platform classification analysis, we use two leukemia microarray data sets to show that important genes with regard to the biology of leukemia are selected in an integrated analysis, which are missed in either single-set analysis. Conclusion: Cross-platform classification of multiple cancer microarray data sets yields discriminative gene expression signatures that are found and validated on a large number of microarray samples, generated by different laboratories and
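
The idea behind quantile discretization, one of the two transformations named in this abstract, can be shown with a short sketch. It assumes that discretization is done within each sample by rank, so that values measured on different scales map to the same small set of bin labels. The number of bins, the tie handling and the function name `quantile_discretize` are illustrative assumptions, not details taken from the paper.

```python
def quantile_discretize(values, n_bins=4):
    """Map each value to a bin index in 0..n_bins-1 by within-sample rank.

    Ties are broken by position, because Python's sort is stable.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    bins = [0] * n
    for rank, idx in enumerate(order):
        # Equal-frequency bins: rank r of n falls in bin floor(r * n_bins / n).
        bins[idx] = rank * n_bins // n
    return bins


# Hypothetical measurements of the same four genes on two platforms:
# log-ratios from a cDNA array and raw intensities from an oligonucleotide array.
cdna_sample = [0.2, 1.5, -0.7, 3.1]
oligo_sample = [120.0, 800.0, 35.0, 5000.0]

cdna_bins = quantile_discretize(cdna_sample)
oligo_bins = quantile_discretize(oligo_sample)
```

Because only the ordering of genes within a sample is used, the two hypothetical samples above receive identical bin labels even though their raw scales differ by orders of magnitude, which is what makes the transformed values comparable across platforms.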

  14. AN ADABOOST OPTIMIZED CCFIS BASED CLASSIFICATION MODEL FOR BREAST CANCER DETECTION

    Directory of Open Access Journals (Sweden)

    CHANDRASEKAR RAVI

    2017-06-01

    Classification is a data mining technique used for building a prototype of the data behaviour, with which unseen data can be classified into one of the defined classes. Several researchers have proposed classification techniques, but most of them did not place much emphasis on the misclassified instances and storage space. In this paper, a classification model is proposed that takes into account the misclassified instances and storage space. The classification model is efficiently developed using a tree structure for reducing the storage complexity and uses a single scan of the dataset. During the training phase, Class-based Closed Frequent ItemSets (CCFIS) were mined from the training dataset in the form of a tree structure. The classification model has been developed using the CCFIS and a similarity measure based on the Longest Common Subsequence (LCS). Further, the Particle Swarm Optimization algorithm is applied to the generated CCFIS, which assigns weights to the itemsets and their associated classes. Most classifiers correctly classify the common instances but misclassify the rare instances. In view of that, the AdaBoost algorithm has been used to boost the weights of the instances misclassified in the previous round so as to include them in the training phase to classify the rare instances. This improves the accuracy of the classification model. During the testing phase, the classification model is used to classify the instances of the test dataset. The Breast Cancer dataset from the UCI repository is used for the experiment. Experimental analysis shows that the accuracy of the proposed classification model outperforms the PSOAdaBoost-Sequence classifier by 7% and is superior to other approaches such as the Naïve Bayes Classifier, Support Vector Machine Classifier, Instance Based Classifier, ID3 Classifier, J48 Classifier, etc.

  15. Cancer pain: A critical review of mechanism-based classification and physical therapy management in palliative care

    Directory of Open Access Journals (Sweden)

    Senthil P Kumar

    2011-01-01

    Mechanism-based classification and physical therapy management of pain is essential to effectively manage painful symptoms in patients attending palliative care. The objective of this review is to provide a detailed review of mechanism-based classification and physical therapy management of patients with cancer pain. Cancer pain can be classified based upon pain symptoms, pain mechanisms and pain syndromes. Classification based upon mechanisms not only addresses the underlying pathophysiology but also provides an understanding of the patient's symptoms and treatment responses. Existing evidence suggests that the five mechanisms - central sensitization, peripheral sensitization, sympathetically maintained pain, nociceptive and cognitive-affective - operate in patients with cancer pain. A summary of studies showing evidence for physical therapy treatment methods for cancer pain follows, with suggested therapeutic implications. Effective palliative physical therapy care using a mechanism-based classification model should be tailored to suit each patient's findings, using a biopsychosocial model of pain.

  16. Training ANFIS structure using genetic algorithm for liver cancer classification based on microarray gene expression data

    Directory of Open Access Journals (Sweden)

    Bülent Haznedar

    2017-02-01

    Classification is an important data mining technique used in many fields, such as medicine, genetics and biomedical engineering. The number of studies on the classification of DNA microarray gene expression data has increased notably in recent years. However, because of the large number of genes in microarray gene expression data and the mostly nonlinear relations among those data, the success of conventional classification algorithms can be limited. For these reasons, interest in classification methods based on artificial intelligence has gradually increased in recent times. In this study, a hybrid approach based on the Adaptive Neuro-Fuzzy Inference System (ANFIS) and the Genetic Algorithm (GA) is proposed to classify a liver cancer microarray data set. Simulation results are compared with the results of other methods. According to the results obtained, the proposed method performs better than the other methods.

  17. Usage of case-based reasoning, neural network and adaptive neuro-fuzzy inference system classification techniques in breast cancer dataset classification diagnosis.

    Science.gov (United States)

    Huang, Mei-Ling; Hung, Yung-Hsiang; Lee, Wen-Ming; Li, R K; Wang, Tzu-Hao

    2012-04-01

    Breast cancer is common among women worldwide. Today, technological advancements in cancer treatment have increased survival rates. Many theoretical and experimental studies have shown that a multiple classifier system is an effective technique for reducing prediction errors. This study compared the particle swarm optimizer (PSO) based artificial neural network (ANN), the adaptive neuro-fuzzy inference system (ANFIS), and a case-based reasoning (CBR) classifier with a logistic regression model and decision tree model. It also applied three classification techniques to the Mammographic Mass Data Set, and measured their improvements in accuracy and classification errors. The experimental results showed that the best CBR-based classification accuracy is 83.60%, and the classification accuracies of the PSO-based ANN classifier and ANFIS are 91.10% and 92.80%, respectively.

  18. Classification of lung cancer tumors based on structural and physicochemical properties of proteins by bioinformatics models.

    Directory of Open Access Journals (Sweden)

    Faezeh Hosseinzadeh

    Rapid distinction between small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC) tumors is very important in the diagnosis of this disease. Furthermore, sequence-derived structural and physicochemical descriptors are very useful for machine learning prediction of protein structural and functional classes, classifying proteins and the prediction performance. This study investigates the classification of lung tumors based on 1497 attributes derived from structural and physicochemical properties of protein sequences (based on genes defined by microarray analysis) through a combination of attribute weighting, supervised and unsupervised clustering algorithms. Eighty percent of the weighting methods selected features such as autocorrelation, dipeptide composition and distribution of hydrophobicity as the most important protein attributes in classification of SCLC, NSCLC and COMMON classes of lung tumors. The same results were observed by most tree induction algorithms, while descriptors of hydrophobicity distribution were high in protein sequences COMMON in both groups and distribution of charge in these proteins was very low, showing that COMMON proteins were very hydrophobic. Furthermore, compositions of polar dipeptide in SCLC proteins were higher than in NSCLC proteins. Some clustering models (alone or in combination with attribute weighting algorithms) were able to nearly classify SCLC and NSCLC proteins. The Random Forest tree induction algorithm, evaluated with leave-one-out and 10-fold cross validation, shows more than 86% accuracy in clustering and predicting three different lung cancer tumors. Here, for the first time, the application of data mining tools to effectively classify three classes of lung cancer tumors regarding the importance of dipeptide composition, autocorrelation and distribution descriptor is reported.

  19. Classification of Individual Lung Cancer Cell Lines Based on DNA Methylation Markers

    Science.gov (United States)

    Marchevsky, Alberto M.; Tsou, Jeffrey A.; Laird-Offringa, Ite A.

    2004-01-01

    The classification of small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC) can pose diagnostic problems due to inter-observer variability and other limitations of histopathology. There is an interest in developing classificatory models of lung neoplasms based on the analysis of multivariate molecular data with statistical methods and/or neural networks. DNA methylation levels at 20 loci were measured in 41 SCLC and 46 NSCLC cell lines with the quantitative real-time PCR method MethyLight. The data were analyzed with artificial neural networks (ANN) and linear discriminant analysis (LDA) to classify the cell lines into SCLC or into NSCLC. Models used either data from all 20 loci, or from five significant DNA methylation loci that were selected by a step-wise back-propagation procedure (PTGS2, CALCA, MTHFR, ESR1, and CDKN2A). The data were sorted randomly by cell line into 10 different data sets, each with training and testing subsets composed of 71 and 16 of the cases, respectively. Ten ANN models were trained using the 10 data sets: five using 20 variables, and five using the five variables selected by step-wise back-propagation. The ANN models with 20 input variables correctly classified 100% of the cell lines, while the models with only five variables correctly classified 87 to 100% of cases. For comparison, 10 different LDA models were trained and tested using the same data sets with either the original data or with logarithmically transformed data. Again, half of the models used all 20 variables while the others used only the five significant variables. LDA models provided correct classifications in 62.5% to 87.5% of cases. The classifications provided by all of the different models were compared with kappa statistics, yielding kappa values ranging from 0.25 to 1.0. We conclude that ANN models based on DNA methylation profiles can objectively classify SCLC and NSCLC cell lines with substantial to perfect concordance, while LDA models based on
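
The abstract compares model outputs with kappa statistics but does not say which variant was used. As a minimal sketch, assuming Cohen's kappa for two raters, agreement is computed as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance from the label frequencies. The function name `cohen_kappa` and the label vectors below are hypothetical; only the test-set size of 16 mirrors the abstract.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long label sequences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(labels_a)
    # Observed agreement: fraction of cases where both models agree.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from the marginal label frequencies of each model.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_e == 1.0:
        # Both models always emit the same single label; treat as full agreement.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical predictions from two models on a 16-case test subset.
model_1 = ["SCLC"] * 8 + ["NSCLC"] * 8
model_2 = ["SCLC"] * 6 + ["NSCLC"] * 2 + ["SCLC"] * 2 + ["NSCLC"] * 6

kappa_same = cohen_kappa(model_1, model_1)
kappa_diff = cohen_kappa(model_1, model_2)
```

In this toy case the two models agree on 12 of 16 cases (p_o = 0.75) while chance agreement is 0.5, so kappa is 0.5; identical predictions give a kappa of 1.0.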

  20. Actionable gene-based classification toward precision medicine in gastric cancer

    Directory of Open Access Journals (Sweden)

    Hiroshi Ichikawa

    2017-10-01

    Background: Intertumoral heterogeneity represents a significant hurdle to identifying optimized targeted therapies in gastric cancer (GC). To realize precision medicine for GC patients, an actionable gene alteration-based molecular classification that directly associates GCs with targeted therapies is needed. Methods: A total of 207 Japanese patients with GC were included in this study. Formalin-fixed, paraffin-embedded (FFPE) tumor tissues were obtained from surgical or biopsy specimens and were subjected to DNA extraction. We generated comprehensive genomic profiling data using a 435-gene panel including 69 actionable genes paired with US Food and Drug Administration-approved targeted therapies, and the evaluation of Epstein-Barr virus (EBV) infection and microsatellite instability (MSI) status. Results: Comprehensive genomic sequencing detected at least one alteration of 435 cancer-related genes in 194 GCs (93.7%) and of 69 actionable genes in 141 GCs (68.1%). We classified the 207 GCs into four The Cancer Genome Atlas (TCGA) subtypes using the genomic profiling data: EBV (N = 9), MSI (N = 17), chromosomal instability (N = 119), and genomically stable subtype (N = 62). Actionable gene alterations were not specific and were widely observed throughout all TCGA subtypes. To discover a novel classification which more precisely selects candidates for targeted therapies, 207 GCs were classified using hypermutated phenotype and the mutation profile of 69 actionable genes. We identified a hypermutated group (N = 32), while the others (N = 175) were sub-divided into six clusters including five with actionable gene alterations: ERBB2 (N = 25), CDKN2A and CDKN2B (N = 10), KRAS (N = 10), BRCA2 (N = 9), and ATM cluster (N = 12). The clinical utility of this classification was demonstrated by a case of unresectable GC with a remarkable response to anti-HER2 therapy in the ERBB2 cluster. Conclusions: This actionable gene-based

  1. Proteomic-Based Biosignatures in Breast Cancer Classification and Prediction of Therapeutic Response

    Science.gov (United States)

    He, Jianbo; Whelan, Stephen A.; Lu, Ming; Shen, Dejun; Chung, Debra U.; Saxton, Romaine E.; Faull, Kym F.; Whitelegge, Julian P.; Chang, Helena R.

    2011-01-01

    Protein-based markers that classify tumor subtypes and predict therapeutic response would be clinically useful in guiding patient treatment. We investigated the LC-MS/MS-identified protein biosignatures in 39 baseline breast cancer specimens including 28 HER2-positive and 11 triple-negative (TNBC) tumors. Twenty proteins were found to correctly classify all HER2 positive and 7 of the 11 TNBC tumors. Among them, galectin-3-binding protein and ALDH1A1 were found preferentially elevated in TNBC, whereas CK19, transferrin, transketolase, and thymosin β4 and β10 were elevated in HER2-positive cancers. In addition, several proteins such as enolase, vimentin, peroxiredoxin 5, Hsp 70, periostin precursor, RhoA, cathepsin D preproprotein, and annexin 1 were found to be associated with the tumor responses to treatment within each subtype. The MS-based proteomic findings appear promising in guiding tumor classification and predicting response. When sufficiently validated, some of these candidate protein markers could have great potential in improving breast cancer treatment. PMID:22110952

  2. Epigenetic and genetic alterations-based molecular classification of head and neck cancer.

    Science.gov (United States)

    Feng, Zhien; Xu, Qin; Chen, Wantao

    2012-04-01

    The long-term survival rates for patients diagnosed with advanced head and neck cancer (HNC) remain poor. Many perplexing factors, including etiology and comorbidity, lead to different molecular malfunctions of HNC cells and determine the prognosis of the disease. Traditional diagnostic methods are limited in that they fail to provide an effective classification diagnosis, such as a more precise prediction of prognosis and decisions for personalized treatment regimens. Recently, molecular biology techniques, especially epigenetic and genetic techniques, have been developed that have enabled us to gain a greater insight into the molecular pathways underlying the cancers. Translating the research into a format that will facilitate effective molecular classification, support personalized treatment and determine prognosis remains a challenge. In this review, the authors provide an overview of cancer epigenetic and genetic alterations, tissue banks, and several promising biomarkers or candidates that may ultimately prove to be beneficial in a clinical setting for patients with HNC.

  3. Proteomic classification of breast cancer.

    LENUS (Irish Health Repository)

    Kamel, Dalia

    2012-11-01

    Being a significant health problem that affects patients in various age groups, breast cancer has been extensively studied to date. Recently, molecular breast cancer classification has advanced significantly with the availability of genomic profiling technologies. Proteomic technologies have also advanced from traditional protein assays including enzyme-linked immunosorbent assay, immunoblotting and immunohistochemistry to more comprehensive approaches including mass spectrometry and reverse phase protein lysate arrays (RPPA). The purpose of this manuscript is to review the current protein markers that influence breast cancer prediction and prognosis and to focus on novel advances in proteomic classification of breast cancer.

  4. Biomarker Discovery Based on Hybrid Optimization Algorithm and Artificial Neural Networks on Microarray Data for Cancer Classification.

    Science.gov (United States)

    Moteghaed, Niloofar Yousefi; Maghooli, Keivan; Pirhadi, Shiva; Garshasbi, Masoud

    2015-01-01

    Improvements in high-throughput, microarray-based gene profiling technology have made it possible to monitor the expression values of thousands of genes simultaneously. Detailed examination of changes in gene expression levels can help physicians diagnose efficiently, classify tumors and cancer types, and choose effective treatments. The main purpose of this paper is to find genes that can correctly classify groups of cancers using hybrid optimization algorithms. A hybrid particle swarm optimization and genetic algorithm method is used for gene selection, and an artificial neural network (ANN) is adopted as the classifier. In this work, we have improved the ability of the algorithm for the classification problem by finding a small group of biomarkers and also the best parameters of the classifier. The proposed approach is tested on three benchmark gene expression data sets: blood (acute myeloid leukemia, acute lymphoblastic leukemia), colon and breast datasets. We used 10-fold cross-validation to estimate accuracy and a decision tree algorithm to find the relations between the biomarkers from a biological point of view. To test the ability of the trained ANN models to categorize the cancers, we analyzed additional blinded samples that were not previously used for the training procedure. Experimental results show that the proposed method can reduce the dimension of the data set, confirm the most informative gene subset, and improve classification accuracy with the best parameters for each dataset.
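
    The 10-fold cross-validation mentioned in this abstract can be sketched as follows. This is a minimal illustration only, not the authors' code: the function name `k_fold_indices` is hypothetical, and shuffling and class stratification, which a real evaluation would normally add, are left out for brevity.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists for k contiguous, near-equal folds.

    Illustrative helper (hypothetical name); no shuffling or stratification.
    """
    indices = list(range(n_samples))
    # The first (n_samples % k) folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# Example: split 62 samples (an arbitrary count) into 10 folds.
for fold, (train, test) in enumerate(k_fold_indices(62, k=10)):
    print(fold, len(train), len(test))
```

    Each sample lands in exactly one test fold, so averaging the per-fold accuracies uses every sample once for testing.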

  5. Classification of Colon Cancer Patients Based on the Methylation Patterns of Promoters

    Directory of Open Access Journals (Sweden)

    Wonyoung Choi

    2016-06-01

    Full Text Available Diverse somatic mutations have been reported to serve as cancer drivers. Recently, it has also been reported that epigenetic regulation is closely related to cancer development. However, the effect of epigenetic changes on cancer is still elusive. In this study, we analyzed DNA methylation data on colon cancer taken from The Cancer Genome Atlas. We found that several promoters were significantly hypermethylated in colon cancer patients. Through clustering analysis of differentially methylated DNA regions, we were able to define subgroups of patients and observed clinical features associated with each subgroup. In addition, we analyzed the functional ontology of aberrantly methylated genes and identified the G-protein-coupled receptor signaling pathway as one of the major pathways affected epigenetically. In conclusion, our analysis shows the possibility of characterizing the clinical features of colon cancer subgroups based on DNA methylation patterns and provides lists of important genes and pathways possibly involved in colon cancer development.

  6. Borderline resectable pancreatic cancer: conceptual evolution and current approach to image-based classification.

    Science.gov (United States)

    Gilbert, J W; Wolpin, B; Clancy, T; Wang, J; Mamon, H; Shinagare, A B; Jagannathan, J; Rosenthal, M

    2017-09-01

    Diagnostic imaging plays a critical role in the initial diagnosis and therapeutic monitoring of pancreatic adenocarcinoma. Over the past decade, the concept of 'borderline resectable' pancreatic cancer has emerged to describe a distinct subset of patients existing along the spectrum from resectable to locally advanced disease for whom a microscopically margin-positive (R1) resection is considered relatively more likely, primarily due to the relationship of the primary tumor with surrounding vasculature. This review traces the conceptual evolution of borderline resectability from a radiological perspective, including the debates over the key imaging criteria that define the thresholds between resectable, borderline resectable, and locally advanced or metastatic disease. This review also addresses the data supporting neoadjuvant therapy in this population and discusses current imaging practices before and during treatment. A growing body of evidence suggests that the borderline resectable group of patients may particularly benefit from neoadjuvant therapy to increase the likelihood of an ultimately margin-negative (R0) resection. Unfortunately, anatomic and imaging criteria to define borderline resectability are not yet universally agreed upon, with several classification systems proposed in the literature and considerable variance in institution-by-institution practice. As a result of this lack of consensus, as well as overall small patient numbers and lack of established clinical trials dedicated to borderline resectable patients, accurate evidence-based diagnostic categorization and treatment selection for this subset of patients remains a significant challenge. Clinicians and radiologists alike should be cognizant of evolving imaging criteria for borderline resectability given their profound implications for treatment strategy, follow-up recommendations, and prognosis.

  7. Genetic Fuzzy System (GFS) based wavelet co-occurrence feature selection in mammogram classification for breast cancer diagnosis

    Directory of Open Access Journals (Sweden)

    Meenakshi M. Pawar

    2016-09-01

    Full Text Available Breast cancer is a significant health problem diagnosed mostly in women worldwide. Therefore, early detection of breast cancer is performed with the help of digital mammography, which can reduce the mortality rate. This paper presents a wrapper-based feature selection approach for wavelet co-occurrence features (WCF) using a Genetic Fuzzy System (GFS) in the mammogram classification problem. The performance of the GFS algorithm is explained using the mini-MIAS database. WCF features are obtained from the detail wavelet coefficients at each level of decomposition of the mammogram image. At the first level of decomposition, 18 features are applied to the GFS algorithm, which selects 5 features with an average classification success rate of 39.64%. Subsequently, at the second level it selects 9 features from 36 features and the classification success rate is improved to 56.75%. For the third level, 16 features are selected from 54 features and the average success rate is improved to 64.98%. Lastly, at the fourth level 72 features are applied to GFS, which selects 16 features, thereby increasing the average success rate to 89.47%. Hence, the GFS algorithm is an effective way of obtaining an optimal set of features in breast cancer diagnosis.

  8. Molecular classification and prediction in gastric cancer

    Directory of Open Access Journals (Sweden)

    Xiandong Lin

    2015-01-01

    Full Text Available Gastric cancer, a highly heterogeneous disease, is the second leading cause of cancer death and the fourth most common cancer globally, with East Asia accounting for more than half of cases annually. Alongside TNM staging, the gastric cancer clinic has two well-recognized classification systems: the Lauren classification, which subdivides gastric adenocarcinoma into intestinal and diffuse types, and the alternative World Health Organization system, which divides gastric cancer into papillary, tubular, mucinous (colloid), and poorly cohesive carcinomas. Both classification systems enable a better understanding of the histogenesis and the biology of gastric cancer yet have a limited clinical utility in guiding patient therapy due to the molecular heterogeneity of gastric cancer. Unprecedented whole-genome-scale data have been catalyzing and advancing the molecular subtyping approach. Here we cataloged and compared those published gene expression profiling signatures in gastric cancer. We summarized recent integrated genomic characterization of gastric cancer based on additional data of somatic mutation, chromosomal instability, EBV infection, and DNA methylation. We identified the consensus patterns across these signatures and identified the underlying molecular pathways and biological functions. The identification of molecular subtyping of gastric adenocarcinoma and the development of integrated genomics approaches for clinical applications such as prediction of clinical intervening emerge as an essential phase toward personalized medicine in treating gastric cancer.

  9. Interval-value Based Particle Swarm Optimization algorithm for cancer-type specific gene selection and sample classification

    Directory of Open Access Journals (Sweden)

    D. Ramyachitra

    2015-09-01

    Full Text Available Microarray technology allows simultaneous measurement of the expression levels of thousands of genes within a biological tissue sample. The fundamental power of microarrays lies within the ability to conduct parallel surveys of gene expression using microarray data. The classification of tissue samples based on gene expression data is an important problem in medical diagnosis of diseases such as cancer. In gene expression data, the number of genes is usually very high compared to the number of data samples. Thus the difficulty that lies with data are of high dimensionality and the sample size is small. This research work addresses the problem by classifying resultant dataset using the existing algorithms such as Support Vector Machine (SVM), K-nearest neighbor (KNN), Interval Valued Classification (IVC) and the improvised Interval Value based Particle Swarm Optimization (IVPSO) algorithm. Thus the results show that the IVPSO algorithm outperformed compared with other algorithms under several performance evaluation functions.

  10. Interval-value Based Particle Swarm Optimization algorithm for cancer-type specific gene selection and sample classification.

    Science.gov (United States)

    Ramyachitra, D; Sofia, M; Manikandan, P

    2015-09-01

    Microarray technology allows simultaneous measurement of the expression levels of thousands of genes within a biological tissue sample. The fundamental power of microarrays lies within the ability to conduct parallel surveys of gene expression using microarray data. The classification of tissue samples based on gene expression data is an important problem in medical diagnosis of diseases such as cancer. In gene expression data, the number of genes is usually very high compared to the number of data samples. Thus the difficulty that lies with data are of high dimensionality and the sample size is small. This research work addresses the problem by classifying resultant dataset using the existing algorithms such as Support Vector Machine (SVM), K-nearest neighbor (KNN), Interval Valued Classification (IVC) and the improvised Interval Value based Particle Swarm Optimization (IVPSO) algorithm. Thus the results show that the IVPSO algorithm outperformed compared with other algorithms under several performance evaluation functions.

  11. Breast cancer detection and classification in digital mammography based on Non-Subsampled Contourlet Transform (NSCT) and Super Resolution.

    Science.gov (United States)

    Pak, Fatemeh; Kanan, Hamidreza Rashidy; Alikhassi, Afsaneh

    2015-11-01

    Breast cancer is one of the most perilous diseases among women. Breast screening is a method of detecting breast cancer at a very early stage which can reduce the mortality rate. Mammography is a standard method for the early diagnosis of breast cancer. In this paper, a new algorithm is proposed for breast cancer detection and classification in digital mammography based on Non-Subsampled Contourlet Transform (NSCT) and Super Resolution (SR). The presented algorithm includes three main parts including pre-processing, feature extraction and classification. In the pre-processing stage, after determining the region of interest (ROI) by an automatic technique, the quality of image is improved using NSCT and SR algorithm. In the feature extraction part, several features of the image components are extracted and skewness of each feature is calculated. Finally, AdaBoost algorithm is used to classify and determine the probability of benign and malign disease. The obtained results on Mammographic Image Analysis Society (MIAS) database indicate the significant performance and superiority of the proposed method in comparison with the state of the art approaches. According to the obtained results, the proposed technique achieves 91.43% and 6.42% as a mean accuracy and FPR, respectively. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  12. Lung cancer gene expression database analysis incorporating prior knowledge with support vector machine-based classification method

    Directory of Open Access Journals (Sweden)

    Huang Desheng

    2009-07-01

    Full Text Available Abstract Background A reliable and precise classification is essential for successful diagnosis and treatment of cancer. Gene expression microarrays have provided the high-throughput platform to discover genomic biomarkers for cancer diagnosis and prognosis. Rational use of the available bioinformation can not only effectively remove or suppress noise in gene chips, but also avoid one-sided results of separate experiment. However, only some studies have been aware of the importance of prior information in cancer classification. Methods Together with the application of support vector machine as the discriminant approach, we proposed one modified method that incorporated prior knowledge into cancer classification based on gene expression data to improve accuracy. A public well-known dataset, Malignant pleural mesothelioma and lung adenocarcinoma gene expression database, was used in this study. Prior knowledge is viewed here as a means of directing the classifier using known lung adenocarcinoma related genes. The procedures were performed by software R 2.80. Results The modified method performed better after incorporating prior knowledge. Accuracy of the modified method improved from 98.86% to 100% in training set and from 98.51% to 99.06% in test set. The standard deviations of the modified method decreased from 0.26% to 0 in training set and from 3.04% to 2.10% in test set. Conclusion The method that incorporates prior knowledge into discriminant analysis could effectively improve the capacity and reduce the impact of noise. This idea may have good future not only in practice but also in methodology.

  13. Pitch Based Sound Classification

    DEFF Research Database (Denmark)

    Nielsen, Andreas Brinch; Hansen, Lars Kai; Kjems, U

    2006-01-01

    A sound classification model is presented that can classify signals into music, noise and speech. The model extracts the pitch of the signal using the harmonic product spectrum. Based on the pitch estimate and a pitch error measure, features are created and used in a probabilistic model with soft-max output function. Both linear and quadratic inputs are used. The model is trained on 2 hours of sound and tested on publicly available data. A test classification error below 0.05 with 1 s classification windows is achieved. Furthermore, it is shown that linear input performs as well as a quadratic, and that even though classification gets marginally better, not much is achieved by increasing the window size beyond 1 s.
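
    The harmonic product spectrum named in this abstract can be illustrated with a short sketch. It is not the authors' implementation: the function names, the naive DFT (used only to stay dependency-free), the choice of three harmonics, and the synthetic test tone are all assumptions made for illustration.

```python
import cmath
import math


def magnitude_spectrum(signal):
    """Naive DFT magnitudes for bins 0..N/2 (adequate for short frames)."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = 0j
        for t, x in enumerate(signal):
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        spectrum.append(abs(acc))
    return spectrum


def hps_pitch(signal, sample_rate, harmonics=3):
    """Estimate the pitch in Hz as the peak of the harmonic product spectrum.

    For each candidate bin k, multiply the magnitudes at k, 2k, ..., and
    pick the bin with the largest product.
    """
    mag = magnitude_spectrum(signal)
    limit = len(mag) // harmonics  # keep every k * h inside the spectrum
    best_bin, best_score = 1, -1.0
    for k in range(1, limit):
        score = 1.0
        for h in range(1, harmonics + 1):
            score *= mag[k * h]
        if score > best_score:
            best_bin, best_score = k, score
    return best_bin * sample_rate / len(signal)


# Synthetic frame: 200 Hz fundamental plus two weaker harmonics.
sample_rate = 8000
frame = [
    math.sin(2 * math.pi * 200 * t / sample_rate)
    + 0.5 * math.sin(2 * math.pi * 400 * t / sample_rate)
    + 0.25 * math.sin(2 * math.pi * 600 * t / sample_rate)
    for t in range(400)
]
print(hps_pitch(frame, sample_rate))
```

    Multiplying across harmonics rewards the bin whose integer multiples also carry energy, which is why the estimate settles on the fundamental rather than on one of its harmonics. A practical version would use an FFT and a window function.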

  14. Using rule-based machine learning for candidate disease gene prioritization and sample classification of cancer gene expression data.

    Directory of Open Access Journals (Sweden)

    Enrico Glaab

    Full Text Available Microarray data analysis has been shown to provide an effective tool for studying cancer and genetic diseases. Although classical machine learning techniques have successfully been applied to find informative genes and to predict class labels for new samples, common restrictions of microarray analysis such as small sample sizes, a large attribute space and high noise levels still limit its scientific and clinical applications. Increasing the interpretability of prediction models while retaining a high accuracy would help to exploit the information content in microarray data more effectively. For this purpose, we evaluate our rule-based evolutionary machine learning systems, BioHEL and GAssist, on three public microarray cancer datasets, obtaining simple rule-based models for sample classification. A comparison with other benchmark microarray sample classifiers based on three diverse feature selection algorithms suggests that these evolutionary learning techniques can compete with state-of-the-art methods like support vector machines. The obtained models reach accuracies above 90% in two-level external cross-validation, with the added value of facilitating interpretation by using only combinations of simple if-then-else rules. As a further benefit, a literature mining analysis reveals that prioritizations of informative genes extracted from BioHEL's classification rule sets can outperform gene rankings obtained from a conventional ensemble feature selection in terms of the pointwise mutual information between relevant disease terms and the standardized names of top-ranked genes.

  15. Laser Raman detection for oral cancer based on a Gaussian process classification method

    Science.gov (United States)

    Du, Zhanwei; Yang, Yongjian; Bai, Yuan; Wang, Lijun; Zhang, Chijun; Chen, He; Luo, Yusheng; Su, Le; Chen, Yong; Li, Xianchang; Zhou, Xiaodong; Jia, Jun; Shen, Aiguo; Hu, Jiming

    2013-06-01

    Oral squamous cell carcinoma is the most common neoplasm of the oral cavity. The incidence rate accounts for 80% of total oral cancer and shows an upward trend in recent years. It has a high degree of malignancy and is difficult to detect in terms of differential diagnosis, as a consequence of which the timing of treatment is always delayed. In this work, Raman spectroscopy was adopted to differentially diagnose oral squamous cell carcinoma and oral gland carcinoma. In total, 852 entries of raw spectral data which consisted of 631 items from 36 oral squamous cell carcinoma patients, 87 items from four oral gland carcinoma patients and 134 items from five normal people were collected by utilizing an optical method on oral tissues. The probability distribution of the datasets corresponding to the spectral peaks of the oral squamous cell carcinoma tissue was analyzed and the experimental result showed that the data obeyed a normal distribution. Moreover, the distribution characteristic of the noise was also in compliance with a Gaussian distribution. A Gaussian process (GP) classification method was utilized to distinguish the normal people and the oral gland carcinoma patients from the oral squamous cell carcinoma patients. The experimental results showed that all the normal people could be recognized. 83.33% of the oral squamous cell carcinoma patients could be correctly diagnosed and the remaining ones would be diagnosed as having oral gland carcinoma. For the classification process of oral gland carcinoma and oral squamous cell carcinoma, the correct ratio was 66.67% and the erroneously diagnosed percentage was 33.33%. The total sensitivity was 80% and the specificity was 100%, with a Matthews correlation coefficient (MCC) of 0.447213595. Considering the numerical results above, the application prospects and clinical value of this technique are significantly impressive.
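
    Sensitivity, specificity and the Matthews correlation coefficient quoted in this abstract are all derived from a binary confusion matrix. The sketch below shows the standard formulas; the counts in the usage lines are hypothetical and are not the study's data.

```python
import math


def sensitivity(tp, fn):
    """True positive rate: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: TN / (TN + FP)."""
    return tn / (tn + fp)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a 2x2 confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column is empty
    return (tp * tn - fp * fn) / denom


# Hypothetical counts chosen only to exercise the functions.
tp, fn, tn, fp = 8, 2, 5, 0
print(sensitivity(tp, fn), specificity(tn, fp), mcc(tp, tn, fp, fn))
```

    Unlike sensitivity and specificity, the MCC uses all four cells at once, so it stays informative when the classes are very unequal in size.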

  16. Mechanism-based classification and physical therapy management of persons with cancer pain: A prospective case series

    Directory of Open Access Journals (Sweden)

    Senthil P Kumar

    2013-01-01

    Full Text Available Context: Mechanism-based classification (MBC) was established with current evidence and physical therapy (PT) management methods for both cancer and for noncancer pain. Aims: This study aims to describe the efficacy of MBC-based PT in persons with primary complaints of cancer pain. Settings and Design: A prospective case series of patients who attended the physiotherapy department of a multispecialty university-affiliated teaching hospital. Material and Methods: A total of 24 adults (18 female, 6 male) aged 47.5 ± 10.6 years, with primary diagnosis of heterogeneous group of cancer, chief complaints of chronic disabling pain were included in the study on their consent for participation. The patients were evaluated and classified on the basis of five predominant mechanisms for pain. Physical therapy interventions were recommended based on mechanisms identified and home program was prescribed with a patient log to ensure compliance. Treatments were given in five consecutive weekly sessions for five weeks each of 30 min duration. Statistical Analysis Used: Pre-post comparisons for pain severity (PS) and pain interference (PI) subscales of Brief pain inventory-Cancer pain (BPI-CP) and European organization for research and treatment in cancer-quality of life questionnaire (EORTC-QLQ-C30) were done using Wilcoxon signed-rank test at 95% confidence interval using SPSS for Windows version 16.0 (SPSS Inc, Chicago, IL). Results: There were statistically significant (P < 0.05) reduction in pain severity, pain interference and total BPI-CP scores, and the EORTC-QLQ-C30. Conclusion: MBC-PT was effective for improving BPI-CP and EORTC-QLQ-C30 scores in people with cancer pain.

  17. Non-linear cancer classification using a modified radial basis function classification algorithm.

    Science.gov (United States)

    Wang, Hong-Qiang; Huang, De-Shuang

    2005-10-01

    This paper proposes a modified radial basis function classification algorithm for non-linear cancer classification. In the algorithm, a modified simulated annealing method is developed and combined with the linear least square and gradient paradigms to optimize the structure of the radial basis function (RBF) classifier. The proposed algorithm can be adopted to perform non-linear cancer classification based on gene expression profiles and applied to two microarray data sets involving various human tumor classes: (1) Normal versus colon tumor; (2) acute myeloid leukemia (AML) versus acute lymphoblastic leukemia (ALL). Finally, accuracy and stability for the proposed algorithm are further demonstrated by comparing with the other cancer classification algorithms.

  18. An iterated Laplacian based semi-supervised dimensionality reduction for classification of breast cancer on ultrasound images.

    Science.gov (United States)

    Liu, Xiao; Shi, Jun; Zhou, Shichong; Lu, Minhua

    2014-01-01

    Dimensionality reduction is an important step in ultrasound image based computer-aided diagnosis (CAD) for breast cancer. A newly proposed l2,1 regularized correntropy algorithm for robust feature selection (CRFS) has achieved good performance for noise corrupted data. Therefore, it has the potential to reduce the dimensions of ultrasound image features. However, in clinical practice, the collection of labeled instances is usually expensive and time-consuming, while it is relatively easy to acquire unlabeled or undetermined instances. Therefore, semi-supervised learning is very suitable for clinical CAD. The iterated Laplacian regularization (Iter-LR) is a new regularization method, which has been proved to outperform the traditional graph Laplacian regularization in semi-supervised classification and ranking. In this study, to augment the classification accuracy of the breast ultrasound CAD based on texture feature, we propose an Iter-LR-based semi-supervised CRFS (Iter-LR-CRFS) algorithm, and then apply it to reduce the feature dimensions of ultrasound images for breast CAD. We compared the Iter-LR-CRFS with LR-CRFS, original supervised CRFS, and principal component analysis. The experimental results indicate that the proposed Iter-LR-CRFS significantly outperforms all other algorithms.

  19. Clinical study of quantitative diagnosis of early cervical cancer based on the classification of acetowhitening kinetics

    Science.gov (United States)

    Wu, Tao; Cheung, Tak-Hong; Yim, So-Fan; Qu, Jianan Y.

    2010-03-01

    A quantitative colposcopic imaging system for the diagnosis of early cervical cancer is evaluated in a clinical study. This imaging technology based on 3-D active stereo vision and motion tracking extracts diagnostic information from the kinetics of acetowhitening process measured from the cervix of human subjects in vivo. Acetowhitening kinetics measured from 137 cervical sites of 57 subjects are analyzed and classified using multivariate statistical algorithms. Cross-validation methods are used to evaluate the performance of the diagnostic algorithms. The results show that an algorithm for screening precancer produced 95% sensitivity (SE) and 96% specificity (SP) for discriminating normal and human papillomavirus (HPV)-infected tissues from cervical intraepithelial neoplasia (CIN) lesions. For a diagnostic algorithm, 91% SE and 90% SP are achieved for discriminating normal tissue, HPV infected tissue, and low-grade CIN lesions from high-grade CIN lesions. The results demonstrate that the quantitative colposcopic imaging system could provide objective screening and diagnostic information for early detection of cervical cancer.

  20. Normed kernel function-based fuzzy possibilistic C-means (NKFPCM) algorithm for high-dimensional breast cancer database classification with feature selection is based on Laplacian Score

    Science.gov (United States)

    Lestari, A. W.; Rustam, Z.

    2017-07-01

    In the last decade, breast cancer has become the focus of world attention as this disease is one of the leading causes of death for women. Therefore, it is necessary to have the correct precautions and treatment. In previous studies, the Fuzzy Kernel K-Medoid algorithm has been used for multi-class data. This paper proposes an algorithm to classify high-dimensional breast cancer data using Fuzzy Possibilistic C-means (FPCM) and a new method based on clustering analysis using Normed Kernel Function-Based Fuzzy Possibilistic C-Means (NKFPCM). The objective of this paper is to obtain the best accuracy in classification of breast cancer data. In order to improve the accuracy of the two methods, the feature candidates are evaluated using feature selection, where the Laplacian Score is used. The results compare the accuracy and running time of FPCM and NKFPCM with and without feature selection.
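
    The Laplacian Score used here for feature selection ranks each feature by how well it preserves the local neighbourhood structure of the samples, with lower scores being better. The sketch below is an illustration under stated assumptions, not the paper's code: it uses a fully connected heat-kernel graph with bandwidth `t` (the usual formulation restricts the graph to k nearest neighbours), and the function name and toy data are hypothetical.

```python
import math


def laplacian_scores(X, t=1.0):
    """Return one Laplacian Score per feature column of X (lower is better).

    Assumption: fully connected graph with weights exp(-||xi - xj||^2 / t).
    """
    n, d = len(X), len(X[0])
    # Pairwise similarity matrix S with a zero diagonal.
    S = [[0.0 if i == j else
          math.exp(-sum((a - b) ** 2 for a, b in zip(X[i], X[j])) / t)
          for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in S]  # diagonal of the degree matrix D
    total = sum(deg)
    scores = []
    for r in range(d):
        f = [row[r] for row in X]
        # Remove the degree-weighted mean of the feature.
        mean = sum(fi * di for fi, di in zip(f, deg)) / total
        ft = [fi - mean for fi in f]
        # f^T L f written as a sum over edges, with L = D - S.
        num = 0.5 * sum(S[i][j] * (ft[i] - ft[j]) ** 2
                        for i in range(n) for j in range(n))
        den = sum(di * fi * fi for fi, di in zip(ft, deg))
        scores.append(num / den if den > 0 else float("inf"))
    return scores


# Toy data: column 0 separates two tight groups, column 1 varies inside them.
X = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
print(laplacian_scores(X))
```

    In this toy example the first column should score lower than the second, because neighbouring samples agree on it; a selection step would then keep the lowest-scoring features.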

  1. Computerized three-class classification of MRI-based prognostic markers for breast cancer

    Science.gov (United States)

    Bhooshan, Neha; Giger, Maryellen; Edwards, Darrin; Yuan, Yading; Jansen, Sanaz; Li, Hui; Lan, Li; Sattar, Husain; Newstead, Gillian

    2011-09-01

    The purpose of this study is to investigate whether computerized analysis using three-class Bayesian artificial neural network (BANN) feature selection and classification can characterize tumor grades (grade 1, grade 2 and grade 3) of breast lesions for prognostic classification on DCE-MRI. A database of 26 IDC grade 1 lesions, 86 IDC grade 2 lesions and 58 IDC grade 3 lesions was collected. The computer automatically segmented the lesions, and kinetic and morphological lesion features were automatically extracted. The discrimination tasks—grade 1 versus grade 3, grade 2 versus grade 3, and grade 1 versus grade 2 lesions—were investigated. Step-wise feature selection was conducted by three-class BANNs. Classification was performed with three-class BANNs using leave-one-lesion-out cross-validation to yield computer-estimated probabilities of being grade 3 lesion, grade 2 lesion and grade 1 lesion. Two-class ROC analysis was used to evaluate the performances. We achieved AUC values of 0.80 ± 0.05, 0.78 ± 0.05 and 0.62 ± 0.05 for grade 1 versus grade 3, grade 1 versus grade 2, and grade 2 versus grade 3, respectively. This study shows the potential for (1) applying three-class BANN feature selection and classification to CADx and (2) expanding the role of DCE-MRI CADx from diagnostic to prognostic classification in distinguishing tumor grades.
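
    The two-class ROC analysis in this abstract summarises each pairwise grade comparison with an AUC value. A minimal sketch of that quantity follows, computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one; the function name and the scores are hypothetical, and the study's standard errors are not reproduced.

```python
def roc_auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical classifier outputs, e.g. estimated probability of grade 3.
grade3_scores = [0.9, 0.8, 0.4]
grade1_scores = [0.7, 0.3]
print(roc_auc(grade3_scores, grade1_scores))
```

    An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a scale for reading the reported values of 0.62 to 0.80.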

  2. Identification of immune cell infiltration in hematoxylin-eosin stained breast cancer samples: texture-based classification of tissue morphologies

    Science.gov (United States)

    Turkki, Riku; Linder, Nina; Kovanen, Panu E.; Pellinen, Teijo; Lundin, Johan

    2016-03-01

    The characteristics of immune cells in the tumor microenvironment of breast cancer capture clinically important information. Despite the heterogeneity of tumor-infiltrating immune cells, it has been shown that the degree of infiltration assessed by visual evaluation of hematoxylin-eosin (H&E) stained samples has prognostic and possibly predictive value. However, quantification of the infiltration in H&E-stained tissue samples is currently dependent on visual scoring by an expert. Computer vision enables automated characterization of the components of the tumor microenvironment, and texture-based methods have successfully been used to discriminate between different tissue morphologies and cell phenotypes. In this study, we evaluate whether local binary pattern texture features with superpixel segmentation and classification with support vector machine can be utilized to identify immune cell infiltration in H&E-stained breast cancer samples. Guided with the pan-leukocyte CD45 marker, we annotated training and test sets from 20 primary breast cancer samples. In the training set of arbitrary sized image regions (n=1,116), a 3-fold cross-validation resulted in 98% accuracy and an area under the receiver-operating characteristic curve (AUC) of 0.98 to discriminate between immune cell-rich and -poor areas. In the test set (n=204), we achieved an accuracy of 96% and AUC of 0.99 to label cropped tissue regions correctly into immune cell-rich and -poor categories. The obtained results demonstrate strong discrimination between immune cell-rich and -poor tissue morphologies. The proposed method can provide a quantitative measurement of the degree of immune cell infiltration and can be applied to digitally scanned H&E-stained breast cancer samples for diagnostic purposes.
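
    The local binary pattern texture features mentioned in this abstract can be illustrated with the basic 8-neighbour variant. This is a sketch under assumptions, not the study's pipeline: the neighbour ordering, the `>=` threshold convention and the function names are illustrative choices, and the superpixel segmentation and support vector machine stages are omitted.

```python
def lbp_code(image, row, col):
    """8-bit local binary pattern code for an interior pixel.

    Each neighbour that is >= the centre pixel sets one bit; neighbours
    are visited clockwise starting at the top-left corner.
    """
    center = image[row][col]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


# Tiny grayscale patch with arbitrary intensities.
patch = [
    [1, 2, 3],
    [8, 5, 4],
    [7, 6, 9],
]
print(lbp_code(patch, 1, 1))
```

    The histogram of codes over a region, rather than any single code, is what would typically be passed to a classifier as the texture descriptor.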

  3. Gene selection and classification for cancer microarray data based on machine learning and similarity measures

    Directory of Open Access Journals (Sweden)

    Liu Qingzhong

    2011-12-01

    Full Text Available Abstract Background Microarray data have a high dimension of variables and a small sample size. In microarray data analyses, two important issues are how to choose genes, which provide reliable and good prediction for disease status, and how to determine the final gene set that is best for classification. Associations among genetic markers mean one can exploit information redundancy to potentially reduce classification cost in terms of time and money. Results To deal with redundant information and improve classification, we propose a gene selection method, Recursive Feature Addition, which combines supervised learning and statistical similarity measures. To determine the final optimal gene set for prediction and classification, we propose an algorithm, Lagging Prediction Peephole Optimization. By using six benchmark microarray gene expression data sets, we compared Recursive Feature Addition with recently developed gene selection methods: Support Vector Machine Recursive Feature Elimination, Leave-One-Out Calculation Sequential Forward Selection and several others. Conclusions On average, with the use of popular learning machines including Nearest Mean Scaled Classifier, Support Vector Machine, Naive Bayes Classifier and Random Forest, Recursive Feature Addition outperformed other methods. Our studies also showed that Lagging Prediction Peephole Optimization is superior to random strategy; Recursive Feature Addition with Lagging Prediction Peephole Optimization obtained better testing accuracies than the gene selection method varSelRF.

  4. Building a model for disease classification integration in oncology, an approach based on the national cancer institute thesaurus.

    Science.gov (United States)

    Jouhet, Vianney; Mougin, Fleur; Bréchat, Bérénice; Thiessard, Frantz

    2017-02-07

    Identifying incident cancer cases within a population remains essential for scientific research in oncology. Data produced within electronic health records can be useful for this purpose. Due to the multiplicity of providers, heterogeneous terminologies such as ICD-10 and ICD-O-3 are used for recording oncology diagnoses. To enable disease identification based on these diagnoses, there is a need for integrating disease classifications in oncology. Our aim was to build a model integrating concepts involved in two disease classifications, namely ICD-10 (diagnosis) and ICD-O-3 (topography and morphology), despite their structural heterogeneity. Based on the NCIt, a "derivative" model for linking diagnosis and topography-morphology combinations was defined and built. ICD-O-3 and ICD-10 codes were then used to instantiate classes of the "derivative" model. Links between terminologies obtained through the model were then compared to mappings provided by the Surveillance, Epidemiology, and End Results (SEER) program. The model integrated 42% of neoplasm ICD-10 codes (excluding metastasis), 98% of ICD-O-3 morphology codes (excluding metastasis) and 68% of ICD-O-3 topography codes. For every code instantiating at least one class in the "derivative" model, comparison with SEER mappings revealed that all mappings were actually available in the model as a link between the corresponding codes. We have proposed a method to automatically build a model for integrating ICD-10 and ICD-O-3 based on the NCIt. The resulting "derivative" model is a machine-understandable resource that enables an integrated view of these heterogeneous terminologies. The NCIt structure and the available relationships can help to bridge disease classifications taking into account their structural and granular heterogeneities. However, (i) inconsistencies exist within the NCIt leading to misclassifications in the "derivative" model, (ii) the "derivative" model only integrates a part of ICD-10 and ICD

  5. Accurate molecular classification of cancer using simple rules

    Directory of Open Access Journals (Sweden)

    Gotoh Osamu

    2009-10-01

    Full Text Available Abstract Background One intractable problem with using microarray data analysis for cancer classification is how to reduce the extremely high-dimensional gene feature data to remove the effects of noise. Feature selection is often used to address this problem by selecting informative genes from among thousands or tens of thousands of genes. However, most of the existing methods of microarray-based cancer classification utilize too many genes to achieve accurate classification, which often hampers the interpretability of the models. For a better understanding of the classification results, it is desirable to develop simpler rule-based models with as few marker genes as possible. Methods We screened a small number of informative single genes and gene pairs on the basis of their dependency degrees as defined in rough set theory. Applying the decision rules induced by the selected genes or gene pairs, we constructed cancer classifiers. We tested the efficacy of the classifiers by leave-one-out cross-validation (LOOCV) of training sets and classification of independent test sets. Results We applied our methods to five cancerous gene expression datasets: leukemia (acute lymphoblastic leukemia [ALL] vs. acute myeloid leukemia [AML]), lung cancer, prostate cancer, breast cancer, and leukemia (ALL vs. mixed-lineage leukemia [MLL] vs. AML). Accurate classification outcomes were obtained by utilizing just one or two genes. Some genes that correlated closely with the pathogenesis of relevant cancers were identified. In terms of both classification performance and algorithm simplicity, our approach outperformed or at least matched existing methods. Conclusion In cancerous gene expression datasets, a small number of genes, even one or two if selected correctly, can achieve an ideal cancer classification effect. This finding also means that very simple rules may perform well for cancerous class prediction.
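
    The gene screening criterion above can be sketched with the standard rough-set dependency degree: the fraction of samples whose (discretized) gene value determines the class label without ambiguity. This is a minimal illustration, assuming expression values have already been discretized; the function name is hypothetical and the authors' exact discretization and rule induction are not given in the abstract.

```python
from collections import defaultdict


def dependency_degree(condition_values, labels):
    """Fraction of samples lying in the positive region of the labels.

    condition_values: one discretized value per sample (use a tuple per
    sample to score a gene pair instead of a single gene).
    labels: the class label of each sample.
    """
    labels_seen = defaultdict(set)   # condition value -> labels observed
    sample_count = defaultdict(int)  # condition value -> number of samples
    for value, label in zip(condition_values, labels):
        labels_seen[value].add(label)
        sample_count[value] += 1
    # A value is consistent when every sample carrying it has the same label.
    consistent = sum(sample_count[v] for v, seen in labels_seen.items()
                     if len(seen) == 1)
    return consistent / len(labels)
```

    A gene (or gene pair) with a dependency degree close to 1 separates the classes almost perfectly, which is why one or two such genes can suffice for simple decision rules.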

  6. Biomarker identification and cancer classification based on microarray data using Laplace naive Bayes model with mean shrinkage.

    Science.gov (United States)

    Wu, Meng-Yun; Dai, Dao-Qing; Shi, Yu; Yan, Hong; Zhang, Xiao-Fei

    2012-01-01

    Biomarker identification and cancer classification are two closely related problems. In gene expression data sets, the correlation between genes can be high when they share the same biological pathway. Moreover, the gene expression data sets may contain outliers due to either chemical or electrical reasons. A good gene selection method should take group effects into account and be robust to outliers. In this paper, we propose a Laplace naive Bayes model with mean shrinkage (LNB-MS). The Laplace distribution, instead of the normal distribution, is used as the conditional distribution of the samples because it is less sensitive to outliers and has been applied in many fields. The key technique is the L1 penalty imposed on the mean of each class to achieve automatic feature selection. The objective function of the proposed model is a piecewise linear function with respect to the mean of each class, so its optimal value can be found simply by evaluating it at the breakpoints. An efficient algorithm is designed to estimate the parameters in the model. A new strategy that uses the number of selected features to control the regularization parameter is introduced. Experimental results on simulated data sets and 17 publicly available cancer data sets attest to the accuracy, sparsity, efficiency, and robustness of the proposed algorithm. Many biomarkers identified with our method have been verified in biochemical or biomedical research. The analysis of biological and functional correlation of the genes based on Gene Ontology (GO) terms shows that the proposed method guarantees the selection of highly correlated genes simultaneously
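
    The breakpoint argument in this abstract can be illustrated with a simplified one-feature, one-class sketch. It assumes the objective is the sum of absolute deviations from the mean plus an L1 penalty that shrinks the mean toward zero (that is, the data are centered); the full LNB-MS objective, including scale parameters, is not given in the abstract, and the function name is hypothetical.

```python
def shrunken_laplace_mean(samples, penalty):
    """Minimize sum(|x - mu|) + penalty * |mu| over mu.

    The objective is convex and piecewise linear in mu, so a minimizer
    lies at one of its breakpoints: the sample values and zero.
    """
    def objective(mu):
        return sum(abs(x - mu) for x in samples) + penalty * abs(mu)

    breakpoints = sorted(set(samples) | {0.0})
    return min(breakpoints, key=objective)
```

    With no penalty the minimizer is a sample median; with a large enough penalty the mean is shrunk exactly to zero, which is what removes an uninformative feature from the model.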

  7. Is cancer a disease that can be cured? An answer based on a new classification of diseases

    CERN Document Server

    Richmond, Peter

    2016-01-01

    Is cancer a disease that can be cured or a degenerative disease which comes predominantly with old age? We give an answer based on a two-dimensional representation of diseases. These two dimensions are defined as follows. In mortality curves there is an age, namely a_c = 10 years, which plays a crucial role in the sense that the mortality rate decreases in the interval I1=(a<a_c) and increases in the interval I2=(a>a_c). The respective trends in I1 and I2 are the two parameters used in our classification of diseases. Within the framework of reliability analysis, I1 and I2 would be referred to as the "burn-in" and "wear-out" phases. This leads us to define three broad groups of diseases. (AS1) Asymmetry with prevalence of I1. (AS2) Asymmetry with prevalence of I2. (S) Symmetry, with I1 and I2 both playing roles of comparable importance. Not surprisingly, among AS1-cases one finds all diseases due to congenital malformations. In the AS2-class one finds degenerative diseases, e.g. Alzheimer's disease. Among S-cases one finds most diseases due to external p...

  8. A Novel Classification Method for Prediction of Rectal Bleeding in Prostate Cancer Radiotherapy Based on a Semi-Nonnegative ICA of 3D Planned Dose Distributions.

    Science.gov (United States)

    Coloigner, Julie; Fargeas, Auréline; Kachenoura, Amar; Wang, Lu; Dréan, Gaël; Lafond, Caroline; Senhadji, Lotfi; de Crevoisier, Renaud; Acosta, Oscar; Albera, Laurent

    2015-05-01

    The understanding of dose/side-effects relationships in prostate cancer radiotherapy is crucial to define appropriate individual's constraints for the therapy planning. Most of the existing methods to predict side-effects do not fully exploit the rich spatial information conveyed by the three-dimensional planned dose distributions. We propose a new classification method for three-dimensional individuals' doses, based on a new semi-nonnegative ICA algorithm to identify patients at risk of presenting rectal bleeding from a population treated for prostate cancer. The method first determines two bases of vectors from the population data: the two bases span vector subspaces, which characterize patients with and without rectal bleeding, respectively. The classification is then achieved by calculating the distance of a given patient to the two subspaces. The results, obtained on a cohort of 87 patients (at two year follow-up) treated with radiotherapy, showed high performance in terms of sensitivity and specificity.
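
    The final classification step described above, assigning a patient to the closer of two subspaces, can be sketched as follows. This is an illustration under stated assumptions: dose distributions are flattened to plain vectors, the distance is the Euclidean norm of the residual after orthogonal projection, and the bases are simply given as lists of vectors. The abstract does not specify the distance used, and the semi-nonnegative ICA step that produces the bases is not shown; all function names are hypothetical.

```python
import math


def orthonormalize(vectors):
    """Modified Gram-Schmidt; drops vectors that are (nearly) dependent."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            proj = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - proj * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm > 1e-12:
            basis.append([wi / norm for wi in w])
    return basis


def distance_to_subspace(x, vectors):
    """Euclidean distance from x to the span of the given vectors."""
    residual = list(x)
    for b in orthonormalize(vectors):
        proj = sum(ri * bi for ri, bi in zip(residual, b))
        residual = [ri - proj * bi for ri, bi in zip(residual, b)]
    return math.sqrt(sum(ri * ri for ri in residual))


def classify(x, bleeding_basis, no_bleeding_basis):
    """Label x by whichever subspace it lies closer to."""
    d_bleed = distance_to_subspace(x, bleeding_basis)
    d_none = distance_to_subspace(x, no_bleeding_basis)
    return "bleeding" if d_bleed < d_none else "no bleeding"
```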

  9. [Urine-based tumour diagnostics for bladder cancer: effects of the new histopathological classification--food for thought].

    Science.gov (United States)

    Knüchel, R; Lindemann-Docter, K

    2009-06-01

    The new WHO classification of bladder cancer was published in 2004 and consequently cannot be regarded as very recent. However, it is still timely since it picks up considerations affecting other schemes of tumour classification as well. Genetic results are included in the context of morphology, and at the same time a high inter- and intra-observer agreement is striven for as a matter of high quality patient care. The WHO classification of 2004 does not include cytological diagnosis. Thinking about and considering tumour tissue diagnosis, the style of cytological diagnoses is also affected. For tissue diagnoses, low- and high-grade tumours are differentiated from benign lesions including reactive changes. The element of this classification which has to be transferred to cytology is especially the unequivocal diagnosis of high-grade lesions. The low-grade lesion, correlating with tissue of well-differentiated papillary tumours and dysplasias, mostly cannot be distinguished cytologically with certainty from a broad spectrum of non-malignant lesions (papillomas, reactive urothelial detachment in urolithiasis patients, cytology specimen from vigorously irrigated bladders). For the latter group our aim should be to establish an additional diagnostic tool of high quality driven by clinical questions (e.g. potential of tumour progression).

  10. Cluster Based Text Classification Model

    DEFF Research Database (Denmark)

    Nizamani, Sarwat; Memon, Nasrullah; Wiil, Uffe Kock

    2011-01-01

    We propose a cluster based classification model for suspicious email detection and other text classification tasks. The text classification tasks comprise many training examples that require a complex classification model. Using clusters for classification makes the model simpler and increases the accuracy at the same time. The test example is classified using a simpler and smaller model. The training examples in a particular cluster share a common vocabulary. At the time of clustering, we do not take into account the labels of the training examples. After the clusters have been created, the classifier is trained on each cluster having reduced dimensionality and fewer examples. The experimental results show that the proposed model outperforms the existing classification models for the task of suspicious email detection and topic categorization on the Reuters-21578 and 20 Newsgroups...

  11. Dissecting cancer heterogeneity--an unsupervised classification approach

    NARCIS (Netherlands)

    Wang, Xin; Markowetz, Florian; de Sousa E Melo, Felipe; Medema, Jan Paul; Vermeulen, Louis

    2013-01-01

    Gene-expression-based classification studies have changed the way cancer is traditionally perceived. It is becoming increasingly clear that many cancer types are in fact not single diseases but rather consist of multiple molecular distinct subtypes. In this review, we discuss unsupervised

  12. Plastic surgery for breast cancer: essentials, classification, performance algorithm

    Directory of Open Access Journals (Sweden)

    A. Kh. Ismagilov

    2014-01-01

    Full Text Available The choice of plastic surgical techniques for cancer is influenced by two factors: resection volume/baseline breast volume ratio and tumor site. Based on these factors, the authors propose a two-level classification and an algorithm for performing the optimal plastic operation on the breast for breast cancer.

  13. A New Classification Approach Based on Multiple Classification Rules

    OpenAIRE

    Zhongmei Zhou

    2014-01-01

    A good classifier can correctly predict new data for which the class label is unknown, so it is important to construct a high accuracy classifier. Hence, classification techniques are very useful in ubiquitous computing. Associative classification achieves higher classification accuracy than some traditional rule-based classification approaches. However, the approach also has two major deficiencies. First, it generates a very large number of association classification rules, especially when t...

  14. CLASSIFICATION OF SEVERAL SKIN CANCER TYPES BASED ON AUTOFLUORESCENCE INTENSITY OF VISIBLE LIGHT TO NEAR INFRARED RATIO

    Directory of Open Access Journals (Sweden)

    Aryo Tedjo

    2009-12-01

    Full Text Available Skin cancer is a malignant growth on the skin caused by many factors. The most common skin cancers are Basal Cell Cancer (BCC) and Squamous Cell Cancer (SCC). This research uses a discriminant analysis to classify some tissues of skin cancer based on a number of independent variables. The independent variables are variations of excitation light sources (LED lamps), filters, and sensors used to measure the Autofluorescence Intensity (IAF) of the visible light to near infrared (VIS/NIR) ratio of paraffin embedded tissue biopsies from BCC, SCC, and Lipoma. From the result of the discriminant analysis, it is known that the discriminant function is determined by 4 (four) independent variables, i.e., Blue LED-Red Filter, Blue LED-Yellow Filter, UV LED-Blue Filter, and UV LED-Yellow Filter. The accuracy of the discriminant analysis in classifying the three skin cancer tissues is 100%.

  15. Cancer classification: Mutual information, target network and strategies of therapy.

    Science.gov (United States)

    Hsu, Wen-Chin; Liu, Chan-Cheng; Chang, Fu; Chen, Su-Shing

    2012-10-02

    Cancer therapy is a challenging research area because side effects often occur in chemo and radiation therapy. We intend to study a multi-targets and multi-components design that will provide synergistic results to improve efficiency of cancer therapy. We have developed a general methodology, AMFES (Adaptive Multiple FEature Selection), for ranking and selecting important cancer biomarkers based on SVM (Support Vector Machine) classification. In particular, we exemplify this method by three datasets: a prostate cancer (three stages), a breast cancer (four subtypes), and another prostate cancer (normal vs. cancerous). Moreover, we have computed the target networks of these biomarkers as the signatures of the cancers with additional information (mutual information between biomarkers of the network). Then, we proposed a robust framework for a synergistic therapy design approach which includes various existing mechanisms. These methodologies were applied to three GEO datasets: GSE18655 (three prostate stages), GSE19536 (4 subtypes breast cancers) and GSE21036 (prostate cancer cells and normal cells). We selected 96 biomarkers for the first prostate cancer dataset (three prostate stages), 72 for breast cancer (luminal A vs. luminal B), 68 for breast cancer (basal-like vs. normal-like), and 22 for another prostate cancer (cancerous vs. normal). In addition, we obtained statistically significant results of mutual information, which demonstrate that the dependencies among these biomarkers can be positive or negative. We proposed an efficient feature ranking and selection scheme, AMFES, to select an important subset from a large number of features for any cancer dataset. Thus, we obtained the signatures of these cancers by building their target networks. Finally, we proposed a robust framework of synergistic therapy for cancer patients. Our framework is not only supported by real GEO datasets but also aims at a multi-targets/multi-components drug design tool, which improves
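
    The mutual information between two biomarkers mentioned in this abstract can be illustrated with the standard Shannon definition on discretized expression values. This is only a sketch: the abstract does not say how expression levels were discretized or how a positive or negative sign was attached to a dependency, and the function name is hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Shannon mutual information (in bits) of two discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) / (p(x) * p(y)) simplifies to c * n / (count_x * count_y).
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi
```

    Two biomarkers whose discretized levels always move together carry maximal mutual information, while statistically independent ones score zero; this quantity is by definition non-negative, so any sign would have to come from an additional measure.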

  16. Magnetic resonance imaging texture analysis classification of primary breast cancer

    Energy Technology Data Exchange (ETDEWEB)

    Waugh, S.A.; Lerski, R.A. [Ninewells Hospital and Medical School, Department of Medical Physics, Dundee (United Kingdom); Purdie, C.A.; Jordan, L.B. [Ninewells Hospital and Medical School, Department of Pathology, Dundee (United Kingdom); Vinnicombe, S. [University of Dundee, Division of Imaging and Technology, Ninewells Hospital and Medical School, Dundee (United Kingdom); Martin, P. [Ninewells Hospital and Medical School, Department of Clinical Radiology, Dundee (United Kingdom); Thompson, A.M. [University of Texas MD Anderson Cancer Center, Department of Surgical Oncology, Houston, TX (United States)

    2016-02-15

    Patient-tailored treatments for breast cancer are based on histological and immunohistochemical (IHC) subtypes. Magnetic Resonance Imaging (MRI) texture analysis (TA) may be useful in non-invasive lesion subtype classification. Women with newly diagnosed primary breast cancer underwent pre-treatment dynamic contrast-enhanced breast MRI. TA was performed using co-occurrence matrix (COM) features, by creating a model on retrospective training data, then prospectively applying to a test set. Analyses were blinded to breast pathology. Subtype classifications were performed using a cross-validated k-nearest-neighbour (k = 3) technique, with accuracy relative to pathology assessed and receiver operator curve (AUROC) calculated. Mann-Whitney U and Kruskal-Wallis tests were used to assess raw entropy feature values. Histological subtype classifications were similar across training (n = 148 cancers) and test sets (n = 73 lesions) using all COM features (training: 75 %, AUROC = 0.816; test: 72.5 %, AUROC = 0.823). Entropy features were significantly different between lobular and ductal cancers (p < 0.001; Mann-Whitney U). IHC classifications using COM features were also similar for training and test data (training: 57.2 %, AUROC = 0.754; test: 57.0 %, AUROC = 0.750). Hormone receptor positive and negative cancers demonstrated significantly different entropy features. Entropy features alone were unable to create a robust classification model. Textural differences on contrast-enhanced MR images may reflect underlying lesion subtypes, which merits testing against treatment response. (orig.)
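
    The entropy feature of a co-occurrence matrix (COM) referred to above can be sketched for a small grey-level image. This illustration assumes a single pixel offset (one step to the right by default) and an image stored as a list of lists of quantized grey levels; the offsets, quantization, and software used in the study are not given in the abstract, and the function name is hypothetical.

```python
import math
from collections import Counter


def cooccurrence_entropy(image, dr=0, dc=1):
    """Entropy (bits) of the grey-level co-occurrence distribution.

    Counts pairs (value at pixel, value at pixel shifted by (dr, dc)),
    normalizes the counts to probabilities, and returns -sum(p * log2 p).
    """
    pairs = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                pairs[(image[r][c], image[r2][c2])] += 1
    total = sum(pairs.values())
    return -sum((n / total) * math.log2(n / total) for n in pairs.values())
```

    A homogeneous region yields an entropy of zero, and the value grows as the grey-level pairs become more varied, which is why entropy is read as a measure of lesion heterogeneity.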

  17. Reverse phase protein array based tumor profiling identifies a biomarker signature for risk classification of hormone receptor-positive breast cancer

    Directory of Open Access Journals (Sweden)

    Johanna Sonntag

    2014-03-01

    Full Text Available A robust subclassification of luminal breast cancer, the most common molecular subtype of human breast cancer, is crucial for therapy decisions. While a part of patients is at higher risk of recurrence and requires chemo-endocrine treatment, the other part is at lower risk and also poorly responds to chemotherapeutic regimens. To approximate the risk of cancer recurrence, clinical guidelines recommend determining histologic grading and abundance of a cell proliferation marker in tumor specimens. However, this approach assigns an intermediate risk to a substantial number of patients and in addition suffers from a high interobserver variability. Therefore, the aim of our study was to identify a quantitative protein biomarker signature to facilitate risk classification. Reverse phase protein arrays (RPPA) were used to obtain quantitative expression data for 128 breast cancer relevant proteins in a set of hormone receptor-positive tumors (n = 109). Proteomic data for the subset of histologic G1 (n = 14) and G3 (n = 22) samples were used for biomarker discovery serving as surrogates of low and high recurrence risk, respectively. A novel biomarker selection workflow based on combining three different classification methods identified caveolin-1, NDKA, RPS6, and Ki-67 as top candidates. NDKA, RPS6, and Ki-67 were expressed at elevated levels in high risk tumors whereas caveolin-1 was observed as downregulated. The identified biomarker signature was subsequently analyzed using an independent test set (AUC = 0.78). Further evaluation of the identified biomarker panel by Western blot and mRNA profiling confirmed the proteomic signature obtained by RPPA. In conclusion, the biomarker signature introduced supports RPPA as a tool for cancer biomarker discovery.

  18. Pathohistological classification systems in gastric cancer: diagnostic relevance and prognostic value.

    Science.gov (United States)

    Berlth, Felix; Bollschweiler, Elfriede; Drebber, Uta; Hoelscher, Arnulf H; Moenig, Stefan

    2014-05-21

    Several pathohistological classification systems exist for the diagnosis of gastric cancer. Many studies have investigated the correlation between the pathohistological characteristics in gastric cancer and patient characteristics, disease specific criteria and overall outcome. It is still controversial as to which classification system imparts the most reliable information, and therefore, the choice of system may vary in clinical routine. In addition to the most common classification systems, such as the Laurén and the World Health Organization (WHO) classifications, other authors have tried to characterize and classify gastric cancer based on the microscopic morphology and in reference to the clinical outcome of the patients. In more than 50 years of systematic classification of the pathohistological characteristics of gastric cancer, there is no sole classification system that is consistently used worldwide in diagnostics and research. However, several national guidelines for the treatment of gastric cancer refer to the Laurén or the WHO classifications regarding therapeutic decision-making, which underlines the importance of a reliable classification system for gastric cancer. The latest results from gastric cancer studies indicate that it might be useful to integrate DNA- and RNA-based features of gastric cancer into the classification systems to establish prognostic relevance. This article reviews the diagnostic relevance and the prognostic value of different pathohistological classification systems in gastric cancer.

  19. Topical interferon alfa-2b for management of ocular surface squamous neoplasia in 23 cases: outcomes based on American Joint Committee on Cancer classification.

    Science.gov (United States)

    Shah, Sanket U; Kaliki, Swathi; Kim, H Jane; Lally, Sara E; Shields, Jerry A; Shields, Carol L

    2012-02-01

    To evaluate the efficacy of topical interferon alfa-2b in the management of ocular surface squamous neoplasia (OSSN). Clinically visible OSSN in 20 patients (23 tumors) was managed with topical interferon alfa-2b, 1 million IU/mL, 4 times daily. Tumor control and complications were evaluated according to American Joint Committee on Cancer classification. Complete tumor resolution was achieved in 19 tumors (83%) following topical interferon alfa-2b treatment for a median period of 6 months (mean, 7 months; range, 1-12 months) and maintained for up to 24 months of follow-up. Of the 4 tumors with partial resolution (17%), tumor surface area was reduced 44% (median) during 4 months (median) without further response and alternative therapy was used. Based on American Joint Committee on Cancer classification, complete control was achieved in 2 of 3 Tis (67%), 17 of 20 T3 (85%), 19 of 23 N0 (83%), and 19 of 23 M0 (83%) category tumors. Tumors involving the cornea responded earlier compared with those without corneal involvement (P = .01). Initial tumor size did not correlate with time to response (P = .27). Recurrence was noted in 1 case (Tis, 4%) at 3 months. Adverse effects included conjunctival hyperemia (2 [10%]), follicular hypertrophy (2 [10%]), giant papillary conjunctivitis (1 [5%]), irritation (1 [5%]), corneal epithelial defect (1 [5%]), and flulike symptoms (1 [5%]); all resolved within 1 month of medication discontinuation. According to American Joint Committee on Cancer classification, complete control with topical interferon alfa-2b can be achieved in 67% of Tis, 85% of T3, and 83% of all OSSN.

  20. Biogeography based Satellite Image Classification

    OpenAIRE

    Harish Kundra; Parminder Singh; Navdeep Kaur; V.K. Panchal

    2009-01-01

    Biogeography is the study of the geographical distribution of biological organisms. The mindset of the engineer is that we can learn from nature. Biogeography Based Optimization is a burgeoning nature inspired technique to find the optimal solution of the problem. Satellite image classification is an important task because it is the only way we can know about the land cover map of inaccessible areas. Though satellite images have been classified in past by using various techniques, the researc...

  1. Magnetic resonance imaging texture analysis classification of primary breast cancer.

    Science.gov (United States)

    Waugh, S A; Purdie, C A; Jordan, L B; Vinnicombe, S; Lerski, R A; Martin, P; Thompson, A M

    2016-02-01

    Patient-tailored treatments for breast cancer are based on histological and immunohistochemical (IHC) subtypes. Magnetic Resonance Imaging (MRI) texture analysis (TA) may be useful in non-invasive lesion subtype classification. Women with newly diagnosed primary breast cancer underwent pre-treatment dynamic contrast-enhanced breast MRI. TA was performed using co-occurrence matrix (COM) features, by creating a model on retrospective training data, then prospectively applying to a test set. Analyses were blinded to breast pathology. Subtype classifications were performed using a cross-validated k-nearest-neighbour (k = 3) technique, with accuracy relative to pathology assessed and receiver operator curve (AUROC) calculated. Mann-Whitney U and Kruskal-Wallis tests were used to assess raw entropy feature values. Histological subtype classifications were similar across training (n = 148 cancers) and test sets (n = 73 lesions) using all COM features (training: 75%, AUROC = 0.816; test: 72.5%, AUROC = 0.823). Entropy features were significantly different between lobular and ductal cancers (p < 0.001; Mann-Whitney U). IHC classifications using COM features were also similar for training and test data (training: 57.2%, AUROC = 0.754; test: 57.0%, AUROC = 0.750). Hormone receptor positive and negative cancers demonstrated significantly different entropy features. Entropy features alone were unable to create a robust classification model. Textural differences on contrast-enhanced MR images may reflect underlying lesion subtypes, which merits testing against treatment response. • MR-derived entropy features, representing heterogeneity, provide important information on tissue composition. • Entropy features can differentiate between histological and immunohistochemical subtypes of breast cancer. • Differing entropy features between breast cancer subtypes implies differences in lesion heterogeneity. • Texture analysis of breast cancer potentially provides added information for decision making.

  2. Association between gastric cancer and the Kyoto classification of gastritis.

    Science.gov (United States)

    Shichijo, Satoki; Hirata, Yoshihiro; Niikura, Ryota; Hayakawa, Yoku; Yamada, Atsuo; Koike, Kazuhiko

    2017-09-01

    Histological gastritis is associated with gastric cancer, but its diagnosis requires biopsy. Many classifications of endoscopic gastritis are available, but not all are useful for risk stratification of gastric cancer. The Kyoto Classification of Gastritis was proposed at the 85th Congress of the Japan Gastroenterological Endoscopy Society. This cross-sectional study evaluated the usefulness of the Kyoto Classification of Gastritis for risk stratification of gastric cancer. From August 2013 to September 2014, esophagogastroduodenoscopy was performed and the gastric findings evaluated according to the Kyoto Classification of Gastritis in a total of 4062 patients. The following five endoscopic findings were selected based on previous reports: atrophy, intestinal metaplasia, enlarged folds, nodularity, and diffuse redness. A total of 3392 patients (1746 [51%] men and 1646 [49%] women) were analyzed. Among them, 107 gastric cancers were diagnosed. Atrophy was found in 2585 (78%) and intestinal metaplasia in 924 (27%). Enlarged folds, nodularity, and diffuse redness were found in 197 (5.8%), 22 (0.6%), and 573 (17%), respectively. In univariate analyses, the severity of atrophy, intestinal metaplasia, diffuse redness, age, and male sex were associated with gastric cancer. In a multivariate analysis, atrophy and male sex were found to be independent risk factors. Younger age and severe atrophy were determined to be associated with diffuse-type gastric cancer. Endoscopic detection of atrophy was associated with the risk of gastric cancer. Thus, patients with severe atrophy should be examined carefully and may require intensive follow-up. © 2017 Journal of Gastroenterology and Hepatology Foundation and John Wiley & Sons Australia, Ltd.

  3. Classification of FTIR cancer data using wavelets and BPNN

    Science.gov (United States)

    Cheng, Cungui; Tian, Yumei; Zhang, Changjiang

    2007-11-01

    In this paper, a feature extraction method based on wavelets for horizontal attenuated total reflectance Fourier transform infrared spectroscopy (HATR-FTIR) cancer data analysis and classification using an artificial neural network trained with the back-propagation algorithm is presented. 168 spectra were collected from 84 pairs of fresh normal and abnormal lung tissue samples. After preprocessing, 12 features were extracted with continuous wavelet analysis. Based on BPNN classification, all spectra were classified into two categories: normal or abnormal. The accuracies of identifying normal, early carcinoma, and advanced carcinoma were 100%, 90% and 100% respectively. This result indicated that FTIR with continuous wavelet transform (CWT) and the back-propagation neural network (BPNN) could effectively and easily diagnose lung cancer in its early stages.

  4. MORPHOLOGICAL CLASSIFICATION OF RENAL-CANCER

    NARCIS (Netherlands)

    STORKEL, S; VANDENBERG, E

    The current classification of renal-cell adenomas (RCAs) and carcinomas (RCCs) is based on eight basic cell and tumor types (entities) with characteristic morphologic features: (1) RCCs of clear-cell type, (2) RCAs/RCCs of chromophilic-cell type, (3) RCAs/RCCs of chromophobic-cell type, (4) RCCs of

  5. Computer aided decision support system for cervical cancer classification

    Science.gov (United States)

    Rahmadwati, Rahmadwati; Naghdy, Golshah; Ros, Montserrat; Todd, Catherine

    2012-10-01

    Conventional analysis of a cervical histology image, such as a Pap smear or a biopsy sample, is performed manually by an expert pathologist. This involves inspecting the sample for cellular-level abnormalities and determining the spread of the abnormalities. Cancer is graded based on the spread of the abnormal cells. This is a tedious, subjective and time-consuming process with considerable variation in diagnosis between experts. This paper presents a computer aided decision support system (CADSS) tool to help pathologists in their examination of cervical cancer biopsies. The main aim of the proposed CADSS is to identify abnormalities and quantify cancer grading in a systematic and repeatable manner. The paper proposes three different methods and compares their results using 475 images of cervical biopsies, which include normal cases, three stages of pre-cancer, and malignant cases. This paper explores the various components of an effective CADSS: image acquisition, pre-processing, segmentation, feature extraction, classification, grading and disease identification. Cervical histological images are captured using a digital microscope. The images are captured at sufficient resolution to retain enough information for effective classification. Histology images of cervical biopsies consist of three major sections: background, stroma and squamous epithelium. Most diagnostic information is contained within the epithelium region. This paper presents two levels of segmentation: global (macro) and local (micro). At the global level the squamous epithelium is separated from the background and stroma. At the local or cellular level, the nuclei and cytoplasm are segmented for further analysis. Image features that influence the pathologists' decision during the analysis and classification of a cervical biopsy are the nuclei's shape and spread; the ratio of the areas of nuclei and cytoplasm as well as the texture and spread of the abnormalities

  6. Classification of breast cancer cytological specimen using convolutional neural network

    Science.gov (United States)

    Żejmo, Michał; Kowal, Marek; Korbicz, Józef; Monczak, Roman

    2017-01-01

    The paper presents a deep learning approach for automatic classification of breast tumors based on fine needle cytology. The main aim of the system is to distinguish benign from malignant cases based on microscopic images. The experiment was carried out on cytological samples derived from 50 patients (25 benign cases + 25 malignant cases) diagnosed at the Regional Hospital in Zielona Góra. To classify microscopic images, we used convolutional neural networks (CNN) of two types: GoogLeNet and AlexNet. Due to the very large size of the images of cytological specimens (on average 200000 × 100000 pixels), they were divided into smaller patches of size 256 × 256 pixels. Breast cancer classification is usually based on morphometric features of nuclei. Therefore, training and validation patches were selected using a Support Vector Machine (SVM) so that a suitable amount of cell material was depicted. Neural classifiers were tuned using a GPU-accelerated implementation of the gradient descent algorithm. Training error was defined as a cross-entropy classification loss. Classification accuracy was defined as the percentage ratio of successfully classified validation patches to the total number of validation patches. The best accuracy rate of 83% was obtained by the GoogLeNet model. We observed that more misclassified patches belong to malignant cases.
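
    The patch division and the accuracy definition described above can be sketched in a few lines. This is a minimal illustration only: the abstract does not say whether patches overlap or how image borders are handled, so non-overlapping patches with incomplete border patches discarded are assumptions, and the function names are hypothetical.

```python
def patch_origins(width, height, patch=256):
    """Yield the top-left (x, y) corner of every full, non-overlapping patch.

    Assumption: patches do not overlap and partial border patches are dropped.
    """
    for y in range(0, height - patch + 1, patch):
        for x in range(0, width - patch + 1, patch):
            yield (x, y)


def accuracy_percent(predicted, actual):
    """Percentage of validation patches whose predicted label matches the true label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


# Toy usage: a 1000 x 600 pixel image yields a 3 x 2 grid of full patches.
origins = list(patch_origins(1000, 600))
toy_accuracy = accuracy_percent(["m", "b", "m", "b"], ["m", "b", "b", "b"])
```

    In the toy call, three of four hypothetical patch labels agree, so the accuracy is 75%.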

  7. Cluster-based adaptive metric classification

    NARCIS (Netherlands)

    Giotis, Ioannis; Petkov, Nicolai

    2012-01-01

    Introducing adaptive metric has been shown to improve the results of distance-based classification algorithms. Existing methods are often computationally intensive, either in the training or in the classification phase. We present a novel algorithm that we call Cluster-Based Adaptive Metric (CLAM)

  8. A new algorithm for integrated analysis of miRNA-mRNA interactions based on individual classification reveals insights into bladder cancer.

    Science.gov (United States)

    Hecker, Nikolai; Stephan, Carsten; Mollenkopf, Hans-Joachim; Jung, Klaus; Preissner, Robert; Meyer, Hellmuth-A

    2013-01-01

    MicroRNAs (miRNAs) are small non-coding RNAs that regulate gene expression. It has been proposed that miRNAs play an important role in cancer development and progression. Their ability to affect multiple gene pathways by targeting various mRNAs makes them an interesting class of regulators. We have developed an algorithm, Classification based Analysis of Paired Expression data of RNA (CAPE RNA), which is capable of identifying altered miRNA-mRNA regulation between tissue samples and which assigns interaction states to each sample without preexisting stratification of groups. The distribution of the assigned interaction states compared to given experimental groups is used to assess the quality of a predicted interaction. We demonstrate the applicability of our approach by analyzing urothelial carcinoma and normal bladder tissue samples derived from 24 patients. Using our approach, normal and tumor tissue samples as well as different stages of tumor progression were successfully stratified. Also, our results suggest interesting differentially regulated miRNA-mRNA interactions associated with bladder tumor progression. The need for tools that allow an integrative analysis of microRNA and mRNA expression data has been addressed. With this study, we provide an algorithm that emphasizes the distribution of samples to rank differentially regulated miRNA-mRNA interactions. This is a new point of view compared to current approaches. In a bootstrapping analysis, our ranking yields features that build strong classifiers. Further analysis reveals genes identified as differentially regulated by miRNAs to be enriched in cancer pathways, thus suggesting biologically interesting interactions.

  9. [Molecular Classification of Colorectal Cancers and Clinical Application].

    Science.gov (United States)

    Jeon, So Yeon; Kim, Won Kyu; Kim, Hoguen

    2016-12-25

    The molecular genetics of colorectal cancers (CRCs) is among the best understood of common human cancers. It is difficult to predict the prognosis and/or the chemoresponse of CRC patients. At present, prognosis is based predominantly on the tumor stage and pathological examination of the disease. Molecular classification of CRCs, based on genomics and transcriptomics, has proposed that CRCs can be classified into at least three to six subtypes, depending on the gene expression pattern, and groups of marker genes representing each subtype have also been reported. Gene expression-based subtyping is now widely accepted as a relevant source of disease stratification. We reviewed previous studies on CRC subtyping; an international consortium dedicated to large-scale data sharing and analytics recently established four consensus molecular subtypes with distinguishing features. Predictive markers identified in these studies are under investigation, and large-scale clinical evaluations of molecular markers are currently in progress.

  10. Molecular classification of gastric cancer: a new paradigm.

    Science.gov (United States)

    Shah, Manish A; Khanin, Raya; Tang, Laura; Janjigian, Yelena Y; Klimstra, David S; Gerdes, Hans; Kelsen, David P

    2011-05-01

    Gastric cancer may be subdivided into 3 distinct subtypes--proximal, diffuse, and distal gastric cancer--based on histopathologic and anatomic criteria. Each subtype is associated with unique epidemiology. Our aim is to test the hypothesis that these distinct gastric cancer subtypes may also be distinguished by gene expression analysis. Patients with localized gastric adenocarcinoma being screened for a phase II preoperative clinical trial (National Cancer Institute, NCI #5917) underwent endoscopic biopsy for fresh tumor procurement. Four to 6 targeted biopsies of the primary tumor were obtained. Macrodissection was carried out to ensure more than 80% carcinoma in the sample. HG-U133A GeneChip (Affymetrix) was used for cDNA expression analysis, and all arrays were processed and analyzed using the Bioconductor R-package. Between November 2003 and January 2006, 57 patients were screened to identify 36 patients with localized gastric cancer who had adequate RNA for expression analysis. Using supervised analysis, we built a classifier to distinguish the 3 gastric cancer subtypes, successfully classifying each into tightly grouped clusters. Leave-one-out cross-validation error was 0.14, suggesting that more than 85% of samples were classified correctly. Gene set analysis with the false discovery rate set at 0.25 identified several pathways that were differentially regulated when comparing each gastric cancer subtype to adjacent normal stomach. Subtypes of gastric cancer that have epidemiologic and histologic distinctions are also distinguished by gene expression data. These preliminary data suggest a new classification of gastric cancer with implications for improving our understanding of disease biology and identification of unique molecular drivers for each gastric cancer subtype. ©2011 AACR.
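
    To make the leave-one-out figure concrete: each sample is held out in turn, a classifier is fitted on the rest, and the error is the fraction of held-out samples that are misclassified, so an error of 0.14 corresponds to 1 - 0.14 = 0.86 of samples classified correctly. The sketch below uses a nearest-centroid classifier purely as a stand-in, since the abstract does not name the classifier; the data and function names are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, sample):
    """Predict the label whose class centroid is closest to `sample`.

    Assumption: nearest-centroid is a stand-in, not the classifier of the study.
    """
    sums, counts = {}, {}
    for features, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1

    def sq_dist(label):
        centroid = [s / counts[label] for s in sums[label]]
        return sum((c - v) ** 2 for c, v in zip(centroid, sample))

    return min(sorted(sums), key=sq_dist)


def loocv_error(xs, ys):
    """Fraction of samples misclassified when each one is held out in turn."""
    errors = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) != ys[i]:
            errors += 1
    return errors / len(xs)


# Hypothetical toy data: three well-separated groups in two dimensions.
xs = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
      [5.0, 5.1], [5.2, 4.9], [4.9, 5.0],
      [10.0, 0.1], [9.8, 0.0], [10.1, 0.2]]
ys = ["proximal"] * 3 + ["diffuse"] * 3 + ["distal"] * 3
toy_error = loocv_error(xs, ys)
```

    With only 36 samples, as in the study, each held-out error changes the estimate by almost three percentage points, which is one reason to read such figures as preliminary.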

  11. RPCA-Based Tumor Classification Using Gene Expression Data.

    Science.gov (United States)

    Liu, Jin-Xing; Xu, Yong; Zheng, Chun-Hou; Kong, Heng; Lai, Zhi-Hui

    2015-01-01

    Microarray techniques have been used to delineate cancer groups or to identify candidate genes for cancer prognosis. As such problems can be viewed as classification ones, various classification methods have been applied to analyze or interpret gene expression data. In this paper, we propose a novel method based on robust principal component analysis (RPCA) to classify tumor samples of gene expression data. Firstly, RPCA is utilized to highlight the characteristic genes associated with a special biological process. Then, RPCA and RPCA+LDA (robust principal component analysis and linear discriminant analysis) are used to identify the features. Finally, support vector machine (SVM) is applied to classify the tumor samples of gene expression data based on the identified features. Experiments on seven data sets demonstrate that our methods are effective and feasible for tumor classification.

  12. Classification of breast cancer histology images using Convolutional Neural Networks.

    Directory of Open Access Journals (Sweden)

    Teresa Araújo

    Full Text Available Breast cancer is one of the main causes of cancer death worldwide. The diagnosis of biopsy tissue with hematoxylin and eosin stained images is non-trivial and specialists often disagree on the final diagnosis. Computer-aided Diagnosis systems help to reduce the cost and increase the efficiency of this process. Conventional classification approaches rely on feature extraction methods designed for a specific problem based on field-knowledge. To overcome the many difficulties of the feature-based approaches, deep learning methods are becoming important alternatives. A method for the classification of hematoxylin and eosin stained breast biopsy images using Convolutional Neural Networks (CNNs) is proposed. Images are classified into four classes (normal tissue, benign lesion, in situ carcinoma and invasive carcinoma) and into two classes (carcinoma and non-carcinoma). The architecture of the network is designed to retrieve information at different scales, including both nuclei and overall tissue organization. This design allows the extension of the proposed system to whole-slide histology images. The features extracted by the CNN are also used for training a Support Vector Machine classifier. Accuracies of 77.8% for four classes and 83.3% for carcinoma/non-carcinoma are achieved. The sensitivity of our method for cancer cases is 95.6%.
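
    The relation between the four-class and two-class results can be illustrated with a short sketch. It assumes that the carcinoma group consists of the in situ and invasive classes, which the abstract implies but does not state; the labels and predictions are made-up toy values, not data from the paper.

```python
# Assumption: these two of the four classes form the "carcinoma" group.
CARCINOMA = {"in situ carcinoma", "invasive carcinoma"}


def to_binary(label):
    """Collapse a four-class label into carcinoma / non-carcinoma."""
    return "carcinoma" if label in CARCINOMA else "non-carcinoma"


def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def sensitivity(predicted, actual, positive="carcinoma"):
    """True positives divided by all truly positive cases."""
    tp = sum(p == positive and a == positive for p, a in zip(predicted, actual))
    fn = sum(p != positive and a == positive for p, a in zip(predicted, actual))
    return tp / (tp + fn)


# Hypothetical toy labels.
actual = ["normal tissue", "benign lesion", "in situ carcinoma",
          "invasive carcinoma", "invasive carcinoma", "benign lesion"]
predicted = ["normal tissue", "normal tissue", "invasive carcinoma",
             "invasive carcinoma", "benign lesion", "benign lesion"]

four_class_acc = accuracy(predicted, actual)
binary_pred = [to_binary(p) for p in predicted]
binary_true = [to_binary(a) for a in actual]
two_class_acc = accuracy(binary_pred, binary_true)
cancer_sensitivity = sensitivity(binary_pred, binary_true)
```

    Confusions inside a group (for example in situ versus invasive) count as errors in the four-class score but not in the two-class score, which is why the two-class accuracy can be higher.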

  13. A lymphocyte spatial distribution graph-based method for automated classification of recurrence risk on lung cancer images

    Science.gov (United States)

    Garciá-Arteaga, Juan D.; Corredor, Germán.; Wang, Xiangxue; Velcheti, Vamsidhar; Madabhushi, Anant; Romero, Eduardo

    2017-11-01

    Tumor-infiltrating lymphocytes (TILs) arise when various classes of white blood cells migrate from the bloodstream towards the tumor, infiltrating it. The presence of TILs is predictive of the response of the patient to therapy. In this paper, we show how the automatic detection of lymphocytes in digital H&E histopathological images and the quantitative evaluation of the global lymphocyte configuration, evaluated through global features extracted from non-parametric graphs constructed from the lymphocytes' detected positions, can be correlated to the patient's outcome in early-stage non-small cell lung cancer (NSCLC). The method was assessed on a tissue microarray cohort composed of 63 NSCLC cases. From the evaluated graphs, minimum spanning trees and k-NN graphs showed the highest predictive ability, yielding F1 scores of 0.75 and 0.72 and accuracies of 0.67 and 0.69, respectively. The predictive power of the proposed methodology indicates that graphs may be used to develop objective measures of the infiltration grade of tumors, which can, in turn, be used by pathologists to improve the decision making and treatment planning processes.
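
    As an illustration of a graph built from detected positions, the sketch below computes a minimum spanning tree over a handful of points with Prim's algorithm and derives one global feature from it. The choice of mean edge length as the feature is an assumption for illustration; the abstract does not list the features actually used. The coordinates are hypothetical.

```python
import math


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns (i, j, length) edges."""
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known connection of each point to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, best[u]))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written in terms of counts."""
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical lymphocyte centroids (pixel coordinates).
cells = [(0, 0), (1, 0), (1, 1), (4, 1)]
edges = mst_edges(cells)
mean_edge_length = sum(length for _, _, length in edges) / len(edges)
```

    This quadratic-time version is adequate for a sketch; a slide with many thousands of detected cells would call for a spatial index or a Delaunay-based construction.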

  14. Classification of neuropathic pain in cancer patients

    DEFF Research Database (Denmark)

    Brunelli, Cinzia; Bennett, Michael I; Kaasa, Stein

    2014-01-01

    and on the relevance of patient-reported outcome (PRO) descriptors for the screening of NP in this population. An international group of 42 experts was invited to participate in a consensus process through a modified 2-round Internet-based Delphi survey. Relevant topics investigated were: peculiarities of NP...... was found on the statement "the pathophysiology of NP due to cancer can be different from non-cancer NP" (MED=9, IQR=2). Satisfactory consensus was reached for the first 3 NeuPSIG criteria (pain distribution, history, and sensory findings; MEDs⩾8, IQRs⩽3), but not for the fourth one (diagnostic test...

  15. Application of machine learning on brain cancer multiclass classification

    Science.gov (United States)

    Panca, V.; Rustam, Z.

    2017-07-01

    Classification of brain cancer is a multiclass classification problem. One approach to solving this problem is to first transform it into several binary problems. The microarray gene expression dataset has the two main characteristics of medical data: a very large number of features (genes) and only a small number of samples. The application of machine learning to a microarray gene expression dataset mainly consists of two steps: feature selection and classification. In this paper, the features are selected using a method based on the support vector machine recursive feature elimination (SVM-RFE) principle, which is improved to solve multiclass classification and called multiple multiclass SVM-RFE. Instead of using only the selected features on a single classifier, this method combines the results of multiple classifiers. The features are divided into subsets and SVM-RFE is used on each subset. Then, the selected features of each subset are put on separate classifiers. This method enhances the feature selection ability of each single SVM-RFE. Twin support vector machine (TWSVM) is used as the classifier to reduce computational complexity. While ordinary SVM finds a single optimum hyperplane, the main objective of Twin SVM is to find two non-parallel optimum hyperplanes. The experiment on the brain cancer microarray gene expression dataset shows that this method could classify 71.4% of the overall test data correctly, using 100 and 1000 genes selected by the multiple multiclass SVM-RFE feature selection method. Furthermore, the per-class results show that this method could classify data of the normal and MD classes with 100% accuracy.
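
    The transformation of a multiclass problem into several binary problems can be sketched as follows. One-vs-rest is assumed here because the abstract does not say which decomposition was used; the class names other than "normal" and "MD" and all scores are hypothetical, and the binary classifiers themselves are left out.

```python
def one_vs_rest_targets(labels):
    """For each class, build a +1 / -1 target vector: that class against all others."""
    classes = sorted(set(labels))
    return {c: [1 if label == c else -1 for label in labels] for c in classes}


def combine_scores(scores_by_class):
    """Pick, for each sample, the class whose binary classifier scored highest."""
    classes = sorted(scores_by_class)
    n_samples = len(scores_by_class[classes[0]])
    return [max(classes, key=lambda c: scores_by_class[c][i])
            for i in range(n_samples)]


# Hypothetical labels for five samples.
labels = ["normal", "MD", "glioma", "MD", "normal"]
targets = one_vs_rest_targets(labels)

# Hypothetical decision scores from three already-trained binary classifiers.
scores = {"MD": [0.9, -0.2], "glioma": [0.1, 0.3], "normal": [-0.5, 0.7]}
decisions = combine_scores(scores)
```

    Each target vector would be handed to one binary classifier (a Twin SVM in the paper), and the per-class scores are merged back into a single multiclass decision.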

  16. Molecular and clinical support for a four-tiered grading system for bladder cancer based on the WHO 1973 and 2004 classifications.

    Science.gov (United States)

    van Rhijn, Bas W G; Musquera, Mireia; Liu, Liyang; Vis, André N; Zuiverloon, Tahlita C M; van Leenders, Geert J L H; Kirkels, Wim J; Zwarthoff, Ellen C; Boevé, Egbert R; Jöbsis, Adriaan C; Bapat, Bharati; Jewett, Michael A S; Zlotta, Alexandre R; van der Kwast, Theo H

    2015-05-01

    Currently, the use of two classification systems for bladder cancer grade is advocated in clinical guidelines because the WHO2004 classification has not been sufficiently validated with biological markers and follow-up. The slides of 325 primary non-muscle invasive bladder cancers from three hospitals were reviewed by one uro-pathologist in two separate sessions for the WHO1973 (G1, G2 and G3) and 2004 (papillary urothelial neoplasm of low malignant potential (LMP), low-grade (LG) and high-grade (HG)) classifications. FGFR3 status was examined with PCR-SNaPshot analysis. Expression of Ki-67, P53 and P27 was analyzed by immuno-histochemistry. Clinical recurrence and progression were determined. We performed validation and cross-validation of the two systems for grade with molecular markers and clinical outcome. Multivariable analyses were done to predict prognosis and pT1 bladder cancer. Grade review resulted in 88 G1, 149 G2 and 88 G3 lesions (WHO1973) and 79 LMP, 101 LG and 145 HG lesions (WHO2004). Molecular validation of both grading systems showed that FGFR3 mutations were associated with lower grades whereas altered expression (Ki-67, P53 and P27) was found in higher grades. Clinical validation showed that the two classification systems were both significant predictors for progression but not for recurrence. Cross-validation of both WHO systems showed a significant stepwise increase in biological (molecular markers) and clinical (progression) potential along the line: G1-LG-G2-HG-G3. The LMP and G1 categories had a similar clinical and molecular profile. On the basis of molecular biology and multivariable clinical data, our results support a four-tiered grading system using the 1973 and 2004 WHO classifications with one low-grade (LMP/LG/G1) category that includes LMP, two intermediate grade (LG/G2 and HG/G2) categories and one high-grade (HG/G3) category.

  17. Molecular Classification of Gastric Cancer: A new paradigm

    Science.gov (United States)

    Shah, Manish A.; Khanin, Raya; Tang, Laura; Janjigian, Yelena Y.; Klimstra, David S.; Gerdes, Hans; Kelsen, David P.

    2011-01-01

    Purpose: Gastric cancer may be subdivided into three distinct subtypes (proximal, diffuse, and distal gastric cancer) based on histopathologic and anatomic criteria. Each subtype is associated with unique epidemiology. Our aim is to test the hypothesis that these distinct gastric cancer subtypes may also be distinguished by gene expression analysis. Experimental Design: Patients with localized gastric adenocarcinoma being screened for a phase II preoperative clinical trial (NCI 5917) underwent endoscopic biopsy for fresh tumor procurement. 4–6 targeted biopsies of the primary tumor were obtained. Macrodissection was performed to ensure >80% carcinoma in the sample. HG-U133A GeneChip (Affymetrix) was used for cDNA expression analysis, and all arrays were processed and analyzed using the Bioconductor R-package. Results: Between November 2003 and January 2006, 57 patients were screened to identify 36 patients with localized gastric cancer who had adequate RNA for expression analysis. Using supervised analysis, we built a classifier to distinguish the three gastric cancer subtypes, successfully classifying each into tightly grouped clusters. Leave-one-out cross validation error was 0.14, suggesting that >85% of samples were classified correctly. Gene set analysis with the False Discovery Rate set at 0.25 identified several pathways that were differentially regulated when comparing each gastric cancer subtype to adjacent normal stomach. Conclusions: Subtypes of gastric cancer that have epidemiologic and histologic distinction are also distinguished by gene expression data. These preliminary data suggest a new classification of gastric cancer with implications for improving our understanding of disease biology and identification of unique molecular drivers for each gastric cancer subtype. PMID: 21430069

  18. Treating Colon Cancer Survivability Prediction as a Classification Problem

    Directory of Open Access Journals (Sweden)

    Ana SILVA

    2016-10-01

    Full Text Available This work presents a survivability prediction model for colon cancer developed with machine learning techniques. Survivability was viewed as a classification task where it was necessary to determine if a patient would survive each of the five years following treatment. The model was based on the SEER dataset which, after preprocessing, consisted of 38,592 records of colon cancer patients. Six features were extracted from a feature selection process in order to construct the model. This model was compared with another one with 18 features indicated by a physician. The results show that the performance of the six-feature model is close to that of the model using 18 features, which indicates that the first may be a good compromise between usability and performance.
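
    Framing survivability as classification amounts to deriving one binary target per follow-up year. The sketch below assumes that each record carries a survival time in months and ignores censoring (patients lost to follow-up), which a real analysis would have to handle; the function and parameter names are hypothetical.

```python
def yearly_survival_labels(survival_months, years=5):
    """Return one 0/1 label per year: 1 if the patient survived at least that many years.

    Assumption: `survival_months` is fully observed (no censoring).
    """
    return [1 if survival_months >= 12 * year else 0
            for year in range(1, years + 1)]


# A hypothetical patient who survived 30 months after treatment.
labels_30_months = yearly_survival_labels(30)
```

    Each of the five positions can then be learned by its own binary classifier, or by a single model that takes the target year as an additional input.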

  19. Land classification based on hydrological landscape units

    NARCIS (Netherlands)

    Gharari, S.; Fenicia, F.; Hrachowitz, M.; Savenije, H.H.G.

    2011-01-01

    This paper presents a new type of hydrological landscape classification based on dominant runoff mechanisms. Three landscape classes are distinguished: wetland, hillslope and plateau, corresponding to three dominant hydrological regimes: saturation excess overland flow, storage excess sub-surface

  20. Classifications of multispectral colorectal cancer tissues using convolution neural network

    Directory of Open Access Journals (Sweden)

    Hawraa Haj-Hassan

    2017-01-01

    Full Text Available Background: Colorectal cancer (CRC) is the third most common cancer among men and women. Its diagnosis in early stages, typically done through the analysis of colon biopsy images, can greatly improve the chances of a successful treatment. This paper proposes to use convolution neural networks (CNNs) to predict three tissue types related to the progression of CRC: benign hyperplasia (BH), intraepithelial neoplasia (IN), and carcinoma (Ca). Methods: Multispectral biopsy images of thirty CRC patients were retrospectively analyzed. Images of tissue samples were divided into three groups, based on their type (10 BH, 10 IN, and 10 Ca). An active contour model was used to segment image regions containing pathological tissues. Tissue samples were classified using a CNN containing convolution, max-pooling, and fully-connected layers. Available tissue samples were split into a training set, for learning the CNN parameters, and test set, for evaluating its performance. Results: An accuracy of 99.17% was obtained from segmented image regions, outperforming existing approaches based on traditional feature extraction, and classification techniques. Conclusions: Experimental results demonstrate the effectiveness of CNN for the classification of CRC tissue types, in particular when using presegmented regions of interest.

  1. Automated classification of histopathology images of prostate cancer using a Bag-of-Words approach

    Science.gov (United States)

    Sanghavi, Foram M.; Agaian, Sos S.

    2016-05-01

    The goals of this paper are (1) to test the Computer Aided Classification of prostate cancer histopathology images based on the Bag-of-Words (BoW) approach, (2) to evaluate the performance of the proposed method in classifying grades 3 and 4 against the results of the approach proposed by Khurd et al. [9], and (3) to classify the different grades of cancer, namely grades 0, 3, 4, and 5, using the proposed approach. The system performance is assessed using 132 prostate cancer histopathology images of different grades. The performance of the SURF features is also analyzed by comparing the results with SIFT features using different cluster sizes. The results show 90.15% accuracy in the detection of prostate cancer images using SURF features with 75 clusters for k-means clustering. The results showed higher sensitivity for SURF-based BoW classification compared to SIFT-based BoW.
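
    The core of a Bag-of-Words representation is a histogram: every local descriptor is assigned to its nearest cluster centre (a "visual word") and the image is described by the normalized word counts. The sketch below assumes the centres have already been obtained, for example by k-means; the two-dimensional descriptors are toy stand-ins for real SURF or SIFT vectors.

```python
def nearest_center(descriptor, centers):
    """Index of the cluster centre closest to the descriptor (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda k: sum((d - c) ** 2 for d, c in zip(descriptor, centers[k])))


def bow_histogram(descriptors, centers):
    """Normalized count of descriptors assigned to each visual word."""
    counts = [0] * len(centers)
    for descriptor in descriptors:
        counts[nearest_center(descriptor, centers)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(centers)


# Hypothetical vocabulary of three visual words and four descriptors from one image.
centers = [[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]]
descriptors = [[1.0, 1.0], [9.0, 9.0], [0.0, 1.0], [1.0, 9.0]]
histogram = bow_histogram(descriptors, centers)
```

    With 75 clusters, as in the best reported configuration, each image would be summarized by a 75-bin histogram that is then passed to the classifier.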

  2. Orientation selectivity based structure for texture classification

    Science.gov (United States)

    Wu, Jinjian; Lin, Weisi; Shi, Guangming; Zhang, Yazhong; Lu, Liu

    2014-10-01

    Local structure, e.g., local binary pattern (LBP), is widely used in texture classification. However, LBP is too sensitive to disturbance. In this paper, we introduce a novel structure for texture classification. Research in cognitive neuroscience indicates that the primary visual cortex presents remarkable orientation selectivity for visual information extraction. Inspired by this, we investigate the orientation similarities among neighboring pixels, and propose an orientation selectivity based pattern for local structure description. Experimental results on texture classification demonstrate that the proposed structure descriptor is quite robust to disturbance.
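
    For reference, the baseline LBP mentioned above can be sketched for a single pixel: the eight neighbours are thresholded against the centre value and the results are packed into an 8-bit code. The clockwise bit order starting at the top-left neighbour is one common convention and is an assumption here; this is the standard basic LBP, not the orientation selectivity pattern proposed in the paper.

```python
# Neighbour offsets (row, col), clockwise from the top-left pixel (assumed convention).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, row, col):
    """Basic 8-neighbour LBP code of the pixel at (row, col)."""
    center = image[row][col]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


# Hypothetical 3 x 3 grey-level patch.
patch = [[5, 9, 1],
         [4, 6, 7],
         [2, 3, 8]]
clean_code = lbp_code(patch, 1, 1)

# The same patch with the centre pixel disturbed from 6 to 8.
noisy = [[5, 9, 1],
         [4, 8, 7],
         [2, 3, 8]]
noisy_code = lbp_code(noisy, 1, 1)
```

    A change of two grey levels in the centre pixel alone alters the code, which illustrates the sensitivity to disturbance that the abstract refers to.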

  3. CLASSIFICATION OF CERVICAL CANCER CELLS IN PAP SMEAR SCREENING TEST

    Directory of Open Access Journals (Sweden)

    S. Athinarayanan

    2016-05-01

    Full Text Available Cervical cancer is the second most common cancer among women, but it is also curable. Regular smear tests can reveal signs of precancerous cells, so that the patient can be treated according to the result. However, detection errors can be caused by smear thickness, cell overlapping, or unwanted particles in the smear, as well as by faulty diagnosis on the part of cytotechnologists. For this reason, automatic cancer detection was developed. It helps to increase awareness of cancer cells and diagnostic accuracy at low cost. The detection process consists of image preprocessing techniques, that is, segmentation and effective texture feature extraction, together with SVM classification. The final classification results of the proposed technique were then compared with those of the earlier KNN and ANN classification techniques, and the results should be very useful to cytotechnologists for their further analysis.

  4. Classification of Dukes' B and C colorectal cancers using expression arrays

    DEFF Research Database (Denmark)

    Frederiksen, C.M.; Knudsen, Steen; Laurberg, S.

    2003-01-01

    Purpose. Colorectal cancer is one of the most common malignancies. Substaging of the cancer is of importance not only to prognosis but also to treatment. Classification of substages based on DNA microarray technology is currently the most promising approach. We therefore investigated if gene...... expression of one of the most common malignancies, colorectal cancer, now seems to be within reach. The data indicates that it is possible at least to classify Dukes' B and C colorectal tumors with microarrays....

  5. Application of Artificial Neural Networks in Cancer Classification and Diagnosis Prediction of a Subtype of Lymphoma Based on Gene Expression Profile

    Directory of Open Access Journals (Sweden)

    L Ziaei

    2006-01-01

    Full Text Available Background: Diffuse Large B-cell Lymphoma (DLBCL) is the most common subtype of non-Hodgkin’s Lymphoma. DLBCL patients have different survival times after diagnosis. 40% of patients respond well to current therapy and have prolonged survival, whereas the remainder survive less than 5 years. In this study, we applied an artificial neural network to classify patients with DLBCL on the basis of their gene expression profiles. Finally, we attempted to extract a number of genes whose differential expression was significant in DLBCL subtypes. Methods: We studied 40 patients and 4026 genes. Genes were ranked based on their signal to noise (S/N) ratios. After selecting a suitable threshold, the genes whose ratios were less than the threshold were removed. We then used PCA for further dimension reduction and a Perceptron neural network for classification of these patients. We extracted some appropriate genes based on their prediction ability. Results: We considered various targets for patient classification. Patients were classified based on their 5-year survival with an accuracy of 93%, with respect to the results of the Alizadeh et al. study with an accuracy of 100%, and with respect to their International Prognostic Index (IPI) with an accuracy of 89%. Conclusion: The combination of PCA and S/N ratio is an effective method for dimension reduction, and the neural network is a robust tool for classification of patients according to their gene expression profile. Keywords: classification, gene expression, DLBCL, neural network, Perceptron
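
    The ranking and thresholding step can be sketched as follows. The abstract does not give its S/N formula, so the widely used form (difference of class means divided by the sum of class standard deviations) is an assumption, as are the gene names, the expression values, and the threshold.

```python
from statistics import mean, stdev


def signal_to_noise(values_a, values_b):
    """Assumed S/N form: (mean_a - mean_b) / (sd_a + sd_b)."""
    denominator = stdev(values_a) + stdev(values_b)
    return (mean(values_a) - mean(values_b)) / denominator if denominator else 0.0


def rank_genes(expr_a, expr_b, threshold):
    """Drop genes whose |S/N| is below the threshold and rank the rest, best first.

    expr_a / expr_b map a gene name to its expression values in each patient class.
    """
    scores = {gene: abs(signal_to_noise(expr_a[gene], expr_b[gene])) for gene in expr_a}
    kept = [gene for gene in scores if scores[gene] >= threshold]
    return sorted(kept, key=lambda gene: scores[gene], reverse=True)


# Hypothetical expression values for three genes in two patient classes.
long_survival = {"gene_x": [10, 11, 12], "gene_y": [5, 6, 7], "gene_z": [4, 6, 8]}
short_survival = {"gene_x": [1, 2, 3], "gene_y": [5, 7, 6], "gene_z": [1, 3, 5]}
selected = rank_genes(long_survival, short_survival, threshold=0.5)
```

    In the pipeline described above, the surviving genes would then be passed to PCA before the Perceptron is trained.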

  6. Involvement of Machine Learning for Breast Cancer Image Classification: A Survey

    Directory of Open Access Journals (Sweden)

    Abdullah-Al Nahid

    2017-01-01

    Full Text Available Breast cancer is one of the largest causes of women’s death in the world today. Advanced engineering of natural image classification techniques and Artificial Intelligence methods has largely been used for the breast-image classification task. The involvement of digital image classification gives doctors and physicians a second opinion, and it saves the doctors’ and physicians’ time. Despite the various publications on breast image classification, very few review papers are available which provide a detailed description of breast cancer image classification techniques, feature extraction and selection procedures, classification measuring parameterizations, and image classification findings. We have put a special emphasis on the Convolutional Neural Network (CNN) method for breast image classification. Along with the CNN method we have also described the involvement of the conventional Neural Network (NN), Logic Based classifiers such as the Random Forest (RF) algorithm, Support Vector Machines (SVM), Bayesian methods, and a few of the semisupervised and unsupervised methods which have been used for breast image classification.

  7. Evolving cancer classification in the era of personalized medicine: A primer for radiologists

    Energy Technology Data Exchange (ETDEWEB)

    O'Neill, Alibhe C.; Jagannathan, Jyothi P.; Ramaiya, Nikhil H. [Dept. of Imaging, Dana Farber Cancer Institute, Boston (United States)

    2017-01-15

    Traditionally tumors were classified based on anatomic location but now specific genetic mutations in cancers are leading to treatment of tumors with molecular targeted therapies. This has led to a paradigm shift in the classification and treatment of cancer. Tumors treated with molecular targeted therapies often show morphological changes rather than change in size and are associated with class specific and drug specific toxicities, different from those encountered with conventional chemotherapeutic agents. It is important for the radiologists to be familiar with the new cancer classification and the various treatment strategies employed, in order to effectively communicate and participate in the multi-disciplinary care. In this paper we will focus on lung cancer as a prototype of the new molecular classification.

  8. A Selective Ensemble Classification Method Combining Mammography Images with Ultrasound Images for Breast Cancer Diagnosis

    Directory of Open Access Journals (Sweden)

    Jinyu Cong

    2017-01-01

    Full Text Available Breast cancer is one of the main diseases threatening women’s lives. Early detection and diagnosis of breast cancer play an important role in reducing its mortality. In this paper, we propose a selective ensemble method that integrates KNN, SVM, and Naive Bayes classifiers to diagnose breast cancer by combining ultrasound images with mammography images. Our experimental results show that the selective classification method, with an accuracy of 88.73% and a sensitivity of 97.06%, is efficient for breast cancer diagnosis. In addition, the indicator R presents a new way to choose the base classifiers for ensemble learning.
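
The accuracy and sensitivity figures quoted in this abstract are standard confusion-matrix metrics. The following is a minimal sketch of how such figures are typically computed, not the authors' code; the function names and the toy labels are hypothetical and chosen only for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def sensitivity(tp, fn):
    """Fraction of truly positive samples that were detected (recall)."""
    return tp / (tp + fn)


# Hypothetical toy labels: 1 = malignant, 0 = benign.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(accuracy(tp, fp, tn, fn), sensitivity(tp, fn))
```

A high sensitivity with a lower accuracy, as reported above, would indicate that most errors are false positives rather than missed cancers.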

  9. A Neural-Network-Based Approach to White Blood Cell Classification

    National Research Council Canada - National Science Library

    Su, Mu-Chun; Cheng, Chun-Yen; Wang, Pa-Chun

    2014-01-01

    ... of important hematic pathologies. For example, the presence of infections, leukemia, and some particular kinds of cancers can be diagnosed based on the results of the classification and the count of ...

  10. A systematic approach to prioritize drug targets using machine learning, a molecular descriptor-based classification model, and high-throughput screening of plant derived molecules: a case study in oral cancer.

    Science.gov (United States)

    Randhawa, Vinay; Kumar Singh, Anil; Acharya, Vishal

    2015-12-01

    Systems-biology inspired identification of drug targets and machine learning-based screening of small molecules which modulate their activity have the potential to revolutionize modern drug discovery by complementing conventional methods. To utilize the effectiveness of such pipelines, we first analyzed the dysregulated gene pairs between control and tumor samples and then implemented an ensemble-based feature selection approach to prioritize targets in oral squamous cell carcinoma (OSCC) for therapeutic exploration. Based on the structural information of known inhibitors of CXCR4, one of the best targets identified in this study, a feature selection was implemented to identify the optimal structural features (molecular descriptors), from which a classification model was generated. Furthermore, the CXCR4-centered descriptor-based classification model was finally utilized to screen a repository of plant derived small molecules to obtain potential inhibitors. The application of our methodology may assist effective selection of the best targets which may have previously been overlooked, which in turn will lead to the development of new oral cancer medications. The small molecules identified in this study can be ideal candidates for trials as potential novel anti-oral cancer agents. Importantly, distinct steps of this whole study may provide reference for the analysis of other complex human diseases.

  11. Validation of the Consensus-Definition for Cancer Cachexia and evaluation of a classification model--a study based on data from an international multicentre project (EPCRC-CSA).

    Science.gov (United States)

    Blum, D; Stene, G B; Solheim, T S; Fayers, P; Hjermstad, M J; Baracos, V E; Fearon, K; Strasser, F; Kaasa, S

    2014-08-01

    Weight loss limits cancer therapy, quality of life and survival. Common diagnostic criteria and a framework for a classification system for cancer cachexia were recently agreed upon by international consensus. Specific assessment domains (stores, intake, catabolism and function) were proposed. The aim of this study is to validate this diagnostic criteria (two groups: model 1) and examine a four-group (model 2) classification system regarding these domains as well as survival. Data from an international patient sample with advanced cancer (N = 1070) were analysed. In model 1, the diagnostic criteria for cancer cachexia [weight loss/body mass index (BMI)] were used. Model 2 classified patients into four groups 0-III, according to weight loss/BMI as a framework for cachexia stages. The cachexia domains, survival and sociodemographic/medical variables were compared across models. Eight hundred and sixty-one patients were included. Model 1 consisted of 399 cachectic and 462 non-cachectic patients. Cachectic patients had significantly higher levels of inflammation, lower nutritional intake and performance status and shorter survival. In model 2, differences were not consistent; appetite loss did not differ between group III and IV, and performance status not between group 0 and I. Survival was shorter in group II and III compared with other groups. By adding other cachexia domains to the model, survival differences were demonstrated. The diagnostic criteria based on weight loss and BMI distinguish between cachectic and non-cachectic patients concerning all domains (intake, catabolism and function) and is associated with survival. In order to guide cachexia treatment a four-group classification model needs additional domains to discriminate between cachexia stages. © The Author 2014. Published by Oxford University Press on behalf of the European Society for Medical Oncology. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  12. Classification of oral cancers using Raman spectroscopy of serum

    Science.gov (United States)

    Sahu, Aditi; Talathi, Sneha; Sawant, Sharada; Krishna, C. Murali

    2014-03-01

    Oral cancers are the sixth most common malignancy worldwide, with low 5-year disease free survival rates, attributable to late detection due to lack of reliable screening modalities. Our in vivo Raman spectroscopy studies have demonstrated classification of normal and tumor as well as cancer field effects (CFE), the earliest events in oral cancers. In view of limitations of this approach, such as the requirement of on-site instrumentation and stringent experimental conditions, the feasibility of classification of normal and cancer using serum was explored using 532 nm excitation. In that earlier study, strong resonance features of β-carotenes, present differentially in normal and pathological conditions, were observed. In the present study, Raman spectra of sera of 36 buccal mucosa, 33 tongue cancers and 17 healthy subjects were recorded using a Raman microprobe coupled with a 40X objective using 785 nm excitation, a known source of excitation for biomedical applications. To eliminate heterogeneity, the average of 3 spectra recorded from each sample was subjected to PC-LDA followed by leave-one-out cross-validation. Findings indicate an average classification efficiency of ~70% for normal and cancer. Buccal mucosa and tongue cancer serum could also be classified with an efficiency of ~68%. Of the two cancers, buccal mucosa cancer and normal could be classified with a higher efficiency. Findings of the study are quite comparable to those of our earlier study, which suggests that there exist significant differences, other than β-carotenes, between normal and cancerous samples which can be exploited for the classification. Prospectively, extensive validation studies will be undertaken to confirm the findings.

  13. Changes of 2015 WHO Histological Classification of Lung Cancer and the Clinical Significance

    Directory of Open Access Journals (Sweden)

    Xin YANG

    2016-06-01

    Full Text Available Due in part to remarkable advances over the past decade in our understanding of lung cancer, particularly in the areas of medical oncology, molecular biology, and radiology, there is a pressing need for a revised classification, based not on pathology alone, but rather on an integrated multidisciplinary approach to classification of lung cancer. The 2015 World Health Organization (WHO) Classification of Tumors of the Lung, Pleura, Thymus and Heart has just been published with numerous important changes from the 2004 WHO classification. The revised classification has been greatly improved in helping advance the field, increasing the impact of research, improving patient care and assisting in predicting outcome. The most significant changes will be summarized in this paper as follows: (1) main changes of lung adenocarcinoma as proposed by the 2011 International Association for the Study of Lung Cancer/American Thoracic Society/European Respiratory Society (IASLC/ATS/ERS) classification, (2) reclassifying squamous cell carcinomas into keratinizing, nonkeratinizing, and basaloid subtypes with the nonkeratinizing tumors requiring immunohistochemistry proof of squamous differentiation, (3) restricting the diagnosis of large cell carcinoma only to resected tumors that lack any clear morphologic or immunohistochemical differentiation with reclassification of the remaining former large cell carcinoma subtypes into different categories, (4) grouping of neuroendocrine tumors together in one category, and (5) the current viewpoint of histologic grading of lung cancer.

  14. Optimization based tumor classification from microarray gene expression data.

    Directory of Open Access Journals (Sweden)

    Onur Dagliyan

    Full Text Available BACKGROUND: An important use of data obtained from microarray measurements is the classification of tumor types with respect to genes that are either up or down regulated in specific cancer types. A number of algorithms have been proposed to obtain such classifications. These algorithms usually require parameter optimization to obtain accurate results depending on the type of data. Additionally, it is highly critical to find an optimal set of markers among those up or down regulated genes that can be clinically utilized to build assays for the diagnosis or to follow progression of specific cancer types. In this paper, we employ a mixed integer programming based classification algorithm named hyper-box enclosure method (HBE) for the classification of some cancer types with a minimal set of predictor genes. This optimization based method, which is a user friendly and efficient classifier, may allow clinicians to diagnose and follow progression of certain cancer types. METHODOLOGY/PRINCIPAL FINDINGS: We apply the HBE algorithm to some well known data sets such as leukemia, prostate cancer, diffuse large B-cell lymphoma (DLBCL), and small round blue cell tumors (SRBCT) to find some predictor genes that can be utilized for diagnosis and prognosis in a robust manner with a high accuracy. Our approach does not require any modification or parameter optimization for each data set. Additionally, information gain attribute evaluator, relief attribute evaluator and correlation-based feature selection methods are employed for the gene selection. The results are compared with those from other studies and biological roles of selected genes in corresponding cancer type are described. CONCLUSIONS/SIGNIFICANCE: The performance of our algorithm overall was better than the other algorithms reported in the literature and classifiers found in WEKA data-mining package. Since it does not require a parameter optimization and it performs consistently very high prediction rate on

  15. A web-based land cover classification system based on ontology model of different classification systems

    Science.gov (United States)

    Lin, Y.; Chen, X.

    2016-12-01

    Land cover classification systems used in remote sensing image data have been developed to meet the needs for depicting land covers in scientific investigations and policy decisions. However, accuracy assessments of a spate of data sets demonstrate that, compared with the real physiognomy, each thematic map of a specific land cover classification system contains some unavoidable flaws and unintended deviations. This work proposes a web-based land cover classification system, an integrated prototype, based on an ontology model of various classification systems, each of which is assigned the same weight in the final determination of land cover type. Ontology, a formal explication of specific concepts and relations, is employed in this prototype to build up the connections among different systems to resolve the naming conflicts. The process is initialized by measuring semantic similarity between terminologies in the systems and the search key to produce a set of satisfied classifications, and carries on by searching the predefined relations in concepts of all classification systems to generate classification maps with the user-specified land cover type highlighted, based on a probability calculated by votes from data sets that adopt different classification systems. The present system is verified and validated by comparing the classification results with those of the most common systems. Due to full consideration and meaningful expression of each classification system using ontology and the convenience that the web brings with itself, this system, as a preliminary model, proposes a flexible and extensible architecture for classification system integration and data fusion, thereby providing a strong foundation for the future work.

  16. Voice based gender classification using machine learning

    Science.gov (United States)

    Raahul, A.; Sapthagiri, R.; Pankaj, K.; Vijayarajan, V.

    2017-11-01

    Gender identification is one of the major problems in speech analysis today. It involves tracing gender from acoustic data, i.e., pitch, median, frequency, etc. Machine learning gives promising results for classification problems in all research domains, and there are several performance metrics for evaluating algorithms in an area. We present a comparative model for evaluating five different machine learning algorithms on eight different metrics in gender classification from acoustic data. The aim is to identify gender with five different algorithms: Linear Discriminant Analysis (LDA), K-Nearest Neighbour (KNN), Classification and Regression Trees (CART), Random Forest (RF), and Support Vector Machine (SVM), on the basis of eight different metrics. The main parameter in evaluating any algorithm is its performance. The misclassification rate must be low in classification problems, which means that the accuracy rate must be high. The location and gender of a person have become crucial in economic markets in the form of AdSense. With this comparative model, we assess the different ML algorithms and find the best fit for gender classification of acoustic data.

  17. Multimodal Feature-Based Surface Material Classification.

    Science.gov (United States)

    Strese, Matti; Schuwerk, Clemens; Iepure, Albert; Steinbach, Eckehard

    2017-01-01

    When a tool is tapped on or dragged over an object surface, vibrations are induced in the tool, which can be captured using acceleration sensors. The tool-surface interaction additionally creates audible sound waves, which can be recorded using microphones. Features extracted from camera images provide additional information about the surfaces. We present an approach for tool-mediated surface classification that combines these signals and demonstrate that the proposed method is robust against variable scan-time parameters. We examine freehand recordings of 69 textured surfaces recorded by different users and propose a classification system that uses perception-related features, such as hardness, roughness, and friction; selected features adapted from speech recognition, such as modified cepstral coefficients applied to our acceleration signals; and surface texture-related image features. We focus on mitigating the effect of variable contact force and exploration velocity conditions on these features as a prerequisite for a robust machine-learning-based approach for surface classification. The proposed system works without explicit scan force and velocity measurements. Experimental results show that our proposed approach allows for successful classification of textured surfaces under variable freehand movement conditions, exerted by different human operators. The proposed subset of six features, selected from the described sound, image, friction force, and acceleration features, leads to a classification accuracy of 74 percent in our experiments when combined with a Naive Bayes classifier.

  18. Prognostic classifications of lymph node involvement in lung cancer and current International Association for the Study of Lung Cancer descriptive classification in zones.

    Science.gov (United States)

    Riquet, Marc; Arame, Alex; Foucault, Christophe; Le Pimpec Barthes, Françoise

    2010-09-01

    The lymphatic drainage of solid organ tumors crosses through the lymph nodes (LNs) whose tumoral involvement may still be considered as local disease. Concerning lung cancer, LN involvement may be intrapulmonary (N1), and mediastinal and/or extra-thoracic. More than 30 years ago, mediastinal involved LNs were all considered as N2, and outside the scope of surgery. In 1978, Naruke presented an original article entitled 'Lymph node mapping and curability at various levels of metastasis in resected lung cancer', demonstrating that N2 was not a contraindication to surgery in all patients. The map permitted to localize the favorable N2 on the lung cancer ipsilateral side of the mediastinum. Several maps ensued aiming to discriminate between right and left involvement (1983), and to distinguish N2 (ipsilateral) and N3 (contralateral) mediastinal LN involvement (1983, 1986). The last map (1997 regional LN classification) was recently replaced by a descriptive classification in anatomical zones. This new LN map of the TNM classification for lung cancer is a step toward using anatomical view points which might be the best way to better understand lung cancer lymphatic spread. Nowadays, the LNs are easily identified by current radiological imaging, and their resectability may be anticipated. Each LN chain may be removed by en-bloc lymphadenectomy performed during radical lung resection, a safe procedure which seems to be more oncological based than sampling, and which avoids the source of discrepancies pointed out during the labeling of LN stations by surgeons.

  19. Vessel-guided airway segmentation based on voxel classification

    DEFF Research Database (Denmark)

    Lo, Pechin Chien Pau; Sporring, Jon; Ashraf, Haseem

    2008-01-01

    This paper presents a method for improving airway tree segmentation using vessel orientation information. We use the fact that an airway branch is always accompanied by an artery, with both structures having similar orientations. This work is based on a voxel classification airway segmentation...... of the surroundings of a voxel, estimated based on a tube model, is to that of a neighboring vessel. The proposed method is tested on 20 CT images from different subjects selected randomly from a lung cancer screening study. Length of the airway branches from the results of the proposed method are significantly...

  20. Incidence and survival of lymphohematopoietic neoplasms according to the World Health Organization classification: a population-based study from the Victorian Cancer Registry in Australia.

    Science.gov (United States)

    Jayasekara, Harindra; Karahalios, Amalia; Juneja, Surender; Thursfield, Vicky; Farrugia, Helen; English, Dallas R; Giles, Graham G

    2010-03-01

    We studied the incidence and relative survival of 39 837 cases of lymphohematopoietic neoplasms (LHN) reported to the Victorian Cancer Registry during 1982-2004, classified according to the World Health Organization (WHO) classification. We modeled excess mortality using Poisson regression to estimate differences in survival by age, sex, and time period. Age-standardized incidence rates varied across subtypes of lymphoid and myeloid neoplasms. All major subtypes predominantly affected the elderly except Hodgkin lymphoma (incidence peaks at 20-24 and 75-79 years) and acute lymphoblastic leukemia (0-9 years). After an initial rise, overall lymphoid and myeloid incidence stabilized in the mid-1990s. The 5-year relative survival was 58% for lymphoid and 35% for myeloid neoplasms. Survival improved during 1990-2004 for diffuse large B-cell lymphoma, follicular lymphoma, acute myeloid leukemia, chronic myeloid leukemia, and myelodysplastic syndromes (p < 0.001) and declined with advancing age for all subtypes (p < 0.001). Female sex was associated with higher survival for most myeloid subtypes. The results represent a rare epidemiological characterization of the whole range of LHN according to WHO subtypes.

  1. Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks | Center for Cancer Research

    Science.gov (United States)

    The purpose of this study was to develop a method of classifying cancers to specific diagnostic categories based on their gene expression signatures using artificial neural networks (ANNs). We trained the ANNs using the small, round blue-cell tumors (SRBCTs) as a model. These cancers belong to four distinct diagnostic categories and often present diagnostic dilemmas in clinical practice. The ANNs correctly classified all samples and identified the genes most relevant to the classification.

  2. Lung cancer classification using neural networks for CT images.

    Science.gov (United States)

    Kuruvilla, Jinsa; Gunavathi, K

    2014-01-01

    Early detection of cancer is the most promising way to enhance a patient's chance for survival. This paper presents a computer aided classification method in computed tomography (CT) images of lungs developed using artificial neural network. The entire lung is segmented from the CT images and the parameters are calculated from the segmented image. The statistical parameters like mean, standard deviation, skewness, kurtosis, fifth central moment and sixth central moment are used for classification. The classification process is done by feed forward and feed forward back propagation neural networks. Compared to feed forward networks the feed forward back propagation network gives better classification. The parameter skewness gives the maximum classification accuracy. Among the already available thirteen training functions of back propagation neural network, the Traingdx function gives the maximum classification accuracy of 91.1%. Two new training functions are proposed in this paper. The results show that the proposed training function 1 gives an accuracy of 93.3%, specificity of 100% and sensitivity of 91.4% and a mean square error of 0.998. The proposed training function 2 gives a classification accuracy of 93.3% and minimum mean square error of 0.0942. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
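
The statistical parameters listed in this abstract (mean, standard deviation, skewness, kurtosis, fifth and sixth central moments) can be illustrated with a short sketch. This is not the authors' implementation: it assumes population (divide-by-n) moments, leaves the fifth and sixth moments unnormalised, and uses a hypothetical flat list of pixel intensities in place of a segmented lung region.

```python
import math


def central_moment(values, k):
    """k-th central moment, using the population (divide-by-n) convention."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** k for v in values) / n


def intensity_features(values):
    """Statistical features of a list of pixel intensities (illustrative)."""
    mean = sum(values) / len(values)
    std = math.sqrt(central_moment(values, 2))
    if std == 0:
        raise ValueError("constant region: skewness and kurtosis undefined")
    return {
        "mean": mean,
        "std": std,
        "skewness": central_moment(values, 3) / std ** 3,
        "kurtosis": central_moment(values, 4) / std ** 4,
        "moment5": central_moment(values, 5),  # left unnormalised here
        "moment6": central_moment(values, 6),  # left unnormalised here
    }


# Hypothetical intensities standing in for a segmented lung region.
features = intensity_features([1, 2, 3, 4, 5])
print(features)
```

In a classifier of the kind described, each segmented image would contribute one such feature vector as input to the network.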

  3. A New Feature Ensemble with a Multistage Classification Scheme for Breast Cancer Diagnosis

    Directory of Open Access Journals (Sweden)

    Idil Isikli Esener

    2017-01-01

    Full Text Available A new and effective feature ensemble with a multistage classification is proposed to be implemented in a computer-aided diagnosis (CAD) system for breast cancer diagnosis. A publicly available mammogram image dataset collected during the Image Retrieval in Medical Applications (IRMA) project is utilized to verify the suggested feature ensemble and multistage classification. In achieving the CAD system, feature extraction is performed on the mammogram region of interest (ROI) images which are preprocessed by applying a histogram equalization followed by a nonlocal means filtering. The proposed feature ensemble is formed by concatenating the local configuration pattern-based, statistical, and frequency domain features. The classification process of these features is implemented in three cases: a one-stage study, a two-stage study, and a three-stage study. Eight well-known classifiers are used in all cases of this multistage classification scheme. Additionally, the results of the classifiers that provide the top three performances are combined via a majority voting technique to improve the recognition accuracy on both two- and three-stage studies. A maximum of 85.47%, 88.79%, and 93.52% classification accuracies are attained by the one-, two-, and three-stage studies, respectively. The proposed multistage classification scheme is more effective than the single-stage classification for breast cancer diagnosis.
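
The majority voting step mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the three prediction lists are hypothetical stand-ins for the top three classifiers, and the tie-breaking rule (the first classifier's label wins) is a choice made here, not something the abstract specifies.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample.

    `predictions` is a list of equally long label lists, one per classifier.
    On a tie, the label that appears first among the classifiers wins,
    because Counter.most_common preserves first-seen order for equal counts.
    """
    combined = []
    for labels in zip(*predictions):
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined


# Hypothetical outputs of three classifiers on four mammogram ROIs.
clf_a = ["benign", "malignant", "benign", "malignant"]
clf_b = ["benign", "benign", "benign", "malignant"]
clf_c = ["malignant", "malignant", "benign", "benign"]
print(majority_vote([clf_a, clf_b, clf_c]))
```

With an odd number of classifiers and two classes, as in this toy case, ties cannot occur.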

  4. Analysis of composition-based metagenomic classification.

    Science.gov (United States)

    Higashi, Susan; Barreto, André da Motta Salles; Cantão, Maurício Egidio; de Vasconcelos, Ana Tereza Ribeiro

    2012-01-01

    An essential step of a metagenomic study is the taxonomic classification, that is, the identification of the taxonomic lineage of the organisms in a given sample. The taxonomic classification process involves a series of decisions. Currently, in the context of metagenomics, such decisions are usually based on empirical studies that consider one specific type of classifier. In this study we propose a general framework for analyzing the impact that several decisions can have on the classification problem. Instead of focusing on any specific classifier, we define a generic score function that provides a measure of the difficulty of the classification task. Using this framework, we analyze the impact of the following parameters on the taxonomic classification problem: (i) the length of n-mers used to encode the metagenomic sequences, (ii) the similarity measure used to compare sequences, and (iii) the type of taxonomic classification, which can be conventional or hierarchical, depending on whether the classification process occurs in a single shot or in several steps according to the taxonomic tree. We defined a score function that measures the degree of separability of the taxonomic classes under a given configuration induced by the parameters above. We conducted an extensive computational experiment and found out that reasonable values for the parameters of interest could be (i) intermediate values of n, the length of the n-mers; (ii) any similarity measure, because all of them resulted in similar scores; and (iii) the hierarchical strategy, which performed better in all of the cases. As expected, short n-mers generate lower configuration scores because they give rise to frequency vectors that represent distinct sequences in a similar way. On the other hand, large values for n result in sparse frequency vectors that represent differently metagenomic fragments that are in fact similar, also leading to low configuration scores. Regarding the similarity measure, in
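
To make the n-mer encoding concrete, the sketch below builds a frequency vector for a DNA fragment and compares two fragments. It is an illustration, not the framework from the study: cosine similarity is an assumed choice (the abstract does not name the measures it compared), and the function names and example sequences are hypothetical.

```python
import math
from itertools import product


def nmer_frequencies(sequence, n):
    """Relative frequency of every length-n word over the alphabet ACGT."""
    counts = {"".join(p): 0 for p in product("ACGT", repeat=n)}
    total = 0
    for i in range(len(sequence) - n + 1):
        word = sequence[i:i + n]
        if word in counts:  # skip windows containing ambiguous bases such as N
            counts[word] += 1
            total += 1
    if total == 0:
        return [0.0] * len(counts)
    return [counts[word] / total for word in sorted(counts)]


def cosine_similarity(u, v):
    """Cosine of the angle between two frequency vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical fragments; the vector length is 4**n, so it grows quickly with n.
frag_a = nmer_frequencies("ACGTACGTAC", 2)
frag_b = nmer_frequencies("ACGTTTGTAC", 2)
print(len(frag_a), cosine_similarity(frag_a, frag_b))
```

Because the vector has 4**n entries, large n leaves most entries at zero for short fragments, which is the sparsity effect the abstract describes.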

  5. Zone-specific logistic regression models improve classification of prostate cancer on multi-parametric MRI

    Energy Technology Data Exchange (ETDEWEB)

    Dikaios, Nikolaos; Halligan, Steve; Taylor, Stuart; Atkinson, David; Punwani, Shonit [University College London, Centre for Medical Imaging, London (United Kingdom); University College London Hospital, Departments of Radiology, London (United Kingdom); Alkalbani, Jokha; Sidhu, Harbir Singh [University College London, Centre for Medical Imaging, London (United Kingdom); Abd-Alazeez, Mohamed; Ahmed, Hashim U.; Emberton, Mark [University College London, Research Department of Urology, Division of Surgery and Interventional Science, London (United Kingdom); Kirkham, Alex [University College London Hospital, Departments of Radiology, London (United Kingdom); Freeman, Alex [University College London Hospital, Department of Histopathology, London (United Kingdom)

    2015-09-15

    To assess the interchangeability of zone-specific (peripheral-zone (PZ) and transition-zone (TZ)) multiparametric-MRI (mp-MRI) logistic-regression (LR) models for classification of prostate cancer. Two hundred and thirty-one patients (70 TZ training-cohort; 76 PZ training-cohort; 85 TZ temporal validation-cohort) underwent mp-MRI and transperineal-template-prostate-mapping biopsy. PZ and TZ uni/multi-variate mp-MRI LR-models for classification of significant cancer (any cancer-core-length (CCL) with Gleason > 3 + 3 or any grade with CCL ≥ 4 mm) were derived from the respective cohorts and validated within the same zone by leave-one-out analysis. Inter-zonal performance was tested by applying TZ models to the PZ training-cohort and vice-versa. Classification performance of TZ models for TZ cancer was further assessed in the TZ validation-cohort. ROC area-under-curve (ROC-AUC) analysis was used to compare models. The univariate parameters with the best classification performance were the normalised T2 signal (T2nSI) within the TZ (ROC-AUC = 0.77) and normalized early contrast-enhanced T1 signal (DCE-nSI) within the PZ (ROC-AUC = 0.79). Performance was not significantly improved by bi-variate/tri-variate modelling. PZ models that contained DCE-nSI performed poorly in classification of TZ cancer. The TZ model based solely on maximum-enhancement poorly classified PZ cancer. LR-models dependent on DCE-MRI parameters alone are not interchangeable between prostatic zones; however, models based exclusively on T2 and/or ADC are more robust for inter-zonal application. (orig.)
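
The ROC-AUC figures used here to compare models have a simple interpretation: the probability that a randomly chosen cancer case receives a higher model score than a randomly chosen non-cancer case. The sketch below computes that quantity directly by pairwise comparison; it is not the authors' analysis code, and the scores and labels are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts the fraction of (positive, negative) pairs in which the positive
    case has the higher score; tied scores count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical model outputs (e.g. predicted probabilities) and true labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
print(roc_auc(scores, labels))
```

Applying a model trained on one zone to cases from the other zone and recomputing this value is one way to picture the inter-zonal test described above.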

  6. Evaluation of a panel of expert pathologists: review of the diagnosis and histological classification of Hodgkin and non-Hodgkin lymphomas in a population-based cancer registry

    NARCIS (Netherlands)

    Strobbe, L.; Schans, S.A. van de; Heijker, S.M.; Meijer, J.W.R.; Mattijssen, E.J.; Mandigers, C.M.P.W.; Kievit, I.M. de; Raemaekers, J.M.M.; Hebeda, K.M.; Krieken, J.H. van

    2014-01-01

    Abstract Correct histological classification of malignant lymphomas is important but has always been a difficult challenge. Since 2001 the World Health Organization (WHO) classification has been used, which should make it easier to define distinct disease entities. The purpose of this study was to

  7. Object Classification Using Substance Based Neural Network

    Directory of Open Access Journals (Sweden)

    P. Sengottuvelan

    2014-01-01

    Full Text Available Object recognition has shown tremendous increase in the field of image analysis. The required set of image objects is identified and retrieved on the basis of object recognition. In this paper, we propose a novel classification technique called substance based image classification (SIC) using a wavelet neural network. The foremost task of SIC is to remove the surrounding regions from an image to reduce the misclassified portion and to effectively reflect the shape of an object. At first, the image to be extracted is performed with SIC system through the segmentation of the image. Next, in order to attain more accurate information, with the extracted set of regions, the wavelet transform is applied for extracting the configured set of features. Finally, using the neural network classifier model, misclassification over the given natural images and further background images are removed from the given natural image using the LSEG segmentation. Moreover, to increase the accuracy of object classification, SIC system involves the removal of the regions in the surrounding image. Performance evaluation reveals that the proposed SIC system reduces the occurrence of misclassification and reflects the exact shape of an object to approximately 10–15%.

  8. An Approach for Leukemia Classification Based on Cooperative Game Theory

    Directory of Open Access Journals (Sweden)

    Atefeh Torkaman

    2011-01-01

    Full Text Available Hematological malignancies are the types of cancer that affect blood, bone marrow and lymph nodes. As these tissues are naturally connected through the immune system, a disease affecting one of them will often affect the others as well. The hematological malignancies include leukemia, lymphoma, and multiple myeloma. Among them, leukemia is a serious malignancy that starts in blood tissues, especially the bone marrow, where the blood is made. Research shows that leukemia is one of the common cancers in the world. So, an emphasis on diagnostic techniques and best treatments would be able to provide better prognosis and survival for patients. In this paper, an automatic diagnosis recommender system for classifying leukemia based on cooperative game is presented. Throughout this research, we analyze the flow cytometry data toward the classification of leukemia into eight classes. We work on a real data set from different types of leukemia that has been collected at the Iran Blood Transfusion Organization (IBTO). Generally, the data set contains 400 samples taken from human leukemic bone marrow. This study deals with cooperative game used for classification according to different weights assigned to the markers. The proposed method is versatile as there are no constraints on what the input or output represent. This means that it can be used to classify a population according to their contributions. In other words, it applies equally to other groups of data. The experimental results show an accuracy rate of 93.12% for classification, compared to 90.16% for a decision tree (C4.5). The result demonstrates that cooperative game is very promising to be used directly for classification of leukemia as a part of Active Medical decision support system for interpretation of flow cytometry readout. This system could assist clinical hematologists to properly recognize different kinds of leukemia by preparing suggestions and this could improve the treatment

  9. Optimization models for cancer classification: extracting gene interaction information from microarray expression data.

    Science.gov (United States)

    Antonov, Alexey V; Tetko, Igor V; Mader, Michael T; Budczies, Jan; Mewes, Hans W

    2004-03-22

    Microarray data appear particularly useful to investigate mechanisms in cancer biology and represent one of the most powerful tools to uncover the genetic mechanisms causing loss of cell cycle control. Recently, several different methods to employ microarray data as a diagnostic tool in cancer classification have been proposed. These procedures take changes in the expression of particular genes into account but do not consider disruptions in certain gene interactions caused by the tumor. It is probable that some genes participating in tumor development do not change their expression level dramatically. Thus, they cannot be detected by simple classification approaches used previously. For these reasons, a classification procedure exploiting information related to changes in gene interactions is needed. We propose a MAximal MArgin Linear Programming (MAMA) method for the classification of tumor samples based on microarray data. This procedure detects groups of genes and constructs models (features) that strongly correlate with particular tumor types. The detected features include genes whose functional relations are changed for particular cancer types. The proposed method was tested on two publicly available datasets and demonstrated a prediction ability superior to previously employed classification schemes. The MAMA system was developed using the linear programming system LINDO http://www.lindo.com. A Perl script that specifies the optimization problem for this software is available upon request from the authors.

  10. A Discrete Wavelet Based Feature Extraction and Hybrid Classification Technique for Microarray Data Analysis

    Directory of Open Access Journals (Sweden)

    Jaison Bennet

    2014-01-01

    Full Text Available In the past, cancer classification by doctors and radiologists was based on morphological and clinical features and had limited diagnostic ability. The recent arrival of DNA microarray technology has enabled the concurrent monitoring of thousands of gene expressions on a single chip, which stimulates progress in cancer classification. In this paper, we propose a hybrid approach for microarray data classification based on nearest neighbor (KNN), naive Bayes, and support vector machine (SVM). Feature selection prior to classification plays a vital role, and a feature selection technique that combines the discrete wavelet transform (DWT) and the moving window technique (MWT) is used. The performance of the proposed method is compared with conventional classifiers such as support vector machine, nearest neighbor, and naive Bayes. Experiments have been conducted on both real and benchmark datasets, and the results indicate that the ensemble approach produces higher classification accuracy than conventional classifiers. This paper serves as an automated system for the classification of cancer that can be applied by doctors in real cases, which is a boon to the medical community. This work further reduces the misclassification of cancers, which is highly undesirable in cancer detection.
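
    The DWT step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which wavelet family or how many decomposition levels the authors used, so the single-level Haar transform below is an assumption chosen for simplicity, and the function name `haar_dwt` and the input values are hypothetical.

```python
import math

def haar_dwt(signal):
    """One level of the Haar discrete wavelet transform.

    Returns (approximation, detail) coefficient lists, each half the
    length of the even-length input. Assumption: Haar is used here only
    because it is the simplest wavelet; the paper may use another.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    # Pairwise scaled sums capture the coarse trend of the signal.
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    # Pairwise scaled differences capture the local variation.
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return approx, detail

expression = [4.0, 6.0, 10.0, 12.0]  # made-up expression values
approx, detail = haar_dwt(expression)
```

    The approximation coefficients form a half-length summary of the input, which is one plausible way such a transform can reduce the dimensionality of a gene expression vector before classification.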

  11. CAMUR: Knowledge extraction from RNA-seq cancer data through equivalent classification rules.

    Science.gov (United States)

    Cestarelli, Valerio; Fiscon, Giulia; Felici, Giovanni; Bertolazzi, Paola; Weitschek, Emanuel

    2016-03-01

    Nowadays, knowledge extraction methods from Next Generation Sequencing data are highly requested. In this work, we focus on RNA-seq gene expression analysis and specifically on case-control studies with rule-based supervised classification algorithms that build a model able to discriminate cases from controls. State of the art algorithms compute a single classification model that contains few features (genes). On the contrary, our goal is to elicit a higher amount of knowledge by computing many classification models, and therefore to identify most of the genes related to the predicted class. We propose CAMUR, a new method that extracts multiple and equivalent classification models. CAMUR iteratively computes a rule-based classification model, calculates the power set of the genes present in the rules, iteratively eliminates those combinations from the data set, and performs again the classification procedure until a stopping criterion is verified. CAMUR includes an ad-hoc knowledge repository (database) and a querying tool. We analyze three different types of RNA-seq data sets (Breast, Head and Neck, and Stomach Cancer) from The Cancer Genome Atlas (TCGA) and we validate CAMUR and its models also on non-TCGA data. Our experimental results show the efficacy of CAMUR: we obtain several reliable equivalent classification models, from which the most frequent genes, their relationships, and the relation with a particular cancer are deduced. Availability: dmb.iasi.cnr.it/camur.php. Contact: emanuel@iasi.cnr.it. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
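
    The power-set step described in this abstract can be sketched as follows. This is only an illustration of enumerating gene combinations, not CAMUR's actual code: the function name `gene_combinations` and the gene labels are hypothetical, and the empty set is left out on the assumption that removing no genes would not change the data set.

```python
from itertools import combinations

def gene_combinations(rule_genes):
    """Return every non-empty subset of the genes found in a rule set."""
    genes = sorted(rule_genes)  # fixed order so the output is reproducible
    return [
        frozenset(combo)
        for size in range(1, len(genes) + 1)
        for combo in combinations(genes, size)
    ]

# Placeholder labels standing in for genes extracted from a rule model.
subsets = gene_combinations({"geneA", "geneB", "geneC"})
```

    For n genes this yields 2^n - 1 combinations, so the step is only practical when each rule model contains few genes, which matches the abstract's remark that single models contain few features.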

  12. Genome-Based Taxonomic Classification of Bacteroidetes

    Science.gov (United States)

    Hahnke, Richard L.; Meier-Kolthoff, Jan P.; García-López, Marina; Mukherjee, Supratim; Huntemann, Marcel; Ivanova, Natalia N.; Woyke, Tanja; Kyrpides, Nikos C.; Klenk, Hans-Peter; Göker, Markus

    2016-01-01

    The bacterial phylum Bacteroidetes, characterized by a distinct gliding motility, occurs in a broad variety of ecosystems, habitats, life styles, and physiologies. Accordingly, taxonomic classification of the phylum, based on a limited number of features, proved difficult and controversial in the past, for example, when decisions were based on unresolved phylogenetic trees of the 16S rRNA gene sequence. Here we use a large collection of type-strain genomes from Bacteroidetes and closely related phyla for assessing their taxonomy based on the principles of phylogenetic classification and trees inferred from genome-scale data. No significant conflict between 16S rRNA gene and whole-genome phylogenetic analysis is found, whereas many but not all of the involved taxa are supported as monophyletic groups, particularly in the genome-scale trees. Phenotypic and phylogenomic features support the separation of Balneolaceae as new phylum Balneolaeota from Rhodothermaeota and of Saprospiraceae as new class Saprospiria from Chitinophagia. Epilithonimonas is nested within the older genus Chryseobacterium and without significant phenotypic differences; thus merging the two genera is proposed. Similarly, Vitellibacter is proposed to be included in Aequorivita. Flexibacter is confirmed as being heterogeneous and dissected, yielding six distinct genera. Hallella seregens is a later heterotypic synonym of Prevotella dentalis. Compared to values directly calculated from genome sequences, the G+C content mentioned in many species descriptions is too imprecise; moreover, corrected G+C content values have a significantly better fit to the phylogeny. Corresponding emendations of species descriptions are provided where necessary. Whereas most observed conflict with the current classification of Bacteroidetes is already visible in 16S rRNA gene trees, as expected whole-genome phylogenies are much better resolved. PMID:28066339

  13. Genome-based Taxonomic Classification of Bacteroidetes

    Directory of Open Access Journals (Sweden)

    Richard L. Hahnke

    2016-12-01

    Full Text Available The bacterial phylum Bacteroidetes, characterized by a distinct gliding motility, occurs in a broad variety of ecosystems, habitats, life styles and physiologies. Accordingly, taxonomic classification of the phylum, based on a limited number of features, proved difficult and controversial in the past, for example, when decisions were based on unresolved phylogenetic trees of the 16S rRNA gene sequence. Here we use a large collection of type-strain genomes from Bacteroidetes and closely related phyla for assessing their taxonomy based on the principles of phylogenetic classification and trees inferred from genome-scale data. No significant conflict between 16S rRNA gene and whole-genome phylogenetic analysis is found, whereas many but not all of the involved taxa are supported as monophyletic groups, particularly in the genome-scale trees. Phenotypic and phylogenomic features support the separation of Balneolaceae as new phylum Balneolaeota from Rhodothermaeota and of Saprospiraceae as new class Saprospiria from Chitinophagia. Epilithonimonas is nested within the older genus Chryseobacterium and without significant phenotypic differences; thus merging the two genera is proposed. Similarly, Vitellibacter is proposed to be included in Aequorivita. Flexibacter is confirmed as being heterogeneous and dissected, yielding six distinct genera. Hallella seregens is a later heterotypic synonym of Prevotella dentalis. Compared to values directly calculated from genome sequences, the G+C content mentioned in many species descriptions is too imprecise; moreover, corrected G+C content values have a significantly better fit to the phylogeny. Corresponding emendations of species descriptions are provided where necessary. Whereas most observed conflict with the current classification of Bacteroidetes is already visible in 16S rRNA gene trees, as expected whole-genome phylogenies are much better resolved.

  14. Changing Histopathological Diagnostics by Genome-Based Tumor Classification

    Directory of Open Access Journals (Sweden)

    Michael Kloth

    2014-05-01

    Full Text Available Traditionally, tumors are classified by histopathological criteria, i.e., based on their specific morphological appearances. Consequently, current therapeutic decisions in oncology are strongly influenced by histology rather than underlying molecular or genomic aberrations. The increase of information on molecular changes however, enabled by the Human Genome Project and the International Cancer Genome Consortium as well as the manifold advances in molecular biology and high-throughput sequencing techniques, inaugurated the integration of genomic information into disease classification. Furthermore, in some cases it became evident that former classifications needed major revision and adaption. Such adaptations are often required by understanding the pathogenesis of a disease from a specific molecular alteration, using this molecular driver for targeted and highly effective therapies. Altogether, reclassifications should lead to higher information content of the underlying diagnoses, reflecting their molecular pathogenesis and resulting in optimized and individual therapeutic decisions. The objective of this article is to summarize some particularly important examples of genome-based classification approaches and associated therapeutic concepts. In addition to reviewing disease specific markers, we focus on potentially therapeutic or predictive markers and the relevance of molecular diagnostics in disease monitoring.

  15. Changing histopathological diagnostics by genome-based tumor classification.

    Science.gov (United States)

    Kloth, Michael; Buettner, Reinhard

    2014-05-28

    Traditionally, tumors are classified by histopathological criteria, i.e., based on their specific morphological appearances. Consequently, current therapeutic decisions in oncology are strongly influenced by histology rather than underlying molecular or genomic aberrations. The increase of information on molecular changes however, enabled by the Human Genome Project and the International Cancer Genome Consortium as well as the manifold advances in molecular biology and high-throughput sequencing techniques, inaugurated the integration of genomic information into disease classification. Furthermore, in some cases it became evident that former classifications needed major revision and adaption. Such adaptations are often required by understanding the pathogenesis of a disease from a specific molecular alteration, using this molecular driver for targeted and highly effective therapies. Altogether, reclassifications should lead to higher information content of the underlying diagnoses, reflecting their molecular pathogenesis and resulting in optimized and individual therapeutic decisions. The objective of this article is to summarize some particularly important examples of genome-based classification approaches and associated therapeutic concepts. In addition to reviewing disease specific markers, we focus on potentially therapeutic or predictive markers and the relevance of molecular diagnostics in disease monitoring.

  16. Cirrhosis Classification Based on Texture Classification of Random Features

    Directory of Open Access Journals (Sweden)

    Hui Liu

    2014-01-01

    Full Text Available Accurate staging of hepatic cirrhosis is important in investigating the cause and slowing down the effects of cirrhosis. Computer-aided diagnosis (CAD) can provide doctors with an alternative second opinion and help them choose a specific treatment based on an accurate cirrhosis stage. MRI has many advantages, including high resolution for soft tissue, no radiation, and multiparameter imaging modalities. In this paper, multisequence MRIs, including T1-weighted, T2-weighted, arterial, portal venous, and equilibrium phase, are therefore applied. However, CAD does not yet meet the clinical needs of cirrhosis, and few researchers are concerned with it at present. Cirrhosis is characterized by the presence of widespread fibrosis and regenerative nodules in the liver, leading to different texture patterns at different stages. Extracting texture features is therefore the primary task. Compared with typical gray level cooccurrence matrix (GLCM) features, texture classification from random features provides an effective way, and we adopt it and propose CCTCRF for triple classification (normal, early, and middle and advanced stage). CCTCRF needs no strong assumptions except the sparse character of the image, contains sufficient texture information, involves a concise and effective process, and makes case decisions with high accuracy. Experimental results also illustrate the satisfying performance, and they are compared with a typical NN with GLCM.

  17. A text classification algorithm based on feature weighting

    Science.gov (United States)

    Yang, Han; Cui, Honggang; Tang, Hao

    2017-08-01

    Text classification comes down to matching the data to be classified according to certain characteristics. A complete match is generally not possible, so the best matching result must be selected to complete the classification. To address the shortcomings of the traditional KNN text classification algorithm, a KNN text classification algorithm based on feature weighting is proposed. The algorithm considers the contribution of each dimension to the classification, assigns different weights to different features, strengthens the role of important features, and improves the classification accuracy of the algorithm.
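
    A minimal sketch of feature-weighted KNN may help make the idea concrete. The abstract does not describe how the authors derive the weights or which distance they use, so the weighted Euclidean distance, the function name `weighted_knn_predict`, and the toy data below are all assumptions for illustration.

```python
import math
from collections import Counter

def weighted_knn_predict(train_x, train_y, query, weights, k=3):
    """Classify `query` by majority vote among its k nearest neighbours,
    measuring closeness with a feature-weighted Euclidean distance."""
    def distance(a, b):
        # Each squared difference is scaled by that feature's weight.
        return math.sqrt(sum(w * (p - q) ** 2 for w, p, q in zip(weights, a, b)))

    ranked = sorted(zip(train_x, train_y), key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy two-feature documents (made-up values and labels).
train_x = [[1.0, 0.0], [0.9, 1.0], [0.0, 0.1], [0.1, 0.9]]
train_y = ["sports", "sports", "finance", "finance"]
query = [0.8, 0.95]

by_first_feature = weighted_knn_predict(train_x, train_y, query, weights=[1.0, 0.0])
by_second_feature = weighted_knn_predict(train_x, train_y, query, weights=[0.0, 1.0])
```

    The same query receives a different label depending on which feature is weighted, which is why the choice of weights matters: a feature that separates the classes well should count for more than one that does not.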

  18. Research on Classification of Chinese Text Data Based on SVM

    Science.gov (United States)

    Lin, Yuan; Yu, Hongzhi; Wan, Fucheng; Xu, Tao

    2017-09-01

    Data mining has important application value in today’s industry and academia. Text classification is a very important technology in data mining. At present, there are many mature algorithms for text classification. KNN, NB, AB, SVM, decision tree and other classification methods all show good classification performance. The Support Vector Machine (SVM) classification method is a good classifier in machine learning research. This paper studies the classification performance of the SVM method on Chinese text data, uses the support vector machine method to classify Chinese text, and aims to combine academic research with practical application.

  19. Brain extraction based on locally linear representation-based classification.

    Science.gov (United States)

    Huang, Meiyan; Yang, Wei; Jiang, Jun; Wu, Yao; Zhang, Yu; Chen, Wufan; Feng, Qianjin

    2014-05-15

    Brain extraction is an important procedure in brain image analysis. Although numerous brain extraction methods have been presented, enhancing brain extraction methods remains challenging because brain MRI images exhibit complex characteristics, such as anatomical variability and intensity differences across different sequences and scanners. To address this problem, we present a Locally Linear Representation-based Classification (LLRC) method for brain extraction. A novel classification framework is derived by introducing the locally linear representation to the classical classification model. Under this classification framework, a common label fusion approach can be considered as a special case and thoroughly interpreted. Locality is important to calculate fusion weights for LLRC; this factor is also considered to determine that Local Anchor Embedding is more applicable in solving locally linear coefficients compared with other linear representation approaches. Moreover, LLRC supplies a way to learn the optimal classification scores of the training samples in the dictionary to obtain accurate classification. The International Consortium for Brain Mapping and the Alzheimer's Disease Neuroimaging Initiative databases were used to build a training dataset containing 70 scans. To evaluate the proposed method, we used four publicly available datasets (IBSR1, IBSR2, LPBA40, and ADNI3T, with a total of 241 scans). Experimental results demonstrate that the proposed method outperforms the four common brain extraction methods (BET, BSE, GCUT, and ROBEX), and is comparable to the performance of BEaST, while being more accurate on some datasets compared with BEaST. Copyright © 2014 Elsevier Inc. All rights reserved.

  20. Contextual segment-based classification of airborne laser scanner data

    NARCIS (Netherlands)

    Vosselman, George; Coenen, Maximilian; Rottensteiner, Franz

    2017-01-01

    Classification of point clouds is needed as a first step in the extraction of various types of geo-information from point clouds. We present a new approach to contextual classification of segmented airborne laser scanning data. Potential advantages of segment-based classification are easily offset

  1. Feature Subset Selection for Cancer Classification Using Weight Local Modularity.

    Science.gov (United States)

    Zhao, Guodong; Wu, Yan

    2016-10-05

    Microarrays have recently become an important tool for profiling the global gene expression patterns of tissues. Gene selection is a popular technique for cancer classification that aims to identify, from thousands of genes that may contribute to the occurrence of cancers, a small number of informative genes that yield high predictive accuracy. This technique has been extensively studied in recent years. This study develops a novel feature selection (FS) method for gene subset selection, called WLMGS, that utilizes the Weight Local Modularity (WLM) in a complex network. In the proposed method, the discriminative power of a gene subset is evaluated by using the weight local modularity of a weighted sample graph in the gene subset, where the intra-class distance is small and the inter-class distance is large. A higher local modularity of the gene subset corresponds to greater discriminative power of the gene subset. With a forward search strategy, a more informative gene subset can be selected as a group for the classification process. Computational experiments show that the proposed algorithm can select a small subset of predictive genes as a group while preserving classification accuracy.

  2. Colorectal Cancer Classification and Cell Heterogeneity: A Systems Oncology Approach

    Directory of Open Access Journals (Sweden)

    Moisés Blanco-Calvo

    2015-06-01

    Full Text Available Colorectal cancer is a heterogeneous disease that manifests through diverse clinical scenarios. For many years, our knowledge about the variability of colorectal tumors was limited to the histopathological analysis from which generic classifications associated with different clinical expectations are derived. However, we are now beginning to understand that strong genetic and biological heterogeneity underlies the intense pathological and clinical variability of these tumors. Thus, with the increasing available information on inter-tumor and intra-tumor heterogeneity, the classical pathological approach is being displaced in favor of novel molecular classifications. In the present article, we summarize the most relevant proposals of molecular classifications obtained from the analysis of colorectal tumors using powerful high throughput techniques and devices. We also discuss the role that cancer systems biology may play in the integration and interpretation of the large amount of data generated and the challenges to be addressed in the future development of precision oncology. In addition, we review the current state of implementation of these novel tools in the pathological laboratory and in clinical practice.

  3. [New molecular classification of colorectal cancer, pancreatic cancer and stomach cancer: Towards "à la carte" treatment?].

    Science.gov (United States)

    Dreyer, Chantal; Afchain, Pauline; Trouilloud, Isabelle; André, Thierry

    2016-01-01

    This review reports three recently published molecular classifications of the three main gastro-intestinal cancers: gastric, pancreatic and colorectal adenocarcinoma. In colorectal adenocarcinoma, 6 independent classifications were combined to finally retain 4 molecular sub-groups, Consensus Molecular Subtypes (CMS 1-4), linked to various clinical, molecular and survival data: CMS1 (14%: MSI with immune activation); CMS2 (37%: canonical with epithelial differentiation and activation of the WNT/MYC pathway); CMS3 (13%: metabolic with epithelial differentiation and RAS mutation); CMS4 (23%: mesenchymal with activation of TGFβ pathway and angiogenesis with stromal invasion). In gastric adenocarcinoma, 4 groups were established: subtype "EBV" (9%, high frequency of PIK3CA mutations, hypermethylation and amplification of JAK2, PD-L1 and PD-L2), subtype "MSI" (22%, high rate of mutation), subtype "genomically stable tumor" (20%, diffuse histology type and mutations of RAS and genes encoding integrins and adhesion proteins including CDH1) and subtype "tumors with chromosomal instability" (50%, intestinal type, aneuploidy and receptor tyrosine kinase amplification). In pancreatic adenocarcinomas, a classification in four sub-groups has been proposed, stable subtype (20%, aneuploidy), locally rearranged subtype (30%, focal event on one or two chromosomes), scattered subtype (36%,200 structural variation events, defects in DNA maintenance). Although currently far from patient care, these classifications open the way to "à la carte" treatment depending on molecular biology. Copyright © 2016 Société Française du Cancer. Published by Elsevier Masson SAS. All rights reserved.

  4. Alternative Polyadenylation Patterns for Novel Gene Discovery and Classification in Cancer

    Directory of Open Access Journals (Sweden)

    Oguzhan Begik

    2017-07-01

    Full Text Available Certain aspects of diagnosis, prognosis, and treatment of cancer patients are still important challenges to be addressed. Therefore, we propose a pipeline to uncover patterns of alternative polyadenylation (APA), a hidden complexity in cancer transcriptomes, to further accelerate efforts to discover novel cancer genes and pathways. Here, we analyzed expression data for 1045 cancer patients and found a significant shift in usage of poly(A) signals in common tumor types (breast, colon, lung, prostate, gastric, and ovarian) compared to normal tissues. Using machine-learning techniques, we further defined specific subsets of APA events to efficiently classify cancer types. Furthermore, APA patterns were associated with altered protein levels in patients, revealed by antibody-based profiling data, suggesting functional significance. Overall, our study offers a computational approach for use of APA in novel gene discovery and classification in common tumor types, with important implications in basic research, biomarker discovery, and precision medicine approaches.

  5. Comparison of hand-craft feature based SVM and CNN based deep learning framework for automatic polyp classification.

    Science.gov (United States)

    Younghak Shin; Balasingham, Ilangko

    2017-07-01

    Colonoscopy is a standard method for screening polyps by highly trained physicians. Polyps missed during colonoscopy are a potential risk factor for colorectal cancer. In this study, we investigate an automatic polyp classification framework. We aim to compare two different approaches: a hand-crafted feature method and a convolutional neural network (CNN) based deep learning method. Combined shape and color features are used for hand-crafted feature extraction, and the support vector machine (SVM) method is adopted for classification. For the CNN approach, a deep learning framework based on three convolution and pooling stages is used for classification. The proposed framework is evaluated using three public polyp databases. The experimental results show that the CNN based deep learning framework achieves better classification performance than the hand-crafted feature based methods. It achieves over 90% classification accuracy, sensitivity, specificity and precision.
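
    The four metrics reported in this abstract can be computed from binary predictions as in the sketch below. The function name `binary_metrics` and the labels are hypothetical; the example data are made up and do not reproduce the paper's results.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, sensitivity, specificity and precision for a
    binary classifier from paired true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(numerator, denominator):
        # Return None instead of dividing by zero when a class is absent.
        return numerator / denominator if denominator else None

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),  # share of true polyps detected
        "specificity": ratio(tn, tn + fp),  # share of non-polyps rejected
        "precision": ratio(tp, tp + fp),    # share of detections that are real
    }

# Made-up labels: 1 = polyp, 0 = no polyp.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```

    Reporting all four values together matters for screening tasks, because accuracy alone can look high when one class dominates the data.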

  6. Applicability of the Proposed Japanese Model for the Classification of Gastric Cancer Location: The "PROTRADIST" Retrospective Study.

    Science.gov (United States)

    Marano, Luigi; Petrillo, Marianna; Pezzella, Modestino; Patriti, Alberto; Braccio, Bartolomeo; Esposito, Giuseppe; Grassia, Michele; Romano, Angela; Torelli, Francesco; De Luca, Raffaele; Fabozzi, Alessio; Falco, Giuseppe; Di Martino, Natale

    2017-06-01

    The extension of lymphadenectomy for surgical treatment of gastric cancer remains discordant among European and Japanese surgeons. Kinami et al. (Kinami S, Fujimura T, Ojima E, et al. PTD classification: proposal for a new classification of gastric cancer location based on physiological lymphatic flow. Int. J. Clin. Oncol. 2008;13:320-329) proposed a new experimental classification, the "Proximal zone, Transitional zone, Distal zone" (PTD) classification, based on the physiological lymphatic flow of gastric cancer site. The aim of the present retrospective study is to assess the applicability of PTD Japanese model in gastric cancer patients of our Western surgical department. Two groups of patients with histologically documented adenocarcinoma of the stomach were retrospectively obtained: In the first group were categorized 89 patients with T1a-T1b tumor invasion; and in the second group were 157 patients with T2-T3 category. The data collected were then categorized according to the PTD classification. In the T1a-T1b group there were no lymph node metastases within the r-GA or r-GEA compartments for tumors located in the P portion, and similarly there were no lymphatic metastases within the l-GEA or p-GA compartments for tumors located in the D portion. On the contrary, in the T2-T3 group the lymph node metastases presented a diffused spreading with no statistical significance between the two classification models. Our results show that the PTD classification based on physiological lymphatic flow of the gastric cancer site is a more physiological and clinical version than the Upper, Medium And Lower classification. It represents a valuable and applicable model of cancer location that could be a guide to a tailored surgical approach in Italian patients with neoplasm confined to submucosa. Nevertheless, in order to confirm our findings, larger and prospective studies are needed.

  7. Call for a Computer-Aided Cancer Detection and Classification Research Initiative in Oman.

    Science.gov (United States)

    Mirzal, Andri; Chaudhry, Shafique Ahmad

    2016-01-01

    Cancer is a major health problem in Oman. It is reported that cancer incidence in Oman is the second highest after Saudi Arabia among Gulf Cooperation Council countries. Based on GLOBOCAN estimates, Oman is predicted to face an almost two-fold increase in cancer incidence in the period 2008-2020. However, cancer research in Oman is still in its infancy. This is due to the fact that medical institutions and infrastructure that play central roles in data collection and analysis are relatively new developments in Oman. We believe the country requires an organized plan and efforts to promote local cancer research. In this paper, we discuss current research progress in cancer diagnosis using machine learning techniques to optimize computer aided cancer detection and classification (CAD). We specifically discuss CAD using two major medical data, i.e., medical imaging and microarray gene expression profiling, because medical imaging like mammography, MRI, and PET have been widely used in Oman for assisting radiologists in early cancer diagnosis and microarray data have been proven to be a reliable source for differential diagnosis. We also discuss future cancer research directions and benefits to Oman economy for entering the cancer research and treatment business as it is a multi-billion dollar industry worldwide.

  8. Graph-based Methods for Orbit Classification

    Energy Technology Data Exchange (ETDEWEB)

    Bagherjeiran, A; Kamath, C

    2005-09-29

    An important step in the quest for low-cost fusion power is the ability to perform and analyze experiments in prototype fusion reactors. One of the tasks in the analysis of experimental data is the classification of orbits in Poincare plots. These plots are generated by the particles in a fusion reactor as they move within the toroidal device. In this paper, we describe the use of graph-based methods to extract features from orbits. These features are then used to classify the orbits into several categories. Our results show that existing machine learning algorithms are successful in classifying orbits with few points, a situation which can arise in data from experiments.

  9. Superpixel-based classification of gastric chromoendoscopy images

    Science.gov (United States)

    Boschetto, Davide; Grisan, Enrico

    2017-03-01

    Chromoendoscopy (CH) is a gastroenterology imaging modality that involves the staining of tissues with methylene blue, which reacts with the internal walls of the gastrointestinal tract, improving the visual contrast in mucosal surfaces and thus enhancing a doctor's ability to screen precancerous lesions or early cancer. This technique helps identify areas that can be targeted for biopsy or treatment, and in this work we focus on gastric cancer detection. Gastric chromoendoscopy for cancer detection has several taxonomies available, one of which classifies CH images into three classes (normal, metaplasia, dysplasia) based on color, shape and regularity of pit patterns. Computer-assisted diagnosis is desirable to help improve the reliability of tissue classification and abnormality detection. However, traditional computer vision methodologies, mainly segmentation, do not translate well to the specific visual characteristics of a gastroenterology imaging scenario. We propose a first unsupervised segmentation via superpixels, which groups pixels into perceptually meaningful atomic regions that replace the rigid structure of the pixel grid. For each superpixel, a set of features is extracted and then fed to a random forest based classifier, which computes a model used to predict the class of each superpixel. The average general accuracy of our model is 92.05% in the pixel domain (86.62% in the superpixel domain), while detection accuracies on the normal and abnormal classes are 85.71% and 95%, respectively. Finally, the class of the whole image can be predicted through a majority vote over the predicted classes of its superpixels.
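
    The final majority-vote step can be sketched in a few lines. The function name `image_class_by_majority` is hypothetical, and the abstract does not say how ties are broken, so the behavior noted in the comment is an assumption of this sketch.

```python
from collections import Counter

def image_class_by_majority(superpixel_labels):
    """Predict the image-level class as the most frequent superpixel class."""
    if not superpixel_labels:
        raise ValueError("at least one superpixel label is required")
    # On a tie, Counter.most_common keeps the label that was seen first.
    return Counter(superpixel_labels).most_common(1)[0][0]

# Made-up per-superpixel predictions for one image.
labels = ["normal", "dysplasia", "dysplasia", "metaplasia", "dysplasia"]
image_class = image_class_by_majority(labels)
```

    A plain vote treats every superpixel equally; weighting each vote by superpixel area or classifier confidence would be a natural variation, though the abstract does not mention one.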

  10. Classification models based on the level of metals in hair and nails of laryngeal cancer patients: diagnosis support or rather speculation?

    Science.gov (United States)

    Golasik, Magdalena; Jawień, Wojciech; Przybyłowicz, Agnieszka; Szyfter, Witold; Herman, Małgorzata; Golusiński, Wojciech; Florek, Ewa; Piekoszewski, Wojciech

    2015-03-01

    The etiology of cancer is complex, and the disturbances in toxic and essential metals homeostasis are among many of the factors that lead to the development of malignancy. The aim of this study is to investigate the relationship between cancer risk and element status as well as cancer risk and external factors, such as diet, smoking and drinking habits, in order to support diagnosis of cancer. The samples of hair and nails obtained from patients with larynx cancer and healthy subjects were analyzed. Essential elements (Ca, Cr, Mg, Zn, Cu, Mn, and Fe), besides toxic metals (Cd, Co, and Pb), were determined using inductively coupled plasma atomic emission spectrometry (ICP-OES) and mass spectrometry (ICP-MS) techniques. The concentration of essential elements was from 1.5- (Zn) to 4.7-fold (Fe) higher in hair and from 2.4- to 3.3-fold higher in the nails of the control group compared to the patients, while the opposite trend was observed for the heavy metals. The differences between two groups in the level of metals (except for Zn) were statistically significant (p cancer with metals and other factors was evaluated using various statistical methods, for which the best predictions were obtained using logistic regression, artificial neural networks and canonical discriminant analysis. The classifiers constructed using the data from a survey of diet and lifestyle, and analysis of elements in hair and nails, can be useful tools for estimating cancer risk and early screening of the disease.

  11. Microscopic Image Processing Of Automated Detection And Classification For Human Cancer Cell

    Directory of Open Access Journals (Sweden)

    Laith Muayyad Abdul-Hameed Al-Hayali

    2015-08-01

    Automated detection of human cancer cells is one of the most effective applications of image processing and has attracted great attention in recent years. In this study we propose an automated detection system for human cancer cells, based on breast cancer cells. The study was conducted on a set of Fine Needle Aspiration (FNA) biopsy microscopic images obtained from the Pathology Center, Faculty of Medicine, Mansoura University Hospital, Egypt, made up of 72 microscope image samples of benign cells and 72 microscope image samples of malignant cells. The purpose of this study is to detect and classify the benign and malignant cells in the breast biopsy. The images undergo a series of pre-processing steps, which include resizing (e.g., to 1024×1024 or 512×512), noise removal through a median filter, and contrast enhancement through unsharp masking and intensity adjustment. The system detects breast cancer cells using clustering-based segmentation (K-means clustering, fuzzy C-means clustering) and region-based segmentation (watershed). Shape, texture and color features are extracted for detection. The results show a high detection rate for breast cancer cell images, whether benign or malignant. Finally, the classification stage uses Support Vector Machine (SVM), K-Nearest Neighbors (K-NN) and Back-Propagation Neural Networks (BPNNs). The best final classification accuracy is 97.22 for SVM and 98.61 for K-NN and BPNNs.

  12. Visualization and tissue classification of human breast cancer images using ultrahigh-resolution OCT (Conference Presentation)

    Science.gov (United States)

    Yao, Xinwen; Gan, Yu; Chang, Ernest W.; Hibshoosh, Hanina; Feldman, Sheldon; Hendon, Christine P.

    2017-02-01

    We employed a home-built ultrahigh resolution (UHR) OCT system at 800 nm to image human breast cancer samples ex vivo. The system has an axial resolution of 2.72 µm and a lateral resolution of 5.52 µm with an extended imaging range of 1.78 mm. Over 900 UHR OCT volumes were generated on specimens from 23 breast cancer cases. With better spatial resolution, detailed structures in the breast tissue were better defined. Different types of breast cancer as well as healthy breast tissue can be well delineated from the UHR OCT images. To quantitatively evaluate the advantages of UHR OCT imaging of breast cancer, features derived from OCT intensity images were used as inputs to a machine learning model, the relevance vector machine. A trained machine learning model was employed to evaluate the performance of tissue classification based on UHR OCT images for differentiating tissue types in the breast samples, including adipose tissue, healthy stroma and cancerous regions. For adipose tissue, grid-based local features were extracted from OCT intensity data, including standard deviation, entropy, and homogeneity. We showed that it was possible to enhance the classification performance on distinguishing fat tissue from non-fat tissue by using the UHR images when compared with the results based on OCT images from a commercial 1300 nm OCT system. For invasive ductal carcinoma (IDC) and normal stroma differentiation, the classification was based on frame-based features that portray signal penetration depth and tissue reflectivity. The confusion matrix indicated a sensitivity of 97.5% and a specificity of 77.8%.

  13. Conformational SERS Classification of K-Ras Point Mutations for Cancer Diagnostics.

    Science.gov (United States)

    Morla-Folch, Judit; Gisbert-Quilis, Patricia; Masetti, Matteo; Garcia-Rico, Eduardo; Alvarez-Puebla, Ramon A; Guerrini, Luca

    2017-02-20

    Point mutations in Ras oncogenes are routinely screened for diagnostics and treatment of tumors (especially in colorectal cancer). Here, we develop an optical approach based on direct SERS coupled with chemometrics for the study of the specific conformations that single-point mutations impose on a relatively large fragment of the K-Ras gene (141 nucleobases). Results obtained offer the unambiguous classification of different mutations providing a potentially useful insight for diagnostics and treatment of cancer in a sensitive, fast, direct and inexpensive manner. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  14. Classification of Bladder Cancer Patients via Penalized Linear Discriminant Analysis

    Science.gov (United States)

    Raeisi Shahraki, Hadi; Bemani, Peyman; Jalali, Maryam

    2017-05-01

    Objectives: In order to identify genes with the greatest contribution to bladder cancer, we proposed a sparse model that best discriminates bladder cancer patients from other patients. Methods: In a cross-sectional study, 22 genes with a key role in most cancers were considered in 21 bladder cancer patients and 14 participants of the same age (± 3 years) without bladder cancer in Shiraz city, Southern Iran. Real-time PCR was carried out using SYBR Green, and for each of the 22 target genes 2^(-ΔCt) was reported as a quantitative index of gene expression. We determined the most influential genes for the discriminant vector by applying penalized linear discriminant analysis with LASSO penalties. All analyses were performed using SPSS version 18 and the penalized LDA package in R 3.1.3. Results: Using penalized linear discriminant analysis led to elimination of 13 less important genes. Considering the simultaneous effects of 22 genes with important influence on many cancers, it was found that TGFβ, IL12A, Her2, MDM2, CTLA-4 and IL-23 genes had the greatest contribution in classifying bladder cancer patients with the penalized linear discriminant vector. The receiver operating characteristic (ROC) curve revealed that the proposed vector had good performance with minimal (only 3) misclassifications. The area under the curve (AUC) of our proposed test was 96% (95% CI: 83%-100%) and sensitivity, specificity, positive and negative predictive values were 90.5%, 85.7%, 90.5% and 85.7%, respectively. Conclusions: The penalized discriminant method can be considered appropriate for classifying bladder cancer cases and searching for important biomarkers.
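
    As a reading aid, the four reported rates can be related to a 2×2 confusion table. The counts below are an assumption, not data from the abstract: they are simply one table for 21 patients and 14 controls that reproduces the reported percentages. That table has four off-diagonal cases, whereas the abstract mentions three misclassifications, so the authors' actual table may differ.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard rates derived from a 2x2 confusion table."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts: 21 patients (19 detected), 14 controls (12 cleared).
metrics = diagnostic_metrics(tp=19, fn=2, tn=12, fp=2)
for name, value in metrics.items():
    print(f"{name}: {value * 100:.1f}%")
```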

  15. Classification of cancer-related death certificates using machine learning.

    Science.gov (United States)

    Butt, Luke; Zuccon, Guido; Nguyen, Anthony; Bergheim, Anton; Grayson, Narelle

    2013-01-01

    Cancer monitoring and prevention relies on the critical aspect of timely notification of cancer cases. However, the abstraction and classification of cancer from the free text of pathology reports and other relevant documents, such as death certificates, are complex and time-consuming activities. In this paper, approaches for the automatic detection of notifiable cancer cases as the cause of death from free-text death certificates supplied to Cancer Registries are investigated. A number of machine learning classifiers were studied. Features were extracted using natural language techniques and the Medtex toolkit. The features encompassed stemmed words, bi-grams, and concepts from the SNOMED CT medical terminology. The baseline consisted of a keyword spotter using keywords extracted from the long description of ICD-10 cancer related codes. Death certificates with notifiable cancer listed as the cause of death can be effectively identified with the methods studied in this paper. A Support Vector Machine (SVM) classifier achieved best performance with an overall F-measure of 0.9866 when evaluated on a set of 5,000 free-text death certificates using the token stem feature set. The SNOMED CT concept plus token stem feature set reached the lowest variance (0.0032) and false negative rate (0.0297) while achieving an F-measure of 0.9864. The SVM classifier accounts for the first 18 of the top 40 evaluated runs, and is the most robust classifier with a variance of 0.001141, half the variance of the other classifiers. The selection of features had the greatest influence on the performance of the classifiers, although the type of classifier employed also affects performance. In contrast, the feature weighting schema had a negligible effect on performance. Specifically, it is found that stemmed tokens with or without SNOMED CT concepts create the most effective feature when combined with an SVM classifier.

  16. Classification of Cancer-related Death Certificates using Machine Learning

    Directory of Open Access Journals (Sweden)

    Luke Butt

    2013-05-01

    Background: Cancer monitoring and prevention relies on the critical aspect of timely notification of cancer cases. However, the abstraction and classification of cancer from the free-text of pathology reports and other relevant documents, such as death certificates, exist as complex and time-consuming activities. Aims: In this paper, approaches for the automatic detection of notifiable cancer cases as the cause of death from free-text death certificates supplied to Cancer Registries are investigated. Method: A number of machine learning classifiers were studied. Features were extracted using natural language techniques and the Medtex toolkit. The numerous features encompassed stemmed words, bi-grams, and concepts from the SNOMED CT medical terminology. The baseline consisted of a keyword spotter using keywords extracted from the long description of ICD-10 cancer related codes. Results: Death certificates with notifiable cancer listed as the cause of death can be effectively identified with the methods studied in this paper. A Support Vector Machine (SVM) classifier achieved best performance with an overall F-measure of 0.9866 when evaluated on a set of 5,000 free-text death certificates using the token stem feature set. The SNOMED CT concept plus token stem feature set reached the lowest variance (0.0032) and false negative rate (0.0297) while achieving an F-measure of 0.9864. The SVM classifier accounts for the first 18 of the top 40 evaluated runs, and entails the most robust classifier with a variance of 0.001141, half the variance of the other classifiers. Conclusion: The selection of features significantly produced the most influences on the performance of the classifiers, although the type of classifier employed also affects performance. In contrast, the feature weighting schema created a negligible effect on performance. Specifically, it is found that stemmed tokens with or without SNOMED CT concepts create the most effective feature when combined with

  17. Les cancers de la cavité buccale et de l’oropharynx dans le monde : incidence internationale et classification TNM dans les registres du cancer

    OpenAIRE

    De Camargo Cancela, Marianna

    2010-01-01

    Oral cavity and oropharynx cancers: International incidence and TNM classification in population-based cancer registries. The aim of this work was to know and to evaluate the epidemiological patterns of oral cavity and oropharynx cancers. These topographies share some common risk factors and they are often grouped in epidemiological studies. However, the implication of the human papilloma virus in oropharyngeal tumors led us to provide incidence rates according to the anatomical classificat...

  18. Characteristics of Differently Located Colorectal Cancers Support Proximal and Distal Classification: A Population-Based Study of 57,847 Patients.

    Directory of Open Access Journals (Sweden)

    Jiao Yang

    It has been suggested that colorectal cancer be regarded as several subgroups defined according to tumor location rather than as a single entity. The current study aimed to identify the most useful method for grouping colorectal cancer by tumor location according to both baseline and survival characteristics. Cases of pathologically confirmed colorectal adenocarcinoma diagnosed from 2000 to 2012 were identified from the Surveillance, Epidemiology, and End Results database and categorized into three groups: right colon cancer (RCC), left colon cancer (LCC), and rectal cancer (ReC). Adjusted hazard ratios for known predictors of disease-specific survival (DSS) in colorectal cancer were obtained using a Cox proportional hazards regression model. The study included 57847 patients: 43.5% with RCC, 37.7% with LCC, and 18.8% with ReC. Compared with LCC and ReC, RCC was more likely to affect older patients and women, and to be at advanced stage, poorly differentiated or undifferentiated, and mucinous. Patients with LCC or ReC had better DSS than those with RCC in subgroups including stage III or IV disease, age ≤70 years and non-mucinous adenocarcinoma. Conversely, patients with LCC or ReC had worse DSS than those with RCC in subgroups including age >70 years and mucinous adenocarcinoma. RCC differed from both LCC and ReC in several clinicopathologic characteristics and in DSS. It seems reasonable to group colorectal cancer into right-sided (i.e., proximal) and left-sided (i.e., distal) ones.

  19. Classification of treatment-related mortality in children with cancer

    DEFF Research Database (Denmark)

    Alexander, Sarah; Pole, Jason D; Gibson, Paul

    2015-01-01

    Treatment-related mortality is an important outcome in paediatric cancer clinical trials. An international group of experts in supportive care in paediatric cancer developed a consensus-based definition of treatment-related mortality and a cause-of-death attribution system. The reliability and va...

  20. Cross-Disciplinary Analysis of Lymph Node Classification in Lung Cancer on CT Scanning.

    Science.gov (United States)

    El-Sherief, Ahmed H; Lau, Charles T; Obuchowski, Nancy A; Mehta, Atul C; Rice, Thomas W; Blackstone, Eugene H

    2017-04-01

    Accurate and consistent regional lymph node classification is an important element in the staging and multidisciplinary management of lung cancer. Regional lymph node definition sets (lymph node maps) have been created to standardize regional lymph node classification. In 2009, the International Association for the Study of Lung Cancer (IASLC) introduced a lymph node map to supersede all preexisting lymph node maps. Our aim was to study if and how lung cancer specialists apply the IASLC lymph node map when classifying thoracic lymph nodes encountered on CT scans during lung cancer staging. From April 2013 through July 2013, invitations were distributed to all members of the Fleischner Society, Society of Thoracic Radiology, General Thoracic Surgical Club, and the American Association of Bronchology and Interventional Pulmonology to participate in an anonymous online image-based and text-based 20-question survey regarding lymph node classification for lung cancer staging on CT imaging. Three hundred thirty-seven people responded (approximately 25% participation). Respondents consisted of self-reported thoracic radiologists (n = 158), thoracic surgeons (n = 102), and pulmonologists who perform endobronchial ultrasonography (n = 77). Half of the respondents (50%; 95% CI, 44%-55%) reported using the IASLC lymph node map in daily practice, with no significant differences between subspecialties. A disparity was observed between the IASLC definition sets and their interpretation and application on CT scans, in particular for lymph nodes near the thoracic inlet, anterior to the trachea, anterior to the tracheal bifurcation, near the ligamentum arteriosum, between the bronchus intermedius and esophagus, in the internal mammary space, and adjacent to the heart. Use of older lymph node maps and inconsistencies in interpretation and application of definitions in the IASLC lymph node map may potentially lead to misclassification of stage and suboptimal management of lung

  1. An Efficient Audio Classification Approach Based on Support Vector Machines

    OpenAIRE

    Lhoucine Bahatti; Omar Bouattane; My Elhoussine Echhibat; Mohamed Hicham Zaggaf

    2016-01-01

    In order to achieve an audio classification aimed to identify the composer, the use of adequate and relevant features is important to improve performance especially when the classification algorithm is based on support vector machines. As opposed to conventional approaches that often use timbral features based on a time-frequency representation of the musical signal using constant window, this paper deals with a new audio classification method which improves the features extraction according ...

  2. Structure-Based Algorithms for Microvessel Classification

    KAUST Repository

    Smith, Amy F.

    2015-02-01

    © 2014 The Authors. Microcirculation published by John Wiley & Sons Ltd. Objective: Recent developments in high-resolution imaging techniques have enabled digital reconstruction of three-dimensional sections of microvascular networks down to the capillary scale. To better interpret these large data sets, our goal is to distinguish branching trees of arterioles and venules from capillaries. Methods: Two novel algorithms are presented for classifying vessels in microvascular anatomical data sets without requiring flow information. The algorithms are compared with a classification based on observed flow directions (considered the gold standard), and with an existing resistance-based method that relies only on structural data. Results: The first algorithm, developed for networks with one arteriolar and one venular tree, performs well in identifying arterioles and venules and is robust to parameter changes, but incorrectly labels a significant number of capillaries as arterioles or venules. The second algorithm, developed for networks with multiple inlets and outlets, correctly identifies more arterioles and venules, but is more sensitive to parameter changes. Conclusions: The algorithms presented here can be used to classify microvessels in large microvascular data sets lacking flow information. This provides a basis for analyzing the distinct geometrical properties and modelling the functional behavior of arterioles, capillaries, and venules.

  3. A Dataset for Breast Cancer Histopathological Image Classification.

    Science.gov (United States)

    Spanhol, Fabio A; Oliveira, Luiz S; Petitjean, Caroline; Heutte, Laurent

    2016-07-01

    Today, medical image analysis papers require solid experiments to prove the usefulness of proposed methods. However, experiments are often performed on data selected by the researchers, which may come from different institutions, scanners, and populations. Different evaluation measures may be used, making it difficult to compare the methods. In this paper, we introduce a dataset of 7909 breast cancer histopathology images acquired on 82 patients, which is now publicly available from http://web.inf.ufpr.br/vri/breast-cancer-database. The dataset includes both benign and malignant images. The task associated with this dataset is the automated classification of these images in two classes, which would be a valuable computer-aided diagnosis tool for the clinician. In order to assess the difficulty of this task, we show some preliminary results obtained with state-of-the-art image classification systems. The accuracy ranges from 80% to 85%, showing that room for improvement remains. By providing this dataset and a standardized evaluation protocol to the scientific community, we hope to gather researchers in both the medical and the machine learning field to advance toward this clinical application.

  4. DNA sequence analysis using hierarchical ART-based classification networks

    Energy Technology Data Exchange (ETDEWEB)

    LeBlanc, C.; Hruska, S.I. [Florida State Univ., Tallahassee, FL (United States); Katholi, C.R.; Unnasch, T.R. [Univ. of Alabama, Birmingham, AL (United States)

    1994-12-31

    Adaptive resonance theory (ART) describes a class of artificial neural network architectures that act as classification tools which self-organize, work in real-time, and require no retraining to classify novel sequences. We have adapted ART networks to provide support to scientists attempting to categorize tandem repeat DNA fragments from Onchocerca volvulus. In this approach, sequences of DNA fragments are presented to multiple ART-based networks which are linked together into two (or more) tiers; the first provides coarse sequence classification while the subsequent tiers refine the classifications as needed. The overall rating of the resulting classification of fragments is measured using statistical techniques based on those introduced to validate results from traditional phylogenetic analysis. Tests of the Hierarchical ART-based Classification Network, or HABclass network, indicate its value as a fast, easy-to-use classification tool which adapts to new data without retraining on previously classified data.

  5. KNN BASED CLASSIFICATION OF DIGITAL MODULATED SIGNALS

    Directory of Open Access Journals (Sweden)

    Sajjad Ahmed Ghauri

    2016-11-01

    Demodulation without knowledge of the modulation scheme requires Automatic Modulation Classification (AMC). When the receiver has limited information about the received signal, AMC becomes an essential process. AMC has an important place in many civil and military fields, such as modern electronic warfare, interfering source recognition, frequency management and link adaptation. In this paper we explore the use of K-nearest neighbor (KNN) for modulation classification with different distance measurement methods. Five modulation schemes are used for classification: Binary Phase Shift Keying (BPSK), Quadrature Phase Shift Keying (QPSK), Quadrature Amplitude Modulation (QAM), 16-QAM and 64-QAM. Higher order cumulants (HOC) are used as the input feature set to the classifier. Simulation results show that the proposed classification method provides better results for the considered modulation formats.
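
    The idea of running KNN with interchangeable distance measures can be sketched in a few lines. This is a toy illustration only: the two-dimensional feature vectors stand in for cumulant features and are invented, and the abstract does not state which distance measures or which value of k were used, so Euclidean and Manhattan distance with k = 3 are assumptions.

```python
import math
from collections import Counter


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train, query, k=3, distance=euclidean):
    """train is a list of (feature_vector, label) pairs."""
    # Keep the k training samples closest to the query under the chosen distance.
    neighbours = sorted(train, key=lambda item: distance(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Invented feature vectors standing in for higher-order cumulant features.
train = [
    ((0.0, -2.0), "BPSK"), ((0.1, -1.9), "BPSK"), ((-0.1, -2.1), "BPSK"),
    ((1.0, -1.0), "QPSK"), ((0.9, -1.1), "QPSK"), ((1.1, -0.9), "QPSK"),
]

for metric in (euclidean, manhattan):
    print(metric.__name__, knn_predict(train, (0.05, -1.95), distance=metric))
```

    Passing the distance as a function keeps the classifier unchanged while the metric is swapped, which is one simple way to compare distance measures on the same feature set.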

  6. Integrating Globality and Locality for Robust Representation Based Classification

    Directory of Open Access Journals (Sweden)

    Zheng Zhang

    2014-01-01

    The representation based classification method (RBCM) has shown huge potential for face recognition since it first emerged. The linear regression classification (LRC) method and the collaborative representation classification (CRC) method are two well-known RBCMs. LRC and CRC exploit training samples of each class and all the training samples to represent the testing sample, respectively, and subsequently conduct classification on the basis of the representation residual. The LRC method can be viewed as a “locality representation” method because it just uses the training samples of each class to represent the testing sample and it cannot embody the effectiveness of the “globality representation.” On the contrary, it seems that the CRC method cannot own the benefit of locality of the general RBCM. Thus we propose to integrate CRC and LRC to perform more robust representation based classification. The experimental results on benchmark face databases substantially demonstrate that the proposed method achieves high classification accuracy.

  7. Automatic ship target classification based on aerial images

    Science.gov (United States)

    Lan, Jinhui; Wan, Lili

    2008-12-01

    As an important reconnaissance and offensive weapon for the future battlefield, the Micro Aerial Vehicle (MAV) is applied more and more widely in civil and military fields. In the sea battlefield, ship classification applied to MAVs could effectively support signal collection, force protection and strikes on ship targets. At present, methods of ship classification are mostly based on signals from radar, infrared or ultrasonic sensors. However, because of their large volume and complex equipment, these methods cannot meet the requirements of MAVs. Thus, ship classification based on a visible-light sensor is chosen, as it fits the volume and weight limits of MAVs. To realize ship classification on MAVs, ship classification based on aerial images is proposed for the first time, together with an effective, robust classification algorithm based on modified Zernike moment invariants. The task is to classify ships into two categories, aircraft carrier and chaser. The experimental results show that the correct classification rate is more than 92% and that the proposed algorithm is effective for classifying ship targets from MAVs.

  8. Adaptive Base Class Boost for Multi-class Classification

    OpenAIRE

    Li, Ping

    2008-01-01

    We develop the concept of ABC-Boost (Adaptive Base Class Boost) for multi-class classification and present ABC-MART, a concrete implementation of ABC-Boost. The original MART (Multiple Additive Regression Trees) algorithm has been very successful in large-scale applications. For binary classification, ABC-MART recovers MART. For multi-class classification, ABC-MART considerably improves MART, as evaluated on several public data sets.

  9. Cancer Hallmark Text Classification Using Convolutional Neural Networks

    OpenAIRE

    Baker, Simon; Korhonen, Anna-Leena; Pyysalo, S

    2017-01-01

    Methods based on deep learning approaches have recently achieved state-of-the-art performance in a range of machine learning tasks and are increasingly applied to natural language processing (NLP). Despite strong results in various established NLP tasks involving general domain texts, there is only limited work applying these models to biomedical NLP. In this paper, we consider a Convolutional Neural Network (CNN) approach to biomedical text classification. Evaluation using a recently intr...

  10. An Information Text Classification Algorithm Based on DBN

    Directory of Open Access Journals (Sweden)

    LU Shu-bao

    2017-04-01

    Aiming at the problems of low categorization accuracy and uneven distribution in traditional text classification algorithms, a text classification algorithm based on deep learning is put forward. Deep belief networks have a very strong feature learning ability, which can extract features from the high-dimensional original features, so that the result can not only be used for text classification but also to train the classification model. The TF-IDF formula is used to compute text eigenvalues, and deep belief networks are used to construct the classifier. The experimental results show that, compared with commonly used classification algorithms such as support vector machine, neural network and extreme learning machine, the algorithm has higher accuracy and practicability, and it opens up new ideas for research on text classification.
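
    The TF-IDF weighting mentioned above can be illustrated with a small sketch. The abstract does not give the exact formula the authors used; the version below (term frequency times the logarithm of inverse document frequency, with no smoothing) is one common textbook variant, and the toy documents are invented.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs is a list of token lists; returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {term: (count / len(doc)) * math.log(n_docs / df[term])
             for term, count in tf.items()}
        )
    return weights


# Invented toy corpus of two tokenised documents.
docs = [["deep", "belief", "network"], ["deep", "text"]]
for vector in tf_idf(docs):
    print(vector)
```

    A term that occurs in every document (here "deep") receives weight zero under this variant, which is why smoothed versions are often preferred in practice.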

  11. Quantum Ensemble Classification: A Sampling-Based Learning Control Approach.

    Science.gov (United States)

    Chen, Chunlin; Dong, Daoyi; Qi, Bo; Petersen, Ian R; Rabitz, Herschel

    2017-06-01

    Quantum ensemble classification (QEC) has significant applications in discrimination of atoms (or molecules), separation of isotopes, and quantum information extraction. However, quantum mechanics forbids deterministic discrimination among nonorthogonal states. The classification of inhomogeneous quantum ensembles is very challenging, since there exist variations in the parameters characterizing the members within different classes. In this paper, we recast QEC as a supervised quantum learning problem. A systematic classification methodology is presented by using a sampling-based learning control (SLC) approach for quantum discrimination. The classification task is accomplished via simultaneously steering members belonging to different classes to their corresponding target states (e.g., mutually orthogonal states). First, a new discrimination method is proposed for two similar quantum systems. Then, an SLC method is presented for QEC. Numerical results demonstrate the effectiveness of the proposed approach for the binary classification of two-level quantum ensembles and the multiclass classification of multilevel quantum ensembles.

  12. Cancer Feature Selection and Classification Using a Binary Quantum-Behaved Particle Swarm Optimization and Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Maolong Xi

    2016-01-01

    This paper focuses on the feature gene selection for cancer classification, which employs an optimization algorithm to select a subset of the genes. We propose a binary quantum-behaved particle swarm optimization (BQPSO) for cancer feature gene selection, coupled with a support vector machine (SVM) for cancer classification. First, the proposed BQPSO algorithm is described, which is a discretized version of the original QPSO for binary 0-1 optimization problems. Then, we present the principle and procedure for cancer feature gene selection and cancer classification based on BQPSO and SVM with leave-one-out cross validation (LOOCV). Finally, the BQPSO coupling SVM (BQPSO/SVM), binary PSO coupling SVM (BPSO/SVM), and genetic algorithm coupling SVM (GA/SVM) are tested for feature gene selection and cancer classification on five microarray data sets, namely, Leukemia, Prostate, Colon, Lung, and Lymphoma. The experimental results show that BQPSO/SVM has significant advantages in accuracy, robustness, and the number of feature genes selected compared with the other two algorithms.
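
    How a binary gene mask can be scored with leave-one-out cross validation is sketched below. This is not the authors' implementation: a nearest-centroid classifier is swapped in for the SVM to keep the example dependency-free, the optimizer itself is omitted, and the expression values, labels and function names are all hypothetical.

```python
def select(sample, mask):
    """Keep only the features whose mask bit is 1."""
    return [value for value, keep in zip(sample, mask) if keep]


def fit_centroids(xs, ys):
    """Nearest-centroid stand-in for the SVM: one mean vector per class."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, value in enumerate(x):
            acc[j] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def predict_centroid(centroids, x):
    return min(centroids,
               key=lambda y: sum((a - b) ** 2 for a, b in zip(centroids[y], x)))


def loocv_fitness(samples, labels, mask):
    """Fraction of samples classified correctly when each is held out once."""
    reduced = [select(s, mask) for s in samples]
    hits = 0
    for i in range(len(reduced)):
        model = fit_centroids(reduced[:i] + reduced[i + 1:],
                              labels[:i] + labels[i + 1:])
        hits += predict_centroid(model, reduced[i]) == labels[i]
    return hits / len(reduced)


# Hypothetical expression values: gene 0 separates the classes, genes 1-2 do not.
samples = [[5.0, 1.0, 9.0], [5.2, 8.0, 1.0], [4.9, 2.0, 8.5],
           [1.0, 1.5, 9.2], [1.1, 7.5, 0.8], [0.9, 2.2, 8.8]]
labels = ["tumor", "tumor", "tumor", "normal", "normal", "normal"]

print(loocv_fitness(samples, labels, [1, 0, 0]))
print(loocv_fitness(samples, labels, [0, 1, 1]))
```

    In a wrapper-style selection scheme, a score of this kind would serve as the fitness that a binary optimizer tries to maximize over candidate masks.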

  13. Random forest ensemble classification based fuzzy logic

    Science.gov (United States)

    Ben Ayed, Abdelkarim; Benhammouda, Marwa; Ben Halima, Mohamed; Alimi, Adel M.

    2017-03-01

    In this paper, we address supervised data classification using fuzzy random forests, which combine the robustness of decision trees, the power of random selection that increases the diversity of the trees in the forest, and the flexibility of fuzzy logic with respect to noise. We focus on the construction of a forest of fuzzy decision trees. Our system is validated on nine standard classification benchmarks from the UCI repository; it is able to control some data, reduce the error rate, and demonstrate greater robustness and interoperability.

  14. Radar Target Classification using Recursive Knowledge-Based Methods

    DEFF Research Database (Denmark)

    Jochumsen, Lars Wurtz

    The topic of this thesis is target classification of radar tracks from a 2D mechanically scanning coastal surveillance radar. The measurements provided by the radar are position data and therefore the classification is mainly based on kinematic data, which is deduced from the position. The target...

  15. Transportation Mode Choice Analysis Based on Classification Methods

    OpenAIRE

    Zeņina, N; Borisovs, A.

    2011-01-01

    Mode choice analysis has received the most attention among discrete choice problems in travel behavior literature. Most traditional mode choice models are based on the principle of random utility maximization derived from econometric theory. This paper investigates performance of mode choice analysis with classification methods - decision trees, discriminant analysis and multinomial logit. Experimental results have demonstrated satisfactory quality of classification.

  16. A proposed data base system for detection, classification and ...

    African Journals Online (AJOL)

    A proposed data base system for detection, classification and location of fault on electricity company of Ghana electrical distribution system. Isaac Owusu-Nyarko, Mensah-Ananoo Eugine. Abstract. No Abstract. Keywords: database, classification of fault, power, distribution system, SCADA, ECG. Full Text: EMAIL FULL TEXT ...

  17. AN OBJECT-BASED METHOD FOR CHINESE LANDFORM TYPES CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    H. Ding

    2016-06-01

    Full Text Available Landform classification is a necessary task for various fields of landscape and regional planning, for example for landscape evaluation, erosion studies, hazard prediction, etc. This study proposes an improved object-based classification for Chinese landform types using the factor importance analysis of random forest and the gray-level co-occurrence matrix (GLCM). In this research, based on a 1 km DEM of China, the combinations of terrain factors extracted from the DEM are selected by correlation analysis and Sheffield's entropy method. A random forest classification tree is applied to evaluate the importance of the terrain factors, which are used as multi-scale segmentation thresholds. Then the GLCM is computed for the knowledge base of classification. The classification result was checked using the 1:4,000,000 Chinese Geomorphological Map as reference. The overall classification accuracy of the proposed method is 5.7% higher than ISODATA unsupervised classification, and 15.7% higher than the traditional object-based classification method.
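
    A gray-level co-occurrence matrix counts how often a pixel with gray level i has a neighbor with gray level j at a fixed offset. The sketch below is a minimal pure-Python version with a single texture statistic (contrast). The function names, the offset convention, and the toy image are assumptions for illustration, not the configuration used in the study.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count co-occurrences of gray levels (i, j) at pixel offset (dx, dy).

    `image` is a list of rows of integer gray levels in range(levels).
    """
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            # skip pairs whose second pixel falls outside the image
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def glcm_contrast(matrix):
    """Contrast: sum of p(i, j) * (i - j)^2 over the normalized matrix."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        return 0.0
    weighted = sum(
        count * (i - j) ** 2
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )
    return weighted / total


if __name__ == "__main__":
    toy = [[0, 0, 1],
           [1, 2, 2],
           [0, 1, 2]]
    m = glcm(toy, levels=3)  # horizontal neighbor, one pixel to the right
    print(m, glcm_contrast(m))
```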

  18. High dimensional multiclass classification with applications to cancer diagnosis

    DEFF Research Database (Denmark)

    Vincent, Martin

    Probabilistic classifiers are introduced and it is shown that the only regular linear probabilistic classifier with convex risk is multinomial regression. Penalized empirical risk minimization is introduced and used to construct supervised learning methods for probabilistic classifiers. A sparse...... group lasso penalized approach to high dimensional multinomial classification is presented. On different real data examples it is found that this approach clearly outperforms multinomial lasso in terms of error rate and features included in the model. An efficient coordinate descent algorithm...... is developed and the convergence is established. This algorithm is implemented in the msgl R package. Examples of high dimensional multiclass problems are studied, in particular examples of multiclass classification based on gene expression measurements. One such example is the clinically important - problem...

  19. Lung Cancer: Understanding Its Molecular Pathology and the 2015 WHO Classification.

    Science.gov (United States)

    Inamura, Kentaro

    2017-01-01

    Lung cancer is the leading cause of cancer-related death worldwide due to late diagnoses and limited treatment interventions. Recently, comprehensive molecular profiles of lung cancer have been identified. These novel characteristics have enhanced the understanding of the molecular pathology of lung cancer. The identification of driver genetic alterations and potential molecular targets has resulted in molecular-targeted therapies for an increasing number of lung cancer patients. Thus, the histopathological classification of lung cancer was modified in accordance with the increased understanding of molecular profiles. This review focuses on recent developments in the molecular profiling of lung cancer and provides perspectives on updated diagnostic concepts in the new 2015 WHO classification. The WHO classification will require additional revisions to allow for reliable, clinically meaningful tumor diagnoses as we gain a better understanding of the molecular characteristics of lung cancer.

  20. Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks

    Science.gov (United States)

    Khan, Javed; Wei, Jun S.; Ringnér, Markus; Saal, Lao H.; Ladanyi, Marc; Westermann, Frank; Berthold, Frank; Schwab, Manfred; Antonescu, Cristina R.; Peterson, Carsten; Meltzer, Paul S.

    2005-01-01

    The purpose of this study was to develop a method of classifying cancers to specific diagnostic categories based on their gene expression signatures using artificial neural networks (ANNs). We trained the ANNs using the small, round blue-cell tumors (SRBCTs) as a model. These cancers belong to four distinct diagnostic categories and often present diagnostic dilemmas in clinical practice. The ANNs correctly classified all samples and identified the genes most relevant to the classification. Expression of several of these genes has been reported in SRBCTs, but most have not been associated with these cancers. To test the ability of the trained ANN models to recognize SRBCTs, we analyzed additional blinded samples that were not previously used for the training procedure, and correctly classified them in all cases. This study demonstrates the potential applications of these methods for tumor diagnosis and the identification of candidate targets for therapy. PMID:11385503

  1. Artificial neural networks as classification and diagnostic tools for lymph node-negative breast cancers

    Energy Technology Data Exchange (ETDEWEB)

    Eswari J, Satya; Chandrakar, Neha [National Institute of Technology Raipur, Raipur (India)

    2016-04-15

    Artificial neural networks (ANNs) can be used to develop a technique to classify lymph node-negative breast cancer that is prone to distant metastases based on gene expression signatures. The neural network used is a multilayered feed-forward network that employs the back-propagation algorithm. Once trained with DNA microarray-based gene expression profiles of genes that were predictive of distant metastasis recurrence of lymph node-negative breast cancer, the ANNs became capable of correctly classifying all samples and recognizing the genes most appropriate to the classification. To test the ability of the trained ANN models to recognize lymph node-negative breast cancer, we analyzed additional idle samples that were not previously used for the training procedure and obtained correctly classified results in the validation set. For a more substantial result, bootstrapping of the training and testing datasets was performed as external validation. This study illustrates the potential application of ANNs for breast tumor diagnosis and the identification of candidate targets in patients for therapy.

  2. Classification of normal and abnormal images of lung cancer

    Science.gov (United States)

    Bhatnagar, Divyesh; Tiwari, Amit Kumar; Vijayarajan, V.; Krishnamoorthy, A.

    2017-11-01

    Finding the exact symptoms of lung cancer is difficult because of the way cancerous tissue forms, with large tissue structures intersecting in different ways. This problem can be evaluated with the help of digital images. In this approach, images are examined using the basic operation of the PCA algorithm. In this paper, the GLCM method is used for image pre-processing and feature extraction, and to test a patient's level of disease at an early stage in order to determine whether it is normal or abnormal. From the result, the stage of cancer is evaluated. With the help of the dataset and the result, the survival rate of a cancer patient can be estimated. The result is based on the correct and incorrect arrangement of the tissue patterns.

  3. Robust spike classification based on frequency domain neural waveform features.

    Science.gov (United States)

    Yang, Chenhui; Yuan, Yuan; Si, Jennie

    2013-12-01

    We introduce a new spike classification algorithm based on frequency domain features of the spike snippets. The goal for the algorithm is to provide high classification accuracy, low false misclassification, ease of implementation, robustness to signal degradation, and objectivity in classification outcomes. In this paper, we propose a spike classification algorithm based on frequency domain features (CFDF). It makes use of frequency domain contents of the recorded neural waveforms for spike classification. The self-organizing map (SOM) is used as a tool to determine the cluster number intuitively and directly by viewing the SOM output map. After that, spike classification can be easily performed using clustering algorithms such as the k-Means. In conjunction with our previously developed multiscale correlation of wavelet coefficient (MCWC) spike detection algorithm, we show that the MCWC and CFDF detection and classification system is robust when tested on several sets of artificial and real neural waveforms. The CFDF is comparable to or outperforms some popular automatic spike classification algorithms with artificial and real neural data. The detection and classification of neural action potentials or neural spikes is an important step in single-unit-based neuroscientific studies and applications. After the detection of neural snippets potentially containing neural spikes, a robust classification algorithm is applied for the analysis of the snippets to (1) extract similar waveforms into one class for them to be considered coming from one unit, and to (2) remove noise snippets if they do not contain any features of an action potential. Usually, a snippet is a small 2 or 3 ms segment of the recorded waveform, and differences in neural action potentials can be subtle from one unit to another. Therefore, a robust, high performance classification system like the CFDF is necessary. In addition, the proposed algorithm does not require any assumptions on statistical

  4. State of the science: molecular classifications of breast cancer for clinical diagnostics.

    Science.gov (United States)

    Robison, John E; Perreard, Laurent; Bernard, Philip S

    2004-07-01

    Over the past few years, the study of genomics has embarked on developing gene expression-based classifications for tumors, an initiative that promises to revolutionize cancer medicine. High-throughput genomic platforms, such as microarray and SAGE, have found gene expression signatures that correlate to important clinical parameters used in current staging and are providing additional information that will improve standard of care. Although implementing a molecular taxonomy for prognosis and treatment would likely benefit cancer patients, there remain significant obstacles to using these assays within the current diagnostic framework. Since most genomic assays are being performed from fresh tissue, there is a need either to change the practice of formalin-fixing and paraffin-embedding tissue or to adapt the assays for use on degraded RNA specimens. To date, even the most mature data sets, such as molecular classifications for breast cancer, still fall short of the number of patients needed to generalize the results to treating large populations. To implement these assays in large scale, there will need to be standardization of sample procurement, preparation, and analysis. Certainly, the greatest improvements in patient care will come through tailored therapies as genomics is coupled with clinical trials that randomize cohorts to different treatments. This manuscript reviews the current standards of care, presents progress that is being made in the development of genomic assays for breast cancer and discusses options for implementing these new tests into the clinical setting.

  5. Dissimilarity-based classification of anatomical tree structures

    DEFF Research Database (Denmark)

    Sørensen, Lauge Emil Borch Laurs; Lo, Pechin Chien Pau; Dirksen, Asger

    2011-01-01

    A novel method for classification of abnormality in anatomical tree structures is presented. A tree is classified based on direct comparisons with other trees in a dissimilarity-based classification scheme. The pair-wise dissimilarity measure between two trees is based on a linear assignment...... between the branch feature vectors representing those trees. Hereby, localized information in the branches is collectively used in classification and variations in feature values across the tree are taken into account. An approximate anatomical correspondence between matched branches can be achieved...
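
    To make the idea of a dissimilarity built on a linear assignment between branch feature vectors concrete, here is a small sketch. It assumes both trees have the same number of branches, uses Euclidean distance between branch features, and solves the assignment by brute force, which is only practical for a handful of branches. The abstract is truncated, so none of these choices should be read as the authors' exact formulation.

```python
from itertools import permutations


def assignment_dissimilarity(branches_a, branches_b):
    """Mean branch distance under the cheapest one-to-one branch matching.

    Assumptions (not from the source): equal branch counts, Euclidean
    distance, brute-force search over all matchings.
    """
    if len(branches_a) != len(branches_b):
        raise ValueError("this sketch assumes equal branch counts")
    if not branches_a:
        return 0.0

    def dist(u, v):
        return sum((p - q) ** 2 for p, q in zip(u, v)) ** 0.5

    n = len(branches_a)
    # try every way of pairing branches of A with branches of B
    best = min(
        sum(dist(branches_a[i], branches_b[j]) for i, j in enumerate(perm))
        for perm in permutations(range(n))
    )
    return best / n


if __name__ == "__main__":
    tree_a = [[0.0, 0.0], [1.0, 0.0]]
    tree_b = [[0.0, 0.0], [4.0, 0.0]]
    print(assignment_dissimilarity(tree_a, tree_b))
```

    For realistic tree sizes a polynomial-time assignment solver would replace the permutation search.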

  6. Improving accuracy for cancer classification with a new algorithm for genes selection

    Directory of Open Access Journals (Sweden)

    Zhang Hongyan

    2012-11-01

    Full Text Available Abstract Background Even though the classification of cancer tissue samples based on gene expression data has advanced considerably in recent years, it faces great challenges to improve accuracy. One of the challenges is to establish an effective method that can select a parsimonious set of relevant genes. So far, most methods for gene selection in literature focus on screening individual or pairs of genes without considering the possible interactions among genes. Here we introduce a new computational method named the Binary Matrix Shuffling Filter (BMSF). It not only overcomes the difficulty associated with the search schemes of traditional wrapper methods and overfitting problem in large dimensional search space but also takes potential gene interactions into account during gene selection. This method, coupled with Support Vector Machine (SVM) for implementation, often selects a very small number of genes for easy model interpretability. Results We applied our method to 9 two-class gene expression datasets involving human cancers. During the gene selection process, the set of genes to be kept in the model was recursively refined and repeatedly updated according to the effect of a given gene on the contributions of other genes in reference to their usefulness in cancer classification. The small number of informative genes selected from each dataset leads to significantly improved leave-one-out (LOOCV) classification accuracy across all 9 datasets for multiple classifiers. Our method also exhibits broad generalization in the genes selected since multiple commonly used classifiers achieved either equivalent or much higher LOOCV accuracy than those reported in literature. Conclusions Evaluation of a gene’s contribution to binary cancer classification is better considered after adjusting for the joint effect of a large number of other genes. A computationally efficient search scheme was provided to perform effective search in the extensive

  7. Behavior Based Social Dimensions Extraction for Multi-Label Classification.

    Directory of Open Access Journals (Sweden)

    Le Li

    Full Text Available Classification based on social dimensions is commonly used to handle the multi-label classification task in heterogeneous networks. However, traditional methods, which mostly rely on the community detection algorithms to extract the latent social dimensions, produce unsatisfactory performance when community detection algorithms fail. In this paper, we propose a novel behavior based social dimensions extraction method to improve the classification performance in multi-label heterogeneous networks. In our method, nodes' behavior features, instead of community memberships, are used to extract social dimensions. By introducing Latent Dirichlet Allocation (LDA) to model the network generation process, nodes' connection behaviors with different communities can be extracted accurately, which are applied as latent social dimensions for classification. Experiments on various public datasets reveal that the proposed method can obtain satisfactory classification results in comparison to other state-of-the-art methods on smaller social dimensions.

  8. Gene Expression Profiles for Predicting Metastasis in Breast Cancer: A Cross-Study Comparison of Classification Methods

    Directory of Open Access Journals (Sweden)

    Mark Burton

    2012-01-01

    Full Text Available Machine learning has increasingly been used with microarray gene expression data and for the development of classifiers using a variety of methods. However, method comparisons in cross-study datasets are very scarce. This study compares the performance of seven classification methods and the effect of voting for predicting metastasis outcome in breast cancer patients, in three situations: within the same dataset or across datasets on similar or dissimilar microarray platforms. Combining classification results from seven classifiers into one voting decision performed significantly better during internal validation as well as external validation in similar microarray platforms than the underlying classification methods. When validating between different microarray platforms, random forest, another voting-based method, proved to be the best performing method. We conclude that voting based classifiers provided an advantage with respect to classifying metastasis outcome in breast cancer patients.
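
    Combining several classifiers into one voting decision can be sketched as a per-sample majority vote. The example below is illustrative only: the function name is hypothetical, the tie-breaking rule (smallest label wins) is an assumption, and the three toy prediction lists stand in for the seven classifiers compared in the study.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one voted label per sample.

    `predictions` holds one list of labels per classifier, all the same length.
    """
    voted = []
    for sample_votes in zip(*predictions):
        counts = Counter(sample_votes)
        top = max(counts.values())
        # assumed tie-break: choose the smallest label among the leaders
        voted.append(min(label for label, n in counts.items() if n == top))
    return voted


if __name__ == "__main__":
    # 1 = metastasis predicted, 0 = no metastasis (toy labels)
    clf_a = [1, 0, 1]
    clf_b = [1, 1, 0]
    clf_c = [0, 0, 1]
    print(majority_vote([clf_a, clf_b, clf_c]))
```

    With an odd number of binary classifiers, as in this toy case, ties cannot occur and the tie-break rule is never exercised.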

  9. A Classification-based Review Recommender

    Science.gov (United States)

    O'Mahony, Michael P.; Smyth, Barry

    Many online stores encourage their users to submit product/service reviews in order to guide future purchasing decisions. These reviews are often listed alongside product recommendations but, to date, limited attention has been paid as to how best to present these reviews to the end-user. In this paper, we describe a supervised classification approach that is designed to identify and recommend the most helpful product reviews. Using the TripAdvisor service as a case study, we compare the performance of several classification techniques using a range of features derived from hotel reviews. We then describe how these classifiers can be used as the basis for a practical recommender that automatically suggests the most helpful contrasting reviews to end-users. We present an empirical evaluation which shows that our approach achieves a statistically significant improvement over alternative review ranking schemes.

  10. Dictionary-based lip reading classification

    OpenAIRE

    Yu, Dahai; Ghita, Ovidiu; Sutherland, Alistair; Whelan, Paul F.

    2006-01-01

    Visual lip reading recognition is an essential stage in many multimedia systems such as “Audio Visual Speech Recognition” [6], “Mobile Phone Visual System for deaf people”, “Sign Language Recognition System”, etc. The use of lip visual features to help audio or hand recognition is appropriate because this information is robust to acoustic noise. In this paper, we describe our work towards developing a robust technique for lip reading classification that extracts the lips in a colo...

  11. Cellular based cancer vaccines

    DEFF Research Database (Denmark)

    Hansen, Morten; Met, O; Svane, I M

    2012-01-01

    Cancer vaccines designed to re-calibrate the existing host-tumour interaction, tipping the balance from tumor acceptance towards tumor control, hold huge potential to complement traditional cancer therapies. In general, limited success has been achieved with vaccines composed of tumor...... in vitro migration via autocrine receptor-mediated endocytosis of CCR7. In the current review, we discuss optimal design of DC maturation focused on pre-clinical as well as clinical results from standard and polarized dendritic cell based cancer vaccines....

  12. Cellular based cancer vaccines

    DEFF Research Database (Denmark)

    Hansen, M; Met, Ö; Svane, I M

    2012-01-01

    Cancer vaccines designed to re-calibrate the existing host-tumour interaction, tipping the balance from tumor acceptance towards tumor control, hold huge potential to complement traditional cancer therapies. In general, limited success has been achieved with vaccines composed of tumor...... to transiently affect in vitro migration via autocrine receptor-mediated endocytosis of CCR7. In the current review, we discuss optimal design of DC maturation focused on pre-clinical as well as clinical results from standard and polarized dendritic cell based cancer vaccines....

  13. TENSOR MODELING BASED FOR AIRBORNE LiDAR DATA CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    N. Li

    2016-06-01

    Full Text Available Feature selection and description is a key factor in classification of Earth observation data. In this paper a classification method based on tensor decomposition is proposed. First, multiple features are extracted from the raw LiDAR point cloud, and raster LiDAR images are derived by accumulating features or the “raw” data attributes. Then, the feature rasters of LiDAR data are stored as a tensor, and tensor decomposition is used to select component features. This tensor representation can keep the initial spatial structure and ensure that the neighborhood is taken into account. Based on a small number of component features, a k-nearest-neighbor classification is applied.
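
    The final step, a k-nearest-neighbor classification on a few component features, can be sketched as follows. The feature vectors, the class names, the Euclidean metric, and the tie-breaking rule are all hypothetical choices for illustration; the tensor decomposition that would produce the component features is not shown.

```python
from collections import Counter


def knn_predict(train_x, train_y, sample, k=3):
    """Label `sample` by majority vote among its k nearest training points."""

    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    # indices of training points, nearest first (stable for equal distances)
    order = sorted(range(len(train_x)), key=lambda i: sq_dist(train_x[i], sample))
    votes = Counter(train_y[i] for i in order[:k])
    top = max(votes.values())
    # assumed tie-break: smallest label among the most frequent ones
    return min(label for label, n in votes.items() if n == top)


if __name__ == "__main__":
    # Toy component features per raster cell (hypothetical values and classes)
    train_x = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
    train_y = ["ground", "ground", "ground", "building", "building", "building"]
    print(knn_predict(train_x, train_y, [0.5, 0.5]))
    print(knn_predict(train_x, train_y, [5.5, 5.5]))
```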

  14. Prediction of Depression in Cancer Patients With Different Classification Criteria, Linear Discriminant Analysis versus Logistic Regression.

    Science.gov (United States)

    Shayan, Zahra; Mohammad Gholi Mezerji, Naser; Shayan, Leila; Naseri, Parisa

    2015-11-03

    Logistic regression (LR) and linear discriminant analysis (LDA) are two popular statistical models for prediction of group membership. Although they are very similar, the LDA makes more assumptions about the data. When categorical and continuous variables are used simultaneously, the optimal choice between the two models is questionable. In most studies, classification error (CE) is used to discriminate between subjects in several groups, but this index is not suitable to predict the accuracy of the outcome. The present study compared LR and LDA models using classification indices. This cross-sectional study selected 243 cancer patients. Sample sets of different sizes (n = 50, 100, 150, 200, 220) were randomly selected and the CE, B, and Q classification indices were calculated by the LR and LDA models. CE revealed a lack of superiority for one model over the other, but the results showed that LR performed better than LDA for the B and Q indices in all situations. No significant effect for sample size on CE was noted for selection of an optimal model. Assessment of the accuracy of prediction of real data indicated that the B and Q indices are appropriate for selection of an optimal model. The results of this study showed that LR performs better in some cases and LDA in others when based on CE. The CE index is not appropriate for classification, although the B and Q indices performed better and offered more efficient criteria for comparison and discrimination between groups.

  15. Prediction of Depression in Cancer Patients With Different Classification Criteria, Linear Discriminant Analysis versus Logistic Regression

    Science.gov (United States)

    Shayan, Zahra; Mezerji, Naser Mohammad Gholi; Shayan, Leila; Naseri, Parisa

    2016-01-01

    Background: Logistic regression (LR) and linear discriminant analysis (LDA) are two popular statistical models for prediction of group membership. Although they are very similar, the LDA makes more assumptions about the data. When categorical and continuous variables are used simultaneously, the optimal choice between the two models is questionable. In most studies, classification error (CE) is used to discriminate between subjects in several groups, but this index is not suitable to predict the accuracy of the outcome. The present study compared LR and LDA models using classification indices. Methods: This cross-sectional study selected 243 cancer patients. Sample sets of different sizes (n = 50, 100, 150, 200, 220) were randomly selected and the CE, B, and Q classification indices were calculated by the LR and LDA models. Results: CE revealed a lack of superiority for one model over the other, but the results showed that LR performed better than LDA for the B and Q indices in all situations. No significant effect for sample size on CE was noted for selection of an optimal model. Assessment of the accuracy of prediction of real data indicated that the B and Q indices are appropriate for selection of an optimal model. Conclusion: The results of this study showed that LR performs better in some cases and LDA in others when based on CE. The CE index is not appropriate for classification, although the B and Q indices performed better and offered more efficient criteria for comparison and discrimination between groups. PMID:26925900

  16. Classification of lung cancer patients and controls by chromatography of modified nucleosides in serum

    Science.gov (United States)

    McEntire, John E.; Kuo, Kenneth C.; Smith, Mark E.; Stalling, David L.; Richens, Jack W.; Zumwalt, Robert W.; Gehrke, Charles W.; Papermaster, Ben W.

    1989-01-01

    A wide spectrum of modified nucleosides has been quantified by high-performance liquid chromatography in serum of 49 male lung cancer patients, 35 patients with other cancers, and 48 patients hospitalized for nonneoplastic diseases. Data for 29 modified nucleoside peaks were normalized to an internal standard and analyzed by discriminant analysis and stepwise discriminant analysis. A model based on peaks selected by a stepwise discriminant procedure correctly classified 79% of the cancer and 75% of the noncancer subjects. It also demonstrated 84% sensitivity and 79% specificity when comparing lung cancer to noncancer subjects, and 80% sensitivity and 55% specificity in comparing lung cancer to other cancers. The nucleoside peaks having the greatest influence on the models varied dependent on the subgroups compared, confirming the importance of quantifying a wide array of nucleosides. These data support and expand previous studies which reported the utility of measuring modified nucleoside levels in serum and show that precise measurement of an array of 29 modified nucleosides in serum by high-performance liquid chromatography with UV scanning with subsequent data modeling may provide a clinically useful approach to patient classification in diagnosis and subsequent therapeutic monitoring.
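
    Sensitivity and specificity figures like those quoted above come from the confusion counts of a binary classifier: sensitivity is the fraction of true positives among all actual positives, and specificity is the fraction of true negatives among all actual negatives. The sketch below uses made-up toy labels, not the study's data, and a hypothetical function name.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for a binary labeling.

    Assumes at least one positive and one negative case in `y_true`.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # positive case correctly flagged
            else:
                fn += 1  # positive case missed
        else:
            if pred == positive:
                fp += 1  # negative case wrongly flagged
            else:
                tn += 1  # negative case correctly cleared
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Toy labels: 1 = lung cancer, 0 = noncancer (illustrative only)
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 1, 1]
    print(sensitivity_specificity(y_true, y_pred))
```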

  17. Improving breast cancer classification with mammography, supported on an appropriate variable selection analysis

    Science.gov (United States)

    Pérez, Noel; Guevara, Miguel A.; Silva, Augusto

    2013-02-01

    This work addresses the issue of variable selection within the context of breast cancer classification with mammography. A comprehensive repository of feature vectors was used including a hybrid subset gathering image-based and clinical features. It aimed to gather experimental evidence of variable selection in terms of cardinality and type, and to find a classification scheme that provides the best performance over the Area Under Receiver Operating Characteristics Curve (AUC) scores using the ranked features subset. We evaluated and classified a total of 300 subsets of features formed by the application of Chi-Square Discretization, Information-Gain, One-Rule and RELIEF methods in association with Feed-Forward Backpropagation Neural Network (FFBP), Support Vector Machine (SVM) and Decision Tree J48 (DTJ48) Machine Learning Algorithms (MLA) for a comparative performance evaluation based on AUC scores. A variable selection analysis was performed for Single-View Ranking and Multi-View Ranking groups of features. Feature subsets representing Microcalcifications (MCs), Masses and both MCs and Masses lesions achieved AUC scores of 0.91, 0.954 and 0.934 respectively. Experimental evidence demonstrated that classification performance was improved by combining image-based and clinical features. The most important clinical and image-based features were StromaDistortion and Circularity respectively. Other features that were less important but worth using because of their consistency were Contrast, Perimeter, Microcalcification, Correlation and Elongation.
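
    The AUC used to compare classifiers can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as one half. The pairwise sketch below computes it directly from that definition; the function name and the toy scores are hypothetical, and the quadratic loop is only meant for small illustrative inputs.

```python
def auc_score(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    `labels` uses 1 for positive and 0 for negative cases; assumes at
    least one case of each kind.
    """
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0  # positive ranked above negative
            elif p == n:
                wins += 0.5  # tie counts as half
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    labels = [1, 1, 0, 0]
    scores = [0.9, 0.4, 0.5, 0.1]
    print(auc_score(labels, scores))
```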

  18. Iris Image Classification Based on Hierarchical Visual Codebook.

    Science.gov (United States)

    Zhenan Sun; Hui Zhang; Tieniu Tan; Jianyu Wang

    2014-06-01

    Iris recognition as a reliable method for personal identification has been well-studied with the objective to assign the class label of each iris image to a unique subject. In contrast, iris image classification aims to classify an iris image to an application specific category, e.g., iris liveness detection (classification of genuine and fake iris images), race classification (e.g., classification of iris images of Asian and non-Asian subjects), coarse-to-fine iris identification (classification of all iris images in the central database into multiple categories). This paper proposes a general framework for iris image classification based on texture analysis. A novel texture pattern representation method called Hierarchical Visual Codebook (HVC) is proposed to encode the texture primitives of iris images. The proposed HVC method is an integration of two existing Bag-of-Words models, namely Vocabulary Tree (VT), and Locality-constrained Linear Coding (LLC). The HVC adopts a coarse-to-fine visual coding strategy and takes advantages of both VT and LLC for accurate and sparse representation of iris texture. Extensive experimental results demonstrate that the proposed iris image classification method achieves state-of-the-art performance for iris liveness detection, race classification, and coarse-to-fine iris identification. A comprehensive fake iris image database simulating four types of iris spoof attacks is developed as the benchmark for research of iris liveness detection.

  19. Classification

    Science.gov (United States)

    Clary, Renee; Wandersee, James

    2013-01-01

    In this article, Renee Clary and James Wandersee describe the beginnings of "Classification," which lies at the very heart of science and depends upon pattern recognition. Clary and Wandersee approach patterns by first telling the story of the "Linnaean classification system," introduced by Carl Linnaeus (1707-1778), who is…

  20. MO-DE-207B-03: Improved Cancer Classification Using Patient-Specific Biological Pathway Information Via Gene Expression Data

    Energy Technology Data Exchange (ETDEWEB)

    Young, M; Craft, D [Massachusetts General Hospital and Harvard Medical School, Boston, MA (United States)

    2016-06-15

    Purpose: To develop an efficient, pathway-based classification system using network biology statistics to assist in patient-specific response predictions to radiation and drug therapies across multiple cancer types. Methods: We developed PICS (Pathway Informed Classification System), a novel two-step cancer classification algorithm. In PICS, a matrix m of mRNA expression values for a patient cohort is collapsed into a matrix p of biological pathways. The entries of p, which we term pathway scores, are obtained from either principal component analysis (PCA), normal tissue centroid (NTC), or gene expression deviation (GED). The pathway score matrix is clustered using both k-means and hierarchical clustering, and a clustering is judged by how well it groups patients into distinct survival classes. The most effective pathway scoring/clustering combination, per clustering p-value, thus generates various ‘signatures’ for conventional and functional cancer classification. Results: PICS successfully regularized large dimension gene data, separated normal and cancerous tissues, and clustered a large patient cohort spanning six cancer types. Furthermore, PICS clustered patient cohorts into distinct, statistically-significant survival groups. For a suboptimally-debulked ovarian cancer set, the pathway-classified Kaplan-Meier survival curve (p = .00127) showed significant improvement over that of a prior gene expression-classified study (p = .0179). For a pancreatic cancer set, the pathway-classified Kaplan-Meier survival curve (p = .00141) showed significant improvement over that of a prior gene expression-classified study (p = .04). Pathway-based classification confirmed biomarkers for the pyrimidine, WNT-signaling, glycerophosphoglycerol, beta-alanine, and pantothenic acid pathways for ovarian cancer. Despite its robust nature, PICS requires significantly less run time than current pathway scoring methods. Conclusion: This work validates the PICS method to improve

  1. Classification systems in Gestational trophoblastic neoplasia - Sentiment or evidenced based?

    Science.gov (United States)

    Parker, V L; Pacey, A A; Palmer, J E; Tidy, J A; Winter, M C; Hancock, B W

    2017-05-01

    The classification system for Gestational trophoblastic neoplasia (GTN) has proved a controversial topic for over 100 years. Numerous systems simultaneously existed in different countries, with three main rival classifications gaining popularity, namely histological, anatomical and clinical prognostic systems. Until 2000, prior to the combination of the FIGO and WHO classifications, there was no worldwide consensus on the optimal classification system, largely due to a lack of high quality data proving the merit of one system over another. Remarkably, a validated, prospectively tested classification system has yet to be established. Over time, increasing criticisms have emerged regarding the currently adopted combined FIGO/WHO classification system, and its ability to identify patients most likely to develop primary chemotherapy resistance or disease relapse. This is particularly pertinent for patients with low-risk disease, of whom one in three are resistant to first line therapy, rising to four out of five women who score 5 or 6. This review aims to examine the historical basis of the GTN classification systems and critically appraise the evidence on which they were based. This culminates in a critique of the current FIGO/WHO prognostic system and discussion surrounding clinical preference versus evidence-based practice. Copyright © 2017 Elsevier Ltd. All rights reserved.

  2. Hot complaint intelligent classification based on text mining

    Directory of Open Access Journals (Sweden)

    XIA Haifeng

    2013-10-01

    Full Text Available The complaint recognizer system plays an important role in ensuring the correct classification of hot complaints and in improving the service quality of the telecommunications industry. Customer complaints in the telecommunications industry have the particular constraint that they must be handled within a limited time, which causes errors in the classification of hot complaints. The paper presents a model for intelligent classification of hot complaints based on text mining, which can classify a hot complaint into the correct level of the complaint navigation. The examples show that the model can classify complaint text efficiently.

  3. Classification

    DEFF Research Database (Denmark)

    Hjørland, Birger

    2017-01-01

    This article presents and discusses definitions of the term “classification” and the related concepts “concept/conceptualization,” “categorization,” “ordering,” “taxonomy” and “typology.” It further presents and discusses theories of classification including the influences of Aristotle...... and Wittgenstein. It presents different views on forming classes, including logical division, numerical taxonomy, historical classification, hermeneutical and pragmatic/critical views. Finally, issues related to artificial versus natural classification and taxonomic monism versus taxonomic pluralism are briefly...

  4. Hyperspectral image classification based on filtering: a comparative study

    Science.gov (United States)

    Cao, Xianghai; Ji, Beibei; Ji, Yamei; Wang, Lin; Jiao, Licheng

    2017-07-01

    The classification of hyperspectral images benefits greatly from integration of spectral information and spatial context. There have been many means to incorporate spatial information into the classification, such as the Markov random field, extended morphological profiles, and segmentation-based methods. Recently, spatial filtering was introduced to improve the classification accuracy of hyperspectral images. Compared with other spectral-spatial algorithms, spatial filtering is simple and easy to implement. This advantage makes it suitable for practical applications. However, spatial filtering has not been given enough attention. A comprehensive comparative study of spatial filtering is conducted. Specifically, 10 kinds of filters are used to smooth the hyperspectral images and the classified maps, respectively. The experimental results show that most filtering-based classification methods perform well with high efficiency.

  5. Image Analysis and Classification Based on Soil Strength

    Science.gov (United States)

    2016-08-01

    Based on Soil Strength. Cold Regions Research and Engineering Laboratory. Ariana M. Sopher, Sally A. Shoop, Jesse M...delineation, forestry, geology, and landslide potential. However, image classification for physical properties of surface soils, such as strength or...

  6. EPA's program for risk assessment guidelines: Cancer classification issues

    Energy Technology Data Exchange (ETDEWEB)

    Wiltse, J. [Environmental Protection Agency, Washington, DC (United States)

    1990-12-31

    Issues presented are related to classification of weight of evidence in cancer risk assessments. The focus in this paper is on lines of evidence used in constructing a conclusion about potential human carcinogenicity. The paper also discusses issues that are mistakenly addressed as classification issues but are really part of the risk assessment process. 2 figs.

  7. An unbalanced spectra classification method based on entropy

    Science.gov (United States)

    Liu, Zhong-bao; Zhao, Wen-juan

    2017-05-01

    Distinguishing minority spectra from the majority of spectra is an important problem in astronomy. In view of this, an unbalanced spectra classification method based on entropy (USCM) is proposed in this paper to deal with the unbalanced spectra classification problem. USCM greatly improves the performance of traditional classifiers in distinguishing minority spectra because it takes the data distribution into consideration during classification. However, its time complexity is exponential in the training size, so it can only handle small- and medium-scale classification problems. Solving the large-scale classification problem is therefore important for USCM. It can be shown by mathematical computation that the dual form of USCM is equivalent to the minimum enclosing ball (MEB) problem; the core vector machine (CVM) is therefore introduced, and USCM based on CVM is proposed to deal with the large-scale classification problem. Several comparative experiments on 4 subclasses of K-type spectra, 3 subclasses of F-type spectra and 3 subclasses of G-type spectra from the Sloan Digital Sky Survey (SDSS) verify that USCM and USCM based on CVM perform better than kNN (k nearest neighbor) and SVM (support vector machine) in rare spectra mining on the small- and medium-scale datasets and on the large-scale datasets, respectively.
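
    The minimum enclosing ball problem mentioned above can be approximated with a very simple iteration. The sketch below is a generic Badoiu-Clarkson style update in plain Euclidean space, shown only to illustrate what an approximate MEB is; it is not CVM itself and not the authors' USCM formulation, and the function name and the eps parameter are assumptions.

```python
import numpy as np

def approx_meb(points, eps=0.05):
    """Approximate the minimum enclosing ball of a point set.

    Repeatedly moves the center a shrinking step toward the current
    farthest point; about 1/eps**2 steps are used for a (1 + eps)
    approximation of the optimal radius.
    """
    pts = np.asarray(points, dtype=float)
    center = pts[0].copy()
    n_iter = int(np.ceil(1.0 / eps ** 2))
    for k in range(1, n_iter + 1):
        dists = np.linalg.norm(pts - center, axis=1)
        farthest = pts[np.argmax(dists)]
        center += (farthest - center) / (k + 1)
    radius = np.linalg.norm(pts - center, axis=1).max()
    return center, radius

# Corners of a 2 x 2 square: the exact ball has center (1, 1), radius sqrt(2).
center, radius = approx_meb([(0, 0), (2, 0), (0, 2), (2, 2)])
```

    Each step only needs the farthest point from the current center, which is the kind of property that makes MEB-based methods attractive for large training sets.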

  8. Efficient molecular subtype classification of high-grade serous ovarian cancer.

    Science.gov (United States)

    Leong, Huei San; Galletta, Laura; Etemadmoghadam, Dariush; George, Joshy; Köbel, Martin; Ramus, Susan J; Bowtell, David

    2015-07-01

    High-grade serous carcinomas (HGSCs) account for approximately 70% of all epithelial ovarian cancers diagnosed. Using microarray gene expression profiling, we previously identified four molecular subtypes of HGSC: C1 (mesenchymal), C2 (immunoreactive), C4 (differentiated), and C5 (proliferative), which correlate with patient survival and have distinct biological features. Here, we describe molecular classification of HGSC based on a limited number of genes to allow cost-effective and high-throughput subtype analysis. We determined a minimal signature for accurate classification, including 39 differentially expressed and nine control genes from microarray experiments. Taqman-based (low-density arrays and Fluidigm), fluorescent oligonucleotides (Nanostring), and targeted RNA sequencing (Illumina) assays were then compared for their ability to correctly classify fresh and formalin-fixed, paraffin-embedded samples. All platforms achieved > 90% classification accuracy with RNA from fresh frozen samples. The Illumina and Nanostring assays were superior with fixed material. We found that the C1, C2, and C4 molecular subtypes were largely consistent across multiple surgical deposits from individual chemo-naive patients. In contrast, we observed substantial subtype heterogeneity in patients whose primary ovarian sample was classified as C5. The development of an efficient molecular classifier of HGSC should enable further biological characterization of molecular subtypes and the development of targeted clinical trials. Copyright © 2015 Pathological Society of Great Britain and Ireland. Published by John Wiley & Sons, Ltd.

  9. Gene expression classification of colon cancer into molecular subtypes: characterization, validation, and prognostic value.

    Directory of Open Access Journals (Sweden)

    Laetitia Marisa

    Full Text Available Colon cancer (CC) pathological staging fails to accurately predict recurrence, and to date, no gene expression signature has proven reliable for prognosis stratification in clinical practice, perhaps because CC is a heterogeneous disease. The aim of this study was to establish a comprehensive molecular classification of CC based on mRNA expression profile analyses. Fresh-frozen primary tumor samples from a large multicenter cohort of 750 patients with stage I to IV CC who underwent surgery between 1987 and 2007 in seven centers were characterized for common DNA alterations, including BRAF, KRAS, and TP53 mutations, CpG island methylator phenotype, mismatch repair status, and chromosomal instability status, and were screened with whole genome and transcriptome arrays. 566 samples fulfilled RNA quality requirements. Unsupervised consensus hierarchical clustering applied to gene expression data from a discovery subset of 443 CC samples identified six molecular subtypes. These subtypes were associated with distinct clinicopathological characteristics, molecular alterations, specific enrichments of supervised gene expression signatures (stem cell phenotype-like, normal-like, serrated CC phenotype-like), and deregulated signaling pathways. Based on their main biological characteristics, we distinguished a deficient mismatch repair subtype, a KRAS mutant subtype, a cancer stem cell subtype, and three chromosomal instability subtypes, including one associated with down-regulated immune pathways, one with up-regulation of the Wnt pathway, and one displaying a normal-like gene expression profile. The classification was validated in the remaining 123 samples plus an independent set of 1,058 CC samples, including eight public datasets. Furthermore, prognosis was analyzed in the subset of stage II-III CC samples. The subtypes C4 and C6, but not the subtypes C1, C2, C3, and C5, were independently associated with shorter relapse-free survival, even after

  10. Indoor scene classification of robot vision based on cloud computing

    Science.gov (United States)

    Hu, Tao; Qi, Yuxiao; Li, Shipeng

    2016-07-01

    For intelligent service robots, indoor scene classification is an important issue. To overcome the weak real-time performance of conventional algorithms, a new method based on cloud computing is proposed for global image features in indoor scene classification. With the MapReduce method, the global PHOG feature of an indoor scene image is extracted in parallel, and the feature eigenvector is used concurrently to train the decision classifier through SVM. The indoor scene is then classified by the decision classifier. To verify the algorithm performance, we carried out an experiment with 350 typical indoor scene images from the MIT LabelMe image library. Experimental results show that the proposed algorithm can attain better real-time performance. Generally, it is 1.4 to 2.1 times faster than traditional classification methods that rely on single-machine computation, while keeping a stable correct classification rate of 70%.

  11. ART-Based Neural Networks for Multi-label Classification

    Science.gov (United States)

    Sapozhnikova, Elena P.

    Multi-label classification is an active and rapidly developing research area of data analysis. It becomes increasingly important in such fields as gene function prediction, text classification or web mining. This task corresponds to classification of instances labeled by multiple classes rather than just one. Traditionally, it was solved by learning independent binary classifiers for each class and combining their outputs to obtain multi-label predictions. Alternatively, a classifier can be directly trained to predict a label set of an unknown size for each unseen instance. Recently, several direct multi-label machine learning algorithms have been proposed. This paper presents a novel approach based on ART (Adaptive Resonance Theory) neural networks. The Fuzzy ARTMAP and ARAM algorithms were modified in order to improve their multi-label classification performance and were evaluated on benchmark datasets. Comparison of experimental results with the results of other multi-label classifiers shows the effectiveness of the proposed approach.

  12. Visualization and tissue classification of human breast cancer images using ultrahigh-resolution OCT.

    Science.gov (United States)

    Yao, Xinwen; Gan, Yu; Chang, Ernest; Hibshoosh, Hanina; Feldman, Sheldon; Hendon, Christine

    2017-03-01

    Breast cancer is one of the most common cancers, and recognized as the third leading cause of mortality in women. Optical coherence tomography (OCT) enables three dimensional visualization of biological tissue with micrometer level resolution at high speed, and can play an important role in early diagnosis and treatment guidance of breast cancer. In particular, ultra-high resolution (UHR) OCT provides images with better histological correlation. This paper compared UHR OCT performance with standard OCT in breast cancer imaging qualitatively and quantitatively. Automatic tissue classification algorithms were used to automatically detect invasive ductal carcinoma in ex vivo human breast tissue. Human breast tissues, including non-neoplastic/normal tissues from breast reduction and tumor samples from mastectomy specimens, were excised from patients at Columbia University Medical Center. The tissue specimens were imaged by two spectral domain OCT systems at different wavelengths: a home-built ultra-high resolution (UHR) OCT system at 800 nm (measured as 2.72 μm axial and 5.52 μm lateral) and a commercial OCT system at 1,300 nm with standard resolution (measured as 6.5 μm axial and 15 μm lateral), and their imaging performances were analyzed qualitatively. Using regional features derived from OCT images produced by the two systems, we developed an automated classification algorithm based on relevance vector machine (RVM) to differentiate hollow-structured adipose tissue against solid tissue. We further developed B-scan based features for RVM to classify invasive ductal carcinoma (IDC) against normal fibrous stroma tissue among OCT datasets produced by the two systems. For adipose classification, 32 UHR OCT B-scans from 9 normal specimens, and 28 standard OCT B-scans from 6 normal and 4 IDC specimens were employed. For IDC classification, 152 UHR OCT B-scans from 6 normal and 13 IDC specimens, and 104 standard OCT B-scans from 5 normal and 8 IDC specimens

  13. The Study of Land Use Classification Based on SPOT6 High Resolution Data

    OpenAIRE

    Wu Song; Jiang Qigang

    2016-01-01

    A method is presented for quickly classifying and extracting land use types in agricultural areas, based on SPOT6 high resolution remote sensing data and the good nonlinear classification ability of the support vector machine. The results show that the SPOT6 high resolution remote sensing data can realize land classification efficiently: the overall classification accuracy reached 88.79% and the Kappa coefficient is 0.8632, which means that the classif...
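
    Overall accuracy and the Kappa coefficient are both read off a confusion matrix. The sketch below shows the standard Cohen's kappa computation on a small made-up matrix; the numbers are illustrative only and are not the study's data.

```python
import numpy as np

def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    Rows are reference classes, columns are predicted classes.
    """
    cm = np.asarray(confusion, dtype=float)
    n = cm.sum()
    observed = np.trace(cm) / n
    # Agreement expected by chance from the row and column marginals.
    expected = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n ** 2
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa

# Illustrative two-class matrix (hypothetical counts).
accuracy, kappa = overall_accuracy_and_kappa([[45, 5], [10, 40]])
```

    Kappa discounts the agreement that would occur by chance, which is why it is usually lower than the overall accuracy.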

  14. Immunogenomic Classification of Colorectal Cancer and Therapeutic Implications

    Directory of Open Access Journals (Sweden)

    Jessica Roelands

    2017-10-01

    Full Text Available The immune system has a substantial effect on colorectal cancer (CRC) progression. Additionally, the response to immunotherapeutics and conventional treatment options (e.g., chemotherapy, radiotherapy and targeted therapies) is influenced by the immune system. The molecular characterization of colorectal cancer (CRC) has led to the identification of favorable and unfavorable immunological attributes linked to clinical outcome. With the definition of consensus molecular subtypes (CMSs) based on transcriptomic profiles, multiple characteristics have been proposed to be responsible for the development of the tumor immune microenvironment and corresponding mechanisms of immune escape. In this review, a detailed description of proposed immune phenotypes as well as their interaction with different therapeutic modalities will be provided. Finally, possible strategies to shift the CRC immune phenotype towards a reactive, anti-tumor orientation are proposed per CMS.

  15. Concept-based semi-automatic classification of drugs.

    Science.gov (United States)

    Gurulingappa, Harsha; Kolárik, Corinna; Hofmann-Apitius, Martin; Fluck, Juliane

    2009-08-01

    The anatomical therapeutic chemical (ATC) classification system maintained by the World Health Organization provides a global standard for the classification of medical substances and serves as a source for drug repurposing research. Nevertheless, it lacks several drugs that are major players in the global drug market. In order to establish classifications for yet unclassified drugs, this paper presents a newly developed approach based on a combination of information extraction (IE) and machine learning (ML) techniques. Most of the information about drugs is published in scientific articles. Therefore, an IE-based framework is employed to extract terms from free text that express a drug's chemical, pharmacological, therapeutic, and systemic effects. The extracted terms are used as features within an ML framework to predict putative ATC class labels for unclassified drugs. The system was tested on a portion of ATC containing drugs with an indication on the cardiovascular system. The class prediction turned out to be successful with the best predictive accuracy of 89.47% validated by a 100-fold bootstrapping of the training set and an accuracy of 77.12% on an independent test set. The presented concept-based classification system outperformed state-of-the-art classification methods based on chemical structure properties.

  16. Lesion margin analysis for automated classification of cervical cancer lesions

    Science.gov (United States)

    Van Raad, Viara; Xue, Zhiyun; Lange, Holger

    2006-03-01

    Digital colposcopy is an emerging technology, replacing the traditional colposcope for diagnosis of cervical lesions. Incorporating automated algorithms within a digital colposcopy system can improve the reliability and the diagnostic accuracy for cervical precancer and cancer. An automated computer-aided diagnosis (CAD) system can assess the three important cervical diagnostic cues: the color, the vascular patterns and the lesion margins with quantitative measures, similar to the way colposcopists use the Reid's index in traditional colposcopy. In this work we present a novel way to analyze and classify the global and the local features of one of the three major components in colposcopy diagnosis - the lesion margins. The margins of a cervical lesion can be described as 'feathered,' 'geographic,' 'satellite,' 'regular or smooth' and 'margin-in-margin,' or they can be of mixed type. As margin characterization is a complex task, we use irregularity descriptors such as compactness indices and curvature descriptors. To address the complexity of the problem, the dependency on scale and the position of the lesion on the cervical image, our method uses novel Fourier energy descriptors. The conceptually complex analysis of describing lesions as 'satellite' lesions or lesions with multiple margins is performed using descriptors in which the distance, the position and the local statistical estimates of image intensity play an important role. We trained this new algorithm to classify and diagnose the cervix, evaluating only the lesions. The accuracy of the results is assessed against a 'ground truth' scheme, using colposcopists' annotations and pathology results. We report the resulting accuracy of the classification method assessed against this scheme.

  17. On the International Agency for Research on Cancer classification of glyphosate as a probable human carcinogen.

    Science.gov (United States)

    Tarone, Robert E

    2017-11-08

    The recent classification by International Agency for Research on Cancer (IARC) of the herbicide glyphosate as a probable human carcinogen has generated considerable discussion. The classification is at variance with evaluations of the carcinogenic potential of glyphosate by several national and international regulatory bodies. The basis for the IARC classification is examined under the assumptions that the IARC criteria are reasonable and that the body of scientific studies determined by IARC staff to be relevant to the evaluation of glyphosate by the Monograph Working Group is sufficiently complete. It is shown that the classification of glyphosate as a probable human carcinogen was the result of a flawed and incomplete summary of the experimental evidence evaluated by the Working Group. Rational and effective cancer prevention activities depend on scientifically sound and unbiased assessments of the carcinogenic potential of suspected agents. Implications of the erroneous classification of glyphosate with respect to the IARC Monograph Working Group deliberative process are discussed.

  18. Intelligence system based classification approach for medical disease diagnosis

    Science.gov (United States)

    Sagir, Abdu Masanawa; Sathasivam, Saratha

    2017-08-01

    The prediction of breast cancer in women who have no signs or symptoms of the disease, as well as of survivability after certain surgery, has been a challenging problem for medical researchers. The decision about the presence or absence of disease depends more on the physician's intuition, experience and skill in comparing current indicators with previous ones than on the knowledge-rich data hidden in a database. This is a crucial and challenging task. The goal is to predict patient condition by using an adaptive neuro fuzzy inference system (ANFIS) pre-processed by grid partitioning. To achieve an accurate diagnosis at this complex stage of symptom analysis, the physician may need an efficient diagnosis system. A framework is described for designing and evaluating the classification performance of two discrete ANFIS systems with hybrid learning algorithms that pair least square estimates with the Modified Levenberg-Marquardt and gradient descent algorithms, which can be used by physicians to accelerate the diagnosis process. The proposed method's performance was evaluated on training and test datasets using the mammographic mass and Haberman's survival datasets obtained from the benchmark datasets of the University of California at Irvine's (UCI) machine learning repository. The robustness of the performance is examined by measuring total accuracy, sensitivity and specificity. The proposed method achieves superior performance compared to the conventional ANFIS based on the gradient descent algorithm and some related existing methods. The software used for the implementation is MATLAB R2014a (version 8.3), executed on a PC with an Intel Pentium IV E7400 processor with 2.80 GHz speed and 2.0 GB of RAM.

  19. Atmospheric circulation classification comparison based on wildfires in Portugal

    Science.gov (United States)

    Pereira, M. G.; Trigo, R. M.

    2009-04-01

    Atmospheric circulation classifications are not a simple description of atmospheric states but a tool to understand and interpret the atmospheric processes and to model the relation between atmospheric circulation and surface climate and other related variables (Radan Huth et al., 2008). Classifications were initially developed for weather forecasting purposes; however, with the progress in computer processing capability, new and more robust objective methods were developed and applied to large datasets, making atmospheric circulation classification methods one of the most important fields in synoptic and statistical climatology. Classification studies have been extensively used in climate change studies (e.g. reconstructed past climates, recent observed changes and future climates), in bioclimatological research (e.g. relating human mortality to climatic factors) and in a wide variety of synoptic climatological applications (e.g. comparison between datasets, air pollution, snow avalanches, wine quality, fish captures and forest fires). Likewise, atmospheric circulation classifications are important for the study of the role of weather in wildfire occurrence in Portugal because the daily synoptic variability is the most important driver of local weather conditions (Pereira et al., 2005). In particular, the objective classification scheme developed by Trigo and DaCamara (2000) to classify the atmospheric circulation affecting Portugal has proved to be quite useful in discriminating the occurrence and development of wildfires as well as the distribution over Portugal of surface climatic variables with impact on wildfire activity such as maximum and minimum temperature and precipitation. This work aims to present: (i) an overview of the existing circulation classifications for the Iberian Peninsula, and (ii) the results of a comparison study between these atmospheric circulation classifications based on their relation with wildfires and relevant meteorological

  20. Classification of Gait Types Based on the Duty-factor

    DEFF Research Database (Denmark)

    Fihl, Preben; Moeslund, Thomas B.

    2007-01-01

    This paper deals with classification of human gait types based on the notion that different gait types are in fact different types of locomotion, i.e., running is not simply walking done faster. We present the duty-factor, which is a descriptor based on this notion. The duty-factor is independent...... on the speed of the human, the camera setup etc. and hence a robust descriptor for gait classification. The duty-factor is basically a matter of measuring the ground support of the feet with respect to the stride. We estimate this by comparing the incoming silhouettes to a database of silhouettes with known...
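
    The ratio of ground support to stride can be sketched as follows, assuming per-frame ground-contact labels for one foot over exactly one stride are already available. The paper estimates this from silhouettes instead; the function name and the frame sequence here are hypothetical.

```python
def duty_factor(contact_frames):
    """Fraction of one stride during which the foot is on the ground.

    contact_frames: sequence of truthy/falsy values, one per video frame,
    covering exactly one stride of one foot (assumed input format).
    """
    frames = list(contact_frames)
    if not frames:
        raise ValueError("at least one frame is required")
    return sum(1 for f in frames if f) / len(frames)

# Hypothetical stride: 6 frames of ground contact, 4 frames of swing.
df = duty_factor([1, 1, 1, 1, 1, 1, 0, 0, 0, 0])
```

    In the biomechanics literature, walking is commonly associated with a duty factor above 0.5 and running with a value below 0.5, which is what makes the ratio usable as a gait descriptor.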

  1. Classification in postural style based on stochastic process modeling.

    Science.gov (United States)

    Denis, Christophe

    2014-01-01

    We address the statistical challenge of classifying subjects as hemiplegic, vestibular or normal based on complex trajectories obtained through two experimental protocols designed to evaluate potential deficits in postural control. The classification procedure involves a dimension reduction step where the complex trajectories are summarized by finite-dimensional summary measures based on a stochastic process model for a real-valued trajectory. This allows us to retrieve from the trajectories information relative to their temporal dynamic. A leave-one-out evaluation yields a 79% performance of correct classification for a total of n=70 subjects, with 22 hemiplegic (31%), 16 vestibular (23%) and 32 normal (46%) subjects.

  2. Melancholia EEG classification based on CSSD and SVM

    Science.gov (United States)

    Shi, Jian-Jun; Yuan, Qing-Wu; Zhou, La-Wu

    2011-10-01

    Extracting disease information from the melancholia electroencephalograph (EEG) plays an important role. First, a common spatial subspace decomposition (CSSD) method was used to extract features from the 16-channel EEG of melancholia patients and normal healthy persons. Then, based on support vector machines (SVM), a classifier was designed, trained and tested for its ability to distinguish between melancholia patients and healthy persons. The results indicated that the proposed method can reach an accuracy as high as 95% in EEG classification, while the accuracy of the method based on wavelets is only 88%. That is, the proposed method is feasible for melancholia diagnosis and research.

  3. Ki-67 marker useful for classification of malignant invasive ductal breast cancer

    Directory of Open Access Journals (Sweden)

    Irmawati Hassan

    2015-12-01

    The study showed that invasive ductal breast cancer with a high Ki-67 index was significantly associated with a high grade of malignancy. A high Ki-67 marker index can be used for classification of the grade of malignancy of invasive ductal breast cancer.

  4. Automated Classification of Lung Cancer Types from Cytological Images Using Deep Convolutional Neural Networks

    Directory of Open Access Journals (Sweden)

    Atsushi Teramoto

    2017-01-01

    Full Text Available Lung cancer is a leading cause of death worldwide. Currently, in differential diagnosis of lung cancer, accurate classification of cancer types (adenocarcinoma, squamous cell carcinoma, and small cell carcinoma) is required. However, improving the accuracy and stability of diagnosis is challenging. In this study, we developed an automated classification scheme for lung cancers presented in microscopic images using a deep convolutional neural network (DCNN), which is a major deep learning technique. The DCNN used for classification consists of three convolutional layers, three pooling layers, and two fully connected layers. In evaluation experiments conducted, the DCNN was trained using our original database with a graphics processing unit. Microscopic images were first cropped and resampled to obtain images with resolution of 256 × 256 pixels and, to prevent overfitting, collected images were augmented via rotation, flipping, and filtering. The probabilities of three types of cancers were estimated using the developed scheme and its classification accuracy was evaluated using threefold cross validation. In the results obtained, approximately 71% of the images were classified correctly, which is on par with the accuracy of cytotechnologists and pathologists. Thus, the developed scheme is useful for classification of lung cancers from microscopic images.

  5. A critical appraisal of logistic regression-based nomograms, artificial neural networks, classification and regression-tree models, look-up tables and risk-group stratification models for prostate cancer.

    Science.gov (United States)

    Chun, Felix K-H; Karakiewicz, Pierre I; Briganti, Alberto; Walz, Jochen; Kattan, Michael W; Huland, Hartwig; Graefen, Markus

    2007-04-01

    To evaluate several methods of predicting prostate cancer-related outcomes, i.e. nomograms, look-up tables, artificial neural networks (ANN), classification and regression tree (CART) analyses and risk-group stratification (RGS) models, all of which represent valid alternatives. We present four direct comparisons, where a nomogram was compared to either an ANN, a look-up table, a CART model or a RGS model. In all comparisons we assessed the predictive accuracy and performance characteristics of both models. Nomograms have several advantages over ANN, look-up tables, CART and RGS models, the most fundamental being a higher predictive accuracy and better performance characteristics. These results suggest that nomograms are more accurate and have better performance characteristics than their alternatives. However, ANN, look-up tables, CART analyses and RGS models all rely on methodologically sound and valid alternatives, which should not be abandoned.

  6. Chinese Sentence Classification Based on Convolutional Neural Network

    Science.gov (United States)

    Gu, Chengwei; Wu, Ming; Zhang, Chuang

    2017-10-01

    Sentence classification is one of the significant issues in Natural Language Processing (NLP). Feature extraction is often regarded as the key point for natural language processing. Traditional machine learning methods, such as the Naive Bayesian model, cannot take high-level features into consideration. A neural network for sentence classification can make use of contextual information to achieve better results in sentence classification tasks. In this paper, we focus on classifying Chinese sentences and, most importantly, propose a novel Convolutional Neural Network (CNN) architecture for Chinese sentence classification. In particular, whereas most previous methods use a softmax classifier for prediction, we embed a linear support vector machine in place of softmax in the deep neural network model, minimizing a margin-based loss to get a better result. We also use tanh as the activation function instead of ReLU. The CNN model improves the results of Chinese sentence classification tasks. Experimental results on the Chinese news title database validate the effectiveness of our model.

  7. Multi-class cancer classification using multinomial probit regression with Bayesian gene selection.

    Science.gov (United States)

    Zhou, X; Wang, X; Dougherty, E R

    2006-03-01

    We consider the problems of multi-class cancer classification from gene expression data. After discussing the multinomial probit regression model with Bayesian gene selection, we propose two Bayesian gene selection schemes: one employs different strongest genes for different probit regressions; the other employs the same strongest genes for all regressions. Some fast implementation issues for Bayesian gene selection are discussed, including preselection of the strongest genes and recursive computation of the estimation errors using QR decomposition. The proposed gene selection techniques are applied to analyse real breast cancer data, small round blue-cell tumours, the national cancer institute's anti-cancer drug-screen data and acute leukaemia data. Compared with existing multi-class cancer classifications, our proposed methods can find which genes are the most important genes affecting which kind of cancer. Also, the strongest genes selected using our methods are consistent with the biological significance. The recognition accuracies are very high using our proposed methods.

  8. Efficacy of the Kyoto Classification of Gastritis in Identifying Patients at High Risk for Gastric Cancer.

    Science.gov (United States)

    Sugimoto, Mitsushige; Ban, Hiromitsu; Ichikawa, Hitomi; Sahara, Shu; Otsuka, Taketo; Inatomi, Osamu; Bamba, Shigeki; Furuta, Takahisa; Andoh, Akira

    2017-01-01

    Objective The Kyoto gastritis classification categorizes the endoscopic characteristics of Helicobacter pylori (H. pylori) infection-associated gastritis and identifies patterns associated with a high risk of gastric cancer. We investigated its efficacy, comparing scores in patients with H. pylori-associated gastritis and with gastric cancer. Methods A total of 1,200 patients with H. pylori-positive gastritis alone (n=932), early-stage H. pylori-positive gastric cancer (n=189), and successfully treated H. pylori-negative cancer (n=79) were endoscopically graded according to the Kyoto gastritis classification for atrophy, intestinal metaplasia, fold hypertrophy, nodularity, and diffuse redness. Results The prevalence of O-II/O-III-type atrophy according to the Kimura-Takemoto classification in early-stage H. pylori-positive gastric cancer and successfully treated H. pylori-negative cancer groups was 45.1%, which was significantly higher than in subjects with gastritis alone (12.7%, p [...] gastritis scores of atrophy and intestinal metaplasia in the H. pylori-positive cancer group were significantly higher than in subjects with gastritis alone (all p [...] gastritis classification may thus be useful for detecting these patients.

  9. Classification and Target Group Selection Based Upon Frequent Patterns

    NARCIS (Netherlands)

    W.H.L.M. Pijls (Wim); R. Potharst (Rob)

    2000-01-01

    In this technical report, two new algorithms based upon frequent patterns are proposed. One algorithm is a classification method. The other one is an algorithm for target group selection. In both algorithms, first of all, the collection of frequent patterns in the training set is

  10. TENSOR MODELING BASED FOR AIRBORNE LiDAR DATA CLASSIFICATION

    OpenAIRE

    Li, N.; Liu, C; Pfeifer, N; Yin, J. F.; Liao, Z.Y.; Zhou, Y

    2016-01-01

    Feature selection and description is a key factor in classification of Earth observation data. In this paper a classification method based on tensor decomposition is proposed. First, multiple features are extracted from raw LiDAR point cloud, and raster LiDAR images are derived by accumulating features or the “raw” data attributes. Then, the feature rasters of LiDAR data are stored as a tensor, and tensor decomposition is used to select component features. This tensor representation could kee...

  11. Bladder cancer: Analysis of the 2004 WHO classification in ...

    African Journals Online (AJOL)

    schistosomal associated BCA as well as compare our findings with the 2004 WHO consensus classification of urothelial neoplasms and with other publications. Patients and methods: The archival materials of 180 urinary bladder specimens were ...

  12. Binary Classification of Multigranulation Searching Algorithm Based on Probabilistic Decision

    Directory of Open Access Journals (Sweden)

    Qinghua Zhang

    2016-01-01

    Full Text Available Multigranulation computing, which adequately embodies the model of human intelligence in the process of solving complex problems, aims to decompose a complex problem into many subproblems in different granularity spaces; the subproblems are then solved and synthesized to obtain the solution of the original problem. In this paper, an efficient multigranulation searching algorithm for binary classification is established, which has the optimal mathematical expectation of classification times for classifying the objects of the whole domain. It can solve binary classification problems, such as the blood analysis case, based on both the multigranulation computing mechanism and the principles of probability and statistics. Given the binary classifier, the negative sample ratio, and the total number of objects in the domain, this model can search for the minimum mathematical expectation of classification times and the optimal classification granularity spaces for mining all the negative samples. The experimental results demonstrate that, as the granules are divided into more subgranules, the efficiency of the proposed method gradually increases and tends to become stable. In addition, the complexity of solving the problem is greatly reduced.
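
    The idea of minimizing the expected number of classifications can be illustrated with the classic single-level pooled screening scheme from blood testing. This is a textbook simplification, not the authors' multigranulation algorithm: objects are pooled in groups of size k, each pool is tested once, and every member of a positive pool is then retested individually. The function names and the rate `p` of the rare class are assumptions made for this sketch.

```python
# Single-level pooled screening; a simplified stand-in for the paper's
# multi-level search, with hypothetical function names.

def expected_tests_per_object(p, k):
    """Expected number of tests per object for pool size k.

    p is the probability that one object belongs to the rare class.
    One pooled test is shared by k objects (1/k each); with probability
    1 - (1 - p)**k the pool is positive and each member is retested.
    """
    if k == 1:
        return 1.0  # no pooling: one test per object
    return 1.0 / k + 1.0 - (1.0 - p) ** k


def best_group_size(p, max_k=100):
    """Pool size in 1..max_k that minimizes the expected tests per object."""
    return min(range(1, max_k + 1),
               key=lambda k: expected_tests_per_object(p, k))


rate = 0.01
k_opt = best_group_size(rate)
cost = expected_tests_per_object(rate, k_opt)
```

    With a 1% rate the best pool size under these assumptions is 11, at roughly 0.196 tests per object instead of 1, which shows why coarser granules can pay off when the rare class is sparse.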

  13. An enhanced topologically significant directed random walk in cancer classification using gene expression datasets

    Directory of Open Access Journals (Sweden)

    Choon Sen Seah

    2017-12-01

    Full Text Available Microarray technology has become one of the elementary tools for researchers to study the genome of organisms. As the complexity and heterogeneity of cancer are being increasingly appreciated through genomic analysis, cancer classification is an emerging and important trend. Significant directed random walk is proposed as a cancer classification approach that has higher sensitivity of risk gene prediction and higher accuracy of cancer classification. In this paper, the methodology and material used for the experiment are presented. A tuning parameter selection method and weight as a parameter are applied in the proposed approach. A gene expression dataset is used as the input dataset, while a pathway dataset is used as the reference dataset to build a directed graph and complete the bias process in the random walk approach. In addition, we demonstrate that our approach can improve sensitive predictions with higher accuracy and biologically meaningful classification results. Significant directed random walk is compared with directed random walk to show the improvement in terms of sensitivity of prediction and accuracy of cancer classification.

  14. Apple Shape Classification Method Based on Wavelet Moment

    Directory of Open Access Journals (Sweden)

    Jiangsheng Gui

    2014-09-01

    Full Text Available Shape is not only an important indicator for assessing the grade of an apple, but also an important factor in increasing its value. In order to improve the accuracy of apple shape classification, an approach for apple shape sorting based on wavelet moments was proposed. The image was first subjected to a normalization process using its regular moments to obtain scale and translation invariance; the rotation-invariant wavelet moment features were then extracted from the scale- and translation-normalized images, and cluster analysis was used to complete the shape classification. This method performs better than traditional approaches such as Fourier descriptors and Zernike moments because wavelet moments can provide both time-domain and frequency-domain windows, which was verified by experiments. The classification accuracies for normal fruit shape, mild deformity and severe deformity are 86.21%, 85.82% and 90.81%, respectively, with our method.

  15. Emotion classification based on gamma-band EEG.

    Science.gov (United States)

    Li, Mu; Lu, Bao-Liang

    2009-01-01

    In this paper, we use EEG signals to classify two emotions: happiness and sadness. These emotions are evoked by showing subjects pictures of smiling and crying facial expressions. We propose a frequency band searching method to choose an optimal band into which the recorded EEG signal is filtered. We use common spatial patterns (CSP) and a linear SVM to classify these two emotions. To investigate the time resolution of classification, we explore two kinds of trials with lengths of 3 s and 1 s. Classification accuracies of 93.5% ± 6.7% and 93.0% ± 6.2% are achieved on 10 subjects for 3 s trials and 1 s trials, respectively. Our experimental results indicate that the gamma band (roughly 30-100 Hz) is suitable for EEG-based emotion classification.

  16. Classification of breast cancer histology images using Convolutional Neural Networks

    National Research Council Canada - National Science Library

    Teresa Araújo; Guilherme Aresta; Eduardo Castro; José Rouco; Paulo Aguiar; Catarina Eloy; António Polónia; Aurélio Campilho

    2017-01-01

    Breast cancer is one of the main causes of cancer death worldwide. The diagnosis of biopsy tissue with hematoxylin and eosin stained images is non-trivial and specialists often disagree on the final diagnosis...

  17. Improving EEG-Based Emotion Classification Using Conditional Transfer Learning

    Directory of Open Access Journals (Sweden)

    Yuan-Pin Lin

    2017-06-01

    Full Text Available To overcome the individual differences, an accurate electroencephalogram (EEG)-based emotion-classification system requires a considerable amount of ecological calibration data for each individual, which is labor-intensive and time-consuming. Transfer learning (TL) has drawn increasing attention in the field of EEG signal mining in recent years. The TL leverages existing data collected from other people to build a model for a new individual with little calibration data. However, brute-force transfer to an individual (i.e., blindly leveraged the labeled data from others) may lead to a negative transfer that degrades performance rather than improving it. This study thus proposed a conditional TL (cTL) framework to facilitate a positive transfer (improving subject-specific performance without increasing the labeled data) for each individual. The cTL first assesses an individual’s transferability for positive transfer and then selectively leverages the data from others with comparable feature spaces. The empirical results showed that among 26 individuals, the proposed cTL framework identified 16 and 14 transferable individuals who could benefit from the data from others for emotion valence and arousal classification, respectively. These transferable individuals could then leverage the data from 18 and 12 individuals who had similar EEG signatures to attain maximal TL improvements in valence- and arousal-classification accuracy. The cTL improved the overall classification performance of 26 individuals by ~15% for valence categorization and ~12% for arousal counterpart, as compared to their default performance based solely on the subject-specific data. This study evidently demonstrated the feasibility of the proposed cTL framework for improving an individual’s default emotion-classification performance given a data repository. The cTL framework may shed light on the development of a robust emotion-classification model using fewer labeled subject

  18. Improving EEG-Based Emotion Classification Using Conditional Transfer Learning.

    Science.gov (United States)

    Lin, Yuan-Pin; Jung, Tzyy-Ping

    2017-01-01

    To overcome the individual differences, an accurate electroencephalogram (EEG)-based emotion-classification system requires a considerable amount of ecological calibration data for each individual, which is labor-intensive and time-consuming. Transfer learning (TL) has drawn increasing attention in the field of EEG signal mining in recent years. The TL leverages existing data collected from other people to build a model for a new individual with little calibration data. However, brute-force transfer to an individual (i.e., blindly leveraged the labeled data from others) may lead to a negative transfer that degrades performance rather than improving it. This study thus proposed a conditional TL (cTL) framework to facilitate a positive transfer (improving subject-specific performance without increasing the labeled data) for each individual. The cTL first assesses an individual's transferability for positive transfer and then selectively leverages the data from others with comparable feature spaces. The empirical results showed that among 26 individuals, the proposed cTL framework identified 16 and 14 transferable individuals who could benefit from the data from others for emotion valence and arousal classification, respectively. These transferable individuals could then leverage the data from 18 and 12 individuals who had similar EEG signatures to attain maximal TL improvements in valence- and arousal-classification accuracy. The cTL improved the overall classification performance of 26 individuals by ~15% for valence categorization and ~12% for arousal counterpart, as compared to their default performance based solely on the subject-specific data. This study evidently demonstrated the feasibility of the proposed cTL framework for improving an individual's default emotion-classification performance given a data repository. The cTL framework may shed light on the development of a robust emotion-classification model using fewer labeled subject-specific data toward a

  19. Deep learning for EEG-Based preference classification

    Science.gov (United States)

    Teo, Jason; Hou, Chew Lin; Mountstephens, James

    2017-10-01

    Electroencephalogram (EEG)-based emotion classification is rapidly becoming one of the most intensely studied areas of brain-computer interfacing (BCI). The ability to passively identify yet accurately correlate brainwaves with our immediate emotions opens up truly meaningful and previously unattainable human-computer interactions such as in forensic neuroscience, rehabilitative medicine, affective entertainment and neuro-marketing. One particularly useful yet rarely explored area of EEG-based emotion classification is preference recognition [1], which is simply the detection of like versus dislike. Within the limited investigations into preference classification, all reported studies were based on musically-induced stimuli except for a single study which used 2D images. The main objective of this study is to apply deep learning, which has been shown to produce state-of-the-art results in diverse hard problems such as in computer vision, natural language processing and audio recognition, to 3D object preference classification over a larger group of test subjects. A cohort of 16 users was shown 60 bracelet-like objects as rotating visual stimuli on a computer display while their preferences and EEGs were recorded. After training a variety of machine learning approaches which included deep neural networks, we then attempted to classify the users' preferences for the 3D visual stimuli based on their EEGs. Here, we show that deep learning outperforms a variety of other machine learning classifiers for this EEG-based preference classification task, particularly in a highly challenging dataset with large inter- and intra-subject variability.

  20. Fast rule-based bioactivity prediction using associative classification mining

    Directory of Open Access Journals (Sweden)

    Yu Pulan

    2012-11-01

    Full Text Available Abstract Relating chemical features to bioactivities is critical in molecular design and is used extensively in the lead discovery and optimization process. A variety of techniques from statistics, data mining and machine learning have been applied to this process. In this study, we utilize a collection of methods, called associative classification mining (ACM), which are popular in the data mining community, but so far have not been applied widely in cheminformatics. More specifically, classification based on predictive association rules (CPAR), classification based on multiple association rules (CMAR) and classification based on association rules (CBA) are employed on three datasets using various descriptor sets. Experimental evaluations on anti-tuberculosis (antiTB), mutagenicity and hERG (the human Ether-a-go-go-Related Gene) blocker datasets show that these three methods are computationally scalable and appropriate for high speed mining. Additionally, they provide comparable accuracy and efficiency to the commonly used Bayesian and support vector machines (SVM) methods, and produce highly interpretable models.

  1. Fast rule-based bioactivity prediction using associative classification mining.

    Science.gov (United States)

    Yu, Pulan; Wild, David J

    2012-11-23

    Relating chemical features to bioactivities is critical in molecular design and is used extensively in the lead discovery and optimization process. A variety of techniques from statistics, data mining and machine learning have been applied to this process. In this study, we utilize a collection of methods, called associative classification mining (ACM), which are popular in the data mining community, but so far have not been applied widely in cheminformatics. More specifically, classification based on predictive association rules (CPAR), classification based on multiple association rules (CMAR) and classification based on association rules (CBA) are employed on three datasets using various descriptor sets. Experimental evaluations on anti-tuberculosis (antiTB), mutagenicity and hERG (the human Ether-a-go-go-Related Gene) blocker datasets show that these three methods are computationally scalable and appropriate for high speed mining. Additionally, they provide comparable accuracy and efficiency to the commonly used Bayesian and support vector machines (SVM) methods, and produce highly interpretable models.
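
    The general idea behind associative classification, namely mining "feature set implies class" rules by support and confidence and then classifying with the best matching rule, can be sketched as below. This is a deliberately simplified illustration and not an implementation of CPAR, CMAR or CBA; the descriptor names, thresholds and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def mine_rules(records, min_support=0.3, min_confidence=0.8, max_len=2):
    """Mine class association rules from (feature_set, label) records.

    Returns a list of (itemset, label, support, confidence) tuples sorted
    by confidence, then support, then shorter itemsets first.
    """
    n = len(records)
    itemset_counts = Counter()   # how often each itemset occurs at all
    rule_counts = Counter()      # how often it occurs together with a label
    for features, label in records:
        for size in range(1, max_len + 1):
            for itemset in combinations(sorted(features), size):
                itemset_counts[itemset] += 1
                rule_counts[(itemset, label)] += 1
    rules = []
    for (itemset, label), count in rule_counts.items():
        support = count / n
        confidence = count / itemset_counts[itemset]
        if support >= min_support and confidence >= min_confidence:
            rules.append((itemset, label, support, confidence))
    rules.sort(key=lambda r: (-r[3], -r[2], len(r[0])))
    return rules


def classify(rules, features, default=None):
    """Return the label of the first (best-ranked) rule that matches."""
    for itemset, label, _, _ in rules:
        if set(itemset) <= features:
            return label
    return default


# Toy compounds described by made-up substructure descriptors.
records = [
    ({"nitro", "aromatic"}, "active"),
    ({"nitro", "halogen"}, "active"),
    ({"nitro", "aromatic", "halogen"}, "active"),
    ({"aromatic", "hydroxyl"}, "inactive"),
    ({"hydroxyl", "halogen"}, "inactive"),
    ({"hydroxyl"}, "inactive"),
]
rules = mine_rules(records)
prediction = classify(rules, {"hydroxyl", "aromatic"})
```

    Because every prediction traces back to one explicit rule, models of this kind are easy to inspect, which matches the interpretability claim in the abstract.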

  2. Energy-efficiency based classification of the manufacturing workstation

    Science.gov (United States)

    Frumuşanu, G.; Afteni, C.; Badea, N.; Epureanu, A.

    2017-08-01

    EU Directive 92/75/EC established for the first time an energy consumption labelling scheme, further implemented by several other directives. As a consequence, nowadays many products (e.g. home appliances, tyres, light bulbs, houses) have an EU Energy Label when offered for sale or rent. Several energy consumption models of manufacturing equipment have also been developed. This paper proposes an energy-efficiency-based classification of the manufacturing workstation, aiming to characterize its energetic behaviour. The concept of energy efficiency of the manufacturing workstation is defined. On this basis, a classification methodology has been developed. It covers specific criteria and the ways of evaluating them, together with the definition and delimitation of energy efficiency classes. The position of the energy class is defined by the amount of energy needed by the workstation at the middle point of its operating domain, while its extension is determined by the value of the first coefficient of the Taylor series that approximates the dependence between the energy consumption and the chosen parameter of the working regime. The main domain of interest for this classification appears to be the optimization of the planning and programming of manufacturing activities. A case study on the classification of an actual lathe from the energy efficiency point of view, based on two different approaches (analytical and numerical), is also included.

  3. Multi-category classification using an Extreme Learning Machine for microarray gene expression cancer diagnosis.

    Science.gov (United States)

    Zhang, Runxuan; Huang, Guang-Bin; Sundararajan, Narasimhan; Saratchandran, P

    2007-01-01

    In this paper, the recently developed Extreme Learning Machine (ELM) is used for direct multicategory classification problems in the cancer diagnosis area. ELM avoids problems like local minima, improper learning rate and overfitting commonly faced by iterative learning methods and completes the training very fast. We have evaluated the multi-category classification performance of ELM on three benchmark microarray datasets for cancer diagnosis, namely, the GCM dataset, the Lung dataset and the Lymphoma dataset. The results indicate that ELM produces comparable or better classification accuracies with reduced training time and implementation complexity compared to artificial neural networks methods like conventional back-propagation ANN, Linder's SANN, and Support Vector Machine methods like SVM-OVO and Ramaswamy's SVM-OVA. ELM also achieves better accuracies for classification of individual categories.
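
    A rough sketch of how an Extreme Learning Machine is trained may help here: the hidden-layer weights are drawn at random and left untouched, and only the output weights are obtained in a single linear solve. The sketch below makes several assumptions of its own: it uses a small ridge term with the normal equations in place of the Moore-Penrose pseudo-inverse usually associated with ELM, a tanh activation, toy two-dimensional data, and hypothetical names (`ELM`, `solve`).

```python
import math
import random


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting.

    a is an n x n list of lists, b is n x m; returns x as n x m.
    """
    n, m = len(a), len(b[0])
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + m):
                aug[r][c] -= factor * aug[col][c]
    x = [[0.0] * m for _ in range(n)]
    for i in reversed(range(n)):
        for j in range(m):
            s = aug[i][n + j] - sum(aug[i][k] * x[k][j] for k in range(i + 1, n))
            x[i][j] = s / aug[i][i]
    return x


class ELM:
    def __init__(self, n_inputs, n_hidden, n_classes, ridge=1e-3, seed=0):
        rng = random.Random(seed)
        # Hidden weights and biases are random and never trained.
        self.w = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                  for _ in range(n_hidden)]
        self.b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        self.n_hidden, self.n_classes, self.ridge = n_hidden, n_classes, ridge
        self.beta = None

    def _hidden(self, x):
        return [math.tanh(sum(wi * xi for wi, xi in zip(w, x)) + b)
                for w, b in zip(self.w, self.b)]

    def fit(self, xs, labels):
        h = [self._hidden(x) for x in xs]
        t = [[1.0 if c == y else 0.0 for c in range(self.n_classes)]
             for y in labels]
        # Regularised normal equations: (H^T H + ridge * I) beta = H^T T.
        hth = [[sum(row[i] * row[j] for row in h) + (self.ridge if i == j else 0.0)
                for j in range(self.n_hidden)] for i in range(self.n_hidden)]
        htt = [[sum(row[i] * tr[c] for row, tr in zip(h, t))
                for c in range(self.n_classes)] for i in range(self.n_hidden)]
        self.beta = solve(hth, htt)
        return self

    def predict(self, x):
        h = self._hidden(x)
        scores = [sum(h[i] * self.beta[i][c] for i in range(self.n_hidden))
                  for c in range(self.n_classes)]
        return max(range(self.n_classes), key=scores.__getitem__)


# Toy data standing in for (heavily reduced) expression profiles.
xs = [(0, 0), (0, 1), (1, 0), (3, 3), (3, 4), (4, 3)]
labels = [0, 0, 0, 1, 1, 1]
model = ELM(n_inputs=2, n_hidden=10, n_classes=2).fit(xs, labels)
```

    The absence of any iterative weight update is what the abstract refers to when it says ELM avoids learning-rate and local-minimum issues and trains quickly.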

  4. Comparison of Computational Algorithms for the Classification of Liver Cancer using SELDI Mass Spectrometry: A Case Study

    Directory of Open Access Journals (Sweden)

    Robert J Hickey

    2007-01-01

    Full Text Available Introduction: As an alternative to DNA microarrays, mass spectrometry based analysis of proteomic patterns has shown great potential in cancer diagnosis. The ultimate application of this technique in clinical settings relies on the advancement of the technology itself and the maturity of the computational tools used to analyze the data. A number of computational algorithms constructed on different principles are available for the classification of disease status based on proteomic patterns. Nevertheless, few studies have addressed the difference in the performance of these approaches. In this report, we describe a comparative case study on the classification accuracy of hepatocellular carcinoma based on the serum proteomic pattern generated from a Surface Enhanced Laser Desorption/Ionization (SELDI) mass spectrometer. Methods: Nine supervised classification algorithms are implemented in R software and compared for the classification accuracy. Results: We found that the support vector machine with radial function is preferable as a tool for classification of hepatocellular carcinoma using features in SELDI mass spectra. Among the rest of the methods, random forest and prediction analysis of microarrays have better performance. A permutation-based technique reveals that the support vector machine with a radial function seems intrinsically superior in learning from the training data since it has a lower prediction error than others when there is essentially no differential signal. On the other hand, the performance of the random forest and prediction analysis of microarrays rely on their capability of capturing the signals with substantial differentiation between groups. Conclusions: Our finding is similar to a previous study, where classification methods based on the Matrix Assisted Laser Desorption/Ionization (MALDI) mass spectrometry are compared for the prediction accuracy of ovarian cancer. The support vector machine, random forest and prediction

  5. Gene selection for microarray cancer classification using a new evolutionary method employing artificial intelligence concepts.

    Science.gov (United States)

    Dashtban, M; Balafar, Mohammadali

    2017-03-01

    Gene selection is a demanding task for microarray data analysis. The diverse complexity of different cancers makes this issue still challenging. In this study, a novel evolutionary method based on genetic algorithms and artificial intelligence is proposed to identify predictive genes for cancer classification. A filter method was first applied to reduce the dimensionality of feature space followed by employing an integer-coded genetic algorithm with dynamic-length genotype, intelligent parameter settings, and modified operators. The algorithmic behaviors including convergence trends, mutation and crossover rate changes, and running time were studied, conceptually discussed, and shown to be coherent with literature findings. Two well-known filter methods, Laplacian and Fisher score, were examined considering similarities, the quality of selected genes, and their influences on the evolutionary approach. Several statistical tests concerning choice of classifier, choice of dataset, and choice of filter method were performed, and they revealed some significant differences between the performance of different classifiers and filter methods over datasets. The proposed method was benchmarked upon five popular high-dimensional cancer datasets; for each, top explored genes were reported. Comparing the experimental results with several state-of-the-art methods revealed that the proposed method outperforms previous methods in DLBCL dataset. Copyright © 2017 Elsevier Inc. All rights reserved.
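
    As an illustration of the filter step, the sketch below ranks genes by the commonly used Fisher score, the ratio of between-class to within-class variance of each gene. It assumes this standard definition (the abstract does not state the exact variant used), covers only the Fisher filter and not the Laplacian score or the genetic algorithm, and uses made-up expression values and hypothetical function names.

```python
def fisher_scores(samples, labels):
    """Fisher score of each feature (column) for labelled samples (rows).

    score_j = sum_c n_c * (mean_cj - mean_j)**2 / sum_c n_c * var_cj
    """
    classes = sorted(set(labels))
    scores = []
    for j in range(len(samples[0])):
        column = [row[j] for row in samples]
        overall_mean = sum(column) / len(column)
        between = 0.0
        within = 0.0
        for c in classes:
            values = [v for v, y in zip(column, labels) if y == c]
            mean_c = sum(values) / len(values)
            var_c = sum((v - mean_c) ** 2 for v in values) / len(values)
            between += len(values) * (mean_c - overall_mean) ** 2
            within += len(values) * var_c
        scores.append(between / within if within > 0 else float("inf"))
    return scores


def top_genes(samples, labels, k):
    """Indices of the k genes with the highest Fisher score."""
    scores = fisher_scores(samples, labels)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


# Rows are samples, columns are genes; values are invented for illustration.
expression = [
    [1.0, 5.0, 0.2],
    [1.2, 4.0, 0.9],
    [3.0, 5.5, 0.1],
    [3.2, 4.5, 0.8],
]
tumour_class = [0, 0, 1, 1]
scores = fisher_scores(expression, tumour_class)
selected = top_genes(expression, tumour_class, 2)
```

    Only the genes that survive such a filter would be passed on to the evolutionary search, which keeps the search space manageable.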

  6. A Chinese text classification system based on Naive Bayes algorithm

    Directory of Open Access Journals (Sweden)

    Cui Wei

    2016-01-01

    Full Text Available In this paper, aiming at the characteristics of Chinese text classification, we use ICTCLAS (the Chinese lexical analysis system of the Chinese Academy of Sciences) for document segmentation, clean the data and filter out stop words, and apply the information gain and document frequency feature selection algorithms for document feature selection. On this basis, a text classifier is implemented with the Naive Bayes algorithm, and experiments and analysis of the system are carried out on the Chinese corpus of Fudan University.
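
    The classification stage of such a pipeline can be sketched with a multinomial Naive Bayes classifier using add-one smoothing. The sketch assumes the documents have already been segmented into tokens (the ICTCLAS step is not reproduced), omits the information gain and document frequency feature selection, and uses a hypothetical class name and a four-document toy corpus rather than the Fudan corpus.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesText:
    """Multinomial Naive Bayes over pre-segmented token lists."""

    def __init__(self, stop_words=()):
        self.stop_words = set(stop_words)
        self.class_docs = Counter()              # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()

    def _clean(self, tokens):
        return [t for t in tokens if t not in self.stop_words]

    def fit(self, docs, labels):
        for tokens, label in zip(docs, labels):
            tokens = self._clean(tokens)
            self.class_docs[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, tokens):
        # Tokens never seen in training carry no information and are skipped.
        tokens = [t for t in self._clean(tokens) if t in self.vocab]
        total_docs = sum(self.class_docs.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_docs.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)  # log prior
            for t in tokens:
                # Add-one (Laplace) smoothing avoids zero probabilities.
                score += math.log((self.word_counts[label][t] + 1)
                                  / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy corpus, already segmented; "的" is treated as a stop word.
train_docs = [
    ["球队", "赢得", "比赛"],   # "team wins match"
    ["球员", "的", "进球"],     # "player's goal"
    ["股票", "市场", "上涨"],   # "stock market rises"
    ["银行", "的", "利率"],     # "bank's interest rate"
]
train_labels = ["sports", "sports", "finance", "finance"]
classifier = NaiveBayesText(stop_words=["的"]).fit(train_docs, train_labels)
result = classifier.predict(["比赛", "的", "进球"])
```

    Working in log space keeps the product of many small word probabilities from underflowing on longer documents.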

  7. Optical beam classification using deep learning: a comparison with rule- and feature-based classification

    Science.gov (United States)

    Alom, Md. Zahangir; Awwal, Abdul A. S.; Lowe-Webb, Roger; Taha, Tarek M.

    2017-08-01

    Vector Machine (SVM). The experimental results show around 96% classification accuracy using CNN; the CNN approach also provides comparable recognition results compared to the present feature-based off-normal detection. The feature-based solution was developed to capture the expertise of a human expert in classifying the images. The misclassified results are further studied to explain the differences and discover any discrepancies or inconsistencies in current classification.

  8. A Visual mining based framework for classification accuracy estimation

    Science.gov (United States)

    Arun, Pattathal Vijayakumar

    2013-12-01

    Classification techniques have been widely used in different remote sensing applications, and correct classification of mixed pixels is a tedious task. Traditional approaches adopt various statistical parameters but do not facilitate effective visualisation. Data mining tools are proving very helpful in the classification process. We propose a visual mining based framework for accuracy assessment of classification techniques using open source tools such as WEKA and PREFUSE. Used together, these tools can provide an efficient approach for getting information about improvements in the classification accuracy and help in refining the training data set. We have illustrated the framework by investigating the effects of various resampling methods on classification accuracy and found that bilinear (BL) is best suited for preserving radiometric characteristics. We have also investigated the optimal number of folds required for effective analysis of LISS-IV images. Classification techniques are widely used in various remote sensing applications, in which correct classification of pixels is a serious challenge. The traditional approach, which uses various statistical parameters, does not provide effective visualisation. The use of data mining tools for classification appears very promising. The article proposes an approach based on visual exploratory analysis, using open source tools such as WEKA and PREFUSE. These tools facilitate the correction of training fields and effectively support the improvement of classification accuracy. The method was tested by examining the effect of various resampling methods on the preservation of radiometric accuracy, with the best results obtained for the bilinear (BL) method.

  9. Segmentation Based Fuzzy Classification of High Resolution Images

    Science.gov (United States)

    Rao, Mukund; Rao, Suryaprakash; Masser, Ian; Kasturirangan, K.

    Information extraction from satellite images is the process of delineation of entities in the image which pertain to some feature on the earth and to which on associating an attribute, a classification of the image is obtained. Classification is a common technique to extract information from remote sensing data and, by and large, the common classification techniques mainly exploit the spectral characteristics of remote sensing images and attempt to detect patterns in spectral information to classify images. These are based on a per-pixel analysis of the spectral information, "clustering" or "grouping" of pixels is done to generate meaningful thematic information. Most of the classification techniques apply statistical pattern recognition of image spectral vectors to "label" each pixel with appropriate class information from a set of training information. On the other hand, Segmentation is not new, but it is yet seldom used in image processing of remotely sensed data. Although there has been a lot of development in segmentation of grey tone images in this field and other fields, like robotic vision, there has been little progress in segmentation of colour or multi-band imagery. Especially within the last two years many new segmentation algorithms as well as applications were developed, but not all of them lead to qualitatively convincing results while being robust and operational. One reason is that the segmentation of an image into a given number of regions is a problem with a huge number of possible solutions. Newer algorithms based on fractal approach could eventually revolutionize image processing of remotely sensed data. The paper looks at applying spatial concepts to image processing, paving the way to algorithmically formulate some more advanced aspects of cognition and inference. In GIS-based spatial analysis, vector-based tools already have been able to support advanced tasks generating new knowledge. By identifying objects (as segmentation results) from

  10. LiDAR point classification based on sparse representation

    Science.gov (United States)

    Li, Nan; Pfeifer, Norbert; Liu, Chun

    2017-04-01

    To combine the initial spatial structure and features of LiDAR data for accurate classification, the LiDAR data is represented as a fourth-order tensor, and the sparse representation for classification (SRC) method is used for LiDAR tensor classification. SRC needs only a few training samples from each class yet achieves good classification results. Multiple features are extracted from raw LiDAR points to generate a high-dimensional vector at each point. The LiDAR tensor is then built from the spatial distribution and feature vectors of the point neighborhood. The entries of the LiDAR tensor are accessed via four indexes. Each index is called a mode: three spatial modes in the X, Y, and Z directions and one feature mode. The SRC method is proposed in this paper. The sparsity algorithm finds the best representation of the test sample as a sparse linear combination of training samples from a dictionary. To explore the sparsity of the LiDAR tensor, the Tucker decomposition is used. It decomposes a tensor into a core tensor multiplied by a matrix along each mode. Those matrices can be considered the principal components in each mode. The entries of the core tensor show the level of interaction between the different components. Therefore, the LiDAR tensor can be approximately represented by a sparse tensor multiplied by a matrix selected from a dictionary along each mode. The matrices decomposed from training samples are arranged as initial elements in the dictionary. By dictionary learning, a reconstructive and discriminative structure dictionary along each mode is built. The overall structure dictionary is composed of class-specific sub-dictionaries. The sparse core tensor is then calculated by the tensor OMP (Orthogonal Matching Pursuit) method based on the dictionaries along each mode. It is expected that the original tensor should be well recovered by the sub-dictionary associated with the relevant class, while entries in the sparse tensor associated with

  11. Comparative Study of Classification Techniques on Breast Cancer FNA Biopsy Data

    Directory of Open Access Journals (Sweden)

    George Rumbe

    2010-12-01

    Full Text Available Accurate diagnostic detection of cancerous cells in a patient is critical: it may alter the subsequent treatment and increase the chances of survival. Machine learning techniques have been instrumental in disease detection and are currently used in various classification problems because of their accurate prediction performance. Different techniques may provide different accuracies, so it is imperative to use the most suitable method, the one that provides the best results. This research seeks to provide a comparative analysis of Support Vector Machine, Bayesian classifier and other artificial neural network classifiers (backpropagation, linear programming, learning vector quantization, and k-nearest neighbor) on the Wisconsin breast cancer classification problem.

  12. Median Filter Noise Reduction of Image and Backpropagation Neural Network Model for Cervical Cancer Classification

    Science.gov (United States)

    Wutsqa, D. U.; Marwah, M.

    2017-06-01

    In this paper, we consider a spatial median filter to reduce the noise in cervical images produced by a colposcopy tool. The backpropagation neural network (BPNN) model is applied to the colposcopy images to classify cervical cancer. The classification process requires feature extraction using the gray level co-occurrence matrix (GLCM) method to obtain image features that are used as inputs of the BPNN model. The advantage of noise reduction is evaluated by comparing the performance of BPNN models with and without the spatial median filter. The experimental results show that the spatial median filter can improve the accuracy of the BPNN model for cervical cancer classification.
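
    The two preprocessing steps named in this abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the 3x3 window, the clamped borders, the horizontal GLCM offset, and the function names `median_filter_3x3`, `glcm`, and `glcm_contrast` are all assumptions, and the image is a plain list of lists of integer grey levels.

```python
from statistics import median


def median_filter_3x3(image):
    """Replace each pixel by the median of its 3x3 neighbourhood (borders are clamped)."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                image[min(max(r + dr, 0), rows - 1)][min(max(c + dc, 0), cols - 1)]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
            ]
            out[r][c] = median(window)
    return out


def glcm(image, levels, dr=0, dc=1):
    """Normalised co-occurrence counts of grey levels (i, j) at pixel offset (dr, dc)."""
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[n / total for n in row] for row in counts]


def glcm_contrast(p):
    """Contrast feature: sum of p(i, j) * (i - j)^2 over all grey-level pairs."""
    size = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(size) for j in range(size))


# A flat patch with one noisy pixel: filtering removes the spike and the contrast drops to zero.
noisy = [[1, 1, 1], [1, 3, 1], [1, 1, 1]]
filtered = median_filter_3x3(noisy)
```

    Features such as `glcm_contrast` would then form the input vector of a classifier; the abstract uses a BPNN for that step, which is not reproduced here.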

  13. Fuzzy support vector machine: an efficient rule-based classification technique for microarrays

    Science.gov (United States)

    2013-01-01

    Background The abundance of gene expression microarray data has led to the development of machine learning algorithms applicable for tackling disease diagnosis, disease prognosis, and treatment selection problems. However, these algorithms often produce classifiers with weaknesses in terms of accuracy, robustness, and interpretability. This paper introduces fuzzy support vector machine which is a learning algorithm based on combination of fuzzy classifiers and kernel machines for microarray classification. Results Experimental results on public leukemia, prostate, and colon cancer datasets show that fuzzy support vector machine applied in combination with filter or wrapper feature selection methods develops a robust model with higher accuracy than the conventional microarray classification models such as support vector machine, artificial neural network, decision trees, k nearest neighbors, and diagonal linear discriminant analysis. Furthermore, the interpretable rule-base inferred from fuzzy support vector machine helps extracting biological knowledge from microarray data. Conclusions Fuzzy support vector machine as a new classification model with high generalization power, robustness, and good interpretability seems to be a promising tool for gene expression microarray classification. PMID:24266942

  14. Sparse Representation Based Multi-Instance Learning for Breast Ultrasound Image Classification

    Directory of Open Access Journals (Sweden)

    Lu Bing

    2017-01-01

    Full Text Available We propose a novel method based on sparse representation for breast ultrasound image classification under the framework of multi-instance learning (MIL). After image enhancement and segmentation, a concentric circle is used to extract the global and local features for improving the accuracy of diagnosis and prediction. The classification problem of ultrasound images is converted to a sparse representation based MIL problem. Each instance of a bag is represented as a sparse linear combination of all basis vectors in the dictionary, and then the bag is represented by one feature vector which is obtained via sparse representations of all instances within the bag. The sparse and MIL problem is further converted to a conventional learning problem that is solved by a relevance vector machine (RVM). Results of single classifiers are combined to be used for classification. Experimental results on the breast cancer datasets demonstrate the superiority of the proposed method in terms of classification accuracy as compared with state-of-the-art MIL methods.

  15. Comparison Of Power Quality Disturbances Classification Based On Neural Network

    Directory of Open Access Journals (Sweden)

    Nway Nway Kyaw Win

    2015-07-01

    Full Text Available Abstract Power quality disturbances (PQDs) cause serious problems for the reliability, safety, and economy of the power system network. To improve electric power quality, PQD events must be detected and classified by type of transient fault. A software methodology based on the wavelet transform with a multiresolution analysis (MRA) algorithm and on feed-forward neural networks, namely a probabilistic neural network (PNN) and a multilayer feed-forward (MLFF) neural network, is presented for automatic classification of eight types of PQ signals: flicker, harmonics, sag, swell, impulse, fluctuation, notch, and oscillatory. The wavelet family Db4 is chosen in this system to calculate the values of detailed energy distributions as input features for classification because it performs well in detecting and localizing various types of PQ disturbances. This technique classifies the types of PQD events. The classifiers classify and identify the disturbance type according to the energy distribution. The results show that the PNN can analyze different power disturbance types efficiently. Therefore, it can be seen that PNN has better classification accuracy than MLFF.
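
    The idea of using per-level detail energies from a multiresolution decomposition as classifier inputs can be sketched as below. This is only an illustration under stated assumptions: the abstract uses the Db4 wavelet, whereas this sketch swaps in the simpler Haar wavelet so that it stays dependency-free, and the function names `haar_step` and `detail_energies` are hypothetical. The signal length is assumed to be divisible by 2 raised to the number of levels.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: returns (approximation, detail)."""
    scale = 1 / math.sqrt(2)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * scale for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * scale for i in pairs]
    return approx, detail


def detail_energies(signal, levels):
    """Energy (sum of squares) of the detail coefficients at each decomposition level."""
    energies = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(sum(d * d for d in detail))
    return energies


# A rapidly alternating signal puts all of its energy into the finest detail level,
# while a constant signal has no detail energy at any level.
fast = detail_energies([1, -1] * 4, 3)
flat = detail_energies([1] * 8, 3)
```

    The resulting energy vector would be the feature input to a classifier such as the PNN or MLFF network mentioned in the abstract.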

  16. Pathogenesis of Gastric Cancer: Genetics and Molecular Classification.

    Science.gov (United States)

    Figueiredo, Ceu; Camargo, M C; Leite, Marina; Fuentes-Pananá, Ezequiel M; Rabkin, Charles S; Machado, José C

    Gastric cancer is the fifth most common cancer and the third most common cause of cancer-related death in the world. Infection with Helicobacter pylori is the major risk factor for this disease. Gastric cancer is the final outcome of a cascade of events that takes decades to occur and results from the accumulation of multiple genetic and epigenetic alterations. These changes are crucial for tumor cells to expedite and sustain the array of pathways involved in cancer development, such as cell cycle, DNA repair, metabolism, cell-to-cell and cell-to-matrix interactions, apoptosis, angiogenesis, and immune surveillance. Comprehensive molecular analyses of gastric cancer have disclosed the complex heterogeneity of this disease. In particular, these analyses have confirmed that Epstein-Barr virus (EBV)-positive gastric cancer is a distinct entity. The identification of gastric cancer subtypes characterized by recognizable molecular profiles may pave the way for more personalized clinical management and for the identification of novel therapeutic targets and biomarkers for screening, prognosis, prediction of response to treatment, and monitoring of gastric cancer progression.

  17. Finding Combination of Features from Promoter Regions for Ovarian Cancer-related Gene Group Classification

    KAUST Repository

    Olayan, Rawan S.

    2012-12-01

    In classification problems, it is always important to use a suitable combination of features for the classifiers. Generating the right combination of features usually results in good classifiers. When the problem is not well understood, data items are usually described by many features in the hope that some of these may be the relevant or most relevant ones. In this study, we focus on one such problem related to genes implicated in ovarian cancer (OC). We try to recognize two important OC-related gene groups: oncogenes, which support the development and progression of OC, and oncosuppressors, which oppose such tendencies. For this, we use the properties of the promoters of these genes. We identified potential “regulatory features” that characterize the promoters of OC-related oncogenes and oncosuppressors. In our study, we used 211 oncogenes and 39 oncosuppressors. For these, we identified 538 characteristic sequence motifs from their promoters. Promoters are annotated by these motifs, and the derived feature vectors are used to develop classification models. We compared a number of classification models in their ability to distinguish oncogenes from oncosuppressors. Based on 10-fold cross-validation, the resultant model was able to separate the two classes with sensitivity of 96% and specificity of 100% with the complete set of features. Moreover, we developed another recognition model where we attempted to distinguish oncogenes and oncosuppressors as one group from other OC-related genes. That model achieved accuracy of 82%. We believe that the results of this study will help in discovering other OC-related oncogenes and oncosuppressors not identified as yet.
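
    A rough sketch of how promoters might be turned into motif-based feature vectors, and how sensitivity and specificity are computed, is given below. The motifs, sequences, and labels are made up for illustration and are not data from the study; the function names are hypothetical, and counting overlapping occurrences is an assumption about how a promoter is "annotated" by a motif.

```python
def count_overlapping(sequence, motif):
    """Number of (possibly overlapping) occurrences of motif in sequence."""
    last_start = len(sequence) - len(motif) + 1
    return sum(1 for i in range(last_start) if sequence.startswith(motif, i))


def motif_features(promoter, motifs):
    """Feature vector: one occurrence count per motif, in the order given."""
    return [count_overlapping(promoter, m) for m in motifs]


def sensitivity_specificity(y_true, y_pred, positive):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Toy example with invented motifs and labels.
features = motif_features("TATATACG", ["TATA", "CG"])
sens, spec = sensitivity_specificity(
    ["oncogene", "oncogene", "suppressor", "suppressor"],
    ["oncogene", "suppressor", "suppressor", "suppressor"],
    positive="oncogene",
)
```

    In a cross-validated evaluation such as the 10-fold scheme the abstract mentions, these two rates would be computed on the held-out predictions.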

  18. Trial designs for personalizing cancer care: a systematic review and classification.

    Science.gov (United States)

    Tajik, Parvin; Zwinderman, Aleiko H; Mol, Ben W; Bossuyt, Patrick M

    2013-09-01

    There is increasing interest in the evaluation of prognostic and predictive biomarkers for personalizing cancer care. The literature on trial designs for evaluating these markers is diverse, and there is no consensus on classification or nomenclature. We set out to review the literature systematically, to identify the proposed trial designs, and to develop a classification scheme. We searched MEDLINE, EMBASE, Cochrane Methodology Register, and MathSciNet up to January 2013 for articles describing these trial designs. In each eligible article, we identified the trial designs presented and extracted the term used for labeling the design, components of patient flow (marker status of eligible participants, intervention, and comparator), study questions, and analysis plan. Our search strategy resulted in 88 eligible articles, wherein 315 labels had been used by authors in presenting trial designs; 134 of these were unique. By analyzing patient flow components, we could classify the 134 unique design labels into four basic patient flow categories, which we labeled with the most frequently used term: single-arm, enrichment, randomize-all, and biomarker-strategy designs. A fifth category consists of combinations of the other four patient flow categories. Our review showed that a considerable number of labels have been proposed for trial designs evaluating prognostic and predictive biomarkers which, based on patient flow elements, can be classified into five basic categories. The classification system proposed here could help clinicians and researchers in designing and interpreting trials evaluating predictive biomarkers, and could reduce confusion in labeling and reporting. ©2013 AACR.

  19. Quality-based Multimodal Classification Using Tree-Structured Sparsity

    Science.gov (United States)

    2014-03-08

    ASI Series F, Computer and Systems Sciences, 163:446–456, 1999. 5 [7] D. Hall and J. Llinas. An introduction to multisensor data fusion. Proceedings of...advantages of information fusion based on sparsity models for multimodal classification. Among several sparsity models, tree-structured sparsity provides...rithm is proposed to solve the optimization problem, which is an efficient tool for feature-level fusion among either homogeneous or heterogeneous

  20. Hardwood species classification with DWT based hybrid texture ...

    Indian Academy of Sciences (India)

    Reduction in feature dataset by minimal redundancy maximal relevance (mRMR) feature selection method is achieved and the best classification accuracy of 99.00 ± 0.79% and 99.20 ± 0.42% have been obtained for DWT based FOS-LBP histogram Fourier features (DWTFOSLBP-HF) technique at the 5th and 6th levels of ...

  1. Classification of follicular cell-derived thyroid cancer by global RNA profiling

    DEFF Research Database (Denmark)

    Rossing, Maria

    2013-01-01

    classification will not only contribute to our biological insight but also improve clinical and pathological examinations, thus advancing thyroid tumour diagnosis and ultimately preventing superfluous surgery. This review evaluates the status of classification and biological insights gained from molecular...... classifiers that may differentiate malignant from benign thyroid nodules. Molecular classification models based on global RNA profiles from fine-needle aspirations are currently being evaluated; results are preliminary and lack validation in prospective clinical trials. There is no doubt that molecular...

  2. Color-texture based extreme learning machines for tissue tumor classification

    Science.gov (United States)

    Yang, X.; Yeo, S. Y.; Wong, S. T.; Lee, G.; Su, Y.; Hong, J. M.; Choo, A.; Chen, S.

    2016-03-01

    In histopathological classification and diagnosis of cancer cases, pathologists perform visual assessments of immunohistochemistry (IHC)-stained biomarkers in cells to distinguish tumor from non-tumor tissues. One of the prerequisites for such assessments is the correct identification of regions-of-interest (ROIs) with relevant histological features. Advances in image processing and machine learning give rise to the possibility of full automation in ROI identification by identifying image features such as colors and textures. Such computer-aided diagnostic systems could enhance research output and efficiency in identifying the pathology (normal, non-tumor or tumor) of a tissue pattern from ROI images. In this paper, a computational method using color-texture based extreme learning machines (ELM) is proposed for automatic tissue tumor classification. Our approach consists of three steps: (1) ROIs are manually identified and annotated from individual cores of tissue microarrays (TMAs); (2) color and texture features are extracted from the ROI images; (3) ELM is applied to the extracted features to classify the ROIs into non-tumor or tumor categories. The proposed approach is tested on 100 sets of images from a kidney cancer TMA, and the results show that ELM achieves classification accuracies of 91.19% and 88.72% with a Gaussian radial basis function (RBF) and a linear kernel, respectively, which is superior to using SVM with the same kernels.

  3. Locally linear embedding and neighborhood rough set-based gene selection for gene expression data classification.

    Science.gov (United States)

    Sun, L; Xu, J-C; Wang, W; Yin, Y

    2016-08-30

    Cancer subtype recognition and feature selection are important problems in the diagnosis and treatment of tumors. Here, we propose a novel gene selection approach applied to gene expression data classification. First, two classical feature reduction methods, locally linear embedding (LLE) and rough set (RS), are summarized. The advantages and disadvantages of these algorithms were analyzed, and an optimized model for tumor gene selection was developed based on LLE and neighborhood RS (NRS). Bhattacharyya distance was introduced to delete irrelevant genes, pair-wise redundancy analysis was performed to remove strongly correlated genes, and the wavelet soft threshold was determined to eliminate noise in the gene datasets. Next, prior optimized search processing was carried out. A new approach combining the dimension reduction of LLE and the feature reduction of NRS (LLE-NRS) was developed for selecting gene subsets, and then the open-source software Weka was applied to distinguish different tumor types and verify the cross-validation classification accuracy of our proposed method. The experimental results demonstrated that the classification performance of the proposed LLE-NRS for selecting gene subsets outperforms that of other related models in terms of accuracy, and that our proposed approach is feasible and effective in the field of high-dimensional tumor classification.
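
    Two of the filtering steps named in this abstract have compact textbook forms, sketched below. The sketch assumes that each gene's expression values in a class are modelled as a one-dimensional Gaussian with nonzero variance, which the abstract does not state; the function names are hypothetical, and the LLE and NRS stages are not reproduced.

```python
import math
from statistics import mean, pvariance


def bhattacharyya_gaussian(x, y):
    """Bhattacharyya distance between two samples, each modelled as a 1-D Gaussian.

    Assumes both samples have nonzero variance.
    """
    m1, m2 = mean(x), mean(y)
    v1, v2 = pvariance(x), pvariance(y)
    mean_term = 0.25 * (m1 - m2) ** 2 / (v1 + v2)
    var_term = 0.5 * math.log((v1 + v2) / (2 * math.sqrt(v1 * v2)))
    return mean_term + var_term


def soft_threshold(value, t):
    """Wavelet soft thresholding: shrink the magnitude by t, zeroing anything below t."""
    return math.copysign(max(abs(value) - t, 0.0), value)


# A gene whose expression is identical in both classes scores 0 and could be
# dropped as irrelevant; a gene with well-separated classes scores higher.
same = bhattacharyya_gaussian([0, 2], [0, 2])
apart = bhattacharyya_gaussian([0, 2], [4, 6])
```

    A gene filter built on this would keep the genes whose distance between classes exceeds some chosen cut-off; the cut-off itself is a design choice the abstract does not specify.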

  4. Fully automatic classification of breast cancer microarray images

    Directory of Open Access Journals (Sweden)

    Nastaran Dehghan Khalilabad

    2016-09-01

    Full Text Available Microarray images are used as an accurate method for diagnosis of cancerous diseases. The aim of this research is to provide an approach for detecting the type of breast cancer. First, raw data is extracted from microarray images. The exact location of each gene is determined using image processing techniques. Then, the amount of “gene expression” is extracted as raw data by summing the pixels associated with each gene. To identify the more effective genes, the information gain method is applied to the set of raw data. Finally, the type of cancer is recognized by analyzing the obtained data with a decision tree. The proposed approach has an accuracy of 95.23% in diagnosing the breast cancer types.
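
    The gene-ranking criterion mentioned here, information gain, can be sketched as follows. The sketch assumes that expression values have already been discretised into categories such as "high" and "low", which is an assumption of this example and not a detail given in the abstract; the labels and function names are likewise illustrative.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Reduction in label entropy obtained by splitting the samples on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(group) / total * entropy(group) for group in groups.values())
    return entropy(labels) - remainder


labels = ["type_a", "type_a", "type_b", "type_b"]
informative_gene = ["high", "high", "low", "low"]    # separates the classes perfectly
uninformative_gene = ["high", "low", "high", "low"]  # tells us nothing about the class
```

    Genes would be ranked by `information_gain` and the top-ranked ones passed to the decision tree; the same quantity is also the usual split criterion inside ID3-style trees.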

  5. The DTW-based representation space for seismic pattern classification

    Science.gov (United States)

    Orozco-Alzate, Mauricio; Castro-Cabrera, Paola Alexandra; Bicego, Manuele; Londoño-Bonilla, John Makario

    2015-12-01

    Distinguishing among the different seismic volcanic patterns is still one of the most important and labor-intensive tasks for volcano monitoring. This task could be lightened and made free from subjective bias by using automatic classification techniques. In this context, a core but often overlooked issue is the choice of an appropriate representation of the data to be classified. Recently, it has been suggested that using a relative representation (i.e. proximities, namely dissimilarities on pairs of objects) instead of an absolute one (i.e. features, namely measurements on single objects) is advantageous to exploit the relational information contained in the dissimilarities to derive highly discriminant vector spaces, where any classifier can be used. According to that motivation, this paper investigates the suitability of a dynamic time warping (DTW) dissimilarity-based vector representation for the classification of seismic patterns. Results show the usefulness of such a representation in the seismic pattern classification scenario, including analyses of potential benefits from recent advances in the dissimilarity-based paradigm such as the proper selection of representation sets and the combination of different dissimilarity representations that might be available for the same data.
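
    The dissimilarity-based representation described above can be sketched in a few lines. This is a minimal illustration assuming one-dimensional signals, an absolute-difference local cost, and no warping-window constraint; the names `dtw_distance` and `dissimilarity_vector` and the toy signals are hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def dissimilarity_vector(signal, representation_set):
    """Represent a signal by its DTW distances to each prototype in the representation set."""
    return [dtw_distance(signal, prototype) for prototype in representation_set]


# Each signal becomes a fixed-length vector, one coordinate per prototype,
# so any ordinary vector-space classifier can be trained on the result.
prototypes = [[1, 2, 2, 3], [0, 0]]
vector = dissimilarity_vector([1, 2, 3], prototypes)
```

    The choice of prototypes corresponds to the "selection of representation sets" that the abstract lists among the recent advances it analyses.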

  6. G0-WISHART Distribution Based Classification from Polarimetric SAR Images

    Science.gov (United States)

    Hu, G. C.; Zhao, Q. H.

    2017-09-01

    Enormous scientific and technical developments have been made over decades to improve remote sensing, particularly the Polarimetric Synthetic Aperture Radar (PolSAR) technique, so classification methods based on PolSAR images have received much more attention from scholars and related departments around the world. The multilook polarimetric G0-Wishart model is a more flexible model that describes homogeneous, heterogeneous and extremely heterogeneous regions in the image. Moreover, the polarimetric G0-Wishart distribution does not include the modified Bessel function of the second kind. It is a simple statistical distribution model with fewer parameters. To prove its feasibility, a classification process has been tested on a fully polarimetric Synthetic Aperture Radar (SAR) image. First, multilook polarimetric SAR data processing and a speckle filter are applied to reduce the influence of speckle on the classification result. The image is then initially classified into sixteen classes by H/A/α decomposition. Finally, the ICM algorithm is used to classify features based on the G0-Wishart distance. Qualitative and quantitative results show that the proposed method can classify polarimetric SAR data effectively and efficiently.

  7. Classification of Cancer Primary Sites Using Machine Learning and Somatic Mutations

    Directory of Open Access Journals (Sweden)

    Yukun Chen

    2015-01-01

    Full Text Available An accurate classification of human cancer, including its primary site, is important for better understanding of cancer and effective therapeutic strategies development. The available big data of somatic mutations provides us a great opportunity to investigate cancer classification using machine learning. Here, we explored the patterns of 1,760,846 somatic mutations identified from 230,255 cancer patients along with gene function information using support vector machine. Specifically, we performed a multiclass classification experiment over the 17 tumor sites using the gene symbol, somatic mutation, chromosome, and gene functional pathway as predictors for 6,751 subjects. The performance of the baseline using only gene features is 0.57 in accuracy. It was improved to 0.62 when adding the information of mutation and chromosome. Among the predictable primary tumor sites, the prediction of five primary sites (large intestine, liver, skin, pancreas, and lung) could achieve the performance with more than 0.70 in F-measure. The model of the large intestine ranked the first with 0.87 in F-measure. The results demonstrate that the somatic mutation information is useful for prediction of primary tumor sites with machine learning modeling. To our knowledge, this study is the first investigation of the primary sites classification using machine learning and somatic mutation data.

  8. Classification of Cancer Primary Sites Using Machine Learning and Somatic Mutations.

    Science.gov (United States)

    Chen, Yukun; Sun, Jingchun; Huang, Liang-Chin; Xu, Hua; Zhao, Zhongming

    2015-01-01

    An accurate classification of human cancer, including its primary site, is important for better understanding of cancer and effective therapeutic strategies development. The available big data of somatic mutations provides us a great opportunity to investigate cancer classification using machine learning. Here, we explored the patterns of 1,760,846 somatic mutations identified from 230,255 cancer patients along with gene function information using support vector machine. Specifically, we performed a multiclass classification experiment over the 17 tumor sites using the gene symbol, somatic mutation, chromosome, and gene functional pathway as predictors for 6,751 subjects. The performance of the baseline using only gene features is 0.57 in accuracy. It was improved to 0.62 when adding the information of mutation and chromosome. Among the predictable primary tumor sites, the prediction of five primary sites (large intestine, liver, skin, pancreas, and lung) could achieve the performance with more than 0.70 in F-measure. The model of the large intestine ranked the first with 0.87 in F-measure. The results demonstrate that the somatic mutation information is useful for prediction of primary tumor sites with machine learning modeling. To our knowledge, this study is the first investigation of the primary sites classification using machine learning and somatic mutation data.
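
    The two metrics reported in this abstract, overall accuracy and per-site F-measure, can be sketched as below. The labels are toy values and not data from the study, the function names are hypothetical, and the F-measure is assumed to be the balanced F1 score (the abstract does not say which variant was used).

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted site matches the true site."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f_measure(y_true, y_pred, label):
    """Balanced F1 score for one class, treating that class as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy multiclass example: one liver sample is misclassified as lung.
truth = ["liver", "liver", "lung", "lung"]
predicted = ["liver", "lung", "lung", "lung"]
per_site = {site: f_measure(truth, predicted, site) for site in sorted(set(truth))}
```

    Reporting the F-measure per site, as the abstract does, shows which primary sites are predictable even when overall accuracy is modest.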

  9. Comparison of the prevalence of malnutrition diagnosis in head and neck, gastrointestinal, and lung cancer patients by 3 classification methods.

    Science.gov (United States)

    Platek, Mary E; Popp, Johann V; Possinger, Candi S; Denysschen, Carol A; Horvath, Peter; Brown, Jean K

    2011-01-01

    Malnutrition is prevalent among patients within certain cancer types. There is a lack of a universal standard of care for nutrition screening and a lack of agreement on an operational definition and on the validity of malnutrition indicators. In a secondary data analysis, we investigated the prevalence of malnutrition diagnosis with 3 classification methods using data from medical records of a National Cancer Institute-designated comprehensive cancer center. Records of 227 patients hospitalized during 1998 with head and neck, gastrointestinal, or lung cancer were reviewed for malnutrition based on 3 methods: (1) physician-diagnosed malnutrition-related International Classification of Diseases, Ninth Revision codes; (2) in-hospital nutritional assessment summaries conducted by registered dietitians; and (3) body mass indexes (BMIs). For patients with multiple admissions, only data from the first hospitalization were included. Prevalence of malnutrition diagnosis ranged from 8.8% based on BMI to approximately 26% of all cases based on dietitian assessment. κ coefficients between any two methods indicated a weak (κ = 0.23, BMI and dietitians; and κ = 0.28, dietitians and physicians)-to-fair strength of agreement (κ = 0.38, BMI and physicians). Available methods to identify patients with malnutrition in a National Cancer Institute-designated comprehensive cancer center resulted in varied prevalence of malnutrition diagnosis. A universal standard of care for nutrition screening that uses validated tools is needed. The Joint Commission on the Accreditation of Healthcare Organizations requires nutritional screening of patients within 24 hours of admission. For this purpose, implementation of a validated tool that can be used by various healthcare practitioners, including nurses, needs to be considered.
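
    The agreement statistic reported above is, presumably, Cohen's κ for two raters. A minimal sketch of that computation follows; the ratings are invented for illustration, the function name is hypothetical, and the code assumes the chance agreement is below 1 so that the denominator is nonzero.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same category independently.
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)


# 1 = malnourished, 0 = not malnourished, as judged by two hypothetical methods.
identical = cohen_kappa([1, 0, 1, 0], [1, 0, 1, 0])
chance_level = cohen_kappa([1, 1, 0, 0], [1, 0, 1, 0])
```

    Values near 0 mean the two methods agree no more often than chance would predict, which is why the κ values of 0.23 to 0.38 in the abstract are read as weak-to-fair agreement.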

  10. A comparative performance evaluation of neural network based approach for sentiment classification of online reviews

    OpenAIRE

    Vinodhini, G.; Chandrasekaran, R.M.

    2016-01-01

    The aim of sentiment classification is to efficiently identify the emotions expressed in the form of text messages. Machine learning methods for sentiment classification have been extensively studied, due to their predominant classification performance. Recent studies suggest that ensemble based machine learning methods provide better performance in classification. Artificial neural networks (ANNs) are rarely being investigated in the literature of sentiment classification. This paper compare...

  11. Rule based fuzzy logic approach for classification of fibromyalgia syndrome.

    Science.gov (United States)

    Arslan, Evren; Yildiz, Sedat; Albayrak, Yalcin; Koklukaya, Etem

    2016-06-01

    Fibromyalgia syndrome (FMS) is a chronic disease of the muscle and skeletal system, observed generally in women, that manifests itself as widespread pain and impairs the individual's quality of life. FMS diagnosis is made based on the American College of Rheumatology (ACR) criteria. However, the applicability and sufficiency of the ACR criteria have recently come under debate. In this context, several evaluation methods, including clinical evaluation methods, were proposed by researchers. Accordingly, ACR had to update their criteria announced back in 1990, 2010 and 2011. The proposed rule based fuzzy logic method aims to evaluate FMS from a different angle as well. This method contains a rule base derived from the 1990 ACR criteria and the individual experiences of specialists. The study was conducted using the data collected from 60 inpatients and 30 healthy volunteers. Several tests and a physical examination were administered to the participants. The fuzzy logic rule base was structured using the parameters of tender point count, chronic widespread pain period, pain severity, fatigue severity and sleep disturbance level, which were deemed important in FMS diagnosis. It has been observed that, generally, the fuzzy predictor was 95.56% consistent with at least of the specialists, who were not creators of the fuzzy rule base. Thus, in diagnosis classification where the severity of FMS was classified as well, consistent findings were obtained from the comparison of interpretations and experiences of specialists and the fuzzy logic approach. The study proposes a rule base which could eliminate the shortcomings of the 1990 ACR criteria during the FMS evaluation process. Furthermore, the proposed method presents a classification of the severity of the disease, which was not available with the ACR criteria. The study was not limited to disease classification; the probability of occurrence and the severity were classified as well. In addition, those who were not suffering from FMS were

  12. INTENSITY- AND TIME COURSE-BASED CLASSIFICATIONS OF OXIDATIVE STRESSES

    Directory of Open Access Journals (Sweden)

    Volodymyr Lushchak

    2015-05-01

    Full Text Available In living organisms, production of reactive oxygen species (ROS) is counterbalanced by their elimination and/or prevention of formation, which in concert can typically maintain a steady-state (stationary) ROS level. However, this balance may be disturbed and lead to elevated ROS levels and enhanced damage to biomolecules. Since 1985, when H. Sies first introduced the definition of oxidative stress, this area has become one of the hot topics in biology and, to date, many details related to ROS-induced damage to cellular components, ROS-based signaling, cellular responses and adaptation have been disclosed. However, some basal oxidative damage always occurs under unstressed conditions, and in many experimental studies it is difficult to show definitely that oxidative stress is indeed induced by the stressor. Therefore, researchers usually experience substantial difficulties in the correct interpretation of oxidative stress development. For example, in many cases an increase or decrease in the activity of antioxidant and related enzymes is interpreted as evidence of oxidative stress. Careful selection of specific biomarkers (ROS-modified targets) may be very helpful. To avoid these sorts of problems, I propose several classifications of oxidative stress based on its time-course and intensity. The time-course classification includes acute and chronic stresses. In the intensity based classification, I propose to discriminate four zones of function in the relationship between “Dose/concentration of inducer” and the measured “Endpoint”: I – basal oxidative stress zone (BOS); II – low intensity oxidative stress (LOS); III – intermediate intensity oxidative stress (IOS); IV – high intensity oxidative stress (HOS). The proposed classifications may be helpful to describe experimental data where oxidative stress is induced and systematize it based on its time course and intensity. Perspective directions of investigations in the field include

  13. Assessing the prognostic features of a pain classification system in advanced cancer patients.

    Science.gov (United States)

    Arthur, Joseph; Tanco, Kimberson; Haider, Ali; Maligi, Courtney; Park, Minjeong; Liu, Diane; Bruera, Eduardo

    2017-09-01

    The Edmonton Classification System for Cancer Pain (ECS-CP) has been shown to predict pain management complexity based on five features: pain mechanism, incident pain, psychological distress, addictive behavior, and cognitive function. The main objective of our study was to explore the association between ECS-CP features and pain treatment outcomes among outpatients managed by a palliative care specialist-led interdisciplinary team. Initial and follow-up clinical information of 386 eligible supportive care outpatients were retrospectively reviewed and analyzed. Between the initial consultation and the first follow-up visit, the median ESAS pain intensity improved from 6 to 4.5 (p feature (p = 0.006) used a higher number of adjuvant medications. At follow-up, patients with neuropathic pain were less likely to achieve their personalized pain goal (PPG) (29 vs 72%, p = 0.015). No statistically significant association was found between increasing sum of ECS-CP features and any of the pain treatment outcomes at follow-up. Neuropathy was found to be a poor prognostic feature in advanced cancer pain management. Increasing sum of ECS-CP features was not predictive of pain management complexity at the follow-up visit when pain was managed by a palliative medicine specialist. Further research is needed to further explore these observations.

  14. Clinical classification of cancer cachexia: phenotypic correlates in human skeletal muscle.

    Science.gov (United States)

    Johns, Neil; Hatakeyama, Shinji; Stephens, Nathan A; Degen, Martin; Degen, Simone; Frieauff, Wilfried; Lambert, Christian; Ross, James A; Roubenoff, Ronenn; Glass, David J; Jacobi, Carsten; Fearon, Kenneth C H

    2014-01-01

    Cachexia affects the majority of patients with advanced cancer and is associated with a reduction in treatment tolerance, response to therapy, and duration of survival. One impediment towards the effective treatment of cachexia is the lack of a validated classification system. 41 patients with resectable upper gastrointestinal (GI) or pancreatic cancer underwent characterisation for cachexia based on weight-loss (WL) and/or low muscularity (LM). Four diagnostic criteria were used: >5%WL, >10%WL, LM, and LM+>2%WL. All patients underwent biopsy of the rectus muscle. Analysis included immunohistochemistry for fibre size and type, protein and nucleic acid concentration, Western blots for markers of autophagy, SMAD signalling, and inflammation. Compared with non-cachectic cancer patients, in patients with LM or LM+>2%WL, mean muscle fibre diameter was reduced by about 25% (p = 0.02 and p = 0.001 respectively). No significant difference in fibre diameter was observed if patients had WL alone. Regardless of classification, there was no difference in fibre number or proportion of fibre type across all myosin heavy chain isoforms. Mean muscle protein content was reduced and the ratio of RNA/DNA decreased in patients with either >5%WL or LM+>2%WL. Compared with non-cachectic patients, SMAD3 protein levels were increased in patients with >5%WL (p = 0.022), and in patients with >10%WL, beclin (p = 0.05) and ATG5 (p = 0.01) protein levels were increased. There were no differences in phospho-NFkB or phospho-STAT3 levels across any of the groups. Muscle fibre size, biochemical composition and pathway phenotype can vary according to whether the diagnostic criteria for cachexia are based on weight loss alone, a measure of low muscularity alone or a combination of the two. For intervention trials where the primary end-point is a change in muscle mass or function, use of combined diagnostic criteria may allow identification of a more homogeneous patient cohort, reduce the sample size required
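
    The four diagnostic criteria named in this abstract can be read as simple threshold rules. The following is a minimal sketch under stated assumptions, not the authors' procedure: the `Patient` fields and the function name are hypothetical, and the abstract does not say how low muscularity (LM) was measured, so LM is passed in as a ready-made flag.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    weight_loss_pct: float  # percentage weight loss (WL)
    low_muscularity: bool   # LM flag; its measurement is not given in the abstract


def cachexia_criteria(p: Patient) -> dict:
    """Evaluate the four criteria listed in the abstract for one patient."""
    return {
        ">5%WL": p.weight_loss_pct > 5,
        ">10%WL": p.weight_loss_pct > 10,
        "LM": p.low_muscularity,
        "LM+>2%WL": p.low_muscularity and p.weight_loss_pct > 2,
    }
```

    A patient with 6% weight loss and no low muscularity would meet only the >5%WL criterion, which illustrates how the same patient can be cachectic under one definition and not under another.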

  15. Clinical classification of cancer cachexia: phenotypic correlates in human skeletal muscle.

    Directory of Open Access Journals (Sweden)

    Neil Johns

    Full Text Available BACKGROUND: Cachexia affects the majority of patients with advanced cancer and is associated with a reduction in treatment tolerance, response to therapy, and duration of survival. One impediment towards the effective treatment of cachexia is the lack of a validated classification system. METHODS: 41 patients with resectable upper gastrointestinal (GI) or pancreatic cancer underwent characterisation for cachexia based on weight-loss (WL) and/or low muscularity (LM). Four diagnostic criteria were used: >5%WL, >10%WL, LM, and LM+>2%WL. All patients underwent biopsy of the rectus muscle. Analysis included immunohistochemistry for fibre size and type, protein and nucleic acid concentration, Western blots for markers of autophagy, SMAD signalling, and inflammation. FINDINGS: Compared with non-cachectic cancer patients, in patients with LM or LM+>2%WL, mean muscle fibre diameter was reduced by about 25% (p = 0.02 and p = 0.001 respectively). No significant difference in fibre diameter was observed if patients had WL alone. Regardless of classification, there was no difference in fibre number or proportion of fibre type across all myosin heavy chain isoforms. Mean muscle protein content was reduced and the ratio of RNA/DNA decreased in patients with either >5%WL or LM+>2%WL. Compared with non-cachectic patients, SMAD3 protein levels were increased in patients with >5%WL (p = 0.022), and in patients with >10%WL, beclin (p = 0.05) and ATG5 (p = 0.01) protein levels were increased. There were no differences in phospho-NFkB or phospho-STAT3 levels across any of the groups. CONCLUSION: Muscle fibre size, biochemical composition and pathway phenotype can vary according to whether the diagnostic criteria for cachexia are based on weight loss alone, a measure of low muscularity alone or a combination of the two. For intervention trials where the primary end-point is a change in muscle mass or function, use of combined diagnostic criteria may allow identification of a more

  16. Radiographic classification for fractures of the fifth metatarsal base

    Energy Technology Data Exchange (ETDEWEB)

    Mehlhorn, Alexander T.; Zwingmann, Joern; Hirschmueller, Anja; Suedkamp, Norbert P.; Schmal, Hagen [University of Freiburg Medical Center, Department of Orthopaedic Surgery, Freiburg (Germany)

    2014-04-15

    Avulsion fractures of the fifth metatarsal base (MTB5) are common forefoot injuries. Based on a radiomorphometric analysis reflecting the risk for a secondary displacement, a new classification was developed. A cohort of 95 healthy, sportive, and young patients (age ≤ 50 years) with avulsion fractures of the MTB5 was included in the study and divided into groups with non-displaced, primary-displaced, and secondary-displaced fractures. Radiomorphometric data obtained using standard oblique and dorso-plantar views were analyzed in association with secondary displacement. Based on this, a classification was developed and checked for reproducibility. Fractures with a longer distance between the lateral edge of the styloid process and the lateral fracture step-off and fractures with a more medial joint entry of the fracture line at the MTB5 are at higher risk to displace secondarily. Based on these findings, all fractures were divided into three types: type I with a fracture entry in the lateral third; type II in the middle third; and type III in the medial third of the MTB5. Additionally, the three types were subdivided into an A-type with a fracture displacement < 2 mm and a B-type with a fracture displacement ≥ 2 mm. A substantial level of interobserver agreement was found in the assignment of all 95 fractures to the six fracture types (κ = 0.72). The secondary displacement of fractures was confirmed by all examiners in 100% of cases. Radiomorphometric data may identify fractures at risk for secondary displacement of the MTB5. Based on this, a reliable classification was developed. (orig.)

  17. Radiographic classification for fractures of the fifth metatarsal base.

    Science.gov (United States)

    Mehlhorn, Alexander T; Zwingmann, Jörn; Hirschmüller, Anja; Südkamp, Norbert P; Schmal, Hagen

    2014-04-01

    Avulsion fractures of the fifth metatarsal base (MTB5) are common forefoot injuries. Based on a radiomorphometric analysis reflecting the risk for a secondary displacement, a new classification was developed. A cohort of 95 healthy, sportive, and young patients (age ≤ 50 years) with avulsion fractures of the MTB5 was included in the study and divided into groups with non-displaced, primary-displaced, and secondary-displaced fractures. Radiomorphometric data obtained using standard oblique and dorso-plantar views were analyzed in association with secondary displacement. Based on this, a classification was developed and checked for reproducibility. Fractures with a longer distance between the lateral edge of the styloid process and the lateral fracture step-off and fractures with a more medial joint entry of the fracture line at the MTB5 are at higher risk to displace secondarily. Based on these findings, all fractures were divided into three types: type I with a fracture entry in the lateral third; type II in the middle third; and type III in the medial third of the MTB5. Additionally, the three types were subdivided into an A-type with a fracture displacement < 2 mm and a B-type with a fracture displacement ≥ 2 mm. A substantial level of interobserver agreement was found in the assignment of all 95 fractures to the six fracture types (κ = 0.72). The secondary displacement of fractures was confirmed by all examiners in 100% of cases. Radiomorphometric data may identify fractures at risk for secondary displacement of the MTB5. Based on this, a reliable classification was developed.
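
    The six fracture types described here (I to III by joint entry, A or B by displacement) amount to a small decision rule. The sketch below is illustrative only: the function name is hypothetical, the joint-entry position is assumed to be given as a fraction of the MTB5 width measured from the lateral edge (0.0) to the medial edge (1.0), and the handling of values exactly on a one-third boundary is an assumption the abstract does not address.

```python
def classify_mtb5_fracture(entry_fraction: float, displacement_mm: float) -> str:
    """Return a label such as 'IA' or 'IIIB' for one fracture.

    entry_fraction: assumed position of the fracture's joint entry,
                    0.0 = lateral edge, 1.0 = medial edge of the MTB5.
    displacement_mm: fracture displacement in millimetres.
    """
    if not 0.0 <= entry_fraction <= 1.0:
        raise ValueError("entry_fraction must lie between 0 and 1")
    if entry_fraction < 1 / 3:
        roman = "I"      # lateral third
    elif entry_fraction < 2 / 3:
        roman = "II"     # middle third
    else:
        roman = "III"    # medial third
    letter = "A" if displacement_mm < 2.0 else "B"  # A: < 2 mm, B: >= 2 mm
    return roman + letter
```

    For instance, a fracture entering at the middle of the joint surface with exactly 2 mm of displacement would be labelled IIB under these assumptions.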

  18. Region-based Unsupervised Classification of SAR Images

    NARCIS (Netherlands)

    K. Kayabol (Koray)

    2012-01-01

    Many applications in remote sensing, varying from crop and forest classification to urban area extraction, use Synthetic Aperture Radar (SAR) image classification. As ERCIM Fellows, we have studied the classification of land covers for a year. Our results on the classification of water,

  19. The applicability of the international classification of functioning, disability, and health to study lifestyle and quality of life of colorectal cancer survivors

    NARCIS (Netherlands)

    Roekel, E.H. van; Bours, M.J.; Brouwer, C.P. de; Napel, H.M.T.D. ten; Sanduleanu, S.; Beets, G.L.; Kant, I.J.; Weijenberg, M.P.

    2014-01-01

    BACKGROUND: Well-designed studies on lifestyle and health-related quality of life (HRQoL) in colorectal cancer survivors based on a biopsychosocial instead of a traditional biomedical approach are warranted. We report on the applicability of the International Classification of Functioning,

  20. MMG-based classification of muscle activity for prosthesis control.

    Science.gov (United States)

    Silva, J; Heim, W; Chau, T

    2004-01-01

    We have previously proposed the use of "muscle sounds" or mechanomyography (MMG) as a reliable alternative measure of muscle activity with the main objective of facilitating the use of more comfortable and functional soft silicone sockets with below-elbow externally powered prostheses. This work describes an integrated strategy where data and sensor fusion algorithms are combined to provide MMG-based detection, estimation and classification of muscle activity. The proposed strategy represents the first ever attempt to generate multiple output signals for practical prosthesis control using an MMG multisensor array embedded distally within a soft silicone socket. This multisensor fusion strategy consists of two stages. The first is the detection stage, which determines the presence or absence of muscle contractions in the acquired signals. Upon detection of a contraction, the second stage, that of classification, specifies the nature of the contraction and determines the corresponding control output. Tests with real amputees indicate that with the simple detection and classification algorithms proposed, MMG is indeed comparable to and may exceed EMG functionally.

  1. Risk Classification and Risk-based Safety and Mission Assurance

    Science.gov (United States)

    Leitner, Jesse A.

    2014-01-01

    Recent activities to revamp and emphasize the need to streamline processes and activities for Class D missions across the agency have led to various interpretations of Class D, including the lumping of a variety of low-cost projects into Class D. Sometimes terms such as Class D minus are used. In this presentation, mission risk classifications will be traced to official requirements and definitions as a measure to ensure that projects and programs align with the guidance and requirements that are commensurate for their defined risk posture. As part of this, the full suite of risk classifications, formal and informal, will be defined, followed by an introduction to the new GPR 8705.4 that is currently under review. GPR 8705.4 lays out guidance for the mission success activities performed at the Classes A-D for NPR 7120.5 projects as well as for projects not under NPR 7120.5. Furthermore, the trends in stepping from Class A into higher risk posture classifications will be discussed. The talk will conclude with a discussion about risk-based safety and mission assurance at GSFC.

  2. Tongue Images Classification Based on Constrained High Dispersal Network

    Directory of Open Access Journals (Sweden)

    Dan Meng

    2017-01-01

    Full Text Available Computer aided tongue diagnosis has a great potential to play important roles in traditional Chinese medicine (TCM). However, the majority of the existing tongue image analyses and classification methods are based on the low-level features, which may not provide a holistic view of the tongue. Inspired by deep convolutional neural network (CNN), we propose a novel feature extraction framework called constrained high dispersal neural networks (CHDNet) to extract unbiased features and reduce human labor for tongue diagnosis in TCM. Previous CNN models have mostly focused on learning convolutional filters and adapting weights between them, but these models have two major issues: redundancy and insufficient capability in handling unbalanced sample distribution. We introduce high dispersal and local response normalization operation to address the issue of redundancy. We also add multiscale feature analysis to avoid the problem of sensitivity to deformation. Our proposed CHDNet learns high-level features and provides more classification information during training time, which may result in higher accuracy when predicting testing samples. We tested the proposed method on a set of 267 gastritis patients and a control group of 48 healthy volunteers. Test results show that CHDNet is a promising method in tongue image classification for the TCM study.

  3. Geographical classification of apple based on hyperspectral imaging

    Science.gov (United States)

    Guo, Zhiming; Huang, Wenqian; Chen, Liping; Zhao, Chunjiang; Peng, Yankun

    2013-05-01

    The geographical origin of apples is often recognized and appreciated by consumers. It is usually an important factor in determining the price of a commercial product. In this work, hyperspectral imaging technology and supervised pattern recognition were used to discriminate apples according to geographical origin. Hyperspectral images of 207 Fuji apple samples were collected by hyperspectral camera (400-1000 nm). Principal component analysis (PCA) was performed on hyperspectral imaging data to determine main efficient wavelength images, and then characteristic variables were extracted by texture analysis based on gray level co-occurrence matrix (GLCM) from dominant waveband image. All characteristic variables were obtained by fusing the data of images in efficient spectra. Support vector machine (SVM) was used to construct the classification model, and showed excellent performance in classification results. The overall classification accuracy was 92.75% in the training set and 89.86% in the prediction set. The overall results demonstrated that the hyperspectral imaging technique coupled with SVM classifier can be efficiently utilized to discriminate Fuji apple according to geographical origins.
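
    The texture step in this abstract relies on the gray level co-occurrence matrix (GLCM). As a rough illustration of what that matrix is, the sketch below builds a normalized, non-symmetric GLCM for one pixel offset and derives the standard contrast feature from it. The function names, the choice of offset, and the use of contrast are assumptions for illustration; the abstract does not state which offsets or texture features the authors used.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix for the pixel offset (dx, dy).

    image: 2-D list of integer gray levels in range(levels).
    Entry [i][j] is the fraction of pixel pairs whose first pixel has
    level i and whose neighbour at the given offset has level j.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("no pixel pairs exist for this offset")
    return [[v / total for v in row] for row in counts]


def glcm_contrast(p):
    """Contrast feature: sum over i, j of p[i][j] * (i - j)**2."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))
```

    Features of this kind, computed on the selected waveband images, would then form the input vector of a classifier such as the SVM mentioned above.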

  4. Overfitting Reduction of Text Classification Based on AdaBELM

    Directory of Open Access Journals (Sweden)

    Xiaoyue Feng

    2017-07-01

    Full Text Available Overfitting is an important problem in machine learning. Several algorithms, such as the extreme learning machine (ELM), suffer from this issue when facing high-dimensional sparse data, e.g., in text classification. One common issue is that the extent of overfitting is not well quantified. In this paper, we propose a quantitative measure of overfitting referred to as the rate of overfitting (RO) and a novel model, named AdaBELM, to reduce the overfitting. With RO, the overfitting problem can be quantitatively measured and identified. The newly proposed model can achieve high performance on multi-class text classification. To evaluate the generalizability of the new model, we designed experiments based on three datasets, i.e., the 20 Newsgroups, Reuters-21578, and BioMed corpora, which represent balanced, unbalanced, and real application data, respectively. Experiment results demonstrate that AdaBELM can reduce overfitting and outperform classical ELM, decision tree, random forests, and AdaBoost on all three text-classification datasets; for example, it can achieve 62.2% higher accuracy than ELM. Therefore, the proposed model has good generalizability.

  5. Correlation-based linear discriminant classification for gene expression data.

    Science.gov (United States)

    Pan, M; Zhang, J

    2017-01-23

    Microarray gene expression technology provides a systematic approach to patient classification. However, microarray data pose a great computational challenge owing to their large dimensionality, small sample sizes, and potential correlations among genes. A recent study has shown that gene-gene correlations have a positive effect on the accuracy of classification models, in contrast to some previous results. In this study, a recently developed correlation-based classifier, the ensemble of random subspace (RS) Fisher linear discriminants (FLDs), was utilized. The impact of gene-gene correlations on the performance of this classifier and other classifiers was studied using simulated datasets and real datasets. A cross-validation framework was used to evaluate the performance of each classifier using the simulated datasets or real datasets, and misclassification rates (MRs) were computed. Using the simulated data, the average MRs of the correlation-based classifiers decreased as the correlations increased when there were more correlated genes. Using real data, the correlation-based classifiers outperformed the non-correlation-based classifiers, especially when the gene-gene correlations were high. The ensemble RS-FLD classifier is a potential state-of-the-art computational method. The correlation-based ensemble RS-FLD classifier was effective and benefited from gene-gene correlations, particularly when the correlations were high.

  6. [A comparison between the revision of Atlanta classification and determinant-based classification in acute pancreatitis].

    Science.gov (United States)

    Wu, D; Lu, B; Xue, H D; Lai, Y M; Qian, J M; Yang, H

    2017-12-01

    Objective: To compare the performance of the revision of Atlanta classification (RAC) and determinant-based classification (DBC) in acute pancreatitis. Methods: Consecutive patients with acute pancreatitis admitted to a single center from January 2001 to January 2015 were retrospectively analyzed. Patients were classified into mild, moderately severe and severe categories based on RAC and were simultaneously classified into mild, moderate, severe and critical grades according to DBC. Disease severity and clinical outcomes were compared between subgroups. The receiver operating characteristic (ROC) curve was used to compare the utility of RAC and DBC by calculating the area under the curve (AUC). Results: Among 1 120 patients enrolled, organ failure occurred in 343 patients (30.6%) and infected necrosis in 74 patients (6.6%). A total of 63 patients (5.6%) died. Statistically significant differences in disease severity and outcomes were observed between all the subgroups in RAC and DBC (Ppancreatitis (with both persistent organ failure and infected necrosis) had the most severe clinical course and the highest mortality (19/31, 61.3%). DBC had a larger AUC (0.73, 95%CI 0.69-0.78) than RAC (0.68, 95%CI 0.65-0.73) in classifying ICU admissions (P=0.031), but both were similar in predicting mortality (P=0.372) and prolonged ICU stay (P=0.266). Conclusions: DBC and RAC perform comparably well in categorizing patients with acute pancreatitis regarding disease severity and clinical outcome. DBC is slightly better than RAC in predicting prolonged hospital stay. Persistent organ failure and infected necrosis are risk factors for poor prognosis and presence of both is associated with the most dismal outcome.
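
    The comparison in this abstract rests on the area under the ROC curve (AUC). One way to see what that number measures: it equals the probability that a randomly chosen patient with the outcome receives a higher severity score than a randomly chosen patient without it, with ties counted as one half. The sketch below computes the AUC that way. The function name is hypothetical, and treating a severity grade as an ordinal score (for example 1 to 4 for the DBC grades) is an assumption for illustration, not a description of the authors' analysis.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    scores: numeric severity scores, higher meaning more severe.
    labels: 1 if the outcome occurred (e.g. ICU admission), else 0.
    Tied pairs contribute 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

    Because a classification with only three or four grades produces many ties, the tie handling matters more here than it would for a continuous score.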

  7. Breast Cancer Survival Defined by the ER/PR/HER2 Subtypes and a Surrogate Classification according to Tumor Grade and Immunohistochemical Biomarkers

    Directory of Open Access Journals (Sweden)

    Carol A. Parise

    2014-01-01

    Full Text Available Introduction. ER, PR, and HER2 are routinely available in breast cancer specimens. The purpose of this study is to contrast breast cancer-specific survival for the eight ER/PR/HER2 subtypes with survival of an immunohistochemical surrogate for the molecular subtype based on the ER/PR/HER2 subtypes and tumor grade. Methods. We identified 123,780 cases of stages 1–3 primary female invasive breast cancer from California Cancer Registry. The surrogate classification was derived using ER/PR/HER2 and tumor grade. Kaplan-Meier survival analysis and Cox proportional hazards modeling were used to assess differences in survival and risk of mortality for the ER/PR/HER2 subtypes and surrogate classification within each stage. Results. The luminal B/HER2− surrogate classification had a higher risk of mortality than the luminal B/HER2+ for all stages of disease. There was no difference in risk of mortality between the ER+/PR+/HER2− and ER+/PR+/HER2+ in stage 3. With one exception in stage 3, the ER-negative subtypes all had an increased risk of mortality when compared with the ER-positive subtypes. Conclusions. Assessment of survival using ER/PR/HER2 illustrates the heterogeneity of HER2+ subtypes. The surrogate classification provides clear separation in survival and adjusted mortality but underestimates the wide variability within the subtypes that make up the classification.
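
    The "eight ER/PR/HER2 subtypes" mentioned here are the 2 × 2 × 2 combinations of positive or negative status for the three markers. A small sketch that enumerates them follows; the function name is hypothetical, and an ASCII hyphen stands in for the minus sign used in the abstract. The rules of the surrogate classification are not given in the abstract and are not reproduced.

```python
from itertools import product


def er_pr_her2_subtypes():
    """List all positive/negative combinations of ER, PR and HER2 status."""
    markers = ("ER", "PR", "HER2")
    return [
        "/".join(marker + sign for marker, sign in zip(markers, signs))
        for signs in product("+-", repeat=len(markers))
    ]
```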

  8. Cell nuclei attributed relational graphs for efficient representation and classification of gastric cancer in digital histopathology

    Science.gov (United States)

    Sharma, Harshita; Zerbe, Norman; Heim, Daniel; Wienert, Stephan; Lohmann, Sebastian; Hellwich, Olaf; Hufnagl, Peter

    2016-03-01

    This paper describes a novel graph-based method for efficient representation and subsequent classification in histological whole slide images of gastric cancer. Her2/neu immunohistochemically stained and haematoxylin and eosin stained histological sections of gastric carcinoma are digitized. Immunohistochemical staining is used in practice by pathologists to determine extent of malignancy, however, it is laborious to visually discriminate the corresponding malignancy levels in the more commonly used haematoxylin and eosin stain, and this study attempts to solve this problem using a computer-based method. Cell nuclei are first isolated at high magnification using an automatic cell nuclei segmentation strategy, followed by construction of cell nuclei attributed relational graphs of the tissue regions. These graphs represent tissue architecture comprehensively, as they contain information about cell nuclei morphology as vertex attributes, along with knowledge of neighborhood in the form of edge linking and edge attributes. Global graph characteristics are derived and ensemble learning is used to discriminate between three types of malignancy levels, namely, non-tumor, Her2/neu positive tumor and Her2/neu negative tumor. Performance is compared with state of the art methods including four texture feature groups (Haralick, Gabor, Local Binary Patterns and Varma Zisserman features), color and intensity features, and Voronoi diagram and Delaunay triangulation. Texture, color and intensity information is also combined with graph-based knowledge, followed by correlation analysis. Quantitative assessment is performed using two cross validation strategies. On investigating the experimental results, it can be concluded that the proposed method provides a promising way for computer-based analysis of histopathological images of gastric cancer.

  9. Characterization and classification of lupus patients based on plasma thermograms.

    Directory of Open Access Journals (Sweden)

    Nichola C Garbett

    Full Text Available Plasma thermograms (thermal stability profiles of blood plasma) are being utilized as a new diagnostic approach for clinical assessment. In this study, we investigated the ability of plasma thermograms to classify systemic lupus erythematosus (SLE) patients versus non SLE controls using a sample of 300 SLE and 300 control subjects from the Lupus Family Registry and Repository. Additionally, we evaluated the heterogeneity of thermograms along age, sex, ethnicity, concurrent health conditions and SLE diagnostic criteria. Thermograms were visualized graphically for important differences between covariates and summarized using various measures. A modified linear discriminant analysis was used to segregate SLE versus control subjects on the basis of the thermograms. Classification accuracy was measured based on multiple training/test splits of the data and compared to classification based on SLE serological markers. Median sensitivity, specificity, and overall accuracy based on classification using plasma thermograms was 86%, 83%, and 84% compared to 78%, 95%, and 86% based on a combination of five antibody tests. Combining thermogram and serology information together improved sensitivity from 78% to 86% and overall accuracy from 86% to 89% relative to serology alone. Predictive accuracy of thermograms for distinguishing SLE and osteoarthritis / rheumatoid arthritis patients was comparable. Both gender and anemia significantly interacted with disease status for plasma thermograms (p<0.001), with greater separation between SLE and control thermograms for females relative to males and for patients with anemia relative to patients without anemia. Plasma thermograms constitute an additional biomarker which may help improve diagnosis of SLE patients, particularly when coupled with standard diagnostic testing. Differences in thermograms according to patient sex, ethnicity, clinical and environmental factors are important considerations for application of
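
    Sensitivity, specificity, and overall accuracy are linked by the case/control mix, so the figures quoted here can be cross-checked with simple arithmetic. The sketch below is only a consistency check under the assumption of a single evaluation on 300 cases and 300 controls; the reported values are medians over several training/test splits, so an exact match is not expected. The function name is hypothetical.

```python
def overall_accuracy(sensitivity, specificity, n_cases, n_controls):
    """Accuracy implied by sensitivity and specificity for a given sample mix."""
    true_positives = sensitivity * n_cases
    true_negatives = specificity * n_controls
    return (true_positives + true_negatives) / (n_cases + n_controls)


# Thermogram-based classification: 86% sensitivity, 83% specificity.
thermogram_accuracy = overall_accuracy(0.86, 0.83, 300, 300)
# Serology-based classification: 78% sensitivity, 95% specificity.
serology_accuracy = overall_accuracy(0.78, 0.95, 300, 300)
```

    With balanced groups the implied accuracy is simply the mean of sensitivity and specificity, about 0.845 and 0.865 here, which is in line with the reported 84% and 86%.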

  10. Objective classification system for sagittal craniosynostosis based on suture segmentation

    Science.gov (United States)

    Qian, Xiaohua; Tan, Hua; Zhang, Jian; Zhuang, Xiahai; Branch, Leslie; Sanger, Chaire; Thompson, Allison; Zhao, Weiling; Li, King Chuen; David, Lisa; Zhou, Xiaobo

    2015-01-01

    Purpose: Spring-assisted surgery is an effective and minimally invasive treatment for sagittal craniosynostosis (CSO). The principal barrier to the advancement of spring-assisted surgery is the patient-specific spring selection. The selection of spring force depends on the suture involved, subtypes of sagittal CSO, and age of the infant, among other factors. Clinically, physicians manually judge the subtype of sagittal CSO patients based on their CT image data, which may cause bias from different clinicians. An objective system would be helpful to stratify the sagittal CSO patients and make spring choice less subjective. Methods: The authors developed a novel informatics system to automatically segment and characterize sutures and classify sagittal CSO. The proposed system is composed of three phases: preprocessing, sutures segmentation, and classification. First, the three-dimensional (3D) skull was extracted from the CT images and aligned with the symmetry of the cranial vault. Second, a “hemispherical projection” algorithm was developed to transform 3D surface of the skull to a polar two-dimensional plane. Through the transformation, an “effective” projected region can be obtained to enable easy segmentation of sutures. Then, the different types of sutures, such as coronal sutures, lambdoid sutures, sagittal suture, and metopic suture, obtained from the segmented sutures were further identified by a dual-projection technique of the midline of the sutures. Finally, 108 quantified features of sutures were extracted and selected by a proposed multiclass feature scoring system. The sagittal CSO patients were classified into four subtypes: anterior, central, posterior, and complex with the support vector machine approach. Fivefold cross validation (CV) was employed to evaluate the capability of selected features in discriminating the four subtypes in 33 sagittal CSO patients. Receiver operating characteristics (ROC) curves were used to assess the robustness
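
    With only 33 patients, fivefold cross validation leaves folds of unequal size. The sketch below shows one common way to partition sample indices into k nearly equal folds; the function name is hypothetical, and the abstract does not say how the authors assigned patients to folds (for example whether assignment was shuffled or stratified by subtype).

```python
def kfold_indices(n_samples, k):
    """Split range(n_samples) into k consecutive folds whose sizes differ by at most 1."""
    if not 1 <= k <= n_samples:
        raise ValueError("k must be between 1 and n_samples")
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # the first `extra` folds get one more
        folds.append(list(range(start, start + size)))
        start += size
    return folds
```

    For 33 samples and k = 5 this gives folds of 7, 7, 7, 6, and 6 samples; each fold serves once as the test set while the remaining four are used for training.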

  11. Exercise intensity classification in cancer patients undergoing allogeneic HCT.

    Science.gov (United States)

    Kuehl, Rea; Scharhag-Rosenberger, Friederike; Schommer, Kai; Schmidt, Martina E; Dreger, Peter; Huber, Gerhard; Bohus, Martin; Ulrich, Cornelia M; Wiskemann, Joachim

    2015-05-01

    Exercise intervention studies during and after cancer treatment show beneficial effects for various physical and psychosocial outcomes. Current exercise intensity guidelines for cancer patients are rather general and have been adapted from American College of Sports Medicine (ACSM) recommendations for healthy individuals. Intensive cancer treatment regimens such as allogeneic hematopoietic stem cell transplantation (allo-HCT) may change the cardiovascular response to acute exercise. Therefore, we evaluated the relationships between %V˙O2 reserve (%V˙O2R, reference) and %HRR, %HRmax, and %V˙O2max and compared calculated intensities with given intensities by ACSM. Measurements before and 180 d after allo-HCT from a randomized controlled trial were used. Only patients who reached maximal effort and at least two exercise stages in our maximal incremental cycling test were included. Before allo-HCT, 106 patients were included, and 180 d after treatment, 49 patients met our inclusion criteria. Individual regression lines were calculated with V˙O2R as the reference. Calculated exercise intensities for endurance training prescription were compared with ACSM values. Before allo-HCT, %HRR values of patients were significantly lower than ACSM values, and %HRmax and %V˙O2max values were significantly higher (except 90% HRmax, which was significantly lower, all P exercise intensity recommendations for endurance training may not be applicable for cancer patients during and 180 d after allo-HCT because they may not meet the targeted intensity class, with the exception of %HRR 180 d after allo-HCT.
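
    The intensity measures compared in this abstract are related by standard definitions: a percentage of reserve expresses a value relative to the span between resting and maximal levels (as in %HRR and %V˙O2R), whereas a percentage of maximum ignores the resting level (as in %HRmax and %V˙O2max). A minimal sketch of both follows; the function names and the example heart rates are illustrative assumptions, not data from the study.

```python
def percent_of_reserve(value, rest, maximum):
    """Percentage of reserve, e.g. %HRR = 100 * (HR - HRrest) / (HRmax - HRrest)."""
    if maximum <= rest:
        raise ValueError("maximum must exceed the resting value")
    return 100.0 * (value - rest) / (maximum - rest)


def percent_of_max(value, maximum):
    """Percentage of maximum, e.g. %HRmax = 100 * HR / HRmax."""
    if maximum <= 0:
        raise ValueError("maximum must be positive")
    return 100.0 * value / maximum
```

    For a hypothetical patient with a resting heart rate of 70, a maximal heart rate of 170, and an exercise heart rate of 130 beats per minute, the same effort corresponds to 60% HRR but roughly 76% HRmax, which is why the two scales need separate intensity cut-offs.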

  12. Update on epidemiology classification, and management of thyroid cancer

    Directory of Open Access Journals (Sweden)

    Heitham Gheriani

    2006-06-01

    Full Text Available Thyroid cancer represents approximately 0.5–1% of all human malignancy [1]. In the UK the incidence of thyroid cancer is 2-3 per 100,000 population [2]. In geographical areas of low iodine intake and in areas exposed to nuclear disasters the incidence of thyroid cancer is higher. Benign thyroid conditions are much more common. In the UK approximately 8% of the population have nodular thyroid disease [2]. Nodular thyroid disease increases with age and is also more common in females and in geographical areas of low iodine intake. Primary thyroid malignancy can be broadly divided into 2 groups. The first group, which generally has a much better prognosis, comprises the well-differentiated thyroid carcinomas, which include papillary carcinoma, follicular carcinoma and Hürthle cell tumours. The second group includes the poorly differentiated thyroid carcinomas, such as medullary thyroid carcinoma and anaplastic thyroid carcinoma. Other rare tumours such as sarcomas, lymphomas and the extremely rare primary squamous cell carcinoma of the thyroid should be included in the second group. Secondary or metastatic thyroid cancer can be from breast, lung, colon and kidney malignancies.

  13. Breast cancer tumor classification using LASSO method selection approach

    Energy Technology Data Exchange (ETDEWEB)

    Celaya P, J. M.; Ortiz M, J. A.; Martinez B, M. R.; Solis S, L. O.; Castaneda M, R.; Garza V, I.; Martinez F, M.; Ortiz R, J. M., E-mail: morvymm@yahoo.com.mx [Universidad Autonoma de Zacatecas, Av. Ramon Lopez Velarde 801, Col. Centro, 98000 Zacatecas, Zac. (Mexico)

    2016-10-15

    Breast cancer is one of the leading causes of death worldwide among women. Early tumor detection is key in reducing breast cancer deaths, and screening mammography is the most widely available method for early detection. Mammography is the most common and effective breast cancer screening test. However, the rate of positive findings is very low, making the radiologic interpretation monotonous and prone to errors. In an attempt to alleviate radiological workload, this work presents a computer-aided diagnosis (CADx) method aimed at automatically classifying tumor lesions as malignant or benign, as a means to a second opinion. The CADx method extracts image features and classifies the screening mammogram abnormality into one of two categories: subject at risk of having a malignant tumor (malign), and healthy subject (benign). In this study, 143 abnormal segmentations (57 malign and 86 benign) from the Breast Cancer Digital Repository (BCDR) public database were used to train and evaluate the CADx system. Percentile-rank (p-rank) was used to standardize the data. Using the LASSO feature selection methodology, the model achieved a leave-one-out cross-validation area under the receiver operating characteristic curve (AUC) of 0.950. The proposed method has the potential to rank abnormal lesions with high probability of malignant findings, aiding in the detection of potential malign cases as a second opinion to the radiologist. (Author)
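
    Two of the steps named in this abstract, percentile-rank standardization and the AUC used to score the model, can be sketched in a few lines. This is a minimal illustration only: the function names and the tie-handling convention are assumptions, and the LASSO fit and leave-one-out loop from the study are not reproduced here.

```python
def percentile_rank(values):
    """Map each value to (count below + half the ties) / n, giving a rank in (0, 1)."""
    n = len(values)
    ranks = []
    for v in values:
        below = sum(1 for u in values if u < v)
        ties = sum(1 for u in values if u == v)  # includes v itself
        ranks.append((below + 0.5 * ties) / n)
    return ranks


def auc(scores, labels):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    Labels are assumed to be 1 (malign) or 0 (benign).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

    Rank-based standardization of this kind makes features comparable regardless of their original units, which matters for a penalized model such as LASSO whose penalty is scale-sensitive.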

  14. Classification of Noisy Data: An Approach Based on Genetic Algorithms and Voronoi Tessellation

    DEFF Research Database (Denmark)

    Khan, Abdul Rauf; Schiøler, Henrik; Knudsen, Torben

    Classification is one of the major constituents of the data-mining toolkit. The well-known methods for classification are built on either the principle of logic or statistical/mathematical reasoning for classification. In this article we propose: (1) a different strategy, which is based on the po...

  15. Application of Bayesian Classification to Content-Based Data Management

    Science.gov (United States)

    Lynnes, Christopher; Berrick, S.; Gopalan, A.; Hua, X.; Shen, S.; Smith, P.; Yang, K-Y.; Wheeler, K.; Curry, C.

    2004-01-01

    The high volume of Earth Observing System data has proven to be challenging to manage for data centers and users alike. At the Goddard Earth Sciences Distributed Active Archive Center (GES DAAC), about 1 TB of new data are archived each day. Distribution to users is also about 1 TB/day. A substantial portion of this distribution is MODIS calibrated radiance data, which has a wide variety of uses. However, much of the data is not useful for a particular user's needs: for example, ocean color users typically need oceanic pixels that are free of cloud and sun-glint. The GES DAAC is using a simple Bayesian classification scheme to rapidly classify each pixel in the scene in order to support several experimental content-based data services for near-real-time MODIS calibrated radiance products (from Direct Readout stations). Content-based subsetting would allow distribution of, say, only clear pixels to the user if desired. Content-based subscriptions would distribute data to users only when they fit the user's usability criteria in their area of interest within the scene. Content-based cache management would retain more useful data on disk for easy online access. The classification may even be exploited in an automated quality assessment of the geolocation product. Though initially to be demonstrated at the GES DAAC, these techniques have applicability in other resource-limited environments, such as spaceborne data systems.
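
    A per-pixel Bayesian classifier of the general kind described above might look like the following sketch. The abstract does not specify the likelihood model, so the Gaussian naive Bayes form, the class names, and the feature values used below are all assumptions made for illustration.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def classify_pixel(features, class_models, priors):
    """Return the class with the highest posterior for one pixel.

    class_models maps a label to a list of (mean, std) pairs, one per feature.
    Features are treated as conditionally independent (naive Bayes assumption).
    """
    best_label, best_logp = None, -math.inf
    for label, params in class_models.items():
        logp = math.log(priors[label])
        for x, (mean, std) in zip(features, params):
            # The tiny constant guards against log(0) for extreme values.
            logp += math.log(gaussian_pdf(x, mean, std) + 1e-300)
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label


# Hypothetical two-feature models: (reflectance, brightness temperature in K).
MODELS = {
    "clear_ocean": [(0.05, 0.02), (290.0, 5.0)],
    "cloud": [(0.60, 0.15), (250.0, 15.0)],
}
PRIORS = {"clear_ocean": 0.5, "cloud": 0.5}
```

    Content-based subsetting would then amount to keeping only the pixels for which `classify_pixel` returns the class the user asked for.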

  16. Examining applying high performance genetic data feature selection and classification algorithms for colon cancer diagnosis.

    Science.gov (United States)

    Al-Rajab, Murad; Lu, Joan; Xu, Qiang

    2017-07-01

    This paper examines the accuracy and efficiency (time complexity) of high performance genetic data feature selection and classification algorithms for colon cancer diagnosis. The need for this research derives from the urgent and increasing need for accurate and efficient algorithms. Colon cancer is a leading cause of death worldwide, hence it is vitally important for the cancer tissues to be expertly identified and classified in a rapid and timely manner, to assure both a fast detection of the disease and to expedite the drug discovery process. In this research, a three-phase approach was proposed and implemented: Phases One and Two examined the feature selection algorithms and classification algorithms employed separately, and Phase Three examined the performance of the combination of these. It was found from Phase One that the Particle Swarm Optimization (PSO) algorithm performed best with the colon dataset as a feature selection (29 genes selected) and from Phase Two that the Support Vector Machine (SVM) algorithm outperformed other classifications, with an accuracy of almost 86%. It was also found from Phase Three that the combined use of PSO and SVM surpassed other algorithms in accuracy and performance, and was faster in terms of time analysis (94%). It is concluded that applying feature selection algorithms prior to classification algorithms results in better accuracy than when the latter are applied alone. This conclusion is important and significant to industry and society. Copyright © 2017 Elsevier B.V. All rights reserved.

  17. Object-based Dimensionality Reduction in Land Surface Phenology Classification

    Directory of Open Access Journals (Sweden)

    Brian E. Bunker

    2016-11-01

    Full Text Available Unsupervised classification or clustering of multi-decadal land surface phenology provides a spatio-temporal synopsis of natural and agricultural vegetation response to environmental variability and anthropogenic activities. Notwithstanding the detailed temporal information available in calibrated bi-monthly normalized difference vegetation index (NDVI) and comparable time series, typical pre-classification workflows average a pixel’s bi-monthly index within the larger multi-decadal time series. While this process is one practical way to reduce the dimensionality of time series with many hundreds of image epochs, it effectively dampens temporal variation from both intra and inter-annual observations related to land surface phenology. Through a novel application of object-based segmentation aimed at spatial (not temporal) dimensionality reduction, all 294 image epochs from a Moderate Resolution Imaging Spectroradiometer (MODIS) bi-monthly NDVI time series covering the northern Fertile Crescent were retained (in homogeneous landscape units) as unsupervised classification inputs. Given the inherent challenges of in situ or manual image interpretation of land surface phenology classes, a cluster validation approach based on transformed divergence enabled comparison between traditional and novel techniques. Improved intra-annual contrast was clearly manifest in rain-fed agriculture and inter-annual trajectories showed increased cluster cohesion, reducing the overall number of classes identified in the Fertile Crescent study area from 24 to 10. Given careful segmentation parameters, this spatial dimensionality reduction technique augments the value of unsupervised learning to generate homogeneous land surface phenology units. By combining recent scalable computational approaches to image segmentation, future work can pursue new global land surface phenology products based on the high temporal resolution signatures of vegetation index time series.

  18. MRT letter: segmentation and texture-based classification of breast mammogram images.

    Science.gov (United States)

    Naveed, Nawazish; Jaffar, M Arfan; Choi, Tae-Sun

    2011-11-01

    Breast cancer is the most common cancer diagnosed among women. In this article, a support vector machine is used to classify digital mammogram images as malignant or benign. A Wiener filter is used to handle the possible quantum noise, which is more likely to occur in mammograms. A stack-based connected component method is proposed for background removal, and the image is enhanced using the Retinex method. A seeded region growing algorithm is used to remove the pectoral muscle part of the mammogram. We extracted 13 different multidomain features for classification. Results show the superiority of the proposed algorithm in terms of sensitivity, specificity, and accuracy. We used the MIAS mammography database for experimentation. Copyright © 2011 Wiley Periodicals, Inc.

  19. New Adaptive Image Quality Assessment Based on Distortion Classification

    Directory of Open Access Journals (Sweden)

    Xin JIN

    2014-01-01

    Full Text Available This paper proposes a new adaptive image quality assessment (AIQA) method, which is based on distortion classification. AIQA contains two parts: distortion classification and image quality assessment. First, we analyze characteristics of the original and distorted images, including the distribution of wavelet coefficients and the ratio of edge energy to inner energy of the differential image block, and divide distorted images into white noise distortion, JPEG compression distortion, and blur distortion. To evaluate the quality of images with the first two distortion types, we use a pixel-based structural similarity metric and a DCT-based structural similarity metric, respectively. For blurred images, we present a new wavelet-based structural similarity algorithm. According to the experimental results, AIQA takes advantage of the different structural similarity metrics and is able to simulate human visual perception effectively.

  20. Toward automated classification of consumers' cancer-related questions with a new taxonomy of expected answer types.

    Science.gov (United States)

    McRoy, Susan; Jones, Sean; Kurmally, Adam

    2016-09-01

    This article examines methods for automated question classification applied to cancer-related questions that people have asked on the web. This work is part of a broader effort to provide automated question answering for health education. We created a new corpus of consumer-health questions related to cancer and a new taxonomy for those questions. We then compared the effectiveness of different statistical methods for developing classifiers, including weighted classification and resampling. Basic methods for building classifiers were limited by the high variability in the natural distribution of questions and typical refinement approaches of feature selection and merging categories achieved only small improvements to classifier accuracy. Best performance was achieved using weighted classification and resampling methods, the latter yielding an accuracy of F1 = 0.963. Thus, it would appear that statistical classifiers can be trained on natural data, but only if natural distributions of classes are smoothed. Such classifiers would be useful for automated question answering, for enriching web-based content, or assisting clinical professionals to answer questions. © The Author(s) 2015.
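
    The abstract reports that resampling to smooth the class distribution gave the best results but does not say which resampling scheme was used. The sketch below shows one common option, random oversampling of minority classes up to the size of the largest class; the function name and the fixed seed are assumptions for illustration.

```python
import random


def oversample(examples, labels, seed=0):
    """Randomly duplicate minority-class examples until all classes are equally large."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    by_class = {}
    for x, y in zip(examples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Draw with replacement from the class's own examples to fill the gap.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y
```

    Resampling should be applied to the training portion only; duplicating examples before splitting would leak copies into the evaluation set and inflate the reported score.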

  1. Network analysis of genes regulated in renal diseases: implications for a molecular-based classification

    Directory of Open Access Journals (Sweden)

    Jagadish HV

    2009-09-01

    Full Text Available Abstract Background Chronic renal diseases are currently classified based on morphological similarities such as whether they produce predominantly inflammatory or non-inflammatory responses. However, such classifications do not reliably predict the course of the disease and its response to therapy. In contrast, recent studies in diseases such as breast cancer suggest that a classification which includes molecular information could lead to more accurate diagnoses and prediction of treatment response. This article describes how we extracted gene expression profiles from biopsies of patients with chronic renal diseases, and used network visualizations and associated quantitative measures to rapidly analyze similarities and differences between the diseases. Results The analysis revealed three main regularities: (1) Many genes associated with a single disease, and fewer genes associated with many diseases. (2) Unexpected combinations of renal diseases that share relatively large numbers of genes. (3) Uniform concordance in the regulation of all genes in the network. Conclusion The overall results suggest the need to define a molecular-based classification of renal diseases, in addition to hypotheses for the unexpected patterns of shared genes and the uniformity in gene concordance. Furthermore, the results demonstrate the utility of network analyses to rapidly understand complex relationships between diseases and regulated genes.

  2. An Expression Signature as an Aid to the Histologic Classification of Non-Small Cell Lung Cancer

    Science.gov (United States)

    Girard, Luc; Rodriguez-Canales, Jaime; Behrens, Carmen; Thompson, Debrah M.; Botros, Ihab W.; Tang, Hao; Xie, Yang; Rekhtman, Natasha; Travis, William D.; Wistuba, Ignacio I.; Minna, John D.; Gazdar, Adi F.

    2017-01-01

    Purpose Most non-small cell lung cancers (NSCLCs) are now diagnosed from small specimens, and classification using standard pathology methods can be difficult. This is of clinical relevance as many therapy regimens and clinical trials are histology dependent. The purpose of this study was to develop an mRNA expression signature as an adjunct test for routine histo-pathological classification of NSCLCs. Experimental Design A microarray dataset of resected adenocarcinomas (ADC) and squamous cell carcinomas (SCC) was used as the learning set for an ADC-SCC signature. The Cancer Genome Atlas (TCGA) lung RNAseq dataset was used for validation. Another microarray dataset of ADCs and matched non-malignant lung was used as the learning set for a Tumor vs. Nonmalignant signature. The classifiers were selected as the most differentially expressed genes and sample classification was determined by a nearest distance approach. Results We developed a 62-gene expression signature that contained many genes used in immunostains for NSCLC typing. It includes 42 genes that distinguish ADC from SCC and 20 genes differentiating non-malignant lung from lung cancer. Testing of the TCGA and other public datasets resulted in high prediction accuracies (93–95%). Additionally, a prediction score was derived that correlates both with histologic grading and prognosis. We developed a practical version of the Classifier using the HTG EdgeSeq nuclease protection-based technology in combination with next-generation sequencing that can be applied to formalin-fixed paraffin-embedded (FFPE) tissues and small biopsies. Conclusions Our RNA classifier provides an objective, quantitative method to aid in the pathological diagnosis of lung cancer. PMID:27354471

  3. Feature selection gait-based gender classification under different circumstances

    Science.gov (United States)

    Sabir, Azhin; Al-Jawad, Naseer; Jassim, Sabah

    2014-05-01

    This paper proposes a gender classification method based on human gait features and investigates two variations, clothing (wearing coats) and carrying a bag, in addition to the normal gait sequence. The feature vectors in the proposed system are constructed after applying a wavelet transform. Three different feature sets are proposed. The first is spatio-temporal distance, which deals with the distances between different parts of the human body (such as feet, knees, hands, height and shoulders) during one gait cycle. The second and third feature sets are constructed from the approximation and non-approximation coefficients of the human body, respectively. To extract these two feature sets, we divide the human body into upper and lower parts based on the golden ratio proportion. We adopt a statistical method for constructing the feature vector from the above sets. The dimension of the constructed feature vector is reduced using the Fisher score as a feature selection method to optimize discriminating significance. Finally, k-Nearest Neighbor is applied as the classification method. Experimental results demonstrate that our approach handles a more realistic scenario and achieves relatively better performance compared with existing approaches.
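
    The Fisher-score selection step mentioned above can be sketched as follows. This uses one common definition of the score (between-class scatter divided by within-class scatter, per feature); the exact variant used in the paper is not stated, and the function names are assumptions.

```python
def fisher_scores(X, y):
    """Fisher score of each feature: between-class scatter / within-class scatter."""
    n_features = len(X[0])
    classes = sorted(set(y))
    scores = []
    for j in range(n_features):
        column = [row[j] for row in X]
        overall_mean = sum(column) / len(column)
        between = within = 0.0
        for c in classes:
            vals = [row[j] for row, label in zip(X, y) if label == c]
            mean_c = sum(vals) / len(vals)
            var_c = sum((v - mean_c) ** 2 for v in vals) / len(vals)
            between += len(vals) * (mean_c - overall_mean) ** 2
            within += len(vals) * var_c
        # A feature with zero within-class variance separates perfectly.
        scores.append(between / within if within > 0 else float("inf"))
    return scores


def top_k_features(X, y, k):
    """Indices of the k features with the highest Fisher score."""
    scores = fisher_scores(X, y)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]
```

    The reduced feature vectors would then be passed to the k-Nearest Neighbor classifier; features whose class means coincide receive a score of zero and are dropped first.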

  4. Classification Based on Hierarchical Linear Models: The Need for Incorporation of Social Contexts in Classification Analysis

    Science.gov (United States)

    Vaughn, Brandon K.; Wang, Qui

    2009-01-01

    Many areas in educational and psychological research involve the use of classification statistical analysis. For example, school districts might be interested in attaining variables that provide optimal prediction of school dropouts. In psychology, a researcher might be interested in the classification of a subject into a particular psychological…

  5. Scene classification of infrared images based on texture feature

    Science.gov (United States)

    Zhang, Xiao; Bai, Tingzhu; Shang, Fei

    2008-12-01

    Scene classification refers to assigning a physical scene to one of a set of predefined categories. Texture features provide a useful basis for classifying scenes. Texture can be considered to be repeating patterns of local variation of pixel intensities, and texture analysis is important in many applications of computer image analysis for classification or segmentation of images based on local spatial variations of intensity. Texture describes the structural information of images, so it provides data for classification in addition to the spectrum. Infrared thermal imagers are now used in many fields. Since infrared images of objects reflect their own thermal radiation, infrared images have some shortcomings: poor contrast between objects and background, blurred edges, heavy noise and so on. Because of these shortcomings, it is difficult to extract texture features from infrared images. In this paper we develop an infrared image texture feature-based algorithm to classify scenes of infrared images. The paper studies texture extraction using the Gabor wavelet transform, which has excellent capability for analyzing the frequency and direction of a local region. Gabor wavelets are chosen for their biological relevance and technical properties. First, after introducing the Gabor wavelet transform and the texture analysis methods, texture features are extracted from the infrared images by the Gabor wavelet transform, utilizing the multi-scale property of the Gabor filter. Second, we take the multi-dimensional means and standard deviations at different scales and directions as texture parameters. The last stage is classification of the scene texture parameters with the least squares support vector machine (LS-SVM) algorithm. SVM is based on the principle of structural risk minimization (SRM). Compared with SVM, LS-SVM has overcome the shortcoming of

  6. Interannual rainfall variability and SOM-based circulation classification

    Science.gov (United States)

    Wolski, Piotr; Jack, Christopher; Tadross, Mark; van Aardenne, Lisa; Lennard, Christopher

    2018-01-01

    Self-Organizing Maps (SOM) based classifications of synoptic circulation patterns are increasingly being used to interpret large-scale drivers of local climate variability, and as part of statistical downscaling methodologies. These applications rely on a basic premise of synoptic climatology, i.e. that local weather is conditioned by the large-scale circulation. While it is clear that this relationship holds in principle, the implications of its implementation through SOM-based classification, particularly at interannual and longer time scales, are not well recognized. Here we use a SOM to understand the interannual synoptic drivers of climate variability at two locations in the winter and summer rainfall regimes of South Africa. We quantify the portion of variance in seasonal rainfall totals that is explained by year to year differences in the synoptic circulation, as schematized by a SOM. We furthermore test how different spatial domain sizes and synoptic variables affect the ability of the SOM to capture the dominant synoptic drivers of interannual rainfall variability. Additionally, we identify systematic synoptic forcing that is not captured by the SOM classification. The results indicate that the frequency of synoptic states, as schematized by a relatively disaggregated SOM (7 × 9) of prognostic atmospheric variables, including specific humidity, air temperature and geostrophic winds, captures only 20-45% of interannual local rainfall variability, and that the residual variance contains a strong systematic component. Utilising a multivariate linear regression framework demonstrates that this residual variance can largely be explained using synoptic variables over a particular location; even though they are used in the development of the SOM their influence, however, diminishes with the size of the SOM spatial domain. 
    The influence of the SOM domain size, the choice of SOM atmospheric variables and grid-point explanatory variables on the levels of explained

  7. Joint Probability-Based Neuronal Spike Train Classification

    Directory of Open Access Journals (Sweden)

    Yan Chen

    2009-01-01

    Full Text Available Neuronal spike trains are used by the nervous system to encode and transmit information. Euclidean distance-based methods (EDBMs) have been applied to quantify the similarity between temporally-discretized spike trains and model responses. In this study, using the same discretization procedure, we developed and applied a joint probability-based method (JPBM) to classify individual spike trains of slowly adapting pulmonary stretch receptors (SARs). The activity of individual SARs was recorded in anaesthetized, paralysed adult male rabbits, which were artificially-ventilated at constant rate and one of three different volumes. Two-thirds of the responses to the 600 stimuli presented at each volume were used to construct three response models (one for each stimulus volume) consisting of a series of time bins, each with spike probabilities. The remaining one-third of the responses were used as test responses to be classified into one of the three model responses. This was done by computing the joint probability of observing the same series of events (spikes or no spikes, dictated by the test response) in a given model and determining which probability of the three was highest. The JPBM generally produced better classification accuracy than the EDBM, and both performed well above chance. Both methods were similarly affected by variations in discretization parameters, response epoch duration, and two different response alignment strategies. Increasing bin widths increased classification accuracy, which also improved with increased observation time, but primarily during periods of increasing lung inflation. Thus, the JPBM is a simple and effective method for performing spike train classification.
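
    The joint-probability classification described in this abstract can be sketched as below. Spike trains are assumed to be already discretized into equal-length lists of 0/1 bins. Treating bins as independent (so the joint probability is a product) and applying Laplace smoothing to the bin probabilities are assumptions of this sketch, not details confirmed by the abstract.

```python
import math


def build_model(training_trains):
    """Per-bin spike probability estimated from binary training spike trains."""
    n = len(training_trains)
    n_bins = len(training_trains[0])
    # Laplace smoothing keeps every probability strictly between 0 and 1.
    return [(sum(t[b] for t in training_trains) + 1) / (n + 2) for b in range(n_bins)]


def log_joint_probability(test_train, model):
    """Log-probability of observing the test train's spike/no-spike sequence under a model."""
    return sum(math.log(p if spike else 1.0 - p) for spike, p in zip(test_train, model))


def classify(test_train, models):
    """Name of the response model under which the test train is most probable."""
    return max(models, key=lambda name: log_joint_probability(test_train, models[name]))
```

    Working in log space avoids numerical underflow when the response contains many bins, since the product of many small probabilities quickly falls below floating-point range.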

  8. Informative Gene Selection for Cancer Classification with Microarray Data Using a Metaheuristic Framework

    Science.gov (United States)

    M, Pyingkodi; R, Thangarajan

    2018-02-26

    Objective: Cancer diagnosis is one of the most vital emerging clinical applications of microarray data. Due to the high dimensionality, gene selection is an important step for improving expression data classification performance. There is therefore a need for effective methods to select informative genes for prediction and diagnosis of cancer. The main objective of this research was to derive a heuristic approach to select highly informative genes. Methods: A metaheuristic approach with a Genetic Algorithm with Levy Flight (GA-LV) was applied for classification of cancer genes in microarrays. The experimental results were analyzed with five major cancer gene expression benchmark datasets. Result: GA-LV proved superior to GA and statistical approaches, with 100% accuracy for the dataset for Leukemia, Lung and Lymphoma. For Prostate and Colon datasets the GA-LV was 99.5% and 99.2% accurate, respectively. Conclusion: The experimental results show that the proposed approach is suitable for effective gene selection with all benchmark datasets, removing irrelevant and redundant genes to improve classification accuracy. Creative Commons Attribution License

  9. A fast gene selection method for multi-cancer classification using multiple support vector data description.

    Science.gov (United States)

    Cao, Jin; Zhang, Li; Wang, Bangjun; Li, Fanzhang; Yang, Jiwen

    2015-02-01

    For cancer classification problems based on gene expression, the data usually contain only a few dozen samples but thousands to tens of thousands of genes, many of which may be irrelevant. A robust feature selection algorithm is required to remove irrelevant genes and choose the informative ones. Support vector data description (SVDD) has been applied to gene selection for many years. However, SVDD cannot address problems with multiple classes since it only considers the target class. In addition, applying SVDD to gene selection is time-consuming. This paper proposes a novel fast feature selection method based on multiple SVDD and applies it to multi-class microarray data. A recursive feature elimination (RFE) scheme is introduced to iteratively remove irrelevant features, so the proposed method is called multiple SVDD-RFE (MSVDD-RFE). To make full use of all classes for a given task, MSVDD-RFE independently selects a relevant gene subset for each class. The final selected gene subset is the union of these relevant gene subsets. The effectiveness and accuracy of MSVDD-RFE are validated by experiments on five publicly available microarray datasets. Our proposed method is faster and more effective than other methods. Copyright © 2014 Elsevier Inc. All rights reserved.

  10. Pixel classification based color image segmentation using quaternion exponent moments.

    Science.gov (United States)

    Wang, Xiang-Yang; Wu, Zhi-Fang; Chen, Liang; Zheng, Hong-Liang; Yang, Hong-Ying

    2016-02-01

    Image segmentation remains an important, but hard-to-solve, problem since it appears to be application dependent with usually no a priori information available regarding the image structure. In recent years, many image segmentation algorithms have been developed, but they are often very complex and some undesired results occur frequently. In this paper, we propose a pixel classification based color image segmentation using quaternion exponent moments. Firstly, the pixel-level image feature is extracted based on quaternion exponent moments (QEMs), which can capture effectively the image pixel content by considering the correlation between different color channels. Then, the pixel-level image feature is used as input of twin support vector machines (TSVM) classifier, and the TSVM model is trained by selecting the training samples with Arimoto entropy thresholding. Finally, the color image is segmented with the trained TSVM model. The proposed scheme has the following advantages: (1) the effective QEMs is introduced to describe color image pixel content, which considers the correlation between different color channels, (2) the excellent TSVM classifier is utilized, which has lower computation time and higher classification accuracy. Experimental results show that our proposed method has very promising segmentation performance compared with the state-of-the-art segmentation approaches recently proposed in the literature. Copyright © 2015 Elsevier Ltd. All rights reserved.

  11. Drunk driving detection based on classification of multivariate time series.

    Science.gov (United States)

    Li, Zhenlong; Jin, Xue; Zhao, Xiaohua

    2015-09-01

    This paper addresses the problem of detecting drunk driving based on classification of multivariate time series. First, driving performance measures were collected from a test in a driving simulator located in the Traffic Research Center, Beijing University of Technology. Lateral position and steering angle were used to detect drunk driving. Second, multivariate time series analysis was performed to extract the features. A piecewise linear representation was used to represent multivariate time series. A bottom-up algorithm was then employed to separate multivariate time series. The slope and time interval of each segment were extracted as the features for classification. Third, a support vector machine classifier was used to classify driver's state into two classes (normal or drunk) according to the extracted features. The proposed approach achieved an accuracy of 80.0%. Drunk driving detection based on the analysis of multivariate time series is feasible and effective. The approach has implications for drunk driving detection. Copyright © 2015 Elsevier Ltd and National Safety Council. All rights reserved.
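
    The segmentation and feature-extraction steps described above can be sketched for a single signal as follows. The paper works on multivariate series (lateral position and steering angle); this univariate version, the interpolation-based merge cost, and the `max_error` threshold are simplifying assumptions for illustration.

```python
def interpolation_error(series, i, j):
    """Sum of squared deviations from the straight line joining points i and j."""
    err = 0.0
    for k in range(i + 1, j):
        fitted = series[i] + (series[j] - series[i]) * (k - i) / (j - i)
        err += (series[k] - fitted) ** 2
    return err


def bottom_up_segments(series, max_error):
    """Bottom-up piecewise linear segmentation; returns the breakpoint indices.

    Starts with every sample as a breakpoint and repeatedly removes the
    breakpoint whose removal is cheapest, until the cheapest merge would
    exceed max_error.
    """
    breaks = list(range(len(series)))
    while len(breaks) > 2:
        costs = [interpolation_error(series, breaks[m - 1], breaks[m + 1])
                 for m in range(1, len(breaks) - 1)]
        cheapest = min(range(len(costs)), key=costs.__getitem__)
        if costs[cheapest] > max_error:
            break
        del breaks[cheapest + 1]  # costs[0] corresponds to breaks[1]
    return breaks


def segment_features(series, breaks, dt=1.0):
    """(slope, duration) of each segment, with dt the sampling interval."""
    feats = []
    for i, j in zip(breaks, breaks[1:]):
        duration = (j - i) * dt
        feats.append(((series[j] - series[i]) / duration, duration))
    return feats
```

    The resulting (slope, duration) pairs are the kind of per-segment features that would be fed to the support vector machine classifier.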

  12. [Galaxy/quasar classification based on nearest neighbor method].

    Science.gov (United States)

    Li, Xiang-Ru; Lu, Yu; Zhou, Jian-Ming; Wang, Yong-Jun

    2011-09-01

    With the wide application of high-quality CCDs in celestial spectrum imagery and the implementation of many large sky survey programs (e.g., Sloan Digital Sky Survey (SDSS), Two-degree-Field Galaxy Redshift Survey (2dF), Spectroscopic Survey Telescope (SST), Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) program and Large Synoptic Survey Telescope (LSST) program), celestial observational data are accumulating at a torrential rate. Therefore, to utilize them effectively and fully, research on automated processing methods for celestial data is imperative. In the present work, we investigated how to recognize galaxies and quasars from spectra based on the nearest neighbor method. Galaxies and quasars are extragalactic objects; they are far away from Earth, and their spectra are usually contaminated by various kinds of noise. Therefore, recognizing these two types of spectra is a typical problem in automatic spectra classification. Furthermore, the method used, nearest neighbor, is one of the most typical, classic, mature algorithms in pattern recognition and data mining, and is often used as a benchmark in developing novel algorithms. Regarding applicability in practice, it is shown that the recognition ratio of the nearest neighbor method (NN) is comparable to the best results reported in the literature based on more complicated methods, and the advantage of NN is that it does not need to be trained, which is useful for incremental learning and parallel computation in mass spectral data processing. In conclusion, the results of this work are helpful for studying the classification of galaxy and quasar spectra.

  13. Style-based classification of Chinese ink and wash paintings

    Science.gov (United States)

    Sheng, Jiachuan; Jiang, Jianmin

    2013-09-01

    Following the fact that a large collection of ink and wash paintings (IWP) is being digitized and made available on the Internet, their automated content description, analysis, and management are attracting attention across research communities. While existing research in relevant areas is primarily focused on image processing approaches, a style-based algorithm is proposed to classify IWPs automatically by their authors. As IWPs do not have colors or even tones, the proposed algorithm applies edge detection to locate the local region and detect painting strokes to enable histogram-based feature extraction and capture of important cues to reflect the styles of different artists. Such features are then applied to drive a number of neural networks in parallel to complete the classification, and an information entropy balanced fusion is proposed to make an integrated decision for the multiple neural network classification results in which the entropy is used as a pointer to combine the global and local features. Evaluations via experiments support that the proposed algorithm achieves good performances, providing excellent potential for computerized analysis and management of IWPs.

  14. Chemometric classification of casework arson samples based on gasoline content.

    Science.gov (United States)

    Sinkov, Nikolai A; Sandercock, P Mark L; Harynuk, James J

    2014-02-01

    Detection and identification of ignitable liquids (ILs) in arson debris is a critical part of arson investigations. The challenge of this task is due to the complex and unpredictable chemical nature of arson debris, which also contains pyrolysis products from the fire. ILs, most commonly gasoline, are complex chemical mixtures containing hundreds of compounds that will be consumed or otherwise weathered by the fire to varying extents depending on factors such as temperature, air flow, and the surface on which the IL was placed. While methods such as ASTM E-1618 are effective, data interpretation can be a costly bottleneck in the analytical process for some laboratories. In this study, we address this issue through the application of chemometric tools. Prior to the application of chemometric tools such as PLS-DA and SIMCA, issues of chromatographic alignment and variable selection need to be addressed. Here we use an alignment strategy based on a ladder consisting of perdeuterated n-alkanes. Variable selection and model optimization were automated using a hybrid backward elimination (BE) and forward selection (FS) approach guided by the cluster resolution (CR) metric. In this work, we demonstrate the automated construction, optimization, and application of chemometric tools to casework arson data. The resulting PLS-DA and SIMCA classification models, trained with 165 training set samples, have provided classification of 55 validation set samples based on gasoline content with 100% specificity and sensitivity. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  15. Gear Crack Level Classification Based on EMD and EDT

    Directory of Open Access Journals (Sweden)

    Haiping Li

    2015-01-01

    Full Text Available Gears are among the most essential parts of rotating machinery, and cracking is one of the damage modes that occur most frequently in gears. This paper therefore deals with the problem of classifying different crack levels. The proposed method is mainly based on empirical mode decomposition (EMD) and the Euclidean distance technique (EDT). First, the vibration signal acquired by an accelerometer is processed by EMD to obtain intrinsic mode functions (IMFs). Then, a correlation coefficient based method is proposed to select the sensitive IMFs that contain the main gear fault information. The energy of these IMFs is chosen as the fault feature after comparison with kurtosis and skewness. Finally, the Euclidean distances between the test sample and the trained samples of four classes are calculated, and on this basis the fault level of the test sample is classified. The proposed approach is tested and validated through a gearbox experiment in which four crack levels and three kinds of loads are used. The results show that the proposed method classifies different crack levels with high accuracy and may adapt to different conditions.
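
    The steps after the decomposition (selecting IMFs by correlation, taking their energy, and comparing by Euclidean distance) can be sketched as follows. This is only an illustration under stated assumptions: the IMFs are taken as already computed, the 0.5 correlation threshold, the toy signals, and the per-class reference vectors are hypothetical, and none of the function names come from the paper.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # a constant sequence carries no correlation information
    return cov / (norm_x * norm_y)

def energy_features(signal, imfs, threshold=0.5):
    """Energies of the IMFs whose |correlation| with the signal reaches the threshold."""
    return [sum(v * v for v in imf) for imf in imfs
            if abs(pearson(signal, imf)) >= threshold]

def closest_class(feature, class_references):
    """Label whose reference feature vector is nearest in Euclidean distance."""
    return min(class_references,
               key=lambda label: math.dist(feature, class_references[label]))

signal = [0, 1, 0, -1, 0, 1, 0, -1]
imfs = [
    [0, 0.9, 0, -0.9, 0, 0.9, 0, -0.9],            # follows the signal closely
    [0.1, -0.1, 0.1, -0.1, 0.1, -0.1, 0.1, -0.1],  # uncorrelated ripple
]
feature = energy_features(signal, imfs)
references = {"no crack": [1.0], "severe crack": [3.0]}  # hypothetical class references
print(closest_class(feature, references))
```

    Only the first toy IMF passes the correlation test, so the feature vector has one entry; in practice the same IMF indices would need to be kept for every sample so that all feature vectors have equal length.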

  16. Automated object-based classification of topography from SRTM data

    Science.gov (United States)

    Drăguţ, Lucian; Eisank, Clemens

    2012-01-01

    We introduce an object-based method to automatically classify topography from SRTM data. The new method relies on the concept of decomposing land-surface complexity into more homogeneous domains. An elevation layer is automatically segmented and classified at three scale levels that represent domains of complexity by using self-adaptive, data-driven techniques. For each domain, scales in the data are detected with the help of local variance and segmentation is performed at these appropriate scales. Objects resulting from segmentation are partitioned into sub-domains based on thresholds given by the mean values of elevation and standard deviation of elevation respectively. Results reasonably resemble patterns of existing global and regional classifications, displaying a level of detail close to manually drawn maps. Statistical evaluation indicates that most classes satisfy the regionalization requirements of maximizing internal homogeneity while minimizing external homogeneity. Most objects have boundaries matching natural discontinuities at the regional level. The method is simple and fully automated. The input data consist of only one layer, which does not need any pre-processing. Both segmentation and classification rely on only two parameters: elevation and standard deviation of elevation. The methodology is implemented as a customized process for the eCognition® software, available as an online download. The results are embedded in a web application with visualization and download functionality. PMID:22485060
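
    The thresholding idea in this abstract, where objects are split by the mean of elevation and the mean of its standard deviation, can be sketched for a single level as below. The dictionary keys, the class names ("high", "rough", and so on), and the sample values are assumptions made for illustration; the published workflow runs inside eCognition across three scale levels and is not reproduced here.

```python
from statistics import mean

def partition_objects(objects):
    """Label each object by comparing it with the domain-wide mean thresholds.

    `objects` is a list of dicts with the keys 'elevation' (mean elevation of
    the object) and 'sd' (standard deviation of elevation inside the object).
    """
    elevation_threshold = mean(o["elevation"] for o in objects)
    sd_threshold = mean(o["sd"] for o in objects)
    labels = []
    for o in objects:
        height = "high" if o["elevation"] > elevation_threshold else "low"
        relief = "rough" if o["sd"] > sd_threshold else "smooth"
        labels.append(f"{height}-{relief}")
    return labels

# Hypothetical segmentation objects (elevations in metres).
objects = [
    {"elevation": 100, "sd": 5},
    {"elevation": 200, "sd": 50},
    {"elevation": 1500, "sd": 300},
    {"elevation": 1800, "sd": 20},
]
print(partition_objects(objects))
```

    Both thresholds are derived from the data themselves, which is what makes the scheme parameter-light: no fixed elevation cut-offs have to be supplied by the user.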

  17. Biopharmaceutics classification system-based biowaivers for generic oncology drug products: case studies.

    Science.gov (United States)

    Tampal, Nilufer; Mandula, Haritha; Zhang, Hongling; Li, Bing V; Nguyen, Hoainhon; Conner, Dale P

    2015-02-01

    Establishing bioequivalence (BE) of drugs indicated to treat cancer poses special challenges. For ethical reasons, the studies often need to be conducted in cancer patients rather than in healthy volunteers, especially when the drug is cytotoxic. The Biopharmaceutics Classification System (BCS), introduced by Amidon (1) and adopted by the FDA, presents opportunities to avoid conducting bioequivalence studies in humans. This paper analyzes the application of the BCS approach by the generic pharmaceutical industry and the FDA to oncology drug products. To date, the FDA has granted BCS-based biowaivers for several drug products involving at least four different drug substances used to treat cancer. Compared to in vivo BE studies, development of data to justify BCS waivers is considered somewhat easier, faster, and more cost effective. However, the FDA experience shows that the approval times for applications containing in vitro studies to support BCS-based biowaivers are often as long as those for applications containing in vivo BE studies, primarily because of inadequate information in the submissions. This paper discusses some common causes of delays in the approval of applications requesting BCS-based biowaivers for oncology drug products. Scientific considerations of conducting a non-BCS-based in vivo BE study for generic oncology drug products are also discussed. It is hoped that the information provided in our study will help applicants improve the quality of ANDA submissions in the future.

  18. Evidence-based cancer imaging

    Energy Technology Data Exchange (ETDEWEB)

    Shinagare, Atul B.; Khorasani, Ramin [Dept. of Radiology, Brigham and Women's Hospital, Boston (Korea, Republic of)]

    2017-01-15

    With the advances in the field of oncology, imaging is increasingly used in the follow-up of cancer patients, leading to concerns about over-utilization. Therefore, it has become imperative to make imaging more evidence-based, efficient, cost-effective and equitable. This review explores the strategies and tools to make diagnostic imaging more evidence-based, mainly in the context of follow-up of cancer patients.

  19. Keratoconus: Classification scheme based on videokeratography and clinical signs

    Science.gov (United States)

    Li, Xiaohui; Yang, Huiying; Rabinowitz, Yaron S.

    2013-01-01

    PURPOSE To determine in a longitudinal study whether there is correlation between videokeratography and clinical signs of keratoconus that might be useful to practicing clinicians. SETTING Cornea-Genetic Eye Institute, Cedars-Sinai Medical Center, Los Angeles, California, USA. METHODS Eyes grouped as keratoconus, early keratoconus, keratoconus suspect, or normal based on clinical signs and videokeratography were examined at baseline and followed for 1 to 8 years. Differences in quantitative videokeratography indices and the progression rate were evaluated. The quantitative indices were central keratometry (K), the inferior–superior (I–S) value, and the keratoconus percentage index (KISA). Discriminant analysis was used to estimate the classification rate using the indices. RESULTS There were significant differences at baseline between the normal, keratoconus-suspect, and early keratoconus groups in all indices; the respective means were central K: 44.17 D, 45.13 D, and 45.97 D; I–S: 0.57, 1.20, and 4.44; log(KISA): 2.49, 2.94, and 5.71 (all Pkeratoconus-suspect group progressed to early keratoconus or keratoconus and 75% in the early keratoconus group progressed to keratoconus. Using all 3 indices and age, 86.9% in the normal group, 75.3% in the early keratoconus group, and 44.6% in the keratoconus-suspect group could be classified, yielding a total classification rate of 68.9%. CONCLUSIONS Cross-sectional and longitudinal data showed significant differences between groups in the 3 indices. Use of this classification scheme might form a basis for detecting subclinical keratoconus. PMID:19683159

  20. Bearing Fault Classification Based on Conditional Random Field

    Directory of Open Access Journals (Sweden)

    Guofeng Wang

    2013-01-01

    Full Text Available Condition monitoring of rolling element bearings is paramount for predicting the lifetime of mechanical equipment and maintaining it effectively. To overcome the drawbacks of the hidden Markov model (HMM) and improve diagnostic accuracy, a classifier based on the conditional random field (CRF) model is proposed. In this model, the feature vector sequences and the fault categories are linked by an undirected graphical model in which their relationship is represented by a global conditional probability distribution. In comparison with the HMM, the main advantage of the CRF model is that it can depict the temporal dynamic information between the observation sequences and state sequences without assuming the independence of the input feature vectors. Therefore, the interrelationship between adjacent observation vectors can also be depicted and integrated into the model, which makes the classifier more robust and accurate than the HMM. To evaluate the effectiveness of the proposed method, four kinds of bearing vibration signals, corresponding to normal, inner race pit, outer race pit, and roller pit conditions, are collected from the test rig. The CRF and HMM models are each built to perform fault classification, taking the sub-band energy features of wavelet packet decomposition (WPD) as the observation sequences. Moreover, the K-fold cross-validation method is adopted to make the evaluation of the classifier more reliable. The analysis and comparison under different numbers of folds show that the classification accuracy of the CRF model is higher than that of the HMM. This method sheds new light on the accurate classification of bearing faults.
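
    The K-fold cross-validation used for the evaluation can be sketched independently of the CRF and HMM models themselves. The function name and the contiguous, unshuffled split below are assumptions; the record does not state how the authors partitioned their recordings.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds.

    The first `n_samples % k` folds receive one extra sample so that every
    sample is used for testing exactly once.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size

for train_idx, test_idx in k_fold_indices(10, 3):
    print("train:", train_idx, "test:", test_idx)
```

    A classifier would be fitted on each `train_idx` subset and scored on the matching `test_idx` subset, and the k accuracies averaged; samples are usually shuffled first when their order correlates with the class.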

  1. Breast cancer cell nuclei classification in histopathology images using deep neural networks.

    Science.gov (United States)

    Feng, Yangqin; Zhang, Lei; Yi, Zhang

    2018-02-01

    Cell nuclei classification in breast cancer histopathology images plays an important role in effective diagnosis, since breast cancer can often be characterized by its expression in cell nuclei. However, due to the small and variable sizes of cell nuclei and the heavy noise in histopathology images, traditional machine learning methods cannot achieve the desired recognition accuracy. To address this challenge, this paper presents a novel deep neural network that performs representation learning and cell nuclei recognition in an end-to-end manner. The proposed model hierarchically maps raw medical images into a latent space in which robustness is achieved by employing a stacked denoising autoencoder. A supervised classifier is further developed to improve the discrimination of the model by maximizing inter-subject separability in the latent space. The proposed method involves a cascade model which jointly learns a set of nonlinear mappings and a classifier from the given raw medical images. Such an on-the-shelf learning strategy makes obtaining discriminative features possible, thus leading to better recognition performance. Extensive experiments with benign and malignant breast cancer datasets are conducted to verify the effectiveness of the proposed method. Better performance was obtained when compared with other feature extraction methods, and a higher recognition rate was achieved when compared with seven other classification methods. We propose an end-to-end DNN model for cell nuclei and non-nuclei classification of histopathology images. The results demonstrate that the proposed method can achieve promising performance in cell nuclei classification and is suitable for this task.

  2. Automatic classification of visual evoked potentials based on wavelet decomposition

    Science.gov (United States)

    Stasiakiewicz, Paweł; Dobrowolski, Andrzej P.; Tomczykiewicz, Kazimierz

    2017-04-01

    Diagnosis of the part of the visual system that is responsible for conducting the compound action potential is generally based on visual evoked potentials generated by stimulating the eye with an external light source. The condition of the patient's visual pathway is assessed by a set of parameters that describe the characteristic time-domain extremes, called waves. The decision process is complex, so the diagnosis depends significantly on the experience of the doctor. The authors developed a procedure, based on wavelet decomposition and linear discriminant analysis, that provides automatic classification of visual evoked potentials. The algorithm assigns an individual case to the normal or pathological class. The proposed classifier has a sensitivity of 96.4% at a false alarm probability of 10.4% in a group of 220 cases, and the area under the ROC curve equals 0.96, which, from the medical point of view, is a very good result.
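
    The wavelet decomposition step can be illustrated with a minimal multi-level Haar transform that turns a signal into per-band energies, which could then serve as inputs to a discriminant classifier. The choice of the Haar wavelet, the number of levels, and the sample values are assumptions; the record does not say which wavelet family or features the authors used, and the linear discriminant analysis stage is not shown.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar transform (even-length input)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_decompose(signal, levels):
    """Return the final approximation and the detail bands, finest band first.

    The signal length must be divisible by 2 ** levels.
    """
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details

def band_energies(signal, levels):
    """Energy of the approximation followed by the energy of each detail band."""
    approx, details = haar_decompose(signal, levels)
    return [sum(v * v for v in approx)] + [sum(v * v for v in d) for d in details]

samples = [4, 6, 10, 12, 8, 6, 5, 5]  # hypothetical evoked-potential samples
print(band_energies(samples, levels=2))
```

    Because this transform is orthonormal, the band energies add up to the energy of the original signal, so the features describe how that energy is distributed across scales.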

  3. GARLIC: Genomic Autozygosity Regions Likelihood-based Inference and Classification.

    Science.gov (United States)

    Szpiech, Zachary A; Blant, Alexandra; Pemberton, Trevor J

    2017-07-01

    Runs of homozygosity (ROH) are important genomic features that manifest when identical-by-descent haplotypes are inherited from parents. Their length distributions and genomic locations are informative about population history and they are useful for mapping recessive loci contributing to both Mendelian and complex disease risk. Here, we present software implementing a model-based method (Pemberton et al., 2012) for inferring ROH in genome-wide SNP datasets that incorporates population-specific parameters and a genotyping error rate as well as provides a length-based classification module to identify biologically interesting classes of ROH. Using simulations, we evaluate the performance of this method. GARLIC is written in C++. Source code and pre-compiled binaries (Windows, OSX and Linux) are hosted on GitHub (https://github.com/szpiech/garlic) under the GNU General Public License version 3. Contact: zachary.szpiech@ucsf.edu. Supplementary data are available at Bioinformatics online.

  4. Automated glioblastoma segmentation based on a multiparametric structured unsupervised classification.

    Science.gov (United States)

    Juan-Albarracín, Javier; Fuster-Garcia, Elies; Manjón, José V; Robles, Montserrat; Aparici, F; Martí-Bonmatí, L; García-Gómez, Juan M

    2015-01-01

    Automatic brain tumour segmentation has become a key component for the future of brain tumour treatment. Currently, most brain tumour segmentation approaches arise from the supervised learning standpoint, which requires a labelled training dataset from which to infer the models of the classes. The performance of these models is directly determined by the size and quality of the training corpus, whose retrieval becomes a tedious and time-consuming task. On the other hand, unsupervised approaches avoid these limitations but often do not reach results comparable to those of the supervised methods. In this sense, we propose an automated unsupervised method for brain tumour segmentation based on anatomical Magnetic Resonance (MR) images. Four unsupervised classification algorithms, grouped by their structured or non-structured condition, were evaluated within our pipeline. Among the non-structured algorithms, we evaluated K-means, Fuzzy K-means and Gaussian Mixture Model (GMM), whereas as the structured classification algorithm we evaluated Gaussian Hidden Markov Random Field (GHMRF). An automated postprocess based on a statistical approach supported by tissue probability maps is proposed to automatically identify the tumour classes after the segmentations. We evaluated our brain tumour segmentation method with the public BRAin Tumor Segmentation (BRATS) 2013 Test and Leaderboard datasets. Our approach based on the GMM model improves the results obtained by most of the supervised methods evaluated with the Leaderboard set and reaches the second position in the ranking. Our variant based on the GHMRF achieves the first position in the Test ranking of the unsupervised approaches and the seventh position in the general Test ranking, which confirms the method as a viable alternative for brain tumour segmentation.
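
    Of the unsupervised algorithms listed in this abstract, K-means is the simplest to illustrate. The sketch below clusters scalar intensities only; the starting centroids, the iteration count, and the sample values are assumptions, and the published pipeline works on multiparametric MR features with a dedicated postprocess that is not reproduced here.

```python
def kmeans_1d(values, initial_centroids, n_iter=20):
    """Plain K-means on scalar intensities with caller-supplied starting centroids."""
    centroids = list(initial_centroids)

    def nearest(v):
        # Index of the centroid closest to the value v.
        return min(range(len(centroids)), key=lambda j: abs(v - centroids[j]))

    for _ in range(n_iter):
        clusters = [[] for _ in centroids]
        for v in values:
            clusters[nearest(v)].append(v)
        # An empty cluster keeps its previous centroid.
        centroids = [sum(c) / len(c) if c else centroids[j]
                     for j, c in enumerate(clusters)]
    labels = [nearest(v) for v in values]
    return centroids, labels

# Hypothetical voxel intensities forming three tissue-like groups.
intensities = [10, 12, 11, 50, 52, 51, 90, 91, 89]
centroids, labels = kmeans_1d(intensities, initial_centroids=[0, 50, 100])
print(centroids, labels)
```

    No labelled training data enter the procedure, which is the advantage of the unsupervised route described above; deciding which cluster corresponds to tumour tissue still requires a separate step.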

  5. Automated glioblastoma segmentation based on a multiparametric structured unsupervised classification.

    Directory of Open Access Journals (Sweden)

    Javier Juan-Albarracín

    Full Text Available Automatic brain tumour segmentation has become a key component for the future of brain tumour treatment. Currently, most brain tumour segmentation approaches arise from the supervised learning standpoint, which requires a labelled training dataset from which to infer the models of the classes. The performance of these models is directly determined by the size and quality of the training corpus, whose retrieval becomes a tedious and time-consuming task. On the other hand, unsupervised approaches avoid these limitations but often do not reach results comparable to those of the supervised methods. In this sense, we propose an automated unsupervised method for brain tumour segmentation based on anatomical Magnetic Resonance (MR) images. Four unsupervised classification algorithms, grouped by their structured or non-structured condition, were evaluated within our pipeline. Among the non-structured algorithms, we evaluated K-means, Fuzzy K-means and Gaussian Mixture Model (GMM), whereas as the structured classification algorithm we evaluated Gaussian Hidden Markov Random Field (GHMRF). An automated postprocess based on a statistical approach supported by tissue probability maps is proposed to automatically identify the tumour classes after the segmentations. We evaluated our brain tumour segmentation method with the public BRAin Tumor Segmentation (BRATS) 2013 Test and Leaderboard datasets. Our approach based on the GMM model improves the results obtained by most of the supervised methods evaluated with the Leaderboard set and reaches the second position in the ranking. Our variant based on the GHMRF achieves the first position in the Test ranking of the unsupervised approaches and the seventh position in the general Test ranking, which confirms the method as a viable alternative for brain tumour segmentation.

  6. Simple adaptive sparse representation based classification schemes for EEG based brain-computer interface applications.

    Science.gov (United States)

    Shin, Younghak; Lee, Seungchan; Ahn, Minkyu; Cho, Hohyun; Jun, Sung Chan; Lee, Heung-No

    2015-11-01

    One of the main problems related to electroencephalogram (EEG) based brain-computer interface (BCI) systems is the non-stationarity of the underlying EEG signals. This results in the deterioration of the classification performance during experimental sessions. Therefore, adaptive classification techniques are required for EEG based BCI applications. In this paper, we propose simple adaptive sparse representation based classification (SRC) schemes. Supervised and unsupervised dictionary update techniques for new test data and a dictionary modification method by using the incoherence measure of the training data are investigated. The proposed methods are very simple and additional computation for the re-training of the classifier is not needed. The proposed adaptive SRC schemes are evaluated using two BCI experimental datasets. The proposed methods are assessed by comparing classification results with the conventional SRC and other adaptive classification methods. On the basis of the results, we find that the proposed adaptive schemes show relatively improved classification accuracy as compared to conventional methods without requiring additional computation. Copyright © 2015 Elsevier Ltd. All rights reserved.

  7. Classification of follicular lymphoma images: a holistic approach with symbol-based machine learning methods.

    Science.gov (United States)

    Zorman, Milan; Sánchez de la Rosa, José Luis; Dinevski, Dejan

    2011-12-01

    A symbol-based machine learning approach is rarely used for image classification and recognition. In this paper we present such an approach, which we first applied to follicular lymphoma images. Lymphoma is a broad term encompassing a variety of cancers of the lymphatic system. Lymphoma is differentiated by the type of cell that multiplies and how the cancer presents itself. It is very important to get an exact diagnosis regarding lymphoma and to determine the treatments that will be most effective for the patient's condition. Our work focused on the identification of lymphomas by finding follicles in microscopy images provided by the Laboratory of Pathology in the University Hospital of Tenerife, Spain. We divided our work into two stages: in the first stage we did image pre-processing and feature extraction, and in the second stage we used different symbolic machine learning approaches for pixel classification. Symbolic machine learning approaches are often neglected when looking for image analysis tools. They are known for a very appropriate knowledge representation, but are also claimed to lack computational power. The results we obtained are very promising and show that symbolic approaches can be successful in image analysis applications.

  8. Rough Set Soft Computing Cancer Classification and Network: One Stone, Two Birds

    Directory of Open Access Journals (Sweden)

    Yue Zhang

    2010-07-01

    Full Text Available Gene expression profiling provides tremendous information to help unravel the complexity of cancer. The selection of the most informative genes from huge noise for cancer classification has taken centre stage, along with predicting the function of such identified genes and the construction of direct gene regulatory networks at different system levels with a tuneable parameter. A new study by Wang and Gotoh described a novel Variable Precision Rough Sets-rooted robust soft computing method to successfully address these problems and has yielded some new insights. The significance of this progress and its perspectives will be discussed in this article.

  9. Texture-Based Automated Lithological Classification Using Aeromagnetic Anomaly Images

    Science.gov (United States)

    Shankar, Vivek

    2009-01-01

    This report consists of a thesis submitted to the faculty of the Department of Electrical and Computer Engineering, in partial fulfillment of the requirements for the degree of Master of Science, Graduate College, The University of Arizona, 2004. Aeromagnetic anomaly images are geophysical prospecting tools frequently used in the exploration of metalliferous minerals and hydrocarbons. The amplitude and texture content of these images provide a wealth of information to geophysicists who attempt to delineate the nature of the Earth's upper crust. These images prove to be extremely useful in remote areas and locations where the minerals of interest are concealed by basin fill. Typically, geophysicists compile a suite of aeromagnetic anomaly images, derived from amplitude and texture measurement operations, in order to obtain a qualitative interpretation of the lithological (rock) structure. Texture measures have proven to be especially capable of capturing the magnetic anomaly signature of unique lithological units. We performed a quantitative study to explore the possibility of using texture measures as input to a machine vision system in order to achieve automated classification of lithological units. This work demonstrated a significant improvement in classification accuracy over random guessing based on a priori probabilities. Additionally, a quantitative comparison between the performances of five classes of texture measures in their ability to discriminate lithological units was achieved.

  10. Half-Face Dictionary Integration for Representation-Based Classification.

    Science.gov (United States)

    Song, Xiaoning; Feng, Zhen-Hua; Hu, Guosheng; Wu, Xiao-Jun

    2017-01-01

    This paper presents a half-face dictionary integration (HFDI) algorithm for representation-based classification. The proposed HFDI algorithm measures residuals between an input signal and the reconstructed one, using both the original and the synthesized dual-column (row) half-face training samples. More specifically, we first generate a set of virtual half-face samples for the purpose of training data augmentation. The aim is to obtain high-fidelity collaborative representation of a test sample. In this half-face integrated dictionary, each original training vector is replaced by an integrated dual-column (row) half-face matrix. Second, to reduce the redundancy between the original dictionary and the extended half-face dictionary, we propose an elimination strategy to gain the most robust training atoms. The last contribution of the proposed HFDI method is the use of a competitive fusion method weighting the reconstruction residuals from different dictionaries for robust face classification. Experimental results obtained from the Facial Recognition Technology, Aleix and Robert, Georgia Tech, ORL, and Carnegie Mellon University-pose, illumination and expression data sets demonstrate the effectiveness of the proposed method, especially in the case of the small sample size problem.

  11. DTI based diagnostic prediction of a disease via pattern classification.

    Science.gov (United States)

    Ingalhalikar, Madhura; Kanterakis, Stathis; Gur, Ruben; Roberts, Timothy P L; Verma, Ragini

    2010-01-01

    The paper presents a method of creating abnormality classifiers learned from Diffusion Tensor Imaging (DTI) data of a population of patients and controls. The score produced by the classifier can be used to aid in diagnosis as it quantifies the degree of pathology. Using anatomically meaningful features computed from the DTI data we train a non-linear support vector machine (SVM) pattern classifier. The method begins with high dimensional elastic registration of DT images followed by a feature extraction step that involves creating a feature by concatenating average anisotropy and diffusivity values in anatomically meaningful regions. Feature selection is performed via a mutual information based technique followed by sequential elimination of the features. A non-linear SVM classifier is then constructed by training on the selected features. The classifier assigns each test subject with a probabilistic abnormality score that indicates the extent of pathology. In this study, abnormality classifiers were created for two populations; one consisting of schizophrenia patients (SCZ) and the other with individuals with autism spectrum disorder (ASD). A clear distinction between the SCZ patients and controls was achieved with 90.62% accuracy while for individuals with ASD, 89.58% classification accuracy was obtained. The abnormality scores clearly separate the groups and the high classification accuracy indicates the prospect of using the scores as a diagnostic and prognostic marker.

  12. [Vegetation change in Shenzhen City based on NDVI change classification].

    Science.gov (United States)

    Li, Yi-Jing; Zeng, Hui; Wel, Jian-Bing

    2008-05-01

    Based on the TM images of 1988 and 2003 as well as the land-use change survey data of 2004, the vegetation change in Shenzhen City was assessed by an NDVI (normalized difference vegetation index) change classification method, and the impacts of natural and social constraining factors were analyzed. The results showed that, as a whole, the rapid urbanization in 1988-2003 had less impact on the vegetation cover in the City, but in its low-altitude plain areas the vegetation cover degraded more obviously. The main causes of the localized ecological degradation were the encroachment of built-up areas on woods and orchards, land transformation from woods to orchards at altitudes above 100 m, and the low percentage of green land in some built-up areas. In the future, the protection and construction of vegetation in Shenzhen should focus on strengthening the protection and restoration of remnant woods, avoiding the expansion of built-up areas into well-vegetated woods and orchards, rectifying unreasonable orchard construction at altitudes above 100 m, and consolidating greenbelt construction inside the built-up areas. It was considered that the NDVI change classification method could work well in efficiently uncovering the trend of macroscale vegetation change while avoiding the effect of random noise in the data.
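
    The index behind this study is the standard NDVI, (NIR - Red) / (NIR + Red). A minimal sketch of a per-pixel change classification follows; the three class names, the 0.1 tolerance, and the reflectance values are assumptions for illustration, since the abstract does not list the actual change classes or thresholds.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from near-infrared and red reflectance."""
    total = nir + red
    return (nir - red) / total if total else 0.0

def classify_change(ndvi_before, ndvi_after, tolerance=0.1):
    """Label a pixel by the direction of its NDVI change (hypothetical classes)."""
    diff = ndvi_after - ndvi_before
    if diff > tolerance:
        return "improved"
    if diff < -tolerance:
        return "degraded"
    return "stable"

# Hypothetical reflectances for one pixel in the earlier and later image.
before = ndvi(nir=0.5, red=0.1)  # dense vegetation
after = ndvi(nir=0.3, red=0.3)   # bare or built-up surface
print(round(before, 3), round(after, 3), classify_change(before, after))
```

    Treating small differences as "stable" is one simple way to keep random noise from being read as change, in the spirit of the closing remark of the abstract.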

  13. Neighborhood Hypergraph Based Classification Algorithm for Incomplete Information System

    Directory of Open Access Journals (Sweden)

    Feng Hu

    2015-01-01

    Full Text Available The problem of classification in incomplete information systems is a hot issue in intelligent information processing. The hypergraph is a new intelligent method for machine learning. However, it is hard to process an incomplete information system with the traditional hypergraph, for two reasons: (1) the hyperedges are generated randomly in the traditional hypergraph model; (2) the existing methods are unsuitable for incomplete information systems because of their missing values. In this paper, we propose a novel classification algorithm for incomplete information systems based on the hypergraph model and rough set theory. First, we initialize the hypergraph. Second, we classify the training set by the neighborhood hypergraph. Third, under the guidance of rough sets, we replace the poor hyperedges. After that, we obtain a good classifier. The proposed approach is tested on 15 data sets from the UCI machine learning repository. Furthermore, it is compared with some existing methods, such as C4.5, SVM, NaiveBayes, and KNN. The experimental results show that the proposed algorithm performs better in terms of Precision, Recall, AUC, and F-measure.

  14. Likelihood-based image segmentation and classification: a framework for the integration of expert knowledge in image classification procedures

    NARCIS (Netherlands)

    Abkar, Ali-Akbar; Mulder, Nanno; Sharifi, Mohammed Ali

    2000-01-01

    This paper describes a likelihood-based segmentation and classification method for remotely sensed images. It is based on optimization of a utility function that can be described as a cost-weighted likelihood for a collection of objects and their parameters. As the likelihood or posterior

  15. Gene Expression Based Leukemia Sub-Classification Using Committee Neural Networks

    OpenAIRE

    Sewak, Mihir S.; Reddy, Narender P.; Duan, Zhong-Hui

    2009-01-01

    Analysis of gene expression data provides an objective and efficient technique for sub‑classification of leukemia. The purpose of the present study was to design committee neural network based classification systems to subcategorize leukemia gene expression data. In the study, a binary classification system was considered to differentiate acute lymphoblastic leukemia from acute myeloid leukemia. A ternary classification system which classifies leukemia expression data into three subclasses...

  16. Quantitative MRI radiomics in the prediction of molecular classifications of breast cancer subtypes in the TCGA/TCIA data set

    Science.gov (United States)

    Li, Hui; Zhu, Yitan; Burnside, Elizabeth S; Huang, Erich; Drukker, Karen; Hoadley, Katherine A; Fan, Cheng; Conzen, Suzanne D; Zuley, Margarita; Net, Jose M; Sutton, Elizabeth; Whitman, Gary J; Morris, Elizabeth; Perou, Charles M; Ji, Yuan; Giger, Maryellen L

    2016-01-01

    Using quantitative radiomics, we demonstrate that computer-extracted magnetic resonance (MR) image-based tumor phenotypes can be predictive of the molecular classification of invasive breast cancers. Radiomics analysis was performed on 91 MRIs of biopsy-proven invasive breast cancers from National Cancer Institute’s multi-institutional TCGA/TCIA. Immunohistochemistry molecular classification was performed including estrogen receptor, progesterone receptor, human epidermal growth factor receptor 2, and for 84 cases, the molecular subtype (normal-like, luminal A, luminal B, HER2-enriched, and basal-like). Computerized quantitative image analysis included: three-dimensional lesion segmentation, phenotype extraction, and leave-one-case-out cross validation involving stepwise feature selection and linear discriminant analysis. The performance of the classifier model for molecular subtyping was evaluated using receiver operating characteristic analysis. The computer-extracted tumor phenotypes were able to distinguish between molecular prognostic indicators; area under the ROC curve values of 0.89, 0.69, 0.65, and 0.67 in the tasks of distinguishing between ER+ versus ER−, PR+ versus PR−, HER2+ versus HER2−, and triple-negative versus others, respectively. Statistically significant associations between tumor phenotypes and receptor status were observed. More aggressive cancers are likely to be larger in size with more heterogeneity in their contrast enhancement. Even after controlling for tumor size, a statistically significant trend was observed within each size group (P=0.04 for lesions ⩽2 cm; P=0.02 for lesions >2 to ⩽5 cm) as with the entire data set (P-value=0.006) for the relationship between enhancement texture (entropy) and molecular subtypes (normal-like, luminal A, luminal B, HER2-enriched, basal-like). In conclusion, computer-extracted image phenotypes show promise for high-throughput discrimination of breast cancer subtypes and may yield a
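    The area under the ROC curve reported above can be read as the probability that a randomly chosen positive case receives a higher classifier score than a randomly chosen negative case. The following is a minimal sketch of that pairwise computation, not the authors' pipeline; the function name `roc_auc` and the example scores are hypothetical.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    if not scores_pos or not scores_neg:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical classifier outputs for ER+ and ER- cases (illustrative only).
er_positive = [0.9, 0.8, 0.7, 0.4]
er_negative = [0.6, 0.3, 0.2]
print(roc_auc(er_positive, er_negative))
```

    In this toy example 11 of the 12 positive/negative pairs are ordered correctly, so the AUC is 11/12.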

  17. Quantitative MRI radiomics in the prediction of molecular classifications of breast cancer subtypes in the TCGA/TCIA data set.

    Science.gov (United States)

    Li, Hui; Zhu, Yitan; Burnside, Elizabeth S; Huang, Erich; Drukker, Karen; Hoadley, Katherine A; Fan, Cheng; Conzen, Suzanne D; Zuley, Margarita; Net, Jose M; Sutton, Elizabeth; Whitman, Gary J; Morris, Elizabeth; Perou, Charles M; Ji, Yuan; Giger, Maryellen L

    2016-01-01

    Using quantitative radiomics, we demonstrate that computer-extracted magnetic resonance (MR) image-based tumor phenotypes can be predictive of the molecular classification of invasive breast cancers. Radiomics analysis was performed on 91 MRIs of biopsy-proven invasive breast cancers from National Cancer Institute's multi-institutional TCGA/TCIA. Immunohistochemistry molecular classification was performed including estrogen receptor, progesterone receptor, human epidermal growth factor receptor 2, and for 84 cases, the molecular subtype (normal-like, luminal A, luminal B, HER2-enriched, and basal-like). Computerized quantitative image analysis included: three-dimensional lesion segmentation, phenotype extraction, and leave-one-case-out cross validation involving stepwise feature selection and linear discriminant analysis. The performance of the classifier model for molecular subtyping was evaluated using receiver operating characteristic analysis. The computer-extracted tumor phenotypes were able to distinguish between molecular prognostic indicators; area under the ROC curve values of 0.89, 0.69, 0.65, and 0.67 in the tasks of distinguishing between ER+ versus ER-, PR+ versus PR-, HER2+ versus HER2-, and triple-negative versus others, respectively. Statistically significant associations between tumor phenotypes and receptor status were observed. More aggressive cancers are likely to be larger in size with more heterogeneity in their contrast enhancement. Even after controlling for tumor size, a statistically significant trend was observed within each size group (P = 0.04 for lesions ≤ 2 cm; P = 0.02 for lesions >2 to ≤5 cm) as with the entire data set (P-value = 0.006) for the relationship between enhancement texture (entropy) and molecular subtypes (normal-like, luminal A, luminal B, HER2-enriched, basal-like). In conclusion, computer-extracted image phenotypes show promise for high-throughput discrimination of breast cancer subtypes and may yield a

  18. Classification and Clinical Management of Variants of Uncertain Significance in High Penetrance Cancer Predisposition Genes.

    Science.gov (United States)

    Moghadasi, Setareh; Eccles, Diana M; Devilee, Peter; Vreeswijk, Maaike P G; van Asperen, Christi J

    2016-04-01

    In 2008, the International Agency for Research on Cancer (IARC) proposed a system for classifying sequence variants in highly penetrant breast and colon cancer susceptibility genes, linked to clinical actions. This system uses a multifactorial likelihood model to calculate the posterior probability that an altered DNA sequence is pathogenic. Variants with a posterior probability between 5% and 94.9% (class 3) are categorized as variants of uncertain significance (VUS). This interval is wide and might include variants with a substantial difference in pathogenicity at either end of the spectrum. We think that carriers of class 3 variants would benefit from a fine-tuning of this classification. Classification of VUS to a category with a defined clinical significance is very important because for carriers of a pathogenic mutation full surveillance and risk-reducing surgery can reduce cancer incidence. Counselees who are not carriers of a pathogenic mutation can be discharged from intensive follow-up and avoid unnecessary risk-reducing surgery. By means of examples, we show how, in selected cases, additional data can lead to reclassification of some variants to a different class with different recommendations for surveillance and therapy. To improve the clinical utility of this classification system, we suggest a pragmatic adaptation to clinical practice. © 2016 WILEY PERIODICALS, INC.
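    The mapping from posterior probability to variant class can be sketched as a simple threshold function. Only the class 3 interval (5% to 94.9%) is stated in the abstract; the cut-offs used below for classes 1, 2, 4 and 5 are assumptions based on the commonly cited IARC five-class scheme and should be checked against the original publication. The function name `iarc_class` is hypothetical.

```python
def iarc_class(posterior):
    """Map a posterior probability of pathogenicity to a class from 1 to 5.

    Class 3 (0.05 to 0.949) is taken from the abstract. The other
    thresholds are ASSUMED from the commonly cited IARC scheme.
    """
    if not 0.0 <= posterior <= 1.0:
        raise ValueError("posterior must be a probability in [0, 1]")
    if posterior > 0.99:
        return 5  # assumed: pathogenic
    if posterior >= 0.95:
        return 4  # assumed: likely pathogenic
    if posterior >= 0.05:
        return 3  # from the abstract: variant of uncertain significance
    if posterior >= 0.001:
        return 2  # assumed: likely not pathogenic
    return 1      # assumed: not pathogenic


for p in (0.0005, 0.02, 0.06, 0.90, 0.97, 0.995):
    print(p, iarc_class(p))
```

    The example makes the abstract's concern concrete: posteriors of 0.06 and 0.90 both fall in class 3 even though they sit at opposite ends of the interval.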

  19. Setting a generalized functional linear model (GFLM) for the classification of different types of cancer

    Directory of Open Access Journals (Sweden)

    Miguel Flores

    2016-11-01

    Full Text Available This work aims to classify DNA sequences as healthy or malignant (cancerous). To this end, supervised and unsupervised classification methods are used in a functional context, i.e., each strand of DNA is an observation. Because the observations are discretized, different ways of representing them with functions are evaluated. In addition, an exploratory study is carried out, estimating the functional mean and variance for each type of cancer. For unsupervised classification, hierarchical clustering with different measures of functional distance is used. For supervised classification, a functional generalized linear model is used, with the first and second derivatives included as discriminating variables. One advantage of working in the functional context is that the resulting model classified the cancers with 100% accuracy. The methods were implemented with the fda.usc R package, which includes all the functional data analysis techniques used in this work, as well as others developed in recent decades. For more details on these techniques, see Ramsay and Silverman (2005) and Ferraty et al. (2006).

  20. Apparent diffusion coefficient value of gastric cancer by diffusion-weighted imaging: Correlations with the histological differentiation and Lauren classification

    Energy Technology Data Exchange (ETDEWEB)

    Liu, Song, E-mail: songliu532909756@gmail.com [Department of Radiology, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008 (China); Guan, Wenxian, E-mail: wenxianguan123@126.com [Department of Gastrointestinal Surgery, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008 (China); Wang, Hao, E-mail: wanghao20140525@126.com [Department of Gastrointestinal Surgery, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008 (China); Pan, Liang, E-mail: panliang2014@126.com [Department of Radiology, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008 (China); Zhou, Zhuping, E-mail: zhupingzhou@126.com [Department of Radiology, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008 (China); Yu, Haiping, E-mail: haipingyu2012@126.com [Department of Radiology, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008 (China); Liu, Tian, E-mail: tianliu2014@126.com [Radiation Oncology and Winship Cancer Institute, Emory University, Atlanta, GA 30322 (United States); Yang, Xiaofeng, E-mail: xiaofengyang2014@126.com [Radiation Oncology and Winship Cancer Institute, Emory University, Atlanta, GA 30322 (United States); He, Jian, E-mail: hjxueren@126.com [Department of Radiology, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008 (China); Zhou, Zhengyang, E-mail: zyzhou@nju.edu.cn [Department of Radiology, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008 (China)

    2014-12-15

    Highlights: • Gastric cancers’ ADC values were significantly lower than normal gastric wall. • Gastric adenocarcinomas with different differentiation had different ADC values. • Gastric adenocarcinomas’ ADC values correlated with histologic differentiations. • Gastric cancers’ ADC values correlated with Lauren classifications. • Mean ADC value was better than min ADC value in characterizing gastric cancers. - Abstract: Objective: The purpose of this study was to evaluate the correlations between histological differentiation and Lauren classification of gastric cancer and the apparent diffusion coefficient (ADC) value of diffusion weighted imaging (DWI). Materials and methods: Sixty-nine patients with gastric cancer lesions underwent preoperative magnetic resonance imaging (MRI) (3.0T) and surgical resection. DWI was obtained with a single-shot, echo-planar imaging sequence in the axial plane (b values: 0 and 1000 s/mm²). Mean and minimum ADC values were obtained for each gastric cancer and normal gastric walls by two radiologists, who were blinded to the histological findings. Histological type, degree of differentiation and Lauren classification of each resected specimen were determined by one pathologist. Mean and minimum ADC values of gastric cancers with different histological types, degrees of differentiation and Lauren classifications were compared. Correlations between ADC values and histological differentiation and Lauren classification were analyzed. Results: The mean and minimum ADC values of gastric cancers, as a whole and separately, were significantly lower than those of normal gastric walls (all p values <0.001). There were significant differences in the mean and minimum ADC values among gastric cancers with different histological types, degrees of differentiation and Lauren classifications (p < 0.05). Mean and minimum ADC values correlated significantly (all p < 0.001) with histological differentiation (r = 0.564, 0.578) and
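    With two b values, an ADC is conventionally obtained from the mono-exponential diffusion model S(b) = S(0)·exp(−b·ADC). The sketch below applies that standard two-point formula to the b values quoted in the abstract (0 and 1000 s/mm²). The abstract does not describe how the scanner software computed ADC, so this is an assumption, and the function name `adc_two_point` and the signal values are hypothetical.

```python
import math


def adc_two_point(signal_low_b, signal_high_b, b_low=0.0, b_high=1000.0):
    """Two-point ADC in mm^2/s, assuming mono-exponential signal decay.

    ADC = ln(S_low / S_high) / (b_high - b_low), with b values in s/mm^2.
    """
    if signal_low_b <= 0 or signal_high_b <= 0:
        raise ValueError("signal intensities must be positive")
    if b_high <= b_low:
        raise ValueError("b_high must be greater than b_low")
    return math.log(signal_low_b / signal_high_b) / (b_high - b_low)


# Hypothetical voxel signals at b = 0 and b = 1000 s/mm^2.
print(adc_two_point(1200.0, 350.0))
```

    A lower ADC corresponds to less signal loss between the two acquisitions, i.e. more restricted diffusion, which is the direction of the difference the study reports for cancer versus normal gastric wall.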

  1. Radiological classification of renal angiomyolipomas based on 127 tumors

    Directory of Open Access Journals (Sweden)

    Prando Adilson

    2003-01-01

    Full Text Available PURPOSE: Demonstrate radiological findings of 127 angiomyolipomas (AMLs) and propose a classification based on the radiological evidence of fat. MATERIALS AND METHODS: The imaging findings of 85 consecutive patients with AMLs, isolated (n = 73), multiple without tuberous sclerosis (TS) (n = 4) and multiple with TS (n = 8), were retrospectively reviewed. Eighteen AMLs (14%) presented with hemorrhage. All patients underwent dedicated helical CT or magnetic resonance studies. All hemorrhagic and non-hemorrhagic lesions were grouped together since our objective was to analyze the presence of detectable fat. Out of 85 patients, 53 were monitored and 32 were treated surgically due to large perirenal component (n = 13), hemorrhage (n = 11) and impossibility of an adequate preoperative characterization (n = 8). There was not a case of renal cell carcinoma (RCC) with fat component in this group of patients. RESULTS: Based on the presence and amount of detectable fat within the lesion, AMLs were classified in 4 distinct radiological patterns: Pattern-I, predominantly fatty (usually less than 2 cm in diameter and intrarenal): 54%; Pattern-II, partially fatty (intrarenal or exophytic): 29%; Pattern-III, minimally fatty (most exophytic and perirenal): 11%; and Pattern-IV, without fat (most exophytic and perirenal): 6%. CONCLUSIONS: This proposed classification might be useful to understand the imaging manifestations of AMLs, their differential diagnosis and determine when further radiological evaluation would be necessary. Small (< 1.5 cm), pattern-I AMLs tend to be intra-renal, homogeneous and predominantly fatty. As they grow they tend to be partially or completely exophytic and heterogeneous (patterns II and III). The rare pattern-IV AMLs, however, can be small or large, intra-renal or exophytic but are always homogeneous and hyperdense masses. Since no renal cell carcinoma was found in our series, from an evidence-based practice, all renal mass with detectable

  2. Automated detection and classification for craters based on geometric matching

    Science.gov (United States)

    Chen, Jian-qing; Cui, Ping-yuan; Cui, Hui-tao

    2011-08-01

    Crater detection and classification are critical elements of planetary mission preparation and landing site selection. This paper presents a methodology for the automated detection and matching of craters in images of planetary surfaces such as the Moon, Mars and asteroids. Because craters are usually bowl-shaped depressions, they can be represented as circles or circular arcs during the landing phase. Based on the hypothesis that detected crater edges are related to craters in a template by translation, rotation and scaling, the proposed matching method fits circles to crater edges and aligns circular-arc edges from the image of the target body with circular features contained in a model. The approach includes edge detection, edge grouping, reference point detection and geometric circle model matching. Finally, we simulate a planetary surface to test the reasonableness and effectiveness of the proposed method.
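    The step of fitting a circle to detected crater-edge points can be illustrated with an algebraic least-squares (Kåsa) fit, which also works when only an arc of the rim is visible. The abstract does not say which fitting technique the authors used, so this is a swapped-in, generic method; `fit_circle` and the sample points are hypothetical.

```python
import math


def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_circle(points):
    """Least-squares fit of x^2 + y^2 + D*x + E*y + F = 0 to edge points.

    Returns (center_x, center_y, radius).
    """
    if len(points) < 3:
        raise ValueError("need at least three edge points")
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    syy = sum(y * y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sxz = sum(x * (x * x + y * y) for x, y in points)
    syz = sum(y * (x * x + y * y) for x, y in points)
    sz = sum(x * x + y * y for x, y in points)
    a = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, float(len(points))]]
    b = [-sxz, -syz, -sz]
    det = _det3(a)
    if abs(det) < 1e-12:
        raise ValueError("points are (nearly) collinear")
    solution = []
    for col in range(3):  # Cramer's rule, one unknown per column
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]
        solution.append(_det3(m) / det)
    d, e, f = solution
    cx, cy = -d / 2.0, -e / 2.0
    return cx, cy, math.sqrt(cx * cx + cy * cy - f)


# Hypothetical edge points on a quarter arc of a rim centred at (3, -2), radius 5.
arc = [(3 + 5 * math.cos(0.1 * k), -2 + 5 * math.sin(0.1 * k)) for k in range(16)]
print(fit_circle(arc))
```

    Once each edge group has a fitted center and radius, those circle parameters are the kind of features that could be compared against a crater template under translation, rotation and scaling.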

  3. Tumor classification based on orthogonal linear discriminant analysis.

    Science.gov (United States)

    Wang, Huiya; Zhang, Shanwen

    2014-01-01

    Gene expression profiles have great potential for accurate tumor diagnosis. They are expected to enable precise and systematic tumor diagnosis, but they also pose two challenges for machine learning researchers: the curse of dimensionality and the small sample size problem. We propose a manifold learning based dimensionality reduction algorithm named orthogonal local discriminant embedding (O-LDE) and apply it to tumor classification. Compared with the classical local discriminant embedding (LDE), O-LDE aims to obtain an orthogonal linear projection matrix by solving an optimization problem. After being projected into a low-dimensional subspace by O-LDE, the data points of the same class maintain their intrinsic neighbor relations, whereas neighboring points of different classes are far from each other. Experimental results on a public tumor dataset validate the effectiveness and feasibility of the proposed algorithm.

  4. Linear regression-based feature selection for microarray data classification.

    Science.gov (United States)

    Abid Hasan, Md; Hasan, Md Kamrul; Abdul Mottalib, M

    2015-01-01

    Predicting the class of gene expression profiles helps improve the diagnosis and treatment of diseases. Analysing huge gene expression data, otherwise known as microarray data, is complicated by its high dimensionality. Traditional classifiers do not perform well when the number of features far exceeds the number of samples. A good set of features helps classifiers to classify the dataset efficiently. Moreover, a manageable set of features is also desirable for the biologist for further analysis. In this paper, we propose a linear regression-based feature selection method for selecting discriminative features. Our main focus is to classify the dataset more accurately using fewer features than other traditional feature selection methods. Our method has been compared with several other methods, and in almost every case the classification accuracy is higher, with fewer features, than that of the other popular feature selection methods.

  5. Machine Learning Based Localization and Classification with Atomic Magnetometers

    Science.gov (United States)

    Deans, Cameron; Griffin, Lewis D.; Marmugi, Luca; Renzoni, Ferruccio

    2018-01-01

    We demonstrate identification of position, material, orientation, and shape of objects imaged by a Rb 85 atomic magnetometer performing electromagnetic induction imaging supported by machine learning. Machine learning maximizes the information extracted from the images created by the magnetometer, demonstrating the use of hidden data. Localization 2.6 times better than the spatial resolution of the imaging system and successful classification up to 97% are obtained. This circumvents the need of solving the inverse problem and demonstrates the extension of machine learning to diffusive systems, such as low-frequency electrodynamics in media. Automated collection of task-relevant information from quantum-based electromagnetic imaging will have a relevant impact from biomedicine to security.

  6. Fines Classification Based on Sensitivity to Pore-Fluid Chemistry

    KAUST Repository

    Jang, Junbong

    2015-12-28

    The 75-μm particle size is used to discriminate between fine and coarse grains. Further analysis of fine grains is typically based on the plasticity chart. Whereas pore-fluid-chemistry-dependent soil response is a salient and distinguishing characteristic of fine grains, pore-fluid chemistry is not addressed in current classification systems. Liquid limits obtained with electrically contrasting pore fluids (deionized water, 2-M NaCl brine, and kerosene) are combined to define the soil "electrical sensitivity." Liquid limit and electrical sensitivity can be effectively used to classify fine grains according to their fluid-soil response into no-, low-, intermediate-, or high-plasticity fine grains of low, intermediate, or high electrical sensitivity. The proposed methodology benefits from the accumulated experience with liquid limit in the field and addresses the needs of a broader range of geotechnical engineering problems. © ASCE.

  7. Fines classification based on sensitivity to pore-fluid chemistry

    Science.gov (United States)

    Jang, Junbong; Santamarina, J. Carlos

    2016-01-01

    The 75-μm particle size is used to discriminate between fine and coarse grains. Further analysis of fine grains is typically based on the plasticity chart. Whereas pore-fluid-chemistry-dependent soil response is a salient and distinguishing characteristic of fine grains, pore-fluid chemistry is not addressed in current classification systems. Liquid limits obtained with electrically contrasting pore fluids (deionized water, 2-M NaCl brine, and kerosene) are combined to define the soil “electrical sensitivity.” Liquid limit and electrical sensitivity can be effectively used to classify fine grains according to their fluid-soil response into no-, low-, intermediate-, or high-plasticity fine grains of low, intermediate, or high electrical sensitivity. The proposed methodology benefits from the accumulated experience with liquid limit in the field and addresses the needs of a broader range of geotechnical engineering problems.

  8. Deep neural network and noise classification-based speech enhancement

    Science.gov (United States)

    Shi, Wenhua; Zhang, Xiongwei; Zou, Xia; Han, Wei

    2017-07-01

    In this paper, a speech enhancement method using noise classification and Deep Neural Network (DNN) was proposed. Gaussian mixture model (GMM) was employed to determine the noise type in speech-absent frames. DNN was used to model the relationship between noisy observation and clean speech. Once the noise type was determined, the corresponding DNN model was applied to enhance the noisy speech. GMM was trained with mel-frequency cepstrum coefficients (MFCC) and the parameters were estimated with an iterative expectation-maximization (EM) algorithm. Noise type was updated by spectrum entropy-based voice activity detection (VAD). Experimental results demonstrate that the proposed method could achieve better objective speech quality and smaller distortion under stationary and non-stationary conditions.

  9. Classification of hyperspectral images based on conditional random fields

    Science.gov (United States)

    Hu, Yang; Saber, Eli; Monteiro, Sildomar T.; Cahill, Nathan D.; Messinger, David W.

    2015-02-01

    A significant increase in the availability of high resolution hyperspectral images has led to the need for developing pertinent techniques in image analysis, such as classification. Hyperspectral images that are correlated spatially and spectrally provide ample information across the bands to benefit this purpose. Conditional Random Fields (CRFs) are discriminative models that carry several advantages over conventional techniques: no requirement of the independence assumption for observations, flexibility in defining local and pairwise potentials, and an independence between the modules of feature selection and parameter learning. In this paper we present a framework for classifying remotely sensed imagery based on CRFs. We apply a Support Vector Machine (SVM) classifier to raw remotely sensed imagery data in order to generate more meaningful feature potentials for the CRF model. This approach produces promising results when tested with publicly available AVIRIS Indian Pine imagery.

  10. Classification of cassava genotypes based on qualitative and quantitative data.

    Science.gov (United States)

    Oliveira, E J; Oliveira Filho, O S; Santos, V S

    2015-02-02

    We evaluated the genetic variation of cassava accessions based on qualitative (binomial and multicategorical) and quantitative traits (continuous). We characterized 95 accessions obtained from the Cassava Germplasm Bank of Embrapa Mandioca e Fruticultura; we evaluated these accessions for 13 continuous, 10 binary, and 25 multicategorical traits. First, we analyzed the accessions based only on quantitative traits; next, we conducted joint analysis (qualitative and quantitative traits) based on the Ward-MLM method, which performs clustering in two stages. According to the pseudo-F, pseudo-t2, and maximum likelihood criteria, we identified five and four groups based on quantitative trait and joint analysis, respectively. The smaller number of groups identified based on joint analysis may be related to the nature of the data. On the other hand, quantitative data are more subject to environmental effects in the phenotype expression; this results in the absence of genetic differences, thereby contributing to greater differentiation among accessions. For most of the accessions, the maximum probability of classification was >0.90, independent of the trait analyzed, indicating a good fit of the clustering method. Differences in clustering according to the type of data implied that analysis of quantitative and qualitative traits in cassava germplasm might explore different genomic regions. On the other hand, when joint analysis was used, the means and ranges of genetic distances were high, indicating that the Ward-MLM method is very useful for clustering genotypes when there are several phenotypic traits, such as in the case of genetic resources and breeding programs.

  11. ICA-Based Imagined Conceptual Words Classification on EEG Signals.

    Science.gov (United States)

    Imani, Ehsan; Pourmohammad, Ali; Bagheri, Mahsa; Mobasheri, Vida

    2017-01-01

    function, the classification accuracies were almost the same. Linear discriminant analysis (LDA) yielded higher classification accuracies than the neural network. ICA is a suitable algorithm for recognizing the concept of a word and its place in the brain. The results of this experiment were consistent with those from other methods, such as functional magnetic resonance imaging and methods based on brain signals (EEG), in vowel imagination and covert speech. Herein, the highest classification accuracy was obtained by extracting the target signal from the output of the ICA and using the coefficients of an AR model as features with a time interval of 2.5 s. Finally, LDA resulted in the highest classification accuracy, more than 60%.

  12. Semi-Automated Classification of Landform Elements in Armenia Based on SRTM DEM using K-Means Unsupervised Classification

    Directory of Open Access Journals (Sweden)

    Piloyan Artak

    2017-03-01

    Full Text Available Land elements have been used as basic landform descriptors in many science disciplines, including soil mapping, vegetation mapping, and landscape ecology. This paper presents a semi-automatic method based on k-means unsupervised classification to analyze geomorphometric features as landform elements in Armenia. First, several data layers were derived from the DEM: elevation, slope, profile curvature, plan curvature and flow path length. Then, the k-means algorithm was used to classify landform elements based on these morphometric parameters. The classification has seven landform classes. Overall, landform classification is performed in the form of a three-level hierarchical scheme. The resulting map reflects the general topography and landform character of Armenia.
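    The k-means step described above alternates between assigning each cell's feature vector to the nearest cluster center and recomputing each center as the mean of its members. The following is a minimal pure-Python sketch of that loop under assumed inputs; it is not the authors' GIS workflow, and the function name `kmeans`, the two-feature vectors and the choice of k = 2 are illustrative only (the study used five DEM-derived layers and seven classes).

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Plain k-means (Lloyd's algorithm) on equal-length feature tuples.

    Returns (labels, centers). Initial centers are k randomly sampled points.
    """
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: nearest center by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centers


# Hypothetical (rescaled elevation, rescaled slope) vectors for six DEM cells.
cells = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = kmeans(cells, k=2)
print(labels, centers)
```

    In practice the input layers would usually be rescaled to comparable ranges first, since k-means distances are otherwise dominated by the layer with the largest numeric spread.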

  13. New classification system-based visual outcome in Eales' disease

    Directory of Open Access Journals (Sweden)

    Saxena Sandeep

    2007-01-01

    Full Text Available Purpose: A retrospective tertiary care center-based study was undertaken to evaluate, for the first time, the visual outcome in Eales' disease based on a new classification system. Materials and Methods: One hundred and fifty-nine consecutive cases of Eales' disease were included. All the eyes were staged according to the new classification: Stage 1: periphlebitis of small (1a) and large (1b) caliber vessels with superficial retinal hemorrhages; Stage 2a: capillary non-perfusion, 2b: neovascularization elsewhere/of the disc; Stage 3a: fibrovascular proliferation, 3b: vitreous hemorrhage; Stage 4a: traction/combined rhegmatogenous retinal detachment and 4b: rubeosis iridis, neovascular glaucoma, complicated cataract and optic atrophy. Visual acuity was graded as: Grade I 20/20 or better; Grade II 20/30 to 20/40; Grade III 20/60 to 20/120 and Grade IV 20/200 or worse. All the cases were managed by medical therapy, photocoagulation and/or vitreoretinal surgery. Visual acuity was converted into decimal scale, denoting 20/20=1 and 20/800=0.01. Paired t-test / Wilcoxon signed-rank tests were used for statistical analysis. Results: Vitreous hemorrhage was the commonest presenting feature (49.32%). Cases with Stages 1 to 3 and 4a and 4b achieved final visual acuity ranging from 20/15 to 20/40; 20/80 to 20/400 and 20/200 to 20/400, respectively. Statistically significant improvement in visual acuities was observed in all the stages of the disease except Stages 1a and 4b. Conclusion: Significant improvement in visual acuities was observed in the majority of stages of Eales' disease following treatment. This study adds to the limited evidence on treatment effects in the literature and may affect patient care and health policy in Eales' disease.
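    The visual acuity grading quoted in the abstract can be written as a small lookup on the Snellen denominator (the x in 20/x). This is a sketch of the stated ranges only; the abstract does not say how values between the listed ranges (for example 20/50) were graded, so the hypothetical function `acuity_grade` returns None for them rather than guessing.

```python
def acuity_grade(snellen_denominator):
    """Grade a 20/x Snellen acuity using the ranges listed in the abstract.

    Returns "I", "II", "III" or "IV", or None when x falls between the
    listed ranges (the abstract does not define those cases).
    """
    x = snellen_denominator
    if x <= 0:
        raise ValueError("Snellen denominator must be positive")
    if x <= 20:
        return "I"    # 20/20 or better
    if 30 <= x <= 40:
        return "II"   # 20/30 to 20/40
    if 60 <= x <= 120:
        return "III"  # 20/60 to 20/120
    if x >= 200:
        return "IV"   # 20/200 or worse
    return None       # not covered by the ranges in the abstract


for x in (15, 40, 80, 400, 50):
    print(f"20/{x}", acuity_grade(x))
```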

  14. Molecular classification of breast cancer: what the pathologist needs to know.

    Science.gov (United States)

    Rakha, Emad A; Green, Andrew R

    2017-02-01

    Breast cancer is a heterogeneous disease featuring distinct histological, molecular and clinical phenotypes. Although traditional classification systems utilising clinicopathological and few molecular markers are well established and validated, they remain insufficient to reflect the diverse biological and clinical heterogeneity of breast cancer. Advancements in high-throughput molecular techniques and bioinformatics have contributed to the improved understanding of breast cancer biology, refinement of molecular taxonomies and the development of novel prognostic and predictive molecular assays. Application of such technologies is already underway, and is expected to change the way we manage breast cancer. Despite the enormous amount of work that has been carried out to develop and refine breast cancer molecular prognostic and predictive assays, molecular testing is still in evolution. Pathologists should be aware of the new technology and be ready for the challenge. In this review, we provide an update on the application of molecular techniques with regard to breast cancer diagnosis, prognosis and outcome prediction. The current contribution of emerging technology to our understanding of breast cancer is also highlighted. Copyright © 2016 Royal College of Pathologists of Australasia. Published by Elsevier B.V. All rights reserved.

  15. Quality-Oriented Classification of Aircraft Material Based on SVM

    Directory of Open Access Journals (Sweden)

    Hongxia Cai

    2014-01-01

    Full Text Available Existing material classification schemes are intended to improve inventory management. However, different materials have different quality-related attributes, especially in the aircraft industry. In order to reduce cost without sacrificing quality, we propose a quality-oriented material classification system that considers the material quality character, quality cost, and quality influence. The Analytic Hierarchy Process helps to make the feature selection and classification decisions. We use an improved Kraljic Portfolio Matrix to establish a three-dimensional classification model. The aircraft materials can be divided into eight types, including general type, key type, risk type, and leveraged type. To improve the classification accuracy for the various materials, the Support Vector Machine algorithm is introduced. Finally, we compare the SVM and the BP neural network in the application. The results show that the SVM algorithm is more efficient and accurate and that the quality-oriented material classification is valuable.

  16. SA-SVM based automated diagnostic system for skin cancer

    Science.gov (United States)

    Masood, Ammara; Al-Jumaily, Adel

    2015-03-01

    Early diagnosis of skin cancer is one of the greatest challenges due to lack of experience of general practitioners (GPs). This paper presents a clinical decision support system aimed at saving time and resources in the diagnostic process. Segmentation, feature extraction, pattern recognition, and lesion classification are the important steps in the proposed decision support system. The system analyses the images to extract the affected area using a novel proposed segmentation method, H-FCM-LS. The underlying features which indicate the difference between melanoma and benign lesions are obtained through intensity, spatial/frequency and texture based methods. For classification, a self-advising SVM is adapted, which showed an improved classification rate compared to the standard SVM. The work also analyzes the performance of linear and kernel based SVMs on this skin lesion diagnostic problem and discusses the corresponding findings. The best diagnostic rates obtained through the proposed method are around 90.5%.

  17. Tweet-based Target Market Classification Using Ensemble Method

    OpenAIRE

    Muhammad Adi Khairul Anshary; Bambang Riyanto Trilaksono

    2016-01-01

    Target market classification is aimed at focusing marketing activities on the right targets. Classification of target markets can be done through data mining and by utilizing data from social media, e.g. Twitter. The end results of data mining are learning models that can classify new data. Ensemble methods can improve the accuracy of the models and therefore provide better results. In this study, classification of target markets was conducted on a dataset of 3000 tweets in order to extract fe...

  18. Docking-based classification models for exploratory toxicology ...

    Science.gov (United States)

    Background: Exploratory toxicology is a new emerging research area whose ultimate mission is that of protecting human health and environment from risks posed by chemicals. In this regard, the ethical and practical limitation of animal testing has encouraged the promotion of computational methods for the fast screening of huge collections of chemicals available on the market. Results: We derived 24 reliable docking-based classification models able to predict the estrogenic potential of a large collection of chemicals having high quality experimental data, kindly provided by the U.S. Environmental Protection Agency (EPA). The predictive power of our docking-based models was supported by values of AUC, EF1% (EFmax = 7.1), -LR (at SE = 0.75) and +LR (at SE = 0.25) ranging from 0.63 to 0.72, from 2.5 to 6.2, from 0.35 to 0.67 and from 2.05 to 9.84, respectively. In addition, external predictions were successfully made on some representative known estrogenic chemicals. Conclusion: We show how structure-based methods, widely applied to drug discovery programs, can be adapted to meet the conditions of the regulatory context. Importantly, these methods enable one to employ the physicochemical information contained in the X-ray solved biological target and to screen structurally-unrelated chemicals.
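
    The likelihood ratios and enrichment factor quoted in this abstract follow standard definitions, which can be sketched as below. This is only an illustration: the function names and the confusion-matrix counts are made up for the example and are not taken from the study.

```python
def likelihood_ratios(tp, fn, tn, fp):
    """Return (sensitivity, specificity, +LR, -LR) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    positive_lr = sensitivity / (1.0 - specificity)   # +LR = SE / (1 - SP)
    negative_lr = (1.0 - sensitivity) / specificity   # -LR = (1 - SE) / SP
    return sensitivity, specificity, positive_lr, negative_lr


def enrichment_factor(ranked_labels, fraction):
    """Enrichment factor: hit rate in the top `fraction` of a ranked list
    divided by the hit rate in the whole list (labels are 1 = active, 0 = inactive)."""
    top_n = max(1, int(round(len(ranked_labels) * fraction)))
    hit_rate_top = sum(ranked_labels[:top_n]) / top_n
    hit_rate_all = sum(ranked_labels) / len(ranked_labels)
    return hit_rate_top / hit_rate_all


# Hypothetical counts: 100 actives and 900 inactives, threshold chosen so SE = 0.25.
se, sp, plr, nlr = likelihood_ratios(tp=25, fn=75, tn=855, fp=45)
print(round(plr, 2), round(nlr, 2))  # 5.0 0.79

# Hypothetical ranking of 10 compounds with both actives ranked first.
ef = enrichment_factor([1, 1, 0, 0, 0, 0, 0, 0, 0, 0], fraction=0.2)
```

    Under these definitions the largest possible enrichment factor is the inverse of the fraction of actives in the collection, which is how a bound such as EFmax arises.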

  19. Cancer DNA microarray analysis considering multi-subclass with graph-based clustering method.

    Science.gov (United States)

    Kawamura, Takashi; Mutoh, Hironori; Tomita, Yasuyuki; Kato, Ryuji; Honda, Hiroyuki

    2008-11-01

    It is well known that various genes related to cell cycle, cell-cell adhesion, and transcriptional regulation cause the onset of cancer. Moreover, environmental factors including age, sex, and lifestyle can also contribute to the onset of cancer. Therefore, it is difficult to ascertain which factors influence the onset. Thus, patients suffering from the same disease can be divided into several distinct groups. In the present study, we applied graph-based clustering to several DNA microarray datasets before the classification analysis. Several clusters formed by the graph-based clustering were used for the construction of a multi-class classification model with the k-nearest neighbor method and for finding genes that are specific to a certain cluster by One vs. Others classification. Using this approach, the classification model was constructed for four microarray datasets (leukemia, breast cancer, prostate cancer, and colon cancer), and the accuracies of classification with k-nearest neighbor were all more than 80%. In the breast cancer dataset, we also succeeded in finding genes that are specific to a cluster consisting of 38 control group samples. These results indicate the importance of sample clustering before classification model construction.
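
    The two classification steps named in this abstract, k-nearest neighbor on cluster labels and One vs. Others relabeling, can be sketched roughly as follows. The graph-based clustering itself is not shown, and the function names, the two-gene toy profiles and the cluster labels "A", "B", "C" are invented for illustration.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training samples closest to `query`."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def one_vs_others(labels, target):
    """Relabel so that one cluster is kept and every other cluster becomes 'others'."""
    return [lab if lab == target else "others" for lab in labels]


# Toy expression profiles (two genes per sample) with assumed cluster labels.
profiles = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9), (0.5, 0.1), (0.6, 0.0)]
clusters = ["A", "A", "B", "B", "C", "C"]

multi_class = knn_predict(profiles, clusters, (0.15, 0.15), k=3)
binary = knn_predict(profiles, one_vs_others(clusters, "B"), (0.85, 0.85), k=3)
```

    In a One vs. Others setting, genes specific to a cluster would then be sought by comparing the kept cluster against the pooled remainder.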

  20. Rough Set Based Classification rules generation for SARS Patients.

    Science.gov (United States)

    Honghai, Feng; Guoshun, Chen; Yufeng, Wang; Bingru, Yang; Yumei, Chen

    2005-01-01

    SARS is an acute infectious disease that can cause a large number of deaths, and it is still not well understood. Using experimental micronutrient measurements from 30 SARS patients and 30 non-SARS patients, we induce classification rules with rough set theory. Attribute reduction results show that micronutrients Fe, Ca, K and Na are necessary and sufficient for classification, whereas micronutrients Zn, Cu and Mg are not necessary or are redundant. Additionally, we find that micronutrient Ca has a strong correlation to SARS. The classification results on 30 other examples show that the rough set classification method is usable.
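
    The idea behind rough set attribute reduction can be illustrated with a small consistency check: an attribute is dispensable if dropping it never makes two objects with different decisions indistinguishable. The sketch below is a simplified illustration; the function names, the discretized levels and the five-row table are invented and are not the study's data.

```python
def is_consistent(rows, attributes):
    """True if objects that agree on `attributes` always share the same decision."""
    seen = {}
    for conditions, decision in rows:
        key = tuple(conditions[a] for a in attributes)
        if seen.setdefault(key, decision) != decision:
            return False
    return True


def dispensable_attributes(rows, attributes):
    """Attributes whose individual removal keeps the decision table consistent."""
    return [a for a in attributes
            if is_consistent(rows, [b for b in attributes if b != a])]


# Toy decision table with made-up discretized micronutrient levels.
table = [
    ({"Fe": "low", "Ca": "low", "Zn": "high"}, "SARS"),
    ({"Fe": "low", "Ca": "high", "Zn": "high"}, "non-SARS"),
    ({"Fe": "high", "Ca": "low", "Zn": "high"}, "non-SARS"),
    ({"Fe": "high", "Ca": "high", "Zn": "low"}, "non-SARS"),
    ({"Fe": "low", "Ca": "low", "Zn": "low"}, "SARS"),
]

redundant = dispensable_attributes(table, ["Fe", "Ca", "Zn"])
print(redundant)  # ['Zn']
```

    In this toy table the decision depends only on Fe and Ca, so Zn can be removed without losing any discriminating power, while removing Fe or Ca creates conflicting rows.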

  1. Hyperspectral image classification based on NMF Features Selection Method

    Science.gov (United States)

    Abe, Bolanle T.; Jordaan, J. A.

    2013-12-01

    Hyperspectral instruments are capable of collecting hundreds of images corresponding to wavelength channels for the same area on the earth surface. Due to the huge number of features (bands) in hyperspectral imagery, land cover classification procedures are computationally expensive and pose a problem known as the curse of dimensionality. In addition, higher correlation among contiguous bands increases the redundancy within the bands. Hence, dimension reduction of hyperspectral data is crucial for obtaining good classification accuracy. This paper presents a new feature selection technique. A Non-negative Matrix Factorization (NMF) algorithm is proposed to obtain reduced relevant features in the input domain of each class label. This aims to reduce classification error and the dimensionality of the classification problem. The Indian Pines dataset of Northwest Indiana is used to evaluate the performance of the proposed method through experiments on feature selection and classification. The Waikato Environment for Knowledge Analysis (WEKA) data mining framework is selected as a tool to implement the classification using Support Vector Machines and Neural Networks. The selected feature subsets are subjected to land cover classification to investigate the performance of the classifiers and how the feature size affects classification accuracy. The results obtained show that the performances of the classifiers are significant. The study makes a positive contribution to the problems of hyperspectral imagery by exploring NMF, SVMs and NN to improve classification accuracy. The performances of the classifiers are valuable for decision makers weighing tradeoffs in method accuracy versus method complexity.

  2. Validation of the determinant-based classification and revision of the Atlanta classification systems for acute pancreatitis.

    Science.gov (United States)

    Acevedo-Piedra, Nelly G; Moya-Hoyo, Neftalí; Rey-Riveiro, Mónica; Gil, Santiago; Sempere, Laura; Martínez, Juan; Lluís, Félix; Sánchez-Payá, José; de-Madaria, Enrique

    2014-02-01

    Two new classification systems for the severity of acute pancreatitis (AP) have been proposed, the determinant-based classification (DBC) and a revision of the Atlanta classification (RAC). Our aim was to validate and compare these classification systems. We analyzed data from adult patients with AP (543 episodes of AP in 459 patients) who were admitted to Hospital General Universitario de Alicante from December 2007 to February 2013. Imaging results were reviewed, and the classification systems were validated and compared in terms of outcomes. Pancreatic necrosis was present in 66 of the patients (12%), peripancreatic necrosis in 109 (20%), walled-off necrosis in 61 (11%), acute peripancreatic fluid collections in 98 (18%), and pseudocysts in 19 (4%). Transient and persistent organ failures were present in 31 patients (6%) and 21 patients (4%), respectively. Sixteen patients (3%) died. On the basis of the DBC, 386 (71%), 131 (24%), 23 (4%), and 3 (0.6%) patients were determined to have mild, moderate, severe, or critical AP, respectively. On the basis of the RAC, 363 patients (67%), 160 patients (30%), and 20 patients (4%) were determined to have mild, moderately severe, or severe AP, respectively. The different categories of severity for each classification system were associated with statistically significant and clinically relevant differences in length of hospital stay, need for admission to the intensive care unit, nutritional support, invasive treatment, and in-hospital mortality. In comparing similar categories between the classification systems, no significant differences were found. The DBC and the RAC accurately classify the severity of AP in subgroups of patients. Copyright © 2014 AGA Institute. Published by Elsevier Inc. All rights reserved.

  3. Pathological classification of intrahepatic cholangiocarcinoma based on a new concept.

    Science.gov (United States)

    Nakanuma, Yasuni; Sato, Yasunori; Harada, Kenichi; Sasaki, Motoko; Xu, Jing; Ikeda, Hiroko

    2010-12-27

    Intrahepatic cholangiocarcinoma (ICC) arises from the lining epithelium and peribiliary glands of the intrahepatic biliary tree and shows variable cholangiocytic differentiation. To date, ICC was largely classified into adenocarcinoma and rare variants. Herein, we propose to subclassify the former, based on recent progress in the study of ICC including the gross classification and hepatic progenitor/stem cells and on the pathological similarities between biliary and pancreatic neoplasms. That is, ICC is classifiable into the conventional (bile duct) type, the bile ductular type, the intraductal neoplasm type and rare variants. The conventional type is further divided into the small duct type (peripheral type) and large bile duct type (perihilar type). The former is a tubular or micropapillary adenocarcinoma while the latter involves the intrahepatic large bile duct. Bile ductular type resembles proliferated bile ductules and shows a replacing growth of the hepatic parenchyma. Hepatic progenitor cell or stem cell phenotypes such as neural cell adhesion molecule expression are frequently expressed in the bile ductular type. Intraductal type includes papillary and tubular neoplasms of the bile duct (IPNBs and ITNBs) and a superficial spreading type. IPNB and ITNB show a spectrum from a preneoplastic borderline lesion to carcinoma and may have pancreatic counterparts. At invasive sites, IPNB is associated with the conventional bile duct ICC and mucinous carcinoma. Biliary mucinous cystic neoplasm with ovarian-like stroma in its wall is different from IPNB, particularly IPNB showing cystic dilatation of the affected ducts. Rare variants of ICC include squamous/adenosquamous cell carcinoma, mucinous/signet ring cell carcinoma, clear cell type, undifferentiated type, neuroendocrine carcinoma and so on. This classification of ICC may open up a new field of research of ICC and contribute to the clinical approach to ICC.

  4. A stream-based classification of European cyclone tracks

    Science.gov (United States)

    Hofstätter, Michael; Chimani, Barbara; Steinacker, Reinhold; Blöschl, Günther

    2014-05-01

    The geographical region from where a cyclone enters Europe appears to play an important role in generating certain weather extremes. Some of the most devastating European floods have been associated with type Vb cyclones as in August 2002 or June 2013 for example. On the other hand, gale force storms in Western-Continental Europe are usually caused by cyclones that propagate from the north-eastern Atlantic into Europe. A method is presented for tracking the paths of atmospheric cyclones with the ability to detect both linear and branching tracks. Cyclones are tracked at three atmospheric levels independently using the reanalysis data of NCEP1, ERA-40 and ERA-Interim over Europe in parts of the period 1948-2012. The cyclones are then classified by a new stream-based classification approach into nine types, on the basis of the geographic regions from where cyclones enter Central Europe. Results show that the total number of tracks identified from ERA-40 is about 25% larger than those from NCEP1 due to the higher spatial resolution. The ERA-40 data suggest that, at 700hPa, 80% of all tracks are linear as compared to 65% at sea level pressure (SLP) due to the smoother pressure patterns at higher atmospheric levels. So branching events are more frequent at the surface. The relative number of linear tracks is always largest in the data with the coarsest resolution at all levels. The classification indicates that the proportion of linear and branching tracks varies substantially between cyclone types. For example, the famous cyclone track type Vb has the highest ratio of complex (compound and merge/split) tracks with only 1/3 of linear cases at SLP (ERA-40). The new cyclone type catalogue established in this paper will be used for identifying the temporal behaviour of cyclone tracks in the context of changing weather extremes in Central Europe.

  5. Sequence-based classification using discriminatory motif feature selection.

    Directory of Open Access Journals (Sweden)

    Hao Xiong

    Most existing methods for sequence-based classification use exhaustive feature generation, employing, for example, all k-mer patterns. The motivation behind such (enumerative) approaches is to minimize the potential for overlooking important features. However, there are shortcomings to this strategy. First, practical constraints limit the scope of exhaustive feature generation to patterns of length ≤ k, such that potentially important, longer (> k) predictors are not considered. Second, features so generated exhibit strong dependencies, which can complicate understanding of derived classification rules. Third, and most importantly, numerous irrelevant features are created. These concerns can compromise prediction and interpretation. While remedies have been proposed, they tend to be problem-specific and not broadly applicable. Here, we develop a generally applicable methodology, and an attendant software pipeline, that is predicated on discriminatory motif finding. In addition to the traditional training and validation partitions, our framework entails a third level of data partitioning, a discovery partition. A discriminatory motif finder is used on sequences and associated class labels in the discovery partition to yield a (small) set of features. These features are then used as inputs to a classifier in the training partition. Finally, performance assessment occurs on the validation partition. Important attributes of our approach are its modularity (any discriminatory motif finder and any classifier can be deployed) and its universality (all data, including sequences that are unaligned and/or of unequal length, can be accommodated). We illustrate our approach on two nucleosome occupancy datasets and a protein solubility dataset, previously analyzed using enumerative feature generation. Our method achieves excellent performance results, with and without optimization of classifier tuning parameters. A Python pipeline implementing the approach is
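
    The three-way partitioning described in this abstract (discovery, training, validation) can be sketched as below. This is not the authors' pipeline: the split fractions, the function names and the toy k-mer scorer standing in for a real discriminatory motif finder are all assumptions made for illustration.

```python
import random
from collections import Counter


def three_way_partition(items, fractions=(0.2, 0.5), seed=0):
    """Shuffle and split into (discovery, training, validation) lists.
    `fractions` gives the discovery and training shares; the rest is validation."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_disc = int(len(shuffled) * fractions[0])
    n_train = int(len(shuffled) * fractions[1])
    return (shuffled[:n_disc],
            shuffled[n_disc:n_disc + n_train],
            shuffled[n_disc + n_train:])


def toy_motif_finder(sequences, labels, k=3, top=1):
    """Stand-in for a real motif finder: rank k-mers by how unevenly they
    occur in class-1 versus class-0 sequences of the discovery partition."""
    score = Counter()
    for seq, lab in zip(sequences, labels):
        for kmer in {seq[i:i + k] for i in range(len(seq) - k + 1)}:
            score[kmer] += 1 if lab == 1 else -1
    return sorted(score, key=lambda m: (-abs(score[m]), m))[:top]


def featurize(seq, motifs):
    """Binary presence/absence features, usable as input to any classifier."""
    return [int(m in seq) for m in motifs]


discovery, training, validation = three_way_partition(range(10))
motifs = toy_motif_finder(["AAACGT", "AAATTT", "CGTCGT", "GGGCGT"], [1, 1, 0, 0])
features = featurize("TTAAAG", motifs)
```

    Keeping motif discovery on its own partition means the features are chosen without looking at the sequences later used to fit or assess the classifier.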

  6. Classification of types of stuttering symptoms based on brain activity.

    Directory of Open Access Journals (Sweden)

    Jing Jiang

    Among the non-fluencies seen in speech, some are more typical (MT) of stuttering speakers, whereas others are less typical (LT) and are common to both stuttering and fluent speakers. No neuroimaging work has evaluated the neural basis for grouping these symptom types. Another long-debated issue is which type (LT, MT) whole-word repetitions (WWR) should be placed in. In this study, a sentence completion task was performed by twenty stuttering patients who were scanned using an event-related design. This task elicited stuttering in these patients. Each stuttered trial from each patient was sorted into the MT or LT types with WWR put aside. Pattern classification was employed to train a patient-specific single trial model to automatically classify each trial as MT or LT using the corresponding fMRI data. This model was then validated by using test data that were independent of the training data. In a subsequent analysis, the classification model, just established, was used to determine which type the WWR should be placed in. The results showed that the LT and the MT could be separated with high accuracy based on their brain activity. The brain regions that made most contribution to the separation of the types were: the left inferior frontal cortex and bilateral precuneus, both of which showed higher activity in the MT than in the LT; and the left putamen and right cerebellum which showed the opposite activity pattern. The results also showed that the brain activity for WWR was more similar to that of the LT and fluent speech than to that of the MT. These findings provide a neurological basis for separating the MT and the LT types, and support the widely-used MT/LT symptom grouping scheme. In addition, WWR play a similar role as the LT, and thus should be placed in the LT type.

  7. Classification of Types of Stuttering Symptoms Based on Brain Activity

    Science.gov (United States)

    Jiang, Jing; Lu, Chunming; Peng, Danling; Zhu, Chaozhe; Howell, Peter

    2012-01-01

    Among the non-fluencies seen in speech, some are more typical (MT) of stuttering speakers, whereas others are less typical (LT) and are common to both stuttering and fluent speakers. No neuroimaging work has evaluated the neural basis for grouping these symptom types. Another long-debated issue is which type (LT, MT) whole-word repetitions (WWR) should be placed in. In this study, a sentence completion task was performed by twenty stuttering patients who were scanned using an event-related design. This task elicited stuttering in these patients. Each stuttered trial from each patient was sorted into the MT or LT types with WWR put aside. Pattern classification was employed to train a patient-specific single trial model to automatically classify each trial as MT or LT using the corresponding fMRI data. This model was then validated by using test data that were independent of the training data. In a subsequent analysis, the classification model, just established, was used to determine which type the WWR should be placed in. The results showed that the LT and the MT could be separated with high accuracy based on their brain activity. The brain regions that made most contribution to the separation of the types were: the left inferior frontal cortex and bilateral precuneus, both of which showed higher activity in the MT than in the LT; and the left putamen and right cerebellum which showed the opposite activity pattern. The results also showed that the brain activity for WWR was more similar to that of the LT and fluent speech than to that of the MT. These findings provide a neurological basis for separating the MT and the LT types, and support the widely-used MT/LT symptom grouping scheme. In addition, WWR play a similar role as the LT, and thus should be placed in the LT type. PMID:22761887

  8. Liposome based radiosensitizer cancer therapy

    DEFF Research Database (Denmark)

    Pourhassan, Houman

    Liposome-encapsulated chemotherapeutics have been used in the treatment of a variety of cancers and are feasible for use as mono-therapeutics as well as for combination therapy in conjunction with other modalities. Despite widespread use of liposomal drugs in cancer patient care, insufficient drug...... in tumor-bearing mice. The safety and efficacy of sPLA2-sensitive liposomal L-OHP was assessed in sPLA2-deficient FaDu hypopharyngeal squamous cell carcinoma and sPLA2-expressing Colo205 colorectal adenocarcinoma. Also, the feasibility of multimodal cancer therapy employing L-OHP encapsulated in MMP....... And may thereby improve therapeutic outcome. Two types of enzymes commonly overexpressed in solid cancers and exploited for liposomal drug delivery purposes, are secretory phospholipase A2 (sPLA2) and matrix metalloproteinases (MMPs). Furthermore, as platinum-based chemotherapeutic compounds are renowned...

  9. [Classification and characteristics of interval cancers in the Principality of Asturias's Breast Cancer Screening Program].

    Science.gov (United States)

    Prieto García, M A; Delgado Sevillano, R; Baldó Sierra, C; González Díaz, E; López Secades, A; Llavona Amor, J A; Vidal Marín, B

    2013-09-01

    To review and classify the interval cancers found in the Principality of Asturias's Breast Cancer Screening Program (PDPCM). A secondary objective was to determine the histological characteristics, size, and stage of the interval cancers at the time of diagnosis. We included the interval cancers in the PDPCM in the period 2003-2007. Interval cancers were classified according to the breast cancer screening program protocol, with double reading without consensus, without blinding, with arbitration. Mammograms were interpreted by 10 radiologists in the PDPCM. A total of 33.7% of the interval cancers could not be classified; of the interval cancers that could be classified, 40.67% were labeled true interval cancers, 31.4% were labeled false negatives on screening, 23.7% had minimal signs, and 4.23% were considered occult. A total of 70% of the interval cancers were diagnosed in the year of the period between screening examinations and 71.7% were diagnosed after subsequent screening. A total of 76.9% were invasive ductal carcinomas, 61.1% were stage II when detected, and 78.7% were larger than 10 mm when detected. The rate of interval cancers and the rate of false negatives in the PDPCM are higher than those recommended in the European guidelines. Interval cancers are diagnosed later than the tumors detected at screening. Studying interval cancers provides significant training for the radiologists in the PDPCM. Copyright © 2011 SERAM. Published by Elsevier Espana. All rights reserved.

  10. Robust breast cancer prediction system based on rough set theory at National Cancer Institute of Egypt.

    Science.gov (United States)

    Hamouda, Saeed Khodary M; Wahed, Mohammed E; Abo Alez, Reda H; Riad, Khaled

    2018-01-01

    Breast cancer is one of the leading causes of death among women worldwide. Every year more than a million women are diagnosed with breast cancer, and more than half of them will die because of inaccuracies and delays in diagnosis of the disease. High accuracy in cancer prediction is important to improve the treatment quality and the survivability rate of patients. In this paper, we propose a new and robust breast cancer prediction and diagnosis system based on Rough Set (RS) theory. We also introduce a robust classification process based on some new and highly effective attributes, and we compare and evaluate the performance of our proposed approach against clinical, Radial Basis Function, and Artificial Neural Network classification schemes. The dataset used in our experiments consists of 60 samples obtained from the National Cancer Institute (NCI) of Egypt. We have used the RS theory to robustly find dependence relationships among data, and evaluate the importance of attributes through: Results: Conclusion: We have demonstrated the robustness of the RS theory in the early prediction and diagnosis of breast cancer. This lends more importance to the contribution and efficiency of RS theory in the field of computational biology. Copyright © 2017 Elsevier B.V. All rights reserved.

  11. Dermatologist-level classification of skin cancer with deep neural networks.

    Science.gov (United States)

    Esteva, Andre; Kuprel, Brett; Novoa, Roberto A; Ko, Justin; Swetter, Susan M; Blau, Helen M; Thrun, Sebastian

    2017-02-02

    Skin cancer, the most common human malignancy, is primarily diagnosed visually, beginning with an initial clinical screening and followed potentially by dermoscopic analysis, a biopsy and histopathological examination. Automated classification of skin lesions using images is a challenging task owing to the fine-grained variability in the appearance of skin lesions. Deep convolutional neural networks (CNNs) show potential for general and highly variable tasks across many fine-grained object categories. Here we demonstrate classification of skin lesions using a single CNN, trained end-to-end from images directly, using only pixels and disease labels as inputs. We train a CNN using a dataset of 129,450 clinical images (two orders of magnitude larger than previous datasets) consisting of 2,032 different diseases. We test its performance against 21 board-certified dermatologists on biopsy-proven clinical images with two critical binary classification use cases: keratinocyte carcinomas versus benign seborrheic keratoses; and malignant melanomas versus benign nevi. The first case represents the identification of the most common cancers, the second represents the identification of the deadliest skin cancer. The CNN achieves performance on par with all tested experts across both tasks, demonstrating an artificial intelligence capable of classifying skin cancer with a level of competence comparable to dermatologists. Outfitted with deep neural networks, mobile devices can potentially extend the reach of dermatologists outside of the clinic. It is projected that 6.3 billion smartphone subscriptions will exist by the year 2021 (ref. 13) and can therefore potentially provide low-cost universal access to vital diagnostic care.

  12. Decomposition-based transfer distance metric learning for image classification.

    Science.gov (United States)

    Luo, Yong; Liu, Tongliang; Tao, Dacheng; Xu, Chao

    2014-09-01

    Distance metric learning (DML) is a critical factor for image analysis and pattern recognition. To learn a robust distance metric for a target task, we need abundant side information (i.e., the similarity/dissimilarity pairwise constraints over the labeled data), which is usually unavailable in practice due to the high labeling cost. This paper considers the transfer learning setting by exploiting the large quantity of side information from certain related, but different source tasks to help with target metric learning (with only a little side information). The state-of-the-art metric learning algorithms usually fail in this setting because the data distributions of the source task and target task are often quite different. We address this problem by assuming that the target distance metric lies in the space spanned by the eigenvectors of the source metrics (or other randomly generated bases). The target metric is represented as a combination of the base metrics, which are computed using the decomposed components of the source metrics (or simply a set of random bases); we call the proposed method decomposition-based transfer DML (DTDML). In particular, DTDML learns a sparse combination of the base metrics to construct the target metric by forcing the target metric to be close to an integration of the source metrics. The main advantage of the proposed method compared with existing transfer metric learning approaches is that we directly learn the base metric coefficients instead of the target metric. To this end, far fewer variables need to be learned. We therefore obtain more reliable solutions given the limited side information and the optimization tends to be faster. Experiments on the popular handwritten image (digit, letter) classification and challenging natural image annotation tasks demonstrate the effectiveness of the proposed method.
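
    One way to read the representation described in this abstract is that the target metric is a weighted sum of rank-one base metrics, M = sum_i theta_i * u_i u_i^T, so the squared distance between x and y becomes sum_i theta_i * (u_i . (x - y))^2. The sketch below evaluates such a distance only; it does not learn the sparse coefficients, and the function name, base vectors and coefficient values are assumptions for illustration.

```python
def combined_metric_distance_sq(x, y, bases, coeffs):
    """Squared distance under M = sum_i coeffs[i] * outer(bases[i], bases[i])."""
    diff = [a - b for a, b in zip(x, y)]
    total = 0.0
    for u, theta in zip(bases, coeffs):
        projection = sum(ui * di for ui, di in zip(u, diff))  # u_i . (x - y)
        total += theta * projection * projection
    return total


# Hypothetical orthonormal bases (e.g. eigenvectors of a source metric).
bases = [(1.0, 0.0), (0.0, 1.0)]

euclidean = combined_metric_distance_sq((3.0, 4.0), (0.0, 0.0), bases, [1.0, 1.0])
sparse = combined_metric_distance_sq((3.0, 4.0), (0.0, 0.0), bases, [1.0, 0.0])
print(euclidean, sparse)  # 25.0 9.0
```

    With d base vectors only d coefficients are learned rather than a full d-by-d matrix, which matches the abstract's point that far fewer variables need to be estimated.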

  13. Cancer Biochemistry and Host-Tumor Interactions: A Decimal Classification, (Categories 51.6, 51.7, and 51.8).

    Science.gov (United States)

    Schneider, John H.

    This is a hierarchical decimal classification of information related to cancer biochemistry, to host-tumor interactions (including cancer immunology), and to occurrence of cancer in special types of animals and plants. It is a working draft of categories taken from an extensive classification of many fields of biomedical information. Because the…

  14. Anchor-based classification and type-C inhibitors for tyrosine kinases

    Science.gov (United States)

    Hsu, Kai-Cheng; Sung, Tzu-Ying; Lin, Chih-Ta; Chiu, Yi-Yuan; Hsu, John T.-A.; Hung, Hui-Chen; Sun, Chung-Ming; Barve, Indrajeet; Chen, Wen-Liang; Huang, Wen-Chien; Huang, Chin-Ting; Chen, Chun-Hwa; Yang, Jinn-Moon

    2015-01-01

    Tyrosine kinases regulate various biological processes and are drug targets for cancers. At present, the design of selective and anti-resistant inhibitors of kinases is an emergent task. Here, we inferred specific site-moiety maps containing two specific anchors to uncover a new binding pocket in the C-terminal hinge region by docking 4,680 kinase inhibitors into 51 protein kinases, and this finding provides an opportunity for the development of kinase inhibitors with high selectivity and anti-drug resistance. We present an anchor-based classification for tyrosine kinases and discover two type-C inhibitors, namely rosmarinic acid (RA) and EGCG, which occupy two and one specific anchors, respectively, by screening 118,759 natural compounds. Our profiling reveals that RA and EGCG selectively inhibit 3% (EGFR and SYK) and 14% of 64 kinases, respectively. According to the guide of our anchor model, we synthesized three RA derivatives with better potency. These type-C inhibitors are able to maintain activities for drug-resistant EGFR and decrease the invasion ability of breast cancer cells. Our results show that the type-C inhibitors occupying a new pocket are promising for cancer treatments due to their kinase selectivity and anti-drug resistance. PMID:26077136

  15. Wavelet and K-L Seperability Based Feature Extraction Method for Functional Data Classification

    OpenAIRE

    Jun Wan; Zehua Chen; Yingwu Chen; Zhidong Bai

    2010-01-01

    This paper proposes a novel feature extraction method, based on Discrete Wavelet Transform (DWT) and K-L Seperability (KLS), for the classification of Functional Data (FD). This method combines the decorrelation and reduction property of DWT and the additive independence property of KLS, which helps to extract classification features of FD. It is an advancement of the popular wavelet-based shrinkage method for functional data reduction and classification. A ...
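
    As background for the DWT step mentioned in this abstract, one level of the orthonormal Haar transform can be sketched as follows. The Haar wavelet is chosen here only because it is the simplest case; the abstract does not say which wavelet the authors use, and the KLS criterion is not shown.

```python
import math


def haar_dwt_level(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


# A short sampled curve (made-up values) standing in for one functional observation.
curve = [4.0, 4.0, 2.0, 0.0]
approx, detail = haar_dwt_level(curve)

# The transform is orthonormal, so the signal energy is preserved.
energy_in = sum(v * v for v in curve)
energy_out = sum(v * v for v in approx + detail)
```

    Shrinkage-style reduction would then keep only a subset of these coefficients as features, with the selection rule depending on the chosen criterion.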

  16. Data Stream Classification Based on the Gamma Classifier

    Directory of Open Access Journals (Sweden)

    Abril Valeria Uriarte-Arcia

    2015-01-01

    The ever increasing data generation confronts us with the problem of handling massive amounts of information online. One of the biggest challenges is how to extract valuable information from these massive continuous data streams in a single scan. In a data stream context, data arrive continuously at high speed; therefore the algorithms developed to address this context must be efficient regarding memory and time management and capable of detecting changes over time in the underlying distribution that generated the data. This work describes a novel method for the task of pattern classification over a continuous data stream based on an associative model. The proposed method is based on the Gamma classifier, which is inspired by the Alpha-Beta associative memories, which are both supervised pattern recognition models. The proposed method is capable of handling the space and time constraints inherent to data stream scenarios. The Data Streaming Gamma classifier (DS-Gamma classifier) implements a sliding window approach to provide concept drift detection and a forgetting mechanism. In order to test the classifier, several experiments were performed using different data stream scenarios with real and synthetic data streams. The experimental results show that the method exhibits competitive performance when compared to other state-of-the-art algorithms.
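
    The forgetting behaviour of a sliding window can be illustrated with a very small stream classifier. This sketch uses a plain nearest-neighbour rule over the window, not the Gamma classifier, and it shows only forgetting, not explicit drift detection; the class name, window size and toy stream are assumptions.

```python
import math
from collections import deque


class SlidingWindowNN:
    """Nearest-neighbour classifier that remembers only the most recent samples."""

    def __init__(self, window_size):
        # A deque with maxlen discards the oldest sample automatically.
        self.window = deque(maxlen=window_size)

    def learn(self, features, label):
        self.window.append((tuple(features), label))

    def predict(self, features):
        if not self.window:
            return None
        nearest = min(self.window, key=lambda item: math.dist(item[0], features))
        return nearest[1]


clf = SlidingWindowNN(window_size=3)
for x in [(0.0,), (0.1,), (0.2,)]:
    clf.learn(x, "a")
before_drift = clf.predict((0.05,))

# The concept changes: the same region of feature space now carries label "b".
for x in [(0.0,), (0.1,), (0.2,)]:
    clf.learn(x, "b")
after_drift = clf.predict((0.05,))
print(before_drift, after_drift)  # a b
```

    Memory use is bounded by the window size, which is the kind of space constraint the abstract refers to; a smaller window adapts faster to drift but sees fewer examples.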

  17. Comprehensive Study on Lexicon-based Ensemble Classification Sentiment Analysis

    Directory of Open Access Journals (Sweden)

    Łukasz Augustyniak

    2015-12-01

    Full Text Available We propose a novel method for counting sentiment orientation that outperforms supervised learning approaches in time and memory complexity and is not statistically significantly different from them in accuracy. Our method consists of a novel approach to generating unigram, bigram and trigram lexicons. The proposed method, called frequentiment, is based on calculating the frequency of features (words) in the document and averaging their impact on the sentiment score as opposed to documents that do not contain these features. Afterwards, we use ensemble classification to improve the overall accuracy of the method. What is important is that the frequentiment-based lexicons with sentiment threshold selection outperform other popular lexicons and some supervised learners, while being 3–5 times faster than the supervised approach. We compare 37 methods (lexicons, ensembles with lexicon’s predictions as input and supervised learners) applied to 10 Amazon review data sets and provide the first statistical comparison of the sentiment annotation methods that include ensemble approaches. It is one of the most comprehensive comparisons of domain sentiment analysis in the literature.

  18. Neural-Fuzzy model Based Steel Pipeline Multiple Cracks Classification

    Science.gov (United States)

    Elwalwal, Hatem Mostafa; Mahzan, Shahruddin Bin Hj.; Abdalla, Ahmed N.

    2017-10-01

    While pipes are cheaper than other means of transportation, this cost saving comes with a major price: pipes are subject to cracks, corrosion etc., which in turn can cause leakage and environmental damage. In this paper, a Neural-Fuzzy model for multiple crack classification based on Lamb guided waves is presented. Simulation results for 42 samples were collected using ANSYS software. The research aims to carry out numerical simulation and experimental study in order to find an effective way to detect and localize crack and hole defects in the main body of a pipeline. Considering the damage forms of multiple cracks and holes that may exist in a pipeline, the method determines their respective positions in the steel pipe. In addition, the technique used in this research is a guided Lamb wave based structural health monitoring method in which piezoelectric transducers are used as exciting and receiving sensors in a Pitch-Catch configuration. A simple learning mechanism has been implemented specifically for the ANN of the fuzzy system.

  19. Comparison of the prevalence of malnutrition diagnosis in head and neck, gastrointestinal and lung cancer patients by three classification methods

    Science.gov (United States)

    Platek, Mary E.; Popp KPf, Johann V.; Possinger, Candi S.; DeNysschen, Carol A.; Horvath, Peter; Brown, Jean K.

    2011-01-01

    Background Malnutrition is prevalent among patients within certain cancer types. There is lack of universal standard of care for nutrition screening, lack of agreement on an operational definition and on validity of malnutrition indicators. Objective In a secondary data analysis, we investigated prevalence of malnutrition diagnosis by three classification methods using data from medical records of a National Cancer Institute (NCI)-designated comprehensive cancer center. Interventions/Methods Records of 227 patients hospitalized during 1998 with head and neck, gastrointestinal or lung cancer were reviewed for malnutrition based on three methods: 1) physician diagnosed malnutrition related ICD-9 codes; 2) in-hospital nutritional assessment summary conducted by Registered Dietitians; and 3) body mass index (BMI). For patients with multiple admissions, only data from the first hospitalization was included. Results Prevalence of malnutrition diagnosis ranged from 8.8% based on BMI to approximately 26% of all cases based on dietitian assessment. Kappa coefficients between any methods indicated a weak (kappa=0.23, BMI and Dietitians and kappa=0.28, Dietitians and Physicians) to fair strength of agreement (kappa=0.38, BMI and Physicians). Conclusions Available methods to identify patients with malnutrition in an NCI designated comprehensive cancer center resulted in varied prevalence of malnutrition diagnosis. Universal standard of care for nutrition screening that utilizes validated tools is needed. Implications for Practice The Joint Commission on the Accreditation of Healthcare Organizations requires nutritional screening of patients within 24 hours of admission. For this purpose, implementation of a validated tool that can be used by various healthcare practitioners, including nurses, needs to be considered. PMID:21242767
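
    The agreement statistic reported above can be made concrete with a short sketch. It assumes the kappa coefficients are Cohen's kappa for two raters (the abstract does not name the variant); the function name `cohen_kappa` and the toy ratings are illustrative and are not data from the study.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equally long sequences of category labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of cases where both methods agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the marginal proportions per category.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both methods always give the same single category
    return (observed - expected) / (1.0 - expected)


# Toy example: 1 = malnourished, 0 = not malnourished (made-up ratings).
by_bmi = [1, 1, 0, 0]
by_dietitian = [1, 0, 0, 0]
kappa = cohen_kappa(by_bmi, by_dietitian)
```

    Kappa corrects raw agreement for the agreement expected by chance, which is why two methods can agree on most patients and still show only weak to fair kappa values.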

  20. Genetic Bee Colony (GBC) algorithm: A new gene selection method for microarray cancer classification.

    Science.gov (United States)

    Alshamlan, Hala M; Badr, Ghada H; Alohali, Yousef A

    2015-06-01

    Naturally inspired evolutionary algorithms have proven effective when used for solving feature selection and classification problems. Artificial Bee Colony (ABC) is a relatively new swarm intelligence method. In this paper, we propose a new hybrid gene selection method, namely Genetic Bee Colony (GBC) algorithm. The proposed algorithm combines the use of a Genetic Algorithm (GA) along with Artificial Bee Colony (ABC) algorithm. The goal is to integrate the advantages of both algorithms. The proposed algorithm is applied to a microarray gene expression profile in order to select the most predictive and informative genes for cancer classification. In order to test the accuracy performance of the proposed algorithm, extensive experiments were conducted. Three binary microarray datasets are used, which include: colon, leukemia, and lung. In addition, another three multi-class microarray datasets are used, which are: SRBCT, lymphoma, and leukemia. Results of the GBC algorithm are compared with our recently proposed technique: mRMR when combined with the Artificial Bee Colony algorithm (mRMR-ABC). We also compared the combination of mRMR with GA (mRMR-GA) and Particle Swarm Optimization (mRMR-PSO) algorithms. In addition, we compared the GBC algorithm with other related algorithms that have been recently published in the literature, using all benchmark datasets. The GBC algorithm shows superior performance as it achieved the highest classification accuracy along with the lowest average number of selected genes. This proves that the GBC algorithm is a promising approach for solving the gene selection problem in both binary and multi-class cancer classification. Copyright © 2015 Elsevier Ltd. All rights reserved.

  1. Hydrological landscape classification: investigating the performance of HAND based landscape classifications in a central European meso-scale catchment

    Directory of Open Access Journals (Sweden)

    S. Gharari

    2011-11-01

    Full Text Available This paper presents a detailed performance and sensitivity analysis of a recently developed hydrological landscape classification method based on dominant runoff mechanisms. Three landscape classes are distinguished: wetland, hillslope and plateau, corresponding to three dominant hydrological regimes: saturation excess overland flow, storage excess sub-surface flow, and deep percolation. Topography, geology and land use hold the key to identifying these landscapes. The height above the nearest drainage (HAND) and the surface slope, which can be easily obtained from a digital elevation model, appear to be the dominant topographical controls for hydrological classification. In this paper several indicators for classification are tested as well as their sensitivity to scale and resolution of observed points (sample size). The best results are obtained by the simple use of HAND and slope. The results obtained compared well with the topographical wetness index. The HAND based landscape classification appears to be an efficient method to "read the landscape" on the basis of which conceptual models can be developed.
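
    A classification from HAND and slope alone might look like the sketch below. The decision rule and both threshold values are assumptions made for illustration; the abstract does not state the thresholds or the exact rule used in the paper, and the function name `classify_landscape` is hypothetical.

```python
def classify_landscape(hand_m, slope, hand_threshold_m=5.0, slope_threshold=0.1):
    """Assign a landscape class from HAND (metres) and slope (rise over run).

    Both thresholds are placeholder values, not values from the paper.
    """
    if hand_m < hand_threshold_m:
        return "wetland"    # close to the drainage level
    if slope >= slope_threshold:
        return "hillslope"  # elevated and steep
    return "plateau"        # elevated and flat


# Example grid cells as (HAND in metres, slope) pairs, made up for illustration.
cells = [(1.2, 0.02), (20.0, 0.25), (35.0, 0.03)]
classes = [classify_landscape(hand, slope) for hand, slope in cells]
```

    In practice both inputs would be derived per grid cell from a digital elevation model, and the thresholds would be calibrated for the catchment.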

  2. Validation of TNM classification for metastatic prostatic cancer treated using primary androgen deprivation therapy.

    Science.gov (United States)

    Kadono, Yoshifumi; Nohara, Takahiro; Ueno, Satoru; Izumi, Kouji; Kitagawa, Yasuhide; Konaka, Hiroyuki; Mizokami, Atsushi; Onozawa, Mizuki; Hinotsu, Shiro; Akaza, Hideyuki; Namiki, Mikio

    2016-02-01

    The current tumor-node-metastasis (TNM) classification system has been used for many years. The prognosis of patients with metastatic prostate cancer (mPC) treated using primary androgen deprivation therapy (PADT) was analyzed according to the TNM classification. A total of 5618 cases with lymph node metastases only (N1M0), non-regional lymph node metastasis (M1a), bone metastasis (M1b), and distant metastasis (M1c) were selected from the Japanese Study Group of Prostate Cancer database. Overall survival (OS), cancer-specific survival (CSS), and progression-free survival (PFS) rates were calculated using Kaplan-Meier analysis. The influence of clinical variables on patient prognosis was evaluated using the Cox proportional hazard regression model. The 5-year OS, CSS, and PFS were 76.0, 83.2, and 38.8% in N1M0, 57.5, 69.0, and 23.0% in M1a, 54.0, 63.1, and 23.0% in M1b, and 40.0, 51.5, and 16.6% in M1c, respectively. OS, CSS, and PFS worsened as the stages progressed. OS, CSS, and PFS were all significantly worse in N1M1b compared with N0M1b. Multivariate analysis revealed that OS and CSS were worse in patients with a Gleason score ≥8 and that combined androgen blockade (CAB) treatment provided better OS than non-CAB treatments at any tumor stage. However, OS and CSS were worse in individuals with a prostate-specific antigen >100 ng/ml only in M1b. Patient prognosis worsened with stage progression; therefore, current TNM classification system of mPC for PADT was shown to be trustworthy. Each PC cell that develops bone or lymphoid metastasis may exhibit different characteristics.
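
    The survival rates above were obtained with Kaplan-Meier analysis. A minimal product-limit estimator is sketched below to show how such rates are computed in the presence of censored follow-up; the function name `kaplan_meier` and the toy data are illustrative and are not taken from the study.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival probability)] at each distinct event time.

    durations: follow-up time per patient.
    events: 1 if the event (e.g. death) was observed, 0 if censored.
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Events and censored cases both leave the risk set after time t.
        at_risk -= removed
    return curve


# Toy data: follow-up in years; the second patient is censored at year 2.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

    A 5-year rate such as those quoted in the abstract is the value of this step function at t = 5 years.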

  3. Contribution of Multiparameter Flow Cytometry Immunophenotyping to the Diagnostic Screening and Classification of Pediatric Cancer

    Science.gov (United States)

    Ferreira-Facio, Cristiane S.; Milito, Cristiane; Botafogo, Vitor; Fontana, Marcela; Thiago, Leandro S.; Oliveira, Elen; da Rocha-Filho, Ariovaldo S.; Werneck, Fernando; Forny, Danielle N.; Dekermacher, Samuel; de Azambuja, Ana Paula; Ferman, Sima Esther; de Faria, Paulo Antônio Silvestre; Land, Marcelo G. P.; Orfao, Alberto; Costa, Elaine S.

    2013-01-01

    Pediatric cancer is a relatively rare and heterogeneous group of hematological and non-hematological malignancies which require multiple procedures for its diagnostic screening and classification. Until now, flow cytometry (FC) has not been systematically applied to the diagnostic work-up of such malignancies, particularly for solid tumors. Here we evaluated a FC panel of markers for the diagnostic screening of pediatric cancer and further classification of pediatric solid tumors. The proposed strategy aims at the differential diagnosis between tumoral vs. reactive samples, and hematological vs. non-hematological malignancies, and the subclassification of solid tumors. In total, 52 samples from 40 patients suspicious of containing tumor cells were analyzed by FC in parallel to conventional diagnostic procedures. The overall concordance rate between both approaches was of 96% (50/52 diagnostic samples), with 100% agreement for all reactive/inflammatory and non-infiltrated samples as well as for those corresponding to solid tumors (n = 35), with only two false negative cases diagnosed with Hodgkin lymphoma and anaplastic lymphoma, respectively. Moreover, clear discrimination between samples infiltrated by hematopoietic vs. non-hematopoietic tumor cells was systematically achieved. Distinct subtypes of solid tumors showed different protein expression profiles, allowing for the differential diagnosis of neuroblastoma (CD56hi/GD2+/CD81hi), primitive neuroectodermal tumors (CD271hi/CD99+), Wilms tumors (>1 cell population), rhabdomyosarcoma (nuMYOD1+/numyogenin+), carcinomas (CD45−/EpCAM+), germ cell tumors (CD56+/CD45−/NG2+/CD10+) and eventually also hemangiopericytomas (CD45−/CD34+). In summary, our results show that multiparameter FC provides fast and useful complementary data to routine histopathology for the diagnostic screening and classification of pediatric cancer. PMID:23472067

  4. Contribution of multiparameter flow cytometry immunophenotyping to the diagnostic screening and classification of pediatric cancer.

    Directory of Open Access Journals (Sweden)

    Cristiane S Ferreira-Facio

    Full Text Available Pediatric cancer is a relatively rare and heterogeneous group of hematological and non-hematological malignancies which require multiple procedures for its diagnostic screening and classification. Until now, flow cytometry (FC) has not been systematically applied to the diagnostic work-up of such malignancies, particularly for solid tumors. Here we evaluated a FC panel of markers for the diagnostic screening of pediatric cancer and further classification of pediatric solid tumors. The proposed strategy aims at the differential diagnosis between tumoral vs. reactive samples, and hematological vs. non-hematological malignancies, and the subclassification of solid tumors. In total, 52 samples from 40 patients suspicious of containing tumor cells were analyzed by FC in parallel to conventional diagnostic procedures. The overall concordance rate between both approaches was of 96% (50/52 diagnostic samples), with 100% agreement for all reactive/inflammatory and non-infiltrated samples as well as for those corresponding to solid tumors (n = 35), with only two false negative cases diagnosed with Hodgkin lymphoma and anaplastic lymphoma, respectively. Moreover, clear discrimination between samples infiltrated by hematopoietic vs. non-hematopoietic tumor cells was systematically achieved. Distinct subtypes of solid tumors showed different protein expression profiles, allowing for the differential diagnosis of neuroblastoma (CD56(hi)/GD2(+)/CD81(hi)), primitive neuroectodermal tumors (CD271(hi)/CD99(+)), Wilms tumors (>1 cell population), rhabdomyosarcoma (nuMYOD1(+)/numyogenin(+)), carcinomas (CD45(-)/EpCAM(+)), germ cell tumors (CD56(+)/CD45(-)/NG2(+)/CD10(+)) and eventually also hemangiopericytomas (CD45(-)/CD34(+)). In summary, our results show that multiparameter FC provides fast and useful complementary data to routine histopathology for the diagnostic screening and classification of pediatric cancer.

  5. Trace elements based classification on clinkers. Application to Spanish clinkers

    Directory of Open Access Journals (Sweden)

    Tamás, F. D.

    2001-12-01

    Full Text Available The qualitative identification to determine the origin (i.e. manufacturing factory) of Spanish clinkers is described. The classification of clinkers produced in different factories can be based on their trace element content. Approximately fifteen clinker sorts are analysed, collected from 11 Spanish cement factories to determine their Mg, Sr, Ba, Mn, Ti, Zr, Zn and V content. An expert system formulated by a binary decision tree is designed based on the collected data. The performance of the obtained classifier was measured by ten-fold cross validation. The results show that the proposed method is useful to identify an easy-to-use expert system that is able to determine the origin of the clinker based on its trace element content.

    This work describes the procedure for the qualitative identification of Spanish clinkers in order to determine their origin (factory). This classification of clinkers is based on their trace element content. Fifteen different clinkers from 11 Spanish cement factories were analysed to determine their Mg, Sr, Ba, Mn, Ti, Zr, Zn and V content. An expert system was designed using a binary decision tree based on the collected data. The resulting classification was examined by ten-fold cross-validation. The results show that the proposed model is valid for identifying, in an easy way, an expert system capable of determining the origin of a clinker based on its trace element content.

  6. TOPICAL REVIEW: A review of classification algorithms for EEG-based brain computer interfaces

    Science.gov (United States)

    Lotte, F.; Congedo, M.; Lécuyer, A.; Lamarche, F.; Arnaldi, B.

    2007-06-01

    In this paper we review classification algorithms used to design brain-computer interface (BCI) systems based on electroencephalography (EEG). We briefly present the commonly employed algorithms and describe their critical properties. Based on the literature, we compare them in terms of performance and provide guidelines to choose the suitable classification algorithm(s) for a specific BCI.

  7. Event-Based User Classification in Weibo Media

    Science.gov (United States)

    Wang, Wendong; Cheng, Shiduan; Que, Xirong

    2014-01-01

    Weibo media, known as the real-time microblogging services, has attracted massive attention and support from social network users. Weibo platform offers an opportunity for people to access information and changes the way people acquire and disseminate information significantly. Meanwhile, it enables people to respond to the social events in a more convenient way. Much of the information in Weibo media is related to some events. Users who post different content and exhibit different behaviors or attitudes may contribute differently to a specific event. Therefore, automatically classifying the large number of uncategorized social circles generated in Weibo media from the perspective of events is a promising task. Under this circumstance, in order to effectively organize and manage the huge number of users, thereby further managing their content, we address the task of user classification in a more granular, event-based approach in this paper. By analyzing real data collected from Sina Weibo, we investigate the Weibo properties and utilize both content information and social network information to classify the numerous users into four primary groups: celebrities, organizations/media accounts, grassroots stars, and ordinary individuals. The experimental results show that our method identifies the user categories accurately. PMID:25133235

  8. Brazilian Cardiorespiratory Fitness Classification Based on Maximum Oxygen Consumption

    Science.gov (United States)

    Herdy, Artur Haddad; Caixeta, Ananda

    2016-01-01

    Background Cardiopulmonary exercise test (CPET) is the most complete tool available to assess functional aerobic capacity (FAC). Maximum oxygen consumption (VO2 max), an important biomarker, reflects the real FAC. Objective To develop a cardiorespiratory fitness (CRF) classification based on VO2 max in a Brazilian sample of healthy and physically active individuals of both sexes. Methods We selected 2837 CPET from 2837 individuals aged 15 to 74 years, distributed as follows: G1 (15 to 24); G2 (25 to 34); G3 (35 to 44); G4 (45 to 54); G5 (55 to 64) and G6 (65 to 74). Good CRF was the mean VO2 max obtained for each group, generating the following subclassification: Very Low (VL): VO2 105%. Results Men VL 105%: G1 53.13, G2 49.77, G3 47.67, G4 42.52, G5 37.06, G6 31.50. Women: G1 40.85, G2 40.01, G3 34.09, G4 32.66, G5 30.04, G6 26.36. Conclusions This chart stratifies VO2 max measured on a treadmill in a robust Brazilian sample and can be used as an alternative for the real functional evaluation of physically active and healthy individuals stratified by age and sex. PMID:27305285

  9. Power Allocation Based on Data Classification in Wireless Sensor Networks.

    Science.gov (United States)

    Wang, Houlian; Zhou, Gongbo

    2017-05-12

    Limited node energy in wireless sensor networks is a crucial factor which affects the monitoring of equipment operation and working conditions in coal mines. In addition, due to heterogeneous nodes and different data acquisition rates, the number of arriving packets in a queue network can differ, which may lead to some queue lengths reaching the maximum value earlier compared with others. In order to tackle these two problems, an optimal power allocation strategy based on classified data is proposed in this paper. Arriving data is classified into dissimilar classes depending on the number of arriving packets. The problem is formulated as a Lyapunov drift optimization with the objective of minimizing the weighted sum of average power consumption and average data class. As a result, a suboptimal distributed algorithm without any knowledge of system statistics is presented. The simulations, conducted in the perfect channel state information (CSI) case and the imperfect CSI case, reveal that the utility can be pushed arbitrarily close to optimal by increasing the parameter V, but with a corresponding growth in the average delay, and that other tunable parameters W and the classification method in the interior of utility function can trade power optimality for increased average data class. The above results show that data in a high class is given priority in processing over data in a low class, and energy consumption can be minimized in this resource allocation strategy.

  10. Wavelets and Morphological Operators Based Classification of Epilepsy Risk Levels

    Directory of Open Access Journals (Sweden)

    Harikumar Rajaguru

    2014-01-01

    Full Text Available The objective of this paper is to compare the performance of Singular Value Decomposition (SVD), Expectation Maximization (EM), and Modified Expectation Maximization (MEM) as the postclassifiers for classifications of the epilepsy risk levels obtained from extracted features through wavelet transforms and morphological filters from EEG signals. The code converter acts as a level one classifier. The seven features such as energy, variance, positive and negative peaks, spike and sharp waves, events, average duration, and covariance are extracted from EEG signals, out of which four parameters like positive and negative peaks, spike and sharp waves, events, and average duration are extracted using Haar, dB2, dB4, and Sym8 wavelet transforms with hard and soft thresholding methods. The above said four features are also extracted through morphological filters. The performance of the code converter and classifiers are compared based on the parameters such as Performance Index (PI) and Quality Value (QV). The Performance Index and Quality Value of code converters are at low value of 33.26% and 12.74, respectively. The highest PI of 98.03% and QV of 23.82 are attained at dB2 wavelet with hard thresholding method for SVD classifier. All the postclassifiers are settled at PI value of more than 90% at QV of 20.

  11. Radar-Derived Quantitative Precipitation Estimation Based on Precipitation Classification

    Directory of Open Access Journals (Sweden)

    Lili Yang

    2016-01-01

    Full Text Available A method for improving radar-derived quantitative precipitation estimation is proposed. Tropical vertical profiles of reflectivity (VPRs) are first determined from multiple VPRs. Upon identifying a tropical VPR, the event can be further classified as either tropical-stratiform or tropical-convective rainfall by a fuzzy logic (FL) algorithm. Based on the precipitation-type fields, the reflectivity values are converted into rainfall rate using a Z-R relationship. In order to evaluate the performance of this rainfall classification scheme, three experiments were conducted using three months of data and two study cases. In Experiment I, the Weather Surveillance Radar-1988 Doppler (WSR-88D) default Z-R relationship was applied. In Experiment II, the precipitation regime was separated into convective and stratiform rainfall using the FL algorithm, and corresponding Z-R relationships were used. In Experiment III, the precipitation regime was separated into convective, stratiform, and tropical rainfall, and the corresponding Z-R relationships were applied. The results show that the rainfall rates obtained from all three experiments match closely with the gauge observations, although Experiment II could solve the underestimation, when compared to Experiment I. Experiment III significantly reduced this underestimation and generated the most accurate radar estimates of rain rate among the three experiments.
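
    The conversion step described above, from reflectivity to rain rate once a precipitation type is known, can be sketched as follows. A Z-R relationship has the form Z = a * R^b, with Z in mm^6/m^3 and R in mm/h. The coefficient pairs below are commonly cited values (300/1.4 is the WSR-88D default, 200/1.6 is Marshall-Palmer, 250/1.2 is a widely used tropical relationship); assigning them to these three classes is an assumption, since the abstract does not list the coefficients the authors used. The names `ZR_COEFFICIENTS` and `rain_rate_mm_per_h` are hypothetical.

```python
import math

# (a, b) pairs for Z = a * R**b. Illustrative assignment, not from the paper.
ZR_COEFFICIENTS = {
    "convective": (300.0, 1.4),
    "stratiform": (200.0, 1.6),
    "tropical": (250.0, 1.2),
}


def rain_rate_mm_per_h(dbz, precip_type):
    """Convert radar reflectivity in dBZ to rain rate for one precipitation type."""
    a, b = ZR_COEFFICIENTS[precip_type]
    z_linear = 10.0 ** (dbz / 10.0)      # dBZ -> Z in mm^6 / m^3
    return (z_linear / a) ** (1.0 / b)   # invert Z = a * R**b


# The same 40 dBZ echo gives different rain rates under each relationship.
rates_at_40_dbz = {kind: rain_rate_mm_per_h(40.0, kind) for kind in ZR_COEFFICIENTS}
```

    With these coefficients the tropical relationship yields a noticeably higher rain rate than the default for the same reflectivity, which is the kind of difference that can offset an underestimation in tropical rainfall.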

  12. Brazilian Cardiorespiratory Fitness Classification Based on Maximum Oxygen Consumption

    Directory of Open Access Journals (Sweden)

    Artur Haddad Herdy

    2016-05-01

    Full Text Available Abstract Background: Cardiopulmonary exercise test (CPET) is the most complete tool available to assess functional aerobic capacity (FAC). Maximum oxygen consumption (VO2 max), an important biomarker, reflects the real FAC. Objective: To develop a cardiorespiratory fitness (CRF) classification based on VO2 max in a Brazilian sample of healthy and physically active individuals of both sexes. Methods: We selected 2837 CPET from 2837 individuals aged 15 to 74 years, distributed as follows: G1 (15 to 24); G2 (25 to 34); G3 (35 to 44); G4 (45 to 54); G5 (55 to 64) and G6 (65 to 74). Good CRF was the mean VO2 max obtained for each group, generating the following subclassification: Very Low (VL): VO2 105%. Results: Men VL 105%: G1 53.13, G2 49.77, G3 47.67, G4 42.52, G5 37.06, G6 31.50. Women: G1 40.85, G2 40.01, G3 34.09, G4 32.66, G5 30.04, G6 26.36. Conclusions: This chart stratifies VO2 max measured on a treadmill in a robust Brazilian sample and can be used as an alternative for the real functional evaluation of physically active and healthy individuals stratified by age and sex.

  13. Locally linear embedding (LLE) for MRI based Alzheimer's disease classification.

    Science.gov (United States)

    Liu, Xin; Tosun, Duygu; Weiner, Michael W; Schuff, Norbert

    2013-12-01

    Modern machine learning algorithms are increasingly being used in neuroimaging studies, such as the prediction of Alzheimer's disease (AD) from structural MRI. However, finding a good representation for multivariate brain MRI features in which their essential structure is revealed and easily extractable has been difficult. We report a successful application of a machine learning framework that significantly improved the use of brain MRI for predictions. Specifically, we used the unsupervised learning algorithm of local linear embedding (LLE) to transform multivariate MRI data of regional brain volume and cortical thickness to a locally linear space with fewer dimensions, while also utilizing the global nonlinear data structure. The embedded brain features were then used to train a classifier for predicting future conversion to AD based on a baseline MRI. We tested the approach on 413 individuals from the Alzheimer's Disease Neuroimaging Initiative (ADNI) who had baseline MRI scans and complete clinical follow-ups over 3 years with the following diagnoses: cognitive normal (CN; n=137), stable mild cognitive impairment (s-MCI; n=93), MCI converters to AD (c-MCI, n=97), and AD (n=86). We found that classifications using embedded MRI features generally outperformed (p ...) classifications using the original features directly. Moreover, the improvement from LLE was not limited to a particular classifier but worked equally well for regularized logistic regressions, support vector machines, and linear discriminant analysis. Most strikingly, using LLE significantly improved (p=0.007) predictions of MCI subjects who converted to AD and those who remained stable (accuracy/sensitivity/specificity: 0.68/0.80/0.56). In contrast, predictions using the original features performed no better than chance (accuracy/sensitivity/specificity: 0.56/0.65/0.46). In conclusion, LLE is a very effective tool for classification studies of AD using multivariate MRI data. The improvement in

  14. Classification of CT brain images based on deep learning networks.

    Science.gov (United States)

    Gao, Xiaohong W; Hui, Rui; Tian, Zengmin

    2017-01-01

    While computerised tomography (CT) may have been the first imaging tool to study human brain, it has not yet been implemented into clinical decision making process for diagnosis of Alzheimer's disease (AD). On the other hand, with the nature of being prevalent, inexpensive and non-invasive, CT does present diagnostic features of AD to a great extent. This study explores the significance and impact on the application of the burgeoning deep learning techniques to the task of classification of CT brain images, in particular utilising convolutional neural network (CNN), aiming at providing supplementary information for the early diagnosis of Alzheimer's disease. Towards this end, three categories of CT images (N = 285) are clustered into three groups, which are AD, lesion (e.g. tumour) and normal ageing. In addition, considering the characteristics of this collection with larger thickness along the direction of depth (z) (~3-5 mm), an advanced CNN architecture is established integrating both 2D and 3D CNN networks. The fusion of the two CNN networks is subsequently coordinated based on the average of Softmax scores obtained from both networks consolidating 2D images along spatial axial directions and 3D segmented blocks respectively. As a result, the classification accuracy rates rendered by this elaborated CNN architecture are 85.2%, 80% and 95.3% for classes of AD, lesion and normal respectively with an average of 87.6%. Additionally, this improved CNN network appears to outperform the others when in comparison with 2D version only of CNN network as well as a number of state of the art hand-crafted approaches. As a result, these approaches deliver accuracy rates in percentage of 86.3, 85.6 ± 1.10, 86.3 ± 1.04, 85.2 ± 1.60, 83.1 ± 0.35 for 2D CNN, 2D SIFT, 2D KAZE, 3D SIFT and 3D KAZE respectively. The two major contributions of the paper constitute a new 3-D approach while applying deep learning technique to extract signature information
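
    The fusion step described above, averaging the Softmax scores of the 2D and 3D networks, can be sketched in a few lines. The networks themselves are not reproduced here; the raw output scores, the function names `softmax` and `fuse_by_average`, and the class order are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_by_average(logits_2d, logits_3d, class_names):
    """Average the two softmax distributions and pick the top class."""
    probs_2d = softmax(logits_2d)
    probs_3d = softmax(logits_3d)
    fused = [(p + q) / 2.0 for p, q in zip(probs_2d, probs_3d)]
    best = max(range(len(fused)), key=fused.__getitem__)
    return class_names[best], fused


# Made-up raw scores for one scan from the two branches.
label, fused = fuse_by_average(
    [2.0, 0.0, 0.0], [0.0, 1.0, 0.0], ["AD", "lesion", "normal"]
)
```

    Averaging keeps the result a valid probability distribution and lets a confident branch outweigh an uncertain one without any extra trainable parameters.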

  15. Manifold learning based feature extraction for classification of hyperspectral data

    CSIR Research Space (South Africa)

    Lunga, D

    2014-01-01

    Full Text Available Interest in manifold learning for representing the topology of large, high dimensional nonlinear data sets in lower, but still meaningful dimensions for visualization and classification has grown rapidly over the past decade, and particularly...

  16. Vehicle Maneuver Detection with Accelerometer-Based Classification

    National Research Council Canada - National Science Library

    Cervantes-Villanueva, Javier; Carrillo-Zapata, Daniel; Terroso-Saenz, Fernando; Valdes-Vela, Mercedes; Skarmeta, Antonio

    2016-01-01

    .... For its realization, we have evaluated different classification algorithms to act as agents within the architecture. Finally, our approach has been tested with a real-world dataset collected by means of the ad hoc mobile application developed.

  17. Cross-platform classification in microarray-based leukemia diagnostics.

    Science.gov (United States)

    Nilsson, Bjorn; Andersson, Anna; Johansson, Mikael; Fioretos, Thoas

    2006-06-01

    Gene expression profiling is a powerful technique for classifying hematologic malignancies. Its clinical use is, however, currently hindered by the need to collect large sets of expression profiles at each diagnostic facility. To overcome this limitation, we introduced cross-platform classification, allowing classifier construction using pre-existing microarray datasets. As proof-of-principle, we performed cross-platform classification of acute myeloid leukemia and childhood acute lymphoblastic leukemia using expression data from four different facilities. We show that cross-platform classification of these disorders is achievable, and, strikingly, that the diagnostic accuracy can be retained. We conclude that cross-platform classification constitutes an effective and convenient way to implement microarray diagnostics.

  18. A Mood-based Genre Classification of Television Content

    OpenAIRE

    Corona, Humberto; O'Mahony, Michael P.

    2015-01-01

    The classification of television content helps users organise and navigate through the large list of channels and programs now available. In this paper, we address the problem of television content classification by exploiting text information extracted from program transcriptions. We present an analysis which adapts a model for sentiment that has been widely and successfully applied in other fields such as music or blog posts. We use a real-world dataset obtained from the Boxfish API to co...

  19. Tweet-based Target Market Classification Using Ensemble Method

    Directory of Open Access Journals (Sweden)

    Muhammad Adi Khairul Anshary

    2016-09-01

    Target market classification is aimed at focusing marketing activities on the right targets. Classification of target markets can be done through data mining and by utilizing data from social media, e.g. Twitter. The end results of data mining are learning models that can classify new data. Ensemble methods can improve the accuracy of the models and therefore provide better results. In this study, classification of target markets was conducted on a dataset of 3000 tweets in order to extract features. Classification models were constructed to manipulate the training data using two ensemble methods (bagging and boosting). To investigate the effectiveness of the ensemble methods, this study used the CART (classification and regression tree) algorithm for comparison. Three categories of consumer goods (computers, mobile phones and cameras) and three categories of sentiments (positive, negative and neutral) were classified towards three target-market categories. Machine learning was performed using Weka 3.6.9. The results of the test data showed that the bagging method improved the accuracy of CART by 1.9% (to 85.20%). On the other hand, for sentiment classification, the ensemble methods were not successful in increasing the accuracy of CART. The results of this study may be taken into consideration by companies who approach their customers through social media, especially Twitter.
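
    The bagging idea named in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the study's Weka setup: the base learner is a one-feature decision stump rather than a full CART tree, the toy data are invented, and the names `fit_stump`, `bagging_fit` and `bagging_predict` are hypothetical.

```python
import random
from collections import Counter


def fit_stump(xs, ys):
    """Fit a 1-D decision stump that predicts 1 when x > threshold, else 0.

    Assumes the sample holds at least two distinct x values.
    """
    values = sorted(set(xs))
    # Candidate thresholds are midpoints between neighbouring distinct values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best_threshold, best_errors = candidates[0], len(xs) + 1
    for t in candidates:
        errors = sum((x > t) != bool(y) for x, y in zip(xs, ys))
        if errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold


def bagging_fit(xs, ys, n_models=25, seed=0):
    """Train n_models stumps, each on a bootstrap resample of the data."""
    rng = random.Random(seed)
    n = len(xs)
    thresholds = []
    while len(thresholds) < n_models:
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        sample_y = [ys[i] for i in idx]
        if len(set(sample_y)) < 2:
            continue  # simplification (assumption): redraw single-class samples
        thresholds.append(fit_stump([xs[i] for i in idx], sample_y))
    return thresholds


def bagging_predict(thresholds, x):
    """Majority vote over the individual stump predictions."""
    votes = Counter(int(x > t) for t in thresholds)
    return votes.most_common(1)[0][0]


# Invented toy data: class 0 has small feature values, class 1 large ones.
xs = [1, 2, 3, 7, 8, 9]
ys = [0, 0, 0, 1, 1, 1]
model = bagging_fit(xs, ys)
print(bagging_predict(model, 2), bagging_predict(model, 8))
```

    Each stump sees a different resample, so the vote averages out the variance of the individual models, which is the mechanism bagging relies on.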

  20. Texture classification of aerial image based on PCA-NBC

    Science.gov (United States)

    Yu, Xin; Zheng, Zhaoboa; Li, Linyi; Ye, Zhiwei

    2005-10-01

    Bayesian Networks have emerged in recent years as a powerful data mining technique for handling uncertainty in complex domains. A Bayesian Network represents the joint probability distribution and domain (or expert) knowledge in a compact way and provides a comprehensive method of representing relationships and influences among nodes (variables) with a graphical diagram. In the classification domain, however, researchers paid it little attention until the simplest form of Bayesian Network, the Naive Bayes Classifier, appeared. The Naive Bayes Classifier is a simple and efficient probabilistic classification method and has shown surprising performance in some domains, owing to the independence assumption that makes it easier to fit to a classification task. However, the independence assumption obviously does not hold in the real world. Therefore, in order to meet the "naive" (unrealistic) assumption, this paper proposes a new texture classification method for aerial images, PCA-NBC, which combines Principal Components Analysis (PCA) and the Naive Bayes Classifier (NBC). PCA transforms the highly correlated features into statistically independent and orthogonal "features", so it is suitable for solving that problem and can lay a solid theoretical foundation for the application. One hundred and thirteen aerial images are used to evaluate the classification performance in the experiment. The experimental results demonstrate that the proposed method can cut down the number of features and computational costs and improve the accuracy during classification. In short, the new method, PCA-NBC, is an attractive and effective method, which outperforms the Naive Bayes Classifier.

  1. A Library Book Intelligence Classification System based on Multi-agent

    Science.gov (United States)

    Pengfei, Guo; Liangxian, Du; Junxia, Qi

    This paper introduces artificial intelligence into the administrative system of the library and presents a multi-agent model of a robot system for book classification. The intelligent robot recognizes book barcodes automatically, and a classification algorithm is given that follows the book classification scheme used in Chinese libraries. The algorithm calculates the exact position of each book and relates it to all similar books, so the robot can shelve all books of the same kind in one pass without turning back.

  2. Statistical Analysis of Tissue Images for Detection and Classification of Cervical Cancer

    CERN Document Server

    Jagtap, Jaidip; Pandey, Kiran; Agarwa, Asha; Panigrahi, Prasanta K; Pradhan, Asima

    2011-01-01

    Cervical cancer is one of the major health threats in women worldwide. The current "gold standard" for detecting cancer of the epithelial tissue is the histopathology analysis of biopsy samples. However, it relies on the pathologist's judgment of the disease. We investigate the utility of statistical parameters as a potential tool for detection and discrimination of the stages of dysplasia. Digital images of the tissue slides are captured with a digital camera attached to a microscope. Statistical data analysis is performed in software to evaluate parameters such as mean, maxima, full width at half maximum, skewness and kurtosis for the images. We believe that these parameters can help to improve the diagnosis and to further classify normal and abnormal tissue sections. These parameters can be used independently as well as in tandem with other parameters as features in classification algorithms that involve the use of neural networks or principal component analysis.

  3. Classification of functional voice disorders based on phonovibrograms.

    Science.gov (United States)

    Voigt, Daniel; Döllinger, Michael; Braunschweig, Thomas; Yang, Anxiong; Eysholdt, Ulrich; Lohscheller, Jörg

    2010-05-01

    This work presents a computer-aided method for automatically and objectively classifying individuals with healthy and dysfunctional vocal fold vibration patterns as depicted in clinical high-speed (HS) videos of the larynx. By employing a specialized image segmentation and vocal fold movement visualization technique - namely phonovibrography - a novel set of numerical features is derived from laryngeal HS videos capturing the dynamic behavior and the symmetry of oscillating vocal folds. In order to assess the discriminatory power of the features, a support vector machine is applied to the preprocessed data with regard to clinically relevant diagnostic tasks. Finally, the classification performance of the learned nonlinear models is evaluated to allow for conclusions to be drawn about suitability of features and data resulting from different examination paradigms. As a reference, a second feature set is determined which corresponds to more traditional voice analysis approaches. For the first time an automatic classification of healthy and pathological voices could be obtained by analyzing the vibratory patterns of vocal folds using phonovibrograms (PVGs). An average classification accuracy of approximately 81% was achieved for 2-class discrimination with PVG features. This exceeds the results obtained through traditional voice analysis features. Furthermore, a relevant influence of phonation frequency on classification accuracy was substantiated by the clinical HS data. The PVG feature extraction and classification approach can be assessed as being promising with regard to the diagnosis of functional voice disorders. The obtained results indicate that an objective analysis of dysfunctional vocal fold vibration can be achieved with considerably high accuracy. Moreover, the PVG classification method holds a lot of potential when it comes to the clinical assessment of voice pathologies in general, as the diagnostic support can be provided to the voice clinician in a

  4. Pooling breast cancer datasets has a synergetic effect on classification performance and improves signature stability

    Directory of Open Access Journals (Sweden)

    van de Vijver Marc J

    2008-08-01

    Background: Michiels et al. (Lancet 2005; 365: 488–92) employed a resampling strategy to show that the genes identified as predictors of prognosis from resamplings of a single gene expression dataset are highly variable. The genes most frequently identified in the separate resamplings were put forward as a 'gold standard'. On a higher level, breast cancer datasets collected by different institutions can be considered as resamplings from the underlying breast cancer population. The limited overlap between published prognostic signatures confirms the trend of signature instability identified by the resampling strategy. Six breast cancer datasets, totaling 947 samples, all measured on the Affymetrix platform, are currently available. This provides a unique opportunity to employ a substantial dataset to investigate the effects of pooling datasets on classifier accuracy, signature stability and enrichment of functional categories. Results: We show that the resampling strategy produces a suboptimal ranking of genes, which cannot be considered to be a 'gold standard'. When pooling breast cancer datasets, we observed a synergetic effect on the classification performance in 73% of the cases. We also observe a significant positive correlation between the number of datasets that are pooled, the validation performance, the number of genes selected, and the enrichment of specific functional categories. In addition, we have evaluated the support for five explanations that have been postulated for the limited overlap of signatures. Conclusion: The limited overlap of current signature genes can be attributed to small sample size. Pooling datasets results in more accurate classification and a convergence of signature genes. We therefore advocate the analysis of new data within the context of a compendium, rather than analysis in isolation.

  5. CT imaging-based determination and classification of anatomic variations of left gastric vein.

    Science.gov (United States)

    Wu, Yongyou; Chen, Guangqiang; Wu, Pengfei; Zhu, Jianbin; Peng, Wei; Xing, Chungen

    2017-03-01

    Precise determination and classification of left gastric vein (LGV) anatomy are helpful in planning for gastric surgery, in particular, for resection of gastric cancer. However, the anatomy of LGV is highly variable. A systematic classification of its variations is still to be proposed. We aimed to investigate the anatomical variations in LGV using CT imaging and develop a new nomenclature system. We reviewed CT images and tracked the course of LGV in 825 adults. The frequencies of common and variable LGV anatomical courses were recorded. Anatomic variations of LGV were proposed and classified into different types mainly based on its courses. The inflow sites of LGV into the portal system were also considered if common hepatic artery (CHA) or splenic artery (SA) could not be used as a frame of reference due to variations. Detailed anatomy and courses of LGV were depicted on CT images. Using CHA and SA as the frames of reference, the routes of LGV were divided into six types (i.e., PreS, RetroS, Mid, PreCH, RetroCH, and Supra). The inflow sites were classified into four types (i.e., PV, SV, PSV, and LPV). The new classification was mainly based on the courses of LGV, which was validated with MDCT in the 805 cases with an identifiable LGV, namely type I, RetroCH, 49.8 % (401/805); type II, PreS, 20.6 % (166/805); type III, Mid, 20.0 % (161/805); type IV, RetroS, 7.3 % (59/805); type V, Supra, 1.5 % (12/805); and type VI, PreCH, 0.7 % (6/805). Type VII, designated to the cases in which SA and CHA could not be used as frames of reference, was not observed in this series. Detailed depiction of the anatomy and courses of LGV on CT images allowed us to evaluate and develop a new classification and nomenclature system for the anatomical variations of LGV.
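
    The type frequencies quoted in this abstract can be checked with simple arithmetic. The short script below uses only the counts stated in the abstract (six observed types among the 805 cases with an identifiable LGV); the variable names are illustrative.

```python
# Case counts per LGV type as quoted in the abstract.
counts = {
    "I (RetroCH)": 401,
    "II (PreS)": 166,
    "III (Mid)": 161,
    "IV (RetroS)": 59,
    "V (Supra)": 12,
    "VI (PreCH)": 6,
}

total = sum(counts.values())  # should match the 805 identifiable cases
percentages = {name: round(100 * n / total, 1) for name, n in counts.items()}

for name, pct in percentages.items():
    print(f"type {name}: {pct} %")
```

    The counts sum to 805, and the rounded shares reproduce the reported 49.8, 20.6, 20.0, 7.3, 1.5 and 0.7 percent.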

  6. Factors influencing the discrimination and classification of prostate cancer cell lines by FTIR microspectroscopy.

    Science.gov (United States)

    Harvey, T J; Gazi, E; Henderson, A; Snook, R D; Clarke, N W; Brown, M; Gardner, P

    2009-06-01

    In this study we obtained Fourier transform infrared (FTIR) spectra of fixed prostate cell lines of differing types as well as the primary epithelial cells from benign prostatic hyperplasia (BPH). Results showed that by using multivariate chemometric analysis it was possible to discriminate and classify these cell lines, which gave rise to sensitivity and specificity values of >94% and >98%, respectively. Following on from these results the possible influences of different factors on the discrimination and classification of the prostate cell lines were examined. Firstly, the effect of using different growth media during cell culturing was investigated, with results indicating that this did not influence chemometric discrimination. Secondly, differences in the nucleus-to-cytoplasm (N/C) ratio were examined, and it was concluded that this factor was not the main reason for the discrimination and classification of the prostate cancer (CaP) cell lines. In conclusion, given that neither growth media nor N/C ratio could totally explain the classification, it is likely that actual biochemical differences between the cell lines are the major contributing factor.
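
    For readers unfamiliar with the two figures of merit quoted here, the following sketch shows how sensitivity and specificity are computed from confusion-matrix counts. The counts in the usage lines are invented for illustration and are not taken from the study.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of actual positives that the classifier labels positive."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of actual negatives that the classifier labels negative."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical counts: 100 spectra of the target cell line, 100 of the others.
print(sensitivity(true_positives=95, false_negatives=5))
print(specificity(true_negatives=99, false_positives=1))
```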

  7. Genetic programming based ensemble system for microarray data classification.

    Science.gov (United States)

    Liu, Kun-Hong; Tong, Muchenxuan; Xie, Shu-Tong; Yee Ng, Vincent To

    2015-01-01

    Recently, more and more machine learning techniques have been applied to microarray data analysis. The aim of this study is to propose a genetic programming (GP) based new ensemble system (named GPES), which can be used to effectively classify different types of cancers. Decision trees are deployed as base classifiers in this ensemble framework with three operators: Min, Max, and Average. Each individual of the GP is an ensemble system, and they become more and more accurate in the evolutionary process. The feature selection technique and balanced subsampling technique are applied to increase the diversity in each ensemble system. The final ensemble committee is selected by a forward search algorithm, which is shown to be capable of fitting data automatically. The performance of GPES is evaluated using five binary class and six multiclass microarray datasets, and results show that the algorithm can achieve better results in most cases compared with some other ensemble systems. By using elaborate base classifiers or applying other sampling techniques, the performance of GPES may be further improved.
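
    The three combination operators named in this abstract (Min, Max and Average) can be illustrated as follows. This sketch assumes, without confirmation from the abstract, that each base classifier outputs one score per class and that the operator is applied class by class before taking the arg-max; the function name `combine` and the scores are hypothetical.

```python
def combine(score_vectors, operator):
    """Fuse per-class scores from several base classifiers.

    score_vectors: one list of class scores per base classifier.
    operator: "min", "max" or "average".
    Returns (predicted_class_index, fused_scores).
    """
    per_class = list(zip(*score_vectors))  # regroup the scores by class
    if operator == "min":
        fused = [min(scores) for scores in per_class]
    elif operator == "max":
        fused = [max(scores) for scores in per_class]
    elif operator == "average":
        fused = [sum(scores) / len(scores) for scores in per_class]
    else:
        raise ValueError(f"unknown operator: {operator}")
    return fused.index(max(fused)), fused


# Three hypothetical decision trees voting on a two-class problem.
tree_outputs = [[0.9, 0.1], [0.3, 0.7], [0.2, 0.8]]
for op in ("min", "max", "average"):
    print(op, combine(tree_outputs, op))
```

    In this toy case Min and Max both pick class 0 because one tree is very confident, while Average picks class 1, which shows why the choice of operator matters.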

  8. Genetic Programming Based Ensemble System for Microarray Data Classification

    Directory of Open Access Journals (Sweden)

    Kun-Hong Liu

    2015-01-01

    Recently, more and more machine learning techniques have been applied to microarray data analysis. The aim of this study is to propose a genetic programming (GP) based new ensemble system (named GPES), which can be used to effectively classify different types of cancers. Decision trees are deployed as base classifiers in this ensemble framework with three operators: Min, Max, and Average. Each individual of the GP is an ensemble system, and they become more and more accurate in the evolutionary process. The feature selection technique and balanced subsampling technique are applied to increase the diversity in each ensemble system. The final ensemble committee is selected by a forward search algorithm, which is shown to be capable of fitting data automatically. The performance of GPES is evaluated using five binary class and six multiclass microarray datasets, and results show that the algorithm can achieve better results in most cases compared with some other ensemble systems. By using elaborate base classifiers or applying other sampling techniques, the performance of GPES may be further improved.

  9. Parotid cancer: Impact of changes from the 1997 to the 2002 American Joint Committee on Cancer classification on outcome prediction.

    Science.gov (United States)

    Schroeder, Ursula; Groppe, Daniela; Mueller, Rolf-Peter; Guntinas-Lichius, Orlando

    2008-08-15

    The TNM classification [American Joint Committee on Cancer (AJCC)] of salivary gland cancer was revised again in 2002. In the present study, the outcome prediction of the new TNM system was compared with the old 1997 TNM system in 202 patients with primary parotid cancer. All patients treated from 1986 to 2006 were reclassified in both TNM systems. Disease-free survival (DFS) rates were calculated. The prognostic validity of both the TNM system and other factors was analyzed univariately (log-rank test) and multivariately (Cox regression). AJCC TNM stage changes from 1997 to 2002 altered the disease staging in 35% of the patients. Concerning DFS, the new TNM 2002 staging allowed significantly better separation of stage III, but not of stage I versus stage II. The TNM 2002 staging was the most powerful predictor for DFS according to multivariate analysis. The 1997 system showed no independent significance. The subclassification of the new stage IV was not satisfactory; no clear distinction of IVA versus III, and IVA versus IVB was possible. The TNM 2002 staging is more valid than the 1997 system, but a significant problem was observed in separating stage I from stage II, and within the stage IV subgroups. © 2008 American Cancer Society

  10. Automatic classification of acetowhite temporal patterns to identify precursor lesions of cervical cancer

    Science.gov (United States)

    Gutiérrez-Fragoso, K.; Acosta-Mesa, H. G.; Cruz-Ramírez, N.; Hernández-Jiménez, R.

    2013-12-01

    Cervical cancer remains a serious public health problem in developing countries. The most common method of screening is the Pap test or cytology. When abnormalities are reported in the result, the patient is referred to a dysplasia clinic for colposcopy. During this test, a solution of acetic acid is applied, which produces a color change in the tissue known as the acetowhitening phenomenon. This reaction is used to guide the collection of a tissue sample, whose histological analysis allows a final diagnosis to be established. During the colposcopy test, digital images can be acquired to analyze the behavior of the acetowhitening reaction from a temporal approach. In this way, we try to identify precursor lesions of cervical cancer through a process of automatic classification of acetowhite temporal patterns. In this paper, we present the performance analysis of three classification methods: kNN, Naïve Bayes and C4.5. The results showed that there is similarity between some acetowhite temporal patterns of normal and abnormal tissues. Therefore, we conclude that it is not sufficient to consider only the temporal dynamics of the acetowhitening reaction to establish a diagnosis by an automatic method. Information from cytologic, colposcopic and histopathologic disciplines should be integrated as well.

  11. Phylogeny and classification of Dickeya based on multilocus sequence analysis.

    Science.gov (United States)

    Marrero, Glorimar; Schneider, Kevin L; Jenkins, Daniel M; Alvarez, Anne M

    2013-09-01

    Bacterial heart rot of pineapple reported in Hawaii in 2003 and reoccurring in 2006 was caused by an undetermined species of Dickeya. Classification of the bacterial strains isolated from infected pineapple to one of the recognized Dickeya species and their phylogenetic relationships with Dickeya were determined by a multilocus sequence analysis (MLSA), based on the partial gene sequences of dnaA, dnaJ, dnaX, gyrB and recN. Individual and concatenated gene phylogenies revealed that the strains form a clade with reference Dickeya sp. isolated from pineapple in Malaysia and are closely related to D. zeae; however, previous DNA-DNA reassociation values suggest that these strains do not meet the genomic threshold for consideration in D. zeae, and require further taxonomic analysis. An analysis of the markers used in this MLSA determined that recN was the best overall marker for resolution of species within Dickeya. Differential intraspecies resolution was observed with the other markers, suggesting that marker selection is important for defining relationships within a clade. Phylogenies produced with gene sequences from the sequenced genomes of strains D. dadantii Ech586, D. dadantii Ech703 and D. zeae Ech1591 did not place the sequenced strains with members of other well-characterized members of their respective species. The average nucleotide identity (ANI) and tetranucleotide frequencies determined for the sequenced strains corroborated the results of the MLSA that D. dadantii Ech586 and D. dadantii Ech703 should be reclassified as Dickeya zeae Ech586 and Dickeya paradisiaca Ech703, respectively, whereas D. zeae Ech1591 should be reclassified as Dickeya chrysanthemi Ech1591.
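
    As a rough illustration of the tetranucleotide frequencies mentioned in this abstract, the sketch below counts overlapping 4-mers in a single DNA string. It is a simplification: genome-comparison tools typically add normalization steps and strand handling that are not shown, and the function name and example sequence are invented.

```python
from collections import Counter


def tetranucleotide_frequencies(sequence):
    """Relative frequency of each overlapping 4-mer on one strand."""
    seq = sequence.upper()
    kmers = [seq[i:i + 4] for i in range(len(seq) - 3)]
    # Skip windows containing ambiguous bases such as N.
    kmers = [k for k in kmers if set(k) <= set("ACGT")]
    counts = Counter(kmers)
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


# Toy sequence; real analyses use whole-genome sequences.
freqs = tetranucleotide_frequencies("ACGTACGT")
print(freqs)
```

    Comparing such frequency profiles between genomes gives an alignment-free similarity signal that complements sequence-identity measures such as ANI.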

  12. Ulises: A Agent-Based System For Timbre Classification

    Directory of Open Access Journals (Sweden)

    Eduardo Porto TEIXEIRA

    2017-07-01

    The Sound and Music Computing (SMC) field has grown over the years, with ever more conferences and specialized researchers in the area. The sub-field of Music Information Retrieval (MIR), one of the main research fields in SMC, has focused on extracting information from sound data. The most critical question regarding the human perception of sound is which qualities of musical instrument sounds allow their sound sources to be recognized. There are four main sound dimensions: pitch, loudness, duration and timbre. The fourth dimension, timbre, is the vaguest and most complex: a high-level, multidimensional property. Timbre recognition is an area of high interest within MIR and appears in several state-of-the-art SMC papers. In Multi-Agent Systems (MAS), the term autonomous refers to the fact that agents have their own existence, regardless of the existence of other agents, and are able to make their own decisions without outside interference. Agent technology is particularly suitable for musical applications because a computational agent can be associated with the role of a singer or instrumentalist, as seen in state-of-the-art work in the SMC area. In this context, this paper proposes an agent-based approach to timbre recognition, focusing on the parallelization of the classification model. To this end, we assign a timbre recognition method to different agents, where each agent is an entity specialized in a particular timbre, characteristic of a specific instrument, seeking a distributed solution to the timbre recognition problem.

  13. Leveraging Sequence Classification by Taxonomy-Based Multitask Learning

    Science.gov (United States)

    Widmer, Christian; Leiva, Jose; Altun, Yasemin; Rätsch, Gunnar

    In this work we consider an inference task that biologists are very good at: deciphering biological processes by bringing together knowledge that has been obtained by experiments using various organisms, while respecting the differences and commonalities of these organisms. We look at this problem from a sequence analysis point of view, where we aim at solving the same classification task in different organisms. We investigate the challenge of combining information from several organisms, where we consider the relation between the organisms to be defined by a tree structure derived from their phylogeny. Multitask learning, a machine learning technique that has recently received considerable attention, considers the problem of learning across tasks that are related to each other. We treat each organism as one task and present three novel multitask learning methods to handle situations in which the relationships among tasks can be described by a hierarchy. These algorithms are designed for large-scale applications and are therefore applicable to problems with a large number of training examples, which are frequently encountered in sequence analysis. We perform experimental analyses on synthetic data sets in order to illustrate the properties of our algorithms. Moreover, we consider a problem from genomic sequence analysis, namely splice site recognition, to illustrate the usefulness of our approach. We show that intelligently combining data from 15 eukaryotic organisms can indeed significantly improve the prediction performance compared to traditional learning approaches. From a broader perspective, we expect that algorithms like the ones presented in this work have the potential to complement and enrich the strategies of homology-based sequence analysis that are currently the quasi-standard in biological sequence analysis.

  14. SPAM CLASSIFICATION BASED ON SUPERVISED LEARNING USING MACHINE LEARNING TECHNIQUES

    Directory of Open Access Journals (Sweden)

    T. Hamsapriya

    2011-12-01

    E-mail is one of the most popular and frequently used means of communication due to its worldwide accessibility, relatively fast message transfer, and low sending cost. The flaws in the e-mail protocols and the increasing volume of electronic business and financial transactions directly contribute to the increase in e-mail-based threats. E-mail spam is one of the major problems of today's Internet, bringing financial damage to companies and annoying individual users. Spam e-mails reach users without their consent and fill their mailboxes. They consume network capacity as well as the time spent checking and deleting them. The vast majority of Internet users are outspoken in their disdain for spam, although enough of them respond to commercial offers that spam remains a viable source of income for spammers. While most users want to do the right thing to avoid and get rid of spam, they need clear and simple guidelines on how to behave. In spite of all the measures taken to eliminate spam, it has not yet been eradicated. Moreover, when the countermeasures are oversensitive, even legitimate e-mails are eliminated. Among the approaches developed to stop spam, filtering is one of the most important techniques. Much research in spam filtering has centered on the more sophisticated classifier-related issues. Recently, machine learning for spam classification has become an important research issue. The proposed work explores and identifies the use of different learning algorithms for classifying spam messages from e-mail. A comparative analysis among the algorithms is also presented.

  15. Glomerulus Classification and Detection Based on Convolutional Neural Networks

    Directory of Open Access Journals (Sweden)

    Jaime Gallego

    2018-01-01

    Glomerulus classification and detection in kidney tissue segments are key processes in nephropathology used for the correct diagnosis of diseases. In this paper, we deal with the challenge of automating Glomerulus classification and detection from digitized kidney slide segments using a deep learning framework. The proposed method applies Convolutional Neural Networks (CNNs) to distinguish between two classes, Glomerulus and Non-Glomerulus, and so detect the image segments belonging to Glomerulus regions. We configure the CNN with the public pre-trained AlexNet model and adapt it to our system by learning from Glomerulus and Non-Glomerulus regions extracted from training slides. Once the model is trained, labeling is performed by applying the CNN classification to the image blocks under analysis. The results of the method indicate that this technique is suitable for correct Glomerulus detection in Whole Slide Images (WSI), showing robustness while reducing false positive and false negative detections.

  16. A thyroid nodule classification method based on TI-RADS

    Science.gov (United States)

    Wang, Hao; Yang, Yang; Peng, Bo; Chen, Qin

    2017-07-01

    Thyroid Imaging Reporting and Data System (TI-RADS) is a valuable tool for differentiating benign from malignant thyroid nodules. In clinical practice, doctors can use the TI-RADS classes to grade how likely a nodule is to be benign or malignant. The class represents the degree of malignancy of a thyroid nodule. As a classification standard, TI-RADS can guide the ultrasound physician to examine thyroid nodules more accurately and reliably. In this paper, we aim to classify thyroid nodules with the help of TI-RADS. To this end, four ultrasound signs (cystic versus solid composition, echo pattern, boundary features and calcification) are extracted and converted into feature vectors. A semi-supervised fuzzy C-means ensemble (SS-FCME) model is then applied to obtain the classification results. The experimental results demonstrate that the proposed method can help doctors diagnose thyroid nodules effectively.
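
    The abstract does not describe the SS-FCME model in detail, so the sketch below shows only the standard fuzzy C-means membership rule that such models build on, not the semi-supervised ensemble itself. It assumes one-dimensional features and fixed cluster centers; the function name and the numbers are illustrative.

```python
def fcm_memberships(x, centers, m=2.0):
    """Standard fuzzy C-means membership of scalar x in each cluster.

    m is the fuzzifier (m > 1); larger values give softer memberships.
    """
    distances = [abs(x - c) for c in centers]
    # A point sitting exactly on a center belongs fully to that cluster.
    if 0.0 in distances:
        hits = [1.0 if d == 0.0 else 0.0 for d in distances]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_j) ** exponent for d_j in distances)
        for d_i in distances
    ]


# Hypothetical feature value and two cluster centers.
print(fcm_memberships(2.0, [0.0, 10.0]))
```

    Unlike a hard assignment, every sample receives a degree of membership in each cluster, and the degrees sum to one.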

  17. Novel Evidence-Based Classification of Cavernous Venous Occlusive Disease.

    Science.gov (United States)

    Pathak, Ram A; Rawal, Bhupendra; Li, Zhuo; Broderick, Gregory A

    2016-10-01

    The primary aim of our study was to determine whether an evidence-based rationale could categorize cavernous venous occlusive disease into mild, moderate and severe erectile dysfunction. A total of 863 patients underwent color duplex Doppler ultrasound from January 2010 to June 2013 performed by a single urologist. We identified a cohort of 75 patients (8.7%) with a diagnosis of cavernous venous occlusive disease based on a unilateral resistive index less than 0.9, and right and left peak systolic velocity 35 cm per second or less after visual sexual stimulation. At a median followup of 13 months patients were evaluated for treatment efficacy. A total of 75 patients with a median age of 60 years (range 19 to 83) and a mean body mass index of 26.3 kg/m² (range 19.0 to 39.3) satisfied the criteria of cavernous venous occlusive disease. When substratified into tertiles, resistive index cutoffs were obtained, including mild cavernous venous occlusive disease (81.6 to 94.0), moderate disease (72.6 to 81.5) and severe disease (59.5 to 72.5). Using these 3 groups the phosphodiesterase type 5-inhibitor failure rate (p = 0.017) and SHIM (Sexual Health Inventory for Men) score categories (1 to 10 vs 11 to 20, p = 0.030) were statistically significantly different for mild, moderate and severe cavernous venous occlusive disease. Treatment satisfaction was also statistically significantly different. Penile prosthetic placement was a more common outcome among patients with erectile dysfunction and more severe cavernous venous occlusive disease. Our retrospective analysis supports a correlation between the phosphodiesterase type 5 inhibitor failure rate, SHIM score and the rate of surgical intervention using resistive index values. Our data further suggest that an evidence-based classification of cavernous venous occlusive disease by color Doppler ultrasound is possible and can triage patients to penile prosthetic placement. Copyright © 2016 American Urological Association

  18. [Land cover classification of Four Lakes Region in Hubei Province based on MODIS and ENVISAT data].

    Science.gov (United States)

    Xue, Lian; Jin, Wei-Bin; Xiong, Qin-Xue; Liu, Zhang-Yong

    2010-03-01

    Based on differences in the backscattering coefficient in ENVISAT ASAR data, towns, waters, and vegetation-covered areas in the Four Lakes Region of Hubei Province were classified. According to the local cropping systems and phenological characteristics in the region, and by using the discrepancies of the MODIS-NDVI index from late April to early May, the vegetation-covered areas were classified into croplands and non-croplands. The classification results based on the above-mentioned procedure were verified against the classification results based on ETM data with high spatial resolution. Based on the DEM data, the non-croplands were categorized into forest land and bottomland; and based on the discrepancies of the monthly mean NDVI index, the crops were identified as mid rice, late rice, and cotton, and the croplands were identified as paddy field and upland field. The land cover classification based on the MODIS data with low spatial resolution was basically consistent with that based on the ETM data with high spatial resolution, and the total error rate was about 13.15% when the classification results based on ETM data were taken as the standard. Using the above-mentioned procedures for large-scale land cover classification and mapping could enable fast tracking of regional land cover.

  19. Initial steps towards an evidence-based classification system for golfers with a physical impairment

    NARCIS (Netherlands)

    Stoter, Inge K.; Hettinga, Florentina J.; Altmann, Viola; Eisma, Wim; Arendzen, Hans; Bennett, Tony; van der Woude, Lucas H.; Dekker, Rienk

    2017-01-01

    Purpose: The present narrative review aims to make a first step towards an evidence-based classification system in handigolf following the International Paralympic Committee (IPC). It intends to create a conceptual framework of classification for handigolf and an agenda for future research. Method:

  20. Using Discrete Loss Functions and Weighted Kappa for Classification: An Illustration Based on Bayesian Network Analysis

    Science.gov (United States)

    Zwick, Rebecca; Lenaburg, Lubella

    2009-01-01

    In certain data analyses (e.g., multiple discriminant analysis and multinomial log-linear modeling), classification decisions are made based on the estimated posterior probabilities that individuals belong to each of several distinct categories. In the Bayesian network literature, this type of classification is often accomplished by assigning…
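
    The two ideas named in the title can be sketched in a few lines. This is a generic illustration under stated assumptions, not the authors' analysis: `expected_loss_decision` picks the category with the smallest expected loss given posterior probabilities, and `weighted_kappa` is the standard weighted kappa with disagreement weights (|i - j| / (k - 1)) raised to `power`. Both function names and the example loss matrices are hypothetical.

```python
def expected_loss_decision(posteriors, loss):
    """Return the category j that minimizes sum_i posteriors[i] * loss[i][j]."""
    k = len(posteriors)
    expected = [sum(posteriors[i] * loss[i][j] for i in range(k)) for j in range(k)]
    return min(range(k), key=lambda j: expected[j])


def weighted_kappa(confusion, power=1):
    """Weighted kappa for a k x k confusion matrix (power=1 linear, power=2 quadratic)."""
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        # Disagreement weight: 0 on the diagonal, 1 for the most distant categories.
        return (abs(i - j) / (k - 1)) ** power

    observed = sum(weight(i, j) * confusion[i][j]
                   for i in range(k) for j in range(k)) / total
    expected = sum(weight(i, j) * row_totals[i] * col_totals[j]
                   for i in range(k) for j in range(k)) / (total * total)
    # Note: expected is zero when all cases fall in one category; kappa is undefined then.
    return 1.0 - observed / expected


if __name__ == "__main__":
    zero_one = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
    absolute = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
    posteriors = [0.4, 0.35, 0.25]
    print(expected_loss_decision(posteriors, zero_one))
    print(expected_loss_decision(posteriors, absolute))
```

    With a 0-1 loss the rule reduces to choosing the most probable category; with an ordinal loss such as |i - j| the chosen category can differ, which is the kind of contrast a discrete loss function makes visible.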

  1. A comparative performance evaluation of neural network based approach for sentiment classification of online reviews

    Directory of Open Access Journals (Sweden)

    G. Vinodhini

    2016-01-01

    The aim of sentiment classification is to efficiently identify the emotions expressed in the form of text messages. Machine learning methods for sentiment classification have been extensively studied, due to their predominant classification performance. Recent studies suggest that ensemble based machine learning methods provide better performance in classification. Artificial neural networks (ANNs) are rarely investigated in the literature of sentiment classification. This paper compares neural network based sentiment classification methods (back propagation neural network (BPN), probabilistic neural network (PNN) and homogeneous ensemble of PNN (HEN)) using varying levels of word granularity as features for feature level sentiment classification. They are validated using a dataset of product reviews collected from the Amazon reviews website. An empirical analysis is done to compare results of ANN based methods with two statistical individual methods. The methods are evaluated using five different quality measures and results show that the homogeneous ensemble of the neural network method provides better performance. Among the two neural network approaches used, probabilistic neural networks (PNNs) perform better in classifying the sentiment of the product reviews. The integration of neural network based sentiment classification methods with principal component analysis (PCA) as a feature reduction technique also provides superior performance in terms of training time.

  2. Vision-Based Perception and Classification of Mosquitoes Using Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Masataka Fuchida

    2017-01-01

    The need for a novel automated mosquito perception and classification method has become increasingly pressing in recent years, with a steeply increasing number of mosquito-borne diseases and associated casualties. There exist remote sensing and GIS-based methods for mapping potential mosquito habitats and locations that are prone to mosquito-borne diseases, but these methods generally do not account for species-wise identification of mosquitoes in closed-perimeter regions. Traditional methods for mosquito classification involve highly manual processes requiring tedious sample collection and supervised laboratory analysis. In this research work, we present the design and experimental validation of an automated vision-based mosquito classification module that can be deployed in closed-perimeter mosquito habitats. The module is capable of distinguishing mosquitoes from other bugs such as bees and flies by extracting morphological features, followed by support vector machine-based classification. In addition, this paper presents the results of three variants of the support vector machine classifier in the context of the mosquito classification problem. This vision-based approach to the mosquito classification problem presents an efficient alternative to the conventional methods for mosquito surveillance, mapping and sample image collection. Experimental results involving classification between mosquitoes and a predefined set of other bugs using multiple classification strategies demonstrate the efficacy and validity of the proposed approach with a maximum recall of 98%.

  3. Feature selection and classification of MAQC-II breast cancer and multiple myeloma microarray gene expression data.

    Directory of Open Access Journals (Sweden)

    Qingzhong Liu

    Microarray data has a high dimension of variables but available datasets usually have only a small number of samples, thereby making the study of such datasets interesting and challenging. In the task of analyzing microarray data for the purpose of, e.g., predicting gene-disease association, feature selection is very important because it provides a way to handle the high dimensionality by exploiting information redundancy induced by associations among genetic markers. Judicious feature selection in microarray data analysis can result in significant reduction of cost while maintaining or improving the classification or prediction accuracy of learning machines that are employed to sort out the datasets. In this paper, we propose a gene selection method called Recursive Feature Addition (RFA), which combines supervised learning and statistical similarity measures. We compare our method with the following gene selection methods: Support Vector Machine Recursive Feature Elimination (SVMRFE), Leave-One-Out Calculation Sequential Forward Selection (LOOCSFS), and Gradient based Leave-one-out Gene Selection (GLGS). To evaluate the performance of these gene selection methods, we employ several popular learning classifiers on the MicroArray Quality Control phase II on predictive modeling (MAQC-II) breast cancer dataset and the MAQC-II multiple myeloma dataset. Experimental results show that gene selection is strictly paired with learning classifier. Overall, our approach outperforms other compared methods. The biological functional analysis based on the MAQC-II breast cancer dataset convinced us to apply our method for phenotype prediction. Additionally, learning classifiers also play important roles in the classification of microarray data and our experimental results indicate that the Nearest Mean Scale Classifier (NMSC) is a good choice due to its prediction reliability and its stability across the three performance measurements: Testing accuracy, MCC values, and

  4. Multi-label literature classification based on the Gene Ontology graph

    Directory of Open Access Journals (Sweden)

    Lu Xinghua

    2008-12-01

    Background: The Gene Ontology is a controlled vocabulary for representing knowledge related to genes and proteins in a computable form. The current effort of manually annotating proteins with the Gene Ontology is outpaced by the rate of accumulation of biomedical knowledge in literature, which calls for the development of text mining approaches to facilitate the process by automatically extracting the Gene Ontology annotation from literature. The task is usually cast as a text classification problem, and contemporary methods are confronted with unbalanced training data and the difficulties associated with multi-label classification. Results: In this research, we investigated methods of enhancing automatic multi-label classification of biomedical literature by utilizing the structure of the Gene Ontology graph. We have studied three graph-based multi-label classification algorithms, including a novel stochastic algorithm and two top-down hierarchical classification methods for multi-label literature classification. We systematically evaluated and compared these graph-based classification algorithms to a conventional flat multi-label algorithm. The results indicate that, through utilizing the information from the structure of the Gene Ontology graph, the graph-based multi-label classification methods can significantly improve predictions of the Gene Ontology terms implied by the analyzed text. Furthermore, the graph-based multi-label classifiers are capable of suggesting Gene Ontology annotations (to curators) that are closely related to the true annotations even if they fail to predict the true ones directly. A software package implementing the studied algorithms is available for the research community. Conclusion: Through utilizing the information from the structure of the Gene Ontology graph, the graph-based multi-label classification methods have better potential than the conventional flat multi-label classification approach to facilitate
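
    One simple way a classifier can use an ontology graph is to close each predicted label set under the ancestor relation, so that predicting a specific term also implies its more general parents. The sketch below shows only that generic idea on a toy directed acyclic graph; it is not one of the three algorithms studied, and the term names and the `parents` mapping are hypothetical.

```python
def ancestors(term, parents):
    """Collect every ancestor of `term` in a DAG given a child -> parents mapping."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def expand_labels(predicted, parents):
    """Return the predicted terms together with all of their ancestors."""
    expanded = set(predicted)
    for term in predicted:
        expanded |= ancestors(term, parents)
    return expanded


if __name__ == "__main__":
    # Toy graph: "a1" has two parents, as is allowed in a DAG (unlike a tree).
    toy_parents = {"a": ["root"], "b": ["root"], "a1": ["a", "b"]}
    print(sorted(expand_labels({"a1"}, toy_parents)))
```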

  5. Skin injury model classification based on shape vector analysis.

    Science.gov (United States)

    Röhrich, Emil; Thali, Michael; Schweitzer, Wolf

    2012-11-06

    Skin injuries can be crucial in judicial decision making. Forensic experts base their classification on subjective opinions. This study investigates whether known classes of simulated skin injuries are correctly classified statistically based on 3D surface models and derived numerical shape descriptors. Skin injury surface characteristics are simulated with plasticine. Six injury classes - abrasions, incised wounds, gunshot entry wounds, smooth and textured strangulation marks as well as patterned injuries - with 18 instances each are used for a k-fold cross validation with six partitions. Deformed plasticine models are captured with a 3D surface scanner. Mean curvature is estimated for each polygon surface vertex. Subsequently, distance distributions and derived aspect ratios, convex hulls, concentric spheres, hyperbolic points and Fourier transforms are used to generate 1284-dimensional shape vectors. Subsequent descriptor reduction maximizing SNR (signal-to-noise ratio) results in an average of 41 descriptors (varying across k-folds). With non-normal multivariate distribution of heteroskedastic data, requirements for LDA (linear discriminant analysis) are not met. Thus, shrinkage parameters of RDA (regularized discriminant analysis) are optimized, yielding a best performance with λ = 0.99 and γ = 0.001. Receiver Operating Characteristic of a descriptive RDA yields an ideal Area Under the Curve of 1.0 for all six categories. Predictive RDA results in an average CRR (correct recognition rate) of 97.22% under a 6 partition k-fold. Adding uniform noise within the range of one standard deviation degrades the average CRR to 71.3%. Digitized 3D surface shape data can be used to automatically classify idealized shape models of simulated skin injuries. Deriving some well established descriptors such as histograms, saddle shape of hyperbolic points or convex hulls with subsequent reduction of dimensionality while maximizing SNR seems to work well for the data at hand, as
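
    The evaluation design described above (six classes, 18 instances each, six partitions) can be sketched as a stratified fold assignment plus a correct recognition rate. This is a minimal illustration, not the authors' pipeline: the round-robin assignment without shuffling, the function names and the shortened class labels are assumptions, and no classifier is included.

```python
from collections import Counter, defaultdict

# Shortened labels standing in for the six injury classes named in the abstract.
CLASSES = ["abrasion", "incised", "gunshot", "smooth_mark", "textured_mark", "patterned"]


def stratified_folds(labels, k):
    """Assign sample indices to k folds so that each class is spread evenly."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for position, index in enumerate(indices):
            folds[position % k].append(index)  # round-robin, no shuffling
    return folds


def correct_recognition_rate(n_correct, n_total):
    """CRR in percent."""
    return 100.0 * n_correct / n_total


if __name__ == "__main__":
    labels = [name for name in CLASSES for _ in range(18)]
    folds = stratified_folds(labels, 6)
    for fold in folds:
        print(len(fold), Counter(labels[i] for i in fold))
```

    With 18 instances per class and six partitions, each held-out fold contains three instances of every class, so every class is represented in each test partition.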

  6. Analysis and classification of oncology activities on the way to workflow based single source documentation in clinical information systems.

    Science.gov (United States)

    Wagner, Stefan; Beckmann, Matthias W; Wullich, Bernd; Seggewies, Christof; Ries, Markus; Bürkle, Thomas; Prokosch, Hans-Ulrich

    2015-12-22

    Today, cancer documentation is still a tedious task involving many different information systems even within a single institution and it is rarely supported by appropriate documentation workflows. In a comprehensive 14 step analysis we compiled diagnostic and therapeutic pathways for 13 cancer entities using a mixed approach of document analysis, workflow analysis, expert interviews, workflow modelling and feedback loops. These pathways were stepwise classified and categorized to create a final set of grouped pathways and workflows including electronic documentation forms. A total of 73 workflows for the 13 entities, based on 82 paper documentation forms in addition to computer based documentation systems, were compiled in a 724 page document comprising 130 figures, 94 tables and 23 tumour classifications as well as 12 follow-up tables. Stepwise classification made it possible to derive grouped diagnostic and therapeutic pathways for three major classes: solid entities with surgical therapy, solid entities with surgical and additional therapeutic activities, and non-solid entities. For these classes it was possible to deduce common documentation workflows to support workflow-guided single-source documentation. Clinical documentation activities within a Comprehensive Cancer Center can likely be realized in a set of three documentation workflows with conditional branching in a modern workflow supporting clinical information system.

  7. Knowledge-based sea ice classification by polarimetric SAR

    DEFF Research Database (Denmark)

    Skriver, Henning; Dierking, Wolfgang

    2004-01-01

    Polarimetric SAR images acquired at C- and L-band over sea ice in the Greenland Sea, Baltic Sea, and Beaufort Sea have been analysed with respect to their potential for ice type classification. The polarimetric data were gathered by the Danish EMISAR and the US AIRSAR which both are airborne...

  8. A vegetation-based hierarchical classification for seasonally pulsed ...

    African Journals Online (AJOL)

    The resultant dendrogram provides an objective routine for classifying floodplains in the Boro-Xudum distributary in an ecologically meaningful way. This classification will assist in monitoring changes in vegetation resulting from hydrological change. Keywords: ecological monitoring, indicator species, plant communities, ...

  9. A neural network based approach to social touch classification

    NARCIS (Netherlands)

    van Wingerden, Siewart; Uebbing, Tobias J.; Jung, Merel Madeleine; Poel, Mannes

    2014-01-01

    Touch is an important interaction modality in social interaction, for instance touch can communicate emotions and can intensify emotions communicated by other modalities. In this paper we explore the use of Neural Networks for the classification of touch. The exploration and assessment of Neural

  10. An introduction of accidents’ classification based on their outcome control

    NARCIS (Netherlands)

    Karanikas, Nektarios

    2015-01-01

    Most safety oriented organizations have established their accidents classification taking into account the magnitude of the combined adverse outcomes on humans, assets and the environment without considering the accidents' potential and the actual attempts of the involved persons to intervene with

  11. Density-based unsupervised classification for remote sensing

    NARCIS (Netherlands)

    C.H.M. van Kemenade; J.A. La Poutré (Han); R.J. Mokken

    1998-01-01

    Most image classification methods are supervised and use a parametric model of the classes that have to be detected. The models of the different classes are trained by means of a set of training regions that usually have to be marked and classified by a human interpreter. Unsupervised

  12. Atmosphere-based image classification through luminance and hue

    Science.gov (United States)

    Xu, Feng; Zhang, Yujin

    2005-07-01

    In this paper a novel image classification system is proposed. Atmosphere plays an important role in generating the scene's topic or in conveying the message behind the scene's story, which belongs to the abstract attribute level among semantic levels. First, five atmosphere semantic categories are defined according to rules of photo and film grammar, followed by global luminance and hue features. Then hierarchical SVM classifiers are applied. In each classification stage, the corresponding features are extracted and a trained linear SVM is applied, resulting in two classes. After three stages of classification, five atmosphere categories are obtained. Finally, the text annotation of the atmosphere semantics and the corresponding features by Extensible Markup Language (XML) in MPEG-7 is defined, which can be integrated into more multimedia applications (such as searching, indexing and accessing of multimedia content). The experiment is performed on Corel images and film frames. The classification results demonstrate the effectiveness of the definition of atmosphere semantic classes and the corresponding features.

  13. Simulation and classification of power quality events based on ...

    African Journals Online (AJOL)

    The STFT and Discrete Wavelet Transform are used for feature extraction. Acquired features of this step are decreased using advance algorithm of feature selection. Then these extracted features were given to the neural network as input. The simulation results showed that the classification precision of data caused by the ...

  14. Improving the potential of pixel-based supervised classification in ...

    African Journals Online (AJOL)

    Pair separation indicators and probability thresholds were used to analyse the effect of training area size and heterogeneity as well as band combinations and the use of vegetation indices. It was found that adding probability thresholds to the classification may provide a measure of suitability regarding training area ...

  15. Clinical application of a microfluidic chip for immunocapture and quantification of circulating exosomes to assist breast cancer diagnosis and molecular classification.

    Science.gov (United States)

    Fang, Shimeng; Tian, Hongzhu; Li, Xiancheng; Jin, Dong; Li, Xiaojie; Kong, Jing; Yang, Chun; Yang, Xuesong; Lu, Yao; Luo, Yong; Lin, Bingcheng; Niu, Weidong; Liu, Tingjiao

    2017-01-01

    Exosomes have attracted increasing attention in blood-based diagnosis because cancer cells release more exosomes in serum than normal cells and these exosomes overexpress a certain number of cancer-related biomarkers. However, capture and biomarker analysis of exosomes for clinical application are technically challenging. In this study, we developed a microfluidic chip for immunocapture and quantification of circulating exosomes from small sample volumes and applied this device in a clinical study. Circulating EpCAM-positive exosomes were measured in 6 breast cancer patients and 3 healthy controls to assist diagnosis. A significant increase in the EpCAM-positive exosome level in these patients was detected, compared to healthy controls. Furthermore, we quantified circulating HER2-positive exosomes in 19 breast cancer patients for molecular classification. We demonstrated that the exosomal HER2 expression levels were almost consistent with those in tumor tissues assessed by immunohistochemical staining. The microfluidic chip might provide a new platform to assist breast cancer diagnosis and molecular classification.

  16. p53-based Cancer Therapy

    Science.gov (United States)

    Lane, David P.; Cheok, Chit Fang; Lain, Sonia

    2010-01-01

    Inactivation of p53 functions is an almost universal feature of human cancer cells. This has spurred a tremendous effort to develop p53 based cancer therapies. Gene therapy using wild-type p53, delivered by adenovirus vectors, is now in widespread use in China. Other biologic approaches include the development of oncolytic viruses designed to replicate and kill only p53 defective cells and also the development of siRNA and antisense RNAs that activate p53 by inhibiting the function of the negative regulators Mdm2, MdmX, and HPV E6. The altered processing of p53 that occurs in tumor cells can elicit T-cell and B-cell responses to p53 that could be effective in eliminating cancer cells, and p53 based vaccines are now in clinical trial. A number of small molecules that directly or indirectly activate the p53 response have also reached the clinic, of which the most advanced are the p53-Mdm2 interaction inhibitors. Increased understanding of the p53 response is also allowing the development of powerful drug combinations that may increase the selectivity and safety of chemotherapy, by selective protection of normal cells and tissues. PMID:20463003

  17. Applying Topographic Classification, Based on the Hydrological Process, to Design Habitat Linkages for Climate Change

    Directory of Open Access Journals (Sweden)

    Yongwon Mo

    2017-11-01

    The use of biodiversity surrogates has been discussed in the context of designing habitat linkages to support the migration of species affected by climate change. Topography has been proposed as a useful surrogate in the coarse-filter approach, as the hydrological processes caused by topography, such as erosion and accumulation, are the basis of ecological processes. However, studies that have designed topographic linkages as habitat linkages have so far focused mainly on the shape of the topography (morphometric topographic classification), with little emphasis on the hydrological processes (generic topographic classification), to find such topographic linkages. We aimed to understand whether generic classification was valid for designing these linkages. First, we evaluated which topographic classification is more appropriate for describing actual (coniferous and deciduous) and potential (mammals and amphibians) habitat distributions. Second, we analyzed the difference in the linkages between the morphometric and generic topographic classifications. The results showed that the generic classification represented the actual distribution of the trees, but neither the morphometric nor the generic classification could represent the potential animal distributions adequately. Our study demonstrated that the topographic classes, according to the generic classification, were arranged successively according to the flow of water, nutrients, and sediment; therefore, it would be advantageous to secure linkages with a width of 1 km or more. In addition, the edge effect would be smaller than with the morphometric classification. Accordingly, we suggest that topographic characteristics, based on the hydrological process, are required to design topographic linkages for climate change.

  18. Optimal query-based relevance feedback in medical image retrieval using score fusion-based classification.

    Science.gov (United States)

    Behnam, Mohammad; Pourghassem, Hossein

    2015-04-01

    In this paper, a new content-based medical image retrieval (CBMIR) framework using an effective classification method and a novel relevance feedback (RF) approach is proposed. For a large-scale database with a diverse collection of different modalities, query image classification is necessary for two reasons: first, it reduces the computational complexity, and second, it increases the influence of data fusion by removing unimportant data and focusing on the more valuable information. Hence, we find the probability distribution of classes in the database using a Gaussian mixture model (GMM) for each feature descriptor, and then, using the fusion of the scores obtained from the dependency probabilities, the most relevant clusters are identified for a given query. Afterwards, the visual similarity of the query image and the images in relevant clusters is calculated. This method is performed separately on all feature descriptors, and then the results are fused together using a feature similarity ranking level fusion algorithm. In the RF level, we propose a new approach to find the optimal queries based on relevant images. The main idea is based on density function estimation of positive images and a strategy of moving toward the aggregation of the estimated density function. The proposed framework has been evaluated on the ImageCLEF 2005 database consisting of 10,000 medical X-ray images of 57 semantic classes. The experimental results show that, compared with existing CBMIR systems, our framework obtains acceptable performance both in image classification and in image retrieval by RF.

  19. Segmentation-Based PolSAR Image Classification Using Visual Features: RHLBP and Color Features

    Directory of Open Access Journals (Sweden)

    Jian Cheng

    2015-05-01

    A segmentation-based fully-polarimetric synthetic aperture radar (PolSAR) image classification method that incorporates texture features and color features is designed and implemented. This method is based on a framework that jointly uses statistical region merging (SRM) for segmentation and support vector machine (SVM) for classification. In the segmentation step, we propose an improved local binary pattern (LBP) operator named the regional homogeneity local binary pattern (RHLBP) to guarantee the regional homogeneity in PolSAR images. In the classification step, the color features extracted from false color images are applied to improve the classification accuracy. The RHLBP operator and color features can provide discriminative information to separate pixels and regions that have similar polarimetric features but belong to different classes. Extensive experimental comparisons with conventional methods on L-band PolSAR data demonstrate the effectiveness of our proposed method for PolSAR image classification.

  20. Maximum-margin based representation learning from multiple atlases for Alzheimer's disease classification.

    Science.gov (United States)

    Min, Rui; Cheng, Jian; Price, True; Wu, Guorong; Shen, Dinggang

    2014-01-01

    In order to establish the correspondences between different brains for comparison, spatial normalization based morphometric measurements have been widely used in the analysis of Alzheimer's disease (AD). In the literature, different subjects are often compared in one atlas space, which may be insufficient in revealing complex brain changes. In this paper, instead of deploying one atlas for feature extraction and classification, we propose a maximum-margin based representation learning (MMRL) method to learn the optimal representation from multiple atlases. Unlike traditional methods that perform the representation learning separately from the classification, we propose to learn the new representation jointly with the classification model, which is more powerful in discriminating AD patients from normal controls (NC). We evaluated the proposed method on the ADNI database, and achieved 90.69% for AD/NC classification and 73.69% for p-MCI/s-MCI classification.

  1. The method of narrow-band audio classification based on universal noise background model

    Science.gov (United States)

    Rui, Rui; Bao, Chang-chun

    2013-03-01

    Audio classification is the basis of content-based audio analysis and retrieval. The conventional classification methods mainly depend on feature extraction from audio clips, which increases the time required for classification. An approach for classifying a narrow-band audio stream based on frame-level feature extraction is presented in this paper. The audio signals are divided into speech, instrumental music, song with accompaniment and noise using the Gaussian mixture model (GMM). In order to cope with changes in the actual environment, a universal noise background model (UNBM) for white noise, street noise, factory noise and car interior noise is built. In addition, three feature schemes are considered to optimize feature selection. The experimental results show that the proposed algorithm achieves a high accuracy for audio classification, especially under each noise background we used, and keeps the classification time below one second.

  2. A kernel-based multivariate feature selection method for microarray data classification.

    Directory of Open Access Journals (Sweden)

    Shiquan Sun

    High dimensionality and small sample sizes, and their inherent risk of overfitting, pose great challenges for constructing efficient classifiers in microarray data classification. Therefore a feature selection technique should be conducted prior to data classification to enhance prediction performance. In general, filter methods can be considered as a principal or auxiliary selection mechanism because of their simplicity, scalability, and low computational complexity. However, a series of trivial examples show that filter methods result in less accurate performance because they ignore the dependencies of features. Although a few publications have devoted their attention to revealing the relationship of features by multivariate-based methods, these methods describe relationships among features only by linear methods, and simple linear combination relationships restrict the improvement in performance. In this paper, we used a kernel method to discover inherent nonlinear correlations among features as well as between feature and target. Moreover, the number of orthogonal components was determined by kernel Fisher's linear discriminant analysis (FLDA) in a self-adaptive manner rather than by manual parameter settings. In order to reveal the effectiveness of our method we performed several experiments and compared the results between our method and other competitive multivariate-based feature selectors. In our comparison, we used two classifiers (support vector machine, k-nearest neighbor) on two groups of datasets, namely two-class and multi-class datasets. Experimental results demonstrate that the performance of our method is better than others, especially on three hard-to-classify datasets, namely Wang's Breast Cancer, Gordon's Lung Adenocarcinoma and Pomeroy's Medulloblastoma.

  3. Hydrologic-Process-Based Soil Texture Classifications for Improved Visualization of Landscape Function.

    Directory of Open Access Journals (Sweden)

    Derek G Groenendyk

    Soils lie at the interface between the atmosphere and the subsurface and are a key component that controls ecosystem services, food production, and many other processes at the Earth's surface. There is a long-established convention for identifying and mapping soils by texture. These readily available, georeferenced soil maps and databases are used widely in environmental sciences. Here, we show that these traditional soil classifications can be inappropriate, contributing to bias and uncertainty in applications from slope stability to water resource management. We suggest a new approach to soil classification, with a detailed example from the science of hydrology. Hydrologic simulations based on common meteorological conditions were performed using HYDRUS-1D, spanning textures identified by the United States Department of Agriculture soil texture triangle. We consider these common conditions to be: drainage from saturation, infiltration onto a drained soil, and combined infiltration and drainage events. Using a k-means clustering algorithm, we created soil classifications based on the modeled hydrologic responses of these soils. The hydrologic-process-based classifications were compared to those based on soil texture and a single hydraulic property, Ks. Differences in classifications based on hydrologic response versus soil texture demonstrate that traditional soil texture classification is a poor predictor of hydrologic response. We then developed a QGIS plugin to construct soil maps combining a classification with georeferenced soil data from the Natural Resource Conservation Service. The spatial patterns of hydrologic response were more immediately informative, much simpler, and less ambiguous, for use in applications ranging from trafficability to irrigation management to flood control. The ease with which hydrologic-process-based classifications can be made, along with the improved quantitative predictions of soil responses and visualization

  4. Hydrologic-Process-Based Soil Texture Classifications for Improved Visualization of Landscape Function.

    Science.gov (United States)

    Groenendyk, Derek G; Ferré, Ty P A; Thorp, Kelly R; Rice, Amy K

    2015-01-01

    Soils lie at the interface between the atmosphere and the subsurface and are a key component that controls ecosystem services, food production, and many other processes at the Earth's surface. There is a long-established convention for identifying and mapping soils by texture. These readily available, georeferenced soil maps and databases are used widely in environmental sciences. Here, we show that these traditional soil classifications can be inappropriate, contributing to bias and uncertainty in applications from slope stability to water resource management. We suggest a new approach to soil classification, with a detailed example from the science of hydrology. Hydrologic simulations based on common meteorological conditions were performed using HYDRUS-1D, spanning textures identified by the United States Department of Agriculture soil texture triangle. We consider these common conditions to be: drainage from saturation, infiltration onto a drained soil, and combined infiltration and drainage events. Using a k-means clustering algorithm, we created soil classifications based on the modeled hydrologic responses of these soils. The hydrologic-process-based classifications were compared to those based on soil texture and a single hydraulic property, Ks. Differences in classifications based on hydrologic response versus soil texture demonstrate that traditional soil texture classification is a poor predictor of hydrologic response. We then developed a QGIS plugin to construct soil maps combining a classification with georeferenced soil data from the Natural Resource Conservation Service. The spatial patterns of hydrologic response were more immediately informative, much simpler, and less ambiguous, for use in applications ranging from trafficability to irrigation management to flood control. The ease with which hydrologic-process-based classifications can be made, along with the improved quantitative predictions of soil responses and visualization of landscape

  5. Cell-based therapy technology classifications and translational challenges.

    Science.gov (United States)

    Mount, Natalie M; Ward, Stephen J; Kefalas, Panos; Hyllner, Johan

    2015-10-19

    Cell therapies offer the promise of treating and altering the course of diseases which cannot be addressed adequately by existing pharmaceuticals. Cell therapies are a diverse group across cell types and therapeutic indications and have been an active area of research for many years but are now strongly emerging through translation and towards successful commercial development and patient access. In this article, we present a description of a classification of cell therapies on the basis of their underlying technologies rather than the more commonly used classification by cell type because the regulatory path and manufacturing solutions are often similar within a technology area due to the nature of the methods used. We analyse the progress of new cell therapies towards clinical translation, examine how they are addressing the clinical, regulatory, manufacturing and reimbursement requirements, describe some of the remaining challenges and provide perspectives on how the field may progress for the future. © 2015 The Authors.

  6. [A systematic review of worldwide natural history models of colorectal cancer: classification, transition rate and a recommendation for developing Chinese population-specific model].

    Science.gov (United States)

    Li, Z F; Huang, H Y; Shi, J F; Guo, C G; Zou, S M; Liu, C C; Wang, Y; Wang, L; Zhu, S L; Wu, S L; Dai, M

    2017-02-10

    Objective: To review worldwide studies of natural history models of colorectal cancer (CRC), to inform the building of a Chinese population-specific CRC model and the development of a platform for further evaluation of CRC screening and other interventions in the Chinese population. Methods: A structured literature search was conducted in PubMed for publications dated from January 1995 to December 2014. Information on classification systems for both colorectal cancer and precancer, and on the corresponding transition rates, was extracted and summarized. Indicators were mainly expressed as the medians and ranges of annual progression or regression rates. Results: A total of 24 studies were extracted from 1,022 studies; most were from America (n=9), 2 were from China (including 1 from the mainland), and most were based on a Markov model (n=22). Classification systems for adenomas included progression risk (n=9) and adenoma size (n=13, divided in two ways), as follows: 1) Based on studies where adenoma was risk-dependent, the median annual transition rates from 'normal status' to 'non-advanced adenoma', 'non-advanced' to 'advanced' and 'advanced adenoma' to CRC were 0.0160 (range: 0.0022-0.0200), 0.020 (range: 0.002-0.177) and 0.044 (range: 0.005-0.063), respectively. 2) Median annual transition rates, based on studies where adenoma were classified by sizes, into system of CRC mainly included LRD (localized/regional/distant, n=10), Dukes' (n=7) and TNM (n=3). When using the LRD classification, the median annual transition rates from 'localized' to 'regional' and 'regional' to 'distant' were 0.28 (range: 0.20-0.33) and 0.40 (range: 0.24-0.63), respectively. Under the Dukes' classification, the median annual transition rates were 0.583 (range: 0.050-0.910), 0.656 (range: 0.280-0.720) and 0.830 (range: 0.630-0.865) from Dukes' A to B, B to C and C to Dukes' D, respectively. Again, when using the TNM classification, very limited transition rate
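
    As a rough illustration of how annual transition rates like those above drive a Markov natural history model, the sketch below moves a cohort through the LRD stages using the median rates quoted in the abstract (0.28 for localized to regional, 0.40 for regional to distant). It is not the model of any reviewed study: treating the rates as annual probabilities, ignoring death, detection and treatment, and the function name `run_lrd_cohort` are all assumptions made here for illustration.

```python
def run_lrd_cohort(years, p_loc_to_reg=0.28, p_reg_to_dist=0.40):
    """Advance a cohort that starts fully 'localized' through the LRD stages.

    The defaults are the median annual transition rates quoted in the
    abstract, used here as annual probabilities (an assumption).
    Returns one (localized, regional, distant) tuple per year, year 0 first.
    """
    localized, regional, distant = 1.0, 0.0, 0.0
    history = [(localized, regional, distant)]
    for _ in range(years):
        moved_lr = localized * p_loc_to_reg   # localized -> regional
        moved_rd = regional * p_reg_to_dist   # regional -> distant
        localized -= moved_lr
        regional += moved_lr - moved_rd
        distant += moved_rd
        history.append((localized, regional, distant))
    return history


# Print the share of the cohort in each stage for the first five years.
for year, shares in enumerate(run_lrd_cohort(5)):
    print(year, [round(share, 3) for share in shares])
```

    Because no exit state is modeled, the three shares always sum to one; a realistic model would add mortality and screening-detected states.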

  7. Overfitting Reduction of Text Classification Based on AdaBELM

    OpenAIRE

    Xiaoyue Feng; Yanchun Liang; Xiaohu Shi; Dong Xu; Xu Wang; Renchu Guan

    2017-01-01

    Overfitting is an important problem in machine learning. Several algorithms, such as the extreme learning machine (ELM), suffer from this issue when facing high-dimensional sparse data, e.g., in text classification. One common issue is that the extent of overfitting is not well quantified. In this paper, we propose a quantitative measure of overfitting referred to as the rate of overfitting (RO) and a novel model, named AdaBELM, to reduce the overfitting. With RO, the overfitting problem can ...

  8. Instrument classification in polyphonic music based on timbre analysis

    Science.gov (United States)

    Zhang, Tong

    2001-07-01

    While most previous work on musical instrument recognition is focused on the classification of single notes in monophonic music, a scheme is proposed in this paper for the distinction of instruments in continuous music pieces which may contain one or more kinds of instruments. Highlights of the system include music segmentation into notes, harmonic partial estimation in polyphonic sound, note feature calculation and normalization, note classification using a set of neural networks, and music piece categorization with fuzzy logic principles. Example outputs of the system are 'the music piece is 100% guitar (with 90% likelihood)' and 'the music piece is 60% violin and 40% piano, thus a violin/piano duet'. The system has been tested with twelve kinds of musical instruments, and very promising experimental results have been obtained. An accuracy of about 80% is achieved, and the number can be raised to 90% if misindexings within the same instrument family are tolerated (e.g. cello, viola and violin). A demonstration system for musical instrument classification and music timbre retrieval is also presented.

  9. A unified classification model based on robust optimization.

    Science.gov (United States)

    Takeda, Akiko; Mitsugi, Hiroyuki; Kanamori, Takafumi

    2013-03-01

    A wide variety of machine learning algorithms such as the support vector machine (SVM), minimax probability machine (MPM), and Fisher discriminant analysis (FDA) exist for binary classification. The purpose of this letter is to provide a unified classification model that includes these models through a robust optimization approach. This unified model has several benefits. One is that the extensions and improvements intended for SVMs become applicable to MPM and FDA, and vice versa. For example, we can obtain nonconvex variants of MPM and FDA by mimicking Perez-Cruz, Weston, Hermann, and Schölkopf's (2003) extension from convex ν-SVM to nonconvex Eν-SVM. Another benefit is to provide theoretical results concerning these learning methods at once by dealing with the unified model. We give a statistical interpretation of the unified classification model and prove that the model is a good approximation for the worst-case minimization of an expected loss with respect to the uncertain probability distribution. We also propose a nonconvex optimization algorithm that can be applied to nonconvex variants of existing learning methods and show promising numerical results.

  10. Classification of weld defect based on information fusion technology for radiographic testing system

    Energy Technology Data Exchange (ETDEWEB)

    Jiang, Hongquan; Liang, Zeming, E-mail: heavenlzm@126.com; Gao, Jianmin; Dang, Changying [State Key Laboratory for Manufacturing System Engineering, Department of Mechanical Engineering, Xi’an Jiaotong University, Xi’an 710049 (China)

    2016-03-15

    Improving the efficiency and accuracy of weld defect classification is an important technical problem in developing the radiographic testing system. This paper proposes a novel weld defect classification method based on information fusion technology, Dempster–Shafer evidence theory. First, to characterize weld defects and improve the accuracy of their classification, 11 weld defect features were defined based on the sub-pixel level edges of radiographic images, four of which are presented for the first time in this paper. Second, we applied information fusion technology to combine different features for weld defect classification, including a mass function defined based on the weld defect feature information and the quartile-method-based calculation of standard weld defect class which is to solve a sample problem involving a limited number of training samples. A steam turbine weld defect classification case study is also presented herein to illustrate our technique. The results show that the proposed method can increase the correct classification rate with limited training samples and address the uncertainties associated with weld defect classification.
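
    The evidence-combination step at the heart of Dempster-Shafer theory can be sketched as follows. This is a minimal, generic implementation of Dempster's rule for two mass functions, not the paper's feature-based mass construction; the defect class names and mass values are hypothetical and chosen only for illustration.

```python
from itertools import product


def combine_dempster(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to its mass;
    the masses within one function are expected to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        overlap = a & b
        if overlap:
            combined[overlap] = combined.get(overlap, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b  # mass that would fall on the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    # Renormalize so the remaining masses sum to 1.
    return {h: mass / (1.0 - conflict) for h, mass in combined.items()}


# Hypothetical evidence from two defect features about two defect classes.
CRACK = frozenset({"crack"})
POROSITY = frozenset({"porosity"})
EITHER = CRACK | POROSITY  # mass left on the whole frame expresses ignorance

feature_1 = {CRACK: 0.6, POROSITY: 0.1, EITHER: 0.3}
feature_2 = {CRACK: 0.5, POROSITY: 0.2, EITHER: 0.3}
fused = combine_dempster(feature_1, feature_2)
print({tuple(sorted(h)): round(mass, 3) for h, mass in fused.items()})
```

    Leaving some mass on the full set of classes is what lets a weak feature say "undecided" instead of forcing a vote, which is one reason this kind of fusion is attractive when training samples are scarce.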

  11. iACP-GAEnsC: Evolutionary genetic algorithm based ensemble classification of anticancer peptides by utilizing hybrid feature space.

    Science.gov (United States)

    Akbar, Shahid; Hayat, Maqsood; Iqbal, Muhammad; Jan, Mian Ahmad

    2017-06-01

    Cancer is a fatal disease, responsible for one-quarter of all deaths in developed countries. Traditional anticancer therapies, such as chemotherapy and radiation, are highly expensive, susceptible to errors, and ineffective. These conventional techniques induce severe side effects on human cells. Given the perilous impact of cancer, the development of an accurate and highly efficient intelligent computational model is desirable for the identification of anticancer peptides. In this paper, an evolutionary intelligent genetic algorithm-based ensemble model, 'iACP-GAEnsC', is proposed for the identification of anticancer peptides. In this model, the protein sequences are formulated using three different discrete feature representation methods, i.e., amphiphilic pseudo amino acid composition, g-gap dipeptide composition, and reduced amino acid alphabet composition. The performance of the extracted feature spaces is investigated separately, and the spaces are then merged to exhibit the significance of hybridization. In addition, the predicted results of individual classifiers are combined using an optimized genetic algorithm and a simple majority technique in order to enhance the true classification rate. It is observed that genetic algorithm-based ensemble classification outperforms individual classifiers as well as the simple majority voting-based ensemble. The genetic algorithm-based ensemble classification performs best on the hybrid feature space, with an accuracy of 96.45%. In comparison to the existing techniques, the 'iACP-GAEnsC' model has achieved remarkable improvement in terms of various performance metrics. Based on the simulation results, it is observed that the 'iACP-GAEnsC' model might be a leading tool in the field of drug design and proteomics for researchers. Copyright © 2017 Elsevier B.V. All rights reserved.

  12. Single-labelled music genre classification using content-based features

    CSIR Research Space (South Africa)

    Ajoodha, R

    2015-11-01

    In this paper we use content-based features to perform automatic classification of music pieces into genres. We categorise these features into four groups: features extracted from the Fourier transform’s magnitude spectrum, features designed...

  13. Object-based land cover classification based on fusion of multifrequency SAR data and THAICHOTE optical imagery

    Science.gov (United States)

    Sukawattanavijit, Chanika; Srestasathiern, Panu

    2017-10-01

    Land Use and Land Cover (LULC) information is important for observing and evaluating environmental change. LULC classification using remotely sensed data is a technique widely employed at global and local scales, particularly in urban areas, which have diverse land cover types. These are essential components of the urban terrain and ecosystem. At present, object-based image analysis (OBIA) is becoming widely popular for land cover classification using high-resolution images. COSMO-SkyMed SAR data were fused with THAICHOTE (namely, THEOS: Thailand Earth Observation Satellite) optical data for object-based land cover classification. This paper presents a comparison between object-based and pixel-based approaches in image fusion. The per-pixel method, a support vector machine (SVM), was applied to the fused image based on Principal Component Analysis (PCA). The object-based classification was applied to the fused images to separate land cover classes using a nearest neighbor (NN) classifier. Finally, accuracy was assessed by comparing the land cover classifications generated from the fused image dataset and the THAICHOTE image. The object-based classification of fused COSMO-SkyMed and THAICHOTE images demonstrated the best classification accuracies, well over 85%. As a result, object-based data fusion provides higher land cover classification accuracy than per-pixel data fusion.

  14. Scene Classification of Remote Sensing Image Based on Multi-scale Feature and Deep Neural Network

    Directory of Open Access Journals (Sweden)

    XU Suhui

    2016-07-01

    To address the low precision of remote sensing image scene classification caused by small sample sizes, a new classification approach is proposed based on a multi-scale deep convolutional neural network (MS-DCNN), which is composed of the nonsubsampled Contourlet transform (NSCT), a deep convolutional neural network (DCNN), and a multiple-kernel support vector machine (MKSVM). Firstly, multi-scale decomposition of the remote sensing image is conducted via NSCT. Secondly, the decomposed high-frequency and low-frequency subbands are used to train the DCNN to obtain image features at different scales. Finally, MKSVM is adopted to integrate the multi-scale image features and implement remote sensing image scene classification. Experimental results on standard image classification data sets indicate that the proposed approach achieves good classification performance by combining the complementary strengths of the low-frequency and high-frequency subbands in recognizing different scenes.

  15. SVM-Based Classification of Segmented Airborne LiDAR Point Clouds in Urban Areas

    OpenAIRE

    Xiaogang Ning; Xiangguo Lin; Jixian Zhang

    2013-01-01

    Object-based point cloud analysis (OBPA) is useful for information extraction from airborne LiDAR point clouds. An object-based classification method is proposed for classifying the airborne LiDAR point clouds in urban areas herein. In the process of classification, the surface growing algorithm is employed to make clustering of the point clouds without outliers, thirteen features of the geometry, radiometry, topology and echo characteristics are calculated, a support vector machine (SVM) is ...

  16. Polsar Land Cover Classification Based on Hidden Polarimetric Features in Rotation Domain and Svm Classifier

    Science.gov (United States)

    Tao, C.-S.; Chen, S.-W.; Li, Y.-Z.; Xiao, S.-P.

    2017-09-01

    Land cover classification is an important application for polarimetric synthetic aperture radar (PolSAR) data utilization. Roll-invariant polarimetric features such as H / Ani / α / Span are commonly adopted in PolSAR land cover classification. However, the target orientation diversity effect makes PolSAR image understanding and interpretation difficult. Using only the roll-invariant polarimetric features may introduce ambiguity in the interpretation of targets' scattering mechanisms and limit the subsequent classification accuracy. To address this problem, this work first focuses on hidden polarimetric feature mining in the rotation domain along the radar line of sight, using the recently reported uniform polarimetric matrix rotation theory and the visualization and characterization tool of polarimetric coherence pattern. The former rotates the acquired polarimetric matrix along the radar line of sight and fully describes the rotation characteristics of each entry of the matrix. Sets of new polarimetric features are derived to describe the hidden scattering information of the target in the rotation domain. The latter extends the traditional polarimetric coherence at a given rotation angle to the rotation domain for complete interpretation. A visualization and characterization tool is established to derive new polarimetric features for hidden information exploration. Then, a classification scheme is developed combining both the selected new hidden polarimetric features in the rotation domain and the commonly used roll-invariant polarimetric features with a support vector machine (SVM) classifier. Comparison experiments based on AIRSAR and multi-temporal UAVSAR data demonstrate that, compared with the conventional classification scheme which only uses the roll-invariant polarimetric features, the proposed classification scheme achieves both higher classification accuracy and better robustness. For AIRSAR data, the overall classification accuracy with the proposed

  17. POLSAR LAND COVER CLASSIFICATION BASED ON HIDDEN POLARIMETRIC FEATURES IN ROTATION DOMAIN AND SVM CLASSIFIER

    Directory of Open Access Journals (Sweden)

    C.-S. Tao

    2017-09-01

    Land cover classification is an important application for polarimetric synthetic aperture radar (PolSAR) data utilization. Roll-invariant polarimetric features such as H / Ani / α / Span are commonly adopted in PolSAR land cover classification. However, the target orientation diversity effect makes PolSAR image understanding and interpretation difficult. Using only the roll-invariant polarimetric features may introduce ambiguity in the interpretation of targets’ scattering mechanisms and limit the subsequent classification accuracy. To address this problem, this work first focuses on hidden polarimetric feature mining in the rotation domain along the radar line of sight, using the recently reported uniform polarimetric matrix rotation theory and the visualization and characterization tool of polarimetric coherence pattern. The former rotates the acquired polarimetric matrix along the radar line of sight and fully describes the rotation characteristics of each entry of the matrix. Sets of new polarimetric features are derived to describe the hidden scattering information of the target in the rotation domain. The latter extends the traditional polarimetric coherence at a given rotation angle to the rotation domain for complete interpretation. A visualization and characterization tool is established to derive new polarimetric features for hidden information exploration. Then, a classification scheme is developed combining both the selected new hidden polarimetric features in the rotation domain and the commonly used roll-invariant polarimetric features with a support vector machine (SVM) classifier. Comparison experiments based on AIRSAR and multi-temporal UAVSAR data demonstrate that, compared with the conventional classification scheme which only uses the roll-invariant polarimetric features, the proposed classification scheme achieves both higher classification accuracy and better robustness. For AIRSAR data, the overall classification accuracy

  18. Long-term Prostate-specific Antigen Velocity in Improved Classification of Prostate Cancer Risk and Mortality

    DEFF Research Database (Denmark)

    Ørsted, David Dynnes; Bojesen, Stig E; Kamstrup, Pia R

    2013-01-01

    BACKGROUND: It remains unclear whether adding long-term prostate-specific antigen velocity (PSAV) to baseline PSA values improves classification of prostate cancer (PCa) risk and mortality in the general population. OBJECTIVE: To determine whether long-term PSAV improves classification of PCa risk...... classification was assessed using the net reclassification index (NRI). RESULTS: Age-adjusted hazard ratios for PCa risk and mortality were 2.7-5.3 and 2.3-3.4, respectively, for long-term PSAV when added to models already including baseline PSA values. For PCa risk and mortality, adding long-term PSAV to models....... Correspondingly, inappropriately reclassified were 49 of 10 000 men with PCa and 1658 of 10 000 men with no PCa. CONCLUSIONS: Long-term PSAV in addition to baseline PSA value improves classification of PCa risk and mortality. Applying long-term PSAV nationwide, the ratio of appropriately to inappropriately...

  19. [CT-based classification aid for acetabular fractures: evaluation and clinical testing].

    Science.gov (United States)

    Schäffler, A; Fensky, F; Knöschke, D; Haas, N P; Becken, A G; Stöckle, U; König, B

    2013-11-01

    The basis for the classification of acetabular fractures depends on accurate radiological diagnostics. The use of conventional X-rays alone entails low intrapersonal reproducibility and interpersonal reliability. By applying computed tomography (CT) at an early stage in the emergency room, the typical diagonal X-rays of ala and obturator, on which the classification is based, are no longer recommended. The aim of this study was to develop a new reliable classification system based on standardized CT slices according to the system of Judet and Letournel without using diagonal X-rays. In this study 12 selected cases with acetabular fractures were peer reviewed. In each case eight characteristic CT slices (five axial, two coronal and one sagittal) were selected as well as the conventional anteroposterior X-ray of the pelvis. All cases were peer reviewed by 14 members of the "AG Becken" (working group pelvis). The classification of the acetabular fractures was based on Judet and Letournel and the results were compared with the reference classification. The results were scaled according to differences to the original classification and the relevance to the approach as well as the medical qualification of the member. A total of 167 out of 168 possible classifications were conducted; 90 cases (54 %) were in accordance with the reference classification. In 69 cases (41 %) the outcome was different, which would have had no influence on the choice of the surgical approach. A wrong classification was present eight times (5 %). According to the medical qualification status the senior physicians were right in 54 %, the residents in 53 %. Within the group of senior physicians 7.5 % of the classifications were completely wrong and 93 % of the participating members would have preferred to have more CT slices. The CT-based classification developed represents an adaptation to the current standard of diagnostics of acetabular fractures and represents a step towards

  20. A Spectral Signature Shape-Based Algorithm for Landsat Image Classification

    Directory of Open Access Journals (Sweden)

    Yuanyuan Chen

    2016-08-01

    Land-cover datasets are crucial for earth system modeling and human-nature interaction research at local, regional and global scales. They can be obtained from remotely sensed data using image classification methods. However, most image classification methods focus on spectral values, while the spectral curve shape has seldom been used because it is difficult to quantify. This study presents a classification method based on the observation that the spectral curve is composed of segments and certain extreme values. The presented classification method quantifies the spectral curve shape and makes full use of the spectral shape differences among land covers to classify remotely sensed images. Using this method, classification maps from TM (Thematic Mapper) data were obtained with overall accuracies of 0.834 and 0.854 for the two test areas, respectively. The approach presented in this paper, which differs from previous image classification methods that were mostly concerned with spectral "value" similarity characteristics, emphasizes the "shape" similarity characteristics of the spectral curve. Moreover, this study will be helpful for classification research on hyperspectral and multi-temporal images.

  1. A computational study on convolutional feature combination strategies for grade classification in colon cancer using fluorescence microscopy data

    Science.gov (United States)

    Chowdhury, Aritra; Sevinsky, Christopher J.; Santamaria-Pang, Alberto; Yener, Bülent

    2017-03-01

    The cancer diagnostic workflow is typically performed by highly specialized and trained pathologists, whose analysis is expensive in terms of both time and money. This work focuses on grade classification in colon cancer. The analysis is performed over 3 protein markers, namely E-cadherin, beta actin and collagen IV. In addition, we also use a virtual Hematoxylin and Eosin (HE) stain. This study involves a comparison of various ways in which we can manipulate the information over the 4 different images of the tissue samples and come up with a coherent and unified response based on the data at our disposal. Pre-trained convolutional neural networks (CNNs) are the method of choice for feature extraction. The AlexNet architecture trained on the ImageNet database is used for this purpose. We extract a 4096-dimensional feature vector corresponding to the 6th layer in the network. A linear SVM is used to classify the data. The information from the 4 different images pertaining to a particular tissue sample is combined using the following techniques: soft voting, hard voting, multiplication, addition, linear combination, concatenation and multi-channel feature extraction. We observe that we obtain better results in general when we use a linear combination of the feature representations. We use 5-fold cross validation to perform the experiments. The best results are obtained when the various features are linearly combined together, resulting in a mean accuracy of 91.27%.
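
    Two of the combination techniques named above, hard voting and soft voting, can disagree on the same sample, which the following minimal sketch illustrates. It assumes each of the 4 per-image classifiers yields class scores that are comparable across images; the score values, class labels and function names are hypothetical and are not taken from the study.

```python
from collections import Counter


def hard_vote(per_image_scores):
    """Each image votes for its top-scoring class; the majority class wins."""
    votes = [max(scores, key=scores.get) for scores in per_image_scores]
    return Counter(votes).most_common(1)[0][0]


def soft_vote(per_image_scores):
    """Average the class scores over all images and pick the largest mean."""
    classes = list(per_image_scores[0])
    n_images = len(per_image_scores)
    mean_score = {
        c: sum(scores[c] for scores in per_image_scores) / n_images
        for c in classes
    }
    return max(mean_score, key=mean_score.get)


# Hypothetical scores for one tissue sample, one dict per image
# (three protein markers plus the virtual HE stain).
sample = [
    {"low": 0.55, "high": 0.45},
    {"low": 0.55, "high": 0.45},
    {"low": 0.55, "high": 0.45},
    {"low": 0.05, "high": 0.95},
]
print(hard_vote(sample))  # low: three of the four images favor it
print(soft_vote(sample))  # high: one very confident image shifts the mean
```

    Hard voting discards confidence, so three marginal votes outweigh one strong one; soft voting keeps it, which helps only if the scores are well calibrated.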

  2. Radiomic signature as a diagnostic factor for histologic subtype classification of non-small cell lung cancer.

    Science.gov (United States)

    Zhu, Xinzhong; Dong, Di; Chen, Zhendong; Fang, Mengjie; Zhang, Liwen; Song, Jiangdian; Yu, Dongdong; Zang, Yali; Liu, Zhenyu; Shi, Jingyun; Tian, Jie

    2018-02-15

    To distinguish squamous cell carcinoma (SCC) from lung adenocarcinoma (ADC) based on a radiomic signature. METHODS: This study involved 129 patients with non-small cell lung cancer (NSCLC) (81 in the training cohort and 48 in the independent validation cohort). Approximately 485 features were extracted from a manually outlined tumor region. The LASSO logistic regression model selected the key features of a radiomic signature. Receiver operating characteristic curve and area under the curve (AUC) were used to evaluate the performance of the radiomic signature in the training and validation cohorts. Five features were selected to construct the radiomic signature for histologic subtype classification. The performance of the radiomic signature to distinguish between lung ADC and SCC in both training and validation cohorts was good, with an AUC of 0.905 (95% confidence interval [CI]: 0.838 to 0.971), sensitivity of 0.830, and specificity of 0.929. In the validation cohort, the radiomic signature showed an AUC of 0.893 (95% CI: 0.789 to 0.996), sensitivity of 0.828, and specificity of 0.900. A unique radiomic signature was constructed for use as a diagnostic factor for discriminating lung ADC from SCC. Patients with NSCLC will benefit from the proposed radiomic signature. • Machine learning can be used as an auxiliary tool for distinguishing lung cancer subtypes. • Radiomic signature can discriminate lung ADC from SCC. • Radiomics can help to achieve precision medical treatment.

  3. Performance Evaluation of Frequency Transform Based Block Classification of Compound Image Segmentation Techniques

    Science.gov (United States)

    Selwyn, Ebenezer Juliet; Florinabel, D. Jemi

    2017-12-01

    Compound image segmentation plays a vital role in the compression of computer screen images. Computer screen images are images that mix textual, graphical, and pictorial content. In this paper, we present a comparison of two transform-based block classification techniques for compound images, using metrics such as classification speed, precision and recall rate. Block-based classification approaches normally divide the compound image into fixed-size, non-overlapping blocks. A frequency transform, such as the Discrete Cosine Transform (DCT) or the Discrete Wavelet Transform (DWT), is then applied to each block. The mean and standard deviation are computed for each 8 × 8 block and are used as the feature set to classify the blocks of the compound image into text/graphics and picture/background blocks. The classification accuracy of block-classification-based segmentation techniques is measured by evaluation metrics such as precision and recall rate. Compound images with smooth and complex backgrounds, containing text of varying size, colour and orientation, are considered for testing. Experimental evidence shows that DWT-based segmentation improves recall rate and precision rate by approximately 2.3% over DCT-based segmentation, at the cost of increased block classification time, for both smooth and complex background images.
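
    The precision and recall metrics used to score block classification can be computed as in the sketch below. It assumes each block carries a ground-truth label and a predicted label, with text/graphics treated as the positive class; the label strings, the sample data and the function name are hypothetical, and no transform or classifier from the paper is reproduced here.

```python
def precision_recall(true_labels, predicted_labels, positive="text"):
    """Return (precision, recall) for the chosen positive block class."""
    tp = fp = fn = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if predicted == positive and truth == positive:
            tp += 1  # text block correctly found
        elif predicted == positive:
            fp += 1  # picture block mistaken for text
        elif truth == positive:
            fn += 1  # text block missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical labels for six 8 x 8 blocks of one screen image.
truth = ["text", "text", "picture", "text", "picture", "picture"]
predicted = ["text", "picture", "picture", "text", "text", "picture"]
precision, recall = precision_recall(truth, predicted)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

    Precision penalizes picture blocks wrongly marked as text, while recall penalizes text blocks that were missed, so reporting both shows which kind of error a transform tends to make.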

  4. Breast tomosynthesis and digital mammography: a comparison of breast cancer visibility and BIRADS classification in a population of cancers with subtle mammographic findings

    Energy Technology Data Exchange (ETDEWEB)

    Andersson, Ingvar; Zackrisson, Sophia [Malmoe University Hospital, Diagnostic Centre of Imaging and Functional Medicine, Malmoe (Sweden); Ikeda, Debra M. [Stanford University, Stanford Advanced Medicine Center, Department of Radiology, Stanford, CA (United States); Ruschin, Mark [Lund University, Malmoe University Hospital, Department of Medical Radiation Physics, Malmoe (Sweden); University Health Network/Princess Margaret Hospital, Department of Radiation Physics, Toronto, ON (Canada); Svahn, Tony; Timberg, Pontus; Tingberg, Anders [Lund University, Malmoe University Hospital, Department of Medical Radiation Physics, Malmoe (Sweden)

    2008-12-15

    The main purpose was to compare breast cancer visibility in one-view breast tomosynthesis (BT) to cancer visibility in one- or two-view digital mammography (DM). Thirty-six patients were selected on the basis of subtle signs of breast cancer on DM. One-view BT was performed with the same compression angle as the DM image in which the finding was least/not visible. On BT, 25 projection images were acquired over an angular range of 50 degrees, with double the dose of one-view DM. Two expert breast imagers classified one- and two-view DM, and BT findings for cancer visibility and BIRADS cancer probability in a non-blinded consensus study. Forty breast cancers were found in 37 breasts. The cancers were rated more visible on BT compared to one-view and two-view DM in 22 and 11 cases, respectively (p<0.01 for both comparisons). Comparing one-view DM to one-view BT, 21 patients were upgraded on BIRADS classification (p<0.01). Comparing two-view DM to one-view BT, 12 patients were upgraded on BIRADS classification (p<0.01). The results indicate that the cancer visibility on BT is superior to DM, which suggests that BT may have a higher sensitivity for breast cancer detection. (orig.)

  5. Extreme Facial Expressions Classification Based on Reality Parameters

    Science.gov (United States)

    Rahim, Mohd Shafry Mohd; Rad, Abdolvahab Ehsani; Rehman, Amjad; Altameem, Ayman

    2014-09-01

    Extreme expressions are a type of emotional expression stimulated by strong emotion; tears are one example. To reproduce such features, additional elements are introduced: a fluid mechanism (particle system) and physics techniques such as smoothed-particle hydrodynamics (SPH). The fusion of facial animation with SPH exhibits promising results. Accordingly, the proposed fluid technique combined with facial animation is the core of this research, which aims to produce complex expressions such as laughing, smiling, crying (emergence of tears), or sadness that escalates to strong crying, as a classification of the extreme expressions that appear on the human face in some cases.

  6. An approach for tissue density classification in mammographic images using artificial neural network based on wavelet and curvelet transforms

    Science.gov (United States)

    Yaşar, Hüseyin; Ceylan, Murat

    2015-03-01

    Breast cancer is one of the most common types of cancer in women. Breast density is an important indicator of cancer risk. In addition, dense tissue may make diagnosis harder by hiding abnormalities in the breast. For this reason, automatic classification of breast density is of significant importance in the diagnostic process. In this study, a new system based on an Artificial Neural Network (ANN) and multiresolution analysis is suggested. Wavelet and curvelet analyses, the most commonly used forms of multiresolution analysis, were employed. Four statistics (minimum value, maximum value, mean value, and standard deviation) were extracted from the images after they were decomposed into sub-bands via multiresolution analysis. To test the system, 322 images from the MIAS database were used. The results for the different backgrounds are satisfactory: the highest classification rates were 97.16% with wavelet transform and ANN for the fatty background and 79.80% with wavelet transform and ANN for the fatty-glandular background. For the dense background, wavelet transform with ANN and curvelet transform with ANN gave the same result, an accuracy of 84.82%. The mean classification rates over the three tissue types (fatty, fatty-glandular, dense) were 84.47% with ANN, 85.71% with curvelet analysis and ANN, and 87.26% with wavelet analysis and ANN.
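
    The feature-extraction step described above (four statistics per sub-band) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a one-level Haar wavelet, which the abstract does not specify, and the names haar_subbands, band_statistics, and extract_features are hypothetical.

```python
import statistics


def haar_subbands(image):
    """One-level 2-D Haar transform of an image with even height and width.

    Assumption: Haar is used only for illustration; the abstract does not
    name the wavelet family or the number of decomposition levels.
    """
    bands = {"approx": [], "col_diff": [], "row_diff": [], "diag_diff": []}
    for r in range(0, len(image), 2):
        rows = {name: [] for name in bands}
        for c in range(0, len(image[0]), 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            rows["approx"].append((a + b + d + e) / 2)
            rows["col_diff"].append((a - b + d - e) / 2)   # left minus right
            rows["row_diff"].append((a + b - d - e) / 2)   # top minus bottom
            rows["diag_diff"].append((a - b - d + e) / 2)  # diagonal detail
        for name in bands:
            bands[name].append(rows[name])
    return bands


def band_statistics(band):
    """Minimum, maximum, mean and standard deviation of one sub-band."""
    values = [v for row in band for v in row]
    return [min(values), max(values),
            statistics.fmean(values), statistics.pstdev(values)]


def extract_features(image):
    """Concatenate the four statistics of every sub-band (16 values)."""
    features = []
    for band in haar_subbands(image).values():
        features.extend(band_statistics(band))
    return features
```

    The resulting 16-value vector is the kind of input that would then be passed to a classifier such as an ANN; the classifier itself is outside this sketch.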

  7. A New Classification Analysis of Customer Requirement Information Based on Quantitative Standardization for Product Configuration

    Directory of Open Access Journals (Sweden)

    Zheng Xiao

    2016-01-01

    Full Text Available Traditional methods used for the classification of customer requirement information are typically based on specific indicators, hierarchical structures, and data formats and involve a qualitative analysis in terms of stationary patterns. Because these methods neither consider the scalability of classification results nor do they regard subsequent application to product configuration, their classification becomes an isolated operation. However, the transformation of customer requirement information into quantifiable values would lead to a dynamic classification according to specific conditions and would enable an association with product configuration in an enterprise. This paper introduces a classification analysis based on quantitative standardization, which focuses on (i) expressing customer requirement information mathematically and (ii) classifying customer requirement information for product configuration purposes. Our classification analysis treated customer requirement information as follows: first, it was transformed into standardized values using mathematics, subsequent to which it was classified through calculating the dissimilarity with general customer requirement information related to the product family. Finally, a case study was used to demonstrate and validate the feasibility and effectiveness of the classification analysis.

  8. Graph-Based Semi-Supervised Hyperspectral Image Classification Using Spatial Information

    Science.gov (United States)

    Jamshidpour, N.; Homayouni, S.; Safari, A.

    2017-09-01

    Hyperspectral image classification has been one of the most popular research areas in the remote sensing community in the past decades. However, there are still some problems that need specific attention. For example, the lack of enough labeled samples and the high dimensionality problem are two of the most important issues that degrade the performance of supervised classification dramatically. The main idea of semi-supervised learning is to overcome these issues through the contribution of unlabeled samples, which are available in enormous amounts. In this paper, we propose a graph-based semi-supervised classification method, which uses both spectral and spatial information for hyperspectral image classification. More specifically, two graphs were designed and constructed in order to exploit the relationship among pixels in spectral and spatial spaces respectively. Then, the Laplacians of both graphs were merged to form a weighted joint graph. The experiments were carried out on two different benchmark hyperspectral data sets. The proposed method performed significantly better than the well-known supervised classification methods, such as SVM. The assessments consisted of both accuracy and homogeneity analyses of the produced classification maps. The proposed spectral-spatial SSL method considerably increased the classification accuracy when the labeled training data set is too scarce. When there were only five labeled samples for each class, the performance improved 5.92% and 10.76% compared to spatial graph-based SSL, for AVIRIS Indian Pine and Pavia University data sets respectively.
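
    The idea of merging two graph Laplacians into one weighted joint graph can be sketched as below. This is an illustration under stated assumptions, not the method of the paper: the convex weight mu, the unnormalized Laplacian, and the harmonic label-propagation step are all assumptions, and the function names are hypothetical.

```python
def laplacian(weights):
    """Unnormalized graph Laplacian L = D - W of a symmetric weight matrix."""
    n = len(weights)
    lap = [[-weights[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        lap[i][i] = sum(weights[i]) - weights[i][i]
    return lap


def joint_laplacian(l_spectral, l_spatial, mu=0.5):
    """Weighted merge of the two Laplacians (mu is an assumed mixing weight)."""
    return [[mu * a + (1.0 - mu) * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(l_spectral, l_spatial)]


def propagate(lap, labels, n_iter=200):
    """Spread +1/-1 labels to unlabeled nodes over the joint graph.

    Each unlabeled node repeatedly takes the weighted average of its
    neighbours' scores (a Gauss-Seidel solve of the harmonic equations).
    `labels` maps node index -> score for the labeled nodes only.
    """
    n = len(lap)
    f = [labels.get(i, 0.0) for i in range(n)]
    for _ in range(n_iter):
        for i in range(n):
            if i in labels or lap[i][i] == 0:
                continue  # keep labeled nodes fixed, skip isolated nodes
            f[i] = sum(-lap[i][j] * f[j] for j in range(n) if j != i) / lap[i][i]
    return f
```

    In this sketch the sign of each returned score gives the predicted class of an unlabeled pixel; a multi-class problem would need one score vector per class.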

  9. GRAPH-BASED SEMI-SUPERVISED HYPERSPECTRAL IMAGE CLASSIFICATION USING SPATIAL INFORMATION

    Directory of Open Access Journals (Sweden)

    N. Jamshidpour

    2017-09-01

    Full Text Available Hyperspectral image classification has been one of the most popular research areas in the remote sensing community in the past decades. However, there are still some problems that need specific attention. For example, the lack of enough labeled samples and the high dimensionality problem are two of the most important issues that degrade the performance of supervised classification dramatically. The main idea of semi-supervised learning is to overcome these issues through the contribution of unlabeled samples, which are available in enormous amounts. In this paper, we propose a graph-based semi-supervised classification method, which uses both spectral and spatial information for hyperspectral image classification. More specifically, two graphs were designed and constructed in order to exploit the relationship among pixels in spectral and spatial spaces respectively. Then, the Laplacians of both graphs were merged to form a weighted joint graph. The experiments were carried out on two different benchmark hyperspectral data sets. The proposed method performed significantly better than the well-known supervised classification methods, such as SVM. The assessments consisted of both accuracy and homogeneity analyses of the produced classification maps. The proposed spectral-spatial SSL method considerably increased the classification accuracy when the labeled training data set is too scarce. When there were only five labeled samples for each class, the performance improved 5.92% and 10.76% compared to spatial graph-based SSL, for AVIRIS Indian Pine and Pavia University data sets respectively.

  10. A Neuro-Fuzzy based System for Classification of Natural Textures

    Science.gov (United States)

    Jiji, G. Wiselin

    2016-12-01

    A statistical approach based on the coordinated clusters representation of images is used for classification and recognition of textured images. In this paper, two issues are being addressed; one is the extraction of texture features from the fuzzy texture spectrum in the chromatic and achromatic domains from each colour component histogram of natural texture images and the second issue is the concept of a fusion of multiple classifiers. The implementation of an advanced neuro-fuzzy learning scheme has been also adopted in this paper. The results of classification tests show the high performance of the proposed method that may have industrial application for texture classification, when compared with other works.

  11. EEG classification approach based on the extreme learning machine and wavelet transform.

    Science.gov (United States)

    Yuan, Qi; Zhou, Weidong; Zhang, Jing; Li, Shufang; Cai, Dongmei; Zeng, Yanjun

    2012-04-01

    Automatic detection and classification of electroencephalogram (EEG) epileptic activity aid diagnosis and relieve the heavy workload of doctors. This article presents a new EEG classification approach based on the extreme learning machine (ELM) and wavelet transform (WT). First, the WT is used to extract useful features when certain scales cover abnormal components of the EEG. Second, the ELM algorithm is used to train a single hidden layer of feedforward neural network (SLFN) features. Finally, the SLFN is tested with interictal and ictal EEGs. The experiments demonstrated that the proposed approach achieved a satisfactory classification rate of 99.25% for interictal and ictal EEGs.

  12. Convolutional neural network-based data page classification for holographic memory.

    Science.gov (United States)

    Shimobaba, Tomoyoshi; Kuwata, Naoki; Homma, Mizuha; Takahashi, Takayuki; Nagahama, Yuki; Sano, Marie; Hasegawa, Satoki; Hirayama, Ryuji; Kakue, Takashi; Shiraki, Atsushi; Takada, Naoki; Ito, Tomoyoshi

    2017-09-10

    We propose a deep-learning-based classification of data pages used in holographic memory. We numerically investigated the classification performance of a conventional multilayer perceptron (MLP) and a deep neural network, under the condition that reconstructed page data are contaminated by some noise and are randomly laterally shifted. When data pages are randomly laterally shifted, the MLP was found to have a classification accuracy of 93.02%, whereas the deep neural network was able to classify data pages at an accuracy of 99.98%. The accuracy of the deep neural network is 2 orders of magnitude better than the MLP.

  13. Classification of samples into two or more ordered populations with application to a cancer trial.

    Science.gov (United States)

    Conde, D; Fernández, M A; Rueda, C; Salvador, B

    2012-12-10

    In many applications, especially in cancer treatment and diagnosis, investigators are interested in classifying patients into various diagnosis groups on the basis of molecular data such as gene expression or proteomic data. Often, some of the diagnosis groups are known to be related to higher or lower values of some of the predictors. The standard methods of classifying patients into various groups do not take into account the underlying order. This could potentially result in high misclassification rates, especially when the number of groups is larger than two. In this article, we develop classification procedures that exploit the underlying order among the mean values of the predictor variables and the diagnostic groups by using ideas from order-restricted inference. We generalize the existing methodology on discrimination under restrictions and provide empirical evidence to demonstrate that the proposed methodology improves over the existing unrestricted methodology. The proposed methodology is applied to a bladder cancer data set where the researchers are interested in classifying patients into various groups. Copyright © 2012 John Wiley & Sons, Ltd.

  14. A review of supervised object-based land-cover image classification

    Science.gov (United States)

    Ma, Lei; Li, Manchun; Ma, Xiaoxue; Cheng, Liang; Du, Peijun; Liu, Yongxue

    2017-08-01

    Object-based image classification for land-cover mapping purposes using remote-sensing imagery has attracted significant attention in recent years. Numerous studies conducted over the past decade have investigated a broad array of sensors, feature selection, classifiers, and other factors of interest. However, these research results have not yet been synthesized to provide coherent guidance on the effect of different supervised object-based land-cover classification processes. In this study, we first construct a database with 28 fields using qualitative and quantitative information extracted from 254 experimental cases described in 173 scientific papers. Second, the results of the meta-analysis are reported, including general characteristics of the studies (e.g., the geographic range of relevant institutes, preferred journals) and the relationships between factors of interest (e.g., spatial resolution and study area or optimal segmentation scale, accuracy and number of targeted classes), especially with respect to the classification accuracy of different sensors, segmentation scale, training set size, supervised classifiers, and land-cover types. Third, useful data on supervised object-based image classification are determined from the meta-analysis. For example, we find that supervised object-based classification is currently experiencing rapid advances, while development of the fuzzy technique is limited in the object-based framework. Furthermore, spatial resolution correlates with the optimal segmentation scale and study area, and Random Forest (RF) shows the best performance in object-based classification. The area-based accuracy assessment method can obtain stable classification performance, and indicates a strong correlation between accuracy and training set size, while the accuracy of the point-based method is likely to be unstable due to mixed objects. In addition, the overall accuracy benefits from higher spatial resolution images (e.g., unmanned aerial

  15. Computer vision-based limestone rock-type classification using probabilistic neural network

    Directory of Open Access Journals (Sweden)

    Ashok Kumar Patel

    2016-01-01

    Full Text Available Proper quality planning of limestone raw materials is essential for maintaining the desired feed in a cement plant. Rock-type identification is an integral part of quality planning for a limestone mine. In this paper, a computer vision-based rock-type classification algorithm is proposed for fast and reliable identification without human intervention. A laboratory-scale vision-based model was developed using a probabilistic neural network (PNN) with color histogram features as input. The color image histogram-based features, which include weighted mean, skewness, and kurtosis, are extracted for all three color channels: red, green, and blue. A total of nine features are used as input for the PNN classification model. The smoothing parameter of the PNN model is selected judiciously to develop an optimal or near-optimal classification model. The developed PNN is validated using the test data set, and the results reveal that the proposed vision-based model performs satisfactorily in classifying limestone rock types. Overall, the misclassification error is below 6%. Compared with three other classification algorithms, the proposed method performs substantially better than all three.
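
    The nine histogram features mentioned above (mean, skewness, and kurtosis for each of the three color channels) might be computed along these lines. This is a sketch under assumptions: the abstract does not give the exact formulas, so standard histogram-weighted moments are used, and histogram_moments and rock_features are hypothetical names.

```python
def histogram_moments(hist):
    """Histogram-weighted mean, skewness and kurtosis of intensity levels.

    `hist[level]` is the number of pixels with that intensity. Skewness and
    kurtosis are reported as 0.0 for a channel with zero variance (an
    assumed convention, to avoid dividing by zero).
    """
    total = sum(hist)
    mean = sum(level * count for level, count in enumerate(hist)) / total
    var = sum(count * (level - mean) ** 2 for level, count in enumerate(hist)) / total
    if var == 0:
        return mean, 0.0, 0.0
    std = var ** 0.5
    skew = sum(count * (level - mean) ** 3 for level, count in enumerate(hist)) / total / std ** 3
    kurt = sum(count * (level - mean) ** 4 for level, count in enumerate(hist)) / total / std ** 4
    return mean, skew, kurt


def rock_features(pixels, levels=256):
    """Nine features: (mean, skewness, kurtosis) for R, then G, then B.

    `pixels` is an iterable of (r, g, b) integer tuples in [0, levels).
    """
    hists = [[0] * levels for _ in range(3)]
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            hists[channel][value] += 1
    features = []
    for hist in hists:
        features.extend(histogram_moments(hist))
    return features
```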

  16. Strategy in clinical practice for classification of unselected colorectal tumours based on mismatch repair deficiency

    DEFF Research Database (Denmark)

    Jensen, Lars Henrik; Lindebjerg, J; Byriel, L

    2007-01-01

    nonpolyposis colon cancer or Lynch syndrome), but most are epigenetic changes of sporadic origin. The aim of this study was to define a robust and inexpensive strategy for such classification in clinical practice. Method Tumours and blood samples from 262 successive patients with colorectal adenocarcinomas...... or BRAF mutation analysis to distinguish sporadic patients from likely hereditary ones. MMR deficient patients with sporadic disease can be reassured of the better prognosis and the likely hereditary cases should receive genetic counselling....

  17. Performance of epileptic single-channel scalp EEG classifications using single wavelet-based features.

    Science.gov (United States)

    Janjarasjitt, Suparerk

    2017-03-01

    Classification of epileptic scalp EEGs is certainly one of the most crucial tasks in the diagnosis of epilepsy. Rather than using multiple quantitative features, a single quantitative feature of a single-channel scalp EEG is applied to classify the corresponding state of the brain, i.e., seizure activity or non-seizure period. The quantitative features proposed are wavelet-based features obtained from the logarithm of the variance of the detail and approximation coefficients of single-channel scalp EEG signals. The performance of patient-dependent epileptic seizure classification using single wavelet-based features is examined on scalp EEG data from 12 pediatric subjects containing 79 seizures, and is evaluated with 4-fold cross-validation. The computational results show that the wavelet-based features can provide outstanding performance in patient-dependent epileptic seizure classification. The average accuracy, sensitivity, and specificity are 93.24%, 83.34%, and 93.53%, respectively.
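
    A log-variance wavelet feature of the kind described above can be sketched as follows. The Haar wavelet, the natural logarithm, and the number of levels are assumptions made for illustration (the abstract specifies none of them), and the function names are hypothetical.

```python
import math
import statistics


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT (an odd trailing sample is dropped)."""
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) / math.sqrt(2) for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / math.sqrt(2) for i in pairs]
    return approx, detail


def log_variance_features(signal, levels=3):
    """Log of the variance of each detail band and of the final approximation.

    Returns a dict such as {"D1": ..., "D2": ..., "D3": ..., "A3": ...}.
    Raises ValueError if a band has zero variance, since log(0) is undefined.
    """
    features = {}
    approx = list(signal)
    for level in range(1, levels + 1):
        approx, detail = haar_dwt(approx)
        features[f"D{level}"] = math.log(statistics.pvariance(detail))
    features[f"A{levels}"] = math.log(statistics.pvariance(approx))
    return features
```

    Under the single-feature approach of the abstract, any one entry of the returned dict would serve as the sole input to a classifier for a given EEG segment.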

  18. A Hybrid Classification System for Heart Disease Diagnosis Based on the RFRS Method

    Directory of Open Access Journals (Sweden)

    Xiao Liu

    2017-01-01

    Full Text Available Heart disease is one of the most common diseases in the world. The objective of this study is to aid the diagnosis of heart disease using a hybrid classification system based on the ReliefF and Rough Set (RFRS) method. The proposed system contains two subsystems: the RFRS feature selection system and a classification system with an ensemble classifier. The first system includes three stages: (i) data discretization, (ii) feature extraction using the ReliefF algorithm, and (iii) feature reduction using the heuristic Rough Set reduction algorithm that we developed. In the second system, an ensemble classifier is proposed based on the C4.5 classifier. The Statlog (Heart) dataset, obtained from the UCI database, was used for experiments. A maximum classification accuracy of 92.59% was achieved according to a jackknife cross-validation scheme. The results demonstrate that the performance of the proposed system is superior to the performances of previously reported classification techniques.

  19. A Novel Imbalanced Data Classification Approach Based on Logistic Regression and Fisher Discriminant

    Directory of Open Access Journals (Sweden)

    Baofeng Shi

    2015-01-01

    Full Text Available We introduce an imbalanced data classification approach based on logistic regression significant discriminant and Fisher discriminant. First of all, a key indicators extraction model based on logistic regression significant discriminant and correlation analysis is derived to extract features for customer classification. Secondly, on the basis of the linear weighted utilizing Fisher discriminant, a customer scoring model is established. And then, a customer rating model where the customer number of all ratings follows normal distribution is constructed. The performance of the proposed model and the classical SVM classification method are evaluated in terms of their ability to correctly classify consumers as default customers or nondefault customers. Empirical results using the data of 2157 customers in financial engineering suggest that the proposed approach performs better than the SVM model in dealing with imbalanced data classification. Moreover, our approach contributes to locating the qualified customers for the banks and the bond investors.

  20. Accurate crop classification using hierarchical genetic fuzzy rule-based systems

    Science.gov (United States)

    Topaloglou, Charalampos A.; Mylonas, Stelios K.; Stavrakoudis, Dimitris G.; Mastorocostas, Paris A.; Theocharis, John B.

    2014-10-01

    This paper investigates the effectiveness of an advanced classification system for accurate crop classification using very high resolution (VHR) satellite imagery. Specifically, a recently proposed genetic fuzzy rule-based classification system (GFRBCS) is employed, namely, the Hierarchical Rule-based Linguistic Classifier (HiRLiC). HiRLiC's model comprises a small set of simple IF-THEN fuzzy rules, easily interpretable by humans. One of its most important attributes is that its learning algorithm requires minimum user interaction, since the most important learning parameters affecting the classification accuracy are determined by the learning algorithm automatically. HiRLiC is applied in a challenging crop classification task, using a SPOT5 satellite image over an intensively cultivated area in a lake-wetland ecosystem in northern Greece. A rich set of higher-order spectral and textural features is derived from the initial bands of the (pan-sharpened) image, resulting in an input space comprising 119 features. The experimental analysis proves that HiRLiC compares favorably to other interpretable classifiers of the literature, both in terms of structural complexity and classification accuracy. Its testing accuracy was very close to that obtained by complex state-of-the-art classification systems, such as the support vector machines (SVM) and random forest (RF) classifiers. Nevertheless, visual inspection of the derived classification maps shows that HiRLiC is characterized by higher generalization properties, providing more homogeneous classifications than the competitors. Moreover, the runtime requirements for producing the thematic map were orders of magnitude lower than those of the competitors.

  1. Efficacy measures associated to a plantar pressure based classification system in diabetic foot medicine.

    Science.gov (United States)

    Deschamps, Kevin; Matricali, Giovanni Arnoldo; Desmet, Dirk; Roosen, Philip; Keijsers, Noel; Nobels, Frank; Bruyninckx, Herman; Staes, Filip

    2016-09-01

    The concept of 'classification' has, similar to many other diseases, been found to be fundamental in the field of diabetic medicine. In the current study, we aimed at determining efficacy measures of a recently published plantar pressure based classification system. Technical efficacy of the classification system was investigated by applying a high resolution, pixel-level analysis on the normalized plantar pressure pedobarographic fields of the original experimental dataset consisting of 97 patients with diabetes and 33 persons without diabetes. Clinical efficacy was assessed by considering the occurrence of foot ulcers at the plantar aspect of the forefoot in this dataset. Classification efficacy was assessed by determining the classification recognition rate as well as its sensitivity and specificity using cross-validation subsets of the experimental dataset together with a novel cohort of 12 patients with diabetes. Pixel-level comparison of the four groups associated to the classification system highlighted distinct regional differences. Retrospective analysis showed the occurrence of eleven foot ulcers in the experimental dataset since their gait analysis. Eight out of the eleven ulcers developed in a region of the foot which had the highest forces. Overall classification recognition rate exceeded 90% for all cross-validation subsets. Sensitivity and specificity of the four groups associated to the classification system exceeded respectively the 0.7 and 0.8 level in all cross-validation subsets. The results of the current study support the use of the novel plantar pressure based classification system in diabetic foot medicine. It may particularly serve in communication, diagnosis and clinical decision making. Copyright © 2016 Elsevier B.V. All rights reserved.

  2. SDT: A Virus Classification Tool Based on Pairwise Sequence Alignment and Identity Calculation

    OpenAIRE

    Muhire, Brejnev Muhizi; Varsani, Arvind; Martin, Darren Patrick

    2014-01-01

    The perpetually increasing rate at which viral full-genome sequences are being determined is creating a pressing demand for computational tools that will aid the objective classification of these genome sequences. Taxonomic classification approaches that are based on pairwise genetic identity measures are potentially highly automatable and are progressively gaining favour with the International Committee on Taxonomy of Viruses (ICTV). There are, however, various issues with the calculation of...

  3. Reduce working memory load for visual classification tasks through gaze-based interaction

    OpenAIRE

    Geisler, J; Granacher, T.

    2010-01-01

    The rate of working memory errors as an influence on the performance of visual classification at computer screens, e. g. in image exploitation, is expected to get reduced by use of gaze tracking instead of conventional pointing techniques. We compared two gaze-based techniques with the usage of a computer mouse: pure visual fixation (PFix) and visual fixation with confirmation (FixC). A prediction of memory errors while performing visual classification has been carried out using the "Human Pr...

  4. New approach using Bayesian Network to improve content based image classification systems

    OpenAIRE

    Jayech, Khlifia; Mahjoub, Mohamed Ali

    2012-01-01

    This paper proposes a new approach based on augmented naive Bayes for image classification. Initially, each image is cut into a set of blocks. For each block, we compute a vector of descriptors. Then, we propose to carry out a classification of the vectors of descriptors to build a vector of labels for each image. Finally, we propose three variants of Bayesian Networks such as Naive Bayesian Network (NB), Tree Augmented Naive Bayes (TAN) and Forest Augmented Naive Bayes (FAN) to classify ...

  5. A Quantum Hybrid PSO Combined with Fuzzy k-NN Approach to Feature Selection and Cell Classification in Cervical Cancer Detection.

    Science.gov (United States)

    Iliyasu, Abdullah M; Fatichah, Chastine

    2017-12-19

    A quantum hybrid (QH) intelligent approach that blends the adaptive search capability of the quantum-behaved particle swarm optimisation (QPSO) method with the intuitionistic rationality of the traditional fuzzy k-nearest neighbours (Fuzzy k-NN) algorithm (known simply as the Q-Fuzzy approach) is proposed for efficient feature selection and classification of cells in cervical smeared (CS) images. From an initial multitude of 17 features describing the geometry, colour, and texture of the CS images, the QPSO stage of our proposed technique is used to select the best subset of features (i.e., global best particles) that represents a pruned-down collection of seven features. Using a dataset of almost 1000 images, performance evaluation of our proposed Q-Fuzzy approach assesses the impact of our feature selection on classification accuracy by way of three experimental scenarios that are compared alongside two other approaches: the All-features (i.e., classification without prior feature selection) and another hybrid technique combining the standard PSO algorithm with the Fuzzy k-NN technique (P-Fuzzy approach). In the first and second scenarios, we further divided the assessment criteria in terms of classification accuracy based on the choice of best features and those in terms of the different categories of the cervical cells. In the third scenario, we introduced new QH hybrid techniques, i.e., QPSO combined with other supervised learning methods, and compared the classification accuracy alongside our proposed Q-Fuzzy approach. Furthermore, we employed statistical approaches to establish qualitative agreement with regards to the feature selection in the experimental scenarios 1 and 3. The synergy between the QPSO and Fuzzy k-NN in the proposed Q-Fuzzy approach improves classification accuracy, as manifest in the reduction in the number of cell features, which is crucial for effective cervical cancer detection and diagnosis.

  6. MEDLINE Abstracts Classification Based on Noun Phrases Extraction

    Science.gov (United States)

    Ruiz-Rico, Fernando; Vicedo, José-Luis; Rubio-Sánchez, María-Consuelo

    Many algorithms have emerged in recent years to tackle automated text categorization. They have been exhaustively studied, leading to several variants and combinations not only in the particular procedures but also in the treatment of the input data. A widely used approach is representing documents as Bag-Of-Words (BOW) and weighting tokens with the TFIDF schema. Many researchers have pursued improvements in precision and recall and reductions in classification time by enriching BOW with stemming, n-grams, feature selection, noun phrases, metadata, weight normalization, etc. We contribute to this field with a novel combination of these techniques. For evaluation purposes, we provide comparisons to previous works with SVM against the simple BOW. The well-known OHSUMED corpus is exploited and different sets of categories are selected, as previously done in the literature. The conclusion is that the proposed method can be successfully applied to existing binary classifiers such as SVM, outperforming the mixture of BOW and TFIDF approaches.

  7. Vehicle Maneuver Detection with Accelerometer-Based Classification

    Directory of Open Access Journals (Sweden)

    Javier Cervantes-Villanueva

    2016-09-01

    Full Text Available In the mobile computing era, smartphones have become instrumental tools to develop innovative mobile context-aware systems. In that sense, their usage in the vehicular domain eases the development of novel and personal transportation solutions. In this frame, the present work introduces an innovative mechanism to perceive the current kinematic state of a vehicle on the basis of the accelerometer data from a smartphone mounted in the vehicle. Unlike previous proposals, the introduced architecture targets the computational limitations of such devices to carry out the detection process following an incremental approach. For its realization, we have evaluated different classification algorithms to act as agents within the architecture. Finally, our approach has been tested with a real-world dataset collected by means of the ad hoc mobile application developed.

  8. Voting-based Classification for E-mail Spam Detection

    Directory of Open Access Journals (Sweden)

    Bashar Awad Al-Shboul

    2016-06-01

    Full Text Available The problem of spam e-mail has gained a tremendous amount of attention. Although entities tend to use e-mail spam filter applications to filter out received spam e-mails, marketing companies still tend to send unsolicited e-mails in bulk and users still receive a reasonable amount of spam e-mail despite those filtering applications. This work proposes a new method for classifying e-mails into spam and non-spam. First, several e-mail content features are extracted and then those features are used for classifying each e-mail individually. The classification results of three different classifiers (i.e. Decision Trees, Random Forests and k-Nearest Neighbor) are combined in various voting schemes (i.e. majority vote, average probability, product of probabilities, minimum probability and maximum probability) for making the final decision. To validate our method, two different spam e-mail collections were used.
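
    The five voting schemes listed above can be illustrated with a short sketch. It assumes each classifier outputs a spam probability for a given e-mail and that the ham probability is its complement; the exact formulation used in the paper is not stated in the abstract, and combine is a hypothetical name.

```python
import math


def combine(spam_probs, scheme):
    """Return True (spam) or False (non-spam) from per-classifier probabilities.

    `spam_probs` holds one P(spam) per classifier. For every scheme except
    the majority vote, the same rule is applied to the spam and the ham
    probabilities and the class with the larger combined score wins.
    """
    if scheme == "majority":
        votes = sum(p > 0.5 for p in spam_probs)
        return votes > len(spam_probs) / 2
    rules = {
        "average": lambda ps: sum(ps) / len(ps),
        "product": math.prod,
        "minimum": min,
        "maximum": max,
    }
    rule = rules[scheme]
    ham_probs = [1.0 - p for p in spam_probs]
    return rule(spam_probs) > rule(ham_probs)
```

    The schemes can disagree: with probabilities 0.4, 0.45, and 0.99, only one of three classifiers votes spam, so the majority vote says non-spam, while the average, product, minimum, and maximum rules all say spam because of the single very confident classifier.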

  9. The Normalization of Citation Counts Based on Classification Systems

    Directory of Open Access Journals (Sweden)

    Andreas Barth

    2013-08-01

    Full Text Available If we want to assess whether the paper in question has had a particularly high or low citation impact compared to other papers, the standard practice in bibliometrics is to normalize citations in respect of the subject category and publication year. A number of proposals for an improved procedure in the normalization of citation impact have been put forward in recent years. Against the background of these proposals, this study describes an ideal solution for the normalization of citation impact: in a first step, the reference set for the publication in question is collated by means of a classification scheme, where every publication is associated with a single principal research field or subfield entry (e.g., via Chemical Abstracts sections) and a publication year. In a second step, percentiles of citation counts are calculated for this set and used to assign the normalized citation impact score to the publications (and also to the publication in question).
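
    The two-step procedure described in this abstract can be sketched in Python as follows. The record keys (`id`, `field`, `year`, `citations`) are hypothetical, and the percentile definition used here (share of the reference set with a citation count at or below the paper's own) is one common convention chosen as an assumption; the abstract does not say which definition the study uses.

```python
from collections import defaultdict


def percentile_scores(pubs):
    """Return {publication id: citation percentile within its reference set}.

    pubs: list of dicts with the hypothetical keys
          'id', 'field', 'year' and 'citations'.
    """
    # Step 1: collate reference sets by (research field, publication year).
    reference_sets = defaultdict(list)
    for p in pubs:
        reference_sets[(p["field"], p["year"])].append(p["citations"])

    # Step 2: percentile of each paper's citation count within its set.
    scores = {}
    for p in pubs:
        ref = reference_sets[(p["field"], p["year"])]
        at_or_below = sum(1 for c in ref if c <= p["citations"])
        scores[p["id"]] = 100.0 * at_or_below / len(ref)
    return scores


# Hypothetical records: five chemistry papers and one physics paper from 2010.
records = [
    {"id": "a", "field": "chem", "year": 2010, "citations": 0},
    {"id": "b", "field": "chem", "year": 2010, "citations": 1},
    {"id": "c", "field": "chem", "year": 2010, "citations": 2},
    {"id": "d", "field": "chem", "year": 2010, "citations": 5},
    {"id": "e", "field": "chem", "year": 2010, "citations": 10},
    {"id": "f", "field": "phys", "year": 2010, "citations": 3},
]
print(percentile_scores(records))
```

    Because each paper is compared only with papers from the same field and year, a count of 3 citations can rank at the top of one reference set and in the middle of another.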

  10. Artificial-neural-network-based classification of mammographic microcalcifications using image structure features

    Science.gov (United States)

    Dhawan, Atam P.; Chitre, Yateen S.; Moskowitz, Myron

    1993-07-01

    Mammography associated with clinical breast examination and self-breast examination is the only effective and viable method for mass breast screening. It is, however, difficult to distinguish between benign and malignant microcalcifications associated with breast cancer. Most of the techniques used in the computerized analysis of mammographic microcalcifications segment the digitized gray-level image into regions representing microcalcifications. We present a second-order gray-level histogram based feature extraction approach to extract microcalcification features. These features, called image structure features, are computed from the second-order gray-level histogram statistics, and do not require segmentation of the original image into binary regions. Several image structure features were computed for 100 'difficult-to-diagnose' microcalcification cases with known biopsy results. These features were analyzed in a correlation study which provided a set of five best image structure features. A feedforward backpropagation neural network was used to classify mammographic microcalcifications using the image structure features. The network was trained on 10 cases of mammographic microcalcifications and tested on an additional 85 'difficult-to-diagnose' microcalcification cases using the selected image structure features. The trained network yielded good results for classification of 'difficult-to-diagnose' microcalcifications into benign and malignant categories.
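
    A second-order gray-level histogram of the kind mentioned in this abstract counts how often pairs of gray levels occur at a fixed pixel displacement, so no segmentation is needed. The Python sketch below is only an illustration: the displacement, the two statistics (contrast and entropy) and all function names are assumptions, since the abstract does not list the five features that were selected.

```python
from math import log2


def cooccurrence(image, levels, dx=1, dy=0):
    """Normalized second-order histogram for displacement (dx, dy).

    image: list of rows of integer gray levels in range(levels).
    Returns a levels x levels matrix of joint probabilities.
    """
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[n / total for n in row] for row in counts]


def contrast(p):
    """Weighted sum of squared gray-level differences."""
    return sum((i - j) ** 2 * v for i, row in enumerate(p) for j, v in enumerate(row))


def entropy(p):
    """Shannon entropy (in bits) of the joint distribution."""
    return -sum(v * log2(v) for row in p for v in row if v > 0)


# Tiny hypothetical 2-level image: each row is uniform.
patch = [[0, 0], [1, 1]]
hist = cooccurrence(patch, levels=2)
print(hist, contrast(hist), entropy(hist))
```

    Statistics such as these could then be fed to a classifier as a feature vector, in the spirit of the neural network described above.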

  11. Enhancement of force patterns classification based on Gaussian distributions.

    Science.gov (United States)

    Ertelt, Thomas; Solomonovs, Ilja; Gronwald, Thomas

    2018-01-23

    Description of the patterns of ground reaction force is a standard method in areas such as medicine, biomechanics and robotics. The fundamental parameter is the time course of the force, which is classified visually in particular in the field of clinical diagnostics. Here, the knowledge and experience of the diagnostician is relevant for its assessment. For an objective and valid discrimination of the ground reaction force pattern, a generic method, especially in the medical field, is absolutely necessary to describe the qualities of the time-course. The aim of the presented method was to combine the approaches of two existing procedures from the fields of machine learning and the Gauss approximation in order to take advantage of both methods for the classification of ground reaction force patterns. The current limitations of both methods could be eliminated by an overarching method. Twenty-nine male athletes from different sports were examined. Each participant was given the task of performing a one-legged stopping maneuver on a force plate from the maximum possible starting speed. The individual time course of the ground reaction force of each subject was registered and approximated on the basis of eight Gaussian distributions. The descriptive coefficients were then classified using Bayesian regulated neural networks. The different sports served as the distinguishing feature. Although the athletes were all given the same task, each sport showed a different quality in the time course of ground reaction force. Meanwhile, within each sport, the athletes were homogeneous. With an overall prediction (R = 0.938) all subjects/sports were classified correctly with 94.29% accuracy. The combination of the two methods, the mathematical description of the time course of ground reaction forces on the basis of Gaussian distributions and their classification by means of Bayesian regulated neural networks, seems an adequate and promising method to discriminate the

  12. Material classification based on multi-band polarimetric images fusion

    Science.gov (United States)

    Zhao, Yongqiang; Pan, Quan; Zhang, Hongcai

    2006-05-01

    Polarization imparted by surface reflections contains unique and discriminatory signatures which may augment spectral target-detection techniques. With the development of multi-band polarization imaging technology, it is becoming increasingly important to integrate polarimetric, spatial and spectral information to improve target discrimination. In this study, investigations were performed on combining multi-band polarimetric images through false color mapping and a wavelet-integrated image fusion method. The objective of this effort was to extend the investigation of the use of polarized light to target detection and material classification. There is great variation in polarization in and between each of the bandpasses, and this variation is comparable in magnitude to the variation in intensity. At the same time, the contrast in polarization is greater than that in intensity, and polarization contrast increases as intensity contrast decreases. It is also pointed out that chromaticity can be used to make targets stand out more clearly against the background, and that materials can be divided into conductors and nonconductors through polarization information. Through false color mapping, the polarimetric information that differs between the bandpasses and the polarimetric information common to all bandpasses are therefore combined, so that conductors and nonconductors are assigned different colors in the resulting image. Panchromatic polarimetric images are then fused with the resulting image through wavelet decomposition; the final fused image contains more detail and is easier to interpret. This study demonstrated, using digital image data collected by an imaging spectropolarimeter, that multi-band imaging polarimetry is likely to provide an advantage in target detection and material classification.

  13. Multi-Frequency Polarimetric SAR Classification Based on Riemannian Manifold and Simultaneous Sparse Representation

    Directory of Open Access Journals (Sweden)

    Fan Yang

    2015-07-01

    Full Text Available Normally, polarimetric SAR classification is a high-dimensional nonlinear mapping problem. In the realm of pattern recognition, sparse representation is a very efficacious and powerful approach. As classical descriptors of polarimetric SAR, covariance and coherency matrices are Hermitian semidefinite and form a Riemannian manifold. Conventional Euclidean metrics are not suitable for a Riemannian manifold, and hence, normal sparse representation classification cannot be applied to polarimetric SAR directly. This paper proposes a new land cover classification approach for polarimetric SAR. There are two principal novelties in this paper. First, a Stein kernel on a Riemannian manifold instead of Euclidean metrics, combined with sparse representation, is employed for polarimetric SAR land cover classification. This approach is named Stein-sparse representation-based classification (SRC). Second, using simultaneous sparse representation and reasonable assumptions of the correlation of representation among different frequency bands, Stein-SRC is generalized to simultaneous Stein-SRC for multi-frequency polarimetric SAR classification. These classifiers are assessed using polarimetric SAR images from the Airborne Synthetic Aperture Radar (AIRSAR) sensor of the Jet Propulsion Laboratory (JPL) and the Electromagnetics Institute Synthetic Aperture Radar (EMISAR) sensor of the Technical University of Denmark (DTU). Experiments on single-band and multi-band data both show that these approaches acquire more accurate classification results in comparison to many conventional and advanced classifiers.
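
    The abstract does not give the kernel formula. As a hedged illustration, the Python sketch below assumes the commonly used form k(X, Y) = exp(-beta * S(X, Y)), where S(X, Y) = log det((X + Y)/2) - (1/2) log det(XY) is the Stein (Jensen-Bregman LogDet) divergence between Hermitian positive definite matrices. The function names, the value of `beta` and the sample matrices are hypothetical.

```python
from math import exp, log


def det(m):
    """Determinant by Gaussian elimination with partial pivoting.

    Accepts complex entries; the real part is returned, which is the
    full determinant when m is Hermitian.
    """
    a = [[complex(v) for v in row] for row in m]
    n = len(a)
    d = 1 + 0j
    for k in range(n):
        pivot = max(range(k, n), key=lambda r: abs(a[r][k]))
        if abs(a[pivot][k]) == 0:
            return 0.0  # singular matrix
        if pivot != k:
            a[k], a[pivot] = a[pivot], a[k]
            d = -d
        d *= a[k][k]
        for r in range(k + 1, n):
            f = a[r][k] / a[k][k]
            for c in range(k, n):
                a[r][c] -= f * a[k][c]
    return d.real


def stein_divergence(x, y):
    """S(X, Y) = log det((X + Y) / 2) - 0.5 * log det(X Y).

    Uses det(X Y) = det(X) * det(Y); x and y must be positive definite.
    """
    n = len(x)
    mid = [[(x[i][j] + y[i][j]) / 2 for j in range(n)] for i in range(n)]
    return log(det(mid)) - 0.5 * (log(det(x)) + log(det(y)))


def stein_kernel(x, y, beta=1.0):
    """Assumed kernel form exp(-beta * S(X, Y)); beta is a free parameter."""
    return exp(-beta * stein_divergence(x, y))


# Two hypothetical 2 x 2 Hermitian positive definite matrices.
a = [[2, 1j], [-1j, 2]]
b = [[1, 0], [0, 1]]
print(stein_divergence(a, b), stein_kernel(a, b))
```

    The kernel equals 1 for identical matrices and decreases as the divergence grows, so it can stand in for a Euclidean inner product in kernel-based sparse coding.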

  14. Multigrades Classification Model of Magnesite Ore Based on SAE and ELM

    Directory of Open Access Journals (Sweden)

    Yachun Mao

    2017-01-01

    Full Text Available Magnesite is an important raw material for extracting magnesium metal and magnesium compounds; the precision of its grade classification exerts great influence on the smelting process. Thus, it is increasingly important to determine the grade of magnesite quickly and accurately. In this paper, a method based on stacked autoencoder (SAE) and extreme learning machine (ELM) was established for the classification model of magnesite. The stacked autoencoder (SAE) was first used to reduce the dimension of magnesite spectrum data, and then a neural network model of extreme learning machine (ELM) was adopted to classify the data. Two improved extreme learning machine (ELM) models were employed for better classification, namely, accuracy extreme learning machine (AELM) and integrated accuracy (IELM), to build up the classification models. The grade classification through traditional methods such as chemical approaches, artificial methods, and the BP neural network model was compared to that in this paper. Results showed that the classification model of magnesite ore through stacked autoencoder (SAE) and extreme learning machine (ELM) is better in terms of speed and accuracy; thus, this paper provides a new way for the grade classification of magnesite ore.

  15. Classification of right-hand grasp movement based on EMOTIV Epoc+

    Science.gov (United States)