WorldWideScience

Sample records for cancer classification based

  1. NIM: A Node Influence Based Method for Cancer Classification

    Directory of Open Access Journals (Sweden)

    Yiwen Wang

    2014-01-01

    The classification of different cancer types is of great significance in the medical field. However, the great majority of existing cancer classification methods are clinical-based and have relatively weak diagnostic ability. With the rapid development of gene expression technology, it has become possible to classify different kinds of cancers using DNA microarrays. Our main idea is to approach the problem of cancer classification using gene expression data from a graph-based view. Based on a new node influence model that we propose, this paper presents a novel high-accuracy method for cancer classification, which is composed of four parts: the first is to calculate the similarity matrix of all samples, the second is to compute the node influence of training samples, the third is to obtain the similarity between every test sample and each class using a weighted sum of node influence and the similarity matrix, and the last is to classify each test sample based on its similarity to every class. The data sets used in our experiments are breast cancer, central nervous system, colon tumor, prostate cancer, acute lymphoblastic leukemia, and lung cancer. Experimental results showed that our node influence based method (NIM) is more efficient and robust than the support vector machine, K-nearest neighbor, C4.5, naive Bayes, and CART.
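
The four steps listed in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not define the node influence model, so the `node_influence` function below (mean similarity of a training sample to the other training samples of its own class) is a hypothetical stand-in, and cosine similarity is likewise an assumed choice rather than the authors' measure.

```python
import math

def cosine(a, b):
    """Cosine similarity between two expression vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def similarity_matrix(samples):
    """Step 1: pairwise similarity of all samples (training and test)."""
    return [[cosine(a, b) for b in samples] for a in samples]

def node_influence(sim, labels):
    """Step 2 (hypothetical definition): mean similarity of each training
    sample to the other training samples that share its class label."""
    n = len(labels)
    influence = []
    for i in range(n):
        peers = [sim[i][j] for j in range(n)
                 if j != i and labels[j] == labels[i]]
        influence.append(sum(peers) / len(peers) if peers else 1.0)
    return influence

def classify(train, labels, test):
    """Steps 3 and 4: score each class by an influence-weighted sum of
    similarities, then assign each test sample to the best-scoring class."""
    samples = train + test
    sim = similarity_matrix(samples)
    n = len(train)
    influence = node_influence(sim, labels)
    predictions = []
    for t in range(n, len(samples)):
        scores = {}
        for i in range(n):
            scores[labels[i]] = scores.get(labels[i], 0.0) + influence[i] * sim[t][i]
        predictions.append(max(scores, key=scores.get))
    return predictions
```

Because the class score is a plain sum, larger classes are favored in this sketch; a real implementation would need to handle class imbalance explicitly.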

  2. NIM: a node influence based method for cancer classification.

    Science.gov (United States)

    Wang, Yiwen; Yao, Min; Yang, Jianhua

    2014-01-01

    The classification of different cancer types is of great significance in the medical field. However, the great majority of existing cancer classification methods are clinical-based and have relatively weak diagnostic ability. With the rapid development of gene expression technology, it has become possible to classify different kinds of cancers using DNA microarrays. Our main idea is to approach the problem of cancer classification using gene expression data from a graph-based view. Based on a new node influence model that we propose, this paper presents a novel high-accuracy method for cancer classification, which is composed of four parts: the first is to calculate the similarity matrix of all samples, the second is to compute the node influence of training samples, the third is to obtain the similarity between every test sample and each class using a weighted sum of node influence and the similarity matrix, and the last is to classify each test sample based on its similarity to every class. The data sets used in our experiments are breast cancer, central nervous system, colon tumor, prostate cancer, acute lymphoblastic leukemia, and lung cancer. Experimental results showed that our node influence based method (NIM) is more efficient and robust than the support vector machine, K-nearest neighbor, C4.5, naive Bayes, and CART.

  3. Cancer classification based on gene expression using neural networks.

    Science.gov (United States)

    Hu, H P; Niu, Z J; Bai, Y P; Tan, X H

    2015-12-21

    Based on gene expression, we classified 53 colon cancer patients with UICC stage II disease into two groups: relapse and no relapse. Samples were taken from each patient, and gene information was extracted. From the 53 samples examined, 500 genes were deemed suitable through analyses by S-Kohonen, BP, and SVM neural networks. The classification accuracy obtained by the S-Kohonen neural network reached 91%, higher than that obtained by the BP and SVM neural networks. The results show that the S-Kohonen neural network is better suited to this classification task and is feasible and valid compared with the BP and SVM neural networks.

  4. Prediction of Breast Cancer using Rule Based Classification

    Directory of Open Access Journals (Sweden)

    Nagendra Kumar SINGH

    2015-12-01

    The current work proposes a model for prediction of breast cancer using the classification approach in data mining. The proposed model is based on various parameters, including symptoms of breast cancer, gene mutation and other risk factors causing breast cancer. Mutations are predicted in breast cancer causing genes by aligning normal and abnormal gene sequences; the class label of breast cancer (risky or safe) is then predicted on the basis of IF-THEN rules, using a Genetic Algorithm (GA). In this work, the GA uses variable gene encoding mechanisms for chromosome encoding and uniform population generation, and selects two chromosomes by the Roulette-Wheel selection technique for two-point crossover, which gives better solutions. The performance of the model is evaluated using the F score measure, the Matthews Correlation Coefficient (MCC) and the Receiver Operating Characteristic (ROC) curve, obtained by plotting points (Sensitivity vs. 1 - Specificity).
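
The evaluation measures named in this abstract (F score, MCC, and a ROC point given by sensitivity against 1 - specificity) all follow from the four confusion-matrix counts. A minimal sketch, assuming a binary "risky"/"safe" labelling as in the abstract; the function names are illustrative and not taken from the paper:

```python
import math

def confusion_counts(y_true, y_pred, positive):
    """Return (tp, fp, tn, fn) for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def f1_score(tp, fp, fn):
    """F score with equal weight on precision and recall (F1)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def mcc(tp, fp, tn, fn):
    """Matthews Correlation Coefficient; 0.0 when any marginal is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def roc_point(tp, fp, tn, fn):
    """One ROC point as (1 - specificity, sensitivity)."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return false_positive_rate, sensitivity
```

A full ROC curve would repeat `roc_point` over a range of decision thresholds; a rule-based classifier with hard labels yields a single point.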

  5. Pathological Bases for a Robust Application of Cancer Molecular Classification

    Directory of Open Access Journals (Sweden)

    Salvador J. Diaz-Cano

    2015-04-01

    Any robust classification system depends on its purpose and must refer to accepted standards, its strength relying on predictive values and a careful consideration of known factors that can affect its reliability. In this context, a molecular classification of human cancer must refer to the current gold standard (histological classification) and try to improve it with key prognosticators for metastatic potential, staging and grading. Although organ-specific examples have been published based on proteomics, transcriptomics and genomics evaluations, the most popular approach uses gene expression analysis as a direct correlate of cellular differentiation, which represents the key feature of the histological classification. RNA is a labile molecule that varies significantly according to the preservation protocol; its transcription reflects the adaptation of the tumor cells to the microenvironment, it can be passed through mechanisms of intercellular transference of genetic information (exosomes), and it is exposed to epigenetic modifications. More robust classifications should be based on stable molecules, at the genetic level represented by DNA to improve reliability, and its analysis must deal with the concept of intratumoral heterogeneity, which is at the origin of tumor progression and is the byproduct of the selection process during the clonal expansion and progression of neoplasms. The simultaneous analysis of multiple DNA targets and next generation sequencing offer the best practical approach for an analytical genomic classification of tumors.

  6. Histotype-based prognostic classification of gastric cancer

    Institute of Scientific and Technical Information of China (English)

    Anna Maria Chiaravalli; Catherine Klersy; Alessandro Vanoli; Andrea Ferretti; Carlo Capella; Enrico Solcia

    2012-01-01

    AIM: To test the efficiency of a recently proposed histotype-based grading system in a consecutive series of gastric cancers. METHODS: Two hundred advanced gastric cancers operated upon in 1980-1987 and followed for a median 159 mo were investigated on hematoxylin-eosin-stained sections to identify low-grade [muconodular, well differentiated tubular, diffuse desmoplastic and high lymphoid response (HLR)], high-grade (anaplastic and mucinous invasive) and intermediate-grade (ordinary cohesive, diffuse and mucinous) cancers, in parallel with a previously investigated series of 292 cases. In addition, immunohistochemical analyses for CD8, CD11 and HLA-DR antigens, pancytokeratin and podoplanin, as well as immunohistochemical and molecular tests for microsatellite DNA instability and in situ hybridization for the Epstein-Barr virus (EBV) EBER1 gene were performed. Patient survival was assessed with death rates per 100 person-years and with Kaplan-Meier or Cox model estimates. RESULTS: Collectively, the four low-grade histotypes accounted for 22% and the two high-grade histotypes for 7% of the consecutive cancers investigated, while the remaining 71% of cases were intermediate-grade cancers, with highly significant, stage-independent, survival differences among the three tumor grades (P = 0.004 for grade 1 vs 2 and P = 0.0019 for grade 2 vs grade 3), thus confirming the results in the original series. A combined analysis of 492 cases showed an improved prognostic value of histotype-based grading compared with the Lauren classification. In addition, it allowed better characterization of rare histotypes, particularly the three subsets of prognostically different mucinous neoplasms, of which 10 ordinary mucinous cancers showed stage-inclusive survival worse than that of 20 muconodular (P = 0.037) and better than that of 21 high-grade (P < 0.001) cases. Tumors with high-level microsatellite DNA instability (MSI-H) or EBV infection, together with a third subset negative for both conditions, formed the

  7. A review on ultrasound-based thyroid cancer tissue characterization and automated classification.

    Science.gov (United States)

    Acharya, U R; Swapna, G; Sree, S V; Molinari, F; Gupta, S; Bardales, R H; Witkowska, A; Suri, J S

    2014-08-01

    In this paper, we review the different studies that developed Computer Aided Diagnostic (CAD) systems for automated classification of thyroid cancer into benign and malignant types. Specifically, we discuss the different types of features that are used to study and analyze the differences between benign and malignant thyroid nodules. These features can be broadly categorized into (a) the sonographic features from the ultrasound images, and (b) the non-clinical features extracted from the ultrasound images using statistical and data mining techniques. We also present a brief description of the commonly used classifiers in ultrasound-based CAD systems. We then review the studies that used features based on the ultrasound images for thyroid nodule classification and highlight the limitations of such studies. We also discuss and review the techniques used in studies that used the non-clinical features for thyroid nodule classification and report the classification accuracies obtained in these studies.

  8. Comparison of linear discriminant analysis methods for the classification of cancer based on gene expression data

    Directory of Open Access Journals (Sweden)

    He Miao

    2009-12-01

    Background: A growing number of studies based on gene expression data have been reported in great detail; however, one major challenge for methodologists is the choice of classification methods. The main purpose of this research was to compare the performance of linear discriminant analysis (LDA) and its modification methods for the classification of cancer based on gene expression data. Methods: The classification performance of linear discriminant analysis (LDA) and its modification methods was evaluated by applying these methods to six public cancer gene expression datasets. These methods included linear discriminant analysis (LDA), prediction analysis for microarrays (PAM), shrinkage centroid regularized discriminant analysis (SCRDA), shrinkage linear discriminant analysis (SLDA) and shrinkage diagonal discriminant analysis (SDDA). The procedures were performed with the software R 2.80. Results: PAM selected fewer feature genes than the other methods on most datasets, the exception being the Brain dataset. Of the two shrinkage discriminant analysis methods, SLDA selected more genes than SDDA on most datasets, the exception being the 2-class lung cancer dataset. When comparing SLDA with SCRDA, SLDA selected more genes than SCRDA on the 2-class lung cancer, SRBCT and Brain datasets; the opposite held for the remaining datasets. The average test error of the LDA modification methods was lower than that of the LDA method. Conclusions: The classification performance of the LDA modification methods was superior to that of traditional LDA with respect to the average error, and there was no significant difference between these modification methods.

  9. Cross-platform analysis of cancer microarray data improves gene expression based classification of phenotypes

    Directory of Open Access Journals (Sweden)

    Eils Roland

    2005-11-01

    Background: The extensive use of DNA microarray technology in the characterization of the cell transcriptome is leading to an ever increasing amount of microarray data from cancer studies. Although similar questions for the same type of cancer are addressed in these different studies, a comparative analysis of their results is hampered by the use of heterogeneous microarray platforms and analysis methods. Results: In contrast to a meta-analysis approach where results of different studies are combined on an interpretative level, we investigate here how to directly integrate raw microarray data from different studies for the purpose of supervised classification analysis. We use median rank scores and quantile discretization to derive numerically comparable measures of gene expression from different platforms. These transformed data are then used for training of classifiers based on support vector machines. We apply this approach to six publicly available cancer microarray gene expression data sets, which consist of three pairs of studies, each examining the same type of cancer, i.e. breast cancer, prostate cancer or acute myeloid leukemia. For each pair, one study was performed by means of cDNA microarrays and the other by means of oligonucleotide microarrays. In each pair, high classification accuracies (> 85%) were achieved with training and testing on data instances randomly chosen from both data sets in a cross-validation analysis. To exemplify the potential of this cross-platform classification analysis, we use two leukemia microarray data sets to show that important genes with regard to the biology of leukemia are selected in an integrated analysis, which are missed in either single-set analysis. Conclusion: Cross-platform classification of multiple cancer microarray data sets yields discriminative gene expression signatures that are found and validated on a large number of microarray samples, generated by different laboratories and
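
The idea behind quantile discretization can be illustrated with a short sketch: each sample's expression values are replaced by the index of the within-sample quantile bin they fall into, so that values measured on different scales become comparable. The function name, the default of four bins, and the tie handling (ties broken by gene position) are assumptions made for illustration; the abstract does not give these details, and median rank scores are not covered here.

```python
def quantile_discretize(sample, n_bins=4):
    """Map each expression value to a bin index in 0..n_bins-1 according
    to its rank within the sample (equal-sized bins, ties by position)."""
    order = sorted(range(len(sample)), key=lambda i: sample[i])
    bins = [0] * len(sample)
    for rank, idx in enumerate(order):
        bins[idx] = rank * n_bins // len(sample)
    return bins

# Hypothetical values for the same four genes on two platforms:
# absolute intensities on one, log-ratios on the other.
oligo_sample = [120.0, 15.0, 980.0, 300.0]
cdna_sample = [-0.2, -1.5, 2.1, 0.4]
oligo_bins = quantile_discretize(oligo_sample)
cdna_bins = quantile_discretize(cdna_sample)
```

Because only the within-sample ordering matters, the two hypothetical samples above map to the same bin pattern even though their raw scales differ.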

  10. Cancer pain: A critical review of mechanism-based classification and physical therapy management in palliative care

    Directory of Open Access Journals (Sweden)

    Senthil P Kumar

    2011-01-01

    Mechanism-based classification and physical therapy management of pain is essential to effectively manage painful symptoms in patients attending palliative care. The objective of this review is to provide a detailed review of mechanism-based classification and physical therapy management of patients with cancer pain. Cancer pain can be classified based upon pain symptoms, pain mechanisms and pain syndromes. Classification based upon mechanisms not only addresses the underlying pathophysiology but also provides an understanding of the patient's symptoms and treatment responses. Existing evidence suggests that five mechanisms (central sensitization, peripheral sensitization, sympathetically maintained pain, nociceptive and cognitive-affective) operate in patients with cancer pain. A summary of studies showing evidence for physical therapy treatment methods for cancer pain follows, with suggested therapeutic implications. Effective palliative physical therapy care using a mechanism-based classification model should be tailored to suit each patient's findings, using a biopsychosocial model of pain.

  11. Accurate and reliable cancer classification based on probabilistic inference of pathway activity.

    Directory of Open Access Journals (Sweden)

    Junjie Su

    With the advent of high-throughput technologies for measuring genome-wide expression profiles, a large number of methods have been proposed for discovering diagnostic markers that can accurately discriminate between different classes of a disease. However, factors such as the small sample size of typical clinical data, the inherent noise in high-throughput measurements, and the heterogeneity across different samples, often make it difficult to find reliable gene markers. To overcome this problem, several studies have proposed the use of pathway-based markers, instead of individual gene markers, for building the classifier. Given a set of known pathways, these methods estimate the activity level of each pathway by summarizing the expression values of its member genes, and use the pathway activities for classification. It has been shown that pathway-based classifiers typically yield more reliable results compared to traditional gene-based classifiers. In this paper, we propose a new classification method based on probabilistic inference of pathway activities. For a given sample, we compute the log-likelihood ratio between different disease phenotypes based on the expression level of each gene. The activity of a given pathway is then inferred by combining the log-likelihood ratios of the constituent genes. We apply the proposed method to the classification of breast cancer metastasis, and show that it achieves higher accuracy and identifies more reproducible pathway markers compared to several existing pathway activity inference methods.
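
The log-likelihood-ratio idea in this abstract can be sketched as follows. The abstract does not state which distribution is used or how the per-gene ratios are combined, so this sketch assumes a Gaussian per gene and phenotype and a plain sum over the pathway's member genes; all function and gene names are hypothetical.

```python
import math

def fit_gaussian(values):
    """Estimate (mean, std) of a gene's expression in one phenotype."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, max(math.sqrt(var), 1e-6)  # floor avoids division by zero

def gaussian_logpdf(x, mean, std):
    """Log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * std * std) - (x - mean) ** 2 / (2 * std * std)

def pathway_activity(sample, gene_params):
    """Sum of per-gene log-likelihood ratios (phenotype 1 vs phenotype 2).
    sample: {gene: expression}; gene_params: {gene: ((m1, s1), (m2, s2))}."""
    total = 0.0
    for gene, (params1, params2) in gene_params.items():
        x = sample[gene]
        total += gaussian_logpdf(x, *params1) - gaussian_logpdf(x, *params2)
    return total

# Hypothetical training values for one member gene in the two phenotypes.
gene_params = {
    "g1": (fit_gaussian([5.0, 6.0, 7.0]), fit_gaussian([1.0, 2.0, 3.0])),
}
```

A positive activity indicates that the pathway's genes look more like phenotype 1, a negative one more like phenotype 2; the resulting activities would then serve as features for a downstream classifier.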

  12. Classification of Cancer Gene Selection Using Random Forest and Neural Network Based Ensemble Classifier

    Directory of Open Access Journals (Sweden)

    Jogendra Kushwah

    2013-06-01

    The free radical gene classification of cancer diseases is a challenging job in biomedical data engineering. Various classifiers are used to improve the classification of gene selection for cancer diseases, but the classification results of these classifiers are not validated. An ensemble classifier is therefore used for cancer gene classification, combining a neural network classifier with a random forest tree. The random forest tree is an ensembling technique in which a number of classifiers are combined through the classes at their leaf nodes. In this paper we combine a neural network with a random forest ensemble classifier for the classification of cancer gene selection in the diagnostic analysis of cancer diseases. The proposed method differs from most ensemble classifier methods, which follow an input-output paradigm of neural networks in which the members of the ensemble are selected from a set of neural network classifiers. The number of classifiers is determined during the growing procedure of the forest. Furthermore, the proposed method produces an ensemble that is not only accurate but also diverse, ensuring the two important properties that should characterize an ensemble classifier. For the empirical evaluation of our proposed method we used a UCI cancer diseases data set for classification. Our experimental results show better results in comparison with random forest tree classification.

  13. Classification of lung cancer tumors based on structural and physicochemical properties of proteins by bioinformatics models.

    Science.gov (United States)

    Hosseinzadeh, Faezeh; Ebrahimi, Mansour; Goliaei, Bahram; Shamabadi, Narges

    2012-01-01

    Rapid distinction between small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC) tumors is very important in diagnosis of this disease. Furthermore, sequence-derived structural and physicochemical descriptors are very useful for machine learning prediction of protein structural and functional classes, classifying proteins and the prediction performance. In this study, the classification of lung tumors based on 1497 attributes derived from structural and physicochemical properties of protein sequences (based on genes defined by microarray analysis) is investigated through a combination of attribute weighting, supervised and unsupervised clustering algorithms. Eighty percent of the weighting methods selected features such as autocorrelation, dipeptide composition and distribution of hydrophobicity as the most important protein attributes in classification of SCLC, NSCLC and COMMON classes of lung tumors. The same results were observed by most tree induction algorithms, while descriptors of hydrophobicity distribution were high in protein sequences COMMON in both groups and distribution of charge in these proteins was very low, showing that COMMON proteins were very hydrophobic. Furthermore, compositions of polar dipeptide in SCLC proteins were higher than in NSCLC proteins. Some clustering models (alone or in combination with attribute weighting algorithms) were able to nearly classify SCLC and NSCLC proteins. The Random Forest tree induction algorithm, evaluated with leave-one-out and 10-fold cross validation, shows more than 86% accuracy in clustering and predicting three different lung cancer tumors. Here, for the first time, the application of data mining tools to effectively classify three classes of lung cancer tumors, with regard to the importance of dipeptide composition, autocorrelation and distribution descriptors, is reported.

  14. Classification of lung cancer tumors based on structural and physicochemical properties of proteins by bioinformatics models.

    Directory of Open Access Journals (Sweden)

    Faezeh Hosseinzadeh

    Rapid distinction between small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC) tumors is very important in diagnosis of this disease. Furthermore, sequence-derived structural and physicochemical descriptors are very useful for machine learning prediction of protein structural and functional classes, classifying proteins and the prediction performance. In this study, the classification of lung tumors based on 1497 attributes derived from structural and physicochemical properties of protein sequences (based on genes defined by microarray analysis) is investigated through a combination of attribute weighting, supervised and unsupervised clustering algorithms. Eighty percent of the weighting methods selected features such as autocorrelation, dipeptide composition and distribution of hydrophobicity as the most important protein attributes in classification of SCLC, NSCLC and COMMON classes of lung tumors. The same results were observed by most tree induction algorithms, while descriptors of hydrophobicity distribution were high in protein sequences COMMON in both groups and distribution of charge in these proteins was very low, showing that COMMON proteins were very hydrophobic. Furthermore, compositions of polar dipeptide in SCLC proteins were higher than in NSCLC proteins. Some clustering models (alone or in combination with attribute weighting algorithms) were able to nearly classify SCLC and NSCLC proteins. The Random Forest tree induction algorithm, evaluated with leave-one-out and 10-fold cross validation, shows more than 86% accuracy in clustering and predicting three different lung cancer tumors. Here, for the first time, the application of data mining tools to effectively classify three classes of lung cancer tumors, with regard to the importance of dipeptide composition, autocorrelation and distribution descriptors, is reported.

  15. Diagnostic Classification of Normal Persons and Cancer Patients by Using Neural Network Based on Trace Metal Contents in Serum Samples

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    An artificial neural network with the back-propagation approach (BP-ANN) was applied to the classification of normal persons and various cancer patients based on the elemental contents in serum samples. This method was verified by the cross-validation method. The effects of the network parameters were investigated and the related problems were discussed. The samples of 72, 42, and 52 for lung, liver, and stomach cancer patients and normal persons, respectively, were used for the classification study. About 95% of the samples can be classified correctly. Therefore, the method can be used as an auxiliary means of the diagnosis of cancer.

  17. Superpixel-based spectral classification for the detection of head and neck cancer with hyperspectral imaging

    Science.gov (United States)

    Chung, Hyunkoo; Lu, Guolan; Tian, Zhiqiang; Wang, Dongsheng; Chen, Zhuo Georgia; Fei, Baowei

    2016-03-01

    Hyperspectral imaging (HSI) is an emerging imaging modality for medical applications. HSI acquires two dimensional images at various wavelengths. The combination of both spectral and spatial information provides quantitative information for cancer detection and diagnosis. This paper proposes using superpixels, principal component analysis (PCA), and support vector machine (SVM) to distinguish regions of tumor from healthy tissue. The classification method uses 2 principal components decomposed from hyperspectral images and obtains an average sensitivity of 93% and an average specificity of 85% for 11 mice. The hyperspectral imaging technology and classification method can have various applications in cancer research and management.

  18. Swarm Intelligence Approach Based on Adaptive ELM Classifier with ICGA Selection for Microarray Gene Expression and Cancer Classification

    Directory of Open Access Journals (Sweden)

    T. Karthikeyan

    2014-05-01

    The aim of this research study is efficient gene selection and classification in microarray data analysis using hybrid machine learning algorithms. The advent of microarray technology has enabled researchers to quickly measure the levels of thousands of genes expressed in biological tissue samples in a single experiment. One important application of microarray technology is to classify tissue samples by their gene expression profiles in order to identify numerous types of cancer. Cancer is a group of diseases in which a set of cells shows uncontrolled growth, invasion that intrudes upon and destroys nearby tissues, and spread to other locations in the body via lymph or blood. Cancer has become one of the most important diseases today. DNA microarrays have turned out to be an effective tool in molecular biology and cancer diagnosis. Microarrays can be used to establish the relative quantity of mRNAs in two or more biological tissue samples for thousands of genes at the same time. Despite the strengths of this technique, the proper analysis and assessment of microarray data remains an open issue in several respects. In the medical sciences, multi-category cancer classification plays a major role in classifying cancer types according to gene expression. Cancer classification has become indispensable because the number of cancer victims has been increasing steadily in recent years. To this end, a combination of an Integer-Coded Genetic Algorithm (ICGA) and the Artificial Bee Colony (ABC) algorithm, coupled with an Adaptive Extreme Learning Machine (AELM), is proposed for gene selection and cancer classification. ICGA is used with the ABC-based AELM classifier to choose an optimal set of genes, which results in an efficient hybrid algorithm that can handle sparse data and sample imbalance. The

  19. Proteomic classification of breast cancer.

    LENUS (Irish Health Repository)

    Kamel, Dalia

    2012-11-01

    Being a significant health problem that affects patients in various age groups, breast cancer has been extensively studied to date. Recently, molecular breast cancer classification has advanced significantly with the availability of genomic profiling technologies. Proteomic technologies have also advanced from traditional protein assays including enzyme-linked immunosorbent assay, immunoblotting and immunohistochemistry to more comprehensive approaches including mass spectrometry and reverse phase protein lysate arrays (RPPA). The purpose of this manuscript is to review the current protein markers that influence breast cancer prediction and prognosis and to focus on novel advances in proteomic classification of breast cancer.

  20. Cluster Based Text Classification Model

    DEFF Research Database (Denmark)

    Nizamani, Sarwat; Memon, Nasrullah; Wiil, Uffe Kock

    2011-01-01

    We propose a cluster based classification model for suspicious email detection and other text classification tasks. The text classification tasks comprise many training examples that require a complex classification model. Using clusters for classification makes the model simpler and increases...

  1. Genetic Fuzzy System (GFS) based wavelet co-occurrence feature selection in mammogram classification for breast cancer diagnosis

    Directory of Open Access Journals (Sweden)

    Meenakshi M. Pawar

    2016-09-01

    Breast cancer is a significant health problem, diagnosed mostly in women worldwide. Early detection of breast cancer is therefore performed with the help of digital mammography, which can reduce the mortality rate. This paper presents a wrapper-based feature selection approach for wavelet co-occurrence features (WCF) using a Genetic Fuzzy System (GFS) in the mammogram classification problem. The performance of the GFS algorithm is demonstrated using the mini-MIAS database. WCF features are obtained from the detail wavelet coefficients at each level of decomposition of the mammogram image. At the first level of decomposition, 18 features are applied to the GFS algorithm, which selects 5 features with an average classification success rate of 39.64%. Subsequently, at the second level it selects 9 features from 36 features and the classification success rate improves to 56.75%. At the third level, 16 features are selected from 54 features and the average success rate improves to 64.98%. Lastly, at the fourth level, 72 features are applied to GFS, which selects 16 features, thereby increasing the average success rate to 89.47%. Hence, the GFS algorithm is an effective way of obtaining an optimal set of features in breast cancer diagnosis.

  2. Interval-value Based Particle Swarm Optimization algorithm for cancer-type specific gene selection and sample classification

    Directory of Open Access Journals (Sweden)

    D. Ramyachitra

    2015-09-01

    Microarray technology allows simultaneous measurement of the expression levels of thousands of genes within a biological tissue sample. The fundamental power of microarrays lies in the ability to conduct parallel surveys of gene expression using microarray data. The classification of tissue samples based on gene expression data is an important problem in medical diagnosis of diseases such as cancer. In gene expression data, the number of genes is usually very high compared to the number of data samples. Thus the difficulty lies in the high dimensionality of the data and the small sample size. This research work addresses the problem by classifying the resultant dataset using existing algorithms such as Support Vector Machine (SVM), K-nearest neighbor (KNN) and Interval Valued Classification (IVC), and the improvised Interval Value based Particle Swarm Optimization (IVPSO) algorithm. The results show that the IVPSO algorithm outperformed the other algorithms under several performance evaluation functions.

  3. Interval-value Based Particle Swarm Optimization algorithm for cancer-type specific gene selection and sample classification.

    Science.gov (United States)

    Ramyachitra, D; Sofia, M; Manikandan, P

    2015-09-01

    Microarray technology allows simultaneous measurement of the expression levels of thousands of genes within a biological tissue sample. The fundamental power of microarrays lies in the ability to conduct parallel surveys of gene expression. The classification of tissue samples based on gene expression data is an important problem in the medical diagnosis of diseases such as cancer. In gene expression data, the number of genes is usually very high compared to the number of data samples. The difficulty therefore lies in the high dimensionality of the data combined with the small sample size. This research work addresses the problem by classifying the resultant dataset using existing algorithms such as Support Vector Machine (SVM), K-nearest neighbor (KNN) and Interval Valued Classification (IVC), and the improvised Interval Value based Particle Swarm Optimization (IVPSO) algorithm. The results show that the IVPSO algorithm outperformed the other algorithms under several performance evaluation functions.

  4. Breast cancer detection and classification in digital mammography based on Non-Subsampled Contourlet Transform (NSCT) and Super Resolution.

    Science.gov (United States)

    Pak, Fatemeh; Kanan, Hamidreza Rashidy; Alikhassi, Afsaneh

    2015-11-01

    Breast cancer is one of the most perilous diseases among women. Breast screening is a method of detecting breast cancer at a very early stage, which can reduce the mortality rate. Mammography is a standard method for the early diagnosis of breast cancer. In this paper, a new algorithm is proposed for breast cancer detection and classification in digital mammography based on Non-Subsampled Contourlet Transform (NSCT) and Super Resolution (SR). The presented algorithm has three main parts: pre-processing, feature extraction and classification. In the pre-processing stage, after the region of interest (ROI) is determined by an automatic technique, the quality of the image is improved using the NSCT and SR algorithms. In the feature extraction part, several features of the image components are extracted and the skewness of each feature is calculated. Finally, the AdaBoost algorithm is used to classify and determine the probability of benign and malignant disease. The results obtained on the Mammographic Image Analysis Society (MIAS) database indicate the significant performance and superiority of the proposed method in comparison with state-of-the-art approaches. According to the obtained results, the proposed technique achieves a mean accuracy of 91.43% and an FPR of 6.42%.

  5. A comprehensive comparison of random forests and support vector machines for microarray-based cancer classification

    Directory of Open Access Journals (Sweden)

    Wang Lily

    2008-07-01

    Full Text Available Background: Cancer diagnosis and clinical outcome prediction are among the most important emerging applications of gene expression microarray technology with several molecular signatures on their way toward clinical deployment. Use of the most accurate classification algorithms available for microarray gene expression data is a critical ingredient in order to develop the best possible molecular signatures for patient care. As suggested by a large body of literature to date, support vector machines can be considered "best of class" algorithms for classification of such data. Recent work, however, suggests that random forest classifiers may outperform support vector machines in this domain. Results: In the present paper we identify methodological biases of prior work comparing random forests and support vector machines and conduct a new rigorous evaluation of the two algorithms that corrects these limitations. Our experiments use 22 diagnostic and prognostic datasets and show that support vector machines outperform random forests, often by a large margin. Our data also underline the importance of sound research design in benchmarking and comparison of bioinformatics algorithms. Conclusion: We found that both on average and in the majority of microarray datasets, random forests are outperformed by support vector machines both in the settings when no gene selection is performed and when several popular gene selection methods are used.

  6. Laser Raman detection for oral cancer based on a Gaussian process classification method

    Science.gov (United States)

    Du, Zhanwei; Yang, Yongjian; Bai, Yuan; Wang, Lijun; Zhang, Chijun; Chen, He; Luo, Yusheng; Su, Le; Chen, Yong; Li, Xianchang; Zhou, Xiaodong; Jia, Jun; Shen, Aiguo; Hu, Jiming

    2013-06-01

    Oral squamous cell carcinoma is the most common neoplasm of the oral cavity. The incidence rate accounts for 80% of total oral cancer and shows an upward trend in recent years. It has a high degree of malignancy and is difficult to detect in terms of differential diagnosis, as a consequence of which the timing of treatment is always delayed. In this work, Raman spectroscopy was adopted to differentially diagnose oral squamous cell carcinoma and oral gland carcinoma. In total, 852 entries of raw spectral data which consisted of 631 items from 36 oral squamous cell carcinoma patients, 87 items from four oral gland carcinoma patients and 134 items from five normal people were collected by utilizing an optical method on oral tissues. The probability distribution of the datasets corresponding to the spectral peaks of the oral squamous cell carcinoma tissue was analyzed and the experimental result showed that the data obeyed a normal distribution. Moreover, the distribution characteristic of the noise was also in compliance with a Gaussian distribution. A Gaussian process (GP) classification method was utilized to distinguish the normal people and the oral gland carcinoma patients from the oral squamous cell carcinoma patients. The experimental results showed that all the normal people could be recognized. 83.33% of the oral squamous cell carcinoma patients could be correctly diagnosed and the remaining ones would be diagnosed as having oral gland carcinoma. For the classification process of oral gland carcinoma and oral squamous cell carcinoma, the correct ratio was 66.67% and the erroneously diagnosed percentage was 33.33%. The total sensitivity was 80% and the specificity was 100%, with a Matthews correlation coefficient (MCC) of 0.447213595. Considering the numerical results above, the application prospects and clinical value of this technique are significantly impressive.
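
    Sensitivity, specificity and MCC are all functions of the four confusion-matrix counts. The sketch below shows the standard formulas in Python. The abstract does not report the underlying confusion matrix, so the counts used here are hypothetical; they were chosen only because they happen to yield the same three summary values and should not be read as the study's data.

```python
import math

def confusion_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, MCC) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally reported as 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc

# Hypothetical counts (assumption, not taken from the paper).
sens, spec, mcc = confusion_metrics(tp=12, fp=0, tn=1, fn=3)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} MCC={mcc:.4f}")
```

    With very few negatives, as in these illustrative counts, MCC stays modest even at 100% specificity, which is one reason it is often reported alongside sensitivity and specificity on imbalanced data.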

  7. Mechanism-based classification and physical therapy management of persons with cancer pain: A prospective case series

    Directory of Open Access Journals (Sweden)

    Senthil P Kumar

    2013-01-01

    Full Text Available Context: Mechanism-based classification (MBC) was established with current evidence and physical therapy (PT) management methods for both cancer and noncancer pain. Aims: This study aims to describe the efficacy of MBC-based PT in persons with primary complaints of cancer pain. Settings and Design: A prospective case series of patients who attended the physiotherapy department of a multispecialty university-affiliated teaching hospital. Material and Methods: A total of 24 adults (18 female, 6 male) aged 47.5 ± 10.6 years, with a primary diagnosis of a heterogeneous group of cancers and chief complaints of chronic disabling pain, were included in the study on their consent for participation. The patients were evaluated and classified on the basis of five predominant mechanisms for pain. Physical therapy interventions were recommended based on the mechanisms identified, and a home program was prescribed with a patient log to ensure compliance. Treatments were given in five consecutive weekly sessions over five weeks, each of 30 min duration. Statistical Analysis Used: Pre-post comparisons for the pain severity (PS) and pain interference (PI) subscales of the Brief pain inventory-Cancer pain (BPI-CP) and the European organization for research and treatment in cancer-quality of life questionnaire (EORTC-QLQ-C30) were done using the Wilcoxon signed-rank test at 95% confidence interval using SPSS for Windows version 16.0 (SPSS Inc, Chicago, IL). Results: There were statistically significant (P < 0.05) reductions in pain severity, pain interference and total BPI-CP scores, and the EORTC-QLQ-C30. Conclusion: MBC-PT was effective for improving BPI-CP and EORTC-QLQ-C30 scores in people with cancer pain.

  8. Preprocessing for classification of thermograms in breast cancer detection

    Science.gov (United States)

    Neumann, Łukasz; Nowak, Robert M.; Okuniewski, Rafał; Oleszkiewicz, Witold; Cichosz, Paweł; Jagodziński, Dariusz; Matysiewicz, Mateusz

    2016-09-01

    Performance of binary classification of breast cancer suffers from high imbalance between classes. In this article we present a preprocessing module designed to negate the discrepancy in training examples. The module is based on standardization, the Synthetic Minority Oversampling Technique and undersampling. We show how each algorithm influences classification accuracy. Results indicate that the described module improves overall Area Under Curve by up to 10% on the tested dataset. Furthermore, we propose other methods of dealing with imbalanced datasets in breast cancer classification.
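
    As a rough illustration of two of the steps named above, the following Python sketch oversamples a minority class by SMOTE-style interpolation and randomly undersamples the majority class. It is a simplified toy version under stated assumptions, not the authors' module: the function names, the neighbour count `k`, the seeds and the sample points are all hypothetical, and standardization is omitted.

```python
import random

def smote(minority, n_synthetic, k=2, seed=0):
    """Create synthetic minority samples by interpolating between a sample
    and one of its k nearest minority-class neighbours (the SMOTE idea)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

def undersample(majority, n_keep, seed=0):
    """Randomly keep n_keep majority samples without replacement."""
    return random.Random(seed).sample(majority, n_keep)

# Hypothetical toy data: 3 minority and 10 majority samples.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1)]
majority = [(float(i), float(-i)) for i in range(10)]
balanced_minority = minority + smote(minority, n_synthetic=3)
balanced_majority = undersample(majority, n_keep=6)
print(len(balanced_minority), len(balanced_majority))
```

    Because each synthetic point lies on a segment between two real minority samples, the oversampled class stays inside the region already covered by the minority data.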

  9. Pitch Based Sound Classification

    DEFF Research Database (Denmark)

    Nielsen, Andreas Brinch; Hansen, Lars Kai; Kjems, U

    2006-01-01

    A sound classification model is presented that can classify signals into music, noise and speech. The model extracts the pitch of the signal using the harmonic product spectrum. Based on the pitch estimate and a pitch error measure, features are created and used in a probabilistic model with soft-max output function. Both linear and quadratic inputs are used. The model is trained on 2 hours of sound and tested on publicly available data. A test classification error below 0.05 with 1 s classification windows is achieved. Furthermore, it is shown that linear input performs as well as a quadratic, and that even though classification gets marginally better, not much is achieved by increasing the window size beyond 1 s.
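
    The harmonic product spectrum mentioned above multiplies the magnitude spectrum by downsampled copies of itself, so that the harmonics of a tone reinforce the bin of its fundamental. A minimal Python sketch follows; the naive DFT, the number of harmonics and the synthetic test tone are assumptions made for illustration and are not taken from the paper.

```python
import cmath
import math

def magnitude_spectrum(signal):
    """Naive DFT magnitudes for bins 0..N/2 (adequate for short signals)."""
    n = len(signal)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal)))
            for k in range(n // 2 + 1)]

def hps_pitch(signal, sample_rate, n_harmonics=3):
    """Estimate pitch (Hz) as the peak of the harmonic product spectrum."""
    mag = magnitude_spectrum(signal)
    limit = (len(mag) - 1) // n_harmonics  # highest bin with all harmonics in range
    best_bin, best_val = 1, -1.0
    for k in range(1, limit + 1):  # skip the DC bin
        product = 1.0
        for h in range(1, n_harmonics + 1):
            product *= mag[h * k]
        if product > best_val:
            best_bin, best_val = k, product
    return best_bin * sample_rate / len(signal)

# Hypothetical test tone: 50 Hz fundamental plus two weaker harmonics.
fs, n, f0 = 512, 512, 50
tone = [sum(math.sin(2 * math.pi * h * f0 * t / fs) / h for h in (1, 2, 3))
        for t in range(n)]
pitch = hps_pitch(tone, fs)
print(pitch)
```

    In practice an FFT would replace the quadratic-time DFT; the naive version is used here only to keep the sketch free of external dependencies.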

  10. Clinical study of quantitative diagnosis of early cervical cancer based on the classification of acetowhitening kinetics

    Science.gov (United States)

    Wu, Tao; Cheung, Tak-Hong; Yim, So-Fan; Qu, Jianan Y.

    2010-03-01

    A quantitative colposcopic imaging system for the diagnosis of early cervical cancer is evaluated in a clinical study. This imaging technology based on 3-D active stereo vision and motion tracking extracts diagnostic information from the kinetics of acetowhitening process measured from the cervix of human subjects in vivo. Acetowhitening kinetics measured from 137 cervical sites of 57 subjects are analyzed and classified using multivariate statistical algorithms. Cross-validation methods are used to evaluate the performance of the diagnostic algorithms. The results show that an algorithm for screening precancer produced 95% sensitivity (SE) and 96% specificity (SP) for discriminating normal and human papillomavirus (HPV)-infected tissues from cervical intraepithelial neoplasia (CIN) lesions. For a diagnostic algorithm, 91% SE and 90% SP are achieved for discriminating normal tissue, HPV infected tissue, and low-grade CIN lesions from high-grade CIN lesions. The results demonstrate that the quantitative colposcopic imaging system could provide objective screening and diagnostic information for early detection of cervical cancer.

  11. Constructing Support Vector Machine Ensembles for Cancer Classification Based on Proteomic Profiling

    Institute of Scientific and Technical Information of China (English)

    Yong Mao; Xiao-Bo Zhou; Dao-Ying Pi; You-Xian Sun

    2005-01-01

    In this study, we present a constructive algorithm for training cooperative support vector machine ensembles (CSVMEs). CSVME combines ensemble architecture design with cooperative training for individual SVMs in ensembles. Unlike most previous studies on training ensembles, CSVME puts emphasis on both accuracy and collaboration among individual SVMs in an ensemble. A group of SVMs selected on the basis of recursive classifier elimination is used in CSVME, and the number of the individual SVMs selected to construct CSVME is determined by 10-fold cross-validation. This kind of SVME has been tested on two ovarian cancer datasets previously obtained by proteomic mass spectrometry. By combining several individual SVMs, the proposed method achieves better performance than the SVME of all base SVMs.

  12. Identification of immune cell infiltration in hematoxylin-eosin stained breast cancer samples: texture-based classification of tissue morphologies

    Science.gov (United States)

    Turkki, Riku; Linder, Nina; Kovanen, Panu E.; Pellinen, Teijo; Lundin, Johan

    2016-03-01

    The characteristics of immune cells in the tumor microenvironment of breast cancer capture clinically important information. Despite the heterogeneity of tumor-infiltrating immune cells, it has been shown that the degree of infiltration assessed by visual evaluation of hematoxylin-eosin (H&E) stained samples has prognostic and possibly predictive value. However, quantification of the infiltration in H&E-stained tissue samples is currently dependent on visual scoring by an expert. Computer vision enables automated characterization of the components of the tumor microenvironment, and texture-based methods have successfully been used to discriminate between different tissue morphologies and cell phenotypes. In this study, we evaluate whether local binary pattern texture features with superpixel segmentation and classification with a support vector machine can be utilized to identify immune cell infiltration in H&E-stained breast cancer samples. Guided by the pan-leukocyte CD45 marker, we annotated training and test sets from 20 primary breast cancer samples. In the training set of arbitrarily sized image regions (n=1,116), a 3-fold cross-validation resulted in 98% accuracy and an area under the receiver-operating characteristic curve (AUC) of 0.98 to discriminate between immune cell-rich and -poor areas. In the test set (n=204), we achieved an accuracy of 96% and AUC of 0.99 to label cropped tissue regions correctly into immune cell-rich and -poor categories. The obtained results demonstrate strong discrimination between immune cell-rich and -poor tissue morphologies. The proposed method can provide a quantitative measurement of the degree of immune cell infiltration and can be applied to digitally scanned H&E-stained breast cancer samples for diagnostic purposes.

  13. Accurate molecular classification of cancer using simple rules

    Directory of Open Access Journals (Sweden)

    Gotoh Osamu

    2009-10-01

    Full Text Available Background: One intractable problem with using microarray data analysis for cancer classification is how to reduce the extremely high-dimensionality gene feature data to remove the effects of noise. Feature selection is often used to address this problem by selecting informative genes from among thousands or tens of thousands of genes. However, most of the existing methods of microarray-based cancer classification utilize too many genes to achieve accurate classification, which often hampers the interpretability of the models. For a better understanding of the classification results, it is desirable to develop simpler rule-based models with as few marker genes as possible. Methods: We screened a small number of informative single genes and gene pairs on the basis of their depended degrees proposed in rough sets. Applying the decision rules induced by the selected genes or gene pairs, we constructed cancer classifiers. We tested the efficacy of the classifiers by leave-one-out cross-validation (LOOCV) of training sets and classification of independent test sets. Results: We applied our methods to five cancerous gene expression datasets: leukemia (acute lymphoblastic leukemia [ALL] vs. acute myeloid leukemia [AML]), lung cancer, prostate cancer, breast cancer, and leukemia (ALL vs. mixed-lineage leukemia [MLL] vs. AML). Accurate classification outcomes were obtained by utilizing just one or two genes. Some genes that correlated closely with the pathogenesis of relevant cancers were identified. In terms of both classification performance and algorithm simplicity, our approach outperformed or at least matched existing methods. Conclusion: In cancerous gene expression datasets, a small number of genes, even one or two if selected correctly, is capable of achieving an ideal cancer classification effect. This finding also means that very simple rules may perform well for cancerous class prediction.

  14. Gene selection and classification for cancer microarray data based on machine learning and similarity measures

    Directory of Open Access Journals (Sweden)

    Liu Qingzhong

    2011-12-01

    Full Text Available Background: Microarray data have a high dimension of variables and a small sample size. In microarray data analyses, two important issues are how to choose genes, which provide reliable and good prediction for disease status, and how to determine the final gene set that is best for classification. Associations among genetic markers mean one can exploit information redundancy to potentially reduce classification cost in terms of time and money. Results: To deal with redundant information and improve classification, we propose a gene selection method, Recursive Feature Addition, which combines supervised learning and statistical similarity measures. To determine the final optimal gene set for prediction and classification, we propose an algorithm, Lagging Prediction Peephole Optimization. By using six benchmark microarray gene expression data sets, we compared Recursive Feature Addition with recently developed gene selection methods: Support Vector Machine Recursive Feature Elimination, Leave-One-Out Calculation Sequential Forward Selection and several others. Conclusions: On average, with the use of popular learning machines including Nearest Mean Scaled Classifier, Support Vector Machine, Naive Bayes Classifier and Random Forest, Recursive Feature Addition outperformed other methods. Our studies also showed that Lagging Prediction Peephole Optimization is superior to random strategy; Recursive Feature Addition with Lagging Prediction Peephole Optimization obtained better testing accuracies than the gene selection method varSelRF.

  15. Is cancer a disease that can be cured? An answer based on a new classification of diseases

    CERN Document Server

    Richmond, Peter

    2016-01-01

    Is cancer a disease that can be cured or a degenerative disease which comes predominantly with old age? We give an answer based on a two-dimensional representation of diseases. These two dimensions are defined as follows. In mortality curves there is an age, namely a_c = 10 years, which plays a crucial role in the sense that the mortality rate decreases in the interval I1=(a<a_c) and increases in the interval I2=(a>a_c). The respective trends in I1 and I2 are the two parameters used in our classification of diseases. Within the framework of reliability analysis, I1 and I2 would be referred to as the "burn-in" and "wear-out" phases. This leads us to define three broad groups of diseases. (AS1) Asymmetry with prevalence of I1. (AS2) Asymmetry with prevalence of I2. (S) Symmetry, with I1 and I2 both playing roles of comparable importance. Not surprisingly, among AS1-cases one finds all diseases due to congenital malformations. In the AS2-class one finds degenerative diseases, e.g. Alzheimer's disease. Among S-cases one finds most diseases due to external p...

  16. Novelty detection for breast cancer image classification

    Science.gov (United States)

    Cichosz, Pawel; Jagodziński, Dariusz; Matysiewicz, Mateusz; Neumann, Łukasz; Nowak, Robert M.; Okuniewski, Rafał; Oleszkiewicz, Witold

    2016-09-01

    Using classification learning algorithms for medical applications may require not only refined model creation techniques and careful unbiased model evaluation, but also detecting the risk of misclassification at the time of model application. This is addressed by novelty detection, which identifies instances for which the training set is not sufficiently representative and for which it may be safer to refrain from classification and request a human expert diagnosis. The paper investigates two techniques for isolated instance identification, based on clustering and one-class support vector machines, which represent two different approaches to multidimensional outlier detection. The prediction quality for isolated instances in breast cancer image data is evaluated using the random forest algorithm and found to be substantially inferior to the prediction quality for non-isolated instances. Each of the two techniques is then used to create a novelty detection model which can be combined with a classification model and used at the time of prediction to detect instances for which the latter cannot be reliably applied. Novelty detection is demonstrated to improve random forest prediction quality and argued to deserve further investigation in medical applications.

  17. Machine learning-based receiver operating characteristic (ROC) curves for crisp and fuzzy classification of DNA microarrays in cancer research.

    Science.gov (United States)

    Peterson, Leif E; Coleman, Matthew A

    2008-01-01

    Receiver operating characteristic (ROC) curves were generated to obtain classification area under the curve (AUC) as a function of feature standardization, fuzzification, and sample size from nine large sets of cancer-related DNA microarrays. Classifiers used included k nearest neighbor (kNN), naïve Bayes classifier (NBC), linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), learning vector quantization (LVQ1), logistic regression (LOG), polytomous logistic regression (PLOG), artificial neural networks (ANN), particle swarm optimization (PSO), constricted particle swarm optimization (CPSO), kernel regression (RBF), radial basis function networks (RBFN), gradient descent support vector machines (SVMGD), and least squares support vector machines (SVMLS). For each data set, AUC was determined for a number of combinations of sample size, total sum[-log(p)] of feature t-tests, with and without feature standardization and with (fuzzy) and without (crisp) fuzzification of features. Altogether, a total of 2,123,530 classification runs were made. At the greatest level of sample size, ANN resulted in a fitted AUC of 90%, while PSO resulted in the lowest fitted AUC of 72.1%. AUC values derived from 4NN were the most dependent on sample size, while PSO was the least. ANN depended the most on total statistical significance of features used based on sum[-log(p)], whereas PSO was the least dependent. Standardization of features increased AUC by 8.1% for PSO and -0.2% for QDA, while fuzzification increased AUC by 9.4% for PSO and reduced AUC by 3.8% for QDA. AUC determination in planned microarray experiments without standardization and fuzzification of features will benefit the most if CPSO is used for lower levels of feature significance (i.e., sum[-log(p)] ~ 50) and ANN is used for greater levels of significance (i.e., sum[-log(p)] ~ 500). When only standardization of features is performed, studies are likely to benefit most by using CPSO for low levels
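
    For readers less familiar with the AUC figures quoted above, the area under the ROC curve equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. A minimal Python sketch of that pairwise definition follows; the scores and labels are hypothetical and unrelated to the study's data.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random positive outscores a random
    negative (ties count one half); equal to the area under the ROC curve."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical classifier scores with true labels (1 = positive class).
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, 0, 1, 0, 0]
auc = roc_auc(scores, labels)
print(f"AUC = {auc:.3f}")
```

    The pairwise form is quadratic in the number of samples; rank-based formulas give the same value more efficiently on large sets.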

  18. Magnetic resonance imaging texture analysis classification of primary breast cancer

    Energy Technology Data Exchange (ETDEWEB)

    Waugh, S.A.; Lerski, R.A. [Ninewells Hospital and Medical School, Department of Medical Physics, Dundee (United Kingdom); Purdie, C.A.; Jordan, L.B. [Ninewells Hospital and Medical School, Department of Pathology, Dundee (United Kingdom); Vinnicombe, S. [University of Dundee, Division of Imaging and Technology, Ninewells Hospital and Medical School, Dundee (United Kingdom); Martin, P. [Ninewells Hospital and Medical School, Department of Clinical Radiology, Dundee (United Kingdom); Thompson, A.M. [University of Texas MD Anderson Cancer Center, Department of Surgical Oncology, Houston, TX (United States)

    2016-02-15

    Patient-tailored treatments for breast cancer are based on histological and immunohistochemical (IHC) subtypes. Magnetic Resonance Imaging (MRI) texture analysis (TA) may be useful in non-invasive lesion subtype classification. Women with newly diagnosed primary breast cancer underwent pre-treatment dynamic contrast-enhanced breast MRI. TA was performed using co-occurrence matrix (COM) features, by creating a model on retrospective training data, then prospectively applying it to a test set. Analyses were blinded to breast pathology. Subtype classifications were performed using a cross-validated k-nearest-neighbour (k = 3) technique, with accuracy relative to pathology assessed and the area under the receiver operating characteristic curve (AUROC) calculated. Mann-Whitney U and Kruskal-Wallis tests were used to assess raw entropy feature values. Histological subtype classifications were similar across training (n = 148 cancers) and test sets (n = 73 lesions) using all COM features (training: 75%, AUROC = 0.816; test: 72.5%, AUROC = 0.823). Entropy features were significantly different between lobular and ductal cancers (p < 0.001; Mann-Whitney U). IHC classifications using COM features were also similar for training and test data (training: 57.2%, AUROC = 0.754; test: 57.0%, AUROC = 0.750). Hormone receptor positive and negative cancers demonstrated significantly different entropy features. Entropy features alone were unable to create a robust classification model. Textural differences on contrast-enhanced MR images may reflect underlying lesion subtypes, which merits testing against treatment response.
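
    To make the co-occurrence matrix entropy feature concrete, the Python sketch below builds a normalised grey-level co-occurrence matrix for a single pixel offset and computes its Shannon entropy. The offset, the base-2 logarithm and the tiny example images are assumptions for illustration; the abstract does not state which offsets, grey-level quantisation or software the study used.

```python
import math
from collections import Counter

def glcm(image, dx=1, dy=0):
    """Normalised grey-level co-occurrence matrix for one pixel offset,
    returned as {(level_a, level_b): probability}."""
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}

def glcm_entropy(image, dx=1, dy=0):
    """Shannon entropy (in bits) of the co-occurrence distribution."""
    return -sum(p * math.log2(p) for p in glcm(image, dx, dy).values())

# Hypothetical one-row images with two grey levels.
flat = [[0, 0, 0, 0]]  # uniform texture: a single co-occurrence pair
edge = [[0, 0, 1, 1]]  # one edge: three equally likely pairs
print(glcm_entropy(flat), glcm_entropy(edge))
```

    A uniform region gives zero entropy, while more heterogeneous textures spread probability over more grey-level pairs and raise it.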

  19. Multi-label classification for colon cancer using histopathological images.

    Science.gov (United States)

    Xu, Yan; Jiao, Liping; Wang, Siyu; Wei, Junsheng; Fan, Yubo; Lai, Maode; Chang, Eric I-Chao

    2013-12-01

    Colon cancer classification has significant guidance value in clinical diagnosis and medical prognosis. Classifying colon cancers with high accuracy is the premise of efficient treatment. Our task is to build a system for colon cancer detection and classification based on slide histopathological images. Some earlier studies focus on single-label classification. Through analyzing a large number of colon cancer images, we found that one image may contain cancer regions of multiple types. Therefore, we reformulated the task as a multi-label problem. Four kinds of features (Color Histogram, Gray-Level Co-occurrence Matrix, Histogram of Oriented Gradients and Euler number) were introduced to compose our discriminative feature set, extracted from our dataset that includes six single categories and four multi-label categories. In order to evaluate the performance and make a comparison with our multi-label model, three commonly used multi-classification methods were designed in our experiment, including one-against-all SVM (OAA), one-against-one SVM (OAO) and multi-structure SVM. Four indicators (Precision, Recall, F-measure, and Accuracy) under 3-fold cross-validation were used to validate the performance of our approach. Experimental results show that the precision, recall and F-measure of the multi-label method are 73.7%, 68.2%, and 70.8% with all features, which are higher than those of the other three classifiers. These results demonstrate the effectiveness and efficiency of our method for colon histopathological image analysis.
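
    The three percentages quoted above are consistent with the usual definition of the F-measure as the harmonic mean of precision and recall. A short Python check, assuming the balanced form (beta = 1), which the abstract does not state explicitly:

```python
def f_measure(precision, recall, beta=1.0):
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Precision and recall as reported in the abstract.
f1 = f_measure(0.737, 0.682)
print(f"F-measure = {f1:.3f}")  # prints 0.708, matching the reported 70.8%
```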

  20. Influence of nuclei segmentation on breast cancer malignancy classification

    Science.gov (United States)

    Jelen, Lukasz; Fevens, Thomas; Krzyzak, Adam

    2009-02-01

    Breast cancer is one of the most deadly cancers affecting middle-aged women. Accurate diagnosis and prognosis are crucial to reduce the high death rate. Nowadays there are numerous diagnostic tools for breast cancer diagnosis. In this paper we discuss the role of nuclear segmentation from fine needle aspiration biopsy (FNA) slides and its influence on malignancy classification. Classification of malignancy plays a very important role during the diagnosis process of breast cancer. Out of all cancer diagnostic tools, FNA slides provide the most valuable information about the cancer malignancy grade, which helps in choosing an appropriate treatment. This process involves assessing numerous nuclear features, and therefore precise segmentation of nuclei is very important. In this work we compare three powerful segmentation approaches and test their impact on the classification of breast cancer malignancy. The studied approaches involve level set segmentation, fuzzy c-means segmentation and textural segmentation based on the co-occurrence matrix. Segmented nuclei were used to extract nuclear features for malignancy classification. For classification purposes, four different classifiers were trained and tested with the previously extracted features. The compared classifiers are Multilayer Perceptron (MLP), Self-Organizing Maps (SOM), Principal Component-based Neural Network (PCA) and Support Vector Machines (SVM). The presented results show that level set segmentation yields the best results of the three compared approaches and leads to good feature extraction, with the lowest average error rate of 6.51% over the four classifiers. The best performance was recorded for the multilayer perceptron, with an error rate of 3.07% using fuzzy c-means segmentation.

  1. Automated Breast Cancer Diagnosis based on GVF-Snake Segmentation, Wavelet Features Extraction and Neural Network Classification

    Directory of Open Access Journals (Sweden)

    Abderrahim Sebri

    2007-01-01

    Full Text Available Breast cancer accounts for the second most cancer diagnoses among women and the second most cancer deaths in the world. In fact, more than 11000 women die each year, all over the world, because of this disease. Automatic breast cancer diagnosis is a very important goal of medical informatics research. Some research has been oriented toward automating diagnosis at the mammographic stage, while other work has treated the problem at the cytological stage. In this work, we describe the current state of the ongoing BC automated diagnosis research program. It is a software system that provides expert diagnosis of breast cancer based on three steps of cytological image analysis. The first step is segmentation using an active contour for cell tracking and isolation of the nucleus in the studied image. Then, textural features are extracted from this nucleus using wavelet transforms to characterize the image by its texture, so that malignant texture can be differentiated from benign texture on the assumption that tumoral texture is different from the texture of other kinds of tissues. Finally, the obtained features are used as the input vector of a Multi-Layer Perceptron (MLP) to classify the images as malignant or benign.

  2. Biogeography based Satellite Image Classification

    CERN Document Server

    Panchal, V K; Kaur, Navdeep; Kundra, Harish

    2009-01-01

    Biogeography is the study of the geographical distribution of biological organisms. The mindset of the engineer is that we can learn from nature. Biogeography Based Optimization (BBO) is a burgeoning nature-inspired technique for finding the optimal solution of a problem. Satellite image classification is an important task because it is the only way we can know about the land cover map of inaccessible areas. Though satellite images have been classified in the past using various techniques, researchers are always looking for alternative strategies for satellite image classification so that they may be prepared to select the most appropriate technique for the feature extraction task at hand. This paper is focused on classification of the satellite image of a particular land cover using the theory of Biogeography Based Optimization. The original BBO algorithm does not have the inbuilt property of clustering which is required during image classification. Hence modifications have been proposed to the original algorithm and...

  3. Reverse phase protein array based tumor profiling identifies a biomarker signature for risk classification of hormone receptor-positive breast cancer

    Directory of Open Access Journals (Sweden)

    Johanna Sonntag

    2014-03-01

    A robust subclassification of luminal breast cancer, the most common molecular subtype of human breast cancer, is crucial for therapy decisions. While some patients are at higher risk of recurrence and require chemo-endocrine treatment, others are at lower risk and also respond poorly to chemotherapeutic regimens. To approximate the risk of cancer recurrence, clinical guidelines recommend determining histologic grading and abundance of a cell proliferation marker in tumor specimens. However, this approach assigns an intermediate risk to a substantial number of patients and in addition suffers from a high interobserver variability. Therefore, the aim of our study was to identify a quantitative protein biomarker signature to facilitate risk classification. Reverse phase protein arrays (RPPA) were used to obtain quantitative expression data for 128 breast cancer relevant proteins in a set of hormone receptor-positive tumors (n = 109). Proteomic data for the subset of histologic G1 (n = 14) and G3 (n = 22) samples were used for biomarker discovery, serving as surrogates of low and high recurrence risk, respectively. A novel biomarker selection workflow based on combining three different classification methods identified caveolin-1, NDKA, RPS6, and Ki-67 as top candidates. NDKA, RPS6, and Ki-67 were expressed at elevated levels in high risk tumors whereas caveolin-1 was observed as downregulated. The identified biomarker signature was subsequently analyzed using an independent test set (AUC = 0.78). Further evaluation of the identified biomarker panel by Western blot and mRNA profiling confirmed the proteomic signature obtained by RPPA. In conclusion, the biomarker signature introduced supports RPPA as a tool for cancer biomarker discovery.

  4. Naïve Bayes QSDR classification based on spiral-graph Shannon entropies for protein biomarkers in human colon cancer.

    Science.gov (United States)

    Aguiar-Pulido, Vanessa; Munteanu, Cristian R; Seoane, José A; Fernández-Blanco, Enrique; Pérez-Montoto, Lázaro G; González-Díaz, Humberto; Dorado, Julián

    2012-06-01

    Fast cancer diagnosis represents a real necessity in applied medicine due to the importance of this disease. Thus, theoretical models can help as prediction tools. Graph theory representation is one option because it permits us to numerically describe any real system such as the protein macromolecules by transforming real properties into molecular graph topological indices. This study proposes a new classification model for proteins linked with human colon cancer by using spiral graph topological indices of protein amino acid sequences. The best quantitative structure-disease relationship model is based on eleven Shannon entropy indices. It was obtained with the Naïve Bayes method and shows excellent predictive ability (90.92%) for new proteins linked with this type of cancer. The statistical analysis confirms that this model allows diagnosing the absence of human colon cancer obtaining an area under receiver operating characteristic of 0.91. The methodology presented can be used for any type of sequential information such as any protein and nucleic acid sequence.

  5. Diagnostic Classification of Normal Persons and Cancer Patients by Using Neural Network Based on Trace Metal Contents in Serum Samples

    Institute of Scientific and Technical Information of China (English)

    ZHANG; Zhuo-yong

    2001-01-01

  6. Classification of Rat FTIR Colon Cancer Data Using Wavelets and BPNN

    Institute of Scientific and Technical Information of China (English)

    CHENG,Cungui; XIONG,Wei; TIAN,Yumei

    2009-01-01

    A feature extraction method based on wavelets for horizontal attenuated total reflectance Fourier transform infrared spectroscopy (HATR-FTIR), together with cancer classification using an artificial neural network trained with the back-propagation algorithm, is presented. The FTIR data collected from 36 normal Sprague-Dawley (SD) rats, 60 1,2-DMH-induced SD rats, and 44 second-generation rats of those induced rats were first preprocessed. Then, 12 feature variants were extracted using continuous wavelet analysis. Based on BPNN classification, all spectra were classified into two categories: normal and abnormal ones. The accuracy values of identifying normal, dysplastic, early carcinoma, and advanced carcinoma were 100%, 94%, 97.5%, and 100%, respectively. This result indicated that FTIR with continuous wavelet transform (CWT) and the back-propagation neural network (BPNN) could effectively and easily diagnose colon cancer in its early stages.

  7. Clinicopathological classification and individualized treatment of breast cancer

    Institute of Scientific and Technical Information of China (English)

    HU Hui; LIU Yin-hua; XU Ling; ZHAO Jian-xin; DUAN Xue-ning; YE Jing-ming; LI Ting

    2013-01-01

    Background The clinicopathological classification was proposed in the St. Gallen Consensus Report 2011. We conducted a retrospective analysis of breast cancer subtypes, tumor-nodal-metastatic (TNM) staging, and histopathological grade to investigate the value of these parameters in the treatment strategies of invasive breast cancer. Methods A retrospective analysis of breast cancer subtypes, TNM staging, and histopathological grading of 213 cases has been performed by the methods recommended in the St. Gallen International Expert Consensus Report 2011. The estrogen receptor (ER), progesterone receptor (PR), human epidermal growth factor receptor-2 (HER2), and Ki-67 of 213 tumor samples have been investigated by immunohistochemistry according to methods for classifying breast cancer subtypes proposed in the St. Gallen Consensus Report 2011. Results The luminal A subtype was found in 53 patients (24.9%), the luminal B subtype was found in 112 patients (52.6%), the HER2-positive subtype was found in 22 patients (10.3%), and the triple-negative subtype was found in 26 patients (12%). Histopathological grade and TNM staging differed significantly among the four subtypes of breast cancer (P<0.001). Conclusion It is important to consider TNM staging and histopathological grading in the treatment strategies of breast cancer based on the current clinicopathological classification methods.
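
    The subtype shares reported in this abstract can be rechecked from the counts. The following is a minimal sketch in Python that uses only the four counts and the cohort size of 213 stated above; the variable names are illustrative and not taken from the paper.

```python
# Counts per subtype as reported in the abstract (213 cases in total).
counts = {
    "luminal A": 53,
    "luminal B": 112,
    "HER2-positive": 22,
    "triple-negative": 26,
}
total = sum(counts.values())

# Percentage share of each subtype, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

for name, pct in shares.items():
    print(f"{name}: {counts[name]}/{total} = {pct}%")
```

    The four counts add up to the 213 cases, and the triple-negative share comes out at 12.2%, which the abstract reports rounded to 12%.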

  8. MORPHOLOGICAL CLASSIFICATION OF RENAL-CANCER

    NARCIS (Netherlands)

    STORKEL, S; VANDENBERG, E

    1995-01-01

    The current classification of renal-cell adenomas (RCAs) and carcinomas (RCCs) is based on eight basic cell and tumor types (entities) with characteristic morphologic features: (1) RCCs of clear-cell type, (2) RCAs/RCCs of chromophilic-cell type, (3) RCAs/RCCs of chromophobic-cell type, (4) RCCs of

  9. Classification of Cancer Recurrence with Alpha-Beta BAM

    Directory of Open Access Journals (Sweden)

    María Elena Acevedo

    2009-01-01

    Bidirectional Associative Memories (BAMs) based on the first model proposed by Kosko do not have perfect recall of the training set, and their algorithm must iterate until it reaches a stable state. In this work, we use the Alpha-Beta BAM model to automatically classify cancer recurrence in female patients with a previous breast cancer surgery. Alpha-Beta BAM presents perfect recall of all the training patterns and has a one-shot algorithm; these advantages make Alpha-Beta BAM a suitable tool for classification. We use data from the Haberman database, and the leave-one-out method was applied to analyze the performance of our model as a classifier. We obtain a classification rate of 99.98%.

  10. Computer aided decision support system for cervical cancer classification

    Science.gov (United States)

    Rahmadwati, Rahmadwati; Naghdy, Golshah; Ros, Montserrat; Todd, Catherine

    2012-10-01

    Conventional analysis of a cervical histology image, such as a Pap smear or a biopsy sample, is performed manually by an expert pathologist. This involves inspecting the sample for cellular level abnormalities and determining the spread of the abnormalities. Cancer is graded based on the spread of the abnormal cells. This is a tedious, subjective and time-consuming process with considerable variations in diagnosis between the experts. This paper presents a computer aided decision support system (CADSS) tool to help pathologists in their examination of cervical cancer biopsies. The main aim of the proposed CADSS is to identify abnormalities and quantify cancer grading in a systematic and repeatable manner. The paper proposes three different methods and presents and compares their results using 475 images of cervical biopsies which include normal, three stages of pre-cancer, and malignant cases. This paper will explore various components of an effective CADSS: image acquisition, pre-processing, segmentation, feature extraction, classification, grading and disease identification. Cervical histological images are captured using a digital microscope. The images are captured in sufficient resolution to retain enough information for effective classification. Histology images of cervical biopsies consist of three major sections: background, stroma and squamous epithelium. Most diagnostic information is contained within the epithelium region. This paper will present two levels of segmentation: global (macro) and local (micro). At the global level the squamous epithelium is separated from the background and stroma. At the local or cellular level, the nuclei and cytoplasm are segmented for further analysis. Image features that influence the pathologists' decision during the analysis and classification of a cervical biopsy are the nuclei's shape and spread; the ratio of the areas of nuclei and cytoplasm as well as the texture and spread of the abnormalities

  11. Modulation classification based on spectrogram

    Institute of Scientific and Technical Information of China (English)

    2005-01-01

    The aim of modulation classification (MC) is to identify the modulation type of a communication signal. It plays an important role in many cooperative or noncooperative communication applications. Three spectrogram-based modulation classification methods are proposed. Their recognition scope and performance are investigated or evaluated by theoretical analysis and extensive simulation studies. The method taking moment-like features is robust to frequency offset, while the other two, which make use of principal component analysis (PCA) with different transformation inputs, can achieve satisfactory accuracy even at low SNR (as low as 2 dB). Due to the properties of the spectrogram, the statistical pattern recognition techniques, and the image preprocessing steps, all of our methods are insensitive to unknown phase and frequency offsets, timing errors, and the arriving sequence of symbols.

  12. Classification of breast cancer cytological specimen using convolutional neural network

    Science.gov (United States)

    Żejmo, Michał; Kowal, Marek; Korbicz, Józef; Monczak, Roman

    2017-01-01

    The paper presents a deep learning approach for automatic classification of breast tumors based on fine needle cytology. The main aim of the system is to distinguish benign from malignant cases based on microscopic images. The experiment was carried out on cytological samples derived from 50 patients (25 benign cases + 25 malignant cases) diagnosed at the Regional Hospital in Zielona Góra. To classify microscopic images, we used convolutional neural networks (CNN) of two types: GoogLeNet and AlexNet. Due to the very large size of images of cytological specimens (on average 200000 × 100000 pixels), they were divided into smaller patches of size 256 × 256 pixels. Breast cancer classification is usually based on morphometric features of nuclei. Therefore, training and validation patches were selected using a Support Vector Machine (SVM) so that a suitable amount of cell material was depicted. Neural classifiers were tuned using a GPU-accelerated implementation of the gradient descent algorithm. Training error was defined as a cross-entropy classification loss. Classification accuracy was defined as the percentage ratio of successfully classified validation patches to the total number of validation patches. The best accuracy rate of 83% was obtained by the GoogLeNet model. We observed that more misclassified patches belong to malignant cases.
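
    The patch division and the accuracy definition described in this abstract can be illustrated with a short Python sketch. It assumes non-overlapping tiling that discards partial border tiles, which the abstract does not specify; the function names and the toy labels are hypothetical, and the SVM-based patch selection and the CNNs themselves are not reproduced.

```python
def count_full_patches(width, height, patch=256):
    """Number of non-overlapping patch x patch tiles that fit entirely in the image."""
    return (width // patch) * (height // patch)


def accuracy_percent(predicted, actual):
    """Percentage of validation patches whose predicted label matches the true label."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


# Average slide size quoted in the abstract: 200000 x 100000 pixels.
n_patches = count_full_patches(200_000, 100_000)
print(n_patches)

# Toy labels (hypothetical): 1 = malignant, 0 = benign.
print(accuracy_percent([1, 0, 1, 1], [1, 0, 0, 1]))
```

    Under this tiling assumption a slide of average size yields several hundred thousand candidate patches, which is why a selection step is needed before training.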

  13. Network-Based Logistic Classification with an Enhanced L1/2 Solver Reveals Biomarker and Subnetwork Signatures for Diagnosing Lung Cancer

    Directory of Open Access Journals (Sweden)

    Hai-Hui Huang

    2015-01-01

    Identifying biomarkers and signaling pathways is a critical step in genomic studies, in which regularization is a widely used feature extraction approach. However, most regularizers are based on the L1-norm; their results are not good enough in terms of sparsity and interpretation and are asymptotically biased, especially in genomic research. Recently, a large amount of molecular interaction information about disease-related biological processes has been gained and gathered in various databases, which focus on many aspects of biological systems. In this paper, we use an enhanced L1/2 penalized solver to penalize a network-constrained logistic regression model, called the enhanced L1/2 net, where the predictors are based on gene-expression data with biologic network knowledge. Extensive simulation studies showed that our proposed approach outperforms L1 regularization, the old L1/2 penalized solver, and the Elastic net approaches in terms of classification accuracy and stability. Furthermore, we applied our method to lung cancer data analysis and found that it achieves higher predictive accuracy than L1 regularization, the old L1/2 penalized solver, and the Elastic net approaches, while selecting fewer but informative biomarkers and pathways.
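
    Why an L1/2 penalty favors sparser solutions than an L1 penalty can be seen on a two-coefficient toy example. The Python sketch below compares only the two penalty terms; it leaves out the regularization weight, the network constraint and the solver described in the abstract, and the function names and coefficient vectors are hypothetical.

```python
def l1_penalty(beta):
    """Sum of absolute coefficient values."""
    return sum(abs(b) for b in beta)


def l_half_penalty(beta):
    """Sum of square roots of absolute coefficient values (the L1/2 penalty term)."""
    return sum(abs(b) ** 0.5 for b in beta)


sparse = [1.0, 0.0]   # one active coefficient
dense = [0.5, 0.5]    # same L1 norm spread over two coefficients

print(l1_penalty(sparse), l1_penalty(dense))
print(l_half_penalty(sparse), l_half_penalty(dense))
```

    Both vectors cost the same under L1, but the dense vector costs more under L1/2, so the L1/2 term pushes weight onto fewer predictors.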

  14. Classification based polynomial image interpolation

    Science.gov (United States)

    Lenke, Sebastian; Schröder, Hartmut

    2008-02-01

    Due to the fast migration of high resolution displays for home and office environments there is a strong demand for high quality picture scaling. This is caused on the one hand by large picture sizes and on the other hand by the enhanced visibility of picture artifacts on these displays [1]. There are many proposals for enhanced spatial interpolation adaptively matched to picture content such as edges. The drawback of these approaches is the normally integer and often limited interpolation factor. To achieve rational factors, there exist combinations of adaptive and non-adaptive linear filters, but due to the non-adaptive step the overall quality is notably limited. We present in this paper a content adaptive polyphase interpolation method which uses "offline" trained filter coefficients and an "online" linear filtering depending on a simple classification of the input situation. Furthermore we present a new approach to a content adaptive interpolation polynomial, which allows arbitrary polyphase interpolation factors at runtime and further improves the overall interpolation quality. The main goal of our new approach is to optimize interpolation quality by adapting higher order polynomials directly to the image content. In addition we derive filter constraints for enhanced picture quality. Furthermore we extend the classification based filtering to the temporal dimension in order to use it for an intermediate image interpolation.

  15. Molecular Classification of Gastric Cancer: A new paradigm

    Science.gov (United States)

    Shah, Manish A.; Khanin, Raya; Tang, Laura; Janjigian, Yelena Y.; Klimstra, David S.; Gerdes, Hans; Kelsen, David P.

    2011-01-01

    Purpose Gastric cancer may be subdivided into three distinct subtypes (proximal, diffuse, and distal gastric cancer) based on histopathologic and anatomic criteria. Each subtype is associated with unique epidemiology. Our aim is to test the hypothesis that these distinct gastric cancer subtypes may also be distinguished by gene expression analysis. Experimental Design Patients with localized gastric adenocarcinoma being screened for a phase II preoperative clinical trial (NCI 5917) underwent endoscopic biopsy for fresh tumor procurement. 4–6 targeted biopsies of the primary tumor were obtained. Macrodissection was performed to ensure >80% carcinoma in the sample. HG-U133A GeneChip (Affymetrix) was used for cDNA expression analysis, and all arrays were processed and analyzed using the Bioconductor R-package. Results Between November 2003 and January 2006, 57 patients were screened to identify 36 patients with localized gastric cancer who had adequate RNA for expression analysis. Using supervised analysis, we built a classifier to distinguish the three gastric cancer subtypes, successfully classifying each into tightly grouped clusters. Leave-one-out cross validation error was 0.14, suggesting that >85% of samples were classified correctly. Gene set analysis with the False Discovery Rate set at 0.25 identified several pathways that were differentially regulated when comparing each gastric cancer subtype to adjacent normal stomach. Conclusions Subtypes of gastric cancer that have epidemiologic and histologic distinction are also distinguished by gene expression data. These preliminary data suggest a new classification of gastric cancer with implications for improving our understanding of disease biology and identification of unique molecular drivers for each gastric cancer subtype. PMID:21430069
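
    Leave-one-out cross validation, as used in this abstract, holds out each sample once, trains on the rest, and reports the fraction of held-out samples that are misclassified. The Python sketch below shows the procedure with a nearest-centroid classifier, which is a stand-in chosen for brevity and not the classifier the authors built; the two-feature data are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, x):
    """Predict the label whose class mean is closest to x (squared Euclidean distance)."""
    sums, counts = {}, {}
    for features, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1

    def dist(label):
        centroid = [s / counts[label] for s in sums[label]]
        return sum((a - b) ** 2 for a, b in zip(x, centroid))

    return min(sums, key=dist)


def loocv_error(xs, ys):
    """Hold out each sample once, train on the rest, and return the fraction misclassified."""
    errors = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) != ys[i]:
            errors += 1
    return errors / len(xs)


# Toy two-feature data (hypothetical), three well-separated classes.
xs = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9], [9.9, 0.1], [10.1, 0.0]]
ys = ["proximal", "proximal", "diffuse", "diffuse", "distal", "distal"]
print(loocv_error(xs, ys))
```

    An error of 0.14 corresponds to an accuracy of 1 − 0.14 = 0.86, which matches the ">85%" stated above.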

  16. Automated classification of histopathology images of prostate cancer using a Bag-of-Words approach

    Science.gov (United States)

    Sanghavi, Foram M.; Agaian, Sos S.

    2016-05-01

    The goals of this paper are to (1) test the Computer Aided Classification of prostate cancer histopathology images based on the Bag-of-Words (BoW) approach, (2) evaluate the performance of the proposed method in classifying grades 3 and 4 against the results of the approach proposed by Khurd et al. in [9], and (3) classify the different grades of cancer, namely grades 0, 3, 4, and 5, using the proposed approach. The system performance is assessed using 132 prostate cancer histopathology images of different grades. The performance of the SURF features is also analyzed by comparing the results with SIFT features using different cluster sizes. The results show 90.15% accuracy in detection of prostate cancer images using SURF features with 75 clusters for k-means clustering. The results showed higher sensitivity for SURF based BoW classification compared to SIFT based BoW.

  17. Breast Cancer Classification From Histological Images with Multiple Features and Random Subspace Classifier Ensemble

    Science.gov (United States)

    Zhang, Yungang; Zhang, Bailing; Lu, Wenjin

    2011-06-01

    Histological images are important for the diagnosis of breast cancer. In this paper, we present a novel automatic breast cancer classification scheme based on histological images. The image features are extracted using the Curvelet Transform, statistics of the Gray Level Co-occurrence Matrix (GLCM) and Completed Local Binary Patterns (CLBP), respectively. The three different features are combined and used for classification. A classifier ensemble approach, called Random Subspace Ensemble (RSE), is used to select and aggregate a set of base neural network classifiers for classification. The proposed multiple features and random subspace ensemble offer a classification rate of 95.22% on a publicly available breast cancer image dataset, which compares favorably with the previously published result of 93.4%.
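
    One of the three feature types named in this abstract, the gray level co-occurrence matrix, counts how often pairs of gray levels occur at a fixed pixel offset. The Python sketch below builds a non-symmetric GLCM for a single horizontal offset and derives one common statistic (contrast). The offset, the choice of statistic and the toy image are assumptions, since the abstract does not say which GLCM settings the authors used.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count how often gray level i has gray level j at offset (dx, dy)."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def contrast(matrix):
    """GLCM contrast: squared gray-level difference weighted by pair probability."""
    total = sum(sum(row) for row in matrix)
    return sum(
        (i - j) ** 2 * count / total
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )


# Toy 3x3 image with gray levels 0..2 (hypothetical).
image = [
    [0, 0, 1],
    [1, 2, 2],
    [0, 1, 2],
]
m = glcm(image, levels=3)
print(m)
print(contrast(m))
```

    Statistics such as this one would then be concatenated with the other feature types before classification.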

  18. Cluster-based adaptive metric classification

    NARCIS (Netherlands)

    Giotis, Ioannis; Petkov, Nicolai

    2012-01-01

    Introducing an adaptive metric has been shown to improve the results of distance-based classification algorithms. Existing methods are often computationally intensive, either in the training or in the classification phase. We present a novel algorithm that we call Cluster-Based Adaptive Metric (CLAM) c

  19. Ontology-Based Classification System Development Methodology

    Directory of Open Access Journals (Sweden)

    Grabusts Peter

    2015-12-01

    The aim of the article is to analyse and develop an ontology-based classification system methodology that uses decision tree learning with statement propositionalized attributes. Classical decision tree learning algorithms, as well as decision tree learning with taxonomy and propositionalized attributes have been observed. Thus, domain ontology can be extracted from the data sets and can be used for data classification with the help of a decision tree. The use of ontology methods in decision tree-based classification systems has been researched. Using such methodologies, the classification accuracy in some cases can be improved.

  20. Mass spectrometry cancer data classification using wavelets and genetic algorithm.

    Science.gov (United States)

    Nguyen, Thanh; Nahavandi, Saeid; Creighton, Douglas; Khosravi, Abbas

    2015-12-21

    This paper introduces a hybrid feature extraction method applied to mass spectrometry (MS) data for cancer classification. Haar wavelets are employed to transform MS data into orthogonal wavelet coefficients. The most prominent discriminant wavelets are then selected by a genetic algorithm (GA) to form feature sets. The combination of wavelets and GA yields highly distinct feature sets that serve as inputs to classification algorithms. Experimental results show the robustness of the wavelet-GA approach and its clear advantage over competing methods. The proposed method can therefore be applied to cancer classification models that are useful as real clinical decision support systems for medical practitioners.
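
    The Haar transform mentioned in this abstract replaces each pair of neighboring values with a scaled sum and a scaled difference. The Python sketch below performs one level of the orthonormal Haar transform on a short list; the function name and the toy intensities are hypothetical, and the genetic-algorithm selection step is not shown.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform on an even-length signal.

    Returns (approximation, detail) coefficient lists, each half the input length.
    """
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


# Toy intensity values standing in for a mass spectrum (hypothetical).
spectrum = [4.0, 2.0, 5.0, 5.0]
approx, detail = haar_step(spectrum)
print(approx, detail)

# Orthogonality means the transform preserves the signal's energy.
energy_in = sum(v * v for v in spectrum)
energy_out = sum(v * v for v in approx + detail)
print(energy_in, energy_out)
```

    Applying the step again to the approximation coefficients gives the next, coarser level; a selection method can then pick among the resulting coefficients.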

  1. Has the new TNM classification for colorectal cancer improved care?

    NARCIS (Netherlands)

    Nagtegaal, I.D.; Quirke, P.; Schmoll, H.J.

    2012-01-01

    In 2009, the Union for International Cancer Control issued the seventh edition of the well-used T (tumor), N (node), and M (metastasis) classification guidelines. There has been a continual refinement of the staging for colorectal cancer since this system for assessing tumor stage was initially adop

  2. Visual words based approach for tissue classification in mammograms

    Science.gov (United States)

    Diamant, Idit; Goldberger, Jacob; Greenspan, Hayit

    2013-02-01

    The presence of microcalcifications (MC) is an important indicator for developing breast cancer. Additional indicators for cancer risk exist, such as breast tissue density type. Different methods have been developed for breast tissue classification for use in computer-aided diagnosis systems. Recently, the visual words (VW) model has been successfully applied to different classification tasks. The goal of our work is to explore VW based methodologies for various mammography classification tasks. We start with the challenge of classifying breast density and then focus on classification of normal tissue versus microcalcifications. The presented methodology is based on a patch-based visual words model, which includes building a dictionary for a training set using local descriptors and representing the image using a visual word histogram. Classification is then performed using k-nearest-neighbour (KNN) and support vector machine (SVM) classifiers. We tested our algorithm on the publicly available MIAS and DDSM datasets. The input is a representative region-of-interest per mammography image, manually selected and labelled by an expert. In the tissue density task, classification accuracy reached 85% using KNN and 88% using SVM, which competes with the state-of-the-art results. For MC vs. normal tissue, accuracy reached 95.6% using SVM. Results demonstrate the feasibility of classifying breast tissue using our model. Currently, we are improving the results further while also investigating the capability of VW on additional important mammogram classification problems. We expect that the methodology presented will enable high levels of classification, suggesting new means for automated tools for mammography diagnosis support.
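
    The visual words representation described in this abstract maps each local descriptor to its nearest dictionary entry and summarizes an image as a histogram of those assignments. The Python sketch below shows that step followed by a k-nearest-neighbour vote. The two-word dictionary is fixed by hand here, whereas in practice it would be learned from training descriptors; all descriptors, labels and function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def word_histogram(descriptors, dictionary):
    """Normalized histogram of nearest visual words over an image's patch descriptors."""
    counts = [0] * len(dictionary)
    for d in descriptors:
        nearest = min(range(len(dictionary)), key=lambda k: sq_dist(d, dictionary[k]))
        counts[nearest] += 1
    return [c / len(descriptors) for c in counts]


def knn_predict(train_hists, train_labels, hist, k=1):
    """Majority label among the k training histograms closest to hist."""
    order = sorted(range(len(train_hists)), key=lambda i: sq_dist(hist, train_hists[i]))
    votes = [train_labels[i] for i in order[:k]]
    return max(set(votes), key=votes.count)


# Hypothetical 2-D descriptors and a two-word dictionary.
dictionary = [[0.0, 0.0], [1.0, 1.0]]
train_hists = [
    word_histogram([[0.1, 0.0], [0.0, 0.2], [0.9, 1.0]], dictionary),
    word_histogram([[1.0, 0.9], [0.8, 1.1], [0.1, 0.1]], dictionary),
]
train_labels = ["normal", "MC"]

query = word_histogram([[0.9, 0.9], [1.1, 1.0], [1.0, 0.8]], dictionary)
print(query, knn_predict(train_hists, train_labels, query))
```

    The histogram has a fixed length regardless of how many patches an image contains, which is what lets standard classifiers such as KNN or SVM operate on it.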

  3. Sparse discriminant analysis for breast cancer biomarker identification and classification

    Institute of Scientific and Technical Information of China (English)

    Yu Shi; Daoqing Dai; Chaochun Liu; Hong Yan

    2009-01-01

    Biomarker identification and cancer classification are two important procedures in microarray data analysis. We propose a novel unified method to carry out both tasks. We first preselect biomarker candidates by eliminating unrelated genes through the BSS/WSS ratio filter to reduce computational cost, and then use a sparse discriminant analysis method for simultaneous biomarker identification and cancer classification. Moreover, we give a mathematical justification about automatic biomarker identification. Experimental results show that the proposed method can identify key genes that have been verified in biochemical or biomedical research and classify the breast cancer type correctly.
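
    The BSS/WSS filter named in this abstract scores each gene by the ratio of its between-class to within-class sum of squares, so genes whose expression separates the classes score high. The Python sketch below computes the ratio for one gene at a time; the function name, the labels and the expression values are hypothetical, and the cut-off used to discard genes is not given in the abstract.

```python
def bss_wss_ratio(values, labels):
    """Ratio of between-class to within-class sum of squares for one gene."""
    overall_mean = sum(values) / len(values)
    bss = wss = 0.0
    for cls in set(labels):
        members = [v for v, y in zip(values, labels) if y == cls]
        cls_mean = sum(members) / len(members)
        bss += len(members) * (cls_mean - overall_mean) ** 2
        wss += sum((v - cls_mean) ** 2 for v in members)
    return bss / wss


labels = ["A", "A", "B", "B"]
informative_gene = [1.0, 1.2, 3.0, 3.2]   # classes well separated
noisy_gene = [1.0, 3.0, 1.2, 3.2]         # classes overlap

print(bss_wss_ratio(informative_gene, labels))
print(bss_wss_ratio(noisy_gene, labels))
```

    Ranking genes by this score and keeping the top ones shrinks the input to the later, more expensive analysis. A gene with zero within-class variance would need special handling, which this sketch omits.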

  4. Classification of neuropathic pain in cancer patients

    DEFF Research Database (Denmark)

    Brunelli, Cinzia; Bennett, Michael I; Kaasa, Stein

    2014-01-01

    and on the relevance of patient-reported outcome (PRO) descriptors for the screening of NP in this population. An international group of 42 experts was invited to participate in a consensus process through a modified 2-round Internet-based Delphi survey. Relevant topics investigated were: peculiarities of NP...... in patients with cancer, IASP NeuPSIG diagnostic criteria adaptation and assessment, and standardized PRO assessment for NP screening. Median consensus scores (MED) and interquartile ranges (IQR) were calculated to measure expert consensus after both rounds. Twenty-nine experts answered, and good agreement...

  5. Classification based on CART algorithm for microarray data of lung cancer

    Institute of Scientific and Technical Information of China (English)

    陈磊; 刘毅慧

    2011-01-01

    Gene chip technology is an important research tool in genomics. Gene chip data (microarray data) are often high-dimensional, which makes dimensionality reduction a necessary step in microarray data analysis. This paper analyzes the lung cancer microarray data provided by Gavin J. Gordon et al. of Harvard Medical School. First, the t-test and the Wilcoxon rank-sum test are used for feature selection to reduce the dimensionality of the microarray data. Then, following the CART (Classification and Regression Tree) algorithm with the Gini index as the error function, a classification tree is grown on the selected features and pruned to find the tree of optimal size, in order to improve the generalization performance of the tree so that it adapts well to new samples. Experimental results show that the recognition rate of this method on the lung cancer microarray data exceeds 96% and is very stable; the method also yields easily understood classification rules and the key genes for classification.
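
    The role of the Gini index in growing a CART tree can be shown on a single feature: each candidate threshold is scored by the weighted impurity of the two groups it creates, and the lowest score wins. The Python sketch below covers only this split-selection step. The expression values, class names and function names are hypothetical, and the t-test or Wilcoxon feature selection and the pruning stage described above are not reproduced.

```python
def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def split_impurity(values, labels, threshold):
    """Weighted Gini impurity after splitting samples at value <= threshold."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    n = len(labels)
    return sum(len(part) / n * gini(part) for part in (left, right) if part)


def best_threshold(values, labels):
    """Candidate threshold (midpoint of adjacent sorted values) with the lowest impurity."""
    ordered = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]
    return min(candidates, key=lambda t: split_impurity(values, labels, t))


# Hypothetical expression levels of one selected gene and toy class labels.
expression = [0.2, 0.4, 0.5, 1.8, 2.1, 2.4]
classes = ["class1", "class1", "class1", "class2", "class2", "class2"]

t = best_threshold(expression, classes)
print(t, split_impurity(expression, classes, t))
```

    A full tree repeats this search over every selected gene at every node; the chosen thresholds are what make the resulting rules readable.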

  6. Evolving Cancer Classification in the Era of Personalized Medicine: A Primer for Radiologists

    Science.gov (United States)

    Jagannathan, Jyothi P.; Ramaiya, Nikhil H.

    2017-01-01

    Traditionally, tumors were classified based on anatomic location, but specific genetic mutations in cancers now lead to treatment of tumors with molecular targeted therapies. This has led to a paradigm shift in the classification and treatment of cancer. Tumors treated with molecular targeted therapies often show morphological changes rather than a change in size and are associated with class-specific and drug-specific toxicities, different from those encountered with conventional chemotherapeutic agents. It is important for radiologists to be familiar with the new cancer classification and the various treatment strategies employed, in order to communicate effectively and participate in multidisciplinary care. In this paper we will focus on lung cancer as a prototype of the new molecular classification.

  7. Which is the most suitable classification for colorectal cancer, log odds, the number or the ratio of positive lymph nodes?

    Directory of Open Access Journals (Sweden)

    Yong-Xi Song

    OBJECTIVE: The aim of the current study was to investigate which is the most suitable classification for colorectal cancer, the log odds of positive lymph nodes (LODDS) classification or the classifications based on the number of positive lymph nodes (pN) and the positive lymph node ratio (LNR), in a Chinese single institutional population. DESIGN: Clinicopathologic and prognostic data of 1297 patients with colorectal cancer were retrospectively studied. The log-rank statistics, Cox's proportional hazards model, the Nagelkerke R² index and Harrell's C statistic were used. RESULTS: Univariate and three-step multivariate analyses identified that LNR was a significant prognostic factor and that the LNR classification was superior to both the pN and LODDS classifications. Moreover, the Nagelkerke R² index (0.130) and Harrell's C statistic (0.707) of LNR showed that the LNR and LODDS classifications were similar and that LNR was slightly better than the other two classifications. Furthermore, for patients in each LNR classification, prognosis was homogeneous between those in different pN or LODDS classifications. However, for patients in the pN1a, pN1b, LODDS2 and LODDS3 classifications, significant differences in survival were observed among patients in different LNR classifications. CONCLUSIONS: For patients with colorectal cancer, the LNR classification is more suitable than the pN and LODDS classifications for prognostic assessment in a Chinese single institutional population.
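
    The three nodal measures compared in this abstract are all derived from two counts per patient: the number of positive lymph nodes and the number of nodes examined. The Python sketch below computes LNR and LODDS for two hypothetical patients. The abstract does not give the formulas or the cut-offs for the classes; the LODDS expression with 0.5 added to both counts is the commonly used definition and is an assumption here.

```python
import math


def lnr(positive, examined):
    """Lymph node ratio: positive nodes divided by nodes examined."""
    return positive / examined


def lodds(positive, examined):
    """Log odds of positive nodes, with 0.5 added to both counts to avoid log(0)."""
    negative = examined - positive
    return math.log((positive + 0.5) / (negative + 0.5))


# Two hypothetical patients with the same number of positive nodes (same pN input)
# but different numbers of nodes examined.
for positive, examined in [(2, 8), (2, 30)]:
    print(positive, examined, round(lnr(positive, examined), 3), round(lodds(positive, examined), 3))
```

    Both patients have the same positive-node count, yet LNR and LODDS differ because they also account for how many nodes were examined.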

  8. Contour classification in thermographic images for detection of breast cancer

    Science.gov (United States)

    Okuniewski, Rafał; Nowak, Robert M.; Cichosz, Paweł; Jagodziński, Dariusz; Matysiewicz, Mateusz; Neumann, Łukasz; Oleszkiewicz, Witold

    2016-09-01

    Thermographic images of the breast taken by the Braster device are uploaded into a web application which uses different classification algorithms to automatically decide whether a patient should be more thoroughly examined. This article presents an approach to the task of classifying contours visible on thermographic images of the breast taken by the Braster device in order to decide whether cancerous tumors are present in the breast. It presents the results of research conducted on the different classification algorithms.

  9. Classification of oral cancers using Raman spectroscopy of serum

    Science.gov (United States)

    Sahu, Aditi; Talathi, Sneha; Sawant, Sharada; Krishna, C. Murali

    2014-03-01

    Oral cancers are the sixth most common malignancy worldwide, with low 5-year disease free survival rates, attributable to late detection due to lack of reliable screening modalities. Our in vivo Raman spectroscopy studies have demonstrated classification of normal and tumor as well as cancer field effects (CFE), the earliest events in oral cancers. In view of limitations of this approach, such as the requirement of on-site instrumentation and stringent experimental conditions, the feasibility of classification of normal and cancer using serum was explored using 532 nm excitation. In that earlier study, strong resonance features of β-carotenes, present differentially in normal and pathological conditions, were observed. In the present study, Raman spectra of sera of 36 buccal mucosa, 33 tongue cancers and 17 healthy subjects were recorded using a Raman microprobe coupled with a 40X objective using 785 nm excitation, a known source of excitation for biomedical applications. To eliminate heterogeneity, the average of 3 spectra recorded from each sample was subjected to PC-LDA followed by leave-one-out cross-validation. Findings indicate an average classification efficiency of ~70% for normal and cancer. Buccal mucosa and tongue cancer serum could also be classified with an efficiency of ~68%. Of the two cancers, buccal mucosa cancer and normal could be classified with a higher efficiency. Findings of the study are quite comparable to those of our earlier study, which suggests that there exist significant differences, other than β-carotenes, between normal and cancerous samples which can be exploited for the classification. Prospectively, extensive validation studies will be undertaken to confirm the findings.

  10. Survival of non-seminomatous germ cell cancer patients according to the IGCC classification: An update based on meta-analysis.

    Science.gov (United States)

    van Dijk, Merel R; Steyerberg, Ewout W; Habbema, J Dik F

    2006-05-01

    The International Germ Cell Consensus (IGCC) Classification distinguishes patients with non-seminomatous germ cell tumours (NSGCT) with a good, intermediate or poor prognosis, with a reported 5-year overall survival of 92%, 80% and 48%, respectively. Since the IGCC classification was based on patients treated between 1975 and 1990, we aimed to investigate whether survival has improved for more recently treated patients. We did a systematic search of the literature and included studies on survival of patients with NSGCT, treated after 1989 and classified according to the IGCC classification. Survival estimates of selected studies were pooled using meta-analytic techniques. We included 10 papers, describing 1775 patients with NSGCT with good (n = 1087), intermediate (n = 232), or poor (n = 456) prognosis. Pooled 5-year survival estimates were 94%, 83% and 71%, respectively. Since the publication of the IGCC classification, there was a small increase in survival for good and intermediate prognosis patients, and a large increase in survival for patients with a poor prognosis. This increase is most likely due to both more effective treatment strategies and more experience in treating NSGCT patients.

  11. Novel approaches for the molecular classification of prostate cancer

    Institute of Scientific and Technical Information of China (English)

    Robert H. Getzenberg

    2010-01-01

    Among the urologic cancers, prostate cancer is by far the most common, and it appears to have the potential to affect almost all men throughout the world as they age. A number of studies have shown that many men with prostate cancer will not die from their disease, but rather with the disease and from other causes. These men have a form of prostate cancer that is described as "very low risk" and has often been called indolent. There is, however, a group of men who have a form of prostate cancer that is much more aggressive and life threatening. Unlike other cancer types, we have few tools to provide for the molecular classification of prostate cancer.

  12. Weakly supervised histopathology cancer image segmentation and classification.

    Science.gov (United States)

    Xu, Yan; Zhu, Jun-Yan; Chang, Eric I-Chao; Lai, Maode; Tu, Zhuowen

    2014-04-01

    Labeling a histopathology image as having cancerous regions or not is a critical task in cancer diagnosis; it is also clinically important to segment the cancer tissues and cluster them into various classes. Existing supervised approaches for image classification and segmentation require detailed manual annotations for the cancer pixels, which are time-consuming to obtain. In this paper, we propose a new learning method, multiple clustered instance learning (MCIL) (along the line of weakly supervised learning) for histopathology image segmentation. The proposed MCIL method simultaneously performs image-level classification (cancer vs. non-cancer image), medical image segmentation (cancer vs. non-cancer tissue), and patch-level clustering (different classes). We embed the clustering concept into the multiple instance learning (MIL) setting and derive a principled solution to performing the above three tasks in an integrated framework. In addition, we introduce contextual constraints as a prior for MCIL, which further reduces the ambiguity in MIL. Experimental results on histopathology colon cancer images and cytology images demonstrate the great advantage of MCIL over the competing methods.

  13. Cancer classification using the Immunoscore: a worldwide task force.

    Science.gov (United States)

    Galon, Jérôme; Pagès, Franck; Marincola, Francesco M; Angell, Helen K; Thurin, Magdalena; Lugli, Alessandro; Zlobec, Inti; Berger, Anne; Bifulco, Carlo; Botti, Gerardo; Tatangelo, Fabiana; Britten, Cedrik M; Kreiter, Sebastian; Chouchane, Lotfi; Delrio, Paolo; Arndt, Hartmann; Asslaber, Martin; Maio, Michele; Masucci, Giuseppe V; Mihm, Martin; Vidal-Vanaclocha, Fernando; Allison, James P; Gnjatic, Sacha; Hakansson, Leif; Huber, Christoph; Singh-Jasuja, Harpreet; Ottensmeier, Christian; Zwierzina, Heinz; Laghi, Luigi; Grizzi, Fabio; Ohashi, Pamela S; Shaw, Patricia A; Clarke, Blaise A; Wouters, Bradly G; Kawakami, Yutaka; Hazama, Shoichi; Okuno, Kiyotaka; Wang, Ena; O'Donnell-Tormey, Jill; Lagorce, Christine; Pawelec, Graham; Nishimura, Michael I; Hawkins, Robert; Lapointe, Réjean; Lundqvist, Andreas; Khleif, Samir N; Ogino, Shuji; Gibbs, Peter; Waring, Paul; Sato, Noriyuki; Torigoe, Toshihiko; Itoh, Kyogo; Patel, Prabhu S; Shukla, Shilin N; Palmqvist, Richard; Nagtegaal, Iris D; Wang, Yili; D'Arrigo, Corrado; Kopetz, Scott; Sinicrope, Frank A; Trinchieri, Giorgio; Gajewski, Thomas F; Ascierto, Paolo A; Fox, Bernard A

    2012-10-03

    Prediction of clinical outcome in cancer is usually achieved by histopathological evaluation of tissue samples obtained during surgical resection of the primary tumor. Traditional tumor staging (AJCC/UICC-TNM classification) summarizes data on tumor burden (T), presence of cancer cells in draining and regional lymph nodes (N) and evidence for metastases (M). However, it is now recognized that clinical outcome can significantly vary among patients within the same stage. The current classification provides limited prognostic information, and does not predict response to therapy. Recent literature has alluded to the importance of the host immune system in controlling tumor progression. Thus, evidence supports the notion to include immunological biomarkers, implemented as a tool for the prediction of prognosis and response to therapy. Accumulating data, collected from large cohorts of human cancers, has demonstrated the impact of immune-classification, which has a prognostic value that may add to the significance of the AJCC/UICC TNM-classification. It is therefore imperative to begin to incorporate the 'Immunoscore' into traditional classification, thus providing an essential prognostic and potentially predictive tool. Introduction of this parameter as a biomarker to classify cancers, as part of routine diagnostic and prognostic assessment of tumors, will facilitate clinical decision-making including rational stratification of patient treatment. Equally, the inherent complexity of quantitative immunohistochemistry, in conjunction with protocol variation across laboratories, analysis of different immune cell types, inconsistent region selection criteria, and variable ways to quantify immune infiltration, all underline the urgent requirement to reach assay harmonization. In an effort to promote the Immunoscore in routine clinical settings, an international task force was initiated. 
This review represents a follow-up of the announcement of this initiative, and of the J

  14. Ontology-Based Classification System Development Methodology

    OpenAIRE

    2015-01-01

    The aim of the article is to analyse and develop an ontology-based classification system methodology that uses decision tree learning with statement propositionalized attributes. Classical decision tree learning algorithms, as well as decision tree learning with taxonomy and propositionalized attributes have been observed. Thus, domain ontology can be extracted from the data sets and can be used for data classification with the help of a decision tree. The use of ontology methods in decision ...

  15. An Authentication Technique Based on Classification

    Institute of Scientific and Technical Information of China (English)

    李钢; 杨杰

    2004-01-01

    We present a novel watermarking approach based on classification for authentication, in which a watermark is embedded into the host image. When the marked image is modified, the extracted watermark also differs from the original watermark, and different kinds of modification lead to different extracted watermarks. In this paper, different kinds of modification are considered as classes, and we use a classification algorithm to recognize the modifications with high probability. Simulation results show that the proposed method is promising and effective.

  16. Cuckoo search optimisation for feature selection in cancer classification: a new approach.

    Science.gov (United States)

    Gunavathi, C; Premalatha, K

    2015-01-01

    Cuckoo Search (CS) optimisation algorithm is used for feature selection in cancer classification using microarray gene expression data. Since the gene expression data has thousands of genes and a small number of samples, feature selection methods can be used for the selection of informative genes to improve the classification accuracy. Initially, the genes are ranked based on T-statistics, Signal-to-Noise Ratio (SNR) and F-statistics values. The CS is used to find the informative genes from the top-m ranked genes. The classification accuracy of k-Nearest Neighbour (kNN) technique is used as the fitness function for CS. The proposed method is experimented and analysed with ten different cancer gene expression datasets. The results show that the CS gives 100% average accuracy for DLBCL Harvard, Lung Michigan, Ovarian Cancer, AML-ALL and Lung Harvard2 datasets and it outperforms the existing techniques in DLBCL outcome and prostate datasets.
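
    The initial ranking step described in this record can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a two-class problem and the commonly used signal-to-noise form (mean difference divided by the sum of the class standard deviations), and it omits the Cuckoo Search stage and the kNN fitness function. The names `snr` and `top_m_genes` are hypothetical.

```python
from statistics import mean, stdev


def snr(values, labels):
    """Signal-to-noise ratio of one gene across two classes labelled 0 and 1."""
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    denom = stdev(a) + stdev(b)  # each class needs at least two samples
    return (mean(a) - mean(b)) / denom if denom else 0.0


def top_m_genes(expr, labels, m):
    """Return the m gene names with the largest absolute SNR.

    expr maps a gene name to its expression values, one per sample.
    """
    scores = {gene: abs(snr(values, labels)) for gene, values in expr.items()}
    return sorted(scores, key=scores.get, reverse=True)[:m]


if __name__ == "__main__":
    labels = [0, 0, 0, 1, 1, 1]
    expr = {
        "g1": [1.0, 1.1, 0.9, 5.0, 5.1, 4.9],  # well separated
        "g2": [2.0, 3.0, 2.5, 2.4, 2.9, 2.1],  # uninformative
        "g3": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0],  # separated but noisy
    }
    print(top_m_genes(expr, labels, 2))
```

    A search method such as CS would then pick a subset from these top-m candidates, scoring each subset by classifier accuracy.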

  17. Distance-based features in pattern classification

    Directory of Open Access Journals (Sweden)

    Lin Wei-Yang

    2011-01-01

    Full Text Available Abstract In data mining and pattern classification, feature extraction and representation are a very important step since the extracted features have a direct and significant impact on the classification accuracy. In the literature, a number of novel feature extraction and representation methods have been proposed. However, many of them only focus on specific domain problems. In this article, we introduce a novel distance-based feature extraction method for various pattern classification problems. Specifically, two distances are extracted, which are based on (1) the distance between the data and its intra-cluster center and (2) the distance between the data and its extra-cluster centers. Experiments based on ten datasets containing different numbers of classes, samples, and dimensions are examined. The experimental results using naïve Bayes, k-NN, and SVM classifiers show that concatenating the original features provided by the datasets with the distance-based features can improve classification accuracy, except on image-related datasets. In particular, the distance-based features are suitable for datasets which have smaller numbers of classes, smaller numbers of samples, and lower feature dimensionality. Moreover, two datasets, which have similar characteristics, are further used to validate this finding. The result is consistent with the first experiment result that adding the distance-based features can improve the classification performance.
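
    The two distance features can be sketched as below. The abstract does not say how clusters are formed or how the extra-cluster distances are combined, so this sketch assumes that cluster centers are already available, that a sample's own cluster is the one with the nearest center, and that the extra-cluster feature is the sum of distances to all other centers. The names `distance_features` and `augment` are hypothetical.

```python
import math


def distance_features(x, centers):
    """Return (intra, extra) distances for one sample.

    intra: Euclidean distance to the nearest center (assumed to be the sample's own cluster).
    extra: sum of Euclidean distances to every other center (an assumed aggregation).
    """
    distances = [math.dist(x, c) for c in centers]
    intra = min(distances)
    extra = sum(distances) - intra
    return intra, extra


def augment(x, centers):
    """Concatenate the original features with the two distance-based features."""
    return list(x) + list(distance_features(x, centers))


if __name__ == "__main__":
    centers = [(0.0, 0.0), (3.0, 4.0)]
    print(augment((3.0, 0.0), centers))
```

    The augmented vectors would then be passed to any standard classifier in place of the original features.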

  18. Texture Classification based on Gabor Wavelet

    Directory of Open Access Journals (Sweden)

    Amandeep Kaur

    2012-07-01

    Full Text Available This paper presents a comparison of texture classification algorithms based on Gabor wavelets. The focus of this paper is on the feature extraction scheme for texture classification. The texture features of an image can be classified using texture descriptors. In this paper we have used the Homogeneous Texture Descriptor, which is based on Gabor wavelets. For texture classification, we have used an online texture database, Brodatz’s database, and three well-known classifiers: Support Vector Machine, K-nearest neighbor and decision tree induction. The results show that classification using Support Vector Machines gives better results compared to the other classifiers. It can accurately discriminate between testing image data and training data.

  19. Inventory classification based on decoupling points

    Directory of Open Access Journals (Sweden)

    Joakim Wikner

    2015-01-01

    Full Text Available The ideal state of continuous one-piece flow may never be achieved. Still the logistics manager can improve the flow by carefully positioning inventory to buffer against variations. Strategies such as lean, postponement, mass customization, and outsourcing all rely on strategic positioning of decoupling points to separate forecast-driven from customer-order-driven flows. Planning and scheduling of the flow are also based on classification of decoupling points as master scheduled or not. A comprehensive classification scheme for these types of decoupling points is introduced. The approach rests on identification of flows as being either demand based or supply based. The demand or supply is then combined with exogenous factors, classified as independent, or endogenous factors, classified as dependent. As a result, eight types of strategic as well as tactical decoupling points are identified resulting in a process-based framework for inventory classification that can be used for flow design.

  20. Definition and classification of cancer cachexia: an international consensus.

    Science.gov (United States)

    Fearon, Kenneth; Strasser, Florian; Anker, Stefan D; Bosaeus, Ingvar; Bruera, Eduardo; Fainsinger, Robin L; Jatoi, Aminah; Loprinzi, Charles; MacDonald, Neil; Mantovani, Giovanni; Davis, Mellar; Muscaritoli, Maurizio; Ottery, Faith; Radbruch, Lukas; Ravasco, Paula; Walsh, Declan; Wilcock, Andrew; Kaasa, Stein; Baracos, Vickie E

    2011-05-01

    To develop a framework for the definition and classification of cancer cachexia a panel of experts participated in a formal consensus process, including focus groups and two Delphi rounds. Cancer cachexia was defined as a multifactorial syndrome defined by an ongoing loss of skeletal muscle mass (with or without loss of fat mass) that cannot be fully reversed by conventional nutritional support and leads to progressive functional impairment. Its pathophysiology is characterised by a negative protein and energy balance driven by a variable combination of reduced food intake and abnormal metabolism. The agreed diagnostic criterion for cachexia was weight loss greater than 5%, or weight loss greater than 2% in individuals already showing depletion according to current bodyweight and height (body-mass index [BMI] definition and classification of cancer cachexia. After validation, this should aid clinical trial design, development of practice guidelines, and, eventually, routine clinical management.

  1. A CAD System for Identification and Classification of Breast Cancer Tumors in DCE-MR Images Based on Hierarchical Convolutional Neural Networks

    Directory of Open Access Journals (Sweden)

    Reza Rastiboroujeni

    2015-06-01

    Full Text Available In this paper, we propose a computer aided diagnosis (CAD) system based on hierarchical convolutional neural networks (HCNNs) to discriminate between malignant and benign tumors in breast DCE-MRIs. An HCNN is a hierarchical neural network that operates on two-dimensional images. An HCNN integrates feature extraction and classification processes into one single and fully adaptive structure. It can extract two-dimensional key features automatically, and it is relatively tolerant to geometric and local distortions in input images. We evaluate CNN implementation learning and testing processes based on gradient descent (GD) and resilient back-propagation (RPROP) approaches. We show that the proposed HCNN with the RPROP learning approach provides an effective and robust neural structure for designing a CAD-based system for breast MRI, and has potential as a mechanism for the evaluation of different types of abnormalities in medical images.

  2. Optimization based tumor classification from microarray gene expression data.

    Directory of Open Access Journals (Sweden)

    Onur Dagliyan

    Full Text Available BACKGROUND: An important use of data obtained from microarray measurements is the classification of tumor types with respect to genes that are either up or down regulated in specific cancer types. A number of algorithms have been proposed to obtain such classifications. These algorithms usually require parameter optimization to obtain accurate results depending on the type of data. Additionally, it is highly critical to find an optimal set of markers among those up or down regulated genes that can be clinically utilized to build assays for the diagnosis or to follow progression of specific cancer types. In this paper, we employ a mixed integer programming based classification algorithm named hyper-box enclosure method (HBE) for the classification of some cancer types with a minimal set of predictor genes. This optimization based method, which is a user friendly and efficient classifier, may allow clinicians to diagnose and follow progression of certain cancer types. METHODOLOGY/PRINCIPAL FINDINGS: We apply the HBE algorithm to some well known data sets such as leukemia, prostate cancer, diffuse large B-cell lymphoma (DLBCL), and small round blue cell tumors (SRBCT) to find some predictor genes that can be utilized for diagnosis and prognosis in a robust manner with a high accuracy. Our approach does not require any modification or parameter optimization for each data set. Additionally, information gain attribute evaluator, relief attribute evaluator and correlation-based feature selection methods are employed for the gene selection. The results are compared with those from other studies and biological roles of selected genes in corresponding cancer type are described. CONCLUSIONS/SIGNIFICANCE: The performance of our algorithm overall was better than the other algorithms reported in the literature and classifiers found in WEKA data-mining package.
Since it does not require a parameter optimization and it performs consistently very high prediction rate on

  3. Zone-specific logistic regression models improve classification of prostate cancer on multi-parametric MRI

    Energy Technology Data Exchange (ETDEWEB)

    Dikaios, Nikolaos; Halligan, Steve; Taylor, Stuart; Atkinson, David; Punwani, Shonit [University College London, Centre for Medical Imaging, London (United Kingdom); University College London Hospital, Departments of Radiology, London (United Kingdom); Alkalbani, Jokha; Sidhu, Harbir Singh [University College London, Centre for Medical Imaging, London (United Kingdom); Abd-Alazeez, Mohamed; Ahmed, Hashim U.; Emberton, Mark [University College London, Research Department of Urology, Division of Surgery and Interventional Science, London (United Kingdom); Kirkham, Alex [University College London Hospital, Departments of Radiology, London (United Kingdom); Freeman, Alex [University College London Hospital, Department of Histopathology, London (United Kingdom)

    2015-09-15

    To assess the interchangeability of zone-specific (peripheral-zone (PZ) and transition-zone (TZ)) multiparametric-MRI (mp-MRI) logistic-regression (LR) models for classification of prostate cancer. Two hundred and thirty-one patients (70 TZ training-cohort; 76 PZ training-cohort; 85 TZ temporal validation-cohort) underwent mp-MRI and transperineal-template-prostate-mapping biopsy. PZ and TZ uni/multi-variate mp-MRI LR-models for classification of significant cancer (any cancer-core-length (CCL) with Gleason > 3 + 3 or any grade with CCL ≥ 4 mm) were derived from the respective cohorts and validated within the same zone by leave-one-out analysis. Inter-zonal performance was tested by applying TZ models to the PZ training-cohort and vice-versa. Classification performance of TZ models for TZ cancer was further assessed in the TZ validation-cohort. ROC area-under-curve (ROC-AUC) analysis was used to compare models. The univariate parameters with the best classification performance were the normalised T2 signal (T2nSI) within the TZ (ROC-AUC = 0.77) and normalized early contrast-enhanced T1 signal (DCE-nSI) within the PZ (ROC-AUC = 0.79). Performance was not significantly improved by bi-variate/tri-variate modelling. PZ models that contained DCE-nSI performed poorly in classification of TZ cancer. The TZ model based solely on maximum-enhancement poorly classified PZ cancer. LR-models dependent on DCE-MRI parameters alone are not interchangeable between prostatic zones; however, models based exclusively on T2 and/or ADC are more robust for inter-zonal application. (orig.)
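
    The ROC area-under-curve comparison used in this record can be illustrated with a small sketch. It assumes only that each model produces one numeric score per patient and that the reference standard is binary (1 for significant cancer, 0 otherwise); it computes the AUC as the fraction of positive/negative pairs ranked correctly, counting ties as half. It is not the authors' analysis code, and the name `roc_auc` is hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Each correctly ordered pair counts 1, each tied pair counts 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    labels = [1, 1, 0, 0]
    same_zone_scores = [0.9, 0.8, 0.3, 0.1]   # hypothetical model outputs
    other_zone_scores = [0.9, 0.2, 0.8, 0.1]  # hypothetical model outputs
    print(roc_auc(same_zone_scores, labels), roc_auc(other_zone_scores, labels))
```

    Scoring one cohort with a model derived in the other zone and comparing the two AUC values mirrors the inter-zonal test described in the abstract.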

  4. Texture Image Classification Based on Gabor Wavelet

    Institute of Scientific and Technical Information of China (English)

    DENG Wei-bing; LI Hai-fei; SHI Ya-li; YANG Xiao-hui

    2014-01-01

    A texture image can be partitioned into disjoint regions of uniform texture by recognizing the class of every pixel of the image. This paper proposes a texture image classification algorithm based on the Gabor wavelet. In this algorithm, the characteristics of an image are obtained from every pixel and its neighborhood. The algorithm can also transfer information between neighborhoods of different sizes. Experiments on the standard Brodatz texture image dataset show that our proposed algorithm can achieve good classification rates.

  5. Density Based Support Vector Machines for Classification

    Directory of Open Access Journals (Sweden)

    Zahra Nazari

    2015-04-01

    Full Text Available Support Vector Machines (SVM) are among the most successful algorithms for classification problems. An SVM learns the decision boundary from two classes (for binary classification) of training points. However, the training points sometimes include less meaningful samples, called outliers, which are corrupted by noise or lie on the wrong side of the boundary. These outliers affect the margin and the classification performance, and the machine would do better to discard them. SVM, as a popular and widely used classification algorithm, is very sensitive to these outliers and lacks the ability to discard them. Many research results demonstrate this sensitivity, which is a weak point of SVM. Different approaches have been proposed to reduce the effect of outliers, but no method is suitable for all types of data sets. In this paper, a new method, Density Based SVM (DBSVM), is introduced. Population density is the basic concept used in this method, for both linear and non-linear SVM, to detect outliers. Experiments on artificial data sets, real high-dimensional benchmark data sets of liver disorder and heart disease, and data sets of new and fatigued banknotes’ acoustic signals demonstrate the efficiency of this method on noisy data classification and the better generalization that it can provide compared to the standard SVM.

  6. Classification of Base Sequences BS(n+1, n)

    Directory of Open Access Journals (Sweden)

    Dragomir Ž. Ðoković

    2010-01-01

    Full Text Available Base sequences BS(n+1, n) are quadruples of {±1}-sequences (A; B; C; D), with A and B of length n+1 and C and D of length n, such that the sum of their nonperiodic autocorrelation functions is a δ-function. The base sequence conjecture, asserting that BS(n+1, n) exist for all n, is stronger than the famous Hadamard matrix conjecture. We introduce a new definition of equivalence for base sequences BS(n+1, n) and construct a canonical form. By using this canonical form, we have enumerated the equivalence classes of BS(n+1, n) for n ≤ 30. As the number of equivalence classes grows rapidly (but not monotonically) with n, the tables in the paper cover only the cases n ≤ 13.
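
    The defining condition can be checked directly. The sketch below tests whether four ±1 sequences of lengths n+1, n+1, n and n have nonperiodic autocorrelations that sum to zero at every nonzero shift, which is what the δ-function condition requires. It illustrates the definition only, not the paper's enumeration or canonical form, and the names `npaf` and `is_base_sequence` are hypothetical.

```python
def npaf(seq, shift):
    """Nonperiodic autocorrelation of seq at the given shift (0 beyond its length)."""
    return sum(seq[i] * seq[i + shift] for i in range(len(seq) - shift))


def is_base_sequence(a, b, c, d):
    """True if (a; b; c; d) is a base sequence BS(n+1, n) with n = len(c)."""
    n = len(c)
    if len(a) != n + 1 or len(b) != n + 1 or len(d) != n:
        return False
    if any(v not in (1, -1) for seq in (a, b, c, d) for v in seq):
        return False
    # The summed autocorrelation must vanish at every nonzero shift.
    return all(
        npaf(a, k) + npaf(b, k) + npaf(c, k) + npaf(d, k) == 0
        for k in range(1, n + 1)
    )


if __name__ == "__main__":
    print(is_base_sequence([1, 1], [1, -1], [1], [1]))                   # BS(2, 1)
    print(is_base_sequence([1, 1, -1], [1, 1, 1], [1, -1], [1, -1]))     # BS(3, 2)
```

    At shift 0 the sum is always 2(2n+1), the total number of entries, so only the nonzero shifts need checking.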

  7. Semi-Supervised Projective Non-Negative Matrix Factorization for Cancer Classification.

    Directory of Open Access Journals (Sweden)

    Xiang Zhang

    Full Text Available Advances in DNA microarray technologies have made gene expression profiles a significant candidate in identifying different types of cancers. Traditional learning-based cancer identification methods utilize labeled samples to train a classifier, but they are inconvenient for practical application because labels are quite expensive in the clinical cancer research community. This paper proposes a semi-supervised projective non-negative matrix factorization method (Semi-PNMF) to learn an effective classifier from both labeled and unlabeled samples, thus boosting subsequent cancer classification performance. In particular, Semi-PNMF jointly learns a non-negative subspace from concatenated labeled and unlabeled samples and indicates classes by the positions of the maximum entries of their coefficients. Because Semi-PNMF incorporates statistical information from the large volume of unlabeled samples in the learned subspace, it can learn more representative subspaces and boost classification performance. We developed a multiplicative update rule (MUR) to optimize Semi-PNMF and proved its convergence. The experimental results of cancer classification for two multiclass cancer gene expression profile datasets show that Semi-PNMF outperforms the representative methods.

  8. Cancer classification using the Immunoscore: a worldwide task force

    Directory of Open Access Journals (Sweden)

    Galon Jérôme

    2012-10-01

    Full Text Available Abstract Prediction of clinical outcome in cancer is usually achieved by histopathological evaluation of tissue samples obtained during surgical resection of the primary tumor. Traditional tumor staging (AJCC/UICC-TNM classification) summarizes data on tumor burden (T), presence of cancer cells in draining and regional lymph nodes (N) and evidence for metastases (M). However, it is now recognized that clinical outcome can significantly vary among patients within the same stage. The current classification provides limited prognostic information, and does not predict response to therapy. Recent literature has alluded to the importance of the host immune system in controlling tumor progression. Thus, evidence supports the notion to include immunological biomarkers, implemented as a tool for the prediction of prognosis and response to therapy. Accumulating data, collected from large cohorts of human cancers, has demonstrated the impact of immune-classification, which has a prognostic value that may add to the significance of the AJCC/UICC TNM-classification. It is therefore imperative to begin to incorporate the ‘Immunoscore’ into traditional classification, thus providing an essential prognostic and potentially predictive tool. Introduction of this parameter as a biomarker to classify cancers, as part of routine diagnostic and prognostic assessment of tumors, will facilitate clinical decision-making including rational stratification of patient treatment. Equally, the inherent complexity of quantitative immunohistochemistry, in conjunction with protocol variation across laboratories, analysis of different immune cell types, inconsistent region selection criteria, and variable ways to quantify immune infiltration, all underline the urgent requirement to reach assay harmonization. In an effort to promote the Immunoscore in routine clinical settings, an international task force was initiated. This review represents a follow-up of the announcement of

  9. An Agent Based Classification Model

    CERN Document Server

    Gu, Feng; Greensmith, Julie

    2009-01-01

    The major function of this model is to access the UCI Wisconsin Breast Cancer data-set[1] and classify the data items into two categories, which are normal and anomalous. This kind of classification can be referred to as anomaly detection, which discriminates anomalous behaviour from normal behaviour in computer systems. One popular solution for anomaly detection is Artificial Immune Systems (AIS). AIS are adaptive systems inspired by theoretical immunology and observed immune functions, principles and models which are applied to problem solving. The Dendritic Cell Algorithm (DCA)[2] is an AIS algorithm that is developed specifically for anomaly detection. It has been successfully applied to intrusion detection in computer security. It is believed that agent-based modelling is an ideal approach for implementing AIS, as intelligent agents could be the perfect representations of immune entities in AIS. This model evaluates the feasibility of re-implementing the DCA in an agent-based simulation environ- ...

  10. Image-based Vehicle Classification System

    CERN Document Server

    Ng, Jun Yee

    2012-01-01

    Electronic toll collection (ETC) has become a common method of toll collection on toll roads. The implementation of electronic toll collection allows vehicles to travel at low or full speed during the toll payment, which helps to avoid traffic delays on toll roads. One of the major components of an electronic toll collection system is the automatic vehicle detection and classification (AVDC) system, which classifies each vehicle so that the toll is charged according to the vehicle class. A vision-based vehicle classification system is one type of vehicle classification system, which adopts a camera as the input sensing device. This type of system has an advantage over the rest in that it is cost efficient, as a low cost camera is used. The implementation of a vision-based vehicle classification system requires a lower initial investment cost and is very suitable for the toll collection migration in Malaysia from the single ETC system to full-scale multi-lane free flow (MLFF). This project ...

  11. [New molecular classification of colorectal cancer, pancreatic cancer and stomach cancer: Towards "à la carte" treatment?].

    Science.gov (United States)

    Dreyer, Chantal; Afchain, Pauline; Trouilloud, Isabelle; André, Thierry

    2016-01-01

    This review reports three recently published molecular classifications of the three main gastrointestinal cancers: gastric, pancreatic and colorectal adenocarcinoma. In colorectal adenocarcinoma, 6 independent classifications were combined to yield 4 molecular sub-groups, the Consensus Molecular Subtypes (CMS 1-4), linked to various clinical, molecular and survival data: CMS1 (14%: MSI with immune activation); CMS2 (37%: canonical with epithelial differentiation and activation of the WNT/MYC pathway); CMS3 (13%: metabolic with epithelial differentiation and RAS mutation); CMS4 (23%: mesenchymal with activation of TGFβ pathway and angiogenesis with stromal invasion). In gastric adenocarcinoma, 4 groups were established: subtype "EBV" (9%, high frequency of PIK3CA mutations, hypermethylation and amplification of JAK2, PD-L1 and PD-L2), subtype "MSI" (22%, high rate of mutation), subtype "genomically stable tumor" (20%, diffuse histology type and mutations of RAS and genes encoding integrins and adhesion proteins including CDH1) and subtype "tumors with chromosomal instability" (50%, intestinal type, aneuploidy and receptor tyrosine kinase amplification). In pancreatic adenocarcinomas, a classification in four sub-groups has been proposed, stable subtype (20%, aneuploidy), locally rearranged subtype (30%, focal event on one or two chromosomes), scattered subtype (36%,200 structural variation events, defects in DNA maintenance). Although currently far from routine patient care, these classifications open the way to "à la carte" treatment depending on molecular biology.

  12. A Hybrid Reduction Approach for Enhancing Cancer Classification of Microarray Data

    Directory of Open Access Journals (Sweden)

    Abeer M. Mahmoud

    2014-10-01

    Full Text Available This paper presents a novel hybrid machine learning (ML) reduction approach to enhance cancer classification accuracy of microarray data based on two ML gene ranking techniques (T-test and Class Separability (CS)). The proposed approach is integrated with two ML classifiers, K-nearest neighbor (KNN) and support vector machine (SVM), for mining microarray gene expression profiles. Four public cancer microarray databases are used for evaluating the proposed approach and successfully accomplish the mining process. These are Lymphoma, Leukemia, SRBCT, and Lung Cancer. Genes are selected only from the training samples, and the testing samples are totally excluded from the classifier building process, for more accurate and validated results. The computational experiments are also illustrated in detail and comprehensively compared with related results from the literature. The results showed that the proposed reduction approach reached promising results in terms of the number of genes supplied to the classifiers as well as the classification accuracy.
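
    The idea of ranking genes by a t-test using only the training samples can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the choice of Welch's form of the t statistic, the two-class restriction, and the toy expression values are all assumptions made for the example.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's two-sample t statistic for two lists of expression values."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def rank_genes(train_rows, train_labels, top_k):
    """Return indices of the top_k genes by |t|, using training samples only."""
    classes = sorted(set(train_labels))
    assert len(classes) == 2, "this sketch handles two classes only"
    scores = []
    for g in range(len(train_rows[0])):
        a = [row[g] for row, y in zip(train_rows, train_labels) if y == classes[0]]
        b = [row[g] for row, y in zip(train_rows, train_labels) if y == classes[1]]
        scores.append((abs(welch_t(a, b)), g))
    # Highest |t| first; ties broken by gene index for determinism.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [g for _, g in scores[:top_k]]


# Toy data (made up): 4 training samples x 3 genes; gene 1 separates the classes.
train = [[1.0, 5.0, 2.0], [1.2, 5.5, 2.1], [1.1, 0.5, 2.0], [0.9, 0.4, 2.2]]
labels = ["tumor", "tumor", "normal", "normal"]
selected = rank_genes(train, labels, top_k=1)
```

    Because the test samples never enter `rank_genes`, the selected gene indices can be applied to them afterwards without leaking test information into the classifier.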

  13. Distance-based classification of keystroke dynamics

    Science.gov (United States)

    Tran Nguyen, Ngoc

    2016-07-01

    This paper uses keystroke dynamics for user authentication. The relationship between the distance metrics and the data template was analyzed for the first time, and a new distance-based algorithm for keystroke dynamics classification was proposed. Experiments on the CMU keystroke dynamics benchmark dataset yielded an equal error rate of 0.0614. The classifiers using the proposed distance metric outperform existing top-performing keystroke dynamics classifiers which use traditional distance metrics.
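
    How an equal error rate is obtained from distance scores can be sketched as below. This is only an illustration under stated assumptions: the Manhattan distance stands in for the paper's proposed metric (which is not given in the abstract), and the template and timing vectors are made-up toy values.

```python
def manhattan(x, template):
    """Manhattan (L1) distance between a typing sample and a user template."""
    return sum(abs(a - b) for a, b in zip(x, template))


def equal_error_rate(genuine, impostor):
    """Approximate EER for distance scores (lower = more like the genuine user).

    Sweeps every observed score as a threshold and returns the mean of the
    false-accept and false-reject rates where the two are closest.
    """
    best_gap, eer = None, None
    for t in sorted(set(genuine) | set(impostor)):
        frr = sum(d > t for d in genuine) / len(genuine)      # genuine rejected
        far = sum(d <= t for d in impostor) / len(impostor)   # impostor accepted
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy timing vectors in seconds (made up for illustration).
template = [0.10, 0.20]
genuine_samples = [[0.11, 0.19], [0.12, 0.22], [0.10, 0.25], [0.20, 0.30]]
impostor_samples = [[0.18, 0.27], [0.30, 0.40], [0.05, 0.50], [0.40, 0.10]]

genuine_d = [manhattan(s, template) for s in genuine_samples]
impostor_d = [manhattan(s, template) for s in impostor_samples]
eer = equal_error_rate(genuine_d, impostor_d)
```

    On this toy data one genuine sample and one impostor sample are misclassified at the balancing threshold, so both error rates equal 0.25.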

  14. A Discrete Wavelet Based Feature Extraction and Hybrid Classification Technique for Microarray Data Analysis

    Directory of Open Access Journals (Sweden)

    Jaison Bennet

    2014-01-01

    Full Text Available In the past, cancer classification by doctors and radiologists was based on morphological and clinical features and had limited diagnostic ability. The recent arrival of DNA microarray technology has led to the concurrent monitoring of thousands of gene expressions in a single chip, which stimulates progress in cancer classification. In this paper, we have proposed a hybrid approach for microarray data classification based on nearest neighbor (KNN), naive Bayes, and support vector machine (SVM). Feature selection prior to classification plays a vital role, and a feature selection technique which combines discrete wavelet transform (DWT) and moving window technique (MWT) is used. The performance of the proposed method is compared with conventional classifiers such as support vector machine, nearest neighbor, and naive Bayes. Experiments have been conducted on both real and benchmark datasets, and the results indicate that the ensemble approach produces higher classification accuracy than conventional classifiers. This paper serves as an automated system for the classification of cancer and can be applied by doctors in real cases, which is a boon to the medical community. This work further reduces the misclassification of cancers, which is highly undesirable in cancer detection.

  15. Review on Feature Selection Techniques and the Impact of SVM for Cancer Classification using Gene Expression Profile

    CERN Document Server

    George, G Victo Sudha; 10.5121/ijcses.2011.2302

    2011-01-01

    The DNA microarray technology has modernized the approach of biology research in such a way that scientists can now measure the expression levels of thousands of genes simultaneously in a single experiment. Gene expression profiles, which represent the state of a cell at a molecular level, have great potential as a medical diagnosis tool. But compared to the number of genes involved, available training data sets generally have a fairly small sample size for classification. These training data limitations constitute a challenge to certain classification methodologies. Feature selection techniques can be used to extract the marker genes which influence the classification accuracy effectively by eliminating the unwanted noisy and redundant genes. This paper presents a review of feature selection techniques that have been employed in microarray data based cancer classification and also the predominant role of SVM for cancer classification.

  16. Gastric Cancer Clinical Medical Data Mining Research Based on the SPRINT Classification Algorithm

    Institute of Scientific and Technical Information of China (English)

    郑丹青

    2012-01-01

    To meet the demand for data mining, a decision-tree based model is proposed for gastric cancer clinical medical information analysis and application. The model is developed from the existing operational database or data warehouse, from which the factors related to gastric cancer recurrence are extracted to form a decision tree training data set. Using the SPRINT classification algorithm, the model is capable of analyzing the risk factors for gastric cancer recurrence. Based on the analysis of all the potential factors affecting clinical diagnosis, treatment and prognosis, the model confirmed that the primary risk factor for gastric cancer recurrence was hereditary.

  17. Registration and classification of adolescent and young adult cancer cases.

    Science.gov (United States)

    Pollock, Brad H; Birch, Jillian M

    2008-05-01

    Cancer registries are an important research resource that facilitate the study of etiology, tumor biology, patterns of delayed diagnosis and health planning needs. When outcome data are included, registries can track secular changes in survival related to improvements in early detection or treatment. The surveillance, epidemiology, and end results (SEER) registry has been used to identify major gaps in survival for older adolescent and young adult (AYA) patients compared with younger children and older adults. In order to determine the reasons for this gap, the complete registration and accurate classification of AYA malignancies is necessary. There are inconsistencies in defining the age limits for AYAs although the Adolescent and Young Adult Oncology Progress Review Group proposed a definition of ages 15 through 39 years. The central registration and classification issues for AYAs are case-finding, defining common data elements (CDE) collected across different registries and the diagnostic classification of these malignancies. Goals to achieve by 2010 include extending and validating current diagnostic classification schemes and expanding the CDE to support AYA oncology research, including the collection of tracking information to assess long-term outcomes. These efforts will advance preventive, etiologic, therapeutic, and health services-related research for this understudied age group.

  18. Collaborative Representation based Classification for Face Recognition

    CERN Document Server

    Zhang, Lei; Feng, Xiangchu; Ma, Yi; Zhang, David

    2012-01-01

    By coding a query sample as a sparse linear combination of all training samples and then classifying it by evaluating which class leads to the minimal coding residual, sparse representation based classification (SRC) leads to interesting results for robust face recognition. It is widely believed that the l1-norm sparsity constraint on coding coefficients plays a key role in the success of SRC, while its use of all training samples to collaboratively represent the query sample is largely ignored. In this paper we discuss how SRC works, and show that the collaborative representation mechanism used in SRC is much more crucial to its success in face classification. The SRC is a special case of collaborative representation based classification (CRC), which has various instantiations by applying different norms to the coding residual and coding coefficient. More specifically, the l1 or l2 norm characterization of coding residual is related to the robustness of CRC to outlier facial pixels, while the l1 or l2 norm c...

  19. Texture feature based liver lesion classification

    Science.gov (United States)

    Doron, Yeela; Mayer-Wolf, Nitzan; Diamant, Idit; Greenspan, Hayit

    2014-03-01

    Liver lesion classification is a difficult clinical task. Computerized analysis can support clinical workflow by enabling more objective and reproducible evaluation. In this paper, we evaluate the contribution of several types of texture features for a computer-aided diagnostic (CAD) system which automatically classifies liver lesions from CT images. Based on the assumption that liver lesions of various classes differ in their texture characteristics, a variety of texture features were examined as lesion descriptors. Although texture features are often used for this task, there is currently a lack of detailed research focusing on the comparison across different texture features, or their combinations, on a given dataset. In this work we investigated the performance of Gray Level Co-occurrence Matrix (GLCM), Local Binary Patterns (LBP), Gabor, gray level intensity values and Gabor-based LBP (GLBP), where the features are obtained from a given lesion's region of interest (ROI). For the classification module, SVM and KNN classifiers were examined. Using a single type of texture feature, the best result, 91% accuracy, was obtained with Gabor filtering and SVM classification. Combining Gabor, LBP and intensity features improved the results to a final accuracy of 97%.
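
    Of the texture descriptors named above, Local Binary Patterns are the simplest to illustrate. The following is a minimal sketch of the basic 3x3 LBP operator and its histogram, assuming a grayscale ROI stored as a list of lists; the neighbour ordering, the function names, and the toy patches are assumptions for this example and are not taken from the paper.

```python
# 8 neighbours of a pixel, clockwise starting at the top-left corner.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """Basic 3x3 LBP: set one bit per neighbour that is >= the centre pixel."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels of the ROI."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist


# Toy 3x3 patches (made up): a flat region and a single bright corner.
flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
spot = [[9, 1, 1], [1, 5, 1], [1, 1, 1]]
flat_code = lbp_code(flat, 1, 1)
spot_code = lbp_code(spot, 1, 1)
```

    The resulting histogram can serve as a feature vector for a classifier such as SVM or KNN; in a flat region every neighbour ties with the centre, so all eight bits are set.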

  20. Classification of Dukes' B and C colorectal cancers using expression arrays

    DEFF Research Database (Denmark)

    Frederiksen, C.M.; Knudsen, Steen; Laurberg, S.;

    2003-01-01

    Purpose. Colorectal cancer is one of the most common malignancies. Substaging of the cancer is of importance not only to prognosis but also to treatment. Classification of substages based on DNA microarray technology is currently the most promising approach. We therefore investigated if gene...... expression microarrays could be used to classify colorectal tumors. Methods. We used the Affymetrix oligonucleotide arrays to analyze the expression of more than 5,000 genes in samples from the sigmoid and upper rectum of the left colon. Five samples were from normal mucosa and five samples from each......' A and D could not be classified correctly. A number of interesting gene clusters showed a discriminating difference between Dukes' B and C samples. These included mitochondrial genes, stromal remodeling genes, and genes related to cell adhesion. Conclusion. Molecular classification based on gene...

  1. Texture classification based on EMD and FFT

    Institute of Scientific and Technical Information of China (English)

    XIONG Chang-zhen; XU Jun-yi; ZOU Jian-cheng; QI Dong-xu

    2006-01-01

    Empirical mode decomposition (EMD) is an adaptive and approximately orthogonal filtering process that reflects the human visual mechanism of differentiating textures. In this paper, we present a modified 2D EMD algorithm using the FastRBF and an appropriate number of iterations in the sifting process (SP), then apply it to texture classification. Rotation-invariant texture feature vectors are extracted using auto-registration and circular regions of magnitude spectra of the 2D fast Fourier transform (FFT). In the experiments, we employ a Bayesian classifier to classify a set of 15 distinct natural textures selected from the Brodatz album. The experimental results, based on different testing datasets for images with different orientations, show the effectiveness of the proposed classification scheme.

  2. Feature-Based Classification of Networks

    CERN Document Server

    Barnett, Ian; Kuijjer, Marieke L; Mucha, Peter J; Onnela, Jukka-Pekka

    2016-01-01

    Network representations of systems from various scientific and societal domains are neither completely random nor fully regular, but instead appear to contain recurring structural building blocks. These features tend to be shared by networks belonging to the same broad class, such as the class of social networks or the class of biological networks. At a finer scale of classification within each such class, networks describing more similar systems tend to have more similar features. This occurs presumably because networks representing similar purposes or constructions would be expected to be generated by a shared set of domain specific mechanisms, and it should therefore be possible to classify these networks into categories based on their features at various structural levels. Here we describe and demonstrate a new, hybrid approach that combines manual selection of features of potential interest with existing automated classification methods. In particular, selecting well-known and well-studied features that ...

  3. SQL based cardiovascular ultrasound image classification.

    Science.gov (United States)

    Nandagopalan, S; Suryanarayana, Adiga B; Sudarshan, T S B; Chandrashekar, Dhanalakshmi; Manjunath, C N

    2013-01-01

    This paper proposes a novel method to analyze and classify cardiovascular ultrasound echocardiographic images using a Naïve-Bayesian model via database OLAP-SQL. Efficient data mining algorithms based on a tightly-coupled model are used to extract features. Three algorithms are proposed for classification, namely Naïve-Bayesian Classifier for Discrete variables (NBCD) with SQL, NBCD with OLAP-SQL, and Naïve-Bayesian Classifier for Continuous variables (NBCC) using OLAP-SQL. The proposed model is trained with 207 patient images containing normal and abnormal categories. Of the three proposed algorithms, NBCC achieved the highest classification accuracy, 96.59%, which is better than the earlier methods.

  4. Improved PCA + LDA Applies to Gastric Cancer Image Classification Process

    Science.gov (United States)

    Gan, Lan; Lv, Wenya; Zhang, Xu; Meng, Xiuming

    Principal component analysis (PCA) and linear discriminant analysis (LDA) are two of the most widely used pattern recognition methods in the field of feature extraction, and PCA + LDA is often used in image recognition. Here, we apply PCA + LDA to gastric cancer image feature classification. However, while the traditional PCA + LDA dimension reduction method performs well on dimensionality reduction and clustering of the training samples, its effect on the test samples is very poor; that is, the traditional PCA + LDA has a generalization problem on the test samples. To solve this problem, this paper proposes an improved PCA + LDA method, which modifies the LDA transform to increase the generalization performance of LDA and the classification accuracy on test samples. The experiment shows that the method can achieve good clustering.

  5. An Approach for Leukemia Classification Based on Cooperative Game Theory

    Directory of Open Access Journals (Sweden)

    Atefeh Torkaman

    2011-01-01

    Full Text Available Hematological malignancies are the types of cancer that affect blood, bone marrow and lymph nodes. As these tissues are naturally connected through the immune system, a disease affecting one of them will often affect the others as well. The hematological malignancies include leukemia, lymphoma and multiple myeloma. Among them, leukemia is a serious malignancy that starts in blood tissues, especially the bone marrow, where the blood is made. Research shows that leukemia is one of the most common cancers in the world. An emphasis on diagnostic techniques and the best treatments would therefore provide better prognosis and survival for patients. In this paper, an automatic diagnosis recommender system for classifying leukemia based on cooperative game theory is presented. Throughout this research, we analyze flow cytometry data for the classification of leukemia into eight classes. We work on a real data set from different types of leukemia that has been collected at the Iran Blood Transfusion Organization (IBTO). The data set contains 400 samples taken from human leukemic bone marrow. This study uses a cooperative game for classification according to different weights assigned to the markers. The proposed method is versatile, as there are no constraints on what the input or output represent. This means that it can be used to classify a population according to their contributions. In other words, it applies equally to other groups of data. The experimental results show an accuracy rate of 93.12% for classification, compared to 90.16% for a decision tree (C4.5). The result demonstrates that the cooperative game is very promising for direct use in the classification of leukemia as part of an active medical decision support system for interpretation of flow cytometry readout. This system could assist clinical hematologists to properly recognize different kinds of leukemia by preparing suggestions and this could improve the treatment

  6. TNM staging and classification (familial and nonfamilial) of breast cancer in Jordanian females

    Directory of Open Access Journals (Sweden)

    M F Atoum

    2010-01-01

    Full Text Available Purpose : Staging of breast tumor has important implications for treatment and prognosis. This study aims at pinpointing the frequency of each stage among familial and nonfamilial breast cancers. Materials and Methods : Ninety-nine Jordanian females diagnosed with familial and nonfamilial breast cancer between 2000 and 2002 were enrolled in this study. All breast cancer cases were staged according to the TNM classification into in situ, early invasive, advanced invasive and metastatic. Results : Forty-three cases were familial breast cancer and 56 were nonfamilial. One case was diagnosed with ductal carcinoma in situ (DCIS). Fifty cases were diagnosed in early stages of invasive breast cancer, of which 31 cases were familial. Twenty-nine cases were classified as advanced invasive, of which 21 were nonfamilial, and 19 cases were in the metastatic stage of breast cancer, of which 16 were nonfamilial. Stage 2b was the most common stage of early invasive cases and represented 48% of the early stage of breast cancer. On the other hand, among cases diagnosed with advanced invasive breast cancer, stage 3a was the most common stage and represented 89.6% of the advanced stage. Interestingly, all cases of stage 3a belonged to TNM stages of T2N2M0 and T3N1M0. The tumor size in all cases of Jordanian females diagnosed with advanced invasive breast cancer exceeded 2 cm due to selection bias from symptomatic women in our study. Conclusion : The incidence of nonfamilial breast cancer was slightly higher than that of the familial type among the Jordanian females studied. The early invasive stage of breast cancer was more common in the familial type, while the advanced invasive and metastatic breast cancer cases were encountered more often in the nonfamilial type. Our study was based on a small sample and symptomatic women. Therefore, more research with larger population samples is needed to confirm this conclusion.

  7. Changing Histopathological Diagnostics by Genome-Based Tumor Classification

    Directory of Open Access Journals (Sweden)

    Michael Kloth

    2014-05-01

    Full Text Available Traditionally, tumors are classified by histopathological criteria, i.e., based on their specific morphological appearances. Consequently, current therapeutic decisions in oncology are strongly influenced by histology rather than underlying molecular or genomic aberrations. The increase of information on molecular changes however, enabled by the Human Genome Project and the International Cancer Genome Consortium as well as the manifold advances in molecular biology and high-throughput sequencing techniques, inaugurated the integration of genomic information into disease classification. Furthermore, in some cases it became evident that former classifications needed major revision and adaption. Such adaptations are often required by understanding the pathogenesis of a disease from a specific molecular alteration, using this molecular driver for targeted and highly effective therapies. Altogether, reclassifications should lead to higher information content of the underlying diagnoses, reflecting their molecular pathogenesis and resulting in optimized and individual therapeutic decisions. The objective of this article is to summarize some particularly important examples of genome-based classification approaches and associated therapeutic concepts. In addition to reviewing disease specific markers, we focus on potentially therapeutic or predictive markers and the relevance of molecular diagnostics in disease monitoring.

  8. Digital image-based classification of biodiesel.

    Science.gov (United States)

    Costa, Gean Bezerra; Fernandes, David Douglas Sousa; Almeida, Valber Elias; Araújo, Thomas Souto Policarpo; Melo, Jessica Priscila; Diniz, Paulo Henrique Gonçalves Dias; Véras, Germano

    2015-07-01

    This work proposes a simple, rapid, inexpensive, and non-destructive methodology based on digital images and pattern recognition techniques for classification of biodiesel according to oil type (cottonseed, sunflower, corn, or soybean). For this, differing color histograms in RGB (extracted from digital images), HSI, Grayscale channels, and their combinations were used as analytical information, which was then statistically evaluated using Soft Independent Modeling by Class Analogy (SIMCA), Partial Least Squares Discriminant Analysis (PLS-DA), and variable selection using the Successive Projections Algorithm associated with Linear Discriminant Analysis (SPA-LDA). Despite good performances by the SIMCA and PLS-DA classification models, SPA-LDA provided better results (up to 95% for all approaches) in terms of accuracy, sensitivity, and specificity for both the training and test sets. The variables selected by the Successive Projections Algorithm clearly contained the information necessary for biodiesel type classification. This is important since a product may exhibit different properties, depending on the feedstock used. Such variations directly influence the quality, and consequently the price. Moreover, intrinsic advantages such as quick analysis, requiring no reagents, and a noteworthy reduction (the avoidance of chemical characterization) of waste generation, all contribute towards the primary objective of green chemistry.
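
    Turning an image into colour-histogram features of the kind described above can be sketched as follows. This is a hedged illustration only: the number of bins, the use of the channel mean as an intensity value, the function names, and the toy pixels are assumptions, and the sketch does not reproduce the paper's HSI conversion or its chemometric models.

```python
def histogram(values, bins=4, max_value=256):
    """Count values into equal-width bins over [0, max_value)."""
    width = max_value // bins
    counts = [0] * bins
    for v in values:
        counts[min(int(v) // width, bins - 1)] += 1
    return counts


def colour_features(pixels, bins=4):
    """Concatenate R, G, B and intensity histograms into one feature vector.

    `pixels` is a list of (r, g, b) tuples with channel values in 0..255.
    Intensity is taken here as the plain mean of the three channels.
    """
    reds = [p[0] for p in pixels]
    greens = [p[1] for p in pixels]
    blues = [p[2] for p in pixels]
    intensity = [(r + g + b) / 3 for r, g, b in pixels]
    features = []
    for channel in (reds, greens, blues, intensity):
        features += histogram(channel, bins)
    return features


# Toy image of three pixels (made up for illustration).
pixels = [(255, 0, 0), (250, 10, 5), (0, 0, 255)]
features = colour_features(pixels)
```

    Each image then contributes one fixed-length vector, which is the form of input a classifier or a variable selection step would expect.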

  9. Conformational SERS Classification of K-Ras Point Mutations for Cancer Diagnostics.

    Science.gov (United States)

    Morla-Folch, Judit; Gisbert-Quilis, Patricia; Masetti, Matteo; Garcia-Rico, Eduardo; Alvarez-Puebla, Ramon A; Guerrini, Luca

    2017-02-20

    Point mutations in Ras oncogenes are routinely screened for diagnostics and treatment of tumors (especially in colorectal cancer). Here, we develop an optical approach based on direct SERS coupled with chemometrics for the study of the specific conformations that single-point mutations impose on a relatively large fragment of the K-Ras gene (141 nucleobases). Results obtained offer the unambiguous classification of different mutations providing a potentially useful insight for diagnostics and treatment of cancer in a sensitive, fast, direct and inexpensive manner.

  10. Classification of Cancer-related Death Certificates using Machine Learning

    Directory of Open Access Journals (Sweden)

    Luke Butt

    2013-05-01

    Full Text Available Background: Cancer monitoring and prevention rely on the timely notification of cancer cases. However, the abstraction and classification of cancer from the free text of pathology reports and other relevant documents, such as death certificates, are complex and time-consuming activities. Aims: In this paper, approaches for the automatic detection of notifiable cancer cases as the cause of death from free-text death certificates supplied to Cancer Registries are investigated. Method: A number of machine learning classifiers were studied. Features were extracted using natural language techniques and the Medtex toolkit. The features encompassed stemmed words, bi-grams, and concepts from the SNOMED CT medical terminology. The baseline consisted of a keyword spotter using keywords extracted from the long description of ICD-10 cancer related codes. Results: Death certificates with notifiable cancer listed as the cause of death can be effectively identified with the methods studied in this paper. A Support Vector Machine (SVM) classifier achieved the best performance, with an overall F-measure of 0.9866 when evaluated on a set of 5,000 free-text death certificates using the token stem feature set. The SNOMED CT concept plus token stem feature set reached the lowest variance (0.0032) and false negative rate (0.0297) while achieving an F-measure of 0.9864. The SVM classifier accounts for the first 18 of the top 40 evaluated runs, and is the most robust classifier, with a variance of 0.001141, half the variance of the other classifiers. Conclusion: The selection of features had the greatest influence on the performance of the classifiers, although the type of classifier employed also affects performance. In contrast, the feature weighting schema had a negligible effect on performance. Specifically, it is found that stemmed tokens with or without SNOMED CT concepts create the most effective feature when combined with
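
    A keyword-spotting baseline and the F-measure used to score it can be sketched as below. This is a minimal illustration under stated assumptions: the keyword set and the certificate texts are invented for the example (the study derived its keywords from ICD-10 long descriptions), and the function names are hypothetical.

```python
def keyword_spotter(text, keywords):
    """Flag a certificate as cancer-related (1) if any keyword occurs in it."""
    lowered = text.lower()
    return int(any(k in lowered for k in keywords))


def f_measure(y_true, y_pred):
    """F-measure (harmonic mean of precision and recall) for binary labels."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical keywords and made-up certificate texts with gold labels.
KEYWORDS = {"carcinoma", "neoplasm", "melanoma"}
certificates = [
    ("metastatic carcinoma of lung", 1),
    ("acute myocardial infarction", 0),
    ("malignant melanoma", 1),
    ("lymphoma of stomach", 1),                # missed: no keyword matches
    ("benign neoplasm history, pneumonia", 0), # false alarm on "neoplasm"
]
gold = [label for _, label in certificates]
predicted = [keyword_spotter(text, KEYWORDS) for text, _ in certificates]
score = f_measure(gold, predicted)
```

    The toy run shows the two typical failure modes of keyword spotting, a missed synonym and a match on a non-notifiable mention, which is the kind of gap that learned classifiers with richer features aim to close.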

  11. Classification of treatment-related mortality in children with cancer

    DEFF Research Database (Denmark)

    Alexander, Sarah; Pole, Jason D; Gibson, Paul;

    2015-01-01

    Treatment-related mortality is an important outcome in paediatric cancer clinical trials. An international group of experts in supportive care in paediatric cancer developed a consensus-based definition of treatment-related mortality and a cause-of-death attribution system. The reliability and va...

  12. Genome-based Taxonomic Classification of Bacteroidetes

    Directory of Open Access Journals (Sweden)

    Richard L. Hahnke

    2016-12-01

    Full Text Available The bacterial phylum Bacteroidetes, characterized by a distinct gliding motility, occurs in a broad variety of ecosystems, habitats, life styles and physiologies. Accordingly, taxonomic classification of the phylum, based on a limited number of features, proved difficult and controversial in the past, for example, when decisions were based on unresolved phylogenetic trees of the 16S rRNA gene sequence. Here we use a large collection of type-strain genomes from Bacteroidetes and closely related phyla for assessing their taxonomy based on the principles of phylogenetic classification and trees inferred from genome-scale data. No significant conflict between 16S rRNA gene and whole-genome phylogenetic analysis is found, whereas many but not all of the involved taxa are supported as monophyletic groups, particularly in the genome-scale trees. Phenotypic and phylogenomic features support the separation of Balneolaceae as new phylum Balneolaeota from Rhodothermaeota and of Saprospiraceae as new class Saprospiria from Chitinophagia. Epilithonimonas is nested within the older genus Chryseobacterium and without significant phenotypic differences; thus merging the two genera is proposed. Similarly, Vitellibacter is proposed to be included in Aequorivita. Flexibacter is confirmed as being heterogeneous and dissected, yielding six distinct genera. Hallella seregens is a later heterotypic synonym of Prevotella dentalis. Compared to values directly calculated from genome sequences, the G+C content mentioned in many species descriptions is too imprecise; moreover, corrected G+C content values have a significantly better fit to the phylogeny. Corresponding emendations of species descriptions are provided where necessary. Whereas most observed conflict with the current classification of Bacteroidetes is already visible in 16S rRNA gene trees, as expected whole-genome phylogenies are much better resolved.
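
    Calculating G+C content directly from a genome sequence, as opposed to quoting it from a species description, is straightforward. The sketch below assumes the sequence is available as a plain string; the function name and the handling of ambiguous bases (they are ignored) are choices made for this example rather than details from the paper.

```python
def gc_content(sequence):
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / (gc + at)


# Toy sequence (made up); the two N bases are excluded from the denominator.
example = gc_content("ATGCGCNNAT")
```

    For a multi-contig genome, the counts would be summed over all contigs before dividing, so that long and short contigs are weighted by length.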

  13. Genome-Based Taxonomic Classification of Bacteroidetes.

    Science.gov (United States)

    Hahnke, Richard L; Meier-Kolthoff, Jan P; García-López, Marina; Mukherjee, Supratim; Huntemann, Marcel; Ivanova, Natalia N; Woyke, Tanja; Kyrpides, Nikos C; Klenk, Hans-Peter; Göker, Markus

    2016-01-01

    The bacterial phylum Bacteroidetes, characterized by a distinct gliding motility, occurs in a broad variety of ecosystems, habitats, life styles, and physiologies. Accordingly, taxonomic classification of the phylum, based on a limited number of features, proved difficult and controversial in the past, for example, when decisions were based on unresolved phylogenetic trees of the 16S rRNA gene sequence. Here we use a large collection of type-strain genomes from Bacteroidetes and closely related phyla for assessing their taxonomy based on the principles of phylogenetic classification and trees inferred from genome-scale data. No significant conflict between 16S rRNA gene and whole-genome phylogenetic analysis is found, whereas many but not all of the involved taxa are supported as monophyletic groups, particularly in the genome-scale trees. Phenotypic and phylogenomic features support the separation of Balneolaceae as new phylum Balneolaeota from Rhodothermaeota and of Saprospiraceae as new class Saprospiria from Chitinophagia. Epilithonimonas is nested within the older genus Chryseobacterium and without significant phenotypic differences; thus merging the two genera is proposed. Similarly, Vitellibacter is proposed to be included in Aequorivita. Flexibacter is confirmed as being heterogeneous and dissected, yielding six distinct genera. Hallella seregens is a later heterotypic synonym of Prevotella dentalis. Compared to values directly calculated from genome sequences, the G+C content mentioned in many species descriptions is too imprecise; moreover, corrected G+C content values have a significantly better fit to the phylogeny. Corresponding emendations of species descriptions are provided where necessary. Whereas most observed conflict with the current classification of Bacteroidetes is already visible in 16S rRNA gene trees, as expected whole-genome phylogenies are much better resolved.

  14. Gastric Cancer Risk Analysis in Unhealthy Habits Data with Classification Algorithms

    Directory of Open Access Journals (Sweden)

    Kirshners Arnis

    2015-12-01

    Full Text Available Data mining methods are applied to a medical task that seeks information about the influence of Helicobacter pylori on increased gastric cancer risk by analysing adverse factors of individual lifestyle. During data preprocessing, the data are cleared of noise and other factors, reduced in dimensionality, transformed for the task, and cleared of non-informative attributes. Data classification using the C4.5, CN2 and k-nearest neighbour algorithms is carried out to find relationships between the analysed attributes and the descriptive class attribute, Helicobacter pylori presence, which could influence the risk of cancer development. Experimental analysis is carried out using data from the database of the Latvian-based project “Interdisciplinary Research Group for Early Cancer Detection and Cancer Prevention”.

  15. Cirrhosis Classification Based on Texture Classification of Random Features

    Directory of Open Access Journals (Sweden)

    Hui Liu

    2014-01-01

    Full Text Available Accurate staging of hepatic cirrhosis is important in investigating the cause and slowing down the effects of cirrhosis. Computer-aided diagnosis (CAD) can provide doctors with an alternative second opinion and assist them in choosing a specific treatment based on an accurate cirrhosis stage. MRI has many advantages, including high resolution for soft tissue, no radiation, and multiparameter imaging modalities. So in this paper, multisequence MRIs, including T1-weighted, T2-weighted, arterial, portal venous, and equilibrium phase, are applied. However, CAD does not meet the clinical needs of cirrhosis and few researchers are concerned with it at present. Cirrhosis is characterized by the presence of widespread fibrosis and regenerative nodules in the liver, leading to different texture patterns at different stages. So, extracting texture features is the primary task. Compared with typical gray level cooccurrence matrix (GLCM) features, texture classification from random features provides an effective way, and we adopt it and propose CCTCRF for triple classification (normal, early, and middle and advanced stage). CCTCRF does not need strong assumptions except the sparse character of the image, contains sufficient texture information, involves a concise and effective process, and makes case decisions with high accuracy. Experimental results also illustrate the satisfying performance and they are also compared with typical NN with GLCM.

  16. Cirrhosis classification based on texture classification of random features.

    Science.gov (United States)

    Liu, Hui; Shao, Ying; Guo, Dongmei; Zheng, Yuanjie; Zhao, Zuowei; Qiu, Tianshuang

    2014-01-01

    Accurate staging of hepatic cirrhosis is important in investigating the cause and slowing down the effects of cirrhosis. Computer-aided diagnosis (CAD) can provide doctors with an alternative second opinion and assist them in choosing a specific treatment based on an accurate cirrhosis stage. MRI has many advantages, including high resolution for soft tissue, no radiation, and multiparameter imaging modalities. So in this paper, multisequence MRIs, including T1-weighted, T2-weighted, arterial, portal venous, and equilibrium phase, are applied. However, CAD does not meet the clinical needs of cirrhosis and few researchers are concerned with it at present. Cirrhosis is characterized by the presence of widespread fibrosis and regenerative nodules in the liver, leading to different texture patterns at different stages. So, extracting texture features is the primary task. Compared with typical gray level cooccurrence matrix (GLCM) features, texture classification from random features provides an effective way, and we adopt it and propose CCTCRF for triple classification (normal, early, and middle and advanced stage). CCTCRF does not need strong assumptions except the sparse character of the image, contains sufficient texture information, involves a concise and effective process, and makes case decisions with high accuracy. Experimental results also illustrate the satisfying performance and they are also compared with typical NN with GLCM.

  17. "Chromosome": a knowledge-based system for the chromosome classification.

    Science.gov (United States)

    Ramstein, G; Bernadet, M

    1993-01-01

    Chromosome, a knowledge-based analysis system, has been designed for the classification of human chromosomes. Its aim is to perform an optimal classification by driving a tool box containing the procedures of image processing, pattern recognition and classification. This paper presents the general architecture of Chromosome, based on a multiagent system generator. The image processing tool box is described from the metaphasic enhancement to the fine classification. Emphasis is then put on the knowledge base intended for the chromosome recognition. The global classification process is also presented, showing how Chromosome proceeds to classify a given chromosome. Finally, we discuss further extensions of the system for the karyotype building.

  18. Fuzzy Rule Base System for Software Classification

    Directory of Open Access Journals (Sweden)

    Adnan Shaout

    2013-07-01

    Full Text Available Given the central role that software development plays in the delivery and application of information technology, managers have been focusing on process improvement in the software development area. This improvement has increased the demand for software measures, or metrics, to manage the process. These metrics provide a quantitative basis for the development and validation of models during the software development process. In this paper a fuzzy rule-based system will be developed to classify Java applications using object-oriented metrics. The system will contain the following features: an automated method to extract the OO metrics from the source code; a default/base set of rules that can be easily configured via an XML file so that companies, developers, team leaders, etc. can modify the set of rules according to their needs; a framework so that new metrics, fuzzy sets and fuzzy rules can be added or removed depending on the needs of the end user; a general classification of the software application and a fine-grained classification of the Java classes based on OO metrics; and two interfaces for the system, GUI and command.

  19. Group classification based on high-dimensional data: application to differential scanning calorimetry plasma thermogram analysis of cervical cancer and control samples

    OpenAIRE

    Rai, Shesh N; Pan, Jianmin; Cambon, Alex; Chaires, Jonathan B; Garbett, Nichola C

    2013-01-01

    Shesh N Rai,1,2 Jianmin Pan,1 Alex Cambon,2 Jonathan B Chaires,3–5 Nichola C Garbett3,4 1Biostatistics Shared Facility, James Graham Brown Cancer Center, University of Louisville, 2Department of Bioinformatics and Biostatistics, University of Louisville, 3Biophysical Core Facility, James Graham Brown Cancer Center, University of Louisville, 4Department of Medicine, University of Louisville, 5Department of Biochemistry and Molecular Biology, University of Louisville, Louisville, KY, ...

  20. Characteristics of Differently Located Colorectal Cancers Support Proximal and Distal Classification: A Population-Based Study of 57,847 Patients

    Science.gov (United States)

    Yang, Jiao; Du, Xiang lin; Li, Shu ting; Wang, Bi yuan; Wu, Yin ying; Chen, Zhe ling; Lv, Meng; Shen, Yan wei; Wang, Xin; Dong, Dan feng; Li, Dan; Wang, Fan; Li, En xiao; Yi, Min

    2016-01-01

    Background: It has been suggested that colorectal cancer be regarded as several subgroups defined according to tumor location rather than as a single entity. The current study aimed to identify the most useful method for grouping colorectal cancer by tumor location according to both baseline and survival characteristics. Methods: Cases of pathologically confirmed colorectal adenocarcinoma diagnosed from 2000 to 2012 were identified from the Surveillance, Epidemiology, and End Results database and categorized into three groups: right colon cancer (RCC), left colon cancer (LCC), and rectal cancer (ReC). Adjusted hazard ratios for known predictors of disease-specific survival (DSS) in colorectal cancer were obtained using a Cox proportional hazards regression model. Results: The study included 57,847 patients: 43.5% with RCC, 37.7% with LCC, and 18.8% with ReC. Compared with LCC and ReC, RCC was more likely to affect older patients and women, and to be at an advanced stage, poorly differentiated or undifferentiated, and mucinous. Patients with LCC or ReC had better DSS than those with RCC in subgroups including stage III or IV disease, age ≤70 years and non-mucinous adenocarcinoma. Conversely, patients with LCC or ReC had worse DSS than those with RCC in subgroups including age >70 years and mucinous adenocarcinoma. Conclusions: RCC differed from both LCC and ReC in several clinicopathologic characteristics and in DSS. It seems reasonable to group colorectal cancer into right-sided (i.e., proximal) and left-sided (i.e., distal) ones. PMID:27936129

  1. Malware Classification based on Call Graph Clustering

    CERN Document Server

    Kinable, Joris

    2010-01-01

    Each day, anti-virus companies receive tens of thousands of samples of potentially harmful executables. Many of the malicious samples are variations of previously encountered malware, created by their authors to evade pattern-based detection. Dealing with these large amounts of data requires robust, automatic detection approaches. This paper studies malware classification based on call graph clustering. By representing malware samples as call graphs, it is possible to abstract certain variations away, and enable the detection of structural similarities between samples. The ability to cluster similar samples together will make more generic detection techniques possible, thereby targeting the commonalities of the samples within a cluster. To compare call graphs mutually, we compute pairwise graph similarity scores via graph matchings which approximately minimize the graph edit distance. Next, to facilitate the discovery of similar malware samples, we employ several clustering algorithms, including k-medoids and DB...

  2. TSG: a new algorithm for binary and multi-class cancer classification and informative genes selection

    Directory of Open Access Journals (Sweden)

    Wang Haiyan

    2013-01-01

    Full Text Available Abstract Background: One of the challenges in classification of cancer tissue samples based on gene expression data is to establish an effective method that can select a parsimonious set of informative genes. The Top Scoring Pair (TSP), k-Top Scoring Pairs (k-TSP), Support Vector Machines (SVM), and prediction analysis of microarrays (PAM) are four popular classifiers that have comparable performance on multiple cancer datasets. SVM and PAM tend to use a large number of genes, and TSP and k-TSP always use an even number of genes. In addition, the selection of distinct gene pairs in k-TSP simply combines the pairs of top ranking genes without considering the fact that the gene set with the best discrimination power may not be the combined pairs. The k-TSP algorithm also needs the user to specify an upper bound for the number of gene pairs. Here we introduce a computational algorithm to address these problems. The algorithm is named the Chisquare-statistic-based Top Scoring Genes (Chi-TSG) classifier, simplified as TSG. Results: The TSG classifier starts with the top two genes and sequentially adds additional genes into the candidate gene set to perform informative gene selection. The algorithm automatically reports the total number of informative genes selected with cross validation. We provide the algorithm for both binary and multi-class cancer classification. The algorithm was applied to 9 binary and 10 multi-class gene expression datasets involving human cancers. The TSG classifier outperforms TSP family classifiers by a big margin in most of the 19 datasets. In addition to improved accuracy, our classifier shares all the advantages of the TSP family classifiers, including easy interpretation, invariance to monotone transformation, selection of a small number of informative genes that allows follow-up studies, and resistance to sampling variations due to within-sample operations. Conclusions: Redefining the scores for gene set and the classification rules in TSP family

  3. Age Classification Based On Integrated Approach

    Directory of Open Access Journals (Sweden)

    Pullela. SVVSR Kumar

    2014-05-01

    Full Text Available The present paper presents a new age classification method by integrating the features derived from the Grey Level Co-occurrence Matrix (GLCM) with a new structural approach derived from four distinct LBPs (4-DLBP) on a 3 x 3 image. The present paper derived four distinct patterns called Left Diagonal (LD), Right Diagonal (RD), Vertical Centre (VC) and Horizontal Centre (HC) LBPs. For all the LBPs the central pixel value of the 3 x 3 neighbourhood is significant. For this reason, in the present research LBP values are evaluated by comparing all 9 pixels of the 3 x 3 neighbourhood with the average value of the neighbourhood. The four distinct LBPs are grouped into two distinct LBPs. Based on these two distinct LBPs, the GLCM is computed and features are evaluated to classify human age into four age groups, i.e., child (0-15), young adult (16-30), middle-aged adult (31-50) and senior adult (>50). The co-occurrence features extracted from the 4-DLBP provide complete texture information about an image, which is useful for classification. The proposed 4-DLBP reduces the size of the LBP from 6561 to 79 in the case of the original texture spectrum and from 2020 to 79 in the case of the Fuzzy Texture approach.

  4. Automatic web services classification based on rough set theory

    Institute of Scientific and Technical Information of China (English)

    陈立; 张英; 宋自林; 苗壮

    2013-01-01

    With the development of web services technology, the number of existing services on the internet is growing day by day. In order to achieve automatic and accurate services classification, which can be beneficial for service-related tasks, a rough set theory based method for services classification was proposed. First, the services descriptions were preprocessed and represented as vectors. Inspired by the discernibility-matrix-based attribute reduction in rough set theory and taking into account the characteristics of the decision table of services classification, a method based on continuous discernibility matrices was proposed for dimensionality reduction. Finally, services classification was processed automatically. In the experiment, the proposed method for services classification achieves satisfactory classification results in all five testing categories. The experimental results show that the proposed method is accurate and could be used in practical web services classification.

  5. Survival of patients with nonseminomatous germ cell cancer: a review of the IGCC classification by Cox regression and recursive partitioning.

    Science.gov (United States)

    van Dijk, M R; Steyerberg, E W; Stenning, S P; Dusseldorp, E; Habbema, J D F

    2004-03-22

    The International Germ Cell Consensus (IGCC) classification identifies good, intermediate and poor prognosis groups among patients with metastatic nonseminomatous germ cell tumours (NSGCT). It uses the risk factors primary site, presence of nonpulmonary visceral metastases and tumour markers alpha-fetoprotein (AFP), human chorionic gonadotrophin (HCG) and lactic dehydrogenase (LDH). The IGCC classification is easy to use and remember, but lacks flexibility. We aimed to examine the extent of any loss in discrimination within the IGCC classification in comparison with alternative modelling by formal weighting of the risk factors. We analysed survival of 3048 NSGCT patients with Cox regression and recursive partitioning for alternative classifications. Good, intermediate and poor prognosis groups were based on predicted 5-year survival. Classifications were further refined by subgrouping within the poor prognosis group. Performance was measured primarily by a bootstrap corrected c-statistic to indicate discriminative ability for future patients. The weights of the risk factors in the alternative classifications differed slightly from the implicit weights in the IGCC classification. Discriminative ability, however, did not increase clearly (IGCC classification, c=0.732; Cox classification, c=0.730; Recursive partitioning classification, c=0.709). Three subgroups could be identified within the poor prognosis groups, resulting in classifications with five prognostic groups and slightly better discriminative ability (c=0.740). In conclusion, the IGCC classification in three prognostic groups is largely supported by Cox regression and recursive partitioning. Cox regression was the most promising tool to define a more refined classification. British Journal of Cancer (2004) 90, 1176-1183. doi:10.1038/sj.bjc.6601665 www.bjcancer.com Published online 24 February 2004
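
    The c-statistic used above to compare classifications can be illustrated with a minimal sketch. It assumes Harrell's concordance index for right-censored data, skips pairs with tied observation times for simplicity, and omits the bootstrap correction the abstract mentions; the function name and the toy inputs are hypothetical.

```python
from itertools import combinations


def c_statistic(times, events, risk_scores):
    """Apparent concordance index (simplified Harrell's c).

    times: observed follow-up times
    events: 1 if death was observed, 0 if censored
    risk_scores: higher score means worse predicted prognosis
    """
    concordant = 0.0
    usable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that subject a has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Skip pairs that cannot be ordered: tied times, or the
        # earlier subject was censored before any event was seen.
        if times[a] == times[b] or not events[a]:
            continue
        usable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if usable == 0:
        raise ValueError("no comparable pairs")
    return concordant / usable
```

    A value of 0.5 corresponds to predictions no better than chance and 1.0 to perfect ranking, which is the scale on which the reported values of 0.709 to 0.740 should be read.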

  6. Cancer Feature Selection and Classification Using a Binary Quantum-Behaved Particle Swarm Optimization and Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Maolong Xi

    2016-01-01

    Full Text Available This paper focuses on the feature gene selection for cancer classification, which employs an optimization algorithm to select a subset of the genes. We propose a binary quantum-behaved particle swarm optimization (BQPSO) for cancer feature gene selection, coupling support vector machine (SVM) for cancer classification. First, the proposed BQPSO algorithm is described, which is a discretized version of original QPSO for binary 0-1 optimization problems. Then, we present the principle and procedure for cancer feature gene selection and cancer classification based on BQPSO and SVM with leave-one-out cross validation (LOOCV). Finally, the BQPSO coupling SVM (BQPSO/SVM), binary PSO coupling SVM (BPSO/SVM), and genetic algorithm coupling SVM (GA/SVM) are tested for feature gene selection and cancer classification on five microarray data sets, namely, Leukemia, Prostate, Colon, Lung, and Lymphoma. The experimental results show that BQPSO/SVM has significant advantages in accuracy, robustness, and the number of feature genes selected compared with the other two algorithms.

  7. Cancer Feature Selection and Classification Using a Binary Quantum-Behaved Particle Swarm Optimization and Support Vector Machine.

    Science.gov (United States)

    Xi, Maolong; Sun, Jun; Liu, Li; Fan, Fangyun; Wu, Xiaojun

    2016-01-01

    This paper focuses on the feature gene selection for cancer classification, which employs an optimization algorithm to select a subset of the genes. We propose a binary quantum-behaved particle swarm optimization (BQPSO) for cancer feature gene selection, coupling support vector machine (SVM) for cancer classification. First, the proposed BQPSO algorithm is described, which is a discretized version of original QPSO for binary 0-1 optimization problems. Then, we present the principle and procedure for cancer feature gene selection and cancer classification based on BQPSO and SVM with leave-one-out cross validation (LOOCV). Finally, the BQPSO coupling SVM (BQPSO/SVM), binary PSO coupling SVM (BPSO/SVM), and genetic algorithm coupling SVM (GA/SVM) are tested for feature gene selection and cancer classification on five microarray data sets, namely, Leukemia, Prostate, Colon, Lung, and Lymphoma. The experimental results show that BQPSO/SVM has significant advantages in accuracy, robustness, and the number of feature genes selected compared with the other two algorithms.
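
    The leave-one-out cross validation (LOOCV) procedure mentioned above can be sketched as follows. This is only an illustration of the evaluation loop: a nearest-centroid rule stands in for the SVM, the BQPSO gene selection step is not shown, and all function names and data are hypothetical.

```python
import math


def nearest_centroid(train_x, train_y, query):
    """Stand-in classifier (not an SVM): predict the class whose mean is closest."""
    sums, counts = {}, {}
    for x, y in zip(train_x, train_y):
        acc = sums.setdefault(y, [0.0] * len(x))
        for d, value in enumerate(x):
            acc[d] += value
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}
    return min(centroids, key=lambda y: math.dist(centroids[y], query))


def loocv_accuracy(samples, labels, predict):
    """Hold out each sample once, train on the rest, and score the prediction."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(train_x, train_y, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)
```

    In a selection loop of the kind the abstract describes, a score such as `loocv_accuracy` computed on a candidate gene subset would typically serve as the fitness value; that coupling is an assumption here, not a detail given in the abstract.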

  8. Molecular voting for glioma classification reflecting heterogeneity in the continuum of cancer progression.

    Science.gov (United States)

    Fuller, Gregory N; Mircean, Cristian; Tabus, Ioan; Taylor, Ellen; Sawaya, Raymond; Bruner, Janet M; Shmulevich, Ilya; Zhang, Wei

    2005-09-01

    Gliomas, the most common brain tumors, are generally categorized into two lineages (astrocytic and oligodendrocytic) and further classified as low-grade (astrocytoma and oligodendroglioma), mid-grade (anaplastic astrocytoma and anaplastic oligodendroglioma), and high-grade (glioblastoma multiforme) based on morphological features. A strict classification scheme has limitations because a specific glioma can be at any stage of the continuum of cancer progression and may contain mixed features. Thus, a more comprehensive classification based on molecular signatures may reflect the biological nature of specific tumors more accurately. In this study, we used microarray technology to profile the gene expression of 49 human brain tumors and applied the k-nearest neighbor algorithm for classification. We first trained the classification gene set with 19 of the most typical glioma cases and selected a set of genes that provide the lowest cross-validation classification error with k=5. We then applied this gene set to the 30 remaining cases, including several that do not belong to gliomas such as atypical meningioma. The results showed that not only does the algorithm correctly classify most of the gliomas, but the detailed voting results also provide more subtle information regarding the molecular similarities to neighboring classes. For atypical meningioma, the voting was equally split among the four classes, indicating a difficulty in placement of meningioma into the four classes of gliomas. Thus, the actual voting results, which are typically used only to decide the winning class label in k-nearest neighbor algorithms, provide a useful method for gaining deeper insight into the stage of a tumor in the continuum of cancer development.
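
    The idea of reading the full vote distribution, rather than only the winning label, can be sketched as below. The sketch assumes Euclidean distance between expression vectors, which the abstract does not specify; the function names, class labels, and toy data are hypothetical.

```python
import math
from collections import Counter


def knn_votes(train_x, train_y, query, k=5):
    """Return the vote count per class among the k nearest training samples."""
    ranked = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )
    return Counter(label for _, label in ranked[:k])


def winning_class(votes):
    """Conventional k-NN decision: the class with the most votes."""
    return votes.most_common(1)[0][0]
```

    For a sample that sits clearly inside one class the votes are lopsided, whereas a near-even split across classes, as reported for the atypical meningioma, signals that the sample does not fit any single class well.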

  9. High dimensional multiclass classification with applications to cancer diagnosis

    DEFF Research Database (Denmark)

    Vincent, Martin

    Probabilistic classifiers are introduced and it is shown that the only regular linear probabilistic classifier with convex risk is multinomial regression. Penalized empirical risk minimization is introduced and used to construct supervised learning methods for probabilistic classifiers. A sparse...... and a simulation based domain adaptation strategy is presented. It is shown that the presented computational contamination approach drastically improves the primary tumor site classification of liver contaminated biopsies of metastases. A final classifier for identification of the primary tumor site is developed...

  10. Graph-based Methods for Orbit Classification

    Energy Technology Data Exchange (ETDEWEB)

    Bagherjeiran, A; Kamath, C

    2005-09-29

    An important step in the quest for low-cost fusion power is the ability to perform and analyze experiments in prototype fusion reactors. One of the tasks in the analysis of experimental data is the classification of orbits in Poincare plots. These plots are generated by the particles in a fusion reactor as they move within the toroidal device. In this paper, we describe the use of graph-based methods to extract features from orbits. These features are then used to classify the orbits into several categories. Our results show that existing machine learning algorithms are successful in classifying orbits with few points, a situation which can arise in data from experiments.

  11. Sentiment classification technology based on Markov logic networks

    Science.gov (United States)

    He, Hui; Li, Zhigang; Yao, Chongchong; Zhang, Weizhe

    2016-07-01

    With diverse online media emerging, there is growing interest in the sentiment classification problem. At present, text sentiment classification mainly relies on supervised machine learning methods, which exhibit a certain domain dependency. On the basis of Markov logic networks (MLNs), this study proposed a cross-domain multi-task text sentiment classification method rooted in transfer learning. Through many-to-one knowledge transfer, labeled text sentiment classification knowledge was successfully transferred into other domains, and the precision of the sentiment classification analysis in the text tendency domain was improved. The experimental results revealed the following: (1) the model based on an MLN demonstrated higher precision than the single individual learning plan model; (2) multi-task transfer learning based on Markov logic networks could acquire more knowledge than self-domain learning. The cross-domain text sentiment classification model could significantly improve the precision and efficiency of text sentiment classification.

  12. Artificial neural networks as classification and diagnostic tools for lymph node-negative breast cancers

    Energy Technology Data Exchange (ETDEWEB)

    Eswari J, Satya; Chandrakar, Neha [National Institute of Technology Raipur, Raipur (India)]

    2016-04-15

    Artificial neural networks (ANNs) can be used to develop a technique to classify lymph node negative breast cancer that is prone to distant metastases based on gene expression signatures. The neural network used is a multilayered feed-forward network that employs the backpropagation algorithm. Once trained with DNA microarray-based gene expression profiles of genes that were predictive of distant metastasis recurrence of lymph node negative breast cancer, the ANNs became capable of correctly classifying all samples and recognizing the genes most appropriate to the classification. To test the ability of the trained ANN models to recognize lymph node negative breast cancer, we analyzed additional idle samples that were not used beforehand for the training procedure and obtained correct classifications in the validation set. For a more robust result, bootstrapping of the training and testing datasets was performed as external validation. This study illustrates the potential application of ANNs for breast tumor diagnosis and the identification of candidate targets in patients for therapy.

  13. Profiling alternatively spliced mRNA isoforms for prostate cancer classification

    Directory of Open Access Journals (Sweden)

    Fan Jian-Bing

    2006-04-01

    Full Text Available Abstract Background: Prostate cancer is one of the leading causes of cancer illness and death among men in the United States and worldwide. There is an urgent need to discover good biomarkers for early clinical diagnosis and treatment. Previously, we developed an exon-junction microarray-based assay and profiled 1532 mRNA splice isoforms from 364 potential prostate cancer related genes in 38 prostate tissues. Here, we investigate the advantage of using splice isoforms, which couple transcriptional and splicing regulation, for cancer classification. Results: As many as 464 splice isoforms from more than 200 genes are differentially regulated in tumors at a false discovery rate (FDR) of 0.05. Remarkably, about 30% of genes have isoforms that are called significant but do not exhibit differential expression at the overall mRNA level. A support vector machine (SVM) classifier trained on 128 signature isoforms can correctly predict 92% of the cases, which outperforms the classifier using overall mRNA abundance by about 5%. It is also observed that the classification performance can be improved using multivariate variable selection methods, which take correlation among variables into account. Conclusion: These results demonstrate that profiling of splice isoforms is able to provide unique and important information which cannot be detected by conventional microarrays.

  14. Classification of Laser Induced Fluorescence Spectra from Normal and Malignant bladder tissues using Learning Vector Quantization Neural Network in Bladder Cancer Diagnosis

    DEFF Research Database (Denmark)

    Karemore, Gopal Raghunath; Mascarenhas, Kim Komal; Patil, Choudhary

    2008-01-01

    the classification accuracy of LVQ with other classifiers (eg. SVM and Multi Layer Perceptron) for the same data set. Good agreement has been obtained between LVQ based classification of spectroscopy data and histopathology results which demonstrate the use of LVQ classifier in bladder cancer diagnosis....

  15. 3D texture analysis for classification of second harmonic generation images of human ovarian cancer

    Science.gov (United States)

    Wen, Bruce; Campbell, Kirby R.; Tilbury, Karissa; Nadiarnykh, Oleg; Brewer, Molly A.; Patankar, Manish; Singh, Vikas; Eliceiri, Kevin. W.; Campagnola, Paul J.

    2016-10-01

    Remodeling of the collagen architecture in the extracellular matrix (ECM) has been implicated in ovarian cancer. To quantify these alterations we implemented a form of 3D texture analysis to delineate the fibrillar morphology observed in 3D Second Harmonic Generation (SHG) microscopy image data of normal (1) and high risk (2) ovarian stroma, benign ovarian tumors (3), low grade (4) and high grade (5) serous tumors, and endometrioid tumors (6). We developed a tailored set of 3D filters which extract textural features in the 3D image sets to build (or learn) statistical models of each tissue class. By applying k-nearest neighbor classification using these learned models, we achieved 83-91% accuracies for the six classes. The 3D method outperformed the analogous 2D classification on the same tissues, which we suggest is due to the increased information content. This classification based on ECM structural changes will complement conventional classification based on genetic profiles and can serve as an additional biomarker. Moreover, the texture analysis algorithm is quite general, as it does not rely on single morphological metrics such as fiber alignment, length, and width but their combined convolution with a customizable basis set.

  16. Improving accuracy for cancer classification with a new algorithm for genes selection

    Directory of Open Access Journals (Sweden)

    Zhang Hongyan

    2012-11-01

    Full Text Available Abstract Background: Even though the classification of cancer tissue samples based on gene expression data has advanced considerably in recent years, improving accuracy remains a great challenge. One of the challenges is to establish an effective method that can select a parsimonious set of relevant genes. So far, most methods for gene selection in the literature focus on screening individual genes or pairs of genes without considering the possible interactions among genes. Here we introduce a new computational method named the Binary Matrix Shuffling Filter (BMSF). It not only overcomes the difficulty associated with the search schemes of traditional wrapper methods and the overfitting problem in a large dimensional search space but also takes potential gene interactions into account during gene selection. This method, coupled with Support Vector Machine (SVM) for implementation, often selects a very small number of genes for easy model interpretability. Results: We applied our method to 9 two-class gene expression datasets involving human cancers. During the gene selection process, the set of genes to be kept in the model was recursively refined and repeatedly updated according to the effect of a given gene on the contributions of other genes in reference to their usefulness in cancer classification. The small number of informative genes selected from each dataset leads to significantly improved leave-one-out (LOOCV) classification accuracy across all 9 datasets for multiple classifiers. Our method also exhibits broad generalization in the genes selected since multiple commonly used classifiers achieved either equivalent or much higher LOOCV accuracy than those reported in the literature. Conclusions: Evaluation of a gene’s contribution to binary cancer classification is better considered after adjusting for the joint effect of a large number of other genes. A computationally efficient search scheme was provided to perform effective search in the extensive

  17. Gene Expression Profiles for Predicting Metastasis in Breast Cancer: A Cross-Study Comparison of Classification Methods

    Directory of Open Access Journals (Sweden)

    Mark Burton

    2012-01-01

    Full Text Available Machine learning has increasingly been used with microarray gene expression data and for the development of classifiers using a variety of methods. However, method comparisons in cross-study datasets are very scarce. This study compares the performance of seven classification methods and the effect of voting for predicting metastasis outcome in breast cancer patients, in three situations: within the same dataset or across datasets on similar or dissimilar microarray platforms. Combining classification results from seven classifiers into one voting decision performed significantly better during internal validation as well as external validation in similar microarray platforms than the underlying classification methods. When validating between different microarray platforms, random forest, another voting-based method, proved to be the best performing method. We conclude that voting based classifiers provided an advantage with respect to classifying metastasis outcome in breast cancer patients.
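
    The voting step described in this abstract can be illustrated with a minimal sketch. It assumes simple majority voting over hard class labels; the abstract does not say which voting rule or which seven classifiers were used, so the function name `majority_vote` and the toy predictions below are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one voted label per sample.

    predictions: list of lists, one inner list per classifier, all of the
    same length. With an odd number of classifiers and two classes, ties
    cannot occur; otherwise the label seen first among the tied ones wins.
    """
    n_samples = len(predictions[0])
    voted = []
    for i in range(n_samples):
        votes = [clf_preds[i] for clf_preds in predictions]
        voted.append(Counter(votes).most_common(1)[0][0])
    return voted


# Hypothetical outputs of seven classifiers for three patients
# (1 = metastasis predicted, 0 = no metastasis predicted).
preds = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 0, 1],
    [1, 1, 1],
    [0, 0, 0],
    [1, 0, 1],
    [1, 1, 0],
]
voted = majority_vote(preds)
```

    Here the first and third patients receive label 1 and the second receives label 0, because those labels collect at least four of the seven votes.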

  18. Comparing two classifications of cancer cachexia and their association with survival in patients with unresected pancreatic cancer.

    Science.gov (United States)

    Wesseltoft-Rao, Nima; Hjermstad, Marianne J; Ikdahl, Tone; Dajani, Olav; Ulven, Stine M; Iversen, Per Ole; Bye, Asta

    2015-01-01

    There is no universally accepted definition of cancer cachexia. Two classifications have been proposed; the 3-factor classification requiring ≥ 2 of 3 factors; weight loss ≥ 10%, food intake ≤ 1500 kcal/day, and C-reactive protein ≥ 10 mg/l, and the consensus classification requiring weight loss >5% the past 6 mo, or body mass index 2%. Precachexia is the initial stage of the cachexia trajectory, identified by weight loss ≤ 5%, anorexia and metabolic change. We examined the consistency between the 2 classifications, and their association with survival in a palliative cohort of 45 (25 men, median age of 72 yr, range 35-89) unresected pancreatic cancer patients. Computed tomography images were used to determine sarcopenia. Height/weight/C-reactive protein and survival were extracted from medical records. Food intake was self-reported. The agreement for cachexia and noncachexia was 78% across classifications. Survival was poorer in cachexia compared to noncachexia (3-factor classification, P = 0.0052; consensus classification, P = 0.056; when precachexia was included in the consensus classification, P = 0.027). Both classifications showed a trend toward lower median survival (P cachexia. In conclusion, the two classifications showed good overall agreement in defining cachectic pancreatic cancer patients, and cachexia was associated with poorer survival according to both.

  19. Use of multivariate analysis to suggest a new molecular classification of colorectal cancer

    Science.gov (United States)

    Domingo, Enric; Ramamoorthy, Rajarajan; Oukrif, Dahmane; Rosmarin, Daniel; Presz, Michal; Wang, Haitao; Pulker, Hannah; Lockstone, Helen; Hveem, Tarjei; Cranston, Treena; Danielsen, Havard; Novelli, Marco; Davidson, Brian; Xu, Zheng-Zhou; Molloy, Peter; Johnstone, Elaine; Holmes, Christopher; Midgley, Rachel; Kerr, David; Sieber, Oliver; Tomlinson, Ian

    2013-01-01

    Abstract Molecular classification of colorectal cancer (CRC) is currently based on microsatellite instability (MSI), KRAS or BRAF mutation and, occasionally, chromosomal instability (CIN). Whilst useful, these categories may not fully represent the underlying molecular subgroups. We screened 906 stage II/III CRCs from the VICTOR clinical trial for somatic mutations. Multivariate analyses (logistic regression, clustering, Bayesian networks) identified the primary molecular associations. Positive associations occurred between: CIN and TP53 mutation; MSI and BRAF mutation; and KRAS and PIK3CA mutations. Negative associations occurred between: MSI and CIN; MSI and NRAS mutation; and KRAS mutation, and each of NRAS, TP53 and BRAF mutations. Some complex relationships were elucidated: KRAS and TP53 mutations had both a direct negative association and a weaker, confounding, positive association via TP53–CIN–MSI–BRAF–KRAS. Our results suggested a new molecular classification of CRCs: (1) MSI+ and/or BRAF-mutant; (2) CIN+ and/or TP53-mutant, with wild-type KRAS and PIK3CA; (3) KRAS- and/or PIK3CA-mutant, CIN+, TP53-wild-type; (4) KRAS- and/or PIK3CA-mutant, CIN–, TP53-wild-type; (5) NRAS-mutant; (6) no mutations; (7) others. As expected, group 1 cancers were mostly proximal and poorly differentiated, usually occurring in women. Unexpectedly, two different types of CIN+ CRC were found: group 2 cancers were usually distal and occurred in men, whereas group 3 showed neither of these associations but were of higher stage. CIN+ cancers have conventionally been associated with all three of these variables, because they have been tested en masse. Our classification also showed potentially improved prognostic capabilities, with group 3, and possibly group 1, independently predicting disease-free survival. Copyright © 2012 Pathological Society of Great Britain and Ireland. Published by John Wiley & Sons, Ltd. PMID:23165447

  20. Structure-Based Algorithms for Microvessel Classification

    KAUST Repository

    Smith, Amy F.

    2015-02-01

    © 2014 The Authors. Microcirculation published by John Wiley & Sons Ltd. Objective: Recent developments in high-resolution imaging techniques have enabled digital reconstruction of three-dimensional sections of microvascular networks down to the capillary scale. To better interpret these large data sets, our goal is to distinguish branching trees of arterioles and venules from capillaries. Methods: Two novel algorithms are presented for classifying vessels in microvascular anatomical data sets without requiring flow information. The algorithms are compared with a classification based on observed flow directions (considered the gold standard), and with an existing resistance-based method that relies only on structural data. Results: The first algorithm, developed for networks with one arteriolar and one venular tree, performs well in identifying arterioles and venules and is robust to parameter changes, but incorrectly labels a significant number of capillaries as arterioles or venules. The second algorithm, developed for networks with multiple inlets and outlets, correctly identifies more arterioles and venules, but is more sensitive to parameter changes. Conclusions: The algorithms presented here can be used to classify microvessels in large microvascular data sets lacking flow information. This provides a basis for analyzing the distinct geometrical properties and modelling the functional behavior of arterioles, capillaries, and venules.

  1. RECURSIVE CLASSIFICATION OF MQAM SIGNALS BASED ON HIGHER ORDER CUMULANTS

    Institute of Scientific and Technical Information of China (English)

    Chen Weidong; Yang Shaoquan

    2002-01-01

    A new feature based on higher order cumulants is proposed for classification of MQAM signals. Theoretical analysis shows that the new feature is invariant with respect to translation (shift), scale and rotation transforms of signal constellations, and can suppress colored or white additive Gaussian noise. Computer simulation shows that the proposed recursive order-reduction based classification algorithm can classify MQAM signals of any order.

  2. SELDI-TOF Serum Profiling for Prognostic and Diagnostic Classification of Breast Cancers

    Directory of Open Access Journals (Sweden)

    Christine Laronga

    2004-01-01

    Full Text Available Surface enhanced laser desorption/ionization (SELDI) time-of-flight mass spectrometry has emerged as a successful tool for serum based detection and differentiation of many cancer types, including breast cancers. In this study, we have applied the SELDI technology to evaluate three potential applications that could extend the effectiveness of established procedures and biomarkers used for prognostication of breast cancers. Paired serum samples obtained from women with breast cancers prior to surgery and post-surgery (6–9 mos.) were examined. In 14/16 post-treatment patients, serum protein profiles could be used to distinguish these samples from the pre-treatment cancer samples. When compared to serum samples from normal healthy women, 11 of these post-treatment samples retained global protein profiles not found in healthy women, including five low-mass proteins that remained elevated in both pre-treatment and post-treatment serum groups. In another pilot study, serum profiles were compared for a group of 30 women who were known BRCA-1 mutation carriers, half of whom subsequently developed breast cancer within three years of the sample procurement. SELDI protein profiling accurately classified 13/15 women with BRCA-1 breast cancers from the 15 non-cancer BRCA-1 carriers. Additionally, the ability of SELDI to distinguish between the serum profiles from sentinel lymph node positive and sentinel lymph node negative patients was evaluated. In sentinel lymph node positive samples, 22/27 samples were correctly classified, in comparison to the correct classification of 55/71 sentinel lymph node negative samples. These initial results indicate the utility of protein profiling approaches for developing new diagnostic and prognostic assays for breast cancers.

  3. Spectral-Spatial Hyperspectral Image Classification Based on KNN

    Science.gov (United States)

    Huang, Kunshan; Li, Shutao; Kang, Xudong; Fang, Leyuan

    2016-12-01

    Fusion of spectral and spatial information is an effective way to improve the accuracy of hyperspectral image classification. In this paper, a novel spectral-spatial hyperspectral image classification method based on K nearest neighbor (KNN) is proposed, which consists of the following steps. First, the support vector machine is adopted to obtain the initial classification probability maps, which reflect the probability that each hyperspectral pixel belongs to different classes. Then, the obtained pixel-wise probability maps are refined with the proposed KNN filtering algorithm that is based on matching and averaging nonlocal neighborhoods. The proposed method does not need sophisticated segmentation and optimization strategies while still being able to make full use of the nonlocal principle of real images by using KNN, thus providing competitive classification with fast computation. Experiments performed on two real hyperspectral data sets show that the classification results obtained by the proposed method are comparable to several recently proposed hyperspectral image classification methods.

  4. Integrating Globality and Locality for Robust Representation Based Classification

    Directory of Open Access Journals (Sweden)

    Zheng Zhang

    2014-01-01

    Full Text Available The representation based classification method (RBCM) has shown huge potential for face recognition since it first emerged. The linear regression classification (LRC) method and the collaborative representation classification (CRC) method are two well-known RBCMs. LRC and CRC exploit training samples of each class and all the training samples to represent the testing sample, respectively, and subsequently conduct classification on the basis of the representation residual. The LRC method can be viewed as a “locality representation” method because it just uses the training samples of each class to represent the testing sample and it cannot embody the effectiveness of the “globality representation.” On the contrary, it seems that the CRC method cannot own the benefit of locality of the general RBCM. Thus we propose to integrate CRC and LRC to perform more robust representation based classification. The experimental results on benchmark face databases substantially demonstrate that the proposed method achieves high classification accuracy.
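
    The difference between the two residual rules can be made concrete with a small sketch. It assumes ordinary least squares for LRC and a ridge-regularised solution for CRC (a common formulation, taken here as an assumption). The function names, the parameter `lam` and the toy data are hypothetical, and the integration rule proposed in the paper is not given in the abstract, so it is not shown.

```python
import numpy as np


def lrc_residuals(X, labels, y):
    """LRC: represent y with the training samples of one class at a time."""
    residuals = {}
    for c in np.unique(labels):
        Xc = X[:, labels == c]  # columns are the training samples of class c
        beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
        residuals[c] = np.linalg.norm(y - Xc @ beta)
    return residuals


def crc_residuals(X, labels, y, lam=1e-2):
    """CRC: represent y with all training samples, then split by class."""
    n = X.shape[1]
    # Ridge-regularised coefficients over the whole training set.
    alpha = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)
    residuals = {}
    for c in np.unique(labels):
        mask = labels == c
        residuals[c] = np.linalg.norm(y - X[:, mask] @ alpha[mask])
    return residuals


def classify(residuals):
    """Assign the class whose representation leaves the smallest residual."""
    return min(residuals, key=residuals.get)


# Toy data: four training samples (columns) in a 4-dimensional feature space.
X = np.array([
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.2, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 0.2],
])
labels = np.array([0, 0, 1, 1])
y = np.array([0.9, 0.1, 0.05, 0.0])  # test sample lying close to class 0

lrc_label = classify(lrc_residuals(X, labels, y))
crc_label = classify(crc_residuals(X, labels, y))
```

    Both rules assign the toy test sample to class 0. They differ only in whether the coefficients are fitted per class (locality) or over the whole training set (globality).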

  5. Molecular classification of familial non-BRCA1/BRCA2 breast cancer.

    Science.gov (United States)

    Hedenfalk, Ingrid; Ringner, Markus; Ben-Dor, Amir; Yakhini, Zohar; Chen, Yidong; Chebil, Gunilla; Ach, Robert; Loman, Niklas; Olsson, Håkan; Meltzer, Paul; Borg, Ake; Trent, Jeffrey

    2003-03-01

    In the decade since their discovery, the two major breast cancer susceptibility genes BRCA1 and BRCA2, have been shown conclusively to be involved in a significant fraction of families segregating breast and ovarian cancer. However, it has become equally clear that a large proportion of families segregating breast cancer alone are not caused by mutations in BRCA1 or BRCA2. Unfortunately, despite intensive effort, the identification of additional breast cancer predisposition genes has so far been unsuccessful, presumably because of genetic heterogeneity, low penetrance, or recessive/polygenic mechanisms. These non-BRCA1/2 breast cancer families (termed BRCAx families) comprise a histopathologically heterogeneous group, further supporting their origin from multiple genetic events. Accordingly, the identification of a method to successfully subdivide BRCAx families into recognizable groups could be of considerable value to further genetic analysis. We have previously shown that global gene expression analysis can identify unique and distinct expression profiles in breast tumors from BRCA1 and BRCA2 mutation carriers. Here we show that gene expression profiling can discover novel classes among BRCAx tumors, and differentiate them from BRCA1 and BRCA2 tumors. Moreover, microarray-based comparative genomic hybridization (CGH) to cDNA arrays revealed specific somatic genetic alterations within the BRCAx subgroups. These findings illustrate that, when gene expression-based classifications are used, BRCAx families can be grouped into homogeneous subsets, thereby potentially increasing the power of conventional genetic analysis.

  6. Case base classification on digital mammograms: improving the performance of case base classifier

    Science.gov (United States)

    Raman, Valliappan; Then, H. H.; Sumari, Putra; Venkatesa Mohan, N.

    2011-10-01

    Breast cancer continues to be a significant public health problem in the world. Early detection is the key for improving breast cancer prognosis. The aim of the research presented here is twofold. The first stage of the research involves machine learning techniques, which segment and extract features from the mass in digital mammograms. The second stage is a problem solving approach which includes classification of the mass by a performance based case base classifier. In this paper we build a case-based classifier in order to diagnose mammographic images. We explain different methods and behaviors that have been added to the classifier to improve its performance. The initial performance based classifier with bagging proposed in the paper has been implemented, and it shows an improvement in specificity and sensitivity.

  7. A new classification algorithm based on RGH-tree search

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    In this paper, we put forward a new classification algorithm based on RGH-Tree search and perform the classification analysis and comparison study. This algorithm can save computing resource and increase the classification efficiency. The experiment shows that this algorithm can get better effect in dealing with three dimensional multi-kind data. We find that the algorithm has better generalization ability for small training set and big testing result.

  8. Cellular based cancer vaccines

    DEFF Research Database (Denmark)

    Hansen, Morten; Met, O; Svane, I M;

    2012-01-01

    Cancer vaccines designed to re-calibrate the existing host-tumour interaction, tipping the balance from tumor acceptance towards tumor control, hold huge potential to complement traditional cancer therapies. In general, limited success has been achieved with vaccines composed of tumor...... in vitro migration via autocrine receptor-mediated endocytosis of CCR7. In the current review, we discuss optimal design of DC maturation focused on pre-clinical as well as clinical results from standard and polarized dendritic cell based cancer vaccines....

  9. MASS CLASSIFICATION IN DIGITAL MAMMOGRAMS BASED ON DISCRETE SHEARLET TRANSFORM

    Directory of Open Access Journals (Sweden)

    J. Amjath Ali

    2013-01-01

    Full Text Available The most significant health problem in the world is breast cancer, and early detection is the key to predicting it. Mammography is the most reliable method to diagnose breast cancer at the earliest stage. The classification of the two most common findings in digital mammograms, microcalcifications and masses, is valuable for early detection. Since the appearance of the masses is similar to the surrounding parenchyma, the classification is not an easy task. In this study, an efficient approach to classify masses in the Mammography Image Analysis Society (MIAS) database mammogram images is presented. The key features used for the classification are the energies of the shearlet-decomposed image. These features are fed into an SVM classifier to classify mass/non-mass images and also benign/malignant ones. The results demonstrate that the proposed shearlet energy features outperform the wavelet energy features in terms of accuracy.

  10. EPA's program for risk assessment guidelines: Cancer classification issues

    Energy Technology Data Exchange (ETDEWEB)

    Wiltse, J. [Environmental Protection Agency, Washington, DC (United States)

    1990-12-31

    Issues presented are related to classification of weight of evidence in cancer risk assessments. The focus in this paper is on lines of evidence used in constructing a conclusion about potential human carcinogenicity. The paper also discusses issues that are mistakenly addressed as classification issues but are really part of the risk assessment process. 2 figs.

  11. AN OBJECT-BASED METHOD FOR CHINESE LANDFORM TYPES CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    H. Ding

    2016-06-01

    Full Text Available Landform classification is a necessary task for various fields of landscape and regional planning, for example landscape evaluation, erosion studies, and hazard prediction. This study proposes an improved object-based classification for Chinese landform types using the factor importance analysis of random forest and the gray-level co-occurrence matrix (GLCM). In this research, based on a 1 km DEM of China, the combination of terrain factors extracted from the DEM is selected by correlation analysis and Sheffield's entropy method. A random forest classification tree is applied to evaluate the importance of the terrain factors, which are used as multi-scale segmentation thresholds. Then the GLCM is computed for the knowledge base of classification. The classification result was checked using the 1:4,000,000 Chinese Geomorphological Map as reference. The overall classification accuracy of the proposed method is 5.7% higher than ISODATA unsupervised classification, and 15.7% higher than the traditional object-based classification method.

  12. An Object-Based Method for Chinese Landform Types Classification

    Science.gov (United States)

    Ding, Hu; Tao, Fei; Zhao, Wufan; Na, Jiaming; Tang, Guo'an

    2016-06-01

    Landform classification is a necessary task for various fields of landscape and regional planning, for example landscape evaluation, erosion studies, and hazard prediction. This study proposes an improved object-based classification for Chinese landform types using the factor importance analysis of random forest and the gray-level co-occurrence matrix (GLCM). In this research, based on a 1 km DEM of China, the combination of terrain factors extracted from the DEM is selected by correlation analysis and Sheffield's entropy method. A random forest classification tree is applied to evaluate the importance of the terrain factors, which are used as multi-scale segmentation thresholds. Then the GLCM is computed for the knowledge base of classification. The classification result was checked using the 1:4,000,000 Chinese Geomorphological Map as reference. The overall classification accuracy of the proposed method is 5.7% higher than ISODATA unsupervised classification, and 15.7% higher than the traditional object-based classification method.

  13. Two-Dimensional ARMA Modeling for Breast Cancer Detection and Classification

    CERN Document Server

    Bouaynaya, Nidhal; Schonfeld, Dan

    2009-01-01

    We propose a new model-based computer-aided diagnosis (CAD) system for tumor detection and classification (cancerous vs. benign) in breast images. Specifically, we show that (x-ray, ultrasound and MRI) images can be accurately modeled by two-dimensional autoregressive-moving average (ARMA) random fields. We derive two-stage Yule-Walker least-squares estimates of the model parameters, which are subsequently used as the basis for statistical inference and biophysical interpretation of the breast image. We use a k-means classifier to segment the breast image into three regions: healthy tissue, benign tumor, and cancerous tumor. Our simulation results on ultrasound breast images illustrate the power of the proposed approach.

  14. Gene expression classification of colon cancer into molecular subtypes: characterization, validation, and prognostic value.

    Directory of Open Access Journals (Sweden)

    Laetitia Marisa

    Full Text Available BACKGROUND: Colon cancer (CC) pathological staging fails to accurately predict recurrence, and to date, no gene expression signature has proven reliable for prognosis stratification in clinical practice, perhaps because CC is a heterogeneous disease. The aim of this study was to establish a comprehensive molecular classification of CC based on mRNA expression profile analyses. METHODS AND FINDINGS: Fresh-frozen primary tumor samples from a large multicenter cohort of 750 patients with stage I to IV CC who underwent surgery between 1987 and 2007 in seven centers were characterized for common DNA alterations, including BRAF, KRAS, and TP53 mutations, CpG island methylator phenotype, mismatch repair status, and chromosomal instability status, and were screened with whole genome and transcriptome arrays. 566 samples fulfilled RNA quality requirements. Unsupervised consensus hierarchical clustering applied to gene expression data from a discovery subset of 443 CC samples identified six molecular subtypes. These subtypes were associated with distinct clinicopathological characteristics, molecular alterations, specific enrichments of supervised gene expression signatures (stem cell phenotype-like, normal-like, serrated CC phenotype-like), and deregulated signaling pathways. Based on their main biological characteristics, we distinguished a deficient mismatch repair subtype, a KRAS mutant subtype, a cancer stem cell subtype, and three chromosomal instability subtypes, including one associated with down-regulated immune pathways, one with up-regulation of the Wnt pathway, and one displaying a normal-like gene expression profile. The classification was validated in the remaining 123 samples plus an independent set of 1,058 CC samples, including eight public datasets. Furthermore, prognosis was analyzed in the subset of stage II-III CC samples. The subtypes C4 and C6, but not the subtypes C1, C2, C3, and C5, were independently associated with shorter relapse

  15. Fast Wavelet-Based Visual Classification

    CERN Document Server

    Yu, Guoshen

    2008-01-01

    We investigate a biologically motivated approach to fast visual classification, directly inspired by the recent work of Serre et al. Specifically, trading-off biological accuracy for computational efficiency, we explore using wavelet and grouplet-like transforms to parallel the tuning of visual cortex V1 and V2 cells, alternated with max operations to achieve scale and translation invariance. A feature selection procedure is applied during learning to accelerate recognition. We introduce a simple attention-like feedback mechanism, significantly improving recognition and robustness in multiple-object scenes. In experiments, the proposed algorithm achieves or exceeds state-of-the-art success rate on object recognition, texture and satellite image classification, language identification and sound classification.

  16. Knowledge-Based Classification in Automated Soil Mapping

    Institute of Scientific and Technical Information of China (English)

    ZHOU BIN; WANG RENCHAO

    2003-01-01

    A machine-learning approach was developed for automated building of knowledge bases for soil resources mapping by using a classification tree to generate knowledge from training data. With this method, building a knowledge base for automated soil mapping was easier than using the conventional knowledge acquisition approach. The knowledge base built by the classification tree was used by the knowledge classifier to perform the soil type classification of Longyou County, Zhejiang Province, China using Landsat TM bi-temporal images and GIS data. To evaluate the performance of the resultant knowledge bases, the classification results were compared to an existing soil map based on a field survey. The accuracy assessment and analysis of the resultant soil maps suggested that the knowledge bases built by the machine-learning method were of good quality for mapping the distribution model of soil classes over the study area.

  17. Shape classification based on singular value decomposition transform

    Institute of Scientific and Technical Information of China (English)

    SHAABAN Zyad; ARIF Thawar; BABA Sami; KREKOR Lala

    2009-01-01

    In this paper, a new shape classification system based on the singular value decomposition (SVD) transform using a nearest neighbour classifier was proposed. The gray scale image of the shape object was converted into a black and white image. The squared Euclidean distance transform on the binary image was applied to extract the boundary image of the shape. SVD transform features were extracted from the boundary of the object shapes. The proposed classification system based on the SVD transform feature extraction method was compared with a classifier based on moment invariants, also using a nearest neighbour classifier. The experimental results showed the advantage of our proposed classification system.
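
    The pipeline in this abstract (binary image, boundary, SVD features, nearest neighbour) can be sketched as follows. Two simplifications are assumptions of this sketch and not the authors' method: the boundary is found with a plain 4-neighbour test, not the squared Euclidean distance transform, and the feature vector is the first `k` singular values normalised to sum to one. All names and the toy shapes are hypothetical.

```python
import numpy as np


def boundary(binary):
    """Foreground pixels with at least one background 4-neighbour."""
    padded = np.pad(binary, 1, constant_values=False)
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1]
                & padded[1:-1, :-2] & padded[1:-1, 2:])
    return binary & ~interior


def svd_features(binary, k=4):
    """First k singular values of the boundary image, normalised to sum 1."""
    s = np.linalg.svd(boundary(binary).astype(float), compute_uv=False)
    return (s / s.sum())[:k]


def nearest_neighbour(train_feats, train_labels, feat):
    """Label of the training feature vector closest in Euclidean distance."""
    dists = [np.linalg.norm(f - feat) for f in train_feats]
    return train_labels[int(np.argmin(dists))]


# Toy training shapes on a 16x16 grid: a filled square and a diagonal line.
square = np.zeros((16, 16), dtype=bool)
square[4:12, 4:12] = True
diagonal = np.eye(16, dtype=bool)

train_feats = [svd_features(square), svd_features(diagonal)]
train_labels = ["square", "diagonal"]

# Query: a larger filled square, which should match the square class.
query = np.zeros((16, 16), dtype=bool)
query[3:13, 3:13] = True
predicted = nearest_neighbour(train_feats, train_labels, svd_features(query))
```

    Normalising the singular values makes the features less sensitive to the size of the shape, which is why the 10x10 query square still lands next to the 8x8 training square.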

  18. Multiclass Classification Based on the Analytical Center of Version Space

    Institute of Scientific and Technical Information of China (English)

    ZENG Fanzi; QIU Zhengding; YUE Jianhai; LI Xiangqian

    2005-01-01

    The analytical center machine, based on the analytical center of version space, outperforms the support vector machine, especially when the version space is elongated or asymmetric. While the analytical center machine for binary classification is well understood, little is known about the corresponding multiclass classification. Moreover, the current “one versus all” multiclass classification method needs to repeatedly construct classifiers to separate a single class from all the others, which leads to daunting computation and low classification efficiency, and although the multiclass support vector machine corresponds to a simple quadratic optimization, it is not very effective when the version space is asymmetric or elongated. Thus, a multiclass classification approach based on the analytical center of version space is proposed to address the above problems. Experiments on wine recognition and glass identification datasets demonstrate the validity of the proposed approach.

  19. Parallel Implementation of Classification Algorithms Based on Cloud Computing Environment

    Directory of Open Access Journals (Sweden)

    Wenbo Wang

    2012-09-01

    Full Text Available As an important task of data mining, classification has received considerable attention in many applications, such as information retrieval and web searching. The growing volume of information produced by the progress of technology, together with the growing individual needs of data mining, makes classifying very large scale data a challenging task. In order to deal with the problem, many researchers try to design efficient parallel classification algorithms. This paper briefly introduces classification algorithms and cloud computing, analyses the weaknesses of present parallel classification algorithms on that basis, and then proposes a new model of parallel classification algorithms. It mainly introduces a parallel Naïve Bayes classification algorithm based on MapReduce, which is a simple yet powerful parallel programming technique. The experimental results demonstrate that the proposed algorithm improves on the original algorithm's performance, and that it can process large datasets efficiently on commodity hardware.
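
    A rough sketch of how Naïve Bayes training decomposes into map and reduce steps is given below. It runs in a single process and only imitates the MapReduce pattern; it does not use Hadoop or any cluster framework, and it is not the authors' implementation. The function names, the key layout, the single smoothing constant `n_values` and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter
from functools import reduce


def map_counts(split):
    """Map step: count labels and (label, feature index, value) in one split."""
    counts = Counter()
    for features, label in split:
        counts[("class", label)] += 1
        for j, value in enumerate(features):
            counts[("feat", label, j, value)] += 1
    return counts


def reduce_counts(a, b):
    """Reduce step: merge partial counts by summing values of equal keys."""
    return a + b


def predict(counts, features, labels, n_values):
    """Pick the label with the highest log-posterior (Laplace smoothing)."""
    total = sum(counts[("class", label)] for label in labels)
    scores = {}
    for label in labels:
        n_label = counts[("class", label)]
        score = math.log(n_label / total)
        for j, value in enumerate(features):
            seen = counts[("feat", label, j, value)]
            score += math.log((seen + 1) / (n_label + n_values))
        scores[label] = score
    return max(scores, key=scores.get)


# Two data splits, as if stored on two different worker nodes.
splits = [
    [(("sunny", "hot"), "no"), (("rainy", "mild"), "yes")],
    [(("sunny", "mild"), "no"), (("rainy", "cool"), "yes"),
     (("overcast", "hot"), "yes")],
]
model = reduce(reduce_counts, map(map_counts, splits))
label_a = predict(model, ("sunny", "hot"), ["yes", "no"], n_values=3)
label_b = predict(model, ("rainy", "cool"), ["yes", "no"], n_values=3)
```

    Because the model consists only of counts, each split can be processed independently and the partial results merged by addition, which is what makes the algorithm a natural fit for MapReduce.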

  20. An Efficient Audio Classification Approach Based on Support Vector Machines

    Directory of Open Access Journals (Sweden)

    Lhoucine Bahatti

    2016-05-01

    Full Text Available In order to achieve an audio classification aimed at identifying the composer, the use of adequate and relevant features is important to improve performance, especially when the classification algorithm is based on support vector machines. As opposed to conventional approaches that often use timbral features based on a time-frequency representation of the musical signal using a constant window, this paper deals with a new audio classification method which improves the feature extraction according to the Constant Q Transform (CQT) approach and includes original audio features related to the musical context in which the notes appear. The enhancement brought by this work also lies in the proposal of an optimal feature selection procedure which combines filter and wrapper strategies. Experimental results show the accuracy and efficiency of the adopted approach in binary classification as well as in multi-class classification.

  1. Behavior Based Social Dimensions Extraction for Multi-Label Classification.

    Science.gov (United States)

    Li, Le; Xu, Junyi; Xiao, Weidong; Ge, Bin

    2016-01-01

    Classification based on social dimensions is commonly used to handle the multi-label classification task in heterogeneous networks. However, traditional methods, which mostly rely on the community detection algorithms to extract the latent social dimensions, produce unsatisfactory performance when community detection algorithms fail. In this paper, we propose a novel behavior based social dimensions extraction method to improve the classification performance in multi-label heterogeneous networks. In our method, nodes' behavior features, instead of community memberships, are used to extract social dimensions. By introducing Latent Dirichlet Allocation (LDA) to model the network generation process, nodes' connection behaviors with different communities can be extracted accurately, which are applied as latent social dimensions for classification. Experiments on various public datasets reveal that the proposed method can obtain satisfactory classification results in comparison to other state-of-the-art methods on smaller social dimensions.

  2. Classification of skin cancer images using local binary pattern and SVM classifier

    Science.gov (United States)

    Adjed, Faouzi; Faye, Ibrahima; Ababsa, Fakhreddine; Gardezi, Syed Jamal; Dass, Sarat Chandra

    2016-11-01

    In this paper, a classification method for melanoma and non-melanoma skin cancer images has been presented using the local binary patterns (LBP). The LBP computes the local texture information from the skin cancer images, which is later used to compute some statistical features that have capability to discriminate the melanoma and non-melanoma skin tissues. Support vector machine (SVM) is applied on the feature matrix for classification into two skin image classes (malignant and benign). The method achieves good classification accuracy of 76.1% with sensitivity of 75.6% and specificity of 76.7%.
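
    The texture descriptor named in this abstract can be illustrated with a minimal sketch of the basic 3x3 local binary pattern and its histogram. The abstract does not state which LBP variant or which statistical features were used, so the neighbour ordering, the `>=` comparison and the function names are assumptions. The resulting histogram is the kind of feature vector that could then be passed to an SVM.

```python
import numpy as np


def lbp_image(gray):
    """8-neighbour LBP code for every interior pixel of a 2-D array."""
    g = np.asarray(gray, dtype=float)
    h, w = g.shape
    center = g[1:-1, 1:-1]
    # Neighbour offsets, clockwise from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = g[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright.
        codes += (neighbour >= center).astype(np.uint8) * np.uint8(1 << bit)
    return codes


def lbp_histogram(gray):
    """Normalised 256-bin histogram of LBP codes, usable as a feature vector."""
    codes = lbp_image(gray)
    return np.bincount(codes.ravel(), minlength=256) / codes.size


flat = np.full((5, 5), 7)                           # uniform patch
spot = np.array([[0, 0, 0], [0, 9, 0], [0, 0, 0]])  # one bright pixel
flat_hist = lbp_histogram(flat)
spot_code = lbp_image(spot)[0, 0]
```

    A uniform patch yields code 255 everywhere, since every neighbour equals the centre, while an isolated bright pixel yields code 0, since no neighbour reaches the centre value.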

  3. Classification

    Science.gov (United States)

    Clary, Renee; Wandersee, James

    2013-01-01

    In this article, Renee Clary and James Wandersee describe the beginnings of "Classification," which lies at the very heart of science and depends upon pattern recognition. Clary and Wandersee approach patterns by first telling the story of the "Linnaean classification system," introduced by Carl Linnaeus (1707-1778), who is…

  4. Breast cancer surgery and diagnosis-related groups (DRGs): patient classification and hospital reimbursement in 11 European countries.

    Science.gov (United States)

    Scheller-Kreinsen, David; Quentin, Wilm; Geissler, Alexander; Busse, Reinhard

    2013-10-01

    Researchers from eleven countries (i.e. Austria, England, Estonia, Finland, France, Germany, Ireland, Netherlands, Poland, Spain, and Sweden) compared how their DRG systems deal with breast cancer surgery patients. DRG algorithms and indicators of resource consumption were assessed for those DRGs that individually contain at least 1% of all breast cancer surgery patients. Six standardised case vignettes were defined and quasi prices according to national DRG-based hospital payment systems were ascertained. European DRG systems classify breast cancer surgery patients according to different sets of classification variables into three to seven DRGs. Quasi prices for an index case treated with partial mastectomy range from €577 in Poland to €5780 in the Netherlands. Countries award their highest payments for very different kinds of patients. Breast cancer specialists and national DRG authorities should consider how other countries' DRG systems classify breast cancer patients in order to identify potential scope for improvement and to ensure fair and appropriate reimbursement.

  5. TENSOR MODELING BASED FOR AIRBORNE LiDAR DATA CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    N. Li

    2016-06-01

    Full Text Available Feature selection and description is a key factor in classification of Earth observation data. In this paper a classification method based on tensor decomposition is proposed. First, multiple features are extracted from the raw LiDAR point cloud, and raster LiDAR images are derived by accumulating features or the “raw” data attributes. Then, the feature rasters of LiDAR data are stored as a tensor, and tensor decomposition is used to select component features. This tensor representation can keep the initial spatial structure and ensure that the neighborhood is considered. Based on a small number of component features, a k-nearest neighbor classification is applied.

  6. Tensor Modeling Based for Airborne LiDAR Data Classification

    Science.gov (United States)

    Li, N.; Liu, C.; Pfeifer, N.; Yin, J. F.; Liao, Z. Y.; Zhou, Y.

    2016-06-01

    Feature selection and description is a key factor in classification of Earth observation data. In this paper a classification method based on tensor decomposition is proposed. First, multiple features are extracted from the raw LiDAR point cloud, and raster LiDAR images are derived by accumulating features or the "raw" data attributes. Then, the feature rasters of LiDAR data are stored as a tensor, and tensor decomposition is used to select component features. This tensor representation can keep the initial spatial structure and ensure that the neighborhood is considered. Based on a small number of component features, a k-nearest neighbor classification is applied.

  7. Speech Segregation based on Binary Classification

    Science.gov (United States)

    2016-07-15

    ...to the adoption of the ideal ratio mask (IRM). A subsequent listening evaluation shows increased intelligibility in noise for human listeners... Subject terms: binary classification, time-frequency masking, supervised speech segregation, speech intelligibility, room reverberation.

  8. NONSUBSAMPLED CONTOURLET TRANSFORM BASED CLASSIFICATION OF MICROCALCIFICATION IN DIGITAL MAMMOGRAMS

    Directory of Open Access Journals (Sweden)

    J. S. Leena Jasmine

    2013-01-01

    Full Text Available Mammography is the best available radiographic method to detect breast cancer at an early stage. However, detecting microcalcification clusters at an early stage is a tough task for the radiologist. Herein we present a novel approach for classifying microcalcifications in digital mammograms using the Nonsubsampled Contourlet Transform (NSCT) and a Support Vector Machine (SVM). Classification is achieved by extracting microcalcification features from the Contourlet coefficients of the image, and the outcomes are used as input to the SVM for classification. The system classifies mammogram images as normal or abnormal, and the abnormal severity as benign or malignant. The system is evaluated using the Mammography Image Analysis Society (MIAS) database. The experimental results show that the proposed method provides an improved classification rate.

  9. Intelligent Hybrid Cluster Based Classification Algorithm for Social Network Analysis

    Directory of Open Access Journals (Sweden)

    S. Muthurajkumar

    2014-05-01

    Full Text Available In this paper, we propose a hybrid clustering-based classification algorithm, based on a mean approach, to mine the ordered sequences (paths) from weblog data in order to perform social network analysis. In the proposed system for social pattern analysis, sequences of human activities are typically analyzed by switching behaviors, which are likely to produce overlapping clusters. A robust Modified Boosting algorithm is applied to the hybrid clustering-based classification for clustering the data. This work helps connect the aggregated features from the network data with the traditional indices used in social network analysis. Experimental results show that the proposed algorithm improves the decision results from data clustering when combined with the proposed classification algorithm, and that it provides better classification accuracy when tested on a weblog dataset. In addition, the algorithm improves predictive performance, especially for multiclass datasets, which can increase accuracy.

  10. Methodological Aspects of Prognostic Classifications: Applications in Testicular Cancer

    NARCIS (Netherlands)

    M.R. van Dijk (Merel)

    2007-01-01

    Patients with similar characteristics can be grouped together in a prognostic classification to estimate a patient’s prognosis and guide treatment decisions. The topic of this thesis is methodological aspects of defining prognosis classifications. We specifically looked at patients wi

  11. Hybrid Support Vector Machines-Based Multi-fault Classification

    Institute of Scientific and Technical Information of China (English)

    GAO Guo-hua; ZHANG Yong-zhong; ZHU Yu; DUAN Guang-huang

    2007-01-01

    Support Vector Machines (SVM) is a new general machine-learning tool based on the structural risk minimization principle. This characteristic is very significant for fault diagnostics when the number of fault samples is limited. Considering that SVM theory was originally designed for two-class classification, a hybrid SVM scheme is proposed in our paper for multi-fault classification of rotating machinery. Two SVM strategies, 1-v-1 (one versus one) and 1-v-r (one versus rest), are adopted at different classification levels. At the parallel classification level, using the 1-v-1 strategy, the fault features extracted by various signal analysis methods are passed to multiple parallel SVMs and the local classification results are obtained. At the serial classification level, these local results are fused by one serial SVM based on the 1-v-r strategy. The hybrid SVM scheme introduced in our paper not only generalizes the performance of signal binary SVMs but also improves the precision and reliability of the fault classification results. The actual test results show the availability and suitability of this new method.

  12. Key-phrase based classification of public health web pages.

    Science.gov (United States)

    Dolamic, Ljiljana; Boyer, Célia

    2013-01-01

    This paper describes and evaluates a public health web page classification model based on key phrase extraction and matching. Easily extensible in terms of both new classes and new languages, this method proves to be a good solution for text classification faced with a total lack of training data. To evaluate the proposed solution we used a small collection of public health related web pages created by a double-blind manual classification. Our experiments have shown that by choosing an adequate threshold value, the desired value for either precision or recall can be achieved.
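
    The role of the threshold mentioned in this record can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' model: the scoring rule (fraction of a class's key phrases found in the page), the function names, and the example phrases are all hypothetical.

```python
def score_page(text, key_phrases):
    """Fraction of a class's key phrases that occur in the page text."""
    lowered = text.lower()
    hits = sum(1 for phrase in key_phrases if phrase.lower() in lowered)
    return hits / len(key_phrases) if key_phrases else 0.0


def classify_page(text, class_phrases, threshold):
    """Return, in sorted order, every class whose score reaches the threshold."""
    return sorted(
        label
        for label, phrases in class_phrases.items()
        if score_page(text, phrases) >= threshold
    )


# Hypothetical classes and key phrases, for illustration only.
CLASS_PHRASES = {
    "vaccination": ["vaccine", "immunization schedule"],
    "nutrition": ["balanced diet", "vitamin"],
}
PAGE = "The vaccine is given according to the immunization schedule. Vitamin D helps."

low = classify_page(PAGE, CLASS_PHRASES, threshold=0.5)
high = classify_page(PAGE, CLASS_PHRASES, threshold=0.75)
```

    Raising the threshold keeps only the classes with stronger phrase evidence, which tends to favour precision; lowering it admits more classes, which tends to favour recall.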

  13. Support vector classification algorithm based on variable parameter linear programming

    Institute of Scientific and Technical Information of China (English)

    Xiao Jianhua; Lin Jian

    2007-01-01

    To solve the problems of SVM in dealing with large sample size and asymmetric distributed samples, a support vector classification algorithm based on variable parameter linear programming is proposed. In the proposed algorithm, linear programming is employed to solve the optimization problem of classification to decrease the computation time and to reduce its complexity when compared with the original model. The adjusted punishment parameter greatly reduced the classification error resulting from asymmetric distributed samples and the detailed procedure of the proposed algorithm is given. An experiment is conducted to verify whether the proposed algorithm is suitable for asymmetric distributed samples.

  14. Words semantic orientation classification based on HowNet

    Institute of Scientific and Technical Information of China (English)

    LI Dun; MA Yong-tao; GUO Jian-li

    2009-01-01

    Based on the text orientation classification, a new measurement approach to semantic orientation of words was proposed. According to the integrated and detailed definition of words in HowNet, seed sets including the words with intense orientations were built up. The orientation similarity between the seed words and the given word was then calculated using the sentiment weight priority to recognize the semantic orientation of common words. Finally, the words' semantic orientation and the context were combined to recognize the given words' orientation. The experiments show that the measurement approach achieves better results for common words' orientation classification and contributes particularly to the text orientation classification of large granularities.

  15. Radar Target Classification using Recursive Knowledge-Based Methods

    DEFF Research Database (Denmark)

    Jochumsen, Lars Wurtz

    The topic of this thesis is target classification of radar tracks from a 2D mechanically scanning coastal surveillance radar. The measurements provided by the radar are position data and therefore the classification is mainly based on kinematic data, which is deduced from the position. The target...... been terminated. Therefore, an update of the classification results must be made for each measurement of the target. The data for this work are collected throughout the PhD and are both collected from radars and other sensors such as GPS....

  16. Identifying Cancer Biomarkers Via Node Classification within a Mapreduce Framework

    Directory of Open Access Journals (Sweden)

    Taysir Hassan A. Soliman

    2015-12-01

    Full Text Available Big data pose new research challenges in the life sciences domain because of their variety, volume, veracity, velocity, and value. Predicting gene biomarkers is one of the vital research issues in the bioinformatics field, where microarray gene expression and network-based methods can be used. These datasets suffer from a huge data volume, causing main memory problems. In this paper, a Random Committee Node Classifier algorithm (RCNC) is proposed for identifying cancer biomarkers, based on microarray gene expression data and Protein-Protein Interaction (PPI) data. Data are enriched from other public databases, such as IntAct, UniProt, and Gene Ontology (GO). Cancer biomarkers are identified when applied to different datasets with an accuracy rate of 99.16%, 99.96% precision, 99.24% recall, 99.16% F1-measure and 99.6 ROC. To speed up the performance, it is run within a MapReduce framework, where the RCNC MapReduce algorithm is much faster than the RCNC sequential algorithm on large datasets.

  17. Fuzzy Aspect Based Opinion Classification System for Mining Tourist Reviews

    Directory of Open Access Journals (Sweden)

    Muhammad Afzaal

    2016-01-01

    Full Text Available Due to the large number of opinions available on websites, tourists are often overwhelmed with information and find it extremely difficult to use the available information to decide which tourist places to visit. A number of opinion mining methods have been proposed in the past to identify and classify an opinion as positive or negative. Recently, aspect based opinion mining has been introduced, which targets the various aspects present in the opinion text. A number of aspect based opinion classification methods are available in the literature, but very limited research work has targeted the automatic identification and extraction of implicit, infrequent, and coreferential aspects. Aspect based classification suffers from the presence of irrelevant sentences in a typical user review. Such sentences make the data noisy and degrade the classification accuracy of machine learning algorithms. This paper presents a fuzzy aspect based opinion classification system which efficiently extracts aspects from user opinions and performs near-accurate classification. We conducted experiments on real world datasets to evaluate the effectiveness of our proposed system. Experimental results show that the proposed system not only is effective in aspect extraction but also improves the classification accuracy.

  18. A Syntactic Classification based Web Page Ranking Algorithm

    CERN Document Server

    Mukhopadhyay, Debajyoti; Kim, Young-Chon

    2011-01-01

    The existing search engines sometimes give unsatisfactory search result for lack of any categorization of search result. If there is some means to know the preference of user about the search result and rank pages according to that preference, the result will be more useful and accurate to the user. In the present paper a web page ranking algorithm is being proposed based on syntactic classification of web pages. Syntactic Classification does not bother about the meaning of the content of a web page. The proposed approach mainly consists of three steps: select some properties of web pages based on user's demand, measure them, and give different weightage to each property during ranking for different types of pages. The existence of syntactic classification is supported by running fuzzy c-means algorithm and neural network classification on a set of web pages. The change in ranking for difference in type of pages but for same query string is also being demonstrated.
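
    The three steps listed in this record (select properties, measure them, weight them differently per page type) can be sketched as a weighted sum. The property names, weight profiles, and page records below are hypothetical and serve only to show how the same pages can rank differently under different weightings; the abstract does not give the authors' actual properties or weights.

```python
def rank_pages(pages, weights):
    """Order page URLs by the weighted sum of their measured properties."""
    def score(page):
        return sum(
            weights.get(name, 0.0) * value
            for name, value in page["properties"].items()
        )
    return [page["url"] for page in sorted(pages, key=score, reverse=True)]


# Hypothetical measured properties, each already scaled to the range 0..1.
PAGES = [
    {"url": "a", "properties": {"images": 0.9, "text_length": 0.2}},
    {"url": "b", "properties": {"images": 0.1, "text_length": 0.8}},
]

# Hypothetical weight profiles for two different user preferences.
GALLERY_WEIGHTS = {"images": 1.0, "text_length": 0.1}
ARTICLE_WEIGHTS = {"images": 0.1, "text_length": 1.0}

gallery_order = rank_pages(PAGES, GALLERY_WEIGHTS)
article_order = rank_pages(PAGES, ARTICLE_WEIGHTS)
```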

  19. Texture Classification Using Sparse Frame-Based Representations

    Directory of Open Access Journals (Sweden)

    Skretting Karl

    2006-01-01

    Full Text Available A new method for supervised texture classification, denoted by frame texture classification method (FTCM), is proposed. The method is based on a deterministic texture model in which a small image block, taken from a texture region, is modeled as a sparse linear combination of frame elements. FTCM has two phases. In the design phase a frame is trained for each texture class based on given texture example images. The design method is an iterative procedure in which the representation error, given a sparseness constraint, is minimized. In the classification phase each pixel in a test image is labeled by analyzing its spatial neighborhood. This block is represented by each of the frames designed for the texture classes under consideration, and the frame giving the best representation gives the class. The FTCM is applied to nine test images of natural textures commonly used in other texture classification work, yielding excellent overall performance.

  20. Feature Extraction based Face Recognition, Gender and Age Classification

    Directory of Open Access Journals (Sweden)

    Venugopal K R

    2010-01-01

    Full Text Available Face recognition systems with large training sets for personal identification normally attain good accuracy. In this paper, we propose the Feature Extraction based Face Recognition, Gender and Age Classification (FEBFRGAC) algorithm, which needs only small training sets and yields good results even with one image per person. The process involves three stages: Pre-processing, Feature Extraction and Classification. The geometric features of facial images such as eyes, nose, and mouth are located using the Canny edge operator, and face recognition is performed. Based on the texture and shape information, gender and age classification is done using Posteriori Class Probability and an Artificial Neural Network respectively. It is observed that the face recognition accuracy is 100%, and the gender and age classification accuracies are around 98% and 94% respectively.

  1. Analysis of Kernel Approach in Fuzzy-Based Image Classifications

    Directory of Open Access Journals (Sweden)

    Mragank Singhal

    2013-03-01

    Full Text Available This paper presents a framework for the kernel approach in fuzzy based image classification in remote sensing. The goal of image classification is to separate images according to their visual content into two or more disjoint classes. Fuzzy logic is a relatively young theory. A major advantage of this theory is that it allows problems to be described naturally, in linguistic terms, rather than in terms of relationships between precise numerical values. This paper describes how remote sensing data with uncertainty are handled with fuzzy based classification using the kernel approach for land use/land cover map generation. The introduction to fuzzification using the kernel approach provides the basis for the development of more robust approaches to the remote sensing classification problem. The kernel explicitly defines a similarity measure between two samples and implicitly represents the mapping of the input space to the feature space.
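
    To make the closing statement about kernels concrete, here is a minimal sketch of one widely used kernel, the Gaussian (RBF) kernel, acting as a similarity measure between two samples. The choice of the RBF kernel and the `gamma` value are assumptions for illustration; the abstract does not state which kernels the authors use.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Identical samples have similarity 1; similarity decays with distance.
same = rbf_kernel([1.0, 2.0], [1.0, 2.0])
apart = rbf_kernel([0.0, 0.0], [1.0, 1.0], gamma=0.5)
```

    The kernel value is computed directly in the input space, yet it equals an inner product in an implicit feature space, which is the sense in which the mapping is represented implicitly.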

  2. Object Based and Pixel Based Classification Using Rapideye Satellite Imager of ETI-OSA, Lagos, Nigeria

    Directory of Open Access Journals (Sweden)

    Esther Oluwafunmilayo Makinde

    2016-12-01

    Full Text Available Several studies have been carried out to find an appropriate method to classify remote sensing data. Traditional classification approaches are all pixel-based and do not utilize the spatial information within an object, which is an important source of information for image classification. Thus, this study compared pixel-based and object-based classification algorithms using a RapidEye satellite image of Eti-Osa LGA, Lagos. In the object-oriented approach, the image was segmented into homogeneous areas using suitable parameters such as scale parameter, compactness, and shape. Classification based on segments was done by a nearest neighbour classifier. In the pixel-based classification, the spectral angle mapper was used to classify the images. The user accuracies for object-based classification were 98.31% for waterbody, 92.31% for vegetation, 86.67% for bare soil, and 90.57% for built-up, while the user accuracies for pixel-based classification were 98.28% for waterbody, 84.06% for vegetation, 86.36% for bare soil, and 79.41% for built-up. These classification techniques were subjected to accuracy assessment: the overall accuracy of the object-based classification was 94.47%, while that of the pixel-based classification was 86.64%. The results of classification and accuracy assessment show that the object-based approach gave more accurate and satisfying results.
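
    The two accuracy measures reported in this record are conventionally derived from a confusion matrix. The sketch below assumes rows hold the classified (map) labels and columns hold the reference labels; the example counts are invented for illustration and are not the study's data.

```python
def users_accuracy(matrix):
    """Per-class user's accuracy: correct count divided by the row total."""
    return [
        row[i] / sum(row) if sum(row) else 0.0
        for i, row in enumerate(matrix)
    ]


def overall_accuracy(matrix):
    """Overall accuracy: sum of the diagonal divided by all samples."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


# Invented two-class example: rows = classified labels, columns = reference.
CONFUSION = [
    [50, 2],
    [5, 43],
]

per_class = users_accuracy(CONFUSION)
overall = overall_accuracy(CONFUSION)
```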

  3. Tomato classification based on laser metrology and computer algorithms

    Science.gov (United States)

    Igno Rosario, Otoniel; Muñoz Rodríguez, J. Apolinar; Martínez Hernández, Haydeé P.

    2011-08-01

    An automatic technique for tomato classification based on size and color is presented. The size is determined from surface contouring by laser line scanning. Here, a Bezier network computes the tomato height based on the line position. The tomato color is determined by the CIELCH color space and the red and green components. Thus, the tomato size is classified as large, medium or small. Also, the tomato is classified into six colors associated with its maturity. The performance and accuracy of the classification system are evaluated based on methods reported in recent years. The technique is tested and experimental results are presented.

  4. Classification of LiDAR Data with Point Based Classification Methods

    Science.gov (United States)

    Yastikli, N.; Cetin, Z.

    2016-06-01

    LiDAR is one of the most effective systems for three-dimensional (3D) data collection over wide areas. Nowadays, with increasing point density and accuracy, airborne LiDAR data are used frequently in various applications such as object extraction, 3D modelling, change detection and revision of maps. The classification of the LiDAR points is the first step of the LiDAR data processing chain and should be handled properly, since applications such as 3D city modelling, building extraction, and DEM generation directly use the classified point clouds. Different classification methods can be seen in recent research, and most studies work with the gridded LiDAR point cloud. In grid based processing of LiDAR data, the loss of characteristic points, especially for vegetation and buildings, and the loss of height accuracy during the interpolation stage are inevitable. In this case, a possible solution is to use the raw point cloud data for classification to avoid data and accuracy loss in the gridding process. In this study, the point based classification possibilities of the LiDAR point cloud are investigated to obtain more accurate classes. Automatic point based approaches, which are based on hierarchical rules, have been proposed to obtain ground, building and vegetation classes using the raw LiDAR point cloud data. In the proposed approaches, every single LiDAR point is analyzed according to its features, such as height and multi-return, and then automatically assigned to the class to which it belongs. The use of the un-gridded point cloud in the proposed point based classification process helped the determination of more realistic rule sets. Detailed parameter analyses have been performed to obtain the most appropriate parameters in the rule sets to achieve accurate classes. The hierarchical rule sets were created for the proposed Approach 1 (using selected spatial-based and echo-based features) and Approach 2 (using only selected spatial-based features

  5. Multiplatform analysis of 12 cancer types reveals molecular classification within and across tissues of origin

    DEFF Research Database (Denmark)

    Hoadley, Katherine A; Yau, Christina; Wolf, Denise M

    2014-01-01

    Recent genomic analyses of pathologically defined tumor types identify "within-a-tissue" disease subtypes. However, the extent to which genomic signatures are shared across tissues is still unclear. We performed an integrative analysis using five genome-wide platforms and one proteomic platform...... on 3,527 specimens from 12 cancer types, revealing a unified classification into 11 major subtypes. Five subtypes were nearly identical to their tissue-of-origin counterparts, but several distinct cancer types were found to converge into common subtypes. Lung squamous, head and neck, and a subset...... of bladder cancers coalesced into one subtype typified by TP53 alterations, TP63 amplifications, and high expression of immune and proliferation pathway genes. Of note, bladder cancers split into three pan-cancer subtypes. The multiplatform classification, while correlated with tissue-of-origin, provides...

  6. Semantic Document Image Classification Based on Valuable Text Pattern

    Directory of Open Access Journals (Sweden)

    Hossein Pourghassem

    2011-01-01

    Full Text Available Knowledge extraction from detected document images is a complex problem in the field of information technology. The problem becomes more intricate when we consider that only a negligible percentage of the detected document images are valuable. In this paper, a segmentation-based classification algorithm is used to analyze the document image. In this algorithm, using a two-stage segmentation approach, regions of the image are detected and then classified into document and non-document (pure region) regions in a hierarchical classification. A novel definition of "valuable" is proposed to classify document images into valuable or invaluable categories. The proposed algorithm is evaluated on a database of document and non-document images gathered from the Internet. Experimental results show the efficiency of the proposed algorithm in semantic document image classification. The proposed algorithm provides an accuracy rate of 98.8% for the valuable and invaluable document image classification problem.

  7. Indoor scene classification of robot vision based on cloud computing

    Science.gov (United States)

    Hu, Tao; Qi, Yuxiao; Li, Shipeng

    2016-07-01

    For intelligent service robots, indoor scene classification is an important issue. To overcome the weak real-time performance of conventional algorithms, a new method based on cloud computing is proposed for global image features in indoor scene classification. With the MapReduce method, the global PHOG feature of an indoor scene image is extracted in parallel, and the feature eigenvector is used to train the decision classifier through SVM concurrently. The indoor scene is then classified by the decision classifier. To verify the algorithm performance, we carried out an experiment with 350 typical indoor scene images from the MIT LabelMe image library. Experimental results show that the proposed algorithm can attain better real-time performance. Generally, it is 1.4 to 2.1 times faster than traditional classification methods that rely on single-machine computation, while keeping a stable classification accuracy of 70%.

  8. Classification approach based on association rules mining for unbalanced data

    CERN Document Server

    Ndour, Cheikh

    2012-01-01

    This paper deals with supervised classification when the response variable is binary and its class distribution is unbalanced. In such a situation, it is not possible to build a powerful classifier by using standard methods such as logistic regression, classification trees, discriminant analysis, etc. To overcome the shortcoming of these methods, which provide classifiers with low sensitivity, we tackle the classification problem through an approach based on association rules learning, because this approach has the advantage of allowing the identification of patterns that are well correlated with the target class. Association rules learning is a well-known method in the area of data mining. It is used when dealing with large databases for unsupervised discovery of local patterns that express hidden relationships between variables. In considering association rules from a supervised learning point of view, a relevant set of weak classifiers is obtained from which one derives a classification rule...

  9. Ensemble polarimetric SAR image classification based on contextual sparse representation

    Science.gov (United States)

    Zhang, Lamei; Wang, Xiao; Zou, Bin; Qiao, Zhijun

    2016-05-01

    Polarimetric SAR (PolSAR) image interpretation has become one of the most interesting topics, in which the construction of a reasonable and effective image classification technique is of key importance. Sparse representation represents the data using the most succinct sparse atoms of an over-complete dictionary, and its advantages have also been confirmed in the field of PolSAR classification. However, like any ordinary classifier, it is imperfect in several respects. Ensemble learning is therefore introduced to address this issue: a plurality of different learners are trained, and their individual outputs are combined to obtain more accurate and ideal learning results. This paper accordingly presents a polarimetric SAR image classification method based on ensemble learning of sparse representation to achieve optimal classification.

  10. Gene selection for microarray cancer classification using a new evolutionary method employing artificial intelligence concepts.

    Science.gov (United States)

    Dashtban, M; Balafar, Mohammadali

    2017-03-01

    Gene selection is a demanding task for microarray data analysis. The diverse complexity of different cancers makes this issue still challenging. In this study, a novel evolutionary method based on genetic algorithms and artificial intelligence is proposed to identify predictive genes for cancer classification. A filter method was first applied to reduce the dimensionality of the feature space, followed by an integer-coded genetic algorithm with dynamic-length genotype, intelligent parameter settings, and modified operators. The algorithmic behaviors including convergence trends, mutation and crossover rate changes, and running time were studied, conceptually discussed, and shown to be coherent with literature findings. Two well-known filter methods, Laplacian and Fisher score, were examined considering similarities, the quality of selected genes, and their influences on the evolutionary approach. Several statistical tests concerning choice of classifier, choice of dataset, and choice of filter method were performed, and they revealed some significant differences between the performance of different classifiers and filter methods over datasets. The proposed method was benchmarked on five popular high-dimensional cancer datasets; for each, the top explored genes were reported. Comparing the experimental results with several state-of-the-art methods revealed that the proposed method outperforms previous methods on the DLBCL dataset.
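
    One of the two filter methods named in this record, the Fisher score, can be sketched in a few lines. The version below uses the common definition (between-class scatter of the class means divided by the size-weighted within-class variance, computed per feature); the function names and the toy data are assumptions for illustration, and the abstract does not say which exact variant the authors applied.

```python
def fisher_scores(samples, labels):
    """Fisher score of each feature: between-class over within-class scatter."""
    classes = sorted(set(labels))
    scores = []
    for j in range(len(samples[0])):
        column = [row[j] for row in samples]
        overall_mean = sum(column) / len(column)
        between = 0.0
        within = 0.0
        for cls in classes:
            values = [v for v, lab in zip(column, labels) if lab == cls]
            n = len(values)
            mean = sum(values) / n
            variance = sum((v - mean) ** 2 for v in values) / n
            between += n * (mean - overall_mean) ** 2
            within += n * variance
        if within > 0:
            scores.append(between / within)
        else:
            # No within-class spread: perfectly separating or constant feature.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def top_features(samples, labels, k):
    """Indices of the k features with the highest Fisher score."""
    scores = fisher_scores(samples, labels)
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:k]


# Toy expression matrix: feature 0 separates the classes, feature 1 does not.
SAMPLES = [[1.0, 5.0], [1.2, 3.0], [3.0, 4.0], [3.2, 6.0]]
LABELS = [0, 0, 1, 1]
SCORES = fisher_scores(SAMPLES, LABELS)
```

    A filter of this kind only shrinks the candidate set; in the pipeline described above, the reduced set would then be handed to the genetic algorithm for the actual gene selection.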

  11. Dissimilarity-based classification of anatomical tree structures

    DEFF Research Database (Denmark)

    Sørensen, Lauge Emil Borch Laurs; Lo, Pechin Chien Pau; Dirksen, Asger

    2011-01-01

    A novel method for classification of abnormality in anatomical tree structures is presented. A tree is classified based on direct comparisons with other trees in a dissimilarity-based classification scheme. The pair-wise dissimilarity measure between two trees is based on a linear assignment...... by including anatomical features in the branch feature vectors. The proposed approach is applied to classify airway trees in computed tomography images of subjects with and without chronic obstructive pulmonary disease (COPD). Using the wall area percentage (WA%), a common measure of airway abnormality in COPD...

  12. Dissimilarity-based classification of anatomical tree structures

    DEFF Research Database (Denmark)

    Sørensen, Lauge; Lo, Pechin Chien Pau; Dirksen, Asger

    2011-01-01

    A novel method for classification of abnormality in anatomical tree structures is presented. A tree is classified based on direct comparisons with other trees in a dissimilarity-based classification scheme. The pair-wise dissimilarity measure between two trees is based on a linear assignment...... between the branch feature vectors representing those trees. Hereby, localized information in the branches is collectively used in classification and variations in feature values across the tree are taken into account. An approximate anatomical correspondence between matched branches can be achieved...... by including anatomical features in the branch feature vectors. The proposed approach is applied to classify airway trees in computed tomography images of subjects with and without chronic obstructive pulmonary disease (COPD). Using the wall area percentage (WA%), a common measure of airway abnormality in COPD...

  13. Classification of Gait Types Based on the Duty-factor

    DEFF Research Database (Denmark)

    Fihl, Preben; Moeslund, Thomas B.

    2007-01-01

    This paper deals with classification of human gait types based on the notion that different gait types are in fact different types of locomotion, i.e., running is not simply walking done faster. We present the duty-factor, which is a descriptor based on this notion. The duty-factor is independent...

  14. 3D Land Cover Classification Based on Multispectral LIDAR Point Clouds

    Science.gov (United States)

    Zou, Xiaoliang; Zhao, Guihua; Li, Jonathan; Yang, Yuanxi; Fang, Yong

    2016-06-01

    A multispectral Lidar system can emit simultaneous laser pulses at different wavelengths. The reflected multispectral energy is captured by the sensor's receiver, and the return signal is recorded together with the position and orientation of the sensor. These recorded data are combined with GNSS/IMU data in post-processing to form high-density multispectral 3D point clouds. As the first commercial multispectral airborne Lidar sensor, the Optech Titan system is capable of collecting point cloud data from all three channels: 532 nm visible (green), 1064 nm near infrared (NIR), and 1550 nm intermediate infrared (IR). It has become a new source of data for 3D land cover classification. The paper presents an Object Based Image Analysis (OBIA) approach that uses only multispectral Lidar point cloud datasets for 3D land cover classification. The approach consists of three steps. First, multispectral intensity images are segmented into image objects by multi-resolution segmentation integrating different scale parameters. Second, intensity objects are classified into nine categories using customized classification index features and a combination of the multispectral reflectance with the vertical distribution of object features. Finally, accuracy is assessed by comparing random reference sample points from Google imagery tiles with the classification results. The classification results show high overall accuracy for most of the land cover types. An overall accuracy of over 90% is achieved using multispectral Lidar point clouds for 3D land cover classification.

  15. Super pixel density based clustering automatic image classification method

    Science.gov (United States)

    Xu, Mingxing; Zhang, Chuan; Zhang, Tianxu

    2015-12-01

    Image classification is an important means of image segmentation and data mining, and achieving rapid automated image classification has been a focus of research. This paper proposes an automatic image classification and outlier identification method based on superpixels and a clustering algorithm that finds cluster centers by density. The pixel location coordinates and gray values of the image are used to compute density and distance, from which automatic image classification and outlier extraction are achieved. Because the computational complexity increases dramatically with the number of pixels, the image is first preprocessed into a small number of superpixel sub-blocks, and the density and distance calculations are carried out on these sub-blocks. A normalized density and distance discrimination rule is designed to select cluster centers and classify automatically, so that the image is classified and outliers are identified without manual input. Extensive experiments show that the method requires no human intervention, computes faster than the density clustering algorithm, and can effectively perform automated image classification and outlier extraction.
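
    The density and distance quantities mentioned in this record can be sketched as follows. This is a minimal sketch that assumes a density-peaks style formulation (local density counted within a cutoff radius, and distance to the nearest point of higher density); the abstract gives no formulas, so the cutoff, the product score used to pick centers, and all names are assumptions made for illustration.

```python
from math import dist


def density_and_distance(points, cutoff):
    """Return (rho, delta) for each point.

    rho[i]:   number of other points closer than `cutoff` (local density).
    delta[i]: distance to the nearest point with strictly higher density;
              for a point with no denser neighbour, its largest distance.
    """
    n = len(points)
    d = [[dist(p, q) for q in points] for p in points]
    rho = [sum(1 for j in range(n) if j != i and d[i][j] < cutoff)
           for i in range(n)]
    delta = []
    for i in range(n):
        higher = [d[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(d[i]))
    return rho, delta


def pick_centers_and_outliers(points, cutoff, n_centers):
    """Centers: largest rho * delta. Outliers: points with no neighbours."""
    rho, delta = density_and_distance(points, cutoff)
    gamma = [r * dl for r, dl in zip(rho, delta)]
    order = sorted(range(len(points)), key=lambda i: gamma[i], reverse=True)
    centers = order[:n_centers]
    outliers = [i for i in range(len(points)) if rho[i] == 0]
    return centers, outliers


# Hypothetical toy data: two compact groups and one isolated point.
TOY_POINTS = [
    (0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
    (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5),
    (30, 30),
]
```

    In the setting of the record, each point would be a superpixel described by its location and gray value rather than a raw pixel, which keeps the quadratic distance computation small.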

  16. Atmospheric circulation classification comparison based on wildfires in Portugal

    Science.gov (United States)

    Pereira, M. G.; Trigo, R. M.

    2009-04-01

    Atmospheric circulation classifications are not a simple description of atmospheric states but a tool to understand and interpret atmospheric processes and to model the relation between atmospheric circulation and surface climate and other related variables (Radan Huth et al., 2008). Classifications were initially developed for weather forecasting purposes; however, with the progress in computer processing capability, new and more robust objective methods were developed and applied to large datasets, turning atmospheric circulation classification methods into one of the most important fields in synoptic and statistical climatology. Classification studies have been used extensively in climate change studies (e.g. reconstructed past climates, recent observed changes and future climates), in bioclimatological research (e.g. relating human mortality to climatic factors) and in a wide variety of synoptic climatological applications (e.g. comparison between datasets, air pollution, snow avalanches, wine quality, fish captures and forest fires). Likewise, atmospheric circulation classifications are important for the study of the role of weather in wildfire occurrence in Portugal because the daily synoptic variability is the most important driver of local weather conditions (Pereira et al., 2005). In particular, the objective classification scheme developed by Trigo and DaCamara (2000) to classify the atmospheric circulation affecting Portugal has proved to be quite useful in discriminating the occurrence and development of wildfires as well as the distribution over Portugal of surface climatic variables with impact on wildfire activity such as maximum and minimum temperature and precipitation. This work aims to present: (i) an overview of the existing circulation classifications for the Iberian Peninsula, and (ii) the results of a comparison study between these atmospheric circulation classifications based on their relation with wildfires and relevant meteorological

  17. Finding Combination of Features from Promoter Regions for Ovarian Cancer-related Gene Group Classification

    KAUST Repository

    Olayan, Rawan S.

    2012-12-01

    In classification problems, it is always important to use a suitable combination of features for the classifiers. Generating the right combination of features usually results in good classifiers. When the problem is not well understood, data items are usually described by many features in the hope that some of these may be the relevant or most relevant ones. In this study, we focus on one such problem related to genes implicated in ovarian cancer (OC). We try to recognize two important OC-related gene groups: oncogenes, which support the development and progression of OC, and oncosuppressors, which oppose such tendencies. For this, we use the properties of the promoters of these genes. We identified potential “regulatory features” that characterize the promoters of OC-related oncogenes and oncosuppressors. In our study, we used 211 oncogenes and 39 oncosuppressors. For these, we identified 538 characteristic sequence motifs from their promoters. Promoters are annotated by these motifs, and the derived feature vectors are used to develop classification models. We compared a number of classification models in their ability to distinguish oncogenes from oncosuppressors. Based on 10-fold cross-validation, the resultant model was able to separate the two classes with sensitivity of 96% and specificity of 100% with the complete set of features. Moreover, we developed another recognition model where we attempted to distinguish oncogenes and oncosuppressors as one group from other OC-related genes. That model achieved accuracy of 82%. We believe that the results of this study will help in discovering other OC-related oncogenes and oncosuppressors not identified as yet.

  18. Hybrid SPR algorithm to select predictive genes for effectual cancer classification

    OpenAIRE

    2012-01-01

    Designing an automated system for classifying DNA microarray data is an extremely challenging problem because of its high dimension and low amount of sample data. In this paper, a hybrid statistical pattern recognition algorithm is proposed to reduce the dimensionality and select the predictive genes for the classification of cancer. Colon cancer gene expression profiles having 62 samples of 2000 genes were used for the experiment. A gene subset of 6 highly informative genes was selecte...

  19. An Efficient Semantic Model For Concept Based Clustering And Classification

    Directory of Open Access Journals (Sweden)

    SaiSindhu Bandaru

    2012-03-01

    Full Text Available In text mining, basic measures such as the term frequency of a term (word or phrase) are usually computed to determine the importance of the term in the document. With purely statistical analysis, however, the original semantics of the term may not be captured. To overcome this problem, a new framework has been introduced which relies on a concept based model and a synonym based approach. The proposed model can efficiently find significant matching and related concepts between documents according to concept based and synonym based approaches. Large sets of experiments using the proposed model on different data sets in clustering and classification are conducted. Experimental results demonstrate a substantial enhancement of the clustering quality using sentence based, document based, corpus based and combined approach concept analysis. A new similarity measure has been proposed to find the similarity between a document and the existing clusters, which can be used to classify the document into the existing clusters.

  20. Object Based and Pixel Based Classification Using Rapideye Satellite Imager of ETI-OSA, Lagos, Nigeria

    OpenAIRE

    Esther Oluwafunmilayo Makinde; Ayobami Taofeek Salami; James Bolarinwa Olaleye; Oluwapelumi Comfort Okewusi

    2016-01-01

    Several studies have been carried out to find an appropriate method to classify the remote sensing data. Traditional classification approaches are all pixel-based, and do not utilize the spatial information within an object which is an important source of information to image classification. Thus, this study compared the pixel based and object based classification algorithms using RapidEye satellite image of Eti-Osa LGA, Lagos. In the object-oriented approach, the image was segmented to homog...

  1. Appraisal of progenitor markers in the context of molecular classification of breast cancers.

    Science.gov (United States)

    Haviv, Izhak

    2011-01-25

    Clinical management of breast cancer relies on case stratification, which increasingly employs molecular markers. The motivation behind delineating breast epithelial differentiation is to better target cancer cases through innate sensitivities bequeathed to the cancer from its normal progenitor state. A combination of histopathological and molecular classification of breast cancer cases suggests a role for progenitors in particular breast cancer cases. Although a remarkable fraction of the real tissue repertoire is maintained within a population of independent cell line cultures, some steps that are closer to the terminal differentiation state and that form a majority of primary human breast tissues are missing in the cell line cultures. This raises concerns about current breast cancer models.

  2. Early Detection and Classification of Melanoma Skin Cancer

    Directory of Open Access Journals (Sweden)

    Abbas Hanon. Alasadi

    2015-10-01

    Full Text Available Melanoma is a form of cancer that begins in melanocytes (cells that make the pigment melanin). It can affect the skin only, or it may spread to the organs and bones. It is less common, but more serious and aggressive, than other types of skin cancer. Melanoma can be benign or malignant. Malignant melanoma is the dangerous condition, while benign melanoma is not. In order to reduce the death rate due to malignant melanoma skin cancer, it is necessary to diagnose it at an early stage. In this paper, a detection system has been designed for diagnosing melanoma in its early stages by using digital image processing techniques. The system consists of two phases: the first phase detects whether the pigmented skin lesion is malignant or benign; the second phase recognizes malignant melanoma skin cancer types. Both phases have several stages. The experimental results are acceptable.

  3. Vessel-guided airway segmentation based on voxel classification

    DEFF Research Database (Denmark)

    Lo, Pechin Chien Pau; Sporring, Jon; Ashraf, Haseem;

    2008-01-01

    This paper presents a method for improving airway tree segmentation using vessel orientation information. We use the fact that an airway branch is always accompanied by an artery, with both structures having similar orientations. This work is based on a voxel classification airway segmentation...

  4. Hierarchical Real-time Network Traffic Classification Based on ECOC

    Directory of Open Access Journals (Sweden)

    Yaou Zhao

    2013-09-01

    Full Text Available Classification of network traffic is basic and essential for much network research and management. With the rapid development of peer-to-peer (P2P) applications that use dynamic port disguising techniques and encryption to avoid detection, port-based and simple payload-based network traffic classification methods have become less effective. An alternative method based on statistics and machine learning has attracted researchers' attention in recent years. However, most of the proposed algorithms are off-line and usually use a single classifier. In this paper a new hierarchical real-time model is proposed, which comprises a three-tuple (source IP, destination IP and destination port) look-up table (TT-LUT) part and a layered milestone part. The TT-LUT is used to quickly classify short flows, which need not pass through the layered milestone part, and the milestones in the layered milestone part classify the other flows in real time with real-time feature selection and statistics. Every milestone is an ECOC (Error-Correcting Output Codes) based model, which is used to improve classification performance. Experiments showed that the proposed model can improve the real-time efficiency to 80% and the multi-class classification accuracy to 91.4% on datasets captured over a week from the backbone router of our campus.
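
    The error-correcting output code idea used in each milestone can be sketched as below. This is a generic illustration of ECOC decoding rather than the authors' model: the class names, the 7-bit codebook, and the assumption that seven binary classifiers each emit one bit are all hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def ecoc_decode(bits, codebook):
    """Return the class whose codeword is closest to the predicted bits."""
    return min(codebook, key=lambda label: hamming(bits, codebook[label]))


# Hypothetical codebook: one 7-bit codeword per traffic class. Each column
# defines one binary problem. Every pair of codewords differs in 4 positions,
# so a single wrong binary decision can still be corrected.
CODEBOOK = {
    "web":  (0, 0, 0, 0, 0, 0, 0),
    "p2p":  (1, 1, 1, 1, 0, 0, 0),
    "mail": (1, 1, 0, 0, 1, 1, 0),
    "dns":  (1, 0, 1, 0, 1, 0, 1),
}

# Outputs of the seven (hypothetical) binary classifiers for one flow:
# the "mail" codeword with its fifth bit flipped by a classifier error.
noisy_bits = (1, 1, 0, 0, 0, 1, 0)
predicted = ecoc_decode(noisy_bits, CODEBOOK)
```

    The redundancy in the codewords is what lets a multi-class decision survive an error in one of the underlying binary classifiers.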

  5. Classification and Target Group Selection Based Upon Frequent Patterns

    NARCIS (Netherlands)

    W.H.L.M. Pijls (Wim); R. Potharst (Rob)

    2000-01-01

    In this technical report, two new algorithms based upon frequent patterns are proposed. One algorithm is a classification method. The other one is an algorithm for target group selection. In both algorithms, first of all, the collection of frequent patterns in the training set is constr

  6. Pulse frequency classification based on BP neural network

    Institute of Scientific and Technical Information of China (English)

    WANG Rui; WANG Xu; YANG Dan; FU Rong

    2006-01-01

    In Traditional Chinese Medicine (TCM), analysis of the pulse frequency is an important parameter in clinical disease diagnosis. This article identifies pulse types by pulse frequency, following the eight major essentials of the pulse, using back-propagation neural networks (BPNN). The pulse frequency classes include slow pulse, moderate pulse, rapid pulse, etc. Through research on the feature parameters of the pulse frequency, a system for identifying pulse frequency features is established. Feature parameters such as period and frequency are extracted from the pulse signal acquired by the detecting system and compared with the standard feature values of each pulse type. The results show that the identification rate reaches 92.5% or above.

  7. Optimizing Mining Association Rules for Artificial Immune System based Classification

    Directory of Open Access Journals (Sweden)

    SAMEER DIXIT

    2011-08-01

    Full Text Available The primary function of a biological immune system is to protect the body from foreign molecules known as antigens. It has great pattern recognition capability that may be used to distinguish between foreign cells entering the body (non-self or antigen) and the body's own cells (self). Immune systems have many characteristics such as uniqueness, autonomy, recognition of foreigners, distributed detection, and noise tolerance. Inspired by biological immune systems, Artificial Immune Systems have emerged during the last decade, and many researchers have used them to design and build immune-based models for a variety of application domains. Artificial immune systems can be defined as a computational paradigm that is inspired by theoretical immunology and by observed immune functions, principles and mechanisms. Association rule mining is one of the most important and well researched techniques of data mining. The goal of association rules is to extract interesting correlations, frequent patterns, associations or causal structures among sets of items in transaction databases or other data repositories. Association rules are widely used in various areas such as inventory control, telecommunication networks, intelligent decision making, market analysis and risk management. Apriori is the most widely used algorithm for mining association rules. Other popular association rule mining algorithms are frequent pattern (FP) growth, Eclat, dynamic itemset counting (DIC), etc. Associative classification uses association rule mining in the rule discovery process to predict the class labels of the data. This technique has shown great promise over many other classification techniques. Associative classification also integrates the process of rule discovery and classification to build the classifier for the purpose of prediction. The main problem with the associative classification approach is the discovery of high quality association rules in a very large space of

  8. Fault Diagnosis for Fuel Cell Based on Naive Bayesian Classification

    Directory of Open Access Journals (Sweden)

    Liping Fan

    2013-07-01

    Full Text Available Many kinds of uncertain factors may exist in the process of fault diagnosis and affect diagnostic results. The Bayesian network is one of the most effective theoretical models for uncertain knowledge expression and reasoning. In this paper, naive Bayesian classification is used for fault diagnosis of a proton exchange membrane fuel cell (PEMFC) system. Based on a model of the PEMFC, fault data are obtained through simulation experiments, the naive Bayesian classifier is trained, and some testing samples are selected to validate the method. Simulation results demonstrate that the method is feasible.
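
    A naive Bayesian classifier of the kind named in this record can be sketched in a few lines. This is a minimal sketch for categorical features with Laplace smoothing; the feature meanings (stack voltage level, humidity level), the fault labels, and the training rows are hypothetical and are not data from the paper.

```python
from collections import Counter, defaultdict
from math import log


class NaiveBayes:
    """Categorical naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, samples, labels):
        self.class_counts = Counter(labels)
        # (class label, feature index) -> counts of observed feature values
        self.value_counts = defaultdict(Counter)
        # distinct values seen for each feature, used by the smoothing term
        self.values = [set() for _ in samples[0]]
        for sample, label in zip(samples, labels):
            for j, value in enumerate(sample):
                self.value_counts[(label, j)][value] += 1
                self.values[j].add(value)
        return self

    def predict(self, sample):
        total = sum(self.class_counts.values())
        best_label, best_score = None, float("-inf")
        for label, count in self.class_counts.items():
            # log prior plus the sum of log conditional likelihoods
            score = log(count / total)
            for j, value in enumerate(sample):
                seen = self.value_counts[(label, j)][value]
                score += log((seen + 1) / (count + len(self.values[j])))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training rows: (stack voltage level, humidity level).
TRAIN_X = [
    ("low", "high"), ("low", "high"), ("low", "normal"),
    ("normal", "normal"), ("normal", "normal"), ("normal", "high"),
]
TRAIN_Y = ["flooding", "flooding", "flooding", "normal", "normal", "normal"]

model = NaiveBayes().fit(TRAIN_X, TRAIN_Y)
diagnosis = model.predict(("low", "high"))
```

    Working in log space avoids numerical underflow when many features are multiplied together, and the smoothing term keeps a feature value that was never seen with a given fault from forcing that fault's probability to zero.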

  9. Multimodal sparse representation-based classification for lung needle biopsy images.

    Science.gov (United States)

    Shi, Yinghuan; Gao, Yang; Yang, Yubin; Zhang, Ying; Wang, Dong

    2013-10-01

    Lung needle biopsy image classification is a critical task for computer-aided lung cancer diagnosis. In this study, a novel method, multimodal sparse representation-based classification (mSRC), is proposed for classifying lung needle biopsy images. In the data acquisition procedure of our method, the cell nuclei are automatically segmented from the images captured by needle biopsy specimens. Then, features of three modalities (shape, color, and texture) are extracted from the segmented cell nuclei. After this procedure, mSRC goes through a training phase and a testing phase. In the training phase, three discriminative subdictionaries corresponding to the shape, color, and texture information are jointly learned by a genetic algorithm guided multimodal dictionary learning approach. The dictionary learning aims to select the topmost discriminative samples and encourage large disagreement among different subdictionaries. In the testing phase, when a new image comes, a hierarchical fusion strategy is applied, which first predicts the labels of the cell nuclei by fusing three modalities, then predicts the label of the image by majority voting. Our method is evaluated on a real image set of 4372 cell nuclei regions segmented from 271 images. These cell nuclei regions can be divided into five classes: four cancerous classes (corresponding to four types of lung cancer) plus one normal class (no cancer). The results demonstrate that the multimodal information is important for lung needle biopsy image classification. Moreover, compared to several state-of-the-art methods (LapRLS, MCMI-AB, mcSVM, ESRC, KSRC), the proposed mSRC can achieve significant improvement (mean accuracy of 88.1%, precision of 85.2%, recall of 92.8%, etc.), especially for classifying different cancerous types.

  10. Adaptive stellar spectral subclass classification based on Bayesian SVMs

    Science.gov (United States)

    Du, Changde; Luo, Ali; Yang, Haifeng

    2017-02-01

    Stellar spectral classification is one of the most fundamental tasks in survey astronomy. Many automated classification methods have been applied to spectral data. However, their main limitation is that the model parameters must be tuned repeatedly to deal with different data sets. In this paper, we utilize the Bayesian support vector machines (BSVM) to classify the spectral subclass data. Based on Gibbs sampling, BSVM can infer all model parameters adaptively according to different data sets, which allows us to circumvent the time-consuming cross validation for penalty parameter. We explored different normalization methods for stellar spectral data, and the best one has been suggested in this study. Finally, experimental results on several stellar spectral subclass classification problems show that the BSVM model not only possesses good adaptability but also provides better prediction performance than traditional methods.

  11. Hyperspectral image classification based on volumetric texture and dimensionality reduction

    Science.gov (United States)

    Su, Hongjun; Sheng, Yehua; Du, Peijun; Chen, Chen; Liu, Kui

    2015-06-01

    A novel approach using volumetric texture and reduced-spectral features is presented for hyperspectral image classification. Using this approach, the volumetric textural features were extracted by volumetric gray-level co-occurrence matrices (VGLCM). The spectral features were extracted by minimum estimated abundance covariance (MEAC) and linear prediction (LP)-based band selection, and by band clustering algorithms based on semi-supervised k-means (SKM) clustering and on SKM with deletion of the worst cluster (SKMd). Moreover, four feature combination schemes were designed for hyperspectral image classification by using spectral and textural features. The proposed method using VGLCM is shown to outperform the gray-level co-occurrence matrices (GLCM) method, and the experimental results indicate that the combination of spectral information with volumetric textural features leads to improved classification performance in hyperspectral imagery.

  12. Clinical classification of cancer cachexia: phenotypic correlates in human skeletal muscle.

    Directory of Open Access Journals (Sweden)

    Neil Johns

    Full Text Available BACKGROUND: Cachexia affects the majority of patients with advanced cancer and is associated with a reduction in treatment tolerance, response to therapy, and duration of survival. One impediment towards the effective treatment of cachexia is the lack of a validated classification system. METHODS: 41 patients with resectable upper gastrointestinal (GI) or pancreatic cancer underwent characterisation for cachexia based on weight-loss (WL) and/or low muscularity (LM). Four diagnostic criteria were used: >5%WL, >10%WL, LM, and LM+>2%WL. All patients underwent biopsy of the rectus muscle. Analysis included immunohistochemistry for fibre size and type, protein and nucleic acid concentration, and Western blots for markers of autophagy, SMAD signalling, and inflammation. FINDINGS: Compared with non-cachectic cancer patients, in patients with LM or LM+>2%WL, mean muscle fibre diameter was reduced by about 25% (p = 0.02 and p = 0.001 respectively). No significant difference in fibre diameter was observed if patients had WL alone. Regardless of classification, there was no difference in fibre number or proportion of fibre type across all myosin heavy chain isoforms. Mean muscle protein content was reduced and the ratio of RNA/DNA decreased in patients with either >5%WL or LM+>2%WL. Compared with non-cachectic patients, SMAD3 protein levels were increased in patients with >5%WL (p = 0.022), and with >10%WL, beclin (p = 0.05) and ATG5 (p = 0.01) protein levels were increased. There were no differences in phospho-NFkB or phospho-STAT3 levels across any of the groups. CONCLUSION: Muscle fibre size, biochemical composition and pathway phenotype can vary according to whether the diagnostic criteria for cachexia are based on weight loss alone, a measure of low muscularity alone or a combination of the two. For intervention trials where the primary end-point is a change in muscle mass or function, use of combined diagnostic criteria may allow identification of a more

  13. Integrative disease classification based on cross-platform microarray data

    Directory of Open Access Journals (Sweden)

    Huang Haiyan

    2009-01-01

    Full Text Available Background: Disease classification has been an important application of microarray technology. However, most microarray-based classifiers can only handle data generated within the same study, since microarray data generated by different laboratories or with different platforms cannot be compared directly due to systematic variations. This issue has severely limited the practical use of microarray-based disease classification. Results: In this study, we tested the feasibility of disease classification by integrating the large amount of heterogeneous microarray datasets from the public microarray repositories. Cross-platform data compatibility is created by deriving expression log-rank ratios within datasets. One may then compare vectors of log-rank ratios across datasets. In addition, we systematically map textual annotations of datasets to concepts in the Unified Medical Language System (UMLS), permitting quantitative analysis of the phenotype "distance" between datasets and automated construction of disease classes. We design a new classification approach named ManiSVM, which integrates Manifold data transformation with SVM learning to exploit the data properties. Using leave-one-dataset-out cross validation, ManiSVM achieved an overall accuracy of 70.7% (68.6% precision and 76.9% recall), with many disease classes achieving accuracy higher than 80%. Conclusion: Our results not only demonstrated the feasibility of the integrated disease classification approach, but also showed that the classification accuracy increases with the number of homogenous training datasets. Thus, the power of the integrative approach will increase with the continuous accumulation of microarray data in public repositories. Our study shows that automated disease diagnosis can be an important and promising application of the enormous amount of costly-to-generate, yet freely available, public microarray data.

  14. Hardware Accelerators Targeting a Novel Group Based Packet Classification Algorithm

    Directory of Open Access Journals (Sweden)

    O. Ahmed

    2013-01-01

    Full Text Available Packet classification is a ubiquitous and key building block for many critical network devices. However, it remains one of the main bottlenecks faced when designing fast network devices. In this paper, we propose a novel Group Based Search packet classification Algorithm (GBSA) that is scalable, fast, and efficient. GBSA consumes an average of 0.4 megabytes of memory for a 10 k rule set. The worst-case classification time per packet is 2 microseconds, and the preprocessing speed is 3 M rules/second based on a Xeon processor operating at 3.4 GHz. When compared with other state-of-the-art classification techniques, the results showed that GBSA outperforms the competition with respect to speed, memory usage, and processing time. Moreover, GBSA is amenable to implementation in hardware. Three different hardware implementations are also presented in this paper, including an Application Specific Instruction Set Processor (ASIP) implementation and two pure Register-Transfer Level (RTL) implementations based on Impulse-C and Handel-C flows, respectively. Speedups achieved with these hardware accelerators ranged from 9x to 18x compared with a pure software implementation running on a Xeon processor.

  15. Fast rule-based bioactivity prediction using associative classification mining

    Directory of Open Access Journals (Sweden)

    Yu Pulan

    2012-11-01

    Full Text Available Relating chemical features to bioactivities is critical in molecular design and is used extensively in the lead discovery and optimization process. A variety of techniques from statistics, data mining and machine learning have been applied to this process. In this study, we utilize a collection of methods, called associative classification mining (ACM), which are popular in the data mining community, but so far have not been applied widely in cheminformatics. More specifically, classification based on predictive association rules (CPAR), classification based on multiple association rules (CMAR) and classification based on association rules (CBA) are employed on three datasets using various descriptor sets. Experimental evaluations on anti-tuberculosis (antiTB), mutagenicity and hERG (the human Ether-a-go-go-Related Gene) blocker datasets show that these three methods are computationally scalable and appropriate for high speed mining. Additionally, they provide comparable accuracy and efficiency to the commonly used Bayesian and support vector machine (SVM) methods, and produce highly interpretable models.

  16. Sparse Representation Based Binary Hypothesis Model for Hyperspectral Image Classification

    Directory of Open Access Journals (Sweden)

    Yidong Tang

    2016-01-01

    Full Text Available The sparse representation based classifier (SRC) and its kernel version (KSRC) have been employed for hyperspectral image (HSI) classification. However, the state-of-the-art SRC often aims at extended surface objects with linear mixture in smooth scenes and assumes that the number of classes is given. Considering small targets with complex backgrounds, a sparse representation based binary hypothesis (SRBBH) model is established in this paper. In this model, a query pixel is represented in two ways: by the background dictionary and by the union dictionary. The background dictionary is composed of samples selected from the local dual concentric window centered at the query pixel. Thus, for each pixel the classification issue becomes an adaptive multiclass classification problem, where only the number of desired classes is required. Furthermore, the kernel method is employed to improve the interclass separability. In kernel space, the coding vector is obtained by using the kernel-based orthogonal matching pursuit (KOMP) algorithm. Then the query pixel can be labeled by the characteristics of the coding vectors. Instead of directly using the reconstruction residuals, the different impacts the background dictionary and union dictionary have on reconstruction are used for validation and classification. This enhances the discrimination and hence improves the performance.

  17. Breast Cancer Survival Defined by the ER/PR/HER2 Subtypes and a Surrogate Classification according to Tumor Grade and Immunohistochemical Biomarkers

    Directory of Open Access Journals (Sweden)

    Carol A. Parise

    2014-01-01

    Full Text Available Introduction. ER, PR, and HER2 are routinely available in breast cancer specimens. The purpose of this study is to contrast breast cancer-specific survival for the eight ER/PR/HER2 subtypes with survival of an immunohistochemical surrogate for the molecular subtype based on the ER/PR/HER2 subtypes and tumor grade. Methods. We identified 123,780 cases of stages 1–3 primary female invasive breast cancer from California Cancer Registry. The surrogate classification was derived using ER/PR/HER2 and tumor grade. Kaplan-Meier survival analysis and Cox proportional hazards modeling were used to assess differences in survival and risk of mortality for the ER/PR/HER2 subtypes and surrogate classification within each stage. Results. The luminal B/HER2− surrogate classification had a higher risk of mortality than the luminal B/HER2+ for all stages of disease. There was no difference in risk of mortality between the ER+/PR+/HER2− and ER+/PR+/HER2+ in stage 3. With one exception in stage 3, the ER-negative subtypes all had an increased risk of mortality when compared with the ER-positive subtypes. Conclusions. Assessment of survival using ER/PR/HER2 illustrates the heterogeneity of HER2+ subtypes. The surrogate classification provides clear separation in survival and adjusted mortality but underestimates the wide variability within the subtypes that make up the classification.
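
    The Kaplan-Meier analysis named in this record rests on the product-limit estimator, which can be sketched as follows. This is a generic textbook implementation, not the study's code; the function name and the five follow-up records are hypothetical and carry no registry data.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times:  follow-up time for each patient.
    events: 1 if the death was observed at that time, 0 if censored.
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share this follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Multiply by the fraction of at-risk patients who survive t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        at_risk -= removed
    return curve


# Hypothetical follow-up data (time in years; 0 marks a censored patient).
TOY_TIMES = [1, 2, 2, 3, 4]
TOY_EVENTS = [1, 1, 0, 1, 0]
curve = kaplan_meier(TOY_TIMES, TOY_EVENTS)
```

    Censored patients contribute to the risk set up to their last follow-up time but never trigger a drop in the curve, which is why the estimator can use registry records with incomplete follow-up.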

  18. Choice-Based Conjoint Analysis: Classification vs. Discrete Choice Models

    Science.gov (United States)

    Giesen, Joachim; Mueller, Klaus; Taneva, Bilyana; Zolliker, Peter

    Conjoint analysis is a family of techniques that originated in psychology and later became popular in market research. The main objective of conjoint analysis is to measure an individual's or a population's preferences on a class of options that can be described by parameters and their levels. We consider preference data obtained in choice-based conjoint analysis studies, where one observes test persons' choices on small subsets of the options. There are many ways to analyze choice-based conjoint analysis data. Here we discuss the intuition behind a classification based approach, and compare this approach to one based on statistical assumptions (discrete choice models) and to a regression approach. Our comparison on real and synthetic data indicates that the classification approach outperforms the discrete choice models.

  19. Cell nuclei attributed relational graphs for efficient representation and classification of gastric cancer in digital histopathology

    Science.gov (United States)

    Sharma, Harshita; Zerbe, Norman; Heim, Daniel; Wienert, Stephan; Lohmann, Sebastian; Hellwich, Olaf; Hufnagl, Peter

    2016-03-01

    This paper describes a novel graph-based method for efficient representation and subsequent classification in histological whole slide images of gastric cancer. Her2/neu immunohistochemically stained and haematoxylin and eosin stained histological sections of gastric carcinoma are digitized. Immunohistochemical staining is used in practice by pathologists to determine the extent of malignancy; however, it is laborious to visually discriminate the corresponding malignancy levels in the more commonly used haematoxylin and eosin stain, and this study attempts to solve this problem using a computer-based method. Cell nuclei are first isolated at high magnification using an automatic cell nuclei segmentation strategy, followed by construction of cell nuclei attributed relational graphs of the tissue regions. These graphs represent tissue architecture comprehensively, as they contain information about cell nuclei morphology as vertex attributes, along with knowledge of neighborhood in the form of edge linking and edge attributes. Global graph characteristics are derived and ensemble learning is used to discriminate between three types of malignancy levels, namely, non-tumor, Her2/neu positive tumor and Her2/neu negative tumor. Performance is compared with state of the art methods including four texture feature groups (Haralick, Gabor, Local Binary Patterns and Varma Zisserman features), color and intensity features, and Voronoi diagram and Delaunay triangulation. Texture, color and intensity information is also combined with graph-based knowledge, followed by correlation analysis. Quantitative assessment is performed using two cross validation strategies. The experimental results indicate that the proposed method provides a promising way for computer-based analysis of histopathological images of gastric cancer.

  20. Trace elements based classification on clinkers. Application to Spanish clinkers

    OpenAIRE

    Tamás, F. D.; Abonyi, J.; Puertas, F.

    2001-01-01

    The qualitative identification to determine the origin (i.e. manufacturing factory) of Spanish clinkers is described. The classification of clinkers produced in different factories can be based on their trace element content. Approximately fifteen clinker sorts are analysed, collected from 11 Spanish cement factories to determine their Mg, Sr, Ba, Mn, Ti, Zr, Zn and V content. An expert system formulated by a binary decision tree is designed based on the collected data. The performance of the...

  1. Classification of Mental Disorders Based on Temperament

    Directory of Open Access Journals (Sweden)

    Nadi Sakhvidi

    2015-08-01

    Full Text Available Context: Different paradoxical theories are available regarding psychiatric disorders. The current study aimed to establish a more comprehensive overall approach. Evidence Acquisition: This basic study examined ancient medical books. “The Canon” by Avicenna and “Comprehensive Textbook of Psychiatry” by Kaplan and Sadock were the most important and frequently consulted books in this study. Results: Four groups of temperaments were identified: high active, high flexible; high active, low flexible; low active, low flexible; and low active, high flexible. When temperament deteriorates, personality, non-psychotic, and psychotic psychiatric disorders can develop. Conclusions: Temperaments can provide a basis to classify psychiatric disorders. Psychiatric disorders can be placed in a spectrum based on temperaments.

  2. Update on epidemiology classification, and management of thyroid cancer

    Directory of Open Access Journals (Sweden)

    Heitham Gheriani

    2006-06-01

    Full Text Available Thyroid cancer represents approximately 0.5–1% of all human malignancy [1]. In the UK the incidence of thyroid cancer is 2–3 per 100,000 population [2]. In geographical areas of low iodine intake and in areas exposed to nuclear disasters the incidence of thyroid cancer is higher. Benign thyroid conditions are much more common. In the UK approximately 8% of the population have nodular thyroid disease [2]. Nodular thyroid disease increases with age and is also more common in females and in geographical areas of low iodine intake. Primary thyroid malignancy can be broadly divided into 2 groups. The first group, which generally has a much better prognosis, comprises the well-differentiated thyroid carcinomas: papillary carcinoma, follicular carcinoma and Hürthle cell tumours. The second group includes the poorly differentiated thyroid carcinomas, such as medullary thyroid carcinoma and anaplastic thyroid carcinoma. Other rare tumours such as sarcomas, lymphomas and the extremely rare primary squamous cell carcinoma of the thyroid should be included in the second group. Secondary or metastatic thyroid cancer can arise from breast, lung, colon and kidney malignancies.

  3. Breast cancer tumor classification using LASSO method selection approach

    Energy Technology Data Exchange (ETDEWEB)

    Celaya P, J. M.; Ortiz M, J. A.; Martinez B, M. R.; Solis S, L. O.; Castaneda M, R.; Garza V, I.; Martinez F, M.; Ortiz R, J. M., E-mail: morvymm@yahoo.com.mx [Universidad Autonoma de Zacatecas, Av. Ramon Lopez Velarde 801, Col. Centro, 98000 Zacatecas, Zac. (Mexico)

    2016-10-15

    Breast cancer is one of the leading causes of death worldwide among women. Early tumor detection is key in reducing breast cancer deaths, and screening mammography is the most widely available method for early detection. Mammography is the most common and effective breast cancer screening test. However, the rate of positive findings is very low, making the radiologic interpretation monotonous and prone to errors. In an attempt to alleviate radiological workload, this work presents a computer-aided diagnosis (CADx) method aimed at automatically classifying tumor lesions as malign or benign, as a means to a second opinion. The CADx method extracts image features and classifies the screening mammogram abnormality into one of two categories: subject at risk of having a malignant tumor (malign), and healthy subject (benign). In this study, 143 abnormal segmentations (57 malign and 86 benign) from the Breast Cancer Digital Repository (BCDR) public database were used to train and evaluate the CADx system. Percentile-rank (p-rank) was used to standardize the data. Using the LASSO feature selection methodology, the model achieved a leave-one-out cross-validation area under the receiver operating characteristic curve (AUC) of 0.950. The proposed method has the potential to rank abnormal lesions with high probability of malignant findings, aiding in the detection of potential malign cases as a second opinion to the radiologist. (Author)
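
    The percentile-rank standardization step can be sketched as follows. The abstract does not give the exact p-rank definition used, so this example assumes one common convention (the fraction of observations less than or equal to each value); `percentile_rank` and the toy feature values are hypothetical.

```python
from bisect import bisect_right


def percentile_rank(values):
    """Map each value to the fraction of values <= it, giving scores in (0, 1].

    Assumption: ties receive the same (upper) rank. Other p-rank
    conventions exist and may differ from the one used in the study.
    """
    n = len(values)
    ordered = sorted(values)
    return [bisect_right(ordered, v) / n for v in values]


# Toy feature column (hypothetical values): the scale is removed,
# only the ordering of the observations is kept.
toy_ranks = percentile_rank([10, 30, 20])
```

    Because only the ordering survives, features measured on very different scales become comparable before a penalized model such as LASSO is fitted.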

  4. Regularized logistic regression with adjusted adaptive elastic net for gene selection in high dimensional cancer classification.

    Science.gov (United States)

    Algamal, Zakariya Yahya; Lee, Muhammad Hisyam

    2015-12-01

    Cancer classification and gene selection in high-dimensional data have been popular research topics in genetics and molecular biology. Recently, adaptive regularized logistic regression using the elastic net regularization, which is called the adaptive elastic net, has been successfully applied in high-dimensional cancer classification to estimate the gene coefficients and perform gene selection simultaneously. The adaptive elastic net originally used elastic net estimates as the initial weight; however, using this weight may not be preferable for two reasons. First, the elastic net estimator is biased in selecting genes. Second, it does not perform well when the pairwise correlations between variables are not high. Adjusted adaptive regularized logistic regression (AAElastic) is proposed to address these issues and encourage grouping effects simultaneously. The real data results indicate that AAElastic is significantly consistent in selecting genes compared to the other three competitor regularization methods. Additionally, the classification performance of AAElastic is comparable to the adaptive elastic net and better than other regularization methods. Thus, we can conclude that AAElastic is a reliable adaptive regularized logistic regression method in the field of high-dimensional cancer classification.

  5. A wrapper-based approach to image segmentation and classification.

    Science.gov (United States)

    Farmer, Michael E; Jain, Anil K

    2005-12-01

    The traditional processing flow of segmentation followed by classification in computer vision assumes that the segmentation is able to successfully extract the object of interest from the background image. It is extremely difficult to obtain a reliable segmentation without any prior knowledge about the object that is being extracted from the scene. This is further complicated by the lack of any clearly defined metrics for evaluating the quality of segmentation or for comparing segmentation algorithms. We propose a method of segmentation that addresses both of these issues, by using the object classification subsystem as an integral part of the segmentation. This will provide contextual information regarding the objects to be segmented, as well as allow us to use the probability of correct classification as a metric to determine the quality of the segmentation. We view traditional segmentation as a filter operating on the image that is independent of the classifier, much like the filter methods for feature selection. We propose a new paradigm for segmentation and classification that follows the wrapper methods of feature selection. Our method wraps the segmentation and classification together, and uses the classification accuracy as the metric to determine the best segmentation. By using shape as the classification feature, we are able to develop a segmentation algorithm that relaxes the requirement that the object of interest to be segmented must be homogeneous in some low-level image parameter, such as texture, color, or grayscale. This represents an improvement over other segmentation methods that have used classification information only to modify the segmenter parameters, since these algorithms still require an underlying homogeneity in some parameter space. 
Rather than considering our method as, yet, another segmentation algorithm, we propose that our wrapper method can be considered as an image segmentation framework, within which existing image segmentation

  6. Similarity-Based Classification in Partially Labeled Networks

    Science.gov (United States)

    Zhang, Qian-Ming; Shang, Ming-Sheng; Lü, Linyuan

    Two main difficulties in the problem of classification in partially labeled networks are the sparsity of the known labeled nodes and the inconsistency of label information. To address these two difficulties, we propose a similarity-based method, where the basic assumption is that two nodes are more likely to be categorized into the same class if they are more similar. In this paper, we introduce ten similarity indices defined based on the network structure. Empirical results on the co-purchase network of political books show that the similarity-based method can, to some extent, overcome these two difficulties and give more accurate classification than the relational neighbors method, especially when the labeled nodes are sparse. Furthermore, we find that when the information of known labeled nodes is sufficient, the indices considering only local information can perform as well as the global indices while having much lower computational complexity.
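
    The basic assumption of this record (similar nodes tend to share a class) can be sketched with a single local index. The abstract does not list its ten similarity indices, so the common-neighbors count is used here as an assumed stand-in; the graph, the labels, and the function names are all hypothetical.

```python
from collections import defaultdict


def common_neighbors(adj, u, v):
    """Local similarity index: number of neighbors shared by u and v."""
    return len(adj[u] & adj[v])


def predict_label(adj, labels, node):
    """Score each class by the summed similarity to its labeled nodes."""
    scores = defaultdict(float)
    for other, label in labels.items():
        if other != node:
            scores[label] += common_neighbors(adj, node, other)
    return max(scores, key=scores.get) if scores else None


# Toy network (hypothetical): two triangles joined by the edge c-d,
# plus an unlabeled node x attached to a and b.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f"), ("x", "a"), ("x", "b")]
adj = defaultdict(set)
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

labels = {"c": "left", "f": "right"}    # only two nodes are labeled
predicted = predict_label(adj, labels, "x")
```

    Node x shares two neighbors with the labeled node c and none with f, so it is assigned to the class of c even though it has no labeled direct neighbor.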

  7. Object-Based Classification and Change Detection of Hokkaido, Japan

    Science.gov (United States)

    Park, J. G.; Harada, I.; Kwak, Y.

    2016-06-01

    Topography and geology are factors that characterize the distribution of natural vegetation. Topographic contour is particularly influential on the living conditions of plants such as soil moisture, sunlight, and windiness. Vegetation associations having similar characteristics are present in locations having similar topographic conditions unless natural disturbances such as landslides and forest fires or artificial disturbances such as deforestation and man-made plantation bring about changes in such conditions. We developed a vegetation map of Japan using an object-based segmentation approach with topographic information (elevation, slope, slope direction) that is closely related to the distribution of vegetation. The results show that object-based classification is more effective for producing a vegetation map than pixel-based classification.

  8. Toward automated classification of consumers' cancer-related questions with a new taxonomy of expected answer types.

    Science.gov (United States)

    McRoy, Susan; Jones, Sean; Kurmally, Adam

    2016-09-01

    This article examines methods for automated question classification applied to cancer-related questions that people have asked on the web. This work is part of a broader effort to provide automated question answering for health education. We created a new corpus of consumer-health questions related to cancer and a new taxonomy for those questions. We then compared the effectiveness of different statistical methods for developing classifiers, including weighted classification and resampling. Basic methods for building classifiers were limited by the high variability in the natural distribution of questions and typical refinement approaches of feature selection and merging categories achieved only small improvements to classifier accuracy. Best performance was achieved using weighted classification and resampling methods, the latter yielding an accuracy of F1 = 0.963. Thus, it would appear that statistical classifiers can be trained on natural data, but only if natural distributions of classes are smoothed. Such classifiers would be useful for automated question answering, for enriching web-based content, or assisting clinical professionals to answer questions.
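
    Two ideas from this record, resampling to smooth the class distribution and reporting performance as F1, can be sketched as below. The abstract does not say which resampling scheme was used, so random oversampling of the smaller classes is an assumption; the function names and the toy questions are hypothetical.

```python
import random
from collections import Counter, defaultdict


def oversample(items, labels, seed=0):
    """Randomly duplicate items of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for item, label in zip(items, labels):
        by_label[label].append(item)
    target = max(len(group) for group in by_label.values())
    out_items, out_labels = [], []
    for label, group in by_label.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for item in group + extra:
            out_items.append(item)
            out_labels.append(label)
    return out_items, out_labels


def f1_score(true, pred, positive):
    """Harmonic mean of precision and recall for one class."""
    tp = sum(t == positive and p == positive for t, p in zip(true, pred))
    fp = sum(t != positive and p == positive for t, p in zip(true, pred))
    fn = sum(t == positive and p != positive for t, p in zip(true, pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy corpus (hypothetical): three "treatment" questions, one "cost" question.
_, balanced_labels = oversample(["q1", "q2", "q3", "q4"],
                                ["treatment", "treatment", "treatment", "cost"])
balanced_counts = Counter(balanced_labels)
toy_f1 = f1_score([1, 1, 0, 0], [1, 0, 0, 0], positive=1)
```

    Oversampling should be applied to the training split only; duplicating items before splitting would leak copies of the same question into the evaluation data.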

  9. Automatic Detection of Cervical Cancer Cells by a Two-Level Cascade Classification System

    Directory of Open Access Journals (Sweden)

    Jie Su

    2016-01-01

    Full Text Available We proposed a method for automatic detection of cervical cancer cells in images captured from thin liquid based cytology slides. We selected 20,000 cells in images derived from 120 different thin liquid based cytology slides, which include 5000 epithelial cells (normal 2500, abnormal 2500), lymphoid cells, neutrophils, and junk cells. We first proposed 28 features, including 20 morphologic features and 8 texture features, based on the characteristics of each cell type. We then used a two-level cascade integration system of two classifiers to classify the cervical cells into normal and abnormal epithelial cells. The results showed that the recognition rates for abnormal cervical epithelial cells were 92.7% and 93.2%, respectively, when C4.5 classifier or LR (LR: logical regression) classifier was used individually; while the recognition rate was significantly higher (95.642%) when our two-level cascade integrated classifier system was used. The false negative rate and false positive rate (both 1.44%) of the proposed automatic two-level cascade classification system are also much lower than those of traditional Pap smear review.

  10. Impact of full field digital mammography on the classification and mammographic characteristics of interval breast cancers

    Energy Technology Data Exchange (ETDEWEB)

    Knox, Mark, E-mail: marktknox@gmail.com; O’Brien, Angela, E-mail: angelaobrien@doctors.org.uk; Szabó, Endre, E-mail: endrebacsi@freemail.hu; Smith, Clare S., E-mail: csmith@mater.ie; Fenlon, Helen M., E-mail: helen.fenlon@cancerscreening.ie; McNicholas, Michelle M., E-mail: michelle.mcnicholas@cancerscreening.ie; Flanagan, Fidelma L., E-mail: fidelma.flanagan@cancerscreening.ie

    2015-06-15

    Highlights: • Digital mammography has changed the presentation of interval breast cancer. • Fewer interval breast cancers are associated with microcalcifications following FFDM. • Interval breast cancer audit remains a key feature of any breast screening program. - Abstract: Objective: Full field digital mammography (FFDM) is increasingly replacing screen film mammography (SFM) in breast screening programs. Interval breast cancers are an issue in all screening programs and the purpose of our study is to assess the impact of FFDM on the classification of interval breast cancers at independent blind review and to compare the mammographic features of interval cancers at FFDM and SFM. Materials and methods: This study included 138 cases of interval breast cancer, 76 following an FFDM screening examination and 62 following screening with SFM. The prior screening mammogram was assessed by each of five consultant breast radiologists who were blinded to the site of subsequent cancer. Subsequent review of the diagnostic mammogram was performed and cases were classified as missed, minimal signs, occult or true interval. Mammographic features of the interval cancer at diagnosis and any abnormality identified on the prior screening mammogram were recorded. Results: The percentages of cancers classified as missed at FFDM and SFM did not differ significantly, 10.5% (8 of 76) at FFDM and 8.1% (5 of 62) at SFM (p = .77). There were significantly fewer interval cancers presenting as microcalcifications (alone or in association with another abnormality) following screening with FFDM, 16% (12 of 76), than following an SFM examination, 32% (20 of 62) (p = .02). Conclusion: Interval breast cancers continue to pose a problem at FFDM. The switch to FFDM has changed the mammographic presentation of interval breast cancer, with fewer interval cancers presenting in association with microcalcifications.

  11. Classification data mining method based on dynamic RBF neural networks

    Science.gov (United States)

    Zhou, Lijuan; Xu, Min; Zhang, Zhang; Duan, Luping

    2009-04-01

    With the wide application of databases and the rapid development of the Internet, the capacity to use information technology to produce and collect data has improved greatly. Mining useful information or knowledge from large databases or data warehouses is an urgent problem, and data mining technology has developed rapidly to meet this need. But DM (data mining) often faces large amounts of data that are noisy, disordered and nonlinear. Fortunately, the ANN (Artificial Neural Network) is suited to these problems because it offers good robustness, adaptability, parallel processing, distributed memory and high error tolerance. This paper gives a detailed discussion of the application of the ANN method in DM, based on an analysis of the various data mining technologies, and lays particular stress on classification data mining based on RBF neural networks. Pattern classification is an important part of RBF neural network applications. In an on-line environment, the training dataset is variable, so a batch learning algorithm (e.g. OLS), which generates plenty of unnecessary retraining, has lower efficiency. This paper derives an incremental learning algorithm (ILA) from the gradient descent algorithm to relieve this bottleneck. ILA can adaptively adjust the parameters of RBF networks, driven by minimizing the error cost, without any redundant retraining. Using the method proposed in this paper, an on-line classification system was constructed to solve the IRIS classification problem. Experimental results show that the algorithm has a fast convergence rate and excellent on-line classification performance.
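
    The idea of updating an RBF network one sample at a time, instead of retraining on the whole batch, can be sketched as below. This is not the paper's ILA: it assumes fixed Gaussian centers and a fixed width, and adapts only the output weights by a gradient step on the squared error. The class name, the learning rate, and the toy data are hypothetical.

```python
import math


class RBFNet:
    """Gaussian RBF network with fixed centers; only output weights are learned."""

    def __init__(self, centers, width):
        self.centers = centers
        self.width = width
        self.weights = [0.0] * len(centers)

    def _phi(self, x):
        # Gaussian activation of every hidden unit for input x.
        return [
            math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c))
                     / (2 * self.width ** 2))
            for c in self.centers
        ]

    def predict(self, x):
        return sum(w * p for w, p in zip(self.weights, self._phi(x)))

    def update(self, x, target, lr=0.5):
        """One incremental step: w <- w + lr * error * phi(x)."""
        phi = self._phi(x)
        error = target - sum(w * p for w, p in zip(self.weights, phi))
        self.weights = [w + lr * error * p for w, p in zip(self.weights, phi)]
        return error


# Toy stream (hypothetical): inputs near 0 belong to class 0, near 1 to class 1.
net = RBFNet(centers=[[0.0], [1.0]], width=0.3)
for _ in range(20):
    net.update([0.0], 0.0)
    net.update([1.0], 1.0)
```

    Each arriving sample costs one forward pass and one weight update, so the model can follow a changing training set without the repeated full retraining that a batch method would need.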

  12. Land Cover and Land Use Classification with TWOPAC: towards Automated Processing for Pixel- and Object-Based Image Classification

    Directory of Open Access Journals (Sweden)

    Stefan Dech

    2012-09-01

    Full Text Available We present a novel and innovative automated processing environment for the derivation of land cover (LC) and land use (LU) information. This processing framework, named TWOPAC (TWinned Object and Pixel based Automated classification Chain), enables the standardized, independent, user-friendly, and comparable derivation of LC and LU information, with minimized manual classification labor. TWOPAC allows classification of multi-spectral and multi-temporal remote sensing imagery from different sensor types. TWOPAC enables not only pixel-based classification, but also allows classification based on object-based characteristics. Classification is based on a Decision Tree approach (DT), for which the well-known C5.0 code has been implemented, which builds decision trees based on the concept of information entropy. TWOPAC enables automatic generation of the decision tree classifier based on a C5.0-retrieved ascii-file, as well as fully automatic validation of the classification output via sample based accuracy assessment. Envisaging the automated generation of standardized land cover products, as well as area-wide classification of large amounts of data in preferably a short processing time, standardized interfaces for process control, Web Processing Services (WPS), as introduced by the Open Geospatial Consortium (OGC), are utilized. TWOPAC’s functionality to process geospatial raster or vector data via web resources (server, network) enables TWOPAC’s usability independent of any commercial client or desktop software and allows for large scale data processing on servers. Furthermore, the components of TWOPAC were built-up using open source code components and are implemented as a plug-in for Quantum GIS software for easy handling of the classification process from the user’s perspective.
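
    The phrase "decision trees based on the concept of information entropy" can be made concrete with a small split-quality calculation. This sketch computes plain information gain for one threshold split; it is not the C5.0 code and does not reproduce its exact splitting criterion. The feature name `ndvi`, the class names, and the sample values are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, labels, feature, threshold):
    """Entropy reduction obtained by splitting on rows[i][feature] <= threshold."""
    left = [y for x, y in zip(rows, labels) if x[feature] <= threshold]
    right = [y for x, y in zip(rows, labels) if x[feature] > threshold]
    n = len(labels)
    remainder = sum(len(part) / n * entropy(part)
                    for part in (left, right) if part)
    return entropy(labels) - remainder


# Toy samples (hypothetical): one spectral index per pixel or object.
rows = [{"ndvi": 0.1}, {"ndvi": 0.2}, {"ndvi": 0.7}, {"ndvi": 0.8}]
classes = ["water", "water", "forest", "forest"]
gain_good = information_gain(rows, classes, "ndvi", 0.5)   # separates the classes
gain_poor = information_gain(rows, classes, "ndvi", 0.15)  # leaves a mixed branch
```

    A tree builder evaluates candidate splits like these for every feature and keeps the one with the best score, then repeats the process on each branch.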

  13. Rule-Based Classification of Chemical Structures by Scaffold.

    Science.gov (United States)

    Schuffenhauer, Ansgar; Varin, Thibault

    2011-08-01

    Databases for small organic chemical molecules usually contain millions of structures. The screening decks of pharmaceutical companies contain more than a million structures. Nevertheless, chemical substructure searching in these databases can be performed interactively in seconds. Because of this, nobody has really missed structural classification of these databases for the purpose of finding data for individual chemical substructures. However, a full deck high-throughput screen also produces activity data for more than a million substances. How can this amount of data be analyzed? Which are the active scaffolds identified by an assay? To answer such questions, systematic classifications of molecules by scaffolds are needed. This review describes how molecules can be hierarchically classified by their scaffolds. It explains how such classifications can be used to identify active scaffolds in an HTS data set. Once active classes are identified, they need to be visualized in the context of related scaffolds in order to understand SAR. Consequently, such visualizations are another topic of this review. In addition, scaffold based diversity measures are discussed and an outlook is given on the potential impact of structural classifications on a chemically aware semantic web.

  14. Comparison Of Power Quality Disturbances Classification Based On Neural Network

    Directory of Open Access Journals (Sweden)

    Nway Nway Kyaw Win

    2015-07-01

    Full Text Available Abstract Power quality disturbances (PQDs) cause serious problems for the reliability, safety, and economy of power system networks. To improve electric power quality, PQDs must be detected and classified by type of transient fault. A software methodology based on the wavelet transform with the multiresolution analysis (MRA) algorithm and on feed forward neural networks, namely the probabilistic neural network (PNN) and the multilayer feed forward (MLFF) neural network, is presented for automatic classification of eight types of PQ signals: flicker, harmonics, sag, swell, impulse, fluctuation, notch, and oscillatory. The wavelet family Db4 is chosen in this system to calculate the values of detailed energy distributions as input features for classification because it can perform well in detecting and localizing various types of PQ disturbances. This technique classifies the types of PQD events. The classifiers identify the disturbance type according to the energy distribution. The results show that the PNN can analyze different power disturbance types efficiently. Therefore, it can be seen that the PNN has better classification accuracy than the MLFF.
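
    The "detailed energy distribution" used as classifier input can be sketched with a multiresolution decomposition. The record uses the Db4 wavelet; to stay dependency-free this sketch swaps in the much simpler Haar wavelet, so the numbers it produces are not those of the paper. The function names and the toy signals are hypothetical, and the signal length is assumed to be divisible by 2 to the power of the number of levels.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = 1 / math.sqrt(2)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail


def detail_energies(signal, levels):
    """Energy of the detail coefficients at each decomposition level."""
    energies = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(sum(d * d for d in detail))
    return energies


# Toy signals (hypothetical): a flat waveform and the same waveform with an impulse.
flat = [1.0] * 8
impulse = [1.0, 1.0, 1.0, 5.0, 1.0, 1.0, 1.0, 1.0]
flat_features = detail_energies(flat, 3)
impulse_features = detail_energies(impulse, 3)
```

    The resulting per-level energies form a short feature vector; a disturbance such as an impulse concentrates energy in the fine-scale levels, which is the kind of pattern a downstream neural network classifier can learn.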

  15. From molecular classification to targeted therapeutics: the changing face of systemic therapy in metastatic gastroesophageal cancer.

    Science.gov (United States)

    Murphy, Adrian; Kelly, Ronan J

    2015-01-01

    Histological classification of adenocarcinoma or squamous cell carcinoma for esophageal cancer or using the Lauren classification for intestinal and diffuse type gastric cancer has limited clinical utility in the management of advanced disease. Germline mutations in E-cadherin (CDH1) or mismatch repair genes (Lynch syndrome) were identified many years ago but given their rarity, the identification of these molecular alterations does not substantially impact treatment in the advanced setting. Recent molecular profiling studies of upper GI tumors have added to our knowledge of the underlying biology but have not led to an alternative classification system which can guide clinician's therapeutic decisions. Recently the Cancer Genome Atlas Research Network has proposed four subtypes of gastric cancer dividing tumors into those positive for Epstein-Barr virus, microsatellite unstable tumors, genomically stable tumors, and tumors with chromosomal instability. Unfortunately to date, many phase III clinical trials involving molecularly targeted agents have failed to meet their survival endpoints due to their use in unselected populations. Future clinical trials should utilize molecular profiling of individual tumors in order to determine the optimal use of targeted therapies in preselected patients.

  16. From Molecular Classification to Targeted Therapeutics: The Changing Face of Systemic Therapy in Metastatic Gastroesophageal Cancer

    Directory of Open Access Journals (Sweden)

    Adrian Murphy

    2015-01-01

    Full Text Available Histological classification of adenocarcinoma or squamous cell carcinoma for esophageal cancer or using the Lauren classification for intestinal and diffuse type gastric cancer has limited clinical utility in the management of advanced disease. Germline mutations in E-cadherin (CDH1) or mismatch repair genes (Lynch syndrome) were identified many years ago but given their rarity, the identification of these molecular alterations does not substantially impact treatment in the advanced setting. Recent molecular profiling studies of upper GI tumors have added to our knowledge of the underlying biology but have not led to an alternative classification system which can guide clinician’s therapeutic decisions. Recently the Cancer Genome Atlas Research Network has proposed four subtypes of gastric cancer dividing tumors into those positive for Epstein-Barr virus, microsatellite unstable tumors, genomically stable tumors, and tumors with chromosomal instability. Unfortunately to date, many phase III clinical trials involving molecularly targeted agents have failed to meet their survival endpoints due to their use in unselected populations. Future clinical trials should utilize molecular profiling of individual tumors in order to determine the optimal use of targeted therapies in preselected patients.

  17. Structure-based classification and ontology in chemistry

    Directory of Open Access Journals (Sweden)

    Hastings Janna

    2012-04-01

    Full Text Available Abstract Background: Recent years have seen an explosion in the availability of data in the chemistry domain. With this information explosion, however, retrieving relevant results from the available information, and organising those results, become even harder problems. Computational processing is essential to filter and organise the available resources so as to better facilitate the work of scientists. Ontologies encode expert domain knowledge in a hierarchically organised machine-processable format. One such ontology for the chemical domain is ChEBI. ChEBI provides a classification of chemicals based on their structural features and a role or activity-based classification. An example of a structure-based class is 'pentacyclic compound' (compounds containing five-ring structures), while an example of a role-based class is 'analgesic', since many different chemicals can act as analgesics without sharing structural features. Structure-based classification in chemistry exploits elegant regularities and symmetries in the underlying chemical domain. As yet, there has been neither a systematic analysis of the types of structural classification in use in chemistry nor a comparison to the capabilities of available technologies. Results: We analyze the different categories of structural classes in chemistry, presenting a list of patterns for features found in class definitions. We compare these patterns of class definition to tools which allow for automation of hierarchy construction within cheminformatics and within logic-based ontology technology, going into detail in the latter case with respect to the expressive capabilities of the Web Ontology Language and recent extensions for modelling structured objects. Finally we discuss the relationships and interactions between cheminformatics approaches and logic-based approaches. Conclusion: Systems that perform intelligent reasoning tasks on chemistry data require a diverse set of underlying computational

  18. Full Intelligent Cancer Classification of Thermal Breast Images to Assist Physician in Clinical Diagnostic Applications.

    Science.gov (United States)

    Lashkari, AmirEhsan; Pak, Fatemeh; Firouzmand, Mohammad

    2016-01-01

    Breast cancer is the most common type of cancer among women. The key to treating breast cancer is early detection, because according to many pathological studies more than 75% - 80% of all abnormalities are still benign at primary stages; in recent years, therefore, many studies and extensive research have addressed early detection of breast cancer with higher precision and accuracy. Infra-red breast thermography is an imaging technique based on recording temperature distribution patterns of breast tissue. Compared with breast mammography, thermography is a more suitable technique because it is noninvasive, non-contact, passive and free of ionizing radiation. In this paper, a fully automatic, high-accuracy technique for classification of suspicious areas in thermogram images, with the aim of assisting physicians in early detection of breast cancer, is presented. The proposed algorithm consists of four main steps: pre-processing & segmentation, feature extraction, feature selection and classification. In the first step, a fully automatic operation determines the region of interest (ROI) and improves the quality of the image. Using thresholding and edge detection techniques, the right and left breasts are separated from each other. Then the relative suspected areas are segmented and the image matrix is normalized, owing to the uniqueness of each person's body temperature. At the feature extraction stage, 23 features, including statistical, morphological, frequency domain, histogram and Gray Level Co-occurrence Matrix (GLCM) based features, are extracted from the segmented right and left breasts obtained from step 1. To achieve the best features, feature selection methods such as minimum Redundancy and Maximum Relevance (mRMR), Sequential Forward Selection (SFS), Sequential Backward Selection (SBS), Sequential Floating Forward Selection (SFFS), Sequential Floating Backward Selection (SFBS) and Genetic Algorithm (GA) are used at step 3. Finally to classify and TH labeling procedures

  19. An AERONET-based aerosol classification using the Mahalanobis distance

    Science.gov (United States)

    Hamill, Patrick; Giordano, Marco; Ward, Carolyne; Giles, David; Holben, Brent

    2016-09-01

    We present an aerosol classification based on AERONET aerosol data from 1993 to 2012. We used the AERONET Level 2.0 almucantar aerosol retrieval products to define several reference aerosol clusters which are characteristic of the following general aerosol types: Urban-Industrial, Biomass Burning, Mixed Aerosol, Dust, and Maritime. The classification of a particular aerosol observation as one of these aerosol types is determined by its five-dimensional Mahalanobis distance to each reference cluster. We have calculated the fractional aerosol type distribution at 190 AERONET sites, as well as the monthly variation in aerosol type at those locations. The results are presented on a global map and individually in the supplementary material. Our aerosol typing is based on recognizing that different geographic regions exhibit characteristic aerosol types. To generate reference clusters we only keep data points that lie within a Mahalanobis distance of 2 from the centroid. Our aerosol characterization is based on the AERONET retrieved quantities, therefore it does not include low optical depth values. The analysis is based on "point sources" (the AERONET sites) rather than globally distributed values. The classifications obtained will be useful in interpreting aerosol retrievals from satellite borne instruments.
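
    The distance-based assignment described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the dictionary layout of the reference clusters and any data passed in are assumptions, and the sketch works for any number of dimensions rather than specifically the five AERONET parameters.

```python
import math

def mean_vector(points):
    """Component-wise mean (centroid) of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[j] for p in points) / n for j in range(len(points[0]))]

def covariance(points, mu):
    """Sample covariance matrix of the points around centroid mu."""
    n, d = len(points), len(mu)
    cov = [[0.0] * d for _ in range(d)]
    for p in points:
        diff = [p[j] - mu[j] for j in range(d)]
        for a in range(d):
            for b in range(d):
                cov[a][b] += diff[a] * diff[b]
    return [[v / (n - 1) for v in row] for row in cov]

def solve(matrix, rhs):
    """Solve matrix @ y = rhs by Gaussian elimination with partial pivoting.
    Raises ZeroDivisionError if the matrix is singular."""
    d = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(d)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, d):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, d + 1):
                aug[r][c] -= factor * aug[col][c]
    y = [0.0] * d
    for i in reversed(range(d)):
        s = aug[i][d] - sum(aug[i][c] * y[c] for c in range(i + 1, d))
        y[i] = s / aug[i][i]
    return y

def mahalanobis(x, mu, cov):
    """sqrt((x - mu)^T cov^-1 (x - mu)), computed without forming the inverse."""
    diff = [xi - mi for xi, mi in zip(x, mu)]
    y = solve(cov, diff)
    return math.sqrt(sum(a * b for a, b in zip(diff, y)))

def trim_cluster(points, max_distance=2.0):
    """Keep only points within max_distance of the cluster centroid."""
    mu = mean_vector(points)
    cov = covariance(points, mu)
    return [p for p in points if mahalanobis(p, mu, cov) <= max_distance]

def classify(x, clusters):
    """clusters maps a type name to (mu, cov); return the nearest type and its distance."""
    distances = {name: mahalanobis(x, mu, cov) for name, (mu, cov) in clusters.items()}
    best = min(distances, key=distances.get)
    return best, distances[best]
```

    An observation is assigned to whichever reference cluster gives the smallest distance, and `trim_cluster` mirrors the stated rule of keeping only reference points within a Mahalanobis distance of 2 from the centroid.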

  20. Linear classifier and textural analysis of optical scattering images for tumor classification during breast cancer extraction

    Science.gov (United States)

    Eguizabal, Alma; Laughney, Ashley M.; Garcia Allende, Pilar Beatriz; Krishnaswamy, Venkataramanan; Wells, Wendy A.; Paulsen, Keith D.; Pogue, Brian W.; López-Higuera, José M.; Conde, Olga M.

    2013-02-01

    Texture analysis of light scattering in tissue is proposed to obtain diagnostic information from breast cancer specimens. Light scattering measurements are minimally invasive, and allow the estimation of tissue morphology to guide the surgeon in resection surgeries. The usability of scatter signatures acquired with a micro-sampling reflectance spectral imaging system was improved utilizing an empirical approximation to the Mie theory to estimate the scattering power on a per-pixel basis. Co-occurrence analysis is then applied to the scattering power images to extract the textural features. A statistical analysis of the features demonstrated the suitability of the autocorrelation for the classification of non-malignant (normal epithelia and stroma, benign epithelia and stroma, inflammation), malignant (DCIS, IDC, ILC) and adipose tissue, since it reveals morphological information of tissue. Non-malignant tissue shows higher autocorrelation values, adipose tissue presents a very low autocorrelation in its scatter texture, and malignant tissue falls in between. Consequently, a fast linear classifier based on just one straightforward feature is enough to provide relevant diagnostic information. A leave-one-out validation of the linear classifier on 29 samples with 48 regions of interest showed classification accuracies of 98.74% on adipose tissue, 82.67% on non-malignant tissue and 72.37% on malignant tissue, in comparison with the biopsy H&E gold standard. This demonstrates that autocorrelation analysis of scatter signatures is a very computationally efficient and automated approach to provide pathological information in real time to guide the surgeon during tissue resection.
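
    A leave-one-out validation of a one-feature classifier, as mentioned in this abstract, can be sketched as below. The abstract does not say which linear classifier was used, so a nearest-class-mean rule (which reduces to fixed thresholds on a single feature) is assumed here, and the function names are hypothetical.

```python
def nearest_mean_predict(train, value):
    """train is a list of (feature, label) pairs.
    Assign value to the class whose mean feature is closest."""
    sums, counts = {}, {}
    for feature, label in train:
        sums[label] = sums.get(label, 0.0) + feature
        counts[label] = counts.get(label, 0) + 1
    means = {label: sums[label] / counts[label] for label in sums}
    return min(means, key=lambda label: abs(value - means[label]))

def leave_one_out_accuracy(samples):
    """Hold out each sample in turn, train on the rest, and
    return the per-class fraction of correct predictions."""
    hits, totals = {}, {}
    for i, (feature, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        predicted = nearest_mean_predict(train, feature)
        totals[label] = totals.get(label, 0) + 1
        hits[label] = hits.get(label, 0) + (1 if predicted == label else 0)
    return {label: hits[label] / totals[label] for label in totals}
```

    Each sample would pair one autocorrelation value with its tissue label; reporting accuracy per class matches the way the abstract lists separate figures for adipose, non-malignant and malignant tissue.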

  1. Ensemble classification of colon biopsy images based on information rich hybrid features.

    Science.gov (United States)

    Rathore, Saima; Hussain, Mutawarra; Aksam Iftikhar, Muhammad; Jalil, Abdul

    2014-04-01

    In recent years, classification of colon biopsy images has become an active research area. Traditionally, colon cancer is diagnosed using microscopic analysis. However, the process is subjective and leads to considerable inter/intra observer variation. Therefore, reliable computer-aided colon cancer detection techniques are in high demand. In this paper, we propose a colon biopsy image classification system, called CBIC, which benefits from discriminatory capabilities of information rich hybrid feature spaces, and performance enhancement based on ensemble classification methodology. Normal and malignant colon biopsy images differ from each other in terms of the color distribution of different biological constituents. The colors of different constituents are sharp in normal images, whereas the colors diffuse into each other in malignant images. In order to exploit this variation, two feature types, namely color components based statistical moments (CCSM) and Haralick features have been proposed, which are color components based variants of their traditional counterparts. Moreover, in normal colon biopsy images, epithelial cells possess sharp and well-defined edges. Histogram of oriented gradients (HOG) based features have been employed to exploit this information. Different combinations of hybrid features have been constructed from HOG, CCSM, and Haralick features. The minimum Redundancy Maximum Relevance (mRMR) feature selection method has been employed to select meaningful features from individual and hybrid feature sets. Finally, an ensemble classifier based on majority voting has been proposed, which classifies colon biopsy images using the selected features. Linear, RBF, and sigmoid SVM have been employed as base classifiers. The proposed system has been tested on 174 colon biopsy images, and improved performance (=98.85%) has been observed compared to previously reported studies. Additionally, the use of mRMR method has been justified by comparing the

  2. Multi-robot system learning based on evolutionary classification

    Directory of Open Access Journals (Sweden)

    Manko Sergey

    2016-01-01

    Full Text Available This paper presents a novel machine learning method for agents of a multi-robot system. The learning process is based on knowledge discovery through continual analysis of robot sensory information. We demonstrate that classification trees and evolutionary forests may be a basis for creation of autonomous robots capable both of learning and knowledge exchange with other agents in multi-robot system. The results of experimental studies confirm the effectiveness of the proposed approach.

  3. Label-Embedding for Attribute-Based Classification

    OpenAIRE

    Akata, Zeynep; Perronnin, Florent; Harchaoui, Zaid; Schmid, Cordelia

    2013-01-01

    International audience; Attributes are an intermediate representation, which enables parameter sharing between classes, a must when training data is scarce. We propose to view attribute-based image classification as a label-embedding problem: each class is embedded in the space of attribute vectors. We introduce a function which measures the compatibility between an image and a label embedding. The parameters of this function are learned on a training set of labeled samples to ensure that, gi...

  4. Hierarchical Classification of Chinese Documents Based on N-grams

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    We explore the techniques of utilizing N-gram information to categorize Chinese text documents hierarchically so that the classifier can shake off the burden of large dictionaries and complex segmentation processing, and subsequently be domain and time independent. A hierarchical Chinese text classifier is implemented. Experimental results show that hierarchically classifying Chinese text documents based on N-grams can achieve satisfactory performance and outperforms the other traditional Chinese text classifiers.
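
    A minimal sketch of the idea follows: character N-grams are taken directly from the text, so no dictionary or word segmentation is needed, and classification proceeds top-down through a category tree. The overlap score, the function names and the two-level tree layout are assumptions made for illustration; the abstract does not describe the classifier actually implemented.

```python
from collections import Counter

def char_ngrams(text, n):
    """Return overlapping character n-grams, ignoring whitespace."""
    chars = [c for c in text if not c.isspace()]
    return ["".join(chars[i:i + n]) for i in range(len(chars) - n + 1)]

def ngram_profile(text, n_values=(1, 2)):
    """Count all n-grams of the requested lengths in one Counter."""
    profile = Counter()
    for n in n_values:
        profile.update(char_ngrams(text, n))
    return profile

def overlap(doc_profile, class_profile):
    """Number of n-gram occurrences the document shares with a class."""
    return sum(min(count, class_profile[gram]) for gram, count in doc_profile.items())

def classify_hierarchically(text, tree):
    """tree maps top_label -> {leaf_label: training_text}.
    Pick the best top-level category first, then the best leaf inside it."""
    doc = ngram_profile(text)
    top_profiles = {
        top: sum((ngram_profile(t) for t in leaves.values()), Counter())
        for top, leaves in tree.items()
    }
    top = max(top_profiles, key=lambda label: overlap(doc, top_profiles[label]))
    leaf = max(tree[top], key=lambda label: overlap(doc, ngram_profile(tree[top][label])))
    return top, leaf
```

    Because each document is compared only with the children of the category chosen at the previous level, the number of comparisons stays small even when the full category tree is large.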

  5. Tree-based disease classification using protein data.

    Science.gov (United States)

    Zhu, Hongtu; Yu, Chang-Yung; Zhang, Heping

    2003-09-01

    A reliable and precise classification of diseases is essential for successful diagnosis and treatment. Using mass spectrometry from clinical specimens, scientists may find the protein variations among disease and use this information to improve diagnosis. In this paper, we propose a novel procedure to classify disease status based on the protein data from mass spectrometry. Our new tree-based algorithm consists of three steps: projection, selection and classification tree. The projection step aims to project all observations from specimens into the same bases so that the projected data have fixed coordinates. Thus, for each specimen, we obtain a large vector of 'coefficients' on the same basis. The purpose of the selection step is data reduction by condensing the large vector from the projection step into a much lower order of informative vector. Finally, using these reduced vectors, we apply recursive partitioning to construct an informative classification tree. This method has been successfully applied to protein data, provided by the Department of Radiology and Chemistry at Duke University.

  6. Classification of pulmonary airway disease based on mucosal color analysis

    Science.gov (United States)

    Suter, Melissa; Reinhardt, Joseph M.; Riker, David; Ferguson, John Scott; McLennan, Geoffrey

    2005-04-01

    Airway mucosal color changes occur in response to the development of bronchial diseases including lung cancer, cystic fibrosis, chronic bronchitis, emphysema and asthma. These associated changes are often visualized using standard macro-optical bronchoscopy techniques. A limitation to this form of assessment is that the subtle changes that indicate early stages in disease development may often be missed as a result of this highly subjective assessment, especially in inexperienced bronchoscopists. Tri-chromatic CCD chip bronchoscopes allow for digital color analysis of the pulmonary airway mucosa. This form of analysis may facilitate a greater understanding of airway disease response. A 2-step image classification approach is employed: the first step is to distinguish between healthy and diseased bronchoscope images and the second is to classify the detected abnormal images into 1 of 4 possible disease categories. A database of airway mucosal color constructed from healthy human volunteers is used as a standard against which statistical comparisons are made from mucosa with known apparent airway abnormalities. This approach demonstrates great promise as an effective detection and diagnosis tool to highlight potentially abnormal airway mucosa identifying a region possibly suited to further analysis via airway forceps biopsy, or newly developed micro-optical biopsy strategies. Following the identification of abnormal airway images a neural network is used to distinguish between the different disease classes. We have shown that classification of potentially diseased airway mucosa is possible through comparative color analysis of digital bronchoscope images. The combination of the two strategies appears to increase the classification accuracy in addition to greatly decreasing the computational time.

  7. A Novel Segment Classification for Multifocal and Multicentric Breast Cancer to Facilitate Breast-Conservation Treatment.

    Science.gov (United States)

    Tan, Mona P

    2015-01-01

    Breast conservation treatment (BCT) is an appropriate alternative to mastectomy for the treatment of unifocal breast cancer. Multifocal and multicentric breast cancers (MFMCBC) challenge conventional indications for BCT and are often treated with mastectomy. Following progress in treatment strategies for unifocal tumors, there was a movement to evaluate the use of BCT for MFMCBC. Now a growing body of evidence from retrospective data has emerged, demonstrating acceptable local control and overall survival rates with BCT for MFMCBC. Prospective studies are needed to confirm these findings. One of the possible barriers to such trials is the absence of a standardized classification and nomenclature for MFMCBC at this point in time. A novel segment classification is presented in this article in an endeavor to overcome this deficiency and allow future work on this issue.

  8. Lung Cancer Early Diagnosis Using Some Data Mining Classification Techniques: A Survey

    Directory of Open Access Journals (Sweden)

    Thangaraju P

    2015-11-01

    Full Text Available Data mining is the process of analyzing data from different perspectives and summarizing it into useful information. It is primarily used for this purpose and thus finds applications in diverse fields such as retail, finance, communication, marketing and medicine. Data mining plays an important role in healthcare organizations given the growth of the population and of dangerous, deadly diseases such as cancer, SARS, leprosy and HIV. Lung cancer is one of the most dangerous of these diseases. This survey covers medical image mining, data preprocessing, feature extraction, rule generation and classification, and it provides a basic framework for further improvement in medical diagnosis.

  9. Lung Cancer Early Diagnosis Using Some Data Mining Classification Techniques: A Survey

    Directory of Open Access Journals (Sweden)

    Thangaraju P

    2014-06-01

    Full Text Available Data mining is the process of analyzing data from different perspectives and summarizing it into useful information. It is primarily used for this purpose and thus finds applications in diverse fields such as retail, finance, communication, marketing and medicine. Data mining plays an important role in healthcare organizations given the growth of the population and of dangerous, deadly diseases such as cancer, SARS, leprosy and HIV. Lung cancer is one of the most dangerous of these diseases. This survey covers medical image mining, data preprocessing, feature extraction, rule generation and classification, and it provides a basic framework for further improvement in medical diagnosis.

  10. The DTW-based representation space for seismic pattern classification

    Science.gov (United States)

    Orozco-Alzate, Mauricio; Castro-Cabrera, Paola Alexandra; Bicego, Manuele; Londoño-Bonilla, John Makario

    2015-12-01

    Distinguishing among the different seismic volcanic patterns is still one of the most important and labor-intensive tasks for volcano monitoring. This task could be lightened and made free from subjective bias by using automatic classification techniques. In this context, a core but often overlooked issue is the choice of an appropriate representation of the data to be classified. Recently, it has been suggested that using a relative representation (i.e. proximities, namely dissimilarities on pairs of objects) instead of an absolute one (i.e. features, namely measurements on single objects) is advantageous to exploit the relational information contained in the dissimilarities to derive highly discriminant vector spaces, where any classifier can be used. According to that motivation, this paper investigates the suitability of a dynamic time warping (DTW) dissimilarity-based vector representation for the classification of seismic patterns. Results show the usefulness of such a representation in the seismic pattern classification scenario, including analyses of potential benefits from recent advances in the dissimilarity-based paradigm such as the proper selection of representation sets and the combination of different dissimilarity representations that might be available for the same data.
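
    The dissimilarity-based representation described here can be sketched as follows: each signal is replaced by the vector of its DTW distances to a fixed set of reference signals, and that vector can then be fed to any ordinary classifier. The function names and the use of absolute difference as the local cost are assumptions for illustration only.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences,
    using absolute difference as the local cost and no window constraint."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor paths.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def dissimilarity_vector(signal, representation_set):
    """Represent a signal by its DTW distance to each reference signal."""
    return [dtw_distance(signal, prototype) for prototype in representation_set]
```

    The choice of `representation_set` corresponds to the "proper selection of representation sets" mentioned in the abstract; its size fixes the dimensionality of the resulting vector space.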

  11. Data Classification Based on Confidentiality in Virtual Cloud Environment

    Directory of Open Access Journals (Sweden)

    Munwar Ali Zardari

    2014-10-01

    Full Text Available The aim of this study is to provide suitable security to data based on the security needs of the data. It is very difficult to decide (in the cloud) which data need what security and which data do not need security. However, it becomes easy to decide the security level for data once the data have been classified according to their security level based on their characteristics. In this study, we have proposed a data classification cloud model to solve the data confidentiality issue in cloud computing environments. The data are classified into two major classes: sensitive and non-sensitive. The K-Nearest Neighbour (K-NN) classifier is used for data classification and the Rivest, Shamir and Adleman (RSA) algorithm is used to encrypt sensitive data. After implementing the proposed model, it is found that the confidentiality level of data is increased and that the model is more cost and memory friendly for the users as well as for the cloud service providers. The data storage service is one of the cloud services in which data servers are virtualized for all users. In a cloud server, data are stored in one of two ways: either the received data are encrypted and then stored on the cloud servers, or the data are stored on the cloud servers without encryption. Both of these data storage methods can face data confidentiality issues, because the data have different values and characteristics that must be identified before they are sent to cloud servers.

  12. Integrated Classification of Prostate Cancer Reveals a Novel Luminal Subtype with Poor Outcome.

    Science.gov (United States)

    You, Sungyong; Knudsen, Beatrice S; Erho, Nicholas; Alshalalfa, Mohammed; Takhar, Mandeep; Al-Deen Ashab, Hussam; Davicioni, Elai; Karnes, R Jeffrey; Klein, Eric A; Den, Robert B; Ross, Ashley E; Schaeffer, Edward M; Garraway, Isla P; Kim, Jayoung; Freeman, Michael R

    2016-09-01

    Prostate cancer is a biologically heterogeneous disease with variable molecular alterations underlying cancer initiation and progression. Despite recent advances in understanding prostate cancer heterogeneity, better methods for classification of prostate cancer are still needed to improve prognostic accuracy and therapeutic outcomes. In this study, we computationally assembled a large virtual cohort (n = 1,321) of human prostate cancer transcriptome profiles from 38 distinct cohorts and, using pathway activation signatures of known relevance to prostate cancer, developed a novel classification system consisting of three distinct subtypes (named PCS1-3). We validated this subtyping scheme in 10 independent patient cohorts and 19 laboratory models of prostate cancer, including cell lines and genetically engineered mouse models. Analysis of subtype-specific gene expression patterns in independent datasets derived from luminal and basal cell models provides evidence that PCS1 and PCS2 tumors reflect luminal subtypes, while PCS3 represents a basal subtype. We show that PCS1 tumors progress more rapidly to metastatic disease in comparison with PCS2 or PCS3, including PCS1 tumors of low Gleason grade. To apply this finding clinically, we developed a 37-gene panel that accurately assigns individual tumors to one of the three PCS subtypes. This panel was also applied to circulating tumor cells (CTC) and provided evidence that PCS1 CTCs may reflect enzalutamide resistance. In summary, PCS subtyping may improve accuracy in predicting the likelihood of clinical progression and permit treatment stratification at early and late disease stages. Cancer Res; 76(17); 4948-58. ©2016 AACR.

  13. Evidence-Based Cancer Imaging

    Science.gov (United States)

    Khorasani, Ramin

    2017-01-01

    With the advances in the field of oncology, imaging is increasingly used in the follow-up of cancer patients, leading to concerns about over-utilization. Therefore, it has become imperative to make imaging more evidence-based, efficient, cost-effective and equitable. This review explores the strategies and tools to make diagnostic imaging more evidence-based, mainly in the context of follow-up of cancer patients.

  14. An ellipse detection algorithm based on edge classification

    Science.gov (United States)

    Yu, Liu; Chen, Feng; Huang, Jianming; Wei, Xiangquan

    2015-12-01

    In order to enhance the speed and accuracy of ellipse detection, an ellipse detection algorithm based on edge classification is proposed. Excess edge points are removed by serializing each edge into a sequence of points and applying a distance constraint between the edge points. Effective classification is achieved using a criterion on the angle between edge points, which greatly increases the probability that randomly selected edge points fall on the same ellipse. Ellipse fitting accuracy is significantly improved by optimizing the RED algorithm, using Euclidean distance to measure the distance from an edge point to the elliptical boundary. Experimental results show that the algorithm detects ellipses well when edges are subject to interference or block each other, and that it achieves higher detection precision and lower time consumption than the RED algorithm.

  15. Entropy coders for image compression based on binary forward classification

    Science.gov (United States)

    Yoo, Hoon; Jeong, Jechang

    2000-12-01

    Entropy coders, as a noiseless compression method, are widely used as the final compression step for images, and there have been many contributions to increasing entropy coder performance and reducing entropy coder complexity. In this paper, we propose some entropy coders based on the binary forward classification (BFC). The BFC requires a classification overhead, but the total amount of classified output information equals the amount of input information, a property that we prove in this paper. Using this property, we propose entropy coders that consist of the BFC followed by Golomb-Rice coders (BFC+GR) and the BFC followed by arithmetic coders (BFC+A). The proposed entropy coders introduce negligible additional complexity due to the BFC. Simulation results also show better performance than other entropy coders of similar complexity.
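
    The abstract does not describe the BFC itself, so only the Golomb-Rice stage that follows it is sketched here. This is a textbook Rice code (a Golomb code whose divisor is a power of two) with bits held in a string for readability; the function names and that bit-string representation are assumptions, not the authors' implementation.

```python
def rice_encode(value, k):
    """Encode a non-negative integer with Rice parameter k:
    the quotient value >> k in unary (ones ended by a zero),
    then the k low-order bits of value in binary."""
    quotient = value >> k
    bits = "1" * quotient + "0"
    if k > 0:
        remainder = value & ((1 << k) - 1)
        bits += format(remainder, "0{}b".format(k))
    return bits

def rice_decode(bits, k):
    """Decode one codeword from the start of bits.
    Returns (value, number_of_bits_consumed)."""
    quotient = 0
    pos = 0
    while bits[pos] == "1":
        quotient += 1
        pos += 1
    pos += 1  # skip the terminating zero of the unary part
    remainder = int(bits[pos:pos + k], 2) if k > 0 else 0
    return (quotient << k) | remainder, pos + k
```

    Small values get short codewords, so the code works best when the symbols reaching it are concentrated near zero.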

  16. A novel classification method based on membership function

    Science.gov (United States)

    Peng, Yaxin; Shen, Chaomin; Wang, Lijia; Zhang, Guixu

    2011-03-01

    We propose a method for medical image classification using membership functions. Our aim is to classify the image into several classes based on prior knowledge. For every point, we calculate its membership function, i.e., the probability that the point belongs to each class. The point is finally labeled as the class with the highest membership value. The classification is reduced to the minimization of a functional whose arguments are the membership functions. Our paper makes three novel contributions. First, bias correction and the Rudin-Osher-Fatemi (ROF) model are applied to the input image to enhance the image quality. Second, an unconstrained functional is used: we use variable substitution to avoid the constraints that membership functions should be positive and sum to one. Third, several techniques are used to speed up the computation. The experimental result for the ventricle shows the validity of this approach.
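
    The abstract does not say which variable substitution is used. One common choice that removes both constraints is the softmax map, assumed in the sketch below: unconstrained real variables are turned into memberships that are automatically positive and sum to one. The function names are hypothetical.

```python
import math

def memberships(free_variables):
    """Map unconstrained real values (one per class) to membership values
    that are strictly positive and sum to one (softmax substitution)."""
    shift = max(free_variables)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in free_variables]
    total = sum(exps)
    return [e / total for e in exps]

def label(free_variables):
    """Index of the class with the highest membership value."""
    u = memberships(free_variables)
    return max(range(len(u)), key=lambda i: u[i])
```

    With such a substitution the functional can be minimized over the free variables using any unconstrained optimizer, and each point's label is read off afterwards as the class with the largest membership.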

  17. SPEECH/MUSIC CLASSIFICATION USING WAVELET BASED FEATURE EXTRACTION TECHNIQUES

    Directory of Open Access Journals (Sweden)

    Thiruvengatanadhan Ramalingam

    2014-01-01

    Full Text Available Audio classification serves as the fundamental step towards the rapid growth in audio data volume. Due to the increasing size of multimedia sources, speech and music classification is one of the most important issues for multimedia information retrieval. In this work a speech/music discrimination system is developed which utilizes the Discrete Wavelet Transform (DWT) as the acoustic feature. Multiresolution analysis is the most significant statistical way to extract the features from the input signal, and in this study a method is deployed to model the extracted wavelet feature. Support Vector Machines (SVM) are based on the principle of structural risk minimization. SVM is applied to classify audio into its classes, namely speech and music, by learning from training data. The proposed method then extends the application of Gaussian Mixture Models (GMM) to estimate the probability density function using maximum likelihood decision methods. The system shows significant results with an accuracy of 94.5%.

  18. A Fuzzy Similarity Based Concept Mining Model for Text Classification

    CERN Document Server

    Puri, Shalini

    2012-01-01

    Text Classification is a challenging and red-hot field in the current scenario and has great importance in text categorization applications. A lot of research work has been done in this field, but there is a need to categorize a collection of text documents into mutually exclusive categories by extracting the concepts or features using a supervised learning paradigm and different classification algorithms. In this paper, a new Fuzzy Similarity Based Concept Mining Model (FSCMM) is proposed to classify a set of text documents into pre-defined Category Groups (CG) by training and preparing them on the sentence, document and integrated corpora levels, along with feature reduction and ambiguity removal on each level, to achieve high system performance. Fuzzy Feature Category Similarity Analyzer (FFCSA) is used to analyze each extracted feature of Integrated Corpora Feature Vector (ICFV) with the corresponding categories or classes. This model uses Support Vector Machine Classifier (SVMC) to classify correct...

  19. Apparent diffusion coefficient value of gastric cancer by diffusion-weighted imaging: Correlations with the histological differentiation and Lauren classification

    Energy Technology Data Exchange (ETDEWEB)

    Liu, Song, E-mail: songliu532909756@gmail.com [Department of Radiology, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008 (China); Guan, Wenxian, E-mail: wenxianguan123@126.com [Department of Gastrointestinal Surgery, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008 (China); Wang, Hao, E-mail: wanghao20140525@126.com [Department of Gastrointestinal Surgery, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008 (China); Pan, Liang, E-mail: panliang2014@126.com [Department of Radiology, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008 (China); Zhou, Zhuping, E-mail: zhupingzhou@126.com [Department of Radiology, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008 (China); Yu, Haiping, E-mail: haipingyu2012@126.com [Department of Radiology, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008 (China); Liu, Tian, E-mail: tianliu2014@126.com [Radiation Oncology and Winship Cancer Institute, Emory University, Atlanta, GA 30322 (United States); Yang, Xiaofeng, E-mail: xiaofengyang2014@126.com [Radiation Oncology and Winship Cancer Institute, Emory University, Atlanta, GA 30322 (United States); He, Jian, E-mail: hjxueren@126.com [Department of Radiology, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008 (China); Zhou, Zhengyang, E-mail: zyzhou@nju.edu.cn [Department of Radiology, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008 (China)

    2014-12-15

    Highlights: • Gastric cancers’ ADC values were significantly lower than normal gastric wall. • Gastric adenocarcinomas with different differentiation had different ADC values. • Gastric adenocarcinomas’ ADC values correlated with histologic differentiations. • Gastric cancers’ ADC values correlated with Lauren classifications. • Mean ADC value was better than min ADC value in characterizing gastric cancers. - Abstract: Objective: The purpose of this study was to evaluate the correlations between histological differentiation and Lauren classification of gastric cancer and the apparent diffusion coefficient (ADC) value of diffusion weighted imaging (DWI). Materials and methods: Sixty-nine patients with gastric cancer lesions underwent preoperative magnetic resonance imaging (MRI) (3.0T) and surgical resection. DWI was obtained with a single-shot, echo-planar imaging sequence in the axial plane (b values: 0 and 1000 s/mm²). Mean and minimum ADC values were obtained for each gastric cancer and normal gastric walls by two radiologists, who were blinded to the histological findings. Histological type, degree of differentiation and Lauren classification of each resected specimen were determined by one pathologist. Mean and minimum ADC values of gastric cancers with different histological types, degrees of differentiation and Lauren classifications were compared. Correlations between ADC values and histological differentiation and Lauren classification were analyzed. Results: The mean and minimum ADC values of gastric cancers, as a whole and separately, were significantly lower than those of normal gastric walls (all p values <0.001). There were significant differences in the mean and minimum ADC values among gastric cancers with different histological types, degrees of differentiation and Lauren classifications (p < 0.05). Mean and minimum ADC values correlated significantly (all p < 0.001) with histological differentiation (r = 0.564, 0.578) and

  20. Classification of breast cancer stroma as a tool for prognosis

    Science.gov (United States)

    Reis, Sara; Gazinska, Patrycja; Hipwell, John H.; Mertzanidou, Thomy; Naidoo, Kalnisha; Pinder, Sarah; Hawkes, David J.

    2016-03-01

    It has been shown that the tumour microenvironment plays a crucial role in regulating tumour progression by a number of different mechanisms, including the remodeling of collagen fibres in tumour-associated stroma. It is still unclear, however, if these stromal changes are of benefit to the host or the tumour. We hypothesise that stromal maturity is an important reflection of tumour biology, and thus can be used to predict prognosis. The aim of this study is to develop a texture analysis methodology which will automatically classify stromal regions from images of hematoxylin and eosin-stained (H&E) sections into two categories: mature and immature. Subsequently we will investigate whether stromal maturity could be used as a predictor of survival and also as a means to better understand the relationship between the radiological imaging signal and the underlying tissue microstructure. We present initial results for 118 regions-of-interest from a dataset of 39 patients diagnosed with invasive breast cancer.

  1. Classification and Clinical Management of Variants of Uncertain Significance in High Penetrance Cancer Predisposition Genes.

    Science.gov (United States)

    Moghadasi, Setareh; Eccles, Diana M; Devilee, Peter; Vreeswijk, Maaike P G; van Asperen, Christi J

    2016-04-01

    In 2008, the International Agency for Research on Cancer (IARC) proposed a system for classifying sequence variants in highly penetrant breast and colon cancer susceptibility genes, linked to clinical actions. This system uses a multifactorial likelihood model to calculate the posterior probability that an altered DNA sequence is pathogenic. Variants with a posterior probability between 5% and 94.9% (class 3) are categorized as variants of uncertain significance (VUS). This interval is wide and might include variants with a substantial difference in pathogenicity at either end of the spectrum. We think that carriers of class 3 variants would benefit from a fine-tuning of this classification. Classification of VUS to a category with a defined clinical significance is very important because for carriers of a pathogenic mutation full surveillance and risk-reducing surgery can reduce cancer incidence. Counselees who are not carriers of a pathogenic mutation can be discharged from intensive follow-up and avoid unnecessary risk-reducing surgery. By means of examples, we show how, in selected cases, additional data can lead to reclassification of some variants to a different class with different recommendations for surveillance and therapy. To improve the clinical utility of this classification system, we suggest a pragmatic adaptation to clinical practice.
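
    The mapping from posterior probability to class can be sketched as below. Only the class 3 interval (5% to 94.9%) comes from this abstract; the cut-offs for the other four classes are the ones commonly quoted for the IARC five-class scheme and should be treated as an assumption to check against the original proposal. The function name is hypothetical.

```python
def iarc_class(posterior):
    """Map a posterior probability of pathogenicity (0..1) to a class 1-5.
    Class 3 (0.05 up to 0.95) is the variant-of-uncertain-significance band
    stated in the abstract; the other cut-offs are assumed, not sourced here."""
    if not 0.0 <= posterior <= 1.0:
        raise ValueError("posterior must lie between 0 and 1")
    if posterior > 0.99:
        return 5  # definitely pathogenic (assumed cut-off)
    if posterior >= 0.95:
        return 4  # likely pathogenic (assumed cut-off)
    if posterior >= 0.05:
        return 3  # uncertain significance (from the abstract)
    if posterior >= 0.001:
        return 2  # likely not pathogenic (assumed cut-off)
    return 1      # not pathogenic (assumed cut-off)
```

    The width of class 3 is visible directly in the thresholds: a variant at 0.06 and one at 0.94 receive the same label, which is the issue the authors raise.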

  2. Local fractal dimension based approaches for colonic polyp classification.

    Science.gov (United States)

    Häfner, Michael; Tamaki, Toru; Tanaka, Shinji; Uhl, Andreas; Wimmer, Georg; Yoshida, Shigeto

    2015-12-01

    This work introduces texture analysis methods that are based on computing the local fractal dimension (LFD, also called the local density function) and applies them to colonic polyp classification. The methods are tested on 8 HD-endoscopic image databases, where each database is acquired using different imaging modalities (Pentax's i-Scan technology combined with or without staining the mucosa) and on a zoom-endoscopic image database using narrow band imaging. In this paper, we present three novel extensions to an LFD based approach. These extensions additionally extract shape and/or gradient information of the image to enhance the discriminativity of the original approach. To compare the results of the LFD based approaches with the results of other approaches, five state of the art approaches for colonic polyp classification are applied to the employed databases. Experiments show that LFD based approaches are well suited for colonic polyp classification, especially the three proposed extensions. The three proposed extensions are the best performing methods or at least among the best performing methods for each of the employed databases. The methods are additionally tested by means of a public texture image database, the UIUCtex database. With this database, the viewpoint invariance of the methods is assessed, an important feature for the employed endoscopic image databases. Results imply that most of the LFD based methods are more viewpoint invariant than the other methods. However, the shape, size and orientation adapted LFD approaches (which are especially designed to enhance the viewpoint invariance) are in general not more viewpoint invariant than the other LFD based approaches.

  3. Rule based fuzzy logic approach for classification of fibromyalgia syndrome.

    Science.gov (United States)

    Arslan, Evren; Yildiz, Sedat; Albayrak, Yalcin; Koklukaya, Etem

    2016-06-01

    Fibromyalgia syndrome (FMS) is a chronic muscle and skeletal system disease observed generally in women, manifesting itself with widespread pain and impairing the individual's quality of life. FMS diagnosis is made based on the American College of Rheumatology (ACR) criteria. However, the employability and sufficiency of the ACR criteria have recently come under debate. In this context, several evaluation methods, including clinical evaluation methods, were proposed by researchers. Accordingly, ACR had to update their criteria announced back in 1990, 2010 and 2011. The proposed rule based fuzzy logic method aims to evaluate FMS from a different angle as well. This method contains a rule base derived from the 1990 ACR criteria and the individual experiences of specialists. The study was conducted using the data collected from 60 inpatients and 30 healthy volunteers. Several tests and physical examinations were administered to the participants. The fuzzy logic rule base was structured using the parameters of tender point count, chronic widespread pain period, pain severity, fatigue severity and sleep disturbance level, which were deemed important in FMS diagnosis. It has been observed that generally fuzzy predictor was 95.56% consistent with at least of the specialists, who are not a creator of the fuzzy rule base. Thus, in diagnosis classification where the severity of FMS was classified as well, consistent findings were obtained from the comparison of interpretations and experiences of specialists and the fuzzy logic approach. The study proposes a rule base, which could eliminate the shortcomings of 1990 ACR criteria during the FMS evaluation process. Furthermore, the proposed method presents a classification on the severity of the disease, which was not available with the ACR criteria. The study was not limited to only disease classification but at the same time the probability of occurrence and severity was classified. In addition, those who were not suffering from FMS were

  4. Dictionary-Based, Clustered Sparse Representation for Hyperspectral Image Classification

    Directory of Open Access Journals (Sweden)

    Zhen-tao Qin

    2015-01-01

    Full Text Available This paper presents a new, dictionary-based method for hyperspectral image classification, which incorporates both spectral and contextual characteristics of a sample clustered to obtain a dictionary of each pixel. The resulting pixels display a common sparsity pattern in identical clustered groups. We calculated the image’s sparse coefficients using the dictionary approach, which generated the sparse representation features of the remote sensing images. The sparse coefficients are then used to classify the hyperspectral images via a linear SVM. Experiments show that our proposed method of dictionary-based, clustered sparse coefficients can create better representations of hyperspectral images, with a greater overall accuracy and Kappa coefficient.
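
    The two evaluation measures named in this abstract, overall accuracy and the Kappa coefficient, can be computed from a confusion matrix as in the sketch below. It uses the standard definition of Cohen's kappa; the function name and the 2x2 confusion matrix are hypothetical and are not results from the paper.

```python
def overall_accuracy_and_kappa(confusion):
    """confusion[i][j] = number of samples of true class i predicted as class j.
    Returns (overall accuracy, Cohen's kappa)."""
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the row and column marginals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical two-class confusion matrix, for illustration only.
confusion = [[45, 5],
             [10, 40]]
oa, kappa = overall_accuracy_and_kappa(confusion)
print(oa, kappa)
```

    Kappa discounts the agreement that would occur by chance, which is why it is often reported alongside overall accuracy for remote sensing classifiers.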

  5. Typology of Digital News Media: Theoretical Bases for their Classification

    Directory of Open Access Journals (Sweden)

    Ramón SALAVERRÍA

    2017-01-01

    Full Text Available Since their beginnings in the 1990s, digital news media have undergone a process of settlement and diversification. As a result, the prolific classification of online media has become increasingly rich and complex. Based on a review of media typologies, this article proposes some theoretical bases for the distinction of the online media from previous media and, above all, for the differentiation of the various types of online media among them. With that purpose, nine typologic criteria are proposed: (1) platform, (2) temporality, (3) topic, (4) reach, (5) ownership, (6) authorship, (7) focus, (8) economic purpose, and (9) dynamism.

  6. Network Traffic Anomalies Identification Based on Classification Methods

    Directory of Open Access Journals (Sweden)

    Donatas Račys

    2015-07-01

    Full Text Available The problem of detecting network traffic anomalies in computer networks is analyzed. An overview of anomaly detection methods is given, and the advantages and disadvantages of the different methods are analyzed. A model for traffic anomaly detection was developed based on IBM SPSS Modeler and is used to analyze SNMP data from a router. The traffic anomalies were investigated using three classification methods and different sets of learning data. Based on the results of the investigation, it was determined that the C5.1 decision tree method has the highest accuracy and performance and can be successfully used for identification of network traffic anomalies.

  7. Classification of body movements based on posturographic data.

    Science.gov (United States)

    Saripalle, Sashi K; Paiva, Gavin C; Cliett, Thomas C; Derakhshani, Reza R; King, Gregory W; Lovelace, Christopher T

    2014-02-01

    The human body, standing on two feet, produces a continuous sway pattern. Intended movements, sensory cues, emotional states, and illnesses can all lead to subtle changes in sway appearing as alterations in ground reaction forces and the body's center of pressure (COP). The purpose of this study is to demonstrate that carefully selected COP parameters and classification methods can differentiate among specific body movements while standing, providing new prospects in camera-free motion identification. Force platform data were collected from participants performing 11 choreographed postural and gestural movements. Twenty-three different displacement- and frequency-based features were extracted from COP time series, and supplied to classification-guided feature extraction modules. For identification of movement type, several linear and nonlinear classifiers were explored; including linear discriminants, nearest neighbor classifiers, and support vector machines. The average classification rates on previously unseen test sets ranged from 67% to 100%. Within the context of this experiment, no single method was able to uniformly outperform the others for all movement types, and therefore a set of movement-specific features and classifiers is recommended.

  8. Spectrum-based kernel length estimation for Gaussian process classification.

    Science.gov (United States)

    Wang, Liang; Li, Chuan

    2014-06-01

    Recent studies have shown that Gaussian process (GP) classification, a discriminative supervised learning approach, has achieved competitive performance in real applications compared with most state-of-the-art supervised learning methods. However, the problem of automatic model selection in GP classification, involving the kernel function form and the corresponding parameter values (which are unknown in advance), remains a challenge. To make GP classification a more practical tool, this paper presents a novel spectrum analysis-based approach for model selection by refining the GP kernel function to match the given input data. Specifically, we target the problem of GP kernel length scale estimation. Spectra are first calculated analytically from the kernel function itself using the autocorrelation theorem and are also estimated numerically from the training data themselves. Then, the kernel length scale is automatically estimated by equating the two spectrum values, i.e., the kernel function spectrum equals the estimated training data spectrum. Compared with the classical Bayesian method for kernel length scale estimation via maximizing the marginal likelihood (which is time consuming and could suffer from multiple local optima), extensive experimental results on various data sets show that our proposed method is both efficient and accurate.
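
    As a worked illustration of how a kernel spectrum depends on the length scale, assume a one-dimensional squared-exponential kernel (the abstract does not name the kernel form, so this choice and the symbols below are assumptions). Its Fourier transform is again a Gaussian in the frequency:

```latex
k(\tau) = \exp\!\left(-\frac{\tau^{2}}{2\ell^{2}}\right),
\qquad
S_k(\omega) = \int_{-\infty}^{\infty} k(\tau)\, e^{-i\omega\tau}\, d\tau
            = \sqrt{2\pi}\,\ell\, \exp\!\left(-\frac{\ell^{2}\omega^{2}}{2}\right).
```

    Under that assumption, equating $S_k(\omega_0)$ with a spectrum value $\hat{S}(\omega_0)$ estimated from the training data at some frequency $\omega_0$ gives a single equation in the length scale $\ell$. How the authors choose the frequencies and estimate the data spectrum is not stated in the abstract.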

  9. Risk Classification and Risk-based Safety and Mission Assurance

    Science.gov (United States)

    Leitner, Jesse A.

    2014-01-01

    Recent activities to revamp and emphasize the need to streamline processes and activities for Class D missions across the agency have led to various interpretations of Class D, including the lumping of a variety of low-cost projects into Class D. Sometimes terms such as Class D minus are used. In this presentation, mission risk classifications will be traced to official requirements and definitions as a measure to ensure that projects and programs align with the guidance and requirements that are commensurate for their defined risk posture. As part of this, the full suite of risk classifications, formal and informal will be defined, followed by an introduction to the new GPR 8705.4 that is currently under review. GPR 8705.4 lays out guidance for the mission success activities performed at the Classes A-D for NPR 7120.5 projects as well as for projects not under NPR 7120.5. Furthermore, the trends in stepping from Class A into higher risk posture classifications will be discussed. The talk will conclude with a discussion about risk-based safety and mission assurance at GSFC.

  10. Geographical classification of apple based on hyperspectral imaging

    Science.gov (United States)

    Guo, Zhiming; Huang, Wenqian; Chen, Liping; Zhao, Chunjiang; Peng, Yankun

    2013-05-01

    The geographical origin of apples is often recognized and appreciated by consumers, and it is usually an important factor in determining the price of the commercial product. In this work, hyperspectral imaging technology and supervised pattern recognition were used to discriminate apples according to geographical origin. Hyperspectral images of 207 Fuji apple samples were collected with a hyperspectral camera (400-1000 nm). Principal component analysis (PCA) was performed on the hyperspectral imaging data to determine the main efficient wavelength images, and then characteristic variables were extracted by texture analysis based on the gray level co-occurrence matrix (GLCM) from the dominant waveband image. All characteristic variables were obtained by fusing the data of images in efficient spectra. A support vector machine (SVM) was used to construct the classification model and showed excellent performance in the classification results. The classification accuracy was 92.75% in the training set and 89.86% in the prediction set. The overall results demonstrate that the hyperspectral imaging technique coupled with an SVM classifier can be efficiently utilized to discriminate Fuji apples according to geographical origin.
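
    The GLCM texture step mentioned in this abstract can be sketched as follows. This is a minimal pure-Python illustration of a gray level co-occurrence matrix and one common texture variable (contrast). The pixel offset, the number of gray levels, and the tiny image are hypothetical, and the abstract does not say which GLCM variables the authors extracted.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized gray level co-occurrence matrix for the pixel offset (dx, dy).
    image is a list of rows of integer gray levels in range(levels)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """GLCM contrast: sum of p[i][j] * (i - j)^2 over all gray level pairs."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


# Hypothetical 2-level image, horizontal neighbor offset.
image = [[0, 0, 1],
         [1, 1, 0]]
p = glcm(image, levels=2)
print(p, contrast(p))
```

    Variables like this one, computed per waveband image, would then form the feature vector passed to a classifier such as an SVM.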

  11. Spectral classification of stars based on LAMOST spectra

    CERN Document Server

    Liu, Chao; Zhang, Bo; Wan, Jun-Chen; Deng, Li-Cai; Hou, Yonghui; Wang, Yuefei; Yang, Ming; Zhang, Yong

    2015-01-01

    In this work, we select the high signal-to-noise ratio spectra of stars from the LAMOST data and map their MK classes to the spectral features. The equivalent widths of the prominent spectral lines, playing a similar role to the multi-color photometry, form a clean stellar locus well ordered by MK classes. The advantage of the stellar locus in line indices is that it gives a natural and continuous classification of stars consistent with either the broadly used MK classes or the stellar astrophysical parameters. We also employ a SVM-based classification algorithm to assign MK classes to the LAMOST stellar spectra. We find that the completenesses of the classification are up to 90% for A and G type stars, while it is down to about 50% for OB and K type stars. About 40% of the OB and K type stars are mis-classified as A and G type stars, respectively. This is likely owing to the difference of the spectral features between the late B type and early A type stars or between the late G and early K type stars are very we...

  12. Classification of Breast Cancer Subtypes by combining Gene Expression and DNA Methylation Data

    DEFF Research Database (Denmark)

    List, Markus; Hauschild, Anne-Christin; Tan, Qihua;

    2014-01-01

    on the transcriptomic, but also on an epigenetic level. We compared so-called random forest derived classification models based on gene expression and methylation data alone, to a model based on the combined features and to a model based on the gold standard PAM50. We obtained bootstrap errors of 10...

  13. Intrusion Awareness Based on Data Fusion and SVM Classification

    Directory of Open Access Journals (Sweden)

    Ramnaresh Sharma

    2012-06-01

    Full Text Available Network intrusion awareness is an important factor in the risk analysis of network security. In the current decade, various methods and frameworks have become available for intrusion detection and security awareness. Some methods are based on the knowledge discovery process and some frameworks on neural networks. All of these models make rule-based decisions when generating security alerts. In this paper we propose a novel method for intrusion awareness using data fusion and SVM classification. Data fusion works on the basis of gathering event features. The support vector machine is a super classifier of data. Here we use an SVM for the detection of closed items of the rule-based technique. Our proposed method is simulated on the KDD1999 DARPA data set and obtains better empirical evaluation results than the rule-based technique and the neural network model.

  15. Content Based Image Retrieval : Classification Using Neural Networks

    Directory of Open Access Journals (Sweden)

    Shereena V.B

    2014-10-01

    Full Text Available In a content-based image retrieval system (CBIR), the main issue is to extract the image features that effectively represent the image contents in a database. Such an extraction requires a detailed evaluation of retrieval performance of image features. This paper presents a review of fundamental aspects of content based image retrieval including feature extraction of color and texture features. Commonly used color features including color moments, color histogram and color correlogram and Gabor texture are compared. The paper reviews the increase in efficiency of image retrieval when the color and texture features are combined. The similarity measures based on which matches are made and images are retrieved are also discussed. For effective indexing and fast searching of images based on visual features, neural network based pattern learning can be used to achieve effective classification.

  16. Content Based Image Retrieval : Classification Using Neural Networks

    Directory of Open Access Journals (Sweden)

    Shereena V.B

    2014-11-01

    Full Text Available In a content-based image retrieval system (CBIR), the main issue is to extract the image features that effectively represent the image contents in a database. Such an extraction requires a detailed evaluation of retrieval performance of image features. This paper presents a review of fundamental aspects of content based image retrieval including feature extraction of color and texture features. Commonly used color features including color moments, color histogram and color correlogram and Gabor texture are compared. The paper reviews the increase in efficiency of image retrieval when the color and texture features are combined. The similarity measures based on which matches are made and images are retrieved are also discussed. For effective indexing and fast searching of images based on visual features, neural network based pattern learning can be used to achieve effective classification.

  17. A NEW FUNCTIONAL CLASSIFICATION OF STOMACH CANCER AND ITS PATHOBIOLOGICAL AND CLINICAL SIGNIFICANCE

    Institute of Scientific and Technical Information of China (English)

    辛彦; 赵风凯; 宫伟; 王艳萍; 张荫昌; 闫瑞方

    1994-01-01

    The functional differentiations of stomach cancer specimens from 121 patients were investigated by enzyme-, mucin-, affinity- and immunohistochemical methods, and the stomach cancers were divided into five functionally differentiated types: 1) Absorptive Function Differentiation Type (AFDT), 19.8%; 2) Mucin Secreting Function Differentiation Type (MSFDT), 24.0%; 3) Absorptive and Mucin-Producing Function Differentiation Type (AMPFDT), 47.1%; 4) Special Function Differentiation Type (SFDT), 0.8%; and 5) Non-Function Differentiation Type (NFDT), 8.3%. The results indicate that stomach cancer tissues of the same histological type often display differing functional differentiation, and these functionally differentiated types have different invasive and metastatic characteristics. In addition, the functionally differentiated types have particular organic affinities of metastasis and different clinical prognoses. This study suggests that this new functional classification may supplement histological classification. The mechanisms of liver and ovary metastases of stomach cancer are also discussed.

  18. Correlation coefficient mapping in fluorescence spectroscopy: tissue classification for cancer detection.

    Science.gov (United States)

    Crowell, Ed; Wang, Gufeng; Cox, Jason; Platz, Charles P; Geng, Lei

    2005-03-01

    Correlation coefficient mapping has been applied to intrinsic fluorescence spectra of colonic tissue for the purpose of cancer diagnosis. Fluorescence emission spectra were collected of 57 colonic tissue sites in a range of 4 physiological conditions: normal (29), hyperplastic (2), adenomatous (5), and cancerous tissues (21). The sample-sample correlation was used to examine the ability of correlation coefficient mapping to determine tissue disease state. The correlation coefficient map indicates two main categories of samples. These categories were found to relate to disease states of the tissue. Sensitivity, selectivity, predictive value positive, and predictive value negative for differentiation between normal tissue and all other categories were all above 92%. This was found to be similar to, or higher than, tissue classification using existing methods of data reduction. Wavelength-wavelength correlation among the samples highlights areas of importance for tissue classification. The two-dimensional correlation map reveals absorption by NADH and hemoglobin in the samples as negative correlation, an effect not obvious from the one-dimensional fluorescence spectra alone. The integrity of tissue was examined in a time series of spectra of a single tissue sample taken after tissue resection. The wavelength-wavelength correlation coefficient map shows the areas of significance for each fluorophore and their relation to each other. NADH displays negative correlation to collagen and FAD, from the absorption of emission or fluorescence resonance energy transfer. The wavelength-wavelength correlation map for the decay set also clearly shows that there are only three fluorophores of importance in the samples, by the well-defined pattern of the map. The sample-sample correlation coefficient map reveals the changes over time and their impact on tissue classification. Correlation coefficient mapping proves to be an effective method for sample classification and cancer
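
    The sample-sample correlation map described in this abstract can be sketched as a matrix of Pearson correlation coefficients between spectra. The code below is a minimal illustration; the function names and the three short "spectra" are hypothetical and are not data from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_map(spectra):
    """Sample-sample map: entry [i][j] is the correlation of spectrum i with spectrum j."""
    return [[pearson(a, b) for b in spectra] for a in spectra]


# Hypothetical intensity values at four emission wavelengths for three tissue sites.
spectra = [
    [1.0, 2.0, 3.0, 2.0],
    [2.0, 4.1, 6.0, 3.9],
    [3.0, 2.0, 1.0, 2.0],
]
cmap = correlation_map(spectra)
for row in cmap:
    print([round(v, 3) for v in row])
```

    Samples whose spectra share a shape correlate strongly and fall into the same block of the map, which is the grouping behavior the abstract relates to tissue disease state. Transposing the data (wavelengths as rows) would give a wavelength-wavelength map under the same definition.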

  19. Application of Bayesian Classification to Content-Based Data Management

    Science.gov (United States)

    Lynnes, Christopher; Berrick, S.; Gopalan, A.; Hua, X.; Shen, S.; Smith, P.; Yang, K-Y.; Wheeler, K.; Curry, C.

    2004-01-01

    The high volume of Earth Observing System data has proven to be challenging to manage for data centers and users alike. At the Goddard Earth Sciences Distributed Active Archive Center (GES DAAC), about 1 TB of new data are archived each day. Distribution to users is also about 1 TB/day. A substantial portion of this distribution is MODIS calibrated radiance data, which has a wide variety of uses. However, much of the data is not useful for a particular user's needs: for example, ocean color users typically need oceanic pixels that are free of cloud and sun-glint. The GES DAAC is using a simple Bayesian classification scheme to rapidly classify each pixel in the scene in order to support several experimental content-based data services for near-real-time MODIS calibrated radiance products (from Direct Readout stations). Content-based subsetting would allow distribution of, say, only clear pixels to the user if desired. Content-based subscriptions would distribute data to users only when they fit the user's usability criteria in their area of interest within the scene. Content-based cache management would retain more useful data on disk for easy online access. The classification may even be exploited in an automated quality assessment of the geolocation product. Though initially to be demonstrated at the GES DAAC, these techniques have applicability in other resource-limited environments, such as spaceborne data systems.

  20. Object-based Dimensionality Reduction in Land Surface Phenology Classification

    Directory of Open Access Journals (Sweden)

    Brian E. Bunker

    2016-11-01

    Full Text Available Unsupervised classification or clustering of multi-decadal land surface phenology provides a spatio-temporal synopsis of natural and agricultural vegetation response to environmental variability and anthropogenic activities. Notwithstanding the detailed temporal information available in calibrated bi-monthly normalized difference vegetation index (NDVI) and comparable time series, typical pre-classification workflows average a pixel’s bi-monthly index within the larger multi-decadal time series. While this process is one practical way to reduce the dimensionality of time series with many hundreds of image epochs, it effectively dampens temporal variation from both intra and inter-annual observations related to land surface phenology. Through a novel application of object-based segmentation aimed at spatial (not temporal) dimensionality reduction, all 294 image epochs from a Moderate Resolution Imaging Spectroradiometer (MODIS) bi-monthly NDVI time series covering the northern Fertile Crescent were retained (in homogenous landscape units) as unsupervised classification inputs. Given the inherent challenges of in situ or manual image interpretation of land surface phenology classes, a cluster validation approach based on transformed divergence enabled comparison between traditional and novel techniques. Improved intra-annual contrast was clearly manifest in rain-fed agriculture and inter-annual trajectories showed increased cluster cohesion, reducing the overall number of classes identified in the Fertile Crescent study area from 24 to 10. Given careful segmentation parameters, this spatial dimensionality reduction technique augments the value of unsupervised learning to generate homogeneous land surface phenology units. By combining recent scalable computational approaches to image segmentation, future work can pursue new global land surface phenology products based on the high temporal resolution signatures of vegetation index time series.

  1. Biopharmaceutics classification system-based biowaivers for generic oncology drug products: case studies.

    Science.gov (United States)

    Tampal, Nilufer; Mandula, Haritha; Zhang, Hongling; Li, Bing V; Nguyen, Hoainhon; Conner, Dale P

    2015-02-01

    Establishing bioequivalence (BE) of drugs indicated to treat cancer poses special challenges. For ethical reasons, often, the studies need to be conducted in cancer patients rather than in healthy volunteers, especially when the drug is cytotoxic. The Biopharmaceutics Classification System (BCS) introduced by Amidon (1) and adopted by the FDA, presents opportunities to avoid conducting the bioequivalence studies in humans. This paper analyzes the application of the BCS approach by the generic pharmaceutical industry and the FDA to oncology drug products. To date, the FDA has granted BCS-based biowaivers for several drug products involving at least four different drug substances, used to treat cancer. Compared to in vivo BE studies, development of data to justify BCS waivers is considered somewhat easier, faster, and more cost effective. However, the FDA experience shows that the approval times for applications containing in vitro studies to support the BCS-based biowaivers are often as long as the applications containing in vivo BE studies, primarily because of inadequate information in the submissions. This paper deliberates some common causes for the delays in the approval of applications requesting BCS-based biowaivers for oncology drug products. Scientific considerations of conducting a non-BCS-based in vivo BE study for generic oncology drug products are also discussed. It is hoped that the information provided in our study would help the applicants to improve the quality of ANDA submissions in the future.

  2. A risk evaluation model of cervical cancer based on etiology and human leukocyte antigen allele susceptibility

    Directory of Open Access Journals (Sweden)

    Bicheng Hu

    2014-11-01

    Conclusions: This model, based on etiology and HLA allele susceptibility, can estimate the risk of cervical cancer in chronic cervicitis patients after HPV infection. It combines genetic and environmental factors and significantly enhances the accuracy of risk evaluation for cervical cancer. This model could be used to select patients for intervention therapy and to guide patient classification management.

  3. Generalization performance of graph-based semisupervised classification

    Institute of Scientific and Technical Information of China (English)

    2009-01-01

    Semi-supervised learning has been of growing interest over the past few years and many methods have been proposed. Although various algorithms are provided to implement semi-supervised learning, there are still gaps in our understanding of the dependence of generalization error on the numbers of labeled and unlabeled data. In this paper, we consider a graph-based semi-supervised classification algorithm and establish its generalization error bounds. Our results show the close relations between the generalization performance and the structural invariants of data graph.

  4. Hydrophobicity classification of polymeric materials based on fractal dimension

    Directory of Open Access Journals (Sweden)

    Daniel Thomazini

    2008-12-01

    Full Text Available This study proposes a new method to obtain hydrophobicity classification (HC) in high voltage polymer insulators. In the method mentioned, the HC was analyzed by fractal dimension (fd) and its processing time was evaluated having as a goal the application in mobile devices. Texture images were created from spraying solutions produced of mixtures of isopropyl alcohol and distilled water in proportions, which ranged from 0 to 100% volume of alcohol (%AIA). Based on these solutions, the contact angles of the drops were measured and the textures were used as patterns for fractal dimension calculations.

  5. Radar Image Texture Classification based on Gabor Filter Bank

    OpenAIRE

    Mbainaibeye Jérôme; Olfa Marrakchi Charfi

    2014-01-01

    The aim of this paper is to design and develop a filter bank for the detection and classification of radar image texture with 4.6 m resolution obtained by airborne Synthetic Aperture Radar. The textures of this kind of images are more correlated and contain forms with random disposition. The design and development of the filter bank are based on the Gabor filter. We have elaborated a set of filters applied to each set of feature texture allowing its identification and enhancement in comparison w...

  6. Cancer Biochemistry and Host-Tumor Interactions: A Decimal Classification, (Categories 51.6, 51.7, and 51.8).

    Science.gov (United States)

    Schneider, John H.

    This is a hierarchical decimal classification of information related to cancer biochemistry, to host-tumor interactions (including cancer immunology), and to occurrence of cancer in special types of animals and plants. It is a working draft of categories taken from an extensive classification of many fields of biomedical information. Because the…

  7. A Method for Data Classification Based on Discernibility Matrix and Discernibility Function

    Institute of Scientific and Technical Information of China (English)

    2006-01-01

    The method used for data classification influences the efficiency of classification. Attribute reduction based on the discernibility matrix and discernibility function in rough sets can be used in data classification, so we put forward a method for data classification. First, we use the discernibility matrix and discernibility function to delete superfluous attributes in an information system and obtain a necessary attribute set. Second, we delete superfluous attribute values and obtain decision rules. Finally, we classify data by means of the decision rules. The experiments show that data classification using this method is simpler in structure and can improve the efficiency of classification.
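
    The first step of this abstract, building a discernibility matrix for a decision table, can be sketched as below. This follows the standard rough-set definition (for each pair of objects with different decisions, record the condition attributes on which they differ) and is not the authors' implementation. The function names and the small table are hypothetical.

```python
def discernibility_matrix(objects, condition_attrs, decision_attr):
    """Map each pair (i, j) of objects with different decisions to the set of
    condition attributes whose values differ between the two objects."""
    matrix = {}
    for i in range(len(objects)):
        for j in range(i + 1, len(objects)):
            a, b = objects[i], objects[j]
            if a[decision_attr] != b[decision_attr]:
                matrix[(i, j)] = {c for c in condition_attrs if a[c] != b[c]}
    return matrix


def core_attributes(matrix):
    """Attributes that occur as singleton entries; removing any of them would
    make some pair of objects indiscernible, so they cannot be deleted."""
    return {next(iter(s)) for s in matrix.values() if len(s) == 1}


# Hypothetical decision table with two condition attributes and one decision.
table = [
    {"headache": "yes", "temp": "high", "flu": "yes"},
    {"headache": "yes", "temp": "normal", "flu": "no"},
    {"headache": "no", "temp": "high", "flu": "yes"},
    {"headache": "no", "temp": "normal", "flu": "no"},
]
m = discernibility_matrix(table, ["headache", "temp"], "flu")
print(m)
print(core_attributes(m))
```

    In this toy table the attribute "temp" alone separates every pair of objects with different decisions, so "headache" is superfluous and the remaining attribute yields the decision rules.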

  8. Semi-Supervised Classification based on Gaussian Mixture Model for remote imagery

    Institute of Scientific and Technical Information of China (English)

    2010-01-01

    Semi-Supervised Classification (SSC), which makes use of both labeled and unlabeled data to determine classification borders in feature space, has great advantages in extracting classification information from mass data. In this paper, a novel SSC method based on the Gaussian Mixture Model (GMM) is proposed, in which each class's feature space is described by one GMM. Experiments show that the proposed method can achieve high classification accuracy with a small amount of labeled data. However, for the same accuracy, supervised classification methods such as Support Vector Machine, Object Oriented Classification, etc. must be provided with much more labeled data.

  9. Task Classification Based Energy-Aware Consolidation in Clouds

    Directory of Open Access Journals (Sweden)

    HeeSeok Choi

    2016-01-01

    Full Text Available We consider a cloud data center in which the service provider supplies virtual machines (VMs) on hosts or physical machines (PMs) to its subscribers for computation in an on-demand fashion. For the cloud data center, we propose a task consolidation algorithm based on task classification (i.e., computation-intensive and data-intensive) and resource utilization (e.g., CPU and RAM). Furthermore, we design a VM consolidation algorithm to balance task execution time and energy consumption without violating a predefined service level agreement (SLA). Unlike existing research on VM consolidation or scheduling that applies no threshold or a single-threshold scheme, we focus on a double-threshold (upper and lower) scheme for VM consolidation. More specifically, when a host operates with resource utilization below the lower threshold, all the VMs on the host are scheduled to be migrated to other hosts and the host is then powered down; when a host operates with resource utilization above the upper threshold, a VM is migrated to avoid reaching 100% resource utilization. Based on experimental performance evaluations with real-world traces, we show that our task classification based energy-aware consolidation algorithm (TCEA) achieves a significant energy reduction without incurring predefined SLA violations.
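
    The double-threshold rule in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the TCEA algorithm itself: the threshold values, the `Host` class, and the choice of migrating the largest VM from an overloaded host are all hypothetical, and destination selection is left out.

```python
from dataclasses import dataclass, field

# Hypothetical thresholds; the abstract does not state the values used.
LOWER, UPPER = 0.2, 0.8


@dataclass
class Host:
    name: str
    vms: dict = field(default_factory=dict)  # VM name -> utilization share (0..1)

    @property
    def utilization(self) -> float:
        return sum(self.vms.values())


def plan_migrations(hosts, lower=LOWER, upper=UPPER):
    """Return (migrations, hosts_to_power_down) for a double-threshold rule.

    migrations is a list of (vm_name, source_host_name) pairs.
    """
    migrations, power_down = [], []
    for host in hosts:
        if not host.vms:
            continue
        if host.utilization < lower:
            # Under-utilized host: evacuate every VM, then switch it off.
            migrations.extend((vm, host.name) for vm in host.vms)
            power_down.append(host.name)
        elif host.utilization > upper:
            # Over-utilized host: move one VM away.
            # Assumption: pick the VM with the largest utilization share.
            largest = max(host.vms, key=host.vms.get)
            migrations.append((largest, host.name))
    return migrations, power_down
```

    Hosts whose utilization lies between the two thresholds are left untouched, which is what distinguishes this rule from a single-threshold scheme.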

  10. Feature selection gait-based gender classification under different circumstances

    Science.gov (United States)

    Sabir, Azhin; Al-Jawad, Naseer; Jassim, Sabah

    2014-05-01

    This paper proposes a gender classification method based on human gait features and investigates two variations in addition to the normal gait sequence: clothing (wearing coats) and carrying a bag. The feature vectors in the proposed system are constructed after applying a wavelet transform. Three different feature sets are proposed. The first is spatio-temporal distance, which deals with the distances between different parts of the human body (such as feet, knees, hands, height, and shoulders) during one gait cycle. The second and third feature sets are constructed from the approximation and non-approximation coefficients of the human body, respectively. To extract these two feature sets, we divided the human body into two parts, upper and lower, based on the golden ratio proportion. We adopted a statistical method for constructing the feature vector from the above sets. The dimension of the constructed feature vector is reduced using the Fisher score as a feature selection method to optimize discriminating significance. Finally, k-Nearest Neighbor is applied as the classification method. Experimental results demonstrate that our approach covers a more realistic scenario and achieves relatively better performance than existing approaches.
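
    As a rough illustration of Fisher-score feature selection, the sketch below scores a single feature by the ratio of between-class scatter to within-class scatter. The function name is hypothetical and the formula is the common textbook form, which may differ in detail from the variant used in the paper.

```python
def fisher_score(values, labels):
    """Fisher score of one feature: between-class over within-class scatter."""
    overall_mean = sum(values) / len(values)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)

    between = within = 0.0
    for vals in groups.values():
        mean = sum(vals) / len(vals)
        between += len(vals) * (mean - overall_mean) ** 2
        within += sum((v - mean) ** 2 for v in vals)
    # A feature that is constant within every class separates them perfectly.
    return between / within if within else float("inf")
```

    Features would be ranked by this score and only the top-ranked ones passed to the k-Nearest Neighbor classifier.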

  11. Association Technique based on Classification for Classifying Microcalcification and Mass in Mammogram

    Directory of Open Access Journals (Sweden)

    Herwanto

    2013-01-01

    Full Text Available Currently, mammography is recognized as the most effective imaging modality for breast cancer screening. The challenge of using mammography is how to locate the area that is indeed a solitary geographic abnormality. In mammography screening it is important to define the risk for women who have radiologically negative findings and for those who might develop malignancy later in life. Microcalcification and mass segmentation are frequently used as the first step in mammography screening. The main objective of this paper is to apply an association technique based on a classification algorithm to classify microcalcification and mass in mammograms. The system that we propose consists of: (i) a preprocessing phase to enhance the quality of the image, followed by segmentation of the region of interest; (ii) a phase for mining a transactional table; and (iii) a phase for organizing the resulting association rules in a classification model. This paper also illustrates how important the data cleaning phase is in building the data mining process for image classification. The proposed method was evaluated using mammogram data from the Mammographic Image Analysis Society (MIAS). The MIAS data consist of 207 images of normal breast, 64 benign, and 51 malignant. Of the MIAS mammograms, 85 have a mass and 25 have microcalcification. The features of mean and Gray Level Co-occurrence Matrix homogeneity proved to have potential for discriminating microcalcification from mass. The accuracy obtained by this method is 83%.

  12. Classification of prostate cancer grade using temporal ultrasound: in vivo feasibility study

    Science.gov (United States)

    Ghavidel, Sahar; Imani, Farhad; Khallaghi, Siavash; Gibson, Eli; Khojaste, Amir; Gaed, Mena; Moussa, Madeleine; Gomez, Jose A.; Siemens, D. Robert; Leveridge, Michael; Chang, Silvia; Fenster, Aaron; Ward, Aaron D.; Abolmaesumi, Purang; Mousavi, Parvin

    2016-03-01

    Temporal ultrasound has been shown to have high classification accuracy in differentiating cancer from benign tissue. In this paper, we extend the temporal ultrasound method to classify lower grade Prostate Cancer (PCa) from all other grades. We use a group of nine patients with mostly lower grade PCa, where cancerous regions are also limited. A critical challenge is to train a classifier with limited aggressive cancerous tissue compared to low grade cancerous tissue. To resolve the problem of imbalanced data, we use the Synthetic Minority Oversampling Technique (SMOTE) to generate synthetic samples for the minority class. We calculate spectral features of temporal ultrasound data and perform feature selection using Random Forests. With a leave-one-patient-out cross-validation strategy, an area under the receiver operating characteristic curve (AUC) of 0.74 is achieved with overall sensitivity and specificity of 70%. Using an unsupervised learning approach prior to the proposed method improves sensitivity and AUC to 80% and 0.79, respectively. This work presents promising results for classifying lower and higher grade PCa with limited cancerous training samples, using temporal ultrasound.
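
    The core idea of SMOTE mentioned above is interpolation between a minority sample and one of its nearest minority-class neighbours. The sketch below shows only that idea; the function name, the default `k`, and the fixed seed are assumptions, and it is not the pipeline used in the study.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic points from a list of minority-class tuples."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority-class neighbours of x, excluding x itself.
        others = [p for p in minority if p is not x]
        neighbours = sorted(others, key=lambda p: math.dist(x, p))[:k]
        nn = rng.choice(neighbours)
        # New point lies on the segment between x and the chosen neighbour.
        u = rng.random()
        synthetic.append(tuple(a + u * (b - a) for a, b in zip(x, nn)))
    return synthetic
```

    Because every synthetic point lies on a segment between two real minority samples, the oversampled class stays inside the region the minority data already occupies.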

  13. Classification of normal and cancerous lung tissues by electrical impendence tomography.

    Science.gov (United States)

    Gao, Jianling; Yue, Shihong; Chen, Jun; Wang, Huaxiang

    2014-01-01

    Biological tissue impedance spectroscopy can provide rich physiological and pathological information by measuring the variation of the complex impedance of biological tissues under various frequencies of driven current. The Electrical Impedance Tomography (EIT) technique can measure the impedance spectroscopy of biological tissue in the medical field. Before application, a key problem must be solved: how to distinguish normal tissues from cancerous ones in terms of measurable EIT data. In this paper, the impedance spectroscopy characteristics of human lung tissue are studied. On the basis of the measured data of 109 lung cancer patients, the Cole-Cole Circle radius (CCCR) and the complex modulus are extracted. In terms of these two characteristics, 71.6% and 66.4% of the samples of cancerous and normal tissues can be correctly classified, respectively. Furthermore, the two characteristics of the measured EIT data of each patient form a two-dimensional vector, and all such vectors comprise a set of vectors. When classifying the vector set, the rate of correctly partitioning normal and cancerous tissues can be raised to 78.2%. The main factors affecting the classification results for normal and cancerous tissues are analyzed. The proposed method will play an important role in further working out an efficient and feasible diagnostic method for potential lung cancer patients, and provides a theoretical basis and reference data for electrical impedance tomography technology in monitoring pulmonary function.

  14. Scene classification of infrared images based on texture feature

    Science.gov (United States)

    Zhang, Xiao; Bai, Tingzhu; Shang, Fei

    2008-12-01

    Scene classification refers to assigning a physical scene to one of a set of predefined categories. Texture features provide a good basis for classifying scenes. Texture can be considered to be repeating patterns of local variation of pixel intensities, and texture analysis is important in many applications of computer image analysis for classification or segmentation of images based on local spatial variations of intensity. Texture describes the structural information of images, so it provides data for classification in addition to the spectrum. Infrared thermal imagers are now used in many fields. Since infrared images of objects reflect their own thermal radiation, infrared images have some shortcomings: poor contrast between objects and background, blurred edges, heavy noise, and so on. Because of these shortcomings, it is difficult to extract texture features from infrared images. In this paper we develop an infrared image texture feature-based algorithm to classify scenes of infrared images. The paper studies texture extraction using the Gabor wavelet transform, which has excellent capability in analysing the frequency and direction of local regions. Gabor wavelets are chosen for their biological relevance and technical properties. First, after introducing the Gabor wavelet transform and the texture analysis methods, texture features are extracted from the infrared images by the Gabor wavelet transform, exploiting the multi-scale property of the Gabor filter. Second, we take the multi-dimensional means and standard deviations at different scales and directions as texture parameters. The last stage is classification of the scene texture parameters with the least squares support vector machine (LS-SVM) algorithm. SVM is based on the principle of structural risk minimization (SRM). Compared with SVM, LS-SVM has overcome the shortcoming of

  15. Web entity extraction based on entity attribute classification

    Science.gov (United States)

    Li, Chuan-Xi; Chen, Peng; Wang, Ru-Jing; Su, Ya-Ru

    2011-12-01

    Large amounts of entity data are continuously published on web pages, and extracting these entities automatically for further application is very significant. Rule-based entity extraction yields promising results; however, it is labor-intensive and hard to scale. This paper proposes a web entity extraction method based on entity attribute classification, which can avoid manual annotation of samples. First, web pages are segmented into different blocks by the Vision-based Page Segmentation (VIPS) algorithm, and a binary LibSVM classifier is trained to retrieve the candidate blocks that contain the entity contents. Second, the candidate blocks are partitioned into candidate items, LibSVM classifiers are applied to annotate the attributes of the items, and the annotation results are then aggregated into an entity. Results show that the proposed method performs well in extracting agricultural supply and demand entities from web pages.

  16. Soft computing based feature selection for environmental sound classification

    NARCIS (Netherlands)

    Shakoor, A.; May, T.M.; Van Schijndel, N.H.

    2010-01-01

    Environmental sound classification has a wide range of applications, like hearing aids, mobile communication devices, portable media players, and auditory protection devices. Sound classification systems typically extract features from the input sound. Using too many features increases complexity unne

  17. ECG-based heartbeat classification for arrhythmia detection: A survey.

    Science.gov (United States)

    Luz, Eduardo José da S; Schwartz, William Robson; Cámara-Chávez, Guillermo; Menotti, David

    2016-04-01

    An electrocardiogram (ECG) measures the electric activity of the heart and has been widely used for detecting heart diseases due to its simplicity and non-invasive nature. By analyzing the electrical signal of each heartbeat, i.e., the combination of action impulse waveforms produced by different specialized cardiac tissues found in the heart, it is possible to detect some of its abnormalities. In recent decades, several works have developed automatic ECG-based heartbeat classification methods. In this work, we survey the current state-of-the-art methods for ECG-based automated classification of abnormal heartbeats by presenting the ECG signal preprocessing, the heartbeat segmentation techniques, the feature description methods and the learning algorithms used. In addition, we describe some of the databases used for evaluation of methods indicated by a well-known standard developed by the Association for the Advancement of Medical Instrumentation (AAMI) and described in ANSI/AAMI EC57:1998/(R)2008 (ANSI/AAMI, 2008). Finally, we discuss limitations and drawbacks of the methods in the literature, present concluding remarks and future challenges, and propose an evaluation process workflow to guide authors in future works.

  18. Understanding Acupuncture Based on ZHENG Classification from System Perspective

    Directory of Open Access Journals (Sweden)

    Junwei Fang

    2013-01-01

    Full Text Available Acupuncture is an efficient therapy method that originated in ancient China, and studying it on the basis of ZHENG classification is a systematic approach to understanding its complexity. The system perspective contributes to understanding the essence of phenomena, and, with the coming of the systems biology era, broader technology platforms such as omics technologies have been established for the objective study of traditional Chinese medicine (TCM). Omics technologies can dynamically determine molecular components at various levels, which can achieve a systematic understanding of acupuncture by finding the relationships between the various responding parts. After reviewing the literature on acupuncture studied by omics approaches, the following points were found. First, with the help of omics approaches, acupuncture was found to be able to treat diseases by regulating the neuroendocrine immune (NEI) network, and changes in this network could reflect the global effect of acupuncture. Second, the global effect of acupuncture could reflect ZHENG information at certain structure and function levels, which might reveal the mechanism of Meridian and Acupoint Specificity. Furthermore, based on comprehensive ZHENG classification, omics research could help us understand the action characteristics of acupoints and the molecular mechanisms of their synergistic effect.

  19. Gear Crack Level Classification Based on EMD and EDT

    Directory of Open Access Journals (Sweden)

    Haiping Li

    2015-01-01

    Full Text Available Gears are among the most essential parts of rotating machinery, and cracking is one of the damage modes that occur most frequently in gears. This paper therefore deals with the problem of classifying different crack levels. The proposed method is mainly based on empirical mode decomposition (EMD) and the Euclidean distance technique (EDT). First, the vibration signal acquired by an accelerometer is processed by EMD to obtain intrinsic mode functions (IMFs). Then, a correlation coefficient based method is proposed to select the sensitive IMFs, which contain the main gear fault information. The energy of these IMFs is chosen as the fault feature after comparison with kurtosis and skewness. Finally, Euclidean distances between the test sample and the trained samples of four classes are calculated, and on this basis the fault level of the test sample is classified. The proposed approach is tested and validated through a gearbox experiment in which four crack levels and three kinds of loads are used. The results show that the proposed method has high accuracy in classifying different crack levels and may be adaptable to different conditions.
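
    The final two steps (IMF energy as the feature, then a Euclidean distance decision) might look like the sketch below. It assumes the IMFs have already been computed and selected elsewhere, and it represents each class by the mean of its training feature vectors, which is an assumption rather than a detail given in the abstract; all names are hypothetical.

```python
import math


def energy(signal):
    """Energy of one IMF: the sum of squared samples."""
    return sum(v * v for v in signal)


def feature_vector(imfs):
    """One energy value per selected IMF."""
    return [energy(imf) for imf in imfs]


def centroids(train):
    """train maps a class label to a list of feature vectors."""
    return {
        label: [sum(col) / len(vectors) for col in zip(*vectors)]
        for label, vectors in train.items()
    }


def classify(x, cents):
    """Assign x to the class whose centroid is closest in Euclidean distance."""
    return min(cents, key=lambda label: math.dist(x, cents[label]))
```

    With four crack levels, `train` would hold four labels and the test sample is assigned to whichever class centroid is nearest.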

  20. Robust Pedestrian Classification Based on Hierarchical Kernel Sparse Representation

    Directory of Open Access Journals (Sweden)

    Rui Sun

    2016-08-01

    Full Text Available Vision-based pedestrian detection has become an active topic in computer vision and autonomous vehicles. It aims at detecting pedestrians appearing ahead of the vehicle using a camera so that autonomous vehicles can assess the danger and take action. Due to varied illumination and appearance, complex backgrounds, and occlusion, pedestrian detection in outdoor environments is a difficult problem. In this paper, we propose a novel hierarchical feature extraction and weighted kernel sparse representation model for pedestrian classification. Initially, hierarchical feature extraction based on a CENTRIST descriptor is used to capture discriminative structures. A max pooling operation is used to enhance invariance to varying appearance. Then, a kernel sparse representation model is proposed to fully exploit the discrimination information embedded in the hierarchical local features, and a Gaussian weight function is used as the measure to effectively handle occlusion in pedestrian images. Extensive experiments are conducted on benchmark databases, including INRIA, Daimler, an artificially generated dataset and a real occluded dataset, demonstrating the more robust performance of the proposed method compared to state-of-the-art pedestrian classification methods.

  1. 78 FR 58153 - Prevailing Rate Systems; North American Industry Classification System Based Federal Wage System...

    Science.gov (United States)

    2013-09-23

    ... RIN 3206-AM78 Prevailing Rate Systems; North American Industry Classification System Based Federal... Industry Classification System (NAICS) codes currently used in Federal Wage System wage survey industry... 2007 North American Industry Classification System (NAICS) codes used in Federal Wage System (FWS)...

  2. 78 FR 18252 - Prevailing Rate Systems; North American Industry Classification System Based Federal Wage System...

    Science.gov (United States)

    2013-03-26

    ... Industry Classification System Based Federal Wage System Wage Surveys AGENCY: U. S. Office of Personnel... is issuing a proposed rule that would update the 2007 North American Industry Classification System... North American Industry Classification System (NAICS) codes used in Federal Wage System (FWS)...

  3. Gene selection in class space for molecular classification of cancer

    Institute of Scientific and Technical Information of China (English)

    ZHANG Junying; Yue Joseph WANG; Javed KHAN; Robert CLARKE

    2004-01-01

    Gene selection (feature selection) is generally performed in gene space (feature space), where a very serious curse of dimensionality problem always exists because the number of genes is much larger than the number of samples in gene space (G-space). This makes it difficult to model the data set in this space and lowers confidence in the gene selection result. How to find a gene subset in this case is a challenging subject. In this paper, the above G-space is transformed into its dual space, referred to as class space (C-space), such that the number of dimensions is exactly the number of classes of the samples in G-space and the number of samples in C-space is the number of genes in G-space. The curse of dimensionality therefore does not exist in C-space. A new gene selection method, based on the principle of separating different classes as far as possible, is presented with the help of Principal Component Analysis (PCA). The experimental results of gene selection on a real data set are evaluated with the Fisher criterion, the weighted Fisher criterion, and leave-one-out cross validation, showing that the method presented here is effective and efficient.

  4. Bearing Fault Classification Based on Conditional Random Field

    Directory of Open Access Journals (Sweden)

    Guofeng Wang

    2013-01-01

    Full Text Available Condition monitoring of rolling element bearings is paramount for predicting the lifetime of mechanical equipment and performing effective maintenance. To overcome the drawbacks of the hidden Markov model (HMM) and improve diagnosis accuracy, a classifier based on the conditional random field (CRF) model is proposed. In this model, the feature vector sequences and the fault categories are linked by an undirected graphical model in which their relationship is represented by a global conditional probability distribution. In comparison with the HMM, the main advantage of the CRF model is that it can depict the temporal dynamic information between the observation sequences and state sequences without assuming the independence of the input feature vectors. Therefore, the interrelationship between adjacent observation vectors can also be depicted and integrated into the model, which makes the classifier more robust and accurate than the HMM. To evaluate the effectiveness of the proposed method, four kinds of bearing vibration signals, corresponding to normal, inner race pit, outer race pit, and roller pit conditions, are collected from the test rig. The CRF and HMM models are each built to perform fault classification, taking the sub-band energy features of wavelet packet decomposition (WPD) as the observation sequences. Moreover, the K-fold cross validation method is adopted to improve the evaluation accuracy of the classifier. The analysis and comparison under different numbers of folds show that the classification accuracy of the CRF model is higher than that of the HMM. This method sheds new light on the accurate classification of bearing faults.

  5. Comparison Effectiveness of Pixel Based Classification and Object Based Classification Using High Resolution Image In Floristic Composition Mapping (Study Case: Gunung Tidar Magelang City)

    Science.gov (United States)

    Ardha Aryaguna, Prama; Danoedoro, Projo

    2016-11-01

    Developments in remote sensing analysis have kept pace with developments in technology, especially in sensors and platforms. Many images now have high spatial and radiometric resolution and therefore carry a lot of information. Analysis of vegetation objects, such as floristic composition, has benefited greatly from this development. Floristic composition can be interpreted using many methods, such as pixel-based classification and object-based classification. The problem with pixel-based methods on high spatial resolution images is the salt-and-pepper noise that appears in the classification result. The purpose of this research is to compare the effectiveness of pixel-based classification and object-based classification for vegetation composition mapping on a high-resolution WorldView-2 image. The results show that pixel-based classification using a 5×5 majority kernel window gives the highest accuracy among the classifications. The highest accuracy is 73.32%, obtained from the WorldView-2 image radiometrically corrected to surface reflectance level, but for overall accuracy in every class, the object-based method is the best among the methods. In terms of effectiveness, the pixel-based method is more effective than the object-based method for vegetation composition mapping in the Tidar forest.
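
    A majority filter of the kind mentioned above replaces each pixel's class label with the most frequent label in its neighbourhood, which suppresses isolated salt-and-pepper pixels. The sketch below is a plain illustration on a list-of-lists label map; the function name and the edge handling (windows are clipped at the image border) are assumptions, not details from the study.

```python
from collections import Counter


def majority_filter(labels, size=5):
    """Smooth a 2D grid of class labels with a size x size majority window."""
    radius = size // 2
    rows, cols = len(labels), len(labels[0])
    out = [row[:] for row in labels]
    for i in range(rows):
        for j in range(cols):
            # Window is clipped where it would extend past the image border.
            window = [
                labels[y][x]
                for y in range(max(0, i - radius), min(rows, i + radius + 1))
                for x in range(max(0, j - radius), min(cols, j + radius + 1))
            ]
            # On a tie, Counter keeps the label encountered first in the window.
            out[i][j] = Counter(window).most_common(1)[0][0]
    return out
```

    The filter reads from the original map and writes to a copy, so earlier replacements do not influence later windows.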

  6. Kernel-based machine learning techniques for infrasound signal classification

    Science.gov (United States)

    Tuma, Matthias; Igel, Christian; Mialle, Pierrick

    2014-05-01

    Infrasound monitoring is one of four remote sensing technologies continuously employed by the CTBTO Preparatory Commission. The CTBTO's infrasound network is designed to monitor the Earth for potential evidence of atmospheric or shallow underground nuclear explosions. Upon completion, it will comprise 60 infrasound array stations distributed around the globe, of which 47 were certified in January 2014. Three stages can be identified in CTBTO infrasound data processing: automated processing at the level of single array stations, automated processing at the level of the overall global network, and interactive review by human analysts. At station level, the cross correlation-based PMCC algorithm is used for initial detection of coherent wavefronts. It produces estimates for trace velocity and azimuth of incoming wavefronts, as well as other descriptive features characterizing a signal. Detected arrivals are then categorized into potentially treaty-relevant versus noise-type signals by a rule-based expert system. This corresponds to a binary classification task at the level of station processing. In addition, incoming signals may be grouped according to their travel path in the atmosphere. The present work investigates automatic classification of infrasound arrivals by kernel-based pattern recognition methods. It aims to explore the potential of state-of-the-art machine learning methods vis-a-vis the current rule-based and task-tailored expert system. To this purpose, we first address the compilation of a representative, labeled reference benchmark dataset as a prerequisite for both classifier training and evaluation. Data representation is based on features extracted by the CTBTO's PMCC algorithm. As classifiers, we employ support vector machines (SVMs) in a supervised learning setting. Different SVM kernel functions are used and adapted through different hyperparameter optimization routines. The resulting performance is compared to several baseline classifiers. 

  7. Genetic Bee Colony (GBC) algorithm: A new gene selection method for microarray cancer classification.

    Science.gov (United States)

    Alshamlan, Hala M; Badr, Ghada H; Alohali, Yousef A

    2015-06-01

    Naturally inspired evolutionary algorithms have proven effective when used for solving feature selection and classification problems. Artificial Bee Colony (ABC) is a relatively new swarm intelligence method. In this paper, we propose a new hybrid gene selection method, namely the Genetic Bee Colony (GBC) algorithm. The proposed algorithm combines the use of a Genetic Algorithm (GA) with the Artificial Bee Colony (ABC) algorithm. The goal is to integrate the advantages of both algorithms. The proposed algorithm is applied to a microarray gene expression profile in order to select the most predictive and informative genes for cancer classification. In order to test the accuracy of the proposed algorithm, extensive experiments were conducted. Three binary microarray datasets are used: colon, leukemia, and lung. In addition, another three multi-class microarray datasets are used: SRBCT, lymphoma, and leukemia. Results of the GBC algorithm are compared with our recently proposed technique: mRMR combined with the Artificial Bee Colony algorithm (mRMR-ABC). We also compared the combination of mRMR with GA (mRMR-GA) and Particle Swarm Optimization (mRMR-PSO) algorithms. In addition, we compared the GBC algorithm with other related algorithms that have been recently published in the literature, using all benchmark datasets. The GBC algorithm shows superior performance, as it achieved the highest classification accuracy along with the lowest average number of selected genes. This indicates that the GBC algorithm is a promising approach for solving the gene selection problem in both binary and multi-class cancer classification.

  8. Hepatic CT Image Query Based on Threshold-based Classification Scheme with Gabor Features

    Institute of Scientific and Technical Information of China (English)

    JIANG Li-jun; LUO Yong-zing; ZHAO Jun; ZHUANG Tian-ge

    2008-01-01

    Hepatic computed tomography (CT) images were analyzed with the Gabor function. A threshold-based classification scheme using Gabor features was then proposed and applied to the retrieval of hepatic CT images. In our experiments, a batch of hepatic CT images containing several types of CT findings was used to compare Zhao's image classification scheme, the support vector machine (SVM) scheme, and the threshold-based scheme.

  9. Highly comparative, feature-based time-series classification

    CERN Document Server

    Fulcher, Ben D

    2014-01-01

    A highly comparative, feature-based approach to time series classification is introduced that uses an extensive database of algorithms to extract thousands of interpretable features from time series. These features are derived from across the scientific time-series analysis literature, and include summaries of time series in terms of their correlation structure, distribution, entropy, stationarity, scaling properties, and fits to a range of time-series models. After computing thousands of features for each time series in a training set, those that are most informative of the class structure are selected using greedy forward feature selection with a linear classifier. The resulting feature-based classifiers automatically learn the differences between classes using a reduced number of time-series properties, and circumvent the need to calculate distances between time series. Representing time series in this way results in orders of magnitude of dimensionality reduction, allowing the method to perform well on ve...
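
    Greedy forward feature selection, as referred to above, can be sketched as follows. The sketch is illustrative only: a nearest-centroid rule stands in for the linear classifier, accuracy is measured on the training data itself, and all names are hypothetical.

```python
import math


def nearest_centroid_accuracy(X, y, feats):
    """Training accuracy of a nearest-centroid rule using only columns in feats."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append([row[f] for f in feats])
    cents = {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in groups.items()
    }
    hits = 0
    for row, label in zip(X, y):
        x = [row[f] for f in feats]
        predicted = min(cents, key=lambda lab: math.dist(x, cents[lab]))
        hits += predicted == label
    return hits / len(y)


def greedy_forward_selection(X, y, max_features=3):
    """Add one feature at a time, keeping it only if accuracy improves."""
    selected, best = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < max_features:
        scored = [(nearest_centroid_accuracy(X, y, selected + [f]), f) for f in remaining]
        score, feat = max(scored, key=lambda pair: pair[0])
        if score <= best:
            break  # no remaining feature improves the classifier
        selected.append(feat)
        remaining.remove(feat)
        best = score
    return selected, best
```

    In practice the score would come from held-out data or cross-validation rather than training accuracy, to avoid selecting features that merely overfit.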

  10. Credal Classification based on AODE and compression coefficients

    CERN Document Server

    Corani, Giorgio

    2012-01-01

    Bayesian model averaging (BMA) is an approach to average over alternative models; yet, it usually gets excessively concentrated around the single most probable model, therefore achieving only sub-optimal classification performance. The compression-based approach (Boulle, 2007) overcomes this problem, averaging over the different models by applying a logarithmic smoothing over the models' posterior probabilities. This approach has shown excellent performances when applied to ensembles of naive Bayes classifiers. AODE is another ensemble of models with high performance (Webb, 2005), based on a collection of non-naive classifiers (called SPODE) whose probabilistic predictions are aggregated by simple arithmetic mean. Aggregating the SPODEs via BMA rather than by arithmetic mean deteriorates the performance; instead, we aggregate the SPODEs via the compression coefficients and we show that the resulting classifier obtains a slight but consistent improvement over AODE. However, an important issue in any Bayesian e...

  11. Contribution of multiparameter flow cytometry immunophenotyping to the diagnostic screening and classification of pediatric cancer.

    Directory of Open Access Journals (Sweden)

    Cristiane S Ferreira-Facio

    Full Text Available Pediatric cancer is a relatively rare and heterogeneous group of hematological and non-hematological malignancies which require multiple procedures for their diagnostic screening and classification. Until now, flow cytometry (FC) has not been systematically applied to the diagnostic work-up of such malignancies, particularly for solid tumors. Here we evaluated an FC panel of markers for the diagnostic screening of pediatric cancer and further classification of pediatric solid tumors. The proposed strategy aims at the differential diagnosis between tumoral vs. reactive samples, and hematological vs. non-hematological malignancies, and the subclassification of solid tumors. In total, 52 samples from 40 patients suspicious of containing tumor cells were analyzed by FC in parallel to conventional diagnostic procedures. The overall concordance rate between both approaches was 96% (50/52 diagnostic samples), with 100% agreement for all reactive/inflammatory and non-infiltrated samples as well as for those corresponding to solid tumors (n = 35), with only two false negative cases diagnosed with Hodgkin lymphoma and anaplastic lymphoma, respectively. Moreover, clear discrimination between samples infiltrated by hematopoietic vs. non-hematopoietic tumor cells was systematically achieved. Distinct subtypes of solid tumors showed different protein expression profiles, allowing for the differential diagnosis of neuroblastoma (CD56(hi)/GD2(+)/CD81(hi)), primitive neuroectodermal tumors (CD271(hi)/CD99(+)), Wilms tumors (>1 cell population), rhabdomyosarcoma (nuMYOD1(+)/numyogenin(+)), carcinomas (CD45(-)/EpCAM(+)), germ cell tumors (CD56(+)/CD45(-)/NG2(+)/CD10(+)) and eventually also hemangiopericytomas (CD45(-)/CD34(+)). In summary, our results show that multiparameter FC provides fast and useful complementary data to routine histopathology for the diagnostic screening and classification of pediatric cancer.

  12. Molecular classification of breast cancer

    Institute of Scientific and Technical Information of China (English)

    张百红; 岳红云

    2014-01-01

    Breast cancer is a highly heterogeneous disease at the molecular level, and molecular classification offers a new perspective for individualized treatment. Guided by molecular pathology, molecular biology and systems biology, breast cancer classification has progressed through several schemes, including a four-subtype classification, 70-gene and 21-gene expression signatures, and integrated genomic classification. These classifications may guide precise therapy for breast cancer.

  13. Automated glioblastoma segmentation based on a multiparametric structured unsupervised classification.

    Science.gov (United States)

    Juan-Albarracín, Javier; Fuster-Garcia, Elies; Manjón, José V; Robles, Montserrat; Aparici, F; Martí-Bonmatí, L; García-Gómez, Juan M

    2015-01-01

    Automatic brain tumour segmentation has become a key component for the future of brain tumour treatment. Currently, most brain tumour segmentation approaches arise from the supervised learning standpoint, which requires a labelled training dataset from which to infer the models of the classes. The performance of these models is directly determined by the size and quality of the training corpus, whose retrieval becomes a tedious and time-consuming task. On the other hand, unsupervised approaches avoid these limitations but often do not reach results comparable to those of the supervised methods. In this sense, we propose an automated unsupervised method for brain tumour segmentation based on anatomical Magnetic Resonance (MR) images. Four unsupervised classification algorithms, grouped by their structured or non-structured condition, were evaluated within our pipeline. Considering the non-structured algorithms, we evaluated K-means, Fuzzy K-means and Gaussian Mixture Model (GMM), whereas as a structured classification algorithm we evaluated Gaussian Hidden Markov Random Field (GHMRF). An automated postprocess based on a statistical approach supported by tissue probability maps is proposed to automatically identify the tumour classes after the segmentations. We evaluated our brain tumour segmentation method with the public BRAin Tumor Segmentation (BRATS) 2013 Test and Leaderboard datasets. Our approach based on the GMM model improves the results obtained by most of the supervised methods evaluated with the Leaderboard set and reaches the second position in the ranking. Our variant based on the GHMRF achieves the first position in the Test ranking of the unsupervised approaches and the seventh position in the general Test ranking, which confirms the method as a viable alternative for brain tumour segmentation.
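As a rough illustration of the simplest non-structured algorithm named in this abstract, the sketch below runs plain K-means on scalar intensities. It is not the authors' pipeline: the function name `kmeans_1d`, the deterministic initialisation and the toy intensities are assumptions made for the example, and real MR data would be multiparametric.

```python
def kmeans_1d(values, k, iterations=50):
    """Cluster scalar intensities into k groups with plain K-means (Lloyd's algorithm)."""
    ordered = sorted(values)
    # Assumed deterministic initialisation: spread the starting centroids over the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centroids.append(sum(members) / len(members) if members else centroids[c])
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids


# Toy example: six hypothetical voxel intensities forming two obvious groups.
labels, centroids = kmeans_1d([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], k=2)
```

The cluster labels carry no tissue meaning by themselves, which is why the abstract describes a separate postprocess to identify the tumour classes.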

  14. Automated glioblastoma segmentation based on a multiparametric structured unsupervised classification.

    Directory of Open Access Journals (Sweden)

    Javier Juan-Albarracín

    Full Text Available Automatic brain tumour segmentation has become a key component for the future of brain tumour treatment. Currently, most brain tumour segmentation approaches arise from the supervised learning standpoint, which requires a labelled training dataset from which to infer the models of the classes. The performance of these models is directly determined by the size and quality of the training corpus, whose retrieval becomes a tedious and time-consuming task. On the other hand, unsupervised approaches avoid these limitations but often do not reach results comparable to those of the supervised methods. In this sense, we propose an automated unsupervised method for brain tumour segmentation based on anatomical Magnetic Resonance (MR) images. Four unsupervised classification algorithms, grouped by their structured or non-structured condition, were evaluated within our pipeline. Considering the non-structured algorithms, we evaluated K-means, Fuzzy K-means and Gaussian Mixture Model (GMM), whereas as a structured classification algorithm we evaluated Gaussian Hidden Markov Random Field (GHMRF). An automated postprocess based on a statistical approach supported by tissue probability maps is proposed to automatically identify the tumour classes after the segmentations. We evaluated our brain tumour segmentation method with the public BRAin Tumor Segmentation (BRATS) 2013 Test and Leaderboard datasets. Our approach based on the GMM model improves the results obtained by most of the supervised methods evaluated with the Leaderboard set and reaches the second position in the ranking. Our variant based on the GHMRF achieves the first position in the Test ranking of the unsupervised approaches and the seventh position in the general Test ranking, which confirms the method as a viable alternative for brain tumour segmentation.

  15. A Fuzzy Similarity Based Concept Mining Model for Text Classification

    Directory of Open Access Journals (Sweden)

    Shalini Puri

    2011-11-01

    Full Text Available Text classification is a challenging and highly active field and is of great importance in text categorization applications. Much research has been done in this field, but there remains a need to categorize a collection of text documents into mutually exclusive categories by extracting concepts or features using the supervised learning paradigm and different classification algorithms. In this paper, a new Fuzzy Similarity Based Concept Mining Model (FSCMM) is proposed to classify a set of text documents into predefined Category Groups (CG) by training and preparing them at the sentence, document and integrated corpora levels, along with feature reduction and ambiguity removal at each level, to achieve high system performance. A Fuzzy Feature Category Similarity Analyzer (FFCSA) is used to analyze each extracted feature of the Integrated Corpora Feature Vector (ICFV) against the corresponding categories or classes. The model uses a Support Vector Machine Classifier (SVMC) to classify the training data patterns into two groups, i.e., +1 and –1, thereby producing accurate results. The proposed model works efficiently and effectively, with high performance and high-accuracy results.

  16. Radar Image Texture Classification based on Gabor Filter Bank

    Directory of Open Access Journals (Sweden)

    Mbainaibeye Jérôme

    2014-01-01

    Full Text Available The aim of this paper is to design and develop a filter bank for the detection and classification of texture in radar images with 4.6 m resolution obtained by airborne Synthetic Aperture Radar. The textures in this kind of image are more correlated and contain randomly arranged forms. The design and development of the filter bank are based on the Gabor filter. We have elaborated a set of filters, applied to each set of texture features, that allows a texture to be identified and enhanced relative to other textures. The resulting filter bank is a combination of different texture filters. After processing, the selected filter bank is the one that identifies all the textures of an image with a significant identification rate. The developed filter bank is applied to a radar image and the results are compared with those obtained using filter banks derived from generalized Gaussian models (GGM). We show that the Gabor filter bank developed in this work gives a higher classification rate than the generalized Gaussian model. The main contribution of this work is a procedure for generating filter banks that yields an optimal filter bank for a given texture, and in particular for radar image textures.
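For readers unfamiliar with Gabor filter banks, the sketch below builds real-valued Gabor kernels at several orientations using the standard textbook form (a Gaussian envelope multiplied by a cosine carrier). The function names, the parameter defaults and the choice of four orientations are assumptions for illustration; the abstract does not give the authors' filter parameters.

```python
import math


def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5, psi=0.0):
    """Return a size x size real Gabor kernel (size should be odd) as a list of rows."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates by the filter orientation theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope; gamma sets the aspect ratio of the envelope.
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            # Sinusoidal carrier along the rotated x axis.
            carrier = math.cos(2 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def gabor_filter_bank(size, wavelength, sigma, orientations=4):
    """Build one kernel per orientation, evenly spaced over [0, pi)."""
    return [
        gabor_kernel(size, wavelength, i * math.pi / orientations, sigma)
        for i in range(orientations)
    ]


bank = gabor_filter_bank(size=5, wavelength=4.0, sigma=2.0)
```

Each kernel would then be convolved with the image, and the response energies used as texture features; that step is omitted here.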

  17. Neighborhood Hypergraph Based Classification Algorithm for Incomplete Information System

    Directory of Open Access Journals (Sweden)

    Feng Hu

    2015-01-01

    Full Text Available Classification in incomplete information systems is a hot issue in intelligent information processing. The hypergraph is a new intelligent method for machine learning. However, it is hard to process an incomplete information system with the traditional hypergraph, for two reasons: (1) the hyperedges are generated randomly in the traditional hypergraph model; (2) the existing methods are unsuitable for incomplete information systems because of their missing values. In this paper, we propose a novel classification algorithm for incomplete information systems based on the hypergraph model and rough set theory. First, we initialize the hypergraph. Second, we classify the training set by neighborhood hypergraph. Third, under the guidance of rough set theory, we replace the poor hyperedges. After that, we obtain a good classifier. The proposed approach is tested on 15 data sets from the UCI machine learning repository. Furthermore, it is compared with some existing methods, such as C4.5, SVM, NaiveBayes, and KNN. The experimental results show that the proposed algorithm performs better in terms of Precision, Recall, AUC, and F-measure.

  18. Classification of knee arthropathy with accelerometer-based vibroarthrography.

    Science.gov (United States)

    Moreira, Dinis; Silva, Joana; Correia, Miguel V; Massada, Marta

    2016-01-01

    One of the most common knee joint disorders is osteoarthritis, which results from the progressive degeneration of cartilage and subchondral bone over time and mainly affects elderly adults. Current evaluation techniques are either complex, expensive, invasive or simply fail to detect the small and progressive changes that occur within the knee. Vibroarthrography has emerged as a new solution in which the mechanical vibratory signals arising from the knee are recorded using only an accelerometer and subsequently analyzed, enabling the differentiation between a healthy and an arthritic joint. In this study, a vibration-based classification system was created using a dataset with 92 healthy and 120 arthritic segments of knee joint signals collected from 19 healthy and 20 arthritic volunteers, evaluated with k-nearest neighbors and support vector machine classifiers. The best classification was obtained using the k-nearest neighbors classifier with only 6 time-frequency features, with an overall accuracy of 89.8% and a precision, recall and f-measure of 88.3%, 92.4% and 90.1%, respectively. Preliminary results showed that vibroarthrography can be a promising, non-invasive and low-cost tool that could be used for screening purposes. Despite these encouraging results, several upgrades in the data collection process and analysis can be further implemented.
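The k-nearest neighbors classifier mentioned in this abstract can be sketched in a few lines. The function name `knn_predict`, the Euclidean distance, k = 3 and the two-dimensional toy features are assumptions for illustration; the study used 6 time-frequency features whose definitions are not given here.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    # Majority vote among the k closest points.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2-D feature vectors for signal segments.
train = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
labels = ["healthy", "healthy", "healthy", "arthritic", "arthritic", "arthritic"]
prediction = knn_predict(train, labels, (0.5, 0.5))
```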

  19. Gastric Cancer Risk Analysis in Unhealthy Habits Data with Classification Algorithms

    OpenAIRE

    2015-01-01

    Data mining methods are applied to a medical task that seeks for the information about the influence of Helicobacter Pylori on the gastric cancer risk increase by analysing the adverse factors of individual lifestyle. In the process of data pre-processing, the data are cleared of noise and other factors, reduced in dimensionality, as well as transformed for the task and cleared of non-informative attributes. Data classification using C4.5, CN2 and k-nearest neighbour algorithms is carried out...

  20. Product Image Classification Based on Fusion Features

    Institute of Scientific and Technical Information of China (English)

    YANG Xiao-hui; LIU Jing-jing; YANG Li-jun

    2015-01-01

    Two key challenges raised by a product image classification system are classification precision and classification time. In some categories, the classification precision of the latest techniques is still low. In this paper, we propose a local texture descriptor termed fan refined local binary pattern, which captures more detailed information by integrating the spatial distribution into the local binary pattern feature. We compare our approach with different methods on a subset of product images from Amazon/eBay and parts of PI100, and experimental results demonstrate that our proposed approach is superior to existing methods. The highest classification precision is increased by 21% and the average classification time is reduced by 2/3.

  1. Captan: transition from 'B2' to 'not likely'. How pesticide registrants affected the EPA Cancer Classification Update.

    Science.gov (United States)

    Gordon, Elliot

    2007-01-01

    On 24 November 2004 EPA changed the cancer classification of captan from a 'probable human carcinogen' (Category B2) to 'not likely' when used according to label directions. The new cancer classification considers captan to be a potential carcinogen at prolonged high doses that cause cytotoxicity and regenerative cell hyperplasia. These high doses of captan are many orders of magnitude above those likely to be consumed in the diet, or encountered by individuals in occupational or residential settings. This revised cancer classification reflects EPA's implementation of their new cancer guidelines. The procedures involved in the reclassification effort were agreed upon with EPA and involved an Independent Transparent Review as it related to four components that formed the basis of the original 1986 B2 classification: mouse tumors; rat tumors; mutagenicity; and structural similarity to other carcinogens. A Peer Review Panel organized and administered by Toxicology Excellence for Risk Assessment (TERA) met on 2-3 September 2003. The Panel concluded that captan acted through a non-mutagenic threshold mode of action that required prolonged irritation of the duodenal villi as the initial key event. EPA's Cancer Assessment Review Committee (CARC) met on 9 June 2004 and endorsed the Peer Review findings. EPA intended to have the FIFRA Scientific Advisory Panel (SAP) consider the basis for this reclassification but found the science was robust and judged that a SAP review was not warranted. Using the revised classification, the margin of exposure is approximately 1,200,000, supporting the 'not likely' characterization.

  2. Hyperspectral image classification based on spatial and spectral features and sparse representation

    Institute of Scientific and Technical Information of China (English)

    Yang Jing-Hui; Wang Li-Guo; Qian Jin-Xi

    2014-01-01

    To address the low classification accuracy and poor use of spatial information in traditional hyperspectral image classification methods, we propose a new hyperspectral image classification method based on Gabor spatial texture features, nonparametric weighted spectral features, and the sparse representation classification method (Gabor–NWSF and SRC), abbreviated GNWSF–SRC. The proposed GNWSF–SRC method first combines the Gabor spatial features and nonparametric weighted spectral features to describe the hyperspectral image, and then applies the sparse representation method. Finally, the classification is obtained by analyzing the reconstruction error. We use the proposed method to process two typical hyperspectral data sets with different percentages of training samples. Theoretical analysis and simulation demonstrate that the proposed method improves the classification accuracy and Kappa coefficient compared with traditional classification methods and achieves better classification performance.

  3. Effective Rule Based Classifier using Multivariate Filter and Genetic Miner for Mammographic Image Classification

    Directory of Open Access Journals (Sweden)

    Nirase Fathima Abubacker

    2015-06-01

    Full Text Available Mammography is an important examination in the early detection of breast abnormalities. Automatic classification of mammogram images into normal, benign or malignant would help radiologists in the diagnosis of breast cancer cases. This study investigates the effectiveness of using rule-based classifiers with a multivariate filter and a genetic miner to classify mammogram images. The method discovers association rules with the classes as the consequent and classifies the images based on the Highest Average Confidence (HAvC) of the association rules matched for the classes. In the association rule mining stage, Correlation-based Feature Selection (CFS), which plays an important role in reducing the complexity of the image mining process, is used as the feature selection method, and a modified genetic association rule mining technique, GARM, is used to discover the rules. The method is evaluated on a mammogram image dataset of 240 images taken from DDSM. Its performance is compared against other classifiers such as SMO, Naïve Bayes and J48. The performance of the proposed method is promising, with 88% accuracy, and it outperforms the other classifiers in the context of mammogram image classification.
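One plausible reading of classification by Highest Average Confidence is sketched below: every rule whose antecedent is contained in the image's feature set fires, the confidences of the fired rules are averaged per class, and the class with the highest average wins. The rule representation, the function name and the toy rules are hypothetical; the abstract does not specify how rules are encoded or how ties are broken.

```python
def classify_by_average_confidence(features, rules):
    """Pick the class whose matched rules have the highest average confidence.

    `features` is a set of discrete feature items; `rules` is a list of
    (antecedent_set, class_label, confidence) tuples. Returns None if no rule matches.
    """
    matched = {}
    for antecedent, label, confidence in rules:
        if antecedent <= features:  # the rule fires when its antecedent is a subset
            matched.setdefault(label, []).append(confidence)
    if not matched:
        return None
    averages = {label: sum(c) / len(c) for label, c in matched.items()}
    return max(averages, key=averages.get)


# Hypothetical mined rules: (antecedent items, class, confidence).
rules = [
    ({"dense"}, "malignant", 0.9),
    ({"dense", "spiculated"}, "malignant", 0.7),
    ({"dense"}, "benign", 0.6),
    ({"smooth"}, "normal", 0.95),
]
label = classify_by_average_confidence({"dense", "spiculated"}, rules)
```

In this toy case the malignant rules average 0.8 against 0.6 for benign, so the image is labelled malignant.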

  4. A Method of Soil Salinization Information Extraction with SVM Classification Based on ICA and Texture Features

    Institute of Scientific and Technical Information of China (English)

    ZHANG Fei; TASHPOLAT Tiyip; KUNG Hsiang-te; DING Jian-li; MAMAT.Sawut; VERNER Johnson; HAN Gui-hong; GUI Dong-wei

    2011-01-01

    Salt-affected soils classification using remotely sensed images is one of the most common applications in remote sensing, and many algorithms have been developed and applied for this purpose in the literature. This study takes the Delta Oasis of Weigan and Kuqa Rivers as a study area and discusses the prediction of soil salinization from ETM+ Landsat data. It reports the Support Vector Machine (SVM) classification method based on Independent Component Analysis (ICA) and texture features. Meanwhile, the letter introduces the fundamental theory of the SVM algorithm and ICA, and then incorporates ICA and texture features. The classification result is compared with ICA-SVM classification, single data source SVM classification, maximum likelihood classification (MLC) and neural network classification qualitatively and quantitatively. The result shows that this method can effectively solve the problem of low accuracy and fracture classification result in single data source classification. It has high spread ability toward higher array input. The overall accuracy is 98.64%, which increases by 10.2% compared with maximum likelihood classification, even increases by 12.94% compared with neural net classification, and thus acquires good effectiveness. Therefore, the classification method based on SVM and incorporating the ICA and texture features can be adapted to RS image classification and monitoring of soil salinization.

  5. Radiological classification of renal angiomyolipomas based on 127 tumors

    Directory of Open Access Journals (Sweden)

    Prando Adilson

    2003-01-01

    Full Text Available PURPOSE: Demonstrate radiological findings of 127 angiomyolipomas (AMLs) and propose a classification based on the radiological evidence of fat. MATERIALS AND METHODS: The imaging findings of 85 consecutive patients with AMLs: isolated (n = 73), multiple without tuberous sclerosis (TS) (n = 4) and multiple with TS (n = 8), were retrospectively reviewed. Eighteen AMLs (14%) presented with hemorrhage. All patients were submitted to a dedicated helical CT or magnetic resonance studies. All hemorrhagic and non-hemorrhagic lesions were grouped together since our objective was to analyze the presence of detectable fat. Out of 85 patients, 53 were monitored and 32 were treated surgically due to large perirenal component (n = 13), hemorrhage (n = 11) and impossibility of an adequate preoperative characterization (n = 8). There was not a case of renal cell carcinoma (RCC) with fat component in this group of patients. RESULTS: Based on the presence and amount of detectable fat within the lesion, AMLs were classified in 4 distinct radiological patterns: Pattern-I, predominantly fatty (usually less than 2 cm in diameter and intrarenal): 54%; Pattern-II, partially fatty (intrarenal or exophytic): 29%; Pattern-III, minimally fatty (most exophytic and perirenal): 11%; and Pattern-IV, without fat (most exophytic and perirenal): 6%. CONCLUSIONS: This proposed classification might be useful to understand the imaging manifestations of AMLs, their differential diagnosis and determine when further radiological evaluation would be necessary. Small (< 1.5 cm), pattern-I AMLs tend to be intra-renal, homogeneous and predominantly fatty. As they grow they tend to be partially or completely exophytic and heterogeneous (patterns II and III). The rare pattern-IV AMLs, however, can be small or large, intra-renal or exophytic but are always homogeneous, hyperdense masses.
Since no renal cell carcinoma was found in our series, from an evidence-based practice, all renal mass with detectable

  6. Molecular classification of breast cancer in Morocco

    OpenAIRE

    Fouad, Abbass; Yousra, Akasbi; Kaoutar, Znati; Omar, El Mesbahi; Afaf, Amarti; Sanae, Bennis

    2012-01-01

    Introduction: The molecular classification of breast cancers, based first on gene expression and then on protein profile, distinguishes five molecular groups: luminal A, luminal B, Her2/neu, basal-like and unclassified. The objective of this study, carried out at the Hassan II University Hospital (CHU Hassan II) in Fez, is to classify 335 invasive breast cancers into molecular groups and then to correlate them with clinicopathological characteristics. Methods: Retrospective study spanning 45 months, comprising 335 p...

  7. Statistical Analysis of Tissue Images for Detection and Classification of Cervical Cancer

    CERN Document Server

    Jagtap, Jaidip; Pandey, Kiran; Agarwa, Asha; Panigrahi, Prasanta K; Pradhan, Asima

    2011-01-01

    Cervical cancer is one of the major health threats in women worldwide. The current "gold standard" for detecting cancer of the epithelial tissue is the histopathology analysis of biopsy samples. However it relies on the pathologist's judgment of the disease. We investigate the utility of statistical parameters as a potential tool for detection and discrimination of the stages of dysplasia. Digital images of the tissue slides are captured with the help of a digital camera plugged to a microscope. Statistical data analysis is performed with the help of software to evaluate parameters such as mean, maxima, full width half maxima, skewness, kurtosis etc. for the images. We believe that these parameters can help effectively to improve the diagnosis and further classify normal and abnormal tissue sections. These parameters can be used independently as well as in tandem with other parameters as features in classification algorithms that involve the use of Neural networks or Principal component analysis.
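Several of the statistical parameters listed in this abstract (mean, maximum, skewness, kurtosis) can be computed directly from pixel intensities, as the sketch below shows. It uses population moments and plain (non-excess) kurtosis; the function name and the toy intensities are assumptions, and the full width at half maximum is left out because the abstract does not say which distribution it is measured on.

```python
def intensity_statistics(pixels):
    """Compute simple moment-based descriptors of a flat list of pixel intensities.

    Assumes the intensities are not all identical (non-zero variance).
    """
    n = len(pixels)
    mean = sum(pixels) / n
    # Central moments of order 2, 3 and 4 (population form, dividing by n).
    m2 = sum((p - mean) ** 2 for p in pixels) / n
    m3 = sum((p - mean) ** 3 for p in pixels) / n
    m4 = sum((p - mean) ** 4 for p in pixels) / n
    return {
        "mean": mean,
        "max": max(pixels),
        "skewness": m3 / m2 ** 1.5,  # asymmetry of the intensity distribution
        "kurtosis": m4 / m2 ** 2,    # tail weight; equals 3 for a normal distribution
    }


stats = intensity_statistics([1, 2, 3, 4, 5])
```

Descriptors like these could then serve as the feature vector passed to a classifier, as the abstract suggests.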

  8. Pooling breast cancer datasets has a synergetic effect on classification performance and improves signature stability

    Directory of Open Access Journals (Sweden)

    van de Vijver Marc J

    2008-08-01

    Full Text Available Abstract. Background: Michiels et al. (Lancet 2005; 365: 488–92) employed a resampling strategy to show that the genes identified as predictors of prognosis from resamplings of a single gene expression dataset are highly variable. The genes most frequently identified in the separate resamplings were put forward as a 'gold standard'. On a higher level, breast cancer datasets collected by different institutions can be considered as resamplings from the underlying breast cancer population. The limited overlap between published prognostic signatures confirms the trend of signature instability identified by the resampling strategy. Six breast cancer datasets, totaling 947 samples, all measured on the Affymetrix platform, are currently available. This provides a unique opportunity to employ a substantial dataset to investigate the effects of pooling datasets on classifier accuracy, signature stability and enrichment of functional categories. Results: We show that the resampling strategy produces a suboptimal ranking of genes, which cannot be considered to be a 'gold standard'. When pooling breast cancer datasets, we observed a synergetic effect on the classification performance in 73% of the cases. We also observe a significant positive correlation between the number of datasets that is pooled, the validation performance, the number of genes selected, and the enrichment of specific functional categories. In addition, we have evaluated the support for five explanations that have been postulated for the limited overlap of signatures. Conclusion: The limited overlap of current signature genes can be attributed to small sample size. Pooling datasets results in more accurate classification and a convergence of signature genes. We therefore advocate the analysis of new data within the context of a compendium, rather than analysis in isolation.

  9. Fines Classification Based on Sensitivity to Pore-Fluid Chemistry

    KAUST Repository

    Jang, Junbong

    2015-12-28

    The 75-μm particle size is used to discriminate between fine and coarse grains. Further analysis of fine grains is typically based on the plasticity chart. Whereas pore-fluid-chemistry-dependent soil response is a salient and distinguishing characteristic of fine grains, pore-fluid chemistry is not addressed in current classification systems. Liquid limits obtained with electrically contrasting pore fluids (deionized water, 2-M NaCl brine, and kerosene) are combined to define the soil "electrical sensitivity." Liquid limit and electrical sensitivity can be effectively used to classify fine grains according to their fluid-soil response into no-, low-, intermediate-, or high-plasticity fine grains of low, intermediate, or high electrical sensitivity. The proposed methodology benefits from the accumulated experience with liquid limit in the field and addresses the needs of a broader range of geotechnical engineering problems. © ASCE.

  10. Improved Collaborative Filtering Recommendation Based on Classification and User Trust

    Institute of Scientific and Technical Information of China (English)

    Xiao-Lin Xu; Guang-Lin Xu

    2016-01-01

    When dealing with the ratings from users, traditional collaborative filtering algorithms do not consider the credibility of rating data, which affects the accuracy of similarity. To address this issue, the paper proposes an improved algorithm based on classification and user trust. It firstly classifies all the ratings by the categories of items. And then, for each category, it evaluates the trustworthy degree of each user on the category and imposes the degree on the ratings of the user. Finally, the algorithm explores the similarities between users, finds the nearest neighbors, and makes recommendations within each category. Simulations show that the improved algorithm outperforms the traditional collaborative filtering algorithms and enhances the accuracy of recommendation.

  11. About Classification Methods Based on Tensor Modelling for Hyperspectral Images

    Directory of Open Access Journals (Sweden)

    Salah Bourennane

    2010-03-01

    Full Text Available Denoising and Dimensionality Reduction (DR) are key issues for improving classifier efficiency on hyperspectral images (HSI). The recently developed multi-way Wiener filtering is used, and Principal and Independent Component Analysis (PCA; ICA) and projection pursuit (PP) approaches to DR have been investigated. These matrix algebra methods are applied to vectorized images, so the spatial arrangement is lost. To jointly take advantage of the spatial and spectral information, HSI has recently been represented as a tensor. Since this offers multiple ways to decompose data orthogonally, we introduced filtering and DR methods based on multilinear algebra tools. The DR is performed on the spectral way using PCA, or PP, jointly with an orthogonal projection onto a lower-dimensional subspace of the spatial ways. We show the classification improvement obtained with the introduced methods compared with existing methods. The experiment is exemplified using real-world HYDICE data. Keywords: multi-way filtering, dimensionality reduction, matrix and multilinear algebra tools, tensor processing.

  12. The generalization ability of online SVM classification based on Markov sampling.

    Science.gov (United States)

    Xu, Jie; Yan Tang, Yuan; Zou, Bin; Xu, Zongben; Li, Luoqing; Lu, Yang

    2015-03-01

    In this paper, we consider online support vector machine (SVM) classification learning algorithms with uniformly ergodic Markov chain (u.e.M.c.) samples. We establish a bound on the misclassification error of an online SVM classification algorithm with u.e.M.c. samples based on reproducing kernel Hilbert spaces and obtain a satisfactory convergence rate. We also introduce a novel online SVM classification algorithm based on Markov sampling, and present numerical studies on the learning ability of online SVM classification based on Markov sampling on benchmark repository data. The numerical studies show that the learning performance of the online SVM classification algorithm based on Markov sampling is better than that of classical online SVM classification based on random sampling when the training sample size is larger.

  13. Classification of cassava genotypes based on qualitative and quantitative data.

    Science.gov (United States)

    Oliveira, E J; Oliveira Filho, O S; Santos, V S

    2015-02-02

    We evaluated the genetic variation of cassava accessions based on qualitative (binomial and multicategorical) and quantitative traits (continuous). We characterized 95 accessions obtained from the Cassava Germplasm Bank of Embrapa Mandioca e Fruticultura; we evaluated these accessions for 13 continuous, 10 binary, and 25 multicategorical traits. First, we analyzed the accessions based only on quantitative traits; next, we conducted joint analysis (qualitative and quantitative traits) based on the Ward-MLM method, which performs clustering in two stages. According to the pseudo-F, pseudo-t2, and maximum likelihood criteria, we identified five and four groups based on quantitative trait and joint analysis, respectively. The smaller number of groups identified based on joint analysis may be related to the nature of the data. On the other hand, quantitative data are more subject to environmental effects in the phenotype expression; this results in the absence of genetic differences, thereby contributing to greater differentiation among accessions. For most of the accessions, the maximum probability of classification was >0.90, independent of the trait analyzed, indicating a good fit of the clustering method. Differences in clustering according to the type of data implied that analysis of quantitative and qualitative traits in cassava germplasm might explore different genomic regions. On the other hand, when joint analysis was used, the means and ranges of genetic distances were high, indicating that the Ward-MLM method is very useful for clustering genotypes when there are several phenotypic traits, such as in the case of genetic resources and breeding programs.

  14. A comparative study on classification of sleep stage based on EEG signals using feature selection and classification algorithms.

    Science.gov (United States)

    Şen, Baha; Peker, Musa; Çavuşoğlu, Abdullah; Çelebi, Fatih V

    2014-03-01

    Sleep scoring is one of the most important diagnostic methods in psychiatry and neurology. Sleep staging is a time-consuming and difficult task undertaken by sleep experts. This study aims to identify a method that classifies sleep stages automatically and with a high degree of accuracy and, in this manner, assists sleep experts. The study consists of three stages: feature extraction, feature selection from EEG signals, and classification of these signals. In the feature extraction stage, 20 attribute algorithms in four categories are used, from which 41 feature parameters were obtained. Feature selection is important for eliminating irrelevant and redundant features; in this manner prediction accuracy is improved and the computational overhead of classification is reduced. Effective feature selection algorithms, namely minimum redundancy maximum relevance (mRMR), fast correlation based feature selection (FCBF), ReliefF, t-test, and Fisher score, are used at the feature selection stage to select a set of features that best represent the EEG signals. The selected features are used as input parameters for the classification algorithms. At the classification stage, five different classification algorithms (random forest (RF), feed-forward neural network (FFNN), decision tree (DT), support vector machine (SVM), and radial basis function neural network (RBF)) are applied to the problem. The results obtained from the different classification algorithms are reported so that computation times and accuracy rates can be compared. Finally, a classification accuracy of 97.03% is obtained using the proposed method. The results indicate that the proposed method can support the design of a new intelligent assistive sleep scoring system.
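Of the feature selection criteria listed in this abstract, the Fisher score is the easiest to show compactly. The sketch below uses a common textbook definition (between-class scatter divided by within-class scatter, computed per feature) and ranks features by it. The function names, the toy data and the class labels are assumptions; the paper's exact variant may differ.

```python
def fisher_score(values, labels):
    """Fisher score of one feature: between-class scatter over within-class scatter."""
    overall_mean = sum(values) / len(values)
    between = 0.0
    within = 0.0
    for cls in set(labels):
        members = [v for v, lab in zip(values, labels) if lab == cls]
        class_mean = sum(members) / len(members)
        class_var = sum((v - class_mean) ** 2 for v in members) / len(members)
        between += len(members) * (class_mean - overall_mean) ** 2
        within += len(members) * class_var
    # A feature with zero within-class variance separates the classes perfectly.
    return between / within if within > 0 else float("inf")


def rank_features(rows, labels):
    """Return feature indices ordered from most to least discriminative."""
    n_features = len(rows[0])
    scores = [fisher_score([r[j] for r in rows], labels) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


# Hypothetical epochs with two features each; feature 0 separates the stages, feature 1 does not.
rows = [(1.0, 5.0), (1.2, 1.0), (3.0, 5.2), (3.2, 0.8)]
stages = ["wake", "wake", "sleep", "sleep"]
order = rank_features(rows, stages)
```

Keeping only the top-ranked indices gives the reduced feature set that would be passed on to the classifiers.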

  15. Quality-Oriented Classification of Aircraft Material Based on SVM

    Directory of Open Access Journals (Sweden)

    Hongxia Cai

    2014-01-01

    Full Text Available Existing material classification schemes are intended to improve inventory management. However, different materials have different quality-related attributes, especially in the aircraft industry. To reduce cost without sacrificing quality, we propose a quality-oriented material classification system that considers material quality characteristics, quality cost, and quality influence. The Analytic Hierarchy Process supports feature selection and the classification decision. We use an improved Kraljic Portfolio Matrix to establish a three-dimensional classification model. Aircraft materials can be divided into eight types, including general type, key type, risk type, and leveraged type. To improve the classification accuracy for the various materials, a Support Vector Machine (SVM) algorithm is introduced. Finally, we compare the SVM and a BP neural network in the application. The results show that the SVM algorithm is more efficient and accurate and that quality-oriented material classification is valuable.

  16. Three types of abdominoperineal excision procedures for rectal cancer based on anatomic landmarks classification

    Institute of Scientific and Technical Information of China (English)

    叶颖江; 申占龙; 王杉

    2014-01-01

    Abdominoperineal excision (APE) remains the main procedure for rectal cancer patients with very low tumors, obvious invasion of adjacent structures, or a narrow pelvis. Although it is established that the abdominal phase of APE should follow the principle of total mesorectal excision (TME), no consensus has been reached on the perineal phase. An important reason is the lack of definite anatomic landmarks in the perineal phase, which makes the procedure hard to standardize. In 2014, the Swedish surgeon Professor Holm proposed a new concept that classifies APE into three types, intersphincteric APE, extralevator APE, and ischioanal APE, based on anatomic landmarks formed by the perineal fascia, nerves, and blood vessels. This concept makes the categories of APE more definite and the anatomic landmarks clearer, and eases standardization and adoption of the procedure. In this paper, we combine a review of the literature with our own treatment experience to introduce and discuss these three types of APE.

  17. Classification between normal and tumor tissues based on the pair-wise gene expression ratio

    Directory of Open Access Journals (Sweden)

    Wong YC

    2004-10-01

    Full Text Available Abstract Background: Precise classification of cancer types is critically important for early cancer diagnosis and treatment. Numerous efforts have been made to use gene expression profiles to improve the precision of tumor classification. However, reliable cancer-related signals are generally lacking. Method: Using recent datasets on colon and prostate cancer, a data transformation procedure from single gene expression to pair-wise gene expression ratio is proposed. Making use of the internal consistency of each expression profiling dataset, this transformation improves the signal-to-noise ratio of the dataset and uncovers new relevant cancer-related signals (features). The efficiency of using the transformed dataset to perform normal/tumor classification was investigated using feature partitioning with informative features (gene annotation) as discriminating axes (single gene expression or pair-wise gene expression ratio). Classification results were compared to the original datasets for up to 10-feature model classifiers. Results: 82 and 262 genes that have high correlation to tissue phenotype were selected from the colon and prostate datasets, respectively. Remarkably, data transformation of the highly noisy expression data lowered the coefficient of variation (CV) for the within-class samples and improved the correlation with tissue phenotypes. The transformed dataset exhibited lower CV when compared to that of single gene expression. In the colon cancer set, the minimum CV decreased from 45.3% to 16.5%. In prostate cancer, comparable CV was achieved with and without transformation. This improvement in CV, coupled with the improved correlation between the pair-wise gene expression ratio and tissue phenotypes, yielded higher classification efficiency, especially with the colon dataset, from 87.1% to 93.5%. Over 90% of the top ten discriminating axes in both datasets showed significant improvement after data transformation.
The
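    To make the effect of the ratio transformation concrete, here is a minimal sketch that compares the coefficient of variation (CV) of a single gene with that of a pair-wise ratio. It assumes, purely for illustration, that within-class noise is a per-sample multiplicative factor shared by both genes; the helper names and the toy numbers are hypothetical and do not come from the colon or prostate datasets.

```python
from itertools import combinations
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """CV = population standard deviation divided by the absolute mean."""
    return pstdev(values) / abs(mean(values))


def pairwise_ratios(sample):
    """Map each gene pair (i, j) with i < j to expression[i] / expression[j]."""
    return {
        (i, j): sample[i] / sample[j]
        for i, j in combinations(range(len(sample)), 2)
    }


# Three within-class samples, two genes each. Each sample carries its own
# multiplicative scale factor (1x, 3x, 2x), an assumption made for this sketch.
samples = [[10.0, 20.0], [30.0, 60.0], [20.0, 40.0]]

gene0_cv = coefficient_of_variation([s[0] for s in samples])
ratio_cv = coefficient_of_variation([pairwise_ratios(s)[(0, 1)] for s in samples])

print(f"CV of gene 0 alone: {gene0_cv:.3f}")
print(f"CV of ratio gene0/gene1: {ratio_cv:.3f}")
```

    In this idealized case the shared factor cancels in the ratio, so the ratio's CV drops to zero while the single-gene CV stays large; real data would show a smaller reduction.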

  18. Breast tissue classification in digital tomosynthesis images based on global gradient minimization and texture features

    Science.gov (United States)

    Qin, Xulei; Lu, Guolan; Sechopoulos, Ioannis; Fei, Baowei

    2014-03-01

    Digital breast tomosynthesis (DBT) is a pseudo-three-dimensional x-ray imaging modality proposed to decrease the effect of tissue superposition present in mammography, potentially resulting in an increase in clinical performance for the detection and diagnosis of breast cancer. Tissue classification in DBT images can be useful in risk assessment, computer-aided detection and radiation dosimetry, among other aspects. However, classifying breast tissue in DBT is a challenging problem because DBT images include complicated structures, image noise, and out-of-plane artifacts due to limited angular tomographic sampling. In this project, we propose an automatic method to classify fatty and glandular tissue in DBT images. First, the DBT images are pre-processed to enhance the tissue structures and to decrease image noise and artifacts. Second, a global smooth filter based on L0 gradient minimization is applied to eliminate detailed structures and enhance large-scale ones. Third, the similar structure regions are extracted and labeled by fuzzy C-means (FCM) classification. At the same time, the texture features are also calculated. Finally, each region is classified into different tissue types based on both intensity and texture features. The proposed method is validated using five patient DBT images using manual segmentation as the gold standard. The Dice scores and the confusion matrix are utilized to evaluate the classified results. The evaluation results demonstrated the feasibility of the proposed method for classifying breast glandular and fat tissue on DBT images.
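    The two evaluation measures mentioned above can be sketched as follows. This is a minimal illustration on flattened label lists; the function names, the label strings "fat" and "gland", and the four-pixel example are hypothetical and are not the authors' evaluation code.

```python
def dice_score(predicted, reference, label):
    """Dice = 2 * |P ∩ R| / (|P| + |R|) for the pixels carrying `label`."""
    pred = {i for i, v in enumerate(predicted) if v == label}
    ref = {i for i, v in enumerate(reference) if v == label}
    if not pred and not ref:
        return 1.0  # label absent from both: treated as perfect agreement
    return 2 * len(pred & ref) / (len(pred) + len(ref))


def confusion_matrix(predicted, reference, labels):
    """Rows index the reference label, columns index the predicted label."""
    index = {lab: k for k, lab in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for p, r in zip(predicted, reference):
        matrix[index[r]][index[p]] += 1
    return matrix


# Flattened toy segmentation of four pixels.
predicted = ["fat", "fat", "gland", "gland"]
reference = ["fat", "gland", "gland", "gland"]

print(dice_score(predicted, reference, "gland"))
print(confusion_matrix(predicted, reference, ["fat", "gland"]))
```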

  19. A Novel Prostate Cancer Classification Technique Using Intermediate Memory Tabu Search

    Directory of Open Access Journals (Sweden)

    Tahir Muhammad Atif

    2005-01-01

    Full Text Available The introduction of multispectral imaging in pathology problems such as the identification of prostatic cancer is recent. Unlike conventional RGB color space, it allows the acquisition of a large number of spectral bands within the visible spectrum. This results in a feature vector of size greater than 100. For such a high dimensionality, pattern recognition techniques suffer from the well-known curse of dimensionality problem. The two well-known techniques to solve this problem are feature extraction and feature selection. In this paper, a novel feature selection technique using tabu search with an intermediate-term memory is proposed. The cost of a feature subset is measured by the leave-one-out correct-classification rate of a nearest-neighbor (1-NN) classifier. The experiments have been carried out on the prostate cancer textured multispectral images and the results have been compared with a reported classical feature extraction technique. The results have indicated a significant boost in the performance both in terms of minimizing features and maximizing classification accuracy.
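    The subset cost described above, the leave-one-out correct-classification rate of a 1-NN classifier, can be sketched as below. Only the cost function is shown; the tabu search that would call it is not implemented here. The function name, the squared Euclidean distance, and the toy data are assumptions made for illustration.

```python
def loo_1nn_accuracy(X, y, subset):
    """Leave-one-out accuracy of a 1-NN classifier restricted to `subset`.

    X is a list of feature rows, y the labels, and subset the feature
    indices under evaluation. Squared Euclidean distance is assumed.
    """
    def dist2(a, b):
        return sum((a[j] - b[j]) ** 2 for j in subset)

    correct = 0
    for i in range(len(X)):
        # Nearest neighbour among all samples except the held-out one.
        nearest = min(
            (k for k in range(len(X)) if k != i),
            key=lambda k: dist2(X[i], X[k]),
        )
        correct += y[nearest] == y[i]
    return correct / len(X)


# Toy data: feature 0 is informative, feature 1 is misleading.
X = [[0.0, 9.0], [0.1, 1.0], [1.0, 8.8], [1.1, 1.2]]
y = [0, 0, 1, 1]

print(loo_1nn_accuracy(X, y, [0]))
print(loo_1nn_accuracy(X, y, [1]))
```

    A search procedure would compare such scores across candidate subsets and prefer smaller subsets when accuracy ties.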

  20. The Evaluation of Microcarcinoma in Differentiated Thyroid Cancers According to Old and New TNM Classification

    Directory of Open Access Journals (Sweden)

    Zekiye Hasbek

    2011-12-01

    Full Text Available Objective: In this study, we aimed to evaluate the tumor size for proximal and distant metastases when the new and old TNM classifications are taken into account in differentiated thyroid cancers. Material and Methods: Two hundred sixty-eight patients diagnosed with thyroid carcinoma, who underwent bilateral total or subtotal thyroidectomy and were treated with high doses of I-131, were examined retrospectively. The data of these patients were compared after classification, according to tumor size 1 cm. In the same group, according to the revised TNM classification, in 149 of 207 patients (72%) the tumor size was 2 cm. Of 187 patients with negative lymph nodes, 15 (8%) showed abnormal activity accumulation in the first post I-131 treatment whole-body scan, as did 10 (40%) of 25 patients with positive lymph node involvement (p<0.05). Conclusion: Since the treatment of patients with microcarcinoma is controversial, tumor size should not be the only factor considered in patients with differentiated thyroid cancer. Tissue tumor invasion, age, gender and multifocality should also be taken into account. (MIRT2011;20:94-99)

  1. [Postoperative results under the new stage classification of lung cancer: the additional reports for those of JACS in 1996].

    Science.gov (United States)

    Shirakusa, T

    2000-10-01

    In 3008 lung cancer patients, postoperative results were analyzed under the new stage grouping of the TNM classification. All of these patients underwent surgery in 1989, and their 5-year survival rates had been surveyed in 1996 by JACS (The Japanese Association for Chest Surgery). Under the new TNM classification established worldwide in 1996, T3N0M0 was transferred from stage IIIA to IIB. This additional report focuses on the results associated with that change in the TNM classification.

  2. Automated segmentation of atherosclerotic histology based on pattern classification

    Directory of Open Access Journals (Sweden)

    Arna van Engelen

    2013-01-01

    Full Text Available Background: Histology sections provide accurate information on atherosclerotic plaque composition, and are used in various applications. To our knowledge, no automated systems for plaque component segmentation in histology sections currently exist. Materials and Methods: We perform pixel-wise classification of fibrous, lipid, and necrotic tissue in Elastica Von Gieson-stained histology sections, using features based on color channel intensity and local image texture and structure. We compare an approach where we train on independent data to an approach where we train on one or two sections per specimen in order to segment the remaining sections. We evaluate the results on segmentation accuracy in histology, and we use the obtained histology segmentations to train plaque component classification methods in ex vivo magnetic resonance imaging (MRI) and in vivo MRI and computed tomography (CT). Results: In leave-one-specimen-out experiments on 176 histology slices of 13 plaques, a pixel-wise accuracy of 75.7 ± 6.8% was obtained. This increased to 77.6 ± 6.5% when two manually annotated slices of the specimen to be segmented were used for training. Rank correlations of relative component volumes with manually annotated volumes were high in this situation (P = 0.82-0.98). Using the obtained histology segmentations to train plaque component classification methods in ex vivo MRI and in vivo MRI and CT resulted in similar image segmentations for training on the automated histology segmentations as for training on a fully manual ground truth. The size of the lipid-rich necrotic core was significantly smaller when training on fully automated histology segmentations than when manually annotated histology sections were used. This difference was reduced and not statistically significant when one or two slices per section were manually annotated for histology segmentation. Conclusions: Good histology segmentations can be obtained by automated segmentation

  3. Classification of types of stuttering symptoms based on brain activity.

    Science.gov (United States)

    Jiang, Jing; Lu, Chunming; Peng, Danling; Zhu, Chaozhe; Howell, Peter

    2012-01-01

    Among the non-fluencies seen in speech, some are more typical (MT) of stuttering speakers, whereas others are less typical (LT) and are common to both stuttering and fluent speakers. No neuroimaging work has evaluated the neural basis for grouping these symptom types. Another long-debated issue is which type (LT, MT) whole-word repetitions (WWR) should be placed in. In this study, a sentence completion task was performed by twenty stuttering patients who were scanned using an event-related design. This task elicited stuttering in these patients. Each stuttered trial from each patient was sorted into the MT or LT types with WWR put aside. Pattern classification was employed to train a patient-specific single trial model to automatically classify each trial as MT or LT using the corresponding fMRI data. This model was then validated by using test data that were independent of the training data. In a subsequent analysis, the classification model, just established, was used to determine which type the WWR should be placed in. The results showed that the LT and the MT could be separated with high accuracy based on their brain activity. The brain regions that made most contribution to the separation of the types were: the left inferior frontal cortex and bilateral precuneus, both of which showed higher activity in the MT than in the LT; and the left putamen and right cerebellum which showed the opposite activity pattern. The results also showed that the brain activity for WWR was more similar to that of the LT and fluent speech than to that of the MT. These findings provide a neurological basis for separating the MT and the LT types, and support the widely-used MT/LT symptom grouping scheme. In addition, WWR play a similar role as the LT, and thus should be placed in the LT type.

  4. Classification of types of stuttering symptoms based on brain activity.

    Directory of Open Access Journals (Sweden)

    Jing Jiang

    Full Text Available Among the non-fluencies seen in speech, some are more typical (MT) of stuttering speakers, whereas others are less typical (LT) and are common to both stuttering and fluent speakers. No neuroimaging work has evaluated the neural basis for grouping these symptom types. Another long-debated issue is which type (LT, MT) whole-word repetitions (WWR) should be placed in. In this study, a sentence completion task was performed by twenty stuttering patients who were scanned using an event-related design. This task elicited stuttering in these patients. Each stuttered trial from each patient was sorted into the MT or LT types with WWR put aside. Pattern classification was employed to train a patient-specific single trial model to automatically classify each trial as MT or LT using the corresponding fMRI data. This model was then validated by using test data that were independent of the training data. In a subsequent analysis, the classification model, just established, was used to determine which type the WWR should be placed in. The results showed that the LT and the MT could be separated with high accuracy based on their brain activity. The brain regions that made most contribution to the separation of the types were: the left inferior frontal cortex and bilateral precuneus, both of which showed higher activity in the MT than in the LT; and the left putamen and right cerebellum which showed the opposite activity pattern. The results also showed that the brain activity for WWR was more similar to that of the LT and fluent speech than to that of the MT. These findings provide a neurological basis for separating the MT and the LT types, and support the widely-used MT/LT symptom grouping scheme. In addition, WWR play a similar role as the LT, and thus should be placed in the LT type.

  5. Sequence-based classification using discriminatory motif feature selection.

    Directory of Open Access Journals (Sweden)

    Hao Xiong

    Full Text Available Most existing methods for sequence-based classification use exhaustive feature generation, employing, for example, all k-mer patterns. The motivation behind such (enumerative) approaches is to minimize the potential for overlooking important features. However, there are shortcomings to this strategy. First, practical constraints limit the scope of exhaustive feature generation to patterns of length ≤ k, such that potentially important, longer (> k) predictors are not considered. Second, features so generated exhibit strong dependencies, which can complicate understanding of derived classification rules. Third, and most importantly, numerous irrelevant features are created. These concerns can compromise prediction and interpretation. While remedies have been proposed, they tend to be problem-specific and not broadly applicable. Here, we develop a generally applicable methodology, and an attendant software pipeline, that is predicated on discriminatory motif finding. In addition to the traditional training and validation partitions, our framework entails a third level of data partitioning, a discovery partition. A discriminatory motif finder is used on sequences and associated class labels in the discovery partition to yield a (small) set of features. These features are then used as inputs to a classifier in the training partition. Finally, performance assessment occurs on the validation partition. Important attributes of our approach are its modularity (any discriminatory motif finder and any classifier can be deployed) and its universality (all data, including sequences that are unaligned and/or of unequal length, can be accommodated). We illustrate our approach on two nucleosome occupancy datasets and a protein solubility dataset, previously analyzed using enumerative feature generation. Our method achieves excellent performance results, with and without optimization of classifier tuning parameters. A Python pipeline implementing the approach is
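    For contrast with the motif-based approach, the enumerative baseline that the abstract criticizes can be sketched as follows: every possible k-mer becomes one feature, so the feature count grows as 4^k for DNA. This is an illustrative sketch only; `kmer_counts` is a hypothetical helper and is not part of the authors' pipeline.

```python
from collections import Counter
from itertools import product


def kmer_counts(sequence, k, alphabet="ACGT"):
    """Count every possible k-mer over `alphabet` in `sequence`.

    The result has len(alphabet) ** k entries, ordered as produced by
    itertools.product, whether or not a given k-mer occurs.
    """
    observed = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    return [observed["".join(p)] for p in product(alphabet, repeat=k)]


features = kmer_counts("ACGTAC", 2)
print(len(features))  # number of enumerated 2-mer features
print(features[1])    # count for "AC", the second 2-mer in product order

# The feature space grows exponentially with k.
for k in (2, 4, 6):
    print(k, len(kmer_counts("ACGTAC", k)))
```

    Most of these enumerated features are zero for any single sequence, which is the redundancy a small, discovered motif set is meant to avoid.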

  6. An innovative blazar classification based on radio jet kinematics

    Science.gov (United States)

    Hervet, O.; Boisson, C.; Sol, H.

    2016-07-01

    Context. Blazars are usually classified following their synchrotron peak frequency (νF(ν) scale) as high, intermediate, low frequency peaked BL Lacs (HBLs, IBLs, LBLs), and flat spectrum radio quasars (FSRQs), or, according to their radio morphology at large scale, FR I or FR II. However, the diversity of blazars is such that these classes seem insufficient to chart the specific properties of each source. Aims: We propose to classify a wide sample of blazars following the kinematic features of their radio jets seen in very long baseline interferometry (VLBI). Methods: For this purpose we use public data from the MOJAVE collaboration in which we select a sample of blazars with known redshift and sufficient monitoring to constrain apparent velocities. We selected 161 blazars from a sample of 200 sources. We identify three distinct classes of VLBI jets depending on radio knot kinematics: class I with quasi-stationary knots, class II with knots in relativistic motion from the radio core, and class I/II, intermediate, showing quasi-stationary knots at the jet base and relativistic motions downstream. Results: A notable result is the good overlap of this kinematic classification with the usual spectral classification; class I corresponds to HBLs, class II to FSRQs, and class I/II to IBLs/LBLs. We deepen this study by characterizing the physical parameters of jets from VLBI radio data. Hence we focus on the singular case of the class I/II by the study of the blazar BL Lac itself. Finally we show how the interpretation that radio knots are recollimation shocks is fully appropriate to describe the characteristics of these three classes.

  7. Research and Application of Human Capital Strategic Classification Tool: Human Capital Classification Matrix Based on Biological Natural Attribute

    Directory of Open Access Journals (Sweden)

    Yong Liu

    2014-12-01

    Full Text Available To study the causes of weak strategic classification management of human capital structure in China, we observe that enterprises around the world face increasing difficulty in human capital management. To provide strategically sound answers, HR managers need the critical information supplied by the right processing technology and analytical tools. Formal organizations contain different types and levels of human capital, which do not contribute equally to the organization. An important guarantee of the sustained and healthy development of a formal or informal organization is lower human capital risk. Resisting this risk depends primarily on the hedging and value-appreciation capacity of human capital, which in turn depends largely on the strategic value of senior managers' performance. From the perspective of high-level managers, we also discuss the value and configuration principles and methods to be followed in the strategic classification of human capital based on the Boston Consulting Group (BCG) matrix, and build a Human Capital Classification (HCC) matrix based on biological natural attributes to effectively realize the strategic classification of human capital structure.

  8. Automated classification of mouse pup isolation syllables: from cluster analysis to an Excel-based "mouse pup syllable classification calculator".

    Science.gov (United States)

    Grimsley, Jasmine M S; Gadziola, Marie A; Wenstrup, Jeffrey J

    2012-01-01

    Mouse pups vocalize at high rates when they are cold or isolated from the nest. The proportions of each syllable type produced carry information about disease state and are being used as behavioral markers for the internal state of animals. Manual classifications of these vocalizations identified 10 syllable types based on their spectro-temporal features. However, manual classification of mouse syllables is time consuming and vulnerable to experimenter bias. This study uses an automated cluster analysis to identify acoustically distinct syllable types produced by CBA/CaJ mouse pups, and then compares the results to prior manual classification methods. The cluster analysis identified two syllable types, based on their frequency bands, that have continuous frequency-time structure, and two syllable types featuring abrupt frequency transitions. Although cluster analysis computed fewer syllable types than manual classification, the clusters represented well the probability distributions of the acoustic features within syllables. These probability distributions indicate that some of the manually classified syllable types are not statistically distinct. The characteristics of the four classified clusters were used to generate a Microsoft Excel-based mouse syllable classifier that rapidly categorizes syllables, with over a 90% match, into the syllable types determined by cluster analysis.

  9. Clinical application of a microfluidic chip for immunocapture and quantification of circulating exosomes to assist breast cancer diagnosis and molecular classification

    Science.gov (United States)

    Fang, Shimeng; Tian, Hongzhu; Li, Xiancheng; Jin, Dong; Li, Xiaojie; Kong, Jing; Yang, Chun; Yang, Xuesong; Lu, Yao; Luo, Yong; Lin, Bingcheng; Niu, Weidong

    2017-01-01

    Exosomes have attracted increasing attention in blood-based diagnosis because cancer cells release more exosomes into serum than normal cells and these exosomes overexpress a number of cancer-related biomarkers. However, capture and biomarker analysis of exosomes for clinical application are technically challenging. In this study, we developed a microfluidic chip for immunocapture and quantification of circulating exosomes from small sample volumes and applied this device in a clinical study. Circulating EpCAM-positive exosomes were measured in 6 breast cancer patients and 3 healthy controls to assist diagnosis. A significant increase in the EpCAM-positive exosome level was detected in these patients compared to healthy controls. Furthermore, we quantified circulating HER2-positive exosomes in 19 breast cancer patients for molecular classification. We demonstrated that the exosomal HER2 expression levels were largely consistent with those in tumor tissues assessed by immunohistochemical staining. The microfluidic chip might provide a new platform to assist breast cancer diagnosis and molecular classification. PMID:28369094

  10. Hyperspectral remote sensing image classification based on decision level fusion

    Institute of Scientific and Technical Information of China (English)

    Peijun Du; Wei Zhang; Junshi Xia

    2011-01-01

    To apply decision level fusion to hyperspectral remote sensing (HRS) image classification, three decision level fusion strategies are experimented on and compared, namely, linear consensus algorithm, improved evidence theory, and the proposed support vector machine (SVM) combiner. To evaluate the effects of the input features on classification performance, four schemes are used to organize input features for member classifiers. In the experiment, by using the operational modular imaging spectrometer (OMIS) II HRS image, the decision level fusion is shown as an effective way for improving the classification accuracy of the HRS image, and the proposed SVM combiner is especially suitable for decision level fusion. The results also indicate that the optimization of input features can improve the classification performance.
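    The abstract does not spell out its linear consensus algorithm, so the following is only a minimal sketch of one common form of decision level fusion: a weighted linear combination of the class probabilities produced by member classifiers. The function name, the weights, and the probability values are hypothetical.

```python
def linear_consensus(probabilities, weights):
    """Fuse per-classifier class probabilities by a weighted average.

    probabilities: one list of class probabilities per member classifier.
    weights: one non-negative weight per member classifier (assumed to
    reflect, for example, each member's validation accuracy).
    Returns (index of the winning class, fused probability vector).
    """
    total = sum(weights)
    n_classes = len(probabilities[0])
    fused = [
        sum(w * p[c] for w, p in zip(weights, probabilities)) / total
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=lambda c: fused[c]), fused


# Three member classifiers scoring one pixel over two classes.
member_outputs = [[0.6, 0.4], [0.3, 0.7], [0.2, 0.8]]

label_equal, fused_equal = linear_consensus(member_outputs, [1, 1, 1])
label_weighted, fused_weighted = linear_consensus(member_outputs, [8, 1, 1])
print(label_equal, label_weighted)
```

    With equal weights the two dissenting members win; a large weight on the first member reverses the decision, which shows how sensitive this rule is to the choice of weights.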

  11. Text Passage Retrieval Based on Colon Classification: Retrieval Performance.

    Science.gov (United States)

    Shepherd, Michael A.

    1981-01-01

    Reports the results of experiments using colon classification for the analysis, representation, and retrieval of primary information from the full text of documents. Recall, precision, and search length measures indicate colon classification did not perform significantly better than Boolean or simple word occurrence systems. Thirteen references…

  12. Text Classification Retrieval Based on Complex Network and ICA Algorithm

    Directory of Open Access Journals (Sweden)

    Hongxia Li

    2013-08-01

    Full Text Available With the development of computer science and information technology, libraries are becoming increasingly digital and networked. Library digitization converts books into digital information, and high-quality preservation and management are achieved through computer technology and text classification techniques, which enables knowledge appreciation. This paper introduces complex network theory into the text classification process and puts forward an ICA semantic clustering algorithm, which performs independent component analysis for complex network text classification. The ICA clustering algorithm extracts clusters of characteristic words for text classification and improves the visualization of text retrieval. Finally, we compare the collocation algorithm and the ICA clustering algorithm through text classification and keyword search experiments, and report the clustering degree and accuracy of each algorithm. Simulation analysis shows that the ICA clustering algorithm improves the clustering degree of text classification by 1.2% and that accuracy can be improved by up to 11.1%. It improves the efficiency and accuracy of text classification retrieval. It also provides a theoretical reference for text retrieval classification of eBook

  13. Gene expression-based classifications of fibroadenomas and phyllodes tumours of the breast.

    Science.gov (United States)

    Vidal, Maria; Peg, Vicente; Galván, Patricia; Tres, Alejandro; Cortés, Javier; Ramón y Cajal, Santiago; Rubio, Isabel T; Prat, Aleix

    2015-06-01

    Fibroepithelial tumors (FTs) of the breast are a heterogeneous group of lesions ranging from fibroadenomas (FAD) to phyllodes tumors (PT) (benign, borderline, malignant). Further understanding of their molecular features and classification might be of clinical value. In this study, we analysed the expression of 105 breast cancer-related genes, including the 50 genes of the PAM50 intrinsic subtype predictor and 12 genes of the Claudin-low subtype predictor, in a panel of 75 FTs (34 FADs, 5 juvenile FADs, 20 benign PTs, 5 borderline PTs and 11 malignant PTs) with clinical follow-up. In addition, we compared the expression profiles of FTs with those of 14 normal breast tissues and 49 primary invasive ductal carcinomas (IDCs). Our results revealed that the levels of expression of all breast cancer-related genes can discriminate the various groups of FTs, together with normal breast tissues and IDCs (False Discovery Rate expression of proliferation-related genes (e.g. CCNB1 and MKI67) and mesenchymal/epithelial-related (e.g. CLDN3 and EPCAM) genes were found to be most discriminative. As expected, FADs showed the highest and lowest expression of epithelial- and proliferation-related genes, respectively, whereas malignant PTs showed the opposite expression pattern. Interestingly, the overall profile of benign PTs was found more similar to FADs and normal breast tissues than the rest of tumours, including juvenile FADs. Within the dataset of IDCs and normal breast tissues, the vast majority of FADs, juvenile FADs, benign PTs and borderline PTs were identified as Normal-like by intrinsic breast cancer subtyping, whereas 7 (63.6%) and 3 (27.3%) malignant PTs were identified as Claudin-low and Basal-like, respectively. Finally, we observed that the previously described PAM50 risk of relapse prognostic score better predicted outcome in FTs than the morphological classification, even within PTs-only. Our results suggest that classification of FTs using gene expression-based

  14. A spectral-spatial kernel-based method for hyperspectral imagery classification

    Science.gov (United States)

    Li, Li; Ge, Hongwei; Gao, Jianqiang

    2017-02-01

    Spectral-based classification methods have gained increasing attention in hyperspectral imagery classification. Nevertheless, spectral information alone cannot fully represent the inherent spatial distribution of the imagery. In this paper, a spectral-spatial kernel-based method for hyperspectral imagery classification is proposed. First, the spatial feature is extracted using area median filtering (AMF). Second, the result of the AMF is used to construct spatial feature patches according to different window sizes. Finally, using the kernel technique, the spectral feature and the spatial feature are jointly used for classification through a support vector machine (SVM) formulation. The proposed method is therefore called the spectral-spatial kernel-based support vector machine (SSF-SVM). To evaluate the proposed method, experiments are performed on three hyperspectral images. The experimental results show that the proposed technique can yield an improvement in most real-world classification problems.

  15. Hydrological landscape classification: investigating the performance of HAND based landscape classifications in a central European meso-scale catchment

    Directory of Open Access Journals (Sweden)

    S. Gharari

    2011-11-01

    Full Text Available This paper presents a detailed performance and sensitivity analysis of a recently developed hydrological landscape classification method based on dominant runoff mechanisms. Three landscape classes are distinguished: wetland, hillslope and plateau, corresponding to three dominant hydrological regimes: saturation excess overland flow, storage excess sub-surface flow, and deep percolation. Topography, geology and land use hold the key to identifying these landscapes. The height above the nearest drainage (HAND) and the surface slope, which can be easily obtained from a digital elevation model, appear to be the dominant topographical controls for hydrological classification. In this paper several indicators for classification are tested as well as their sensitivity to scale and resolution of observed points (sample size). The best results are obtained by the simple use of HAND and slope. The results obtained compared well with the topographical wetness index. The HAND based landscape classification appears to be an efficient method to "read the landscape" on the basis of which conceptual models can be developed.
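
    A two-threshold rule on HAND and slope could look roughly like the sketch below. The rule structure (low HAND gives wetland, high HAND splits by slope) and both threshold values are illustrative assumptions; the abstract above does not state them.

```python
def classify_cell(hand_m, slope, hand_threshold=5.0, slope_threshold=0.1):
    """Assign a landscape class to one grid cell.

    hand_m: height above the nearest drainage, in metres.
    slope:  local surface slope (dimensionless gradient).
    Both thresholds are hypothetical placeholders, not calibrated values.
    """
    if hand_m < hand_threshold:
        return "wetland"      # close to drainage level
    if slope >= slope_threshold:
        return "hillslope"    # elevated and steep
    return "plateau"          # elevated and flat
```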

  16. Data Stream Classification Based on the Gamma Classifier

    Directory of Open Access Journals (Sweden)

    Abril Valeria Uriarte-Arcia

    2015-01-01

    Full Text Available The ever increasing data generation confronts us with the problem of handling online massive amounts of information. One of the biggest challenges is how to extract valuable information from these massive continuous data streams during single scanning. In a data stream context, data arrive continuously at high speed; therefore the algorithms developed to address this context must be efficient regarding memory and time management and capable of detecting changes over time in the underlying distribution that generated the data. This work describes a novel method for the task of pattern classification over a continuous data stream based on an associative model. The proposed method is based on the Gamma classifier, which is inspired by the Alpha-Beta associative memories, which are both supervised pattern recognition models. The proposed method is capable of handling the space and time constraints inherent to data stream scenarios. The Data Streaming Gamma classifier (DS-Gamma classifier) implements a sliding window approach to provide concept drift detection and a forgetting mechanism. In order to test the classifier, several experiments were performed using different data stream scenarios with real and synthetic data streams. The experimental results show that the method exhibits competitive performance when compared to other state-of-the-art algorithms.

  17. Comprehensive Study on Lexicon-based Ensemble Classification Sentiment Analysis

    Directory of Open Access Journals (Sweden)

    Łukasz Augustyniak

    2015-12-01

    Full Text Available We propose a novel method for counting sentiment orientation that outperforms supervised learning approaches in time and memory complexity and is not statistically significantly different from them in accuracy. Our method consists of a novel approach to generating unigram, bigram and trigram lexicons. The proposed method, called frequentiment, is based on calculating the frequency of features (words) in the document and averaging their impact on the sentiment score as opposed to documents that do not contain these features. Afterwards, we use ensemble classification to improve the overall accuracy of the method. What is important is that the frequentiment-based lexicons with sentiment threshold selection outperform other popular lexicons and some supervised learners, while being 3–5 times faster than the supervised approach. We compare 37 methods (lexicons, ensembles with lexicon’s predictions as input and supervised learners) applied to 10 Amazon review data sets and provide the first statistical comparison of the sentiment annotation methods that include ensemble approaches. It is one of the most comprehensive comparisons of domain sentiment analysis in the literature.

  18. A texton-based approach for the classification of lung parenchyma in CT images

    DEFF Research Database (Denmark)

    Gangeh, Mehrdad J.; Sørensen, Lauge; Shaker, Saher B.

    2010-01-01

    In this paper, a texton-based classification system based on raw pixel representation along with a support vector machine with radial basis function kernel is proposed for the classification of emphysema in computed tomography images of the lung. The proposed approach is tested on 168 annotated...

  19. Classification of Polarimetric SAR Image Based on the Subspace Method

    Science.gov (United States)

    Xu, J.; Li, Z.; Tian, B.; Chen, Q.; Zhang, P.

    2013-07-01

    Land cover classification is one of the most significant applications in remote sensing. Compared to optical sensing technologies, synthetic aperture radar (SAR) can penetrate through clouds and has all-weather capabilities. Therefore, land cover classification for SAR images is important in remote sensing. The subspace method is a novel method for SAR data, which reduces data dimensionality by incorporating feature extraction into the classification process. This paper uses the averaged learning subspace method (ALSM), which can be applied to fully polarimetric SAR images for classification. The ALSM algorithm integrates three-component decomposition, eigenvalue/eigenvector decomposition and textural features derived from the gray-level cooccurrence matrix (GLCM). The study site is located in Dingxing County, Hebei Province, China. We compare the subspace method with the traditional supervised Wishart classification. By conducting experiments on a fully polarimetric Radarsat-2 image, we conclude that the proposed method yields higher classification accuracy. Therefore, the ALSM classification method is a feasible alternative method for SAR images.

  20. An approach for mechanical fault classification based on generalized discriminant analysis

    Institute of Scientific and Technical Information of China (English)

    LI Wei-hua; SHI Tie-lin; YANG Shu-zi

    2006-01-01

    To deal with pattern classification of complicated mechanical faults, an approach to multi-faults classification based on generalized discriminant analysis is presented. Compared with linear discriminant analysis (LDA), generalized discriminant analysis (GDA), one of nonlinear discriminant analysis methods, is more suitable for classifying the linear non-separable problem. The connection and difference between KPCA (Kernel Principal Component Analysis) and GDA is discussed. KPCA is good at detection of machine abnormality while GDA performs well in multi-faults classification based on the collection of historical faults symptoms. When the proposed method is applied to air compressor condition classification and gear fault classification, an excellent performance in complicated multi-faults classification is presented.

  1. Rapid Occupant Classification System Based Rough Sets Theory

    Directory of Open Access Journals (Sweden)

    Lin Chen

    2012-09-01

    Full Text Available In an intelligent airbag system, correct classification of the occupant type is a precondition for controlling the airbag release time and inflation strength during an accident. This paper proposes a novel rapid occupant classification system in which tens of pressure sensors collect pressure distribution data in real time, and rough sets theory is then applied to extract classification knowledge from the data features. Experiments have been conducted to verify its efficiency and effectiveness.

  2. A NEW SVM BASED EMOTIONAL CLASSIFICATION OF IMAGE

    Institute of Scientific and Technical Information of China (English)

    Wang Weining; Yu Yinglin; Zhang Jianchao

    2005-01-01

    How high-level emotional representation of art paintings can be inferred from perceptual level features suited for the particular classes (dynamic vs. static classification) is presented. The key points are feature selection and classification. According to the strong relationship between notable lines of image and human sensations, a novel feature vector WLDLV (Weighted Line Direction-Length Vector) is proposed, which includes both orientation and length information of lines in an image. Classification is performed by SVM (Support Vector Machine) and images can be classified into dynamic and static. Experimental results demonstrate the effectiveness and superiority of the algorithm.

  3. Identification of area-level influences on regions of high cancer incidence in Queensland, Australia: a classification tree approach

    Directory of Open Access Journals (Sweden)

    Mengersen Kerrie L

    2011-07-01

    Full Text Available Background: Strategies for cancer reduction and management are targeted at both individual and area levels. Area-level strategies require careful understanding of geographic differences in cancer incidence, in particular the association with factors such as socioeconomic status, ethnicity and accessibility. This study aimed to identify the complex interplay of area-level factors associated with high area-specific incidence of Australian priority cancers using a classification and regression tree (CART) approach. Methods: Area-specific smoothed standardised incidence ratios were estimated for priority-area cancers across 478 statistical local areas in Queensland, Australia (1998-2007, n = 186,075). For those cancers with significant spatial variation, CART models were used to identify whether area-level accessibility, socioeconomic status and ethnicity were associated with high area-specific incidence. Results: The accessibility of a person's residence had the most consistent association with the risk of cancer diagnosis across the specific cancers. Many cancers were likely to have high incidence in more urban areas, although male lung cancer and cervical cancer tended to have high incidence in more remote areas. The impact of socioeconomic status and ethnicity on these associations differed by type of cancer. Conclusions: These results highlight the complex interactions between accessibility, socioeconomic status and ethnicity in determining cancer incidence risk.

  4. An Efficient Method for Landscape Image Classification and Matching Based on MPEG-7 Descriptors

    OpenAIRE

    2011-01-01

    This thesis presents an efficient approach to landscape image classification and matching based on the MPEG-7 (Moving Picture Experts Group) color and shape descriptors. Image classification is the task of deciding whether an image is a landscape or not. The classification uses the dominant color descriptor (DCD) method to find the dominant color in the image. In DCD, all pixel values of the image are examined. Each pixel value contains red, green and blue color values in the RGB color model. After calcul...

  5. Analysis on Design of Kohonen-network System Based on Classification of Complex Signals

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    The key methods of detection and classification of the electroencephalogram (EEG) used in recent years are introduced. Taking EEG for example, the design plan of Kohonen neural network system based on detection and classification of complex signals is proposed, and both the network design and signal processing are analyzed, including pre-processing of signals, extraction of signal features, classification of signal and network topology, etc.

  6. Trace elements based classification on clinkers. Application to Spanish clinkers

    Directory of Open Access Journals (Sweden)

    Tamás, F. D.

    2001-12-01

    Full Text Available The qualitative identification to determine the origin (i.e. manufacturing factory) of Spanish clinkers is described. The classification of clinkers produced in different factories can be based on their trace element content. Approximately fifteen clinker sorts are analysed, collected from 11 Spanish cement factories to determine their Mg, Sr, Ba, Mn, Ti, Zr, Zn and V content. An expert system formulated by a binary decision tree is designed based on the collected data. The performance of the obtained classifier was measured by ten-fold cross validation. The results show that the proposed method is useful to identify an easy-to-use expert system that is able to determine the origin of the clinker based on its trace element content.

  7. Knowledge-based sea ice classification by polarimetric SAR

    DEFF Research Database (Denmark)

    Skriver, Henning; Dierking, Wolfgang

    2004-01-01

    Polarimetric SAR images acquired at C- and L-band over sea ice in the Greenland Sea, Baltic Sea, and Beaufort Sea have been analysed with respect to their potential for ice type classification. The polarimetric data were gathered by the Danish EMISAR and the US AIRSAR which both are airborne...... systems. A hierarchical classification scheme was chosen for sea ice because our knowledge about magnitudes, variations, and dependences of sea ice signatures can be directly considered. The optimal sequence of classification rules and the rules themselves depend on the ice conditions/regimes. The use...... of the polarimetric phase information improves the classification only in the case of thin ice types but is not necessary for thicker ice (above about 30 cm thickness)...

  8. [WHO classification 2016 and first S3 guidelines on renal cell cancer: What is important for the practice?].

    Science.gov (United States)

    Moch, H

    2016-03-01

    The first S3 guidelines on renal cell cancer cover the practical aspects of imaging, diagnostics and therapy as well as the clinical relevance of pathology reporting. This review summarizes the changes in renal tumor classification and the new recommendations for reporting renal cell tumors. The S3 guidelines recommend the 2016 World Health Organization (WHO) classification of renal cell tumors. Novel renal cell tumor entities and provisional or emerging renal cell tumor entities of the 2016 WHO classification of renal tumors are discussed. The S3 guidelines for renal cell cancer also recommend the use of the WHO/International Society of Urologic Pathology (ISUP) grading system for clear cell and for papillary renal cell carcinomas, which replaces the previously used Fuhrman grading system.

  9. Plant Electrical Signal Classification Based on Waveform Similarity

    Directory of Open Access Journals (Sweden)

    Yang Chen

    2016-10-01

    Full Text Available (1) Background: Plant electrical signals are important physiological traits which reflect plant physiological state. As a kind of phenotypic data, plant action potential (AP) evoked by external stimuli (e.g., electrical stimulation, environmental stress) may be associated with inhibition of gene expression related to stress tolerance. However, plant AP is a response to environment changes and full of variability. It is an aperiodic signal with refractory period, discontinuity, noise, and artifacts. In consequence, there are still challenges to automatically recognize and classify plant AP; (2) Methods: Therefore, we proposed an AP recognition algorithm based on dynamic difference threshold to extract all waveforms similar to AP. Next, an incremental template matching algorithm was used to classify the AP and non-AP waveforms; (3) Results: Experiment results indicated that the template matching algorithm achieved a classification rate of 96.0%, and it was superior to backpropagation artificial neural networks (BP-ANNs), support vector machine (SVM) and deep learning method; (4) Conclusion: These findings imply that the proposed methods are likely to expand possibilities for rapidly recognizing and classifying plant action potentials in the database in the future.

  10. Radar-Derived Quantitative Precipitation Estimation Based on Precipitation Classification

    Directory of Open Access Journals (Sweden)

    Lili Yang

    2016-01-01

    Full Text Available A method for improving radar-derived quantitative precipitation estimation is proposed. Tropical vertical profiles of reflectivity (VPRs) are first determined from multiple VPRs. Upon identifying a tropical VPR, the event can be further classified as either tropical-stratiform or tropical-convective rainfall by a fuzzy logic (FL) algorithm. Based on the precipitation-type fields, the reflectivity values are converted into rainfall rate using a Z-R relationship. In order to evaluate the performance of this rainfall classification scheme, three experiments were conducted using three months of data and two study cases. In Experiment I, the Weather Surveillance Radar-1988 Doppler (WSR-88D) default Z-R relationship was applied. In Experiment II, the precipitation regime was separated into convective and stratiform rainfall using the FL algorithm, and corresponding Z-R relationships were used. In Experiment III, the precipitation regime was separated into convective, stratiform, and tropical rainfall, and the corresponding Z-R relationships were applied. The results show that the rainfall rates obtained from all three experiments match closely with the gauge observations, although Experiment II mitigated the underestimation seen in Experiment I. Experiment III significantly reduced this underestimation and generated the most accurate radar estimates of rain rate among the three experiments.
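
    A Z-R relationship has the power-law form Z = a R^b, with Z in mm^6 m^-3 and R in mm/h, so a type-dependent conversion can be sketched as below. The convective pair (300, 1.4) is the WSR-88D default; the stratiform and tropical pairs are commonly cited values and are assumptions here, since the abstract above does not list the coefficients used.

```python
import math

# (a, b) pairs for Z = a * R**b. Only the convective pair is the WSR-88D
# default; the other two are commonly cited values assumed for illustration.
ZR_COEFFS = {
    "convective": (300.0, 1.4),
    "stratiform": (200.0, 1.6),   # Marshall-Palmer
    "tropical": (250.0, 1.2),
}

def rain_rate(dbz, precip_type="convective"):
    """Convert reflectivity in dBZ to rain rate in mm/h for a precipitation type."""
    a, b = ZR_COEFFS[precip_type]
    z_linear = 10.0 ** (dbz / 10.0)        # dBZ -> linear Z (mm^6 m^-3)
    return (z_linear / a) ** (1.0 / b)     # invert Z = a * R**b
```

    With these coefficients the same reflectivity maps to a higher rain rate under the tropical relation than under the default one, which is the direction needed to reduce underestimation.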

  11. A tentative classification of paleoweathering formations based on geomorphological criteria

    Science.gov (United States)

    Battiau-Queney, Yvonne

    1996-05-01

    A geomorphological classification is proposed that emphasizes the usefulness of paleoweathering records in any reconstruction of past landscapes. Four main paleoweathering records are recognized: 1. Paleoweathering formations buried beneath a sedimentary or volcanic cover. Most of them are saprolites, sometimes with preserved overlying soils. Ages range from Archean to late Cenozoic times; 2. Paleoweathering formations trapped in karst: some of them have buried pre-existent karst landforms, others have developed simultaneously with the subjacent karst; 3. Relict paleoweathering formations: although inherited, they belong to the present landscape. Some of them are indurated (duricrusts, silcretes, ferricretes,…); others are not and owe their preservation to a stable morphotectonic environment; 4. Polyphased weathering mantles: weathering has taken place in changing geochemical conditions. After examples of each type are provided, the paper considers the relations between chemical weathering and landform development. The climatic significance of paleoweathering formations is discussed. Some remote morphogenic systems have no present equivalent. It is doubtful that chemical weathering alone might lead to widespread planation surfaces. Moreover, classical theories based on sea-level and rivers as the main factors of erosion are not really adequate to explain the observed landscapes.

  12. Classification of CT brain images based on deep learning networks.

    Science.gov (United States)

    Gao, Xiaohong W; Hui, Rui; Tian, Zengmin

    2017-01-01

    While computerised tomography (CT) may have been the first imaging tool to study human brain, it has not yet been implemented into clinical decision making process for diagnosis of Alzheimer's disease (AD). On the other hand, with the nature of being prevalent, inexpensive and non-invasive, CT does present diagnostic features of AD to a great extent. This study explores the significance and impact on the application of the burgeoning deep learning techniques to the task of classification of CT brain images, in particular utilising convolutional neural network (CNN), aiming at providing supplementary information for the early diagnosis of Alzheimer's disease. Towards this end, three categories of CT images (N = 285) are clustered into three groups, which are AD, lesion (e.g. tumour) and normal ageing. In addition, considering the characteristics of this collection with larger thickness along the direction of depth (z) (~3-5 mm), an advanced CNN architecture is established integrating both 2D and 3D CNN networks. The fusion of the two CNN networks is subsequently coordinated based on the average of Softmax scores obtained from both networks consolidating 2D images along spatial axial directions and 3D segmented blocks respectively. As a result, the classification accuracy rates rendered by this elaborated CNN architecture are 85.2%, 80% and 95.3% for classes of AD, lesion and normal respectively with an average of 87.6%. Additionally, this improved CNN network appears to outperform the others when in comparison with 2D version only of CNN network as well as a number of state of the art hand-crafted approaches. As a result, these approaches deliver accuracy rates in percentage of 86.3, 85.6 ± 1.10, 86.3 ± 1.04, 85.2 ± 1.60, 83.1 ± 0.35 for 2D CNN, 2D SIFT, 2D KAZE, 3D SIFT and 3D KAZE respectively. The two major contributions of the paper constitute a new 3-D approach while applying deep learning technique to extract signature information
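
    The score-level fusion described above, averaging the Softmax outputs of two networks, can be sketched as follows. The function names and the raw score vectors are hypothetical; only the averaging step and the three class labels come from the abstract.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(logits_2d, logits_3d, classes=("AD", "lesion", "normal")):
    """Average the softmax scores of two networks and pick the top class."""
    p2 = softmax(logits_2d)
    p3 = softmax(logits_3d)
    avg = [(a + b) / 2.0 for a, b in zip(p2, p3)]
    best = max(range(len(avg)), key=avg.__getitem__)
    return classes[best], avg
```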

  13. IMPROVEMENT OF TCAM-BASED PACKET CLASSIFICATION ALGORITHM

    Institute of Scientific and Technical Information of China (English)

    Xu Zhen; Zhang Jun; Rui Liyang; Sun Jun

    2008-01-01

    The feature of Ternary Content Addressable Memories (TCAMs) makes them particularly attractive for IP address lookup and packet classification applications in a router system. However, the limitations of TCAMs impede their utilization. In this paper, the solutions for decreasing the power consumption and avoiding entry expansion in range matching are addressed. Experimental results demonstrate that the proposed techniques can make some big improvements on the performance of TCAMs in IP address lookup and packet classification.
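
    The entry expansion mentioned above arises because a TCAM matches ternary prefixes, so an arbitrary range must be split into several prefix entries. The sketch below shows that standard range-to-prefix split, i.e. the problem itself and not the technique proposed in the paper; the function name and the default width are assumptions.

```python
def range_to_prefixes(lo, hi, width=16):
    """Split the integer range [lo, hi] into ternary prefixes of `width` bits.

    Each prefix is a string of fixed bits followed by '*' wildcards.
    """
    prefixes = []
    while lo <= hi:
        # Largest power-of-two block aligned at lo (lowest set bit of lo).
        size = lo & -lo if lo else 1 << width
        # Shrink the block until it fits in what is left of the range.
        while size > hi - lo + 1:
            size >>= 1
        bits = size.bit_length() - 1        # number of wildcard bits
        fixed = width - bits                # number of fixed leading bits
        head = format(lo >> bits, "0{}b".format(fixed)) if fixed else ""
        prefixes.append(head + "*" * bits)
        lo += size
    return prefixes
```

    For a 4-bit field, the range 1 to 14 needs six entries, the 2W - 2 worst case for W = 4.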

  14. Reliable classification of two-class cancer data using evolutionary algorithms.

    Science.gov (United States)

    Deb, Kalyanmoy; Raji Reddy, A

    2003-11-01

    In the area of bioinformatics, the identification of gene subsets responsible for classifying available disease samples to two or more of its variants is an important task. Such problems have been solved in the past by means of unsupervised learning methods (hierarchical clustering, self-organizing maps, k-means clustering, etc.) and supervised learning methods (weighted voting approach, k-nearest neighbor method, support vector machine method, etc.). Such problems can also be posed as optimization problems of minimizing gene subset size to achieve reliable and accurate classification. The main difficulties in solving the resulting optimization problem are the availability of only a few samples compared to the number of genes in the samples and the exorbitantly large search space of solutions. Although there exist a few applications of evolutionary algorithms (EAs) for this task, here we treat the problem as a multiobjective optimization problem of minimizing the gene subset size and minimizing the number of misclassified samples. Moreover, for a more reliable classification, we consider multiple training sets in evaluating a classifier. Contrary to the past studies, the use of a multiobjective EA (NSGA-II) has enabled us to discover a smaller gene subset size (such as four or five) to correctly classify 100% or near 100% samples for three cancer samples (Leukemia, Lymphoma, and Colon). We have also extended the NSGA-II to obtain multiple non-dominated solutions discovering as many as 352 different three-gene combinations providing a 100% correct classification to the Leukemia data. In order to have further confidence in the identification task, we have also introduced a prediction strength threshold for determining a sample's belonging to one class or the other. All simulation results show consistent gene subset identifications on three disease samples and exhibit the flexibilities and efficacies in using a multiobjective EA for the gene subset identification task.
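
    With the two objectives named above (gene subset size and number of misclassified samples, both minimized), a candidate is non-dominated if no other candidate is at least as good in both and strictly better in one. The sketch below shows only this dominance filter, not NSGA-II itself; the function names and the sample tuples are made up for illustration.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one.

    Objectives are minimized, e.g. (subset_size, misclassified_samples).
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    better = any(x < y for x, y in zip(a, b))
    return no_worse and better

def non_dominated(points):
    """Keep the points that no other point dominates (the Pareto front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```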

  15. Tweet-based Target Market Classification Using Ensemble Method

    Directory of Open Access Journals (Sweden)

    Muhammad Adi Khairul Anshary

    2016-09-01

    Full Text Available Target market classification is aimed at focusing marketing activities on the right targets. Classification of target markets can be done through data mining and by utilizing data from social media, e.g. Twitter. The end results of data mining are learning models that can classify new data. Ensemble methods can improve the accuracy of the models and therefore provide better results. In this study, classification of target markets was conducted on a dataset of 3000 tweets in order to extract features. Classification models were constructed to manipulate the training data using two ensemble methods (bagging and boosting). To investigate the effectiveness of the ensemble methods, this study used the CART (classification and regression tree) algorithm for comparison. Three categories of consumer goods (computers, mobile phones and cameras) and three categories of sentiments (positive, negative and neutral) were classified towards three target-market categories. Machine learning was performed using Weka 3.6.9. The results of the test data showed that the bagging method improved the accuracy of CART by 1.9% (to 85.20%). On the other hand, for sentiment classification, the ensemble methods were not successful in increasing the accuracy of CART. The results of this study may be taken into consideration by companies who approach their customers through social media, especially Twitter.

  16. Long-term Prostate-specific Antigen Velocity in Improved Classification of Prostate Cancer Risk and Mortality

    DEFF Research Database (Denmark)

    Ørsted, David Dynnes; Bojesen, Stig E; Kamstrup, Pia R;

    2013-01-01

    BACKGROUND: It remains unclear whether adding long-term prostate-specific antigen velocity (PSAV) to baseline PSA values improves classification of prostate cancer (PCa) risk and mortality in the general population. OBJECTIVE: To determine whether long-term PSAV improves classification of PCa risk...... classification was assessed using the net reclassification index (NRI). RESULTS: Age-adjusted hazard ratios for PCa risk and mortality were 2.7-5.3 and 2.3-3.4, respectively, for long-term PSAV when added to models already including baseline PSA values. For PCa risk and mortality, adding long-term PSAV to models....... Correspondingly, inappropriately reclassified were 49 of 10 000 men with PCa and 1658 of 10 000 men with no PCa. CONCLUSIONS: Long-term PSAV in addition to baseline PSA value improves classification of PCa risk and mortality. Applying long-term PSAV nationwide, the ratio of appropriately to inappropriately...

  17. Hierarchical structure for audio-video based semantic classification of sports video sequences

    Science.gov (United States)

    Kolekar, M. H.; Sengupta, S.

    2005-07-01

    A hierarchical structure for sports event classification based on audio and video content analysis is proposed in this paper. Compared to the event classifications in other games, those of cricket are very challenging and yet unexplored. We have successfully solved cricket video classification problem using a six level hierarchical structure. The first level performs event detection based on audio energy and Zero Crossing Rate (ZCR) of short-time audio signal. In the subsequent levels, we classify the events based on video features using a Hidden Markov Model implemented through Dynamic Programming (HMM-DP) using color or motion as a likelihood function. For some of the game-specific decisions, a rule-based classification is also performed. Our proposed hierarchical structure can easily be applied to any other sports. Our results are very promising and we have moved a step forward towards addressing semantic classification problems in general.
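
    The two first-level audio features named above, short-time energy and zero crossing rate, can be computed per frame roughly as follows. This is a minimal sketch; framing, windowing and the decision thresholds used in the paper are not given in the abstract and are left out.

```python
def short_time_energy(frame):
    """Mean squared amplitude of one audio frame (list of samples)."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)
```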

  18. Object Classification based Context Management for Identity Management in Internet of Things

    DEFF Research Database (Denmark)

    Mahalle, Parikshit N.; Prasad, Neeli R.; Prasad, Ramjee

    2013-01-01

    , and there is a need of context-aware access control solution for IdM. Confronting uncertainty of different types of objects in IoT is not easy. This paper presents the logical framework for object classification in context aware IoT, as richer contextual information creates an impact on the access control. This paper...... proposes decision theory based object classification to provide contextual information and context management. Simulation results show that the proposed object classification is useful to improve network lifetime. Results also give motivation of object classification in terms of energy consumption...

  19. Speech/Music Classification Enhancement for 3GPP2 SMV Codec Based on Support Vector Machine

    Science.gov (United States)

    Kim, Sang-Kyun; Chang, Joon-Hyuk

    In this letter, we propose a novel approach to speech/music classification based on the support vector machine (SVM) to improve the performance of the 3GPP2 selectable mode vocoder (SMV) codec. We first analyze the features and the classification method used in real time speech/music classification algorithm in SMV, and then apply the SVM for enhanced speech/music classification. For evaluation of performance, we compare the proposed algorithm and the traditional algorithm of the SMV. The performance of the proposed system is evaluated under the various environments and shows better performance compared to the original method in the SMV.

  20. Signal classification method based on data mining for multi-mode radar

    Institute of Scientific and Technical Information of China (English)

    Qiang Guo; Pulong Nan; Jian Wan

    2016-01-01

    For the multi-mode radar working in the modern electronic battlefield, different working states of one single radar are prone to being classified as multiple emitters when adopting traditional classification methods to process intercepted signals, which has a negative effect on signal classification. A classification method based on spatial data mining is presented to address the above challenge. Inspired by the idea of spatial data mining, the classification method applies nuclear field to depicting the distribution information of pulse samples in feature space, and digs out the hidden cluster information by analyzing distribution characteristics. In addition, a membership-degree criterion to quantify the correlation among all classes is established, which ensures classification accuracy of signal samples. Numerical experiments show that the presented method can effectively prevent different working states of multi-mode emitter from being classified as several emitters, and achieve higher classification accuracy.

  1. Classification of Noisy Data: An Approach Based on Genetic Algorithms and Voronoi Tessellation

    DEFF Research Database (Denmark)

    Khan, Abdul Rauf; Schiøler, Henrik; Knudsen, Torben;

    2016-01-01

    Classification is one of the major constituents of the data-mining toolkit. The well-known methods for classification are built on either the principle of logic or statistical/mathematical reasoning for classification. In this article we propose: (1) a different strategy, which is based on the portioning of information space; and (2) use of the genetic algorithm to solve combinatorial problems for classification. In particular, we will implement our methodology to solve complex classification problems and compare the performance of our classifier with other well-known methods (SVM, KNN, and ANN). The results of this study suggest that our proposed methodology is specialized to deal with the classification problem of highly imbalanced classes with significant overlap.

  2. SAR images classification method based on Dempster-Shafer theory and kernel estimate

    Institute of Scientific and Technical Information of China (English)

    He Chu; Xia Guisong; Sun Hong

    2007-01-01

    To study the scene classification in the Synthetic Aperture Radar (SAR) image, a novel method based on kernel estimate, with the Markov context and Dempster-Shafer evidence theory is proposed. Initially, a nonparametric Probability Density Function (PDF) estimate method is introduced, to describe the scene of SAR images. And then under the Markov context, both the determinate PDF and the kernel estimate method are adopted respectively, to form a primary classification. Next, the primary classification results are fused using the evidence theory in an unsupervised way to get the scene classification. Finally, a regularization step is used, in which an iterated maximum selecting approach is introduced to control the fragments and modify the errors of the classification. Use of the kernel estimate and evidence theory can describe the complicated scenes with little prior knowledge and eliminate the ambiguities of the primary classification results. Experimental results on real SAR images illustrate a rather impressive performance.

  3. Classification of PolSAR image based on quotient space theory

    Science.gov (United States)

    An, Zhihui; Yu, Jie; Liu, Xiaomeng; Liu, Limin; Jiao, Shuai; Zhu, Teng; Wang, Shaohua

    2015-12-01

    In order to improve the classification accuracy, quotient space theory was applied to the classification of polarimetric SAR (PolSAR) images. Firstly, the Yamaguchi decomposition method is adopted to obtain the polarimetric characteristics of the image. At the same time, the Gray Level Co-occurrence Matrix (GLCM) and the Gabor wavelet are each used to obtain texture features. Secondly, combining the texture features and polarimetric characteristics, a Support Vector Machine (SVM) classifier is used for initial classification to establish different granularity spaces. Finally, according to the quotient space granularity synthetic theory, we merge and reason over the different quotient spaces to get the comprehensive classification result. The method proposed in this paper is tested with L-band AIRSAR data of San Francisco Bay. The result shows that the comprehensive classification result based on the theory of quotient space is superior to the classification result of a single granularity space.

  4. Review of Remotely Sensed Imagery Classification Patterns Based on Object-oriented Image Analysis

    Institute of Scientific and Technical Information of China (English)

    LIU Yongxue; LI Manchun; MAO Liang; XU Feifei; HUANG Shuo

    2006-01-01

    With the wide use of high-resolution remotely sensed imagery, the object-oriented remotely sensed information classification pattern has been intensively studied. Starting with the definition of object-oriented remotely sensed information classification pattern and a literature review of related research progress, this paper sums up 4 developing phases of object-oriented classification pattern during the past 20 years. Then, we discuss the three aspects of methodology in detail, namely remotely sensed imagery segmentation, feature analysis and feature selection, and classification rule generation, through comparing them with remotely sensed information classification method based on per-pixel. At last, this paper presents several points that need to be paid attention to in the future studies on object-oriented RS information classification pattern: 1) developing robust and highly effective image segmentation algorithm for multi-spectral RS imagery; 2) improving the feature-set including edge, spatial-adjacent and temporal characteristics; 3) discussing the classification rule generation classifier based on the decision tree; 4) presenting evaluation methods for classification result by object-oriented classification pattern.

  5. Land cover classification using random forest with genetic algorithm-based parameter optimization

    Science.gov (United States)

    Ming, Dongping; Zhou, Tianning; Wang, Min; Tan, Tian

    2016-07-01

    Land cover classification based on remote sensing imagery is an important means to monitor, evaluate, and manage land resources. However, it requires robust classification methods that allow accurate mapping of complex land cover categories. Random forest (RF) is a powerful machine-learning classifier that can be used in land remote sensing. However, two important parameters of RF classification, namely, the number of trees and the number of variables tried at each split, affect classification accuracy. Thus, optimal parameter selection is an inevitable problem in RF-based image classification. This study uses the genetic algorithm (GA) to optimize the two parameters of RF to produce optimal land cover classification accuracy. HJ-1B CCD2 image data are used to classify six different land cover categories in Changping, Beijing, China. Experimental results show that GA-RF can avoid arbitrariness in the selection of parameters. The experiments also compare land cover classification results by using GA-RF method, traditional RF method (with default parameters), and support vector machine method. When the GA-RF method is used, classification accuracies, respectively, improved by 1.02% and 6.64%. The comparison results show that GA-RF is a feasible solution for land cover classification without compromising accuracy or incurring excessive time.
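
    The parameter search described above can be pictured with a small sketch. It is not the authors' implementation: the search ranges, the operators (tournament selection, uniform crossover, bounded mutation, elitism) and the function `toy_fitness` are all assumptions made for illustration. In a real study the fitness would be something like the cross-validated accuracy of a random forest trained with the candidate number of trees and number of variables per split.

```python
import random

# Hypothetical search ranges; the abstract does not state them.
N_TREES_RANGE = (10, 500)
N_VARS_RANGE = (1, 10)


def toy_fitness(n_trees, n_vars):
    """Stand-in for classification accuracy; peaks at (200, 4) by construction."""
    return 1.0 - ((n_trees - 200) / 500) ** 2 - ((n_vars - 4) / 10) ** 2


def clamp(value, bounds):
    return min(max(value, bounds[0]), bounds[1])


def random_individual(rng):
    return (rng.randint(*N_TREES_RANGE), rng.randint(*N_VARS_RANGE))


def crossover(a, b, rng):
    # Uniform crossover: each gene comes from one of the two parents.
    return tuple(rng.choice(pair) for pair in zip(a, b))


def mutate(ind, rng, rate=0.3):
    n_trees, n_vars = ind
    if rng.random() < rate:
        n_trees = clamp(n_trees + rng.randint(-50, 50), N_TREES_RANGE)
    if rng.random() < rate:
        n_vars = clamp(n_vars + rng.randint(-2, 2), N_VARS_RANGE)
    return (n_trees, n_vars)


def genetic_search(fitness, pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    best = max(population, key=lambda ind: fitness(*ind))

    def tournament():
        # Pick two candidates at random and keep the fitter one.
        a, b = rng.sample(population, 2)
        return a if fitness(*a) >= fitness(*b) else b

    for _ in range(generations):
        children = [
            mutate(crossover(tournament(), tournament(), rng), rng)
            for _ in range(pop_size - 1)
        ]
        population = [best] + children  # elitism: the best survives unchanged
        best = max(population, key=lambda ind: fitness(*ind))
    return best


if __name__ == "__main__":
    print(genetic_search(toy_fitness))
```

    Because the best individual is always carried over, the best fitness never decreases from one generation to the next, which is one simple way to avoid losing a good parameter pair to mutation.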

  6. Multi-target QPDR classification model for human breast and colon cancer-related proteins using star graph topological indices.

    Science.gov (United States)

    Munteanu, Cristian Robert; Magalhães, Alexandre L; Uriarte, Eugenio; González-Díaz, Humberto

    2009-03-21

    Cancer diagnosis is a complex process and, sometimes, the specific markers can interfere or produce negative results. Thus, new simple and fast theoretical models are required. One option is complex network graph theory, which permits us to describe any real system, from small molecules to complex genetic, neural or social networks, by transforming real properties into topological indices. This work converts protein primary structure data into specific Randic's star network topological indices using the new sequence to star networks (S2SNet) application. A set of 1054 proteins was selected from previous works and contains proteins related or not to two types of cancer, human breast cancer (HBC) and human colon cancer (HCC). The general discriminant analysis method generates an input-coded multi-target classification model with training/predicting set accuracies of 90.0% for the forward stepwise model type. In addition, a protein subset was modified by single amino acid mutations with higher log-odds PAM250 values and tested with the new classification model to see whether the mutants can be related to HBC or HCC. In conclusion, we have shown that, using simple input data such as the primary protein sequence and the simplest linear analysis, it is possible to obtain accurate classification models that can predict whether a new protein is related to two types of cancer. These results promote the use of S2SNet in clinical proteomics.

  7. Objected-oriented remote sensing image classification method based on geographic ontology model

    Science.gov (United States)

    Chu, Z.; Liu, Z. J.; Gu, H. Y.

    2016-11-01

    Nowadays, with the development of high-resolution remote sensing images and the wide application of laser point cloud data, object-oriented remote sensing classification based on the characteristic knowledge of multi-source spatial data has become an important trend in the field of remote sensing image classification, gradually replacing the traditional approach of optimizing classification results by improving algorithms. For this purpose, the paper puts forward a remote sensing image classification method that uses the characteristic knowledge of multi-source spatial data to build a geographic ontology semantic network model, and carries out an object-oriented classification experiment to classify urban features. The experiment uses the Protégé software, developed by Stanford University in the United States, and the intelligent image analysis software eCognition as the experiment platform, with hyperspectral imagery and Lidar data obtained by flight over DaFeng City, JiangSu, as the main data source. First, the experiment uses the hyperspectral image to obtain feature knowledge of the remote sensing image and related special indices. Second, it uses the Lidar data to generate an nDSM (Normalized Digital Surface Model) and obtain elevation information. Last, it uses the image feature knowledge, special indices and elevation information to build the geographic ontology semantic network model that implements urban feature classification. The experiment results show that the classification accuracy of this method is significantly higher than that of the traditional classification algorithm, especially for building classification. The method not only considers the advantages of multi-source spatial data, for example remote sensing imagery and Lidar data, but also realizes the integration and application of multi-source spatial data knowledge

  8. Classification and Identification of Over-voltage Based on HHT and SVM

    Institute of Scientific and Technical Information of China (English)

    WANG Jing; YANG Qing; CHEN Lin; SIMA Wenxia

    2012-01-01

    This paper proposes an effective method for over-voltage classification based on the Hilbert-Huang transform (HHT) method. The Hilbert-Huang transform method is composed of empirical mode decomposition (EMD) and the Hilbert transform. Nine kinds of common power system over-voltages are calculated and analyzed by HHT. Based on the instantaneous amplitude spectrum, Hilbert marginal spectrum and Hilbert time-frequency spectrum, three kinds of over-voltage characteristic quantities are obtained. A hierarchical classification system is built based on HHT and the support vector machine (SVM). This classification system is tested with 106 field over-voltage signals, and the average classification rate is 94.3%. This research shows that HHT is an effective time-frequency analysis algorithm for over-voltage classification and identification.

  9. Dihedral-Based Segment Identification and Classification of Biopolymers II: Polynucleotides

    Science.gov (United States)

    2013-01-01

    In an accompanying paper (Nagy, G.; Oostenbrink, C. Dihedral-based segment identification and classification of biopolymers I: Proteins. J. Chem. Inf. Model. 2013, DOI: 10.1021/ci400541d), we introduce a new algorithm for structure classification of biopolymeric structures based on main-chain dihedral angles. The DISICL algorithm (short for DIhedral-based Segment Identification and CLassification) classifies segments of structures containing two central residues. Here, we introduce the DISICL library for polynucleotides, which is based on the dihedral angles ε, ζ, and χ for the two central residues of a three-nucleotide segment of a single strand. Seventeen distinct structural classes are defined for nucleotide structures, some of which—to our knowledge—were not described previously in other structure classification algorithms. In particular, DISICL also classifies noncanonical single-stranded structural elements. DISICL is applied to databases of DNA and RNA structures containing 80,000 and 180,000 segments, respectively. The classifications according to DISICL are compared to those of another popular classification scheme in terms of the amount of classified nucleotides, average occurrence and length of structural elements, and pairwise matches of the classifications. While the detailed classification of DISICL adds sensitivity to a structure analysis, it can be readily reduced to eight simplified classes providing a more general overview of the secondary structure in polynucleotides. PMID:24364355

  10. A Bayesian Based Search and Classification System for Product Information of Agricultural Logistics Information Technology

    OpenAIRE

    2011-01-01

    Part 1: Decision Support Systems, Intelligent Systems and Artificial Intelligence Applications; International audience; In order to meet the needs of users who search agricultural products logistics information technology, this paper introduces a search and classification system of agricultural products logistics information technology search and classification. Firstly, the dictionary of field concept word was built based on analyzing the characteristics of agricultural products logistics in...

  11. Dihedral-based segment identification and classification of biopolymers II: polynucleotides.

    Science.gov (United States)

    Nagy, Gabor; Oostenbrink, Chris

    2014-01-27

    In an accompanying paper (Nagy, G.; Oostenbrink, C. Dihedral-based segment identification and classification of biopolymers I: Proteins. J. Chem. Inf. Model. 2013, DOI: 10.1021/ci400541d), we introduce a new algorithm for structure classification of biopolymeric structures based on main-chain dihedral angles. The DISICL algorithm (short for DIhedral-based Segment Identification and CLassification) classifies segments of structures containing two central residues. Here, we introduce the DISICL library for polynucleotides, which is based on the dihedral angles ε, ζ, and χ for the two central residues of a three-nucleotide segment of a single strand. Seventeen distinct structural classes are defined for nucleotide structures, some of which--to our knowledge--were not described previously in other structure classification algorithms. In particular, DISICL also classifies noncanonical single-stranded structural elements. DISICL is applied to databases of DNA and RNA structures containing 80,000 and 180,000 segments, respectively. The classifications according to DISICL are compared to those of another popular classification scheme in terms of the amount of classified nucleotides, average occurrence and length of structural elements, and pairwise matches of the classifications. While the detailed classification of DISICL adds sensitivity to a structure analysis, it can be readily reduced to eight simplified classes providing a more general overview of the secondary structure in polynucleotides.
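
    Since the DISICL classes are defined on dihedral angles, a short sketch of how a single dihedral (torsion) angle is computed from four consecutive atom positions may help. This is the standard geometric construction, not code from the DISICL library; the function name `dihedral` and the tuple-based vector helpers are assumptions for illustration, and the mapping from angle triples (ε, ζ, χ) to the seventeen classes is not reproduced here.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four 3D points.

    The angle is measured around the central bond p1 -> p2 and lies
    in the interval [-180, 180].
    """
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)

    # Unit vector along the central bond.
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)

    # Project the outer bonds onto the plane perpendicular to the central bond.
    d0 = _dot(b0, b1)
    d2 = _dot(b2, b1)
    v = (b0[0] - d0 * b1[0], b0[1] - d0 * b1[1], b0[2] - d0 * b1[2])
    w = (b2[0] - d2 * b1[0], b2[1] - d2 * b1[1], b2[2] - d2 * b1[2])

    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Four points in a planar "cis" arrangement.
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))
```

    A segment classifier in this spirit would compute such angles for the central residues of each segment and then look up which predefined region of dihedral space they fall into.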

  12. A Kernel-Based Nonlinear Representor with Application to Eigenface Classification

    Institute of Scientific and Technical Information of China (English)

    ZHANG Jing; LIU Ben-yong; TAN Hao

    2004-01-01

    This paper presents a classifier named kernel-based nonlinear representor (KNR) for optimal representation of pattern features. Adopting the Gaussian kernel, with the kernel width adaptively estimated by a simple technique, it is applied to eigenface classification. Experimental results on the ORL face database show that it improves performance by around 6 points, in classification rate, over the Euclidean distance classifier.

  13. Using Discrete Loss Functions and Weighted Kappa for Classification: An Illustration Based on Bayesian Network Analysis

    Science.gov (United States)

    Zwick, Rebecca; Lenaburg, Lubella

    2009-01-01

    In certain data analyses (e.g., multiple discriminant analysis and multinomial log-linear modeling), classification decisions are made based on the estimated posterior probabilities that individuals belong to each of several distinct categories. In the Bayesian network literature, this type of classification is often accomplished by assigning…
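
    The two ingredients named in the title can be illustrated with a minimal sketch. It assumes a loss matrix indexed as `loss[true][assigned]` and disagreement weights for kappa; both conventions, and the function names, are illustrative choices rather than details taken from the article.

```python
def min_expected_loss_class(posterior, loss):
    """Return the category index that minimizes expected loss.

    posterior[t] is the estimated probability of true category t and
    loss[t][a] is the loss for assigning category a when t is true.
    """
    k = len(posterior)
    expected = [
        sum(posterior[t] * loss[t][a] for t in range(k)) for a in range(k)
    ]
    return min(range(k), key=expected.__getitem__)


def weighted_kappa(confusion, weights):
    """Weighted kappa from a square confusion matrix.

    weights[i][j] is a disagreement weight (0 on the diagonal), so
    kappa = 1 - (weighted observed disagreement) / (weighted chance disagreement).
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    row_tot = [sum(row) for row in confusion]
    col_tot = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    observed = sum(
        weights[i][j] * confusion[i][j] for i in range(k) for j in range(k)
    )
    expected = sum(
        weights[i][j] * row_tot[i] * col_tot[j] / n
        for i in range(k)
        for j in range(k)
    )
    return 1.0 - observed / expected


if __name__ == "__main__":
    posterior = [0.4, 0.35, 0.25]
    zero_one = [[0 if t == a else 1 for a in range(3)] for t in range(3)]
    linear = [[abs(t - a) for a in range(3)] for t in range(3)]
    print(min_expected_loss_class(posterior, zero_one))
    print(min_expected_loss_class(posterior, linear))
```

    With a 0-1 loss the rule reduces to picking the most probable category, while a loss that grows with the distance between ordered categories can favor a middle category, which is why the choice of loss function matters for ordered classifications.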

  14. Initial steps towards an evidence-based classification system for golfers with a physical impairment

    NARCIS (Netherlands)

    Stoter, Inge K; Hettinga, Florentina J; Altmann, Viola; Eisma, Wim; Arendzen, Hans; Bennett, Tony; van der Woude, Lucas H; Dekker, Rienk

    2015-01-01

    PURPOSE: The present narrative review aims to make a first step towards an evidence-based classification system in handigolf following the International Paralympic Committee (IPC). It intends to create a conceptual framework of classification for handigolf and an agenda for future research. METHOD:

  15. Initial steps towards an evidence-based classification system for golfers with a physical impairment

    NARCIS (Netherlands)

    Stoter, Inge K.; Hettinga, Florentina J.; Altmann, Viola; Eisma, Wim; Arendzen, Hans; Bennett, Tony; van der Woude, Lucas H.; Dekker, Rienk

    2017-01-01

    Purpose: The present narrative review aims to make a first step towards an evidence-based classification system in handigolf following the International Paralympic Committee (IPC). It intends to create a conceptual framework of classification for handigolf and an agenda for future research. Method:

  16. Self-organizing maps classification of epidemiological data and toenail selenium content monitored on cancer and healthy patients from Poland.

    Science.gov (United States)

    Tsakovski, Stefan L; Zukowska, Joanna; Bode, Peter; Bizuk, Marek K; Kowalczyk, Anna

    2010-01-01

    This paper deals with epidemiological multivariate statistical analysis of cancer and healthy patients from Pomeranian and Lubuskie Voivodships, Poland. The anthropometric and epidemiologic data include 8 parameters: toenail selenium concentration, sex, age, body mass index (BMI), smoking status, taking of Se supplements, health state, and family history of cancer. The self-organizing maps (SOM) are used for simultaneous classification of parameters and patients with relation to cancer diagnosis. Three different patterns (groups) of patients with cancer diagnosis are outlined: (i) older, smoking men with low toenail selenium concentration; (ii) older smoking women with family relation to cancer and toenail selenium deficiency; (iii) middle-aged nonsmokers with a high level of toenail selenium concentration. The simultaneous classification of parameters and patients makes it possible to determine discriminating parameters for each pattern and relations between parameters. The relation of each parameter to cancer disease is discussed, with special attention paid to toenail selenium deficiency. More than 80% of patients with cancer diagnosis possess toenail selenium deficiency, accompanied by old age and smoking.

  17. Multi-label literature classification based on the Gene Ontology graph

    Directory of Open Access Journals (Sweden)

    Lu Xinghua

    2008-12-01

    Full Text Available Background: The Gene Ontology is a controlled vocabulary for representing knowledge related to genes and proteins in a computable form. The current effort of manually annotating proteins with the Gene Ontology is outpaced by the rate of accumulation of biomedical knowledge in literature, which urges the development of text mining approaches to facilitate the process by automatically extracting the Gene Ontology annotation from literature. The task is usually cast as a text classification problem, and contemporary methods are confronted with unbalanced training data and the difficulties associated with multi-label classification. Results: In this research, we investigated the methods of enhancing automatic multi-label classification of biomedical literature by utilizing the structure of the Gene Ontology graph. We have studied three graph-based multi-label classification algorithms, including a novel stochastic algorithm and two top-down hierarchical classification methods for multi-label literature classification. We systematically evaluated and compared these graph-based classification algorithms to a conventional flat multi-label algorithm. The results indicate that, through utilizing the information from the structure of the Gene Ontology graph, the graph-based multi-label classification methods can significantly improve predictions of the Gene Ontology terms implied by the analyzed text. Furthermore, the graph-based multi-label classifiers are capable of suggesting Gene Ontology annotations (to curators) that are closely related to the true annotations even if they fail to predict the true ones directly. A software package implementing the studied algorithms is available for the research community. Conclusion: Through utilizing the information from the structure of the Gene Ontology graph, the graph-based multi-label classification methods have better potential than the conventional flat multi-label classification approach to facilitate

  18. Automatic Classification of Normal and Cancer Lung CT Images Using Multiscale AM-FM Features.

    Science.gov (United States)

    Magdy, Eman; Zayed, Nourhan; Fakhr, Mahmoud

    2015-01-01

    Computer-aided diagnostic (CAD) systems provide fast and reliable diagnosis for medical images. In this paper, a CAD system is proposed to analyze and automatically segment the lungs and classify each lung as normal or cancerous. Using a lung CT dataset from 70 different patients, Wiener filtering is first applied to the original CT images as a preprocessing step. Second, we combine histogram analysis with thresholding and morphological operations to segment the lung regions and extract each lung separately. Third, the Amplitude-Modulation Frequency-Modulation (AM-FM) method is used to extract features for the ROIs. Then, the significant AM-FM features are selected using Partial Least Squares Regression (PLSR) for the classification step. Finally, K-nearest neighbour (KNN), support vector machine (SVM), naïve Bayes, and linear classifiers are used with the selected AM-FM features. The performance of each classifier is evaluated in terms of accuracy, sensitivity, and specificity. The results indicate that our proposed CAD system succeeded in differentiating between normal and cancer lungs and achieved 95% accuracy with the linear classifier.
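
    The three evaluation measures mentioned in this abstract follow directly from the binary confusion counts. The sketch below shows the standard definitions; the function name `binary_metrics`, the label encoding (1 = cancer, 0 = normal) and the example labels are assumptions for illustration and are not data from the study.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity and specificity for a binary classifier.

    sensitivity = TP / (TP + FN)  (true positive rate)
    specificity = TN / (TN + FP)  (true negative rate)
    Rates with an empty denominator are returned as NaN.
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


if __name__ == "__main__":
    # Made-up labels: 1 = cancer, 0 = normal.
    truth = [1, 1, 1, 0, 0, 0, 0, 1]
    predicted = [1, 1, 0, 0, 0, 1, 0, 1]
    print(binary_metrics(truth, predicted))
```

    Reporting all three numbers matters in a diagnostic setting because a classifier can reach high accuracy on an imbalanced data set while still missing many positive cases.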

  19. Classification of samples into two or more ordered populations with application to a cancer trial.

    Science.gov (United States)

    Conde, D; Fernández, M A; Rueda, C; Salvador, B

    2012-12-10

    In many applications, especially in cancer treatment and diagnosis, investigators are interested in classifying patients into various diagnosis groups on the basis of molecular data such as gene expression or proteomic data. Often, some of the diagnosis groups are known to be related to higher or lower values of some of the predictors. The standard methods of classifying patients into various groups do not take into account the underlying order. This could potentially result in high misclassification rates, especially when the number of groups is larger than two. In this article, we develop classification procedures that exploit the underlying order among the mean values of the predictor variables and the diagnostic groups by using ideas from order-restricted inference. We generalize the existing methodology on discrimination under restrictions and provide empirical evidence to demonstrate that the proposed methodology improves over the existing unrestricted methodology. The proposed methodology is applied to a bladder cancer data set where the researchers are interested in classifying patients into various groups.

  20. Leveraging Sequence Classification by Taxonomy-Based Multitask Learning

    Science.gov (United States)

    Widmer, Christian; Leiva, Jose; Altun, Yasemin; Rätsch, Gunnar

    In this work we consider an inference task that biologists are very good at: deciphering biological processes by bringing together knowledge that has been obtained by experiments using various organisms, while respecting the differences and commonalities of these organisms. We look at this problem from a sequence analysis point of view, where we aim at solving the same classification task in different organisms. We investigate the challenge of combining information from several organisms, where we consider the relation between the organisms to be defined by a tree structure derived from their phylogeny. Multitask learning, a machine learning technique that recently received considerable attention, considers the problem of learning across tasks that are related to each other. We treat each organism as one task and present three novel multitask learning methods to handle situations in which the relationships among tasks can be described by a hierarchy. These algorithms are designed for large-scale applications and are therefore applicable to problems with a large number of training examples, which are frequently encountered in sequence analysis. We perform experimental analyses on synthetic data sets in order to illustrate the properties of our algorithms. Moreover, we consider a problem from genomic sequence analysis, namely splice site recognition, to illustrate the usefulness of our approach. We show that intelligently combining data from 15 eukaryotic organisms can indeed significantly improve the prediction performance compared to traditional learning approaches. On a broader perspective, we expect that algorithms like the ones presented in this work have the potential to complement and enrich the strategy of homology-based sequence analysis that is currently the quasi-standard in biological sequence analysis.

  1. SPAM CLASSIFICATION BASED ON SUPERVISED LEARNING USING MACHINE LEARNING TECHNIQUES

    Directory of Open Access Journals (Sweden)

    T. Hamsapriya

    2011-12-01

    Full Text Available E-mail is one of the most popular and frequently used means of communication due to its worldwide accessibility, relatively fast message transfer, and low sending cost. The flaws in the e-mail protocols and the increasing amount of electronic business and financial transactions directly contribute to the increase in e-mail-based threats. E-mail spam is one of the major problems of today's Internet, bringing financial damage to companies and annoying individual users. Spam e-mails invade users' mailboxes without their consent and fill them up. They consume network capacity as well as the time spent checking and deleting them. The vast majority of Internet users are outspoken in their disdain for spam, although enough of them respond to commercial offers that spam remains a viable source of income for spammers. While most users want to do the right thing to avoid and get rid of spam, they need clear and simple guidelines on how to behave. In spite of all the measures taken to eliminate spam, it has not yet been eradicated. Also, when the countermeasures are oversensitive, even legitimate e-mails will be eliminated. Among the approaches developed to stop spam, filtering is one of the most important techniques. Much research in spam filtering has centered on the more sophisticated classifier-related issues. Recently, machine learning for spam classification has become an important research issue. The proposed work explores and identifies the use of different learning algorithms for classifying spam messages from e-mail. A comparative analysis among the algorithms is also presented.

  2. Ultrasonic signal classification based on ambiguity plane feature

    Institute of Scientific and Technical Information of China (English)

    Du Xiuli; Wang Yan; Shen Yi

    2009-01-01

    Ambiguity function (AF) is proposed to represent ultrasonic signal to resolve the preprocessing problem of different center frequencies and different arriving times among ultrasonic signals for feature extraction, as well as offer time-frequency features for signal classification. Moreover, Karhunen-Loeve (K-L) transform is considered to extract signal features from ambiguity plane, and then the features are presented to probabilistic neural network (PNN) for signal classification. Experimental results show that ambiguity function eliminates the difference of center frequency and arriving time existing in ultrasonic signals, and ambiguity plane features extracted by K-L transform describe the signal of different classes effectively in a reduced dimensional space. Classification result suggests that the ambiguity plane features obtain better performance than the features extracted by wavelet transform (WT).

  3. Algebraic classification of higher dimensional spacetimes based on null alignment

    CERN Document Server

    Ortaggio, Marcello; Pravdova, Alena

    2012-01-01

    We review recent developments and applications of the classification of the Weyl tensor in higher dimensional Lorentzian geometries. First, we discuss the general setup, i.e. main definitions and methods for the classification, some refinements and the generalized Newman-Penrose and Geroch-Held-Penrose formalisms. Next, we summarize general results, such as a partial extension of the Goldberg-Sachs theorem, characterization of spacetimes with vanishing (or constant) curvature invariants and the peeling behaviour in asymptotically flat spacetimes. Finally, we discuss certain invariantly defined families of metrics and their relation with the Weyl tensor classification, including: Kundt and Robinson-Trautman spacetimes; the Kerr-Schild ansatz in a constant-curvature background; purely electric and purely magnetic spacetimes; direct and (some) warped products; and geometries with certain symmetries. To conclude, some applications to quadratic gravity are also overviewed.

  4. Dendritic cell-based cancer immunotherapy for colorectal cancer.

    Science.gov (United States)

    Kajihara, Mikio; Takakura, Kazuki; Kanai, Tomoya; Ito, Zensho; Saito, Keisuke; Takami, Shinichiro; Shimodaira, Shigetaka; Okamoto, Masato; Ohkusa, Toshifumi; Koido, Shigeo

    2016-05-01

    Colorectal cancer (CRC) is one of the most common cancers and a leading cause of cancer-related mortality worldwide. Although systemic therapy is the standard care for patients with recurrent or metastatic CRC, the prognosis is extremely poor. The optimal sequence of therapy remains unknown. Therefore, alternative strategies, such as immunotherapy, are needed for patients with advanced CRC. This review summarizes evidence from dendritic cell-based cancer immunotherapy strategies that are currently in clinical trials. In addition, we discuss the possibility of antitumor immune responses through immunoinhibitory PD-1/PD-L1 pathway blockade in CRC patients.

  5. Support vector machine classification trees based on fuzzy entropy of classification.

    Science.gov (United States)

    de Boves Harrington, Peter

    2017-02-15

    The support vector machine (SVM) is a powerful classifier that has recently been implemented in a classification tree (SVMTreeG). This classifier partitions the data by finding gaps in the data space. For large and complex datasets, there may be no gaps in the data space, which confounds this type of classifier. A novel algorithm was devised that uses fuzzy entropy to find optimal partitions for situations when clusters of data overlap in the data space. Also, a kernel version of the fuzzy entropy algorithm was devised. A fast support vector machine implementation is used that has no cost C or slack variables to optimize. Statistical comparisons using bootstrapped Latin partitions among the tree classifiers were made using a synthetic XOR data set and validated with ten prediction sets comprising 50,000 objects and a data set of NMR spectra obtained from 12 tea sample extracts.

  6. Seafloor Sediment Classification Based on Multibeam Sonar Data

    Institute of Scientific and Technical Information of China (English)

    ZHOU Xinghua; CHEN Yongqi

    2004-01-01

    Multibeam sonars can provide hydrographic-quality depth data and have the potential to provide calibrated measurements of seafloor acoustic backscattering strength. There has been much interest in using backscatter and images from multibeam sonar for seabed type identification, and many results have been obtained. This paper presents a focused review of the main methods and recent developments in seafloor classification using multibeam sonar data and/or images. These include power spectral analysis methods, texture analysis, traditional Bayesian classification theory, and the most active neural network approaches.

  7. Knowledge based cluster ensemble for cancer discovery from biomolecular data.

    Science.gov (United States)

    Yu, Zhiwen; Wong, Hau-San; You, Jane; Yang, Qinmin; Liao, Hongying

    2011-06-01

    The adoption of microarray techniques in biological and medical research provides a new way for cancer diagnosis and treatment. In order to perform successful diagnosis and treatment of cancer, discovering and classifying cancer types correctly is essential. Class discovery is one of the most important tasks in cancer classification using biomolecular data. Most of the existing works adopt single clustering algorithms to perform class discovery from biomolecular data. However, single clustering algorithms have limitations, which include a lack of robustness, stability, and accuracy. In this paper, we propose a new cluster ensemble approach called knowledge based cluster ensemble (KCE) which incorporates the prior knowledge of the data sets into the cluster ensemble framework. Specifically, KCE represents the prior knowledge of a data set in the form of pairwise constraints. Then, the spectral clustering algorithm (SC) is adopted to generate a set of clustering solutions. Next, KCE transforms pairwise constraints into confidence factors for these clustering solutions. After that, a consensus matrix is constructed by considering all the clustering solutions and their corresponding confidence factors. The final clustering result is obtained by partitioning the consensus matrix. Compared with single clustering algorithms and conventional cluster ensemble approaches, knowledge based cluster ensemble approaches are more robust, stable and accurate. The experiments on cancer data sets show that: 1) KCE works well on these data sets; 2) KCE not only outperforms most of the state-of-the-art single clustering algorithms, but also outperforms most of the state-of-the-art cluster ensemble approaches.
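
    The consensus-matrix step described in this abstract can be illustrated with a minimal sketch. The abstract does not give the formula for the confidence factors, so the sketch assumes, purely for illustration, that a solution's confidence is the fraction of pairwise constraints it satisfies. The function names and the constraint format are hypothetical, and the spectral clustering and final partitioning steps are omitted.

```python
def confidence(labels, must_link, cannot_link):
    """Assumed confidence factor: share of pairwise constraints satisfied."""
    total = len(must_link) + len(cannot_link)
    if total == 0:
        return 1.0
    satisfied = sum(labels[i] == labels[j] for i, j in must_link)
    satisfied += sum(labels[i] != labels[j] for i, j in cannot_link)
    return satisfied / total


def consensus_matrix(solutions, must_link, cannot_link):
    """Confidence-weighted co-association matrix over clustering solutions.

    Entry (i, j) is the normalized weight of the solutions that place
    samples i and j in the same cluster.
    """
    n = len(solutions[0])
    weights = [confidence(s, must_link, cannot_link) for s in solutions]
    norm = sum(weights) or 1.0  # avoid division by zero if all weights are 0
    matrix = [[0.0] * n for _ in range(n)]
    for labels, weight in zip(solutions, weights):
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    matrix[i][j] += weight / norm
    return matrix, weights


# Three hypothetical clustering solutions for four samples.
solutions = [[0, 0, 1, 1], [0, 1, 1, 1], [0, 0, 0, 1]]
matrix, weights = consensus_matrix(
    solutions, must_link=[(0, 1)], cannot_link=[(0, 3)]
)
```

    In this toy case the second solution violates the must-link constraint, so it contributes to the consensus matrix with half the weight of the other two.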

  8. A Multi-Label Classification Approach Based on Correlations Among Labels

    Directory of Open Access Journals (Sweden)

    Raed Alazaidah

    2015-02-01

    Full Text Available Multi-label classification is concerned with learning from a set of instances that are associated with a set of labels, that is, an instance could be associated with multiple labels at the same time. This task occurs frequently in application areas like text categorization, multimedia classification, bioinformatics, protein function classification and semantic scene classification. Current multi-label classification methods could be divided into two categories. The first is called problem transformation methods, which transform a multi-label classification problem into a single-label classification problem, and then apply any single-label classifier to solve the problem. The second category is called algorithm adaptation methods, which adapt an existing single-label classification algorithm to handle multi-label data. In this paper, we propose a multi-label classification approach based on correlations among labels that uses both problem transformation methods and algorithm adaptation methods. The approach begins with transforming a multi-label dataset into a single-label dataset using the least frequent label criterion, and then applies the PART algorithm on the transformed dataset. The output of the approach is multi-label rules. The approach also tries to benefit from positive correlations among labels using the predictive Apriori algorithm. The proposed approach has been evaluated using two multi-label datasets (Emotions and Yeast) and three evaluation measures (Accuracy, Hamming Loss, and Harmonic Mean). The experiments showed that the proposed approach has a fair accuracy in comparison to other related methods.
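
    The first step of the approach, transforming a multi-label dataset into a single-label one with the least frequent label criterion, might look like the following sketch. The function name, the toy labels, and the alphabetical tie-break are assumptions added for illustration; the PART and predictive Apriori steps are not shown.

```python
from collections import Counter


def least_frequent_label_transform(label_sets):
    """Keep, for each instance, the label that is rarest in the whole dataset.

    Ties are broken alphabetically (an assumption, for determinism).
    Every instance is assumed to carry at least one label.
    """
    counts = Counter(label for labels in label_sets for label in labels)
    return [
        min(labels, key=lambda label: (counts[label], label))
        for labels in label_sets
    ]


# Hypothetical label sets for four instances.
label_sets = [{"happy", "calm"}, {"happy"}, {"calm", "sad"}, {"happy", "sad"}]
single_labels = least_frequent_label_transform(label_sets)
# single_labels == ["calm", "happy", "calm", "sad"]
```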

  9. A method for cloud detection and opacity classification based on ground based sky imagery

    Directory of Open Access Journals (Sweden)

    M. S. Ghonima

    2012-07-01

    Full Text Available Digital images of the sky obtained using a total sky imager (TSI) are classified pixel by pixel into clear sky, optically thin and optically thick clouds. A new classification algorithm was developed that compares the pixel red-blue ratio (RBR) to the RBR of a clear sky library (CSL) generated from images captured on clear days. The difference, rather than the ratio, between pixel RBR and CSL RBR resulted in more accurate cloud classification. High correlation between TSI image RBR and aerosol optical depth (AOD) measured by an AERONET photometer was observed and motivated the addition of a haze correction factor (HCF) to the classification model to account for variations in AOD. Thresholds for clear and thick clouds were chosen based on a training image set and validated with a set of manually annotated images. Misclassifications of clear and thick clouds into the opposite category were less than 1%. Thin clouds were classified with an accuracy of 60%. Accurate cloud detection and opacity classification techniques will improve the accuracy of short-term solar power forecasting.
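
    The per-pixel decision rule described in this abstract can be sketched as below. The threshold values are hypothetical placeholders, not the ones chosen in the paper, and treating the haze correction factor as an additive offset on the clear-sky RBR is an assumption, since the abstract does not say how the HCF enters the model.

```python
def classify_pixel(red, blue, csl_rbr, hcf=0.0,
                   thin_threshold=0.05, thick_threshold=0.25):
    """Classify one sky pixel from the difference between its RBR and the CSL RBR.

    thin_threshold and thick_threshold are illustrative values only.
    hcf is applied as an additive offset (an assumption).
    """
    if blue <= 0:
        raise ValueError("blue channel must be positive to form a ratio")
    rbr = red / blue
    difference = rbr - (csl_rbr + hcf)
    if difference < thin_threshold:
        return "clear"
    if difference < thick_threshold:
        return "thin"
    return "thick"


labels = [
    classify_pixel(100, 200, csl_rbr=0.48),  # small difference
    classify_pixel(140, 200, csl_rbr=0.50),  # moderate difference
    classify_pixel(190, 200, csl_rbr=0.50),  # large difference
]
```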

  10. A method for cloud detection and opacity classification based on ground based sky imagery

    Directory of Open Access Journals (Sweden)

    M. S. Ghonima

    2012-11-01

    Full Text Available Digital images of the sky obtained using a total sky imager (TSI) are classified pixel by pixel into clear sky, optically thin and optically thick clouds. A new classification algorithm was developed that compares the pixel red-blue ratio (RBR) to the RBR of a clear sky library (CSL) generated from images captured on clear days. The difference, rather than the ratio, between pixel RBR and CSL RBR resulted in more accurate cloud classification. High correlation between TSI image RBR and aerosol optical depth (AOD) measured by an AERONET photometer was observed and motivated the addition of a haze correction factor (HCF) to the classification model to account for variations in AOD. Thresholds for clear and thick clouds were chosen based on a training image set and validated with a set of manually annotated images. Misclassifications of clear and thick clouds into the opposite category were less than 1%. Thin clouds were classified with an accuracy of 60%. Accurate cloud detection and opacity classification techniques will improve the accuracy of short-term solar power forecasting.

  11. A kernel-based multivariate feature selection method for microarray data classification.

    Directory of Open Access Journals (Sweden)

    Shiquan Sun

    Full Text Available High dimensionality and small sample sizes, and their inherent risk of overfitting, pose great challenges for constructing efficient classifiers in microarray data classification. Therefore a feature selection technique should be conducted prior to data classification to enhance prediction performance. In general, filter methods can be considered as principal or auxiliary selection mechanisms because of their simplicity, scalability, and low computational complexity. However, a series of trivial examples show that filter methods result in less accurate performance because they ignore the dependencies of features. Although a few publications have examined the relationships among features with multivariate-based methods, these methods describe the relationships only linearly, and simple linear combination relationships restrict the improvement in performance. In this paper, we used a kernel method to discover inherent nonlinear correlations among features as well as between feature and target. Moreover, the number of orthogonal components was determined by kernel Fisher's linear discriminant analysis (FLDA) in a self-adaptive manner rather than by manual parameter settings. In order to reveal the effectiveness of our method we performed several experiments and compared the results between our method and other competitive multivariate-based feature selectors. In our comparison, we used two classifiers (support vector machine, [Formula: see text]-nearest neighbor) on two groups of datasets, namely two-class and multi-class datasets. Experimental results demonstrate that the performance of our method is better than others, especially on three hard-to-classify datasets, namely Wang's Breast Cancer, Gordon's Lung Adenocarcinoma and Pomeroy's Medulloblastoma.

  12. Novel round-robin tabu search algorithm for prostate cancer classification and diagnosis using multispectral imagery.

    Science.gov (United States)

    Tahir, Muhammad Atif; Bouridane, Ahmed

    2006-10-01

    Quantitative cell imagery in cancer pathology has progressed greatly in the last 25 years. The application areas are mainly those in which the diagnosis is still critically reliant upon the analysis of biopsy samples, which remains the only conclusive method for making an accurate diagnosis of the disease. Biopsies are usually analyzed by a trained pathologist who, by analyzing the biopsies under a microscope, assesses the normality or malignancy of the samples submitted. Different grades of malignancy correspond to different structural patterns as well as to apparent textures. In the case of prostate cancer, four major groups have to be recognized: stroma, benign prostatic hyperplasia, prostatic intraepithelial neoplasia, and prostatic carcinoma. Recently, multispectral imagery has been used to solve this multiclass problem. Unlike conventional RGB color space, multispectral images allow the acquisition of a large number of spectral bands within the visible spectrum, resulting in a large feature vector size. For such a high dimensionality, pattern recognition techniques suffer from the well-known "curse-of-dimensionality" problem. This paper proposes a novel round-robin tabu search (RR-TS) algorithm to address the curse-of-dimensionality for this multiclass problem. The experiments have been carried out on a number of prostate cancer textured multispectral images, and the results obtained have been assessed and compared with previously reported works. The system achieved 98%-100% classification accuracy when testing on two datasets. It outperformed principal component/linear discriminant classifier (PCA-LDA), tabu search/nearest neighbor classifier (TS-1NN), and bagging/boosting with decision tree (C4.5) classifier.

  13. Segmentation-Based PolSAR Image Classification Using Visual Features: RHLBP and Color Features

    Directory of Open Access Journals (Sweden)

    Jian Cheng

    2015-05-01

    Full Text Available A segmentation-based fully-polarimetric synthetic aperture radar (PolSAR) image classification method that incorporates texture features and color features is designed and implemented. This method is based on the framework that conjunctively uses statistical region merging (SRM) for segmentation and support vector machine (SVM) for classification. In the segmentation step, we propose an improved local binary pattern (LBP) operator named the regional homogeneity local binary pattern (RHLBP) to guarantee the regional homogeneity in PolSAR images. In the classification step, the color features extracted from false color images are applied to improve the classification accuracy. The RHLBP operator and color features can provide discriminative information to separate those pixels and regions with similar polarimetric features, which are from different classes. Extensive experimental comparison results with conventional methods on L-band PolSAR data demonstrate the effectiveness of our proposed method for PolSAR image classification.
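
    For readers unfamiliar with the operator that RHLBP builds on, the following is a minimal sketch of the basic 3x3 local binary pattern, not of the RHLBP variant proposed in the paper (whose definition is not given in the abstract). The neighbor ordering and the toy image are arbitrary choices made for illustration.

```python
# Clockwise neighbor offsets starting at the top-left pixel (an arbitrary convention).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, row, col):
    """Basic 8-neighbor LBP code of an interior pixel of a 2D list image.

    Each neighbor whose value is >= the center value sets one bit.
    """
    center = image[row][col]
    code = 0
    for bit, (d_row, d_col) in enumerate(OFFSETS):
        if image[row + d_row][col + d_col] >= center:
            code |= 1 << bit
    return code


# Toy 3x3 grayscale patch; only the center pixel has a full neighborhood.
patch = [
    [5, 9, 1],
    [4, 6, 7],
    [2, 3, 8],
]
code = lbp_code(patch, 1, 1)  # neighbors 9, 7 and 8 exceed the center value 6
```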

  14. The method of narrow-band audio classification based on universal noise background model

    Science.gov (United States)

    Rui, Rui; Bao, Chang-chun

    2013-03-01

    Audio classification is the basis of content-based audio analysis and retrieval. The conventional classification methods mainly depend on feature extraction from audio clips, which increases the time required for classification. An approach for classifying the narrow-band audio stream based on frame-level feature extraction is presented in this paper. The audio signals are divided into speech, instrumental music, song with accompaniment and noise using the Gaussian mixture model (GMM). In order to cope with changes in the actual environment, a universal noise background model (UNBM) for white noise, street noise, factory noise and car interior noise is built. In addition, three feature schemes are considered to optimize feature selection. The experimental results show that the proposed algorithm achieves a high accuracy for audio classification, especially under each noise background we used, and keeps the classification time below one second.

  15. Maximum-margin based representation learning from multiple atlases for Alzheimer's disease classification.

    Science.gov (United States)

    Min, Rui; Cheng, Jian; Price, True; Wu, Guorong; Shen, Dinggang

    2014-01-01

    In order to establish the correspondences between different brains for comparison, spatial normalization based morphometric measurements have been widely used in the analysis of Alzheimer's disease (AD). In the literature, different subjects are often compared in one atlas space, which may be insufficient in revealing complex brain changes. In this paper, instead of deploying one atlas for feature extraction and classification, we propose a maximum-margin based representation learning (MMRL) method to learn the optimal representation from multiple atlases. Unlike traditional methods that perform the representation learning separately from the classification, we propose to learn the new representation jointly with the classification model, which is more powerful in discriminating AD patients from normal controls (NC). We evaluated the proposed method on the ADNI database, and achieved 90.69% for AD/NC classification and 73.69% for p-MCI/s-MCI classification.

  16. Salient Feature Identification and Analysis using Kernel-Based Classification Techniques for Synthetic Aperture Radar Automatic Target Recognition

    Science.gov (United States)

    2014-03-27

    Thesis: Salient Feature Identification and Analysis Using Kernel-Based Classification Techniques for Synthetic Aperture Radar Automatic Target Recognition.

  17. Analysis of uncertainty in multi-temporal object-based classification

    Science.gov (United States)

    Löw, Fabian; Knöfel, Patrick; Conrad, Christopher

    2015-07-01

    Agricultural management increasingly uses crop maps based on classification of remotely sensed data. However, classification errors can translate to errors in model outputs, for instance agricultural production monitoring (yield, water demand) or crop acreage calculation. Hence, knowledge of the spatial variability of the classifier performance is important information for the user. But this is not provided by traditional assessments of accuracy, which are based on the confusion matrix. In this study, classification uncertainty was analyzed, based on the support vector machines (SVM) algorithm. SVM was applied to multi-spectral time series data of RapidEye from different agricultural landscapes and years. Entropy was calculated as a measure of classification uncertainty, based on the per-object class membership estimations from the SVM algorithm. Permuting all possible combinations of available images allowed investigating the impact of the image acquisition frequency and timing, respectively, on the classification uncertainty. Results show that multi-temporal datasets decrease classification uncertainty for different crops compared to single data sets, but there was no "one-image-combination-fits-all" solution. The number and acquisition timing of the images, for which a decrease in uncertainty could be realized, proved to be specific to a given landscape, and for each crop they differed across different landscapes. For some crops, an increase of uncertainty was observed when increasing the quantity of images, even if classification accuracy was improved. Random forest regression was employed to investigate the impact of different explanatory variables on the observed spatial pattern of classification uncertainty. It was strongly influenced by factors related to the agricultural management and training sample density. Lower uncertainties were revealed for fields close to rivers or irrigation canals. This study demonstrates that classification uncertainty estimates
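
    The entropy measure mentioned in this abstract can be illustrated with a short sketch that computes Shannon entropy from one object's class membership estimates. The normalization to the range 0 to 1 and the function name are assumptions; the abstract does not state whether the authors normalized the entropy or which logarithm base they used.

```python
import math


def membership_entropy(probabilities, normalize=True):
    """Shannon entropy (base 2) of per-object class membership estimates.

    With normalize=True the result is divided by log2(number of classes),
    so 0 means a certain assignment and 1 means maximal uncertainty.
    """
    entropy = -sum(p * math.log2(p) for p in probabilities if p > 0)
    if normalize and len(probabilities) > 1:
        entropy /= math.log2(len(probabilities))
    return entropy


certain = membership_entropy([1.0, 0.0, 0.0])            # one class dominates
uncertain = membership_entropy([0.25, 0.25, 0.25, 0.25])  # all classes equally likely
```

    An object whose membership is split evenly across classes gets the highest uncertainty, even if the winning class happens to be the correct one, which is why uncertainty and accuracy can move in different directions.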

  18. Emotion of Physiological Signals Classification Based on TS Feature Selection

    Institute of Scientific and Technical Information of China (English)

    Wang Yujing; Mo Jianlin

    2015-01-01

    This paper proposes TS-MLP, a method for emotion recognition from physiological signals. It recognizes emotion by using tabu search to select features of the physiological signals and a multilayer perceptron to classify the emotion. Simulation shows that it achieves good emotion classification performance.

  19. Laguerre Kernels –Based SVM for Image Classification

    Directory of Open Access Journals (Sweden)

    Ashraf Afifi

    2014-01-01

    Full Text Available Support vector machines (SVMs) have been promising methods for classification and regression analysis because of their solid mathematical foundations, which convey several salient properties that other methods hardly provide. However, the performance of SVMs is very sensitive to how the kernel function is selected; the challenge is to choose the kernel function for accurate data classification. In this paper, we introduce a set of new kernel functions derived from the generalized Laguerre polynomials. The proposed kernels could improve the classification accuracy of SVMs for both linear and nonlinear data sets. The proposed kernel functions satisfy Mercer’s condition and orthogonality properties, which are important and useful in some applications when the support vector number is needed, as in feature selection. The performance of the generalized Laguerre kernels is evaluated in comparison with the existing kernels. It was found that the choice of the kernel function and the values of the parameters for that kernel are critical for a given amount of data. The proposed kernels give good classification accuracy in nearly all the data sets, especially those of high dimensions.

  20. Image-Based Coral Reef Classification and Thematic Mapping

    Directory of Open Access Journals (Sweden)

    Brooke Gintert

    2013-04-01

    Full Text Available This paper presents a novel image classification scheme for benthic coral reef images that can be applied to both single image and composite mosaic datasets. The proposed method can be configured to the characteristics (e.g., the size of the dataset, number of classes, resolution of the samples, color information availability, class types, etc.) of individual datasets. The proposed method uses completed local binary pattern (CLBP), grey level co-occurrence matrix (GLCM), Gabor filter response, and opponent angle and hue channel color histograms as feature descriptors. For classification, either k-nearest neighbor (KNN), neural network (NN), support vector machine (SVM) or probability density weighted mean distance (PDWMD) is used. The combination of features and classifiers that attains the best results is presented together with the guidelines for selection. The accuracy and efficiency of our proposed method are compared with other state-of-the-art techniques using three benthic and three texture datasets. The proposed method achieves the highest overall classification accuracy of any of the tested methods and has moderate execution time. Finally, the proposed classification scheme is applied to a large-scale image mosaic of the Red Sea to create a completely classified thematic map of the reef benthos.

  1. Colour based off-road environment and terrain type classification

    NARCIS (Netherlands)

    Jansen, P.; Mark, W. van der; Heuvel, J.C. van den; Groen, F.C.A.

    2005-01-01

    Terrain classification is an important problem that still remains to be solved for off-road autonomous robot vehicle guidance. Often, obstacle detection systems are used which cannot distinguish between solid obstacles such as rocks or soft obstacles such as tall patches of grass. Terrain classifica

  2. Statistical Redundancy Testing for Improved Gene Selection in Cancer Classification Using Microarray Data

    Directory of Open Access Journals (Sweden)

    J. Sunil Rao

    2007-01-01

    Full Text Available In gene selection for cancer classification using microarray data, we define an eigenvalue-ratio statistic to measure a gene’s contribution to the joint discriminability when this gene is included into a set of genes. Based on this eigenvalue-ratio statistic, we define a novel hypothesis test for gene statistical redundancy and propose two gene selection methods. Simulation studies illustrate the agreement between statistical redundancy testing and gene selection methods. Real data examples show that the proposed gene selection methods can select a compact gene subset which can not only be used to build high quality cancer classifiers but also shows biological relevance.

  3. Application of SVM classifier in thermographic image classification for early detection of breast cancer

    Science.gov (United States)

    Oleszkiewicz, Witold; Cichosz, Paweł; Jagodziński, Dariusz; Matysiewicz, Mateusz; Neumann, Łukasz; Nowak, Robert M.; Okuniewski, Rafał

    2016-09-01

    This article presents the application of machine learning algorithms for early detection of breast cancer on the basis of thermographic images. A supervised learning model, the support vector machine (SVM), and the Sequential Minimal Optimization (SMO) algorithm for training the SVM classifier were implemented. The SVM classifier was included in a client-server application which makes it possible to create a training set of examinations and to apply classifiers (including SVM) for the diagnosis and early detection of breast cancer. The sensitivity and specificity of the SVM classifier were calculated based on the thermographic images from studies. Furthermore, a heuristic method for tuning the SVM's parameters was proposed.
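
    Sensitivity and specificity, the two measures reported for the SVM classifier, follow directly from the counts of correct and incorrect predictions. The sketch below uses made-up labels (1 for a positive examination, 0 for a negative one) and a hypothetical function name; it does not reproduce the study's data or results.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = positive."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    false_neg = sum(1 for t, p in pairs if t == 1 and p == 0)
    true_neg = sum(1 for t, p in pairs if t == 0 and p == 0)
    false_pos = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = true_pos / (true_pos + false_neg)  # share of positives found
    specificity = true_neg / (true_neg + false_pos)  # share of negatives found
    return sensitivity, specificity


# Made-up ground truth and predictions for eight examinations.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
sensitivity, specificity = sensitivity_specificity(y_true, y_pred)
```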

  4. Classification of weld defect based on information fusion technology for radiographic testing system

    Science.gov (United States)

    Jiang, Hongquan; Liang, Zeming; Gao, Jianmin; Dang, Changying

    2016-03-01

    Improving the efficiency and accuracy of weld defect classification is an important technical problem in developing the radiographic testing system. This paper proposes a novel weld defect classification method based on information fusion technology, Dempster-Shafer evidence theory. First, to characterize weld defects and improve the accuracy of their classification, 11 weld defect features were defined based on the sub-pixel level edges of radiographic images, four of which are presented for the first time in this paper. Second, we applied information fusion technology to combine different features for weld defect classification, including a mass function defined based on the weld defect feature information and the quartile-method-based calculation of standard weld defect class which is to solve a sample problem involving a limited number of training samples. A steam turbine weld defect classification case study is also presented herein to illustrate our technique. The results show that the proposed method can increase the correct classification rate with limited training samples and address the uncertainties associated with weld defect classification.
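
    The fusion step rests on Dempster's rule of combination from Dempster-Shafer evidence theory, which can be sketched as follows. The defect class names and the mass values are hypothetical, and the paper's own mass function (built from weld defect features and the quartile method) is not reproduced here.

```python
def dempster_combine(mass_a, mass_b):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to its mass.
    Mass assigned to empty intersections is treated as conflict and
    removed by renormalization.
    """
    combined = {}
    conflict = 0.0
    for set_a, weight_a in mass_a.items():
        for set_b, weight_b in mass_b.items():
            intersection = set_a & set_b
            if intersection:
                combined[intersection] = (
                    combined.get(intersection, 0.0) + weight_a * weight_b
                )
            else:
                conflict += weight_a * weight_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {key: value / (1.0 - conflict) for key, value in combined.items()}


# Hypothetical evidence from two features about two defect classes.
CRACK = frozenset({"crack"})
POROSITY = frozenset({"porosity"})
EITHER = CRACK | POROSITY  # mass left on the whole frame expresses ignorance

feature_1 = {CRACK: 0.6, EITHER: 0.4}
feature_2 = {CRACK: 0.5, POROSITY: 0.3, EITHER: 0.2}
fused = dempster_combine(feature_1, feature_2)
```

    Both features lean toward the same class here, so the fused mass on that class is higher than either source assigned on its own, which is the behavior that makes evidence fusion attractive when training samples are limited.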

  5. Robust real-time mine classification based on side-scan sonar imagery

    Science.gov (United States)

    Bello, Martin G.

    2000-08-01

    We describe here image processing and neural network based algorithms for detection and classification of mines in side-scan sonar imagery, and the results obtained from their application to two distinct image data bases. These algorithms evolved over a period from 1994 to the present, originally at Draper Laboratory, and currently at Alphatech Inc. The mine-detection/classification system is partitioned into an anomaly screening stage followed by a classification stage involving the calculation of features on blobs, and their input into a multilayer perceptron neural network. Particular attention is given to the selection of algorithm parameters, and training data, in order to optimize performance over the aggregate data set.

  6. Classification of melanoma using wavelet-transform-based optimal feature set

    Science.gov (United States)

    Walvick, Ronn P.; Patel, Ketan; Patwardhan, Sachin V.; Dhawan, Atam P.

    2004-05-01

    The features used in the ABCD rule for characterization of skin lesions suggest that the spatial and frequency information in the nevi changes at various stages of melanoma development. To analyze these changes wavelet transform based features have been reported. The classification of melanoma using these features has produced varying results. In this work, all the reported wavelet transform based features are combined to form a single feature set. This feature set is then optimized by removing redundancies using principal component analysis. A feed forward neural network trained with the back propagation algorithm is then used in the classification process to obtain better classification results.

  7. Radial Basis Function Networks Applied in Bacterial Classification Based on MALDI-TOF-MS

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    Radial basis function networks were applied to bacterial classification based on matrix-assisted laser desorption/ionization time-of-flight mass spectrometric (MALDI-TOF-MS) data. The classification of bacteria cultured for different times was discussed and the effect of the network parameters on the classification was investigated. The cross-validation method was used to test the trained networks. The classification accuracy for the different bacteria investigated varied widely, from 61.5% to 92.8%. Owing to the complexity of biological effects in bacterial growth, more rigid control of bacterial culture conditions seems to be a critical factor for improving the rate of correct bacterial classification.

  8. [Classification of cell-based medicinal products and legal implications: An overview and an update].

    Science.gov (United States)

    Scherer, Jürgen; Flory, Egbert

    2015-11-01

    In general, cell-based medicinal products do not represent a uniform class of medicinal products, but instead comprise medicinal products with diverse regulatory classification as advanced-therapy medicinal products (ATMP), medicinal products (MP), tissue preparations, or blood products. Due to the legal and scientific consequences of the development and approval of MPs, classification should be clarified as early as possible. This paper describes the legal situation in Germany and highlights specific criteria and concepts for classification, with a focus on, but not limited to, ATMPs and non-ATMPs. Depending on the stage of product development and the specific application submitted to a competent authority, legally binding classification is done by the German Länder Authorities, Paul-Ehrlich-Institut, or European Medicines Agency. On request by the applicants, the Committee for Advanced Therapies may issue scientific recommendations for classification.

  9. Scene Classification of Remote Sensing Image Based on Multi-scale Feature and Deep Neural Network

    Directory of Open Access Journals (Sweden)

    XU Suhui

    2016-07-01

    Full Text Available To address the low precision of remote sensing image scene classification caused by small sample sizes, a new classification approach is proposed based on a multi-scale deep convolutional neural network (MS-DCNN), which is composed of the nonsubsampled Contourlet transform (NSCT), a deep convolutional neural network (DCNN), and a multiple-kernel support vector machine (MKSVM). Firstly, remote sensing image multi-scale decomposition is conducted via NSCT. Secondly, the decomposed high frequency and low frequency subbands are used to train the DCNN to obtain image features at different scales. Finally, MKSVM is adopted to integrate multi-scale image features and implement remote sensing image scene classification. The experimental results on standard image classification data sets indicate that the proposed approach achieves good classification performance because it combines the strengths of the low frequency and high frequency subbands in recognizing different scenes.

  10. Three-Class EEG-Based Motor Imagery Classification Using Phase-Space Reconstruction Technique

    Science.gov (United States)

    Djemal, Ridha; Bazyed, Ayad G.; Belwafi, Kais; Gannouni, Sofien; Kaaniche, Walid

    2016-01-01

    Over the last few decades, brain signals have been significantly exploited for brain-computer interface (BCI) applications. In this paper, we study the extraction of features using event-related desynchronization/synchronization techniques to improve the classification accuracy for three-class motor imagery (MI) BCI. The classification approach is based on combining the features of the phase and amplitude of the brain signals using fast Fourier transform (FFT) and autoregressive (AR) modeling of the reconstructed phase space as well as the modification of the BCI parameters (trial length, trial frequency band, classification method). By utilizing sequential forward floating selection (SFFS) and a multi-class linear discriminant analysis (LDA), we obtained classification accuracies of 86.06% and 93% for two BCI competition datasets, which are superior to results from previous studies. PMID:27563927

  11. A Statistical Approach of Texton Based Texture Classification Using LPboosting Classifier

    Directory of Open Access Journals (Sweden)

    C. Vivek

    2014-05-01

    Full Text Available This study addresses accurate texture classification; image texture analysis has wide potential in real-world applications. In this study, the texton co-occurrence matrix is applied to the Brodatz database images to derive template texton grid images, which then undergo the discrete shearlet transform to decompose the image. The entropy lineage parameters are redundant and interpolated at certain points, congregating adjacent regions based on geometric properties; classification is then performed by comparing the similarity between the estimated distributions of all detail subbands through strong LP boosting classification with various weak classifier configurations. We show that the resulting texture features retain the maximum discriminative information. Our hybrid classification method significantly outperforms the existing texture descriptors and achieves state-of-the-art classification accuracy in real-world imaging applications.

  12. Spectral Collaborative Representation based Classification for Hand Gestures recognition on Electromyography Signals

    OpenAIRE

    Boyali, Ali

    2015-01-01

    In this study, we introduce a novel variant and application of the Collaborative Representation based Classification in spectral domain for recognition of the hand gestures using the raw surface Electromyography signals. The intuitive use of spectral features are explained via circulant matrices. The proposed Spectral Collaborative Representation based Classification (SCRC) is able to recognize gestures with higher levels of accuracy for a fairly rich gesture set. The worst recognition result...

  13. State-Based Models for Light Curve Classification

    Science.gov (United States)

    Becker, A.

    I discuss here the application of continuous time autoregressive models to the characterization of astrophysical variability. These types of models are general enough to represent many classes of variability, and descriptive enough to provide features for lightcurve classification. Importantly, the features of these models may be interpreted in terms of the power spectrum of the lightcurve, enabling constraints on characteristic timescales and periodicity. These models may be extended to include vector-valued inputs, raising the prospect of a fully general modeling and classification environment that uses multi-passband inputs to create a single phenomenological model. These types of spectral-temporal models are an important extension of extant techniques, and necessary in the upcoming eras of Gaia and LSST.

  14. Knowledge Based Pipeline Network Classification and Recognition Method of Maps

    Institute of Scientific and Technical Information of China (English)

    Liu Tongyu; Gu Shusheng

    2001-01-01

    Map recognition is an essential data input means of Geographic Information System (GIS). How to solve the problems in the procedure, such as recognition of maps with crisscross pipeline networks, classification of buildings and roads, and processing of connected text, is a critical step for GIS keeping high-speed development. In this paper, a new recognition method of pipeline maps is presented, and some common patterns of pipeline connection and component labels are established. Through pattern matching, pipelines and component labels are recognized and peeled off from maps. After this approach, maps simply consist of buildings and roads, which are recognized and classified with a fuzzy classification method. In addition, the Double Sides Scan (DSS) technique is also described, through which the effect of connected text can be eliminated.

  15. A Spectral Signature Shape-Based Algorithm for Landsat Image Classification

    Directory of Open Access Journals (Sweden)

    Yuanyuan Chen

    2016-08-01

    Full Text Available Land-cover datasets are crucial for earth system modeling and human-nature interaction research at local, regional and global scales. They can be obtained from remotely sensed data using image classification methods. However, most image classification methods focus on spectral values, while the spectral curve shape has seldom been used because it is difficult to quantify. This study presents a classification method based on the observation that the spectral curve is composed of segments and certain extreme values. The presented classification method quantifies the spectral curve shape and makes full use of the spectral shape differences among land covers to classify remotely sensed images. Using this method, classification maps from TM (Thematic Mapper) data were obtained with an overall accuracy of 0.834 and 0.854 for two respective test areas. The approach presented in this paper, which differs from previous image classification methods that were mostly concerned with spectral "value" similarity characteristics, emphasizes the "shape" similarity characteristics of the spectral curve. Moreover, this study will be helpful for classification research on hyperspectral and multi-temporal images.

  16. Dihedral-Based Segment Identification and Classification of Biopolymers I: Proteins

    Science.gov (United States)

    2013-01-01

    A new structure classification scheme for biopolymers is introduced, which is solely based on main-chain dihedral angles. It is shown that by dividing a biopolymer into segments containing two central residues, a local classification can be performed. The method is referred to as DISICL, short for Dihedral-based Segment Identification and Classification. Compared to other popular secondary structure classification programs, DISICL is more detailed as it offers 18 distinct structural classes, which may be simplified into a classification in terms of seven more general classes. It was designed with an eye to analyzing subtle structural changes as observed in molecular dynamics simulations of biomolecular systems. Here, the DISICL algorithm is used to classify two databases of protein structures, jointly containing more than 10 million segments. The data is compared to two alternative approaches in terms of the amount of classified residues, average occurrence and length of structural elements, and pair wise matches of the classifications by the different programs. In an accompanying paper (Nagy, G.; Oostenbrink, C. Dihedral-based segment identification and classification of biopolymers II: Polynucleotides. J. Chem. Inf. Model. 2013, DOI: 10.1021/ci400542n), the analysis of polynucleotides is described and applied. Overall, DISICL represents a potentially useful tool to analyze biopolymer structures at a high level of detail. PMID:24364820

  17. Dihedral-based segment identification and classification of biopolymers I: proteins.

    Science.gov (United States)

    Nagy, Gabor; Oostenbrink, Chris

    2014-01-27

    A new structure classification scheme for biopolymers is introduced, which is solely based on main-chain dihedral angles. It is shown that by dividing a biopolymer into segments containing two central residues, a local classification can be performed. The method is referred to as DISICL, short for Dihedral-based Segment Identification and Classification. Compared to other popular secondary structure classification programs, DISICL is more detailed as it offers 18 distinct structural classes, which may be simplified into a classification in terms of seven more general classes. It was designed with an eye to analyzing subtle structural changes as observed in molecular dynamics simulations of biomolecular systems. Here, the DISICL algorithm is used to classify two databases of protein structures, jointly containing more than 10 million segments. The data is compared to two alternative approaches in terms of the amount of classified residues, average occurrence and length of structural elements, and pair wise matches of the classifications by the different programs. In an accompanying paper (Nagy, G.; Oostenbrink, C. Dihedral-based segment identification and classification of biopolymers II: Polynucleotides. J. Chem. Inf. Model. 2013, DOI: 10.1021/ci400542n), the analysis of polynucleotides is described and applied. Overall, DISICL represents a potentially useful tool to analyze biopolymer structures at a high level of detail.
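
    Since the scheme above rests entirely on main-chain dihedral angles, a minimal sketch of how one signed dihedral angle can be computed from four atom positions may help. This is not the DISICL code: the function name `dihedral_deg` and the sample coordinates are assumptions made for illustration, and the mapping from angle pairs to the 18 structural classes is not reproduced here.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond for four 3D points."""
    b0 = _sub(p1, p0)
    b1 = _sub(p2, p1)    # central bond
    b2 = _sub(p3, p2)
    n1 = _cross(b0, b1)  # normal of the plane through p0, p1, p2
    n2 = _cross(b1, b2)  # normal of the plane through p1, p2, p3
    b1_len = math.sqrt(_dot(b1, b1))
    b1_unit = tuple(x / b1_len for x in b1)
    y = _dot(b1_unit, _cross(n1, n2))
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Hypothetical coordinates, chosen only so the geometry is easy to check by hand.
print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

    With this sign convention, an eclipsed (cis) arrangement gives 0 degrees and an anti (trans) arrangement gives 180 degrees in magnitude.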

  18. Power Disturbances Classification Using S-Transform Based GA-PNN

    Science.gov (United States)

    Manimala, K.; Selvi, K.

    2015-09-01

    The significance of detecting and classifying power quality events that disturb the voltage and/or current waveforms in electrical power distribution networks is well known. However, in spite of a large number of research reports in this area, the selection of proper parameters for specific classifiers has so far not been explored. Parameter selection is very important for successful modelling of the input-output relationship in a function approximation model. In this study, a probabilistic neural network (PNN) is used as a function approximation tool for power disturbance classification, and a genetic algorithm (GA) is utilised to optimise the smoothing parameter of the PNN. The important features extracted from the raw power disturbance signal using the S-Transform are given to the PNN for classification. The choice of smoothing parameter for the PNN classifier significantly impacts the classification accuracy. Hence, GA-based parameter optimisation is performed to ensure good classification accuracy by selecting a suitable parameter for the PNN classifier. Testing results show that the proposed S-Transform based GA-PNN model has better classification ability than classifiers based on the conventional grid search method for parameter selection. Noisy and practical signals are considered in the classification process to show the effectiveness of the proposed method in comparison with existing methods.
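
    To make the role of the smoothing parameter concrete, here is a minimal sketch of a probabilistic neural network in its textbook form: each class score is the average of Gaussian kernels centred on that class's training samples, and `sigma` is the smoothing parameter. The function name, class labels and feature values are assumptions for illustration; the S-Transform feature extraction and the GA search described in the record are not shown.

```python
import math

def pnn_classify(x, train, sigma):
    """Return the label whose Gaussian-kernel density estimate at x is largest.

    train maps each class label to a list of feature vectors;
    sigma is the PNN smoothing parameter (kernel width).
    """
    scores = {}
    for label, samples in train.items():
        total = 0.0
        for s in samples:
            d2 = sum((a - b) ** 2 for a, b in zip(x, s))
            total += math.exp(-d2 / (2.0 * sigma ** 2))
        scores[label] = total / len(samples)
    return max(scores, key=scores.get)

# Hypothetical two-feature training set with two disturbance classes.
train = {
    "sag": [(0.0, 0.0), (0.1, 0.2)],
    "swell": [(1.0, 1.0), (0.9, 1.1)],
}
print(pnn_classify((0.2, 0.1), train, sigma=0.3))
```

    An optimiser such as a GA or a grid search would wrap this function and score candidate `sigma` values on held-out data.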

  19. Carcinoma de mama: novos conceitos na classificação Breast cancer: new concepts in classification

    Directory of Open Access Journals (Sweden)

    Daniella Serafin Couto Vieira

    2008-01-01

    Full Text Available Breast carcinoma is the most common malignant neoplasm in women. Molecular studies of breast carcinoma, based on gene expression profiling by cDNA microarray, have defined at least five distinct subgroups: luminal A, luminal B, HER2 overexpression, basal, and normal breast-like. The tissue microarray (TMA) technique, first described in 1998, has made it possible to study the protein expression profiles of different neoplasms across multiple carcinoma samples. In breast carcinoma, TMAs have been used to validate the findings of preliminary studies, thereby identifying new phenotypic subtypes of breast carcinoma. Among the classically described subtypes, the basal group is one of the most intriguing tumor subtypes and is frequently associated with a worse prognosis and the absence of defined therapeutic targets. The histopathological classification of breast carcinoma has poor predictive value. Therefore, combining histological diagnosis with molecular techniques in anatomic pathology laboratories, by means of immunohistochemical study, can determine the molecular profile of breast carcinoma, with the aim of improving therapeutic response. This study aimed to summarize the most recent knowledge underlying the new concepts in breast carcinoma classification.

  20. Investigation into Text Classification With Kernel Based Schemes

    Science.gov (United States)

    2010-03-01

    classification/categorization applications. The text database considered in this study was collected from the IEEE Xplore database website [2]. The...database considered in this study was collected from the IEEE Xplore database website [2]. The documents collected were limited to Electrical engineering...Linear Discriminant Analysis (LDA) scheme. Titles, along with abstracts from IEEE journal articles published between 1990 and 1999 with specific key

  1. Image Analysis and Classification Based on Soil Strength

    Science.gov (United States)

    2016-08-01

    Impact Hammer, which is light, easy to operate, and cost effective. The Clegg Impact Hammer measures stiffness of the soil surface by dropping a...WorldView-2 multispectral satellite imagery. This paper presents the work done on the imagery classification for soil strength, the apparent...landing zone ............... 6 4 CRREL research technician, Jesse Stanley, taking Clegg measurements at a test location in San Miguelito

  2. A differentiation-based phylogeny of cancer subtypes.

    Directory of Open Access Journals (Sweden)

    Markus Riester

    2010-05-01

    Full Text Available Histopathological classification of human tumors relies in part on the degree of differentiation of the tumor sample. To date, there is no objective systematic method to categorize tumor subtypes by maturation. In this paper, we introduce a novel computational algorithm to rank tumor subtypes according to the dissimilarity of their gene expression from that of stem cells and fully differentiated tissue, and thereby construct a phylogenetic tree of cancer. We validate our methodology with expression data of leukemia, breast cancer and liposarcoma subtypes and then apply it to a broader group of sarcomas. This ranking of tumor subtypes resulting from the application of our methodology allows the identification of genes correlated with differentiation and may help to identify novel therapeutic targets. Our algorithm represents the first phylogeny-based tool to analyze the differentiation status of human tumors.

  3. Instrument classification in polyphonic music based on timbre analysis

    Science.gov (United States)

    Zhang, Tong

    2001-07-01

    While most previous work on musical instrument recognition is focused on the classification of single notes in monophonic music, a scheme is proposed in this paper for the distinction of instruments in continuous music pieces which may contain one or more kinds of instruments. Highlights of the system include music segmentation into notes, harmonic partial estimation in polyphonic sound, note feature calculation and normalization, note classification using a set of neural networks, and music piece categorization with fuzzy logic principles. Example outputs of the system are "the music piece is 100% guitar (with 90% likelihood)" and "the music piece is 60% violin and 40% piano, thus a violin/piano duet". The system has been tested with twelve kinds of musical instruments, and very promising experimental results have been obtained. An accuracy of about 80% is achieved, and the number can be raised to 90% if misindexings within the same instrument family are tolerated (e.g. cello, viola and violin). A demonstration system for musical instrument classification and music timbre retrieval is also presented.

  4. Three-Class Mammogram Classification Based on Descriptive CNN Features

    Science.gov (United States)

    Zhang, Qianni; Jadoon, Adeel

    2017-01-01

    In this paper, a novel classification technique for large data set of mammograms using a deep learning method is proposed. The proposed model targets a three-class classification study (normal, malignant, and benign cases). In our model we have presented two methods, namely, convolutional neural network-discrete wavelet (CNN-DW) and convolutional neural network-curvelet transform (CNN-CT). An augmented data set is generated by using mammogram patches. To enhance the contrast of mammogram images, the data set is filtered by contrast limited adaptive histogram equalization (CLAHE). In the CNN-DW method, enhanced mammogram images are decomposed as its four subbands by means of two-dimensional discrete wavelet transform (2D-DWT), while in the second method discrete curvelet transform (DCT) is used. In both methods, dense scale invariant feature (DSIFT) for all subbands is extracted. Input data matrix containing these subband features of all the mammogram patches is created that is processed as input to convolutional neural network (CNN). Softmax layer and support vector machine (SVM) layer are used to train CNN for classification. Proposed methods have been compared with existing methods in terms of accuracy rate, error rate, and various validation assessment measures. CNN-DW and CNN-CT have achieved accuracy rate of 81.83% and 83.74%, respectively. Simulation results clearly validate the significance and impact of our proposed model as compared to other well-known existing techniques. PMID:28191461
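
    The decomposition of an image into four subbands by a one-level 2D-DWT can be sketched as follows. The record does not state which wavelet was used, so the Haar wavelet is assumed here because it is the simplest; the function name and the subband names are also illustrative, and subband naming conventions (LH versus HL) differ between libraries.

```python
def haar_dwt2(image):
    """One-level 2D Haar transform of an image given as a list of equal-length rows.

    Both dimensions must be even. Returns four half-size subbands:
    approx (low-pass), row_detail (top minus bottom), col_detail (left minus
    right) and diag_detail, using the orthonormal scaling factor 1/2.
    """
    approx, row_detail, col_detail, diag_detail = [], [], [], []
    for r in range(0, len(image), 2):
        a_row, r_row, c_row, d_row = [], [], [], []
        for c in range(0, len(image[0]), 2):
            tl, tr = image[r][c], image[r][c + 1]
            bl, br = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((tl + tr + bl + br) / 2.0)
            r_row.append((tl + tr - bl - br) / 2.0)
            c_row.append((tl - tr + bl - br) / 2.0)
            d_row.append((tl - tr - bl + br) / 2.0)
        approx.append(a_row)
        row_detail.append(r_row)
        col_detail.append(c_row)
        diag_detail.append(d_row)
    return approx, row_detail, col_detail, diag_detail

# Hypothetical 2x2 patch standing in for a mammogram patch.
print(haar_dwt2([[1, 2], [3, 4]]))
```

    Because the scaling is orthonormal, the sum of squared coefficients over the four subbands equals the sum of squared pixel values of the input.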

  5. Magnetic nanoparticle-based cancer therapy

    Institute of Scientific and Technical Information of China (English)

    Yu Jing; Huang Dong-Yan; Muhammad Zubair Yousaf; Hou Yang-Long; Gao Song

    2013-01-01

    Nanoparticles (NPs) with easily modified surfaces have been playing an important role in biomedicine. As cancer is one of the major causes of death, tremendous efforts have been devoted to advance the methods of cancer diagnosis and therapy. Recently, magnetic nanoparticles (MNPs) that are responsive to a magnetic field have shown great promise in cancer therapy. Compared with traditional cancer therapy, magnetic field triggered therapeutic approaches can treat cancer in an unconventional but more effective and safer way. In this review, we will discuss the recent progress in cancer therapies based on MNPs, mainly including magnetic hyperthermia, magnetic specific targeting, magnetically controlled drug delivery, magnetofection, and magnetic switches for controlling cell fate. Some recently developed strategies such as magnetic resonance imaging (MRI) monitoring cancer therapy and magnetic tissue engineering are also addressed.

  6. Ship Classification with High Resolution TerraSAR-X Imagery Based on Analytic Hierarchy Process

    Directory of Open Access Journals (Sweden)

    Zhi Zhao

    2013-01-01

    Full Text Available Ship surveillance using space-borne synthetic aperture radar (SAR), taking advantage of high resolution over wide swaths and all-weather working capability, has attracted worldwide attention. Recent activity in this field has concentrated mainly on the study of ship detection, but the classification is largely still open. In this paper, we propose a novel ship classification scheme based on analytic hierarchy process (AHP) in order to achieve better performance. The main idea is to apply AHP on both feature selection and classification decision. On one hand, the AHP based feature selection constructs a selection decision problem based on several feature evaluation measures (e.g., discriminability, stability, and information measure) and provides objective criteria to make comprehensive decisions for their combinations quantitatively. On the other hand, we take the selected feature sets as the input of KNN classifiers and fuse the multiple classification results based on AHP, in which the feature sets’ confidence is taken into account when the AHP based classification decision is made. We analyze the proposed classification scheme and demonstrate its results on a ship dataset that comes from TerraSAR-X SAR images.
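
    As a rough illustration of how AHP turns pairwise judgements into weights, the sketch below derives priority weights from a reciprocal comparison matrix using the row geometric mean, a common approximation to the principal-eigenvector weights. The comparison values are invented for illustration and are not taken from the paper; only the three criterion names come from the abstract.

```python
import math

def ahp_weights(matrix):
    """Priority weights from a reciprocal pairwise comparison matrix.

    Uses the row geometric-mean method and normalises the result to sum to 1.
    """
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]

# Hypothetical judgements: entry [i][j] says how much more important
# criterion i is than criterion j on the usual 1-9 AHP scale.
criteria = ["discriminability", "stability", "information measure"]
comparisons = [
    [1.0, 3.0, 5.0],
    [1.0 / 3.0, 1.0, 2.0],
    [1.0 / 5.0, 1.0 / 2.0, 1.0],
]
for name, weight in zip(criteria, ahp_weights(comparisons)):
    print(name, round(weight, 3))
```

    For a perfectly consistent matrix the geometric-mean weights coincide with the eigenvector weights; for inconsistent judgements the two can differ slightly.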

  7. A New Classification Analysis of Customer Requirement Information Based on Quantitative Standardization for Product Configuration

    Directory of Open Access Journals (Sweden)

    Zheng Xiao

    2016-01-01

    Full Text Available Traditional methods used for the classification of customer requirement information are typically based on specific indicators, hierarchical structures, and data formats and involve a qualitative analysis in terms of stationary patterns. Because these methods neither consider the scalability of classification results nor do they regard subsequent application to product configuration, their classification becomes an isolated operation. However, the transformation of customer requirement information into quantifiable values would lead to a dynamic classification according to specific conditions and would enable an association with product configuration in an enterprise. This paper introduces a classification analysis based on quantitative standardization, which focuses on (i) expressing customer requirement information mathematically and (ii) classifying customer requirement information for product configuration purposes. Our classification analysis treated customer requirement information as follows: first, it was transformed into standardized values using mathematics, subsequent to which it was classified through calculating the dissimilarity with general customer requirement information related to the product family. Finally, a case study was used to demonstrate and validate the feasibility and effectiveness of the classification analysis.

  8. Land Cover Classification from Full-Waveform LIDAR Data Based on Support Vector Machines

    Science.gov (United States)

    Zhou, M.; Li, C. R.; Ma, L.; Guan, H. C.

    2016-06-01

    In this study, a land cover classification method based on multi-class Support Vector Machines (SVM) is presented to predict the types of land cover in Miyun area. The obtained backscattered full-waveforms were processed following a workflow of waveform pre-processing, waveform decomposition and feature extraction. The extracted features, which consist of distance, intensity, Full Width at Half Maximum (FWHM) and back scattering cross-section, were corrected and used as attributes for training data to generate the SVM prediction model. The SVM prediction model was applied to predict the types of land cover in Miyun area as ground, trees, buildings and farmland. The classification results of these four types of land covers were obtained based on the ground truth information according to the CCD image data of Miyun area. It showed that the proposed classification algorithm achieved an overall classification accuracy of 90.63%. In order to better explain the SVM classification results, the classification results of SVM method were compared with that of Artificial Neural Networks (ANNs) method and it showed that SVM method could achieve better classification results.

  9. LAND COVER CLASSIFICATION FROM FULL-WAVEFORM LIDAR DATA BASED ON SUPPORT VECTOR MACHINES

    Directory of Open Access Journals (Sweden)

    M. Zhou

    2016-06-01

    Full Text Available In this study, a land cover classification method based on multi-class Support Vector Machines (SVM) is presented to predict the types of land cover in Miyun area. The obtained backscattered full-waveforms were processed following a workflow of waveform pre-processing, waveform decomposition and feature extraction. The extracted features, which consist of distance, intensity, Full Width at Half Maximum (FWHM) and back scattering cross-section, were corrected and used as attributes for training data to generate the SVM prediction model. The SVM prediction model was applied to predict the types of land cover in Miyun area as ground, trees, buildings and farmland. The classification results of these four types of land covers were obtained based on the ground truth information according to the CCD image data of Miyun area. It showed that the proposed classification algorithm achieved an overall classification accuracy of 90.63%. In order to better explain the SVM classification results, the classification results of SVM method were compared with that of Artificial Neural Networks (ANNs) method and it showed that SVM method could achieve better classification results.

  10. Virtual images inspired consolidate collaborative representation-based classification method for face recognition

    Science.gov (United States)

    Liu, Shigang; Zhang, Xinxin; Peng, Yali; Cao, Han

    2016-07-01

    The collaborative representation-based classification method performs well in the classification of high-dimensional images, as in face recognition. It utilizes training samples from all classes to represent a test sample and assigns a class label to the test sample using the representation residuals. However, when applied to image classification, this method still suffers from the problem that a limited number of training samples reduces the classification accuracy. In this paper, we propose a modified collaborative representation-based classification method (MCRC), which exploits novel virtual images and can obtain high classification accuracy. The procedure to produce virtual images is very simple, but their use can bring a surprising performance improvement. The virtual images can sufficiently denote the features of the original face images in some cases. Extensive experimental results demonstrate that the proposed method can effectively improve the classification accuracy. This is mainly attributed to the integration of the collaborative representation and the proposed feature-information dominated virtual images.

  11. Next-Generation Sequencing in Oncology: Genetic Diagnosis, Risk Prediction and Cancer Classification

    Directory of Open Access Journals (Sweden)

    Rick Kamps

    2017-01-01

    Full Text Available Next-generation sequencing (NGS) technology has expanded in the last decades with significant improvements in the reliability, sequencing chemistry, pipeline analyses, data interpretation and costs. Such advances make the use of NGS feasible in clinical practice today. This review describes the recent technological developments in NGS applied to the field of oncology. A number of clinical applications are reviewed, i.e., mutation detection in inherited cancer syndromes based on DNA-sequencing, detection of spliceogenic variants based on RNA-sequencing, DNA-sequencing to identify risk modifiers and application for pre-implantation genetic diagnosis, cancer somatic mutation analysis, pharmacogenetics and liquid biopsy. Conclusive remarks, clinical limitations, implications and ethical considerations that relate to the different applications are provided.

  12. A Neuro-Fuzzy based System for Classification of Natural Textures

    Science.gov (United States)

    Jiji, G. Wiselin

    2016-12-01

    A statistical approach based on the coordinated clusters representation of images is used for classification and recognition of textured images. In this paper, two issues are being addressed; one is the extraction of texture features from the fuzzy texture spectrum in the chromatic and achromatic domains from each colour component histogram of natural texture images and the second issue is the concept of a fusion of multiple classifiers. The implementation of an advanced neuro-fuzzy learning scheme has been also adopted in this paper. The results of classification tests show the high performance of the proposed method that may have industrial application for texture classification, when compared with other works.

  13. An Analysis of Social Class Classification Based on Linguistic Variables

    Institute of Scientific and Technical Information of China (English)

    QU Xia-sha

    2016-01-01

    Since language is an influential tool in social interaction, the relationship between speech and social factors such as social class, gender, and even age is worth studying. People employ different linguistic variables to signal their social class, status and identity in social interaction. Linguistic variation thus involves vocabulary, sounds, grammatical constructions, dialects and so on. As a result, the classification of social class draws people’s attention. Linguistic variables in speech interactions indicate the social relationship between people. This paper attempts to illustrate three main linguistic variables which influence the social class, and argues that further sociolinguistic studies need to pay more attention to them.

  14. Gaussian Mixture Model and Deep Neural Network based Vehicle Detection and Classification

    Directory of Open Access Journals (Sweden)

    S Sri Harsha

    2016-09-01

    Full Text Available The exponential rise in the demand for vision based traffic surveillance systems has motivated academia-industries to develop an optimal vehicle detection and classification scheme. In this paper, an adaptive learning rate based Gaussian mixture model (GMM) algorithm has been developed for background subtraction of multilane traffic data. Here, vehicle rear information and road dash-markings have been used for vehicle detection. After background subtraction, connected component analysis has been applied to retrieve the vehicle region. A multilayered AlexNet deep neural network (DNN) has been applied to extract higher layer features. Furthermore, scale invariant feature transform (SIFT) based vehicle feature extraction has been performed. The extracted 4096-dimensional features have been processed for dimensional reduction using principal component analysis (PCA) and linear discriminant analysis (LDA). The features have been mapped for SVM-based classification. The classification results have exhibited that AlexNet-FC6 features with LDA give the accuracy of 97.80%, followed by AlexNet-FC6 with PCA (96.75%). AlexNet-FC7 features with LDA and PCA algorithms have exhibited classification accuracy of 91.40% and 96.30%, respectively. On the contrary, SIFT features with the LDA algorithm have exhibited 96.46% classification accuracy. The results revealed that enhanced GMM with AlexNet DNN at FC6 and FC7 can be significant for optimal vehicle detection and classification.

  15. Computer vision-based limestone rock-type classification using probabilistic neural network

    Institute of Scientific and Technical Information of China (English)

    Ashok Kumar Patel; Snehamoy Chatterjee

    2016-01-01

    Proper quality planning of limestone raw materials is an essential job of maintaining desired feed in cement plant. Rock-type identification is an integrated part of quality planning for limestone mine. In this paper, a computer vision-based rock-type classification algorithm is proposed for fast and reliable identification without human intervention. A laboratory scale vision-based model was developed using probabilistic neural network (PNN) where color histogram features are used as input. The color image histogram-based features that include weighted mean, skewness and kurtosis features are extracted for all three color components: red, green, and blue. A total of nine features are used as input for the PNN classification model. The smoothing parameter for the PNN model is selected judiciously to develop an optimal or close to the optimum classification model. The developed PNN is validated using the test data set and results reveal that the proposed vision-based model can perform satisfactorily for classifying limestone rock-types. Overall the error of mis-classification is below 6%. When compared with other three classification algorithms, it is observed that the proposed method performs substantially better than all three classification algorithms.

  16. Computer vision-based limestone rock-type classification using probabilistic neural network

    Directory of Open Access Journals (Sweden)

    Ashok Kumar Patel

    2016-01-01

    Full Text Available Proper quality planning of limestone raw materials is an essential job of maintaining desired feed in cement plant. Rock-type identification is an integrated part of quality planning for limestone mine. In this paper, a computer vision-based rock-type classification algorithm is proposed for fast and reliable identification without human intervention. A laboratory scale vision-based model was developed using probabilistic neural network (PNN) where color histogram features are used as input. The color image histogram-based features that include weighted mean, skewness and kurtosis features are extracted for all three color components: red, green, and blue. A total of nine features are used as input for the PNN classification model. The smoothing parameter for the PNN model is selected judiciously to develop an optimal or close to the optimum classification model. The developed PNN is validated using the test data set and results reveal that the proposed vision-based model can perform satisfactorily for classifying limestone rock-types. Overall the error of mis-classification is below 6%. When compared with other three classification algorithms, it is observed that the proposed method performs substantially better than all three classification algorithms.
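
    The nine-element feature vector described above (three histogram moments for each of three color channels) can be sketched as follows. The exact definitions used in the paper are not given in the abstract, so this assumes the usual count-weighted mean, standardised skewness and non-excess kurtosis of the intensity levels; the function names and the sample pixels are illustrative.

```python
def histogram_moments(hist):
    """Count-weighted mean, skewness and kurtosis of the levels in a histogram."""
    total = sum(hist)
    mean = sum(level * count for level, count in enumerate(hist)) / total
    var = sum(count * (level - mean) ** 2 for level, count in enumerate(hist)) / total
    if var == 0:
        return mean, 0.0, 0.0  # flat channel: higher moments undefined, report 0
    m3 = sum(count * (level - mean) ** 3 for level, count in enumerate(hist)) / total
    m4 = sum(count * (level - mean) ** 4 for level, count in enumerate(hist)) / total
    return mean, m3 / var ** 1.5, m4 / var ** 2

def rgb_histogram_features(pixels, levels=256):
    """Nine features: (mean, skewness, kurtosis) for the R, G and B channels."""
    features = []
    for channel in range(3):
        hist = [0] * levels
        for px in pixels:
            hist[px[channel]] += 1
        features.extend(histogram_moments(hist))
    return features

# Hypothetical four-pixel image given as (R, G, B) integer tuples.
pixels = [(0, 10, 5), (1, 10, 7), (1, 12, 5), (2, 12, 7)]
print(rgb_histogram_features(pixels))
```

    A vector of this kind would then be passed to the classifier as one training or test sample.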

  17. A Bayes fusion method based ensemble classification approach for Brown cloud application

    Directory of Open Access Journals (Sweden)

    M.Krishnaveni

    2014-03-01

    Classification is a recurrent task of determining a target function that maps each attribute set to one of the predefined class labels. Ensemble fusion is a classifier fusion technique that combines multiple classifiers to achieve higher classification accuracy than individual classifiers. The main objective of this paper is to combine base classifiers using the ensemble fusion methods Decision Template, Dempster-Shafer and Bayes, and to compare the accuracy of each fusion method on the brown cloud dataset. The base classifiers KNN, MLP and SVM are considered in the ensemble classification, each with four different function parameters. The experimental study shows that the Bayes fusion method achieves a better classification accuracy (95%) than Decision Template (80%) and Dempster-Shafer (85%) on a Brown Cloud image dataset.

  18. Content-based similarity for 3D model retrieval and classification

    Institute of Scientific and Technical Information of China (English)

    Ke Lü; Ning He; Jian Xue

    2009-01-01

    With the rapid development of 3D digital shape information, content-based 3D model retrieval and classification has become an important research area. This paper presents a novel 3D model retrieval and classification algorithm. For feature representation, a method combining a distance histogram and moment invariants is proposed to improve the retrieval performance. The major advantage of using a distance histogram is its invariance to the transforms of scaling, translation and rotation. Based on the premise that two similar objects should have high mutual information, the querying of 3D data should convey a great deal of information on the shape of the two objects, and so we propose a mutual information distance measurement to perform the similarity comparison of 3D objects. The proposed algorithm is tested with a 3D model retrieval and classification prototype, and the experimental evaluation demonstrates satisfactory retrieval results and classification accuracy.

  19. A Novel Imbalanced Data Classification Approach Based on Logistic Regression and Fisher Discriminant

    Directory of Open Access Journals (Sweden)

    Baofeng Shi

    2015-01-01

    We introduce an imbalanced data classification approach based on logistic regression significant discriminant and Fisher discriminant. First, a key indicator extraction model based on logistic regression significant discriminant and correlation analysis is derived to extract features for customer classification. Second, a customer scoring model is established on the basis of linear weighting using the Fisher discriminant. Then, a customer rating model in which the number of customers across all ratings follows a normal distribution is constructed. The performance of the proposed model and the classical SVM classification method are evaluated in terms of their ability to correctly classify consumers as default or nondefault customers. Empirical results using the data of 2157 customers in financial engineering suggest that the proposed approach performs better than the SVM model in dealing with imbalanced data classification. Moreover, our approach contributes to locating the qualified customers for the banks and the bond investors.

  20. A Comparison of RBF Neural Network Training Algorithms for Inertial Sensor Based Terrain Classification

    Directory of Open Access Journals (Sweden)

    Erkan Beşdok

    2009-08-01

    This paper introduces a comparison of training algorithms of radial basis function (RBF) neural networks for classification purposes. RBF networks provide effective solutions in many science and engineering fields. They are especially popular in the pattern classification and signal processing areas. Several algorithms have been proposed for training RBF networks. The Artificial Bee Colony (ABC) algorithm is a new, very simple and robust population-based optimization algorithm that is inspired by the intelligent behavior of honey bee swarms. The training performance of the ABC algorithm is compared with the Genetic algorithm, Kalman filtering algorithm and gradient descent algorithm. In the experiments, not only were well-known classification problems from the UCI repository such as the Iris, Wine and Glass datasets used, but an experimental setup was also designed and inertial sensor based terrain classification for autonomous ground vehicles was achieved. Experimental results show that the use of the ABC algorithm results in better learning than the other algorithms.

  1. Deep learning based classification of breast tumors with shear-wave elastography.

    Science.gov (United States)

    Zhang, Qi; Xiao, Yang; Dai, Wei; Suo, Jingfeng; Wang, Congzhi; Shi, Jun; Zheng, Hairong

    2016-12-01

    This study aims to build a deep learning (DL) architecture for automated extraction of learned-from-data image features from the shear-wave elastography (SWE), and to evaluate the DL architecture in differentiation between benign and malignant breast tumors. We construct a two-layer DL architecture for SWE feature extraction, comprised of the point-wise gated Boltzmann machine (PGBM) and the restricted Boltzmann machine (RBM). The PGBM contains task-relevant and task-irrelevant hidden units, and the task-relevant units are connected to the RBM. Experimental evaluation was performed with five-fold cross validation on a set of 227 SWE images, 135 of benign tumors and 92 of malignant tumors, from 121 patients. The features learned with our DL architecture were compared with the statistical features quantifying image intensity and texture. Results showed that the DL features achieved better classification performance with an accuracy of 93.4%, a sensitivity of 88.6%, a specificity of 97.1%, and an area under the receiver operating characteristic curve of 0.947. The DL-based method integrates feature learning with feature selection on SWE. It may be potentially used in clinical computer-aided diagnosis of breast cancer.
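
    The accuracy, sensitivity and specificity figures quoted in this abstract follow from the standard confusion-matrix definitions. The sketch below shows those definitions only; the counts in the usage line are hypothetical and are not taken from the study, and the function name is illustrative.

```python
def binary_metrics(tp, fn, tn, fp):
    """Standard binary classification metrics.

    tp: malignant correctly flagged, fn: malignant missed,
    tn: benign correctly cleared, fp: benign wrongly flagged.
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }


# Hypothetical counts, for illustration only.
example = binary_metrics(tp=8, fn=2, tn=9, fp=1)
```

    With cross-validation, as used in the study, these quantities would normally be computed per fold and then averaged, which is why reported percentages need not correspond to whole-number counts.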

  2. Accurate crop classification using hierarchical genetic fuzzy rule-based systems

    Science.gov (United States)

    Topaloglou, Charalampos A.; Mylonas, Stelios K.; Stavrakoudis, Dimitris G.; Mastorocostas, Paris A.; Theocharis, John B.

    2014-10-01

    This paper investigates the effectiveness of an advanced classification system for accurate crop classification using very high resolution (VHR) satellite imagery. Specifically, a recently proposed genetic fuzzy rule-based classification system (GFRBCS) is employed, namely, the Hierarchical Rule-based Linguistic Classifier (HiRLiC). HiRLiC's model comprises a small set of simple IF-THEN fuzzy rules, easily interpretable by humans. One of its most important attributes is that its learning algorithm requires minimum user interaction, since the most important learning parameters affecting the classification accuracy are determined by the learning algorithm automatically. HiRLiC is applied in a challenging crop classification task, using a SPOT5 satellite image over an intensively cultivated area in a lake-wetland ecosystem in northern Greece. A rich set of higher-order spectral and textural features is derived from the initial bands of the (pan-sharpened) image, resulting in an input space comprising 119 features. The experimental analysis proves that HiRLiC compares favorably to other interpretable classifiers of the literature, both in terms of structural complexity and classification accuracy. Its testing accuracy was very close to that obtained by complex state-of-the-art classification systems, such as the support vector machines (SVM) and random forest (RF) classifiers. Nevertheless, visual inspection of the derived classification maps shows that HiRLiC is characterized by higher generalization properties, providing more homogeneous classifications than the competitors. Moreover, the runtime required to produce the thematic map was orders of magnitude lower than that of the competitors.

  3. Fractal classification and natural classification of coal pore structure based on migration of coal bed methane

    Institute of Scientific and Technical Information of China (English)

    FU Xuehai; QIN Yong; ZHANG Wanhong; WEI Chongtao; ZHOU Rongfu

    2005-01-01

    According to the data of 146 coal samples measured by mercury penetration, coal pores are classified into two levels of <65 nm diffusion pore and >65 nm seeping pore by fractal method based on the characteristics of diffusion, seepage of coal bed methane (CBM) and on the research results of specific pore volume and pore structure. The diffusion pores are further divided into three categories: <8 nm micropore, 8-20 nm transitional pore, and 20-65 nm minipore based on the relationship between increment of specific surface area and diameter of pores, while seepage pores are further divided into three categories: 65-325 nm mesopore, 325-1000 nm transitional pore, and >1000 nm macropore based on the abrupt change in the increment of specific pore volume.
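
    The two-level scheme in this abstract amounts to a lookup on pore diameter. A minimal sketch is given below; the thresholds come from the abstract, but the function name and the assignment of exact boundary values (for example, whether 65 nm counts as diffusion or seepage) are assumptions, since the abstract does not state them.

```python
def classify_pore(diameter_nm):
    """Map a pore diameter in nm to (level, category).

    Boundary values are assigned to the larger class, except 1000 nm,
    which is kept in the 325-1000 nm class (assumed conventions).
    """
    if diameter_nm < 8:
        return ("diffusion", "micropore")
    if diameter_nm < 20:
        return ("diffusion", "transitional pore")
    if diameter_nm < 65:
        return ("diffusion", "minipore")
    if diameter_nm < 325:
        return ("seepage", "mesopore")
    if diameter_nm <= 1000:
        return ("seepage", "transitional pore")
    return ("seepage", "macropore")
```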

  4. Management of patients with sphincter of Oddi dysfunction based on a new classification

    Institute of Scientific and Technical Information of China (English)

    Jia-Qing Gong; Jian-Dong Ren; Fu-Zhou Tian; Rui Jiang; Li-Jun Tang; Yong Pang

    2011-01-01

    AIM: To propose a new classification system for sphincter of Oddi dysfunction (SOD) based on clinical data of patients. METHODS: The clinical data of 305 SOD patients documented over the past decade at our center were analyzed retrospectively, and typical cases were reported. CONCLUSION: The newly proposed SOD classification system introduced in this study better explains the clinical symptoms of SOD from the anatomical perspective and can guide clinical treatment of this disease.

  5. On the rate of convergence for multi-category classification based on convex losses

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    Multi-category classification algorithms play an important role in both the theory and practice of machine learning. In this paper, we consider an approach to multi-category classification based on minimizing a convex surrogate of the nonstandard misclassification loss. We bound the excess misclassification error by the excess convex risk. We construct an adaptive procedure to search for the classifier and furthermore obtain its convergence rate to the Bayes rule.

  6. Analysis on Systematic Water Scarcity Based on Establishment of Water Scarcity Classification System

    Institute of Scientific and Technical Information of China (English)

    2011-01-01

    Analysis of systematic water scarcity would be very helpful for devising countermeasures against complex water scarcity. Based on previous research on water scarcity classification, a classification system of water scarcity was established according to contributing factors, comprising three water scarcity categories caused by anthropic factors, natural factors and mixed factors, respectively. Accordingly, the concept of systematic water scarcity was proposed, which can be defined as one type of water...

  7. Wavelet-based texture image classification using vector quantization

    Science.gov (United States)

    Lam, Eric P.

    2007-02-01

    Classification of image segments on textures can be helpful for target recognition. Sometimes target cueing is performed before target recognition. Textures are sometimes used to cue an image processor of a potential region of interest. In certain imaging sensors, such as those used in synthetic aperture radar, textures may be abundant. The textures may be caused by the object material or speckle noise. Even speckle noise can create the illusion of texture, which must be compensated for in image pre-processing. In this paper, we will discuss how to perform texture classification while constraining the number of wavelet packet node decompositions. The new approach performs a two-channel wavelet decomposition. Comparing the strength of each new subband with others at the same level of the wavelet packet determines when to stop further decomposition. This type of decomposition is performed recursively. Once the decompositions stop, the structure of the packet is stored in a data structure. Using the information from the data structure, dominating channels are extracted. These are defined as paths from the root of the packet to the leaf with the highest strengths. The list of dominating channels is used to train a learning vector quantization neural network.

  8. Maximum likelihood based classification of electron tomographic data.

    Science.gov (United States)

    Stölken, Michael; Beck, Florian; Haller, Thomas; Hegerl, Reiner; Gutsche, Irina; Carazo, Jose-Maria; Baumeister, Wolfgang; Scheres, Sjors H W; Nickell, Stephan

    2011-01-01

    Classification and averaging of sub-tomograms can improve the fidelity and resolution of structures obtained by electron tomography. Here we present a three-dimensional (3D) maximum likelihood algorithm--MLTOMO--which is characterized by integrating 3D alignment and classification into a single, unified processing step. The novelty of our approach lies in the way we calculate the probability of observing an individual sub-tomogram for a given reference structure. We assume that the reference structure is affected by a 'compound wedge', resulting from the summation of many individual missing wedges in distinct orientations. The distance metric underlying our probability calculations effectively down-weights Fourier components that are observed less frequently. Simulations demonstrate that MLTOMO clearly outperforms the 'constrained correlation' approach and has advantages over existing approaches in cases where the sub-tomograms adopt preferred orientations. Application of our approach to cryo-electron tomographic data of ice-embedded thermosomes revealed distinct conformations that are in good agreement with results obtained by previous single particle studies.

  9. Multi-Frequency Polarimetric SAR Classification Based on Riemannian Manifold and Simultaneous Sparse Representation

    Directory of Open Access Journals (Sweden)

    Fan Yang

    2015-07-01

    Normally, polarimetric SAR classification is a high-dimensional nonlinear mapping problem. In the realm of pattern recognition, sparse representation is a very efficacious and powerful approach. As classical descriptors of polarimetric SAR, covariance and coherency matrices are Hermitian semidefinite and form a Riemannian manifold. Conventional Euclidean metrics are not suitable for a Riemannian manifold, and hence, normal sparse representation classification cannot be applied to polarimetric SAR directly. This paper proposes a new land cover classification approach for polarimetric SAR. There are two principal novelties in this paper. First, a Stein kernel on a Riemannian manifold instead of Euclidean metrics, combined with sparse representation, is employed for polarimetric SAR land cover classification. This approach is named Stein-sparse representation-based classification (SRC). Second, using simultaneous sparse representation and reasonable assumptions of the correlation of representation among different frequency bands, Stein-SRC is generalized to simultaneous Stein-SRC for multi-frequency polarimetric SAR classification. These classifiers are assessed using polarimetric SAR images from the Airborne Synthetic Aperture Radar (AIRSAR) sensor of the Jet Propulsion Laboratory (JPL) and the Electromagnetics Institute Synthetic Aperture Radar (EMISAR) sensor of the Technical University of Denmark (DTU). Experiments on single-band and multi-band data both show that these approaches acquire more accurate classification results in comparison to many conventional and advanced classifiers.

  10. A new structure-based classification of gram-positive bacteriocins.

    Science.gov (United States)

    Zouhir, Abdelmajid; Hammami, Riadh; Fliss, Ismail; Hamida, Jeannette Ben

    2010-08-01

    Bacteriocins are ribosomally-synthesized peptides or proteins produced by a wide range of bacteria. The antimicrobial activity of this group of natural substances against foodborne pathogenic and spoilage bacteria has raised considerable interest for their application in food preservation. Classifying these bacteriocins in well defined classes according to their biochemical properties is a major step towards characterizing these anti-infective peptides and understanding their mode of action. Currently, the criteria chosen for bacteriocin classification lack consistency and coherence. As a result, the various classification schemes of bacteriocins show various levels of contradiction and sorting inefficiencies, leading to bacteriocins belonging to more than one class at the same time and to a general lack of classification of many bacteriocins. Establishing a coherent and adequate classification scheme for these bacteriocins is sought after by several researchers in the field. It is not straightforward to formulate an efficient classification scheme that encompasses all of the existing bacteriocins. In the light of the structural data, here we revisit the previously proposed contradictory classification and we define new structure-based sequence fingerprints that support a subdivision of the bacteriocins into 12 groups. The paper lays down a resourceful and consistent classification approach that resulted in classifying more than 70% of bacteriocins known to date and has the potential to identify distinct classes for the remaining unclassified bacteriocins. Identified groups are characterized by the presence of highly conserved short amino acid motifs. Furthermore, unclassified bacteriocins are expected to form identified groups once sufficient sequences are available.

  11. A multiple-point spatially weighted k-NN method for object-based classification

    Science.gov (United States)

    Tang, Yunwei; Jing, Linhai; Li, Hui; Atkinson, Peter M.

    2016-10-01

    Object-based classification, commonly referred to as object-based image analysis (OBIA), is now commonly regarded as able to produce more appealing classification maps, often of greater accuracy, than pixel-based classification and its application is now widespread. Therefore, improvement of OBIA using spatial techniques is of great interest. In this paper, multiple-point statistics (MPS) is proposed for object-based classification enhancement in the form of a new multiple-point k-nearest neighbour (k-NN) classification method (MPk-NN). The proposed method first utilises a training image derived from a pre-classified map to characterise the spatial correlation between multiple points of land cover classes. The MPS borrows spatial structures from other parts of the training image, and then incorporates this spatial information, in the form of multiple-point probabilities, into the k-NN classifier. Two satellite sensor images with a fine spatial resolution were selected to evaluate the new method. One is an IKONOS image of the Beijing urban area and the other is a WorldView-2 image of the Wolong mountainous area, in China. The images were object-based classified using the MPk-NN method and several alternatives, including the k-NN, the geostatistically weighted k-NN, the Bayesian method, the decision tree classifier (DTC), and the support vector machine classifier (SVM). It was demonstrated that the new spatial weighting based on MPS can achieve greater classification accuracy relative to the alternatives and it is, thus, recommended as appropriate for object-based classification.

  12. Non-target adjacent stimuli classification improves performance of classical ERP-based brain computer interface

    Science.gov (United States)

    Ceballos, G. A.; Hernández, L. F.

    2015-04-01

    Objective. The classical ERP-based speller, or P300 Speller, is one of the most commonly used paradigms in the field of Brain Computer Interfaces (BCI). Several alterations to the visual stimuli presentation system have been developed to avoid unfavorable effects elicited by adjacent stimuli. However, there has been little, if any, regard to useful information contained in responses to adjacent stimuli about spatial location of target symbols. This paper aims to demonstrate that combining the classification of non-target adjacent stimuli with standard classification (target versus non-target) significantly improves classical ERP-based speller efficiency. Approach. Four SWLDA classifiers were trained and combined with the standard classifier: the lower row, upper row, right column and left column classifiers. This new feature extraction procedure and the classification method were carried out on three open databases: the UAM P300 database (Universidad Autonoma Metropolitana, Mexico), BCI competition II (dataset IIb) and BCI competition III (dataset II). Main results. The inclusion of the classification of non-target adjacent stimuli improves target classification in the classical row/column paradigm. A gain in mean single trial classification of 9.6% and an overall improvement of 25% in simulated spelling speed was achieved. Significance. We have provided further evidence that the ERPs produced by adjacent stimuli present discriminable features, which could provide additional information about the spatial location of intended symbols. This work promotes the searching of information on the peripheral stimulation responses to improve the performance of emerging visual ERP-based spellers.

  13. The study of a patient's immune system may prove to be a useful noninvasive tool for stage classification in colon cancer.

    Science.gov (United States)

    Pellegrini, Patrizia; Berghella, Anna Maria; Contasta, Ida; Del Beato, Tiziana; Adorno, Domenico

    2006-10-01

    Therapy, and, therefore, prognosis, is strictly related to cancer stage, and hence, screening tests that can contribute to the early classification of disease stage represent a step forward in treatment. Unfortunately, few prognostic indices are available, especially noninvasive ones. Our study of the physiological network of the immune response, however, leads us to believe that it may well be possible to define immunological indices for the classification of cancer stage using blood parameters. In this paper, we show how the study of a patient's immune system can be used as a noninvasive tool for early-stage classification.

  14. Effect of World Health Organization (WHO) Histological Classification on Predicting Lymph Node Metastasis and Recurrence in Early Gastric Cancer

    Science.gov (United States)

    Lai, Ji Fu; Xu, Wen Na; Noh, Sung Hoon; Lu, Wei Qin

    2016-01-01

    Background: The World Health Organization (WHO) histological classification for gastric cancer is widely accepted and used. However, its impact on predicting lymph node metastasis and recurrence in early gastric cancer (EGC) is not well studied. Material/Methods: From 1987 to 2005, 2873 EGC patients with known WHO histological type who had undergone curative resection were enrolled in this study. In all, 637 well-differentiated adenocarcinomas (WD), 802 moderately-differentiated adenocarcinomas (MD), 689 poorly-differentiated adenocarcinomas (PD), and 745 signet-ring cell adenocarcinomas (SRC) were identified. Results: The distribution of demographic and clinical features in early gastric cancer among WD, MD, PD, and SRC were significantly different. Lymph node metastasis was observed in 317 patients (11.0%), with the lymph node metastasis rate being 5.3%, 14.8%, 17.0%, and 6.3% in WD, MD, PD, and SRC, respectively. Univariate and multivariate analyses indicated that gender, tumor size, gross appearance, depth of invasion, and WHO classification were significantly associated with lymph node metastasis. Recurrence was observed in 83 patients (2.9%), with the recurrence rate being 2.2%, 4.5%, 3.0%, and 1.6% in WD, MD, PD, and SRC, respectively. Multivariate analysis confirmed that MD, elevated gross type, and lymph node metastasis were independent risk factors for recurrence in EGC. MD patients showed worse disease-free survival than non-MD patients (P=0.001). Conclusions: WHO classification is useful and necessary to evaluate during the perioperative management of EGC. Treatment strategies for EGC should be made prudently according to WHO classification, especially for MD patients. PMID: 27595490
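
    The headline figures in this abstract are internally consistent, which a few lines of arithmetic can confirm. The snippet below uses only the counts stated in the abstract; the variable names are illustrative.

```python
# Patients per WHO histological type, as reported in the abstract.
counts = {"WD": 637, "MD": 802, "PD": 689, "SRC": 745}

# The four subtype counts should add up to the enrolled cohort.
total = sum(counts.values())

# Overall rates as percentages, rounded to one decimal place.
lnm_rate = round(100 * 317 / total, 1)         # lymph node metastasis
recurrence_rate = round(100 * 83 / total, 1)   # recurrence
```

    The subtype counts sum to the 2873 enrolled patients, and the two overall rates round to the 11.0% and 2.9% quoted in the abstract.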

  15. BRAIN TUMOR CLASSIFICATION BASED ON CLUSTERED DISCRETE COSINE TRANSFORM IN COMPRESSED DOMAIN

    Directory of Open Access Journals (Sweden)

    V. Anitha

    2014-01-01

    This study presents a novel method to classify brain tumors by means of efficient and integrated methods so as to increase the classification accuracy. In conventional systems, the problem is the same: to extract feature sets from the database and classify tumors based on those feature sets. The main idea in a plethora of earlier research on any classification method is to increase the classification accuracy. The actual need is to achieve a better accuracy in classification by extracting more relevant feature sets after dimensionality reduction. There exists a trade-off between accuracy and the number of feature sets. Hence the focus in this study is to apply the Discrete Cosine Transform (DCT) to brain tumor images of various classes. The DCT by itself offers a fair dimension reduction in feature sets. Subsequently, the K-means algorithm is applied to the DCT coefficients to cluster the feature sets. This cluster information is treated as a refined feature set and classified using a Support Vector Machine (SVM). Using the DCT in this way makes it possible to adjust the classification performance through the number of DCT coefficients taken into account. There is a strong demand for automatic classification of brain tumors, which greatly helps in the process of diagnosis. In this work, an average of 97% and a maximum of 100% classification accuracy has been achieved. This research aims to open a new way of classification in the compressed domain. Hence this study may be highly suitable for diagnosis in mobile computing and internet-based medical diagnosis.
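
    The dimension-reduction idea in this abstract, keeping only a chosen number of DCT coefficients as features, can be illustrated with a small sketch. It is a simplification under stated assumptions: it uses an unnormalized one-dimensional DCT-II on a list of samples, whereas the study works on images, and it omits the K-means and SVM stages entirely. The function names are hypothetical.

```python
import math


def dct2(signal):
    """Unnormalized 1-D DCT-II: X[k] = sum_i x[i] * cos(pi/N * (i + 0.5) * k)."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi / n * (i + 0.5) * k) for i, x in enumerate(signal))
        for k in range(n)
    ]


def dct_features(signal, keep):
    """Keep the first `keep` (lowest-frequency) coefficients as a reduced feature vector."""
    return dct2(signal)[:keep]
```

    Varying `keep` is the knob the abstract refers to: fewer coefficients give a more compact representation at the cost of discarding higher-frequency detail.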

  16. Research On The Classification Of High Resolution Image Based On Object-oriented And Class Rule

    Science.gov (United States)

    Li, C. K.; Fang, W.; Dong, X. J.

    2015-06-01

    With the development of remote sensing technology, the spatial, spectral and temporal resolution of remote sensing data has greatly improved. How to efficiently process and interpret massive high-resolution remote sensing image data of ground objects, which carry spatial geometry and texture information, has become a focus and a difficulty in remote sensing research. An object-oriented, rule-based classification method for remote sensing data is presented in this paper. By discovering and mining the rich knowledge of spectral and spatial characteristics in high-resolution remote sensing images, a multi-level network structure for image object segmentation and classification is established to achieve accurate and fast ground target classification and accuracy assessment. Taking WorldView-2 image data of the Zangnan area as the study object, the object-oriented rule-based classification method was verified experimentally: a combination of the mean variance method, the maximum area method and accuracy comparison was used to select three optimal segmentation scales, and a multi-level image object network hierarchy was established for the image classification experiments. The results show that the object-oriented rule-based method produces high-resolution image classification results similar to those of visual interpretation, with higher classification accuracy. The overall accuracy and Kappa coefficient of the object-oriented rule-based classification method were 97.38% and 0.9673, respectively; these are higher than those of the object-oriented SVM method by 6.23% and 0.078, and higher than those of the object-oriented KNN method by 7.96% and 0.0996. The extraction precision and user accuracy for buildings were higher than those of the object-oriented SVM method by 18.39% and 3.98%, respectively, and respectively better than the object-oriented KNN method 21

  17. GIS-Based Red Soil Resources Classification and Evaluation

    Institute of Scientific and Technical Information of China (English)

    HUYUEMING; WANGRENCHAO; et al.

    1999-01-01

    A small scale red soil resources information system (RSRIS) with applied mathematical models was developed and applied in red soil resources (RSR) classification and evaluation, taking Zhejiang Province, a typical distribution area of red soil, as the study area. Computer-aided overlay was conducted to classify RSR types. The evaluation was carried out by using three methods, i.e., index summation, square root of index multiplication and fuzzy comprehensive assessment, with almost identical results. The result of index summation could represent the basic qualitative condition of RSR, that of square root of index multiplication reflected the real condition of RSR qualitative rank, while fuzzy comprehensive assessment could satisfactorily handle the relationship between the evaluation factors and the qualitative rank of RSR, and therefore it is a feasible method for RSR evaluation.

  18. Data association based on target signal classification information

    Institute of Scientific and Technical Information of China (English)

    Guo Lei; Tang Bin; Liu Gang

    2008-01-01

    In most passive tracking systems, only the target kinematical information is used in the measurement-to-track association, which results in tracking errors in a multitarget environment where the targets are too close to each other. To enhance the tracking accuracy, the target signal classification information (TSCI) should be used to improve the data association. The TSCI is integrated in the data association process using the JPDA (joint probabilistic data association). The use of the TSCI in the data association can improve discrimination by yielding a purer track and preserving continuity. To verify the validity of the application of TSCI, two simulation experiments are done on an air target-tracing problem, that is, one using the TSCI and the other not using the TSCI. The final comparison shows that the use of the TSCI can effectively improve tracking accuracy.

  19. The Normalization of Citation Counts Based on Classification Systems

    Directory of Open Access Journals (Sweden)

    Andreas Barth

    2013-08-01

    If we want to assess whether the paper in question has had a particularly high or low citation impact compared to other papers, the standard practice in bibliometrics is to normalize citations in respect of the subject category and publication year. A number of proposals for an improved procedure in the normalization of citation impact have been put forward in recent years. Against the background of these proposals, this study describes an ideal solution for the normalization of citation impact: in a first step, the reference set for the publication in question is collated by means of a classification scheme, where every publication is associated with a single principal research field or subfield entry (e.g., via Chemical Abstracts sections) and a publication year. In a second step, percentiles of citation counts are calculated for this set and used to assign the normalized citation impact score to the publications (and also to the publication in question).

  20. The normalization of citation counts based on classification systems

    CERN Document Server

    Bornmann, Lutz; Barth, Andreas

    2013-01-01

    If we want to assess whether the paper in question has had a particularly high or low citation impact compared to other papers, the standard practice in bibliometrics is to normalize citations in respect of the subject category and publication year. A number of proposals for an improved procedure in the normalization of citation impact have been put forward in recent years. Against the background of these proposals, this study describes an ideal solution for the normalization of citation impact: in a first step, the reference set for the publication in question is collated by means of a classification scheme, where every publication is associated with a single principal research field or subfield entry (e.g., via Chemical Abstracts sections) and a publication year. In a second step, percentiles of citation counts are calculated for this set and used to assign the normalized citation impact score to the publications (and also to the publication in question).

  1. MEDLINE Abstracts Classification Based on Noun Phrases Extraction

    Science.gov (United States)

    Ruiz-Rico, Fernando; Vicedo, José-Luis; Rubio-Sánchez, María-Consuelo

    Many algorithms have been proposed in recent years to tackle automated text categorization. They have been studied exhaustively, leading to several variants and combinations, not only in the particular procedures but also in the treatment of the input data. A widely used approach is to represent documents as a Bag-Of-Words (BOW) and weight tokens with the TFIDF scheme. Many researchers have sought to improve precision and recall and to reduce classification time by enriching BOW with stemming, n-grams, feature selection, noun phrases, metadata, weight normalization, etc. We contribute to this field with a novel combination of these techniques. For evaluation purposes, we provide comparisons to previous works that use SVM with the simple BOW. The well-known OHSUMED corpus is used, and different sets of categories are selected, as previously done in the literature. The conclusion is that the proposed method can be successfully applied to existing binary classifiers such as SVM, outperforming the combination of BOW and TFIDF.
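    As an illustration of the baseline this abstract compares against, the following sketch computes plain BOW vectors with TF-IDF weights. It is the textbook form (term frequency times log inverse document frequency), not the authors' enriched representation; the function name and the three-document corpus are made up for the example.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {token: weight} dict per document.

    tf = raw count / document length, idf = log(N / document frequency).
    Several TF-IDF variants exist; this is the plain textbook form.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Number of documents in which each token occurs at least once.
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    weights = []
    for toks in tokenized:
        counts = Counter(toks)
        weights.append({
            tok: (cnt / len(toks)) * math.log(n_docs / doc_freq[tok])
            for tok, cnt in counts.items()
        })
    return weights

# Hypothetical mini-corpus standing in for abstracts.
corpus = [
    "heart attack risk factors",
    "heart surgery outcomes",
    "lung cancer risk",
]
vectors = tfidf(corpus)
print(vectors[0])
```

    Tokens that occur in fewer documents ("attack") receive a higher weight than tokens shared by several documents ("heart"), which is what makes the weighted vectors useful as classifier input.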

  2. Vehicle Maneuver Detection with Accelerometer-Based Classification

    Directory of Open Access Journals (Sweden)

    Javier Cervantes-Villanueva

    2016-09-01

    Full Text Available In the mobile computing era, smartphones have become instrumental tools to develop innovative mobile context-aware systems. In that sense, their usage in the vehicular domain eases the development of novel and personal transportation solutions. In this frame, the present work introduces an innovative mechanism to perceive the current kinematic state of a vehicle on the basis of the accelerometer data from a smartphone mounted in the vehicle. Unlike previous proposals, the introduced architecture targets the computational limitations of such devices to carry out the detection process following an incremental approach. For its realization, we have evaluated different classification algorithms to act as agents within the architecture. Finally, our approach has been tested with a real-world dataset collected by means of the ad hoc mobile application developed.

  3. International Classification of Functioning, Disability, and Health in women with breast cancer: a proposal for measurement instruments.

    Science.gov (United States)

    Carvalho, Flávia Nascimento de; Koifman, Rosalina Jorge; Bergmann, Anke

    2013-06-01

    The International Classification of Functioning, Disability, and Health (ICF) aims at standardization, but its applicability requires consistent instruments. In Brazil, invasive therapeutic approaches are frequent, leading to functional alterations. The current study thus aimed to identify and discuss instruments capable of measuring ICF core set codes for breast cancer. The review included ICF studies in women with breast cancer diagnosis and studies with the objective of translating and validating instruments for the Brazilian population, and consistent with the codes. Review studies, systematic or not, were excluded. Eight instruments were selected, and the WHOQOL-Bref was the most comprehensive. The use of various instruments showed 19 coinciding codes, and the instruments as a whole covered 58 of the total of 81 codes. The use of multiple instruments is time-consuming, so new studies are needed to propose parsimonious tools capable of measuring functioning in women treated for breast cancer.

  4. Improving SVDD classification performance on hyperspectral images via correlation based ensemble technique

    Science.gov (United States)

    Uslu, Faruk Sukru; Binol, Hamidullah; Ilarslan, Mustafa; Bal, Abdullah

    2017-02-01

    Support Vector Data Description (SVDD) is a powerful nonparametric method for target detection and classification. The SVDD constructs a minimum hypersphere that encloses as many of the target objects as possible. Its advantages include sparsity, good generalization, and the use of kernel machines. Many studies have proposed methods to improve the performance of the SVDD. In this paper, we present ensemble methods to improve the classification performance of the SVDD on remotely sensed hyperspectral imagery (HSI) data. Among the various ensemble approaches, we selected the bagging technique, training on different combinations of the data set. As a novel weighting technique, we propose a correlation-based assignment of weight coefficients, in which the correlation between each pair of bagged classifiers is calculated to give coefficients to the weighted combiners. To verify the performance improvement, two hyperspectral images are processed for classification. The results show that the ensemble SVDD is significantly better than the conventional SVDD in terms of classification accuracy.

  5. Hyperspectral Image Classification Based on the Combination of Spatial-spectral Feature and Sparse Representation

    Directory of Open Access Journals (Sweden)

    YANG Zhaoxia

    2015-07-01

    Full Text Available To avoid over-dependence on high-dimensional spectral features in traditional hyperspectral image classification, a novel approach based on the combination of spatial-spectral features and sparse representation is proposed in this paper. Firstly, we extract the spatial-spectral feature by reorganizing the local image patch with the first d principal components (PCs) into a vector representation, followed by a sorting scheme that makes the vector invariant to local image rotation. Secondly, we learn the dictionary through a supervised method and then use it to code the features from test samples. Finally, we feed the resulting sparse feature coding into a support vector machine (SVM) for hyperspectral image classification. Experiments on three hyperspectral data sets show that the proposed method can effectively improve the classification accuracy compared with traditional classification methods.

  6. A NEW UNSUPERVISED CLASSIFICATION ALGORITHM FOR POLARIMETRIC SAR IMAGES BASED ON FUZZY SET THEORY

    Institute of Scientific and Technical Information of China (English)

    Fu Yusheng; Xie Yan; Pi Yiming; Hou Yinming

    2006-01-01

    In this letter, a new method is proposed for unsupervised classification of terrain types and man-made objects using POLarimetric Synthetic Aperture Radar (POLSAR) data. This technique is a combination of the usage of polarimetric information of SAR images and the unsupervised classification method based on fuzzy set theory. Image quantization and image enhancement are used to preprocess the POLSAR data. Then the polarimetric information and Fuzzy C-Means (FCM) clustering algorithm are used to classify the preprocessed images. The advantages of this algorithm are the automated classification, its high classification accuracy, fast convergence and high stability. The effectiveness of this algorithm is demonstrated by experiments using SIR-C/X-SAR (Spaceborne Imaging Radar-C/X-band Synthetic Aperture Radar) data.
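    The Fuzzy C-Means step mentioned in this abstract can be sketched as follows. This is the standard FCM iteration applied to one-dimensional values, not the authors' polarimetric pipeline; the helper names, the toy intensities, and the initial centers are assumptions made for the example.

```python
def memberships_for(points, centers, m):
    """Membership matrix u[i][k] for 1-D points; each row sums to 1."""
    rows = []
    for x in points:
        dists = [abs(x - c) for c in centers]
        if min(dists) == 0.0:
            # The point sits exactly on a center: split it among those centers.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            rows.append([h / sum(hits) for h in hits])
        else:
            # Standard FCM rule: inverse distance raised to 2 / (m - 1).
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            rows.append([v / sum(inv) for v in inv])
    return rows

def fuzzy_c_means(points, centers, m=2.0, iterations=30):
    """Alternate membership and center updates for a fixed number of steps."""
    centers = list(centers)
    for _ in range(iterations):
        u = memberships_for(points, centers, m)
        for k in range(len(centers)):
            # New center: mean of the points weighted by membership ** m.
            w = [row[k] ** m for row in u]
            centers[k] = sum(wi * x for wi, x in zip(w, points)) / sum(w)
    return centers, memberships_for(points, centers, m)

# Hypothetical normalized pixel intensities forming two loose groups.
intensities = [0.1, 0.15, 0.2, 0.8, 0.85, 0.9]
centers, u = fuzzy_c_means(intensities, centers=[0.0, 1.0])
print(centers)
```

    Unlike hard clustering, every pixel keeps a graded membership in each class, so ambiguous pixels can be flagged rather than forced into one class.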

  7. Hyperspectral remote sensing image classification based on combined SVM and LDA

    Science.gov (United States)

    Zhang, Chunsen; Zheng, Yiwei

    2014-11-01

    This paper presents a novel method for hyperspectral image classification based on the minimum noise fraction (MNF) and an approach combining support vector machine (SVM) and linear discriminant analysis (LDA). A new SVM/LDA algorithm is used for the classification. First, we use the MNF method to reduce the dimension and extract features of the image, and then use the SVM/LDA algorithm to transform the extracted features. Next, we train on the transformed features and optimize the parameters through cross-validation and grid search to obtain an optimal hyperspectral image classifier. Finally, we use this classifier to complete the classification. To verify the proposed method, the AVIRIS Indian Pines image was used. The experimental results show that the proposed method can resolve the conflict between the small number of samples and the high dimensionality, and improves classification accuracy compared with the classical SVM method.

  8. Differential survival and recurrence patterns of patients operated for breast cancer according to the new immunohistochemical classification: analytical survey from 1997 to 2012.

    Science.gov (United States)

    García Fernández, Antonio; Chabrera, Carol; García Font, Marc; Fraile, Manel; Gónzalez, Sonia; Barco, Israel; González, Clarisa; Cirera, Lluís; Veloso, Enrique; Lain, José María; Pessarrodona, Antoni; Giménez, Nuria

    2013-08-01

    Breast cancer can no longer be considered only one condition. It should be regarded rather as a heterogeneous group of diseases with different molecular outlines. The aim of this study is to establish a correlation between immunohistochemical tumor sub-typing and surgical treatment, local recurrence rates, distant metastases, and cancer-specific mortality at 5 and 10 years. At least four tumor sub-types have been described, which are associated with variable risk factors, different natural clinical courses, and different responses to both local and systemic therapies. For Luminal A: ER + and/or PR + HER2- Ki67 breast tumors were included. Disease-free survival, overall mortality, and breast cancer-specific mortality at 5 and 10 years were calculated. Distant metastases prevalence ranged from 8 to 28 % across sub-types, increasing stepwise from Luminal A, Luminal B, and pure HER2 through triple negative. Conversely, larger tumors with significant axillary burden were more likely to belong to HER2 or triple negative groups. Luminal A sub-type patients showed significantly lower mortality rates, both overall and specific, at 5 and 10 years, as compared to the rest. Luminal B patients showed lower mortality rates only when compared with triple negative patients. Simple classification of breast cancer patients based on immunohistochemistry and other risk factors is quite useful to establish groups with bad or even worse prognosis. Although results from immunohistochemical classification were not taken into account for surgical procedure decision-making, we found that pure HER2 and triple negative patients nevertheless received higher rates of radical treatment.

  9. Static micro-array isolation, dynamic time series classification, capture and enumeration of spiked breast cancer cells in blood: the nanotube-CTC chip

    Science.gov (United States)

    Khosravi, Farhad; Trainor, Patrick J.; Lambert, Christopher; Kloecker, Goetz; Wickstrom, Eric; Rai, Shesh N.; Panchapakesan, Balaji

    2016-11-01

    We demonstrate the rapid and label-free capture of breast cancer cells spiked in blood using nanotube-antibody micro-arrays. 76-element single wall carbon nanotube arrays were manufactured using photo-lithography, metal deposition, and etching techniques. Anti-epithelial cell adhesion molecule (anti-EpCAM), Anti-human epithelial growth factor receptor 2 (anti-Her2) and non-specific IgG antibodies were functionalized to the surface of the nanotube devices using 1-pyrene-butanoic acid succinimidyl ester. Following device functionalization, blood spiked with SKBR3, MCF7 and MCF10A cells (100/1000 cells per 5 μl per device, 170 elements totaling 0.85 ml of whole blood) were adsorbed on to the nanotube device arrays. Electrical signatures were recorded from each device to screen the samples for differences in interaction (specific or non-specific) between samples and devices. A zone classification scheme enabled the classification of all 170 elements in a single map. A kernel-based statistical classifier for the ‘liquid biopsy’ was developed to create a predictive model based on dynamic time warping series to classify device electrical signals that corresponded to plain blood (control) or SKBR3 spiked blood (case) on anti-Her2 functionalized devices with ~90% sensitivity, and 90% specificity in capture of 1000 SKBR3 breast cancer cells in blood using anti-Her2 functionalized devices. Screened devices that gave positive electrical signatures were confirmed using optical/confocal microscopy to hold spiked cancer cells. Confocal microscopic analysis of devices that were classified to hold spiked blood based on their electrical signatures confirmed the presence of cancer cells through staining for DAPI (nuclei), cytokeratin (cancer cells) and CD45 (hematologic cells) with single cell sensitivity. We report 55%-100% cancer cell capture yield depending on the active device area for blood adsorption with mean of 62% (~12 500 captured off 20 000 spiked cells in 0.1 ml
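    The record above mentions a classifier built on dynamic time warping (DTW) of device electrical signals. A minimal sketch of the classic DTW distance is given here for orientation; it is not the authors' kernel-based classifier, and the three signal traces are invented for the example.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # repeat b[j-1]
                                    cost[i][j - 1],      # repeat a[i-1]
                                    cost[i - 1][j - 1])  # advance both
    return cost[n][m]

# Hypothetical signal traces (arbitrary units), not data from the study.
control = [0.0, 0.1, 0.0, 0.1, 0.0]
case = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]
shifted_case = [0.0, 1.0, 2.0, 2.0, 1.0, 0.0, 0.0]

print(dtw_distance(case, shifted_case))
print(dtw_distance(case, control))
```

    Because DTW allows samples to be stretched or repeated along the time axis, two traces with the same shape but different timing stay close, while a flat control trace stays far from a trace with a pronounced peak.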

  10. Molecular classification of breast cancer in Morocco [Classification moléculaire du cancer du sein au Maroc]

    Science.gov (United States)

    Fouad, Abbass; Yousra, Akasbi; Kaoutar, Znati; Omar, El Mesbahi; Afaf, Amarti; Sanae, Bennis

    2012-01-01

    Introduction: The molecular classification of breast cancers, based first on gene expression and then on protein profile, has made it possible to distinguish five molecular groups: luminal A, luminal B, Her2/neu, basal-like, and unclassified. The aim of this study, carried out at the CHU Hassan II (university hospital) in Fès, is to classify 335 invasive breast cancers into molecular groups and then to correlate these groups with clinicopathological characteristics. Methods: A retrospective study spanning 45 months, comprising 335 patients enrolled at the CHU for diagnosis and follow-up. The tumors were analyzed histologically and, after an immunohistochemical study, classified into groups: luminal A, luminal B, Her2+, basal-like, and unclassified. Results: 54.3% of the tumors belong to the luminal A group, 16% to luminal B, 11.3% to Her2+, 11.3% to basal-like, and 7% are unclassified. The luminal A group has the lowest rates of grade III, vascular emboli, and metastases, whereas the unclassified and basal-like groups show a high rate of grade III and a low proportion of vascular emboli and lymph node involvement. These factors are significantly elevated in the luminal B and Her2+ groups, with overall survival rates of 78% and 76%, respectively. In the luminal A group, overall patient survival is high (87%), whereas it is only 49% in the triple-negative group (basal-like and unclassified). Conclusion: The luminal B group differs from luminal A and carries an unfavorable prognosis relative to the Her2+ group. The clinicopathological characteristics are consistent with the molecular profile and should therefore be taken into consideration as prognostic factors. PMID:23396646

  11. Hyperspectral Image Classification Based on the Weighted Probabilistic Fusion of Multiple Spectral-spatial Features

    Directory of Open Access Journals (Sweden)

    ZHANG Chunsen

    2015-08-01

    Full Text Available A hyperspectral image classification method based on the weighted probabilistic fusion of multiple spectral-spatial features is proposed in this paper. First, the minimum noise fraction (MNF) approach is employed to reduce the dimension of the hyperspectral image and extract the spectral feature. The spectral feature is then combined with the texture feature extracted using the gray level co-occurrence matrix (GLCM), the multi-scale morphological feature extracted using the OFC operator, and the endmember feature extracted using the sequential maximum angle convex cone (SMACC) method to form three spectral-spatial features. Afterwards, a support vector machine (SVM) classifier is used to classify each spectral-spatial feature separately. Finally, a weighted probabilistic fusion model is established and applied to fuse the SVM outputs into the final classification result. To verify the proposed method, ROSIS and AVIRIS images were used in the experiment, and the overall accuracy reached 97.65% and 96.62%, respectively. The results indicate that the proposed method not only overcomes the limitations of traditional single-feature-based hyperspectral image classification but is also superior to the conventional VS-SVM method and the probabilistic fusion method, effectively improving the classification accuracy of hyperspectral images.
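    The final fusion step in this abstract can be pictured with a small sketch. It assumes the fusion is a normalized weighted sum of the per-feature class probabilities, which is one common reading of "weighted probabilistic fusion"; the abstract does not give the exact model or how the weights are chosen, so the function name, probabilities, and weights below are hypothetical.

```python
def fuse_probabilities(prob_sets, weights):
    """Weighted sum of per-classifier class-probability vectors.

    prob_sets[k][c] is classifier k's probability for class c.
    Returns (fused_probabilities, predicted_class_index).
    """
    total = sum(weights)
    n_classes = len(prob_sets[0])
    fused = [
        sum(w * probs[c] for w, probs in zip(weights, prob_sets)) / total
        for c in range(n_classes)
    ]
    predicted = max(range(n_classes), key=fused.__getitem__)
    return fused, predicted

# Hypothetical SVM probability outputs for one pixel over three classes,
# one vector per spectral-spatial feature (made-up numbers).
texture_probs = [0.6, 0.3, 0.1]
morphology_probs = [0.2, 0.7, 0.1]
endmember_probs = [0.3, 0.5, 0.2]

# Hypothetical weights, e.g. reflecting each classifier's validation accuracy.
weights = [0.5, 0.3, 0.2]

fused, label = fuse_probabilities(
    [texture_probs, morphology_probs, endmember_probs], weights
)
print(fused, label)
```

    In this toy case the texture classifier alone would pick class 0, but the fused result picks class 1 because the other two features agree on it.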

  12. Clinicopathological significance of altered metallothionein 2A expression in gastric cancer according to Lauren's classification

    Institute of Scientific and Technical Information of China (English)

    PAN Yuan-ming; XING Rui; CUI Jian-tao; LI Wen-mei; LÜ You-yong

    2013-01-01

    Background Dysregulated metallothionein 2A (MT2A) has been implicated in carcinogenesis. The purpose of this study was to investigate the expression of MT2A in gastric cancer (GC) and its correlation with prognosis. Methods Reverse transcription-polymerase chain reaction and real-time polymerase chain reaction were used to detect the mRNA expression of MT2A in 12 GC cell lines, normal gastric epithelial GES-1 cells, and 36 GC and adjacent normal tissues. MT2A protein expression was determined in 258 GC tissues and 171 adjacent normal tissues by immunohistochemistry. Results MT2A mRNA expression was lower in GC cells and primary tumors than in GES-1 cells and adjacent normal tissues, respectively. High protein expression of MT2A was present in 130 of 171 normal tissues (76.0%) and in 56 of 258 GC tissues (21.7%; P < 0.001). MT2A protein expression was higher in well/moderately differentiated GC (22/54; 40.7%) than in poorly differentiated GC (34/204; 16.7%; P < 0.001). Moreover, the protein expression of MT2A was lower in diffuse-type GC (6/82; 7.3%) than in intestinal-type GC (50/176; 28.4%; P = 0.0001). Importantly, MT2A expression was an independent prognostic factor for GC, and decreased MT2A expression was associated with poor clinical outcome (P < 0.001). The expression status of MT2A could predict prognosis in intestinal and diffuse-type GCs. Conclusion Expression status of MT2A might be a useful prognostic biomarker for GC, especially when used in combination with Lauren's classification.

  13. Clustering and rule-based classifications of chemical structures evaluated in the biological activity space.

    Science.gov (United States)

    Schuffenhauer, Ansgar; Brown, Nathan; Ertl, Peter; Jenkins, Jeremy L; Selzer, Paul; Hamon, Jacques

    2007-01-01

    Classification methods for data sets of molecules according to their chemical structure were evaluated for their biological relevance, including rule-based, scaffold-oriented classification methods and clustering based on molecular descriptors. Three data sets resulting from uniformly determined in vitro biological profiling experiments were classified according to their chemical structures, and the results were compared in a Pareto analysis with the number of classes and their average spread in the profile space as two concurrent objectives which were to be minimized. It has been found that no classification method is overall superior to all other studied methods, but there is a general trend that rule-based, scaffold-oriented methods are the better choice if classes with homogeneous biological activity are required, but a large number of clusters can be tolerated. On the other hand, clustering based on chemical fingerprints is superior if fewer and larger classes are required, and some loss of homogeneity in biological activity can be accepted.

  14. Network traffic classification based on ensemble learning and co-training

    Institute of Scientific and Technical Information of China (English)

    HE HaiTao; LUO XiaoNan; MA FeiTeng; CHE ChunHui; WANG JianMin

    2009-01-01

    Classification of network traffic is an essential step in much network research. However, with the rapid evolution of Internet applications, the effectiveness of port-based and payload-based identification approaches has greatly diminished in recent years, and many researchers have begun to turn their attention to alternative machine learning-based methods. This paper presents a novel machine learning-based classification model that combines the ensemble learning paradigm with co-training techniques. Whereas most previous approaches employed only a single classifier, our method applies multiple classifiers and semi-supervised learning, which mainly helps to overcome three shortcomings: limited flow accuracy rate, weak adaptability, and a huge demand for labeled training data. In this paper, statistical characteristics of IP flows are extracted from packet-level traces to establish the feature set; the classification model is then created and tested, and the empirical results demonstrate its feasibility and effectiveness.

  15. Mastectomy or breast conserving surgery? Factors affecting type of surgical treatment for breast cancer – a classification tree approach

    Directory of Open Access Journals (Sweden)

    O'Neill Terry

    2006-04-01

    Full Text Available Abstract Background A critical choice facing breast cancer patients is which surgical treatment – mastectomy or breast conserving surgery (BCS) – is most appropriate. Several studies have investigated factors that impact the type of surgery chosen, identifying features such as place of residence, age at diagnosis, tumor size, socio-economic and racial/ethnic elements as relevant. Such assessment of "propensity" is important in understanding issues such as a reported under-utilisation of BCS among women for whom such treatment was not contraindicated. Using Western Australian (WA) data, we further examine the factors associated with the type of surgical treatment for breast cancer using a classification tree approach. This approach deals naturally with complicated interactions between factors, and so allows flexible and interpretable models for treatment choice to be built that add to the current understanding of this complex decision process. Methods Data was extracted from the WA Cancer Registry on women diagnosed with breast cancer in WA from 1990 to 2000. Subjects' treatment preferences were predicted from covariates using both classification trees and logistic regression. Results Tumor size was the primary determinant of patient choice, subjects with tumors smaller than 20 mm in diameter preferring BCS. For subjects with tumors greater than 20 mm in diameter factors such as patient age, nodal status, and tumor histology become relevant as predictors of patient choice. Conclusion Classification trees perform as well as logistic regression for predicting patient choice, but are much easier to interpret for clinical use. The selected tree can inform clinicians' advice to patients.

  16. Fourier-transform infrared spectroscopy coupled with a classification machine for the analysis of blood plasma or serum: a novel diagnostic approach for ovarian cancer.

    Science.gov (United States)

    Gajjar, Ketan; Trevisan, Júlio; Owens, Gemma; Keating, Patrick J; Wood, Nicholas J; Stringfellow, Helen F; Martin-Hirsch, Pierre L; Martin, Francis L

    2013-07-21

    Currently available screening tests do not deliver the required sensitivity and specificity for accurate diagnosis of ovarian or endometrial cancer. Infrared (IR) spectroscopy of blood plasma or serum is a rapid, versatile, and relatively non-invasive approach which could characterize biomolecular alterations due to cancer and has potential to be utilized as a screening or diagnostic tool. In the past, no such approach has been investigated for its applicability in screening and/or diagnosis of gynaecological cancers. We set out to determine whether attenuated total reflection Fourier-transform IR (ATR-FTIR) spectroscopy coupled with a proposed classification machine could be applied to IR spectra obtained from plasma and serum for accurate class prediction (cancer vs. normal). Plasma and serum samples were obtained from ovarian cancer cases (n = 30), endometrial cancer cases (n = 30) and non-cancer controls (n = 30), and subjected to ATR-FTIR spectroscopy. Four derived datasets were processed to estimate the real-world diagnosis of ovarian and endometrial cancer. Classification results for ovarian cancer were remarkable (up to 96.7%), whereas endometrial cancer was classified with a relatively high accuracy (up to 81.7%). The results from different combinations of feature extraction and classification methods, and also classifier ensembles, were compared. No single classification system performed best for all different datasets. This demonstrates the need for a framework that can accommodate a diverse set of analytical methods in order to be adaptable to different datasets. This pilot study suggests that ATR-FTIR spectroscopy of blood is a robust tool for accurate diagnosis, and carries the potential to be utilized as a screening test for ovarian cancer in primary care settings. The proposed classification machine is a powerful tool which could be applied to classify the vibrational spectroscopy data of different biological systems (e.g., tissue, urine, saliva

  17. Object-Based Classification as an Alternative Approach to the Traditional Pixel-Based Classification to Identify Potential Habitat of the Grasshopper Sparrow

    Science.gov (United States)

    Jobin, Benoît; Labrecque, Sandra; Grenier, Marcelle; Falardeau, Gilles

    2008-01-01

    The traditional method of identifying wildlife habitat distribution over large regions consists of pixel-based classification of satellite images into a suite of habitat classes used to select suitable habitat patches. Object-based classification is a new method that can achieve the same objective by segmenting the spectral bands of the image to create polygons that are homogeneous with regard to spatial or spectral characteristics. The segmentation algorithm relies not solely on the single pixel value, but also on shape, texture, and pixel spatial continuity. Object-based classification is a knowledge-based process in which an interpretation key is developed using ground control points and objects are assigned to specific classes according to threshold values of determined spectral and/or spatial attributes. We developed a model using the eCognition software to identify suitable habitats for the Grasshopper Sparrow, a rare and declining species found in southwestern Québec. The model was developed in a region with known breeding sites and applied on o