WorldWideScience

Sample records for cancer classification based

  1. Pathway-based classification of cancer subtypes

    Directory of Open Access Journals (Sweden)

    Kim Shinuk

    2012-07-01

    Full Text Available Abstract Background: Molecular markers based on gene expression profiles have been used in experimental and clinical settings to distinguish cancerous tumors in stage, grade, survival time, metastasis, and drug sensitivity. However, most significant gene markers are unstable (not reproducible among data sets). We introduce a standardized method for representing cancer markers as 2-level hierarchical feature vectors, with a basic gene level as well as a second level of (more stable) pathway markers, for the purpose of discriminating cancer subtypes. This extends standard gene expression arrays with new pathway-level activation features obtained directly from off-the-shelf gene set enrichment algorithms such as GSEA. Such so-called pathway-based expression arrays are significantly more reproducible across datasets. Such reproducibility will be important for clinical usefulness of genomic markers, and augment currently accepted cancer classification protocols. Results: The present method produced more stable (reproducible) pathway-based markers for discriminating breast cancer metastasis and ovarian cancer survival time. Between two datasets for breast cancer metastasis, the intersection of standard significant gene biomarkers totaled 7.47% of selected genes, compared to 17.65% using pathway-based markers; the corresponding percentages for ovarian cancer datasets were 20.65% and 33.33%, respectively. Three pathways, consisting of Type_1_diabetes mellitus, Cytokine-cytokine_receptor_interaction and Hedgehog_signaling (all previously implicated in cancer), are enriched in both the ovarian long survival and breast non-metastasis groups. In addition, integrating pathway and gene information, we identified five (ID4, ANXA4, CXCL9, MYLK, FBXL7) and six (SQLE, E2F1, PTTG1, TSTA3, BUB1B, MAD2L1) known cancer genes significant for ovarian and breast cancer, respectively. Conclusions: Standardizing the analysis of genomic data in the process of cancer staging
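
    The reproducibility figures in this record are overlap percentages between marker sets selected from two datasets. A minimal sketch of such a measure follows; it assumes the overlap is taken relative to the union of the two selections, which the abstract does not state, and the marker names are hypothetical placeholders rather than data from the study.

```python
def overlap_percent(markers_a, markers_b):
    """Percentage of markers shared by two selections, relative to their union.

    Assumption: the denominator is the union of both selections; the
    abstract does not say which denominator the authors used.
    """
    a, b = set(markers_a), set(markers_b)
    union = a | b
    if not union:
        return 0.0
    return 100.0 * len(a & b) / len(union)


# Hypothetical marker lists, for illustration only.
dataset_1 = ["G1", "G2", "G3", "G4"]
dataset_2 = ["G3", "G4", "G5", "G6"]
print(overlap_percent(dataset_1, dataset_2))
```

    The same function applies unchanged to gene-level and pathway-level marker lists, so the two percentages can be compared directly.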

  2. Nominated Texture Based Cervical Cancer Classification

    Directory of Open Access Journals (Sweden)

    Edwin Jayasingh Mariarputham

    2015-01-01

    Full Text Available Accurate classification of Pap smear images is a challenging task in medical image processing. It can be improved in two ways: by selecting suitable, well-defined features and by selecting the best classifier. This paper presents a nominated texture based cervical cancer (NTCC) classification system which classifies Pap smear images into one of seven classes by extracting well-defined texture features and selecting the best classifier. Seven sets of texture features (24 features) are extracted, which include the relative size of nucleus and cytoplasm, the dynamic range and first four moments of the intensities of nucleus and cytoplasm, the relative displacement of the nucleus within the cytoplasm, the gray level co-occurrence matrix, the local binary pattern histogram, Tamura features, and the edge orientation histogram. A few types of support vector machine (SVM) and neural network (NN) classifiers are used for the classification. The performance of the NTCC algorithm is tested and compared to other algorithms on the public image database of Herlev University Hospital, Denmark, with 917 Pap smear images. The output of the SVM is found to be best for most of the classes and to give better results for the remaining classes.
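
    One of the texture features named in this record is the gray level co-occurrence matrix. The sketch below shows how such a matrix and one derived statistic (contrast) can be computed; the offset, the number of gray levels and the toy image are assumptions for illustration and are not taken from the paper.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count how often gray level i has gray level j at offset (dx, dy)."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def contrast(matrix):
    """Contrast: squared gray-level difference averaged over all counted pairs."""
    total = sum(sum(row) for row in matrix)
    weighted = sum((i - j) ** 2 * count
                   for i, row in enumerate(matrix)
                   for j, count in enumerate(row))
    return weighted / total


# Toy 3x3 image with three gray levels (hypothetical values).
toy = [[0, 0, 1],
       [0, 1, 1],
       [2, 2, 1]]
matrix = glcm(toy, levels=3)
print(matrix, contrast(matrix))
```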

  3. Cancer classification based on gene expression using neural networks.

    Science.gov (United States)

    Hu, H P; Niu, Z J; Bai, Y P; Tan, X H

    2015-12-21

    Based on gene expression, we have classified 53 colon cancer patients with UICC II into two groups: relapse and no relapse. Samples were taken from each patient, and gene information was extracted. Of the 53 samples examined, 500 genes were considered proper through analyses by S-Kohonen, BP, and SVM neural networks. Classification accuracy obtained by S-Kohonen neural network reaches 91%, which was more accurate than classification by BP and SVM neural networks. The results show that S-Kohonen neural network is more plausible for classification and has a certain feasibility and validity as compared with BP and SVM neural networks.

  4. Pathological Bases for a Robust Application of Cancer Molecular Classification

    Directory of Open Access Journals (Sweden)

    Salvador J. Diaz-Cano

    2015-04-01

    Full Text Available Any robust classification system depends on its purpose and must refer to accepted standards, its strength relying on predictive values and a careful consideration of known factors that can affect its reliability. In this context, a molecular classification of human cancer must refer to the current gold standard (histological classification) and try to improve it with key prognosticators for metastatic potential, staging and grading. Although organ-specific examples have been published based on proteomics, transcriptomics and genomics evaluations, the most popular approach uses gene expression analysis as a direct correlate of cellular differentiation, which represents the key feature of the histological classification. RNA is a labile molecule that varies significantly according to the preservation protocol; its transcription reflects the adaptation of the tumor cells to the microenvironment, it can be passed through mechanisms of intercellular transference of genetic information (exosomes), and it is exposed to epigenetic modifications. More robust classifications should be based on stable molecules, represented at the genetic level by DNA, to improve reliability, and their analysis must deal with the concept of intratumoral heterogeneity, which is at the origin of tumor progression and is the byproduct of the selection process during the clonal expansion and progression of neoplasms. The simultaneous analysis of multiple DNA targets and next generation sequencing offer the best practical approach for an analytical genomic classification of tumors.

  5. Histotype-based prognostic classification of gastric cancer

    Institute of Scientific and Technical Information of China (English)

    Anna Maria Chiaravalli; Catherine Klersy; Alessandro Vanoli; Andrea Ferretti; Carlo Capella; Enrico Solcia

    2012-01-01

    AIM: To test the efficiency of a recently proposed histotype-based grading system in a consecutive series of gastric cancers. METHODS: Two hundred advanced gastric cancers operated upon in 1980-1987 and followed for a median 159 mo were investigated on hematoxylin-eosin-stained sections to identify low-grade [muconodular, well differentiated tubular, diffuse desmoplastic and high lymphoid response (HLR)], high-grade (anaplastic and mucinous invasive) and intermediate-grade (ordinary cohesive, diffuse and mucinous) cancers, in parallel with a previously investigated series of 292 cases. In addition, immunohistochemical analyses for CD8, CD11 and HLA-DR antigens, pancytokeratin and podoplanin, as well as immunohistochemical and molecular tests for microsatellite DNA instability and in situ hybridization for the Epstein-Barr virus (EBV) EBER1 gene were performed. Patient survival was assessed with death rates per 100 person-years and with Kaplan-Meier or Cox model estimates. RESULTS: Collectively, the four low-grade histotypes accounted for 22% and the two high-grade histotypes for 7% of the consecutive cancers investigated, while the remaining 71% of cases were intermediate-grade cancers, with highly significant, stage-independent, survival differences among the three tumor grades (P = 0.004 for grade 1 vs 2 and P = 0.0019 for grade 2 vs grade 3), thus confirming the results in the original series. A combined analysis of 492 cases showed an improved prognostic value of histotype-based grading compared with the Lauren classification. In addition, it allowed better characterization of rare histotypes, particularly the three subsets of prognostically different mucinous neoplasms, of which 10 ordinary mucinous cancers showed stage-inclusive survival worse than that of 20 muconodular (P = 0.037) and better than that of 21 high-grade (P < 0.001) cases. Tumors with high-level microsatellite DNA instability (MSI-H) or EBV infection, together with a third subset negative for both conditions, formed the
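
    This record assesses survival with death rates per 100 person-years and with Kaplan-Meier estimates. The sketch below illustrates both quantities on a small made-up cohort; the follow-up times and event flags are hypothetical and the function names are chosen for this example only.

```python
def death_rate_per_100_person_years(months, events):
    """Deaths divided by total follow-up in years, scaled to 100 person-years."""
    person_years = sum(months) / 12.0
    return 100.0 * sum(events) / person_years


def kaplan_meier(times, events):
    """Return [(time, survival)] at each time where at least one death occurs.

    times: follow-up durations; events: 1 = death observed, 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical follow-up in months; 1 = died, 0 = censored.
follow_up = [5, 8, 8, 12, 20]
died = [1, 1, 0, 1, 0]
print(kaplan_meier(follow_up, died))
print(death_rate_per_100_person_years(follow_up, died))
```

    Censored patients leave the risk set without lowering the survival estimate, which is why the curve only steps down at times with an observed death.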

  6. Hybrid Local Feature Selection In DNA Analysis Based Cancer Classification

    OpenAIRE

    Akila, M.; Mr.S.Senthamarai kannan

    2012-01-01

    Feature selection, as a preprocessing step to machine learning, is effective in reducing dimensionality, removing irrelevant data and increasing learning accuracy. The development of microarray dataset technology has supplied a large volume of data to many fields. In particular, it has been applied to prediction and diagnosis of cancer, so that it helps us to exactly predict and diagnose cancer. To precisely classify cancer we have to select genes related to cancer. The challenging task in ca...

  7. Comparison of linear discriminant analysis methods for the classification of cancer based on gene expression data

    Directory of Open Access Journals (Sweden)

    He Miao

    2009-12-01

    Full Text Available Abstract Background: Many studies based on gene expression data have been reported in great detail; however, one major challenge for methodologists is the choice of classification method. The main purpose of this research was to compare the performance of linear discriminant analysis (LDA) and its modification methods for the classification of cancer based on gene expression data. Methods: The classification performance of LDA and its modification methods was evaluated by applying these methods to six public cancer gene expression datasets. These methods included linear discriminant analysis (LDA), prediction analysis for microarrays (PAM), shrinkage centroid regularized discriminant analysis (SCRDA), shrinkage linear discriminant analysis (SLDA) and shrinkage diagonal discriminant analysis (SDDA). The procedures were performed with the software R 2.80. Results: PAM picked out fewer feature genes than the other methods from most datasets, except the Brain dataset. Of the two shrinkage discriminant analysis methods, SLDA selected more genes than SDDA from most datasets, except the 2-class lung cancer dataset. When comparing SLDA with SCRDA, SLDA selected more genes than SCRDA from the 2-class lung cancer, SRBCT and Brain datasets; the result was the opposite for the remaining datasets. The average test error of the LDA modification methods was lower than that of the LDA method. Conclusions: The classification performance of the LDA modification methods was superior to that of traditional LDA with respect to the average error, and there was no significant difference between these modification methods.
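
    PAM is the nearest shrunken centroids method, which selects genes by soft-thresholding standardized class-centroid differences; genes whose differences shrink to zero in every class drop out. The sketch below illustrates only that selection step. The scores and the threshold are hypothetical, and the standardization that produces the scores is assumed to have been done already.

```python
def soft_threshold(d, delta):
    """Shrink d toward zero by delta; values within delta of zero become 0."""
    magnitude = abs(d) - delta
    if magnitude <= 0:
        return 0.0
    return magnitude if d > 0 else -magnitude


def selected_genes(scores, delta):
    """Keep genes with a nonzero shrunken difference in at least one class.

    scores: {gene: [standardized centroid difference per class]}
    """
    return [gene for gene, diffs in scores.items()
            if any(soft_threshold(d, delta) != 0.0 for d in diffs)]


# Hypothetical standardized differences for a two-class problem.
scores = {
    "gene_a": [2.5, -2.5],
    "gene_b": [0.4, -0.4],
    "gene_c": [1.2, -1.2],
}
print(selected_genes(scores, delta=1.0))
```

    Raising the threshold removes more genes, which is consistent with a shrinkage method reporting a small feature set.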

  8. Cancer Pain: A Critical Review of Mechanism-based Classification and Physical Therapy Management in Palliative Care.

    Science.gov (United States)

    Kumar, Senthil P

    2011-05-01

    Mechanism-based classification and physical therapy management of pain is essential to effectively manage painful symptoms in patients attending palliative care. The objective of this review is to provide a detailed review of mechanism-based classification and physical therapy management of patients with cancer pain. Cancer pain can be classified based upon pain symptoms, pain mechanisms and pain syndromes. Classification based upon mechanisms not only addresses the underlying pathophysiology but also provides us with an understanding behind patient's symptoms and treatment responses. Existing evidence suggests that the five mechanisms - central sensitization, peripheral sensitization, sympathetically maintained pain, nociceptive and cognitive-affective - operate in patients with cancer pain. Summary of studies showing evidence for physical therapy treatment methods for cancer pain follows with suggested therapeutic implications. Effective palliative physical therapy care using a mechanism-based classification model should be tailored to suit each patient's findings, using a biopsychosocial model of pain. PMID:21976851

  9. Cancer pain: A critical review of mechanism-based classification and physical therapy management in palliative care

    Directory of Open Access Journals (Sweden)

    Senthil P Kumar

    2011-01-01

    Full Text Available Mechanism-based classification and physical therapy management of pain is essential to effectively manage painful symptoms in patients attending palliative care. The objective of this review is to provide a detailed review of mechanism-based classification and physical therapy management of patients with cancer pain. Cancer pain can be classified based upon pain symptoms, pain mechanisms and pain syndromes. Classification based upon mechanisms not only addresses the underlying pathophysiology but also provides us with an understanding behind patient′s symptoms and treatment responses. Existing evidence suggests that the five mechanisms - central sensitization, peripheral sensitization, sympathetically maintained pain, nociceptive and cognitive-affective - operate in patients with cancer pain. Summary of studies showing evidence for physical therapy treatment methods for cancer pain follows with suggested therapeutic implications. Effective palliative physical therapy care using a mechanism-based classification model should be tailored to suit each patient′s findings, using a biopsychosocial model of pain.

  10. Classification of Cancer Gene Selection Using Random Forest and Neural Network Based Ensemble Classifier

    Directory of Open Access Journals (Sweden)

    Jogendra Kushwah

    2013-06-01

    Full Text Available The free radical gene classification of cancer diseases is a challenging job in biomedical data engineering. Various classifiers are used to improve the classification of gene selection for cancer diseases, but the classifications of these classifiers are not validated. So an ensemble classifier is used for cancer gene classification, using a neural network classifier with a random forest tree. The random forest tree is an ensembling technique for classifiers; in this technique the number of classifier ensemble of their leaf node of class of classifier. In this paper we combine a neural network with a random forest ensemble classifier for classification of cancer gene selection for diagnostic analysis of cancer diseases. The proposed method differs from most ensemble classifier methods, which follow an input-output paradigm of neural networks in which the members of the ensemble are selected from a set of neural network classifiers. The number of classifiers is determined during the growing procedure of the forest. Furthermore, the proposed method produces an ensemble that is not only correct but also diverse, ensuring the two important properties that should characterize an ensemble classifier. For empirical evaluation of the proposed method we used a UCI cancer diseases data set for classification. Our experimental results show better results in comparison with random forest tree classification.

  11. A minimum spanning forest based hyperspectral image classification method for cancerous tissue detection

    Science.gov (United States)

    Pike, Robert; Patton, Samuel K.; Lu, Guolan; Halig, Luma V.; Wang, Dongsheng; Chen, Zhuo Georgia; Fei, Baowei

    2014-03-01

    Hyperspectral imaging is a developing modality for cancer detection. The rich information associated with hyperspectral images allows cancerous and healthy tissue to be examined and compared. This study focuses on a new method that incorporates support vector machines into a minimum spanning forest algorithm for differentiating cancerous tissue from normal tissue. Spectral information was gathered to test the algorithm. Animal experiments were performed and hyperspectral images were acquired from tumor-bearing mice. In vivo imaging experimental results demonstrate the applicability of the proposed classification method for cancer tissue classification on hyperspectral images.

  12. Diagnostic Classification of Normal Persons and Cancer Patients by Using Neural Network Based on Trace Metal Contents in Serum Samples

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    An artificial neural network with the back-propagation approach (BP-ANN) was applied to the classification of normal persons and various cancer patients based on the elemental contents in serum samples. This method was verified by the cross-validation method. The effects of the network parameters were investigated and the related problems were discussed. The samples of 72, 42, and 52 for lung, liver, and stomach cancer patients and normal persons, respectively, were used for the classification study. About 95% of the samples can be classified correctly. Therefore, the method can be used as an auxiliary means of the diagnosis of cancer.

  13. Efficient Cancer Classification using Fast Adaptive Neuro-Fuzzy Inference System (FANFIS) based on Statistical Techniques

    Directory of Open Access Journals (Sweden)

    K.Ananda Kumar

    2011-09-01

    Full Text Available The number of detected cancer cases is increasing throughout the world. This creates a need for new techniques that can detect the occurrence of cancer and so support better diagnosis. This paper aims at finding the smallest set of genes that can ensure highly accurate classification of cancer from microarray data by using supervised machine learning algorithms. The significance of finding the minimum subset is threefold: (a) the computational burden and noise arising from irrelevant genes are much reduced; (b) the cost of cancer testing is reduced significantly, as it simplifies the gene expression tests to include only a very small number of genes rather than thousands of genes; (c) it calls for more investigation into the probable biological relationship between this small number of genes and cancer development and treatment. The proposed method involves two steps. In the first step, some important genes are chosen with the help of an Analysis of Variance (ANOVA) ranking scheme. In the second step, the classification capability is tested for all simple combinations of those important genes using a better classifier. The proposed method uses a Fast Adaptive Neuro-Fuzzy Inference System (FANFIS) as the classification model. This classification model uses a modified Levenberg-Marquardt algorithm for the learning phase. The experimental results suggest that the proposed method achieves better accuracy and takes less time for classification than conventional techniques.
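
    The first step described in this record ranks genes with an ANOVA scheme. A minimal sketch of such a ranking uses the one-way ANOVA F statistic, the ratio of between-class to within-class variance. Whether the authors used exactly this statistic is an assumption, and the gene names and expression values below are hypothetical.

```python
def anova_f(groups):
    """One-way ANOVA F statistic for one gene; groups holds values per class."""
    values = [x for group in groups for x in group]
    grand_mean = sum(values) / len(values)
    means = [sum(group) / len(group) for group in groups]
    between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    if within == 0:
        # No within-class variation: treat any class separation as perfect.
        return float("inf") if between > 0 else 0.0
    return (between / df_between) / (within / df_within)


def rank_genes(expression):
    """Sort gene names by descending F statistic."""
    return sorted(expression, key=lambda gene: anova_f(expression[gene]), reverse=True)


# Hypothetical expression values, two classes with three samples each.
expression = {
    "gene_x": [[1.0, 2.0, 3.0], [7.0, 8.0, 9.0]],
    "gene_y": [[4.0, 5.0, 6.0], [5.0, 6.0, 4.0]],
}
print(rank_genes(expression))
```

    The top-ranked genes would then be passed to the second step, where small combinations are evaluated with a classifier.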

  14. Classification of Cancer Gene Selection Using Random Forest and Neural Network Based Ensemble Classifier

    Directory of Open Access Journals (Sweden)

    Jogendra Kushwah

    2013-06-01

    Full Text Available The free radical gene classification of cancer diseases is challenging job in biomedical data engineering. The improving of classification of gene selection of cancer diseases various classifier are used, but the classification of classifier are not validate. So ensemble classifier is used for cancer gene classification using neural network classifier with random forest tree. The random forest tree is ensembling technique of classifier in this technique the number of classifier ensemble of their leaf node of class of classifier. In this paper we combined neural network with random forest ensemble classifier for classification of cancer gene selection for diagnose analysis of cancer diseases. The proposed method is different from most of the methods of ensemble classifier, which follow an input output paradigm of neural network, where the members of the ensemble are selected from a set of neural network classifier. the number of classifiers is determined during the rising procedure of the forest. Furthermore, the proposed method produces an ensemble not only correct, but also assorted, ensuring the two important properties that should characterize an ensemble classifier. For empirical evaluation of our proposed method we used UCI cancer diseases data set for classification. Our experimental result shows that better result in compression of random forest tree classification

  15. Superpixel-based spectral classification for the detection of head and neck cancer with hyperspectral imaging

    Science.gov (United States)

    Chung, Hyunkoo; Lu, Guolan; Tian, Zhiqiang; Wang, Dongsheng; Chen, Zhuo Georgia; Fei, Baowei

    2016-01-01

    Hyperspectral imaging (HSI) is an emerging imaging modality for medical applications. HSI acquires two dimensional images at various wavelengths. The combination of both spectral and spatial information provides quantitative information for cancer detection and diagnosis. This paper proposes using superpixels, principal component analysis (PCA), and support vector machine (SVM) to distinguish regions of tumor from healthy tissue. The classification method uses 2 principal components decomposed from hyperspectral images and obtains an average sensitivity of 93% and an average specificity of 85% for 11 mice. The hyperspectral imaging technology and classification method can have various applications in cancer research and management.
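
    The sensitivity and specificity reported in this record are standard confusion-matrix ratios. The sketch below shows how they are computed from true and predicted labels; the label vectors are hypothetical and do not reproduce the study's data.

```python
def sensitivity_specificity(truth, predicted):
    """Return (sensitivity, specificity) for binary labels, 1 = tumor, 0 = healthy."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # fraction of tumor samples detected
    specificity = tn / (tn + fp)  # fraction of healthy samples cleared
    return sensitivity, specificity


# Hypothetical per-region labels for one animal.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
print(sensitivity_specificity(truth, predicted))
```

    Averaging these two values over all animals gives per-study figures of the kind quoted above.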

  16. Superpixel-based spectral classification for the detection of head and neck cancer with hyperspectral imaging

    Science.gov (United States)

    Chung, Hyunkoo; Lu, Guolan; Tian, Zhiqiang; Wang, Dongsheng; Chen, Zhuo Georgia; Fei, Baowei

    2016-03-01

    Hyperspectral imaging (HSI) is an emerging imaging modality for medical applications. HSI acquires two dimensional images at various wavelengths. The combination of both spectral and spatial information provides quantitative information for cancer detection and diagnosis. This paper proposes using superpixels, principal component analysis (PCA), and support vector machine (SVM) to distinguish regions of tumor from healthy tissue. The classification method uses 2 principal components decomposed from hyperspectral images and obtains an average sensitivity of 93% and an average specificity of 85% for 11 mice. The hyperspectral imaging technology and classification method can have various applications in cancer research and management.

  17. Swarm Intelligence Approach Based on Adaptive ELM Classifier with ICGA Selection for Microarray Gene Expression and Cancer Classification

    Directory of Open Access Journals (Sweden)

    T. Karthikeyan

    2014-05-01

    Full Text Available The aim of this research study is efficient gene selection and classification of microarray data using hybrid machine learning algorithms. The advent of microarray technology has enabled researchers to quickly measure the levels of thousands of genes expressed in biological tissue samples in a single experiment. One important application of microarray technology is to classify tissue samples using their gene expression profiles and to identify numerous types of cancer. Cancer is a group of diseases in which a set of cells shows uncontrolled growth, invasion that intrudes upon and destroys nearby tissues, and spread to other locations in the body via lymph or blood. Cancer has become one of the major diseases of the present day. DNA microarrays have turned out to be an effective tool in molecular biology and cancer diagnosis. Microarrays can be used to establish the relative quantity of mRNAs in two or more biological tissue samples for thousands of genes at the same time. Despite the strengths of this technique, the appropriate analysis and assessment of microarray data remains an open issue in several respects. In the medical sciences, multi-category cancer classification plays a major role in classifying cancer types according to gene expression. Cancer classification has become indispensable because the number of cancer victims has been increasing steadily in recent years. To this end, a combination of an Integer-Coded Genetic Algorithm (ICGA) and the Artificial Bee Colony algorithm (ABC), coupled with an Adaptive Extreme Learning Machine (AELM), is used for gene selection and cancer classification. ICGA is used with the ABC-based AELM classifier to choose an optimal set of genes, which results in an efficient hybrid algorithm that can handle sparse data and sample imbalance. The

  18. Proteomic classification of breast cancer.

    LENUS (Irish Health Repository)

    Kamel, Dalia

    2012-11-01

    Being a significant health problem that affects patients in various age groups, breast cancer has been extensively studied to date. Recently, molecular breast cancer classification has advanced significantly with the availability of genomic profiling technologies. Proteomic technologies have also advanced from traditional protein assays including enzyme-linked immunosorbent assay, immunoblotting and immunohistochemistry to more comprehensive approaches including mass spectrometry and reverse phase protein lysate arrays (RPPA). The purpose of this manuscript is to review the current protein markers that influence breast cancer prediction and prognosis and to focus on novel advances in proteomic classification of breast cancer.

  19. Molecular classification of gastric cancer.

    Science.gov (United States)

    Chia, N-Y; Tan, P

    2016-05-01

    Gastric cancer (GC), a heterogeneous disease characterized by epidemiologic and histopathologic differences across countries, is a leading cause of cancer-related death. Treatment of GC patients is currently suboptimal due to patients being commonly treated in a uniform fashion irrespective of disease subtype. With the advent of next-generation sequencing and other genomic technologies, GCs are now being investigated in great detail at the molecular level. High-throughput technologies now allow a comprehensive study of genomic and epigenomic alterations associated with GC. Gene mutations, chromosomal aberrations, differential gene expression and epigenetic alterations are some of the genetic/epigenetic influences on GC pathogenesis. In addition, integrative analyses of molecular profiling data have led to the identification of key dysregulated pathways and importantly, the establishment of GC molecular classifiers. Recently, The Cancer Genome Atlas (TCGA) network proposed a four subtype classification scheme for GC based on the underlying tumor molecular biology of each subtype. This landmark study, together with other studies, has expanded our understanding on the characteristics of GC at the molecular level. Such knowledge may improve the medical management of GC in the future. PMID:26861606

  20. Genetic Fuzzy System (GFS) based wavelet co-occurrence feature selection in mammogram classification for breast cancer diagnosis

    Directory of Open Access Journals (Sweden)

    Meenakshi M. Pawar

    2016-09-01

    Full Text Available Breast cancer is a significant health problem, diagnosed mostly in women worldwide. Early detection of breast cancer with the help of digital mammography can reduce the mortality rate. This paper presents a wrapper-based feature selection approach for wavelet co-occurrence features (WCF) using a Genetic Fuzzy System (GFS) in the mammogram classification problem. The performance of the GFS algorithm is demonstrated using the mini-MIAS database. WCF features are obtained from the detail wavelet coefficients at each level of decomposition of the mammogram image. At the first level of decomposition, 18 features are applied to the GFS algorithm, which selects 5 features with an average classification success rate of 39.64%. Subsequently, at the second level it selects 9 features from 36 features and the classification success rate improves to 56.75%. At the third level, 16 features are selected from 54 features and the average success rate improves to 64.98%. Lastly, at the fourth level 72 features are applied to GFS, which selects 16 features, increasing the average success rate to 89.47%. Hence, the GFS algorithm is an effective way of obtaining an optimal set of features in breast cancer diagnosis.

  1. Interval-value Based Particle Swarm Optimization algorithm for cancer-type specific gene selection and sample classification

    Directory of Open Access Journals (Sweden)

    D. Ramyachitra

    2015-09-01

    Full Text Available Microarray technology allows simultaneous measurement of the expression levels of thousands of genes within a biological tissue sample. The fundamental power of microarrays lies in the ability to conduct parallel surveys of gene expression. The classification of tissue samples based on gene expression data is an important problem in the medical diagnosis of diseases such as cancer. In gene expression data, the number of genes is usually very high compared to the number of data samples. The difficulty is thus that the data are of high dimensionality while the sample size is small. This research work addresses the problem by classifying the resultant dataset using existing algorithms such as Support Vector Machine (SVM), K-nearest neighbor (KNN) and Interval Valued Classification (IVC), and the improved Interval Value based Particle Swarm Optimization (IVPSO) algorithm. The results show that the IVPSO algorithm outperformed the other algorithms under several performance evaluation functions.

  2. Breast cancer detection and classification in digital mammography based on Non-Subsampled Contourlet Transform (NSCT) and Super Resolution.

    Science.gov (United States)

    Pak, Fatemeh; Kanan, Hamidreza Rashidy; Alikhassi, Afsaneh

    2015-11-01

    Breast cancer is one of the most perilous diseases among women. Breast screening is a method of detecting breast cancer at a very early stage, which can reduce the mortality rate. Mammography is a standard method for the early diagnosis of breast cancer. In this paper, a new algorithm is proposed for breast cancer detection and classification in digital mammography based on Non-Subsampled Contourlet Transform (NSCT) and Super Resolution (SR). The presented algorithm has three main parts: pre-processing, feature extraction and classification. In the pre-processing stage, after determining the region of interest (ROI) by an automatic technique, the quality of the image is improved using the NSCT and SR algorithms. In the feature extraction part, several features of the image components are extracted and the skewness of each feature is calculated. Finally, the AdaBoost algorithm is used to classify and determine the probability of benign and malignant disease. The results obtained on the Mammographic Image Analysis Society (MIAS) database indicate the significant performance and superiority of the proposed method in comparison with state-of-the-art approaches. According to the obtained results, the proposed technique achieves 91.43% and 6.42% as mean accuracy and FPR, respectively.
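
    The feature extraction step in this record computes the skewness of each feature. A minimal sketch of the moment-based skewness (third central moment divided by the 1.5 power of the second) follows; the abstract does not say which estimator the authors used, so this choice and the sample values are assumptions.

```python
def skewness(values):
    """Moment-based skewness: m3 / m2 ** 1.5, using population moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    if m2 == 0:
        return 0.0  # constant feature: no asymmetry to measure
    return m3 / m2 ** 1.5


# Hypothetical feature values: a long right tail gives positive skewness.
print(skewness([1.0, 2.0, 3.0, 4.0, 10.0]))
print(skewness([1.0, 2.0, 3.0]))
```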

  3. Lung cancer gene expression database analysis incorporating prior knowledge with support vector machine-based classification method

    Directory of Open Access Journals (Sweden)

    Huang Desheng

    2009-07-01

    Full Text Available Abstract Background: A reliable and precise classification is essential for successful diagnosis and treatment of cancer. Gene expression microarrays have provided a high-throughput platform to discover genomic biomarkers for cancer diagnosis and prognosis. Rational use of the available bioinformation can not only effectively remove or suppress noise in gene chips, but also avoid the one-sided results of separate experiments. However, only some studies have been aware of the importance of prior information in cancer classification. Methods: Using a support vector machine as the discriminant approach, we proposed a modified method that incorporates prior knowledge into cancer classification based on gene expression data to improve accuracy. A well-known public dataset, the Malignant pleural mesothelioma and lung adenocarcinoma gene expression database, was used in this study. Prior knowledge is viewed here as a means of directing the classifier using known lung adenocarcinoma related genes. The procedures were performed with the software R 2.80. Results: The modified method performed better after incorporating prior knowledge. Accuracy of the modified method improved from 98.86% to 100% in the training set and from 98.51% to 99.06% in the test set. The standard deviations of the modified method decreased from 0.26% to 0 in the training set and from 3.04% to 2.10% in the test set. Conclusion: The method that incorporates prior knowledge into discriminant analysis could effectively improve the capacity and reduce the impact of noise. This idea may have a good future not only in practice but also in methodology.

  4. Improved classification of lung cancer tumors based on structural and physicochemical properties of proteins using data mining models.

    Directory of Open Access Journals (Sweden)

    R Geetha Ramani

    Full Text Available Detecting divergence between oncogenic tumors plays a pivotal role in cancer diagnosis and therapy. This research work was focused on designing a computational strategy to predict the class of lung cancer tumors from the structural and physicochemical properties (1497 attributes) of protein sequences obtained from genes defined by microarray analysis. The proposed methodology involved the use of hybrid feature selection techniques (gain ratio and correlation based subset evaluators with Incremental Feature Selection) followed by Bayesian Network prediction to discriminate lung cancer tumors as Small Cell Lung Cancer (SCLC), Non-Small Cell Lung Cancer (NSCLC) and the COMMON classes. Moreover, this methodology eliminated the need for extensive data cleansing strategies on the protein properties and revealed the optimal and minimal set of features that contributed to lung cancer tumor classification with an improved accuracy compared to previous work. We also attempted to predict via supervised clustering the possible clusters in the lung tumor data. Our results revealed that supervised clustering algorithms exhibited poor performance in differentiating the lung tumor classes. Hybrid feature selection identified the distribution of solvent accessibility, polarizability and hydrophobicity as the highest ranked features with Incremental feature selection and Bayesian Network prediction generating the optimal Jack-knife cross validation accuracy of 87.6%. Precise categorization of oncogenic genes causing SCLC and NSCLC based on the structural and physicochemical properties of their protein sequences is expected to unravel the functionality of proteins that are essential in maintaining the genomic integrity of a cell and also act as an informative source for drug design, targeting essential protein properties and their composition that are found to exist in lung cancer tumors.

  5. Laser Raman detection for oral cancer based on a Gaussian process classification method

    Science.gov (United States)

    Du, Zhanwei; Yang, Yongjian; Bai, Yuan; Wang, Lijun; Zhang, Chijun; Chen, He; Luo, Yusheng; Su, Le; Chen, Yong; Li, Xianchang; Zhou, Xiaodong; Jia, Jun; Shen, Aiguo; Hu, Jiming

    2013-06-01

    Oral squamous cell carcinoma is the most common neoplasm of the oral cavity. The incidence rate accounts for 80% of total oral cancer and shows an upward trend in recent years. It has a high degree of malignancy and is difficult to detect in terms of differential diagnosis, as a consequence of which the timing of treatment is always delayed. In this work, Raman spectroscopy was adopted to differentially diagnose oral squamous cell carcinoma and oral gland carcinoma. In total, 852 entries of raw spectral data which consisted of 631 items from 36 oral squamous cell carcinoma patients, 87 items from four oral gland carcinoma patients and 134 items from five normal people were collected by utilizing an optical method on oral tissues. The probability distribution of the datasets corresponding to the spectral peaks of the oral squamous cell carcinoma tissue was analyzed and the experimental result showed that the data obeyed a normal distribution. Moreover, the distribution characteristic of the noise was also in compliance with a Gaussian distribution. A Gaussian process (GP) classification method was utilized to distinguish the normal people and the oral gland carcinoma patients from the oral squamous cell carcinoma patients. The experimental results showed that all the normal people could be recognized. 83.33% of the oral squamous cell carcinoma patients could be correctly diagnosed and the remaining ones would be diagnosed as having oral gland carcinoma. For the classification process of oral gland carcinoma and oral squamous cell carcinoma, the correct ratio was 66.67% and the erroneously diagnosed percentage was 33.33%. The total sensitivity was 80% and the specificity was 100% with the Matthews correlation coefficient (MCC) set to 0.447 213 595. Considering the numerical results above, the application prospects and clinical value of this technique are significantly impressive.
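
    For readers unfamiliar with the three figures quoted at the end of this abstract, the sketch below shows how sensitivity, specificity and the Matthews correlation coefficient follow from a binary confusion matrix. The counts are hypothetical and are not the study's data; the function name is an assumption made for illustration.

```python
import math


def confusion_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, MCC) from binary confusion counts."""
    sensitivity = tp / (tp + fn)  # fraction of true positives detected
    specificity = tn / (tn + fp)  # fraction of true negatives detected
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally taken as 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


# Hypothetical counts chosen only to illustrate the formulas.
sens, spec, mcc = confusion_metrics(tp=8, fn=2, tn=5, fp=0)
print(sens, spec, round(mcc, 4))
```

    Unlike sensitivity and specificity, MCC uses all four cells of the matrix, which is why it can be modest even when specificity is 100%.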

  6. Laser Raman detection for oral cancer based on a Gaussian process classification method

    International Nuclear Information System (INIS)

    Oral squamous cell carcinoma is the most common neoplasm of the oral cavity. The incidence rate accounts for 80% of total oral cancer and shows an upward trend in recent years. It has a high degree of malignancy and is difficult to detect in terms of differential diagnosis, as a consequence of which the timing of treatment is always delayed. In this work, Raman spectroscopy was adopted to differentially diagnose oral squamous cell carcinoma and oral gland carcinoma. In total, 852 entries of raw spectral data which consisted of 631 items from 36 oral squamous cell carcinoma patients, 87 items from four oral gland carcinoma patients and 134 items from five normal people were collected by utilizing an optical method on oral tissues. The probability distribution of the datasets corresponding to the spectral peaks of the oral squamous cell carcinoma tissue was analyzed and the experimental result showed that the data obeyed a normal distribution. Moreover, the distribution characteristic of the noise was also in compliance with a Gaussian distribution. A Gaussian process (GP) classification method was utilized to distinguish the normal people and the oral gland carcinoma patients from the oral squamous cell carcinoma patients. The experimental results showed that all the normal people could be recognized. 83.33% of the oral squamous cell carcinoma patients could be correctly diagnosed and the remaining ones would be diagnosed as having oral gland carcinoma. For the classification process of oral gland carcinoma and oral squamous cell carcinoma, the correct ratio was 66.67% and the erroneously diagnosed percentage was 33.33%. The total sensitivity was 80% and the specificity was 100% with the Matthews correlation coefficient (MCC) set to 0.447 213 595. Considering the numerical results above, the application prospects and clinical value of this technique are significantly impressive. (letter)

  7. A refined molecular taxonomy of breast cancer. : molecular classification of breast cancer

    OpenAIRE

    Guedj, Michael; Marisa, Laëtitia; De Reynies, Aurélien; Orsetti, Béatrice; Schiappa, Renaud; Bibeau, Frédéric; MacGrogan, Gaëtan; Lerebours, Florence; Finetti, Pascal; Longy, Michel; Bertheau, Philippe; Bertrand, Françoise; Bonnet, Françoise; Martin, Anne-Laure; Feugeas, Jean-Paul

    2012-01-01

    International audience; The current histoclinical breast cancer classification is simple but imprecise. Several molecular classifications of breast cancers based on expression profiling have been proposed as alternatives. However, their reliability and clinical utility have been repeatedly questioned, notably because most of them were derived from relatively small initial patient populations. We analyzed the transcriptomes of 537 breast tumors using three unsupervised classification methods. ...

  8. Pitch Based Sound Classification

    OpenAIRE

    Nielsen, Andreas Brinch; Hansen, Lars Kai; Kjems, U.

    2006-01-01

    A sound classification model is presented that can classify signals into music, noise and speech. The model extracts the pitch of the signal using the harmonic product spectrum. Based on the pitch estimate and a pitch error measure, features are created and used in a probabilistic model with soft-max output function. Both linear and quadratic inputs are used. The model is trained on 2 hours of sound and tested on publicly available data. A test classification error below 0.05 with 1 s classif...

  9. Mechanism-based classification and physical therapy management of persons with cancer pain: A prospective case series

    Directory of Open Access Journals (Sweden)

    Senthil P Kumar

    2013-01-01

    Full Text Available Context: Mechanism-based classification (MBC) was established with current evidence and physical therapy (PT) management methods for both cancer and noncancer pain. Aims: This study aims to describe the efficacy of MBC-based PT in persons with primary complaints of cancer pain. Settings and Design: A prospective case series of patients who attended the physiotherapy department of a multispecialty university-affiliated teaching hospital. Material and Methods: A total of 24 adults (18 female, 6 male) aged 47.5 ± 10.6 years, with a primary diagnosis of a heterogeneous group of cancers and chief complaints of chronic disabling pain, were included in the study on their consent for participation. The patients were evaluated and classified on the basis of five predominant mechanisms for pain. Physical therapy interventions were recommended based on the mechanisms identified, and a home program was prescribed with a patient log to ensure compliance. Treatments were given in five consecutive weekly sessions for five weeks each of 30 min duration. Statistical Analysis Used: Pre-post comparisons for the pain severity (PS) and pain interference (PI) subscales of the Brief Pain Inventory-Cancer Pain (BPI-CP) and the European Organization for Research and Treatment of Cancer Quality of Life Questionnaire (EORTC-QLQ-C30) were done using the Wilcoxon signed-rank test at 95% confidence interval using SPSS for Windows version 16.0 (SPSS Inc, Chicago, IL). Results: There were statistically significant (P < 0.05) reductions in pain severity, pain interference and total BPI-CP scores, and the EORTC-QLQ-C30. Conclusion: MBC-PT was effective for improving BPI-CP and EORTC-QLQ-C30 scores in people with cancer pain.

  10. Accurate molecular classification of cancer using simple rules

    OpenAIRE

    Gotoh Osamu; Wang Xiaosheng

    2009-01-01

    Abstract Background One intractable problem with using microarray data analysis for cancer classification is how to reduce the extremely high-dimensionality gene feature data to remove the effects of noise. Feature selection is often used to address this problem by selecting informative genes from among thousands or tens of thousands of genes. However, most of the existing methods of microarray-based cancer classification utilize too many genes to achieve accurate classification, which often ...

  11. Pitch Based Sound Classification

    DEFF Research Database (Denmark)

    Nielsen, Andreas Brinch; Hansen, Lars Kai; Kjems, U

    2006-01-01

    A sound classification model is presented that can classify signals into music, noise and speech. The model extracts the pitch of the signal using the harmonic product spectrum. Based on the pitch estimate and a pitch error measure, features are created and used in a probabilistic model with soft-max output function. Both linear and quadratic inputs are used. The model is trained on 2 hours of sound and tested on publicly available data. A test classification error below 0.05 with 1 s classification windows is achieved. Furthermore, it is shown that linear input performs as well as a quadratic, and that even though classification gets marginally better, not much is achieved by increasing the window size beyond 1 s.
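
    The harmonic product spectrum named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a naive DFT from the standard library, three harmonics, and a synthetic test frame; all function names and parameter values are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(signal):
    """Naive DFT magnitudes for bins 0..N/2 (O(N^2), fine for short frames)."""
    n = len(signal)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(signal))
        mags.append(abs(acc))
    return mags


def hps_pitch(signal, sample_rate, num_harmonics=3):
    """Estimate pitch as the bin maximising the product of its harmonics."""
    mags = magnitude_spectrum(signal)
    limit = (len(mags) - 1) // num_harmonics  # highest bin whose harmonics fit
    best_bin, best_val = 1, -1.0
    for k in range(1, limit + 1):
        prod = 1.0
        for h in range(1, num_harmonics + 1):
            prod *= mags[k * h]  # spectrum "downsampled" by factor h
        if prod > best_val:
            best_bin, best_val = k, prod
    return best_bin * sample_rate / len(signal)


# Synthetic frame: 100 Hz fundamental plus two weaker harmonics.
fs = 1024
frame = [sum(a * math.sin(2 * math.pi * 100 * h * t / fs)
             for h, a in ((1, 1.0), (2, 0.5), (3, 0.25)))
         for t in range(256)]
pitch = hps_pitch(frame, fs)
print(pitch)
```

    Multiplying the spectrum by its own downsampled copies rewards the bin whose integer multiples all carry energy, which is what makes the method attractive for harmonic sounds such as speech and music.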

  12. Cluster Based Text Classification Model

    DEFF Research Database (Denmark)

    Nizamani, Sarwat; Memon, Nasrullah; Wiil, Uffe Kock

    2011-01-01

    We propose a cluster based classification model for suspicious email detection and other text classification tasks. The text classification tasks comprise many training examples that require a complex classification model. Using clusters for classification makes the model simpler and increases [...], the classifier is trained on each cluster having reduced dimensionality and fewer examples. The experimental results show that the proposed model outperforms the existing classification models for the task of suspicious email detection and topic categorization on the Reuters-21578 and 20 Newsgroups datasets. Our model also outperforms A Decision Cluster Classification (ADCC) and the Decision Cluster Forest Classification (DCFC) models on the Reuters-21578 dataset.

  13. Clinical study of quantitative diagnosis of early cervical cancer based on the classification of acetowhitening kinetics

    Science.gov (United States)

    Wu, Tao; Cheung, Tak-Hong; Yim, So-Fan; Qu, Jianan Y.

    2010-03-01

    A quantitative colposcopic imaging system for the diagnosis of early cervical cancer is evaluated in a clinical study. This imaging technology based on 3-D active stereo vision and motion tracking extracts diagnostic information from the kinetics of acetowhitening process measured from the cervix of human subjects in vivo. Acetowhitening kinetics measured from 137 cervical sites of 57 subjects are analyzed and classified using multivariate statistical algorithms. Cross-validation methods are used to evaluate the performance of the diagnostic algorithms. The results show that an algorithm for screening precancer produced 95% sensitivity (SE) and 96% specificity (SP) for discriminating normal and human papillomavirus (HPV)-infected tissues from cervical intraepithelial neoplasia (CIN) lesions. For a diagnostic algorithm, 91% SE and 90% SP are achieved for discriminating normal tissue, HPV infected tissue, and low-grade CIN lesions from high-grade CIN lesions. The results demonstrate that the quantitative colposcopic imaging system could provide objective screening and diagnostic information for early detection of cervical cancer.

  14. Constructing Support Vector Machine Ensembles for Cancer Classification Based on Proteomic Profiling

    Institute of Scientific and Technical Information of China (English)

    Yong Mao; Xiao-Bo Zhou; Dao-Ying Pi; You-Xian Sun

    2005-01-01

    In this study, we present a constructive algorithm for training cooperative support vector machine ensembles (CSVMEs). CSVME combines ensemble architecture design with cooperative training for individual SVMs in ensembles. Unlike most previous studies on training ensembles, CSVME puts emphasis on both accuracy and collaboration among individual SVMs in an ensemble. A group of SVMs selected on the basis of recursive classifier elimination is used in CSVME, and the number of the individual SVMs selected to construct CSVME is determined by 10-fold cross-validation. This kind of SVME has been tested on two ovarian cancer datasets previously obtained by proteomic mass spectrometry. By combining several individual SVMs, the proposed method achieves better performance than the SVME of all base SVMs.

  15. Identification of immune cell infiltration in hematoxylin-eosin stained breast cancer samples: texture-based classification of tissue morphologies

    Science.gov (United States)

    Turkki, Riku; Linder, Nina; Kovanen, Panu E.; Pellinen, Teijo; Lundin, Johan

    2016-03-01

    The characteristics of immune cells in the tumor microenvironment of breast cancer capture clinically important information. Despite the heterogeneity of tumor-infiltrating immune cells, it has been shown that the degree of infiltration assessed by visual evaluation of hematoxylin-eosin (H&E) stained samples has prognostic and possibly predictive value. However, quantification of the infiltration in H&E-stained tissue samples is currently dependent on visual scoring by an expert. Computer vision enables automated characterization of the components of the tumor microenvironment, and texture-based methods have successfully been used to discriminate between different tissue morphologies and cell phenotypes. In this study, we evaluate whether local binary pattern texture features with superpixel segmentation and classification with support vector machine can be utilized to identify immune cell infiltration in H&E-stained breast cancer samples. Guided by the pan-leukocyte CD45 marker, we annotated training and test sets from 20 primary breast cancer samples. In the training set of arbitrary sized image regions (n=1,116) a 3-fold cross-validation resulted in 98% accuracy and an area under the receiver-operating characteristic curve (AUC) of 0.98 to discriminate between immune cell-rich and -poor areas. In the test set (n=204), we achieved an accuracy of 96% and AUC of 0.99 to label cropped tissue regions correctly into immune cell-rich and -poor categories. The obtained results demonstrate strong discrimination between immune cell-rich and -poor tissue morphologies. The proposed method can provide a quantitative measurement of the degree of immune cell infiltration and can be applied to digitally scanned H&E-stained breast cancer samples for diagnostic purposes.
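
    As an illustration of the local binary pattern texture features mentioned above, the sketch below computes the basic 8-neighbour LBP code and a 256-bin histogram for a small grey-level image. It assumes the simplest LBP variant (radius 1, no rotation invariance or uniform-pattern mapping); the abstract does not specify which variant was used, and the superpixel segmentation and support vector machine stages are omitted. The function names and pixel values are hypothetical.

```python
def lbp_code(image, row, col):
    """8-neighbour local binary pattern code for an interior pixel."""
    center = image[row][col]
    # Neighbours visited clockwise, starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # Set the bit when the neighbour is at least as bright as the centre.
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Histogram of LBP codes over all interior pixels (texture descriptor)."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


# Hypothetical 3x3 patches: a flat region and one with a bright corner.
flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
corner = [[9, 1, 1], [1, 5, 1], [1, 1, 1]]
print(lbp_code(flat, 1, 1), lbp_code(corner, 1, 1))
```

    The histogram of codes over a region, rather than any single code, is what would typically be passed to a classifier.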

  16. Accurate molecular classification of cancer using simple rules

    Directory of Open Access Journals (Sweden)

    Gotoh Osamu

    2009-10-01

    Full Text Available Abstract Background One intractable problem with using microarray data analysis for cancer classification is how to reduce the extremely high-dimensionality gene feature data to remove the effects of noise. Feature selection is often used to address this problem by selecting informative genes from among thousands or tens of thousands of genes. However, most of the existing methods of microarray-based cancer classification utilize too many genes to achieve accurate classification, which often hampers the interpretability of the models. For a better understanding of the classification results, it is desirable to develop simpler rule-based models with as few marker genes as possible. Methods We screened a small number of informative single genes and gene pairs on the basis of their depended degrees proposed in rough sets. Applying the decision rules induced by the selected genes or gene pairs, we constructed cancer classifiers. We tested the efficacy of the classifiers by leave-one-out cross-validation (LOOCV) of training sets and classification of independent test sets. Results We applied our methods to five cancerous gene expression datasets: leukemia (acute lymphoblastic leukemia [ALL] vs. acute myeloid leukemia [AML]), lung cancer, prostate cancer, breast cancer, and leukemia (ALL vs. mixed-lineage leukemia [MLL] vs. AML). Accurate classification outcomes were obtained by utilizing just one or two genes. Some genes that correlated closely with the pathogenesis of relevant cancers were identified. In terms of both classification performance and algorithm simplicity, our approach outperformed or at least matched existing methods. Conclusion In cancerous gene expression datasets, a small number of genes, even one or two if selected correctly, is capable of achieving an ideal cancer classification effect. This finding also means that very simple rules may perform well for cancerous class prediction.
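
    The evaluation idea in this abstract, a one-gene decision rule scored by leave-one-out cross-validation, can be sketched as below. This is an assumption-laden illustration: the rule here is a simple midpoint threshold between class means, swapped in for the rough-set depended-degree selection that the authors describe, and the expression values and function names are hypothetical.

```python
def fit_threshold_rule(values, labels):
    """Learn 'expression > cut => class hi' from one gene (two classes)."""
    a, b = sorted(set(labels))  # assumes exactly two class labels
    mean_a = sum(v for v, y in zip(values, labels) if y == a) / labels.count(a)
    mean_b = sum(v for v, y in zip(values, labels) if y == b) / labels.count(b)
    cut = (mean_a + mean_b) / 2  # midpoint between the class means
    hi, lo = (a, b) if mean_a > mean_b else (b, a)
    return cut, hi, lo


def predict(rule, value):
    cut, hi, lo = rule
    return hi if value > cut else lo


def loocv_accuracy(values, labels):
    """Leave-one-out cross-validation: refit the rule without each sample."""
    correct = 0
    for i in range(len(values)):
        train_v = values[:i] + values[i + 1:]
        train_l = labels[:i] + labels[i + 1:]
        rule = fit_threshold_rule(train_v, train_l)
        correct += predict(rule, values[i]) == labels[i]
    return correct / len(values)


# Hypothetical expression levels of a single gene in eight samples.
values = [2.1, 2.4, 1.9, 2.2, 6.8, 7.1, 6.5, 7.4]
labels = ["ALL"] * 4 + ["AML"] * 4
print(loocv_accuracy(values, labels))
```

    Refitting the rule inside every fold matters: choosing the threshold on all samples first and then testing on them would overstate the accuracy.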

  17. Gene selection and classification for cancer microarray data based on machine learning and similarity measures

    Directory of Open Access Journals (Sweden)

    Liu Qingzhong

    2011-12-01

    Full Text Available Abstract Background Microarray data have a high dimension of variables and a small sample size. In microarray data analyses, two important issues are how to choose genes, which provide reliable and good prediction for disease status, and how to determine the final gene set that is best for classification. Associations among genetic markers mean one can exploit information redundancy to potentially reduce classification cost in terms of time and money. Results To deal with redundant information and improve classification, we propose a gene selection method, Recursive Feature Addition, which combines supervised learning and statistical similarity measures. To determine the final optimal gene set for prediction and classification, we propose an algorithm, Lagging Prediction Peephole Optimization. By using six benchmark microarray gene expression data sets, we compared Recursive Feature Addition with recently developed gene selection methods: Support Vector Machine Recursive Feature Elimination, Leave-One-Out Calculation Sequential Forward Selection and several others. Conclusions On average, with the use of popular learning machines including Nearest Mean Scaled Classifier, Support Vector Machine, Naive Bayes Classifier and Random Forest, Recursive Feature Addition outperformed other methods. Our studies also showed that Lagging Prediction Peephole Optimization is superior to random strategy; Recursive Feature Addition with Lagging Prediction Peephole Optimization obtained better testing accuracies than the gene selection method varSelRF.

  18. Reproducibility of histologic classification of gastric cancer.

    OpenAIRE

    Palli, D; Bianchi, S.; Cipriani, F; Duca, P; Amorosi, A; C. Avellini; A. Russo; Saragoni, A; P. Todde; Valdes, E.

    1991-01-01

    A panel review of histologic specimens was carried out as part of a multi-centre case-control study of gastric cancer (GC) and diet. Comparisons of diagnoses of 100 GCs by six pathologists revealed agreement in histologic classification for about 70-80% of the cancers. Concordance was somewhat higher when using the Lauren rather than the Ming or World Health Organization classification systems. Histologic types from reading biopsy tissue agreed with those derived from surgical specimens for 6...

  19. Is cancer a disease that can be cured? An answer based on a new classification of diseases

    CERN Document Server

    Richmond, Peter

    2016-01-01

    Is cancer a disease that can be cured or a degenerative disease which comes predominantly with old age? We give an answer based on a two-dimensional representation of diseases. These two dimensions are defined as follows. In mortality curves there is an age, namely a_c = 10 years, which plays a crucial role in the sense that the mortality rate decreases in the interval I1 = (a < a_c) and increases in I2 = (a > a_c). The respective trends in I1 and I2 are the two parameters used in our classification of diseases. Within the framework of reliability analysis, I1 and I2 would be referred to as the "burn-in" and "wear-out" phases. This leads us to define three broad groups of diseases. (AS1) Asymmetry with prevalence of I1. (AS2) Asymmetry with prevalence of I2. (S) Symmetry, with I1 and I2 both playing roles of comparable importance. Not surprisingly, among AS1-cases one finds all diseases due to congenital malformations. In the AS2-class one finds degenerative diseases, e.g. Alzheimer's disease. Among S-cases one finds most diseases due to external p...

  20. A Gene Selection Approach based on Clustering for Classification Tasks in Colon Cancer

    Directory of Open Access Journals (Sweden)

    José Antonio CASTELLANOS GARZÓN

    2016-06-01

    Full Text Available Gene selection (GS) is an important research area in the analysis of DNA-microarray data, since it involves the discovery of genes that are meaningful for a particular target annotation or able to discriminate expression profiles of samples coming from different populations. In this context, a large number of filter methods have been proposed in the literature to identify subsets of relevant genes in accordance with prefixed targets. Despite the large number of proposals, the complexity imposed by this problem (GS) remains a challenge. Hence, this paper proposes a novel approach for gene selection that applies cluster techniques and filter methods to the groupings found in order to obtain informative gene subsets. By applying our methodology to Colon cancer data, we identified the most informative gene subset among several candidate subsets. The results obtained demonstrate the reliability of the approach given in this paper.

  1. AN APPROACH FOR BREAST CANCER DIAGNOSIS CLASSIFICATION USING NEURAL NETWORK

    Directory of Open Access Journals (Sweden)

    Htet Thazin Tike Thein

    2015-01-01

    Full Text Available Artificial neural networks have been widely used in recent years as an intelligent tool in various fields, such as artificial intelligence, pattern recognition, medical diagnosis and machine learning. The classification of breast cancer is a medical application that poses a great challenge for researchers and scientists. Recently, the neural network has become a popular tool in the classification of cancer datasets. Classification is one of the most active research and application areas of neural networks. The major disadvantages of the artificial neural network (ANN) classifier are its sluggish convergence and its tendency to become trapped in local minima. To overcome this problem, the differential evolution (DE) algorithm has been used to determine optimal or near-optimal values for ANN parameters. Previous studies have applied DE successfully to improve ANN learning. However, the DE approach still has some issues, such as longer training time and lower classification accuracy. To overcome these problems, an island-based model is proposed in this system. The aim of our study is to propose an approach for distinguishing between different classes of breast cancer. This approach is based on the Wisconsin Diagnostic and Prognostic Breast Cancer datasets and the classification of different types of breast cancer. The proposed system implements the island-based training method to achieve better accuracy and less training time by using and comparing two different migration topologies.

  2. Laser Raman detection for oral cancer based on an adaptive Gaussian process classification method with posterior probabilities

    Science.gov (United States)

    Du, Zhanwei; Yang, Yongjian; Bai, Yuan; Wang, Lijun; Su, Le; Chen, Yong; Li, Xianchang; Zhou, Xiaodong; Jia, Jun; Shen, Aiguo; Hu, Jiming

    2013-03-01

    The existing methods for early and differential diagnosis of oral cancer are limited due to the unapparent early symptoms and the imperfect imaging examination methods. In this paper, the classification models of oral adenocarcinoma, carcinoma tissues and a control group with just four features are established by utilizing the hybrid Gaussian process (HGP) classification algorithm, with the introduction of the mechanisms of noise reduction and posterior probability. HGP shows much better performance in the experimental results. During the experimental process, oral tissues were divided into three groups, adenocarcinoma (n = 87), carcinoma (n = 100) and the control group (n = 134). The spectral data for these groups were collected. The prospective application of the proposed HGP classification method improved the diagnostic sensitivity to 56.35% and the specificity to about 70.00%, and resulted in a Matthews correlation coefficient (MCC) of 0.36. It is proved that the utilization of HGP in LRS detection analysis for the diagnosis of oral cancer gives accurate results. The prospect of application is also satisfactory.

  3. Laser Raman detection for oral cancer based on an adaptive Gaussian process classification method with posterior probabilities

    International Nuclear Information System (INIS)

    The existing methods for early and differential diagnosis of oral cancer are limited due to the unapparent early symptoms and the imperfect imaging examination methods. In this paper, the classification models of oral adenocarcinoma, carcinoma tissues and a control group with just four features are established by utilizing the hybrid Gaussian process (HGP) classification algorithm, with the introduction of the mechanisms of noise reduction and posterior probability. HGP shows much better performance in the experimental results. During the experimental process, oral tissues were divided into three groups, adenocarcinoma (n = 87), carcinoma (n = 100) and the control group (n = 134). The spectral data for these groups were collected. The prospective application of the proposed HGP classification method improved the diagnostic sensitivity to 56.35% and the specificity to about 70.00%, and resulted in a Matthews correlation coefficient (MCC) of 0.36. It is proved that the utilization of HGP in LRS detection analysis for the diagnosis of oral cancer gives accurate results. The prospect of application is also satisfactory. (paper)

  4. Machine learning-based receiver operating characteristic (ROC) curves for crisp and fuzzy classification of DNA microarrays in cancer research.

    Science.gov (United States)

    Peterson, Leif E; Coleman, Matthew A

    2008-01-01

    Receiver operating characteristic (ROC) curves were generated to obtain classification area under the curve (AUC) as a function of feature standardization, fuzzification, and sample size from nine large sets of cancer-related DNA microarrays. Classifiers used included k nearest neighbor (kNN), naïve Bayes classifier (NBC), linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), learning vector quantization (LVQ1), logistic regression (LOG), polytomous logistic regression (PLOG), artificial neural networks (ANN), particle swarm optimization (PSO), constricted particle swarm optimization (CPSO), kernel regression (RBF), radial basis function networks (RBFN), gradient descent support vector machines (SVMGD), and least squares support vector machines (SVMLS). For each data set, AUC was determined for a number of combinations of sample size, total sum[-log(p)] of feature t-tests, with and without feature standardization and with (fuzzy) and without (crisp) fuzzification of features. Altogether, a total of 2,123,530 classification runs were made. At the greatest level of sample size, ANN resulted in a fitted AUC of 90%, while PSO resulted in the lowest fitted AUC of 72.1%. AUC values derived from 4NN were the most dependent on sample size, while PSO was the least. ANN depended the most on total statistical significance of features used based on sum[-log(p)], whereas PSO was the least dependent. Standardization of features increased AUC by 8.1% for PSO and -0.2% for QDA, while fuzzification increased AUC by 9.4% for PSO and reduced AUC by 3.8% for QDA. AUC determination in planned microarray experiments without standardization and fuzzification of features will benefit the most if CPSO is used for lower levels of feature significance (i.e., sum[-log(p)] ~ 50) and ANN is used for greater levels of significance (i.e., sum[-log(p)] ~ 500). When only standardization of features is performed, studies are likely to benefit most by using CPSO for low levels
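
    The AUC figure used throughout this abstract can be computed directly from classifier scores. The sketch below uses the pairwise (Mann-Whitney) formulation, which equals the area under the empirical ROC curve; the scores, labels and function name are hypothetical and are not taken from the study.

```python
def roc_auc(scores, labels):
    """AUC as the chance that a positive outscores a negative (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier outputs for six samples (1 = cancer, 0 = control).
scores = [0.9, 0.8, 0.7, 0.4, 0.35, 0.1]
labels = [1, 1, 0, 1, 0, 0]
auc = roc_auc(scores, labels)
print(round(auc, 3))
```

    Because this formulation depends only on the ranking of the scores, any monotone rescaling of a classifier's output leaves the AUC unchanged.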

  5. Magnetic resonance imaging texture analysis classification of primary breast cancer

    Energy Technology Data Exchange (ETDEWEB)

    Waugh, S.A.; Lerski, R.A. [Ninewells Hospital and Medical School, Department of Medical Physics, Dundee (United Kingdom); Purdie, C.A.; Jordan, L.B. [Ninewells Hospital and Medical School, Department of Pathology, Dundee (United Kingdom); Vinnicombe, S. [University of Dundee, Division of Imaging and Technology, Ninewells Hospital and Medical School, Dundee (United Kingdom); Martin, P. [Ninewells Hospital and Medical School, Department of Clinical Radiology, Dundee (United Kingdom); Thompson, A.M. [University of Texas MD Anderson Cancer Center, Department of Surgical Oncology, Houston, TX (United States)

    2016-02-15

    Patient-tailored treatments for breast cancer are based on histological and immunohistochemical (IHC) subtypes. Magnetic Resonance Imaging (MRI) texture analysis (TA) may be useful in non-invasive lesion subtype classification. Women with newly diagnosed primary breast cancer underwent pre-treatment dynamic contrast-enhanced breast MRI. TA was performed using co-occurrence matrix (COM) features, by creating a model on retrospective training data, then prospectively applying it to a test set. Analyses were blinded to breast pathology. Subtype classifications were performed using a cross-validated k-nearest-neighbour (k = 3) technique, with accuracy relative to pathology assessed and the area under the receiver operator curve (AUROC) calculated. Mann-Whitney U and Kruskal-Wallis tests were used to assess raw entropy feature values. Histological subtype classifications were similar across training (n = 148 cancers) and test sets (n = 73 lesions) using all COM features (training: 75 %, AUROC = 0.816; test: 72.5 %, AUROC = 0.823). Entropy features were significantly different between lobular and ductal cancers (p < 0.001; Mann-Whitney U). IHC classifications using COM features were also similar for training and test data (training: 57.2 %, AUROC = 0.754; test: 57.0 %, AUROC = 0.750). Hormone receptor positive and negative cancers demonstrated significantly different entropy features. Entropy features alone were unable to create a robust classification model. Textural differences on contrast-enhanced MR images may reflect underlying lesion subtypes, which merits testing against treatment response. (orig.)
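
    The entropy feature derived from a co-occurrence matrix can be sketched as follows. This is a minimal illustration assuming a single pixel offset, a non-symmetric matrix and base-2 logarithms; the abstract does not state the offsets, grey-level quantisation or normalisation that the authors used, and the function name and pixel values are hypothetical.

```python
import math
from collections import Counter


def cooccurrence_entropy(image, dr=0, dc=1):
    """Shannon entropy (bits) of the normalised grey-level co-occurrence
    matrix for the pixel offset (dr, dc)."""
    pairs = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                # Count how often grey level i sits next to grey level j.
                pairs[(image[r][c], image[r2][c2])] += 1
    total = sum(pairs.values())
    return -sum((n / total) * math.log2(n / total) for n in pairs.values())


# Hypothetical 2x2 patches: uniform texture versus a checkerboard.
uniform = [[1, 1], [1, 1]]
checker = [[0, 1], [1, 0]]
print(cooccurrence_entropy(uniform), cooccurrence_entropy(checker))
```

    Higher entropy indicates a more heterogeneous arrangement of grey levels, which is the property that the study compares between lesion subtypes.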

  6. A New Classification Approach Based on Multiple Classification Rules

    OpenAIRE

    Zhongmei Zhou

    2014-01-01

    A good classifier can correctly predict new data for which the class label is unknown, so it is important to construct a high accuracy classifier. Hence, classification techniques are very useful in ubiquitous computing. Associative classification achieves higher classification accuracy than some traditional rule-based classification approaches. However, the approach also has two major deficiencies. First, it generates a very large number of association classification rules, especially when t...

  7. At last: classification of human mammary cells elucidates breast cancer origins

    OpenAIRE

    Robert D Cardiff; Alexander D Borowsky

    2014-01-01

    Current breast cancer classification systems are based on molecular evaluation of tumor receptor status and do not account for distinct morphological phenotypes. In other types of cancer, taxonomy based on normal cell phenotypes has been extremely useful for diagnosis and treatment strategies. In this issue of the JCI, Santagata and colleagues developed a breast cancer classification scheme based on characterization of healthy mammary cells. Reclassification of breast cancer cells and breast ...

  8. Image Similarity to Improve the Classification of Breast Cancer Images

    OpenAIRE

    Dave Tahmoush

    2009-01-01

    Techniques in image similarity can be used to improve the classification of breast cancer images. Breast cancer images in the mammogram modality have an abundance of non-cancerous structures that are similar to cancer, which makes classifying images as containing cancer especially difficult. Only the cancerous part of the image is relevant, so the techniques must learn to recognize cancer in noisy mammograms and extract features from that cancer to appropriately classify ima...

  9. Reproducibility of histologic classification of gastric cancer.

    Science.gov (United States)

    Palli, D.; Bianchi, S.; Cipriani, F.; Duca, P.; Amorosi, A.; Avellini, C.; Russo, A.; Saragoni, A.; Todde, P.; Valdes, E.

    1991-01-01

    A panel review of histologic specimens was carried out as part of a multi-centre case-control study of gastric cancer (GC) and diet. Comparisons of diagnoses of 100 GCs by six pathologists revealed agreement in histologic classification for about 70-80% of the cancers. Concordance was somewhat higher when using the Lauren rather than the Ming or World Health Organization classification systems. Histologic types from reading biopsy tissue agreed with those derived from surgical specimens for 65-75% of the 100 tumours. Intra-observer agreement in histologic classification, assessed by repeat readings up to 3 years apart by one pathologist, was 95%. The findings indicate that, although overall concordance was good, it is important to standardise diagnoses in multi-centre epidemiologic studies of GC by histologic type. PMID:2039701

  10. Automated Breast Cancer Diagnosis based on GVF-Snake Segmentation, Wavelet Features Extraction and Neural Network Classification

    Directory of Open Access Journals (Sweden)

    Abderrahim Sebri

    2007-01-01

    Full Text Available Breast cancer accounts for the second most cancer diagnoses among women and the second most cancer deaths in the world. In fact, more than 11000 women die each year, all over the world, because of this disease. Automatic breast cancer diagnosis is an important goal of medical informatics research. Some research has aimed to automate diagnosis at the mammographic stage, while other work has addressed the cytological stage. In this work, we describe the current state of the ongoing BC automated diagnosis research program. It is a software system that provides expert diagnosis of breast cancer based on three steps of cytological image analysis. The first step is segmentation, using an active contour for cell tracking and isolation of the nucleus in the studied image. Textural features are then extracted from this nucleus using wavelet transforms to characterize the image by its texture, so that malignant texture can be differentiated from benign, on the assumption that tumoral texture differs from the texture of other kinds of tissue. Finally, the obtained features are used as the input vector of a Multi-Layer Perceptron (MLP) to classify the images as malignant or benign.

  11. Biogeography based Satellite Image Classification

    CERN Document Server

    Panchal, V K; Kaur, Navdeep; Kundra, Harish

    2009-01-01

    Biogeography is the study of the geographical distribution of biological organisms. The mindset of the engineer is that we can learn from nature. Biogeography Based Optimization is a burgeoning nature inspired technique to find the optimal solution of the problem. Satellite image classification is an important task because it is the only way we can know about the land cover map of inaccessible areas. Though satellite images have been classified in the past by using various techniques, researchers continue to seek alternative strategies for satellite image classification so that they may be prepared to select the most appropriate technique for the feature extraction task at hand. This paper is focused on classification of the satellite image of a particular land cover using the theory of Biogeography based Optimization. The original BBO algorithm does not have the inbuilt property of clustering which is required during image classification. Hence modifications have been proposed to the original algorithm and...

  12. Classification method based on KCCA

    Science.gov (United States)

    Wang, Zhanqing; Zhang, Guilin; Zhao, Guangzhou

    2007-11-01

    Nonlinear CCA extends linear CCA in that it operates in the kernel space and thus implies nonlinear combinations in the original space. This paper presents a classification method based on kernel canonical correlation analysis (KCCA). We introduce probabilistic label vectors (PLV) for a given pattern, which extend the conventional concept of class label, and investigate the correlation between feature variables and PLV variables. A PLV predictor is presented based on KCCA, and then classification is performed on the predicted PLV. We formulate a framework for classification by integrating class information through PLV. Experimental results on Iris data set classification and facial expression recognition show the effectiveness of the proposed method.

  13. Reverse phase protein array based tumor profiling identifies a biomarker signature for risk classification of hormone receptor-positive breast cancer

    Directory of Open Access Journals (Sweden)

    Johanna Sonntag

    2014-03-01

    Full Text Available A robust subclassification of luminal breast cancer, the most common molecular subtype of human breast cancer, is crucial for therapy decisions. While some patients are at higher risk of recurrence and require chemo-endocrine treatment, others are at lower risk and also respond poorly to chemotherapeutic regimens. To approximate the risk of cancer recurrence, clinical guidelines recommend determining histologic grading and abundance of a cell proliferation marker in tumor specimens. However, this approach assigns an intermediate risk to a substantial number of patients and in addition suffers from high interobserver variability. Therefore, the aim of our study was to identify a quantitative protein biomarker signature to facilitate risk classification. Reverse phase protein arrays (RPPA) were used to obtain quantitative expression data for 128 breast cancer relevant proteins in a set of hormone receptor-positive tumors (n = 109). Proteomic data for the subset of histologic G1 (n = 14) and G3 (n = 22) samples were used for biomarker discovery, serving as surrogates of low and high recurrence risk, respectively. A novel biomarker selection workflow based on combining three different classification methods identified caveolin-1, NDKA, RPS6, and Ki-67 as top candidates. NDKA, RPS6, and Ki-67 were expressed at elevated levels in high risk tumors whereas caveolin-1 was observed as downregulated. The identified biomarker signature was subsequently analyzed using an independent test set (AUC = 0.78). Further evaluation of the identified biomarker panel by Western blot and mRNA profiling confirmed the proteomic signature obtained by RPPA. In conclusion, the biomarker signature introduced supports RPPA as a tool for cancer biomarker discovery.

  14. Classification-based reasoning

    Science.gov (United States)

    Gomez, Fernando; Segami, Carlos

    1991-01-01

    A representation formalism for N-ary relations, quantification, and definition of concepts is described. Three types of conditions are associated with the concepts: (1) necessary and sufficient properties, (2) contingent properties, and (3) necessary properties. Also explained is how complex chains of inferences can be accomplished by representing existentially quantified sentences, and concepts denoted by restrictive relative clauses as classification hierarchies. The representation structures that make possible the inferences are explained first, followed by the reasoning algorithms that draw the inferences from the knowledge structures. All the ideas explained have been implemented and are part of the information retrieval component of a program called Snowy. An appendix contains a brief session with the program.

  15. Classification of treatment-related mortality in children with cancer

    DEFF Research Database (Denmark)

    Alexander, Sarah; Pole, Jason D; Gibson, Paul;

    2015-01-01

    Treatment-related mortality is an important outcome in paediatric cancer clinical trials. An international group of experts in supportive care in paediatric cancer developed a consensus-based definition of treatment-related mortality and a cause-of-death attribution system. The reliability...... and validity of the system was tested in 30 deaths, which were independently assessed by two clinical research associates and two paediatric oncologists. We defined treatment-related mortality as death occurring in the absence of progressive cancer. Of the 30 reviewed deaths, the reliability of classification...

  16. Diagnostic Classification of Normal Persons and Cancer Patients by Using Neural Network Based on Trace Metal Contents in Serum Samples

    Institute of Scientific and Technical Information of China (English)

    ZHANG; Zhuo-yong

    2001-01-01


  17. Classification of base sequences

    CERN Document Server

    Djokovic, Dragomir Z

    2010-01-01

    Base sequences BS(n+1,n) are quadruples of {1,-1}-sequences (A;B;C;D), with A and B of length n+1 and C and D of length n, such that the sum of their nonperiodic autocorrelation functions is a delta-function. The base sequence conjecture, asserting that BS(n+1,n) exist for all n, is stronger than the famous Hadamard matrix conjecture. We introduce a new definition of equivalence for base sequences BS(n+1,n) and construct a canonical form. By using this canonical form, we have enumerated the equivalence classes of BS(n+1,n) for n <= 30. Due to excessive size of the equivalence classes, the tables in the paper cover only the cases n <= 12.
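
The defining property above can be checked mechanically. The following is a minimal sketch, assuming the usual definition of the nonperiodic autocorrelation function, N_X(s) = sum over i of x_i * x_(i+s); the function names and the two small example quadruples are illustrative and are not taken from the paper's tables.

```python
def npaf(seq, shift):
    """Nonperiodic autocorrelation of a sequence at a given non-negative shift."""
    return sum(seq[i] * seq[i + shift] for i in range(len(seq) - shift))

def is_base_sequence(a, b, c, d):
    """Check whether (a; b; c; d) satisfies the BS(n+1, n) conditions."""
    n = len(c)
    # A and B must have length n+1, C and D length n.
    if len(a) != n + 1 or len(b) != n + 1 or len(d) != n:
        return False
    # All entries must be +1 or -1.
    if any(x not in (1, -1) for s in (a, b, c, d) for x in s):
        return False
    # The summed autocorrelations must vanish at every nonzero shift.
    for shift in range(1, n + 1):
        total = npaf(a, shift) + npaf(b, shift) + npaf(c, shift) + npaf(d, shift)
        if total != 0:
            return False
    return True

# Two small hand-checked examples for n = 1 and n = 2.
print(is_base_sequence([1, 1], [1, -1], [1], [1]))
print(is_base_sequence([1, 1, 1], [1, 1, -1], [1, -1], [1, -1]))
```

At shift 0 the sum is always 2(2n+1), the total number of entries, so only the nonzero shifts need to be tested.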

  18. Classification of Rat FTIR Colon Cancer Data Using Wavelets and BPNN

    Institute of Scientific and Technical Information of China (English)

    CHENG,Cungui; XIONG,Wei; TIAN,Yumei

    2009-01-01

    A feature extraction method based on wavelets for horizontal attenuated total reflectance Fourier transform infrared spectroscopy (HATR-FTIR) and cancer classification using an artificial neural network trained with the back-propagation algorithm is presented. The FTIR data collected from 36 normal Sprague-Dawley (SD) rats, 60 1,2-DMH-induced SD rats, and 44 second generation rats of those induced rats were first preprocessed. Then, 12 feature variants were extracted using continuous wavelet analysis. Based on BPNN classification, all spectra were classified into two categories: normal and abnormal ones. The accuracy values of identifying normal, dysplastic, early carcinoma, and advanced carcinoma were 100%, 94%, 97.5%, and 100%, respectively. This result indicated that FTIR with continuous wavelet transform (CWT) and the back-propagation neural network (BPNN) could effectively and easily diagnose colon cancer in its early stages.

  19. Clinicopathological classification and individualized treatment of breast cancer

    Institute of Scientific and Technical Information of China (English)

    HU Hui; LIU Yin-hua; XU Ling; ZHAO Jian-xin; DUAN Xue-ning; YE Jing-ming; LI Ting

    2013-01-01

    Background The clinicopathological classification was proposed in the St. Gallen Consensus Report 2011. We conducted a retrospective analysis of breast cancer subtypes, tumor-nodal-metastatic (TNM) staging, and histopathological grade to investigate the value of these parameters in the treatment strategies of invasive breast cancer. Methods A retrospective analysis of breast cancer subtypes, TNM staging, and histopathological grading of 213 cases has been performed by the methods recommended in the St. Gallen International Expert Consensus Report 2011. The estrogen receptor (ER), progesterone receptor (PR), human epidermal growth factor receptor-2 (HER2), and Ki-67 of 213 tumor samples have been investigated by immunohistochemistry according to methods for classifying breast cancer subtypes proposed in the St. Gallen Consensus Report 2011. Results The luminal A subtype was found in 53 patients (24.9%), the luminal B subtype was found in 112 patients (52.6%), the HER2-positive subtype was found in 22 patients (10.3%), and the triple-negative subtype was found in 26 patients (12%). Histopathological grade and TNM staging differed significantly among the four subtypes of breast cancer (P<0.001). Conclusion It is important to consider TNM staging and histopathological grading in the treatment strategies of breast cancer based on the current clinicopathological classification methods.

  20. Classification of Cancer Recurrence with Alpha-Beta BAM

    Directory of Open Access Journals (Sweden)

    María Elena Acevedo

    2009-01-01

    Full Text Available Bidirectional Associative Memories (BAMs) based on the first model proposed by Kosko do not have perfect recall of the training set, and their algorithm must iterate until it reaches a stable state. In this work, we use the Alpha-Beta BAM model to automatically classify cancer recurrence in female patients with previous breast cancer surgery. Alpha-Beta BAM presents perfect recall of all the training patterns and has a one-shot algorithm; these advantages make Alpha-Beta BAM a suitable tool for classification. We use data from the Haberman database, and the leave-one-out algorithm was applied to analyze the performance of our model as a classifier. We obtain a classification rate of 99.98%.
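
The leave-one-out procedure mentioned above can be sketched as follows. This is only an illustration of the evaluation loop: the classifier plugged in here is a plain nearest-neighbour rule, not the Alpha-Beta BAM, and the toy samples are invented for the example rather than drawn from the Haberman data.

```python
def nearest_neighbour_predict(train, query):
    """Return the label of the training sample closest to `query`.

    `train` is a list of (features, label) pairs; distance is squared Euclidean.
    This stand-in classifier is an assumption, not the Alpha-Beta BAM.
    """
    best = min(train, key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)))
    return best[1]

def leave_one_out_accuracy(samples, predict=nearest_neighbour_predict):
    """Hold out each sample once, train on the rest, and report the hit rate."""
    correct = 0
    for i, (features, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        if predict(train, features) == label:
            correct += 1
    return correct / len(samples)

# Hypothetical two-class toy data with well-separated clusters.
toy = [((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((1.0, 1.0), "b"), ((0.9, 1.1), "b")]
print(leave_one_out_accuracy(toy))
```

Any classifier with the same `predict(train, query)` shape can be passed in, which keeps the evaluation loop independent of the model being assessed.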

  1. Computer aided decision support system for cervical cancer classification

    Science.gov (United States)

    Rahmadwati, Rahmadwati; Naghdy, Golshah; Ros, Montserrat; Todd, Catherine

    2012-10-01

    Conventional analysis of a cervical histology image, such as a Pap smear or a biopsy sample, is performed manually by an expert pathologist. This involves inspecting the sample for cellular level abnormalities and determining the spread of the abnormalities. Cancer is graded based on the spread of the abnormal cells. This is a tedious, subjective and time-consuming process with considerable variations in diagnosis between experts. This paper presents a computer aided decision support system (CADSS) tool to help pathologists in their examination of cervical cancer biopsies. The main aim of the proposed CADSS system is to identify abnormalities and quantify cancer grading in a systematic and repeatable manner. The paper proposes three different methods and presents and compares their results using 475 images of cervical biopsies, which include normal cases, three stages of pre-cancer, and malignant cases. This paper will explore various components of an effective CADSS: image acquisition, pre-processing, segmentation, feature extraction, classification, grading and disease identification. Cervical histological images are captured using a digital microscope. The images are captured in sufficient resolution to retain enough information for effective classification. Histology images of cervical biopsies consist of three major sections: background, stroma and squamous epithelium. Most diagnostic information is contained within the epithelium region. This paper will present two levels of segmentation: global (macro) and local (micro). At the global level the squamous epithelium is separated from the background and stroma. At the local or cellular level, the nuclei and cytoplasm are segmented for further analysis. Image features that influence the pathologists' decision during the analysis and classification of a cervical biopsy are the nuclei's shape and spread; the ratio of the areas of nuclei and cytoplasm as well as the texture and spread of the abnormalities

  2. Modulation classification based on spectrogram

    Institute of Scientific and Technical Information of China (English)

    2005-01-01

    The aim of modulation classification (MC) is to identify the modulation type of a communication signal. It plays an important role in many cooperative or noncooperative communication applications. Three spectrogram-based modulation classification methods are proposed. Their recognition scope and performance are investigated and evaluated by theoretical analysis and extensive simulation studies. The method taking moment-like features is robust to frequency offset, while the other two, which make use of principal component analysis (PCA) with different transformation inputs, can achieve satisfactory accuracy even at low SNR (as low as 2 dB). Due to the properties of the spectrogram, the statistical pattern recognition techniques, and the image preprocessing steps, all of our methods are insensitive to unknown phase and frequency offsets, timing errors, and the arriving sequence of symbols.

  3. Classification of neuropathic pain in cancer patients

    DEFF Research Database (Denmark)

    Brunelli, Cinzia; Bennett, Michael I; Kaasa, Stein;

    2014-01-01

    Neuropathic pain (NP) in cancer patients lacks standards for diagnosis. This study is aimed at reaching consensus on the application of the International Association for the Study of Pain (IASP) special interest group for neuropathic pain (NeuPSIG) criteria to the diagnosis of NP in cancer patients...... was found on the statement "the pathophysiology of NP due to cancer can be different from non-cancer NP" (MED=9, IQR=2). Satisfactory consensus was reached for the first 3 NeuPSIG criteria (pain distribution, history, and sensory findings; MEDs⩾8, IQRs⩽3), but not for the fourth one (diagnostic test....../imaging; MED=6, IQR=3). Agreement was also reached on clinical examination by soft brush or pin stimulation (MEDs⩾7 and IQRs⩽3) and on the use of PRO descriptors for NP screening (MED=8, IQR=3). Based on the study results, a clinical algorithm for NP diagnostic criteria in cancer patients with pain...

  4. Network-Based Logistic Classification with an Enhanced L1/2 Solver Reveals Biomarker and Subnetwork Signatures for Diagnosing Lung Cancer

    Directory of Open Access Journals (Sweden)

    Hai-Hui Huang

    2015-01-01

    Full Text Available Identifying biomarkers and signaling pathways is a critical step in genomic studies, in which regularization is a widely used feature extraction approach. However, most regularizers are based on the L1-norm; their results are not good enough in terms of sparsity and interpretation and are asymptotically biased, especially in genomic research. Recently, a large amount of molecular interaction information about disease-related biological processes has been gathered in various databases, which focus on many aspects of biological systems. In this paper, we use an enhanced L1/2 penalized solver to penalize a network-constrained logistic regression model, called an enhanced L1/2 net, where the predictors are based on gene-expression data with biologic network knowledge. Extensive simulation studies showed that our proposed approach outperforms L1 regularization, the old L1/2 penalized solver, and the Elastic net approaches in terms of classification accuracy and stability. Furthermore, we applied our method to lung cancer data analysis and found that it achieves higher predictive accuracy than L1 regularization, the old L1/2 penalized solver, and the Elastic net approaches, while selecting fewer but informative biomarkers and pathways.

  5. Impact of esophageal cancer staging on overall survival and disease-free survival based on the 2010 AJCC classification by lymph nodes

    International Nuclear Information System (INIS)

    This retrospective study investigated the effect of modifications presented in the seventh edition of the American Joint Committee on Cancer (AJCC) Manual for staging esophageal cancer on the characterization of the effectiveness of post-operative chemotherapy and/or radiotherapy, as measured by overall and disease-free survival. The seventh edition of the AJCC Manual classifies the number of lymph nodes (N) positive for regional metastasis into three subclasses. We used the AJCC classification system to characterize the cancers of 413 Chinese patients with esophageal cancer who underwent radical resection plus regional lymph node dissection over a 10-year period. The 10-year survival rate was 14.3% for stage N1 patients and 6.1% for stage N2 patients. Only one stage N3 patient was followed >4 years (53.4 months). The 10-year disease-free rate was 13.6% for stage N1 patients. Patients with stage N2 or N3 cancer were more likely to have tumor recurrences, metastases or death than patients with stage N1 cancer. Post-operative radiotherapy provided no survival benefit, and may have had a negative effect on survival. In this study, the N stage of esophageal cancer was an independent factor affecting overall and disease-free survival. Our results did not clarify whether or not radiotherapy after radical esophagectomy offers any survival benefit to patients with esophageal cancer. (author)

  6. Molecular Classification of Gastric Cancer: A new paradigm

    Science.gov (United States)

    Shah, Manish A.; Khanin, Raya; Tang, Laura; Janjigian, Yelena Y.; Klimstra, David S.; Gerdes, Hans; Kelsen, David P.

    2011-01-01

    Purpose Gastric cancer may be subdivided into three distinct subtypes –proximal, diffuse, and distal gastric cancer– based on histopathologic and anatomic criteria. Each subtype is associated with unique epidemiology. Our aim is to test the hypothesis that these distinct gastric cancer subtypes may also be distinguished by gene expression analysis. Experimental Design Patients with localized gastric adenocarcinoma being screened for a phase II preoperative clinical trial (NCI 5917) underwent endoscopic biopsy for fresh tumor procurement. 4–6 targeted biopsies of the primary tumor were obtained. Macrodissection was performed to ensure >80% carcinoma in the sample. HG-U133A GeneChip (Affymetrix) was used for cDNA expression analysis, and all arrays were processed and analyzed using the Bioconductor R-package. Results Between November 2003 and January 2006, 57 patients were screened to identify 36 patients with localized gastric cancer who had adequate RNA for expression analysis. Using supervised analysis, we built a classifier to distinguish the three gastric cancer subtypes, successfully classifying each into tightly grouped clusters. Leave-one-out cross validation error was 0.14, suggesting that >85% of samples were classified correctly. Gene set analysis with the False Discovery Rate set at 0.25 identified several pathways that were differentially regulated when comparing each gastric cancer subtype to adjacent normal stomach. Conclusions Subtypes of gastric cancer that have epidemiologic and histologic distinction are also distinguished by gene expression data. These preliminary data suggest a new classification of gastric cancer with implications for improving our understanding of disease biology and identification of unique molecular drivers for each gastric cancer subtype. PMID:21430069

  7. Automated classification of histopathology images of prostate cancer using a Bag-of-Words approach

    Science.gov (United States)

    Sanghavi, Foram M.; Agaian, Sos S.

    2016-05-01

    The goals of this paper are to (1) test computer-aided classification of prostate cancer histopathology images based on the Bag-of-Words (BoW) approach, (2) evaluate the performance of the proposed method in classifying grades 3 and 4 against the results of the approach proposed by Khurd et al. in [9], and (3) classify the different grades of cancer, namely grades 0, 3, 4, and 5, using the proposed approach. The system performance is assessed using 132 prostate cancer histopathology images of different grades. The performance of SURF features is also analyzed by comparing the results with SIFT features using different cluster sizes. The results show 90.15% accuracy in detection of prostate cancer images using SURF features with 75 clusters for k-means clustering. The results showed higher sensitivity for SURF-based BoW classification compared to SIFT-based BoW.

  8. Arabic Text Mining Using Rule Based Classification

    OpenAIRE

    Fadi Thabtah; Omar Gharaibeh; Rashid Al-Zubaidy

    2012-01-01

    A well-known classification problem in the domain of text mining is text classification, which concerns mapping textual documents into one or more predefined categories based on their content. Text classification has recently attracted many researchers because of the massive amounts of online documents and text archives which hold essential information for decision-making processes. In this field, most such research focuses on classifying English documents while there are limited studi...

  9. Sparse discriminant analysis for breast cancer biomarker identification and classification

    Institute of Scientific and Technical Information of China (English)

    Yu Shi; Daoqing Dai; Chaochun Liu; Hong Yan

    2009-01-01

    Biomarker identification and cancer classification are two important procedures in microarray data analysis. We propose a novel unified method to carry out both tasks. We first preselect biomarker candidates by eliminating unrelated genes through the BSS/WSS ratio filter to reduce computational cost, and then use a sparse discriminant analysis method for simultaneous biomarker identification and cancer classification. Moreover, we give a mathematical justification for automatic biomarker identification. Experimental results show that the proposed method can identify key genes that have been verified in biochemical or biomedical research and classify the breast cancer type correctly.

  10. Visual words based approach for tissue classification in mammograms

    Science.gov (United States)

    Diamant, Idit; Goldberger, Jacob; Greenspan, Hayit

    2013-02-01

    The presence of microcalcifications (MC) is an important indicator for developing breast cancer. Additional indicators for cancer risk exist, such as breast tissue density type. Different methods have been developed for breast tissue classification for use in computer-aided diagnosis systems. Recently, the visual words (VW) model has been successfully applied to different classification tasks. The goal of our work is to explore VW-based methodologies for various mammography classification tasks. We start with the challenge of classifying breast density and then focus on classification of normal tissue versus microcalcifications. The presented methodology is based on a patch-based visual words model, which includes building a dictionary for a training set using local descriptors and representing the image using a visual word histogram. Classification is then performed using k-nearest-neighbour (KNN) and support vector machine (SVM) classifiers. We tested our algorithm on the publicly available MIAS and DDSM datasets. The input is a representative region-of-interest per mammography image, manually selected and labelled by an expert. In the tissue density task, classification accuracy reached 85% using KNN and 88% using SVM, which competes with state-of-the-art results. For MC vs. normal tissue, accuracy reached 95.6% using SVM. Results demonstrate the feasibility of classifying breast tissue using our model. Currently, we are improving the results further while also investigating the capability of VW to address additional important mammogram classification problems. We expect that the methodology presented will enable high levels of classification, suggesting new means for automated tools for mammography diagnosis support.
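
The histogram step of a visual words model, as described above, can be sketched in a few lines. This is a simplified illustration under stated assumptions: the dictionary is given directly rather than learned from training patches, descriptors are short tuples of numbers, and the function names are hypothetical and not from the authors' implementation.

```python
def nearest_word(descriptor, dictionary):
    """Index of the dictionary entry closest to the descriptor (squared Euclidean)."""
    return min(
        range(len(dictionary)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(descriptor, dictionary[k])),
    )

def visual_word_histogram(descriptors, dictionary):
    """Normalised count of how often each visual word is the nearest match."""
    hist = [0] * len(dictionary)
    for d in descriptors:
        hist[nearest_word(d, dictionary)] += 1
    total = sum(hist)
    # An image with no descriptors keeps an all-zero histogram.
    return [h / total for h in hist] if total else hist

# Hypothetical two-word dictionary and four patch descriptors from one image.
dictionary = [(0.0, 0.0), (1.0, 1.0)]
patches = [(0.1, 0.0), (0.9, 1.0), (1.0, 1.2), (0.0, 0.1)]
print(visual_word_histogram(patches, dictionary))
```

The resulting fixed-length histogram is what a KNN or SVM classifier would take as its input vector, regardless of how many patches each image contributes.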

  11. [Molecular classification of bladder cancer. Possible similarities to breast cancer].

    Science.gov (United States)

    Wirtz, R M; Fritz, V; Stöhr, R; Hartmann, A

    2016-02-01

    Therapeutic decisions for breast cancer are increasingly based on subtype-specific gene expression tests. For bladder cancer, very similar subtypes have been identified by genome-wide mRNA analysis, which, as in breast cancer, differ with respect to prognosis and response to therapy on the basis of their hormone dependency. At the DNA level, however, the type of mutations and their frequencies within the subtypes are strikingly different between bladder and breast cancers. It will be interesting to see whether possible driver mutations can serve as therapeutic targets in both indications. In contrast, the apparent hormone dependency of a substantial number of bladder carcinomas suggests that hormonal and anti-hormonal treatment can be valid therapy options, similar to breast cancer. Moreover, gender-specific differences with respect to the incidence and aggressiveness of male compared to female bladder cancers can be explained by hormonal effects. Together with forthcoming immunomodulatory therapies, these multiple therapy options give new hope of combating this aggressive disease efficiently. PMID:26780243

  12. Multi-Organ Cancer Classification and Survival Analysis

    OpenAIRE

    Bauer, Stefan; Carion, Nicolas; Schüffler, Peter; Fuchs, Thomas; Wild, Peter; Buhmann, Joachim M.

    2016-01-01

    Accurate and robust cell nuclei classification is the cornerstone for a wider range of tasks in digital and Computational Pathology. However, most machine learning systems require extensive labeling from expert pathologists for each individual problem at hand, with no or limited abilities for knowledge transfer between datasets and organ sites. In this paper we implement and evaluate a variety of deep neural network models and model ensembles for nuclei classification in renal cell cancer (RC...

  13. Classification of oral cancers using Raman spectroscopy of serum

    Science.gov (United States)

    Sahu, Aditi; Talathi, Sneha; Sawant, Sharada; Krishna, C. Murali

    2014-03-01

    Oral cancers are the sixth most common malignancy worldwide, with low 5-year disease-free survival rates, attributable to late detection due to lack of reliable screening modalities. Our in vivo Raman spectroscopy studies have demonstrated classification of normal and tumor as well as cancer field effects (CFE), the earliest events in oral cancers. In view of limitations of this approach, such as the requirement of on-site instrumentation and stringent experimental conditions, the feasibility of classifying normal and cancer using serum was explored using 532 nm excitation. In that study, strong resonance features of β-carotenes, present differentially in normal and pathological conditions, were observed. In the present study, Raman spectra of sera of 36 buccal mucosa cancer subjects, 33 tongue cancer subjects and 17 healthy subjects were recorded using a Raman microprobe coupled with a 40X objective using 785 nm excitation, a known source of excitation for biomedical applications. To eliminate heterogeneity, the average of 3 spectra recorded from each sample was subjected to PC-LDA followed by leave-one-out cross-validation. Findings indicate an average classification efficiency of ~70% for normal and cancer. Buccal mucosa and tongue cancer serum could also be classified with an efficiency of ~68%. Of the two cancers, buccal mucosa cancer and normal could be classified with a higher efficiency. Findings of the study are quite comparable to those of our earlier study, which suggests that there exist significant differences, other than β-carotenes, between normal and cancerous samples which can be exploited for the classification. Prospectively, extensive validation studies will be undertaken to confirm the findings.
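
    Leave-one-out cross-validation, as used in this abstract, holds out each sample in turn, trains on the rest and scores the held-out prediction. The sketch below illustrates only that validation loop. As a stated substitution, a simple nearest-centroid classifier stands in for PC-LDA, and the two-feature "spectra" and labels are hypothetical.

```python
import math

def fit_centroids(X, y):
    """Mean feature vector per class (a stand-in for the PC-LDA model)."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for j, value in enumerate(x):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}

def predict(centroids, x):
    """Label of the centroid nearest to x."""
    return min(centroids, key=lambda lab: math.dist(x, centroids[lab]))

def loocv_accuracy(X, y):
    """Fraction of samples classified correctly when each is held out once."""
    correct = 0
    for i in range(len(X)):
        model = fit_centroids(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        correct += predict(model, X[i]) == y[i]
    return correct / len(X)

# Hypothetical toy data: three "normal" and three "cancer" samples.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [1.0, 0.9], [0.8, 1.1], [0.9, 1.0]]
y = ["normal"] * 3 + ["cancer"] * 3
accuracy = loocv_accuracy(X, y)
```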

  14. Cancer stem cell-related marker expression in lung adenocarcinoma and relevance of histologic subtypes based on IASLC/ATS/ERS classification

    Directory of Open Access Journals (Sweden)

    Shimada Y

    2013-11-01

    Full Text Available Yoshihisa Shimada,1 Hisashi Saji,3 Masaharu Nomura,1,2 Jun Matsubayashi,2 Koichi Yoshida,1 Masatoshi Kakihana,1 Naohiro Kajiwara,1 Tatsuo Ohira,1 Norihiko Ikeda1. 1 Department of Surgery I, 2 Department of Anatomic Pathology, Tokyo Medical University Hospital, Tokyo, Japan; 3 Department of Chest Surgery, St Marianna University School of Medicine, Kawasaki, Japan. Background: The cancer stem cell (CSC) theory has been proposed to explain tumor heterogeneity and the carcinogenesis of solid tumors. The aim of this study was to clarify the clinical role of CSC-related markers in patients with lung adenocarcinoma and to determine whether each CSC-related marker expression correlates with the histologic subtyping proposed by the International Association for the Study of Lung Cancer (IASLC), the American Thoracic Society (ATS), and the European Respiratory Society (ERS) classifications. Methods: We reviewed data for all 103 patients in whom complete resection of adenocarcinoma had been performed. Expression of CSC-related markers, ie, aldehyde dehydrogenase 1A1 (ALDH1A1), aldo-keto reductase 1C family member 1 (AK1C1), and 1C family member 3 (AK1C3), was examined using immunostaining on whole-mount tissue slides, and the tumors were reclassified according to the IASLC/ATS/ERS classification. Results: ALDH1A1 expression was observed in 66.0% of tumors, AK1C1 in 62.7%, and AK1C3 in 86.1%. Immunoreactivities with the frequency of mean expression of ALDH1A1 in papillary predominant adenocarcinoma were significantly higher than those of solid predominant adenocarcinoma (P<0.05). Papillary predominant adenocarcinoma had significantly lower expression of AK1C1 when compared with noninvasive or solid predominant adenocarcinomas (P<0.05). On multivariate analysis, larger tumor size (hazards ratio 1.899, P=0.044), lymph node metastasis (hazards ratio 2.702, P=0.005), and low expression of ALDH1A1 (hazards ratio 3.218, P<0.001) were shown to be independently associated with an

  15. Novel approaches for the molecular classification of prostate cancer

    Institute of Scientific and Technical Information of China (English)

    Robert H. Getzenberg

    2010-01-01

    Among the urologic cancers, prostate cancer is by far the most common, and it appears to have the potential to affect almost all men throughout the world as they age. A number of studies have shown that many men with prostate cancer will not die from their disease, but rather die with the disease, from other causes. These men have a form of prostate cancer that is described as "very low risk" and has often been called indolent. There is, however, a group of men who have a form of prostate cancer that is much more aggressive and life threatening. Unlike for other cancer types, we have few tools to provide for the molecular classification of prostate cancer.

  16. Cancer classification using the Immunoscore: a worldwide task force.

    Science.gov (United States)

    Galon, Jérôme; Pagès, Franck; Marincola, Francesco M; Angell, Helen K; Thurin, Magdalena; Lugli, Alessandro; Zlobec, Inti; Berger, Anne; Bifulco, Carlo; Botti, Gerardo; Tatangelo, Fabiana; Britten, Cedrik M; Kreiter, Sebastian; Chouchane, Lotfi; Delrio, Paolo; Arndt, Hartmann; Asslaber, Martin; Maio, Michele; Masucci, Giuseppe V; Mihm, Martin; Vidal-Vanaclocha, Fernando; Allison, James P; Gnjatic, Sacha; Hakansson, Leif; Huber, Christoph; Singh-Jasuja, Harpreet; Ottensmeier, Christian; Zwierzina, Heinz; Laghi, Luigi; Grizzi, Fabio; Ohashi, Pamela S; Shaw, Patricia A; Clarke, Blaise A; Wouters, Bradly G; Kawakami, Yutaka; Hazama, Shoichi; Okuno, Kiyotaka; Wang, Ena; O'Donnell-Tormey, Jill; Lagorce, Christine; Pawelec, Graham; Nishimura, Michael I; Hawkins, Robert; Lapointe, Réjean; Lundqvist, Andreas; Khleif, Samir N; Ogino, Shuji; Gibbs, Peter; Waring, Paul; Sato, Noriyuki; Torigoe, Toshihiko; Itoh, Kyogo; Patel, Prabhu S; Shukla, Shilin N; Palmqvist, Richard; Nagtegaal, Iris D; Wang, Yili; D'Arrigo, Corrado; Kopetz, Scott; Sinicrope, Frank A; Trinchieri, Giorgio; Gajewski, Thomas F; Ascierto, Paolo A; Fox, Bernard A

    2012-10-03

    Prediction of clinical outcome in cancer is usually achieved by histopathological evaluation of tissue samples obtained during surgical resection of the primary tumor. Traditional tumor staging (AJCC/UICC-TNM classification) summarizes data on tumor burden (T), presence of cancer cells in draining and regional lymph nodes (N) and evidence for metastases (M). However, it is now recognized that clinical outcome can significantly vary among patients within the same stage. The current classification provides limited prognostic information, and does not predict response to therapy. Recent literature has alluded to the importance of the host immune system in controlling tumor progression. Thus, evidence supports the notion to include immunological biomarkers, implemented as a tool for the prediction of prognosis and response to therapy. Accumulating data, collected from large cohorts of human cancers, has demonstrated the impact of immune-classification, which has a prognostic value that may add to the significance of the AJCC/UICC TNM-classification. It is therefore imperative to begin to incorporate the 'Immunoscore' into traditional classification, thus providing an essential prognostic and potentially predictive tool. Introduction of this parameter as a biomarker to classify cancers, as part of routine diagnostic and prognostic assessment of tumors, will facilitate clinical decision-making including rational stratification of patient treatment. Equally, the inherent complexity of quantitative immunohistochemistry, in conjunction with protocol variation across laboratories, analysis of different immune cell types, inconsistent region selection criteria, and variable ways to quantify immune infiltration, all underline the urgent requirement to reach assay harmonization. In an effort to promote the Immunoscore in routine clinical settings, an international task force was initiated. 
    This review represents a follow-up of the announcement of this initiative, and of the J

  17. Weakly supervised histopathology cancer image segmentation and classification.

    Science.gov (United States)

    Xu, Yan; Zhu, Jun-Yan; Chang, Eric I-Chao; Lai, Maode; Tu, Zhuowen

    2014-04-01

    Labeling a histopathology image as having cancerous regions or not is a critical task in cancer diagnosis; it is also clinically important to segment the cancer tissues and cluster them into various classes. Existing supervised approaches for image classification and segmentation require detailed manual annotations for the cancer pixels, which are time-consuming to obtain. In this paper, we propose a new learning method, multiple clustered instance learning (MCIL) (along the line of weakly supervised learning) for histopathology image segmentation. The proposed MCIL method simultaneously performs image-level classification (cancer vs. non-cancer image), medical image segmentation (cancer vs. non-cancer tissue), and patch-level clustering (different classes). We embed the clustering concept into the multiple instance learning (MIL) setting and derive a principled solution to performing the above three tasks in an integrated framework. In addition, we introduce contextual constraints as a prior for MCIL, which further reduces the ambiguity in MIL. Experimental results on histopathology colon cancer images and cytology images demonstrate the great advantage of MCIL over the competing methods.

  18. Wavelet-based multiscale analysis of bioimpedance data measured by electric cell-substrate impedance sensing for classification of cancerous and normal cells

    Science.gov (United States)

    Das, Debanjan; Shiladitya, Kumar; Biswas, Karabi; Dutta, Pranab Kumar; Parekh, Aditya; Mandal, Mahitosh; Das, Soumen

    2015-12-01

    The paper presents a study to differentiate normal and cancerous cells using label-free bioimpedance signal measured by electric cell-substrate impedance sensing. The real-time-measured bioimpedance data of human breast cancer cells and human epithelial normal cells employs fluctuations of impedance value due to cellular micromotions resulting from dynamic structural rearrangement of membrane protrusions under nonagitated condition. Here, a wavelet-based multiscale quantitative analysis technique has been applied to analyze the fluctuations in bioimpedance. The study demonstrates a method to classify cancerous and normal cells from the signature of their impedance fluctuations. The fluctuations associated with cellular micromotion are quantified in terms of cellular energy, cellular power dissipation, and cellular moments. The cellular energy and power dissipation are found higher for cancerous cells associated with higher micromotions in cancer cells. The initial study suggests that proposed wavelet-based quantitative technique promises to be an effective method to analyze real-time bioimpedance signal for distinguishing cancer and normal cells.
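
    To illustrate what a wavelet-based multiscale analysis of a fluctuating signal involves, the sketch below decomposes a signal with the orthonormal Haar wavelet and reports the energy of the detail coefficients at each scale. The choice of the Haar wavelet, the energy definition and the sample signals are assumptions made for illustration. The abstract does not state which wavelet or which exact energy, power-dissipation and moment formulas the authors used.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def multiscale_energy(signal, levels):
    """Sum of squared detail coefficients at each scale, finest scale first.

    The signal length must be divisible by 2 ** levels.
    """
    energies = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_step(current)
        energies.append(sum(d * d for d in detail))
    return energies

# Hypothetical impedance traces: a flat one and a rapidly fluctuating one.
quiet = multiscale_energy([1.0] * 8, levels=2)
active = multiscale_energy([1.0, 3.0] * 4, levels=2)
```

    In this toy case all of the fluctuation energy of the second trace sits at the finest scale, which is the kind of scale-by-scale signature a multiscale analysis is meant to expose.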

  19. Cuckoo search optimisation for feature selection in cancer classification: a new approach.

    Science.gov (United States)

    Gunavathi, C; Premalatha, K

    2015-01-01

    The Cuckoo Search (CS) optimisation algorithm is used for feature selection in cancer classification using microarray gene expression data. Since gene expression data have thousands of genes and a small number of samples, feature selection methods can be used to select informative genes and so improve the classification accuracy. Initially, the genes are ranked based on T-statistics, Signal-to-Noise Ratio (SNR) and F-statistics values. The CS is used to find the informative genes from the top-m ranked genes. The classification accuracy of the k-Nearest Neighbour (kNN) technique is used as the fitness function for CS. The proposed method is tested and analysed on ten different cancer gene expression datasets. The results show that the CS gives 100% average accuracy for the DLBCL Harvard, Lung Michigan, Ovarian Cancer, AML-ALL and Lung Harvard2 datasets and that it outperforms the existing techniques on the DLBCL outcome and prostate datasets. PMID:26547979
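
    The initial ranking step can be sketched with the signal-to-noise ratio. The form used here, SNR = (mean_a - mean_b) / (sd_a + sd_b), is the definition commonly used for two-class gene expression data and is assumed rather than taken from the abstract. The expression values are hypothetical, and the cuckoo search over the top-m genes is not shown.

```python
from statistics import mean, stdev

def snr_score(class_a, class_b):
    """Signal-to-noise ratio of one gene between two classes.

    Assumes at least two samples per class and a non-zero pooled spread.
    """
    return (mean(class_a) - mean(class_b)) / (stdev(class_a) + stdev(class_b))

def top_m_genes(expr_a, expr_b, m):
    """Indices of the m genes with the largest absolute SNR.

    expr_a[g] and expr_b[g] hold gene g's expression values in each class.
    """
    scores = [abs(snr_score(a, b)) for a, b in zip(expr_a, expr_b)]
    ranked = sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)
    return ranked[:m]

# Hypothetical expression values for three genes in two classes.
expr_a = [[1.0, 2.0, 3.0], [5.0, 6.0, 7.0], [2.0, 3.0, 4.0]]
expr_b = [[1.5, 2.0, 2.5], [1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]
selected = top_m_genes(expr_a, expr_b, m=2)
```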

  20. An Authentication Technique Based on Classification

    Institute of Scientific and Technical Information of China (English)

    李钢; 杨杰

    2004-01-01

    We present a novel watermarking approach based on classification for authentication, in which a watermark is embedded into the host image. When the marked image is modified, the extracted watermark also differs from the original watermark, and different kinds of modification lead to different extracted watermarks. In this paper, different kinds of modification are treated as classes, and a classification algorithm is used to recognize the modifications with high probability. Simulation results show that the proposed method is promising and effective.

  1. Texture Classification based on Gabor Wavelet

    Directory of Open Access Journals (Sweden)

    Amandeep Kaur

    2012-07-01

    Full Text Available This paper presents a comparison of texture classification algorithms based on Gabor wavelets. The focus of this paper is on the feature extraction scheme for texture classification. The texture features of an image can be classified using texture descriptors. In this paper we use the Homogeneous Texture Descriptor, which is based on Gabor wavelets. For texture classification, we use an online texture database, Brodatz’s database, and three well-known advanced classifiers: support vector machine, k-nearest neighbor and decision tree induction. The results show that classification using support vector machines gives better results than the other classifiers. It can accurately discriminate between testing image data and training data.

  2. A CAD System for Identification and Classification of Breast Cancer Tumors in DCE-MR Images Based on Hierarchical Convolutional Neural Networks

    Directory of Open Access Journals (Sweden)

    Reza Rastiboroujeni

    2015-06-01

    Full Text Available In this paper, we propose a computer aided diagnosis (CAD) system based on hierarchical convolutional neural networks (HCNNs) to discriminate between malignant and benign tumors in breast DCE-MRIs. A HCNN is a hierarchical neural network that operates on two-dimensional images. A HCNN integrates feature extraction and classification processes into one single and fully adaptive structure. It can extract two-dimensional key features automatically, and it is relatively tolerant to geometric and local distortions in input images. We evaluate CNN implementation learning and testing processes based on gradient descent (GD) and resilient back-propagation (RPROP) approaches. We show that the proposed HCNN with the RPROP learning approach provides an effective and robust neural structure for designing a CAD-based system for breast MRI, and has potential as a mechanism for the evaluation of different types of abnormalities in medical images.

  3. Zone-specific logistic regression models improve classification of prostate cancer on multi-parametric MRI

    Energy Technology Data Exchange (ETDEWEB)

    Dikaios, Nikolaos; Halligan, Steve; Taylor, Stuart; Atkinson, David; Punwani, Shonit [University College London, Centre for Medical Imaging, London (United Kingdom); University College London Hospital, Departments of Radiology, London (United Kingdom); Alkalbani, Jokha; Sidhu, Harbir Singh [University College London, Centre for Medical Imaging, London (United Kingdom); Abd-Alazeez, Mohamed; Ahmed, Hashim U.; Emberton, Mark [University College London, Research Department of Urology, Division of Surgery and Interventional Science, London (United Kingdom); Kirkham, Alex [University College London Hospital, Departments of Radiology, London (United Kingdom); Freeman, Alex [University College London Hospital, Department of Histopathology, London (United Kingdom)

    2015-09-15

    To assess the interchangeability of zone-specific (peripheral-zone (PZ) and transition-zone (TZ)) multiparametric-MRI (mp-MRI) logistic-regression (LR) models for classification of prostate cancer. Two hundred and thirty-one patients (70 TZ training-cohort; 76 PZ training-cohort; 85 TZ temporal validation-cohort) underwent mp-MRI and transperineal-template-prostate-mapping biopsy. PZ and TZ uni/multi-variate mp-MRI LR-models for classification of significant cancer (any cancer-core-length (CCL) with Gleason > 3 + 3 or any grade with CCL ≥ 4 mm) were derived from the respective cohorts and validated within the same zone by leave-one-out analysis. Inter-zonal performance was tested by applying TZ models to the PZ training-cohort and vice-versa. Classification performance of TZ models for TZ cancer was further assessed in the TZ validation-cohort. ROC area-under-curve (ROC-AUC) analysis was used to compare models. The univariate parameters with the best classification performance were the normalised T2 signal (T2nSI) within the TZ (ROC-AUC = 0.77) and normalized early contrast-enhanced T1 signal (DCE-nSI) within the PZ (ROC-AUC = 0.79). Performance was not significantly improved by bi-variate/tri-variate modelling. PZ models that contained DCE-nSI performed poorly in classification of TZ cancer. The TZ model based solely on maximum-enhancement poorly classified PZ cancer. LR-models dependent on DCE-MRI parameters alone are not interchangeable between prostatic zones; however, models based exclusively on T2 and/or ADC are more robust for inter-zonal application. (orig.)
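
    The ROC-AUC figures quoted in this abstract have a simple interpretation: the probability that a randomly chosen cancer case receives a higher model score than a randomly chosen non-cancer case. A minimal sketch of that pairwise computation follows. The scores are hypothetical model outputs, not data from the study.

```python
def roc_auc(scores_pos, scores_neg):
    """Area under the ROC curve from pairwise comparisons.

    Counts the fraction of (positive, negative) pairs in which the
    positive case scores higher; ties contribute one half.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical logistic-regression outputs for cancer and non-cancer cases.
auc = roc_auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2])
```

    A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a scale for reading values such as 0.77 and 0.79.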

  4. The classification and staging of cancerous growths of the anal canal

    International Nuclear Information System (INIS)

    In this chapter the authors give information about the frequency of cancerous growths of the anal canal, a general analysis of observations, the classification and staging of cancerous growths of the anal canal, the clinical-anatomy classification of cancerous growths of the anal canal, and the staging of cancerous growths of the anal canal.

  5. Texture Image Classification Based on Gabor Wavelet

    Institute of Scientific and Technical Information of China (English)

    DENG Wei-bing; LI Hai-fei; SHI Ya-li; YANG Xiao-hui

    2014-01-01

    For a texture image, by recognizing the class of every pixel of the image, the image can be partitioned into disjoint regions of uniform texture. This paper proposes a texture image classification algorithm based on the Gabor wavelet. In this algorithm, the characteristics of an image are obtained from every pixel and its neighborhood. The algorithm can also transfer information between neighborhoods of different sizes. Experiments on the standard Brodatz texture image dataset show that the proposed algorithm achieves good classification rates.

  6. Cancer classification using the Immunoscore: a worldwide task force

    Directory of Open Access Journals (Sweden)

    Galon Jérôme

    2012-10-01

    Full Text Available Abstract Prediction of clinical outcome in cancer is usually achieved by histopathological evaluation of tissue samples obtained during surgical resection of the primary tumor. Traditional tumor staging (AJCC/UICC-TNM classification) summarizes data on tumor burden (T), presence of cancer cells in draining and regional lymph nodes (N) and evidence for metastases (M). However, it is now recognized that clinical outcome can significantly vary among patients within the same stage. The current classification provides limited prognostic information, and does not predict response to therapy. Recent literature has alluded to the importance of the host immune system in controlling tumor progression. Thus, evidence supports the notion to include immunological biomarkers, implemented as a tool for the prediction of prognosis and response to therapy. Accumulating data, collected from large cohorts of human cancers, has demonstrated the impact of immune-classification, which has a prognostic value that may add to the significance of the AJCC/UICC TNM-classification. It is therefore imperative to begin to incorporate the ‘Immunoscore’ into traditional classification, thus providing an essential prognostic and potentially predictive tool. Introduction of this parameter as a biomarker to classify cancers, as part of routine diagnostic and prognostic assessment of tumors, will facilitate clinical decision-making including rational stratification of patient treatment. Equally, the inherent complexity of quantitative immunohistochemistry, in conjunction with protocol variation across laboratories, analysis of different immune cell types, inconsistent region selection criteria, and variable ways to quantify immune infiltration, all underline the urgent requirement to reach assay harmonization. In an effort to promote the Immunoscore in routine clinical settings, an international task force was initiated. This review represents a follow-up of the announcement of

  7. Classification for breast cancer diagnosis with Raman spectroscopy

    Science.gov (United States)

    Li, Qingbo; Gao, Qishuo; Zhang, Guangjun

    2014-01-01

    In order to promote the development of a portable, low-cost, in vivo cancer diagnosis instrument, a miniature laser Raman spectrometer was employed in this paper to acquire conventional Raman spectra for breast cancer detection. However, it is difficult to achieve high discrimination accuracy with such spectra. A novel method, adaptive weight k-local hyperplane (AWKH), is therefore proposed to increase the classification accuracy. AWKH is an extension and improvement of k-local hyperplane distance nearest-neighbor (HKNN). It takes the feature weights of the training data into account in the nearest-neighbor selection and local hyperplane construction stages, which resolves the basic shortcoming of HKNN, namely that it works well only for small numbers of nearest neighbors. Experimental results on Raman spectra of breast tissues in vitro show that the proposed method can achieve high classification accuracy. PMID:25071976

  8. Classification of Base Sequences BS(n+1, n)

    Directory of Open Access Journals (Sweden)

    Dragomir Ž. Ðoković

    2010-01-01

    Full Text Available Base sequences BS(n+1, n) are quadruples of {±1}-sequences (A; B; C; D), with A and B of length n+1 and C and D of length n, such that the sum of their nonperiodic autocorrelation functions is a δ-function. The base sequence conjecture, asserting that BS(n+1, n) exist for all n, is stronger than the famous Hadamard matrix conjecture. We introduce a new definition of equivalence for base sequences BS(n+1, n) and construct a canonical form. By using this canonical form, we have enumerated the equivalence classes of BS(n+1, n) for n ≤ 30. As the number of equivalence classes grows rapidly (but not monotonically) with n, the tables in the paper cover only the cases n ≤ 13.
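
    The defining condition for base sequences, that the nonperiodic autocorrelations of the four sequences cancel at every nonzero shift, can be checked directly. The sketch below is a plain verification routine, not the paper's canonical-form or enumeration method. The two small examples were constructed by hand for illustration.

```python
def npaf(seq, shift):
    """Nonperiodic autocorrelation of a +/-1 sequence at the given shift.

    Returns 0 when the shift is at least the sequence length.
    """
    return sum(seq[i] * seq[i + shift] for i in range(len(seq) - shift))

def is_base_sequence(a, b, c, d):
    """True if (a; b; c; d) has lengths n+1, n+1, n, n and the four
    autocorrelations sum to zero at every shift 1..n."""
    n = len(c)
    if len(a) != n + 1 or len(b) != n + 1 or len(d) != n:
        return False
    return all(
        npaf(a, k) + npaf(b, k) + npaf(c, k) + npaf(d, k) == 0
        for k in range(1, n + 1)
    )

# Hand-built examples with n = 1 and n = 2.
ok_small = is_base_sequence([1, 1], [1, -1], [1], [1])
ok_next = is_base_sequence([1, 1, -1], [1, 1, 1], [1, -1], [1, -1])
bad = is_base_sequence([1, 1], [1, 1], [1], [1])
```

    At shift 0 the sum is simply the total number of entries, 2(2n+1), so only the nonzero shifts need to be tested.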

  9. Vessel-guided airway segmentation based on voxel classification

    DEFF Research Database (Denmark)

    Lo, Pechin Chien Pau; Sporring, Jon; Ashraf, Haseem;

    2008-01-01

    This paper presents a method for improving airway tree segmentation using vessel orientation information. We use the fact that an airway branch is always accompanied by an artery, with both structures having similar orientations. This work is based on a  voxel classification airway segmentation...... method proposed previously. The probability of a voxel belonging to the airway, from the voxel classification method, is augmented with an orientation similarity measure as a criterion for region growing. The orientation similarity measure of a voxel indicates how similar is the orientation...... of the surroundings of a voxel, estimated based on a tube model, is to that of a neighboring vessel. The proposed method is tested on 20 CT images from different subjects selected randomly from a lung cancer screening study. Length of the airway branches from the results of the proposed method are significantly...

  10. A novel subtype classification and risk of breast cancer by histone modification profiling.

    Science.gov (United States)

    Chen, Xiaohua; Hu, Hanyang; He, Lin; Yu, Xueyuan; Liu, Xiangyu; Zhong, Rong; Shu, Maoguo

    2016-06-01

    Breast cancer has been classified into several intrinsic molecular subtypes on the basis of genetic and epigenetic factors. However, knowledge about histone modifications that contribute to the classification and development of biologically distinct breast cancer subtypes remains limited. Here we compared the genome-wide binding patterns of H3K4me3 and H3K27me3 between human mammary epithelial cells and three breast cancer cell lines representing the luminal, HER2, and basal subtypes. We characterized thousands of unique binding events as well as bivalent chromatin signatures unique to each cancer subtype, which were involved in different epigenetic regulation programs and signaling pathways in breast cancer progression. Genes linked to the unique histone mark features exhibited subtype-specific expression patterns, both in cancer cell lines and primary tumors, some of which were confirmed by qPCR in our primary cancer samples. Finally, histone mark-based gene classifiers were significantly correlated with relapse-free survival outcomes in patients. In summary, we have provided a valuable resource for the identification of novel biomarkers of subtype classification and clinical prognosis evaluation in breast cancers. PMID:27178334

  11. An Agent Based Classification Model

    CERN Document Server

    Gu, Feng; Greensmith, Julie

    2009-01-01

    The major function of this model is to access the UCI Wisconsin Breast Cancer data-set[1] and classify the data items into two categories, which are normal and anomalous. This kind of classification can be referred to as anomaly detection, which discriminates anomalous behaviour from normal behaviour in computer systems. One popular solution for anomaly detection is Artificial Immune Systems (AIS). AIS are adaptive systems inspired by theoretical immunology and observed immune functions, principles and models which are applied to problem solving. The Dendritic Cell Algorithm (DCA)[2] is an AIS algorithm that is developed specifically for anomaly detection. It has been successfully applied to intrusion detection in computer security. It is believed that agent-based modelling is an ideal approach for implementing AIS, as intelligent agents could be the perfect representations of immune entities in AIS. This model evaluates the feasibility of re-implementing the DCA in an agent-based simulation environ- ...

  12. Image-based Vehicle Classification System

    CERN Document Server

    Ng, Jun Yee

    2012-01-01

    Electronic toll collection (ETC) systems have become a common means of toll collection on toll roads. Electronic toll collection allows vehicles to travel at low or full speed during toll payment, which helps to avoid traffic delays on toll roads. One of the major components of an electronic toll collection system is the automatic vehicle detection and classification (AVDC) system, which classifies each vehicle so that the toll is charged according to vehicle class. A vision-based vehicle classification system is one type of vehicle classification system, which adopts a camera as its input sensing device. This type of system has an advantage over the others in that it is cost-efficient, since a low-cost camera is used. Implementing a vision-based vehicle classification system requires a lower initial investment and is well suited to the migration of toll collection in Malaysia from single ETC systems to full-scale multi-lane free flow (MLFF). This project ...

  13. A Hybrid Reduction Approach for Enhancing Cancer Classification of Microarray Data

    Directory of Open Access Journals (Sweden)

    Abeer M. Mahmoud

    2014-10-01

    Full Text Available This paper presents a novel hybrid machine learning (ML) reduction approach to enhance the cancer classification accuracy of microarray data, based on two ML gene ranking techniques: T-test and Class Separability (CS). The proposed approach is integrated with two ML classifiers, K-nearest neighbor (KNN) and support vector machine (SVM), for mining microarray gene expression profiles. Four public cancer microarray databases are used to evaluate the proposed approach and successfully accomplish the mining process: Lymphoma, Leukemia, SRBCT, and Lung Cancer. Genes are selected only from the training samples, and the testing samples are excluded entirely from the classifier building process, to obtain more accurate and better-validated results. The computational experiments are illustrated in detail and presented comprehensively alongside related results from the literature. The results show that the proposed reduction approach achieves promising results in terms of both the number of genes supplied to the classifiers and the classification accuracy.

  14. Mechanism-based drug exposure classification in pharmacoepidemiological studies

    NARCIS (Netherlands)

    Verdel, B.M.

    2010-01-01

    Mechanism-based classification of drug exposure in pharmacoepidemiological studies In pharmacoepidemiology and pharmacovigilance, the relation between drug exposure and clinical outcomes is crucial. Exposure classification in pharmacoepidemiological studies is traditionally based on pharmacotherapeu

  15. Review on Feature Selection Techniques and the Impact of SVM for Cancer Classification using Gene Expression Profile

    CERN Document Server

    George, G Victo Sudha; 10.5121/ijcses.2011.2302

    2011-01-01

    The DNA microarray technology has modernized the approach of biology research in such a way that scientists can now measure the expression levels of thousands of genes simultaneously in a single experiment. Gene expression profiles, which represent the state of a cell at a molecular level, have great potential as a medical diagnosis tool. But compared to the number of genes involved, available training data sets generally have a fairly small sample size for classification. These training data limitations constitute a challenge to certain classification methodologies. Feature selection techniques can be used to extract the marker genes which influence the classification accuracy effectively by eliminating the unwanted noisy and redundant genes. This paper presents a review of feature selection techniques that have been employed in microarray data based cancer classification and also the predominant role of SVM for cancer classification.

  16. REVIEW ON FEATURE SELECTION TECHNIQUES AND THE IMPACT OF SVM FOR CANCER CLASSIFICATION USING GENE EXPRESSION PROFILE

    Directory of Open Access Journals (Sweden)

    G.Victo Sudha George

    2011-09-01

    Full Text Available The DNA microarray technology has modernized the approach of biology research in such a way that scientists can now measure the expression levels of thousands of genes simultaneously in a single experiment. Gene expression profiles, which represent the state of a cell at a molecular level, have great potential as a medical diagnosis tool. But compared to the number of genes involved, available training data sets generally have a fairly small sample size for classification. These training data limitations constitute a challenge to certain classification methodologies. Feature selection techniques can be used to extract the marker genes which influence the classification accuracy effectively by eliminating the unwanted noisy and redundant genes. This paper presents a review of feature selection techniques that have been employed in microarray data based cancer classification and also the predominant role of SVM for cancer classification.

  17. Research on the Gastric Cancer Clinical Medical Data Mining Research Based on SPRINT Classification Algorithm

    Institute of Scientific and Technical Information of China (English)

    郑丹青

    2012-01-01

    To meet the demand for data mining, a decision-tree-based model is proposed for the analysis and application of gastric cancer clinical medical information. The model extracts data on factors related to postoperative gastric cancer recurrence from an existing operational database or data warehouse to form a decision tree training data set. Using the SPRINT classification algorithm, a risk factor analysis model for postoperative gastric cancer recurrence is constructed. By analyzing the model for relationships among clinical diagnosis, treatment, and prognosis, the study confirmed that the primary risk factor for postoperative gastric cancer recurrence is family heredity.

  18. Registration and classification of adolescent and young adult cancer cases.

    Science.gov (United States)

    Pollock, Brad H; Birch, Jillian M

    2008-05-01

    Cancer registries are an important research resource that facilitates the study of etiology, tumor biology, patterns of delayed diagnosis and health planning needs. When outcome data are included, registries can track secular changes in survival related to improvements in early detection or treatment. The Surveillance, Epidemiology, and End Results (SEER) registry has been used to identify major gaps in survival for older adolescent and young adult (AYA) patients compared with younger children and older adults. In order to determine the reasons for this gap, the complete registration and accurate classification of AYA malignancies are necessary. There are inconsistencies in defining the age limits for AYAs, although the Adolescent and Young Adult Oncology Progress Review Group proposed a definition of ages 15 through 39 years. The central registration and classification issues for AYAs are case-finding, defining common data elements (CDE) collected across different registries and the diagnostic classification of these malignancies. Goals to achieve by 2010 include extending and validating current diagnostic classification schemes and expanding the CDE to support AYA oncology research, including the collection of tracking information to assess long-term outcomes. These efforts will advance preventive, etiologic, therapeutic, and health services-related research for this understudied age group.

  19. Call for a Computer-Aided Cancer Detection and Classification Research Initiative in Oman.

    Science.gov (United States)

    Mirzal, Andri; Chaudhry, Shafique Ahmad

    2016-01-01

    Cancer is a major health problem in Oman. Cancer incidence in Oman is reported to be the second highest among Gulf Cooperation Council countries, after Saudi Arabia. Based on GLOBOCAN estimates, Oman is predicted to face an almost two-fold increase in cancer incidence in the period 2008-2020. However, cancer research in Oman is still in its infancy, because the medical institutions and infrastructure that play central roles in data collection and analysis are relatively new developments in the country. We believe the country requires an organized plan and efforts to promote local cancer research. In this paper, we discuss current research progress in cancer diagnosis using machine learning techniques to optimize computer-aided cancer detection and classification (CAD). We specifically discuss CAD using two major types of medical data, i.e., medical imaging and microarray gene expression profiling, because medical imaging modalities such as mammography, MRI, and PET have been widely used in Oman to assist radiologists in early cancer diagnosis, and microarray data have been proven to be a reliable source for differential diagnosis. We also discuss future cancer research directions and the benefits to the Omani economy of entering the cancer research and treatment business, a multi-billion dollar industry worldwide. PMID:27268600

  20. Side effects of cancer therapies. International classification and documentation systems

    International Nuclear Information System (INIS)

    The publication presents and explains verified, international classification and documentation systems for side effects induced by cancer treatments, applicable in general and clinical practice and clinical research, and covers in a clearly arranged manner the whole range of treatments, including acute and chronic side effects of chemotherapy and radiotherapy, surgery, or combined therapies. The book fills a long-felt need in tumor documentation and is a major contribution to quality assurance in clinical oncology in German-speaking countries. As most parts of the book are bilingual, presenting German and English texts and terminology, it satisfies the principles of interdisciplinarity and internationality. The tabulated form chosen for presentation of classification systems and criteria facilitates the user's approach as well as application in daily work. (orig./CB)

  1. Collaborative Representation based Classification for Face Recognition

    CERN Document Server

    Zhang, Lei; Feng, Xiangchu; Ma, Yi; Zhang, David

    2012-01-01

    By coding a query sample as a sparse linear combination of all training samples and then classifying it by evaluating which class leads to the minimal coding residual, sparse representation based classification (SRC) leads to interesting results for robust face recognition. It is widely believed that the l1-norm sparsity constraint on coding coefficients plays a key role in the success of SRC, while its use of all training samples to collaboratively represent the query sample is largely ignored. In this paper we discuss how SRC works, and show that the collaborative representation mechanism used in SRC is much more crucial to its success in face classification. SRC is a special case of collaborative representation based classification (CRC), which has various instantiations obtained by applying different norms to the coding residual and coding coefficient. More specifically, the l1 or l2 norm characterization of coding residual is related to the robustness of CRC to outlier facial pixels, while the l1 or l2 norm c...

  2. Feature-Based Classification of Networks

    CERN Document Server

    Barnett, Ian; Kuijjer, Marieke L; Mucha, Peter J; Onnela, Jukka-Pekka

    2016-01-01

    Network representations of systems from various scientific and societal domains are neither completely random nor fully regular, but instead appear to contain recurring structural building blocks. These features tend to be shared by networks belonging to the same broad class, such as the class of social networks or the class of biological networks. At a finer scale of classification within each such class, networks describing more similar systems tend to have more similar features. This occurs presumably because networks representing similar purposes or constructions would be expected to be generated by a shared set of domain specific mechanisms, and it should therefore be possible to classify these networks into categories based on their features at various structural levels. Here we describe and demonstrate a new, hybrid approach that combines manual selection of features of potential interest with existing automated classification methods. In particular, selecting well-known and well-studied features that ...

  3. Texture classification based on EMD and FFT

    Institute of Scientific and Technical Information of China (English)

    XIONG Chang-zhen; XU Jun-yi; ZOU Jian-cheng; QI Dong-xu

    2006-01-01

    Empirical mode decomposition (EMD) is an adaptive and approximately orthogonal filtering process that reflects the human visual mechanism of differentiating textures. In this paper, we present a modified 2D EMD algorithm using the FastRBF and an appropriate number of iterations in the sifting process (SP), then apply it to texture classification. Rotation-invariant texture feature vectors are extracted using auto-registration and circular regions of magnitude spectra of the 2D fast Fourier transform (FFT). In the experiments, we employ a Bayesian classifier to classify a set of 15 distinct natural textures selected from the Brodatz album. The experimental results, based on different testing datasets for images with different orientations, show the effectiveness of the proposed classification scheme.

  4. An Approach for Leukemia Classification Based on Cooperative Game Theory

    Directory of Open Access Journals (Sweden)

    Atefeh Torkaman

    2011-01-01

    Full Text Available Hematological malignancies are the types of cancer that affect blood, bone marrow and lymph nodes. As these tissues are naturally connected through the immune system, a disease affecting one of them will often affect the others as well. The hematological malignancies include leukemia, lymphoma, and multiple myeloma. Among them, leukemia is a serious malignancy that starts in blood tissues, especially the bone marrow, where blood is made. Research shows that leukemia is one of the common cancers in the world, so an emphasis on diagnostic techniques and the best treatments could provide better prognosis and survival for patients. In this paper, an automatic diagnosis recommender system for classifying leukemia based on cooperative game theory is presented. Throughout this research, we analyze flow cytometry data toward the classification of leukemia into eight classes. We work on a real data set from different types of leukemia collected at the Iran Blood Transfusion Organization (IBTO). The data set contains 400 samples taken from human leukemic bone marrow. This study deals with a cooperative game used for classification according to different weights assigned to the markers. The proposed method is versatile, as there are no constraints on what the input or output represents. This means that it can be used to classify a population according to their contributions; in other words, it applies equally to other groups of data. The experimental results show an accuracy rate of 93.12% for classification, compared with 90.16% for a decision tree (C4.5). The results demonstrate that the cooperative game is very promising for direct use in the classification of leukemia as part of an active medical decision support system for the interpretation of flow cytometry readouts.
This system could assist clinical hematologists to properly recognize different kinds of leukemia by preparing suggestions and this could improve the treatment
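
    The abstract does not spell out how weights are assigned to the markers. As one possible reading, the sketch below computes Shapley values, a standard cooperative-game measure of each player's contribution, for a small weighted voting game. The marker names, weights and quota are hypothetical, and this is not claimed to be the authors' procedure.

```python
from itertools import permutations

def shapley_values(players, value):
    """Exact Shapley values: average each player's marginal contribution
    over all orderings (only practical for a handful of players)."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: totals[p] / len(orderings) for p in players}

# Hypothetical weighted voting game over three markers.
weights = {"m1": 3, "m2": 2, "m3": 1}
quota = 4

def value(coalition):
    """A coalition of markers 'wins' (value 1) when its total weight reaches the quota."""
    return 1.0 if sum(weights[m] for m in coalition) >= quota else 0.0

phi = shapley_values(list(weights), value)
```

    In this toy game m1 is pivotal in four of the six orderings, so it receives two thirds of the total contribution, while m2 and m3 receive one sixth each.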

  5. Cancers of the oral cavity and oropharynx worldwide: international incidence and TNM classification in cancer registries

    OpenAIRE

    de Camargo Cancela, Marianna

    2010-01-01

    Oral cavity and oropharynx cancers: international incidence and TNM classification in population-based cancer registries. The aim of this work was to describe and evaluate the epidemiological patterns of oral cavity and oropharynx cancers. These topographies share some common risk factors and they are often grouped in epidemiological studies. However, the implication of the human papilloma virus in oropharyngeal tumors led us to provide incidence rates according to the anatomical classificat...

  6. Digital image-based classification of biodiesel.

    Science.gov (United States)

    Costa, Gean Bezerra; Fernandes, David Douglas Sousa; Almeida, Valber Elias; Araújo, Thomas Souto Policarpo; Melo, Jessica Priscila; Diniz, Paulo Henrique Gonçalves Dias; Véras, Germano

    2015-07-01

    This work proposes a simple, rapid, inexpensive, and non-destructive methodology based on digital images and pattern recognition techniques for classification of biodiesel according to oil type (cottonseed, sunflower, corn, or soybean). For this, differing color histograms in RGB (extracted from digital images), HSI, Grayscale channels, and their combinations were used as analytical information, which was then statistically evaluated using Soft Independent Modeling by Class Analogy (SIMCA), Partial Least Squares Discriminant Analysis (PLS-DA), and variable selection using the Successive Projections Algorithm associated with Linear Discriminant Analysis (SPA-LDA). Despite good performances by the SIMCA and PLS-DA classification models, SPA-LDA provided better results (up to 95% for all approaches) in terms of accuracy, sensitivity, and specificity for both the training and test sets. The variables selected by the Successive Projections Algorithm clearly contained the information necessary for biodiesel type classification. This is important since a product may exhibit different properties, depending on the feedstock used. Such variations directly influence the quality, and consequently the price. Moreover, intrinsic advantages such as quick analysis, requiring no reagents, and a noteworthy reduction (the avoidance of chemical characterization) of waste generation, all contribute towards the primary objective of green chemistry.

  7. BROAD PHONEME CLASSIFICATION USING SIGNAL BASED FEATURES

    Directory of Open Access Journals (Sweden)

    Deekshitha G

    2014-12-01

    Full Text Available Speech is the most efficient and popular means of human communication. Speech is produced as a sequence of phonemes. Phoneme recognition is the first step performed by an automatic speech recognition system. State-of-the-art recognizers use mel-frequency cepstral coefficient (MFCC) features derived through short-time analysis, for which the recognition accuracy is limited. Instead, here broad phoneme classification is achieved using features derived directly from the speech at the signal level itself. Broad phoneme classes include vowels, nasals, fricatives, stops, approximants and silence. The features identified as useful for broad phoneme classification are the voiced/unvoiced decision, zero crossing rate (ZCR), short-time energy, most dominant frequency, energy in the most dominant frequency, spectral flatness measure and first three formants. Features derived from short-time frames of training speech are used to train a multilayer feedforward neural network based classifier with manually marked class labels as output, and classification accuracy is then tested. Later, this broad phoneme classifier is used for broad syllable structure prediction, which is useful for applications such as automatic speech recognition and automatic language identification.
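
    Two of the signal-level features named above, zero crossing rate and short-time energy, can be sketched as follows. The frame length, hop size, sample rate and the two synthetic test signals are assumptions chosen for the example, not values from the paper.

```python
import math

def frames(signal, frame_len, hop):
    """Split a signal into overlapping frames of frame_len samples, hop samples apart."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)

sample_rate = 8000  # Hz, assumed
# A 200 Hz tone stands in for a voiced sound: high energy, few zero crossings.
tone = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(400)]
# A weak alternating signal stands in for a fricative: low energy, many crossings.
hiss = [0.1 if n % 2 == 0 else -0.1 for n in range(400)]

tone_frames = frames(tone, frame_len=160, hop=80)  # 20 ms frames, 10 ms hop
hiss_frames = frames(hiss, frame_len=160, hop=80)
```

    Per-frame values like these would then be stacked with the remaining features to form the input vector of the classifier.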

  8. Changing Histopathological Diagnostics by Genome-Based Tumor Classification

    Directory of Open Access Journals (Sweden)

    Michael Kloth

    2014-05-01

    Full Text Available Traditionally, tumors are classified by histopathological criteria, i.e., based on their specific morphological appearances. Consequently, current therapeutic decisions in oncology are strongly influenced by histology rather than underlying molecular or genomic aberrations. However, the increase of information on molecular changes, enabled by the Human Genome Project and the International Cancer Genome Consortium as well as the manifold advances in molecular biology and high-throughput sequencing techniques, inaugurated the integration of genomic information into disease classification. Furthermore, in some cases it became evident that former classifications needed major revision and adaptation. Such adaptations are often required by understanding the pathogenesis of a disease from a specific molecular alteration, using this molecular driver for targeted and highly effective therapies. Altogether, reclassifications should lead to higher information content of the underlying diagnoses, reflecting their molecular pathogenesis and resulting in optimized and individual therapeutic decisions. The objective of this article is to summarize some particularly important examples of genome-based classification approaches and associated therapeutic concepts. In addition to reviewing disease-specific markers, we focus on potentially therapeutic or predictive markers and the relevance of molecular diagnostics in disease monitoring.

  9. Classification of Cancer-related Death Certificates using Machine Learning

    Directory of Open Access Journals (Sweden)

    Luke Butt

    2013-05-01

    Full Text Available Background Cancer monitoring and prevention rely on the timely notification of cancer cases. However, abstracting and classifying cancer from the free text of pathology reports and other relevant documents, such as death certificates, are complex and time-consuming activities. Aims In this paper, approaches for the automatic detection of notifiable cancer cases as the cause of death from free-text death certificates supplied to Cancer Registries are investigated. Method A number of machine learning classifiers were studied. Features were extracted using natural language techniques and the Medtex toolkit. The features encompassed stemmed words, bi-grams, and concepts from the SNOMED CT medical terminology. The baseline consisted of a keyword spotter using keywords extracted from the long description of ICD-10 cancer-related codes. Results Death certificates with notifiable cancer listed as the cause of death can be effectively identified with the methods studied in this paper. A Support Vector Machine (SVM) classifier achieved the best performance, with an overall F-measure of 0.9866 when evaluated on a set of 5,000 free-text death certificates using the token stem feature set. The SNOMED CT concept plus token stem feature set reached the lowest variance (0.0032) and false negative rate (0.0297) while achieving an F-measure of 0.9864. The SVM classifier accounts for the first 18 of the top 40 evaluated runs and is the most robust classifier, with a variance of 0.001141, half the variance of the other classifiers. Conclusion The selection of features had the greatest influence on the performance of the classifiers, although the type of classifier employed also affects performance. In contrast, the feature weighting schema had a negligible effect on performance.
Specifically, it is found that stemmed tokens with or without SNOMED CT concepts create the most effective feature when combined with
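
    The keyword-spotter baseline and the F-measure used to score it can be illustrated with a minimal sketch. The keyword list and the six free-text entries below are invented for the example; the study derived its keywords from ICD-10 long descriptions and evaluated on real certificates.

```python
import re

def keyword_spotter(text, keywords):
    """Flag a certificate as cancer-related if any keyword appears as a whole token."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return any(k in tokens for k in keywords)

def f_measure(predicted, actual):
    """Return (precision, recall, F1) for two parallel lists of booleans."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)

# Hypothetical keyword list and toy certificates (text, cancer is cause of death).
keywords = {"carcinoma", "neoplasm", "melanoma", "lymphoma"}
certificates = [
    ("metastatic carcinoma of lung", True),
    ("acute myocardial infarction", False),
    ("malignant neoplasm of colon", True),
    ("pneumonia", False),
    ("leukaemia", True),                             # missed: term not in keyword list
    ("history of melanoma, cardiac arrest", False),  # false alarm: cancer is not the cause
]

predicted = [keyword_spotter(text, keywords) for text, _ in certificates]
actual = [label for _, label in certificates]
precision, recall, f1 = f_measure(predicted, actual)
```

    The two errors in the toy data show why a keyword baseline is limited: it misses terms outside its list and cannot tell a cause of death from a mention in the history.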

  10. A Dataset for Breast Cancer Histopathological Image Classification.

    Science.gov (United States)

    Spanhol, Fabio A; Oliveira, Luiz S; Petitjean, Caroline; Heutte, Laurent

    2016-07-01

    Today, medical image analysis papers require solid experiments to prove the usefulness of proposed methods. However, experiments are often performed on data selected by the researchers, which may come from different institutions, scanners, and populations. Different evaluation measures may be used, making it difficult to compare the methods. In this paper, we introduce a dataset of 7909 breast cancer histopathology images acquired from 82 patients, which is now publicly available from http://web.inf.ufpr.br/vri/breast-cancer-database. The dataset includes both benign and malignant images. The task associated with this dataset is the automated classification of these images into two classes, which would be a valuable computer-aided diagnosis tool for the clinician. In order to assess the difficulty of this task, we show some preliminary results obtained with state-of-the-art image classification systems. The accuracy ranges from 80% to 85%, showing that room for improvement remains. By providing this dataset and a standardized evaluation protocol to the scientific community, we hope to gather researchers in both the medical and the machine learning fields to advance toward this clinical application. PMID:26540668

  11. Cirrhosis Classification Based on Texture Classification of Random Features

    Directory of Open Access Journals (Sweden)

    Hui Liu

    2014-01-01

    Full Text Available Accurate staging of hepatic cirrhosis is important in investigating the cause and slowing down the effects of cirrhosis. Computer-aided diagnosis (CAD) can provide doctors with a second opinion and help them choose a specific treatment based on an accurate cirrhosis stage. MRI has many advantages, including high resolution for soft tissue, no radiation, and multiparameter imaging modalities. In this paper, therefore, multisequence MRIs, including T1-weighted, T2-weighted, arterial, portal venous, and equilibrium phase, are applied. However, CAD does not yet meet the clinical needs of cirrhosis, and few researchers are concerned with it at present. Cirrhosis is characterized by the presence of widespread fibrosis and regenerative nodules in the liver, leading to different texture patterns at different stages. Extracting texture features is therefore the primary task. Compared with typical gray level co-occurrence matrix (GLCM) features, texture classification from random features provides an effective alternative; we adopt it and propose CCTCRF for three-class classification (normal, early, and middle and advanced stage). CCTCRF needs no strong assumptions beyond the sparsity of the image, contains sufficient texture information, involves a concise and effective process, and makes case decisions with high accuracy. Experimental results illustrate its satisfactory performance and are compared with a typical NN with GLCM.

  12. Fuzzy Rule Base System for Software Classification

    Directory of Open Access Journals (Sweden)

    Adnan Shaout

    2013-07-01

    Full Text Available Given the central role that software development plays in the delivery and application of information technology, managers have been focusing on process improvement in the software development area. This improvement has increased the demand for software measures, or metrics, to manage the process. These metrics provide a quantitative basis for the development and validation of models during the software development process. In this paper a fuzzy rule-based system will be developed to classify Java applications using object-oriented (OO) metrics. The system will contain the following features: (1) an automated method to extract the OO metrics from the source code; (2) a default/base set of rules that can be easily configured via an XML file, so that companies, developers, team leaders, etc., can modify the set of rules according to their needs; (3) implementation of a framework so that new metrics, fuzzy sets and fuzzy rules can be added or removed depending on the needs of the end user; (4) general classification of the software application and fine-grained classification of the Java classes based on OO metrics; and (5) two interfaces for the system: GUI and command.

  13. Pathohistological classification systems in gastric cancer: Diagnostic relevance and prognostic value

    OpenAIRE

    Berlth, Felix; Bollschweiler, Elfriede; Drebber, Uta; Hoelscher, Arnulf H; Moenig, Stefan

    2014-01-01

    Several pathohistological classification systems exist for the diagnosis of gastric cancer. Many studies have investigated the correlation between the pathohistological characteristics in gastric cancer and patient characteristics, disease specific criteria and overall outcome. It is still controversial as to which classification system imparts the most reliable information, and therefore, the choice of system may vary in clinical routine. In addition to the most common classification systems...

  14. Automatic web services classification based on rough set theory

    Institute of Scientific and Technical Information of China (English)

    陈立; 张英; 宋自林; 苗壮

    2013-01-01

    With the development of web services technology, the number of services available on the internet is growing day by day. To achieve automatic and accurate service classification, which benefits service-related tasks, a method based on rough set theory was proposed. First, the service descriptions were preprocessed and represented as vectors. Inspired by discernibility-matrix-based attribute reduction in rough set theory, and taking into account the characteristics of the decision table for service classification, a method based on continuous discernibility matrices was proposed for dimensionality reduction. Finally, services were classified automatically. In the experiment, the proposed method achieved satisfactory classification results in all five test categories, showing that it is accurate and could be used in practical web services classification.

  15. High-Throughput Prostate Cancer Gland Detection, Segmentation, and Classification from Digitized Needle Core Biopsies

    Science.gov (United States)

    Xu, Jun; Sparks, Rachel; Janowczyk, Andrew; Tomaszewski, John E.; Feldman, Michael D.; Madabhushi, Anant

    We present a high-throughput computer-aided system for the segmentation and classification of glands in high resolution digitized images of needle core biopsy samples of the prostate. It will allow for rapid and accurate identification of suspicious regions on these samples. The system includes the following three modules: 1) a hierarchical frequency weighted mean shift normalized cut (HNCut) for initial detection of glands; 2) a geodesic active contour (GAC) model for gland segmentation; and 3) a diffeomorphic based similarity (DBS) feature extraction for classification of glands as benign or cancerous. HNCut is a minimally supervised color based detection scheme that combines the frequency weighted mean shift and normalized cuts algorithms to detect the lumen region of candidate glands. A GAC model, initialized using the results of HNCut, uses a color gradient based edge detection function for accurate gland segmentation. Lastly, DBS features are a set of morphometric features derived from the nonlinear dimensionality reduction of a dissimilarity metric between shape models. The system integrates these modules to enable the rapid detection, segmentation, and classification of glands on prostate biopsy images. Across 23 H&E-stained whole-slide prostate studies, 105 regions of interest (ROIs) were selected for the evaluation of segmentation and classification. The segmentation results were evaluated on 10 ROIs and compared to manual segmentation in terms of mean distance (2.6±0.2 pixels), overlap (62±0.07%), sensitivity (85±0.01%), specificity (94±0.003%) and positive predictive value (68±0.08%). Over 105 ROIs, the classification accuracy for automatically segmented glands was 82.5±9.10%, while the accuracy for manually segmented glands was 82.89±3.97%; no statistically significant differences were identified between the classification results.
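
    The pixel-wise measures used above to compare automatic and manual segmentations can be sketched as follows. The abstract does not define "overlap"; the sketch assumes intersection over union, and the two 4x4 masks are toy data.

```python
def segmentation_metrics(auto_mask, manual_mask):
    """Compare two binary masks (lists of rows of 0/1) pixel by pixel."""
    tp = fp = fn = tn = 0
    for row_a, row_m in zip(auto_mask, manual_mask):
        for a, m in zip(row_a, row_m):
            if a and m:
                tp += 1
            elif a:
                fp += 1
            elif m:
                fn += 1
            else:
                tn += 1

    def ratio(num, den):
        # Guard against empty denominators (e.g. an all-background mask).
        return num / den if den else 0.0

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "overlap": ratio(tp, tp + fp + fn),  # assumed: intersection over union
    }

# Toy masks: the manual outline is the reference, the automatic one is under test.
manual = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
auto = [
    [0, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
metrics = segmentation_metrics(auto, manual)
```

    Here five pixels agree, one is a false positive and one a false negative, giving a sensitivity and positive predictive value of 5/6, a specificity of 0.9 and an overlap of 5/7.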

  16. Molecular voting for glioma classification reflecting heterogeneity in the continuum of cancer progression.

    Science.gov (United States)

    Fuller, Gregory N; Mircean, Cristian; Tabus, Ioan; Taylor, Ellen; Sawaya, Raymond; Bruner, Janet M; Shmulevich, Ilya; Zhang, Wei

    2005-09-01

    Gliomas, the most common brain tumors, are generally categorized into two lineages (astrocytic and oligodendrocytic) and further classified as low-grade (astrocytoma and oligodendroglioma), mid-grade (anaplastic astrocytoma and anaplastic oligodendroglioma), and high-grade (glioblastoma multiforme) based on morphological features. A strict classification scheme has limitations because a specific glioma can be at any stage of the continuum of cancer progression and may contain mixed features. Thus, a more comprehensive classification based on molecular signatures may reflect the biological nature of specific tumors more accurately. In this study, we used microarray technology to profile the gene expression of 49 human brain tumors and applied the k-nearest neighbor algorithm for classification. We first trained the classification gene set with 19 of the most typical glioma cases and selected a set of genes that provide the lowest cross-validation classification error with k=5. We then applied this gene set to the 30 remaining cases, including several that do not belong to gliomas such as atypical meningioma. The results showed that not only does the algorithm correctly classify most of the gliomas, but the detailed voting results also provide more subtle information regarding the molecular similarities to neighboring classes. For atypical meningioma, the voting was equally split among the four classes, indicating a difficulty in placement of meningioma into the four classes of gliomas. Thus, the actual voting results, which are typically used only to decide the winning class label in k-nearest neighbor algorithms, provide a useful method for gaining deeper insight into the stage of a tumor in the continuum of cancer development.
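
    The idea of reading the full vote tally of a k-nearest neighbor classifier, not just the winning label, can be sketched as follows. The two-dimensional points and the class names "low" and "high" are toy stand-ins for gene expression profiles and glioma classes; only k=5 comes from the abstract.

```python
import math
from collections import Counter

def knn_votes(query, training, k=5):
    """Return the vote count per class among the k nearest training samples.

    training: list of (vector, label) pairs; distance is Euclidean.
    """
    nearest = sorted(training, key=lambda item: math.dist(query, item[0]))[:k]
    return Counter(label for _, label in nearest)

# Toy training set (hypothetical).
training = [
    ((0, 0), "low"), ((0, 1), "low"), ((1, 0), "low"),
    ((5, 5), "high"), ((5, 6), "high"), ((6, 5), "high"), ((3, 3), "high"),
]

votes = knn_votes((1, 1), training, k=5)
winner, winner_votes = votes.most_common(1)[0]
```

    The query is labeled "low", but the 3-to-2 split shows it also lies close to the "high" class, which is the kind of graded information the abstract describes.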

  17. Cancer Feature Selection and Classification Using a Binary Quantum-Behaved Particle Swarm Optimization and Support Vector Machine.

    Science.gov (United States)

    Xi, Maolong; Sun, Jun; Liu, Li; Fan, Fangyun; Wu, Xiaojun

    2016-01-01

    This paper focuses on the feature gene selection for cancer classification, which employs an optimization algorithm to select a subset of the genes. We propose a binary quantum-behaved particle swarm optimization (BQPSO) for cancer feature gene selection, coupling support vector machine (SVM) for cancer classification. First, the proposed BQPSO algorithm is described, which is a discretized version of original QPSO for binary 0-1 optimization problems. Then, we present the principle and procedure for cancer feature gene selection and cancer classification based on BQPSO and SVM with leave-one-out cross validation (LOOCV). Finally, the BQPSO coupling SVM (BQPSO/SVM), binary PSO coupling SVM (BPSO/SVM), and genetic algorithm coupling SVM (GA/SVM) are tested for feature gene selection and cancer classification on five microarray data sets, namely, Leukemia, Prostate, Colon, Lung, and Lymphoma. The experimental results show that BQPSO/SVM has significant advantages in accuracy, robustness, and the number of feature genes selected compared with the other two algorithms. PMID:27642363
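
    The fitness evaluation at the core of such a wrapper method can be sketched as scoring one binary gene mask by leave-one-out cross validation. To stay dependency-free, a nearest-centroid classifier stands in for the SVM, and the BQPSO search that proposes the masks is not shown; the function names and toy data are assumptions.

```python
import math

def nearest_centroid_predict(train_x, train_y, x):
    """Predict the label whose class centroid is closest to x (stand-in for an SVM)."""
    centroids = {}
    for label in set(train_y):
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return min(centroids, key=lambda c: math.dist(x, centroids[c]))

def loocv_accuracy(samples, labels, mask):
    """Leave-one-out accuracy using only the genes whose mask bit is 1."""
    selected = [i for i, bit in enumerate(mask) if bit]
    if not selected:
        return 0.0  # an empty gene subset cannot classify anything
    reduced = [[row[i] for i in selected] for row in samples]
    correct = 0
    for i in range(len(reduced)):
        train_x = reduced[:i] + reduced[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if nearest_centroid_predict(train_x, train_y, reduced[i]) == labels[i]:
            correct += 1
    return correct / len(reduced)

# Toy data (hypothetical): gene 0 separates the classes, gene 1 is noise.
samples = [[1.0, 7.0], [1.1, 2.0], [0.9, 9.0],
           [3.0, 8.0], [3.1, 1.0], [2.9, 6.0]]
labels = [0, 0, 0, 1, 1, 1]

fitness_informative = loocv_accuracy(samples, labels, [1, 0])
fitness_noise = loocv_accuracy(samples, labels, [0, 1])
```

    A swarm-based search would call a function like loocv_accuracy once per candidate mask, usually together with a penalty on the number of selected genes.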

  18. Cancer Feature Selection and Classification Using a Binary Quantum-Behaved Particle Swarm Optimization and Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Maolong Xi

    2016-01-01

    Full Text Available This paper focuses on the feature gene selection for cancer classification, which employs an optimization algorithm to select a subset of the genes. We propose a binary quantum-behaved particle swarm optimization (BQPSO) for cancer feature gene selection, coupling support vector machine (SVM) for cancer classification. First, the proposed BQPSO algorithm is described, which is a discretized version of original QPSO for binary 0-1 optimization problems. Then, we present the principle and procedure for cancer feature gene selection and cancer classification based on BQPSO and SVM with leave-one-out cross validation (LOOCV). Finally, the BQPSO coupling SVM (BQPSO/SVM), binary PSO coupling SVM (BPSO/SVM), and genetic algorithm coupling SVM (GA/SVM) are tested for feature gene selection and cancer classification on five microarray data sets, namely, Leukemia, Prostate, Colon, Lung, and Lymphoma. The experimental results show that BQPSO/SVM has significant advantages in accuracy, robustness, and the number of feature genes selected compared with the other two algorithms.

  19. Graph-based Methods for Orbit Classification

    Energy Technology Data Exchange (ETDEWEB)

    Bagherjeiran, A; Kamath, C

    2005-09-29

    An important step in the quest for low-cost fusion power is the ability to perform and analyze experiments in prototype fusion reactors. One of the tasks in the analysis of experimental data is the classification of orbits in Poincaré plots. These plots are generated by the particles in a fusion reactor as they move within the toroidal device. In this paper, we describe the use of graph-based methods to extract features from orbits. These features are then used to classify the orbits into several categories. Our results show that existing machine learning algorithms are successful in classifying orbits with few points, a situation which can arise in data from experiments.

  20. Sentiment classification technology based on Markov logic networks

    Science.gov (United States)

    He, Hui; Li, Zhigang; Yao, Chongchong; Zhang, Weizhe

    2016-07-01

    With diverse online media emerging, there is growing interest in the sentiment classification problem. At present, text sentiment classification mainly utilizes supervised machine learning methods, which exhibit a certain domain dependency. On the basis of Markov logic networks (MLNs), this study proposed a cross-domain multi-task text sentiment classification method rooted in transfer learning. Through many-to-one knowledge transfer, labeled text sentiment classification knowledge was successfully transferred into other domains, and the precision of the sentiment classification analysis in the text tendency domain was improved. The experimental results revealed the following: (1) The model based on an MLN demonstrated higher precision than the single individual learning plan model. (2) Multi-task transfer learning based on Markov logic networks could acquire more knowledge than self-domain learning. The cross-domain text sentiment classification model could significantly improve the precision and efficiency of text sentiment classification.

  1. Artificial neural networks as classification and diagnostic tools for lymph node-negative breast cancers

    Energy Technology Data Exchange (ETDEWEB)

    Eswari J, Satya; Chandrakar, Neha [National Institute of Technology Raipur, Raipur (India)]

    2016-04-15

    Artificial neural networks (ANNs) can be used to develop a technique to classify lymph node negative breast cancer that is prone to distant metastases based on gene expression signatures. The neural network used is a multilayered feed forward network that employs the back propagation algorithm. Once trained with DNA microarray-based gene expression profiles of genes that were predictive of distant metastasis recurrence of lymph node negative breast cancer, the ANNs became capable of correctly classifying all samples and recognizing the genes most appropriate to the classification. To test the ability of the trained ANN models in recognizing lymph node negative breast cancer, we analyzed additional idle samples that were not used beforehand for the training procedure and obtained the correctly classified result in the validation set. For a more substantial result, bootstrapping of the training and testing datasets was performed as external validation. This study illustrates the potential application of ANN for breast tumor diagnosis and the identification of candidate targets in patients for therapy.

  2. Profiling alternatively spliced mRNA isoforms for prostate cancer classification

    Directory of Open Access Journals (Sweden)

    Fan Jian-Bing

    2006-04-01

    Full Text Available Abstract Background Prostate cancer is one of the leading causes of cancer illness and death among men in the United States and worldwide. There is an urgent need to discover good biomarkers for early clinical diagnosis and treatment. Previously, we developed an exon-junction microarray-based assay and profiled 1532 mRNA splice isoforms from 364 potential prostate cancer related genes in 38 prostate tissues. Here, we investigate the advantage of using splice isoforms, which couple transcriptional and splicing regulation, for cancer classification. Results As many as 464 splice isoforms from more than 200 genes are differentially regulated in tumors at a false discovery rate (FDR) of 0.05. Remarkably, about 30% of genes have isoforms that are called significant but do not exhibit differential expression at the overall mRNA level. A support vector machine (SVM) classifier trained on 128 signature isoforms can correctly predict 92% of the cases, which outperforms the classifier using overall mRNA abundance by about 5%. It is also observed that the classification performance can be improved using multivariate variable selection methods, which take correlation among variables into account. Conclusion These results demonstrate that profiling of splice isoforms is able to provide unique and important information which cannot be detected by conventional microarrays.

  3. Improved prognostic classification of breast cancer defined by antagonistic activation patterns of immune response pathway modules

    International Nuclear Information System (INIS)

    Elucidating the activation pattern of molecular pathways across a given tumour type is a key challenge necessary for understanding the heterogeneity in clinical response and for developing novel more effective therapies. Gene expression signatures of molecular pathway activation derived from perturbation experiments in model systems as well as structural models of molecular interactions ('model signatures') constitute an important resource for estimating corresponding activation levels in tumours. However, relatively few strategies for estimating pathway activity from such model signatures exist and only few studies have used activation patterns of pathways to refine molecular classifications of cancer. Here we propose a novel network-based method for estimating pathway activation in tumours from model signatures. We find that although the pathway networks inferred from cancer expression data are highly consistent with the prior information contained in the model signatures, that they also exhibit a highly modular structure and that estimation of pathway activity is dependent on this modular structure. We apply our methodology to a panel of 438 estrogen receptor negative (ER-) and 785 estrogen receptor positive (ER+) breast cancers to infer activation patterns of important cancer related molecular pathways. We show that in ER negative basal and HER2+ breast cancer, gene expression modules reflecting T-cell helper-1 (Th1) and T-cell helper-2 (Th2) mediated immune responses play antagonistic roles as major risk factors for distant metastasis. Using Boolean interaction Cox-regression models to identify non-linear pathway combinations associated with clinical outcome, we show that simultaneous high activation of Th1 and low activation of a TGF-beta pathway module defines a subtype of particularly good prognosis and that this classification provides a better prognostic model than those based on the individual pathways. In ER+ breast cancer, we find that

  4. Improved prognostic classification of breast cancer defined by antagonistic activation patterns of immune response pathway modules

    Directory of Open Access Journals (Sweden)

    El-Ashry Dorraya

    2010-11-01

    Full Text Available Abstract Background Elucidating the activation pattern of molecular pathways across a given tumour type is a key challenge necessary for understanding the heterogeneity in clinical response and for developing novel more effective therapies. Gene expression signatures of molecular pathway activation derived from perturbation experiments in model systems as well as structural models of molecular interactions ("model signatures") constitute an important resource for estimating corresponding activation levels in tumours. However, relatively few strategies for estimating pathway activity from such model signatures exist and only few studies have used activation patterns of pathways to refine molecular classifications of cancer. Methods Here we propose a novel network-based method for estimating pathway activation in tumours from model signatures. We find that although the pathway networks inferred from cancer expression data are highly consistent with the prior information contained in the model signatures, that they also exhibit a highly modular structure and that estimation of pathway activity is dependent on this modular structure. We apply our methodology to a panel of 438 estrogen receptor negative (ER-) and 785 estrogen receptor positive (ER+) breast cancers to infer activation patterns of important cancer related molecular pathways. Results We show that in ER negative basal and HER2+ breast cancer, gene expression modules reflecting T-cell helper-1 (Th1) and T-cell helper-2 (Th2) mediated immune responses play antagonistic roles as major risk factors for distant metastasis. Using Boolean interaction Cox-regression models to identify non-linear pathway combinations associated with clinical outcome, we show that simultaneous high activation of Th1 and low activation of a TGF-beta pathway module defines a subtype of particularly good prognosis and that this classification provides a better prognostic model than those based on the individual pathways.

  5. Classification of Laser Induced Fluorescence Spectra from Normal and Malignant bladder tissues using Learning Vector Quantization Neural Network in Bladder Cancer Diagnosis

    DEFF Research Database (Denmark)

    Karemore, Gopal Raghunath; Mascarenhas, Kim Komal; Patil, Choudhary;

    2008-01-01

    classification accuracy of LVQ with other classifiers (eg. SVM and Multi Layer Perceptron) for the same data set. Good agreement has been obtained between LVQ based classification of spectroscopy data and histopathology results which demonstrate the use of LVQ classifier in bladder cancer diagnosis....

  6. Dissimilarity-based classification of anatomical tree structures

    DEFF Research Database (Denmark)

    Sørensen, Lauge Emil Borch Laurs; Lo, Pechin Chien Pau; Dirksen, Asger;

    2011-01-01

    A novel method for classification of abnormality in anatomical tree structures is presented. A tree is classified based on direct comparisons with other trees in a dissimilarity-based classification scheme. The pair-wise dissimilarity measure between two trees is based on a linear assignment betw...

  7. A MapReduce based Parallel SVM for Email Classification

    Directory of Open Access Journals (Sweden)

    Ke Xu

    2014-06-01

    Full Text Available Support Vector Machine (SVM) is a powerful classification and regression tool. Varying approaches including SVM based techniques are proposed for email classification. Automated email classification according to messages or user-specific folders and information extraction from chronologically ordered email streams have become interesting areas in text machine learning research. This paper presents a parallel SVM based on MapReduce (PSMR) algorithm for email classification. We discuss the challenges that arise from differences between email foldering and traditional document classification. We show experimental results from an array of automated classification methods and evaluation methodologies, including Naive Bayes, SVM and PSMR method of foldering results on the Enron datasets based on the timeline. By distributing, processing and optimizing the subsets of the training data across multiple participating nodes, the parallel SVM based on MapReduce algorithm reduces the training time significantly.

  8. Gender Classification Based on Geometry Features of Palm Image

    OpenAIRE

    Ming Wu; Yubo Yuan

    2014-01-01

    This paper presents a novel gender classification method based on geometry features of palm image which is simple, fast, and easy to handle. This gender classification method based on geometry features comprises two main attributes. The first one is feature extraction by image processing. The other one is classification system with polynomial smooth support vector machine (PSSVM). A total of 180 palm images were collected from 30 persons to verify the validity of the proposed gender classi...

  9. Gene Expression Profiles for Predicting Metastasis in Breast Cancer: A Cross-Study Comparison of Classification Methods

    Directory of Open Access Journals (Sweden)

    Mark Burton

    2012-01-01

    Full Text Available Machine learning has increasingly been used with microarray gene expression data and for the development of classifiers using a variety of methods. However, method comparisons in cross-study datasets are very scarce. This study compares the performance of seven classification methods and the effect of voting for predicting metastasis outcome in breast cancer patients, in three situations: within the same dataset or across datasets on similar or dissimilar microarray platforms. Combining classification results from seven classifiers into one voting decision performed significantly better during internal validation as well as external validation in similar microarray platforms than the underlying classification methods. When validating between different microarray platforms, random forest, another voting-based method, proved to be the best performing method. We conclude that voting based classifiers provided an advantage with respect to classifying metastasis outcome in breast cancer patients.
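
    The voting step described above, which combines the outputs of several classifiers into one decision, can be sketched as a plain majority vote. This is an illustrative assumption about the combination rule; the abstract does not state how the seven votes were weighted. The function name `majority_vote` and the toy predictions are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample.

    predictions: one list of predicted labels per classifier, all of the
    same length. With an odd number of classifiers and two classes, no
    ties can occur.
    """
    voted = []
    for sample_votes in zip(*predictions):
        counts = Counter(sample_votes)
        # most_common(1) returns the label with the highest vote count.
        voted.append(counts.most_common(1)[0][0])
    return voted


# Hypothetical outputs of three classifiers on three samples
# (1 = metastasis, 0 = no metastasis).
classifier_outputs = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
]
print(majority_vote(classifier_outputs))
```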

  10. Appraisal of progenitor markers in the context of molecular classification of breast cancers

    OpenAIRE

    Haviv, Izhak

    2011-01-01

    Clinical management of breast cancer relies on case stratification, which increasingly employs molecular markers. The motivation behind delineating breast epithelial differentiation is to better target cancer cases through innate sensitivities bequeathed to the cancer from its normal progenitor state. A combination of histopathological and molecular classification of breast cancer cases suggests a role for progenitors in particular breast cancer cases. Although a remarkable fraction of the re...

  11. DNA sequence analysis using hierarchical ART-based classification networks

    Energy Technology Data Exchange (ETDEWEB)

    LeBlanc, C.; Hruska, S.I. [Florida State Univ., Tallahassee, FL (United States); Katholi, C.R.; Unnasch, T.R. [Univ. of Alabama, Birmingham, AL (United States)

    1994-12-31

    Adaptive resonance theory (ART) describes a class of artificial neural network architectures that act as classification tools which self-organize, work in real-time, and require no retraining to classify novel sequences. We have adapted ART networks to provide support to scientists attempting to categorize tandem repeat DNA fragments from Onchocerca volvulus. In this approach, sequences of DNA fragments are presented to multiple ART-based networks which are linked together into two (or more) tiers; the first provides coarse sequence classification while the subsequent tiers refine the classifications as needed. The overall rating of the resulting classification of fragments is measured using statistical techniques based on those introduced to validate results from traditional phylogenetic analysis. Tests of the Hierarchical ART-based Classification Network, or HABclass network, indicate its value as a fast, easy-to-use classification tool which adapts to new data without retraining on previously classified data.

  12. Use of multivariate analysis to suggest a new molecular classification of colorectal cancer

    Science.gov (United States)

    Domingo, Enric; Ramamoorthy, Rajarajan; Oukrif, Dahmane; Rosmarin, Daniel; Presz, Michal; Wang, Haitao; Pulker, Hannah; Lockstone, Helen; Hveem, Tarjei; Cranston, Treena; Danielsen, Havard; Novelli, Marco; Davidson, Brian; Xu, Zheng-Zhou; Molloy, Peter; Johnstone, Elaine; Holmes, Christopher; Midgley, Rachel; Kerr, David; Sieber, Oliver; Tomlinson, Ian

    2013-01-01

    Abstract Molecular classification of colorectal cancer (CRC) is currently based on microsatellite instability (MSI), KRAS or BRAF mutation and, occasionally, chromosomal instability (CIN). Whilst useful, these categories may not fully represent the underlying molecular subgroups. We screened 906 stage II/III CRCs from the VICTOR clinical trial for somatic mutations. Multivariate analyses (logistic regression, clustering, Bayesian networks) identified the primary molecular associations. Positive associations occurred between: CIN and TP53 mutation; MSI and BRAF mutation; and KRAS and PIK3CA mutations. Negative associations occurred between: MSI and CIN; MSI and NRAS mutation; and KRAS mutation, and each of NRAS, TP53 and BRAF mutations. Some complex relationships were elucidated: KRAS and TP53 mutations had both a direct negative association and a weaker, confounding, positive association via TP53–CIN–MSI–BRAF–KRAS. Our results suggested a new molecular classification of CRCs: (1) MSI+ and/or BRAF-mutant; (2) CIN+ and/or TP53-mutant, with wild-type KRAS and PIK3CA; (3) KRAS- and/or PIK3CA-mutant, CIN+, TP53-wild-type; (4) KRAS- and/or PIK3CA-mutant, CIN–, TP53-wild-type; (5) NRAS-mutant; (6) no mutations; (7) others. As expected, group 1 cancers were mostly proximal and poorly differentiated, usually occurring in women. Unexpectedly, two different types of CIN+ CRC were found: group 2 cancers were usually distal and occurred in men, whereas group 3 showed neither of these associations but were of higher stage. CIN+ cancers have conventionally been associated with all three of these variables, because they have been tested en masse. Our classification also showed potentially improved prognostic capabilities, with group 3, and possibly group 1, independently predicting disease-free survival. Copyright © 2012 Pathological Society of Great Britain and Ireland. Published by John Wiley & Sons, Ltd. PMID:23165447

  13. Prediction of Depression in Cancer Patients With Different Classification Criteria, Linear Discriminant Analysis versus Logistic Regression

    Science.gov (United States)

    Shayan, Zahra; Mezerji, Naser Mohammad Gholi; Shayan, Leila; Naseri, Parisa

    2016-01-01

    Background: Logistic regression (LR) and linear discriminant analysis (LDA) are two popular statistical models for prediction of group membership. Although they are very similar, the LDA makes more assumptions about the data. When categorical and continuous variables are used simultaneously, the optimal choice between the two models is questionable. In most studies, classification error (CE) is used to discriminate between subjects in several groups, but this index is not suitable to predict the accuracy of the outcome. The present study compared LR and LDA models using classification indices. Methods: This cross-sectional study selected 243 cancer patients. Sample sets of different sizes (n = 50, 100, 150, 200, 220) were randomly selected and the CE, B, and Q classification indices were calculated by the LR and LDA models. Results: CE revealed a lack of superiority for one model over the other, but the results showed that LR performed better than LDA for the B and Q indices in all situations. No significant effect for sample size on CE was noted for selection of an optimal model. Assessment of the accuracy of prediction of real data indicated that the B and Q indices are appropriate for selection of an optimal model. Conclusion: The results of this study showed that LR performs better in some cases and LDA in others when based on CE. The CE index is not appropriate for classification, although the B and Q indices performed better and offered more efficient criteria for comparison and discrimination between groups.

  14. Structure-Based Algorithms for Microvessel Classification

    KAUST Repository

    Smith, Amy F.

    2015-02-01

    © 2014 The Authors. Microcirculation published by John Wiley & Sons Ltd. Objective: Recent developments in high-resolution imaging techniques have enabled digital reconstruction of three-dimensional sections of microvascular networks down to the capillary scale. To better interpret these large data sets, our goal is to distinguish branching trees of arterioles and venules from capillaries. Methods: Two novel algorithms are presented for classifying vessels in microvascular anatomical data sets without requiring flow information. The algorithms are compared with a classification based on observed flow directions (considered the gold standard), and with an existing resistance-based method that relies only on structural data. Results: The first algorithm, developed for networks with one arteriolar and one venular tree, performs well in identifying arterioles and venules and is robust to parameter changes, but incorrectly labels a significant number of capillaries as arterioles or venules. The second algorithm, developed for networks with multiple inlets and outlets, correctly identifies more arterioles and venules, but is more sensitive to parameter changes. Conclusions: The algorithms presented here can be used to classify microvessels in large microvascular data sets lacking flow information. This provides a basis for analyzing the distinct geometrical properties and modelling the functional behavior of arterioles, capillaries, and venules.

  15. Classification of CMEs Based on Their Dynamics

    Science.gov (United States)

    Nicewicz, J.; Michalek, G.

    2016-05-01

    A large set of coronal mass ejections (CMEs; 6621 events) has been selected to study their dynamics as seen in the field of view (LFOV) of the Large Angle and Spectroscopic Coronagraph (LASCO) onboard the Solar and Heliospheric Observatory (SOHO). These events were selected based on having at least six height-time measurements so that their dynamic properties, in the LFOV, can be evaluated with reasonable accuracy. Height-time measurements (in the SOHO/LASCO catalog) were used to determine the velocities and accelerations of individual CMEs at successive distances from the Sun. Linear and quadratic functions were fitted to these data points. On the basis of the best fits to the velocity data points, we were able to classify CMEs into four groups. The types of CMEs differ not only in their dynamic behavior but also in their masses, widths, velocities, and accelerations. We also show that these groups of events are initiated by different onset mechanisms. The results of our study allow us to present a consistent classification of CMEs based on their dynamics.
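
    As a rough illustration of how velocities and an acceleration can be derived from height-time measurements, the sketch below takes finite differences between successive points and then fits a straight line to the resulting velocities. It is a sketch under stated assumptions rather than the catalog's or the authors' procedure; the function names, the units, and the six sample points are hypothetical.

```python
def finite_difference_velocities(times, heights):
    """Velocity between each pair of successive height-time points.

    Returns the midpoint times and the velocities at those midpoints.
    """
    mid_times, velocities = [], []
    for i in range(len(times) - 1):
        dt = times[i + 1] - times[i]
        mid_times.append((times[i] + times[i + 1]) / 2.0)
        velocities.append((heights[i + 1] - heights[i]) / dt)
    return mid_times, velocities


def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical event with six measurements following h = 2 + 3 t + t^2
# (arbitrary units), i.e. initial velocity 3 and constant acceleration 2.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
heights = [2.0 + 3.0 * t + t * t for t in times]

mid_times, velocities = finite_difference_velocities(times, heights)
acceleration, initial_velocity = linear_fit(mid_times, velocities)
print(acceleration, initial_velocity)
```

    The slope of the fitted velocity line is the mean acceleration; comparing how well linear and quadratic velocity fits describe an event is one way such profiles could be grouped.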

  16. SELDI-TOF Serum Profiling for Prognostic and Diagnostic Classification of Breast Cancers

    Directory of Open Access Journals (Sweden)

    Christine Laronga

    2004-01-01

    Full Text Available Surface enhanced laser desorption/ionization (SELDI) time-of-flight mass spectrometry has emerged as a successful tool for serum based detection and differentiation of many cancer types, including breast cancers. In this study, we have applied the SELDI technology to evaluate three potential applications that could extend the effectiveness of established procedures and biomarkers used for prognostication of breast cancers. Paired serum samples obtained from women with breast cancers prior to surgery and post-surgery (6–9 mos.) were examined. In 14/16 post-treatment patients, serum protein profiles could be used to distinguish these samples from the pre-treatment cancer samples. When compared to serum samples from normal healthy women, 11 of these post-treatment samples retained global protein profiles not found in healthy women, including five low-mass proteins that remained elevated in both pre-treatment and post-treatment serum groups. In another pilot study, serum profiles were compared for a group of 30 women who were known BRCA-1 mutation carriers, half of whom subsequently developed breast cancer within three years of the sample procurement. SELDI protein profiling accurately classified 13/15 women with BRCA-1 breast cancers from the 15 non-cancer BRCA-1 carriers. Additionally, the ability of SELDI to distinguish between the serum profiles from sentinel lymph node positive and sentinel lymph node negative patients was evaluated. In sentinel lymph node positive samples, 22/27 samples were correctly classified, in comparison to the correct classification of 55/71 sentinel lymph node negative samples. These initial results indicate the utility of protein profiling approaches for developing new diagnostic and prognostic assays for breast cancers.
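
    The counts reported above can be turned into percentages with a few lines of arithmetic. The sketch below only restates the abstract's numbers; the helper name `percent_correct` is an assumption, and treating the node-positive and node-negative rates as sensitivity-like and specificity-like figures is an interpretation, not a claim made by the source.

```python
def percent_correct(correct, total):
    """Share of correctly classified samples, as a percentage."""
    return 100.0 * correct / total


# Counts taken from the abstract above.
node_positive = percent_correct(22, 27)   # sentinel lymph node positive
node_negative = percent_correct(55, 71)   # sentinel lymph node negative
overall = percent_correct(22 + 55, 27 + 71)
brca1_cancer = percent_correct(13, 15)    # BRCA-1 carriers who developed cancer

print(round(node_positive, 1), round(node_negative, 1),
      round(overall, 1), round(brca1_cancer, 1))
```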

  17. Improving breast cancer classification with mammography, supported on an appropriate variable selection analysis

    Science.gov (United States)

    Pérez, Noel; Guevara, Miguel A.; Silva, Augusto

    2013-02-01

    This work addresses the issue of variable selection within the context of breast cancer classification with mammography. A comprehensive repository of feature vectors was used including a hybrid subset gathering image-based and clinical features. It aimed to gather experimental evidence of variable selection in terms of cardinality and type, and to find a classification scheme that provides the best performance over the Area Under Receiver Operating Characteristics Curve (AUC) scores using the ranked features subset. We evaluated and classified a total of 300 subsets of features formed by the application of Chi-Square Discretization, Information-Gain, One-Rule and RELIEF methods in association with Feed-Forward Backpropagation Neural Network (FFBP), Support Vector Machine (SVM) and Decision Tree J48 (DTJ48) Machine Learning Algorithms (MLA) for a comparative performance evaluation based on AUC scores. A variable selection analysis was performed for Single-View Ranking and Multi-View Ranking groups of features. Features subsets representing Microcalcifications (MCs), Masses and both MCs and Masses lesions achieved AUC scores of 0.91, 0.954 and 0.934 respectively. Experimental evidence demonstrated that classification performance was improved by combining image-based and clinical features. The most important clinical and image-based features were StromaDistortion and Circularity respectively. Other features, less important but worth using because of their consistency, were Contrast, Perimeter, Microcalcification, Correlation and Elongation.

  18. MEDICAL DIAGNOSIS CLASSIFICATION USING MIGRATION BASED DIFFERENTIAL EVOLUTION ALGORITHM

    Directory of Open Access Journals (Sweden)

    Htet Thazin Tike Thein

    2014-12-01

    Full Text Available Constructing a classification model is important in machine learning for a particular task. A classification process involves assigning objects into predefined groups or classes based on a number of observed attributes related to those objects. The artificial neural network is one of the classification algorithms which can be used in many application areas. This paper investigates the potential of applying the feed forward neural network architecture for the classification of medical datasets. The migration based differential evolution algorithm (MBDE) is chosen and applied to the feed forward neural network to enhance the learning process, and the network learning is validated in terms of convergence rate and classification accuracy. In this paper, the MBDE algorithm with various migration policies is proposed for classification problems using medical diagnosis.

  19. Molecular classification of familial non-BRCA1/BRCA2 breast cancer.

    Science.gov (United States)

    Hedenfalk, Ingrid; Ringner, Markus; Ben-Dor, Amir; Yakhini, Zohar; Chen, Yidong; Chebil, Gunilla; Ach, Robert; Loman, Niklas; Olsson, Håkan; Meltzer, Paul; Borg, Ake; Trent, Jeffrey

    2003-03-01

    In the decade since their discovery, the two major breast cancer susceptibility genes BRCA1 and BRCA2, have been shown conclusively to be involved in a significant fraction of families segregating breast and ovarian cancer. However, it has become equally clear that a large proportion of families segregating breast cancer alone are not caused by mutations in BRCA1 or BRCA2. Unfortunately, despite intensive effort, the identification of additional breast cancer predisposition genes has so far been unsuccessful, presumably because of genetic heterogeneity, low penetrance, or recessive/polygenic mechanisms. These non-BRCA1/2 breast cancer families (termed BRCAx families) comprise a histopathologically heterogeneous group, further supporting their origin from multiple genetic events. Accordingly, the identification of a method to successfully subdivide BRCAx families into recognizable groups could be of considerable value to further genetic analysis. We have previously shown that global gene expression analysis can identify unique and distinct expression profiles in breast tumors from BRCA1 and BRCA2 mutation carriers. Here we show that gene expression profiling can discover novel classes among BRCAx tumors, and differentiate them from BRCA1 and BRCA2 tumors. Moreover, microarray-based comparative genomic hybridization (CGH) to cDNA arrays revealed specific somatic genetic alterations within the BRCAx subgroups. These findings illustrate that, when gene expression-based classifications are used, BRCAx families can be grouped into homogeneous subsets, thereby potentially increasing the power of conventional genetic analysis.

  20. Cellular based cancer vaccines

    DEFF Research Database (Denmark)

    Hansen, Morten; Met, O; Svane, I M;

    2012-01-01

    Cancer vaccines designed to re-calibrate the existing host-tumour interaction, tipping the balance from tumor acceptance towards tumor control holds huge potential to complement traditional cancer therapies. In general, limited success has been achieved with vaccines composed of tumor...... in vitro migration via autocrine receptor-mediated endocytosis of CCR7. In the current review, we discuss optimal design of DC maturation focused on pre-clinical as well as clinical results from standard and polarized dendritic cell based cancer vaccines....

  1. A new classification algorithm based on RGH-tree search

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    In this paper, we put forward a new classification algorithm based on RGH-Tree search and perform a classification analysis and comparison study. This algorithm can save computing resources and increase classification efficiency. The experiment shows that this algorithm performs better in dealing with three-dimensional multi-kind data. We find that the algorithm has better generalization ability when the training set is small and the testing set is large.

  2. MASS CLASSIFICATION IN DIGITAL MAMMOGRAMS BASED ON DISCRETE SHEARLET TRANSFORM

    Directory of Open Access Journals (Sweden)

    J. Amjath Ali

    2013-01-01

    Full Text Available The most significant health problem in the world is breast cancer, and early detection is the key to predicting it. Mammography is the most reliable method to diagnose breast cancer at the earliest stage. The classification of the two most common findings in digital mammograms, microcalcifications and masses, is valuable for early detection. Since the appearance of the masses is similar to the surrounding parenchyma, the classification is not an easy task. In this study, an efficient approach to classify masses in the Mammography Image Analysis Society (MIAS) database mammogram images is presented. The key features used for the classification are the energies of the shearlet decomposed image. These features are fed into an SVM classifier to classify mass/non-mass images and also benign/malignant ones. The results demonstrate that the proposed shearlet energy features outperform the wavelet energy features in terms of accuracy.

  3. EPA's program for risk assessment guidelines: Cancer classification issues

    Energy Technology Data Exchange (ETDEWEB)

    Wiltse, J. [Environmental Protection Agency, Washington, DC (United States)

    1990-12-31

    Issues presented are related to classification of weight of evidence in cancer risk assessments. The focus in this paper is on lines of evidence used in constructing a conclusion about potential human carcinogenicity. The paper also discusses issues that are mistakenly addressed as classification issues but are really part of the risk assessment process. 2 figs.

  4. Preliminary Research on Grassland Fine-classification Based on MODIS

    International Nuclear Information System (INIS)

    Grassland ecosystems are important for climatic regulation and for maintaining soil and water. Research on grassland monitoring methods could provide an effective reference for grassland resource investigation. In this study, we used the vegetation index method for grassland classification. There are several types of climate in China. Therefore, we used China's Main Climate Zone Maps to divide the study region into four climate zones. Based on the grassland classification system of the first nation-wide grass resource survey in China, we established a new grassland classification system which is only suitable for this research. We used MODIS images as the basic data resources, and used the expert classifier method to perform grassland classification. Based on the 1:1,000,000 Grassland Resource Map of China, we obtained the basic distribution of all the grassland types and selected 20 samples evenly distributed in each type, then used the NDVI/EVI product to summarize the different spectral features of different grassland types. Finally, we introduced other classification auxiliary data, such as elevation, accumulated temperature (AT), humidity index (HI) and rainfall. China's nation-wide grassland classification map was produced by merging the grassland classifications of the different climate zones. The overall classification accuracy is 60.4%. The result indicated that the expert classifier is appropriate for nation-wide grassland classification, but the classification accuracy needs to be improved.
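
    The vegetation index mentioned above, NDVI, is computed from red and near-infrared reflectance as (NIR - Red) / (NIR + Red). The sketch below shows only that standard formula; the study itself used a ready-made MODIS NDVI/EVI product, and the function name and the reflectance values are assumptions made for illustration.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel.

    nir, red: surface reflectance in the near-infrared and red bands.
    Returns 0.0 when both bands are zero to avoid dividing by zero.
    """
    total = nir + red
    if total == 0:
        return 0.0
    return (nir - red) / total


# Hypothetical reflectance pairs (NIR, red): dense vegetation reflects
# strongly in the near-infrared, so its NDVI is higher than that of bare soil.
dense_grass = ndvi(0.5, 0.1)
bare_soil = ndvi(0.3, 0.25)
print(round(dense_grass, 3), round(bare_soil, 3))
```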

  5. AN OBJECT-BASED METHOD FOR CHINESE LANDFORM TYPES CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    H. Ding

    2016-06-01

    Full Text Available Landform classification is a necessary task for various fields of landscape and regional planning, for example for landscape evaluation, erosion studies, hazard prediction, etc. This study proposes an improved object-based classification for Chinese landform types using the factor importance analysis of random forest and the gray-level co-occurrence matrix (GLCM). In this research, based on a 1 km DEM of China, the combination of terrain factors extracted from the DEM is selected by correlation analysis and Sheffield's entropy method. A random forest classification tree is applied to evaluate the importance of the terrain factors, which are used as multi-scale segmentation thresholds. Then the GLCM is computed for the knowledge base of classification. The classification result was checked using the 1:4,000,000 Chinese Geomorphological Map as reference. The overall classification accuracy of the proposed method is 5.7% higher than ISODATA unsupervised classification, and 15.7% higher than the traditional object-based classification method.

  6. A Curriculum-Based Classification System for Community Colleges.

    Science.gov (United States)

    Schuyler, Gwyer

    2003-01-01

    Proposes and tests a community college classification system based on curricular characteristics and their association with institutional characteristics. Seeks readily available data correlates to represent percentage of a college's course offerings that are in the liberal arts. A simple two-category classification system using total enrollment…

  7. An Object-Based Method for Chinese Landform Types Classification

    Science.gov (United States)

    Ding, Hu; Tao, Fei; Zhao, Wufan; Na, Jiaming; Tang, Guo'an

    2016-06-01

    Landform classification is a necessary task for various fields of landscape and regional planning, for example for landscape evaluation, erosion studies, hazard prediction, etc. This study proposes an improved object-based classification for Chinese landform types using the factor importance analysis of random forest and the gray-level co-occurrence matrix (GLCM). In this research, based on a 1 km DEM of China, the combination of terrain factors extracted from the DEM is selected by correlation analysis and Sheffield's entropy method. A random forest classification tree is applied to evaluate the importance of the terrain factors, which are used as multi-scale segmentation thresholds. Then the GLCM is computed for the knowledge base of classification. The classification result was checked using the 1:4,000,000 Chinese Geomorphological Map as reference. The overall classification accuracy of the proposed method is 5.7% higher than ISODATA unsupervised classification, and 15.7% higher than the traditional object-based classification method.
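    For readers unfamiliar with the gray-level co-occurrence matrix mentioned above, a minimal sketch follows. It counts how often gray level i is followed by gray level j at a fixed pixel offset and derives one common texture statistic (contrast). The offset, the quantization into a few gray levels and the function names are assumptions made for illustration; the abstract does not state which GLCM settings or statistics the authors used.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count co-occurrences of gray levels (i, j) at pixel offset (dx, dy)."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            # Skip pairs whose second pixel falls outside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def contrast(matrix):
    """GLCM contrast: mean squared gray-level difference over counted pairs."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        return 0.0
    weighted = sum(
        (i - j) ** 2 * count
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )
    return weighted / total
```

    In an object-based workflow, statistics like this would typically be computed per image segment rather than per whole image.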

  8. Two-Dimensional ARMA Modeling for Breast Cancer Detection and Classification

    CERN Document Server

    Bouaynaya, Nidhal; Schonfeld, Dan

    2009-01-01

    We propose a new model-based computer-aided diagnosis (CAD) system for tumor detection and classification (cancerous vs. benign) in breast images. Specifically, we show that (x-ray, ultrasound and MRI) images can be accurately modeled by two-dimensional autoregressive-moving average (ARMA) random fields. We derive two-stage Yule-Walker least-squares estimates of the model parameters, which are subsequently used as the basis for statistical inference and biophysical interpretation of the breast image. We use a k-means classifier to segment the breast image into three regions: healthy tissue, benign tumor, and cancerous tumor. Our simulation results on ultrasound breast images illustrate the power of the proposed approach.

  9. Regularization in Retrieval-Driven Classification of Clustered Microcalcifications for Breast Cancer

    Directory of Open Access Journals (Sweden)

    Hao Jing

    2012-01-01

    Full Text Available We propose a regularization based approach for case-adaptive classification in computer-aided diagnosis (CAD) of breast cancer. The goal is to improve the classification accuracy on a query case by making use of a set of similar cases retrieved from an existing library of known cases. In the proposed approach, a prior is first derived from a traditional CAD classifier (which is typically pre-trained offline on a set of training cases). It is then used together with the retrieved similar cases to obtain an adaptive classifier on the query case. We consider two different forms for the regularization prior: one is fixed for all query cases and the other is allowed to vary with different query cases. In the experiments the proposed approach is demonstrated on a dataset of 1,006 clinical cases. The results show that it could achieve significant improvement in numerical efficiency compared with a previously proposed case adaptive approach (by about an order of magnitude) while maintaining similar (or better) improvement in classification accuracy; it could also adapt faster in performance with a small number of retrieved cases. Measured by the area under the ROC curve (AUC), the regularization based approach achieved AUC = 0.8215, compared with AUC = 0.7329 for the baseline classifier (P-value = 0.001).

  10. Regularization in Retrieval-Driven Classification of Clustered Microcalcifications for Breast Cancer

    Science.gov (United States)

    Jing, Hao; Yang, Yongyi; Nishikawa, Robert M.

    2012-01-01

    We propose a regularization based approach for case-adaptive classification in computer-aided diagnosis (CAD) of breast cancer. The goal is to improve the classification accuracy on a query case by making use of a set of similar cases retrieved from an existing library of known cases. In the proposed approach, a prior is first derived from a traditional CAD classifier (which is typically pre-trained offline on a set of training cases). It is then used together with the retrieved similar cases to obtain an adaptive classifier on the query case. We consider two different forms for the regularization prior: one is fixed for all query cases and the other is allowed to vary with different query cases. In the experiments the proposed approach is demonstrated on a dataset of 1,006 clinical cases. The results show that it could achieve significant improvement in numerical efficiency compared with a previously proposed case adaptive approach (by about an order of magnitude) while maintaining similar (or better) improvement in classification accuracy; it could also adapt faster in performance with a small number of retrieved cases. Measured by the area under the ROC curve (AUC), the regularization based approach achieved AUC = 0.8215, compared with AUC = 0.7329 for the baseline classifier (P-value = 0.001). PMID:22919363
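    The AUC figures quoted above can be read as the probability that a randomly chosen positive case receives a higher classifier score than a randomly chosen negative case. The sketch below computes AUC in exactly that pairwise way; it is a generic illustration with a hypothetical function name, not the evaluation code used in the study.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison of scores.

    Each (positive, negative) pair counts 1 if the positive scores higher,
    0.5 on a tie, and 0 otherwise; the result is the mean over all pairs.
    """
    if not scores_pos or not scores_neg:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))
```

    The double loop is quadratic in the number of cases, which is acceptable for a dataset of about a thousand cases; a rank-based formulation would be preferable for much larger sets.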

  11. Efficient molecular subtype classification of high-grade serous ovarian cancer.

    Science.gov (United States)

    Leong, Huei San; Galletta, Laura; Etemadmoghadam, Dariush; George, Joshy; Köbel, Martin; Ramus, Susan J; Bowtell, David

    2015-07-01

    High-grade serous carcinomas (HGSCs) account for approximately 70% of all epithelial ovarian cancers diagnosed. Using microarray gene expression profiling, we previously identified four molecular subtypes of HGSC: C1 (mesenchymal), C2 (immunoreactive), C4 (differentiated), and C5 (proliferative), which correlate with patient survival and have distinct biological features. Here, we describe molecular classification of HGSC based on a limited number of genes to allow cost-effective and high-throughput subtype analysis. We determined a minimal signature for accurate classification, including 39 differentially expressed and nine control genes from microarray experiments. Taqman-based (low-density arrays and Fluidigm), fluorescent oligonucleotides (Nanostring), and targeted RNA sequencing (Illumina) assays were then compared for their ability to correctly classify fresh and formalin-fixed, paraffin-embedded samples. All platforms achieved > 90% classification accuracy with RNA from fresh frozen samples. The Illumina and Nanostring assays were superior with fixed material. We found that the C1, C2, and C4 molecular subtypes were largely consistent across multiple surgical deposits from individual chemo-naive patients. In contrast, we observed substantial subtype heterogeneity in patients whose primary ovarian sample was classified as C5. The development of an efficient molecular classifier of HGSC should enable further biological characterization of molecular subtypes and the development of targeted clinical trials. PMID:25810134

  12. Gene expression classification of colon cancer into molecular subtypes: characterization, validation, and prognostic value.

    Directory of Open Access Journals (Sweden)

    Laetitia Marisa

    Full Text Available BACKGROUND: Colon cancer (CC) pathological staging fails to accurately predict recurrence, and to date, no gene expression signature has proven reliable for prognosis stratification in clinical practice, perhaps because CC is a heterogeneous disease. The aim of this study was to establish a comprehensive molecular classification of CC based on mRNA expression profile analyses. METHODS AND FINDINGS: Fresh-frozen primary tumor samples from a large multicenter cohort of 750 patients with stage I to IV CC who underwent surgery between 1987 and 2007 in seven centers were characterized for common DNA alterations, including BRAF, KRAS, and TP53 mutations, CpG island methylator phenotype, mismatch repair status, and chromosomal instability status, and were screened with whole genome and transcriptome arrays. 566 samples fulfilled RNA quality requirements. Unsupervised consensus hierarchical clustering applied to gene expression data from a discovery subset of 443 CC samples identified six molecular subtypes. These subtypes were associated with distinct clinicopathological characteristics, molecular alterations, specific enrichments of supervised gene expression signatures (stem cell phenotype-like, normal-like, serrated CC phenotype-like), and deregulated signaling pathways. Based on their main biological characteristics, we distinguished a deficient mismatch repair subtype, a KRAS mutant subtype, a cancer stem cell subtype, and three chromosomal instability subtypes, including one associated with down-regulated immune pathways, one with up-regulation of the Wnt pathway, and one displaying a normal-like gene expression profile. The classification was validated in the remaining 123 samples plus an independent set of 1,058 CC samples, including eight public datasets. Furthermore, prognosis was analyzed in the subset of stage II-III CC samples. The subtypes C4 and C6, but not the subtypes C1, C2, C3, and C5, were independently associated with shorter relapse

  13. Fast Wavelet-Based Visual Classification

    CERN Document Server

    Yu, Guoshen

    2008-01-01

    We investigate a biologically motivated approach to fast visual classification, directly inspired by the recent work of Serre et al. Specifically, trading-off biological accuracy for computational efficiency, we explore using wavelet and grouplet-like transforms to parallel the tuning of visual cortex V1 and V2 cells, alternated with max operations to achieve scale and translation invariance. A feature selection procedure is applied during learning to accelerate recognition. We introduce a simple attention-like feedback mechanism, significantly improving recognition and robustness in multiple-object scenes. In experiments, the proposed algorithm achieves or exceeds state-of-the-art success rate on object recognition, texture and satellite image classification, language identification and sound classification.

  14. Knowledge-Based Classification in Automated Soil Mapping

    Institute of Scientific and Technical Information of China (English)

    ZHOU BIN; WANG RENCHAO

    2003-01-01

    A machine-learning approach was developed for automated building of knowledge bases for soil resources mapping by using a classification tree to generate knowledge from training data. With this method, building a knowledge base for automated soil mapping was easier than using the conventional knowledge acquisition approach. The knowledge base built by the classification tree was used by the knowledge classifier to perform the soil type classification of Longyou County, Zhejiang Province, China using Landsat TM bi-temporal images and GIS data. To evaluate the performance of the resultant knowledge bases, the classification results were compared to an existing soil map based on a field survey. The accuracy assessment and analysis of the resultant soil maps suggested that the knowledge bases built by the machine-learning method were of good quality for mapping the distribution model of soil classes over the study area.

  15. Shape classification based on singular value decomposition transform

    Institute of Scientific and Technical Information of China (English)

    SHAABAN Zyad; ARIF Thawar; BABA Sami; KREKOR Lala

    2009-01-01

    In this paper, a new shape classification system based on the singular value decomposition (SVD) transform using a nearest neighbour classifier is proposed. The gray scale image of the shape object was converted into a black and white image. The squared Euclidean distance transform was applied to the binary image to extract the boundary image of the shape. SVD transform features were extracted from the boundary of the object shapes. The proposed classification system based on the SVD transform feature extraction method was compared with a classifier based on moment invariants, both using a nearest neighbour classifier. The experimental results showed the advantage of the proposed classification system.

  16. Multiclass Classification Based on the Analytical Center of Version Space

    Institute of Scientific and Technical Information of China (English)

    ZENG Fanzi; QIU Zhengding; YUE Jianhai; LI Xiangqian

    2005-01-01

    The analytical center machine, based on the analytical center of version space, outperforms the support vector machine, especially when the version space is elongated or asymmetric. While the analytical center machine for binary classification is well understood, little is known about the corresponding multiclass classification. Moreover, the current “one versus all” multiclass method must repeatedly construct classifiers to separate a single class from all the others, which leads to daunting computation and low classification efficiency, and although the multiclass support vector machine corresponds to a simple quadratic optimization, it is not very effective when the version space is asymmetric or elongated. A multiclass classification approach based on the analytical center of version space is therefore proposed to address these problems. Experiments on the wine recognition and glass identification datasets demonstrate the validity of the proposed approach.

  17. Behavior Based Social Dimensions Extraction for Multi-Label Classification.

    Directory of Open Access Journals (Sweden)

    Le Li

    Full Text Available Classification based on social dimensions is commonly used to handle the multi-label classification task in heterogeneous networks. However, traditional methods, which mostly rely on the community detection algorithms to extract the latent social dimensions, produce unsatisfactory performance when community detection algorithms fail. In this paper, we propose a novel behavior based social dimensions extraction method to improve the classification performance in multi-label heterogeneous networks. In our method, nodes' behavior features, instead of community memberships, are used to extract social dimensions. By introducing Latent Dirichlet Allocation (LDA) to model the network generation process, nodes' connection behaviors with different communities can be extracted accurately, which are applied as latent social dimensions for classification. Experiments on various public datasets reveal that the proposed method can obtain satisfactory classification results in comparison to other state-of-the-art methods on smaller social dimensions.

  18. Behavior Based Social Dimensions Extraction for Multi-Label Classification.

    Science.gov (United States)

    Li, Le; Xu, Junyi; Xiao, Weidong; Ge, Bin

    2016-01-01

    Classification based on social dimensions is commonly used to handle the multi-label classification task in heterogeneous networks. However, traditional methods, which mostly rely on the community detection algorithms to extract the latent social dimensions, produce unsatisfactory performance when community detection algorithms fail. In this paper, we propose a novel behavior based social dimensions extraction method to improve the classification performance in multi-label heterogeneous networks. In our method, nodes' behavior features, instead of community memberships, are used to extract social dimensions. By introducing Latent Dirichlet Allocation (LDA) to model the network generation process, nodes' connection behaviors with different communities can be extracted accurately, which are applied as latent social dimensions for classification. Experiments on various public datasets reveal that the proposed method can obtain satisfactory classification results in comparison to other state-of-the-art methods on smaller social dimensions. PMID:27049849

  19. Behavior Based Social Dimensions Extraction for Multi-Label Classification

    Science.gov (United States)

    Li, Le; Xu, Junyi; Xiao, Weidong; Ge, Bin

    2016-01-01

    Classification based on social dimensions is commonly used to handle the multi-label classification task in heterogeneous networks. However, traditional methods, which mostly rely on the community detection algorithms to extract the latent social dimensions, produce unsatisfactory performance when community detection algorithms fail. In this paper, we propose a novel behavior based social dimensions extraction method to improve the classification performance in multi-label heterogeneous networks. In our method, nodes’ behavior features, instead of community memberships, are used to extract social dimensions. By introducing Latent Dirichlet Allocation (LDA) to model the network generation process, nodes’ connection behaviors with different communities can be extracted accurately, which are applied as latent social dimensions for classification. Experiments on various public datasets reveal that the proposed method can obtain satisfactory classification results in comparison to other state-of-the-art methods on smaller social dimensions. PMID:27049849

  20. A Fuzzy Logic Based Sentiment Classification

    Directory of Open Access Journals (Sweden)

    J.I.Sheeba

    2014-07-01

    Full Text Available Sentiment classification aims to detect information such as opinions and explicit or implicit feelings expressed in text. Most existing approaches detect either explicit or implicit expressions of sentiment in text, but not both. The proposed framework detects both implicit and explicit expressions in meeting transcripts. It classifies positive, negative, and neutral words and also identifies the topic of a given meeting transcript using fuzzy logic. This paper aims to add features that improve the classification method. The quality of sentiment classification is improved using the proposed fuzzy logic framework, which includes fuzzy rules and the Fuzzy C-means algorithm. The quality of the output is evaluated using precision, recall, and F-measure, and the Fuzzy C-means clustering is measured in terms of purity and entropy. The data set was validated using 10-fold cross-validation, and a 95% confidence interval was observed for the accuracy values. Finally, the proposed fuzzy logic method produced results that were more than 85% accurate, with a much lower error rate than existing sentiment classification techniques.
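    The evaluation measures named above (precision, recall, F-measure) can be computed per sentiment class as in the following sketch. The function name and the example labels are hypothetical; the abstract does not specify how the authors averaged the measures across the positive, negative and neutral classes.

```python
def precision_recall_f1(true_labels, predicted_labels, positive):
    """Precision, recall and F1 for one class treated as the positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Report 0.0 when a ratio is undefined (no predictions or no true cases).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

    Calling the function once per class and averaging the three results gives a macro-averaged score, which is one common way to summarize a three-class sentiment task.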

  1. Classification

    Science.gov (United States)

    Clary, Renee; Wandersee, James

    2013-01-01

    In this article, Renee Clary and James Wandersee describe the beginnings of "Classification," which lies at the very heart of science and depends upon pattern recognition. Clary and Wandersee approach patterns by first telling the story of the "Linnaean classification system," introduced by Carl Linnaeus (1707-1778), who is…

  2. A Comparative Analysis of Swarm Intelligence Techniques for Feature Selection in Cancer Classification

    Directory of Open Access Journals (Sweden)

    Chellamuthu Gunavathi

    2014-01-01

    Full Text Available Feature selection in cancer classification is a central area of research in the field of bioinformatics and used to select the informative genes from thousands of genes of the microarray. The genes are ranked based on T-statistics, signal-to-noise ratio (SNR), and F-test values. The swarm intelligence (SI) technique finds the informative genes from the top-m ranked genes. These selected genes are used for classification. In this paper the shuffled frog leaping with Lévy flight (SFLLF) is proposed for feature selection. In SFLLF, the Lévy flight is included to avoid premature convergence of shuffled frog leaping (SFL) algorithm. The SI techniques such as particle swarm optimization (PSO), cuckoo search (CS), SFL, and SFLLF are used for feature selection which identifies informative genes for classification. The k-nearest neighbour (k-NN) technique is used to classify the samples. The proposed work is applied on 10 different benchmark datasets and examined with SI techniques. The experimental results show that the results obtained from k-NN classifier through SFLLF feature selection method outperform PSO, CS, and SFL.

  3. A comparative analysis of swarm intelligence techniques for feature selection in cancer classification.

    Science.gov (United States)

    Gunavathi, Chellamuthu; Premalatha, Kandasamy

    2014-01-01

    Feature selection in cancer classification is a central area of research in the field of bioinformatics and used to select the informative genes from thousands of genes of the microarray. The genes are ranked based on T-statistics, signal-to-noise ratio (SNR), and F-test values. The swarm intelligence (SI) technique finds the informative genes from the top-m ranked genes. These selected genes are used for classification. In this paper the shuffled frog leaping with Lévy flight (SFLLF) is proposed for feature selection. In SFLLF, the Lévy flight is included to avoid premature convergence of shuffled frog leaping (SFL) algorithm. The SI techniques such as particle swarm optimization (PSO), cuckoo search (CS), SFL, and SFLLF are used for feature selection which identifies informative genes for classification. The k-nearest neighbour (k-NN) technique is used to classify the samples. The proposed work is applied on 10 different benchmark datasets and examined with SI techniques. The experimental results show that the results obtained from k-NN classifier through SFLLF feature selection method outperform PSO, CS, and SFL. PMID:25157377
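    The ranking stage that precedes the swarm search can be sketched as follows. The signal-to-noise ratio used here, (mean_A - mean_B) / (std_A + std_B), is the definition commonly used in microarray studies and is an assumption, since the abstract does not spell out the formula. The function names and the dictionary layout are likewise hypothetical, and the swarm intelligence search itself is not shown.

```python
from statistics import mean, stdev


def snr(values_a, values_b):
    """Signal-to-noise ratio of one gene between two sample classes."""
    denom = stdev(values_a) + stdev(values_b)
    if denom == 0:
        # A gene that is constant in both classes carries no ranking signal.
        return 0.0
    return (mean(values_a) - mean(values_b)) / denom


def top_m_genes(expr_a, expr_b, m):
    """Return the m gene names with the largest absolute SNR.

    expr_a and expr_b map gene name -> list of expression values
    for the samples of class A and class B respectively.
    """
    scores = {gene: abs(snr(expr_a[gene], expr_b[gene])) for gene in expr_a}
    return sorted(scores, key=scores.get, reverse=True)[:m]
```

    The top-m list would then serve as the search space for PSO, CS, SFL or SFLLF, with a k-NN classifier scoring each candidate gene subset.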

  4. TENSOR MODELING BASED FOR AIRBORNE LiDAR DATA CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    N. Li

    2016-06-01

    Full Text Available Feature selection and description is a key factor in classification of Earth observation data. In this paper a classification method based on tensor decomposition is proposed. First, multiple features are extracted from raw LiDAR point cloud, and raster LiDAR images are derived by accumulating features or the “raw” data attributes. Then, the feature rasters of LiDAR data are stored as a tensor, and tensor decomposition is used to select component features. This tensor representation could keep the initial spatial structure and ensure the consideration of the neighborhood. Based on a small number of component features, a k-nearest-neighbor classification is applied.

  5. Tensor Modeling Based for Airborne LiDAR Data Classification

    Science.gov (United States)

    Li, N.; Liu, C.; Pfeifer, N.; Yin, J. F.; Liao, Z. Y.; Zhou, Y.

    2016-06-01

    Feature selection and description is a key factor in classification of Earth observation data. In this paper a classification method based on tensor decomposition is proposed. First, multiple features are extracted from raw LiDAR point cloud, and raster LiDAR images are derived by accumulating features or the "raw" data attributes. Then, the feature rasters of LiDAR data are stored as a tensor, and tensor decomposition is used to select component features. This tensor representation could keep the initial spatial structure and ensure the consideration of the neighborhood. Based on a small number of component features, a k-nearest-neighbor classification is applied.

  6. Classification and Diagnostic Output Prediction of Cancer Using Gene Expression Profiling and Supervised Machine Learning Algorithms

    DEFF Research Database (Denmark)

    Yoo, C.; Gernaey, Krist

    2008-01-01

    the subtype classification 3 data sets solely on the basis of molecular-level monitoring. Compared to unsupervised clustering, the supervised method performed better for discriminating between cancer types and cancer subtypes for the leukemia data set. The performance of the proposed method, using only...

  7. Comparison of supervised classification methods for protein profiling in cancer diagnosis.

    Science.gov (United States)

    Dossat, Nadège; Mangé, Alain; Solassol, Jérôme; Jacot, William; Lhermitte, Ludovic; Maudelonde, Thierry; Daurès, Jean-Pierre; Molinari, Nicolas

    2007-01-01

    A key challenge in clinical proteomics of cancer is the identification of biomarkers that could allow detection, diagnosis and prognosis of the diseases. Recent advances in mass spectrometry and proteomic instrumentation offer a unique chance to rapidly identify these markers. These advances pose considerable challenges, similar to those created by microarray-based investigation, for the discovery of patterns of markers from high-dimensional data, specific to each pathologic state (e.g. normal vs cancer). We propose a three-step strategy to select important markers from high-dimensional mass spectrometry data using surface enhanced laser desorption/ionization (SELDI) technology. The first two steps are the selection of the most discriminating biomarkers with a construction of different classifiers. Finally, we compare and validate their performance and robustness using different supervised classification methods such as Support Vector Machine, Linear Discriminant Analysis, Quadratic Discriminant Analysis, Neural Networks, Classification Trees and Boosting Trees. We show that the proposed method is suitable for analysing high-throughput proteomics data and that the combination of logistic regression and Linear Discriminant Analysis outperforms the other methods tested. PMID:19455249

  8. Rule Based Classification to Detect Malnutrition in Children

    Directory of Open Access Journals (Sweden)

    Xu Dezhi

    2011-01-01

    Full Text Available Data mining is used in a wide range of fields, and rule-based classification is one of its sub-areas. This paper describes how rule-based classification is used along with agent technology to detect malnutrition in children. The proposed system is implemented as an e-government system. The paper also investigates whether there is a connection between the number of rules used and the optimality of the final decision.

  9. Iris Image Classification Based on Hierarchical Visual Codebook.

    Science.gov (United States)

    Zhenan Sun; Hui Zhang; Tieniu Tan; Jianyu Wang

    2014-06-01

    Iris recognition as a reliable method for personal identification has been well-studied with the objective to assign the class label of each iris image to a unique subject. In contrast, iris image classification aims to classify an iris image to an application specific category, e.g., iris liveness detection (classification of genuine and fake iris images), race classification (e.g., classification of iris images of Asian and non-Asian subjects), coarse-to-fine iris identification (classification of all iris images in the central database into multiple categories). This paper proposes a general framework for iris image classification based on texture analysis. A novel texture pattern representation method called Hierarchical Visual Codebook (HVC) is proposed to encode the texture primitives of iris images. The proposed HVC method is an integration of two existing Bag-of-Words models, namely Vocabulary Tree (VT), and Locality-constrained Linear Coding (LLC). The HVC adopts a coarse-to-fine visual coding strategy and takes advantage of both VT and LLC for accurate and sparse representation of iris texture. Extensive experimental results demonstrate that the proposed iris image classification method achieves state-of-the-art performance for iris liveness detection, race classification, and coarse-to-fine iris identification. A comprehensive fake iris image database simulating four types of iris spoof attacks is developed as the benchmark for research of iris liveness detection. PMID:26353275

  10. A Classification-based Review Recommender

    Science.gov (United States)

    O'Mahony, Michael P.; Smyth, Barry

    Many online stores encourage their users to submit product/service reviews in order to guide future purchasing decisions. These reviews are often listed alongside product recommendations but, to date, limited attention has been paid as to how best to present these reviews to the end-user. In this paper, we describe a supervised classification approach that is designed to identify and recommend the most helpful product reviews. Using the TripAdvisor service as a case study, we compare the performance of several classification techniques using a range of features derived from hotel reviews. We then describe how these classifiers can be used as the basis for a practical recommender that automatically suggests the most helpful contrasting reviews to end-users. We present an empirical evaluation which shows that our approach achieves a statistically significant improvement over alternative review ranking schemes.

  11. A classification-based review recommender

    OpenAIRE

    O'Mahony, Michael P.; Smyth, Barry

    2010-01-01

    Many online stores encourage their users to submit product or service reviews in order to guide future purchasing decisions. These reviews are often listed alongside product recommendations but, to date, limited attention has been paid as to how best to present these reviews to the end-user. In this paper, we describe a supervised classification approach that is designed to identify and recommend the most helpful product reviews. Using the TripAdvisor service as a case study, we compare...

  12. NONSUBSAMPLED CONTOURLET TRANSFORM BASED CLASSIFICATION OF MICROCALCIFICATION IN DIGITAL MAMMOGRAMS

    Directory of Open Access Journals (Sweden)

    J. S. Leena Jasmine

    2013-01-01

    Full Text Available Mammography is the best available radiographic method to detect breast cancer in the early stage. However, detecting microcalcification clusters in the early stage is a tough task for the radiologist. Herein we present a novel approach for classifying microcalcification in digital mammograms using Nonsubsampled Contourlet Transform (NSCT) and Support Vector Machine (SVM). The classification of microcalcification is achieved by extracting the microcalcification features from the Contourlet coefficients of the image, and the outcomes are used as an input to the SVM for classification. The system classifies the mammogram images as normal or abnormal and the abnormal severity as benign or malignant. The evaluation of the system is carried out using the Mammography Image Analysis Society (MIAS) database. The experimental results show that the proposed method provides an improved classification rate.

  13. Intelligent Hybrid Cluster Based Classification Algorithm for Social Network Analysis

    Directory of Open Access Journals (Sweden)

    S. Muthurajkumar

    2014-05-01

    Full Text Available In this paper, we propose a hybrid clustering-based classification algorithm, built on a mean-based approach, to mine ordered sequences (paths) from weblog data in order to perform social network analysis. In the proposed system for social pattern analysis, sequences of human activities are typically analyzed by switching behaviors, which are likely to produce overlapping clusters. A robust Modified Boosting algorithm is combined with the hybrid clustering-based classification to cluster the data. This work helps connect the aggregated features from the network data with the traditional indices used in social network analysis. Experimental results show that the proposed algorithm improves the decision results from data clustering when combined with the proposed classification algorithm, and that it provides better classification accuracy when tested on the weblog dataset. In addition, the algorithm improves predictive performance, especially for multiclass datasets, which can increase accuracy.

  14. Hybrid Support Vector Machines-Based Multi-fault Classification

    Institute of Scientific and Technical Information of China (English)

    GAO Guo-hua; ZHANG Yong-zhong; ZHU Yu; DUAN Guang-huang

    2007-01-01

    Support Vector Machines (SVM) are a new general machine-learning tool based on the structural risk minimization principle. This characteristic is very significant for fault diagnostics when the number of fault samples is limited. Because SVM theory was originally designed for two-class classification, a hybrid SVM scheme is proposed in this paper for multi-fault classification of rotating machinery. Two SVM strategies, 1-v-1 (one versus one) and 1-v-r (one versus rest), are adopted at different classification levels. At the parallel classification level, using the 1-v-1 strategy, the fault features extracted by various signal analysis methods are passed to multiple parallel SVMs and local classification results are obtained. At the serial classification level, these local results are fused by one serial SVM based on the 1-v-r strategy. The hybrid SVM scheme not only generalizes the performance of single binary SVMs but also improves the precision and reliability of the fault classification results. Actual test results show the availability and suitability of this new method.
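
    The two multi-class strategies named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' scheme: the fault class names, the one-dimensional feature and the distance-based deciders are hypothetical stand-ins for trained binary SVMs.

```python
from collections import Counter
from itertools import combinations

# Hypothetical fault classes and 1-D "feature centers"; illustration only.
CENTERS = {"normal": 0.0, "imbalance": 1.0, "misalignment": 2.0}
CLASSES = list(CENTERS)


def make_pairwise(a, b):
    """Stand-in for a trained binary SVM that separates class a from class b."""
    return lambda x: a if abs(x - CENTERS[a]) <= abs(x - CENTERS[b]) else b


# One binary decider per unordered pair of classes (the 1-v-1 strategy).
PAIRWISE = {(a, b): make_pairwise(a, b) for a, b in combinations(CLASSES, 2)}


def one_vs_one_predict(x):
    """1-v-1: every pairwise decider votes; the class with the most votes wins."""
    votes = Counter(decide(x) for decide in PAIRWISE.values())
    return votes.most_common(1)[0][0]


def one_vs_rest_predict(x):
    """1-v-r: one score per class ('this class vs the rest'); highest score wins."""
    return max(CLASSES, key=lambda c: -abs(x - CENTERS[c]))


print(one_vs_one_predict(0.9), one_vs_rest_predict(0.9))
```

    With K classes, 1-v-1 needs K(K-1)/2 binary deciders while 1-v-r needs only K, which is one reason the two strategies can suit different levels of a hierarchy.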

  15. Words semantic orientation classification based on HowNet

    Institute of Scientific and Technical Information of China (English)

    LI Dun; MA Yong-tao; GUO Jian-li

    2009-01-01

    Based on the text orientation classification, a new measurement approach to semantic orientation of words was proposed. According to the integrated and detailed definition of words in HowNet, seed sets including the words with intense orientations were built up. The orientation similarity between the seed words and the given word was then calculated using the sentiment weight priority to recognize the semantic orientation of common words. Finally, the words' semantic orientation and the context were combined to recognize the given words' orientation. The experiments show that the measurement approach achieves better results for common words' orientation classification and contributes particularly to the text orientation classification of large granularities.

  16. Support vector classification algorithm based on variable parameter linear programming

    Institute of Scientific and Technical Information of China (English)

    Xiao Jianhua; Lin Jian

    2007-01-01

    To solve the problems of SVM in dealing with large sample size and asymmetric distributed samples, a support vector classification algorithm based on variable parameter linear programming is proposed. In the proposed algorithm, linear programming is employed to solve the optimization problem of classification to decrease the computation time and to reduce its complexity when compared with the original model. The adjusted punishment parameter greatly reduced the classification error resulting from asymmetric distributed samples and the detailed procedure of the proposed algorithm is given. An experiment is conducted to verify whether the proposed algorithm is suitable for asymmetric distributed samples.

  17. Radar Target Classification using Recursive Knowledge-Based Methods

    DEFF Research Database (Denmark)

    Jochumsen, Lars Wurtz

    The topic of this thesis is target classification of radar tracks from a 2D mechanically scanning coastal surveillance radar. The measurements provided by the radar are position data and therefore the classification is mainly based on kinematic data, which is deduced from the position. The target...... been terminated. Therefore, an update of the classification results must be made for each measurement of the target. The data for this work are collected throughout the PhD and are both collected from radars and other sensors such as GPS....

  18. Identifying Cancer Biomarkers Via Node Classification within a Mapreduce Framework

    Directory of Open Access Journals (Sweden)

    Taysir Hassan A. Soliman

    2015-12-01

    Full Text Available Big data pose new research challenges in the life sciences domain because of their variety, volume, veracity, velocity, and value. Predicting gene biomarkers is one of the vital research issues in bioinformatics, where microarray gene expression and network-based methods can be used. These datasets are very large, which causes main-memory problems. In this paper, a Random Committee Node Classifier (RCNC) algorithm is proposed for identifying cancer biomarkers, based on microarray gene expression data and Protein-Protein Interaction (PPI) data. Data are enriched from other public databases, such as IntACT, UniProt and Gene Ontology (GO). Cancer biomarkers are identified when applied to different datasets with an accuracy rate of 99.16%, 99.96% precision, 99.24% recall, 99.16% F1-measure and 99.6 ROC. To speed up performance, the algorithm is run within a MapReduce framework, where the RCNC MapReduce algorithm is much faster than the sequential RCNC algorithm on large datasets.
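
    For readers unfamiliar with the metrics quoted above, the following sketch shows how accuracy, precision, recall and F1 are conventionally derived from a binary confusion matrix. The counts are made up for illustration and are not taken from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Illustrative counts only (hypothetical, not from the paper).
metrics = binary_metrics(tp=8, fp=2, tn=9, fn=1)
print(metrics)
```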

  19. Analysis of Kernel Approach in Fuzzy-Based Image Classifications

    Directory of Open Access Journals (Sweden)

    Mragank Singhal

    2013-03-01

    Full Text Available This paper presents a framework for the kernel approach in fuzzy-based image classification in remote sensing. The goal of image classification is to separate images according to their visual content into two or more disjoint classes. Fuzzy logic is a relatively young theory. Its major advantage is that it allows problems to be described naturally, in linguistic terms, rather than in terms of relationships between precise numerical values. This paper describes how remote sensing data with uncertainty are handled by fuzzy-based classification using the kernel approach to generate land use/land cover maps. The introduction of fuzzification using the kernel approach provides the basis for developing more robust approaches to the remote sensing classification problem. The kernel explicitly defines a similarity measure between two samples and implicitly represents the mapping of the input space to the feature space.
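
    As an illustration of a kernel acting as a similarity measure between two samples, here is a Gaussian (RBF) kernel. The abstract does not say which kernels the paper uses, so the choice of RBF, the `gamma` value and the sample vectors are assumptions.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - y||^2).

    Returns 1.0 for identical samples and decays towards 0 as they move apart.
    The feature-space mapping is implicit; only the similarity is computed.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Hypothetical pixel feature vectors (e.g. reflectance in three bands).
pixel_a = (0.20, 0.35, 0.50)
pixel_b = (0.22, 0.33, 0.52)
pixel_c = (0.80, 0.10, 0.05)

print(rbf_kernel(pixel_a, pixel_b), rbf_kernel(pixel_a, pixel_c))
```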

  20. A Syntactic Classification based Web Page Ranking Algorithm

    CERN Document Server

    Mukhopadhyay, Debajyoti; Kim, Young-Chon

    2011-01-01

    Existing search engines sometimes give unsatisfactory search results because the results are not categorized. If there were some means of knowing the user's preference about the search results and ranking pages according to that preference, the results would be more useful and accurate for the user. In the present paper, a web page ranking algorithm is proposed based on syntactic classification of web pages. Syntactic classification is not concerned with the meaning of the content of a web page. The proposed approach mainly consists of three steps: select some properties of web pages based on the user's demand, measure them, and give a different weightage to each property during ranking for different types of pages. The existence of syntactic classification is supported by running the fuzzy c-means algorithm and neural network classification on a set of web pages. The change in ranking for different page types under the same query string is also demonstrated.
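
    The three steps listed in this abstract (select properties, measure them, weight them per page type) can be sketched as a weighted score. The property names, page types, weights and URLs below are hypothetical; the paper's actual properties and weighting are not given in the abstract.

```python
def rank_pages(pages, weights_by_type, wanted_type):
    """Rank pages by a weighted sum of measured properties.

    The weight given to each property depends on the page type the user wants.
    """
    weights = weights_by_type[wanted_type]

    def score(page):
        return sum(weights.get(name, 0.0) * value
                   for name, value in page["properties"].items())

    return sorted(pages, key=score, reverse=True)


# Hypothetical measured properties, normalized to [0, 1].
pages = [
    {"url": "a.example", "properties": {"links": 0.9, "text": 0.1}},
    {"url": "b.example", "properties": {"links": 0.2, "text": 0.8}},
]
# Hypothetical per-type weights: index pages favor links, articles favor text.
weights_by_type = {
    "index": {"links": 0.8, "text": 0.2},
    "article": {"links": 0.2, "text": 0.8},
}

for page_type in weights_by_type:
    print(page_type, [p["url"] for p in rank_pages(pages, weights_by_type, page_type)])
```

    The same two pages come out in opposite orders for the two page types, which mirrors the abstract's point that ranking changes with page type for an identical query.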

  1. Feature Extraction based Face Recognition, Gender and Age Classification

    Directory of Open Access Journals (Sweden)

    Venugopal K R

    2010-01-01

    Full Text Available A face recognition system with large training sets for personal identification normally attains good accuracy. In this paper, we propose the Feature Extraction based Face Recognition, Gender and Age Classification (FEBFRGAC) algorithm, which needs only small training sets and yields good results even with one image per person. The process involves three stages: Pre-processing, Feature Extraction and Classification. The geometric features of facial images, such as eyes, nose and mouth, are located using the Canny edge operator, and face recognition is performed. Based on texture and shape information, gender and age classification are done using Posteriori Class Probability and an Artificial Neural Network respectively. It is observed that face recognition accuracy is 100%, and gender and age classification accuracy is around 98% and 94% respectively.

  2. A NOVEL RULE-BASED FINGERPRINT CLASSIFICATION APPROACH

    Directory of Open Access Journals (Sweden)

    Faezeh Mirzaei

    2014-03-01

    Full Text Available Fingerprint classification is an important phase in increasing the speed of a fingerprint verification system and narrowing down the search of a fingerprint database. Fingerprint verification is still a challenging problem due to poor quality images and the need for faster response. Classification gets even harder when just one core has been detected in the input image. This paper proposes a new classification approach which includes images with one core. The algorithm extracts singular points (cores and deltas) from the input image and performs classification based on the number, locations and surrounding area of the detected singular points. The classifier is rule-based, and the rules are generated independently of any given data set. Moreover, the shortcomings of a related paper are reported in detail. The experimental results and comparisons on the FVC2002 database show the effectiveness and efficiency of the proposed method.

  3. Feature Extraction based Face Recognition, Gender and Age Classification

    OpenAIRE

    Venugopal K R; L M Patnaik; Ramesha K; K B Raja

    2010-01-01

    A face recognition system with large training sets for personal identification normally attains good accuracy. In this paper, we propose the Feature Extraction based Face Recognition, Gender and Age Classification (FEBFRGAC) algorithm, which needs only small training sets and yields good results even with one image per person. The process involves three stages: Pre-processing, Feature Extraction and Classification. The geometric features of facial images like eyes, nose, mouth etc. are loc...

  4. GAUSSIAN MIXTURE MODEL BASED CLASSIFICATION OF MICROCALCIFICATION IN MAMMOGRAMS USING DYADIC WAVELET TRANSFORM

    Directory of Open Access Journals (Sweden)

    Suman Mishra

    2013-01-01

    Full Text Available Breast cancer is a serious health issue for women worldwide. Cancer detected at an early stage has a higher probability of being cured, whereas at advanced stages the chances of survival are bleak. Screening programs aid in detecting potential breast cancer at early stages of the disease. Among the various screening programs, mammography is the proven standard for screening breast cancer, because even small tumors can be detected on mammograms. In this study, a novel feature extraction technique based on the dyadic wavelet transform is proposed for classification of microcalcifications in digital mammograms. In the feature extraction module, the high-frequency sub-bands obtained from the dyadic wavelet decomposition are used to form new sub-bands. From the newly constructed sub-bands, features such as energy and entropy are computed. In the classification module, the extracted features are fed into a Gaussian Mixture Model (GMM) classifier, which reports the severity of a given microcalcification as benign or malignant. A classification accuracy of 95.5% is obtained using the proposed approach on the DDSM database.
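
    The energy and entropy features mentioned above can be sketched as follows. The abstract does not give the exact definitions, so this uses one common convention (mean squared coefficient for energy, Shannon entropy of the normalized squared coefficients); the function names and sample coefficients are illustrative.

```python
import math


def subband_energy(coeffs):
    """Mean of squared wavelet coefficients in a sub-band (one common definition)."""
    return sum(c * c for c in coeffs) / len(coeffs)


def subband_entropy(coeffs):
    """Shannon entropy (bits) of the normalized squared coefficients.

    Low when the energy is concentrated in few coefficients, high when spread out.
    """
    total = sum(c * c for c in coeffs)
    if total == 0:
        return 0.0
    probs = [c * c / total for c in coeffs if c != 0]
    return -sum(p * math.log2(p) for p in probs)


# Hypothetical flattened sub-band coefficients.
spread = [1.0, -1.0, 1.0, -1.0]
peaked = [2.0, 0.0, 0.0, 0.0]
print(subband_energy(spread), subband_entropy(spread))
print(subband_energy(peaked), subband_entropy(peaked))
```

    Both sample sub-bands have the same energy but different entropy, which is why the two features are complementary inputs to a classifier.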

  5. Classification of LiDAR Data with Point Based Classification Methods

    Science.gov (United States)

    Yastikli, N.; Cetin, Z.

    2016-06-01

    LiDAR is one of the most effective systems for three-dimensional (3D) data collection over wide areas. Airborne LiDAR data, with its increasing point density and accuracy, is now used frequently in applications such as object extraction, 3D modelling, change detection and map revision. Classification of the LiDAR points is the first step of the LiDAR data processing chain and should be handled properly, since applications such as 3D city modelling, building extraction and DEM generation directly use the classified point clouds. Different classification methods appear in recent research, and most studies work with a gridded LiDAR point cloud. In grid-based processing of LiDAR data, the loss of characteristic points, especially for vegetation and buildings, and the loss of height accuracy during the interpolation stage are inevitable. A possible solution is to classify the raw point cloud data, avoiding the data and accuracy loss of the gridding process. In this study, point-based classification of the LiDAR point cloud is investigated to obtain more accurate classes. Automatic point-based approaches, based on hierarchical rules, are proposed to obtain ground, building and vegetation classes from the raw LiDAR point cloud data. In the proposed approaches, every single LiDAR point is analyzed according to features such as height and multi-return, and is then automatically assigned to the class to which it belongs. Using the un-gridded point cloud in the proposed point-based classification process helped determine more realistic rule sets. Detailed parameter analyses were performed to obtain the most appropriate parameters in the rule sets to achieve accurate classes. The hierarchical rule sets were created for the proposed Approach 1 (using selected spatial-based and echo-based features) and Approach 2 (using only selected spatial-based features

  6. Comparison of Computational Algorithms for the Classification of Liver Cancer using SELDI Mass Spectrometry: A Case Study

    Directory of Open Access Journals (Sweden)

    Robert J Hickey

    2007-01-01

    Full Text Available Introduction: As an alternative to DNA microarrays, mass spectrometry based analysis of proteomic patterns has shown great potential in cancer diagnosis. The ultimate application of this technique in clinical settings relies on the advancement of the technology itself and the maturity of the computational tools used to analyze the data. A number of computational algorithms constructed on different principles are available for the classification of disease status based on proteomic patterns. Nevertheless, few studies have addressed the difference in the performance of these approaches. In this report, we describe a comparative case study on the classification accuracy of hepatocellular carcinoma based on the serum proteomic pattern generated from a Surface Enhanced Laser Desorption/Ionization (SELDI) mass spectrometer. Methods: Nine supervised classification algorithms are implemented in R software and compared for classification accuracy. Results: We found that the support vector machine with a radial function is preferable as a tool for classification of hepatocellular carcinoma using features in SELDI mass spectra. Among the rest of the methods, random forest and prediction analysis of microarrays have better performance. A permutation-based technique reveals that the support vector machine with a radial function seems intrinsically superior in learning from the training data, since it has a lower prediction error than the others when there is essentially no differential signal. On the other hand, the performance of random forest and prediction analysis of microarrays relies on their capability of capturing signals with substantial differentiation between groups. Conclusions: Our finding is similar to that of a previous study, in which classification methods based on Matrix Assisted Laser Desorption/Ionization (MALDI) mass spectrometry were compared for the prediction accuracy of ovarian cancer. The support vector machine, random forest and prediction

  7. Ensemble polarimetric SAR image classification based on contextual sparse representation

    Science.gov (United States)

    Zhang, Lamei; Wang, Xiao; Zou, Bin; Qiao, Zhijun

    2016-05-01

    Polarimetric SAR (PolSAR) image interpretation has become an important research topic, in which constructing a reasonable and effective image classification technique is of key importance. Sparse representation describes the data using the most succinct sparse atoms of an over-complete dictionary, and its advantages have been confirmed in the field of PolSAR classification. However, like any ordinary classifier, it is not perfect in every respect. Ensemble learning is therefore introduced to address this issue: a plurality of different learners is trained, and the individual learners are combined to obtain more accurate learning results. This paper accordingly presents a polarimetric SAR image classification method based on ensemble learning of sparse representations to achieve the optimal classification.

  8. Classification approach based on association rules mining for unbalanced data

    CERN Document Server

    Ndour, Cheikh

    2012-01-01

    This paper deals with supervised classification when the response variable is binary and its class distribution is unbalanced. In such a situation, it is not possible to build a powerful classifier by using standard methods such as logistic regression, classification trees, discriminant analysis, etc. To overcome the shortcoming of these methods, which provide classifiers with low sensitivity, we tackle the classification problem through an approach based on association rule learning, because this approach has the advantage of allowing the identification of patterns that are well correlated with the target class. Association rule learning is a well-known method in the area of data mining. It is used when dealing with large databases for unsupervised discovery of local patterns that express hidden relationships between variables. In considering association rules from a supervised learning point of view, a relevant set of weak classifiers is obtained from which one derives a classification rule...

  9. ELABORATION OF A VECTOR BASED SEMANTIC CLASSIFICATION OVER THE WORDS AND NOTIONS OF THE NATURAL LANGUAGE

    OpenAIRE

    Safonov, K.; Lichargin, D.

    2009-01-01

    The problem of vector-based semantic classification over the words and notions of the natural language is discussed. A set of generative grammar rules is offered for generating the semantic classification vector. Examples of the classification application and a theorem of optional formal classification incompleteness are presented. The principles of assigning the meaningful phrases functions over the classification word groups are analyzed.

  10. 5th International ACC Symposium: Classification of Adrenocortical Cancers from Pathology to Integrated Genomics: Real Advances or Lost in Translation?

    Science.gov (United States)

    de Krijger, Ronald E; Bertherat, Jérôme

    2016-02-01

    For the clinician, despite its rarity, adrenocortical cancer is a heterogeneous tumor both in terms of steroid excess and tumor evolution. For patient management, it is crucial to have an accurate view of this heterogeneity in order to use a correct tumor classification. Pathology is the best way to classify operated adrenocortical tumors: to recognize their adrenocortical nature and to differentiate benign from malignant tumors. Among malignant tumors, pathology also aims at prognosis assessment. Although progress has been made in prognosis assessment, there is still a need for improvement. Recent studies have established the value of Ki67 for adrenocortical cancer (ACC) prognostication, aiming also at standardization to reduce variability. The use of genomics to study adrenocortical tumors gives new insight into their pathogenesis and molecular classification. Genomic studies of ACC now give a clear description of the mRNA (transcriptome) and miRNA expression profiles, as well as chromosomal and methylation alterations. Exome sequencing has also firmly established the list of the main ACC driver genes. Interestingly, genomic studies of ACC have also revealed subtypes of malignant tumors with different patterns of molecular alterations, associated with different outcomes. This leads to a new vision of adrenocortical tumor classification based on molecular analysis. These molecular classifications also agree with the results of pathological analysis. This opens new perspectives on the development and use of various molecular tools to classify ACC, along with pathological analysis, and to guide patient management in the era of precision medicine. PMID:26676358

  11. Super pixel density based clustering automatic image classification method

    Science.gov (United States)

    Xu, Mingxing; Zhang, Chuan; Zhang, Tianxu

    2015-12-01

    Image classification is an important means of image segmentation and data mining, and achieving rapid automated image classification has been a focus of research. This paper presents a method for automatic image classification and outlier identification based on a density-based cluster-center algorithm applied to superpixels. Pixel location coordinates and gray values are used to compute density and distance, from which automatic image classification and outlier extraction are achieved. Because the computational complexity increases dramatically with the number of pixels, the image is first preprocessed into a small number of superpixel sub-blocks before the density and distance calculations. A normalized density and distance discrimination rule is designed to achieve automatic classification and cluster-center selection, whereby the image is automatically classified and outliers are identified. Extensive experiments show that the method requires no human intervention, runs faster than the density clustering algorithm, and performs automated image classification and outlier extraction effectively.
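
    The density and distance quantities described above can be sketched in the style of density-peak clustering. Whether this matches the paper's exact formulation is an assumption: the cutoff value, the function name and the sample points are hypothetical, and the points here are 2-D for brevity (the abstract uses pixel coordinates plus gray value).

```python
import math


def density_and_distance(points, cutoff):
    """For each point return (rho, delta).

    rho:   number of other points closer than `cutoff` (local density).
    delta: distance to the nearest point of higher density; for a point with
           no denser neighbor, the distance to the farthest point instead.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < cutoff)
           for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta


# Hypothetical superpixel centers: one compact group plus one isolated point.
points = [(0, 0), (0, 1), (1, 0), (2, 0), (10, 10)]
rho, delta = density_and_distance(points, cutoff=1.5)
for p, r, d in zip(points, rho, delta):
    print(p, r, round(d, 2))
```

    In this convention a cluster center has both high density and high distance, while an outlier has low density but high distance; a discrimination rule on the two values, as the abstract describes, can then separate them.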

  12. A new circulation type classification based upon Lagrangian air trajectories

    Directory of Open Access Journals (Sweden)

    Alexandre M. Ramos

    2014-10-01

    Full Text Available A new classification method of the large-scale circulation characteristic for a specific target area (NW Iberian Peninsula) is presented, based on the analysis of 90-h backward trajectories arriving in this area calculated with the 3-D Lagrangian particle dispersion model FLEXPART. A cluster analysis is applied to separate the backward trajectories in up to five representative air streams for each day. Specific measures are then used to characterise the distinct air streams (e.g., curvature of the trajectories, cyclonic or anticyclonic flow, moisture evolution, origin and length of the trajectories). The robustness of the presented method is demonstrated in comparison with the Eulerian Lamb weather type classification. A case study of the 2003 heatwave is discussed in terms of the new Lagrangian circulation and the Lamb weather type classifications. It is shown that the new classification method adds valuable information about the pertinent meteorological conditions, which are missing in an Eulerian approach. The new method is climatologically evaluated for the five-year time period from December 1999 to November 2004. The ability of the method to capture the inter-seasonal circulation variability in the target region is shown. Furthermore, the multi-dimensional character of the classification is shortly discussed, in particular with respect to inter-seasonal differences. Finally, the relationship between the new Lagrangian classification and the precipitation in the target area is studied.

  13. 3D Land Cover Classification Based on Multispectral LIDAR Point Clouds

    Science.gov (United States)

    Zou, Xiaoliang; Zhao, Guihua; Li, Jonathan; Yang, Yuanxi; Fang, Yong

    2016-06-01

    A multispectral Lidar system can emit simultaneous laser pulses at different wavelengths. The reflected multispectral energy is captured by a receiver of the sensor, and the return signal is recorded together with the position and orientation information of the sensor. The recorded data are combined with GNSS/IMU data in post-processing to form high-density multispectral 3D point clouds. As the first commercial multispectral airborne Lidar sensor, the Optech Titan system is capable of collecting point cloud data from all three channels: 532 nm visible (Green), 1064 nm near infrared (NIR) and 1550 nm intermediate infrared (IR). It has become a new source of data for 3D land cover classification. The paper presents an Object Based Image Analysis (OBIA) approach that uses only multispectral Lidar point cloud datasets for 3D land cover classification. The approach consists of three steps. Firstly, multispectral intensity images are segmented into image objects on the basis of multi-resolution segmentation integrating different scale parameters. Secondly, intensity objects are classified into nine categories by using customized classification index features and a combination of the multispectral reflectance with the vertical distribution of object features. Finally, accuracy assessment is conducted by comparing random reference sample points from Google imagery tiles with the classification results. The classification results show higher overall accuracy for most of the land cover types. Over 90% overall accuracy is achieved using multispectral Lidar point clouds for 3D land cover classification.

  14. A Brief Summary of Dictionary Learning Based Approach for Classification

    CERN Document Server

    Shu, Kong

    2012-01-01

    This note presents some representative methods which are based on dictionary learning (DL) for classification. We do not review the sophisticated methods or frameworks that involve DL for classification, such as online DL and spatial pyramid matching (SPM), but rather concentrate on the direct DL-based classification methods. Here, the so-called "direct DL-based method" is an approach that deals directly with the DL framework by adding some meaningful penalty terms. By listing some representative methods, we can roughly divide them into two categories, i.e. (1) directly making the dictionary discriminative and (2) forcing the sparse coefficients to be discriminative so as to increase the discrimination power of the dictionary. From this taxonomy, we can expect some extensions of these methods as future research.

  15. Finding Combination of Features from Promoter Regions for Ovarian Cancer-related Gene Group Classification

    KAUST Repository

    Olayan, Rawan S.

    2012-12-01

    In classification problems, it is always important to use the suitable combination of features that will be employed by classifiers. Generating the right combination of features usually results in good classifiers. In the situation when the problem is not well understood, data items are usually described by many features in the hope that some of these may be the relevant or most relevant ones. In this study, we focus on one such problem related to genes implicated in ovarian cancer (OC). We try to recognize two important OC-related gene groups: oncogenes, which support the development and progression of OC, and oncosuppressors, which oppose such tendencies. For this, we use the properties of promoters of these genes. We identified potential “regulatory features” that characterize OC-related oncogenes and oncosuppressors promoters. In our study, we used 211 oncogenes and 39 oncosuppressors. For these, we identified 538 characteristic sequence motifs from their promoters. Promoters are annotated by these motifs and derived feature vectors used to develop classification models. We made a comparison of a number of classification models in their ability to distinguish oncogenes from oncosuppressors. Based on 10-fold cross-validation, the resultant model was able to separate the two classes with sensitivity of 96% and specificity of 100% with the complete set of features. Moreover, we developed another recognition model where we attempted to distinguish oncogenes and oncosuppressors as one group from other OC-related genes. That model achieved accuracy of 82%. We believe that the results of this study will help in discovering other OC-related oncogenes and oncosuppressors not identified as yet.

  16. Basal cytokeratins and their relationship to the cellular origin and functional classification of breast cancer

    OpenAIRE

    Gusterson, Barry A.; Ross, Douglas T.; Heath, Victoria J; Stein, Torsten

    2005-01-01

    Recent publications have classified breast cancers on the basis of expression of cytokeratin-5 and -17 at the RNA and protein levels, and demonstrated the importance of these markers in defining sporadic tumours with bad prognosis and an association with BRCA1-related breast cancers. These important observations using different technology platforms produce a new functional classification of breast carcinoma. However, it is important in developing hypotheses about the pathogenesis of this tumo...

  17. An Efficient Semantic Model For Concept Based Clustering And Classification

    Directory of Open Access Journals (Sweden)

    SaiSindhu Bandaru

    2012-03-01

    Full Text Available In text mining techniques, basic measures such as the frequency of a term (word or phrase) are usually computed to determine the importance of the term in the document. But with statistical analysis alone, the original semantics of the term may not be captured exactly. To overcome this problem, a new framework has been introduced which relies on a concept-based model and a synonym-based approach. The proposed model can efficiently find significant matching and related concepts between documents according to the concept-based and synonym-based approaches. Large sets of experiments using the proposed model on different data sets in clustering and classification are conducted. Experimental results demonstrate a substantial enhancement of clustering quality using sentence-based, document-based, corpus-based and combined-approach concept analysis. A new similarity measure has been proposed to find the similarity between a document and the existing clusters, which can be used to classify the document against the existing clusters.

  18. High dimensional multiclass classification with applications to cancer diagnosis

    DEFF Research Database (Denmark)

    Vincent, Martin

    Probabilistic classifiers are introduced and it is shown that the only regular linear probabilistic classifier with convex risk is multinomial regression. Penalized empirical risk minimization is introduced and used to construct supervised learning methods for probabilistic classifiers. A sparse...... group lasso penalized approach to high dimensional multinomial classification is presented. On different real data examples it is found that this approach clearly outperforms multinomial lasso in terms of error rate and features included in the model. An efficient coordinate descent algorithm...
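    For orientation, a sparse group lasso penalized estimator is commonly written in the form below. This is a generic sketch rather than the exact formulation of the thesis: the symbols (multinomial log-likelihood \ell, N samples, G groups of sizes p_g, mixing parameter \alpha) and the group weights \sqrt{p_g} are assumptions chosen for illustration.

```latex
\hat{\beta} \;=\; \arg\min_{\beta}\; -\frac{1}{N}\,\ell(\beta)
  \;+\; \lambda \Big[ (1-\alpha) \sum_{g=1}^{G} \sqrt{p_g}\,\lVert \beta^{(g)} \rVert_2
  \;+\; \alpha\,\lVert \beta \rVert_1 \Big],
  \qquad \alpha \in [0,1],\ \lambda \ge 0
```

    With \alpha = 1 the penalty reduces to the lasso, and with \alpha = 0 to the group lasso.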

  19. Classification and Target Group Selection Based Upon Frequent Patterns

    NARCIS (Netherlands)

    W.H.L.M. Pijls (Wim); R. Potharst (Rob)

    2000-01-01

    In this technical report, two new algorithms based upon frequent patterns are proposed. One algorithm is a classification method. The other one is an algorithm for target group selection. In both algorithms, first of all, the collection of frequent patterns in the training set is constr

  20. Hierarchical Real-time Network Traffic Classification Based on ECOC

    Directory of Open Access Journals (Sweden)

    Yaou Zhao

    2013-09-01

    Full Text Available Classification of network traffic is basic and essential for much network research and management. With the rapid development of peer-to-peer (P2P) applications that use dynamic port disguising techniques and encryption to avoid detection, port-based and simple payload-based network traffic classification methods have become less effective. Alternative methods based on statistics and machine learning have attracted researchers' attention in recent years. However, most of the proposed algorithms are off-line and usually use a single classifier. In this paper a new hierarchical real-time model is proposed, which comprises a three-tuple (source IP, destination IP and destination port) look-up table (TT-LUT) part and a layered milestone part. The TT-LUT is used to quickly classify short flows, which need not pass through the layered milestone part, and the milestones in the layered milestone part classify the other flows in real time with real-time feature selection and statistics. Every milestone is an ECOC (Error-Correcting Output Codes) based model, which is used to improve classification performance. Experiments showed that the proposed model can improve the real-time efficiency to 80% and the multi-class classification accuracy to 91.4% on datasets captured from the backbone router of our campus over one week.
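    The ECOC idea mentioned in the abstract above can be sketched as follows: each class is assigned a binary codeword, one binary classifier is trained per bit, and a flow is assigned to the class whose codeword is nearest in Hamming distance to the predicted bits. The code matrix, the class names and the function names below are hypothetical and are not taken from the paper.

```python
# Hypothetical 7-bit code matrix for four illustrative traffic classes.
# Every pair of codewords differs in 4 bits, so one wrong bit is tolerated.
CODE_MATRIX = {
    "web":  (1, 1, 1, 1, 1, 1, 1),
    "p2p":  (0, 0, 0, 0, 1, 1, 1),
    "mail": (0, 0, 1, 1, 0, 0, 1),
    "dns":  (0, 1, 0, 1, 0, 1, 0),
}


def hamming(a, b):
    """Number of positions in which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def ecoc_decode(predicted_bits):
    """Return the class whose codeword is closest to the predicted bits."""
    return min(CODE_MATRIX, key=lambda label: hamming(CODE_MATRIX[label], predicted_bits))


# The first binary classifier is assumed to have made an error (bit 0 flipped).
noisy_bits = (1, 0, 0, 0, 1, 1, 1)
print(ecoc_decode(noisy_bits))
```

    In this toy example the noisy bit vector is still decoded as "p2p", because the codewords are far enough apart to absorb a single binary-classifier error.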

  1. Impact of Information based Classification on Network Epidemics

    Science.gov (United States)

    Mishra, Bimal Kumar; Haldar, Kaushik; Sinha, Durgesh Nandini

    2016-06-01

    Formulating mathematical models for accurate approximation of malicious propagation in a network is a difficult process because of our inherent lack of understanding of several underlying physical processes that intrinsically characterize the broader picture. The aim of this paper is to understand the impact of available information in the control of malicious network epidemics. A 1-n-n-1 type differential epidemic model is proposed, where the differentiality allows a symptom based classification. This is the first such attempt to add such a classification into the existing epidemic framework. The model is incorporated into a five class system called the DifEpGoss architecture. Analysis reveals an epidemic threshold, based on which the long-term behavior of the system is analyzed. In this work three real network datasets with 22002, 22469 and 22607 undirected edges respectively, are used. The datasets show that classification based prevention given in the model can have a good role in containing network epidemics. Further simulation based experiments are used with a three category classification of attack and defense strengths, which allows us to consider 27 different possibilities. These experiments further corroborate the utility of the proposed model. The paper concludes with several interesting results.

  2. Classification-Based Method of Linear Multicriteria Optimization

    OpenAIRE

    Vassilev, Vassil; Genova, Krassimira; Vassileva, Mariyana; Narula, Subhash

    2003-01-01

    The paper describes a classification-based, learning-oriented interactive method for solving linear multicriteria optimization problems. The method allows decision makers to describe their preferences with greater flexibility, accuracy and reliability. The method is realized in an experimental software system supporting the solution of multicriteria optimization problems.

  3. Classification of CT-brain slices based on local histograms

    Science.gov (United States)

    Avrunin, Oleg G.; Tymkovych, Maksym Y.; Pavlov, Sergii V.; Timchik, Sergii V.; Kisała, Piotr; Orakbaev, Yerbol

    2015-12-01

    Neurosurgical intervention is a very complicated process. Modern operating procedures are based on data such as CT, MRI, etc. Automated analysis of these data is an important task for researchers. Some modern methods of brain-slice segmentation use additional data to process these images. Classification can be used to obtain this information. To classify CT images of the brain, we suggest using local histograms and features extracted from them. The paper shows the process of feature extraction and classification of CT slices of the brain. The process of feature extraction is specialized for the axial cross-section of the brain. The work can be applied to medical neurosurgical systems.
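    As a rough illustration of what local histograms might look like in practice, the sketch below splits a grey-level image into square tiles and computes one intensity histogram per tile. The tile size, the number of bins and the function name are assumptions; the abstract does not specify the features actually used.

```python
def local_histograms(image, tile, bins, max_value=256):
    """Return one intensity histogram per tile, in row-major tile order.

    image     : list of rows of integer grey levels in [0, max_value)
    tile      : side length of the square tiles (hypothetical parameter)
    bins      : number of equal-width histogram bins
    """
    rows, cols = len(image), len(image[0])
    features = []
    for r0 in range(0, rows, tile):
        for c0 in range(0, cols, tile):
            hist = [0] * bins
            for r in range(r0, min(r0 + tile, rows)):
                for c in range(c0, min(c0 + tile, cols)):
                    hist[image[r][c] * bins // max_value] += 1
            features.append(hist)
    return features


# Toy 4x4 "slice": dark left half, bright right half.
toy_slice = [[0, 0, 255, 255] for _ in range(4)]
print(local_histograms(toy_slice, tile=2, bins=2))
```

    The per-tile histograms could then be concatenated into a feature vector for any standard classifier.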

  4. Pulse frequency classification based on BP neural network

    Institute of Scientific and Technical Information of China (English)

    WANG Rui; WANG Xu; YANG Dan; FU Rong

    2006-01-01

    In Traditional Chinese Medicine (TCM), analysis of the pulse frequency is an important part of clinical disease diagnosis. This article uses the eight major essentials of the pulse to identify pulse type, classifying pulse frequency with back-propagation neural networks (BPNN). The pulse frequency classes include slow pulse, moderate pulse, rapid pulse, etc. Feature parameters of the pulse frequency are analysed, and a system for identifying pulse frequency features is established. Feature parameters such as period and frequency are extracted from the pulse signal acquired by the detecting system and compared with the standard feature values of each pulse type. The results show that the identification rate exceeds 92.5%.

  5. TENSOR MODELING BASED FOR AIRBORNE LiDAR DATA CLASSIFICATION

    OpenAIRE

    Li, N.; Liu, C; Pfeifer, N; Yin, J. F.; Liao, Z.Y.; Zhou, Y.

    2016-01-01

    Feature selection and description is a key factor in classification of Earth observation data. In this paper a classification method based on tensor decomposition is proposed. First, multiple features are extracted from raw LiDAR point cloud, and raster LiDAR images are derived by accumulating features or the “raw” data attributes. Then, the feature rasters of LiDAR data are stored as a tensor, and tensor decomposition is used to select component features. This tensor representation could kee...

  6. Optimizing Mining Association Rules for Artificial Immune System based Classification

    Directory of Open Access Journals (Sweden)

    SAMEER DIXIT

    2011-08-01

    Full Text Available The primary function of a biological immune system is to protect the body from foreign molecules known as antigens. It has great pattern recognition capability that may be used to distinguish between foreign cells entering the body (non-self, or antigen) and the body's own cells (self). Immune systems have many characteristics such as uniqueness, autonomy, recognition of foreigners, distributed detection, and noise tolerance. Inspired by biological immune systems, Artificial Immune Systems have emerged during the last decade. They have prompted many researchers to design and build immune-based models for a variety of application domains. Artificial immune systems can be defined as a computational paradigm that is inspired by theoretical immunology, observed immune functions, principles and mechanisms. Association rule mining is one of the most important and well researched techniques of data mining. The goal of association rules is to extract interesting correlations, frequent patterns, associations or causal structures among sets of items in transaction databases or other data repositories. Association rules are widely used in various areas such as inventory control, telecommunication networks, intelligent decision making, market analysis and risk management. Apriori is the most widely used algorithm for mining association rules. Other popular association rule mining algorithms include frequent pattern (FP) growth, Eclat, and dynamic itemset counting (DIC). Associative classification uses association rule mining in the rule discovery process to predict the class labels of the data. This technique has shown great promise over many other classification techniques. Associative classification also integrates the process of rule discovery and classification to build the classifier for the purpose of prediction. The main problem with the associative classification approach is the discovery of high-quality association rules in a very large space of
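    The two quantities on which association rule mining algorithms such as Apriori rely, support and confidence, can be illustrated with a minimal sketch. The transactions and item names below are invented for illustration, and the code does not implement Apriori itself or the method proposed in the paper.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Confidence of the rule antecedent -> consequent."""
    joint = support(set(antecedent) | set(consequent), transactions)
    return joint / support(antecedent, transactions)


# Hypothetical market-basket data.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

print(support({"bread", "milk"}, transactions))          # 0.5
print(confidence({"bread"}, {"milk"}, transactions))
```

    In associative classification the consequent of each rule is a class label, so high-confidence rules can be used directly as classification rules.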

  7. Online Network Traffic Classification Algorithm Based on RVM

    Directory of Open Access Journals (Sweden)

    Zhang Qunhui

    2013-06-01

    Full Text Available Compared with the Support Vector Machine (SVM), the Relevance Vector Machine (RVM) avoids the over-learning characteristic of the SVM and greatly reduces the amount of kernel function computation. It also avoids other defects of the SVM: weak sparsity, a large amount of calculation, the requirement that the kernel function satisfy Mercer's condition, and parameters that must be determined empirically. We therefore propose a new online traffic classification algorithm based on the RVM. After analysing the basic principles of the RVM and the modeling steps, we train an RVM traffic classification model and use it together with "port number + DPI" to identify network traffic in real time. When the probability predicted by the RVM falls in the query interval, the "port number" and "DPI" are used jointly. Finally, a detailed experimental validation shows that, compared with the SVM-based network traffic classification algorithm, this algorithm achieves online network traffic classification and greatly improves the classification prediction probability.

  8. Torrent classification - Base of rational management of erosive regions

    Energy Technology Data Exchange (ETDEWEB)

    Gavrilovic, Zoran; Stefanovic, Milutin; Milovanovic, Irina; Cotric, Jelena; Milojevic, Mileta [Institute for the Development of Water Resources 'Jaroslav Cerni', 11226 Beograd (Pinosava), Jaroslava Cernog 80 (Serbia)], E-mail: gavrilovicz@sbb.rs

    2008-11-01

    A complex methodology for torrents and erosion and the associated calculations was developed during the second half of the twentieth century in Serbia. It was the 'Erosion Potential Method'. One of the modules of that complex method was focused on torrent classification. The module enables the identification of hydrographic, climate and erosion characteristics. The method makes it possible for each torrent, regardless of its magnitude, to be simply and recognizably described by the 'Formula of torrentially'. The above torrent classification is the base on which a set of optimisation calculations is developed for the required scope of erosion-control works and measures, the application of which enables the management of significantly larger erosion and torrential regions compared to the previous period. This paper will present the procedure and the method of torrent classification.

  9. Torrent classification - Base of rational management of erosive regions

    International Nuclear Information System (INIS)

    A complex methodology for torrents and erosion and the associated calculations was developed during the second half of the twentieth century in Serbia. It was the 'Erosion Potential Method'. One of the modules of that complex method was focused on torrent classification. The module enables the identification of hydrographic, climate and erosion characteristics. The method makes it possible for each torrent, regardless of its magnitude, to be simply and recognizably described by the 'Formula of torrentially'. The above torrent classification is the base on which a set of optimisation calculations is developed for the required scope of erosion-control works and measures, the application of which enables the management of significantly larger erosion and torrential regions compared to the previous period. This paper will present the procedure and the method of torrent classification.

  10. Comparison of Supervised Classification Methods for Protein Profiling in Cancer Diagnosis

    Directory of Open Access Journals (Sweden)

    Nadège Dossat

    2007-01-01

    Full Text Available A key challenge in clinical proteomics of cancer is the identification of biomarkers that could allow detection, diagnosis and prognosis of the disease. Recent advances in mass spectrometry and proteomic instrumentation offer a unique chance to rapidly identify these markers. These advances pose considerable challenges, similar to those created by microarray-based investigation, for the discovery of patterns of markers from high-dimensional data, specific to each pathologic state (e.g. normal vs cancer). We propose a three-step strategy to select important markers from high-dimensional mass spectrometry data using surface enhanced laser desorption/ionization (SELDI) technology. The first two steps are the selection of the most discriminating biomarkers with a construction of different classifiers. Finally, we compare and validate their performance and robustness using different supervised classification methods such as Support Vector Machine, Linear Discriminant Analysis, Quadratic Discriminant Analysis, Neural Networks, Classification Trees and Boosting Trees. We show that the proposed method is suitable for analysing high-throughput proteomics data and that the combination of logistic regression and Linear Discriminant Analysis outperforms the other methods tested.

  11. Clinical classification of cancer cachexia: phenotypic correlates in human skeletal muscle.

    Directory of Open Access Journals (Sweden)

    Neil Johns

    Full Text Available BACKGROUND: Cachexia affects the majority of patients with advanced cancer and is associated with a reduction in treatment tolerance, response to therapy, and duration of survival. One impediment towards the effective treatment of cachexia is a validated classification system. METHODS: 41 patients with resectable upper gastrointestinal (GI) or pancreatic cancer underwent characterisation for cachexia based on weight-loss (WL) and/or low muscularity (LM). Four diagnostic criteria were used: >5%WL, >10%WL, LM, and LM+>2%WL. All patients underwent biopsy of the rectus muscle. Analysis included immunohistochemistry for fibre size and type, protein and nucleic acid concentration, Western blots for markers of autophagy, SMAD signalling, and inflammation. FINDINGS: In patients with LM or LM+>2%WL, mean muscle fibre diameter was reduced by about 25% compared with non-cachectic cancer patients (p = 0.02 and p = 0.001 respectively). No significant difference in fibre diameter was observed if patients had WL alone. Regardless of classification, there was no difference in fibre number or proportion of fibre type across all myosin heavy chain isoforms. Mean muscle protein content was reduced and the ratio of RNA/DNA decreased in patients with either >5%WL or LM+>2%WL. Compared with non-cachectic patients, SMAD3 protein levels were increased in patients with >5%WL (p = 0.022), and in patients with >10%WL, beclin (p = 0.05) and ATG5 (p = 0.01) protein levels were increased. There were no differences in phospho-NFkB or phospho-STAT3 levels across any of the groups. CONCLUSION: Muscle fibre size, biochemical composition and pathway phenotype can vary according to whether the diagnostic criteria for cachexia are based on weight loss alone, a measure of low muscularity alone or a combination of the two. For intervention trials where the primary end-point is a change in muscle mass or function, use of combined diagnostic criteria may allow identification of a more

  12. Hardware Accelerators Targeting a Novel Group Based Packet Classification Algorithm

    Directory of Open Access Journals (Sweden)

    O. Ahmed

    2013-01-01

    Full Text Available Packet classification is a ubiquitous and key building block for many critical network devices. However, it remains one of the main bottlenecks faced when designing fast network devices. In this paper, we propose a novel Group Based Search packet classification Algorithm (GBSA) that is scalable, fast, and efficient. GBSA consumes an average of 0.4 megabytes of memory for a 10 k rule set. The worst-case classification time per packet is 2 microseconds, and the preprocessing speed is 3 M rules/second based on a Xeon processor operating at 3.4 GHz. When compared with other state-of-the-art classification techniques, the results showed that GBSA outperforms the competition with respect to speed, memory usage, and processing time. Moreover, GBSA is amenable to implementation in hardware. Three different hardware implementations are also presented in this paper, including an Application Specific Instruction Set Processor (ASIP) implementation and two pure Register-Transfer Level (RTL) implementations based on Impulse-C and Handel-C flows, respectively. Speedups achieved with these hardware accelerators ranged from 9x to 18x compared with a pure software implementation running on a Xeon processor.

  13. Breast Cancer Survival Defined by the ER/PR/HER2 Subtypes and a Surrogate Classification according to Tumor Grade and Immunohistochemical Biomarkers

    Directory of Open Access Journals (Sweden)

    Carol A. Parise

    2014-01-01

    Full Text Available Introduction. ER, PR, and HER2 are routinely available in breast cancer specimens. The purpose of this study is to contrast breast cancer-specific survival for the eight ER/PR/HER2 subtypes with survival of an immunohistochemical surrogate for the molecular subtype based on the ER/PR/HER2 subtypes and tumor grade. Methods. We identified 123,780 cases of stages 1–3 primary female invasive breast cancer from California Cancer Registry. The surrogate classification was derived using ER/PR/HER2 and tumor grade. Kaplan-Meier survival analysis and Cox proportional hazards modeling were used to assess differences in survival and risk of mortality for the ER/PR/HER2 subtypes and surrogate classification within each stage. Results. The luminal B/HER2− surrogate classification had a higher risk of mortality than the luminal B/HER2+ for all stages of disease. There was no difference in risk of mortality between the ER+/PR+/HER2− and ER+/PR+/HER2+ in stage 3. With one exception in stage 3, the ER-negative subtypes all had an increased risk of mortality when compared with the ER-positive subtypes. Conclusions. Assessment of survival using ER/PR/HER2 illustrates the heterogeneity of HER2+ subtypes. The surrogate classification provides clear separation in survival and adjusted mortality but underestimates the wide variability within the subtypes that make up the classification.

  14. Prognostic classification index in Iranian colorectal cancer patients: Survival tree analysis

    Directory of Open Access Journals (Sweden)

    Amal Saki Malehi

    2016-01-01

    Full Text Available Aims: The aim of this study was to determine the prognostic index for separating homogenous subgroups in colorectal cancer (CRC) patients based on clinicopathological characteristics using survival tree analysis. Methods: The current study was conducted at the Research Center of Gastroenterology and Liver Disease, Shahid Beheshti Medical University in Tehran, between January 2004 and January 2009. A total of 739 patients who had already been diagnosed with CRC based on pathologic report were enrolled. The data included demographic and clinical-pathological characteristics of patients. Tree-structured survival analysis based on a recursive partitioning algorithm was implemented to evaluate prognostic factors. The probability curves were calculated according to the Kaplan-Meier method, and the hazard ratio was estimated as the effect size of interest. Result: There were 526 males (71.2% of these patients). The mean survival time (from time of diagnosis) was 42.46 ± 3.4. The survival tree identified three variables as main prognostic factors, and four prognostic subgroups were constructed based on them. The log-rank test showed good separation of survival curves. Patients with Stage I-IIIA disease who were treated with surgery as the first treatment showed low risk (median = 34 months), whereas patients with stage IIIB or IV disease who were older than 68 years had the worst survival outcome (median = 9.5 months). Conclusion: Constructing the prognostic classification index via a survival tree can help researchers assess interactions between clinical variables and determine the cumulative effect of these variables on survival outcome.
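    The Kaplan-Meier method named in the abstract above estimates the survival curve as a running product over event times of (1 - deaths / number at risk). The following minimal sketch shows that product-limit calculation; the follow-up times and event indicators are invented toy data, not values from the study.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time for each patient
    events : 1 if the event (death) was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each observed event time.
    """
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
    return curve


# Toy data: five patients, two of them censored (event = 0).
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(toy_times, toy_events):
    print(t, round(s, 3))
```

    A survival tree would compute such curves within each subgroup produced by recursive partitioning and compare them, for example with a log-rank test.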

  15. Choice-Based Conjoint Analysis: Classification vs. Discrete Choice Models

    Science.gov (United States)

    Giesen, Joachim; Mueller, Klaus; Taneva, Bilyana; Zolliker, Peter

    Conjoint analysis is a family of techniques that originated in psychology and later became popular in market research. The main objective of conjoint analysis is to measure an individual's or a population's preferences on a class of options that can be described by parameters and their levels. We consider preference data obtained in choice-based conjoint analysis studies, where one observes test persons' choices on small subsets of the options. There are many ways to analyze choice-based conjoint analysis data. Here we discuss the intuition behind a classification based approach, and compare this approach to one based on statistical assumptions (discrete choice models) and to a regression approach. Our comparison on real and synthetic data indicates that the classification approach outperforms the discrete choice models.

  16. Cell nuclei attributed relational graphs for efficient representation and classification of gastric cancer in digital histopathology

    Science.gov (United States)

    Sharma, Harshita; Zerbe, Norman; Heim, Daniel; Wienert, Stephan; Lohmann, Sebastian; Hellwich, Olaf; Hufnagl, Peter

    2016-03-01

    This paper describes a novel graph-based method for efficient representation and subsequent classification in histological whole slide images of gastric cancer. Her2/neu immunohistochemically stained and haematoxylin and eosin stained histological sections of gastric carcinoma are digitized. Immunohistochemical staining is used in practice by pathologists to determine extent of malignancy, however, it is laborious to visually discriminate the corresponding malignancy levels in the more commonly used haematoxylin and eosin stain, and this study attempts to solve this problem using a computer-based method. Cell nuclei are first isolated at high magnification using an automatic cell nuclei segmentation strategy, followed by construction of cell nuclei attributed relational graphs of the tissue regions. These graphs represent tissue architecture comprehensively, as they contain information about cell nuclei morphology as vertex attributes, along with knowledge of neighborhood in the form of edge linking and edge attributes. Global graph characteristics are derived and ensemble learning is used to discriminate between three types of malignancy levels, namely, non-tumor, Her2/neu positive tumor and Her2/neu negative tumor. Performance is compared with state of the art methods including four texture feature groups (Haralick, Gabor, Local Binary Patterns and Varma Zisserman features), color and intensity features, and Voronoi diagram and Delaunay triangulation. Texture, color and intensity information is also combined with graph-based knowledge, followed by correlation analysis. Quantitative assessment is performed using two cross validation strategies. On investigating the experimental results, it can be concluded that the proposed method provides a promising way for computer-based analysis of histopathological images of gastric cancer.

  17. Classification of Mental Disorders Based on Temperament

    Directory of Open Access Journals (Sweden)

    Nadi Sakhvidi

    2015-08-01

    Full Text Available Context: Different paradoxical theories are available regarding psychiatric disorders. The current study aimed to establish a more comprehensive overall approach. Evidence Acquisition: This basic study examined ancient medical books. "The Canon" by Avicenna and "Comprehensive Textbook of Psychiatry" by Kaplan and Sadock were the most important and frequently consulted books in this study. Results: Four groups of temperaments were identified: high active, high flexible; high active, low flexible; low active, low flexible; and low active, high flexible. When temperament deteriorates, personality, non-psychotic, and psychotic psychiatric disorders can develop. Conclusions: Temperaments can provide a basis to classify psychiatric disorders. Psychiatric disorders can be placed in a spectrum based on temperaments.

  18. Update on epidemiology, classification, and management of thyroid cancer

    Directory of Open Access Journals (Sweden)

    Heitham Gheriani

    2006-06-01

    Full Text Available Thyroid cancer represents approximately 0.5–1% of all human malignancy [1]. In the UK the incidence of thyroid cancer is 2–3 per 100,000 population [2]. In geographical areas of low iodine intake and in areas exposed to nuclear disasters the incidence of thyroid cancer is higher. Benign thyroid conditions are much more common. In the UK approximately 8% of the population have nodular thyroid disease [2]. Nodular thyroid disease increases with age and is also more common in females and in geographical areas of low iodine intake. Primary thyroid malignancy can be broadly divided into 2 groups. The first group, which generally has a much better prognosis, comprises the well-differentiated thyroid carcinomas, which include papillary carcinoma, follicular carcinoma and Hürthle cell tumours. The second group includes the poorly differentiated thyroid carcinomas, such as medullary thyroid carcinoma and anaplastic thyroid carcinoma. Other rare tumours such as sarcomas, lymphomas and the extremely rare primary squamous cell carcinoma of the thyroid should be included in the second group. Secondary or metastatic thyroid cancer can arise from breast, lung, colon and kidney malignancies.

  19. Regularized logistic regression with adjusted adaptive elastic net for gene selection in high dimensional cancer classification.

    Science.gov (United States)

    Algamal, Zakariya Yahya; Lee, Muhammad Hisyam

    2015-12-01

    Cancer classification and gene selection in high-dimensional data have been popular research topics in genetics and molecular biology. Recently, adaptive regularized logistic regression using the elastic net regularization, which is called the adaptive elastic net, has been successfully applied in high-dimensional cancer classification to tackle both estimating the gene coefficients and performing gene selection simultaneously. The adaptive elastic net originally used elastic net estimates as the initial weight, however, using this weight may not be preferable for certain reasons: First, the elastic net estimator is biased in selecting genes. Second, it does not perform well when the pairwise correlations between variables are not high. Adjusted adaptive regularized logistic regression (AAElastic) is proposed to address these issues and encourage grouping effects simultaneously. The real data results indicate that AAElastic is significantly consistent in selecting genes compared to the other three competitor regularization methods. Additionally, the classification performance of AAElastic is comparable to the adaptive elastic net and better than other regularization methods. Thus, we can conclude that AAElastic is a reliable adaptive regularized logistic regression method in the field of high-dimensional cancer classification.
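    For reference, an adaptive elastic net penalized logistic regression is usually written in the generic form below. The notation (log-likelihood \ell, tuning parameters \lambda_1, \lambda_2 and \gamma, initial estimate \tilde{\beta}) is an assumption for illustration; the abstract does not state the specific weights used by AAElastic, so they are not reproduced here.

```latex
\hat{\beta} \;=\; \arg\min_{\beta}\; -\ell(\beta)
  \;+\; \lambda_1 \sum_{j=1}^{p} w_j\,\lvert \beta_j \rvert
  \;+\; \lambda_2 \sum_{j=1}^{p} \beta_j^{2},
  \qquad w_j = \lvert \tilde{\beta}_j \rvert^{-\gamma},\quad \gamma > 0
```

    In the original adaptive elastic net the initial estimate \tilde{\beta} comes from an elastic net fit; according to the abstract, it is this choice of initial weight that the adjusted method replaces.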

  20. Toward automated classification of consumers' cancer-related questions with a new taxonomy of expected answer types.

    Science.gov (United States)

    McRoy, Susan; Jones, Sean; Kurmally, Adam

    2016-09-01

    This article examines methods for automated question classification applied to cancer-related questions that people have asked on the web. This work is part of a broader effort to provide automated question answering for health education. We created a new corpus of consumer-health questions related to cancer and a new taxonomy for those questions. We then compared the effectiveness of different statistical methods for developing classifiers, including weighted classification and resampling. Basic methods for building classifiers were limited by the high variability in the natural distribution of questions and typical refinement approaches of feature selection and merging categories achieved only small improvements to classifier accuracy. Best performance was achieved using weighted classification and resampling methods, the latter yielding an accuracy of F1 = 0.963. Thus, it would appear that statistical classifiers can be trained on natural data, but only if natural distributions of classes are smoothed. Such classifiers would be useful for automated question answering, for enriching web-based content, or assisting clinical professionals to answer questions. PMID:25759063

  2. Upper limit for context based crop classification

    DEFF Research Database (Denmark)

    Midtiby, Henrik; Åstrand, Björn; Jørgensen, Rasmus Nyholm;

    2012-01-01

    Mechanical in-row weed control of crops like sugarbeet requires precise knowledge of where individual crop plants are located. If crop plants are placed in a known pattern, information about plant locations can be used to discriminate between crop and weed plants. The success rate of such a classifier...... depends on the weed pressure, the position uncertainty of the crop plants and the crop upgrowth percentage. The first two measures can be combined to a normalized weed pressure, \lambda. Given the normalized weed pressure an upper bound on the positive predictive value is shown to be 1/(1+\lambda). If the...... weed pressure is \rho = 400/m^2 and the crop position uncertainty is \sigma_x = 0.0148m along the row and \sigma_y = 0.0108m perpendicular to the row, the normalized weed pressure is \lambda ~ 0.40; the upper bound on the positive predictive value is then 0.71. This means that when a position based...
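    The numbers quoted in the abstract above can be checked with a few lines of Python. The bound 1/(1+\lambda) is taken from the abstract. How \rho, \sigma_x and \sigma_y are combined into \lambda is not stated there; the product 2*pi*rho*sigma_x*sigma_y used below is an assumption that happens to reproduce the quoted value of about 0.40.

```python
import math


def normalized_weed_pressure(rho, sigma_x, sigma_y):
    # ASSUMPTION: this combination is not given in the abstract; it is used
    # here only because it reproduces the quoted value lambda ~ 0.40.
    return 2.0 * math.pi * rho * sigma_x * sigma_y


def ppv_upper_bound(lam):
    # Upper bound on the positive predictive value, as stated in the abstract.
    return 1.0 / (1.0 + lam)


lam = normalized_weed_pressure(rho=400.0, sigma_x=0.0148, sigma_y=0.0108)
bound = ppv_upper_bound(lam)
print(f"lambda = {lam:.2f}, PPV upper bound = {bound:.2f}")
# prints: lambda = 0.40, PPV upper bound = 0.71
```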

  3. Classification of Regional Ionospheric Disturbances Based on Support Vector Machines

    Science.gov (United States)

    Begüm Terzi, Merve; Arikan, Feza; Arikan, Orhan; Karatay, Secil

    2016-07-01

    The ionosphere is an anisotropic, inhomogeneous, time varying and spatio-temporally dispersive medium whose parameters can almost always be estimated using indirect measurements. Geomagnetic, gravitational, solar or seismic activities cause variations of the ionosphere at various spatial and temporal scales. This complex spatio-temporal variability is challenging to identify because of the extensive scales in period, duration, amplitude and frequency of disturbances. Since geomagnetic and solar indices such as Disturbance storm time (Dst), F10.7 solar flux, Sun Spot Number (SSN), Auroral Electrojet (AE), Kp and W-index provide information about variability on a global scale, identification and classification of regional disturbances poses a challenge. The main aim of this study is to identify the regional effects of global geomagnetic storms and classify them according to their risk levels. For this purpose, Total Electron Content (TEC) estimated from GPS receivers, which is one of the major parameters of the ionosphere, will be used together with solar and geomagnetic indices to model the regional and local variability that differs from global activity. In this work, a classification technique based on the Support Vector Machine (SVM), a robust and widely used machine learning method, is proposed for the automated classification of regional disturbances. SVM is a supervised learning model used for classification, with an associated learning algorithm that analyzes the data and recognizes patterns. In addition to performing linear classification, SVM can efficiently perform nonlinear classification by embedding data into higher dimensional feature spaces. Performance of the developed classification technique is demonstrated for midlatitude ionosphere over Anatolia using TEC estimates generated from the GPS data provided by Turkish National Permanent GPS Network (TNPGN-Active) for solar maximum year of 2011. As a result of implementing the developed classification

  4. Object-Based Classification and Change Detection of Hokkaido, Japan

    Science.gov (United States)

    Park, J. G.; Harada, I.; Kwak, Y.

    2016-06-01

    Topography and geology are factors that characterize the distribution of natural vegetation. Topographic contour is particularly influential on the living conditions of plants such as soil moisture, sunlight, and windiness. Vegetation associations having similar characteristics are present in locations having similar topographic conditions unless natural disturbances such as landslides and forest fires or artificial disturbances such as deforestation and man-made plantation bring about changes in such conditions. We developed a vegetation map of Japan using an object-based segmentation approach with topographic information (elevation, slope, slope direction) that is closely related to the distribution of vegetation. The results show that object-based classification is more effective than pixel-based classification for producing a vegetation map.

  5. Impact of full field digital mammography on the classification and mammographic characteristics of interval breast cancers

    International Nuclear Information System (INIS)

    Highlights: • Digital mammography has changed the presentation of interval breast cancer. • Fewer interval breast cancers are associated with microcalcifications following FFDM. • Interval breast cancer audit remains a key feature of any breast screening program. - Abstract: Objective: Full field digital mammography (FFDM) is increasingly replacing screen film mammography (SFM) in breast screening programs. Interval breast cancers are an issue in all screening programs and the purpose of our study is to assess the impact of FFDM on the classification of interval breast cancers at independent blind review and to compare the mammographic features of interval cancers at FFDM and SFM. Materials and methods: This study included 138 cases of interval breast cancer, 76 following an FFDM screening examination and 62 following screening with SFM. The prior screening mammogram was assessed by each of five consultant breast radiologists who were blinded to the site of subsequent cancer. Subsequent review of the diagnostic mammogram was performed and cases were classified as missed, minimal signs, occult or true interval. Mammographic features of the interval cancer at diagnosis and any abnormality identified on the prior screening mammogram were recorded. Results: The percentages of cancers classified as missed at FFDM and SFM did not differ significantly, 10.5% (8 of 76) at FFDM and 8.1% (5 of 62) at SFM (p = .77). There were significantly fewer interval cancers presenting as microcalcifications (alone or in association with another abnormality) following screening with FFDM, 16% (12 of 76), than following an SFM examination, 32% (20 of 62) (p = .02). Conclusion: Interval breast cancers continue to pose a problem at FFDM. The switch to FFDM has changed the mammographic presentation of interval breast cancer, with fewer interval cancers presenting in association with microcalcifications.

  6. Impact of full field digital mammography on the classification and mammographic characteristics of interval breast cancers

    Energy Technology Data Exchange (ETDEWEB)

    Knox, Mark, E-mail: marktknox@gmail.com; O’Brien, Angela, E-mail: angelaobrien@doctors.org.uk; Szabó, Endre, E-mail: endrebacsi@freemail.hu; Smith, Clare S., E-mail: csmith@mater.ie; Fenlon, Helen M., E-mail: helen.fenlon@cancerscreening.ie; McNicholas, Michelle M., E-mail: michelle.mcnicholas@cancerscreening.ie; Flanagan, Fidelma L., E-mail: fidelma.flanagan@cancerscreening.ie

    2015-06-15

    Highlights: • Digital mammography has changed the presentation of interval breast cancer. • Fewer interval breast cancers are associated with microcalcifications following FFDM. • Interval breast cancer audit remains a key feature of any breast screening program. - Abstract: Objective: Full field digital mammography (FFDM) is increasingly replacing screen film mammography (SFM) in breast screening programs. Interval breast cancers are an issue in all screening programs and the purpose of our study is to assess the impact of FFDM on the classification of interval breast cancers at independent blind review and to compare the mammographic features of interval cancers at FFDM and SFM. Materials and methods: This study included 138 cases of interval breast cancer, 76 following an FFDM screening examination and 62 following screening with SFM. The prior screening mammogram was assessed by each of five consultant breast radiologists who were blinded to the site of subsequent cancer. Subsequent review of the diagnostic mammogram was performed and cases were classified as missed, minimal signs, occult or true interval. Mammographic features of the interval cancer at diagnosis and any abnormality identified on the prior screening mammogram were recorded. Results: The percentages of cancers classified as missed at FFDM and SFM did not differ significantly, 10.5% (8 of 76) at FFDM and 8.1% (5 of 62) at SFM (p = .77). There were significantly fewer interval cancers presenting as microcalcifications (alone or in association with another abnormality) following screening with FFDM, 16% (12 of 76), than following an SFM examination, 32% (20 of 62) (p = .02). Conclusion: Interval breast cancers continue to pose a problem at FFDM. The switch to FFDM has changed the mammographic presentation of interval breast cancer, with fewer interval cancers presenting in association with microcalcifications.
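
    The two comparisons reported in the Results can be re-examined from the counts given. The abstract does not say which statistical test was used; the sketch below assumes a two-sided Fisher's exact test, so its p-values need not match the reported ones exactly. The function name and table layout are illustrative.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )

# Counts from the abstract: rows are FFDM (n = 76) and SFM (n = 62).
p_missed = fisher_exact_two_sided(8, 76 - 8, 5, 62 - 5)          # missed cancers
p_microcalc = fisher_exact_two_sided(12, 76 - 12, 20, 62 - 20)   # microcalcifications
pct_missed = (round(100 * 8 / 76, 1), round(100 * 5 / 62, 1))
```

    Under this assumed test, the first comparison stays far from significance and the second falls below the 0.05 level, which is consistent with the direction of the reported results.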

  7. Classification data mining method based on dynamic RBF neural networks

    Science.gov (United States)

    Zhou, Lijuan; Xu, Min; Zhang, Zhang; Duan, Luping

    2009-04-01

    With the wide application of databases and the rapid development of the Internet, the capacity to generate and collect data using information technology has improved greatly. Mining useful information or knowledge from large databases or data warehouses has become an urgent problem, and data mining (DM) technology has developed rapidly to meet this need. However, DM often faces large volumes of noisy, disordered and nonlinear data. Artificial neural networks (ANNs) are well suited to these problems because of their robustness, adaptability, parallel processing, distributed memory and high error tolerance. This paper discusses in detail the application of ANN methods in DM, based on an analysis of data mining technologies, with particular emphasis on classification data mining based on RBF neural networks. Pattern classification is an important application of RBF neural networks. In an on-line environment the training dataset changes over time, so batch learning algorithms (e.g. OLS), which trigger a great deal of unnecessary retraining, are inefficient. This paper derives an incremental learning algorithm (ILA) from the gradient descent algorithm to remove this bottleneck. ILA adaptively adjusts the parameters of RBF networks by minimizing the error cost, without any redundant retraining. Using the proposed method, an on-line classification system was constructed to solve the IRIS classification problem. Experimental results show that the algorithm has a fast convergence rate and excellent on-line classification performance.
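
    The idea of updating an RBF network one sample at a time, instead of retraining on the whole batch, can be illustrated with a small sketch. This is not the authors' ILA: it assumes fixed Gaussian centers and a shared width (both hypothetical values) and applies a plain gradient-descent step on the squared error to the output weights only.

```python
from math import exp

CENTERS = [0.0, 1.0]  # fixed RBF centers (hypothetical)
WIDTH = 0.3           # shared Gaussian width (hypothetical)

def features(x):
    """Gaussian radial basis activations for a scalar input."""
    return [exp(-((x - c) ** 2) / (2 * WIDTH ** 2)) for c in CENTERS]

def predict(weights, x):
    return sum(w * f for w, f in zip(weights, features(x)))

def incremental_update(weights, x, target, lr=0.5):
    """One gradient-descent step on the squared error of a single sample."""
    phi = features(x)
    error = target - sum(w * f for w, f in zip(weights, phi))
    return [w + lr * error * f for w, f in zip(weights, phi)]

def mse(weights, samples):
    return sum((t - predict(weights, x)) ** 2 for x, t in samples) / len(samples)

# Toy two-class stream: inputs near 0 belong to class 0, inputs near 1 to class 1.
stream = [(0.0, 0.0), (0.1, 0.0), (0.9, 1.0), (1.0, 1.0)]
weights = [0.0, 0.0]
mse_before = mse(weights, stream)
for _ in range(200):
    for x, t in stream:
        weights = incremental_update(weights, x, t)
mse_after = mse(weights, stream)
labels = [int(predict(weights, x) > 0.5) for x, _ in stream]
```

    Each arriving sample costs one feature evaluation and one weight update, which is the property that makes this kind of scheme attractive when the training set keeps changing.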

  8. From Molecular Classification to Targeted Therapeutics: The Changing Face of Systemic Therapy in Metastatic Gastroesophageal Cancer

    Directory of Open Access Journals (Sweden)

    Adrian Murphy

    2015-01-01

    Full Text Available Histological classification of adenocarcinoma or squamous cell carcinoma for esophageal cancer or using the Lauren classification for intestinal and diffuse type gastric cancer has limited clinical utility in the management of advanced disease. Germline mutations in E-cadherin (CDH1) or mismatch repair genes (Lynch syndrome) were identified many years ago but given their rarity, the identification of these molecular alterations does not substantially impact treatment in the advanced setting. Recent molecular profiling studies of upper GI tumors have added to our knowledge of the underlying biology but have not led to an alternative classification system which can guide clinician’s therapeutic decisions. Recently the Cancer Genome Atlas Research Network has proposed four subtypes of gastric cancer dividing tumors into those positive for Epstein-Barr virus, microsatellite unstable tumors, genomically stable tumors, and tumors with chromosomal instability. Unfortunately to date, many phase III clinical trials involving molecularly targeted agents have failed to meet their survival endpoints due to their use in unselected populations. Future clinical trials should utilize molecular profiling of individual tumors in order to determine the optimal use of targeted therapies in preselected patients.

  9. From molecular classification to targeted therapeutics: the changing face of systemic therapy in metastatic gastroesophageal cancer.

    Science.gov (United States)

    Murphy, Adrian; Kelly, Ronan J

    2015-01-01

    Histological classification of adenocarcinoma or squamous cell carcinoma for esophageal cancer or using the Lauren classification for intestinal and diffuse type gastric cancer has limited clinical utility in the management of advanced disease. Germline mutations in E-cadherin (CDH1) or mismatch repair genes (Lynch syndrome) were identified many years ago but given their rarity, the identification of these molecular alterations does not substantially impact treatment in the advanced setting. Recent molecular profiling studies of upper GI tumors have added to our knowledge of the underlying biology but have not led to an alternative classification system which can guide clinician's therapeutic decisions. Recently the Cancer Genome Atlas Research Network has proposed four subtypes of gastric cancer dividing tumors into those positive for Epstein-Barr virus, microsatellite unstable tumors, genomically stable tumors, and tumors with chromosomal instability. Unfortunately to date, many phase III clinical trials involving molecularly targeted agents have failed to meet their survival endpoints due to their use in unselected populations. Future clinical trials should utilize molecular profiling of individual tumors in order to determine the optimal use of targeted therapies in preselected patients.

  10. Land Cover and Land Use Classification with TWOPAC: towards Automated Processing for Pixel- and Object-Based Image Classification

    Directory of Open Access Journals (Sweden)

    Stefan Dech

    2012-09-01

    Full Text Available We present a novel and innovative automated processing environment for the derivation of land cover (LC) and land use (LU) information. This processing framework named TWOPAC (TWinned Object and Pixel based Automated classification Chain) enables the standardized, independent, user-friendly, and comparable derivation of LC and LU information, with minimized manual classification labor. TWOPAC allows classification of multi-spectral and multi-temporal remote sensing imagery from different sensor types. TWOPAC enables not only pixel-based classification, but also allows classification based on object-based characteristics. Classification is based on a Decision Tree approach (DT) for which the well-known C5.0 code has been implemented, which builds decision trees based on the concept of information entropy. TWOPAC enables automatic generation of the decision tree classifier based on a C5.0-retrieved ascii-file, as well as fully automatic validation of the classification output via sample based accuracy assessment. Envisaging the automated generation of standardized land cover products, as well as area-wide classification of large amounts of data in preferably a short processing time, standardized interfaces for process control, Web Processing Services (WPS), as introduced by the Open Geospatial Consortium (OGC), are utilized. TWOPAC’s functionality to process geospatial raster or vector data via web resources (server, network) enables TWOPAC’s usability independent of any commercial client or desktop software and allows for large scale data processing on servers. Furthermore, the components of TWOPAC were built-up using open source code components and are implemented as a plug-in for Quantum GIS software for easy handling of the classification process from the user’s perspective.
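
    The notion of building decision trees from information entropy can be made concrete with a small sketch of how a single split is chosen. This is not C5.0 (which adds refinements such as gain ratio and pruning) and not TWOPAC code; the band values and class names are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels, threshold):
    """Entropy reduction obtained by splitting at value <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0
    n = len(labels)
    remainder = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    return entropy(labels) - remainder

def best_threshold(values, labels):
    """Candidate threshold with the highest information gain."""
    candidates = sorted(set(values))[:-1]
    return max(candidates, key=lambda t: information_gain(values, labels, t))

# Hypothetical vegetation-index values for six pixels or objects.
index_values = [0.1, 0.2, 0.3, 0.6, 0.7, 0.8]
land_cover = ["bare", "bare", "bare", "veg", "veg", "veg"]
split = best_threshold(index_values, land_cover)
gain = information_gain(index_values, land_cover, split)
```

    A full tree is obtained by applying the same selection recursively to each side of the chosen split.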

  11. Full Intelligent Cancer Classification of Thermal Breast Images to Assist Physician in Clinical Diagnostic Applications.

    Science.gov (United States)

    Lashkari, AmirEhsan; Pak, Fatemeh; Firouzmand, Mohammad

    2016-01-01

    Breast cancer is the most common type of cancer among women. The key to treating breast cancer is early detection, because according to many pathological studies more than 75% - 80% of all abnormalities are still benign at primary stages; consequently, in recent years many studies and extensive research have aimed at early detection of breast cancer with higher precision and accuracy. Infra-red breast thermography is an imaging technique based on recording temperature distribution patterns of breast tissue. Compared with mammography, thermography is a more suitable technique because it is noninvasive, non-contact, passive and free of ionizing radiation. In this paper, a fully automatic, high-accuracy technique for classifying suspicious areas in thermogram images is presented, with the aim of assisting physicians in early detection of breast cancer. The proposed algorithm consists of four main steps: pre-processing and segmentation, feature extraction, feature selection, and classification. In the first step, the region of interest (ROI) is determined fully automatically and the image quality is improved. Using thresholding and edge detection techniques, the right and left breasts are separated from each other. The relative suspected areas are then segmented and the image matrix is normalized to account for the uniqueness of each person's body temperature. At the feature extraction stage, 23 features, including statistical, morphological, frequency domain, histogram and Gray Level Co-occurrence Matrix (GLCM) based features, are extracted from the segmented right and left breasts obtained in step 1. To find the best features, feature selection methods such as minimum Redundancy and Maximum Relevance (mRMR), Sequential Forward Selection (SFS), Sequential Backward Selection (SBS), Sequential Floating Forward Selection (SFFS), Sequential Floating Backward Selection (SFBS) and Genetic Algorithm (GA) are used in step 3. Finally to classify and TH labeling procedures

  12. An AERONET-based aerosol classification using the Mahalanobis distance

    Science.gov (United States)

    Hamill, Patrick; Giordano, Marco; Ward, Carolyne; Giles, David; Holben, Brent

    2016-09-01

    We present an aerosol classification based on AERONET aerosol data from 1993 to 2012. We used the AERONET Level 2.0 almucantar aerosol retrieval products to define several reference aerosol clusters which are characteristic of the following general aerosol types: Urban-Industrial, Biomass Burning, Mixed Aerosol, Dust, and Maritime. The classification of a particular aerosol observation as one of these aerosol types is determined by its five-dimensional Mahalanobis distance to each reference cluster. We have calculated the fractional aerosol type distribution at 190 AERONET sites, as well as the monthly variation in aerosol type at those locations. The results are presented on a global map and individually in the supplementary material. Our aerosol typing is based on recognizing that different geographic regions exhibit characteristic aerosol types. To generate reference clusters we only keep data points that lie within a Mahalanobis distance of 2 from the centroid. Our aerosol characterization is based on the AERONET retrieved quantities, therefore it does not include low optical depth values. The analysis is based on "point sources" (the AERONET sites) rather than globally distributed values. The classifications obtained will be useful in interpreting aerosol retrievals from satellite borne instruments.
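
    Classification by minimum Mahalanobis distance, together with the distance-2 cutoff used to build reference clusters, can be sketched as below. The abstract works in five dimensions; this sketch uses two dimensions for brevity, and the cluster means and covariances are invented for illustration, not AERONET values.

```python
from math import sqrt

def mahalanobis_2d(x, mean, cov):
    """Mahalanobis distance of a 2-D point from a cluster (mean, 2x2 covariance)."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx0, dx1 = x[0] - mean[0], x[1] - mean[1]
    # dx^T * inverse(cov) * dx, with the 2x2 inverse written out explicitly.
    q = (d * dx0 * dx0 - (b + c) * dx0 * dx1 + a * dx1 * dx1) / det
    return sqrt(q)

# Hypothetical reference clusters: name -> (mean, covariance).
REFERENCE_CLUSTERS = {
    "Dust": ((0.2, 0.5), ((0.01, 0.0), (0.0, 0.04))),
    "Maritime": ((0.8, 0.3), ((0.04, 0.0), (0.0, 0.01))),
}

def classify(observation):
    """Assign the aerosol type whose reference cluster is nearest."""
    return min(
        REFERENCE_CLUSTERS,
        key=lambda name: mahalanobis_2d(observation, *REFERENCE_CLUSTERS[name]),
    )

def in_reference_cluster(point, mean, cov, max_distance=2.0):
    """Cutoff from the abstract: keep only points within distance 2 of the centroid."""
    return mahalanobis_2d(point, mean, cov) <= max_distance

aerosol_type = classify((0.25, 0.6))
unit_check = mahalanobis_2d((2.0, 0.0), (0.0, 0.0), ((4.0, 0.0), (0.0, 1.0)))
```

    Unlike Euclidean distance, this measure scales each direction by the spread of the cluster, so a point two standard deviations away along a wide axis counts the same as one two standard deviations away along a narrow axis.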

  13. Stratification and prognostic relevance of Jass’s molecular classification of colorectal cancer

    Directory of Open Access Journals (Sweden)

    Inti Zlobec

    2012-02-01

    Full Text Available Background: The current proposed model of colorectal tumorigenesis is based primarily on CpG island methylator phenotype (CIMP), microsatellite instability (MSI), KRAS, BRAF, and methylation status of O6-Methylguanine DNA Methyltransferase (MGMT) and classifies tumors into 5 subgroups. The aim of this study is to validate this molecular classification and test its prognostic relevance. Methods: 302 patients were included in this study. Molecular analysis was performed for 5 CIMP-related promoters (CRABP1, MLH1, p16INK4a, CACNA1G, NEUROG1), MGMT, MSI, KRAS and BRAF. Tumors were CIMP-high or CIMP-low if ≥4 and 1-3 promoters were methylated, respectively. Results: CIMP-high, CIMP-low and CIMP-negative were found in 7.1%, 43% and 49.9% cases, respectively. 123 tumors (41%) could not be classified into any one of the proposed molecular subgroups, including 107 CIMP-low, 14 CIMP-high and 2 CIMP-negative cases. The 10-year survival rate for CIMP-high patients (22.6%; 95% CI: 7-43) was significantly lower than for CIMP-low or CIMP-negative (p=0.0295). Only the combined analysis of BRAF and CIMP (negative versus low/high) led to distinct prognostic subgroups. Conclusion: Although CIMP status has an effect on outcome, our results underline the need for standardized definitions of low- and high-level CIMP, the lack of which clearly hinders an effective prognostic and molecular classification of colorectal cancer.
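
    The CIMP thresholds given in the Methods (at least 4 of the 5 promoters methylated for CIMP-high, 1 to 3 for CIMP-low) amount to a simple rule, sketched below. Treating zero methylated promoters as CIMP-negative is inferred from the three categories reported in the Results; the function and variable names are illustrative.

```python
# The five CIMP-related promoters named in the abstract.
CIMP_PROMOTERS = ("CRABP1", "MLH1", "p16INK4a", "CACNA1G", "NEUROG1")

def cimp_status(methylated):
    """Classify a tumor from the set of its methylated promoters."""
    count = sum(1 for promoter in CIMP_PROMOTERS if promoter in methylated)
    if count >= 4:
        return "CIMP-high"
    if count >= 1:
        return "CIMP-low"
    return "CIMP-negative"  # inferred: no methylated promoter

examples = [
    cimp_status({"CRABP1", "MLH1", "p16INK4a", "NEUROG1"}),
    cimp_status({"MLH1"}),
    cimp_status(set()),
]
```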

  14. Hierarchical Classification of Chinese Documents Based on N-grams

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    We explore techniques for using N-gram information to categorize Chinese text documents hierarchically, so that the classifier can shake off the burden of large dictionaries and complex segmentation processing and thus be domain and time independent. A hierarchical Chinese text classifier is implemented. Experimental results show that hierarchically classifying Chinese text documents based on N-grams can achieve satisfactory performance and outperforms other traditional Chinese text classifiers.
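
    The reason N-grams remove the need for a dictionary or word segmenter is that features are read directly off the character sequence. A minimal sketch of character N-gram extraction follows; the function names and the choice of unigrams plus bigrams are assumptions, and the hierarchical classifier itself is not shown.

```python
from collections import Counter

def char_ngrams(text, n):
    """All overlapping character n-grams of a string, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def ngram_profile(text, sizes=(1, 2)):
    """Bag of character n-grams, usable as a feature vector for any classifier."""
    profile = Counter()
    for n in sizes:
        profile.update(char_ngrams(text, n))
    return profile

bigrams = char_ngrams("文本分类", 2)   # "text classification"
profile = ngram_profile("文本分类")
```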

  15. Understanding Acupuncture Based on ZHENG Classification from System Perspective

    OpenAIRE

    Junwei Fang; Ningning Zheng; Yang Wang; Huijuan Cao; Shujun Sun; Jianye Dai; Qianhua Li; Yongyu Zhang

    2013-01-01

    Acupuncture is an efficient therapy that originated in ancient China; studying it on the basis of ZHENG classification is a systematic approach to understanding its complexity. The system perspective contributes to understanding the essence of phenomena and, with the arrival of the systems biology era, broader technology platforms such as omics technologies have been established for the objective study of traditional Chinese medicine (TCM). Omics technologies could dynamically determine molecular c...

  16. Active Dictionary Learning in Sparse Representation Based Classification

    OpenAIRE

    Xu, Jin; He, Haibo; Man, Hong

    2014-01-01

    Sparse representation, which uses dictionary atoms to reconstruct input vectors, has been studied intensively in recent years. A proper dictionary is key to the success of sparse representation. In this paper, an active dictionary learning (ADL) method is introduced, in which classification error and reconstruction error are considered as the active learning criteria in selection of the atoms for dictionary construction. The learned dictionaries are calculated in sparse representation based...

  17. Label-Embedding for Attribute-Based Classification

    OpenAIRE

    Akata, Zeynep; Perronnin, Florent; Harchaoui, Zaid; Schmid, Cordelia

    2013-01-01

    Attributes are an intermediate representation, which enables parameter sharing between classes, a must when training data is scarce. We propose to view attribute-based image classification as a label-embedding problem: each class is embedded in the space of attribute vectors. We introduce a function which measures the compatibility between an image and a label embedding. The parameters of this function are learned on a training set of labeled samples to ensure that, gi...

  18. DATA MINING BASED TECHNIQUE FOR IDS ALERT CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    Hany Nashat Gabra

    2015-06-01

    Full Text Available Intrusion detection systems (IDSs) have become a widely used measure for security systems. The main problem for such systems is the irrelevant alerts. We propose a data mining based method for classification to distinguish serious and irrelevant alerts with a performance of 99.9%, which is better in comparison with the other recent data mining methods that achieved 97%. A ranked alerts list is also created according to the alert’s importance to minimize human interventions.

  19. Simple-Random-Sampling-Based Multiclass Text Classification Algorithm

    OpenAIRE

    Wuying Liu; Lin Wang; Mianzhu Yi

    2014-01-01

    Multiclass text classification (MTC) is a challenging issue and the corresponding MTC algorithms can be used in many applications. The space-time overhead of the algorithms must be a concern in the era of big data. Through the investigation of the token frequency distribution in a Chinese web document collection, this paper reexamines the power law and proposes a simple-random-sampling-based MTC (SRSMTC) algorithm. Supported by a token level memory to store labeled documents, the SRSMTC al...

  20. Expected energy-based restricted Boltzmann machine for classification.

    Science.gov (United States)

    Elfwing, S; Uchibe, E; Doya, K

    2015-04-01

    In classification tasks, restricted Boltzmann machines (RBMs) have predominantly been used in the first stage, either as feature extractors or to provide initialization of neural networks. In this study, we propose a discriminative learning approach to provide a self-contained RBM method for classification, inspired by free-energy based function approximation (FE-RBM), originally proposed for reinforcement learning. For classification, the FE-RBM method computes the output for an input vector and a class vector by the negative free energy of an RBM. Learning is achieved by stochastic gradient-descent using a mean-squared error training objective. In an earlier study, we demonstrated that the performance and the robustness of FE-RBM function approximation can be improved by scaling the free energy by a constant that is related to the size of network. In this study, we propose that the learning performance of RBM function approximation can be further improved by computing the output by the negative expected energy (EE-RBM), instead of the negative free energy. To create a deep learning architecture, we stack several RBMs on top of each other. We also connect the class nodes to all hidden layers to try to improve the performance even further. We validate the classification performance of EE-RBM using the MNIST data set and the NORB data set, achieving competitive performance compared with other classifiers such as standard neural networks, deep belief networks, classification RBMs, and support vector machines. The purpose of using the NORB data set is to demonstrate that EE-RBM with binary input nodes can achieve high performance in the continuous input domain. PMID:25318375
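
    The difference between the two outputs compared in this abstract can be sketched numerically. The abstract gives no formulas, so the expressions below are the standard ones for an RBM with binary hidden units and are an assumption here: the negative free energy sums a softplus over the hidden inputs, while the negative expected energy sums each hidden input times its sigmoid activation. Weights and inputs are hypothetical.

```python
from math import exp, log

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def softplus(z):
    return log(1.0 + exp(z))

def hidden_inputs(v, W, c):
    """Total input z_j = c_j + sum_i W[i][j] * v[i] to each hidden unit."""
    return [c[j] + sum(W[i][j] * v[i] for i in range(len(v))) for j in range(len(c))]

def negative_free_energy(v, W, b, c):
    """FE-RBM style output: b.v + sum_j softplus(z_j)."""
    z = hidden_inputs(v, W, c)
    return sum(bi * vi for bi, vi in zip(b, v)) + sum(softplus(zj) for zj in z)

def negative_expected_energy(v, W, b, c):
    """EE-RBM style output: b.v + sum_j z_j * sigmoid(z_j)."""
    z = hidden_inputs(v, W, c)
    return sum(bi * vi for bi, vi in zip(b, v)) + sum(zj * sigmoid(zj) for zj in z)

# Hypothetical 2-visible, 2-hidden RBM; v would hold the input and class vector.
W = [[2.0, -1.0], [0.5, 0.5]]
b = [0.1, 0.2]
c = [0.0, 0.0]
v = [1.0, 0.0]
nfe = negative_free_energy(v, W, b, c)
nee = negative_expected_energy(v, W, b, c)
```

    The two quantities differ by the entropy of the hidden units given the visible vector, so the negative free energy is never smaller than the negative expected energy.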

  1. Tree-based disease classification using protein data.

    Science.gov (United States)

    Zhu, Hongtu; Yu, Chang-Yung; Zhang, Heping

    2003-09-01

    A reliable and precise classification of diseases is essential for successful diagnosis and treatment. Using mass spectrometry from clinical specimens, scientists may find the protein variations among diseases and use this information to improve diagnosis. In this paper, we propose a novel procedure to classify disease status based on the protein data from mass spectrometry. Our new tree-based algorithm consists of three steps: projection, selection and classification tree. The projection step aims to project all observations from specimens into the same bases so that the projected data have fixed coordinates. Thus, for each specimen, we obtain a large vector of 'coefficients' on the same basis. The purpose of the selection step is data reduction by condensing the large vector from the projection step into a much lower order of informative vector. Finally, using these reduced vectors, we apply recursive partitioning to construct an informative classification tree. This method has been successfully applied to protein data, provided by the Department of Radiology and Chemistry at Duke University.

  2. Classification of pulmonary airway disease based on mucosal color analysis

    Science.gov (United States)

    Suter, Melissa; Reinhardt, Joseph M.; Riker, David; Ferguson, John Scott; McLennan, Geoffrey

    2005-04-01

    Airway mucosal color changes occur in response to the development of bronchial diseases including lung cancer, cystic fibrosis, chronic bronchitis, emphysema and asthma. These associated changes are often visualized using standard macro-optical bronchoscopy techniques. A limitation to this form of assessment is that the subtle changes that indicate early stages in disease development may often be missed as a result of this highly subjective assessment, especially in inexperienced bronchoscopists. Tri-chromatic CCD chip bronchoscopes allow for digital color analysis of the pulmonary airway mucosa. This form of analysis may facilitate a greater understanding of airway disease response. A 2-step image classification approach is employed: the first step is to distinguish between healthy and diseased bronchoscope images and the second is to classify the detected abnormal images into 1 of 4 possible disease categories. A database of airway mucosal color constructed from healthy human volunteers is used as a standard against which statistical comparisons are made from mucosa with known apparent airway abnormalities. This approach demonstrates great promise as an effective detection and diagnosis tool to highlight potentially abnormal airway mucosa identifying a region possibly suited to further analysis via airway forceps biopsy, or newly developed micro-optical biopsy strategies. Following the identification of abnormal airway images a neural network is used to distinguish between the different disease classes. We have shown that classification of potentially diseased airway mucosa is possible through comparative color analysis of digital bronchoscope images. The combination of the two strategies appears to increase the classification accuracy in addition to greatly decreasing the computational time.

  3. Lung Cancer Early Diagnosis Using Some Data Mining Classification Techniques: A Survey

    Directory of Open Access Journals (Sweden)

    Thangaraju P

    2014-06-01

    Full Text Available Data mining is the process of analyzing data from different perspectives and summarizing it into useful information. It is used for this purpose in diverse fields such as retail, finance, communication, marketing and medicine. Data mining plays an important role in healthcare organizations because of the growth of population and of dangerous, deadly diseases such as cancer, SARS, leprosy and HIV; lung cancer is one of the most dangerous of these diseases. This survey covers medical image mining, data preprocessing, feature extraction, rule generation and classification, and provides a basic framework for further improvement in medical diagnosis.

  4. Long-term Prostate-specific Antigen Velocity in Improved Classification of Prostate Cancer Risk and Mortality

    DEFF Research Database (Denmark)

    Ørsted, David Dynnes; Bojesen, Stig E; Kamstrup, Pia R;

    2013-01-01

    BACKGROUND: It remains unclear whether adding long-term prostate-specific antigen velocity (PSAV) to baseline PSA values improves classification of prostate cancer (PCa) risk and mortality in the general population. OBJECTIVE: To determine whether long-term PSAV improves classification of PCa risk...

  5. The DTW-based representation space for seismic pattern classification

    Science.gov (United States)

    Orozco-Alzate, Mauricio; Castro-Cabrera, Paola Alexandra; Bicego, Manuele; Londoño-Bonilla, John Makario

    2015-12-01

    Distinguishing among the different seismic volcanic patterns is still one of the most important and labor-intensive tasks for volcano monitoring. This task could be lightened and made free from subjective bias by using automatic classification techniques. In this context, a core but often overlooked issue is the choice of an appropriate representation of the data to be classified. Recently, it has been suggested that using a relative representation (i.e. proximities, namely dissimilarities on pairs of objects) instead of an absolute one (i.e. features, namely measurements on single objects) is advantageous to exploit the relational information contained in the dissimilarities to derive highly discriminant vector spaces, where any classifier can be used. According to that motivation, this paper investigates the suitability of a dynamic time warping (DTW) dissimilarity-based vector representation for the classification of seismic patterns. Results show the usefulness of such a representation in the seismic pattern classification scenario, including analyses of potential benefits from recent advances in the dissimilarity-based paradigm such as the proper selection of representation sets and the combination of different dissimilarity representations that might be available for the same data.
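
    The dissimilarity-based representation described here can be sketched in a few lines: each signal is replaced by the vector of its DTW distances to a small representation set, and any ordinary classifier can then work on those vectors. The sketch uses the textbook DTW recurrence with an absolute-difference local cost; the toy signals and the representation set are hypothetical, and the authors' exact DTW variant is not specified in the abstract.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def dissimilarity_vector(signal, representation_set):
    """Represent a signal by its DTW distances to each prototype."""
    return [dtw_distance(signal, prototype) for prototype in representation_set]

# Hypothetical prototypes standing in for labelled seismic events.
representation_set = [[1, 2, 2, 3], [0, 0]]
vector = dissimilarity_vector([1, 2, 3], representation_set)
```

    The length of the resulting vector equals the size of the representation set, which is why the abstract singles out the selection of that set as a design question.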

  6. Data Classification Based on Confidentiality in Virtual Cloud Environment

    Directory of Open Access Journals (Sweden)

    Munwar Ali Zardari

    2014-10-01

    The aim of this study is to provide suitable security to data based on the security needs of the data. In the cloud, it is very difficult to decide which data need what level of security and which data need none. However, the security level is easy to decide once the data have been classified according to their characteristics. In this study, we propose a data classification cloud model to solve the data confidentiality issue in cloud computing environments. The data are classified into two major classes: sensitive and non-sensitive. The K-Nearest Neighbour (K-NN) classifier is used for data classification, and the Rivest, Shamir and Adleman (RSA) algorithm is used to encrypt sensitive data. After implementing the proposed model, it is found that the confidentiality level of data is increased, and the model proves to be more cost- and memory-friendly for users as well as for cloud service providers. Data storage is one of the cloud services in which the data servers of all users are virtualized. In a cloud server, data are stored in two ways: either the received data are encrypted and then stored on cloud servers, or the data are stored on the cloud servers without encryption. Both storage methods can face data confidentiality issues, because the data have different values and characteristics that must be identified before the data are sent to cloud servers.
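
    The classification step can be sketched as a plain K-NN vote over labeled records. This is only an illustration: the two numeric features, the training records and the function name `knn_predict` are hypothetical, and the RSA encryption stage is not shown.

```python
from collections import Counter


def knn_predict(train, query, k=3):
    """Label a query by majority vote among its k nearest training records.

    train is a list of (feature_vector, label) pairs; squared Euclidean
    distance is used, which gives the same ranking as Euclidean distance.
    """
    by_distance = sorted(
        train,
        key=lambda item: sum((x - y) ** 2 for x, y in zip(item[0], query)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical records described by two made-up features in [0, 1].
train = [
    ((0.9, 0.8), "sensitive"),
    ((0.8, 0.9), "sensitive"),
    ((0.7, 0.7), "sensitive"),
    ((0.1, 0.2), "non-sensitive"),
    ((0.2, 0.1), "non-sensitive"),
    ((0.3, 0.3), "non-sensitive"),
]
label_a = knn_predict(train, (0.85, 0.75))
label_b = knn_predict(train, (0.15, 0.25))
```

    In a pipeline like the one described, only records labeled sensitive would then be passed to the encryption step.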

  7. Simple-Random-Sampling-Based Multiclass Text Classification Algorithm

    Directory of Open Access Journals (Sweden)

    Wuying Liu

    2014-01-01

    Multiclass text classification (MTC) is a challenging issue, and the corresponding MTC algorithms can be used in many applications. In the era of big data, the space-time overhead of these algorithms must be taken into account. Through an investigation of the token frequency distribution in a Chinese web document collection, this paper reexamines the power law and proposes a simple-random-sampling-based MTC (SRSMTC) algorithm. Supported by a token-level memory that stores labeled documents, the SRSMTC algorithm uses a text retrieval approach to solve text classification problems. Experimental results on the TanCorp data set show that the SRSMTC algorithm can achieve state-of-the-art performance at greatly reduced space-time requirements.

  8. A Fuzzy Similarity Based Concept Mining Model for Text Classification

    CERN Document Server

    Puri, Shalini

    2012-01-01

    Text classification is a challenging and highly active field that has great importance in text categorization applications. A lot of research has been done in this field, but there remains a need to categorize a collection of text documents into mutually exclusive categories by extracting concepts or features using the supervised learning paradigm and different classification algorithms. In this paper, a new Fuzzy Similarity Based Concept Mining Model (FSCMM) is proposed to classify a set of text documents into predefined Category Groups (CG) by training and preparing them on the sentence, document and integrated corpora levels, along with feature reduction and ambiguity removal on each level, to achieve high system performance. A Fuzzy Feature Category Similarity Analyzer (FFCSA) is used to analyze each extracted feature of the Integrated Corpora Feature Vector (ICFV) against the corresponding categories or classes. This model uses a Support Vector Machine Classifier (SVMC) to classify correct...

  9. Semantic analysis based forms information retrieval and classification

    Science.gov (United States)

    Saba, Tanzila; Alqahtani, Fatimah Ayidh

    2013-09-01

    Data entry forms are employed in all types of enterprises to collect information from hundreds of customers on a daily basis. The information is filled in manually by the customers. Hence, it is laborious and time-consuming to use human operators to transfer this customer information into computers manually. Additionally, it is expensive, and human errors might cause serious flaws. The automatic interpretation of scanned forms has facilitated many real applications from the point of view of speed and accuracy, such as keyword spotting, sorting of postal addresses, script matching and writer identification. This research deals with different strategies to extract customer information from these scanned forms and to interpret and classify it. Accordingly, extracted information is segmented into characters for classification and finally stored as records in databases for further processing. This paper presents a detailed discussion of these semantic-based analysis strategies for forms processing. Finally, new directions are recommended for future research.

  10. Entropy coders for image compression based on binary forward classification

    Science.gov (United States)

    Yoo, Hoon; Jeong, Jechang

    2000-12-01

    Entropy coders, as a noiseless compression method, are widely used as the final compression step for images, and there have been many contributions to increasing entropy coder performance and reducing entropy coder complexity. In this paper, we propose entropy coders based on binary forward classification (BFC). The BFC requires classification overhead, but the total amount of classified output information equals the amount of input information, a property that we prove in this paper. Using this property, we propose entropy coders that consist of the BFC followed by Golomb-Rice coders (BFC+GR) and the BFC followed by arithmetic coders (BFC+A). The proposed entropy coders introduce negligible additional complexity due to the BFC. Simulation results also show better performance than other entropy coders of similar complexity.
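
    For readers unfamiliar with the second stage of BFC+GR, the following is a minimal sketch of a standard Golomb-Rice (power-of-two Golomb) coder for non-negative integers. It does not implement the BFC itself, which the abstract does not specify; the function names and the unary convention (ones terminated by a zero) are assumptions.

```python
def rice_encode(n, k):
    """Encode a non-negative integer n with Rice parameter k as a bit string.

    The quotient n >> k is written in unary (ones followed by a single zero),
    and the remainder is written in exactly k binary digits.
    """
    q, r = n >> k, n & ((1 << k) - 1)
    bits = "1" * q + "0"
    if k:
        bits += format(r, "0{}b".format(k))
    return bits


def rice_decode(bits, k):
    """Decode a single codeword produced by rice_encode with the same k."""
    q = bits.index("0")  # number of leading ones = quotient
    r = int(bits[q + 1:q + 1 + k], 2) if k else 0
    return (q << k) | r


codeword = rice_encode(9, 2)   # quotient 2, remainder 1
decoded = rice_decode(codeword, 2)
```

    Small values get short codewords, so such a coder is efficient when the classified symbols are concentrated near zero.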

  11. An ellipse detection algorithm based on edge classification

    Science.gov (United States)

    Yu, Liu; Chen, Feng; Huang, Jianming; Wei, Xiangquan

    2015-12-01

    To enhance the speed and accuracy of ellipse detection, an ellipse detection algorithm based on edge classification is proposed. Excess edge points are removed by serializing each edge into a sequence of points and applying a distance constraint between edge points. Effective classification is achieved using a criterion on the angle between edge points, which greatly increases the probability that randomly selected edge points fall on the same ellipse. Ellipse fitting accuracy is significantly improved by optimizing the RED algorithm, using Euclidean distance to measure the distance from an edge point to the elliptical boundary. Experimental results show that the algorithm detects ellipses well when edges are subject to interference or block each other, and that it has higher detection precision and lower time consumption than the RED algorithm.

  12. Apparent diffusion coefficient value of gastric cancer by diffusion-weighted imaging: Correlations with the histological differentiation and Lauren classification

    Energy Technology Data Exchange (ETDEWEB)

    Liu, Song, E-mail: songliu532909756@gmail.com [Department of Radiology, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008 (China); Guan, Wenxian, E-mail: wenxianguan123@126.com [Department of Gastrointestinal Surgery, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008 (China); Wang, Hao, E-mail: wanghao20140525@126.com [Department of Gastrointestinal Surgery, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008 (China); Pan, Liang, E-mail: panliang2014@126.com [Department of Radiology, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008 (China); Zhou, Zhuping, E-mail: zhupingzhou@126.com [Department of Radiology, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008 (China); Yu, Haiping, E-mail: haipingyu2012@126.com [Department of Radiology, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008 (China); Liu, Tian, E-mail: tianliu2014@126.com [Radiation Oncology and Winship Cancer Institute, Emory University, Atlanta, GA 30322 (United States); Yang, Xiaofeng, E-mail: xiaofengyang2014@126.com [Radiation Oncology and Winship Cancer Institute, Emory University, Atlanta, GA 30322 (United States); He, Jian, E-mail: hjxueren@126.com [Department of Radiology, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008 (China); Zhou, Zhengyang, E-mail: zyzhou@nju.edu.cn [Department of Radiology, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008 (China)

    2014-12-15

    Highlights: • Gastric cancers’ ADC values were significantly lower than normal gastric wall. • Gastric adenocarcinomas with different differentiation had different ADC values. • Gastric adenocarcinomas’ ADC values correlated with histologic differentiations. • Gastric cancers’ ADC values correlated with Lauren classifications. • Mean ADC value was better than min ADC value in characterizing gastric cancers. - Abstract: Objective: The purpose of this study was to evaluate the correlations between histological differentiation and Lauren classification of gastric cancer and the apparent diffusion coefficient (ADC) value of diffusion weighted imaging (DWI). Materials and methods: Sixty-nine patients with gastric cancer lesions underwent preoperative magnetic resonance imaging (MRI) (3.0T) and surgical resection. DWI was obtained with a single-shot, echo-planar imaging sequence in the axial plane (b values: 0 and 1000 s/mm²). Mean and minimum ADC values were obtained for each gastric cancer and normal gastric walls by two radiologists, who were blinded to the histological findings. Histological type, degree of differentiation and Lauren classification of each resected specimen were determined by one pathologist. Mean and minimum ADC values of gastric cancers with different histological types, degrees of differentiation and Lauren classifications were compared. Correlations between ADC values and histological differentiation and Lauren classification were analyzed. Results: The mean and minimum ADC values of gastric cancers, as a whole and separately, were significantly lower than those of normal gastric walls (all p values <0.001). There were significant differences in the mean and minimum ADC values among gastric cancers with different histological types, degrees of differentiation and Lauren classifications (p < 0.05). Mean and minimum ADC values correlated significantly (all p < 0.001) with histological differentiation (r = 0.564, 0.578) and

  13. Local fractal dimension based approaches for colonic polyp classification.

    Science.gov (United States)

    Häfner, Michael; Tamaki, Toru; Tanaka, Shinji; Uhl, Andreas; Wimmer, Georg; Yoshida, Shigeto

    2015-12-01

    This work introduces texture analysis methods that are based on computing the local fractal dimension (LFD, also called the local density function) and applies them to colonic polyp classification. The methods are tested on 8 HD-endoscopic image databases, where each database is acquired using different imaging modalities (Pentax's i-Scan technology combined with or without staining the mucosa) and on a zoom-endoscopic image database using narrow band imaging. In this paper, we present three novel extensions to an LFD based approach. These extensions additionally extract shape and/or gradient information of the image to enhance the discriminativity of the original approach. To compare the results of the LFD based approaches with the results of other approaches, five state-of-the-art approaches for colonic polyp classification are applied to the employed databases. Experiments show that LFD based approaches are well suited for colonic polyp classification, especially the three proposed extensions. The three proposed extensions are the best performing methods or at least among the best performing methods for each of the employed databases. The methods are additionally tested by means of a public texture image database, the UIUCtex database. With this database, the viewpoint invariance of the methods is assessed, an important feature for the employed endoscopic image databases. Results imply that most of the LFD based methods are more viewpoint invariant than the other methods. However, the shape, size and orientation adapted LFD approaches (which are especially designed to enhance the viewpoint invariance) are in general not more viewpoint invariant than the other LFD based approaches.

  14. Rule based fuzzy logic approach for classification of fibromyalgia syndrome.

    Science.gov (United States)

    Arslan, Evren; Yildiz, Sedat; Albayrak, Yalcin; Koklukaya, Etem

    2016-06-01

    Fibromyalgia syndrome (FMS) is a chronic muscle and skeletal system disease observed generally in women, manifesting itself with widespread pain and impairing the individual's quality of life. FMS diagnosis is made based on the American College of Rheumatology (ACR) criteria. However, the applicability and sufficiency of the ACR criteria have recently come under debate. In this context, several evaluation methods, including clinical evaluation methods, have been proposed by researchers. Accordingly, the ACR had to update its criteria, announced back in 1990, in 2010 and 2011. The proposed rule-based fuzzy logic method aims to evaluate FMS from a different angle as well. This method contains a rule base derived from the 1990 ACR criteria and the individual experiences of specialists. The study was conducted using data collected from 60 inpatients and 30 healthy volunteers. Several tests and physical examinations were administered to the participants. The fuzzy logic rule base was structured using the parameters of tender point count, chronic widespread pain period, pain severity, fatigue severity and sleep disturbance level, which were deemed important in FMS diagnosis. It was observed that, in general, the fuzzy predictor was 95.56% consistent with at least one of the specialists, who were not creators of the fuzzy rule base. Thus, in diagnosis classification, where the severity of FMS was classified as well, consistent findings were obtained from the comparison of the interpretations and experiences of specialists with the fuzzy logic approach. The study proposes a rule base that could eliminate the shortcomings of the 1990 ACR criteria during the FMS evaluation process. Furthermore, the proposed method presents a classification of the severity of the disease, which was not available with the ACR criteria. The study was not limited to disease classification; the probability of occurrence and the severity were classified at the same time. In addition, those who were not suffering from FMS were

  16. Network Traffic Anomalies Identification Based on Classification Methods

    Directory of Open Access Journals (Sweden)

    Donatas Račys

    2015-07-01

    The problem of detecting network traffic anomalies in computer networks is analyzed. An overview of anomaly detection methods is given, and the advantages and disadvantages of the different methods are analyzed. A model for traffic anomaly detection was developed based on IBM SPSS Modeler and is used to analyze SNMP data from a router. The traffic anomalies were investigated using three classification methods and different sets of learning data. Based on the results of the investigation, it was determined that the C5.1 decision tree method has the highest accuracy and performance and can be successfully used for identification of network traffic anomalies.

  17. Spectral classification of stars based on LAMOST spectra

    CERN Document Server

    Liu, Chao; Zhang, Bo; Wan, Jun-Chen; Deng, Li-Cai; Hou, Yonghui; Wang, Yuefei; Yang, Ming; Zhang, Yong

    2015-01-01

    In this work, we select the high signal-to-noise ratio spectra of stars from the LAMOST data and map their MK classes to the spectral features. The equivalent widths of the prominent spectral lines, playing a role similar to that of multi-color photometry, form a clean stellar locus well ordered by MK classes. The advantage of the stellar locus in line indices is that it gives a natural and continuous classification of stars consistent with either the broadly used MK classes or the stellar astrophysical parameters. We also employ an SVM-based classification algorithm to assign MK classes to the LAMOST stellar spectra. We find that the completeness of the classification is up to 90% for A and G type stars, while it is down to about 50% for OB and K type stars. About 40% of the OB and K type stars are misclassified as A and G type stars, respectively. This is likely because the differences in spectral features between the late B type and early A type stars, or between the late G and early K type stars, are very we...

  18. Risk Classification and Risk-based Safety and Mission Assurance

    Science.gov (United States)

    Leitner, Jesse A.

    2014-01-01

    Recent activities to revamp and emphasize the need to streamline processes and activities for Class D missions across the agency have led to various interpretations of Class D, including the lumping of a variety of low-cost projects into Class D. Sometimes terms such as Class D minus are used. In this presentation, mission risk classifications will be traced to official requirements and definitions as a measure to ensure that projects and programs align with the guidance and requirements that are commensurate for their defined risk posture. As part of this, the full suite of risk classifications, formal and informal, will be defined, followed by an introduction to the new GPR 8705.4 that is currently under review. GPR 8705.4 lays out guidance for the mission success activities performed at the Classes A-D for NPR 7120.5 projects as well as for projects not under NPR 7120.5. Furthermore, the trends in stepping from Class A into higher risk posture classifications will be discussed. The talk will conclude with a discussion about risk-based safety and mission assurance at GSFC.

  19. Classification of body movements based on posturographic data.

    Science.gov (United States)

    Saripalle, Sashi K; Paiva, Gavin C; Cliett, Thomas C; Derakhshani, Reza R; King, Gregory W; Lovelace, Christopher T

    2014-02-01

    The human body, standing on two feet, produces a continuous sway pattern. Intended movements, sensory cues, emotional states, and illnesses can all lead to subtle changes in sway appearing as alterations in ground reaction forces and the body's center of pressure (COP). The purpose of this study is to demonstrate that carefully selected COP parameters and classification methods can differentiate among specific body movements while standing, providing new prospects in camera-free motion identification. Force platform data were collected from participants performing 11 choreographed postural and gestural movements. Twenty-three different displacement- and frequency-based features were extracted from COP time series, and supplied to classification-guided feature extraction modules. For identification of movement type, several linear and nonlinear classifiers were explored, including linear discriminants, nearest neighbor classifiers, and support vector machines. The average classification rates on previously unseen test sets ranged from 67% to 100%. Within the context of this experiment, no single method was able to uniformly outperform the others for all movement types, and therefore a set of movement-specific features and classifiers is recommended.

  20. Network analysis of genes regulated in renal diseases: implications for a molecular-based classification

    Directory of Open Access Journals (Sweden)

    Jagadish HV

    2009-09-01

    Background: Chronic renal diseases are currently classified based on morphological similarities such as whether they produce predominantly inflammatory or non-inflammatory responses. However, such classifications do not reliably predict the course of the disease and its response to therapy. In contrast, recent studies in diseases such as breast cancer suggest that a classification which includes molecular information could lead to more accurate diagnoses and prediction of treatment response. This article describes how we extracted gene expression profiles from biopsies of patients with chronic renal diseases, and used network visualizations and associated quantitative measures to rapidly analyze similarities and differences between the diseases. Results: The analysis revealed three main regularities: (1) many genes associated with a single disease, and fewer genes associated with many diseases; (2) unexpected combinations of renal diseases that share relatively large numbers of genes; (3) uniform concordance in the regulation of all genes in the network. Conclusion: The overall results suggest the need to define a molecular-based classification of renal diseases, in addition to hypotheses for the unexpected patterns of shared genes and the uniformity in gene concordance. Furthermore, the results demonstrate the utility of network analyses to rapidly understand complex relationships between diseases and regulated genes.

  1. Content-based image retrieval applied to BI-RADS tissue classification in screening mammography

    OpenAIRE

    2011-01-01

    AIM: To present a content-based image retrieval (CBIR) system that supports the classification of breast tissue density and can be used in the processing chain to adapt parameters for lesion segmentation and classification.

  2. A Chemistry-Based Classification for Peridotite Xenoliths

    Science.gov (United States)

    Block, K. A.; Ducea, M.; Raye, U.; Stern, R. J.; Anthony, E. Y.; Lehnert, K. A.

    2007-12-01

    The development of a petrological and geochemical database for mantle xenoliths is important for interpreting EarthScope geophysical results. Interpretation of compositional characteristics of xenoliths requires a sound basis for comparing geochemical results, even when no petrographic modes are available. Peridotite xenoliths are generally classified on the basis of mineralogy (Streckeisen, 1973) derived from point-counting methods. Modal estimates, particularly on heterogeneous samples, are conducted using various methodologies and are therefore subject to large statistical error. Also, many studies simply do not report the modes. Other classifications for peridotite xenoliths based on host matrix or tectonic setting (cratonic vs. non-cratonic) are poorly defined and provide little information on where samples from transitional settings fit within a classification scheme (e.g., xenoliths from circum-cratonic locations). We present here a classification for peridotite xenoliths based on bulk rock major element chemistry, which is one of the most common types of data reported in the literature. A chemical dataset of over 1150 peridotite xenoliths is compiled from two online geochemistry databases, the EarthChem Deep Lithosphere Dataset and GEOROC (http://www.earthchem.org), and is downloaded with the rock names reported in the original publications. Ternary plots of combinations of the SiO2-CaO-Al2O3-MgO (SCAM) components display sharp boundaries that define the dunite, harzburgite, lherzolite, or wehrlite-pyroxenite fields and provide a graphical basis for classification. In addition, for the CaO-Al2O3-MgO (CAM) diagram, a boundary between harzburgite and lherzolite at approximately 19% CaO is defined by a plot of over 160 abyssal peridotite compositions calculated from observed modes using the methods of Asimow (1999) and Baker and Beckett (1999). We anticipate that our SCAM classification is a first step in the development of a uniform basis for

  3. A NEW FUNCTIONAL CLASSIFICATION OF STOMACH CANCER AND ITS PATHOBIOLOGICAL AND CLINICAL SIGNIFICANCE

    Institute of Scientific and Technical Information of China (English)

    辛彦; 赵风凯; 宫伟; 王艳萍; 张荫昌; 闫瑞方

    1994-01-01

    The functional differentiations of stomach cancer specimens from 121 patients were investigated by enzyme-, mucin-, affinity- and immunohistochemical methods, and the stomach cancers were divided into five functionally differentiated types: 1) Absorptive Function Differentiation Type (AFDT), 19.8%; 2) Mucin Secreting Function Differentiation Type (MSFDT), 24.0%; 3) Absorptive and Mucin-Producing Function Differentiation Type (AMPFDT), 47.1%; 4) Special Function Differentiation Type (SFDT), 0.8%; and 5) Non-Function Differentiation Type (NFDT), 8.3%. The results indicate that stomach cancer tissues of the same histological type often display differing functional differentiation, and these functionally differentiated types have different invasive and metastatic characteristics. In addition, the functionally differentiated types have particular organic affinities of metastasis and different clinical prognoses. This study suggests that this new functional classification may supplement histological classification. The mechanisms of liver and ovary metastases of stomach cancer are also discussed.

  4. Intrusion Awareness Based on Data Fusion and SVM Classification

    Directory of Open Access Journals (Sweden)

    Ramnaresh Sharma

    2012-06-01

    Network intrusion awareness is an important factor in the risk analysis of network security. In the current decade, various methods and frameworks have become available for intrusion detection and security awareness. Some methods are based on the knowledge discovery process, and some frameworks are based on neural networks. All of these models take rule-based decisions to generate security alerts. In this paper, we propose a novel method for intrusion awareness using data fusion and SVM classification. Data fusion works on the basis of gathering the features of events. The support vector machine is a powerful classifier of data. Here we use the SVM to detect the closed items of the rule-based technique. Our proposed method is simulated on the KDD1999 DARPA data set and achieves better empirical evaluation results than the rule-based technique and the neural network model.

  5. Intrusion Awareness Based on Data Fusion and SVM Classification

    Directory of Open Access Journals (Sweden)

    Ramnaresh Sharma

    2012-06-01

    Network intrusion awareness is an important factor in the risk analysis of network security. In the current decade, various methods and frameworks have become available for intrusion detection and security awareness. Some methods are based on the knowledge discovery process, and some frameworks are based on neural networks. All of these models take rule-based decisions to generate security alerts. In this paper, we propose a novel method for intrusion awareness using data fusion and SVM classification. Data fusion works on the basis of gathering the features of events. The support vector machine is a powerful classifier of data. Here we use the SVM to detect the closed items of the rule-based technique. Our proposed method is simulated on the KDD1999 DARPA data set and achieves better empirical evaluation results than the rule-based technique and the neural network model.

  6. Content Based Image Retrieval : Classification Using Neural Networks

    Directory of Open Access Journals (Sweden)

    Shereena V.B

    2014-10-01

    In a content-based image retrieval (CBIR) system, the main issue is to extract the image features that effectively represent the image contents in a database. Such an extraction requires a detailed evaluation of the retrieval performance of image features. This paper presents a review of fundamental aspects of content-based image retrieval, including feature extraction of color and texture features. Commonly used color features, including color moments, color histogram and color correlogram, and the Gabor texture feature are compared. The paper reviews the increase in efficiency of image retrieval when the color and texture features are combined. The similarity measures on which matches are made and images are retrieved are also discussed. For effective indexing and fast searching of images based on visual features, neural-network-based pattern learning can be used to achieve effective classification.

  7. Content Based Image Retrieval : Classification Using Neural Networks

    Directory of Open Access Journals (Sweden)

    Shereena V.B

    2014-11-01

    In a content-based image retrieval (CBIR) system, the main issue is to extract the image features that effectively represent the image contents in a database. Such an extraction requires a detailed evaluation of the retrieval performance of image features. This paper presents a review of fundamental aspects of content-based image retrieval, including feature extraction of color and texture features. Commonly used color features, including color moments, color histogram and color correlogram, and the Gabor texture feature are compared. The paper reviews the increase in efficiency of image retrieval when the color and texture features are combined. The similarity measures on which matches are made and images are retrieved are also discussed. For effective indexing and fast searching of images based on visual features, neural-network-based pattern learning can be used to achieve effective classification.

  8. Texton Based Shape Features on Local Binary Pattern for Age Classification

    OpenAIRE

    V. Vijaya Kumar; B. Eswara Reddy; P. Chandra Sekhar Reddy

    2012-01-01

    Classification and recognition of objects is of interest to many researchers. Shape is a significant feature of objects and plays a crucial role in image classification and recognition. The present paper assumes that the features that most strongly affect the adulthood classification system are the shape features (SF) of the face. Based on this, the present paper proposes a new technique of adulthood classification by extracting feature parameters of the face on Integrated Texton based LBP (IT-LBP) ima...

  9. A generalized representation-based approach for hyperspectral image classification

    Science.gov (United States)

    Li, Jiaojiao; Li, Wei; Du, Qian; Li, Yunsong

    2016-05-01

    The sparse representation-based classifier (SRC) has recently attracted great interest for hyperspectral image classification. A testing pixel is assumed to be a linear combination of atoms of a dictionary; here, the dictionary includes all the training samples. The objective is to find a weight vector that yields a minimum L2 representation error under the constraint that the weight vector is sparse with a minimum L1 norm. The pixel is assigned to the class whose training samples yield the minimum error. In addition, a collaborative representation-based classifier (CRC) has also been proposed, in which the weight vector has a minimum L2 norm. The CRC has a closed-form solution; when using class-specific representation it can yield even better performance than the SRC. Unlike traditional classifiers such as the support vector machine (SVM), SRC and CRC do not follow the traditional training-testing procedure of supervised learning, yet their performance is similar to or even better than that of SVM. In this paper, we investigate a generalized representation-based classifier that uses an Lq representation error, an Lp weight norm, and adaptive regularization. The classification performance of Lq and Lp combinations is evaluated with several real hyperspectral datasets. Based on these experiments, recommendations are provided for practical implementation.
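
    The closed-form CRC solution mentioned in the abstract above can be written out as a short sketch. The symbols are assumptions introduced here, not taken from the paper: X is the dictionary whose columns are the training samples, y is the testing pixel, \alpha is the weight vector, \lambda > 0 is a regularization parameter, and X_c, \hat{\alpha}_c are the parts of X and \hat{\alpha} belonging to class c. The last line is one plausible reading of the generalized Lq/Lp objective; the paper's exact formulation, including its adaptive regularization, may differ.

```latex
% CRC: L2 representation error with an L2 penalty on the weights (ridge form)
\[
\hat{\alpha}
  = \arg\min_{\alpha} \; \|y - X\alpha\|_2^2 + \lambda \|\alpha\|_2^2
  = \left(X^{\top}X + \lambda I\right)^{-1} X^{\top} y
\]
% Decision rule: assign y to the class with the smallest class-wise residual
\[
\operatorname{class}(y) = \arg\min_{c} \; \|y - X_c \hat{\alpha}_c\|_2
\]
% Assumed generalized objective: L_q representation error, L_p weight norm
\[
\hat{\alpha} = \arg\min_{\alpha} \; \|y - X\alpha\|_q^q + \lambda \|\alpha\|_p^p
\]
```

    Setting q = 2 and p = 2 recovers the CRC line, while q = 2 and p = 1 gives the usual penalized form of the SRC objective.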

  10. Subtype classification for prediction of prognosis of breast cancer from a biomarker panel: correlations and indications

    Directory of Open Access Journals (Sweden)

    Chen C

    2014-02-01

    Full Text Available Chuang Chen (1), Jing-Ping Yuan (2,3), Wen Wei (1), Yi Tu (1), Feng Yao (1), Xue-Qin Yang (4), Jin-Zhong Sun (1), Sheng-Rong Sun (1), Yan Li (2). Affiliations: (1) Department of Breast and Thyroid Surgery, Wuhan University, Renmin Hospital, Wuhan; (2) Department of Oncology, Zhongnan Hospital of Wuhan University and Hubei Key Laboratory of Tumor Biological Behaviors and Hubei Cancer Clinical Study Center, Wuhan; (3) Department of Pathology, The Central Hospital of Wuhan, Wuhan; (4) Medical School of Jingchu University of Technology, Jingmen, People’s Republic of China. Background: Hormone receptors, including the estrogen receptor and progesterone receptor, human epidermal growth factor receptor 2 (HER2), and other biomarkers like Ki67, epidermal growth factor receptor (EGFR, also known as HER1), the androgen receptor, and p53, are key molecules in breast cancer. This study evaluated the relationship between HER2 and hormone receptors and explored the additional prognostic value of Ki67, EGFR, the androgen receptor, and p53. Methods: Quantitative determination of HER2 and EGFR was performed in 240 invasive breast cancer tissue microarray specimens using quantum dot (QD)-based nanotechnology. We identified two subtypes of HER2, ie, high total HER2 load (HTH2) and low total HER2 load (LTH2), and three subtypes of hormone receptor, ie, high hormone receptor (HHR), low hormone receptor (LHR), and no hormone receptor (NHR). Therefore, breast cancer patients could be divided into five subtypes according to HER2 and hormone receptor status. Ki67, p53, and the androgen receptor were determined by traditional immunohistochemistry techniques. The relationship between hormone receptors and HER2 was investigated and the additional value of Ki67, EGFR, the androgen receptor, and p53 for prediction of 5-year disease-free survival was assessed. Results: In all patients, quantitative determination showed a statistically significant (P<0.001) negative correlation between HER2 and the hormone receptors and a significant

  11. mRMR-ABC: A Hybrid Gene Selection Algorithm for Cancer Classification Using Microarray Gene Expression Profiling

    Directory of Open Access Journals (Sweden)

    Hala Alshamlan

    2015-01-01

    Full Text Available An artificial bee colony (ABC) is a relatively recent swarm intelligence optimization approach. In this paper, we make a first attempt at applying the ABC algorithm to the analysis of a microarray gene expression profile. In addition, we propose an innovative feature selection algorithm, minimum redundancy maximum relevance (mRMR), and combine it with an ABC algorithm, mRMR-ABC, to select informative genes from a microarray profile. The new approach uses a support vector machine (SVM) algorithm to measure the classification accuracy for selected genes. We evaluate the performance of the proposed mRMR-ABC algorithm by conducting extensive experiments on six binary and multiclass gene expression microarray datasets. Furthermore, we compare our proposed mRMR-ABC algorithm with previously known techniques. We reimplemented two of these techniques for the sake of a fair comparison using the same parameters. These two techniques are mRMR when combined with a genetic algorithm (mRMR-GA) and mRMR when combined with a particle swarm optimization algorithm (mRMR-PSO). The experimental results prove that the proposed mRMR-ABC algorithm achieves accurate classification performance using a small number of predictive genes when tested using both datasets and compared to previously suggested methods. This shows that mRMR-ABC is a promising approach for solving gene selection and cancer classification problems.

  12. mRMR-ABC: A Hybrid Gene Selection Algorithm for Cancer Classification Using Microarray Gene Expression Profiling.

    Science.gov (United States)

    Alshamlan, Hala; Badr, Ghada; Alohali, Yousef

    2015-01-01

    An artificial bee colony (ABC) is a relatively recent swarm intelligence optimization approach. In this paper, we make a first attempt at applying the ABC algorithm to the analysis of a microarray gene expression profile. In addition, we propose an innovative feature selection algorithm, minimum redundancy maximum relevance (mRMR), and combine it with an ABC algorithm, mRMR-ABC, to select informative genes from a microarray profile. The new approach uses a support vector machine (SVM) algorithm to measure the classification accuracy for selected genes. We evaluate the performance of the proposed mRMR-ABC algorithm by conducting extensive experiments on six binary and multiclass gene expression microarray datasets. Furthermore, we compare our proposed mRMR-ABC algorithm with previously known techniques. We reimplemented two of these techniques for the sake of a fair comparison using the same parameters. These two techniques are mRMR when combined with a genetic algorithm (mRMR-GA) and mRMR when combined with a particle swarm optimization algorithm (mRMR-PSO). The experimental results prove that the proposed mRMR-ABC algorithm achieves accurate classification performance using a small number of predictive genes when tested using both datasets and compared to previously suggested methods. This shows that mRMR-ABC is a promising approach for solving gene selection and cancer classification problems. PMID:25961028

  13. Classification of Breast Cancer Subtypes by combining Gene Expression and DNA Methylation Data

    DEFF Research Database (Denmark)

    List, Markus; Hauschild, Anne-Christin; Tan, Qihua;

    2014-01-01

    expression data for hundreds of patients, the challenge is to extract a minimal optimal set of genes with good prognostic properties from a large bulk of genes making a moderate contribution to classification. Several studies have successfully applied machine learning algorithms to solve this so-called gene...... on the transcriptomic, but also on an epigenetic level. We compared so-called random forest derived classification models based on gene expression and methylation data alone, to a model based on the combined features and to a model based on the gold standard PAM50. We obtained bootstrap errors of 10...

  14. Generalization performance of graph-based semisupervised classification

    Institute of Scientific and Technical Information of China (English)

    2009-01-01

    Semi-supervised learning has been of growing interest over the past few years and many methods have been proposed. Although various algorithms are provided to implement semi-supervised learning, there are still gaps in our understanding of the dependence of generalization error on the numbers of labeled and unlabeled data. In this paper, we consider a graph-based semi-supervised classification algorithm and establish its generalization error bounds. Our results show the close relations between the generalization performance and the structural invariants of the data graph.

  15. Classification Based on Hierarchical Linear Models: The Need for Incorporation of Social Contexts in Classification Analysis

    Science.gov (United States)

    Vaughn, Brandon K.; Wang, Qui

    2009-01-01

    Many areas in educational and psychological research involve the use of classification statistical analysis. For example, school districts might be interested in attaining variables that provide optimal prediction of school dropouts. In psychology, a researcher might be interested in the classification of a subject into a particular psychological…

  16. A Method for Data Classification Based on Discernibility Matrix and Discernibility Function

    Institute of Scientific and Technical Information of China (English)

    2006-01-01

    The method used for data classification influences the efficiency of classification. Attribute reduction based on the discernibility matrix and discernibility function in rough sets can be used in data classification, so we put forward a method for data classification. First, we use the discernibility matrix and discernibility function to delete superfluous attributes in the information system and obtain a necessary attribute set. Second, we delete superfluous attribute values and obtain decision rules. Finally, we classify data by means of the decision rules. The experiments show that data classification using this method is simpler in structure and can improve the efficiency of classification.
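
    The first step described above, attribute reduction with a discernibility matrix, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the paper's algorithm: the decision table is a list of dicts, the toy attribute names and values are invented, and the smallest attribute subsets that preserve discernibility are found by brute force, which is only practical for a handful of attributes.

```python
from itertools import combinations

def discernibility_matrix(rows, decisions):
    """Map each pair of objects with different decisions to the set of
    condition attributes on which the two objects differ."""
    entries = {}
    for i, j in combinations(range(len(rows)), 2):
        if decisions[i] != decisions[j]:
            diff = frozenset(a for a in rows[i] if rows[i][a] != rows[j][a])
            if diff:
                entries[(i, j)] = diff
    return entries

def core_attributes(entries):
    """Attributes that occur alone in some entry cannot be deleted."""
    return {next(iter(e)) for e in entries.values() if len(e) == 1}

def smallest_reducts(rows, decisions):
    """Brute-force search for the smallest attribute subsets that intersect
    every entry of the discernibility matrix."""
    attrs = sorted(rows[0])
    entries = discernibility_matrix(rows, decisions)
    for size in range(len(attrs) + 1):
        found = [set(c) for c in combinations(attrs, size)
                 if all(e & set(c) for e in entries.values())]
        if found:
            return found
    return []

# Hypothetical toy decision table (names and values are illustrative only).
rows = [
    {"outlook": "sunny", "temp": "hot", "windy": "no"},
    {"outlook": "sunny", "temp": "cool", "windy": "no"},
    {"outlook": "rain", "temp": "hot", "windy": "no"},
    {"outlook": "rain", "temp": "cool", "windy": "yes"},
]
decisions = ["stay", "play", "play", "stay"]
print(core_attributes(discernibility_matrix(rows, decisions)))
print(smallest_reducts(rows, decisions))
```

    In this toy table the attribute "windy" turns out to be superfluous, so decision rules could be stated over the remaining two attributes only.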

  17. Semi-Supervised Classification based on Gaussian Mixture Model for remote imagery

    Institute of Scientific and Technical Information of China (English)

    2010-01-01

    Semi-Supervised Classification (SSC), which makes use of both labeled and unlabeled data to determine classification borders in feature space, has great advantages in extracting classification information from mass data. In this paper, a novel SSC method based on the Gaussian Mixture Model (GMM) is proposed, in which each class’s feature space is described by one GMM. Experiments show the proposed method can achieve high classification accuracy with a small amount of labeled data. However, for the same accuracy, supervised classification methods such as Support Vector Machine, Object Oriented Classification, etc. require much more labeled data.

  18. Cancer Biochemistry and Host-Tumor Interactions: A Decimal Classification, (Categories 51.6, 51.7, and 51.8).

    Science.gov (United States)

    Schneider, John H.

    This is a hierarchical decimal classification of information related to cancer biochemistry, to host-tumor interactions (including cancer immunology), and to occurrence of cancer in special types of animals and plants. It is a working draft of categories taken from an extensive classification of many fields of biomedical information. Because the…

  19. Feature selection gait-based gender classification under different circumstances

    Science.gov (United States)

    Sabir, Azhin; Al-Jawad, Naseer; Jassim, Sabah

    2014-05-01

    This paper proposes gender classification based on human gait features and investigates two variations in addition to the normal gait sequence: clothing (wearing coats) and carrying a bag. The feature vectors in the proposed system are constructed after applying a wavelet transform. Three different sets of features are proposed in this method. The first is spatio-temporal distance, which deals with the distances between different parts of the human body (such as feet, knees, hands, height, and shoulders) during one gait cycle. The second and third feature sets are constructed from the approximation and non-approximation coefficients of the human body, respectively. To extract these two sets of features, we divide the human body into two parts, upper and lower, based on the golden ratio proportion. In this paper, we adopt a statistical method for constructing the feature vector from the above sets. The dimension of the constructed feature vector is reduced using the Fisher score as a feature selection method to optimize discriminating significance. Finally, k-Nearest Neighbor is applied as the classification method. Experimental results demonstrate that our approach provides a more realistic scenario and relatively better performance compared with existing approaches.
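
    The Fisher-score selection step mentioned in the abstract above can be illustrated with a short sketch. It assumes one common definition of the score (between-class scatter of a feature divided by its within-class scatter, both weighted by class size); the paper may use a different variant, and the toy data and labels are invented for the example.

```python
from collections import defaultdict
from statistics import fmean, pvariance

def fisher_scores(X, y):
    """Return one Fisher score per feature column of X.

    X: list of feature rows, y: list of class labels.
    Higher scores indicate features that separate the classes better.
    """
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)

    scores = []
    for j in range(len(X[0])):
        overall_mean = fmean(row[j] for row in X)
        between = 0.0
        within = 0.0
        for rows in by_class.values():
            column = [r[j] for r in rows]
            class_mean = fmean(column)
            between += len(column) * (class_mean - overall_mean) ** 2
            within += len(column) * pvariance(column, class_mean)
        if within > 0:
            scores.append(between / within)
        else:
            # Degenerate case: no within-class spread for this feature.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores

# Hypothetical two-feature example: feature 0 separates the classes well.
X = [[1.0, 5.0], [1.2, 3.0], [3.0, 4.0], [3.2, 6.0]]
y = ["f", "f", "m", "m"]
scores = fisher_scores(X, y)
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
print(scores, ranking)
```

    Keeping only the top-ranked columns and passing them to a k-nearest-neighbor classifier would mirror the pipeline outlined in the abstract.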

  20. Task Classification Based Energy-Aware Consolidation in Clouds

    Directory of Open Access Journals (Sweden)

    HeeSeok Choi

    2016-01-01

    Full Text Available We consider a cloud data center, in which the service provider supplies virtual machines (VMs) on hosts or physical machines (PMs) to its subscribers for computation in an on-demand fashion. For the cloud data center, we propose a task consolidation algorithm based on task classification (i.e., computation-intensive and data-intensive) and resource utilization (e.g., CPU and RAM). Furthermore, we design a VM consolidation algorithm to balance task execution time and energy consumption without violating a predefined service level agreement (SLA). Unlike existing research on VM consolidation or scheduling that applies no threshold or a single-threshold scheme, we focus on a double threshold (upper and lower) scheme, which is used for VM consolidation. More specifically, when a host operates with resource utilization below the lower threshold, all the VMs on the host are scheduled to be migrated to other hosts and the host is then powered down, while when a host operates with resource utilization above the upper threshold, a VM is migrated to avoid reaching 100% resource utilization. Based on experimental performance evaluations with real-world traces, we prove that our task classification based energy-aware consolidation algorithm (TCEA) achieves a significant energy reduction without incurring predefined SLA violations.
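
    The double-threshold rule described above can be sketched in a few lines. This is an illustration under stated assumptions, not the TCEA algorithm itself: utilization is expressed as integer percentages, the threshold values 20 and 80 are placeholders, overloaded hosts shed their largest VM first, and the choice of destination host is left out entirely.

```python
def plan_migrations(hosts, lower=20, upper=80):
    """Decide which VMs should leave which hosts.

    hosts: dict mapping host name -> dict of VM name -> utilization (percent).
    Returns a list of (vm, source_host, reason) tuples.
    """
    plan = []
    for host, vms in hosts.items():
        if not vms:
            continue  # empty hosts need no decision in this sketch
        load = sum(vms.values())
        if load < lower:
            # Under-utilized: move every VM away so the host can power down.
            plan.extend((vm, host, "evacuate") for vm in vms)
        elif load > upper:
            # Over-utilized: shed VMs, largest first (an assumed policy),
            # until the load is back at or below the upper threshold.
            for vm, share in sorted(vms.items(), key=lambda kv: kv[1],
                                    reverse=True):
                if load <= upper:
                    break
                plan.append((vm, host, "offload"))
                load -= share
    return plan

# Hypothetical data center snapshot.
hosts = {
    "h1": {"a": 5, "b": 10},    # 15% total: below the lower threshold
    "h2": {"c": 50, "d": 45},   # 95% total: above the upper threshold
    "h3": {"e": 40},            # within both thresholds
}
for step in plan_migrations(hosts):
    print(step)
```

    Hosts whose load lies between the two thresholds are left untouched, which is what distinguishes this scheme from a single-threshold policy.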

  1. Classification of prostate cancer grade using temporal ultrasound: in vivo feasibility study

    Science.gov (United States)

    Ghavidel, Sahar; Imani, Farhad; Khallaghi, Siavash; Gibson, Eli; Khojaste, Amir; Gaed, Mena; Moussa, Madeleine; Gomez, Jose A.; Siemens, D. Robert; Leveridge, Michael; Chang, Silvia; Fenster, Aaron; Ward, Aaron D.; Abolmaesumi, Purang; Mousavi, Parvin

    2016-03-01

    Temporal ultrasound has been shown to have high classification accuracy in differentiating cancer from benign tissue. In this paper, we extend the temporal ultrasound method to classify lower grade Prostate Cancer (PCa) from all other grades. We use a group of nine patients with mostly lower grade PCa, where cancerous regions are also limited. A critical challenge is to train a classifier with limited aggressive cancerous tissue compared to low grade cancerous tissue. To resolve the problem of imbalanced data, we use the Synthetic Minority Oversampling Technique (SMOTE) to generate synthetic samples for the minority class. We calculate spectral features of temporal ultrasound data and perform feature selection using Random Forests. In a leave-one-patient-out cross-validation strategy, an area under the receiver operating characteristic curve (AUC) of 0.74 is achieved with overall sensitivity and specificity of 70%. Using an unsupervised learning approach prior to the proposed method improves sensitivity and AUC to 80% and 0.79. This work presents promising results for classifying lower and higher grade PCa with limited cancerous training samples, using temporal ultrasound.
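
    The oversampling idea behind SMOTE, as used in the abstract above, can be sketched with the standard library alone. This is a simplified illustration, not the study's pipeline: each synthetic sample is placed at a random position on the line segment between a minority sample and one of its k nearest minority neighbors, and the parameter names, default values, and toy data are assumptions.

```python
import random

def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic samples from a list of minority-class vectors.

    Requires at least two minority samples. Each new point interpolates
    between a randomly chosen sample and one of its k nearest neighbors.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Hypothetical two-dimensional minority class.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
for point in smote(minority, n_new=3):
    print(point)
```

    Because every synthetic point lies between two real minority samples, the new data stay inside the region already occupied by the minority class.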

  2. Classification of normal and cancerous lung tissues by electrical impendence tomography.

    Science.gov (United States)

    Gao, Jianling; Yue, Shihong; Chen, Jun; Wang, Huaxiang

    2014-01-01

    Biological tissue impedance spectroscopy can provide rich physiological and pathological information by measuring the variation of the complex impedance of biological tissues under various frequencies of the driving current. The Electrical Impedance Tomography (EIT) technique can measure the impedance spectroscopy of biological tissue in the medical field. Before application, a key problem must be solved: how to distinguish normal tissues from cancerous ones in terms of measurable EIT data. In this paper, the impedance spectroscopy characteristics of human lung tissue are studied. On the basis of the measured data of 109 lung cancer patients, the Cole-Cole Circle radius (CCCR) and the complex modulus are extracted. In terms of these two characteristics, 71.6% and 66.4% of samples of cancerous and normal tissues can be correctly classified, respectively. Furthermore, the two characteristics of each patient's measured EIT data form a two-dimensional vector, and all such vectors together form a vector set. When classifying the vector set, the rate of correctly partitioning normal and cancerous tissues can be raised to 78.2%. The main factors affecting the classification results for normal and cancerous tissues are analyzed. The proposed method will play an important role in further working out an efficient and feasible diagnostic method for potential lung cancer patients, and provide a theoretical basis and reference data for electrical impedance tomography technology in monitoring pulmonary function.

  3. Association Technique based on Classification for Classifying Microcalcification and Mass in Mammogram

    Directory of Open Access Journals (Sweden)

    Herwanto

    2013-01-01

    Full Text Available Currently, mammography is recognized as the most effective imaging modality for breast cancer screening. The challenge of using mammography is how to locate the area that is indeed a solitary geographic abnormality. In mammography screening it is important to define the risk for women who have radiologically negative findings and for those who might develop malignancy later in life. Microcalcification and mass segmentation are used frequently as the first step in mammography screening. The main objective of this paper is to apply an association technique based on a classification algorithm to classify microcalcification and mass in mammograms. The system that we propose consists of: (i) a preprocessing phase to enhance the quality of the image, followed by segmenting the region of interest; (ii) a phase for mining a transactional table; and (iii) a phase for organizing the resulting association rules in a classification model. This paper also illustrates how important the data cleaning phase is in building the data mining process for image classification. The proposed method was evaluated using the mammogram data from the Mammographic Image Analysis Society (MIAS). The MIAS data consist of 207 images of normal breast, 64 benign, and 51 malignant. 85 mammograms of the MIAS data have mass, and 25 mammograms have microcalcification. The features of mean and Gray Level Co-occurrence Matrix homogeneity have proved to have potential for discriminating microcalcification from mass. The accuracy obtained by this method is 83%.

  4. Joint Probability-Based Neuronal Spike Train Classification

    Directory of Open Access Journals (Sweden)

    Yan Chen

    2009-01-01

    Full Text Available Neuronal spike trains are used by the nervous system to encode and transmit information. Euclidean distance-based methods (EDBMs) have been applied to quantify the similarity between temporally-discretized spike trains and model responses. In this study, using the same discretization procedure, we developed and applied a joint probability-based method (JPBM) to classify individual spike trains of slowly adapting pulmonary stretch receptors (SARs). The activity of individual SARs was recorded in anaesthetized, paralysed adult male rabbits, which were artificially ventilated at a constant rate and one of three different volumes. Two-thirds of the responses to the 600 stimuli presented at each volume were used to construct three response models (one for each stimulus volume), each consisting of a series of time bins with spike probabilities. The remaining one-third of the responses were used as test responses to be classified into one of the three model responses. This was done by computing the joint probability of observing the same series of events (spikes or no spikes), dictated by the test response, in a given model and determining which probability of the three was highest. The JPBM generally produced better classification accuracy than the EDBM, and both performed well above chance. Both methods were similarly affected by variations in discretization parameters, response epoch duration, and two different response alignment strategies. Increasing bin widths increased classification accuracy, which also improved with increased observation time, but primarily during periods of increasing lung inflation. Thus, the JPBM is a simple and effective method for performing spike train classification.
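
    The joint-probability classification described above can be sketched as follows. This is an illustrative reading of the abstract, not the authors' code: spike trains are assumed to be equal-length lists of 0/1 values per time bin, bins are treated as independent so that the joint probability is a product of per-bin probabilities, logarithms are summed for numerical stability, and probabilities are clipped by a small `eps` to avoid taking the logarithm of zero. All names and data are hypothetical.

```python
import math

def build_model(training_trains, eps=1e-3):
    """Estimate the spike probability of each time bin from binary trains."""
    n = len(training_trains)
    n_bins = len(training_trains[0])
    model = []
    for b in range(n_bins):
        p = sum(train[b] for train in training_trains) / n
        model.append(min(max(p, eps), 1 - eps))  # clip away from 0 and 1
    return model

def log_joint_probability(test_train, model):
    """Log-probability of the observed spike / no-spike sequence under a
    model, assuming independent bins."""
    return sum(math.log(p if spike else 1 - p)
               for spike, p in zip(test_train, model))

def classify(test_train, models):
    """Return the name of the model with the highest joint probability."""
    return max(models,
               key=lambda name: log_joint_probability(test_train, models[name]))

# Hypothetical training responses for two stimulus volumes.
models = {
    "low": build_model([[1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]),
    "high": build_model([[1, 1, 1, 0], [1, 1, 0, 1], [1, 1, 1, 1]]),
}
print(classify([1, 1, 1, 0], models))
print(classify([0, 0, 0, 0], models))
```

    Wider bins reduce the number of factors in the product and make each per-bin estimate less noisy, which is consistent with the abstract's observation that increasing bin widths increased accuracy.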

  5. Scene classification of infrared images based on texture feature

    Science.gov (United States)

    Zhang, Xiao; Bai, Tingzhu; Shang, Fei

    2008-12-01

    Scene classification refers to assigning a physical scene to one of a set of predefined categories. Texture features provide a useful basis for classifying scenes. Texture can be considered to be repeating patterns of local variation of pixel intensities, and texture analysis is important in many applications of computer image analysis for classification or segmentation of images based on local spatial variations of intensity. Texture describes the structural information of images, so it provides data for classification in addition to the spectrum. Infrared thermal imagers are now used in many fields. Since infrared images of objects reflect their own thermal radiation, infrared images have some shortcomings: poor contrast between objects and background, blurred edges, heavy noise, and so on. Because of these shortcomings, it is difficult to extract the texture features of infrared images. In this paper we develop an infrared image texture feature-based algorithm to classify scenes of infrared images. The paper studies texture extraction using the Gabor wavelet transform, which has excellent capability for analyzing the frequency and direction of a local region. Gabor wavelets are chosen for their biological relevance and technical properties. First, after introducing the Gabor wavelet transform and the texture analysis methods, texture features are extracted from the infrared images by the Gabor wavelet transform, exploiting the multi-scale property of the Gabor filter. Second, we take the multi-dimensional means and standard deviations at different scales and directions as texture parameters. The last stage is classification of the scene texture parameters with the least squares support vector machine (LS-SVM) algorithm. SVM is based on the principle of structural risk minimization (SRM). Compared with SVM, LS-SVM has overcome the shortcoming of

  6. Soft computing based feature selection for environmental sound classification

    NARCIS (Netherlands)

    Shakoor, A.; May, T.M.; Van Schijndel, N.H.

    2010-01-01

    Environmental sound classification has a wide range of applications, like hearing aids, mobile communication devices, portable media players, and auditory protection devices. Sound classification systems typically extract features from the input sound. Using too many features increases complexity unne

  7. Understanding Acupuncture Based on ZHENG Classification from System Perspective

    Directory of Open Access Journals (Sweden)

    Junwei Fang

    2013-01-01

    Full Text Available Acupuncture is an efficient therapy that originated in ancient China; studying it on the basis of ZHENG classification is a systematic approach to understanding its complexity. The system perspective contributes to understanding the essence of phenomena and, with the coming of the systems biology era, broader technology platforms such as omics technologies have been established for the objective study of traditional Chinese medicine (TCM). Omics technologies can dynamically determine molecular components at various levels, which can yield a systematic understanding of acupuncture by revealing the relationships among various response parts. A review of the literature on acupuncture studied by omics approaches led to the following findings. Firstly, with the help of omics approaches, acupuncture was found to be able to treat diseases by regulating the neuroendocrine immune (NEI) network, changes in which could reflect the global effect of acupuncture. Secondly, the global effect of acupuncture could reflect ZHENG information at certain structure and function levels, which might reveal the mechanism of Meridian and Acupoint Specificity. Furthermore, based on comprehensive ZHENG classification, omics research could help us understand the action characteristics of acupoints and the molecular mechanisms of their synergistic effect.

  8. Pixel classification based color image segmentation using quaternion exponent moments.

    Science.gov (United States)

    Wang, Xiang-Yang; Wu, Zhi-Fang; Chen, Liang; Zheng, Hong-Liang; Yang, Hong-Ying

    2016-02-01

    Image segmentation remains an important, but hard-to-solve, problem since it appears to be application dependent with usually no a priori information available regarding the image structure. In recent years, many image segmentation algorithms have been developed, but they are often very complex and undesired results occur frequently. In this paper, we propose a pixel classification based color image segmentation method using quaternion exponent moments. Firstly, the pixel-level image feature is extracted based on quaternion exponent moments (QEMs), which can effectively capture the image pixel content by considering the correlation between different color channels. Then, the pixel-level image feature is used as input to a twin support vector machines (TSVM) classifier, and the TSVM model is trained by selecting the training samples with Arimoto entropy thresholding. Finally, the color image is segmented with the trained TSVM model. The proposed scheme has the following advantages: (1) effective QEMs are introduced to describe color image pixel content, taking into account the correlation between different color channels; (2) the TSVM classifier is utilized, which has lower computation time and higher classification accuracy. Experimental results show that our proposed method has very promising segmentation performance compared with the state-of-the-art segmentation approaches recently proposed in the literature. PMID:26618250

  9. ECG-based heartbeat classification for arrhythmia detection: A survey.

    Science.gov (United States)

    Luz, Eduardo José da S; Schwartz, William Robson; Cámara-Chávez, Guillermo; Menotti, David

    2016-04-01

    An electrocardiogram (ECG) measures the electric activity of the heart and has been widely used for detecting heart diseases due to its simplicity and non-invasive nature. By analyzing the electrical signal of each heartbeat, i.e., the combination of action impulse waveforms produced by different specialized cardiac tissues found in the heart, it is possible to detect some of its abnormalities. In the last decades, several works were developed to produce automatic ECG-based heartbeat classification methods. In this work, we survey the current state-of-the-art methods for ECG-based automated classification of heartbeat abnormalities by presenting the ECG signal preprocessing, the heartbeat segmentation techniques, the feature description methods and the learning algorithms used. In addition, we describe some of the databases used for evaluation of methods indicated by a well-known standard developed by the Association for the Advancement of Medical Instrumentation (AAMI) and described in ANSI/AAMI EC57:1998/(R)2008 (ANSI/AAMI, 2008). Finally, we discuss limitations and drawbacks of the methods in the literature, present concluding remarks and future challenges, and propose an evaluation process workflow to guide authors in future works.

  10. Robust Pedestrian Classification Based on Hierarchical Kernel Sparse Representation

    Directory of Open Access Journals (Sweden)

    Rui Sun

    2016-08-01

    Full Text Available Vision-based pedestrian detection has become an active topic in computer vision and autonomous vehicles. It aims at detecting pedestrians appearing ahead of the vehicle using a camera so that autonomous vehicles can assess the danger and take action. Due to varied illumination and appearance, complex backgrounds, and occlusion, pedestrian detection in outdoor environments is a difficult problem. In this paper, we propose a novel hierarchical feature extraction and weighted kernel sparse representation model for pedestrian classification. Initially, hierarchical feature extraction based on a CENTRIST descriptor is used to capture discriminative structures. A max pooling operation is used to enhance invariance to varying appearance. Then, a kernel sparse representation model is proposed to fully exploit the discrimination information embedded in the hierarchical local features, and a Gaussian weight function is used as the measure to effectively handle occlusion in pedestrian images. Extensive experiments are conducted on benchmark databases, including INRIA, Daimler, an artificially generated dataset and a real occluded dataset, demonstrating the more robust performance of the proposed method compared to state-of-the-art pedestrian classification methods.

  11. A Cluster Based Approach for Classification of Web Results

    Directory of Open Access Journals (Sweden)

    Apeksha Khabia

    2014-12-01

    Full Text Available Nowadays a significant amount of information on the web is present in the form of text, e.g., reviews, forum postings, blogs, news articles, email messages, and web pages. It becomes difficult to classify documents into predefined categories as the number of documents grows. Clustering is the classification of data into clusters, so that the data in each cluster share some common trait, often proximity according to some defined measure. The underlying distribution of a data set can to some extent be depicted by the clusters learned under the guidance of the initial data set. Thus, clusters of documents can be employed to train the classifier by using defined features of those clusters. Another important issue is classifying text data from the web into different clusters by mining the knowledge. Accordingly, this paper presents a review of most document clustering techniques and cluster-based classification techniques used so far. Pre-processing of text datasets and document clustering methods are also explained in brief.

  12. ECG-based heartbeat classification for arrhythmia detection: A survey.

    Science.gov (United States)

    Luz, Eduardo José da S; Schwartz, William Robson; Cámara-Chávez, Guillermo; Menotti, David

    2016-04-01

    An electrocardiogram (ECG) measures the electric activity of the heart and has been widely used for detecting heart diseases due to its simplicity and non-invasive nature. By analyzing the electrical signal of each heartbeat, i.e., the combination of action impulse waveforms produced by different specialized cardiac tissues found in the heart, it is possible to detect some of its abnormalities. In the last decades, several works were developed to produce automatic ECG-based heartbeat classification methods. In this work, we survey the current state-of-the-art methods of ECG-based automated abnormalities heartbeat classification by presenting the ECG signal preprocessing, the heartbeat segmentation techniques, the feature description methods and the learning algorithms used. In addition, we describe some of the databases used for evaluation of methods indicated by a well-known standard developed by the Association for the Advancement of Medical Instrumentation (AAMI) and described in ANSI/AAMI EC57:1998/(R)2008 (ANSI/AAMI, 2008). Finally, we discuss limitations and drawbacks of the methods in the literature presenting concluding remarks and future challenges, and also we propose an evaluation process workflow to guide authors in future works. PMID:26775139

  13. Robust Pedestrian Classification Based on Hierarchical Kernel Sparse Representation.

    Science.gov (United States)

    Sun, Rui; Zhang, Guanghai; Yan, Xiaoxing; Gao, Jun

    2016-01-01

    Vision-based pedestrian detection has become an active topic in computer vision and autonomous vehicles. It aims at detecting pedestrians appearing ahead of the vehicle using a camera so that autonomous vehicles can assess the danger and take action. Due to varied illumination and appearance, complex backgrounds and occlusion, pedestrian detection in outdoor environments is a difficult problem. In this paper, we propose a novel hierarchical feature extraction and weighted kernel sparse representation model for pedestrian classification. Initially, hierarchical feature extraction based on a CENTRIST descriptor is used to capture discriminative structures. A max pooling operation is used to enhance invariance to varying appearance. Then, a kernel sparse representation model is proposed to fully exploit the discrimination information embedded in the hierarchical local features, with a Gaussian weight function as the measure to effectively handle occlusion in pedestrian images. Extensive experiments are conducted on benchmark databases, including INRIA, Daimler, an artificially generated dataset and a real occluded dataset, demonstrating the more robust performance of the proposed method compared to state-of-the-art pedestrian classification methods. PMID:27537888

  14. Gear Crack Level Classification Based on EMD and EDT

    Directory of Open Access Journals (Sweden)

    Haiping Li

    2015-01-01

    Full Text Available Gears are among the most essential parts in rotating machinery. Crack faults are one of the damage modes occurring most frequently in gears. This paper therefore deals with the problem of classifying different crack levels. The proposed method is mainly based on empirical mode decomposition (EMD) and the Euclidean distance technique (EDT). First, the vibration signal acquired by an accelerometer is processed by EMD and intrinsic mode functions (IMFs) are obtained. Then, a correlation-coefficient-based method is proposed to select the sensitive IMFs that contain the main gear fault information. The energy of these IMFs is chosen as the fault feature after comparison with kurtosis and skewness. Finally, Euclidean distances between the test sample and the trained samples of the four classes are calculated, and on this basis the fault level of the test sample is classified. The proposed approach is tested and validated through a gearbox experiment, in which four crack levels and three kinds of loads are utilized. The results show that the proposed method has high accuracy rates in classifying different crack levels and may be adaptive to different conditions.
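    The final step of the abstract above, assigning a test sample to the crack level whose trained samples lie closest in Euclidean distance, can be illustrated with a minimal sketch. It assumes each sample has already been reduced to a vector of IMF energies and that each class is summarized by the mean of its training vectors; the abstract does not state how the per-class distance is formed, so the centroid choice, the function names and the toy numbers are all hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def class_centroids(training):
    """Average the training vectors of each class (assumed summary, not from the paper).

    training: dict mapping a class label to a list of feature vectors.
    """
    centroids = {}
    for label, vectors in training.items():
        n = len(vectors)
        centroids[label] = [sum(column) / n for column in zip(*vectors)]
    return centroids


def classify(sample, centroids):
    """Return the label whose centroid is nearest to the sample."""
    return min(centroids, key=lambda label: euclidean(sample, centroids[label]))


# Toy IMF-energy vectors for two hypothetical crack levels.
training = {
    "level_0": [[1.0, 1.0], [1.2, 0.8]],
    "level_1": [[5.0, 5.0], [5.2, 4.8]],
}
centroids = class_centroids(training)
predicted = classify([1.1, 1.0], centroids)
```

    With four crack levels, the `training` dictionary would simply hold four entries; the decision rule is unchanged.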

  15. Gene selection in class space for molecular classification of cancer

    Institute of Scientific and Technical Information of China (English)

    ZHANG Junying; Yue Joseph WANG; Javed KHAN; Robert CLARKE

    2004-01-01

    Gene selection (feature selection) is generally performed in gene space (feature space), where a very serious curse of dimensionality problem always exists because the number of genes is much larger than the number of samples in gene space (G-space). This results in difficulty in modeling the data set in this space and the low confidence of the result of gene selection. How to find a gene subset in this case is a challenging subject. In this paper, the above G-space is transformed into its dual space, referred to as class space (C-space) such that the number of dimensions is the very number of classes of the samples in G-space and the number of samples in C-space is the number of genes in G-space. It is obvious that the curse of dimensionality in C-space does not exist. A new gene selection method which is based on the principle of separating different classes as far as possible is presented with the help of Principal Component Analysis (PCA). The experimental results on gene selection for real data set are evaluated with Fisher criterion, weighted Fisher criterion as well as leave-one-out cross validation, showing that the method presented here is effective and efficient.

  16. Rainfall Prediction using Data-Core Based Fuzzy Min-Max Neural Network for Classification

    OpenAIRE

    Rajendra Palange; Nishikant Pachpute

    2015-01-01

    This paper proposes a rainfall prediction system using a classification technique. An advanced and modified neural network called the Data Core Based Fuzzy Min Max Neural Network (DCFMNN) is used for pattern classification. This classification method is applied to predict rainfall. The fuzzy min max neural network (FMNN), which creates hyperboxes for classification and prediction, has a problem of overlapping neurons that is resolved in DCFMNN to give greater accu...

  17. Genetic Bee Colony (GBC) algorithm: A new gene selection method for microarray cancer classification.

    Science.gov (United States)

    Alshamlan, Hala M; Badr, Ghada H; Alohali, Yousef A

    2015-06-01

    Naturally inspired evolutionary algorithms have proven effective for solving feature selection and classification problems. Artificial Bee Colony (ABC) is a relatively new swarm intelligence method. In this paper, we propose a new hybrid gene selection method, namely the Genetic Bee Colony (GBC) algorithm. The proposed algorithm combines the use of a Genetic Algorithm (GA) with the Artificial Bee Colony (ABC) algorithm. The goal is to integrate the advantages of both algorithms. The proposed algorithm is applied to a microarray gene expression profile in order to select the most predictive and informative genes for cancer classification. In order to test the accuracy of the proposed algorithm, extensive experiments were conducted. Three binary microarray datasets are used: colon, leukemia, and lung. In addition, another three multi-class microarray datasets are used: SRBCT, lymphoma, and leukemia. Results of the GBC algorithm are compared with our recently proposed technique: mRMR combined with the Artificial Bee Colony algorithm (mRMR-ABC). We also compared the combination of mRMR with GA (mRMR-GA) and Particle Swarm Optimization (mRMR-PSO) algorithms. In addition, we compared the GBC algorithm with other related algorithms that have been recently published in the literature, using all benchmark datasets. The GBC algorithm shows superior performance, as it achieved the highest classification accuracy along with the lowest average number of selected genes. This indicates that the GBC algorithm is a promising approach for solving the gene selection problem in both binary and multi-class cancer classification. PMID:25880524

  18. Classification of special breast cancer and the principles in diagnosis and treatment

    Institute of Scientific and Technical Information of China (English)

    姜军

    2013-01-01

    Because pathologists and clinicians focus on different aspects, their definitions of "special breast cancer" differ. This paper briefly introduces the changes to the "WHO classification of tumors of the breast" in "Histological and genetic classification of tumors" (4th edition) released by WHO in 2012, and classifies special breast cancers from the perspective of clinical surgical practice. The main conclusions are as follows: clinical experience with special types of breast cancer should be summarized more thoroughly, and breast cancer arising in special circumstances should be treated on an individualized basis. Molecular subtyping challenges the significance of the histological classification of "non-special type invasive breast cancer". Overall, evidence-based research on the surgical treatment of "special breast cancer" is insufficient, and more extensive study and practice are required.

  19. Utilizing ECG-Based Heartbeat Classification for Hypertrophic Cardiomyopathy Identification.

    Science.gov (United States)

    Rahman, Quazi Abidur; Tereshchenko, Larisa G; Kongkatong, Matthew; Abraham, Theodore; Abraham, M Roselle; Shatkay, Hagit

    2015-07-01

    Hypertrophic cardiomyopathy (HCM) is a cardiovascular disease where the heart muscle is partially thickened and blood flow is (potentially fatally) obstructed. A test based on electrocardiograms (ECG) that record the heart electrical activity can help in early detection of HCM patients. This paper presents a cardiovascular-patient classifier we developed to identify HCM patients using standard 10-second, 12-lead ECG signals. Patients are classified as having HCM if the majority of their recorded heartbeats are recognized as characteristic of HCM. Thus, the classifier's underlying task is to recognize individual heartbeats segmented from 12-lead ECG signals as HCM beats, where heartbeats from non-HCM cardiovascular patients are used as controls. We extracted 504 morphological and temporal features—both commonly used and newly-developed ones—from ECG signals for heartbeat classification. To assess classification performance, we trained and tested a random forest classifier and a support vector machine classifier using 5-fold cross validation. We also compared the performance of these two classifiers to that obtained by a logistic regression classifier, and the first two methods performed better than logistic regression. The patient-classification precision of random forests and of support vector machine classifiers is close to 0.85. Recall (sensitivity) and specificity are approximately 0.90. We also conducted feature selection experiments by gradually removing the least informative features; the results show that a relatively small subset of 264 highly informative features can achieve performance measures comparable to those achieved by using the complete set of features. PMID:25915962
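    The patient-level rule described above (a patient is classified as having HCM if the majority of their recorded heartbeats are recognized as characteristic of HCM) can be sketched as a simple vote over per-beat predictions. This is only an illustration: the beat-level classifier is not shown, the label strings and function name are hypothetical, and treating an exact tie as non-HCM is an assumption, since the abstract does not say how ties are handled.

```python
def classify_patient(beat_labels, positive="HCM"):
    """Return True if a strict majority of beat-level predictions are positive.

    beat_labels: per-heartbeat predictions for one patient (hypothetical strings).
    An exact tie is treated as non-HCM, which is an assumption of this sketch.
    """
    if not beat_labels:
        raise ValueError("at least one heartbeat prediction is required")
    positives = sum(1 for label in beat_labels if label == positive)
    return positives > len(beat_labels) / 2


# Example: 3 of 5 beats were flagged as HCM-like, so the patient is flagged.
patient_is_hcm = classify_patient(["HCM", "control", "HCM", "HCM", "control"])
```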

  20. Molecular classification of breast cancer

    Institute of Scientific and Technical Information of China (English)

    张百红; 岳红云

    2014-01-01

    Breast cancer is a group of diseases that are highly heterogeneous at the molecular level. Molecular typing provides a new perspective for personalized management of breast cancer. Guided by molecular pathology, molecular biology and systems biology, breast cancer has gone through different typing methods, including four subclasses, 70-gene and 21-gene expression signatures, and integrated genomic classification. These subtypes may provide guidance for precise therapeutics.

  1. Credal Classification based on AODE and compression coefficients

    CERN Document Server

    Corani, Giorgio

    2012-01-01

    Bayesian model averaging (BMA) is an approach to average over alternative models; yet, it usually gets excessively concentrated around the single most probable model, therefore achieving only sub-optimal classification performance. The compression-based approach (Boulle, 2007) overcomes this problem, averaging over the different models by applying a logarithmic smoothing over the models' posterior probabilities. This approach has shown excellent performances when applied to ensembles of naive Bayes classifiers. AODE is another ensemble of models with high performance (Webb, 2005), based on a collection of non-naive classifiers (called SPODE) whose probabilistic predictions are aggregated by simple arithmetic mean. Aggregating the SPODEs via BMA rather than by arithmetic mean deteriorates the performance; instead, we aggregate the SPODEs via the compression coefficients and we show that the resulting classifier obtains a slight but consistent improvement over AODE. However, an important issue in any Bayesian e...

  2. Highly comparative, feature-based time-series classification

    CERN Document Server

    Fulcher, Ben D

    2014-01-01

    A highly comparative, feature-based approach to time series classification is introduced that uses an extensive database of algorithms to extract thousands of interpretable features from time series. These features are derived from across the scientific time-series analysis literature, and include summaries of time series in terms of their correlation structure, distribution, entropy, stationarity, scaling properties, and fits to a range of time-series models. After computing thousands of features for each time series in a training set, those that are most informative of the class structure are selected using greedy forward feature selection with a linear classifier. The resulting feature-based classifiers automatically learn the differences between classes using a reduced number of time-series properties, and circumvent the need to calculate distances between time series. Representing time series in this way results in orders of magnitude of dimensionality reduction, allowing the method to perform well on ve...

  3. Captan: transition from 'B2' to 'not likely'. How pesticide registrants affected the EPA Cancer Classification Update.

    Science.gov (United States)

    Gordon, Elliot

    2007-01-01

    On 24 November 2004 EPA changed the cancer classification of captan from a 'probable human carcinogen' (Category B2) to 'not likely' when used according to label directions. The new cancer classification considers captan to be a potential carcinogen at prolonged high doses that cause cytotoxicity and regenerative cell hyperplasia. These high doses of captan are many orders of magnitude above those likely to be consumed in the diet, or encountered by individuals in occupational or residential settings. This revised cancer classification reflects EPA's implementation of their new cancer guidelines. The procedures involved in the reclassification effort were agreed upon with EPA and involved an Independent Transparent Review as it related to four components that formed the basis of the original 1986 B2 classification: mouse tumors; rat tumors; mutagenicity; and structural similarity to other carcinogens. A Peer Review Panel organized and administered by Toxicology Excellence for Risk Assessment (TERA) met on 2-3 September 2003. The Panel concluded that captan acted through a non-mutagenic threshold mode of action that required prolonged irritation of the duodenal villi as the initial key event. EPA's Cancer Assessment Review Committee (CARC) met on 9 June 2004 and endorsed the Peer Review findings. EPA intended to have the FIFRA Scientific Advisory Panel (SAP) consider the basis for this reclassification but found the science was robust and judged that a SAP review was not warranted. Using the revised classification, the margin of exposure is approximately 1,200,000, supporting the 'not likely' characterization.

  4. Neighborhood Hypergraph Based Classification Algorithm for Incomplete Information System

    Directory of Open Access Journals (Sweden)

    Feng Hu

    2015-01-01

    Full Text Available The problem of classification in incomplete information systems is a hot issue in intelligent information processing. The hypergraph is a new intelligent method for machine learning. However, it is hard to process an incomplete information system with the traditional hypergraph, for two reasons: (1) the hyperedges are generated randomly in the traditional hypergraph model; (2) the existing methods are unsuitable for incomplete information systems because of the missing values they contain. In this paper, we propose a novel classification algorithm for incomplete information systems based on the hypergraph model and rough set theory. First, we initialize the hypergraph. Second, we classify the training set by neighborhood hypergraph. Third, under the guidance of rough set theory, we replace the poor hyperedges. After that, we obtain a good classifier. The proposed approach is tested on 15 data sets from the UCI machine learning repository. Furthermore, it is compared with some existing methods, such as C4.5, SVM, NaiveBayes, and KNN. The experimental results show that the proposed algorithm performs better in terms of Precision, Recall, AUC, and F-measure.

  5. Classification of EMG Signal Based on Human Percentile using SOM

    Directory of Open Access Journals (Sweden)

    M.H. Jali

    2014-07-01

    Full Text Available Electromyography (EMG) is a bio-signal formed by physiological variations in the state of muscle fibre membranes. Pattern recognition is a field of bio-signal processing that classifies the signal into desired categories according to the area of application. This study describes the classification of the EMG signal based on human body percentile using the Self Organizing Map (SOM) technique. Different human percentiles vary in arm circumference. The variation in arm circumference is due to fatty tissue that lies between the active muscle and the skin. Generally, the fatty tissue decreases the overall amplitude of the EMG signal. Data were collected randomly from fifteen subjects of various percentiles using a non-invasive technique at the Biceps Brachii muscle. The signals then go through a filtering process to prepare them for the next stage. Then, five well-known time-domain feature extraction methods are applied to the signal before the classification process. The SOM technique is used as a classifier to discriminate between the human percentiles. The results show that SOM is capable of clustering the EMG signal into the desired human percentile categories by optimizing the neurons of the technique.

  6. Radar Image Texture Classification based on Gabor Filter Bank

    Directory of Open Access Journals (Sweden)

    Mbainaibeye Jérôme

    2014-01-01

    Full Text Available The aim of this paper is to design and develop a filter bank for the detection and classification of texture in radar images with 4.6 m resolution obtained by airborne Synthetic Aperture Radar. The textures in this kind of image are more correlated and contain forms with random disposition. The design and development of the filter bank are based on the Gabor filter. We have elaborated a set of filters applied to each set of texture features, allowing its identification and enhancement in comparison with other textures. The filter bank we have elaborated is represented by a combination of different texture filters. After processing, the selected filter bank is the one that allows the identification of all the textures of an image with a significant identification rate. The developed filter bank is applied to radar images and the results are compared with those obtained using filter banks derived from generalized Gaussian models (GGM). We show that the Gabor filter bank developed in this work gives a higher classification rate than that obtained with the generalized Gaussian model. The main contribution of this work is the generation of filter banks able to give an optimal filter bank for a given texture, in particular for radar image textures.

  7. Classification of knee arthropathy with accelerometer-based vibroarthrography.

    Science.gov (United States)

    Moreira, Dinis; Silva, Joana; Correia, Miguel V; Massada, Marta

    2016-01-01

    One of the most common knee joint disorders is osteoarthritis, which results from the progressive degeneration of cartilage and subchondral bone over time and affects mainly elderly adults. Current evaluation techniques are either complex, expensive, invasive or simply fail to detect the small and progressive changes that occur within the knee. Vibroarthrography has emerged as a new solution in which the mechanical vibratory signals arising from the knee are recorded using only an accelerometer and subsequently analyzed, enabling the differentiation between a healthy and an arthritic joint. In this study, a vibration-based classification system was created using a dataset with 92 healthy and 120 arthritic segments of knee joint signals collected from 19 healthy and 20 arthritic volunteers, evaluated with k-nearest neighbors and support vector machine classifiers. The best classification was obtained using the k-nearest neighbors classifier with only 6 time-frequency features, with an overall accuracy of 89.8% and a precision, recall and f-measure of 88.3%, 92.4% and 90.1%, respectively. Preliminary results showed that vibroarthrography can be a promising, non-invasive and low-cost tool that could be used for screening purposes. Despite these encouraging results, several upgrades to the data collection process and analysis can still be implemented.

  8. Hyperspectral image classification based on spatial and spectral features and sparse representation

    Institute of Scientific and Technical Information of China (English)

    Yang Jing-Hui; Wang Li-Guo; Qian Jin-Xi

    2014-01-01

    To minimize the low classification accuracy and low utilization of spatial information in traditional hyperspectral image classification methods, we propose a new hyperspectral image classification method, which is based on the Gabor spatial texture features and nonparametric weighted spectral features, and the sparse representation classification method (Gabor–NWSF and SRC), abbreviated GNWSF–SRC. The proposed (GNWSF–SRC) method first combines the Gabor spatial features and nonparametric weighted spectral features to describe the hyperspectral image, and then applies the sparse representation method. Finally, the classification is obtained by analyzing the reconstruction error. We use the proposed method to process two typical hyperspectral data sets with different percentages of training samples. Theoretical analysis and simulation demonstrate that the proposed method improves the classification accuracy and Kappa coefficient compared with traditional classification methods and achieves better classification performance.

  9. Product Image Classification Based on Fusion Features

    Institute of Scientific and Technical Information of China (English)

    YANG Xiao-hui; LIU Jing-jing; YANG Li-jun

    2015-01-01

    Two key challenges raised by a product image classification system are classification precision and classification time. In some categories, the classification precision of the latest techniques is still low. In this paper, we propose a local texture descriptor termed fan refined local binary pattern, which captures more detailed information by integrating the spatial distribution into the local binary pattern feature. We compare our approach with different methods on a subset of product images from Amazon/eBay and parts of PI100, and the experimental results demonstrate that our proposed approach is superior to existing methods. The highest classification precision is increased by 21% and the average classification time is reduced by 2/3.

  10. A Method of Soil Salinization Information Extraction with SVM Classification Based on ICA and Texture Features

    Institute of Scientific and Technical Information of China (English)

    ZHANG Fei; TASHPOLAT Tiyip; KUNG Hsiang-te; DING Jian-li; MAMAT.Sawut; VERNER Johnson; HAN Gui-hong; GUI Dong-wei

    2011-01-01

    Salt-affected soils classification using remotely sensed images is one of the most common applications in remote sensing, and many algorithms have been developed and applied for this purpose in the literature. This study takes the Delta Oasis of Weigan and Kuqa Rivers as a study area and discusses the prediction of soil salinization from ETM+ Landsat data. It reports the Support Vector Machine (SVM) classification method based on Independent Component Analysis (ICA) and texture features. Meanwhile, the letter introduces the fundamental theory of the SVM algorithm and ICA, and then incorporates ICA and texture features. The classification result is compared with ICA-SVM classification, single data source SVM classification, maximum likelihood classification (MLC) and neural network classification qualitatively and quantitatively. The result shows that this method can effectively solve the problem of low accuracy and fractured classification results in single data source classification. It has high spread ability toward higher array input. The overall accuracy is 98.64%, which increases by 10.2% compared with maximum likelihood classification, even increases by 12.94% compared with neural net classification, and thus acquires good effectiveness. Therefore, the classification method based on SVM and incorporating the ICA and texture features can be adapted to RS image classification and monitoring of soil salinization.

  11. Classification of follicular cell-derived thyroid cancer by global RNA profiling

    DEFF Research Database (Denmark)

    Rossing, Maria

    2013-01-01

    classifiers that may differentiate malignant from benign thyroid nodules. Molecular classification models based on global RNA profiles from fine-needle aspirations are currently being evaluated; results are preliminary and lack validation in prospective clinical trials. There is no doubt that molecular...

  12. Classification of Dukes' B and C colorectal cancers using expression arrays

    DEFF Research Database (Denmark)

    Frederiksen, C.M.; Knudsen, Steen; Laurberg, S.;

    2003-01-01

    and D could not be classified correctly. A number of interesting gene clusters showed a discriminating difference between Dukes' B and C samples. These included mitochondrial genes, stromal remodeling genes, and genes related to cell adhesion. Conclusion. Molecular classification based on gene...

  13. Radiological classification of renal angiomyolipomas based on 127 tumors

    Directory of Open Access Journals (Sweden)

    Prando Adilson

    2003-01-01

    Full Text Available PURPOSE: Demonstrate radiological findings of 127 angiomyolipomas (AMLs) and propose a classification based on the radiological evidence of fat. MATERIALS AND METHODS: The imaging findings of 85 consecutive patients with AMLs: isolated (n = 73), multiple without tuberous sclerosis (TS) (n = 4) and multiple with TS (n = 8), were retrospectively reviewed. Eighteen AMLs (14%) presented with hemorrhage. All patients were submitted to a dedicated helical CT or magnetic resonance studies. All hemorrhagic and non-hemorrhagic lesions were grouped together since our objective was to analyze the presence of detectable fat. Out of 85 patients, 53 were monitored and 32 were treated surgically due to large perirenal component (n = 13), hemorrhage (n = 11) and impossibility of an adequate preoperative characterization (n = 8). There was not a case of renal cell carcinoma (RCC) with fat component in this group of patients. RESULTS: Based on the presence and amount of detectable fat within the lesion, AMLs were classified in 4 distinct radiological patterns: Pattern-I, predominantly fatty (usually less than 2 cm in diameter and intrarenal): 54%; Pattern-II, partially fatty (intrarenal or exophytic): 29%; Pattern-III, minimally fatty (most exophytic and perirenal): 11%; and Pattern-IV, without fat (most exophytic and perirenal): 6%. CONCLUSIONS: This proposed classification might be useful to understand the imaging manifestations of AMLs, their differential diagnosis and determine when further radiological evaluation would be necessary. Small (< 1.5 cm), pattern-I AMLs tend to be intra-renal, homogeneous and predominantly fatty. As they grow they tend to be partially or completely exophytic and heterogeneous (patterns II and III). The rare pattern-IV AMLs, however, can be small or large, intra-renal or exophytic but are always homogeneous and hyperdense mass. Since no renal cell carcinoma was found in our series, from an evidence-based practice, all renal mass with detectable

  14. Radiological classification of renal angiomyolipomas based on 127 tumors

    Energy Technology Data Exchange (ETDEWEB)

    Prando, Adilson [Hospital Vera Cruz, Campinas, SP (Brazil). Dept. de Radiologia]. E-mail: aprando@mpc.com.br

    2003-05-15

    Purpose: Demonstrate radiological findings of 127 angiomyolipomas (AMLs) and propose a classification based on the radiological evidence of fat. Materials And Methods: The imaging findings of 85 consecutive patients with AMLs: isolated (n = 73), multiple without tuberous sclerosis (TS) (n = 4) and multiple with TS (n = 8), were retrospectively reviewed. Eighteen AMLs (14%) presented with hemorrhage. All patients were submitted to a dedicated helical CT or magnetic resonance studies. All hemorrhagic and non-hemorrhagic lesions were grouped together since our objective was to analyze the presence of detectable fat. Out of 85 patients, 53 were monitored and 32 were treated surgically due to large perirenal component (n = 13), hemorrhage (n = 11) and impossibility of an adequate preoperative characterization (n = 8). There was not a case of renal cell carcinoma (RCC) with fat component in this group of patients. Results: Based on the presence and amount of detectable fat within the lesion, AMLs were classified in 4 distinct radiological patterns: Pattern-I, predominantly fatty (usually less than 2 cm in diameter and intrarenal): 54%; Pattern-II, partially fatty (intrarenal or exophytic): 29%; Pattern-III, minimally fatty (most exophytic and perirenal): 11%; and Pattern-IV, without fat (most exophytic and perirenal): 6%. Conclusions: This proposed classification might be useful to understand the imaging manifestations of AMLs, their differential diagnosis and determine when further radiological evaluation would be necessary. Small (< 1.5 cm), pattern-I AMLs tend to be intra-renal, homogeneous and predominantly fatty. As they grow they tend to be partially or completely exophytic and heterogeneous (patterns II and III). The rare pattern-IV AMLs, however, can be small or large, intra-renal or exophytic but are always homogeneous and hyperdense mass. Since no renal cell carcinoma was found in our series, from an evidence-based practice, all renal mass with

  15. Radiological classification of renal angiomyolipomas based on 127 tumors

    International Nuclear Information System (INIS)

    Purpose: Demonstrate radiological findings of 127 angiomyolipomas (AMLs) and propose a classification based on the radiological evidence of fat. Materials And Methods: The imaging findings of 85 consecutive patients with AMLs: isolated (n = 73), multiple without tuberous sclerosis (TS) (n = 4) and multiple with TS (n = 8), were retrospectively reviewed. Eighteen AMLs (14%) presented with hemorrhage. All patients were submitted to a dedicated helical CT or magnetic resonance studies. All hemorrhagic and non-hemorrhagic lesions were grouped together since our objective was to analyze the presence of detectable fat. Out of 85 patients, 53 were monitored and 32 were treated surgically due to large perirenal component (n = 13), hemorrhage (n = 11) and impossibility of an adequate preoperative characterization (n = 8). There was not a case of renal cell carcinoma (RCC) with fat component in this group of patients. Results: Based on the presence and amount of detectable fat within the lesion, AMLs were classified in 4 distinct radiological patterns: Pattern-I, predominantly fatty (usually less than 2 cm in diameter and intrarenal): 54%; Pattern-II, partially fatty (intrarenal or exophytic): 29%; Pattern-III, minimally fatty (most exophytic and perirenal): 11%; and Pattern-IV, without fat (most exophytic and perirenal): 6%. Conclusions: This proposed classification might be useful to understand the imaging manifestations of AMLs, their differential diagnosis and determine when further radiological evaluation would be necessary. Small (< 1.5 cm), pattern-I AMLs tend to be intra-renal, homogeneous and predominantly fatty. As they grow they tend to be partially or completely exophytic and heterogeneous (patterns II and III). The rare pattern-IV AMLs, however, can be small or large, intra-renal or exophytic but are always homogeneous and hyperdense mass. Since no renal cell carcinoma was found in our series, from an evidence-based practice, all renal mass with

  16. Statistical Analysis of Tissue Images for Detection and Classification of Cervical Cancer

    CERN Document Server

    Jagtap, Jaidip; Pandey, Kiran; Agarwa, Asha; Panigrahi, Prasanta K; Pradhan, Asima

    2011-01-01

    Cervical cancer is one of the major health threats in women worldwide. The current "gold standard" for detecting cancer of the epithelial tissue is the histopathology analysis of biopsy samples. However it relies on the pathologist's judgment of the disease. We investigate the utility of statistical parameters as a potential tool for detection and discrimination of the stages of dysplasia. Digital images of the tissue slides are captured with the help of a digital camera plugged to a microscope. Statistical data analysis is performed with the help of software to evaluate parameters such as mean, maxima, full width half maxima, skewness, kurtosis etc. for the images. We believe that these parameters can help effectively to improve the diagnosis and further classify normal and abnormal tissue sections. These parameters can be used independently as well as in tandem with other parameters as features in classification algorithms that involve the use of Neural networks or Principal component analysis.
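
    The statistical parameters named in this abstract can be illustrated with a minimal sketch. It assumes the image has already been reduced to a flat list of grayscale pixel intensities; the function name, the sample patch and the use of population moments are illustrative assumptions, not details taken from the record. Full width at half maximum is left out because the abstract does not say which curve it is measured on.

```python
def intensity_statistics(pixels):
    """Return mean, maximum, skewness and kurtosis of a flat list of pixel intensities."""
    n = len(pixels)
    mean = sum(pixels) / n
    # Central moments in population form (dividing by n); an assumption of this sketch.
    m2 = sum((p - mean) ** 2 for p in pixels) / n
    m3 = sum((p - mean) ** 3 for p in pixels) / n
    m4 = sum((p - mean) ** 4 for p in pixels) / n
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    kurtosis = m4 / m2 ** 2 if m2 > 0 else 0.0  # non-excess kurtosis
    return {"mean": mean, "max": max(pixels), "skewness": skewness, "kurtosis": kurtosis}


# Hypothetical 3x3 grayscale patch, flattened row by row.
patch = [10, 12, 11, 13, 50, 12, 11, 10, 12]
stats = intensity_statistics(patch)
```

    Each returned value could serve as one entry of a feature vector passed to a classifier, which is the use the abstract suggests.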

  17. Molecular classification and prognostication of 300 node-negative breast cancer cases: A tertiary care experience

    Science.gov (United States)

    Shemin, K. M. Zuhara; Smitha, N. V.; Jojo, Annie; Vijaykumar, D. K.

    2015-01-01

    Background: The proportion of node-negative breast cancer patients has been increasing with improvement of diagnostic modalities and early detection. However, there is a 20–30% recurrence in node-negative breast cancers. Determining who should receive adjuvant therapy is challenging, as the majority are cured by surgery alone. Hence, it requires further stratification using additional prognostic and predictive factors. Subjects and Methods: Ours is a single institution retrospective study, on 300 node-negative breast cancer cases, who underwent primary surgery over a period of 7 years (2005–2011). We excluded all cases who took NACT. Prognostic factors of age, size, lymphovascular emboli, estrogen receptor (ER), progesterone receptor (PR), HER2neu, Ki-67, grade and molecular classification were analyzed with respect to those with and without early events (recurrence, metastases or second malignancy, death) using the Pearson Chi-square method and logistic regression for statistical analysis. Results: The majority belonged to the age group of 50–70 years. On univariate analysis, size >5 cm (P = 0.03) and ER negativity had significant association (P = 0.05) for early failures; PR negativity and lymphovascular emboli (LVE) had borderline significance (P = 0.07). Multivariate analysis showed size >5 cm to be significant (P = 0.04) and LVE positivity showed borderline significant association (P = 0.07) with early failures. About 62% belonged to luminal category followed by basal-like (25%) in molecular classification. Conclusions: ER negativity, PR negativity, LVE/lymphovascular invasion positivity and size >5 cm (T3 and T4) are associated with poor prognosis in node-negative breast cancers. PMID:26981506

  18. Pooling breast cancer datasets has a synergetic effect on classification performance and improves signature stability

    Directory of Open Access Journals (Sweden)

    van de Vijver Marc J

    2008-08-01

    Full Text Available Abstract Background Michiels et al. (Lancet 2005; 365: 488–92) employed a resampling strategy to show that the genes identified as predictors of prognosis from resamplings of a single gene expression dataset are highly variable. The genes most frequently identified in the separate resamplings were put forward as a 'gold standard'. On a higher level, breast cancer datasets collected by different institutions can be considered as resamplings from the underlying breast cancer population. The limited overlap between published prognostic signatures confirms the trend of signature instability identified by the resampling strategy. Six breast cancer datasets, totaling 947 samples, all measured on the Affymetrix platform, are currently available. This provides a unique opportunity to employ a substantial dataset to investigate the effects of pooling datasets on classifier accuracy, signature stability and enrichment of functional categories. Results We show that the resampling strategy produces a suboptimal ranking of genes, which cannot be considered to be a 'gold standard'. When pooling breast cancer datasets, we observed a synergetic effect on the classification performance in 73% of the cases. We also observe a significant positive correlation between the number of datasets that are pooled, the validation performance, the number of genes selected, and the enrichment of specific functional categories. In addition, we have evaluated the support for five explanations that have been postulated for the limited overlap of signatures. Conclusion The limited overlap of current signature genes can be attributed to small sample size. Pooling datasets results in more accurate classification and a convergence of signature genes. We therefore advocate the analysis of new data within the context of a compendium, rather than analysis in isolation.

  19. Molecular classification and prognostication of 300 node-negative breast cancer cases: A tertiary care experience

    Directory of Open Access Journals (Sweden)

    K M Zuhara Shemin

    2015-01-01

    Full Text Available Background: The proportion of node-negative breast cancer patients has been increasing with improvement of diagnostic modalities and early detection. However, there is a 20-30% recurrence in node-negative breast cancers. Determining who should receive adjuvant therapy is challenging, as the majority are cured by surgery alone. Hence, it requires further stratification using additional prognostic and predictive factors. Subjects and Methods: Ours is a single institution retrospective study, on 300 node-negative breast cancer cases, who underwent primary surgery over a period of 7 years (2005-2011). We excluded all cases who took NACT. Prognostic factors of age, size, lymphovascular emboli, estrogen receptor (ER), progesterone receptor (PR), HER2neu, Ki-67, grade and molecular classification were analyzed with respect to those with and without early events (recurrence, metastases or second malignancy, death) using the Pearson Chi-square method and logistic regression for statistical analysis. Results: The majority belonged to the age group of 50-70 years. On univariate analysis, size >5 cm (P = 0.03) and ER negativity had significant association (P = 0.05) for early failures; PR negativity and lymphovascular emboli (LVE) had borderline significance (P = 0.07). Multivariate analysis showed size >5 cm to be significant (P = 0.04) and LVE positivity showed borderline significant association (P = 0.07) with early failures. About 62% belonged to luminal category followed by basal-like (25%) in molecular classification. Conclusions: ER negativity, PR negativity, LVE/lymphovascular invasion positivity and size >5 cm (T3 and T4) are associated with poor prognosis in node-negative breast cancers.

  20. Sequence variant classification and reporting: recommendations for improving the interpretation of cancer susceptibility genetic test results.

    NARCIS (Netherlands)

    Plon, S.E.; Eccles, D.M.; Easton, D.; Foulkes, W.D.; Genuardi, M.; Greenblatt, M.S.; Hogervorst, F.B.; Hoogerbrugge, N.; Spurdle, A.B.; Tavtigian, S.V.

    2008-01-01

    Genetic testing of cancer susceptibility genes is now widely applied in clinical practice to predict risk of developing cancer. In general, sequence-based testing of germline DNA is used to determine whether an individual carries a change that is clearly likely to disrupt normal gene function. Genet

  1. About Classification Methods Based on Tensor Modelling for Hyperspectral Images

    Directory of Open Access Journals (Sweden)

    Salah Bourennane

    2010-03-01

    Full Text Available Denoising and Dimensionality Reduction (DR) are key issues for improving classifier efficiency for hyperspectral images (HSI). The recently developed multi-way Wiener filtering is used; principal and independent component analysis (PCA; ICA) and projection pursuit (PP) approaches to DR have been investigated. These matrix algebra methods are applied on vectorized images. As a result, the spatial arrangement is lost. To jointly take advantage of the spatial and spectral information, HSI has recently been represented as a tensor. Offering multiple ways to decompose data orthogonally, we introduced filtering and DR methods based on multilinear algebra tools. The DR is performed on the spectral way using PCA, or PP joint to an orthogonal projection onto a lower subspace dimension of the spatial ways. We show the classification improvement using the introduced methods in comparison with existing methods. This experiment is exemplified using real-world HYDICE data. Multi-way filtering, Dimensionality reduction, matrix and multilinear algebra tools, tensor processing.

  2. A comparative study on classification of sleep stage based on EEG signals using feature selection and classification algorithms.

    Science.gov (United States)

    Şen, Baha; Peker, Musa; Çavuşoğlu, Abdullah; Çelebi, Fatih V

    2014-03-01

    Sleep scoring is one of the most important diagnostic methods in psychiatry and neurology. Sleep staging is a time consuming and difficult task undertaken by sleep experts. This study aims to identify a method which would classify sleep stages automatically and with a high degree of accuracy and, in this manner, will assist sleep experts. This study consists of three stages: feature extraction, feature selection from EEG signals, and classification of these signals. In the feature extraction stage, 20 attribute algorithms in four categories are used. 41 feature parameters were obtained from these algorithms. Feature selection is important in the elimination of irrelevant and redundant features; in this manner prediction accuracy is improved and computational overhead in classification is reduced. Effective feature selection algorithms such as minimum redundancy maximum relevance (mRMR); fast correlation based feature selection (FCBF); ReliefF; t-test; and Fisher score algorithms are preferred at the feature selection stage in selecting a set of features which best represent EEG signals. The features obtained are used as input parameters for the classification algorithms. At the classification stage, five different classification algorithms (random forest (RF); feed-forward neural network (FFNN); decision tree (DT); support vector machine (SVM); and radial basis function neural network (RBF)) are applied to the problem. The results, obtained from different classification algorithms, are provided so that a comparison can be made between computation times and accuracy rates. Finally, a classification accuracy of 97.03% is obtained using the proposed method. The results show that the proposed method makes it possible to design a new intelligent sleep scoring assistance system.
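
    Of the feature selection criteria listed in this abstract, the Fisher score is the simplest to sketch. The example below uses the common textbook definition (between-class scatter divided by within-class scatter, computed per feature); whether the study used exactly this variant is not stated, and the function name, the toy feature matrix and the stage labels are hypothetical.

```python
from collections import defaultdict


def fisher_scores(rows, labels):
    """Return one Fisher score per feature column; higher means better class separation."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    scores = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        overall_mean = sum(column) / len(column)
        between = 0.0  # class-size-weighted spread of class means around the overall mean
        within = 0.0   # class-size-weighted variance inside each class
        for members in by_class.values():
            values = [m[j] for m in members]
            class_mean = sum(values) / len(values)
            class_var = sum((v - class_mean) ** 2 for v in values) / len(values)
            between += len(values) * (class_mean - overall_mean) ** 2
            within += len(values) * class_var
        scores.append(between / within if within > 0 else float("inf"))
    return scores


# Hypothetical data: two features per epoch, two sleep-stage labels.
rows = [[1.0, 5.0], [1.2, 3.0], [3.0, 4.0], [3.2, 6.0]]
labels = ["wake", "wake", "sleep", "sleep"]
scores = fisher_scores(rows, labels)
```

    Here the first feature separates the two labels far better than the second, so a selector that keeps the top-ranked features would retain it first.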

  3. Classification of cassava genotypes based on qualitative and quantitative data.

    Science.gov (United States)

    Oliveira, E J; Oliveira Filho, O S; Santos, V S

    2015-02-02

    We evaluated the genetic variation of cassava accessions based on qualitative (binomial and multicategorical) and quantitative traits (continuous). We characterized 95 accessions obtained from the Cassava Germplasm Bank of Embrapa Mandioca e Fruticultura; we evaluated these accessions for 13 continuous, 10 binary, and 25 multicategorical traits. First, we analyzed the accessions based only on quantitative traits; next, we conducted joint analysis (qualitative and quantitative traits) based on the Ward-MLM method, which performs clustering in two stages. According to the pseudo-F, pseudo-t2, and maximum likelihood criteria, we identified five and four groups based on quantitative trait and joint analysis, respectively. The smaller number of groups identified based on joint analysis may be related to the nature of the data. On the other hand, quantitative data are more subject to environmental effects in the phenotype expression; this results in the absence of genetic differences, thereby contributing to greater differentiation among accessions. For most of the accessions, the maximum probability of classification was >0.90, independent of the trait analyzed, indicating a good fit of the clustering method. Differences in clustering according to the type of data implied that analysis of quantitative and qualitative traits in cassava germplasm might explore different genomic regions. On the other hand, when joint analysis was used, the means and ranges of genetic distances were high, indicating that the Ward-MLM method is very useful for clustering genotypes when there are several phenotypic traits, such as in the case of genetic resources and breeding programs.

  4. [Galaxy/quasar classification based on nearest neighbor method].

    Science.gov (United States)

    Li, Xiang-Ru; Lu, Yu; Zhou, Jian-Ming; Wang, Yong-Jun

    2011-09-01

    With the wide application of high-quality CCD in celestial spectrum imagery and the implementation of many large sky survey programs (e.g., Sloan Digital Sky Survey (SDSS), Two-degree-Field Galaxy Redshift Survey (2dF), Spectroscopic Survey Telescope (SST), Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) program and Large Synoptic Survey Telescope (LSST) program, etc.), celestial observational data are coming into the world like torrential rain. Therefore, to utilize them effectively and fully, research on automated processing methods for celestial data is imperative. In the present work, we investigated how to recognize galaxies and quasars from spectra based on the nearest neighbor method. Galaxies and quasars are extragalactic objects; they are far away from earth, and their spectra are usually contaminated by various noise. Therefore, it is a typical problem to recognize these two types of spectra in automatic spectra classification. Furthermore, the utilized method, nearest neighbor, is one of the most typical, classic, mature algorithms in pattern recognition and data mining, and is often used as a benchmark in developing novel algorithms. For applicability in practice, it is shown that the recognition ratio of the nearest neighbor method (NN) is comparable to the best results reported in the literature based on more complicated methods, and the superiority of NN is that this method does not need to be trained, which is useful in incremental learning and parallel computation in mass spectral data processing. In conclusion, the results in this work are helpful for studying galaxy and quasar spectra classification. PMID:22097877
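
    The nearest neighbor rule described in this abstract can be sketched in a few lines. The example assumes Euclidean distance and spectra already resampled to equal-length flux vectors; the distance measure, the function name and the toy three-value "spectra" are assumptions for illustration, not details from the record.

```python
import math


def nearest_neighbor_label(query, train_spectra, train_labels):
    """Return the label of the stored spectrum closest to `query` (Euclidean distance)."""
    best_label = None
    best_distance = math.inf
    for spectrum, label in zip(train_spectra, train_labels):
        distance = math.dist(query, spectrum)
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label


# Hypothetical, heavily simplified spectra: three flux values per object.
train_spectra = [[1.0, 0.9, 0.8], [0.2, 0.5, 1.0], [1.1, 1.0, 0.7]]
train_labels = ["galaxy", "quasar", "galaxy"]
predicted = nearest_neighbor_label([0.3, 0.4, 0.9], train_spectra, train_labels)
```

    No fitting step is involved: adding a newly labeled spectrum only means appending it to the stored lists, which is the property the abstract highlights for incremental learning.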

  5. Quality-Oriented Classification of Aircraft Material Based on SVM

    Directory of Open Access Journals (Sweden)

    Hongxia Cai

    2014-01-01

    Full Text Available The existing material classification is proposed to improve the inventory management. However, different materials have the different quality-related attributes, especially in the aircraft industry. In order to reduce the cost without sacrificing the quality, we propose a quality-oriented material classification system considering the material quality character, Quality cost, and Quality influence. Analytic Hierarchy Process helps to make feature selection and classification decision. We use the improved Kraljic Portfolio Matrix to establish the three-dimensional classification model. The aircraft materials can be divided into eight types, including general type, key type, risk type, and leveraged type. Aiming to improve the classification accuracy of various materials, the algorithm of Support Vector Machine is introduced. Finally, we compare the SVM and BP neural network in the application. The results prove that the SVM algorithm is more efficient and accurate and the quality-oriented material classification is valuable.

  6. Radiological and 'Imaging' methods in TNM classification of non-small-cell lung cancer

    International Nuclear Information System (INIS)

    Lung cancer is the most common malignant disease worldwide according to its incidence and mortality. The aim of our study was to evaluate the diagnostic value of the radiological and imaging methods, according to the TNM classification, compared to postoperative histological diagnosis. Thirty-seven patients with pulmonary carcinoma were studied prospectively using native chest radiography (PA and LL view), computed tomography (CT) and magnetic resonance imaging (MRI) during ten days before thoracotomy. Radiological and imaging findings were reviewed separately and results were compared with surgical and pathohistological findings on the basis of the TNM classification. All patients underwent chest x-rays, CT was performed in 36 patients and MRI in 12 of them. Imaging methods (CT and MRI) showed markedly higher sensitivity and specificity than native chest radiography. Generally no statistically significant differences were found between the two imaging methods for the evaluation of tumour extent (T) or lymph node metastases (N). MRI was slightly superior to CT in determination of the chest wall extent of the tumour. In conclusion, CT remains the imaging modality of choice both for assessing patients with abnormal chest radiographs suspected of having lung cancer, and in staging patients with histologically proven pulmonary carcinoma.

  7. The Evaluation of Microcarcinoma in Differentiated Thyroid Cancers According to Old and New TNM Classification

    Directory of Open Access Journals (Sweden)

    Zekiye Hasbek

    2011-12-01

    Full Text Available Objective: In this study, we aimed to evaluate the tumor size for proximal and distant metastases when the new and old TNM classification is taken into account in differentiated thyroid cancers. Material and Methods: Two hundred sixty eight patients diagnosed with thyroid carcinoma, undergoing bilateral total or subtotal thyroidectomy treated with high doses of I-131 were examined retrospectively. The data of these patients were compared after classification, according to tumor size 1 cm. In the same group, according to the revised TNM classification, in 149 of 207 patients (72%) the tumor size was 2 cm. Of 187 patients with negative lymph nodes, 15 (8%) showed abnormal activity accumulation in the first post I-131 treatment whole-body scan and 10 (40%) of 25 patients positive lymph node (p<0.05) involvement. Conclusion: Since the treatment of patients with microcarcinoma is controversial, tumor size should not be the only factor considered in patients with differentiated thyroid cancer. Tissue tumor invasion, age, gender and multifocality should also be taken into account. (MIRT2011;20:94-99)

  8. BRAIN TUMOR CLASSIFICATION USING NEURAL NETWORK BASED METHODS

    OpenAIRE

    Kalyani A. Bhawar*, Prof. Nitin K. Bhil

    2016-01-01

    Classification of MRI (magnetic resonance imaging) brain neoplasm images can be a troublesome task due to the variance and complexity of tumors. This paper presents two Neural Network techniques for the classification of the magnetic resonance human brain images. The proposed Neural Network technique consists of 3 stages, namely, feature extraction, dimensionality reduction, and classification. In the first stage, we have obtained the features connected with tomography images using d...

  9. CLASSIFICATION OF MULTIVARIATE DATA SETS WITHOUT MISSING VALUES USING MEMORY BASED CLASSIFIERS – AN EFFECTIVENESS EVALUATION

    Directory of Open Access Journals (Sweden)

    C. Lakshmi Devasena

    2013-01-01

    Full Text Available Classification is a gradual practice for allocating a given piece of input to one of the known categories. Classification is a crucial machine learning technique. Many classification problems occur in different application areas and need to be solved. Different types of classification algorithms, such as memory-based, tree-based, and rule-based, are widely used. This work evaluates the performance of different memory-based classifiers for classification of multivariate data sets without missing values from the UCI machine learning repository using an open source machine learning tool. A comparison of the different memory-based classifiers used and a practical guideline for selecting the renowned and most suited algorithm for a classification task are presented. Apart from that, some pragmatic criteria for describing and evaluating the best classifiers are discussed.

  10. Classification of Multivariate Data Sets without Missing Values Using Memory Based Classifiers - An Effectiveness Evaluation

    Directory of Open Access Journals (Sweden)

    C. Lakshmi Devasena

    2013-02-01

    Full Text Available Classification is a gradual practice for allocating a given piece of input to one of the known categories. Classification is a crucial machine learning technique. Many classification problems occur in different application areas and need to be solved. Different types of classification algorithms, such as memory-based, tree-based, and rule-based, are widely used. This work evaluates the performance of different memory-based classifiers for classification of multivariate data sets without missing values from the UCI machine learning repository using an open source machine learning tool. A comparison of the different memory-based classifiers used and a practical guideline for selecting the renowned and most suited algorithm for a classification task are presented. Apart from that, some pragmatic criteria for describing and evaluating the best classifiers are discussed.

  11. Sequence-based classification using discriminatory motif feature selection.

    Directory of Open Access Journals (Sweden)

    Hao Xiong

    Full Text Available Most existing methods for sequence-based classification use exhaustive feature generation, employing, for example, all k-mer patterns. The motivation behind such (enumerative) approaches is to minimize the potential for overlooking important features. However, there are shortcomings to this strategy. First, practical constraints limit the scope of exhaustive feature generation to patterns of length ≤ k, such that potentially important, longer (> k) predictors are not considered. Second, features so generated exhibit strong dependencies, which can complicate understanding of derived classification rules. Third, and most importantly, numerous irrelevant features are created. These concerns can compromise prediction and interpretation. While remedies have been proposed, they tend to be problem-specific and not broadly applicable. Here, we develop a generally applicable methodology, and an attendant software pipeline, that is predicated on discriminatory motif finding. In addition to the traditional training and validation partitions, our framework entails a third level of data partitioning, a discovery partition. A discriminatory motif finder is used on sequences and associated class labels in the discovery partition to yield a (small) set of features. These features are then used as inputs to a classifier in the training partition. Finally, performance assessment occurs on the validation partition. Important attributes of our approach are its modularity (any discriminatory motif finder and any classifier can be deployed) and its universality (all data, including sequences that are unaligned and/or of unequal length, can be accommodated). We illustrate our approach on two nucleosome occupancy datasets and a protein solubility dataset, previously analyzed using enumerative feature generation. Our method achieves excellent performance results, with and without optimization of classifier tuning parameters. A Python pipeline implementing the approach is
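
    The enumerative baseline that this abstract argues against, exhaustive k-mer feature generation, can be sketched as follows. This is not the authors' pipeline; the function name, the DNA alphabet default and the example sequence are assumptions for illustration only.

```python
from itertools import product


def kmer_counts(sequence, k, alphabet="ACGT"):
    """Count every possible k-mer over `alphabet` in `sequence`, including absent ones."""
    # One feature per possible pattern, so the vector has len(alphabet) ** k entries.
    counts = {"".join(p): 0 for p in product(alphabet, repeat=k)}
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        if kmer in counts:  # windows containing other symbols (e.g. N) are skipped
            counts[kmer] += 1
    return counts


features = kmer_counts("ACGTAC", 2)
```

    With a four-letter alphabet the feature count grows as 4 to the power k, and most entries are zero for any single sequence, which illustrates the abstract's concerns about length limits and irrelevant features.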

  12. Classification of types of stuttering symptoms based on brain activity.

    Directory of Open Access Journals (Sweden)

    Jing Jiang

    Full Text Available Among the non-fluencies seen in speech, some are more typical (MT) of stuttering speakers, whereas others are less typical (LT) and are common to both stuttering and fluent speakers. No neuroimaging work has evaluated the neural basis for grouping these symptom types. Another long-debated issue is which type (LT, MT) whole-word repetitions (WWR) should be placed in. In this study, a sentence completion task was performed by twenty stuttering patients who were scanned using an event-related design. This task elicited stuttering in these patients. Each stuttered trial from each patient was sorted into the MT or LT types with WWR put aside. Pattern classification was employed to train a patient-specific single trial model to automatically classify each trial as MT or LT using the corresponding fMRI data. This model was then validated by using test data that were independent of the training data. In a subsequent analysis, the classification model, just established, was used to determine which type the WWR should be placed in. The results showed that the LT and the MT could be separated with high accuracy based on their brain activity. The brain regions that made most contribution to the separation of the types were: the left inferior frontal cortex and bilateral precuneus, both of which showed higher activity in the MT than in the LT; and the left putamen and right cerebellum which showed the opposite activity pattern. The results also showed that the brain activity for WWR was more similar to that of the LT and fluent speech than to that of the MT. These findings provide a neurological basis for separating the MT and the LT types, and support the widely-used MT/LT symptom grouping scheme. In addition, WWR play a similar role as the LT, and thus should be placed in the LT type.

  13. Multiview Sample Classification Algorithm Based on L1-Graph Domain Adaptation Learning

    OpenAIRE

    Huibin Lu; Zhengping Hu; Hongxiao Gao

    2015-01-01

    In the case of multiview sample classification with different distribution, training and testing samples are from different domains. In order to improve the classification performance, a multiview sample classification algorithm based on L1-Graph domain adaptation learning is presented. First of all, a framework of nonnegative matrix trifactorization based on domain adaptation learning is formed, in which the unchanged information is regarded as the bridge of knowledge transformation from the...

  14. Automated classification of mouse pup isolation syllables: from cluster analysis to an Excel-based "mouse pup syllable classification calculator".

    Science.gov (United States)

    Grimsley, Jasmine M S; Gadziola, Marie A; Wenstrup, Jeffrey J

    2012-01-01

    Mouse pups vocalize at high rates when they are cold or isolated from the nest. The proportions of each syllable type produced carry information about disease state and are being used as behavioral markers for the internal state of animals. Manual classifications of these vocalizations identified 10 syllable types based on their spectro-temporal features. However, manual classification of mouse syllables is time consuming and vulnerable to experimenter bias. This study uses an automated cluster analysis to identify acoustically distinct syllable types produced by CBA/CaJ mouse pups, and then compares the results to prior manual classification methods. The cluster analysis identified two syllable types, based on their frequency bands, that have continuous frequency-time structure, and two syllable types featuring abrupt frequency transitions. Although cluster analysis computed fewer syllable types than manual classification, the clusters represented well the probability distributions of the acoustic features within syllables. These probability distributions indicate that some of the manually classified syllable types are not statistically distinct. The characteristics of the four classified clusters were used to generate a Microsoft Excel-based mouse syllable classifier that rapidly categorizes syllables, with over a 90% match, into the syllable types determined by cluster analysis.

  15. Research and Application of Human Capital Strategic Classification Tool: Human Capital Classification Matrix Based on Biological Natural Attribute

    Directory of Open Access Journals (Sweden)

    Yong Liu

    2014-12-01

    Full Text Available In order to study the causes of weak human capital structure strategic classification management in China, we analyze the increasing difficulty that enterprises around the world face in human capital management. In order to provide strategically sound answers, HR managers need the critical information provided by the right technology processing and analytical tools. In this study, there are different types and levels of human capital in formal organization management, which do not make the same contribution to a formal organization. An important guarantee for sustained and healthy development of the formal or informal organization is lower human capital risk. Resisting this risk depends primarily on the hedging and value appreciation capacity of human capital, which depends largely on the strategic value of the performance of senior managers. Based on an analysis from the perspective of high-level managers, we also discuss the value and configuration of principles and methods to be followed in human capital strategic classification based on the Boston Consulting Group (BCG) matrix and build a Human Capital Classification (HCC) matrix based on biological natural attribute to effectively realize human capital structure strategic classification.

  16. [ECoG classification based on wavelet variance].

    Science.gov (United States)

    Yan, Shiyu; Liu, Chong; Wang, Hong; Zhao, Haibin

    2013-06-01

    For a typical electrocorticogram (ECoG)-based brain-computer interface (BCI) system in which the subject's task is to imagine movements of either the left small finger or the tongue, we proposed a feature extraction algorithm using wavelet variance. Firstly the definition and significance of wavelet variance were brought out and taken as feature based on the discussion of wavelet transform. Six channels with most distinctive features were selected from 64 channels for analysis. Consequently the EEG data were decomposed using db4 wavelet. The wavelet coefficient variances containing Mu rhythm and Beta rhythm were taken out as features based on ERD/ERS phenomenon. The features were classified linearly with an algorithm of cross validation. The results of off-line analysis showed that high classification accuracies of 90.24% and 93.77% for training and test data set were achieved, the wavelet variance had characteristics of simplicity and effectiveness and it was suitable for feature extraction in BCI research. PMID:23865300
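
    The idea of using per-level wavelet coefficient variance as a feature can be sketched as below. To stay self-contained, the sketch swaps in the Haar wavelet for the db4 wavelet named in the abstract, and it estimates the variance at each level as the mean of the squared detail coefficients; both choices, along with the function names and the toy signal, are assumptions of this example.

```python
def haar_step(signal):
    """One level of the orthonormal Haar transform: returns (approximation, detail)."""
    scale = 2 ** 0.5
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) / scale for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / scale for i in pairs]
    return approx, detail


def wavelet_variances(signal, levels):
    """Return one variance estimate per decomposition level, finest level first."""
    features = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        if not detail:
            break  # signal too short for further levels
        # Mean of squared detail coefficients at this level.
        features.append(sum(d * d for d in detail) / len(detail))
    return features


# Hypothetical single-channel segment: a fast alternating oscillation.
segment = [1.0, -1.0] * 4
features = wavelet_variances(segment, levels=2)
```

    For the alternating test signal all of the energy lands in the finest level, so the first feature is large and the second is zero; in a BCI setting the levels covering the Mu and Beta bands would be the ones kept.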

  17. Hyperspectral remote sensing image classification based on decision level fusion

    Institute of Scientific and Technical Information of China (English)

    Peijun Du; Wei Zhang; Junshi Xia

    2011-01-01

    To apply decision level fusion to hyperspectral remote sensing (HRS) image classification, three decision level fusion strategies are experimented on and compared, namely, linear consensus algorithm, improved evidence theory, and the proposed support vector machine (SVM) combiner. To evaluate the effects of the input features on classification performance, four schemes are used to organize input features for member classifiers. In the experiment, by using the operational modular imaging spectrometer (OMIS) II HRS image, the decision level fusion is shown as an effective way for improving the classification accuracy of the HRS image, and the proposed SVM combiner is especially suitable for decision level fusion. The results also indicate that the optimization of input features can improve the classification performance.

  18. Text Classification Retrieval Based on Complex Network and ICA Algorithm

    Directory of Open Access Journals (Sweden)

    Hongxia Li

    2013-08-01

    Full Text Available With the development of computer science and information technology, libraries are becoming increasingly digital and networked. Library digitization converts books into digital information. High-quality preservation and management are achieved through computer technology and text classification techniques, which add value to the stored knowledge. This paper introduces complex network theory into the text classification process and puts forward the ICA semantic clustering algorithm. It realizes the independent component analysis of complex network text classification. Through the ICA clustering algorithm, it realizes clustering extraction of characteristic words for text classification. The visualization of text retrieval is improved. Finally, we make a comparative analysis of the collocation algorithm and the ICA clustering algorithm through text classification and keyword search experiments. The paper gives the clustering degree and accuracy figures of the algorithms. Through simulation analysis, we find that the ICA clustering algorithm increases the text classification clustering degree by 1.2%. Accuracy can be improved by 11.1% at most. It improves the efficiency and accuracy of text classification retrieval. It also provides a theoretical reference for text retrieval classification of eBook

  19. Data Stream Classification Based on the Gamma Classifier

    Directory of Open Access Journals (Sweden)

    Abril Valeria Uriarte-Arcia

    2015-01-01

    Full Text Available The ever-increasing generation of data confronts us with the problem of handling massive amounts of information online. One of the biggest challenges is how to extract valuable information from these massive continuous data streams in a single scan. In a data stream context, data arrive continuously at high speed; therefore the algorithms developed to address this context must be efficient regarding memory and time management and capable of detecting changes over time in the underlying distribution that generated the data. This work describes a novel method for the task of pattern classification over a continuous data stream based on an associative model. The proposed method is based on the Gamma classifier, which is inspired by the Alpha-Beta associative memories; both are supervised pattern recognition models. The proposed method is capable of handling the space and time constraints inherent to data stream scenarios. The Data Streaming Gamma classifier (DS-Gamma classifier) implements a sliding window approach to provide concept drift detection and a forgetting mechanism. In order to test the classifier, several experiments were performed using different data stream scenarios with real and synthetic data streams. The experimental results show that the method exhibits competitive performance when compared to other state-of-the-art algorithms.
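
    The sliding-window forgetting mechanism mentioned above can be illustrated with a short sketch. The Gamma classifier itself is not specified in the abstract, so a 1-nearest-neighbour rule is swapped in here purely as a placeholder; only the bounded window is the point of the example. The class name `SlidingWindowClassifier` is hypothetical.

```python
from collections import deque

class SlidingWindowClassifier:
    """Keeps only the most recent labelled samples; old ones are forgotten.

    Assumption: 1-nearest-neighbour stands in for the Gamma classifier.
    """

    def __init__(self, window_size):
        # deque with maxlen discards the oldest sample once the window is full
        self.window = deque(maxlen=window_size)

    def learn(self, x, label):
        self.window.append((tuple(x), label))

    def predict(self, x):
        if not self.window:
            return None
        # Nearest stored sample by squared Euclidean distance
        nearest = min(
            self.window,
            key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x)),
        )
        return nearest[1]
```

    Because memory is bounded by the window size, samples drawn from an outdated distribution eventually leave the window, which is how such a scheme adapts after concept drift.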

  20. Comprehensive Study on Lexicon-based Ensemble Classification Sentiment Analysis

    Directory of Open Access Journals (Sweden)

    Łukasz Augustyniak

    2015-12-01

    Full Text Available We propose a novel method for counting sentiment orientation that outperforms supervised learning approaches in time and memory complexity and is not statistically significantly different from them in accuracy. Our method consists of a novel approach to generating unigram, bigram and trigram lexicons. The proposed method, called frequentiment, is based on calculating the frequency of features (words) in the document and averaging their impact on the sentiment score as opposed to documents that do not contain these features. Afterwards, we use ensemble classification to improve the overall accuracy of the method. What is important is that the frequentiment-based lexicons with sentiment threshold selection outperform other popular lexicons and some supervised learners, while being 3–5 times faster than the supervised approach. We compare 37 methods (lexicons, ensembles with lexicon’s predictions as input and supervised learners) applied to 10 Amazon review data sets and provide the first statistical comparison of the sentiment annotation methods that include ensemble approaches. It is one of the most comprehensive comparisons of domain sentiment analysis in the literature.

  1. Different Classification Algorithms Based on Arabic Text Classification: Feature Selection Comparative Study

    Directory of Open Access Journals (Sweden)

    Ghazi Raho

    2015-02-01

    Full Text Available Feature selection is necessary for effective text classification. Dataset preprocessing is essential for sound results and effective performance. This paper investigates the effectiveness of feature selection. We compare the performance of different classifiers in different situations using feature selection with and without stemming. The evaluation used a BBC Arabic dataset and several classification algorithms: decision tree (DT), K-nearest neighbors (KNN), Naïve Bayes (NB) and Naïve Bayes Multinomial (NBM). The experimental results are presented in terms of precision, recall, F-measure, accuracy and time to build the model.

  2. An approach for mechanical fault classification based on generalized discriminant analysis

    Institute of Scientific and Technical Information of China (English)

    LI Wei-hua; SHI Tie-lin; YANG Shu-zi

    2006-01-01

    To deal with pattern classification of complicated mechanical faults, an approach to multi-fault classification based on generalized discriminant analysis is presented. Compared with linear discriminant analysis (LDA), generalized discriminant analysis (GDA), one of the nonlinear discriminant analysis methods, is more suitable for classifying linearly non-separable problems. The connection and difference between KPCA (Kernel Principal Component Analysis) and GDA are discussed. KPCA is good at detecting machine abnormality, while GDA performs well in multi-fault classification based on a collection of historical fault symptoms. When the proposed method is applied to air compressor condition classification and gear fault classification, it shows excellent performance in complicated multi-fault classification.

  3. A NEW SVM BASED EMOTIONAL CLASSIFICATION OF IMAGE

    Institute of Scientific and Technical Information of China (English)

    Wang Weining; Yu Yinglin; Zhang Jianchao

    2005-01-01

    How high-level emotional representation of art paintings can be inferred from perceptual level features suited for the particular classes (dynamic vs. static classification) is presented. The key points are feature selection and classification. According to the strong relationship between notable lines of image and human sensations, a novel feature vector WLDLV (Weighted Line Direction-Length Vector) is proposed, which includes both orientation and length information of lines in an image. Classification is performed by SVM (Support Vector Machine) and images can be classified into dynamic and static. Experimental results demonstrate the effectiveness and superiority of the algorithm.

  4. Identification of area-level influences on regions of high cancer incidence in Queensland, Australia: a classification tree approach

    Directory of Open Access Journals (Sweden)

    Mengersen Kerrie L

    2011-07-01

    Full Text Available Abstract Background Strategies for cancer reduction and management are targeted at both individual and area levels. Area-level strategies require careful understanding of geographic differences in cancer incidence, in particular the association with factors such as socioeconomic status, ethnicity and accessibility. This study aimed to identify the complex interplay of area-level factors associated with high area-specific incidence of Australian priority cancers using a classification and regression tree (CART) approach. Methods Area-specific smoothed standardised incidence ratios were estimated for priority-area cancers across 478 statistical local areas in Queensland, Australia (1998-2007, n = 186,075). For those cancers with significant spatial variation, CART models were used to identify whether area-level accessibility, socioeconomic status and ethnicity were associated with high area-specific incidence. Results The accessibility of a person's residence had the most consistent association with the risk of cancer diagnosis across the specific cancers. Many cancers were likely to have high incidence in more urban areas, although male lung cancer and cervical cancer tended to have high incidence in more remote areas. The impact of socioeconomic status and ethnicity on these associations differed by type of cancer. Conclusions These results highlight the complex interactions between accessibility, socioeconomic status and ethnicity in determining cancer incidence risk.

  5. Comparing Machine Learning Classifiers for Object-Based Land Cover Classification Using Very High Resolution Imagery

    OpenAIRE

    Yuguo Qian; Weiqi Zhou; Jingli Yan; Weifeng Li; Lijian Han

    2014-01-01

    This study evaluates and compares the performance of four machine learning classifiers—support vector machine (SVM), normal Bayes (NB), classification and regression tree (CART) and K nearest neighbor (KNN)—to classify very high resolution images, using an object-based classification procedure. In particular, we investigated how tuning parameters affect the classification accuracy with different training sample sizes. We found that: (1) SVM and NB were superior to CART and KNN, and both could...

  6. Analysis on Design of Kohonen-network System Based on Classification of Complex Signals

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    The key methods of detection and classification of the electroencephalogram (EEG) used in recent years are introduced. Taking EEG for example, the design plan of Kohonen neural network system based on detection and classification of complex signals is proposed, and both the network design and signal processing are analyzed, including pre-processing of signals, extraction of signal features, classification of signal and network topology, etc.

  7. Assessing the Performance of a Classification-Based Vulnerability Analysis Model

    OpenAIRE

    Wang, Tai-Ran; Mousseau, Vincent; Pedroni, Nicola; Zio, Enrico

    2015-01-01

    In this article, a classification model based on the majority rule sorting (MR-Sort) method is employed to evaluate the vulnerability of safety-critical systems with respect to malevolent intentional acts. The model is built on the basis of a (limited-size) set of data representing (a priori known) vulnerability classification examples. The empirical construction of the classification model introduces a source of uncertainty into the vulnerability analysis process: a quantitative assessment ...

  8. Improving Sparse Representation-Based Classification Using Local Principal Component Analysis

    OpenAIRE

    Weaver, Chelsea; Saito, Naoki

    2016-01-01

    Sparse representation-based classification (SRC), proposed by Wright et al., seeks the sparsest decomposition of a test sample over the dictionary of training samples, with classification to the most-contributing class. Because it assumes test samples can be written as linear combinations of their same-class training samples, the success of SRC depends on the size and representativeness of the training set. Our proposed classification algorithm enlarges the training set by using local princip...

  9. [Treatment Ideas and Methods for Treating Breast Cancer Guided by Molecular Classification].

    Science.gov (United States)

    Wang, Hui-jie; Wang, Zhao-xia; Wan, Dong-gui; Li, Pei-wen

    2016-04-01

    Breast cancer can be classified into four molecular types: Luminal type A, Luminal type B, HER-2-positive type, and triple negative type. The authors combined the pathological characteristics, biological characteristics, and comprehensive treatment of breast cancer, used syndrome typing based medication, and explored detailed treatment ideas and methods of "treating the same disease with different methods" as well as "different treatment methods in accordance with patients individually". PMID:27323624

  10. Therapeutic pathomorphism of malignancies: Clinical and morphological criteria. Classifications. Prognostic value of therapeutic pathomorphism in breast cancer and other tumors

    Directory of Open Access Journals (Sweden)

    A. A. Lisayeva

    2011-01-01

    Full Text Available Pathomorphism is one of the most important prognostic factors for breast cancer. The paper gives the notion of pathomorphism and its types and the most commonly used classifications of tumor pathomorphological changes. It also considers the long-term results of neoadjuvant treatment in relation to pathomorphism.

  11. Knowledge-based sea ice classification by polarimetric SAR

    DEFF Research Database (Denmark)

    Skriver, Henning; Dierking, Wolfgang

    2004-01-01

    Polarimetric SAR images acquired at C- and L-band over sea ice in the Greenland Sea, Baltic Sea, and Beaufort Sea have been analysed with respect to their potential for ice type classification. The polarimetric data were gathered by the Danish EMISAR and the US AIRSAR which both are airborne...... systems. A hierarchical classification scheme was chosen for sea ice because our knowledge about magnitudes, variations, and dependences of sea ice signatures can be directly considered. The optimal sequence of classification rules and the rules themselves depend on the ice conditions/regimes. The use...... of the polarimetric phase information improves the classification only in the case of thin ice types but is not necessary for thicker ice (above about 30 cm thickness)...

  12. Trace elements based classification on clinkers. Application to Spanish clinkers

    Directory of Open Access Journals (Sweden)

    Tamás, F. D.

    2001-12-01

    Full Text Available The qualitative identification to determine the origin (i.e. manufacturing factory of Spanish clinkers is described. The classification of clinkers produced in different factories can be based on their trace element content. Approximately fifteen clinker sorts are analysed, collected from 11 Spanish cement factories to determine their Mg, Sr, Ba, Mn, Ti, Zr, Zn and V content. An expert system formulated by a binary decision tree is designed based on the collected data. The performance of the obtained classifier was measured by ten-fold cross validation. The results show that the proposed method is useful to identify an easy-to-use expert system that is able to determine the origin of the clinker based on its trace element content.
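
    The ten-fold cross-validation used to measure the classifier can be sketched as below. This is a generic illustration, not the authors' code: the helper names are hypothetical, `fit` and `predict` are placeholders for any classifier (for example a binary decision tree), and the samples are split in their given order, whereas in practice they would usually be shuffled first.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs; fold sizes differ by at most one."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

def cross_val_accuracy(samples, labels, fit, predict, k=10):
    """Fraction of samples classified correctly when each is held out exactly once."""
    correct = 0
    for train, test in k_fold_indices(len(samples), k):
        model = fit([samples[i] for i in train], [labels[i] for i in train])
        correct += sum(predict(model, samples[i]) == labels[i] for i in test)
    return correct / len(samples)
```

    With about fifteen samples and ten folds, each held-out fold contains only one or two clinkers, so the estimate is close to leave-one-out validation.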


  13. Neural Network based Vehicle Classification for Intelligent Traffic Control

    Directory of Open Access Journals (Sweden)

    Saeid Fazli

    2012-06-01

    Full Text Available The number of vehicles has increased, and traditional traffic control systems can no longer meet demand, which has led to the emergence of Intelligent Traffic Control Systems. They improve control and urban management and increase the confidence index on roads and highways. The goal of this article is vehicle classification based on neural networks. In this research, a fixed camera located at a height close to the road surface is used to detect and classify vehicles. The algorithm consists of two general phases. In the first phase, moving vehicles are extracted from the traffic scene using image processing techniques, including background removal, edge detection and morphological operations. In the second phase, vehicles near the camera are selected and their specific features are extracted and processed. These features are applied to the neural network as a vector, and the outputs determine the type of vehicle. The presented model is able to classify vehicles into three classes: heavy vehicles, light vehicles and motorcycles. Results demonstrate the accuracy of the algorithm and its high level of functionality.

  14. Basic Hand Gestures Classification Based on Surface Electromyography.

    Science.gov (United States)

    Palkowski, Aleksander; Redlarski, Grzegorz

    2016-01-01

    This paper presents an innovative classification system for hand gestures using 2-channel surface electromyography analysis. The system developed uses the Support Vector Machine classifier, for which the kernel function and parameter optimisation are conducted additionally by the Cuckoo Search swarm algorithm. The system developed is compared with standard Support Vector Machine classifiers with various kernel functions. The average classification rate of 98.12% has been achieved for the proposed method. PMID:27298630

  15. Basic Hand Gestures Classification Based on Surface Electromyography

    Directory of Open Access Journals (Sweden)

    Aleksander Palkowski

    2016-01-01

    Full Text Available This paper presents an innovative classification system for hand gestures using 2-channel surface electromyography analysis. The system developed uses the Support Vector Machine classifier, for which the kernel function and parameter optimisation are conducted additionally by the Cuckoo Search swarm algorithm. The system developed is compared with standard Support Vector Machine classifiers with various kernel functions. The average classification rate of 98.12% has been achieved for the proposed method.

  16. Egocentric visual event classification with location-based priors

    OpenAIRE

    Sundaram, Sudeep; Mayol-Cuevas, Walterio

    2010-01-01

    We present a method for visual classification of actions and events captured from an egocentric point of view. The method tackles the challenge of a moving camera by creating deformable graph models for classification of actions. Action models are learned from low resolution, roughly stabilized difference images acquired using a single monocular camera. In parallel, raw images from the camera are used to estimate the user's location using a visual Simultaneous Localization and Mapping (SLAM) ...

  17. Consistent image-based measurement and classification of skin color

    OpenAIRE

    Harville, Michael; Baker, Harlyn; Bhatti, Nina; Süsstrunk, Sabine

    2005-01-01

    Little prior image processing work has addressed estimation and classification of skin color in a manner that is independent of camera and illuminant. To this end, we first present new methods for 1) fast, easy-to-use image color correction, with specialization toward skin tones, and 2) fully automated estimation of facial skin color, with robustness to shadows, specularities, and blemishes. Each of these is validated independently against ground truth, and then combined with a classification...

  18. Basic Hand Gestures Classification Based on Surface Electromyography

    Science.gov (United States)

    Palkowski, Aleksander; Redlarski, Grzegorz

    2016-01-01

    This paper presents an innovative classification system for hand gestures using 2-channel surface electromyography analysis. The system developed uses the Support Vector Machine classifier, for which the kernel function and parameter optimisation are conducted additionally by the Cuckoo Search swarm algorithm. The system developed is compared with standard Support Vector Machine classifiers with various kernel functions. The average classification rate of 98.12% has been achieved for the proposed method. PMID:27298630

  19. Basic Hand Gestures Classification Based on Surface Electromyography

    OpenAIRE

    Aleksander Palkowski; Grzegorz Redlarski

    2016-01-01

    This paper presents an innovative classification system for hand gestures using 2-channel surface electromyography analysis. The system developed uses the Support Vector Machine classifier, for which the kernel function and parameter optimisation are conducted additionally by the Cuckoo Search swarm algorithm. The system developed is compared with standard Support Vector Machine classifiers with various kernel functions. The average classification rate of 98.12% has been achieved for the prop...

  20. IMPROVEMENT OF TCAM-BASED PACKET CLASSIFICATION ALGORITHM

    Institute of Scientific and Technical Information of China (English)

    Xu Zhen; Zhang Jun; Rui Liyang; Sun Jun

    2008-01-01

    The feature of Ternary Content Addressable Memories (TCAMs) makes them particularly attractive for IP address lookup and packet classification applications in a router system. However, the limitations of TCAMs impede their utilization. In this paper, the solutions for decreasing the power consumption and avoiding entry expansion in range matching are addressed. Experimental results demonstrate that the proposed techniques can make some big improvements on the performance of TCAMs in IP address lookup and packet classification.
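
    The entry-expansion problem mentioned above arises because a TCAM entry matches a ternary prefix, so an arbitrary range (such as a port range) must be split into several prefixes. The sketch below illustrates only that problem, using the standard range-to-prefix split; it does not reproduce the technique proposed in the paper, which the abstract does not describe. The function name is hypothetical.

```python
def range_to_prefixes(lo, hi, width=16):
    """Split the integer range [lo, hi] into ternary prefixes ('0', '1', '*')."""
    prefixes = []
    while lo <= hi:
        # Largest power-of-two block that is aligned at lo ...
        size = lo & -lo if lo else 1 << width
        # ... and does not run past hi
        while size > hi - lo + 1:
            size >>= 1
        wild = size.bit_length() - 1  # number of trailing wildcard bits
        bits = format(lo, f"0{width}b")
        prefixes.append(bits[: width - wild] + "*" * wild)
        lo += size
    return prefixes
```

    For example, the 16-bit port range 1024 to 65535 needs six entries under this split, and a rule with two such range fields would need the product of the two counts, which is why avoiding expansion matters.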

  1. Texture Features based Blur Classification in Barcode Images

    OpenAIRE

    Shamik Tiwari; Vidya Prasad Shukla; Sangappa Birada; Ajay Singh

    2013-01-01

    Blur is an undesirable phenomenon which appears as image degradation. Blur classification is extremely desirable before application of any blur parameters estimation approach in case of blind restoration of barcode image. A novel approach to classify blur in motion, defocus, and co-existence of both blur categories is presented in this paper. The key idea involves statistical features extraction of blur pattern in frequency domain and designing of blur classification system with feed forward ...

  2. CLASSIFICATION OF LiDAR DATA WITH POINT BASED CLASSIFICATION METHODS

    OpenAIRE

    N. Yastikli; Cetin, Z.

    2016-01-01

    LiDAR is one of the most effective systems for three-dimensional (3D) data collection over wide areas. Nowadays, airborne LiDAR data is used frequently in various applications such as object extraction, 3D modelling, change detection and revision of maps with increasing point density and accuracy. The classification of the LiDAR points is the first step of the LiDAR data processing chain and should be handled in a proper way since the 3D city modelling, building extraction, DEM generation, etc. applicati...

  3. SAR images classification method based on Dempster-Shafer theory and kernel estimate

    Institute of Scientific and Technical Information of China (English)

    He Chu; Xia Guisong; Sun Hong

    2007-01-01

    To study the scene classification in the Synthetic Aperture Radar (SAR) image, a novel method based on kernel estimate, with the Markov context and Dempster-Shafer evidence theory is proposed. Initially, a nonparametric Probability Density Function (PDF) estimate method is introduced, to describe the scene of SAR images. And then under the Markov context, both the determinate PDF and the kernel estimate method are adopted respectively, to form a primary classification. Next, the primary classification results are fused using the evidence theory in an unsupervised way to get the scene classification. Finally, a regularization step is used, in which an iterated maximum selecting approach is introduced to control the fragments and modify the errors of the classification. Use of the kernel estimate and evidence theory can describe the complicated scenes with little prior knowledge and eliminate the ambiguities of the primary classification results. Experimental results on real SAR images illustrate a rather impressive performance.

  4. Speech/Music Classification Enhancement for 3GPP2 SMV Codec Based on Support Vector Machine

    Science.gov (United States)

    Kim, Sang-Kyun; Chang, Joon-Hyuk

    In this letter, we propose a novel approach to speech/music classification based on the support vector machine (SVM) to improve the performance of the 3GPP2 selectable mode vocoder (SMV) codec. We first analyze the features and the classification method used in real time speech/music classification algorithm in SMV, and then apply the SVM for enhanced speech/music classification. For evaluation of performance, we compare the proposed algorithm and the traditional algorithm of the SMV. The performance of the proposed system is evaluated under the various environments and shows better performance compared to the original method in the SMV.

  5. Review of Remotely Sensed Imagery Classification Patterns Based on Object-oriented Image Analysis

    Institute of Scientific and Technical Information of China (English)

    LIU Yongxue; LI Manchun; MAO Liang; XU Feifei; HUANG Shuo

    2006-01-01

    With the wide use of high-resolution remotely sensed imagery, the object-oriented remotely sensed information classification pattern has been intensively studied. Starting with the definition of object-oriented remotely sensed information classification pattern and a literature review of related research progress, this paper sums up 4 developing phases of object-oriented classification pattern during the past 20 years. Then, we discuss the three aspects of methodology in detail, namely remotely sensed imagery segmentation, feature analysis and feature selection, and classification rule generation, through comparing them with remotely sensed information classification method based on per-pixel. At last, this paper presents several points that need to be paid attention to in the future studies on object-oriented RS information classification pattern: 1) developing robust and highly effective image segmentation algorithm for multi-spectral RS imagery; 2) improving the feature-set including edge, spatial-adjacent and temporal characteristics; 3) discussing the classification rule generation classifier based on the decision tree; 4) presenting evaluation methods for classification result by object-oriented classification pattern.

  6. INDUS - a composition-based approach for rapid and accurate taxonomic classification of metagenomic sequences

    OpenAIRE

    Mohammed, Monzoorul Haque; Ghosh, Tarini Shankar; Reddy, Rachamalla Maheedhar; Reddy, Chennareddy Venkata Siva Kumar; Singh, Nitin Kumar; Sharmila S Mande

    2011-01-01

    Background Taxonomic classification of metagenomic sequences is the first step in metagenomic analysis. Existing taxonomic classification approaches are of two types, similarity-based and composition-based. Similarity-based approaches, though accurate and specific, are extremely slow. Since metagenomic projects generate millions of sequences, adopting similarity-based approaches becomes virtually infeasible for research groups having modest computational resources. In this study, we present ...

  7. Classification and Identification of Over-voltage Based on HHT and SVM

    Institute of Scientific and Technical Information of China (English)

    WANG Jing; YANG Qing; CHEN Lin; SIMA Wenxia

    2012-01-01

    This paper proposes an effective method for over-voltage classification based on the Hilbert-Huang transform (HHT) method. The Hilbert-Huang transform method is composed of empirical mode decomposition (EMD) and the Hilbert transform. Nine kinds of common power system over-voltages are calculated and analyzed by HHT. Based on the instantaneous amplitude spectrum, Hilbert marginal spectrum and Hilbert time-frequency spectrum, three kinds of over-voltage characteristic quantities are obtained. A hierarchical classification system is built based on HHT and support vector machine (SVM). This classification system is tested by 106 field over-voltage signals, and the average classification rate is 94.3%. This research shows that HHT is an effective time-frequency analysis algorithm in the application of over-voltage classification and identification.

  8. Image-classification-based global dimming algorithm for LED backlights in LCDs

    Science.gov (United States)

    Qibin, Feng; Huijie, He; Dong, Han; Lei, Zhang; Guoqiang, Lv

    2015-07-01

    Backlight dimming can help LCDs reduce power consumption and improve contrast ratio (CR). With fixed parameters, a dimming algorithm cannot achieve satisfactory results for all kinds of images. The paper introduces an image-classification-based global dimming algorithm. The proposed classification method, designed specifically for backlight dimming, is based on the luminance and CR of input images. The parameters for backlight dimming level and pixel compensation adapt to the image classification. The simulation results show that the classification-based dimming algorithm presents an 86.13% power reduction improvement compared with dimming without classification, with almost the same display quality. A prototype has been developed. There are no perceived distortions when playing videos. The practical average power reduction of the prototype TV is 18.72%, compared with a common TV without dimming.

  9. Initial steps towards an evidence-based classification system for golfers with a physical impairment

    NARCIS (Netherlands)

    Stoter, Inge K; Hettinga, Florentina J; Altmann, Viola; Eisma, Wim; Arendzen, Hans; Bennett, Tony; van der Woude, Lucas H; Dekker, Rienk

    2015-01-01

    PURPOSE: The present narrative review aims to make a first step towards an evidence-based classification system in handigolf following the International Paralympic Committee (IPC). It intends to create a conceptual framework of classification for handigolf and an agenda for future research. METHOD:

  10. Dihedral-based segment identification and classification of biopolymers II: polynucleotides.

    Science.gov (United States)

    Nagy, Gabor; Oostenbrink, Chris

    2014-01-27

    In an accompanying paper (Nagy, G.; Oostenbrink, C. Dihedral-based segment identification and classification of biopolymers I: Proteins. J. Chem. Inf. Model. 2013, DOI: 10.1021/ci400541d), we introduce a new algorithm for structure classification of biopolymeric structures based on main-chain dihedral angles. The DISICL algorithm (short for DIhedral-based Segment Identification and CLassification) classifies segments of structures containing two central residues. Here, we introduce the DISICL library for polynucleotides, which is based on the dihedral angles ε, ζ, and χ for the two central residues of a three-nucleotide segment of a single strand. Seventeen distinct structural classes are defined for nucleotide structures, some of which--to our knowledge--were not described previously in other structure classification algorithms. In particular, DISICL also classifies noncanonical single-stranded structural elements. DISICL is applied to databases of DNA and RNA structures containing 80,000 and 180,000 segments, respectively. The classifications according to DISICL are compared to those of another popular classification scheme in terms of the amount of classified nucleotides, average occurrence and length of structural elements, and pairwise matches of the classifications. While the detailed classification of DISICL adds sensitivity to a structure analysis, it can be readily reduced to eight simplified classes providing a more general overview of the secondary structure in polynucleotides.
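
    The dihedral angles on which this classification rests can be computed from the coordinates of four consecutive atoms. The sketch below shows only that geometric step, using the standard atan2 formulation; it is not taken from the DISICL code, and the function names are hypothetical. The mapping from angles to the structural classes is defined by the DISICL library and is not reproduced here.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees, in (-180, 180], defined by four points."""
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1 = _cross(b0, b1)  # normal of the plane through p0, p1, p2
    n2 = _cross(b1, b2)  # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b1, b1)) * _dot(b0, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))
```

    Using atan2 rather than an arccosine keeps the sign of the angle and stays numerically stable near 0 and 180 degrees, which matters when angle regions are used as class boundaries.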

  11. 78 FR 58153 - Prevailing Rate Systems; North American Industry Classification System Based Federal Wage System...

    Science.gov (United States)

    2013-09-23

    ... RIN 3206-AM78 Prevailing Rate Systems; North American Industry Classification System Based Federal... Industry Classification System (NAICS) codes currently used in Federal Wage System wage survey industry..., 2013, the U.S. Office of Personnel Management (OPM) issued a proposed rule (78 FR 18252) to update...

  12. Dihedral-Based Segment Identification and Classification of Biopolymers II: Polynucleotides

    Science.gov (United States)

    2013-01-01

    In an accompanying paper (Nagy, G.; Oostenbrink, C. Dihedral-based segment identification and classification of biopolymers I: Proteins. J. Chem. Inf. Model. 2013, DOI: 10.1021/ci400541d), we introduce a new algorithm for structure classification of biopolymeric structures based on main-chain dihedral angles. The DISICL algorithm (short for DIhedral-based Segment Identification and CLassification) classifies segments of structures containing two central residues. Here, we introduce the DISICL library for polynucleotides, which is based on the dihedral angles ε, ζ, and χ for the two central residues of a three-nucleotide segment of a single strand. Seventeen distinct structural classes are defined for nucleotide structures, some of which—to our knowledge—were not described previously in other structure classification algorithms. In particular, DISICL also classifies noncanonical single-stranded structural elements. DISICL is applied to databases of DNA and RNA structures containing 80,000 and 180,000 segments, respectively. The classifications according to DISICL are compared to those of another popular classification scheme in terms of the amount of classified nucleotides, average occurrence and length of structural elements, and pairwise matches of the classifications. While the detailed classification of DISICL adds sensitivity to a structure analysis, it can be readily reduced to eight simplified classes providing a more general overview of the secondary structure in polynucleotides. PMID:24364355

  13. Classification of samples into two or more ordered populations with application to a cancer trial.

    Science.gov (United States)

    Conde, D; Fernández, M A; Rueda, C; Salvador, B

    2012-12-10

    In many applications, especially in cancer treatment and diagnosis, investigators are interested in classifying patients into various diagnosis groups on the basis of molecular data such as gene expression or proteomic data. Often, some of the diagnosis groups are known to be related to higher or lower values of some of the predictors. The standard methods of classifying patients into various groups do not take into account the underlying order. This could potentially result in high misclassification rates, especially when the number of groups is larger than two. In this article, we develop classification procedures that exploit the underlying order among the mean values of the predictor variables and the diagnostic groups by using ideas from order-restricted inference. We generalize the existing methodology on discrimination under restrictions and provide empirical evidence to demonstrate that the proposed methodology improves over the existing unrestricted methodology. The proposed methodology is applied to a bladder cancer data set where the researchers are interested in classifying patients into various groups.

  14. Multi-label literature classification based on the Gene Ontology graph

    Directory of Open Access Journals (Sweden)

    Lu Xinghua

    2008-12-01

    Full Text Available Abstract Background The Gene Ontology is a controlled vocabulary for representing knowledge related to genes and proteins in a computable form. The current effort of manually annotating proteins with the Gene Ontology is outpaced by the rate of accumulation of biomedical knowledge in the literature, which calls for text mining approaches that facilitate the process by automatically extracting Gene Ontology annotations from the literature. The task is usually cast as a text classification problem, and contemporary methods are confronted with unbalanced training data and the difficulties associated with multi-label classification. Results In this research, we investigated methods of enhancing automatic multi-label classification of biomedical literature by utilizing the structure of the Gene Ontology graph. We studied three graph-based multi-label classification algorithms, including a novel stochastic algorithm and two top-down hierarchical classification methods for multi-label literature classification. We systematically evaluated and compared these graph-based classification algorithms to a conventional flat multi-label algorithm. The results indicate that, by utilizing the information from the structure of the Gene Ontology graph, the graph-based multi-label classification methods can significantly improve predictions of the Gene Ontology terms implied by the analyzed text. Furthermore, the graph-based multi-label classifiers are capable of suggesting Gene Ontology annotations (to curators) that are closely related to the true annotations even if they fail to predict the true ones directly. A software package implementing the studied algorithms is available for the research community. Conclusion Through utilizing the information from the structure of the Gene Ontology graph, the graph-based multi-label classification methods have better potential than the conventional flat multi-label classification approach to facilitate

  15. SPAM CLASSIFICATION BASED ON SUPERVISED LEARNING USING MACHINE LEARNING TECHNIQUES

    Directory of Open Access Journals (Sweden)

    T. Hamsapriya

    2011-12-01

    Full Text Available E-mail is one of the most popular and frequently used means of communication due to its worldwide accessibility, relatively fast message transfer, and low sending cost. Flaws in the e-mail protocols and the increasing volume of electronic business and financial transactions directly contribute to the increase in e-mail-based threats. E-mail spam is one of the major problems of today's Internet, bringing financial damage to companies and annoying individual users. Spam e-mails reach users without their consent and fill their mailboxes. They consume network capacity as well as the time spent checking and deleting spam. The vast majority of Internet users are outspoken in their disdain for spam, although enough of them respond to commercial offers that spam remains a viable source of income for spammers. While most users want to do the right thing to avoid and get rid of spam, they need clear and simple guidelines on how to behave. In spite of all the measures taken to eliminate spam, it has not yet been eradicated. Moreover, when countermeasures are oversensitive, even legitimate e-mails are eliminated. Among the approaches developed to stop spam, filtering is one of the most important techniques. Much research in spam filtering has centered on the more sophisticated classifier-related issues. Recently, machine learning for spam classification has become an important research issue. The proposed work explores and identifies the use of different learning algorithms for classifying spam messages from e-mail. A comparative analysis among the algorithms is also presented.

  16. Dendritic cell-based cancer immunotherapy for colorectal cancer.

    Science.gov (United States)

    Kajihara, Mikio; Takakura, Kazuki; Kanai, Tomoya; Ito, Zensho; Saito, Keisuke; Takami, Shinichiro; Shimodaira, Shigetaka; Okamoto, Masato; Ohkusa, Toshifumi; Koido, Shigeo

    2016-05-01

    Colorectal cancer (CRC) is one of the most common cancers and a leading cause of cancer-related mortality worldwide. Although systemic therapy is the standard care for patients with recurrent or metastatic CRC, the prognosis is extremely poor. The optimal sequence of therapy remains unknown. Therefore, alternative strategies, such as immunotherapy, are needed for patients with advanced CRC. This review summarizes evidence from dendritic cell-based cancer immunotherapy strategies that are currently in clinical trials. In addition, we discuss the possibility of antitumor immune responses through immunoinhibitory PD-1/PD-L1 pathway blockade in CRC patients. PMID:27158196

  17. Dendritic cell-based cancer immunotherapy for colorectal cancer.

    Science.gov (United States)

    Kajihara, Mikio; Takakura, Kazuki; Kanai, Tomoya; Ito, Zensho; Saito, Keisuke; Takami, Shinichiro; Shimodaira, Shigetaka; Okamoto, Masato; Ohkusa, Toshifumi; Koido, Shigeo

    2016-05-01

    Colorectal cancer (CRC) is one of the most common cancers and a leading cause of cancer-related mortality worldwide. Although systemic therapy is the standard care for patients with recurrent or metastatic CRC, the prognosis is extremely poor. The optimal sequence of therapy remains unknown. Therefore, alternative strategies, such as immunotherapy, are needed for patients with advanced CRC. This review summarizes evidence from dendritic cell-based cancer immunotherapy strategies that are currently in clinical trials. In addition, we discuss the possibility of antitumor immune responses through immunoinhibitory PD-1/PD-L1 pathway blockade in CRC patients.

  18. SAR target classification based on multiscale sparse representation

    Science.gov (United States)

    Ruan, Huaiyu; Zhang, Rong; Li, Jingge; Zhan, Yibing

    2016-03-01

    We propose a novel multiscale sparse representation approach for SAR target classification. It first extracts dense SIFT descriptors on multiple scales, then trains a global multiscale dictionary with a sparse coding algorithm. After obtaining the sparse representation, the method applies spatial pyramid matching (SPM) and max pooling to summarize the features for each image. The proposed method can provide more information and descriptive ability than single-scale ones. Moreover, it costs less extra computation than existing multiscale methods, which compute a dictionary for each scale. The MSTAR database and a ship database collected from TerraSAR-X images are used in the classification setup. Results show that the best overall classification rate of the proposed approach reaches 98.83% on the MSTAR database and 92.67% on the TerraSAR-X ship database.

  19. Accelerometry-Based Classification of Human Activities Using Markov Modeling

    Directory of Open Access Journals (Sweden)

    Andrea Mannini

    2011-01-01

    Full Text Available Accelerometers are a popular choice as body-motion sensors, partly because they can extract information useful for automatically inferring the physical activity in which the human subject is involved, besides their role in feeding estimators of biomechanical parameters. Automatic classification of human physical activities is highly attractive for pervasive computing systems, where contextual awareness may ease human-machine interaction, and in biomedicine, where wearable sensor systems are proposed for long-term monitoring. This paper is concerned with the machine learning algorithms needed to perform the classification task. Hidden Markov Model (HMM) classifiers are studied by contrasting them with Gaussian Mixture Model (GMM) classifiers. HMMs incorporate the statistical information available on movement dynamics into the classification process, without discarding the time history of previous outcomes as GMMs do. An example of the benefits of the obtained statistical leverage is illustrated and discussed by analyzing two datasets of accelerometer time series.
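
    The contrast between frame-wise classification and an HMM that carries the time history forward can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function name `forward_filter`, the two-state setup, and all probabilities are hypothetical, and the per-frame likelihoods are assumed to come from some emission model such as a GMM.

```python
def forward_filter(emission_likelihoods, transition, prior):
    """Return the filtered state beliefs after each frame (HMM forward pass).

    emission_likelihoods: one list per frame, giving p(observation | state).
    transition: transition[i][j] = p(next state j | current state i).
    prior: belief over states before the first frame (an assumed convention).
    """
    n_states = len(prior)
    belief = list(prior)
    history = []
    for likelihood in emission_likelihoods:
        # Predict: push the previous belief through the movement dynamics.
        predicted = [
            sum(belief[i] * transition[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
        # Update: weight by how well each state explains the current frame.
        unnormalized = [predicted[j] * likelihood[j] for j in range(n_states)]
        total = sum(unnormalized)
        belief = [u / total for u in unnormalized]
        history.append(belief)
    return history


# Hypothetical two-activity example: frame 1 strongly suggests state 0,
# frame 2 weakly suggests state 1.
likelihoods = [[0.9, 0.1], [0.4, 0.6]]
sticky = [[0.9, 0.1], [0.1, 0.9]]   # activities tend to persist
flat = [[0.5, 0.5], [0.5, 0.5]]     # no memory: behaves frame by frame
prior = [0.5, 0.5]

with_history = forward_filter(likelihoods, sticky, prior)
without_history = forward_filter(likelihoods, flat, prior)
```

    With the persistent transition matrix the final belief still favours state 0 (about 0.75), whereas the memoryless matrix follows the second frame alone and favours state 1 (0.40 for state 0).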

  20. Knowledge based cluster ensemble for cancer discovery from biomolecular data.

    Science.gov (United States)

    Yu, Zhiwen; Wong, Hau-San; You, Jane; Yang, Qinmin; Liao, Hongying

    2011-06-01

    The adoption of microarray techniques in biological and medical research provides a new way for cancer diagnosis and treatment. In order to perform successful diagnosis and treatment of cancer, discovering and classifying cancer types correctly is essential. Class discovery is one of the most important tasks in cancer classification using biomolecular data. Most existing works adopt single clustering algorithms to perform class discovery from biomolecular data. However, single clustering algorithms have limitations, which include a lack of robustness, stability, and accuracy. In this paper, we propose a new cluster ensemble approach called knowledge based cluster ensemble (KCE), which incorporates the prior knowledge of the data sets into the cluster ensemble framework. Specifically, KCE represents the prior knowledge of a data set in the form of pairwise constraints. Then, the spectral clustering algorithm (SC) is adopted to generate a set of clustering solutions. Next, KCE transforms pairwise constraints into confidence factors for these clustering solutions. After that, a consensus matrix is constructed by considering all the clustering solutions and their corresponding confidence factors. The final clustering result is obtained by partitioning the consensus matrix. Compared with single clustering algorithms and conventional cluster ensemble approaches, knowledge based cluster ensemble approaches are more robust, stable and accurate. The experiments on cancer data sets show that: 1) KCE works well on these data sets; 2) KCE not only outperforms most of the state-of-the-art single clustering algorithms, but also outperforms most of the state-of-the-art cluster ensemble approaches.
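
    The consensus-matrix step described above can be illustrated with a small sketch. The abstract does not say how pairwise constraints are turned into confidence factors, so the rule used here (the fraction of constraints a clustering satisfies) is an assumption, as are the function names `confidence_from_constraints` and `consensus_matrix` and the toy data. The spectral clustering and final partitioning steps are omitted.

```python
def confidence_from_constraints(labels, must_link, cannot_link):
    """Assumed rule: the fraction of pairwise constraints a clustering satisfies."""
    satisfied = sum(labels[i] == labels[j] for i, j in must_link)
    satisfied += sum(labels[i] != labels[j] for i, j in cannot_link)
    total = len(must_link) + len(cannot_link)
    return satisfied / total if total else 1.0


def consensus_matrix(clusterings, confidences):
    """Confidence-weighted co-association matrix over all clustering solutions."""
    n_samples = len(clusterings[0])
    weight_sum = sum(confidences)
    matrix = [[0.0] * n_samples for _ in range(n_samples)]
    for labels, weight in zip(clusterings, confidences):
        for i in range(n_samples):
            for j in range(n_samples):
                # Two samples vote for each other when they share a cluster.
                if labels[i] == labels[j]:
                    matrix[i][j] += weight / weight_sum
    return matrix


# Hypothetical toy data: three samples, two candidate clusterings.
clusterings = [[0, 0, 1], [0, 1, 1]]
must_link = [(0, 1)]      # prior knowledge: samples 0 and 1 belong together
cannot_link = [(0, 2)]    # prior knowledge: samples 0 and 2 do not

confidences = [
    confidence_from_constraints(labels, must_link, cannot_link)
    for labels in clusterings
]
consensus = consensus_matrix(clusterings, confidences)
```

    Here the first clustering satisfies both constraints and the second only one, so the pair (0, 1) ends up with a higher consensus value than the pair (1, 2).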

  1. Woven fabric defects detection based on texture classification algorithm

    International Nuclear Information System (INIS)

    In this paper we compare two well-known texture classification methods for recognizing and classifying defects that occur in textile manufacturing: the local binary patterns (LBP) method and the co-occurrence matrix. The classifier used is the support vector machine (SVM). The system has been tested using the TILDA database. The results show that LBP is a good method for defect recognition and classification, and that it gives a good running time, especially for real-time applications.

  2. Seafloor Sediment Classification Based on Multibeam Sonar Data

    Institute of Scientific and Technical Information of China (English)

    ZHOU Xinghua; CHEN Yongqi

    2004-01-01

    Multibeam sonars can provide hydrographic-quality depth data and also hold the potential to provide calibrated measurements of seafloor acoustic backscattering strength. There has been much interest in utilizing backscatter and images from multibeam sonar for seabed type identification, and many results have been obtained. This paper presents a focused review of several main methods and recent developments in seafloor classification utilizing multibeam sonar data and/or images. These include power spectral analysis methods, texture analysis, traditional Bayesian classification theory and the most active neural network approaches.

  3. Classification of Gait Types Based on the Duty-factor

    DEFF Research Database (Denmark)

    Fihl, Preben; Moeslund, Thomas B.

    2007-01-01

    on the speed of the human, the camera setup etc. and hence a robust descriptor for gait classification. The duty-factor is basically a matter of measuring the ground support of the feet with respect to the stride. We estimate this by comparing the incoming silhouettes to a database of silhouettes with known...... ground support. Silhouettes are extracted using the Codebook method and represented using Shape Contexts. The matching with database silhouettes is done using the Hungarian method. While manually estimated duty-factors show a clear classification, the presented system contains misclassifications due...

  4. A Multi-Label Classification Approach Based on Correlations Among Labels

    Directory of Open Access Journals (Sweden)

    Raed Alazaidah

    2015-02-01

    Full Text Available Multi-label classification is concerned with learning from a set of instances that are associated with a set of labels; that is, an instance can be associated with multiple labels at the same time. This task occurs frequently in application areas such as text categorization, multimedia classification, bioinformatics, protein function classification and semantic scene classification. Current multi-label classification methods can be divided into two categories. The first, problem transformation methods, transform a multi-label classification problem into a single-label classification problem and then apply any single-label classifier to solve it. The second, algorithm adaptation methods, adapt an existing single-label classification algorithm to handle multi-label data. In this paper, we propose a multi-label classification approach based on correlations among labels that uses both problem transformation methods and algorithm adaptation methods. The approach begins by transforming the multi-label dataset into a single-label dataset using the least frequent label criterion, and then applies the PART algorithm to the transformed dataset. The output of the approach is a set of multi-label rules. The approach also tries to benefit from positive correlations among labels using the predictive Apriori algorithm. The proposed approach has been evaluated using two multi-label datasets (Emotions and Yeast) and three evaluation measures (Accuracy, Hamming Loss, and Harmonic Mean). The experiments showed that the proposed approach has fair accuracy in comparison to other related methods.
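
    The problem-transformation step, reducing each instance to its least frequent label, can be sketched as follows. This is only an illustration of that one step under stated assumptions: the function name `to_single_label`, the alphabetical tie-break, and the toy label sets are hypothetical, every instance is assumed to carry at least one label, and the PART and predictive Apriori stages are not shown.

```python
from collections import Counter


def to_single_label(label_sets):
    """Replace each instance's label set by its least frequent label.

    Frequencies are counted over the whole dataset. Ties are broken
    alphabetically, which is an assumption made for this sketch.
    """
    counts = Counter(label for labels in label_sets for label in labels)
    return [
        min(labels, key=lambda label: (counts[label], label))
        for labels in label_sets
    ]


# Hypothetical multi-label dataset (labels only, features omitted).
label_sets = [
    {"happy", "calm"},
    {"happy"},
    {"happy", "sad"},
    {"calm", "sad"},
]
single_labels = to_single_label(label_sets)
# "happy" occurs 3 times, "calm" and "sad" twice each, so the
# result is ["calm", "happy", "sad", "calm"].
```

    Choosing the rarest label keeps infrequent classes represented in the transformed dataset, at the cost of discarding the other labels, which is why the described approach recovers them afterwards through label correlations.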

  5. PIXEL VS OBJECT-BASED IMAGE CLASSIFICATION TECHNIQUES FOR LIDAR INTENSITY DATA

    Directory of Open Access Journals (Sweden)

    N. El-Ashmawy

    2012-09-01

    Full Text Available Light Detection and Ranging (LiDAR) systems are remote sensing techniques used mainly for terrain surface modelling. LiDAR sensors record the distance between the sensor and the targets (range data) and can also record the strength of the backscattered energy reflected from the targets (intensity data). LiDAR sensors use the near-infrared spectral range, which provides high separability in the energy reflected by the targets. This property motivates the use of LiDAR intensity data for land-cover classification. The goal of this paper is to investigate and evaluate different image classification techniques applied to LiDAR intensity data for land-cover classification. The two techniques proposed are: (a) a maximum likelihood classifier used as a pixel-based classification technique; and (b) image segmentation used as an object-based classification technique. A study area covering an urban district in Burnaby, British Columbia, Canada, is selected to test the classification techniques for extracting four feature classes from the LiDAR intensity data: buildings, roads and parking areas, trees, and low vegetation (grass) areas. In general, the results show that LiDAR intensity data can be used for land-cover classification. An overall accuracy of 63.5% can be achieved using the pixel-based classification technique. The overall accuracy improves to 68% using the object-based classification technique. Further research is underway to investigate different criteria for the segmentation process and to refine the design of the object-based classification algorithm.

  6. A method for cloud detection and opacity classification based on ground based sky imagery

    Directory of Open Access Journals (Sweden)

    M. S. Ghonima

    2012-11-01

    Full Text Available Digital images of the sky obtained using a total sky imager (TSI) are classified pixel by pixel into clear sky, optically thin and optically thick clouds. A new classification algorithm was developed that compares the pixel red-blue ratio (RBR) to the RBR of a clear sky library (CSL) generated from images captured on clear days. The difference, rather than the ratio, between pixel RBR and CSL RBR resulted in more accurate cloud classification. High correlation between TSI image RBR and aerosol optical depth (AOD) measured by an AERONET photometer was observed and motivated the addition of a haze correction factor (HCF) to the classification model to account for variations in AOD. Thresholds for clear and thick clouds were chosen based on a training image set and validated with a set of manually annotated images. Misclassifications of clear and thick clouds into the opposite category were less than 1%. Thin clouds were classified with an accuracy of 60%. Accurate cloud detection and opacity classification techniques will improve the accuracy of short-term solar power forecasting.
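
    The per-pixel decision described above, thresholding the difference between the pixel RBR and the clear-sky-library RBR, can be sketched as below. The function name `classify_pixel` and both threshold values are hypothetical placeholders rather than the values chosen in the paper, and the haze correction factor is left out because the abstract does not say how it enters the model.

```python
def classify_pixel(red, blue, clear_sky_rbr,
                   thin_threshold=0.05, thick_threshold=0.20):
    """Label one sky pixel as 'clear', 'thin', or 'thick'.

    red, blue: channel intensities of the pixel (blue must be non-zero).
    clear_sky_rbr: red-blue ratio expected at this pixel under a clear sky.
    The thresholds are illustrative assumptions, not values from the paper.
    """
    if blue == 0:
        raise ValueError("blue channel must be non-zero to form a ratio")
    pixel_rbr = red / blue
    # Use the difference, not the ratio, between pixel and clear-sky RBR.
    rbr_difference = pixel_rbr - clear_sky_rbr
    if rbr_difference < thin_threshold:
        return "clear"
    if rbr_difference < thick_threshold:
        return "thin"
    return "thick"


# Hypothetical pixels compared against a clear-sky RBR of 0.5.
labels = [
    classify_pixel(100, 200, 0.5),  # RBR 0.5, difference 0.0
    classify_pixel(120, 200, 0.5),  # RBR 0.6, difference about 0.1
    classify_pixel(180, 200, 0.5),  # RBR 0.9, difference about 0.4
]
```

    Clouds scatter red and blue light more evenly than a clear sky does, so a larger positive difference indicates an optically thicker cloud.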

  7. A method for cloud detection and opacity classification based on ground based sky imagery

    Directory of Open Access Journals (Sweden)

    M. S. Ghonima

    2012-07-01

    Full Text Available Digital images of the sky obtained using a total sky imager (TSI) are classified pixel by pixel into clear sky, optically thin and optically thick clouds. A new classification algorithm was developed that compares the pixel red-blue ratio (RBR) to the RBR of a clear sky library (CSL) generated from images captured on clear days. The difference, rather than the ratio, between pixel RBR and CSL RBR resulted in more accurate cloud classification. High correlation between TSI image RBR and aerosol optical depth (AOD) measured by an AERONET photometer was observed and motivated the addition of a haze correction factor (HCF) to the classification model to account for variations in AOD. Thresholds for clear and thick clouds were chosen based on a training image set and validated with a set of manually annotated images. Misclassifications of clear and thick clouds into the opposite category were less than 1%. Thin clouds were classified with an accuracy of 60%. Accurate cloud detection and opacity classification techniques will improve the accuracy of short-term solar power forecasting.

  8. Segmentation-Based PolSAR Image Classification Using Visual Features: RHLBP and Color Features

    Directory of Open Access Journals (Sweden)

    Jian Cheng

    2015-05-01

    Full Text Available A segmentation-based fully-polarimetric synthetic aperture radar (PolSAR) image classification method that incorporates texture features and color features is designed and implemented. This method is based on a framework that jointly uses statistical region merging (SRM) for segmentation and a support vector machine (SVM) for classification. In the segmentation step, we propose an improved local binary pattern (LBP) operator named the regional homogeneity local binary pattern (RHLBP) to guarantee the regional homogeneity in PolSAR images. In the classification step, the color features extracted from false color images are applied to improve the classification accuracy. The RHLBP operator and color features can provide discriminative information to separate those pixels and regions with similar polarimetric features, which are from different classes. Extensive experimental comparison results with conventional methods on L-band PolSAR data demonstrate the effectiveness of our proposed method for PolSAR image classification.

  9. Maximum-margin based representation learning from multiple atlases for Alzheimer's disease classification.

    Science.gov (United States)

    Min, Rui; Cheng, Jian; Price, True; Wu, Guorong; Shen, Dinggang

    2014-01-01

    In order to establish the correspondences between different brains for comparison, spatial normalization based morphometric measurements have been widely used in the analysis of Alzheimer's disease (AD). In the literature, different subjects are often compared in one atlas space, which may be insufficient in revealing complex brain changes. In this paper, instead of deploying one atlas for feature extraction and classification, we propose a maximum-margin based representation learning (MMRL) method to learn the optimal representation from multiple atlases. Unlike traditional methods that perform the representation learning separately from the classification, we propose to learn the new representation jointly with the classification model, which is more powerful in discriminating AD patients from normal controls (NC). We evaluated the proposed method on the ADNI database, and achieved 90.69% for AD/NC classification and 73.69% for p-MCI/s-MCI classification.

  10. A kernel-based multivariate feature selection method for microarray data classification.

    Directory of Open Access Journals (Sweden)

    Shiquan Sun

    Full Text Available High dimensionality and small sample sizes, and their inherent risk of overfitting, pose great challenges for constructing efficient classifiers in microarray data classification. Therefore, a feature selection technique should be applied prior to data classification to enhance prediction performance. In general, filter methods can be considered as principal or auxiliary selection mechanisms because of their simplicity, scalability, and low computational complexity. However, a series of trivial examples show that filter methods result in less accurate performance because they ignore the dependencies among features. Although a few publications have devoted their attention to revealing the relationships among features by multivariate-based methods, these methods describe relationships among features only by linear methods, and simple linear combination relationships restrict the improvement in performance. In this paper, we used a kernel method to discover inherent nonlinear correlations among features as well as between features and the target. Moreover, the number of orthogonal components was determined by kernel Fisher's linear discriminant analysis (FLDA) in a self-adaptive manner rather than by manual parameter settings. To demonstrate the effectiveness of our method, we performed several experiments and compared the results of our method with those of other competitive multivariate-based feature selectors. In our comparison, we used two classifiers (support vector machine and k-nearest neighbor) on two groups of datasets, namely two-class and multi-class datasets. Experimental results demonstrate that the performance of our method is better than that of the others, especially on three hard-to-classify datasets, namely Wang's Breast Cancer, Gordon's Lung Adenocarcinoma and Pomeroy's Medulloblastoma.

  11. Topic Modelling for Object-Based Classification of Vhr Satellite Images Based on Multiscale Segmentations

    Science.gov (United States)

    Shen, Li; Wu, Linmei; Li, Zhipeng

    2016-06-01

    Multiscale segmentation is a key prerequisite step for object-based classification methods. However, it is often not possible to determine a single optimal scale for the image to be classified, because in many cases different geo-objects and even an identical geo-object may appear at different scales in one image. In this paper, an object-based classification method based on multiscale segmentation results in the framework of topic modelling is proposed to classify VHR satellite images in an entirely unsupervised fashion. In the stage of topic modelling, grayscale histogram distributions for each geo-object class and each segment are learned in an unsupervised manner from multiscale segments. In the stage of classification, each segment is allocated a geo-object class label by comparing the similarity between the grayscale histogram distributions of each segment and each geo-object class. Experimental results show that the proposed method can perform better than the traditional methods based on topic modelling.

  12. Open source, web-based machine-learning assisted classification system

    OpenAIRE

    Consarnau Pallarés, Mireia Roser

    2016-01-01

    The aim of this article is to provide a design overview of a web-based, machine-learning-assisted, multi-user classification system. The design is based on open source standards, both for the multi-user environment, written in PHP using the Laravel framework, and for the Python-based machine learning toolkit, Scikit-Learn. The advantage of the proposed system is that it does not require domain-specific knowledge or programming skills. Machine learning classification tasks are done in the background...

  13. Hydrologic-Process-Based Soil Texture Classifications for Improved Visualization of Landscape Function.

    Directory of Open Access Journals (Sweden)

    Derek G Groenendyk

    Full Text Available Soils lie at the interface between the atmosphere and the subsurface and are a key component that controls ecosystem services, food production, and many other processes at the Earth's surface. There is a long-established convention for identifying and mapping soils by texture. These readily available, georeferenced soil maps and databases are used widely in environmental sciences. Here, we show that these traditional soil classifications can be inappropriate, contributing to bias and uncertainty in applications from slope stability to water resource management. We suggest a new approach to soil classification, with a detailed example from the science of hydrology. Hydrologic simulations based on common meteorological conditions were performed using HYDRUS-1D, spanning textures identified by the United States Department of Agriculture soil texture triangle. We consider these common conditions to be: drainage from saturation, infiltration onto a drained soil, and combined infiltration and drainage events. Using a k-means clustering algorithm, we created soil classifications based on the modeled hydrologic responses of these soils. The hydrologic-process-based classifications were compared to those based on soil texture and a single hydraulic property, Ks. Differences in classifications based on hydrologic response versus soil texture demonstrate that traditional soil texture classification is a poor predictor of hydrologic response. We then developed a QGIS plugin to construct soil maps combining a classification with georeferenced soil data from the Natural Resource Conservation Service. The spatial patterns of hydrologic response were more immediately informative, much simpler, and less ambiguous, for use in applications ranging from trafficability to irrigation management to flood control. The ease with which hydrologic-process-based classifications can be made, along with the improved quantitative predictions of soil responses and visualization

  14. Hydrologic-Process-Based Soil Texture Classifications for Improved Visualization of Landscape Function.

    Science.gov (United States)

    Groenendyk, Derek G; Ferré, Ty P A; Thorp, Kelly R; Rice, Amy K

    2015-01-01

    Soils lie at the interface between the atmosphere and the subsurface and are a key component that controls ecosystem services, food production, and many other processes at the Earth's surface. There is a long-established convention for identifying and mapping soils by texture. These readily available, georeferenced soil maps and databases are used widely in environmental sciences. Here, we show that these traditional soil classifications can be inappropriate, contributing to bias and uncertainty in applications from slope stability to water resource management. We suggest a new approach to soil classification, with a detailed example from the science of hydrology. Hydrologic simulations based on common meteorological conditions were performed using HYDRUS-1D, spanning textures identified by the United States Department of Agriculture soil texture triangle. We consider these common conditions to be: drainage from saturation, infiltration onto a drained soil, and combined infiltration and drainage events. Using a k-means clustering algorithm, we created soil classifications based on the modeled hydrologic responses of these soils. The hydrologic-process-based classifications were compared to those based on soil texture and a single hydraulic property, Ks. Differences in classifications based on hydrologic response versus soil texture demonstrate that traditional soil texture classification is a poor predictor of hydrologic response. We then developed a QGIS plugin to construct soil maps combining a classification with georeferenced soil data from the Natural Resource Conservation Service. The spatial patterns of hydrologic response were more immediately informative, much simpler, and less ambiguous, for use in applications ranging from trafficability to irrigation management to flood control. The ease with which hydrologic-process-based classifications can be made, along with the improved quantitative predictions of soil responses and visualization of landscape

  15. Emotion of Physiological Signals Classification Based on TS Feature Selection

    Institute of Scientific and Technical Information of China (English)

    Wang Yujing; Mo Jianlin

    2015-01-01

    This paper proposes TS-MLP, a method for emotion recognition from physiological signals. Tabu search (TS) selects features from the physiological signals, and a multilayer perceptron (MLP) classifies the emotion. Simulation shows that the method achieves good emotion classification performance.

  16. Laguerre Kernels-Based SVM for Image Classification

    Directory of Open Access Journals (Sweden)

    Ashraf Afifi

    2014-01-01

    Support vector machines (SVMs) have been promising methods for classification and regression analysis because of their solid mathematical foundations, which convey several salient properties that other methods hardly provide. However, the performance of SVMs is very sensitive to how the kernel function is selected; the challenge is to choose the kernel function for accurate data classification. In this paper, we introduce a set of new kernel functions derived from the generalized Laguerre polynomials. The proposed kernels could improve the classification accuracy of SVMs for both linear and nonlinear data sets. The proposed kernel functions satisfy Mercer’s condition and orthogonality properties, which are important and useful in some applications when the support vector number is needed, as in feature selection. The performance of the generalized Laguerre kernels is evaluated in comparison with the existing kernels. It was found that the choice of the kernel function and the values of the parameters for that kernel are critical for a given amount of data. The proposed kernels give good classification accuracy in nearly all the data sets, especially those of high dimensions.
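    The abstract does not give the kernel formula, so the following is only a sketch of the ingredients. `gen_laguerre` evaluates the generalized Laguerre polynomials with the standard three-term recurrence; `laguerre_kernel` is an assumed, illustrative construction (a per-dimension inner product of polynomial feature vectors, multiplied across dimensions), not the kernel proposed in the paper. Because it is built from explicit feature vectors, it is symmetric and positive semi-definite.

```python
def gen_laguerre(n, alpha, x):
    """Return [L_0^alpha(x), ..., L_n^alpha(x)] via the three-term recurrence."""
    values = [1.0]
    if n >= 1:
        values.append(1.0 + alpha - x)
    for k in range(1, n):
        # (k + 1) L_{k+1} = (2k + 1 + alpha - x) L_k - (k + alpha) L_{k-1}
        nxt = ((2 * k + 1 + alpha - x) * values[k]
               - (k + alpha) * values[k - 1]) / (k + 1)
        values.append(nxt)
    return values

def laguerre_kernel(x, y, n=3, alpha=0.0):
    """Illustrative kernel (an assumption): product over dimensions of
    sum_i L_i^alpha(x_d) * L_i^alpha(y_d)."""
    result = 1.0
    for xd, yd in zip(x, y):
        lx = gen_laguerre(n, alpha, xd)
        ly = gen_laguerre(n, alpha, yd)
        result *= sum(a * b for a, b in zip(lx, ly))
    return result

print(gen_laguerre(2, 0.0, 1.0))
print(laguerre_kernel([0.5, 1.0], [2.0, 0.1]))
```

    The order `n` and the parameter `alpha` play the role of the kernel parameters whose choice the abstract describes as critical.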

  17. A Classification System for Hospital-Based Infection Outbreaks

    Directory of Open Access Journals (Sweden)

    Paul S. Ganney

    2010-01-01

    Outbreaks of infection within semi-closed environments such as hospitals, whether inherent in the environment (such as Clostridium difficile (C. diff) or methicillin-resistant Staphylococcus aureus (MRSA)) or imported from the wider community (such as Norwalk-like viruses (NLVs)), are difficult to manage. As part of our work on modelling such outbreaks, we have developed a classification system to describe the impact of a particular outbreak upon an organization. This classification system may then be used in comparing appropriate computer models to real outbreaks, as well as in comparing different real outbreaks in, for example, the comparison of differing management and containment techniques and strategies. Data from NLV outbreaks in the Hull and East Yorkshire Hospitals NHS Trust (the Trust) over several previous years are analysed and classified, both for infection within staff (where the end of infection date may not be known) and within patients (where it generally is known). A classification system consisting of seven elements is described, along with a goodness-of-fit method for comparing a new classification to previously known ones, for use in evaluating a simulation against history and thereby determining how ‘realistic’ (or otherwise) it is.

  18. Statistical Redundancy Testing for Improved Gene Selection in Cancer Classification Using Microarray Data

    Directory of Open Access Journals (Sweden)

    J. Sunil Rao

    2007-01-01

    In gene selection for cancer classification using microarray data, we define an eigenvalue-ratio statistic to measure a gene’s contribution to the joint discriminability when this gene is included in a set of genes. Based on this eigenvalue-ratio statistic, we define a novel hypothesis test for gene statistical redundancy and propose two gene selection methods. Simulation studies illustrate the agreement between statistical redundancy testing and gene selection methods. Real data examples show that the proposed gene selection methods can select a compact gene subset that not only can be used to build high-quality cancer classifiers but also shows biological relevance.

  19. Optimal query-based relevance feedback in medical image retrieval using score fusion-based classification.

    Science.gov (United States)

    Behnam, Mohammad; Pourghassem, Hossein

    2015-04-01

    In this paper, a new content-based medical image retrieval (CBMIR) framework using an effective classification method and a novel relevance feedback (RF) approach is proposed. For a large-scale database with a diverse collection of modalities, query image classification is necessary for two reasons: first, it reduces the computational complexity; second, it increases the influence of data fusion by removing unimportant data and focusing on the more valuable information. Hence, we find the probability distribution of classes in the database using a Gaussian mixture model (GMM) for each feature descriptor and then, by fusing the scores obtained from the dependency probabilities, identify the most relevant clusters for a given query. Afterwards, the visual similarity between the query image and the images in the relevant clusters is calculated. This method is performed separately on all feature descriptors, and the results are then fused using a feature similarity ranking level fusion algorithm. At the RF level, we propose a new approach to find the optimal queries based on relevant images. The main idea is based on density function estimation of positive images and a strategy of moving toward the aggregation of the estimated density function. The proposed framework has been evaluated on the ImageCLEF 2005 database, consisting of 10,000 medical X-ray images of 57 semantic classes. The experimental results show that, compared with existing CBMIR systems, our framework obtains acceptable performance both in image classification and in image retrieval by RF. PMID:25246167

  20. Prognostic classification index in Iranian colorectal cancer patients: Survival tree analysis

    OpenAIRE

    Malehi, Amal Saki; Rahim, Fakher

    2016-01-01

    Aims: The aim of this study was to determine the prognostic index for separating homogenous subgroups in colorectal cancer (CRC) patients based on clinicopathological characteristics using survival tree analysis. Methods: The current study was conducted at the Research Center of Gastroenterology and Liver Disease, Shahid Beheshti Medical University in Tehran, between January 2004 and January 2009. A total of 739 patients who already have been diagnosed with CRC based on pathologic report were...

  1. Drug related webpages classification using images and text information based on multi-kernel learning

    Science.gov (United States)

    Hu, Ruiguang; Xiao, Liping; Zheng, Wenjuan

    2015-12-01

    In this paper, multi-kernel learning (MKL) is used for drug-related webpage classification. First, body text and image-label text are extracted through HTML parsing, and valid images are chosen by the FOCARSS algorithm. Second, a text-based BOW model is used to generate the text representation, and an image-based BOW model is used to generate the image representation. Last, the text and image representations are fused with several methods. Experimental results demonstrate that the classification accuracy of MKL is higher than that of all other fusion methods at the decision level and feature level, and much higher than the accuracy of single-modal classification.

  2. [Classification of cell-based medicinal products and legal implications: An overview and an update].

    Science.gov (United States)

    Scherer, Jürgen; Flory, Egbert

    2015-11-01

    In general, cell-based medicinal products do not represent a uniform class of medicinal products, but instead comprise medicinal products with diverse regulatory classification as advanced-therapy medicinal products (ATMP), medicinal products (MP), tissue preparations, or blood products. Due to the legal and scientific consequences of the development and approval of MPs, classification should be clarified as early as possible. This paper describes the legal situation in Germany and highlights specific criteria and concepts for classification, with a focus on, but not limited to, ATMPs and non-ATMPs. Depending on the stage of product development and the specific application submitted to a competent authority, legally binding classification is done by the German Länder Authorities, Paul-Ehrlich-Institut, or European Medicines Agency. On request by the applicants, the Committee for Advanced Therapies may issue scientific recommendations for classification.

  3. Scene Classification of Remote Sensing Image Based on Multi-scale Feature and Deep Neural Network

    Directory of Open Access Journals (Sweden)

    XU Suhui

    2016-07-01

    Aiming at the low precision of remote sensing image scene classification owing to small sample sizes, a new classification approach is proposed based on a multi-scale deep convolutional neural network (MS-DCNN), which is composed of the nonsubsampled Contourlet transform (NSCT), a deep convolutional neural network (DCNN), and a multiple-kernel support vector machine (MKSVM). Firstly, multi-scale decomposition of the remote sensing image is conducted via NSCT. Secondly, the decomposed high-frequency and low-frequency subbands are used to train the DCNN to obtain image features at different scales. Finally, MKSVM is adopted to integrate the multi-scale image features and implement remote sensing image scene classification. The experimental results on standard image classification data sets indicate that the proposed approach obtains good classification performance because it combines the complementary strengths of the low-frequency and high-frequency subbands in recognizing different scenes.

  4. Radial Basis Function Networks Applied in Bacterial Classification Based on MALDI-TOF-MS

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    Radial basis function networks were applied to bacterial classification based on matrix-assisted laser desorption/ionization time-of-flight mass spectrometric (MALDI-TOF-MS) data. The classification of bacteria cultured at different times was discussed, and the effect of the network parameters on the classification was investigated. The cross-validation method was used to test the trained networks. The classification accuracy for the different bacteria investigated varied over a wide range, from 61.5% to 92.8%. Owing to the complexity of biological effects in bacterial growth, more rigid control of bacterial culture conditions seems to be a critical factor for improving the accuracy of bacterial classification.

  5. Three-Class EEG-Based Motor Imagery Classification Using Phase-Space Reconstruction Technique

    Science.gov (United States)

    Djemal, Ridha; Bazyed, Ayad G.; Belwafi, Kais; Gannouni, Sofien; Kaaniche, Walid

    2016-01-01

    Over the last few decades, brain signals have been significantly exploited for brain-computer interface (BCI) applications. In this paper, we study the extraction of features using event-related desynchronization/synchronization techniques to improve the classification accuracy for three-class motor imagery (MI) BCI. The classification approach is based on combining the features of the phase and amplitude of the brain signals using fast Fourier transform (FFT) and autoregressive (AR) modeling of the reconstructed phase space as well as the modification of the BCI parameters (trial length, trial frequency band, classification method). By utilizing sequential forward floating selection (SFFS) and a multi-class linear discriminant analysis (LDA), we obtained classification accuracies of 86.06% and 93% for two BCI competition datasets, which are superior to results from previous studies. PMID:27563927

  6. Development and validation of a microRNA based diagnostic assay for primary tumor site classification of liver core biopsies.

    Science.gov (United States)

    Perell, Katharina; Vincent, Martin; Vainer, Ben; Petersen, Bodil Laub; Federspiel, Birgitte; Møller, Anne Kirstine; Madsen, Mette; Hansen, Niels Richard; Friis-Hansen, Lennart; Nielsen, Finn Cilius; Daugaard, Gedske

    2015-01-01

    Identification of the primary tumor site in patients with metastatic cancer is clinically important, but remains a challenge. Hence, efforts have been made towards establishing new diagnostic tools. Molecular profiling is a promising diagnostic approach, but tissue heterogeneity and inadequacy may negatively affect the accuracy and usability of molecular classifiers. We have developed and validated a microRNA-based classifier that predicts the primary tumor site of liver biopsies containing a limited number of tumor cells. Concurrently, we explored the influence of surrounding normal tissue on classification. MicroRNA profiling was performed using quantitative real-time PCR on formalin-fixed paraffin-embedded samples. A total of 278 primary tumors and liver metastases, representing nine primary tumor classes, as well as normal liver samples, were used as a training set. A statistical model was applied to adjust for normal liver tissue contamination. Performance was estimated by cross-validation, followed by independent validation on 55 liver core biopsies with a tumor content as low as 10%. A microRNA classifier developed using the statistical contamination model showed an overall classification accuracy of 74.5% upon independent validation. Two-thirds of the samples were classified with high confidence, with an accuracy of 92% on high-confidence predictions. A classifier trained without adjusting for liver tissue contamination showed a classification accuracy of 38.2%. Our results indicate that surrounding normal tissue from the biopsy site may critically influence molecular classification. A significant improvement in classification accuracy was obtained when the influence of normal tissue was limited by application of a statistical contamination model. PMID:25131495

  7. Spectral Collaborative Representation based Classification for Hand Gestures recognition on Electromyography Signals

    OpenAIRE

    Boyali, Ali

    2015-01-01

    In this study, we introduce a novel variant and application of the Collaborative Representation based Classification in spectral domain for recognition of the hand gestures using the raw surface Electromyography signals. The intuitive use of spectral features are explained via circulant matrices. The proposed Spectral Collaborative Representation based Classification (SCRC) is able to recognize gestures with higher levels of accuracy for a fairly rich gesture set. The worst recognition result...

  8. Belief Function Based Decision Fusion for Decentralized Target Classification in Wireless Sensor Networks

    OpenAIRE

    Wenyu Zhang; Zhenjiang Zhang

    2015-01-01

    Decision fusion in sensor networks enables sensors to improve classification accuracy while reducing the energy consumption and bandwidth demand for data transmission. In this paper, we focus on the decentralized multi-class classification fusion problem in wireless sensor networks (WSNs) and a new simple but effective decision fusion rule based on belief function theory is proposed. Unlike existing belief function based decision fusion schemes, the proposed approach is compatible with any ty...

  9. A Spectral Signature Shape-Based Algorithm for Landsat Image Classification

    Directory of Open Access Journals (Sweden)

    Yuanyuan Chen

    2016-08-01

    Land-cover datasets are crucial for earth system modeling and human-nature interaction research at local, regional and global scales. They can be obtained from remotely sensed data using image classification methods. However, most image classification methods have focused on spectral values, while the spectral curve shape has seldom been used because it is difficult to quantify. This study presents a classification method based on the observation that the spectral curve is composed of segments and certain extreme values. The presented classification method quantifies the spectral curve shape and makes full use of the spectral shape differences among land covers to classify remotely sensed images. Using this method, classification maps from TM (Thematic Mapper) data were obtained with overall accuracies of 0.834 and 0.854 for two respective test areas. The approach presented in this paper, which differs from previous image classification methods that were mostly concerned with spectral "value" similarity characteristics, emphasizes the "shape" similarity characteristics of the spectral curve. Moreover, this study will be helpful for classification research on hyperspectral and multi-temporal images.
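    One simple way to turn curve shape into something comparable is to code each pair of adjacent bands as rising, falling, or flat. The sketch below does that and assigns a pixel to the reference class whose code matches best. The encoding, the tolerance `tol`, the matching score, and the six-band reflectance values are all assumptions for illustration; the abstract does not describe the paper's actual quantification.

```python
def shape_code(spectrum, tol=0.0):
    """Encode a curve as +1 (rising), -1 (falling) or 0 (flat) per adjacent band pair."""
    code = []
    for a, b in zip(spectrum, spectrum[1:]):
        diff = b - a
        code.append(0 if abs(diff) <= tol else (1 if diff > 0 else -1))
    return code

def shape_similarity(s1, s2, tol=0.0):
    """Fraction of band-to-band segments whose direction agrees."""
    c1, c2 = shape_code(s1, tol), shape_code(s2, tol)
    matches = sum(1 for a, b in zip(c1, c2) if a == b)
    return matches / len(c1)

def classify(pixel, references, tol=0.0):
    """Pick the reference class whose curve shape is most similar to the pixel's."""
    return max(references, key=lambda name: shape_similarity(pixel, references[name], tol))

# Made-up six-band reflectance curves (hypothetical values).
references = {
    "vegetation": [0.04, 0.07, 0.05, 0.45, 0.25, 0.12],
    "water": [0.08, 0.06, 0.04, 0.02, 0.01, 0.01],
}
pixel = [0.05, 0.09, 0.06, 0.50, 0.30, 0.15]
print(classify(pixel, references, tol=0.005))
```

    Note that the pixel is brighter than the vegetation reference in every band, so a value-based distance would penalise it, while the shape code is unchanged.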

  10. Machine Fault Classification Based on Local Discriminant Bases and Locality Preserving Projections

    Directory of Open Access Journals (Sweden)

    Qingbo He

    2014-01-01

    Machine fault classification is an important task for intelligent identification of the health patterns of a mechanical system being monitored. Effective feature extraction from vibration data is critical to reliable classification of machine faults of different types and severities. In this paper, a new method is proposed to acquire sensitive features through a combination of local discriminant bases (LDB) and locality preserving projections (LPP). In the method, the LDB is employed to select the optimal wavelet packet (WP) nodes that exhibit high discrimination from a redundant WP library of the wavelet packet transform (WPT). Considering that the discriminatory features obtained on these selected nodes characterize the class pattern with different sensitivity, the LPP is then applied to mine the inherent class pattern feature embedded in the raw features. The proposed feature extraction method combines the merits of LDB and LPP and extracts the inherent pattern structure embedded in the discriminatory feature values of samples in different classes. Therefore, the proposed feature considers not only the discriminatory features themselves but also the dynamic sensitive class pattern structure. The effectiveness of the proposed feature is verified by case studies on vibration data-based classification of bearing fault types and severities.

  11. State-Based Models for Light Curve Classification

    Science.gov (United States)

    Becker, A.

    I discuss here the application of continuous time autoregressive models to the characterization of astrophysical variability. These types of models are general enough to represent many classes of variability, and descriptive enough to provide features for lightcurve classification. Importantly, the features of these models may be interpreted in terms of the power spectrum of the lightcurve, enabling constraints on characteristic timescales and periodicity. These models may be extended to include vector-valued inputs, raising the prospect of a fully general modeling and classification environment that uses multi-passband inputs to create a single phenomenological model. These types of spectral-temporal models are an important extension of extant techniques, and necessary in the upcoming eras of Gaia and LSST.

  12. Knowledge Based Pipeline Network Classification and Recognition Method of Maps

    Institute of Scientific and Technical Information of China (English)

    Liu Tongyu; Gu Shusheng

    2001-01-01

    Map recognition is an essential data input means of Geographic Information System (GIS). How to solve the problems in the procedure, such as recognition of maps with crisscross pipeline networks, classification of buildings and roads, and processing of connected text, is a critical step for keeping GIS development at high speed. In this paper, a new recognition method for pipeline maps is presented, and some common patterns of pipeline connection and component labels are established. Through pattern matching, pipelines and component labels are recognized and peeled off from maps. After this step, maps consist simply of buildings and roads, which are recognized and classified with a fuzzy classification method. In addition, the Double Sides Scan (DSS) technique is also described, through which the effect of connected text can be eliminated.

  13. Power Disturbances Classification Using S-Transform Based GA-PNN

    Science.gov (United States)

    Manimala, K.; Selvi, K.

    2015-09-01

    The significance of detecting and classifying power quality events that disturb the voltage and/or current waveforms in electrical power distribution networks is well known. However, despite a large number of research reports in this area, the selection of proper parameters for specific classifiers has so far not been explored. Parameter selection is very important for successful modelling of the input-output relationship in a function approximation model. In this study, a probabilistic neural network (PNN) is used as a function approximation tool for power disturbance classification, and a genetic algorithm (GA) is utilised to optimise the smoothing parameter of the PNN. The important features extracted from the raw power disturbance signal using the S-Transform are given to the PNN for classification. The choice of smoothing parameter for the PNN classifier significantly affects the classification accuracy. Hence, GA-based parameter optimisation is done to ensure good classification accuracy by selecting a suitable parameter for the PNN classifier. Testing results show that the proposed S-Transform based GA-PNN model has better classification ability than classifiers based on the conventional grid search method for parameter selection. Noisy and practical signals are considered in the classification process to show the effectiveness of the proposed method in comparison with existing methods.
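    To show where the smoothing parameter enters, here is a minimal sketch of a PNN as a Gaussian Parzen-window classifier, with a leave-one-out accuracy that a tuner could use as its fitness. The two-feature training set, the class names, and the candidate sigma values are assumptions for illustration. The loop over candidates is a plain search standing in for the paper's genetic algorithm, and no S-Transform feature extraction is attempted here.

```python
import math

def pnn_predict(x, train, sigma):
    """train: list of (feature_vector, label). Returns the label whose
    class-averaged Gaussian kernel response to x is largest."""
    scores, counts = {}, {}
    for feats, label in train:
        d2 = sum((a - b) ** 2 for a, b in zip(x, feats))
        scores[label] = scores.get(label, 0.0) + math.exp(-d2 / (2.0 * sigma ** 2))
        counts[label] = counts.get(label, 0) + 1
    return max(scores, key=lambda lab: scores[lab] / counts[lab])

def loo_accuracy(train, sigma):
    """Leave-one-out accuracy for a given smoothing parameter."""
    hits = 0
    for i, (feats, label) in enumerate(train):
        rest = train[:i] + train[i + 1:]
        hits += pnn_predict(feats, rest, sigma) == label
    return hits / len(train)

# Hypothetical two-feature training patterns (made-up values).
train = [
    ((0.20, 0.10), "sag"), ((0.25, 0.15), "sag"), ((0.30, 0.10), "sag"),
    ((0.80, 0.90), "swell"), ((0.85, 0.80), "swell"), ((0.90, 0.95), "swell"),
]
for sigma in (0.05, 0.1, 0.5):  # simple candidate list, not a GA
    print(sigma, loo_accuracy(train, sigma))
```

    A very small sigma makes the classifier behave like nearest-neighbour matching and a very large one blurs the classes together, which is why the abstract treats its choice as decisive.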

  14. Dihedral-based segment identification and classification of biopolymers I: proteins.

    Science.gov (United States)

    Nagy, Gabor; Oostenbrink, Chris

    2014-01-27

    A new structure classification scheme for biopolymers is introduced, which is solely based on main-chain dihedral angles. It is shown that by dividing a biopolymer into segments containing two central residues, a local classification can be performed. The method is referred to as DISICL, short for Dihedral-based Segment Identification and Classification. Compared to other popular secondary structure classification programs, DISICL is more detailed as it offers 18 distinct structural classes, which may be simplified into a classification in terms of seven more general classes. It was designed with an eye to analyzing subtle structural changes as observed in molecular dynamics simulations of biomolecular systems. Here, the DISICL algorithm is used to classify two databases of protein structures, jointly containing more than 10 million segments. The data is compared to two alternative approaches in terms of the amount of classified residues, average occurrence and length of structural elements, and pairwise matches of the classifications by the different programs. In an accompanying paper (Nagy, G.; Oostenbrink, C. Dihedral-based segment identification and classification of biopolymers II: Polynucleotides. J. Chem. Inf. Model. 2013, DOI: 10.1021/ci400542n), the analysis of polynucleotides is described and applied. Overall, DISICL represents a potentially useful tool to analyze biopolymer structures at a high level of detail.

  15. Dihedral-Based Segment Identification and Classification of Biopolymers I: Proteins

    Science.gov (United States)

    2013-01-01

    A new structure classification scheme for biopolymers is introduced, which is solely based on main-chain dihedral angles. It is shown that by dividing a biopolymer into segments containing two central residues, a local classification can be performed. The method is referred to as DISICL, short for Dihedral-based Segment Identification and Classification. Compared to other popular secondary structure classification programs, DISICL is more detailed as it offers 18 distinct structural classes, which may be simplified into a classification in terms of seven more general classes. It was designed with an eye to analyzing subtle structural changes as observed in molecular dynamics simulations of biomolecular systems. Here, the DISICL algorithm is used to classify two databases of protein structures, jointly containing more than 10 million segments. The data is compared to two alternative approaches in terms of the amount of classified residues, average occurrence and length of structural elements, and pairwise matches of the classifications by the different programs. In an accompanying paper (Nagy, G.; Oostenbrink, C. Dihedral-based segment identification and classification of biopolymers II: Polynucleotides. J. Chem. Inf. Model. 2013, DOI: 10.1021/ci400542n), the analysis of polynucleotides is described and applied. Overall, DISICL represents a potentially useful tool to analyze biopolymer structures at a high level of detail. PMID:24364820

  16. Carcinoma de mama: novos conceitos na classificação Breast cancer: new concepts in classification

    Directory of Open Access Journals (Sweden)

    Daniella Serafin Couto Vieira

    2008-01-01

    Breast carcinoma is the most common malignant neoplasm in women. Molecular studies of breast carcinoma, based on the identification of gene expression profiles by means of cDNA microarrays, have made it possible to define at least five distinct subgroups: luminal A, luminal B, HER2 overexpression, basal, and normal breast-like. The tissue microarray (TMA) technique, first described in 1998, has made it possible to study the protein expression profiles of different neoplasms in multiple carcinoma samples. In breast carcinoma, TMAs have been used to validate the findings of preliminary studies, thereby identifying the new phenotypic subtypes of breast carcinoma. Among the classically described subtypes, the basal group is one of the most intriguing tumor subtypes and is frequently associated with a worse prognosis and the absence of defined therapeutic targets. The histopathological classification of breast carcinoma has poor predictive value. Therefore, combining histological diagnosis with molecular techniques in anatomic pathology laboratories, by means of immunohistochemical study, can determine the molecular profile of breast carcinoma, with the aim of improving therapeutic response. This study aimed to summarize the most recent knowledge underlying the new concepts in the classification of breast carcinoma. Breast cancer is the principal cause of death from cancer in women. Molecular studies of breast cancer, based on molecular profiling through cDNA microarrays, have allowed the definition of at least five distinct subgroups: luminal A, luminal B, HER2-overexpressing, basal, and "normal breast-like". The technique of tissue microarrays (TMA), described for the first time in 1998, allows to study, in some samples of breast cancer, distinguished by differences in their gene expression patterns, which provide a distinctive molecular portrait for each tumor

  17. Entropy-based Classification of 'Retweeting' Activity on Twitter

    OpenAIRE

    Ghosh, Rumi; Surachawala, Tawan; Lerman, Kristina

    2011-01-01

    Twitter is used for a variety of reasons, including information dissemination, marketing, political organizing, spreading propaganda, spamming, promotion, conversations, and so on. Characterizing these activities and categorizing associated user-generated content is a challenging task. We present an information-theoretic approach to classification of user activity on Twitter. We focus on tweets that contain embedded URLs and study their collective `retweeting' dynamics. We identify two feat...
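    The abstract is cut off before it names its features, so the sketch below illustrates only the generic building block behind an entropy-based approach: the Shannon entropy of an empirical distribution. What is counted here (hypothetical user IDs of the accounts that retweeted a URL) is an assumption made for illustration and is not taken from the record.

```python
import math
from collections import Counter

def shannon_entropy(events):
    """Shannon entropy, in bits, of the empirical distribution of `events`."""
    counts = Counter(events)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical retweeters of two URLs.
many_users = ["u1", "u2", "u3", "u4"]  # four different accounts
one_user = ["u1", "u1", "u1", "u1"]    # one account repeating itself
print(shannon_entropy(many_users), shannon_entropy(one_user))
```

    Low entropy means the activity is concentrated in a few sources and high entropy means it is spread widely, which is the kind of contrast an information-theoretic classifier can exploit.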

  18. Classification and identification of amino acids based on THz spectroscopy

    Science.gov (United States)

    Huang, Ping J.; Ma, Ye H.; Li, Xian; Hou, Di B.; Cai, Jin H.; Zhang, Guang X.

    2015-11-01

    Amino acids are important nutrients for life, and many of them have several isomers, but only L-type amino acids can be absorbed by the body as nutrients. It is therefore worthwhile to classify and identify amino acids accurately. In this paper, terahertz time-domain spectroscopy (THz-TDS) was used to measure the absorption spectra of isomers of various amino acids, and their spectral characteristics were analyzed and compared. Results show that not all isomers of amino acids have unique spectral characteristics, which makes classification and identification difficult. To solve this problem, partial least squares discriminant analysis (PLS-DA) was first performed to extract principal components of the THz spectra and classify the amino acids. Moreover, variable selection (VS) was employed to optimize the spectral interval used for feature extraction and improve the analysis. As a result, the optimal classification model was determined, and most samples could be accurately classified. Second, for each class of amino acids, PLS-DA combined with VS was also applied to identify isomers. This work provides a suggestion for material classification and identification with THz spectroscopy.

  19. Magnetic nanoparticle-based cancer therapy

    Institute of Scientific and Technical Information of China (English)

    Yu Jing; Huang Dong-Yan; Muhammad Zubair Yousaf; Hou Yang-Long; Gao Song

    2013-01-01

    Nanoparticles (NPs) with easily modified surfaces have been playing an important role in biomedicine. As cancer is one of the major causes of death, tremendous efforts have been devoted to advance the methods of cancer diagnosis and therapy. Recently, magnetic nanoparticles (MNPs) that are responsive to a magnetic field have shown great promise in cancer therapy. Compared with traditional cancer therapy, magnetic field triggered therapeutic approaches can treat cancer in an unconventional but more effective and safer way. In this review, we will discuss the recent progress in cancer therapies based on MNPs, mainly including magnetic hyperthermia, magnetic specific targeting, magnetically controlled drug delivery, magnetofection, and magnetic switches for controlling cell fate. Some recently developed strategies such as magnetic resonance imaging (MRI) monitoring cancer therapy and magnetic tissue engineering are also addressed.

  20. Ship Classification with High Resolution TerraSAR-X Imagery Based on Analytic Hierarchy Process

    Directory of Open Access Journals (Sweden)

    Zhi Zhao

    2013-01-01

    Ship surveillance using space-borne synthetic aperture radar (SAR), taking advantage of high resolution over wide swaths and all-weather working capability, has attracted worldwide attention. Recent activity in this field has concentrated mainly on ship detection, while classification remains largely open. In this paper, we propose a novel ship classification scheme based on the analytic hierarchy process (AHP) in order to achieve better performance. The main idea is to apply AHP to both feature selection and classification decision. On one hand, the AHP-based feature selection constructs a selection decision problem based on several feature evaluation measures (e.g., discriminability, stability, and information measure) and provides objective criteria to make comprehensive decisions for their combinations quantitatively. On the other hand, we take the selected feature sets as the input of KNN classifiers and fuse the multiple classification results based on AHP, in which the feature sets’ confidence is taken into account when the AHP-based classification decision is made. We analyze the proposed classification scheme and demonstrate its results on a ship dataset that comes from TerraSAR-X SAR images.
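    The core AHP computation is turning a matrix of pairwise judgments into priority weights, usually taken as the principal eigenvector of that matrix. The sketch below does this by power iteration. The three criteria names come from the abstract; the pairwise judgments are made-up numbers, and how the paper builds its hierarchy or fuses the KNN outputs is not reproduced here.

```python
def ahp_weights(matrix, iters=100):
    """Priority weights of a pairwise comparison matrix via power iteration
    (approximates the principal eigenvector, normalised to sum to 1)."""
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iters):
        v = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(v)
        w = [x / total for x in v]
    return w

# Hypothetical judgments: entry [i][j] says how much more important
# criterion i is than criterion j (reciprocal matrix).
criteria = ["discriminability", "stability", "information measure"]
judgments = [
    [1.0, 2.0, 6.0],
    [1.0 / 2.0, 1.0, 3.0],
    [1.0 / 6.0, 1.0 / 3.0, 1.0],
]
for name, weight in zip(criteria, ahp_weights(judgments)):
    print(name, round(weight, 3))
```

    This example matrix is perfectly consistent, so the weights come out in the ratio 6:3:1; with real, slightly inconsistent judgments the same routine returns a compromise weighting.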

  1. Virtual images inspired consolidate collaborative representation-based classification method for face recognition

    Science.gov (United States)

    Liu, Shigang; Zhang, Xinxin; Peng, Yali; Cao, Han

    2016-07-01

    The collaborative representation-based classification method performs well in the classification of high-dimensional images, such as in face recognition. It utilizes training samples from all classes to represent a test sample and assigns a class label to the test sample using the representation residuals. However, when applied to image classification, this method still suffers from the problem that a limited number of training samples reduces the classification accuracy. In this paper, we propose a modified collaborative representation-based classification method (MCRC), which exploits novel virtual images and can obtain high classification accuracy. The procedure for producing virtual images is very simple, but their use can bring a surprising performance improvement. In some cases, the virtual images can sufficiently represent the features of the original face images. Extensive experimental results demonstrate that the proposed method can effectively improve the classification accuracy. This is mainly attributed to the integration of the collaborative representation and the proposed feature-information dominated virtual images.

  2. Land Cover Classification from Full-Waveform LIDAR Data Based on Support Vector Machines

    Science.gov (United States)

    Zhou, M.; Li, C. R.; Ma, L.; Guan, H. C.

    2016-06-01

    In this study, a land cover classification method based on multi-class Support Vector Machines (SVM) is presented to predict the types of land cover in the Miyun area. The obtained backscattered full-waveforms were processed following a workflow of waveform pre-processing, waveform decomposition and feature extraction. The extracted features, which consist of distance, intensity, Full Width at Half Maximum (FWHM) and backscattering cross-section, were corrected and used as attributes of the training data to generate the SVM prediction model. The SVM prediction model was applied to predict the types of land cover in the Miyun area as ground, trees, buildings and farmland. The classification results for these four types of land cover were evaluated against ground truth information derived from the CCD image data of the Miyun area. The proposed classification algorithm achieved an overall classification accuracy of 90.63%. To put the SVM classification results in context, they were compared with those of an Artificial Neural Networks (ANNs) method, and the SVM method achieved better classification results.
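
    As a small worked example of the FWHM feature named above: if one assumes that each decomposed echo is modelled as a Gaussian pulse (a common choice in waveform decomposition, but an assumption here since the abstract does not state the echo model), the FWHM follows directly from the fitted standard deviation. The function names below are illustrative.

```python
import math


def gaussian(t, amplitude, center, sigma):
    """Gaussian echo model: amplitude * exp(-(t - center)^2 / (2 sigma^2))."""
    return amplitude * math.exp(-((t - center) ** 2) / (2.0 * sigma ** 2))


def gaussian_fwhm(sigma):
    """Full width at half maximum of a Gaussian: 2 * sqrt(2 ln 2) * sigma."""
    return 2.0 * math.sqrt(2.0 * math.log(2.0)) * sigma


# Hypothetical fitted echo: unit amplitude, centred at t = 10, sigma = 2.
sigma = 2.0
fwhm = gaussian_fwhm(sigma)

# Half a width away from the centre, the pulse has dropped to half its peak.
half_value = gaussian(10.0 + fwhm / 2.0, 1.0, 10.0, sigma)
print(fwhm, half_value)
```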

  3. LAND COVER CLASSIFICATION FROM FULL-WAVEFORM LIDAR DATA BASED ON SUPPORT VECTOR MACHINES

    Directory of Open Access Journals (Sweden)

    M. Zhou

    2016-06-01

    In this study, a land cover classification method based on multi-class Support Vector Machines (SVM) is presented to predict the types of land cover in Miyun area. The obtained backscattered full-waveforms were processed following a workflow of waveform pre-processing, waveform decomposition and feature extraction. The extracted features, which consist of distance, intensity, Full Width at Half Maximum (FWHM) and back scattering cross-section, were corrected and used as attributes for training data to generate the SVM prediction model. The SVM prediction model was applied to predict the types of land cover in Miyun area as ground, trees, buildings and farmland. The classification results of these four types of land covers were obtained based on the ground truth information according to the CCD image data of Miyun area. It showed that the proposed classification algorithm achieved an overall classification accuracy of 90.63%. In order to better explain the SVM classification results, the classification results of SVM method were compared with that of Artificial Neural Networks (ANNs) method and it showed that SVM method could achieve better classification results.

  4. A New Classification Analysis of Customer Requirement Information Based on Quantitative Standardization for Product Configuration

    Directory of Open Access Journals (Sweden)

    Zheng Xiao

    2016-01-01

    Traditional methods used for the classification of customer requirement information are typically based on specific indicators, hierarchical structures, and data formats and involve a qualitative analysis in terms of stationary patterns. Because these methods neither consider the scalability of classification results nor do they regard subsequent application to product configuration, their classification becomes an isolated operation. However, the transformation of customer requirement information into quantifiable values would lead to a dynamic classification according to specific conditions and would enable an association with product configuration in an enterprise. This paper introduces a classification analysis based on quantitative standardization, which focuses on (i) expressing customer requirement information mathematically and (ii) classifying customer requirement information for product configuration purposes. Our classification analysis treated customer requirement information as follows: first, it was transformed into standardized values using mathematics, subsequent to which it was classified through calculating the dissimilarity with general customer requirement information related to the product family. Finally, a case study was used to demonstrate and validate the feasibility and effectiveness of the classification analysis.

  5. Cancer survivors' experience of exercise-based cancer rehabilitation

    DEFF Research Database (Denmark)

    Midtgaard, Julie; Hammer, Nanna Maria; Andersen, Christina;

    2015-01-01

    BACKGROUND: Evidence for the safety and benefits of exercise training as a therapeutic and rehabilitative intervention for cancer survivors is accumulating. However, whereas the evidence for the efficacy of exercise training has been established in several meta-analyses, synthesis of qualitative...... research is lacking. In order to extend healthcare professionals' understanding of the meaningfulness of exercise in cancer survivorship care, this paper aims to identify, appraise and synthesize qualitative studies on cancer survivors' experience of participation in exercise-based rehabilitation. MATERIAL......-based rehabilitation according to cancer survivors. Accordingly, the potential of rebuilding structure in everyday life, creating a normal context and enabling the individual to re-establish confidentiality and trust in their own body and physical potential constitute substantial qualities fundamental...

  6. Land Use/Land Cover Classification Based on Multi-resolution Remote Sensing Data

    OpenAIRE

    Liu, Yuechen; Pei, Zhiyuan; Wu, Quan; Guo, Lin; Zhao, Hu; Chen, Xiwei

    2011-01-01

    Part 1: GIS, GPS, RS and Precision Farming; International audience; The paper summarizes pre-existing research relating to land use/land cover classification based on multi-resolution remote sensing data. According to the features of the regions, we carried out the land use/land cover classification of level III classes in the 148 group of the Xinjiang agricultural reclamation eighth division. The land use/land cover classification system divided land in the study area into 6 level I classes, 16 lev...

  7. A Neuro-Fuzzy based System for Classification of Natural Textures

    Science.gov (United States)

    Jiji, G. Wiselin

    2016-06-01

    A statistical approach based on the coordinated clusters representation of images is used for classification and recognition of textured images. In this paper, two issues are addressed: the first is the extraction of texture features from the fuzzy texture spectrum in the chromatic and achromatic domains from each colour component histogram of natural texture images, and the second is the fusion of multiple classifiers. An advanced neuro-fuzzy learning scheme has also been adopted in this paper. The results of classification tests show the high performance of the proposed method when compared with other works, suggesting that it may have industrial application for texture classification.

  8. Comparison of Supervised Classification Methods for Protein Profiling in Cancer Diagnosis

    OpenAIRE

    Nadège Dossat; Alain Mangé; Jérôme Solassol; William Jacot; Ludovic Lhermitte; Thierry Maudelonde; Jean-Pierre Daurès; Nicolas Molinari

    2007-01-01

    Summary: A key challenge in clinical proteomics of cancer is the identification of biomarkers that could allow detection, diagnosis and prognosis of the disease. Recent advances in mass spectrometry and proteomic instrumentation offer a unique chance to rapidly identify these markers. These advances pose considerable challenges, similar to those created by microarray-based investigation, for the discovery of patterns of markers from high-dimensional data, specific to each pathologic state (e.g...

  9. AR-based Method for ECG Classification and Patient Recognition

    Directory of Open Access Journals (Sweden)

    Branislav Vuksanovic

    2013-09-01

    The electrocardiogram (ECG) is the recording of heart activity obtained by measuring the signals from electrical contacts placed on the skin of the patient. By analyzing the ECG, it is possible to detect the rate and consistency of heartbeats and identify possible irregularities in heart operation. This paper describes a set of techniques employed to pre-process the ECG signals and extract a set of features, namely the autoregressive (AR) signal parameters used to characterise the ECG signal. In this work, the extracted parameters are used to accomplish two tasks. Firstly, the AR features belonging to each ECG signal are classified into groups corresponding to three different heart conditions: normal, arrhythmia and ventricular arrhythmia. The classification results obtained indicate accurate, zero-error classification of patients according to their heart condition using the proposed method. The sets of extracted AR coefficients are then extended by adding an additional parameter, the power of the AR modelling error, and the suitability of the developed technique for individual patient identification is investigated. Individual feature sets for each group of detected QRS sections are classified into p clusters, where p represents the number of patients in each group. The developed system has been tested using ECG signals available in the MIT/BIH and Politecnico of Milano VCG/ECG databases. The recognition rates achieved indicate that patient identification using ECG signals could be considered as a possible approach in some applications using the system developed in this work. Pre-processing stages, applied parameter extraction techniques and some intermediate and final classification results are described and presented in this paper.
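
    The AR feature vector described in this abstract (AR coefficients plus the power of the modelling error) could be computed along the following lines. This is only a sketch under stated assumptions: the abstract does not say how the AR parameters were estimated, so the Yule-Walker equations solved by the Levinson-Durbin recursion are used here as one standard option, and the synthetic signal, the model order and all function names are hypothetical.

```python
import random


def autocorr(x, max_lag):
    """Biased sample autocovariance of x for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    c = [v - mean for v in x]
    return [sum(c[t] * c[t + k] for t in range(n - k)) / n
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for the model
    x[t] = a[0] x[t-1] + ... + a[order-1] x[t-order] + e[t].
    Returns (a, err), where err is the prediction error power."""
    a = []
    err = r[0]
    for m in range(order):
        acc = r[m + 1] - sum(a[k] * r[m - k] for k in range(m))
        k_m = acc / err  # reflection coefficient
        a = [a[k] - k_m * a[m - 1 - k] for k in range(m)] + [k_m]
        err *= 1.0 - k_m * k_m
    return a, err


def ar_features(x, order):
    """Feature vector: AR coefficients followed by the error power."""
    a, err = levinson_durbin(autocorr(x, order), order)
    return a + [err]


# Synthetic stand-in for one pre-processed ECG segment (not real data).
rng = random.Random(0)
signal = [0.0, 0.0]
for _ in range(500):
    signal.append(0.6 * signal[-1] - 0.2 * signal[-2] + rng.gauss(0.0, 1.0))

features = ar_features(signal, order=4)
print(features)
```

    Feature vectors of this kind, one per detected segment, would then be passed to whichever classifier or clustering method is chosen.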

  10. Computer vision-based limestone rock-type classification using probabilistic neural network

    Institute of Scientific and Technical Information of China (English)

    Ashok Kumar Patel; Snehamoy Chatterjee

    2016-01-01

    Proper quality planning of limestone raw materials is essential for maintaining the desired feed in a cement plant. Rock-type identification is an integral part of quality planning for a limestone mine. In this paper, a computer vision-based rock-type classification algorithm is proposed for fast and reliable identification without human intervention. A laboratory-scale vision-based model was developed using a probabilistic neural network (PNN) where color histogram features are used as input. The color image histogram-based features, which include weighted mean, skewness and kurtosis, are extracted for all three color channels (red, green, and blue). A total of nine features are used as input for the PNN classification model. The smoothing parameter for the PNN model is selected judiciously to develop an optimal or close to optimal classification model. The developed PNN is validated using the test data set, and the results reveal that the proposed vision-based model can perform satisfactorily for classifying limestone rock-types. Overall, the misclassification error is below 6%. When compared with three other classification algorithms, the proposed method performs substantially better than all three.
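
    The nine-feature input described above (weighted mean, skewness and kurtosis for each of the three colour channels) can be sketched as follows. The exact definitions used by the authors are not given in the abstract, so the standard moment-based definitions computed from each channel's histogram are assumed here, and the pixel values and function names are hypothetical.

```python
def histogram_moments(hist):
    """Weighted mean, skewness and kurtosis of the intensity levels,
    using the histogram counts as weights."""
    total = sum(hist)
    mean = sum(level * count for level, count in enumerate(hist)) / total

    def central(p):
        return sum(count * (level - mean) ** p
                   for level, count in enumerate(hist)) / total

    var = central(2)
    skewness = central(3) / var ** 1.5
    kurtosis = central(4) / var ** 2
    return mean, skewness, kurtosis


def rgb_features(pixels, levels=256):
    """Build one histogram per channel and concatenate the three
    moment features of each, giving nine features in total."""
    features = []
    for channel in range(3):
        hist = [0] * levels
        for px in pixels:
            hist[px[channel]] += 1
        features.extend(histogram_moments(hist))
    return features


# Hypothetical image given as a flat list of (R, G, B) tuples.
pixels = [(10, 200, 30), (12, 180, 35), (11, 190, 40), (200, 20, 33)]
feature_vector = rgb_features(pixels)
print(feature_vector)
```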

  11. Improving the Computational Performance of Ontology-Based Classification Using Graph Databases

    Directory of Open Access Journals (Sweden)

    Thomas J. Lampoltshammer

    2015-07-01

    The increasing availability of very high-resolution remote sensing imagery (i.e., from satellites, airborne laser scanning, or aerial photography) represents both a blessing and a curse for researchers. The manual classification of these images, or other similar geo-sensor data, is time-consuming and leads to subjective and non-deterministic results. Due to this fact, (semi-)automated classification approaches are in high demand in affected research areas. Ontologies provide a proper way of automated classification for various kinds of sensor data, including remotely sensed data. However, the processing of data entities (so-called individuals) is one of the most cost-intensive computational operations within ontology reasoning. Therefore, an approach based on graph databases is proposed to overcome the issue of a high time consumption regarding the classification task. The introduced approach shifts the classification task from the classical Protégé environment and its common reasoners to the proposed graph-based approaches. For the validation, the authors tested the approach on a simulation scenario based on a real-world example. The results demonstrate a quite promising improvement of classification speed: up to 80,000 times faster than the Protégé-based approach.

  12. An Analysis of Social Class Classification Based on Linguistic Variables

    Institute of Scientific and Technical Information of China (English)

    QU Xia-sha

    2016-01-01

    Since language is an influential tool in social interaction, the relationship between speech and social factors such as social class, gender, and even age is worth studying. People employ different linguistic variables to signal their social class, status, and identity in social interaction. Linguistic variation thus involves vocabulary, sounds, grammatical constructions, dialects, and so on. As a result, the classification of social class draws people's attention. Linguistic variables in speech interactions indicate the social relationships between people. This paper attempts to illustrate three main linguistic variables that influence social class, and suggests that further sociolinguistic studies should pay more attention to them.

  13. Multiobjective Simulated Annealing-Based Clustering of Tissue Samples for Cancer Diagnosis.

    Science.gov (United States)

    Acharya, Sudipta; Saha, Sriparna; Thadisina, Yamini

    2016-03-01

    In the field of pattern recognition, the study of the gene expression profiles of different tissue samples over different experimental conditions has become feasible with the arrival of microarray-based technology. In cancer research, classification of tissue samples is necessary for cancer diagnosis, which can be done with the help of microarray technology. In this paper, we have presented a multiobjective optimization (MOO)-based clustering technique utilizing archived multiobjective simulated annealing (AMOSA) as the underlying optimization strategy for classification of tissue samples from cancer datasets. The presented clustering technique is evaluated on three open source benchmark cancer datasets [Brain tumor dataset, Adult Malignancy, and Small Round Blue Cell Tumors (SRBCT)]. In order to evaluate the quality of the produced clusters, two cluster quality measures, viz. adjusted Rand index and classification accuracy (%CoA), are calculated. Comparative results of the presented clustering algorithm against ten state-of-the-art clustering techniques are shown for the three benchmark datasets. We have also conducted a statistical significance test (t-test) to support the superiority of our MOO-based clustering technique over the other clustering techniques. Moreover, significant gene markers have been identified and demonstrated visually from the clustering solutions obtained. This study can have an important impact in the field of cancer subtype prediction. PMID:25706936
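
    For reference, the adjusted Rand index used above as a cluster quality measure can be computed from the contingency table of two labelings. The sketch below follows the standard pair-counting definition; the function name and the toy labels are illustrative and are not taken from the paper.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same samples."""
    n = len(labels_true)
    contingency = Counter(zip(labels_true, labels_pred))
    row_sizes = Counter(labels_true)
    col_sizes = Counter(labels_pred)

    sum_cells = sum(comb(c, 2) for c in contingency.values())
    sum_rows = sum(comb(c, 2) for c in row_sizes.values())
    sum_cols = sum(comb(c, 2) for c in col_sizes.values())

    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:
        # Degenerate case, e.g. both labelings put everything in one cluster.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Toy labels: the same partition under renamed cluster ids scores 1.0,
# while a partition that splits every true class evenly scores below 0.
same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
print(same, crossed)
```

    Unlike plain classification accuracy, this index does not depend on how cluster ids are matched to class labels, which is why it is convenient for evaluating clustering output.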

  14. Marker-Based Hierarchical Segmentation and Classification Approach for Hyperspectral Imagery

    Science.gov (United States)

    Tarabalka, Yuliya; Tilton, James C.; Benediktsson, Jon Atli; Chanussot, Jocelyn

    2011-01-01

    The Hierarchical SEGmentation (HSEG) algorithm, which is a combination of hierarchical step-wise optimization and spectral clustering, has performed well in hyperspectral image analysis. This technique produces as its output a hierarchical set of image segmentations. The automated selection of a single segmentation level is often necessary. We propose and investigate the use of automatically selected markers for this purpose. In this paper, a novel Marker-based HSEG (M-HSEG) method for spectral-spatial classification of hyperspectral images is proposed. First, pixelwise classification is performed and the most reliably classified pixels are selected as markers, with the corresponding class labels. Then, a novel constrained marker-based HSEG algorithm is applied, resulting in a spectral-spatial classification map. The experimental results show that the proposed approach yields accurate segmentation and classification maps, and thus is attractive for hyperspectral image analysis.

  15. A Novel Imbalanced Data Classification Approach Based on Logistic Regression and Fisher Discriminant

    Directory of Open Access Journals (Sweden)

    Baofeng Shi

    2015-01-01

    We introduce an imbalanced data classification approach based on logistic regression significant discriminant and Fisher discriminant. First, a key indicator extraction model based on logistic regression significant discriminant and correlation analysis is derived to extract features for customer classification. Second, a customer scoring model is established on the basis of linear weighting using the Fisher discriminant. Then, a customer rating model is constructed in which the number of customers across all ratings follows a normal distribution. The performance of the proposed model and the classical SVM classification method are evaluated in terms of their ability to correctly classify consumers as default or nondefault customers. Empirical results using the data of 2157 customers in financial engineering suggest that the proposed approach performs better than the SVM model in dealing with imbalanced data classification. Moreover, our approach contributes to locating qualified customers for banks and bond investors.

  16. Content-based similarity for 3D model retrieval and classification

    Institute of Scientific and Technical Information of China (English)

    Ke Lü; Ning He; Jian Xue

    2009-01-01

    With the rapid development of 3D digital shape information, content-based 3D model retrieval and classification has become an important research area. This paper presents a novel 3D model retrieval and classification algorithm. For feature representation, a method combining a distance histogram and moment invariants is proposed to improve the retrieval performance. The major advantage of using a distance histogram is its invariance to the transforms of scaling, translation and rotation. Based on the premise that two similar objects should have high mutual information, the querying of 3D data should convey a great deal of information on the shape of the two objects, and so we propose a mutual information distance measurement to perform the similarity comparison of 3D objects. The proposed algorithm is tested with a 3D model retrieval and classification prototype, and the experimental evaluation demonstrates satisfactory retrieval results and classification accuracy.
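
    The invariance property claimed for the distance histogram can be illustrated with a small sketch. The abstract does not define the histogram precisely, so one plausible construction is assumed here: distances from each point to the model centroid, divided by the largest such distance, then binned. Centring gives translation invariance, using distances gives rotation invariance, and the normalisation gives scale invariance. The point set and function name are hypothetical.

```python
import math


def distance_histogram(points, bins=5):
    """Normalised histogram of centroid-to-point distances for a 3D point set."""
    n = len(points)
    centroid = [sum(p[i] for p in points) / n for i in range(3)]
    dists = [math.dist(p, centroid) for p in points]
    d_max = max(dists)
    hist = [0] * bins
    for d in dists:
        # Clamp so that the farthest point falls in the last bin.
        idx = min(int(d / d_max * bins), bins - 1)
        hist[idx] += 1
    return [h / n for h in hist]


# Hypothetical model: six points on the coordinate axes.
model = [(1, 0, 0), (-1, 0, 0), (0, 2, 0), (0, -2, 0), (0, 0, 4), (0, 0, -4)]

# Same model scaled by 2.5, rotated 90 degrees about z, then translated.
moved = [(-2.5 * y + 3.0, 2.5 * x - 1.0, 2.5 * z + 7.0) for x, y, z in model]

hist_model = distance_histogram(model)
hist_moved = distance_histogram(moved)
print(hist_model, hist_moved)
```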

  17. Accurate crop classification using hierarchical genetic fuzzy rule-based systems

    Science.gov (United States)

    Topaloglou, Charalampos A.; Mylonas, Stelios K.; Stavrakoudis, Dimitris G.; Mastorocostas, Paris A.; Theocharis, John B.

    2014-10-01

    This paper investigates the effectiveness of an advanced classification system for accurate crop classification using very high resolution (VHR) satellite imagery. Specifically, a recently proposed genetic fuzzy rule-based classification system (GFRBCS) is employed, namely, the Hierarchical Rule-based Linguistic Classifier (HiRLiC). HiRLiC's model comprises a small set of simple IF-THEN fuzzy rules, easily interpretable by humans. One of its most important attributes is that its learning algorithm requires minimum user interaction, since the most important learning parameters affecting the classification accuracy are determined by the learning algorithm automatically. HiRLiC is applied in a challenging crop classification task, using a SPOT5 satellite image over an intensively cultivated area in a lake-wetland ecosystem in northern Greece. A rich set of higher-order spectral and textural features is derived from the initial bands of the (pan-sharpened) image, resulting in an input space comprising 119 features. The experimental analysis shows that HiRLiC compares favorably to other interpretable classifiers in the literature, both in terms of structural complexity and classification accuracy. Its testing accuracy was very close to that obtained by complex state-of-the-art classification systems, such as the support vector machines (SVM) and random forest (RF) classifiers. Nevertheless, visual inspection of the derived classification maps shows that HiRLiC is characterized by higher generalization properties, providing more homogeneous classifications than the competitors. Moreover, the runtime requirements for producing the thematic map were orders of magnitude lower than those of the competitors.

  18. Deep learning based classification of breast tumors with shear-wave elastography.

    Science.gov (United States)

    Zhang, Qi; Xiao, Yang; Dai, Wei; Suo, Jingfeng; Wang, Congzhi; Shi, Jun; Zheng, Hairong

    2016-12-01

    This study aims to build a deep learning (DL) architecture for automated extraction of learned-from-data image features from the shear-wave elastography (SWE), and to evaluate the DL architecture in differentiation between benign and malignant breast tumors. We construct a two-layer DL architecture for SWE feature extraction, comprised of the point-wise gated Boltzmann machine (PGBM) and the restricted Boltzmann machine (RBM). The PGBM contains task-relevant and task-irrelevant hidden units, and the task-relevant units are connected to the RBM. Experimental evaluation was performed with five-fold cross validation on a set of 227 SWE images, 135 of benign tumors and 92 of malignant tumors, from 121 patients. The features learned with our DL architecture were compared with the statistical features quantifying image intensity and texture. Results showed that the DL features achieved better classification performance with an accuracy of 93.4%, a sensitivity of 88.6%, a specificity of 97.1%, and an area under the receiver operating characteristic curve of 0.947. The DL-based method integrates feature learning with feature selection on SWE. It may be potentially used in clinical computer-aided diagnosis of breast cancer.
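
    For readers less familiar with the metrics reported in this abstract, the following sketch shows how accuracy, sensitivity and specificity are derived from a binary confusion matrix, with malignant taken as the positive class. The counts are purely hypothetical and are not the study's results; the function name is an assumption introduced here.

```python
def binary_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn: positive (e.g. malignant) cases classified correctly/incorrectly.
    tn/fp: negative (e.g. benign) cases classified correctly/incorrectly.
    """
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return accuracy, sensitivity, specificity


# Hypothetical counts for a single validation fold.
accuracy, sensitivity, specificity = binary_metrics(tp=8, fn=2, tn=9, fp=1)
print(accuracy, sensitivity, specificity)
```

    In a five-fold cross-validation setting such as the one described, these quantities would typically be computed per fold and then averaged, or computed once on the pooled predictions; the abstract does not say which.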

  19. Deep learning based classification of breast tumors with shear-wave elastography.

    Science.gov (United States)

    Zhang, Qi; Xiao, Yang; Dai, Wei; Suo, Jingfeng; Wang, Congzhi; Shi, Jun; Zheng, Hairong

    2016-12-01

    This study aims to build a deep learning (DL) architecture for automated extraction of learned-from-data image features from the shear-wave elastography (SWE), and to evaluate the DL architecture in differentiation between benign and malignant breast tumors. We construct a two-layer DL architecture for SWE feature extraction, comprised of the point-wise gated Boltzmann machine (PGBM) and the restricted Boltzmann machine (RBM). The PGBM contains task-relevant and task-irrelevant hidden units, and the task-relevant units are connected to the RBM. Experimental evaluation was performed with five-fold cross validation on a set of 227 SWE images, 135 of benign tumors and 92 of malignant tumors, from 121 patients. The features learned with our DL architecture were compared with the statistical features quantifying image intensity and texture. Results showed that the DL features achieved better classification performance with an accuracy of 93.4%, a sensitivity of 88.6%, a specificity of 97.1%, and an area under the receiver operating characteristic curve of 0.947. The DL-based method integrates feature learning with feature selection on SWE. It may be potentially used in clinical computer-aided diagnosis of breast cancer. PMID:27529139

  20. Analysis on Systematic Water Scarcity Based on Establishment of Water Scarcity Classification System

    Institute of Scientific and Technical Information of China (English)

    2011-01-01

    It would be very helpful for making countermeasures against complex water scarcity by analysis on systematic water scarcity. Based on the previous researches on water scarcity classification, a classification system of water scarcity was established according to contributing factors, which comprises three water scarcity categories caused by anthropic factors, natural factors and mixed factors respectively. Accordingly, the concept of systematic water scarcity was proposed, which can be defined as one type of water...

  1. Object-Based Classification of Abandoned Logging Roads under Heavy Canopy Using LiDAR

    OpenAIRE

    Jason Sherba; Leonhard Blesius; Jerry Davis

    2014-01-01

    LiDAR-derived slope models may be used to detect abandoned logging roads in steep forested terrain. An object-based classification approach of abandoned logging road detection was employed in this study. First, a slope model of the study site in Marin County, California was created from a LiDAR derived DEM. Multiresolution segmentation was applied to the slope model and road seed objects were iteratively grown into candidate objects. A road classification accuracy of 86% was achieved using th...

  2. Management of patients with sphincter of Oddi dysfunction based on a new classification

    Institute of Scientific and Technical Information of China (English)

    Jia-Qing Gong; Jian-Dong Ren; Fu-Zhou Tian; Rui Jiang; Li-Jun Tang; Yong Pang

    2011-01-01

    AIM: To propose a new classification system for sphincter of Oddi dysfunction (SOD) based on clinical data of patients. METHODS: The clinical data of 305 SOD patients documented over the past decade at our center were analyzed retrospectively, and typical cases were reported. CONCLUSION: The newly proposed SOD classification system introduced in this study better explains the clinical symptoms of SOD from the anatomical perspective and can guide clinical treatment of this disease.

  3. Image-Based Airborne Sensors: A Combined Approach for Spectral Signatures Classification through Deterministic Simulated Annealing

    OpenAIRE

    P. Javier Herrera; Gonzalo Pajares; María Guijarro

    2009-01-01

    The increasing technology of high-resolution image airborne sensors, including those on board Unmanned Aerial Vehicles, demands automatic solutions for processing, either on-line or off-line, the huge amounts of image data sensed during the flights. The classification of natural spectral signatures in images is one potential application. The current trend in classification is oriented towards the combination of simple classifiers. In this paper we propose a combined strategy based on the D...

  4. Medical diagnosis of cardiovascular diseases using an interval-valued fuzzy rule-based classification system

    OpenAIRE

    Sanz Delgado, José Antonio; Galar Idoate, Mikel; Jurío Munárriz, Aránzazu; Brugos Larumbe, Antonio; Pagola Barrio, Miguel; Bustince Sola, Humberto

    2013-01-01

    Objective: To develop a classifier that tackles the problem of determining the risk of a patient of suffering from a cardiovascular disease within the next ten years. The system has to provide both a diagnosis and an interpretable model explaining the decision. In this way, doctors are able to analyse the usefulness of the information given by the system. Methods: Linguistic fuzzy rule-based classification systems are used, since they provide a good classification rate and a highly interpreta...

  5. Object-Based Crop Classification with Landsat-MODIS Enhanced Time-Series Data

    OpenAIRE

    Qingting Li; Cuizhen Wang; Bing Zhang; Linlin Lu

    2015-01-01

    Cropland mapping via remote sensing can provide crucial information for agri-ecological studies. Time series of remote sensing imagery is particularly useful for agricultural land classification. This study investigated the synergistic use of feature selection, Object-Based Image Analysis (OBIA) segmentation and decision tree classification for cropland mapping using a finer temporal-resolution Landsat-MODIS Enhanced time series in 2007. The enhanced time series extracted 26 layers of Normali...

  6. A Multi-Lead ECG Classification Based on Random Projection Features

    OpenAIRE

    Bogdanova Vandergheynst, Iva; Vallejos, Rincon; Javier, Francisco; Atienza Alonso, David

    2012-01-01

    This paper presents a novel method for classification of multilead electrocardiogram (ECG) signals. The feature extraction is based on the random projection (RP) concept for dimensionality reduction. Furthermore, the classification is performed by a neuro-fuzzy classifier. Such a model can be easily implemented on portable systems for practical applications in both health monitoring and diagnostic purposes. Moreover, the RP implementation on portable systems is very challenging featuring both...
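
    The random projection idea referred to above can be illustrated with a short sketch: a high-dimensional sample vector is multiplied by a fixed random matrix to obtain a much shorter feature vector. The Gaussian matrix with 1/sqrt(k) scaling used here is one common choice and is an assumption, as are the dimensions, the seed and the function names; the abstract does not specify the projection actually used.

```python
import random


def random_projection_matrix(k, d, seed=0):
    """k x d matrix with i.i.d. N(0, 1/k) entries, fixed by the seed."""
    rng = random.Random(seed)
    scale = 1.0 / k ** 0.5
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]


def project(matrix, x):
    """Project a length-d vector onto k dimensions (matrix-vector product)."""
    return [sum(w * v for w, v in zip(row, x)) for row in matrix]


# Hypothetical heartbeat window of 200 samples reduced to 16 features.
d, k = 200, 16
R = random_projection_matrix(k, d)
beat = [((t * 37) % 11) / 10.0 for t in range(d)]  # stand-in signal
features = project(R, beat)
print(len(features))
```

    Because the projection is a single fixed matrix-vector product, it involves no training and only multiply-accumulate operations, which is part of its appeal for resource-constrained devices.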

  7. A Multi-Lead Ecg Classification Based On Random Projection Features

    OpenAIRE

    Bogdanova, Iva; Rincon, Francisco; Atienza, David

    2012-01-01

    This paper presents a novel method for classification of multi-lead electrocardiogram (ECG) signals. The feature extraction is based on the random projection (RP) concept for dimensionality reduction. Furthermore, the classification is performed by a neuro-fuzzy classifier. Such a model can be easily implemented on portable systems for practical applications in both health monitoring and diagnostic purposes. Moreover, the RP implementation on portable systems is very challenging featuring bot...

  8. Maximum likelihood based classification of electron tomographic data.

    Science.gov (United States)

    Stölken, Michael; Beck, Florian; Haller, Thomas; Hegerl, Reiner; Gutsche, Irina; Carazo, Jose-Maria; Baumeister, Wolfgang; Scheres, Sjors H W; Nickell, Stephan

    2011-01-01

    Classification and averaging of sub-tomograms can improve the fidelity and resolution of structures obtained by electron tomography. Here we present a three-dimensional (3D) maximum likelihood algorithm--MLTOMO--which is characterized by integrating 3D alignment and classification into a single, unified processing step. The novelty of our approach lies in the way we calculate the probability of observing an individual sub-tomogram for a given reference structure. We assume that the reference structure is affected by a 'compound wedge', resulting from the summation of many individual missing wedges in distinct orientations. The distance metric underlying our probability calculations effectively down-weights Fourier components that are observed less frequently. Simulations demonstrate that MLTOMO clearly outperforms the 'constrained correlation' approach and has advantages over existing approaches in cases where the sub-tomograms adopt preferred orientations. Application of our approach to cryo-electron tomographic data of ice-embedded thermosomes revealed distinct conformations that are in good agreement with results obtained by previous single particle studies.

  9. Classification of hospitals based on measured output: the VA system.

    Science.gov (United States)

    Thomas, J W; Berki, S E; Wyszewianski, L; Ashcraft, M L

    1983-07-01

    Evaluation of hospital performance and improvement of resource allocation in hospital systems require a method for classifying hospitals on the basis of their output. Previous approaches to hospital classification relied largely on input characteristics. The authors propose and apply a procedure for classifying hospitals into groups where within-group hospitals are similar with respect to output. Direct measures of case-mix-adjusted discharges and outpatient visits are the principal measures of patient care output; other measures capture training and research functions. The component measures were weighted, and a composite output measure was calculated for each of the 162 hospitals in the Veterans Administration health care system. The output score then was used as the dependent variable in an Automatic Interaction Detector analysis, which partitioned the 162 hospitals into 10 groups, accounting for 85 per cent of the variance in the dependent variable. An extension of the output classification method is presented for illustration of how the difference between hospitals' actual operating costs and costs predicted on the basis of output can be used in defining isoefficiency groups. PMID:6350744

  10. Multi-Frequency Polarimetric SAR Classification Based on Riemannian Manifold and Simultaneous Sparse Representation

    Directory of Open Access Journals (Sweden)

    Fan Yang

    2015-07-01

    Normally, polarimetric SAR classification is a high-dimensional nonlinear mapping problem. In the realm of pattern recognition, sparse representation is a very efficacious and powerful approach. As classical descriptors of polarimetric SAR, covariance and coherency matrices are Hermitian semidefinite and form a Riemannian manifold. Conventional Euclidean metrics are not suitable for a Riemannian manifold, and hence, normal sparse representation classification cannot be applied to polarimetric SAR directly. This paper proposes a new land cover classification approach for polarimetric SAR. There are two principal novelties in this paper. First, a Stein kernel on a Riemannian manifold instead of Euclidean metrics, combined with sparse representation, is employed for polarimetric SAR land cover classification. This approach is named Stein-sparse representation-based classification (SRC). Second, using simultaneous sparse representation and reasonable assumptions of the correlation of representation among different frequency bands, Stein-SRC is generalized to simultaneous Stein-SRC for multi-frequency polarimetric SAR classification. These classifiers are assessed using polarimetric SAR images from the Airborne Synthetic Aperture Radar (AIRSAR) sensor of the Jet Propulsion Laboratory (JPL) and the Electromagnetics Institute Synthetic Aperture Radar (EMISAR) sensor of the Technical University of Denmark (DTU). Experiments on single-band and multi-band data both show that these approaches acquire more accurate classification results in comparison to many conventional and advanced classifiers.
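
    The Stein kernel mentioned in this abstract is commonly built on the Stein (Jensen-Bregman LogDet) divergence between two Hermitian positive definite matrices. The following is a minimal sketch of that general construction, not the authors' implementation: the function names, the parameter `beta`, and the small test matrices are illustrative assumptions, and the matrices are assumed to be strictly positive definite so that the logarithms exist.

```python
import math


def det(matrix):
    """Determinant of a small square complex matrix (Gaussian elimination, partial pivoting)."""
    n = len(matrix)
    a = [list(map(complex, row)) for row in matrix]
    result = 1 + 0j
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) == 0:
            return 0j
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign of the determinant
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def stein_divergence(x, y):
    """S(X, Y) = log det((X + Y) / 2) - 0.5 * log det(X Y)."""
    n = len(x)
    mid = [[(x[i][j] + y[i][j]) / 2 for j in range(n)] for i in range(n)]
    # det(XY) = det(X) det(Y); for Hermitian positive definite input these are real and positive.
    return math.log(det(mid).real) - 0.5 * (math.log(det(x).real) + math.log(det(y).real))


def stein_kernel(x, y, beta=1.0):
    """Kernel value exp(-beta * S(X, Y)); beta is a hypothetical tuning parameter."""
    return math.exp(-beta * stein_divergence(x, y))


# Illustrative 2x2 Hermitian positive definite matrices (not data from the paper).
X = [[2, 1j], [-1j, 2]]
I2 = [[1, 0], [0, 1]]
print(stein_divergence(X, I2), stein_kernel(X, I2))
```

    The divergence is zero for identical matrices and symmetric in its arguments, which is what makes it usable as a similarity measure in place of a Euclidean distance.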

  11. Fractal classification and natural classification of coal pore structure based on migration of coal bed methane

    Institute of Scientific and Technical Information of China (English)

    FU Xuehai; QIN Yong; ZHANG Wanhong; WEI Chongtao; ZHOU Rongfu

    2005-01-01

    According to the data of 146 coal samples measured by mercury penetration, coal pores are classified into two levels of <65 nm diffusion pore and >65 nm seeping pore by fractal method based on the characteristics of diffusion, seepage of coal bed methane (CBM) and on the research results of specific pore volume and pore structure. The diffusion pores are further divided into three categories: <8 nm micropore, 8-20 nm transitional pore, and 20-65 nm minipore based on the relationship between increment of specific surface area and diameter of pores, while seepage pores are further divided into three categories: 65-325 nm mesopore, 325-1000 nm transitional pore, and >1000 nm macropore based on the abrupt change in the increment of specific pore volume.
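
    The size thresholds in this abstract can be written down as a small lookup. This is only a sketch of the stated ranges: the function name is hypothetical, and the handling of values that fall exactly on a boundary (8, 20, 65, 325, 1000 nm) is an assumption, since the abstract does not say which class a boundary value belongs to.

```python
def classify_pore(diameter_nm):
    """Return (level, category) for a pore diameter given in nanometres."""
    if diameter_nm <= 0:
        raise ValueError("pore diameter must be positive")
    # Diffusion pores: below 65 nm.
    if diameter_nm < 8:
        return ("diffusion", "micropore")
    if diameter_nm < 20:
        return ("diffusion", "transitional pore")
    if diameter_nm < 65:
        return ("diffusion", "minipore")
    # Seepage pores: 65 nm and above (boundary assignment is an assumption).
    if diameter_nm < 325:
        return ("seepage", "mesopore")
    if diameter_nm <= 1000:
        return ("seepage", "transitional pore")
    return ("seepage", "macropore")


for d in (5, 12, 40, 100, 500, 2000):
    print(d, classify_pore(d))
```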

  12. A multiple-point spatially weighted k-NN method for object-based classification

    Science.gov (United States)

    Tang, Yunwei; Jing, Linhai; Li, Hui; Atkinson, Peter M.

    2016-10-01

    Object-based classification, commonly referred to as object-based image analysis (OBIA), is now commonly regarded as able to produce more appealing classification maps, often of greater accuracy, than pixel-based classification and its application is now widespread. Therefore, improvement of OBIA using spatial techniques is of great interest. In this paper, multiple-point statistics (MPS) is proposed for object-based classification enhancement in the form of a new multiple-point k-nearest neighbour (k-NN) classification method (MPk-NN). The proposed method first utilises a training image derived from a pre-classified map to characterise the spatial correlation between multiple points of land cover classes. The MPS borrows spatial structures from other parts of the training image, and then incorporates this spatial information, in the form of multiple-point probabilities, into the k-NN classifier. Two satellite sensor images with a fine spatial resolution were selected to evaluate the new method. One is an IKONOS image of the Beijing urban area and the other is a WorldView-2 image of the Wolong mountainous area, in China. The images were object-based classified using the MPk-NN method and several alternatives, including the k-NN, the geostatistically weighted k-NN, the Bayesian method, the decision tree classifier (DTC), and the support vector machine classifier (SVM). It was demonstrated that the new spatial weighting based on MPS can achieve greater classification accuracy relative to the alternatives and it is, thus, recommended as appropriate for object-based classification.

  13. The study of a patient's immune system may prove to be a useful noninvasive tool for stage classification in colon cancer.

    Science.gov (United States)

    Pellegrini, Patrizia; Berghella, Anna Maria; Contasta, Ida; Del Beato, Tiziana; Adorno, Domenico

    2006-10-01

    Therapy, and, therefore, prognosis, is strictly related to cancer stage, and hence, screening tests that can contribute to the early classification of disease stage represent a step forward in treatment. Unfortunately, few prognostic indices are available, especially noninvasive ones. Our study of the physiological network of the immune response, however, leads us to believe that it may well be possible to define immunological indices for the classification of cancer stage using blood parameters. In this paper, we show how the study of a patient's immune system can be used as a noninvasive tool for early-stage classification.

  14. Non-target adjacent stimuli classification improves performance of classical ERP-based brain computer interface

    Science.gov (United States)

    Ceballos, G. A.; Hernández, L. F.

    2015-04-01

    Objective. The classical ERP-based speller, or P300 Speller, is one of the most commonly used paradigms in the field of Brain Computer Interfaces (BCI). Several alterations to the visual stimuli presentation system have been developed to avoid unfavorable effects elicited by adjacent stimuli. However, there has been little, if any, regard to useful information contained in responses to adjacent stimuli about spatial location of target symbols. This paper aims to demonstrate that combining the classification of non-target adjacent stimuli with standard classification (target versus non-target) significantly improves classical ERP-based speller efficiency. Approach. Four SWLDA classifiers were trained and combined with the standard classifier: the lower row, upper row, right column and left column classifiers. This new feature extraction procedure and the classification method were carried out on three open databases: the UAM P300 database (Universidad Autonoma Metropolitana, Mexico), BCI competition II (dataset IIb) and BCI competition III (dataset II). Main results. The inclusion of the classification of non-target adjacent stimuli improves target classification in the classical row/column paradigm. A gain in mean single trial classification of 9.6% and an overall improvement of 25% in simulated spelling speed was achieved. Significance. We have provided further evidence that the ERPs produced by adjacent stimuli present discriminable features, which could provide additional information about the spatial location of intended symbols. This work promotes the searching of information on the peripheral stimulation responses to improve the performance of emerging visual ERP-based spellers.

  15. Effect of World Health Organization (WHO) Histological Classification on Predicting Lymph Node Metastasis and Recurrence in Early Gastric Cancer

    Science.gov (United States)

    Lai, Ji Fu; Xu, Wen Na; Noh, Sung Hoon; Lu, Wei Qin

    2016-01-01

    Background The World Health Organization (WHO) histological classification for gastric cancer is widely accepted and used. However, its impact on predicting lymph node metastasis and recurrence in early gastric cancer (EGC) is not well studied. Material/Methods From 1987 to 2005, 2873 EGC patients with known WHO histological type who had undergone curative resection were enrolled in this study. In all, 637 well-differentiated adenocarcinomas (WD), 802 moderately-differentiated adenocarcinomas (MD), 689 poorly-differentiated adenocarcinomas (PD), and 745 signet-ring cell adenocarcinomas (SRC) were identified. Results The distribution of demographic and clinical features in early gastric cancer among WD, MD, PD, and SRC were significantly different. Lymph node metastasis was observed in 317 patients (11.0%), with the lymph node metastasis rate being 5.3%, 14.8%, 17.0%, and 6.3% in WD, MD, PD, and SRC, respectively. Univariate and multivariate analyses indicated that gender, tumor size, gross appearance, depth of invasion, and WHO classification were significantly associated with lymph node metastasis. Recurrence was observed in 83 patients (2.9%), with the recurrence rate being 2.2%, 4.5%, 3.0%, and 1.6% in WD, MD, PD, and SRC, respectively. Multivariate analysis confirmed that MD, elevated gross type, and lymph node metastasis were independent risk factors for recurrence in EGC. MD patients showed worse disease-free survival than non-MD patients (P=0.001). Conclusions WHO classification is useful and necessary to evaluate during the perioperative management of EGC. Treatment strategies for EGC should be made prudently according to WHO classification, especially for MD patients. PMID:27595490
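
    The overall figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the abstract; the variable names are illustrative, and the per-subtype rates are not recomputed because the per-subtype event counts are not given.

```python
# Subtype counts as stated in the abstract.
subtype_counts = {"WD": 637, "MD": 802, "PD": 689, "SRC": 745}
total_patients = sum(subtype_counts.values())

# Overall event counts as stated in the abstract.
lymph_node_metastasis = 317
recurrence = 83

lnm_rate = round(100 * lymph_node_metastasis / total_patients, 1)
recurrence_rate = round(100 * recurrence / total_patients, 1)

print(total_patients)   # 2873, matching the enrolled cohort
print(lnm_rate)         # 11.0
print(recurrence_rate)  # 2.9
```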

  16. Effect of World Health Organization (WHO) Histological Classification on Predicting Lymph Node Metastasis and Recurrence in Early Gastric Cancer.

    Science.gov (United States)

    Lai, Ji Fu; Xu, Wen Na; Noh, Sung Hoon; Lu, Wei Qin

    2016-01-01

    BACKGROUND The World Health Organization (WHO) histological classification for gastric cancer is widely accepted and used. However, its impact on predicting lymph node metastasis and recurrence in early gastric cancer (EGC) is not well studied. MATERIAL AND METHODS From 1987 to 2005, 2873 EGC patients with known WHO histological type who had undergone curative resection were enrolled in this study. In all, 637 well-differentiated adenocarcinomas (WD), 802 moderately-differentiated adenocarcinomas (MD), 689 poorly-differentiated adenocarcinomas (PD), and 745 signet-ring cell adenocarcinomas (SRC) were identified. RESULTS The distribution of demographic and clinical features in early gastric cancer among WD, MD, PD, and SRC were significantly different. Lymph node metastasis was observed in 317 patients (11.0%), with the lymph node metastasis rate being 5.3%, 14.8%, 17.0%, and 6.3% in WD, MD, PD, and SRC, respectively. Univariate and multivariate analyses indicated that gender, tumor size, gross appearance, depth of invasion, and WHO classification were significantly associated with lymph node metastasis. Recurrence was observed in 83 patients (2.9%), with the recurrence rate being 2.2%, 4.5%, 3.0%, and 1.6% in WD, MD, PD, and SRC, respectively. Multivariate analysis confirmed that MD, elevated gross type, and lymph node metastasis were independent risk factors for recurrence in EGC. MD patients showed worse disease-free survival than non-MD patients (P=0.001). CONCLUSIONS WHO classification is useful and necessary to evaluate during the perioperative management of EGC. Treatment strategies for EGC should be made prudently according to WHO classification, especially for MD patients. PMID:27595490

  17. [Proposals for social class classification based on the Spanish National Classification of Occupations 2011 using neo-Weberian and neo-Marxist approaches].

    Science.gov (United States)

    Domingo-Salvany, Antònia; Bacigalupe, Amaia; Carrasco, José Miguel; Espelt, Albert; Ferrando, Josep; Borrell, Carme

    2013-01-01

    In Spain, the new National Classification of Occupations (Clasificación Nacional de Ocupaciones [CNO-2011]) is substantially different to the 1994 edition, and requires adaptation of occupational social classes for use in studies of health inequalities. This article presents two proposals to measure social class: the new classification of occupational social class (CSO-SEE12), based on the CNO-2011 and a neo-Weberian perspective, and a social class classification based on a neo-Marxist approach. The CSO-SEE12 is the result of a detailed review of the CNO-2011 codes. In contrast, the neo-Marxist classification is derived from variables related to capital and organizational and skill assets. The proposed CSO-SEE12 consists of seven classes that can be grouped into a smaller number of categories according to study needs. The neo-Marxist classification consists of 12 categories in which home owners are divided into three categories based on capital goods and employed persons are grouped into nine categories composed of organizational and skill assets. These proposals are complemented by a proposed classification of educational level that integrates the various curricula in Spain and provides correspondences with the International Standard Classification of Education. PMID:23394892

  18. BRAIN TUMOR CLASSIFICATION BASED ON CLUSTERED DISCRETE COSINE TRANSFORM IN COMPRESSED DOMAIN

    Directory of Open Access Journals (Sweden)

    V. Anitha

    2014-01-01

    This study presents a novel method for classifying brain tumors that integrates several efficient techniques to increase classification accuracy. Conventional systems share the same problem: extracting feature sets from the database and classifying tumors based on those feature sets. The main aim of much earlier research on classification methods has been to increase classification accuracy. The actual need is to achieve better classification accuracy by extracting more relevant feature sets after dimensionality reduction. There is a trade-off between accuracy and the number of feature sets. Hence, this study applies the Discrete Cosine Transform (DCT) to brain tumor images of various classes. DCT by itself offers a fair dimension reduction in feature sets. The K-means algorithm is then applied to the DCT coefficients to cluster the feature sets. The resulting cluster information is treated as a refined feature set and classified using a Support Vector Machine (SVM). Using DCT in this way makes it possible to adjust the classification performance through the number of DCT coefficients taken into account. There is strong demand for automatic classification of brain tumors, which greatly helps in the process of diagnosis. In this work, an average classification accuracy of 97% and a maximum of 100% were achieved. This research opens a new route to classification in the compressed domain, and may therefore be well suited to diagnosis in mobile computing and internet-based medical settings.
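
    The dimension-reduction step described here, keeping only a chosen number of DCT coefficients as features, can be illustrated in one dimension. This is a sketch under stated assumptions, not the study's pipeline: it uses an unnormalised 1-D DCT-II on a short list of pixel values rather than a 2-D transform on images, the function names and the `keep` parameter are hypothetical, and the K-means and SVM stages are not shown.

```python
import math


def dct2(signal):
    """Unnormalised DCT-II: X[k] = sum_n x[n] * cos(pi / N * (n + 0.5) * k)."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi / n * (i + 0.5) * k) for i, x in enumerate(signal))
        for k in range(n)
    ]


def dct_features(signal, keep):
    """Keep only the first `keep` (lowest-frequency) coefficients as a reduced feature vector."""
    return dct2(signal)[:keep]


# A constant signal has all of its energy in the first (DC) coefficient.
flat = [3.0, 3.0, 3.0, 3.0]
print(dct2(flat))

# A hypothetical row of pixel intensities reduced from 8 values to 3 features.
row = [10.0, 12.0, 11.0, 13.0, 40.0, 42.0, 41.0, 43.0]
print(dct_features(row, keep=3))
```

    Varying `keep` is the knob the abstract refers to when it says performance can be adjusted by the count of DCT coefficients taken into account.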

  19. Study design requirements for RNA sequencing-based breast cancer diagnostics.

    Science.gov (United States)

    Mer, Arvind Singh; Klevebring, Daniel; Grönberg, Henrik; Rantalainen, Mattias

    2016-01-01

    Sequencing-based molecular characterization of tumors provides information required for individualized cancer treatment. There are well-defined molecular subtypes of breast cancer that provide improved prognostication compared to routine biomarkers. However, molecular subtyping is not yet implemented in routine breast cancer care. Clinical translation is dependent on subtype prediction models providing high sensitivity and specificity. In this study we evaluate sample size and RNA-sequencing read requirements for breast cancer subtyping to facilitate rational design of translational studies. We applied subsampling to ascertain the effect of training sample size and the number of RNA sequencing reads on classification accuracy of molecular subtype and routine biomarker prediction models (unsupervised and supervised). Subtype classification accuracy improved with increasing sample size up to N = 750 (accuracy = 0.93), although with a modest improvement beyond N = 350 (accuracy = 0.92). Prediction of routine biomarkers achieved accuracy of 0.94 (ER) and 0.92 (Her2) at N = 200. Subtype classification improved with RNA-sequencing library size up to 5 million reads. Development of molecular subtyping models for cancer diagnostics requires well-designed studies. Sample size and the number of RNA sequencing reads directly influence accuracy of molecular subtyping. Results in this study provide key information for rational design of translational studies aiming to bring sequencing-based diagnostics to the clinic. PMID:26830453

  20. The normalization of citation counts based on classification systems

    CERN Document Server

    Bornmann, Lutz; Barth, Andreas

    2013-01-01

    If we want to assess whether the paper in question has had a particularly high or low citation impact compared to other papers, the standard practice in bibliometrics is to normalize citations in respect of the subject category and publication year. A number of proposals for an improved procedure in the normalization of citation impact have been put forward in recent years. Against the background of these proposals this study describes an ideal solution for the normalization of citation impact: in a first step, the reference set for the publication in question is collated by means of a classification scheme, where every publication is associated with a single principal research field or subfield entry (e.g. via Chemical Abstracts sections) and a publication year. In a second step, percentiles of citation counts are calculated for this set and used to assign the normalized citation impact score to the publications (and also to the publication in question).
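
    The two steps described in this abstract (collating a reference set by field and year, then scoring by percentile) might be sketched as follows. The record layout, field labels, and citation counts are invented for illustration, and the percentile definition used here (share of reference-set papers cited less often) is one of several in use; the abstract does not say which one the authors prefer.

```python
def reference_set(records, field, year):
    """Step 1: citation counts of all publications in the same field and publication year."""
    return [r["citations"] for r in records if r["field"] == field and r["year"] == year]


def citation_percentile(citations, reference_counts):
    """Step 2: percentage of reference-set papers cited less often than the given count."""
    if not reference_counts:
        raise ValueError("reference set is empty")
    below = sum(1 for c in reference_counts if c < citations)
    return 100.0 * below / len(reference_counts)


# Hypothetical records; the field labels stand in for classification-scheme sections.
records = [
    {"field": "section-A", "year": 2010, "citations": c}
    for c in (0, 1, 1, 3, 5, 8, 13, 21, 34, 55)
] + [{"field": "section-B", "year": 2010, "citations": 100}]

counts = reference_set(records, "section-A", 2010)
print(citation_percentile(13, counts))  # 60.0
```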

  1. Data association based on target signal classification information

    Institute of Scientific and Technical Information of China (English)

    Guo Lei; Tang Bin; Liu Gang

    2008-01-01

    In most passive tracking systems, only the target kinematical information is used in the measurement-to-track association, which results in tracking errors in a multitarget environment where the targets are too close to each other. To enhance the tracking accuracy, the target signal classification information (TSCI) should be used to improve the data association. The TSCI is integrated in the data association process using JPDA (joint probabilistic data association). The use of the TSCI in the data association can improve discrimination by yielding a purer track and preserving continuity. To verify the validity of the application of TSCI, two simulation experiments are done on an air target-tracking problem, one using the TSCI and the other not. The final comparison shows that the use of the TSCI can effectively improve tracking accuracy.

  2. Darwin, les fossiles et les bases de la classification moderne [Darwin, fossils and the foundations of modern classification]

    OpenAIRE

    Duranthon, Francis

    2013-01-01

    Since antiquity, people have sought to classify species, as a way of describing the world and making it their own. Following the systematization of binomial nomenclature by Carl von Linné (1758) and until the first developments of evolutionary theories, classification was meant to reflect a natural scale of beings with, of course, a preeminent place for man, classified among the Primates (the first), at the very top of that scale. With the publication ...

  3. Content-based image classification with circular harmonic wavelets

    Science.gov (United States)

    Jacovitti, Giovanni; Neri, Alessandro

    1998-07-01

    Classification of an image on the basis of contained patterns is considered in the context of detection and estimation theory. To simplify mathematical derivations, image and reference patterns are represented on a complex support. This makes it possible to convert the four positional parameters into two complex numbers: a complex displacement and a complex scale factor. The latter represents isotropic dilations with its magnitude and rotations with its phase. In this context, evaluating the likelihood function under an additive Gaussian noise assumption relates the basic template matching strategy to wavelet theory. It is shown that using circular harmonic wavelets simplifies the problem from a computational viewpoint. A general-purpose pattern detection/estimation scheme is introduced by decomposing the images on an orthogonal basis formed by complex Laguerre-Gauss harmonic wavelets.
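
    The idea of folding four positional parameters (two for translation, one for scale, one for rotation) into two complex numbers can be shown directly with complex arithmetic. The sketch below is illustrative only; the function names and the sample values are assumptions, and it does not implement the wavelet decomposition itself.

```python
import cmath
import math


def similarity_transform(points, scale, displacement):
    """Map each point z on the complex plane to scale * z + displacement."""
    return [scale * z + displacement for z in points]


def dilation_and_rotation(scale):
    """Magnitude of the complex scale is the isotropic dilation, its phase the rotation angle."""
    return abs(scale), cmath.phase(scale)


# Hypothetical pose: dilate by 2, rotate by 90 degrees, then shift by (3, 1).
scale = cmath.rect(2.0, math.pi / 2)
displacement = 3 + 1j

moved = similarity_transform([1 + 0j, 0 + 1j], scale, displacement)
print(moved)
print(dilation_and_rotation(scale))
```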

  4. The Normalization of Citation Counts Based on Classification Systems

    Directory of Open Access Journals (Sweden)

    Andreas Barth

    2013-08-01

    If we want to assess whether the paper in question has had a particularly high or low citation impact compared to other papers, the standard practice in bibliometrics is to normalize citations in respect of the subject category and publication year. A number of proposals for an improved procedure in the normalization of citation impact have been put forward in recent years. Against the background of these proposals, this study describes an ideal solution for the normalization of citation impact: in a first step, the reference set for the publication in question is collated by means of a classification scheme, where every publication is associated with a single principal research field or subfield entry (e.g., via Chemical Abstracts sections) and a publication year. In a second step, percentiles of citation counts are calculated for this set and used to assign the normalized citation impact score to the publications (and also to the publication in question).

  5. A NEW UNSUPERVISED CLASSIFICATION ALGORITHM FOR POLARIMETRIC SAR IMAGES BASED ON FUZZY SET THEORY

    Institute of Scientific and Technical Information of China (English)

    Fu Yusheng; Xie Yan; Pi Yiming; Hou Yinming

    2006-01-01

    In this letter, a new method is proposed for unsupervised classification of terrain types and man-made objects using POLarimetric Synthetic Aperture Radar (POLSAR) data. This technique is a combination of the usage of polarimetric information of SAR images and the unsupervised classification method based on fuzzy set theory. Image quantization and image enhancement are used to preprocess the POLSAR data. Then the polarimetric information and Fuzzy C-Means (FCM) clustering algorithm are used to classify the preprocessed images. The advantages of this algorithm are the automated classification, its high classification accuracy, fast convergence and high stability. The effectiveness of this algorithm is demonstrated by experiments using SIR-C/X-SAR (Spaceborne Imaging Radar-C/X-band Synthetic Aperture Radar) data.
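
    The Fuzzy C-Means (FCM) step named in this abstract alternates between updating soft memberships and updating cluster centres. The following is a minimal one-dimensional sketch of the standard FCM updates, not the letter's algorithm: the function names, the fuzzifier `m = 2`, the fixed iteration count, the initial centres, and the sample values are all assumptions, and no polarimetric features or preprocessing are modelled.

```python
def fcm_memberships(values, centers, m=2.0, eps=1e-12):
    """u[k][i] = 1 / sum_j (d_ki / d_kj) ** (2 / (m - 1)) for sample k and cluster i."""
    power = 2.0 / (m - 1.0)
    rows = []
    for x in values:
        dists = [abs(x - c) + eps for c in centers]  # eps avoids division by zero
        rows.append([1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists])
    return rows


def fuzzy_c_means(values, initial_centers, m=2.0, iterations=50):
    """Run a fixed number of FCM iterations; returns (centers, memberships)."""
    centers = list(initial_centers)
    for _ in range(iterations):
        u = fcm_memberships(values, centers, m)
        # Each centre is the mean of the samples weighted by membership ** m.
        centers = [
            sum((row[i] ** m) * x for row, x in zip(u, values)) / sum(row[i] ** m for row in u)
            for i in range(len(centers))
        ]
    return centers, fcm_memberships(values, centers, m)


# Hypothetical intensity values forming two loose groups.
values = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centers, memberships = fuzzy_c_means(values, initial_centers=[0.0, 6.0])
print(centers)
print(memberships[0])
```

    Unlike a hard clustering, each sample keeps a membership in every cluster, and the memberships of a sample sum to one.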

  6. A wavelet transform based feature extraction and classification of cardiac disorder.

    Science.gov (United States)

    Sumathi, S; Beaulah, H Lilly; Vanithamani, R

    2014-09-01

    This paper presents an intelligent diagnosis system that uses a hybrid Adaptive Neuro-Fuzzy Inference System (ANFIS) model to classify electrocardiogram (ECG) signals. The method uses the Symlet wavelet transform to analyze the ECG signals and extract the parameters related to dangerous cardiac arrhythmias. These parameters were used as input to the ANFIS classifier for the five most important types of ECG signals: Normal Sinus Rhythm (NSR), Atrial Fibrillation (AF), Pre-Ventricular Contraction (PVC), Ventricular Fibrillation (VF), and Ventricular Flutter (VFLU) Myocardial Ischemia. Including ANFIS in complex investigating algorithms yields very interesting recognition and classification capabilities across a broad spectrum of biomedical engineering. The performance of the ANFIS model was evaluated in terms of training performance and classification accuracy. The results indicate that the proposed ANFIS model has a potential advantage in classifying ECG signals. A classification accuracy of 98.24% is achieved. PMID:25023652

  7. Hyperspectral remote sensing image classification based on combined SVM and LDA

    Science.gov (United States)

    Zhang, Chunsen; Zheng, Yiwei

    2014-11-01

    This paper presents a novel method for hyperspectral image classification based on the minimum noise fraction (MNF) and an approach combining support vector machine (SVM) and linear discriminant analysis (LDA). A new SVM/LDA algorithm is used for the classification. First, we use the MNF method to reduce the dimension and extract features of the image, and then use the SVM/LDA algorithm to transform the extracted features. Next, we train on the transformed features and optimize the parameters through cross-validation and grid search to obtain an optimal hyperspectral image classifier. Finally, we use this classifier to complete the classification. To verify the proposed method, the AVIRIS Indian Pines image was used. The experimental results show that the proposed method can resolve the conflict between a small number of samples and high dimensionality, and improves classification accuracy compared to the classical SVM method.

  8. CONTEXT MODELS FOR CRF-BASED CLASSIFICATION OF MULTITEMPORAL REMOTE SENSING DATA

    Directory of Open Access Journals (Sweden)

    T. Hoberg

    2012-07-01

    The increasing availability of multitemporal satellite remote sensing data offers new potential for land cover analysis. By combining data acquired at different epochs it is possible both to improve the classification accuracy and to analyse land cover changes at a high frequency. A simultaneous classification of images from different epochs that is also capable of detecting changes is achieved by a new classification technique based on Conditional Random Fields (CRF). CRF provide a probabilistic classification framework including local spatial and temporal context. Although context is known to improve image analysis results, so far only little research was carried out on how to model it. Taking into account context is the main benefit of CRF in comparison to many other classification methods. Context can be already considered by the choice of features and in the design of the interaction potentials that model the dependencies of interacting sites in the CRF. In this paper, these aspects are more thoroughly investigated. The impact of the applied features on the classification result as well as different models for the spatial interaction potentials are evaluated and compared to the purely label-based Markov Random Field model.

  9. Texture characterization for joint compression and classification based on human perception in the wavelet domain.

    Science.gov (United States)

    Fahmy, Gamal; Black, John; Panchanathan, Sethuraman

    2006-06-01

    Today's multimedia applications demand sophisticated compression and classification techniques in order to store, transmit, and retrieve audio-visual information efficiently. Over the last decade, perceptually based image compression methods have been gaining importance. These methods take into account the abilities (and the limitations) of human visual perception (HVP) when performing compression. The upcoming MPEG 7 standard also addresses the need for succinct classification and indexing of visual content for efficient retrieval. However, there has been no research that has attempted to exploit the characteristics of the human visual system to perform both compression and classification jointly. One area of HVP that has unexplored potential for joint compression and classification is spatial frequency perception. Spatial frequency content that is perceived by humans can be characterized in terms of three parameters, which are: 1) magnitude; 2) phase; and 3) orientation. While the magnitude of spatial frequency content has been exploited in several existing image compression techniques, the novel contribution of this paper is its focus on the use of phase coherence for joint compression and classification in the wavelet domain. Specifically, this paper describes a human visual system-based method for measuring the degree to which an image contains coherent (perceptible) phase information, and then exploits that information to provide joint compression and classification. Simulation results that demonstrate the efficiency of this method are presented. PMID:16764265

  10. Mastectomy or breast conserving surgery? Factors affecting type of surgical treatment for breast cancer – a classification tree approach

    International Nuclear Information System (INIS)

    A critical choice facing breast cancer patients is which surgical treatment – mastectomy or breast conserving surgery (BCS) – is most appropriate. Several studies have investigated factors that impact the type of surgery chosen, identifying features such as place of residence, age at diagnosis, tumor size, socio-economic and racial/ethnic elements as relevant. Such assessment of 'propensity' is important in understanding issues such as a reported under-utilisation of BCS among women for whom such treatment was not contraindicated. Using Western Australian (WA) data, we further examine the factors associated with the type of surgical treatment for breast cancer using a classification tree approach. This approach deals naturally with complicated interactions between factors, and so allows flexible and interpretable models for treatment choice to be built that add to the current understanding of this complex decision process. Data was extracted from the WA Cancer Registry on women diagnosed with breast cancer in WA from 1990 to 2000. Subjects' treatment preferences were predicted from covariates using both classification trees and logistic regression. Tumor size was the primary determinant of patient choice, subjects with tumors smaller than 20 mm in diameter preferring BCS. For subjects with tumors greater than 20 mm in diameter factors such as patient age, nodal status, and tumor histology become relevant as predictors of patient choice. Classification trees perform as well as logistic regression for predicting patient choice, but are much easier to interpret for clinical use. The selected tree can inform clinicians' advice to patients
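
    A classification tree chooses each split by searching for the threshold that best separates the classes. The sketch below shows that search for a single numeric covariate using Gini impurity. It is a generic illustration, not the study's model: the function names are hypothetical, and the tumor sizes and surgery labels are synthetic toy values constructed so that the split falls at 20 mm, the threshold reported in the abstract.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (0 or 1)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(sizes, labels):
    """Return (threshold, weighted impurity) of the best single split on `sizes`."""
    pairs = list(zip(sizes, labels))
    unique = sorted(set(sizes))
    best = (None, float("inf"))
    for low, high in zip(unique, unique[1:]):
        threshold = (low + high) / 2.0  # candidate midway between neighbouring values
        left = [label for size, label in pairs if size < threshold]
        right = [label for size, label in pairs if size >= threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best


# Synthetic toy data: tumor diameter in mm, label 1 = BCS, 0 = mastectomy.
sizes = [5, 10, 15, 18, 22, 30, 40, 55]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
print(best_split(sizes, labels))  # (20.0, 0.0)
```

    A full tree repeats this search over every covariate and then recurses into each branch, which is how interactions such as age or nodal status mattering only for larger tumors can emerge.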

  11. Non-gaussian distributions affect identification of expression patterns, functional annotation, and prospective classification in human cancer genomes.

    Directory of Open Access Journals (Sweden)

    Nicholas F Marko

    Full Text Available INTRODUCTION: Gene expression data is often assumed to be normally-distributed, but this assumption has not been tested rigorously. We investigate the distribution of expression data in human cancer genomes and study the implications of deviations from the normal distribution for translational molecular oncology research. METHODS: We conducted a central moments analysis of five cancer genomes and performed empiric distribution fitting to examine the true distribution of expression data both on the complete-experiment and on the individual-gene levels. We used a variety of parametric and nonparametric methods to test the effects of deviations from normality on gene calling, functional annotation, and prospective molecular classification using a sixth cancer genome. RESULTS: Central moments analyses reveal statistically-significant deviations from normality in all of the analyzed cancer genomes. We observe as much as 37% variability in gene calling, 39% variability in functional annotation, and 30% variability in prospective, molecular tumor subclassification associated with this effect. CONCLUSIONS: Cancer gene expression profiles are not normally-distributed, either on the complete-experiment or on the individual-gene level. Instead, they exhibit complex, heavy-tailed distributions characterized by statistically-significant skewness and kurtosis. The non-Gaussian distribution of this data affects identification of differentially-expressed genes, functional annotation, and prospective molecular classification. These effects may be reduced in some circumstances, although not completely eliminated, by using nonparametric analytics. This analysis highlights two unreliable assumptions of translational cancer gene expression analysis: that "small" departures from normality in the expression data distributions are analytically-insignificant and that "robust" gene-calling algorithms can fully compensate for these effects.
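
    The central moments analysis described above can be illustrated with a minimal sketch that computes skewness and excess kurtosis for one sample. The expression values are hypothetical, and the population (biased) form of the moments is an assumption; the abstract does not say which estimator the authors used.

```python
def central_moment(values, order):
    """Return the order-th central moment (population form) of a sample."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)

def skewness(values):
    """Third standardized moment; 0 for a symmetric distribution."""
    m2 = central_moment(values, 2)
    return central_moment(values, 3) / m2 ** 1.5

def excess_kurtosis(values):
    """Fourth standardized moment minus 3; 0 for a normal distribution."""
    m2 = central_moment(values, 2)
    return central_moment(values, 4) / m2 ** 2 - 3.0

# Hypothetical expression values for one gene, with a heavy right tail.
expression = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 1.3, 6.5]
print(skewness(expression), excess_kurtosis(expression))
```

    Values far from zero for either statistic suggest that a normality assumption deserves a closer look before parametric tests are applied.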

  12. Clustering and rule-based classifications of chemical structures evaluated in the biological activity space.

    Science.gov (United States)

    Schuffenhauer, Ansgar; Brown, Nathan; Ertl, Peter; Jenkins, Jeremy L; Selzer, Paul; Hamon, Jacques

    2007-01-01

    Classification methods for data sets of molecules according to their chemical structure were evaluated for their biological relevance, including rule-based, scaffold-oriented classification methods and clustering based on molecular descriptors. Three data sets resulting from uniformly determined in vitro biological profiling experiments were classified according to their chemical structures, and the results were compared in a Pareto analysis with the number of classes and their average spread in the profile space as two concurrent objectives which were to be minimized. It has been found that no classification method is overall superior to all other studied methods, but there is a general trend that rule-based, scaffold-oriented methods are the better choice if classes with homogeneous biological activity are required, but a large number of clusters can be tolerated. On the other hand, clustering based on chemical fingerprints is superior if fewer and larger classes are required, and some loss of homogeneity in biological activity can be accepted.
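
    A Pareto analysis with two objectives to be minimized, as described above, can be sketched as follows. The method names and the numbers are hypothetical and serve only to show how a dominated result is filtered out.

```python
def pareto_front(candidates):
    """Return the names of candidates not dominated by any other.

    Each candidate is (name, n_classes, mean_spread); both objectives
    are minimized.
    """
    front = []
    for name, n, spread in candidates:
        dominated = any(
            (n2 <= n and s2 <= spread) and (n2 < n or s2 < spread)
            for _, n2, s2 in candidates
        )
        if not dominated:
            front.append(name)
    return front

# Hypothetical results: (method, number of classes, average spread).
results = [
    ("scaffold_rules", 900, 0.10),
    ("fingerprint_clustering", 120, 0.35),
    ("coarse_clustering", 150, 0.40),  # worse than fingerprint_clustering on both
]
print(pareto_front(results))
```

    In this toy data no single method wins on both objectives, which mirrors the trade-off between class homogeneity and class count reported in the abstract.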

  13. Network traffic classification based on ensemble learning and co-training

    Institute of Scientific and Technical Information of China (English)

    HE HaiTao; LUO XiaoNan; MA FeiTeng; CHE ChunHui; WANG JianMin

    2009-01-01

    Classification of network traffic is an essential step in much network research. However, with the rapid evolution of Internet applications, the effectiveness of port-based and payload-based identification approaches has greatly diminished in recent years, and many researchers have turned their attention to alternative methods based on machine learning. This paper presents a novel machine learning-based classification model that combines the ensemble learning paradigm with co-training techniques. Compared to previous approaches, most of which employed only a single classifier, our method applies multiple classifiers and semi-supervised learning, which mainly helps to overcome three shortcomings: limited flow accuracy rate, weak adaptability, and a huge demand for labeled training data. Statistical characteristics of IP flows are extracted from packet-level traces to establish the feature set; the classification model is then created and tested, and the empirical results demonstrate its feasibility and effectiveness.

  14. Study on Increasing the Accuracy of Classification Based on Ant Colony algorithm

    Science.gov (United States)

    Yu, M.; Chen, D.-W.; Dai, C.-Y.; Li, Z.-L.

    2013-05-01

    The application of GIS advances the ability to analyze data from remote sensing images. The classification and extraction of remote sensing images is the primary information source for GIS in LUCC applications. How to increase classification accuracy is an important topic in remote sensing research. Adding features and investigating new classification methods are two ways to improve classification accuracy. With the ant colony algorithm defined on a mode framework, the agents of such nature-inspired algorithms exhibit a uniform mode of intelligent computation. Its application to remote sensing image classification is a new, still preliminary swarm intelligence method. Studying the applicability of the ant colony algorithm with more features, and exploring its advantages and performance, is therefore of considerable significance. The study area is the outskirts of Fuzhou, Fujian Province, where land use is complicated. A multi-source database was built that integrates spectral information (TM1-5, TM7, NDVI, NDBI), topographic characteristics (DEM, Slope, Aspect), and textural information (Mean, Variance, Homogeneity, Contrast, Dissimilarity, Entropy, Second Moment, Correlation). Classification rules based on the different characteristics are discovered from the samples through the ant colony algorithm, and the classification test is performed based on these rules. We also compare the accuracies with those of the traditional maximum likelihood method, the C4.5 algorithm, and rough set classification. The study showed that the accuracy of classification based on the ant colony algorithm is higher than that of the other methods. In addition, near-term land use and cover changes in Fuzhou are studied and mapped using remote sensing technology based on the ant colony algorithm.

  15. Volumetric magnetic resonance imaging classification for Alzheimer's disease based on kernel density estimation of local features

    Institute of Scientific and Technical Information of China (English)

    YAN Hao; WANG Hu; WANG Yong-hui; ZHANG Yu-mei

    2013-01-01

    Background The classification of Alzheimer's disease (AD) from magnetic resonance imaging (MRI) has been challenged by a lack of effective and reliable biomarkers due to inter-subject variability. This article presents a classification method for AD based on kernel density estimation (KDE) of local features. Methods First, a large number of local features were extracted from stable image blobs to represent various anatomical patterns as potential effective biomarkers. Based on distinctive descriptors and locations, the local features were robustly clustered to identify correspondences of the same underlying patterns. Then, KDE was used to estimate distribution parameters of the correspondences by weighting contributions according to their distances. Thus, biomarkers could be reliably quantified by reducing the effects of more distant correspondences, which were more likely to be noise from inter-subject variability. Finally, a Bayes classifier was applied to the distribution parameters for the classification of AD. Results Experiments were performed on different divisions of a publicly available database to investigate the accuracy and the effects of age and AD severity. Our method achieved an equal error classification rate of 0.85 for subjects aged 60-80 years exhibiting mild AD and outperformed a recent local feature-based work regardless of both effects. Conclusions We proposed a volumetric brain MRI classification method for neurodegenerative disease based on statistics of local features using KDE. The method may be useful for computer-aided diagnosis in clinical settings.
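
    As a rough illustration of kernel density estimation, the sketch below builds a one-dimensional Gaussian KDE in which each sample contributes less the farther it lies from the query point. It is not the authors' method: the one-dimensional setting, the Gaussian kernel, the bandwidth, and the sample values are all assumptions made for the example.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density at x from 1-D samples."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Each sample's contribution decays with its distance from x.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

# Hypothetical feature values: a tight group near 1.0 and one distant outlier.
feature_values = [0.9, 1.0, 1.1, 1.05, 5.0]
density = gaussian_kde(feature_values, bandwidth=0.3)
print(density(1.0) > density(5.0))
```

    The outlier still contributes to the estimate, but only near its own location, which is the sense in which distant correspondences are down-weighted.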

  16. Cell morphology-based classification of red blood cells using holographic imaging informatics.

    Science.gov (United States)

    Yi, Faliu; Moon, Inkyu; Javidi, Bahram

    2016-06-01

    We present methods that automatically select a linear or nonlinear classifier for red blood cell (RBC) classification by analyzing the equality of the covariance matrices in Gabor-filtered holographic images. First, the phase images of the RBCs are numerically reconstructed from their holograms, which are recorded using off-axis digital holographic microscopy (DHM). Second, each RBC is segmented using a marker-controlled watershed transform algorithm and the inner part of the RBC is identified and analyzed. Third, the Gabor wavelet transform is applied to the segmented cells to extract a series of features, which then undergo a multivariate statistical test to evaluate the equality of the covariance matrices of the different shapes of the RBCs using selected features. When these covariance matrices are not equal, a nonlinear classification scheme based on quadratic functions is applied; otherwise, a linear classification is applied. We used the stomatocyte, discocyte, and echinocyte RBC for classifier training and testing. Simulation results demonstrated that 10 of the 14 RBC features are useful in RBC classification. Experimental results also revealed that the covariance matrices of the three main RBC groups are not equal and that a nonlinear classification method has a much lower misclassification rate. The proposed automated RBC classification method has the potential for use in drug testing and the diagnosis of RBC-related diseases. PMID:27375953

  17. Gene Expression Based Leukemia Sub‑Classification Using Committee Neural Networks

    Directory of Open Access Journals (Sweden)

    Mihir S. Sewak

    2009-09-01

    Full Text Available Analysis of gene expression data provides an objective and efficient technique for sub‑classification of leukemia. The purpose of the present study was to design committee neural network based classification systems to subcategorize leukemia gene expression data. In the study, a binary classification system was considered to differentiate acute lymphoblastic leukemia from acute myeloid leukemia. A ternary classification system which classifies leukemia expression data into three subclasses including B‑cell acute lymphoblastic leukemia, T‑cell acute lymphoblastic leukemia and acute myeloid leukemia was also developed. In each classification system, gene expression profiles of leukemia patients were first subjected to a sequence of simple preprocessing steps. This resulted in filtering out approximately 95 percent of the non‑informative genes. The remaining 5 percent of the informative genes were used to train a set of artificial neural networks with different parameters and architectures. The networks that gave the best results during initial testing were recruited into a committee. The committee decision was by majority voting. The committee neural network system was later evaluated using data not used in training. The binary classification system classified microarray gene expression profiles into two categories with 100 percent accuracy and the ternary system correctly predicted the three subclasses of leukemia in over 97 percent of the cases.
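
    The committee decision by majority voting can be sketched in a few lines. The member predictions below are hypothetical, and the tie-breaking behavior (the label seen first wins) is a property of this sketch, not something the abstract specifies.

```python
from collections import Counter

def committee_predict(member_predictions):
    """Return the label predicted by the largest number of committee members."""
    label, _count = Counter(member_predictions).most_common(1)[0]
    return label

# Hypothetical outputs of five trained networks for one patient profile.
votes = ["ALL", "AML", "ALL", "ALL", "AML"]
print(committee_predict(votes))
```

    Using an odd number of members avoids ties in the binary case; the ternary case would still need an explicit tie rule.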

  18. Object-Based Classification of Abandoned Logging Roads under Heavy Canopy Using LiDAR

    Directory of Open Access Journals (Sweden)

    Jason Sherba

    2014-05-01

    Full Text Available LiDAR-derived slope models may be used to detect abandoned logging roads in steep forested terrain. An object-based classification approach of abandoned logging road detection was employed in this study. First, a slope model of the study site in Marin County, California was created from a LiDAR derived DEM. Multiresolution segmentation was applied to the slope model and road seed objects were iteratively grown into candidate objects. A road classification accuracy of 86% was achieved using this fully automated procedure and post processing increased this accuracy to 90%. In order to assess the sensitivity of the road classification to LiDAR ground point spacing, the LiDAR ground point cloud was repeatedly thinned by a fraction of 0.5 and the classification procedure was reapplied. The producer’s accuracy of the road classification declined from 79% with a ground point spacing of 0.91 to below 50% with a ground point spacing of 2, indicating the importance of high point density for accurate classification of abandoned logging roads.

  19. A Brief Summary of Dictionary Learning Based Approach for Classification (revised)

    CERN Document Server

    Kong, Shu

    2012-01-01

    This note presents some representative methods which are based on dictionary learning (DL) for classification. We do not review the sophisticated methods or frameworks that involve DL for classification, such as online DL and spatial pyramid matching (SPM); rather, we concentrate on the direct DL-based classification methods. Here, a so-called direct DL-based method is an approach that works directly within the DL framework by adding some meaningful penalty terms. The representative methods listed can be roughly divided into two categories: (1) those that directly make the dictionary discriminative and (2) those that force the sparse coefficients to be discriminative in order to increase the discrimination power of the dictionary. From this taxonomy, we can expect some extensions of these methods in future research.

  20. Research of Plant-Leaves Classification Algorithm Based on Supervised LLE

    Directory of Open Access Journals (Sweden)

    Yan Qing

    2013-06-01

    Full Text Available A new supervised LLE method based on the Fisher projection is proposed in this paper and combined with a new classification algorithm based on manifold learning to recognize plant leaves. First, the method uses the Fisher projection distance in place of the sample's geodesic distance, yielding a new supervised LLE algorithm. Then, a classification algorithm is adopted that uses the manifold reconstruction error to determine the sample's class directly. This algorithm makes better use of category information and effectively improves the recognition rate. It also has the advantage of easy parameter estimation. Experimental results on real-world plant leaf databases show an average recognition accuracy of up to 95.17%.

  1. Fuzzy-logic-based hybrid locomotion mode classification for an active pelvis orthosis: Preliminary results.

    Science.gov (United States)

    Yuan, Kebin; Parri, Andrea; Yan, Tingfang; Wang, Long; Munih, Marko; Vitiello, Nicola; Wang, Qining

    2015-01-01

    In this paper, we present a fuzzy-logic-based hybrid locomotion mode classification method for an active pelvis orthosis. Locomotion information measured by the onboard hip joint angle sensors and the pressure insoles is used to classify five locomotion modes, including two static modes (sitting, standing still), and three dynamic modes (level-ground walking, ascending stairs, and descending stairs). The proposed method classifies these two kinds of modes first by monitoring the variation of the relative hip joint angle between the two legs within a specific period. Static states are then classified by the time-based absolute hip joint angle. As for dynamic modes, a fuzzy-logic based method is proposed for the classification. Preliminary experimental results with three able-bodied subjects achieve an off-line classification accuracy higher than 99.49%.

  2. Fuzzy-logic-based hybrid locomotion mode classification for an active pelvis orthosis: Preliminary results.

    Science.gov (United States)

    Yuan, Kebin; Parri, Andrea; Yan, Tingfang; Wang, Long; Munih, Marko; Vitiello, Nicola; Wang, Qining

    2015-08-01

    In this paper, we present a fuzzy-logic-based hybrid locomotion mode classification method for an active pelvis orthosis. Locomotion information measured by the onboard hip joint angle sensors and the pressure insoles is used to classify five locomotion modes, including two static modes (sitting, standing still), and three dynamic modes (level-ground walking, ascending stairs, and descending stairs). The proposed method classifies these two kinds of modes first by monitoring the variation of the relative hip joint angle between the two legs within a specific period. Static states are then classified by the time-based absolute hip joint angle. As for dynamic modes, a fuzzy-logic based method is proposed for the classification. Preliminary experimental results with three able-bodied subjects achieve an off-line classification accuracy higher than 99.49%. PMID:26737144
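
    The first two stages described above (separating static from dynamic modes by the variation of the relative hip joint angle, then separating the static modes by absolute hip joint angle) can be sketched as follows. Both thresholds, the window contents, and the use of a simple range as the measure of variation are assumptions for illustration; the fuzzy-logic stage for dynamic modes is not shown.

```python
def classify_mode(left_angles, right_angles,
                  variation_threshold=5.0, sitting_threshold=60.0):
    """Label one window of hip joint angles (degrees).

    Thresholds are hypothetical and not taken from the paper.
    """
    # Stage 1: variation of the relative angle between the two legs.
    relative = [l - r for l, r in zip(left_angles, right_angles)]
    variation = max(relative) - min(relative)
    if variation > variation_threshold:
        return "dynamic"  # would be refined by the fuzzy-logic classifier

    # Stage 2: mean absolute hip angle separates sitting from standing.
    total = sum(abs(a) for a in left_angles) + sum(abs(a) for a in right_angles)
    mean_abs = total / (len(left_angles) + len(right_angles))
    return "sitting" if mean_abs > sitting_threshold else "standing"

print(classify_mode([0, 1, 0], [0, 0, 1]))
print(classify_mode([90, 91, 90], [89, 90, 90]))
print(classify_mode([20, -10, 20], [-10, 20, -10]))
```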

  3. Developing a novel nodal grading system to standardize nodal classification in gastric cancer patients with limited lymph node resection

    Institute of Scientific and Technical Information of China (English)

    2015-01-01

    Objective: To develop an easily applicable novel nodal grading system to improve the standardization of nodal classification in patients with limited lymphadenectomy. Methods: We formulated a new approach to nodal classification for this category of patients. The log-rank test was used for univariate analysis and the Cox proportional hazards model was used for univariate and multivariate analysis. We used linear trend χ2 tests, the likelihood ratio χ2 test and the Akaike information criterion (AIC) value to assess the homogeneity, discriminatory ability and monotonicity of gradients of the two nodal staging systems. Results: Statistical analysis supported that both the hypothesized N’ stage and the hypothesized TN’M stage outperform the present AJCC/UICC staging system. Conclusion: We developed an easily applicable and reproducible novel nodal grading system that has greater predictive value than the current AJCC/UICC staging system for classifying gastric cancer patients with limited lymphadenectomy.
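
    The AIC used to compare the two staging systems is computed from a fitted model's maximized log-likelihood and its number of parameters. The sketch below applies the standard formula to hypothetical log-likelihoods; the numbers are not from the study.

```python
def aic(log_likelihood, n_parameters):
    """Akaike information criterion, 2k - 2 ln(L); lower is better."""
    return 2 * n_parameters - 2 * log_likelihood

# Hypothetical maximized log-likelihoods of two Cox models,
# one fitted per staging system, each with three parameters.
current_staging = aic(log_likelihood=-512.4, n_parameters=3)
proposed_staging = aic(log_likelihood=-498.7, n_parameters=3)
print(proposed_staging < current_staging)
```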

  4. The polarimetric entropy classification of SAR based on the clustering and signal noise ration

    Science.gov (United States)

    Shi, Lei; Yang, Jie; Lang, Fengkai

    2009-10-01

    Wishart H/α/A classification is usually an effective unsupervised classification method. However, the anisotropy parameter (A) is unstable in areas with a low signal-to-noise ratio (SNR); in addition, the method produces many clusters that are of no use for manual recognition. To keep an excess of clusters from hampering manual recognition and the convergence of the iteration, and to address this drawback of Wishart classification, this paper introduces an enhanced unsupervised Wishart classification scheme for POLSAR data sets. The anisotropy parameter A is used to subdivide the targets after H/α classification; under high-SNR conditions it can subdivide homogeneous areas that cannot be separated using H/α alone, which is very useful for improving adaptability in difficult areas. However, the polarimetric target decomposition performed before classification is affected by SNR, so the SNR of local homogeneous areas must be evaluated. After directional edge detection templates are used to examine the direction of the POLSAR images, the results can be processed to estimate SNR. The SNR can then serve as a powerful tool to guide H/α/A classification. The scheme is able to correct misjudgments caused by the A parameter, for example by eliminating many insignificant spots on roads and in urban areas, and it performs well even in complex forest. To ease manual recognition, an agglomerative clustering algorithm based on the deviation-class method is used to merge clusters that are similar in the 3×3 polarimetric coherency matrix. The classification scheme is applied to a fully polarimetric L-band SAR image of the Foulum area, Denmark.

  5. Seabed Classification Using BP Neural Network Based on GA

    Institute of Scientific and Technical Information of China (English)

    Yang Fanlin; Liu Jingnan

    2003-01-01

    Side scan sonar imaging is one of the advanced methods for seabed study. To be usable in other projects, such as ocean engineering, the image needs to be classified according to the distribution of different classes of seabed materials. In this paper, the seabed image is classified with a BP neural network, and a genetic algorithm (GA) is adopted to train the network. The feature vector consists of average intensity, six texture statistics, and two fractal dimensions. It accounts not only for the spatial correlation between different pixels but also for terrain coarseness. Texture is described by statistics of the co-occurrence matrix. The Double Blanket algorithm is used to calculate the fractal dimension. Because a uniform fractal may not be sufficient to describe a seafloor, two dimensions are calculated, one from the upper blanket and one from the lower blanket. However, in a sonar image the fractal is directional, i.e., the dimension differs with direction. Dimensions differ between the across-track and along-track directions, so the average over four directions is used to solve this problem. Finally, the algorithm is verified on real data. One hidden layer with six nodes is adopted. With the GA, the BP network converges rapidly and accurately. The correct classification rate is 92.5%.
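
    Texture statistics of a co-occurrence matrix can be sketched as below. The abstract does not name its six statistics, so contrast and energy are chosen here only as two common examples; the tiny images and the single pixel offset are likewise assumptions.

```python
def cooccurrence_matrix(image, levels, dx=1, dy=0):
    """Count gray-level pairs (i, j) at offset (dx, dy) and normalize to sum 1."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[n / total for n in row] for row in counts]

def contrast(p):
    """Weighted sum of squared gray-level differences; 0 for uniform rows."""
    size = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(size) for j in range(size))

def energy(p):
    """Sum of squared probabilities (angular second moment)."""
    return sum(v * v for row in p for v in row)

# Hypothetical 2x2 patches quantized to two gray levels.
smooth = cooccurrence_matrix([[0, 0], [1, 1]], levels=2)
striped = cooccurrence_matrix([[0, 1], [0, 1]], levels=2)
print(contrast(smooth), contrast(striped))
```

    Averaging such statistics over several offsets (for example four directions) is a common way to reduce direction dependence, similar in spirit to the averaging described for the fractal dimension.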

  6. Energy Based Feature Extraction for Classification of Respiratory Signals Using Modified Threshold Based Algorithm

    Directory of Open Access Journals (Sweden)

    A.BHAVANI SANKAR,

    2010-10-01

    Full Text Available In this work, we carried out a detailed study of various features of the respiratory signal. Respiratory signals contain potentially precise information that could assist clinicians in making appropriate and timely decisions during sleep disorders and labour. The extraction and detection of sleep apnea from composite abdominal signals with powerful and advanced methodologies is becoming a very important requirement in apnea patient monitoring. The method we propose in this work is based on the extraction of four main features of the respiratory signal. The automatic signal classification starts by extracting signal features from 30 seconds of respiratory data through autoregressive (AR) modeling and other techniques. The four features are: signal energy, zero crossing frequency, dominant frequency estimated by AR, and strength of the dominant frequency based on AR. These features are then compared to threshold values and passed through a series of conditions to determine the signal category for each specific epoch.
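
    Two of the four features, signal energy and zero crossing frequency, are simple enough to sketch directly. The sample values, the sampling rate, and the threshold in the final comparison are hypothetical; the AR-based features and the paper's actual decision conditions are not reproduced here.

```python
def signal_energy(samples):
    """Sum of squared sample values over the epoch."""
    return sum(x * x for x in samples)

def zero_crossing_frequency(samples, sampling_rate_hz):
    """Number of sign changes per second over the epoch."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    duration_s = len(samples) / sampling_rate_hz
    return crossings / duration_s

# Hypothetical one-second epoch sampled at 4 Hz.
epoch = [1.0, -1.0, 1.0, -1.0]
energy_value = signal_energy(epoch)
zcf_value = zero_crossing_frequency(epoch, sampling_rate_hz=4)

# Hypothetical threshold test: flag epochs with very low energy.
low_energy = energy_value < 0.5
print(energy_value, zcf_value, low_energy)
```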

  7. Nanomaterials based biosensors for cancer biomarker detection

    Science.gov (United States)

    Malhotra, Bansi D.; Kumar, Saurabh; Mouli Pandey, Chandra

    2016-04-01

    Biosensors have enormous potential to contribute to the evolution of new molecular diagnostic techniques for patients suffering with cancerous diseases. A major obstacle preventing faster development of biosensors pertains to the fact that cancer is a highly complex set of diseases. The oncologists currently rely on a few biomarkers and histological characterization of tumors. Some of the signatures include epigenetic and genetic markers, protein profiles, changes in gene expression, and post-translational modifications of proteins. These molecular signatures offer new opportunities for development of biosensors for cancer detection. In this context, conducting paper has recently been found to play an important role towards the fabrication of a biosensor for cancer biomarker detection. In this paper we will focus on results of some of the recent studies obtained in our laboratories relating to fabrication and application of nanomaterial modified paper based biosensors for cancer biomarker detection.

  8. Semisupervised classification for hyperspectral image based on multi-decision labeling and deep feature learning

    Science.gov (United States)

    Ma, Xiaorui; Wang, Hongyu; Wang, Jie

    2016-10-01

    Semisupervised learning is widely used in hyperspectral image classification to deal with limited training samples; however, more of the information in hyperspectral images should be explored. In this paper, a novel semisupervised classification method based on multi-decision labeling and deep feature learning is presented to exploit as much information as possible for the classification task. First, the proposed method takes two decisions to pre-label each unlabeled sample: a local decision based on weighted neighborhood information is made by the surrounding samples, and a global decision based on deep learning is made by the most similar training samples. Then, unlabeled samples with high confidence are selected to extend the training set. Finally, a self decision, which depends on the sample's own features exploited by deep learning, is applied to the updated training set to extract spectral-spatial features and produce the classification map. Experimental results with real data indicate that it is an effective and promising semisupervised classification method for hyperspectral images.

  9. An application to pulmonary emphysema classification based on model of texton learning by sparse representation

    Science.gov (United States)

    Zhang, Min; Zhou, Xiangrong; Goshima, Satoshi; Chen, Huayue; Muramatsu, Chisako; Hara, Takeshi; Yokoyama, Ryojiro; Kanematsu, Masayuki; Fujita, Hiroshi

    2012-03-01

    We aim to use a new texton based texture classification method for the classification of pulmonary emphysema in computed tomography (CT) images of the lungs. Unlike conventional computer-aided diagnosis (CAD) methods for pulmonary emphysema classification, in this paper the texton dictionary is first learned by applying sparse representation (SR) to image patches in the training dataset. Then the SR coefficients of the test images over the dictionary are used to construct histograms for texture representation. Finally, classification is performed with a nearest neighbor classifier using a histogram dissimilarity measure as the distance. The proposed approach is tested on 3840 annotated regions of interest consisting of normal tissue and mild, moderate and severe pulmonary emphysema of three subtypes. The performance of the proposed system, with an accuracy of about 88%, is comparably higher than that of the state of the art method based on basic rotation invariant local binary pattern histograms and of the texture classification method based on texton learning by k-means, which performs almost the best among other approaches in the literature.
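
    The final step, nearest neighbor classification with a histogram dissimilarity measure, can be sketched as follows. The chi-square distance is one common choice and is an assumption here, since the abstract does not name the measure; the histograms and labels are hypothetical.

```python
def chi_square_distance(h1, h2):
    """Chi-square dissimilarity between two histograms of equal length."""
    return 0.5 * sum(
        (a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0
    )

def nearest_neighbor_label(query, training):
    """Return the label of the training histogram closest to the query.

    training is a list of (histogram, label) pairs.
    """
    closest = min(training, key=lambda item: chi_square_distance(query, item[0]))
    return closest[1]

# Hypothetical two-bin texton histograms with their annotated classes.
training_set = [
    ([0.9, 0.1], "normal"),
    ([0.2, 0.8], "severe"),
]
print(nearest_neighbor_label([0.8, 0.2], training_set))
```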

  10. Natural Language Processing Based Instrument for Classification of Free Text Medical Records

    Directory of Open Access Journals (Sweden)

    Manana Khachidze

    2016-01-01

    Full Text Available According to the Ministry of Labor, Health and Social Affairs of Georgia, a new health management system is to be introduced in the near future. In this context arises the problem of structuring and classifying documents containing the full history of medical services provided. The present work introduces an instrument for classification of medical records written in the Georgian language. It is the first attempt at such classification of Georgian-language medical records. In total, 24,855 examination records were studied. The documents were classified into three main groups (ultrasonography, endoscopy, and X-ray) and 13 subgroups using two well-known methods: Support Vector Machine (SVM) and K-Nearest Neighbor (KNN). The results demonstrated that both machine learning methods performed successfully, with SVM slightly ahead. In the process of classification a “shrink” method, based on feature selection, was introduced and applied. At the first stage of classification the results of the “shrink” case were better; however, at the second stage of classification into subclasses, 23% of all documents could not be linked to only one definite subclass (liver or binary system) due to common features characterizing these subclasses. The overall results of the study were successful.

  11. Semi-automatic classification of glaciovolcanic landforms: An object-based mapping approach based on geomorphometry

    Science.gov (United States)

    Pedersen, G. B. M.

    2016-02-01

    A new object-oriented approach is developed to classify glaciovolcanic landforms (Procedure A) and their landform elements boundaries (Procedure B). It utilizes the principle that glaciovolcanic edifices are geomorphometrically distinct from lava shields and plains (Pedersen and Grosse, 2014), and the approach is tested on data from Reykjanes Peninsula, Iceland. The outlined procedures utilize slope and profile curvature attribute maps (20 m/pixel) and the classified results are evaluated quantitatively through error matrix maps (Procedure A) and visual inspection (Procedure B). In procedure A, the highest obtained accuracy is 94.1%, but even simple mapping procedures provide good results (> 90% accuracy). Successful classification of glaciovolcanic landform element boundaries (Procedure B) is also achieved and this technique has the potential to delineate the transition from intraglacial to subaerial volcanic activity in orthographic view. This object-oriented approach based on geomorphometry overcomes issues with vegetation cover, which has been typically problematic for classification schemes utilizing spectral data. Furthermore, it handles complex edifice outlines well and is easily incorporated into a GIS environment, where results can be edited or fused with other mapping results. The approach outlined here is designed to map glaciovolcanic edifices within the Icelandic neovolcanic zone but may also be applied to similar subaerial or submarine volcanic settings, where steep volcanic edifices are surrounded by flat plains.

  12. A study on PC-based ultrasonic testing system using intelligent ultrasonic flaw classification software

    International Nuclear Information System (INIS)

    For convenient application of ultrasonic pattern recognition approaches in practical field inspection of weldments, we have developed an intelligent ultrasonic flaw classification system through the novel combination of two ingredients: 1) a PC-based ultrasonic testing system, and 2) intelligent ultrasonic flaw classification software with an invariant ultrasonic pattern recognition algorithm. Here, key aspects of this intelligent system are addressed, including the PC-based ultrasonic testing system, enhancement of performance through newly proposed ultrasonic features, and feature selection.

  13. Object Class Detection and Classification using Multi Scale Gradient and Corner Point based Shape Descriptors

    OpenAIRE

    Fernando, Basura; Karaoglu, Sezer; Saha, Sajib Kumar

    2015-01-01

    This paper presents novel multi scale gradient and corner point based shape descriptors. The novel multi scale gradient based shape descriptor is combined with generic Fourier descriptors to extract contour and region based shape information. A shape information based object class detection and classification technique with a random forest classifier has been optimized. The integrated descriptor proposed in this paper is robust to rotation, scale, translation, affine deformations, noisy contour...

  14. Research on Heuristic Feature Extraction and Classification of EEG Signal Based on BCI Data Set

    Directory of Open Access Journals (Sweden)

    Lijuan Duan

    2013-01-01

    In this study, an EEG signal classification framework is proposed. The framework contains three feature extraction methods that follow an optimization strategy. First, we selected optimal electrodes based on single-electrode classification performance and combined the data of all optimal electrodes as the feature. Then, we examined the contribution of each time span of the EEG signals for each electrode and joined the data of all optimal time spans together for classification. In addition, we further selected useful information from the original data using a genetic algorithm. Finally, performance was evaluated with Bayes and SVM classifiers on BCI 2003 Competition data set Ia. The accuracy of the genetic algorithm approach reached 91.81%. The experimental results show that our methods offer better performance for reliable classification of EEG signals.

  15. Protein Classification Based on Analysis of Local Sequence-Structure Correspondence

    Energy Technology Data Exchange (ETDEWEB)

    Zemla, A T

    2006-02-13

    The goal of this project was to develop an algorithm to detect and calculate common structural motifs in compared structures, and to define a set of numerical criteria to be used for fully automated motif-based protein structure classification. The Protein Data Bank (PDB) contains more than 33,000 experimentally solved protein structures, and the Structural Classification of Proteins (SCOP) database, a manual classification of these structures, cannot keep pace with the rapid growth of the PDB. In our approach, called STRALCP (STRucture ALignment based Clustering of Proteins), we generate detailed information about global and local similarities between a given set of structures, identify similar fragments that are conserved within the analyzed proteins, and use these conserved regions (detected structural motifs) to classify proteins.

  16. Feature selection by separability assessment of input spaces for transient stability classification based on neural networks

    Energy Technology Data Exchange (ETDEWEB)

    Tso, S.K. [City University of Hong Kong (China). Dept. of Manufacturing Engineering; Gu, X.P. [North China Electric Power University, Baoding (China). Dept. of Electrical Engineering

    2004-03-01

    Power system transient-stability assessment based on neural networks can usually be treated as a two-pattern classification problem separating the stable class from the unstable class. In such a classification problem, feature extraction and selection is the first important task to be carried out. This paper presents a new approach to feature selection using a new separability measure. By finding the 'inconsistent cases' in a sample set, a separability index of input spaces is defined. Using the defined separability index as the criterion, a breadth-first search technique is employed to find the minimal or optimal subsets of the initial feature set. The numerical results, based on extensive data obtained for the 10-unit 39-bus New England power system, demonstrate the effectiveness of the proposed approach in extracting the 'best combination' of features for improving the quality of transient-stability classification. (author)
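
    The abstract does not give the formula for the separability index, so the following is only a sketch under stated assumptions. An "inconsistent case" is taken here to be a pair of samples with different class labels whose values differ by at most a tolerance `tol` on every selected feature, and the index is the share of samples not involved in any such pair. Both definitions, the function names, and the toy data are hypothetical. The breadth-first search then tests all subsets of size 1, then size 2, and so on, and stops at the first size that reaches the target index.

```python
from itertools import combinations


def separability_index(samples, labels, features, tol=0.1):
    """Assumed index: share of samples not involved in any inconsistent case.

    An inconsistent case is assumed to be a pair of samples with different
    labels that differ by at most `tol` on every selected feature.
    """
    inconsistent = set()
    for i, j in combinations(range(len(samples)), 2):
        if labels[i] == labels[j]:
            continue
        if all(abs(samples[i][f] - samples[j][f]) <= tol for f in features):
            inconsistent.update((i, j))
    return 1.0 - len(inconsistent) / len(samples)


def minimal_subsets(samples, labels, n_features, target=1.0, tol=0.1):
    """Breadth-first search over feature subsets, smallest size first.

    Returns every subset of the smallest size whose index reaches `target`.
    """
    for size in range(1, n_features + 1):
        hits = [subset
                for subset in combinations(range(n_features), size)
                if separability_index(samples, labels, subset, tol) >= target]
        if hits:
            return hits
    return []


# Toy data: the class depends on features 0 and 1 jointly; feature 2 is constant.
samples = [(0.0, 0.0, 0.5), (0.0, 1.0, 0.5), (1.0, 0.0, 0.5), (1.0, 1.0, 0.5)]
labels = ["stable", "unstable", "unstable", "stable"]

best = minimal_subsets(samples, labels, n_features=3)
```

    On this toy set no single feature separates the classes, so the search moves on to pairs and returns the pair of features 0 and 1. Exhaustive breadth-first search grows combinatorially with the number of features, so it is practical only for modest feature sets or with pruning.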

  17. Feature Extraction with Ordered Mean Values for Content Based Image Classification

    Directory of Open Access Journals (Sweden)

    Sudeep Thepade

    2014-01-01

    Categorization of images into meaningful classes by efficient extraction of feature vectors from image datasets has been dependent on feature selection techniques. Traditionally, feature vector extraction has been carried out using different methods of image binarization based on the selection of a global, local, or mean threshold. This paper proposes a novel technique for feature extraction based on ordered mean values. The proposed technique was combined with feature extraction using the discrete sine transform (DST) for better classification results through multitechnique fusion. The novel methodology was compared to the traditional techniques used for feature extraction in content-based image classification. Three benchmark datasets, namely, the Wang dataset, the Oliva and Torralba (OT-Scene) dataset, and the Caltech dataset, were used for evaluation. The performance measures show the superiority of the proposed fusion technique, which combines ordered mean values and the discrete sine transform, over popular single-view feature extraction approaches for classification.

  18. Survey on Parameters of Fingerprint Classification Methods Based On Algorithmic Flow

    Directory of Open Access Journals (Sweden)

    Dimple Parekh

    2011-09-01

    Classification refers to assigning a given fingerprint to one of the existing classes already recognized in the literature. A search over all the records in the database takes a long time, so the goal is to reduce the size of the search space by choosing an appropriate subset of the database for search. Classifying fingerprint images is a very difficult pattern recognition problem, due to the minimal interclass variability and maximal intraclass variability. This paper presents a sequence flow diagram that will help in developing clarity on designing an algorithm for classification based on various parameters extracted from the fingerprint image. It discusses in brief the ways in which the parameters are extracted from the image. Existing fingerprint classification approaches are based on these parameters as input for classifying the image. Parameters such as orientation map, singular points, spurious singular points, ridge flow, transforms, and hybrid features are discussed in the paper.

  19. A Feature Selection Method for Large-Scale Network Traffic Classification Based on Spark

    Directory of Open Access Journals (Sweden)

    Yong Wang

    2016-02-01

    With the rapid increase of data scales in network traffic classification, selecting traffic features efficiently is becoming a major challenge. Although a number of traditional feature selection methods using the Hadoop-MapReduce framework have been proposed, their execution time is still unsatisfactory because of the numerous iterative computations during processing. To address this issue, this paper proposes an efficient feature selection method for network traffic based on a new parallel computing framework called Spark. In our approach, the complete feature set is first preprocessed based on the Fisher score, and a sequential forward search strategy is employed to generate subsets. The optimal feature subset is then selected using the continuous iterations of the Spark computing framework. The implementation demonstrates that, while maintaining classification accuracy, our method reduces the time cost of modeling and classification and significantly improves the execution efficiency of feature selection.
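
    The two selection steps named in this abstract, Fisher-score preprocessing followed by sequential forward search, can be sketched on a single machine as follows. This is not the authors' Spark implementation: the Fisher score uses the common between-class over within-class scatter form, and the nearest-centroid evaluator, the function names, and the toy traffic data are assumptions made for illustration.

```python
from statistics import mean


def fisher_score(values, labels):
    """Between-class scatter divided by within-class scatter for one feature."""
    overall = mean(values)
    between = within = 0.0
    for cls in set(labels):
        group = [v for v, y in zip(values, labels) if y == cls]
        mu = mean(group)
        between += len(group) * (mu - overall) ** 2
        within += sum((v - mu) ** 2 for v in group)
    return between / within if within > 0 else float("inf")


def forward_search(candidates, evaluate):
    """Greedy sequential forward search: repeatedly add the feature that gives
    the highest evaluate(subset); stop when no addition improves the score."""
    selected, best = [], float("-inf")
    remaining = list(candidates)
    while remaining:
        score, feature = max((evaluate(selected + [f]), f) for f in remaining)
        if score <= best:
            break
        selected.append(feature)
        remaining.remove(feature)
        best = score
    return selected, best


def centroid_accuracy(rows, labels, subset):
    """Hypothetical evaluator: training accuracy of a nearest-centroid rule."""
    classes = sorted(set(labels))
    centroids = {c: [mean(r[f] for r, y in zip(rows, labels) if y == c)
                     for f in subset] for c in classes}

    def predict(row):
        return min(classes, key=lambda c: sum(
            (row[f] - m) ** 2 for f, m in zip(subset, centroids[c])))

    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)


# Toy flows with three made-up features; only feature 0 separates the classes.
rows = [(1.0, 5.0, 0.2), (1.2, 3.0, 0.9), (0.9, 4.0, 0.4),
        (3.0, 5.1, 0.8), (3.1, 2.9, 0.3), (2.9, 4.2, 0.5)]
labels = ["web", "web", "web", "p2p", "p2p", "p2p"]

# Step 1: rank features by Fisher score and keep the top two as candidates.
scores = [fisher_score([r[f] for r in rows], labels) for f in range(3)]
ranked = sorted(range(3), key=lambda f: scores[f], reverse=True)

# Step 2: sequential forward search over the surviving candidates.
selected, best = forward_search(
    ranked[:2], lambda s: centroid_accuracy(rows, labels, s))
```

    The Fisher-score filter is cheap and independent per feature, which is what makes it a natural first pass before the more expensive wrapper-style forward search; in a distributed setting each candidate evaluation inside one search iteration could be run in parallel.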

  20. Application of In-Segment Multiple Sampling in Object-Based Classification

    Directory of Open Access Journals (Sweden)

    Nataša Đurić

    2014-12-01

    When object-based analysis is applied to very high-resolution imagery, pixels within the segments reveal large spectral inhomogeneity; their distribution can be considered complex rather than normal. When normality is violated, classification methods that rely on the assumption of normally distributed data are not as successful or accurate. It is hard to detect normality violations in small samples. The segmentation process produces segments that vary greatly in size, so samples can be very large or very small. This paper investigates whether the complexity within the segment can be addressed using multiple random sampling of segment pixels and multiple calculations of similarity measures. In order to analyze the effect sampling has on classification results, the statistics and probability value equations of the non-parametric two-sample Kolmogorov-Smirnov test and the parametric Student’s t-test are selected as similarity measures in the classification process. The performance of both classifiers was assessed on a WorldView-2 image for four land cover classes (roads, buildings, grass and trees) and compared to two commonly used object-based classifiers, k-Nearest Neighbor (k-NN) and Support Vector Machine (SVM). Both proposed classifiers showed a slight improvement in overall classification accuracy and produced more accurate classification maps when compared to the ground truth image.
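
    The idea of in-segment multiple sampling with a Kolmogorov-Smirnov similarity measure can be sketched as below. The two-sample KS statistic is the standard maximum gap between empirical distribution functions; the number of draws, the subsample size, the averaging of the statistic, the smallest-mean decision rule, and the toy pixel values are all assumptions for illustration and are not taken from the paper.

```python
import random
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the ECDFs."""
    a, b = sorted(a), sorted(b)
    return max(abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
               for x in a + b)


def classify_segment(pixels, class_samples, draws=20, size=30, seed=0):
    """Average the KS statistic over repeated random subsamples of the segment
    and return the class whose reference sample is most similar (smallest D)."""
    rng = random.Random(seed)
    mean_d = {}
    for name, reference in class_samples.items():
        total = 0.0
        for _ in range(draws):
            subsample = rng.sample(pixels, min(size, len(pixels)))
            total += ks_statistic(subsample, reference)
        mean_d[name] = total / draws
    return min(mean_d, key=mean_d.get), mean_d


# Toy single-band values (hypothetical reflectances, not from the paper).
class_samples = {
    "grass": [0.50 + 0.005 * i for i in range(40)],  # 0.500 to 0.695
    "road": [0.10 + 0.005 * i for i in range(40)],   # 0.100 to 0.295
}
segment = [0.48 + 0.002 * i for i in range(120)]     # 0.480 to 0.718

label, mean_d = classify_segment(segment, class_samples)
```

    Because the segment values overlap the grass reference and lie entirely above the road reference, the averaged statistic is 1.0 for road and smaller for grass, so the segment is assigned to grass. Drawing fixed-size subsamples keeps the test's behavior comparable across segments of very different sizes, which is the motivation the abstract gives for multiple sampling.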