WorldWideScience

Sample records for cancer classification based

  1. Module-Based Breast Cancer Classification

    OpenAIRE

    Zhang, Yuji; Xuan, Jianhua; Clarke, Robert; Ressom, Habtom W

    2013-01-01

    The reliability and reproducibility of gene biomarkers for classification of cancer patients have been challenged due to measurement noise and biological heterogeneity among patients. In this paper, we propose a novel module-based feature selection framework, which integrates biological network information and gene expression data to identify biomarkers not as individual genes but as functional modules. Results from four breast cancer studies demonstrate that the identified module biomarkers i...

  2. Nominated Texture Based Cervical Cancer Classification

    Directory of Open Access Journals (Sweden)

    Edwin Jayasingh Mariarputham

    2015-01-01

    Accurate classification of Pap smear images is a challenging task in medical image processing. It can be improved in two ways: by selecting suitable, well-defined, specific features, and by selecting the best classifier. This paper presents a nominated texture based cervical cancer (NTCC) classification system which classifies Pap smear images into one of seven classes. This is achieved by extracting well-defined texture features and selecting the best classifier. Seven sets of texture features (24 features) are extracted, which include relative size of nucleus and cytoplasm, dynamic range and first four moments of intensities of nucleus and cytoplasm, relative displacement of nucleus within the cytoplasm, gray level co-occurrence matrix, local binary pattern histogram, Tamura features, and edge orientation histogram. Several types of support vector machine (SVM) and neural network (NN) classifiers are used for the classification. The performance of the NTCC algorithm is tested and compared to other algorithms on the public image database of Herlev University Hospital, Denmark, with 917 Pap smear images. The output of the SVM is found to be best for most of the classes, with better results for the remaining classes.
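
One of the texture features named in the abstract above, the local binary pattern (LBP) histogram, can be illustrated with a minimal sketch. This is the basic 8-neighbour LBP on a grayscale grid, written here for illustration only; the function name `lbp_histogram`, the neighbour ordering, and the toy patch are assumptions, and the paper may use a different LBP variant.

```python
def lbp_histogram(image):
    """Return a 256-bin histogram of basic 8-neighbour LBP codes.

    `image` is a list of equal-length rows of gray levels. Border pixels
    are skipped because they lack a full 3x3 neighbourhood.
    """
    # Clockwise neighbour offsets starting at the top-left pixel
    # (the ordering is an arbitrary but fixed convention).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            center = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbour is at least as bright
                # as the centre pixel.
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            hist[code] += 1
    return hist


# Hypothetical 4x4 patch of gray levels (not data from the paper).
patch = [
    [10, 12, 11, 9],
    [13, 20, 8, 7],
    [9, 14, 15, 6],
    [8, 7, 6, 5],
]
hist = lbp_histogram(patch)
```

The resulting histogram is a fixed-length vector, which is why it can be concatenated with other texture descriptors before being passed to a classifier such as an SVM.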

  3. Multiclass cancer classification based on gene expression comparison

    OpenAIRE

    Yang Sitan; Naiman Daniel Q.

    2014-01-01

    As the complexity and heterogeneity of cancer is being increasingly appreciated through genomic analyses, microarray-based cancer classification comprising multiple discriminatory molecular markers is an emerging trend. Such multiclass classification problems pose new methodological and computational challenges for developing novel and effective statistical approaches. In this paper, we introduce a new approach for classifying multiple disease states associated with cancer based on gene expre...

  4. Human Cancer Classification: A Systems Biology- Based Model Integrating Morphology, Cancer Stem Cells, Proteomics, and Genomics

    OpenAIRE

    Halliday A Idikio

    2011-01-01

    Human cancer classification is currently based on the idea of cell of origin, light and electron microscopic attributes of the cancer. What is not yet integrated into cancer classification are the functional attributes of these cancer cells. Recent innovative techniques in biology have provided a wealth of information on the genomic, transcriptomic and proteomic changes in cancer cells. The emergence of the concept of cancer stem cells needs to be included in a classification model to capture...

  5. Pathological Bases for a Robust Application of Cancer Molecular Classification

    Directory of Open Access Journals (Sweden)

    Salvador J. Diaz-Cano

    2015-04-01

    Any robust classification system depends on its purpose and must refer to accepted standards, its strength relying on predictive values and a careful consideration of known factors that can affect its reliability. In this context, a molecular classification of human cancer must refer to the current gold standard (histological classification) and try to improve it with key prognosticators for metastatic potential, staging and grading. Although organ-specific examples have been published based on proteomics, transcriptomics and genomics evaluations, the most popular approach uses gene expression analysis as a direct correlate of cellular differentiation, which represents the key feature of the histological classification. RNA is a labile molecule that varies significantly according to the preservation protocol; its transcription reflects the adaptation of the tumor cells to the microenvironment, it can be passed through mechanisms of intercellular transference of genetic information (exosomes), and it is exposed to epigenetic modifications. More robust classifications should be based on stable molecules, at the genetic level represented by DNA to improve reliability, and its analysis must deal with the concept of intratumoral heterogeneity, which is at the origin of tumor progression and is the byproduct of the selection process during the clonal expansion and progression of neoplasms. The simultaneous analysis of multiple DNA targets and next generation sequencing offer the best practical approach for an analytical genomic classification of tumors.

  6. Cancer Data Clustering and Classification Based using Efnn_Pcamethod

    OpenAIRE

    J. Saranya; Hemalatha, R.

    2014-01-01

    One challenging area in the study of natural phenomenon data is the classification of the expression dataset into the correct classes. The distinctive nature of the available natural phenomenon data set is the foremost challenge. A further challenge, the massive number of extraneous attributes (genes), arises from the application domain of cancer classification. Though accuracy plays a major role in cancer classification, the biological conation is another key criterion, as any biological data...

  7. Ovarian Cancer Classification based on Mass Spectrometry Analysis of Sera

    Directory of Open Access Journals (Sweden)

    Baolin Wu

    2006-01-01

    In our previous study [1], we have compared the performance of a number of widely used discrimination methods for classifying ovarian cancer using Matrix Assisted Laser Desorption Ionization (MALDI) mass spectrometry data on serum samples obtained from Reflectron mode. Our results demonstrate good performance with a random forest classifier. In this follow-up study, to improve the molecular classification power of the MALDI platform for ovarian cancer disease, we expanded the mass range of the MS data by adding data acquired in Linear mode and evaluated the resultant decrease in classification error. A general statistical framework is proposed to obtain unbiased classification error estimates and to analyze the effects of sample size and number of selected m/z features on classification errors. We also emphasize the importance of combining biological knowledge and statistical analysis to obtain both biologically and statistically sound results. Our study shows improvement in classification accuracy upon expanding the mass range of the analysis. In order to obtain the best classification accuracies possible, we found that a relatively large training sample size is needed to obviate the sample variations. For the ovarian MS dataset that is the focus of the current study, our results show that approximately 20-40 m/z features are needed to achieve the best classification accuracy from MALDI-MS analysis of sera. Supplementary information can be found at http://bioinformatics.med.yale.edu/proteomics/BioSupp2.html.

  8. Hybrid Local Feature Selection In DNA Analysis Based Cancer Classification

    OpenAIRE

    Akila, M.; Mr.S.Senthamarai kannan

    2012-01-01

    Feature selection, as a preprocessing step to machine learning, is effective in reducing dimensionality, removing irrelevant data and increasing learning accuracy. The development of microarray dataset technology has supplied a large volume of data to many fields. In particular, it has been applied to prediction and diagnosis of cancer, so that it helps us to exactly predict and diagnose cancer. To precisely classify cancer we have to select genes related to cancer. The challenging task in ca...

  9. Cross-platform analysis of cancer microarray data improves gene expression based classification of phenotypes

    Directory of Open Access Journals (Sweden)

    Eils Roland

    2005-11-01

    Background: The extensive use of DNA microarray technology in the characterization of the cell transcriptome is leading to an ever increasing amount of microarray data from cancer studies. Although similar questions for the same type of cancer are addressed in these different studies, a comparative analysis of their results is hampered by the use of heterogeneous microarray platforms and analysis methods. Results: In contrast to a meta-analysis approach where results of different studies are combined on an interpretative level, we investigate here how to directly integrate raw microarray data from different studies for the purpose of supervised classification analysis. We use median rank scores and quantile discretization to derive numerically comparable measures of gene expression from different platforms. These transformed data are then used for training of classifiers based on support vector machines. We apply this approach to six publicly available cancer microarray gene expression data sets, which consist of three pairs of studies, each examining the same type of cancer, i.e. breast cancer, prostate cancer or acute myeloid leukemia. For each pair, one study was performed by means of cDNA microarrays and the other by means of oligonucleotide microarrays. In each pair, high classification accuracies (> 85%) were achieved with training and testing on data instances randomly chosen from both data sets in a cross-validation analysis. To exemplify the potential of this cross-platform classification analysis, we use two leukemia microarray data sets to show that important genes with regard to the biology of leukemia are selected in an integrated analysis, which are missed in either single-set analysis. Conclusion: Cross-platform classification of multiple cancer microarray data sets yields discriminative gene expression signatures that are found and validated on a large number of microarray samples, generated by different laboratories and

  10. Cancer pain: A critical review of mechanism-based classification and physical therapy management in palliative care

    Directory of Open Access Journals (Sweden)

    Senthil P Kumar

    2011-01-01

    Mechanism-based classification and physical therapy management of pain is essential to effectively manage painful symptoms in patients attending palliative care. The objective of this review is to provide a detailed review of mechanism-based classification and physical therapy management of patients with cancer pain. Cancer pain can be classified based upon pain symptoms, pain mechanisms and pain syndromes. Classification based upon mechanisms not only addresses the underlying pathophysiology but also provides us with an understanding behind patient's symptoms and treatment responses. Existing evidence suggests that the five mechanisms - central sensitization, peripheral sensitization, sympathetically maintained pain, nociceptive and cognitive-affective - operate in patients with cancer pain. Summary of studies showing evidence for physical therapy treatment methods for cancer pain follows with suggested therapeutic implications. Effective palliative physical therapy care using a mechanism-based classification model should be tailored to suit each patient's findings, using a biopsychosocial model of pain.

  11. Optimal search-based gene subset selection for gene array cancer classification.

    Science.gov (United States)

    Li, Jiexun; Su, Hua; Chen, Hsinchun; Futscher, Bernard W

    2007-07-01

    High dimensionality has been a major problem for gene array-based cancer classification. It is critical to identify marker genes for cancer diagnoses. We developed a framework of gene selection methods based on previous studies. This paper focuses on optimal search-based subset selection methods because they evaluate the group performance of genes and help to pinpoint global optimal set of marker genes. Notably, this paper is the first to introduce tabu search (TS) to gene selection from high-dimensional gene array data. Our comparative study of gene selection methods demonstrated the effectiveness of optimal search-based gene subset selection to identify cancer marker genes. TS was shown to be a promising tool for gene subset selection. PMID:17674622
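
The idea of tabu search (TS) for subset selection mentioned in the abstract above can be sketched as follows. This is a generic single-flip tabu search, not the authors' implementation; the function `tabu_search`, its parameters, and the toy scoring function are all assumptions made for illustration. In practice the score would be something like cross-validated classifier accuracy on the selected genes.

```python
import random


def tabu_search(n_features, score, n_iter=50, tabu_tenure=5, seed=0):
    """Search for a feature subset that maximizes `score`.

    Each move flips one feature in or out of the subset. A flipped
    feature becomes tabu for `tabu_tenure` iterations, which stops the
    search from immediately undoing its last moves.
    """
    rng = random.Random(seed)
    current = frozenset(i for i in range(n_features) if rng.random() < 0.5)
    best, best_score = current, score(current)
    tabu = {}  # feature index -> first iteration at which it may flip again
    for it in range(n_iter):
        candidates = []
        for f in range(n_features):
            neighbor = current ^ {f}  # flip feature f
            s = score(neighbor)
            is_tabu = tabu.get(f, -1) > it
            # Aspiration rule: a tabu move is allowed if it beats the best.
            if not is_tabu or s > best_score:
                candidates.append((s, f, neighbor))
        if not candidates:
            continue
        # Take the best admissible move, even if it worsens the score.
        s, f, neighbor = max(candidates, key=lambda c: (c[0], -c[1]))
        current = neighbor
        tabu[f] = it + tabu_tenure
        if s > best_score:
            best, best_score = neighbor, s
    return best, best_score


# Toy score (hypothetical): reward three "marker" features and lightly
# penalize every other selected feature.
MARKERS = frozenset({1, 4, 7})


def toy_score(subset):
    return len(subset & MARKERS) - 0.1 * len(subset - MARKERS)


best, best_score = tabu_search(n_features=10, score=toy_score)
```

Because the score is evaluated on whole subsets rather than on genes one at a time, the search reflects the "group performance" emphasis of the abstract; the tabu list is what distinguishes it from plain hill climbing.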

  12. Classification of Cancer Gene Selection Using Random Forest and Neural Network Based Ensemble Classifier

    Directory of Open Access Journals (Sweden)

    Jogendra Kushwah

    2013-06-01

    The free radical gene classification of cancer diseases is a challenging job in biomedical data engineering. Various classifiers are used to improve the classification of gene selection for cancer diseases, but the classifications of these classifiers are not validated. An ensemble classifier is therefore used for cancer gene classification, combining a neural network classifier with a random forest tree. The random forest tree is an ensembling technique of classifier; in this technique the number of classifier ensemble of their leaf node of class of classifier. In this paper we combine a neural network with a random forest ensemble classifier for classification of cancer gene selection for diagnostic analysis of cancer diseases. The proposed method differs from most ensemble classifier methods, which follow an input-output paradigm of neural networks, where the members of the ensemble are selected from a set of neural network classifiers. The number of classifiers is determined during the rising procedure of the forest. Furthermore, the proposed method produces an ensemble that is not only correct but also assorted, ensuring the two important properties that should characterize an ensemble classifier. For empirical evaluation of our proposed method we used the UCI cancer diseases data set for classification. Our experimental results show better results in comparison with random forest tree classification.

  13. A minimum spanning forest based hyperspectral image classification method for cancerous tissue detection

    Science.gov (United States)

    Pike, Robert; Patton, Samuel K.; Lu, Guolan; Halig, Luma V.; Wang, Dongsheng; Chen, Zhuo Georgia; Fei, Baowei

    2014-03-01

    Hyperspectral imaging is a developing modality for cancer detection. The rich information associated with hyperspectral images allows for comparison between cancerous and healthy tissue. This study focuses on a new method that incorporates support vector machines into a minimum spanning forest algorithm for differentiating cancerous tissue from normal tissue. Spectral information was gathered to test the algorithm. Animal experiments were performed and hyperspectral images were acquired from tumor-bearing mice. In vivo imaging experimental results demonstrate the applicability of the proposed classification method for cancer tissue classification on hyperspectral images.

  14. Diagnostic Classification of Normal Persons and Cancer Patients by Using Neural Network Based on Trace Metal Contents in Serum Samples

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    An artificial neural network with the back-propagation approach (BP-ANN) was applied to the classification of normal persons and various cancer patients based on the elemental contents in serum samples. This method was verified by the cross-validation method. The effects of the network parameters were investigated and the related problems were discussed. The samples of 72, 42, and 52 for lung, liver, and stomach cancer patients and normal persons, respectively, were used for the classification study. About 95% of the samples can be classified correctly. Therefore, the method can be used as an auxiliary means of the diagnosis of cancer.

  15. Classification of Cancer Gene Selection Using Random Forest and Neural Network Based Ensemble Classifier

    Directory of Open Access Journals (Sweden)

    Jogendra Kushwah

    2013-06-01

    The free radical gene classification of cancer diseases is a challenging job in biomedical data engineering. Various classifiers are used to improve the classification of gene selection for cancer diseases, but the classifications of these classifiers are not validated. An ensemble classifier is therefore used for cancer gene classification, combining a neural network classifier with a random forest tree. The random forest tree is an ensembling technique of classifier; in this technique the number of classifier ensemble of their leaf node of class of classifier. In this paper we combine a neural network with a random forest ensemble classifier for classification of cancer gene selection for diagnostic analysis of cancer diseases. The proposed method differs from most ensemble classifier methods, which follow an input-output paradigm of neural networks, where the members of the ensemble are selected from a set of neural network classifiers. The number of classifiers is determined during the rising procedure of the forest. Furthermore, the proposed method produces an ensemble that is not only correct but also assorted, ensuring the two important properties that should characterize an ensemble classifier. For empirical evaluation of our proposed method we used the UCI cancer diseases data set for classification. Our experimental results show better results in comparison with random forest tree classification.

  16. Swarm Intelligence Approach Based on Adaptive ELM Classifier with ICGA Selection for Microarray Gene Expression and Cancer Classification

    Directory of Open Access Journals (Sweden)

    T. Karthikeyan

    2014-05-01

    The aim of this research study is efficient gene selection and classification of microarray data using hybrid machine learning algorithms. The advent of microarray technology has enabled researchers to quickly measure the position of thousands of genes expressed in organic/biological tissue samples in a single experiment. One of the important applications of microarray technology is to classify tissue samples using their gene expression representation and to identify numerous types of cancer. Cancer is a group of diseases in which a set of cells shows uncontrolled growth, invasion that intrudes upon and destroys nearby tissues, and spreading to other locations in the body via lymph or blood. Cancer has become one of the most important diseases in the current scenario. DNA microarrays have turned out to be an effective tool in molecular biology and cancer diagnosis. Microarrays can be used to establish the relative quantity of mRNAs in two or more organic/biological tissue samples for thousands of genes at the same time. With the superiority of this technique established, the exact analysis and suitable assessment of microarray data remain open issues. In the field of medical sciences, multi-category cancer classification plays a major role in classifying cancer types according to gene expression. The need for cancer classification has become indispensable, because the number of cancer victims has increased steadily in recent years. To address this, a combination of an Integer-Coded Genetic Algorithm (ICGA) and the Artificial Bee Colony algorithm (ABC), coupled with an Adaptive Extreme Learning Machine (AELM), is proposed for gene selection and cancer classification. ICGA is used with the ABC-based AELM classifier to choose an optimal set of genes, which results in an efficient hybrid algorithm that can handle sparse data and sample imbalance. The

  17. Proteomic classification of breast cancer.

    LENUS (Irish Health Repository)

    Kamel, Dalia

    2012-11-01

    Being a significant health problem that affects patients in various age groups, breast cancer has been extensively studied to date. Recently, molecular breast cancer classification has advanced significantly with the availability of genomic profiling technologies. Proteomic technologies have also advanced from traditional protein assays including enzyme-linked immunosorbent assay, immunoblotting and immunohistochemistry to more comprehensive approaches including mass spectrometry and reverse phase protein lysate arrays (RPPA). The purpose of this manuscript is to review the current protein markers that influence breast cancer prediction and prognosis and to focus on novel advances in proteomic classification of breast cancer.

  18. Molecular classification of gastric cancer.

    Science.gov (United States)

    Chia, N-Y; Tan, P

    2016-05-01

    Gastric cancer (GC), a heterogeneous disease characterized by epidemiologic and histopathologic differences across countries, is a leading cause of cancer-related death. Treatment of GC patients is currently suboptimal due to patients being commonly treated in a uniform fashion irrespective of disease subtype. With the advent of next-generation sequencing and other genomic technologies, GCs are now being investigated in great detail at the molecular level. High-throughput technologies now allow a comprehensive study of genomic and epigenomic alterations associated with GC. Gene mutations, chromosomal aberrations, differential gene expression and epigenetic alterations are some of the genetic/epigenetic influences on GC pathogenesis. In addition, integrative analyses of molecular profiling data have led to the identification of key dysregulated pathways and importantly, the establishment of GC molecular classifiers. Recently, The Cancer Genome Atlas (TCGA) network proposed a four subtype classification scheme for GC based on the underlying tumor molecular biology of each subtype. This landmark study, together with other studies, has expanded our understanding on the characteristics of GC at the molecular level. Such knowledge may improve the medical management of GC in the future. PMID:26861606

  19. Cancer classification in the genomic era: five contemporary problems

    OpenAIRE

    Song, Qingxuan; Merajver, Sofia D.; Li, Jun Z.

    2015-01-01

    Classification is an everyday instinct as well as a full-fledged scientific discipline. Throughout the history of medicine, disease classification is central to how we develop knowledge, make diagnosis, and assign treatment. Here, we discuss the classification of cancer and the process of categorizing cancer subtypes based on their observed clinical and biological features. Traditionally, cancer nomenclature is primarily based on organ location, e.g., “lung cancer” designates a tumor originat...

  20. Integrating Colon Cancer Microarray Data: Associating Locus-Specific Methylation Groups to Gene Expression-Based Classifications

    Directory of Open Access Journals (Sweden)

    Ana Barat

    2015-11-01

    Recently, considerable attention has been paid to gene expression-based classifications of colorectal cancers (CRC) and their association with patient prognosis. In addition to changes in gene expression, abnormal DNA-methylation is known to play an important role in cancer onset and development, and colon cancer is no exception to this rule. Large-scale technologies, such as methylation microarray assays and specific sequencing of methylated DNA, have been used to determine whole genome profiles of CpG island methylation in tissue samples. In this article, publicly available microarray-based gene expression and methylation data sets are used to characterize expression subtypes with respect to locus-specific methylation. A major objective was to determine whether integration of these data types improves previously characterized subtypes, or provides evidence for additional subtypes. We used unsupervised clustering techniques to determine methylation-based subgroups, which are subsequently annotated with three published expression-based classifications, comprising from three to six subtypes. Our results showed that, while methylation profiles provide a further basis for segregation of certain (Inflammatory and Goblet-like) finer-grained expression-based subtypes, they also suggest that other finer-grained subtypes are not distinctive and can be considered as a single subtype.

  1. Interval-value Based Particle Swarm Optimization algorithm for cancer-type specific gene selection and sample classification.

    Science.gov (United States)

    Ramyachitra, D; Sofia, M; Manikandan, P

    2015-09-01

    Microarray technology allows simultaneous measurement of the expression levels of thousands of genes within a biological tissue sample. The fundamental power of microarrays lies within the ability to conduct parallel surveys of gene expression using microarray data. The classification of tissue samples based on gene expression data is an important problem in medical diagnosis of diseases such as cancer. In gene expression data, the number of genes is usually very high compared to the number of data samples. Thus the difficulty lies in the high dimensionality of the data and the small sample size. This research work addresses the problem by classifying the resultant dataset using existing algorithms such as Support Vector Machine (SVM), K-nearest neighbor (KNN), Interval Valued Classification (IVC) and the improvised Interval Value based Particle Swarm Optimization (IVPSO) algorithm. The results show that the IVPSO algorithm outperformed the other algorithms under several performance evaluation functions. PMID:26484222

  2. Interval-value Based Particle Swarm Optimization algorithm for cancer-type specific gene selection and sample classification

    Directory of Open Access Journals (Sweden)

    D. Ramyachitra

    2015-09-01

    Microarray technology allows simultaneous measurement of the expression levels of thousands of genes within a biological tissue sample. The fundamental power of microarrays lies within the ability to conduct parallel surveys of gene expression using microarray data. The classification of tissue samples based on gene expression data is an important problem in medical diagnosis of diseases such as cancer. In gene expression data, the number of genes is usually very high compared to the number of data samples. Thus the difficulty lies in the high dimensionality of the data and the small sample size. This research work addresses the problem by classifying the resultant dataset using existing algorithms such as Support Vector Machine (SVM), K-nearest neighbor (KNN), Interval Valued Classification (IVC) and the improvised Interval Value based Particle Swarm Optimization (IVPSO) algorithm. The results show that the IVPSO algorithm outperformed the other algorithms under several performance evaluation functions.

  3. Breast cancer detection and classification in digital mammography based on Non-Subsampled Contourlet Transform (NSCT) and Super Resolution.

    Science.gov (United States)

    Pak, Fatemeh; Kanan, Hamidreza Rashidy; Alikhassi, Afsaneh

    2015-11-01

    Breast cancer is one of the most perilous diseases among women. Breast screening is a method of detecting breast cancer at a very early stage which can reduce the mortality rate. Mammography is a standard method for the early diagnosis of breast cancer. In this paper, a new algorithm is proposed for breast cancer detection and classification in digital mammography based on Non-Subsampled Contourlet Transform (NSCT) and Super Resolution (SR). The presented algorithm comprises three main parts: pre-processing, feature extraction and classification. In the pre-processing stage, after determining the region of interest (ROI) by an automatic technique, the quality of the image is improved using NSCT and the SR algorithm. In the feature extraction part, several features of the image components are extracted and the skewness of each feature is calculated. Finally, the AdaBoost algorithm is used to classify and determine the probability of benign and malignant disease. The obtained results on the Mammographic Image Analysis Society (MIAS) database indicate the significant performance and superiority of the proposed method in comparison with the state of the art approaches. According to the obtained results, the proposed technique achieves 91.43% and 6.42% as a mean accuracy and FPR, respectively. PMID:26206406

  4. A comprehensive comparison of random forests and support vector machines for microarray-based cancer classification

    Directory of Open Access Journals (Sweden)

    Wang Lily

    2008-07-01

    Background: Cancer diagnosis and clinical outcome prediction are among the most important emerging applications of gene expression microarray technology with several molecular signatures on their way toward clinical deployment. Use of the most accurate classification algorithms available for microarray gene expression data is a critical ingredient in order to develop the best possible molecular signatures for patient care. As suggested by a large body of literature to date, support vector machines can be considered "best of class" algorithms for classification of such data. Recent work, however, suggests that random forest classifiers may outperform support vector machines in this domain. Results: In the present paper we identify methodological biases of prior work comparing random forests and support vector machines and conduct a new rigorous evaluation of the two algorithms that corrects these limitations. Our experiments use 22 diagnostic and prognostic datasets and show that support vector machines outperform random forests, often by a large margin. Our data also underlines the importance of sound research design in benchmarking and comparison of bioinformatics algorithms. Conclusion: We found that both on average and in the majority of microarray datasets, random forests are outperformed by support vector machines both in the settings when no gene selection is performed and when several popular gene selection methods are used.

  5. Lung cancer gene expression database analysis incorporating prior knowledge with support vector machine-based classification method

    Directory of Open Access Journals (Sweden)

    Huang Desheng

    2009-07-01

    Background: A reliable and precise classification is essential for successful diagnosis and treatment of cancer. Gene expression microarrays have provided the high-throughput platform to discover genomic biomarkers for cancer diagnosis and prognosis. Rational use of the available bioinformation can not only effectively remove or suppress noise in gene chips, but also avoid one-sided results of separate experiment. However, only some studies have been aware of the importance of prior information in cancer classification. Methods: Together with the application of support vector machine as the discriminant approach, we proposed one modified method that incorporated prior knowledge into cancer classification based on gene expression data to improve accuracy. A public well-known dataset, Malignant pleural mesothelioma and lung adenocarcinoma gene expression database, was used in this study. Prior knowledge is viewed here as a means of directing the classifier using known lung adenocarcinoma related genes. The procedures were performed by software R 2.80. Results: The modified method performed better after incorporating prior knowledge. Accuracy of the modified method improved from 98.86% to 100% in training set and from 98.51% to 99.06% in test set. The standard deviations of the modified method decreased from 0.26% to 0 in training set and from 3.04% to 2.10% in test set. Conclusion: The method that incorporates prior knowledge into discriminant analysis could effectively improve the capacity and reduce the impact of noise. This idea may have good future not only in practice but also in methodology.

  6. Pitch Based Sound Classification

    OpenAIRE

    Nielsen, Andreas Brinch; Hansen, Lars Kai; Kjems, U.

    2006-01-01

    A sound classification model is presented that can classify signals into music, noise and speech. The model extracts the pitch of the signal using the harmonic product spectrum. Based on the pitch estimate and a pitch error measure, features are created and used in a probabilistic model with soft-max output function. Both linear and quadratic inputs are used. The model is trained on 2 hours of sound and tested on publicly available data. A test classification error below 0.05 with 1 s classif...
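
    The harmonic product spectrum mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the naive DFT, the choice of three harmonics and the synthetic test tone are all assumptions made here for illustration.

```python
import cmath
import math

def magnitude_spectrum(signal):
    """Magnitudes of the first n/2 DFT bins (naive O(n^2) transform)."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n // 2)
    ]

def hps_pitch(signal, sample_rate, num_harmonics=3):
    """Estimate pitch as the bin maximising |X[k]| * |X[2k]| * ... (HPS)."""
    mags = magnitude_spectrum(signal)
    n = len(signal)
    max_bin = (len(mags) - 1) // num_harmonics  # keep h*k inside the spectrum
    best_bin, best_val = 1, -1.0
    for k in range(1, max_bin + 1):  # skip the DC bin
        product = 1.0
        for h in range(1, num_harmonics + 1):
            product *= mags[h * k]
        if product > best_val:
            best_bin, best_val = k, product
    return best_bin * sample_rate / n

# Hypothetical test tone: 100 Hz fundamental that is weaker than its
# second harmonic, so a plain peak search would report 200 Hz instead.
sample_rate = 2000
n_samples = 400
harmonics = [(100.0, 0.5), (200.0, 1.0), (300.0, 0.8)]
signal = [
    sum(a * math.sin(2 * math.pi * f * t / sample_rate) for f, a in harmonics)
    for t in range(n_samples)
]
pitch = hps_pitch(signal, sample_rate)
```

    Multiplying the spectrum by downsampled copies of itself rewards the bin whose integer multiples also carry energy, which is why the sketch recovers the fundamental even though it is not the strongest peak.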

  7. Laser Raman detection for oral cancer based on a Gaussian process classification method

    Science.gov (United States)

    Du, Zhanwei; Yang, Yongjian; Bai, Yuan; Wang, Lijun; Zhang, Chijun; Chen, He; Luo, Yusheng; Su, Le; Chen, Yong; Li, Xianchang; Zhou, Xiaodong; Jia, Jun; Shen, Aiguo; Hu, Jiming

    2013-06-01

    Oral squamous cell carcinoma is the most common neoplasm of the oral cavity. It accounts for 80% of total oral cancer and its incidence has shown an upward trend in recent years. It has a high degree of malignancy and is difficult to detect in terms of differential diagnosis, as a consequence of which the timing of treatment is always delayed. In this work, Raman spectroscopy was adopted to differentially diagnose oral squamous cell carcinoma and oral gland carcinoma. In total, 852 entries of raw spectral data, consisting of 631 items from 36 oral squamous cell carcinoma patients, 87 items from four oral gland carcinoma patients and 134 items from five normal people, were collected by utilizing an optical method on oral tissues. The probability distribution of the datasets corresponding to the spectral peaks of the oral squamous cell carcinoma tissue was analyzed and the experimental result showed that the data obeyed a normal distribution. Moreover, the distribution characteristic of the noise was also in compliance with a Gaussian distribution. A Gaussian process (GP) classification method was utilized to distinguish the normal people and the oral gland carcinoma patients from the oral squamous cell carcinoma patients. The experimental results showed that all the normal people could be recognized. 83.33% of the oral squamous cell carcinoma patients could be correctly diagnosed and the remaining ones would be diagnosed as having oral gland carcinoma. For the classification process of oral gland carcinoma and oral squamous cell carcinoma, the correct ratio was 66.67% and the erroneously diagnosed percentage was 33.33%. The total sensitivity was 80% and the specificity was 100%, with a Matthews correlation coefficient (MCC) of 0.447213595. Considering the numerical results above, the application prospects and clinical value of this technique are impressive.
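
    Sensitivity, specificity and the Matthews correlation coefficient reported in this abstract are all derived from a binary confusion matrix. The sketch below shows the standard formulas; the function name and the counts are hypothetical and are not taken from the study.

```python
import math

def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, MCC) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally taken as 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc

# Hypothetical counts chosen only to illustrate the arithmetic.
sens, spec, mcc = binary_metrics(tp=8, fn=2, tn=5, fp=0)
```

    Sensitivity and specificity alone do not determine the MCC, because the coefficient also depends on how many positive and negative cases there are; two studies with identical sensitivity and specificity can therefore report different MCC values.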

  8. Laser Raman detection for oral cancer based on a Gaussian process classification method

    International Nuclear Information System (INIS)

    Oral squamous cell carcinoma is the most common neoplasm of the oral cavity. It accounts for 80% of total oral cancer and its incidence has shown an upward trend in recent years. It has a high degree of malignancy and is difficult to detect in terms of differential diagnosis, as a consequence of which the timing of treatment is always delayed. In this work, Raman spectroscopy was adopted to differentially diagnose oral squamous cell carcinoma and oral gland carcinoma. In total, 852 entries of raw spectral data, consisting of 631 items from 36 oral squamous cell carcinoma patients, 87 items from four oral gland carcinoma patients and 134 items from five normal people, were collected by utilizing an optical method on oral tissues. The probability distribution of the datasets corresponding to the spectral peaks of the oral squamous cell carcinoma tissue was analyzed and the experimental result showed that the data obeyed a normal distribution. Moreover, the distribution characteristic of the noise was also in compliance with a Gaussian distribution. A Gaussian process (GP) classification method was utilized to distinguish the normal people and the oral gland carcinoma patients from the oral squamous cell carcinoma patients. The experimental results showed that all the normal people could be recognized. 83.33% of the oral squamous cell carcinoma patients could be correctly diagnosed and the remaining ones would be diagnosed as having oral gland carcinoma. For the classification process of oral gland carcinoma and oral squamous cell carcinoma, the correct ratio was 66.67% and the erroneously diagnosed percentage was 33.33%. The total sensitivity was 80% and the specificity was 100%, with a Matthews correlation coefficient (MCC) of 0.447213595. Considering the numerical results above, the application prospects and clinical value of this technique are impressive. (letter)

  9. Improved classification of lung cancer tumors based on structural and physicochemical properties of proteins using data mining models.

    Directory of Open Access Journals (Sweden)

    R Geetha Ramani

    Full Text Available Detecting divergence between oncogenic tumors plays a pivotal role in cancer diagnosis and therapy. This research work focused on designing a computational strategy to predict the class of lung cancer tumors from the structural and physicochemical properties (1497 attributes) of protein sequences obtained from genes defined by microarray analysis. The proposed methodology involved the use of hybrid feature selection techniques (gain ratio and correlation-based subset evaluators with Incremental Feature Selection) followed by Bayesian Network prediction to discriminate lung cancer tumors as Small Cell Lung Cancer (SCLC), Non-Small Cell Lung Cancer (NSCLC) and the COMMON classes. Moreover, this methodology eliminated the need for extensive data cleansing strategies on the protein properties and revealed the optimal and minimal set of features that contributed to lung cancer tumor classification with an improved accuracy compared to previous work. We also attempted to predict via supervised clustering the possible clusters in the lung tumor data. Our results revealed that supervised clustering algorithms exhibited poor performance in differentiating the lung tumor classes. Hybrid feature selection identified the distribution of solvent accessibility, polarizability and hydrophobicity as the highest ranked features, with Incremental Feature Selection and Bayesian Network prediction generating the optimal jack-knife cross-validation accuracy of 87.6%. Precise categorization of oncogenic genes causing SCLC and NSCLC based on the structural and physicochemical properties of their protein sequences is expected to unravel the functionality of proteins that are essential in maintaining the genomic integrity of a cell and also act as an informative source for drug design, targeting essential protein properties and their composition that are found to exist in lung cancer tumors.

  10. Mechanism-based classification and physical therapy management of persons with cancer pain: A prospective case series

    Directory of Open Access Journals (Sweden)

    Senthil P Kumar

    2013-01-01

    Full Text Available Context: Mechanism-based classification (MBC) was established with current evidence and physical therapy (PT) management methods for both cancer and noncancer pain. Aims: This study aims to describe the efficacy of MBC-based PT in persons with primary complaints of cancer pain. Settings and Design: A prospective case series of patients who attended the physiotherapy department of a multispecialty university-affiliated teaching hospital. Material and Methods: A total of 24 adults (18 female, 6 male) aged 47.5 ± 10.6 years, with a primary diagnosis of a heterogeneous group of cancers and chief complaints of chronic disabling pain, were included in the study on their consent for participation. The patients were evaluated and classified on the basis of five predominant mechanisms for pain. Physical therapy interventions were recommended based on the mechanisms identified, and a home program was prescribed with a patient log to ensure compliance. Treatments were given in five consecutive weekly sessions for five weeks, each of 30 min duration. Statistical Analysis Used: Pre-post comparisons for the pain severity (PS) and pain interference (PI) subscales of the Brief Pain Inventory-Cancer Pain (BPI-CP) and the European Organization for Research and Treatment in Cancer quality of life questionnaire (EORTC-QLQ-C30) were done using the Wilcoxon signed-rank test at 95% confidence interval using SPSS for Windows version 16.0 (SPSS Inc, Chicago, IL). Results: There were statistically significant (P < 0.05) reductions in pain severity, pain interference and total BPI-CP scores, and the EORTC-QLQ-C30. Conclusion: MBC-PT was effective for improving BPI-CP and EORTC-QLQ-C30 scores in people with cancer pain.

  11. Accurate molecular classification of cancer using simple rules

    OpenAIRE

    Gotoh Osamu; Wang Xiaosheng

    2009-01-01

    Abstract Background One intractable problem with using microarray data analysis for cancer classification is how to reduce the extremely high-dimensionality gene feature data to remove the effects of noise. Feature selection is often used to address this problem by selecting informative genes from among thousands or tens of thousands of genes. However, most of the existing methods of microarray-based cancer classification utilize too many genes to achieve accurate classification, which often ...

  12. An iterated Laplacian based semi-supervised dimensionality reduction for classification of breast cancer on ultrasound images.

    Science.gov (United States)

    Liu, Xiao; Shi, Jun; Zhou, Shichong; Lu, Minhua

    2014-01-01

    Dimensionality reduction is an important step in ultrasound image-based computer-aided diagnosis (CAD) for breast cancer. A newly proposed l2,1-regularized correntropy algorithm for robust feature selection (CRFS) has achieved good performance on noise-corrupted data. Therefore, it has the potential to reduce the dimensions of ultrasound image features. However, in clinical practice, the collection of labeled instances is usually expensive and time-consuming, while it is relatively easy to acquire unlabeled or undetermined instances. Therefore, semi-supervised learning is very suitable for clinical CAD. The iterated Laplacian regularization (Iter-LR) is a new regularization method, which has been shown to outperform the traditional graph Laplacian regularization in semi-supervised classification and ranking. In this study, to improve the classification accuracy of breast ultrasound CAD based on texture features, we propose an Iter-LR-based semi-supervised CRFS (Iter-LR-CRFS) algorithm, and then apply it to reduce the feature dimensions of ultrasound images for breast CAD. We compared the Iter-LR-CRFS with LR-CRFS, original supervised CRFS, and principal component analysis. The experimental results indicate that the proposed Iter-LR-CRFS significantly outperforms all other algorithms. PMID:25571036
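
    For readers unfamiliar with the notation, the regularizers named in this abstract are commonly written as below. This is a sketch in standard textbook notation, not the paper's own formulation: the symbols W (a d-by-c feature weight matrix), f (a vector of label scores), A (an affinity matrix on the samples), D (its diagonal degree matrix), L (the graph Laplacian) and the exponent m are assumptions introduced here.

```latex
% l2,1 norm: sum of the Euclidean norms of the rows of W
\|W\|_{2,1} = \sum_{i=1}^{d} \sqrt{\sum_{j=1}^{c} W_{ij}^{2}}

% graph Laplacian regularizer, with L = D - A
f^{\top} L f = \frac{1}{2} \sum_{i,j} A_{ij} \, (f_i - f_j)^{2}

% iterated Laplacian regularizer, as it is usually defined
f^{\top} L^{m} f, \qquad m \ge 1
```

    Because the l2,1 norm penalizes each row of W as a unit, it tends to drive whole rows to zero, which is what makes it usable for feature selection. The iterated form reduces to the ordinary graph Laplacian penalty when m = 1.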

  13. Pitch Based Sound Classification

    DEFF Research Database (Denmark)

    Nielsen, Andreas Brinch; Hansen, Lars Kai; Kjems, U

    2006-01-01

    A sound classification model is presented that can classify signals into music, noise and speech. The model extracts the pitch of the signal using the harmonic product spectrum. Based on the pitch estimate and a pitch error measure, features are created and used in a probabilistic model with soft...

  14. Cluster Based Text Classification Model

    DEFF Research Database (Denmark)

    Nizamani, Sarwat; Memon, Nasrullah; Wiil, Uffe Kock

    2011-01-01

    We propose a cluster-based classification model for suspicious email detection and other text classification tasks. The text classification tasks comprise many training examples that require a complex classification model. Using clusters for classification makes the model simpler and increases the...... classifier is trained on each cluster having reduced dimensionality and a smaller number of examples. The experimental results show that the proposed model outperforms the existing classification models for the task of suspicious email detection and topic categorization on the Reuters-21578 and 20 Newsgroups...... datasets. Our model also outperforms A Decision Cluster Classification (ADCC) and the Decision Cluster Forest Classification (DCFC) models on the Reuters-21578 dataset....

  15. Clinical study of quantitative diagnosis of early cervical cancer based on the classification of acetowhitening kinetics

    Science.gov (United States)

    Wu, Tao; Cheung, Tak-Hong; Yim, So-Fan; Qu, Jianan Y.

    2010-03-01

    A quantitative colposcopic imaging system for the diagnosis of early cervical cancer is evaluated in a clinical study. This imaging technology based on 3-D active stereo vision and motion tracking extracts diagnostic information from the kinetics of acetowhitening process measured from the cervix of human subjects in vivo. Acetowhitening kinetics measured from 137 cervical sites of 57 subjects are analyzed and classified using multivariate statistical algorithms. Cross-validation methods are used to evaluate the performance of the diagnostic algorithms. The results show that an algorithm for screening precancer produced 95% sensitivity (SE) and 96% specificity (SP) for discriminating normal and human papillomavirus (HPV)-infected tissues from cervical intraepithelial neoplasia (CIN) lesions. For a diagnostic algorithm, 91% SE and 90% SP are achieved for discriminating normal tissue, HPV infected tissue, and low-grade CIN lesions from high-grade CIN lesions. The results demonstrate that the quantitative colposcopic imaging system could provide objective screening and diagnostic information for early detection of cervical cancer.

  16. Computerized three-class classification of MRI-based prognostic markers for breast cancer

    International Nuclear Information System (INIS)

    The purpose of this study is to investigate whether computerized analysis using three-class Bayesian artificial neural network (BANN) feature selection and classification can characterize tumor grades (grade 1, grade 2 and grade 3) of breast lesions for prognostic classification on DCE-MRI. A database of 26 IDC grade 1 lesions, 86 IDC grade 2 lesions and 58 IDC grade 3 lesions was collected. The computer automatically segmented the lesions, and kinetic and morphological lesion features were automatically extracted. The discrimination tasks-grade 1 versus grade 3, grade 2 versus grade 3, and grade 1 versus grade 2 lesions-were investigated. Step-wise feature selection was conducted by three-class BANNs. Classification was performed with three-class BANNs using leave-one-lesion-out cross-validation to yield computer-estimated probabilities of being grade 3 lesion, grade 2 lesion and grade 1 lesion. Two-class ROC analysis was used to evaluate the performances. We achieved AUC values of 0.80 ± 0.05, 0.78 ± 0.05 and 0.62 ± 0.05 for grade 1 versus grade 3, grade 1 versus grade 2, and grade 2 versus grade 3, respectively. This study shows the potential for (1) applying three-class BANN feature selection and classification to CADx and (2) expanding the role of DCE-MRI CADx from diagnostic to prognostic classification in distinguishing tumor grades.
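
    The two-class AUC values quoted in this abstract can be read as the probability that a randomly chosen lesion of the higher grade receives a higher computer-estimated score than a randomly chosen lesion of the lower grade. A minimal sketch of that pairwise computation follows; the function name and the scores are hypothetical and are not data from the study.

```python
def roc_auc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison (ties count half)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical classifier outputs: estimated probability of being grade 3.
grade3_scores = [0.9, 0.8, 0.6, 0.4]
grade1_scores = [0.7, 0.3, 0.2]
auc = roc_auc(grade3_scores, grade1_scores)
```

    An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which puts the reported 0.62 for grade 2 versus grade 3 much closer to chance than the 0.80 for grade 1 versus grade 3.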

  17. Molecular Classification and Correlates in Colorectal Cancer

    OpenAIRE

    Ogino, Shuji; Goel, Ajay

    2008-01-01

    Molecular classification of colorectal cancer is evolving. As our understanding of colorectal carcinogenesis improves, we are incorporating new knowledge into the classification system. In particular, global genomic status [microsatellite instability (MSI) status and chromosomal instability (CIN) status] and epigenomic status [CpG island methylator phenotype (CIMP) status] play a significant role in determining clinical, pathological and biological characteristics of colorectal cancer. In thi...

  18. Gene selection and classification for cancer microarray data based on machine learning and similarity measures

    Directory of Open Access Journals (Sweden)

    Liu Qingzhong

    2011-12-01

    Full Text Available Abstract Background Microarray data have a high dimension of variables and a small sample size. In microarray data analyses, two important issues are how to choose genes, which provide reliable and good prediction for disease status, and how to determine the final gene set that is best for classification. Associations among genetic markers mean one can exploit information redundancy to potentially reduce classification cost in terms of time and money. Results To deal with redundant information and improve classification, we propose a gene selection method, Recursive Feature Addition, which combines supervised learning and statistical similarity measures. To determine the final optimal gene set for prediction and classification, we propose an algorithm, Lagging Prediction Peephole Optimization. By using six benchmark microarray gene expression data sets, we compared Recursive Feature Addition with recently developed gene selection methods: Support Vector Machine Recursive Feature Elimination, Leave-One-Out Calculation Sequential Forward Selection and several others. Conclusions On average, with the use of popular learning machines including Nearest Mean Scaled Classifier, Support Vector Machine, Naive Bayes Classifier and Random Forest, Recursive Feature Addition outperformed other methods. Our studies also showed that Lagging Prediction Peephole Optimization is superior to random strategy; Recursive Feature Addition with Lagging Prediction Peephole Optimization obtained better testing accuracies than the gene selection method varSelRF.

  19. Accurate molecular classification of cancer using simple rules

    Directory of Open Access Journals (Sweden)

    Gotoh Osamu

    2009-10-01

    Full Text Available Abstract Background: One intractable problem with using microarray data analysis for cancer classification is how to reduce the extremely high-dimensionality gene feature data to remove the effects of noise. Feature selection is often used to address this problem by selecting informative genes from among thousands or tens of thousands of genes. However, most of the existing methods of microarray-based cancer classification utilize too many genes to achieve accurate classification, which often hampers the interpretability of the models. For a better understanding of the classification results, it is desirable to develop simpler rule-based models with as few marker genes as possible. Methods: We screened a small number of informative single genes and gene pairs on the basis of their depended degrees proposed in rough sets. Applying the decision rules induced by the selected genes or gene pairs, we constructed cancer classifiers. We tested the efficacy of the classifiers by leave-one-out cross-validation (LOOCV) of training sets and classification of independent test sets. Results: We applied our methods to five cancerous gene expression datasets: leukemia (acute lymphoblastic leukemia [ALL] vs. acute myeloid leukemia [AML]), lung cancer, prostate cancer, breast cancer, and leukemia (ALL vs. mixed-lineage leukemia [MLL] vs. AML). Accurate classification outcomes were obtained by utilizing just one or two genes. Some genes that correlated closely with the pathogenesis of relevant cancers were identified. In terms of both classification performance and algorithm simplicity, our approach outperformed or at least matched existing methods. Conclusion: In cancerous gene expression datasets, a small number of genes, even one or two if selected correctly, is capable of achieving an ideal cancer classification effect. This finding also means that very simple rules may perform well for cancerous class prediction.
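
    Leave-one-out cross-validation of a single-gene rule can be sketched as follows. This is an illustration only: the rule used here is a simple midpoint threshold between class means, swapped in for the rough-set-based rule induction the abstract describes, and the function names and expression values are hypothetical.

```python
def fit_single_gene_rule(values, labels):
    """Learn a threshold rule for two classes: midpoint between class means."""
    class_a, class_b = sorted(set(labels))
    a_vals = [v for v, l in zip(values, labels) if l == class_a]
    b_vals = [v for v, l in zip(values, labels) if l == class_b]
    mean_a = sum(a_vals) / len(a_vals)
    mean_b = sum(b_vals) / len(b_vals)
    threshold = (mean_a + mean_b) / 2
    # Record which class sits above the threshold and which below.
    high, low = (class_a, class_b) if mean_a > mean_b else (class_b, class_a)
    return threshold, high, low

def predict(rule, value):
    threshold, high, low = rule
    return high if value > threshold else low

def loocv_accuracy(values, labels):
    """Hold out each sample once, refit on the rest, and score the prediction."""
    correct = 0
    for i in range(len(values)):
        rule = fit_single_gene_rule(values[:i] + values[i + 1:],
                                    labels[:i] + labels[i + 1:])
        correct += predict(rule, values[i]) == labels[i]
    return correct / len(values)

# Hypothetical expression levels of one marker gene in eight samples.
expression = [2.1, 2.5, 1.9, 2.3, 6.8, 7.2, 6.5, 7.0]
classes = ["ALL"] * 4 + ["AML"] * 4
accuracy = loocv_accuracy(expression, classes)
```

    The rule is refit inside every fold so that the held-out sample never influences its own threshold; fitting once on all samples and then scoring them would overstate the accuracy.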

  20. Confocal Raman imaging for cancer cell classification

    Science.gov (United States)

    Mathieu, Evelien; Van Dorpe, Pol; Stakenborg, Tim; Liu, Chengxun; Lagae, Liesbet

    2014-05-01

    We propose confocal Raman imaging as a label-free single cell characterization method that can be used as an alternative to conventional cell identification techniques, which typically require labels, long incubation times and complex sample preparation. This study investigates whether cancer and blood cells can be distinguished based on their Raman spectra. 2D Raman scans are recorded of 114 single cells, i.e. 60 breast (MCF-7), 5 cervix (HeLa) and 39 prostate (LNCaP) cancer cells and 10 monocytes (from healthy donors). For each cell an average spectrum is calculated and principal component analysis is performed on all average cell spectra. The main features of these principal components indicate that the information for cell identification based on Raman spectra mainly comes from the fatty acid composition in the cell. Based on the second and third principal components, blood cells could be distinguished from cancer cells, and prostate cancer cells could be distinguished from breast and cervix cancer cells. However, it was not possible to distinguish breast and cervix cancer cells from each other. The results obtained in this study demonstrate the potential of confocal Raman imaging for cell type classification and identification purposes.

  1. Classification, staging and prognosis of lung cancer

    International Nuclear Information System (INIS)

    Lung cancer has increased in incidence throughout the twentieth century and is now the most common cancer in the Western World. It has a poor prognosis: only 10-15% of patients survive 5 years or longer. Outcome is dependent on clinical stage and cancer cell type. Lung cancer is broadly subclassified on the basis of histological features into squamous cell carcinoma, adenocarcinoma, large cell carcinoma and small cell carcinoma. The histopathological type of lung cancer correlates with tumour behaviour and prognosis. Staging based on prognosis is essential in clinical trials comparing different management strategies, and enables universal communication regarding the efficacy of different treatments in specific patient groups. The anatomic extent of disease, determined either preoperatively using imaging supplemented by invasive procedures such as mediastinoscopy and anterior mediastinotomy, or following resection, is described according to the T (primary tumour), N (regional lymph nodes), M (distant metastasis) classification. The International System for Staging Lung Cancer attempts to group together patients with similar prognosis and treatment options. Various combinations of T, N, and M define different clinical or surgical-pathological stages (IA-IV) characterised by different survival characteristics. Refinements in staging based on imaging findings have enabled clinical staging to more accurately reflect the surgical-pathological stage and therefore more accurately predict prognosis. Recent advances, including the use of positron emission tomography in combination with conventional staging, promise to increase the accuracy of staging and therefore to reduce the number of invasive staging procedures and inappropriate thoracotomies.

  2. Reproducibility of histologic classification of gastric cancer.

    OpenAIRE

    Palli, D; Bianchi, S.; Cipriani, F; Duca, P; Amorosi, A; C. Avellini; A. Russo; Saragoni, A; P. Todde; Valdes, E.

    1991-01-01

    A panel review of histologic specimens was carried out as part of a multi-centre case-control study of gastric cancer (GC) and diet. Comparisons of diagnoses of 100 GCs by six pathologists revealed agreement in histologic classification for about 70-80% of the cancers. Concordance was somewhat higher when using the Lauren rather than the Ming or World Health Organization classification systems. Histologic types from reading biopsy tissue agreed with those derived from surgical specimens for 6...

  3. A Gene Selection Approach based on Clustering for Classification Tasks in Colon Cancer

    Directory of Open Access Journals (Sweden)

    José Antonio CASTELLANOS GARZÓN

    2016-06-01

    Full Text Available Gene selection (GS) is an important research area in the analysis of DNA-microarray data, since it involves discovering genes that are meaningful for a particular target annotation or able to discriminate expression profiles of samples coming from different populations. In this context, a large number of filter methods have been proposed in the literature to identify subsets of relevant genes in accordance with prefixed targets. Despite the large number of proposals, the complexity imposed by this problem (GS) remains a challenge. Hence, this paper proposes a novel approach for gene selection that applies cluster techniques and then filter methods on the resulting groupings to obtain informative gene subsets. Applying our methodology to Colon cancer data, we identified the most informative gene subset among several candidate subsets. The results obtained demonstrate the reliability of the approach given in this paper.

  4. Laser Raman detection for oral cancer based on an adaptive Gaussian process classification method with posterior probabilities

    Science.gov (United States)

    Du, Zhanwei; Yang, Yongjian; Bai, Yuan; Wang, Lijun; Su, Le; Chen, Yong; Li, Xianchang; Zhou, Xiaodong; Jia, Jun; Shen, Aiguo; Hu, Jiming

    2013-03-01

    The existing methods for early and differential diagnosis of oral cancer are limited due to the unapparent early symptoms and the imperfect imaging examination methods. In this paper, the classification models of oral adenocarcinoma, carcinoma tissues and a control group with just four features are established by utilizing the hybrid Gaussian process (HGP) classification algorithm, with the introduction of the mechanisms of noise reduction and posterior probability. HGP shows much better performance in the experimental results. During the experimental process, oral tissues were divided into three groups, adenocarcinoma (n = 87), carcinoma (n = 100) and the control group (n = 134). The spectral data for these groups were collected. The prospective application of the proposed HGP classification method improved the diagnostic sensitivity to 56.35% and the specificity to about 70.00%, and resulted in a Matthews correlation coefficient (MCC) of 0.36. It is proved that the utilization of HGP in LRS detection analysis for the diagnosis of oral cancer gives accurate results. The prospect of application is also satisfactory.

  5. Laser Raman detection for oral cancer based on an adaptive Gaussian process classification method with posterior probabilities

    International Nuclear Information System (INIS)

    The existing methods for early and differential diagnosis of oral cancer are limited due to the unapparent early symptoms and the imperfect imaging examination methods. In this paper, the classification models of oral adenocarcinoma, carcinoma tissues and a control group with just four features are established by utilizing the hybrid Gaussian process (HGP) classification algorithm, with the introduction of the mechanisms of noise reduction and posterior probability. HGP shows much better performance in the experimental results. During the experimental process, oral tissues were divided into three groups, adenocarcinoma (n = 87), carcinoma (n = 100) and the control group (n = 134). The spectral data for these groups were collected. The prospective application of the proposed HGP classification method improved the diagnostic sensitivity to 56.35% and the specificity to about 70.00%, and resulted in a Matthews correlation coefficient (MCC) of 0.36. It is proved that the utilization of HGP in LRS detection analysis for the diagnosis of oral cancer gives accurate results. The prospect of application is also satisfactory. (paper)

  6. CLASSIFICATION OF SEVERAL SKIN CANCER TYPES BASED ON AUTOFLUORESCENCE INTENSITY OF VISIBLE LIGHT TO NEAR INFRARED RATIO

    Directory of Open Access Journals (Sweden)

    Aryo Tedjo

    2009-12-01

    Full Text Available Skin cancer is a malignant growth on the skin caused by many factors. The most common skin cancers are Basal Cell Cancer (BCC) and Squamous Cell Cancer (SCC). This research uses discriminant analysis to classify several skin cancer tissues based on a number of independent variables. The independent variables are combinations of excitation light sources (LED lamps), filters, and sensors used to measure the Autofluorescence Intensity (IAF) visible light to near infrared (VIS/NIR) ratio of paraffin-embedded tissue biopsies from BCC, SCC, and Lipoma. The discriminant analysis shows that the discriminant function is determined by four independent variables: Blue LED-Red Filter, Blue LED-Yellow Filter, UV LED-Blue Filter, and UV LED-Yellow Filter. The accuracy of the discriminant function in classifying the three tissue types is 100%.

  7. A New Classification Approach Based on Multiple Classification Rules

    OpenAIRE

    Zhongmei Zhou

    2014-01-01

    A good classifier can correctly predict new data for which the class label is unknown, so it is important to construct a high-accuracy classifier. Hence, classification techniques are very useful in ubiquitous computing. Associative classification achieves higher classification accuracy than some traditional rule-based classification approaches. However, the approach also has two major deficiencies. First, it generates a very large number of association classification rules, especially when t...

  8. Magnetic resonance imaging texture analysis classification of primary breast cancer

    Energy Technology Data Exchange (ETDEWEB)

    Waugh, S.A.; Lerski, R.A. [Ninewells Hospital and Medical School, Department of Medical Physics, Dundee (United Kingdom); Purdie, C.A.; Jordan, L.B. [Ninewells Hospital and Medical School, Department of Pathology, Dundee (United Kingdom); Vinnicombe, S. [University of Dundee, Division of Imaging and Technology, Ninewells Hospital and Medical School, Dundee (United Kingdom); Martin, P. [Ninewells Hospital and Medical School, Department of Clinical Radiology, Dundee (United Kingdom); Thompson, A.M. [University of Texas MD Anderson Cancer Center, Department of Surgical Oncology, Houston, TX (United States)

    2016-02-15

    Patient-tailored treatments for breast cancer are based on histological and immunohistochemical (IHC) subtypes. Magnetic Resonance Imaging (MRI) texture analysis (TA) may be useful in non-invasive lesion subtype classification. Women with newly diagnosed primary breast cancer underwent pre-treatment dynamic contrast-enhanced breast MRI. TA was performed using co-occurrence matrix (COM) features, by creating a model on retrospective training data, then prospectively applying it to a test set. Analyses were blinded to breast pathology. Subtype classifications were performed using a cross-validated k-nearest-neighbour (k = 3) technique, with accuracy relative to pathology assessed and the area under the receiver operating characteristic curve (AUROC) calculated. Mann-Whitney U and Kruskal-Wallis tests were used to assess raw entropy feature values. Histological subtype classifications were similar across training (n = 148 cancers) and test sets (n = 73 lesions) using all COM features (training: 75%, AUROC = 0.816; test: 72.5%, AUROC = 0.823). Entropy features were significantly different between lobular and ductal cancers (p < 0.001; Mann-Whitney U). IHC classifications using COM features were also similar for training and test data (training: 57.2%, AUROC = 0.754; test: 57.0%, AUROC = 0.750). Hormone receptor positive and negative cancers demonstrated significantly different entropy features. Entropy features alone were unable to create a robust classification model. Textural differences on contrast-enhanced MR images may reflect underlying lesion subtypes, which merits testing against treatment response. (orig.)
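
    A cross-validated k-nearest-neighbour classifier with k = 3, as named in this abstract, can be sketched in a few lines. The abstract does not say which cross-validation scheme was used, so leave-one-out is assumed here; the function names, the two-dimensional feature vectors and the labels are likewise hypothetical stand-ins for the study's texture features.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points closest to the query."""
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

def loocv_accuracy(features, labels, k=3):
    """Leave-one-out cross-validation: each sample is classified by the rest."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        correct += knn_predict(train_x, train_y, features[i], k) == labels[i]
    return correct / len(features)

# Hypothetical two-feature texture vectors for eight lesions.
features = [(0.20, 0.30), (0.25, 0.35), (0.30, 0.25), (0.22, 0.28),
            (0.80, 0.70), (0.75, 0.80), (0.85, 0.75), (0.78, 0.72)]
labels = ["ductal"] * 4 + ["lobular"] * 4
accuracy = loocv_accuracy(features, labels)
```

    With two classes an odd k such as 3 avoids tied votes. Features measured on different scales would normally be standardized first, since Euclidean distance otherwise lets the largest-valued feature dominate.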

  9. Magnetic resonance imaging texture analysis classification of primary breast cancer

    International Nuclear Information System (INIS)

    Patient-tailored treatments for breast cancer are based on histological and immunohistochemical (IHC) subtypes. Magnetic Resonance Imaging (MRI) texture analysis (TA) may be useful in non-invasive lesion subtype classification. Women with newly diagnosed primary breast cancer underwent pre-treatment dynamic contrast-enhanced breast MRI. TA was performed using co-occurrence matrix (COM) features, by creating a model on retrospective training data, then prospectively applying to a test set. Analyses were blinded to breast pathology. Subtype classifications were performed using a cross-validated k-nearest-neighbour (k = 3) technique, with accuracy relative to pathology assessed and receiver operator curve (AUROC) calculated. Mann-Whitney U and Kruskal-Wallis tests were used to assess raw entropy feature values. Histological subtype classifications were similar across training (n = 148 cancers) and test sets (n = 73 lesions) using all COM features (training: 75 %, AUROC = 0.816; test: 72.5 %, AUROC = 0.823). Entropy features were significantly different between lobular and ductal cancers (p < 0.001; Mann-Whitney U). IHC classifications using COM features were also similar for training and test data (training: 57.2 %, AUROC = 0.754; test: 57.0 %, AUROC = 0.750). Hormone receptor positive and negative cancers demonstrated significantly different entropy features. Entropy features alone were unable to create a robust classification model. Textural differences on contrast-enhanced MR images may reflect underlying lesion subtypes, which merits testing against treatment response. (orig.)

  10. The staging and classification of cancers

    International Nuclear Information System (INIS)

    The primary theme of this chapter is to place in perspective the process and purpose of tumor staging for both the oncologist and the radiologist. The foundation used for this discussion will be the Manual for Staging of Cancer-1983, published by the American Joint Committee on Cancer. Because the specifics of staging and classification by body site will be the province of the chapters that follow, only a broad discussion of the principles and applications of this system is given.

  11. Reproducibility of histologic classification of gastric cancer.

    Science.gov (United States)

    Palli, D.; Bianchi, S.; Cipriani, F.; Duca, P.; Amorosi, A.; Avellini, C.; Russo, A.; Saragoni, A.; Todde, P.; Valdes, E.

    1991-01-01

    A panel review of histologic specimens was carried out as part of a multi-centre case-control study of gastric cancer (GC) and diet. Comparisons of diagnoses of 100 GCs by six pathologists revealed agreement in histologic classification for about 70-80% of the cancers. Concordance was somewhat higher when using the Lauren rather than the Ming or World Health Organization classification systems. Histologic types from reading biopsy tissue agreed with those derived from surgical specimens for 65-75% of the 100 tumours. Intra-observer agreement in histologic classification, assessed by repeat readings up to 3 years apart by one pathologist, was 95%. The findings indicate that, although overall concordance was good, it is important to standardise diagnoses in multi-centre epidemiologic studies of GC by histologic type. PMID:2039701
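
    The agreement figures in this abstract are simple proportions of matching diagnoses. A minimal sketch of that calculation follows. Cohen's kappa is included as a common chance-corrected companion statistic, which the abstract itself does not report. The ratings are invented toy data using the two main Lauren types.

```python
from collections import Counter

def percent_agreement(a, b):
    """Fraction of cases on which two readings assign the same class."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohen_kappa(a, b):
    """Chance-corrected agreement between two readings.

    Undefined (division by zero) if both readings use one identical class throughout.
    """
    n = len(a)
    observed = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Agreement expected by chance from each reader's class frequencies.
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical readings of four tumours by two pathologists.
reader_1 = ["intestinal", "intestinal", "diffuse", "diffuse"]
reader_2 = ["intestinal", "diffuse", "diffuse", "diffuse"]
print(percent_agreement(reader_1, reader_2), cohen_kappa(reader_1, reader_2))
```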

  12. Biogeography based Satellite Image Classification

    CERN Document Server

    Panchal, V K; Kaur, Navdeep; Kundra, Harish

    2009-01-01

    Biogeography is the study of the geographical distribution of biological organisms. The mindset of the engineer is that we can learn from nature. Biogeography Based Optimization is a burgeoning nature inspired technique to find the optimal solution of the problem. Satellite image classification is an important task because it is the only way we can know about the land cover map of inaccessible areas. Though satellite images have been classified in the past using various techniques, researchers are always seeking alternative strategies for satellite image classification so that they may be prepared to select the most appropriate technique for the feature extraction task at hand. This paper is focused on classification of the satellite image of a particular land cover using the theory of Biogeography based Optimization. The original BBO algorithm does not have the inbuilt property of clustering which is required during image classification. Hence modifications have been proposed to the original algorithm and...

  13. Reverse phase protein array based tumor profiling identifies a biomarker signature for risk classification of hormone receptor-positive breast cancer

    Directory of Open Access Journals (Sweden)

    Johanna Sonntag

    2014-03-01

    Full Text Available A robust subclassification of luminal breast cancer, the most common molecular subtype of human breast cancer, is crucial for therapy decisions. While some patients are at higher risk of recurrence and require chemo-endocrine treatment, others are at lower risk and also respond poorly to chemotherapeutic regimens. To approximate the risk of cancer recurrence, clinical guidelines recommend determining histologic grading and abundance of a cell proliferation marker in tumor specimens. However, this approach assigns an intermediate risk to a substantial number of patients and in addition suffers from a high interobserver variability. Therefore, the aim of our study was to identify a quantitative protein biomarker signature to facilitate risk classification. Reverse phase protein arrays (RPPA) were used to obtain quantitative expression data for 128 breast cancer relevant proteins in a set of hormone receptor-positive tumors (n = 109). Proteomic data for the subset of histologic G1 (n = 14) and G3 (n = 22) samples were used for biomarker discovery, serving as surrogates of low and high recurrence risk, respectively. A novel biomarker selection workflow based on combining three different classification methods identified caveolin-1, NDKA, RPS6, and Ki-67 as top candidates. NDKA, RPS6, and Ki-67 were expressed at elevated levels in high risk tumors whereas caveolin-1 was observed as downregulated. The identified biomarker signature was subsequently analyzed using an independent test set (AUC = 0.78). Further evaluation of the identified biomarker panel by Western blot and mRNA profiling confirmed the proteomic signature obtained by RPPA. In conclusion, the biomarker signature introduced supports RPPA as a tool for cancer biomarker discovery.

  14. Classification-based reasoning

    Science.gov (United States)

    Gomez, Fernando; Segami, Carlos

    1991-01-01

    A representation formalism for N-ary relations, quantification, and definition of concepts is described. Three types of conditions are associated with the concepts: (1) necessary and sufficient properties, (2) contingent properties, and (3) necessary properties. Also explained is how complex chains of inferences can be accomplished by representing existentially quantified sentences, and concepts denoted by restrictive relative clauses as classification hierarchies. The representation structures that make possible the inferences are explained first, followed by the reasoning algorithms that draw the inferences from the knowledge structures. All the ideas explained have been implemented and are part of the information retrieval component of a program called Snowy. An appendix contains a brief session with the program.

  15. Tolerance to missing data using a likelihood ratio based classifier for computer-aided classification of breast cancer

    International Nuclear Information System (INIS)

    While mammography is a highly sensitive method for detecting breast tumours, its ability to differentiate between malignant and benign lesions is low, which may result in as many as 70% of unnecessary biopsies. The purpose of this study was to develop a highly specific computer-aided diagnosis algorithm to improve classification of mammographic masses. A classifier based on the likelihood ratio was developed to accommodate cases with missing data. Data for development included 671 biopsy cases (245 malignant), with biopsy-proved outcome. Sixteen features based on the BI-RADS™ lexicon and patient history had been recorded for the cases, with 1.3 ± 1.1 missing feature values per case. Classifier evaluation methods included receiver operating characteristic and leave-one-out bootstrap sampling. The classifier achieved 32% specificity at 100% sensitivity on the 671 cases with 16 features that had missing values. Utilizing just the seven features present for all cases resulted in decreased performance at 100% sensitivity with average 19% specificity. No cases and no feature data were omitted during classifier development, showing that it is more beneficial to utilize cases with missing values than to discard incomplete cases that cannot be handled by many algorithms. Classification of mammographic masses was commendable at high sensitivity levels, indicating that benign cases could be potentially spared from biopsy.
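
    One way a likelihood-ratio classifier can tolerate missing values is to combine per-feature likelihood ratios and let a missing feature contribute a neutral ratio of 1. The sketch below illustrates that idea only. The abstract does not describe the actual model, so the independence assumption, the add-one smoothing, the function names, and the toy cases are all hypothetical. The last function shows how specificity at 100% sensitivity, the operating point reported above, can be computed from classifier scores.

```python
import math

def fit_feature_lrs(cases, labels, n_features, smoothing=1.0):
    """Per-feature tables of P(value | malignant) / P(value | benign).

    cases hold categorical feature tuples; None marks a missing value and is skipped.
    """
    tables = []
    for j in range(n_features):
        counts = {True: {}, False: {}}
        for x, y in zip(cases, labels):
            if x[j] is not None:
                counts[y][x[j]] = counts[y].get(x[j], 0) + 1
        values = set(counts[True]) | set(counts[False])
        total_m = sum(counts[True].values()) + smoothing * len(values)
        total_b = sum(counts[False].values()) + smoothing * len(values)
        tables.append({
            v: ((counts[True].get(v, 0) + smoothing) / total_m)
               / ((counts[False].get(v, 0) + smoothing) / total_b)
            for v in values
        })
    return tables

def log_likelihood_ratio(tables, x):
    """Sum of log ratios over observed features; missing or unseen values add 0."""
    return sum(math.log(t[v]) for t, v in zip(tables, x) if v is not None and v in t)

def specificity_at_full_sensitivity(scores, labels):
    """Fraction of benign cases scoring below the lowest-scoring malignant case."""
    threshold = min(s for s, y in zip(scores, labels) if y)
    benign = [s for s, y in zip(scores, labels) if not y]
    return sum(s < threshold for s in benign) / len(benign)

# Toy cases (hypothetical): (mass margin, age group); True means malignant.
cases = [("spiculated", "old"), ("spiculated", None), ("round", "young"), ("round", "old")]
labels = [True, True, False, False]
tables = fit_feature_lrs(cases, labels, n_features=2)
print(log_likelihood_ratio(tables, ("spiculated", None)))
```

    Because incomplete cases still contribute to every feature table for which they have a value, no case has to be discarded during fitting.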

  16. Cancer Cachexia: Classification, Pathophysiology and Treatment

    OpenAIRE

    Solheim, Tora Skeidsvoll

    2014-01-01

    Cachexia is a very common condition in patients with cancer and it has detrimental effects on both mortality and morbidity. Cachexia is characterized by progressive unintentional loss of muscle mass, with or without loss of fat mass. When the work on this thesis started there was neither any efficient treatment available against cachexia nor a consensus on how to define or classify the condition. The overall aim of this thesis was to contribute to the improvement of classification and treatmen...

  17. Feature Selection and Molecular Classification of Cancer Using Genetic Programming

    Directory of Open Access Journals (Sweden)

    Jianjun Yu

    2007-04-01

    Full Text Available Despite important advances in microarray-based molecular classification of tumors, its application in clinical settings remains formidable. This is in part due to the limitation of current analysis programs in discovering robust biomarkers and developing classifiers with a practical set of genes. Genetic programming (GP) is a type of machine learning technique that uses an evolutionary algorithm to simulate natural selection as well as population dynamics, hence leading to simple and comprehensible classifiers. Here we applied GP to cancer expression profiling data to select feature genes and build molecular classifiers by mathematical integration of these genes. Analysis of thousands of GP classifiers generated for a prostate cancer data set revealed repetitive use of a set of highly discriminative feature genes, many of which are known to be disease associated. GP classifiers often comprise five or fewer genes and successfully predict cancer types and subtypes. More importantly, GP classifiers generated in one study are able to predict samples from an independent study, which may have used different microarray platforms. In addition, GP yielded classification accuracy better than or similar to conventional classification methods. Furthermore, the mathematical expression of GP classifiers provides insights into relationships between classifier genes. Taken together, our results demonstrate that GP may be valuable for generating effective classifiers containing a practical set of genes for diagnostic/prognostic cancer classification.

  18. Clinicopathological classification and individualized treatment of breast cancer

    Institute of Scientific and Technical Information of China (English)

    HU Hui; LIU Yin-hua; XU Ling; ZHAO Jian-xin; DUAN Xue-ning; YE Jing-ming; LI Ting

    2013-01-01

    Background: The clinicopathological classification was proposed in the St. Gallen Consensus Report 2011. We conducted a retrospective analysis of breast cancer subtypes, tumor-nodal-metastatic (TNM) staging, and histopathological grade to investigate the value of these parameters in the treatment strategies of invasive breast cancer. Methods: A retrospective analysis of breast cancer subtypes, TNM staging, and histopathological grading of 213 cases has been performed by the methods recommended in the St. Gallen International Expert Consensus Report 2011. The estrogen receptor (ER), progesterone receptor (PR), human epidermal growth factor receptor-2 (HER2), and Ki-67 of 213 tumor samples have been investigated by immunohistochemistry according to methods for classifying breast cancer subtypes proposed in the St. Gallen Consensus Report 2011. Results: The luminal A subtype was found in 53 patients (24.9%), the luminal B subtype was found in 112 patients (52.6%), the HER2-positive subtype was found in 22 patients (10.3%), and the triple-negative subtype was found in 26 patients (12%). Histopathological grade and TNM staging differed significantly among the four subtypes of breast cancer (P<0.001). Conclusion: It is important to consider TNM staging and histopathological grading in the treatment strategies of breast cancer based on the current clinicopathological classification methods.

  19. Modulation classification based on spectrogram

    Institute of Scientific and Technical Information of China (English)

    2005-01-01

    The aim of modulation classification (MC) is to identify the modulation type of a communication signal. It plays an important role in many cooperative or noncooperative communication applications. Three spectrogram-based modulation classification methods are proposed. Their recognition scope and performance are investigated or evaluated by theoretical analysis and extensive simulation studies. The method taking moment-like features is robust to frequency offset while the other two, which make use of principal component analysis (PCA) with different transformation inputs, can achieve satisfactory accuracy even at low SNR (as low as 2 dB). Due to the properties of the spectrogram, the statistical pattern recognition techniques, and the image preprocessing steps, all of our methods are insensitive to unknown phase and frequency offsets, timing errors, and the arriving sequence of symbols.

  20. Computer aided decision support system for cervical cancer classification

    Science.gov (United States)

    Rahmadwati, Rahmadwati; Naghdy, Golshah; Ros, Montserrat; Todd, Catherine

    2012-10-01

    Conventional analysis of a cervical histology image, such as a Pap smear or a biopsy sample, is performed manually by an expert pathologist. This involves inspecting the sample for cellular level abnormalities and determining the spread of the abnormalities. Cancer is graded based on the spread of the abnormal cells. This is a tedious, subjective and time-consuming process with considerable variations in diagnosis between the experts. This paper presents a computer aided decision support system (CADSS) tool to help pathologists in their examination of cervical cancer biopsies. The main aim of the proposed CADSS is to identify abnormalities and quantify cancer grading in a systematic and repeatable manner. The paper proposes three different methods and compares their results using 475 images of cervical biopsies, which include normal, three stages of pre-cancer, and malignant cases. This paper will explore various components of an effective CADSS: image acquisition, pre-processing, segmentation, feature extraction, classification, grading and disease identification. Cervical histological images are captured using a digital microscope. The images are captured in sufficient resolution to retain enough information for effective classification. Histology images of cervical biopsies consist of three major sections: background, stroma and squamous epithelium. Most diagnostic information is contained within the epithelium region. This paper will present two levels of segmentation: global (macro) and local (micro). At the global level the squamous epithelium is separated from the background and stroma. At the local or cellular level, the nuclei and cytoplasm are segmented for further analysis. Image features that influence the pathologists' decision during the analysis and classification of a cervical biopsy are the nuclei's shape and spread, the ratio of the areas of nuclei and cytoplasm, and the texture and spread of the abnormalities.

  1. Automated noninvasive classification of renal cancer on multiphase CT

    International Nuclear Information System (INIS)

    Purpose: To explore the added value of the shape of renal lesions for classifying renal neoplasms. To investigate the potential of computer-aided analysis of contrast-enhanced computed-tomography (CT) to quantify and classify renal lesions. Methods: A computer-aided clinical tool based on adaptive level sets was employed to analyze 125 renal lesions from contrast-enhanced abdominal CT studies of 43 patients. There were 47 cysts and 78 neoplasms: 22 Von Hippel-Lindau (VHL), 16 Birt-Hogg-Dube (BHD), 19 hereditary papillary renal carcinomas (HPRC), and 21 hereditary leiomyomatosis and renal cell cancers (HLRCC). The technique quantified the three-dimensional size and enhancement of lesions. Intrapatient and interphase registration facilitated the study of lesion serial enhancement. The histograms of curvature-related features were used to classify the lesion types. The areas under the curve (AUC) were calculated for receiver operating characteristic curves. Results: Tumors were robustly segmented with 0.80 overlap (0.98 correlation) between manual and semi-automated quantifications. The method further identified morphological discrepancies between the types of lesions. The classification based on lesion appearance, enhancement and morphology between cysts and cancers showed AUC = 0.98; for BHD + VHL (solid cancers) vs. HPRC + HLRCC AUC = 0.99; for VHL vs. BHD AUC = 0.82; and for HPRC vs. HLRCC AUC = 0.84. All semi-automated classifications were statistically significant (p < 0.05) and superior to the analyses based solely on serial enhancement. Conclusions: The computer-aided clinical tool allowed the accurate quantification of cystic, solid, and mixed renal tumors. Cancer types were classified into four categories using their shape and enhancement. Comprehensive imaging biomarkers of renal neoplasms on abdominal CT may facilitate their noninvasive classification, guide clinical management, and monitor responses to drugs or interventions.
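
    The AUC values quoted in this abstract summarise how well a score separates two lesion classes. As a minimal sketch, the area under the ROC curve equals the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, with ties counted as one half. The scores below are invented for illustration and are unrelated to the study's data.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    # Each positive/negative pair counts 1 if ranked correctly, 0.5 if tied.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical classifier scores; label 1 marks the positive class.
print(roc_auc([0.9, 0.4, 0.8, 0.3], [1, 1, 0, 0]))
```

    The pairwise form is quadratic in the number of cases, which is acceptable for data sets of a few hundred lesions; rank-based formulas give the same value more efficiently.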

  2. Automated noninvasive classification of renal cancer on multiphase CT

    Energy Technology Data Exchange (ETDEWEB)

    Linguraru, Marius George; Wang, Shijun; Shah, Furhawn; Gautam, Rabindra; Peterson, James; Linehan, W. Marston; Summers, Ronald M. [Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, 10 Center Drive, Bethesda, Maryland 20892 (United States); Urologic Oncology Branch, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892 (United States); Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, 10 Center Drive, Bethesda, Maryland 20892 (United States)

    2011-10-15

    Purpose: To explore the added value of the shape of renal lesions for classifying renal neoplasms. To investigate the potential of computer-aided analysis of contrast-enhanced computed-tomography (CT) to quantify and classify renal lesions. Methods: A computer-aided clinical tool based on adaptive level sets was employed to analyze 125 renal lesions from contrast-enhanced abdominal CT studies of 43 patients. There were 47 cysts and 78 neoplasms: 22 Von Hippel-Lindau (VHL), 16 Birt-Hogg-Dube (BHD), 19 hereditary papillary renal carcinomas (HPRC), and 21 hereditary leiomyomatosis and renal cell cancers (HLRCC). The technique quantified the three-dimensional size and enhancement of lesions. Intrapatient and interphase registration facilitated the study of lesion serial enhancement. The histograms of curvature-related features were used to classify the lesion types. The areas under the curve (AUC) were calculated for receiver operating characteristic curves. Results: Tumors were robustly segmented with 0.80 overlap (0.98 correlation) between manual and semi-automated quantifications. The method further identified morphological discrepancies between the types of lesions. The classification based on lesion appearance, enhancement and morphology between cysts and cancers showed AUC = 0.98; for BHD + VHL (solid cancers) vs. HPRC + HLRCC AUC = 0.99; for VHL vs. BHD AUC = 0.82; and for HPRC vs. HLRCC AUC = 0.84. All semi-automated classifications were statistically significant (p < 0.05) and superior to the analyses based solely on serial enhancement. Conclusions: The computer-aided clinical tool allowed the accurate quantification of cystic, solid, and mixed renal tumors. Cancer types were classified into four categories using their shape and enhancement. Comprehensive imaging biomarkers of renal neoplasms on abdominal CT may facilitate their noninvasive classification, guide clinical management, and monitor responses to drugs or interventions.

  3. Using fuzzy association rule mining in cancer classification

    International Nuclear Information System (INIS)

    Full text: The classification of cancer tumors based on gene expression profiles has been studied extensively. Various gene selection and classification methods have been applied to a wide variety of cancer datasets to identify the behavior of the genes in tumors and find the relationships between them and the outcome of diseases. Interpretability of the model, which is developed by fuzzy rules and linguistic variables in this study, has been rarely considered. In addition, creating a fuzzy classifier with high classification performance that uses a subset of significant genes selected by different types of gene selection methods is another goal of this study. A new algorithm has been developed to identify the fuzzy rules and significant genes based on fuzzy association rule mining. First, different subsets of genes, selected by different methods, were used to generate primary fuzzy classifiers separately; the proposed algorithm was then applied to mix the genes associated in the primary classifiers and generate a new classifier. The results show that the fuzzy classifier can classify the tumors with high performance while presenting the relationships between the genes by linguistic variables.

  4. Can Modified Dukes' Classification be Used in Gastric Cancer Staging?

    OpenAIRE

    Özgüç, Halil

    2006-01-01

    Aim: Dukes' staging system is a simple system used widely in the staging of colorectal cancer. This study was designed to analyze the applicability of the modified Dukes' classification system in gastric cancer cases. Methods: The prognostic factors affecting survival in 139 gastric cancer cases who had had at least 15 lymph nodes removed were analyzed. Modified Dukes' and TNM classifications were investigated to correlate statistically significant prognostic factors. The i...

  5. Projection Classification Based Iterative Algorithm

    Science.gov (United States)

    Zhang, Ruiqiu; Li, Chen; Gao, Wenhua

    2015-05-01

    Iterative algorithms perform well in 3D image reconstruction because they do not need complete projection data. They could be applied to BGA based solder joint inspection, but their convergence is slow, so inspection is usually done with x-ray laminography, which gives a worse reconstructed image than the iterative approach. This paper explores a projection classification based method that separates the object into three parts, i.e. solute, solution and air, and supposes that the reconstruction speed decreases linearly from the solution to the two other parts on either side. The SART and CAV algorithms are then improved under the proposed idea. Simulation results with incomplete projection images indicate the fast convergence of the improved iterative algorithms and the effectiveness of the proposed method. The advantage was also found to grow as the number of projection images decreases.

  6. Impact of esophageal cancer staging on overall survival and disease-free survival based on the 2010 AJCC classification by lymph nodes

    International Nuclear Information System (INIS)

    This retrospective study investigated the effect of modifications presented in the seventh edition of the American Joint Committee on Cancer (AJCC) Manual for staging esophageal cancer on the characterization of the effectiveness of post-operative chemotherapy and/or radiotherapy, as measured by overall and disease-free survival. The seventh edition of the AJCC Manual classifies the number of lymph nodes (N) positive for regional metastasis into three subclasses. We used the AJCC classification system to characterize the cancers of 413 Chinese patients with esophageal cancer who underwent radical resection plus regional lymph node dissection over a 10-year period. The 10-year survival rate was 14.3% for stage N1 patients and 6.1% for stage N2 patients. Only one stage N3 patient was followed >4 years (53.4 months). The 10-year disease-free rate was 13.6% for stage N1 patients. Patients with stage N2 or N3 cancer were more likely to have tumor recurrences, metastases or death than patients with stage N1 cancer. Post-operative radiotherapy provided no survival benefit, and may have had a negative effect on survival. In this study, the N stage of esophageal cancer was an independent factor affecting overall and disease-free survival. Our results did not clarify whether or not radiotherapy after radical esophagectomy offers any survival benefit to patients with esophageal cancer. (author)

  7. Classification of neuropathic pain in cancer patients

    DEFF Research Database (Denmark)

    Brunelli, Cinzia; Bennett, Michael I; Kaasa, Stein;

    2014-01-01

    and on the relevance of patient-reported outcome (PRO) descriptors for the screening of NP in this population. An international group of 42 experts was invited to participate in a consensus process through a modified 2-round Internet-based Delphi survey. Relevant topics investigated were...... good agreement was found on the statement "the pathophysiology of NP due to cancer can be different from non-cancer NP" (MED=9, IQR=2). Satisfactory consensus was reached for the first 3 NeuPSIG criteria (pain distribution, history, and sensory findings; MEDs⩾8, IQRs⩽3), but not for the fourth one...

  8. Arabic Text Mining Using Rule Based Classification

    OpenAIRE

    Fadi Thabtah; Omar Gharaibeh; Rashid Al-Zubaidy

    2012-01-01

    A well-known classification problem in the domain of text mining is text classification, which concerns mapping textual documents into one or more predefined categories based on their content. Text classification has recently attracted many researchers because of the massive amounts of online documents and text archives which hold essential information for decision-making processes. In this field, most research focuses on classifying English documents while there are limited studi...

  9. Texture Classification based on Gabor Wavelet

    OpenAIRE

    Amandeep Kaur; Savita Gupta

    2012-01-01

    This paper presents a comparison of texture classification algorithms based on Gabor wavelets. The focus of this paper is on the feature extraction scheme for texture classification. The texture features of an image can be classified using texture descriptors. In this paper we have used the Homogeneous Texture Descriptor, which is based on Gabor wavelets. For texture classification, we have used an online texture database, Brodatz’s database, and three advanced well known classifiers: Support Vec...

  10. Domain-Based Classification of CSCW Systems

    Directory of Open Access Journals (Sweden)

    M. Khan

    2011-11-01

    Full Text Available CSCW systems are widely used for group activities in different organizations and setups. This study briefly describes the existing classifications of CSCW systems and their shortcomings. These existing classifications are helpful to categorize systems based on a general set of CSCW characteristics but do not provide any guidance towards system design and evaluation. After a literature review of the ACM CSCW conference (1986-2010), a new classification is proposed to categorize CSCW systems on the basis of domains. This proposed classification may help researchers to come up with more effective design and evaluation methods for CSCW systems.

  11. Sparse discriminant analysis for breast cancer biomarker identification and classification

    Institute of Scientific and Technical Information of China (English)

    Yu Shi; Daoqing Dai; Chaochun Liu; Hong Yan

    2009-01-01

    Biomarker identification and cancer classification are two important procedures in microarray data analysis. We propose a novel unified method to carry out both tasks. We first preselect biomarker candidates by eliminating unrelated genes through the BSS/WSS ratio filter to reduce computational cost, and then use a sparse discriminant analysis method for simultaneous biomarker identification and cancer classification. Moreover, we give a mathematical justification of automatic biomarker identification. Experimental results show that the proposed method can identify key genes that have been verified in biochemical or biomedical research and classify the breast cancer type correctly.
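
    The BSS/WSS preselection step mentioned in this abstract can be sketched as follows. For each gene, the ratio divides the between-class sum of squares (class means against the overall mean) by the within-class sum of squares (samples against their own class mean), and genes with the largest ratios are kept. This uses the standard definition of the ratio. The gene names, expression values, and cut-off are hypothetical, and the sparse discriminant analysis stage is not shown.

```python
def bss_wss_ratio(values, labels):
    """Between-class over within-class sum of squares for one gene."""
    overall = sum(values) / len(values)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    bss = sum(len(g) * (sum(g) / len(g) - overall) ** 2 for g in groups.values())
    wss = sum((v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g)
    # A gene that is constant within every class separates the classes perfectly.
    return bss / wss if wss > 0 else float("inf")

def preselect(expression, labels, top_n):
    """Keep the top_n genes by BSS/WSS; expression maps gene name to per-sample values."""
    ranked = sorted(expression,
                    key=lambda g: bss_wss_ratio(expression[g], labels),
                    reverse=True)
    return ranked[:top_n]

# Toy expression data (hypothetical) for four samples in two classes.
labels = ["a", "a", "b", "b"]
expression = {"gene_1": [1, 2, 5, 6], "gene_2": [1, 5, 2, 6]}
print(preselect(expression, labels, top_n=1))
```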

  12. Texture Classification Based on Texton Features

    Directory of Open Access Journals (Sweden)

    U Ravi Babu

    2012-08-01

    Full Text Available Texture analysis plays an important role in the interpretation, understanding and recognition of terrain, biomedical or microscopic images. To achieve high accuracy in classification the present paper proposes a new method based on textons. Each texture analysis method depends upon how the selected texture features characterize the image. Whenever a new texture feature is derived it is tested whether it precisely classifies the textures. Not only the texture features themselves but also the way in which they are applied is significant for precise and accurate texture classification and analysis. The present paper proposes a new method based on textons for efficient rotationally invariant texture classification. The proposed Texton Features (TF) evaluate the relationship between the values of neighboring pixels. The proposed classification algorithm evaluates histogram based techniques on TF for precise classification. The experimental results on various stone textures indicate the efficacy of the proposed method when compared to other methods.

  13. Fingerprint Classification based on Orientaion Estimation

    Directory of Open Access Journals (Sweden)

    Manish Mathuria

    2013-06-01

    Full Text Available The geometric characteristics of an object make it distinguishable. Objects in the environment are known by their features and properties. A fingerprint image, as an object, may be classified into subclasses based on minutiae structure. The minutiae structure may be categorized by the ridge curves generated by orientation estimation. The extracted curves are invariant to location, rotation and scaling. This classification approach helps to manage fingerprints by class. This research provides a better collaboration of data mining based on classification.

  14. Cancer stem cell-related marker expression in lung adenocarcinoma and relevance of histologic subtypes based on IASLC/ATS/ERS classification

    Directory of Open Access Journals (Sweden)

    Shimada Y

    2013-11-01

    Full Text Available Yoshihisa Shimada,1 Hisashi Saji,3 Masaharu Nomura,1,2 Jun Matsubayashi,2 Koichi Yoshida,1 Masatoshi Kakihana,1 Naohiro Kajiwara,1 Tatsuo Ohira,1 Norihiko Ikeda1; 1Department of Surgery I, 2Department of Anatomic Pathology, Tokyo Medical University Hospital, Tokyo, Japan; 3Department of Chest Surgery, St Marianna University School of Medicine, Kawasaki, Japan. Background: The cancer stem cell (CSC) theory has been proposed to explain tumor heterogeneity and the carcinogenesis of solid tumors. The aim of this study was to clarify the clinical role of CSC-related markers in patients with lung adenocarcinoma and to determine whether each CSC-related marker expression correlates with the histologic subtyping proposed by the International Association for the Study of Lung Cancer (IASLC), the American Thoracic Society (ATS), and the European Respiratory Society (ERS) classifications. Methods: We reviewed data for all 103 patients in whom complete resection of adenocarcinoma had been performed. Expression of CSC-related markers, ie, aldehyde dehydrogenase 1A1 (ALDH1A1), aldo-keto reductase 1C family member 1 (AK1C1), and 1C family member 3 (AK1C3), was examined using immunostaining on whole-mount tissue slides, and the tumors were reclassified according to the IASLC/ATS/ERS classification. Results: ALDH1A1 expression was observed in 66.0% of tumors, AK1C1 in 62.7%, and AK1C3 in 86.1%. Immunoreactivities with the frequency of mean expression of ALDH1A1 in papillary predominant adenocarcinoma were significantly higher than those of solid predominant adenocarcinoma (P<0.05). Papillary predominant adenocarcinoma had significantly lower expression of AK1C1 when compared with noninvasive or solid predominant adenocarcinomas (P<0.05). On multivariate analysis, larger tumor size (hazard ratio 1.899, P=0.044), lymph node metastasis (hazard ratio 2.702, P=0.005), and low expression of ALDH1A1 (hazard ratio 3.218, P<0.001) were shown to be independently associated with an

  15. Application of Artificial Neural Networks in Cancer Classification and Diagnosis Prediction of a Subtype of Lymphoma Based on Gene Expression Profile

    Directory of Open Access Journals (Sweden)

    L Ziaei

    2006-01-01

    Full Text Available Background: Diffuse large B-cell lymphoma (DLBCL) is the most common subtype of non-Hodgkin’s lymphoma. DLBCL patients have different survival times after diagnosis: 40% of patients respond well to current therapy and have prolonged survival, whereas the remainder survive less than 5 years. In this study, we applied an artificial neural network to classify patients with DLBCL on the basis of their gene expression profiles. We also attempted to extract genes whose differential expression was significant in DLBCL subtypes. Methods: We studied 40 patients and 4026 genes. Genes were ranked by their signal-to-noise (S/N) ratios. After a suitable threshold was selected, genes whose ratios were below the threshold were removed. We then used PCA for further dimension reduction and a Perceptron neural network to classify the patients. We extracted some appropriate genes based on their predictive ability. Results: We considered various targets for classifying patients. Patients were classified by 5-year survival with an accuracy of 93%, by the results of the Alizadeh et al. study with an accuracy of 100%, and by their International Prognostic Index (IPI) with an accuracy of 89%. Conclusion: The combination of PCA and S/N ratio is an effective method for dimension reduction, and the neural network is a robust tool for classifying patients according to their gene expression profiles. Keywords: classification, gene expression, DLBCL, neural network, Perceptron
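The gene-ranking step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the S/N formula, so the commonly used form (difference of class means divided by the sum of class standard deviations) is assumed here, and the gene names, values and threshold are hypothetical. The PCA and Perceptron stages are not shown.

```python
from statistics import mean, stdev

def signal_to_noise(values_a, values_b):
    """Assumed S/N form: (mean_a - mean_b) / (sd_a + sd_b)."""
    spread = stdev(values_a) + stdev(values_b)
    return (mean(values_a) - mean(values_b)) / spread if spread else 0.0

def select_genes(expression, labels, threshold):
    """Keep genes whose |S/N| is at least `threshold`, ranked by |S/N|.

    expression: dict mapping gene name -> list of values, one per patient
    labels:     list of 0/1 class labels, one per patient
    """
    scores = {}
    for gene, values in expression.items():
        class_0 = [v for v, y in zip(values, labels) if y == 0]
        class_1 = [v for v, y in zip(values, labels) if y == 1]
        scores[gene] = abs(signal_to_noise(class_0, class_1))
    kept = [g for g in scores if scores[g] >= threshold]
    return sorted(kept, key=lambda g: scores[g], reverse=True)

# Hypothetical toy data: 3 genes measured in 6 patients
toy_expression = {
    "gene_a": [1.0, 1.2, 0.9, 3.1, 2.9, 3.0],   # strongly class-dependent
    "gene_b": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],   # no class difference
    "gene_c": [5.0, 4.0, 6.0, 1.0, 2.0, 0.0],   # class-dependent but noisier
}
toy_labels = [0, 0, 0, 1, 1, 1]
selected = select_genes(toy_expression, toy_labels, threshold=1.0)
print(selected)
```

In this toy case the uninformative gene is dropped and the remaining genes come back ordered from the strongest to the weakest class separation.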

  16. Multi-Organ Cancer Classification and Survival Analysis

    OpenAIRE

    Bauer, Stefan; Carion, Nicolas; Schüffler, Peter; Fuchs, Thomas; Wild, Peter; Buhmann, Joachim M.

    2016-01-01

    Accurate and robust cell nuclei classification is the cornerstone for a wider range of tasks in digital and Computational Pathology. However, most machine learning systems require extensive labeling from expert pathologists for each individual problem at hand, with no or limited abilities for knowledge transfer between datasets and organ sites. In this paper we implement and evaluate a variety of deep neural network models and model ensembles for nuclei classification in renal cell cancer (RC...

  17. Classification of Dukes' B and C colorectal cancers using expression arrays

    DEFF Research Database (Denmark)

    Frederiksen, C.M.; Knudsen, Steen; Laurberg, S.; Ørntoft, T.F.

    2003-01-01

    Purpose. Colorectal cancer is one of the most common malignancies. Substaging of the cancer is of importance not only to prognosis but also to treatment. Classification of substages based on DNA microarray technology is currently the most promising approach. We therefore investigated if gene...... and D could not be classified correctly. A number of interesting gene clusters showed a discriminating difference between Dukes' B and C samples. These included mitochondrial genes, stromal remodeling genes, and genes related to cell adhesion. Conclusion. Molecular classification based on gene...

  18. Classification of oral cancers using Raman spectroscopy of serum

    Science.gov (United States)

    Sahu, Aditi; Talathi, Sneha; Sawant, Sharada; Krishna, C. Murali

    2014-03-01

    Oral cancers are the sixth most common malignancy worldwide, with low 5-year disease-free survival rates, attributable to late detection due to a lack of reliable screening modalities. Our in vivo Raman spectroscopy studies have demonstrated classification of normal and tumor as well as cancer field effects (CFE), the earliest events in oral cancers. In view of the limitations of this approach, such as the requirement for on-site instrumentation and stringent experimental conditions, the feasibility of classifying normal and cancer using serum was explored using 532 nm excitation. In that study, strong resonance features of β-carotenes, present differentially in normal and pathological conditions, were observed. In the present study, Raman spectra of sera of 36 buccal mucosa cancer subjects, 33 tongue cancer subjects and 17 healthy subjects were recorded using a Raman microprobe coupled with a 40X objective using 785 nm excitation, a known source of excitation for biomedical applications. To eliminate heterogeneity, the average of 3 spectra recorded from each sample was subjected to PC-LDA followed by leave-one-out cross-validation. Findings indicate an average classification efficiency of ~70% for normal and cancer. Buccal mucosa and tongue cancer serum could also be classified with an efficiency of ~68%. Of the two cancers, buccal mucosa cancer and normal could be classified with a higher efficiency. The findings of the study are quite comparable to those of our earlier study, which suggests that there exist significant differences, other than β-carotenes, between normal and cancerous samples which can be exploited for classification. Prospectively, extensive validation studies will be undertaken to confirm the findings.

  19. An Authentication Technique Based on Classification

    Institute of Scientific and Technical Information of China (English)

    李钢; 杨杰

    2004-01-01

    We present a novel watermarking approach based on classification for authentication, in which a watermark is embedded into the host image. When the marked image is modified, the extracted watermark differs from the original watermark, and different kinds of modification lead to different extracted watermarks. In this paper, the different kinds of modification are treated as classes, and we use a classification algorithm to recognize the modifications with high probability. Simulation results show that the proposed method is promising and effective.

  20. Wavelet-based multiscale analysis of bioimpedance data measured by electric cell-substrate impedance sensing for classification of cancerous and normal cells

    Science.gov (United States)

    Das, Debanjan; Shiladitya, Kumar; Biswas, Karabi; Dutta, Pranab Kumar; Parekh, Aditya; Mandal, Mahitosh; Das, Soumen

    2015-12-01

    The paper presents a study to differentiate normal and cancerous cells using the label-free bioimpedance signal measured by electric cell-substrate impedance sensing. The real-time-measured bioimpedance data of human breast cancer cells and human epithelial normal cells exhibit fluctuations of impedance value due to cellular micromotions resulting from dynamic structural rearrangement of membrane protrusions under nonagitated conditions. Here, a wavelet-based multiscale quantitative analysis technique has been applied to analyze the fluctuations in bioimpedance. The study demonstrates a method to classify cancerous and normal cells from the signature of their impedance fluctuations. The fluctuations associated with cellular micromotion are quantified in terms of cellular energy, cellular power dissipation, and cellular moments. The cellular energy and power dissipation are found to be higher for cancerous cells, which is associated with greater micromotion in cancer cells. The initial study suggests that the proposed wavelet-based quantitative technique promises to be an effective method to analyze real-time bioimpedance signals for distinguishing cancer and normal cells.
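One simple way to quantify fluctuations scale by scale is sketched below. The abstract does not define its "cellular energy" measure or name the wavelet used, so this example assumes an orthonormal Haar wavelet and takes the energy at each scale to be the sum of squared detail coefficients. The two impedance traces are hypothetical.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar transform (even-length input)."""
    s = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def detail_energy_by_scale(signal, levels):
    """Sum of squared detail coefficients at each decomposition level."""
    energies = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_step(current)
        energies.append(sum(d * d for d in detail))
    return energies

# Hypothetical impedance traces: same baseline, different fluctuation amplitude
quiet = [100 + 0.1 * (-1) ** i for i in range(16)]
active = [100 + 1.0 * (-1) ** i for i in range(16)]
quiet_energy = detail_energy_by_scale(quiet, levels=3)
active_energy = detail_energy_by_scale(active, levels=3)
print(quiet_energy, active_energy)
```

Both traces fluctuate only at the finest scale, so all of the detail energy appears at the first level, and the trace with the larger fluctuations has the larger energy there.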

  1. Novel approaches for the molecular classification of prostate cancer

    Institute of Scientific and Technical Information of China (English)

    Robert H. Getzenberg

    2010-01-01

    Among the urologic cancers, prostate cancer is by far the most common, and it appears to have the potential to affect almost all men throughout the world as they age. A number of studies have shown that many men with prostate cancer will not die from their disease, but rather with the disease and from other causes. These men have a form of prostate cancer that is described as "very low risk" and has often been called indolent. There is, however, a group of men who have a form of prostate cancer that is much more aggressive and life-threatening. Unlike for other cancer types, we have few tools for the molecular classification of prostate cancer.

  2. Texture Classification based on Gabor Wavelet

    Directory of Open Access Journals (Sweden)

    Amandeep Kaur

    2012-07-01

    Full Text Available This paper presents a comparison of texture classification algorithms based on Gabor wavelets. The focus of this paper is on the feature extraction scheme for texture classification. The texture features of an image can be classified using texture descriptors. In this paper we use the Homogeneous Texture Descriptor, which is built on Gabor wavelets. For texture classification, we use an online texture database, Brodatz’s database, and three well-known advanced classifiers: support vector machine, k-nearest neighbor and decision tree induction. The results show that classification using support vector machines gives better results than the other classifiers. It can accurately discriminate between testing image data and training data.

  3. Normalization Benefits Microarray-Based Classification

    Directory of Open Access Journals (Sweden)

    Chen Yidong

    2006-01-01

    Full Text Available When using cDNA microarrays, normalization to correct labeling bias is a common preliminary step before further data analysis is applied, its objective being to reduce the variation between arrays. To date, assessment of the effectiveness of normalization has mainly been confined to the ability to detect differentially expressed genes. Since a major use of microarrays is the expression-based phenotype classification, it is important to evaluate microarray normalization procedures relative to classification. Using a model-based approach, we model the systemic-error process to generate synthetic gene-expression values with known ground truth. These synthetic expression values are subjected to typical normalization methods and passed through a set of classification rules, the objective being to carry out a systematic study of the effect of normalization on classification. Three normalization methods are considered: offset, linear regression, and Lowess regression. Seven classification rules are considered: 3-nearest neighbor, linear support vector machine, linear discriminant analysis, regular histogram, Gaussian kernel, perceptron, and multiple perceptron with majority voting. The results of the first three are presented in the paper, with the full results being given on a complementary website. The conclusion from the different experiment models considered in the study is that normalization can have a significant benefit for classification under difficult experimental conditions, with linear and Lowess regression slightly outperforming the offset method.
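Two of the three normalization methods named in this abstract, offset and linear regression, can be sketched as follows. This is an illustration under stated assumptions, not the study's procedure: each spot is assumed to be summarized by a log-ratio and a log-intensity, the offset is taken to be the mean log-ratio, and the toy array with its intensity-dependent bias is hypothetical. Lowess regression is not shown.

```python
from statistics import mean

def offset_normalize(log_ratios):
    """Subtract the mean log-ratio so the array is centered at zero."""
    shift = mean(log_ratios)
    return [m - shift for m in log_ratios]

def linear_normalize(log_ratios, log_intensities):
    """Remove a least-squares line fitted to log-ratio vs. log-intensity."""
    a_bar, m_bar = mean(log_intensities), mean(log_ratios)
    sxx = sum((a - a_bar) ** 2 for a in log_intensities)
    sxy = sum((a - a_bar) * (m - m_bar)
              for a, m in zip(log_intensities, log_ratios))
    slope = sxy / sxx
    intercept = m_bar - slope * a_bar
    return [m - (intercept + slope * a)
            for a, m in zip(log_intensities, log_ratios)]

# Hypothetical array with an intensity-dependent labeling bias: M = 0.5 + 0.1 * A
intensities = [6.0, 8.0, 10.0, 12.0, 14.0]
ratios = [0.5 + 0.1 * a for a in intensities]
after_offset = offset_normalize(ratios)
after_linear = linear_normalize(ratios, intensities)
print(after_offset, after_linear)
```

On this toy array the offset method only centers the log-ratios and leaves the intensity-dependent trend in place, while the linear fit removes the trend entirely, which is the kind of difference the study evaluates through classification performance.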

  4. Normalization Benefits Microarray-Based Classification

    Directory of Open Access Journals (Sweden)

    Edward R. Dougherty

    2006-08-01

    Full Text Available When using cDNA microarrays, normalization to correct labeling bias is a common preliminary step before further data analysis is applied, its objective being to reduce the variation between arrays. To date, assessment of the effectiveness of normalization has mainly been confined to the ability to detect differentially expressed genes. Since a major use of microarrays is the expression-based phenotype classification, it is important to evaluate microarray normalization procedures relative to classification. Using a model-based approach, we model the systemic-error process to generate synthetic gene-expression values with known ground truth. These synthetic expression values are subjected to typical normalization methods and passed through a set of classification rules, the objective being to carry out a systematic study of the effect of normalization on classification. Three normalization methods are considered: offset, linear regression, and Lowess regression. Seven classification rules are considered: 3-nearest neighbor, linear support vector machine, linear discriminant analysis, regular histogram, Gaussian kernel, perceptron, and multiple perceptron with majority voting. The results of the first three are presented in the paper, with the full results being given on a complementary website. The conclusion from the different experiment models considered in the study is that normalization can have a significant benefit for classification under difficult experimental conditions, with linear and Lowess regression slightly outperforming the offset method.

  5. Cuckoo search optimisation for feature selection in cancer classification: a new approach.

    Science.gov (United States)

    Gunavathi, C; Premalatha, K

    2015-01-01

    The Cuckoo Search (CS) optimisation algorithm is used for feature selection in cancer classification using microarray gene expression data. Since gene expression data contain thousands of genes and a small number of samples, feature selection methods can be used to select informative genes and so improve classification accuracy. Initially, the genes are ranked based on T-statistics, Signal-to-Noise Ratio (SNR) and F-statistics values. The CS is used to find the informative genes from the top-m ranked genes. The classification accuracy of the k-Nearest Neighbour (kNN) technique is used as the fitness function for CS. The proposed method is evaluated on ten different cancer gene expression datasets. The results show that the CS gives 100% average accuracy for the DLBCL Harvard, Lung Michigan, Ovarian Cancer, AML-ALL and Lung Harvard2 datasets and that it outperforms the existing techniques on the DLBCL outcome and prostate datasets. PMID:26547979
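The fitness function mentioned in this abstract, kNN classification accuracy on a candidate gene subset, might look like the sketch below. The abstract does not say how the accuracy is estimated or which k and distance are used, so leave-one-out evaluation, Euclidean distance and k=1 are assumptions here, and the data are hypothetical. The Cuckoo Search loop that would call this function is not shown.

```python
import math

def knn_loocv_accuracy(samples, labels, gene_subset, k=1):
    """Leave-one-out accuracy of kNN restricted to the genes in `gene_subset`."""
    correct = 0
    for i, query in enumerate(samples):
        distances = []
        for j, other in enumerate(samples):
            if j == i:
                continue  # hold out the query sample itself
            d = math.sqrt(sum((query[g] - other[g]) ** 2 for g in gene_subset))
            distances.append((d, labels[j]))
        distances.sort(key=lambda pair: pair[0])
        votes = [label for _, label in distances[:k]]
        predicted = max(set(votes), key=votes.count)  # majority vote
        if predicted == labels[i]:
            correct += 1
    return correct / len(samples)

# Hypothetical data: gene 0 separates the classes, gene 1 does not
toy_samples = [[0.1, 5.0], [0.2, 1.0], [0.3, 9.0],
               [0.9, 1.2], [1.0, 5.1], [1.1, 9.1]]
toy_labels = [0, 0, 0, 1, 1, 1]
fitness_informative = knn_loocv_accuracy(toy_samples, toy_labels, [0])
fitness_uninformative = knn_loocv_accuracy(toy_samples, toy_labels, [1])
print(fitness_informative, fitness_uninformative)
```

A search procedure such as CS would propose subsets of the top-ranked genes and keep the subsets that score highest under this function.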

  6. Classification of Breast Cancer Using SVM Classifier Technique

    OpenAIRE

    B.Senthil Murugan; S.Srirambabu; Santhosh Kumar. V

    2010-01-01

    This paper proposes a technique for classifying breast cancer from mammograms. The proposed system aims at developing a visualization tool for detecting breast cancer and minimizing the scheme of detection. The detection method is organized as follows: (a) image enhancement, (b) segmentation, (c) feature extraction, (d) classification using the SVM classifier technique. The image enhancement step concentrates on converting an image to a more understandable level, thereby applying Median...

  7. Classification of mitocans, anti-cancer drugs acting on mitochondria

    Czech Academy of Sciences Publication Activity Database

    Neužil, Jiří; Dong, L. F.; Rohlena, Jakub; Truksa, Jaroslav; Ralph, S. J.

    2013-01-01

    Vol. 13, No. 3 (2013), pp. 199-208. ISSN 1567-7249 Institutional research plan: CEZ:AV0Z50520701 Keywords: Mitocans * Anti-cancer therapeutics * Classification Subject RIV: EB - Genetics; Molecular Biology Impact factor: 3.524, year: 2013

  8. Classification of mitocans, anti-cancer drugs acting on mitochondria

    Czech Academy of Sciences Publication Activity Database

    Neužil, Jiří; Dong, L. F.; Rohlena, Jakub; Truksa, Jaroslav; Ralph, S. J.

    2013-01-01

    Vol. 13, No. 3 (2013), pp. 199-208. ISSN 1567-7249 Institutional research plan: CEZ:AV0Z50520701 Keywords: Mitocans * Anti-cancer therapeutics * Classification Subject RIV: EB - Genetics; Molecular Biology Impact factor: 3.524, year: 2013

  9. A CAD System for Identification and Classification of Breast Cancer Tumors in DCE-MR Images Based on Hierarchical Convolutional Neural Networks

    Directory of Open Access Journals (Sweden)

    Reza Rastiboroujeni

    2015-06-01

    Full Text Available In this paper, we propose a computer aided diagnosis (CAD) system based on hierarchical convolutional neural networks (HCNNs) to discriminate between malignant and benign tumors in breast DCE-MRIs. An HCNN is a hierarchical neural network that operates on two-dimensional images. An HCNN integrates the feature extraction and classification processes into one single, fully adaptive structure. It can extract two-dimensional key features automatically, and it is relatively tolerant to geometric and local distortions in input images. We evaluate the learning and testing processes of the CNN implementation based on gradient descent (GD) and resilient back-propagation (RPROP) approaches. We show that the proposed HCNN with the RPROP learning approach provides an effective and robust neural structure for designing a CAD-based system for breast MRI, and has potential as a mechanism for the evaluation of different types of abnormalities in medical images.
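For readers unfamiliar with resilient back-propagation, the sketch below shows its update rule on a toy problem. This is not the paper's network or training code. It assumes one common variant, in which a sign change in the gradient halves the step size and skips the update for that weight. The hyperparameter values and the quadratic test function are illustrative.

```python
def sign(x):
    return (x > 0) - (x < 0)

def rprop_minimize(gradient, weights, steps=200,
                   eta_plus=1.2, eta_minus=0.5,
                   step_init=0.1, step_min=1e-6, step_max=50.0):
    """Minimize a function using only the signs of its partial derivatives."""
    weights = list(weights)
    step_sizes = [step_init] * len(weights)
    prev_grad = [0.0] * len(weights)
    for _ in range(steps):
        grad = gradient(weights)
        for i, g in enumerate(grad):
            if g * prev_grad[i] > 0:
                # Same direction as last time: grow the per-weight step.
                step_sizes[i] = min(step_sizes[i] * eta_plus, step_max)
            elif g * prev_grad[i] < 0:
                # Sign flipped, so a minimum was overshot: shrink and skip.
                step_sizes[i] = max(step_sizes[i] * eta_minus, step_min)
                g = 0.0
            weights[i] -= sign(g) * step_sizes[i]
            prev_grad[i] = g
    return weights

def quadratic_gradient(w):
    """Gradient of (w0 - 3)^2 + 100 * (w1 + 1)^2, a badly scaled bowl."""
    return [2 * (w[0] - 3.0), 200 * (w[1] + 1.0)]

solution = rprop_minimize(quadratic_gradient, [0.0, 0.0])
print(solution)
```

Because only the gradient sign enters the update, the hundredfold difference in curvature between the two coordinates does not slow convergence, whereas plain gradient descent would need a learning rate tuned to the steeper direction.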

  10. Cancer pain: Classification and pain syndromes

    Directory of Open Access Journals (Sweden)

    Grujičić Danica

    2004-01-01

    Full Text Available Despite new information about the physiology and biochemistry of pain, pain remains only partially understood. Cancer pain is often experienced as several different types of pain, most frequently a combination of somatic and neuropathic types. If acute cancer pain does not subside with initial therapy, patients experience pain of a more constant nature, the characteristics of which vary with the cause and the sites involved. Chronic pain related to cancer can be considered as tumor-induced pain, chemotherapy-induced pain, and radiation therapy-induced pain. Certain pain mechanisms are present in cancer patients. These include inflammation due to infection, such as local sepsis or the pain of herpes zoster, and pain due to the obstruction or occlusion of a hollow organ, such as that of the large bowel in cancer of the colon. Pain is also commonly due to destruction of tissue, as is often seen with bony metastases. Bony metastases also produce pain because of periosteal irritation, medullary pressure, and fractures. Pain may be produced by the growth of a tumor in a closed area richly supplied with pain receptors (nociceptors). Examples are tumors growing within the capsule of an organ such as the pancreas, and chest pain occurring with tumors of the lung or the mediastinum due to invasion of the pleura. Certain tumors produce characteristic types of pain. For example, back pain is seen with multiple myeloma, and severe shoulder and arm pain is seen with Pancoast tumors.

  11. Zone-specific logistic regression models improve classification of prostate cancer on multi-parametric MRI

    Energy Technology Data Exchange (ETDEWEB)

    Dikaios, Nikolaos; Halligan, Steve; Taylor, Stuart; Atkinson, David; Punwani, Shonit [University College London, Centre for Medical Imaging, London (United Kingdom); University College London Hospital, Departments of Radiology, London (United Kingdom); Alkalbani, Jokha; Sidhu, Harbir Singh [University College London, Centre for Medical Imaging, London (United Kingdom); Abd-Alazeez, Mohamed; Ahmed, Hashim U.; Emberton, Mark [University College London, Research Department of Urology, Division of Surgery and Interventional Science, London (United Kingdom); Kirkham, Alex [University College London Hospital, Departments of Radiology, London (United Kingdom); Freeman, Alex [University College London Hospital, Department of Histopathology, London (United Kingdom)

    2015-09-15

    To assess the interchangeability of zone-specific (peripheral-zone (PZ) and transition-zone (TZ)) multiparametric-MRI (mp-MRI) logistic-regression (LR) models for classification of prostate cancer. Two hundred and thirty-one patients (70 TZ training-cohort; 76 PZ training-cohort; 85 TZ temporal validation-cohort) underwent mp-MRI and transperineal-template-prostate-mapping biopsy. PZ and TZ uni/multi-variate mp-MRI LR-models for classification of significant cancer (any cancer-core-length (CCL) with Gleason > 3 + 3 or any grade with CCL ≥ 4 mm) were derived from the respective cohorts and validated within the same zone by leave-one-out analysis. Inter-zonal performance was tested by applying TZ models to the PZ training-cohort and vice-versa. Classification performance of TZ models for TZ cancer was further assessed in the TZ validation-cohort. ROC area-under-curve (ROC-AUC) analysis was used to compare models. The univariate parameters with the best classification performance were the normalised T2 signal (T2nSI) within the TZ (ROC-AUC = 0.77) and normalized early contrast-enhanced T1 signal (DCE-nSI) within the PZ (ROC-AUC = 0.79). Performance was not significantly improved by bi-variate/tri-variate modelling. PZ models that contained DCE-nSI performed poorly in classification of TZ cancer. The TZ model based solely on maximum-enhancement poorly classified PZ cancer. LR-models dependent on DCE-MRI parameters alone are not interchangeable between prostatic zones; however, models based exclusively on T2 and/or ADC are more robust for inter-zonal application. (orig.)
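The ROC-AUC comparison used in this study can be illustrated with the rank-based definition of the area under the curve: the probability that a randomly chosen positive case scores higher than a randomly chosen negative one. The scores below are hypothetical and are not taken from the study. They only mimic the situation where a parameter separates cancer well in one zone and poorly in the other.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical model outputs for six lesions per zone (1 = significant cancer)
truth = [0, 0, 0, 1, 1, 1]
scores_same_zone = [0.2, 0.3, 0.4, 0.7, 0.8, 0.9]    # model applied in its own zone
scores_other_zone = [0.5, 0.8, 0.3, 0.4, 0.9, 0.2]   # same model, other zone
auc_same_zone = roc_auc(scores_same_zone, truth)
auc_other_zone = roc_auc(scores_other_zone, truth)
print(auc_same_zone, auc_other_zone)
```

An AUC near 0.5 in the second case would indicate that the model is not interchangeable between zones, which is the kind of comparison the abstract reports for DCE-MRI-based models.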

  12. Zone-specific logistic regression models improve classification of prostate cancer on multi-parametric MRI

    International Nuclear Information System (INIS)

    To assess the interchangeability of zone-specific (peripheral-zone (PZ) and transition-zone (TZ)) multiparametric-MRI (mp-MRI) logistic-regression (LR) models for classification of prostate cancer. Two hundred and thirty-one patients (70 TZ training-cohort; 76 PZ training-cohort; 85 TZ temporal validation-cohort) underwent mp-MRI and transperineal-template-prostate-mapping biopsy. PZ and TZ uni/multi-variate mp-MRI LR-models for classification of significant cancer (any cancer-core-length (CCL) with Gleason > 3 + 3 or any grade with CCL ≥ 4 mm) were derived from the respective cohorts and validated within the same zone by leave-one-out analysis. Inter-zonal performance was tested by applying TZ models to the PZ training-cohort and vice-versa. Classification performance of TZ models for TZ cancer was further assessed in the TZ validation-cohort. ROC area-under-curve (ROC-AUC) analysis was used to compare models. The univariate parameters with the best classification performance were the normalised T2 signal (T2nSI) within the TZ (ROC-AUC = 0.77) and normalized early contrast-enhanced T1 signal (DCE-nSI) within the PZ (ROC-AUC = 0.79). Performance was not significantly improved by bi-variate/tri-variate modelling. PZ models that contained DCE-nSI performed poorly in classification of TZ cancer. The TZ model based solely on maximum-enhancement poorly classified PZ cancer. LR-models dependent on DCE-MRI parameters alone are not interchangeable between prostatic zones; however, models based exclusively on T2 and/or ADC are more robust for inter-zonal application. (orig.)

  13. The classification and staging of cancerous growths of the anal canal

    International Nuclear Information System (INIS)

    In this chapter, the authors give information about the frequency of cancerous growths of the anal canal, a general analysis of observations, the classification and staging of cancerous growths of the anal canal, the clinical-anatomical classification of cancerous growths of the anal canal, and the staging of cancerous growths of the anal canal.

  14. Histological image classification using biologically interpretable shape-based features

    International Nuclear Information System (INIS)

    Automatic cancer diagnostic systems based on histological image classification are important for improving therapeutic decisions. Previous studies propose textural and morphological features for such systems. These features capture patterns in histological images that are useful for both cancer grading and subtyping. However, because many of these features lack a clear biological interpretation, pathologists may be reluctant to adopt these features for clinical diagnosis. We examine the utility of biologically interpretable shape-based features for classification of histological renal tumor images. Using Fourier shape descriptors, we extract shape-based features that capture the distribution of stain-enhanced cellular and tissue structures in each image and evaluate these features using a multi-class prediction model. We compare the predictive performance of the shape-based diagnostic model to that of traditional models, i.e., using textural, morphological and topological features. The shape-based model, with an average accuracy of 77%, outperforms or complements traditional models. We identify the most informative shapes for each renal tumor subtype from the top-selected features. Results suggest that these shapes are not only accurate diagnostic features, but also correlate with known biological characteristics of renal tumors. Shape-based analysis of histological renal tumor images accurately classifies disease subtypes and reveals biologically insightful discriminatory features. This method for shape-based analysis can be extended to other histological datasets to aid pathologists in diagnostic and therapeutic decisions
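Fourier shape descriptors, as named in this abstract, can be sketched in their basic textbook form. The study's exact feature construction is not given here, so this example assumes the standard approach: boundary points are treated as complex numbers, the zero-frequency term is dropped to remove position, coefficient magnitudes are divided by the first harmonic to remove size, and phases are discarded to remove rotation. The contour is a hypothetical L-shaped outline listed counterclockwise.

```python
import cmath
import math

def fourier_descriptors(contour, count):
    """Magnitude-based Fourier descriptors of a closed contour.

    contour: list of (x, y) boundary points in counterclockwise order
    count:   number of descriptors to return (harmonics 2 .. count + 1)
    """
    z = [complex(x, y) for x, y in contour]
    n = len(z)

    def coefficient(k):
        return sum(z[t] * cmath.exp(-2j * math.pi * k * t / n)
                   for t in range(n)) / n

    scale = abs(coefficient(1))  # dividing by |F1| removes object size
    return [abs(coefficient(k)) / scale for k in range(2, count + 2)]

def transform(contour, scale, angle, shift):
    """Scale, rotate and translate a contour (used only to test invariance)."""
    a = scale * cmath.exp(1j * angle)
    moved = [a * complex(x, y) + complex(*shift) for x, y in contour]
    return [(p.real, p.imag) for p in moved]

# Hypothetical L-shaped outline and a resized, rotated, shifted copy of it
shape = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 3), (0, 3)]
moved_shape = transform(shape, scale=2.5, angle=math.pi / 6, shift=(7.0, -3.0))
original_fd = fourier_descriptors(shape, 3)
moved_fd = fourier_descriptors(moved_shape, 3)
print(original_fd, moved_fd)
```

The two descriptor lists agree up to floating-point error, which is the property that makes such features suitable for comparing the shapes of cellular and tissue structures regardless of where they sit in an image or how they are oriented.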

  15. Malware Classification based on Call Graph Clustering

    OpenAIRE

    Kinable, Joris; Kostakis, Orestis

    2010-01-01

    Each day, anti-virus companies receive tens of thousands samples of potentially harmful executables. Many of the malicious samples are variations of previously encountered malware, created by their authors to evade pattern-based detection. Dealing with these large amounts of data requires robust, automatic detection approaches. This paper studies malware classification based on call graph clustering. By representing malware samples as call graphs, it is possible to abstract certain variations...

  16. An Agent Based Classification Model

    CERN Document Server

    Gu, Feng; Greensmith, Julie

    2009-01-01

    The major function of this model is to access the UCI Wisconsin Breast Cancer data-set[1] and classify the data items into two categories, normal and anomalous. This kind of classification can be referred to as anomaly detection, which discriminates anomalous behaviour from normal behaviour in computer systems. One popular solution for anomaly detection is Artificial Immune Systems (AIS). AIS are adaptive systems inspired by theoretical immunology and observed immune functions, principles and models, which are applied to problem solving. The Dendritic Cell Algorithm (DCA)[2] is an AIS algorithm that is developed specifically for anomaly detection. It has been successfully applied to intrusion detection in computer security. It is believed that agent-based modelling is an ideal approach for implementing AIS, as intelligent agents could be the perfect representations of immune entities in AIS. This model evaluates the feasibility of re-implementing the DCA in an agent-based simulation environ- ...

  17. Image-based Vehicle Classification System

    CERN Document Server

    Ng, Jun Yee

    2012-01-01

    Electronic toll collection (ETC) has become a common means of toll collection on toll roads. ETC allows vehicles to travel at low or full speed during toll payment, which helps to avoid traffic delays on toll roads. One of the major components of an ETC system is the automatic vehicle detection and classification (AVDC) system, which classifies each vehicle so that the toll is charged according to vehicle class. A vision-based vehicle classification system is one type of vehicle classification system, which adopts a camera as its input sensing device. This type of system has an advantage over the rest in that it is cost-efficient, since a low-cost camera is used. Implementing a vision-based vehicle classification system requires a lower initial investment and is well suited to the migration of toll collection in Malaysia from a single ETC system to full-scale multi-lane free flow (MLFF). This project ...

  18. Cancer classification using the Immunoscore: a worldwide task force

    Directory of Open Access Journals (Sweden)

    Galon Jérôme

    2012-10-01

    Full Text Available Abstract: Prediction of clinical outcome in cancer is usually achieved by histopathological evaluation of tissue samples obtained during surgical resection of the primary tumor. Traditional tumor staging (AJCC/UICC-TNM classification) summarizes data on tumor burden (T), presence of cancer cells in draining and regional lymph nodes (N) and evidence for metastases (M). However, it is now recognized that clinical outcome can significantly vary among patients within the same stage. The current classification provides limited prognostic information, and does not predict response to therapy. Recent literature has alluded to the importance of the host immune system in controlling tumor progression. Thus, evidence supports the notion of including immunological biomarkers, implemented as a tool for the prediction of prognosis and response to therapy. Accumulating data, collected from large cohorts of human cancers, has demonstrated the impact of immune-classification, which has a prognostic value that may add to the significance of the AJCC/UICC TNM-classification. It is therefore imperative to begin to incorporate the ‘Immunoscore’ into traditional classification, thus providing an essential prognostic and potentially predictive tool. Introduction of this parameter as a biomarker to classify cancers, as part of routine diagnostic and prognostic assessment of tumors, will facilitate clinical decision-making including rational stratification of patient treatment. Equally, the inherent complexity of quantitative immunohistochemistry, in conjunction with protocol variation across laboratories, analysis of different immune cell types, inconsistent region selection criteria, and variable ways to quantify immune infiltration, all underline the urgent requirement to reach assay harmonization. In an effort to promote the Immunoscore in routine clinical settings, an international task force was initiated. This review represents a follow-up of the announcement of

  19. Classification for breast cancer diagnosis with Raman spectroscopy

    Science.gov (United States)

    Li, Qingbo; Gao, Qishuo; Zhang, Guangjun

    2014-01-01

    In order to promote the development of a portable, low-cost, in vivo cancer diagnosis instrument, a miniature laser Raman spectrometer was employed in this paper to acquire conventional Raman spectra for breast cancer detection. However, high discrimination accuracy is difficult to achieve with such spectra. A novel adaptive weight k-local hyperplane (AWKH) method is therefore proposed to increase the classification accuracy. AWKH is an extension and improvement of K-local hyperplane distance nearest-neighbor (HKNN). It considers the feature weights of the training data in the nearest-neighbor selection and local hyperplane construction stages, which resolves the basic shortcoming that HKNN works well only for small numbers of nearest neighbors. Experimental results on Raman spectra of breast tissues in vitro show that the proposed method can achieve high classification accuracy. PMID:25071976

  20. A novel subtype classification and risk of breast cancer by histone modification profiling.

    Science.gov (United States)

    Chen, Xiaohua; Hu, Hanyang; He, Lin; Yu, Xueyuan; Liu, Xiangyu; Zhong, Rong; Shu, Maoguo

    2016-06-01

    Breast cancer has been classified into several intrinsic molecular subtypes on the basis of genetic and epigenetic factors. However, knowledge about histone modifications that contribute to the classification and development of biologically distinct breast cancer subtypes remains limited. Here we compared the genome-wide binding patterns of H3K4me3 and H3K27me3 between human mammary epithelial cells and three breast cancer cell lines representing the luminal, HER2, and basal subtypes. We characterized thousands of unique binding events as well as bivalent chromatin signatures unique to each cancer subtype, which were involved in different epigenetic regulation programs and signaling pathways in breast cancer progression. Genes linked to the unique histone mark features exhibited subtype-specific expression patterns, both in cancer cell lines and primary tumors, some of which were confirmed by qPCR in our primary cancer samples. Finally, histone mark-based gene classifiers were significantly correlated with relapse-free survival outcomes in patients. In summary, we have provided a valuable resource for the identification of novel biomarkers of subtype classification and clinical prognosis evaluation in breast cancers. PMID:27178334

  1. Mammogram-based discriminant fusion analysis for breast cancer diagnosis.

    Science.gov (United States)

    Li, Jun-Bao; Wang, Yun-Heng; Tang, Lin-Lin

    2012-01-01

    Mammogram-based classification is an important and effective approach to computer-aided diagnosis (CAD) of breast cancer. In this paper, we present a novel discriminant fusion analysis (DFA)-based mammogram classification method for CAD-based breast cancer diagnosis. The discriminative breast tissue features are extracted and fused by DFA, which obtains the optimal fusion coefficients and achieves the largest class discriminant in the fused feature space for classification. Besides the detailed theoretical derivation, many experimental evaluations are carried out on the Mammography Image Analysis Society mammogram database for breast cancer diagnosis. PMID:23153999

  2. Movie Review Classification and Feature based Summarization of Movie Reviews

    Directory of Open Access Journals (Sweden)

    Sabeeha Mohammed Basheer, Syed Farook

    2013-07-01

    Full Text Available Sentiment classification and feature based summarization are essential steps in the classification and summarization of movie reviews. The movie review classification is based on sentiment classification, and condensed descriptions of movie reviews are generated by the feature based summarization. Experiments are conducted to identify the best machine learning based sentiment classification approach. Latent Semantic Analysis and Latent Dirichlet Allocation were compared to identify features, which in turn affect the summary size. The focus of the system design is on classification accuracy and system response time.

  3. Mechanism-based drug exposure classification in pharmacoepidemiological studies

    NARCIS (Netherlands)

    Verdel, B.M.

    2010-01-01

    Mechanism-based classification of drug exposure in pharmacoepidemiological studies In pharmacoepidemiology and pharmacovigilance, the relation between drug exposure and clinical outcomes is crucial. Exposure classification in pharmacoepidemiological studies is traditionally based on pharmacotherapeu

  4. Contextual Deep CNN Based Hyperspectral Classification

    OpenAIRE

    Lee, Hyungtae; Kwon, Heesung

    2016-01-01

    In this paper, we describe a novel deep convolutional neural networks (CNN) based approach called contextual deep CNN that can jointly exploit spatial and spectral features for hyperspectral image classification. The contextual deep CNN first concurrently applies multiple 3-dimensional local convolutional filters with different sizes jointly exploiting spatial and spectral features of a hyperspectral image. The initial spatial and spectral feature maps obtained from applying the variable size...

  5. A Hybrid Reduction Approach for Enhancing Cancer Classification of Microarray Data

    Directory of Open Access Journals (Sweden)

    Abeer M. Mahmoud

    2014-10-01

    Full Text Available This paper presents a novel hybrid machine learning (ML) reduction approach to enhance cancer classification accuracy of microarray data based on two ML gene ranking techniques: T-test and Class Separability (CS). The proposed approach is integrated with two ML classifiers, K-nearest neighbor (KNN) and support vector machine (SVM), for mining microarray gene expression profiles. Four public cancer microarray databases are used to evaluate the proposed approach and successfully accomplish the mining process: Lymphoma, Leukemia, SRBCT, and Lung Cancer. Genes are selected only from the training samples, and the testing samples are excluded entirely from the classifier building process, for more accurate and validated results. The computational experiments are also described in detail and compared comprehensively with related results from the literature. The results showed that the proposed reduction approach reached promising results in terms of both the number of genes supplied to the classifiers and the classification accuracy.
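
    The strategy described above (rank genes on the training samples only, then classify with KNN) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Welch form of the t-statistic, the two-class restriction, the function names, and the toy expression values are all assumptions made for the example.

```python
import math

def t_scores(X, y):
    """Absolute Welch t-statistic per gene; X is samples x genes, y holds 0/1 labels."""
    scores = []
    for g in range(len(X[0])):
        a = [row[g] for row, lab in zip(X, y) if lab == 0]
        b = [row[g] for row, lab in zip(X, y) if lab == 1]
        ma, mb = sum(a) / len(a), sum(b) / len(b)
        va = sum((v - ma) ** 2 for v in a) / (len(a) - 1)
        vb = sum((v - mb) ** 2 for v in b) / (len(b) - 1)
        denom = math.sqrt(va / len(a) + vb / len(b)) or 1e-12  # guard against zero variance
        scores.append(abs(ma - mb) / denom)
    return scores

def top_genes(X_train, y_train, n):
    """Indices of the n highest-scoring genes, computed from training data only."""
    scores = t_scores(X_train, y_train)
    return sorted(range(len(scores)), key=lambda g: -scores[g])[:n]

def knn_predict(X_train, y_train, x, genes, k=3):
    """Majority vote among the k nearest training samples, restricted to the selected genes."""
    dists = sorted(
        (math.dist([row[g] for g in genes], [x[g] for g in genes]), lab)
        for row, lab in zip(X_train, y_train)
    )
    votes = [lab for _, lab in dists[:k]]
    return max(set(votes), key=votes.count)

# Toy data (hypothetical): gene 0 separates the classes, genes 1 and 2 do not.
X_train = [[1.0, 5.0, 0.2], [1.2, 5.1, 0.9], [0.9, 4.9, 0.5],
           [3.0, 5.0, 0.3], [3.1, 5.2, 0.8], [2.9, 4.8, 0.6]]
y_train = [0, 0, 0, 1, 1, 1]
selected = top_genes(X_train, y_train, 1)
prediction = knn_predict(X_train, y_train, [2.8, 5.0, 0.5], selected)
```

    Because `top_genes` never sees a test sample, the selected gene set cannot leak information from the evaluation data into the classifier.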

  6. A Discrete Wavelet Based Feature Extraction and Hybrid Classification Technique for Microarray Data Analysis

    Directory of Open Access Journals (Sweden)

    Jaison Bennet

    2014-01-01

    Full Text Available In the past, cancer classification by doctors and radiologists was based on morphological and clinical features and had limited diagnostic ability. The recent arrival of DNA microarray technology has made it possible to monitor thousands of gene expressions concurrently on a single chip, which stimulates progress in cancer classification. In this paper, we propose a hybrid approach for microarray data classification based on nearest neighbor (KNN), naive Bayes, and support vector machine (SVM). Feature selection prior to classification plays a vital role, and a feature selection technique that combines discrete wavelet transform (DWT) and moving window technique (MWT) is used. The performance of the proposed method is compared with conventional classifiers such as support vector machine, nearest neighbor, and naive Bayes. Experiments have been conducted on both real and benchmark datasets, and the results indicate that the ensemble approach produces higher classification accuracy than conventional classifiers. This paper serves as an automated system for the classification of cancer and can be applied by doctors in real cases, which serves as a boon to the medical community. This work further reduces the misclassification of cancers, which is highly undesirable in cancer detection.
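
    The abstract does not say how DWT and the moving window are combined, so the following is only a sketch of the two building blocks under stated assumptions: a single-level Haar wavelet (the wavelet family is an assumption) and a plain moving-window mean over the approximation coefficients. The function names and the sample profile are hypothetical.

```python
import math

def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    values = list(signal)
    if len(values) % 2:
        values.append(values[-1])  # pad odd-length input by repeating the last value
    s = math.sqrt(2.0)
    approx = [(values[i] + values[i + 1]) / s for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / s for i in range(0, len(values), 2)]
    return approx, detail

def moving_window_mean(values, width):
    """Mean of every contiguous window of the given width."""
    return [sum(values[i:i + width]) / width for i in range(len(values) - width + 1)]

# Hypothetical expression profile of one sample.
profile = [4.0, 4.0, 2.0, 0.0]
approx, detail = haar_dwt(profile)
smoothed = moving_window_mean([1.0, 2.0, 3.0, 4.0], 2)
```

    The Haar transform is orthonormal, so the energy of the input equals the combined energy of the approximation and detail coefficients, which is a convenient sanity check.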

  7. Collaborative Representation based Classification for Face Recognition

    CERN Document Server

    Zhang, Lei; Feng, Xiangchu; Ma, Yi; Zhang, David

    2012-01-01

    By coding a query sample as a sparse linear combination of all training samples and then classifying it by evaluating which class leads to the minimal coding residual, sparse representation based classification (SRC) leads to interesting results for robust face recognition. It is widely believed that the l1-norm sparsity constraint on coding coefficients plays a key role in the success of SRC, while its use of all training samples to collaboratively represent the query sample is rather ignored. In this paper we discuss how SRC works, and show that the collaborative representation mechanism used in SRC is much more crucial to its success of face classification. The SRC is a special case of collaborative representation based classification (CRC), which has various instantiations by applying different norms to the coding residual and coding coefficient. More specifically, the l1 or l2 norm characterization of coding residual is related to the robustness of CRC to outlier facial pixels, while the l1 or l2 norm c...
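
    One instantiation of the idea above, with an l2 norm on both the coding residual and the coding coefficients, can be sketched as a regularized least-squares problem followed by a per-class residual comparison. This is an illustrative sketch under assumptions: the regularization weight `lam`, the plain (unnormalized) class residual, the helper names, and the toy vectors are not taken from the paper.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def crc_classify(train, labels, query, lam=0.01):
    """Code the query over ALL training samples, then pick the class with the smallest residual."""
    n = len(train)
    # Normal equations of ridge regression: (X^T X + lam * I) alpha = X^T y
    gram = [[dot(train[i], train[j]) + (lam if i == j else 0.0) for j in range(n)]
            for i in range(n)]
    alpha = solve(gram, [dot(t, query) for t in train])
    best, best_res = None, float("inf")
    for c in sorted(set(labels)):
        recon = [0.0] * len(query)
        for a, t, lab in zip(alpha, train, labels):
            if lab == c:  # reconstruct using only this class's coefficients
                recon = [r + a * v for r, v in zip(recon, t)]
        res = sum((q - r) ** 2 for q, r in zip(query, recon)) ** 0.5
        if res < best_res:
            best, best_res = c, res
    return best

# Toy data (hypothetical): two classes, two training vectors each.
train = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.9, 0.1]]
labels = ["a", "a", "b", "b"]
result_a = crc_classify(train, labels, [0.95, 0.05, 0.0])
result_b = crc_classify(train, labels, [0.05, 0.95, 0.05])
```

    Unlike an l1-constrained solver, the l2-regularized coding has a closed-form solution, so the whole classifier reduces to one linear solve per query.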

  8. Texture feature based liver lesion classification

    Science.gov (United States)

    Doron, Yeela; Mayer-Wolf, Nitzan; Diamant, Idit; Greenspan, Hayit

    2014-03-01

    Liver lesion classification is a difficult clinical task. Computerized analysis can support clinical workflow by enabling more objective and reproducible evaluation. In this paper, we evaluate the contribution of several types of texture features for a computer-aided diagnostic (CAD) system which automatically classifies liver lesions from CT images. Based on the assumption that liver lesions of various classes differ in their texture characteristics, a variety of texture features were examined as lesion descriptors. Although texture features are often used for this task, there is currently a lack of detailed research focusing on the comparison across different texture features, or their combinations, on a given dataset. In this work we investigated the performance of Gray Level Co-occurrence Matrix (GLCM), Local Binary Patterns (LBP), Gabor, gray level intensity values and Gabor-based LBP (GLBP), where the features are obtained from a given lesion's region of interest (ROI). For the classification module, SVM and KNN classifiers were examined. Using a single type of texture feature, the best result, 91% accuracy, was obtained with Gabor filtering and SVM classification. Combination of Gabor, LBP and Intensity features improved the results to a final accuracy of 97%.
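
    As an illustration of one of the texture descriptors named above, a gray-level co-occurrence matrix and its contrast feature can be computed as in the sketch below. The pixel offset, the number of gray levels, the non-symmetric counting, and the 4x4 patch are assumptions chosen for the example, not settings reported by the authors.

```python
def glcm(image, dx=1, dy=0, levels=4):
    """Normalized (non-symmetric) gray-level co-occurrence matrix for pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def contrast(p):
    """GLCM contrast: co-occurrence probabilities weighted by squared gray-level difference."""
    return sum(p[i][j] * (i - j) ** 2 for i in range(len(p)) for j in range(len(p)))

# Hypothetical 4x4 region of interest quantized to 4 gray levels.
roi = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 2, 2, 2],
       [2, 2, 3, 3]]
p = glcm(roi)
roi_contrast = contrast(p)
```

    Features such as contrast, computed for several offsets, would then form part of the descriptor vector passed to a classifier.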

  9. Review on Feature Selection Techniques and the Impact of SVM for Cancer Classification using Gene Expression Profile

    CERN Document Server

    George, G Victo Sudha; 10.5121/ijcses.2011.2302

    2011-01-01

    The DNA microarray technology has modernized the approach of biology research in such a way that scientists can now measure the expression levels of thousands of genes simultaneously in a single experiment. Gene expression profiles, which represent the state of a cell at a molecular level, have great potential as a medical diagnosis tool. But compared to the number of genes involved, available training data sets generally have a fairly small sample size for classification. These training data limitations constitute a challenge to certain classification methodologies. Feature selection techniques can be used to extract the marker genes which influence the classification accuracy effectively by eliminating the unwanted noisy and redundant genes. This paper presents a review of feature selection techniques that have been employed in microarray data based cancer classification and also the predominant role of SVM for cancer classification.

  10. REVIEW ON FEATURE SELECTION TECHNIQUES AND THE IMPACT OF SVM FOR CANCER CLASSIFICATION USING GENE EXPRESSION PROFILE

    Directory of Open Access Journals (Sweden)

    G.Victo Sudha George

    2011-09-01

    Full Text Available The DNA microarray technology has modernized the approach of biology research in such a way that scientists can now measure the expression levels of thousands of genes simultaneously in a single experiment. Gene expression profiles, which represent the state of a cell at a molecular level, have great potential as a medical diagnosis tool. But compared to the number of genes involved, available training data sets generally have a fairly small sample size for classification. These training data limitations constitute a challenge to certain classification methodologies. Feature selection techniques can be used to extract the marker genes which influence the classification accuracy effectively by eliminating the unwanted noisy and redundant genes. This paper presents a review of feature selection techniques that have been employed in microarray data based cancer classification and also the predominant role of SVM for cancer classification.

  11. Colorectal Cancer Classification and Cell Heterogeneity: A Systems Oncology Approach

    Directory of Open Access Journals (Sweden)

    Moisés Blanco-Calvo

    2015-06-01

    Full Text Available Colorectal cancer is a heterogeneous disease that manifests through diverse clinical scenarios. For many years, our knowledge about the variability of colorectal tumors was limited to histopathological analysis, from which generic classifications associated with different clinical expectations are derived. However, we are now beginning to understand that strong genetic and biological heterogeneity underlies the intense pathological and clinical variability of these tumors. Thus, with the increasing available information on inter-tumor and intra-tumor heterogeneity, the classical pathological approach is being displaced in favor of novel molecular classifications. In the present article, we summarize the most relevant proposals of molecular classifications obtained from the analysis of colorectal tumors using powerful high throughput techniques and devices. We also discuss the role that cancer systems biology may play in the integration and interpretation of the large amount of data generated and the challenges to be addressed in the future development of precision oncology. In addition, we review the current state of implementation of these novel tools in the pathological laboratory and in clinical practice.

  12. Ladar-based terrain cover classification

    Science.gov (United States)

    Macedo, Jose; Manduchi, Roberto; Matthies, Larry H.

    2001-09-01

    An autonomous vehicle driving in a densely vegetated environment needs to be able to discriminate between obstacles (such as rocks) and penetrable vegetation (such as tall grass). We propose a technique for terrain cover classification based on the statistical analysis of the range data produced by a single-axis laser rangefinder (ladar). We first present theoretical models for the range distribution in the presence of homogeneously distributed grass and of obstacles partially occluded by grass. We then validate our results with real-world cases, and propose a simple algorithm to robustly discriminate between vegetation and obstacles based on the local statistical analysis of the range data.

  13. Digital image-based classification of biodiesel.

    Science.gov (United States)

    Costa, Gean Bezerra; Fernandes, David Douglas Sousa; Almeida, Valber Elias; Araújo, Thomas Souto Policarpo; Melo, Jessica Priscila; Diniz, Paulo Henrique Gonçalves Dias; Véras, Germano

    2015-07-01

    This work proposes a simple, rapid, inexpensive, and non-destructive methodology based on digital images and pattern recognition techniques for classification of biodiesel according to oil type (cottonseed, sunflower, corn, or soybean). For this, differing color histograms in RGB (extracted from digital images), HSI, Grayscale channels, and their combinations were used as analytical information, which was then statistically evaluated using Soft Independent Modeling by Class Analogy (SIMCA), Partial Least Squares Discriminant Analysis (PLS-DA), and variable selection using the Successive Projections Algorithm associated with Linear Discriminant Analysis (SPA-LDA). Despite good performances by the SIMCA and PLS-DA classification models, SPA-LDA provided better results (up to 95% for all approaches) in terms of accuracy, sensitivity, and specificity for both the training and test sets. The variables selected by the Successive Projections Algorithm clearly contained the information necessary for biodiesel type classification. This is important since a product may exhibit different properties, depending on the feedstock used. Such variations directly influence the quality, and consequently the price. Moreover, intrinsic advantages such as quick analysis, requiring no reagents, and a noteworthy reduction (the avoidance of chemical characterization) of waste generation, all contribute towards the primary objective of green chemistry. PMID:25882407

  14. BROAD PHONEME CLASSIFICATION USING SIGNAL BASED FEATURES

    Directory of Open Access Journals (Sweden)

    Deekshitha G

    2014-12-01

    Full Text Available Speech is the most efficient and popular means of human communication. Speech is produced as a sequence of phonemes. Phoneme recognition is the first step performed by an automatic speech recognition system. The state-of-the-art recognizers use mel-frequency cepstral coefficients (MFCC) features derived through short time analysis, for which the recognition accuracy is limited. Instead of this, here broad phoneme classification is achieved using features derived directly from the speech at the signal level itself. Broad phoneme classes include vowels, nasals, fricatives, stops, approximants and silence. The features identified as useful for broad phoneme classification are voiced/unvoiced decision, zero crossing rate (ZCR), short time energy, most dominant frequency, energy in most dominant frequency, spectral flatness measure and first three formants. Features derived from short time frames of training speech are used to train a multilayer feedforward neural network based classifier with manually marked class labels as output, and classification accuracy is then tested. Later this broad phoneme classifier is used for broad syllable structure prediction, which is useful for applications such as automatic speech recognition and automatic language identification.
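
    Two of the signal-level features listed above, zero crossing rate and short time energy, are simple enough to sketch directly. The frame size, hop length, normalization, and sample values below are assumptions for illustration; the abstract does not specify them.

```python
def frames(signal, size, hop):
    """Split a signal into overlapping short-time frames of a fixed size."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)

# Hypothetical frames: a rapidly alternating one and a constant one.
noisy = [1.0, -1.0, 1.0, -1.0]
steady = [1.0, 1.0, 1.0, 1.0]
features = [(zero_crossing_rate(f), short_time_energy(f)) for f in (noisy, steady)]
```

    A high zero crossing rate with low energy is typical of unvoiced sounds such as fricatives, while voiced sounds such as vowels tend to show the opposite pattern, which is why the two features are commonly used together.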

  15. Changing Histopathological Diagnostics by Genome-Based Tumor Classification

    Directory of Open Access Journals (Sweden)

    Michael Kloth

    2014-05-01

    Full Text Available Traditionally, tumors are classified by histopathological criteria, i.e., based on their specific morphological appearances. Consequently, current therapeutic decisions in oncology are strongly influenced by histology rather than underlying molecular or genomic aberrations. The increase of information on molecular changes however, enabled by the Human Genome Project and the International Cancer Genome Consortium as well as the manifold advances in molecular biology and high-throughput sequencing techniques, inaugurated the integration of genomic information into disease classification. Furthermore, in some cases it became evident that former classifications needed major revision and adaption. Such adaptations are often required by understanding the pathogenesis of a disease from a specific molecular alteration, using this molecular driver for targeted and highly effective therapies. Altogether, reclassifications should lead to higher information content of the underlying diagnoses, reflecting their molecular pathogenesis and resulting in optimized and individual therapeutic decisions. The objective of this article is to summarize some particularly important examples of genome-based classification approaches and associated therapeutic concepts. In addition to reviewing disease specific markers, we focus on potentially therapeutic or predictive markers and the relevance of molecular diagnostics in disease monitoring.

  16. Call for a Computer-Aided Cancer Detection and Classification Research Initiative in Oman.

    Science.gov (United States)

    Mirzal, Andri; Chaudhry, Shafique Ahmad

    2016-01-01

    Cancer is a major health problem in Oman. It is reported that cancer incidence in Oman is the second highest after Saudi Arabia among Gulf Cooperation Council countries. Based on GLOBOCAN estimates, Oman is predicted to face an almost two-fold increase in cancer incidence in the period 2008-2020. However, cancer research in Oman is still in its infancy. This is due to the fact that medical institutions and infrastructure that play central roles in data collection and analysis are relatively new developments in Oman. We believe the country requires an organized plan and efforts to promote local cancer research. In this paper, we discuss current research progress in cancer diagnosis using machine learning techniques to optimize computer aided cancer detection and classification (CAD). We specifically discuss CAD using two major types of medical data, i.e., medical imaging and microarray gene expression profiling, because medical imaging modalities like mammography, MRI, and PET have been widely used in Oman for assisting radiologists in early cancer diagnosis and microarray data have been proven to be a reliable source for differential diagnosis. We also discuss future cancer research directions and the benefits to Oman's economy of entering the cancer research and treatment business, as it is a multi-billion dollar industry worldwide. PMID:27268600

  17. Side effects of cancer therapies. International classification and documentation systems

    International Nuclear Information System (INIS)

    The publication presents and explains verified, international classification and documentation systems for side effects induced by cancer treatments, applicable in general and clinical practice and clinical research, and covers in a clearly arranged manner the whole range of treatments, including acute and chronic side effects of chemotherapy and radiotherapy, surgery, or combined therapies. The book fills a long-felt need in tumor documentation and is a major contribution to quality assurance in clinical oncology in German-speaking countries. As most parts of the book are bilingual, presenting German and English texts and terminology, it satisfies the principles of interdisciplinarity and internationality. The tabulated form chosen for presentation of classification systems and criteria facilitate the user's approach as well as application in daily work. (orig./CB)

  18. A MapReduce based Parallel SVM for Email Classification

    OpenAIRE

    Ke Xu; Cui Wen; Qiong Yuan; Xiangzhu He; Jun Tie

    2014-01-01

    Support Vector Machine (SVM) is a powerful classification and regression tool. Varying approaches including SVM based techniques are proposed for email classification. Automated email classification according to messages or user-specific folders and information extraction from chronologically ordered email streams have become interesting areas in text machine learning research. This paper presents a parallel SVM based on MapReduce (PSMR) algorithm for email classification. We discuss the chal...

  19. Les cancers de la cavité buccale et de l'oropharynx dans le monde : incidence internationale et classification TNM dans les registres du cancer

    OpenAIRE

    de Camargo Cancela, Marianna

    2010-01-01

    Oral cavity and oropharynx cancers : International incidence and TNM classification in population-based cancer registries The aim of this work was to know and to evaluate the epidemiological patterns of oral cavity and oropharynx cancers. These topographies share some common risk factors and they are often grouped in epidemiological studies. However, the implication of the human papilloma virus in oropharyngeal tumors led us to provide incidence rates according to the anatomical classificat...

  20. A Ranking-Based Meta-Analysis Reveals Let-7 Family as a Meta-Signature for Grade Classification in Breast Cancer

    OpenAIRE

    Oztemur, Yasemin; Bekmez, Tufan; Aydos, Alp; Yulug, Isik G; Bozkurt, Betul; Dedeoglu, Bala Gur

    2015-01-01

    Breast cancer is one of the most important causes of cancer-related deaths worldwide in women. In addition to gene expression studies, the progressing work in the miRNA area including miRNA microarray studies, brings new aspects to the research on the cancer development and progression. Microarray technology has been widely used to find new biomarkers in research and many transcriptomic microarray studies are available in public databases. In this study, the breast cancer miRNA and mRNA micro...

  1. TNM staging and classification (familial and nonfamilial) of breast cancer in Jordanian females

    Directory of Open Access Journals (Sweden)

    M F Atoum

    2010-01-01

    Full Text Available Purpose : Staging of breast tumor has important implications for treatment and prognosis. This study aims at pinpointing the frequency of each stage among familial and nonfamilial breast cancers. Materials and Methods : Ninety-nine Jordanian females diagnosed with familial and nonfamilial breast cancer between 2000 and 2002 were enrolled in this study. All breast cancer cases were staged according to the TNM classification into in situ, early invasive, advanced invasive and metastatic. Results : Forty-three cases were familial breast cancer and 56 were nonfamilial. One case was diagnosed as ductal carcinoma in situ (DCIS). Fifty cases were diagnosed in early stages of invasive breast cancer, of which 31 were familial; 29 cases were classified as advanced invasive, of which 21 were nonfamilial; and 19 cases were in the metastatic stage, of which 16 were nonfamilial. Stage 2b was the most common stage of early invasive cases and represented 48% of the early stage of breast cancer. On the other hand, among cases diagnosed with advanced invasive breast cancer, stage 3a was the most common stage and represented 89.6% of the advanced stage. Interestingly, all cases of stage 3a belonged to TNM stages of T2N2M0 and T3N1M0. The tumor size in all cases of Jordanian females diagnosed with advanced invasive breast cancer exceeded 2 cm due to selection bias from symptomatic women in our study. Conclusion : The incidence of nonfamilial breast cancer was slightly higher than that of the familial type among the Jordanian females studied. The early invasive stage of breast cancer was more common in the familial type, while the advanced invasive and metastatic breast cancer cases were encountered more often in the nonfamilial type. Our study was based on a small sample and symptomatic women. Therefore, more research with larger population samples is needed to confirm this conclusion.

  2. Classification of Cancer-related Death Certificates using Machine Learning

    Directory of Open Access Journals (Sweden)

    Luke Butt

    2013-05-01

    Full Text Available Background: Cancer monitoring and prevention relies on the timely notification of cancer cases. However, the abstraction and classification of cancer from the free text of pathology reports and other relevant documents, such as death certificates, are complex and time-consuming activities. Aims: In this paper, approaches for the automatic detection of notifiable cancer cases as the cause of death from free-text death certificates supplied to Cancer Registries are investigated. Method: A number of machine learning classifiers were studied. Features were extracted using natural language techniques and the Medtex toolkit. The numerous features encompassed stemmed words, bi-grams, and concepts from the SNOMED CT medical terminology. The baseline consisted of a keyword spotter using keywords extracted from the long description of ICD-10 cancer related codes. Results: Death certificates with notifiable cancer listed as the cause of death can be effectively identified with the methods studied in this paper. A Support Vector Machine (SVM) classifier achieved the best performance, with an overall F-measure of 0.9866 when evaluated on a set of 5,000 free-text death certificates using the token stem feature set. The SNOMED CT concept plus token stem feature set reached the lowest variance (0.0032) and false negative rate (0.0297) while achieving an F-measure of 0.9864. The SVM classifier accounts for the first 18 of the top 40 evaluated runs, and is the most robust classifier with a variance of 0.001141, half the variance of the other classifiers. Conclusion: The selection of features had the most significant influence on the performance of the classifiers, although the type of classifier employed also affects performance. In contrast, the feature weighting schema had a negligible effect on performance.
Specifically, it is found that stemmed tokens with or without SNOMED CT concepts create the most effective feature when combined with
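
    The evaluation measures quoted above (F-measure and false negative rate) and the bi-gram features can be illustrated with the short sketch below. The confusion counts and the example tokens are hypothetical and are not taken from the study; the balanced F1 form of the F-measure is also an assumption, since the abstract does not state the weighting.

```python
def bigrams(tokens):
    """Adjacent token pairs, one of the feature types mentioned in the abstract."""
    return list(zip(tokens, tokens[1:]))

def f_measure(tp, fp, fn):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def false_negative_rate(tp, fn):
    """Share of true cancer cases that the classifier missed."""
    return fn / (tp + fn)

# Hypothetical confusion counts for a cancer / non-cancer classifier.
tp, fp, fn = 90, 10, 10
score = f_measure(tp, fp, fn)
fnr = false_negative_rate(tp, fn)
pairs = bigrams(["metastatic", "lung", "carcinoma"])
```

    For a cancer registry the false negative rate matters as much as the F-measure, because a missed notifiable case is not recovered later in the pipeline.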

  3. Cirrhosis classification based on texture classification of random features.

    Science.gov (United States)

    Liu, Hui; Shao, Ying; Guo, Dongmei; Zheng, Yuanjie; Zhao, Zuowei; Qiu, Tianshuang

    2014-01-01

    Accurate staging of hepatic cirrhosis is important in investigating the cause and slowing down the effects of cirrhosis. Computer-aided diagnosis (CAD) can provide doctors with a second opinion and help them choose a specific treatment based on an accurate cirrhosis stage. MRI has many advantages, including high resolution for soft tissue, no radiation, and multiparameter imaging modalities. In this paper, therefore, multisequence MRIs, including T1-weighted, T2-weighted, arterial, portal venous, and equilibrium phase, are applied. However, CAD does not yet meet the clinical needs of cirrhosis, and few researchers are concerned with it at present. Cirrhosis is characterized by the presence of widespread fibrosis and regenerative nodules in the liver, leading to different texture patterns at different stages, so extracting texture features is the primary task. Compared with typical gray level cooccurrence matrix (GLCM) features, texture classification from random features provides an effective alternative; we adopt it and propose CCTCRF for three-class classification (normal, early, and middle and advanced stage). CCTCRF needs no strong assumptions except the sparse character of the image, contains sufficient texture information, involves a concise and effective process, and makes case decisions with high accuracy. Experimental results illustrate its satisfactory performance and are also compared with a typical NN with GLCM. PMID:24707317

  4. Fuzzy Rule Base System for Software Classification

    Directory of Open Access Journals (Sweden)

    Adnan Shaout

    2013-07-01

    Full Text Available Given the central role that software development plays in the delivery and application of information technology, managers have been focusing on process improvement in the software development area. This improvement has increased the demand for software measures, or metrics, to manage the process. These metrics provide a quantitative basis for the development and validation of models during the software development process. In this paper a fuzzy rule-based system will be developed to classify Java applications using object-oriented metrics. The system will contain the following features: an automated method to extract the OO metrics from the source code; a default/base set of rules that can be easily configured via an XML file so that companies, developers, team leaders, etc., can modify the set of rules according to their needs; implementation of a framework so that new metrics, fuzzy sets, and fuzzy rules can be added or removed depending on the needs of the end user; general classification of the software application and fine-grained classification of the Java classes based on OO metrics; and two interfaces for the system: GUI and command.
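
    A fuzzy rule base of the kind described above might look like the following sketch. Everything specific in it is an assumption for illustration: the two metrics (WMC, weighted methods per class, and CBO, coupling between objects), the membership breakpoints, the three output labels, and the min/max rule operators are not given in the abstract.

```python
def ramp_up(x, lo, hi):
    """Membership rising linearly from 0 at lo to 1 at hi."""
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))

def ramp_down(x, lo, hi):
    """Membership falling linearly from 1 at lo to 0 at hi."""
    return 1.0 - ramp_up(x, lo, hi)

def classify_class(wmc, cbo):
    """Fire three hypothetical rules (AND = min, OR = max) and return the strongest label."""
    low_wmc, high_wmc = ramp_down(wmc, 10, 30), ramp_up(wmc, 10, 30)
    low_cbo, high_cbo = ramp_down(cbo, 5, 15), ramp_up(cbo, 5, 15)
    strengths = {
        "simple": min(low_wmc, low_cbo),
        "moderate": max(min(low_wmc, high_cbo), min(high_wmc, low_cbo)),
        "complex": min(high_wmc, high_cbo),
    }
    return max(strengths, key=strengths.get)

# Hypothetical metric values for three Java classes.
labels = [classify_class(5, 2), classify_class(40, 2), classify_class(40, 20)]
```

    In a configurable system like the one the abstract describes, the breakpoints and rules would be loaded from an external file rather than hard-coded as they are here.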

  5. Classification of treatment-related mortality in children with cancer

    DEFF Research Database (Denmark)

    Alexander, Sarah; Pole, Jason D; Gibson, Paul;

    2015-01-01

    Treatment-related mortality is an important outcome in paediatric cancer clinical trials. An international group of experts in supportive care in paediatric cancer developed a consensus-based definition of treatment-related mortality and a cause-of-death attribution system. The reliability and va...

  6. PSG-Based Classification of Sleep Phases

    OpenAIRE

    Králík, M.

    2015-01-01

    This work is focused on the classification of sleep phases using an artificial neural network. An unconventional approach was used to calculate classification features from polysomnographic (PSG) data of real patients. This approach makes it possible to increase the time resolution of the analysis and, thus, to achieve more accurate classification results.

  7. A Dataset for Breast Cancer Histopathological Image Classification.

    Science.gov (United States)

    Spanhol, Fabio A; Oliveira, Luiz S; Petitjean, Caroline; Heutte, Laurent

    2016-07-01

    Today, medical image analysis papers require solid experiments to prove the usefulness of proposed methods. However, experiments are often performed on data selected by the researchers, which may come from different institutions, scanners, and populations. Different evaluation measures may be used, making it difficult to compare the methods. In this paper, we introduce a dataset of 7909 breast cancer histopathology images acquired from 82 patients, which is now publicly available from http://web.inf.ufpr.br/vri/breast-cancer-database. The dataset includes both benign and malignant images. The task associated with this dataset is the automated classification of these images into two classes, which would be a valuable computer-aided diagnosis tool for the clinician. In order to assess the difficulty of this task, we show some preliminary results obtained with state-of-the-art image classification systems. The accuracy ranges from 80% to 85%, showing that room for improvement remains. By providing this dataset and a standardized evaluation protocol to the scientific community, we hope to gather researchers in both the medical and the machine learning field to advance toward this clinical application. PMID:26540668

  8. Cancer Etiology and Selected Aspects of Cancer Pathology: A Decimal Classification, (Categories 51.4 and 51.5).

    Science.gov (United States)

    Schneider, John H.

    This is a hierarchical decimal classification of information related to various types of carcinogenesis (Chemical, viral, hormonal, radiation), cancer demography, and selected descriptive and "in vitro" aspects of cancer pathology. It is a working draft of categories taken from an extensive classification of many fields of biomedical information.…

  9. Malware Classification based on Call Graph Clustering

    CERN Document Server

    Kinable, Joris

    2010-01-01

    Each day, anti-virus companies receive tens of thousands of samples of potentially harmful executables. Many of the malicious samples are variations of previously encountered malware, created by their authors to evade pattern-based detection. Dealing with these large amounts of data requires robust, automatic detection approaches. This paper studies malware classification based on call graph clustering. By representing malware samples as call graphs, it is possible to abstract certain variations away, and enable the detection of structural similarities between samples. The ability to cluster similar samples together will make more generic detection techniques possible, thereby targeting the commonalities of the samples within a cluster. To compare call graphs mutually, we compute pairwise graph similarity scores via graph matchings which approximately minimize the graph edit distance. Next, to facilitate the discovery of similar malware samples, we employ several clustering algorithms, including k-medoids and DB...
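
    As a rough illustration of the clustering step in this record, the sketch below runs a simple alternating k-medoids variant on a precomputed pairwise distance matrix. It assumes the distances (standing in for approximate graph edit distances between call graphs) are already available; the function name `k_medoids`, its parameters, and the initialisation are illustrative choices, not the paper's implementation.

```python
def k_medoids(dist, k, initial_medoids=None, max_iter=100):
    """Cluster items given a symmetric distance matrix (list of lists).

    Returns (medoids, labels), where labels[i] is the index into medoids
    of the medoid closest to item i.
    """
    n = len(dist)
    # Assumption: default to the first k items as starting medoids.
    medoids = list(initial_medoids) if initial_medoids is not None else list(range(k))
    for _ in range(max_iter):
        # Assignment step: attach every item to its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i in range(n):
            nearest = min(medoids, key=lambda m: dist[i][m])
            clusters[nearest].append(i)
        # Update step: the new medoid minimises the total distance
        # to the other members of its cluster.
        new_medoids = []
        for medoid, members in clusters.items():
            if not members:  # keep the old medoid if its cluster emptied
                new_medoids.append(medoid)
                continue
            best = min(members, key=lambda c: sum(dist[c][j] for j in members))
            new_medoids.append(best)
        if sorted(new_medoids) == sorted(medoids):
            break
        medoids = new_medoids
    labels = [
        min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
        for i in range(n)
    ]
    return medoids, labels
```

    Because medoids are always actual samples, the method only needs pairwise distances, which is why it suits graph data where no meaningful "average graph" exists.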

  10. Automatic web services classification based on rough set theory

    Institute of Scientific and Technical Information of China (English)

    陈立; 张英; 宋自林; 苗壮

    2013-01-01

    With the development of web services technology, the number of existing services on the internet is growing day by day. In order to achieve automatic and accurate services classification, which can be beneficial for service-related tasks, a rough set theory based method for services classification was proposed. First, the services descriptions were preprocessed and represented as vectors. Inspired by the discernibility-matrix-based attribute reduction in rough set theory and taking into account the characteristics of the decision table of services classification, a method based on continuous discernibility matrices was proposed for dimensionality reduction. Finally, services classification was performed automatically. In the experiment, the proposed method for services classification achieves satisfactory classification results in all five testing categories. The experimental results show that the proposed method is accurate and could be used in practical web services classification.

  11. Pathohistological classification systems in gastric cancer: Diagnostic relevance and prognostic value

    OpenAIRE

    Berlth, Felix; Bollschweiler, Elfriede; Drebber, Uta; Hoelscher, Arnulf H; Moenig, Stefan

    2014-01-01

    Several pathohistological classification systems exist for the diagnosis of gastric cancer. Many studies have investigated the correlation between the pathohistological characteristics in gastric cancer and patient characteristics, disease specific criteria and overall outcome. It is still controversial as to which classification system imparts the most reliable information, and therefore, the choice of system may vary in clinical routine. In addition to the most common classification systems...

  12. Graph-based Methods for Orbit Classification

    Energy Technology Data Exchange (ETDEWEB)

    Bagherjeiran, A; Kamath, C

    2005-09-29

    An important step in the quest for low-cost fusion power is the ability to perform and analyze experiments in prototype fusion reactors. One of the tasks in the analysis of experimental data is the classification of orbits in Poincare plots. These plots are generated by the particles in a fusion reactor as they move within the toroidal device. In this paper, we describe the use of graph-based methods to extract features from orbits. These features are then used to classify the orbits into several categories. Our results show that existing machine learning algorithms are successful in classifying orbits with few points, a situation which can arise in data from experiments.

  13. A MapReduce based Parallel SVM for Email Classification

    Directory of Open Access Journals (Sweden)

    Ke Xu

    2014-06-01

    Full Text Available Support Vector Machine (SVM) is a powerful classification and regression tool. Varying approaches including SVM based techniques are proposed for email classification. Automated email classification according to messages or user-specific folders and information extraction from chronologically ordered email streams have become interesting areas in text machine learning research. This paper presents a parallel SVM based on MapReduce (PSMR) algorithm for email classification. We discuss the challenges that arise from differences between email foldering and traditional document classification. We show experimental results from an array of automated classification methods and evaluation methodologies, including Naive Bayes, SVM and PSMR method of foldering results on the Enron datasets based on the timeline. By distributing, processing and optimizing the subsets of the training data across multiple participating nodes, the parallel SVM based on MapReduce algorithm reduces the training time significantly.

  14. Cancer Feature Selection and Classification Using a Binary Quantum-Behaved Particle Swarm Optimization and Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Maolong Xi

    2016-01-01

    Full Text Available This paper focuses on the feature gene selection for cancer classification, which employs an optimization algorithm to select a subset of the genes. We propose a binary quantum-behaved particle swarm optimization (BQPSO) for cancer feature gene selection, coupling support vector machine (SVM) for cancer classification. First, the proposed BQPSO algorithm is described, which is a discretized version of original QPSO for binary 0-1 optimization problems. Then, we present the principle and procedure for cancer feature gene selection and cancer classification based on BQPSO and SVM with leave-one-out cross validation (LOOCV). Finally, the BQPSO coupling SVM (BQPSO/SVM), binary PSO coupling SVM (BPSO/SVM), and genetic algorithm coupling SVM (GA/SVM) are tested for feature gene selection and cancer classification on five microarray data sets, namely, Leukemia, Prostate, Colon, Lung, and Lymphoma. The experimental results show that BQPSO/SVM has significant advantages in accuracy, robustness, and the number of feature genes selected compared with the other two algorithms.
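
    The way a single candidate gene subset might be scored in such a wrapper approach can be sketched as below. This is only an illustration of leave-one-out cross-validation over a binary feature mask: a nearest-centroid classifier stands in for the SVM so the example needs no external library, the BQPSO search itself is not implemented, and all function names are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, sample):
    """Predict the label whose class mean is closest to sample.

    Stand-in for the SVM used in the paper (an assumption of this sketch).
    """
    sums, counts = {}, {}
    for row, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1

    def squared_distance(label):
        return sum(
            (total / counts[label] - value) ** 2
            for total, value in zip(sums[label], sample)
        )

    return min(sums, key=squared_distance)


def loocv_accuracy(x, y, mask):
    """Leave-one-out accuracy using only the genes where mask[j] == 1."""
    selected = [j for j, bit in enumerate(mask) if bit]
    if not selected:
        return 0.0  # an empty gene subset cannot classify anything
    reduced = [[row[j] for j in selected] for row in x]
    correct = 0
    for i in range(len(reduced)):
        # Hold out sample i, train on the rest, test on the held-out sample.
        train_x = reduced[:i] + reduced[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_x, train_y, reduced[i]) == y[i]:
            correct += 1
    return correct / len(reduced)
```

    A binary optimizer would call `loocv_accuracy` as (part of) its fitness function, typically alongside a penalty on the number of selected genes.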

  15. Classification techniques based on AI application to defect classification in cast aluminum

    Science.gov (United States)

    Platero, Carlos; Fernandez, Carlos; Campoy, Pascual; Aracil, Rafael

    1994-11-01

    This paper describes the Artificial Intelligence techniques applied to the interpretation of images of cast aluminum surfaces presenting different defects. The whole process includes on-line defect detection, feature extraction and defect classification. These topics are discussed in depth through the paper. Data preprocessing, as well as segmentation and feature extraction, are described. At this point, the algorithms employed along with the descriptors used are shown. A syntactic filter has been developed to model the information and to generate the input vector to the classification system. Classification of defects is achieved by means of rule-based systems, fuzzy models and neural nets. Different classification subsystems perform together for the resolution of a pattern recognition problem (hybrid systems). Firstly, syntactic methods are used to obtain the filter that reduces the dimension of the input vector to the classification process. Rule-based classification is achieved associating a grammar to each defect type; the knowledge-base will be formed by the information derived from the syntactic filter along with the inferred rules. The fuzzy classification sub-system uses production rules with fuzzy antecedent and their consequents are ownership rates to every defect type. Different architectures of neural nets have been implemented with different results, as shown along the paper. In the higher classification level, the information given by the heterogeneous systems as well as the history of the process is supplied to an Expert System in order to drive the casting process.

  16. Small Sample Issues for Microarray-Based Classification

    OpenAIRE

    Dougherty, Edward R

    2006-01-01

    In order to study the molecular biological differences between normal and diseased tissues, it is desirable to perform classification among diseases and stages of disease using microarray-based gene-expression values. Owing to the limited number of microarrays typically used in these studies, serious issues arise with respect to the design, performance and analysis of classifiers based on microarray data. This paper reviews some fundamental issues facing small-sample classification: classific...

  17. Gender Classification Based on Geometry Features of Palm Image

    OpenAIRE

    Ming Wu; Yubo Yuan

    2014-01-01

    This paper presents a novel gender classification method based on geometry features of palm image which is simple, fast, and easy to handle. This gender classification method based on geometry features comprises two main attributes. The first one is feature extraction by image processing. The other one is classification system with polynomial smooth support vector machine (PSSVM). A total of 180 palm images were collected from 30 persons to verify the validity of the proposed gender classi...

  18. DNA sequence analysis using hierarchical ART-based classification networks

    Energy Technology Data Exchange (ETDEWEB)

    LeBlanc, C.; Hruska, S.I. [Florida State Univ., Tallahassee, FL (United States); Katholi, C.R.; Unnasch, T.R. [Univ. of Alabama, Birmingham, AL (United States)

    1994-12-31

    Adaptive resonance theory (ART) describes a class of artificial neural network architectures that act as classification tools which self-organize, work in real-time, and require no retraining to classify novel sequences. We have adapted ART networks to provide support to scientists attempting to categorize tandem repeat DNA fragments from Onchocerca volvulus. In this approach, sequences of DNA fragments are presented to multiple ART-based networks which are linked together into two (or more) tiers; the first provides coarse sequence classification while the subsequent tiers refine the classifications as needed. The overall rating of the resulting classification of fragments is measured using statistical techniques based on those introduced to validate results from traditional phylogenetic analysis. Tests of the Hierarchical ART-based Classification Network, or HABclass network, indicate its value as a fast, easy-to-use classification tool which adapts to new data without retraining on previously classified data.

  19. Does the use of the 2009 FIGO classification of endometrial cancer impact on indications of the sentinel node biopsy?

    International Nuclear Information System (INIS)

    Lymphadenectomy is debated in early-stage endometrial cancer. Moreover, a new FIGO classification of endometrial cancer, merging stages IA and IB, has been recently published. Therefore, the aims of the present study were to evaluate the relevance of the sentinel node (SN) procedure in women with endometrial cancer and to discuss whether the use of the 2009 FIGO classification could modify the indications for the SN procedure. Eighty-five patients with endometrial cancer underwent the SN procedure followed by pelvic lymphadenectomy. SNs were detected with a dual or single labelling method in 74 and 11 cases, respectively. All SNs were analysed by both H&E staining and immunohistochemistry. Presumed stage before surgery was assessed for all patients based on MR imaging features using the 1988 FIGO classification and the 2009 FIGO classification. An SN was detected in 88.2% of cases (75/85 women). Among the fourteen patients with lymph node metastases, one-half were detected by serial sectioning and immunohistochemical analysis. There were no false-negative cases. Using the 1988 FIGO classification and the 2009 FIGO classification, the correlation between preoperative MRI staging and final histology was moderate with Kappa = 0.24 and Kappa = 0.45, respectively. None of the patients with grade 1 endometrioid carcinoma on biopsy and IA 2009 FIGO stage on MR imaging exhibited positive SN. In patients with grade 2-3 endometrioid carcinoma and stage IA on MR imaging, the rate of positive SN reached 16.6% with an incidence of micrometastases of 50%. The present study suggests that sentinel node biopsy is an adequate technique to evaluate lymph node status. The use of the 2009 FIGO classification increases the accuracy of MR imaging to stage patients with early stages of endometrial cancer and helps clarify the indication of SN biopsy according to tumour grade and histological type.
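
    The Kappa values quoted in this record measure agreement between two stagings of the same patients. Assuming they refer to Cohen's kappa (the abstract does not say which variant was used), the statistic can be computed as in the sketch below; the function name and the stage labels in the usage note are illustrative and are not data from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two equally long sequences of categorical labels.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e the agreement expected by chance from the marginal frequencies.
    Undefined (division by zero) when p_e equals 1.
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)
```

    For instance, passing the MRI-based stage of each patient as `rater_a` and the histological stage as `rater_b` yields one kappa per classification scheme, which is how the two FIGO versions could be compared on the same cohort.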

  20. Artificial neural networks as classification and diagnostic tools for lymph node-negative breast cancers

    Energy Technology Data Exchange (ETDEWEB)

    Eswari J, Satya; Chandrakar, Neha [National Institute of Technology Raipur, Raipur (India)

    2016-04-15

    Artificial neural networks (ANNs) can be used to develop a technique to classify lymph node negative breast cancer that is prone to distant metastases based on gene expression signatures. The neural network used is a multilayered feed forward network that employs the backpropagation algorithm. Once trained with DNA microarray-based gene expression profiles of genes that were predictive of distant metastasis recurrence of lymph node negative breast cancer, the ANNs became capable of correctly classifying all samples and recognizing the genes most appropriate to the classification. To test the ability of the trained ANN models in recognizing lymph node negative breast cancer, we analyzed additional idle samples that were not used beforehand for the training procedure and obtained the correctly classified result in the validation set. For more substantial results, bootstrapping of the training and testing datasets was performed as external validation. This study illustrates the potential application of ANN for breast tumor diagnosis and the identification of candidate targets in patients for therapy.

  1. Classification of Laser Induced Fluorescence Spectra from Normal and Malignant bladder tissues using Learning Vector Quantization Neural Network in Bladder Cancer Diagnosis

    DEFF Research Database (Denmark)

    Karemore, Gopal Raghunath; Mascarenhas, Kim Komal; Patil, Choudhary;

    2008-01-01

    classification accuracy of LVQ with other classifiers (e.g., SVM and Multi Layer Perceptron) for the same data set. Good agreement has been obtained between LVQ based classification of spectroscopy data and histopathology results, which demonstrates the use of the LVQ classifier in bladder cancer diagnosis....

  2. Improved prognostic classification of breast cancer defined by antagonistic activation patterns of immune response pathway modules

    International Nuclear Information System (INIS)

    Elucidating the activation pattern of molecular pathways across a given tumour type is a key challenge necessary for understanding the heterogeneity in clinical response and for developing novel more effective therapies. Gene expression signatures of molecular pathway activation derived from perturbation experiments in model systems as well as structural models of molecular interactions ('model signatures') constitute an important resource for estimating corresponding activation levels in tumours. However, relatively few strategies for estimating pathway activity from such model signatures exist and only few studies have used activation patterns of pathways to refine molecular classifications of cancer. Here we propose a novel network-based method for estimating pathway activation in tumours from model signatures. We find that although the pathway networks inferred from cancer expression data are highly consistent with the prior information contained in the model signatures, they also exhibit a highly modular structure, and that estimation of pathway activity is dependent on this modular structure. We apply our methodology to a panel of 438 estrogen receptor negative (ER-) and 785 estrogen receptor positive (ER+) breast cancers to infer activation patterns of important cancer related molecular pathways. We show that in ER negative basal and HER2+ breast cancer, gene expression modules reflecting T-cell helper-1 (Th1) and T-cell helper-2 (Th2) mediated immune responses play antagonistic roles as major risk factors for distant metastasis. Using Boolean interaction Cox-regression models to identify non-linear pathway combinations associated with clinical outcome, we show that simultaneous high activation of Th1 and low activation of a TGF-beta pathway module defines a subtype of particularly good prognosis and that this classification provides a better prognostic model than those based on the individual pathways. In ER+ breast cancer, we find that

  3. Modern classification of breast cancer: should we stick with morphology or convert to molecular profile characteristics.

    Science.gov (United States)

    Rakha, Emad A; Ellis, Ian O

    2011-07-01

    Breast cancer represents a heterogeneous group of tumors with varied morphologic and biological features, behavior, and response to therapy. The present routine clinical management of breast cancer relies on the availability of robust prognostic and predictive factors to support decision making. Breast cancer patients are stratified into risk groups based on a combination of classical time-dependent prognostic variables (staging) and biological prognostic and predictive variables. Staging variables include tumor size, lymph node stage, and extent of tumor spread. Classical biological variables include morphologic variables such as tumor grade and molecular markers such as hormone receptor and human epidermal growth factor receptor 2 status. Although individual molecular markers were introduced in the field of breast cancer management many years ago, the concept of molecular classification was raised after the introduction of global gene expression profiling and the identification of multigene classifiers. Although there is no doubt that gene expression profiling technology has revolutionized the field of breast cancer research and has been widely expected to improve breast cancer prognostication, the unprecedented speed of progress and publicity associated with the introduction of these commercially-based multigene classifiers should not lead us to expect this technology to replace the classical classification systems. These multigene classifiers have the potential to complement traditional methods through provision of additional biological prognostic and predictive information in presently indeterminate risk groups. Here we present updated information on the present clinical value of classical clinicopathologic factors, molecular taxonomy, and multigene classifiers in routine patient management and provide some critical views and practical expectations. PMID:21654357

  4. Risk-based classification system of nanomaterials

    International Nuclear Information System (INIS)

    Various stakeholders are increasingly interested in the potential toxicity and other risks associated with nanomaterials throughout the different stages of a product's life cycle (e.g., development, production, use, disposal). Risk assessment methods and tools developed and applied to chemical and biological materials may not be readily adaptable for nanomaterials because of the current uncertainty in identifying the relevant physico-chemical and biological properties that adequately describe the materials. Such uncertainty is further driven by the substantial variations in the properties of the original material due to variable manufacturing processes employed in nanomaterial production. To guide scientists and engineers in nanomaterial research and application as well as to promote the safe handling and use of these materials, we propose a decision support system for classifying nanomaterials into different risk categories. The classification system is based on a set of performance metrics that measure both the toxicity and physico-chemical characteristics of the original materials, as well as the expected environmental impacts through the product life cycle. Stochastic multicriteria acceptability analysis (SMAA-TRI), a formal decision analysis method, was used as the foundation for this task. This method allowed us to cluster various nanomaterials in different ecological risk categories based on our current knowledge of nanomaterial physico-chemical characteristics, variation in produced material, and best professional judgments. SMAA-TRI uses Monte Carlo simulations to explore all feasible values for weights, criteria measurements, and other model parameters to assess the robustness of nanomaterial grouping for risk management purposes.

  5. Classification of CMEs Based on Their Dynamics

    Science.gov (United States)

    Nicewicz, J.; Michalek, G.

    2016-05-01

    A large set of coronal mass ejections (CMEs; 6621 events) has been selected to study their dynamics in the field of view (LFOV) of the Large Angle and Spectroscopic Coronagraph (LASCO) onboard the Solar and Heliospheric Observatory (SOHO). These events were selected based on having at least six height-time measurements so that their dynamic properties, in the LFOV, can be evaluated with reasonable accuracy. Height-time measurements (in the SOHO/LASCO catalog) were used to determine the velocities and accelerations of individual CMEs at successive distances from the Sun. Linear and quadratic functions were fitted to these data points. On the basis of the best fits to the velocity data points, we were able to classify CMEs into four groups. The types of CMEs differ not only in dynamic behavior but also in mass, width, velocity, and acceleration. We also show that these groups of events are initiated by different onset mechanisms. The results of our study allow us to present a consistent classification of CMEs based on their dynamics.
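
    Fitting linear and quadratic functions to a short series of measurements, as this record describes, can be sketched with ordinary least squares. The code below is an illustrative assumption about the procedure, not the authors' pipeline: it solves the normal equations directly in pure Python, and the helper names `polyfit` and `residual_sum_of_squares` are hypothetical.

```python
def polyfit(ts, hs, degree):
    """Least-squares polynomial fit; returns [c0, c1, ...] for sum(c_k * t**k)."""
    m = degree + 1
    # Normal equations A c = b with A[r][c] = sum(t^(r+c)), b[r] = sum(h * t^r).
    a = [[sum(t ** (r + c) for t in ts) for c in range(m)] for r in range(m)]
    b = [sum(h * t ** r for t, h in zip(ts, hs)) for r in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, m):
            factor = a[r][col] / a[col][col]
            for c in range(col, m):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for r in reversed(range(m)):
        tail = sum(a[r][c] * coeffs[c] for c in range(r + 1, m))
        coeffs[r] = (b[r] - tail) / a[r][r]
    return coeffs


def residual_sum_of_squares(ts, hs, coeffs):
    """Sum of squared differences between the data and the fitted polynomial."""
    return sum(
        (h - sum(c * t ** k for k, c in enumerate(coeffs))) ** 2
        for t, h in zip(ts, hs)
    )
```

    Applied to height-time data, the degree-1 fit corresponds to constant velocity (the slope `c1`), while the degree-2 fit adds a constant acceleration equal to `2 * c2`; comparing the two residuals is one simple way to decide which description suits an event better.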

  6. Structure-Based Algorithms for Microvessel Classification

    KAUST Repository

    Smith, Amy F.

    2015-02-01

    © 2014 The Authors. Microcirculation published by John Wiley & Sons Ltd. Objective: Recent developments in high-resolution imaging techniques have enabled digital reconstruction of three-dimensional sections of microvascular networks down to the capillary scale. To better interpret these large data sets, our goal is to distinguish branching trees of arterioles and venules from capillaries. Methods: Two novel algorithms are presented for classifying vessels in microvascular anatomical data sets without requiring flow information. The algorithms are compared with a classification based on observed flow directions (considered the gold standard), and with an existing resistance-based method that relies only on structural data. Results: The first algorithm, developed for networks with one arteriolar and one venular tree, performs well in identifying arterioles and venules and is robust to parameter changes, but incorrectly labels a significant number of capillaries as arterioles or venules. The second algorithm, developed for networks with multiple inlets and outlets, correctly identifies more arterioles and venules, but is more sensitive to parameter changes. Conclusions: The algorithms presented here can be used to classify microvessels in large microvascular data sets lacking flow information. This provides a basis for analyzing the distinct geometrical properties and modelling the functional behavior of arterioles, capillaries, and venules.

  7. Computational hepatocellular carcinoma tumor grading based on cell nuclei classification.

    Science.gov (United States)

    Atupelage, Chamidu; Nagahashi, Hiroshi; Kimura, Fumikazu; Yamaguchi, Masahiro; Tokiya, Abe; Hashiguchi, Akinori; Sakamoto, Michiie

    2014-10-01

    Hepatocellular carcinoma (HCC) is the most common histological type of primary liver cancer. HCC is graded according to the malignancy of the tissues. It is important to diagnose low-grade HCC tumors because these tissues have good prognosis. Image interpretation-based computer-aided diagnosis (CAD) systems have been developed to automate the HCC grading process. Generally, the HCC grade is determined by the characteristics of liver cell nuclei. Therefore, it is preferable that CAD systems utilize only liver cell nuclei for HCC grading. This paper proposes an automated HCC diagnosing method. In particular, it defines a pipeline-path that excludes nonliver cell nuclei in two consequent pipeline-modules and utilizes the liver cell nuclear features for HCC grading. The significance of excluding the nonliver cell nuclei for HCC grading is experimentally evaluated. Four categories of liver cell nuclear features were utilized for classifying the HCC tumors. Results indicated that nuclear texture is the dominant feature for HCC grading and others contribute to increase the classification accuracy. The proposed method was employed to classify a set of regions of interest selected from HCC whole slide images into five classes and resulted in a 95.97% correct classification rate. PMID:26158066

  8. Cellular based cancer vaccines

    DEFF Research Database (Denmark)

    Hansen, Morten; Met, O; Svane, I M;

    2012-01-01

    Cancer vaccines designed to re-calibrate the existing host-tumour interaction, tipping the balance from tumor acceptance towards tumor control, hold huge potential to complement traditional cancer therapies. In general, limited success has been achieved with vaccines composed of tumor...... in vitro migration via autocrine receptor-mediated endocytosis of CCR7. In the current review, we discuss optimal design of DC maturation focused on pre-clinical as well as clinical results from standard and polarized dendritic cell based cancer vaccines....

  9. Classification problems in object-based representation systems

    OpenAIRE

    Napoli, Amedeo

    1999-01-01

    Classification is a process that consists in two dual operations: generating a set of classes and then classifying given objects into the created classes. The class generation may be understood as a learning process and object classification as a problem-solving process. The goal of this position paper is to introduce and to make precise the notion of a classification problem in object-based representation systems, e.g. a query against a class hierarchy, to define a subsumption relation betwe...

  10. Fuzzy Inference System & Fuzzy Cognitive Maps based Classification

    OpenAIRE

    Kanika Bhutani; Gaurav; Megha Kumar

    2015-01-01

    Fuzzy classification is valuable because it can use interpretable rules. It overcomes the limitations of crisp rule-based classifiers. This paper mainly deals with classification on the basis of the soft computing techniques fuzzy cognitive maps and fuzzy inference system on the lenses dataset. The results obtained with FIS show 100% accuracy. Sometimes the data available for classification contain missing or ambiguous data so Neutrosophic logic is used for cla...

  11. A new classification algorithm based on RGH-tree search

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    In this paper, we put forward a new classification algorithm based on RGH-Tree search and perform a classification analysis and comparison study. This algorithm can save computing resources and increase classification efficiency. The experiment shows that this algorithm performs better in dealing with three-dimensional multi-kind data. We find that the algorithm has better generalization ability for small training set and big testing result.

  12. Appraisal of progenitor markers in the context of molecular classification of breast cancers

    OpenAIRE

    Haviv, Izhak

    2011-01-01

    Clinical management of breast cancer relies on case stratification, which increasingly employs molecular markers. The motivation behind delineating breast epithelial differentiation is to better target cancer cases through innate sensitivities bequeathed to the cancer from its normal progenitor state. A combination of histopathological and molecular classification of breast cancer cases suggests a role for progenitors in particular breast cancer cases. Although a remarkable fraction of the re...

  13. A novel sparse coding algorithm for classification of tumors based on gene expression data.

    Science.gov (United States)

    Kolali Khormuji, Morteza; Bazrafkan, Mehrnoosh

    2016-06-01

    High-dimensional genomic and proteomic data play an important role in many applications in medicine such as prognosis of diseases, diagnosis, prevention and molecular biology, to name a few. Classifying such data is a challenging task due to the various issues such as curse of dimensionality, noise and redundancy. Recently, some researchers have used the sparse representation (SR) techniques to analyze high-dimensional biological data in various applications in classification of cancer patients based on gene expression datasets. A common problem with all SR-based biological data classification methods is that they cannot utilize the topological (geometrical) structure of data. More precisely, these methods transfer the data into sparse feature space without preserving the local structure of data points. In this paper, we proposed a novel SR-based cancer classification algorithm based on gene expression data that takes into account the geometrical information of all data. Precisely speaking, we incorporate the local linear embedding algorithm into the sparse coding framework, by which we can preserve the geometrical structure of all data. For performance comparison, we applied our algorithm on six tumor gene expression datasets, by which we demonstrate that the proposed method achieves higher classification accuracy than state-of-the-art SR-based tumor classification algorithms. PMID:26337064

  14. Fuzzy classification rules based on similarity

    Czech Academy of Sciences Publication Activity Database

    Holeňa, Martin; Štefka, D.

    Seňa : PONT s.r.o., 2012 - (Horváth, T.), pp. 25-31 ISBN 978-80-971144-0-4. [ITAT 2012. Conference on Theory and Practice of Information Technologies. Ždiar (SK), 17.09.2012-21.09.2012] R&D Projects: GA ČR GA201/08/0802 Institutional support: RVO:67985807 Keywords : classification rules * fuzzy classification * fuzzy integral * fuzzy measure * similarity Subject RIV: IN - Informatics, Computer Science

  15. Prediction of Depression in Cancer Patients With Different Classification Criteria, Linear Discriminant Analysis versus Logistic Regression

    Science.gov (United States)

    Shayan, Zahra; Mezerji, Naser Mohammad Gholi; Shayan, Leila; Naseri, Parisa

    2016-01-01

    Background: Logistic regression (LR) and linear discriminant analysis (LDA) are two popular statistical models for prediction of group membership. Although they are very similar, LDA makes more assumptions about the data. When categorical and continuous variables are used simultaneously, the optimal choice between the two models is questionable. In most studies, classification error (CE) is used to discriminate between subjects in several groups, but this index is not suitable to predict the accuracy of the outcome. The present study compared LR and LDA models using classification indices. Methods: This cross-sectional study selected 243 cancer patients. Sample sets of different sizes (n = 50, 100, 150, 200, 220) were randomly selected and the CE, B, and Q classification indices were calculated by the LR and LDA models. Results: CE revealed a lack of superiority for one model over the other, but the results showed that LR performed better than LDA for the B and Q indices in all situations. No significant effect for sample size on CE was noted for selection of an optimal model. Assessment of the accuracy of prediction of real data indicated that the B and Q indices are appropriate for selection of an optimal model. Conclusion: The results of this study showed that LR performs better in some cases and LDA in others when based on CE. The CE index is not appropriate for classification, although the B and Q indices performed better and offered more efficient criteria for comparison and discrimination between groups.

  16. Improving breast cancer classification with mammography, supported on an appropriate variable selection analysis

    Science.gov (United States)

    Pérez, Noel; Guevara, Miguel A.; Silva, Augusto

    2013-02-01

    This work addresses the issue of variable selection within the context of breast cancer classification with mammography. A comprehensive repository of feature vectors was used, including a hybrid subset gathering image-based and clinical features. It aimed to gather experimental evidence of variable selection in terms of cardinality and type, and to find a classification scheme that provides the best performance over the Area Under Receiver Operating Characteristics Curve (AUC) scores using the ranked features subset. We evaluated and classified a total of 300 subsets of features formed by the application of Chi-Square Discretization, Information-Gain, One-Rule and RELIEF methods in association with Feed-Forward Backpropagation Neural Network (FFBP), Support Vector Machine (SVM) and Decision Tree J48 (DTJ48) Machine Learning Algorithms (MLA) for a comparative performance evaluation based on AUC scores. A variable selection analysis was performed for Single-View Ranking and Multi-View Ranking groups of features. Feature subsets representing Microcalcifications (MCs), Masses and both MCs and Masses lesions achieved AUC scores of 0.91, 0.954 and 0.934 respectively. Experimental evidence demonstrated that classification performance was improved by combining image-based and clinical features. The most important clinical and image-based features were StromaDistortion and Circularity respectively. Other features, less important but worth using because of their consistency, were Contrast, Perimeter, Microcalcification, Correlation and Elongation.
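
    The AUC score used above to compare feature subsets can be computed directly from classifier outputs. The following is a minimal sketch, assuming binary labels (1 = positive) and real-valued scores; the function name `auc_score` and the toy data are illustrative and do not come from the study.

```python
def auc_score(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    labels: sequence of 0/1 values (1 = positive class)
    scores: classifier outputs, higher means "more likely positive"
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # AUC equals the fraction of (positive, negative) pairs in which the
    # positive sample receives the higher score; ties count as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): one positive is scored below one negative.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = auc_score(labels, scores)
print(round(auc, 3))
```

    The pairwise form is quadratic in the number of samples, which is acceptable for small evaluation sets; a rank-based implementation would be preferable for large ones.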

  17. Preliminary Research on Grassland Fine-classification Based on MODIS

    International Nuclear Information System (INIS)

    Grassland ecosystems are important for climate regulation and for maintaining soil and water. Research on grassland monitoring methods can provide an effective reference for grassland resource investigation. In this study, we used the vegetation index method for grassland classification. Because China has several types of climate, we used China's Main Climate Zone Maps to divide the study region into four climate zones. Based on the grassland classification system of the first nation-wide grass resource survey in China, we established a new grassland classification system suited only to this research. We used MODIS images as the basic data source and the expert classifier method to perform grassland classification. Based on the 1:1,000,000 Grassland Resource Map of China, we obtained the basic distribution of all the grassland types and selected 20 samples evenly distributed in each type, then used the NDVI/EVI product to summarize the spectral features of the different grassland types. Finally, we introduced auxiliary classification data, such as elevation, accumulated temperature (AT), humidity index (HI) and rainfall. China's nation-wide grassland classification map was obtained by merging the grassland classifications of the different climate zones. The overall classification accuracy is 60.4%. The result indicates that the expert classifier is suitable for nation-wide grassland classification, but the classification accuracy needs to be improved.
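
    As a rough illustration of a vegetation-index rule set of the kind an expert classifier applies, the sketch below computes NDVI from near-infrared and red reflectance and combines it with elevation. The NDVI formula is standard; the thresholds, class names and the function `classify_pixel` are hypothetical and are not taken from the study.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index: (NIR - Red) / (NIR + Red)."""
    total = nir + red
    if total == 0:
        return 0.0  # no signal in either band; treat as no vegetation
    return (nir - red) / total


def classify_pixel(nir, red, elevation_m):
    """Toy expert rules; every threshold and class name here is an assumption."""
    v = ndvi(nir, red)
    if v < 0.2:
        return "sparse or non-vegetated"
    if elevation_m > 3000:
        return "alpine grassland"
    if v < 0.5:
        return "steppe"
    return "meadow"


print(classify_pixel(0.5, 0.1, 500))    # high NDVI, low elevation
print(classify_pixel(0.4, 0.2, 500))    # moderate NDVI
print(classify_pixel(0.5, 0.1, 3500))   # high elevation
```

    In practice the rules would be defined separately for each climate zone and would use the other auxiliary layers named in the abstract (accumulated temperature, humidity index, rainfall).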

  18. Classification of Product Requirements Based on Product Environment

    OpenAIRE

    Chen, Zhen Yu; Zeng, Yong

    2006-01-01

    Abstract Effective management of product requirements is critical for designers to deliver a quality design solution in a reasonable range of cost and time. The management depends on a well-defined classification and a flexible representation of product requirements. This article proposes two classification criteria in terms of different partitions of product environment based on a formal structure of produ...

  19. Transportation Mode Choice Analysis Based on Classification Methods

    OpenAIRE

    Zeņina, N; Borisovs, A

    2011-01-01

    Mode choice analysis has received the most attention among discrete choice problems in travel behavior literature. Most traditional mode choice models are based on the principle of random utility maximization derived from econometric theory. This paper investigates performance of mode choice analysis with classification methods - decision trees, discriminant analysis and multinomial logit. Experimental results have demonstrated satisfactory quality of classification.

  20. A Curriculum-Based Classification System for Community Colleges.

    Science.gov (United States)

    Schuyler, Gwyer

    2003-01-01

    Proposes and tests a community college classification system based on curricular characteristics and their association with institutional characteristics. Seeks readily available data correlates to represent percentage of a college's course offerings that are in the liberal arts. A simple two-category classification system using total enrollment…

  1. An Object-Based Method for Chinese Landform Types Classification

    Science.gov (United States)

    Ding, Hu; Tao, Fei; Zhao, Wufan; Na, Jiaming; Tang, Guo'an

    2016-06-01

    Landform classification is a necessary task for various fields of landscape and regional planning, for example landscape evaluation, erosion studies, hazard prediction, etc. This study proposes an improved object-based classification for Chinese landform types using the factor importance analysis of random forest and the gray-level co-occurrence matrix (GLCM). In this research, based on a 1 km DEM of China, the combination of terrain factors extracted from the DEM is selected by correlation analysis and Sheffield's entropy method. A random forest classification tree is applied to evaluate the importance of the terrain factors, which are used as multi-scale segmentation thresholds. Then the GLCM is computed for the knowledge base of classification. The classification result was checked by using the 1:4,000,000 Chinese Geomorphological Map as reference. The overall classification accuracy of the proposed method is 5.7% higher than ISODATA unsupervised classification, and 15.7% higher than the traditional object-based classification method.
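
    To make the GLCM step concrete, here is a minimal sketch that builds a symmetric, normalized co-occurrence matrix for a single pixel offset and derives the contrast statistic from it. The helper names and the 4-level toy image are illustrative; the study's actual offsets, quantization and texture measures are not stated in the abstract.

```python
def glcm(image, dx=1, dy=0, levels=None):
    """Symmetric, normalized gray-level co-occurrence matrix for one offset.

    image: list of rows of integer gray levels in [0, levels)
    dx, dy: column and row offset between the two pixels of a pair
    """
    if levels is None:
        levels = max(max(row) for row in image) + 1
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count the pair in both directions
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("offset produces no pixel pairs for this image")
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """GLCM contrast: sum of p[i][j] * (i - j)^2."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


# Small 4-level test image, horizontal neighbour offset.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(image, dx=1, dy=0)
print(round(contrast(p), 4))
```

    For a DEM, the elevation (or a derived terrain factor) would first be quantized to a small number of levels before the matrix is built.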

  2. Classification of treatment-related mortality in children with cancer: a systematic assessment.

    Science.gov (United States)

    Alexander, Sarah; Pole, Jason D; Gibson, Paul; Lee, Michelle; Hesser, Tanya; Chi, Susan N; Dvorak, Christopher C; Fisher, Brian; Hasle, Henrik; Kanerva, Jukka; Möricke, Anja; Phillips, Bob; Raetz, Elizabeth; Rodriguez-Galindo, Carlos; Samarasinghe, Sujith; Schmiegelow, Kjeld; Tissing, Wim; Lehrnbecher, Thomas; Sung, Lillian

    2015-12-01

    Treatment-related mortality is an important outcome in paediatric cancer clinical trials. An international group of experts in supportive care in paediatric cancer developed a consensus-based definition of treatment-related mortality and a cause-of-death attribution system. The reliability and validity of the system was tested in 30 deaths, which were independently assessed by two clinical research associates and two paediatric oncologists. We defined treatment-related mortality as death occurring in the absence of progressive cancer. Of the 30 reviewed deaths, the reliability of classification for treatment-related mortality was noted as excellent by clinical research associates (κ=0·83, 95% CI 0·60-1·00) and paediatric oncologists (0·84, 0·63-1·00). Criterion validity was established because agreement between the consensus classifications by clinical research associates and paediatric oncologists was almost perfect (0·92, 0·78-1·00). Our approach should allow comparison of treatment-related mortality across trials and across time. PMID:26678213
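
    The κ values quoted above are chance-corrected agreement coefficients. The abstract does not say which variant was computed; the sketch below assumes Cohen's kappa for two raters, and the ten example ratings are made up for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected by chance if the raters labelled independently
    # with their own marginal frequencies.
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings of 10 deaths: the raters disagree on one case.
rater_a = ["TRM"] * 4 + ["other"] * 6
rater_b = ["TRM"] * 3 + ["other"] * 7
kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 3))
```

    Here raw agreement is 0.90 but chance agreement is 0.54, so kappa is noticeably lower than the raw figure; this is why kappa rather than percent agreement is reported for reliability.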

  3. Knowledge-Based Classification in Automated Soil Mapping

    Institute of Scientific and Technical Information of China (English)

    ZHOU BIN; WANG RENCHAO

    2003-01-01

    A machine-learning approach was developed for automated building of knowledge bases for soil resources mapping by using a classification tree to generate knowledge from training data. With this method, building a knowledge base for automated soil mapping was easier than using the conventional knowledge acquisition approach. The knowledge base built by classification tree was used by the knowledge classifier to perform the soil type classification of Longyou County, Zhejiang Province, China using Landsat TM bi-temporal images and GIS data. To evaluate the performance of the resultant knowledge bases, the classification results were compared to an existing soil map based on a field survey. The accuracy assessment and analysis of the resultant soil maps suggested that the knowledge bases built by the machine-learning method were of good quality for mapping the distribution model of soil classes over the study area.

  4. Shape classification based on singular value decomposition transform

    Institute of Scientific and Technical Information of China (English)

    SHAABAN Zyad; ARIF Thawar; BABA Sami; KREKOR Lala

    2009-01-01

    In this paper, a new shape classification system based on singular value decomposition (SVD) transform using a nearest neighbour classifier was proposed. The gray scale image of the shape object was converted into a black and white image. The squared Euclidean distance transform on the binary image was applied to extract the boundary image of the shape. SVD transform features were extracted from the boundary of the object shapes. The proposed classification system based on the SVD transform feature extraction method was compared with a classifier based on moment invariants using a nearest neighbour classifier. The experimental results showed the advantage of our proposed classification system.

  5. Behavior Based Social Dimensions Extraction for Multi-Label Classification.

    Directory of Open Access Journals (Sweden)

    Le Li

    Full Text Available Classification based on social dimensions is commonly used to handle the multi-label classification task in heterogeneous networks. However, traditional methods, which mostly rely on the community detection algorithms to extract the latent social dimensions, produce unsatisfactory performance when community detection algorithms fail. In this paper, we propose a novel behavior based social dimensions extraction method to improve the classification performance in multi-label heterogeneous networks. In our method, nodes' behavior features, instead of community memberships, are used to extract social dimensions. By introducing Latent Dirichlet Allocation (LDA) to model the network generation process, nodes' connection behaviors with different communities can be extracted accurately, which are applied as latent social dimensions for classification. Experiments on various public datasets reveal that the proposed method can obtain satisfactory classification results in comparison to other state-of-the-art methods on smaller social dimensions.

  6. Behavior Based Social Dimensions Extraction for Multi-Label Classification.

    Science.gov (United States)

    Li, Le; Xu, Junyi; Xiao, Weidong; Ge, Bin

    2016-01-01

    Classification based on social dimensions is commonly used to handle the multi-label classification task in heterogeneous networks. However, traditional methods, which mostly rely on the community detection algorithms to extract the latent social dimensions, produce unsatisfactory performance when community detection algorithms fail. In this paper, we propose a novel behavior based social dimensions extraction method to improve the classification performance in multi-label heterogeneous networks. In our method, nodes' behavior features, instead of community memberships, are used to extract social dimensions. By introducing Latent Dirichlet Allocation (LDA) to model the network generation process, nodes' connection behaviors with different communities can be extracted accurately, which are applied as latent social dimensions for classification. Experiments on various public datasets reveal that the proposed method can obtain satisfactory classification results in comparison to other state-of-the-art methods on smaller social dimensions. PMID:27049849

  7. Multiclass Classification Based on the Analytical Center of Version Space

    Institute of Scientific and Technical Information of China (English)

    ZENG Fanzi; QIU Zhengding; YUE Jianhai; LI Xiangqian

    2005-01-01

    Analytical center machine, based on the analytical center of version space, outperforms support vector machine, especially when the version space is elongated or asymmetric. While analytical center machine for binary classification is well understood, little is known about the corresponding multiclass classification. Moreover, the current multiclass classification method, “one versus all”, needs to repeatedly construct classifiers to separate a single class from all the others, which leads to daunting computation and low classification efficiency; and although multiclass support vector machine corresponds to a simple quadratic optimization, it is not very effective when the version space is asymmetric or elongated. Thus, a multiclass classification approach based on the analytical center of version space is proposed to address the above problems. Experiments on the wine recognition and glass identification datasets demonstrate the validity of the proposed approach.

  8. Behavior Based Social Dimensions Extraction for Multi-Label Classification

    Science.gov (United States)

    Li, Le; Xu, Junyi; Xiao, Weidong; Ge, Bin

    2016-01-01

    Classification based on social dimensions is commonly used to handle the multi-label classification task in heterogeneous networks. However, traditional methods, which mostly rely on the community detection algorithms to extract the latent social dimensions, produce unsatisfactory performance when community detection algorithms fail. In this paper, we propose a novel behavior based social dimensions extraction method to improve the classification performance in multi-label heterogeneous networks. In our method, nodes’ behavior features, instead of community memberships, are used to extract social dimensions. By introducing Latent Dirichlet Allocation (LDA) to model the network generation process, nodes’ connection behaviors with different communities can be extracted accurately, which are applied as latent social dimensions for classification. Experiments on various public datasets reveal that the proposed method can obtain satisfactory classification results in comparison to other state-of-the-art methods on smaller social dimensions. PMID:27049849

  9. EPA's program for risk assessment guidelines: Cancer classification issues

    Energy Technology Data Exchange (ETDEWEB)

    Wiltse, J. [Environmental Protection Agency, Washington, DC (United States)

    1990-12-31

    Issues presented are related to classification of weight of evidence in cancer risk assessments. The focus in this paper is on lines of evidence used in constructing a conclusion about potential human carcinogenicity. The paper also discusses issues that are mistakenly addressed as classification issues but are really part of the risk assessment process. 2 figs.

  10. Program Classification for Performance-Based Budgeting

    OpenAIRE

    Robinson, Marc

    2013-01-01

    This guide provides practical guidance on program classification, that is, on how to define programs and their constituent elements under a program budgeting system. Program budgeting is the most widespread form of performance budgeting as applied to the government budget as a whole. The defining characteristics of program budgeting are: (1) funds are allocated in the budget to results-bas...

  11. A Fuzzy Logic Based Sentiment Classification

    Directory of Open Access Journals (Sweden)

    J.I.Sheeba

    2014-07-01

    Full Text Available Sentiment classification aims to detect information such as opinions and explicit and implicit feelings expressed in text. Most existing approaches are able to detect either explicit or implicit expressions of sentiment in the text, but only separately. The proposed framework detects both implicit and explicit expressions in meeting transcripts. It classifies positive, negative, and neutral words and also identifies the topic of a particular meeting transcript by using fuzzy logic. This paper aims to add some additional features for improving the classification method. The quality of the sentiment classification is improved using the proposed fuzzy logic framework, which includes features such as fuzzy rules and the fuzzy C-means algorithm. The quality of the output is evaluated using parameters such as precision, recall, and F-measure. The fuzzy C-means clustering technique is measured in terms of purity and entropy. The data set was validated using the 10-fold cross-validation method, and a 95% confidence interval was observed between the accuracy values. Finally, the proposed fuzzy logic method produced more than 85% accurate results, and its error rate is much lower than that of existing sentiment classification techniques.
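
    The evaluation measures named above can be written down compactly. The sketch below assumes the usual definitions of precision, recall and F-measure (with equal weighting, i.e. F1) and of cluster purity and size-weighted cluster entropy; the abstract does not spell out its exact formulas, and the labels are toy data.

```python
import math
from collections import Counter


def precision_recall_f1(true_labels, predicted, positive):
    """Precision, recall and F1 for one class treated as 'positive'."""
    pairs = list(zip(true_labels, predicted))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def purity_and_entropy(cluster_ids, classes):
    """Purity and size-weighted entropy (in bits) of a hard clustering."""
    n = len(cluster_ids)
    members = {}
    for cid, cls in zip(cluster_ids, classes):
        members.setdefault(cid, Counter())[cls] += 1
    purity = sum(max(cnt.values()) for cnt in members.values()) / n
    entropy = 0.0
    for cnt in members.values():
        size = sum(cnt.values())
        h = -sum((v / size) * math.log2(v / size) for v in cnt.values())
        entropy += (size / n) * h
    return purity, entropy


# Toy sentiment labels (hypothetical).
truth = ["pos", "pos", "pos", "neg", "neg", "neu"]
pred = ["pos", "pos", "neg", "neg", "pos", "neu"]
prf = precision_recall_f1(truth, pred, positive="pos")

# Toy clustering: cluster 0 is mixed, cluster 1 is pure.
clusters = [0, 0, 0, 1, 1, 1]
classes = ["pos", "pos", "neg", "neg", "neg", "neg"]
pe = purity_and_entropy(clusters, classes)
print(prf, pe)
```

    Fuzzy C-means produces soft memberships, so applying these two measures would first require assigning each item to its highest-membership cluster; that step is assumed here rather than shown.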

  12. Classification tree analysis of second neoplasms in survivors of childhood cancer

    OpenAIRE

    Todorovski Ljupčo; Jazbec Janez; Jereb Berta

    2007-01-01

    Abstract Background Reports on childhood cancer survivors estimated cumulative probability of developing secondary neoplasms vary from 3,3% to 25% at 25 years from diagnosis, and the risk of developing another cancer to several times greater than in the general population. Methods In our retrospective study, we have used the classification tree multivariate method on a group of 849 first cancer survivors, to identify childhood cancer patients with the greatest risk for development of secondar...

  13. Two-Dimensional ARMA Modeling for Breast Cancer Detection and Classification

    CERN Document Server

    Bouaynaya, Nidhal; Schonfeld, Dan

    2009-01-01

    We propose a new model-based computer-aided diagnosis (CAD) system for tumor detection and classification (cancerous vs. benign) in breast images. Specifically, we show that (x-ray, ultrasound and MRI) images can be accurately modeled by two-dimensional autoregressive-moving average (ARMA) random fields. We derive two-stage Yule-Walker least-squares estimates of the model parameters, which are subsequently used as the basis for statistical inference and biophysical interpretation of the breast image. We use a k-means classifier to segment the breast image into three regions: healthy tissue, benign tumor, and cancerous tumor. Our simulation results on ultrasound breast images illustrate the power of the proposed approach.
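
    The three-region k-means step can be sketched as follows. This is a simplified illustration on a single scalar feature per image block, with a deterministic initialization chosen here for reproducibility; in the paper the features are derived from the estimated ARMA parameters, which this sketch does not compute.

```python
def kmeans_1d(values, k=3, iterations=100):
    """Lloyd's k-means on scalar features; returns (centers, labels)."""
    ordered = sorted(values)
    # Deterministic start (an assumption of this sketch): k evenly spaced
    # order statistics of the data.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: nearest center by squared distance.
        labels = [
            min(range(k), key=lambda c: (v - centers[c]) ** 2) for v in values
        ]
        # Update step: mean of each cluster (keep old center if empty).
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


# Hypothetical per-block feature values forming three groups.
features = [0.10, 0.12, 0.11, 0.50, 0.52, 0.49, 0.90, 0.88, 0.91]
centers, labels = kmeans_1d(features, k=3)
print(labels)
```

    Mapping the three resulting clusters to "healthy", "benign" and "cancerous" requires additional knowledge, since k-means itself only returns unlabeled groups.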

  14. Efficient molecular subtype classification of high-grade serous ovarian cancer.

    Science.gov (United States)

    Leong, Huei San; Galletta, Laura; Etemadmoghadam, Dariush; George, Joshy; Köbel, Martin; Ramus, Susan J; Bowtell, David

    2015-07-01

    High-grade serous carcinomas (HGSCs) account for approximately 70% of all epithelial ovarian cancers diagnosed. Using microarray gene expression profiling, we previously identified four molecular subtypes of HGSC: C1 (mesenchymal), C2 (immunoreactive), C4 (differentiated), and C5 (proliferative), which correlate with patient survival and have distinct biological features. Here, we describe molecular classification of HGSC based on a limited number of genes to allow cost-effective and high-throughput subtype analysis. We determined a minimal signature for accurate classification, including 39 differentially expressed and nine control genes from microarray experiments. Taqman-based (low-density arrays and Fluidigm), fluorescent oligonucleotides (Nanostring), and targeted RNA sequencing (Illumina) assays were then compared for their ability to correctly classify fresh and formalin-fixed, paraffin-embedded samples. All platforms achieved > 90% classification accuracy with RNA from fresh frozen samples. The Illumina and Nanostring assays were superior with fixed material. We found that the C1, C2, and C4 molecular subtypes were largely consistent across multiple surgical deposits from individual chemo-naive patients. In contrast, we observed substantial subtype heterogeneity in patients whose primary ovarian sample was classified as C5. The development of an efficient molecular classifier of HGSC should enable further biological characterization of molecular subtypes and the development of targeted clinical trials. PMID:25810134

  15. Gene expression classification of colon cancer into molecular subtypes: characterization, validation, and prognostic value.

    Directory of Open Access Journals (Sweden)

    Laetitia Marisa

    Full Text Available BACKGROUND: Colon cancer (CC) pathological staging fails to accurately predict recurrence, and to date, no gene expression signature has proven reliable for prognosis stratification in clinical practice, perhaps because CC is a heterogeneous disease. The aim of this study was to establish a comprehensive molecular classification of CC based on mRNA expression profile analyses. METHODS AND FINDINGS: Fresh-frozen primary tumor samples from a large multicenter cohort of 750 patients with stage I to IV CC who underwent surgery between 1987 and 2007 in seven centers were characterized for common DNA alterations, including BRAF, KRAS, and TP53 mutations, CpG island methylator phenotype, mismatch repair status, and chromosomal instability status, and were screened with whole genome and transcriptome arrays. 566 samples fulfilled RNA quality requirements. Unsupervised consensus hierarchical clustering applied to gene expression data from a discovery subset of 443 CC samples identified six molecular subtypes. These subtypes were associated with distinct clinicopathological characteristics, molecular alterations, specific enrichments of supervised gene expression signatures (stem cell phenotype-like, normal-like, serrated CC phenotype-like), and deregulated signaling pathways. Based on their main biological characteristics, we distinguished a deficient mismatch repair subtype, a KRAS mutant subtype, a cancer stem cell subtype, and three chromosomal instability subtypes, including one associated with down-regulated immune pathways, one with up-regulation of the Wnt pathway, and one displaying a normal-like gene expression profile. The classification was validated in the remaining 123 samples plus an independent set of 1,058 CC samples, including eight public datasets. Furthermore, prognosis was analyzed in the subset of stage II-III CC samples. The subtypes C4 and C6, but not the subtypes C1, C2, C3, and C5, were independently associated with shorter relapse...

  16. Network planning tool based on network classification and load prediction

    OpenAIRE

    Hammami, Seif eddine; Afifi, Hossam; Marot, Michel; Gauthier, Vincent

    2016-01-01

    Real Call Detail Records (CDR) are analyzed and classified based on Support Vector Machine (SVM) algorithm. The daily classification results in three traffic classes. We use two different algorithms, K-means and SVM to check the classification efficiency. A second support vector regression (SVR) based algorithm is built to make an online prediction of traffic load using the history of CDRs. Then, these algorithms will be integrated to a network planning tool which will help cellular operators...

  17. Iris Image Classification Based on Hierarchical Visual Codebook.

    Science.gov (United States)

    Zhenan Sun; Hui Zhang; Tieniu Tan; Jianyu Wang

    2014-06-01

    Iris recognition as a reliable method for personal identification has been well-studied with the objective to assign the class label of each iris image to a unique subject. In contrast, iris image classification aims to classify an iris image into an application-specific category, e.g., iris liveness detection (classification of genuine and fake iris images), race classification (e.g., classification of iris images of Asian and non-Asian subjects), coarse-to-fine iris identification (classification of all iris images in the central database into multiple categories). This paper proposes a general framework for iris image classification based on texture analysis. A novel texture pattern representation method called Hierarchical Visual Codebook (HVC) is proposed to encode the texture primitives of iris images. The proposed HVC method is an integration of two existing Bag-of-Words models, namely Vocabulary Tree (VT), and Locality-constrained Linear Coding (LLC). The HVC adopts a coarse-to-fine visual coding strategy and takes advantage of both VT and LLC for accurate and sparse representation of iris texture. Extensive experimental results demonstrate that the proposed iris image classification method achieves state-of-the-art performance for iris liveness detection, race classification, and coarse-to-fine iris identification. A comprehensive fake iris image database simulating four types of iris spoof attacks is developed as the benchmark for research of iris liveness detection. PMID:26353275

  18. A Classification-based Review Recommender

    Science.gov (United States)

    O'Mahony, Michael P.; Smyth, Barry

    Many online stores encourage their users to submit product/service reviews in order to guide future purchasing decisions. These reviews are often listed alongside product recommendations but, to date, limited attention has been paid as to how best to present these reviews to the end-user. In this paper, we describe a supervised classification approach that is designed to identify and recommend the most helpful product reviews. Using the TripAdvisor service as a case study, we compare the performance of several classification techniques using a range of features derived from hotel reviews. We then describe how these classifiers can be used as the basis for a practical recommender that automatically suggests the most helpful contrasting reviews to end-users. We present an empirical evaluation which shows that our approach achieves a statistically significant improvement over alternative review ranking schemes.

  19. Text document classification based on mixture models

    Czech Academy of Sciences Publication Activity Database

    Novovičová, Jana; Malík, Antonín

    2004-01-01

    Vol. 40, No. 3 (2004), pp. 293-304. ISSN 0023-5954 R&D Projects: GA AV ČR IAA2075302; GA ČR GA102/03/0049; GA AV ČR KSK1019101 Institutional research plan: CEZ:AV0Z1075907 Keywords : text classification * text categorization * multinomial mixture model Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.224, year: 2004

  20. Fast Wavelet-Based Visual Classification

    OpenAIRE

    Yu, Guoshen; Slotine, Jean-Jacques

    2008-01-01

    We investigate a biologically motivated approach to fast visual classification, directly inspired by the recent work of Serre et al. Specifically, trading-off biological accuracy for computational efficiency, we explore using wavelet and grouplet-like transforms to parallel the tuning of visual cortex V1 and V2 cells, alternated with max operations to achieve scale and translation invariance. A feature selection procedure is applied during learning to accelerate recognition. We introduce a si...

  1. Blurred Image Classification based on Adaptive Dictionary

    OpenAIRE

    Xiaofei Zhou; Guangling Sun; Jie Yin

    2012-01-01

    Two frameworks for blurred image classification based on adaptive dictionary are proposed. Given a blurred image, instead of image deblurring, the semantic category of the image is determined by blur insensitive sparse coefficients calculated depending on an adaptive dictionary. The dictionary is adaptive to an assumed space invariant Point Spread Function (PSF) estimated from the input blurred image. In one of th...

  2. A classification-based review recommender

    OpenAIRE

    O'Mahony, Michael P.; Smyth, Barry

    2010-01-01

    Many online stores encourage their users to submit product or service reviews in order to guide future purchasing decisions. These reviews are often listed alongside product recommendations but, to date, limited attention has been paid as to how best to present these reviews to the end-user. In this paper, we describe a supervised classification approach that is designed to identify and recommend the most helpful product reviews. Using the TripAdvisor service as a case study, we compare...

  3. Classification

    Science.gov (United States)

    Clary, Renee; Wandersee, James

    2013-01-01

    In this article, Renee Clary and James Wandersee describe the beginnings of "Classification," which lies at the very heart of science and depends upon pattern recognition. Clary and Wandersee approach patterns by first telling the story of the "Linnaean classification system," introduced by Carl Linnaeus (1707-1778), who is…

  4. A Comparative Analysis of Swarm Intelligence Techniques for Feature Selection in Cancer Classification

    Directory of Open Access Journals (Sweden)

    Chellamuthu Gunavathi

    2014-01-01

    Full Text Available Feature selection in cancer classification is a central area of research in the field of bioinformatics and used to select the informative genes from thousands of genes of the microarray. The genes are ranked based on T-statistics, signal-to-noise ratio (SNR), and F-test values. The swarm intelligence (SI) technique finds the informative genes from the top-m ranked genes. These selected genes are used for classification. In this paper the shuffled frog leaping with Lévy flight (SFLLF) is proposed for feature selection. In SFLLF, the Lévy flight is included to avoid premature convergence of shuffled frog leaping (SFL) algorithm. The SI techniques such as particle swarm optimization (PSO), cuckoo search (CS), SFL, and SFLLF are used for feature selection which identifies informative genes for classification. The k-nearest neighbour (k-NN) technique is used to classify the samples. The proposed work is applied on 10 different benchmark datasets and examined with SI techniques. The experimental results show that the results obtained from k-NN classifier through SFLLF feature selection method outperform PSO, CS, and SFL.

  5. A comparative analysis of swarm intelligence techniques for feature selection in cancer classification.

    Science.gov (United States)

    Gunavathi, Chellamuthu; Premalatha, Kandasamy

    2014-01-01

    Feature selection in cancer classification is a central area of research in the field of bioinformatics and used to select the informative genes from thousands of genes of the microarray. The genes are ranked based on T-statistics, signal-to-noise ratio (SNR), and F-test values. The swarm intelligence (SI) technique finds the informative genes from the top-m ranked genes. These selected genes are used for classification. In this paper the shuffled frog leaping with Lévy flight (SFLLF) is proposed for feature selection. In SFLLF, the Lévy flight is included to avoid premature convergence of shuffled frog leaping (SFL) algorithm. The SI techniques such as particle swarm optimization (PSO), cuckoo search (CS), SFL, and SFLLF are used for feature selection which identifies informative genes for classification. The k-nearest neighbour (k-NN) technique is used to classify the samples. The proposed work is applied on 10 different benchmark datasets and examined with SI techniques. The experimental results show that the results obtained from k-NN classifier through SFLLF feature selection method outperform PSO, CS, and SFL. PMID:25157377
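
    The filter-then-classify pipeline described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a two-class problem, uses the common SNR definition |mean_a - mean_b| / (std_a + std_b), which the abstract does not spell out, and omits the swarm intelligence search over the top-m genes entirely. The function names and the toy data are hypothetical.

```python
from collections import Counter
from statistics import mean, stdev


def snr_scores(samples, labels):
    """Per-gene signal-to-noise ratio for a two-class problem.

    Assumed definition: |mean_a - mean_b| / (std_a + std_b).
    Each class needs at least two samples for stdev to be defined.
    """
    classes = sorted(set(labels))
    group_a = [s for s, y in zip(samples, labels) if y == classes[0]]
    group_b = [s for s, y in zip(samples, labels) if y == classes[1]]
    scores = []
    for g in range(len(samples[0])):
        col_a = [s[g] for s in group_a]
        col_b = [s[g] for s in group_b]
        denom = stdev(col_a) + stdev(col_b)
        scores.append(abs(mean(col_a) - mean(col_b)) / denom if denom else 0.0)
    return scores


def top_m_genes(samples, labels, m):
    """Indices of the m genes with the highest SNR, best first."""
    scores = snr_scores(samples, labels)
    return sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)[:m]


def knn_predict(train, train_labels, query, genes, k=3):
    """Majority vote among the k nearest training samples,
    using squared Euclidean distance restricted to the selected genes."""
    def dist(u, v):
        return sum((u[g] - v[g]) ** 2 for g in genes)

    nearest = sorted(range(len(train)), key=lambda i: dist(train[i], query))[:k]
    return Counter(train_labels[i] for i in nearest).most_common(1)[0][0]


# Hypothetical toy expression matrix: 6 samples x 3 genes.
X = [
    [1.0, 5.0, 0.2], [1.2, 5.1, 0.9], [0.9, 4.9, 0.5],   # normal
    [3.0, 5.0, 0.4], [3.1, 5.2, 0.8], [2.9, 4.8, 0.3],   # tumor
]
y = ["normal"] * 3 + ["tumor"] * 3

selected = top_m_genes(X, y, 1)
prediction = knn_predict(X, y, [2.8, 5.0, 0.5], selected, k=3)
```

    In this toy matrix only gene 0 separates the two classes, so it is the one retained before the k-NN vote; a swarm-based search such as the one the abstract proposes would replace the simple top-m cut.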

  6. Hybrid Support Vector Machines-Based Multi-fault Classification

    Institute of Scientific and Technical Information of China (English)

    GAO Guo-hua; ZHANG Yong-zhong; ZHU Yu; DUAN Guang-huang

    2007-01-01

    Support Vector Machines (SVM) is a new general machine-learning tool based on the structural risk minimization principle. This characteristic is very significant for fault diagnostics when the number of fault samples is limited. Considering that SVM theory was originally designed for two-class classification, a hybrid SVM scheme is proposed for multi-fault classification of rotating machinery in our paper. Two SVM strategies, 1-v-1 (one versus one) and 1-v-r (one versus rest), are respectively adopted at different classification levels. At the parallel classification level, using the 1-v-1 strategy, the fault features extracted by various signal analysis methods are transferred into the multiple parallel SVMs and the local classification results are obtained. At the serial classification level, these local results are fused by one serial SVM based on the 1-v-r strategy. The hybrid SVM scheme introduced in our paper not only generalizes the performance of single binary SVMs but also improves the precision and reliability of the fault classification results. The actual testing results show the availability and suitability of this new method.

  7. Support vector classification algorithm based on variable parameter linear programming

    Institute of Scientific and Technical Information of China (English)

    Xiao Jianhua; Lin Jian

    2007-01-01

    To solve the problems of SVM in dealing with large sample size and asymmetric distributed samples, a support vector classification algorithm based on variable parameter linear programming is proposed. In the proposed algorithm, linear programming is employed to solve the optimization problem of classification to decrease the computation time and to reduce its complexity when compared with the original model. The adjusted punishment parameter greatly reduced the classification error resulting from asymmetric distributed samples and the detailed procedure of the proposed algorithm is given. An experiment is conducted to verify whether the proposed algorithm is suitable for asymmetric distributed samples.

  8. Words semantic orientation classification based on HowNet

    Institute of Scientific and Technical Information of China (English)

    LI Dun; MA Yong-tao; GUO Jian-li

    2009-01-01

    Based on the text orientation classification, a new measurement approach to semantic orientation of words was proposed. According to the integrated and detailed definition of words in HowNet, seed sets including the words with intense orientations were built up. The orientation similarity between the seed words and the given word was then calculated using the sentiment weight priority to recognize the semantic orientation of common words. Finally, the words' semantic orientation and the context were combined to recognize the given words' orientation. The experiments show that the measurement approach achieves better results for common words' orientation classification and contributes particularly to the text orientation classification of large granularities.

  9. Comparison of supervised classification methods for protein profiling in cancer diagnosis.

    Science.gov (United States)

    Dossat, Nadège; Mangé, Alain; Solassol, Jérôme; Jacot, William; Lhermitte, Ludovic; Maudelonde, Thierry; Daurès, Jean-Pierre; Molinari, Nicolas

    2007-01-01

    A key challenge in clinical proteomics of cancer is the identification of biomarkers that could allow detection, diagnosis and prognosis of the diseases. Recent advances in mass spectrometry and proteomic instrumentations offer unique chance to rapidly identify these markers. These advances pose considerable challenges, similar to those created by microarray-based investigation, for the discovery of pattern of markers from high-dimensional data, specific to each pathologic state (e.g. normal vs cancer). We propose a three-step strategy to select important markers from high-dimensional mass spectrometry data using surface enhanced laser desorption/ionization (SELDI) technology. The first two steps are the selection of the most discriminating biomarkers with a construction of different classifiers. Finally, we compare and validate their performance and robustness using different supervised classification methods such as Support Vector Machine, Linear Discriminant Analysis, Quadratic Discriminant Analysis, Neural Networks, Classification Trees and Boosting Trees. We show that the proposed method is suitable for analysing high-throughput proteomics data and that the combination of logistic regression and Linear Discriminant Analysis outperform other methods tested. PMID:19455249

  10. Feature Extraction based Face Recognition, Gender and Age Classification

    Directory of Open Access Journals (Sweden)

    Venugopal K R

    2010-01-01

    Full Text Available The face recognition system with large sets of training sets for personal identification normally attains good accuracy. In this paper, we proposed Feature Extraction based Face Recognition, Gender and Age Classification (FEBFRGAC) algorithm with only small training sets and it yields good results even with one image per person. This process involves three stages: Pre-processing, Feature Extraction and Classification. The geometric features of facial images like eyes, nose, mouth etc. are located by using Canny edge operator and face recognition is performed. Based on the texture and shape information gender and age classification is done using Posteriori Class Probability and Artificial Neural Network respectively. It is observed that the face recognition is 100%, the gender and age classification is around 98% and 94% respectively.

  11. A Human Gait Classification Method Based on Radar Doppler Spectrograms

    Directory of Open Access Journals (Sweden)

    Fok Hing Chi Tivive

    2010-01-01

    Full Text Available An image classification technique, which has recently been introduced for visual pattern recognition, is successfully applied for human gait classification based on radar Doppler signatures depicted in the time-frequency domain. The proposed method has three processing stages. The first two stages are designed to extract Doppler features that can effectively characterize human motion based on the nature of arm swings, and the third stage performs classification. Three types of arm motion are considered: free-arm swings, one-arm confined swings, and no-arm swings. The last two arm motions can be indicative of a human carrying objects or a person in stressed situations. The paper discusses the different steps of the proposed method for extracting distinctive Doppler features and demonstrates their contributions to the final and desirable classification rates.

  12. A NOVEL RULE-BASED FINGERPRINT CLASSIFICATION APPROACH

    Directory of Open Access Journals (Sweden)

    Faezeh Mirzaei

    2014-03-01

    Full Text Available Fingerprint classification is an important phase in increasing the speed of a fingerprint verification system and narrowing down the search of the fingerprint database. Fingerprint verification is still a challenging problem due to the difficulty of poor quality images and the need for faster response. The classification gets even harder when just one core has been detected in the input image. This paper proposes a new classification approach which includes the images with one core. The algorithm extracts singular points (core and deltas) from the input image and performs classification based on the number, locations and surrounding area of the detected singular points. The classifier is rule-based, where the rules are generated independently of a given data set. Moreover, shortcomings of a related paper are reported in detail. The experimental results and comparisons on the FVC2002 database have shown the effectiveness and efficiency of the proposed method.

  13. Analysis of Kernel Approach in Fuzzy-Based Image Classifications

    Directory of Open Access Journals (Sweden)

    Mragank Singhal

    2013-03-01

    Full Text Available This paper presents a framework of kernel approach in the field of fuzzy based image classification in remote sensing. The goal of image classification is to separate images according to their visual content into two or more disjoint classes. Fuzzy logic is a relatively young theory. A major advantage of this theory is that it allows problems to be described naturally, in linguistic terms, rather than in terms of relationships between precise numerical values. This paper describes how remote sensing data with uncertainty are handled with fuzzy based classification using the kernel approach for land use/land cover map generation. The introduction to fuzzification using the kernel approach provides the basis for the development of more robust approaches to the remote sensing classification problem. The kernel explicitly defines a similarity measure between two samples and implicitly represents the mapping of the input space to the feature space.
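
    The closing remark about kernels, an explicit similarity between two samples that implicitly corresponds to a mapping into a feature space, can be illustrated with a small sketch. The abstract does not say which kernel is used; the Gaussian (RBF) kernel below is an assumption chosen for illustration, and the `gamma` value and pixel vectors are hypothetical.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - y||^2).

    Returns a similarity in (0, 1]; identical samples score exactly 1.
    The corresponding feature-space mapping is never computed explicitly.
    """
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def kernel_matrix(samples, gamma=0.5):
    """Pairwise similarities between all samples (symmetric matrix)."""
    return [[rbf_kernel(p, q, gamma) for q in samples] for p in samples]


# Hypothetical two-band pixel vectors.
pixels = [[0.2, 0.4], [0.25, 0.38], [0.9, 0.1]]
K = kernel_matrix(pixels)
```

    Here the first two pixels, which are spectrally close, receive a similarity near 1, while the third is scored as less similar to both.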

  14. Bazhenov Fm Classification Based on Wireline Logs

    Science.gov (United States)

    Simonov, D. A.; Baranov, V.; Bukhanov, N.

    2016-03-01

    This paper considers the main aspects of Bazhenov Formation interpretation and application of machine learning algorithms for the Kolpashev type section of the Bazhenov Formation, application of automatic classification algorithms that would change the scale of research from small to large. Machine learning algorithms help interpret the Bazhenov Formation in a reference well and in other wells. During this study, unsupervised and supervised machine learning algorithms were applied to interpret lithology and reservoir properties. This greatly simplifies the routine problem of manual interpretation and has an economic effect on the cost of laboratory analysis.

  15. PLANNING BASED ON CLASSIFICATION BY INDUCTION GRAPH

    Directory of Open Access Journals (Sweden)

    Sofia Benbelkacem

    2013-11-01

    Full Text Available In Artificial Intelligence, planning refers to an area of research that proposes to develop systems that can automatically generate a result set, in the form of an integrated decision-making system through a formal procedure, known as plan. Instead of resorting to the scheduling algorithms to generate plans, it is proposed to operate the automatic learning by decision tree to optimize time. In this paper, we propose to build a classification model by induction graph from a learning sample containing plans that have an associated set of descriptors whose values change depending on each plan. This model will then operate for classifying new cases by assigning the appropriate plan.

  16. A Novel Fault Classification Scheme Based on Least Square SVM

    OpenAIRE

    Dubey, Harishchandra; Tiwari, A. K.; Nandita; Ray, P. K.; Mohanty, S. R.; Kishor, Nand

    2016-01-01

    This paper presents a novel approach for fault classification and section identification in a series compensated transmission line based on least square support vector machine. The current signal corresponding to one-fourth of the post fault cycle is used as input to the proposed modular LS-SVM classifier. The proposed scheme uses four binary classifiers: three for selection of the three phases and a fourth for ground detection. The proposed classification scheme is found to be accurate and reliable in ...

  17. Feature Extraction based Face Recognition, Gender and Age Classification

    OpenAIRE

    Venugopal K R; L M Patnaik; Ramesha K; K B Raja

    2010-01-01

    The face recognition system with large sets of training sets for personal identification normally attains good accuracy. In this paper, we proposed Feature Extraction based Face Recognition, Gender and Age Classification (FEBFRGAC) algorithm with only small training sets and it yields good results even with one image per person. This process involves three stages: Pre-processing, Feature Extraction and Classification. The geometric features of facial images like eyes, nose, mouth etc. are loc...

  18. Prognostic Significance of Subtype Classification for Short- and Long-Term Survival in Breast Cancer: Survival Time Holds the Key

    OpenAIRE

    Ambs, Stefan

    2010-01-01

    Stefan Ambs provides a perspective on a recent research article by Paul Pharoah and colleagues that evaluated the prognostic significance of immunohistochemical subtype classification in early breast cancer.

  19. Cancer cell detection and classification using transformation invariant template learning methods

    International Nuclear Information System (INIS)

    In traditional cancer cell detection, pathologists examine biopsies to make diagnostic assessments, largely based on cell morphology and tissue distribution. The process of image acquisition is highly subjective, and the pattern undergoes unknown or random transformations during data acquisition (e.g. variation in illumination, orientation, translation and perspective), which results in a high degree of variability. Transformed Component Analysis (TCA) incorporates a discrete, hidden variable that accounts for transformations and uses the Expectation Maximization (EM) algorithm to jointly extract components and normalize for transformations. Further, the TEMPLAR framework developed takes advantage of hierarchical pattern models and adds probabilistic modeling for local transformations. Pattern classification is based on the Expectation Maximization algorithm and General Likelihood Ratio Tests (GLRT). Performance of TEMPLAR is certainly improved by defining the area of interest on the slide a priori. Performance can be further enhanced by making the kernel function adaptive during learning. (author)

  20. Identifying Cancer Biomarkers Via Node Classification within a Mapreduce Framework

    Directory of Open Access Journals (Sweden)

    Taysir Hassan A. Soliman

    2015-12-01

    Full Text Available Big data are giving new research challenges in the life sciences domain because of their variety, volume, veracity, velocity, and value. Predicting gene biomarkers is one of the vital research issues in the bioinformatics field, where microarray gene expression and network based methods can be used. These datasets are very voluminous, causing main memory problems. In this paper, a Random Committee Node Classifier algorithm (RCNC) is proposed for identifying cancer biomarkers, which is based on microarray gene expression data and Protein-Protein Interaction (PPI) data. Data are enriched from other public databases, such as IntAct, UniProt and Gene Ontology (GO). Cancer biomarkers are identified when applied to different datasets with an accuracy rate of 99.16%, 99.96% precision, 99.24% recall, 99.16% F1-measure and 99.6 ROC. To speed up the performance, it is run within a MapReduce framework, where the RCNC MapReduce algorithm is much faster than the RCNC sequential algorithm when having large datasets.
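
    For readers unfamiliar with the evaluation measures quoted in this abstract, the standard definitions of accuracy, precision, recall and F1-measure can be written down directly from confusion-matrix counts. The sketch below uses made-up counts and is not derived from the paper's data; the function name is hypothetical.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard binary classification measures from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives. Ratios with an empty denominator are reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up counts for illustration only.
metrics = binary_metrics(tp=8, fp=2, fn=2, tn=8)
```

    The F1-measure is the harmonic mean of precision and recall, so it only approaches the accuracy figure when the two are balanced, as in the example counts.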

  1. Classification approach based on association rules mining for unbalanced data

    CERN Document Server

    Ndour, Cheikh

    2012-01-01

    This paper deals with the supervised classification when the response variable is binary and its class distribution is unbalanced. In such situation, it is not possible to build a powerful classifier by using standard methods such as logistic regression, classification tree, discriminant analysis, etc. To overcome this short-coming of these methods that provide classifiers with low sensibility, we tackled the classification problem here through an approach based on the association rules learning because this approach has the advantage of allowing the identification of the patterns that are well correlated with the target class. Association rules learning is a well known method in the area of data-mining. It is used when dealing with large database for unsupervised discovery of local patterns that expresses hidden relationships between variables. In considering association rules from a supervised learning point of view, a relevant set of weak classifiers is obtained from which one derives a classification rule...
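
    As a rough illustration of viewing association rules from a supervised angle, the sketch below scores a candidate rule "antecedent => target class" by its support and confidence on a small unbalanced sample. It is only an assumption-laden toy: the attribute names, the data and the function `rule_stats` are hypothetical, and the abstract does not describe how the rules are actually mined or combined.

```python
def rule_stats(rows, labels, antecedent, target):
    """Support and confidence of the rule `antecedent => target`.

    rows: list of dicts mapping attribute name to value.
    antecedent: dict of attribute/value pairs that must all match.
    support = fraction of all rows matching the antecedent AND labelled target.
    confidence = fraction of antecedent-matching rows labelled target.
    """
    covered = [y for row, y in zip(rows, labels)
               if all(row.get(k) == v for k, v in antecedent.items())]
    hits = sum(1 for y in covered if y == target)
    support = hits / len(rows) if rows else 0.0
    confidence = hits / len(covered) if covered else 0.0
    return support, confidence


# Hypothetical unbalanced sample: 2 positives out of 8 observations.
rows = [
    {"smoker": "yes", "age": "old"}, {"smoker": "yes", "age": "old"},
    {"smoker": "yes", "age": "young"}, {"smoker": "no", "age": "old"},
    {"smoker": "no", "age": "young"}, {"smoker": "no", "age": "young"},
    {"smoker": "no", "age": "old"}, {"smoker": "no", "age": "young"},
]
labels = [1, 1, 0, 0, 0, 0, 0, 0]

specific = rule_stats(rows, labels, {"smoker": "yes", "age": "old"}, target=1)
general = rule_stats(rows, labels, {"smoker": "yes"}, target=1)
```

    A rule with modest support but high confidence for the rare class is the kind of local pattern that can serve as one weak classifier among many.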

  2. Ensemble polarimetric SAR image classification based on contextual sparse representation

    Science.gov (United States)

    Zhang, Lamei; Wang, Xiao; Zou, Bin; Qiao, Zhijun

    2016-05-01

    Polarimetric SAR image interpretation has become one of the most interesting topics, in which the construction of a reasonable and effective image classification technique is of key importance. Sparse representation represents the data using the most succinct sparse atoms of the over-complete dictionary, and the advantages of sparse representation have also been confirmed in the field of PolSAR classification. However, like an ordinary classifier, it is not perfect in every respect. Ensemble learning is therefore introduced to address this issue: a plurality of different learners is trained, and the individual learners are combined to obtain more accurate and ideal learning results. Therefore, this paper presents a polarimetric SAR image classification method based on the ensemble learning of sparse representation to achieve the optimal classification.

  3. Blurred Image Classification Based on Adaptive Dictionary

    Directory of Open Access Journals (Sweden)

    Guangling Sun

    2013-02-01

    Full Text Available Two frameworks for blurred image classification based on adaptive dictionary are proposed. Given a blurred image, instead of image deblurring, the semantic category of the image is determined by blur insensitive sparse coefficients calculated depending on an adaptive dictionary. The dictionary is adaptive to an assumed space invariant Point Spread Function (PSF) estimated from the input blurred image. In one of the proposed two frameworks, the PSF is inferred separately and in the other, the PSF is updated combined with sparse coefficients calculation in an alternative and iterative manner. The experimental results have evaluated three types of blur namely defocus blur, simple motion blur and camera shake blur. The experiment results confirm the effectiveness of the proposed frameworks.

  4. Texture analysis applied to second harmonic generation image data for ovarian cancer classification

    Science.gov (United States)

    Wen, Bruce L.; Brewer, Molly A.; Nadiarnykh, Oleg; Hocker, James; Singh, Vikas; Mackie, Thomas R.; Campagnola, Paul J.

    2014-09-01

    Remodeling of the extracellular matrix has been implicated in ovarian cancer. To quantitate the remodeling, we implement a form of texture analysis to delineate the collagen fibrillar morphology observed in second harmonic generation microscopy images of human normal and high grade malignant ovarian tissues. In the learning stage, a dictionary of "textons" (frequently occurring texture features that are identified by measuring the image response to a filter bank of various shapes, sizes, and orientations) is created. By calculating a representative model based on the texton distribution for each tissue type using a training set of respective second harmonic generation images, we then perform classification between images of normal and high grade malignant ovarian tissues. By optimizing the number of textons and nearest neighbors, we achieved classification accuracy up to 97% based on the area under receiver operating characteristic curves (true positives versus false positives). The local analysis algorithm is a more general method to probe rapidly changing fibrillar morphologies than global analyses such as FFT. It is also more versatile than other texture approaches as the filter bank can be highly tailored to specific applications (e.g., different disease states) by creating customized libraries based on common image features.

  5. Classification of LiDAR Data with Point Based Classification Methods

    Science.gov (United States)

    Yastikli, N.; Cetin, Z.

    2016-06-01

    LiDAR is one of the most effective systems for 3 dimensional (3D) data collection in wide areas. Nowadays, airborne LiDAR data is used frequently in various applications such as object extraction, 3D modelling, change detection and revision of maps with increasing point density and accuracy. The classification of the LiDAR points is the first step of LiDAR data processing chain and should be handled in proper way since the 3D city modelling, building extraction, DEM generation, etc. applications directly use the classified point clouds. The different classification methods can be seen in recent researches and most of researches work with the gridded LiDAR point cloud. In grid based data processing of the LiDAR data, the characteristic point loss in the LiDAR point cloud especially vegetation and buildings or losing height accuracy during the interpolation stage are inevitable. In this case, the possible solution is the use of the raw point cloud data for classification to avoid data and accuracy loss in gridding process. In this study, the point based classification possibilities of the LiDAR point cloud is investigated to obtain more accurate classes. The automatic point based approaches, which are based on hierarchical rules, have been proposed to achieve ground, building and vegetation classes using the raw LiDAR point cloud data. In proposed approaches, every single LiDAR point is analyzed according to their features such as height, multi-return, etc. then automatically assigned to the class which they belong to. The use of un-gridded point cloud in proposed point based classification process helped the determination of more realistic rule sets. The detailed parameter analyses have been performed to obtain the most appropriate parameters in the rule sets to achieve accurate classes. The hierarchical rule sets were created for proposed Approach 1 (using selected spatial-based and echo-based features) and Approach 2 (using only selected spatial-based features

  6. ELABORATION OF A VECTOR BASED SEMANTIC CLASSIFICATION OVER THE WORDS AND NOTIONS OF THE NATURAL LANGUAGE

    OpenAIRE

    Safonov, K.; Lichargin, D.

    2009-01-01

    The problem of vector-based semantic classification over the words and notions of the natural language is discussed. A set of generative grammar rules is offered for generating the semantic classification vector. Examples of the classification application and a theorem of optional formal classification incompleteness are presented. The principles of assigning the meaningful phrases functions over the classification word groups are analyzed.

  7. Multiplatform analysis of 12 cancer types reveals molecular classification within and across tissues of origin

    DEFF Research Database (Denmark)

    Hoadley, Katherine A; Yau, Christina; Wolf, Denise M;

    2014-01-01

    Recent genomic analyses of pathologically defined tumor types identify "within-a-tissue" disease subtypes. However, the extent to which genomic signatures are shared across tissues is still unclear. We performed an integrative analysis using five genome-wide platforms and one proteomic platform on 3,527 specimens from 12 cancer types, revealing a unified classification into 11 major subtypes. Five subtypes were nearly identical to their tissue-of-origin counterparts, but several distinct cancer types were found to converge into common subtypes. Lung squamous, head and neck, and a subset of bladder cancers coalesced into one subtype typified by TP53 alterations, TP63 amplifications, and high expression of immune and proliferation pathway genes. Of note, bladder cancers split into three pan-cancer subtypes. The multiplatform classification, while correlated with tissue-of-origin, provides...

  8. Comparison of Computational Algorithms for the Classification of Liver Cancer using SELDI Mass Spectrometry: A Case Study

    Directory of Open Access Journals (Sweden)

    Robert J Hickey

    2007-01-01

    Full Text Available Introduction: As an alternative to DNA microarrays, mass spectrometry based analysis of proteomic patterns has shown great potential in cancer diagnosis. The ultimate application of this technique in clinical settings relies on the advancement of the technology itself and the maturity of the computational tools used to analyze the data. A number of computational algorithms constructed on different principles are available for the classification of disease status based on proteomic patterns. Nevertheless, few studies have addressed the difference in the performance of these approaches. In this report, we describe a comparative case study on the classification accuracy of hepatocellular carcinoma based on the serum proteomic pattern generated from a Surface Enhanced Laser Desorption/Ionization (SELDI) mass spectrometer. Methods: Nine supervised classification algorithms are implemented in R software and compared for the classification accuracy. Results: We found that the support vector machine with radial function is preferable as a tool for classification of hepatocellular carcinoma using features in SELDI mass spectra. Among the rest of the methods, random forest and prediction analysis of microarrays have better performance. A permutation-based technique reveals that the support vector machine with a radial function seems intrinsically superior in learning from the training data since it has a lower prediction error than others when there is essentially no differential signal. On the other hand, the performance of the random forest and prediction analysis of microarrays rely on their capability of capturing the signals with substantial differentiation between groups. Conclusions: Our finding is similar to a previous study, where classification methods based on the Matrix Assisted Laser Desorption/Ionization (MALDI) mass spectrometry are compared for the prediction accuracy of ovarian cancer. The support vector machine, random forest and prediction

  9. A new circulation type classification based upon Lagrangian air trajectories

    Science.gov (United States)

    Ramos, Alexandre; Sprenger, Michael; Wernli, Heini; Durán-Quesada, Ana María; Lorenzo, Maria Nieves; Gimeno, Luis

    2014-10-01

    A new classification method of the large-scale circulation characteristic for a specific target area (NW Iberian Peninsula) is presented, based on the analysis of 90-h backward trajectories arriving in this area calculated with the 3-D Lagrangian particle dispersion model FLEXPART. A cluster analysis is applied to separate the backward trajectories in up to five representative air streams for each day. Specific measures are then used to characterise the distinct air streams (e.g., curvature of the trajectories, cyclonic or anticyclonic flow, moisture evolution, origin and length of the trajectories). The robustness of the presented method is demonstrated in comparison with the Eulerian Lamb weather type classification. A case study of the 2003 heatwave is discussed in terms of the new Lagrangian circulation and the Lamb weather type classifications. It is shown that the new classification method adds valuable information about the pertinent meteorological conditions, which are missing in an Eulerian approach. The new method is climatologically evaluated for the five-year time period from December 1999 to November 2004. The ability of the method to capture the inter-seasonal circulation variability in the target region is shown. Furthermore, the multi-dimensional character of the classification is shortly discussed, in particular with respect to inter-seasonal differences. Finally, the relationship between the new Lagrangian classification and the precipitation in the target area is studied.

  10. A new circulation type classification based upon Lagrangian air trajectories

    Directory of Open Access Journals (Sweden)

    Alexandre M. Ramos

    2014-10-01

    Full Text Available A new classification method of the large-scale circulation characteristic for a specific target area (NW Iberian Peninsula) is presented, based on the analysis of 90-h backward trajectories arriving in this area calculated with the 3-D Lagrangian particle dispersion model FLEXPART. A cluster analysis is applied to separate the backward trajectories in up to five representative air streams for each day. Specific measures are then used to characterise the distinct air streams (e.g., curvature of the trajectories, cyclonic or anticyclonic flow, moisture evolution, origin and length of the trajectories). The robustness of the presented method is demonstrated in comparison with the Eulerian Lamb weather type classification. A case study of the 2003 heatwave is discussed in terms of the new Lagrangian circulation and the Lamb weather type classifications. It is shown that the new classification method adds valuable information about the pertinent meteorological conditions, which are missing in an Eulerian approach. The new method is climatologically evaluated for the five-year time period from December 1999 to November 2004. The ability of the method to capture the inter-seasonal circulation variability in the target region is shown. Furthermore, the multi-dimensional character of the classification is shortly discussed, in particular with respect to inter-seasonal differences. Finally, the relationship between the new Lagrangian classification and the precipitation in the target area is studied.

  11. 3D Land Cover Classification Based on Multispectral LIDAR Point Clouds

    Science.gov (United States)

    Zou, Xiaoliang; Zhao, Guihua; Li, Jonathan; Yang, Yuanxi; Fang, Yong

    2016-06-01

    A multispectral Lidar system can emit simultaneous laser pulses at different wavelengths. The reflected multispectral energy is captured through a receiver of the sensor, and the return signal together with the position and orientation information of the sensor is recorded. These recorded data are solved with GNSS/IMU data for further post-processing, forming high density multispectral 3D point clouds. As the first commercial multispectral airborne Lidar sensor, the Optech Titan system is capable of collecting point cloud data from all three channels: at 532 nm visible (Green), at 1064 nm near infrared (NIR) and at 1550 nm intermediate infrared (IR). It has become a new source of data for 3D land cover classification. The paper presents an Object Based Image Analysis (OBIA) approach that uses only multispectral Lidar point cloud datasets for 3D land cover classification. The approach consists of three steps. Firstly, multispectral intensity images are segmented into image objects on the basis of multi-resolution segmentation integrating different scale parameters. Secondly, intensity objects are classified into nine categories by using the customized features of classification indexes and a combination of the multispectral reflectance with the vertical distribution of object features. Finally, accuracy assessment is conducted by comparing random reference sample points from Google imagery tiles with the classification results. The classification results show higher overall accuracy for most of the land cover types. Over 90% overall accuracy is achieved by using multispectral Lidar point clouds for 3D land cover classification.

  12. Super pixel density based clustering automatic image classification method

    Science.gov (United States)

    Xu, Mingxing; Zhang, Chuan; Zhang, Tianxu

    2015-12-01

    Image classification is an important means of image segmentation and data mining, and achieving rapid automated image classification has been a focus of research. This paper proposes an algorithm based on the density of super-pixel cluster centers for automatic image classification and outlier identification. The pixel location coordinates and gray values of the image are used to compute density and distance, from which automatic image classification and outlier extraction are achieved. Because a growing number of pixels dramatically increases the computational complexity, the image is first preprocessed into a small number of super-pixel sub-blocks, and the density and distance calculations are performed on these. A normalized density and distance discrimination rule is also designed to achieve automatic classification and cluster-center selection, so that the image is classified and outliers are identified automatically. Extensive experiments show that the method requires no human intervention, computes faster than the density clustering algorithm, and can effectively perform automated image classification and outlier extraction.
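
    The density and distance computation mentioned in this abstract can be sketched as follows. This is a minimal illustration in the style of density-peak clustering, not the authors' implementation: the function name, the cutoff radius, the tie-breaking rule and the toy (x, y, gray) points are all assumptions, and the super-pixel preprocessing and normalization steps are omitted.

```python
import math

def density_and_distance(points, cutoff):
    """Return (rho, delta) lists for the given feature points.

    rho[i]   : number of other points closer than `cutoff` (local density)
    delta[i] : distance to the nearest point of higher density; ties in
               density are broken by index (an assumption of this sketch).
               The densest point gets its largest distance instead.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < cutoff)
           for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n)
                  if rho[j] > rho[i] or (rho[j] == rho[i] and j < i)]
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta

# Hypothetical (x, y, gray value) features: two compact groups and one stray point.
points = [(0, 0, 10), (0, 1, 12), (1, 0, 11), (1, 1, 10),
          (10, 10, 200), (10, 11, 198), (11, 10, 202),
          (5, 30, 90)]
rho, delta = density_and_distance(points, cutoff=3.0)
for p, r, d in zip(points, rho, delta):
    print(p, r, round(d, 2))
```

    In this reading, points with both high density and high distance are candidate cluster centers, while a point with low density and high distance (the last one here) is a candidate outlier.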

  13. Failure diagnosis using deep belief learning based health state classification

    International Nuclear Information System (INIS)

    Effective health diagnosis provides multifarious benefits such as improved safety, improved reliability and reduced costs for operation and maintenance of complex engineered systems. This paper presents a novel multi-sensor health diagnosis method using deep belief network (DBN). DBN has recently become a popular approach in machine learning for its promised advantages such as fast inference and the ability to encode richer and higher order network structures. The DBN employs a hierarchical structure with multiple stacked restricted Boltzmann machines and works through a layer by layer successive learning process. The proposed multi-sensor health diagnosis methodology using DBN based state classification can be structured in three consecutive stages: first, defining health states and preprocessing sensory data for DBN training and testing; second, developing DBN based classification models for diagnosis of predefined health states; third, validating DBN classification models with testing sensory dataset. Health diagnosis using DBN based health state classification technique is compared with four existing diagnosis techniques. Benchmark classification problems and two engineering health diagnosis applications: aircraft engine health diagnosis and electric power transformer health diagnosis are employed to demonstrate the efficacy of the proposed approach

  14. Classification of Gait Types Based on the Duty-factor

    DEFF Research Database (Denmark)

    Fihl, Preben; Moeslund, Thomas B.

    2007-01-01

    This paper deals with classification of human gait types based on the notion that different gait types are in fact different types of locomotion, i.e., running is not simply walking done faster. We present the duty-factor, which is a descriptor based on this notion. The duty-factor is independent...... with known ground support. Silhouettes are extracted using the Codebook method and represented using Shape Contexts. The matching with database silhouettes is done using the Hungarian method. While manually estimated duty-factors show a clear classification the presented system contains...

  15. Directional wavelet based features for colonic polyp classification.

    Science.gov (United States)

    Wimmer, Georg; Tamaki, Toru; Tischendorf, J J W; Häfner, Michael; Yoshida, Shigeto; Tanaka, Shinji; Uhl, Andreas

    2016-07-01

    In this work, various wavelet based methods like the discrete wavelet transform, the dual-tree complex wavelet transform, the Gabor wavelet transform, curvelets, contourlets and shearlets are applied for the automated classification of colonic polyps. The methods are tested on 8 HD-endoscopic image databases, where each database is acquired using different imaging modalities (Pentax's i-Scan technology combined with or without staining the mucosa), 2 NBI high-magnification databases and one database with chromoscopy high-magnification images. To evaluate the suitability of the wavelet based methods with respect to the classification of colonic polyps, the classification performances of 3 wavelet transforms and the more recent curvelets, contourlets and shearlets are compared using a common framework. Wavelet transforms were already often and successfully applied to the classification of colonic polyps, whereas curvelets, contourlets and shearlets have not been used for this purpose so far. We apply different feature extraction techniques to extract the information of the subbands of the wavelet based methods. Most of the in total 25 approaches were already published in different texture classification contexts. Thus, the aim is also to assess and compare their classification performance using a common framework. Three of the 25 approaches are novel. These three approaches extract Weibull features from the subbands of curvelets, contourlets and shearlets. Additionally, 5 state-of-the-art non wavelet based methods are applied to our databases so that we can compare their results with those of the wavelet based methods. It turned out that extracting Weibull distribution parameters from the subband coefficients generally leads to high classification results, especially for the dual-tree complex wavelet transform, the Gabor wavelet transform and the Shearlet transform. These three wavelet based transforms in combination with Weibull features even outperform the state

  16. Comparative Study of Classification Techniques on Breast Cancer FNA Biopsy Data

    Directory of Open Access Journals (Sweden)

    George Rumbe

    2010-12-01

    Full Text Available Accurate diagnostic detection of the cancerous cells in a patient is critical and may alter the subsequent treatment and increase the chances of survival. Machine learning techniques have been instrumental in disease detection and are currently being used in various classification problems due to their accurate prediction performance. Various techniques may provide different desired accuracies and it is therefore imperative to use the most suitable method which provides the best desired results. This research seeks to provide comparative analysis of Support Vector Machine, Bayesian classifier and other Artificial neural network classifiers (Backpropagation, linear programming, Learning vector quantization, and K nearest neighborhood) on the Wisconsin breast cancer classification problem.

  17. An Efficient Semantic Model For Concept Based Clustering And Classification

    Directory of Open Access Journals (Sweden)

    SaiSindhu Bandaru

    2012-03-01

    Full Text Available Usually in text mining techniques, basic measures like the term frequency of a term (word or phrase) are computed to estimate the importance of the term in the document. But with statistical analysis, the original semantics of the term may not carry the exact meaning of the term. To overcome this problem, a new framework has been introduced which relies on a concept based model and a synonym based approach. The proposed model can efficiently find significant matching and related concepts between documents according to concept based and synonym based approaches. Large sets of experiments using the proposed model on different data sets in clustering and classification are conducted. Experimental results demonstrate the substantial enhancement of the clustering quality using sentence based, document based, corpus based and combined approach concept analysis. A new similarity measure has been proposed to find the similarity between a document and the existing clusters, which can be used in classification of the document with existing clusters.

  18. Finding Combination of Features from Promoter Regions for Ovarian Cancer-related Gene Group Classification

    KAUST Repository

    Olayan, Rawan S.

    2012-12-01

    In classification problems, it is always important to use the suitable combination of features that will be employed by classifiers. Generating the right combination of features usually results in good classifiers. In the situation when the problem is not well understood, data items are usually described by many features in the hope that some of these may be the relevant or most relevant ones. In this study, we focus on one such problem related to genes implicated in ovarian cancer (OC). We try to recognize two important OC-related gene groups: oncogenes, which support the development and progression of OC, and oncosuppressors, which oppose such tendencies. For this, we use the properties of promoters of these genes. We identified potential “regulatory features” that characterize OC-related oncogenes and oncosuppressors promoters. In our study, we used 211 oncogenes and 39 oncosuppressors. For these, we identified 538 characteristic sequence motifs from their promoters. Promoters are annotated by these motifs and derived feature vectors used to develop classification models. We made a comparison of a number of classification models in their ability to distinguish oncogenes from oncosuppressors. Based on 10-fold cross-validation, the resultant model was able to separate the two classes with sensitivity of 96% and specificity of 100% with the complete set of features. Moreover, we developed another recognition model where we attempted to distinguish oncogenes and oncosuppressors as one group from other OC-related genes. That model achieved accuracy of 82%. We believe that the results of this study will help in discovering other OC-related oncogenes and oncosuppressors not identified as yet.

  19. Classification and Target Group Selection Based Upon Frequent Patterns

    NARCIS (Netherlands)

    W.H.L.M. Pijls (Wim); R. Potharst (Rob)

    2000-01-01

    In this technical report, two new algorithms based upon frequent patterns are proposed. One algorithm is a classification method. The other one is an algorithm for target group selection. In both algorithms, first of all, the collection of frequent patterns in the training set is constr

  20. Time Series Classification by Class-Based Mahalanobis Distances

    CERN Document Server

    Prekopcsák, Zoltán

    2010-01-01

    To classify time series by nearest neighbor, we need to specify or learn a distance. We consider several variations of the Mahalanobis distance and the related Large Margin Nearest Neighbor Classification (LMNN). We find that the conventional Mahalanobis distance is counterproductive. However, both LMNN and the class-based diagonal Mahalanobis distance are competitive.
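
    One possible reading of a class-based diagonal Mahalanobis distance is sketched below: each class gets its own per-time-step variance, and the squared difference to a training series is scaled by the variances of that series' class. The function names, the variance floor `eps` and the toy data are assumptions made for illustration; the record does not give the exact definition.

```python
from collections import defaultdict
from statistics import pvariance

def class_variances(series, labels, eps=1e-9):
    """Per-class, per-time-step variance (the diagonal of a class covariance)."""
    by_class = defaultdict(list)
    for s, y in zip(series, labels):
        by_class[y].append(s)
    # eps keeps the division below well defined for constant columns
    return {y: [pvariance(col) + eps for col in zip(*rows)]
            for y, rows in by_class.items()}

def classify(query, series, labels, variances):
    """1-nearest-neighbor using the variances of each candidate's own class."""
    def dist(s, y):
        return sum((q - v) ** 2 / var
                   for q, v, var in zip(query, s, variances[y]))
    _, best_label = min(zip(series, labels), key=lambda pair: dist(*pair))
    return best_label

# Hypothetical toy training set: three flat and three ramp-shaped series.
train = [[1.0, 1.1, 0.9, 1.0], [1.0, 0.9, 1.1, 1.0], [1.1, 1.0, 1.0, 0.9],
         [0.0, 1.0, 2.0, 3.0], [0.2, 1.1, 1.9, 3.2], [-0.1, 0.9, 2.2, 2.9]]
labels = ["flat", "flat", "flat", "ramp", "ramp", "ramp"]
variances = class_variances(train, labels)
print(classify([0.1, 1.0, 2.0, 3.1], train, labels, variances))
print(classify([1.0, 1.0, 1.0, 1.0], train, labels, variances))
```

    Restricting the covariance to its diagonal keeps the number of estimated parameters equal to the series length per class, which matters when few training series are available.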

  1. Classification-Based Method of Linear Multicriteria Optimization

    OpenAIRE

    Vassilev, Vassil; Genova, Krassimira; Vassileva, Mariyana; Narula, Subhash

    2003-01-01

    The paper describes a classification-based learning-oriented interactive method for solving linear multicriteria optimization problems. The method allows the decision makers describe their preferences with greater flexibility, accuracy and reliability. The method is realized in an experimental software system supporting the solution of multicriteria optimization problems.

  2. Hierarchical Real-time Network Traffic Classification Based on ECOC

    Directory of Open Access Journals (Sweden)

    Yaou Zhao

    2013-09-01

    Full Text Available Classification of network traffic is basic and essential for much network research and management. With the rapid development of peer-to-peer (P2P) applications using dynamic port disguising techniques and encryption to avoid detection, port-based and simple payload-based network traffic classification methods have become less effective. An alternative method based on statistics and machine learning has attracted researchers' attention in recent years. However, most of the proposed algorithms were off-line and usually used a single classifier. In this paper a new hierarchical real-time model was proposed which comprises a three-tuple (source IP, destination IP and destination port) look-up table (TT-LUT) part and a layered milestone part. TT-LUT was used to quickly classify short flows which need not pass the layered milestone part, and milestones in the layered milestone part could classify the other flows in real time with real-time feature selection and statistics. Every milestone was an ECOC (Error-Correcting Output Codes) based model which was used to improve classification performance. Experiments showed that the proposed model can improve the efficiency of real-time to 80%, and the multi-class classification accuracy encouragingly to 91.4% on the datasets which had been captured from the backbone router in our campus through a week.
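
    The error-correcting idea behind ECOC can be illustrated with a short sketch. Each class is assigned a binary codeword, one binary classifier predicts each bit, and the predicted bit string is decoded to the class with the nearest codeword in Hamming distance. The traffic class names and the 7-bit codebook below are hypothetical and are not taken from the paper.

```python
from itertools import combinations

# Hypothetical codebook: any two codewords differ in 4 of the 7 bits,
# so a single wrong bit classifier can still be corrected.
CODEBOOK = {
    "web":  (1, 1, 1, 1, 1, 1, 1),
    "p2p":  (0, 0, 0, 0, 1, 1, 1),
    "mail": (0, 0, 1, 1, 0, 0, 1),
    "dns":  (0, 1, 0, 1, 0, 1, 0),
}

def hamming(a, b):
    """Number of positions in which two bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))

def decode(bits):
    """Return the class whose codeword is closest to the predicted bits."""
    return min(CODEBOOK, key=lambda label: hamming(CODEBOOK[label], bits))

min_distance = min(hamming(a, b)
                   for a, b in combinations(CODEBOOK.values(), 2))
print(min_distance)
print(decode((0, 0, 1, 1, 0, 0, 1)))  # exact codeword for "mail"
print(decode((0, 0, 1, 1, 1, 0, 1)))  # same codeword with one bit flipped
```

    With a minimum pairwise distance of d between codewords, up to floor((d - 1) / 2) bit errors can be corrected, which is the usual motivation for preferring ECOC over a single multi-class classifier.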

  3. Optimizing Mining Association Rules for Artificial Immune System based Classification

    Directory of Open Access Journals (Sweden)

    SAMEER DIXIT

    2011-08-01

    Full Text Available The primary function of a biological immune system is to protect the body from foreign molecules known as antigens. It has great pattern recognition capability that may be used to distinguish between foreign cells entering the body (non-self or antigen) and the body cells (self). Immune systems have many characteristics such as uniqueness, autonomy, recognition of foreigners, distributed detection, and noise tolerance. Inspired by biological immune systems, Artificial Immune Systems have emerged during the last decade. They have incited many researchers to design and build immune-based models for a variety of application domains. Artificial immune systems can be defined as a computational paradigm that is inspired by theoretical immunology, observed immune functions, principles and mechanisms. Association rule mining is one of the most important and well researched techniques of data mining. The goal of association rules is to extract interesting correlations, frequent patterns, associations or causal structures among sets of items in the transaction databases or other data repositories. Association rules are widely used in various areas such as inventory control, telecommunication networks, intelligent decision making, market analysis and risk management etc. Apriori is the most widely used algorithm for mining the association rules. Other popular association rule mining algorithms are frequent pattern (FP) growth, Eclat, dynamic itemset counting (DIC) etc. Associative classification uses association rule mining in the rule discovery process to predict the class labels of the data. This technique has shown great promise over many other classification techniques. Associative classification also integrates the process of rule discovery and classification to build the classifier for the purpose of prediction. The main problem with the associative classification approach is the discovery of high quality association rules in a very large space of

  4. TENSOR MODELING BASED FOR AIRBORNE LiDAR DATA CLASSIFICATION

    OpenAIRE

    Li, N.; Liu, C; Pfeifer, N; Yin, J. F.; Liao, Z.Y.; Zhou, Y.

    2016-01-01

    Feature selection and description is a key factor in classification of Earth observation data. In this paper a classification method based on tensor decomposition is proposed. First, multiple features are extracted from raw LiDAR point cloud, and raster LiDAR images are derived by accumulating features or the “raw” data attributes. Then, the feature rasters of LiDAR data are stored as a tensor, and tensor decomposition is used to select component features. This tensor representation could kee...

  5. Interaction profile-based protein classification of death domain

    Directory of Open Access Journals (Sweden)

    Pio Frederic

    2004-06-01

    Full Text Available Abstract Background The increasing number of protein sequences and 3D structure obtained from genomic initiatives is leading many of us to focus on proteomics, and to dedicate our experimental and computational efforts on the creation and analysis of information derived from 3D structure. In particular, the high-throughput generation of protein-protein interaction data from a few organisms makes such an approach very important towards understanding the molecular recognition that make-up the entire protein-protein interaction network. Since the generation of sequences, and experimental protein-protein interactions increases faster than the 3D structure determination of protein complexes, there is tremendous interest in developing in silico methods that generate such structure for prediction and classification purposes. In this study we focused on classifying protein family members based on their protein-protein interaction distinctiveness. Structure-based classification of protein-protein interfaces has been described initially by Ponstingl et al. 1 and more recently by Valdar et al. 2 and Mintseris et al. 3, from complex structures that have been solved experimentally. However, little has been done on protein classification based on the prediction of protein-protein complexes obtained from homology modeling and docking simulation. Results We have developed an in silico classification system entitled HODOCO (Homology modeling, Docking and Classification Oracle), in which protein Residue Potential Interaction Profiles (RPIPS) are used to summarize protein-protein interaction characteristics. This system applied to a dataset of 64 proteins of the death domain superfamily was used to classify each member into its proper subfamily. Two classification methods were attempted, heuristic and support vector machine learning. Both methods were tested with a 5-fold cross-validation. The heuristic approach yielded a 61% average accuracy, while the machine

  6. Pulse frequency classification based on BP neural network

    Institute of Scientific and Technical Information of China (English)

    WANG Rui; WANG Xu; YANG Dan; FU Rong

    2006-01-01

    In Traditional Chinese Medicine (TCM), analysis of the pulse frequency is an important parameter in clinical disease diagnosis. This article identifies pulse types according to the eight major essentials of the pulse, performing pulse frequency classification with back-propagation neural networks (BPNN). The pulse frequency classes include slow pulse, moderate pulse, rapid pulse, etc. A system for identifying pulse frequency features is established through analysis of the feature parameters of the pulse frequency. Feature parameters such as period and frequency are extracted from the pulse signal acquired by the detecting system and compared with the standard feature values of each pulse type. The results show that the identification rate exceeds 92.5%.

  7. Classification of CT-brain slices based on local histograms

    Science.gov (United States)

    Avrunin, Oleg G.; Tymkovych, Maksym Y.; Pavlov, Sergii V.; Timchik, Sergii V.; Kisała, Piotr; Orakbaev, Yerbol

    2015-12-01

    Neurosurgical intervention is a very complicated process. Modern operating procedures are based on data such as CT, MRI, etc. Automated analysis of these data is an important task for researchers. Some modern methods of brain-slice segmentation use additional data to process these images. Classification can be used to obtain this information. To classify the CT images of the brain, we suggest using local histograms and features extracted from them. The paper shows the process of feature extraction and classification of CT slices of the brain. The process of feature extraction is specialized for axial cross-sections of the brain. The work can be applied to medical neurosurgical systems.
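
    As a rough illustration of what local histogram features can look like, the sketch below splits a gray-level image into square blocks and concatenates the normalized histogram of each block into one feature vector. The block size, the number of bins and the tiny 4x4 image are assumptions; the record does not state the parameters or the features actually derived from the histograms.

```python
def local_histograms(image, block=2, bins=4, max_value=256):
    """Concatenate normalized gray-level histograms of block x block tiles.

    `image` is a list of rows of integer gray values in [0, max_value).
    """
    rows, cols = len(image), len(image[0])
    features = []
    for r0 in range(0, rows, block):
        for c0 in range(0, cols, block):
            hist = [0] * bins
            count = 0
            for r in range(r0, min(r0 + block, rows)):
                for c in range(c0, min(c0 + block, cols)):
                    # map the gray value to one of `bins` equal-width bins
                    hist[image[r][c] * bins // max_value] += 1
                    count += 1
            features.extend(h / count for h in hist)
    return features

# Hypothetical 4x4 "slice": dark, bright, mid-gray and mixed quadrants.
image = [[0,   10,  200, 255],
         [5,   20,  210, 250],
         [100, 110, 0,   255],
         [120, 90,  255, 0]]
features = local_histograms(image)
print(features)
```

    The resulting vector (16 values here, 4 bins for each of 4 blocks) could then be passed to any standard classifier.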

  8. AN EFFICIENT CLASSIFICATION OF GENOMES BASED ON CLASSES AND SUBCLASSES

    Directory of Open Access Journals (Sweden)

    B.V. DHANDRA

    2010-08-01

    Full Text Available The grass family has been the subject of intense research over the past. Reliable and fast classification / sub-classification of large sequences which are rapidly gaining importance due to genome sequencing projects all over the world is contributing large amount of genome sequences to public gene bank. Hence sequence classification has gained importance for predicting the genome function, structure, evolutionary relationships and also gives the insight into the features associated with the biological role of the class. Thus, classification of functional genome is an important and challenging task to both computer scientists and biologists. The presence of motifs in grass genome chains predicts the functional behavior of the grass genome. The correlation between grass genome properties and their motifs is not always obvious since more than one motif may exist within a genome chain. Due to the complexity of this association most of the data mining algorithms are either non efficient or time consuming. Hence, in this paper we proposed an efficient method for main classes based on classes to reduce the time complexity for the classification of large sequences of grass genomes dataset. The proposed approaches classify the given dataset into classes with conserved threshold and again reclassify the class relaxed threshold into major classes. Experimental results indicate that the proposed method reduces the time complexity keeping classification accuracy level as that compared with general NNC algorithm.

  9. Online Network Traffic Classification Algorithm Based on RVM

    Directory of Open Access Journals (Sweden)

    Zhang Qunhui

    2013-06-01

    Full Text Available Compared with the Support Vector Machine (SVM), the Relevance Vector Machine (RVM) avoids the over-learning characteristic of the SVM, greatly reduces the amount of kernel-function computation, and avoids the SVM's defects: weak sparsity, a large amount of calculation, the requirement that the kernel function satisfy Mercer's condition, and empirically determined parameters. We therefore propose a new online traffic classification algorithm based on the RVM. Through analysis of the basic principles of the RVM and the steps of the modeling, we trained an RVM traffic classification model and used it together with "port number + DPI" to identify network traffic in real time. When the probability predicted by the RVM falls in the query interval, we jointly use the "port number" and "DPI". Finally, a detailed experimental validation shows that, compared with the SVM network traffic classification algorithm, this algorithm can achieve online network traffic classification, and the classification prediction probability is greatly improved.

  10. Classification of follicular cell-derived thyroid cancer by global RNA profiling

    DEFF Research Database (Denmark)

    Rossing, Maria

    2013-01-01

    classification will not only contribute to our biological insight but also improve clinical and pathological examinations, thus advancing thyroid tumour diagnosis and ultimately preventing superfluous surgery. This review evaluates the status of classification and biological insights gained from molecular...... classifiers that may differentiate malignant from benign thyroid nodules. Molecular classification models based on global RNA profiles from fine-needle aspirations are currently being evaluated; results are preliminary and lack validation in prospective clinical trials. There is no doubt that molecular...

  11. Torrent classification - Base of rational management of erosive regions

    Energy Technology Data Exchange (ETDEWEB)

    Gavrilovic, Zoran; Stefanovic, Milutin; Milovanovic, Irina; Cotric, Jelena; Milojevic, Mileta [Institute for the Development of Water Resources ' Jaroslav Cerni' , 11226 Beograd (Pinosava), Jaroslava Cernog 80 (Serbia)], E-mail: gavrilovicz@sbb.rs

    2008-11-01

    A complex methodology for torrents and erosion and the associated calculations was developed during the second half of the twentieth century in Serbia. It was the 'Erosion Potential Method'. One of the modules of that complex method was focused on torrent classification. The module enables the identification of hydrographic, climate and erosion characteristics. The method makes it possible for each torrent, regardless of its magnitude, to be simply and recognizably described by the 'Formula of torrentially'. The above torrent classification is the base on which a set of optimisation calculations is developed for the required scope of erosion-control works and measures, the application of which enables the management of significantly larger erosion and torrential regions compared to the previous period. This paper will present the procedure and the method of torrent classification.

  12. Torrent classification - Base of rational management of erosive regions

    International Nuclear Information System (INIS)

    A complex methodology for torrents and erosion and the associated calculations was developed during the second half of the twentieth century in Serbia. It was the 'Erosion Potential Method'. One of the modules of that complex method was focused on torrent classification. The module enables the identification of hydrographic, climate and erosion characteristics. The method makes it possible for each torrent, regardless of its magnitude, to be simply and recognizably described by the 'Formula of torrentially'. The above torrent classification is the base on which a set of optimisation calculations is developed for the required scope of erosion-control works and measures, the application of which enables the management of significantly larger erosion and torrential regions compared to the previous period. This paper will present the procedure and the method of torrent classification.

  13. Basal cytokeratins and their relationship to the cellular origin and functional classification of breast cancer

    OpenAIRE

    Gusterson, Barry A.; Ross, Douglas T.; Heath, Victoria J; Stein, Torsten

    2005-01-01

    Recent publications have classified breast cancers on the basis of expression of cytokeratin-5 and -17 at the RNA and protein levels, and demonstrated the importance of these markers in defining sporadic tumours with bad prognosis and an association with BRCA1-related breast cancers. These important observations using different technology platforms produce a new functional classification of breast carcinoma. However, it is important in developing hypotheses about the pathogenesis of this tumo...

  14. Fast rule-based bioactivity prediction using associative classification mining

    Directory of Open Access Journals (Sweden)

    Yu Pulan

    2012-11-01

    Full Text Available Abstract Relating chemical features to bioactivities is critical in molecular design and is used extensively in the lead discovery and optimization process. A variety of techniques from statistics, data mining and machine learning have been applied to this process. In this study, we utilize a collection of methods, called associative classification mining (ACM), which are popular in the data mining community, but so far have not been applied widely in cheminformatics. More specifically, classification based on predictive association rules (CPAR), classification based on multiple association rules (CMAR) and classification based on association rules (CBA) are employed on three datasets using various descriptor sets. Experimental evaluations on anti-tuberculosis (antiTB), mutagenicity and hERG (the human Ether-a-go-go-Related Gene) blocker datasets show that these three methods are computationally scalable and appropriate for high speed mining. Additionally, they provide comparable accuracy and efficiency to the commonly used Bayesian and support vector machines (SVM) methods, and produce highly interpretable models.

  15. High dimensional multiclass classification with applications to cancer diagnosis

    DEFF Research Database (Denmark)

    Vincent, Martin

    Probabilistic classifiers are introduced and it is shown that the only regular linear probabilistic classifier with convex risk is multinomial regression. Penalized empirical risk minimization is introduced and used to construct supervised learning methods for probabilistic classifiers. A sparse...... group lasso penalized approach to high dimensional multinomial classification is presented. On different real data examples it is found that this approach clearly outperforms multinomial lasso in terms of error rate and features included in the model. An efficient coordinate descent algorithm is...

  16. Comparison of Supervised Classification Methods for Protein Profiling in Cancer Diagnosis

    Directory of Open Access Journals (Sweden)

    Nadège Dossat

    2007-01-01

    Full Text Available A key challenge in clinical proteomics of cancer is the identification of biomarkers that could allow detection, diagnosis and prognosis of the diseases. Recent advances in mass spectrometry and proteomic instrumentations offer a unique chance to rapidly identify these markers. These advances pose considerable challenges, similar to those created by microarray-based investigation, for the discovery of pattern of markers from high-dimensional data, specific to each pathologic state (e.g. normal vs cancer). We propose a three-step strategy to select important markers from high-dimensional mass spectrometry data using surface enhanced laser desorption/ionization (SELDI) technology. The first two steps are the selection of the most discriminating biomarkers with a construction of different classifiers. Finally, we compare and validate their performance and robustness using different supervised classification methods such as Support Vector Machine, Linear Discriminant Analysis, Quadratic Discriminant Analysis, Neural Networks, Classification Trees and Boosting Trees. We show that the proposed method is suitable for analysing high-throughput proteomics data and that the combination of logistic regression and Linear Discriminant Analysis outperforms other methods tested.

  17. Clinical classification of cancer cachexia: phenotypic correlates in human skeletal muscle.

    Directory of Open Access Journals (Sweden)

    Neil Johns

    Full Text Available BACKGROUND: Cachexia affects the majority of patients with advanced cancer and is associated with a reduction in treatment tolerance, response to therapy, and duration of survival. One impediment towards the effective treatment of cachexia is the lack of a validated classification system. METHODS: 41 patients with resectable upper gastrointestinal (GI) or pancreatic cancer underwent characterisation for cachexia based on weight-loss (WL) and/or low muscularity (LM). Four diagnostic criteria were used: >5%WL, >10%WL, LM, and LM+>2%WL. All patients underwent biopsy of the rectus muscle. Analysis included immunohistochemistry for fibre size and type, protein and nucleic acid concentration, Western blots for markers of autophagy, SMAD signalling, and inflammation. FINDINGS: Compared with non-cachectic cancer patients, in patients with LM or LM+>2%WL, mean muscle fibre diameter was reduced by about 25% (p = 0.02 and p = 0.001 respectively). No significant difference in fibre diameter was observed if patients had WL alone. Regardless of classification, there was no difference in fibre number or proportion of fibre type across all myosin heavy chain isoforms. Mean muscle protein content was reduced and the ratio of RNA/DNA decreased in patients with either >5%WL or LM+>2%WL. Compared with non-cachectic patients, SMAD3 protein levels were increased in patients with >5%WL (p = 0.022), and in patients with >10%WL, beclin (p = 0.05) and ATG5 (p = 0.01) protein levels were increased. There were no differences in phospho-NFkB or phospho-STAT3 levels across any of the groups. CONCLUSION: Muscle fibre size, biochemical composition and pathway phenotype can vary according to whether the diagnostic criteria for cachexia are based on weight loss alone, a measure of low muscularity alone or a combination of the two. For intervention trials where the primary end-point is a change in muscle mass or function, use of combined diagnostic criteria may allow identification of a more

  18. Breast Cancer Survival Defined by the ER/PR/HER2 Subtypes and a Surrogate Classification according to Tumor Grade and Immunohistochemical Biomarkers

    Directory of Open Access Journals (Sweden)

    Carol A. Parise

    2014-01-01

    Full Text Available Introduction. ER, PR, and HER2 are routinely available in breast cancer specimens. The purpose of this study is to contrast breast cancer-specific survival for the eight ER/PR/HER2 subtypes with survival of an immunohistochemical surrogate for the molecular subtype based on the ER/PR/HER2 subtypes and tumor grade. Methods. We identified 123,780 cases of stages 1–3 primary female invasive breast cancer from California Cancer Registry. The surrogate classification was derived using ER/PR/HER2 and tumor grade. Kaplan-Meier survival analysis and Cox proportional hazards modeling were used to assess differences in survival and risk of mortality for the ER/PR/HER2 subtypes and surrogate classification within each stage. Results. The luminal B/HER2− surrogate classification had a higher risk of mortality than the luminal B/HER2+ for all stages of disease. There was no difference in risk of mortality between the ER+/PR+/HER2− and ER+/PR+/HER2+ in stage 3. With one exception in stage 3, the ER-negative subtypes all had an increased risk of mortality when compared with the ER-positive subtypes. Conclusions. Assessment of survival using ER/PR/HER2 illustrates the heterogeneity of HER2+ subtypes. The surrogate classification provides clear separation in survival and adjusted mortality but underestimates the wide variability within the subtypes that make up the classification.

  19. Classification of Regional Ionospheric Disturbances Based on Support Vector Machines

    Science.gov (United States)

    Begüm Terzi, Merve; Arikan, Feza; Arikan, Orhan; Karatay, Secil

    2016-07-01

    The ionosphere is an anisotropic, inhomogeneous, time varying and spatio-temporally dispersive medium whose parameters can almost always be estimated only by using indirect measurements. Geomagnetic, gravitational, solar or seismic activities cause variations of the ionosphere at various spatial and temporal scales. This complex spatio-temporal variability is challenging to identify due to the extensive scales in period, duration, amplitude and frequency of disturbances. Since geomagnetic and solar indices such as Disturbance storm time (Dst), F10.7 solar flux, Sun Spot Number (SSN), Auroral Electrojet (AE), Kp and W-index provide information about variability on a global scale, identification and classification of regional disturbances poses a challenge. The main aim of this study is to classify the regional effects of global geomagnetic storms and classify them according to their risk levels. For this purpose, Total Electron Content (TEC) estimated from GPS receivers, which is one of the major parameters of the ionosphere, will be used to model the regional and local variability that differs from global activity along with solar and geomagnetic indices. In this work, for the automated classification of the regional disturbances, a classification technique based on the Support Vector Machine (SVM), a robust machine learning technique that has found widespread use, is proposed. SVM is a supervised learning model used for classification, with associated learning algorithms that analyze the data and recognize patterns. In addition to performing linear classification, SVM can efficiently perform nonlinear classification by embedding data into higher dimensional feature spaces. Performance of the developed classification technique is demonstrated for the midlatitude ionosphere over Anatolia using TEC estimates generated from the GPS data provided by the Turkish National Permanent GPS Network (TNPGN-Active) for the solar maximum year of 2011. As a result of implementing the developed classification

  20. Upper limit for context based crop classification

    DEFF Research Database (Denmark)

    Midtiby, Henrik; Åstrand, Björn; Jørgensen, Rasmus Nyholm;

    2012-01-01

    Mechanical in-row weed control of crops like sugarbeet requires precise knowledge of where individual crop plants are located. If crop plants are placed in a known pattern, information about plant locations can be used to discriminate between crop and weed plants. The success rate of such a classifier...... depends on the weed pressure, the position uncertainty of the crop plants and the crop upgrowth percentage. The first two measures can be combined to a normalized weed pressure, \lambda. Given the normalized weed pressure an upper bound on the positive predictive value is shown to be 1/(1+\lambda). If the...... weed pressure is \rho = 400/m^2 and the crop position uncertainty is \sigma_x = 0.0148m along the row and \sigma_y = 0.0108m perpendicular to the row, the normalized weed pressure is \lambda ~ 0.40; the upper bound on the positive predictive value is then 0.71. This means that when a position based...
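The arithmetic in this abstract can be checked with a short sketch. The bound 1/(1+\lambda) is quoted from the abstract; the way \rho, \sigma_x and \sigma_y are combined into \lambda is not given there, so the formula \lambda = 2\pi\rho\sigma_x\sigma_y below is an assumption, chosen because it reproduces the quoted value of about 0.40. The function names are hypothetical.

```python
import math

def normalized_weed_pressure(rho, sigma_x, sigma_y):
    # ASSUMPTION: lambda = 2*pi*rho*sigma_x*sigma_y. The abstract only says the
    # weed pressure and position uncertainty "can be combined"; this form is a
    # guess that happens to match the quoted lambda ~ 0.40.
    return 2.0 * math.pi * rho * sigma_x * sigma_y

def ppv_upper_bound(lam):
    # Upper bound on the positive predictive value as stated in the abstract.
    return 1.0 / (1.0 + lam)

# Values quoted in the abstract: rho in plants per m^2, sigmas in metres.
lam = normalized_weed_pressure(rho=400.0, sigma_x=0.0148, sigma_y=0.0108)
bound = ppv_upper_bound(lam)
print(f"lambda = {lam:.2f}, PPV upper bound = {bound:.2f}")
```

With the quoted inputs this gives values that round to 0.40 and 0.71, in line with the abstract.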

  1. Object-Based Classification and Change Detection of Hokkaido, Japan

    Science.gov (United States)

    Park, J. G.; Harada, I.; Kwak, Y.

    2016-06-01

    Topography and geology are factors to characterize the distribution of natural vegetation. Topographic contour is particularly influential on the living conditions of plants such as soil moisture, sunlight, and windiness. Vegetation associations having similar characteristics are present in locations having similar topographic conditions unless natural disturbances such as landslides and forest fires or artificial disturbances such as deforestation and man-made plantation bring about changes in such conditions. We developed a vegetation map of Japan using an object-based segmentation approach with topographic information (elevation, slope, slope direction) that is closely related to the distribution of vegetation. The results found that the object-based classification is more effective to produce a vegetation map than the pixel-based classification.

  2. Metagenome fragment classification based on multiple motif-occurrence profiles

    Directory of Open Access Journals (Sweden)

    Naoki Matsushita

    2014-09-01

    Full Text Available A vast amount of metagenomic data has been obtained by extracting multiple genomes simultaneously from microbial communities, including genomes from uncultivable microbes. By analyzing these metagenomic data, novel microbes are discovered and new microbial functions are elucidated. The first step in analyzing these data is sequenced-read classification into reference genomes from which each read can be derived. The Naïve Bayes Classifier is a method for this classification. To identify the derivation of the reads, this method calculates a score based on the occurrence of a DNA sequence motif in each reference genome. However, large differences in the sizes of the reference genomes can bias the scoring of the reads. This bias might cause erroneous classification and decrease the classification accuracy. To address this issue, we have updated the Naïve Bayes Classifier method using multiple sets of occurrence profiles for each reference genome by normalizing the genome sizes, dividing each genome sequence into a set of subsequences of similar length and generating profiles for each subsequence. This multiple profile strategy improves the accuracy of the results generated by the Naïve Bayes Classifier method for simulated and Sargasso Sea datasets.
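A minimal sketch of the multiple-profile idea follows, under stated assumptions: the motif length `K`, add-one smoothing, the chunking rule and the choice to assign a read to the genome of its best-scoring subsequence are all illustrative guesses, not details taken from the paper. All function names are hypothetical.

```python
import math
from collections import Counter

K = 3  # motif (k-mer) length; an illustrative choice, not from the paper

def kmers(seq, k=K):
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def split_genome(seq, chunk_len):
    """Divide a genome into subsequences of similar length."""
    n_chunks = max(1, round(len(seq) / chunk_len))
    size = math.ceil(len(seq) / n_chunks)
    return [seq[i:i + size] for i in range(0, len(seq), size)]

def profile(seq, k=K):
    """Motif-occurrence profile: smoothed log-probability of each k-mer."""
    counts = Counter(kmers(seq, k))
    total = sum(counts.values()) + 4 ** k  # add-one smoothing over all DNA k-mers
    logp = {m: math.log((c + 1) / total) for m, c in counts.items()}
    return logp, math.log(1 / total)  # second item: log-prob of an unseen k-mer

def score(read, prof):
    """Naive Bayes log-likelihood of a read under one profile."""
    logp, unseen = prof
    return sum(logp.get(m, unseen) for m in kmers(read))

def classify(read, genomes, chunk_len):
    """ASSUMPTION: assign the read to the genome owning the best-scoring chunk."""
    best_name, best_score = None, -math.inf
    for name, seq in genomes.items():
        for chunk in split_genome(seq, chunk_len):
            s = score(read, profile(chunk))
            if s > best_score:
                best_name, best_score = name, s
    return best_name

# Toy genomes of very different sizes (synthetic data).
genomes = {"large": "AT" * 100, "small": "GC" * 20}
print(classify("ATATATAT", genomes, chunk_len=40))
print(classify("GCGCGCGC", genomes, chunk_len=40))
```

Because every profile is built from a subsequence of roughly `chunk_len` bases, the smoothing term weighs equally on large and small genomes, which is the size normalization the abstract describes.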

  3. Comparison Of Power Quality Disturbances Classification Based On Neural Network

    Directory of Open Access Journals (Sweden)

    Nway Nway Kyaw Win

    2015-07-01

    Full Text Available Power quality disturbances (PQDs) cause serious problems for the reliability, safety and economy of the power system network. To improve electric power quality, PQDs must be detected and classified by type of transient fault. A software methodology is presented that combines the wavelet transform with a multiresolution analysis (MRA) algorithm and two feed-forward neural networks, a probabilistic neural network (PNN) and a multilayer feed-forward (MLFF) neural network, for automatic classification of eight types of PQ signals: flicker, harmonics, sag, swell, impulse, fluctuation, notch and oscillatory. The Db4 wavelet is chosen in this system to calculate the detail energy distributions used as input features for classification because it performs well in detecting and localizing various types of PQ disturbances. The classifiers identify the disturbance type according to the energy distribution. The results show that the PNN can analyze different power disturbance types efficiently and has better classification accuracy than the MLFF.

  4. An AERONET-based aerosol classification using the Mahalanobis distance

    Science.gov (United States)

    Hamill, Patrick; Giordano, Marco; Ward, Carolyne; Giles, David; Holben, Brent

    2016-09-01

    We present an aerosol classification based on AERONET aerosol data from 1993 to 2012. We used the AERONET Level 2.0 almucantar aerosol retrieval products to define several reference aerosol clusters which are characteristic of the following general aerosol types: Urban-Industrial, Biomass Burning, Mixed Aerosol, Dust, and Maritime. The classification of a particular aerosol observation as one of these aerosol types is determined by its five-dimensional Mahalanobis distance to each reference cluster. We have calculated the fractional aerosol type distribution at 190 AERONET sites, as well as the monthly variation in aerosol type at those locations. The results are presented on a global map and individually in the supplementary material. Our aerosol typing is based on recognizing that different geographic regions exhibit characteristic aerosol types. To generate reference clusters we only keep data points that lie within a Mahalanobis distance of 2 from the centroid. Our aerosol characterization is based on the AERONET retrieved quantities, therefore it does not include low optical depth values. The analysis is based on "point sources" (the AERONET sites) rather than globally distributed values. The classifications obtained will be useful in interpreting aerosol retrievals from satellite borne instruments.
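The classification rule described here (assign an observation to the reference cluster with the smallest Mahalanobis distance, after trimming each cluster to points within a distance of 2 of its centroid) can be sketched in plain Python. This is an illustration only: the paper works in five dimensions with AERONET retrievals, while the example below uses made-up two-dimensional numbers, and all function names are hypothetical.

```python
import math

def mean_vector(points):
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def covariance(points, mu):
    n, d = len(points), len(mu)
    return [[sum((p[i] - mu[i]) * (p[j] - mu[j]) for p in points) / (n - 1)
             for j in range(d)] for i in range(d)]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def mahalanobis(x, mu, cov):
    """sqrt((x - mu)^T cov^{-1} (x - mu)), without forming the inverse."""
    diff = [xi - mi for xi, mi in zip(x, mu)]
    y = solve(cov, diff)
    return math.sqrt(sum(d * yi for d, yi in zip(diff, y)))

def trim_cluster(points, max_dist=2.0):
    """Keep only points within max_dist of the cluster centroid."""
    mu = mean_vector(points)
    cov = covariance(points, mu)
    return [p for p in points if mahalanobis(p, mu, cov) <= max_dist]

def classify(x, clusters):
    """clusters maps a type name to (mu, cov); return the closest type."""
    return min(clusters, key=lambda name: mahalanobis(x, *clusters[name]))

# Synthetic 2-D reference clusters; names reuse two aerosol types from the abstract.
identity = [[1.0, 0.0], [0.0, 1.0]]
clusters = {"Dust": ([0.0, 0.0], identity), "Maritime": ([10.0, 10.0], identity)}
print(classify([1.0, 1.0], clusters))
```

Solving the linear system instead of inverting the covariance matrix keeps the sketch short and avoids an explicit inverse; a numerical library would normally be used for the five-dimensional case.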

  5. Toward automated classification of consumers' cancer-related questions with a new taxonomy of expected answer types.

    Science.gov (United States)

    McRoy, Susan; Jones, Sean; Kurmally, Adam

    2016-09-01

    This article examines methods for automated question classification applied to cancer-related questions that people have asked on the web. This work is part of a broader effort to provide automated question answering for health education. We created a new corpus of consumer-health questions related to cancer and a new taxonomy for those questions. We then compared the effectiveness of different statistical methods for developing classifiers, including weighted classification and resampling. Basic methods for building classifiers were limited by the high variability in the natural distribution of questions and typical refinement approaches of feature selection and merging categories achieved only small improvements to classifier accuracy. Best performance was achieved using weighted classification and resampling methods, the latter yielding an accuracy of F1 = 0.963. Thus, it would appear that statistical classifiers can be trained on natural data, but only if natural distributions of classes are smoothed. Such classifiers would be useful for automated question answering, for enriching web-based content, or assisting clinical professionals to answer questions. PMID:25759063

  6. Update on epidemiology, classification, and management of thyroid cancer

    Directory of Open Access Journals (Sweden)

    Heitham Gheriani

    2006-06-01

    Full Text Available Thyroid cancer represents approximately 0.5–1% of all human malignancy [1]. In the UK the incidence of thyroid cancer is 2–3 per 100,000 population [2]. In geographical areas of low iodine intake and in areas exposed to nuclear disasters the incidence of thyroid cancer is higher. Benign thyroid conditions are much more common. In the UK approximately 8% of the population have nodular thyroid disease [2]. Nodular thyroid disease increases with age and is also more common in females and in geographical areas of low iodine intake. Primary thyroid malignancy can be broadly divided into 2 groups. The first group, which generally has a much better prognosis, is the well-differentiated thyroid carcinomas, which include papillary carcinoma, follicular carcinoma and Hürthle cell tumours. The second group includes the poorly differentiated thyroid carcinomas such as medullary thyroid carcinoma and anaplastic thyroid carcinoma. Other rare tumours such as sarcomas, lymphomas and the extremely rare primary squamous cell carcinoma of the thyroid should be included in the second group. Secondary or metastatic thyroid cancer can be from breast, lung, colon and kidney malignancies.

  7. Automatic Detection of Cervical Cancer Cells by a Two-Level Cascade Classification System

    Science.gov (United States)

    Su, Jie; Xu, Xuan; He, Yongjun; Song, Jinming

    2016-01-01

    We proposed a method for automatic detection of cervical cancer cells in images captured from thin liquid based cytology slides. We selected 20,000 cells in images derived from 120 different thin liquid based cytology slides, which include 5000 epithelial cells (normal 2500, abnormal 2500), lymphoid cells, neutrophils, and junk cells. We first proposed 28 features, including 20 morphologic features and 8 texture features, based on the characteristics of each cell type. We then used a two-level cascade integration system of two classifiers to classify the cervical cells into normal and abnormal epithelial cells. The results showed that the recognition rates for abnormal cervical epithelial cells were 92.7% and 93.2%, respectively, when C4.5 classifier or LR (LR: logical regression) classifier was used individually; while the recognition rate was significantly higher (95.642%) when our two-level cascade integrated classifier system was used. The false negative rate and false positive rate (both 1.44%) of the proposed automatic two-level cascade classification system are also much lower than those of traditional Pap smear review. PMID:27298758

  8. Active Dictionary Learning in Sparse Representation Based Classification

    OpenAIRE

    Xu, Jin; He, Haibo; Man, Hong

    2014-01-01

    Sparse representation, which uses dictionary atoms to reconstruct input vectors, has been studied intensively in recent years. A proper dictionary is a key for the success of sparse representation. In this paper, an active dictionary learning (ADL) method is introduced, in which classification error and reconstruction error are considered as the active learning criteria in selection of the atoms for dictionary construction. The learned dictionaries are calculated in sparse representation based...

  9. Understanding Acupuncture Based on ZHENG Classification from System Perspective

    OpenAIRE

    Junwei Fang; Ningning Zheng; Yang Wang; Huijuan Cao; Shujun Sun; Jianye Dai; Qianhua Li; Yongyu Zhang

    2013-01-01

    Acupuncture is an effective therapy that originated in ancient China; studying it on the basis of ZHENG classification is a systematic way to understand its complexity. The system perspective helps to understand the essence of phenomena and, with the coming of the systems biology era, broader technology platforms such as omics technologies were established for the objective study of traditional Chinese medicine (TCM). Omics technologies could dynamically determine molecular c...

  10. BCI Signal Classification using a Riemannian-based kernel

    OpenAIRE

    Barachant, Alexandre; Bonnet, Stéphane; Congedo, Marco; Jutten, Christian

    2012-01-01

    The use of spatial covariance matrix as feature is investigated for motor imagery EEG-based classification. A new kernel is derived by establishing a connection with the Riemannian geometry of symmetric positive definite matrices. Different kernels are tested, in combination with support vector machines, on a past BCI competition dataset. We demonstrate that this new approach outperforms significantly state of the art results without the need for spatial filtering.

  11. DATA MINING BASED TECHNIQUE FOR IDS ALERT CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    Hany Nashat Gabra

    2015-06-01

    Full Text Available Intrusion detection systems (IDSs) have become a widely used measure for security systems. The main problem for such systems is the irrelevant alerts. We propose a data mining based method for classification to distinguish serious and irrelevant alerts with a performance of 99.9%, which is better in comparison with the other recent data mining methods that achieved 97%. A ranked alerts list is also created according to the alert’s importance to minimize human interventions.

  12. DATA MINING BASED TECHNIQUE FOR IDS ALERT CLASSIFICATION

    OpenAIRE

    Hany Nashat Gabra; Bahaa-Eldin, Ayman M.; Hoda Korashy Mohammed

    2015-01-01

    Intrusion detection systems (IDSs) have become a widely used measure for security systems. The main problem for such systems is the irrelevant alerts. We propose a data mining based method for classification to distinguish serious and irrelevant alerts with a performance of 99.9%, which is better in comparison with the other recent data mining methods that achieved 97%. A ranked alerts list is also created according to the alert’s importance to minimize human interventions.

  13. Data Mining Based Technique for IDS Alerts Classification

    OpenAIRE

    Gabra, Hany N.; Bahaa-Eldin, Ayman M.; Mohamed, Hoda K.

    2012-01-01

    Intrusion detection systems (IDSs) have become a widely used measure for security systems. The main problem with the results of those systems is the irrelevant alerts they contain. We propose a data mining based method for classification to distinguish serious alerts from irrelevant ones with a performance of 99.9%, which is better in comparison with the other recent data mining methods that have reached a performance of 97%. A ranked alerts list is also created according to alerts' importance to...

  14. Classification of objects in images based on various object representations

    OpenAIRE

    Cichocki, Radoslaw

    2006-01-01

    Object recognition is a heavily researched domain that employs methods derived from mathematics, physics and biology. This thesis combines approaches to object classification that are based on two features: color and shape. Color is represented by color histograms and shape by skeletal graphs. Four hybrids are proposed which combine those approaches in different manners, and the hybrids are then tested to find out which of them gives the best results.

  15. A Cluster Based Approach for Classification of Web Results

    OpenAIRE

    Apeksha Khabia; M. B. Chandak

    2014-01-01

    Nowadays a significant amount of information from the web is present in the form of text, e.g., reviews, forum postings, blogs, news articles, email messages, web pages. It becomes difficult to classify documents in predefined categories as the number of documents grows. Clustering is the classification of data into clusters, so that the data in each cluster share some common trait – often vicinity according to some defined measure. Underlying distribution of data set can somewhat be depicted base...

  16. Expected energy-based restricted Boltzmann machine for classification.

    Science.gov (United States)

    Elfwing, S; Uchibe, E; Doya, K

    2015-04-01

    In classification tasks, restricted Boltzmann machines (RBMs) have predominantly been used in the first stage, either as feature extractors or to provide initialization of neural networks. In this study, we propose a discriminative learning approach to provide a self-contained RBM method for classification, inspired by free-energy based function approximation (FE-RBM), originally proposed for reinforcement learning. For classification, the FE-RBM method computes the output for an input vector and a class vector by the negative free energy of an RBM. Learning is achieved by stochastic gradient-descent using a mean-squared error training objective. In an earlier study, we demonstrated that the performance and the robustness of FE-RBM function approximation can be improved by scaling the free energy by a constant that is related to the size of network. In this study, we propose that the learning performance of RBM function approximation can be further improved by computing the output by the negative expected energy (EE-RBM), instead of the negative free energy. To create a deep learning architecture, we stack several RBMs on top of each other. We also connect the class nodes to all hidden layers to try to improve the performance even further. We validate the classification performance of EE-RBM using the MNIST data set and the NORB data set, achieving competitive performance compared with other classifiers such as standard neural networks, deep belief networks, classification RBMs, and support vector machines. The purpose of using the NORB data set is to demonstrate that EE-RBM with binary input nodes can achieve high performance in the continuous input domain. PMID:25318375
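The difference between the two outputs discussed in this abstract can be written down with the standard formulas for a binary RBM: the negative free energy sums a softplus of each hidden node's input, while the negative expected energy sums the input weighted by its sigmoid. The sketch below is an assumption-laden illustration, not the authors' implementation: it omits the scaling constant, the stacked architecture and the training step, uses toy weights, and all names are hypothetical.

```python
import math

def softplus(z):
    # Numerically stable log(1 + exp(z)).
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def hidden_inputs(v, W, c):
    """Total input z_j = c_j + sum_i W[i][j] * v[i] to each hidden node."""
    return [c[j] + sum(W[i][j] * v[i] for i in range(len(v))) for j in range(len(c))]

def neg_free_energy(v, W, b, c):
    z = hidden_inputs(v, W, c)
    return sum(bi * vi for bi, vi in zip(b, v)) + sum(softplus(zj) for zj in z)

def neg_expected_energy(v, W, b, c):
    z = hidden_inputs(v, W, c)
    return sum(bi * vi for bi, vi in zip(b, v)) + sum(zj * sigmoid(zj) for zj in z)

def predict(x, n_classes, W, b, c, output=neg_expected_energy):
    """Score each one-hot class vector appended to x; return the best class index."""
    scores = []
    for k in range(n_classes):
        one_hot = [1.0 if j == k else 0.0 for j in range(n_classes)]
        scores.append(output(x + one_hot, W, b, c))
    return max(range(n_classes), key=scores.__getitem__)

# Toy model: 1 input node + 2 class nodes (3 visible), 1 hidden node.
W = [[0.0], [2.0], [-2.0]]
b = [0.0, 0.0, 0.0]
c = [0.0]
print(predict([1.0], 2, W, b, c))
```

The negative free energy is never smaller than the negative expected energy, since the two differ by the entropy of the hidden units; the abstract's claim is about learning performance, which this sketch does not test.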

  17. Design and implementation based on the classification protection vulnerability scanning system

    International Nuclear Information System (INIS)

    With the application and spread of classification protection, network security vulnerability scanning should consider efficiency and function expansion. It proposes a kind of system vulnerability from classification protection, and elaborates the design and implementation of a vulnerability scanning system based on vulnerability classification plug-in technology and oriented to classification protection. According to the experiment, the application of classification protection has good adaptability and scalability with the system, and it also improves the efficiency of scanning. (authors)

  18. A new gammagraphic and functional-based classification for hyperthyroidism

    International Nuclear Information System (INIS)

    The absence of a universal classification for hyperthyroidism (HT) gives rise to inadequate interpretation of series and trials and hinders decision making. We offer a tentative classification based on gammagraphic and functional findings. Clinical records of patients who underwent thyroidectomy in our Department from 1967 to 1997 were reviewed. Those with functional measurements of hyperthyroidism were considered. All were managed according to the same preestablished guidelines. HT was the surgical indication in 694 (27.1%) of the 2559 thyroidectomies. Based on gammagraphic studies, we classified HT into: parenchymatous increased-uptake, which could be diffuse, diffuse with cold nodules or diffuse with at least one nodule, and nodular increased-uptake (Autonomous Functioning Thyroid Nodes, AFTN), divided into solitary AFTN or toxic adenoma and multiple AFTN or toxic multi-nodular goiter. This gammagraphic-based classification is useful and has high sensitivity for detecting these nodules and assessing their activity, allowing us to make therapeutic decisions and, in some cases, to choose the surgical technique. (authors)

  19. Impact of full field digital mammography on the classification and mammographic characteristics of interval breast cancers

    Energy Technology Data Exchange (ETDEWEB)

    Knox, Mark; O’Brien, Angela; Szabó, Endre; Smith, Clare S.; Fenlon, Helen M.; McNicholas, Michelle M.; Flanagan, Fidelma L.

    2015-06-15

    Highlights: • Digital mammography has changed the presentation of interval breast cancer. • Fewer interval breast cancers are associated with microcalcifications following FFDM. • Interval breast cancer audit remains a key feature of any breast screening program. - Abstract: Objective: Full field digital mammography (FFDM) is increasingly replacing screen film mammography (SFM) in breast screening programs. Interval breast cancers are an issue in all screening programs and the purpose of our study is to assess the impact of FFDM on the classification of interval breast cancers at independent blind review and to compare the mammographic features of interval cancers at FFDM and SFM. Materials and methods: This study included 138 cases of interval breast cancer, 76 following an FFDM screening examination and 62 following screening with SFM. The prior screening mammogram was assessed by each of five consultant breast radiologists who were blinded to the site of subsequent cancer. Subsequent review of the diagnostic mammogram was performed and cases were classified as missed, minimal signs, occult or true interval. Mammographic features of the interval cancer at diagnosis and any abnormality identified on the prior screening mammogram were recorded. Results: The percentages of cancers classified as missed at FFDM and SFM did not differ significantly, 10.5% (8 of 76) at FFDM and 8.1% (5 of 62) at SFM (p = .77). There were significantly fewer interval cancers presenting as microcalcifications (alone or in association with another abnormality) following screening with FFDM, 16% (12 of 76), than following an SFM examination, 32% (20 of 62) (p = .02). Conclusion: Interval breast cancers continue to pose a problem at FFDM. The switch to FFDM has changed the mammographic presentation of interval breast cancer, with fewer interval cancers presenting in association with microcalcifications.
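The proportions reported in the Results can be recomputed from the counts, and a significance test can be run on the two 2x2 tables. The abstract does not say which test produced the quoted p-values, so the two-sided Fisher exact test below is an assumption and its p-values need not match the published ones exactly. The function name is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables at least as unlikely as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Counts quoted in the abstract: (events, total) per screening modality.
missed = {"FFDM": (8, 76), "SFM": (5, 62)}
microcalc = {"FFDM": (12, 76), "SFM": (20, 62)}

for label, table in (("missed", missed), ("microcalcifications", microcalc)):
    (a, n1), (c, n2) = table["FFDM"], table["SFM"]
    p = fisher_exact_two_sided(a, n1 - a, c, n2 - c)
    print(f"{label}: FFDM {100 * a / n1:.1f}%, SFM {100 * c / n2:.1f}%, p = {p:.3f}")

p_missed = fisher_exact_two_sided(8, 68, 5, 57)
p_micro = fisher_exact_two_sided(12, 64, 20, 42)
```

The percentages follow directly from the counts; only the choice of test is a guess.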

  20. Impact of full field digital mammography on the classification and mammographic characteristics of interval breast cancers

    International Nuclear Information System (INIS)

    Highlights: • Digital mammography has changed the presentation of interval breast cancer. • Fewer interval breast cancers are associated with microcalcifications following FFDM. • Interval breast cancer audit remains a key feature of any breast screening program. - Abstract: Objective: Full field digital mammography (FFDM) is increasingly replacing screen film mammography (SFM) in breast screening programs. Interval breast cancers are an issue in all screening programs and the purpose of our study is to assess the impact of FFDM on the classification of interval breast cancers at independent blind review and to compare the mammographic features of interval cancers at FFDM and SFM. Materials and methods: This study included 138 cases of interval breast cancer, 76 following an FFDM screening examination and 62 following screening with SFM. The prior screening mammogram was assessed by each of five consultant breast radiologists who were blinded to the site of subsequent cancer. Subsequent review of the diagnostic mammogram was performed and cases were classified as missed, minimal signs, occult or true interval. Mammographic features of the interval cancer at diagnosis and any abnormality identified on the prior screening mammogram were recorded. Results: The percentages of cancers classified as missed at FFDM and SFM did not differ significantly, 10.5% (8 of 76) at FFDM and 8.1% (5 of 62) at SFM (p = .77). There were significantly fewer interval cancers presenting as microcalcifications (alone or in association with another abnormality) following screening with FFDM, 16% (12 of 76), than following an SFM examination, 32% (20 of 62) (p = .02). Conclusion: Interval breast cancers continue to pose a problem at FFDM. The switch to FFDM has changed the mammographic presentation of interval breast cancer, with fewer interval cancers presenting in association with microcalcifications

  1. Classification of pulmonary airway disease based on mucosal color analysis

    Science.gov (United States)

    Suter, Melissa; Reinhardt, Joseph M.; Riker, David; Ferguson, John Scott; McLennan, Geoffrey

    2005-04-01

    Airway mucosal color changes occur in response to the development of bronchial diseases including lung cancer, cystic fibrosis, chronic bronchitis, emphysema and asthma. These associated changes are often visualized using standard macro-optical bronchoscopy techniques. A limitation to this form of assessment is that the subtle changes that indicate early stages in disease development may often be missed as a result of this highly subjective assessment, especially in inexperienced bronchoscopists. Tri-chromatic CCD chip bronchoscopes allow for digital color analysis of the pulmonary airway mucosa. This form of analysis may facilitate a greater understanding of airway disease response. A 2-step image classification approach is employed: the first step is to distinguish between healthy and diseased bronchoscope images and the second is to classify the detected abnormal images into 1 of 4 possible disease categories. A database of airway mucosal color constructed from healthy human volunteers is used as a standard against which statistical comparisons are made from mucosa with known apparent airway abnormalities. This approach demonstrates great promise as an effective detection and diagnosis tool to highlight potentially abnormal airway mucosa identifying a region possibly suited to further analysis via airway forceps biopsy, or newly developed micro-optical biopsy strategies. Following the identification of abnormal airway images a neural network is used to distinguish between the different disease classes. We have shown that classification of potentially diseased airway mucosa is possible through comparative color analysis of digital bronchoscope images. The combination of the two strategies appears to increase the classification accuracy in addition to greatly decreasing the computational time.

  2. Full Intelligent Cancer Classification of Thermal Breast Images to Assist Physician in Clinical Diagnostic Applications.

    Science.gov (United States)

    Lashkari, AmirEhsan; Pak, Fatemeh; Firouzmand, Mohammad

    2016-01-01

    Breast cancer is the most common type of cancer among women. The key to treating breast cancer is early detection, because according to many pathological studies more than 75%–80% of all abnormalities are still benign at primary stages; in recent years, therefore, many studies and extensive research have been devoted to early detection of breast cancer with higher precision and accuracy. Infra-red breast thermography is an imaging technique based on recording temperature distribution patterns of breast tissue. Compared with breast mammography, thermography is a more suitable technique because it is noninvasive, non-contact, passive and free of ionizing radiation. In this paper, a fully automatic, high-accuracy technique for classification of suspicious areas in thermogram images, with the aim of assisting physicians in early detection of breast cancer, is presented. The proposed algorithm consists of four main steps: pre-processing and segmentation, feature extraction, feature selection and classification. In the first step, the region of interest (ROI) is determined fully automatically and the quality of the image is improved. Using thresholding and edge detection techniques, the right and left breasts are separated from each other. Then the relative suspected areas are segmented and the image matrix is normalized to account for the uniqueness of each person's body temperature. At the feature extraction stage, 23 features, including statistical, morphological, frequency domain, histogram and Gray Level Co-occurrence Matrix (GLCM) based features, are extracted from the segmented right and left breasts obtained from step 1. To achieve the best features, feature selection methods such as minimum Redundancy and Maximum Relevance (mRMR), Sequential Forward Selection (SFS), Sequential Backward Selection (SBS), Sequential Floating Forward Selection (SFFS), Sequential Floating Backward Selection (SFBS) and Genetic Algorithm (GA) are used at step 3. Finally to classify and TH labeling procedures

  3. Linear classifier and textural analysis of optical scattering images for tumor classification during breast cancer extraction

    Science.gov (United States)

    Eguizabal, Alma; Laughney, Ashley M.; Garcia Allende, Pilar Beatriz; Krishnaswamy, Venkataramanan; Wells, Wendy A.; Paulsen, Keith D.; Pogue, Brian W.; López-Higuera, José M.; Conde, Olga M.

    2013-02-01

    Texture analysis of light scattering in tissue is proposed to obtain diagnostic information from breast cancer specimens. Light scattering measurements are minimally invasive and allow the estimation of tissue morphology to guide the surgeon in resection surgeries. The usability of scatter signatures acquired with a micro-sampling reflectance spectral imaging system was improved by using an empirical approximation to Mie theory to estimate the scattering power on a per-pixel basis. Co-occurrence analysis is then applied to the scattering power images to extract the textural features. A statistical analysis of the features demonstrated the suitability of the autocorrelation for the classification of non-malignant (normal epithelia and stroma, benign epithelia and stroma, inflammation), malignant (DCIS, IDC, ILC) and adipose tissue, since it reveals morphological information about the tissue. Non-malignant tissue shows higher autocorrelation values, adipose tissue presents a very low autocorrelation in its scatter texture, and malignant tissue lies in the middle ground. Consequently, a fast linear classifier based on just one straightforward feature is enough to provide relevant diagnostic information. A leave-one-out validation of the linear classifier on 29 samples with 48 regions of interest showed classification accuracies of 98.74% on adipose tissue, 82.67% on non-malignant tissue and 72.37% on malignant tissue, in comparison with the biopsy H and E gold standard. This demonstrates that autocorrelation analysis of scatter signatures is a computationally efficient and automated approach to providing pathological information in real time to guide the surgeon during tissue resection.
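
    The single-feature classification with leave-one-out validation described in this abstract can be sketched as follows. This is only an illustrative sketch: the abstract does not specify the classifier beyond "linear", so a one-dimensional nearest-class-mean rule is assumed here, and the function names and the toy autocorrelation values are hypothetical, not data from the study.

```python
def nearest_mean_predict(train_x, train_y, x):
    """Assign x to the class whose mean feature value is closest (assumed rule)."""
    means = {}
    for label in set(train_y):
        values = [v for v, y in zip(train_x, train_y) if y == label]
        means[label] = sum(values) / len(values)
    return min(means, key=lambda label: abs(means[label] - x))


def leave_one_out_accuracy(xs, ys):
    """Hold out each sample once, train on the rest, and count correct predictions."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_mean_predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Made-up toy autocorrelation values: low for adipose, middle for malignant,
# high for non-malignant tissue, mirroring the ordering stated in the abstract.
toy_x = [0.10, 0.15, 0.20, 0.50, 0.55, 0.60, 0.90, 0.95, 1.00]
toy_y = ["adipose"] * 3 + ["malignant"] * 3 + ["non-malignant"] * 3
```

    With well-separated toy values every held-out sample is recovered; real scatter data would overlap between classes, which is consistent with the lower accuracies reported for malignant tissue.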

  4. Stratification and prognostic relevance of Jass’s molecular classification of colorectal cancer

    Directory of Open Access Journals (Sweden)

    Inti Zlobec

    2012-02-01

    Background: The current proposed model of colorectal tumorigenesis is based primarily on CpG island methylator phenotype (CIMP), microsatellite instability (MSI), KRAS, BRAF, and methylation status of O6-Methylguanine DNA Methyltransferase (MGMT), and classifies tumors into 5 subgroups. The aim of this study is to validate this molecular classification and test its prognostic relevance. Methods: 302 patients were included in this study. Molecular analysis was performed for 5 CIMP-related promoters (CRABP1, MLH1, p16INK4a, CACNA1G, NEUROG1), MGMT, MSI, KRAS and BRAF. Tumors were CIMP-high or CIMP-low if ≥4 and 1-3 promoters were methylated, respectively. Results: CIMP-high, CIMP-low and CIMP-negative were found in 7.1%, 43% and 49.9% of cases, respectively. 123 tumors (41%) could not be classified into any one of the proposed molecular subgroups, including 107 CIMP-low, 14 CIMP-high and 2 CIMP-negative cases. The 10-year survival rate for CIMP-high patients (22.6%; 95% CI: 7-43) was significantly lower than for CIMP-low or CIMP-negative patients (p=0.0295). Only the combined analysis of BRAF and CIMP (negative versus low/high) led to distinct prognostic subgroups. Conclusion: Although CIMP status has an effect on outcome, our results underline the need for standardized definitions of low- and high-level CIMP, the lack of which clearly hinders an effective prognostic and molecular classification of colorectal cancer.
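
    The CIMP thresholds stated in the Methods can be written as a small rule. This is a minimal sketch of that rule only: the function name `cimp_status` and the default panel size argument are illustrative assumptions, and the mapping of zero methylated promoters to CIMP-negative is inferred from the three categories reported in the Results.

```python
def cimp_status(methylated_promoters: int, panel_size: int = 5) -> str:
    """Map the number of methylated CIMP-related promoters to a CIMP category.

    Thresholds follow the abstract: >=4 methylated promoters is CIMP-high,
    1-3 is CIMP-low; 0 is taken here to mean CIMP-negative.
    """
    if not 0 <= methylated_promoters <= panel_size:
        raise ValueError("count must lie between 0 and the panel size")
    if methylated_promoters >= 4:
        return "CIMP-high"
    if methylated_promoters >= 1:
        return "CIMP-low"
    return "CIMP-negative"
```

    The abstract's conclusion is that these cut-offs are not standardized, so the thresholds would be the first thing to parameterize when comparing against other definitions.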

  5. A Fuzzy Similarity Based Concept Mining Model for Text Classification

    CERN Document Server

    Puri, Shalini

    2012-01-01

    Text classification is a challenging and highly active field and is of great importance in text categorization applications. A lot of research has been done in this field, but there remains a need to categorize a collection of text documents into mutually exclusive categories by extracting the concepts or features using a supervised learning paradigm and different classification algorithms. In this paper, a new Fuzzy Similarity Based Concept Mining Model (FSCMM) is proposed to classify a set of text documents into predefined Category Groups (CG) by training and preparing them on the sentence, document and integrated corpora levels, along with feature reduction and ambiguity removal on each level, to achieve high system performance. A Fuzzy Feature Category Similarity Analyzer (FFCSA) is used to analyze each extracted feature of the Integrated Corpora Feature Vector (ICFV) with the corresponding categories or classes. This model uses a Support Vector Machine Classifier (SVMC) to classify correct...

  6. SPEECH/MUSIC CLASSIFICATION USING WAVELET BASED FEATURE EXTRACTION TECHNIQUES

    Directory of Open Access Journals (Sweden)

    Thiruvengatanadhan Ramalingam

    2014-01-01

    Audio classification is a fundamental step in handling the rapid growth in audio data volume. Because of the increasing size of multimedia sources, speech and music classification is one of the most important issues in multimedia information retrieval. In this work a speech/music discrimination system is developed which uses the Discrete Wavelet Transform (DWT) as the acoustic feature. Multi-resolution analysis is the most significant statistical way to extract features from the input signal, and in this study a method is deployed to model the extracted wavelet features. Support Vector Machines (SVM) are based on the principle of structural risk minimization. SVM is applied to classify audio into its classes, namely speech and music, by learning from training data. The proposed method then extends the application of Gaussian Mixture Models (GMM) to estimate the probability density function using maximum likelihood decision methods. The system shows significant results, with an accuracy of 94.5%.
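
    To illustrate the kind of wavelet feature this abstract refers to, the sketch below computes sub-band energies from a multi-level DWT. Several things here are assumptions rather than details from the abstract: the Haar wavelet is chosen only because it is the simplest to write out, the use of sub-band energy as the feature is one common choice, and the names `haar_dwt` and `subband_energies` are hypothetical.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


def subband_energies(signal, levels):
    """Energy of the detail coefficients at each level, then of the final approximation."""
    energies = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_dwt(current)
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in current))
    return energies
```

    Because the Haar transform used here is orthonormal, the sub-band energies sum to the energy of the original frame, so the feature vector describes how that energy is distributed across scales. Such a vector could then be passed to an SVM or GMM as in the abstract.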

  7. Rule based fuzzy logic approach for classification of fibromyalgia syndrome.

    Science.gov (United States)

    Arslan, Evren; Yildiz, Sedat; Albayrak, Yalcin; Koklukaya, Etem

    2016-06-01

    Fibromyalgia syndrome (FMS) is a chronic disease of the muscular and skeletal system, observed mostly in women, that manifests itself as widespread pain and impairs the individual's quality of life. FMS is diagnosed on the basis of the American College of Rheumatology (ACR) criteria. Recently, however, the applicability and sufficiency of the ACR criteria have come under debate. In this context, researchers have proposed several evaluation methods, including clinical evaluation methods. Accordingly, the ACR had to update its criteria, first announced in 1990, in 2010 and 2011. The proposed rule-based fuzzy logic method aims to evaluate FMS from a different angle as well. The method contains a rule base derived from the 1990 ACR criteria and the individual experience of specialists. The study was conducted using data collected from 60 inpatients and 30 healthy volunteers. Several tests and a physical examination were administered to the participants. The fuzzy logic rule base was structured using the parameters of tender point count, chronic widespread pain period, pain severity, fatigue severity and sleep disturbance level, which were deemed important in FMS diagnosis. The fuzzy predictor was generally found to be 95.56% consistent with at least one of the specialists, who were not creators of the fuzzy rule base. Thus, in diagnosis classification where the severity of FMS was classified as well, consistent findings were obtained from the comparison of the interpretations and experience of specialists with the fuzzy logic approach. The study proposes a rule base that could eliminate the shortcomings of the 1990 ACR criteria in the FMS evaluation process. Furthermore, the proposed method provides a classification of the severity of the disease, which was not available with the ACR criteria. The study was not limited to disease classification alone; the probability of occurrence and the severity were classified at the same time. In addition, those who were not suffering from FMS were

  8. Network Traffic Anomalies Identification Based on Classification Methods

    Directory of Open Access Journals (Sweden)

    Donatas Račys

    2015-07-01

    The problem of detecting network traffic anomalies in computer networks is analyzed. An overview of anomaly detection methods is given, and the advantages and disadvantages of the different methods are analyzed. A model for traffic anomaly detection was developed in IBM SPSS Modeler and used to analyze the SNMP data of a router. Traffic anomalies were investigated using three classification methods and different sets of learning data. The investigation showed that the C5.1 decision tree method has the highest accuracy and performance and can be successfully used to identify network traffic anomalies.

  9. Long-term Prostate-specific Antigen Velocity in Improved Classification of Prostate Cancer Risk and Mortality

    DEFF Research Database (Denmark)

    Ørsted, David Dynnes; Bojesen, Stig E; Kamstrup, Pia R;

    2013-01-01

    BACKGROUND: It remains unclear whether adding long-term prostate-specific antigen velocity (PSAV) to baseline PSA values improves classification of prostate cancer (PCa) risk and mortality in the general population. OBJECTIVE: To determine whether long-term PSAV improves classification of PCa risk...

  10. Spectral classification of stars based on LAMOST spectra

    CERN Document Server

    Liu, Chao; Zhang, Bo; Wan, Jun-Chen; Deng, Li-Cai; Hou, Yonghui; Wang, Yuefei; Yang, Ming; Zhang, Yong

    2015-01-01

    In this work, we select the high signal-to-noise ratio spectra of stars from the LAMOST data and map their MK classes to the spectral features. The equivalent widths of the prominent spectral lines, playing a similar role to multi-color photometry, form a clean stellar locus well ordered by MK classes. The advantage of the stellar locus in line indices is that it gives a natural and continuous classification of stars consistent with either the broadly used MK classes or the stellar astrophysical parameters. We also employ an SVM-based classification algorithm to assign MK classes to the LAMOST stellar spectra. We find that the completenesses of the classification are up to 90% for A and G type stars, while it is down to about 50% for OB and K type stars. About 40% of the OB and K type stars are mis-classified as A and G type stars, respectively. This is likely owing to the difference of the spectral features between the late B type and early A type stars or between the late G and early K type stars are very we...
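
    The equivalent width that serves as the line index here is, in its standard definition, the integral of the fractional line depth, EW = ∫ (1 − F(λ)/F_c(λ)) dλ, over the line bandpass. The sketch below evaluates it with the trapezoidal rule. It assumes the continuum flux F_c is already known at each wavelength sample; how the continuum and bandpasses are chosen for the LAMOST spectra is not stated in the abstract, and the function name is hypothetical.

```python
def equivalent_width(wavelengths, flux, continuum):
    """Trapezoidal estimate of EW = integral of (1 - F/Fc) over wavelength.

    Positive values correspond to absorption lines. All three sequences must
    have the same length, with wavelengths in increasing order.
    """
    if not (len(wavelengths) == len(flux) == len(continuum)):
        raise ValueError("inputs must have equal length")
    total = 0.0
    for k in range(len(wavelengths) - 1):
        depth_left = 1.0 - flux[k] / continuum[k]
        depth_right = 1.0 - flux[k + 1] / continuum[k + 1]
        step = wavelengths[k + 1] - wavelengths[k]
        total += 0.5 * (depth_left + depth_right) * step
    return total
```

    The result carries the units of the wavelength axis, so a spectrum sampled in ångströms gives an equivalent width in ångströms.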

  11. Risk Classification and Risk-based Safety and Mission Assurance

    Science.gov (United States)

    Leitner, Jesse A.

    2014-01-01

    Recent activities to revamp and emphasize the need to streamline processes and activities for Class D missions across the agency have led to various interpretations of Class D, including the lumping of a variety of low-cost projects into Class D. Sometimes terms such as Class D minus are used. In this presentation, mission risk classifications will be traced to official requirements and definitions as a measure to ensure that projects and programs align with the guidance and requirements that are commensurate with their defined risk posture. As part of this, the full suite of risk classifications, formal and informal, will be defined, followed by an introduction to the new GPR 8705.4 that is currently under review. GPR 8705.4 lays out guidance for the mission success activities performed at Classes A-D for NPR 7120.5 projects as well as for projects not under NPR 7120.5. Furthermore, the trends in stepping from Class A into higher risk posture classifications will be discussed. The talk will conclude with a discussion about risk-based safety and mission assurance at GSFC.

  12. Lung Cancer Early Diagnosis Using Some Data Mining Classification Techniques: A Survey

    Directory of Open Access Journals (Sweden)

    Thangaraju P

    2014-06-01

    Data mining is the process of analyzing data from different perspectives and summarizing it into useful information. It is primarily used for this purpose and thus finds applications in diverse fields such as retail, finance, communication, marketing and medicine. Data mining plays an important role in healthcare organizations because of population growth and dangerous, deadly diseases such as cancer, SARS, leprosy and HIV; lung cancer is one of the most dangerous of these diseases. This survey covers medical image mining, data preprocessing, feature extraction, rule generation and classification, and provides a basic framework for further improvement in medical diagnosis.

  13. MODEL-BASED CLUSTERING FOR CLASSIFICATION OF AQUATIC SYSTEMS AND DIAGNOSIS OF ECOLOGICAL STRESS

    Science.gov (United States)

    Clustering approaches were developed using the classification likelihood, the mixture likelihood, and also using a randomization approach with a model index. Using a clustering approach based on the mixture and classification likelihoods, we have developed an algorithm that...

  14. Content-based image retrieval applied to BI-RADS tissue classification in screening mammography

    OpenAIRE

    2011-01-01

    AIM: To present a content-based image retrieval (CBIR) system that supports the classification of breast tissue density and can be used in the processing chain to adapt parameters for lesion segmentation and classification.

  15. A Hybrid Classification Approach based on FCA and Emerging Patterns - An application for the classification of biological inhibitors

    OpenAIRE

    Asses, Yasmine; Buzmakov, Aleksey; Bourquard, Thomas; Kuznetsov, Sergei O.; Napoli, Amedeo

    2012-01-01

    Classification is an important task in data analysis and learning. Classification can be performed using supervised or unsupervised methods. From the unsupervised point of view, Formal Concept Analysis (FCA) can be used for such a task in an efficient and well-founded way. From the supervised point of view, emerging patterns rely on pattern mining and can be used to characterize classes of objects w.r.t. a priori labels. In this paper, we present a hybrid classification method which is based ...

  16. Apparent diffusion coefficient value of gastric cancer by diffusion-weighted imaging: Correlations with the histological differentiation and Lauren classification

    Energy Technology Data Exchange (ETDEWEB)

    Liu, Song, E-mail: songliu532909756@gmail.com [Department of Radiology, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008 (China); Guan, Wenxian, E-mail: wenxianguan123@126.com [Department of Gastrointestinal Surgery, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008 (China); Wang, Hao, E-mail: wanghao20140525@126.com [Department of Gastrointestinal Surgery, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008 (China); Pan, Liang, E-mail: panliang2014@126.com [Department of Radiology, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008 (China); Zhou, Zhuping, E-mail: zhupingzhou@126.com [Department of Radiology, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008 (China); Yu, Haiping, E-mail: haipingyu2012@126.com [Department of Radiology, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008 (China); Liu, Tian, E-mail: tianliu2014@126.com [Radiation Oncology and Winship Cancer Institute, Emory University, Atlanta, GA 30322 (United States); Yang, Xiaofeng, E-mail: xiaofengyang2014@126.com [Radiation Oncology and Winship Cancer Institute, Emory University, Atlanta, GA 30322 (United States); He, Jian, E-mail: hjxueren@126.com [Department of Radiology, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008 (China); Zhou, Zhengyang, E-mail: zyzhou@nju.edu.cn [Department of Radiology, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008 (China)

    2014-12-15

    Highlights: • Gastric cancers’ ADC values were significantly lower than normal gastric wall. • Gastric adenocarcinomas with different differentiation had different ADC values. • Gastric adenocarcinomas’ ADC values correlated with histologic differentiations. • Gastric cancers’ ADC values correlated with Lauren classifications. • Mean ADC value was better than min ADC value in characterizing gastric cancers. - Abstract: Objective: The purpose of this study was to evaluate the correlations between histological differentiation and Lauren classification of gastric cancer and the apparent diffusion coefficient (ADC) value of diffusion weighted imaging (DWI). Materials and methods: Sixty-nine patients with gastric cancer lesions underwent preoperative magnetic resonance imaging (MRI) (3.0T) and surgical resection. DWI was obtained with a single-shot, echo-planar imaging sequence in the axial plane (b values: 0 and 1000 s/mm²). Mean and minimum ADC values were obtained for each gastric cancer and normal gastric walls by two radiologists, who were blinded to the histological findings. Histological type, degree of differentiation and Lauren classification of each resected specimen were determined by one pathologist. Mean and minimum ADC values of gastric cancers with different histological types, degrees of differentiation and Lauren classifications were compared. Correlations between ADC values and histological differentiation and Lauren classification were analyzed. Results: The mean and minimum ADC values of gastric cancers, as a whole and separately, were significantly lower than those of normal gastric walls (all p values <0.001). There were significant differences in the mean and minimum ADC values among gastric cancers with different histological types, degrees of differentiation and Lauren classifications (p < 0.05). Mean and minimum ADC values correlated significantly (all p < 0.001) with histological differentiation (r = 0.564, 0.578) and

  17. Apparent diffusion coefficient value of gastric cancer by diffusion-weighted imaging: Correlations with the histological differentiation and Lauren classification

    International Nuclear Information System (INIS)

    Highlights: • Gastric cancers’ ADC values were significantly lower than normal gastric wall. • Gastric adenocarcinomas with different differentiation had different ADC values. • Gastric adenocarcinomas’ ADC values correlated with histologic differentiations. • Gastric cancers’ ADC values correlated with Lauren classifications. • Mean ADC value was better than min ADC value in characterizing gastric cancers. - Abstract: Objective: The purpose of this study was to evaluate the correlations between histological differentiation and Lauren classification of gastric cancer and the apparent diffusion coefficient (ADC) value of diffusion weighted imaging (DWI). Materials and methods: Sixty-nine patients with gastric cancer lesions underwent preoperative magnetic resonance imaging (MRI) (3.0T) and surgical resection. DWI was obtained with a single-shot, echo-planar imaging sequence in the axial plane (b values: 0 and 1000 s/mm²). Mean and minimum ADC values were obtained for each gastric cancer and normal gastric walls by two radiologists, who were blinded to the histological findings. Histological type, degree of differentiation and Lauren classification of each resected specimen were determined by one pathologist. Mean and minimum ADC values of gastric cancers with different histological types, degrees of differentiation and Lauren classifications were compared. Correlations between ADC values and histological differentiation and Lauren classification were analyzed. Results: The mean and minimum ADC values of gastric cancers, as a whole and separately, were significantly lower than those of normal gastric walls (all p values <0.001). There were significant differences in the mean and minimum ADC values among gastric cancers with different histological types, degrees of differentiation and Lauren classifications (p < 0.05). Mean and minimum ADC values correlated significantly (all p < 0.001) with histological differentiation (r = 0.564, 0.578) and Lauren

  18. A Chemistry-Based Classification for Peridotite Xenoliths

    Science.gov (United States)

    Block, K. A.; Ducea, M.; Raye, U.; Stern, R. J.; Anthony, E. Y.; Lehnert, K. A.

    2007-12-01

    The development of a petrological and geochemical database for mantle xenoliths is important for interpreting EarthScope geophysical results. Interpretation of compositional characteristics of xenoliths requires a sound basis for comparing geochemical results, even when no petrographic modes are available. Peridotite xenoliths are generally classified on the basis of mineralogy (Streckeisen, 1973) derived from point-counting methods. Modal estimates, particularly on heterogeneous samples, are conducted using various methodologies and are therefore subject to large statistical error. Also, many studies simply do not report the modes. Other classifications for peridotite xenoliths based on host matrix or tectonic setting (cratonic vs. non-cratonic) are poorly defined and provide little information on where samples from transitional settings fit within a classification scheme (e.g., xenoliths from circum-cratonic locations). We present here a classification for peridotite xenoliths based on bulk rock major element chemistry, which is one of the most common types of data reported in the literature. A chemical dataset of over 1150 peridotite xenoliths is compiled from two online geochemistry databases, the EarthChem Deep Lithosphere Dataset and from GEOROC (http://www.earthchem.org), and is downloaded with the rock names reported in the original publications. Ternary plots of combinations of the SiO2- CaO-Al2O3-MgO (SCAM) components display sharp boundaries that define the dunite, harzburgite, lherzolite, or wehrlite-pyroxenite fields and provide a graphical basis for classification. In addition, for the CaO-Al2O3-MgO (CAM) diagram, a boundary between harzburgite and lherzolite at approximately 19% CaO is defined by a plot of over 160 abyssal peridotite compositions calculated from observed modes using the methods of Asimow (1999) and Baker and Beckett (1999). We anticipate that our SCAM classification is a first step in the development of a uniform basis for

  19. Content Based Image Retrieval : Classification Using Neural Networks

    Directory of Open Access Journals (Sweden)

    Shereena V.B

    2014-11-01

    In a content-based image retrieval (CBIR) system, the main issue is to extract the image features that effectively represent the image contents in a database. Such an extraction requires a detailed evaluation of the retrieval performance of image features. This paper presents a review of fundamental aspects of content-based image retrieval, including the extraction of color and texture features. Commonly used color features, including color moments, color histogram and color correlogram, and the Gabor texture feature are compared. The paper reviews the increase in image retrieval efficiency when the color and texture features are combined. The similarity measures on which matches are made and images are retrieved are also discussed. For effective indexing and fast searching of images based on visual features, neural network based pattern learning can be used to achieve effective classification.
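
    Of the color features listed above, color moments are the most compact, and a sketch shows how such a descriptor might be computed. The formulation assumed here is the commonly used one (mean, standard deviation, and the signed cube root of the third central moment, per channel); the review itself does not spell out a formula, and the function names are hypothetical.

```python
import math


def color_moments(channel):
    """First three color moments of one channel given as a flat list of values."""
    n = len(channel)
    mean = sum(channel) / n
    variance = sum((v - mean) ** 2 for v in channel) / n
    third = sum((v - mean) ** 3 for v in channel) / n
    # Signed cube root keeps the skewness term in the same units as the pixel values.
    skewness = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return mean, math.sqrt(variance), skewness


def color_moment_vector(pixels):
    """Concatenate the moments of each channel for a list of (r, g, b) pixels."""
    features = []
    for channel_index in range(3):
        features.extend(color_moments([p[channel_index] for p in pixels]))
    return features
```

    The resulting nine-element vector can be compared between images with any of the similarity measures the review discusses, or concatenated with texture features before being fed to a classifier.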

  20. Content Based Image Retrieval : Classification Using Neural Networks

    Directory of Open Access Journals (Sweden)

    Shereena V.B

    2014-10-01

    In a content-based image retrieval (CBIR) system, the main issue is to extract the image features that effectively represent the image contents in a database. Such an extraction requires a detailed evaluation of the retrieval performance of image features. This paper presents a review of fundamental aspects of content-based image retrieval, including the extraction of color and texture features. Commonly used color features, including color moments, color histogram and color correlogram, and the Gabor texture feature are compared. The paper reviews the increase in image retrieval efficiency when the color and texture features are combined. The similarity measures on which matches are made and images are retrieved are also discussed. For effective indexing and fast searching of images based on visual features, neural network based pattern learning can be used to achieve effective classification.

  1. Intrusion Awareness Based on Data Fusion and SVM Classification

    Directory of Open Access Journals (Sweden)

    Ramnaresh Sharma

    2012-06-01

    Network intrusion awareness is an important factor in the risk analysis of network security. In the current decade, various methods and frameworks have become available for intrusion detection and security awareness. Some methods are based on the knowledge discovery process and some frameworks on neural networks. All of these models take rule-based decisions to generate security alerts. In this paper we propose a novel method for intrusion awareness using data fusion and SVM classification. Data fusion works on the basis of gathering the features of events. The support vector machine is a strong data classifier. Here we use the SVM to detect the closed items of the rule-based technique. The proposed method was simulated on the KDD1999 DARPA data set and achieved better empirical evaluation results than the rule-based technique and the neural network model.

  2. Intrusion Awareness Based on Data Fusion and SVM Classification

    Directory of Open Access Journals (Sweden)

    Ramnaresh Sharma

    2012-06-01

    Network intrusion awareness is an important factor in the risk analysis of network security. In the current decade, various methods and frameworks have become available for intrusion detection and security awareness. Some methods are based on the knowledge discovery process and some frameworks on neural networks. All of these models take rule-based decisions to generate security alerts. In this paper we propose a novel method for intrusion awareness using data fusion and SVM classification. Data fusion works on the basis of gathering the features of events. The support vector machine is a strong data classifier. Here we use the SVM to detect the closed items of the rule-based technique. The proposed method was simulated on the KDD1999 DARPA data set and achieved better empirical evaluation results than the rule-based technique and the neural network model.

  3. Texton Based Shape Features on Local Binary Pattern for Age Classification

    OpenAIRE

    V. Vijaya Kumar; B. Eswara Reddy; P. Chandra Sekhar Reddy

    2012-01-01

    Classification and recognition of objects is of interest to many researchers. Shape is a significant feature of objects and plays a crucial role in image classification and recognition. The present paper assumes that the features that most strongly affect an adulthood classification system are the shape features (SF) of the face. Based on this, the paper proposes a new technique of adulthood classification by extracting feature parameters of the face on Integrated Texton based LBP (IT-LBP) ima...

  4. Generalization performance of graph-based semisupervised classification

    Institute of Scientific and Technical Information of China (English)

    2009-01-01

    Semi-supervised learning has been of growing interest over the past few years and many methods have been proposed. Although various algorithms are provided to implement semi-supervised learning, there are still gaps in our understanding of the dependence of generalization error on the numbers of labeled and unlabeled data. In this paper, we consider a graph-based semi-supervised classification algorithm and establish its generalization error bounds. Our results show the close relations between the generalization performance and the structural invariants of the data graph.

  5. Hydrophobicity classification of polymeric materials based on fractal dimension

    Directory of Open Access Journals (Sweden)

    Daniel Thomazini

    2008-12-01

    This study proposes a new method to obtain the hydrophobicity classification (HC) of high voltage polymer insulators. In this method, the HC was analyzed by fractal dimension (fd), and its processing time was evaluated with a view to application in mobile devices. Texture images were created by spraying solutions made from mixtures of isopropyl alcohol and distilled water in proportions ranging from 0 to 100% alcohol by volume (%AIA). Based on these solutions, the contact angles of the drops were measured and the textures were used as patterns for the fractal dimension calculations.
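
    The abstract does not say how the fractal dimension of a texture image is estimated. One widely used estimator is box counting, sketched below under that assumption: the image is taken to be already binarized into a 2-D list of 0/1 values, and the function names and default box sizes are hypothetical choices for illustration.

```python
import math


def box_count(image, size):
    """Number of size-by-size boxes that contain at least one foreground pixel."""
    occupied = set()
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            if value:
                occupied.add((r // size, c // size))
    return len(occupied)


def box_counting_dimension(image, sizes=(1, 2, 4, 8)):
    """Least-squares slope of log N(s) against log(1/s).

    Assumes the image has at least one foreground pixel, so every count is > 0.
    """
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(image, s)) for s in sizes]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator
```

    A filled region gives a slope of 2 and a straight line a slope of 1; textures of sprayed droplets would be expected to fall between these limits. The single pass over the pixels per box size keeps the cost low, which matters for the mobile use case the abstract mentions.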

  6. An AIS-Based E-mail Classification Method

    Science.gov (United States)

    Qing, Jinjian; Mao, Ruilong; Bie, Rongfang; Gao, Xiao-Zhi

    This paper proposes a new e-mail classification method based on the Artificial Immune System (AIS), which is endowed with good diversity and self-adaptive ability by using the immune learning, immune memory, and immune recognition. In our method, the features of spam and non-spam extracted from the training sets are combined together, and the number of false positives (non-spam messages that are incorrectly classified as spam) can be reduced. The experimental results demonstrate that this method is effective in reducing the false rate.

  7. Commercial Shot Classification Based on Multiple Features Combination

    Science.gov (United States)

    Liu, Nan; Zhao, Yao; Zhu, Zhenfeng; Ni, Rongrong

    This paper presents a commercial shot classification scheme combining well-designed visual and textual features to automatically detect TV commercials. To identify the inherent difference between commercials and general programs, a special mid-level textual descriptor is proposed, aiming to capture the spatio-temporal properties of the video texts typical of commercials. In addition, we introduce an ensemble-learning based combination method, named Co-AdaBoost, to interactively exploit the intrinsic relations between the visual and textual features employed.

  8. A Method for Data Classification Based on Discernibility Matrix and Discernibility Function

    Institute of Scientific and Technical Information of China (English)

    2006-01-01

    The method used for data classification influences the efficiency of classification. Attribute reduction based on the discernibility matrix and discernibility function of rough sets can be used in data classification, so we put forward a method for data classification. First, we use the discernibility matrix and discernibility function to delete superfluous attributes from the information system and obtain a necessary attribute set. Second, we delete superfluous attribute values and obtain decision rules. Finally, we classify data by means of the decision rules. The experiments show that data classification using this method is simpler in structure and can improve the efficiency of classification.
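
    The first step of the method, building a discernibility matrix and using it to find superfluous attributes, can be sketched as follows. The sketch uses the standard decision-relative definition from rough set theory (an entry lists the condition attributes on which two objects with different decisions differ); the function names and the small decision table are hypothetical, not taken from the paper.

```python
from itertools import combinations


def discernibility_matrix(rows, attributes, decision):
    """Map each pair of objects with different decisions to the attributes that tell them apart."""
    matrix = {}
    for (i, a), (j, b) in combinations(enumerate(rows), 2):
        if a[decision] != b[decision]:
            matrix[(i, j)] = frozenset(x for x in attributes if a[x] != b[x])
    return matrix


def core(matrix):
    """Attributes that occur as singleton entries; they belong to every reduct."""
    return {next(iter(entry)) for entry in matrix.values() if len(entry) == 1}


def preserves_discernibility(matrix, subset):
    """True if the subset still distinguishes every pair the full attribute set could."""
    chosen = set(subset)
    return all(entry & chosen for entry in matrix.values() if entry)


# Hypothetical decision table with condition attributes a, b, c and decision d.
table = [
    {"a": 1, "b": 0, "c": 0, "d": "yes"},
    {"a": 1, "b": 1, "c": 0, "d": "no"},
    {"a": 0, "b": 0, "c": 1, "d": "no"},
    {"a": 1, "b": 0, "c": 1, "d": "yes"},
]
toy_matrix = discernibility_matrix(table, ["a", "b", "c"], "d")
```

    In this toy table the attributes a and b form the core and already distinguish every pair, so c is superfluous and can be dropped before decision rules are generated.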

  9. Semi-Supervised Classification based on Gaussian Mixture Model for remote imagery

    Institute of Scientific and Technical Information of China (English)

    2010-01-01

    Semi-Supervised Classification (SSC), which makes use of both labeled and unlabeled data to determine classification borders in feature space, has great advantages in extracting classification information from mass data. In this paper, a novel SSC method based on the Gaussian Mixture Model (GMM) is proposed, in which each class’s feature space is described by one GMM. Experiments show that the proposed method can achieve high classification accuracy with a small amount of labeled data. For the same accuracy, however, supervised classification methods such as Support Vector Machine, Object Oriented Classification, etc. need much more labeled data.

  10. Vessel-guided airway segmentation based on voxel classification

    DEFF Research Database (Denmark)

    Lo, Pechin Chien Pau; Sporring, Jon; Ashraf, Haseem;

    2008-01-01

    surroundings of a voxel, estimated based on a tube model, is to that of a neighboring vessel. The proposed method is tested on 20 CT images from different subjects selected randomly from a lung cancer screening study. Length of the airway branches from the results of the proposed method are significantly...

  11. Feature selection gait-based gender classification under different circumstances

    Science.gov (United States)

    Sabir, Azhin; Al-Jawad, Naseer; Jassim, Sabah

    2014-05-01

    This paper proposes gender classification based on human gait features and investigates two variations in addition to the normal gait sequence: clothing (wearing coats) and carrying a bag. The feature vectors in the proposed system are constructed after applying a wavelet transform. Three different feature sets are proposed. The first is the spatio-temporal distance, which deals with the distances between different parts of the human body (such as feet, knees, hands, height and shoulders) during one gait cycle. The second and third feature sets are constructed from the approximation and non-approximation coefficients of the human body, respectively. To extract these two feature sets, we divided the human body into two parts, upper and lower, based on the golden ratio proportion. We adopted a statistical method for constructing the feature vector from the above sets. The dimension of the constructed feature vector is reduced using the Fisher score as a feature selection method to optimize discriminating significance. Finally, k-Nearest Neighbor is applied as the classification method. Experimental results demonstrate that our approach provides a more realistic scenario and relatively better performance compared with existing approaches.
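
    The last two stages, Fisher-score feature selection followed by k-Nearest Neighbor, can be sketched as follows. This is an illustration under assumptions: the Fisher score uses one common definition (between-class scatter over within-class scatter, per feature), the data are invented, and the wavelet-based feature construction from the abstract is not reproduced.

```python
from collections import Counter
from statistics import mean, pvariance

def fisher_scores(X, y):
    """Per-feature Fisher score: sum_c n_c (mu_c - mu)^2 / sum_c n_c var_c."""
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall = mean(column)
        between = within = 0.0
        for c in set(y):
            vals = [row[j] for row, label in zip(X, y) if label == c]
            between += len(vals) * (mean(vals) - overall) ** 2
            within += len(vals) * pvariance(vals)
        scores.append(between / within if within > 0 else float("inf"))
    return scores

def knn_predict(X, y, query, k=3):
    """Majority label among the k training points closest to the query."""
    nearest = sorted(
        zip(X, y),
        key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], query)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical gait features: column 0 separates the classes, column 1 does not.
X = [(1.0, 5.0), (1.2, 1.0), (0.8, 3.0), (3.0, 4.0), (3.2, 2.0), (2.8, 3.0)]
y = ["female", "female", "female", "male", "male", "male"]

scores = fisher_scores(X, y)
# Keep the single highest-scoring feature (the cut-off of 1 is an assumption).
keep = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:1]
X_reduced = [tuple(row[j] for j in keep) for row in X]
prediction = knn_predict(X_reduced, y, query=(2.9,), k=3)
print(keep, prediction)
```

    Ranking features before the neighbor search keeps uninformative dimensions from dominating the distance computation, which is the usual motivation for pairing a filter method with k-NN.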

  12. Forest Classification Based on Forest texture in Northwest Yunnan Province

    International Nuclear Information System (INIS)

    Forest texture is an intrinsic characteristic and an important visual feature of a forest ecological system. Full utilization of forest texture will be a great help in increasing the accuracy of forest classification based on remote sensed data. Taking Shangri-La as a study area, forest classification has been based on the texture. The results show that: (1) From the texture abundance, texture boundary, entropy as well as visual interpretation, the combination of Grayscale-gradient co-occurrence matrix and wavelet transformation is much better than either method alone for forest texture information extraction; (2) During the forest texture information extraction, the size of the texture-suitable window determined by the semi-variogram method depends on the forest type (evergreen broadleaf forest is 3×3, deciduous broadleaf forest is 5×5, etc.); (3) When classifying forest based on forest texture information, the texture factor assembly differs among forests: Variance Heterogeneity and Correlation should be selected when the window is between 3×3 and 5×5; Mean, Correlation, and Entropy should be used when the window is in the range of 7×7 to 19×19; and Correlation, Second Moment, and Variance should be used when the range is larger than 21×21.

  13. Biopharmaceutics classification system-based biowaivers for generic oncology drug products: case studies.

    Science.gov (United States)

    Tampal, Nilufer; Mandula, Haritha; Zhang, Hongling; Li, Bing V; Nguyen, Hoainhon; Conner, Dale P

    2015-02-01

    Establishing bioequivalence (BE) of drugs indicated to treat cancer poses special challenges. For ethical reasons, often, the studies need to be conducted in cancer patients rather than in healthy volunteers, especially when the drug is cytotoxic. The Biopharmaceutics Classification System (BCS) introduced by Amidon (1) and adopted by the FDA, presents opportunities to avoid conducting the bioequivalence studies in humans. This paper analyzes the application of the BCS approach by the generic pharmaceutical industry and the FDA to oncology drug products. To date, the FDA has granted BCS-based biowaivers for several drug products involving at least four different drug substances, used to treat cancer. Compared to in vivo BE studies, development of data to justify BCS waivers is considered somewhat easier, faster, and more cost effective. However, the FDA experience shows that the approval times for applications containing in vitro studies to support the BCS-based biowaivers are often as long as the applications containing in vivo BE studies, primarily because of inadequate information in the submissions. This paper deliberates some common causes for the delays in the approval of applications requesting BCS-based biowaivers for oncology drug products. Scientific considerations of conducting a non-BCS-based in vivo BE study for generic oncology drug products are also discussed. It is hoped that the information provided in our study would help the applicants to improve the quality of ANDA submissions in the future. PMID:25245330

  14. Classification Based on Hierarchical Linear Models: The Need for Incorporation of Social Contexts in Classification Analysis

    Science.gov (United States)

    Vaughn, Brandon K.; Wang, Qui

    2009-01-01

    Many areas in educational and psychological research involve the use of classification statistical analysis. For example, school districts might be interested in attaining variables that provide optimal prediction of school dropouts. In psychology, a researcher might be interested in the classification of a subject into a particular psychological…

  15. Joint Probability-Based Neuronal Spike Train Classification

    Directory of Open Access Journals (Sweden)

    Yan Chen

    2009-01-01

    Neuronal spike trains are used by the nervous system to encode and transmit information. Euclidean distance-based methods (EDBMs) have been applied to quantify the similarity between temporally-discretized spike trains and model responses. In this study, using the same discretization procedure, we developed and applied a joint probability-based method (JPBM) to classify individual spike trains of slowly adapting pulmonary stretch receptors (SARs). The activity of individual SARs was recorded in anaesthetized, paralysed adult male rabbits, which were artificially ventilated at constant rate and one of three different volumes. Two-thirds of the responses to the 600 stimuli presented at each volume were used to construct three response models (one for each stimulus volume) consisting of a series of time bins, each with spike probabilities. The remaining one-third of the responses were used as test responses to be classified into one of the three model responses. This was done by computing the joint probability of observing the same series of events (spikes or no spikes, dictated by the test response) in a given model and determining which probability of the three was highest. The JPBM generally produced better classification accuracy than the EDBM, and both performed well above chance. Both methods were similarly affected by variations in discretization parameters, response epoch duration, and two different response alignment strategies. Increasing bin widths increased classification accuracy, which also improved with increased observation time, but primarily during periods of increasing lung inflation. Thus, the JPBM is a simple and effective method for performing spike train classification.
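
    The joint probability computation can be sketched as follows, assuming each response is already discretized into a list of 0/1 bins. The function names, the toy responses and the clipping constant `eps` (used to avoid taking the logarithm of zero) are assumptions made for illustration, not details from the study.

```python
import math

def build_model(responses, eps=1e-3):
    """Per-bin spike probability estimated from training responses,
    clipped to [eps, 1 - eps] so that logarithms stay finite."""
    n = len(responses)
    probs = [sum(bin_values) / n for bin_values in zip(*responses)]
    return [min(max(p, eps), 1 - eps) for p in probs]

def log_joint(response, model):
    """Log of the joint probability of the observed spike/no-spike series,
    treating bins as independent."""
    return sum(math.log(p if spike else 1 - p)
               for spike, p in zip(response, model))

def classify(response, models):
    """Name of the model under which the test response is most probable."""
    return max(models, key=lambda name: log_joint(response, models[name]))

# Hypothetical training responses for three stimulus volumes (4 bins each).
training = {
    "low":  [[1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]],
    "mid":  [[1, 1, 0, 0], [1, 1, 0, 0], [1, 0, 1, 0]],
    "high": [[1, 1, 1, 1], [1, 1, 1, 0], [1, 1, 1, 1]],
}
models = {name: build_model(resp) for name, resp in training.items()}
print(classify([1, 1, 0, 0], models))
```

    Summing logarithms instead of multiplying probabilities gives the same ranking of the three models while avoiding numerical underflow when the number of bins is large.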

  16. Association Technique based on Classification for Classifying Microcalcification and Mass in Mammogram

    Directory of Open Access Journals (Sweden)

    Herwanto

    2013-01-01

    Currently, mammography is recognized as the most effective imaging modality for breast cancer screening. The challenge of using mammography is how to locate the area, which is indeed a solitary geographic abnormality. In mammography screening it is important to define the risk for women who have radiologically negative findings and for those who might develop malignancy later in life. Microcalcification and mass segmentation are used frequently as the first step in mammography screening. The main objective of this paper is to apply an association technique based on a classification algorithm to classify microcalcification and mass in mammograms. The system that we propose consists of: (i) a preprocessing phase to enhance the quality of the image, followed by segmenting the region of interest; (ii) a phase for mining a transactional table; and (iii) a phase for organizing the resulting association rules in a classification model. This paper also illustrates how important the data cleaning phase is in building the data mining process for image classification. The proposed method was evaluated using the mammogram data from the Mammographic Image Analysis Society (MIAS). The MIAS data consist of 207 images of normal breast, 64 benign, and 51 malignant. 85 mammograms of MIAS data have mass, and 25 mammograms have microcalcification. The features of mean and Gray Level Co-occurrence Matrix homogeneity have proved to have potential for discriminating microcalcification from mass. The accuracy obtained by this method is 83%.
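
    The homogeneity feature can be sketched for a small quantized image. The sketch assumes a horizontal pixel offset and the common definition of homogeneity as the sum of p(i, j) / (1 + |i - j|); the abstract does not state which offset or variant the authors used, and the two tiny images are invented.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for one pixel offset."""
    counts = [[0] * levels for _ in range(levels)]
    height, width = len(image), len(image[0])
    for r in range(height):
        for c in range(width):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < height and 0 <= c2 < width:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def homogeneity(p):
    """Weights each co-occurrence by closeness of the two gray levels."""
    return sum(v / (1 + abs(i - j))
               for i, row in enumerate(p)
               for j, v in enumerate(row))

# Two hypothetical 2-level patches: one uniform, one checkerboard.
flat = [[1, 1, 1], [1, 1, 1]]
checker = [[0, 1, 0], [1, 0, 1]]
print(homogeneity(glcm(flat, 2)), homogeneity(glcm(checker, 2)))
```

    A uniform patch reaches the maximum homogeneity of 1, while the checkerboard, where every horizontal neighbor differs by one gray level, scores 0.5.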

  17. A NEW FUNCTIONAL CLASSIFICATION OF STOMACH CANCER AND ITS PATHOBIOLOGICAL AND CLINICAL SIGNIFICANCE

    Institute of Scientific and Technical Information of China (English)

    辛彦; 赵风凯; 宫伟; 王艳萍; 张荫昌; 闫瑞方

    1994-01-01

    The functional differentiations of stomach cancer specimens from 121 patients were investigated by enzyme-, mucin-, affinity- and immunohistochemical methods, and the stomach cancers were divided into five functionally differentiated types: 1) Absorptive Function Differentiation Type (AFDT), 19.8%; 2) Mucin Secreting Function Differentiation Type (MSFDT), 24.0%; 3) Absorptive and Mucin-Producing Function Differentiation Type (AMPFDT), 47.1%; 4) Special Function Differentiation Type (SFDT), 0.8%; and 5) Non-Function Differentiation Type (NFDT), 8.3%. The results indicate that stomach cancer tissues of the same histological type often display differing functional differentiation, and these functionally differentiated types have different invasive and metastatic characteristics. In addition, the functionally differentiated types have particular organic affinities of metastasis and different clinical prognoses. This study suggests that this new functional classification may supplement histological classification. The mechanisms of liver and ovary metastases of stomach cancer are also discussed.

  18. A New Classification Analysis of Customer Requirement Information Based on Quantitative Standardization for Product Configuration

    OpenAIRE

    Zheng Xiao; Zude Zhou; Buyun Sheng

    2016-01-01

    Traditional methods used for the classification of customer requirement information are typically based on specific indicators, hierarchical structures, and data formats and involve a qualitative analysis in terms of stationary patterns. Because these methods neither consider the scalability of classification results nor do they regard subsequent application to product configuration, their classification becomes an isolated operation. However, the transformation of customer requirement inform...

  19. mRMR-ABC: A Hybrid Gene Selection Algorithm for Cancer Classification Using Microarray Gene Expression Profiling

    Directory of Open Access Journals (Sweden)

    Hala Alshamlan

    2015-01-01

    An artificial bee colony (ABC) is a relatively recent swarm intelligence optimization approach. In this paper, we propose the first attempt at applying ABC algorithm in analyzing a microarray gene expression profile. In addition, we propose an innovative feature selection algorithm, minimum redundancy maximum relevance (mRMR), and combine it with an ABC algorithm, mRMR-ABC, to select informative genes from microarray profile. The new approach is based on a support vector machine (SVM) algorithm to measure the classification accuracy for selected genes. We evaluate the performance of the proposed mRMR-ABC algorithm by conducting extensive experiments on six binary and multiclass gene expression microarray datasets. Furthermore, we compare our proposed mRMR-ABC algorithm with previously known techniques. We reimplemented two of these techniques for the sake of a fair comparison using the same parameters. These two techniques are mRMR when combined with a genetic algorithm (mRMR-GA) and mRMR when combined with a particle swarm optimization algorithm (mRMR-PSO). The experimental results prove that the proposed mRMR-ABC algorithm achieves accurate classification performance using small number of predictive genes when tested using both datasets and compared to previously suggested methods. This shows that mRMR-ABC is a promising approach for solving gene selection and cancer classification problems.

  20. mRMR-ABC: A Hybrid Gene Selection Algorithm for Cancer Classification Using Microarray Gene Expression Profiling.

    Science.gov (United States)

    Alshamlan, Hala; Badr, Ghada; Alohali, Yousef

    2015-01-01

    An artificial bee colony (ABC) is a relatively recent swarm intelligence optimization approach. In this paper, we propose the first attempt at applying ABC algorithm in analyzing a microarray gene expression profile. In addition, we propose an innovative feature selection algorithm, minimum redundancy maximum relevance (mRMR), and combine it with an ABC algorithm, mRMR-ABC, to select informative genes from microarray profile. The new approach is based on a support vector machine (SVM) algorithm to measure the classification accuracy for selected genes. We evaluate the performance of the proposed mRMR-ABC algorithm by conducting extensive experiments on six binary and multiclass gene expression microarray datasets. Furthermore, we compare our proposed mRMR-ABC algorithm with previously known techniques. We reimplemented two of these techniques for the sake of a fair comparison using the same parameters. These two techniques are mRMR when combined with a genetic algorithm (mRMR-GA) and mRMR when combined with a particle swarm optimization algorithm (mRMR-PSO). The experimental results prove that the proposed mRMR-ABC algorithm achieves accurate classification performance using small number of predictive genes when tested using both datasets and compared to previously suggested methods. This shows that mRMR-ABC is a promising approach for solving gene selection and cancer classification problems. PMID:25961028
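
    The mRMR filtering stage alone can be sketched as a greedy selection over discretized expression values. This is only an illustration of the minimum redundancy maximum relevance idea, using the difference form (relevance minus mean redundancy) as an assumption; it does not include the ABC search or the SVM evaluation, and the gene names and values are hypothetical.

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(c / n * log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def mrmr(features, labels, n_select):
    """Greedily pick features that maximize relevance to the labels minus
    mean redundancy with the features already selected."""
    remaining = list(features)
    selected = []
    while remaining and len(selected) < n_select:
        def score(name):
            relevance = mutual_information(features[name], labels)
            if not selected:
                return relevance
            redundancy = sum(mutual_information(features[name], features[s])
                             for s in selected) / len(selected)
            return relevance - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical discretized expression levels for three genes, eight samples.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "g1": [0, 0, 0, 1, 1, 1, 1, 1],
    "g2": [0, 0, 0, 1, 1, 1, 1, 1],  # exact copy of g1, fully redundant
    "g3": [0, 1, 0, 1, 0, 1, 1, 1],
}
selected = mrmr(features, labels, n_select=2)
print(selected)
```

    Although g2 is as relevant as g1, it is skipped in the second round because it adds no information beyond g1, which is the behavior the redundancy term is meant to produce.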

  1. Style-based classification of Chinese ink and wash paintings

    Science.gov (United States)

    Sheng, Jiachuan; Jiang, Jianmin

    2013-09-01

    Following the fact that a large collection of ink and wash paintings (IWP) is being digitized and made available on the Internet, their automated content description, analysis, and management are attracting attention across research communities. While existing research in relevant areas is primarily focused on image processing approaches, a style-based algorithm is proposed to classify IWPs automatically by their authors. As IWPs do not have colors or even tones, the proposed algorithm applies edge detection to locate the local region and detect painting strokes to enable histogram-based feature extraction and capture of important cues to reflect the styles of different artists. Such features are then applied to drive a number of neural networks in parallel to complete the classification, and an information entropy balanced fusion is proposed to make an integrated decision for the multiple neural network classification results in which the entropy is used as a pointer to combine the global and local features. Evaluations via experiments support that the proposed algorithm achieves good performances, providing excellent potential for computerized analysis and management of IWPs.

  2. ECG-based heartbeat classification for arrhythmia detection: A survey.

    Science.gov (United States)

    Luz, Eduardo José da S; Schwartz, William Robson; Cámara-Chávez, Guillermo; Menotti, David

    2016-04-01

    An electrocardiogram (ECG) measures the electric activity of the heart and has been widely used for detecting heart diseases due to its simplicity and non-invasive nature. By analyzing the electrical signal of each heartbeat, i.e., the combination of action impulse waveforms produced by different specialized cardiac tissues found in the heart, it is possible to detect some of its abnormalities. In the last decades, several works were developed to produce automatic ECG-based heartbeat classification methods. In this work, we survey the current state-of-the-art methods of ECG-based automated abnormalities heartbeat classification by presenting the ECG signal preprocessing, the heartbeat segmentation techniques, the feature description methods and the learning algorithms used. In addition, we describe some of the databases used for evaluation of methods indicated by a well-known standard developed by the Association for the Advancement of Medical Instrumentation (AAMI) and described in ANSI/AAMI EC57:1998/(R)2008 (ANSI/AAMI, 2008). Finally, we discuss limitations and drawbacks of the methods in the literature presenting concluding remarks and future challenges, and also we propose an evaluation process workflow to guide authors in future works. PMID:26775139

  3. Proposed classification of medial maxillary labial frenum based on morphology

    Directory of Open Access Journals (Sweden)

    Ranjana Mohan

    2014-01-01

    Objectives: To propose a new classification of the median maxillary labial frenum (MMLF) based on morphology in the permanent dentition, by conducting a cross-sectional survey. Materials and Methods: A unicentric study was conducted on 2,400 adults (1,414 males, 986 females) aged between 18 and 76 years, with mean age = 38.62 years and standard deviation (SD) = 12.53. Male mean age = 38.533 years and male SD = 12.498; female mean age = 38.71 years and female SD = 12.5750. The study ran for a period of 6 months at Teerthanker Mahaveer University, Moradabad, Northern India. The frenum morphology was determined by using the direct visual method under natural light and categorized. Results: Diverse frenum morphologies were observed. Several variations found in the study have not been documented in the past literature and were named and classified according to their morphology. Discussion: The MMLF presents a diverse array of morphological variations. Several other undocumented types of frena were observed, and a revised, detailed classification has been proposed based on the cross-sectional survey.

  4. A Cluster Based Approach for Classification of Web Results

    Directory of Open Access Journals (Sweden)

    Apeksha Khabia

    2014-12-01

    Nowadays a significant amount of information on the web is present in the form of text, e.g., reviews, forum postings, blogs, news articles, email messages, and web pages. It becomes difficult to classify documents into predefined categories as the number of documents grows. Clustering is the classification of data into clusters, so that the data in each cluster share some common trait, often vicinity according to some defined measure. The underlying distribution of a data set can to some extent be depicted by the clusters learned under the guidance of the initial data set. Thus, clusters of documents can be employed to train the classifier by using defined features of those clusters. Another important issue is to classify the text data from the web into different clusters by mining the knowledge. Accordingly, this paper presents a review of most of the document clustering techniques and cluster-based classification techniques used so far. Pre-processing of text datasets and document clustering methods are also explained in brief.

  5. Understanding Acupuncture Based on ZHENG Classification from System Perspective

    Directory of Open Access Journals (Sweden)

    Junwei Fang

    2013-01-01

    Acupuncture is an efficient therapy method that originated in ancient China, and studying it based on ZHENG classification is a systematic way to understand its complexity. The system perspective contributes to understanding the essence of phenomena, and, with the coming of the systems biology era, broader technology platforms such as omics technologies have been established for the objective study of traditional Chinese medicine (TCM). Omics technologies can dynamically determine molecular components at various levels, which could achieve a systematic understanding of acupuncture by finding the relationships among various response parts. After reviewing the literature on acupuncture studied by omics approaches, the following points were found. Firstly, with the help of omics approaches, acupuncture was found to be able to treat diseases by regulating the neuroendocrine immune (NEI) network, the change of which could reflect the global effect of acupuncture. Secondly, the global effect of acupuncture could reflect ZHENG information at certain structure and function levels, which might reveal the mechanism of Meridian and Acupoint Specificity. Furthermore, based on comprehensive ZHENG classification, omics research could help us understand the action characteristics of acupoints and the molecular mechanisms of their synergistic effect.

  6. Robust Pedestrian Classification Based on Hierarchical Kernel Sparse Representation.

    Science.gov (United States)

    Sun, Rui; Zhang, Guanghai; Yan, Xiaoxing; Gao, Jun

    2016-01-01

    Vision-based pedestrian detection has become an active topic in computer vision and autonomous vehicles. It aims at detecting pedestrians appearing ahead of the vehicle using a camera so that autonomous vehicles can assess the danger and take action. Due to varied illumination and appearance, complex backgrounds and occlusion, pedestrian detection in outdoor environments is a difficult problem. In this paper, we propose a novel hierarchical feature extraction and weighted kernel sparse representation model for pedestrian classification. Initially, hierarchical feature extraction based on a CENTRIST descriptor is used to capture discriminative structures. A max pooling operation is used to enhance the invariance to varying appearance. Then, a kernel sparse representation model is proposed to fully exploit the discrimination information embedded in the hierarchical local features, and a Gaussian weight function is used as the measure to effectively handle the occlusion in pedestrian images. Extensive experiments are conducted on benchmark databases, including INRIA, Daimler, an artificially generated dataset and a real occluded dataset, demonstrating the more robust performance of the proposed method compared to state-of-the-art pedestrian classification methods. PMID:27537888

  7. Pixel classification based color image segmentation using quaternion exponent moments.

    Science.gov (United States)

    Wang, Xiang-Yang; Wu, Zhi-Fang; Chen, Liang; Zheng, Hong-Liang; Yang, Hong-Ying

    2016-02-01

    Image segmentation remains an important, but hard-to-solve, problem since it appears to be application dependent with usually no a priori information available regarding the image structure. In recent years, many image segmentation algorithms have been developed, but they are often very complex and some undesired results occur frequently. In this paper, we propose a pixel classification based color image segmentation using quaternion exponent moments. Firstly, the pixel-level image feature is extracted based on quaternion exponent moments (QEMs), which can capture effectively the image pixel content by considering the correlation between different color channels. Then, the pixel-level image feature is used as input of twin support vector machines (TSVM) classifier, and the TSVM model is trained by selecting the training samples with Arimoto entropy thresholding. Finally, the color image is segmented with the trained TSVM model. The proposed scheme has the following advantages: (1) the effective QEMs is introduced to describe color image pixel content, which considers the correlation between different color channels, (2) the excellent TSVM classifier is utilized, which has lower computation time and higher classification accuracy. Experimental results show that our proposed method has very promising segmentation performance compared with the state-of-the-art segmentation approaches recently proposed in the literature. PMID:26618250

  8. Classification of Breast Cancer Subtypes by combining Gene Expression and DNA Methylation Data

    DEFF Research Database (Denmark)

    List, Markus; Hauschild, Anne-Christin; Tan, Qihua;

    2014-01-01

    Selecting the most promising treatment strategy for breast cancer crucially depends on determining the correct subtype. In recent years, gene expression profiling has been investigated as an alternative to histochemical methods. Since databases like TCGA provide easy and unrestricted access to gene......-20% and classification error of 1-50%, depending on breast cancer subtype and model. The gene expression model was clearly superior to the methylation model, which was also reflected in the combined model, which mainly selected features from gene expression data. However, the methylation model was able to identify...... unique features not considered as relevant by the gene expression model, which might provide deeper insights into breast cancer subtype differentiation on an epigenetic level....

  9. Target Image Classification through Encryption Algorithm Based on the Biological Features

    OpenAIRE

    Zhiwu Chen; Qing E. Wu; Weidong Yang

    2014-01-01

    In order to effectively make biological image classification and identification, this paper studies the biological owned characteristics, gives an encryption algorithm, and presents a biological classification algorithm based on the encryption process. Through studying the composition characteristics of palm, this paper uses the biological classification algorithm to carry out the classification or recognition of palm, improves the accuracy and efficiency of the existing biological classifica...

  10. Rainfall Prediction using Data-Core Based Fuzzy Min-Max Neural Network for Classification

    OpenAIRE

    Rajendra Palange; Nishikant Pachpute

    2015-01-01

    This paper proposes a Rainfall Prediction System using a classification technique. The advanced and modified neural network called Data Core Based Fuzzy Min Max Neural Network (DCFMNN) is used for pattern classification. This classification method is applied to predict rainfall. The fuzzy min max neural network (FMNN), which creates hyperboxes for classification and prediction, has a problem of overlapping neurons, which is resolved in DCFMNN to give greater accu...

  11. An Assessment of Case Base Reasoning for Short Text Message Classification

    OpenAIRE

    Healy, Matt, (Thesis); Delany, Sarah Jane; Zamolotskikh, Anton

    2004-01-01

    Message classification is a text classification task that has provoked much interest in machine learning. One aspect of message classification that presents a particular challenge is the classification of short text messages. This paper presents an assessment of applying a case based approach that was developed for long text messages (specifically spam filtering) to short text messages. The evaluation involves determining the most appropriate feature types and feature representation for short...

  12. Cancer Biochemistry and Host-Tumor Interactions: A Decimal Classification, (Categories 51.6, 51.7, and 51.8).

    Science.gov (United States)

    Schneider, John H.

    This is a hierarchical decimal classification of information related to cancer biochemistry, to host-tumor interactions (including cancer immunology), and to occurrence of cancer in special types of animals and plants. It is a working draft of categories taken from an extensive classification of many fields of biomedical information. Because the…

  13. Utilizing ECG-Based Heartbeat Classification for Hypertrophic Cardiomyopathy Identification.

    Science.gov (United States)

    Rahman, Quazi Abidur; Tereshchenko, Larisa G; Kongkatong, Matthew; Abraham, Theodore; Abraham, M Roselle; Shatkay, Hagit

    2015-07-01

    Hypertrophic cardiomyopathy (HCM) is a cardiovascular disease where the heart muscle is partially thickened and blood flow is (potentially fatally) obstructed. A test based on electrocardiograms (ECG) that record the heart electrical activity can help in early detection of HCM patients. This paper presents a cardiovascular-patient classifier we developed to identify HCM patients using standard 10-second, 12-lead ECG signals. Patients are classified as having HCM if the majority of their recorded heartbeats are recognized as characteristic of HCM. Thus, the classifier's underlying task is to recognize individual heartbeats segmented from 12-lead ECG signals as HCM beats, where heartbeats from non-HCM cardiovascular patients are used as controls. We extracted 504 morphological and temporal features—both commonly used and newly-developed ones—from ECG signals for heartbeat classification. To assess classification performance, we trained and tested a random forest classifier and a support vector machine classifier using 5-fold cross validation. We also compared the performance of these two classifiers to that obtained by a logistic regression classifier, and the first two methods performed better than logistic regression. The patient-classification precision of random forests and of support vector machine classifiers is close to 0.85. Recall (sensitivity) and specificity are approximately 0.90. We also conducted feature selection experiments by gradually removing the least informative features; the results show that a relatively small subset of 264 highly informative features can achieve performance measures comparable to those achieved by using the complete set of features. PMID:25915962
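
    The patient-level decision rule and the reported measures can be sketched as follows. The beat labels stand in for the output of a beat-level classifier, which is not shown; the patient identifiers, label strings and the choice to treat ties as non-HCM are assumptions for illustration.

```python
def classify_patient(beat_labels):
    """Label a patient HCM when strictly more than half of the heartbeats
    are recognized as HCM beats. Treating ties as non-HCM is an assumption."""
    hcm_beats = sum(1 for label in beat_labels if label == "HCM")
    return "HCM" if 2 * hcm_beats > len(beat_labels) else "non-HCM"

def patient_metrics(true_labels, predicted_labels):
    """Precision, recall (sensitivity) and specificity with HCM as positive.
    Assumes each denominator is non-zero; a toy helper, not a library API."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == "HCM" and p == "HCM" for t, p in pairs)
    fp = sum(t != "HCM" and p == "HCM" for t, p in pairs)
    fn = sum(t == "HCM" and p != "HCM" for t, p in pairs)
    tn = sum(t != "HCM" and p != "HCM" for t, p in pairs)
    return {
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Hypothetical beat-level predictions for three patients.
beats = {
    "p1": ["HCM", "HCM", "control"],
    "p2": ["control", "control", "HCM"],
    "p3": ["HCM", "control"],
}
truth = {"p1": "HCM", "p2": "non-HCM", "p3": "HCM"}

predicted = {pid: classify_patient(labels) for pid, labels in beats.items()}
metrics = patient_metrics([truth[p] for p in beats],
                          [predicted[p] for p in beats])
print(predicted, metrics)
```

    Voting over beats makes the patient-level label tolerant of a few misclassified heartbeats, at the cost of needing a tie-breaking convention for recordings with an even number of beats.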

  14. Classification of prostate cancer grade using temporal ultrasound: in vivo feasibility study

    Science.gov (United States)

    Ghavidel, Sahar; Imani, Farhad; Khallaghi, Siavash; Gibson, Eli; Khojaste, Amir; Gaed, Mena; Moussa, Madeleine; Gomez, Jose A.; Siemens, D. Robert; Leveridge, Michael; Chang, Silvia; Fenster, Aaron; Ward, Aaron D.; Abolmaesumi, Purang; Mousavi, Parvin

    2016-03-01

    Temporal ultrasound has been shown to have high classification accuracy in differentiating cancer from benign tissue. In this paper, we extend the temporal ultrasound method to classify lower grade Prostate Cancer (PCa) from all other grades. We use a group of nine patients with mostly lower grade PCa, where cancerous regions are also limited. A critical challenge is to train a classifier with limited aggressive cancerous tissue compared to low grade cancerous tissue. To resolve the problem of imbalanced data, we use Synthetic Minority Oversampling Technique (SMOTE) to generate synthetic samples for the minority class. We calculate spectral features of temporal ultrasound data and perform feature selection using Random Forests. In leave-one-patient-out cross-validation strategy, an area under receiver operating characteristic curve (AUC) of 0.74 is achieved with overall sensitivity and specificity of 70%. Using an unsupervised learning approach prior to proposed method improves sensitivity and AUC to 80% and 0.79. This work represents promising results to classify lower and higher grade PCa with limited cancerous training samples, using temporal ultrasound.
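
    The oversampling idea behind SMOTE can be sketched in a few lines: each synthetic sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbors. The function name, parameters and data below are assumptions for illustration and do not reproduce the authors' pipeline or any particular library's API.

```python
import random

def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority samples by interpolating between a
    randomly chosen sample and one of its k nearest minority neighbors.
    Requires at least two minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic

# Hypothetical two-dimensional spectral features for the minority class.
minority = [(1.0, 1.0), (1.5, 1.2), (2.0, 0.8), (1.2, 1.9)]
new_samples = smote(minority, n_new=5)
print(len(new_samples))
```

    In a leave-one-patient-out setting, oversampling would normally be applied to the training folds only, so that synthetic points derived from the held-out patient never reach the classifier.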

  15. Classification tree analysis of second neoplasms in survivors of childhood cancer

    International Nuclear Information System (INIS)

    Reported estimates of the cumulative probability of developing secondary neoplasms in childhood cancer survivors vary from 3.3% to 25% at 25 years from diagnosis, and the risk of developing another cancer is several times greater than in the general population. In our retrospective study, we used the classification tree multivariate method on a group of 849 first cancer survivors to identify childhood cancer patients with the greatest risk of developing secondary neoplasms. In the observed group of patients, 34 developed a secondary neoplasm after treatment of the primary cancer. Analysis of parameters present at the treatment of the first cancer revealed two groups of patients at special risk for a secondary neoplasm. The first group comprises female patients treated for Hodgkin's disease between the ages of 10 and 15 years whose treatment included radiotherapy. The second group at special risk comprises male patients with acute lymphoblastic leukemia who were treated between the ages of 4.6 and 6.6 years. The risk groups identified in our study are similar to the results of studies that used more conventional approaches. The usefulness of our approach in studying the occurrence of second neoplasms should be confirmed in a larger sample, but the user-friendly presentation of results makes it attractive for further studies.

  16. Hyperspectral image classification based on spatial and spectral features and sparse representation

    Institute of Scientific and Technical Information of China (English)

    Yang Jing-Hui; Wang Li-Guo; Qian Jin-Xi

    2014-01-01

    To address the low classification accuracy and low utilization of spatial information in traditional hyperspectral image classification methods, we propose a new hyperspectral image classification method based on Gabor spatial texture features, nonparametric weighted spectral features, and the sparse representation classification method (Gabor–NWSF and SRC), abbreviated GNWSF–SRC. The proposed GNWSF–SRC method first combines the Gabor spatial features and nonparametric weighted spectral features to describe the hyperspectral image, and then applies the sparse representation method. Finally, the classification is obtained by analyzing the reconstruction error. We use the proposed method to process two typical hyperspectral data sets with different percentages of training samples. Theoretical analysis and simulation demonstrate that the proposed method improves the classification accuracy and Kappa coefficient compared with traditional classification methods and achieves better classification performance.

  17. Classification of EMG Signal Based on Human Percentile using SOM

    Directory of Open Access Journals (Sweden)

    M.H. Jali

    2014-07-01

    Electromyography (EMG) is a bio-signal formed by physiological variations in the state of muscle fibre membranes. Pattern recognition is one of the fields in bio-signal processing that classifies signals into desired categories according to their area of application. This study describes the classification of EMG signals based on human body percentile using the Self-Organizing Map (SOM) technique. Arm circumference varies with human percentile because of the fatty tissue that lies between the active muscle and the skin. Generally, the fatty tissue decreases the overall amplitude of the EMG signal. Data were collected randomly from fifteen subjects of various percentiles using a non-invasive technique at the biceps brachii muscle. The signals were then filtered to prepare them for the next stage. Five well-known time-domain feature extraction methods were applied to the signals before classification. The SOM technique was used as a classifier to discriminate between the human percentiles. The results show that SOM is capable of clustering the EMG signals into the desired human percentile categories by optimizing the neurons of the technique.

  18. Neighborhood Hypergraph Based Classification Algorithm for Incomplete Information System

    Directory of Open Access Journals (Sweden)

    Feng Hu

    2015-01-01

    Classification in incomplete information systems is a hot issue in intelligent information processing. The hypergraph is a new intelligent method for machine learning. However, it is hard to process an incomplete information system with the traditional hypergraph, for two reasons: (1) the hyperedges are generated randomly in the traditional hypergraph model; (2) the existing methods are unsuitable for incomplete information systems because of their missing values. In this paper, we propose a novel classification algorithm for incomplete information systems based on the hypergraph model and rough set theory. First, we initialize the hypergraph. Second, we classify the training set by neighborhood hypergraph. Third, under the guidance of rough set theory, we replace the poor hyperedges. After that, we obtain a good classifier. The proposed approach is tested on 15 data sets from the UCI machine learning repository. Furthermore, it is compared with some existing methods, such as C4.5, SVM, Naive Bayes, and KNN. The experimental results show that the proposed algorithm performs better in terms of Precision, Recall, AUC, and F-measure.

  19. Texture-Based Automated Lithological Classification Using Aeromagnetic Anomaly Images

    Science.gov (United States)

    Shankar, Vivek

    2009-01-01

    This report consists of a thesis submitted to the faculty of the Department of Electrical and Computer Engineering, in partial fulfillment of the requirements for the degree of Master of Science, Graduate College, The University of Arizona, 2004. Aeromagnetic anomaly images are geophysical prospecting tools frequently used in the exploration of metalliferous minerals and hydrocarbons. The amplitude and texture content of these images provide a wealth of information to geophysicists who attempt to delineate the nature of the Earth's upper crust. These images prove to be extremely useful in remote areas and locations where the minerals of interest are concealed by basin fill. Typically, geophysicists compile a suite of aeromagnetic anomaly images, derived from amplitude and texture measurement operations, in order to obtain a qualitative interpretation of the lithological (rock) structure. Texture measures have proven to be especially capable of capturing the magnetic anomaly signature of unique lithological units. We performed a quantitative study to explore the possibility of using texture measures as input to a machine vision system in order to achieve automated classification of lithological units. This work demonstrated a significant improvement in classification accuracy over random guessing based on a priori probabilities. Additionally, a quantitative comparison was made of the ability of five classes of texture measures to discriminate lithological units.

  20. Classification of chronic obstructive pulmonary disease based on chest radiography

    Directory of Open Access Journals (Sweden)

    Leilane Marcos

    2013-12-01

    Objective: Quantitative analysis of chest radiographs of patients with and without chronic obstructive pulmonary disease (COPD), determining whether the data obtained from such radiographic images could classify such individuals according to the presence or absence of disease. Materials and Methods: Three groups of chest radiographic images were utilized: group 1, including 25 individuals with COPD; group 2, including 27 individuals without COPD; and group 3 (utilized for the reclassification/validation of the analysis), including 15 individuals with COPD. The COPD classification was based on spirometry. The variables normalized by retrosternal height were the following: pulmonary width (LARGP); levels of right (ALBDIR) and left (ALBESQ) diaphragmatic eventration; costophrenic angle (ANGCF); and right (DISDIR) and left (DISESQ) intercostal distances. Results: When the radiographic images of patients with and without COPD were compared, statistically significant differences were observed between the two groups on the variables related to the diaphragm. In the COPD reclassification, the following variables presented the highest indices of correct classification: ANGCF (80%), ALBDIR (73.3%), ALBESQ (86.7%). Conclusion: The radiographic assessment of the chest demonstrated that the variables related to the diaphragm allow a better differentiation between individuals with and without COPD.

  1. Classification of knee arthropathy with accelerometer-based vibroarthrography.

    Science.gov (United States)

    Moreira, Dinis; Silva, Joana; Correia, Miguel V; Massada, Marta

    2016-01-01

    Osteoarthritis, one of the most common knee joint disorders, results from the progressive degeneration of cartilage and subchondral bone over time and mainly affects elderly adults. Current evaluation techniques are either complex, expensive, invasive or simply fail to detect the small, progressive changes that occur within the knee. Vibroarthrography has appeared as a new solution in which the mechanical vibratory signals arising from the knee are recorded using only an accelerometer and subsequently analyzed, enabling differentiation between a healthy and an arthritic joint. In this study, a vibration-based classification system was created using a dataset with 92 healthy and 120 arthritic segments of knee joint signals collected from 19 healthy and 20 arthritic volunteers, evaluated with k-nearest neighbors and support vector machine classifiers. The best classification was obtained using the k-nearest neighbors classifier with only 6 time-frequency features, with an overall accuracy of 89.8% and a precision, recall and f-measure of 88.3%, 92.4% and 90.1%, respectively. Preliminary results showed that vibroarthrography can be a promising, non-invasive and low-cost tool that could be used for screening purposes. Despite these encouraging results, several improvements to the data collection process and analysis can still be implemented. PMID:27225550
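
    For readers unfamiliar with the metrics quoted in this abstract, precision, recall and F-measure for a binary classifier can be computed from true and predicted labels as sketched below. This uses the standard textbook definitions on made-up toy labels; it does not reproduce the study's data, and the abstract does not state how its reported figures were aggregated (for example, across cross-validation folds), so the function name and example values are purely illustrative.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, f1) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-measure (F1) is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy labels: 1 = arthritic segment, 0 = healthy segment (illustrative only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```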

  2. Product Image Classification Based on Fusion Features

    Institute of Scientific and Technical Information of China (English)

    YANG Xiao-hui; LIU Jing-jing; YANG Li-jun

    2015-01-01

    Two key challenges raised by a product image classification system are classification precision and classification time. In some categories, the classification precision of the latest techniques is still low. In this paper, we propose a local texture descriptor termed fan refined local binary pattern, which captures more detailed information by integrating the spatial distribution into the local binary pattern feature. We compare our approach with different methods on a subset of product images from Amazon/eBay and parts of PI100, and the experimental results demonstrate that our proposed approach is superior to existing methods. The highest classification precision is increased by 21% and the average classification time is reduced by 2/3.

  3. A Method of Soil Salinization Information Extraction with SVM Classification Based on ICA and Texture Features

    Institute of Scientific and Technical Information of China (English)

    ZHANG Fei; TASHPOLAT Tiyip; KUNG Hsiang-te; DING Jian-li; MAMAT.Sawut; VERNER Johnson; HAN Gui-hong; GUI Dong-wei

    2011-01-01

    Salt-affected soils classification using remotely sensed images is one of the most common applications in remote sensing, and many algorithms have been developed and applied for this purpose in the literature. This study takes the Delta Oasis of Weigan and Kuqa Rivers as a study area and discusses the prediction of soil salinization from ETM+ Landsat data. It reports the Support Vector Machine (SVM) classification method based on Independent Component Analysis (ICA) and texture features. Meanwhile, the letter introduces the fundamental theory of the SVM algorithm and ICA, and then incorporates ICA and texture features. The classification result is compared with ICA-SVM classification, single data source SVM classification, maximum likelihood classification (MLC) and neural network classification qualitatively and quantitatively. The result shows that this method can effectively solve the problem of low accuracy and fractured classification results in single data source classification. It has high spread ability toward higher array input. The overall accuracy is 98.64%, which is 10.2% higher than maximum likelihood classification and 12.94% higher than neural network classification, and thus achieves good effectiveness. Therefore, the classification method based on SVM and incorporating the ICA and texture features can be adapted to RS image classification and monitoring of soil salinization.

  4. Gene selection in class space for molecular classification of cancer

    Institute of Scientific and Technical Information of China (English)

    ZHANG Junying; Yue Joseph WANG; Javed KHAN; Robert CLARKE

    2004-01-01

    Gene selection (feature selection) is generally performed in gene space (feature space), where a very serious curse of dimensionality problem always exists because the number of genes is much larger than the number of samples in gene space (G-space). This makes it difficult to model the data set in this space and lowers confidence in the result of gene selection. How to find a gene subset in this case is a challenging subject. In this paper, the above G-space is transformed into its dual space, referred to as class space (C-space), such that the number of dimensions is the number of classes of the samples in G-space and the number of samples in C-space is the number of genes in G-space. The curse of dimensionality therefore does not arise in C-space. A new gene selection method, based on the principle of separating different classes as far as possible, is presented with the help of Principal Component Analysis (PCA). The experimental results of gene selection on a real data set are evaluated with the Fisher criterion, the weighted Fisher criterion and leave-one-out cross validation, showing that the method presented here is effective and efficient.

  5. Genetic Bee Colony (GBC) algorithm: A new gene selection method for microarray cancer classification.

    Science.gov (United States)

    Alshamlan, Hala M; Badr, Ghada H; Alohali, Yousef A

    2015-06-01

    Naturally inspired evolutionary algorithms have proved effective for solving feature selection and classification problems. Artificial Bee Colony (ABC) is a relatively new swarm intelligence method. In this paper, we propose a new hybrid gene selection method, namely the Genetic Bee Colony (GBC) algorithm. The proposed algorithm combines the use of a Genetic Algorithm (GA) with the Artificial Bee Colony (ABC) algorithm. The goal is to integrate the advantages of both algorithms. The proposed algorithm is applied to a microarray gene expression profile in order to select the most predictive and informative genes for cancer classification. In order to test the accuracy of the proposed algorithm, extensive experiments were conducted. Three binary microarray datasets are used: colon, leukemia, and lung. In addition, another three multi-class microarray datasets are used: SRBCT, lymphoma, and leukemia. Results of the GBC algorithm are compared with our recently proposed technique: mRMR combined with the Artificial Bee Colony algorithm (mRMR-ABC). We also compared the combination of mRMR with GA (mRMR-GA) and Particle Swarm Optimization (mRMR-PSO) algorithms. In addition, we compared the GBC algorithm with other related algorithms that have recently been published in the literature, using all benchmark datasets. The GBC algorithm shows superior performance, as it achieved the highest classification accuracy along with the lowest average number of selected genes. This indicates that the GBC algorithm is a promising approach for solving the gene selection problem in both binary and multi-class cancer classification. PMID:25880524

  6. Radiological classification of renal angiomyolipomas based on 127 tumors

    Directory of Open Access Journals (Sweden)

    Prando Adilson

    2003-01-01

    Purpose: Demonstrate radiological findings of 127 angiomyolipomas (AMLs) and propose a classification based on the radiological evidence of fat. Materials and Methods: The imaging findings of 85 consecutive patients with AMLs: isolated (n = 73), multiple without tuberous sclerosis (TS) (n = 4) and multiple with TS (n = 8), were retrospectively reviewed. Eighteen AMLs (14%) presented with hemorrhage. All patients were submitted to a dedicated helical CT or magnetic resonance studies. All hemorrhagic and non-hemorrhagic lesions were grouped together since our objective was to analyze the presence of detectable fat. Out of 85 patients, 53 were monitored and 32 were treated surgically due to large perirenal component (n = 13), hemorrhage (n = 11) and impossibility of an adequate preoperative characterization (n = 8). There was not a case of renal cell carcinoma (RCC) with fat component in this group of patients. Results: Based on the presence and amount of detectable fat within the lesion, AMLs were classified in 4 distinct radiological patterns: Pattern-I, predominantly fatty (usually less than 2 cm in diameter and intrarenal): 54%; Pattern-II, partially fatty (intrarenal or exophytic): 29%; Pattern-III, minimally fatty (most exophytic and perirenal): 11%; and Pattern-IV, without fat (most exophytic and perirenal): 6%. Conclusions: This proposed classification might be useful to understand the imaging manifestations of AMLs, their differential diagnosis and determine when further radiological evaluation would be necessary. Small (< 1.5 cm), pattern-I AMLs tend to be intra-renal, homogeneous and predominantly fatty. As they grow they tend to be partially or completely exophytic and heterogeneous (patterns II and III). The rare pattern-IV AMLs, however, can be small or large, intra-renal or exophytic but are always homogeneous and hyperdense mass. Since no renal cell carcinoma was found in our series, from an evidence-based practice, all renal mass with detectable

  7. Radiological classification of renal angiomyolipomas based on 127 tumors

    Energy Technology Data Exchange (ETDEWEB)

    Prando, Adilson [Hospital Vera Cruz, Campinas, SP (Brazil). Dept. de Radiologia]. E-mail: aprando@mpc.com.br

    2003-05-15

    Purpose: Demonstrate radiological findings of 127 angiomyolipomas (AMLs) and propose a classification based on the radiological evidence of fat. Materials and Methods: The imaging findings of 85 consecutive patients with AMLs: isolated (n = 73), multiple without tuberous sclerosis (TS) (n = 4) and multiple with TS (n = 8), were retrospectively reviewed. Eighteen AMLs (14%) presented with hemorrhage. All patients were submitted to a dedicated helical CT or magnetic resonance studies. All hemorrhagic and non-hemorrhagic lesions were grouped together since our objective was to analyze the presence of detectable fat. Out of 85 patients, 53 were monitored and 32 were treated surgically due to large perirenal component (n = 13), hemorrhage (n = 11) and impossibility of an adequate preoperative characterization (n = 8). There was not a case of renal cell carcinoma (RCC) with fat component in this group of patients. Results: Based on the presence and amount of detectable fat within the lesion, AMLs were classified in 4 distinct radiological patterns: Pattern-I, predominantly fatty (usually less than 2 cm in diameter and intrarenal): 54%; Pattern-II, partially fatty (intrarenal or exophytic): 29%; Pattern-III, minimally fatty (most exophytic and perirenal): 11%; and Pattern-IV, without fat (most exophytic and perirenal): 6%. Conclusions: This proposed classification might be useful to understand the imaging manifestations of AMLs, their differential diagnosis and determine when further radiological evaluation would be necessary. Small (< 1.5 cm), pattern-I AMLs tend to be intra-renal, homogeneous and predominantly fatty. As they grow they tend to be partially or completely exophytic and heterogeneous (patterns II and III). The rare pattern-IV AMLs, however, can be small or large, intra-renal or exophytic but are always homogeneous and hyperdense mass. Since no renal cell carcinoma was found in our series, from an evidence-based practice, all renal mass with

  8. Radiological classification of renal angiomyolipomas based on 127 tumors

    International Nuclear Information System (INIS)

    Purpose: Demonstrate radiological findings of 127 angiomyolipomas (AMLs) and propose a classification based on the radiological evidence of fat. Materials and Methods: The imaging findings of 85 consecutive patients with AMLs: isolated (n = 73), multiple without tuberous sclerosis (TS) (n = 4) and multiple with TS (n = 8), were retrospectively reviewed. Eighteen AMLs (14%) presented with hemorrhage. All patients were submitted to a dedicated helical CT or magnetic resonance studies. All hemorrhagic and non-hemorrhagic lesions were grouped together since our objective was to analyze the presence of detectable fat. Out of 85 patients, 53 were monitored and 32 were treated surgically due to large perirenal component (n = 13), hemorrhage (n = 11) and impossibility of an adequate preoperative characterization (n = 8). There was not a case of renal cell carcinoma (RCC) with fat component in this group of patients. Results: Based on the presence and amount of detectable fat within the lesion, AMLs were classified in 4 distinct radiological patterns: Pattern-I, predominantly fatty (usually less than 2 cm in diameter and intrarenal): 54%; Pattern-II, partially fatty (intrarenal or exophytic): 29%; Pattern-III, minimally fatty (most exophytic and perirenal): 11%; and Pattern-IV, without fat (most exophytic and perirenal): 6%. Conclusions: This proposed classification might be useful to understand the imaging manifestations of AMLs, their differential diagnosis and determine when further radiological evaluation would be necessary. Small (< 1.5 cm), pattern-I AMLs tend to be intra-renal, homogeneous and predominantly fatty. As they grow they tend to be partially or completely exophytic and heterogeneous (patterns II and III). The rare pattern-IV AMLs, however, can be small or large, intra-renal or exophytic but are always homogeneous and hyperdense mass. Since no renal cell carcinoma was found in our series, from an evidence-based practice, all renal mass with

  9. Automated ancillary cancer history classification for mesothelioma patients from free-text clinical reports

    Directory of Open Access Journals (Sweden)

    Richard A Wilson

    2010-01-01

    Background: Clinical records are often unstructured, free-text documents that create information extraction challenges and costs. Healthcare delivery and research organizations, such as the National Mesothelioma Virtual Bank, require the aggregation of both structured and unstructured data types. Natural language processing offers techniques for automatically extracting information from unstructured, free-text documents. Methods: Five hundred and eight history and physical reports from mesothelioma patients were split into development (208) and test (300) sets. A reference standard was developed and each report was annotated by experts with regard to the patient's personal history of ancillary cancer and family history of any cancer. The Hx application was developed to process reports, extract relevant features, perform reference resolution and classify them with regard to cancer history. Two methods for extracting information, Dynamic-Window and ConText, were evaluated. Hx's classification responses using each of the two methods were measured against the reference standard. The average Cohen's weighted kappa served as the human benchmark in evaluating the system. Results: Hx had a high overall accuracy, scoring 96.2% with each method. F-measures using the Dynamic-Window and ConText methods were 91.8% and 91.6%, which were comparable to the human benchmark of 92.8%. For the personal history classification, Dynamic-Window scored highest with 89.2%, and for the family history classification, ConText scored highest with 97.6%; both were comparable to the human benchmarks of 88.3% and 97.2%, respectively. Conclusion: We evaluated an automated application's performance in classifying a mesothelioma patient's personal and family history of cancer from clinical reports. To do so, the Hx application must process reports, identify cancer concepts, distinguish the known mesothelioma from ancillary cancers, recognize negation

  10. Classification and thermal history of petroleum based on light hydrocarbons

    Science.gov (United States)

    Thompson, K. F. M.

    1983-02-01

    Classifications of oils and kerogens are described. Two indices are employed, termed the Heptane and Isoheptane Values, based on analyses of gasoline-range hydrocarbons. The indices assess degree of paraffinicity and allow the definition of four types of oil: normal, mature, supermature, and biodegraded. The values of these indices measured in sediment extracts are a function of maximum attained temperature and of kerogen type. Aliphatic and aromatic kerogens are definable. Only the extracts of sediments bearing aliphatic kerogens having a specific thermal history are identical to the normal oils which form the largest group (41%) in the sample set. This group was evidently generated at subsurface temperatures of the order of 138°-149°C (280°-300°F), defined under specific conditions of burial history. It is suggested that all other petroleums are transformation products of normal oils.

  11. MICROWAVE BASED CLASSIFICATION OF MATERIAL USING NEURAL NETWORK

    Directory of Open Access Journals (Sweden)

    Anil H. Soni

    2011-07-01

    Microwave radar has emerged as a useful tool in many remote sensing applications, including material classification, target detection and shape extraction. In this paper, we present a method to classify materials based on their dielectric characteristics. Microwave radar in the X-band range is used to scan targets made of various materials, such as acrylic, metal and wood, in free space. Depending on their respective electromagnetic properties, reflections from each target are measured and a radar image is obtained. Various features, such as energy, entropy, normalized sum of image intensity and standard deviation, are then extracted and fed to a feedforward multilayer perceptron classifier, which determines whether the material is dielectric or non-dielectric (metallic). Results show good performance.

  12. About Classification Methods Based on Tensor Modelling for Hyperspectral Images

    Directory of Open Access Journals (Sweden)

    Salah Bourennane

    2010-03-01

    Denoising and dimensionality reduction (DR) are key issues in improving classifier efficiency for hyperspectral images (HSI). The recently developed multi-way Wiener filtering is used, and principal and independent component analysis (PCA; ICA) and projection pursuit (PP) approaches to DR have been investigated. These matrix algebra methods are applied to vectorized images, so the spatial arrangement is lost. To jointly take advantage of the spatial and spectral information, HSI has recently been represented as a tensor. Because multilinear algebra offers multiple ways to decompose data orthogonally, we introduce filtering and DR methods based on multilinear algebra tools. The DR is performed on the spectral way using PCA or PP, jointly with an orthogonal projection onto a lower-dimensional subspace of the spatial ways. We show the classification improvement of the introduced methods relative to existing methods. The experiments use real-world HYDICE data. Keywords: multi-way filtering, dimensionality reduction, matrix and multilinear algebra tools, tensor processing.

  13. [Galaxy/quasar classification based on nearest neighbor method].

    Science.gov (United States)

    Li, Xiang-Ru; Lu, Yu; Zhou, Jian-Ming; Wang, Yong-Jun

    2011-09-01

    With the wide application of high-quality CCDs in celestial spectrum imagery and the implementation of many large sky survey programs (e.g., the Sloan Digital Sky Survey (SDSS), the Two-degree-Field Galaxy Redshift Survey (2dF), the Spectroscopic Survey Telescope (SST), the Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) program and the Large Synoptic Survey Telescope (LSST) program), celestial observational data are arriving in a torrent. Therefore, to utilize them effectively and fully, research on automated processing methods for celestial data is imperative. In the present work, we investigated how to recognize galaxies and quasars from spectra based on the nearest neighbor method. Galaxies and quasars are extragalactic objects; they are far away from Earth, and their spectra are usually contaminated by various kinds of noise. Therefore, recognizing these two types of spectra is a typical problem in automatic spectra classification. Furthermore, the method used, nearest neighbor, is one of the most typical, classic and mature algorithms in pattern recognition and data mining, and is often used as a benchmark in developing novel algorithms. Regarding practical applicability, it is shown that the recognition ratio of the nearest neighbor method (NN) is comparable to the best results reported in the literature based on more complicated methods. The advantage of NN is that it does not need to be trained, which is useful in incremental learning and parallel computation in mass spectral data processing. In conclusion, the results in this work are helpful for studying the classification of galaxy and quasar spectra. PMID:22097877
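
    The nearest neighbor rule discussed in this abstract assigns a query spectrum the label of the closest stored example, which is why no training step is required. The sketch below illustrates this with the Python standard library; the Euclidean distance, the three-value toy "spectra" and the function name `nearest_neighbor_predict` are assumptions for illustration, since the abstract does not specify the distance measure or the spectral representation.

```python
import math


def nearest_neighbor_predict(train_spectra, train_labels, query):
    """Return the label of the stored spectrum closest to the query.

    train_spectra: list of equal-length sequences of flux values.
    train_labels:  list of labels aligned with train_spectra.
    """
    best_index = min(
        range(len(train_spectra)),
        key=lambda i: math.dist(train_spectra[i], query),  # Euclidean (assumed)
    )
    return train_labels[best_index]


# Toy flux vectors, not real survey data.
spectra = [(1.0, 0.9, 1.1), (0.2, 0.3, 0.1)]
labels = ["galaxy", "quasar"]
prediction = nearest_neighbor_predict(spectra, labels, (0.9, 1.0, 1.0))

# Incremental learning amounts to appending new labeled examples.
spectra.append((0.25, 0.2, 0.15))
labels.append("quasar")
```

    Because the stored examples are only compared with the query at prediction time, new labeled spectra can be added without refitting anything.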

  14. New classification system-based visual outcome in Eales' disease

    Directory of Open Access Journals (Sweden)

    Saxena Sandeep

    2007-01-01

    Purpose: A retrospective tertiary care center-based study was undertaken to evaluate, for the first time, the visual outcome in Eales' disease based on a new classification system. Materials and Methods: One hundred and fifty-nine consecutive cases of Eales' disease were included. All the eyes were staged according to the new classification: Stage 1: periphlebitis of small (1a) and large (1b) caliber vessels with superficial retinal hemorrhages; Stage 2a: capillary non-perfusion, 2b: neovascularization elsewhere/of the disc; Stage 3a: fibrovascular proliferation, 3b: vitreous hemorrhage; Stage 4a: traction/combined rhegmatogenous retinal detachment and 4b: rubeosis iridis, neovascular glaucoma, complicated cataract and optic atrophy. Visual acuity was graded as: Grade I 20/20 or better; Grade II 20/30 to 20/40; Grade III 20/60 to 20/120 and Grade IV 20/200 or worse. All the cases were managed by medical therapy, photocoagulation and/or vitreoretinal surgery. Visual acuity was converted into decimal scale, denoting 20/20=1 and 20/800=0.01. Paired t-test / Wilcoxon signed-rank tests were used for statistical analysis. Results: Vitreous hemorrhage was the commonest presenting feature (49.32%). Cases with Stages 1 to 3, 4a and 4b achieved final visual acuity ranging from 20/15 to 20/40, 20/80 to 20/400 and 20/200 to 20/400, respectively. Statistically significant improvement in visual acuity was observed in all the stages of the disease except Stages 1a and 4b. Conclusion: Significant improvement in visual acuity was observed in the majority of stages of Eales' disease following treatment. This study adds to the limited evidence on treatment effects available in the literature and may have an effect on patient care and health policy in Eales' disease.

  15. Quality-Oriented Classification of Aircraft Material Based on SVM

    Directory of Open Access Journals (Sweden)

    Hongxia Cai

    2014-01-01

    Full Text Available The existing material classification is proposed to improve the inventory management. However, different materials have the different quality-related attributes, especially in the aircraft industry. In order to reduce the cost without sacrificing the quality, we propose a quality-oriented material classification system considering the material quality character, Quality cost, and Quality influence. Analytic Hierarchy Process helps to make feature selection and classification decision. We use the improved Kraljic Portfolio Matrix to establish the three-dimensional classification model. The aircraft materials can be divided into eight types, including general type, key type, risk type, and leveraged type. Aiming to improve the classification accuracy of various materials, the algorithm of Support Vector Machine is introduced. Finally, we compare the SVM and BP neural network in the application. The results prove that the SVM algorithm is more efficient and accurate and the quality-oriented material classification is valuable.

  16. BRAIN TUMOR CLASSIFICATION USING NEURAL NETWORK BASED METHODS

    OpenAIRE

    Kalyani A. Bhawar*, Prof. Nitin K. Bhil

    2016-01-01

    Classification of MRI (magnetic resonance imaging) brain tumor images is a difficult task due to the variance and complexity of tumors. This paper presents two neural network techniques for the classification of magnetic resonance human brain images. The proposed neural network technique consists of three stages, namely, feature extraction, dimensionality reduction, and classification. In the first stage, we have obtained the features related to the images using d...

  17. Investigating text message classification using case-based reasoning

    OpenAIRE

    Healy, Matt, (Thesis)

    2007-01-01

    Text classification is the categorization of text into a predefined set of categories. Text classification is becoming increasingly important given the large volume of text stored electronically e.g. email, digital libraries and the World Wide Web (WWW). These documents represent a massive amount of information that can be accessed easily. To gain benefit from using this information requires organisation. One way of organising it automatically is to use text classification. A number of well k...

  18. Classification of Dukes' B and C colorectal cancers using expression arrays

    DEFF Research Database (Denmark)

    Frederiksen, C.M.; Knudsen, Steen; Laurberg, S.;

    2003-01-01

    and D could not be classified correctly. A number of interesting gene clusters showed a discriminating difference between Dukes' B and C samples. These included mitochondrial genes, stromal remodeling genes, and genes related to cell adhesion. Conclusion. Molecular classification based on gene...

  19. Classification between normal and tumor tissues based on the pair-wise gene expression ratio

    International Nuclear Information System (INIS)

    Precise classification of cancer types is critically important for early cancer diagnosis and treatment. Numerous efforts have been made to use gene expression profiles to improve precision of tumor classification. However, reliable cancer-related signals are generally lacking. Using recent datasets on colon and prostate cancer, a data transformation procedure from single gene expression to pair-wise gene expression ratio is proposed. Making use of the internal consistency of each expression profiling dataset this transformation improves the signal to noise ratio of the dataset and uncovers new relevant cancer-related signals (features). The efficiency in using the transformed dataset to perform normal/tumor classification was investigated using feature partitioning with informative features (gene annotation) as discriminating axes (single gene expression or pair-wise gene expression ratio). Classification results were compared to the original datasets for up to 10-feature model classifiers. 82 and 262 genes that have high correlation to tissue phenotype were selected from the colon and prostate datasets respectively. Remarkably, data transformation of the highly noisy expression data successfully led to lower the coefficient of variation (CV) for the within-class samples as well as improved the correlation with tissue phenotypes. The transformed dataset exhibited lower CV when compared to that of single gene expression. In the colon cancer set, the minimum CV decreased from 45.3% to 16.5%. In prostate cancer, comparable CV was achieved with and without transformation. This improvement in CV, coupled with the improved correlation between the pair-wise gene expression ratio and tissue phenotypes, yielded higher classification efficiency, especially with the colon dataset – from 87.1% to 93.5%. Over 90% of the top ten discriminating axes in both datasets showed significant improvement after data transformation. The high classification efficiency achieved suggested
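
    The transformation from single-gene expression to a pair-wise expression ratio, and its effect on the within-class coefficient of variation (CV), can be sketched as follows. This is an illustrative example only: the expression values are made up, the CV is taken as sample standard deviation over mean (the usual definition, not confirmed by the abstract), and the assumption that noise acts as a shared per-sample scale factor is ours.

```python
import statistics

def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)

def pairwise_ratio(gene_a, gene_b):
    """Per-sample ratio of two genes' expression values."""
    return [a / b for a, b in zip(gene_a, gene_b)]

# Hypothetical expression values for two genes across four samples of one
# class. Both genes share a per-sample scale factor (1, 2, 3, 4), which
# stands in for sample-wide measurement noise.
gene_a = [10.0, 20.0, 30.0, 40.0]
gene_b = [5.0, 10.0, 15.0, 20.0]

cv_single = coefficient_of_variation(gene_a)
cv_ratio = coefficient_of_variation(pairwise_ratio(gene_a, gene_b))
print(round(cv_single, 1), round(cv_ratio, 1))
```

    In this toy case the shared scale factor cancels in the ratio, so the within-class CV of the ratio is far lower than that of either gene alone. Real data would cancel only the part of the noise that both genes share.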

  20. Review of Image Classification Techniques Based on LDA, PCA and Genetic Algorithm

    Directory of Open Access Journals (Sweden)

    Mukul Yadav

    2014-02-01

    Full Text Available Image classification plays an important role in security surveillance, given today's huge image databases. Rapid change in the feature content of images is a major issue in classification. Various authors have improved image classification using different classifier models. The efficiency of a classifier model depends on the feature extraction process applied to the traffic image. For feature extraction, various authors have used different techniques, such as Gabor feature extraction, histograms and many others. We apply FLDA-GA to improve the classification rate of content-based image classification. The improved method uses a genetic algorithm as a heuristic function; an optimal GA serves as the feature optimizer for FLDA classification. Normal FLDA suffers from core and outlier problems. The both-side kernel technique improves the classification process of the support vector machine. FLDA performs better classification in comparison with other binary multi-class classification methods. A directed acyclic graph applies a graph partition technique for the mapping of feature data. When the feature data are mapped correctly, the voting process of classification improves automatically.

  1. Statistical Analysis of Tissue Images for Detection and Classification of Cervical Cancer

    CERN Document Server

    Jagtap, Jaidip; Pandey, Kiran; Agarwa, Asha; Panigrahi, Prasanta K; Pradhan, Asima

    2011-01-01

    Cervical cancer is one of the major health threats in women worldwide. The current "gold standard" for detecting cancer of the epithelial tissue is the histopathology analysis of biopsy samples. However it relies on the pathologist's judgment of the disease. We investigate the utility of statistical parameters as a potential tool for detection and discrimination of the stages of dysplasia. Digital images of the tissue slides are captured with the help of a digital camera plugged to a microscope. Statistical data analysis is performed with the help of software to evaluate parameters such as mean, maxima, full width half maxima, skewness, kurtosis etc. for the images. We believe that these parameters can help effectively to improve the diagnosis and further classify normal and abnormal tissue sections. These parameters can be used independently as well as in tandem with other parameters as features in classification algorithms that involve the use of Neural networks or Principal component analysis.
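
    A few of the statistical parameters named in this abstract (mean, maximum, skewness, kurtosis) can be computed from pixel intensities as in the sketch below. It is an illustration under stated assumptions, not the authors' software: the function name is hypothetical, population (biased) moment formulas are used, and kurtosis is reported without subtracting 3.

```python
import math

def intensity_moments(pixels):
    """Mean, maximum, skewness and kurtosis of a flat list of intensities.

    Uses population (biased) central moments; kurtosis is the plain fourth
    standardized moment, so a normal distribution would give about 3.
    """
    n = len(pixels)
    mean = sum(pixels) / n
    var = sum((p - mean) ** 2 for p in pixels) / n
    std = math.sqrt(var)
    skew = sum((p - mean) ** 3 for p in pixels) / (n * std ** 3)
    kurt = sum((p - mean) ** 4 for p in pixels) / (n * std ** 4)
    return {"mean": mean, "max": max(pixels), "skewness": skew, "kurtosis": kurt}

# Made-up intensities standing in for one flattened grayscale tissue image.
stats = intensity_moments([1, 2, 3, 4, 5])
print(stats)
```

    Each image then reduces to a short feature vector that could be passed to a classifier, which is the use the abstract proposes.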

  2. Dendritic cell-based cancer immunotherapy for colorectal cancer

    OpenAIRE

    Kajihara, Mikio; Takakura, Kazuki; Kanai, Tomoya; Ito, Zensho; Saito, Keisuke; Takami, Shinichiro; Shimodaira, Shigetaka; Okamoto, Masato; Ohkusa, Toshifumi; Koido, Shigeo

    2016-01-01

    Colorectal cancer (CRC) is one of the most common cancers and a leading cause of cancer-related mortality worldwide. Although systemic therapy is the standard care for patients with recurrent or metastatic CRC, the prognosis is extremely poor. The optimal sequence of therapy remains unknown. Therefore, alternative strategies, such as immunotherapy, are needed for patients with advanced CRC. This review summarizes evidence from dendritic cell-based cancer immunotherapy strategies that are curr...

  3. Pooling breast cancer datasets has a synergetic effect on classification performance and improves signature stability

    Directory of Open Access Journals (Sweden)

    van de Vijver Marc J

    2008-08-01

    Full Text Available Abstract Background: Michiels et al. (Lancet 2005; 365: 488–92) employed a resampling strategy to show that the genes identified as predictors of prognosis from resamplings of a single gene expression dataset are highly variable. The genes most frequently identified in the separate resamplings were put forward as a 'gold standard'. On a higher level, breast cancer datasets collected by different institutions can be considered as resamplings from the underlying breast cancer population. The limited overlap between published prognostic signatures confirms the trend of signature instability identified by the resampling strategy. Six breast cancer datasets, totaling 947 samples, all measured on the Affymetrix platform, are currently available. This provides a unique opportunity to employ a substantial dataset to investigate the effects of pooling datasets on classifier accuracy, signature stability and enrichment of functional categories. Results: We show that the resampling strategy produces a suboptimal ranking of genes, which cannot be considered a 'gold standard'. When pooling breast cancer datasets, we observed a synergetic effect on the classification performance in 73% of the cases. We also observe a significant positive correlation between the number of datasets that are pooled, the validation performance, the number of genes selected, and the enrichment of specific functional categories. In addition, we have evaluated the support for five explanations that have been postulated for the limited overlap of signatures. Conclusion: The limited overlap of current signature genes can be attributed to small sample size. Pooling datasets results in more accurate classification and a convergence of signature genes. We therefore advocate the analysis of new data within the context of a compendium, rather than analysis in isolation.

  4. Molecular classification and prognostication of 300 node-negative breast cancer cases: A tertiary care experience

    Science.gov (United States)

    Shemin, K. M. Zuhara; Smitha, N. V.; Jojo, Annie; Vijaykumar, D. K.

    2015-01-01

    Background: The proportion of node-negative breast cancer patients has been increasing with improvement of diagnostic modalities and early detection. However, there is a 20–30% recurrence in node-negative breast cancers. Determining who should receive adjuvant therapy is challenging, as the majority are cured by surgery alone. Hence, it requires further stratification using additional prognostic and predictive factors. Subjects and Methods: Ours is a single institution retrospective study, on 300 node-negative breast cancer cases, who underwent primary surgery over a period of 7 years (2005–2011). We excluded all cases who took NACT. Prognostic factors of age, size, lymphovascular emboli, estrogen receptor (ER), progesterone receptor (PR), HER2neu Ki-67, grade and molecular classification were analyzed with respect to those with and without early events (recurrence, metastases or second malignancy, death) using-Pearson Chi-square method and logistic regression method for statistical analysis. Results: Majority belonged to the age group of 50–70 years. On univariate analysis, size >5 cm (P = 0.03) and ER negativity had significant association (P = 0.05) for early failures; PR negativity and lymphovascular emboli (LVE) had borderline significance (P = 0.07). Multivariate analysis showed size >5 cm to be significant (P = 0.04) and LVE positivity showed borderline significant association (P = 0.07) with early failures. About 62% belonged to luminal category followed by basal-like (25%) in molecular classification. Conclusions: ER negativity, PR negativity, LVE/lymphovascular invasion positivity and size >5 cm (T3 and T4) are associated with poor prognosis in node-negative breast cancers. PMID:26981506

  5. Molecular classification and prognostication of 300 node-negative breast cancer cases: A tertiary care experience

    Directory of Open Access Journals (Sweden)

    K M Zuhara Shemin

    2015-01-01

    Full Text Available Background: The proportion of node-negative breast cancer patients has been increasing with improvement of diagnostic modalities and early detection. However, there is a 20-30% recurrence in node-negative breast cancers. Determining who should receive adjuvant therapy is challenging, as the majority are cured by surgery alone. Hence, it requires further stratification using additional prognostic and predictive factors. Subjects and Methods: Ours is a single institution retrospective study, on 300 node-negative breast cancer cases, who underwent primary surgery over a period of 7 years (2005-2011). We excluded all cases who took NACT. Prognostic factors of age, size, lymphovascular emboli, estrogen receptor (ER), progesterone receptor (PR), HER2neu, Ki-67, grade and molecular classification were analyzed with respect to those with and without early events (recurrence, metastases or second malignancy, death) using the Pearson Chi-square method and logistic regression method for statistical analysis. Results: Majority belonged to the age group of 50-70 years. On univariate analysis, size >5 cm (P = 0.03) and ER negativity had significant association (P = 0.05) for early failures; PR negativity and lymphovascular emboli (LVE) had borderline significance (P = 0.07). Multivariate analysis showed size >5 cm to be significant (P = 0.04) and LVE positivity showed borderline significant association (P = 0.07) with early failures. About 62% belonged to luminal category followed by basal-like (25%) in molecular classification. Conclusions: ER negativity, PR negativity, LVE/lymphovascular invasion positivity and size >5 cm (T3 and T4) are associated with poor prognosis in node-negative breast cancers.

  6. Multiview Sample Classification Algorithm Based on L1-Graph Domain Adaptation Learning

    OpenAIRE

    Huibin Lu; Zhengping Hu; Hongxiao Gao

    2015-01-01

    In the case of multiview sample classification with different distribution, training and testing samples are from different domains. In order to improve the classification performance, a multiview sample classification algorithm based on L1-Graph domain adaptation learning is presented. First of all, a framework of nonnegative matrix trifactorization based on domain adaptation learning is formed, in which the unchanged information is regarded as the bridge of knowledge transformation from the...

  7. Three-Phase Tournament-Based Method for Better Email Classification

    OpenAIRE

    Sabah Sayed; Samir AbdelRahman; Ibrahim Farag

    2012-01-01

    Email classification performance has attracted much attention in the last decades. This paper proposes a tournament-based method to evolve email classification performance utilizing World Final Cup rules as a solution heuristics. Our proposed classification method passes through three phases: 1) clustering (grouping) email folders (topics or classes) based on their token and field similarities, 2) training binary classifiers on each class pair and 3) applying 2-layer tournament me...

  8. Classification of types of stuttering symptoms based on brain activity.

    Directory of Open Access Journals (Sweden)

    Jing Jiang

    Full Text Available Among the non-fluencies seen in speech, some are more typical (MT of stuttering speakers, whereas others are less typical (LT and are common to both stuttering and fluent speakers. No neuroimaging work has evaluated the neural basis for grouping these symptom types. Another long-debated issue is which type (LT, MT whole-word repetitions (WWR should be placed in. In this study, a sentence completion task was performed by twenty stuttering patients who were scanned using an event-related design. This task elicited stuttering in these patients. Each stuttered trial from each patient was sorted into the MT or LT types with WWR put aside. Pattern classification was employed to train a patient-specific single trial model to automatically classify each trial as MT or LT using the corresponding fMRI data. This model was then validated by using test data that were independent of the training data. In a subsequent analysis, the classification model, just established, was used to determine which type the WWR should be placed in. The results showed that the LT and the MT could be separated with high accuracy based on their brain activity. The brain regions that made most contribution to the separation of the types were: the left inferior frontal cortex and bilateral precuneus, both of which showed higher activity in the MT than in the LT; and the left putamen and right cerebellum which showed the opposite activity pattern. The results also showed that the brain activity for WWR was more similar to that of the LT and fluent speech than to that of the MT. These findings provide a neurological basis for separating the MT and the LT types, and support the widely-used MT/LT symptom grouping scheme. In addition, WWR play a similar role as the LT, and thus should be placed in the LT type.

  9. Sequence variant classification and reporting: recommendations for improving the interpretation of cancer susceptibility genetic test results.

    NARCIS (Netherlands)

    Plon, S.E.; Eccles, D.M.; Easton, D.; Foulkes, W.D.; Genuardi, M.; Greenblatt, M.S.; Hogervorst, F.B.; Hoogerbrugge, N.; Spurdle, A.B.; Tavtigian, S.V.

    2008-01-01

    Genetic testing of cancer susceptibility genes is now widely applied in clinical practice to predict risk of developing cancer. In general, sequence-based testing of germline DNA is used to determine whether an individual carries a change that is clearly likely to disrupt normal gene function. Genet

  10. Research and Application of Human Capital Strategic Classification Tool: Human Capital Classification Matrix Based on Biological Natural Attribute

    Directory of Open Access Journals (Sweden)

    Yong Liu

    2014-12-01

    Full Text Available To study the causes of weak strategic classification management of human capital structure in China, we analyze the increasing difficulty that enterprises around the world face in human capital management. To provide strategically sound answers, HR managers need the critical information provided by the right processing technology and analytical tools. In a formal organization there are different types and levels of human capital, which do not contribute equally to the organization. An important guarantee for the sustained and healthy development of a formal or informal organization is lower human capital risk. Resisting this risk depends primarily on the hedging force and appreciation force of human capital value, which in turn depend largely on the strategic value of senior managers' performance. From the perspective of high-level managers, we also discuss the value and configuration principles and methods to be followed in human capital strategic classification based on the Boston Consulting Group (BCG) matrix, and build a Human Capital Classification (HCC) matrix based on biological natural attributes to effectively realize strategic classification of human capital structure.

  11. [ECoG classification based on wavelet variance].

    Science.gov (United States)

    Yan, Shiyu; Liu, Chong; Wang, Hong; Zhao, Haibin

    2013-06-01

    For a typical electrocorticogram (ECoG)-based brain-computer interface (BCI) system in which the subject's task is to imagine movements of either the left small finger or the tongue, we proposed a feature extraction algorithm using wavelet variance. Firstly, the definition and significance of wavelet variance were brought out and taken as a feature based on the discussion of the wavelet transform. Six channels with the most distinctive features were selected from 64 channels for analysis. Consequently, the EEG data were decomposed using the db4 wavelet. The wavelet coefficient variances containing Mu rhythm and Beta rhythm were taken out as features based on the ERD/ERS phenomenon. The features were classified linearly with an algorithm of cross validation. The results of off-line analysis showed that high classification accuracies of 90.24% and 93.77% for training and test data sets were achieved; the wavelet variance had the characteristics of simplicity and effectiveness and was suitable for feature extraction in BCI research. PMID:23865300
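
    The idea of using per-level wavelet coefficient variance as a feature can be sketched as below. Note the substitutions: the abstract uses the db4 wavelet, whereas this example uses the simpler Haar wavelet to stay dependency-free, and it takes wavelet variance as the mean squared detail coefficient at each level. Function names and the test signal are hypothetical.

```python
import math

def haar_dwt(signal):
    """One level of the orthonormal Haar transform (not db4, see lead-in)."""
    s = 1.0 / math.sqrt(2.0)
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal) - 1, 2)]
    approx = [(a + b) * s for a, b in pairs]
    detail = [(a - b) * s for a, b in pairs]
    return approx, detail

def wavelet_variances(signal, levels):
    """Mean squared detail coefficient at each decomposition level."""
    variances = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        variances.append(sum(d * d for d in detail) / len(detail))
    return variances

# Made-up signal that alternates every sample: all of its energy sits in
# the finest scale, so only the first-level variance is non-zero.
features = wavelet_variances([1, -1, 1, -1, 1, -1, 1, -1], levels=2)
print(features)
```

    One such feature vector per channel could then be fed to a linear classifier, in the spirit of the pipeline the abstract outlines.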

  12. Hyperspectral remote sensing image classification based on decision level fusion

    Institute of Scientific and Technical Information of China (English)

    Peijun Du; Wei Zhang; Junshi Xia

    2011-01-01

    To apply decision level fusion to hyperspectral remote sensing (HRS) image classification, three decision level fusion strategies are experimented on and compared, namely, linear consensus algorithm, improved evidence theory, and the proposed support vector machine (SVM) combiner. To evaluate the effects of the input features on classification performance, four schemes are used to organize input features for member classifiers. In the experiment, by using the operational modular imaging spectrometer (OMIS) II HRS image, the decision level fusion is shown as an effective way for improving the classification accuracy of the HRS image, and the proposed SVM combiner is especially suitable for decision level fusion. The results also indicate that the optimization of input features can improve the classification performance.
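
    Of the three fusion strategies listed, the linear consensus rule is the simplest to illustrate: the member classifiers' class posteriors are combined as a weighted average and the highest fused score wins. The sketch below is a generic linear opinion pool under assumed inputs; the weights, posterior values and function name are hypothetical and are not taken from the paper.

```python
def linear_consensus(posteriors, weights):
    """Fuse per-classifier class posteriors by a normalized weighted average.

    `posteriors[k][c]` is classifier k's probability for class c.
    Returns (index of the winning class, fused probability vector).
    """
    n_classes = len(posteriors[0])
    total = sum(weights)
    fused = [
        sum(w * p[c] for w, p in zip(weights, posteriors)) / total
        for c in range(n_classes)
    ]
    return fused.index(max(fused)), fused

# Made-up posteriors from three member classifiers for one pixel, 3 classes.
posteriors = [
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.5, 0.4, 0.1],
]
weights = [2.0, 1.0, 1.0]  # hypothetical: e.g. proportional to accuracy

label, fused = linear_consensus(posteriors, weights)
print(label, fused)
```

    With equal weights the second classifier's strong vote would tip the decision to class 1; weighting the first classifier more heavily moves it to class 0, which shows how the choice of weights shapes the fused result.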

  13. Text Classification Retrieval Based on Complex Network and ICA Algorithm

    Directory of Open Access Journals (Sweden)

    Hongxia Li

    2013-08-01

    Full Text Available With the development of computer science and information technology, libraries are moving toward digital, networked operation. Library digitization converts books into digital information. High-quality preservation and management are achieved through computer technology and text classification techniques, which realizes knowledge appreciation. This paper introduces complex network theory into the text classification process and puts forward the ICA semantic clustering algorithm, which realizes independent component analysis for complex network text classification. Through the ICA clustering algorithm, characteristic words are clustered and extracted for text classification, and the visualization of text retrieval is improved. Finally, we make a comparative analysis of the collocation algorithm and the ICA clustering algorithm through text classification and keyword search experiments. The paper gives figures for the clustering degree and accuracy of the algorithms. Through simulation analysis, we find that the ICA clustering algorithm improves the text classification clustering degree by 1.2%, and accuracy can be improved by up to 11.1%. It improves the efficiency and accuracy of text classification retrieval. It also provides a theoretical reference for text retrieval classification of eBooks.

  14. Radiological and 'Imaging' methods in TNM classification of non-small-cell lung cancer

    International Nuclear Information System (INIS)

    Lung cancer is the most common malignant disease worldwide in terms of incidence and mortality. The aim of our study was to evaluate the diagnostic value of radiological and imaging methods, according to the TNM classification, compared to postoperative histological diagnosis. Thirty-seven patients with pulmonary carcinoma were studied prospectively using native chest radiography (PA and LL view), computed tomography (CT) and magnetic resonance imaging (MRI) during the ten days before thoracotomy. Radiological and imaging findings were reviewed separately and results were compared with surgical and pathohistological findings on the basis of the TNM classification. All patients underwent chest x-rays; CT was performed in 36 patients and MRI in 12 of them. Imaging methods (CT and MRI) showed considerably higher sensitivity and specificity than native chest radiography. Generally, no statistically significant differences were found between the two imaging methods for the evaluation of tumour extent (T) or lymph node metastases (N). MRI was slightly superior to CT in determining the chest wall extent of the tumour. In conclusion, CT remains the imaging modality of choice both for assessing patients with abnormal chest radiographs suspected of having lung cancer and for staging patients with histologically proven pulmonary carcinoma.

  15. Gene expression-based classifications of fibroadenomas and phyllodes tumours of the breast.

    Science.gov (United States)

    Vidal, Maria; Peg, Vicente; Galván, Patricia; Tres, Alejandro; Cortés, Javier; Ramón y Cajal, Santiago; Rubio, Isabel T; Prat, Aleix

    2015-06-01

    Fibroepithelial tumors (FTs) of the breast are a heterogeneous group of lesions ranging from fibroadenomas (FAD) to phyllodes tumors (PT) (benign, borderline, malignant). Further understanding of their molecular features and classification might be of clinical value. In this study, we analysed the expression of 105 breast cancer-related genes, including the 50 genes of the PAM50 intrinsic subtype predictor and 12 genes of the Claudin-low subtype predictor, in a panel of 75 FTs (34 FADs, 5 juvenile FADs, 20 benign PTs, 5 borderline PTs and 11 malignant PTs) with clinical follow-up. In addition, we compared the expression profiles of FTs with those of 14 normal breast tissues and 49 primary invasive ductal carcinomas (IDCs). Our results revealed that the levels of expression of all breast cancer-related genes can discriminate the various groups of FTs, together with normal breast tissues and IDCs (False Discovery Rate genes (e.g. CCNB1 and MKI67) and mesenchymal/epithelial-related (e.g. CLDN3 and EPCAM) genes were found to be most discriminative. As expected, FADs showed the highest and lowest expression of epithelial- and proliferation-related genes, respectively, whereas malignant PTs showed the opposite expression pattern. Interestingly, the overall profile of benign PTs was found more similar to FADs and normal breast tissues than the rest of tumours, including juvenile FADs. Within the dataset of IDCs and normal breast tissues, the vast majority of FADs, juvenile FADs, benign PTs and borderline PTs were identified as Normal-like by intrinsic breast cancer subtyping, whereas 7 (63.6%) and 3 (27.3%) malignant PTs were identified as Claudin-low and Basal-like, respectively. Finally, we observed that the previously described PAM50 risk of relapse prognostic score better predicted outcome in FTs than the morphological classification, even within PTs-only. Our results suggest that classification of FTs using gene expression-based data is feasible and might provide

  16. Comprehensive Study on Lexicon-based Ensemble Classification Sentiment Analysis

    Directory of Open Access Journals (Sweden)

    Łukasz Augustyniak

    2015-12-01

    Full Text Available We propose a novel method for counting sentiment orientation that outperforms supervised learning approaches in time and memory complexity and is not statistically significantly different from them in accuracy. Our method consists of a novel approach to generating unigram, bigram and trigram lexicons. The proposed method, called frequentiment, is based on calculating the frequency of features (words in the document and averaging their impact on the sentiment score as opposed to documents that do not contain these features. Afterwards, we use ensemble classification to improve the overall accuracy of the method. What is important is that the frequentiment-based lexicons with sentiment threshold selection outperform other popular lexicons and some supervised learners, while being 3–5 times faster than the supervised approach. We compare 37 methods (lexicons, ensembles with lexicon’s predictions as input and supervised learners applied to 10 Amazon review data sets and provide the first statistical comparison of the sentiment annotation methods that include ensemble approaches. It is one of the most comprehensive comparisons of domain sentiment analysis in the literature.

  17. Classification of Histological Images Based on the Stationary Wavelet Transform

    International Nuclear Information System (INIS)

    Non-Hodgkin lymphomas are of many distinct types, and different classification systems make it difficult to diagnose them correctly. Many of these systems classify lymphomas only based on what they look like under a microscope. In 2008 the World Health Organisation (WHO) introduced the most recent system, which also considers the chromosome features of the lymphoma cells and the presence of certain proteins on their surface. The WHO system is the one that we apply in this work. Herewith we present an automatic method to classify histological images of three types of non-Hodgkin lymphoma. Our method is based on the Stationary Wavelet Transform (SWT), and it consists of three steps: 1) extracting sub-bands from the histological image through SWT, 2) applying Analysis of Variance (ANOVA) to clean noise and select the most relevant information, 3) classifying it by the Support Vector Machine (SVM) algorithm. The kernel types Linear, RBF and Polynomial were evaluated with our method applied to 210 images of lymphoma from the National Institute on Aging. We concluded that the following combination led to the most relevant results: detail sub-band, ANOVA and SVM with Linear and RBF kernels

  18. An approach for mechanical fault classification based on generalized discriminant analysis

    Institute of Scientific and Technical Information of China (English)

    LI Wei-hua; SHI Tie-lin; YANG Shu-zi

    2006-01-01

    To deal with pattern classification of complicated mechanical faults, an approach to multi-faults classification based on generalized discriminant analysis is presented. Compared with linear discriminant analysis (LDA), generalized discriminant analysis (GDA), one of the nonlinear discriminant analysis methods, is more suitable for classifying linearly non-separable problems. The connection and difference between KPCA (Kernel Principal Component Analysis) and GDA are discussed. KPCA is good at detection of machine abnormality, while GDA performs well in multi-faults classification based on the collection of historical fault symptoms. When the proposed method is applied to air compressor condition classification and gear fault classification, an excellent performance in complicated multi-faults classification is presented.

  19. A NEW SVM BASED EMOTIONAL CLASSIFICATION OF IMAGE

    Institute of Scientific and Technical Information of China (English)

    Wang Weining; Yu Yinglin; Zhang Jianchao

    2005-01-01

    This paper presents how a high-level emotional representation of art paintings can be inferred from perceptual-level features suited to the particular classes (dynamic vs. static classification). The key points are feature selection and classification. Building on the strong relationship between the notable lines of an image and human sensations, a novel feature vector, WLDLV (Weighted Line Direction-Length Vector), is proposed, which includes both the orientation and the length information of lines in an image. Classification is performed by SVM (Support Vector Machine), and images are classified as dynamic or static. Experimental results demonstrate the effectiveness and superiority of the algorithm.

  20. Different Classification Algorithms Based on Arabic Text Classification: Feature Selection Comparative Study

    Directory of Open Access Journals (Sweden)

    Ghazi Raho

    2015-02-01

    Full Text Available Feature selection is necessary for effective text classification, and dataset preprocessing is essential for sound results and effective performance. This paper investigates the effectiveness of feature selection. We compare the performance of different classifiers in different situations, using feature selection with and without stemming. The evaluation used a BBC Arabic dataset and several classification algorithms: decision tree (DT), K-nearest neighbors (KNN), Naïve Bayesian (NB) and Naïve Bayes Multinomial (NBM) classifiers. The experimental results are presented in terms of precision, recall, F-measure, accuracy and time to build the model.
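    The evaluation measures named above can be computed directly from true and predicted labels. The sketch below uses the standard definitions for a single target class; the function name `binary_metrics` and the example labels are illustrative assumptions, not taken from the paper.

```python
def binary_metrics(y_true, y_pred, positive):
    """Precision, recall, F-measure and accuracy with `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "accuracy": correct / len(pairs),
    }

# Hypothetical labels for five documents.
truth = ["sport", "sport", "news", "news", "sport"]
predicted = ["sport", "news", "news", "sport", "sport"]
scores = binary_metrics(truth, predicted, positive="sport")
```

    For a multi-class dataset, one common choice is to compute these per class and average them; whether the paper used macro or micro averaging is not stated in the abstract.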

  1. Improving Sparse Representation-Based Classification Using Local Principal Component Analysis

    OpenAIRE

    Weaver, Chelsea; Saito, Naoki

    2016-01-01

    Sparse representation-based classification (SRC), proposed by Wright et al., seeks the sparsest decomposition of a test sample over the dictionary of training samples, with classification to the most-contributing class. Because it assumes test samples can be written as linear combinations of their same-class training samples, the success of SRC depends on the size and representativeness of the training set. Our proposed classification algorithm enlarges the training set by using local princip...

  2. Analysis on Design of Kohonen-network System Based on Classification of Complex Signals

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    The key methods of detection and classification of the electroencephalogram (EEG) used in recent years are introduced. Taking EEG as an example, a design plan for a Kohonen neural network system based on detection and classification of complex signals is proposed, and both the network design and the signal processing are analyzed, including pre-processing of signals, extraction of signal features, classification of signals and network topology.

  3. Assessing the Performance of a Classification-Based Vulnerability Analysis Model

    OpenAIRE

    Wang, Tai-Ran; Mousseau, Vincent; Pedroni, Nicola; Zio, Enrico

    2015-01-01

    In this article, a classification model based on the majority rule sorting (MR-Sort) method is employed to evaluate the vulnerability of safety-critical systems with respect to malevolent intentional acts. The model is built on the basis of a (limited-size) set of data representing (a priori known) vulnerability classification examples. The empirical construction of the classification model introduces a source of uncertainty into the vulnerability analysis process: a quantitative assessment ...
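    As a rough illustration of majority rule sorting, the sketch below assigns an alternative to an ordered category by comparing it with limit profiles: the alternative passes a profile when the criteria on which it is at least as good carry enough total weight. The profiles, weights and majority threshold shown are hypothetical values chosen for the example; in the article they are inferred from classification examples, and the exact model variant used there is not given in the abstract.

```python
def mr_sort(alternative, profiles, weights, majority):
    """Assign `alternative` to a category index in 0..len(profiles).

    `profiles` are limit profiles ordered from lowest to highest (assumed to
    dominate one another). The alternative passes a profile when the summed
    weight of criteria where it is at least as good reaches `majority`.
    """
    category = 0
    for h, profile in enumerate(profiles):
        support = sum(w for a, b, w in zip(alternative, profile, weights) if a >= b)
        if support >= majority:
            category = h + 1
        else:
            break
    return category

# Hypothetical model: three criteria, two limit profiles, three categories.
weights = [4, 3, 3]
majority = 6
profiles = [[2, 2, 2], [4, 4, 4]]
```

    With these assumed values, `mr_sort([5, 3, 1], profiles, weights, majority)` passes only the first profile and lands in the middle category.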

  4. Comparing Machine Learning Classifiers for Object-Based Land Cover Classification Using Very High Resolution Imagery

    OpenAIRE

    Yuguo Qian; Weiqi Zhou; Jingli Yan; Weifeng Li; Lijian Han

    2014-01-01

    This study evaluates and compares the performance of four machine learning classifiers—support vector machine (SVM), normal Bayes (NB), classification and regression tree (CART) and K nearest neighbor (KNN)—to classify very high resolution images, using an object-based classification procedure. In particular, we investigated how tuning parameters affect the classification accuracy with different training sample sizes. We found that: (1) SVM and NB were superior to CART and KNN, and both could...

  5. Trace elements based classification on clinkers. Application to Spanish clinkers

    Directory of Open Access Journals (Sweden)

    Tamás, F. D.

    2001-12-01

    Full Text Available The qualitative identification to determine the origin (i.e. manufacturing factory) of Spanish clinkers is described. The classification of clinkers produced in different factories can be based on their trace element content. Approximately fifteen clinker sorts are analysed, collected from 11 Spanish cement factories to determine their Mg, Sr, Ba, Mn, Ti, Zr, Zn and V content. An expert system formulated by a binary decision tree is designed based on the collected data. The performance of the obtained classifier was measured by ten-fold cross validation. The results show that the proposed method is useful to identify an easy-to-use expert system that is able to determine the origin of the clinker based on its trace element content.

    This paper describes the procedure for the qualitative identification of Spanish clinkers in order to determine their origin (factory). This classification of clinkers is based on their trace element content. Fifteen different clinkers from 11 Spanish cement factories were analysed, determining their Mg, Sr, Ba, Mn, Ti, Zr, Zn and V content. An expert system was designed using a binary decision tree based on the collected data. The classification obtained was examined by ten-fold cross validation. The results show that the proposed model is valid for identifying, in an easy way, an expert system capable of determining the origin of a clinker based on its trace element content.
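    The ten-fold cross validation mentioned above can be sketched as follows. The sketch is classifier-agnostic: `fit` and `predict` are hypothetical callables supplied by the caller (a binary decision tree in the paper's case), and the shuffling seed is an arbitrary assumption.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]

def cross_val_accuracy(samples, labels, fit, predict, k=10):
    """Pooled accuracy over k folds.

    fit(train_samples, train_labels) -> model
    predict(model, sample) -> label
    """
    correct = 0
    for train, test in k_fold_indices(len(samples), k):
        model = fit([samples[i] for i in train], [labels[i] for i in train])
        correct += sum(1 for i in test if predict(model, samples[i]) == labels[i])
    return correct / len(samples)
```

    With roughly fifteen samples and ten folds, each held-out fold contains only one or two clinkers, so the estimate is close to leave-one-out.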

  6. China's Classification-Based Forest Management: Procedures, Problems, and Prospects

    Science.gov (United States)

    Dai, Limin; Zhao, Fuqiang; Shao, Guofan; Zhou, Li; Tang, Lina

    2009-06-01

    China’s new Classification-Based Forest Management (CFM) is a two-class system, including Commodity Forest (CoF) and Ecological Welfare Forest (EWF) lands, so named according to differences in their distinct functions and services. The purposes of CFM are to improve forestry economic systems, strengthen resource management in a market economy, ease the conflicts between wood demands and public welfare, and meet the diversified needs for forest services in China. The formative process of China’s CFM has involved a series of trials and revisions. China’s central government accelerated the reform of CFM in the year 2000 and completed the final version in 2003. CFM was implemented at the provincial level with the aid of subsidies from the central government. About a quarter of the forestland in China was approved as National EWF lands by the State Forestry Administration in 2006 and 2007. Logging is prohibited on National EWF lands, and their landowners or managers receive subsidies of about 70 RMB (US$10) per hectare from the central government. CFM represents a new forestry strategy in China and its implementation inevitably faces challenges in promoting the understanding of forest ecological services, generalizing nationwide criteria for identifying EWF and CoF lands, setting up forest-specific compensation mechanisms for ecological benefits, enhancing the knowledge of administrators and the general public about CFM, and sustaining EWF lands under China’s current forestland tenure system. CFM does, however, offer a viable pathway toward sustainable forest management in China.

  7. Neural Network based Vehicle Classification for Intelligent Traffic Control

    Directory of Open Access Journals (Sweden)

    Saeid Fazli

    2012-06-01

    Full Text Available The number of vehicles has increased, and traditional traffic control systems can no longer meet demand, which has led to the emergence of Intelligent Traffic Control Systems. These systems improve traffic control and urban management and increase the confidence index on roads and highways. The goal of this article is vehicle classification based on neural networks. A fixed camera mounted at a height close to the road surface is used to detect and classify the vehicles. The algorithm has two general phases. In the first, moving vehicles are extracted from the traffic scene using image processing techniques: background removal, edge detection and morphological operations. In the second, vehicles near the camera are selected and their specific features are extracted. These features are fed to the neural network as a vector, and the outputs determine the vehicle type. The presented model classifies vehicles into three classes: heavy vehicles, light vehicles and motorcycles. The results demonstrate the accuracy of the algorithm and its high level of functionality.

  8. Brazilian Cardiorespiratory Fitness Classification Based on Maximum Oxygen Consumption

    Science.gov (United States)

    Herdy, Artur Haddad; Caixeta, Ananda

    2016-01-01

    Background Cardiopulmonary exercise test (CPET) is the most complete tool available to assess functional aerobic capacity (FAC). Maximum oxygen consumption (VO2 max), an important biomarker, reflects the real FAC. Objective To develop a cardiorespiratory fitness (CRF) classification based on VO2 max in a Brazilian sample of healthy and physically active individuals of both sexes. Methods We selected 2837 CPET from 2837 individuals aged 15 to 74 years, distributed as follows: G1 (15 to 24); G2 (25 to 34); G3 (35 to 44); G4 (45 to 54); G5 (55 to 64) and G6 (65 to 74). Good CRF was the mean VO2 max obtained for each group, generating the following subclassification: Very Low (VL): VO2 105%. Results Men VL 105% G1 53.13 G2 49.77 G3 47.67 G4 42.52 G5 37.06 G6 31.50 Women G1 40.85 G2 40.01 G3 34.09 G4 32.66 G5 30.04 G6 26.36 Conclusions This chart stratifies VO2 max measured on a treadmill in a robust Brazilian sample and can be used as an alternative for the real functional evaluation of physically active and healthy individuals stratified by age and sex. PMID:27305285

  9. Hadoop-based Multi-classification Fusion for Intrusion Detection

    OpenAIRE

    Xun-Yi Ren; Yu-Zhu Qi

    2013-01-01

    Intrusion detection systems are among the most important security technologies in computer networks, and clustering and classification techniques from data mining are currently often used to build detection models. However, each classification and clustering method has its own advantages and disadvantages, and the test results of the detection models are not ideal. Cloud Computing, which can integrate multiple inexpensive computing nodes into a distributed system with a stro...

  10. Egocentric visual event classification with location-based priors

    OpenAIRE

    Sundaram, Sudeep; Mayol-Cuevas, Walterio

    2010-01-01

    We present a method for visual classification of actions and events captured from an egocentric point of view. The method tackles the challenge of a moving camera by creating deformable graph models for classification of actions. Action models are learned from low resolution, roughly stabilized difference images acquired using a single monocular camera. In parallel, raw images from the camera are used to estimate the user's location using a visual Simultaneous Localization and Mapping (SLAM) ...

  11. Basic Hand Gestures Classification Based on Surface Electromyography

    OpenAIRE

    Aleksander Palkowski; Grzegorz Redlarski

    2016-01-01

    This paper presents an innovative classification system for hand gestures using 2-channel surface electromyography analysis. The system developed uses the Support Vector Machine classifier, for which the kernel function and parameter optimisation are conducted additionally by the Cuckoo Search swarm algorithm. The system developed is compared with standard Support Vector Machine classifiers with various kernel functions. The average classification rate of 98.12% has been achieved for the prop...

  12. Consistent image-based measurement and classification of skin color

    OpenAIRE

    Harville, Michael; Baker, Harlyn; Bhatti, Nina; Süsstrunk, Sabine

    2005-01-01

    Little prior image processing work has addressed estimation and classification of skin color in a manner that is independent of camera and illuminant. To this end, we first present new methods for 1) fast, easy-to-use image color correction, with specialization toward skin tones, and 2) fully automated estimation of facial skin color, with robustness to shadows, specularities, and blemishes. Each of these is validated independently against ground truth, and then combined with a classification...

  13. Text Classification Retrieval Based on Complex Network and ICA Algorithm

    OpenAIRE

    Hongxia Li

    2013-01-01

    With the development of computer science and information technology, libraries are becoming increasingly digital and networked. Library digitization converts books into digital information. High-quality preservation and management are achieved through computer technology as well as text classification techniques. It realizes knowledge appreciation. This paper introduces complex network theory into the text classification process and puts forward the ICA semantic clustering algorithm. It...

  14. Texture Features based Blur Classification in Barcode Images

    OpenAIRE

    Shamik Tiwari; Vidya Prasad Shukla; Sangappa Birada; Ajay Singh

    2013-01-01

    Blur is an undesirable phenomenon which appears as image degradation. Blur classification is extremely desirable before application of any blur parameters estimation approach in case of blind restoration of barcode image. A novel approach to classify blur in motion, defocus, and co-existence of both blur categories is presented in this paper. The key idea involves statistical features extraction of blur pattern in frequency domain and designing of blur classification system with feed forward ...

  15. IMPROVEMENT OF TCAM-BASED PACKET CLASSIFICATION ALGORITHM

    Institute of Scientific and Technical Information of China (English)

    Xu Zhen; Zhang Jun; Rui Liyang; Sun Jun

    2008-01-01

    The feature of Ternary Content Addressable Memories (TCAMs) makes them particularly attractive for IP address lookup and packet classification applications in a router system. However, the limitations of TCAMs impede their utilization. In this paper, the solutions for decreasing the power consumption and avoiding entry expansion in range matching are addressed. Experimental results demonstrate that the proposed techniques can make some big improvements on the performance of TCAMs in IP address lookup and packet classification.
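    The entry expansion problem mentioned above arises because a TCAM entry matches a ternary prefix, so an arbitrary range (for example, a port range in a classification rule) generally has to be split into several prefix entries. The sketch below shows this conventional range-to-prefix conversion only; it illustrates the problem, not the solution proposed in the paper, and the function name `range_to_prefixes` is an assumption.

```python
def range_to_prefixes(lo, hi, width):
    """Split the integer range [lo, hi] into a minimal list of ternary prefixes.

    Each prefix is a string of `width` characters over '0', '1' and '*',
    where '*' marks a don't-care bit.
    """
    prefixes = []
    while lo <= hi:
        # Largest power-of-two block that is aligned at `lo`.
        size = lo & -lo if lo else 1 << width
        # Shrink the block until it fits inside the remaining range.
        while size > hi - lo + 1:
            size >>= 1
        wildcard_bits = size.bit_length() - 1
        bits = format(lo, f"0{width}b")
        prefixes.append(bits[: width - wildcard_bits] + "*" * wildcard_bits)
        lo += size
    return prefixes

# A 4-bit range that needs six separate TCAM entries.
entries = range_to_prefixes(1, 14, 4)
```

    For a `width`-bit field the worst case is `2 * width - 2` prefixes, so a single 16-bit port range can occupy up to 30 entries, which is why avoiding this expansion matters.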

  16. Basic Hand Gestures Classification Based on Surface Electromyography.

    Science.gov (United States)

    Palkowski, Aleksander; Redlarski, Grzegorz

    2016-01-01

    This paper presents an innovative classification system for hand gestures using 2-channel surface electromyography analysis. The system developed uses the Support Vector Machine classifier, for which the kernel function and parameter optimisation are conducted additionally by the Cuckoo Search swarm algorithm. The system developed is compared with standard Support Vector Machine classifiers with various kernel functions. The average classification rate of 98.12% has been achieved for the proposed method. PMID:27298630

  17. Basic Hand Gestures Classification Based on Surface Electromyography

    Directory of Open Access Journals (Sweden)

    Aleksander Palkowski

    2016-01-01

    Full Text Available This paper presents an innovative classification system for hand gestures using 2-channel surface electromyography analysis. The system developed uses the Support Vector Machine classifier, for which the kernel function and parameter optimisation are conducted additionally by the Cuckoo Search swarm algorithm. The system developed is compared with standard Support Vector Machine classifiers with various kernel functions. The average classification rate of 98.12% has been achieved for the proposed method.

  18. Basic Hand Gestures Classification Based on Surface Electromyography

    Science.gov (United States)

    Palkowski, Aleksander; Redlarski, Grzegorz

    2016-01-01

    This paper presents an innovative classification system for hand gestures using 2-channel surface electromyography analysis. The system developed uses the Support Vector Machine classifier, for which the kernel function and parameter optimisation are conducted additionally by the Cuckoo Search swarm algorithm. The system developed is compared with standard Support Vector Machine classifiers with various kernel functions. The average classification rate of 98.12% has been achieved for the proposed method. PMID:27298630

  19. Review of Remotely Sensed Imagery Classification Patterns Based on Object-oriented Image Analysis

    Institute of Scientific and Technical Information of China (English)

    LIU Yongxue; LI Manchun; MAO Liang; XU Feifei; HUANG Shuo

    2006-01-01

    With the wide use of high-resolution remotely sensed imagery, the object-oriented remotely sensed information classification pattern has been intensively studied. Starting with the definition of the object-oriented remotely sensed information classification pattern and a literature review of related research progress, this paper sums up four developing phases of the object-oriented classification pattern during the past 20 years. Then, we discuss three aspects of the methodology in detail, namely remotely sensed imagery segmentation, feature analysis and feature selection, and classification rule generation, by comparing them with per-pixel remotely sensed information classification methods. Finally, this paper presents several points that need attention in future studies on the object-oriented RS information classification pattern: 1) developing robust and highly effective image segmentation algorithms for multi-spectral RS imagery; 2) improving the feature set, including edge, spatial-adjacency and temporal characteristics; 3) discussing the classification rule generation classifier based on the decision tree; 4) presenting evaluation methods for the classification results of the object-oriented classification pattern.

  20. CLASSIFICATION OF LiDAR DATA WITH POINT BASED CLASSIFICATION METHODS

    OpenAIRE

    N. Yastikli; Cetin, Z.

    2016-01-01

    LiDAR is one of the most effective systems for 3 dimensional (3D) data collection in wide areas. Nowadays, airborne LiDAR data is used frequently in various applications such as object extraction, 3D modelling, change detection and revision of maps with increasing point density and accuracy. The classification of the LiDAR points is the first step of LiDAR data processing chain and should be handled in proper way since the 3D city modelling, building extraction, DEM generation, etc. applicati...

  1. Identification of area-level influences on regions of high cancer incidence in Queensland, Australia: a classification tree approach

    Directory of Open Access Journals (Sweden)

    Mengersen Kerrie L

    2011-07-01

    Full Text Available Abstract Background Strategies for cancer reduction and management are targeted at both individual and area levels. Area-level strategies require careful understanding of geographic differences in cancer incidence, in particular the association with factors such as socioeconomic status, ethnicity and accessibility. This study aimed to identify the complex interplay of area-level factors associated with high area-specific incidence of Australian priority cancers using a classification and regression tree (CART) approach. Methods Area-specific smoothed standardised incidence ratios were estimated for priority-area cancers across 478 statistical local areas in Queensland, Australia (1998-2007, n = 186,075). For those cancers with significant spatial variation, CART models were used to identify whether area-level accessibility, socioeconomic status and ethnicity were associated with high area-specific incidence. Results The accessibility of a person's residence had the most consistent association with the risk of cancer diagnosis across the specific cancers. Many cancers were likely to have high incidence in more urban areas, although male lung cancer and cervical cancer tended to have high incidence in more remote areas. The impact of socioeconomic status and ethnicity on these associations differed by type of cancer. Conclusions These results highlight the complex interactions between accessibility, socioeconomic status and ethnicity in determining cancer incidence risk.

  2. INDUS - a composition-based approach for rapid and accurate taxonomic classification of metagenomic sequences

    OpenAIRE

    Mohammed, Monzoorul Haque; Ghosh, Tarini Shankar; Reddy, Rachamalla Maheedhar; Reddy, Chennareddy Venkata Siva Kumar; Singh, Nitin Kumar; Sharmila S Mande

    2011-01-01

    Background Taxonomic classification of metagenomic sequences is the first step in metagenomic analysis. Existing taxonomic classification approaches are of two types, similarity-based and composition-based. Similarity-based approaches, though accurate and specific, are extremely slow. Since metagenomic projects generate millions of sequences, adopting similarity-based approaches becomes virtually infeasible for research groups having modest computational resources. In this study, we present ...

  3. Breast tomosynthesis and digital mammography: a comparison of breast cancer visibility and BIRADS classification in a population of cancers with subtle mammographic findings

    International Nuclear Information System (INIS)

    The main purpose was to compare breast cancer visibility in one-view breast tomosynthesis (BT) to cancer visibility in one- or two-view digital mammography (DM). Thirty-six patients were selected on the basis of subtle signs of breast cancer on DM. One-view BT was performed with the same compression angle as the DM image in which the finding was least/not visible. On BT, 25 projection images were acquired over an angular range of 50 degrees, with double the dose of one-view DM. Two expert breast imagers classified one- and two-view DM and BT findings for cancer visibility and BIRADS cancer probability in a non-blinded consensus study. Forty breast cancers were found in 37 breasts. The cancers were rated more visible on BT compared to one-view and two-view DM in 22 and 11 cases, respectively (p<0.01 for both comparisons). Comparing one-view DM to one-view BT, 21 patients were upgraded on BIRADS classification (p<0.01). Comparing two-view DM to one-view BT, 12 patients were upgraded on BIRADS classification (p<0.01). The results indicate that the cancer visibility on BT is superior to DM, which suggests that BT may have a higher sensitivity for breast cancer detection. (orig.)

  4. Classification and Identification of Over-voltage Based on HHT and SVM

    Institute of Scientific and Technical Information of China (English)

    WANG Jing; YANG Qing; CHEN Lin; SIMA Wenxia

    2012-01-01

    This paper proposes an effective method for over-voltage classification based on the Hilbert-Huang transform (HHT). The HHT method is composed of empirical mode decomposition (EMD) and the Hilbert transform. Nine kinds of common power system over-voltages are calculated and analyzed by HHT. Based on the instantaneous amplitude spectrum, Hilbert marginal spectrum and Hilbert time-frequency spectrum, three kinds of over-voltage characteristic quantities are obtained. A hierarchical classification system is built based on HHT and support vector machine (SVM). This classification system is tested on 106 field over-voltage signals, and the average classification rate is 94.3%. This research shows that HHT is an effective time-frequency analysis algorithm for over-voltage classification and identification.

  5. 78 FR 18252 - Prevailing Rate Systems; North American Industry Classification System Based Federal Wage System...

    Science.gov (United States)

    2013-03-26

    ... Industry Classification System Based Federal Wage System Wage Surveys AGENCY: U. S. Office of Personnel... is issuing a proposed rule that would update the 2007 North American Industry Classification System..., the U.S. Office of Personnel Management (OPM) issued a final rule (73 FR 45853) to update the...

  6. 78 FR 58153 - Prevailing Rate Systems; North American Industry Classification System Based Federal Wage System...

    Science.gov (United States)

    2013-09-23

    ... RIN 3206-AM78 Prevailing Rate Systems; North American Industry Classification System Based Federal... Industry Classification System (NAICS) codes currently used in Federal Wage System wage survey industry..., 2013, the U.S. Office of Personnel Management (OPM) issued a proposed rule (78 FR 18252) to update...

  7. Multi-label literature classification based on the Gene Ontology graph

    Directory of Open Access Journals (Sweden)

    Lu Xinghua

    2008-12-01

    Full Text Available Abstract Background The Gene Ontology is a controlled vocabulary for representing knowledge related to genes and proteins in a computable form. The current effort of manually annotating proteins with the Gene Ontology is outpaced by the rate of accumulation of biomedical knowledge in literature, which urges the development of text mining approaches to facilitate the process by automatically extracting the Gene Ontology annotation from literature. The task is usually cast as a text classification problem, and contemporary methods are confronted with unbalanced training data and the difficulties associated with multi-label classification. Results In this research, we investigated the methods of enhancing automatic multi-label classification of biomedical literature by utilizing the structure of the Gene Ontology graph. We have studied three graph-based multi-label classification algorithms, including a novel stochastic algorithm and two top-down hierarchical classification methods for multi-label literature classification. We systematically evaluated and compared these graph-based classification algorithms to a conventional flat multi-label algorithm. The results indicate that, through utilizing the information from the structure of the Gene Ontology graph, the graph-based multi-label classification methods can significantly improve predictions of the Gene Ontology terms implied by the analyzed text. Furthermore, the graph-based multi-label classifiers are capable of suggesting Gene Ontology annotations (to curators) that are closely related to the true annotations even if they fail to predict the true ones directly. A software package implementing the studied algorithms is available for the research community. Conclusion Through utilizing the information from the structure of the Gene Ontology graph, the graph-based multi-label classification methods have better potential than the conventional flat multi-label classification approach to facilitate

  8. SPAM CLASSIFICATION BASED ON SUPERVISED LEARNING USING MACHINE LEARNING TECHNIQUES

    Directory of Open Access Journals (Sweden)

    T. Hamsapriya

    2011-12-01

    Full Text Available E-mail is one of the most popular and frequently used means of communication because of its worldwide accessibility, relatively fast message transfer, and low sending cost. Flaws in the e-mail protocols and the growing volume of electronic business and financial transactions contribute directly to the increase in e-mail-based threats. E-mail spam is one of the major problems of today’s Internet, causing financial damage to companies and annoying individual users. Spam e-mails reach users without their consent and fill their mailboxes. They consume network capacity as well as the time spent checking and deleting them. The vast majority of Internet users are outspoken in their disdain for spam, although enough of them respond to commercial offers that spam remains a viable source of income for spammers. While most users want to do the right thing to avoid and get rid of spam, they need clear and simple guidelines on how to behave. In spite of all the measures taken to eliminate spam, it has not yet been eradicated. Moreover, when countermeasures are oversensitive, even legitimate e-mails are eliminated. Among the approaches developed to stop spam, filtering is one of the most important techniques. Much research in spam filtering has centered on the more sophisticated classifier-related issues. Recently, machine learning for spam classification has become an important research issue. This work explores and identifies the use of different learning algorithms for classifying spam messages from e-mail. A comparative analysis among the algorithms is also presented.

  9. Algebraic classification of higher dimensional spacetimes based on null alignment

    CERN Document Server

    Ortaggio, Marcello; Pravdova, Alena

    2012-01-01

    We review recent developments and applications of the classification of the Weyl tensor in higher dimensional Lorentzian geometries. First, we discuss the general setup, i.e. main definitions and methods for the classification, some refinements and the generalized Newman-Penrose and Geroch-Held-Penrose formalisms. Next, we summarize general results, such as a partial extension of the Goldberg-Sachs theorem, characterization of spacetimes with vanishing (or constant) curvature invariants and the peeling behaviour in asymptotically flat spacetimes. Finally, we discuss certain invariantly defined families of metrics and their relation with the Weyl tensor classification, including: Kundt and Robinson-Trautman spacetimes; the Kerr-Schild ansatz in a constant-curvature background; purely electric and purely magnetic spacetimes; direct and (some) warped products; and geometries with certain symmetries. To conclude, some applications to quadratic gravity are also overviewed.

  10. A method for cloud detection and opacity classification based on ground based sky imagery

    Directory of Open Access Journals (Sweden)

    M. S. Ghonima

    2012-11-01

    Full Text Available Digital images of the sky obtained using a total sky imager (TSI) are classified pixel by pixel into clear sky, optically thin and optically thick clouds. A new classification algorithm was developed that compares the pixel red-blue ratio (RBR) to the RBR of a clear sky library (CSL) generated from images captured on clear days. The difference, rather than the ratio, between pixel RBR and CSL RBR resulted in more accurate cloud classification. High correlation between TSI image RBR and aerosol optical depth (AOD) measured by an AERONET photometer was observed and motivated the addition of a haze correction factor (HCF) to the classification model to account for variations in AOD. Thresholds for clear and thick clouds were chosen based on a training image set and validated with a set of manually annotated images. Misclassifications of clear and thick clouds into the opposite category were less than 1%. Thin clouds were classified with an accuracy of 60%. Accurate cloud detection and opacity classification techniques will improve the accuracy of short-term solar power forecasting.
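    The per-pixel rule described above, which thresholds the difference between a pixel's red-blue ratio and the clear sky library value, might look roughly like the following. The threshold values and the function name `classify_pixel` are hypothetical placeholders (the paper derives its thresholds from a training image set), and the haze correction factor is not modelled here.

```python
def classify_pixel(red, blue, clear_sky_rbr, thin_threshold=0.05, thick_threshold=0.25):
    """Label one sky pixel as 'clear', 'thin' or 'thick'.

    red, blue       -- pixel channel intensities
    clear_sky_rbr   -- red-blue ratio of the matching clear sky library pixel
    thin_threshold, thick_threshold -- assumed cut-offs on the RBR difference
    """
    if blue == 0:
        # A saturated or invalid pixel; treated as thick cloud in this sketch.
        return "thick"
    difference = red / blue - clear_sky_rbr
    if difference < thin_threshold:
        return "clear"
    if difference < thick_threshold:
        return "thin"
    return "thick"
```

    Using the difference rather than the ratio follows the abstract's observation that the difference gave more accurate classification.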

  11. A method for cloud detection and opacity classification based on ground based sky imagery

    Directory of Open Access Journals (Sweden)

    M. S. Ghonima

    2012-07-01

    Full Text Available Digital images of the sky obtained using a total sky imager (TSI) are classified pixel by pixel into clear sky, optically thin and optically thick clouds. A new classification algorithm was developed that compares the pixel red-blue ratio (RBR) to the RBR of a clear sky library (CSL) generated from images captured on clear days. The difference, rather than the ratio, between pixel RBR and CSL RBR resulted in more accurate cloud classification. High correlation between TSI image RBR and aerosol optical depth (AOD) measured by an AERONET photometer was observed and motivated the addition of a haze correction factor (HCF) to the classification model to account for variations in AOD. Thresholds for clear and thick clouds were chosen based on a training image set and validated with a set of manually annotated images. Misclassifications of clear and thick clouds into the opposite category were less than 1%. Thin clouds were classified with an accuracy of 60%. Accurate cloud detection and opacity classification techniques will improve the accuracy of short-term solar power forecasting.

  12. Gender Classification Method Based on Gait Energy Motion Derived from Silhouette Through Wavelet Analysis of Human Gait Moving Pictures

    OpenAIRE

    Kohei Arai; Rosa Andrie Asmara

    2014-01-01

    A gender classification method based on Gait Energy Motion (GEM), derived through wavelet analysis of human gait moving pictures, is proposed. Experiments with human gait moving pictures show that wavelet-coefficient features extracted from silhouette images are useful for improving gender classification accuracy. The proposed method also shows the best classification performance, with a correct classification ratio of 97.63%.

  13. Gender Classification Method Based on Gait Energy Motion Derived from Silhouette Through Wavelet Analysis of Human Gait Moving Pictures

    Directory of Open Access Journals (Sweden)

    Kohei Arai

    2014-02-01

    Full Text Available A gender classification method based on Gait Energy Motion (GEM), derived through wavelet analysis of human gait moving pictures, is proposed. Experiments with human gait moving pictures show that wavelet-coefficient features extracted from silhouette images are useful for improving gender classification accuracy. The proposed method also shows the best classification performance, with a correct classification ratio of 97.63%.

  14. A Multi-Label Classification Approach Based on Correlations Among Labels

    Directory of Open Access Journals (Sweden)

    Raed Alazaidah

    2015-02-01

    Full Text Available Multi-label classification is concerned with learning from a set of instances that are associated with a set of labels; that is, an instance can be associated with multiple labels at the same time. This task occurs frequently in application areas such as text categorization, multimedia classification, bioinformatics, protein function classification and semantic scene classification. Current multi-label classification methods can be divided into two categories. The first, problem transformation methods, transform a multi-label classification problem into a single-label classification problem and then apply any single-label classifier to solve it. The second, algorithm adaptation methods, adapt an existing single-label classification algorithm to handle multi-label data. In this paper, we propose a multi-label classification approach based on correlations among labels that uses both problem transformation and algorithm adaptation methods. The approach begins by transforming the multi-label dataset into a single-label dataset using the least frequent label criterion, and then applies the PART algorithm to the transformed dataset. The output of the approach is a set of multi-label rules. The approach also tries to benefit from positive correlations among labels using the predictive Apriori algorithm. The proposed approach has been evaluated using two multi-label datasets (Emotions and Yeast) and three evaluation measures (Accuracy, Hamming Loss, and Harmonic Mean). The experiments showed that the proposed approach has fair accuracy in comparison to other related methods.
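
    The transformation step mentioned in this abstract can be illustrated with a short sketch. It shows one plausible reading of the least frequent label criterion, in which each instance keeps only the label that is rarest across the dataset. The function name, the alphabetical tie-break and the toy label sets are assumptions, and the PART and predictive Apriori stages are not shown.

```python
from collections import Counter


def to_single_label(label_sets):
    """Map each instance's label set to its least frequent label.

    label_sets -- list of non-empty sets of label names, one set per instance.
    Ties are broken alphabetically (an assumption made here for determinism).
    """
    counts = Counter(label for labels in label_sets for label in labels)
    return [min(labels, key=lambda name: (counts[name], name)) for labels in label_sets]


if __name__ == "__main__":
    toy = [{"happy", "calm"}, {"happy"}, {"happy", "sad"}, {"calm", "sad"}]
    print(to_single_label(toy))
```

    Keeping the rarest label gives infrequent classes more training instances than a most-frequent-label rule would, which may be one motivation for the criterion.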

  15. Seafloor Sediment Classification Based on Multibeam Sonar Data

    Institute of Scientific and Technical Information of China (English)

    ZHOU Xinghua; CHEN Yongqi

    2004-01-01

    Multibeam sonars can provide hydrographic-quality depth data and also hold the potential to provide calibrated measurements of the seafloor acoustic backscattering strength. There has been much interest in utilizing backscatter and images from multibeam sonar for seabed type identification, and many results have been obtained. This paper presents a focused review of several main methods and recent developments in seafloor classification utilizing multibeam sonar data and/or images. These include power spectral analysis methods, texture analysis, traditional Bayesian classification theory and the most active neural network approaches.

  16. Woven fabric defects detection based on texture classification algorithm

    International Nuclear Information System (INIS)

    In this paper we compare two well-known texture classification methods for recognizing and classifying defects that occur in textile manufacturing: the local binary patterns (LBP) method and the co-occurrence matrix. The classifier used is the support vector machine (SVM). The system has been tested using the TILDA database. The results show that LBP is a good method for defect recognition and classification and that it gives a good running time, especially for real-time applications.
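
    For readers unfamiliar with local binary patterns, the basic 8-neighbour operator can be sketched as below. This is a generic textbook form, not necessarily the variant used in the paper; the neighbour ordering, the function names and the sample image are assumptions.

```python
# Clockwise neighbour offsets starting at the top-left pixel.
# The starting point and direction are a convention assumed here.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, row, col):
    """Return the 8-bit LBP code of an interior pixel of a 2-D list of intensities."""
    center = image[row][col]
    code = 0
    for bit, (d_row, d_col) in enumerate(OFFSETS):
        # A neighbour at least as bright as the centre sets its bit.
        if image[row + d_row][col + d_col] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Histogram of LBP codes over all interior pixels (the texture feature vector)."""
    hist = [0] * 256
    for row in range(1, len(image) - 1):
        for col in range(1, len(image[0]) - 1):
            hist[lbp_code(image, row, col)] += 1
    return hist


if __name__ == "__main__":
    patch = [[1, 2, 3],
             [8, 5, 4],
             [7, 6, 9]]
    print(lbp_code(patch, 1, 1))
```

    The operator needs only comparisons and bit shifts per pixel, which is consistent with the running-time advantage the abstract mentions.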

  17. Topic Modelling for Object-Based Classification of Vhr Satellite Images Based on Multiscale Segmentations

    Science.gov (United States)

    Shen, Li; Wu, Linmei; Li, Zhipeng

    2016-06-01

    Multiscale segmentation is a key prerequisite step for object-based classification methods. However, it is often not possible to determine a sole optimal scale for the image to be classified because in many cases different geo-objects and even an identical geo-object may appear at different scales in one image. In this paper, an object-based classification method based on multiscale segmentation results in the framework of topic modelling is proposed to classify VHR satellite images in an entirely unsupervised fashion. In the stage of topic modelling, grayscale histogram distributions for each geo-object class and each segment are learned in an unsupervised manner from multiscale segments. In the stage of classification, each segment is allocated a geo-object class label by the similarity comparison between the grayscale histogram distributions of each segment and each geo-object class. Experimental results show that the proposed method can perform better than the traditional methods based on topic modelling.

  18. Dendritic cell-based cancer immunotherapy for colorectal cancer

    Science.gov (United States)

    Kajihara, Mikio; Takakura, Kazuki; Kanai, Tomoya; Ito, Zensho; Saito, Keisuke; Takami, Shinichiro; Shimodaira, Shigetaka; Okamoto, Masato; Ohkusa, Toshifumi; Koido, Shigeo

    2016-01-01

    Colorectal cancer (CRC) is one of the most common cancers and a leading cause of cancer-related mortality worldwide. Although systemic therapy is the standard care for patients with recurrent or metastatic CRC, the prognosis is extremely poor. The optimal sequence of therapy remains unknown. Therefore, alternative strategies, such as immunotherapy, are needed for patients with advanced CRC. This review summarizes evidence from dendritic cell-based cancer immunotherapy strategies that are currently in clinical trials. In addition, we discuss the possibility of antitumor immune responses through immunoinhibitory PD-1/PD-L1 pathway blockade in CRC patients. PMID:27158196

  19. Segmentation-Based PolSAR Image Classification Using Visual Features: RHLBP and Color Features

    Directory of Open Access Journals (Sweden)

    Jian Cheng

    2015-05-01

    Full Text Available A segmentation-based fully-polarimetric synthetic aperture radar (PolSAR) image classification method that incorporates texture features and color features is designed and implemented. This method is based on the framework that conjunctively uses statistical region merging (SRM) for segmentation and support vector machine (SVM) for classification. In the segmentation step, we propose an improved local binary pattern (LBP) operator named the regional homogeneity local binary pattern (RHLBP) to guarantee the regional homogeneity in PolSAR images. In the classification step, the color features extracted from false color images are applied to improve the classification accuracy. The RHLBP operator and color features can provide discriminative information to separate those pixels and regions with similar polarimetric features, which are from different classes. Extensive experimental comparison results with conventional methods on L-band PolSAR data demonstrate the effectiveness of our proposed method for PolSAR image classification.

  20. Open source, web-based machine-learning assisted classification system

    OpenAIRE

    Consarnau Pallarés, Mireia Roser

    2016-01-01

    The aim of this article is to provide a design overview of the web based machine learning assisted multi-user classification system. The design is based on open source standards both for multi-user environment written in PHP using the Laravel framework and a Python based machine learning toolkit, Scikit-Learn. The advantage of the proposed system is that it does not require the domain specific knowledge or programming skills. Machine learning classification tasks are done on the background...

  1. A kernel-based multivariate feature selection method for microarray data classification.

    Directory of Open Access Journals (Sweden)

    Shiquan Sun

    Full Text Available High dimensionality and small sample sizes, and their inherent risk of overfitting, pose great challenges for constructing efficient classifiers in microarray data classification. Therefore a feature selection technique should be conducted prior to data classification to enhance prediction performance. In general, filter methods can be considered as principal or auxiliary selection mechanisms because of their simplicity, scalability, and low computational complexity. However, a series of trivial examples show that filter methods result in less accurate performance because they ignore the dependencies of features. Although a few publications have devoted their attention to revealing the relationships among features by multivariate-based methods, these methods describe relationships among features only by linear methods, and simple linear combination relationships restrict the improvement in performance. In this paper, we used a kernel method to discover inherent nonlinear correlations among features as well as between features and the target. Moreover, the number of orthogonal components was determined by kernel Fisher's linear discriminant analysis (FLDA) in a self-adaptive manner rather than by manual parameter settings. To demonstrate the effectiveness of our method, we performed several experiments and compared the results of our method with those of other competitive multivariate-based feature selectors. In our comparison, we used two classifiers (support vector machine and k-nearest neighbor) on two groups of datasets, namely two-class and multi-class datasets. Experimental results demonstrate that the performance of our method is better than that of the others, especially on three hard-to-classify datasets, namely Wang's Breast Cancer, Gordon's Lung Adenocarcinoma and Pomeroy's Medulloblastoma.

  2. Analysis of uncertainty in multi-temporal object-based classification

    Science.gov (United States)

    Löw, Fabian; Knöfel, Patrick; Conrad, Christopher

    2015-07-01

    Agricultural management increasingly uses crop maps based on classification of remotely sensed data. However, classification errors can translate to errors in model outputs, for instance agricultural production monitoring (yield, water demand) or crop acreage calculation. Hence, knowledge of the spatial variability of the classifier performance is important information for the user. But this is not provided by traditional assessments of accuracy, which are based on the confusion matrix. In this study, classification uncertainty was analyzed, based on the support vector machines (SVM) algorithm. SVM was applied to multi-spectral time series data of RapidEye from different agricultural landscapes and years. Entropy was calculated as a measure of classification uncertainty, based on the per-object class membership estimations from the SVM algorithm. Permuting all possible combinations of available images allowed investigating the impact of the image acquisition frequency and timing, respectively, on the classification uncertainty. Results show that multi-temporal datasets decrease classification uncertainty for different crops compared to single data sets, but there was no "one-image-combination-fits-all" solution. The number and acquisition timing of the images, for which a decrease in uncertainty could be realized, proved to be specific to a given landscape, and for each crop they differed across different landscapes. For some crops, an increase of uncertainty was observed when increasing the quantity of images, even if classification accuracy was improved. Random forest regression was employed to investigate the impact of different explanatory variables on the observed spatial pattern of classification uncertainty. It was strongly influenced by factors related with the agricultural management and training sample density. Lower uncertainties were revealed for fields close to rivers or irrigation canals.
This study demonstrates that classification uncertainty estimates
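
    The entropy measure named in this abstract can be illustrated with a minimal sketch. It assumes Shannon entropy in base 2 computed over one object's class membership estimates; the abstract does not state the logarithm base or any normalisation, and the function name and sample values are hypothetical.

```python
import math


def classification_entropy(memberships):
    """Shannon entropy (in bits) of one object's class membership estimates.

    memberships -- probabilities over the classes, expected to sum to 1.
    Zero entries are skipped because p * log(p) tends to 0 as p tends to 0.
    """
    return -sum(p * math.log2(p) for p in memberships if p > 0)


if __name__ == "__main__":
    for probs in ([1.0, 0.0], [0.5, 0.5], [0.25, 0.25, 0.25, 0.25]):
        print(probs, classification_entropy(probs))
```

    An object assigned entirely to one class has zero entropy, while evenly spread memberships give the maximum, so the value can be mapped per object to show where the classifier is least certain.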

  3. Hydrologic-Process-Based Soil Texture Classifications for Improved Visualization of Landscape Function.

    Directory of Open Access Journals (Sweden)

    Derek G Groenendyk

    Full Text Available Soils lie at the interface between the atmosphere and the subsurface and are a key component that controls ecosystem services, food production, and many other processes at the Earth's surface. There is a long-established convention for identifying and mapping soils by texture. These readily available, georeferenced soil maps and databases are used widely in environmental sciences. Here, we show that these traditional soil classifications can be inappropriate, contributing to bias and uncertainty in applications from slope stability to water resource management. We suggest a new approach to soil classification, with a detailed example from the science of hydrology. Hydrologic simulations based on common meteorological conditions were performed using HYDRUS-1D, spanning textures identified by the United States Department of Agriculture soil texture triangle. We consider these common conditions to be: drainage from saturation, infiltration onto a drained soil, and combined infiltration and drainage events. Using a k-means clustering algorithm, we created soil classifications based on the modeled hydrologic responses of these soils. The hydrologic-process-based classifications were compared to those based on soil texture and a single hydraulic property, Ks. Differences in classifications based on hydrologic response versus soil texture demonstrate that traditional soil texture classification is a poor predictor of hydrologic response. We then developed a QGIS plugin to construct soil maps combining a classification with georeferenced soil data from the Natural Resources Conservation Service. The spatial patterns of hydrologic response were more immediately informative, much simpler, and less ambiguous, for use in applications ranging from trafficability to irrigation management to flood control.
The ease with which hydrologic-process-based classifications can be made, along with the improved quantitative predictions of soil responses and visualization

  4. Hydrologic-Process-Based Soil Texture Classifications for Improved Visualization of Landscape Function.

    Science.gov (United States)

    Groenendyk, Derek G; Ferré, Ty P A; Thorp, Kelly R; Rice, Amy K

    2015-01-01

    Soils lie at the interface between the atmosphere and the subsurface and are a key component that controls ecosystem services, food production, and many other processes at the Earth's surface. There is a long-established convention for identifying and mapping soils by texture. These readily available, georeferenced soil maps and databases are used widely in environmental sciences. Here, we show that these traditional soil classifications can be inappropriate, contributing to bias and uncertainty in applications from slope stability to water resource management. We suggest a new approach to soil classification, with a detailed example from the science of hydrology. Hydrologic simulations based on common meteorological conditions were performed using HYDRUS-1D, spanning textures identified by the United States Department of Agriculture soil texture triangle. We consider these common conditions to be: drainage from saturation, infiltration onto a drained soil, and combined infiltration and drainage events. Using a k-means clustering algorithm, we created soil classifications based on the modeled hydrologic responses of these soils. The hydrologic-process-based classifications were compared to those based on soil texture and a single hydraulic property, Ks. Differences in classifications based on hydrologic response versus soil texture demonstrate that traditional soil texture classification is a poor predictor of hydrologic response. We then developed a QGIS plugin to construct soil maps combining a classification with georeferenced soil data from the Natural Resources Conservation Service. The spatial patterns of hydrologic response were more immediately informative, much simpler, and less ambiguous, for use in applications ranging from trafficability to irrigation management to flood control. The ease with which hydrologic-process-based classifications can be made, along with the improved quantitative predictions of soil responses and visualization of landscape

  5. Emotion of Physiological Signals Classification Based on TS Feature Selection

    Institute of Scientific and Technical Information of China (English)

    Wang Yujing; Mo Jianlin

    2015-01-01

    This paper proposes TS-MLP, a method for emotion recognition from physiological signals. It recognizes emotion by using tabu search (TS) to select features of the emotion-related physiological signals and a multilayer perceptron (MLP) to classify the emotion. Simulation shows that it achieves good emotion classification performance.

  6. Optimal query-based relevance feedback in medical image retrieval using score fusion-based classification.

    Science.gov (United States)

    Behnam, Mohammad; Pourghassem, Hossein

    2015-04-01

    In this paper, a new content-based medical image retrieval (CBMIR) framework using an effective classification method and a novel relevance feedback (RF) approach is proposed. For a large-scale database with a diverse collection of different modalities, query image classification is necessary, first to reduce the computational complexity and second to increase the influence of data fusion by removing unimportant data and focusing on the more valuable information. Hence, we find the probability distribution of classes in the database using a Gaussian mixture model (GMM) for each feature descriptor and then, by fusing the scores obtained from the dependency probabilities, identify the most relevant clusters for a given query. Afterwards, the visual similarity between the query image and the images in the relevant clusters is calculated. This method is performed separately on all feature descriptors, and the results are then fused together using a feature similarity ranking level fusion algorithm. At the RF level, we propose a new approach to find the optimal queries based on relevant images. The main idea is based on density function estimation of positive images and a strategy of moving toward the aggregation of the estimated density function. The proposed framework has been evaluated on the ImageCLEF 2005 database consisting of 10,000 medical X-ray images of 57 semantic classes. The experimental results show that, compared with existing CBMIR systems, our framework obtains acceptable performance both in image classification and in image retrieval by RF. PMID:25246167

  7. Classification of idiopathic toe walking based on gait analysis: development and application of the ITW severity classification.

    Science.gov (United States)

    Alvarez, Christine; De Vera, Mary; Beauchamp, Richard; Ward, Valerie; Black, Alec

    2007-09-01

    Idiopathic toe walking (ITW), considered abnormal after the age of 3 years, is a common complaint seen by medical professionals, especially orthopaedic surgeons and physiotherapists. A classification for idiopathic toe walking would be helpful to better understand the condition, delineate true idiopathic toe walkers from patients with other conditions, and allow for assignment of a severity gradation, thereby directing management of ITW. The purpose of this study was to describe idiopathic toe walking and develop a toe walking classification scheme in a large sample of children. Three primary criteria, presence of a first ankle rocker, presence of an early third ankle rocker, and predominant early ankle moment, were used to classify idiopathic toe walking into three severity groups: Type 1 mild; Type 2 moderate; and Type 3 severe. Supporting data, based on ankle range of motion, sagittal joint powers, knee kinematics, and EMG data were also analyzed. Prospectively collected gait analysis data of 133 children (266 feet) with idiopathic toe walking were analyzed. Subjects' age range was from 4.19 to 15.96 years with a mean age of 8.80 years. Pooling right and left foot data, 40 feet were classified as Type 1, 129 were classified as Type 2, and 90 were classified as Type 3. Seven feet were unclassifiable. Statistical analysis of continuous variables comprising the primary criteria showed that the toe walking severity classification was able to differentiate between three levels of toe walking severity. This classification allowed for the quantitative description of the idiopathic toe walking pattern as well as the delineation of three distinct types of ITW patients (mild, moderate, and severe). PMID:17161602

  8. Classification of weld defect based on information fusion technology for radiographic testing system

    Science.gov (United States)

    Jiang, Hongquan; Liang, Zeming; Gao, Jianmin; Dang, Changying

    2016-03-01

    Improving the efficiency and accuracy of weld defect classification is an important technical problem in developing the radiographic testing system. This paper proposes a novel weld defect classification method based on information fusion technology, Dempster-Shafer evidence theory. First, to characterize weld defects and improve the accuracy of their classification, 11 weld defect features were defined based on the sub-pixel level edges of radiographic images, four of which are presented for the first time in this paper. Second, we applied information fusion technology to combine different features for weld defect classification, including a mass function defined based on the weld defect feature information and the quartile-method-based calculation of standard weld defect class which is to solve a sample problem involving a limited number of training samples. A steam turbine weld defect classification case study is also presented herein to illustrate our technique. The results show that the proposed method can increase the correct classification rate with limited training samples and address the uncertainties associated with weld defect classification.
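
    The fusion step in Dempster-Shafer evidence theory rests on Dempster's rule of combination, which can be sketched as follows. The defect names and mass values are made up for illustration; the paper derives its own mass functions from the weld defect features, and that derivation is not reproduced here.

```python
def combine(mass_a, mass_b):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to its belief mass.
    Mass falling on an empty intersection is conflict and is normalised away.
    """
    combined = {}
    conflict = 0.0
    for set_a, weight_a in mass_a.items():
        for set_b, weight_b in mass_b.items():
            overlap = set_a & set_b
            if overlap:
                combined[overlap] = combined.get(overlap, 0.0) + weight_a * weight_b
            else:
                conflict += weight_a * weight_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict and cannot be combined")
    return {subset: weight / (1.0 - conflict) for subset, weight in combined.items()}


if __name__ == "__main__":
    crack, pore = frozenset({"crack"}), frozenset({"pore"})
    either = crack | pore                      # ignorance: could be either defect
    feature_1 = {crack: 0.6, either: 0.4}      # hypothetical evidence from one feature
    feature_2 = {pore: 0.5, either: 0.5}       # hypothetical evidence from another
    for subset, weight in combine(feature_1, feature_2).items():
        print(sorted(subset), round(weight, 4))
```

    Assigning part of each source's mass to the whole frame lets a feature express ignorance rather than forcing a choice, which is one way such a scheme can cope with limited training samples.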

  9. Drug related webpages classification using images and text information based on multi-kernel learning

    Science.gov (United States)

    Hu, Ruiguang; Xiao, Liping; Zheng, Wenjuan

    2015-12-01

    In this paper, multi-kernel learning (MKL) is used for drug-related webpage classification. First, body text and image-label text are extracted through HTML parsing, and valid images are chosen by the FOCARSS algorithm. Second, a text-based BOW model is used to generate the text representation, and an image-based BOW model is used to generate the image representation. Last, the text and image representations are fused with several methods. Experimental results demonstrate that the classification accuracy of MKL is higher than those of all other fusion methods at the decision level and feature level, and much higher than the accuracy of single-modal classification.

  10. Novel round-robin tabu search algorithm for prostate cancer classification and diagnosis using multispectral imagery.

    Science.gov (United States)

    Tahir, Muhammad Atif; Bouridane, Ahmed

    2006-10-01

    Quantitative cell imagery in cancer pathology has progressed greatly in the last 25 years. The application areas are mainly those in which the diagnosis is still critically reliant upon the analysis of biopsy samples, which remains the only conclusive method for making an accurate diagnosis of the disease. Biopsies are usually analyzed by a trained pathologist who, by analyzing the biopsies under a microscope, assesses the normality or malignancy of the samples submitted. Different grades of malignancy correspond to different structural patterns as well as to apparent textures. In the case of prostate cancer, four major groups have to be recognized: stroma, benign prostatic hyperplasia, prostatic intraepithelial neoplasia, and prostatic carcinoma. Recently, multispectral imagery has been used to solve this multiclass problem. Unlike conventional RGB color space, multispectral images allow the acquisition of a large number of spectral bands within the visible spectrum, resulting in a large feature vector size. For such a high dimensionality, pattern recognition techniques suffer from the well-known "curse-of-dimensionality" problem. This paper proposes a novel round-robin tabu search (RR-TS) algorithm to address the curse-of-dimensionality for this multiclass problem. The experiments have been carried out on a number of prostate cancer textured multispectral images, and the results obtained have been assessed and compared with previously reported works. The system achieved 98%-100% classification accuracy when testing on two datasets. It outperformed principal component/linear discriminant classifier (PCA-LDA), tabu search/nearest neighbor classifier (TS-1NN), and bagging/boosting with decision tree (C4.5) classifier. PMID:17044412

  11. A HowNet-Based Semantic Relatedness Kernel for Text Classification

    Directory of Open Access Journals (Sweden)

    Pei-Ying Zhang

    2013-04-01

    Full Text Available The exploitation of the semantic relatedness kernel has always been an appealing subject in the context of text retrieval and information management. Typically, in text classification the documents are represented in the vector space using the bag-of-words (BOW) approach. The BOW approach does not take into account the semantic relatedness information. To further improve text classification performance, this paper presents a new semantic-based kernel for the support vector machine algorithm for text classification. The method first uses the CHI method to select document feature vectors, then calculates the feature vector weights using the TF-IDF method, and finally utilizes the semantic relatedness kernel, which involves semantic similarity computation and semantic relevance computation, to classify the documents using support vector machines. Experimental results show that, compared with the traditional support vector machine algorithm, the proposed algorithm achieves an improved classification F1-measure in text classification.

  12. Machine Fault Classification Based on Local Discriminant Bases and Locality Preserving Projections

    Directory of Open Access Journals (Sweden)

    Qingbo He

    2014-01-01

    Full Text Available Machine fault classification is an important task for intelligent identification of the health patterns for a mechanical system being monitored. Effective feature extraction of vibration data is very critical to reliable classification of machine faults with different types and severities. In this paper, a new method is proposed to acquire the sensitive features through a combination of local discriminant bases (LDB) and locality preserving projections (LPP). In the method, the LDB is employed to select the optimal wavelet packet (WP) nodes that exhibit high discrimination from a redundant WP library of the wavelet packet transform (WPT). Considering that the discriminatory features obtained on these selected nodes characterize the class pattern with different sensitivity, the LPP is then applied to mine the inherent class pattern feature embedded in the raw features. The proposed feature extraction method combines the merits of LDB and LPP and extracts the inherent pattern structure embedded in the discriminatory feature values of samples in different classes. Therefore, the proposed feature not only considers the discriminatory features themselves but also considers the dynamic sensitive class pattern structure. The effectiveness of the proposed feature is verified by case studies on vibration data-based classification of bearing fault types and severities.

  13. Spectral Collaborative Representation based Classification for Hand Gestures recognition on Electromyography Signals

    OpenAIRE

    Boyali, Ali

    2015-01-01

    In this study, we introduce a novel variant and application of the Collaborative Representation based Classification in the spectral domain for recognition of hand gestures using raw surface Electromyography signals. The intuitive use of spectral features is explained via circulant matrices. The proposed Spectral Collaborative Representation based Classification (SCRC) is able to recognize gestures with higher levels of accuracy for a fairly rich gesture set. The worst recognition result...

  14. An Improved AIS Based E-mail Classification Technique for Spam Detection

    OpenAIRE

    Idris, Ismaila; Abdulhamid, Shafii Muhammad

    2014-01-01

    An improved email classification method based on the Artificial Immune System is proposed in this paper to develop an immune-based system that uses immune learning and immune memory to solve complex problems in spam detection. An optimized technique for e-mail classification is accomplished by distinguishing the characteristics of spam and non-spam that have been acquired from a trained data set. These extracted features of spam and non-spam are then combined to make a single detector, therefore re...

  15. Belief Function Based Decision Fusion for Decentralized Target Classification in Wireless Sensor Networks

    OpenAIRE

    Wenyu Zhang; Zhenjiang Zhang

    2015-01-01

    Decision fusion in sensor networks enables sensors to improve classification accuracy while reducing the energy consumption and bandwidth demand for data transmission. In this paper, we focus on the decentralized multi-class classification fusion problem in wireless sensor networks (WSNs) and a new simple but effective decision fusion rule based on belief function theory is proposed. Unlike existing belief function based decision fusion schemes, the proposed approach is compatible with any ty...

  16. Power Disturbances Classification Using S-Transform Based GA-PNN

    Science.gov (United States)

    Manimala, K.; Selvi, K.

    2015-09-01

    The significance of detecting and classifying power quality events that disturb the voltage and/or current waveforms in electrical power distribution networks is well known. Yet despite the large number of research reports in this area, the selection of proper parameters for specific classifiers has so far not been explored. Parameter selection is very important for successful modelling of the input-output relationship in a function approximation model. In this study, a probabilistic neural network (PNN) is used as a function approximation tool for power disturbance classification, and a genetic algorithm (GA) is utilised to optimise the smoothing parameter of the PNN. The important features extracted from the raw power disturbance signal using the S-Transform are given to the PNN for classification. The choice of smoothing parameter for the PNN classifier significantly affects the classification accuracy. Hence, GA-based parameter optimization is performed to ensure good classification accuracy by selecting a suitable parameter for the PNN classifier. Testing results show that the proposed S-Transform based GA-PNN model has better classification ability than classifiers that use the conventional grid search method for parameter selection. Noisy and practical signals are considered in the classification process to show the effectiveness of the proposed method in comparison with existing methods.
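    The role of the smoothing parameter can be illustrated with a minimal sketch of a PNN decision rule. This is not the authors' implementation: the toy feature vectors and class names are invented, and a plain scan over candidate values stands in for the genetic algorithm described in the abstract.

```python
import math

def pnn_classify(x, train, sigma):
    """Classify x with a probabilistic neural network (Parzen-window) rule.

    train maps each class label to a list of feature vectors;
    sigma is the smoothing parameter of the Gaussian kernel.
    """
    scores = {}
    for label, samples in train.items():
        total = 0.0
        for s in samples:
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, s))
            total += math.exp(-sq_dist / (2.0 * sigma ** 2))
        # Average kernel response = class-conditional density estimate
        scores[label] = total / len(samples)
    return max(scores, key=scores.get)

def accuracy(train, labelled, sigma):
    """Fraction of (vector, label) pairs classified correctly."""
    hits = sum(pnn_classify(x, train, sigma) == y for x, y in labelled)
    return hits / len(labelled)

# Hypothetical two-feature toy data (not from the paper)
train = {"sag": [(0.0, 0.1), (0.2, 0.0)], "swell": [(1.0, 1.1), (0.9, 1.0)]}
validation = [((0.1, 0.1), "sag"), ((1.0, 0.9), "swell")]

# Simple candidate scan; the paper optimises sigma with a GA instead
candidates = [0.05, 0.1, 0.5, 1.0]
best_sigma = max(candidates, key=lambda s: accuracy(train, validation, s))
```

    A very small sigma makes the classifier behave like nearest-neighbour matching, while a very large one blurs the classes together, which is why the parameter is worth optimising.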

  17. Tumor classification: molecular analysis meets Aristotle

    International Nuclear Information System (INIS)

    Traditionally, tumors have been classified by their morphologic appearances. Unfortunately, tumors with similar histologic features often follow different clinical courses or respond differently to chemotherapy. Limitations in the clinical utility of morphology-based tumor classifications have prompted a search for a new tumor classification based on molecular analysis. Gene expression array data and proteomic data from tumor samples will provide complex data that is unobtainable from morphologic examination alone. The growing question facing cancer researchers is, 'How can we successfully integrate the molecular, morphologic and clinical characteristics of human cancer to produce a helpful tumor classification?' Current efforts to classify cancers based on molecular features ignore lessons learned from millennia of experience in biological classification. A tumor classification must include every type of tumor and must provide a unique place for each tumor within the classification. Groups within a classification inherit the properties of their ancestors and impart properties to their descendants. A classification was prepared grouping tumors according to their histogenetic development. The classification is simple (reducing the complexity of information received from the molecular analysis of tumors), comprehensive (providing a place for every tumor of man), and consistent with recent attempts to characterize tumors by cytogenetic and molecular features. The clinical and research value of this historical approach to tumor classification is discussed. This manuscript reviews tumor classification and provides a new and comprehensive classification for neoplasia that preserves traditional nomenclature while incorporating information derived from the molecular analysis of tumors. The classification is provided as an open access XML document that can be used by cancer researchers to relate tumor classes with heterogeneous experimental and clinical tumor

  18. Cell-based therapy technology classifications and translational challenges.

    Science.gov (United States)

    Mount, Natalie M; Ward, Stephen J; Kefalas, Panos; Hyllner, Johan

    2015-10-19

    Cell therapies offer the promise of treating and altering the course of diseases which cannot be addressed adequately by existing pharmaceuticals. Cell therapies are a diverse group across cell types and therapeutic indications and have been an active area of research for many years but are now strongly emerging through translation and towards successful commercial development and patient access. In this article, we present a description of a classification of cell therapies on the basis of their underlying technologies rather than the more commonly used classification by cell type because the regulatory path and manufacturing solutions are often similar within a technology area due to the nature of the methods used. We analyse the progress of new cell therapies towards clinical translation, examine how they are addressing the clinical, regulatory, manufacturing and reimbursement requirements, describe some of the remaining challenges and provide perspectives on how the field may progress for the future. PMID:26416686

  19. Knowledge Based Pipeline Network Classification and Recognition Method of Maps

    Institute of Scientific and Technical Information of China (English)

    Liu Tongyu; Gu Shusheng

    2001-01-01

    Map recognition is an essential data input means of Geographic Information Systems (GIS). Solving the problems in this procedure, such as recognition of maps with crisscross pipeline networks, classification of buildings and roads, and processing of connected text, is a critical step for GIS to maintain high-speed development. In this paper, a new recognition method for pipeline maps is presented, and some common patterns of pipeline connections and component labels are established. Through pattern matching, pipelines and component labels are recognized and peeled off from maps. After this step, maps consist simply of buildings and roads, which are recognized and classified with a fuzzy classification method. In addition, the Double Sides Scan (DSS) technique is also described, through which the effect of connected text can be eliminated.

  1. Dissimilarity-based classification of anatomical tree structures

    DEFF Research Database (Denmark)

    Sørensen, Lauge Emil Borch Laurs; Lo, Pechin Chien Pau; Dirksen, Asger; Petersen, Jens; de Bruijne, Marleen

    2011-01-01

    between the branch feature vectors representing those trees. Hereby, localized information in the branches is collectively used in classification and variations in feature values across the tree are taken into account. An approximate anatomical correspondence between matched branches can be achieved by......, as well as anatomical features to characterize each branch, an area under the receiver operating characteristic curve of 0.912 is achieved. This is significantly better than computing the average WA%....

  2. Statistics-based classification on tissues in the mandibular region

    Czech Academy of Sciences Publication Activity Database

    Marcon, P.; Bartušek, Karel; Mikulka, J.; Gescheidtová, E.

    Brno : University of Technology, 2013, s. 624-627. ISBN 978-1-4799-0404-4. [International conference on telecommunications and signal processing /36./. Rome (IT), 02.07.2013-04.07.2013] R&D Projects: GA ČR GAP102/11/0318; GA ČR GAP102/12/1104 Institutional support: RVO:68081731 Keywords : classification * mandibular region * T1W * T2W Subject RIV: JA - Electronics ; Optoelectronics, Electrical Engineering

  3. Entropy-based Classification of 'Retweeting' Activity on Twitter

    OpenAIRE

    Ghosh, Rumi; Surachawala, Tawan; Lerman, Kristina

    2011-01-01

    Twitter is used for a variety of reasons, including information dissemination, marketing, political organizing and to spread propaganda, spamming, promotion, conversations, and so on. Characterizing these activities and categorizing associated user generated content is a challenging task. We present an information-theoretic approach to classification of user activity on Twitter. We focus on tweets that contain embedded URLs and study their collective 'retweeting' dynamics. We identify two feat...

  4. Fast rule-based bioactivity prediction using associative classification mining

    OpenAIRE

    Yu Pulan; Wild David J

    2012-01-01

    Relating chemical features to bioactivities is critical in molecular design and is used extensively in the lead discovery and optimization process. A variety of techniques from statistics, data mining and machine learning have been applied to this process. In this study, we utilize a collection of methods, called associative classification mining (ACM), which are popular in the data mining community, but so far have not been applied widely in cheminformatics. More specifically, class...

  5. Magnetic nanoparticle-based cancer therapy

    Institute of Scientific and Technical Information of China (English)

    Yu Jing; Huang Dong-Yan; Muhammad Zubair Yousaf; Hou Yang-Long; Gao Song

    2013-01-01

    Nanoparticles (NPs) with easily modified surfaces have been playing an important role in biomedicine. As cancer is one of the major causes of death, tremendous efforts have been devoted to advance the methods of cancer diagnosis and therapy. Recently, magnetic nanoparticles (MNPs) that are responsive to a magnetic field have shown great promise in cancer therapy. Compared with traditional cancer therapy, magnetic field triggered therapeutic approaches can treat cancer in an unconventional but more effective and safer way. In this review, we will discuss the recent progress in cancer therapies based on MNPs, mainly including magnetic hyperthermia, magnetic specific targeting, magnetically controlled drug delivery, magnetofection, and magnetic switches for controlling cell fate. Some recently developed strategies such as magnetic resonance imaging (MRI) monitoring cancer therapy and magnetic tissue engineering are also addressed.

  6. Statistical Redundancy Testing for Improved Gene Selection in Cancer Classification Using Microarray Data

    Directory of Open Access Journals (Sweden)

    J. Sunil Rao

    2007-01-01

    Full Text Available In gene selection for cancer classification using microarray data, we define an eigenvalue-ratio statistic to measure a gene's contribution to the joint discriminability when this gene is included in a set of genes. Based on this eigenvalue-ratio statistic, we define a novel hypothesis test for gene statistical redundancy and propose two gene selection methods. Simulation studies illustrate the agreement between statistical redundancy testing and the gene selection methods. Real data examples show that the proposed gene selection methods can select a compact gene subset that not only can be used to build high quality cancer classifiers but also shows biological relevance.

  7. Computer-aided decision-making for early detection of breast cancers using fuzzy classification

    International Nuclear Information System (INIS)

    The expert system DIDIMA was developed for timely detection of early-stage breast cancer in women; it runs on the HEWLETT-PACKARD 9845 B computer and is written in the BASIC programming language. The expert system is based on the theory of fuzzy sets. The parameters and structure of the fuzzy knowledge base are given. The system was introduced into practice, and its possibilities for diagnosis and therapy are described. (E.S.)

  8. Ship Classification with High Resolution TerraSAR-X Imagery Based on Analytic Hierarchy Process

    Directory of Open Access Journals (Sweden)

    Zhi Zhao

    2013-01-01

    Full Text Available Ship surveillance using space-borne synthetic aperture radar (SAR), taking advantage of high resolution over wide swaths and all-weather working capability, has attracted worldwide attention. Recent activity in this field has concentrated mainly on ship detection, while classification remains largely an open problem. In this paper, we propose a novel ship classification scheme based on the analytic hierarchy process (AHP) in order to achieve better performance. The main idea is to apply AHP to both feature selection and the classification decision. On one hand, the AHP based feature selection constructs a selection decision problem from several feature evaluation measures (e.g., discriminability, stability, and information measure) and provides objective criteria for making comprehensive, quantitative decisions about their combinations. On the other hand, we take the selected feature sets as the input of KNN classifiers and fuse the multiple classification results based on AHP, in which the feature sets' confidence is taken into account when the AHP based classification decision is made. We analyze the proposed classification scheme and demonstrate its results on a ship dataset that comes from TerraSAR-X SAR images.
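    As a rough illustration of the two AHP steps, the sketch below derives priority weights from a pairwise comparison matrix and uses them to fuse classifier outputs. It assumes the geometric-mean approximation of the AHP priority vector and a weighted vote for fusion; the comparison matrix and the ship class names are invented, and the paper's actual criteria and fusion rule may differ.

```python
import math

def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method.

    pairwise[i][j] states how strongly item i is preferred over item j,
    with pairwise[j][i] == 1 / pairwise[i][j].
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]

def fuse_votes(predictions, weights):
    """Weighted vote: each classifier's label counts with its AHP weight."""
    scores = {}
    for label, w in zip(predictions, weights):
        scores[label] = scores.get(label, 0.0) + w
    return max(scores, key=scores.get)

# Hypothetical comparison of three feature sets (perfectly consistent 4:2:1)
matrix = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(matrix)

# Hypothetical outputs of three KNN classifiers, one per feature set
fused = fuse_votes(["cargo", "tanker", "tanker"], weights)
```

    In this toy case the first feature set carries a weight of 4/7, so its single vote outweighs the two lower-confidence votes, which is the behaviour a plain majority vote cannot express.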

  9. Virtual images inspired consolidate collaborative representation-based classification method for face recognition

    Science.gov (United States)

    Liu, Shigang; Zhang, Xinxin; Peng, Yali; Cao, Han

    2016-07-01

    The collaborative representation-based classification method performs well in the classification of high-dimensional images, for example in face recognition. It utilizes training samples from all classes to represent a test sample and assigns a class label to the test sample using the representation residuals. However, when applied to image classification, the method still suffers from the problem that a limited number of training samples degrades the classification accuracy. In this paper, we propose a modified collaborative representation-based classification method (MCRC), which exploits novel virtual images and can obtain high classification accuracy. The procedure to produce virtual images is very simple, but their use can bring a surprising performance improvement. The virtual images can sufficiently represent the features of the original face images in some cases. Extensive experimental results demonstrate that the proposed method can effectively improve the classification accuracy. This is mainly attributed to the integration of the collaborative representation and the proposed feature-information dominated virtual images.
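    The baseline scheme described at the start of this abstract (represent the test sample with training samples from all classes, then label it by the smallest class-wise residual) can be sketched as follows. The sketch assumes an l2-regularised least-squares representation solved through the normal equations; it does not cover the virtual images of MCRC, and the toy vectors and the value of `lam` are invented.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def crc_classify(samples, labels, y, lam=0.01):
    """Label y by the class whose training samples reconstruct it best."""
    n = len(samples)
    # Regularised normal equations: (X^T X + lam * I) a = X^T y
    gram = [[dot(samples[i], samples[j]) + (lam if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    coef = solve(gram, [dot(s, y) for s in samples])
    residuals = {}
    for cls in set(labels):
        recon = [0.0] * len(y)
        for s, lab, a in zip(samples, labels, coef):
            if lab == cls:
                recon = [r + a * v for r, v in zip(recon, s)]
        residuals[cls] = sum((p - q) ** 2 for p, q in zip(y, recon)) ** 0.5
    return min(residuals, key=residuals.get)

# Hypothetical three-dimensional "images" from two classes
samples = [(1.0, 0.0, 0.0), (0.9, 0.1, 0.0), (0.0, 1.0, 0.0), (0.0, 0.9, 0.1)]
labels = ["A", "A", "B", "B"]
predicted = crc_classify(samples, labels, (0.95, 0.05, 0.0))
```

    Because all classes share one coefficient vector, adding extra (for example virtual) training samples simply appends rows to `samples` and `labels` without changing the rest of the procedure.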

  10. Land Cover Classification from Full-Waveform LIDAR Data Based on Support Vector Machines

    Science.gov (United States)

    Zhou, M.; Li, C. R.; Ma, L.; Guan, H. C.

    2016-06-01

    In this study, a land cover classification method based on multi-class Support Vector Machines (SVM) is presented to predict the types of land cover in the Miyun area. The obtained backscattered full-waveforms were processed following a workflow of waveform pre-processing, waveform decomposition and feature extraction. The extracted features, which consist of distance, intensity, Full Width at Half Maximum (FWHM) and backscattering cross-section, were corrected and used as attributes of the training data to generate the SVM prediction model. The SVM prediction model was applied to classify the land cover in the Miyun area as ground, trees, buildings or farmland. The classification results for these four land cover types were evaluated against ground truth information derived from CCD image data of the Miyun area. The proposed classification algorithm achieved an overall classification accuracy of 90.63%. To put the SVM results in context, they were compared with those of an Artificial Neural Network (ANN) method, and the SVM method achieved better classification results.

  11. Classification of high resolution imagery based on fusion of multiscale texture features

    International Nuclear Information System (INIS)

    In the classification of high resolution data, combining texture features with spectral bands can effectively improve the classification accuracy. However, the window size, which is difficult to choose, is an important factor influencing the overall accuracy of textural classification, and current approaches to image texture analysis depend on a single moving window, which ignores the different scale features of various land cover types. In this paper, we propose a new method based on the fusion of multiscale texture features to overcome these problems. The main steps of the new method are the classification of fixed window size spectral/textural images from 3×3 to 15×15 and the comparison of all the posterior probability values for every pixel; the class with the largest probability value is then assigned to the pixel automatically. The proposed approach is tested on the University of Pavia ROSIS data. The results indicate that the new method improves the classification accuracy compared with methods based on fixed window size textural classification.
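    The per-pixel fusion step can be sketched in a few lines, assuming each fixed-window classification yields a posterior probability per class for the pixel. The window sizes, class names and probabilities below are invented for illustration and are not taken from the paper.

```python
def fuse_multiscale(posteriors_by_window):
    """Pick the class with the largest posterior over all window sizes.

    posteriors_by_window maps a window size to {class_label: probability}
    for a single pixel. Returns (label, window_size, probability).
    """
    best = None
    for window, posterior in posteriors_by_window.items():
        for label, p in posterior.items():
            if best is None or p > best[2]:
                best = (label, window, p)
    return best

# Hypothetical posteriors for one pixel at three window sizes
pixel = {
    3: {"asphalt": 0.55, "meadow": 0.45},
    9: {"asphalt": 0.30, "meadow": 0.70},
    15: {"asphalt": 0.20, "meadow": 0.80},
}
label, window, prob = fuse_multiscale(pixel)
```

    Returning the winning window size alongside the label makes it possible to inspect which scale was most informative for each land cover type.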

  12. A Neuro-Fuzzy based System for Classification of Natural Textures

    Science.gov (United States)

    Jiji, G. Wiselin

    2016-06-01

    A statistical approach based on the coordinated clusters representation of images is used for classification and recognition of textured images. In this paper, two issues are being addressed; one is the extraction of texture features from the fuzzy texture spectrum in the chromatic and achromatic domains from each colour component histogram of natural texture images and the second issue is the concept of a fusion of multiple classifiers. The implementation of an advanced neuro-fuzzy learning scheme has been also adopted in this paper. The results of classification tests show the high performance of the proposed method that may have industrial application for texture classification, when compared with other works.

  13. Breast cancer: new concepts in classification (Carcinoma de mama: novos conceitos na classificação)

    Directory of Open Access Journals (Sweden)

    Daniella Serafin Couto Vieira

    2008-01-01

    Full Text Available Breast carcinoma is the most common malignant neoplasm in women. Molecular studies of breast carcinoma, based on gene expression profiling by cDNA microarray, have defined at least five distinct subgroups: luminal A, luminal B, HER2 overexpression, basal, and normal breast-like. The tissue microarray (TMA) technique, first described in 1998, has made it possible to study the protein expression profiles of different neoplasms across many carcinoma samples. In breast carcinoma, TMAs have been used to validate the findings of preliminary studies, thereby identifying new phenotypic subtypes of breast carcinoma. Among the classically described subtypes, the basal group is one of the most intriguing tumor subtypes and is frequently associated with a worse prognosis and the absence of defined therapeutic targets. The histopathological classification of breast carcinoma has poor predictive value. Therefore, combining histological diagnosis with molecular techniques in anatomic pathology laboratories, by means of immunohistochemical study, can determine the molecular profile of breast carcinoma, with the aim of improving therapeutic response. This study aimed to summarize the most recent knowledge underlying the new concepts in breast carcinoma classification.

  14. Improving the Computational Performance of Ontology-Based Classification Using Graph Databases

    Directory of Open Access Journals (Sweden)

    Thomas J. Lampoltshammer

    2015-07-01

    Full Text Available The increasing availability of very high-resolution remote sensing imagery (i.e., from satellites, airborne laser scanning, or aerial photography) represents both a blessing and a curse for researchers. The manual classification of these images, or other similar geo-sensor data, is time-consuming and leads to subjective and non-deterministic results. Due to this fact, (semi-)automated classification approaches are in high demand in affected research areas. Ontologies provide a proper way of automated classification for various kinds of sensor data, including remotely sensed data. However, the processing of data entities (so-called individuals) is one of the most cost-intensive computational operations within ontology reasoning. Therefore, an approach based on graph databases is proposed to overcome the high time consumption of the classification task. The introduced approach shifts the classification task from the classical Protégé environment and its common reasoners to the proposed graph-based approaches. For validation, the authors tested the approach on a simulation scenario based on a real-world example. The results demonstrate a promising improvement in classification speed, up to 80,000 times faster than the Protégé-based approach.

  15. Computer vision-based limestone rock-type classification using probabilistic neural network

    Institute of Scientific and Technical Information of China (English)

    Ashok Kumar Patel; Snehamoy Chatterjee

    2016-01-01

    Proper quality planning of limestone raw materials is essential for maintaining the desired feed in a cement plant. Rock-type identification is an integral part of quality planning for a limestone mine. In this paper, a computer vision-based rock-type classification algorithm is proposed for fast and reliable identification without human intervention. A laboratory scale vision-based model was developed using a probabilistic neural network (PNN) with color histogram features as input. The color image histogram-based features, which include weighted mean, skewness and kurtosis, are extracted for all three color channels: red, green, and blue. A total of nine features are used as input for the PNN classification model. The smoothing parameter of the PNN model is selected judiciously to develop an optimal or close to optimal classification model. The developed PNN is validated using the test data set, and the results reveal that the proposed vision-based model performs satisfactorily in classifying limestone rock types. Overall, the misclassification error is below 6%. When compared with three other classification algorithms, the proposed method performs substantially better than all three.
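    A minimal sketch of how nine such features might be computed from per-channel histograms is given below. It assumes the "weighted mean" is the count-weighted mean intensity and that skewness and kurtosis are the usual standardised third and fourth central moments; the paper's exact definitions may differ, and the toy histograms are invented.

```python
def histogram_moments(hist):
    """Return (weighted mean, skewness, kurtosis) of an intensity histogram.

    hist[i] is the number of pixels with intensity level i.
    Assumes the histogram has non-zero variance.
    """
    total = sum(hist)
    mean = sum(level * count for level, count in enumerate(hist)) / total

    def central(k):
        # k-th central moment, weighted by the pixel counts
        return sum(count * (level - mean) ** k
                   for level, count in enumerate(hist)) / total

    var = central(2)
    skew = central(3) / var ** 1.5
    kurt = central(4) / var ** 2
    return mean, skew, kurt

def color_features(red_hist, green_hist, blue_hist):
    """Concatenate the three moments of each channel into nine features."""
    feats = []
    for hist in (red_hist, green_hist, blue_hist):
        feats.extend(histogram_moments(hist))
    return feats

# Hypothetical three-level histograms for the R, G and B channels
features = color_features([1, 2, 1], [4, 1, 1], [1, 1, 4])
```

    The resulting nine-element vector is the kind of input a PNN classifier would receive for each rock image.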

  16. An object-oriented classification method of high resolution imagery based on improved AdaTree

    International Nuclear Information System (INIS)

    With the growing use of high spatial resolution remote sensing images, more and more studies have paid attention to object-oriented classification, covering both image segmentation and automatic classification after segmentation. This paper proposes a fast method of object-oriented automatic classification. First, edge-based or FNEA-based segmentation is used to identify image objects, and the values of the attributes of the image objects most suitable for classification are calculated. Then a certain number of samples from the image objects are selected as training data for an improved AdaTree algorithm to obtain classification rules. Finally, the image objects can be classified easily using these rules. In the AdaTree, we mainly modified the final hypothesis to obtain the classification rules. In an experiment with a WorldView2 image, the method based on AdaTree showed clear improvements in accuracy and efficiency compared with the method based on SVM, with the kappa coefficient reaching 0.9242.

  17. AR-based Method for ECG Classification and Patient Recognition

    Directory of Open Access Journals (Sweden)

    Branislav Vuksanovic

    2013-09-01

    Full Text Available The electrocardiogram (ECG) is the recording of heart activity obtained by measuring the signals from electrical contacts placed on the skin of the patient. By analyzing the ECG, it is possible to detect the rate and consistency of heartbeats and identify possible irregularities in heart operation. This paper describes a set of techniques employed to pre-process the ECG signals and extract a set of features, namely autoregressive (AR) signal parameters, used to characterise the ECG signal. The extracted parameters are used in this work to accomplish two tasks. Firstly, the AR features belonging to each ECG signal are classified into groups corresponding to three different heart conditions: normal, arrhythmia and ventricular arrhythmia. The classification results obtained indicate accurate, zero-error classification of patients according to their heart condition using the proposed method. The sets of extracted AR coefficients are then extended with an additional parameter, the power of the AR modelling error, and the suitability of the developed technique for individual patient identification is investigated. Individual feature sets for each group of detected QRS sections are classified into p clusters, where p represents the number of patients in each group. The developed system has been tested using ECG signals available in the MIT/BIH and Politecnico of Milano VCG/ECG databases. The recognition rates achieved indicate that patient identification using ECG signals could be considered as a possible approach in some applications using the system developed in this work. Pre-processing stages, the applied parameter extraction techniques and some intermediate and final classification results are described and presented in this paper.
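    One common way to obtain AR coefficients together with the power of the modelling error is the Levinson-Durbin recursion on the signal's autocorrelation. The sketch below assumes that estimator; the abstract does not say which AR estimation method the authors used, and the short test signal is invented.

```python
def ar_features(x, order):
    """Estimate AR coefficients and prediction-error power of a signal.

    Uses the Levinson-Durbin recursion on the (unnormalised) autocorrelation.
    Returns (coeffs, error), where x[t] is predicted as
    sum(coeffs[i] * x[t - 1 - i]) and error is the residual power.
    """
    n = len(x)
    r = [sum(x[t] * x[t + k] for t in range(n - k)) for k in range(order + 1)]
    coeffs = []
    error = float(r[0])
    for k in range(order):
        # Correlation not yet explained by the current model
        acc = r[k + 1] - sum(coeffs[i] * r[k - i] for i in range(k))
        refl = acc / error  # reflection coefficient
        coeffs = [coeffs[i] - refl * coeffs[k - 1 - i] for i in range(k)] + [refl]
        error *= 1.0 - refl ** 2
    return coeffs, error

# Hypothetical periodic segment standing in for a pre-processed ECG section
segment = [1, 0, -1, 0, 1, 0, -1, 0]
coeffs, error = ar_features(segment, order=2)

# Extended feature vector: AR coefficients plus modelling-error power
feature_vector = coeffs + [error]
```

    Appending the error power to the coefficient list mirrors the extended feature set the abstract describes for patient identification.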

  18. Marker-Based Hierarchical Segmentation and Classification Approach for Hyperspectral Imagery

    Science.gov (United States)

    Tarabalka, Yuliya; Tilton, James C.; Benediktsson, Jon Atli; Chanussot, Jocelyn

    2011-01-01

    The Hierarchical SEGmentation (HSEG) algorithm, which is a combination of hierarchical step-wise optimization and spectral clustering, has given good performances for hyperspectral image analysis. This technique produces at its output a hierarchical set of image segmentations. The automated selection of a single segmentation level is often necessary. We propose and investigate the use of automatically selected markers for this purpose. In this paper, a novel Marker-based HSEG (M-HSEG) method for spectral-spatial classification of hyperspectral images is proposed. First, pixelwise classification is performed and the most reliably classified pixels are selected as markers, with the corresponding class labels. Then, a novel constrained marker-based HSEG algorithm is applied, resulting in a spectral-spatial classification map. The experimental results show that the proposed approach yields accurate segmentation and classification maps, and thus is attractive for hyperspectral image analysis.

  19. Content-based similarity for 3D model retrieval and classification

    Institute of Scientific and Technical Information of China (English)

    Ke Lü; Ning He; Jian Xue

    2009-01-01

    With the rapid development of 3D digital shape information, content-based 3D model retrieval and classification has become an important research area. This paper presents a novel 3D model retrieval and classification algorithm. For feature representation, a method combining a distance histogram and moment invariants is proposed to improve the retrieval performance. The major advantage of using a distance histogram is its invariance to the transforms of scaling, translation and rotation. Based on the premise that two similar objects should have high mutual information, the querying of 3D data should convey a great deal of information on the shape of the two objects, and so we propose a mutual information distance measurement to perform the similarity comparison of 3D objects. The proposed algorithm is tested with a 3D model retrieval and classification prototype, and the experimental evaluation demonstrates satisfactory retrieval results and classification accuracy.

  20. A Novel Imbalanced Data Classification Approach Based on Logistic Regression and Fisher Discriminant

    Directory of Open Access Journals (Sweden)

    Baofeng Shi

    2015-01-01

    Full Text Available We introduce an imbalanced data classification approach based on logistic regression significant discriminant and Fisher discriminant. First, a key indicator extraction model based on logistic regression significant discriminant and correlation analysis is derived to extract features for customer classification. Second, a customer scoring model is established on the basis of linear weighting using the Fisher discriminant. Then, a customer rating model is constructed in which the number of customers in each rating follows a normal distribution. The performance of the proposed model and of the classical SVM classification method is evaluated in terms of their ability to correctly classify consumers as default or nondefault customers. Empirical results using the data of 2157 customers in financial engineering suggest that the proposed approach achieves better performance than the SVM model in dealing with imbalanced data classification. Moreover, our approach helps banks and bond investors locate qualified customers.

  1. A Bayes fusion method based ensemble classification approach for Brown cloud application

    Directory of Open Access Journals (Sweden)

    M.Krishnaveni

    2014-03-01

    Full Text Available Classification is a recurrent task of determining a target function that maps each attribute set to one of the predefined class labels. Ensemble fusion is a classifier fusion technique that combines multiple classifiers to achieve higher classification accuracy than individual classifiers. The main objective of this paper is to combine base classifiers using the ensemble fusion methods Decision Template, Dempster-Shafer and Bayes, and to compare the accuracy of each fusion method on the brown cloud dataset. KNN, MLP and SVM are used as base classifiers in the ensemble, each with four different function parameters. The experimental study shows that the Bayes fusion method achieves a classification accuracy of 95%, better than Decision Template at 80% and Dempster-Shafer at 85%, on a Brown Cloud image dataset.
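    One simple reading of Bayes fusion is the product rule: multiply the posterior probabilities reported by the base classifiers and renormalise. The sketch below assumes that rule together with conditionally independent classifiers and equal class priors; the paper's exact formulation is not given in the abstract, and the class names and probabilities are invented.

```python
def bayes_fuse(posteriors):
    """Fuse classifier posteriors with the product rule.

    posteriors is a list of {class_label: probability} dictionaries,
    one per base classifier, all over the same labels.
    Returns the normalised fused posterior.
    """
    labels = list(posteriors[0])
    fused = {label: 1.0 for label in labels}
    for posterior in posteriors:
        for label in labels:
            fused[label] *= posterior[label]
    total = sum(fused.values())
    return {label: value / total for label, value in fused.items()}

# Hypothetical outputs of KNN, MLP and SVM for one image
knn = {"brown": 0.6, "clear": 0.4}
mlp = {"brown": 0.7, "clear": 0.3}
svm = {"brown": 0.4, "clear": 0.6}

fused = bayes_fuse([knn, mlp, svm])
decision = max(fused, key=fused.get)
```

    Here two moderately confident classifiers outweigh one dissenting classifier, giving a fused posterior of about 0.7 for the first class.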

  2. Correction of Alar Retraction Based on Frontal Classification.

    Science.gov (United States)

    Kim, Jae Hoon; Song, Jin Woo; Park, Sung Wan; Bartlett, Erica; Nguyen, Anh H

    2015-11-01

    Among the various types of alar deformations in Asians, alar retraction not only has the highest occurrence rate, but is also very complicated to treat because the ala is supported only by cartilage and its soft tissue envelope cannot be easily stretched. As patients' knowledge of aesthetic procedures is becoming more extensive due to increased information dissemination through various media, doctors must give more accurate, logical explanations of the procedures to be performed and their anticipated results, with an emphasis on relevant anatomical features, accurate diagnoses, detailed classifications, and various appropriate methods of surgery. PMID:26648808

  3. A study of land use/land cover information extraction classification technology based on DTC

    Science.gov (United States)

    Wang, Ping; Zheng, Yong-guo; Yang, Feng-jie; Jia, Wei-jie; Xiong, Chang-zhen

    2008-10-01

    Decision Tree Classification (DTC) is one organizational form of a multi-level recognition system, which breaks a complicated classification into simple categories and then resolves it step by step. The paper carries out LULC Decision Tree Classification research on some areas of Gansu Province in western China. With mid-resolution remote sensing data as the main data source, the authors adopt a decision tree classification method, taking advantage of its fault tolerance and of the way it imitates human judgment and thinking, and build a decision tree LULC classification pattern. The research shows that the methods and techniques can increase the level of automation and the accuracy of LULC information extraction in the research areas. The main aspects of the research are as follows: 1. Training samples were collected first, and a comprehensive database supported by remote sensing and ground data was established. 2. Using the CART system, and based on multi-source, multi-temporal remote sensing data and other auxiliary data, the DTC technique effectively combined the unsupervised classification results with expert knowledge. The method and procedure for distilling the decision tree information were specifically developed. 3. In designing the decision tree, a DTC model was established and pruned based on the classification rules of the various object types, in order to handle subdivision classification effectively, and the land use and land cover classification of the research areas was completed. The accuracy evaluation showed that the classification accuracy reached upwards of 80%.

  4. Accurate crop classification using hierarchical genetic fuzzy rule-based systems

    Science.gov (United States)

    Topaloglou, Charalampos A.; Mylonas, Stelios K.; Stavrakoudis, Dimitris G.; Mastorocostas, Paris A.; Theocharis, John B.

    2014-10-01

    This paper investigates the effectiveness of an advanced classification system for accurate crop classification using very high resolution (VHR) satellite imagery. Specifically, a recently proposed genetic fuzzy rule-based classification system (GFRBCS) is employed, namely, the Hierarchical Rule-based Linguistic Classifier (HiRLiC). HiRLiC's model comprises a small set of simple IF-THEN fuzzy rules, easily interpretable by humans. One of its most important attributes is that its learning algorithm requires minimum user interaction, since the most important learning parameters affecting the classification accuracy are determined by the learning algorithm automatically. HiRLiC is applied to a challenging crop classification task, using a SPOT5 satellite image over an intensively cultivated area in a lake-wetland ecosystem in northern Greece. A rich set of higher-order spectral and textural features is derived from the initial bands of the (pan-sharpened) image, resulting in an input space comprising 119 features. The experimental analysis proves that HiRLiC compares favorably to other interpretable classifiers of the literature, both in terms of structural complexity and classification accuracy. Its testing accuracy was very close to that obtained by complex state-of-the-art classification systems, such as the support vector machines (SVM) and random forest (RF) classifiers. Nevertheless, visual inspection of the derived classification maps shows that HiRLiC is characterized by higher generalization properties, providing more homogeneous classifications than the competitors. Moreover, the runtime requirements for producing the thematic map were orders of magnitude lower than those of the competitors.

  5. Multiobjective Simulated Annealing-Based Clustering of Tissue Samples for Cancer Diagnosis.

    Science.gov (United States)

    Acharya, Sudipta; Saha, Sriparna; Thadisina, Yamini

    2016-03-01

    In the field of pattern recognition, the study of the gene expression profiles of different tissue samples over different experimental conditions has become feasible with the arrival of microarray-based technology. In cancer research, classification of tissue samples is necessary for cancer diagnosis, which can be done with the help of microarray technology. In this paper, we have presented a multiobjective optimization (MOO)-based clustering technique utilizing archived multiobjective simulated annealing (AMOSA) as the underlying optimization strategy for classification of tissue samples from cancer datasets. The presented clustering technique is evaluated for three open source benchmark cancer datasets [Brain tumor dataset, Adult Malignancy, and Small Round Blood Cell Tumors (SRBCT)]. In order to evaluate the quality or goodness of produced clusters, two cluster quality measures, viz. adjusted Rand index and classification accuracy (% CoA), are calculated. Comparative results of the presented clustering algorithm with ten state-of-the-art existing clustering techniques are shown for three benchmark datasets. Also, we have conducted a statistical significance test called t-test to prove the superiority of our presented MOO-based clustering technique over other clustering techniques. Moreover, significant gene markers have been identified and demonstrated visually from the clustering solutions obtained. In the field of cancer subtype prediction, this study can have an important impact. PMID:25706936
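
    The adjusted Rand index mentioned above compares a clustering against reference labels while correcting for chance agreement. The following is a minimal sketch of the standard pair-counting formula using only the Python standard library; the function name and the toy labels are illustrative and are not taken from the paper.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same samples."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two samples: trivially identical partitions

    # Pair counts within contingency cells, true classes, and predicted clusters.
    cells = Counter(zip(labels_true, labels_pred))
    sum_cells = sum(comb(n, 2) for n in cells.values())
    sum_true = sum(comb(n, 2) for n in Counter(labels_true).values())
    sum_pred = sum(comb(n, 2) for n in Counter(labels_pred).values())

    expected = sum_true * sum_pred / total_pairs
    maximum = (sum_true + sum_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both partitions are a single cluster
    return (sum_cells - expected) / (maximum - expected)


if __name__ == "__main__":
    reference = [0, 0, 1, 1]
    clustering = [1, 1, 0, 0]  # same partition, labels permuted
    print(adjusted_rand_index(reference, clustering))
```

    A value of 1 indicates identical partitions regardless of label names, and values near 0 indicate agreement no better than chance.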

  6. Computational Discrimination of Breast Cancer for Korean Women Based on Epidemiologic Data Only.

    Science.gov (United States)

    Lee, Chiwon; Lee, Jung Chan; Park, Boyoung; Bae, Jonghee; Lim, Min Hyuk; Kang, Daehee; Yoo, Keun-Young; Park, Sue K; Kim, Youdan; Kim, Sungwan

    2015-08-01

    Breast cancer is the second leading cancer for Korean women and its incidence rate has been increasing annually. If early diagnosis were implemented with epidemiologic data, women could easily assess breast cancer risk using the internet. The National Cancer Institute in the United States has released a Web-based Breast Cancer Risk Assessment Tool based on the Gail model. However, it is not directly applicable to Korean women since breast cancer risk is dependent on race. Also, it shows low accuracy (58%-59%). In this study, breast cancer discrimination models for Korean women are developed using only epidemiological case-control data (n = 4,574). The models are configured by different classification techniques: support vector machine, artificial neural network, and Bayesian network. A 1,000-time repeated random sub-sampling validation is performed for each of the diverse parameter conditions. The performance is evaluated and compared as an area under the receiver operating characteristic curve (AUC). According to age group and classification techniques, AUC, accuracy, sensitivity, specificity, and calculation time of all models were calculated and compared. Although the support vector machine took the longest calculation time, the highest classification performance has been achieved in the case of women older than 50 yr (AUC = 64%). The proposed model is dependent on demographic characteristics, reproductive factors, and lifestyle habits without using any clinical or genetic test. It is expected that the model could be implemented as a web-based discrimination tool for breast cancer. This tool can encourage potential breast cancer-prone women to go to the hospital for diagnostic tests. PMID:26240478
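
    The evaluation protocol described above (repeated random sub-sampling scored by AUC) can be sketched as follows. This is a minimal illustration, not the study's code: the split generator, the test fraction, and the toy scores are assumptions, and AUC is computed with the rank-based definition (the probability that a random positive case scores higher than a random negative case, with ties counted as one half).

```python
import random


def auc(scores, labels):
    """Area under the ROC curve for binary labels (1 = case, 0 = control)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def repeated_subsampling_splits(n_samples, n_repeats, test_fraction=0.2, seed=0):
    """Yield (train_indices, test_indices) for repeated random sub-sampling."""
    rng = random.Random(seed)
    n_test = max(1, round(n_samples * test_fraction))
    for _ in range(n_repeats):
        shuffled = list(range(n_samples))
        rng.shuffle(shuffled)
        yield shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    print(auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0]))
    for train, test in repeated_subsampling_splits(10, n_repeats=3):
        print(len(train), len(test))
```

    In a full experiment, a model would be fitted on each training split and `auc` averaged over the test splits.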

  7. Medical diagnosis of cardiovascular diseases using an interval-valued fuzzy rule-based classification system

    OpenAIRE

    Sanz Delgado, José Antonio; Galar Idoate, Mikel; Jurío Munárriz, Aránzazu; Brugos Larumbe, Antonio; Pagola Barrio, Miguel; Bustince Sola, Humberto

    2013-01-01

    Objective: To develop a classifier that tackles the problem of determining the risk of a patient of suffering from a cardiovascular disease within the next ten years. The system has to provide both a diagnosis and an interpretable model explaining the decision. In this way, doctors are able to analyse the usefulness of the information given by the system. Methods: Linguistic fuzzy rule-based classification systems are used, since they provide a good classification rate and a highly interpreta...

  8. Image-Based Airborne Sensors: A Combined Approach for Spectral Signatures Classification through Deterministic Simulated Annealing

    OpenAIRE

    P. Javier Herrera; Gonzalo Pajares; María Guijarro

    2009-01-01

    The increasing technology of high-resolution image airborne sensors, including those on board Unmanned Aerial Vehicles, demands automatic solutions for processing, either on-line or off-line, the huge amounts of image data sensed during the flights. The classification of natural spectral signatures in images is one potential application. The current trend in classification is towards the combination of simple classifiers. In this paper we propose a combined strategy based on the D...

  9. Object-Based Classification of Abandoned Logging Roads under Heavy Canopy Using LiDAR

    OpenAIRE

    Jason Sherba; Leonhard Blesius; Jerry Davis

    2014-01-01

    LiDAR-derived slope models may be used to detect abandoned logging roads in steep forested terrain. An object-based classification approach of abandoned logging road detection was employed in this study. First, a slope model of the study site in Marin County, California was created from a LiDAR derived DEM. Multiresolution segmentation was applied to the slope model and road seed objects were iteratively grown into candidate objects. A road classification accuracy of 86% was achieved using th...

  10. Generating Estimates of Classification Confidence for a Case-Based Spam Filter

    OpenAIRE

    Delany, Sarah Jane; Cunningham, Padraig; Doyle, Donal

    2005-01-01

    Producing estimates of classification confidence is surprisingly difficult. One might expect that classifiers that can produce numeric classification scores (e.g. k-Nearest Neighbour or Naive Bayes) could readily produce confidence estimates based on thresholds. In fact, this proves not to be the case, probably because these are not probabilistic classifiers in the strict sense. The numeric scores coming from k-Nearest Neighbour or Naive Bayes classifiers are not well correl...

  11. Generating estimates of classification confidence for a case-based spam filter

    OpenAIRE

    Delany, Sarah Jane; Cunningham, Padraig; Coyle, Donal; Zamolotskikh, Anton

    2005-01-01

    Producing estimates of classification confidence is surprisingly difficult. One might expect that classifiers that can produce numeric classification scores (e.g. k-Nearest Neighbour, Naïve Bayes or Support Vector Machines) could readily produce confidence estimates based on thresholds. In fact, this proves not to be the case, probably because these are not probabilistic classifiers in the strict sense. The numeric scores coming from k-Nearest Neighbour, Naïve Bayes and Support Vector Machi...

  12. A Whitening Transformation Based Approach to One-class Classification of Remote Sensing Imagery

    OpenAIRE

    BO Shukui; Li, Xiang; Li, Lingling

    2015-01-01

    In this study, a whitening transformation based approach to one-class classification of remote sensing imagery is investigated. Only positive data are required to train the one-class classifier. Firstly, the image data is mapped to a new feature space using the whitening processing with all directions of the class of interest. Then a threshold is selected to make a binary prediction. A heuristic method of threshold selection is performed in the experiment of one-class classification. A series...

  13. Analysis on Systematic Water Scarcity Based on Establishment of Water Scarcity Classification System

    Institute of Scientific and Technical Information of China (English)

    2011-01-01

    Analysis of systematic water scarcity would be very helpful for devising countermeasures against complex water scarcity. Based on previous research on water scarcity classification, a classification system of water scarcity was established according to contributing factors, which comprises three water scarcity categories caused by anthropic factors, natural factors and mixed factors respectively. Accordingly, the concept of systematic water scarcity was proposed, which can be defined as one type of water...

  14. A Multi-Lead ECG Classification Based on Random Projection Features

    OpenAIRE

    Bogdanova Vandergheynst, Iva; Vallejos, Rincon; Javier, Francisco; Atienza Alonso, David

    2012-01-01

    This paper presents a novel method for classification of multilead electrocardiogram (ECG) signals. The feature extraction is based on the random projection (RP) concept for dimensionality reduction. Furthermore, the classification is performed by a neuro-fuzzy classifier. Such a model can be easily implemented on portable systems for practical applications in both health monitoring and diagnostic purposes. Moreover, the RP implementation on portable systems is very challenging featuring both...

  15. A Multi-Lead Ecg Classification Based On Random Projection Features

    OpenAIRE

    Bogdanova, Iva; Rincon, Francisco; Atienza, David

    2012-01-01

    This paper presents a novel method for classification of multi-lead electrocardiogram (ECG) signals. The feature extraction is based on the random projection (RP) concept for dimensionality reduction. Furthermore, the classification is performed by a neuro-fuzzy classifier. Such a model can be easily implemented on portable systems for practical applications in both health monitoring and diagnostic purposes. Moreover, the RP implementation on portable systems is very challenging featuring bot...

  16. Artificial-neural-network-based classification of mammographic microcalcifications using image structure features

    Science.gov (United States)

    Dhawan, Atam P.; Chitre, Yateen S.; Moskowitz, Myron

    1993-07-01

    Mammography associated with clinical breast examination and self-breast examination is the only effective and viable method for mass breast screening. It is, however, difficult to distinguish between benign and malignant microcalcifications associated with breast cancer. Most of the techniques used in the computerized analysis of mammographic microcalcifications segment the digitized gray-level image into regions representing microcalcifications. We present a second-order gray-level histogram based feature extraction approach to extract microcalcification features. These features, called image structure features, are computed from the second-order gray-level histogram statistics, and do not require segmentation of the original image into binary regions. Several image structure features were computed for 100 'difficult-to-diagnose' microcalcification cases with known biopsy results. These features were analyzed in a correlation study which provided a set of five best image structure features. A feedforward backpropagation neural network was used to classify mammographic microcalcifications using the image structure features. The network was trained on 10 cases of mammographic microcalcifications and tested on an additional 85 'difficult-to-diagnose' microcalcification cases using the selected image structure features. The trained network yielded good results for classification of 'difficult-to-diagnose' microcalcifications into benign and malignant categories.
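
    A second-order gray-level histogram counts how often pairs of gray levels occur at a fixed pixel offset, so features can be computed without segmenting the image. The sketch below builds such a co-occurrence histogram and derives three common statistics from it. The abstract does not name the five selected features, so contrast, energy and entropy are chosen here purely as illustrative assumptions, as are the function names.

```python
from math import log2


def cooccurrence(image, dx=1, dy=0):
    """Normalized second-order gray-level histogram for offset (dx, dy).

    image: list of equal-length rows of integer gray levels.
    Returns a dict mapping (level_a, level_b) -> joint probability.
    """
    rows, cols = len(image), len(image[0])
    counts = {}
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                pair = (image[r][c], image[r2][c2])
                counts[pair] = counts.get(pair, 0) + 1
                total += 1
    if total == 0:
        raise ValueError("offset leaves no valid pixel pairs")
    return {pair: n / total for pair, n in counts.items()}


def structure_features(p):
    """Contrast, energy and entropy of a co-occurrence distribution p."""
    contrast = sum(prob * (i - j) ** 2 for (i, j), prob in p.items())
    energy = sum(prob ** 2 for prob in p.values())
    entropy = -sum(prob * log2(prob) for prob in p.values())
    return {"contrast": contrast, "energy": energy, "entropy": entropy}


if __name__ == "__main__":
    patch = [[0, 0, 1],
             [0, 1, 2],
             [1, 2, 2]]
    print(structure_features(cooccurrence(patch)))
```

    Feature vectors of this kind would then be fed to a classifier such as the feedforward network described in the abstract.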

  17. Multi-Frequency Polarimetric SAR Classification Based on Riemannian Manifold and Simultaneous Sparse Representation

    Directory of Open Access Journals (Sweden)

    Fan Yang

    2015-07-01

    Full Text Available Normally, polarimetric SAR classification is a high-dimensional nonlinear mapping problem. In the realm of pattern recognition, sparse representation is a very efficacious and powerful approach. As classical descriptors of polarimetric SAR, covariance and coherency matrices are Hermitian semidefinite and form a Riemannian manifold. Conventional Euclidean metrics are not suitable for a Riemannian manifold, and hence, normal sparse representation classification cannot be applied to polarimetric SAR directly. This paper proposes a new land cover classification approach for polarimetric SAR. There are two principal novelties in this paper. First, a Stein kernel on a Riemannian manifold instead of Euclidean metrics, combined with sparse representation, is employed for polarimetric SAR land cover classification. This approach is named Stein-sparse representation-based classification (SRC). Second, using simultaneous sparse representation and reasonable assumptions of the correlation of representation among different frequency bands, Stein-SRC is generalized to simultaneous Stein-SRC for multi-frequency polarimetric SAR classification. These classifiers are assessed using polarimetric SAR images from the Airborne Synthetic Aperture Radar (AIRSAR) sensor of the Jet Propulsion Laboratory (JPL) and the Electromagnetics Institute Synthetic Aperture Radar (EMISAR) sensor of the Technical University of Denmark (DTU). Experiments on single-band and multi-band data both show that these approaches acquire more accurate classification results in comparison to many conventional and advanced classifiers.

  18. Classification of hospitals based on measured output: the VA system.

    Science.gov (United States)

    Thomas, J W; Berki, S E; Wyszewianski, L; Ashcraft, M L

    1983-07-01

    Evaluation of hospital performance and improvement of resource allocation in hospital systems require a method for classifying hospitals on the basis of their output. Previous approaches to hospital classification relied largely on input characteristics. The authors propose and apply a procedure for classifying hospitals into groups where within-group hospitals are similar with respect to output. Direct measures of case-mix-adjusted discharges and outpatient visits are the principal measures of patient care output; other measures capture training and research functions. The component measures were weighted, and a composite output measure was calculated for each of the 162 hospitals in the Veterans Administration health care system. The output score then was used as the dependent variable in an Automatic Interaction Detector analysis, which partitioned the 162 hospitals into 10 groups, accounting for 85 per cent of the variance in the dependent variable. An extension of the output classification method is presented for illustration of how the difference between hospitals' actual operating costs and costs predicted on the basis of output can be used in defining isoefficiency groups. PMID:6350744

  19. A space-based classification system for RF transients

    Energy Technology Data Exchange (ETDEWEB)

    Moore, K.R.; Call, D.; Johnson, S.; Payne, T.; Ford, W.; Spencer, K.; Wilkerson, J.F. [Los Alamos National Lab., NM (United States); Baumgart, C. [EG and G, Inc., Los Alamos, NM (United States)

    1993-12-01

    The FORTE (Fast On-Orbit Recording of Transient Events) small satellite is scheduled for launch in mid 1995. The mission is to measure and classify VHF (30--300 MHz) electromagnetic pulses, primarily due to lightning, within a high noise environment dominated by continuous wave carriers such as TV and FM stations. The FORTE Event Classifier will use specialized hardware to implement signal processing and neural network algorithms that perform onboard classification of RF transients and carriers. Lightning events will also be characterized with optical data telemetered to the ground. A primary mission science goal is to develop a comprehensive understanding of the correlation between the optical flash and the VHF emissions from lightning. By combining FORTE measurements with ground measurements and/or active transmitters, other science issues can be addressed. Examples include the correlation of global precipitation rates with lightning flash rates and location, the effects of large scale structures within the ionosphere (such as traveling ionospheric disturbances and horizontal gradients in the total electron content) on the propagation of broad bandwidth RF signals, and various areas of lightning physics. Event classification is a key feature of the FORTE mission. Neural networks are promising candidates for this application. The authors describe the proposed FORTE Event Classifier flight system, which consists of a commercially available digital signal processing board and a custom board, and discuss work on signal processing and neural network algorithms.

  20. Multivariate Discretization Based on Evolutionary Cut Points Selection for Classification.

    Science.gov (United States)

    Ramirez-Gallego, Sergio; Garcia, Salvador; Benitez, Jose Manuel; Herrera, Francisco

    2016-03-01

    Discretization is one of the most relevant techniques for data preprocessing. The main goal of discretization is to transform numerical attributes into discrete ones to help the experts to understand the data more easily, and it also provides the possibility to use some learning algorithms which require discrete data as input, such as Bayesian or rule learning. We focus our attention on handling multivariate classification problems, where high interactions among multiple attributes exist. In this paper, we propose the use of evolutionary algorithms to select a subset of cut points that defines the best possible discretization scheme of a data set using a wrapper fitness function. We also incorporate a reduction mechanism to successfully manage the multivariate approach on large data sets. Our method has been compared with the best state-of-the-art discretizers on 45 real datasets. The experiments show that our proposed algorithm overcomes the rest of the methods producing competitive discretization schemes in terms of accuracy, for C4.5, Naive Bayes, PART, and PrUning and BuiLding Integrated in Classification classifiers; and obtained far simpler solutions. PMID:25794409
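
    To make the notion of a cut-point-based discretization scheme concrete, the sketch below maps a numeric value to the index of the interval it falls in, given a chosen set of cut points. It shows only how a selected scheme is applied; the evolutionary search and wrapper fitness function described in the abstract are not reproduced, and the function name and the convention that a value equal to a cut point belongs to the upper interval are assumptions.

```python
from bisect import bisect_right


def discretize(value, cut_points):
    """Return the interval index of value for the given cut points.

    With k cut points there are k + 1 intervals, indexed 0..k.
    A value equal to a cut point is placed in the upper interval
    (an assumed convention).
    """
    return bisect_right(sorted(cut_points), value)


def discretize_record(record, scheme):
    """Discretize each numeric attribute of a record (dict) with its own cuts."""
    return {name: discretize(value, scheme[name]) for name, value in record.items()}


if __name__ == "__main__":
    scheme = {"age": [30, 50], "income": [20000.0]}
    print(discretize_record({"age": 42, "income": 15000.0}, scheme))
```

    Selecting fewer cut points yields a simpler scheme, which is the trade-off against accuracy that a wrapper-based search would explore.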

  1. A space-based classification system for RF transients

    International Nuclear Information System (INIS)

    The FORTE (Fast On-Orbit Recording of Transient Events) small satellite is scheduled for launch in mid 1995. The mission is to measure and classify VHF (30--300 MHz) electromagnetic pulses, primarily due to lightning, within a high noise environment dominated by continuous wave carriers such as TV and FM stations. The FORTE Event Classifier will use specialized hardware to implement signal processing and neural network algorithms that perform onboard classification of RF transients and carriers. Lightning events will also be characterized with optical data telemetered to the ground. A primary mission science goal is to develop a comprehensive understanding of the correlation between the optical flash and the VHF emissions from lightning. By combining FORTE measurements with ground measurements and/or active transmitters, other science issues can be addressed. Examples include the correlation of global precipitation rates with lightning flash rates and location, the effects of large scale structures within the ionosphere (such as traveling ionospheric disturbances and horizontal gradients in the total electron content) on the propagation of broad bandwidth RF signals, and various areas of lightning physics. Event classification is a key feature of the FORTE mission. Neural networks are promising candidates for this application. The authors describe the proposed FORTE Event Classifier flight system, which consists of a commercially available digital signal processing board and a custom board, and discuss work on signal processing and neural network algorithms

  2. Non-target adjacent stimuli classification improves performance of classical ERP-based brain computer interface

    Science.gov (United States)

    Ceballos, G. A.; Hernández, L. F.

    2015-04-01

    Objective. The classical ERP-based speller, or P300 Speller, is one of the most commonly used paradigms in the field of Brain Computer Interfaces (BCI). Several alterations to the visual stimuli presentation system have been developed to avoid unfavorable effects elicited by adjacent stimuli. However, there has been little, if any, regard to useful information contained in responses to adjacent stimuli about spatial location of target symbols. This paper aims to demonstrate that combining the classification of non-target adjacent stimuli with standard classification (target versus non-target) significantly improves classical ERP-based speller efficiency. Approach. Four SWLDA classifiers were trained and combined with the standard classifier: the lower row, upper row, right column and left column classifiers. This new feature extraction procedure and the classification method were carried out on three open databases: the UAM P300 database (Universidad Autonoma Metropolitana, Mexico), BCI competition II (dataset IIb) and BCI competition III (dataset II). Main results. The inclusion of the classification of non-target adjacent stimuli improves target classification in the classical row/column paradigm. A gain in mean single trial classification of 9.6% and an overall improvement of 25% in simulated spelling speed was achieved. Significance. We have provided further evidence that the ERPs produced by adjacent stimuli present discriminable features, which could provide additional information about the spatial location of intended symbols. This work promotes the searching of information on the peripheral stimulation responses to improve the performance of emerging visual ERP-based spellers.

  3. Fractal classification and natural classification of coal pore structure based on migration of coal bed methane

    Institute of Scientific and Technical Information of China (English)

    FU Xuehai; QIN Yong; ZHANG Wanhong; WEI Chongtao; ZHOU Rongfu

    2005-01-01

    According to the data of 146 coal samples measured by mercury penetration, coal pores are classified into two levels, <65 nm diffusion pores and >65 nm seepage pores, by a fractal method based on the diffusion and seepage characteristics of coal bed methane (CBM) and on the research results of specific pore volume and pore structure. The diffusion pores are further divided into three categories: <8 nm micropore, 8-20 nm transitional pore, and 20-65 nm minipore, based on the relationship between the increment of specific surface area and the diameter of pores, while seepage pores are further divided into three categories: 65-325 nm mesopore, 325-1000 nm transitional pore, and >1000 nm macropore, based on the abrupt change in the increment of specific pore volume.
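
    The size bands listed above translate directly into a lookup by pore diameter. The sketch below encodes them; the abstract does not say which category a diameter exactly on a boundary (8, 20, 65, 325 or 1000 nm) belongs to, so assigning boundary values to the larger category is an assumption, as is the function name.

```python
def classify_pore(diameter_nm):
    """Return (level, category) for a coal pore diameter in nanometres.

    Boundary diameters are assigned to the larger category (assumed).
    """
    if diameter_nm <= 0:
        raise ValueError("pore diameter must be positive")
    # (exclusive upper bound in nm, level, category)
    bands = [
        (8, "diffusion", "micropore"),
        (20, "diffusion", "transitional pore"),
        (65, "diffusion", "minipore"),
        (325, "seepage", "mesopore"),
        (1000, "seepage", "transitional pore"),
    ]
    for upper, level, category in bands:
        if diameter_nm < upper:
            return level, category
    return "seepage", "macropore"


if __name__ == "__main__":
    for d in (5, 15, 40, 100, 500, 2000):
        print(d, classify_pore(d))
```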

  4. Tumor Size Evaluation according to the T Component of the Seventh Edition of the International Association for the Study of Lung Cancer's TNM Classification: Interobserver Agreement between Radiologists and Computer-Aided Diagnosis System in Patients with Lung Cancer

    International Nuclear Information System (INIS)

    To assess the interobserver agreement for tumor size evaluation between radiologists and the computer-aided diagnosis (CAD) system based on the 7th edition of the TNM classification by the International Association for the Study of Lung Cancer in patients with lung cancer. We evaluated 20 patients who underwent a lobectomy or pneumonectomy for primary lung cancer. The maximum diameter of each primary tumor was measured by two radiologists and a CAD system on CT, and was staged based on the 7th edition of the TNM classification. The CT size and T-staging of the primary tumors was compared with the pathologic size and staging and the variability in the sizes and T stages of primary tumors was statistically analyzed between each radiologist's measurement or CAD estimation and the pathologic results. There was no statistically significant interobserver difference for the CT size among the two radiologists, between pathologic and CT size estimated by the radiologists, and between pathologic and CT staging by the radiologists and CAD system. However, there was a statistically significant interobserver difference between pathologic size and the CT size estimated by the CAD system (p = 0.003). No significant differences were found in the measurement of tumor size among radiologists or in the assessment of T-staging by radiologists and the CAD system.

  5. BRAIN TUMOR CLASSIFICATION BASED ON CLUSTERED DISCRETE COSINE TRANSFORM IN COMPRESSED DOMAIN

    Directory of Open Access Journals (Sweden)

    V. Anitha

    2014-01-01

    Full Text Available This study presents a novel method for classifying brain tumors by means of efficient, integrated methods so as to increase classification accuracy. In conventional systems, the problem is the same: to extract feature sets from the database and classify tumors based on those feature sets. The main idea in a plethora of earlier research on any classification method is to increase classification accuracy. The actual need is to achieve better classification accuracy by extracting more relevant feature sets after dimensionality reduction. There is a trade-off between accuracy and the number of feature sets. Hence the focus of this study is to apply the Discrete Cosine Transform (DCT) to brain tumor images of various classes. DCT by itself offers a fair dimension reduction in the feature sets. The K-means algorithm is then applied to the DCT coefficients to cluster the feature sets. The resulting cluster information is treated as a refined feature set and classified using a Support Vector Machine (SVM). Using DCT makes it possible to adjust the classification performance through the number of DCT coefficients taken into account. There is considerable demand for automatic classification of brain tumors, which greatly helps in the process of diagnosis. In this work, an average classification accuracy of 97% and a maximum of 100% have been achieved. This research opens a new way of performing classification in the compressed domain. Hence this study may be highly suitable for diagnosis in mobile computing and internet-based medical settings.

  6. Comparison of Supervised Classification Methods for Protein Profiling in Cancer Diagnosis

    OpenAIRE

    Nadège Dossat; Alain Mangé; Jérôme Solassol; William Jacot; Ludovic Lhermitte; Thierry Maudelonde; Jean-Pierre Daurès; Nicolas Molinari

    2007-01-01

    Summary: A key challenge in clinical proteomics of cancer is the identification of biomarkers that could allow detection, diagnosis and prognosis of the diseases. Recent advances in mass spectrometry and proteomic instrumentations offer unique chance to rapidly identify these markers. These advances pose considerable challenges, similar to those created by microarray-based investigation, for the discovery of pattern of markers from high-dimensional data, specific to each pathologic state (e.g...

  7. Study design requirements for RNA sequencing-based breast cancer diagnostics.

    Science.gov (United States)

    Mer, Arvind Singh; Klevebring, Daniel; Grönberg, Henrik; Rantalainen, Mattias

    2016-01-01

    Sequencing-based molecular characterization of tumors provides information required for individualized cancer treatment. There are well-defined molecular subtypes of breast cancer that provide improved prognostication compared to routine biomarkers. However, molecular subtyping is not yet implemented in routine breast cancer care. Clinical translation is dependent on subtype prediction models providing high sensitivity and specificity. In this study we evaluate sample size and RNA-sequencing read requirements for breast cancer subtyping to facilitate rational design of translational studies. We applied subsampling to ascertain the effect of training sample size and the number of RNA sequencing reads on classification accuracy of molecular subtype and routine biomarker prediction models (unsupervised and supervised). Subtype classification accuracy improved with increasing sample size up to N = 750 (accuracy = 0.93), although with a modest improvement beyond N = 350 (accuracy = 0.92). Prediction of routine biomarkers achieved accuracy of 0.94 (ER) and 0.92 (Her2) at N = 200. Subtype classification improved with RNA-sequencing library size up to 5 million reads. Development of molecular subtyping models for cancer diagnostics requires well-designed studies. Sample size and the number of RNA sequencing reads directly influence accuracy of molecular subtyping. Results in this study provide key information for rational design of translational studies aiming to bring sequencing-based diagnostics to the clinic. PMID:26830453

  8. Comparison of Soft Computing Approaches for Texture Based Land Cover Classification of Remotely Sensed Image

    Directory of Open Access Journals (Sweden)

    S. Jenicka

    2015-08-01

    Texture is a predominant feature in land cover classification of remotely sensed images. In this study, texture features were extracted using the proposed multivariate descriptor, Multivariate Ternary Pattern (MTP). Soft classifiers such as Fuzzy k-Nearest Neighbor (Fuzzy k-NN), Support Vector Machine (SVM) and Extreme Learning Machine (ELM) were used along with the proposed multivariate descriptor to perform land cover classification. The experiments were conducted on IRS P6 LISS-IV data and the results were evaluated based on the error matrix, classification accuracy and Kappa statistics. The experiments showed that the proposed descriptor with the SVM classifier gave 93.04% classification accuracy.
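
    The evaluation measures named in this abstract (error matrix, classification accuracy, Kappa statistics) can be illustrated with a minimal sketch. The function name and the 3-class matrix below are hypothetical and are not taken from the paper; the formula is the standard Cohen's kappa.

```python
def accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square error matrix.

    matrix[i][j] is the number of samples whose reference class is i
    and whose assigned class is j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("error matrix is empty")
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(n)) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical error matrix for three land cover classes (illustration only).
example_matrix = [
    [50, 3, 2],
    [4, 45, 6],
    [1, 5, 44],
]
example_accuracy, example_kappa = accuracy_and_kappa(example_matrix)
```

    Kappa discounts the agreement that would be expected by chance, so it is usually lower than the overall accuracy computed from the same matrix.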

  9. A wavelet transform based feature extraction and classification of cardiac disorder.

    Science.gov (United States)

    Sumathi, S; Beaulah, H Lilly; Vanithamani, R

    2014-09-01

    This paper presents an intelligent diagnosis system that uses a hybrid Adaptive Neuro-Fuzzy Inference System (ANFIS) model to classify electrocardiogram (ECG) signals. The method uses the Symlet wavelet transform to analyze the ECG signals and extract parameters related to dangerous cardiac arrhythmias. These parameters are used as input to the ANFIS classifier for the five most important types of ECG signals: Normal Sinus Rhythm (NSR), Atrial Fibrillation (AF), Pre-Ventricular Contraction (PVC), Ventricular Fibrillation (VF), and Ventricular Flutter (VFLU) Myocardial Ischemia. Including ANFIS in complex analysis algorithms yields very interesting recognition and classification capabilities across a broad spectrum of biomedical engineering. The performance of the ANFIS model was evaluated in terms of training performance and classification accuracy. The results indicate that the proposed ANFIS model has a potential advantage in classifying ECG signals. A classification accuracy of 98.24% is achieved. PMID:25023652

  10. A NEW UNSUPERVISED CLASSIFICATION ALGORITHM FOR POLARIMETRIC SAR IMAGES BASED ON FUZZY SET THEORY

    Institute of Scientific and Technical Information of China (English)

    Fu Yusheng; Xie Yan; Pi Yiming; Hou Yinming

    2006-01-01

    In this letter, a new method is proposed for unsupervised classification of terrain types and man-made objects using POLarimetric Synthetic Aperture Radar (POLSAR) data. This technique is a combination of the usage of polarimetric information of SAR images and the unsupervised classification method based on fuzzy set theory. Image quantization and image enhancement are used to preprocess the POLSAR data. Then the polarimetric information and Fuzzy C-Means (FCM) clustering algorithm are used to classify the preprocessed images. The advantages of this algorithm are the automated classification, its high classification accuracy, fast convergence and high stability. The effectiveness of this algorithm is demonstrated by experiments using SIR-C/X-SAR (Spaceborne Imaging Radar-C/X-band Synthetic Aperture Radar) data.

  11. The normalization of citation counts based on classification systems

    CERN Document Server

    Bornmann, Lutz; Barth, Andreas

    2013-01-01

    If we want to assess whether the paper in question has had a particularly high or low citation impact compared to other papers, the standard practice in bibliometrics is to normalize citations with respect to the subject category and publication year. A number of proposals for an improved procedure in the normalization of citation impact have been put forward in recent years. Against the background of these proposals, this study describes an ideal solution for the normalization of citation impact: in a first step, the reference set for the publication in question is collated by means of a classification scheme, where every publication is associated with a single principal research field or subfield entry (e.g. via Chemical Abstracts sections) and a publication year. In a second step, percentiles of citation counts are calculated for this set and used to assign the normalized citation impact score to the publications (and also to the publication in question).
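
    The two-step procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the record layout and function name are hypothetical, and the mid-rank percentile convention used for ties is an assumption, since several percentile definitions are in use.

```python
from collections import defaultdict


def percentile_scores(papers):
    """Assign each paper a citation percentile within its reference set.

    papers: list of dicts with keys 'id', 'field', 'year', 'citations'
    (a hypothetical record layout). The reference set of a paper is every
    paper with the same field and publication year.
    """
    # Step 1: collate reference sets by (field, publication year).
    reference_sets = defaultdict(list)
    for paper in papers:
        reference_sets[(paper["field"], paper["year"])].append(paper["citations"])

    # Step 2: percentile of each paper's citation count within its set.
    scores = {}
    for paper in papers:
        ref = reference_sets[(paper["field"], paper["year"])]
        below = sum(1 for c in ref if c < paper["citations"])
        equal = sum(1 for c in ref if c == paper["citations"])
        # Mid-rank convention (an assumption): ties count as half.
        scores[paper["id"]] = 100.0 * (below + 0.5 * equal) / len(ref)
    return scores


# Hypothetical records for illustration only.
example_papers = [
    {"id": "a", "field": "chem", "year": 2010, "citations": 0},
    {"id": "b", "field": "chem", "year": 2010, "citations": 2},
    {"id": "c", "field": "chem", "year": 2010, "citations": 2},
    {"id": "d", "field": "chem", "year": 2010, "citations": 10},
    {"id": "e", "field": "phys", "year": 2010, "citations": 10},
]
example_scores = percentile_scores(example_papers)
```

    In this example, papers "d" and "e" have the same raw citation count but receive different scores because they are compared against different reference sets.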

  12. Darwin, les fossiles et les bases de la classification moderne

    OpenAIRE

    Duranthon, Francis

    2013-01-01

    Since antiquity, people have sought to classify species, as a way of describing the world and making it their own. Following the systematization of binomial nomenclature by Carl von Linné (1758) and up to the first developments of evolutionary theories, classification was intended to reflect a natural scale of beings with, of course, a preeminent place for man, classified among the Primates (the first), at the very top of this scale. With the publication ...

  13. The Normalization of Citation Counts Based on Classification Systems

    Directory of Open Access Journals (Sweden)

    Andreas Barth

    2013-08-01

    If we want to assess whether the paper in question has had a particularly high or low citation impact compared to other papers, the standard practice in bibliometrics is to normalize citations with respect to the subject category and publication year. A number of proposals for an improved procedure in the normalization of citation impact have been put forward in recent years. Against the background of these proposals, this study describes an ideal solution for the normalization of citation impact: in a first step, the reference set for the publication in question is collated by means of a classification scheme, where every publication is associated with a single principal research field or subfield entry (e.g., via Chemical Abstracts sections) and a publication year. In a second step, percentiles of citation counts are calculated for this set and used to assign the normalized citation impact score to the publications (and also to the publication in question).

  14. MEDLINE Abstracts Classification Based on Noun Phrases Extraction

    Science.gov (United States)

    Ruiz-Rico, Fernando; Vicedo, José-Luis; Rubio-Sánchez, María-Consuelo

    Many algorithms have emerged in recent years to tackle automated text categorization. They have been exhaustively studied, leading to several variants and combinations not only in the particular procedures but also in the treatment of the input data. A widely used approach is to represent documents as a Bag-Of-Words (BOW) and weight tokens with the TFIDF scheme. Many researchers have pursued improvements in precision and recall and reductions in classification time by enriching BOW with stemming, n-grams, feature selection, noun phrases, metadata, weight normalization, etc. We contribute to this field with a novel combination of these techniques. For evaluation purposes, we provide comparisons to previous works with SVM against the simple BOW. The well-known OHSUMED corpus is used and different sets of categories are selected, as previously done in the literature. The conclusion is that the proposed method can be successfully applied to existing binary classifiers such as SVM, outperforming the mixture of BOW and TFIDF approaches.
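
    As an illustration of the baseline representation mentioned above (Bag-Of-Words with TFIDF weights), a minimal sketch follows. It is not the authors' proposed method: the whitespace tokenizer, the function name, and the particular TFIDF variant (relative term frequency times the natural log of inverse document frequency) are assumptions, and the example sentences are invented.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return one {token: weight} dict per document.

    Weight = (count of token in document / document length)
             * ln(number of documents / documents containing the token).
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each token occur?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            token: (count / len(tokens)) * math.log(n_docs / doc_freq[token])
            for token, count in counts.items()
        })
    return vectors


# Invented example sentences for illustration only.
example_docs = [
    "heart attack risk",
    "heart surgery outcome",
    "lung cancer risk",
]
example_vectors = tfidf_vectors(example_docs)
```

    Tokens that occur in every document receive a weight of zero under this variant, which is why rarer tokens such as "attack" end up weighted above more common ones such as "heart".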

  15. Texton Based Shape Features on Local Binary Pattern for Age Classification

    Directory of Open Access Journals (Sweden)

    V.Vijaya Kumar

    2012-07-01

    Classification and recognition of objects is of interest to many researchers. Shape is a significant feature of objects and plays a crucial role in image classification and recognition. The present paper assumes that the features that most strongly affect an adulthood classification system are the shape features (SF) of the face. Based on this, the paper proposes a new technique for adulthood classification that extracts feature parameters of the face from Integrated Texton based LBP (IT-LBP) images. The paper evaluates LBP features on facial images. On LBP texton images, complex shape features are evaluated on facial images for precise age classification. LBP is a local texture operator with low computational complexity and low sensitivity to changes in illumination. Textons are considered texture shape primitives that are located according to certain placement rules. The proposed shape features represent emergent patterns showing a common property all over the image. The experimental evidence on the FGnet aging database clearly indicates the significance and accuracy of the proposed classification method over other existing methods.

  16. Texture characterization for joint compression and classification based on human perception in the wavelet domain.

    Science.gov (United States)

    Fahmy, Gamal; Black, John; Panchanathan, Sethuraman

    2006-06-01

    Today's multimedia applications demand sophisticated compression and classification techniques in order to store, transmit, and retrieve audio-visual information efficiently. Over the last decade, perceptually based image compression methods have been gaining importance. These methods take into account the abilities (and the limitations) of human visual perception (HVP) when performing compression. The upcoming MPEG 7 standard also addresses the need for succinct classification and indexing of visual content for efficient retrieval. However, there has been no research that has attempted to exploit the characteristics of the human visual system to perform both compression and classification jointly. One area of HVP that has unexplored potential for joint compression and classification is spatial frequency perception. Spatial frequency content that is perceived by humans can be characterized in terms of three parameters, which are: 1) magnitude; 2) phase; and 3) orientation. While the magnitude of spatial frequency content has been exploited in several existing image compression techniques, the novel contribution of this paper is its focus on the use of phase coherence for joint compression and classification in the wavelet domain. Specifically, this paper describes a human visual system-based method for measuring the degree to which an image contains coherent (perceptible) phase information, and then exploits that information to provide joint compression and classification. Simulation results that demonstrate the efficiency of this method are presented. PMID:16764265

  17. Waste-acceptance criteria and risk-based thinking for radioactive-waste classification

    International Nuclear Information System (INIS)

    The US system of radioactive-waste classification and its development provide a reference point for the discussion of risk-based thinking in waste classification. The official US system is described and waste-acceptance criteria for disposal sites are introduced because they constitute a form of de facto waste classification. Risk-based classification is explored and it is found that a truly risk-based system is context-dependent: risk depends not only on the waste-management activity but, for some activities such as disposal, it depends on the specific physical context. Some of the elements of the official US system incorporate risk-based thinking, but like many proposed alternative schemes, the physical context of disposal is ignored. The waste-acceptance criteria for disposal sites do account for this context dependence and could be used as a risk-based classification scheme for disposal. While different classes would be necessary for different management activities, the waste-acceptance criteria would obviate the need for the current system and could better match wastes to disposal environments, saving money or improving safety or both.

  18. Network traffic classification based on ensemble learning and co-training

    Institute of Scientific and Technical Information of China (English)

    HE HaiTao; LUO XiaoNan; MA FeiTeng; CHE ChunHui; WANG JianMin

    2009-01-01

    Classification of network traffic is an essential step for much network research. However, with the rapid evolution of Internet applications, the effectiveness of port-based or payload-based identification approaches has been greatly diminished in recent years, and many researchers have begun to turn their attention to alternative machine learning based methods. This paper presents a novel machine learning-based classification model, which combines the ensemble learning paradigm with co-training techniques. Compared to previous approaches, most of which employed only a single classifier, our method applies multiple classifiers and semi-supervised learning, which mainly helps to overcome three shortcomings: limited flow accuracy rate, weak adaptability and a huge demand for labeled training data. In this paper, statistical characteristics of IP flows are extracted from packet-level traces to establish the feature set, then the classification model is created and tested, and the empirical results demonstrate its feasibility and effectiveness.

  19. Effect of World Health Organization (WHO) Histological Classification on Predicting Lymph Node Metastasis and Recurrence in Early Gastric Cancer.

    Science.gov (United States)

    Lai, Ji Fu; Xu, Wen Na; Noh, Sung Hoon; Lu, Wei Qin

    2016-01-01

    BACKGROUND The World Health Organization (WHO) histological classification for gastric cancer is widely accepted and used. However, its impact on predicting lymph node metastasis and recurrence in early gastric cancer (EGC) is not well studied. MATERIAL AND METHODS From 1987 to 2005, 2873 EGC patients with known WHO histological type who had undergone curative resection were enrolled in this study. In all, 637 well-differentiated adenocarcinomas (WD), 802 moderately-differentiated adenocarcinomas (MD), 689 poorly-differentiated adenocarcinomas (PD), and 745 signet-ring cell adenocarcinomas (SRC) were identified. RESULTS The distribution of demographic and clinical features in early gastric cancer among WD, MD, PD, and SRC were significantly different. Lymph node metastasis was observed in 317 patients (11.0%), with the lymph node metastasis rate being 5.3%, 14.8%, 17.0%, and 6.3% in WD, MD, PD, and SRC, respectively. Univariate and multivariate analyses indicated that gender, tumor size, gross appearance, depth of invasion, and WHO classification were significantly associated with lymph node metastasis. Recurrence was observed in 83 patients (2.9%), with the recurrence rate being 2.2%, 4.5%, 3.0%, and 1.6% in WD, MD, PD, and SRC, respectively. Multivariate analysis confirmed that MD, elevated gross type, and lymph node metastasis were independent risk factors for recurrence in EGC. MD patients showed worse disease-free survival than non-MD patients (P=0.001). CONCLUSIONS WHO classification is useful and necessary to evaluate during the perioperative management of EGC. Treatment strategies for EGC should be made prudently according to WHO classification, especially for MD patients. PMID:27595490

  20. Feature selection for neural network based defect classification of ceramic components using high frequency ultrasound.

    Science.gov (United States)

    Kesharaju, Manasa; Nagarajah, Romesh

    2015-09-01

    The motivation for this research stems from the need for a non-destructive testing method capable of detecting and locating defects and microstructural variations within armour ceramic components before issuing them to the soldiers who rely on them for their survival. The development of an automated ultrasonic inspection based classification system would make it possible to check each ceramic component and immediately alert the operator to the presence of defects. In many classification problems, the choice of features or dimensionality reduction is significant and at the same time very difficult, as substantial computational effort is required to evaluate possible feature subsets. In this research, a combination of artificial neural networks and genetic algorithms is used to optimize the feature subset used in classification of various defects in reaction-sintered silicon carbide ceramic components. Initially, wavelet based feature extraction is implemented from the region of interest. An Artificial Neural Network classifier is employed to evaluate the performance of these features. Genetic Algorithm based feature selection is performed. Principal Component Analysis is a popular technique used for feature selection and is compared with the genetic algorithm based technique in terms of classification accuracy and selection of the optimal number of features. The experimental results confirm that features identified by Principal Component Analysis lead to better classification performance (96%) than those identified by the genetic algorithm (94%). PMID:26081920

  1. Lyrics-Based Genre Classification Using Variant tf-idf Weighting Schemes

    Directory of Open Access Journals (Sweden)

    Teh Chao Ying

    2015-01-01

    Music documents are often classified based on genre and mood. In recent years, features from lyrics text have been used for classification of musical documents, and the feasibility of using lyrics features to classify musical documents has been shown. This study presents an approach to lyrics-based musical genre classification that utilizes mood information. From the analysis of the lyrics text in the data collection, a correlation of terms between genre and mood was observed. Based on this correlation of terms, a new weighting equation that combines weights from genre and mood was introduced and implemented in two different ways. Ten musical genre categories and ten mood categories were selected based on a summary from the literature. Musical genre classification experiments were performed using a test collection consisting of 1000 English songs. To confirm that the present approach can improve genre classification, experiments were also conducted using a similar weighting metric from a previous study. Experimental results with the new weighting equation reveal an improvement in musical genre classification.

  2. Volumetric magnetic resonance imaging classification for Alzheimer's disease based on kernel density estimation of local features

    Institute of Scientific and Technical Information of China (English)

    YAN Hao; WANG Hu; WANG Yong-hui; ZHANG Yu-mei

    2013-01-01

    Background The classification of Alzheimer's disease (AD) from magnetic resonance imaging (MRI) has been challenged by lack of effective and reliable biomarkers due to inter-subject variability. This article presents a classification method for AD based on kernel density estimation (KDE) of local features. Methods First, a large number of local features were extracted from stable image blobs to represent various anatomical patterns for potential effective biomarkers. Based on distinctive descriptors and locations, the local features were robustly clustered to identify correspondences of the same underlying patterns. Then, the KDE was used to estimate distribution parameters of the correspondences by weighting contributions according to their distances. Thus, biomarkers could be reliably quantified by reducing the effects of further away correspondences which were more likely noises from inter-subject variability. Finally, the Bayes classifier was applied on the distribution parameters for the classification of AD. Results Experiments were performed on different divisions of a publicly available database to investigate the accuracy and the effects of age and AD severity. Our method achieved an equal error classification rate of 0.85 for subjects aged 60-80 years exhibiting mild AD and outperformed a recent local feature-based work regardless of both effects. Conclusions We proposed a volumetric brain MRI classification method for neurodegenerative disease based on statistics of local features using KDE. The method may be potentially useful for the computer-aided diagnosis in clinical settings.
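
    The general idea of pairing kernel density estimation with a Bayes classifier can be sketched in one dimension as below. This is only an illustration of the principle, not the paper's pipeline: the local-feature extraction and clustering steps are omitted, the Gaussian kernel and fixed bandwidth are assumptions, and the feature values are invented.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of 1-D samples."""
    if not samples or bandwidth <= 0:
        raise ValueError("need at least one sample and a positive bandwidth")
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Each sample contributes a Gaussian bump; nearby samples weigh more.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def bayes_classify(x, class_samples, bandwidth=1.0):
    """Pick the class maximizing prior * KDE-estimated likelihood at x."""
    total = sum(len(samples) for samples in class_samples.values())
    best_label, best_score = None, -1.0
    for label, samples in class_samples.items():
        prior = len(samples) / total
        score = prior * gaussian_kde(samples, bandwidth)(x)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented 1-D feature values for two groups (illustration only).
example_training = {
    "control": [1.0, 1.2, 0.8, 1.1],
    "AD": [3.0, 3.3, 2.9, 3.1],
}
example_label = bayes_classify(1.0, example_training, bandwidth=0.5)
```

    Because the kernel weight decays with distance, samples far from the query point contribute little to the estimated density, which mirrors the distance weighting described in the abstract.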

  3. Comparison of Back propagation neural network and Back propagation neural network Based Particle Swarm intelligence in Diagnostic Breast Cancer

    Directory of Open Access Journals (Sweden)

    Farahnaz SADOUGHI

    2014-03-01

    Breast cancer is the most commonly diagnosed cancer and the most common cause of death in women all over the world. The use of computer technology to support breast cancer diagnosis is now widespread and pervasive across a broad range of medical areas. Early diagnosis of this disease can greatly enhance the chances of long-term survival of breast cancer victims. Artificial Neural Networks (ANN) play an important role as a main method in the early diagnosis of breast cancer. This paper studies the Levenberg-Marquardt Backpropagation (LMBP) neural network and Levenberg-Marquardt Backpropagation based Particle Swarm Optimization (LMBP-PSO) for the diagnosis of breast cancer. The results show that the LMBP and LMBP based PSO systems provide higher classification efficiency, but LMBP based PSO needs the least training and testing time. It helps in developing a Medical Decision System (MDS) for breast cancer diagnosis. It can also be used as a secondary observer in clinical decision making.

  4. Cell morphology-based classification of red blood cells using holographic imaging informatics.

    Science.gov (United States)

    Yi, Faliu; Moon, Inkyu; Javidi, Bahram

    2016-06-01

    We present methods that automatically select a linear or nonlinear classifier for red blood cell (RBC) classification by analyzing the equality of the covariance matrices in Gabor-filtered holographic images. First, the phase images of the RBCs are numerically reconstructed from their holograms, which are recorded using off-axis digital holographic microscopy (DHM). Second, each RBC is segmented using a marker-controlled watershed transform algorithm and the inner part of the RBC is identified and analyzed. Third, the Gabor wavelet transform is applied to the segmented cells to extract a series of features, which then undergo a multivariate statistical test to evaluate the equality of the covariance matrices of the different shapes of the RBCs using selected features. When these covariance matrices are not equal, a nonlinear classification scheme based on quadratic functions is applied; otherwise, a linear classification is applied. We used the stomatocyte, discocyte, and echinocyte RBC for classifier training and testing. Simulation results demonstrated that 10 of the 14 RBC features are useful in RBC classification. Experimental results also revealed that the covariance matrices of the three main RBC groups are not equal and that a nonlinear classification method has a much lower misclassification rate. The proposed automated RBC classification method has the potential for use in drug testing and the diagnosis of RBC-related diseases. PMID:27375953

  5. Object-Based Classification of Abandoned Logging Roads under Heavy Canopy Using LiDAR

    Directory of Open Access Journals (Sweden)

    Jason Sherba

    2014-05-01

    LiDAR-derived slope models may be used to detect abandoned logging roads in steep forested terrain. An object-based classification approach to abandoned logging road detection was employed in this study. First, a slope model of the study site in Marin County, California was created from a LiDAR-derived DEM. Multiresolution segmentation was applied to the slope model and road seed objects were iteratively grown into candidate objects. A road classification accuracy of 86% was achieved using this fully automated procedure and post-processing increased this accuracy to 90%. In order to assess the sensitivity of the road classification to LiDAR ground point spacing, the LiDAR ground point cloud was repeatedly thinned by a fraction of 0.5 and the classification procedure was reapplied. The producer’s accuracy of the road classification declined from 79% with a ground point spacing of 0.91 to below 50% with a ground point spacing of 2, indicating the importance of high point density for accurate classification of abandoned logging roads.

  6. Research of Plant-Leaves Classification Algorithm Based on Supervised LLE

    Directory of Open Access Journals (Sweden)

    Yan Qing

    2013-06-01

    This paper proposes a new supervised LLE method based on the Fisher projection and combines it with a new classification algorithm based on manifold learning to recognize plant leaves. First, the method uses the Fisher projection distance in place of the sample's geodesic distance, yielding a new supervised LLE algorithm. Then, a classification algorithm that uses the manifold reconstruction error directly to determine the sample's class is adopted. This algorithm makes better use of category information and effectively improves the recognition rate. At the same time, it has the advantage of easy parameter estimation. Experimental results on real-world plant leaf databases show that its average recognition accuracy reached 95.17%.

  7. Fuzzy-logic-based hybrid locomotion mode classification for an active pelvis orthosis: Preliminary results.

    Science.gov (United States)

    Yuan, Kebin; Parri, Andrea; Yan, Tingfang; Wang, Long; Munih, Marko; Vitiello, Nicola; Wang, Qining

    2015-08-01

    In this paper, we present a fuzzy-logic-based hybrid locomotion mode classification method for an active pelvis orthosis. Locomotion information measured by the onboard hip joint angle sensors and the pressure insoles is used to classify five locomotion modes, including two static modes (sitting, standing still), and three dynamic modes (level-ground walking, ascending stairs, and descending stairs). The proposed method classifies these two kinds of modes first by monitoring the variation of the relative hip joint angle between the two legs within a specific period. Static states are then classified by the time-based absolute hip joint angle. As for dynamic modes, a fuzzy-logic based method is proposed for the classification. Preliminary experimental results with three able-bodied subjects achieve an off-line classification accuracy higher than 99.49%. PMID:26737144

  8. New classification scheme for ozone monitoring stations based on frequency distribution of hourly data.

    Science.gov (United States)

    Tapia, O; Escudero, M; Lozano, Á; Anzano, J; Mantilla, E

    2016-02-15

    According to European Union (EU) legislation, ozone (O3) monitoring sites can be classified regarding their location (rural background, rural, suburban, urban) or based on the presence of emission sources (background, traffic, industrial). There have been attempts to improve these classifications aiming to reduce their ambiguity and subjectivity, but although scientifically sound, they lack the simplicity needed for operational purposes. We present a simple methodology for classifying O3 stations based on the characteristics of frequency distribution curves which are indicative of the actual impact of combustion sources emitting NO that consumes O3 via titration. Four classes are identified using 1998-2012 hourly data from 72 stations widely distributed in mainland Spain and the Balearic Islands. Types 1 and 2 present unimodal bell-shaped distribution with very low amount of data near zero reflecting a limited influence of combustion sources while Type 4 has a primary mode close to zero, showing the impact of combustion sources, and a minor mode for higher concentrations. Type 3 stations present bimodal distributions with the main mode in the higher levels. We propose a quantitative metric based on the Gini index with the objective of reproducing this classification and finding empirical ranges potentially useful for future classifications. The analysis of the correspondence with the EUROAIRNET classes for the 72 stations reveals that the proposed scheme is only dependent on the impact of combustion sources and not on climatic or orographic aspects. It is demonstrated that this classification is robust since in 87% of the occasions the classification obtained for individual years coincide with the global classification obtained for the 1998-2012 period. 
    Finally, case studies showing the applicability of the new classification scheme for assessing the impact on O3 of a station relocation and performing a critical evaluation of an air quality monitoring network are
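
    The abstract mentions a quantitative metric based on the Gini index without giving its exact form. The sketch below computes the standard Gini coefficient of a set of non-negative values, under the assumption (not confirmed by the abstract) that it is applied directly to hourly concentration values; the function name and sample series are hypothetical.

```python
def gini_coefficient(values):
    """Standard Gini coefficient of non-negative values.

    0 means all values are equal; values close to 1 mean the total is
    concentrated in very few observations.
    """
    if not values:
        raise ValueError("need at least one value")
    if any(v < 0 for v in values):
        raise ValueError("values must be non-negative")
    ordered = sorted(values)
    n = len(ordered)
    total = sum(ordered)
    if total == 0:
        return 0.0  # all values are zero, so they are all equal
    # Rank-weighted sum over the ascending values (ranks start at 1).
    weighted = sum(rank * v for rank, v in enumerate(ordered, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


# Hypothetical hourly series (illustration only).
example_even = gini_coefficient([60, 60, 60, 60])
example_titrated = gini_coefficient([0, 0, 0, 10])
```

    A series with many near-zero hours and a few high values yields a larger coefficient than an evenly distributed series, which is the kind of contrast a station classification based on frequency distributions would need to capture.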

  9. Semi-automatic classification of glaciovolcanic landforms: An object-based mapping approach based on geomorphometry

    Science.gov (United States)

    Pedersen, G. B. M.

    2016-02-01

    A new object-oriented approach is developed to classify glaciovolcanic landforms (Procedure A) and their landform element boundaries (Procedure B). It utilizes the principle that glaciovolcanic edifices are geomorphometrically distinct from lava shields and plains (Pedersen and Grosse, 2014), and the approach is tested on data from Reykjanes Peninsula, Iceland. The outlined procedures utilize slope and profile curvature attribute maps (20 m/pixel) and the classified results are evaluated quantitatively through error matrix maps (Procedure A) and visual inspection (Procedure B). In Procedure A, the highest obtained accuracy is 94.1%, but even simple mapping procedures provide good results (> 90% accuracy). Successful classification of glaciovolcanic landform element boundaries (Procedure B) is also achieved and this technique has the potential to delineate the transition from intraglacial to subaerial volcanic activity in orthographic view. This object-oriented approach based on geomorphometry overcomes issues with vegetation cover, which has been typically problematic for classification schemes utilizing spectral data. Furthermore, it handles complex edifice outlines well and is easily incorporated into a GIS environment, where results can be edited or fused with other mapping results. The approach outlined here is designed to map glaciovolcanic edifices within the Icelandic neovolcanic zone but may also be applied to similar subaerial or submarine volcanic settings, where steep volcanic edifices are surrounded by flat plains.

  10. Mastectomy or breast conserving surgery? Factors affecting type of surgical treatment for breast cancer – a classification tree approach

    International Nuclear Information System (INIS)

    A critical choice facing breast cancer patients is which surgical treatment – mastectomy or breast conserving surgery (BCS) – is most appropriate. Several studies have investigated factors that impact the type of surgery chosen, identifying features such as place of residence, age at diagnosis, tumor size, socio-economic and racial/ethnic elements as relevant. Such assessment of 'propensity' is important in understanding issues such as a reported under-utilisation of BCS among women for whom such treatment was not contraindicated. Using Western Australian (WA) data, we further examine the factors associated with the type of surgical treatment for breast cancer using a classification tree approach. This approach deals naturally with complicated interactions between factors, and so allows flexible and interpretable models for treatment choice to be built that add to the current understanding of this complex decision process. Data were extracted from the WA Cancer Registry on women diagnosed with breast cancer in WA from 1990 to 2000. Subjects' treatment preferences were predicted from covariates using both classification trees and logistic regression. Tumor size was the primary determinant of patient choice, subjects with tumors smaller than 20 mm in diameter preferring BCS. For subjects with tumors greater than 20 mm in diameter, factors such as patient age, nodal status, and tumor histology become relevant as predictors of patient choice. Classification trees perform as well as logistic regression for predicting patient choice, but are much easier to interpret for clinical use. The selected tree can inform clinicians' advice to patients.
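
    A classification tree is built by repeatedly choosing the split that best separates the outcome classes. The sketch below shows a single such step for one numeric covariate using Gini impurity. It is a generic illustration, not the study's model: the abstract does not state which splitting criterion was used, and the tumor sizes and labels are invented.

```python
def gini_impurity(labels):
    """Gini impurity of a list of class labels (0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(values, labels):
    """Find the threshold on one numeric covariate minimizing weighted impurity.

    Candidate thresholds are midpoints between consecutive distinct values.
    Returns (threshold, weighted impurity after the split).
    """
    pairs = sorted(zip(values, labels))
    best_threshold, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold fits between identical values
        threshold = (pairs[i][0] + pairs[i - 1][0]) / 2
        left = [label for value, label in pairs if value < threshold]
        right = [label for value, label in pairs if value >= threshold]
        score = (
            len(left) * gini_impurity(left) + len(right) * gini_impurity(right)
        ) / len(pairs)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Invented tumor sizes (mm) and surgery types, for illustration only.
example_sizes = [8, 12, 15, 18, 22, 25, 30, 40]
example_surgery = ["BCS"] * 4 + ["mastectomy"] * 4
example_threshold, example_impurity = best_split(example_sizes, example_surgery)
```

    A full tree applies the same search recursively to each resulting subgroup and across all covariates, which is how interactions such as "tumor size first, then age" emerge in an interpretable form.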

  11. Nanomaterials based biosensors for cancer biomarker detection

    Science.gov (United States)

    Malhotra, Bansi D.; Kumar, Saurabh; Mouli Pandey, Chandra

    2016-04-01

    Biosensors have enormous potential to contribute to the evolution of new molecular diagnostic techniques for patients suffering from cancerous diseases. A major obstacle to faster development of biosensors is that cancer is a highly complex set of diseases. Oncologists currently rely on a few biomarkers and histological characterization of tumors. Some of the signatures include epigenetic and genetic markers, protein profiles, changes in gene expression, and post-translational modifications of proteins. These molecular signatures offer new opportunities for development of biosensors for cancer detection. In this context, conducting paper has recently been found to play an important role in the fabrication of biosensors for cancer biomarker detection. In this paper we focus on results of some recent studies from our laboratories relating to the fabrication and application of nanomaterial-modified, paper-based biosensors for cancer biomarker detection.

  12. Non-gaussian distributions affect identification of expression patterns, functional annotation, and prospective classification in human cancer genomes.

    Directory of Open Access Journals (Sweden)

    Nicholas F Marko

    Full Text Available INTRODUCTION: Gene expression data is often assumed to be normally-distributed, but this assumption has not been tested rigorously. We investigate the distribution of expression data in human cancer genomes and study the implications of deviations from the normal distribution for translational molecular oncology research. METHODS: We conducted a central moments analysis of five cancer genomes and performed empiric distribution fitting to examine the true distribution of expression data both on the complete-experiment and on the individual-gene levels. We used a variety of parametric and nonparametric methods to test the effects of deviations from normality on gene calling, functional annotation, and prospective molecular classification using a sixth cancer genome. RESULTS: Central moments analyses reveal statistically-significant deviations from normality in all of the analyzed cancer genomes. We observe as much as 37% variability in gene calling, 39% variability in functional annotation, and 30% variability in prospective, molecular tumor subclassification associated with this effect. CONCLUSIONS: Cancer gene expression profiles are not normally-distributed, either on the complete-experiment or on the individual-gene level. Instead, they exhibit complex, heavy-tailed distributions characterized by statistically-significant skewness and kurtosis. The non-Gaussian distribution of this data affects identification of differentially-expressed genes, functional annotation, and prospective molecular classification. These effects may be reduced in some circumstances, although not completely eliminated, by using nonparametric analytics. This analysis highlights two unreliable assumptions of translational cancer gene expression analysis: that "small" departures from normality in the expression data distributions are analytically-insignificant and that "robust" gene-calling algorithms can fully compensate for these effects.
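
    The central moments analysis described above can be sketched in a few lines. This is a minimal illustration using the standard population definitions of skewness and excess kurtosis; the function names and the two toy samples are hypothetical and are not the authors' data or code.

```python
def central_moment(xs, k):
    """k-th central moment of a sample (population form, divides by n)."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** k for x in xs) / len(xs)


def skewness(xs):
    """Third standardized moment; 0 for a symmetric sample."""
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5


def excess_kurtosis(xs):
    """Fourth standardized moment minus 3; 0 for a normal distribution."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2 - 3.0


# Toy samples (assumed for illustration): one symmetric, one with a heavy right tail.
symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]
heavy_tailed = [1.0] * 9 + [20.0]

for name, sample in (("symmetric", symmetric), ("heavy_tailed", heavy_tailed)):
    print(name, skewness(sample), excess_kurtosis(sample))
```

    Positive skewness together with positive excess kurtosis, as in the second sample, is the kind of heavy-tailed departure from normality the abstract reports.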

  13. Body mass index: different nutritional status according to WHO, OPAS and Lipschitz classifications in gastrointestinal cancer patients

    Directory of Open Access Journals (Sweden)

    Katia Barao

    2012-06-01

    Full Text Available CONTEXT: The body mass index (BMI) is the most common marker used to diagnose nutritional status. The great advantages of this index are its ease of measurement, low cost, good correlation with fat mass, and association with morbidity and mortality. OBJECTIVE: To compare BMI differences according to the WHO, OPAS and Lipschitz classifications. METHODS: A prospective study of 352 patients with esophageal, gastric or colorectal cancer was conducted. The BMI was calculated and analyzed using the WHO, Lipschitz and OPAS classifications. RESULTS: The mean age was 62.1 ± 12.4 years and 59% of patients were older than 59 years. BMI did not differ between the genders in patients <59 years (P = 0.75), but over 59 years the BMI was higher in women (P<0.01). The percentage of undernourished patients was 7%, 18% and 21% (P<0.01) by WHO, Lipschitz and OPAS, respectively. The prevalence of overweight/obesity also differed among the classifications (P<0.01). CONCLUSIONS: Most of the patients with gastrointestinal cancer were older than 65 years. A different cut-off must be used for these patients, because undernourished patients may be wrongly considered well nourished.
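
    The effect of the choice of classification can be seen by applying several underweight cut-offs to the same BMI. The sketch below is illustrative only: the abstract does not state the cut-offs, so the values used here (18.5, 22 and 23 kg/m²) are assumptions based on figures commonly quoted for these three schemes, and the example patient is hypothetical.

```python
# ASSUMED underweight cut-offs in kg/m^2; not given in the abstract above.
UNDERWEIGHT_BELOW = {"WHO": 18.5, "Lipschitz": 22.0, "OPAS": 23.0}


def bmi(weight_kg, height_m):
    """Body mass index: weight divided by height squared."""
    return weight_kg / height_m ** 2


def undernourished_by(bmi_value):
    """Map each classification name to True if the BMI falls below its cut-off."""
    return {name: bmi_value < cut for name, cut in UNDERWEIGHT_BELOW.items()}


# Hypothetical patient: 62 kg, 1.70 m, BMI of about 21.5 kg/m^2.
example = undernourished_by(bmi(62.0, 1.70))
print(example)
```

    Under these assumed cut-offs the same patient is well nourished by one scheme and undernourished by the other two, which is the kind of disagreement behind the 7%, 18% and 21% figures reported above.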

  14. An application to pulmonary emphysema classification based on model of texton learning by sparse representation

    Science.gov (United States)

    Zhang, Min; Zhou, Xiangrong; Goshima, Satoshi; Chen, Huayue; Muramatsu, Chisako; Hara, Takeshi; Yokoyama, Ryojiro; Kanematsu, Masayuki; Fujita, Hiroshi

    2012-03-01

    We aim to apply a new texton-based texture classification method to the classification of pulmonary emphysema in computed tomography (CT) images of the lungs. In contrast to conventional computer-aided diagnosis (CAD) methods for pulmonary emphysema classification, in this paper the texton dictionary is first learned by applying sparse representation (SR) to image patches in the training dataset. Then the SR coefficients of the test images over the dictionary are used to construct histograms for texture representation. Finally, classification is performed using a nearest neighbor classifier with a histogram dissimilarity measure as the distance. The proposed approach is tested on 3840 annotated regions of interest consisting of normal tissue and mild, moderate and severe pulmonary emphysema of three subtypes. The performance of the proposed system, with an accuracy of about 88%, is higher than that of the state-of-the-art method based on basic rotation-invariant local binary pattern histograms and that of the texture classification method based on texton learning by k-means, which performs almost the best among other approaches in the literature.
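
    The final step, nearest neighbor classification over histograms, can be sketched as follows. The abstract does not say which histogram dissimilarity was used, so the chi-square distance here is an assumption, and the three-bin histograms and class labels are hypothetical toy values rather than real texton histograms.

```python
def normalize(hist):
    """Scale a histogram of counts so that its bins sum to 1."""
    total = float(sum(hist))
    return [h / total for h in hist]


def chi_square_distance(p, q):
    """Chi-square dissimilarity between two normalized histograms (assumed measure)."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0)


def nearest_neighbor(query, training):
    """Return the label of the training histogram closest to the query.

    training is a list of (histogram, label) pairs.
    """
    q = normalize(query)
    closest = min(training, key=lambda item: chi_square_distance(q, normalize(item[0])))
    return closest[1]


# Toy histograms over three hypothetical textons.
training = [
    ([8, 1, 1], "normal"),
    ([1, 8, 1], "mild"),
    ([1, 1, 8], "severe"),
]
print(nearest_neighbor([6, 3, 1], training))
```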

  15. Ligand and structure-based classification models for Prediction of P-glycoprotein inhibitors

    DEFF Research Database (Denmark)

    Klepsch, Freya; Poongavanam, Vasanthanathan; Ecker, Gerhard Franz

    2014-01-01

    obtained by docking into a homology model of P-gp, to supervised machine learning methods, such as Kappa nearest neighbor, support vector machine (SVM), random forest and binary QSAR, by using a large, structurally diverse data set. In addition, the applicability domain of the models was assessed using an...... algorithm based on Euclidean distance. Results show that random forest and SVM performed best for classification of P-gp inhibitors and non-inhibitors, correctly predicting 73/75 % of the external test set compounds. Classification based on the docking experiments using the scoring function Chem...

  16. Computationally Efficient Modulation Level Classification Based on Probability Distribution Distance Functions

    CERN Document Server

    Urriza, Paulo; Pawełczak, Przemysław; Čabrić, Danijela

    2010-01-01

    We present a novel modulation level classification (MLC) method based on probability distribution distance functions. The proposed method uses modified Kuiper and Kolmogorov-Smirnov (KS) distances to achieve low computational complexity and outperforms the state-of-the-art methods based on cumulants and goodness-of-fit (GoF) tests. We derive the theoretical performance of the proposed MLC method and verify it via simulations. The best classification accuracy under AWGN with SNR mismatch and phase jitter is achieved with the proposed MLC method using Kuiper distances.
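
    For orientation, the standard two-sample KS and Kuiper distances between empirical distribution functions can be computed as below. This is a sketch of the textbook definitions, not the modified distances proposed in the paper; the `classify` helper, the reference samples and the labels "2-level" and "4-level" are hypothetical.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return the empirical CDF of a sample as a function of x."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


def ks_and_kuiper(a, b):
    """Return (KS distance, Kuiper distance) between two samples.

    KS is the largest absolute ECDF difference; Kuiper adds the largest
    positive and the largest negative difference.
    """
    fa, fb = ecdf(a), ecdf(b)
    diffs = [fa(x) - fb(x) for x in sorted(set(a) | set(b))]
    d_plus = max(0.0, max(diffs))
    d_minus = max(0.0, -min(diffs))
    return max(d_plus, d_minus), d_plus + d_minus


def classify(received, references):
    """Pick the reference label with the smallest Kuiper distance (hypothetical helper)."""
    return min(references, key=lambda name: ks_and_kuiper(received, references[name])[1])


# Hypothetical reference samples for two candidate modulation levels.
references = {"2-level": [-1, -1, 1, 1], "4-level": [-3, -1, 1, 3]}
print(classify([-2.9, -1.1, 0.9, 3.2], references))
```

    Because both ECDFs are step functions that change only at sample points, evaluating the difference at the pooled sample values is enough to find the extremes.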

  17. A study on PC-based ultrasonic testing system using intelligent ultrasonic flaw classification software

    International Nuclear Information System (INIS)

    For convenient application of ultrasonic pattern recognition approaches in practical field inspection of weldments, we have developed an intelligent ultrasonic flaw classification system through the novel combination of two ingredients: 1) a PC-based ultrasonic testing system, and 2) intelligent ultrasonic flaw classification software with an invariant ultrasonic pattern recognition algorithm. Here, key aspects of this intelligent system are addressed, including the PC-based ultrasonic testing system, enhancement of performance through newly proposed ultrasonic features, and feature selection.

  18. Semantic classification of diseases in discharge summaries using a context-aware rule-based classifier.

    Science.gov (United States)

    Solt, Illés; Tikk, Domonkos; Gál, Viktor; Kardkovács, Zsolt T

    2009-01-01

    OBJECTIVE Automated and disease-specific classification of textual clinical discharge summaries is of great importance in human life science, as it helps physicians conduct medical studies by providing statistically relevant data for analysis. This can be further facilitated if, when discharge summaries are labeled, semantic labels are also extracted from the text, such as whether a given disease is present, absent, or questionable in a patient, or is unmentioned in the document. The authors present a classification technique that successfully solves the semantic classification task. DESIGN The authors introduce a context-aware rule-based semantic classification technique for use on clinical discharge summaries. The classification is performed in successive steps. First, some misleading parts are removed from the text; then the text is partitioned into positive, negative, and uncertain context segments; then a sequence of binary classifiers is applied to assign the appropriate semantic labels. MEASUREMENT For evaluation the authors used the documents of the i2b2 Obesity Challenge and adopted its evaluation measures: F(1)-macro and F(1)-micro. RESULTS On the two subtasks of the Obesity Challenge (textual and intuitive classification) the system performed very well, achieving an F(1)-macro of 0.80 for the textual and 0.67 for the intuitive task, and obtained second place in the textual and first place in the intuitive subtask of the challenge. CONCLUSIONS The authors show in the paper that a simple rule-based classifier can tackle the semantic classification task more successfully than machine learning techniques if the training data are limited and some semantic labels are very sparse. PMID:19390101
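
    The difference between the two evaluation measures matters when some labels are sparse. The sketch below uses the standard definitions of macro- and micro-averaged F1; the challenge's exact averaging procedure may differ, and the six gold and predicted labels are hypothetical.

```python
def f1_scores(gold, predicted):
    """Return (macro F1, micro F1) for two equal-length label sequences."""
    labels = sorted(set(gold) | set(predicted))
    tp = {label: 0 for label in labels}
    fp = {label: 0 for label in labels}
    fn = {label: 0 for label in labels}
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label was wrong
            fn[g] += 1  # gold label was missed

    def f1(t, false_pos, false_neg):
        denominator = 2 * t + false_pos + false_neg
        return 2 * t / denominator if denominator else 0.0

    # Macro: average the per-label F1, so every label counts equally.
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    # Micro: pool the counts first, so frequent labels dominate.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro


# Hypothetical labels for six documents; "questionable" is the sparse class.
gold = ["present", "present", "absent", "absent", "questionable", "absent"]
predicted = ["present", "absent", "absent", "absent", "absent", "absent"]

macro, micro = f1_scores(gold, predicted)
print(macro, micro)
```

    Missing the single "questionable" document costs one third of the macro score but only one sixth of the micro score, which is why macro-averaged results are the harder target when labels are sparse.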

  19. Object Class Detection and Classification using Multi Scale Gradient and Corner Point based Shape Descriptors

    OpenAIRE

    Fernando, Basura; Karaoglu, Sezer; Saha, Sajib Kumar

    2015-01-01

    This paper presents novel multi-scale gradient and corner point based shape descriptors. The novel multi-scale gradient based shape descriptor is combined with generic Fourier descriptors to extract contour and region based shape information. A shape information based object class detection and classification technique with a random forest classifier has been optimized. The integrated descriptor proposed in this paper is robust to rotation, scale, translation, affine deformations, noisy contour...

  20. A Feature Selection Method for Large-Scale Network Traffic Classification Based on Spark

    Directory of Open Access Journals (Sweden)

    Yong Wang

    2016-02-01

    Full Text Available Currently, with the rapid growth of data scale in network traffic classification, selecting traffic features efficiently is becoming a major challenge. Although a number of traditional feature selection methods using the Hadoop-MapReduce framework have been proposed, their execution time remains unsatisfactory because of the numerous iterative computations during processing. To address this issue, an efficient feature selection method for network traffic based on a new parallel computing framework called Spark is proposed in this paper. In our approach, the complete feature set is first preprocessed based on the Fisher score, and a sequential forward search strategy is employed to generate candidate subsets. The optimal feature subset is then selected through successive iterations on the Spark computing framework. The implementation demonstrates that, while maintaining classification accuracy, our method reduces the time cost of modeling and classification and significantly improves the execution efficiency of feature selection.
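
    The two stages named above, Fisher score preprocessing followed by sequential forward search, can be sketched on a single machine in plain Python. This is not the authors' Spark implementation: the nearest-centroid evaluator, the choice to keep the top two features, and the six-row dataset with labels "web" and "p2p" are all assumptions made for illustration.

```python
def fisher_score(column, labels):
    """Between-class scatter divided by within-class scatter for one feature."""
    overall = sum(column) / len(column)
    between = within = 0.0
    for cls in set(labels):
        values = [x for x, y in zip(column, labels) if y == cls]
        mean = sum(values) / len(values)
        between += len(values) * (mean - overall) ** 2
        within += sum((v - mean) ** 2 for v in values)
    return between / within if within > 0 else float("inf")


def centroid_accuracy(rows, labels, features):
    """Training accuracy of a nearest-centroid classifier (hypothetical evaluator)."""
    centroids = {}
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        centroids[cls] = [sum(r[f] for r in members) / len(members) for f in features]

    def predict(row):
        return min(centroids, key=lambda c: sum(
            (row[f] - m) ** 2 for f, m in zip(features, centroids[c])))

    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)


def sequential_forward_search(candidates, evaluate):
    """Greedily add the feature that most improves evaluate(); stop when none does."""
    selected, best = [], float("-inf")
    remaining = list(candidates)
    while remaining:
        round_best, round_feature = best, None
        for f in remaining:
            score = evaluate(selected + [f])
            if score > round_best:
                round_best, round_feature = score, f
        if round_feature is None:
            break
        selected.append(round_feature)
        remaining.remove(round_feature)
        best = round_best
    return selected, best


# Synthetic flows with three features each; labels are hypothetical.
rows = [[1.0, 5.0, 0.2], [1.2, 1.0, 0.1], [0.9, 3.0, 0.3],
        [3.0, 4.0, 0.8], [3.2, 2.0, 0.9], [2.9, 5.5, 0.7]]
labels = ["web", "web", "web", "p2p", "p2p", "p2p"]

scores = [fisher_score([r[j] for r in rows], labels) for j in range(3)]
ranking = sorted(range(3), key=lambda j: scores[j], reverse=True)
selected, accuracy = sequential_forward_search(
    ranking[:2], lambda subset: centroid_accuracy(rows, labels, subset))
print(ranking, selected, accuracy)
```

    Each call to the evaluator is independent of the others within a round, which is the part of the search that a parallel framework can distribute.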